March 07, 2007

Can’t See The Trees For The Forest?

Yes, I know that, technically, the expression is “Can’t see the forest for the trees” - and that this notion suggests that someone can become so involved in the details of a problem that they fail to look at the situation as a whole. However, can’t there be a reverse application of this theory (as my title suggests) as well?

Recent conversation here about Alex Rodriguez's strikeouts got me wondering about this concept.

Many are quick to dismiss a player's strikeout totals and/or frequency as long as his other numbers “are there.” Very often you will hear someone say “Who cares about his strikeouts as long as he has an OPS over nine hundred?” Further, those familiar with the study of Run Expectancy would be quick to add that "going down on strikes" is merely another vanilla form of being retired - and that the "K" was no better or worse, for the most part, than any other way of being retired as a batter.

However, if we are concerned about a batter's whiffs, and we look only at his OPS for the season, or at the Run Expectancy of every base/out situation and eventual results across several seasons' worth of data, are we not then allowing the forest to inhibit our ability to look at the trees?

Assume that a player does have a very high OPS for the season – along with many strikeouts. The high OPS indicates to us that the player did a very good job at avoiding outs and/or hitting for power. This is, indeed, good news. However, does it really tell us anything about his strikeouts? Not really. But, here, again, the sabermetric school tells us that his whiffs are just another form of downtime for a batter and that there’s no story to be told here because historians have looked at two million (or thereabouts) past situations where batters struck out, and on average, whiffs have no greater impact on run production compared to other ways of being retired.

Thinking this over now, I am beginning to wonder about whether or not there is an issue with this approach.

If we can agree that the size of a study can be classified in terms of its granularity, then we should be able to agree that studies have fine granularity if they contain key or precise data elements and that studies have coarse granularity if they contain tons and tons of examples. Here, fine granularity is the trees and coarse granularity is the forest.

Last time I checked, fine granularity provides a much clearer picture than coarse granularity. Therefore, if you want to look at a tree, it’s better to look at the tree than the whole forest. Relatedly, if you really want to know about the impact of a batter’s strikeouts, it makes more sense (with respect to an inspection focus) to look at his strikeouts. You should look at the trees (meaning just his strikeouts) and not the forest (meaning the impact of strikeouts based on several decades' worth of data on hundreds of players). You should look at each and every strikeout for the batter in question – in each situation, what was the cost of non-contact, etc.? And, then tally those findings. This will tell you if his strikeouts mattered – much more so than looking at his OPS or applying a rule based on a watered-down average calculation.

Back to A-Rod, I cannot tell you if Alex Rodriguez’ strikeouts last year were an issue – because I have not looked at each and every one and determined the non-contact impact on that Plate Appearance and the offensive series in which it resided. But, by the same token, no one can tell us now that A-Rod’s whiffs were not an issue – unless they have looked at each and every one and determined the non-contact impact, etc.

Until someone does the actual and direct analysis, any opinion on the matter is just that – an opinion.

Posted by Steve Lombardi at March 7, 2007 02:36 PM

Comments

Oh. My. God.

You need to get an A-Rod statue. A-Rod struck out! Oh my god! He must be the most unclutch player in history! Something you can rant at all day long about every single little thing you can find wrong with him.

Maybe, maybe then, you'll realize you're obsessed with one of the greats. And you'll finally say "I'm sorry."

You have an obsession. Please keep it to yourself.

Posted by: Andrew [TypeKey Profile Page] at March 7, 2007 03:15 PM

I think I see what you're getting at, Steve, but I'm curious as to what assumptions you think need to be made in order to understand the cost of a strike out.

Correct me if I'm wrong: You say a strike out with 2 outs and no one on is less costly to the team than a strike out with a runner on third and 1 out. The former is simply the third out in an inning while the latter removes the possibility of a sac fly.

So to understand A-Rod's strike outs and potentially how detrimental they are to the Yanks, we would have to assess each of his 139 strike outs in 2006 for their cost - in run expectancy - to the team. Then, you would have to do this across the league and see if his strike outs are costly.
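One way the per-strikeout tally described above could be sketched, assuming a strikeout leaves the runners where they are and simply adds an out. The run-expectancy numbers in the table are placeholders invented for illustration, not real figures, and the names `RUN_EXPECTANCY`, `strikeout_cost` and `total_strikeout_cost` are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# A base/out state is (bases, outs). "---" means bases empty,
# "--3" means a runner on third only.
State = Tuple[str, int]

# PLACEHOLDER values, made up for this sketch. A real study would load
# an actual run-expectancy table covering every base/out state.
RUN_EXPECTANCY: Dict[State, float] = {
    ("---", 0): 0.50, ("---", 1): 0.27, ("---", 2): 0.10,
    ("--3", 0): 1.40, ("--3", 1): 0.95, ("--3", 2): 0.36,
}


def strikeout_cost(state: State, table: Dict[State, float]) -> float:
    """Run expectancy lost when a strikeout occurs in `state`.

    Assumes runners hold (no dropped third strike, steal, or wild pitch).
    A strikeout with two outs ends the inning, so what remains is 0.
    """
    bases, outs = state
    before = table[state]
    after = 0.0 if outs == 2 else table[(bases, outs + 1)]
    return before - after


def total_strikeout_cost(states: Iterable[State],
                         table: Dict[State, float]) -> float:
    """Sum the cost over every strikeout situation for one batter."""
    return sum(strikeout_cost(s, table) for s in states)


if __name__ == "__main__":
    # Two strikeouts: bases empty with two outs, runner on third with one out.
    sample = [("---", 2), ("--3", 1)]
    print(round(total_strikeout_cost(sample, RUN_EXPECTANCY), 2))
```

Running the same tally for other hitters, as the comment suggests, would show whether one batter's strikeouts come in costlier spots than the league's.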

Then, we would have to figure out if A-Rod is costing the team more by striking out so much or if he more than makes up for it with his RC/27 outs of greater than 7.

The thing is: A run is a run is a run. Runs scored through solo shots in the 2nd inning are just as important to a team as runs scored in the 9th. They're just less dramatic and quote-unquote clutch.

As one who feels ARod isn't appreciated enough around New York, those are my initial thoughts.

Posted by: Benjamin Kabak [TypeKey Profile Page] at March 7, 2007 03:15 PM

The nature of baseball is that you need to see the forest for the trees. With certain players, you accept strikeouts (outs?) if they bring other things to the table offensively, and to an extent, defensively.

There's a significant difference in production between Rodriguez and, say, Tony Batista (who has posted gaudy HR/RBI numbers in the past).

Posted by: Raf [TypeKey Profile Page] at March 7, 2007 03:25 PM

Such a study would have to look at situational ABs for, say, the top 10% in OPS. (I'm not sure a simple A-Rod vs. David Ortiz study would be fair -- make it larger to get a better context of what the average future HoFer would do.)
Now a lead off out doesn't matter if it's a K or a ground out or any other out. One on and no outs -- a K would be better than a DP, but less than a groundball that moves the runner over (or is a hit). Jeez, there would need to be a lot of categories. It would be interesting to break down strikeouts into those categories to see if the conventional sabermetric wisdom (a K is just another out) does pan out.

Posted by: rbj [TypeKey Profile Page] at March 7, 2007 03:37 PM

~~~I think I see what you're getting at, Steve, but I'm curious as to what assumptions you think need to be made in order to understand the cost of a strike out.~~~

That's an excellent question - and I can only guess at the answer.

I suppose you could look at each PA/out/base situation where there was a K and then assume what would happen if there was a grounder, flyout, or hit in that situation - some form of contact - and then see how the inning would have been different.

Maybe you look at the spread of a hitter's GB/FB history and use that to assume some type of contact result in the K-PA and play out the inning from the one that you apply instead of the K?

Or, maybe you just look at each K on its own: if it hurt the inning, that's a minus; if it was a non-event, that's a plus; and then you do a plus/minus total?

I have no idea what would be best - if I did, I would probably do the study.

Posted by: Steve Lombardi [TypeKey Profile Page] at March 7, 2007 03:54 PM

Not being much of a stathead, I've always wondered about how strikeouts and "regular" outs could possibly be considered equals. I know that my visceral reaction to strikeouts is much greater, but we all know that emotions can sometimes deceive us.

And Andrew - here's an idea: Get your own blog, and then you can write about A-Rod or not write about A-rod...or do whatever.

Posted by: brockdc [TypeKey Profile Page] at March 7, 2007 03:57 PM

O/T

I did a couple of community projections on my blog (Melky, Pavano, Igawa) and decided today that with all the talk about Hughes (since his appearance yesterday) a projection for him could be fun.

If anyone wants to participate, please feel free.

http://jeteupthemiddle.blogspot.com/

Posted by: Jeteupthemiddle [TypeKey Profile Page] at March 7, 2007 04:01 PM

brockdc - if K's interest you, you may enjoy these two older things that I did:

http://www.waswatching.com/archives/2005/12/welcome_to_my_n.html

http://www.waswatching.com/archives/2006/12/october_contact.html

re: your comment to Andrew, well, I have no issue with his feedback - and I appreciate it. I may not do anything with it, but, I really do find it interesting.

Posted by: Steve Lombardi [TypeKey Profile Page] at March 7, 2007 05:09 PM

I've thought about this for a while, Steve. I think you're right about the usefulness of a study with fine granularity in determining if a specific player has trouble with situations such as those. It IS theoretically possible, with the pairing of baseball-reference.com and retrosheet.org to determine the base/out situation of every one of Alex Rodriguez's at bats. We could then, as someone else suggested, compare it to the other top-10 OPS (or Win Shares, or whatever) players. It would be a long study, but I'm used to those. Maybe I'll check it out in my spare time.

However, I take issue with your implication that a reduction in A-Rod's strikeouts will make him better. First of all, those studies you referenced have proven that, with few exceptions (Joe D, Pujols), when a player's strikeouts go down, his OPS and EQA also take big hits. This tells me that, in general, a player attempting to limit his strikeouts will also become a less effective hitter.

So I ask: what dip in OPS/EQA/RC27 would you permit in order to pick up a few extra "clutch" hits over the course of the season? If A-Rod had an OPS of .850 instead of over .900, but at the same time led the league in sacrifice flies, or in batting average with RISP... would that make you happier? Would you permit a dip to .800?

I, for one, would not want to hurt the New York Yankees by insisting that its best slugger try to adjust his approach, ruining his overall production.

Posted by: mehmattski [TypeKey Profile Page] at March 7, 2007 07:57 PM

The reason that strikeouts, on average, are not worse than other outs is because players who strike out a lot tend to ground into double plays a lot less frequently (per opportunity; don't look at totals, because they are heavily dependent on batting with runners on first). Players who make contact often usually offset the baserunners they advance with batted balls with the runners they eliminate with batted balls.

With no one on base, a strikeout is definitely worse than other outs. But with a runner on first and less than two out, it's clearly much better to strike out than it is to make a contact out.

As far as when a batter strikes out, the most important determinant is the pitcher he's facing.

Cal Ripken didn't strike out all that often, and that's one of the reasons he's the all-time GIDP leader with 350. On the other hand, someone like Adam Dunn will almost never reach double digits in double plays in a single season, no matter how many "opportunities" he has.

Posted by: spira [TypeKey Profile Page] at March 7, 2007 08:28 PM

I'm not quite clear what you're getting at here, but doesn't WPA and some of the other linear weights stuff that Tango, Lichtman, and Dolphin write about in "The Book" address what you're talking about here?

Posted by: jonm [TypeKey Profile Page] at March 7, 2007 09:46 PM

I agree with Steve that there are situations in which putting the ball in play is probably better than a strikeout. They are:
Runner on 3rd, no outs
Runner on 3rd, one out
2nd and 3rd, no outs
2nd and 3rd, one out
1st and 3rd, no outs

It would take much more time than I have to see how many at bats there are in these specific instances, but there is some info. For example, with a runner on 3rd alone, any outs, A-Rod had 16 plate appearances, and struck out in 3 of these in 2006. Of these three strikeouts, two came with two outs.

With runners on 2nd and 3rd, any out, A-Rod had 7 strikeouts in 22 plate appearances. Of the seven strikeouts, one came with two outs, five came with one out, and one came with no outs.

With runners on first and third, any out, A-Rod had 32 PA, 5 K, and 3 GDP. With no outs, A-Rod had zero strikeouts with runners on first and third in 2006.

I disagree with spira above that a strikeout is worse than a ball in play with the bases empty. This goes back to the data that suggests that a player's total offensive output and a player's strikeout rate are directly proportional: more strikeouts, more production. So a hitter with the bases empty, in any out situation, should be trying to maximize offense. A-Rod, however, did not do this last season. With the bases empty in 2006, A-Rod had 318 PA, with 69 strikeouts and a .888 OPS. This is an increase in K-rate of 1% and a decrease from his OPS of .915 on the season.
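For anyone who wants to check the arithmetic, here is a minimal sketch that turns the 2006 split counts quoted in this comment into strikeout rates. The plate-appearance and strikeout counts come straight from the text above; the `splits` layout and the `k_rate` helper are illustrative names only.

```python
from typing import Dict, Tuple

# situation -> (plate appearances, strikeouts), as quoted in the comment
splits: Dict[str, Tuple[int, int]] = {
    "runner on 3rd only": (16, 3),
    "runners on 2nd and 3rd": (22, 7),
    "runners on 1st and 3rd": (32, 5),
    "bases empty": (318, 69),
}


def k_rate(plate_appearances: int, strikeouts: int) -> float:
    """Strikeouts per plate appearance, as a fraction between 0 and 1."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return strikeouts / plate_appearances


rates = {name: k_rate(pa, k) for name, (pa, k) in splits.items()}

if __name__ == "__main__":
    for name, rate in rates.items():
        pa, k = splits[name]
        print(f"{name}: {k} K in {pa} PA = {rate:.1%}")
```

The bases-empty rate works out to a little under 22%. The three runner-on-third splits rest on 16 to 32 plate appearances each, so their rates move a lot with a single strikeout either way.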

Over his career, however, A-Rod has the same K-rate with the bases empty as with men on (18%), with a .011 dip in OPS when the bases are empty.

I think the small numbers alone should dissuade anyone from drawing any conclusions at all. I don't just mean because of the small sample size; I also mean from the standpoint of hoping for A-Rod to change his approach. A-Rod is supposed to focus on striking out less so that he can be more effective in the 50 or so at bats a season when there's a runner on third and less than two outs? I think that's a foolish strategy.
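To put a rough number on the small-sample point, one option is a Wilson score interval for a strikeout rate. This technique is brought in here for illustration and is not something used in the thread; the function name `wilson_interval` and the default 95% level (`z=1.96`) are assumptions of the sketch.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int,
                    z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly a 95% confidence level.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / trials + z2 / (4 * trials * trials)
    ) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # 7 strikeouts in 22 PA with runners on 2nd and 3rd (from the comment)
    low, high = wilson_interval(7, 22)
    print(f"{low:.1%} to {high:.1%}")
```

For 7 strikeouts in 22 plate appearances the interval runs from roughly 16% to 53%. That range covers both a better-than-usual and a far-worse-than-usual strikeout rate for him, which is the sample-size caution in numerical form.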

Posted by: mehmattski [TypeKey Profile Page] at March 7, 2007 10:26 PM

~~~I'm not quite clear what you're getting at here, but doesn't WPA and some of the other linear weights stuff that Tango, Lichtman, and Dolphin write about in "The Book" address what you're talking about here?~~~

WPA could be the answer to this - or at least a better way to judge how a player is doing to impact a game, as opposed to just using his overall OPS.

Posted by: Steve Lombardi [TypeKey Profile Page] at March 7, 2007 10:30 PM

~~~when a player's strikeouts go down, his OPS and EQA also take big hits.~~~

In the forest, as a whole, maybe so. But, in terms of one tree, we cannot assume this to be true, can we?

Actually, if this were a fact, if his K-rate goes up, should we then expect to see an increase in OPS, etc?

A-Rod's K-Rate went up in 2006 (from 2005) - but his RCAA, etc., went way down.

Or, is the K-rate thing just a one way rule?

Posted by: Steve Lombardi [TypeKey Profile Page] at March 7, 2007 10:44 PM

~~~Players who make contact often usually offset the baserunners they advance with batted balls with the runners they eliminate with batted balls.~~~

Even the ones with very high BABIP?

Posted by: Steve Lombardi [TypeKey Profile Page] at March 7, 2007 10:47 PM

I'm not sure if you've made this point before, Steve, but I looked at A-Rod's K rate and his OPS+ from each of his full seasons, from 1996 on. His K-rate has a very small standard deviation, so even small differences in K-rate (1.5%) are fairly meaningful. Anyway, the correlation between A-rod's K-rate and his OPS+ is... 0.19.
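A correlation like the one quoted here can be computed as below. This is a minimal sketch of the standard Pearson coefficient, not the commenter's actual worksheet. The per-season K-rate and OPS+ values are not given in the thread, so the demo uses toy lists rather than A-Rod's numbers.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy inputs only: one K-rate and one OPS+ value per season would go here.
    print(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

For scale, a correlation of 0.19 squares to about 0.036, so in a simple linear fit K-rate would account for under 4% of the season-to-season variation in OPS+.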

So maybe, for this tree, strikeout rate doesn't matter all that much.

Posted by: mehmattski [TypeKey Profile Page] at March 7, 2007 11:18 PM

I actually thought that his contact rate was going down each year - albeit a small amount - over the last 5 seasons or so. And, that his Power Index was inconsistent during this time as well. (I have to check this tonight.) If true, I think it tells us that there is no direct link, no?

Posted by: Steve Lombardi [TypeKey Profile Page] at March 8, 2007 09:39 AM