Getting WAR Right (3/5/2013)

By Paul Bessire

Tuesday, March 5 at 11:30 PM ET

Before getting deeper into my analysis of, and new method for approaching, Wins Above Replacement, please see my recap of last year's Sloan Sports Analytics Conference (here) for my take on the concept of "analytics." The labels placed on "football people" (or whichever sport) and "analytics people," and the widely discussed great divide between the two groups, permeated yet another SSAC this week, and it's a shame. As I see it, in any industry, the goals of all organizations tend to be similar (make "optimal" decisions). Yet there are those who embrace the major advances in recent technology to help them find the truth about the options in a decision, and those who stubbornly refuse to - whether due to overconfidence in their own experience and abilities, general ignorance or otherwise. Take a look at just about any industry. Those that embrace technology are winning - and generally doing a better job of it than anyone in the sports world. Usually, the smart ones are those who realize that, given the availability of information and the resources to evaluate it, they are not smart enough on their own to ignore the power and truth in that information...

Having recently returned from the Sloan Sports Analytics Conference at MIT, I realized that, among those who I generally believe "get it" (aka anyone on a panel that was not held in the main ballroom), the concept of discussing players relative to their Wins Above Replacement value (i.e. he is a "five win player") is no longer even a debate. It's widely accepted.

There are many merits to the mostly thorough nature of discussing someone's past performance in the context of "WAR" (I will freely admit that I have been paid to develop WAR calculations for over 20 collegiate sports, and those calculations are not what I discuss in my proposal below). However, taking the concept's easiest translation at face value and relying on WAR for all analytical discussions is incredibly dangerous for many reasons:

  • There are disputed/different ways to calculate WAR like here and here.
  • Not only is there a lack of a clear definition for WAR, it is not an easy or intuitive calculation and can be prone to significant error (each of the definitions I just linked to has at least six sections divided into many individual steps).
  • While performance relative to peers (or someone just worse than one's peers) is considered, actual opponents are ignored and, in some iterations, ballparks are also ignored (as we discussed with the "Total QBR" black box, that could add incredible noise to the analysis).
  • It tells a story of what has happened and not what will (or at least could, with some specific level of probability) happen. This is a common issue/misconception with statistics and one that I consistently fight as I/we try to work through what we know to draw reasonable, helpful conclusions.
  • While I would/will contend that it is possible to remove the bias inherent in numbers from correlated entities like teammates, scheme, position, etc. (and ballpark and opponents, but we just discussed that) to create a neutral baseline that puts all players in true context, A) that is not what WAR is and B) as it relates to WAR, that has little to do with what situations a player could be in.

At the Monday Morning Quarterbacking panel at SSAC this year, Herm Edwards refuted the notion that he should have tried the "analytics"-preferred option when going for a two-point conversion because he had Tyler Thigpen as his quarterback. Whether he was exactly right or not (I have yet to research it), it is the right point to make. Context means everything. That a certain result has proven to be Y% likely over the entire history of the league does not mean much when Tyler Thigpen is the quarterback (or even when Tom Brady is the quarterback). Context. (Quick, yet very important sidebar: While I appreciate and understand the value in the work of Brian Burke from and others in the live win probability space, this is why Live ScoreCaster is more appropriate. Instead of looking at historically similar situations, we are literally only concerned about expectations stemming from the actual two teams playing - from that point in the game through its conclusion... 50,000 times.)

As it is stated, "Wins Above Replacement" sounds like putting any specific player on a team that does not have a league-caliber starting player at that position on its roster will result in that team winning X more games than it otherwise would. It's a great concept, but it misses a few steps. First of all, what is X? Is X the player's previous season WAR? In that case, that is essentially what did happen. Secondly, how does X translate to the new schedule and scheme (and ballpark)? Thirdly, how do teammates impact this value? How did they? And, lastly, what impact did the defense behind a pitcher or in front of a hitter have on the numbers?

Context. WAR is great at this as a story-teller when used as a comprehensive measure of previous value by which all players in a sport are compared. In that sense, though, it is like PER in basketball, where all we know is how players stack up against each other and if they are above or below average. Alone, PER is a nonsensical value. WAR obviously gets us a little closer to meaning something, but not as much as one may assume based on the way it is discussed.

But we can (or at least I believe I can) calculate how many wins above replacement a player should mean to a team. We can also calculate how many wins above/below the player to be replaced he should mean, or what that should mean to a team's chances of making the postseason or winning a championship. We do this by finding the baseline, bias-free expectations of who a player truly is and then placing that player into a specific context and seeing what happens (50,000 times). Not only is the entire concept of a true, contextual WAR possible, technology is getting close to calculating it every day for every player in/from any situation.

Those who know the site and frequent the blog are probably following along in this paragraph (or already left the page to find something to wager upon). Since I have done this kind of work for teams and players in the past though and because there is one incredibly extreme player that illustrates this perfectly, I think it is best that I leave the rest to this example. Meet Ervin Santana (please note that this actual analysis was conducted immediately following the 2012 season and is not representative of our actual MLB Preview - which launches on Thursday).

2012 Ervin Santana

This last season, 18.9% of Santana's fly balls allowed were home runs. This is the second-highest rate since such information was first recorded in 2002. Because he is a traditionally fly-ball-oriented pitcher (56.8% of his batted balls allowed were fly balls as opposed to ground balls), this led to 1.97 home runs allowed per nine innings, which is the sixth-highest total in the entire history of MLB.

I read this all as good news. While those numbers do not sound good, they are also unlucky and extremely unlikely to occur again to Santana.

It should be noted that Bronson Arroyo, who has almost identical career HR/9, BABIP, GB/FB, HR/FB and ERA numbers to Ervin Santana (though Santana is younger and has notably better strikeout and K:BB numbers), is one of the players who has a season that ranked ahead of Santana's. In 2011, Arroyo allowed 2.08 HR/9 in 199 innings pitched for the Cincinnati Reds. Remaining in the same ballpark and against a similar schedule in 2012, Arroyo allowed just 1.16 HR/9 (close to his career 1.22 mark) and was worth around three wins above replacement to a playoff team.

Furthermore, among his other seven qualified seasons, Santana's HR/FB ratio never topped 13% and his career HR/9 is 1.24. In 2012, while he played his home games in a pitcher-friendly park in Anaheim, Santana's Angels faced the second-toughest overall schedule in baseball and the second most difficult schedule of any team in MLB as it relates to power-hitting opponents (only Seattle topped LAA in each category) in what was the best overall division in baseball. Additional good news is that Santana maintained above average strikeout rates, average-to-above-average control and topped 2,250 pitches thrown for the eighth (straight) time in his career. Santana has only once dealt with health issues. Since his rookie season, Santana's only two appearances on the Disabled List came in 2009. He still threw 2,300 MLB pitches that season. (Any current concerns about health beyond what the numbers and Santana's age would tell us were not considered for any analysis.)

That's the good news. There are some potential issues (in addition to the high fly ball rate, which is part of who he is and will likely lead to more home runs allowed than the average pitcher no matter where he goes). While his batting average allowed was essentially the same in 2012 as in recent years, Santana's batting average on balls in play (BABIP) was much lower, .241, than the league average of around .300 as well as his career average of .284. His 2012 BABIP was certainly suppressed by his home run total (HR are not considered “in play”), but this suggests that many of his home runs allowed would still translate into hits - likely extra base hits - in a different ballpark situation and without such bad luck. Santana is still a fly ball pitcher so, while we expect HR to go down, his doubles and triples allowed should go up. This is compounded by the fact that the Angels had the best overall defense and best outfield defense in the American League. Defensively, Santana pitching in front of a strong outfield contingent gives him added value.

Baseline Projections

We have discussed ways in which Ervin Santana's bad luck, strength of competition, ballpark, health and defense have impacted his numbers. Before we look at what that means in specific situations, we must leverage all of the information that we know about Santana's history to remove the biases from his recent numbers and create baseline numbers for how Santana would look in a totally neutral situation - against an average MLB lineup, in an average MLB park and with an average defense - as he ages. The chart below evaluates Santana's statistical expectations for his 30-34 year old seasons in such a neutral environment. (Note: Pitches thrown is utilized as a constant here because the number of innings that a pitcher can pitch with those pitches is related to ballpark and defense. Also, such pitch expectations presume that a team tries to pitch Santana as its third starter throughout a full season - given that he may miss time relative to his projected DL chances.)

Ervin Santana 2013-17 in Neutral Environment

2013 30 3,141 0.244 7.54% 1.18% 0.144 0.406 18.03% 2.39 13.3%
2014 31 3,110 0.247 7.56% 1.18% 0.144 0.407 17.78% 2.35 14.3%
2015 32 2,954 0.249 7.57% 1.19% 0.146 0.408 17.53% 2.32 19.3%
2016 33 2,798 0.250 7.58% 1.21% 0.149 0.409 17.03% 2.25 24.2%
2017 34 2,762 0.250 7.58% 1.21% 0.151 0.410 16.03% 2.12 25.4%

* Ignores previous and projected IBB

** DL% is the projected chance that Ervin Santana lands on the Disabled List in a given season (time on DL is weighted towards 15 days, with the possibility of more based on all similar pitchers of similar ages in recorded history).

Ballpark Fit

Knowing what his baseline expectations would be allows us to evaluate those numbers in all 30 MLB ballparks. To evaluate ballpark fit, the chart below incorporates our historical analysis of each ballpark to project Santana's expected OAV and SLG in each park (teams are ranked in ascending order by projected OAV + 2*SLG). Ballpark fit also considers the league that the team plays in (including the Houston Astros in the American League). The top-ranked parks are those where, against neutral league competition and with a neutral defense, Santana's numbers would be expected to look the best. Also, we have accounted for Seattle's Safeco Field, which will bring its fences in for the 2013 season (similar to what we saw with the New York Mets and Citi Field this year).

Ballpark/League Fits for all 30 MLB Stadiums:

Team Rank Team Rank
San Diego Padres 1 Minnesota Twins 16
San Francisco Giants 2 Kansas City Royals 17
Tampa Bay Rays 3 Milwaukee Brewers 18
Los Angeles Dodgers 4 Detroit Tigers 19
New York Mets 5 Los Angeles Angels 20
Pittsburgh Pirates 6 Miami Marlins 21
Seattle Mariners 7 Chicago Cubs 22
Cleveland Indians 8 Cincinnati Reds 23
Philadelphia Phillies 9 New York Yankees 24
St. Louis Cardinals 10 Toronto Blue Jays 25
Oakland Athletics 11 Arizona Diamondbacks 26
Washington Nationals 12 Chicago White Sox 27
Houston Astros 13 Boston Red Sox 28
Atlanta Braves 14 Texas Rangers 29
Baltimore Orioles 15 Colorado Rockies 30
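
The ranking rule described above (ascending order of projected OAV + 2*SLG) can be sketched in a few lines. This is only an illustration: the park names and the OAV/SLG values below are hypothetical placeholders, not the projections behind the chart, and `fit_score` and `rank_parks` are names made up for this example.

```python
# Hypothetical projected (OAV, SLG) for a pitcher in three made-up parks.
# These numbers are placeholders for illustration, not the article's projections.
projections = {
    "Park A": (0.236, 0.385),
    "Park B": (0.244, 0.406),
    "Park C": (0.258, 0.441),
}


def fit_score(oav, slg):
    """Ballpark fit score: OAV + 2 * SLG (lower is better for the pitcher)."""
    return oav + 2 * slg


def rank_parks(park_projections):
    """Return {park: rank}, where rank 1 is the lowest (best) fit score."""
    ordered = sorted(
        park_projections,
        key=lambda park: fit_score(*park_projections[park]),
    )
    return {park: rank for rank, park in enumerate(ordered, start=1)}


ranks = rank_parks(projections)
for park, rank in sorted(ranks.items(), key=lambda item: item[1]):
    print(rank, park, round(fit_score(*projections[park]), 3))
```

Doubling SLG relative to OAV weights extra-base damage more heavily than singles, which matters for a fly ball pitcher.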

Current Team Fit

As an extension of this analysis, we can utilize our proprietary simulation engine (example simulation presentation attached) to illustrate what Santana would mean to each MLB team in 2013. To do this, we first play the 2013 MLB season 50,000 times without Santana on any roster. This gives us a control group to gauge the current expected strength of each team.
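
To make the idea of "playing the season" many times concrete, here is a deliberately simplified sketch. It is not the proprietary engine: it assumes every game is an independent coin flip with one fixed win probability per team, and it uses a hypothetical fixed win threshold as a stand-in for making the playoffs (a real engine would simulate actual matchups, rotations, ballparks and standings). The function name and all parameters are assumptions for illustration.

```python
import random


def simulate_seasons(win_prob, n_seasons=50_000, games=162,
                     playoff_wins=90, seed=0):
    """Toy Monte Carlo: average wins and share of seasons reaching a win threshold.

    Assumptions (not from the article): each game is an independent trial with
    the same win probability, and `playoff_wins` is a crude playoff proxy.
    """
    rng = random.Random(seed)  # seeded so results are reproducible
    total_wins = 0
    playoff_seasons = 0
    for _ in range(n_seasons):
        wins = sum(rng.random() < win_prob for _ in range(games))
        total_wins += wins
        if wins >= playoff_wins:
            playoff_seasons += 1
    return total_wins / n_seasons, playoff_seasons / n_seasons


# Smaller run than 50,000 seasons so the example finishes quickly.
avg_wins, playoff_share = simulate_seasons(0.55, n_seasons=5_000)
print(round(avg_wins, 1), round(playoff_share, 3))
```

Running the same routine twice, once per roster configuration, and comparing the two outputs is the control-group idea described above.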

Here are the results of the simulation without Santana which provide expected win totals and likelihood (from those 50,000 simulations) of making the playoffs:

A league without Ervin Santana in 2013:

Team Projected Wins Playoff% Team Projected Wins Playoff%
Arizona Diamondbacks 79.6 33.0% Milwaukee Brewers 87.3 41.9%
Atlanta Braves 86.3 38.6% Minnesota Twins 63.0 1.2%
Baltimore Orioles 80.7 20.8% New York Mets 71.4 12.0%
Boston Red Sox 80.5 21.0% New York Yankees 92.0 77.7%
Chicago Cubs 76.2 5.5% Oakland Athletics 85.1 29.5%
Chicago White Sox 85.1 42.5% Philadelphia Phillies 88.6 40.8%
Cincinnati Reds 87.7 41.3% Pittsburgh Pirates 71.1 3.2%
Cleveland Indians 68.1 4.1% San Diego Padres 74.3 22.5%
Colorado Rockies 64.4 2.8% San Francisco Giants 91.8 59.7%
Detroit Tigers 90.6 68.1% Seattle Mariners 69.2 3.9%
Houston Astros 67.0 2.0% St. Louis Cardinals 93.6 68.4%
Kansas City Royals 71.6 5.2% Tampa Bay Rays 88.0 68.2%
Los Angeles Angels 93.7 74.9% Texas Rangers 92.5 69.9%
Los Angeles Dodgers 87.1 45.2% Toronto Blue Jays 74.1 7.5%
Miami Marlins 76.5 18.9% Washington Nationals 91.6 65.0%

After this, we run 30 separate simulations of 50,000 2013 seasons where Santana takes the place of the weakest pitcher for each team. Here are the results for each team when those simulations are conducted:

Ervin Santana replacing worst rotation member on each team (30 different seasons run)

Team Projected Wins Playoff Prob Team Projected Wins Playoff Prob
Arizona Diamondbacks 79.2 31.7% Milwaukee Brewers 87.9 44.3%
Atlanta Braves 87.3 40.1% Minnesota Twins 66.2 3.6%
Baltimore Orioles 82.5 33.6% New York Mets 72.6 15.8%
Boston Red Sox 80.0 18.5% New York Yankees 90.5 73.6%
Chicago Cubs 76.3 5.6% Oakland Athletics 84.4 29.3%
Chicago White Sox 86.4 47.0% Philadelphia Phillies 88.2 39.9%
Cincinnati Reds 87.1 40.3% Pittsburgh Pirates 73.4 4.9%
Cleveland Indians 71.3 8.4% San Diego Padres 76.6 30.3%
Colorado Rockies 66.5 4.9% San Francisco Giants 89.6 54.4%
Detroit Tigers 91.3 70.2% Seattle Mariners 70.1 6.0%
Houston Astros 69.1 4.2% St. Louis Cardinals 91.0 63.8%
Kansas City Royals 75.0 18.8% Tampa Bay Rays 87.4 67.9%
Los Angeles Angels 93.0 73.1% Texas Rangers 90.8 65.5%
Los Angeles Dodgers 86.2 45.6% Toronto Blue Jays 76.2 19.2%
Miami Marlins 76.8 19.2% Washington Nationals 91.3 64.5%

In other words, Ervin Santana means an improvement of 3.4 wins and a 13.6% greater chance of making the postseason for the Kansas City Royals. However, on the San Francisco Giants in 2013, even though his numbers would improve playing at a pitcher-friendly park, Santana would actually cost San Francisco 2.2 wins and a 5.3% chance at the playoffs. Here is the win and playoff probability impact for each team:

Adding Santana to 2013 MLB Rotation:

Team Win Change Playoff Change Team Win Change Playoff Change
Kansas City Royals 3.4 13.6% Miami Marlins 0.3 0.3%
Cleveland Indians 3.2 4.3% Chicago Cubs 0.1 0.1%
Minnesota Twins 3.2 2.4% Washington Nationals (0.3) -0.5%
Pittsburgh Pirates 2.3 1.7% Philadelphia Phillies (0.4) -0.9%
San Diego Padres 2.3 7.8% Arizona Diamondbacks (0.4) -1.3%
Toronto Blue Jays 2.1 11.7% Boston Red Sox (0.5) -2.5%
Houston Astros 2.1 2.2% Tampa Bay Rays (0.6) -0.3%
Colorado Rockies 2.1 2.1% Cincinnati Reds (0.6) -1.0%
Baltimore Orioles 1.8 12.8% Oakland Athletics (0.7) -0.2%
Chicago White Sox 1.3 4.5% Los Angeles Angels (0.7) -1.8%
New York Mets 1.2 3.8% Los Angeles Dodgers (0.9) 0.4%
Atlanta Braves 1.0 1.5% New York Yankees (1.5) -4.1%
Seattle Mariners 0.9 2.1% Texas Rangers (1.7) -4.4%
Detroit Tigers 0.7 2.1% San Francisco Giants (2.2) -5.3%
Milwaukee Brewers 0.6 2.4% St. Louis Cardinals (2.6) -4.6%
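
Each row in the chart above is simply the "with Santana" result minus the "without Santana" result. As a small check, the sketch below redoes that subtraction for three teams using the projected wins and playoff percentages from the two preceding charts; the variable and function names are made up for this example.

```python
# (projected wins, playoff %) taken from the two simulation charts above.
without_santana = {
    "Kansas City Royals": (71.6, 5.2),
    "San Francisco Giants": (91.8, 59.7),
    "Los Angeles Dodgers": (87.1, 45.2),
}
with_santana = {
    "Kansas City Royals": (75.0, 18.8),
    "San Francisco Giants": (89.6, 54.4),
    "Los Angeles Dodgers": (86.2, 45.6),
}


def impact(before, after):
    """Return {team: (win change, playoff % change)}, rounded to one decimal."""
    return {
        team: (
            round(after[team][0] - before[team][0], 1),
            round(after[team][1] - before[team][1], 1),
        )
        for team in before
    }


changes = impact(without_santana, with_santana)
for team, (wins, playoff) in changes.items():
    print(f"{team}: {wins:+.1f} wins, {playoff:+.1f}% playoff chance")
```

The Dodgers row shows why both columns are worth reporting: the projected win total drops slightly while the playoff percentage ticks up, because playoff odds also depend on how every other team's projection moves in that simulation.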

Wins Above Replacement

While the approach above is a great way to understand Santana's impact in each team's current context, it makes many assumptions about pitching rotations and does not really take into account whether or not Santana is an upgrade over other versions of himself (different ballpark, different defense, etc.) in each scenario. Wins Above Replacement (WAR) is a relatively new concept in the analytical and sports worlds, but it can be a great way to understand how much better a player is than throwing the top AAA player into the mix. It is also something that simulation can address in a very straightforward manner without applying any error-prone, rigid formulas.

For the final piece of this analysis, we will recreate the exercise above except that, instead of swapping out the worst starting pitcher for Ervin Santana in 30 different iterations of the season, we will put the same “replacement level” player in this spot. Comparing those results to what each team looks like with Santana in the rotation will give us Santana's team- and ballpark-specific WAR. On average, he may be projected to be worth 1.6 WAR in 2013, but that has a relatively wide variance. This figure is also great for evaluating future teams beyond this year that may be good fits for Santana but may not have the financial or roster situation that makes him a good fit for 2013.

Here is each team's projected record from 30 different seasons where it had a replacement level pitcher in the rotation:

Replacement Level Player replacing worst rotation member on each team (30 different seasons run)

Team Projected Wins Playoff% Team Projected Wins Playoff%
Arizona Diamondbacks 77.7 26.4% Milwaukee Brewers 86.2 38.6%
Atlanta Braves 85.5 36.4% Minnesota Twins 64.8 3.3%
Baltimore Orioles 80.2 18.7% New York Mets 71.3 11.8%
Boston Red Sox 79.0 17.6% New York Yankees 89.1 64.5%
Chicago Cubs 75.9 4.6% Oakland Athletics 83.5 25.1%
Chicago White Sox 84.8 39.9% Philadelphia Phillies 87.4 39.7%
Cincinnati Reds 85.6 37.0% Pittsburgh Pirates 71.3 3.8%
Cleveland Indians 67.8 3.9% San Diego Padres 74.1 22.0%
Colorado Rockies 66.2 4.6% San Francisco Giants 87.3 40.4%
Detroit Tigers 88.4 63.1% Seattle Mariners 67.6 2.2%
Houston Astros 67.0 2.0% St. Louis Cardinals 89.9 59.3%
Kansas City Royals 72.0 7.1% Tampa Bay Rays 85.8 37.5%
Los Angeles Angels 92.5 69.9% Texas Rangers 90.7 64.9%
Los Angeles Dodgers 84.7 29.2% Toronto Blue Jays 74.3 8.0%
Miami Marlins 76.0 18.6% Washington Nationals 89.6 58.8%

That leads us to these rankings for Ervin Santana's team-specific WAR:

Team Team WAR WAR Playoff Change Team Team WAR WAR Playoff Change
Cleveland Indians 3.5 4.5% Los Angeles Dodgers 1.5 16.4%
Kansas City Royals 3.0 11.7% Cincinnati Reds 1.5 3.3%
Detroit Tigers 2.9 7.1% Arizona Diamondbacks 1.5 5.3%
San Diego Padres 2.5 8.3% Minnesota Twins 1.4 0.3%
Seattle Mariners 2.5 3.8% New York Yankees 1.4 9.1%
Baltimore Orioles 2.3 14.9% New York Mets 1.3 4.0%
San Francisco Giants 2.3 14.0% St. Louis Cardinals 1.1 4.5%
Pittsburgh Pirates 2.1 1.1% Boston Red Sox 1.0 0.9%
Houston Astros 2.1 2.2% Oakland Athletics 0.9 4.2%
Toronto Blue Jays 1.9 11.2% Philadelphia Phillies 0.8 0.2%
Atlanta Braves 1.8 3.7% Miami Marlins 0.8 0.6%
Washington Nationals 1.7 5.7% Los Angeles Angels 0.5 3.2%
Milwaukee Brewers 1.7 5.7% Chicago Cubs 0.4 1.0%
Tampa Bay Rays 1.6 30.4% Colorado Rockies 0.3 0.3%
Chicago White Sox 1.6 7.1% Texas Rangers 0.1 0.6%
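
The team-specific WAR values above are the projected wins with Santana minus the projected wins with the replacement level pitcher in the same rotation spot. The sketch below redoes that subtraction for three teams and then averages all 30 team WAR values from the chart, which reproduces the 1.6 average quoted earlier. The names `team_war` and `all_team_war` are made up for this example.

```python
from statistics import mean

# Projected wins from the "Santana" and "Replacement Level Player" charts above.
wins_with_santana = {
    "Cleveland Indians": 71.3,
    "Kansas City Royals": 75.0,
    "Texas Rangers": 90.8,
}
wins_with_replacement = {
    "Cleveland Indians": 67.8,
    "Kansas City Royals": 72.0,
    "Texas Rangers": 90.7,
}


def team_war(with_player, with_replacement):
    """Team-specific WAR: projected wins with the player minus with a replacement."""
    return {
        team: round(with_player[team] - with_replacement[team], 1)
        for team in with_player
    }


war = team_war(wins_with_santana, wins_with_replacement)
print(war)

# All 30 team WAR values as listed in the chart above, left column then right.
all_team_war = [
    3.5, 3.0, 2.9, 2.5, 2.5, 2.3, 2.3, 2.1, 2.1, 1.9, 1.8, 1.7, 1.7, 1.6, 1.6,
    1.5, 1.5, 1.5, 1.4, 1.4, 1.3, 1.1, 1.0, 0.9, 0.8, 0.8, 0.5, 0.4, 0.3, 0.1,
]
average_war = round(mean(all_team_war), 1)
print(average_war)
```

The spread between the top and bottom of that list (3.5 wins in Cleveland, 0.1 in Texas) is the "relatively wide variance" that a single league-wide WAR number hides.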

As usual, if you have any of your own comments about this article or suggestions about how to improve the site, please do not hesitate to contact us at any time. We respond to every support contact as quickly as we can (usually within a few hours) and are very amenable to suggestions. I firmly believe that open communication with our customers and user feedback is the best way for us to grow and provide the types of products that will maximize the experience for all. Thank you in advance for your suggestions, comments and questions.