June 17th, 2024

Read the not very interesting introduction and the first chapter. The first chapter is on batting average and why it is not the super-important stat that many believe it is. It was a stat created in the early days (pre-modern), when the rules were not set in stone. Hitters could call pitch locations, the number of balls for a walk and strikes for a strikeout varied, and homers were rare. Batting average does not include critical items like walks, hit-by-pitches, and sac flies. Walks alone take up a huge number of plate appearances. Thus the on-base percentage (OBP) has a higher correlation with runs scored than batting average does. Slugging and OPS have an even higher correlation, but the author doesn’t seem to think they’re as good as OBP. There are chapters later on these stats, so we have to wait and see why. Tradition prevails, though, and people still get batting titles and MVPs for highest average, even if they were far from the best hitter and player.

June 18th, 2024

The next chapter is on another dumb stat, pitcher wins. It never made any sense to me. Why does the pitcher get the credit and the blame? He doesn’t control fielding or runs scored. He doesn’t even bat anymore. Back in the day the pitcher played 9 innings, but even then it’s dumb. Pitching rotation has changed a lot, too. Each starter simply pitches in fewer games. ERA and walks allowed are much more important stats that tell you how a pitcher performs. Kill this stat. There’s a later chapter just for saves. The next chapter is on RBIs and I am curious as to the problem with those. They are probably useless because a batter cannot control how other players bat. It's interesting, but not defining like OBP.

June 19th, 2024

As I thought, the dislike of RBIs comes from the fact that a player has no control over the team. Runs should be a team stat. A player has no control over the batting order, who is batting in front of them, who else advanced runners or who stole bases, etc. There are many variables which have nothing to do with the individual at bat. I do find RBIs to be an interesting statistic, but it is mostly a stat of chance and thus is mere trivia. The author says Runs Created has a higher year-to-year correlation (this season’s Runs Created to next season’s) than RBIs do, but only 60%. I don’t know much about Runs Created and it doesn’t seem to have its own chapter in the “good” section. So I am now in agreement, down with RBIs.

June 20th, 2024

The author really hates saves. They don’t really make sense to me, but I never paid much attention to pitcher stats besides ERA. A save is credited to a closer who, under some specific circumstances, doesn’t lose a lead. It’s very arbitrary, as you need a certain lead or certain men on base, and you must be the last pitcher. If the two previous relievers don’t give up the lead, they get nothing. Thus it inflates the importance (and cost) of the 9th inning pitcher, even if he has an easier setup than earlier pitchers. The author really hates it because it led to managers not using their closer (often their best reliever) in non-save situations, even when it could turn the game around. In that case, it’s hurting the team. Not to mention it is not a stat of an individual’s, but a team’s, worth. You can also have crappy pitchers get saves just due to circumstance and thus have an inflated reputation, or a good pitcher kept out of the Hall of Fame due to low save count. It does seem like a bad stat.

June 21st, 2024

I was confused when I saw the next chapter was on stolen bases. Stealing bases is cool. How can the author have a problem with them? Well, he doesn’t have a problem with stealing per se. The problem is that it’s half of a stat. The important stat is the ratio of stolen bases to times caught stealing. If a runner has 50 stolen bases but got caught stealing 40 times, he only has a 56% success rate. The other problem is with the decision to steal. It’s hard to get a runner on base. Losing a runner because of a failed steal is much more costly than the benefit gained from advancing the runner. It’s almost 4x more costly. That means for a runner to have gained anything from stealing, he’d need 4x as many SB as CS. Otherwise he was a detriment to the team. Plus, the decision has to be strategic. If you have a slugger at bat who has a good chance of homering, do not under any circumstances attempt a steal. If you have an inattentive pitcher and some weak hitters coming up, it may be worth the risk to get into scoring position. All good points that made me reevaluate the steal.
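A quick sketch to check the steal math for myself. It takes the "almost 4x" cost figure from above at face value; the function names and the flat `cost_ratio` number are my own assumptions, and the real cost surely shifts with the base, the outs and the era.

```python
def success_rate(sb: int, cs: int) -> float:
    """Fraction of steal attempts that succeeded."""
    attempts = sb + cs
    if attempts == 0:
        raise ValueError("no steal attempts")
    return sb / attempts


def break_even_rate(cost_ratio: float) -> float:
    """Success rate at which stealing neither helps nor hurts.

    cost_ratio: how many times more a caught stealing costs than a
    stolen base gains (assumed flat here, which is a simplification).
    Solves p * 1 == (1 - p) * cost_ratio for p.
    """
    return cost_ratio / (1.0 + cost_ratio)


def net_value(sb: int, cs: int, cost_ratio: float) -> float:
    """Net gain measured in 'stolen bases'. Negative means he hurt the team."""
    return sb - cost_ratio * cs


# The 50 SB / 40 CS runner from the entry above, with an assumed 4x cost.
print(f"success rate: {success_rate(50, 40):.1%}")
print(f"break-even:   {break_even_rate(4.0):.0%}")
print(f"net value:    {net_value(50, 40, 4.0):+.1f} stolen-base equivalents")
```

With a 4x cost the break-even works out to 4 successes in every 5 attempts, so the 50/40 runner is far underwater.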

June 24th, 2024

The next stat I’m not familiar with at all. Possibly the author’s wish came true and the stat is dead. This is fielding percentage: putouts plus assists, divided by putouts plus assists plus errors, or (P + A) / (P + A + E). The author has two problems with this. Three, I guess. First, he does not like errors. They are dependent on one person’s judgment and thus not useful for mathematics. Not only are errors subjective, they are only charged when a fielder actually attempts a play. Thus a player who does less is rewarded while a player who tries something and fails gets an error. They encourage a lack of effort. Second, fielding percentage only counts work done by the fielder. All the work they didn’t do is ignored. Again, it’s a partial stat. A similar stat, Range Factor, is P + A per inning played times 9, like an ERA. While the subjective error is gone, we still only see work done (that resulted in an out) and not total tries. Lastly, these stats view all plays as equal, which we know they are not. An easy fly ball is not the same as a jump and catch that prevents a home run, yet these stats do not weigh plays. So the fielding percentage is useless in determining defensive skill. A good defensive stat is hard to create, but some efforts have been made. Some stats will be brought up later in the book.
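To see the "player who does less is rewarded" problem in numbers, here is a small sketch of both formulas as described above. The two shortstops and their stat lines are made up for illustration.

```python
def fielding_percentage(putouts: int, assists: int, errors: int) -> float:
    """(P + A) / (P + A + E)."""
    chances = putouts + assists + errors
    if chances == 0:
        raise ValueError("no chances recorded")
    return (putouts + assists) / chances


def range_factor(putouts: int, assists: int, innings: float) -> float:
    """(P + A) per nine innings played, scaled like an ERA."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return 9 * (putouts + assists) / innings


# Hypothetical shortstops over the same 1200 innings.
cautious = {"putouts": 200, "assists": 400, "errors": 6}
rangy = {"putouts": 240, "assists": 460, "errors": 18}

for name, line in (("cautious", cautious), ("rangy", rangy)):
    fp = fielding_percentage(**line)
    rf = range_factor(line["putouts"], line["assists"], 1200)
    print(f"{name}: FPCT {fp:.3f}, RF {rf:.2f}")
```

The rangy fielder makes 100 more outs but ends up with the worse fielding percentage, and neither number says anything about the balls nobody got to.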

June 25th, 2024

The last chapter of the dumb stats section is about “myths” in baseball. First was the concept of the clutch hitter. Data shows that a good hitter is just a good hitter, no matter the situation. Even the best like Jeter perform approximately the same in high-pressure situations or postseason. The point I had to make was that some people may crack under pressure. The author addresses this by stating that the majors would have weeded out these individuals. Cracking under pressure very well may happen in lower leagues where the skill ceiling is lower, but not in MLB. Another myth was lineup protection. This is where you put a good hitter behind a good hitter so they don’t intentionally walk the first one. Using earlier data, putting someone on base is always a bad choice. Maybe before the NL had the DH, when pitchers had to hit, it had uses, but not now. Again, this is for MLB. Not to mention that intentional walking is a bitch move. Another bad idea is the sac bunt. It’s just better to try to get on base because moving a runner up a base does not make up for the extra out. This is a generalization and maybe there is a time for it, like when the fielders leave a big gap, but then the sac bunt is no longer a sac but just a hit. That’s the gist.

June 26th, 2024

The first chapter of “good” stats is awarded to OBP. It’s already been discussed, but this is superior to batting average. Hits are good, but if you consider a batter’s job as not making an out, then the average is incomplete. OBP looks at the full picture. A career OBP above .400 is godly. Ted Williams had the best season at above .550 until Barry Bonds reached .600. Harper has had a season above .450 and Juan Soto (now a Yankee) has a top-20 career mark above .420. The league average in the past few seasons has been between .310 and .320. Next chapter adds some meat to OBP with OPS.
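For my own reference, a sketch of the two formulas side by side, using the standard definition of OBP (times on base over at-bats plus walks, HBP and sac flies). The two hitters and their numbers are invented.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """Hits per at-bat. Walks, HBP and sac flies are invisible here."""
    return hits / at_bats


def on_base_percentage(hits: int, walks: int, hit_by_pitch: int,
                       at_bats: int, sac_flies: int) -> float:
    """(H + BB + HBP) / (AB + BB + HBP + SF)."""
    times_on_base = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return times_on_base / chances


# Hypothetical season lines.
contact_hitter = dict(hits=180, walks=30, hit_by_pitch=2, at_bats=600, sac_flies=5)
patient_hitter = dict(hits=146, walks=100, hit_by_pitch=5, at_bats=520, sac_flies=5)

for name, line in (("contact", contact_hitter), ("patient", patient_hitter)):
    ba = batting_average(line["hits"], line["at_bats"])
    obp = on_base_percentage(**line)
    print(f"{name}: BA {ba:.3f}, OBP {obp:.3f}")
```

The contact hitter wins on average, but the patient hitter makes far fewer outs once the walks are counted.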

June 27th, 2024

Now we get into slugging and OPS, but mostly slugging. It’s the same as batting average in that it only counts hits, but it weighs hits based on the total bases gotten. Thus someone who homers a lot will have a higher SLG than someone who rarely does. It doesn’t give the complete picture like OBP, but it does give a good idea of someone’s power. Thorough batter ratings give the triple-slash, or BA / OBP / SLG. OPS is OBP plus SLG, but this doesn’t make much sense mathematically (different denominators) and it muddies the water. If two batters have equal OPS but one has a higher OBP, then he is better. Getting on first is the hardest part of the game and whoever does it most is best. A point of OBP is at least twice as important as a point of SLG. However, OPS is a good team stat. For the team, it has a better correlation to runs than OBP and SLG individually. So it’s still good for something.
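A sketch of slugging and OPS, plus a toy tiebreaker for the "equal OPS, higher OBP wins" point. The `obp_weighted` function is not a real stat, just my way of applying the "at least twice as important" remark, and the two batters are made up.

```python
def slugging(singles: int, doubles: int, triples: int,
             home_runs: int, at_bats: int) -> float:
    """Total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def ops(obp: float, slg: float) -> float:
    """On-base plus slugging: a plain sum, despite the different denominators."""
    return obp + slg


def obp_weighted(obp: float, slg: float, obp_weight: float = 2.0) -> float:
    """Hypothetical tiebreaker: count each point of OBP obp_weight times."""
    return obp_weight * obp + slg


# 100 singles, 30 doubles, 5 triples, 25 homers in 500 at-bats.
print(f"SLG: {slugging(100, 30, 5, 25, 500):.3f}")

# Two hypothetical batters with the same OPS.
on_base_guy = {"obp": 0.400, "slg": 0.450}
power_guy = {"obp": 0.330, "slg": 0.520}

for name, line in (("on-base guy", on_base_guy), ("power guy", power_guy)):
    print(f"{name}: OPS {ops(**line):.3f}, weighted {obp_weighted(**line):.3f}")
```

Both come out at an .850 OPS, but weighting OBP double puts the on-base guy ahead, which matches the argument above.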

June 28th, 2024

Read two short chapters, skipping the ERA chapter for Monday. The first one is about the attempt to make a “one number” stat that gives a view of a hitter. This has led to wOBA (weighted on-base average), which the author says he uses but never in isolation. It’s similar to OBP but it is weighted, similar to SLG but more precise. Walks, singles, HBP, etc. each have their own factor. This factor changes depending on the environment, meaning how all players are doing. Its scale is similar to OBP, so if the number looks good as an OBP, it’s a good wOBA. (I pronounce it “whoa-buh”). I’ve seen xwOBA, too. The other stat is wRC+, and honestly I forget what it is. Weighted runs created? It’s similar except it adjusts the values based on the park played in and normalizes everything so that league average is 100. It’s probably fine but wOBA is easier to grasp. The other chapter was on WPA or win probability added. This is not a stat of skill but of circumstance. It says how useful a certain play was to winning a game. For example, a 2-run home run will have higher WPA in a 1-2 game than in a 6-0 game. The inverse is also true: the pitcher who threw that pitch would get negative WPA. The losing team ends with -0.5 and the winning team with 0.5, and each player will have his own numbers that contribute to that total. A neat way to tell the story of a game, but useless for prediction and comparison. It’s a narrative.
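A toy sketch of the WPA bookkeeping, to convince myself that the team totals land on +0.5 and -0.5. The three plays and their win probabilities are invented, and the dictionary layout is my own assumption, not how any real win-probability feed is structured.

```python
from collections import defaultdict


def wpa_totals(plays, start_home_wp=0.5):
    """Credit each play's swing in win probability to batter and pitcher.

    Each play records the home team's win probability after the play.
    The batter gets the swing from his team's point of view and the
    pitcher gets the opposite.
    """
    players = defaultdict(float)
    teams = {"home": 0.0, "away": 0.0}
    before = start_home_wp
    for play in plays:
        after = play["home_wp_after"]
        swing = after - before
        batting = play["batting_side"]
        fielding = "away" if batting == "home" else "home"
        if batting == "away":
            swing = -swing  # flip so it is from the batting team's view
        players[play["batter"]] += swing
        players[play["pitcher"]] -= swing
        teams[batting] += swing
        teams[fielding] -= swing
        before = after
    return dict(players), teams


# Hypothetical three-play "game" that the home team wins.
plays = [
    {"batting_side": "away", "batter": "away leadoff",
     "pitcher": "home starter", "home_wp_after": 0.40},
    {"batting_side": "home", "batter": "home slugger",
     "pitcher": "away starter", "home_wp_after": 0.75},
    {"batting_side": "away", "batter": "away pinch hitter",
     "pitcher": "home closer", "home_wp_after": 1.00},
]

players, teams = wpa_totals(plays)
for name, value in players.items():
    print(f"{name}: {value:+.2f}")
print(teams)
```

Every game starts at 50/50 and ends at 100/0, so the winners always split +0.5 among themselves no matter how the plays unfold.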

July 1st, 2024

The ERA chapter is a bit hard to follow. I think the trickiness comes from the fact that it’s hard to separate the pitcher from the defense. The pitcher controls his throws, but cannot control fielding. He can try to force a groundout, but the defense can err. He can strike batters out and avoid walks, but even this is subject to the umpire. Replace the umpire with a computer, please! ERA is a good comparative stat, but it doesn’t tell you enough about how good a pitcher is. The subjectiveness of umpires can make a pitcher look worse (or better I guess) and that of errors can make a pitcher look better. There was something about BABIP (batting average on balls in play) but it made no sense. The other stat is FIP, fielding independent pitching, which ignores the results of balls put in play. It is based on HR, BB, HBP, and K. You probably need to use both ERA and FIP together to see how a pitcher did. It is a team game after all. Very confusing stuff, no wonder people stick with the simple stats.
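Since this chapter lost me, here is a sketch of the three formulas as they are commonly published, not taken from the book. The FIP constant changes every season to put FIP on the ERA scale; the 3.10 default is only a ballpark assumption, and the pitcher's line is made up. Innings are plain decimals here, so the box-score ".1" and ".2" notation for thirds of an inning is not handled.

```python
def era(earned_runs: int, innings: float) -> float:
    """Earned runs allowed per nine innings."""
    return 9 * earned_runs / innings


def fip(home_runs: int, walks: int, hit_by_pitch: int, strikeouts: int,
        innings: float, constant: float = 3.10) -> float:
    """Fielding independent pitching, using only HR, BB, HBP and K.

    constant is a league- and season-specific scaling term; the default
    is an assumed typical value, not an official figure.
    """
    numerator = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return numerator / innings + constant


def babip(hits: int, home_runs: int, at_bats: int,
          strikeouts: int, sac_flies: int) -> float:
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    return (hits - home_runs) / (at_bats - strikeouts - home_runs + sac_flies)


# Hypothetical starter: 180 IP, 70 ER, 20 HR, 50 BB, 5 HBP, 180 K,
# with 160 hits allowed in 680 at-bats and 5 sac flies.
print(f"ERA   {era(70, 180):.2f}")
print(f"FIP   {fip(20, 50, 5, 180, 180):.2f}")
print(f"BABIP {babip(160, 20, 680, 180, 5):.3f}")
```

ERA counts everything that happened behind the pitcher; FIP throws out every ball in play and keeps only what the pitcher and batter settled between themselves.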

July 2nd, 2024

Defense is a difficult thing to gauge, probably more difficult than pitching. The fact is that it is subjective because you need to decide if a player SHOULD have made a play. You need to determine how an average player would perform and whether the defender’s playing is helping or hurting the team. The book said this in many, many more words. While there is no perfect method, there are two decent ones that only improve as more data is received through modern play-by-play data. The two are UZR, ultimate zone rating, and DRS, defensive runs saved. They are different methods of doing the same thing. They divide the field into zones and assign each zone to one or more positions. It's a map that says whether someone should be able to get there and play or catch the ball. Obviously it is affected by positioning, which can be a managerial decision. Complex stuff, but maybe it’s useful. I can’t say I’ve seen it reported. Plenty of these other sabermetric stats at least make the jumbotron. I’ll have to look harder.

I forgot to talk about catchers. Their job is hard to quantify because they tend to call the game. Also, a catcher with a good reputation for throwing out base stealers will eventually not have many steal attempts against him, thus dropping his numbers. The passed ball vs wild pitch can also be subjective, like the error. But I learned something very interesting: catcher framing. It sounds like cheating. The umpire cannot see the ball through the catcher, so the catcher can attempt to make a ball look like a strike at the edge of the plate by how he handles it, or frames it. That could explain a lot of bad calls by umpires. This skill would be completely obsolete if the computer were used to call the balls and strikes. I need to keep an eye out for this when watching a game.

July 3rd, 2024

The last chapter of the section is about WAR, wins above replacement. This is a metric, not a stat, that compares the player to a baseline, similar to UZR or DRS. Instead of a league average player, the baseline is a AAA player. If a player with a WAR of 2 is injured for the season and replaced by a minor leaguer, that team would be expected to win 2 fewer games in the season. For example, Bryce Harper is on the 10-day injured list and he has a WAR of 3.7 for 2024, let’s call it 4. With Kody Clemens in his spot for let’s say 8 games, the Phillies will win 8/162*4 or 0.2 fewer games that season. WAR is not a single calculation. Any institution can use its own public or private method to determine value. More or less you weigh everything they do (or fail to do), sum it and measure it against the baseline. It’s more controversial for pitchers because there are two schools of thought on how to value them. Do you value the pitcher who lets guys on base but doesn’t allow runs, or do you value the guy who keeps guys off base but has a few more runs? The runs prevented number can vary quite a bit depending on how you view this and there does not seem to be a peace treaty in the making. WAR is not an absolute but a relative scale. I think of it like distance vs temperature. 20 feet is twice as long as 10 feet, but 70°F is not “twice as hot” as 35°F. A WAR value is only meaningful as the difference between two values. If you change the baseline, two players will still have a WAR difference of 2.
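The Harper arithmetic and the baseline point, as a quick sketch. The helper names are mine, it assumes WAR accrues evenly across a 162-game season (a simplification), and the two "true value" numbers at the bottom are invented.

```python
def wins_lost(season_war: float, games_missed: int, season_games: int = 162) -> float:
    """Expected wins lost while a replacement-level player fills in.

    Assumes the player's WAR is spread evenly over the season.
    """
    return season_war * games_missed / season_games


def war_against(value_in_wins: float, baseline_in_wins: float) -> float:
    """A player's value measured relative to whatever baseline is chosen."""
    return value_in_wins - baseline_in_wins


# Harper rounded to 4 WAR, out for 8 games.
print(f"{wins_lost(4, 8):.2f} wins")

# Two hypothetical players under two different baselines.
star, regular = 5.0, 3.0
for baseline in (0.0, 1.5):
    gap = war_against(star, baseline) - war_against(regular, baseline)
    print(f"baseline {baseline}: gap {gap}")
```

Moving the baseline shifts both players by the same amount, so the gap between them stays at 2, which is the temperature-scale idea from above.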

July 8th, 2024

The last section starts with a chapter about the Baseball Hall of Fame. It’s mostly the author complaining about how certain people with poor “good” stats are in while people who deserve to be in are not and are no longer on the ballot. I’m sure the complaints are justified, but does the Baseball Hall of Fame really matter? Just look up top X players by Y on Wikipedia. The author also likes to use WAR for this.

July 9th, 2024

This last section of the book feels more like an appendix than something that’s part of the “narrative”. The next chapter is about scouts. The author says scouts use good stats but spends a lot of time just describing what scouts do and how they choose who should be drafted. It’s interesting but doesn’t seem relevant to the overarching theme. Oh well.

July 10th, 2024

Next chapter is about Statcast, which was pretty new at the time the book was written. Pretty interesting stuff. You watch a game now and you see all the pitching information and exit velocity from the bat. That’s all tracked by radar. It also tracks where the ball is essentially at all times. I didn’t know there was also an optical system that tracks player locations. The data can tell you accurately the range of a fielder or how fast someone was running. Machine learning can identify the pitch types each pitcher throws. Definitely a good asset for stats, providing terabytes of data. A lot of that is probably proprietary and does not get public release. I wonder if all data is available to all teams, or if each team is limited to data on its own players. The latter seems more fair.

July 11th, 2024

The book ends with a chapter on “the future” and an epilogue. Not much to say here. The last chapter could have just said now that they have data, they will use it. Pretty dull end to the book. The data will be (and already is) used to evaluate players and even determine if pitchers are fatiguing. The epilogue tells the author’s story as an analyst for the Blue Jays before Big Data and how nowadays he wouldn’t even get an interview for the job. Master’s degrees and PhDs are required for this work now. And anyone praising the RBI or the Win is a dinosaur.

Some bullet points for my future self: