MLB faces a problem with data. It has so much that it takes the human element out of the game. Here is why past performance does not always predict future results.
The relationship between statistics and MLB goes back to the beginning of professional baseball.
Newspapers published box scores in the 1880s, along with play-by-play descriptions in late-afternoon editions, allowing businessmen to keep up with what happened at the park. Telegraph offices flashed in-game results around the country during the World Series on a pitch-by-pitch basis. While other sports were figuring out how to play, baseball wrote the book on how to collect data and use it.
As fans, we learned the game studying batting averages and home run totals on the back of a baseball card. A picture, stale gum and your hero’s record in 10-point type on a piece of cardboard. The beautiful simplicities of childhood sneaking in basic math lessons and an introduction to statistical probability.
Is it any wonder that data drives Major League Baseball today?
There were 721,279 pitches thrown during the 2017 season and we can account for each one. Every single pitch. From that, we know what Bryce Harper and Mookie Betts did on Saturday nights on full counts after eating pizza before the game. Anyone can chart where balls landed in the field and work out what to pitch in any given situation.
With computers, we can create new statistics which judge value, set salaries and tell managers what moves they must make. Between the reams of paper and blinking hard drives, we have turned a game into a scientific study on par with managing traffic flow or working out how much water a farmer needs to spread on her soybean field.
As with any business, getting maximum production is crucial to success. If Alex Cora has a note on his tablet telling him the optimal time to pitch Craig Kimbrel is now, he will bring him in. He knows from that app how many pitches Kimbrel threw this week, the number he throws while warming up and any other relevant statistic at the swipe of a finger.
When the Boston Red Sox hired guru Bill James in the early 2000s, and Billy Beane had massive success building strong rosters in Oakland on analysis and very little money, other teams paid attention.
Today, people fill MLB’s front offices trying to optimize lineups and pitching rotations by poring over past performances as a guide to predicting the future. It is paralysis by analysis.
Lost in the probability and prediction business is the human element. Unlike the chip running your smartphone, which needs a fixed number of electrical connections to work, an athlete does not work with certainty.
When you hear a Joe Maddon complain about computers, listen closer. No one in any MLB dugout will argue data is bad. But a computer or an analyst cannot forecast a hitch in a swing corrected by a coach, or a gut feel. This is a game played by people and run by people.
When Kirk Gibson took Dennis Eckersley deep in Game 1 of the 1988 World Series, is there any software system in the world that would have called sending him hobbling to the plate the right move? So hobbled, Gibson stood a high chance of getting thrown out at first on a ball hit to right field. You know what happened.
It is foolish to think MLB teams will not use what they collect to win. If you know Daniel Murphy scorches hard grounders in a six-foot box between second and first, it is your obligation to put a defender in the way. But computers are 100 percent correct only on past performances. Teams pay the manager, right or wrong, to run the game. Those that discourage his creativity do so at their own peril.