Playoff-Bound Baseball Gets Big Data Boost

October 2, 2012
Originally appeared at wsj.com
By: Rachael King

Baseball is one of the frontiers of big data innovation because there’s so much money involved in it. On Tuesday, Major League Baseball said it sold the broadcast rights to games through 2021 to News Corp. and Time Warner for $6.8 billion. But MLB’s foray into media goes beyond television and into online tracking and analysis.

Indeed, some of the most cutting-edge examples of Big Data can be found at five ballparks across the country. There, Major League Baseball is testing a system that tracks – in real time – every moving person or object on the field, including three base runners, nine fielders, the ball and the umpires. Called FIELDf/x, the system will generate defense statistics, including the difficulty of a catch and the probability that a particular fielder will make it.

“From a research and analytics perspective, it is viewed as the last frontier in baseball, understanding the interrelatedness of everything that happens on the field,” Cory Schwartz, vice president of stats at MLB.com, told CIO Journal on Tuesday at an event in New York called The Human Face of Big Data.

It’s also a tangible example of a world where more and more phenomena, from the migration patterns of birds to advertising buys, will be monitored and analyzed in real time. Sometimes it’s difficult to comprehend just how much these vast amounts of data, from the way we watch baseball to how we address crime, will shape our experience.

Team executives, scouts, field managers and, of course, most fans understand that the game is as much about analytics – knowing the probability of when and where a player will hit – as it is about balls and strikes. For years, MLB has been analyzing pitch types and speeds and putting that information online in real time, using software called PITCHf/x, developed by Sportvision. Now, MLB.com is testing and refining FIELDf/x in five ballparks, with the goal of rolling it out to all 30 MLB ballparks and making it accessible to fans online. But the project requires that MLB.com process and store much more information than it did with PITCHf/x.

It takes about 0.4 second to record a pitch, but an entire play can take 5 to 15 seconds, said Schwartz. “We get roughly the amount of data from FIELDf/x for one game as we get for an entire season in PITCHf/x because we’re tracking 16 moving pieces,” he said. It’s a massive amount of information, so MLB.com is working with Sportvision on techniques to store it more efficiently.

To get the system up and running in ballparks, MLB is installing more cameras and cables and then working to make sure the system is able to distinguish the 16 different objects from one another. If two players cross one another in the outfield while they are trying to make a catch, the system can still correctly identify each person. This year, MLB.com expanded the test from the original locations in San Francisco, Kansas City and Tampa Bay to ballparks in Boston and Milwaukee. Each ballpark has different physical attributes and the testing lets MLB.com find and fix the weaknesses, flaws and limitations of the system.

“When we roll out to all 30 ballparks, we’ll be able to track all events a lot more accurately and in real time,” said Schwartz.

Then fans, professional scouts and amateur statisticians can really have a field day.

Write to rachael.king@wsj.com