Moneyball, the recent movie featuring Brad Pitt, gave me two reasons to remember it. One was a quote that I think I will remember all my life.
“I hate losing more than I even wanna win” – Billy Beane
The second reason is the topic under consideration here: Statistics & Sports. My first thought when I put these two terms together was, is it really possible to club them? A sport is an art, a treat for the viewers, associated with a lot of fun & excitement. Statistics, on the other hand, are stuff for the nerds who usually wear big spectacles, and mostly involve enough numbers to leave anyone confused. But it had not dawned upon me that our deliberations on sports are normally filled with so many statistics that we don’t even realize it.
Moneyball brought my attention to the field of Sabermetrics. The term was coined by Bill James, who derived the name from the abbreviation for the Society for American Baseball Research (SABR). It involves the use of statistics on baseball data to answer questions such as how much a particular player contributes to offense. The set of metrics developed for this is quite different from the traditional measures. One of the major reasons this kind of analysis originated was to check how accurately players were being valued. It was seen as a way to value players on their actual effectiveness and not just on pure skill. Baseball being a club-level game, Sabermetrics helped in choosing players while forming a team.
Let us look at how widespread the use of statistics is in some other sports.
Cricket is a game rich in statistics. Statistics are used to help players improve their game. With the increasing use of technology, all the teams now have their own statisticians who load the players with all kinds of numbers after every match. We all talk about strike rate, economy rate, batting average, etc. But these numbers are really outcomes of a game and are good mostly for the sake of records. John Buchanan was one of the coaches who used numbers a lot and made good use of them. For example, he said that the strike rate of a batsman does not tell much. If we take a particular innings and analyze the number of scoring shots actually played, it gives more insight into a player. So if he found that a particular player did his scoring from 40% of the balls he faced, it was a clear indication that the player could improve his game to make the most of the remaining 60% as well. In this context, the IPL is a stage where a lot of statistics could be put to use. If trading players becomes as common as it is in baseball, statistics would find a more important role.
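To make the difference between strike rate and scoring shots concrete, here is a rough sketch in Python. The two innings are made-up numbers for illustration, and the function names are my own, not anything Buchanan or any team actually uses. Both batsmen score 12 runs off 10 balls, so their strike rates are identical, but one scores off only 40% of the balls he faces.

```python
def strike_rate(runs_per_ball):
    """Runs scored per 100 balls faced."""
    return 100 * sum(runs_per_ball) / len(runs_per_ball)


def scoring_shot_pct(runs_per_ball):
    """Percentage of balls faced from which at least one run was scored."""
    scoring_balls = sum(1 for runs in runs_per_ball if runs > 0)
    return 100 * scoring_balls / len(runs_per_ball)


# Hypothetical innings: one entry per ball faced, value = runs off that ball.
boundary_hitter = [4, 0, 0, 4, 0, 2, 0, 2, 0, 0]  # 12 runs off 10 balls
strike_rotator = [1, 2, 1, 0, 1, 2, 1, 1, 2, 1]   # 12 runs off 10 balls

for name, innings in [("boundary hitter", boundary_hitter),
                      ("strike rotator", strike_rotator)]:
    print(name, strike_rate(innings), scoring_shot_pct(innings))
```

The strike rate cannot tell these two innings apart, while the scoring shot percentage shows that the first batsman left 60% of his deliveries unused.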
Basketball has a field of study similar to baseball’s. It is called APBRmetrics, where APBR stands for Association for Professional Basketball Research. The data is analyzed per possession, since in a match both teams have an almost equal number of possessions. Further effort is made to break down a player’s game into effectiveness per minute. For example, Field Goal % (FG%) is a commonly used statistic to rate a player. APBRmetrics has defined a metric called Effective Field Goal % (eFG%), which takes into consideration the fact that 3-pointers are worth an extra point. A player who makes 3 out of 6 shots from inside the perimeter accounts for 6 points and has an FG% of 50%, whereas a player who makes 2 out of 6 shots, all of them 3-pointers, also accounts for 6 points but has an FG% of only 33%. eFG% corrects for this disparity.
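The two shooters above can be worked through in a few lines of Python. This is only a sketch: the function names are mine, and it uses the commonly quoted eFG% formula, (field goals made + 0.5 × 3-pointers made) / field goals attempted.

```python
def fg_pct(made, attempted):
    """Plain field goal percentage: every made shot counts the same."""
    return 100 * made / attempted


def efg_pct(made, threes_made, attempted):
    """Effective field goal percentage: a made 3-pointer counts as 1.5 makes."""
    return 100 * (made + 0.5 * threes_made) / attempted


# Player A: 3 of 6 from inside the perimeter (no 3-pointers), 6 points.
# Player B: 2 of 6, both makes being 3-pointers, also 6 points.
print(fg_pct(3, 6), efg_pct(3, 0, 6))
print(fg_pct(2, 6), efg_pct(2, 2, 6))
```

Both players come out with the same eFG%, which matches the fact that they scored the same number of points from the same number of attempts.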
Football uses relatively fewer statistics to analyze players. A lot of data is generated after a game with regard to possession, passes completed, tackles won, etc. But football has a lot of variables that are hard to quantify, like the style of play, pitch conditions, stadium, formation used, etc. Certain stats like pass completion percentage and shots on target percentage are good measures of player skill and performance. Though certain player effectiveness criteria are used, football is still a thinking man’s game which relies on analysis of the opponents’ playing style and of player capabilities rather than on statistics.
Tennis is a rather late entrant to the use of statistics. With the increased involvement of coaches, statistics have also gained prominence. First Serve %, Service games won %, break points saved, etc. are the statistics normally poured onto a player’s plate after a match. These stats help a player model his game based on the opponent. In the 2008 Wimbledon final, the baseline-loving Nadal attacked the net three times in the final game, once even behind his serve for the first time in the entire contest, which was a strategic departure from the norm. Over the match, both players had stayed back 9 out of 10 times after serving.
So the next time you hear terms like optimization, regression and indexes in sporting parlance, you know you are not in the wrong place!