Study that claims to reveal officials’ bias reveals the study has bias against officials

A study purports to show a correlation between sideline position and the likelihood of a flag. Is this proof of officiating bias?



When it comes to officiating, there is a minefield of perceived biases. Visitors claim that officials succumb to the cheers and jeers of the home team crowd. Fans claim that the NFL pushes officials to favor a chosen team. At the end of the day, both teams may leave convinced that the officials were out to get them. However, a recent article by Noah Davis and Dr. Michael Lopez may have overplayed its hand when it claimed, “sideline bias in the NFL is real, and it’s spectacular.”

The FiveThirtyEight article is based on a 2016 paper by Lopez, an assistant professor of statistics at Skidmore College, which looked at whether plays toward a sideline resulted in a higher proportion of fouls beneficial to that sideline. Four types of penalties, including holding and pass interference, were included in the analysis, which appeared in the October 2016 issue of the economics journal Economic Inquiry. While the paper finds an imbalance in the number of fouls called toward each sideline, a critical review shows that the absolute difference is quite small. The paper then errs when it asserts that officials must be the cause of the discrepancy, discounting the possibility that the difference is a natural consequence of normal play.

Issues start early on when, during the presentation of the paper’s hypotheses, Lopez asserts that any difference found would necessarily be attributable to pressure to call or not call a penalty that benefits the team whose sideline the official is on. This assertion, which is unnecessary to the paper’s core hypotheses, presupposes that the singular cause is the officiating. By making it at the beginning of the methods section, the paper leads the reader to believe that the author has analyzed a “control” group that accounts for all other causes.

The paper utilizes raw data sourced from Armchair Analysis and Football Outsiders, both reputable subscription services. From this data, Lopez concludes that there is a statistically significant difference in the frequency of calls to each sideline, but only for defensive pass interference and aggressive defensive fouls. He further concludes that location relative to the goal line has an effect on whether fouls for offensive holding are called on runs toward the defense’s sideline (though there is no statistically significant difference overall).

Lopez, a former collegiate offensive lineman, has a keen interest in using statistics to analyze sports, especially football and hockey. He has written several reviews, including an analysis of when penalties such as offensive holding are more likely to be called during a game. In all, it’s not Lopez’s statistics that are flawed, but rather the conclusion he jumps to from those statistics. The paper finds some statistically significant correlations, but then presents the findings as showing a causal link, which is beyond the reach of this type of study. This gap between correlation and cause would be acceptable in a discussion at the local watering hole, where editorial standards are lower than in a peer-reviewed journal.

Notably, Lopez does not consider whether the fouls called are actually warranted. The NFL issues grades to every official each game, breaking down each play and judging both calls and no-calls. By the NFL’s accounting, officials get over 97% of the calls correct every game. Fewer than 5 calls per game are graded either as a miscall or a “support” (a call in a grey area and not judged wrong). These ratings by the NFL stand in contradiction to claims that the officials exhibit a sideline bias, as an observer grading the game after the fact is not subject to the supposed influence of the sideline. In fact, an official who does display a bias, whether it be toward or against a team or sideline, is likely to be out of a job with a tarnished reputation, although the league’s extensive scouting and vetting process would identify any of these tendencies before such an official is hired.

In addition, the FiveThirtyEight article blows the magnitude of the difference out of proportion. The graphs published in the paper and displayed by FiveThirtyEight give the appearance of a bigger disparity than the numbers support. While FiveThirtyEight claims that defensive penalties are called about 50% more often on the offensive team’s sideline, over the 160 plays in an average NFL game the actual difference comes out to less than 1 foul per game. At these levels, the differences are more mundane than they are spectacular. The chart from FiveThirtyEight below (“defensive aggressive penalties”) shows areas of uncertainty that are separated by less than ¼ of a penalty for every 1,000 pass plays, but the scale of the chart gives the appearance of a wide gulf.
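To see how a large relative gap can sit on top of a small absolute one, here is a rough Python sketch of the arithmetic. The base foul rate is a made-up illustrative figure, not a number from Lopez’s paper or from FiveThirtyEight; only the 50% relative increase and the 160 plays per game come from the discussion above.

```python
def extra_fouls_per_game(base_rate, relative_increase, plays_per_game):
    """Absolute number of extra fouls per game implied by a relative
    increase in the per-play foul rate."""
    return base_rate * relative_increase * plays_per_game


# ASSUMPTION (hypothetical, not from the paper): one such foul per 100 plays
# on the sideline with the lower rate.
ASSUMED_BASE_RATE = 0.01
RELATIVE_INCREASE = 0.50   # "about 50% more often"
PLAYS_PER_GAME = 160       # plays in an average NFL game

extra = extra_fouls_per_game(ASSUMED_BASE_RATE, RELATIVE_INCREASE, PLAYS_PER_GAME)
print(f"Extra fouls per game: {extra:.2f}")
```

Even this sketch overstates the gap, since it treats every one of the 160 plays as if it were run toward the offense’s sideline. Under the assumed base rate, a 50% relative increase still works out to less than one additional flag per game.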


In fact, the discrepancy may have a more natural cause: the players. A defensive player lined up near the offense’s sideline may play more aggressively. “Friendly sideline banter” may increase the likelihood that a defensive player, who doesn’t have the luxury of knowing the play beforehand, will commit a foul when on the offensive sideline.

Granted, the paper does include a discussion of possible player-performance factors, admitting that the effect on sideline plays is most relevant for aggressive defensive penalties. But Lopez discards this explanation because it is an inadequate justification for offensive holding calls, one of the more weakly correlated results. He further discounts this player-sideline effect for defensive pass interference (DPI), claiming that, because DPI could occur anywhere in the outer third of the field, the effect would be minor. Here Lopez errs in failing to appreciate the time cornerbacks spend near the opposing sideline before and at the start of plays. In discarding this explanation, Lopez goes all-in on the officials as the reason for the difference, but the existence of a difference alone doesn’t demonstrate that the cause was official bias.

It’s easy to lay the blame at the feet of the officials who point out the fouls. We want officials to “let the players play” when a foul is called on our team. At the same time, we want the officials to stop the other team from “getting away with murder.” But officials tend to work by one common-sense rule: they don’t hand out flags; the players earn them.

