A pair of Pitt graduate students have parlayed their data science expertise and football knowledge into a win in the National Football League’s Big Data Bowl.
Participants in the annual football analytics competition use the same player tracking data available to NFL teams to develop new insights into specific facets of the game.
The 2020-21 competition, open to college students and professionals, focused on identifying approaches for evaluating defensive performance on passing plays. Teams received a set of data containing player information and tracking locations, as well as game situation information, for all passing plays during the 2018 regular season.
Wei Peng and Marc Richards, PhD students in the Department of Statistics in the Kenneth P. Dietrich School of Arts and Sciences, collaborated with two alumni from Richards’ undergraduate institution, St. Olaf College, on their winning entry.
Their project, “A Defensive Player Coverage Evaluation Framework,” focused on developing fuller defensive coverage metrics by first identifying whether a defender is playing man or zone coverage, and then evaluating that defender’s performance before and after the pass.
“Sports analytics has become exceedingly popular over the years and with most of us being NFL fans to varying degrees, we thought this would be a great opportunity to blend our passion for statistics and data with a love for sports,” said Richards. “With that in mind, we sought to answer specific questions that current publicly available statistics didn’t really answer, such as ‘Who were the best defenders when the ball is in the air?’ and ‘Which players most closely followed their receiver?’”
Their solution won one of five $15,000 prizes in the open competition division. Three teams in a separate college division competition were awarded $5,000 prizes. The winners were chosen from approximately 200 entries from around the world.
The eight teams will present their winning solutions to the NFL for a chance at an additional $10,000 prize.
Kostas Pelechrinis, a faculty member in the School of Computing and Information, was also on a team recognized in this year’s Big Data Bowl, for a project that focused on evaluating the ability of defenders to limit receivers’ yardage after the catch. His team earned one of nine honorable mentions.
Richards grew up a Vikings fan in Minnesota; Peng, a native of China, became interested in football after being invited to a 2020 Super Bowl party at Richards’ home.
“I think I was kind of tricked by him to get to know about football and participate in this competition after eight months,” Peng quipped. “Having domain expertise is crucial in data science and statistics, so being football fans definitely helped in our project solution.”
This was the first time they had submitted an entry in the Big Data Bowl.
Richards typically works on applications of statistics to crime, but he had entered a similar competition to analyze player and puck tracking data in the National Hockey League. This summer, Richards will be working part time as a data scientist for the National Basketball Association’s Oklahoma City Thunder.
For Peng, whose statistical machine learning research seeks to conduct classical statistical inference on modern machine learning models, the project represented a typical research problem.
“I enjoy working on fundamental theoretical problems,” said Peng, who will graduate this spring. “I really enjoyed the process of mathematical derivations to develop the algorithm. We customized a novel algorithm to identify the defensive assignment according to the characteristics of this game. This algorithm paved the way for developing defensive performance metrics.”
At Pitt, Peng and Richards work with Department of Statistics faculty member Lucas Mentch, who couldn’t be more proud of their win.
“I've worked closely with both Wei and Marc for long enough to know what excellent statisticians and data scientists they are, so the fact that they put together an excellent submission comes as no surprise,” said Mentch.
“It’s quite satisfying to have that excellence recognized in a prominent competition.”
Mentch said that quality data analysis is difficult. “As the technology and software evolve, more and more people are capable of making pretty pictures and charts, but fewer and fewer seem to be capable of really understanding what's going on under the surface. Big data can provide a lot of valuable information but in inexperienced hands it can also make it very easy to misidentify spurious relationships as genuine insight. The fact that others were able to recognize this kind of careful analysis in Marc and Wei's work, I think, is very encouraging.”
The team’s win is in turn a win for Pitt undergraduates, Mentch observed.
This spring, for the second consecutive year, Richards is a teaching assistant in Mentch’s Statistical Learning and Data Science course, which features a final project designed much like the Big Data Bowl challenge.
“Students get a dataset, apply a variety of methods, look for insights, determine which models are best and which variables are most important, and then are responsible for reporting their findings in both a technical and non-technical fashion,” Mentch said.
“I think it's great that students are getting very practical, hands-on experience and I hope they feel proud to be able to work with Marc, who has proven himself to be great at it.”