This was a fun mini assignment I had for a calculus class(?) a while ago, felt worth posting.
Nikola Jokic, selected 41st overall in the 2014 NBA draft after three years in the Serbian League, was described by scouts as a “bottom-tier athlete” whose “slow-footed tendencies” would “likely prevent him from ever playing 30-plus minutes in a game.” At 6’11” and an abdominous 284 pounds, he is a strong candidate for shortest vertical leap among professional basketball players. Even his own head coach, Michael Malone, in response to a question about Jokic’s first dunk of the 2018 season, declared it such a rare phenomenon that “it should have been a national holiday.”
Defying all presuppositions about the necessary correlation between athleticism and effectiveness on a basketball court, Nikola Jokic has emerged as an indomitable offensive dynamo and back-to-back league MVP, a development which has catalyzed eruptive polarization within basketball discourse over the cursed subject of ‘analytics.’ Patrons of the numbers have hailed Jokic’s aberrant statistical influence as evidence of his transcendent greatness, while skeptics point to a perceived contradiction between his gracelessness and his overperformance in the metrics as impetus for discarding the metrics altogether. These arguments tend to be circular in nature; statisticians, certain of their calculations, are immovable in their convictions about his excellence, while casual viewers, equally adamant that he is not a superstar-caliber player, are unwilling to entertain any algorithm which crowns him. All of this raises several questions, chief among them: ‘how, if at all, do you encapsulate the value a basketball player is delivering to his team in a single statistic?’
In addressing the question, this paper intends to review the evolution of basketball’s fickle relationship with math to understand the birth of ‘analytics’ before critically examining two radically opposed, particularly controversial one-number metrics: John Hollinger’s PER and FiveThirtyEight’s RAPTOR. All essential terms will be concisely defined with an ultimate aim towards demystifying the statistical process driving these compact assessments, not explicitly defending them.
The Basketball Association of America was founded in 1946, and its inaugural season marked the glorious beginning of organized statistical record-keeping for the professional sport, rife with oversights and misallocations. The league fielded precisely eleven teams, a decision which resulted in two teams playing 61 games while the others played 60, and which yielded a frankly indecipherable six-team playoff format in which the two best teams were pitted against each other in the very first round. Nonetheless, the data was robust. Every occasion of a player attempting to score a basket, regardless of the result of the shot, was documented and attributed to both the player and his team as a Field Goal Attempt (FGA). When a basket was scored, it was additionally marked down as a Field Goal Made (FGM). A successful shot always awarded two points, which were added to the player’s season-long tally and divided by his total number of games played to determine Points per Game (PPG). Dividing a player’s FGM by his FGA at the end of the season would result in a decimal representing the frequency with which he scored when he attempted to: Field Goal Percentage (FG%). League-wide, the FG% in 1946 was an abysmal 27.9%, nearly 20 percentage points below modern averages.
This alone would have been a commendable start, but the BAA went further, attending to two stranger components of scoring, the first of which is the Free Throw. If a defender makes excessive contact with an offensive player, particularly while he is in the act of shooting, a referee may exercise his or her judgment in calling a foul (PF). Foul calls abruptly halt the natural flow of the game and initiate a rather bizarre routine in which the impacted player, alone, is instructed to stand approximately 15 feet from the hoop and take two uncontested shots: Free Throw Attempts (FTA). For each successful shot (FTM), the player is awarded just 1 point. In 1946, the league-wide average conversion rate on these shots, the Free Throw Percentage (FT%), was 64.1%, sufficient to clearly disincentivize fouling. The other involves situations in which a teammate’s pass puts a player in position to score very easily; in these cases, the teammate who made the pass is awarded an assist, and assists accumulate (APG) in much the same way points do.
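The bookkeeping described so far reduces to a handful of divisions. The sketch below is purely illustrative: the function names and the sample season totals are invented for the example and do not describe any real player.

```python
def field_goal_pct(fgm: int, fga: int) -> float:
    """Share of field goal attempts that went in (FGM / FGA)."""
    return fgm / fga if fga else 0.0


def free_throw_pct(ftm: int, fta: int) -> float:
    """Share of free throw attempts that went in (FTM / FTA)."""
    return ftm / fta if fta else 0.0


def points_per_game(fgm: int, ftm: int, games: int) -> float:
    """Pre-three-point-era scoring: 2 points per basket, 1 per free throw."""
    return (2 * fgm + ftm) / games


def assists_per_game(assists: int, games: int) -> float:
    """Season assist total spread over games played."""
    return assists / games


# Hypothetical season line, invented for illustration only.
fgm, fga, ftm, fta, ast, games = 279, 1000, 150, 200, 90, 60

print(f"FG%: {field_goal_pct(fgm, fga):.1%}")
print(f"FT%: {free_throw_pct(ftm, fta):.1%}")
print(f"PPG: {points_per_game(fgm, ftm, games):.1f}")
print(f"APG: {assists_per_game(ast, games):.1f}")
```

Note that nothing here knows how difficult a shot was or who created it, which is exactly the limitation the later metrics try to address.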
The potential ambiguity in awarding assists will be returned to later, but for the most part these were fairly concrete, objective measures of player performance which linger as cornerstones of modern player assessment. The consensus best scorers in the league tend to have high PPG and above-average FG%, while the consensus best passers in the league tend to have high APG. The introduction of rebounds (RPG), however, the third pillar of the traditional ‘box score,’ marked the onset of a steady descent into metrics of increasing obscurity and subjectivity.
First tracked in 1950 after the BAA merged with the National Basketball League to become the NBA, a rebound is defined as the act of securing possession of the basketball after any missed shot. The statistic is blind to difficulty or elaborate context; the ball bouncing wildly off the rim into the arms of an unsuspecting player counts the same as a player soaring through the air to snatch the ball while surrounded by aggressors of the opposing team. While clearly valuable holistically, the vast majority of individual rebounds are uncontested matters of random chance. Unlike points and assists, it is difficult to understand why the players who gather the most rebounds would necessarily be contributing the most value, yet this false premise has rather irreversibly entrenched itself in basketball discourse.
Glaringly absent thus far is any measure of a player’s contributions on defense, and that is for good reason. Traditional counting statistics remain flummoxed as to how to quantify defensive impact, and thus have long relied on shockingly unsophisticated measures: blocks (BPG) and steals (SPG). Both statistics are somewhat self-evident; blocks are recorded when a player deflects an attempted shot with their hand, and steals are recorded when a defensive player intercepts a pass or dribble. Intuitively, both plays are valuable, but their compilation tends to reflect defensive skill rather inaccurately, rewarding players who recklessly overshoot their assignments in search of these highlight plays. The best defenders are often so menacing that offenses will avoid them entirely, suppressing the player’s block and steal totals in spite of their elevated ability. Such is the inescapable limitation of box score evaluation. Suppose a player is so talented that the opposing team dedicates two players to defending him at all times, allowing his teammates to essentially play four vs three on offense. The player, despite being inarguably the most valuable on the floor, will likely end the game with a miniature statistical footprint that wrongly suggests a diminished contribution to the team’s success.
Thus, decades passed, with data scientists subdued by the sport’s complexity and resigned as such to eternally absorb NBA data in concert with the dreaded ‘eye test.’ To the extent any significant breakthroughs were made, they failed to infiltrate the mainstream, as fans and players alike grew attached to the ‘Points/Rebounds/Assists’ split. Despite its flaws, the traditional disjointed box score was a unifying authority in basketball circles through the late 1980s, serving as an approachable guidepost for a burgeoning fanbase to identify the centralizing icons of the sport. Enter John Hollinger.
Hollinger, an economics and environmental science major who graduated from the University of Virginia in 1993, entered the world of sports writing as a hobby. Dissatisfied with the clunkiness of the box score, he derived a relatively simple formula for consolidating these traditional ‘counting statistics’ into a single, authoritative number, a number which would grow to become so popular that it brought forth a seismic disturbance in the lexicon of conventional basketball analysis: PER, or Player Efficiency Rating. Its formula is as follows.
uPER = (1 / MP) * [ 3PM
+ ⅔ AST
+ (2 – factor * (team_AST / team_FGM)) * FGM
+ (½ FTM * (1 + (1 – (team_AST / team_FGM)) + ⅔ (team_AST / team_FGM)))
– (VOP * TOV)
– (VOP * DRB% * (FGA – FGM))
– (VOP * 0.44 * (0.44 + (0.56 * DRB%)) * (FTA – FTM))
+ (VOP * (1 – DRB%) * (TRB – ORB))
+ (VOP * DRB% * ORB)
+ (VOP * STL)
+ (VOP * DRB% * BLK)
– (PF * ((lg_FTM / lg_PF) – 0.44 * (lg_FTA / lg_PF) * VOP)) ]
This is, of course, a mess, but it becomes more digestible when evaluated in segments whose general intentions are crystallized.
Ignoring the (1/MP) for now, which contextualizes the result of the algorithm as being ‘per minute played,’ the first four terms of the formula, through ⅔ (team_AST / team_FGM), all serve the broader category of ‘contributions to scoring.’ Self-evidently, this algorithm cares as much about the efficiency with which a player scores points as it does about sheer scoring volume, and measuring that efficiency became much more complicated than the aforementioned FG% with the advent of a three-point line. In confronting this problem, Hollinger evaluated each of the three categories of made baskets separately. Three-pointers (3PM) are added in isolation, while total made baskets (FGM) and free throws (FTM) are balanced relative to the team the subject plays for. (team_AST / team_FGM) represents a shortcut for estimating the extent to which a player’s teammates facilitate easier scoring opportunities for him; the assumption here is that a team that records more assists is generating more of those opportunities. The enigmatic factor variable ensures that all of these numbers are weighted against the league averages for that season, preventing PER from skyrocketing universally as leaguewide scoring numbers increase.
Already, some significant doubt ought to arise about this approach. For one, players who record lots of assists, often the best passers in the league, are directly punished for doing so. Their assists will inflate the ‘team_AST’ number, which in turn depresses the formula’s evaluation of their scoring impact. To use a radical example, a theoretical player who accumulates 900 assists throughout the season, whose teammates never once pass the ball, let alone record an assist, would be perceived as the recipient of his own assists by the algorithm and have his scoring graded identically to that of his teammates. Still, Hollinger is attempting a herculean task, and thus far PER seems promising as a rough sketch of scoring value in most conventional cases.
Seeing as the first four terms are adding value for every successful shot attempt, it follows understandably that the next three terms are his way of subtracting value for missed shots and turnovers. The term VOP represents the average league-wide points scored per possession, which has hovered between .95 and 1.15 for the last three decades. This, like factor, is essential in accounting for broader yearly fluctuations so that the statistic can be standardized and compared across seasons. Just as an individual successful basket is relatively less valuable in a season where teams are scoring more, individual missed shots are less harmful in seasons where teams are scoring less. (VOP * TOV) is a straightforward punishment for turnovers, possessions a player gives away without a shot, most visibly by having the ball stolen from him. DRB%, the proportion of all missed shots that were rebounded by the defending team, is an additional subtlety in determining the harm caused by missing shots. Hollinger factors it into the equation, applying a coefficient of 0.44 in the calculation regarding missed free throws to account for the approximate half of free throw attempts for which no rebounding is allowed.
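For readers who find code easier to scan than nested parentheses, the whole formula can be transcribed into a short function, grouped into the same segments the prose uses (scoring, then misses and turnovers, then the rebound, steal, block and foul terms that close out the equation). This is a sketch following the version published by Basketball-Reference, not Hollinger’s own code: the class name, field names and the sample numbers are assumptions made for the example, and factor, VOP and DRB% are simply passed in as league-level inputs rather than derived.

```python
from dataclasses import dataclass


@dataclass
class PlayerLine:
    """Season totals for one player (hypothetical container for this sketch)."""
    mp: float
    three_pm: float
    ast: float
    fgm: float
    fga: float
    ftm: float
    fta: float
    tov: float
    trb: float
    orb: float
    stl: float
    blk: float
    pf: float


def unadjusted_per(p: PlayerLine, team_ast: float, team_fgm: float,
                   lg_ftm: float, lg_fta: float, lg_pf: float,
                   factor: float, vop: float, drb_pct: float) -> float:
    """Transcription of the uPER formula, one segment at a time."""
    ast_ratio = team_ast / team_fgm  # share of team baskets that were assisted

    # Segment 1: contributions to scoring (the first four terms).
    scoring = (p.three_pm
               + (2 / 3) * p.ast
               + (2 - factor * ast_ratio) * p.fgm
               + 0.5 * p.ftm * (1 + (1 - ast_ratio) + (2 / 3) * ast_ratio))

    # Segment 2: turnovers, missed field goals, missed free throws.
    wasted = (vop * p.tov
              + vop * drb_pct * (p.fga - p.fgm)
              + vop * 0.44 * (0.44 + 0.56 * drb_pct) * (p.fta - p.ftm))

    # Segment 3: defensive rebounds, offensive rebounds, steals, blocks.
    hustle = (vop * (1 - drb_pct) * (p.trb - p.orb)
              + vop * drb_pct * p.orb
              + vop * p.stl
              + vop * drb_pct * p.blk)

    # Final term: the estimated cost of each personal foul.
    fouls = p.pf * (lg_ftm / lg_pf - 0.44 * (lg_fta / lg_pf) * vop)

    return (scoring - wasted + hustle - fouls) / p.mp


# Invented inputs, chosen only so the arithmetic is easy to follow.
player = PlayerLine(mp=30, three_pm=2, ast=6, fgm=8, fga=16, ftm=4, fta=6,
                    tov=3, trb=10, orb=2, stl=1, blk=1, pf=2)
league = dict(lg_ftm=15, lg_fta=20, lg_pf=20, factor=0.5, vop=1.0, drb_pct=0.75)

low_assist_team = unadjusted_per(player, team_ast=20, team_fgm=40, **league)
high_assist_team = unadjusted_per(player, team_ast=30, team_fgm=40, **league)
print(low_assist_team, high_assist_team)
```

Running the same player line against a higher team assist total produces a lower rating, which is the assist penalty in action: every extra team assist shaves value off the player’s own made baskets.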
The next four terms are responsible for the bulk of PER’s pitfalls, as the equation attributes similar value to the accumulation of rebounds, blocks, and steals as it does to points and assists. These statistics, like the others, are each individually balanced against the yearly climate, but the fundamental illogic remains consistent with the shortcomings of a traditional box score. Zach Fein of Bleacher Report replicated PER in 2009 using ‘linear weights’ in order to approximate the arbitrary value the equation has assigned to each quantifiable action, and he concluded that PER values blocks more than assists, steals more than points, and individual rebounds as much as half of a point. Despite the intimidating facade, PER remains constitutionally arbitrary.
From this conclusion stems a fascinating circularity with regard to PER. The statistic has become so popular in large part because it comes to popular conclusions; Michael Jordan, LeBron James, and Shaquille O’Neal were all really good at basketball. The list of NBA career leaders for Player Efficiency Rating is a rather comprehensive collection of the league’s most iconic players, so much so that it raises the question: Did Hollinger work backwards from his conclusion? This is mere conjecture, but it stands to reason that the development and adoption of such a statistic would be drastically expedited by anchoring itself to a couple of iron-clad preconceptions. The equation is meticulously crafted to appreciate these star players, who most often have the ball in their hands, but it is woefully under-equipped to make intelligent distinctions between ancillary contributors, who comprise the overwhelming bulk of the league. Players like Tony Bradley and Marcus Smart, ferocious defenders and excellent passers who played integral roles on championship-caliber teams, are neglected by the model entirely, grading out as less valuable than consensus benchwarmers like Jock Landale and Mike Muscala, who stockpile rebounds on inferior teams.
PER remains the pinnacle of box score evaluation, but perhaps more accurately remains the loudest signal that such an approach is inadequate. Basketball is too rich, too interconnected and complexly tactical to be quantified by the sum of any player’s individual actions of purported significance. A radical shift in perspective was needed.
Plus-minus, implemented in official NBA box scores at the start of the season in 2007, is an artless nightmare of a statistic. Its derivation is blunt – assess the team’s margin of victory or defeat only during the minutes in which the player was on the court – but it can be difficult to conceptualize without a couple of examples. Imagine a game in which Team A outscores Team B in the first quarter by 7 points, then is outscored by Team B in the second quarter by 7 points, then scores the exact same number of points as Team B in the third quarter, and finally outscores them by 6 points in the fourth quarter to win the game. If Player X plays in every second of this game, his +/- is a simple +6. Suppose Player Y participates only during the first half of the game; his +/- is 0. Player Z, who plays only during the second and third quarters, ends up with a +/- of -7, despite his team winning the game!
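The three cases above can be checked mechanically. The snippet below is a minimal sketch that treats each quarter as a single stint; the function name and the quarter-level granularity are simplifications made for the example, since real plus-minus is tallied possession by possession.

```python
def plus_minus(quarter_margins: dict[int, int], quarters_played: list[int]) -> int:
    """Sum the team's scoring margin over the quarters a player was on the floor."""
    return sum(quarter_margins[q] for q in quarters_played)


# Team A's margin over Team B in each quarter, as in the example above.
margins = {1: +7, 2: -7, 3: 0, 4: +6}

player_x = plus_minus(margins, [1, 2, 3, 4])  # plays the whole game -> +6
player_y = plus_minus(margins, [1, 2])        # first half only      -> 0
player_z = plus_minus(margins, [2, 3])        # second and third     -> -7

print(player_x, player_y, player_z)
```

Nothing in the calculation refers to what the player himself did, which is both the appeal of the statistic and the source of its problems.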
Interpreted generously over a large sample size, +/- is an attempt to capture the impact a player’s mere presence on the court has on the performance of their team, regardless of their concrete statistical contributions. This approach understandably resonated with skilled defensive players and role players, but in its nascency suffered from a comic swath of confounding variables. For one, the metric was criminally unkind to great players on bad teams, who would consistently register negative +/- scores despite brilliant performances. It also offers no means of distinguishing between the five players on the court at any given time. If a team surrenders 20 consecutive points because Player X has kidnapped his teammates and tied them up at halfcourt, Player X and his teammates are held equally accountable for that -20. Likewise, if a team wins because Player Y scored 400 points and stole the ball 62 times, Player Y and his teammates will be attributed equal credit for the victory. Furthermore, suppose Player Q, who plays very infrequently, happens to enter the game at the same time as Opposing Player M, who is the worst player in the league. Player Q’s +/- will be exorbitant, owing merely to the fact that his minutes aligned with the other team’s biggest liability. To use a concrete example, the best four players according to +/- in the 2021 NBA season were all members of the Utah Jazz. Among the four was Royce O’Neale, who averages a meager seven points per game and yet utterly dwarfs the mighty LeBron James in this statistic.
It is thus clear that plus-minus in its vanilla form offers next to zero reliable insight into a player’s effectiveness, but data scientists refused to give up on the novel approach. The Dallas Mavericks front office generated ‘Adjusted Plus-Minus’ in the mid-2000s, which accounted for some of these variables rather elegantly by weighing the data against both the team’s general strength and the perceived strength of the other nine players sharing the floor at the given time. There is a circularity to this as well (how does one calculate the APM of any player if it depends on the APM of the other players, whose own APM can’t be calculated without knowledge of the original player’s APM?), but it was nevertheless a significant improvement on the original metric. It was becoming excitingly clear that a delicate synthesis of the two approaches could spawn nearly all-encompassing models of unprecedented reliability, and indeed, as the 2010s waned, dozens of these models rose to assume positions of absolute authority in player evaluation worldwide.
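One common way out of that circularity is to stop computing ratings one player at a time and instead solve for everyone at once with a regression: each stint becomes an equation in which the scoring margin equals the ratings of the players on one side minus the ratings of the players on the other. The sketch below illustrates that general idea on a toy two-on-two league. It is an assumption about the technique in general, not a description of the Mavericks’ actual model; the player names and ‘true’ ratings are invented, and the small ridge penalty is added here only so the system has a unique solution.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= f * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def adjusted_plus_minus(stints, players, ridge=1e-6):
    """Least-squares player ratings from (lineup, opponents, margin) stints."""
    idx = {p: i for i, p in enumerate(players)}
    n = len(players)
    xtx = [[0.0] * n for _ in range(n)]
    xty = [0.0] * n
    for lineup, opponents, margin in stints:
        row = [0.0] * n
        for p in lineup:
            row[idx[p]] = 1.0    # on the floor for the team being scored
        for p in opponents:
            row[idx[p]] = -1.0   # on the floor for the other side
        for i in range(n):
            xty[i] += row[i] * margin
            for j in range(n):
                xtx[i][j] += row[i] * row[j]
    for i in range(n):
        xtx[i][i] += ridge       # tiny penalty keeps the system solvable
    return dict(zip(players, solve_linear_system(xtx, xty)))


# Invented "true" ratings (they sum to zero) used to generate stint margins.
true_ratings = {"A1": 3, "A2": 0, "A3": -1, "B1": 1, "B2": -1, "B3": -2}
lineups = [
    (("A1", "A2"), ("B1", "B2")), (("A1", "A3"), ("B1", "B3")),
    (("A2", "A3"), ("B2", "B3")), (("A1", "A2"), ("B2", "B3")),
    (("A2", "A3"), ("B1", "B2")), (("A1", "A3"), ("B1", "B2")),
]
stints = [
    (team, opp, sum(true_ratings[p] for p in team) - sum(true_ratings[p] for p in opp))
    for team, opp in lineups
]

ratings = adjusted_plus_minus(stints, list(true_ratings))
for name, value in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:+.2f}")
```

In this noiseless toy the regression recovers the invented ratings, even though no player’s raw plus-minus was ever computed in isolation. Real stint data is far noisier, and players who always share the floor remain hard to separate.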
Robust Algorithm (using) Player Tracking (and) On/Off Ratings, mercifully abbreviated as RAPTOR, stands as arguably the defining tool for modern, mathematically inclined NBA talent evaluators. As described by those who engineered the algorithm, “RAPTOR consists of two major components that are blended together to rate players: a ‘box score’ component, which uses individual statistics… and an ‘on-off’ component, which evaluates a team’s performance when the player and various combinations of his teammates are on the floor.” These approaches have magnificent synergy in accounting for their respective deficits. The “box score” tethers RAPTOR to an appreciation of workload; when a surge in team productivity occurs with five players on the court at the same time, individual statistics are necessary for distinguishing their relative contributions. Additionally, the best players often play the overwhelming bulk of available minutes, which allows for dramatically high variance in their +/- over the few minutes they sit on the bench throughout the year. RAPTOR’s box component ameliorates this problem, inherently trusting sheer volume of playing time as an indication of an individual’s importance to their team. Simultaneously, the +/- component of RAPTOR compensates dutifully for the blind spots of traditional statistics. As previously outlined, skilled defenders and savvy veterans perform exceptionally well in +/- metrics, while so-called “ball hogs” are punished appropriately for their inefficiency. RAPTOR also makes its results quite easily digestible, spitting out a number which approximates “the number of points a player contributes to his team’s offense and defense per 100 possessions.”
Much of RAPTOR’s esteem has come as a result of its impressive predictive prowess. Take this year’s playoff matchup between the Boston Celtics and Brooklyn Nets, for example. The Nets were an offensive juggernaut featuring Kyrie Irving and Kevin Durant, idolized as two of the greatest scorers in the history of basketball, whose weaknesses lay entirely on the defensive end. The Celtics, meanwhile, were a suffocating defensive team that lacked superstar players, operating instead as a cohesive unit. The ease of quantifying offensive skill relative to defensive skill, coupled with the cultural deification of elite players, led the Nets to become overwhelming betting favorites. RAPTOR, however, unimpeded by sentimentality, assessed the Boston defense as nearly 10 points more valuable than the Nets offense per 100 possessions, and forecasted a dominant Celtics victory. The Celtics annihilated the Nets, four games to zero.
So is RAPTOR the panacea? Has basketball nerdvana been achieved? Sadly, the answer is still no. Any statistic will eternally retain the inability to witness raw talent, and as such is bound inexorably to the rigid confines of the player’s concrete imprint as suggested numerically, only with regard to the specific teammates and schematic approaches they played with in any given season. The algorithm could not have surveyed LeBron James in his rookie season, at once the fastest and strongest player in the league at just 18 years old, and simply intuited that he was going to become one of the greatest players of all time, as everyone who physically witnessed him did. Congruently, to come full circle, the model is blind to the subcutaneous fat carried by Nikola Jokic. When a team scores a basket, as opposed to missing a shot or committing a turnover, they allow themselves time to settle into their optimal defensive alignment: NBA offenses are several points per 100 possessions worse against a ‘set defense.’ Nikola Jokic makes a lot of shots, so many that this seemingly irrelevant nuance has driven the algorithm to declare him the second best defensive player in the NBA, which goes diametrically against common wisdom about his defensive skills. This is all well and good to the extent that this phenomenon indicates that the Denver Nuggets generally allow fewer points when he is playing, but it is actively counterproductive in isolating their strongest defensive players on a possession-by-possession basis. Nuggets head coach Michael Malone offered a rather scathing, unspoken indictment of RAPTOR’s conclusion by benching Jokic for the final (defensive) possession of critical playoff games against the Golden State Warriors.
Jokic is incredible, and these algorithms are becoming increasingly crucial in accounting for the qualities we naturally fail to observe and appreciate, but applying them as a substitute for watching basketball will forever be a feckless exercise in quantifying the unquantifiable.
- https://fivethirtyeight.com/features/how-our-raptor-metric-works/
- https://www.basketball-reference.com/about/per.html
- https://bleacherreport.com/articles/2096255-nikola-jokic-nba-draft-2014-highlights-scouting-report-and-more
- https://www.basketball-reference.com/leagues/NBA_1971.html
- https://twitter.com/ginamizell/status/959327886267199488?ref_src=twsrc%5Etfw%7Ctwcamp%5Etweetembed%7Ctwterm%5E959327886267199488%7Ctwgr%5E%7Ctwcon%5Es1_&ref_url=https%3A%2F%2Fwww.9news.com%2Farticle%2Fsports%2Fnba%2Fdenver-nuggets%2Fnuggets-nikola-jokic-has-more-triple-doubles-than-dunks-this-season%2F73-536930374
- https://news.virginia.edu/content/stat-man-alumnus-became-basketball-analytics-pioneer
- https://projects.fivethirtyeight.com/nba-player-ratings/
- https://www.statmuse.com/nba/ask/player-best-plus-minus-2021-nba
- http://www.espn.com/nba/hollinger/statistics/_/year/2019
- https://www.nbastuffer.com/analytics101/adjusted-plus-minus/




