Is the RPI the Best Way to Compare Teams?


I am diverging a bit from my series on the NCAA tournament so that I can gather and analyze all of the data. Analyzing the PASE (performance against seed expectations) isn’t as easy as I expected. But, it is coming along and I’ll post Part IV soon. In the meantime, let’s take a look back at a post I originally put up in July on the RPI, as it is related to where teams get seeded come NCAA tournament time. I suspect that the way teams inflate their RPI is related to why so many teams are overseeded in the tournament and thus are more ripe for upsets or get an unfair advantage on their way to the Sweet 16.

I think most people agree the Top 25 polls by the AP and Coaches are inherently flawed and potentially biased. Since most of the AP voters are in the Eastern time zone and will not stay up late enough to watch games out west, all they often can rely on is the box score in the morning paper or any highlights they catch on ESPN. For the coaches, they really don’t have the time to watch the hundreds of other teams in Division I play. They have to focus on their own team, who they already played, and maybe some scouting video of upcoming opponents.

The real problem with these voting systems is that most of the people involved cannot have seen a large enough sample of games from all of the relevant teams to make their vote mean anything. When you consider there are 347 Division I basketball teams, that is an impossible challenge. Even among the power teams, the voters more often than not vote based purely on each team’s record, reputation, and latest results, rather than any truly objective measure of how all of the teams stack up. In addition, they only vote for 25 teams. Can you imagine if they had to rank all 347 teams? What a nightmare…

That is why the Ratings Percentage Index (RPI) was created.

It allows a simple mathematical formula to directly compare all Division I teams, even if no one watched them play. Let’s look at an example of how it works. Pretend the University of Washington played in two parallel universes, where their schedule and record were exactly the same, but how they achieved that record was different. In both scenarios UW is 3-1.

1) UW @ #1 Duke Win, #20 Iowa @ UW Win, #347 Houston Baptist @ UW Win, UW @ #110 Arkansas Loss

In scenario #1, UW lost on the road to a relatively weak Arkansas team, but beat Duke, the top team in the country, on the road, and also beat a good Iowa team and a bad HBU team at home.

2) UW @ #1 Duke Loss, #20 Iowa @ UW Win, #347 HBU @ UW Win, UW @ #110 Arkansas Win

In scenario #2, the scores were the same, but in this case UW lost to #1 Duke and took care of Arkansas.

So, which is the better UW team?

If you said #1, you are incorrect according to the RPI. It does not matter to the RPI that they managed to beat the #1 team in the country at Cameron Indoor.  Most AP voters would probably be impressed with the Duke win and dismiss UW’s loss in Fayetteville and say that the “real” UW showed up to win in Durham.

In scenario #2, UW lost the game they were expected to lose against Duke and won the games they were expected to win elsewhere. So, while a solid 3-1 record would be recognized, it would not get the same press and prestige, and thus they probably wouldn’t be ranked as high in the AP Top 25. But, in reality, they would have exactly the same RPI!

The reason for this is that the Ratings Percentage Index does not take into account WHO you beat or by how much; it simply takes your record, that of your opponents, and that of your opponents’ opponents. The calculations are as follows: 25% is based on your winning percentage (WP), 50% on the winning percentage of your opponents (OWP), and 25% on the winning percentage of your opponents’ opponents (OOW). Your own winning percentage is also adjusted for where each game was played, since road games are more difficult: road wins are worth 1.4 wins, neutral-site wins are worth 1.0, and home wins are worth 0.6.

Based on scenario #1, let’s calculate UW’s actual RPI. For comparison purposes, let’s say Duke is 3-1, Iowa is 2-2, Houston Baptist is 0-4, Arkansas is 2-2 and, to simplify the calculations, we’ll say all four teams’ opponents are 8-8. For UW’s WP you take the three wins, weight each by where it was played, and divide the total by the 4 games played: (1.4 + 0.6 + 0.6)/4 = 0.650. So even though UW has won 75% of its games, since only one of those wins came on the road, its WP for RPI purposes is actually below that mark.

For the OWP, simply take the opponents’ records (minus the result of the game against UW). So Duke would be 3-0, Iowa 2-1, HBU 0-3, and Arkansas 1-2. Thus, the total would be 6-6 (0.500). For the OOW percentage, you take the records of those teams’ opponents. Since it is 8-8 in all cases, we’ll use the figure 0.500.

Lastly, you multiply each score by its importance factor (25% for WP, 50% for OWP, 25% for OOW) and add them up to come up with the RPI value. For UW it would be (0.65 * 0.25) + (0.5 * 0.5) + (0.5 * 0.25) = 0.5375
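For anyone who would rather check the arithmetic in code than by hand, here is a minimal Python sketch of the calculation above, run for both scenarios. It follows this post's simplified arithmetic (weighted wins divided by games played, opponents' records pooled together) and is not meant as the NCAA's official computation. The names WIN_WEIGHT, RECORDS, and rpi are just ones I made up for illustration.

```python
# Win weights as described in the post: road 1.4, neutral 1.0, home 0.6.
WIN_WEIGHT = {"road": 1.4, "neutral": 1.0, "home": 0.6}

# Full records (wins, losses) assumed in the example, including the UW game.
RECORDS = {"Duke": (3, 1), "Iowa": (2, 2), "HBU": (0, 4), "Arkansas": (2, 2)}


def rpi(games, oow=0.500):
    """Compute UW's RPI from a list of (opponent, site, uw_won) tuples.

    oow defaults to 0.500 because the example assumes every
    opponent's opponents are 8-8.
    """
    # WP: weighted wins divided by games played (the post's simplification).
    wp = sum(WIN_WEIGHT[site] for _, site, won in games if won) / len(games)

    # OWP: pool the opponents' records after removing their game against UW.
    opp_wins = 0
    opp_losses = 0
    for opponent, _, uw_won in games:
        wins, losses = RECORDS[opponent]
        if uw_won:
            losses -= 1  # the opponent's loss to UW is dropped
        else:
            wins -= 1    # the opponent's win over UW is dropped
        opp_wins += wins
        opp_losses += losses
    owp = opp_wins / (opp_wins + opp_losses)

    return 0.25 * wp + 0.50 * owp + 0.25 * oow


scenario_1 = [("Duke", "road", True), ("Iowa", "home", True),
              ("HBU", "home", True), ("Arkansas", "road", False)]
scenario_2 = [("Duke", "road", False), ("Iowa", "home", True),
              ("HBU", "home", True), ("Arkansas", "road", True)]

print(round(rpi(scenario_1), 4), round(rpi(scenario_2), 4))
```

Both scenarios come out to 0.5375. The one road win is worth 1.4 whether it came at Duke or at Arkansas, and the opponents' pooled record is 6-6 either way.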

At this stage, that value only means something when compared to the other 346 teams in Division I. While the math seems complicated, it is actually pretty easy to set up in an Excel spreadsheet, and as you enter results from games, it will simply recalculate the RPI for every school in the spreadsheet. It is this simplicity that has made it very desirable for the NCAA. They don’t have to weigh score differentials, account for the biases and/or ignorance of voters across the country, or sort out the so-called “any given day” results that occur when low-ranked teams stun higher-ranked teams. They just get a list of teams and can use that as a guide for looking deeper into their specific results.

While there are certainly anomalies, such as Harvard being #35 last year, it makes for a pretty reasonable guide. Any team with an RPI in the top 50 tends to be pretty good, while any team ranked #200 or worse probably would have no shot at the NCAA tournament regardless of their specific results. Plus, the NCAA only uses the RPI as a guide to determining who should be considered. When they get into the specifics of who on the bubble should get into the tournament or where to seed all 68 teams, they can go into the nitty-gritty of every team’s quality wins and bad losses.

If the RPI has any limitations, it really is that there is probably too much emphasis on strength of schedule (SOS) and not enough on the specific game results, especially the so-called quality wins/losses or score differential. In reality, 75% of the RPI (OWP plus OOW) is based on SOS. This automatically gives a huge bonus to teams from power conferences, who rack up lots of wins against low-to-mid-majors in their non-conference schedule (thus having a high winning percentage) and then, when they enter conference play, are only playing teams with already good records (thus helping the 50% of the RPI based on OWP). Right there, the 75% of the RPI that comes from WP and OWP is high, while only the 25% based on OOW is on the lower end, since their opponents played scrubs in the non-conference schedule.

This has forced good mid-major programs to stack their schedule with power teams to improve their strength of schedule before they play their weaker conference schedule. Even if they lose many of those games, that 50% of the RPI based on OWP makes up for the cost of hurting their own record because that is only worth 25% of the RPI.
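To see why that trade can pay off, here is a rough Python sketch comparing two made-up schedules for the same mid-major. All of the percentages are hypothetical numbers chosen for illustration, not real teams or seasons, and rpi_from_parts is just a helper name I picked.

```python
def rpi_from_parts(wp, owp, oow):
    """Combine the three components using the 25/50/25 weights."""
    return 0.25 * wp + 0.50 * owp + 0.25 * oow


# Hypothetical: a soft non-conference schedule means more wins but weaker opponents.
soft = rpi_from_parts(wp=0.80, owp=0.45, oow=0.50)

# Hypothetical: a tough non-conference schedule means fewer wins but stronger opponents.
tough = rpi_from_parts(wp=0.65, owp=0.58, oow=0.52)

print(f"soft schedule:  {soft:.4f}")
print(f"tough schedule: {tough:.4f}")
```

With these made-up inputs, the tougher schedule ends up with the higher RPI even though the team's own winning percentage dropped by 15 points, because OWP carries twice the weight of WP.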

The gap between what voters think of teams and what the RPI calculates first came to my attention in 2002, when Gonzaga was ranked #6 in the country by the AP with a 29-3 record. However, the RPI ended up having them at #26 due to the fact they played so many games in the weak WCC. In the end, Gonzaga ended up with a #6 seed (meaning they were considered by the NCAA selection committee to be around 21-24th best) and they ended up losing to #11 Wyoming in the 1st round of the tournament.

So, let’s look back at scenario #1 and scenario #2 above.  The RPI says they are the same. What do you think?