Tournament ELO Explained

These rankings are determined using a slightly modified ELO rating system, a commonly used method for calculating the relative skill levels of players in zero-sum games such as chess. For calibration data, we used over 5,000 chronologically ordered competitive tournament matches dating back to April 2020. Third-place games are excluded, as they are commonly forfeited. We were unable to find sufficient records for matches before April 2020. However, the relative ratings of players are largely unaffected by this missing data: as long as a player has been active in the past year, the system will recalibrate them to their expected ELO.

The ladder was initialized so that every new player starts at 1500 ELO. A player's rating is then updated using the ELO system formula R1 = R0 + K(M)(S-E), where R1 is the player's new rating, R0 is the player's previous rating, K is a scalar constant determining the maximum number of points that can be awarded or lost from the outcome of one game, M is a multiplier that scales K based on the player's ELO, S is the player's score in the match, and E is the player's expected score.

In our system, we have set K = 32 and the scaling multiplier M = 500/(R0-1000), capped at a maximum of 1. For perspective on this scale, a newly initialized player at 1500 ELO will have an M value of 1, while a top-1 player at 2000 ELO would have an M value of 0.5. We implemented this scaling multiplier to lower variability as you climb the ladder, so that rankings are not massively inflated or tanked based on a small sample size of games. Unlike simply using a smaller K value, a scaling K value accomplishes this without requiring hundreds of games played to reach your expected ranking. Based on the scale of our ladder, a consistent 70% win-rate player will be able to reach their expected top 10 ranking in just 40-50 matches. For comparison, the International Chess Federation uses K values between 10 and 40.
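
As a rough illustration of the multiplier described above, here is a minimal Python sketch. The function names k_multiplier and effective_k are ours, not part of the ranking tool. The text does not say what happens for ratings at or below 1000, where the formula breaks down, so the sketch simply assumes the cap of 1 applies there.

```python
K = 32  # base K value stated in the text


def k_multiplier(rating: float) -> float:
    """Return M = 500 / (rating - 1000), capped at a maximum of 1."""
    if rating <= 1000:
        # Assumption: the text does not cover ratings at or below 1000,
        # where the formula divides by zero or turns negative, so we
        # fall back to the cap of 1 here.
        return 1.0
    return min(1.0, 500 / (rating - 1000))


def effective_k(rating: float) -> float:
    """Return K scaled by the multiplier M for a player at this rating."""
    return K * k_multiplier(rating)
```

With these definitions, effective_k(1500) comes out to 32 and effective_k(2000) to 16, matching the M values of 1 and 0.5 given above.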

The last variable, E, is the player's expected score for the match. E is calculated using the formula E = N / (1 + 10^(D/X)), where D is the difference between the two players' ratings (Rp2 - Rp1, the opponent's rating minus the player's own), X is a scaling variable that determines the size of the ladder, and N is the number of games in the set. Based on our data, the highest long-term win-rate achieved in Temtem is slightly under 70% for the top 5 players. We have selected an X value of 1000 to reflect this win-rate. Using X = 1000, our ladder is scaled so that a 2000 ELO player (top-1) is expected to have a 71% win-rate against the ladder's weighted average ELO of 1600.
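
The expected score calculation can be sketched in a few lines of Python. This is only an illustration under the definitions above; the function name expected_score and its parameter names are our own.

```python
def expected_score(player: float, opponent: float,
                   games: int = 1, x: float = 1000) -> float:
    """Expected score over a set of `games` games.

    d is the opponent's rating minus the player's rating, and x is the
    ladder scaling variable (1000 in the text).
    """
    d = opponent - player
    return games / (1 + 10 ** (d / x))
```

For a single game, expected_score(2000, 1600) is roughly 0.715, which is the 71% figure quoted above.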

Here is a simple example to summarize the above:
Player A (1500 ELO) 2 - 1 Player B (1500 ELO)

RA1 = RA0 + K(M)(S-E)
E = 3 / (1 + 10^(0/1000)) = 1.5
M = 500/(1500-1000) = 1
RA1 = 1500 + 32(1)(2-1.5) = 1516

Conversely for Player B:
RB1 = RB0 + K(M)(S-E)
E = 3 / (1 + 10^(0/1000)) = 1.5
M = 500/(1500-1000) = 1
RB1 = 1500 + 32(1)(1-1.5) = 1484

Player A rises from 1500 to 1516 and Player B falls from 1500 to 1484.
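
For anyone who wants to check the arithmetic, the whole update can be sketched in Python as below. This is a minimal illustration, not the code used to generate the rankings: update_rating is a name we made up, and the handling of ratings at or below 1000 is an assumption, since the text does not cover that case.

```python
K = 32    # base K value from the text
X = 1000  # ladder scaling variable from the text


def update_rating(player: float, opponent: float,
                  score: float, games: int) -> float:
    """Return the player's new rating after a set of `games` games."""
    # Expected score: E = N / (1 + 10^(D/X)), with D = opponent - player.
    expected = games / (1 + 10 ** ((opponent - player) / X))
    # Multiplier: M = 500 / (R0 - 1000), capped at 1.
    # Assumption: ratings at or below 1000 also use the cap of 1.
    if player <= 1000:
        multiplier = 1.0
    else:
        multiplier = min(1.0, 500 / (player - 1000))
    return player + K * multiplier * (score - expected)


# Player A (1500) beats Player B (1500) 2 - 1 in a three-game set.
# Both updates use the ratings from before the match.
new_a = update_rating(1500, 1500, score=2, games=3)
new_b = update_rating(1500, 1500, score=1, games=3)
print(new_a, new_b)  # 1516.0 1484.0
```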