Elo fetishism & Category madness
This piece is about inflation in chess ratings. As an economist who has also worked as a professional bookmaker, I've kept an eye on this topic.
The fundamentals of FIDE Elo and of adjusted historical Elo rating and ranking have been developed and discussed in depth by the highly recognized statisticians Jeff Sonas (www.chessmetrics.com), Rod Edwards (http://www.edochess.ca/), Mark Glickman (http://www.glicko.net/glicko.html), Ken Thompson, John Nunn, Christian Krause, and Arpad Elo, the inventor himself (based on previous works).
Here is some empirical rhubarb about how reducing the number of participants in closed tournaments influences the average rating. The next chapter then takes a closer look at the Elo system itself (for highlighting, ELO is written in uppercase letters).
Highest Rated Tournaments (category)
First Category XVI
A so-called World Master Tournament, held in Torino in 1982 as a one-off event, is considered the first category XVI tournament ever (with an ELO average of 2627.14). Seven of the world's top players (an odd number), including the world champion, competed in the double round robin event. Unfortunately, due to illness, Dr. Robert Hübner was forced to withdraw after the seventh round at the halfway point, and his games were expunged from the second cycle, making it a six-player competition from then on. The participants in Italy were Anatoly Karpov, Ulf Andersson, Boris Spassky, Robert Hübner, Ljubomir Ljubojevic, Lajos Portisch, and Lubomir Kavalek. Andersson shared first place with Karpov at 6/11 each, with Andersson going undefeated.
Jan Timman (no. 2 in the world in the January – June 1982 ELO list; ratings and rankings were then updated half-yearly), Vice World Champion Viktor Korchnoi (no. 3), and Garry Kasparov (no. 4 and rapidly progressing; he climbed to the no. 2 position in the ELO list of July 1982) were all not invited. Karpov at that time avoided playing in international tournaments with the rising Kasparov, and the defector Korchnoi was boycotted on principle by the Soviet authorities for eight years, from 1976 to 1983, meaning that he could not compete in invitation tournaments with Soviet players.
Bugojno 1986, the 5th and last International Bugojno Chess Tournament (a biennial series, started in 1978 by Serbian organizers in a Bosnian town in the Yugoslavia of Marshal Tito), was the next event considered to be the highest category XVI FIDE tournament ever held (ELO average of 2627.5). Eight grandmasters were invited to participate in a double round robin. Anatoly Karpov won; runner-up Andrei Sokolov was the only player to remain unbeaten and even beat the eventual winner Karpov. Two top players were missing: Garry Kasparov, and Viktor Korchnoi, who was never invited to any edition of the Bugojno series, apparently to secure Karpov’s traditional participation.
Highest tournament average since the ELO rating system has existed
Bugojno was surpassed in the same year by OHRA-Brussels 1986 (not to be confused with the S.W.I.F.T. series at Brussels), which averaged ELO 2636 in a double round robin of six participants, making it the strongest tournament since the ELO rating system was introduced. Kasparov won that OHRA (his first all-play-all tournament since 1983, that is to say, Garry Kasparov's first tournament after his gruelling series of marathons against Anatoly Karpov for the world title) ahead of Senior Viktor Korchnoi, who had won ahead of Spassky at OHRA-Brussels in 1985.
Kasparov and Korchnoi were the only two players at Brussels 1986 above the 50% mark. Hübner and Nunn tied for third and fourth place; Short, who scored his first win against Kasparov, was fifth; Portisch came in sixth and last. Of the thirty games, twenty were decisive, an impressive number for such a top-level clash! Brussels (OHRA-A) in December 1986 was the first major chess tournament televised by the BBC with commentary by the players themselves.
At that time (since the introduction of ELO around 1970), a category XVI (16) tournament was regarded as the absolute maximum. Well, they didn’t know how ultimate inflation works.
Note: A Mini-Tournament in Johannesburg 1981 (won by Ulf Andersson ahead of Viktor Korchnoi, Robert Hübner, and John Nunn in a quadruple round robin, held to offer Korchnoi, who was boycotted in all tournaments with Soviet players participating, ‘a rhythm’ of strong opponents) had already achieved category XVI with an average of 2629 ELO points, but was not recognized by FIDE due to the ruling Apartheid system in South Africa.
First Category XVII
The Mini-Tournament Optibeurs (EOE) in May 1988 at Amsterdam turned out to be the first ever Category XVII event (on a scale originally limited to XVI units!). Reigning World Champion Kasparov won that quadruple round robin of four players with 9/12. Anatoly Karpov was the runner-up (6.5/12), the Dutchman Jan Timman third (5.5/12), and his compatriot John Van der Wiel fourth and last (3/12).
Played only a few months after the 1987 World Championship match in Sevilla, where Kasparov and Karpov tied 12-12, this time Kasparov was deeply satisfied, winning his individual Mini-Match vs. Karpov 3-1. In the nation standings, the USSR beat The Netherlands decisively, 11.5-4.5.
ELO barrier of 2800 broken: Kasparov surpassing Fischer’s record ELO mark
After winning the 13th Tilburg Interpolis 1989 (another cat. XVI double round robin, which Kasparov won with an incredible 12/14 and a margin of 3.5 points; he and Senior Viktor Korchnoi as clear second were again the only two of the eight participants above 50%), Garry Kasparov surpassed Bobby Fischer’s record figure of 2785 ELO points with a virtual 2788 ELO in October 1989. After winning in his usual fashion at Belgrade Investbank 1989 as well (with 9.5/11 in a single round robin, three full points in front, without Karpov and Korchnoi), Kasparov reached the magic number of 2800 ELO in the next official FIDE list in January 1990, as the first chess player ever to do so.
In other words: Great Gazza stood at 2775 ELO points in the July – December 1989 ELO list. He played two tournaments in the second half of the year – Tilburg Interpolis in September / October 1989 and Belgrade Investbank in November 1989 – gaining 25 rating points to be listed at exactly ELO 2800 for January – June 1990 (half-yearly published lists, granularity of five points).
First Category XVIII
The 34th edition of Reggio Emilia 1991/92 was the first category XVIII chess tournament ever played; it was won by Viswanathan Anand (6/9 points) as clear first ahead of nine former Soviets! Gelfand and Kasparov finished joint second, followed by Karpov in fourth, then Khalifman, Ivanchuk and Polugaevsky equal fifth, then M. Gurevich and Salov sharing eighth, and finally Beliavsky as tenth and last.
The Alekhine Memorial at Moscow in 1992 (GM round robin of eight players, not to be confused with the strong Alekhine Memorial Open at the same place) was then the second cat. XVIII tournament, won jointly by Anand and Gelfand (3. Kamsky, 4.-6. Karpov, Salov, Yusupov, 7. Shirov, and 8. Timman).
Best performance in a modern supertournament achieved by Karpov
The 12th edition of Linares 1994 (again cat. XVIII) was the next “strongest ever” chess tournament. After Kasparov and Short split from FIDE in 1993, Garry Kasparov no longer had an official ELO rating, so the organizers gave him a virtual ELO figure of 2800, producing a new record tournament average of 2685 ELO (Short was not invited; soon, he and Kasparov were re-integrated into the FIDE rating / ranking list despite the schism still existing over the World Champion title).
The 1994 Linares tournament was extraordinary because of the incredible performance by Karpov, who chalked up nine wins and no losses, winning with 11/13 and a margin of 2.5 points over Kasparov, who finished joint second with Shirov in a field of 14 players that also included Anand, Kramnik, Topalov, Gelfand, Ivanchuk, Bareev, Beliavsky, Kamsky, Lautier and Judit Polgar. After Linares 1994, Karpov reached his individual peak rating of ELO 2780 which, at the time, was the third highest rating ever achieved (Fischer 2785, Kasparov now at 2815 in the July – Dec. 1993 list).
First category XIX
Novgorod 1994: Ivanchuk, on better tie-break ahead of Kasparov, claimed the inaugural edition of Novgorod, Russia; the two were joint winners, both going undefeated. Six players battled in a double round robin event organised by the PCA and not rated by FIDE. The participants were, in order of ELO: Garry Kasparov (2815 unofficial), Alexey Shirov (2740), Vladimir Kramnik (2725), Vassily Ivanchuk (2695), Evgeny Bareev (2675), and Nigel Short (2675 unofficial), the then reigning Vice World Champion according to the PCA, who could not win a single game in the whole tournament.
First Category XXI
A new landmark in category madness was Las Palmas 1996, which skipped cat. XX to become the first cat. XXI (based on the semi-official ELO list of November 1996, issued in between the then regular half-yearly lists of January and July). It was proudly labelled the “World Championship of tournaments” by the organizers (“Supertorneo Mundial de Ajedrez Gran Canaria ‘96”) – sadly, it turned out to be the last big event at Las Palmas.
The participants were, in order of ELO: Kasparov (2785), Karpov (2775), Kramnik (2765), Topalov (2750), Anand (2735), and Ivanchuk (2730), making it clearly the strongest tournament of the modern era at the time (with an average ELO of 2756). Kasparov emerged triumphant as "the best player in the world at that moment"; Anand was sole second, Kramnik and Topalov joint third and fourth, and Ivanchuk and Karpov shared last, the latter as the only player without a single win in this double round robin.
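As a rough illustration of how a category number follows from the average, here is a minimal Python sketch. It assumes the usual 25-point category bands with category I starting at 2251 (so that category XVI covers 2626 to 2650, consistent with the averages quoted above); the function name is my own, and the official rounding of the average may differ in edge cases.

```python
import math

# Ratings of the six Las Palmas 1996 participants as listed above.
las_palmas_1996 = [2785, 2775, 2765, 2750, 2735, 2730]


def tournament_category(average_rating):
    """Map an average rating to a category number.

    Assumption: 25-point bands, category I = 2251-2275,
    hence category XVI = 2626-2650, category XXI = 2751-2775.
    """
    if average_rating <= 2250:
        return 0  # below the lowest band in this sketch
    return math.ceil((average_rating - 2250) / 25)


average = sum(las_palmas_1996) / len(las_palmas_1996)
print(f"average {average:.2f}, category {tournament_category(average)}")
```

With the same function, the averages given for Torino 1982 (2627.14) and Linares 1994 (2685) land in categories 16 and 18 respectively, as described in the text.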
New best-ever ELO rating of 2851 by Kasparov
It was right after the victory at Bosna, Sarajevo in May 1999 that Garry Kasparov achieved, in the next ELO list of July (to December) 1999, his phenomenal ELO 2851 rating:
Always as clear first, Kasparov scored 10/13 at Wijk aan Zee that year, then 10.5/14 at Linares (double round robin) and afterwards 7/9 at Sarajevo (Bosna). Altogether these 36 games gave him a plus of 39 points, taking him from ELO 2812 in January 1999 to ELO 2851 in July 1999; he remained at ELO 2851 in January 2000 as well (Kasparov played no games in the second half of 1999).
ELO list of (July - December) 1999 II:
Kasparov reaching ELO 2851
1 Kasparov, Garry (RUS) ELO 2851 +39 points *1963
2 Anand, Viswanathan (IND) 2771 -10 *1969
3 Kramnik, Vladimir (RUS) 2760 +9 *1975
4 Morozevich, Alexander (RUS) 2758 +35 *1977
5 Shirov, Alexei (ESP) 2734 +8 *1972
6 Kamsky, Gata (USA) 2720 0 *1974
7 Gelfand, Boris (ISR) 2713 +22 *1968
8 Karpov, Anatoly (RUS) 2709 -1 *1951 (FIDE WCC, not defending title at Las Vegas 1999)
9 Adams, Michael (ENG) 2708 -8 *1971
10 Ivanchuk, Vassily (UKR) 2702 -12 *1969
11 Leko, Peter (HUN) 2701 +7 *1979 (eleven players with ELO >2700)
12 Bareev, Evgeny (RUS) 2698 +19 *1966
13 Topalov, Veselin (BUL) 2690 -10 *1975
14 Svidler, Peter (RUS) 2684 -29 *1976
15 Azmaiparashvili, Zurab (GEO) 2681 0 *1960
16 Dreev, Alexey (RUS) 2679 +40 *1969
17 Korchnoi, Viktor (SUI) 2676 +3 *1931 (aged 68 / 69 years - and in the ELO top twenty)
18 Short, Nigel (ENG) 2675 -22 *1965
19 Smirin, Ilia (ISR) 2671 +19 *1968
20 Polgar, Judit (HUN) 2671 -6 *1976
New best-ever ELO rating of 2882 by Carlsen
That ELO number of 2851 was Garry’s highest rating ever and a world record, since surpassed only by Magnus Carlsen, who peaked at 2882 in May 2014 (ELO lists are now published monthly by FIDE), of course without taking ELO inflation into account.
Again and again, another forthcoming tournament topped the previous average benchmark, due to galloping rating inflation (from about the mid-1980s to the mid-2010s) and the reduced number of players in comparison with most previous world-class tournaments.
First Category XXII
The Bilbao Grand Slam Final Masters 2010, the first cat. XXII event ever held, provided new food for Elo fetishists; in fact, it was a Mini-Tournament with only four players in a double round robin. Since then, category XXII has been reached by several other supertournament series, too. In no particular order: Tal Memorial in Moscow, Gashimov Memorial in Shamkir, Norway Chess, London Chess Classic, Zurich Chess Challenge, and the Sinquefield Cup.
First Category XXIII
Zurich Chess Challenge 2014 (six players, point scoring of classical chess combined with rapid chess in single round robins) was the first cat. XXIII chess tournament. A little later the same year, its average was marginally edged out by the 1st Sinquefield Cup in St. Louis, Missouri 2014, which was won by Fabiano Caruana with a stratospheric 8.5/10 ahead of World Champion Magnus Carlsen at 5.5/10, with Topalov sole third, followed by Vachier-Lagrave and Aronian jointly, and Nakamura last (six players invited in a double round robin).
In the rush of daily live rating hysteria, Youth mania, ELO fetishism, and CATEGORY madness, two points are worth keeping in mind:
i) The smaller the number of participants, the easier it is to pimp up the average! We should not focus (only) on the average, but also on the median and the number of players in a tournament.
Tournaments with only four players should be regarded as a category of their own ("Mini-Tournament"). From a statistical point of view, the chance to (co-)win a tournament with four or six invited major players increases considerably compared to the old-fashioned all-play-all tournaments with 16, 18, or 20 players, even if these always included some local minor entrants. Look at London 2014, a single round robin of six players: after just five rounds, which is hardly a reliable sample, 50% of the <invited> participants were co-winners...
The number of players does have an impact on the individual chances to win; it is not only the average that counts! The probability of winning, for instance, a round robin tournament such as Wijk aan Zee, with normally 14 players, mostly from the top hundred and including a handful from the top ten, could therefore be statistically lower than that of winning any tournament with four or six players, even if all of them are in the top ten and the average rating is consequently higher. Of course, the gap in strength and rating between 'major' and 'minor' players shouldn't be too large.
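To make the point about field size concrete, here is a small Python sketch comparing a six-player elite field with a larger one. The six ratings are the Las Palmas 1996 figures quoted earlier; the eight additional entrants are invented numbers for illustration only, and the "baseline win chance" of 1/n is a crude assumption that treats all players as equally strong and ignores ties.

```python
from statistics import mean, median

# Six-player elite field: the Las Palmas 1996 ratings quoted earlier.
small_field = [2785, 2775, 2765, 2750, 2735, 2730]

# Hypothetical 14-player field: the same six plus eight invented
# lower-rated entrants (illustrative numbers, not a real tournament).
extra_entrants = [2690, 2670, 2650, 2630, 2610, 2590, 2560, 2530]
large_field = small_field + extra_entrants


def summarize(field):
    """Return (mean, median, crude equal-strength chance of winning = 1/n)."""
    return mean(field), median(field), 1 / len(field)


for name, field in (("6 players", small_field), ("14 players", large_field)):
    avg, med, baseline = summarize(field)
    print(f"{name}: mean {avg:.1f}, median {med:.1f}, "
          f"baseline win chance {baseline:.1%}")
```

Under these assumptions the smaller field shows both the higher average and the higher baseline chance of winning, even though its top six players are exactly the same.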
ii) From a practical point of view, at top level we should focus more on (peak) ranking than on (peak) rating of players. Read more in the following chapter on the Elo system and inflation.
Table: The fifteen “strongest” chess tournaments in terms of pure category numbers (average Elo rating of players) all took place in the 21st century, since 2008, for two main reasons: the reduced number of participants in closed invitation tournaments and FIDE Elo rating inflation.
Highest category (average Elo rating) tournaments ever assembled – all held in the 21st century:
2014 Zurich Chess Challenge (mixed with Rapid score)
2015 Zurich Chess Challenge (mixed with Rapid score)
Source: http://www.chessfocus.com/trend/highest-rated-tournaments (as of 2015)
History of Elo system
By far the most famous chess rating system is the Elo System, named after Prof. Arpad Elo and his wife, who invented it in the early 1960s.
He already had predecessors: in particular Kenneth Harkness, creator of the Harkness rating system, a chess rating and ranking system used by the U.S. Chess Federation (USCF) from 1950 to 1960; Richard W. B. Clarke, who devised the Grading System of the British Chess Federation (BCF, now the English Chess Federation, ECF), first published in the 1950s, whereby chess players score points for every game played in a registered competition; and, even earlier, Anton Hößlinger (1875-1959), who worked as a postal supervisor and saw the need for a system that could somehow define the chess strength of a player. The Ingo rating he devised, used in Germany since 1947 and named after his town Ingolstadt, Bavaria, formed a basis for later systems and was very popular in Germany for a long time.
A Grading System was and is an essential tool for organisers of (open) chess events, readily used for seeded Swiss pairings, grading-restricted tournaments or grading prizes.
As pointed out, the Elo chess rating and ranking system wasn't the first; rather than creating a new system from scratch, Arpad Elo and his wife kept fundamental aspects of the Harkness system – the rating scale and the class categories – and modified the rest.
The Elo system was approved by the USCF in 1960, although not everyone was satisfied with it, and was finally adopted by FIDE assembly in 1970 for global use.
Already at the 1968 Chess Olympiad in Lugano, Switzerland, Professor Arpad Elo (Milwaukee, Wisconsin), Folke Rogard (FIDE President, Sweden), Dr. Wilfried Dorazil (FIDE Vice-President, Austria) and GM Svetozar Gligorić formed a (FIDE) sub-committee charged with creating an internationally compatible rating system.
Such a system could be used to judge the comparative strength of players and provide a fairer basis upon which 'master titles' would be awarded. Upon completion of their task, the newly conceived Elo rating system was used to make sense of the many game results collected from January 1966 to May 1969.
The resulting, provisional 'world list' comprises the top 10 players:
- Bobby Fischer, United States 2720
- Boris Spassky, Soviet Union 2690
- Viktor Korchnoi, Soviet Union 2680
- Mikhail Botvinnik, Soviet Union 2660
- Tigran Petrosian, Soviet Union 2650
- Bent Larsen, Denmark 2630
- Efim Geller, Soviet Union 2620
- Lajos Portisch, Hungary 2620
- Paul Keres, Soviet Union 2610
- Lev Polugaevsky, Soviet Union 2610
The list further includes (sorted alphabetically, if equal):
10.= Smyslov, Stein, Tal (all 2610); 14./ 15. Olafsson, Kholmov (both 2600); then Bronstein, Furman, Gligoric, Hort, Najdorf, Taimanov (all 2590); Gipslis, Krogius (both 2580); Evans, Lein, Reshevsky, Vasiukov (all 2570); Antoshin, Lutikov, Matulovic, Savon, Suetin, Unzicker, Zaitsev A. (all 2560) … Note: Wikipedia and Olimpbase had some slight difference concerning Smyslov's rating (2610 as given above vs. 2620).
The Elo grading system was then officially accepted by FIDE in 1970 at their Congress during the Chess Olympiad in Siegen, Germany, and has been in use ever since.
On Elo system and FIDE Elo chess rating inflation
The Elo System (established and maintained by Arpad Elo himself up to the mid-1980s) is by definition a <zero sum game>, so even if people in general are getting better at chess (e.g. thanks to computers and databases, and / or a dominant leading player, harder training methods, etc.), this shouldn't mean that average (median) ratings go up! The only thing that matters for the ratings is the final result of the game (win, draw, or loss). Consequently, the Elo System can be used in other fields as well.
That is the difference from, say, a new absolute record in the 100 metres in athletics, in swimming, etc. This is an important distinction.
The basic principle of any sport rating system is that the difference between the rating of any two participants should serve as a <predictor> of the expected outcome of a match between them:
When two players compete, the rating system predicts that the one with the higher rating is expected to win more often than the lower rated player. The more marked the difference in ratings is, the greater the likelihood that the higher rated player will win (“paired comparison” modeling and analysis).
If you are expected to score 70%, and you only score 60%, your Elo rating will go down. Conversely, if you are expected to score 20%, and you score e.g. 30%, then your rating will go up despite the fact that you lost more games than you won.
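The principle can be sketched in a few lines of Python. This is a minimal illustration, not FIDE's official procedure: it uses the commonly quoted logistic approximation for the expected score (FIDE's own conversion table may differ slightly), and the K-factor of 10 and the ten-game sample are assumptions chosen for the example.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B (logistic approximation)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def rating_change(expected_total, actual_total, k=10):
    """Rating change = K * (actual points - expected points).

    k=10 is an assumed K-factor, used here for illustration only.
    """
    return k * (actual_total - expected_total)


games = 10  # hypothetical number of games

# Expected 70%, scored 60%: the rating goes down.
print(rating_change(0.70 * games, 0.60 * games))

# Expected 20%, scored 30%: the rating goes up despite a minus score.
print(rating_change(0.20 * games, 0.30 * games))

# Equal ratings give an expected score of 50%.
print(expected_score(2600, 2600))
```

The sign of the change depends only on whether the actual score is above or below the expectation, not on whether more games were won than lost.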
After a tournament has finished, some players have gained points and some have lost, but the total is always zero-sum. Technically there shouldn't be any inflation or deflation (on condition that both players have the same K-factor).
The Elo chess ratings (median rating of top 10, median rating of top 100, etc.) for the strongest players were fairly constant in the late 1960s, 1970s and early 1980s; there was regularly a good dozen players with more than 2600 Elo points.
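The condition about equal K-factors can be checked with a short Python sketch. It assumes the logistic approximation for the expected score and uses hypothetical ratings and K-factors (10 and 20); the function names are my own.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B (logistic approximation)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def play_game(rating_a, rating_b, score_a, k_a, k_b):
    """Return the rating changes (delta_a, delta_b) for a single game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta_a = k_a * (score_a - exp_a)
    delta_b = k_b * (exp_a - score_a)  # B's result mirrors A's
    return delta_a, delta_b


# Hypothetical upset: a 2300 player beats a 2500 player.
same_k = play_game(2500, 2300, 0.0, k_a=10, k_b=10)
mixed_k = play_game(2500, 2300, 0.0, k_a=10, k_b=20)

print("same K, net points created: ", sum(same_k))
print("mixed K, net points created:", sum(mixed_k))
```

With equal K-factors the two changes cancel out. With different K-factors the winner gains more than the loser gives up, so points enter the rating pool (or leave it, when the player with the higher K-factor loses).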
The domination of the no. 1 player can vary: Fischer was more dominant than Karpov (the only no. 1 player with an Elo rating sometimes below 2700), Kasparov as World champion was much more dominant than Kramnik as World champion, as prominent examples.
Zero sum means that the median of the top ten, top fifty, top hundred should be stable. In other words: the number of players above a certain Elo rating should not change.
But since the mid-1980s the <FIDE Elo Ratings> have been drifting upwards on average:
Inflation started around 1985/86, with another inflationary burst around 2007/08.
Several articles, mostly by statisticians, have been published on this topic and on how to devise an even more accurate <method for measuring chess strength and predicting results>, most notably by Jeff Sonas, inventor of the rectified (inflation-free) historical chessmetrics, Rod Edwards (Edo historical chess ratings), computer and computer chess pioneer Ken Thompson, Mark Glickman, grandmaster, mathematician and author John Nunn, Christian Krause, and Arpad Elo.
There is still some disagreement in the chess community as to the cause of this FIDE Elo inflation. In no particular order, there was some exogenous, purely politically driven manipulation, remembered from the time when Campomanes came into office:
The Karpov rule, fabricated by Karpov, stipulating that a tournament winner or co-winner cannot lose any Elo points regardless of his actual performance (a joke); the granting of 100 bonus rating points to all female players except Zsuzsa Polgar in November 1986 for 1987 (completely arbitrary); and the free choice at FIDE team events of whether or not your results should be rated (a decision at the Dubai Olympiad in 1986). All this nonsense is no longer in force, but the initial inflation damage is done.
What remains are technical issues: the notorious K-factor determining how sensitive (volatile) the system is, the use of different K-factors in the same game, rating floors (used by the USCF ratings with the intention to avoid sandbagging), the interval of calculation, the management of entries and drop-outs (inactive players, how to define a necessary minimum of played games), or the disparity limits (400 point rule).
The Chinese top player Li Chao achieved a 2622 Elo performance at an Open tournament, RELATIVELY below his then rating of 2693 Elo. In spite of this he still managed to gain 5.6 Elo points. How is this possible?
Of course, at its introduction, the computational task was comparatively easy because only a few hundred players in closed round robin tournaments were rated by FIDE and the minimum FIDE rating was Elo 2200. In other words, the greatest disparity possible was around 500 Elo points. Now games with up to about 1500 Elo points difference can take place and there are many tens of thousands of rated players; most games are held in open tournaments nowadays, and sometimes players with different K-factors face each other.
Recently there has been evidence in the data that the FIDE Elo inflation period is over: the rating increase at the top has been levelling off in recent years.
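One mechanism that can produce an effect like the Li Chao case is the disparity limit (400 point rule) listed among the technical issues above. The Python sketch below is not a reconstruction of his actual tournament: the opponent rating of 2093 is hypothetical, the K-factor of 10 is an assumption, and the expected score uses the logistic approximation rather than FIDE's table.

```python
def expected_score(rating_diff):
    """Expected score for a given rating advantage (logistic approximation)."""
    return 1.0 / (1.0 + 10 ** (-rating_diff / 400))


def gain_for_win(own_rating, opponent_rating, k=10, cap=None):
    """Rating gain for one win, optionally capping the rating difference.

    cap=400 mimics a 400-point disparity limit; k=10 is an assumed K-factor.
    """
    diff = own_rating - opponent_rating
    if cap is not None:
        diff = max(-cap, min(cap, diff))
    return k * (1.0 - expected_score(diff))


own, opponent = 2693, 2093  # the opponent rating is hypothetical

uncapped = gain_for_win(own, opponent)
capped = gain_for_win(own, opponent, cap=400)
print(f"gain per win without cap: {uncapped:.2f}")
print(f"gain per win with 400-point cap: {capped:.2f}")
```

Under these assumptions each win against a much weaker opponent pays out as if the gap were only 400 points, while the same low-rated opponents pull the performance rating down. Several such wins can therefore add up to a net gain even when the performance figure is below the player's own rating.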
Recommended reading User: AylerKupp (section on Elo inflation, scroll down).
(quote) <looking at the data through 2015 it is even more evident that the era of ratings inflation appears to be over> Top ratings have stopped rising in recent years. Apparently the halt actually started at the bottom: The year with the most new GMs was 2007. The year with the most new 2700s was 2008. The year with the highest average rating of the top 100 was 2012, as was the year with the most active 2700s.
My hypothesis is that there are now too many underestimated / underrated young and rising players, while the (on average) declining players aged 50-plus form a group of their own, playing primarily each other in Senior events.
Note: Sometimes the objection is heard that players today are getting better with engines and databases. Of course they are, just as, for instance, players from the mid-1960s and 1970s on learned with the new system of codes for classifying chess openings offered by the Chess Informant.
For two decades prior to the emergence of computer databases, Chess Informant publications were a leading source of games and analysis for serious chess players – yet the Elo ratings then remained stable on average as they should.
In the 1970s and 1980s normally only one single player was above 2700 (Fischer, then Karpov, Tal once, later Karpov and rising Kasparov); for some Elo lists there was no player at all at 2700-plus. Players like Spassky or Korchnoi, despite being ranked no. 2, were never rated at 2700 in FIDE Elo, but today there are about 40 players rated at Elo 2700 or higher.
Thus, in historical comparisons at top level, ranking matters, not nominal rating.
Elo chess rating and ranking lists:
http://www.chessmetrics.com/ (Historical ratings by Jeff Sonas)
http://www.edochess.ca/ (Historical ratings by Rod Edwards)
http://www.2700chess.com/ (unofficial daily Live Rating for players of Elo 2700 and above)
and monthly published FIDE Elo - men)
https://ratings.fide.com/top.phtml?list=women (official and monthly published FIDE Elo - women)
https://ratings.fide.com/ (FIDE Elo list, player search function)
http://www.olimpbase.org/ (all official FIDE Elo rating lists from 1971 to 2001, including the prior auxiliary lists), direct link: http://www.olimpbase.org/index.html?http%3A%2F%2Fwww.olimpbase.org%2FElo%2Fsummary.html
Bilbao 2016 (six players, double round robin) lined up World Champion Magnus Carlsen, Challenger Sergey Karjakin, Anish Giri, Hikaru Nakamura, Wesley So, and Wei Yi, providing an average rating of 2775+ Elo (Category XXII).
The young age of the field is notable: The average age of the participants is 23 to 24 years, roughly calculated: Wei Yi, born 1999, Giri, born 1994, So, born 1993, Carlsen, born (November) 1990, Karjakin, born (January) 1990, Nakamura born 1987.
The “oldest” player in the field is indeed Nakamura (born in December 1987), at the age of 28.
The novice is Chinese teenager Wei Yi (born in June 1999), 17 years young.
New trend: Average age of players should be below category number of the tournament.
Note, sometimes I speak sarcasm..