In this article, we examine how well the matchmaking works, by analyzing the game history for every player ranked on the ladders in the month prior to the recent big balance patch, in particular for 1v1 games. This also sheds some light on faction balance across most skill ranges.
Since this article ended up being a bit lengthy and technical, I'll start with:
tl;dr - Conclusions
For players at level 12 or lower:
- OH players tend to be matched up with slightly lower ranking SOV/UKF players, while their USF opponents are commonly worse by more than one level.
- OKW players tend to be matched with SOV/UKF players who are about a level higher.
If we assume that players of the same level have approximately the same skill, this would indicate that for these levels USF and OKW are the strongest factions, while OH is slightly weaker than the rest.
- The ladder range in which a player has to be found in order to create an even game is very narrow, typically less than one level.
- Most games happen outside of that range, as the range is often narrower than that in which 33% of the games happen. In other words, the outcome of more than two-thirds of games is strongly affected by the significant skill difference between the two players.
- OH players experience a comparably low win percentage against USF. The other matchups are reasonably well balanced from an Axis perspective.
- All Allied factions achieve lower win rates at level 7 or lower, probably because fewer players queue with an Axis faction at those levels.
- SOV is holding its own against OH, but is not as good against OKW.
- UKF has the upper hand against OH at levels 12-8 and is balanced below level 7. It does poorly against OKW, though.
- USF enjoys comparably high win percentages against OH in the level range from 12-6, but does poorly against OKW at level 7 or lower.
The plan is to update the statistics in about a month, to see if the recent patch resulted in changes in terms of balance.
Balance at the top level is tricky to interpret, mostly because the definition of "top level" is not obvious. Most matchups seem reasonably well balanced, although the data supports that OKW performs poorly against USF.
This article is based on the leaderboard and match history data which was downloaded once or twice a day. Data from the leaderboard was downloaded first, followed by the match history for each player, omitting games that had been obtained previously.
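The step of omitting games that had been obtained previously can be sketched as follows. This is only an illustration: the `id` field and the function name are hypothetical, and the actual download scripts are not part of this article.

```python
def merge_new_matches(store, history):
    """Add unseen matches to `store` (keyed by match id); return how many were new.

    The "id" key is a hypothetical field name used for illustration only.
    """
    added = 0
    for match in history:
        if match["id"] not in store:
            store[match["id"]] = match
            added += 1
    return added


store = {}
merge_new_matches(store, [{"id": 1}, {"id": 2}])
new = merge_new_matches(store, [{"id": 2}, {"id": 3}])  # id 2 was seen before
print(new, sorted(store))  # 1 [1, 2, 3]
```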
The stats below include games from 24th May to 21st June 2016. The history for each player contains only the last ten matches, so a game would be lost if all players involved played ten more matches before the next download. However, a cross-check against the total number of matches played showed that the share of lost games is below 5%.
Since the ranks were retrieved only once or twice a day, the exact ranks of the players at the start of each match are not available. Instead, for a given period, each player is assigned the average of their rank just before and after the said period.
For the statistics, matches were discarded when either player involved had no rank on the respective ladder.
Players in Automatch
Let us start with some general data on the players. The table below shows how many players were in the different ladders on 20th June 2016.
| axis team | allies team |
Most players are placed on more than one ladder. In total, there were 40,274 players in at least one of the ladders. Still, it turns out that the player base is a fairly disjointed group. Based on which ladders each player had a valid rank on the 20th, we get this picture:
Note that having a valid rank on only one ladder, e.g. being an "OH-only" player, does not mean that the player never played another faction, only that he either did not play more than nine games in total with it, or played no games in the last four weeks.
Each day, approximately 2,350, 3,050, 1,750 and 2,850 games were played in 1v1, 2v2, 3v3 and 4v4 respectively. For more details on games played, check out coh2charts.
The Basics: ELO, Rank and Level
Matchmaking is based on the ELO rating. Details of the implementation are not available, but it seems likely that each player has an individual ELO rating for each mode and faction played. Additionally, each arranged team has individual ELO ratings for their Axis and Allies play.
The matchmaking algorithm attempts to match players with similar ELO values. After the game, the ELO of the winner increases while that of the loser decreases.
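Since the details of Relic's implementation are not available, the following is only a sketch of the textbook Elo update, not of the game's actual code. The 400-point scale and the K-factor of 32 are assumed values.

```python
def expected_score(rating_a, rating_b):
    """Textbook Elo expectation that A beats B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(winner, loser, k=32.0):
    """Return the new (winner, loser) ratings; k=32 is an assumed K-factor."""
    gain = k * (1.0 - expected_score(winner, loser))
    return winner + gain, loser - gain


new_winner, new_loser = update(1500.0, 1500.0)
print(new_winner, new_loser)  # 1516.0 1484.0
```

In this scheme, beating a much lower-rated opponent yields only a small gain, while an upset win moves both ratings considerably.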
The ELO value itself is not shown anywhere. Players are placed on the ladder according to their ELO. The "rank", the position of a player on each ladder, is therefore a good way to compare players on the same ladder.
However, it is not very helpful when comparing players on different ladders. This is obvious, considering that there are vastly different numbers of players per ladder. For example, rank 1,500 is at the end of the UKF/1v1 ladder, but only in the middle of OH/1v1.
About a year ago, Relic introduced "levels". Those are numbers between 1 and 20, which are derived from the ranks. If a rank is 200 or better, they directly translate to levels:
If the rank is worse than 200, levels are applied in a relative way:
| worse than | 94% | 86% | 80% | 75% | 65% | 55% | 45% | 38% | 31% | 25% | 20% | 15% | 10% | 5% | top 200 |
| level | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
A 1v1 player with ranks 750 and 1,500 for UKF and OH respectively would be in the middle of each ladder. This means he is worse than about 50% of the UKF and OH players and accordingly would have level 7 with both factions. With a rank of 750 for OKW, he would be at 750/2,505 = 30%, so level 10 there.
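The relative levels can be sketched in a few lines. Note the assumptions: the mapping of thresholds to levels 1-15 is inferred from the examples in this article rather than quoted from Relic, the exact handling of boundary values is unknown, and the OH ladder size of 3,000 is a rounded figure.

```python
# "Worse than X%" thresholds; the level numbers 1-15 assigned to them are
# inferred from the worked examples in the text, not quoted from Relic.
THRESHOLDS = [94, 86, 80, 75, 65, 55, 45, 38, 31, 25, 20, 15, 10, 5]


def relative_level(rank, ladder_size):
    """Level for a rank worse than 200, based on the ladder percentage."""
    if rank <= 200:
        raise ValueError("ranks 200 or better use the fixed rank table")
    percent = 100.0 * rank / ladder_size
    for level, threshold in enumerate(THRESHOLDS, start=1):
        if percent > threshold:
            return level
    return 15  # within the top 5%, but not in the top 200


print(relative_level(1500, 3000))  # OH example, 50% -> 7
print(relative_level(750, 2505))   # OKW example, 30% -> 10
```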
It is not clear why ranks above and below 200 are treated differently. However, the approach seems reasonable if we assume that overall skill should be similarly distributed, regardless of how many players play each faction, while on the other hand most top players will play all factions to some extent. Picking rank 200 as the threshold, however, is somewhat arbitrary.
Some fun facts regarding the levels:
- Ladders with fewer than 4,001 members will not have all levels. The reason is that rank 201 is already worse than the top 5%. So for example, rank 200 for UKF will result in level 16; rank 201 however is already only level 13, due to the small number of players on the ladder.
- The threshold for level 16 for arranged 4v4 teams is 179 instead of 200, probably due to the small number of teams. Still, rank 180 is already only level 8, so levels 9-15 are missing on that ladder.
- The level is shown next to the faction icon on the automatch search screen. However, the program shows the best level of the player for that faction across all modes. That is, if for SOV the 2v2 level is the highest, the game will display this level, even when searching e.g. exclusively for a 4v4 match.
Matchmaking in 1v1
While 2,350 1v1 games per day might sound like a lot, it is actually not that much. If we round the figure up to 2,400 games, it means we have about 100 1v1 matches per hour. Still sounds like a lot? Well, assuming everybody waits about five minutes in the queue, we are left with about eight players (100 games / 60 minutes * 5 minutes ≈ 8.3) on each side, Axis and Allied, across all skill ranges.
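The estimate can be retraced with a few lines of arithmetic. The five-minute wait is the assumption made above, not a measured value.

```python
games_per_day = 2400   # rounded up from about 2,350
wait_minutes = 5       # assumed average time spent in the queue

games_per_hour = games_per_day / 24
# Each game needs one Axis and one Allied player, so the number of games
# started within one waiting period equals the players queued per side.
players_per_side = games_per_hour / 60 * wait_minutes

print(games_per_hour)              # 100.0
print(round(players_per_side, 1))  # 8.3
```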
There is no way to find out who was waiting in the queue at a given time. To give some idea about the situation, the diagram above shows the 1v1 matches made on the 6th June, between 20:00hrs and 20:10hrs CET. The upper part shows the distribution of Axis and Allied players vs. the ladder range. The lower part shows who was actually matched with whom, and in what order the matches were made.
NB: There were likely a few more players in the queue during those 10 minutes. On the other hand, probably not all of the players who were matched at the end of the 10-minute period were already available at the beginning.
There are several players queued, but their distribution across the ladders is very different. While there are more Allied players at levels 10 and up, there are more Axis at levels 9 to 7.
The plots above show how many games have been played with each faction across the ladder (in a smoothed version for clarity). Unsurprisingly, the top 8% play more games than the rest. The number of games played as UKF is fairly constant across the skill ranges. Below 20%, SOV is the most popular Allied faction. This changes in the range between 20 and 55%, where USF is more popular.
Likewise, there are different numbers of games played on Allied or Axis sides, depending on the ladder range. Below about 10% and between 23%-50%, playing Axis is more popular. Beyond 50%/level 7, more games are played on the Allied side. As we will see later, this might have some implications on the win ratios for lower ranking Allied players.
The Factions in 1v1 Matchmaking
We saw that in most cases, the Axis players are matched with lower level Allied players. This might actually happen on purpose: Matchmaking is supposed to create even matchups, by finding players with similar ELO values, not similar levels.
The ELO rating reflects both individual player skill, as well as faction performance. Assuming a level 10 player has the same skill with an under- and an overperforming faction, his ELO will be higher with the latter. As a result, he should ideally be matched with different players, depending on which faction he picks.
Accordingly, faction balance is almost irrelevant in this respect, because imbalances are compensated by matching with a better or worse player. Obviously, this does not work for the top players, as "better" players are simply not available (another reason for balancing from the top).
To check how the matchmaking for 1v1 works in detail, the matches were grouped for each faction matchup, according to the ladder ranks (or more precisely, ladder percentage).
The left-hand side of the diagram below shows one group as an example: The green area shows how often a level 6 OH player (rank around 1809) would be matched with USF players of different ladder percentages (for reference, it is also denoted which rank or level that would be). The peak for the games is somewhere at 75%/rank 1500/level 4.5, but there are way more games against better players.
The red-hatched area denotes the number of games won versus USF players of a given ladder position. There are no games won versus USF players better than rank 1000, while the level 6 OH players achieved an almost 100% win percentage against USF players of level 3 or worse.
In order to get a better overview, the games are summed up from low to high ranks. Additionally, win percentages are computed. The result is shown in the diagram on the right-hand side.
The blue curve shows the cumulative number of games, so basically, the green area added up from the left, and normalized to the total number of games. The blue dot denotes the 50% point, so in half of the games, the rank 1809 OH player was matched with rank 1380 USF players or worse.
The dark grey rectangle denotes the central range in which a third of the games happen, so in about 33% of cases, the level 6 OH player is matched with a level 5 USF player. The light grey area then shows the range in which 60% of the games happen.
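The 50% point and the central ranges can be read off a sorted list of opponent positions. The sketch below uses a simple nearest-rank rule and made-up ladder percentages; the function name is hypothetical, and the smoothing and exact method behind the diagrams may differ.

```python
def central_range(sorted_values, share):
    """Bounds of the central `share` of an ascending list (nearest-rank rule)."""
    last = len(sorted_values) - 1
    low = round((0.5 - share / 2) * last)
    high = round((0.5 + share / 2) * last)
    return sorted_values[low], sorted_values[high]


# Made-up ladder percentages of the opponents in eleven games, ascending.
opponents = [52, 55, 58, 60, 61, 63, 64, 66, 70, 75, 82]

median = opponents[len(opponents) // 2]   # the 50% point (blue dot)
print(median)                             # 63
print(central_range(opponents, 1 / 3))    # dark grey range: (60, 66)
print(central_range(opponents, 0.6))      # light grey range: (58, 70)
```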
The green line denotes the (heavily smoothed) win percentage. As expected, the win percentage is 0% when playing against good USF players, while almost 100% against bad players.
The dark green, hatched area denotes the skill range in which the OH player has a 40%-60% win chance. It turns out to be surprisingly narrow, only spanning about 100 USF ranks. The blue dot is on the left boundary of the hatched area, so in half of the games the OH player has to face USF players who are good enough to push the OH player's win percentage below 40%.
The same analysis was carried out for all other skill ranges and matchups. The information from the right-hand side, in particular the light grey, dark grey and hatched areas, as well as the green and blue dots are then compiled in one diagram for all skill ranges:
The blue and green lines denote the position of the blue and green dots in the previous diagram; hatched, dark and light areas also mean the same thing. The red line denotes the win percentage for the different level ranges, including error bars.
The left y-axis now denotes the percentage difference between the USF player and the OH player. So, for example, an OH player at 50%/level 7/rank 1500 would have a 50% winning chance (green line) against a USF player who is roughly 13% worse. In absolute terms, that USF player would be at 63%, which in turn is equivalent to level 6 or a rank of about 1250.
In this case, the blue curve is above the central, black line at 0%. This means that the OH player is commonly matched with USF players who have a lower level. Moreover, the blue line is often below the green line. This means that - while being matched with lower ranking USF players - he actually would have to be matched with even lower ranking USF players, in order to stand a 50:50 chance of winning. Consequently, the win ratios are typically below 50%. That is, excluding OH players better than level 13.
The following round of spoilers contains the plots for all five factions:
Overall there is a good correlation between the win percentages and the relative position of the blue and the green curve: Whenever the green curve is significantly higher than the blue curve, win percentages are below 50% and vice versa.
It might seem counterintuitive that, for example, the OH-vs-SOV curves are not simply the inverse of the SOV-vs-OH curves. To explain why that is, assume that I (about level 9) am always matched with Jove, regardless of what faction I play. The result would be that all of my factions would have a 0% win ratio at level 9, while Jove's factions would enjoy a 100% win ratio at 0%, i.e. at the top of the ladder.
You can often observe that win percentages for Allied players decrease beyond 50%/level 7. This is likely due to fewer games being played with Axis factions in that range, so the Allied players more commonly have to be matched with better Axis players.
OH players tend to be matched with SOV and UKF players who are slightly lower in their respective ladders. Against USF, the difference often is more than one level. For instance, a level 8 OH player would be matched with a level 7 or worse USF player. On the other hand, OKW players tend to be matched with higher ranking SOV and UKF players.
This indicates that USF and OKW are relatively strong at levels below 12.
Part of the reason why certain matchups are not totally balanced might be the simplistic approach of (supposedly) having only one ELO value per faction. To illustrate the issue, let us assume that OH would always win vs. SOV but always lose against USF. OKW, on the other hand, would always win against USF but lose to SOV, rather like Rock-Paper-Scissors. Now, if all factions were played equally often, the win percentage for all factions would be 50%. The faction imbalance cannot be solved with one ELO value per faction, because changing it might improve balance in one matchup, but would have adverse effects on the other.
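The thought experiment is easy to verify. The win probabilities below are the hypothetical always-win/always-lose values from the paragraph above, and every matchup is assumed to be played equally often.

```python
# Hypothetical probability that the Axis faction wins each matchup,
# taken from the always-win / always-lose thought experiment.
axis_win = {
    ("OH", "SOV"): 1.0, ("OH", "USF"): 0.0,
    ("OKW", "SOV"): 0.0, ("OKW", "USF"): 1.0,
}


def overall_win_rate(faction):
    """Average win rate of a faction over its matchups, equally weighted."""
    rates = [
        p if faction == axis else 1.0 - p
        for (axis, allies), p in axis_win.items()
        if faction in (axis, allies)
    ]
    return sum(rates) / len(rates)


for faction in ("OH", "OKW", "SOV", "USF"):
    print(faction, overall_win_rate(faction))  # 0.5 for every faction
```

Every faction ends up at 50% overall, although no single matchup is balanced, so a per-faction rating has nothing left to correct.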
At levels 12 or better, players would commonly be matched with worse opponents, inflating the win-percentages there. The diagrams are not very useful in that range; the next section will deal with faction balance at the top of the ladders.
Win percentages between top players
Coh2charts shows win percentages for the top 250 of each faction. It includes all games for which one player was in the top 250 at the start of the match, regardless of the rank of the opponent.
With this data set, we have the option to look at games played only between top players and for different faction matchups. The issue then becomes how to define a "top player". There are only very few games between level 19/20 players (ranks up to 13), far too few to extract any meaningful statistics.
Also, it turns out that there are only a few players who actually play all five factions at the top level (if at all), so it seems unlikely that the number of top players is the same for each ladder.
The bar diagram below shows the win percentages for the Axis factions for each matchup. Specifically, the bar covers the range within one standard deviation, so there is about a 68% chance that the "true" win percentage is within the bars. There are four sets of bars for different definitions of "top player", namely:
- Top 36 (up to level 18): 352 games
- Top 80 (up to level 17): 1195 games
- Top 2%: 572 games
- Top 3%: 968 games
Note: 1% equals about 30, 23, 25, 20 and 16 ranks for OH, SOV, OKW, USF and UKF, respectively. So the "top 3%" OH vs. UKF bar includes games in which a top 90 OH player played a top 48 UKF player.
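A one-standard-deviation range for a win percentage can be estimated as sketched below. This assumes the usual binomial standard error, which may not be exactly how the bars in the diagram were computed, and the counts in the example are hypothetical.

```python
import math


def one_sigma_range(wins, games):
    """Win percentage range within one binomial standard deviation."""
    p = wins / games
    sigma = math.sqrt(p * (1.0 - p) / games)
    return 100.0 * (p - sigma), 100.0 * (p + sigma)


low, high = one_sigma_range(wins=50, games=100)  # hypothetical counts
print(f"{low:.1f}% to {high:.1f}%")  # 45.0% to 55.0%
```

With only around 100 games per matchup, the range is already ten percentage points wide, which is why small differences between the bars should not be over-interpreted.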
Results differ somewhat, depending on the range defined. Not surprisingly, the OH win percentage goes down when looking at top X%, while UKF percentages go up.
Overall, the matchups including OH seem reasonably well balanced. OKW does poorly against USF, but otherwise seems balanced as well. When concentrating on the top 36 only, OH might have an advantage over USF, while SOV might be doing poorly against OKW.
Balancing for the lower ranks, where apparently USF and OKW are doing well while OH appears slightly weaker than SOV and UKF, would have to be done in a clever way in order to not degrade the balance at the top.