Hey! Have you considered becoming a paid subscriber to this newsletter but you’re on the fence? Well here’s a great way to try it out. John Muller of the excellent Space Space Space has an offer where he’ll give you a free month of Grace on Football or Ryan O’Hanlon’s No Grass in the Clouds if you subscribe to his newsletter. The deal runs until Monday and it’s a great way to dip your toes into any of the three newsletters.
After Liverpool’s win over Leicester City, FiveThirtyEight’s Club Soccer Predictions model did something it hadn’t done all season: it made Liverpool the favourites to win the title. I thought this was notable, so I tweeted about it.
The responses had a pretty clear theme.
Pep Guardiola’s side are in 13th place with a negative goal difference. Everyone who’s anyone seems to think City are in the mud. Except FiveThirtyEight. What’s happening here? Is the data missing something that we can all see? Is the model weighting the wrong stuff? Is it actually totally right and we’ve all overreacted to a slow start? Let’s see what we can figure out.
Full disclosure: I have done freelance work for FiveThirtyEight in the past and may do again in the future. I’ve never spoken to anyone involved in putting the model together and have no inside knowledge of how the sausage is made beyond what they make publicly available. These are my thoughts and mine alone.
First of all, let’s look at what the model, as of Friday 27th November, is forecasting. The Champions League performances actually flipped things back around to City.
With the current table being what it is, this would be quite the turnaround. City would take 69 (nice) points from their remaining 30 games, at a clip of 2.3 points per game. Liverpool meanwhile would take just 59 points from their remaining 29 (2.03 ppg), Chelsea 53 points (1.83 ppg) and Spurs just 50 points (1.72 ppg). The model, in other words, strongly expects City to win more games than their rivals between now and the end of the season. If not for the other sides’ points in the bank, it’d be predicting a blowout, just as it did before the season began. So how on Earth did it get there?
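If you want to sanity-check those run rates, they’re just the projected points still to come divided by the games remaining. A quick sketch of that arithmetic (mine, not the model’s; the games-remaining figures follow from the per-game rates quoted above):

```python
# Implied points-per-game for the rest of the season, from the projected
# points and games remaining quoted above (City's game in hand is why their
# denominator is 30 rather than 29).
remaining = {
    "Man City": (69, 30),
    "Liverpool": (59, 29),
    "Chelsea": (53, 29),
    "Spurs": (50, 29),
}
for team, (points, games) in remaining.items():
    print(f"{team}: {points / games:.2f} points per game")
# Man City: 2.30, Liverpool: 2.03, Chelsea: 1.83, Spurs: 1.72
```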
At the core of FiveThirtyEight’s model is the SPI rating. This is, pretty straightforwardly, about putting a number on how good or bad any football team is, across different leagues and countries. In most sports, FiveThirtyEight starts with every team or player’s Elo rating as the basis. Elo began as a means to evaluate chess players but quickly became a great tool across many sports, with the FIFA Rankings being an obvious example of its uses. You can check out its judgement of the European club game here.
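For anyone who hasn’t come across it, the basic Elo update fits in a few lines. This is the generic chess-style version, purely to show the mechanism rather than FiveThirtyEight’s exact maths, and the K-factor of 20 is just an illustrative choice:

```python
# A minimal, generic Elo update: gain more rating for beating stronger opponents,
# lose more for losing to weaker ones. Not FiveThirtyEight's actual SPI formula.

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that team A beats team B under the Elo model."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def update_elo(rating_a: float, rating_b: float, result_a: float, k: float = 20) -> float:
    """Return team A's new rating. result_a is 1 for a win, 0.5 for a draw, 0 for a loss."""
    return rating_a + k * (result_a - expected_score(rating_a, rating_b))

# A 1600-rated side beating a 1700-rated one gains more than it would
# for beating an equal opponent.
print(round(update_elo(1600, 1700, 1), 1))  # 1612.8
```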
The problem is that football, unlike almost every other sport where Elo has been applied well, is so low-scoring. You can play a 38-game campaign and still lose out on the title on the basis of a single goal. We just don’t have a big enough sample size to accurately evaluate teams on results alone. It’s a frustrating side effect of what makes football such a thrilling sport.
And so FiveThirtyEight measures performances on three axes: “adjusted goals”, “shot-based expected goals” and “non-shot expected goals”.
Adjusted goals is the scoreline but, well, adjusted. You score three against a side with ten men, and they’ll downrate that because scoring with a man advantage is easier. You score late on when the scoreline is all but settled and the goal is meaningless, same deal. It’s a straightforward goals model, but with a little control for certain factors.
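To make that concrete, here’s a toy version of the idea. The 0.7 and 0.5 weights are ones I’ve made up purely for illustration; the actual discounts are FiveThirtyEight’s own and aren’t something I know:

```python
# Toy "adjusted goals": goals scored in easier circumstances count for less.
# The weights below are invented for illustration only.

def adjusted_goal_value(opponent_down_a_man: bool, garbage_time: bool) -> float:
    value = 1.0
    if opponent_down_a_man:
        value *= 0.7  # scoring against ten men is easier, so the goal counts for less
    if garbage_time:
        value *= 0.5  # late goals in an already-settled game tell us little
    return value

# A 3-0 win where the third goal comes in stoppage time against ten men:
goals = [
    adjusted_goal_value(False, False),
    adjusted_goal_value(False, False),
    adjusted_goal_value(True, True),
]
print(round(sum(goals), 2))  # 2.35 adjusted goals rather than 3
```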
Shot-based expected goals (aka “normal” expected goals) I’d expect we’re all familiar with at this point. FiveThirtyEight uses an Opta-based model, so I wouldn’t expect anything too fishy or out of the ordinary here.
Non-shot expected goals attempt to estimate how many goals a team “should” have scored using everything except shots. Passes into dangerous areas, that sort of thing. These models, from what people who have attempted them say, are kind of difficult to get right for a lot of reasons. FiveThirtyEight likely isn’t an exception here, but it’s a useful corrective at times when strange things are happening with the shot-based model, and it significantly increases the sample size of “stuff we’re measuring on the pitch”.
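As a very rough sketch of the concept, and emphatically not FiveThirtyEight’s actual model, you can think of it as crediting on-ball actions by how dangerous their location is, whether or not a shot ever arrives. The zone values below are invented for illustration:

```python
# Toy "non-shot" danger model: every touch or completed pass into a zone earns
# a small amount of expected-goal value, regardless of whether a shot follows.
# Zone values are made up for illustration, not FiveThirtyEight's numbers.

ZONE_VALUES = {
    "own_half": 0.002,
    "final_third": 0.02,
    "penalty_area": 0.10,
}

def non_shot_xg(touches_by_zone: dict) -> float:
    """Sum the danger value of every on-ball action, grouped by pitch zone."""
    return sum(ZONE_VALUES[zone] * count for zone, count in touches_by_zone.items())

# A team that works the ball into the box repeatedly racks up non-shot xG
# even if only a handful of those moves end in an actual shot.
print(round(non_shot_xg({"own_half": 200, "final_third": 40, "penalty_area": 8}), 2))  # 2.0
```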
These three numbers are what feed into the Soccer Power Index* (SPI). This allows a sort of Elo or FIFA Rankings-style model, but built on data about how well the teams played rather than simply the result. Perform well against a very good team, even if you lose, and your SPI will go up. Play poorly against a weaker side, even if you win, and your SPI will go down. You get the picture. This is then what lets the model predict the result of every game and simulate an end-of-season table.
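The simulation step itself is conceptually simple, even if the match model underneath it isn’t. Here’s a bare-bones sketch that assumes each remaining fixture can be reduced to an expected goals figure for each side and a plain Poisson draw, which is much cruder than what FiveThirtyEight actually does; the teams, points and fixtures are invented:

```python
# Monte Carlo sketch of "simulate the rest of the season and count title wins".
import numpy as np

def simulate_match(xg_home: float, xg_away: float, rng) -> tuple:
    """Draw a scoreline from each side's expected goals and return the points awarded."""
    home, away = rng.poisson(xg_home), rng.poisson(xg_away)
    if home > away:
        return 3, 0
    if home < away:
        return 0, 3
    return 1, 1

def title_odds(current_points: dict, fixtures: list, n_sims: int = 10_000) -> dict:
    """Play out the remaining fixtures many times and count who finishes top."""
    rng = np.random.default_rng(0)
    wins = {team: 0 for team in current_points}
    for _ in range(n_sims):
        table = dict(current_points)
        for home, away, xg_h, xg_a in fixtures:
            points_home, points_away = simulate_match(xg_h, xg_a, rng)
            table[home] += points_home
            table[away] += points_away
        wins[max(table, key=table.get)] += 1
    return {team: count / n_sims for team, count in wins.items()}

# Invented example: a side that's behind on points but rated to outscore its rival.
print(title_odds({"Man City": 12, "Liverpool": 20},
                 [("Man City", "Liverpool", 1.9, 1.3)] * 5))
```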
City started the season with a very strong SPI. The data saw them as the clear best team in the Premier League, and weighted them accordingly before a ball was kicked back in early September.
City spent all of last season as the most xG-dominant side in the Premier League. We can sit here all day and argue about how much this really means when Liverpool won the title so handily, and feel free to do that yourself anywhere except my Twitter mentions. But the model only knows what it knows. What it knew was that City were putting up huge, huge numbers.
Added to this was the specific level of dominance in the restart period. This was obvious just looking at the results, with City scoring 34 and conceding just 4, but the metrics were just as overwhelming. The context is that these fixtures weren’t all that important, but it’s difficult for the model to properly parse this. This gave City a very strong baseline going into the start of the season. The model’s starting assumption was thus that City were clearly the best team in England, and it would need evidence to move off that view.
So what does it have in the data now?
City’s adjusted goals are, unsurprisingly, not too far from the actual scorelines. Guardiola’s team have an “adjusted goal difference” in the Premier League of -0.9. You don’t need me to tell you that’s bad. BUT. City have won games handily in the Champions League against decent opposition, and that all counts towards the model. If you include their European results, City’s adjusted goal difference jumps to +7.3. That works out at +0.6 per game, which isn’t outstanding, and it’s definitely down from their previous levels. It’s probably hit their SPI at least a little bit, but not enough for the model to really panic.
Shot-based expected goals tells something of a similar story. City’s xG difference here is +3.5, which is well off their standards of the last few seasons. This has been the core reason most other statistical readings have panicked: it’s not just bad finishing leading to dropped points here. But again, City have done really well in the Champions League, and that counts. Their shot-based expected goal difference across both competitions is +10.6, or about +0.9 per game. This isn’t as good as previous years, but by the measure of mere mortals, it’s pretty damn good.
Then we get to non-shot expected goals, which is the big variation in how FiveThirtyEight does it compared to most. And just as shot-based xG liked City a little more than adjusted goals, non-shot xG likes them a little more again. Their non-shot xG difference is +7.6 in the league and +16.2 overall. That works out at +1.35 per game. Now we’re getting somewhere.
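For the record, here’s how those totals convert to the per-game figures. The 12-match count (league plus Champions League so far) is my own assumption rather than anything taken from the model:

```python
# Converting the quoted goal-difference totals into per-game figures,
# assuming 12 matches played across both competitions (my count).
matches_played = 12
totals = {
    "adjusted goal difference": 7.3,
    "shot-based xG difference": 10.6,
    "non-shot xG difference": 16.2,
}
for metric, diff in totals.items():
    print(f"{metric}: {diff / matches_played:+.2f} per game")
# adjusted: +0.61, shot-based: +0.88, non-shot: +1.35
```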
On the attacking side, the non-shot and shot-based xG models look pretty similar. There’s a slight gain for the non-shot model (23.2 against 21.7 overall), but nothing notable. The defensive side, however, is where the split shows up a bit more. City have conceded 11.1 shot-based expected goals (again, across both competitions) and just 7.0 non-shot xG. My hunch about what’s going on here is counter attacks. If a team gets a really good scoring chance against City from a fast counter, where the chief advantage is running into open space, that’s quite difficult for the non-shot model to pick up on. It just sees a pass into the final third, then a dribble into the box, or something like that. The shot-based model at least has Opta’s big chances data to “correct” this sort of thing a little bit.
It might be that the non-shot model is doing a good job here, and we shouldn’t expect teams to get on the end of chances against City so easily going forward. Equally, it might not quite be seeing these effective counters as I suggested. Annoyingly, it’ll probably take a larger sample size before we can be confident on this.
The same can be said of the Champions League performances. I know everyone likes to joke about how easy City’s draws are, but Porto, Olympiacos and Marseille are not worse than the average Premier League side. It might just be that City’s really good performances happened to take place in those games rather than domestically. It might tell us something real, but the data really isn’t there to back it up.
Those are the two key things to watch with the team going forward: are their performances in Europe materially different from their performances in the Premier League, and are they conceding chances in a fast counter-attacking way that a non-shot xG model would struggle to pick up? Right now, I don’t think we can be confident that FiveThirtyEight’s model is “wrong” about them, but it’s certainly something to keep a close eye on.
*And Transfermarkt values at the start of the season, but those don’t continue to be important and weren’t really a significant factor in the title race projections. We’re just overcomplicating it now.