No, seriously: what the heck is expected goals (xG)?


Expected goals may now be starting to appear in more post-match analysis alongside shots on target and the number of corners, but it doesn’t really belong in the same company. While statistics will tell you what has just happened, analytics is able to give you a much clearer idea of what could be yet to come.

“A good example I cite is Juventus in 2015/16,” explains Alexander. “After 10 league matches they had only won three times, but over the 10 games they had scored far fewer goals than you’d expect them to have based on the quality of their chances, and conceded more based on the quality of chances their opponents were creating. Their results had been much worse than their performances had suggested.

“The Turin side had scored 11 goals in those 10 games, when their xG was 19. At the other end they had leaked nine, when expected goals suggested it would usually have been five. Looking at those numbers, we expected things to regress to normal and, lo and behold, the Old Lady’s luck changed. In fact, they won their next 15 Serie A matches on the way to winning another title.”
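The gap Alexander describes is simple arithmetic on the figures he quotes. As a rough sketch in Python (the function and variable names here are purely illustrative, not anything Opta is known to use), the comparison might look like this:

```python
# Figures quoted by Alexander for Juventus after 10 league matches in 2015/16.
goals_for, xg_for = 11, 19          # goals scored vs expected goals
goals_against, xg_against = 9, 5    # goals conceded vs expected goals against


def xg_gap(actual, expected):
    """Actual minus expected (hypothetical helper).

    Negative means fewer goals than the chances suggested.
    """
    return actual - expected


attack_gap = xg_gap(goals_for, xg_for)            # scored 8 fewer than expected
defence_gap = xg_gap(goals_against, xg_against)   # conceded 4 more than expected

actual_goal_difference = goals_for - goals_against
expected_goal_difference = xg_for - xg_against

print(f"Attack vs xG: {attack_gap:+d}")
print(f"Defence vs xG: {defence_gap:+d}")
print(f"Goal difference: {actual_goal_difference:+d} actual, "
      f"{expected_goal_difference:+d} expected")
```

On these numbers the actual goal difference was +2 against an expected +14, which is the kind of mismatch that hints at results catching up with performances.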

Expected goals predicted Juventus's 2015/16 improvement

The same method can be applied to xG figures for an individual player. For example, a largely overlooked centre-forward who has not found the net too often may be about to start scoring for fun – and xG could help you see it coming.

“Harry Kane has consistently scored above his xG for the last three seasons,” says Alexander. “You are never going to sign a young striker on the basis of one season of similar numbers to Harry Kane, although these numbers will help you to spot players who, for whatever reason – be it some poor team-mates or a particularly rotten spell of luck – may be going under the radar.”

Caley – a Spurs fan – was able to use the model to predict Kane’s rise to goalscoring greatness before he'd even achieved the status of ‘one-season wonder’.

“I wrote an article about Kane’s shot production before he’d earned a regular place in the Tottenham line-up,” Caley tells FFT. “It outlined that, in the limited minutes he was getting for Spurs, as well as while out on loan, he had been putting up the type of numbers that looked like those of an elite forward.”

Kane’s numbers during the final months of the 2013/14 campaign – when Tim Sherwood was still in charge at White Hart Lane – were, as Caley says, “through the roof”.

Michael Caley foresaw Harry Kane's rise before he was a first-team regular

It’s not inconceivable that, had a shrewd Premier League rival taken note of the statistics and made a bold offer for the Tottenham rookie in the summer of 2014, when he was still very much on the fringes in N17, he would have recently netted his 100th goal in their colours instead.

But English football hasn’t always welcomed change with open arms. Just as foreign managers of the ’90s were met with some bewildered gawps when they dared suggest downing pints and gorging on steak and chips may not be the perfect preparation for elite-level athletes, those who have more recently attempted to utilise analytical models to evaluate the game have been met with, at best, a mixed response.

Poor old Gab Marcotti certainly isn’t the first person to cite analytical data in assessing a sporting fixture, only to then be immediately shot down by sceptical naysayers.

“We weren't interested in convincing people – frankly it was to our advantage that no one was convinced,” Beane admits to FFT, speaking of his early work in baseball.

A tough sell

Despite it becoming increasingly clear that analytics has plenty to offer, there are still doubters. When xG was made a part of Match of the Day’s graphics from the start of this Premier League campaign, suddenly it was mainstream. Within minutes of its first appearance on screen, social media was awash with mentions of ‘hipsters and stat nerds’, demands for the BBC to ‘get in the sea’ and endless assertions that the numbers are ‘pointless’ and ‘bollocks’.

This was precisely why, as Match of the Day’s editor Richard Hughes explains, the programme always planned to keep the inclusion of expected goals unobtrusive.

“Match of the Day attracts a lot of debate on Twitter and something new like expected goals will always divide opinion – that is why we’ve deliberately made it a pretty low-key introduction,” Hughes tells FFT.

“It's there for people who know about xG already and are keen to see it, but it’s not detracting from the experience of those who don’t.

“We’ve worked very closely with Opta over the past few seasons to integrate a lot more data into the show, and this seemed like a natural progression – something new and innovative. We have had more and more data on screen – not necessarily things that have been spoken about by the pundits, but rather support the visuals that have backed up the points they are making.”

Opta’s Alexander concurs that analytical models such as xG won’t ever replace living, breathing scouts or pundits, but merely aid them.

“We’ve never been zealots,” he says. “We’ve never demanded that people use our data or claimed this stuff is going to replace humans. Expected goals is going to help football clubs make decisions and help pundits illustrate their point. It’s not going to replace the human eye.

“Ultimately, what all these models should do is throw up a little bit of insight and then help people to form cogent arguments,” he adds.

“I would be lying if I said the pundits weren’t a tad sceptical in terms of the value it brings,” admits Hughes. “Gary Lineker, Ian Wright and Alan Shearer know quite a lot about scoring goals, and there have been variables in the model that they’ve questioned when we’ve discussed it – in particular, things such as defensive positioning and long-shot chances. The key for them is always which player has taken the shot.”

So the strikers’ union will always have their say on the performances of their brethren – regardless of the rise of xG – but what about other areas of the pitch? Will we end up having some similar conversations about defensive contributions?

“Events on the ball are what we all focus on, but there are so many other things going on that will impact what happens next,” explains Beane. “There are things that happen on a football field that aren’t being measured, so players don’t get the credit for them. For example, a defensive player, who by virtue of his ability is able to get himself into a position to alter a shot, will completely change the dynamic of the play despite never touching the ball. Eventually, that is the kind of thing you want to measure.”

Billy Beane, whose Oakland Athletics used sabermetrics to narrow their financial disadvantage

The good news for Beane and the world’s best centre-backs is that an analytical way of assessing defensive contribution is in the pipeline.

“Expected goals is the first model and the one that has received the most coverage, but it’s the first in a series of hopefully quite a few we will be using,” says Alexander. “We’re also now working with ‘expected assists’, which is similar to xG, and ‘sequences’ from which you derive a team’s style of play and the pace at which they attack.

“And we’re also working on something called ‘defensive coverage’, which could be big for us because the criticism of Opta event data – and a reasonably valid one – has been that it’s a lot harder to assess defending than it is attacking.”

The future?

Defensive coverage can measure the area of defensive responsibility implied by a player’s defensive actions throughout a match – tackles, blocks, interceptions, clearances etc. So Chelsea’s all-action midfield lunatic N’Golo Kante, for instance, may cover a large area of the pitch, while a full-back in a team that’s being dominated by the opposition will likely have a smaller area.
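The article doesn’t say how Opta actually computes this area. One plausible way to picture it, offered purely as an assumption, is to draw the smallest convex shape around the locations of a player’s defensive actions and measure its size. The sketch below does that in Python; the function names and the pitch coordinates (in metres) are invented for illustration.

```python
def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b turns counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each half is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    n = len(vertices)
    if n < 3:
        return 0.0
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def coverage_area(action_locations):
    """Assumed stand-in for 'defensive coverage': area of the convex hull."""
    return polygon_area(convex_hull(action_locations))


# Hypothetical (x, y) locations of tackles, blocks, interceptions and clearances.
roaming_midfielder = [(10, 10), (90, 10), (90, 60), (10, 60), (50, 35)]
pinned_back_full_back = [(5, 40), (20, 40), (20, 60), (5, 60)]

midfielder_area = coverage_area(roaming_midfielder)
full_back_area = coverage_area(pinned_back_full_back)
print(midfielder_area, full_back_area)
```

With these made-up points the roaming midfielder’s zone comes out far larger than the pinned-back full-back’s, matching the contrast described above. A real model would presumably handle outlying actions more carefully than a plain convex hull does.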

“A good example of that from last season was when Ander Herrera marked Eden Hazard out of the game [between Manchester United and Chelsea] at Old Trafford in April,” says Alexander. “He’s nominally a central midfielder, but the Spaniard’s ‘defensive zone’ was a rough parallelogram on the edge of the right-hand side of United’s box. He was tasked with stopping Hazard, who ultimately didn’t have a single touch inside the penalty area.

“Any pundit who watched that match would certainly have spotted that Herrera performed very well, but up until now there hasn’t really been a way of illustrating that.”

That may not be music to the ears of Craig Burley, half of Twitter and anyone else who’d rather stick their fingers in their ears and pretend football’s ‘data revolution’ isn’t actually happening. But as Billy Beane puts it, “the genie’s out of the bottle now, and it’s not going back in”.

This feature originally appeared in the November 2017 issue of FourFourTwo.