Friday, June 22, 2012

Don’t Give Me That Award! Don’t Fire Me! That Streak Was Just Random!

Why streaks, whether good or bad, are merely random events. There is a reason why they happen all the time.

In the EPL’s most recent season, Man City went 14 games in a row without a loss. At the opposite end of the table, Wolves went 14 games in a row without a win. Fourteen is more than a third of the 38-game season. Impressive, or is it?

Streaks embody the short-term thinking that is prevalent in football today. Managers with good streaks are rewarded with manager of the month awards, while those with poor streaks are fired. Football journalists extrapolate streaks into grandiose long-term statements like “Seventh win in a row opens the Man City Era,” or “Four games without a win, is it time to finally sack Wenger?”

But for as long as we can remember, streaks have always happened. So I wondered, are streaks just random events with no special meaning? Are streaks mere reflections of a team’s overall performance, rather than momentary performance? 

So I did some math, and it turns out that most streaks in the 11-12 season were likely events given each team’s overall record. For example, Man U’s 8-game winning streak had a probability of 61.7%, Man City’s 14 games without a loss had 53.7%, and Wolves’ 14 games without a win had 53.7%. Of the 80 streaks that I calculated (20 each for wins, losses, games without a loss, and games without a win), Wigan’s eight-game losing streak was the least likely, at 2.8%.

In fact, most streaks were likely or highly likely. Only four of the 80 streaks had a likelihood below 30%: Arsenal’s seven-game winning streak (21.8%), Newcastle’s six-game winning streak (24.5%), Newcastle’s six games without a win (24.3%), and the Wolves’ streak mentioned above. The implications of these numbers are explained later on.

A brief note on the methodology (you may skip this part if you take my word for it): I measured each team’s overall performance from its W/D/L count. Using these numbers, I calculated the probability of each streak by recursive reasoning. To provide a snapshot, the probability of a seven-game Man City winning streak in 38 games can only be derived from its probability in 37 games, which depends on its probability in 36 games, and so forth. This ladder of calculation stops at the probability of a seven-game streak in seven games, which is simply 0.737 (Man City’s overall win percentage) raised to the seventh power. Thus, rather than having a closed-form formula, I had to use Excel to derive a plethora of numbers, and the results are summarized below.
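For readers who would rather see the ladder in code than in a spreadsheet, here is a minimal Python sketch of one standard way to set up this kind of recursion. It is not the original spreadsheet: the function name `streak_probability` and its parameters are made up for illustration, and it assumes every game is independent with the same success probability. It may therefore not reproduce the percentages quoted in this post exactly.

```python
def streak_probability(p, k, n):
    """Chance of at least one run of k straight successes in n games.

    Assumes (for illustration) that games are independent and that each
    one is a success with the same probability p, e.g. the season win rate.
    """
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0

    p_k = p ** k                 # chance of k successes in a row
    prob = [0.0] * (n + 1)       # prob[m] = chance of a streak within m games
    prob[k] = p_k                # bottom rung: k games, all successes

    for m in range(k + 1, n + 1):
        # Either a streak already happened in the first m-1 games, or the
        # first streak ends exactly at game m: no streak in the first
        # m-k-1 games, a failure in game m-k, then k straight successes.
        prob[m] = prob[m - 1] + (1.0 - prob[m - k - 1]) * (1.0 - p) * p_k

    return prob[n]


if __name__ == "__main__":
    # Seven-game winning streak for a team that wins 28 of 38 games.
    print(streak_probability(28 / 38, 7, 38))
```

The same function covers the other streak types by changing what counts as a success: for a run without a loss, `p` would be the share of games won or drawn.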

As Figure 2 shows and as mentioned above, most streaks are either likely or highly likely. There are exceptions, however, and these exceptions give us some insight into the 11-12 season.

Arsenal and Newcastle's win streaks are impressive. With such low likelihoods, their momentary performances are admirable. In contrast, if you look at the losing streaks of the poorly performing teams, the likelihoods are relatively high, except for Wolves. To be more specific, look at Arsenal, Tottenham, and Newcastle in their winning streaks, and then look at Aston Villa, QPR, Bolton, and Blackburn's losing or without-a-win streaks. The contrast in these numbers shows that Arsenal, Tottenham, and Newcastle at times performed beyond their usual level, while the lowly teams performed exactly at their usual levels. This implies that good teams do create unlikely winning momentum that eventually contributes to their high league standings.

The data also shows that Tottenham is the most erratic team. All four of their streaks have a likelihood in the 0.300s, meaning that every one of them was somewhat unlikely. Having unlikely good streaks as well as unlikely bad streaks shows that Spurs were inconsistent, perhaps even moody.

This study can be immediately improved in two ways. The first is the definition of a streak. In this calculation, I only considered win-only, no-loss, loss-only, and no-win streaks. Wigan's last 9 games (7W 2L) or Liverpool's games 24-32 (just one win in 9 games) could be considered streaks as well, but they were left out of this study because they fall outside that definition.

Another, more challenging aspect of the study is the confounding effect. This study assumes that a team's season win percentage is constant throughout the season. In other words, Man City's 0.737 win percentage is treated as an innate quality of the team that was displayed from matchday 1 all the way through matchday 38. Results in soccer, however, can turn on a single goal. Draws can turn into wins because of the smallest variables. Hence, if one thinks about those three or four games that so-and-so team could have won, the calculations above would be wildly different had the outcomes been different, especially if those games were back to back.

1 comment:

kabcity said...

Tommy this is incredible stuff. Correct me if I'm wrong, but isn't it commonly said that Spurs are quite inconsistent? I love when data verifies conventional wisdom, though as you said for the most part streaks get a lot of attention even though they are random. Anyway, I'm sure you can publish this somewhere.

Another fascinating bit (quite tedious but no more complex than what you've already done) to measure whether or not a streak was notably "good" or "bad" would be to calculate what the likelihood of a particular streak was at the time it started. E.g., looking at a team's previous 50 games (going back to past seasons), what is the probability that the team will go on a 14 game Win+Draw streak? Man City's streak may be unremarkable for 2011-12, but at the time it started it may have had a low probability of happening.