We looked across Europe to see which regions were the most -- and least -- productive at turning out professional players.
Q: How did you do that?
A: First, to give us a base from which to work, we plotted the birthplace of every footballer from 18 European leagues. The data are based on the second half of the 2017-18 season, i.e. once the January transfer window closed.
We knew this approach would not be perfect, as not everyone grows up and learns his football in the place he was born. This is especially true in Europe, where there have been important migration patterns in the past few decades.
However, other than those who relocate for non-footballing reasons, the majority tend not to go far until they are at least 14 and, in most cases, 16 or 18, mainly because most football associations have rules that prohibit kids from moving across the country. By that age, most future professionals have already been heavily scouted.
Q: How did you split up the regions?
A: We took guidance from the European Union, which breaks the continent into regions called NUTS, from the French Nomenclature des unités territoriales statistiques.
This was convenient because the EU provides detailed statistical data on each NUTS region, and we wanted to see which demographic and socio-economic factors -- per capita GDP, migration patterns, level of education, weather -- correlate with producing footballers.
There are three levels of region. NUTS1, with a minimum of three million people, was too big: England has just nine NUTS1 regions and Italy a mere five. NUTS3 was too small: the minimum for a region is 150,000, and Germany alone has 429 of them. NUTS2 felt like a good compromise.
Q: What are the problems of using NUTS?
A: Again, the system is not perfect. For example, London and Paris have comparable metropolitan populations, yet London straddles no fewer than five NUTS2 regions, whereas Paris and its suburbs are lumped into one. Still, as long as you were mindful of this and willing to look at the wider picture, it made sense.
Q: How did you judge which regions were strongest?
A: We had the number of players from each region, which was important. However, not all regions are the same size, so we calculated the number of professional players-per-million (PPM), and that told us which areas were more productive in relative terms.
That said, we knew it would not be enough on its own, and we needed some kind of "qualitative control." Some leagues are stronger than others, so a region with 20 guys playing in, say, the Czech Republic isn't quite as impressive as one with 20 guys playing in England or Germany.
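The players-per-million calculation described above can be sketched in a few lines. This is only an illustration: the region names and figures are made up, and the function name is our own rather than anything from the study.

```python
def players_per_million(players: int, population: int) -> float:
    """Return the number of professional players per million inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return players * 1_000_000 / population


# Hypothetical regions with invented figures, for illustration only.
regions = {
    "Region A": {"players": 40, "population": 8_000_000},
    "Region B": {"players": 12, "population": 600_000},
}

# PPM for each region, then the regions ordered from most to least productive.
ppm = {
    name: players_per_million(r["players"], r["population"])
    for name, r in regions.items()
}
ranked = sorted(ppm, key=ppm.get, reverse=True)

for name in ranked:
    print(f"{name}: {ppm[name]:.2f} PPM")
```

Region A has more players in absolute terms, but Region B ranks first once population is taken into account, which is the point of using a relative measure.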
Q: How did you solve this problem?
A: We narrowed the focus to look at the top flights of Europe's "Big Five" leagues: England's Premier League, Germany's Bundesliga, France's Ligue 1, Spain's Primera Division and Italy's Serie A.
Suddenly, the map changed, and the evident hotbeds emerged: The Basque Country and neighbouring Navarra in northern Spain clocked in at more than 20 PPM each, for example, while Paris, East London, Corsica, Copenhagen, Iceland and the Canary Islands also stood out.
We knew this approach was not foolproof. Those leagues would be over-represented because their squads include young players and others being given a chance who might prove not to be good enough. Still, it was a starting point.
Q: Does sample size count?
A: Absolutely. We ran into another issue at this point, though: Certain areas appeared productive in relative terms, but because populations were small, it took only a couple of players for them to clock in at more than 10 PPM.
For example, the Belgian province of Luxembourg -- not to be confused with the country of Luxembourg, which is next door -- has 10.59 PPM, but that amounts to just three guys: David Henen, Timothy Castagne and Thomas Meunier.
This was not quite enough to justify calling it a hotbed and meant that, beyond looking at the relative numbers, the absolute numbers -- indeed, the list of the players themselves -- had to be considered.
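The small-sample problem can be made concrete with the figures quoted above. The sketch below backs out the population implied by three players at 10.59 PPM and shows how sensitive the figure is to a single player. The two thresholds in `is_credible_hotbed` are illustrative assumptions, not values used in the study.

```python
def players_per_million(players: int, population: int) -> float:
    """Return the number of professional players per million inhabitants."""
    return players * 1_000_000 / population


def is_credible_hotbed(players: int, population: int,
                       min_ppm: float = 10.0, min_players: int = 10) -> bool:
    """Require a high relative figure AND a minimum absolute head count.

    Both thresholds are hypothetical and chosen only for illustration.
    """
    return (players >= min_players
            and players_per_million(players, population) >= min_ppm)


# Population implied by the article's figures: 3 players at 10.59 PPM.
implied_population = round(3 / 10.59 * 1_000_000)

with_three = players_per_million(3, implied_population)
with_two = players_per_million(2, implied_population)

print(f"implied population: {implied_population}")
print(f"3 players: {with_three:.2f} PPM, 2 players: {with_two:.2f} PPM")
print("credible hotbed:", is_credible_hotbed(3, implied_population))
```

With one player fewer, the same region would fall to roughly 7 PPM, which is why the absolute numbers have to be read alongside the relative ones.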
Q: How were those issues addressed?
A: We needed a bigger, broader sample, so we decided to focus on each of the Big Five countries individually, while also delving into their lower tiers. For example, with Germany we considered German-born players plying their trade in the top two tiers of German football, as well as those in the top flights of French, English, Italian and Spanish football.
That gave a good localised picture and, crucially, samples big enough to be meaningful. Therefore, we repeated the same process for each of the Big Five -- adding that country's second tier -- except for England, where we drilled down all the way to the fourth level of the pyramid. This was necessary because the Premier League and the Championship have a disproportionately high number of overseas players. To find enough English-born ones, we had to go deeper.
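The selection rule for each country-level sample can be written down as a small filter. This is a sketch of the rule as described above, not the code used for the study; the function and constant names are our own.

```python
BIG_FIVE = {"England", "Germany", "France", "Spain", "Italy"}

# Deepest domestic tier included for each country: the second tier
# everywhere except England, where the sample goes down to the fourth.
MAX_HOME_TIER = {"England": 4, "Germany": 2, "France": 2, "Spain": 2, "Italy": 2}


def in_country_sample(study_country: str, birth_country: str,
                      league_country: str, tier: int) -> bool:
    """Decide whether a player belongs in the sample for study_country.

    A player counts if he was born in the study country and plays either
    in that country's leagues down to the allowed tier, or in the top
    flight of one of the other Big Five countries.
    """
    if birth_country != study_country:
        return False
    if league_country == study_country:
        return tier <= MAX_HOME_TIER[study_country]
    return league_country in BIG_FIVE and tier == 1


# German-born player in Germany's second tier: included.
print(in_country_sample("Germany", "Germany", "Germany", 2))
# German-born player in France's second tier: excluded.
print(in_country_sample("Germany", "Germany", "France", 2))
# English-born player in England's fourth tier: included.
print(in_country_sample("England", "England", "England", 4))
```

Keeping the tier depth in a lookup table makes the England exception explicit and easy to adjust.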
We also included the entire British Isles (including the Republic of Ireland) in the data set. This made sense because, while the Irish top flight is nominally professional, financially it cannot compete with the English league system. Moreover, English clubs have long had scouting networks in Ireland, to the extent that the region known as Southern and Eastern contributes proportionally more players to the English game than most of England itself.
There were challenges -- plenty of them -- but we believe this is the most accurate snapshot of where today's top professionals come from. What's more, it offers plenty of opportunity for further study.