Alick Bird and Andrew Plume, International Center for the Study of Research, Elsevier
Elsevier’s International Center for the Study of Research (ICSR) was delighted to support Advanced Oxford’s landmark Oxfordshire’s Innovation Engine 2023 report by co-creating the Oxfordshire Innovation Ecosystem Dashboard that supports many of its key findings. Indeed, a large part of ICSR’s mission is to develop rigorous and evidence-based approaches to tracking how research-based knowledge generates societal and economic impact for local and national economies. In this blog post we’d like to share some of the data challenges that arose in the course of this work, in the hope of highlighting improvements that could make such exercises far less laborious in future.
Given the importance of local innovation as a driver for economic growth across the UK, we worked pragmatically to assemble datasets from multiple sources to produce a dashboard that could be readily applied to other geographies in the UK. Our challenges with data largely related to difficulties in obtaining data that were sufficiently granular (i.e. disaggregated), clean (i.e. disambiguated) and/or complete. For example, we were not able to obtain data disaggregated at company level from HMRC’s Research and Development (R&D) expenditure credits or ONS’ Inter-Departmental Business Register, either of which could have enriched our cross-linked dataset, which combines data from Companies House, Innovate UK, Scopus (publication) and TotalPatent One (patent) with lists of university spin-outs and science & business park tenants.
Without wishing to add to the online chorus of dissatisfaction with ONS’ Standard Industrial Classification (SIC) system, we’d like to highlight how these outdated, sometimes poorly assigned, and insufficiently granular codes meant we were unable to robustly answer a seemingly simple question: “How many of Oxfordshire’s companies specialise in artificial intelligence?” No SIC code from the 2007 revision even mentions ‘artificial intelligence’, so businesses are forced to choose a far less informative category such as “62012: Business and domestic software development”, which includes everything from “Data analysis consultancy services” and “Programming services” through to the cryptically named “Software house”. An overhaul of the SIC system is long overdue, and ironically several AI technology companies are already attempting to solve this problem.
Why does this all matter? Aside from making life harder for analysts, the collective upshot of these issues is that it is impossible to create a single-source-of-truth answer to apparently straightforward questions such as “How many innovative companies are there in Oxfordshire, how many people do they employ and what value do they add to the economy?” Instead, myriad qualifications and imperfect lenses must be applied to the question, making the answer less clear and robust than it should be if policymakers and the public are to continue to support public investment in the knowledge economy. At the UK level, if we are to earn the tag ‘global science & innovation superpower’, we need to be able to show with rigorous and reproducible statistics where we started from and demonstrate when we have arrived.