We mapped all 117,116 of England’s farms – here’s why that matters for transition finance and net zero
Decarbonising the agricultural sector is at the heart of the climate challenge. In the UK, as in much of the world, it is responsible for 11% of total greenhouse gas (GHG) emissions. Yet we know little about the farms and farmers managing these landscapes because asset-level farm data, including ownership, land use, and production, is lacking, which hinders effective transition finance and decarbonisation efforts.
If we are to meet our climate and biodiversity targets, this data gap must be addressed. That is what motivated our latest research, published in Scientific Data, where my co-authors and I have released a first-of-its-kind asset-level geospatial dataset of farmland ownership and land use across England.
Despite agriculture’s role in the climate and nature crisis, financial actors have long struggled to assess and support change at the farm level because no single open registry exists. In other sectors like energy or real estate, we can trace emissions and financial risk to specific assets. Not so in agriculture. We have national statistics and satellite images, but no reliable way to match ownership, land use, and production at the level of individual farms. This lack of asset-level resolution makes it hard to channel finance to the right places, assess risk, or verify the impact of interventions such as regenerative agricultural practices, carbon credits, or biodiversity net gain (BNG) schemes.
To create the dataset, we stitched together multiple fragmented sources of public data: land parcel registries, subsidy payments (Rural Payments Agency), company ownership records (Companies House), and crop maps derived from satellite imagery (CROME). The biggest challenge was inconsistency between records, where names or addresses were misaligned, which made the data difficult to clean; moreover, we found that ownership is frequently hidden behind shell companies or outdated records.
We addressed these problems by using natural language processing (NLP) to match entities across noisy datasets and unsupervised learning (where an algorithm learns from unlabelled data) to map farm names to spatial polygons, filling ownership and entity gaps. While wrangling the data, we had to navigate privacy concerns, public access restrictions, and a lack of standardisation across datasets. Some of the most challenging work was simply understanding the oddities of how English farmland is recorded. But perhaps the biggest challenge was conceptual: how to represent the overlapping realities of land tenure, ownership, and use. A single plot might be owned by one entity, farmed by another, and claimed under subsidy by a third. Representing that in a consistent, accurate format required new data structures and a lot of careful assumptions.
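To give a flavour of what matching entities across noisy records involves, here is a minimal sketch in Python. It is not the pipeline used in the paper: it swaps in simple fuzzy string matching from the standard library's `difflib` for the NLP models, and the names `ParcelTenure`, `normalise`, `best_match`, the suffix list, and the 0.85 threshold are all illustrative assumptions. The small record type shows one possible way to keep owner, farmer, and subsidy claimant separate for the same plot.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical list of legal-form words to ignore when comparing names.
LEGAL_SUFFIXES = {"ltd", "limited", "llp", "plc"}


@dataclass(frozen=True)
class ParcelTenure:
    """Illustrative record: one plot, three potentially different entities."""
    parcel_id: str
    owner: Optional[str] = None
    farmer: Optional[str] = None
    subsidy_claimant: Optional[str] = None


def normalise(name: str) -> str:
    """Lower-case, strip punctuation, and drop legal-form words."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def best_match(name: str, candidates: list, threshold: float = 0.85) -> Optional[str]:
    """Return the candidate most similar to `name`, or None if below threshold."""
    target = normalise(name)
    scored = [
        (SequenceMatcher(None, target, normalise(c)).ratio(), c)
        for c in candidates
    ]
    score, match = max(scored, default=(0.0, None))
    return match if score >= threshold else None


# Made-up example names, not taken from the dataset.
company_register = ["GREEN ACRE FARMS LIMITED", "Hilltop Dairy LLP"]
claimant = "Green Acre Farms Ltd."

record = ParcelTenure(
    parcel_id="P-001",
    owner=best_match(claimant, company_register),
    subsidy_claimant=claimant,
)
print(record)
```

In practice a threshold like this trades missed links against false matches, which is one reason the resulting ownership links rest on careful assumptions rather than certainty.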
In England, this approach identified 117,116 farming entities with essential attributes such as addresses, land areas, crop types, production output, and geospatial coordinates. Such emerging datasets are also critical for financial instruments supporting sustainable agriculture, enabling verification of carbon credits, enhancing sustainability-linked loans, and improving risk assessment for climate finance.
We see this foundational work as a first step in providing open-source information for financial institutions and corporates that are engaged with the agricultural sector. Since the dataset itself requires further fine-tuning (dependent on the availability of secondary data), we hope this work inspires others to push for greater transparency in land registry systems. Just as financial markets rely on open data to price risk, sustainable land use requires visibility of who owns and uses the land, and how. This dataset moves us in that direction.
Dr Hassan Aftab Sheikh is an applied earth scientist working at the intersection of finance, nature, and data science.