Harnessing Machine Learning for Sustainable Farming and Water Protection

Stroud Water Research Center

Machine Learning Will Help Scientists Analyze Massive Amounts of Environmental DNA Data to Discover Impact of Ag Practices

By Jinjun Kan, Ph.D.

The Evolution of DNA Sequencing

It’s been over two decades since scientists completed the Human Genome Project in 2003. The effort led to advancements in DNA sequencing technology.

Back then, examining a whole community of organisms, such as those in a soil or water sample, wasn’t easy. Microbial ecologists relied on DNA fingerprinting methods or clone libraries to identify which microbes exist in the world around us. It was a time-consuming process that required expert skill and care to interpret the data. DNA had to be extracted, amplified, separated based on sequence and structure, and eventually analyzed to determine the makeup of a microbial community. After all that, only a small fraction of the microbes in a sample could be described using this technology.

But that was then. 

The technology has advanced further, and scientists can now characterize the whole community in an environmental sample. A beaker of water or a clump of soil can be analyzed for its detailed genetic composition. That genetic information reveals a complex, interconnected world of bacteria, archaea, and fungi, along with traces of organisms that passed through and left organic material behind.

How Machine Learning is Revolutionizing Environmental DNA Analysis

Imagine you have one gram of soil from the farm near your home. After extracting the DNA and sequencing it, you have a dataset of more than a million sequences representing thousands of species. A scientist would be hard-pressed to interpret that much data in a meaningful way. More than that, a single location and a single sample are not sufficient for any ecological study, so multiply the data by hundreds or even thousands of samples to decipher which microbes are present. Ecological studies also need to describe how communities differ across locations and how microbial communities change their environment over time.

This is where machine learning is set to open a new expanse of knowledge about how our environment functions both naturally and when humans alter it.

The Role of Microbes in Sustainable Agriculture and Ecosystem Health

As a molecular microbial biologist, I study these types of whole samples. In agricultural settings, we can learn a lot about the productivity or health of a farm’s soils over the long term by understanding how nutrients flow through the farming system at the tiniest scales — through the bacteria, archaea, and fungi. 

A cross-section showing cover crop roots growing deeply into the soil.

At Stroud Water Research Center, we study freshwater ecosystems, and the practices that affect those ecosystems, so we pay great attention to agriculture. My research focuses on how microbial communities affect the soil, nutrients in the soil, agricultural productivity, and ultimately the streams and groundwater that leave the farmer’s fields.

The development and advancement of DNA sequencing technology has produced mind-boggling quantities of raw data. Now we are challenged to analyze the data and then interpret it. What does the presence or absence of microbes mean for the soil? For nutrient cycling? For the crop? For the water flowing over and under the field? 

Jinjun Kan in his laboratory.

Machine learning can assist with data analytics, and that’s where my research, with colleagues from around the world, is contributing. Our first step was to analyze the microbial data with machine learning approaches that do not assume microbial communities change linearly, which overcomes a limitation of conventional analyses. With data at a finer resolution, random forest algorithms can improve our ability to make accurate predictions.

We can examine interactions between individual microbes and groups of microbes as well as interactions between microbes and their environments. We are demonstrating how identifying groups of microbes in the upper soil layers can reliably reveal the impact of various farming practices. 
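The article names random forests but does not describe an implementation, so the following is only a minimal sketch of the general idea, not the pipeline used in this research. It assumes a made-up table of microbial abundances in which two of six taxa respond to a hypothetical farming practice label, and it uses a simplified forest of one-split trees written in plain Python. All names, numbers, and labels here are illustrative assumptions.

```python
import random
from collections import Counter

LABELS = ("conventional", "cover_crop")  # hypothetical farming practices
N_TAXA = 6                               # assumed number of taxa in the table
INFORMATIVE = (0, 1)                     # assumed: only these taxa respond to practice


def make_samples(n_per_class, rng):
    """Build synthetic abundance rows. This is NOT real sequencing data."""
    rows, labels = [], []
    for label in LABELS:
        for _ in range(n_per_class):
            row = [rng.uniform(0, 40) for _ in range(N_TAXA)]  # noise taxa
            for t in INFORMATIVE:
                # Scarce under conventional practice, abundant under cover crops.
                row[t] = rng.uniform(0, 10) if label == "conventional" else rng.uniform(30, 40)
            rows.append(row)
            labels.append(label)
    return rows, labels


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0


def fit_stump(rows, labels, features):
    """Pick the one-threshold split with the lowest weighted Gini impurity."""
    majority = Counter(labels).most_common(1)[0][0]
    best = (float("inf"), None, None, majority, majority)
    for f in features:
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= thr]
            right = [y for r, y in zip(rows, labels) if r[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best[0]:
                best = (score, f, thr,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    return best[1:]  # (taxon index, threshold, left label, right label)


def fit_forest(rows, labels, n_trees, n_features, rng):
    """Train each tree on a bootstrap sample and a random subset of taxa."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        feats = sorted(rng.sample(range(N_TAXA), n_features))
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Classify one sample by majority vote across all trees."""
    votes = Counter()
    for f, thr, left, right in forest:
        votes[left if f is None or row[f] <= thr else right] += 1
    return votes.most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
train_rows, train_labels = make_samples(20, rng)
test_rows, test_labels = make_samples(20, rng)
forest = fit_forest(train_rows, train_labels, n_trees=51, n_features=3, rng=rng)

predictions = [predict(forest, r) for r in test_rows]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
taxon_usage = Counter(f for f, *_ in forest if f is not None)

print(f"held-out accuracy: {accuracy:.2f}")
print("splits per taxon:", dict(sorted(taxon_usage.items())))
```

The count of splits per taxon is a crude stand-in for the feature-importance scores that full random forest libraries report. In this toy setup it shows how a model can point to the taxa that distinguish one practice from another. Real studies involve far more taxa, deeper trees, and careful validation.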

These are baby steps, but they demonstrate the potential of machine learning and AI in microbial data analysis. I had the pleasure of sharing them with 1,500 of my peers from around the world at the 2024 International Symposium on Microbial Ecology in Cape Town, South Africa.

We learned from each other and explored the boundaries of what we know about the microbes in our world. I enjoyed sharing research ideas, results, and achievements. Being at the forefront of microbial ecology as our understanding grows by leaps and bounds is exciting. 

I’m pleased to be sharing these new approaches. With machine learning, we have a new means of discovering how agriculture can sustainably support food security and protect clean fresh water.