July 15, 2022
CORVALLIS, Ore. — Oregon State University researchers have developed a computer model that can predict whether a pesticide will be harmful to honey bees.
The project involved training a machine learning model to predict whether any new proposed herbicide, fungicide or insecticide would be toxic to bees based on the compound's molecular structure.
"This research could be useful to other academics, researchers or companies exploring potential future pesticides," said Cory Simon, assistant professor of chemical engineering at Oregon State University.
With support from Simon and his co-researcher, Xiaoli Fern, associate professor of computer science at OSU, graduate students Ping Yang and Adrian Henle wrote the computer code and "trained" the machine learning algorithms.
For this study, the researchers used an existing dataset containing 382 pesticide molecules for which honey bee toxicity outcomes were already known.
The researchers split this dataset in two. They used 80% of the molecules as a training set, showing the machine learning algorithm examples so it could learn to recognize patterns of toxicity.
The researchers used the remaining 20% to test how well the trained algorithm worked. For this test set, they compared the algorithm's toxicity predictions against the toxicity outcomes that were already known.
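For readers who want to see what such a split looks like in practice, here is a minimal Python sketch. It is an illustration only, not the researchers' code: the function names `split_dataset` and `accuracy`, the fixed random seed and the use of integer IDs in place of real molecules are all assumptions made for this example.

```python
import random


def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle the items and split them into a training set and a test set.

    Hypothetical helper: the study's own splitting procedure may differ.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def accuracy(predicted, actual):
    """Fraction of predictions that match the known outcomes."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Placeholder IDs standing in for the 382 molecules in the dataset.
molecule_ids = list(range(382))
train_ids, test_ids = split_dataset(molecule_ids)
```

The key point is that the test molecules are never shown to the algorithm during training, so the accuracy measured on them reflects how the model handles compounds it has not seen before.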
The algorithm works by looking for common patterns or sequences that signal toxicity to bees using a mathematical concept called a "random walk."
Simon describes a random walk like this: Imagine a tiny ant is taking a random stroll along a pesticide molecule's chemical structure, making its way from atom to atom along the bonds that hold the compound together. If that ant were then to visit another molecule and make another random journey, would it see a similar sequence of atoms and bonds? If so, then the two molecules are similar. And if one of those molecules is toxic to bees, the similar molecule also has a higher statistical likelihood of being toxic to bees.
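The ant analogy can be made concrete with a small Python sketch. Rather than sampling walks at random, this version simply lists every walk of a fixed length on each molecule and counts how many pairs of walks show the same sequence of atoms and bonds. That is one common way to turn walks into a similarity score, and it may not match the study's exact formulation. The molecule representation (a list of atom symbols plus a list of bonds), the function names and the three toy molecules are assumptions made for this example, and hydrogen atoms are left out for brevity.

```python
import math
from collections import Counter


def walk_label_counts(atoms, bonds, length):
    """Count the atom/bond label sequences seen over all walks of `length` bonds."""
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b, order in bonds:
        neighbors[a].append((b, order))
        neighbors[b].append((a, order))

    counts = Counter()

    def extend(node, labels, steps_left):
        if steps_left == 0:
            counts[tuple(labels)] += 1
            return
        for nxt, order in neighbors[node]:
            extend(nxt, labels + [order, atoms[nxt]], steps_left - 1)

    for start in range(len(atoms)):
        extend(start, [atoms[start]], length)
    return counts


def walk_kernel(mol_a, mol_b, length):
    """Number of walk pairs (one per molecule) with identical label sequences."""
    counts_a = walk_label_counts(*mol_a, length)
    counts_b = walk_label_counts(*mol_b, length)
    return sum(n * counts_b[seq] for seq, n in counts_a.items())


def walk_similarity(mol_a, mol_b, length):
    """Kernel value rescaled so that a molecule compared with itself scores 1.0."""
    norm = math.sqrt(walk_kernel(mol_a, mol_a, length) * walk_kernel(mol_b, mol_b, length))
    return walk_kernel(mol_a, mol_b, length) / norm if norm else 0.0


# Toy heavy-atom skeletons: (atom symbols, [(atom index, atom index, bond order)]).
chloromethane = (["C", "Cl"], [(0, 1, 1)])
chloroethane = (["C", "C", "Cl"], [(0, 1, 1), (1, 2, 1)])
methanol = (["C", "O"], [(0, 1, 1)])
```

In this toy setup, chloromethane and chloroethane share walks that pass through a carbon-chlorine bond, so they score as more similar to each other than either does to methanol.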
This method is the first of two ways the OSU study represents pesticide molecules. The second uses molecular access system, or MACCS, fingerprints: structural fingerprints used to measure molecular similarity.
This portion of the research involves looking for specific, pre-defined substructures and patterns in the molecules that are known to be predictive of the activity of drug molecules. Researchers look for specific things, said Simon: "Is there a ring? Is there a chlorine atom? How about an amine group?" This, he said, gives an "interpretable" machine learning model.
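Simon's three questions can be turned into a tiny yes/no "fingerprint" in a few lines of Python. The sketch below is a simplified stand-in, not the real MACCS key set, which uses a much longer predefined list of substructures. The molecule representation, the function names and the amine check (any nitrogen with only single bonds, which is deliberately crude) are assumptions made for this example, and hydrogen atoms are omitted.

```python
def has_ring(n_atoms, bonds):
    """Return True if the bond graph contains a cycle (union-find check)."""
    parent = list(range(n_atoms))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b, _order in bonds:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return True  # this bond closes a loop
        parent[root_a] = root_b
    return False


def has_amine_like_nitrogen(atoms, bonds):
    """Crude stand-in for an amine: a nitrogen whose bonds are all single."""
    for index, symbol in enumerate(atoms):
        if symbol != "N":
            continue
        orders = [order for a, b, order in bonds if index in (a, b)]
        if all(order == 1 for order in orders):
            return True
    return False


def toy_fingerprint(atoms, bonds):
    """Bits answering: is there a ring? a chlorine atom? an amine-like group?"""
    return [
        int(has_ring(len(atoms), bonds)),
        int("Cl" in atoms),
        int(has_amine_like_nitrogen(atoms, bonds)),
    ]


# Toy heavy-atom skeletons: (atom symbols, [(atom index, atom index, bond order)]).
chlorobenzene = (
    ["C", "C", "C", "C", "C", "C", "Cl"],
    [(0, 1, 2), (1, 2, 1), (2, 3, 2), (3, 4, 1), (4, 5, 2), (5, 0, 1), (0, 6, 1)],
)
methylamine = (["C", "N"], [(0, 1, 1)])

fp_chlorobenzene = toy_fingerprint(*chlorobenzene)
fp_methylamine = toy_fingerprint(*methylamine)
```

Because each position in the fingerprint answers a named question, a model built on such features can point to which substructures drove a prediction, which is the sense of "interpretable" Simon describes.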
Simon said researchers can use OSU's algorithm, in conjunction with other molecular models, to predict which pesticides are likely to be toxic to bees and therefore merit further study.
The scientist said he hopes OSU's work "can help (researchers) design effective pesticides that aren't toxic to bees."
More broadly, Simon said the use of data-driven molecular modeling and machine learning is "really burgeoning" and may play a larger role in pesticide development in the future.
"It's a really growing field," said Simon.