Predicting Metabolic Potential in Bacteria From Limited Genome Data

The Science

How bacteria eat food, and what kinds of products they can make from that food, is dictated by the metabolic network of enzyme patterns encoded in their genomes. Using computational methods to learn these patterns across a large number of known bacteria allows the genome of a new bacterium to be analyzed. This reveals what kind of metabolism it is capable of, even when only partial information is available, which is common in environmental samples.

The Impact

This new method enables the discovery of new metabolic capabilities in bacteria that are important for the environment and for bioenergy applications. It is particularly relevant to understanding microbiomes (communities of bacteria and other microorganisms) that support plant growth for improved crop yields. In addition, a better understanding of different metabolic networks will allow new ways of engineering bacteria for other bioenergy and biomedical applications.

Summary

The method uses a deep learning approach to learn patterns of proteins present in metabolic pathways from a large collection of annotated bacterial genomes. A significant advantage of this tool is that it was designed for and tested on incomplete genomic data. This allows bacterial genomes recovered from complex microbiomes in soil or other sources, which are often incomplete, to be identified and assessed for metabolic potential.
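The idea of training on incomplete genomes can be illustrated with a small, self-contained sketch. This is not the authors' code or model: it uses synthetic data, a single-layer logistic classifier as a stand-in for the deep learning approach, and hypothetical names throughout (`make_genome`, `drop_annotations`, `PATHWAY`, and the 20-protein toy genome are all assumptions made for illustration). Each genome is a presence/absence vector of protein annotations, and incompleteness is simulated by randomly removing annotations before training and testing.

```python
import math
import random


def make_genome(rng, n_proteins, pathway_proteins, has_pathway):
    """Build a toy presence/absence vector of protein annotations."""
    # Background proteins appear at random, independent of the pathway.
    genome = [1 if rng.random() < 0.3 else 0 for _ in range(n_proteins)]
    for p in pathway_proteins:
        # Pathway proteins are all present in pathway-positive genomes,
        # and only occasionally present otherwise.
        genome[p] = 1 if has_pathway or rng.random() < 0.1 else 0
    return genome


def drop_annotations(rng, genome, completeness):
    """Simulate an incomplete genome: keep each annotation with
    probability `completeness`, otherwise mark it absent."""
    return [g if rng.random() < completeness else 0 for g in genome]


def predict(weights, bias, genome):
    """Probability that the genome encodes the pathway."""
    z = bias + sum(w * g for w, g in zip(weights, genome))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, n_proteins, epochs=100, lr=0.1):
    """Fit a logistic classifier with plain stochastic gradient descent."""
    weights = [0.0] * n_proteins
    bias = 0.0
    for _ in range(epochs):
        for genome, label in examples:
            err = predict(weights, bias, genome) - label
            bias -= lr * err
            for i, g in enumerate(genome):
                if g:
                    weights[i] -= lr * err
    return weights, bias


def sample(rng, completeness):
    """Draw one labeled, partially complete toy genome."""
    label = rng.randint(0, 1)
    genome = make_genome(rng, N_PROTEINS, PATHWAY, bool(label))
    return drop_annotations(rng, genome, completeness), label


N_PROTEINS = 20
PATHWAY = [0, 1, 2, 3, 4]  # hypothetical proteins forming one pathway

rng = random.Random(0)
# Train on genomes that are 50-100% complete; test on 70%-complete genomes.
train_set = [sample(rng, rng.uniform(0.5, 1.0)) for _ in range(300)]
test_set = [sample(rng, 0.7) for _ in range(200)]

weights, bias = train(train_set, N_PROTEINS)
correct = sum((predict(weights, bias, g) >= 0.5) == bool(y) for g, y in test_set)
accuracy = correct / len(test_set)
print(f"accuracy on 70%-complete toy genomes: {accuracy:.2f}")
```

Because the classifier sees partially masked genomes during training, it learns to call a pathway present from a subset of its proteins rather than requiring every one, which is the property the summary highlights for incomplete environmental samples.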

This project was initiated by David Geller-McGrath as part of his graduate thesis project at the Woods Hole Oceanographic Institution under Dr. Edgcomb. He refined the approaches and developed the code during his time as an Office of Science Graduate Student Research (SCGSR) fellow in computational biosciences, working with Dr. McDermott at Pacific Northwest National Laboratory and with Dr. Wheeler at the University of Arizona.

PNNL Contact

Jason McDermott, Pacific Northwest National Laboratory, Jason.McDermott@pnnl.gov

Funding

Funding was provided by the Department of Energy (DOE) SCGSR Fellowship, 2020 Solicitation 2, in Computational Biology and Bioinformatics; the National Institutes of Health, National Institute of General Medical Sciences (R01GM132600); and the DOE Biological and Environmental Research (BER) program through the “Machine-Learning Approaches for Integrating Multi-Omics Data to Expand Microbiome Annotation” project.