In response to the growing global threat of bioterrorism, Stottler Henke is exploring, for the Defense Advanced Research Projects Agency (DARPA), how machine learning can address the need for biosurveillance technology capable of assessing the pathogenic potential of novel bacteria. We combine supervised and unsupervised learning algorithms to analyze data from assays designed to expose bacterial phenotype. In domains where labeled data is scarce, this semi-supervised approach can outperform purely supervised techniques because it learns from both labeled and unlabeled data.
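To make the semi-supervised idea concrete, the following is a minimal sketch of self-training, one common way to learn from labeled and unlabeled data together. It is an illustration only and not the method used in this project: the nearest-centroid classifier, the function names, the `min_margin` threshold, the "benign" and "pathogenic" labels, and the two-dimensional feature values are all assumptions invented for the example and do not come from any assay.

```python
import math


def centroid(points):
    """Mean of a non-empty list of equal-length feature vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def fit_centroids(X, y):
    """Toy nearest-centroid classifier: one mean vector per class label."""
    by_label = {}
    for features, label in zip(X, y):
        by_label.setdefault(label, []).append(features)
    return {label: centroid(points) for label, points in by_label.items()}


def predict_with_margin(centroids, features):
    """Return (closest label, margin). The margin is the gap between the
    distances to the second-closest and closest centroids, used here as a
    crude confidence score. Assumes at least two classes."""
    ranked = sorted(
        (math.dist(features, center), label)
        for label, center in centroids.items()
    )
    (nearest, label), (runner_up, _) = ranked[0], ranked[1]
    return label, runner_up - nearest


def self_train(X_lab, y_lab, X_unlab, min_margin=1.0, max_rounds=10):
    """Self-training: fit on the labeled set, pseudo-label the unlabeled
    samples the model is confident about, add them, and refit."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        centroids = fit_centroids(X_lab, y_lab)
        remaining = []
        for features in pool:
            label, margin = predict_with_margin(centroids, features)
            if margin >= min_margin:
                X_lab.append(features)
                y_lab.append(label)
            else:
                remaining.append(features)  # too ambiguous to pseudo-label
        if len(remaining) == len(pool):
            break  # no new pseudo-labels this round, so stop early
        pool = remaining
    return fit_centroids(X_lab, y_lab)


# Hypothetical toy data: two labeled samples and five unlabeled ones.
X_labeled = [[0.0, 0.0], [4.0, 4.0]]
y_labeled = ["benign", "pathogenic"]
X_unlabeled = [[0.5, 0.2], [0.3, 0.6], [3.8, 4.1], [4.4, 3.7], [2.1, 1.9]]

model = self_train(X_labeled, y_labeled, X_unlabeled)
prediction, _ = predict_with_margin(model, [0.4, 0.4])
print(prediction)
```

In this sketch the four unlabeled points that sit close to a labeled example are absorbed into the training set and shift the class centroids, while the ambiguous point near the midpoint is never pseudo-labeled. A real system would replace the toy classifier and the margin rule with a calibrated model and a validated confidence criterion.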
Advances in machine learning technology can help to bridge the gap between our understanding and the real world. The problems that must be addressed in developing biosurveillance technology have proven formidable, and many of them are active areas of study within microbiology and infectious disease research. Machine learning has found success in modeling phenomena that are not yet well understood, particularly in computer vision and natural language processing. The difficulty and importance of the biosurveillance and biodefense problem underscore the need to address it as soon as possible. Thus, a data-driven approach to these complex biosurveillance challenges may prove beneficial.
Disclaimer: This material is based upon work supported by DARPA under Contract No. 140D6319C0030. The views, opinions, and/or findings expressed are those of the authors and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government. This material is approved for public release, distribution unlimited.