Translating biomedical research data to knowledge through bioinformatics


In this dissertation, we present the application of biomedical informatics, including data mining and knowledge discovery, to extract knowledge from genomics, transcriptomics, and clinical data. To improve our understanding of biomedical problems in infectious disease and addiction research, we applied next-generation sequencing (NGS) techniques to generate both transcriptomics and genomics data. First, by investigating transcriptomics data from rats in environmentally enriched and isolated conditions, we concluded that the pathways of retinoic acid receptor activation, eukaryotic initiation factor 2 signaling, and protein ubiquitination play significant roles in addictive behavior, which directs our focus toward individual differences in susceptibility to addiction. Second, by mining Syrian golden hamster transcriptomics data during visceral leishmaniasis, we discovered that splenic macrophages underwent mixed classical and alternative polarization/activation and that whole spleen tissue underwent a massive inflammatory response. Based on these findings, we proposed several mechanisms to explain the pathogenesis of Leishmania donovani. Additionally, we investigated genomics data from several Venezuelan equine encephalitis virus (VEEV) mutants and showed that the replication fidelity of TC-83 can be increased by incorporating point mutations in the RNA-dependent RNA polymerase region. These findings should accelerate the development of a new live attenuated vaccine for VEEV.

In addition to NGS data mining and knowledge discovery, we developed a novel ensemble data analysis method to improve the predictive ability of the classic bagging and AdaBoost methods. By evaluating forty-one online datasets, we demonstrated that our ensemble method increases predictive accuracy, which could be particularly useful for identifying novel diagnostic biomarker panels. Furthermore, we assessed two intervention strategies for schistosomiasis using meta-analysis and demonstrated that implementing the new integrated strategy reduces the infection risk approximately 3- to 4-fold compared to the conventional strategy. This approach can be applied to evaluate any new prevention, diagnosis, or treatment strategy.



Bioinformatics, NGS, Addiction, Visceral leishmaniasis, VEEV, Ensemble machine learning, Meta-analysis, Schistosomiasis