Pawan Kumar Raghav
Indraprastha Institute of Information Technology, India
Title: Machine learning based identification of stem cell genes involved in stemness
Abstract
Stem cells are used to study aging, embryonic development, and diseases including cancer, diabetes, and neurodegenerative disorders. They possess two essential properties: self-renewal and differentiation. Stemness is defined as the potential of a cell to self-renew and differentiate. Characterizing the genes that regulate these properties is fundamental to understanding stemness and its underlying mechanisms, which can in turn be used for therapy. Extensive data are available from stem cell studies, yet no model for predicting the stemness of genes is currently available. We therefore collected experimental data from molecular profiling assays of pluripotent stem cells. Analysis of the gene expression data revealed novel, stem cell-specific stemness genes. We then used machine learning to predict the stemness of genes and to establish a reference for the pluripotent state. Specifically, we developed models based on the random forest, support vector machine, and artificial neural network methods, trained on the expression of marker genes identified as regulators of stemness. The models were validated using stratified 5-fold cross-validation (CV), in which each fold corresponds to an 80:20 split between training and test sets. The resulting classification model assigns genes to pluripotent and non-pluripotent classes, and the machine learning methods were compared on accuracy, sensitivity, and specificity. The present work thus assesses how accurately the method classifies pluripotent genes as a measure of stemness. Using automated classifiers based on the random forest algorithm, we are able to identify the hotspot pluripotent genes responsible for stemness.
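To make the validation scheme concrete, the following is a minimal sketch of how a stratified 5-fold split yields an 80:20 training-to-test ratio in each fold, and how accuracy, sensitivity, and specificity are computed from binary predictions. It is not the authors' pipeline: the label counts (30 pluripotent, 70 non-pluripotent genes), the function names, and the example predictions are all hypothetical, and no classifier is trained here.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_splits=5, seed=0):
    """Assign sample indices to n_splits folds while preserving class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a near-equal share.
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    return folds


def binary_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity); assumes both classes occur."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)  # fraction of pluripotent genes recovered
    specificity = tn / (tn + fp)  # fraction of non-pluripotent genes rejected
    return accuracy, sensitivity, specificity


# Hypothetical labels: 1 = pluripotent, 0 = non-pluripotent.
labels = [1] * 30 + [0] * 70
folds = stratified_folds(labels)

for k, test_idx in enumerate(folds):
    # The training set for fold k is every index outside the held-out fold.
    train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
    positives = sum(labels[i] for i in test_idx)
    print(f"fold {k}: train={len(train_idx)} test={len(test_idx)} "
          f"pluripotent in test={positives}")

# Hypothetical predictions for one held-out fold, scored with the three metrics.
acc, sens, spec = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                                 [1, 1, 0, 0, 0, 0, 0, 1])
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

In practice each classifier (random forest, support vector machine, or neural network) would be fitted on the training indices of a fold and scored on the held-out indices, with the three metrics averaged over the five folds.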