Identification of Risk Factors for COVID-19-related Death using Machine Learning Methods

Document Type: Original Article

Authors

1 Social Determinants of Health Research Center, Hamadan University of Medical Sciences, Hamadan, Iran

2 Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran

3 Health Science Research Center, Hamadan University of Medical Sciences, Hamadan, Iran

4 Hamadan University of Medical Sciences, Hamadan, Iran

5 Department of Infectious Disease, School of Medicine, Hamadan University of Medical Sciences, Hamadan, Iran

6 Department of Molecular Medicine and Genetics, Faculty of Medicine, Hamadan University of Medical Sciences, Hamadan, Iran

7 Modeling of Non-Communicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran

Abstract

Background: Cases of pneumonia of unknown cause appeared in late 2019 in Wuhan, China. Following the worldwide spread of the disease, the World Health Organization declared it a pandemic on March 11, 2020. As of December 16, 2020, more than 74 million people worldwide had been infected, of whom more than 1.6 million had died from Coronavirus Disease 2019 (COVID-19). This study aimed to identify the risk factors for COVID-19 mortality in Hamadan, western Iran.
Materials and Methods: This cross-sectional study used data from all patients with COVID-19 admitted to Shahid Beheshti and Sina hospitals in Hamadan from January 2020 to November 2020. Logistic regression, decision tree, and random forest models were used to assess risk factors for death due to COVID-19.
Results: The study included 1853 patients with COVID-19. Blood urea nitrogen change, SpO2 at admission, duration of hospitalization, age, neutrophil count, lymphocyte count, respiratory rate, complete blood count, systolic blood pressure, hemoglobin, and sodium were effective predictors in both the decision tree and the random forest models.
Conclusion: The risk factors identified in the present study may serve as surrogate indicators of the risk of death due to COVID-19. Based on sensitivity, random forest was the most suitable of the models considered for predicting COVID-19-related mortality.
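
Illustrative note: the comparison of models by sensitivity mentioned in the conclusion can be sketched as follows. This is a minimal sketch, not the authors' analysis code. It assumes binary outcomes coded 1 = died and 0 = survived; the function names, the model labels, and the toy predictions are hypothetical and invented purely for illustration. They are not data or results from the study.

```python
from typing import Dict, List, Sequence, Tuple


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of actual deaths (label 1) that the model also predicted as deaths."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive cases")
    return tp / (tp + fn)


def rank_by_sensitivity(
    y_true: Sequence[int], predictions: Dict[str, List[int]]
) -> List[Tuple[str, float]]:
    """Return (model name, sensitivity) pairs, highest sensitivity first."""
    scores = {name: sensitivity(y_true, y_pred) for name, y_pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy outcomes and predictions, invented for illustration only
# (1 = died, 0 = survived). These are NOT values from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_predictions = {
    "logistic_regression": [1, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    "decision_tree":       [1, 1, 0, 1, 0, 1, 0, 0, 0, 0],
    "random_forest":       [1, 1, 1, 1, 0, 0, 1, 0, 0, 0],
}

ranking = rank_by_sensitivity(y_true, toy_predictions)
for name, score in ranking:
    print(f"{name}: sensitivity = {score:.2f}")
```

Sensitivity counts only missed deaths (false negatives), so a model can rank first here while still producing false alarms. In practice it would usually be reported alongside specificity or a similar measure.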

Keywords