Machine learning for predicting stroke risk in heavy smokers - NHANES Study

SHI Xuesong, LIU Ya

China Digital Medicine ›› 2025, Vol. 20 ›› Issue (3) : 26-35.

DOI: 10.3969/j.issn.1673-7571.2025.03.004


Abstract

Objective: To assess stroke risk in heavy smokers and provide a reference for optimizing public health prevention strategies. Methods: Using NHANES data from 2017 to 2020, features were selected by Lasso regression, and models were then built with seven machine learning algorithms, including Random Forest, XGBoost, LightGBM, and a stacking ensemble. Model performance was evaluated using 10-fold cross-validation, and decision curve analysis (DCA) and clinical impact curve (CIC) analysis were conducted. SHAP values were used to enhance model interpretability. Results: The stacking model performed best on the test set, with an AUC of 0.7645, distinguishing between individuals at high and low risk of stroke. The training-set AUC was 0.7533, indicating stable model performance during training. DCA and CIC analyses showed that the model provided significant net benefit at multiple clinical decision thresholds. SHAP analysis identified history of heart disease and hepatitis B vaccination among the key variables contributing to the predictions. Conclusion: Machine learning can effectively predict stroke risk in heavy smokers, providing a scientific basis for personalized prevention strategies. The study demonstrates the potential of data-driven models in disease prevention.
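
The abstract does not spell out the net benefit that DCA reports. As a minimal sketch of the standard calculation (not the authors' code), net benefit at a threshold probability p_t is TP/n - (FP/n) * p_t / (1 - p_t), compared against a "treat all" strategy. The function names and the toy data below are hypothetical and are not drawn from NHANES.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of intervening on everyone whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are penalized by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1.0 - threshold)


def treat_all_net_benefit(y_true: Sequence[int], threshold: float) -> float:
    """Net benefit of the reference strategy that intervenes on every individual."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy outcomes and predicted risks, invented for illustration only.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_prob = [0.80, 0.10, 0.30, 0.60, 0.05, 0.20, 0.40, 0.15, 0.10, 0.25]

for pt in (0.2, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = treat_all_net_benefit(y_true, pt)
    print(f"threshold={pt:.1f}  model={model_nb:.3f}  treat_all={all_nb:.3f}")
```

A decision curve plots these values over a range of thresholds; a model is useful at a given threshold when its net benefit exceeds both the "treat all" value and zero (the "treat none" strategy).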

Key words

Machine learning / Heavy smoking / Stroke / Disease risk assessment

Cite this article

SHI Xuesong, LIU Ya. Machine learning for predicting stroke risk in heavy smokers - NHANES Study[J]. China Digital Medicine, 2025, 20(3): 26-35. https://doi.org/10.3969/j.issn.1673-7571.2025.03.004
