Comparative Analysis of Diabetic Prediction Using Machine Learning Algorithms
DOI:
https://doi.org/10.53555/jaz.v45iS4.4308
Keywords:
Diabetes mellitus, Logistic regression, Machine learning, Support vector machine, WEKA
Abstract
Diabetes mellitus (DM) is a severe worldwide health problem, and its prevalence is growing quickly. It is a spectrum of metabolic disorders defined by persistently elevated blood glucose levels. Undiagnosed diabetes can lead to a variety of complications, including retinopathy, nephropathy, neuropathy, and other vascular abnormalities. In this context, machine learning (ML) technologies may be particularly useful for early disease identification, diagnosis, and therapy monitoring. The core aim of this study is to identify the strongest ML algorithm for predicting diabetes. To this end, six ML algorithms were chosen: support vector machine (SVM), Naïve Bayes (NB), K-nearest neighbor (KNN), random forest (RF), logistic regression (LR), and decision tree (DT). Two datasets, the Pima Indian diabetic (PID) dataset and the Germany diabetes dataset, were used, and the experiments were implemented with the Waikato Environment for Knowledge Analysis (WEKA) 3.8.6 tool. The study reports the performance metrics and error rates of the classifiers for both datasets. The results showed that for the PID database (PIDD), SVM performed best with an accuracy of 74%, whereas for the Germany dataset RF and KNN performed best with 98.7% accuracy. This study can help healthcare facilities and researchers understand the value and application of ML algorithms in predicting diabetes at an early stage.
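As a rough illustration of the kind of performance metrics and error rates compared in such a study, the sketch below computes accuracy, error rate, precision, recall, F1, and probability-based MAE and RMSE for a binary classifier. It is not the authors' WEKA workflow: the function names and the toy labels are hypothetical, and WEKA's own definitions of its error measures may differ in detail.

```python
from math import sqrt

def classification_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = diabetic, 0 = not)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def probability_errors(y_true, y_prob):
    """MAE and RMSE between 0/1 labels and predicted class-1 probabilities."""
    diffs = [p - t for t, p in zip(y_true, y_prob)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse

# Toy data invented for illustration only; not taken from either dataset.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]

metrics = classification_metrics(y_true, y_pred)
mae, rmse = probability_errors(y_true, y_prob)
print(metrics)
print(f"MAE={mae:.3f} RMSE={rmse:.3f}")
```

Computing the same set of metrics for each classifier on each dataset gives a like-for-like basis for the kind of comparison the abstract describes.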
License
Copyright (c) 2024 Ms. Madhuvanthi B, Dr. Baskaran T S

This work is licensed under a Creative Commons Attribution 4.0 International License.
