Research Article - (2025) Volume 13, Issue 1
Ensemble learning has drawn appreciable research attention owing to its exceptional generalization performance. In the field of rainfall prediction, many researchers have adopted various machine learning techniques based on varying meteorological parameters. This paper contributes by employing ensemble techniques for rainfall prediction based on seven climatic features from the Ghana Meteorological Agency, covering 1980-2019 for 22 synoptic stations. The experiments involved 6 base algorithms (logistic regression, decision tree, random forest, extreme gradient boosting, multilayer perceptron and K-nearest neighbor), 3 meta algorithms (voting, stacking and bagging) and the ensemble learning. More specifically, the ensemble approach in this study focused on the combination of the base classifiers and the vote meta classifier. The performance of the models was evaluated based on correlation coefficient, mean absolute error and root mean squared error. The models were also compared on the time taken to build them and to test them on the supplied test set. Generally, findings from the study showed that the ensemble learning outperformed both base and meta algorithms.
Keywords: Base algorithms • Meta algorithms • Ensemble learning • Rainfall prediction
Introduction
Artificial Intelligence (AI) has received much attention in this age of global digitalization, yielding varying techniques and applications that have been adapted in various aspects of our lives. Popular among these techniques is machine learning, which is being employed in many sectors. Machine learning has been extensively used in areas such as cyber security for intrusion detection, which remains a critical security issue [1]. In the medical field, problems such as the early detection of breast cancer, the early diagnosis of Alzheimer's disease, which largely affects the elderly, and the prediction of cardiovascular diseases have all received considerable attention from machine learning techniques.
Generally, the performance of these machine learning techniques hinges on continual learning, which is technically grouped under supervised learning and unsupervised learning [2]. Climate change studies analyze the nature of the weather over a region for a definite period of time [3]. One of the significant climate change burdens is the prediction of rainfall over a specific region by employing meteorological features such as humidity, temperature, wind speed and sunshine. Beyond its significance in aiding proper management of water resources, accurate and prompt rainfall prediction matters largely because of its severe impact on sensitive sectors of the economy such as agriculture and energy, and because of floods that go on to destroy critical installations and livelihoods [4]. For instance, half a million people were severely affected by rainstorms across four African countries, with about 500,000 people losing their lives due to flood events between 2007 and 2009. Also, in view of the rainfall variability affecting agriculture within Sub-Saharan Africa, it has been reported that malnutrition that led to the death of about 50,000 children could worsen. In relation to time, weather forecasting can be short-range, medium-range or long-range, covering durations of 48 hours, three to seven days and beyond seven days respectively. Accomplishing a successful seasonal forecast requires meticulous insight into the ocean-atmosphere interactions. Furthermore, understanding the influence of these interactions on seasonal rainfall prediction depends on the timescales (monthly, bimonthly and seasonal) that the ocean-atmosphere interactions span [5]. In spite of the use of state-of-the-art technology, rainfall predictions remain constrained to some degree by the chaotic nature of the weather.
Journal of Climatology & Weather Forecasting, 2025, Vol.13, Issue 1, 1-10
Among various approaches to accurately predicting rainfall, machine learning techniques have received much attention from researchers. These include, but are not limited to, decision tree, random forest, K-nearest neighbour, support vector machine and neural networks. Variations in the performance of these algorithms suggest the adoption of an ensemble model, which is achieved by combining different models. The basic convention is to scale the various classifiers and merge them to yield an improved classification which surpasses the performance of the individual classifiers. It has been established that ensemble methods enhance prediction performance [6]. Over the past two decades, ensemble learning has been widely adopted due to the strength of its generalization. It has been well documented in many studies that multiple classifiers generally enhance generalization as compared to the individual classifiers. Fundamentally, two major steps are involved in building an ensemble scheme: (1) construction of various base classification models; (2) use of an efficient technique to merge them. Generally, the merging could be achieved by voting or by estimating weights. Although there is no affirmed theory that explains how diversity enhances ensemble model accuracy, it has been well established that the prerequisite for an outstanding ensemble performance is the diverse and accurate nature of the base classifiers [7]. Therefore, in this paper, the major focus is employing ensemble methods to enhance rainfall prediction. In summary, the significant contributions we put forward in this paper are as follows:
• First, we conduct an empirical study of six base algorithm models and three meta algorithm models to predict rainfall.
• Second, we propose utilizing meta-classifiers as the combining technique for the ensemble framework to predict rainfall.
• Third, we demonstrate the necessity of employing evaluation metrics such as Correlation Coefficient (CC), Mean Absolute Error (MAE) and Root Mean-Squared Error (RMSE) to evaluate ensemble models.
inverse approach and using non-linear approach. Findings from their study revealed that the proposed ensemble model gained a higher accuracy level in comparison to the individual models. In a case study of bay sedimentation, ensemble learning was employed to predict heavy metal contamination. The study compared extreme gradient boosting, random forest, artificial neural networks and support vector machines. In the performance evaluation based on the coefficient of determination, extreme gradient boosting obtained the highest value compared to the other classification algorithms. Validation of the models showed that extreme gradient boosting had the least reduction in the coefficient of determination, with a decrement of 7.99%. However, the decrements in R² values for random forest, support vector machines and artificial neural network were 10.26%, 36.19% and 8.31% respectively. Ensemble learning techniques have also been evaluated for the prediction of solar irradiance. The study first used base algorithms such as support vector machine, artificial neural network and decision tree. Based on five years of meteorological data, the study compared the performance of the base classifiers to two proposed ensemble learning techniques: boosting and bagging. Findings from the study showed that the boosting and bagging techniques performed well based on the root mean squared error and coefficient of determination values.
Materials and Methods
We used the following 6 methods as base classification algorithms. The selection of these algorithms to create the rainfall prediction models was based on criteria utilized. The base algorithms were implemented by using Scikit-learn.
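The workflow outlined above (build several base models, merge their outputs with a vote, then score the result with CC, MAE and RMSE) can be illustrated with a minimal sketch. It is written in plain Python rather than Scikit-learn so that it runs without dependencies, and it is not the study's implementation. The source does not state which voting rule was used, so the averaging in `vote_average` is an assumption, as are all function names and the small data set, which is purely hypothetical.

```python
import math
from statistics import mean


def correlation_coefficient(actual, predicted):
    """Pearson correlation between observed and predicted values."""
    mean_a, mean_p = mean(actual), mean(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_a * var_p)


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference."""
    return math.sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))


def vote_average(base_predictions):
    """Merge base-model outputs by averaging them per sample.

    Assumption: averaging is only one possible voting rule; the source
    does not say which rule its vote meta classifier applied.
    """
    return [mean(sample) for sample in zip(*base_predictions)]


# Hypothetical observations (1 = rain, 0 = no rain) and the outputs of
# three hypothetical base models for the same five days.
observed = [0, 1, 1, 0, 1]
base_predictions = [
    [0.2, 0.9, 0.6, 0.1, 0.7],
    [0.4, 0.7, 0.8, 0.3, 0.5],
    [0.0, 0.8, 0.4, 0.2, 0.9],
]

ensemble = vote_average(base_predictions)

print("CC:  ", correlation_coefficient(observed, ensemble))
print("MAE: ", mean_absolute_error(observed, ensemble))
print("RMSE:", root_mean_squared_error(observed, ensemble))
```

A higher CC together with lower MAE and RMSE indicates closer agreement with the observations, so the same three functions can score each base model and the merged prediction side by side.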
Received: 18-Jul-2024; Editor assigned: 20-Jul-2024; Reviewed: 03-Aug-2024; Revised: 19-Jan-2025, Manuscript No. JCWF-24-33027 (R); Published: 26-Jan-2025, DOI: 10.35248/2332-2594.25.13(1).001-010
Copyright: © 2025 Drah K, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.