Natural gas demand forecast model based on MF-DFA and BorutaShap

WEN Quan1, WANG Ning1, WEI Xuehua2

1. Business School, Wuhan Vocational College of Software and Engineering (Wuhan Open University); 2. Hubei Energy Group Co., Ltd.

Keywords: natural gas demand, random forest (RF), Sobol, BorutaShap, honey badger algorithm (HBA), Levy flight, XGBoost

DOI: 10.6047/j.issn.1000-8241.2025.01.011


[Objective] Natural gas demand is influenced by many factors. Effectively capturing local feature information from the monthly natural gas demand time series can improve the nonlinear fitting ability and prediction accuracy of demand forecast models. [Methods] First, multifractal detrended fluctuation analysis (MF-DFA) was used for a fractal analysis of the monthly natural gas demand time series. Second, quadratic interpolation and random forest (RF) interpolation were used to handle inconsistent time granularity and missing values in the feature sequences of the influencing factors. Third, an eXtreme Gradient Boosting (XGBoost) model was used to compare prediction errors for the original feature sequences before and after interpolation and for the new feature sequences screened by Boruta, SHAP (SHapley Additive exPlanations), and BorutaShap, in order to identify the best screening and dimension reduction method and further reduce the dimensionality and scale of the model input data. Finally, the Sobol low-discrepancy sequence, an improved density factor, and the Levy flight strategy were introduced into the honey badger algorithm (HBA) to make the initial population cover the search space more uniformly, expand the iterative search range, and strengthen the ability to escape local optima. These changes improve how well the improved HBA tunes the XGBoost parameters that determine fitting ability, such as the number of decision trees, tree depth, and learning rate. [Results] BorutaShap was the most effective method for feature sequence screening and dimension reduction. The proposed multi-strategy optimized HBA-XGBoost model achieved higher prediction accuracy than the other comparison models, with a mean absolute percentage error (MAPE) of 2.87%, a mean absolute error (MAE) of 9.3509, a root mean square error (RMSE) of 11.3353, and a coefficient of determination (R²) of 0.8909. [Conclusion] The method is suitable for forecasting natural gas demand under multiple influencing factors and can serve as a reference for planning and decision-making in the natural gas industry. (12 Figures, 2 Tables, 27 References)
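
The four accuracy measures reported in the results (MAPE, MAE, RMSE, and the coefficient of determination) can be sketched with their standard textbook definitions. The abstract does not give the formulas, so the function below is only an illustration under that assumption; the name `forecast_metrics` and the sample values are hypothetical and are not data from the paper.

```python
import math


def forecast_metrics(actual, predicted):
    """Return MAPE (in percent), MAE, RMSE and R^2 for two equal-length sequences.

    Standard definitions are assumed; the paper's exact formulas are not
    stated in the abstract.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    # MAPE divides by each actual value, so it assumes no actual value is zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # R^2 = 1 - (residual sum of squares) / (total sum of squares).
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot

    return {"MAPE": mape, "MAE": mae, "RMSE": rmse, "R2": r2}


# Hypothetical monthly demand values, for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 380.0]
metrics = forecast_metrics(actual, predicted)
```

MAPE is scale-free, which is why it is reported as a percentage, while MAE and RMSE carry the units of the demand series; RMSE weights large errors more heavily than MAE.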