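Xu Min, Xu Jingwei, Xie Zhiqing, Gao Ping, Li Yachun, Miao Jingqiu. 2020. Application of the random forest machine algorithm in forecasting of the diseased panicle rate of wheat scab in Jiangsu province [J]. Acta Meteorologica Sinica, (0):-, doi:10.11676/qxxb2020.007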
Application of the random forest machine algorithm in forecasting of the diseased panicle rate of wheat scab in Jiangsu province
Received: 2019-07-08  Revised: 2019-08-19
DOI:10.11676/qxxb2020.007
Keywords: Wheat scab, Random forest method, Forecast of the diseased panicle rate
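Funding: 2020 Special Project on Meteorological Forecasting of Domestic and Foreign Crop Yields; Jiangsu Meteorological Bureau Research Fund (KM201906); Key Project for the Integration of Key Meteorological Technologies of the China Meteorological Administration (CMAGJ2015Z02)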
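Author  Affiliation  E-mail
Xu Min  Jiangsu Climate Center  amin0506@163.com
Xu Jingwei  Nanjing University of Information Science and Technology  270500578@qq.com
Xie Zhiqing  Jiangsu Climate Center  175987468@qq.com
Gao Ping  Jiangsu Climate Center  571043860@qq.com
Li Yachun  Jiangsu Climate Center  295646190
Miao Jingqiu  Dongshan Meteorological Station, Wuzhong District, Suzhou  279878901@qq.com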
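Chinese abstract (translated):
      Identifying the dominant meteorological factors and antecedent factors that affect wheat scab, and building stage-wise prediction models of the diseased panicle rate, are important for improving scab forecasting and protecting the farmland ecological environment. Based on wheat scab diseased panicle rate data and growth-stage observations from 13 cities in Jiangsu Province for 2002 to 2018, together with daily meteorological data for the corresponding periods, this study applies the Random Forest (RF) machine learning algorithm to quantify, by growth stage and by region, the main meteorological feature variables affecting the diseased panicle rate and their contribution rates, and builds and validates prediction models for different forecast start times. The results show that the contribution rates of the important feature variables rank by growth stage as: heading and flowering stage > jointing stage > overwintering stage. Humidity, rainy days in spells of ≥3 consecutive days of precipitation, and sunshine during the heading and flowering stage play the dominant role in scab; sunshine, rainfall, humidity and rainy days during the jointing stage, and temperature and snowfall during the overwintering stage, all have antecedent effects on scab. The ranking of the identified important feature variables is consistent with the development, release, infection and epidemic patterns of the scab pathogen. The accuracy of the RF-based prediction models depends on the number of important feature variables, the region where scab occurs, the Mtry parameter setting, and the growth stage. Forecasts can be issued as early as the beginning of March, with a lead time of nearly 3 months; the closer the start time is to the milk-ripe stage and the more important feature variables are input, the higher the prediction accuracy. The fluctuation trends of the simulated and observed diseased panicle rates are fully consistent, and the simulation performs well for the "moderate" and "moderately severe" scab levels, indicating that the RF algorithm has high reliability and potential for operational application in scab forecasting.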
English abstract:
      Identifying the meteorological and antecedent factors that significantly affect wheat scab, and developing models that predict diseased panicle rates at different stages, are important for improving the ability to predict scab severity and for protecting the ecological environment of farmlands. Based on observations of diseased panicle rate, winter wheat phenology, and daily meteorology in 13 cities of Jiangsu Province, China, from 2002 to 2018, the dominant meteorological variables determining diseased panicle rates were identified, and the contributions of these individual variables were assessed for different phenological stages in various regions. Models with different starting times for predicting diseased panicle rates were then developed using the random forest (RF) regression algorithm, and their reliability was validated against observed diseased panicle rates. Factors during the heading and flowering stage contribute most to the final diseased panicle rate, followed by those of the jointing stage and the overwintering period. The dominant determinants of the final diseased panicle rate are relative humidity, rainy days in spells of at least 3 consecutive days of precipitation, and sunshine during the heading and flowering stage. Sunshine duration, precipitation, relative humidity and rainy days during the jointing stage have significant influences on the final diseased panicle rate, and temperature and snowfall during the overwintering period have a large lagged impact. The identified relative importance of key variables in each growth period is consistent with the theory on the development, release, infection, and epidemic of scab. The accuracy of the RF-based models varies with the number of important characteristic variables, the region, the value of the parameter Mtry, and the growth period. The earliest time at which the models can output a usable prediction of the diseased panicle rate is the beginning of March, and the longest forecast lead time is about 3 months. As the starting time approaches the milk-ripe stage and more important characteristic variables are used as inputs, the accuracy of the models increases and the discrepancy between predicted and observed diseased panicle rates decreases significantly. The models have better skill in predicting the medium and serious categories of scab. This study indicates that the RF algorithm is able to provide reliable prediction of scab and has large application potential.
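The workflow summarized in the abstract (a random forest whose splits consider Mtry candidate variables, a ranking of variable importance, and forecasts issued at successive start times with a growing set of inputs) can be illustrated with a minimal, self-contained sketch. This is not the authors' code or data. The feature names, the synthetic response, the tree depth, the number of trees, and the use of permutation importance as the importance measure are all assumptions made for illustration only.

```python
import random
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def grow_tree(X, y, mtry, rng, depth=0, max_depth=4, min_leaf=2):
    """Grow a regression tree; each split tries only `mtry` random features."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return mean(y)
    best = None
    for j in rng.sample(range(len(X[0])), mtry):
        for t in sorted({row[j] for row in X})[:-1]:
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, j, t, left, right)
    if best is None:
        return mean(y)
    _, j, t, left, right = best
    kids = [grow_tree([X[i] for i in part], [y[i] for i in part],
                      mtry, rng, depth + 1, max_depth, min_leaf)
            for part in (left, right)]
    return (j, t, kids[0], kids[1])

def predict_tree(node, row):
    """Walk from the root to a leaf; leaves are plain floats."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] <= t else right
    return node

def fit_forest(X, y, n_trees=30, mtry=2, seed=0):
    """Fit each tree on a bootstrap sample of the training rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in range(len(y))]
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], mtry, rng))
    return forest

def mse(forest, X, y):
    """Mean squared error of the forest average prediction."""
    preds = [mean(predict_tree(tree, row) for tree in forest) for row in X]
    return mean((p - t) ** 2 for p, t in zip(preds, y))

def permutation_importance(forest, X, y, seed=1):
    """Increase in MSE when one feature column is shuffled."""
    rng = random.Random(seed)
    base = mse(forest, X, y)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        rng.shuffle(col)
        shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
        scores.append(mse(forest, shuffled, y) - base)
    return scores

# Synthetic data (assumption): columns are in chronological order, and the
# heading-stage variable is given the largest effect purely for illustration.
data_rng = random.Random(42)
names = ["overwinter_temp", "jointing_sunshine", "heading_humidity"]
X = [[data_rng.random() for _ in names] for _ in range(80)]
y = [1.0 * a + 2.0 * b + 5.0 * c + data_rng.gauss(0, 0.2) for a, b, c in X]
X_train, y_train, X_test, y_test = X[:60], y[:60], X[60:], y[60:]

# Staged forecasts: later start times may use more of the columns.
stage_mse = {}
for k in (1, 2, 3):
    forest = fit_forest([r[:k] for r in X_train], y_train, mtry=min(2, k))
    stage_mse[k] = mse(forest, [r[:k] for r in X_test], y_test)

# After the loop, `forest` is the model trained on all three columns.
importance = permutation_importance(forest, X_train, y_train)
ranking = sorted(names, key=lambda n: -importance[names.index(n)])
print(stage_mse)
print(ranking)
```

In this sketch `mtry` plays the role of the Mtry parameter mentioned in the abstract: the number of variables sampled as split candidates at each node. In practice one would more likely rely on an established library; in scikit-learn, for example, the corresponding setting of `RandomForestRegressor` is `max_features`.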