The Anatomy of Out-of-Sample Forecasting Accuracy

Daniel Borup, Philippe Goulet Coulombe, David E. Rapach, Erik Christian Montes Schütte, and Sander Schwenk-Nebbe
Working Paper 2022-16
November 2022



We develop metrics based on Shapley values for interpreting time-series forecasting models, including "black-box" models from machine learning. Our metrics are model agnostic, so that they are applicable to any model (linear or nonlinear, parametric or nonparametric). Two of the metrics, iShapley-VI and oShapley-VI, measure the importance of individual predictors in fitted models for explaining the in-sample and out-of-sample predicted target values, respectively. The third metric is the performance-based Shapley value (PBSV), our main methodological contribution. PBSV measures the contributions of individual predictors in fitted models to the out-of-sample loss and thereby anatomizes out-of-sample forecasting accuracy. In an empirical application forecasting US inflation, we find important discrepancies between individual predictor relevance according to the in-sample iShapley-VI and out-of-sample PBSV. We use simulations to analyze potential sources of the discrepancies, including overfitting, structural breaks, and evolving predictor volatilities.
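To make the idea of decomposing out-of-sample loss concrete, the following is a minimal sketch, not the authors' algorithm and not the API of their anatomy package. It assumes a toy setup: simulated data, a linear model fitted by least squares, mean squared error as the loss, and exact enumeration of all predictor coalitions. Predictors outside a coalition are averaged over the training rows. As a simplification, the Shapley values are computed on the aggregate out-of-sample loss. All function and variable names (coalition_prediction, loss_shapley, and so on) are hypothetical.

```python
import itertools
import math

import numpy as np


def coalition_prediction(predict, x_row, background, coalition):
    """Prediction when only the predictors in `coalition` take their observed
    values; the other predictors are averaged over the background rows."""
    idx = np.array(coalition, dtype=int)
    mixed = background.copy()
    mixed[:, idx] = x_row[idx]
    return predict(mixed).mean()


def loss_shapley(predict, X_test, y_test, background, loss):
    """Exact Shapley decomposition of an out-of-sample loss across predictors
    (illustrative only; cost grows as 2**n_pred)."""
    n_pred = X_test.shape[1]
    players = range(n_pred)

    # Out-of-sample loss attained by every coalition of predictors.
    value = {}
    for size in range(n_pred + 1):
        for S in itertools.combinations(players, size):
            preds = np.array(
                [coalition_prediction(predict, x, background, S) for x in X_test]
            )
            value[S] = loss(y_test, preds)

    # Weighted average of each predictor's marginal effect on the loss.
    contrib = np.zeros(n_pred)
    for p in players:
        others = [q for q in players if q != p]
        for size in range(n_pred):
            w = (
                math.factorial(size)
                * math.factorial(n_pred - size - 1)
                / math.factorial(n_pred)
            )
            for S in itertools.combinations(others, size):
                S_with = tuple(sorted(S + (p,)))
                contrib[p] += w * (value[S_with] - value[S])
    return contrib, value


def mse(y, yhat):
    return float(np.mean((y - yhat) ** 2))


# Simulated data (an assumption for illustration): the third predictor is noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(250, 3))
y = 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=250)
X_train, y_train = X[:200], y[:200]
X_test, y_test = X[200:], y[200:]

# Fit the linear model on the training sample only.
beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)


def predict(A):
    return A @ beta


contrib, value = loss_shapley(predict, X_test, y_test, X_train, mse)
print("loss contributions by predictor:", contrib)
```

In this sketch a negative contribution means that the predictor lowers the out-of-sample loss. By the efficiency property of Shapley values, the contributions sum to the loss of the full model minus the loss of the forecast that uses no predictors.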

JEL classification: C22, C45, C53, E37, G17

Key words: variable importance, out-of-sample performance, Shapley value, loss function, machine learning, inflation

The authors thank seminar and conference participants at the European Commission Joint Research Center: Online Seminar, 2022 International Symposium on Forecasting, and Workshop on Advances in Alternative Data and Machine Learning for Macroeconomics and Finance, as well as Daniele Bianchi, Giulio Caperna, Todd Clark, Marco Colagrossi, Claudia Foroni (Workshop on Advances in Alternative Data and Machine Learning discussant), Nikolay Gospodinov, Andreas Joseph, Juri Marcucci, Michael McCracken, Marcelo Medeiros, Stig Møller, and Mirco Rubin for insightful comments. The authors created the Python package anatomy to compute the metrics for interpreting fitted prediction models developed in this paper. The views expressed here are those of the authors and not necessarily those of the Federal Reserve Bank of Atlanta or the Federal Reserve System. Any remaining errors are the authors' responsibility.

Daniel Borup is with Aarhus University and CREATES. Philippe Goulet Coulombe is with the Université du Québec à Montréal. David E. Rapach is with the Federal Reserve Bank of Atlanta. Erik Christian Montes Schütte is with Aarhus University, CREATES, and DFI. Sander Schwenk-Nebbe is with Aarhus University. Please address questions regarding content to David Rapach, Research Department, Federal Reserve Bank of Atlanta, 1000 Peachtree Street NE, Atlanta, GA 30309.
