The main objective of this paper is to use large language models (LLMs) to generate conditional inflation forecasts and compare them to standard sources of inflation forecasts. In particular, we structure the analysis throughout to maximize the comparability of inflation forecasts produced by LLMs with those produced by professional forecasters in the Federal Reserve Bank of Philadelphia’s Survey of Professional Forecasters (SPF).
Among the most prominent publicly available LLMs are GPT, developed by OpenAI, and PaLM, developed by Google AI.
Standard measures of forecast performance suggest that PaLM, the LLM we focus on, generates conditional forecasts that are at least as good as, if not better than, those from one of the most trusted and respected sources of inflation forecasts, the SPF. We believe this is relevant for agents, practitioners, and policymakers alike, as technological improvements in hardware and software are likely to significantly reduce the cost of developing and training these models. LLMs may then become an accessible means of generating forecasts, especially compared with potentially costlier surveys of experts and households.
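To illustrate what such a comparison involves, the minimal sketch below computes root mean squared error (RMSE) for two sets of quarterly inflation forecasts against realized inflation. The numbers and variable names (llm_forecast, spf_forecast) are hypothetical placeholders, not results or data from the paper; the paper's actual evaluation is described in the full text linked below.

```python
import numpy as np

def rmse(forecast: np.ndarray, realized: np.ndarray) -> float:
    """Root mean squared forecast error."""
    return float(np.sqrt(np.mean((forecast - realized) ** 2)))

# Hypothetical quarterly inflation rates (percent, annualized); illustrative only.
realized     = np.array([2.1, 2.4, 3.0, 4.2, 5.4, 6.8])
llm_forecast = np.array([2.0, 2.3, 2.9, 3.8, 5.0, 6.5])  # stand-in for LLM conditional forecasts
spf_forecast = np.array([2.2, 2.2, 2.6, 3.5, 4.6, 6.0])  # stand-in for SPF median forecasts

print(f"LLM RMSE: {rmse(llm_forecast, realized):.2f}")
print(f"SPF RMSE: {rmse(spf_forecast, realized):.2f}")
```

Lower RMSE indicates more accurate forecasts; the paper's comparison of the LLM against the SPF is based on standard loss measures of this kind.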
We think, however, that LLMs may become particularly useful for forecasting variables for which no surveys are conducted, or for which such surveys would be expensive and complicated to design and implement. These include, for instance, disaggregated time series such as labor force indicators for specific demographic groups or household disposable income for specific geographic regions, among others.
- St. Louis Fed Economic Research
Full paper here: https://s3.amazonaws.com/real.stlouisfed.org/wp/2023/2023-015.pdf