COVID-19 Forecasts by County

COVID-19 Forecasts by County provides U.S. county-level forecasts of COVID-19 infection growth rates based on the econometric model developed in Wilson (2020). The forecast methodology uses county panel data from early 2020 through the latest available data to estimate out-of-sample predicted values from a panel fixed-effects regression specification. To measure the reliability of these forecasts, this page also shows a comparison of past predictions to estimates of actual growth over several forecast horizons. See Wilson (2021) for a discussion of the forecast accuracy.

The regression equation is derived from the canonical epidemiological model of infectious disease spread, known as the Susceptible-Infectious-Removed (SIR) model. It relates a county’s subsequent growth, measured as the change in the log number of infected residents, over a specific horizon to current and past values of observable drivers of disease transmission. These drivers include social distancing behavior (mobility), weather (temperature and precipitation), and confirmed COVID-19 cases per capita to date. The regression also includes county fixed effects and county-specific linear time trends. This allows counties to have different levels and trends in their infection growth rates independent of the differences in those observed drivers of transmission.
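
For reference, the textbook SIR model can be written as below. This is the standard form of the model, shown only to illustrate why growth in log infections is the natural dependent variable; it is not necessarily the exact specification estimated in Wilson (2020). The symbols are notation introduced here: \beta is the transmission rate, \gamma is the removal rate, and N = S + I + R is the population.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N}, \\
\frac{dI}{dt} &= \beta \frac{S I}{N} - \gamma I, \\
\frac{dR}{dt} &= \gamma I,
\end{aligned}
\qquad\Longrightarrow\qquad
\frac{d \ln I}{dt} = \beta \frac{S}{N} - \gamma .
```

In this form, the growth rate of log infections depends on the transmission rate and on the susceptible share of the population, S/N. Factors such as mobility and weather can be thought of as shifting the transmission rate.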

This forecasting model relies on daily county-level, near real-time data on weather, mobility, and COVID-19 cases. The weather data are based on NOAA/NCDC weather station readings and aggregated to the county level following Wilson (2019). The mobility data used here are from the Device Exposure Index (DEX) produced by Couture et al. (2020) and based on mobile device geolocation data from PlaceIQ. Other mobility data, such as the Dallas Fed’s discontinued Mobility and Engagement Index (MEI) or Google Mobility Reports, have yielded similar results, as demonstrated in Wilson (2020). Data on COVID-19 cases come from usafacts.org, which compiles data from state public health agencies.

The forecasts also incorporate state-level data on vaccinations from the Centers for Disease Control and Prevention. To do so, the data need to be adjusted for individual counties. Specifically, counties are assumed to have the same share of their population fully vaccinated as their state’s average, and vaccinations are assumed to be 95% effective. In accordance with the standard SIR model, this effective vaccination share of the population directly reduces forecasted growth in infections. In other words, vaccinations move people from the susceptible category (S) to the removed category (R).
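
The county adjustment described above can be sketched as follows. This is a minimal illustration, not the model's actual code: the function names, the county and state identifiers, and the input dictionaries are all hypothetical. Only the 95% effectiveness figure and the "county share equals state share" assumption come from the text. How the resulting share enters the regression is not shown here.

```python
# Assumption taken from the text: vaccinations are treated as 95% effective.
VACCINE_EFFECTIVENESS = 0.95


def effective_vaccination_share(state_share_fully_vaccinated,
                                effectiveness=VACCINE_EFFECTIVENESS):
    """Share of the population treated as moved from S to R by vaccination."""
    if not 0.0 <= state_share_fully_vaccinated <= 1.0:
        raise ValueError("vaccinated share must be between 0 and 1")
    return effectiveness * state_share_fully_vaccinated


def county_effective_shares(county_to_state, state_shares):
    """Assign each county its state's share (hypothetical input layout).

    county_to_state: dict mapping a county identifier to a state identifier.
    state_shares: dict mapping a state identifier to the share of the
        state's population that is fully vaccinated.
    """
    return {
        county: effective_vaccination_share(state_shares[state])
        for county, state in county_to_state.items()
    }
```

For example, a county in a state where 40% of residents are fully vaccinated would be assigned an effective vaccination share of 0.95 × 0.40 = 0.38.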

These forecasts will be updated approximately every week, assuming there are no disruptions in data feeds for weather, mobility, and cases. The latest forecasts from this empirical model for horizons from 10 days ahead up to 70 days ahead (at 10-day intervals) are available in the downloadable file.

Forecasts of the change in log active infections in each county for each horizon can be easily aggregated using a population-weighted average to provide a national forecast of the change in log active infections for the same horizons. Applying these predicted national growth rates to the latest level of national infections yields a forecast for new active infections over the next 70 days. Figure 1 shows the latest national forecast (dotted line), along with the historical data (solid line).
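
The aggregation step described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code behind Figure 1: the function names and input layout are hypothetical, and counties without a forecast are simply assumed to be left out of the weighted average.

```python
import math


def national_log_change(county_forecasts, county_populations):
    """Population-weighted average of county forecasts.

    county_forecasts: dict mapping county id to the forecasted change in
        log active infections over one horizon.
    county_populations: dict mapping county id to population.
    Only counties present in county_forecasts contribute (an assumption).
    """
    weighted_sum = 0.0
    total_population = 0.0
    for county, dlog in county_forecasts.items():
        population = county_populations[county]
        weighted_sum += population * dlog
        total_population += population
    if total_population == 0:
        raise ValueError("no population to weight by")
    return weighted_sum / total_population


def projected_level(latest_level, dlog):
    """Apply a forecasted change in logs to the latest infection level."""
    return latest_level * math.exp(dlog)
```

With two hypothetical counties of 300 and 100 residents and forecasts of 0.1 and -0.2, the weighted average is (300 × 0.1 + 100 × (-0.2)) / 400 = 0.025, and the projected level is the latest level multiplied by exp(0.025).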

Figure 1: Forecast of Active U.S. COVID-19 Infections

As an illustration of the county-level forecasts, Map 1 shows the forecasts for 30 days ahead. Note that the DEX mobility measure is unavailable in less populous counties due to non-disclosure restrictions for privacy protection.

Map 1: Projected Growth of COVID-19 Infections by County, 30 Days Ahead

Note: Map 1 shows each county’s projected growth in infections, color-coded into one of five groups, ranked from the most negative to most positive; gray shading indicates insufficient data. Growth is measured by the change in log active infections from the date listed to 30 days later. Note that the exponential of the change in log values, minus 1 and multiplied by 100, expresses growth in percentage terms.
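
The conversion from a change in logs to percentage growth mentioned in the map notes can be written as a short function. The function name is hypothetical; the formula is the one stated in the note.

```python
import math


def log_change_to_percent(dlog):
    """Convert a change in log values to growth in percent.

    Implements 100 * (exp(dlog) - 1); math.expm1 computes exp(x) - 1
    accurately even when dlog is close to zero.
    """
    return 100.0 * math.expm1(dlog)
```

For instance, a change in logs of -0.6947 (the 30-day mean actual value reported in Table 1) corresponds to a decline of roughly 50 percent.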

To put these latest forecasts in perspective, Map 2 shows the actual change in log active infections over the past 30 days, based on approximating active infections using data on cases; see Wilson (2020) for details.

Map 2: Actual Growth of COVID-19 Infections by County, 30 Days Prior

Note: Map 2 shows each county’s actual growth in infections, color-coded into one of five groups, ranked from the most negative to most positive; gray shading indicates insufficient data. Growth is measured by the change in log estimated active infections from 30 days prior to the latest date of data. Note that the exponential of the change in log values, minus 1 and multiplied by 100, expresses growth in percentage terms.

To assess the accuracy of the forecasting model, Map 3 shows the model’s out-of-sample forecasts of the change in log active infections over the same 30-day period as in Map 2, using data only up to the beginning of the period. A strong similarity between Maps 2 and 3 indicates relatively high accuracy of the forecasts.

Map 3: Prior Month Projections for Infection Growth by County

Note: Map 3 shows the earlier 30-day-ahead projections for the period in Map 2, to illustrate forecast accuracy. Each county’s projected growth is color-coded into one of five groups, ranked from most negative to most positive; gray shading indicates insufficient data. Data are based on the forecasted value of the change in log active infections from the date listed to 30 days later. Note that the exponential of the change in log values, minus 1 and multiplied by 100, expresses growth in percentage terms.

Table 1 provides formal evidence on forecast accuracy for various forecast horizons. The table shows two accuracy metrics for the county forecasts, comparing each horizon’s forecast, as of the beginning of that period, against actual growth in infections over the same period. For example, the first row evaluates the accuracy of a 10-day-ahead forecast of infection growth, based on data up until 10 days before the last day of available data, by comparing it to actual growth in infections over those 10 days. The first column indicates the length of the model forecast in days. The correlation metric is the cross-county correlation between the forecasted value and the actual value of the change in log active infections over that horizon. The closer this correlation is to 1, the better the forecasting model is at predicting the cross-county distribution of COVID-19 growth. The RMSFE metric is the root mean-squared forecast error, which is the square root of the mean across counties of their squared forecast errors. The smaller the RMSFE, the more accurate the average county’s forecast.

To put these forecast errors in perspective, it is useful to compare them to the cross-county means of the actual and predicted values of the change in log active infections, shown in the next two columns. The last column of the table shows how many counties had sufficient data to produce a forecast.

Table 1. Forecast Accuracy Statistics

Horizon (days)   Correlation   RMSFE    Mean (actual)   Mean (predicted)   No. of counties
10               0.1746        0.5274   -0.3365         0.0216             1751
20               0.0929        0.773    -0.5268         0.0089             1757
30               0.067         0.8798   -0.6947         -0.1352            1756
40               0.0793        0.9334   -0.7703         -0.2946            1750
50               0.1183        0.9278   -0.7416         -0.5619            1767
60               0.2197        0.9858   -0.53           -0.7653            1801
70               0.2333        1.1867   -0.2976         -0.9987            1795
Note: The values shown in the columns RMSFE, Mean (actual), and Mean (predicted) are based on population-weighted means.
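
The accuracy statistics in Table 1 could be computed along the following lines. This is a sketch under assumptions rather than the code used to produce the table: the function names are hypothetical, the correlation is assumed to be an unweighted Pearson correlation across counties (the table note lists only RMSFE and the means as population-weighted), and the inputs are assumed to be equal-length lists ordered by county.

```python
import math


def pearson_correlation(predicted, actual):
    """Unweighted cross-county Pearson correlation (an assumption)."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / math.sqrt(var_p * var_a)


def weighted_mean(values, weights):
    """Population-weighted mean of county values."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def weighted_rmsfe(predicted, actual, weights):
    """Square root of the population-weighted mean squared forecast error."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(weighted_mean(squared_errors, weights))
```

Here `predicted` and `actual` hold each county's forecasted and realized change in log active infections for one horizon, and `weights` holds county populations.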

References

Couture, Victor, Jonathan I. Dingel, Allison E. Green, Jessie Handbury, and Kevin R. Williams. 2020. “Measuring Movement and Social Contact with Smartphone Data: A Real-Time Application to COVID-19.” National Bureau of Economic Research Working Paper 27560 (July).

Wilson, Daniel J. 2021. “SF Fed Launches Tool to Forecast COVID-19 Infections.” SF Fed Blog, February 3, 2021. https://www.frbsf.org/our-district/about/sf-fed-blog/sf-fed-launches-tool-forecast-covid-19-infections/

Wilson, Daniel J. 2020. “Weather, Mobility, and COVID-19: A Panel Local Projections Estimator for Understanding and Forecasting Infectious Disease Spread.” FRBSF Working Paper 2020-23 (December). https://doi.org/10.24148/wp2020-23

Wilson, Daniel J. 2019. “Clearing the Fog: The Predictive Power of Weather for Employment Reports and their Asset Price Responses.” American Economic Review: Insights 1(3), December, pp. 373–388. https://doi.org/10.1257/aeri.20180432

Download Data

Projections data (Excel document, 848 kb)