NPR’s California Newsroom partnered with Stanford University’s Environmental Change and Human Outcomes Lab to map the increasing prevalence of wildfire smoke across the United States. The monthslong analysis, based on more than 10 years of data collected by the National Oceanic and Atmospheric Administration, reveals a startling increase in the number of days residents are breathing smoke at the ZIP code level — across California and the Pacific Northwest to Denver and Salt Lake City in the Rocky Mountains and rural Kentucky and West Virginia in Appalachia.
We also probed the potential health consequences of inhaling all this smoke, examining state hospitalization data in California and federally funded prescriptions for albuterol, a common asthma medication.
Here’s how we calculated the average number of annual smoke days for every ZIP code in America
We used data collected by the National Oceanic and Atmospheric Administration’s meteorological satellites that capture images of smoke and fire activity every few hours throughout the day across the United States.
Trained experts in NOAA’s Satellite Analysis Branch hand draw the boundaries of smoke plumes produced by wildfires for the Hazard Mapping System. The HMS lists active fires and shows the contours of smoke plumes for every day of the year.
Before the data is added to a new day’s map, the fire pixels are screened to discard fires and smoke that are not associated with burning vegetation. This separates out smoke caused by industrial processes, like gas flaring at oil refineries, along with fires that mainly occur in residential, commercial or community-based buildings.
These data have limitations: They do not capture smoke plumes obscured by cloud cover or plumes at night. They also do not record the vertical height of plumes and therefore cannot easily distinguish between plumes at surface level and plumes high above the ground.
To localize the persistence of the smoke at the ZIP code level, we partnered with Stanford University’s Environmental Change and Human Outcomes Lab led by professor Marshall Burke. With advanced computer modeling, Burke’s team mapped smoke exposure to every ZIP code each day between 2009 and 2020 by checking whether any smoke plumes from NOAA’s HMS data product intersected with any part of the ZIP code on that day. There were instances where all smoke plume information was missing due to cloud cover, but this represented less than 1% of the days they studied. When this was the case, it was assumed that smoke was not present.
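The daily matching step described above can be sketched in a few lines of Python. This is only an illustration, not the Stanford team's model: it stands in simple rectangles for the real plume and ZIP code polygons, and the names `boxes_intersect`, `smoke_day_flags` and the toy coordinates are all hypothetical. It does mirror two rules from the text: a ZIP code counts as smoky if any plume touches any part of it, and a day with no plume information is treated as smoke-free.

```python
from datetime import date

# A rectangle is (min_x, min_y, max_x, max_y). Real plumes and ZIP code
# areas are irregular polygons; rectangles are a simplifying assumption.
def boxes_intersect(a, b):
    """Return True if two axis-aligned rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def smoke_day_flags(zip_boxes, plumes_by_day, days):
    """Map (zip_code, day) to 1 if any plume touched the ZIP that day, else 0.

    A day with no plume record at all (for example, lost to cloud cover)
    is treated as smoke-free, as the methodology describes.
    """
    flags = {}
    for day in days:
        plumes = plumes_by_day.get(day, [])  # missing day means no smoke
        for zip_code, zip_box in zip_boxes.items():
            hit = any(boxes_intersect(zip_box, plume) for plume in plumes)
            flags[(zip_code, day)] = 1 if hit else 0
    return flags

# Hypothetical toy data, not real geography.
ZIP_BOXES = {"00001": (0, 0, 2, 2), "00002": (5, 5, 7, 7)}
PLUMES = {date(2020, 9, 1): [(1, 1, 3, 3)]}
DAYS = [date(2020, 9, 1), date(2020, 9, 2)]
FLAGS = smoke_day_flags(ZIP_BOXES, PLUMES, DAYS)
```

In this toy example, only the first ZIP code is flagged on Sept. 1, and no ZIP code is flagged on Sept. 2 because that day has no plume record.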
The resulting dataset offers an understanding of the frequency of smoke days rather than the density. At the granular ZIP code level, the data on density of smoke is less meaningful and harder to map than at the aggregated county level. Though, as the frequency map shows, there can be variations in exposure to smoke within a county. Moreover, NOAA’s data collection on smoke density did not begin until late 2010.
This “smoke day” metric can be aggregated over time periods to provide a count of the number of days wildfire smoke was overhead for each ZIP Code Tabulation Area. Smoke consists of tiny unburnt solid particles suspended in the air, which, depending on weather conditions, move around unpredictably. Since fires occur in different places in different years, we decided to average out the exposure to smoke over five-year periods to best understand the impact of the “new normal” of wildfires, which are burning hotter, faster and more frequently due to climate change.
We defined 2009 to 2013 as the “base period” and 2016 to 2020 as the “current period.” We selected 2009 to 2013 as the five-year “base period” after analysis of Cal Fire data showed it to be a window of relative calm, in terms of acres burned each year by wildfires.
We then queried the data to get the average smoke days recorded in these two periods and quantified the percent change from the base to the current period for every ZIP code across the country.
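The arithmetic behind that comparison is simple enough to show. The sketch below assumes yearly smoke-day counts are already available for a ZIP code; the function names and the sample counts are hypothetical, and the handling of a zero base (returning `None`) is our assumption, since the text does not say how such cases were treated.

```python
BASE_YEARS = range(2009, 2014)     # 2009 to 2013 inclusive
CURRENT_YEARS = range(2016, 2021)  # 2016 to 2020 inclusive

def average_annual_smoke_days(yearly_counts, years):
    """Mean smoke days per year over a period; absent years count as 0."""
    return sum(yearly_counts.get(year, 0) for year in years) / len(years)

def percent_change(base, current):
    """Percent change from base to current; None when the base is zero."""
    if base == 0:
        return None  # assumption: percent change is undefined here
    return (current - base) / base * 100

# Hypothetical yearly smoke-day counts for one ZIP code.
counts = {2009: 10, 2010: 12, 2011: 8, 2012: 14, 2013: 6,
          2016: 20, 2017: 25, 2018: 30, 2019: 15, 2020: 35}

base_avg = average_annual_smoke_days(counts, BASE_YEARS)        # 10.0
current_avg = average_annual_smoke_days(counts, CURRENT_YEARS)  # 25.0
change = percent_change(base_avg, current_avg)                  # 150.0
```

Here the hypothetical ZIP code averaged 10 smoke days a year in the base period and 25 in the current period, a 150% increase.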
We combined the information with census data to place every ZIP code within its city, county and state limit. Using census mapping files, known as TIGER/Line shapefiles, we then created a choropleth map to visualize the analyzed data.
ZIP Code Tabulation Areas are approximate area representations of U.S. Postal Service (USPS) ZIP code service areas that the U.S. Census Bureau creates to present statistical data for each decennial census.
ZIP Code Tabulation Areas do not nest neatly within state, county and city boundaries, so the dataset we created lists multiple cities and counties for some ZIPs. We ran into placement errors in fewer than 2% of ZIP codes and removed these from the dataset.
With this cleaned dataset, we found several parts of the U.S. are now exposed to many more smoke days than they were a decade ago. California was particularly affected.
We offer a few caveats. A handful of ZIP codes in Florida recorded the most smoke days in the U.S., even more than ZIP codes in California, including parts of Palm Beach, Hendry, Glades and Okeechobee counties. These ZIP codes are where most of the state’s sugar crop is grown and where residents battle the smoke from burning sugar cane fields before harvest season.
Elsewhere in Florida, however, there were positive turnarounds with exposure actually declining, including in the cities of Jacksonville and Tampa, where the number of smoke days declined 28% and 37% respectively. Experts we spoke to believe this may be due to their success with prescribed burns for controlling wildfires in Florida.
Our analysis found many areas of the Midwest experienced a slight drop in the number of days with smoke overhead since 2009. It showed an 8% decline in the number of smoke days in Chicago, for example, and a 12% decline in Milwaukee. But Burke’s modeling of the satellite imagery shows the smoke there is now thicker, suggesting “wildfires are also worsening overall air quality in the Midwest, just as they are in the West.”
Here’s how we calculated the health impacts of all that wildfire smoke
For our analyses on how smoke is impacting health outcomes, we consulted with a panel of experts that included:
- Dr. John Balmes, a pulmonary physician, professor of medicine at UCSF and chief of the division of occupational and environmental medicine at San Francisco General Hospital;
- Dr. Francesca Dominici, professor of biostatistics at the Harvard T.H. Chan School of Public Health and co-director of the Data Science Initiative at Harvard University;
- Dr. Stephanie Holm, an environmental pediatrician and epidemiologist, and co-director of the Western States Pediatric Environmental Health Specialty Unit at UCSF.
Data on hospitalizations:
Research studies and experts have repeatedly warned that exposure to smoke impacts respiratory and cardiac health. To get a sense of the trends in hospitalizations at a granular geographic level across California, we looked at the California Office of Statewide Health Planning and Development’s (OSHPD) data on hospital discharges by individual facility for the years 2016 to 2019. This data is only available for the top 25 diagnosis groups per hospital in the state. We filtered the data for all hospital discharges related to respiratory and cardiovascular conditions.
We originally sought to match the years analyzed in our smoke analysis with the years studied in the hospital data. However, although the OSHPD dataset was available for previous years, reporting codes for diseases were updated from ICD-9 to ICD-10 in late 2015, affecting the way hospitals reported these admissions and discharges for various diagnosis groups. After consulting our panel of experts, we excluded 2015 and all prior years. We also excluded data for 2020, anticipating hospitalizations caused by COVID-19 would skew our analysis.
With each hospital’s OSHPD ID, we geocoded the dataset, using OSHPD’s Licensed Healthcare Facility Listing dataset from June 2021 to get the city and county of each individual facility.
Then, we added our smoke days analysis from NOAA data for the ZIP codes of each hospital listed. Since this dataset on hospitalizations and diseases is not based on the ZIP codes of patients, it was more meaningful to aggregate the data above the county level to account for hospital closures, residents who may travel to nearby counties for health care or counties that do not have any hospitals — Alpine County, for example.
In grouping the hospitals, we used the regions delineated by the California Hospital Association as it offered the best estimate of how patients may move across county borders. We then mapped out the percent change in hospitalizations and compared it to the median rise in smoke days for counties across California.
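A rough sketch of that regional roll-up follows. It is not our production code: the hospital IDs, region names, county names and figures are invented, and comparing the first and last years of the hospital data (2016 and 2019) is an assumption made for illustration.

```python
from collections import defaultdict
from statistics import median

def region_percent_change(discharges, hospital_region, start_year, end_year):
    """Sum discharges per region and year, then return percent change by region."""
    totals = defaultdict(lambda: defaultdict(int))
    for hospital_id, year, count in discharges:
        totals[hospital_region[hospital_id]][year] += count
    changes = {}
    for region, by_year in totals.items():
        start, end = by_year[start_year], by_year[end_year]
        if start:  # skip regions with no discharges in the start year
            changes[region] = (end - start) / start * 100
    return changes

def region_median_smoke_change(county_smoke_change, county_region):
    """Median of county-level smoke-day changes within each region."""
    by_region = defaultdict(list)
    for county, change in county_smoke_change.items():
        by_region[county_region[county]].append(change)
    return {region: median(values) for region, values in by_region.items()}

# Hypothetical inputs: (hospital ID, year, respiratory and cardiac discharges).
DISCHARGES = [("H1", 2016, 100), ("H2", 2016, 100), ("H3", 2016, 50),
              ("H1", 2019, 120), ("H2", 2019, 130), ("H3", 2019, 25)]
HOSPITAL_REGION = {"H1": "Region A", "H2": "Region A", "H3": "Region B"}
COUNTY_SMOKE_CHANGE = {"County X": 40.0, "County Y": 60.0, "County Z": 10.0}
COUNTY_REGION = {"County X": "Region A", "County Y": "Region A",
                 "County Z": "Region B"}

hospital_change = region_percent_change(DISCHARGES, HOSPITAL_REGION, 2016, 2019)
smoke_change = region_median_smoke_change(COUNTY_SMOKE_CHANGE, COUNTY_REGION)
```

With these made-up numbers, Region A shows a 25% rise in hospitalizations alongside a median 50% rise in smoke days, while Region B shows a 50% drop in hospitalizations alongside a median 10% rise in smoke days.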
Out of these 21 hospital regions, we found a positive relationship in 18 regions between exposure to wildfire smoke days and a growth in hospitalizations for respiratory and cardiac illnesses. The other three regions covered Alpine, Amador, Calaveras, San Joaquin, Tuolumne, San Mateo, Tulare and Kings counties. Most of these counties, except for San Mateo, are in more rural parts of the state where the distribution of hospitals and the number of beds are lower, rendering it harder to decipher trends.
We also tried the California Department of Public Health’s regional grouping for hospitals, which divides the state into five zones. However, with more than 50% of the state’s population covered in the Southern California region alone, the analysis proved statistically insignificant.
Data on asthma medication prescriptions:
Data for prescriptions of albuterol, a short-acting bronchodilator medication, is sourced from the Centers for Medicare & Medicaid Services. Albuterol is the most common “rescue” medication used when a patient with asthma is having an acute exacerbation. It is not a “controller” or inhaled corticosteroid medication.
From the CMS trove of datasets, we specifically looked at the “Medicare Part D Prescribers — by Provider and Drug” dataset that gives information on prescription drugs funded by the federal agency and prescribed by physicians and other health care providers, aggregated by provider and drug. The CMS has this data publicly available for the years 2013 to 2018.
We filtered the Part D dataset for albuterol, the drug’s generic name, and summed the total albuterol claims for each year between 2013 and 2018 in California.
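The filter-and-sum step might look something like the sketch below. The field names (`state`, `generic_name`, `year`, `total_claims`) are hypothetical stand-ins rather than the CMS file’s actual column headers, and matching any generic name that contains “albuterol” is an assumption that would also pick up combination products.

```python
from collections import Counter

def albuterol_claims_by_year(rows, state="CA"):
    """Total albuterol claims per year for prescribers in one state."""
    totals = Counter()
    for row in rows:
        if row["state"] != state:
            continue
        # Assumption: a case-insensitive substring match on the generic name.
        if "albuterol" not in row["generic_name"].lower():
            continue
        totals[row["year"]] += int(row["total_claims"])
    return dict(totals)

# Hypothetical rows, one per provider and drug, as in the Part D layout.
ROWS = [
    {"state": "CA", "generic_name": "Albuterol Sulfate", "year": 2013, "total_claims": "120"},
    {"state": "CA", "generic_name": "Albuterol Sulfate", "year": 2013, "total_claims": "80"},
    {"state": "CA", "generic_name": "Albuterol Sulfate", "year": 2018, "total_claims": "300"},
    {"state": "CA", "generic_name": "Metformin HCl", "year": 2018, "total_claims": "500"},
    {"state": "NV", "generic_name": "Albuterol Sulfate", "year": 2018, "total_claims": "90"},
]

CLAIMS = albuterol_claims_by_year(ROWS)
```

In this toy input, the two California albuterol rows for 2013 are combined, and the Nevada row and the non-albuterol row are ignored.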
We refrained from carrying out further analysis, as several other factors, such as insurance coverage, the prevalence of cigarette smoking and industrial pollution are also at play with CMS data, rendering it hard to interpret at the macro level.
Our investigation into these data shows wildfire smoke poses a serious threat to everyday life across the U.S., not merely the West. As a fairly recent phenomenon that is projected to only worsen, the impact of smoke on human health is inescapable and insidious, given its ability to go undetected in the air we breathe. Early research shows that older people, children and poorer people, especially those from Black, brown and Indigenous communities, may be the most vulnerable to the health risks related to exposure to PM2.5 from wildfires.
The first step to addressing a problem is awareness, and it is our endeavor through this ongoing investigation to propel forward discussion and policy change to address one of the greatest impacts of climate change.
Alison Saldanha is a data journalist who led this investigation for NPR’s California Newsroom. Aaron Glantz, senior investigations editor for the newsroom, edited this story together with managing editor Adriene Hill.
The California Newsroom is a collaboration of NPR and 17 public radio stations across the state, from San Diego to the Oregon border.