Missing But Not Gone: New Homes for Federal Data

In the months since the current presidential administration took office, government websites have seen noticeable changes to their content. References to diversity, equity, and inclusion (DEI) and related concepts have been scrubbed across multiple agencies, including the National Park Service, the Centers for Disease Control and Prevention, and the Social Security Administration. Some federal datasets and tools have also been decommissioned. In January 2025, more than 2,000 datasets were taken offline, and archivists note that much of the missing information relates to climate change, environmental justice, and public health. In at least one case, however, litigation has led to the reinstatement of federal government data.
Government Data Preservation Efforts
The removal of federal data has widespread implications across nearly all sectors. For planners, the disappearance of these datasets could create gaps in the information they rely on for decision-making, research, and policy development. Digital archivists, researchers, organizations, and individuals are working to find and save the missing information. A non-exhaustive list of entities preserving access to this data is presented here.
Internet Archive Wayback Machine
The most expansive and perhaps best-known archive website is the Internet Archive. Through its Wayback Machine, this nonprofit has preserved over 835 billion webpages and millions of books, photos, videos, and audio recordings. Users can search for specific URLs or for words related to the dataset in question, and filter by ".gov web pages" to limit search results to U.S. government websites.
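For readers who would rather check for archived copies from a script than through the search page, the Internet Archive also offers a public Wayback Machine availability endpoint that returns the closest saved snapshot of a URL as JSON. The following is a minimal sketch in Python, not an official client: the function names are hypothetical, and it does not reproduce the ".gov web pages" filter, which belongs to the website's search interface.

```python
import json
import urllib.parse
import urllib.request

# Public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(page_url, timestamp=None):
    """Build the query URL for a page, optionally near a YYYYMMDD timestamp."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(response):
    """Return the archived URL from a parsed response, or None if none is listed."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_snapshot(page_url, timestamp=None):
    """Query the endpoint over the network and return the closest snapshot URL."""
    query = build_availability_url(page_url, timestamp)
    with urllib.request.urlopen(query, timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `find_snapshot("epa.gov", "20250101")`, with `epa.gov` used purely as an illustrative address, would ask for the snapshot captured closest to January 1, 2025. A result of `None` means only that the endpoint reported no snapshot for that exact address, so searching the Wayback Machine site directly is still worthwhile.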
Webrecorder U.S. Government Web Archive
Webrecorder is a company founded to develop open-source web archival tools. Its End of Term Web Archive initiative saves federal government websites at the end of presidential terms. The archive includes copies of websites from agencies such as USAID, CDC, EPA, and FEMA.
Source Cooperative
Source Cooperative, a project currently in beta development by nonprofit Radiant Earth, aims to enable the preservation and sharing of datasets by trusted organizations. Its key planning-relevant data repository, managed by the Harvard Law School Library Innovation Lab, seeks to preserve public U.S. federal datasets. This includes downloads of all data files posted on data.gov and an explainer of how the data is organized.
Public Environmental Data Partners
Public Environmental Data Partners (PEDP), a group of volunteer academics and archivists, focuses on preserving federal environmental data. Since January 2025, PEDP has retained and made public dozens of datasets and tools, including the EPA's EJScreen tool, FEMA's Future Risk Index, and the CDC's Social Vulnerability and Environmental Justice indices.
Environmental Data and Governance Initiative
Environmental Data and Governance Initiative, a research collaborative of public policy and science professionals, tracks changes to environmental data using open-source software. These changes are documented and freely accessible, and analyses of these shifts are detailed in blog posts.
These examples represent only a share of the organizations working to preserve data in the wake of recent federal actions. Colleges and universities are also addressing data loss. One such example is Occidental College, whose library website offers information on missing data; a list of data rescue organizations and resources, including some of those described above; and instructions on how to track down missing federal data.
Planners can bookmark and check these sites for updates on the ever-evolving federal data landscape and access to the growing number of rescued datasets. As the saying goes, what gets posted on the internet stays there forever, which, at least in the case of data preservation, is cause for hope.