Researchers Push for Public Test of Census Privacy Tools

June 12, 2020
by Michael Macagnone, CQ-Roll Call (TNS)

WASHINGTON — The Census Bureau says it has improved its ability to give accurate data while protecting the privacy of its 2020 questionnaire responses, but experts worry they won’t be able to test the agency’s strategy before it is finalized.

The tweaks to the new method are critical to an accurate population count, one that will affect legislative mapmaking and the distribution of $1.5 trillion in federal funds.

The bureau made changes to an algorithm that adds “noise” to census data, a policy referred to as differential privacy, after researchers argued last year that a public test showed the policy made census data unusable. They found large errors, such as graveyards populated with living residents, and small ones, such as age distributions that skewed older or younger at small geographies.

The Census Bureau last month released a “report card” showing it had gotten more accurate while still preserving the privacy of census responses. But without another full public test and more technical details, researchers fear they won’t know whether the report card represents conscious policy tradeoffs or glitches in the system.

David Van Riper, spatial analysis director for the Minnesota Population Center at the University of Minnesota, said metrics show the Census Bureau has made progress on things such as the accuracy of age distributions but the results are off in other areas.

“It’s hard to deal with metrics without seeing what’s driving the changes,” Van Riper said. “Having the new metrics is a good step but is not sufficient for assessing how good this new algorithm is.”

The Census Bureau adopted its differential privacy policy after research showed existing methods, such as randomly swapping members of households, failed to do enough to protect the identity of individual participants. Privacy researchers at a conference last year also said they feared census responses could be cross-referenced with other datasets to identify individuals.

The differential privacy algorithm changes the data based on what’s called a privacy loss budget — the lower the budget, the noisier the data, and the higher the budget, the more accurate. Currently, only state-level population totals and a few other measures are kept constant, according to Census Bureau officials.
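
As a rough illustration of the budget tradeoff described above, the sketch below adds random noise to a single population count and measures how far the published number drifts from the true one. It uses the Laplace mechanism, a textbook differential privacy technique chosen here as an assumption; the article does not describe the bureau's actual algorithm. The function names, the block population of 100 and the budget values are all hypothetical.

```python
import random


def noisy_count(true_count, epsilon, rng):
    """Return true_count plus Laplace noise of scale 1/epsilon.

    Assumes one person changes the count by at most 1 (sensitivity 1).
    `epsilon` stands in for the privacy loss budget (illustrative only).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # The difference of two exponential draws with rate epsilon
    # follows a Laplace distribution centered on 0 with scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


def mean_abs_error(true_count, epsilon, trials=2000, seed=0):
    """Average distance between the noisy count and the true count."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += abs(noisy_count(true_count, epsilon, rng) - true_count)
    return total / trials


if __name__ == "__main__":
    # Hypothetical block of 100 residents under three hypothetical budgets.
    for epsilon in (0.1, 0.5, 2.0):
        error = mean_abs_error(100, epsilon)
        print(f"budget {epsilon}: average error of about {error:.1f} people")
```

In this sketch a smaller budget yields a larger average error, which is the tradeoff researchers want to test at the block level, where true counts are small and the same amount of noise matters more.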

Since publishing the test dataset last year, the Census Bureau said, it has made several changes meant to address problems with that trial run, which used 2010 data. However, the agency doesn’t plan to produce another public trial run.

“Unfortunately, the tabulation, documentation and quality control processes required for public releases of data products are enormously time and labor intensive,” Michael Hawes, the agency’s senior adviser for data access and policy, said in a statement. “With the 2020 Census now underway, we are unable to support the release of another full demonstration product.”

Hawes said the agency may do an “alternative file release” to provide researchers more information before finalizing the rules. He said the agency intends in September to decide which population levels will not be altered by the algorithm, then finalize the privacy budget and other specifications by March 2021.

Without another release, researchers will have to trust the agency’s report of its progress on hammering out glitches and its balance between privacy and accuracy. That means they won’t know whether they can use the data for thousands of decisions, ranging from legislative mapmaking to the distribution of federal funds, until after census data has been released next year.

“This is one of the most important datasets, if not the most important data set for the nation,” said Alexis Santos, a Pennsylvania State University professor and demographer. “We need to do it in a way that the data are still usable so that we can use it to draw maps and study the population.”

Santos said last year’s public test dataset, produced by applying the algorithm to a batch of 2010 census results, proved to be too inaccurate. The test distorted race and ethnicity data in particular, potentially hiding disparities that researchers rely on to study health impacts and policing by race.

“A lot more people are beginning to understand the structural differences that various people have faced for years — really since the founding of the country. And census data is the way we look at that,” Santos said.

Organizations such as the National Conference of State Legislatures have called for another test run, raising concerns that differential privacy will complicate legislative mapmaking.

NCSL’s director for elections and redistricting, Wendy Underhill, said the current iterations of differential privacy may make it difficult for states to meet their constitutional requirements when drawing up new congressional districts.

“If differential privacy does not have accurate population totals at block level, it is hard for districts to be built that we are sure are of equal population,” she said.

Underhill said it may be possible to show that an area’s actual population is different from what census data claims, potentially opening up new avenues for litigation. The addition of noise to data may also complicate drawing districts under the Voting Rights Act, which prohibits racial discrimination at the polls.

Determining whether an area needs to protect voter rights depends largely on analyzing racially polarized voting trends at the precinct level, said Loyola University Law School professor Justin Levitt.

“That is a really small unit, and differential privacy makes a big difference in really small units,” Levitt said. “It’s going to make it hard to show voting is as polarized as it actually is on the ground.”

Levitt said some in the civil rights community have pushed the Census Bureau to merge some very rare combinations of data — like people over 100, or people of five or more races — to cut down on the amount of data released.

The executive order that President Donald Trump issued last year after dropping a citizenship question from the questionnaire only adds to the uncertainty of how accurate the census data will be. That order required the Census Bureau to compile citizenship data for the entire country at the most detailed geographic level, which Levitt said will only take away from the privacy loss budget and make the entire census less accurate.

“(The executive order) is adding to the concern that at the most local geographies, the data will be noisy enough to be problematic,” Levitt said. “Every additional reduction in the precise data available helps with accuracy.”


©2020 CQ-Roll Call, Inc., All Rights Reserved

Distributed by Tribune Content Agency, LLC.

