When Is Big Data Bad Data? When It Causes Bias

Bloomberg BNA
July 28, 2016

From Daily Labor Report

http://www.bna.com/big-data-bad-n73014445584

By Kevin McGowan

July 28: Employers are turning to computer-driven algorithms to find, recruit and hire job candidates online, but one negative output could be unintentional discrimination.

“It’s a bit of a black box,” said Commissioner Victoria Lipnic (R) of the Equal Employment Opportunity Commission, referring to the formulas data analysts and programmers develop to aid employers in their talent searches.

Vendors that promote the algorithms say using a neutral formula that eliminates the human element, at least in the early stages of searching for and recruiting candidates, reduces the risk of unlawful bias.

But others fear the algorithms, depending on how they are constructed and used, could create or perpetuate discrimination based on race, sex or other protected characteristics.

Lipnic, lawyers and an academic interviewed by Bloomberg BNA this summer aren’t sure current laws are adequate to address the potential discriminatory effects.

Global Search for Talent

*               *               *

Algorithms are used to search the “digital footprints” of potential job candidates, including those who haven’t applied for a job and aren’t actively seeking new employment.

This “data mining” is intended to unearth all the online information available about a candidate. The formulas use statistical matches between the characteristics of a company’s incumbent successful employees and online candidates to predict which ones would also succeed.
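To make the mechanics concrete, the following is a minimal, hypothetical sketch of that matching approach; the features, labels and scikit-learn model here are illustrative assumptions, not any vendor’s actual formula.

```python
# Hypothetical sketch: train a model on features of incumbent employees labeled
# by whether they were rated top performers, then score outside candidates by
# how closely they statistically resemble those incumbents.
# All feature names and numbers are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row is an incumbent employee: [years_experience, certifications, online_endorsements]
incumbents = np.array([
    [5, 2, 40],
    [3, 1, 12],
    [8, 4, 55],
    [1, 0, 3],
])
# 1 = rated a top performer, 0 = not. If past appraisals reflect unconscious
# bias, that bias is carried straight into these training labels.
top_performer = np.array([1, 0, 1, 0])

model = LogisticRegression().fit(incumbents, top_performer)

# Score people found online who may never have applied for the job.
candidates = np.array([[4, 2, 30], [7, 3, 50]])
print(model.predict_proba(candidates)[:, 1])  # predicted "resembles a top performer" score
```

Because a model like this is fit only to the employer’s current workforce, whatever patterns appear in those records, including any embedded bias, are what it learns to reward.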

Multiple Levels of Risk

The factors put into the formula may be statistically correlated with job success, but they aren’t necessarily causally related to job performance, said Pauline Kim, a law professor at Washington University in St. Louis. Algorithms may be used to predict things such as job retention or odds of a workplace injury, which differ from the traditional employment test, Kim told Bloomberg BNA.

The algorithms may not actually be measuring an individual’s ability to perform the job, she said.

They also may lead employers to replicate their current workforce’s demographics. Searching for people who resemble a company’s top-rated performers may perpetuate existing under-representation of women, racial minorities or other protected groups. If performance appraisals are affected by unconscious bias, that bias might be baked into the algorithm.

It’s also possible certain identifiable groups of people have less of a “digital footprint” than others and won’t be discovered by models that scan the Internet for potential job candidates.

Employers contemplating the use of algorithms should be clear on what they’re trying to measure, whether their model actually measures those qualities and what the potential impacts on protected groups might be, Kim said.

Once they apply the formula, employers should retain their data and audit the results for potential bias, she said.
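One common audit is the “four-fifths rule” of thumb from the federal Uniform Guidelines: compare each group’s selection rate with the most-favored group’s rate. The sketch below applies it to invented counts purely for illustration.

```python
# Hypothetical four-fifths-rule check on an algorithm's output.
# Group names and counts are invented for illustration only.
outcomes = {
    "group_a": {"considered": 400, "selected": 120},
    "group_b": {"considered": 300, "selected": 60},
}

rates = {g: v["selected"] / v["considered"] for g, v in outcomes.items()}
highest_rate = max(rates.values())

for group, rate in rates.items():
    impact_ratio = rate / highest_rate
    flag = "possible adverse impact" if impact_ratio < 0.8 else "ok"
    print(f"{group}: selection rate {rate:.2f}, impact ratio {impact_ratio:.2f} -> {flag}")
```

An audit of this kind is possible only if the employer has retained the underlying selection data, which is one reason Kim advises keeping it.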

Are Current Laws Sufficient?

Some of those interviewed by Bloomberg BNA questioned whether traditional legal analysis under Title VII of the 1964 Civil Rights Act is adequate to handle the emerging issues presented by employers’ use of algorithms.

Workers excluded by an online selection device could allege disparate treatment or intentional discrimination, but the more likely claim is disparate impact. That requires a plaintiff to identify a specific employment practice that has a disproportionate adverse impact on individuals belonging to a protected class.

If disparate impact is shown, an employer can defend the practice as “job related” and “consistent with business necessity.” Even if an employer makes that showing, a plaintiff still could prevail under Title VII if he shows there’s a selection device with less discriminatory impact that would achieve the business-related objectives.

Some of those steps are problematic when the selection device is an algorithm with elements that may be a mystery even to the employer or programmer.

*               *               *

Discovering the elements of the algorithm could be difficult because vendors might claim trade secret protection for their proprietary formulas. Programmers also may be unable to identify what variable within an algorithm is producing discriminatory effects, Kim said.

One solution would be to identify the entire algorithm as the employment practice producing the alleged disparate impact, said Adam Klein, a plaintiffs’ attorney with Outten & Golden in New York.

The affected worker shouldn’t have to deconstruct an algorithm that even the employer might not fully understand, he said during an American Bar Association webinar June 21.

Validating Tools

The traditional method of showing business necessity is to validate the selection device as job-related, under the federal government’s Uniform Guidelines on Employee Selection Procedures, issued in 1978 and unchanged since then.

*               *               *

The traditional notions of test validation can be applied to algorithms, plaintiffs’ attorney Klein said.

It’s an “employment practice” to use the algorithm, which functionally is a test “like any traditional tool,” he said during the ABA webinar. Provided the output data are available, Klein said a plaintiffs’ statistical expert could do a “shortfall analysis” on how many members of the protected group were selected compared with how many should have been chosen absent discrimination.
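A minimal, hypothetical version of that shortfall calculation follows; the pool sizes and selection counts are invented, and a real expert analysis would also test whether any shortfall is statistically significant.

```python
# Hypothetical shortfall analysis: compare actual selections of a protected
# group with the number expected if selections mirrored the group's share of
# the qualified pool. All numbers are invented for illustration.
pool_total = 10_000          # qualified people in the pool
pool_protected = 3_000       # protected-group members in that pool
selected_total = 1_000       # total selections the algorithm produced
selected_protected = 220     # protected-group members actually selected

expected_protected = selected_total * (pool_protected / pool_total)
shortfall = expected_protected - selected_protected

print(f"Expected selections absent discrimination: {expected_protected:.0f}")
print(f"Actual selections: {selected_protected}")
print(f"Shortfall: {shortfall:.0f}")
```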

“Very large pools of data aren’t new either,” Klein said. For example, he said a recently settled discrimination case against the U.S. Census Bureau regarding its criminal background check policy involved 4 million applicants, with about 1 million hired and 800,000 subjected to background checks.

All the relevant concepts for analyzing an algorithm’s potentially discriminatory effects are “pretty established” under Title VII and the UGESP, Klein said.

*               *               *

EEOC in ‘Learning Phase’

Another concern in measuring potential discrimination from use of algorithms is whether an employer kept the relevant data, the EEOC’s Lipnic told Bloomberg BNA.

Under Title VII, employers generally must keep applicant records for a considerable period after hiring decisions are made. Compliance with that obligation could be an issue in this context, Lipnic said. The EEOC generally pursues record-keeping violations only as a complement to a substantive discrimination claim, not as stand-alone cases, she said.

The EEOC in March 2014 held a public meeting on the impacts of social media on discrimination issues. Since then, all the agency’s offices have been educating themselves on the potential bias issues raised by employers’ reliance on algorithms and other online tools, Lipnic said.

“Everyone’s definitely in a learning phase,” she said.

The agency is “very much trying to understand what is happening,” what’s being created by employers and how it’s being done, Lipnic said. The issues discussed at the EEOC’s 2014 meeting were the “tip of the iceberg” compared with what employers are doing today, she said.

*               *                 *

Alternative Approach Suggested

The EEOC and the courts also could consider an alternative way to analyze discrimination claims under Title VII, distinct from the disparate treatment and disparate impact paradigms, Kim said.

In a draft law review article, Kim suggested a new approach based on “classification” language found in Section 703(a)(2) of the act.

That provision makes it an unlawful employment practice for an employer to “classify” employees or job applicants in any way that would “deprive or tend to deprive” any individual of employment opportunities because of race, color, religion, sex or national origin.

Considering the issues raised by employers’ use of big data through a prism of “classification bias” might be better than trying to shoehorn them into disparate impact analysis, Kim said.

“Disparate impact analysis is a particularly poor fit for addressing the types of harms potentially caused by workplace analytics,” Kim said in her paper. “Rather than providing specific criteria which are justified by clearly stated employer rationales, data models typically involve opaque decision processes, rest on unexplained correlations and lack clearly articulated employer justifications,” she said.

“When an algorithm relies on seemingly arbitrary characteristics or observed behaviors interacting in some complex way to predict job performance, the claim that it is ‘job related’ often reduces to the fact that there is an observed statistical correlation,” she wrote. “If a statistical correlation were sufficient to satisfy the defense of job-relatedness, the standard would be a tautology rather than a meaningful legal test. In order to protect against discriminatory harms, something more must be required to justify the use of an algorithm that produces biased outcomes.”

Under her proposed Title VII analysis, an employer would bear the burden of establishing the algorithm’s validity and it wouldn’t be sufficient to show a statistical correlation exists.

A “bottom line defense” might make sense for employers using algorithms, Kim said. “Because of the difficulty of isolating the effect of particular variables, it will often make sense to treat the algorithm as an undifferentiated whole,” she wrote. “And if its operation does not disproportionately exclude members of protected groups, then it is difficult to identify a discriminatory harm in the absence of any motive directed against particular individuals.”

In any event, she said, the law “will have to depart from traditional disparate impact doctrine in significant ways if it is to respond effectively to these challenges.”

“Whether the discussion is framed in terms of ‘classification bias’ or a revised disparate impact theory, the critical point is to recognize that data analytics are fundamentally different from the employer practices subject to challenge in earlier cases,” Kim wrote. “It is certainly possible to interpret Title VII in ways better suited to meet those differing threats to workplace equality.”