How Neuroscience is Reforming Criminal Justice

New research into how the brain works is contributing to innovative strategies for reducing recidivism and developing alternatives to incarceration.

In the courtroom, testimony or evidence about abnormalities or damage to a defendant’s brain has been used to assess the level of responsibility for criminal behavior. But new research into how the brain works is contributing to innovative strategies for reducing recidivism and developing alternatives to incarceration.

The Mind Research Network, a non-profit based in Albuquerque, N.M., has been on the forefront of discovering how the brains of psychopaths and violent offenders differ from the average person’s.

Psychopaths make up a substantial part of the prison population and are 20 to 25 times more likely to be in prison than non-psychopaths.

Dr. Kent Kiehl, a lead researcher for the network, says the research can help target appropriate treatment, for example for youths who have demonstrated violent behavioral traits.

“This will improve our ability to predict which kids are high-risk, and how to individually tailor treatment to help kids change,” he told The Crime Report.

Using a portable MRI machine, Kiehl and his team studied and scanned the brains of roughly 4,000 violent juvenile and adult offenders from 10 prisons in two states over the last decade. The process yielded the largest neuroscientific database of violent offenders in the world.

One focus of the research was to examine the differences in the brains of juveniles who have committed homicides and juveniles who haven’t.

“Scanning is the easy part,” said Kiehl about the intensive process that goes into examining each inmate he studies.

In addition to scanning, Kiehl and his team conducted intensive clinical interviewing which examined IQ, past trauma and socioeconomic history, as well as personality.

Kiehl and the Mind Research Network have partnered with the Today=Tomorrow Program at Mendota Juvenile Treatment Center (MJTC) in Madison, Wi., a cognitive behavioral treatment program that tries to help juvenile offenders with psychopathic traits. According to its website, the program does so by educating “youth of the connection between their thoughts, attitudes, and emotions to their behaviors; to identify ‘thinking barriers’ and substitute responsible thinking, and to increase pro-social thinking and skills through modeling and role-play practices.”

The Today=Tomorrow Program at the MJTC has garnered some attention in recent years and has been covered in-depth by several outlets like NPR and The Atlantic.

Similarly, a study by the Douglas County Juvenile Department in Wisconsin found an 85 percent decrease in recidivism after one year among 48 subjects who went through the program, and a 94 percent decrease in recidivism after two years with a smaller sample size of 12 subjects.

The program is not likely to make these troubled juveniles into model citizens, but it tries to teach them a practical form of empathy that can help them avoid the impulse to commit violent or criminal acts.

Kiehl has been scanning subjects in the program three times during the process of treatment to understand mechanisms of change in those who don’t come back for repeat offenses.

Research like Kiehl’s is helping neuroscientists map out which brain regions should be targeted for treatment that will translate to improved behavioral and life outcomes.

This approach has been likened to working on a muscle that has atrophied from not being used.

“That’s the holy grail, to show which therapies and treatments adjust and help these circuits adapt,” Kiehl said.  “What is the brain mechanism of change, and is it sustainable?

“That’s what we’re working on now.  And if we figure that out we can get carefully derived measure of treatment efficacy.”

The concept of brain age and maturity is at the heart of Kiehl’s work with juveniles and psychopaths. Neuroscience may offer a more accurate gauge of how mature a person’s brain is than numerical age does.

Kiehl says that this is the essence of neuro-prediction.

“If you can measure the brain components that predict something rather than a proxy, like impulsivity, that circuit will indicate someone’s future impulsive behavior better than self-reports or other ways of measurements,” he said.

“Brain age is a better predictor if you re-offended than your date of birth.”

Drawing conclusions about people’s behavior based on brain imaging presents challenges and pitfalls for neuroscientists.

One danger involves a term called reverse inference, which can amount to researchers overestimating how much a certain area of the brain is involved in or responsible for determining a specific behavior or cognitive process.

Kiehl argues that having strong data that can allow for good predictive power helps lessen the need for interpretation, and therefore reduces the possibility of errors that stem from things like reverse inference.

Scientists may disagree about the exact function and role of the amygdala, for example, which has been linked to fear and aggression. But, according to Kiehl, “an amygdala deficit is an amygdala deficit.  We can academically argue about the interpretation, but what’s really important is the data.”

Other than the obvious benefit of reducing future violent acts and the damage that ripples from them, the kind of treatment offered at the MJTC can ultimately be much cheaper than it is to incarcerate people in the long run if reoffending can be reliably reduced.

A 2006 study in the Journal of Research in Crime and Delinquency found, “Over the 4.5 year follow-up, the return on the investment in the MJTC amounted to over 700 percent.”

Dr. Daniel Martell, a forensic expert at Park Dietz & Associates, and assistant clinical professor at the David Geffen School of Medicine at U.C.L.A., says that even though neuroscience is starting to tackle problems that were once thought to be unsolvable, like how psychopathy can be effectively treated, it has a long way to go before it’s ready for widespread implementation.

“We get great findings, but the problem is getting researchers to actually replicate those findings,” said Martell.

“We don’t really even have first generation studies to be replicated, so that’s where people like Kiehl are contributing.”

Martell went on to say that in terms of overall progress, “we’re still crawling.”

Dr. Francis Shen, an associate professor of law at the University of Minnesota who specializes in what’s called neurolaw, thinks that neuroscience will need to work in tandem with other developing sciences, such as genetics and psychology, in order to make the most valuable contributions to law and other fields.

Shen notes that neuroscience presently doesn’t show a lot of new ways to successfully alter the brain, and it is currently best thought of as a tool to aid in behavioral interventions, not only for juveniles but in other areas of law as well, such as poverty.

Shen used the example of poverty law to describe how neuroscience can make contributions to data we already have from other disciplines about the effects poverty has on people.

“We are starting to open the black box,” Shen said in an interview with The Crime Report. “We don’t need neuroscience to tell us that poverty’s bad, but neuroscience will let us understand the mechanisms allow for earlier and more targeted interventions and reframe discussion for policies.”

As mentioned above, neuroscience offers opportunities to extrapolate brain data and make claims about human behavior that aren’t justified.

However, Shen and Kiehl have both pointed out that neuroscience, as well as other human sciences, usually makes predictions based on a spectrum.

Neuroscience deals in probabilities, and tries to predict the likelihood someone will behave a certain way.

Someone who is diagnosed as highly psychopathic may never actually commit a violent crime, although the probability they will is higher compared to the average person.

While one should always be wary of both the past mistakes that have been made in the name of brain science, and the obstacles that lie ahead, the work mentioned by Kiehl and others is trying to up-end the determinism that many fear results from a neuroscientific perspective of behavior.

Dane Stallone is a TCR news intern. He welcomes comments from readers.


Civil Rights Advocates Say Risk Assessment May ‘Worsen Racial Disparities’ in Bail Decisions

More than 100 civil rights, “digital justice” and community groups issued a statement expressing concerns about the expanding use of risk assessment instruments as a substitute for basing bail releases on money. The groups said risk assessment tools may not only exacerbate racial bias but “allow further incarceration.”

More than 100 civil rights, “digital justice” and community groups have joined in a statement expressing concerns about the expanding use of risk assessment instruments as a substitute for basing bail releases on money.

The organizations said Monday that risk assessment, which they termed “algorithmic-based decision-making,” may “worsen racial disparities and allow further incarceration.”

Many critics of the money bail system argue that risk assessment is a superior method of advising judges on whether to release a suspect pending disposition of a case. They say that the decision should be based more on science than on how much money the defendant can pay to gain release.

Risk assessment tools use data to forecast a person’s likelihood of appearance at future court dates and the risk of re-arrest.

Instead of risk assessment, the critical groups urge criminal justice leaders to “reform their systems to significantly reduce arrests, end money bail, severely restrict pretrial detention, implement robust due process protections, preserve the presumption of innocence, and eliminate racial inequity.”

The groups maintain that courts can ensure that people are not jailed unnecessarily without using risk assessment tools.

“America’s pretrial justice system is broken,” said Vanita Gupta, president of The Leadership Conference Education Fund. “If our goals are to shrink the justice system and end racial disparities, we can’t simply end money bail and replace it with risk assessments.”

Gupta headed the U.S. Justice Department’s civil rights division during the Obama administration.

The groups’ critical statement echoes concerns expressed in 2014 by then-Attorney General Eric Holder about the use of risk assessments by judges to help make sentencing decisions.

“By basing sentencing decisions on static factors and immutable characteristics – like the defendant’s education level, socioeconomic background, or neighborhood – they may exacerbate unwarranted and unjust disparities that are already far too common in our criminal justice system and in our society,” Holder said in a speech to the National Association of Criminal Defense Lawyers.

Among the leading advocates of risk assessment is the Texas-based Laura and John Arnold Foundation, which said this spring that it planned to expand access to its Public Safety Assessment (PSA) “dramatically” and broaden the level of research on its use and effectiveness.

Since a version of risk assessment developed by the foundation was launched in 2013, more than 600 jurisdictions have expressed interest in using it.

“This intense level of interest reflects the nationwide momentum favoring evidence-based pretrial decisions,” the foundation says, adding that the system is aimed at addressing “the inequity in the system that causes the poor to be jailed simply because they’re unable to make bail.”

As of April, the foundation said its assessment tool was used by about 40 cities, counties and states.

The foundation said that over the next five years, pretrial researchers will work with 10 diverse jurisdictions, which will receive training, technical expertise and help implementing pretrial risk assessments locally.

The advocacy group statement on Monday argued that because “police officers disproportionately arrest people of color, criminal justice data used for training the tools will perpetuate this correlation.”

The groups said the “main problem that has caused the mass incarceration of innocent people pretrial is the detention of individuals perceived as dangerous (‘preventive detention’). Though the Constitution requires that this practice be the rare and ‘carefully limited exception,’ it has instead become the norm. Risk assessment tools exacerbate this issue by relying upon a prediction of future arrest as a proxy for so-called ‘danger.’ ”

Some developers of risk assessment tools have refused to make public the details of their design and operation.

Monday’s statement by critical groups declared that, “The design of pretrial risk assessment instruments must be public, not secret or proprietary.”

Among the groups signing the statement were the American Civil Liberties Union, the Drug Policy Alliance, the Leadership Conference on Civil and Human Rights, the NAACP, the National Employment Law Project, and the Prison Policy Initiative.

In response to the criticism, the Arnold Foundation said that the groups’ statement “misconstrues the role of risk assessments.”

Risk assessments “do not make pretrial release decisions or replace a judge’s discretion,” the foundation said. “They provide judges and other court officers with information they can choose to consider—or not—when making release decisions.

“We believe—and early research shows—that this type of data-informed approach can help reduce pretrial detention, address racial disparities and increase public safety.”

Nicholas Turner, president of the Vera Institute of Justice, which has pursued bail reform since the 1960s, said that Vera agrees with the critics’ goals but did not sign the statement because, “We help implement risk assessments when they will improve upon the often standardless and arbitrary regimes that exist in much of America.”

Turner agreed that risk assessments are not a panacea for inequities in the bail system.

Ted Gest is president of Criminal Justice Journalists and Washington bureau chief of The Crime Report. Readers’ comments are welcome.


How ‘Pseudo-Science’ Turns Sex Offenders into Permanent Outlaws

A risk assessment tool used for two decades to assess sex offenders’ likelihood of committing a future offense has been repeatedly exposed as “pseudo-scientific humbug.” So why do New York State courts continue to use it?

A New York appeals court has rejected the notion that risk prediction under the state’s Sex Offender Registration Act (SORA) should have a scientific basis. According to the July 2017 decision in People v. Curry, courts must not only adhere to a risk assessment instrument (RAI) that has been repeatedly exposed as pseudo-scientific humbug, they may not even consider a scientifically validated instrument such as the Static-99.

It wasn’t the first time. For the 20 years since SORA was enacted, courts have used the RAI to classify individuals after they’ve completed their sentences for a designated “sex offense.” The classifications purport to show the person’s likelihood of committing another sex offense in the future.

Persons adjudicated as level 2 or 3 are thought to be very dangerous indeed, and must register with law enforcement for the rest of their lives.

Their photographs, addresses, and a description of the past offense are made publicly available online at the sex offender registry. They may legally be denied jobs and housing, including shelters. They may be evicted, fired or hounded from the neighborhood by civic-minded vigilantes such as Parents for Megan’s Law.

This looks an awful lot like advance punishment for a future crime, like the science fiction film “Minority Report.” It also looks like a second punishment for a past offense—a practice the Constitution frowns on in the Double Jeopardy Clause.

Not at all, say the courts. SORA isn’t punishment, but merely a regulatory measure to protect public safety. As one legislator put it, it’s like affixing warning labels to toxic substances.

In that case, you’d think everyone would be deeply concerned to make sure that the label is as accurate as possible. It hardly contributes to public safety to broadcast over the Internet that Mr. Jones might commit a sex offense at any minute, when in fact he presents no such risk.

But that’s not how courts think.

Risk level under SORA is determined through an adversarial hearing in criminal court where the prosecutor proffers the RAI and typically seeks the highest possible classification. The RAI is a chart, cobbled together by employees of the Department of Parole, that adds up points for factors such as whether the past offense involved contact over or under clothing, or whether the victim was under age 11 or over 62.

The more points, the higher the risk level.
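The additive logic of a chart like this can be sketched in a few lines of Python. This is only an illustration: the factor names, point values and cut-offs below are invented and are not taken from the actual RAI.

```python
# Hypothetical factors and point values, invented for illustration only.
# They are NOT the actual RAI items, weights or cut-offs.
HYPOTHETICAL_POINTS = {
    "contact_under_clothing": 10,
    "victim_under_11_or_over_62": 30,
    "prior_sex_offense": 30,
}

# Hypothetical cut-offs as (minimum total, level), checked from highest to lowest.
HYPOTHETICAL_LEVELS = [(50, 3), (25, 2), (0, 1)]


def total_points(factors_present):
    """Sum the points for every factor marked as present."""
    return sum(HYPOTHETICAL_POINTS[name] for name in factors_present)


def risk_level(points):
    """Return the first level whose minimum total the score reaches."""
    for minimum, level in HYPOTHETICAL_LEVELS:
        if points >= minimum:
            return level
    raise ValueError("points must be non-negative")


if __name__ == "__main__":
    score = total_points(["contact_under_clothing", "victim_under_11_or_over_62"])
    print(score, risk_level(score))
```

Every input in a sketch like this describes the past offense. Nothing in the arithmetic tests whether a higher total actually goes with a higher rate of re-offending, which is the job of validation.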

Defense attorneys have repeatedly proffered peer-reviewed research and the uncontested expert testimony of psychologists specializing in sex offender recidivism showing that the RAI is based on the facile but discredited assumption that “if he did it before he’ll do it again.” The instrument takes no account of the scientific consensus that recidivism isn’t correlated to the perceived heinousness of the past offense.

The scientific articles cited by the RAI are not only outdated; they don’t remotely stand for the conclusions for which they’re cited. Although the RAI purports to be an objective scientific instrument, it uses its own idiosyncratic system of assigning and weighing points that’s heavily biased towards a finding of maximum risk.

We’ve proffered instruments such as the Static-99 and the SVR-20 which, unlike the RAI, have been tested and validated by mental health professionals. In contrast, nobody except New York judges and District Attorneys uses the RAI.

The judicial response ranges from numb indifference to sputtering indignation. The outstanding exception is Daniel Conviser, a trial judge in Manhattan, who issued a 100-page opinion in 2010 after hearing expert testimony. After analyzing the RAI in detail, he concluded that the instrument is so arbitrary that it violates due process. Unfortunately, his decision isn’t binding on other courts and has been ignored.

The crystal ball approach to risk assessment. Illustration by Squawk

It’s like a drug test that can’t tell the difference between coffee and cocaine.

Even courts that recognize that the RAI may not be “the optimal tool” initially reasoned that there’s no harm in using it because it’s “only a recommendation.” But the Court of Appeals subsequently held that the RAI is so “presumptively reliable” that courts are bound by its conclusions unless the defendant can somehow prove that it overestimates his future risk.

The obvious course, until now, was for the defendant to show that a scientifically tested and validated instrument such as Static-99 put him at a lower risk. No dice, says the Appellate Division. Why? Because although the Static-99 measures the probability of re-offending, it doesn’t say what offense the person will commit if he re-offends.

Which conveniently ignores that no matter what the RAI claims, it doesn’t accurately predict anything.

It’s hard to see how this implacable rejection of science squares with the notion that SORA isn’t punishment but merely a regulatory measure to protect public safety. So long as risk prediction is based on the perceived heinousness of the past crime, it’s nothing but punishment under an alias.

There are now over 40,000 New Yorkers on the sex offender registry, most of whom have been adjudicated as level 2 or 3 based on the RAI. Public safety isn’t served by creating a permanent, ever-growing underclass of people who will remain forever barred from normal civic life based on a pseudo-scientific instrument.

Appellate Squawk is the pseudonym of an appellate attorney in New York City, and the author of a satirical legal blog of that name. Readers’ comments are welcomed.


Why I Am Not a Recidivist

A Washington State parole board rejected our columnist’s appeal for release from prison for a crime committed when he was a juvenile on the grounds that he had a “moderate to high” likelihood of re-offending. But they appear to have based the decision on a psychological risk assessment tool used to measure adult offenders.

Across the United States, there are hundreds of prisoners serving sentences of life without the possibility of parole for crimes committed when they were juveniles, but who now have an opportunity to be freed from newly imposed indeterminate sentences once they complete lengthy minimum terms of confinement. I am one of them.

Call us the Miller family. (After the 2012 Supreme Court ruling in Miller v. Alabama, which determined that imposing a life without parole sentence on a juvenile violated constitutional protections from cruel and unusual punishment.)

Jeremiah Bourgeois

My original sentence was imposed for crimes that I committed when I was 14. However, in light of the Court ruling, the Washington State legislature gave prisoners like me the opportunity to be freed, provided that we are deemed by the parole board to be unlikely to “commit new criminal law violations if released.”

I must admit I rejoiced at this news after serving 20 years of a natural-life sentence. Yet as I moved closer to completing my newly imposed minimum term, I came to realize that the light at the end of the tunnel might actually be a train: my former cellmate, Anthony Powers, was denied parole even though, to many in the know, he was a model of reform.

Take the Deputy Secretary of the Department of Corrections (DOC), for example. Prior to the parole hearing, he wrote to Powers declaring:

I recognize your contributions to making Washington State prisons safer for both offenders and staff. Your efforts have made a difference. I also believe those efforts will continue to make a difference for the men that are released back into the community [ ] I encourage you to continue to be a role model for other offenders. You have made a difference in many lives.

Nevertheless, when Powers later underwent the requisite psychological assessment to determine whether he posed a recidivism risk, the conclusion was that he posed a high risk to reoffend.

This made me wary—for the arc of our lives had striking similarities. I too had committed a heinous crime when I was a teen. Therefore, to my mind, if it could be said that “a role model for other offenders” posed a risk to public safety, surely the same could be said for me.

My history provided all the elements necessary to craft a narrative to support keeping me confined, permanently, or setting me free—notwithstanding the results of a potentially negative psychological risk analysis.

Quite simply, there was the good, the bad, and the ugly.

The case for freedom could be summarized as: “I used to be dangerous. Now I can effectively speak in public. I can present cogent legal arguments. I am a columnist.”

An account of my history confined could emphasize:

I had spent almost a decade doing little more than fighting prisoners and assaulting guards, until I somehow found the strength to turn my anger into something positive. Now I write term papers and legal briefs that benefit both me and others confined with me [ ] No longer confined to an existence that the prison subculture glorifies, my intellect rather than ruthlessness is the basis for self-respect. This is the essence of rehabilitation.

Were this the parole board’s conception of me, undoubtedly I would be freed.

This is the narrative that I tried to focus upon to prevent being consumed by worry over psychological methodologies that were, quite frankly, a mystery to me. But worrying was becoming all too easy. In doing research to understand the legal landscape governing the authority vested in parole boards, the case law that I read further unsettled me.

Consider the law.

Across the US, the release of a prisoner who is serving an indeterminate life sentence is often “subject entirely to the discretion of the Board, which may parole him now or never.” Therefore, a prisoner has an opportunity to be freed—but he may never have an opportunity to be free.

As for determining whether a prisoner is rehabilitated, parole boards assess “a multiplicity of imponderables, entailing primarily what a man is and what he may become rather than simply what he has done.”

Thus, parole can be denied “for a variety of reasons” that involve nothing more than “informed predictions as to what would best serve [correctional goals] or the safety and welfare of the inmate.”

All of this reading was chilling. Given the “multiplicity of imponderables” involved in this decision making, it seemed parole boards could do damn near anything.

Although the standard for parole eligibility is less discretionary when (as here) the governing statutes require prisoners to be freed unless a preponderance of the evidence shows that a disqualifying condition is present, in the final analysis, how a parole board weighs the evidence is entirely subjective.

Educated guesses and static risk assessments are all that most parole boards are left with. As a consequence, little has changed in the 50 years since the Washington Supreme Court gave voice to the mindset of parole boards:

[A]lthough releasing a convicted felon on parole may be beneficent and rehabilitative and in the long run produce a social benefit, it is also a risky business. The parole may turn loose on society individuals of the most depraved, sadistic, cruel and ruthless character who may accept parole with no genuine resolve for rehabilitation nor to observe the laws and customs promulgated by the democratic society, which in the process of self-government granted the parole.

This raises the question: How can a parole board with any degree of certainty utilize a rational means to separate prisoners who are “depraved, sadistic, cruel and ruthless” from those who pose little risk to public safety?

Psychological evaluations to measure a prisoner’s recidivism risk are one way to go about the process. In fact, they are mandated for Washington State prisoners affected by Miller v. Alabama and its progeny.

Prisoners just like me.

Stafford Creek Corrections Center, Aberdeen, Wa., where Jeremiah Bourgeois is currently serving a sentence of 25 years to life. Photo courtesy Washington State Dept. of Corrections

Which leads us back to my pre-parole hearing wariness about psychological risk assessments.

On which side of the coin would I fall after undergoing such an analysis?

Rehabilitated or likely recidivist?

This question was resolved for me on Nov. 7, 2017, when the Indeterminate Sentence Review Board informed me of the following:

“The Board commends Mr. Bourgeois for completing a significant amount of programming. However the Board has determined that he does not meet the statutory criteria for release at this time for the following reasons. Mr. Bourgeois has been assessed in his most recent psychological evaluation at a ‘Moderate to High’ risk to reoffend. Additionally, he has a history of serious violence while in prison, to include two felony assaults against Corrections Officers during his prison stay. Also, Mr. Bourgeois’ offense is particularly heinous as it was a revenge killing against victims of a crime for which they had been willing to testify in court to assist in securing a conviction of their perpetrator, Mr. Bourgeois’ brother.”

And that was the end for me: The parole board took note of the good, but was primarily influenced by the bad—and ugly.

Since this decision was reached, I have come to understand the methodology behind the DOC psychologist’s finding that I am a “Moderate to High risk to reoffend” if conditionally released. Indeed, my discovery gives insight into the difficulty in assessing the recidivism risk of those who have spent decades confined for crimes that they committed when they were minors.

Since there is no large-scale data specific to the parole outcomes of prisoners like me, psychologists within DOC rely upon the Violence Risk Appraisal Guide (VRAG) which was constructed and validated on a cohort comprised mostly of white Canadian male forensic patients.

Further, its revised edition (VRAG-R) relies upon a sample of individuals who, for the most part, either pleaded or were found not guilty by reason of insanity and spent an average of four years imprisoned.

The VRAG-R is designed to measure the risk of future violence by those who committed their instant offense when they were adults, not adolescents and, as Dr. John Monahan, a preeminent expert on risk assessments, explains:

[T]here comes a point at which the sample to which an actuarial instrument is being applied appears so fundamentally dissimilar to the sample on which it was constructed and originally validated [ ] that one would be hard pressed to castigate the evaluator who took the actuarial estimate as advisory rather than conclusive.

The VRAG-R scoring sheet, for instance, gives higher points if a person did not live with their parent(s) until they were at least age 16, are unmarried, and their crime(s) took place before they were age 26. These strikes are therefore baked in the cake when assessing those who are confined as adolescents because, ultimately, the assessment does not account for the fact that “children are different.”

Notwithstanding the efficacy of utilizing the VRAG-R to assess the potential risk I pose to public safety—as I said in the beginning—my history provided the means for crafting a narrative to support keeping me confined permanently, or setting me free.

In this instance, I just happened to fall within the category of those believed to be cloaking their criminogenic propensities.

I am still coming to terms with the notion that I am a likely recidivist.

I don’t know if I will be able to get over this.

Viktor E. Frankl, in Man’s Search for Meaning, observed that every case of suicide may not be “undertaken out of a feeling of meaninglessness, [but] it may well be that an individual’s impulse to take his life would have been overcome had he been aware of some meaning and purpose worth living for.”

I know exactly what he means.

Having been denied parole after 25 years of confinement for crimes committed when I was 14 years old, I can now envision the day when all I will have to live for is writing my monthly columns for The Crime Report.

Jeremiah Bourgeois is a regular contributor to TCR, and an inmate in Washington State, where he has been serving a life sentence since the age of 14. He welcomes comments from readers. Those who wish to express their opinion regarding the decision to deny his release can contact the Indeterminate Sentence Review Board.


New Research Casts More Doubt on Risk Assessment Tools

Two computer scientists, writing in the journal “Science Advances,” say the two-decade-old COMPAS system “is no more accurate or fair than predictions made by people with little or no criminal justice expertise.” Over the past two decades, the program has been used to assess more than one million criminal offenders.

Two computer scientists have cast more doubt on the accuracy of risk assessment tools.

After comparing predictions made by a group of untrained adults to those of the risk assessment software COMPAS, authors found that the software “is no more accurate or fair than predictions made by people with little or no criminal justice expertise,” and that, moreover, “a simple linear predictor provided with only two features is nearly equivalent to COMPAS with its 137 features.”

Julia Dressel, a software engineer, and Hany Farid, a computer science professor at Dartmouth, concluded, in a paper published Tuesday by Science Advances, that “collectively, these results cast significant doubt on the entire effort of algorithmic recidivism prediction.”

COMPAS, short for Correctional Offender Management Profiling for Alternative Sanctions, has been used to assess more than one million criminal offenders since its inception two decades ago.

In response to a May 2016 investigation by ProPublica that concluded the software is both unreliable and racially biased, Northpointe, the company that developed COMPAS, defended its results, arguing the algorithm discriminates between recidivists and non-recidivists equally well for both white and black defendants. ProPublica stood by its own study, and the debate ended in a stalemate.

Rather than weigh in on the algorithm’s fairness, the authors of this study simply compared the software’s results to those of “untrained humans,” and found that “people from a popular online crowdsourcing marketplace—who, it can reasonably be assumed, have little to no expertise in criminal justice—are as accurate and fair as COMPAS at predicting recidivism.”

Each of the untrained participants was randomly assigned 50 cases from a pool of 1,000 defendants and given a few facts, including the defendant’s age, sex and criminal history, but excluding race. They were asked to predict the likelihood of re-offending within two years. The mean and median accuracy of these “untrained humans” were found to be 62.1% and 64%, respectively.

The authors then compared these results to COMPAS predictions for the same set of 1,000 defendants, and found the program to have a median accuracy of 65.2 percent.
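
For readers who want to see what “mean and median accuracy” means in practice, here is a minimal Python sketch. The outcomes and the three participants are invented toy data, not figures from the study, and the function name `accuracy` is a hypothetical helper. For simplicity every toy participant judges the same five cases, whereas the study assigned each participant a different set of 50.

```python
from statistics import mean, median

def accuracy(predictions, outcomes):
    """Fraction of cases where the yes/no prediction matched what happened."""
    if len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must be the same length")
    correct = sum(p == o for p, o in zip(predictions, outcomes))
    return correct / len(outcomes)

# Invented toy data: True means "re-offended within two years".
outcomes = [True, False, False, True, False]
participants = {
    "participant_a": [True, False, True, True, False],   # 4 of 5 correct
    "participant_b": [False, False, False, True, True],  # 3 of 5 correct
    "participant_c": [True, True, True, False, True],    # 1 of 5 correct
}

# One accuracy figure per participant, then summarize across participants.
scores = [accuracy(preds, outcomes) for preds in participants.values()]
mean_accuracy = mean(scores)
median_accuracy = median(scores)

print(f"mean accuracy:   {mean_accuracy:.3f}")
print(f"median accuracy: {median_accuracy:.3f}")
```

The median is less sensitive than the mean to one unusually poor guesser, which is one reason a study might report both.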

These results caused Dressel and Farid to wonder about the software’s level of sophistication.

Although they don’t have access to the algorithm, which is proprietary information, they created their own predictive model with the same inputs given to participants in their study.

“Despite using only 7 features as input, a standard linear predictor yields similar results to COMPAS’s predictor with 137 features,” the authors wrote. “We can reasonably conclude that COMPAS is using nothing more sophisticated than a linear predictor or its equivalent.”
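
To give a sense of how little machinery a “simple linear predictor” needs, here is a minimal Python sketch of a two-feature logistic regression trained by plain gradient descent. It is an illustration only, not the authors’ model: the eight training rows are invented, and the choice of age (in decades) and number of prior convictions as the two features is an assumption made for this example.

```python
import math

def sigmoid(z):
    """Squash a linear score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit score = w0 + w1*x1 + w2*x2 by batch gradient descent on log loss."""
    weights = [0.0, 0.0, 0.0]  # bias, weight for feature 1, weight for feature 2
    n = len(rows)
    for _ in range(epochs):
        grads = [0.0, 0.0, 0.0]
        for (x1, x2), y in zip(rows, labels):
            error = sigmoid(weights[0] + weights[1] * x1 + weights[2] * x2) - y
            grads[0] += error
            grads[1] += error * x1
            grads[2] += error * x2
        weights = [w - lr * g / n for w, g in zip(weights, grads)]
    return weights

def predict(weights, x1, x2):
    """Return True when the predicted probability of re-offending is at least 0.5."""
    return sigmoid(weights[0] + weights[1] * x1 + weights[2] * x2) >= 0.5

# Invented toy data: (age in decades, prior convictions); label 1 = re-offended.
rows = [(2.0, 5), (2.2, 3), (1.9, 4), (2.5, 6),
        (4.5, 0), (5.0, 1), (3.8, 0), (6.0, 1)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights = train_logistic(rows, labels)
print(predict(weights, 2.1, 4))  # a young defendant with several priors
print(predict(weights, 5.0, 0))  # an older defendant with no priors
```

The whole model is three numbers: a bias and one weight per feature. That is the sense in which a predictor of this kind is “simple,” whatever data it is trained on.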

Both study participants and COMPAS were found to have the same level of accuracy for black and white defendants.

The full study, “The accuracy, fairness, and limits of predicting recidivism,” was published in Science Advances and can be found online here. This summary was prepared by Deputy Editor Victoria Mckenzie. She welcomes readers’ comments.


NYC Measure Would Probe Bias in Justice Algorithms

Critics charge that despite claims of objectivity, algorithms reproduce existing biases, disproportionately targeting people by class, race, and gender. Reformers say another New York City bill, the Right to Know Act, doesn’t go far enough.

New York City is taking steps to address algorithmic bias in city services. The City Council passed a bill that will require the city to address bias in algorithms used by the police department, courts, and dozens of city agencies, Vice reports. The bill would create a task force to figure out how to test city algorithms for bias, how citizens can request explanations of algorithmic decisions when they don’t like the outcome, and whether it’s feasible for the source code used by city agencies to be made publicly available.

Criminal justice reformers and civil liberties groups charge that despite claims of objectivity, algorithms reproduce existing biases, disproportionately targeting people by class, race, and gender. A ProPublica investigation found that a risk assessment tool was more likely to mislabel black than white defendants. Studies have found facial recognition algorithms were less accurate for black and female faces.

Critics of predictive policing—which uses statistics to determine where cops should spend time on their beats—say it reinforces existing biases and brings cops back to already over-policed neighborhoods.

Rachel Levinson-Waldman of the Brennan Center for Justice said New York’s police department refuses to disclose the source code for the predictive policing program, claiming it would help criminals evade the cops. (Three academics argue in the New York Times that even imperfect algorithms improve the justice system.) The City Council on Tuesday approved the Right to Know Act, which requires changes to day-to-day interactions between police officers and those they encounter.

The measures drew opposition from criminal justice reform groups and the city’s largest officers’ union, the New York Times reports. Reformers said the bill omitted many common street encounters, including car stops and questioning by officers in the absence of any reasonable suspicion of a crime.


Fewer Prisoners, Less Crime? The Elusive Promise of Algorithms

Early evidence suggests some risk assessment tools offer promise in rationalizing decisions on granting bail without racial bias. But we still need to monitor how judges actually use the algorithms, says a Boston attorney.

Next Monday morning, visit an urban criminal courthouse. Find a seat on a bench, and then watch the call of the arraignment list.

Files will be shuffled. Cases will be called. Knots of lawyers will enter the well of the court and mutter recriminations and excuses. When a case consumes more than two minutes you will see unmistakable signals of impatience from the bench.

Pleas will be entered. Dazed, manacled prisoners—almost all of them young men of color—will have their bails set and their next dates scheduled.

Some of the accused will be released; some will be detained, and stepped back into the cells.

You won’t leave the courthouse thinking that this is a process that needs more dehumanization.

But a substantial number of criminal justice reformers have argued that if the situation of young men facing charges is to be improved, it will be through reducing each accused person who comes before the court to a predictive score that employs mathematically derived algorithms which weigh only risk.

This system of portraiture, known as risk assessment tools, is claimed to simultaneously reduce pretrial detentions, pretrial crime, and failures to appear in court—or at least that was the claim during a euphoric period when the data revolution first poked its head up in the criminal justice system.

We can have fewer prisoners and less crime. It would be, the argument went, a win/win: a silver bullet that offers liberals reduced incarceration rates and conservatives a whopping cost cut.

These confident predictions came under assault pretty quickly. Prosecutors—represented, for example, by Eric Siddall here in The Crime Report—marshaled tales of judges (“The algorithm made me do it!”) who released detainees who then committed blood-curdling crimes.

Other voices raised fears about the danger that risk assessment tools derived from criminal data trails that are saturated with racial bias will themselves aggravate already racially disparate impacts.

A ProPublica series analyzed the startling racial biases the authors claim were built into one widely used proprietary instrument. Bernard Harcourt of Columbia University argued that “risk” has become a proxy for race.

A 2016 study by Jennifer Skeem and Christopher Lowenkamp dismissed Harcourt’s warnings as “rhetoric,” but found that on the level of particular factors (such as the criminal history factors) the racial disparities are substantial.

Meanwhile, a variety of risk assessment tools have proliferated: Some are simple checklists; some are elaborate “machine learning” algorithms; some offer transparent calculations; others are proprietary “black boxes.”

Whether or not the challenge of developing a race-neutral risk assessment tool from the race-saturated raw materials we have available can ever be met is an argument I am not statistician enough to join.

But early practical experience seems to show that some efforts, such as the Public Safety Assessment instrument, developed by the Laura and John Arnold Foundation and widely adopted, do offer a measure of promise in rationalizing bail decision-making at arraignments without aggravating bias (anyway, on particular measurements of impact).

The Public Safety Assessment (PSA), developed relatively transparently, aims to be an objective procedure that could encourage timid judges to separate the less dangerous from the more dangerous, and to send the less dangerous home under community-based supervision.

At least, this practical experience seems to show that in certain Kentucky jurisdictions where (with a substantial push from the Kentucky legislature) PSA has been operationalized, the hoped-for safety results have been produced—and with no discernible increase in racial disparity in outcomes.

Unfortunately, the same practical experience also shows that those jurisdictions are predominately white and rural, and that there are other Kentucky jurisdictions, predominately minority and urban, where judges have been—despite the legislature’s efforts—gradually moving away from using PSA.

These latter jurisdictions are not producing the same pattern of results.

The judges are usually described as substituting “instinct” or “intuition” for the algorithm. The implication is that they are either simply mobilizing their personal racial stereotypes and biases, or reverting to a primitive traditional system of prophesying risk by opening beasts and fowl and reading their entrails, or crooning to wax idols over fires.

As Malcolm M. Feeley and Jonathan Simon predicted in a 2012 article for Berkeley Law, past decades have seen a paradigm shift in academic and policy circles, and “the language of probability and risk increasingly replaces earlier discourse of diagnosis and retributive punishment.”

A fashion for risk assessment tools was to be expected, they wrote, as everyone tried to “target offenders as an aggregate in place of traditional techniques for individualizing or creating equities.”

But the judges at the sharp end of the system whom you will observe on your courthouse expedition don’t operate in a scholarly laboratory.

They have other goals to pursue besides optimizing their risk-prediction compliance rate, and those goals exert constant, steady pressure on release decision-making.

Some of these “goals” are distasteful. A judge who worships the great God, Docket, and believes the folk maxim that “Nobody pleads from the street” will set high bails to extort quick guilty pleas and pare down his or her room list.

Another judge, otherwise unemployable, who needs re-election or re-nomination, will think that the bare possibility that some guy with a low predictive risk score whom he has just released could show up on the front page tomorrow, arrested for a grisly murder, inexorably points to detention as the safe road to continued life on the public payroll.

They are just trying to get through their days.

But the judges are subject to other pressures that most of us hope they will respect.

For example, judges are expected to promote legitimacy and trust in the law.

It isn’t so easy to resist the pull of “individualizing” and “diagnostic” imperatives when you confront people one at a time.

Somehow, “My husband was detained, so he lost his job, and our family was destroyed, but after all, a metronome did it, it was nothing personal” doesn’t seem to be a narrative that will strengthen community respect for the courts.

Rigorously applying the algorithm may cut the error rate in half, from two in six to one in six, but one in six are still Russian roulette odds, and the community knows that if you play Russian roulette all morning (and every morning) and with the whole arraignment list, lots of people get shot.
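
The arithmetic behind that warning can be checked with a short Python sketch. It assumes, purely for illustration, an arraignment list of 30 cases and treats every decision as an independent draw at the same error rate. Neither assumption comes from court data, and the function names are hypothetical.

```python
from fractions import Fraction

def expected_errors(error_rate, cases):
    """Average number of wrong decisions across a list of the given size."""
    return error_rate * cases

def chance_of_at_least_one_error(error_rate, cases):
    """Probability of one or more wrong decisions, assuming independent cases."""
    return 1 - (1 - error_rate) ** cases

before = Fraction(2, 6)  # error rate without the algorithm, per the text
after = Fraction(1, 6)   # error rate with the algorithm, per the text
cases = 30               # hypothetical size of one morning's arraignment list

print(expected_errors(before, cases))  # 10
print(expected_errors(after, cases))   # 5
print(float(chance_of_at_least_one_error(after, cases)))
```

Under these assumptions, halving the error rate halves the expected number of mistakes on the list, yet the chance of getting through a whole morning with no mistake at all remains well under 1 percent.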

No judge can forget this community audience, even if the “community” is limited to the judge’s courtroom work group. It is fine for a judge to know whether the re-offense rate for pretrial releases in a particular risk category is eight in ten, but to the judges, their retail decisions seem to be less about finding the real aggregated rate than about whether this guy is one of the eight or one of the two.

Embedded in this challenge is the fact that you can make two distinct errors in dealing with difference.

First, you can take situations that are alike, and treat them as if they are different: detain an African-American defendant and let an identical white defendant go.

Second, you can take things that are very different and treat them as if they are the same: Detain two men with identical scores, and ignore the fact that one of the two has a new job, a young family, a serious illness, and an aggressive treatment program.

A risk assessment instrument at least seems to promise a solution to the first problem: Everyone with the same score can get the same bail.

But it could be that this apparent objectivity simply finesses the question. An arrest record, after all, is an index of the detainee’s activities, but it is also a measure of police behavior. If you live in an aggressively policed neighborhood your history may be the same as your white counterpart’s, but your scores can be very different.

And risk assessment approaches are extremely unwieldy when it comes to confronting the second problem. A disciplined sticking-to-the-score requires blinding yourself to a wide range of unconsidered factors that might not be influential in many cases, but could very well be terrifically salient in this one.

This tension between the frontline judge and the backroom programmer is a permanent feature of criminal justice life. The suggested solutions to the dissonance range from effectively eliminating the judges by stripping them of discretion in applying the Risk Assessment scores to eliminating the algorithms themselves.

But the judges aren’t going away, and the algorithms aren’t going away either.

As more cautious commentators seem to recognize, the problem of the judges and the algorithms is simply one more example of the familiar problem of workers and their tools.

If the workers don’t pick up the tools it might be the fault of the workers, but it might also be the fault of the design of the tools.

And it’s more likely that the fault does not lie in either the workers or the tools exclusively but in the relationship between the workers, the tools, and the work. A hammer isn’t very good at driving screws; a screw-driver is very bad at driving nails; some work will require screws, other work, nails.

If you are going to discuss these elements, it usually makes most sense to discuss them together, and from the perspectives of everyone involved.

The work that the workers and their tools are trying to accomplish here is providing safety—safety for everyone: for communities, accused citizens, cops on the streets. A look at the work of safety experts in other fields such as industry, aviation, and medicine provides us with some new directions.

To begin with, those safety experts would argue that this problem can never be permanently “fixed” by weighing aggregate outputs and then tinkering with the assessment tool and extorting perfect compliance from workers. Any “fix” we install will be under immediate attack from its environment.

One thing the Kentucky experience indicates is that in courts, as elsewhere, “covert work rules,” workarounds, and “informal drift” will always develop, no matter what formal requirements are imposed from above.

The workers at the sharp end will put aside the tool when it interferes with their perception of what the work requires. Deviations won’t be huge at first; they will be small modifications. But they will quickly become normal.

And today’s small deviation will provide the starting point for tomorrow’s.

What the criminal justice system currently lacks—but can build—is the capacity for discussing why these departures seemed like good ideas. Why did the judge zig, when the risk assessment tool said he or she should have zagged? Was the judge right this time?

Developing an understanding of the roots of these choices can be (as safety and quality experts going back to W. Edwards Deming would argue) a key weapon in avoiding future mistakes.

We can never know whether a “false positive” detention decision was an error, because we can never prove that the detainee if released would not have offended. But we can know that the decision was a “variation” and track its sources. Was this a “special cause variation” traceable to the aberrant personality of a particular judge? (God knows, they’re out there.)

Or was it a “common cause variation,” a natural result of the system (and the tools) that we have been employing?

This is the kind of analysis that programs like the Sentinel Events Initiative demonstration projects about to be launched by the National Institute of Justice and the Bureau of Justice Assistance can begin to offer. The SEI program, due to begin January 1 with technical assistance from the Quattrone Center for the Fair Administration of Justice at the University of Pennsylvania Law School, will explore the local development of non-blaming, all-stakeholder reviews of events (not of individual performances) with the goal of enhancing “forward-looking accountability” in 20 to 25 volunteer jurisdictions.

The “thick data” that illuminates the tension between the algorithm and the judge can be generated. The judges who have to make the decisions, the programmers who have to refine the tools, the sheriff who holds the detained, the probation officer who supervises the released, and the community that has to trust both the process and the results can all be included.

James Doyle

We can mobilize a feedback loop that delivers more than algorithms simply “leaning in” to listen to themselves.

What we need here is not a search for a “silver bullet,” but a commitment to an ongoing practice of critically addressing the hard work of living in the world and making it safe.

James Doyle is a Boston defense lawyer and author, and a frequent contributor to The Crime Report. He has advised in the development of the Sentinel Events Initiative of the National Institute of Justice. The opinions expressed here are his own. He welcomes readers’ comments.


U.S. Gets ‘Abysmal’ Grade on Pretrial Justice

The first baseline measurement of pretrial justice across the U.S. has found most states to be failing, with a few “promising” exceptions, according to the Pretrial Justice Institute.

The first baseline measurement of pretrial justice across the U.S. has found most states to be failing, with a few “promising” exceptions, according to a national advocacy group.

In a study released Wednesday by the Pretrial Justice Institute, authors measured the rates of pretrial detention, use of available risk assessment tools, and the status of money bail systems in every state.

“Needless” incarceration before trial is the primary cause for states’ failing grades: according to PJI’s findings, two-thirds of the current U.S. jail population has not yet been to trial.

At the forefront of pretrial justice reform are Washington, D.C., where 92 percent of those arrested are released pretrial and no one is detained for inability to pay; and New Jersey, which implemented statewide pretrial services earlier this year, resulting in a 15 percent reduction of pretrial detainees within the first six months.

The report also highlights legislative advances made by Alaska, Arizona, California, Indiana, Maryland, and New Mexico in the area of pretrial justice reform.

While the number of jurisdictions using risk assessment tools has more than doubled in the past four years, authors note that the increase is driven by “a few states and densely populated jurisdictions,” adding that “evidence-based pretrial assessments show that most people released before trial will appear in court and not be arrested on new charges pending trial.”

See also: Risk Assessment: The Devil’s in the Details

The study used money bail as its final measure because “financial conditions play such a large role in needlessly detaining people and giving us a false sense of safety,” according to the authors. New Jersey is the only state to have eliminated money bail, so this is where the U.S. pretrial justice score hovers closest to zero: only 3% of Americans live in a jurisdiction that has eliminated cash bail.

“As long as pretrial systems use money as a condition of pretrial release,” concludes the report, “poor and working class people will remain behind bars while those who are wealthy go home, regardless of their likelihood of pretrial success. This is a fundamental injustice.”

See also: Bail Reform: Why Judges Should Reject ‘Risk Assessment’

This summary was prepared by Victoria Mckenzie, Deputy Editor of The Crime Report. Readers’ comments are welcome.


Is Crime Predictable?

In Philip K. Dick’s “Minority Report,” criminals could be identified before they committed a crime. Computer-generated risk algorithms used by courts to determine whether individuals should be released ahead of trial have brought us a step closer to that world–and our challenge is to use them responsibly, says a George Mason University professor.

Should the increased use of computer-generated risk algorithms to determine criminal justice outcomes be cause for concern or celebration?

This is a hard question to answer, but not for the reasons most people think.

Judges around the country are using computer-generated algorithms to predict the likelihood that a person will commit crime in the future. They use these predictions to help determine pretrial custody, sentence length, prison security-level, probation, parole, and post-release supervision.

Proponents argue that by replacing the ad-hoc and subjective assessments of judges with sophisticated risk assessment instruments, we can reduce incarceration without affecting public safety.

Critics respond that they don’t want to live in a “Minority Report” state where people are punished for crimes before they are committed—particularly if risk assessments are biased against blacks.

Which side is right?

It’s hard to answer because there is no single answer: The impacts that risk assessments have in practice depend crucially on how they are implemented.

Risk assessments are tools—no more and no less. They can be used to increase incarceration or decrease incarceration. They can be used to increase racial disparities or decrease disparities.

They can be used to direct “high risk” people towards support and services or to punish them more harshly. They can be implemented in such a broad set of ways that thinking about them monolithically just doesn’t make sense.

Take bail reform, for example.

Bail reform is one of the most active areas of change in criminal justice right now, and risk assessments have been a key part of many reform efforts. The idea behind the current bail reform movement is that pretrial custody decisions should be made on the basis of risk, not resources.

Instead of conditioning pretrial release on the ability to pay bail—which discriminates against the poor—reformers argue that pretrial release should be determined by a defendant’s risk of crime or flight.

Traditionally, risk of crime or flight was evaluated informally by a judge. Now, many jurisdictions are providing judges with computer-generated risk scores to help them decide whether the defendant can be safely released. These risk scores take into account factors like criminal history, age and sometimes even socio-economic characteristics like employment or stable housing.

One of the more popular pretrial risk assessment instruments, called the PSA, was developed by the Laura and John Arnold Foundation in 2013 and has since been adopted in some thirty jurisdictions as well as three entire states. The results have been mixed.

New Jersey has seen a dramatic decline in its pretrial detention rate: the number of people detained pretrial has dropped by about a third since the PSA was adopted in January. Lucas County, Ohio, which includes the low-income city of Toledo, has actually seen an increase in the pretrial detention rate since the PSA was adopted.

And a recent report suggests that Chicago judges have been largely ignoring the PSA. Why such different results in different places? It’s too soon to say for sure, but there are a number of details related to implementation that could make all the difference.

For one, determining what level of risk should be considered “high” is a subjective determination.

In fact, there is little consensus on this issue: depending on the instrument and the jurisdiction, a high risk classification can correspond to a probability of re-arrest that’s as low as 10% or as high as 42%.

Editor’s Note: For a critical view on the validity of risk-assessment tools, see Eric Siddall’s Viewpoint in TCR, Aug. 25, 2017.

With the PSA, jurisdictions can decide themselves where to set the cutoff points between a low, moderate, and high risk ranking.

These groupings are important, because many jurisdictions also adopt specific recommendations for each risk classification. For example, New Jersey uses a decision-making framework that recommends pretrial detention only for defendants with the highest risk scores: this has been defined so as to include only about 5% of arrestees.

In Mecklenburg County, another PSA site, generally only defendants who are ranked “low” or “below-average” on their risk score are recommended for release without secured monetary bond, making it less likely that risk assessment will increase release rates very much.
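
How much the cutoff points matter can be seen in a small Python sketch. The score scale, the cutoff values, and the two jurisdictions below are hypothetical and are not taken from the PSA or from any real site; the point is only that one and the same score can land in different categories depending on where a jurisdiction draws its lines.

```python
from bisect import bisect_right

RISK_LABELS = ("low", "moderate", "high")

def classify(score, cutoffs):
    """Map a numeric score to a label.

    cutoffs = (lowest score counted as moderate, lowest score counted as high).
    """
    moderate_from, high_from = cutoffs
    if moderate_from > high_from:
        raise ValueError("the moderate cutoff must not exceed the high cutoff")
    # bisect_right counts how many cutoffs the score has reached or passed.
    return RISK_LABELS[bisect_right(cutoffs, score)]

# Hypothetical cutoffs chosen by two hypothetical jurisdictions.
jurisdiction_a = (3, 5)  # scores 3-4 are moderate, 5 and up are high
jurisdiction_b = (4, 6)  # scores 4-5 are moderate, 6 and up are high

score = 5
print(classify(score, jurisdiction_a))  # high
print(classify(score, jurisdiction_b))  # moderate
```

If detention is recommended only for the “high” group, the same defendant would be recommended for detention in the first jurisdiction and not in the second.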

The impact that risk assessments have in practice will also depend on the extent to which judges use them. In most jurisdictions, judges are given the final say, and if they do not want to follow the recommendations associated with the risk assessment they don’t have to.

A recent survey showed that only a small minority of judges thought that risk assessments were better at predicting future crime than judges.

If judges are skeptical, what would motivate them? They will be more likely to use the risk assessment if they are incentivized to do so: for example, if deviating from the recommendations requires a detailed written reason, or if there is a system of accountability in which their actions are tracked and monitored.

Finally, it’s always possible to implement risk assessment in a way that doesn’t involve judicial discretion at all.

Kentucky, a leader in the use of pretrial risk assessments, recently revised its procedures so that all low and moderate risk defendants facing non-serious charges are automatically released immediately after booking.

As for racial disparities, we know very little about how these have been impacted by the adoption of risk assessment. But what little we do know suggests that implementation details are important. In a recent study, I found that pretrial risk assessment in Kentucky benefited white defendants more than black, but this was solely because judges in the predominantly-white rural counties followed the recommendations of the risk assessment more than judges in the more racially mixed urban counties.

In other words, the increased racial disparities brought on by risk assessment were caused by regional trends in use, not by the bias of the instrument. This pattern might have been reversed if training, monitoring, and accountability in urban areas were higher.

Furthermore, risk assessment is more likely to reduce racial disparities if it is used to replace monetary bail. Since black defendants tend to have lower incomes, they tend to be less able to afford bail than white defendants.

One study shows that half the race gap in pretrial detention is explained by race differences in the likelihood of posting a given bond amount.

Megan Stevenson

We already live in a “Minority Report” state: the practice of grounding criminal justice decisions on predictions about future crime has been around a long time. The recent shift towards adopting risk assessment tools simply formalizes this process—and in doing so, provides an opportunity to shape what this process looks like.

Instead of embracing risk assessment wholeheartedly or condemning it without reserve, reformers should ask whether there is a particular implementation design by which risk assessment could advance the much-needed goals of reform.

Megan T. Stevenson is an economist and Assistant Professor of Law at George Mason University. She welcomes comments from readers.


Bail Reform: Why Judges Should Reject ‘Risk Assessment’

Tools that use algorithms to determine whether to detain accused individuals before a trial are increasingly being used across the country as an alternative to the bail system. But the vice president of the Los Angeles County Association of Deputy District Attorneys argues that the tools also lead to tragedies.

If you aren’t following bail reform, you may not be aware that accompanying the attempt to eliminate bail across the country is the touting of “risk assessment tools” to determine who should be detained on bail before trial.

Eric Siddall

The chief proponent of such tools is the Arnold Foundation, which maintains that its own “risk assessment tool” is a cutting-edge way of providing an objective assessment in this area.

The tool’s principal developer, former New Jersey attorney general Anne Milgram, has said she introduced “rigorous statistical analysis” to the process in order to “moneyball criminal justice.”

Editor’s Note: 38 jurisdictions currently use the tool developed by the Arnold Foundation.

However, the use of this tool has led to the wholesale release of violent criminals—and tragedy.

Three recent examples in New Mexico, New Jersey and San Francisco illustrate my point.

A story published by the conservative website The Daily Wire said the assessment tool has led to virtually every defendant arrested in New Mexico for a violent crime being released without bail.

The story quoted a report from Albuquerque NBC affiliate KOB4, saying, “Even with the highest rate of failing to appear in court and the highest rate of new criminal activity for a defendant, the tool still recommends that person[s] be released on their own recognizance unless the prosecutors have filed for preventative detention.”

In New Jersey, according to the Washington Post, the tool determined that a man jailed for illegally possessing a gun was not a danger and recommended his release. Days later, that man hunted down a rival and shot at him 22 times, killing him. The family of the victim is now suing the Arnold Foundation, amongst others, for the death.

In San Francisco, the website SFGate reported that a man suspected of murder had been released days earlier after being arrested for possession of two guns. According to the website, the judge, relying on the assessment tool, rejected the recommendation of the District Attorney’s office that the man be kept in jail on a probation violation.

A spokesman for the DA’s office was quoted as saying the use of the tool has caused “many instances of contention.”

He continued: “As it relates to this case along with many other cases, we have a disagreement with how that risk assessment is being calculated. They suggested release with certain conditions, and the judge carried out that recommendation and this defendant was released.”

The Arnold Foundation argues that its tool is needed because “failing to appropriately determine the level of risk that a defendant poses impacts future crime and violence, and carries enormous costs–both human and financial.”

The examples in New Mexico, New Jersey and San Francisco certainly attest to the truth of that statement.

Additional Reading: Risk assessment tools have triggered a contentious debate in the criminal justice community. In June, the Supreme Court refused to hear the case of a Wisconsin man who was sentenced to six years in prison by a judge who consulted the results of a risk assessment algorithm. The plaintiff argued that the use of the algorithm violated his rights to due process.

The tools represent a threat to the bail bond industry, which has backed two federal lawsuits seeking to end the algorithm’s use.

Eric W. Siddall is Vice President of the Los Angeles Association of Deputy District Attorneys (ADDA), the collective bargaining agent representing nearly 1,000 deputy district attorneys who work for the County of Los Angeles. This is an edited version of an essay that appeared earlier this month on ADDA’s website. Readers’ comments are welcome.