Chapter Seven
STATISTICAL EVIDENCE
[W]e are not concerned with the routine random differences in outcomes based on
luck. Rather the concept of discrimination encompasses only those differences that
are so systematic that they do not cancel each other out within large groups. --Ehrenberg & Smith
1. Explaining the Numbers
Statistical reasoning in the social and natural sciences can easily be
reconstructed as a related pair of inferences to the best explanation. In the
first inference the explanatory question focuses on some quantitative
relationship. Suppose I am struck by the remarkably high percentage of phone
calls I received last night from worthy causes asking for money. I might
generalize to a hypothesis about what is going on outside my home.
e1. Three phone calls last night; all from worthy causes.
===========================================
t0. Unusually high percentage of charitable solicitations last night.
Just a moment's reflection, however, yields a rival explanation that I hope
strikes you as every bit as good.
t1. It's just a coincidence that all of my phone calls last night came from
charitable organizations.
On the other hand, consider the extensive medical data that was uncovered
over several decades in the famous Framingham study. Medical researchers
were surprised to discover that twenty-nine percent of the men in the forty to
forty-nine year range suffered from coronary heart disease while only fourteen
percent of the women in the same age range suffered from the disease. This
tells us something potentially very important about gender and heart disease.
e1. Of the 771 men in the 40-49 year age group, 29% showed some
signs of coronary heart disease.
e2. Of the 954 women in the 40-49 year age group, only 14% showed
signs of coronary heart disease.
===========================================
t0. Coronary heart disease appears much more often in men than in
women.
In this latter argument, as in the case of the phone calls, the rival explanation
that must be ruled out is the so-called null hypothesis. We want to know if the
numbers indicate some real world tendency, or whether they are simply a
coincidence. Much of statistics can be seen as the sophisticated application of
mathematical theory, particularly the probability calculus, to developing
reliable techniques for deciding whether the null hypothesis or some genuine
correlation is the better explanation of the numerical data.
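To make the contest with the null hypothesis concrete, here is a minimal sketch of one standard technique, a two-proportion z-test, applied to the Framingham figures. The case counts are my reconstruction from the rounded percentages quoted above, so they are an assumption, and the function name is purely illustrative.

```python
import math

def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Under the null hypothesis both groups share one underlying rate.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Assumption: counts reconstructed from the rounded percentages in the text.
men_cases = round(0.29 * 771)
women_cases = round(0.14 * 954)

z, p_value = two_proportion_z(men_cases, 771, women_cases, 954)
print(f"z = {z:.2f}, two-sided p = {p_value:.2g}")
```

A z value this far from zero says that a gap of this size would almost never arise by luck alone in groups this large, which is why the coincidence rival fares so badly here and so well in the case of my three phone calls.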
2. Explaining the Correlations
If we are confident in our explanatory inference from the raw numerical data
to some genuine connection or correlation, the interesting explanatory work has
just begun. A Mississippi study established a remarkably strong correlation
between hearing loss and being incarcerated in the state penitentiary -- almost
fifty percent of the prisoners showed some hearing loss. The researchers
hypothesized that this data provided strong evidence that hearing loss causally
contributed to a life of crime. The argument is a classic inference to the best
explanation.
e1. Forty-eight percent of the prisoners show some hearing loss.
===========================================
t0. Hearing loss [partially] causes a life of crime.
Given the inference that this forty-eight percent figure is not simply a fluke or
coincidence -- for the modern social scientist, that it is statistically significant --
we must now determine whether t0 is the best explanation of the correlation.
We must compare it to some rival explanations. Perhaps, for example, the life
of the career criminal is particularly hard on one's ears. All the necessary time
on the firing range might be causally responsible for hearing loss.
t1. A life of crime [partially] causes hearing loss.
I take it that such a rival is initially very implausible in this particular case.
Much more challenging would be a rival account that brings in some third
factor -- a common cause -- that causally explains the hearing loss, and
independently explains the life of crime; child abuse comes immediately to
mind.
t2. Being an abused child often causes significant hearing loss (the
actual mechanism might be either physiological or psychological);
independently, coming from a background of abuse causes antisocial
behavior and often leads to a life of crime.
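A small simulation may help show how a common cause can manufacture a correlation. Every probability below is hypothetical, invented only for illustration and not taken from the Mississippi study. In this toy world hearing loss and crime each depend on a history of abuse and never on each other.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# All probabilities are hypothetical, chosen only for illustration.
P_ABUSE = 0.2
P_HEARING_LOSS = {True: 0.5, False: 0.1}  # keyed by abuse history
P_CRIME = {True: 0.3, False: 0.05}        # keyed by abuse history

def simulate_person():
    abused = random.random() < P_ABUSE
    # Hearing loss and crime each depend on abuse, never on each other.
    hearing_loss = random.random() < P_HEARING_LOSS[abused]
    crime = random.random() < P_CRIME[abused]
    return abused, hearing_loss, crime

people = [simulate_person() for _ in range(200_000)]

def hearing_loss_rate(group):
    group = list(group)
    return sum(hearing_loss for _, hearing_loss, _ in group) / len(group)

criminals = hearing_loss_rate(p for p in people if p[2])
others = hearing_loss_rate(p for p in people if not p[2])
abused_criminals = hearing_loss_rate(p for p in people if p[0] and p[2])
abused_others = hearing_loss_rate(p for p in people if p[0] and not p[2])

print(f"hearing loss, criminals vs. others: {criminals:.2f} vs. {others:.2f}")
print(f"same comparison among the abused:   {abused_criminals:.2f} vs. {abused_others:.2f}")
```

In the whole population the criminals show markedly more hearing loss, yet once we look only at people with the same abuse history the gap essentially disappears. That is the signature of a common cause.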
In our first inference, when we were attempting to explain an observed
quantitative relationship, modern statistical theory was of great help. We
might, for instance, have simply entered the numbers into a computer and had
the program tell us whether our result was statistically significant (that a
genuine correlation was the best explanation) or not (that the null hypothesis
[the coincidence hypothesis] was the best). When it comes to sorting out
explanations of genuine correlations the task is more difficult.
Returning to the Framingham study, gender seems to play some role in
coronary heart disease. But what is the actual causal mechanism? I can think
of a number of possible explanations.
t1. Male and female hormones have different effects on the human
circulatory system.
t2. "Traditional" male jobs [remember the study went back to the 1950s]
are more stressful, causing a higher rate of coronary heart disease.
t3. Males are socialized to prefer a diet that is much higher in fat and
salt.
It may well be that we will need to go out and uncover some additional data --
conduct some new studies or experiments -- before we are in a position to
confidently commit ourselves to any particular explanation of an established
correlation as being the best. At other times, however, we will be confident about
the relative explanatory plausibility of the rivals as things stand.
3. Some Very Disturbing Data
Most of my students seem to believe that the problems of racial and sexual
discrimination have been pretty much solved. Sure, we all know blatant racists
or sexists, but at a national level the problem is much less severe than even
twenty years ago. Many Americans believe that affirmative action is actually a
worse social problem than the racism and sexism it was designed to combat. I
think my students are dead wrong about this, and here's some statistical
evidence to support that claim.
Labor market statistics suggest that there are great disparities between blacks
and whites, both in terms of the characteristics they take into the labor market
and in terms of the benefits derived from the labor market, as the following
table illustrates.
RACIAL DIFFERENCES IN INCOME, POVERTY, AND EDUCATION 1991
| RACE  | % WITH 4 YRS COLLEGE | MEDIAN INCOME | POVERTY RATE | MEDIAN WKLY PAY |
|-------|----------------------|---------------|--------------|-----------------|
| White | 22.2%                | $36,915       | 10.7%        | $446            |
| Black | 11.5%                | $21,423       | 31.9%        | $348            |
Source: Statistical Abstract of the United States, 1992
In addition to the above racial differences in poverty rates, income and
earnings differentials, and education levels, black employment tends to be
concentrated in low-paying, low-status jobs and under-represented in
high-paying, high-profile careers. Thus 30% of nursing aides are black, 29% of
domestic servants are black, 25% of vehicle washers are black, and 21% of
janitors are black, while only 0.7% of geologists are black, 1.5% of dentists
are black, 2.1% of architects are black, and 2.6% of lawyers are black. Thus
many employment opportunities for blacks appear to be concentrated in the
secondary labor market, with low-paying, high turnover jobs, while careers in
the primary labor market, jobs that are high-paying with advancement
potential, are limited in their availability to blacks.
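The disparities in the 1991 table are easier to compare when expressed as ratios. The following is only a sketch of that arithmetic; the figures are copied from the table, while the dictionary and key names are my own labels.

```python
# Figures copied from the 1991 table (Statistical Abstract of the United States, 1992).
table_1991 = {
    "white": {"college_pct": 22.2, "median_income": 36915,
              "poverty_pct": 10.7, "weekly_pay": 446},
    "black": {"college_pct": 11.5, "median_income": 21423,
              "poverty_pct": 31.9, "weekly_pay": 348},
}

# Black-to-white ratio for each measure in the table.
ratios = {
    measure: table_1991["black"][measure] / table_1991["white"][measure]
    for measure in table_1991["white"]
}

for measure, ratio in ratios.items():
    print(f"{measure:>14}: black/white = {ratio:.2f}")
```

On these figures the black poverty rate is about three times the white rate, while black median weekly pay is roughly 78 percent of white median weekly pay.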
There are, of course, many other data that support the contention that blacks
and whites differ in characteristics that significantly affect their economic
success. Blacks are less likely to graduate from high school; black children
are 2.8 times more likely to be poor than white children; and blacks are twice
as likely as whites to be in the bottom 20% of the income distribution. Given
this wealth of data, how do
we explain it?
As sad as things are with respect to wealth and race, things are even worse
with respect to health. In 1915, the Department of Labor, Children's Bureau
issued a report on infant mortality which demonstrated the coincidence
between infant mortality rates and socioeconomic factors, including low
earnings and poor housing. Infant mortality is defined as death within the first
364 days after birth, and while infant mortality rates have declined over the
last eighty years, the differential between whites and blacks remains virtually
unchanged. Consider the data in the table below.
Selected Pregnancy Outcome Rates By Race: United States, 1970 and 1985
|                   | 1970 WHITE | 1970 BLACK | 1970 RATIO | 1985 WHITE | 1985 BLACK | 1985 RATIO |
|-------------------|------------|------------|------------|------------|------------|------------|
| TOTAL MORTALITY*  | 29.8       | 55.1       | 1.85       | 16.2       | 30.5       | 1.89       |
| FETAL MORTALITY*  | 12.3       | 23.2       | 1.89       | 7.0        | 29.3       | 1.81       |
| INFANT MORTALITY* | 17.8       | 32.6       | 1.84       | 9.3        | 18.2       | 1.95       |
* Per 1000 live births plus fetal deaths
+ Per 1000 live births
Overall, the black infant mortality rate is about twice the white infant
mortality rate. More specifically, black infants are 2.5 times as likely as white
infants to die of pneumonia and influenza; over twice as likely to die of
maternal complications of pregnancy, intrauterine hypoxia, and birth
asphyxia; and 3.5 times as likely to die as a result of short gestation and
unspecified low birth weight. Low birth weight infants are 5 times more likely
to die during the first year of life than normal birth weight infants, and black
infants are twice as likely to be born with low birth weights. Although the
factors leading to low birth weight are not completely understood, it is
generally believed that inadequate prenatal care is a primary determinant and
that socioeconomic status limits prenatal care, affects nutrition during
pregnancy, increases stress, and increases the likelihood of harmful behavior
during pregnancy, such as smoking and drug use. The weight of evidence
seems to argue that poverty (and therefore race) strongly influences the
adequacy of prenatal care, which is a determinant of low birth weight, which in
turn is a leading cause of infant mortality.
If we schematize all of this data in the familiar form, we get an awfully
powerful prima facie case that something is seriously wrong.
e1. There is a strong correlation between wealth and poverty,
and race.
e2. There is a strong correlation between health -- infant
mortality, morbidity rates, etc. -- and race.
========================================================================
t0. Blacks are at a distinct disadvantage, both economically
and in terms of health, because of their race. Contemporary
American society discriminates against blacks; we are a racist
nation.
Many of you will be extremely uncomfortable with this diagnosis of the data;
you will seek rival explanations. I fear that some will be tempted further down
the racist path and suggest that these racial differentials can be explained in
terms of worth, ability, and natural physiological differences. I almost hate to
put it into words, but one rival explanation to be considered is the following.
t1. Whites are smarter and more hard working, hence their
superior economic success. Whites are also physiologically
better off, hence the improved health statistics.
A slightly less offensive rival explanation is very popular in contemporary
economics. According to this view the individual is seen as a "rational actor."
Each person makes individual decisions for him or herself, including decisions
to "invest" in one's future occupation through training and education, and one's
future health through the consumption of medical services and the adoption of a
healthy lifestyle and diet. The model of "human capital" sees some of these
decisions as rational -- blacks tend to have shorter work lifetimes and earn
lower wages, hence there would be less return from investing in job training --
and some of them as culturally based -- diet, for instance.
t2. Whites have more economic success, and a brighter health
picture, because of differing decisions between blacks and
whites regarding investments in human capital.
I suspect that the human capital model does get at something interesting. If I
knew that I would only earn eighty cents on the dollar compared to a white
co-worker, I might well be less inclined to invest in my professional career. But
notice that this explanation presupposes a relevant racial difference in our
earning potentials. Furthermore, the human capital model can be carried to
implausible extremes, as when economists suggest that pregnant women in the
ghetto make "rational" investment decisions in the health of their unborn
children.
Thus, I stand by the diagnosis of racism as the best explanation of at least part
of the racial differentials in wealth and health. For a nation founded on the
principle that "all men are created equal," and a nation whose Constitution
guarantees "equal protection of the laws," this is an incredibly damning
account, one that I would argue we should all be ashamed of.
4. Statistics and the Death Penalty
You will remember that I had started to climb on a soapbox about capital
punishment at the end of the last chapter. I claimed that the best constructive
interpretation of our nation's abstract intentions with respect to the death
penalty was the following.
t0. Capital punishment may be exercised by Congress or the
State, if and only if, (i) there is due process of law, (ii) capital
punishment is neither cruel nor unusual, and (iii) capital
punishment is administered in a non-discriminatory manner
that constitutes equal protection of the laws.

I want now to continue with my case against the death penalty by arguing that
the third necessary condition for permissible executions is demonstrably
absent. My argument to this effect will depend on the analysis of statistical
evidence.
I take it that legal historians would agree with me that capital punishment has
in the past been applied in a manner that was clearly discriminatory. We
would like to think, however, that we have made some progress in the area of
racial justice. That is why the following data is so disappointing.
Professor Baldus examined over twenty-four hundred homicide cases in the
state of Georgia during the period between 1974 and 1979. The dates are
significant because the Georgia murder statute had been rewritten after
Furman v. Georgia in order that death sentences not be administered in a
"random and capricious manner." Here's a brief summary of what Professor
Baldus discovered.
| Race Killer/Victim | Death Sentence | Percentage |
|--------------------|----------------|------------|
| Black/White        | 50 of 223      | 22%        |
| White/White        | 58 of 748      | 8%         |
| Black/Black        | 18 of 1443     | 1%         |
| White/Black        | 2 of 60        | 3%         |
| Total by Victim    |                |            |
| White              | 108 of 981     | 11%        |
| Black              | 20 of 1503     | 1%         |
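As a quick check on the summary above, the percentages and the raw white-victim to black-victim ratio can be recomputed from the counts. This is only a sketch of the arithmetic on the uncontrolled totals; the variable names are mine.

```python
# (death sentences, cases) copied from the summary of the Baldus data.
by_killer_and_victim = {
    "Black/White": (50, 223),
    "White/White": (58, 748),
    "Black/Black": (18, 1443),
    "White/Black": (2, 60),
}
by_victim = {"White": (108, 981), "Black": (20, 1503)}

def rate(deaths, cases):
    """Fraction of cases that ended in a death sentence."""
    return deaths / cases

for label, (deaths, cases) in {**by_killer_and_victim, **by_victim}.items():
    print(f"{label:>12}: {100 * rate(deaths, cases):.0f}%")

# Raw (uncontrolled) ratio of death-sentence rates by race of victim.
raw_ratio = rate(*by_victim["White"]) / rate(*by_victim["Black"])
print(f"raw white-victim / black-victim ratio: {raw_ratio:.1f}")
```

The raw ratio comes out at a little over eight. It takes no account of the circumstances of each crime, so it should not be confused with any figure that has been adjusted for non-racial variables.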
The original Baldus study controlled for over two hundred non-racial variables
such as the defendant's record and the severity of the crime. When all of this
data was considered, the study concluded that murderers of white victims were
4.3 times as likely to receive the death penalty as murderers of black victims.
Justice Brennan expressed this
correlation in characteristically vivid language.
At some point in this case, Warren McCleskey doubtless asked his lawyer
whether the jury was likely to sentence him to die. A candid reply to this
question would have to tell McCleskey that few of the details of the crime
or of McCleskey's past criminal record were more important than the fact
that his victim was white. Furthermore, counsel would feel bound to tell
McCleskey that defendants charged with killing white victims in Georgia
are 4.3 times as likely to be sentenced to die as defendants charged with
killing blacks.
I have discussed the McCleskey case with hundreds of students in the last
several years. Many simply refuse to accept the following data.
e1. When controlled for over two hundred non-racial variables
such as the defendant's record and the severity of the crime,
the Baldus study concluded that murderers of white victims
were 4.3 times as likely to receive the death penalty.
It is, of course, true that life in the ghetto is different from life in the suburbs,
and that black culture is different from white culture. The shocking
finding that murderers of whites are over four times as likely to receive the
death penalty takes all of that into account. I know some of you will continue to
believe that "statistics always lie." But the very same techniques that tell us
that cigarette smoking causes cancer, or that so-and-so will win next month's
election, tell us that the connection between race and the death penalty in
Georgia is for real. Thus, the task before us is to produce an explanation of
why this correlation holds.
There is no big mystery about the reason for this disparity. The original study
contained the crucial data.
e2. District attorneys ask for a capital sentence in seventy
percent of the cases involving a black defendant and a white
victim. When the victim is black and the defendant is white,
however, a mere nineteen percent are even prosecuted as
capital cases.
Clearly the discretion of local law enforcement officials is the point at which
racial attitudes enter the criminal process. Apparently, black on black
murders are considered even less important, since the Baldus study showed
that the death penalty was requested in only fifteen percent of these cases.
In Furman the Supreme Court was concerned with the arbitrary and capricious
actions of trial judges, and particularly juries. It appears now, however, that
the judgments, both arbitrary and prejudicial, of other legal officials are even
more problematic. I take it to be obvious that the best explanation of the
Baldus data is the following.
t0. The race of the murder victim causally influences the
decision whether to seek the death penalty.
Given my earlier interpretation of the Eighth Amendment, this account of the
murder statistics in Georgia seems to demand that the Supreme Court declare
capital punishment, at least in the state of Georgia, to be unconstitutional.
5. What's the Best Explanation?
As I hope by this point you are all already thinking, the crucial question is
whether t0 is really the best explanation. I, personally, cannot see any way in
which this could be a case of "reverse causation." Thus, I reject any possibility that
the following needs to be considered at all.
t1. The decision to seek the death penalty causes the race of
the victim.
I, also, find it pretty hard to explain the Baldus study results as simply a
"statistical fluke." Such things are always possible, but modern statistical
analysis guarantees us that they are exceedingly unlikely. Consequently, the
following is also very low on my plausibility ranking.
t2. It's just a coincidence that victims' race "correlated" with
capital sentences in Georgia.
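To see why I rank the coincidence rival so low, here is a minimal sketch of a chi-square test of independence on the raw victim totals from the Baldus summary (108 of 981 and 20 of 1503). It uses the uncontrolled counts only, so it speaks to t2 and to nothing else; the variable names are my own.

```python
import math

# 2x2 table from the victim totals: rows = victim race,
# columns = (death sentence, no death sentence).
observed = [
    [108, 981 - 108],  # white victims
    [20, 1503 - 20],   # black victims
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Count we would expect if sentence were independent of victim race.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

# With one degree of freedom the chi-square tail probability
# equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))
print(f"chi-square = {chi_square:.1f}, p = {p_value:.2g}")
```

The statistic comes out well above 100 on one degree of freedom, which leaves the fluke hypothesis with a vanishingly small probability. What the test cannot do is tell us why the pattern holds; that is a job for the rival explanations.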
The only serious competitor I can imagine, therefore, is that there is some
unnoticed "common cause" that is independently responsible for both the race
of the homicide victims, and the fact that their murderers received the
sentences they did. The Baldus team tried to think of some of the possible
factors in their original evaluation of the data. That's what they were up to
when they performed the statistical tests that "controlled for over two hundred
non-racial variables." Even when they did this, it turned out that murderers of
white victims were 4.3 times as likely to receive death sentences. Maybe
something else is responsible for the correlation, but we have yet to see what it
is. Hence, I am willing to take the following seriously as a potential rival
explanation.
t3. Some unidentified non-racial factor is responsible for the
correlation of victim race and death sentences.
Since we have yet to even think of what this non-racial factor might be, I admit
its possibility, but rank it significantly lower than the racial explanation in t0.