Nate Silver: NT -- maybe ILI-Te? [Creative subtype] (ILI-ILE?)
- from The Signal and the Noise: Why So Many Predictions Fail—but Some Don’t by Nate Silver; pp.
47-54 (ARE YOU SMARTER THAN A TELEVISION PUNDIT?): For many people, political prediction is
synonymous with the television program The McLaughlin Group, a political roundtable that has
been broadcast continually each Sunday since 1982 and parodied by Saturday Night Live for nearly
as long. The show, hosted by John McLaughlin, a cantankerous octogenarian who ran a failed bid for the
United States Senate in 1970, treats political punditry as sport, cycling through four or five subjects in
the half hour, with McLaughlin barking at his panelists for answers on subjects from Australian politics to
the prospects for extraterrestrial intelligence.
At the end of each edition of The McLaughlin Group, the program has a final segment called
“Predictions,” in which the panelists are given a few seconds to weigh in on some matter of the day.
Sometimes, the panelists are permitted to pick a topic and make a prediction about anything even
vaguely related to politics. At other times, McLaughlin calls for a “forced prediction,” a sort of pop quiz
that asks them their take on a specific issue.
Some of McLaughlin’s questions—say, to name the next Supreme Court nominee from among several
plausible candidates—are difficult to answer. But others are softballs. On the weekend before the 2008
presidential election, for instance, McLaughlin asked his panelists whether John McCain or Barack
Obama was going to win.
That one ought not to have required much thought. Barack Obama had led John McCain in almost every
national poll since September 15, 2008, when the collapse of Lehman Brothers had ushered in the worst
economic slump since the Great Depression. Obama also led in almost every poll of almost every swing
state: in Ohio and Florida and Pennsylvania and New Hampshire—and even in a few states that
Democrats don’t normally win, like Colorado and Virginia. Statistical models like the one I developed for
FiveThirtyEight suggested that Obama had in excess of a 95 percent chance of winning the election.
Betting markets were slightly more equivocal, but still had him as a 7 to 1 favorite.
But McLaughlin’s first panelist, Pat Buchanan, dodged the question. “The undecideds will decide this
weekend,” he remarked, drawing guffaws from the rest of the panel. Another guest, the Chicago
Tribune’s Clarence Page, said the election was “too close to call.” Fox News’ Monica Crowley was bolder,
predicting a McCain win by “half a point.” Only Newsweek’s Eleanor Clift stated the obvious, predicting a
win for the Obama-Biden ticket.
The following Tuesday, Obama became the president-elect with 365 electoral votes to John McCain’s
173—almost exactly as polls and statistical models had anticipated. While not a landslide of historic
proportions, it certainly hadn’t been “too close to call”: Obama had beaten John McCain by nearly ten
million votes. Anyone who had rendered a prediction to the contrary had some explaining to do.
There would be none of that on The McLaughlin Group when the same four panelists gathered
again the following week. [The McLaughlin Group transcript, Federal News Service; taped November 7,
2008. http://www.mclaughlin.com/transcript.htm?id=688 ] The panel discussed the statistical minutiae
of Obama’s win, his selection of Rahm Emanuel as his chief of staff, and his relations with Russian
president Dmitry Medvedev. There was no mention of the failed prediction—made on national
television in contradiction to essentially all available evidence. In fact, the panelists made it sound as
though the outcome had been inevitable all along; Crowley explained that it had been a “change
election year” and that McCain had run a terrible campaign—neglecting to mention that she had been
willing to bet on that campaign just a week earlier.
Rarely should a forecaster be judged on the basis of a single prediction—but this case may warrant an
exception. By the weekend before the election, perhaps the only plausible hypothesis to explain why
McCain could still win was if there was massive racial animus against Obama that had gone undetected
in the polls. None of the panelists offered this hypothesis, however. Instead they seemed to be
operating in an alternate universe in which the polls didn’t exist, the economy hadn’t collapsed, and
President Bush was still reasonably popular rather than dragging down McCain.
Nevertheless, I decided to check to see whether this was some sort of anomaly. Do the panelists on
The McLaughlin Group—who are paid to talk about politics for a living—have any real skill at forecasting?
I evaluated nearly 1,000 predictions that were made on the final segment of the show by McLaughlin
and the rest of the panelists. About a quarter of the predictions were too vague to be analyzed or
concerned events in the far future. But I scored others on a five-point scale ranging from completely
false to completely true.
The panel may as well have been flipping coins. I determined 338 of their predictions to be either mostly
or completely false. The exact same number—338—were either mostly or completely true.
Nor were any of the panelists—including Clift, who at least got the 2008 election right—much better
than the others. For each panelist, I calculated a percentage score, essentially reflecting the number of
predictions they got right. Clift and the three other most frequent panelists—Buchanan, the late Tony
Blankley, and McLaughlin himself—each received almost identical scores ranging from 49 percent to 52
percent, meaning that they were about as likely to get a prediction right as wrong. They displayed about
as much political acumen as a barbershop quartet.
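The coin-flip comparison follows directly from the two counts quoted above. A minimal arithmetic sketch, assuming (my assumption, not stated in the excerpt) that predictions scored in the middle of the five-point scale are left out of the tally; the variable names are illustrative:

```python
# Counts quoted in the excerpt.
mostly_or_completely_false = 338
mostly_or_completely_true = 338

# Assumption: only predictions that landed clearly on one side are counted.
decided = mostly_or_completely_true + mostly_or_completely_false
hit_rate = mostly_or_completely_true / decided

print(decided)             # 676
print(f"{hit_rate:.0%}")   # 50%
```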
The McLaughlin Group, of course, is more or less explicitly intended as slapstick entertainment for
political junkies. It is a holdover from the shouting match era of programs, such as CNN’s Crossfire,
that featured liberals and conservatives endlessly bickering with one another. Our current echo chamber
era isn’t much different from the shouting match era, except that the liberals and conservatives are
confined to their own channels, separated in your cable lineup by a demilitarized zone demarcated by
the Food Network or the Golf Channel. This arrangement seems to produce higher ratings if not
necessarily more reliable analysis.
But what about those who are paid for the accuracy and thoroughness of their scholarship—rather than
the volume of their opinions? Are political scientists, or analysts at Washington think tanks, any better at making predictions?
Are Political Scientists Better Than Pundits?
The disintegration of the Soviet Union and other countries of the Eastern bloc occurred at a remarkably
fast pace—and all things considered, in a remarkably orderly way.
On June 12, 1987, Ronald Reagan stood at the Brandenburg Gate and implored Mikhail Gorbachev to
tear down the Berlin Wall—an applause line that seemed as audacious as John F. Kennedy’s pledge to
send a man to the moon. Reagan was prescient; less than two years later, the wall had fallen.
On November 16, 1988, the parliament of the Republic of Estonia, a nation about the size of the
state of Maine, declared its independence from the mighty USSR. Less than three years later, Gorbachev
parried a coup attempt from hard-liners in Moscow and the Soviet flag was lowered for the last time
before the Kremlin; Estonia and the other Soviet Republics would soon become independent nations.
If the fall of the Soviet empire seemed predictable after the fact, however, almost no mainstream
political scientist had seen it coming. The few exceptions were often the subject of ridicule. [Eugene
Lyons, Workers’ Paradise Lost (New York: Paperback Library, 1967).] If political scientists couldn’t
predict the downfall of the Soviet Union—perhaps the most important event in the latter half of the
twentieth century—then what exactly were they good for?
Philip Tetlock, a professor of psychology and political science, then at the University of California at
Berkeley, was asking some of the same questions. As it happened, he had undertaken an ambitious and
unprecedented experiment at the time of the USSR’s collapse. Beginning in 1987, Tetlock started
collecting predictions from a broad array of experts in academia and government on a variety of topics
in domestic politics, economics, and international relations.
Political experts had difficulty anticipating the USSR’s collapse, Tetlock found, because a prediction that
not only forecast the regime’s demise but also understood the reasons for it required different strands
of argument to be woven together. There was nothing inherently contradictory about these ideas, but
they tended to emanate from people on different sides of the political spectrum, and scholars firmly
entrenched in one ideological camp were unlikely to have embraced them both.
On the one hand, Gorbachev was clearly a major part of the story—his desire for reform had been
sincere. Had Gorbachev chosen to become an accountant or a poet instead of entering politics, the
Soviet Union might have survived at least a few years longer. Liberals were more likely to hold this
sympathetic view of Gorbachev. Conservatives were less trusting of him, and some regarded his talk of
glasnost as little more than posturing.
Conservatives, on the other hand, were more instinctually critical of communism. They were quicker to
understand that the USSR’s economy was failing and that life was becoming increasingly difficult for the
average citizen. As late as 1990, the CIA estimated—quite wrongly—that the Soviet Union’s GDP was
about half that of the United States (on a per capita basis, tantamount to where stable democracies like
South Korea and Portugal are today). In fact, more recent evidence has found that the Soviet economy—
weakened by its long war with Afghanistan and the central government’s inattention to a variety of
social problems—was roughly $1 trillion poorer than the CIA had thought and was shrinking by as much
as 5 percent annually, with inflation well into the double digits.
Take these two factors together, and the Soviet Union’s collapse is fairly easy to envision. By opening
the country’s media and its markets and giving his citizens greater democratic authority, Gorbachev had
provided his people with the mechanism to catalyze a regime change. And because of the dilapidated
state of the country’s economy, they were happy to take him up on his offer. The center was too weak
to hold: not only were Estonians sick of Russians, but Russians were nearly as sick of Estonians, since the
satellite republics contributed less to the Soviet economy than they received in subsidies from Moscow.
Once the dominoes began falling in Eastern Europe—Czechoslovakia, Poland, Romania, Bulgaria,
Hungary, and East Germany were all in the midst of revolution by the end of 1989—there was little
Gorbachev or anyone else could do to prevent them from caving the country in. A lot of Soviet scholars
understood parts of the problem, but few experts had put all the puzzle pieces together, and almost no
one had forecast the USSR’s sudden collapse.
Tetlock, inspired by the example of the Soviet Union, began to take surveys of expert opinion in other
areas—asking the experts to make predictions about the Gulf War, the Japanese real-estate bubble, the
potential secession of Quebec from Canada, and almost every other major event of the 1980s and
1990s. Was the failure to predict the collapse of the Soviet Union an anomaly, or does “expert” political
analysis rarely live up to its billing? His studies, which spanned more than fifteen years, were eventually
published in the 2005 book Expert Political Judgment.
Tetlock’s conclusion was damning. The experts in his survey—regardless of their occupation, experience,
or subfield—had done barely any better than random chance, and they had done worse than even
rudimentary statistical methods at predicting future political events. They were grossly overconfident
and terrible at calculating probabilities: about 15 percent of events that they claimed had no
chance of occurring in fact happened, while about 25 percent of those that they said were
absolutely sure things in fact failed to occur. It didn’t matter whether the experts were making
predictions about economics, domestic politics, or international affairs; their judgment was equally bad
across the board.
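One way to make the calibration claim concrete is to group forecasts by the probability the forecaster stated and compare that with how often the events actually happened. The sketch below is illustrative only: the `calibration_table` function and the sample data are hypothetical, chosen to mirror the percentages quoted above, and are not Tetlock's data or method.

```python
from collections import defaultdict


def calibration_table(forecasts):
    """Group (stated_probability, happened) pairs by stated probability.

    Returns {stated_probability: observed_frequency}. A well-calibrated
    forecaster has observed frequencies close to the stated probabilities.
    """
    buckets = defaultdict(list)
    for stated, happened in forecasts:
        buckets[stated].append(1 if happened else 0)
    return {p: sum(v) / len(v) for p, v in sorted(buckets.items())}


# Hypothetical data shaped like the pattern described in the excerpt:
# 20 "no chance" calls, 3 of which happened (15 percent), and
# 20 "sure thing" calls, 5 of which failed (so 75 percent came true).
sample = (
    [(0.0, True)] * 3 + [(0.0, False)] * 17
    + [(1.0, True)] * 15 + [(1.0, False)] * 5
)

table = calibration_table(sample)
for stated, observed in table.items():
    print(f"stated {stated:.0%} -> observed {observed:.0%}")
```

A perfectly calibrated forecaster would show 0% and 100% in the observed column here; the gap between stated and observed is the overconfidence.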
The Right Attitude for Making Better Predictions: Be Foxy
While the experts’ performance was poor in the aggregate, however, Tetlock found that some had done
better than others. On the losing side were those experts whose predictions were cited most frequently
in the media. The more interviews that an expert had done with the press, Tetlock found, the worse his
predictions tended to be.
Another subgroup of experts had done relatively well, however. Tetlock, with his training as a
psychologist, had been interested in the experts’ cognitive styles—how they thought about the world.
So he administered some questions lifted from personality tests to all the experts.
On the basis of their responses to these questions, Tetlock was able to classify his experts along a
spectrum between what he called hedgehogs and foxes. The reference to hedgehogs and
foxes comes from the title of an Isaiah Berlin essay on the Russian novelist Leo Tolstoy—The
Hedgehog and the Fox. Berlin had in turn borrowed his title from a passage attributed to the Greek
poet Archilochus: “The fox knows many little things, but the hedgehog knows one big thing.”
Unless you are a fan of Tolstoy—or of flowery prose—you’ll have no particular reason to read
Berlin’s essay. But the basic idea is that writers and thinkers can be divided into two broad categories:
• Hedgehogs are type A personalities who believe in Big Ideas—in governing
principles about the world that behave as though they were physical laws and undergird virtually every
interaction in society. Think Karl Marx and class struggle, or Sigmund Freud and the unconscious. Or
Malcolm Gladwell and the “tipping point.”
• Foxes, on the other hand, are scrappy creatures who believe in a plethora of little
ideas and in taking a multitude of approaches toward a problem. They tend to be more tolerant of
nuance, uncertainty, complexity, and dissenting opinion. If hedgehogs are hunters, always looking out
for the big kill, then foxes are gatherers.
Foxes, Tetlock found, are considerably better at forecasting than hedgehogs. They had come closer to
the mark on the Soviet Union, for instance. Rather than seeing the USSR in highly ideological terms—as
an intrinsically “evil empire,” or as a relatively successful (and perhaps even admirable) example of a
Marxist economic system—they instead saw it for what it was: an increasingly dysfunctional nation that
was in danger of coming apart at the seams. Whereas the hedgehogs’ forecasts were barely any better
than random chance, the foxes’ demonstrated predictive skill.
How Foxes Think
Multidisciplinary: Incorporate ideas from different disciplines and regardless of their origin on the political spectrum.
Adaptable: Find a new approach—or pursue multiple approaches at the same time—if they aren’t
sure the original one is working.
Self-critical: Sometimes willing (if rarely happy) to acknowledge mistakes in their predictions and
accept the blame for them.
Tolerant of complexity: See the universe as complicated, perhaps to the point of many
fundamental problems being irresolvable or inherently unpredictable.
Cautious: Express their predictions in probabilistic terms and qualify their opinions.
Empirical: Rely more on observation than theory.
Foxes are better forecasters.
How Hedgehogs Think
Specialized: Often have spent the bulk of their careers on one or two great problems. May view
the opinions of “outsiders” skeptically.
Stalwart: Stick to the same “all-in” approach—new data is used to refine the original model.
Stubborn: Mistakes are blamed on bad luck or on idiosyncratic circumstances—a good model had
a bad day.
Order-seeking: Expect that the world will be found to abide by relatively simple governing
relationships once the signal is identified through the noise.
Confident: Rarely hedge their predictions and are reluctant to change them.
Ideological: Expect that solutions to many day-to-day problems are manifestations of some
grander theory or struggle.
Hedgehogs are weaker forecasters.
- pp. 265-269 (RAGE AGAINST THE MACHINES): The father of the modern chess computer was MIT’s
Claude Shannon, a mathematician regarded as the founder of information theory, who in 1950
published a paper called “Programming a Computer for Playing Chess.” Shannon identified some of the
algorithms and techniques that form the backbone of chess programs today. He also recognized why
chess is such an interesting problem for testing the powers of information-processing machines.
Chess, Shannon realized, has an exceptionally clear and distinct goal—achieving checkmate. Moreover,
it follows a relatively simple set of rules and has no element of chance or randomness. And yet, as
anybody who has played chess has realized (I am not such a good player myself), using those simple
rules to achieve that simple goal is not at all easy. It requires deep concentration to survive more than a
couple of dozen moves into a chess game, let alone to actually win one. Shannon saw chess as a litmus
test for the power of computers and the sort of abilities they might someday possess.
But Shannon, in contrast to some who came after him, did not hold the romanticized notion that
computers might play chess in the same way that humans do. Nor did he see their victory over humans
at chess as being inevitable. Instead, he saw four potential advantages for computers:
1. They are very fast at making calculations.
2. They won’t make errors, unless the errors are encoded in the program.
3. They won’t get lazy and fail to fully analyze a position or all the possible moves.
4. They won’t play emotionally and become overconfident in an apparent winning position
that might be squandered or grow despondent in a difficult one that might be salvaged.
These were to be weighed, Shannon thought, against four distinctly human advantages:
1. Our minds are flexible, able to shift gears to solve a problem rather than follow a set code.
2. We have the capacity for imagination.
3. We have the ability to reason.
4. We have the ability to learn.
It seemed like a fair fight to Shannon. But that was only the case for a few fleeting moments in
the mid-1990s, when the Russian grandmaster Garry Kasparov—the best chess player of all
time—went up against what was then one of the most advanced computers ever built, IBM’s Deep Blue.
Before their match, humans were winning the fight—it wasn’t even close. Yet computers have
prevailed ever since, and will continue to do so for as long as we live.
Chess, Prediction, and Heuristics
In accordance with Bayes’s theorem, prediction is fundamentally a type of information-
processing activity—a matter of using new data to test our hypotheses about the objective
world, with the goal of coming to truer and more accurate conceptions about it.
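As a rough illustration of using new data to test a hypothesis, here is Bayes's theorem applied once. The function name `bayes_update` and the numbers are made up for this sketch; they do not come from the book.

```python
def bayes_update(prior, p_data_if_true, p_data_if_false):
    """Return P(hypothesis | data) via Bayes's theorem."""
    # Total probability of seeing the data under either possibility.
    evidence = prior * p_data_if_true + (1 - prior) * p_data_if_false
    return prior * p_data_if_true / evidence


# Hypothetical numbers: start 50/50 on a hypothesis, then observe data
# that is three times as likely if the hypothesis is true (0.6 vs 0.2).
posterior = bayes_update(prior=0.5, p_data_if_true=0.6, p_data_if_false=0.2)
print(round(posterior, 2))  # 0.75
```

Feeding the posterior back in as the next prior repeats the update as further data arrives.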
Chess might be thought of as analogous to prediction. The players must process information—
the position of the thirty-two pieces on the board and their possible moves. They use this
information to devise strategies to place their opponent in checkmate. These strategies in
essence represent different hypotheses about how to win the game. Whoever succeeds in that
task had the better hypothesis.
Chess is deterministic—there is no real element of luck involved. But the same is theoretically
true of the weather . . . Our knowledge of both systems is subject to considerable imperfections.
In weather, much of the problem is that our knowledge of the initial conditions is incomplete.
Even though we have a very good idea of the rules by which the weather system behaves, we
have incomplete information about the position of all the molecules that form clouds and
rainstorms and hurricanes. Hence, the best we can do is make probabilistic forecasts.
In chess, we have both complete knowledge of the governing rules and perfect
information—there are a finite number of chess pieces, and they’re right there in plain sight. But
the game is still very difficult for us. Chess speaks to the constraints on our information-
processing capabilities—and it might tell us something about the best strategies for making
decisions despite them. The need for prediction arises not necessarily because the world itself is
uncertain, but because understanding it fully is beyond our capacity.
Both computer programs and human chess masters therefore rely on making simplifications to
forecast the outcome of the game. We can think of these simplifications as “models,” but heuristics is
the preferred term in the study of computer programming and human decision making. It comes
from the same Greek root word from which we derive eureka. A heuristic approach to
problem solving consists of employing rules of thumb when a deterministic solution to a
problem is beyond our practical capacities.
Heuristics are very useful things, but they necessarily produce biases and blind spots. For
instance, the heuristic “When you encounter a dangerous animal, run away!” is often a useful
guide but not when you meet a grizzly bear; she may be startled by your sudden movement and
she can easily outrun you. (Instead, the National Park Service advises you to remain as quiet and
as still as possible when you encounter a grizzly bear and even to play dead if necessary.)*
Humans and computers apply different heuristics when they play chess. When they play against
each other, the game usually comes down to who can find his opponent’s blind spots first.
* Lauren Himiak, “Bear Safety Tips,” National & State Parks, About.com. http://usparks.about.com/od/backcoun...ear-Safety.htm
Kasparov’s Failed Prediction
In January 1988, Garry Kasparov, the top-rated chess player in the world from 1986 until his
retirement in 2005, predicted that no computer program would be able to defeat a human
grandmaster at chess until at least the year 2000. “If any grandmaster has difficulties playing
computers,” he quipped at a press conference in Paris, “I would be happy to provide my advice.”
Later that same year, however, the Danish grandmaster Bent Larsen was defeated by a program
named Deep Thought, a graduate-school project by several students at Carnegie Mellon University.
The garden-variety grandmaster, however, was no Kasparov, and when Deep Thought squared
off against Kasparov in 1989 it was resoundingly defeated. Kasparov has always respected the
role of computing technology in chess, and had long studied with computers to improve his
game, but he offered Deep Thought only the faintest praise, suggesting that one day a computer
could come along that might require him to exert his “100 percent capabilities” in order to defeat it.
The programmers behind Deep Thought, led by Feng-hsiung Hsu and Murray Campbell, were
eventually hired by IBM, where their system evolved into Deep Blue. Deep Blue did defeat
Kasparov in the first game of a match in Philadelphia in 1996, but Kasparov rebounded to claim
the rest of the series fairly easily. It was the next year, in a rematch in New York, when the
unthinkable happened. Garry Kasparov, the best and most intimidating chess player in history,
was intimidated by a computer.
In the Beginning . . .
A chess game, like everything else, has three parts: the beginning, the middle and the end.
What’s a little different about chess is that each of these phases tests different intellectual and
emotional skills, making the game a mental triathlon of speed, strength, and stamina.
In the beginning of a chess game the center of the board is void, with pawns, rooks, and
bishops neatly aligned in the first two rows awaiting instructions from their masters. The
possibilities are almost infinite. White can open the game in any of twenty different ways, and
black can respond with twenty of its own moves, creating 400 possible sequences after the
first full turn. After the second full turn, there are 71,852 possibilities; after the third, there are
9,132,484. The number of possibilities in an entire chess game, played to completion, is so large
that it is a significant problem even to estimate it, but some mathematicians put the number as
high as [ten to the power of ten to the power of fifty]. These are astronomical numbers: as
Diego Rasskin-Gutman has written, “There are more possible chess games than the number of
atoms in the universe.” [Garry Kasparov, “The Chess Master and the Computer,” New York
Review of Books, February 11, 2010.]
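The count after the first full turn is just a product: each of white's twenty opening moves can be met by each of black's twenty replies. A quick arithmetic check (variable names are illustrative, not from the book):

```python
# White's first moves: 8 pawns with 2 squares each, plus 2 knights with 2 squares each.
white_first_moves = 8 * 2 + 2 * 2
# Black has the mirror-image set of replies.
black_replies = 8 * 2 + 2 * 2

sequences_after_one_turn = white_first_moves * black_replies
print(white_first_moves, black_replies, sequences_after_one_turn)  # 20 20 400
```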
- pp. 276-279 (The Beginning of the End): In the final stage of a chess game, the endgame,
the number of pieces on the board are fewer, and winning combinations are sometimes more
explicitly calculable. Still, this phase of the game necessitates a lot of precision, since closing out
a narrowly winning position often requires dozens of moves to be executed properly without
any mistakes . . . .
The endgame can be a mixed blessing for computers. There are few intermediate tactical goals
left, and unless a computer can literally solve the position to the bitter end, it may lose the
forest for the trees. However, just as chess computers have databases to cover the opening
moves, they also have databases of these endgame scenarios. Literally all positions in which
there are six or fewer pieces on the board have been solved to completion. Work on seven-
piece positions is mostly complete – some of the solutions are intricate enough to require as
many as 517 moves – but computers have memorized exactly which are the winning, losing, and drawing ones.
Thus, something analogous to a black hole has emerged by this stage of the game: a point
beyond which the gravity of the game tree becomes inescapable, when the computer will draw
all positions that should be drawn and win all of them that should be won. The abstract goals of
this autumnal phase of a chess game are replaced by a set of concrete ones: get your queenside
pawn to here, and you will win; induce black to move his rook there, and you will draw.
Deep Blue, then, had some incentive to play on against Kasparov in Game 1. Its circuits told it
that its position was a losing one, but even great players like Kasparov make serious blunders
about once per seventy-five moves. One false step by Kasparov might have been enough to
trigger Deep Blue’s sensors and allow it to find a drawing position. Its situation was desperate,
but not quite hopeless.
Instead, Deep Blue did something very strange, at least to Kasparov’s eyes. On its forty-fourth
turn, Deep Blue moved one of its rooks into white’s first row rather than into a more
conventional position that would have placed Kasparov’s king into check. The computer’s move
seemed completely pointless. At a moment when it was under assault from every direction, it
had essentially passed its turn, allowing Kasparov to advance one of his pawns into black’s
second row, where it threatened to be promoted to a queen. Even more strangely, Deep Blue
resigned the game just one turn later.
What had the computer been thinking? Kasparov wondered. He was used to seeing Deep
Blue commit strategic blunders—for example, accepting the bishop-rook exchange—in complex
positions where it simply couldn’t think deeply enough to recognize the implications. But this
had been something different: a tactical error in a relatively simple position—exactly the
sort of mistake that computers don’t make.
“How can a computer commit suicide like that?” Kasparov asked Frederic Friedel, a German
chess journalist who doubled as his friend and computer expert, when they studied the match
back at the Plaza Hotel that night. There were some plausible explanations, none of which
especially pleased Kasparov. Perhaps Deep Blue had indeed committed “suicide,” figuring that
since it was bound to lose anyway, it would rather not reveal any more to Kasparov about how it
played. Or perhaps, Kasparov wondered, it was part of some kind of elaborate hustle? Maybe
the programmers were sandbagging, hoping to make the hubristic Kasparov overconfident by
throwing the first game?
Kasparov did what came most naturally to him when he got anxious and began to pore through
the data. With the assistance of Friedel and the computer program Fritz, he found that the
conventional play—black moving its rook into the sixth column and checking white’s king—
wasn’t such a good move for Deep Blue after all: it would ultimately lead to a checkmate for
Kasparov, although it would still take more than twenty moves for him to complete it.
But what this implied was downright frightening. The only way the computer would pass on a
line that would have required Kasparov to spend twenty moves to complete his checkmate, he
reasoned, is if it had found another one that would take him longer. As Friedel recalled:
Deep Blue had actually worked it all out, down to the very end and simply chosen the least
obnoxious losing line. “It probably saw mates in 20 and more,” said Garry, thankful that he had
been on the right side of these awesome calculations.
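The reasoning Kasparov attributed to Deep Blue, picking the losing line that delays mate the longest, can be sketched as a simple selection rule. This is a toy illustration and not Deep Blue's actual algorithm; the move names and mate distances below are invented.

```python
def least_bad_losing_move(moves_to_mate):
    """Among moves that all lose, pick the one that postpones mate longest.

    moves_to_mate maps a candidate move to the number of moves the
    opponent needs to force checkmate after it.
    """
    return max(moves_to_mate, key=moves_to_mate.get)


# Hypothetical candidates: the "conventional" check loses in 20 moves,
# while the odd-looking rook move holds out a little longer.
candidates = {"rook_check": 20, "rook_to_first_row": 23, "king_step": 12}
choice = least_bad_losing_move(candidates)
print(choice)  # rook_to_first_row
```

The point of the sketch is only that a move which looks pointless can still be the best available one when every option loses.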
To see twenty moves ahead in a game as complex as chess was once thought to be impossible
for both human beings and computers. Kasparov’s proudest moment, he once claimed, had
come in a match in the Netherlands in 1999, when he had visualized a winning position some
fifteen moves in advance. Deep Blue was thought to be limited to a range of six to eight moves
ahead in most cases. Kasparov and Friedel were not exactly sure what was going on, but what
had seemed to casual observers like a random and inexplicable blunder instead seemed to them
to reveal great wisdom.
Kasparov would never defeat Deep Blue again.
- pp. 282-283: Kasparov resolved that he wouldn’t be able to beat Deep Blue by playing the forceful,
intimidating style of chess that made him World Champion. Instead, he would have to try to trick the
computer with a cautious and unconventional style, in essence playing the role of the hacker who prods
a program for vulnerabilities. But Kasparov’s opening move in the third game, while unusual enough to
knock Deep Blue out of its databases, was too inferior to yield anything better than a draw. Kasparov
played better in the fourth and fifth games, seeming to have the advantage at points in both of them,
but couldn’t overcome the gravity of Deep Blue’s endgame databases and drew both of them as well.
The match was square at one win for each player and three ties, with one final game to play.
On the day of the final game, Kasparov showed up at the Equitable Center looking tired and
forlorn; Friedel later recalled that he had never seen him in such a dark mood. Playing the black pieces,
Kasparov opted for something called the Caro-Kann Defense. The Caro-Kann is considered somewhat
weak—black’s winning percentage with it is 44.7 percent historically—although far from irredeemable
for a player like Karpov who knows it well. But Kasparov did not know the Caro-Kann; he had rarely
played it in tournament competition. After just a few moves, he was straining, taking a long time to
make decisions that were considered fairly routine. And on his seventh move, he committed a grievous
blunder, offering a knight sacrifice one move too early. Kasparov recognized his mistake almost
immediately, slumping down in his chair and doing nothing to conceal his displeasure. Just twelve moves
later—barely an hour into the game—he resigned, storming away from the table.
Deep Blue had won. Only, it had done so less with a bang than an anticlimactic whimper. Was Kasparov
simply exhausted, exacerbating his problems by playing an opening line with which he had little
familiarity? Or, as the grandmaster Patrick Wolff concluded, had Kasparov thrown the game, to
delegitimize Deep Blue’s accomplishment? Was there any significance to the fact that the line he had
selected, the Caro-Kann, was a signature of Karpov, the rival whom he had so often vanquished?
But these subtleties were soon lost to the popular imagination. Machine had triumphed over man! It
was like when HAL 9000 took over the spaceship. Like the moment when, exactly thirteen seconds into
“Love Will Tear Us Apart,” the synthesizer overpowers the guitar riff, leaving rock and roll in its dust.
[This metaphor is borrowed from Bill Wyman, a music critic for the Chicago Reader, who ranked it as the
greatest moment in rock history. Bill Wyman, “The 100 Greatest Moments in Rock History,” Chicago
Reader, September 28, 1995. http://www.chicagoreader.com/chicago...ent?oid=888578]
- pp. 204-209 [Chapter 7 (Role Models)]: The flu hit Fort Dix like clockwork every January; it had almost
become a rite of passage. Most of the soldiers would go home for Christmas each year, fanning out to all
corners of the United States for their winter break. They would then return to the base, well-fed and
well-rested, but also carrying whichever viruses might have been going around their hometowns. If the
flu was anywhere in the country, it was probably coming back with them. Life in the cramped setting of
the barracks, meanwhile, offered few opportunities for privacy or withdrawal. If someone—anyone—
had caught the flu back home, he was more likely than not to spread it to the rest of the platoon. You
could scarcely conjure a scenario more favorable to transmission of the disease.
Usually this was no cause for concern; tens of millions of Americans catch the flu in January and
February every year. Few of them die from it, and young, healthy men like David Lewis, a nineteen-year-
old private from West Ashley, Massachusetts, who had returned to Fort Dix that January, are rarely
among the exceptions. So Lewis, even though he’d been sicker than most of the recruits and ordered to
stay in the barracks, decided to join his fellow privates on a fifty-mile march through the snow-
blanketed marshlands of central New Jersey. He was in no mood to let a little fever bother him—it was
1976, the year of the nation’s bicentennial, and the country needed order and discipline in the uncertain
days following Watergate and Vietnam.
But Lewis never made it back to the barracks: thirteen miles into the march, he collapsed and was later
pronounced dead. An autopsy revealed that Lewis’s lungs were flush with blood: he had died of
pneumonia, a common complication of flu, but not usually one to kill a healthy young adult like Lewis.
The medics at Fort Dix had already been nervous about that year’s flu bug. Although some of
the several hundred soldiers who had gotten ill that winter had tested positive for the A/Victoria flu
strain—the name for the common and fairly benign virus that was going around the world that year—
there were others like Lewis who had suffered from an unidentified and apparently much more severe
type of flu. Samples of their blood were sent to the Center for Disease Control (CDC) in Atlanta for testing.
Two weeks later the CDC revealed the identity of the mysterious virus. It was not a new type of flu after
all but instead something altogether more disturbing, a ghost from epidemics past: influenza virus type
H1N1, more commonly known as the swine flu. H1N1 had been responsible for the worst pandemic in
modern history: the Spanish flu of 1918-20, which afflicted a third of humanity and killed 50 million,
including 675,000 in the United States. For reasons of both science and superstition, the disclosure sent
a chill through the nation’s epidemiological community. The 1918 outbreak’s earliest manifestations had
also come at a military base, Fort Riley in Kansas, where soldiers were busy preparing to enter World
War I. Moreover, there was a belief at that time—based on somewhat flimsy scientific evidence—that a
major flu epidemic manifested itself roughly once every ten years. The flu had been severe in 1938,
1947, 1957, and 1968; in 1976, the world seemed due for the next major pandemic.
A series of dire predictions soon followed. The concern was not an immediate outbreak—by the time
the CDC had positively identified the H1N1 strain, flu season had already run its course. But scientists
feared that it foreshadowed something much worse the following winter. There had never been a case,
a prominent doctor noted to the New York Times, in which a newly identified strain of the flu had
failed to outcompete its rivals and become the global hegemon: wimpy A/Victoria stood no chance
against its more virulent and ingenious rival. And if H1N1 were anywhere near as deadly as the 1918
version had been, the consequences might be very bad indeed. Gerald Ford’s secretary of health, F.
David Mathews, predicted that one million Americans would die, eclipsing the 1918 total.
President Ford found himself in a predicament. The vaccine industry, somewhat like the fashion
industry, needs at least six months of lead time to know what the hip vaccine is for the new season; the
formula changes a little bit every year. If they suddenly had to produce a vaccine that guarded against
H1N1—and particularly if they were going to produce enough of it for the entire nation—they would
need to get started immediately. Meanwhile, Ford was struggling to overcome a public perception that
he was slow-witted and unsure of himself—an impression that grew more entrenched every weekend
with Chevy Chase’s bumbling-and-stumbling caricature of him on NBC’s new hit show, Saturday Night
Live. So Ford took the resolute step of asking Congress to authorize some 200 million doses of
vaccine, and ordered a mass vaccination program, the first the country had seen since Jonas Salk had
developed the polio vaccine in the 1950s.
The press portrayed the mass vaccination program as a gamble. But Ford thought of it as a gamble
between money and lives, and one that he was on the right side of. Overwhelming majorities in both
houses of Congress approved his plans at a cost of $180 million.
By summer, however, there were serious doubts about the government’s plans. Although summer is the
natural low season for the flu in the United States, it was winter in the Southern Hemisphere, when flu is
normally at its peak. And nowhere, from Auckland to Argentina, were there any signs of H1N1; instead,
the mild and common A/Victoria was the dominant strain again. Indeed, the roughly two hundred cases
at Fort Dix remained the only confirmed cases of H1N1 anywhere in the world, and Private Lewis’s the
only death. Criticism started to pour in from all quarters: from the assistant director of the CDC, the
World Health Organization, the prestigious British medical journal The Lancet, and the editorial
pages of the New York Times, which was already characterizing the H1N1 threat as a “false alarm.” No
other Western country had called for such drastic measures.
Instead of admitting that they had overestimated the threat, the Ford administration doubled down,
preparing a series of frightening public service announcements that ran in regular rotation on the
nation’s television screens that fall. One mocked the naïveté of those who refused flu shots—“I’m the
healthiest fifty-five-year-old you’ve ever seen—I play golf every weekend!” the balding everyman
says, only to be shown on his deathbed moments later. Another featured a female narrator tracing the
spread of the virus from one person to the next, dishing about it in breathy tones as though it were an
STD—“Betty’s mother gave it to the cabdriver . . . and to one of the charming stewardesses . . . and
then she gave it to her friend Dottie, who had a heart condition and died.”
The campy commercials were intended to send a very serious message: Be afraid, be very afraid.
Americans took the hint. Their fear, however, manifested itself as much toward the vaccine as toward
the disease itself. Throughout American history, the notion of the government poking needles into
everyone’s arm has always provoked more than its fair share of anxiety. But this time there was a more
tangible basis for public doubt. In August of that year, under pressure from the drug companies,
Congress and the White House had agreed to indemnify them from legal liability in the event of
manufacturing defects. This was widely read as a vote of no-confidence; the vaccine looked as though it
was being rushed out without adequate time for testing. Polls that summer showed that only about 50
percent of Americans planned to get vaccinated, far short of the government’s 80 percent goal.
The uproar did not hit a fever pitch until October, when the vaccination program began. On
October 11, a report surfaced from Pittsburgh that three senior citizens had died shortly after receiving
their flu shots; so had two elderly persons in Oklahoma City; so had another in Fort Lauderdale. There
was no evidence that any of the deaths were linked to the vaccinations—elderly people die every day,
after all. But between the anxiety about the government’s vaccination program and the media’s dubious
understanding of statistics, every death of someone who’d gotten a flu shot became a cause for alarm.
Even Walter Cronkite, the most trusted man in America—who had broken from his trademark austerity
to admonish the media for its sensational handling of the story—could not calm the public down.
Pittsburgh and many other cities shuttered their clinics.
By late fall, another problem had emerged, this one far more serious. About five hundred patients, after
receiving their shots, had begun to exhibit the symptoms of a rare neurological condition known as
Guillain-Barre syndrome, an autoimmune disorder that can cause paralysis. This time, the statistical
evidence was far more convincing: the usual incidence of Guillain-Barre in the general population is only
about one case per million persons. In contrast, the rate in the vaccinated population had been ten
times that—five hundred cases out of the roughly fifty million people who had been administered the
vaccine. Although scientists weren’t positive why the vaccines were causing Guillain-Barre,
manufacturing defects triggered by the rush production schedule were a plausible culprit, and the
consensus of the medical community was that the vaccine program should be shut down for good,
which the government finally did on December 16.
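The rate comparison in this paragraph is simple arithmetic and can be checked directly. The following is a minimal Python sketch using only the figures quoted above; the function name `cases_per_million` is illustrative and not from the book.

```python
def cases_per_million(cases: int, population: int) -> float:
    """Return incidence expressed as cases per one million people."""
    return cases * 1_000_000 / population


# Figures quoted in the passage above.
baseline_rate = 1.0                                   # usual incidence, per million
vaccinated_rate = cases_per_million(500, 50_000_000)  # cases among the vaccinated

# How many times higher the rate was in the vaccinated population.
ratio = vaccinated_rate / baseline_rate
print(f"vaccinated: {vaccinated_rate:.1f} per million, {ratio:.0f}x baseline")
```

The tenfold gap is what makes this evidence stronger than the scattered October deaths, which had no elevated rate behind them.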
In the end, the outbreak of H1N1 at Fort Dix had been completely isolated; there was never another
confirmed case anywhere in the country. Meanwhile, flu deaths from the ordinary A/Victoria strain were
slightly below average in the winter of 1976-77. It had been much ado about nothing.
The swine flu fiasco—as it was soon dubbed—was a disaster on every level for President Ford, who lost
his bid for another term to the Democrat Jimmy Carter that November. The drug makers had been
absolved of any legal responsibility, leaving more than $2.6 billion in liability claims against the United
States government. It seemed like every local paper had run a story about the poor waitress or
schoolteacher who had done her duty and gotten the vaccine, only to have contracted Guillain-Barre . . .
Ford’s handling of H1N1 was irresponsible on a number of levels. By invoking the likelihood of a
1918-type pandemic, he had gone against the advice of medical experts, who believed at the time that
the chance of such a worst-case outcome was no higher than 35 percent and perhaps as low as 2 percent.
Still, it was not clear what had caused H1N1 to disappear just as suddenly as it emerged. And predictions
about H1N1 would fare little better when it came back some thirty-three years later. Scientists at first
missed H1N1 when it reappeared in 2009. Then they substantially overestimated the threat it might
pose once they detected it.
- pp. 371-380 [A CLIMATE OF HEALTHY SKEPTICISM (The Noise and the Signal)]: Many of the examples in
this book concern cases where forecasters mistake correlation for causation and noise for a signal. Up
until about 1997, the conference of the winning Super Bowl team had been very strongly correlated
with the direction of the stock market over the course of the next year. However, there was no
credible causal mechanism behind the relationship, and if you had made investments on that basis
you would have lost your shirt. The Super Bowl indicator was a false positive.
The reverse can sometimes also be true. Noisy data can obscure the signal, even when there is
essentially no doubt that the signal exists. Take a relationship that few of us would dispute: if you
consume more calories, you are more likely to become fat. Surely such a basic relationship would show
up clearly in the statistical record?
I downloaded data from eighty-four countries for which estimates of both obesity rates and daily
caloric consumption are publicly available. Looked at in this way, the relationship seems surprisingly
tenuous. The daily consumption in South Korea, which has a fairly meat-heavy diet, is about 3,070
calories per person per day, slightly above the world average. However, the obesity rate there is only
about 3 percent. The Pacific island nation of Nauru, by contrast, consumes about as many calories as
South Korea per day, but the obesity rate there is 79 percent. If you plot the eighty-four countries on a
graph . . . there seems to be only limited evidence of a connection between obesity and calorie
consumption; it would not qualify as “statistically significant” by standard tests.*
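To give a feel for what "statistically significant" demands with eighty-four data points, here is a rough Python sketch. It assumes the "standard test" is a two-sided t-test on a Pearson correlation at the 5 percent level, which the passage does not specify. The critical t value of about 1.99 for 82 degrees of freedom is likewise an assumption taken from standard tables, not from the book.

```python
import math


def critical_r(n: int, t_crit: float) -> float:
    """Smallest |r| that reaches significance for n paired observations.

    Obtained by inverting t = r * sqrt((n - 2) / (1 - r**2)) for r.
    """
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)


N_COUNTRIES = 84     # sample size quoted in the text
T_CRIT_5PCT = 1.99   # assumed: approximate two-sided 5% critical t, 82 d.f.

r_needed = critical_r(N_COUNTRIES, T_CRIT_5PCT)
print(f"|r| must exceed roughly {r_needed:.3f}")
```

Under these assumptions, even a fairly weak correlation would pass the test, so a failure to pass suggests how noisy the country-level data is.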
There are, of course, many conflating factors that obscure the relationship. Certain countries have
better genetics, or better exercise habits. And the data is rough: estimating how many calories an adult
consumes in a day is challenging. [One common technique requires adults to dutifully record everything
they eat over a period of weeks, and trusts them to do so honestly when there is a stigma attached to
overeating (and more so in some countries than others).] A researcher who took this statistical
evidence too literally might incorrectly reject the connection between calorie consumption and
obesity, a false negative.
* As I discuss in chapter 6, the concept of “statistical significance” is very often problematic in practice.
But to my knowledge there does not exist a community of “obesity skeptics” who cite statistics like
these to justify a diet of Big Macs and Fritos.
It would be nice if we could just plug data into a statistical model, crunch the numbers, and take for
granted that it was a good representation of the real world. Under some conditions, especially in
data-rich fields like baseball, that assumption is fairly close to being correct. In many other cases, a
failure to think carefully about causality will lead us up blind alleys.
There would be much reason to doubt claims about global warming were it not for their grounding in
causality. The earth’s climate goes through various warm and cold phases that play out over periods of
years or decades or centuries. These cycles long predate the dawn of industrial civilization.
However, predictions are potentially much stronger when backed up by a sound understanding of the
root causes behind a phenomenon. We do have a good understanding of the cause of global warming: it
is the greenhouse effect.
The Greenhouse Effect Is Here
In 1990, two years after Hansen’s hearing, the United Nations’ Intergovernmental Panel on Climate Change
(IPCC) released more than a thousand pages of findings about the science of climate change in its First
Assessment Report. Produced over several years by a team of hundreds of scientists from around the
globe, the report went into voluminous detail on the potential changes in temperatures and
ecosystems, and outlined a variety of strategies to mitigate these effects.
The IPCC’s scientists classified just two findings as being absolutely certain, however. These findings did
not rely on complex models, and they did not make highly specific predictions about the climate.
Instead, they were based on relatively simple science that had been well-understood for more than 150
years and which is rarely debated even by self-described climate skeptics. They remain the most
important scientific conclusions about climate change today.
The IPCC’s first conclusion was simply that the greenhouse effect exists:
There is a natural greenhouse effect that keeps the Earth warmer than it otherwise would be. [J. T.
Houghton, G. J. Jenkins, and J. J. Ephraums, “Report Prepared for Intergovernmental Panel on Climate
Change by Working Group I,” Climate Change: The IPCC Scientific Assessment (Cambridge:
Cambridge University Press, 1990), p. XI.]
The greenhouse effect is the process by which certain atmospheric gases—principally water vapor,
carbon dioxide (CO2), methane, and ozone—absorb solar energy that has been reflected from the
earth’s surface. Were it not for this process, about 30 percent of the sun’s energy would be reflected
back out into space in the form of infrared radiation. That would leave the earth’s temperatures much
colder than they actually are: about 0 [degrees] Fahrenheit or -18 [degrees] Celsius on average, or the
same as a warm day on Mars.
Conversely, if these gases become more plentiful in the atmosphere, a higher fraction of the sun’s
energy will be trapped and reflected back onto the surface, making temperatures much warmer. On
Venus, which has a much thicker atmosphere consisting almost entirely of carbon dioxide, the average
temperature is 460 [degrees Celsius]. Some of that heat comes from Venus’s proximity to the sun, but
much of it is because of the greenhouse effect.
There is no scenario in the foreseeable future under which the earth’s climate will come to resemble
that of Venus. However, the climate is fairly sensitive to changes in atmospheric composition, and
human civilization thrives within a relatively narrow band of temperatures. The coldest world capital is
Ulan Bator, Mongolia, where temperatures average about -1 [degrees Celsius] (or +30 [degrees
Fahrenheit]) over the course of the year; the warmest is probably Kuwait City, Kuwait, where they
average +27 [degrees Celsius] (+81 [degrees Fahrenheit]). Temperatures can be hotter or cooler
during winter or summer or in sparsely populated areas, but the temperature extremes are modest on
an interplanetary scale. On Mercury, by contrast, which has little atmosphere to protect it,
temperatures often vary between about -200 [degrees Celsius] and +400 [degrees Celsius] over the
course of a single day.
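The paired Celsius and Fahrenheit figures in the preceding paragraphs can be checked with the standard conversion formula. This is a small Python sketch; the labels are shorthand for the values quoted in the text.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# Celsius values quoted in the passage, with shorthand labels.
quoted = [
    ("Earth without greenhouse effect", -18),
    ("Ulan Bator annual average", -1),
    ("Kuwait City annual average", 27),
    ("Venus average", 460),
]

for label, celsius in quoted:
    print(f"{label}: {celsius} C = {celsius_to_fahrenheit(celsius):.1f} F")
```

The results agree with the rounded Fahrenheit values given in the text (about 0, +30, and +81).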
The IPCC’s second conclusion made an elementary prediction based on the greenhouse effect: as the
concentration of greenhouse gases increased in the atmosphere, the greenhouse effect and global
temperatures would increase along with them:
Emissions resulting from human activities are substantially increasing the atmospheric concentrations
of the greenhouse gases carbon dioxide, methane, chlorofluorocarbons (CFCs) and nitrous oxide. These
increases will enhance the greenhouse effect, resulting on average in additional warming of the Earth’s
surface. The main greenhouse gas, water vapor, will increase in response to global warming and
further enhance it.
This IPCC finding makes several different assertions, each of which is worth considering in turn.
First, it claims that atmospheric concentrations of greenhouse gases like CO2 are increasing, and
as a result of human activity. This is a matter of simple observation. Many industrial processes,
particularly the use of fossil fuels, produce CO2 as a by-product.* Because CO2 remains in the
atmosphere for a long time, its concentrations have been rising: from about 315 parts per million
(ppm) when CO2 levels were first directly monitored at the Mauna Loa Observatory in Hawaii in
1959 to about 390 ppm as of 2011. [“Full Mauna Loa CO2 Record” in Trends in Atmospheric Carbon
Dioxide, Earth System Research Laboratory, National Oceanic & Atmospheric Administration
Research, U.S. Department of Commerce. http://www.esrl.noaa.gov/gmd/ccgg/trends/#mlo_full]
* “Human-Related Sources and Sinks of Carbon Dioxide” in Climate Change—Greenhouse Gas
Emissions, Environmental Protection Agency. http://www.epa.gov/climatechange/emi...co2_human.html
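The two CO2 readings quoted above imply a total and an average yearly rise, worked out in the Python sketch below. It uses only the figures in the text. The per-year figure assumes a straight-line rise between the two endpoints, which is a simplification and not a claim about the actual shape of the record.

```python
# Endpoint readings quoted in the text.
start_year, start_ppm = 1959, 315
end_year, end_ppm = 2011, 390

rise_ppm = end_ppm - start_ppm                   # total increase in ppm
percent_rise = rise_ppm / start_ppm * 100        # increase relative to 1959
ppm_per_year = rise_ppm / (end_year - start_year)  # assumes a linear trend

print(f"rise: {rise_ppm} ppm ({percent_rise:.1f}%), "
      f"about {ppm_per_year:.2f} ppm per year on average")
```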
The second claim, “these increases will enhance the greenhouse effect, resulting on average in
additional warming of the Earth’s surface,” is essentially just a restatement of the IPCC’s first
conclusion that the greenhouse effect exists, phrased in the form of a prediction. The prediction relies
on relatively simple chemical reactions that were identified in laboratory experiments many years ago.
The greenhouse effect was first proposed by the French physicist Joseph Fourier in 1824 and is usually
regarded as having been proved by the Irish physicist John Tyndall in 1859, the same year that Charles
Darwin published On the Origin of Species. [Isaac M. Held and Brian J. Soden, “Water Vapor
Feedback and Global Warming,” Annual Review of Energy and the Environment, 25 (November
2000), pp. 441-475. http://www.annualreviews.org/doi/abs...nergy.25.1.441]
The third claim—that water vapor will also increase along with gases like CO2, thereby
enhancing the greenhouse effect—is modestly bolder. Water vapor, not CO2, is the largest contributor
to the greenhouse effect. If there were an increase in CO2 alone, there would still be some warming, but
not as much as has been observed to date or as much as scientists predict going forward. But a basic
thermodynamic principle known as the Clausius-Clapeyron relation, which was proposed and proved in
the nineteenth century, holds that the atmosphere can retain more water vapor at warmer
temperatures. Thus, as CO2 and other long-lived greenhouse gases increase in concentration and
warm the atmosphere, the amount of water vapor will increase as well, multiplying the effects of CO2
and enhancing warming.
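The size of this water vapor effect can be illustrated numerically. The Python sketch below does not implement the Clausius-Clapeyron relation itself. It swaps in a Magnus-type empirical approximation for saturation vapor pressure over water, and the coefficients (6.1094, 17.625, 243.04) are an assumption taken from commonly cited meteorological practice, not from the book. The 15 °C starting point is also just an illustrative choice.

```python
import math


def saturation_vapor_pressure_hpa(temp_c: float) -> float:
    """Approximate saturation vapor pressure over water, in hPa.

    Magnus-type empirical fit; the coefficients are assumed, not from the source.
    """
    return 6.1094 * math.exp(17.625 * temp_c / (temp_c + 243.04))


# Compare the air's water-holding capacity before and after 1 degree of warming.
before = saturation_vapor_pressure_hpa(15.0)
after = saturation_vapor_pressure_hpa(16.0)
ratio = after / before

print(f"{before:.2f} hPa -> {after:.2f} hPa "
      f"({(ratio - 1) * 100:.1f}% more per degree)")
```

Under this approximation, one degree of warming near 15 °C raises the saturation vapor pressure by between 6 and 7 percent, which is the sense in which water vapor multiplies the initial warming from CO2.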
This Isn’t Rocket Science
Scientists require a high burden of proof before they are willing to conclude that a hypothesis is
incontrovertible. The greenhouse hypothesis has met this standard, which is why the original IPCC
report singled it out from among hundreds of findings as the only thing that scientists were absolutely
certain about. The science behind the greenhouse effect was simple enough to have been widely
understood by the mid- to late nineteenth century, when the light-bulb and the telephone and the
automobile were being invented—and not the atomic bomb or the iPhone or the Space Shuttle. The
greenhouse effect isn’t rocket science.
Indeed, predictions that industrial activity would eventually trigger global warming were made long
before the IPCC—as early as 1897* by the Swedish chemist Svante Arrhenius, and at many other times
before the warming signal produced by the greenhouse effect had become clear enough to be
distinguished from natural causes. [J. H. Mercer, “West Antarctic Ice Sheet and CO2 Greenhouse Effect:
A Threat of Disaster,” Nature, 271 (January 1978), pp. 321-325. http://stuff.mit.edu/~heimbach/paper..._1978_wais.pdf]
* Kerry A. Emanuel, “Advance Written Testimony,” Hearing on Climate Change: Examining the Processes
Used to Create Science and Policy, House Committee on Science, Space and Technology, U.S. House of
Representatives, March 31, 2011. http://science.house.gov/sites/repub...0testimony.pdf
It now seems almost quaint to refer to the greenhouse effect. In the mid-1980s, the term greenhouse
effect was about five times more common in English-language books than the phrase global
warming. But usage of greenhouse effect peaked in the early 1990s and has been in steady
decline since. It is now used only about one-sixth as often as the term global warming, and one-tenth as
often as the broader term climate change.
This change has largely been initiated by climate scientists* as they seek to expand the predictive
implications of the theory. However, the pullback from speaking about the causes of the
change—the greenhouse effect—yields predictably misinformed beliefs about it. [In this sense,
the term climate change may be inferior to the more specific term global warming.
Climate change creates the impression that any potential change in our environment—
warming or cooling, more precipitation or less—is potentially consistent with the theory. In fact,
some of these phenomena (like cooler temperatures) would contradict the predictions made by the
theory under most circumstances.]
* Erik Conway, “What’s in a Name? Global Warming vs. Climate Change,” NASA.gov.
In January 2012, for instance, the Wall Street Journal published an editorial entitled “No Need to
Panic About Global Warming,” which was signed by a set of sixteen scientists and advocates who might
be considered global warming skeptics. Accompanying the editorial was a video produced by the Wall
Street Journal that was captioned with the following phrase:
A large number of scientists don’t believe that carbon dioxide is causing global warming. [“No Need
to Panic About Global Warming,” Wall Street Journal, January 26, 2012.]
In fact, very few scientists doubt this—there is essentially no debate that greenhouse gases cause global
warming. Among the “believers” in the theory was the physics professor William Happer of Princeton,
who cosigned the editorial and who was interviewed for the video. “Most people like me believe that
industrial emissions will cause warming,” Happer said about two minutes into the video. Happer takes
issue with some of the predictions of global warming’s effects, but not with its cause.
I do not mean to suggest that you should just blindly accept a theory in the face of contradictory
evidence. A theory is tested by means of its predictions, and the predictions made by climate scientists
have gotten some things right and some things wrong. Temperature data is quite noisy. A warming
trend might validate the greenhouse hypothesis or it might be caused by cyclical factors. A
cessation in warming could undermine the theory or it might represent a case where the noise in
the data had obscured the signal.
But even if you believe, as Bayesian reasoning would have it, that almost all scientific hypotheses
should be thought of probabilistically, we should have a greater degree of confidence in a hypothesis
backed up by strong and clear causal relationships. Newly discovered evidence that seems to militate
against the theory should nevertheless lower our estimate of its likelihood, but it should be weighed in
the context of the other things we know (or think we do) about the planet and its climate.
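The point about weighing contrary evidence against the strength of the theory can be made concrete with Bayes's theorem. In the Python sketch below, every probability is hypothetical and chosen only for illustration; none comes from the book. It shows the same piece of contrary evidence applied to a strongly held hypothesis and to a toss-up.

```python
def bayes_update(prior: float, p_evidence_if_true: float,
                 p_evidence_if_false: float) -> float:
    """Return the posterior probability of a hypothesis after seeing evidence."""
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1 - prior) * p_evidence_if_false)


# Hypothetical: the evidence is twice as likely if the hypothesis is false.
P_IF_TRUE, P_IF_FALSE = 0.3, 0.6

strong = bayes_update(0.95, P_IF_TRUE, P_IF_FALSE)  # well-grounded hypothesis
weak = bayes_update(0.50, P_IF_TRUE, P_IF_FALSE)    # toss-up hypothesis

print(f"strong prior 0.95 -> {strong:.3f}; weak prior 0.50 -> {weak:.3f}")
```

In both cases the estimate falls, but the hypothesis with strong causal backing remains likely, while the toss-up drops to about one in three.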
Healthy skepticism needs to proceed from this basis. It needs to weigh the strength of new evidence
against the overall strength of the theory, rather than rummaging through fact and theory alike for
argumentative and ideological convenience, as is the cynical practice when debates become
partisan and politicized.
Three Types of Climate Skepticism
It is hard to imagine a worse time and place to hold a global climate conference than Copenhagen in
December, as the United Nations did in 2009. During the winter solstice there, the days are short and
dark—perhaps four hours of decent sunlight—and the temperatures are cold, with the wind whipping
off the Oresund, the narrow strait that separates Denmark from Sweden.
Worse yet, the beer is expensive: the high taxes on alcohol and pretty much everything else in
Denmark help to pay for a green-technology infrastructure that rivals almost anywhere in the world.
Denmark consumes no more energy today than it did in the late 1960s,* in part because it is
environmentally friendly and in part because of its low population growth. (By contrast, the
United States’ energy consumption has roughly doubled over the same period.) [“United States
Energy Use (kt of oil equivalent),” World Bank data via Google Public Data, last updated March 30, 2012.]
* “Denmark Energy Use (kt of oil equivalent),” World Bank data via Google Public Data, last updated
March 30, 2012.
The implicit message seemed to be that an energy-efficient future would be cold, dark and expensive.
It is little wonder, then, that the mood at Copenhagen’s Bella Center ranged far beyond
skepticism and toward outright cynicism. I had gone to the conference, somewhat naively, seeking a
rigorous scientific debate about global warming. What I found instead was politics, and the differences
Delegates from Tuvalu, a tiny, low-lying Pacific island nation that would be among the most vulnerable
to rising sea levels, roamed the halls, loudly protesting what they thought to be woefully inadequate
targets for greenhouse-gas reduction. Meanwhile, the large nations that account for the vast majority of
greenhouse-gas emissions were nowhere near agreement.
President Obama had arrived at the conference empty-handed, having burned much of his political
capital on his health-care bill and his stimulus package. Countries like China, India, and Brazil, which are
more vulnerable than the United States to climate change impacts because of their geography but are
reluctant to adopt commitments that might impair their economic growth, weren’t quite sure where to
stand. Russia, with its cold climate and its abundance of fossil-fuel resources, was a wild card. Canada,
also cold and energy-abundant, was another, unlikely to push for any deal that the United States lacked
the willpower to enact.* There was some semblance of a coalition among some of the wealthier nations
in Europe, along with Australia, Japan, and many of the world’s poorer countries in Africa and the
Pacific.** But global warming is a problem wherein even if the politics are local, the science is not. CO2
quickly circulates around the planet: emissions from a diesel truck in Qingdao will eventually affect the
climate in Quito. Emissions-reductions targets therefore require near-unanimity, and not mere
coalition-building, in order to be enacted successfully. That agreement seemed years if not decades away.
* “FAQ: Copenhagen Conference 2009,” CBCNews.ca, December 8, 2009.
** Nate Silver, “Despite Protests, Some Reason for Optimism in Copenhagen,” FiveThirtyEight.com,
December 9, 2009.
I was able to speak with a few scientists at the conference. One of them was Richard Rood, a
soft-spoken North Carolinian who once led teams of scientists at NASA and who now teaches a course
on climate policy to students at the University of Michigan.
“At NASA, I finally realized that the definition of rocket science is using relatively simple physics to solve
complex problems,” Rood told me. “The science part is relatively easy. The other parts—how do you
develop policy, how do you respond in terms of public health—these are all relatively difficult
problems because they don’t have as well defined a cause-and-effect mechanism.”
As I was speaking with Rood, we were periodically interrupted by announcements from the Bella
Center’s loudspeaker. “No consensus was found. Therefore I suspend this agenda item,” said a
French-sounding woman, mustering her best English. But Rood articulated the three types of
skepticism that are pervasive in the debate about the future of climate.
One type of skepticism flows from self-interest. In 2011 alone, the fossil fuel industry spent about $300
million on lobbying activities (roughly double what they’d spent just five years earlier).* Some climate
scientists I later spoke with for this chapter used conspiratorial language to describe their activities. But
there is no reason to allege a conspiracy when an explanation based on rational self-interest will suffice:
these companies have a financial incentive to preserve their position in the status quo, and they are
within their First Amendment rights to defend it. What they say should not be mistaken for an attempt
to make accurate predictions, however.
A second type of skepticism falls into the category of contrarianism. In any contentious debate, some
people will find it advantageous to align themselves with the crowd, while a smaller number will come
to see themselves as persecuted outsiders. This may especially hold in a field like climate science, where
the data is noisy and the predictions are hard to experience in a visceral way. And it may be especially
common in the United States, which is admirably independent-minded. “If you look at climate, if you
look at ozone, if you look at cigarette smoking, there is always a community of people who are skeptical
of the science-driven results,” Rood told me.
Most importantly, there is scientific skepticism. “You’ll find that some in the scientific community have
valid concerns about one aspect of the science or the other,” Rood said. “At some level, if you really
want to move forward, we need to respect some of their points of view.”
* “Energy/Natural Resources: Lobbying, 2011,” OpenSecrets.org.
- pp. 382-385 (All the Climate Scientists Agree on Some of the Findings): There is an unhealthy obsession
with the term consensus as it is applied to global warming. Some who dissent from what they see
as the consensus view are proud to acknowledge it and label themselves as heretics.* Others, however,
have sought strength in numbers, sometimes resorting to dubious techniques like circulating online
petitions in an effort to demonstrate how much doubt there is about the theory. [One such petition,
which was claimed to have been signed by 15,000 scientists, later turned up names like Geri Halliwell,
a.k.a. Ginger Spice of the Spice Girls, who had apparently given up her career as a pop star to pursue a
degree in microbiology.] Meanwhile, whenever any climate scientist publicly disagrees with any finding
about global warming, they may claim that this demonstrates a lack of consensus about the theory.
Many of these debates turn on a misunderstanding of the term. In formal usage, consensus is not
synonymous with unanimity—nor with having achieved a simple majority. Instead, consensus connotes
broad agreement after a process of deliberation, during which time most members of a group
coalesce around a particular idea or alternative. (Such as in: “We reached a consensus to get Chinese
food for lunch, but Horatio decided to get pizza instead.”)
A consensus-driven process, in fact, often represents an alternative to voting. Sometimes when a
political party is trying to pick a presidential nominee, one candidate will perform so strongly in
early-voting states like Iowa and New Hampshire that all the others drop out. Even though the
candidate is far from having clinched the nomination mathematically, there may be no need for the
other states to hold a meaningful vote if the candidate has demonstrated that he is acceptable to most
key coalitions within the party. Such a candidate can be described as having won the nomination by
consensus.
Science, at least ideally, is exactly this sort of deliberative process. Articles are published and
conferences are held. Hypotheses are tested, findings are argued over; some survive the scrutiny
better than others.
* Nicholas Dawidoff, “The Civil Heretic,” New York Times Magazine, March 25, 2009.
The IPCC is potentially a very good example of a consensus process. Their reports take years to produce
and every finding is subject to a thorough—if somewhat byzantine and bureaucratic—review process.
“By convention, every review remark has to be addressed,” Rood told me. “If your drunk cousin wants
to make a remark, it will be addressed.”
The extent to which a process like the IPCC’s can be expected to produce better predictions is more
debatable, however. There is almost certainly some value in the idea that different members of a group
can learn from one another’s expertise. But this introduces the possibility of groupthink and herding.
Some members of a group may be more influential because of their charisma or status and not
necessarily because they have the better idea. Empirical studies of consensus-driven predictions have
found mixed results, in contrast to a process wherein individual members of a group submit
independent forecasts and those are averaged or aggregated together, which can almost always be
counted on to improve predictive accuracy.
The IPCC process may reduce the independence of climate forecasters. Although there are nominally
about twenty different climate models used in the IPCC’s forecast, they make many of the same
assumptions and use some of the same computer code; the degree of overlap is significant enough
that they represent the equivalent of just five or six independent models. And however many models
there are, the IPCC settles on just one forecast that is endorsed by the entire group.
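The point about independence can be illustrated with a small simulation. This is only a sketch under invented assumptions: the number of models, the error sizes, and the function name mean_squared_errors are hypothetical and do not come from the book or from any IPCC data. Each simulated model's forecast is the truth plus an error shared by all models (standing in for common assumptions and code) plus its own independent error.

```python
import random
import statistics


def mean_squared_errors(n_models, shared_sd, own_sd, n_trials=2000, seed=0):
    """Return (typical single-model MSE, MSE of the averaged forecast)."""
    rng = random.Random(seed)
    single, averaged = [], []
    for _ in range(n_trials):
        truth = rng.uniform(-1.0, 1.0)
        # Error common to every model (hypothetical stand-in for shared code).
        shared_error = rng.gauss(0.0, shared_sd)
        forecasts = [truth + shared_error + rng.gauss(0.0, own_sd)
                     for _ in range(n_models)]
        single.extend((f - truth) ** 2 for f in forecasts)
        averaged.append((statistics.fmean(forecasts) - truth) ** 2)
    return statistics.fmean(single), statistics.fmean(averaged)


# Hypothetical settings: 20 models, with and without a shared error component.
independent = mean_squared_errors(n_models=20, shared_sd=0.0, own_sd=1.0)
overlapping = mean_squared_errors(n_models=20, shared_sd=0.7, own_sd=1.0)

print("independent models: single vs averaged MSE:", independent)
print("overlapping models: single vs averaged MSE:", overlapping)
```

In this toy setup, averaging helps in both cases, but the shared error does not average away, so the overlapping models gain much less from being combined than the fully independent ones do.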
Climate Scientists Are Skeptical About Computer Models
“It’s critical to have a diversity of models,” I was told by Kerry Emanuel, an MIT meteorologist who is one
of the world’s foremost theorists about hurricanes. “You do not want to put all your eggs in one basket.”
One of the reasons this is so critical, Emanuel told me, is that in addition to the different
assumptions these models employ, they also contain different bugs. “That’s something nobody likes to
talk about,” he said. “Different models have different coding errors. You cannot assume that a model
with millions and millions of lines of code, literally millions of instructions, that there isn’t a mistake in
there.”
If you’re used to thinking about the global warming debate as a series of arguments between “skeptics”
and “believers,” you might presume that this argument emanates from a scientist on the skeptical side
of the aisle. In fact, although Emanuel has described himself as conservative and Republican*--which is
brave enough at MIT—he would probably not think of himself as a global warming skeptic. Instead, he is
a member in good standing of the scientific establishment, having been elected to the National
Academy of Sciences. His 2006 book presented a basically “consensus” (and extremely thoughtful and
well-written) view on climate science. [Kerry Emanuel, What We Know About Climate Change
(Boston: MIT Press, 2007).]
* Neela Banerjee, “Scientist Proves Conservatism and Belief in Climate Change Aren’t Incompatible,”
Los Angeles Times, January 5, 2011.
Emanuel’s concerns are actually quite common among the scientific community: climate scientists are in
much broader agreement about some parts of the debate than others. A survey of climate scientists
conducted in 2008* found that almost all (94 percent) were agreed that climate change is occurring
now, and 84 percent were persuaded that it was the result of human activity. But there was much less
agreement about the accuracy of climate computer models. The scientists held mixed views about the
ability of these models to predict global temperatures, and generally skeptical ones about their capacity
to model other potential effects of climate change. Just 19 percent, for instance, thought they did a
good job of modeling what sea-rise levels will look like fifty years hence.
* Dennis Bray and Hans von Storch, “CliSci2008: A Survey of the Perspectives of Climate Scientists
Concerning Climate Science and Climate Change,” Institute for Coastal Research, 2008.
Results like these ought to be challenging to anyone who takes a caricatured view of climate science.
They should cut against the notion that scientists are injudiciously applying models to make fantastical
predictions about the climate; instead, the scientists have as much doubt about the models as many of
their critics. [And these doubts are not just expressed anonymously; the scientists are exceptionally
careful, in the IPCC reports, to designate exactly which findings they have a great deal of confidence
about and which they see as more speculative.] However, cinematographic representations of climate
change, like Al Gore’s An Inconvenient Truth, have sometimes been less cautious, portraying a
polar bear clinging to life in the Arctic, or South Florida and Lower Manhattan flooding over.* Films like
these are not necessarily a good representation of scientific consensus. The issues that climate
scientists actively debate are much more banal: for instance, how do we develop computer code to
make a good representation of a cloud?
* Ronald Bailey, “An Inconvenient Truth: Gore as Climate Exaggerator,” Reason.com, June 16, 2006.
- pp. 388-389 (Beyond a Cookbook Approach to Forecasting): The criticisms that Armstrong and Green
make about climate forecasts derive from their empirical study of disciplines like economics in which
there are few such physical models available and the causal relationships are poorly understood. Overly
ambitious approaches toward forecasting have often failed in these fields, and so Armstrong and Green
infer that they will fail in climate forecasting as well. [Gavin Schmidt, “Green and Armstrong’s Scientific
Forecast,” RealClimate.org, July 20, 2007.]
The goal of any predictive model is to capture as much signal as possible and as little noise as
possible. Striking the right balance is not always so easy, and our ability to do so will be dictated by
the strength of the theory and the quality and quantity of the data. In economic forecasting, the data is
very poor and the theory is weak, hence Armstrong’s argument that “the more complex you make the
model the worse the forecast gets.”
In climate forecasting, the situation is more equivocal: the theory about the greenhouse effect is
strong, which supports more complicated models. However, temperature data is very noisy, which
argues against them. Which consideration wins out? We can address this question empirically, by
evaluating the success and failure of different predictive approaches in climate science. What matters
most, as always, is how well the predictions do in the real world.
I would urge caution against reducing the forecasting process to a series of bumper-sticker slogans.
Heuristics like Occam’s razor (“other things being equal, a simpler explanation is better than a more
complex one”*) sound sexy, but they are hard to apply. We have seen cases, as in the SIR models
used to forecast disease outbreaks, where the assumptions of a model are simple and elegant—but
where they are much too naive to provide for very skillful forecasts. We have also seen cases, as in
earthquake prediction, where unbelievably convoluted forecasting schemes that look great in the
software package fail miserably in practice.
An admonition like “The more complex you make the model the worse the forecast gets” is equivalent
to saying “Never add too much salt to the recipe.” How much complexity—how much salt—did you
begin with? If you want to get good at forecasting, you’ll need to immerse yourself in the craft and
trust your own taste buds.
* “Occam’s Razor;” Wikipedia.org. http://en.wikipedia.org/wiki/Occam’s_razor
- p. 391 (Uncertainty in Climate Forecasts): Central Park happens to have a particularly good
temperature record; it dates back to 1869* . . . I have plotted the monthly average temperature for
Central Park in the century encompassing 1912 through 2011 . . . . the temperature fluctuates
substantially (but predictably enough) from warm to cool and back again—a little more so in some years
than others. In comparison to the weather, the climate signal is barely noticeable. But it does exist:
temperatures have increased by perhaps 4 [degrees Fahrenheit] on average over the course of this
one-hundred-year period in Central Park.
* “Average Monthly & Annual Temperatures at Central Park,” Eastern Regional Headquarters, National
Weather Service. http://www.erh.noaa.gov/okx/climate/...nnualtemp.html
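To make the signal-versus-noise contrast concrete, here is a minimal sketch on synthetic data. None of the numbers are the Central Park record: the seasonal swing, the noise level, and the built-in trend of 4 degrees Fahrenheit per century are assumptions chosen only to mimic the description above, and the helper names are hypothetical.

```python
import math
import random


def annual_means(monthly):
    """Average each consecutive block of 12 monthly values."""
    return [sum(monthly[i:i + 12]) / 12 for i in range(0, len(monthly), 12)]


def ols_slope(ys):
    """Least-squares slope of ys against 0, 1, 2, ... (units of ys per step)."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


rng = random.Random(42)
monthly = []
for m in range(100 * 12):  # one hundred synthetic years
    seasonal = 22.0 * math.sin(2 * math.pi * (m % 12) / 12)  # assumed swing
    trend = 0.04 * (m / 12)  # assumed 4 degrees F per century
    monthly.append(55.0 + seasonal + trend + rng.gauss(0.0, 3.0))

seasonal_range = max(monthly) - min(monthly)
slope_per_century = 100 * ols_slope(annual_means(monthly))

print("range of monthly values:", round(seasonal_range, 1))
print("estimated trend per century:", round(slope_per_century, 2))
```

The month-to-month range dwarfs the century-scale trend, yet averaging by year and fitting a line still recovers a slope close to the one built into the synthetic series.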
- pp. 394-398 (A Note on the Temperature Record): A more recent entrant into the temperature
sweepstakes are observations from satellites. The most commonly used satellite records are from the
University of Alabama at Huntsville and from a private company called Remote Sensing Systems. The
satellites these records rely on do not take the temperature directly—instead, they infer it by
measuring microwave radiation. But the satellites’ estimates of temperatures in the lower
atmosphere provide a reasonably good proxy for surface temperatures.
The temperature records also differ in how far they track the climate backward; the oldest are the
observations from the UK’s Met Office, which date back to 1850; the satellite records are the youngest
and date from 1979. And the records are measured relative to different baselines—the NASA/GISS
record is taken relative to average temperatures from 1951 through 1980, for instance, while
NOAA’s temperatures are measured relative to the average throughout the twentieth century. But this
is easy to correct for, and the goal of each system is to measure how much temperatures are rising or
falling rather than what they are in any absolute sense.
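The baseline correction mentioned above amounts to subtracting each record's own average over a common reference period. The following is a sketch with made-up anomaly values; the function name rebaseline, the years, and the 0.30 offset are all hypothetical and are not taken from any of the actual temperature records.

```python
def rebaseline(anomalies, ref_start, ref_end):
    """Re-express anomalies relative to their mean over ref_start..ref_end."""
    reference = [v for year, v in anomalies.items()
                 if ref_start <= year <= ref_end]
    if not reference:
        raise ValueError("no data in the reference period")
    offset = sum(reference) / len(reference)
    return {year: v - offset for year, v in anomalies.items()}


# Two invented records of the same temperatures, quoted against baselines
# that differ by a constant 0.30 degrees.
record_a = {1979: 0.10, 1980: 0.20, 1981: 0.25, 1982: 0.05, 1983: 0.30}
record_b = {year: v + 0.30 for year, v in record_a.items()}

a = rebaseline(record_a, 1979, 1981)
b = rebaseline(record_b, 1979, 1981)

max_gap = max(abs(a[year] - b[year]) for year in a)
change_before = record_b[1983] - record_b[1979]
change_after = b[1983] - b[1979]

print("largest gap after re-baselining:", max_gap)
print("change 1979 to 1983, before and after:", change_before, change_after)
```

Once both records share a reference period they line up, and the change between any two years is the same before and after the shift, which is why the choice of baseline does not affect the measured trend.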
Reassuringly, the differences between the various records are fairly modest . . . All six show both 1998
and 2010 as having been among the three warmest years on record, and all six show a clear long-term
warming trend, especially since the 1950s when atmospheric CO2 concentrations began to increase at a
faster rate. For purposes of evaluating the climate forecasts, I’ve simply averaged the six temperature
records together.
James Hansen’s Predictions
One of the more forthright early efforts to forecast temperature rise came in 1981, when Hansen and
six other scientists published a paper in the esteemed journal Science.* These predictions, which
were based on relatively simple statistical estimates of the effects of CO2 and other atmospheric
gases rather than a fully fledged simulation model, have done quite well. In fact, they very slightly
underestimated the amount of global warming observed through 2011. [Geert Jan van Oldenborgh and
Rein Haarsma, “Evaluating a 1981 Temperature Projection,” RealClimate.org, April 2, 2012.]
* J. Hansen, et al. “Climate Impact of Increasing Atmospheric Carbon Dioxide,” Science, 213,
4511(August 28, 1981).
Hansen is better known, however, for his 1988 congressional testimony as well as a related 1988 paper
that he published in the Journal of Geophysical Research. This set of predictions did rely on a
three-dimensional physical model of the atmosphere. [J. Hansen, et al., “Global Climate Changes as
Forecast by Goddard Institute for Space Studies Three-Dimensional Model,” Journal of Geophysical
Research, 93, D8 (August 20, 1988), pp. 9341-9364. http://pubs.giss.nasa.gov/abs/ha02700w.html]
Hansen told Congress that Washington could expect to experience more frequent “hot
summers.” In his paper, he defined a hot summer as one in which average temperatures in Washington
were in the top one-third of the summers observed from 1950 through 1980. He said that by the 1990s,
Washington could expect to experience these summers 55 to 70 percent of the time, or roughly twice
their 33 percent baseline rate.
In fact, Hansen’s prediction proved to be highly prescient for Washington, DC. In the 1990s, six of the
ten summers qualified as hot . . . right in line with his prediction. About the same rate persisted in the
2000s and Washington experienced a record heat wave in 2012 . . . .
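The "hot summer" test described above can be sketched in a few lines. The temperatures here are randomly generated stand-ins, not Washington's record, and the function names are hypothetical; the only things taken from the text are the definition (top one-third of the 1950 through 1980 summers) and the six-of-ten count for the 1990s.

```python
import random
import statistics


def hot_summer_threshold(baseline_temps):
    """Cut point above which a summer falls in the top third of the baseline."""
    return statistics.quantiles(baseline_temps, n=3)[1]


def hot_fraction(temps, threshold):
    """Share of summers whose average temperature exceeds the threshold."""
    return sum(t > threshold for t in temps) / len(temps)


rng = random.Random(1)
# 31 invented summer averages standing in for 1950 through 1980.
baseline = [rng.gauss(24.0, 0.8) for _ in range(31)]
threshold = hot_summer_threshold(baseline)
baseline_rate = hot_fraction(baseline, threshold)

# An invented decade in which six of ten summers exceed the threshold.
decade = [threshold + 0.5] * 6 + [threshold - 0.5] * 4
decade_rate = hot_fraction(decade, threshold)

print("baseline share of hot summers:", round(baseline_rate, 3))
print("decade share of hot summers:", decade_rate)
```

By construction roughly a third of baseline summers count as hot, so a decade with six hot summers out of ten sits inside the 55 to 70 percent range quoted above.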
The IPCC’s 1990 Predictions
The IPCC’s 1990 forecasts represented the first true effort at international consensus predictions in the
field and therefore received an especially large amount of attention. These predictions were less
specific than Hansen’s, although when they did go into detail they tended to get things mostly
right. For instance, they predicted that land surfaces would warm more quickly than water surfaces,
especially in the winter, and that there would be an especially substantial increase in temperature in
the Arctic and other northerly latitudes. Both of these predictions have turned out to be correct.
The headline forecast, however, was that of the global temperature rise . . . .
The IPCC forecasts were predicated on a “business-as-usual” case that assumed that there would be no
success at all in mitigating carbon emissions. This scenario implied that the amount of atmospheric CO2
would increase to about four hundred parts per million (ppm) by 2010. In fact, some limited efforts to
reduce carbon emissions were made, especially in the European Union, and this projection was
somewhat too pessimistic; CO2 levels had risen to about 390 ppm as of 2010. In other words, the
error in the forecast in part reflected scenario uncertainty—which turns more on political and
economic questions than on scientific ones—and the IPCC’s deliberately pessimistic assumptions
about carbon mitigation efforts. [If you scale back their warming estimates to reflect the
smaller-than-assumed rate of CO2 increase, you wind up with a revised projection of 1.4 [degrees
Celsius] to 3.6 [degrees Celsius] in warming per century. The actual rate of increase, a pace of 1.5
[degrees Celsius] per century since the report was published, falls within this range, albeit barely.]
Nevertheless, the IPCC later acknowledged their predictions had been too aggressive. When they
issued their next forecast, in 1995, the range attached to their business-as-usual case had been
revised considerably lower: warming at a rate of about 1.8 [degrees Celsius] per century. This
version of the forecasts has done quite well relative to the actual temperature trend.* Still, that
represents a fairly dramatic shift. It is right to correct a forecast when you think it might be wrong
rather than persist in a quixotic fight to the death for it. But this is evidence of the uncertainties
inherent in predicting the climate.
* Pielke, Jr., “Verification of IPCC Temperature Forecasts 1990, 1995, 2001, and 2007.”
The score you assign to these early forecasting efforts overall might depend on whether you are
grading on a curve. The IPCC’s forecast miss in 1990 is partly explained by scenario uncertainty. But
this defense would be more persuasive if the IPCC had not substantially changed its forecast just
five years later. On the other hand, their 1995 temperature forecasts have gotten things about right,
and the relatively few specific predictions they made beyond global temperature rise (such as ice
shrinkage in the Arctic*) have done quite well. If you hold forecasters to a high standard, the IPCC
might deserve a low but not failing grade. If instead you have come to understand that the history of
prediction is fraught with failure, they look more decent by comparison.
* Julienne Stroeve, Marika M. Holland, Walt Meier, Ted Scambos, and Mark Serreze, “Arctic Sea Ice
Decline: Faster Than Forecast,” Geophysical Research Letters, 34, 2007.
Uncertainty in forecasts is not necessarily a reason not to act—the Yale economist William Nordhaus
has argued instead that it is precisely the uncertainty in climate forecasts that compels action,* since
the high-warming scenarios could be quite bad. Meanwhile, our government spends hundreds of
billions toward economic stimulus programs, or initiates wars in the Middle East, under the pretense of
what are probably far more speculative forecasts than are pertinent in climate science. [Richard B.
Rood, Maria Carmen Lemos, and Donald E. Anderson, “Climate Projections: From Useful to Usability,”
University of Michigan, December 15, 2010.]
* William Nordhaus, “The Challenge of Global Warming: Economic Models and Environmental Policy,”