Sunday, January 22, 2012

Searching for Competent Science Journalism

Jonah Lehrer, the guy who wrote that New Yorker article about the decline effect, wrote another December bombshell -- this time for Wired.

The first article attacked "the scientific method," concluding that what science proves might not be true after all and that experimentation is insufficient for settling beliefs about the world. The essay was widely discussed and criticized: see here, here, here, and here. I won't try to summarize the criticisms here, but they are worth reading.

The more recent essay narrows the focus to causal inferences. Lehrer argues that causes cannot be discovered using one-at-a-time inference techniques that treat the world as a mechanism reducible to its various parts. As an example, he uses the case of torcetrapib, a drug designed to improve cardiovascular health by increasing HDL cholesterol while decreasing LDL cholesterol. Lehrer points out that although researchers knew a lot about how the cholesterol mechanism works, torcetrapib turned out in human trials to cause heart attacks -- the opposite of what the manufacturers intended the drug to do. While torcetrapib did what it was supposed to do with respect to cholesterol, it also increased blood pressure. The total effect was to increase heart attack risk. And so, torcetrapib was not approved by the FDA.

The case is interesting, but the conclusion Lehrer draws is almost completely wrong. He writes:
"The story of torcetrapib is a tale of mistaken causation. Pfizer was operating on the assumption that raising levels of HDL cholesterol and lowering LDL would lead to a predictable outcome: Improved cardiovascular health. Less arterial plaque. Cleaner pipes. But that didn't happen."
The only mistaken causation occurred before the human-subject trials. At that point, the researchers conjectured (reasonably) that the drug would reduce the risk of heart attack. They didn't realize that the drug increased blood pressure. But when they actually conducted the human experiment, they learned some causal facts: torcetrapib does not decrease the risk of heart attack, torcetrapib increases blood pressure, the effect of blood pressure on heart-attack risk is greater than the effect of cholesterol, etc.

At best what Lehrer is illustrating is the law of unintended consequences. After all, the researchers didn't want or expect any of their subjects to die as a result of taking the drug.

But maybe Lehrer's mischaracterization of the result is not especially surprising. He doesn't do a very good job even with basic statistical tools, like p-values.
"Researchers have developed an impressive system for testing these correlations. For the most part, they rely on an abstract measure known as statistical significance, invented by English mathematician Ronald Fisher in the 1920s. This test defines a 'significant' result as any data point that would be produced by chance less than 5 percent of the time."
And he doesn't even try to describe any more complicated statistical techniques for causal inference: no mention of Bayes' theorem, no causal Bayes nets, no causal Markov axiom, no potential outcomes, and no structural equation models. He doesn't even try to describe the logic behind causal inference by experimentation -- how or why isolation and intervention let us make causal inferences.

The fact that Lehrer says nothing about developments in causal inference is really unfortunate, since Lehrer defended his first essay in part by arguing that we should ask whether science can work better than it currently does. I think such a question is admirable, but then, why didn't Lehrer bother to look around and see whether anyone is busy developing a better way of thinking about causal inference? Opportunity wasted.

Saturday, January 7, 2012

Newcomb's Problem

Newcomb's problem is a famous puzzle in decision theory. The basic set-up is that a super-intelligence of some sort (a super-neuroscientist, a psychic, God, a super-computer, Omega, whatever) offers to play a game with you. The intelligence puts a box and a thousand dollars on a table. Either the box contains a million dollars or it is empty (or in special circumstances, it contains a high-yield explosive). The intelligence tells you that you may have either the contents of the box -- the one-box choice -- or you may have both the contents of the box and the thousand dollars -- the two-box choice. But there is a catch. The intelligence tells you that if it predicts you will take both the box and the thousand dollars, then it has put nothing in the box. If, however, the intelligence predicts that you will take only the contents of the box, then it filled the box with a million dollars. (Any fancy choosing -- like flipping a coin or calling in a friend -- is punished: if the intelligence predicted that you would do something fancy, then it filled the box with a high-yield explosive.) Suppose for the purposes of the thought experiment that you have excellent evidence that the intelligence is a very reliable predictor: a range of values work here, but to be concrete, suppose the intelligence predicts correctly nine times out of ten. The question is whether you should make the one-box choice or the two-box choice.

Decision theorists line up on both sides. Evidential decision theorists point out that conditioning on the two possible acts -- the one-box choice or the two-box choice -- the act that maximizes expected utility is the one-box choice. Hence, one should take just the contents of the box. Causal decision theorists point out that the content of the box at the time of the choice does not causally depend on the choice being made. Whatever the intelligence put in the box is there. Choosing to take or not to take the contents of the box won't change whatever's in it. Moreover, whatever happens to be in the box, taking both the box and the thousand dollars dominates taking just the contents of the box.
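To make the two arguments concrete, here is a minimal sketch of the arithmetic. It assumes utility is linear in dollars, ignores the explosive case, and uses the nine-in-ten predictor from the set-up. The function names are mine, chosen for illustration.

```python
from fractions import Fraction

MILLION = 1_000_000
THOUSAND = 1_000
ACCURACY = Fraction(9, 10)  # the "nine times out of ten" predictor from the set-up


def payoff(act, box_full):
    """Dollars received for an act ('one-box' or 'two-box') given the state of the box."""
    box = MILLION if box_full else 0
    return box if act == "one-box" else box + THOUSAND


def evidential_eu(act, accuracy=ACCURACY):
    """Expected payoff conditional on the act (the evidential calculation).

    Assumption: the prediction matches the act with probability `accuracy`,
    so the box is full with that probability exactly when the act is 'one-box'.
    """
    p_full = accuracy if act == "one-box" else 1 - accuracy
    return p_full * payoff(act, True) + (1 - p_full) * payoff(act, False)


def two_box_dominates():
    """The causal argument: hold the box's contents fixed and compare the acts."""
    return all(payoff("two-box", full) > payoff("one-box", full)
               for full in (True, False))


if __name__ == "__main__":
    for act in ("one-box", "two-box"):
        print(act, evidential_eu(act))
    print("two-boxing dominates:", two_box_dominates())
```

Under these assumptions the conditional expectations are $900,000 for the one-box choice and $101,000 for the two-box choice, while the two-box choice is better by exactly $1,000 in each possible state of the box. Both camps have their arithmetic right; they disagree about which calculation should guide the choice.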

A large and growing literature has developed around Newcomb's problem. I will not try to cover even the most basic issues here. (For example, I'm not going to try to address pre-commitment.) What I want to do is ask a simple question: Why does Newcomb's problem need the box at all?

As far as I can tell, the problem is left completely unchanged by putting it as follows. A super-intelligence wants to play a game with you. The intelligence scans you and predicts whether you will take exactly one million dollars from the table. If the intelligence thinks that you will take exactly one million dollars, then it places $1,001,000 on the table. Otherwise, it places $1,000 on the table. Question: As you stand in front of the table, do you take all of the money in front of you or not?
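One way to check the claim is to compare the payoff tables of the two versions. The sketch below rests on an assumption that the reformulation does not spell out: "not taking all of the money" means leaving exactly $1,000 on the table, which nets $0 when only $1,000 was put out. The function names are hypothetical.

```python
MILLION = 1_000_000
THOUSAND = 1_000


def box_payoff(greedy, predicted_restraint):
    """Original version: greedy means the two-box choice."""
    box = MILLION if predicted_restraint else 0
    return box + (THOUSAND if greedy else 0)


def table_payoff(greedy, predicted_restraint):
    """Box-free version: greedy means taking everything on the table.

    Assumption: the non-greedy act leaves exactly $1,000 behind.
    """
    on_table = MILLION + THOUSAND if predicted_restraint else THOUSAND
    return on_table if greedy else on_table - THOUSAND


# The two versions agree in every combination of act and prediction.
same_payoffs = all(
    box_payoff(act, pred) == table_payoff(act, pred)
    for act in (True, False)
    for pred in (True, False)
)
print("identical payoff tables:", same_payoffs)
```

On that reading the two games have the same payoff in all four cells, so whatever work the box does in the original, it is not done by the payoffs.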

Sunday, January 1, 2012

Does Anything Exist That Cannot Be Measured?

The Cardinals recently lost Albert Pujols to the Angels. Consequently, in Cardinal Nation, there has been much bitter wailing and even some jersey-burning in effigy.

Now, I've looked at the numbers a little, and I think the Cardinals were wise not to meet Pujols' contract demands. The nobler side of me hopes that I am wrong and that Pujols' best seasons are coming up. But the slightly more realistic and vindictive side of me thinks Pujols is likely to blow out his elbow soon and in any event, he is already on the wrong side of his peak. At least, that is what I see in the graph below:

The red line is a simple linear regression of WAR (wins above replacement) on years in the majors. The blue line is a regression of WAR on years and years squared. The green line is a three-year moving average. The dots use those various trend lines to predict what Pujols will do next season (his twelfth year in the major leagues).
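For anyone who wants to reproduce this kind of graph, here is a minimal sketch of the three forecasts in plain Python. The WAR values in it are made-up placeholders, not Pujols' actual numbers, and I am assuming that the moving-average forecast is just the mean of the last three seasons. The helper names are mine.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def polyfit(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ...] of c0 + c1*x + ..., via the normal equations."""
    powers = range(degree + 1)
    a = [[sum(x ** (i + j) for x in xs) for j in powers] for i in powers]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in powers]
    return solve(a, b)


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def moving_average_forecast(ys, window=3):
    """Forecast the next value as the mean of the last `window` values."""
    return sum(ys[-window:]) / window


if __name__ == "__main__":
    # Placeholder numbers for illustration only; NOT Pujols' actual WAR figures.
    years = list(range(1, 12))
    war = [6.0, 5.5, 8.0, 8.5, 8.0, 8.5, 8.5, 9.0, 9.5, 7.5, 5.5]
    next_year = 12
    print("linear:", predict(polyfit(years, war, 1), next_year))
    print("quadratic:", predict(polyfit(years, war, 2), next_year))
    print("3-year moving average:", moving_average_forecast(war))
```

The quadratic fit is the one that can bend downward after a peak, which is why it matters for the question of whether a player is past his prime.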

Still, it's disappointing to lose Pujols. As one writer has it, Pujols leaving St. Louis is the death of romanticism in baseball. (Yes, that's overblown, but I still agree with the sentiment, overall.) Also, from a pure business perspective, there was surely reason to keep Albert in Cardinal red. Pujols puts fans in the seats and keeps eyes on the television. Pujols has been the face of the franchise.

Long set-up, I know, but stay with me; I am almost at my point. During the holiday break, one of Kerrith's cousins said to me that Pujols was especially valuable for all of the intangibles: the effect of his presence in the clubhouse, his effect on attendance at ball games, his influence in the community, etc. My typical response to this sort of line is that nothing exists that cannot be measured. The "intangibles" are only intangible because we have not yet been clever enough to find good ways of measuring them, not because they cannot be measured in principle.

I am pretty well convinced of the thesis that nothing exists that cannot be measured, but I admit that it has a number of difficulties. Some people will point to abstract notions, like love or justice, in arguing that some things exist that cannot be measured. Kerrith jibed that love could be measured by dollars spent. She meant the suggestion as a reductio ad absurdum of the idea that everything that exists can be measured, but I think dollars spent is a way of measuring love, just not a very good one. That is, keeping track of dollars spent tells us something about the spender's loves, even if it isn't especially accurate or precise.

But I actually think that more serious objections to the thesis come from singular cases, like that of Pujols' effect on his teammates. Just take last year's season as an example. Yadier Molina produced 4.1 WAR last season, and Matt Holliday produced 5.0 WAR last season. How much of that WAR was due to the influence of Albert Pujols in the clubhouse? Did Pujols help them with their approaches at the plate? Did they have conversations that helped them through slumps? Did Pujols hurt their performance?

We cannot directly perform any experiments to answer these questions. We cannot re-run the season with and without Pujols in the clubhouse. The best we can do, I think, is find proxies and analogues to help us build models of the Cardinals with and without Pujols. That is a very serious limitation, but as with love, I want to say that the result is an imperfect measurement of something that exists.

Monday, December 26, 2011

Two Kinds of Time-Travel Paradox

Yesterday was the Doctor Who Christmas Special. Not very good this year, in my not-so-humble opinion. But it is an excuse for thinking about time travel. And, actually, I have two excuses this Christmas. For me, the holidays always involve a lot of driving. Driving often leads to interesting conversations with my wife, and this year was no exception. I don't recall the set-up, but somehow we ended up talking about time travel and causal loops. Coming out of that conversation, I want to notice two kinds of time-travel paradox.

Causal Loops

One kind of time-travel paradox is the causal loop. I've talked about this sort of thing before: here, here, here, and here. The basic idea of a causal loop is that some event c is part of a directed cycle, say c → d1 → d2 → ... → c, such that c is a causal ancestor of itself.
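In graph terms, the definition amounts to asking whether an event can be reached from itself by following causal arrows. Here is a minimal sketch, assuming the causal structure is given as a dictionary from each event to the events it directly causes. The representation and the function name are my own choices.

```python
def is_own_ancestor(graph, start):
    """Return True if `start` can be reached from itself along one or more edges."""
    stack = list(graph.get(start, ()))
    seen = set()
    while stack:
        node = stack.pop()
        if node == start:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return False


# A causal loop: c causes d1, d1 causes d2, and d2 causes c.
loop = {"c": ["d1"], "d1": ["d2"], "d2": ["c"]}
# An ordinary causal chain with no way back to c.
chain = {"c": ["d1"], "d1": ["d2"], "d2": []}

print(is_own_ancestor(loop, "c"))
print(is_own_ancestor(chain, "c"))
```

In the loop every event on the cycle is its own causal ancestor, not just c; in the chain none is.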

Here is an example. Harry has always heard a story about how a mysterious stranger stole his mother's necklace on the subway and consequently, she met Harry's father and fell in love. Much later in life, Harry builds a time machine. He decides he wants to find the mysterious stranger, so he travels to the fateful date and rides the subway looking for his mother. He ends up on a subway car with his mother. But no one else is around.

"Where is the mysterious stranger?" Harry thinks.

He waits, but no one comes. Eventually, he worries that if someone doesn't steal that necklace soon, he might never be born. So, he does the stealing himself. His mother chases him trying to retrieve her stolen property, but Harry uses his time machine to vanish after running around a corner. Only after returning home and calming down does Harry realize that he was the mysterious stranger all along.

Since it is Christmas time, we might imagine that Harry wraps the necklace up as a gift for his mother. (He's not really a thief, after all.)

The story is perfectly consistent. Just how or why the universe settles on this consistent story may not be clear, but then, I'm not sure why the universe requires photons to travel along null geodesics. One might worry, along the lines of a bilking argument, about what would have happened had Harry decided not to steal the necklace. If Harry had not stolen the necklace, then Harry's mother would never have met his father, and so on. Yes, but Harry did steal the necklace. The central point is that such stories are consistent, and consistency is all that matters.

Ontic Loops

Another class of time-travel paradox is what I will call ontic loops. An ontic loop occurs when there is a self-contained chain of objects o1 = o2 = ... = on = o1, where the objects are identical. Under this characterization, ontic loops are no more mysterious or inconsistent than causal loops. However, an additional continuity constraint leads to serious problems.

Rather than try to say what the constraint looks like abstractly, let me illustrate with a problematic object in the atrocious film Somewhere in Time.

In the film, Christopher Reeve's character receives a pocket watch from an elderly woman who tells him to find her again. He then travels to the past using what can only be described as the silliest time-travel theory ever. While in the past, he gives the pocket watch to a young version of the elderly woman. And there you have it: a watch from nowhere.

What's problematic about the watch from nowhere? Pick a time during which the watch exists, say March 4, 1973. Presumably, the watch was a little older -- a little more worn out -- than it was in 1972. But then, repeat that observation and trace back far enough, and you discover that the watch on March 4, 1973 is older and more worn out than it was on March 4, 1973! The watch ends up having incompatible properties at the same time.
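The argument can be written out in a line. The notation is mine: let p_0, p_1, ..., p_n be successive points along the watch's own history, with p_n = p_0 because the history closes into a loop, and let w(p) be how worn the watch is at p. Assume wear strictly increases from each point to the next.

```latex
w(p_0) < w(p_1) < \cdots < w(p_n)
\quad\text{and}\quad p_n = p_0
\;\Longrightarrow\; w(p_0) < w(p_0)
```

Nothing is strictly less than itself, so at least one assumption has to go: either the history does not close up, or wear does not strictly increase at every step.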

If we are allowed to introduce discontinuities, then we can get out of the paradox. For example, we might suppose that at some time(s), the watch is very thoroughly repaired -- with all of its parts replaced. In that case, the watch never has inconsistent properties at the same time. But if even so much as a single gear is never replaced, then we have to imagine a different kind of discontinuity: the gear (or watch) becomes older and more worn over time until at some time, all at once, the gear (or watch) is suddenly quite a bit younger, quite a bit less worn out.

To take another example, imagine a glass of water and ice cubes on a closed timelike curve. Suppose that the curve is long enough to allow all of the ice to completely melt into water. As you go around the curve, the ice melts almost surely (in the technical sense). But at some point, there is a discontinuity, where going forward in time, the glass of water transforms into a glass filled with ice.

Alternatively, we could try to preserve continuity by imagining the watch becoming older for a while and then becoming younger for a while, or by imagining the ice melting for a while and then freezing for a while. But to do that requires that for half of its travels through time, the problematic object is violating laws of thermodynamics. And in this house, we obey the laws of thermodynamics!