Saturday, January 15, 2011

How to Lie with Statistics

I recently read a paper in Contraception, a journal published by Elsevier. The paper, "Trends in the use of contraceptive methods and voluntary interruption of pregnancy in the Spanish population during 1997–2007" by Duenas et al., Contraception 83 (2011) 82–87, is being cited by opponents of the Reproductive Health Bill. It notes that from 1997 to 2007 the Spanish population's abortion rate (coyly termed early voluntary interruption of pregnancy) increased at the same time that the rate of contraceptive use increased. Anti-R.H. Bill writers have used this paper to argue that encouraging contraceptive use is the same as encouraging abortions, which is nonsense, as I see it.

There are a number of reasons why I disagree (after all, although I think that people should either use contraception or go cold turkey, that does not necessarily mean that I would make abortion freely available, so there is a counter-example!), but what makes this a case of lying with statistics is citing the numbers without even mentioning the changing demographics of the Spanish population. The paper itself, if read, offers a more nuanced interpretation, precisely because the numbers as presented do not make a strong case for the premise that more contraception means more abortion.

Who is most likely to seek an abortion? A good candidate would be a poor, young, uneducated, unwed pregnant woman who does not wish to take on the burden of caring for a child. If we want to know whether encouraging contraceptive use within this segment causes increased use of abortion, we should focus on data from this segment only, and not on the population as a whole.

This group, by the way, is probably among the part of the population that is least likely to use contraception. If this group has grown as a share of the population at the same time that more of the rest of the population is adopting effective birth control methods, then we should expect to see an increase in both birth control use and abortion.

Demographic data from Spain do support an increase in that part of the population. Immigrants, for example (likely to be poor and uneducated), increased from a mere 2 percent of the population at the start of the study to 10 percent at the end. The authors of the study probably did not expect such large demographic changes; worse, they do not have the numbers of poor, unwed, pregnant women (the portion at risk). I'm still reading the paper to see if I can tease these numbers out of the data. For now, though, the cautious conclusion holds: the data are insufficient to support the premise that contraceptive use increases abortion use.
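To make the demographic argument concrete, here is a rough sketch in Python. Only the 2 percent and 10 percent population shares come from the immigration figures above; treating immigrants as a stand-in for the at-risk group is my simplification, and every contraceptive-use and abortion rate in the code is a made-up number chosen for illustration, not a figure from the paper. The point is only that the overall abortion rate can rise while no group's own rate rises, simply because the mix of the population changed.

```python
def overall(groups):
    """Population-weighted averages over groups.

    Each group is a tuple: (population share, fraction using
    contraception, abortions per 1000 women). All rates are hypothetical.
    """
    use = sum(share * c for share, c, _ in groups)
    rate = sum(share * a for share, _, a in groups)
    return use, rate

# Start of the period: the at-risk group is 2% of the population.
start = [
    (0.98, 0.50, 5.0),   # rest of the population (hypothetical rates)
    (0.02, 0.20, 40.0),  # at-risk group (hypothetical rates)
]

# End of the period: the at-risk group is 10% of the population.
# Contraceptive use rises in the rest of the population and its abortion
# rate falls; the at-risk group's behavior is left unchanged.
end = [
    (0.90, 0.80, 4.0),
    (0.10, 0.20, 40.0),
]

use_start, rate_start = overall(start)
use_end, rate_end = overall(end)

print(f"contraceptive use: {use_start:.1%} -> {use_end:.1%}")
print(f"abortions per 1000: {rate_start:.1f} -> {rate_end:.1f}")
```

With these invented inputs, both population-wide figures go up even though the abortion rate fell in one group and stayed flat in the other. That is why the aggregate numbers alone cannot tell us what contraception does within the group at risk.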

An analogy: cheating on an exam versus studying for the exam. Suppose you have a group of 1000 students, with 10 of them deciding to cheat, 490 of them using ineffective study methods, and the rest (500) using effective study methods. For simplicity, let us assume that these groups are mutually exclusive. We will now have the following percentages: 50% for effective study methods, 49% using ineffective study methods, and 1 percent cheating.

Suppose that next year, you get another sample of 1000 students. Of this sample, 60 decide to cheat, 340 use ineffective study methods, the rest (600) have adopted effective study methods. Then we will obtain the following percentages: 60 percent use effective study methods, 34 percent use ineffective study methods, while 6 percent cheat. Again assume that these segments are mutually exclusive.

We now have an amusing result: a 10 percentage point increase in people who study effectively and, at the same time, a 5 percentage point rise in the incidence of cheating. Ergo, encouraging the use of effective study methods increases the incidence of cheating. (There are other ways of presenting the numbers to make a more dramatic point. We could say instead that the fraction of people who cheat has increased sixfold, at the same time that more people use effective study methods.)
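For anyone who wants to check the arithmetic of the analogy, here is a small Python sketch. The counts are the ones from the two samples of 1000 students; the function and variable names are my own.

```python
def shares(counts):
    """Convert raw counts into percentages of the whole sample."""
    total = sum(counts.values())
    return {group: 100 * n / total for group, n in counts.items()}

year1 = {"effective": 500, "ineffective": 490, "cheat": 10}
year2 = {"effective": 600, "ineffective": 340, "cheat": 60}

s1 = shares(year1)
s2 = shares(year2)

# Change in each group's share, in percentage points.
change_points = {group: s2[group] - s1[group] for group in s1}

# The "dramatic" way of reporting the same change in cheating.
cheat_ratio = year2["cheat"] / year1["cheat"]

for group, delta in change_points.items():
    print(f"{group}: {s1[group]:.0f}% -> {s2[group]:.0f}% ({delta:+.0f} points)")
print(f"cheating multiplied by {cheat_ratio:.0f}")
```

The same two samples yield "up 5 points" or "six times as much cheating", depending on which number the writer prefers to quote. Neither says anything about whether studying causes cheating.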

I've recently read Darrell Huff's How to Lie with Statistics, and I was greatly entertained. It's a good book for people who want to learn how to detect statistical shenanigans; J. Michael Steele, a Wharton professor, has written a review of the book, so I will not talk much about it. I recommend it because it contains lots of tales of how statistics can be misused, especially when there is actually less than meets the eye.
