Statistics may be defined as “a body of methods for making wise decisions in the face of uncertainty.”
Statistics are crucial to the skeptical position on so many topics, even though it can be frustrating to explain why to people. They provide us with a very powerful toolbox to separate what we wish to see or what “common sense” tells us, from what is actually true. But reality is ugly and messy, and anecdotes are simpler to understand. On Skeptic North alone, we’ve mentioned statistics in relation to hockey, H1N1, blood donation, public transit, and homeopathy, to name a few.
But even when we understand why our data are better than someone else’s, it can be hard to explain why. So, here’s a refresher on some basic statistical methods.
Part 1 – The Cookie War (or What’s a Confidence Interval?)
Say you’re an aspiring skeptic in small-town Alberta. The long-standing feud between the two bakers in town has erupted again, this time over who has the most chocolate chips in his cookies. You decide that someone should put an end to the matter.
The first thing you need is a way to measure the chocolate chip density (CCD) per cookie. You could just count the chips, but what if one baker makes larger cookies? For this example, let’s say that the chosen method is number of chocolate chips per 50g cookie. And, to avoid the claim that one of the bakers just had a bad day, let’s say that you collect two cookies per day from each baker for a month.
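If the cookies you collect don’t all weigh exactly 50g, the counts can be scaled to a common weight. Here’s a minimal Python sketch of that idea; the function name and the example cookies are made up for illustration and aren’t part of the study.

```python
def chip_density(chip_count: int, cookie_weight_g: float,
                 reference_weight_g: float = 50.0) -> float:
    """Scale a raw chip count to chips per reference-weight cookie."""
    if cookie_weight_g <= 0:
        raise ValueError("cookie weight must be positive")
    return chip_count * reference_weight_g / cookie_weight_g


# Two hypothetical cookies: a big one with more chips and a small one
# with fewer. Per 50 g, they turn out to be equally chocolaty.
big_cookie = chip_density(9, 60.0)
small_cookie = chip_density(6, 40.0)
print(big_cookie, small_cookie)
```

Counting chips alone would have favoured the baker with the bigger cookie, which is exactly the objection the per-50g measure is meant to avoid.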
Many types of data in life, especially when dealing with large populations, can be assumed to follow a bell curve. The height of everyone in the population is an example of this. There’s the average height, a range on either side that we would still consider “normal,” and there are fewer very tall or very short people the further you get from the average. If you count the number of people at each height and graph it, you get a bell-shaped curve.
This is probably a reasonable approximation for cookies. There will be an average CCD, but some will have more chocolate chips and some will have fewer. Very rarely, you’ll see a cookie that has no chocolate chips, or is almost all chocolate chips.
The average is pretty simple to explain to people, and it’s useful. For this study*, Baker A has an average CCD of 7.35 and Baker B has an average CCD of 7.48. If you were to hear about this study on the radio, you’d likely hear that Baker B has more chocolaty cookies. But that isn’t the whole story, and may not be true.
The Standard Deviation and Confidence Interval
In newspapers, you’ll usually find a statement about a poll that says, “This poll is accurate to within 3%, 19 times out of 20.” This is very important from a skeptical perspective. In a bell curve, there’s a value called the standard deviation. It measures how spread out the numbers are around the average: the more the numbers vary, the larger it is. For something like a poll, the quoted accuracy also depends on how many people were asked.
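For anyone who wants to see these two numbers calculated, here’s a short Python sketch using the standard library’s `statistics` module. The chip counts are invented for illustration; they are not the data behind this study.

```python
import statistics

# Hypothetical chip counts per 50 g cookie; NOT the study's data.
sample_ccd = [6, 7, 7, 8, 9, 7, 8, 6]

average = statistics.mean(sample_ccd)

# stdev() is the sample standard deviation (it divides by n - 1),
# which suits a sample of cookies rather than every cookie ever baked.
spread = statistics.stdev(sample_ccd)

print(f"average CCD: {average:.2f}")
print(f"standard deviation: {spread:.2f}")
```

A batch of cookies with wildly varying chip counts would give the same kind of average but a much larger standard deviation.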
Obviously, we haven’t sampled the entire population (i.e., every cookie made by each baker). If we had, we wouldn’t need statistics to figure out who has the most chocolate chips. However, if you take the average and 2 standard deviations on either side of it, you have a 95% confidence interval. This means that if we repeat the study again and again, about 95% of the studies will have confidence intervals that contain whatever the real average is.
To make life easier, I’ll just tell you that the standard deviation for Baker A is 1.05 and for Baker B it’s 1.24. We can say that the 95% confidence interval for Baker A’s cookies is between 5.25 and 9.46, and Baker B’s is between 4.99 and 9.97. Another, more common way to state it would be, “Baker A’s cookies have a chocolate chip density of 7.35. This is considered accurate plus or minus 2.1, 19 times out of 20.”
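The arithmetic behind those ranges is just the average plus and minus 2 standard deviations. A minimal Python sketch, using the rounded figures quoted in this article (the function name is mine, for illustration):

```python
def two_sd_interval(mean: float, sd: float) -> tuple[float, float]:
    """Return (low, high): the mean minus and plus 2 standard deviations."""
    return (mean - 2 * sd, mean + 2 * sd)


# Rounded averages and standard deviations as quoted in the article.
bakers = {"Baker A": (7.35, 1.05), "Baker B": (7.48, 1.24)}

for name, (mean, sd) in bakers.items():
    low, high = two_sd_interval(mean, sd)
    print(f"{name}: {low:.2f} to {high:.2f}")
```

Run on the rounded figures, this lands within 0.01 of the ranges quoted above; the small difference in the last digit presumably comes from the quoted ranges being worked out from unrounded values.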
This “19 times out of 20” is important. Focusing on just Baker A for a moment, if you repeat this experiment a total of 20 times, you would expect about 19 of those results to have confidence intervals that contain the real average. The experiment that was the odd man out would likely be close, but its interval would miss. The experiment we just did could very well be the one that’s wrong. If you took those 20 sets of results, you could perform a meta-analysis and come up with an even smaller confidence interval than any individual experiment could.
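The “19 times out of 20” figure traces back to a property of the bell curve: roughly 95% of values land within 2 standard deviations of the average. Here’s a minimal Python sketch that checks this by simulating cookies. It assumes CCD follows a perfect bell curve with Baker A’s average and standard deviation; the seed and the number of simulated cookies are arbitrary choices for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

MEAN_CCD = 7.35      # Baker A's average, as quoted in the article
SD_CCD = 1.05        # Baker A's standard deviation, as quoted
N_COOKIES = 100_000  # assumed number of simulated cookies

low = MEAN_CCD - 2 * SD_CCD
high = MEAN_CCD + 2 * SD_CCD

# Draw simulated cookies from a bell curve and count how many
# fall within 2 standard deviations of the average.
cookies = [rng.gauss(MEAN_CCD, SD_CCD) for _ in range(N_COOKIES)]
inside = sum(low <= ccd <= high for ccd in cookies)
fraction_inside = inside / N_COOKIES

print(f"{fraction_inside:.1%} of simulated cookies fall "
      f"between {low:.2f} and {high:.2f}")
```

The fraction should come out close to 95%, or about 19 cookies in every 20.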
From this cookie experiment, Baker A and Baker B are tied. Baker B’s confidence interval overlaps with all of Baker A’s confidence interval, and so you cannot (honestly) say that one has more chocolate chips than the other. However, we can say that if you want more consistent cookies, Baker A is the one to go to.
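The “tie” call is a simple comparison of the two ranges. A small Python sketch using the intervals quoted above (the helper names are hypothetical):

```python
Interval = tuple[float, float]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two (low, high) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


def contains(outer: Interval, inner: Interval) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


# 95% intervals as quoted in the article.
baker_a = (5.25, 9.46)
baker_b = (4.99, 9.97)

tied = overlaps(baker_a, baker_b)
a_inside_b = contains(baker_b, baker_a)
width_a = baker_a[1] - baker_a[0]
width_b = baker_b[1] - baker_b[0]

print(f"tied: {tied}, A entirely inside B: {a_inside_b}")
print(f"A is the more consistent baker: {width_a < width_b}")
```

Baker A’s narrower range is what backs up the “more consistent cookies” remark.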
Of course, this doesn’t settle the feud, so you’ll want to repeat the experiment, because all good science should be repeatable.
The Take Home Message
Think about this the next time that you see a poll for the popularity of political parties. If two of the parties have confidence intervals that overlap, the poll is reporting a tie. If you see a study for some form of new medical treatment that was tested against a placebo, and the confidence intervals for the placebo and the medicine overlap, then this particular study has not shown that the medicine has any effect.
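To try this on a poll, here’s a rough Python sketch using the standard textbook margin-of-error formula for a proportion. It assumes a simple random sample, and the party shares and sample size are invented for illustration; real pollsters may weight or adjust their numbers differently.

```python
import math


def margin_of_error(share: float, sample_size: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a polled share (0 to 1)."""
    return z * math.sqrt(share * (1 - share) / sample_size)


def poll_interval(share: float, sample_size: int) -> tuple[float, float]:
    moe = margin_of_error(share, sample_size)
    return (share - moe, share + moe)


SAMPLE_SIZE = 1067  # hypothetical poll size

# A party polling at 50% in a poll this size gets roughly the
# familiar "within 3%, 19 times out of 20".
worst_case = margin_of_error(0.5, SAMPLE_SIZE)

# Two hypothetical parties polling at 34% and 31%.
party_x = poll_interval(0.34, SAMPLE_SIZE)
party_y = poll_interval(0.31, SAMPLE_SIZE)
is_tie = party_x[0] <= party_y[1] and party_y[0] <= party_x[1]

print(f"margin of error at 50%: {worst_case:.1%}")
print(f"intervals overlap, so report a tie: {is_tie}")
```

Even with a 3-point gap in the headline numbers, these two made-up parties have overlapping ranges, so by the rule above the poll is reporting a tie.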
* No, I didn’t just make these numbers up. They’re derived from ages in the February 2007 Major League Baseball roster. Because, while I find baseball boring, they do keep good statistics.