Real Kato Blog: Five Lessons: Probability and Statistics

For today: Five Lessons in Probability and Statistics. The math behind probability is fairly straightforward, but it often leads to counterintuitive results that confuse a lot of people. I've written to some degree about this topic before. I find this kind of stuff interesting; I guess I just like proving that intuition isn't a substitute for real thought.

So here we go...

1. There is no "law of averages". Some people expect that if, say, a coin flip has come up heads five times in a row, then the odds must be high that tails will come up next, because heads and tails should come up a roughly equal number of times over the long run. But that's a misinterpretation of the math. While it's true that the odds of six heads coming up in a row are small before you've flipped the first time (63-to-1 against), the odds of a sixth head coming up after the first five heads are back to 50-50. You've already passed the improbable part of the event, the first five heads. Nothing about the past events will affect the physics of how the coin flips next.
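If you'd rather check this than take my word for it, here's a minimal sketch in Python that counts outcomes directly. It assumes an ideal coin (exactly 50-50, with independent flips), and the variable names are just ones I made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Every possible sequence of six flips of an ideal coin (64 in total,
# all equally likely under the fair-coin assumption).
sequences = list(product("HT", repeat=6))
six_heads = ("H",) * 6

# Chance of six heads in a row, judged before the first flip.
p_six_heads = Fraction(sequences.count(six_heads), len(sequences))

# Now keep only the sequences that start with five heads,
# and ask how often the sixth flip is also heads.
after_five_heads = [s for s in sequences if s[:5] == ("H",) * 5]
p_sixth_head = Fraction(
    sum(1 for s in after_five_heads if s[5] == "H"),
    len(after_five_heads),
)

print(p_six_heads)   # 1/64, i.e. 63-to-1 against
print(p_sixth_head)  # 1/2
```

Only two of the 64 sequences start with five heads, and they split evenly on the sixth flip.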

There is, however, something called "regression towards the mean". Suppose I've flipped a coin ten times and it's only come up heads twice. If I'm predicting the next series of ten coin flips, it's significantly more probable that I'll end up with more than two heads in the next series than with two or fewer. It's not that the first ten flips are affecting the next ten; it's just that for any given series of ten flips, you'd expect to average five heads. So if you must call something a "law of averages", I guess that would be it.
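To put a number on "significantly more probable", here's a rough sketch that works out the exact binomial probabilities for ten flips. Again it assumes an ideal 50-50 coin; the helper name prob_heads_at_most is my own invention.

```python
from fractions import Fraction
from math import comb

def prob_heads_at_most(k, n=10):
    """Probability of k or fewer heads in n flips of an ideal coin."""
    favorable = sum(comb(n, i) for i in range(k + 1))
    return Fraction(favorable, 2 ** n)

p_two_or_fewer = prob_heads_at_most(2)
p_more_than_two = 1 - p_two_or_fewer

print(p_two_or_fewer, float(p_two_or_fewer))    # 7/128 0.0546875
print(p_more_than_two, float(p_more_than_two))  # 121/128 0.9453125
```

So the next series of ten beats two heads about 95% of the time, no matter what the previous series did.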

But speaking of coin flips...

2. Real coin flips are not fair. That is to say, the odds of a coin landing heads or tails are not precisely 50-50. If a coin is allowed to land on the ground, it's slightly more probable (around 51-49) that it will land the same way it was facing prior to being flipped. If a coin is spinning on the floor and then comes to rest, it's significantly more probable that it will end up with the heavier side of the coin down (for most coins, that means tails up). Math problems in textbooks will often use the term "fair coin" or "ideal coin" to account for the fact that real-life coins are not fair.

But also, there's a small but non-zero chance that the coin will land on its edge. If you think about it, it happens more often than you probably realize: whenever a coin comes to rest leaning against a piece of furniture or someone's shoe instead of landing flat on the ground, I'd argue that counts as "edge". (It's also possible for a coin to land perfectly on its edge on a smooth floor. But it's an unstable equilibrium; any disturbance, like a vibration or a gust of wind, would probably cause the coin to tip over a moment later.)

There's lots more about coin flips here.

3. You can't assume all possible outcomes happen in equal proportions. This seems like an obvious statement, but consider this puzzle (posed in a Marilyn vos Savant column): A woman tells you she has two children, and at least one of them is a boy. What are the odds that the other is also a boy? For most people, the intuitive answer is 50-50. People might even give an argument like I gave in item 1 above, saying that the sex of one child has no bearing on the sex of the other.

But actually, there are four approximately equally probable ways a woman can have two children: boy then boy, boy then girl, girl then boy, or girl then girl. (It matters which one is older, as any kid who grew up with a sibling can tell you.) If a woman tells you at least one child is a boy, you've only eliminated the last case (girl then girl). That leaves three possibilities, and in only one of those three cases is the other child also a boy. So the answer to the above problem is "one out of three", not "one out of two"... even though there are only two possibilities for the sex of the other child.
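The counting argument above can be sketched in a few lines of Python. This assumes boys and girls are exactly equally likely and that the two births are independent (only approximately true in real life). For contrast, it also answers a slightly different question, where you're told the older child is a boy.

```python
from fractions import Fraction
from itertools import product

# The four equally likely families, written as (older, younger).
families = list(product(["boy", "girl"], repeat=2))

def chance_both_boys(candidates):
    """Fraction of the given families in which both children are boys."""
    both = [f for f in candidates if f == ("boy", "boy")]
    return Fraction(len(both), len(candidates))

# "At least one is a boy" only rules out girl-then-girl.
at_least_one_boy = [f for f in families if "boy" in f]
p_given_at_least_one = chance_both_boys(at_least_one_boy)

# "The older one is a boy" rules out two of the four families.
older_is_boy = [f for f in families if f[0] == "boy"]
p_given_older = chance_both_boys(older_is_boy)

print(p_given_at_least_one)  # 1/3
print(p_given_older)         # 1/2
```

The 50-50 intuition is the right answer to the second question, just not to the one that was actually asked.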

Marilyn got a lot of heat for this explanation, but as usual, she was right and most of America was wrong.

4. Beware of false positive results. Consider you have a medical test for a rare but deadly disease. The doctor tells you the test is 99.9% accurate. You test positive for the disease. Are you doomed? Do you start giving away your worldly possessions? Well, not so fast.

If the disease is very rare, then you've actually got a good chance of surviving. Suppose there's a 1 in 10,000 chance of catching the disease. If you administer this medical test to 10,000 people, then the one person who has the disease will almost certainly test positive for it. But about ten people who don't have the disease will also test positive, because 0.1% of the time, the test is wrong. So you're ten times more likely to be a false positive than an actual sick person.
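Here's the same arithmetic as a small Bayes' rule sketch. One loud assumption: I'm reading "99.9% accurate" to mean the test is right 99.9% of the time for sick people and for healthy people alike. Real tests usually have different rates for each, and the variable names here are mine.

```python
from fractions import Fraction

prevalence = Fraction(1, 10_000)  # 1 in 10,000 people have the disease
accuracy = Fraction(999, 1000)    # assumed to apply to sick and healthy alike

# Share of everyone tested who is sick AND tests positive.
true_positives = prevalence * accuracy

# Share of everyone tested who is healthy but tests positive anyway.
false_positives = (1 - prevalence) * (1 - accuracy)

# Bayes' rule: of all the positive results, what share are really sick?
p_sick_given_positive = true_positives / (true_positives + false_positives)

print(p_sick_given_positive)         # 111/1222
print(float(p_sick_given_positive))  # roughly 0.09
```

So a positive result moves you from a 0.01% chance of being sick to about 9%: much worse than before, but still roughly ten-to-one in your favor.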

Hopefully you'll never come across this situation, but maybe when a doctor diagnoses you with some rare disease based on some test she performed on you, you'll do the math and figure out that you need a second opinion.

5. You can believe polls based on random samples. This applies to everything from TV ratings to political polls. I've heard people argue, "How could Nielsen possibly know what everyone is watching if they only ask a few thousand?" But statistically speaking, even a small sample of the population can give you high confidence that you've got an accurate picture of the general population... if the sample is random.

So there are two caveats there. The first caveat is, it's very hard to get a truly random sample. Political polls are skewed because, among other reasons, some people don't like to answer polls, and polls are conducted over landline phones but not cell phones. Pollsters apply a "fudge factor" to account for these things, but that's where inaccuracy creeps in.

The second caveat is, statisticians never claim that the result of the poll is a certainty. They'll say something like, "I have 95% confidence that Candidate X has the support of 55% of the population, plus or minus 3%". That means that (a) the results are probably somewhere between 52% and 58% (the 3% either way is the "margin of error"), but (b) there's a 5% chance that the results are outside that range.
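For the curious, here's a rough sketch of where a "plus or minus 3%" figure can come from, using the standard normal-approximation formula for a proportion. It assumes a truly random sample from a large population; the sample size of 1,000 is a number I picked for illustration, not something from any particular poll.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p, n, confidence=0.95):
    """Approximate margin of error for a sample proportion p from n
    random respondents (normal approximation, large population)."""
    # z is about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / n)

moe = margin_of_error(p=0.55, n=1000)
print(round(moe * 100, 1))  # about 3.1 percentage points

# Quadrupling the sample only cuts the margin in half.
moe_4000 = margin_of_error(p=0.55, n=4000)
print(round(moe_4000 * 100, 1))
```

Notice that the size of the whole population never appears in the formula. That's why a thousand or so random respondents can stand in for a country of millions.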

So yes, you can believe the math behind polls, but make sure you understand what the polls are really telling you, and keep in mind that sampling bias can undermine a poll's reliability.

As always, feel free to post comments or questions.


Posted by Ken in: interesting

Comments

Comment #1 from Timothy Ross (Guest)
2009 Aug 5 - 7:11 pm

I am loving these 5 lessons articles!

On #4... Before you get too relaxed, don't forget that most of the time that you are being tested for a rare disease it is because you have shown symptoms. The disease may affect 1 in 10,000 but 9,000 of those wouldn't show any signs. That would make your chances (if I do my math right) 50/50 (1/1000 or 0.1% chance of false positive and 1/1000 chance of dying within the year).

Also a quick #6 (but more psychological than mathematical):
6. Fractions are scarier than percentages
A 44-year-old woman has a 1 in 35 chance of having a baby with Down Syndrome. 1 in 35!!! I know 35 people! That is scary. Oh... wait, 1 in 35? That is less than a 3% chance. Let's get it on!

Comment #2 from Howard Hendrickson Ph.D (Guest)
2009 Sep 29 - 6:19 pm

I have actually observed (and in fact "called" edge in a coin toss) the incident of a coin landing on its edge. If one thinks about what the differences are between a disc and a cylinder, one would find that they are the same thing, the only difference being height. In layman's terms: Increase the height of a disc, and you have a cylinder. Decrease the height of a cylinder and you have a disc. It is easy to see that there is NOT a 50/50 chance of heads or tails in a coin toss. Thought for food...
