
Tuesday Boy Problem Solved by Simulation

The other day, I came across a logic/math problem I hadn't heard before, The Tuesday Birthday Problem. It goes like this:

I have two children, one of whom is a son born on a Tuesday. What is the probability that I have two boys?

This puzzle was apparently first presented at a convention for mathematicians, magicians and puzzle enthusiasts (yeah, that's a pretty specialized convention) by Gary Foshee. Immediately after giving the puzzle, he followed up with this.

The first thing you think is 'What has Tuesday got to do with it?' Well, it has everything to do with it.

I know my first inclination was to dismiss that extra fact. How could it have any effect on the probability of the sex of the other child? I first read this puzzle late at night when I was tired, so I didn't feel like putting too much thought into it. Instead, I just read the explanations of how that extra bit of information alters the odds. I still wasn't ready to buy those explanations, though. So rather than try to think them through that night, I decided to tackle the problem from a different angle. Instead of trying to figure out the odds, I'd just program a simulation and see how it played out.

In fact, this is a very simple simulation. I didn't program it in the most efficient manner, but it got the job done. Here's what I did. I created a 4 x 10,000 element array. That is, 10,000 sets of kids, with four pieces of information to designate sex and birth day of the week for each kid (sex 1, day 1, sex 2, day 2). Then, I randomly assigned sex and birth day to each of the kids. Next, I created a couple of variables to be filled in during the next stage. The first kept track of the number of sets where at least one kid was a boy born on a Tuesday - that is, the number of sets where the father could have given his first statement. The other was the number of sets with a boy born on a Tuesday and another son - the sets that also answer the father's question with a yes. With the array and variables in place, I went back and used some if statements to simulate the father's conditions, increasing the totals of those variables as appropriate. When that was done, I simply divided the number of sets with a boy born on a Tuesday and another son by the number of sets with at least one boy born on a Tuesday.

After running this program a few times, I found a small problem. 10,000 sets wasn't enough. The fraction was varying by several percentage points each time I ran it. So, I added one more feature to allow the program to keep a running average every time it ran.

Oh, and just to be sure I was doing things properly, I added a similar set of calculations to calculate the probability for a simpler puzzle:

I have two children, one of whom is a son. What is the probability that I have two boys?

This is much easier to understand, so it was my control to make sure the algorithm was working properly.
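For anyone who would rather not download anything, here is a minimal sketch of the approach described above, written in Python. It is not the original program: the function name simulate, the 0/1 encoding of sex, and the choice of index 2 for Tuesday are all assumptions made for illustration. It counts the same two totals for the Tuesday boy puzzle, plus the two totals for the simpler control puzzle, and divides.

```python
import random

BOY = 1        # assumed encoding: sex is 0 or 1, with 1 meaning boy
TUESDAY = 2    # assumed encoding: days are 0..6, with 2 standing in for Tuesday

def simulate(num_families, seed=None):
    """Estimate P(two boys) by filtering random two-child families.

    Returns (tuesday_boy_estimate, plain_boy_estimate).
    """
    rng = random.Random(seed)
    tuesday_boy_sets = 0       # sets with at least one boy born on a Tuesday
    tuesday_boy_two_boys = 0   # ...of those, sets where both kids are boys
    boy_sets = 0               # control: sets with at least one boy
    boy_two_boys = 0           # ...of those, sets where both kids are boys

    for _ in range(num_families):
        sex1, day1 = rng.randrange(2), rng.randrange(7)
        sex2, day2 = rng.randrange(2), rng.randrange(7)
        both_boys = sex1 == BOY and sex2 == BOY

        has_tuesday_boy = (sex1 == BOY and day1 == TUESDAY) or \
                          (sex2 == BOY and day2 == TUESDAY)
        if has_tuesday_boy:
            tuesday_boy_sets += 1
            tuesday_boy_two_boys += both_boys

        if sex1 == BOY or sex2 == BOY:
            boy_sets += 1
            boy_two_boys += both_boys

    return tuesday_boy_two_boys / tuesday_boy_sets, boy_two_boys / boy_sets

if __name__ == "__main__":
    tuesday_estimate, control_estimate = simulate(1_000_000)
    print("Tuesday boy puzzle:", tuesday_estimate)
    print("Simpler boy puzzle:", control_estimate)
```

Only about one set in seven contains a Tuesday boy, so the first estimate is noisier than the control for the same number of sets. Raising num_families does the same job as the running average.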

Warning: Don't read on if you want to solve the problem on your own, first.

Well, guess what I found out. After running the simulation on 100,000,000 sets of kids, I got a probability of 0.4813391 for the Tuesday boy problem, and 0.3333046 for the simpler boy problem. Those are very close to the actual odds of 13/27 (0.481481481...) and 1/3 (0.33333333...). It's pretty counterintuitive, but I guess those eggheads know what they're talking about, after all.
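The 13/27 figure can also be checked by brute-force counting instead of random sampling. The sketch below is an illustration, not part of the original program. It lists all 196 equally likely (sex, day) combinations for two children, keeps the ones with at least one Tuesday boy, and counts how many of those are two boys. The "B"/"G" labels and the use of index 2 for Tuesday are arbitrary choices.

```python
from fractions import Fraction
from itertools import product

BOY, GIRL = "B", "G"
TUESDAY = 2  # days numbered 0..6; which index stands for Tuesday is arbitrary

# 14 equally likely kinds of child, so 14 * 14 = 196 equally likely families.
children = list(product((BOY, GIRL), range(7)))
families = list(product(children, repeat=2))

def is_tuesday_boy(child):
    return child == (BOY, TUESDAY)

# Families where the statement "one is a son born on a Tuesday" is true.
with_tuesday_boy = [f for f in families if any(is_tuesday_boy(c) for c in f)]
# Of those, families where both children are boys.
two_boys = [f for f in with_tuesday_boy if all(c[0] == BOY for c in f)]

exact = Fraction(len(two_boys), len(with_tuesday_boy))
print(len(with_tuesday_boy), len(two_boys), exact)  # 27 13 13/27
```

The 27 comes from 14 families with a Tuesday boy first, plus 14 with a Tuesday boy second, minus the one family counted twice. The 13 comes from 7 + 7 - 1 in the same way, restricted to two boys.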

Image Source: Wikimedia Commons

Anyone interested in checking this out for themselves can download my program below:


I haven't read exactly what your simulation does, but I don’t need to. I know what it does from the answer it got. Let me make a suggestion for a slightly different one.

Rather than "eliminating" two-child families that don't meet certain criteria, keep track of fourteen different types: those where a parent tells you about a boy born on a Sunday, a boy born on a Monday, ..., a boy born on a Saturday, a girl born on a Sunday, ..., and a girl born on a Saturday. Each parent tells you only one thing, but is truthful. Calculate the probability of two children with the same gender separately for each category.

You will find that one of the following two facts is true: either the answer is vastly different based on what the parent tells you, or the answer is exactly 1/2 for each category.

What you are overlooking is that the parent of a boy born on a Tuesday, and a child that doesn't fit that description, is equally likely to tell you about either child. (Unless, of course, you require the parent to be biased toward Tuesday Boys. That's what makes the probabilities vastly different, based on that bias.)

The intuitive query "What has Tuesday got to do with it?" is valid, and can be explained much better without the "Tuesday" complication. Say a parent tells you "I have two children." The probability this person has two children of the same gender is 50%: there are four possible gender combinations - BB, BG, GB, and GG - and in half of them the children share a gender.

But suppose the parent adds "... and one has the gender I've written in this sealed envelope." One of the combinations we originally counted contradicts whatever is written there, so it may seem that one case is "eliminated," even if you don’t know which. Of the three remaining, only one includes two of the same gender. So even though you learned nothing relevant to the probability question, it seems the answer changed from 1/2 to 1/3.

The fallacy in this logic is that the parent of a boy and a girl had to decide what to write in the envelope, and could have chosen either "boy" or "girl." So you need to "eliminate" not only the parents who *couldn't* write "boy," but also those who *would* write something else even though they have a boy.
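To put numbers on the envelope argument, here is a small sketch in Python. The function name and the parameter p_boy_if_mixed are made up for illustration. The parameter is the assumed probability that a parent of one boy and one girl writes "boy" in the envelope. The function returns the exact probability that the children share a gender, given that "boy" was written.

```python
from fractions import Fraction

def same_gender_given_boy_written(p_boy_if_mixed):
    """P(same gender | envelope says "boy") for a two-child family.

    p_boy_if_mixed: assumed chance that a parent with one boy and one
    girl chooses to write "boy" rather than "girl".
    """
    quarter = Fraction(1, 4)  # BB, BG, GB and GG are each 1/4 likely
    # BB parents always write "boy"; GG parents never do.
    wrote_boy_and_same = quarter
    # BG and GB parents write "boy" only some of the time.
    wrote_boy = quarter + 2 * quarter * p_boy_if_mixed
    return wrote_boy_and_same / wrote_boy

print(same_gender_given_boy_written(Fraction(1, 2)))  # 1/2
print(same_gender_given_boy_written(Fraction(1)))     # 1/3
```

The answer of 1/3 only appears when every parent who has a boy is forced to write "boy." If mixed-gender parents pick either child with equal chance, the answer stays at 1/2.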

This apparent paradox was first created, in a form that differs only in the numbers of cases, in 1889 by a French mathematician named Joseph Bertrand. He used it to warn future generations that you need more than just a fact; you need to know how it was obtained. Too many modern mathematicians fail to heed that warning.

I'm not trying to be difficult, but could you please tell me exactly how you would model the Tuesday Birthday problem without worrying about other cases? It's not that I couldn't do it, it's just a fair amount of work for a blog entry when I don't think it will make much difference. Just to repeat what I did:

1. Generate random two child combinations.
2. Search for all sets that satisfy the first statement.
3. Within the sets found in step 2, search for all those that also have two boys.
4. Divide the two totals to get the probability.

I don't see that what you're suggesting is actually much different, other than that I'd be repeating steps 2 through 4 for all the different combinations of birth day and gender. So, the Tuesday boy case should come out exactly the same as my original program, and I don't see why it would be any different for any other gender/day combination. What am I missing?

I thought I did describe it. I was brief, but it's there. But...

1) Initialize Count(1:14) to 0. Initialize Match(1:14) to zero.
2) For i=1:LotsAndLots
3) Generate a random two-child family.
4) Generate a true statement of the form "One is a [boy/girl] born on a [Sun/Mon/Tues/Wednes/Thurs/Fri/Satur]day." Determine which j in 1:14 corresponds to this statement.
5) Increment Count(j).
6) If both children have the same gender, increment Match(j)
7) next i
8) For j=1:14
9) Print out Match(j)/Count(j)
10) next j

The only thing I left to you is how to generate the statement in step 4). If you do it randomly, all of the numbers you print out will be 1/2, +/- the accuracy of your simulation. If you look for the lowest j whose statement is true, you will print out different numbers for each.
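The steps above, with both ways of handling step 4, might look something like this in Python. This is only a sketch. The names run, random_child and lowest_category are invented here, and the numbering is an assumption: categories run from 0 to 13 instead of 1 to 14, with boys as 0..6 and girls as 7..13, and Sunday as day 0.

```python
import random

DAYS = 7

def category(child):
    """Map a (sex, day) pair to j in 0..13. Assumed: sex 0 = boy, 1 = girl."""
    sex, day = child
    return sex * DAYS + day

def random_child(family, rng):
    """Step 4, random version: the parent describes either child, 50/50."""
    return category(rng.choice(family))

def lowest_category(family, rng):
    """Step 4, biased version: use the lowest j whose statement is true."""
    return min(category(child) for child in family)

def run(num_families, pick_statement, seed=None):
    rng = random.Random(seed)
    count = [0] * 14   # families whose parent made statement j
    match = [0] * 14   # ...of those, families with two of the same gender
    for _ in range(num_families):
        family = [(rng.randrange(2), rng.randrange(DAYS)) for _ in range(2)]
        j = pick_statement(family, rng)
        count[j] += 1
        if family[0][0] == family[1][0]:
            match[j] += 1
    # Assumes num_families is large enough that every category occurs.
    return [m / c for m, c in zip(match, count)]

if __name__ == "__main__":
    for name, strategy in (("random", random_child), ("lowest j", lowest_category)):
        ratios = run(500_000, strategy)
        print(name, [round(r, 3) for r in ratios])
```

With the lowest-j rule, a parent makes the last statement only when both children fit it, so that category always comes out as exactly 1. The first category reproduces the filtering simulation's 13/27.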

The point is that your simulation does not represent a real-world situation. It reflects a world where every parent of two is asked "Is one of your children a boy who was born on Tuesday?" Those who answer "yes" are counted, while those who answer "no" are not.

Mine represents a world where every parent is free to describe whatever they like. It counts those who say "I have a boy who was born on a Tuesday." It does not count, in the same category, those with a Tuesday Boy who would tell you about a Thursday Girl. Which is what the problem statement suggests.

The difference is that you count all of the parents of a Tuesday Boy and a Thursday Girl in the "Tuesday Boy" category, and none of them in the "Thursday Girl" category. I split them.
