Not exactly a stats post. Just a thinkin' about stuff and writin' about stuff post. These are some questions I've either been asked at some point, or that I ask myself every night as I fall asleep, fearful that the darkness has no answers, more fearful that it will speak answers I cannot bear to hear. Think of it as a Frequently Asked Questions post, using a liberal interpretation of the terms "frequently" and "asked."
Friday, October 31, 2014
Wednesday, October 1, 2014
Ten Card Draw
Sort of a sequel-post here, with more of the tarot math I was playing around with last time. Some of the references I used for that post discussed the idea that if you properly shuffle a deck of 52 playing cards, there's a decent chance that you've created a unique order of cards, never before seen in the history of shuffled decks. I wondered, what are the chances that any given tarot reading is unique in the history of tarot readings?
The spread pictured above is the Celtic cross spread, one of the more popular tarot spreads out there. Different people have different preferences for what each position signifies-- hell, there are only about three matches between the diagram above and the layout as I learned it-- but it's always ten cards, and the meaning of the spread depends upon the order the cards are drawn as well as the orientation in which they're laid down.
Given those parameters, we can calculate the total number of unique Celtic cross spreads:
78 x 2 x 77 x 2 x 76 x 2 x 75 x 2 x 74 x 2 x 73 x 2 x 72 x 2 x 71 x 2 x 70 x 2 x 69 x 2 =
4,675,765,217,094,107,136,000
The point of all those 2s in there is to represent that each time a card is laid down, there are two possibilities for the way it faces. As you can see, there are a buttload of possible outcomes for a Celtic cross spread. Using standard US nomenclature for really big numbers, we can say that there are more than 4.6 sextillion possible Celtic cross spreads.
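If you'd rather not multiply twenty numbers by hand, here's a minimal Python sketch of the same count. The constant names are mine, made up for illustration; the arithmetic is just the product above, split into "which cards, in which order" and "which way each one faces."

```python
import math

DECK_SIZE = 78      # cards in a tarot deck
SPREAD_SIZE = 10    # cards in a Celtic cross spread
ORIENTATIONS = 2    # each card lands upright or reversed

# Ordered draws without replacement: 78 x 77 x ... x 69
ordered_draws = math.perm(DECK_SIZE, SPREAD_SIZE)

# Every one of the ten cards independently gets one of two orientations.
total_spreads = ordered_draws * ORIENTATIONS ** SPREAD_SIZE

print(f"{total_spreads:,}")
```

Python integers don't overflow, so the sextillions come out exact rather than as a rounded float.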
... the major motivation behind this blog post is that I calculated a number that gives me an excuse to say the word "sextillion." It is, scientifically speaking, the funniest number-related word.
A Saga-themed tarot deck would be AMAZING, btw.
Now we come to the second part of the question: what's the chance that, given how many possible Celtic cross spreads there are, no two Celtic cross spreads in the history of tarot readings have been identical? To answer this, we can use the math of the Birthday Paradox. If you haven't heard of it, the Birthday Paradox is the name given to the fact that you don't need as large a group of people as you might think before you start getting a pretty good chance that at least two people in that group share a birthday. If you have 23 people in a room together, there's about a fifty-fifty chance that there's a shared birthday among them. The linked explanation of the Birthday Paradox is better than any I could give, so I'll just let y'all educate yourselves there if you want to know more about it.
What's useful for our purposes is the shortcut formula near the end of that post: if you've got a pool of a given number of things, and you have an equal chance of drawing any one of the things, how many times do you need to draw from that pool before your chances of having drawn the same thing twice are about fifty-fifty? The precise math is complex, but we can get a decent estimate by taking the square root of the size of our pool-- or, a little more specifically, 1.177 times the square root of our pool. There are 365 possible birthdays out there (excluding leap years), and 1.177 times the square root of 365 is about 22.49, very close to the actual 23-person figure.
We have a pool of Celtic cross configurations of a known (if enormous) size. Roughly how many tarot readings would need to occur to reach an even chance of at least one repeat?
The answer: You'd need more than 80 billion Celtic cross tarot readings before the chances of a repeat reading reach fifty percent. Specifically, you'd need to do 80,482,750,652 tarot readings, or 11.3 for every human alive on Earth today. That is a big number.
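Here's a rough sketch of that shortcut in Python, in case you want to poke at it yourself. The function name is hypothetical, and the 1.177 factor is the rounded constant quoted above (it approximates the square root of 2 ln 2), so the last few digits will shift slightly if you plug in a more precise value.

```python
import math

# Total Celtic cross spreads, as calculated earlier in the post.
TOTAL_SPREADS = 4_675_765_217_094_107_136_000

def draws_for_even_odds(pool_size, factor=1.177):
    """Approximate draws needed for a ~50% chance of at least one repeat."""
    return factor * math.sqrt(pool_size)

# Sanity check against the birthday case: should land just under 23.
birthday_estimate = draws_for_even_odds(365)

# The tarot case: roughly 80 billion readings.
tarot_estimate = draws_for_even_odds(TOTAL_SPREADS)

print(f"{birthday_estimate:.2f}")
print(f"{tarot_estimate:,.0f}")
```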
I'm not exactly sure how to determine the total number of tarot readings that have ever occurred. If the average person has had a dozen Celtic cross readings in their lifetime, then we've probably reached 80 billion tarot readings total. But have they? I imagine the variance for that dataset is pretty big-- lots and lots of people who've never had a reading, versus enthusiasts who might have had hundreds. There's probably a way to estimate that, but it would take more effort than I'm willing to put in.
In any case, if every person on Earth sat down and did ten consecutive Celtic cross spreads right now, there's a better-than-even chance that every single one of those readings would be unique. That's a staggering enough thought that I'm willing to say there's a good chance that your ten-card Celtic cross reading, while still bullshit, is your very own, never-before-seen, personal bullshit. And isn't that something?
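To put a rough number on that last claim, here's a sketch using the standard birthday-problem approximation, where the chance of no repeats among n draws from a pool of N is about exp(-n(n-1)/(2N)). Two assumptions to flag loudly: the world population of 7.1 billion is my back-of-the-envelope figure (it's roughly what "11.3 readings per person" implies), and the formula is an approximation rather than the exact product.

```python
import math

TOTAL_SPREADS = 4_675_765_217_094_107_136_000
WORLD_POPULATION = 7_100_000_000   # assumption: rough 2014 figure
READINGS_PER_PERSON = 10

def prob_all_unique(draws, pool_size):
    """Approximate chance that `draws` random picks from the pool never repeat."""
    return math.exp(-draws * (draws - 1) / (2 * pool_size))

total_readings = WORLD_POPULATION * READINGS_PER_PERSON
p_unique = prob_all_unique(total_readings, TOTAL_SPREADS)

print(f"{p_unique:.1%}")
```

Under those assumptions the result lands somewhat above fifty percent, which is all the claim needs.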
Wednesday, September 24, 2014
Four of Wands, Page of Math
Like many people, I have an embarrassing hobby.
So I'm a statistician, right? A general fan of science and facts and stuff. A stickler for evidence. The kind of person whose mantra is correlation does not necessarily imply causation, except when the causal relationship in question isn't based on quantitative data and therefore can't really be described using the term "correlation," in which case the dubious argument is better criticized by referencing the logical fallacy "post hoc ergo propter hoc." (Because I'm also the kind of person who is a pain-in-the-ass stickler for using the word "correlation" correctly.)
So, it follows that I wouldn't be into any sort of mystical divination practices, since their fakeness is fairly obvious. Even my homegirl Hermione rolls her eyes at them. She trusts the judgment of a telepathic singing hat, but not someone's subjective interpretation of a pile of tea leaves, because even literal witches know that divination is bullshit.
That is my embarrassing hobby: I read tarot cards. I don't believe they have any predictive power, but I know how to do the spreads and interpret them as though they do. The only excuses I have for why I read tarot cards are:
- You can convince drunk people that you're magic
- They're pretty
- It's kind of like Rorschach blots, you know? The meaning that you impose on the ambiguity can highlight thoughts, feelings, and motivations that might otherwise be difficult to identify
- They're really pretty
- Did I mention how pretty they are
they are really very pretty
My friend Lauren wants to go back to school to finish up her degree, and a while back, she asked if I could read her tarot cards with regard to her prospects. We shuffled and re-shuffled (a LOT) and I drew three cards for her: a reversed ace of wands (stagnation or lack of passion for a new opportunity), a nine of swords (mounting anxiety and insomnia), and a reversed five of swords (unavoidable catastrophic failure). Not even the most liberal interpretation could spin anything positive out of that hand.
So I blew it off and shuffled again, because Lauren is awesome and deserves a fortune-telling session that predicts piles of cash and hookers (lookin' at you, ten of pentacles). And the new cards were a reversed high priestess (impatience, wasted potential), a five of pentacles (poverty and bad luck), and the goddamned devil (which is pretty much as bad as it sounds).
This ridiculous bullshit continued for about seven more drawings, separated by increasingly intense shuffling sessions, including the foolproof throw-all-the-cards-in-the-air-and-swear-loudly method. Time and again, Lauren got cards that weren't just irrelevant or neutral, but seemed to be actively shitting on her dreams. More than once, she shouted at me, "you're a statistician-- what are the odds of this?"
Challenge accepted!
At first I thought it would be pretty easy to answer Lauren's question. How many ways can there be to draw three cards from a deck? While a card's position in its spread usually affects its interpretation, we weren't applying any past-present-future or problem-advice-outcome meanings to the three-card spreads we were doing, so order doesn't matter for our purposes. There are 78 cards in a tarot deck, so there are 76,076 possible hands of three.
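For the record, that 76,076 is plain n-choose-k. A tiny sketch, with the ordered count thrown in for contrast (the variable names are mine):

```python
import math

DECK_SIZE = 78
HAND_SIZE = 3

# Order doesn't matter: combinations, 78 choose 3.
unordered_hands = math.comb(DECK_SIZE, HAND_SIZE)

# If position in the spread did matter, we'd count permutations instead.
ordered_hands = math.perm(DECK_SIZE, HAND_SIZE)

print(f"{unordered_hands:,} unordered, {ordered_hands:,} ordered")
```

The ordered count is exactly 3! = 6 times larger, since each hand of three can be laid out in six different orders.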
Now we just have to determine how many of those spreads would be unfavorable for Lauren's education! This part turned out to be trickier.
Since orientation affects interpretation, each of the 78 cards has 2 possible outcomes, for 156 total. I categorized each of these 156 outcomes as positive, negative, or irrelevant with regard to Lauren's education, because ambiguity is for chumps. Of those, there are 43 good possibilities (28%), 64 bad possibilities (41%), and 49 possibilities (31%) that have no bearing on Lauren's question (get outta here two of cups, we weren't asking about crushes). I defined an unfavorable spread as any combination of three cards that includes no positive results and at least one negative result: either all negatives, two negatives and an irrelevant, or one negative and two irrelevant.
And that's where I ran into a bit of a problem. See, I planned to calculate all the different ways one could draw any of those three hands using plain old n-choose-k, where I'd be looking for how many ways to choose 3 negative cards from the pool of 64... but there aren't exactly 64 negative cards. It's impossible for me to first draw an upright eight of swords (feeling trapped by circumstance) and then draw a reversed eight of swords (feeling trapped by circumstance, but like, even worse). But I've included both the upright and reversed versions of the card in my tally of negative outcomes. Essentially, I've backed myself into a mathematical corner where I've artificially doubled the size of the deck: I'm doing calculations based on 156 outcomes, rather than 78 outcomes with two variations each.
To be truly rigorous, I should tally up how many cards switch from positive to negative, from irrelevant to positive, from negative to irrelevant, etc, versus how many cards maintain their general meaning when reversed, and work those numbers into much longer calculations of conditional probability.
Yeah, I based my calculations on the imaginary 156-card deck, where each orientation of each card counts as its own separate card. It's fortune-telling, for crying out loud, I'm not going to worry too much about rigor.
If we allow the fudge-factor of a theoretical 156-card deck to deal with the problem of card orientation, there are 620,620 possible combinations of three cards that one can draw from a tarot deck. There are 41,664 ways to get three negative cards, 98,784 ways to get two negative cards and one irrelevant card, and 75,264 ways to get two irrelevant cards and one negative card.
All together, that's 215,712 crappy hands, out of 620,620 possible hands. There's about a 35% chance that any individual tarot reading I do for Lauren's educational prospects will be negative. If we multiply things out to reflect the fact that Lauren got eight successive crappy readings, we get a probability of just over 0.02%. So... huh. Dang Lauren, sure looks like the universe has it in for you.
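Here's a sketch of the whole calculation, so you can swap in your own tallies of good, bad, and irrelevant cards. It leans on the same fudge described above: the imaginary 156-card deck where each orientation counts as its own card. It also treats the eight readings as independent, which is an assumption (one I hope all that aggressive shuffling justifies).

```python
import math

# Outcome tallies from the post (upright and reversed counted separately).
POSITIVE, NEGATIVE, IRRELEVANT = 43, 64, 49
DECK = POSITIVE + NEGATIVE + IRRELEVANT   # the fudged 156-card deck
READINGS = 8                              # successive crappy readings

total_hands = math.comb(DECK, 3)

# Unfavorable = no positives and at least one negative.
three_negative = math.comb(NEGATIVE, 3)
two_negative = math.comb(NEGATIVE, 2) * IRRELEVANT
one_negative = NEGATIVE * math.comb(IRRELEVANT, 2)
bad_hands = three_negative + two_negative + one_negative

p_bad = bad_hands / total_hands
p_streak = p_bad ** READINGS   # assumes independent readings

print(f"{bad_hands:,} of {total_hands:,} hands are bad ({p_bad:.1%})")
print(f"Chance of {READINGS} bad readings in a row: {p_streak:.4%}")
```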
Technically, the universe has it in for all of us. Enjoy your inevitable disintegration, everything! <3 Entropy
Friday, March 28, 2014
Public Radio Pledge Drive Probability
North Carolina Public Radio Pledge Drive Season is the worst time to miss a call from an unfamiliar number. I have missed several such calls in the past few weeks, and I am convinced that each one was Eric Hodge. He was calling to inform me, via vague, leading questions about famous landmarks and culturally-specific foods, that I'd won a drawing for one of the fantastic getaways they always advertise. One of these days, I'll be right. But which one of these days?
It's a little tough to figure out how likely I am to win any particular WUNC Trip Drawing. My probability of winning is always nonzero, since I'm a Sustainer, which means I'm a) better than everyone else and b) automatically entered into every Pledge Drive drawing. The amount of funding that North Carolina Public Radio receives from listener contributions is public knowledge, but the precise number of contributors isn't listed anywhere that I could find. Even if I could figure out how many Sustainers there are, the number of one-time-gifters in the pool changes with every drawing.
There are a couple ways I can estimate how many people listen to WUNC, which is a good start for figuring out how many people are competing with me for that weekend getaway in France. This Nielsen report from December of last year indicates that 90% of Americans listen to radio each week, and the Raleigh-Durham region has a population of about two million, so the local radio-listening audience is probably somewhere around 1,800,000. WUNC's weekly cume is a little under 17% of the market, so there are probably at least 300,000 people tuning in each week. Maybe. Man, I hope I'm estimating radio listenership the right way, since it's sort of my job that I get paid for!
So, let's be optimistic and assume that maybe one out of every twenty listeners donates to the station. I've got no clue if that estimate is wildly high or wildly low or wildly spot-on, never having worked for an organization that depends on donations, but it feels like a vaguely educated guess. That's 15,000 donors total. It probably fluctuates, but let's say, for argument's sake, that I have a 1 in 15,000 chance in any given drawing of winning a trip to Rome. There are three WUNC Pledge Drives each year, and each one seems to have at least five trip drawings, so I have fifteen chances each year to win.
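The chain of guesses so far, written out as a sketch so the assumptions are easy to fiddle with. Every input is one of the rough figures above; the 17% cume share is the "a little under 17%" number taken at face value, and the variable names are made up for illustration.

```python
MARKET_POPULATION = 2_000_000   # Raleigh-Durham, roughly
WEEKLY_RADIO_SHARE = 0.90       # share of people who listen to radio weekly
WUNC_CUME_SHARE = 0.17          # "a little under 17%" of the market
LISTENERS_PER_DONOR = 20        # pure guess: one donor per twenty listeners
DRIVES_PER_YEAR = 3
DRAWINGS_PER_DRIVE = 5

radio_listeners = MARKET_POPULATION * WEEKLY_RADIO_SHARE
wunc_listeners = radio_listeners * WUNC_CUME_SHARE   # a bit over 300,000

# The post rounds down to an even 300,000 before guessing at donors.
rounded_listeners = 300_000
donors = rounded_listeners // LISTENERS_PER_DONOR
drawings_per_year = DRIVES_PER_YEAR * DRAWINGS_PER_DRIVE

print(f"{wunc_listeners:,.0f} listeners, {donors:,} donors, "
      f"{drawings_per_year} drawings per year")
```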
How long would I have to wait before I could be at least 90% sure of winning at least one trip drawing? Put another way, how many times do I have to enter a drawing so that my chance of losing every single last one of them dwindles to 10% or less? My chance of losing any individual drawing is 14999/15000, and my chance of losing all the drawings I enter is 14999/15000 raised to the power of the number of drawings in which I participate. So let's solve this equation!
(14999/15000)^n ≤ 0.10, where n is the number of drawings I enter
The sooner I get LaTeX on my new computer, the better
Using the definition of a logarithm and the change of base formula, both of which I totally remembered and did not need to look up just now, because I am a smart mathematician who never ever forgets really basic important facts, and being chronically rusty on logarithms definitely isn't a significant source of anxiety for me, we conclude that the number of drawings I'd need to enter in order to have a 90% chance of winning at least one of them is... 34,538 drawings, or over 2,300 years' worth of pledge drives. Lucky for me, my loyalty to quality radio programming is as undying as the sun. Which means, strictly speaking, not actually undying. But undying enough to last for 2,300 years.
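For anyone else who's rusty on logarithms, here's a sketch of the same solve in Python: take logs of both sides of "chance of losing every drawing is at most 10%" and round up to the next whole drawing. It assumes the 1-in-15,000 odds never change and that each drawing is independent, both of which are simplifications.

```python
import math

P_LOSE = 14999 / 15000     # chance of losing any single drawing
TARGET_LOSE_ALL = 0.10     # want at most a 10% chance of losing them all
DRAWINGS_PER_YEAR = 15

# P_LOSE ** n <= 0.10  means  n >= log(0.10) / log(P_LOSE)
# (the inequality flips because log(P_LOSE) is negative).
drawings_needed = math.ceil(math.log(TARGET_LOSE_ALL) / math.log(P_LOSE))
years_needed = drawings_needed / DRAWINGS_PER_YEAR

print(f"{drawings_needed:,} drawings, about {years_needed:,.0f} years")
```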
Sunday, March 23, 2014
This one isn't about statistics, but is instead about my mom, and her death
I feel very sad, very often, and very little remedies the sadness.
I've started writing this thing so many times, and discarded so many drafts. If I took all the words I've written and deleted since December 7th 2013 and put them together, I'd win NaNoWriMo. Sometimes what I wrote was eloquent, but most of the time it was impenetrable garbage that rambled on way past the point where it stopped making sense. Sometimes what I wrote was angry, and mostly it was angry at people who had nothing to do with what I was angry about. Sometimes it had a lot of science in it. Sometimes it used a lot of metaphors. Sometimes it included a lot of fandom references. Sometimes it had pictures, and sometimes they were pictures that I drew (poorly). Sometimes I was bitter. Sometimes I wrote while I was sober, sometimes I wrote while I was drunk, and sometimes I wrote while I was crying and couldn't stop.
Mostly, all the things I wrote were just different ways of saying the same thing: My mom died. I feel very sad, very often, and there are very few things that remedy the sadness. Writing isn't one of them.