Well okay, this isn't a political blog so we won't really talk about the Iran situation itself. It was just a piece in the Washington Post regarding the numbers of votes each candidate in Iran got for each district, to add more credence to the accusations that the election was rigged. (Not that there's much doubt of that of course, but this is some interesting extra evidence). And numbers (as opposed to politics) are fun and interesting! And chock full of science!

I don't have the numbers themselves, so I can't double-check what "the experts" cited in the Washington Post piece have analyzed, but it sounds plausible enough to me.

The hypothesis is this: Humans are very bad at coming up with random numbers. If the numbers were rigged, we should be able to see patterns in the reported results that indicate that they were chosen by a human, and are not the true result.

We can focus on the last two digits of a vote count. These last two digits are not significant to the overall outcome in a country with forty million votes. In a normal voting distribution, we would expect to see each of the ten possible digits in the last position about 10% of the time. If you ask a human to give you a random number, they will not give you an even distribution. The number 5, being in the middle, does not 'seem random' to us, and thus we tend not to choose it when asked to make up a random number. 3, 7 and 9 (7 in particular) are numbers that seem much more random to us, and we tend to pick those instead.
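To make the "10% each" expectation concrete, here is a minimal sketch of how one might tally last digits and measure how far they stray from an even spread. The vote counts below are made up purely for illustration (they are not Iranian data), and the function names are our own.

```python
from collections import Counter


def last_digit_counts(vote_counts):
    """Tally how often each digit 0-9 appears as the final digit."""
    tally = Counter(abs(n) % 10 for n in vote_counts)
    return [tally.get(d, 0) for d in range(10)]


def chi_square_uniform(counts):
    """Pearson chi-square statistic against an even spread over all digits."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Hypothetical vote counts, invented for this example only.
sample = [12347, 88017, 45127, 9037, 66207, 31457, 72913, 5581, 40266, 19870]

counts = last_digit_counts(sample)
stat = chi_square_uniform(counts)

# With 9 degrees of freedom, a statistic above roughly 16.92 would be
# unusual at the 5% level if the last digits really were uniform.
print(counts, stat)
```

A sample of ten numbers is far too small for the chi-square approximation to be trustworthy; a real analysis would need many more vote counts, and the usual rule of thumb asks for an expected count of at least five per digit.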

The numbers look suspicious. We find too many 7s and not enough 5s in the last digit. We expect each digit (0, 1, 2, and so on) to appear at the end of 10 percent of the vote counts. But in Iran's provincial results, the digit 7 appears 17 percent of the time, and only 4 percent of the results end in the number 5. Two such departures from the average -- a spike of 17 percent or more in one digit and a drop to 4 percent or less in another -- are extremely unlikely. Fewer than four in a hundred non-fraudulent elections would produce such numbers.

In addition, the same thing happens when people are asked for a two-digit random number. Humans are more likely to pick a number with two adjacent digits (for example 23, 87 and 65) than a number without adjacent digits (16, 93 or 52). We also avoid two-digit numbers where both digits are the same (22, 55 or 88, for example).
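For comparison, it is easy to count what an even spread over all hundred two-digit endings (00 to 99) would look like. The sketch below is our own illustration, and it assumes "adjacent" means the two digits differ by exactly one, with 0 and 9 not treated as neighbors.

```python
from collections import Counter


def classify_pair(two_digits):
    """Classify a two-digit ending (0-99) by how its digits relate."""
    tens, ones = divmod(two_digits, 10)
    if tens == ones:
        return "repeated"      # e.g. 22, 55, 88
    if abs(tens - ones) == 1:
        return "adjacent"      # e.g. 23, 87, 65
    return "other"             # e.g. 16, 93, 52


# Every ending from 00 to 99 is equally likely in a truly random count.
shares = Counter(classify_pair(n) for n in range(100))
print(shares["adjacent"], shares["repeated"], shares["other"])  # 18 10 72
```

So with truly random endings, about 18% of pairs would be adjacent, 10% repeated and 72% neither. If 0 and 9 are counted as neighbors as well, the split becomes 20, 10 and 70. A set of results with noticeably more adjacent pairs, or fewer repeated ones, than that would fit the human-made pattern described above.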

We've blogged about randomness and our inability to deal with it before. The findings of the Washington Post are what you would expect if you took what we have learned about randomness and applied it to the Iranian elections. Humans are distinctly non-random creatures, and because we always try to see patterns in noise, we are also incapable of generating truly random noise ourselves.

The lesson is: If you're going to rig an election, use random.org for your random numbers, rather than trying to come up with them yourself.

This is also one of the topics in the wonderful book Freakonomics, where they explain how this is used to detect tax evasion and the like. (Freakonomics is written by Dubner and Levitt, and is also a blog hosted by the New York Times.)

I'm sure I've read somewhere that numbers gathered from the real world tend to have about 30% ones in them, and then declining percentages for the other digits. I'm sure this differs, though, depending on what kind of numbers we are talking about. Can't find anything relevant with Google right now, unfortunately. Not that it even matters that much in this scenario.

Ah, yes, I believe that is the argument mentioned in Freakonomics. Or maybe it was in Quirkology! That fact has a name, like some man's rule...

Bendik, can you flip through Quirkology and have a look? I couldn't find it in Freakonomics.

Benford's law!
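For reference, Benford's law gives the chance that a number's leading digit is d as log10(1 + 1/d), which is where the "about 30% ones" comes from. A minimal sketch (the function name is our own):

```python
import math


def benford_probability(d):
    """Probability that the leading digit is d (1-9) under Benford's law."""
    return math.log10(1 + 1 / d)


# Print the expected share for each possible leading digit.
for d in range(1, 10):
    print(d, f"{benford_probability(d):.1%}")
```

Note that this is about the first digit of a number, while the analysis in the post looks at the last digits, where an even 10% spread is the expectation.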
