Bias-debasing Bayes

By Jørgen Tresse


“Prediction is very difficult, especially if it’s about the future.” – Niels Bohr.

We, as individuals and as a species, make predictions about future events all the time. Yet we keep getting many of them wrong, and it often seems like we’re unable to improve our predictive abilities. I present here a roadmap to uncertainty, risk and the failure of predictions, hoping to leave us all a bit wiser regarding this everyday activity.

Cognitive biases

First, a word on the difference between risk and uncertainty. A risk is something you take when you know the probability of different outcomes. Uncertainty is what you have when you don’t know the odds of different outcomes. Walking home from work today, you take a chance of dying in a traffic-related accident. However, you know that the risk of this happening is low, so you deem it an acceptable risk for walking home. Compare this to the fear many have of flying, or of terrorism. These risks are certainly much lower, but there are so many uncertainties involved that the perceived risk is higher. This leads us to adopt anti-terrorism measures much more quickly than traffic safety measures. This in turn says something about how we perceive the probability of future events happening – in other words, our predictions about the future.

In addition to perceived risk, we have other cognitive biases – failures, if you will – that affect our ability to clearly and accurately predict outcomes of events. Take, for example, availability bias. Many studies suggest that, in uncertain situations, we have an easier time remembering and drawing upon things that we are more exposed to. This makes sense, but it skews our predictions. Survivorship bias is another: a sampling bias that comes from looking only at the survivors of an event. During World War II, Abraham Wald famously helped the American Navy build sturdier airplanes. Before Wald, they were reinforcing planes based on where returning planes had been hit, not realizing that these were the places where a plane could sustain damage and still survive the trip. Wald recognized that their samples consisted only of survivors, and that they had hence failed to consider where fallen planes had been hit. Reinforcing the planes where the survivors had not been hit, rather than where they had, drastically reduced the number of fatalities.
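To see how sampling only the survivors distorts the picture, here is a minimal simulation sketch. The sections of the plane and the loss probabilities are hypothetical numbers of my own, chosen purely for illustration; they are not Wald’s data or his method. Each plane takes one hit, spread evenly over four sections, and is lost with a probability that depends on where it was hit.

```python
import random

# Hypothetical numbers, for illustration only (not Wald's data): the chance
# that a plane is lost, given the section where it was hit.
LOSS_PROBABILITY = {"engine": 0.8, "cockpit": 0.6, "fuselage": 0.1, "wings": 0.1}


def simulate_hits(n_planes, rng):
    """Return (all_hits, survivor_hits): hit counts per section."""
    sections = list(LOSS_PROBABILITY)
    all_hits = {s: 0 for s in sections}
    survivor_hits = {s: 0 for s in sections}
    for _ in range(n_planes):
        section = rng.choice(sections)  # hits are spread evenly over sections
        all_hits[section] += 1
        if rng.random() >= LOSS_PROBABILITY[section]:  # the plane made it home
            survivor_hits[section] += 1
    return all_hits, survivor_hits


def shares(counts):
    """Convert raw counts to fractions of the total."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


if __name__ == "__main__":
    all_hits, survivor_hits = simulate_hits(100_000, random.Random(0))
    all_share, survivor_share = shares(all_hits), shares(survivor_hits)
    for section in LOSS_PROBABILITY:
        print(f"{section:9s} all planes: {all_share[section]:.0%}   "
              f"survivors only: {survivor_share[section]:.0%}")
```

Under these assumptions, engine hits make up about a quarter of all hits but a much smaller share of the hits seen on returning planes, which is exactly why the survivors alone point to the wrong places to reinforce.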

Furthermore, we have the gambler’s fallacy – the belief that just because something has had the same outcome many times in a row, the outcome is bound to soon change. The chance of a coin toss coming up heads is 50%, regardless of how many times you have thrown heads in a row. When simulating coin tosses – that is, writing down what is considered a reasonable result of coin tosses without actually tossing a coin – people have been found to write down streaks that are too short. After maybe four or five heads, they feel that they have to switch to tails, even though with real coin tosses you can easily get eight or more heads before tails shows up. Humans are very good at finding patterns in data, and reacting accordingly, which gives us many evolutionary advantages. As many a gambler will have experienced, however, being good at finding signal in the noise does not always work to our advantage.
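If you want to check your own intuition about streaks, a small sketch like the following lets a computer do the tossing. The function names, the 200 tosses per sequence and the number of trials are my own illustrative choices. It measures the longest unbroken run of heads in each simulated sequence and averages that over many sequences.

```python
import random


def longest_run(tosses, value="H"):
    """Length of the longest unbroken run of `value` in a sequence of tosses."""
    best = current = 0
    for toss in tosses:
        current = current + 1 if toss == value else 0
        best = max(best, current)
    return best


def average_longest_run(n_tosses, n_trials, rng):
    """Average longest run of heads over `n_trials` sequences of fair tosses."""
    total = 0
    for _ in range(n_trials):
        tosses = [rng.choice("HT") for _ in range(n_tosses)]
        total += longest_run(tosses)
    return total / n_trials


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    print("hand-written example:", longest_run("HHTHHHT"))  # prints 3
    print("average longest run of heads in 200 fair tosses:",
          average_longest_run(200, 1000, rng))
```

Comparing the simulated average with the longest streak in a sequence you wrote down by hand is a quick way to see whether you, too, switch to tails too early.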

The unknown unknowns

All these biases and pitfalls can be summed up as snap judgements – cognitive heuristics, or shortcuts, that allow us to make decisions quickly, without having to stop and consciously process all the information we are bombarded with every day. The Nobel laureate Daniel Kahneman, along with his collaborator Amos Tversky, is one of the best-known scientists within the field of heuristics and biases. In his 2011 book, Thinking, Fast and Slow, he lays out the differences between two “systems” we all have – system 1, which makes all the snap decisions, based on the heuristics we have already discussed, and system 2, which consciously processes information before acting on it. When it comes to making predictions about future events, we could all benefit from slowing down, recognizing our blind spots, and putting system 2 in charge.

Of course, recognizing our blind spots requires us to be aware of them in the first place – so-called known unknowns. Former US Secretary of Defence Donald Rumsfeld also gave us two other “knowns”: known knowns – which is simply what we’re aware that we know – and unknown unknowns, which is the real kicker. You can’t correct a prediction for unknown unknowns, because, well, you don’t know what to correct for. Yet being aware of the fact that there are unknown unknowns is a giant leap towards making more accurate predictions. A famous study by Philip Tetlock found that experts were often no better than amateurs at predicting future events, even though they stated their predictions with great confidence. This is partly because experts, almost by definition, try to stand out from the wisdom of the crowd, and partly because they like to make predictions in others’ fields (catchily named ultracrepidarianism). This leads many experts to disregard common sense, inadvertently creating their own blind spots without even realizing it. Just being aware of our blind spots, so to speak, allows us to counteract some of their effect. My proposition? Think probabilistically.

Predicting the unpredictable

One of the great proponents for increasing the accuracy of predictions is Nate Silver. He started the political website FiveThirtyEight, and correctly predicted the results in 49 out of 50 states in the 2008 US presidential election. This improved to 50 out of 50 in the 2012 election, solidifying Silver and his team as top forecasters in the game. In statistics, the reigning paradigm for more than half a century has been testing a null hypothesis (which posits that there is no relation between the variables you are examining), and rejecting it if a certain value passes a critical threshold, thereby strengthening your belief in there being a relationship between said variables. (All statistics nerds, please disregard my oversimplification of the method.) While a mathematically sound way of finding correlations, it has a few shortcomings.

First, the critical value may in some cases seem arbitrary, and indeed it is. The critical value simply represents how accepting you are of being wrong. Second, your results really only tell you something about your sample of the population. If we could do the impossible task of testing every individual in the population, there would be no need for the prediction in the first place, so you are always operating with a part of the whole. Third, as the more savvy of you will have noticed, I use the word “correlation” instead of “causation” for a reason. A correlation does not ensure a causation, and that leads me to the final point: statistical generalization is good for saying something about the past, but not always a good predictor of the future. For predictions we’re better off using another tool, which has gained a lot of traction lately: Bayesian statistics.

Bayesian statistics, named for Thomas Bayes, encourages looking at the world through Bayesian probabilities, which is simply the act of assessing the chance of an event based on both the evidence in front of you and your prior expectation of said event occurring. When new data comes in, you update your prior to fit it. Sounds intuitive? Bayes reportedly thought so little of his findings that he didn’t even find them worth publishing. Imagine you test positive for a rare disease. Your doctor tells you that the test is correct 99% of the time. So how likely do you think it is that you have the disease? This depends on your prior, which is how likely you thought it was that you had the disease in the first place. If the disease is sufficiently rare, even a 1% error margin will amount to many people getting a false positive, so maybe you don’t need to be so worried. Of course, if you have a second, independent test that also turns up positive, you can be quite sure that you indeed have the disease. Your prior in this case is no longer the base rate of the rare disease (say 1/1000), but the chance of having the disease given that you have already tested positive for it once.
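The disease example can be worked through in a few lines. This is a minimal sketch under one assumption that the doctor’s “correct 99% of the time” leaves open: I take it to mean that the test catches 99% of the sick (sensitivity) and clears 99% of the healthy (specificity). The function name is my own.

```python
def posterior(prior, sensitivity, specificity):
    """P(disease | positive test) by Bayes' rule."""
    true_positive = sensitivity * prior
    false_positive = (1 - specificity) * (1 - prior)
    return true_positive / (true_positive + false_positive)


prior = 1 / 1000   # how rare the disease is, as in the text
accuracy = 0.99    # assumption: used as both sensitivity and specificity

# The posterior after the first test becomes the prior for the second,
# which is only valid if the two tests are independent.
after_first = posterior(prior, accuracy, accuracy)
after_second = posterior(after_first, accuracy, accuracy)

print(f"after one positive test:  {after_first:.0%}")   # roughly 9%
print(f"after two positive tests: {after_second:.0%}")  # roughly 91%
```

With these numbers a single positive result still leaves you far more likely to be healthy than sick, while a second independent positive flips the picture.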

Updating your probabilities for future events after an event happens seems like a no-brainer. As they say: once bitten, twice shy. However, this may lead us to premature conclusions. The most difficult part of prediction is figuring out your priors, which is hard to do post-event. This leads us back to our cognitive blind spots. After being hit by lightning you might never go out during a thunderstorm again, even though the chance of being hit is small. Your experience trumps your prior, and suddenly a one-off event defines how you interact with the world. If we keep in mind that events have a probability of happening, and that the occurrence of an event should only make us update our belief in that probability, we might all both make more accurate forecasts and keep the door open for a discussion about events, the future, and the truth.

So how did Silver’s FiveThirtyEight do in the 2016 election? Well, they missed the fact that Trump would win, but they gave him a much larger chance of winning than most other forecasters. As election day rolled around they had Trump at around a 30% chance of winning. This would imply him winning three out of every ten such elections, which is far from an assured loss. Having a grasp of probabilities and error margins, the team at FiveThirtyEight was not surprised by his victory. Understanding probabilities like these is something we do every day, even if we’re unaware of it. If you predicted that three out of every ten times you went outside, you would get hit by a car, you would probably start staying indoors.

But hey, what do I know? I’m no expert. And perhaps that’s for the best.