Why We All Missed It
This morning I woke up ready to figure stuff out. Yesterday I said I didn’t know (well I publicly and colorfully used a four letter word to more appropriately describe my state of being). Today, my plan was to know more. So I walked my son to school, strapped on the headphones, turned up the One Direction (“Story of My Life” which is a great way to get a day going), and got on the subway with a renewed dedication to figuring it out.
Here’s what I know so far. We were wrong. Period. Those researchers hiding behind the margin of error are just kidding themselves. It is technically accurate to say that if you factor in the margin of error, the polls performed as they should. But to any of us who know this business, that is simply lying to yourself to make you feel like you weren’t as wrong as you really are. We were wrong.
Who is the we? Well in this situation, all of us who deal with data. That’s the pollsters and the analytics firms, the online experts, the robopolls/IVRs, the exits, and the old-fashioned live callers who had scads of data to review. And all the data pointed in one direction – a Hillary win. In some of the analytic reports that we all saw on places like 538 or the Upshot, it was a 90%+ probability. Margin of error doesn’t fix that.
There may have been a shy Trump vote phenomenon. There was probably not a hidden Trump vote in the sense of new people coming out to vote – I don’t think there will be much evidence of that after the votes are counted, given that turnout appears to be down and Trump will have gotten about the same number of votes that Mitt Romney did. But a shy Trump vote is very likely not the whole story.
In a quick look at polling in Congressional races, it appears that the polling also overestimated Democratic vote share in districts where Trump won (meaning error in predicting BOTH the Presidential and Congressional results). That means there was a TURNOUT modeling issue, particularly in the estimation of college versus non-college vote share. Hillary Clinton may have gotten 2.5+ million fewer votes than Barack Obama in 2012 – and if that is the case, our turnout models will have been off, and if you re-weight the data to that new turnout reality, I am betting we will all see different results. Polling as a science is highly reliant on past performance repeating itself – our models are all based on past behavior, and they are going to be off if history does not repeat.
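To make the re-weighting idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical – the group names, the toy sample, and the turnout shares are invented for illustration, not real 2016 numbers. The point is simply that if a sample over-represents college voters, weighting each respondent by the ratio of the true turnout share to the sampled share can shift the topline estimate noticeably.

```python
def reweight_topline(respondents, target_shares):
    """Weight each respondent by target_share / sample_share for their
    education group, then return the weighted vote share for candidate A."""
    # Sampled share of each group
    counts = {}
    for r in respondents:
        counts[r["group"]] = counts.get(r["group"], 0) + 1
    n = len(respondents)
    sample_shares = {g: c / n for g, c in counts.items()}

    weighted_a = 0.0
    total_weight = 0.0
    for r in respondents:
        w = target_shares[r["group"]] / sample_shares[r["group"]]
        total_weight += w
        if r["vote"] == "A":
            weighted_a += w
    return weighted_a / total_weight


# Toy sample of 100: college voters over-represented (60% of the sample)
# and breaking heavily for candidate A; non-college breaking for B.
sample = (
    [{"group": "college", "vote": "A"}] * 40
    + [{"group": "college", "vote": "B"}] * 20
    + [{"group": "non-college", "vote": "A"}] * 15
    + [{"group": "non-college", "vote": "B"}] * 25
)

unweighted = sum(1 for r in sample if r["vote"] == "A") / len(sample)
# Suppose the real electorate was only 40% college, 60% non-college.
reweighted = reweight_topline(sample, {"college": 0.40, "non-college": 0.60})

print(f"unweighted A share:  {unweighted:.3f}")   # 0.550
print(f"re-weighted A share: {reweighted:.3f}")   # 0.492
```

With these made-up numbers, candidate A drops from a 55% topline to about 49% once the sample is weighted to the assumed electorate – the kind of shift that turns a "safe" lead into a loss, which is why getting the college versus non-college turnout mix right matters so much.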
Ok, but let’s dig deeper. We know that the Presidential data in states with many non-college white voters had a far higher error rate than others. But what about the OTHER races in those states? How did the data perform? I did any number of races for Governor, Senate, Congress, and initiatives. And in many, the data was actually pretty good. In some cases, spot-on. So what does that tell us? It’s not all wrong and it’s not all useless.
We need to remind ourselves that polling as a predictive tool in general, and for predicting turnout in particular, has always been difficult. We are very good at using research tools to determine message, allowing us to test different paths for communication. We need to embrace new methodologies, particularly online tools that allow us to scrape the internet for conversations. As we all know, our social media feeds are filled with people who echo our own opinions – the echo chamber is self-reinforcing. Good tools that allow us to find out what everyone is saying, not just our own circles of influence, must be a part of the future research mix.
The challenge now is to figure out how to get at these phenomena going forward. I don’t quite know that answer yet. But that’s the error we need to focus on – the business and science of market research certainly have to meet that challenge, but there is plenty to show that the data can still be a useful tool.