A/B Split Testing – understanding the test results

This is the third post in the A/B testing series. You may want to read part 1 and part 2 first.

So you started your test, and after 10 visitors arrived at your site (5 in each group), 2 visitors from group B converted compared to just one from group A. The change improved the conversion rate by 100%, so it’s time to party.

Well, maybe it’s too early to party. Just 10 visitors isn’t much, and it’s possible those two customers landed in the same group by chance. How do we know the next 3 visitors to group A won’t buy and turn the results around?
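To get a feel for how easily chance alone produces a "winner" with so few visitors, here is a minimal Python sketch. It assumes both groups actually share the same true conversion rate (30% here, simply the pooled 3 out of 10 from the example, not a figure any tool reported) and asks how often group B would still show more conversions than group A. The function names are made up for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k conversions among n visitors at true rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_b_beats_a(n, p):
    """Chance that B shows strictly more conversions than A when both
    groups have n visitors and the SAME true conversion rate p."""
    total = 0.0
    for a in range(n + 1):
        for b in range(a + 1, n + 1):
            total += binom_pmf(a, n, p) * binom_pmf(b, n, p)
    return total

# Assumption: 5 visitors per group, identical true rate of 30% in both groups.
chance = prob_b_beats_a(n=5, p=0.3)
print(f"B looks better than A by pure luck about {chance:.0%} of the time")
```

Under these assumptions B "wins" by luck roughly one time in three, even though the change made no difference at all.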

What we need is some way to reliably predict the actions of a large number of people in the future based on the actions of a small number of people in the past – luckily we have such a tool – statistics.

Even more luckily our software will do all the math for us – but we do have to know how to read the results.

First there’s the confidence level. Remember, we are trying to predict the future: we can’t be absolutely certain of anything, but we can know how uncertain we are. Roughly speaking, the confidence level is the chance that the result we are seeing reflects the true result.

The minimum confidence level we can accept is 90%, with most people using 95%, because lower levels practically guarantee errors. If you run 5 tests at an 80% confidence level, on average one of them will show the wrong result; at 95% confidence, about one in twenty tests will be wrong. On the other hand, we can’t always demand a very high confidence level, because higher confidence needs more visitors. Each test then takes longer, so we get results later and can run fewer tests.
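The arithmetic behind those numbers can be sketched in a few lines of Python. This uses the same simplified reading as above (a test at confidence level c is wrong with probability 1 - c) and assumes the tests are independent of each other; the function names are only illustrative.

```python
def expected_wrong(tests, confidence):
    """Average number of misleading tests out of `tests` runs."""
    return tests * (1 - confidence)

def prob_at_least_one_wrong(tests, confidence):
    """Chance that at least one of `tests` independent runs misleads us."""
    return 1 - confidence**tests

for tests, confidence in [(5, 0.80), (20, 0.95)]:
    avg = expected_wrong(tests, confidence)
    risk = prob_at_least_one_wrong(tests, confidence)
    print(f"{tests} tests at {confidence:.0%}: "
          f"about {avg:.1f} wrong on average, "
          f"{risk:.0%} chance of at least one wrong")
```

Both cases average one wrong test, but at 95% it takes twenty tests to get there instead of five.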

Most tools will show us something like:

Original    2.5% +/- 1.5%
Alternative 3.9% +/- 1.5%    56% improvement

So what does this tell us? That the alternative is 56% better than the original? Not exactly.

This tells us that there is a 95% chance that the original conversion rate is somewhere in the range 1.0%-4.0% and the alternative conversion rate is in the range 2.4%-5.4%. In other words, it’s possible that the original conversion rate is actually 4% and the alternative’s is just 2.4%.
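If you want to check the ranges yourself, the calculation is just the reported rate plus and minus the margin. A small Python sketch using the numbers from the example output (the helper names are mine, not from any particular tool):

```python
def interval(rate, margin):
    """Turn 'rate +/- margin' (both in percent) into a (low, high) range."""
    return (rate - margin, rate + margin)

def overlaps(a, b):
    """True when two (low, high) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]

original = interval(2.5, 1.5)       # roughly 1.0% to 4.0%
alternative = interval(3.9, 1.5)    # roughly 2.4% to 5.4%
improvement = (3.9 - 2.5) / 2.5 * 100

print(f"Original:    {original[0]:.1f}% to {original[1]:.1f}%")
print(f"Alternative: {alternative[0]:.1f}% to {alternative[1]:.1f}%")
print(f"Headline improvement: {improvement:.0f}%")
print(f"Ranges overlap: {overlaps(original, alternative)}")
```

The ranges overlap between 2.4% and 4.0%, which is exactly why the 56% headline can’t be trusted yet.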

So, to summarize:

  • Anything with a confidence level below 90% doesn’t mean a thing and you should aim for at least 95% in most tests.
  • The result isn’t meaningful until the highest number in the losing option’s range is lower than the lowest number in the winning option’s range.
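To see how the second rule plays out as visitors accumulate, here is a rough Python sketch. It assumes the common normal approximation for the margin of error, which may not be what your testing tool uses, and the visitor and conversion counts are invented to match the 2.5% and 3.9% rates from the example.

```python
from math import sqrt
from statistics import NormalDist

def conversion_range(conversions, visitors, confidence=0.95):
    """Approximate (low, high) range for the true conversion rate,
    using the normal approximation (an assumption, see above)."""
    rate = conversions / visitors
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    margin = z * sqrt(rate * (1 - rate) / visitors)
    return (rate - margin, rate + margin)

def is_meaningful(loser, winner):
    """The rule from the summary: the top of the loser's range must be
    below the bottom of the winner's range."""
    return loser[1] < winner[0]

# Hypothetical counts: same rates, ten times more visitors the second time.
small = is_meaningful(conversion_range(25, 1000), conversion_range(39, 1000))
large = is_meaningful(conversion_range(250, 10000), conversion_range(390, 10000))

print(f"1,000 visitors per group:  meaningful = {small}")
print(f"10,000 visitors per group: meaningful = {large}")
```

With the same conversion rates, the ranges only separate once there are enough visitors, which is why a test needs time before you can call a winner.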

In the next post I’ll tell you what your first test should be.

posted @ Wednesday, March 3, 2010 4:53 PM
