4 common mistakes in A/B testing that only a handful of testers know about
When you work on conversion rate optimization for your website, A/B testing is the first thing that comes to your aid. It is one of the most accessible and popular tools known to marketers. While many businesses have seen increased conversions with it, others have used it only to fall into false positives and false negatives through wrong practices.
The concern is that even the tiniest mistakes made during A/B testing can produce misleading results. Such results are inconclusive and will only confuse you. To combat this, I have outlined 4 common A/B testing mistakes, along with tips to avoid them, to help you run a successful testing campaign with the least chance of error. So, let’s hop into it.
4 common mistakes in A/B testing and how to tackle them
Mistake #1: The faulty testing tool
The popularity of A/B testing has led developers across the globe to release their own versions of the tool. Each has some things in common with the others and some features that are unique. This has also made some low-cost software available for testers to choose from.
While some tools differ significantly in their approach, others are tricky enough to hide even their basic shortcomings from you.
For example, some testing software can slow down your site. And did you know that a one-second increase in your website’s loading time can reduce your conversion rate by 7%?
A faulty A/B testing tool can not only hurt your SEO but also mislead you with the results it shows. You would be set up for a failed A/B test before you even get started, without knowing it.
So, how would you tackle this situation? How would you know that you are not using a faulty testing tool?
The solution: Run an A/A test
Before you actually start A/B testing, use your tool to run an A/A test. It might sound silly, because in an A/A test both test groups are shown the same page. That is, you are testing a page against itself. Silly, but a useful way to identify any negative effect the tool has on the site.
It is a one-time process. You don’t need to act on the results, but do look at them carefully. If the conversion rates of the two identical pages differ far more than chance would explain, your tool is faulty. Moreover, if you notice conversions drop as soon as you start the test, your testing tool is probably slowing down your site. In either case, it’s time to drop the faulty software and switch to a better A/B testing tool.
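To judge whether two A/A conversion rates differ more than chance would explain, one option is a standard two-proportion z-test. The sketch below is only an illustration: the function name and the visitor counts are made up for this example and do not come from any particular testing tool.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both groups truly convert the same
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

if __name__ == "__main__":
    # Hypothetical A/A numbers: both groups saw the same page
    z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=108, n_b=1000)
    print(f"z = {z:.2f}, p = {p:.3f}")
```

In an A/A test a large p-value is what you hope to see. A very small one (say, below 0.05) on identical pages is a hint that the tool, or the way it splits traffic, deserves a closer look, although one in twenty healthy A/A tests will cross that line by chance alone.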
Mistake #2: Getting excited by early results and stopping the test prematurely
When you run an A/B test, stopping it prematurely is the biggest mistake you can make. Right after noticing the first sign of significance, you will be tempted to stop the experiment and adopt the suggested change. This is a mistake.
The issue here is false positives. A false positive is a result that shows a difference between the pages when there is none, presenting low-confidence data as if it were true. The more often you check your results, the more likely you are to run into one.
If you do not stay calm and stop your test at the first sign of significance, you will more likely end up with a deceptive false positive, which can lead you to adopt a change bad enough to kill your conversions.
An analysis by Heap shows why ending early is a mistake: the “Simulated False Positive Rate” rose significantly the more often they checked the results.
| Number of checks | Simulated False Positive Rate |
|---|---|
| 1, at the end (like we’re supposed to) | 5.0% |
| 2 (every 500 visitors) | 8.4% |
| 5 (every 200 visitors) | 14.3% |
| 10 (every 100 visitors) | 19.5% |
| 20 (every 50 visitors) | 25.5% |
| 100 (every 10 visitors) | 40.1% |
| 1000 (every visitor) | 63.5% |
Information courtesy: Heap
This means that if you check your A/B test after every visitor and stop it as soon as it hits significance, the false positive rate will be over 60% in the above case. That’s not good! In such cases, even the worst variation has a pretty good chance of winning.
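You can see the same effect with a small simulation of your own. The sketch below is not Heap’s code and will not reproduce their exact figures. It assumes two identical variations with a 10% conversion rate, 1,000 visitors each, and a z-test at the 5% level; all names and parameters are chosen for illustration only.

```python
import math
import random

def is_significant(conv_a, conv_b, n, z_crit=1.96):
    """Two-proportion z-test for equal group sizes n; True if |z| exceeds z_crit."""
    pooled = (conv_a + conv_b) / (2 * n)
    if pooled <= 0 or pooled >= 1:
        return False  # no variance observed yet, so nothing to test
    std_err = math.sqrt(pooled * (1 - pooled) * 2 / n)
    return abs(conv_a - conv_b) / n / std_err > z_crit

def false_positive_rate(checks, visitors=1000, rate=0.1, runs=2000, seed=0):
    """Share of A/A runs declared 'significant' when stopping at the first significant check."""
    rng = random.Random(seed)
    step = max(1, visitors // checks)  # visitors per group between two checks
    false_positives = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        for i in range(1, visitors + 1):
            # Both groups convert at the same true rate, so any "winner" is a false positive
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            if i % step == 0 and is_significant(conv_a, conv_b, i):
                false_positives += 1
                break  # the impatient tester stops here
    return false_positives / runs

if __name__ == "__main__":
    for checks in (1, 10, 100):
        print(checks, "checks:", false_positive_rate(checks, runs=500))
```

The exact percentages depend on the assumptions above, but the pattern should match the table: the more often you peek and allow yourself to stop, the higher the share of false winners.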
How would you tackle this?
The solution: Don’t stop the test early and stick to a fixed sample size
The fix for false positives is simple: do not stop the test early. Let it run to completion and only then determine whether the results are significant. Moreover, fix your sample size before running the test and stick to it until the test is complete.
The right audience and sample size for your test depend on various factors, such as your baseline conversion rate and the smallest improvement you want to detect. There are various tools you can use to calculate the minimum sample size. I like Optimizely’s tool.
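If you are curious what such calculators do under the hood, here is a minimal sketch based on the textbook sample size formula for comparing two proportions. It is not necessarily the method any particular tool uses, and the function name, the 5% significance level and the 80% power are assumptions made for this example.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline, relative_lift, alpha=0.05, power=0.8):
    """Visitors needed per variation to detect a relative lift over the baseline rate."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)

if __name__ == "__main__":
    # Hypothetical inputs: 10% baseline conversion rate, 20% relative lift to detect
    print(sample_size_per_variant(baseline=0.10, relative_lift=0.20))
```

Notice how quickly the number grows as the lift you want to detect shrinks. That is exactly why the sample size has to be decided before the test starts rather than when the results begin to look good.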
Mistake #3: Missing the revenue-driving results in the shadow of conversions
It’s quite easy to forget the long-term business aspects and concentrate only on conversion rates while testing. Often, we overlook the revenue-driving results and focus only on the micro conversions that come very easily.
Meanwhile, we also tend to overlook that an increase in some specific conversions can have a negative impact on the long-term business, or no significant impact on revenue at all. These are vanity metrics: they lure you toward insignificant conversions and keep you from focusing on the conversions that actually matter.
For example, while testing a CTA button that leads to a landing page, you should not care only about the clicks that reach the landing page. Instead, check the leads generated from that page and tie those leads to the revenue-driving conversions they produce.
The solution: Set a hypothesis and validate it
Prior to running the A/B test, set a hypothesis you think would help you achieve a goal. Focus on your hypothesis and validate or disprove it on the basis of its ability to affect your ultimate goal. Do not get distracted by other associated figures which you might come across in the meantime.
For example, if your goal is to increase downloads on a page, always judge the success by measuring the downloads, not the clickthrough rates to the download page.
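The download example can be made concrete with a small sketch. The numbers below are invented purely for illustration, and the class and function names are hypothetical. The point is only that the "winner" can flip depending on which metric you judge by.

```python
from dataclasses import dataclass

@dataclass
class VariantStats:
    visitors: int
    clicks: int      # micro conversion: clicks through to the download page
    downloads: int   # the goal named in the hypothesis

    @property
    def click_rate(self):
        return self.clicks / self.visitors

    @property
    def download_rate(self):
        return self.downloads / self.visitors

def winner_by(metric, a, b):
    """Return 'A' or 'B' depending on which variation scores higher on the given metric."""
    return "A" if getattr(a, metric) >= getattr(b, metric) else "B"

# Made-up numbers: B attracts more clicks, but A produces more downloads
variant_a = VariantStats(visitors=1000, clicks=120, downloads=60)
variant_b = VariantStats(visitors=1000, clicks=180, downloads=45)

if __name__ == "__main__":
    print("By clickthrough rate:", winner_by("click_rate", variant_a, variant_b))
    print("By download rate:", winner_by("download_rate", variant_a, variant_b))
```

A real comparison would also need a significance check on the goal metric, but even this toy case shows why the hypothesis should name the metric before the test starts.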
Mistake #4: Small, random and incremental testing
Most A/B tests start with a random assumption, without any idea of how it would affect the bigger picture. It’s an approach that hardly considers what you are doing to improve the business.
Along the same lines, we often test only incremental changes that usually do not have a meaningful impact on the business. By doing so, we miss the larger opportunity while being lured by minuscule improvements. Large websites might get a significant improvement from small changes, but the same is not true for every website.
The solution: Go for the big changes occasionally
Some call it radical testing: large-scale changes are made to a website in the hope of a significant improvement in conversion rates. It’s like a game where you have to make bigger bets to win bigger prizes.
Be aware that while radical testing may promise a bigger gain, there is a high probability of failure too. It’s a high-risk, high-reward game. So, if your website has an established experience that you cannot afford to put at risk with frequent major changes, you may choose to go for radical testing only occasionally.
One more thing to note: when you make many changes at once, it becomes difficult to identify which elements are actually affecting conversions. So, make an informed choice, now that you are aware of both the upsides and the pitfalls of a radical approach.
Error-free testing is all about making sure your business is on the right path. Focus on macro conversions that actually bring in revenue. You can do this by concentrating on the core user experience and testing hypotheses aimed at improving the bigger picture of the business. Be rigorous and regular with your tests, and avoid falling into the traps of incremental and narrowly focused test cases.