Meaning
A/A testing is a statistical test that compares two identical experiences served to randomly assigned sets of users. Just as in an A/B test, traffic is split equally between the two variations; however, the objective of an A/A test is to validate that the statistical engine reports no difference in conversion rates between the two experiences.
The core logic is that because an identical experience is served to each user group, the expected KPI (Key Performance Indicator) will be the same for each group. For example, if 10% of your website visitors fill out a survey on the landing page, we can expect a similar conversion rate from another group of visitors who receive an identical version of the landing page.
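As a quick illustration (a standalone sketch, not tied to any particular testing tool), the Python snippet below simulates two visitor groups that share the same underlying 10% conversion rate. Both observed rates land near 10%, and the small gap between them is pure sampling noise.

```python
import random

random.seed(42)
TRUE_RATE = 0.10      # both groups share the same underlying conversion rate
VISITORS = 10_000     # visitors per group

def simulate_group(n, rate):
    """Count conversions for n visitors, each converting independently at `rate`."""
    return sum(random.random() < rate for _ in range(n))

conv_a = simulate_group(VISITORS, TRUE_RATE)
conv_b = simulate_group(VISITORS, TRUE_RATE)

print(f"Group A conversion rate: {conv_a / VISITORS:.3%}")
print(f"Group B conversion rate: {conv_b / VISITORS:.3%}")
# Both rates hover around 10%; the small gap is sampling noise, not a real difference.
```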
Importance of A/A testing
But what would you accomplish by running an A/A test when both variations are identical?
An A/A test provides a framework to sanity-check the tool that will eventually be used to run A/B tests. Organisations typically run an A/A test in the following circumstances –
- To check whether the A/B testing software is configured correctly
- The intended result of an A/A test is that the audience reacts similarly to the same piece of content. If the platform nevertheless returns a winner, the software should be evaluated, as the tool might have been misconfigured.
It is still possible to get a winner from an A/A test. If several A/A tests are run with a Probability to Beat the Best (P2BB) threshold of 95%, a variation can be expected to be declared a winner purely by random chance in about 5% of them (see the worked example below). In most cases, though, there will be no winner.
- To obtain a baseline conversion rate for future A/B tests
- Suppose a company plans a new series of tests on a landing page. If, after running an A/A test, both variations converge to a similar conversion rate, the company can use that rate as a baseline and run future A/B tests aiming to exceed it.
In an A/A test, the expected result is inconclusive. However, it is still possible for an A/A test to declare a winner between two identical variations even when the tool is configured correctly.
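To put a number on that chance: assuming each A/A test independently has about a 5% probability of declaring a false winner at a 95% threshold (an illustrative assumption), the chance of seeing at least one false winner grows quickly with the number of tests run. A short sketch of the arithmetic:

```python
# Probability of at least one false winner across k independent A/A tests,
# assuming each test has a 5% false-positive chance (95% threshold).
per_test_false_positive = 0.05

for k in (1, 5, 10, 20):
    at_least_one = 1 - (1 - per_test_false_positive) ** k
    print(f"{k:>2} A/A tests -> P(at least one false winner) = {at_least_one:.1%}")
#  1 A/A test  ->  5.0%
#  5 A/A tests -> 22.6%
# 10 A/A tests -> 40.1%
# 20 A/A tests -> 64.2%
```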
Why would an A/A test declare a winner?
The following are the main reasons an A/A test can produce a winner –
- It is all about probability – P2BB is estimated from data, and data are inherently variable. In a short time window, one variation can take a strong lead purely by chance, producing an extreme P2BB even though there is no real difference. Every A/B testing methodology will occasionally declare a result when no difference exists.
- Large data volumes and small differences – In digital experiments with large sample sizes, even minute differences in KPIs can produce an extreme P2BB. The analyst needs to judge whether the detected uplift has any real value to the bottom line of the business.
- Constantly peeking at data – In the new age of digital experimentation, we want faster results so that good ideas can go live sooner and lift engagement. With the intent of ending the test early, analysts often keep peeking at the results and stop the test as soon as a winner is declared. This continuous peeking can end the test prematurely and invalidate it altogether; with constant data peeking, an A/A test is almost guaranteed to declare a result at some point during its run.
Only after the test has been given sufficient time will an uplift that occurred by random chance, and briefly produced a winner, decay away. The control and the variation may perform differently at different points in time, but neither will remain a declared winner indefinitely.
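This decay can be seen in a simple simulation. The sketch below (purely illustrative, with made-up parameters) tracks the cumulative conversion-rate gap between two identical variations at a few checkpoints where an analyst might peek; an early lead produced by randomness typically shrinks as more visitors accumulate.

```python
import random

random.seed(7)
TRUE_RATE = 0.10
TOTAL_VISITORS = 50_000
CHECKPOINTS = (500, 2_000, 10_000, 50_000)  # points where an analyst might "peek"

conv_a = conv_b = 0
for n in range(1, TOTAL_VISITORS + 1):
    conv_a += random.random() < TRUE_RATE
    conv_b += random.random() < TRUE_RATE
    if n in CHECKPOINTS:
        gap = conv_a / n - conv_b / n
        print(f"After {n:>6} visitors per arm: gap = {gap:+.3%}")
# Early gaps of a few tenths of a percentage point are common by chance alone;
# by the final checkpoint the gap is usually much closer to zero.
```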
Best practices to run an A/A test
If your organisation plans to run an A/A test, be aware that the test might temporarily declare a winner, depending on how long it has been running. As the test accumulates more visitors, the difference is likely to diminish.
Decide on a sample size before the experiment, for example using the VWO sample size calculator. Then allow the test adequate time and visitors before drawing conclusions, keeping in mind that, depending on the P2BB threshold, there is still a chance a variation will be declared a winner.
To check the validity of the software platform, you may need to run several A/A tests on a fixed visitor set and observe the false positive rate you get for the configured P2BB threshold. If the observed false positive rate stays within the expected level for that threshold, the platform is working fine.
As it is rarely practical to run many A/A tests on real traffic, it is a good idea to do this analysis through simulation: define a data-generating process and run the statistical test on the simulated data to check how often it reports significance.
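If you want to see the arithmetic behind such a calculator, the textbook two-proportion sample-size formula can be sketched as follows; this is a generic approximation, not necessarily the exact method the VWO calculator uses.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variation(baseline_rate, min_detectable_effect,
                              alpha=0.05, power=0.80):
    """Visitors needed per variation for a two-sided two-proportion z-test.

    baseline_rate: current conversion rate, e.g. 0.10 for 10%
    min_detectable_effect: absolute uplift you want to detect, e.g. 0.01
    """
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_effect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for significance
    z_beta = NormalDist().inv_cdf(power)            # critical value for power
    pooled = (p1 + p2) / 2
    numerator = (z_alpha * (2 * pooled * (1 - pooled)) ** 0.5
                 + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return ceil(numerator / min_detectable_effect ** 2)

# Detecting a 1 percentage point uplift on a 10% baseline:
print(sample_size_per_variation(0.10, 0.01))  # roughly 14,750 visitors per variation
```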
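A minimal version of such a simulation might look like the sketch below, which uses a two-proportion z-test as the significance check (your platform may use a different statistic, such as P2BB, but the idea is the same): generate many A/A tests from an identical conversion rate and count how often a winner is declared.

```python
import random
from statistics import NormalDist

random.seed(0)
TRUE_RATE = 0.10   # identical conversion rate for both variations
VISITORS = 5_000   # visitors per variation per simulated test
RUNS = 2_000       # number of simulated A/A tests
ALPHA = 0.05       # corresponds to a 95% confidence threshold

def two_proportion_p_value(conv_a, conv_b, n):
    """Two-sided p-value for the difference between two conversion counts."""
    p_pool = (conv_a + conv_b) / (2 * n)
    se = (2 * p_pool * (1 - p_pool) / n) ** 0.5
    if se == 0:
        return 1.0
    z = (conv_a / n - conv_b / n) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

false_winners = 0
for _ in range(RUNS):
    conv_a = sum(random.random() < TRUE_RATE for _ in range(VISITORS))
    conv_b = sum(random.random() < TRUE_RATE for _ in range(VISITORS))
    if two_proportion_p_value(conv_a, conv_b, VISITORS) < ALPHA:
        false_winners += 1

print(f"Observed false positive rate: {false_winners / RUNS:.1%} (nominal: {ALPHA:.0%})")
# A correctly behaving test should land close to the nominal 5%.
```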
Testing with VWO
VWO is a leading A/B testing platform trusted by thousands of brands around the world including Domino’s, HBO, eBay, and Disney. It offers everything you need to create and run tests to increase revenue, combat cart abandonment, and build stellar digital experiences that convert, with minimum dependency on your developers. Start a 30-day all-inclusive free trial today to explore the features and capabilities of VWO.