Let us set the stage for how and why we’ve been doing creative testing in a unique way. We test a lot of creative. In fact, we produce and test more than 100,000 videos and images yearly for our clients, and we’ve performed over 10,000 A/B and multivariate tests on Facebook and Google.
We focus on these verticals: gaming, e-commerce, entertainment, automotive, D2C, financial services, and lead generation. When we test, our goal is to compare new concepts vs. the winning video (control) to see if the challenger can outperform the champion. Why? If you can’t outperform the best ad in a portfolio, you will lose money running the second- or third-place ads.
While we have not tested our process beyond the aforementioned verticals, we have managed over $3 billion in paid social ad spend and want to share what we’ve learned. Our testing process has been architected to save both time and money by killing losing creatives quickly and significantly reducing non-converting spend. The trade-off is that our process will generate both false negatives and false positives. We typically allow our tests to run for 2 to 7 days, which provides enough time to gather data without the capital and time required to reach statistical significance (StatSig). We always run our tests using our software AdRules via the Facebook API.
To be clear, our process is not the Facebook best practice of running a split test and allowing the algorithm to reach statistical significance (StatSig), which then moves the ad set out of the learning phase and into the optimized phase. The insights we’ve drawn are specific to the scenarios we outline here and are not a representation of how all testing on Facebook’s platform operates. In some cases, it is valuable to have old creative retain learning so that you can A/B test seamlessly without obstructing ad delivery.
Let’s take a closer look at the cost aspect of creative testing.
In classic testing, you need a 95% confidence level to declare a winner, exit the learning phase, and reach StatSig. That’s nice to have, but getting to 95% confidence for in-app purchases may end up costing you $20,000 per creative variation.
As an example, to reach a 95% confidence level, you’ll need about 100 purchases. With a 1% purchase rate (which is typical for gaming apps) and a $200 cost per purchase, you’ll end up spending $20,000 for each variation (100 purchases × $200) in order to accrue enough data for that 95% confidence level. There aren’t a lot of advertisers who can afford to spend $20,000 per variation, especially if 95% of new creative fails to beat the control.
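The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration, not part of our tooling: the function names are hypothetical, and reading the 1% purchase rate as purchases per install is an assumption.

```python
def cost_per_variation(required_conversions: int, cost_per_conversion: float) -> float:
    """Spend needed for one creative variation to accrue the required conversions."""
    if required_conversions < 0 or cost_per_conversion < 0:
        raise ValueError("inputs must be non-negative")
    return required_conversions * cost_per_conversion


def installs_needed(required_purchases: int, purchase_rate: float) -> float:
    """Installs needed to yield the required purchases.

    Assumption: purchase_rate is purchases per install (0.01 means 1%).
    """
    if not 0 < purchase_rate <= 1:
        raise ValueError("purchase_rate must be in (0, 1]")
    return required_purchases / purchase_rate


# Figures from the example: about 100 purchases at $200 per purchase.
spend = cost_per_variation(required_conversions=100, cost_per_conversion=200.0)
installs = installs_needed(required_purchases=100, purchase_rate=0.01)

print(f"Spend per variation: ${spend:,.0f}")
print(f"Installs needed at a 1% purchase rate: {installs:,.0f}")
```

Multiplying the per-variation spend by the number of concepts in a test round shows how quickly a purchase-optimized testing program gets expensive.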
What we do is move the conversion event we’re targeting further up the sales funnel. For mobile apps, instead of optimizing for purchases, we optimize for installs per mille (IPM), that is, installs per thousand impressions. For websites, we’d optimize for an impression-to-top-funnel conversion rate. Again, this is not a Facebook-recommended best practice; this is our own voodoo magic/secret sauce that we’re brewing.
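For readers unfamiliar with the metric, here is a minimal sketch of how IPM could be computed and compared between a control and a challenger. The function names and all of the numbers are hypothetical, chosen only for illustration; they are not results from our tests, and this is not how AdRules computes anything.

```python
def ipm(installs: int, impressions: int) -> float:
    """Installs per mille: installs per 1,000 impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return 1000 * installs / impressions


def relative_lift(challenger_ipm: float, control_ipm: float) -> float:
    """Fractional improvement of the challenger over the control."""
    if control_ipm <= 0:
        raise ValueError("control_ipm must be positive")
    return (challenger_ipm - control_ipm) / control_ipm


# Hypothetical numbers for illustration only.
control_ipm = ipm(installs=120, impressions=40_000)
challenger_ipm = ipm(installs=90, impressions=25_000)
lift = relative_lift(challenger_ipm, control_ipm)

print(f"Control IPM: {control_ipm:.2f}")
print(f"Challenger IPM: {challenger_ipm:.2f}")
print(f"Relative lift: {lift:.1%}")
```

A higher IPM for the challenger is only a top-funnel signal. It does not by itself show that the ad wins on purchases or ROAS.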
A concern with our process is that ads with high CTRs and high conversion rates for top-funnel events may not be true winners for down-funnel conversions and ROI / ROAS. But while there is a risk of identifying false positives and negatives with this method, we’d rather take that risk than spend the time and expense of optimizing for StatSig bottom-funnel metrics.
To us, it is more efficient to optimize for IPM than for purchases. Most importantly, it means you can run tests for less money per variation because you are optimizing toward installs rather than purchases. For many advertisers, that alone can make more testing financially viable. A testing cost of $200 per variation versus $20,000 per variation can mean the difference between being able to do a couple of tests and having an ongoing, robust testing program.
We don’t just test a lot of new creative ideas. We also test our creative testing methodology. That might sound a little “meta,” but it’s essential for us to validate and challenge our assumptions and results. When we choose a winning ad out of a pack of competing ads, we’d like to know that we’ve made a good decision.
Because the outcomes of our tests have consequences, sometimes big ones, we test our testing process. We question our testing methodology and the assumptions that shape it. When four out of five new concepts don’t test well, our entire team reacts by killing the losing concepts and pivoting the creative strategy based on those results to try other ideas.