(This year, I’m devoting some of my Research Round-Up posts to academic research papers relating to the use of artificial intelligence for marketing purposes. This post features an unpublished paper that compares the performance of AI-generated vs. human-made images across three marketing use cases.)
“The power of generative marketing: Can generative AI reach human-level visual marketing content?”
Authors – Jochen Hartmann and Yannick Exner, Technical University of Munich; Samuel Domdey, Technical University of Hamburg-Harburg
Date Written – July 12, 2023
This paper describes the results of three studies designed to evaluate the performance of AI-generated vs. human-made images used for marketing purposes. Specifically, the studies evaluated image performance across three dimensions relevant to marketing:
Human perception of image quality and realism
Social media engagement
Click-through rates of banner ads
The studies used AI-generated images created with 13 text-to-image diffusion models, including DALL-E2, Jasper, Midjourney v4, and several versions of Stable Diffusion. Altogether, these studies collected more than 17,000 human evaluations of over 1,500 AI-generated images.
All of the AI-generated images in these studies were created using a two-step process. In the first step, the researchers employed an image-to-text AI model to create a textual description of each human-made comparison image. These textual descriptions were then used (without modification) as the prompts to produce the AI-generated images.
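For readers who think in code, the two-step process described above can be sketched roughly as follows. This is only an illustration, not the researchers' implementation: the function name `generate_counterpart` and the two stand-in models are hypothetical, and a real setup would plug in an actual image-to-text model and an actual text-to-image model.

```python
from typing import Any, Callable, Tuple

Image = Any  # placeholder type; a real pipeline would use an actual image object


def generate_counterpart(
    human_image: Image,
    caption_model: Callable[[Image], str],
    image_model: Callable[[str], Image],
) -> Tuple[str, Image]:
    """Hypothetical helper mirroring the two steps described in the post."""
    # Step 1: an image-to-text model describes the human-made image.
    prompt = caption_model(human_image)
    # Step 2: the description is used, unmodified, as the text-to-image prompt.
    generated = image_model(prompt)
    return prompt, generated


# Stand-in models so the sketch runs without any AI service (assumptions only).
def fake_captioner(image: Image) -> str:
    return f"a photo described from {image}"


def fake_generator(prompt: str) -> Image:
    return f"<image from: {prompt}>"


prompt, generated = generate_counterpart(
    "human_photo.jpg", fake_captioner, fake_generator
)
print(prompt)
print(generated)
```

The point of the sketch is that no human touches the prompt between the two steps, which matters for how the results should be interpreted.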
Here are abbreviated descriptions of the three studies and the high-level results of each study.
Study 1 – Human Perception of Quality and Realism
The objective of this study was to compare the perceived quality and realism of AI-generated vs. human-made images across three marketing use cases – product design, social media, and print ads.
Each image was rated by five human evaluators for quality and realism using a 7-point Likert scale (1 = low, 7 = high), resulting in a total of 7,830 ratings.
The ratings for quality and realism varied depending on the specific image being evaluated and on the model used to create the AI-generated image. Overall, however, the study revealed that the AI-generated images outperformed or were on par with the human-made images in the product design and social media use cases.
In the print ad use case, the AI-generated images were significantly less likely to perform on par with the human-made images in terms of perceived quality and realism.
Again, the ratings varied significantly depending on the model used to create the AI-generated image. So, the choice of model matters.
Study 2 – Social Media Engagement
This study’s objective was to compare the ability of AI-generated images vs. a human-made image to produce engagement in a social media setting. In this study, engagement referred to the “likelihood to like” an image and the “likelihood to comment” on an image.
This study included one human-made image and 13 AI-generated images. The researchers recruited 701 participants who were randomly assigned to one of the 14 images. Each participant was asked to rate how likely they were to like or comment on an image using a 7-point Likert scale (1=low, 7=high).
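To make the design concrete, here is a minimal sketch of randomly assigning 701 participants to 14 image conditions. It assumes simple independent randomization with a fixed seed. The paper's actual assignment procedure isn't described in this post, and the function name `assign_participants` is hypothetical.

```python
import random
from collections import Counter
from typing import List


def assign_participants(
    n_participants: int, n_conditions: int, seed: int = 0
) -> List[int]:
    """Assign each participant to one condition index, uniformly at random."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.randrange(n_conditions) for _ in range(n_participants)]


# 701 participants, 14 images (1 human-made + 13 AI-generated), per the post.
assignments = assign_participants(701, 14)
counts = Counter(assignments)

for condition in range(14):
    print(f"image {condition + 1}: {counts[condition]} participants")
```

With 701 participants and 14 images, each image ends up with roughly 50 raters, so the per-image comparisons rest on fairly small groups.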
The results of this study showed that the AI-generated images generally performed on par with the human-made image in terms of social media engagement.
Study 3 – Click-Through Rates on Banner Ads
The objective of this study was to compare the effectiveness of AI-generated images vs. a human-made image when used in an online banner ad. The measure of effectiveness used was click-through rates (CTR).
This study was a randomized field experiment that consisted of a real-world online banner ad campaign run on a leading display advertising platform. The human-made image was a professional photo purchased from Adobe Stock. The campaign ran December 28-29, 2022, and generated 702 clicks on 86,809 impressions.
Of the 14 images tested, the human-made image ranked 10th in terms of CTR. The best-performing AI-generated image achieved a 21.5% higher CTR compared to the human-made image.
This study also demonstrated that model choice matters. The best-performing AI model (Stable Diffusion v1-3) outperformed the worst model (Disco Diffusion) by 65.5%.
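The arithmetic behind these figures is simple, and a short sketch may help. The campaign totals (702 clicks, 86,809 impressions) come from the study as reported above. The per-image CTR values in the second half are hypothetical numbers chosen only to show how a relative difference such as "21.5% higher" is calculated, since this post doesn't list the per-image rates.

```python
def ctr(clicks: int, impressions: int) -> float:
    """Click-through rate as a fraction: clicks divided by impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


def relative_lift(ctr_a: float, ctr_b: float) -> float:
    """How much higher ctr_a is than ctr_b, as a fraction of ctr_b."""
    if ctr_b <= 0:
        raise ValueError("baseline CTR must be positive")
    return (ctr_a - ctr_b) / ctr_b


# Overall campaign CTR, using the totals reported for the study.
overall = ctr(702, 86_809)
print(f"Overall CTR: {overall:.2%}")  # Overall CTR: 0.81%

# Hypothetical per-image CTRs (NOT from the paper), for illustration only.
hypothetical_human_ctr = 0.00800
hypothetical_best_ai_ctr = 0.00972
lift = relative_lift(hypothetical_best_ai_ctr, hypothetical_human_ctr)
print(f"Relative lift: {lift:.1%}")
```

Note that these are relative differences. At an overall CTR below 1%, a 21.5% lift is a change of a fraction of a percentage point in absolute terms.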
My Take
The three studies described in the Hartmann et al. paper demonstrate that generative AI models can create visual content that is on par with – and often better than – human-made images for a variety of marketing use cases.
If anything, these studies probably underestimate the ability of generative AI models to produce human-level visual content. The prompts used to create the AI images for these studies were produced by an image-to-text AI model, and the researchers didn’t modify those prompts. Prompts engineered by experienced marketers would likely have resulted in more effective AI images.
These studies also probably underestimate the quality of images generative AI models can currently produce because new, more capable versions of some of the models used in the studies have been released since the studies were conducted. For example, these studies used DALL-E2 and Midjourney v4, but DALL-E3 and Midjourney v6 are now available.
At minimum, the results of these studies suggest that AI-generated images are likely to play an increasingly important role in marketing.