Google has launched a new artificial intelligence tool called “Whisk,” which allows users to upload images and create combined, AI-generated images without the need for textual descriptions. Users can input photos depicting different subjects, settings, and styles, and Whisk will merge these elements into a single, cohesive image. The tool is designed to offer quick inspiration and creative exploration, making it an ideal option for users looking to experiment with visuals in a fun, engaging way, rather than producing professional-quality images.
Unlike traditional image editors, Whisk is positioned as a “creative tool” for rapid visual exploration rather than pixel-perfect editing, which reflects Google’s intent for it to be a lighthearted feature rather than a refined tool for professional work. Users can remix the final image by adjusting inputs or mixing categories, for example turning the result into something resembling a plushie toy, an enamel pin, or a sticker. Text is optional: users can add it to influence certain details, but it is not needed to generate a unique image.
Whisk builds upon the growing trend of AI-generated art, which gained significant attention following OpenAI’s launch of DALL-E in 2021. While DALL-E created images based on text prompts, Whisk is an image-to-image generator, taking the concept a step further by allowing users to combine various visual elements. The tool uses Google’s AI platform, Gemini, which debuted in December 2023, along with Imagen 3, a text-to-image generator developed by DeepMind, Google’s AI lab. This combination enhances Whisk’s ability to generate creative visuals.
When users upload their images, Gemini generates a description or caption of each one, which is then fed into Imagen 3. Because only the description is passed along, the process captures the essence of an image rather than producing an exact replica, allowing for more flexible and creative remixes. As a result, the final image may differ from the original input in details such as height, hairstyle, or skin tone. This leaves room for artistic freedom, although the outcome may not precisely match the user’s expectations.
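The caption-then-generate flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google’s implementation: the names `RemixInputs`, `build_prompt`, `remix`, `caption_image`, and `generate_image` are hypothetical, and the two callables stand in for whatever Gemini and Imagen 3 actually do inside Whisk. The prompt wording is likewise invented.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RemixInputs:
    """The three image slots the article describes, plus optional text."""
    subject: bytes
    setting: bytes
    style: bytes
    extra_text: str = ""  # optional; text is not required to generate an image


def build_prompt(inputs: RemixInputs, caption_image: Callable[[bytes], str]) -> str:
    """Caption each image, then join the captions into one text prompt."""
    parts = [
        f"Subject: {caption_image(inputs.subject)}",
        f"Setting: {caption_image(inputs.setting)}",
        f"Style: {caption_image(inputs.style)}",
    ]
    if inputs.extra_text:
        parts.append(f"Details: {inputs.extra_text}")
    return ". ".join(parts)


def remix(
    inputs: RemixInputs,
    caption_image: Callable[[bytes], str],    # hypothetical stand-in for the captioning model
    generate_image: Callable[[str], bytes],   # hypothetical stand-in for the text-to-image model
) -> bytes:
    """Only the text prompt reaches the generator; the original pixels do not."""
    return generate_image(build_prompt(inputs, caption_image))
```

Because the generator in this sketch sees only captions, any detail the caption omits (an exact height, hairstyle, or skin tone) cannot be reproduced, which is consistent with the differences the article describes.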
Currently, Whisk is available as a website on Google Labs for users in the United States, and it remains in its early stages of development. Its release comes amid growing competition in the AI space, with rivals such as OpenAI introducing products of their own, including Sora, a text-to-video generator. Analysts see Whisk as a demonstration of Google’s strength in the AI race, with DeepMind’s advancements playing a crucial role in the company’s future technology offerings. Those offerings also include new products for 2025, such as an Android operating system developed in collaboration with Samsung and Qualcomm.