Google Announces Veo 2, Imagen 3 Improved Version, and Whisk: Video and Image Generation; Visualize and Remix Ideas

By Sidharth Joseph Published on December 17, 2024, 09:48 IST Last updated December 17, 2024, 09:49 IST


Google, the California-based technology company owned by Alphabet Inc., has announced Veo 2, an improved version of Imagen 3, and Whisk. Together, the new services offer state-of-the-art video and image generation, along with a way to visualize and remix ideas using images and AI.

Here’s more about it.

Veo 2 and Imagen 3 Improved Version

Google’s video generation model was already considered among the most advanced when compared with its competitors, and Veo 2 brings further improvements. According to Google, Veo 2 understands the unique language of cinematography and can generate videos at resolutions up to 4K. Users can specify a genre, lens, and cinematic effects in the prompt so that the output matches what they have in mind. Moreover, video length has been extended to minutes, and the frequency of unwanted details has been reduced further. Generated videos also carry an invisible SynthID watermark, which helps identify them as AI-generated and is intended to reduce the risk of misinformation.

As for the improved version of Imagen 3, it can now generate brighter, better-composed images. It also renders a more diverse range of art styles with greater accuracy and follows prompts more faithfully.

Veo 2 and the improved Imagen 3 are now available in VideoFX and ImageFX, respectively. Google also plans to make both available to more users soon.

Whisk (Experiment)

Whisk is Google’s new generative AI experiment, now available to users in the US. Instead of writing prompts, users add images to generate the intended results, and they can remix those results afterward to bring the output closer to what they have in mind. Behind the scenes, Gemini writes a detailed caption for each image, and those captions are passed to Imagen 3 to generate the result. Users can provide one image each for the subject, scene, and style. Because Whisk is experimental, inconsistencies may occur.

As mentioned, Whisk is currently available only to users in the US. Google may introduce it to more users around the world at a later date.
