OpenAI introduces Sora, an AI model for converting text to video

OpenAI is launching a new video generation model called Sora. The artificial intelligence company says Sora can create “realistic and imaginative scenes from text instructions.” The text-to-video model lets users generate realistic videos up to a minute long, all based on prompts they've written.

Sora is able to create “complex scenes with multiple characters, specific types of movement, and precise subject and background details,” according to an introductory OpenAI blog post. The company also notes that the model can understand how “objects exist in the physical world,” as well as “accurately interpret props and create compelling characters that express lifelike emotions.”

The model can also create a video from a still image, as well as fill in missing frames in an existing video or extend it. Demos included in the OpenAI blog post show an aerial view of California during the Gold Rush, a video that looks as if it were filmed from inside a Tokyo train, and more. Many have telltale signs of AI — such as a suspiciously shifting floor in a museum clip — and OpenAI says the model “may have difficulty accurately simulating the physics of a complex scene,” but overall the results are strikingly impressive.

A couple of years ago, text-to-image generators like Midjourney were at the forefront of models' ability to turn words into images. But more recently, text-to-video has improved at a remarkable pace: companies like Runway and Pika have shown impressive text-to-video models of their own, and Google's Lumiere is a major competitor to OpenAI in this space as well. Similar to Sora, Lumiere gives users text-to-video tools and also lets them create videos from a still image.


Sora is currently available only to “red teamers” who are assessing the model for potential harms and risks. OpenAI is also granting access to some visual artists, designers, and filmmakers for feedback. The company notes that the current model may not accurately simulate the physics of a complex scene and may not correctly interpret certain instances of cause and effect.
