Stability AI has released DeepFloyd IF, a text-to-image model developed by its multimodal AI research lab, DeepFloyd. The company describes the model as state of the art and is aiming the release at research labs working on text-to-image generation.
DeepFloyd IF is currently available under a non-commercial, research-only license so that researchers can explore advanced text-to-image generation techniques. In line with its commitment to open-source innovation, Stability AI plans to release a fully open-source version of DeepFloyd IF in the future.
1) Prompt: a photo of a full size old rusty sign that says "Deep Floyd Street", photo realism, bokeh, 50mm cine lens, super sharp focus.
2) Prompt: film still photograph of redhead bearded Abraham Lincoln look alike starring in a live action documentary about the life of Vincent van Gogh produced by Netflix, 4k
3) Prompt: delicious burger painted in the style of starry night
The model comprises several neural modules that work synergistically to create high-resolution images in a cascading manner. It starts with a base model that generates low-resolution samples, which are then upsampled by successive super-resolution models to produce high-resolution images. The diffusion process is implemented at the pixel level, distinguishing it from latent diffusion models like Stable Diffusion.
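The cascading structure described above can be illustrated with a toy sketch. This is not the DeepFloyd IF implementation: the function names (`base_stage`, `upsample_stage`, `run_cascade`), the image sizes, and the upscaling factors are all assumptions chosen for illustration, and the "models" are stand-ins (random pixels and nearest-neighbour upscaling) rather than diffusion networks. The sketch only shows how a low-resolution sample is handed from one stage to the next, always in pixel space.

```python
import random


def base_stage(prompt, size, seed=0):
    """Stand-in for the base model: returns a size x size grayscale image.

    A real base model would run text-conditioned pixel diffusion here;
    this toy version just draws pseudo-random pixels seeded by the prompt.
    """
    rng = random.Random(f"{seed}:{prompt}")
    return [[rng.random() for _ in range(size)] for _ in range(size)]


def upsample_stage(image, factor):
    """Stand-in for a super-resolution model: nearest-neighbour upscaling."""
    out = []
    for row in image:
        # Repeat every pixel horizontally, then repeat the widened row vertically.
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def run_cascade(prompt, base_size=16, factors=(4, 4)):
    """Run the base stage, then each upsampling stage in order.

    Returns the final image and the side length after every stage.
    """
    image = base_stage(prompt, base_size)
    sizes = [len(image)]
    for factor in factors:
        image = upsample_stage(image, factor)
        sizes.append(len(image))
    return image, sizes


image, sizes = run_cascade("delicious burger painted in the style of starry night")
print(sizes)  # [16, 64, 256]
```

In a real cascade each upsampling stage is itself a learned model that adds detail rather than copying pixels. The point of the sketch is only the control flow: every stage consumes and produces an image in pixel space, with no latent encoder or decoder in between.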
DeepFloyd IF was trained on a custom high-quality LAION-A dataset containing 1 billion image-text pairs. This dataset, an aesthetic subset of the LAION-5B dataset, was obtained through deduplication, extra cleaning, and modifications.
DeepFloyd IF is initially released under a research license, and Stability AI intends to move to a more permissive license after gathering feedback. The company hopes DeepFloyd IF will inspire novel applications across various domains, including art, design, storytelling, virtual reality, and accessibility.
Researchers are encouraged to explore technical, academic, and ethical questions related to the model, such as optimizing performance, enhancing control over image generation, integrating multiple modalities, assessing interpretability, and addressing potential biases.
To access DeepFloyd IF's weights, accept the license on the model cards on DeepFloyd's Hugging Face page (https://huggingface.co/DeepFloyd).
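Once the license has been accepted, the weights can be loaded from Python. The following is a minimal sketch, not an official recipe: it assumes the `diffusers`, `torch`, and `huggingface_hub` packages are installed, that a GPU is available, and that the base-stage checkpoint is published under the name `DeepFloyd/IF-I-XL-v1.0` (check the model cards for the current names). The helper functions `load_stage_1` and `generate_base_image` are hypothetical names introduced here.

```python
# Assumed checkpoint name for the base stage; verify it on the model cards.
STAGE_1_ID = "DeepFloyd/IF-I-XL-v1.0"


def load_stage_1(token=None):
    """Log in to Hugging Face (if a token is given) and load the base stage."""
    # Imported lazily so this file can be read without the heavy dependencies.
    import torch
    from diffusers import DiffusionPipeline
    from huggingface_hub import login

    if token is not None:
        # The token must belong to an account that has accepted the license.
        login(token=token)

    pipe = DiffusionPipeline.from_pretrained(
        STAGE_1_ID, variant="fp16", torch_dtype=torch.float16
    )
    # Moves sub-models to the GPU only while they are in use, to save memory.
    pipe.enable_model_cpu_offload()
    return pipe


def generate_base_image(pipe, prompt):
    """Generate one low-resolution sample from the base stage only."""
    prompt_embeds, negative_embeds = pipe.encode_prompt(prompt)
    result = pipe(
        prompt_embeds=prompt_embeds,
        negative_prompt_embeds=negative_embeds,
        output_type="pil",
    )
    return result.images[0]
```

A typical call would be `pipe = load_stage_1(token="hf_...")` followed by `generate_base_image(pipe, "a photo of an old rusty sign")`. The result is only the low-resolution base sample; the super-resolution stages would have to be loaded and applied separately.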
For more information, visit the model's website (https://deepfloyd.ai/deepfloyd-if), access the model card and code on GitHub (https://github.com/deep-floyd/IF), or try the Gradio demo (https://huggingface.co/spaces/DeepFloyd/IF).
Join public discussions via https://linktr.ee/deepfloyd and send your feedback to deepfloyd@stability.ai.