Using AI To Generate Text-To-VIDEO! Things Are Moving Fast.
See this video of a horse drinking water? It was made with AI. So is this video of a UFO landing on Mars. This turtle swimming through the ocean? Also made with AI. See these four dancing funky things? You probably guessed it. Made with AI. As were all of these videos that you're seeing on the screen right now. All made with AI. This wolf howling at the moon? Made with AI. This monkey eating a pineapple? Made with AI. And this waterfall landscape image right here was also made with AI. Now, not all of these tools are publicly available to everybody yet. But in this video I'm going to talk to you about what's coming and what you can do right now with AI video. So let's dig in. Hey, what's up? Matt Wolfe here. We've heard a lot lately about AI art, and AI art is just exploding. But what about AI video? How close are we to being able to do text to video, where you type in a text prompt and it generates a really cool video? Well, as it turns out, we're really, really close. There are some tools in the works that are really close to what you get from a Stable Diffusion or a Midjourney, but with video. And there are tools that you can use right now that just make kind of cool videos. In this video I want to talk about what the current landscape of text to video is.
Make-A-Video from Meta
So to start I want to show you what's in the works first and where things are kind of headed. So first off Meta, the company behind Facebook, they have AI of their own and they have a product called Make a Video that they've been working on. Now this one is one that's not publicly available. You can't go and use it right now and generate your own videos. But it's getting close and they've shown some demos of what it's capable of. So let's take a quick peek here. You can see an example over here on the right that says a dog wearing a superhero outfit with red cape flying through the sky. And you can see the video that it generated off of this text prompt over here. And they do give a whole bunch of examples. You've got a teddy bear painting a portrait, a robot dancing in Times Square. These are the prompts that you would give Meta and this is the output that it would currently generate right now. You've got a cat watching a TV with a remote in hand. You know, you've got the cat here but it kind of looks like a humanish hand going on. So there's obviously quite a bit of work to go on some of this stuff. But you can get an idea of where it's headed. A fluffy baby sloth with an orange knitted hat trying to figure out a laptop, close up, highly detailed studio lighting, screen reflecting in its eyes. You got realistic, an artist's brush painting on canvas close up. Clownfish swimming through the coral. I mean that one's pretty impressive looking. A young couple walking in heavy rain. It almost looks like they're kind of conjoined twins or something like that. But you know, it's pretty impressive honestly where things are headed. I'm not going to claim that this is amazing video footage. I do want to show you like what is there right now and where it's going. A horse drinking water. That looks like a horse drinking water to me. Got a hyper realistic spaceship landing on Mars. 
An oil painting of a couple in formal evening wear going home, get caught in a heavy downpour with umbrellas. I mean look at that. That's some really cool stylistic art. There's a table by a window with sunlight streaming through illuminating a pile of books. An emoji of a baby panda wearing a red hat, blue gloves, green shirt, and blue pants. It can also do image to video. So you've got this image over here of a ship on the seas and then over on the right you can see how it animated that video. I mean it's definitely a little bit blurry and kind of hard to make out what's going on. There's somebody doing yoga and you can see in the video version they're stretching out on their pose. And here's a sea turtle. This is an image of a sea turtle just kind of static and look you can see it's moving its fin around and swimming through the water over here. You can take a pair of images and it'll blend them into a video. So here you've got you know two images of asteroids and it's blending them together in this footage over here. You've got this art right here. I'm not quite sure what art this actually is. Some sort of like abstract art but if you blend these two together you get this video that's over here. You've got two images of a couple walking holding hands with a child and you can see how it animated these two images together. You can use an existing video. You've got your astronaut floating around in space. This is the input video. And then here's four variations of astronauts floating around in space with different perspectives on the earth and different poses of the astronaut. Here's a funny furry creature thing dancing around and here's four variations that it created of a similar furry fuzzy pink and blue creature thing. Here you've got a picture of three bunnies eating grass. Over here it generated four different variations of bunnies eating grass. Again this one isn't one that's publicly available for people to use. 
You can sign up to get early access when it's available. If you remember, early on a lot of the text to image technology that was out there was pretty rough to start, and once it got publicly available and more people started using it, it got better and better and better. So this is the very, very early version of some of the text to video stuff. Now, not to be outdone by Meta, Google is also doing AI text to video stuff as well.
Imagen from Google
Also not publicly available but here's some example images. Up here on the left you can see the prompt was an umbrella on top of a spoon and that's the result.
Here you've got a cat eating food out of a bowl in the style of Van Gogh, and look at that result. I mean, in my opinion this is even better than what Meta has been working on. Up here, melting ice cream dripping down the cone. Thousands of fast brush strokes slowly forming the text "Imagen Video" on a light beige canvas, oil painting style, smooth animation. And look at that, it's even generating text into the video. Most text to image generators can't even do that right now. Hand lifts a cup. Once again you've got some funky hands, but we're used to that by now in text to image. Another one with a bunch of autumn leaves falling on a calm lake to form the text "Imagen Video", smooth. Look at that, it's forming the words. A video of the earth rotating in space. A swarm of bees flying around their hive. Drone fly through of a fast food restaurant on a dystopian alien planet. Teddy bear running in New York City. Drone fly through of a tropical jungle covered in snow. I mean, look at that. That looks really good.
I could see using that as b-roll in a video at some point. So this one's called Imagen Video and this one is being put out by Google, and once again this one is also one that is not publicly available. You can't quite get your hands on it yet, but this one looks like it's even taking what Meta's put together and going even a step further. It looks even better. Now if you want to learn more about how these tools work, you can go over to makeavideo.studio to learn about Meta's version. There are research papers that you could read to learn more about it. And for the Imagen one you can go to imagen.research.google/video, and they have an explanation of how it works right here as they walk you through the generation process. So as much as you want to learn about what's going on right now in the generative video space, there's plenty to study. Saying all that, I did mention that there are some tools that you can use right now. Now they're not true text to video like what you're seeing here. It's more like you would take a text to an image and then take that image and figure out some way to animate it. That's kind of what the existing technology that's available right now is. And let's explore some of that real quick. So just for quick reference I'm going to go to futuretools.io. I'm going to click on generative video, and there's a couple that I want to take a look at. I want to look at Genmo and I'll show you what that does. I want to look at Kaiber and I'll show you what that does. And I want to take a look at LeiaPix here. Now there are also a bunch of generative video tools that will take your text and turn it into a talking head. I actually covered that in a previous video, where I made my sort of fake Joe Rogan talking. So check that video out if you haven't already. I used this tool D-ID to make that. And that's a really cool generative video technology as well that makes essentially talking heads. But that's not what I want to show you in this one. 
I want to show you some of the other cool tech that's out there.
Let's start by taking a look at LeiaPix converter and I'll show you what this does. I'm going to go ahead and log in here and this one is free to use right now. And this one allows you to upload an image and then it will turn it into kind of a 3D animated image. So let's go ahead and pick an image here. Go ahead and grab this cool wolf wearing sunglasses here and we'll upload it and look what it did. You can see it's got this, it kind of 3D-ified it. It's making a little animation where it's spinning around and you can do some different animation lengths. I can make it six seconds which will make it look like it's moving a lot slower. Or I can make it one second where it just looks like it's going hyper fast and he's moving around a lot. I kind of like to go somewhere in the middle. And then for animation style, you can choose how it's sort of rotating the camera around him. So if I go horizontal, it just looks like he's looking back and forth and you can see how it kind of animates the head there. You've got wide circles. You've got just a standard circle, a tall circle, and then vertical. And you can kind of see how that changes the style of animation. You can change the amount of motion to less. So it's kind of very subtle animation, almost looks like it's breathing. You've got regular, you've got more, and then you can change the focus point. So I can make the focus point close. Then you can see it really sort of makes the background shift a lot. And then if I move it to center, it's kind of more of a centered thing where the front and background are moving fairly equally. And if I go far, then you can see the background doesn't move as much, but the face itself looks like it's moving quite a bit more. And then when you're ready to grab this little video here, you come over to share and this gives you a link where you can share it or you can save it as a GIF, an MP4, or one of these other file types here, and then click save. 
And then it's going to download this version of the video for me. And there you go. You can see it just downloaded here. Let's open it up in the folder. And we've got our animated wolf video here that I can use however I want. Now this isn't technically generative video. This is taking an image and kind of converting it into a 3D sort of cool video. But as far as I can tell, this is a free to use platform for now.
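The settings walked through above (animation style, amount of motion, focus point) can be pictured with a small sketch. This is not LeiaPix's actual code, which the video never shows. It is a minimal illustration that assumes the effect is a depth-based parallax: a camera moves along a small looping path, and each pixel shifts in proportion to how far its depth is from the focus point. The function names, the 0-to-1 depth scale, and the factor of 2 for the wide and tall circles are all made up for illustration.

```python
import math

def camera_offset(style, t, amount=1.0):
    """Return the (x, y) camera offset at time t (0 to 1 is one full loop).

    `style` mirrors the animation styles named in the video; the exact
    paths are assumptions, not LeiaPix's real ones.
    """
    angle = 2 * math.pi * t
    if style == "horizontal":
        return (amount * math.sin(angle), 0.0)
    if style == "vertical":
        return (0.0, amount * math.sin(angle))
    if style == "circle":
        return (amount * math.cos(angle), amount * math.sin(angle))
    if style == "wide circle":
        return (2 * amount * math.cos(angle), amount * math.sin(angle))
    if style == "tall circle":
        return (amount * math.cos(angle), 2 * amount * math.sin(angle))
    raise ValueError(f"unknown style: {style}")

def pixel_shift(depth, focus, offset):
    """Shift for one pixel. depth and focus run from 0 (near) to 1 (far).

    Pixels at the focus depth stay still; everything else moves more the
    further it is from the focus depth.
    """
    dx, dy = offset
    k = depth - focus
    return (k * dx, k * dy)

# Focus "close": the background (depth 1) moves, the face (depth 0) does not.
offset = camera_offset("horizontal", 0.25, amount=2.0)
print(pixel_shift(1.0, 0.0, offset), pixel_shift(0.0, 0.0, offset))
```

Under these assumptions, setting the focus to close (0) makes the background shift the most, setting it to far (1) makes the face shift the most, and center (0.5) moves both about equally in opposite directions, which matches what the demo shows.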
All right, so that was the first one I want to show you. Now the other two that I'm going to show you are actually more where you would type in a piece of text and it will generate a video from that text.
So let's take a peek. The first one I want to look at is called Genmo, which is a text to video platform where you literally type in what you want and it will generate a video based on what you want. And here's some examples of what it's generated. As you can see, it's not really like an animation. It's more a series of images that are all kind of blending together. Let's go ahead and create something real quick. You can actually see one that I generated once already, a wolf howling at the moon on a snowy mountain. So what it essentially seems to be doing is taking my prompt here, a wolf howling at the moon on a snowy mountain, and then it's using one of the generative art platforms, you know, a DALL-E or a Stable Diffusion. I'm not actually sure which one this uses, and it generates a whole bunch of images based on that prompt. And then it animates a lot of those images together. That's what it appears to be doing. So let's go ahead and create a new one here. A monkey wearing a top hat, eating a pineapple on the beach. And let's go ahead and generate and see what it comes up with. So it started by generating an image. This is our sort of first frame of the video here. And because I had this auto style selected, you can see it added additional styling to it. Let's go ahead and click on this here. A monkey wearing a top hat, eating a pineapple on the beach. And then it added warm shades, concept art, Kodak Ultra Max 800, architectural HD, insanely detailed center composition in pastel, solid color, and thick, uneven outline. So it added all of that extra stuff on its own because I had auto style selected. And so this is basically the first frame of the video. Let's go ahead and click customize. I could tweak the prompt a little bit if I want to. I can add some negative prompt here, change the length, the exploration, dial up the mayhem. Let's go ahead and dial up the mayhem. That sounds fun to me. 
Dynamism, how fast content changes over time. Let's go ahead and crank that up a little bit. Let's leave the smoothness at a hundred, leave seamless loop on. And let's go ahead and up the length a little bit and see what that does. And let's go ahead and leave the prompt the same and see what it generates for us. And here's what it came up with. You can see it's generated a whole bunch of different images and it's sort of morphing between all of the various images. But again, it's not that true generative video that you're going to get with the Make-A-Video or the Imagen Video from Meta and from Google, but it is taking a text and turning it into a cool kind of creative video. Now if I click on this here, it brings me to this page where I can see the prompt, the seed and all of the details. And then if I want to download it, I just click download, it comes to this page, and I can save as and just save this video. Now one area where this type of tool works really, really well is landscape type of video. So let's go like a green forest with a cliff side, flowing waterfall coming down the cliff into a stream that cuts a path through the forest. Let's go ahead and generate an image off that and see what we get. Now I'm not super happy with what it generated. I'm going to go ahead and go landscape here and see if we can get a wider image, and let's go ahead and generate another one. Still not happy with it. Let's generate another one. Now we're talking. Oh, I really like this image. That's what I was looking for. So let's go ahead and take this one here, this waterfall coming into this stream with this green landscape around it. Looks like it's straight out of the Lord of the Rings or something like that. Let's customize it here. I'm going to dial up the mayhem a little bit, bring down the length slightly so it renders a little bit faster, and let's go ahead and leave everything else the same and let's click make video. 
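The "morphing between images" effect described above can be pictured with a tiny sketch. This is only a guess at the general idea, not Genmo's actual method: it assumes a handful of generated keyframes and fills the gaps with a plain linear cross-fade. Frames here are flat lists of pixel values so the example has no dependencies, and the names crossfade, morph_sequence, steps, and loop are hypothetical (steps loosely stands in for a smoothness setting, loop for a seamless loop).

```python
def crossfade(frame_a, frame_b, steps):
    """Return `steps` in-between frames that blend frame_a into frame_b."""
    frames = []
    for i in range(1, steps + 1):
        w = i / (steps + 1)  # blend weight, strictly between 0 and 1
        frames.append([(1 - w) * a + w * b for a, b in zip(frame_a, frame_b)])
    return frames

def morph_sequence(keyframes, steps, loop=True):
    """Stitch keyframes into one clip with cross-fades between neighbors.

    With loop=True the last keyframe also fades back into the first,
    so the clip can repeat without a visible jump.
    """
    if not keyframes:
        return []
    pairs = list(zip(keyframes, keyframes[1:]))
    if loop and len(keyframes) > 1:
        pairs.append((keyframes[-1], keyframes[0]))
    clip = []
    for a, b in pairs:
        clip.append(list(a))
        clip.extend(crossfade(a, b, steps))
    if not loop or len(keyframes) == 1:
        clip.append(list(keyframes[-1]))
    return clip

# Two one-pixel "images": black (0.0) and white (1.0).
print(morph_sequence([[0.0], [1.0]], steps=1))
```

More in-between steps give a smoother, slower morph. A real tool would blend full images and very likely does something more sophisticated than a linear fade, so treat this as a picture of the idea only.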
And I think what we'll find is because it's got this landscape and this flowing water, it's really going to have this cool effect when it's done. Check this out. Just that motion and switching between the images sort of gives it this feeling of flowing. And if you look at the waterfall in the background, it actually kind of makes the waterfall look like it's moving, and the water in the foreground look like it's moving. This one seems to be free, and all you would do is click create a video and sign in with Google or sign in with Discord and you're in and you can start creating. I'm not sure if eventually it's going to be paid.
I have no idea, but as of right now you can get in and use it for free and generate videos like what you just saw. Now caveat, there is going to be a watermark on it, but it's still pretty cool. You can still generate some cool stuff and see where things are going with this. Now there is another one that supposedly acts fairly similar that I haven't played with yet, but I'm going to experience it right now for the first time on this video.
And this one's called Kaiber. And you can tell just by looking at the homepage that this one generates some cool sort of imagery as well. And here's some examples of videos that this one generated, where it looks like it kind of generates an image and then sort of does this like zoom in, zoom out effect on it to get the style that you see here. Now before we get into it, let's take a quick peek at the pricing, see how they're doing this here. So they have a free plan that gives you 50 credits, or approximately five videos, and it will have a Kaiber watermark on it. For $10 a month, it looks like you can get a thousand credits, which gets you about a hundred videos a month and no watermark. That's the yearly billing. Let's check out the monthly. So monthly is actually 15 bucks a month if you want to pay monthly. Let's take a peek at how this one works. So you describe what you're looking for. It'll generate a few style options for you to choose from, and then you can download the video. Let's go ahead and click direct my video and let's go ahead and log in and test this one out. You can see this is my first time using this, so I don't know what it's going to generate. I still have my 50 free credits here. Let's go ahead and create your first video.
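For a rough sense of what the pricing quoted above works out to, here is a quick back-of-the-envelope calculation. It only uses the numbers read out in the video (50 credits for about five videos, a thousand credits for about a hundred videos, $10 a month on yearly billing, $15 a month on monthly billing). It assumes the monthly-billed plan comes with the same thousand credits, which the video does not confirm, and the function names are made up for illustration.

```python
def credits_per_video(credits, videos):
    """Approximate number of credits a single video uses."""
    return credits / videos

def cost_per_video(price_per_month, videos_per_month):
    """Approximate dollar cost of one video on a paid plan."""
    return price_per_month / videos_per_month

free_rate = credits_per_video(50, 5)       # free plan
paid_rate = credits_per_video(1000, 100)   # paid plan
yearly = cost_per_video(10, 100)           # $10/month, billed yearly
monthly = cost_per_video(15, 100)          # $15/month, assuming the same credits

print(f"about {free_rate:.0f} credits per video on the free plan")
print(f"about {paid_rate:.0f} credits per video on the paid plan")
print(f"about ${yearly:.2f} per video billed yearly")
print(f"about ${monthly:.2f} per video billed monthly")
```

These are approximations, since the plan page itself only says "approximately", and the credit cost of an individual video can differ from the average.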
I want to create a video of select a subject in the style of select a style. All right, so let's select a subject here. Intricate machinery, beautiful sunset painting, secret garden, bustling shopping street. Let's try waterfall into a forest. You can see it's going to cost eight credits for this one and let's go ahead and select a style. We've got 3d rendering, anime, art deco, art nouveau, cartoon, cubism, gothic art, minimalism, pop art, impressionism, graffiti, classic realism. Let's try classic realism and generate preview frames here. It says it'll take around 30 seconds. Okay, so I'm really excited by these frames that it generated for me. It generated these four frames. So it's saying select the first frame in your video. Let's go ahead and select this fourth one over here and then click finalize video. And then it says it's going to take a few minutes of patience before. Wow. So I'm going to use the old snap the finger trick to speed up the process here. Here's what it generated. You can see it's kind of doing that zoom in and zoom out effect. But if you look at the water, it kind of gives it this feel of the water actually moving. So pretty cool stuff. And then I can come over here, download my video, and now I've got it available. Let's go ahead and play it back on my computer here. And I have access to this video now. Now nothing that's out right now is really true text to video. It's really kind of text to image to video. There is some really, really cool tech in the works that's coming out from companies like Meta and Google that I'm really, really excited to get my hands on and play around with because once that technology gets more and more advanced, just think about the creative video ideas we'll all be able to do in our marketing and for fun and as B roll and just so many cool things that we can do with it. I just, I love nerding out about this stuff. So I'm excited to see all of this stuff that's in the works and play with it even more. 
If you love nerding out over cool tech and cool tools and all of this AI stuff and all of this generative art that's happening right now, come check out Future Tools.
futuretools.io is the site where I curate all of the cool tools that I come across. I make them filterable and sortable, and you can sort by the ones that people have upvoted the most and like the most, or show only the free tools that are available on the site that don't cost any money to play with. As of right now, there's 490 tools in the database. And by the time you watch this video, there's probably even more, because I'm constantly coming across cool tools and sharing them on the website. So check out futuretools.io. And if you haven't already, make sure you subscribe to the newsletter. If you join the newsletter, then every Friday I'm going to send you my five favorite tools from the week. I look at a hundred plus tools every single week. And in my Friday newsletter, I give you the top five that I came across that week. I also share three interesting news articles in the area of AI. I share three cool YouTube videos and I share one cool way to make money using AI. So if you're not on that newsletter, join today and I'll send you the first newsletter this coming Friday. I hope you enjoyed this video. I hope you enjoy nerding out over cool AI tech as much as I do. If you do, please like this video and subscribe to this channel, because I'm going to keep making more videos and nerding out over cool tech tools. And as these new technologies emerge, I'm going to make videos about them and share them with you so that you can stay in the loop about all of this cool nerdiness that I'm into. So thanks so much for hanging out with me today and I'll see you guys in the next video.