High-quality video and audio generation with first- and last-frame conditioning. Optimized fp8 model for faster inference.
For best results, make the prompt as elaborate as possible.