OpenSORA

This repo is intended as an open discussion and implementation platform for technically reproducing video generative models with quality on par with Sora.

OpenAI Sora technical report summary:

The overall framework appears similar to WALT, but the report leaves out many details.
The main improvements seem to come from model scaling: the 16x compute model shows significantly better spatial and temporal quality.
The video compression network could be similar to MAGVIT2 but with a higher temporal compression ratio. The number of frames could be very high, as suggested by a previous Google paper.
Causal attention is probably used for temporal modeling, since it supports both image and video data.
Learning at high resolution could also benefit model performance.
Re-captioning is important for text understanding.
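
The point about causal attention handling both images and videos can be illustrated with a small sketch. This is not taken from the Sora report; it is a minimal single-head example in plain Python, and the function name `causal_temporal_attention` and the toy per-frame features are hypothetical. It assumes attention runs only over the frame axis and that frame t may attend to frames 0..t. Under that assumption the first frame only ever sees itself, so a single image behaves like a one-frame video.

```python
import math

def causal_temporal_attention(q, k, v):
    """Single-head attention over the frame axis with a causal mask.

    q, k, v: lists of T vectors (one per frame), each a list of D floats.
    Frame t attends only to frames 0..t (hypothetical toy setup).
    """
    T, D = len(q), len(q[0])
    out = []
    for t in range(T):
        # Scaled dot-product scores against the current and earlier frames only.
        scores = [
            sum(a * b for a, b in zip(q[t], k[s])) / math.sqrt(D)
            for s in range(t + 1)
        ]
        # Numerically stable softmax over the visible frames.
        m = max(scores)
        weights = [math.exp(x - m) for x in scores]
        z = sum(weights)
        # Weighted sum of the visible value vectors.
        out.append([
            sum(w / z * v[s][d] for s, w in enumerate(weights))
            for d in range(D)
        ])
    return out

# Hypothetical toy features: 3 frames, 2 channels per frame.
video = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
image = video[:1]  # a single image treated as a one-frame video

video_out = causal_temporal_attention(video, video, video)
image_out = causal_temporal_attention(image, image, image)
```

In this sketch `video_out[0]` equals `image_out[0]`, because the first frame's output does not depend on later frames. That property is what would let the same weights be trained jointly on image and video data.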

zhoudaquan/OpenSORA
