Demo
PEEK: Picking Essential frames via Efficient Knowledge distillation
Upload a video and let PEEK select the most informative frames for captioning, without watching every frame. The model is distilled from vision-language teachers to quickly identify which frames matter.