Turn your AI CLI into a Cinematic Video Studio
Flow Kit is an open-source agent + Chrome extension that drives Google Flow end-to-end. Reference images keep characters consistent, chain videos flow seamlessly, and YouTube publishing runs itself — all from a single /fk-* command.
MIT licensed · 34 skills · 40+ REST routes · Veo 3.1 · dual-orientation output
# Story → YouTube in one session
> /fk-create-project "The Lost City of Petra"
> /fk-pipeline # batch refs → images → videos, retries handled
> /fk-gen-narrator # TTS narration in 600+ languages
> /fk-concat-fit-narrator # trim + xfade to narrator timing
> /fk-youtube-upload # rule-validated publish
The Real Pain of Multi-Scene AI Video
You didn't get into video to wire APIs together. Flow Kit takes the glue work off your plate.
Characters Drift Between Scenes
Every scene you generate gives you a slightly different face, outfit, lighting. Your hero becomes a stranger by scene 6. Prompts can't carry identity — only reference images can.
Expired URLs & CAPTCHA Walls
GCS signed URLs expire in ~1 hour. Uploaded media_ids start returning 'not found'. reCAPTCHA blocks half your batch. Manually babysitting a Flow tab is not a workflow.
YouTube Publishing Is a Second Job
ffmpeg concats, xfade transitions, TTS at 48kHz, brand logo overlays, thumbnail variants, SEO tags, channel upload rules, token refresh — every step a different tool.
Flow Kit solves all three inside the worker — references per entity, auto URL recovery, one upload pipeline.
Flow Kit: Your AI Video Production Team
A Chrome extension captures the Flow bearer token and solves reCAPTCHA. A FastAPI agent orchestrates every generation. Your CLI just calls the skills.
Pre-built /fk-* workflows: refs, scene images, video chaining, TTS, music, thumbnails, YouTube upload — and a doctor that triages errors.
FastAPI surface covering projects, videos, scenes, characters, requests (with /batch), materials, models, reviews, and flow health.
Skills are CLI-agnostic markdown recipes. Claude Code, Codex CLI, and Gemini CLI all drive the same agent without modification.
Materials drive style project-wide: realistic, 3d_pixar, anime, stop_motion, minecraft, oil_painting, or register your own.
34 Skills That Do the Heavy Lifting
Every skill is a battle-tested workflow recipe. Invoke /fk-* in Claude Code, Codex CLI, or Gemini CLI — same skills, same results.
Core Pipeline (5)
/fk-create-project: Create Project
Interactive wizard that sets up a project with entities (characters, locations, assets), dual-orientation videos, and scenes with chain_type ROOT/CONTINUATION.
/fk-gen-refs: Generate References
Generate one reference image per entity (portrait for characters/assets, landscape for locations). Verifies that every response returns a UUID media_id before proceeding.
/fk-gen-images: Generate Scene Images
Generate frame-0 images for every scene with all referenced entities applied via imageInputs. Blocks if any required ref is missing a UUID media_id.
/fk-gen-videos: Generate Videos
Animate scene images into 8s video clips via Veo 3.1 (i2v). Polls until complete. Auto-appends voice_description and 'no background music' rules to prompts.
/fk-concat: Concat Final Video
Download every scene video, normalize with ffmpeg, and concatenate into a final cut. Preserves sound effects, no background music by default.
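As a rough illustration of the concat step, the sketch below writes an ffmpeg concat-demuxer list and builds the matching command line. It assumes the clips were already normalized to the same codec and resolution; the helper names and file paths are hypothetical and not taken from Flow Kit's code.

```python
def build_concat_list(clips: list[str]) -> str:
    """Return the text of an ffmpeg concat-demuxer list file."""
    lines = []
    for clip in clips:
        # A single quote inside a path is escaped as '\'' for the demuxer.
        escaped = clip.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def build_concat_command(list_path: str, output: str) -> list[str]:
    """Build (but do not run) a stream-copy concat command."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",  # -safe 0 allows absolute paths in the list
        "-i", list_path,
        "-c", "copy",                  # no re-encode; inputs must already match
        output,
    ]
```

Pass the returned list to `subprocess.run(..., check=True)` once the list file is on disk. Stream copy only works when every input shares codec parameters, which is why the normalize pass comes first.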
Advanced Video (6)
/fk-gen-chain-videos: Chain Videos (i2v_fl)
Generate videos with start+end frame chaining for smooth transitions between CONTINUATION scenes. Uses transition_prompt to control visual smoothness.
/fk-insert-scene: Insert Scene
Insert cutaways, close-ups, or multi-angle shots into an existing chain as INSERT children. Maintains ROOT/CONTINUATION integrity.
/fk-creative-mix: Creative Mix
Analyze your story and suggest mixed techniques: chain transitions, inserts, r2v intros, parallel multi-scene generation for cinematic impact.
/fk-review-video: Review Video
Claude Vision quality pass before upscale: flags artifacts, continuity breaks, motion glitches, and off-brief compositions for regeneration.
/fk-review-board: Review Board
Visual scene review board for bulk feedback. Flag scenes for REGENERATE_VIDEO or REGENERATE_IMAGE with targeted notes, batched in one pass.
/fk-add-material: Material Style
Register or switch visual material: realistic, 3d_pixar, anime, stop_motion, minecraft, oil_painting, or custom. Controls both image_prompt and scene_prefix.
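The ROOT/CONTINUATION linkage used by the chaining skills can be pictured as a simple grouping pass. This is a minimal sketch under assumptions: the `Scene` fields are illustrative and not Flow Kit's real schema, and INSERT children are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    # Field names are hypothetical, chosen only for this example.
    order: int
    chain_type: str  # "ROOT" or "CONTINUATION"


def group_chains(scenes: list[Scene]) -> list[list[Scene]]:
    """Split ordered scenes into chains; each ROOT opens a new chain."""
    chains: list[list[Scene]] = []
    for scene in sorted(scenes, key=lambda s: s.order):
        if scene.chain_type == "ROOT" or not chains:
            # A leading CONTINUATION with no ROOT before it also starts a chain.
            chains.append([scene])
        elif scene.chain_type == "CONTINUATION":
            chains[-1].append(scene)
        else:
            raise ValueError(f"unknown chain_type: {scene.chain_type}")
    return chains
```

Each resulting chain is the unit within which start+end frame chaining would apply: every CONTINUATION scene takes its start frame from the scene before it in the same list.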
TTS & Narration (6)
/fk-gen-tts-template: Voice Template
Create a reusable OmniVoice anchor WAV for consistent narration. Zero-shot voice cloning across 600+ languages. Re-used per scene at generation time.
/fk-gen-narrator: Generate Narrator
Auto-write per-scene narrator_text, then run TTS using the project voice template. Skips interview scenes and preserves original audio when flagged.
/fk-gen-text-overlays: Text Overlays
Extract dates, locations, stats and callouts from narrator text and schedule timed text overlays for the final cut.
/fk-concat-fit-narrator: Concat Fit Narrator
Trim every scene video to match its narrator TTS duration with xfade cross-dissolve on chains, burn text overlays, and concat at 48kHz audio.
/fk-gen-music: Generate Music
Generate a Suno chirp-v4 soundtrack matched to video mood. Uses sunoapi.org. Only applied when project.allow_music is enabled.
/fk-import-voice: Import Voice
Register an existing WAV as a project voice template without re-recording. Handy for reusing voice anchors from prior projects.
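To make the trim-and-xfade idea behind /fk-concat-fit-narrator concrete, here is a small sketch of the timing arithmetic. It assumes each clip has already been trimmed to its narrator duration and that every cross-dissolve uses the same fade length. The function names are hypothetical; `offset` is the option of ffmpeg's `xfade` filter that these values would feed.

```python
def xfade_offsets(durations: list[float], fade: float) -> list[float]:
    """Start time (seconds) of each cross-dissolve in the running output."""
    offsets = []
    elapsed = 0.0
    for duration in durations[:-1]:
        if duration <= fade:
            raise ValueError("each clip must be longer than the fade")
        # Every transition overlaps the next clip by `fade` seconds.
        elapsed += duration - fade
        offsets.append(round(elapsed, 3))
    return offsets


def total_duration(durations: list[float], fade: float) -> float:
    """Length of the joined video after all overlaps are subtracted."""
    overlaps = max(len(durations) - 1, 0)
    return round(sum(durations) - fade * overlaps, 3)
```

For three clips of 6 s, 5 s, and 7 s with a 0.5 s fade, the transitions start at 5.5 s and 10.0 s, and the result runs 17.0 s. The narration track has to be laid out against those shortened timings rather than the raw clip lengths.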
YouTube (4)
/fk-youtube-seo: YouTube SEO
Generate niche-aware title, description, hashtags, and tag list. Validates total tag-char budget (<=500 with quote overhead) to avoid invalidTags.
/fk-brand-logo: Brand Logo
Overlay channel logo intro/outro (220px or 4K badge from channel directory) on final video and thumbnails, covering the Veo watermark.
/fk-youtube-upload: YouTube Upload
Upload via YouTube Data API v3 with OAuth2. Auto-detects Shorts vs long-form, enforces per-channel max-per-day, min-gap, and avoid-hour rules.
/fk-thumbnail: Thumbnails
Generate 4 AI thumbnail variants optimized for CTR with branding and text overlays. Pulls channel design rules from channel_rules.json.
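The tag-character budget that /fk-youtube-seo validates can be approximated as follows. This sketch assumes the counting rule described in the YouTube Data API documentation for `snippet.tags`: commas between tags count toward the limit, and a tag containing a space is counted as if wrapped in quotation marks. The function names are hypothetical and the trimming strategy is only one possible choice.

```python
def tag_budget_used(tags: list[str]) -> int:
    """Characters counted against the tag limit."""
    total = 0
    for tag in tags:
        total += len(tag)
        if " " in tag:
            total += 2  # counted as though wrapped in quotation marks
    total += max(len(tags) - 1, 0)  # commas between list items
    return total


def fit_tags(tags: list[str], limit: int = 500) -> list[str]:
    """Keep tags in priority order, skipping any that would exceed the limit."""
    kept: list[str] = []
    for tag in tags:
        if tag_budget_used(kept + [tag]) <= limit:
            kept.append(tag)
    return kept
```

For example, `["petra", "lost city"]` uses 17 characters: 5 for the first tag, 9 plus 2 quote characters for the second, and 1 for the comma.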
Orchestration (5)
/fk-pipeline: Full Pipeline
Auto-detect current project state and run the remaining stages in order. Uses POST /api/requests/batch so the worker handles throttling (5 concurrent, 10s cooldown).
/fk-monitor: Pipeline Monitor
Poll project pipeline state, detect transitions, send desktop notifications, and optionally auto-download completed upscales.
/fk-status: Status Dashboard
Print a full project dashboard: scene progress, entity refs, pending requests, error history, and the recommended next skill to invoke.
/fk-switch-project: Switch Project
Switch the active project across the agent so subsequent skills target the right slug automatically. Reads/writes the active-project API.
/fk-dashboard: Live Dashboard
Render a live Flow Kit statusline in Claude Code so progress, failures, and retries are visible inline while you work.
Diagnostics (3)
/fk-doctor: Doctor
Auto-diagnose pipeline errors across Flow backend, Chrome extension, worker, and YouTube upload. Prescribes targeted fixes — invoke before guessing.
/fk-fix-uuids: Fix UUIDs
Find media_id values stuck in CAMS... form, extract the real UUID from fifeUrl, and backfill scenes + entities so downstream gens succeed.
/fk-refresh-urls: Refresh URLs
Recover expired GCS signed URLs for scenes and entity refs (roughly 1h TTL). Handles the 'Requested entity was not found' recovery path.
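The backfill that /fk-fix-uuids performs can be sketched with a plain regular expression. This is an illustration only: it assumes the UUID appears verbatim somewhere in the fifeUrl, and the example URL and function names are hypothetical, since the real URL layout is not documented here.

```python
import re
from typing import Optional

# Canonical 8-4-4-4-12 hex UUID.
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def needs_fix(media_id: str) -> bool:
    """True when media_id is not a bare UUID (for example a CAMS... value)."""
    return UUID_RE.fullmatch(media_id) is None


def uuid_from_url(url: str) -> Optional[str]:
    """Return the first UUID found in the URL, lowercased, or None."""
    match = UUID_RE.search(url)
    return match.group(0).lower() if match else None
```

A backfill pass would call `needs_fix` on each stored media_id and, where it returns True, replace the value with `uuid_from_url(fife_url)` when that is not None.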
Utilities (2)
/fk-change-model: Change Model
Inspect and switch video/image/upscale models per project via models.json. Downgrades automatically if your tier does not have access to the selected model.
/fk-upload-image: Upload Image
Upload a local PNG/JPG to Google Flow and receive a UUID media_id. Use for bespoke scene images, manual refs, or brand assets.
Reference (3)
/fk-camera-guide: Camera Guide
Reference for cinematic video prompts: camera angles, movements, lighting, depth of field, and shot-timing patterns for 8s clips.
/fk-thumbnail-guide: Thumbnail Guide
Design rules for CTR-optimized YouTube thumbnails: composition, color psychology, hook phrases, and required branding elements.
/fk-research: Research
Fact-check dates, names, and events before writing documentary content. Pulls citations and flags real-people bypass rules for safety filters.
From Story to YouTube in 6 Stages
Each command handles one stage. The worker batches, throttles, and retries for you — 5 concurrent requests, 10s cooldown, 5 retries with exponential backoff.
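The retry policy can be pictured with a short sketch. Only "5 retries with exponential backoff" comes from the description above; the 10-second base, the doubling factor, the 300-second cap, and both function names are assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int = 5, base: float = 10.0,
                   factor: float = 2.0, cap: float = 300.0) -> list[float]:
    """Delay before each retry: base, base*factor, ... capped at `cap`."""
    return [min(base * factor ** attempt, cap) for attempt in range(retries)]


def run_with_retries(task: Callable[[], T], delays: list[float],
                     sleep: Callable[[float], None] = time.sleep) -> T:
    """Call task once, then retry after each delay until it succeeds."""
    last_error: Exception = RuntimeError("task never ran")
    for delay in [0.0] + delays:
        if delay:
            sleep(delay)
        try:
            return task()
        except Exception as exc:  # a real worker would route by error type
            last_error = exc
    raise last_error
```

With the defaults, the waits are 10, 20, 40, 80, and 160 seconds. Passing `sleep` as a parameter keeps the function testable without real waiting.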
/fk-create-project: Define Project & Entities
Name the project, pick a material (realistic, 3d_pixar, anime…), declare every entity that should stay consistent — characters, locations, props — and write your scenes with chain_type ROOT or CONTINUATION.
One project → both 9:16 Shorts and 16:9 long-form
/fk-gen-refs: Generate Reference Images
One reference per entity: portrait for characters and assets, landscape for locations. The worker verifies every response returns a UUID media_id before the pipeline can continue.
Powers visual consistency across every downstream scene
/fk-gen-images: Compose Scene Images
Each scene image is generated with every entity listed in character_names passed as imageInputs. Scene prompts describe action only — the refs carry appearance.
Blocks automatically if any referenced entity is missing a UUID
/fk-gen-videos: Animate into 8s Clips
Veo 3.1 turns each scene image into an 8-second clip (i2v). Use /fk-gen-chain-videos for i2v_fl start+end frame chaining with transition_prompt between CONTINUATION scenes.
Voice description auto-appended, no background music by default
/fk-gen-narrator: Narrator + Overlays (optional)
Auto-write per-scene narrator_text, clone your voice template via OmniVoice (600+ languages), then extract dates, locations, and stats for timed text overlays.
Skips interview scenes, preserves original audio when flagged
/fk-concat-fit-narrator: Trim, Xfade & Publish
Trim every scene to narrator duration, xfade cross-dissolve on chained segments, burn overlays, apply brand logo, generate thumbnails, and upload to YouTube under your channel's rules.
Auto-detects Shorts (<61s + 9:16) vs long-form
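The Shorts check in the last stage reduces to two conditions. The sketch below takes the "<61s + 9:16" rule literally; requiring an exact 9:16 ratio rather than "any vertical video" is an assumption, as is the function name.

```python
def is_short(duration_s: float, width: int, height: int) -> bool:
    """True when the video is under 61 seconds and exactly 9:16."""
    # Cross-multiply to compare ratios without floating-point division.
    is_nine_by_sixteen = width * 16 == height * 9
    return duration_s < 61 and is_nine_by_sixteen
```

A 45-second 1080x1920 export would be routed as a Short, while the 1920x1080 export of the same project would go out as long-form.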
A Three-Layer Architecture
Flow Kit does not reimplement Google Flow. It automates the browser session you already have, then wraps the pipeline in a local agent you can drive from any AI CLI.
┌──────────────────┐     WebSocket :9222     ┌───────────────────────┐
│  AI CLI (you)    │                         │                       │
│  Claude · Codex  │     REST :8100          │   Chrome Extension    │
│  · Gemini        │ ◄────────────────────   │  (MV3 Service Worker) │
│                  │                         │                       │
│  /fk-* skills    │                         │  • Token capture      │
└────────┬─────────┘                         │  • reCAPTCHA solve    │
         │                                   │  • API proxy          │
         ▼                                   │                       │
┌──────────────────┐                         │  on labs.google/fx    │
│  FastAPI Agent   │ ◄─────── commands ──    │                       │
│                  │ ──────── results ─►     │                       │
│  • Queue worker  │                         └───────────────────────┘
│  • Error router  │
│  • ffmpeg / TTS  │
│  • aiosqlite DB  │
└──────────────────┘
Chrome Extension (MV3)
- Captures the Google Flow bearer token from labs.google/fx/tools/flow
- Solves reCAPTCHA in the background and retries up to 10× without burning the retry budget
- Proxies every Flow API call via a service worker — agent never hits labs.google directly
FastAPI Agent + SQLite
- REST API on :8100 for projects, videos, scenes, characters, requests, materials, reviews
- Queue worker: 5 concurrent, 10s cooldown, 20+ error patterns routed to targeted handlers
- aiosqlite schema with dual-orientation video/scene columns and chain_type linkage
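The queue worker's throttling (5 concurrent, 10s cooldown) can be sketched with an asyncio semaphore. How Flow Kit actually applies the cooldown is not stated, so holding the slot for the cooldown after each job is an assumption, and `run_throttled` is a hypothetical name.

```python
import asyncio
from typing import Awaitable, Callable


async def run_throttled(jobs: list[Callable[[], Awaitable]],
                        max_concurrent: int = 5,
                        cooldown: float = 10.0) -> list:
    """Run coroutine factories with a concurrency cap and a per-slot cooldown."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(job: Callable[[], Awaitable]):
        async with semaphore:
            result = await job()
            # Keep the slot occupied during the cooldown so the next
            # request on this slot starts only after the pause.
            await asyncio.sleep(cooldown)
            return result

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Call it with `asyncio.run(run_throttled(jobs))`, where each job is a zero-argument function returning a coroutine. Results come back in the same order as the jobs.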
AI CLI (you)
- Claude Code loads CLAUDE.md · Codex reads AGENTS.md · Gemini reads GEMINI.md
- Each /fk-* skill is a markdown recipe — same skills drive every CLI
- /fk-doctor triages failures before you re-run anything
CLI-agnostic by design
Every skill is a plain markdown recipe under skills/. Any AI CLI that can read files can drive the full pipeline.
CLAUDE.md: Auto-loaded. Native /fk-* slash commands in .claude/commands/.
AGENTS.md: Agent reads the markdown skill file for the invoked command.
GEMINI.md: Same recipe pattern, with skill files driving Flow Kit the same way.
Install in Four Steps
Flow Kit is MIT-licensed. Python 3.10+, ffmpeg, and Chrome are required. Windows users should run it inside WSL.
Clone + one-command setup
$ git clone https://github.com/tuannguyenhoangit-droid/google-flow-agent.git
$ cd google-flow-agent
$ ./setup.sh # installs Python deps, verifies ffmpeg + Chrome
Load the Chrome extension
# chrome://extensions → Developer mode → Load unpacked
# Point to the ./extension/ folder in the repo
# Open https://labs.google/fx/tools/flow and sign in
Start the agent
$ source venv/bin/activate
$ python -m agent.main
$ curl http://127.0.0.1:8100/health
# {"status":"ok","extension_connected":true}
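If you would rather script the health check than eyeball the curl output, a minimal sketch using only the standard library could look like this. It relies on the two response fields shown above; the function names and the 5-second timeout are assumptions.

```python
import json
from urllib.request import urlopen


def agent_ready(health_body: str) -> bool:
    """True when the agent reports ok and the extension is connected."""
    payload = json.loads(health_body)
    return (
        payload.get("status") == "ok"
        and payload.get("extension_connected") is True
    )


def check_agent(url: str = "http://127.0.0.1:8100/health",
                timeout: float = 5.0) -> bool:
    """Fetch the health endpoint and evaluate the response body."""
    with urlopen(url, timeout=timeout) as response:
        return agent_ready(response.read().decode("utf-8"))
```

If `extension_connected` comes back false, the Flow tab is probably closed or signed out, so reopen https://labs.google/fx/tools/flow before starting a pipeline.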
Drive it from your AI CLI
# Claude Code auto-loads CLAUDE.md. Codex / Gemini read AGENTS.md / GEMINI.md.
> /fk-create-project "My First Flow Kit Video"
> /fk-pipeline # batched refs → images → videos
What Creators Are Saying
Real feedback from video creators running Flow Kit in production.
“I used to spend 4-5 hours manually prompting each scene. With Flow Kit, my entire 12-scene documentary was done in under 30 minutes. The reference image system alone is the killer feature.”
Alex T.
Discord
“i2v_fl chaining with transition_prompt is insane. Every cut looks intentional. I haven't opened ffmpeg manually since I found Flow Kit.”
Maria S.
“create-project to youtube-upload in one session. I shipped 3 videos yesterday that would have taken me a week before Flow Kit.”
David L.
Discord
“Dual-orientation output from one project is the game-changer. Same scenes, both 9:16 Shorts and 16:9 long-form. No extra work.”
Sarah K.
“/fk-doctor saved me. First time my pipeline failed it told me exactly which extension error hit and what to do. Self-healing pipelines are the future.”
James R.
Discord
“OmniVoice with 600+ languages opened markets I couldn't reach before. I publish the same project in English, Vietnamese, and Spanish — same character voices, three upload chains.”
Nina P.