LTX 2.3 First Middle Last Frame — Keyframe AI Video Tutorial

One of LTX 2.3's most underrated features is multi-keyframe interpolation. Instead of generating motion from a single starting image, you pin the first, middle, and last frames and let the model fill in the roughly 143 frames between each pair. The result is a 12-second video (288 frames at 24 fps) in which your three uploaded poses appear at exact timestamps, connected by smooth, coherent AI-generated motion.

This is the closest thing AI video has to traditional keyframe animation. We just shipped this as a one-click workflow on VirtuaVixen Studio: drop your images into the three upload slots and generate. If you want to run it yourself, the FML workflow is bundled in our ComfyUI Workflow Pack with all the LoRAs and the abliterated text encoder pre-configured. Support is available on our Discord.

How First-Middle-Last Keyframe Generation Works

Most image-to-video models accept a single start frame and invent everything else from the prompt. LTX 2.3 supports a richer mode: LTXVAddGuide nodes pin specific images at specific frame indices. Three guide nodes in series, pinned at frame 0, frame 144, and frame -1 (the last frame), give the diffusion process anchors to interpolate between.
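
As a rough illustration of the frame arithmetic, the sketch below turns the three pin positions into absolute frame indices for a 288-frame clip and counts the frames the model has to generate between each pair. The helper name resolve_guide_indices is hypothetical and the code does not call ComfyUI or LTXVAddGuide; it only assumes that -1 means "last frame", as in Python indexing.

```python
from typing import List


def resolve_guide_indices(total_frames: int) -> List[int]:
    """Return absolute frame indices for the first, middle, and last keyframes.

    Hypothetical helper for illustration; not part of ComfyUI or LTX.
    """
    if total_frames < 3:
        raise ValueError("need at least 3 frames to place three keyframes")
    requested = [0, total_frames // 2, -1]  # -1 is shorthand for the last frame
    # Convert negative indices to absolute positions, as Python sequences do.
    return [i if i >= 0 else total_frames + i for i in requested]


indices = resolve_guide_indices(288)
# Frames the model must generate strictly between each pair of anchors.
gaps = [b - a - 1 for a, b in zip(indices, indices[1:])]
print(indices, gaps)
```

For a 288-frame clip this places the anchors at frames 0, 144, and 287, leaving 143 and 142 generated frames in the two gaps.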

The ltx2.3-transition LoRA is specifically trained to produce smooth morphs between distant keyframes. Combined with the LTX-2.3-22b-AV-LoRA-talking-head and LTX2.3-NSFWMOTION LoRAs, it lets the model generate plausible motion that connects your start, middle, and end poses.

What You Can Do with FML That You Cannot Do with I2V

Choreographed pose progressions — standing → kneeling → lying down. With single-image I2V, the AI invents whatever motion it wants. With FML you control exactly where she ends up.
Outfit changes mid-shot — first frame clothed, middle frame partially undressed, last frame fully nude. The model interpolates the transition.
Position transitions — first frame missionary, middle frame pulled out, last frame doggy. Useful for multi-scene shots.
Cumshot moments — pin the climax frame as the last keyframe and the model builds toward it naturally.
Camera moves — wide shot → medium shot → close-up by varying the framing of each keyframe.

How to Use the FML Workflow

1. Pick three keyframes

The first frame is the master — it sets the resolution and the character identity for the whole video. Middle and last frames get auto-resized and padded to match. Keep the same character across all three; the model will struggle if the keyframes show different people.
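
To make the "resized and padded to match" step concrete, here is a minimal sketch of the geometry, assuming a centered letterbox: the keyframe is scaled to fit inside the master frame's dimensions and the remainder is split evenly as padding. The function fit_and_pad is hypothetical, and the workflow's actual resize mode and padding placement may differ.

```python
from typing import Tuple


def fit_and_pad(src_w: int, src_h: int, dst_w: int, dst_h: int) -> Tuple[int, int, int, int, int, int]:
    """Scale (src_w, src_h) to fit inside (dst_w, dst_h), keeping aspect ratio.

    Returns (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom).
    Assumption: padding is centered; the real workflow may place it differently.
    """
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = max(1, round(src_w * scale))
    new_h = max(1, round(src_h * scale))
    pad_x = dst_w - new_w
    pad_y = dst_h - new_h
    left, top = pad_x // 2, pad_y // 2
    return new_w, new_h, left, top, pad_x - left, pad_y - top


# A square 720x720 middle keyframe matched to a 1280x720 master frame.
print(fit_and_pad(720, 720, 1280, 720))
```

Under these assumptions the square keyframe keeps its 720x720 size and gets 280 pixels of padding on the left and right, which is one reason to prepare all three keyframes at the same aspect ratio.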

Generate the keyframes any way you like — a T2I render, three frames pulled from existing footage, AI-edited variations of one source, even photographs. They just need to be consistent in subject and lighting.

2. Write a scene-level prompt

The prompt drives style, lighting, mood, and audio — not pose. The keyframes already lock the body positions. So describe what's true throughout: lighting, environment, skin, hair, breathing, ambient sound. If you describe poses in the prompt, you fight your own keyframes.

Example for a tropical waterfall scene: “A young woman in a tropical waterfall grotto. Lush jungle and falling water behind her. Soft natural daylight, slight mist in the air. Her body moves smoothly between positions, water glistening on skin, hair shifting naturally. Ambient jungle sound, water falling, faint breathing.”

For full prompt strategy, see the LTX 2.3 prompting guide.

3. Optional: add dialogue

Add a [SPEECH]: line to the prompt — whatever follows it gets routed to the talking-head LoRA, which animates the lips to match. Keep dialogue short and concrete; long monologues drift off-sync.
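
If you assemble prompts in a script, something like the following keeps the scene description and the optional dialogue separate. This is a sketch under assumptions: build_prompt is a hypothetical helper, and placing the [SPEECH]: marker on its own final line is a guess at a layout the workflow accepts, not a documented format.

```python
from typing import Optional


def build_prompt(scene: str, speech: Optional[str] = None) -> str:
    """Combine a scene-level description with an optional [SPEECH]: line.

    Hypothetical helper; the exact layout the workflow expects is an assumption.
    """
    scene = " ".join(scene.split())  # collapse stray whitespace and newlines
    if speech is None or not speech.strip():
        return scene  # no dialogue: ambient audio only
    return f"{scene}\n[SPEECH]: {speech.strip()}"


print(build_prompt(
    "A tropical waterfall grotto. Soft natural daylight, slight mist in the air.",
    "It's so peaceful here.",
))
```

Leaving speech empty returns the scene prompt unchanged, which corresponds to the ambient-only case.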

4. Generate

The full pipeline runs in 8–12 minutes on a 48 GB GPU: 9 sampling steps with the linear-quadratic scheduler, then a tiled VAE decode, then synced audio decode, then save. Output: a 288-frame (12-second) H.264 video at 24 fps with audio.

Settings That Matter

Guide strength: 0.7 — moderate. The keyframes are strong but the model has room to deviate. If output drifts too far from your keyframes, raise to 0.8–0.9.
Steps: 9 with linear-quadratic scheduler — sweet spot for quality vs speed.
CFG: 1.0 — the distilled LTX 2.3 model uses CFG=1 by design. Higher values cause artifacts.
Resolution: longest side 1080 — first frame's aspect ratio is preserved, others get cropped/padded to fit.
Length: 11 seconds × 24 fps + 24 padding frames = 288 frames total, which plays back as 12 seconds at 24 fps.
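
The length and resolution settings above reduce to simple arithmetic, sketched here. The function names are hypothetical, and the rounding is an assumption: the actual workflow may snap dimensions to a multiple the model requires.

```python
from typing import Tuple

FPS = 24


def total_frames(seconds: int, fps: int = FPS, padding: int = 24) -> int:
    """Frame count as described above: seconds x fps plus padding frames."""
    return seconds * fps + padding


def scale_to_longest_side(width: int, height: int, longest: int = 1080) -> Tuple[int, int]:
    """Scale so the longest side equals `longest`, preserving aspect ratio.

    Assumption: plain rounding; the real workflow may round differently.
    """
    scale = longest / max(width, height)
    return round(width * scale), round(height * scale)


frames = total_frames(11)
duration = frames / FPS
size = scale_to_longest_side(2000, 1000)
print(frames, duration, size)
```

With the defaults this gives 288 frames, 12 seconds of playback, and a 1080x540 output for a 2:1 first frame.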

Common Issues

Jerky transitions between keyframes — usually caused by keyframes that are too far apart in pose. Pick a middle frame that's actually in between the first and last.

Character identity drifts during interpolation — usually a prompt problem. Don't add specific physical descriptors that conflict with the keyframes (“blonde” if she's brunette, “small breasts” if the keyframes show otherwise). The model gets confused.

Audio is silent or muffled — usually fine for ambient scenes. If you wanted dialogue, make sure your prompt includes a [SPEECH]: line. With no speech text, the output contains ambient audio only.

Lips do not sync to dialogue — keep dialogue lines short. The talking-head LoRA handles short, complete sentences much better than long monologues, and pause-heavy delivery also confuses it.

FML vs Single-Frame I2V: When to Use Which

Use FML for choreographed scenes, position transitions, outfit changes, and any time you need the AI to land on a specific final frame.
Use single-frame I2V (our other LTX 2.3 workflows like Doggy Cinema or BJ Cinema) for simpler scenes where you only care about the start and you want the model to invent the motion.

Try It

The FML workflow is live on our Studio right now in the Cinema tab; look for “LTX 2.3 FML Frame Basic”. Add your three keyframes and your prompt, then hit generate. It is free within the daily token allowance.

To run it locally, our ComfyUI Workflow Pack includes the FML workflow JSON pre-configured with the abliterated Gemma 3 text encoder, the transition LoRA, the NSFW motion LoRA, and the talking-head LoRA. The installer pulls everything from our Hugging Face repo automatically — about 60 GB of downloads, then ready to run.
