Varmir docs

Streaming vs batch

The same model serves both modes. Pick batch for a finished file, and streaming for live audio whose transcript the user needs to see immediately.

Batch (REST)

A single POST to /api/v1/transcribe. You hand the server the whole clip, and the server returns the whole transcript when it's done. Latency is roughly 0.05 × audio length plus a fixed overhead of about 150 ms.
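To make the latency figure concrete, here is a minimal sketch of the estimate in Python. The function name and parameter names are illustrative, not part of the Varmir API; the constants come from the rule of thumb above, which is approximate.

```python
def estimate_batch_latency_s(
    audio_seconds: float,
    realtime_factor: float = 0.05,    # "0.05 x audio length" from the text
    fixed_overhead_s: float = 0.150,  # "fixed ~150 ms" from the text
) -> float:
    """Rough batch latency in seconds for a clip of the given length."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return realtime_factor * audio_seconds + fixed_overhead_s


# A 60 s clip:        0.05 * 60   + 0.15 = about 3.15 s
# A 1-hour recording: 0.05 * 3600 + 0.15 = about 180.15 s
one_minute = estimate_batch_latency_s(60)
one_hour = estimate_batch_latency_s(3600)
```

The estimate grows linearly with clip length, so an hour-long file takes on the order of three minutes before any text comes back. That is fine for a pipeline, but it is the reason batch is a poor fit when someone is watching the screen.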

Use when: you have a finished recording on disk, you don't need partial results, or your client lives behind a strict firewall.
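A batch call might be put together as in the sketch below, which uses only the Python standard library. Only the path /api/v1/transcribe comes from this page. The host name, the bearer-token header, and the audio/wav content type are assumptions made for illustration; check the API reference for the real request format.

```python
import urllib.request

# Hypothetical values: host, auth scheme, and content type are NOT from the docs.
BASE_URL = "https://api.varmir.example"
TRANSCRIBE_PATH = "/api/v1/transcribe"  # path taken from the text above


def build_transcribe_request(
    audio_bytes: bytes,
    api_key: str,
    content_type: str = "audio/wav",  # assumed upload format
) -> urllib.request.Request:
    """Build (but do not send) one POST that carries the whole clip."""
    return urllib.request.Request(
        url=BASE_URL + TRANSCRIBE_PATH,
        data=audio_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": content_type,
        },
    )


# Example: wrap an in-memory clip; nothing is sent over the network here.
request = build_transcribe_request(b"RIFF....WAVE", api_key="YOUR_KEY")
```

To send it, pass the request to urllib.request.urlopen with a timeout comfortably above the expected latency for the clip. The whole exchange is a single outbound HTTPS request, with no protocol upgrade and no long-lived connection.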

Streaming (WebSocket)

A persistent connection on /api/v1/stream, over which you push raw PCM and receive incremental partial and final events. Partial results arrive in under 200 ms.
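Pushing raw PCM means slicing the audio into small frames and sending them as they become available. The sketch below shows only that slicing step. The sample rate, sample width, channel count, and 100 ms frame size are assumptions for illustration; this page does not state the audio format or frame size the service expects.

```python
from typing import Iterator

# Assumed audio format (not specified on this page).
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2   # 16-bit samples
CHANNELS = 1           # mono
FRAME_MS = 100         # assumed frame duration


def pcm_frames(pcm: bytes, frame_ms: int = FRAME_MS) -> Iterator[bytes]:
    """Yield consecutive fixed-duration slices of a raw PCM buffer.

    The last frame may be shorter than the others.
    """
    frame_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * frame_ms // 1000
    if frame_bytes <= 0:
        raise ValueError("frame_ms is too small for this audio format")
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


# One second of silence under the assumed format is 32,000 bytes,
# which splits into ten 3,200-byte frames of 100 ms each.
frames = list(pcm_frames(bytes(32_000)))
```

In a real client, each frame would go out as one binary WebSocket message on /api/v1/stream while a separate task reads the partial and final events coming back. Smaller frames lower the delay before the server sees new audio, at the cost of more messages.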

Use when: you need live captions, real-time interpretation, or voice agents, or you have long sessions that can't fit in a single upload.

Which to pick

As a rule of thumb: if a human is waiting for the transcript in real time, pick streaming. If a pipeline is processing a file, pick batch. Pricing is identical: both modes bill per minute of audio.
