Create Transcription

Upload an audio file and receive a structured JSON response with transcription, speaker diarization, and AI analysis in a single synchronous API call.

POST/v1/transcribe

Request

Send a multipart/form-data request with the audio file and optional parameters:

FieldTypeRequiredDescription
fileFileYesAudio file. Supported: MP3, WAV, M4A, OGG, FLAC, AAC, WebM, Opus, MP4. Max 100 MB
languageStringNoISO 639-1 code (e.g., "en"). Auto-detected if omitted
modeStringNopolished (default) or verbatim. Verbatim preserves raw transcript without AI formatting
redact_piiBooleanNoMask PII in transcript (names, SSNs, card numbers). Pro plan only.
instructionsStringNoCustom AI prompt. Max 2,000 chars. See Custom Instructions
planStringNopro (default) or lite. Pro ($0.39/hr) includes full 11-feature AI analysis. Lite ($0.15/hr) includes diarization and transcript cleanup only.
asyncBooleanNoSet to true for async processing. Returns 202 Accepted with a job_id immediately. Results delivered via webhooks.
metadataJSON StringNoCustom JSON object (max 4KB). Echoed back in webhook payloads for correlation. Only used with async=true.

Example

curl
curl -X POST https://api.voxparse.com/v1/transcribe \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "[email protected]" \
  -F "plan=pro" \
  -F "language=en"

Response

Returns a JSON object with the transcript, segments with timestamps, word-level timing, and an ai_analysis object. Pro plan includes sentiment, compliance, financial data, call summary, and action items. Lite plan includes diarization labels and cleaned transcript only.

Tip: Use mode=verbatim for raw transcripts without AI formatting. Use the default polished mode for full AI analysis with diarization, sentiment, and compliance.
Plan selection: Use plan=pro ($0.39/hr) for full AI analysis including sentiment, compliance, financial extraction, and custom instructions. Use plan=lite ($0.15/hr) for diarization and transcript cleanup only. If omitted, defaults to pro.

Async Mode

For background processing, add async=true to receive an immediate 202 Accepted response with a job ID. Results are delivered to your registered webhook endpoints when processing completes.

curl — async request
curl -X POST https://api.voxparse.com/v1/transcribe \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "[email protected]" \
  -F "async=true" \
  -F 'metadata={"ticket_id":"T-1234"}'

The metadata field (up to 4KB JSON) is echoed in the webhook payload for easy correlation with your internal systems. Validation errors still return 4xx synchronously even in async mode.

For the complete response schema, see the Full API Reference.