TRANSCRIPT_JSON

Transcript JSON Specification

Transcript data is stored in a single server-side transcript.json file per project. The file is timestamp-based and agnostic to the specific audio it is rendered against, so if the audio is edited without a corresponding transcript update, playback positions become misaligned. Automatic transcription realigns the transcript but overwrites any manual edits.


Root Keys

key            type    description
paragraphs     list    Transcript text, structured as paragraphs → sentences → words
sectionBreaks  list    User-defined section markers with timestamps
annotations    object  Text annotations (hyperlinks) attached to transcript segments
notes          list    Inline notes anchored to timecodes (stage directions, speech labels, editor notes)

All root keys are optional: a file may contain only paragraphs, only annotations, and so on.
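
Because any root key may be absent, a reader usually has to normalize the file before using it. The following is a minimal sketch of that step, not the project's actual loader: the function name load_transcript and the choice of empty defaults are assumptions made for illustration.

```python
import json

# Assumed defaults for absent root keys. The specification only says the keys
# are optional; treating a missing key as "empty" is a choice made here.
ROOT_DEFAULTS = {
    "paragraphs": list,
    "sectionBreaks": list,
    "annotations": dict,
    "notes": list,
}


def load_transcript(raw: str) -> dict:
    """Parse transcript JSON text and fill in any missing root keys."""
    data = json.loads(raw)
    for key, factory in ROOT_DEFAULTS.items():
        # setdefault leaves keys that are already present untouched.
        data.setdefault(key, factory())
    return data


# A file that contains only notes still yields all four root keys.
transcript = load_transcript('{"notes": []}')
```

Using factories (list, dict) rather than shared literal defaults avoids two loaded transcripts accidentally sharing the same mutable list.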


paragraphs

A list of paragraph objects. Paragraphs are created either manually by the user or automatically based on detected pauses during transcription. Each paragraph contains one or more sentences; each sentence optionally contains a word-level breakdown.

Paragraph

key        type   description
speaker    str    The speaker_id associated with this paragraph
start      float  Start time of the paragraph in seconds
end        float  End time of the paragraph in seconds
sentences  list   Ordered list of sentence objects within this paragraph

Sentence

key    type   description
start  float  Start time of the sentence in seconds
end    float  End time of the sentence in seconds
text   str    Full text of the sentence
words  list   Word-level breakdown (may be empty; if so, text is used as the atomic unit)

Word

key          type   description
start        float  Start time of the word in seconds
end          float  End time of the word in seconds
word         str    The word string
probability  float  Confidence score (0–1) for the word's timing/detection
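
The fallback rule for an empty words list can be sketched as a generator that walks paragraphs → sentences → words and yields one timed unit at a time. The function name atomic_units is hypothetical, and treating a missing words key the same as an empty list is an assumption.

```python
def atomic_units(paragraphs):
    """Yield (start, end, text) for each word, or for the whole sentence
    when the sentence has no word-level breakdown."""
    for paragraph in paragraphs:
        for sentence in paragraph.get("sentences", []):
            words = sentence.get("words") or []  # missing key treated as empty
            if words:
                for w in words:
                    yield (w["start"], w["end"], w["word"])
            else:
                # No word timings: the sentence text is the atomic unit.
                yield (sentence["start"], sentence["end"], sentence["text"])
```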

sectionBreaks

A list of section break objects. Section numbers are assigned at runtime (sorted by beforeSegStart) and are not stored in the file.

key             type   description
beforeSegStart  float  Start time of the first segment in this section. At runtime, the break is positioned immediately before the nearest sentence start. 0 places the break before the very first paragraph.
name            str    User-defined label for the section (may be empty string)
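
Since section numbers are derived at runtime, a consumer has to compute them on load. The sketch below shows one way to do that, plus the snap to the nearest sentence start. Both function names are hypothetical, and numbering from 1 is an assumption; the specification only states that numbers follow the beforeSegStart order.

```python
def number_sections(section_breaks):
    """Return (number, name, beforeSegStart) tuples ordered by beforeSegStart."""
    ordered = sorted(section_breaks, key=lambda b: b["beforeSegStart"])
    return [
        (number, b.get("name", ""), b["beforeSegStart"])
        for number, b in enumerate(ordered, start=1)  # 1-based: an assumption
    ]


def snap_to_sentence_start(sentence_starts, before_seg_start):
    """Pick the sentence start closest to the stored time.

    On a tie, min() returns whichever candidate appears first in the list.
    """
    return min(sentence_starts, key=lambda s: abs(s - before_seg_start))
```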

notes

A list of inline note objects anchored to timecodes. Notes are rendered at the nearest paragraph boundary at or after their timecode. The list is kept sorted by timecode.

key       type   description
timecode  float  Position in seconds; the note renders before the first paragraph starting at or after this time
type      str    "stage_direction" | "speech_label" | "editor_note"
text      str    The note text
public    bool   false = visible to editors only; true = visible in presentation view. "editor_note" type is always hidden in presentation regardless of this flag.
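
The placement, ordering, and visibility rules for notes can be illustrated with the standard bisect module. This is a sketch under stated assumptions: the three function names are hypothetical, paragraph start times are assumed to be sorted ascending, and a missing public flag is treated as false.

```python
import bisect


def visible_in_presentation(note):
    """editor_note is always hidden; other types follow the public flag."""
    return note["type"] != "editor_note" and bool(note.get("public", False))


def first_paragraph_at_or_after(paragraph_starts, timecode):
    """Index of the first paragraph whose start is >= timecode.

    Returns len(paragraph_starts) when no such paragraph exists.
    Assumes paragraph_starts is sorted ascending.
    """
    return bisect.bisect_left(paragraph_starts, timecode)


def insert_note(notes, note):
    """Insert a note in place while keeping the list sorted by timecode."""
    timecodes = [n["timecode"] for n in notes]
    # bisect_right places the new note after existing notes with the same timecode.
    notes.insert(bisect.bisect_right(timecodes, note["timecode"]), note)
```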

annotations

An object containing text annotations attached to transcript segments. Currently the only annotation type is hyperlinks.

hyperlinks

A map from a unique string ID (generated at creation time) to a hyperlink object.

{
  "hyperlinks": {
    "<id>": { ... }
  }
}

Hyperlink object

key          type        description
url          str         The destination URL
name         str | null  Display name for the link (null if not set)
description  str | null  Short description shown in the UI (null if not set)
editorNotes  str | null  Private notes visible only in the editor (null if not set)
segmentIdx   int         Index of the transcript segment (sentence) this link is anchored to
charStart    int         Character offset within the sentence text where the link begins
charEnd      int         Character offset within the sentence text where the link ends
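
To show how the anchor fields fit together, here is a sketch that resolves a hyperlink to the text it covers. It rests on three assumptions that the specification does not state: segmentIdx is a zero-based index into all sentences flattened in document order, charStart is inclusive, and charEnd is exclusive. The function name linked_text is hypothetical.

```python
def linked_text(transcript, link):
    """Return the substring of the anchored sentence covered by the link."""
    # Assumption: segments are all sentences, flattened across paragraphs.
    sentences = [
        sentence
        for paragraph in transcript.get("paragraphs", [])
        for sentence in paragraph.get("sentences", [])
    ]
    sentence = sentences[link["segmentIdx"]]
    # Assumption: charStart inclusive, charEnd exclusive (Python slice semantics).
    return sentence["text"][link["charStart"]:link["charEnd"]]
```

Under these assumptions, a link with segmentIdx 0, charStart 10, and charEnd 17 on the sentence "Hello and welcome." covers the word "welcome".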

Example

{
  "paragraphs": [
    {
      "speaker": "speaker_0",
      "start": 0.0,
      "end": 4.2,
      "sentences": [
        {
          "start": 0.0,
          "end": 4.2,
          "text": "Hello and welcome.",
          "words": [
            { "start": 0.0, "end": 0.5, "word": "Hello",   "probability": 0.99 },
            { "start": 0.6, "end": 0.9, "word": "and",     "probability": 0.98 },
            { "start": 1.0, "end": 1.6, "word": "welcome", "probability": 0.97 }
          ]
        }
      ]
    }
  ],
  "sectionBreaks": [
    { "beforeSegStart": 0, "name": "Introduction" }
  ],
  "notes": [
    { "timecode": 0.0, "type": "stage_direction", "text": "theme music plays", "public": true },
    { "timecode": 1.0, "type": "editor_note", "text": "check audio quality here", "public": false }
  ],
  "annotations": {
    "hyperlinks": {
      "lnk_1714000000000": {
        "url": "https://example.com",
        "name": "Example",
        "description": null,
        "editorNotes": null,
        "segmentIdx": 0,
        "charStart": 10,
        "charEnd": 17
      }
    }
  }
}