Background Information: Real-Time Captioning
Real-time and closed captioning technologies used by broadcast companies rely on broadcast captioning stenographers, whose work is very similar to court reporting. Broadcast closed captioning also requires specialized equipment and highly skilled staff:
- The equipment consists of a special stenographer’s machine connected to a computer running captioning software. The machine is connected to a recorder or to a live broadcast feed.
- Captioners utilize key sequences to represent phonetic parts of words or phrases to generate text versions in near real-time. Trained professionals can reach speeds of 225 words per minute.
- Most large broadcasters either utilize their own captioning staff, or use a national or international captioning service provider.
Types of Broadcast Captioning: CART vs. Typewell
Real-time captioning services are generally based on one of two production methodologies: Communication/Computer Access Real-time Translation (CART) or Typewell.
CART is a broad category of services including open captioning or real-time stenography. A trained stenographer uses stenographic tools and methods to translate the speech to text.
- Output is a highly reliable, verbatim transcript of everything that was said, including interaction between instructor and student.
- Provides very fast speech-to-text conversion.
- Transcriptionists are highly skilled with extensive training for speed and accuracy. Training can take 2 to 5 years.
Typewell is a system for converting speech into text using specialized software. The transcriptionist provides a non-verbatim translation that fully communicates the speaker’s intent while removing any non-critical communications.
- Produces a “meaning for meaning” transcription.
- Eliminates “non-meaningful speech” such as witty banter between the students and teacher, which may be critical to fully understanding the context.
- Students have complained that the typists are slower than CART professionals.
- Training for Typewell transcriptionists is much shorter and can be completed in 30 to 60 hours.
Web Stream Captioning
Web stream captioning is accomplished by much the same process, with a delay of approximately 3 to 15 seconds. While voice recognition software continues to improve, its use in real-time captioning is hampered by:
- Output accuracy and quality problems
- Insufficient audio quality
- Absence of punctuation options
- Speaker limitations, such as the ability to caption only one speaker at a time
- IT resources consumed (bandwidth, server, client)
Options for Captioning for MediaSite
Third Party Services
MediaSite Captioning uses an existing account with a third-party captioning service to automate submitting a video recording and creating its transcript.
- The provider profile is created on the MediaSite server.
- This process may be automated.
- The provider will create a transcript of the recording and develop a Synchronized Accessible Media Interchange (SAMI) file to be returned and linked to the video presentation.
- The standard turnaround time is 3 business days for captioning and transcription services. (Urgent service with a 24-hour turnaround is available for both at a higher cost.)
- The captioning service profile is global for the MediaSite Enterprise server.
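To illustrate what a returned SAMI file roughly looks like, the following minimal Python sketch builds one from timed transcript segments. The function name `to_sami`, the style class `ENUSCC`, and the sample segments are assumptions made for illustration; this is not the output format of any particular provider, and real files typically carry more styling and metadata.

```python
from html import escape


def to_sami(segments, title="Captions", css_class="ENUSCC"):
    """Build a simplified SAMI document from (start_ms, text) pairs.

    Hypothetical helper: the header and style block are a bare-bones
    illustration, not what a captioning provider actually returns.
    """
    lines = [
        "<SAMI>",
        "<HEAD>",
        f"<TITLE>{escape(title)}</TITLE>",
        '<STYLE TYPE="text/css">',
        "<!--",
        "P { font-family: Arial; color: white; }",
        f".{css_class} {{ Name: English; lang: en-US; }}",
        "-->",
        "</STYLE>",
        "</HEAD>",
        "<BODY>",
    ]
    # Each SYNC element ties a caption to an offset (in milliseconds)
    # from the start of the recording.
    for start_ms, text in sorted(segments):
        lines.append(f"<SYNC Start={int(start_ms)}>")
        lines.append(f"<P Class={css_class}>{escape(text)}</P>")
        lines.append("</SYNC>")
    lines += ["</BODY>", "</SAMI>"]
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [(0, "Welcome to today's lecture."), (3200, "Let's begin with a review.")]
    print(to_sami(demo))
```

The key idea is that each caption is paired with a start time, which is what lets the player display the text in sync with the video presentation.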
MediaSite captioning service partners include Automatic Sync Technologies (AST) and 3Play Media. Both can be configured under the MediaSite Captioning Manager.
|Pay per project (Per Hour of Content)||$162.00||$150.00|
|100 Hours Prepaid||$14,550.00||$14,100.00|
|1000 Hours Prepaid||$145,500.00||$129,000.00|
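To compare the plans on equal terms, it can help to convert each price into an effective cost per hour of content. The short Python sketch below does that arithmetic using the figures from the table above. The table does not label its two price columns, so the code refers to them only as column 1 and column 2; the names `PLANS` and `effective_rates` are hypothetical.

```python
# Prices copied from the table above: plan -> (hours covered, (column 1, column 2)).
# The source table does not name its columns, so they stay unlabeled here.
PLANS = {
    "Pay per project": (1, (162.00, 150.00)),
    "100 Hours Prepaid": (100, (14550.00, 14100.00)),
    "1000 Hours Prepaid": (1000, (145500.00, 129000.00)),
}


def effective_rates(plans):
    """Return {plan: (column 1 rate, column 2 rate)} in dollars per hour of content."""
    return {
        name: tuple(round(price / hours, 2) for price in prices)
        for name, (hours, prices) in plans.items()
    }


if __name__ == "__main__":
    for name, (col1, col2) in effective_rates(PLANS).items():
        print(f"{name}: ${col1:.2f}/hr vs ${col2:.2f}/hr")
```

Under this calculation, the first price column works out to $145.50 per hour at both prepaid tiers, while the second drops to $141.00 per hour at 100 hours and $129.00 per hour at 1000 hours.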
Internally Managed Hardware/Software Solution
DOCSOFT AV is a stand-alone appliance that resides on the network and monitors designated MediaSite recorders.
- It uses voice-to-text technology to begin generating a transcript as soon as recording begins.
- A draft transcript is available approximately 1 hour after recording completes.
- Accuracy is approximately 60 to 70 percent, depending on audio quality, accents, and vocabulary.
- Final transcript preparation requires manual editing.
- Transcribing additional speakers/voices on the recording creates significant problems.
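The source does not say how the 60 to 70 percent accuracy figure is measured. One common way to score a draft transcript is word-level accuracy, computed as one minus the word error rate against a manually corrected reference. The sketch below assumes that approach; the function name `word_accuracy` and the sample sentences are hypothetical.

```python
def word_accuracy(reference, hypothesis):
    """Return 1 - word error rate, using word-level edit distance.

    Assumption: accuracy is scored against a human-corrected reference
    transcript. This may not match how a given product reports accuracy.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the draft
                curr[j - 1] + 1,     # extra word in the draft
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr

    errors = prev[-1]
    # Clamp at zero: a draft with many extra words can exceed 100% error.
    return max(0.0, 1.0 - errors / len(ref))


if __name__ == "__main__":
    corrected = "please open your books to chapter seven before we begin"
    draft = "please open your box to chapter eleven before we began"
    print(f"{word_accuracy(corrected, draft):.0%}")
```

In this example, three of the ten reference words are wrong in the draft, which gives an accuracy of about 70 percent and suggests how much manual editing a transcript at that level would need.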