AWS Transcribe vs. AssemblyAI: The Ultimate Battle of Video Transcription Giants

By Haktan Suren, PhD
Jun 7th, 2024

I recently had the opportunity to pit two transcription heavyweights, AWS Transcribe and AssemblyAI, against each other in a head-to-head comparison. With nearly 100 hours of video to transcribe, the results might surprise you. If you’re considering video or audio transcription, keep reading: this detailed comparison could save you time and money.

Initial Comparison: The Basics

Media Handling

Let’s start with how each platform handles your media. With AWS Transcribe, every media file must first be uploaded to an S3 bucket. AssemblyAI, on the other hand, doesn’t care where your media lives, as long as it can reach the file, for example through a publicly accessible URL. This alone can be a deciding factor for many users.
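To make the difference concrete, here is a minimal sketch of the request each service expects. It only builds the payloads and makes no network calls. The bucket, key, job name, and URL are hypothetical, and the parameter names follow boto3's `start_transcription_job` and AssemblyAI's REST `/v2/transcript` endpoint as I understand them, so check the current docs before relying on them.

```python
def aws_job_request(bucket: str, key: str, job_name: str) -> dict:
    """Keyword arguments for boto3's transcribe.start_transcription_job.

    The media must already sit in S3, so it is referenced by an s3:// URI.
    """
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        # Derive the format from the file extension (e.g. "mp3" or "mp4").
        "MediaFormat": key.rsplit(".", 1)[-1].lower(),
        "LanguageCode": "en-US",  # assumption: English audio
    }


def assemblyai_request(media_url: str) -> dict:
    """JSON body for a POST to AssemblyAI's /v2/transcript endpoint.

    Any URL the service can reach works; no bucket is involved.
    """
    return {"audio_url": media_url}


# Hypothetical usage (requires credentials, so it is left commented out):
#   import boto3
#   boto3.client("transcribe").start_transcription_job(
#       **aws_job_request("my-bucket", "talks/lecture-01.mp3", "lecture-01"))
```

The S3 upload is an extra step on the AWS side, while the AssemblyAI payload is just a URL.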

Both platforms support MP3 (audio) and MP4 (video) formats. I personally prefer MP3, even for video transcriptions, as it saves on bandwidth and storage space.
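If you want to go the MP3 route, one common approach is to strip the audio track with ffmpeg before uploading. This is a sketch, assuming ffmpeg is installed and on your PATH. The file names are hypothetical and the quality setting is just a reasonable starting point, not something I tuned.

```python
import subprocess


def mp3_extract_command(video_path: str, audio_path: str) -> list[str]:
    """Build an ffmpeg command that drops the video stream and encodes MP3."""
    return [
        "ffmpeg",
        "-i", video_path,          # input video
        "-vn",                     # no video in the output
        "-codec:a", "libmp3lame",  # encode the audio as MP3
        "-q:a", "4",               # variable bitrate quality (assumed value)
        audio_path,
    ]


def extract_audio(video_path: str, audio_path: str) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(mp3_extract_command(video_path, audio_path), check=True)


# Hypothetical usage:
#   extract_audio("lecture-01.mp4", "lecture-01.mp3")
```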

Size and Vocabulary Constraints

AWS Transcribe hits a snag with files larger than 2 GB. If your files are bigger, you’ll need to split them into chunks or shrink them, for example by converting video to MP3. AssemblyAI, on the flip side, has no such 2 GB restriction, making it easier for users with hefty files.
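For files over the limit, a rough sketch of the chunking step might look like this. It assumes ffmpeg is available and treats "2 GB" as 2 GiB, which is my assumption rather than a documented figure. The file names are hypothetical.

```python
AWS_MAX_BYTES = 2 * 1024**3  # assumption: the 2 GB limit read as 2 GiB


def needs_splitting(size_bytes: int) -> bool:
    """True when a file is too large to submit in one piece."""
    return size_bytes > AWS_MAX_BYTES


def split_command(src: str, chunk_seconds: int, out_pattern: str) -> list[str]:
    """Build an ffmpeg command that cuts src into fixed-length pieces.

    Streams are copied rather than re-encoded, so splitting is fast,
    but cuts may not land exactly on the requested boundaries.
    """
    return [
        "ffmpeg",
        "-i", src,
        "-f", "segment",                      # ffmpeg's segment muxer
        "-segment_time", str(chunk_seconds),  # target length of each piece
        "-c", "copy",                         # no re-encoding
        out_pattern,                          # e.g. "part_%03d.mp4"
    ]


# Hypothetical usage:
#   import os, subprocess
#   if needs_splitting(os.path.getsize("long-talk.mp4")):
#       subprocess.run(split_command("long-talk.mp4", 3600, "part_%03d.mp4"), check=True)
```

Keep in mind that a fixed-length cut can land mid-sentence, so words near chunk boundaries may be transcribed poorly.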

AWS Transcribe does offer a custom vocabulary feature where you can define specific terms, jargon, or product names (see attached image below). In my experience, however, it was not very effective: even with the terminology defined in the custom vocabulary, errors were still prevalent.
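For reference, this is roughly what wiring up a custom vocabulary looks like with boto3. The sketch only builds the arguments for `create_vocabulary` and the `Settings` block for `start_transcription_job`. The vocabulary name and the phrases are hypothetical examples, not the exact list I used. AWS also has its own formatting rules for phrases (acronyms in particular), which are not handled here.

```python
def vocabulary_request(name: str, phrases: list[str],
                       language_code: str = "en-US") -> dict:
    """Keyword arguments for boto3's transcribe.create_vocabulary."""
    return {
        "VocabularyName": name,
        "LanguageCode": language_code,
        "Phrases": list(phrases),
    }


def job_settings_with_vocabulary(name: str) -> dict:
    """Extra keyword arguments that attach the vocabulary to a job."""
    return {"Settings": {"VocabularyName": name}}


# Hypothetical usage (the vocabulary must finish processing before a job can use it):
#   import boto3
#   client = boto3.client("transcribe")
#   client.create_vocabulary(**vocabulary_request("medical-terms", ["chelators", "liposome"]))
#   client.start_transcription_job(..., **job_settings_with_vocabulary("medical-terms"))
```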

Multi-language Support and Concurrency

Both platforms support multi-language transcription. When it comes to running many transcriptions at once, AWS takes the lead with massive concurrency. AssemblyAI allows up to 5 concurrent transcriptions on free accounts and up to 200 on paid plans.
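If you are on a plan with a concurrency cap, one simple way to stay under it is a bounded thread pool. This is a generic sketch: `transcribe_one` is a hypothetical callable standing in for whatever function submits a file and returns its transcript.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def transcribe_all(paths: Iterable[str],
                   transcribe_one: Callable[[str], str],
                   max_concurrent: int = 5) -> list[str]:
    """Transcribe every path, never running more than max_concurrent at once.

    Results come back in the same order as the input paths.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(transcribe_one, paths))


# Hypothetical usage, with a cap of 5 to match a free account:
#   transcripts = transcribe_all(["a.mp3", "b.mp3"], my_transcribe_function, max_concurrent=5)
```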

Free Tier and Output Formats

AWS’s free tier covers just one hour of transcription, whereas AssemblyAI gives you a whopping 100 hours, an immense difference. Additionally, both platforms can output transcripts in SRT, VTT, and TXT formats (images attached).
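As a quick illustration of how the two subtitle formats differ, here is a small sketch that formats cue timestamps. SRT separates milliseconds with a comma while VTT uses a dot. This is not either platform's implementation, just the format convention, and the helper names are my own.

```python
def format_timestamp(ms: int, fmt: str = "srt") -> str:
    """Format milliseconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    sep = "," if fmt == "srt" else "."
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}{sep}{millis:03d}"


def srt_cue(index: int, start_ms: int, end_ms: int, text: str) -> str:
    """Render one numbered SRT cue block."""
    return (f"{index}\n"
            f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}\n"
            f"{text}\n")


print(format_timestamp(3_723_456, "srt"))  # 01:02:03,456
print(format_timestamp(3_723_456, "vtt"))  # 01:02:03.456
```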

Transcription Speed and API Simplicity

Both platforms transcribe quickly, but AssemblyAI is slightly faster on an individual file. AWS, however, seems to scale better when many jobs run concurrently. Both APIs are straightforward, though AssemblyAI’s is arguably a bit easier to work with (images attached).

Connection Handling

AWS Transcribe queues your transcription requests, so you can submit a job and walk away. With AssemblyAI, the transcription call I used kept the connection open until the transcript was ready, a more hands-on approach. That said, AssemblyAI’s REST API also lets you submit a job and poll for its status or register a webhook, so an asynchronous workflow is possible.
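On the AWS side, "walking away" in practice means submitting the job and checking back later. Here is a minimal polling sketch, assuming a boto3 Transcribe client and a job name you chose at submission time. The polling interval is arbitrary.

```python
import time
from typing import Callable


def wait_for_job(client, job_name: str, poll_seconds: float = 30.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll get_transcription_job until the job completes or fails.

    `client` is expected to behave like boto3.client("transcribe").
    Returns the final TranscriptionJob dictionary.
    """
    while True:
        response = client.get_transcription_job(TranscriptionJobName=job_name)
        job = response["TranscriptionJob"]
        if job["TranscriptionJobStatus"] in ("COMPLETED", "FAILED"):
            return job
        sleep(poll_seconds)  # still QUEUED or IN_PROGRESS


# Hypothetical usage:
#   import boto3
#   job = wait_for_job(boto3.client("transcribe"), "lecture-01")
#   print(job["TranscriptionJobStatus"])
```

The `sleep` parameter is only there so the loop can be exercised without real waiting.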

The Real Deal: Transcript Quality

Here’s where the rubber meets the road: the quality of the transcripts. I tested both platforms on some pretty dense medical content, and AssemblyAI emerged as the clear winner.

AWS Transcribe struggled notably:

– “Chelators” was rendered as “key laters”
– “6 one-on-one calls” turned into “61 one-on-one call”
– “Cyto” morphed into “Cito”
– “DMSA” showed up as “DM SS A”
– “EDTA” was oddly converted to “Ed t & t”
– “Liposome” appeared as “lipo zone”

(See the attached image for a list of these errors.)

Even with custom vocabulary applied, AWS Transcribe fell short, making significant errors that could prove costly in professional settings.

To be certain, I even tested AWS Transcribe Medical, hoping for improved accuracy. Sadly, the results were similar.

AssemblyAI, on the other hand, was a revelation. Without any custom vocabulary, it nailed the medical transcriptions, delivering accuracy that left AWS in the dust.
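If you want to spot-check your own transcripts for this kind of problem, a simple term check goes a long way. The sketch below is not how I scored the two services. It is just a hypothetical helper that reports which expected domain terms never show up in a transcript.

```python
import re


def missing_terms(transcript: str, expected_terms: list[str]) -> list[str]:
    """Return the expected terms that do not appear as whole words.

    Matching is case-insensitive. A term that was mangled
    (e.g. split into several words) counts as missing.
    """
    text = transcript.lower()
    return [
        term for term in expected_terms
        if not re.search(rf"\b{re.escape(term.lower())}\b", text)
    ]


sample = "The key laters bind metals, and the lipo zone carries the drug."
print(missing_terms(sample, ["chelators", "liposome", "metals"]))
# ['chelators', 'liposome']
```

A check like this only catches missing terms. It says nothing about wrong numbers or dropped words, so it complements a manual read-through rather than replacing it.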

Cost Analysis

The cherry on top? I had to shell out $153 to AWS for these transcriptions (receipt attached). Meanwhile, AssemblyAI provided around 100 hours of transcription free of charge—unbeatable value.
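As a sanity check on that bill, here is the back-of-the-envelope arithmetic. The per-minute rate is an assumption on my part (roughly AWS's standard batch price as I understand it), so plug in the current price for your region before drawing conclusions.

```python
ASSUMED_RATE_PER_MINUTE = 0.024  # USD per minute; an assumption, not taken from the receipt


def aws_cost(hours: float, rate: float = ASSUMED_RATE_PER_MINUTE) -> float:
    """Estimated cost in USD for transcribing the given number of hours."""
    return round(hours * 60 * rate, 2)


def hours_for_bill(bill: float, rate: float = ASSUMED_RATE_PER_MINUTE) -> float:
    """How many hours of audio a bill corresponds to at the given rate."""
    return round(bill / rate / 60, 2)


print(aws_cost(100))        # 144.0
print(hours_for_bill(153))  # 106.25
```

Under that assumed rate, 100 hours comes to about $144, which is in the same ballpark as the $153 I was charged.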

Conclusion: The Clear Winner

Quality should always be your primary consideration when it comes to transcription services. Despite AWS Transcribe’s scalability and extensive feature set, AssemblyAI takes the crown for its superior transcription accuracy and generous free tier. In the end, AssemblyAI’s performance highlights that sometimes, simplicity, accuracy, and cost-effectiveness can indeed trump extensive features.

Feel free to check out the attached images for a visual comparison of the errors and interface elements discussed.

So, there you have it. Is AssemblyAI my new go-to? Absolutely. What about you? Let us know in the comments below which platform you think reigns supreme!

Stay tuned for more tech comparisons and insights. Happy transcribing!

About the Author

Haktan Suren, PhD
- Webguru, Programmer, Web developer, and Father :)
