Our jobs just got a little easier. Thanks, Google! The Google Cloud Speech-to-Text Program is live, and a research study recently published in an ASHA journal found that it transcribes narrative language samples more accurately than real-time transcription by Speech-Language Pathologists and Trained Transcribers. The Google Cloud Speech-to-Text Program could act as your personal assistant.
The research study, published in the September 2021 issue of ASHA's Journal of Speech, Language, and Hearing Research, noted that the new Speech-to-Text Program by Google transcribed narrative language more accurately than Speech-Language Pathologists (SLPs) and Trained Transcribers (TTs) did in real time. Wow! The study was done with samples from school-aged children with developmental language disorders. Each transcription was scored for accuracy against the original narrative language sample. The research study states the following: “Results indicated that Google Cloud Speech was significantly more accurate than real-time transcription in transcribing narrative samples and was not impacted by speech rate of the narrator. In contrast, SLP and TT transcription accuracy decreased as a function of increasing speech rate.” Source: Carly B. Fox, Megan Israelsen-Augenstein, Sharad Jones, and Sandra Laing Gillam. An Evaluation of Expedited Transcription Methods for School-Age Children’s Narrative Language: Automatic Speech Recognition and Real-Time Transcription. Journal of Speech, Language, and Hearing Research, Volume 64, Issue 9, September 2021, Pages 3533-3548.
Using The Google Cloud Speech-to-Text Program
OK, yay for research, but how do we use it in the everyday SLP world? The Google Cloud Speech-to-Text Program would be great for transcribing a narrative language sample. A narrative language sample is often used during a dynamic assessment, when you gather more than just standardized assessment scores. Narrative language samples are also helpful for clients with cultural differences, clients who speak English as a second language, or clients with other factors that standardized assessments cannot always account for. Narrative language samples help you gather:
- Narrative Language Abilities
  - Click here for an assessment of narrative language by age, written by Scott Prath of Bilinguistics, that you can use with your sample.
- Type Token Ratio (TTR)
  - TTR is a way to measure the variety of vocabulary your client has, and it’s a great way to measure their progress over time. To calculate it, you divide the number of different words (“types”) by the total number of words (“tokens”) in the sample.
- Mean Length of Utterance (MLU)
  - MLU is the average length of the utterances (phrases or sentences) your client uses, usually counted in morphemes or words.
- Speech/Articulation Production Abilities in longer utterances
  - A Speech-to-Text narrative language sample could be used as a quick screener for speech production, or you could use the sample to gather a Percentage of Intelligibility.
- Perspective-taking/Theory of Mind Abilities
  - Notice your client misinterpreting the perspective of other characters/people in their narrative language sample? This might be a concern for children who are 5 years of age and do not narrate that others might have their own thoughts and ideas.
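Once you have a transcript, TTR and MLU are quick to calculate. Here is a minimal Python sketch of both measures. It makes a few simplifying assumptions: each utterance is already on its own line of text, MLU is counted in words rather than morphemes, and the function names and the sample utterances are made up for illustration.

```python
import re


def tokenize(utterance):
    """Lowercase an utterance and split it into word tokens."""
    return re.findall(r"[a-z']+", utterance.lower())


def type_token_ratio(utterances):
    """Number of different words (types) divided by total words (tokens)."""
    tokens = [word for u in utterances for word in tokenize(u)]
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def mean_length_of_utterance(utterances):
    """Average utterance length in words (an assumption; MLU is often counted in morphemes)."""
    lengths = [len(tokenize(u)) for u in utterances]
    lengths = [n for n in lengths if n > 0]  # skip empty utterances
    if not lengths:
        return 0.0
    return sum(lengths) / len(lengths)


# Made-up sample: 14 total words, 10 different words, 3 utterances.
sample = [
    "the dog ran to the park",
    "he saw a big dog",
    "the dog barked",
]

print(round(type_token_ratio(sample), 2))
print(round(mean_length_of_utterance(sample), 2))
```

Counting in words keeps the sketch simple; a morpheme-based MLU would need the transcript to be segmented into morphemes first.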
I am excited that SLPs can benefit from this technology! Some other features of the Google Cloud Speech-to-Text Program:
- Over 125 languages and language variants are available, and the list is growing! Google is constantly updating the program with new languages, dialects, and language variants.
- Use the Speech-to-Text program with real-time speech or upload an audio file.
- The software program will automatically rewrite spoken numbers as dates, times, addresses, currencies, etc.
- Use multiple modes of audio input like phone calls, video conferences, recordings, etc.
- Google’s Speech-to-Text Program performs well even in noisy environments with technology to cancel out background noise.
- Punctuation! The Speech-to-Text Program can accurately punctuate your transcriptions.
- One of the coolest features is that the program automatically predicts which speaker made which utterance for you! This is called speaker diarization.
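To give a feel for what speaker diarization output lets you do, here is a small Python sketch that groups tagged words into speaker turns. It assumes the transcript has already been flattened into (word, speaker tag) pairs. That input shape, the function name, and the sample words are illustrative assumptions, not the program's actual response format.

```python
from itertools import groupby


def group_by_speaker(tagged_words):
    """Merge consecutive words with the same speaker tag into one turn."""
    turns = []
    for speaker, group in groupby(tagged_words, key=lambda pair: pair[1]):
        turns.append((speaker, " ".join(word for word, _ in group)))
    return turns


# Made-up example: speaker 1 is the clinician, speaker 2 is the client.
tagged_words = [
    ("tell", 1), ("me", 1), ("a", 1), ("story", 1),
    ("once", 2), ("there", 2), ("was", 2), ("a", 2), ("frog", 2),
    ("what", 1), ("happened", 1), ("next", 1),
]

for speaker, text in group_by_speaker(tagged_words):
    print(f"Speaker {speaker}: {text}")
```

Separating the turns this way makes it easy to pull out only the client's utterances before calculating measures on the sample.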
Let’s talk about pricing. Your first 60 minutes of the Google Cloud Speech-to-Text Program are free. After that, the standard program is priced at $0.006 per 15 seconds of audio processed, which comes out to $0.72 for 30 minutes.
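If you want to estimate the cost for your own caseload, the arithmetic above can be sketched in a few lines of Python. The rate is the one quoted in this post. Rounding partial 15-second blocks up is an assumption, and the function name and `free_seconds` parameter are made up for illustration, so check Google's current pricing page before budgeting.

```python
import math
from decimal import Decimal

PRICE_PER_INCREMENT = Decimal("0.006")  # dollars per 15 seconds, as quoted in this post
INCREMENT_SECONDS = 15


def estimate_cost(audio_seconds, free_seconds=0):
    """Estimate the cost in dollars for a given amount of audio.

    Assumes partial 15-second blocks are rounded up (an assumption,
    not confirmed by this post).
    """
    billable = max(0, audio_seconds - free_seconds)
    increments = math.ceil(billable / INCREMENT_SECONDS)
    return PRICE_PER_INCREMENT * increments


# 30 minutes of audio with no free minutes applied: 120 blocks of 15 seconds.
print(estimate_cost(30 * 60))

# 90 minutes of audio with the first 60 minutes free.
print(estimate_cost(90 * 60, free_seconds=60 * 60))
```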
So what’s not to like? The only thing that caught my eye was that the audio samples in the research study did not come from diverse populations. Over 75% of the samples came from white, monolingual children. The Informed SLP wrote, “For automatic speech recognition to be more clinically useful to SLPs, however, models that can accurately transcribe the speech of students from culturally or linguistically diverse populations are necessary,” and I agree. I am hopeful that Google will continue to update the program with our diverse populations in mind, especially since it already includes over 125 languages and is growing. Get started and even try it for free here.
Sources:
Scott Prath from Bilinguistics for Narrative Language Chart