Man vs. Machine: Our typists take on Speech Recognition Software
It’s the beginning of a new decade, and in some ways, it feels as if we’re already living in the future. Technology has come so far, with some inventions being almost unbelievable. Don’t feel like driving? No problem, just order an Uber helicopter from your phone or grab yourself a Tesla with self-driving capabilities. And let’s not forget 3D printers, AI and Virtual Reality.
Okay, so surely I should be able to transcribe my interviews with simple voice to text software?
The simple answer is no, not yet.
Keep reading to find out how speech recognition has developed over the years, learn its limitations, and be the judge in the first ever Sterling Transcription Type-Battle between man and machine…
A Brief History of Voice to Text
Although speech recognition has been around since the 1950s, it was unrecognisable compared to how we know it today. The first device filled the entire laboratory wall and could only understand the numbers zero to nine.
Fast forward to the 90s, when Dragon released the first publicly available voice to text program. By the 2000s, Dragon NaturallySpeaking could comprehend up to 100 words per minute, greatly reducing the typing load for busy professionals.
Since then, the software has taken off. We may take it for granted, but most of us have speech recognition in the palms of our hands. And over two million people in the world now use it in their homes.
“Alexa, how much will an Uber Chopper set me back?”
There is still one limitation that all of these programs share: they can only transcribe one speaker.
Today, there are many brands offering different versions of speech to text software. At the higher end, Dragon still dominates the market and can cost from a few hundred up to a few thousand dollars. Dragon can now learn your voice and your vocabulary, and can be configured to work in various programs to suit your needs. At the other end, a number of free voice to text programs are available online. You probably already have the Windows Speech Recognition software installed on your computer.
The Type-Off
So now that you know a bit more about the history of speech recognition, let’s jump into some real examples in the first ever Sterling Transcription Type-Off. In this battle, we’re going to pit man vs machine, typist vs software. You’ll be able to see the difference for yourself as we test some free speech recognition programs, with both one speaker and two speakers, against our own brilliant typists.
ONE SPEAKER
Windows Speech Recognition software
There are a couple of ways to use Windows Speech Recognition, but for this experiment we used pre-recorded audio.
We had one of our staff members read out the start of this article and recorded it with an Olympus DS-9500 dictaphone. We then played the recording through the audio player Express Scribe to run Windows Speech Recognition. Can you see the resemblance?
Our at the beginning leading indicator, and and five, it the other that were already living in the future period
Technology had come so far, would family that has been almost unbelievable been called the traffic? No problem, and that although the helicopter from the line. Whether the get to the printers, AI and virtual reality. They get so surely engineered the tragic that many interviews with assemble what a couple?
The Himalayan you know, not yet been improving to find out how the children have developed in the years, into the kitchen and into the for the down trend compact battle between man and machine 08
The history of what the
And although the latter has been out of the fifties it was a nagging ankle compared to how he ordered a implacable that the entire wardrobe war economy and the channel islands and line in
But for the nineties when journalist of the publicly available wouldn’t expect them. Balloted out don’t, then actually speaking before and after 100 words a minute, greatly reducing the capital in the the that among the
Him then, that offer had taken off period when making it for granted, but most of the of the trade missions in the palm of LM. Another two million people in the world know you didn’t know I’m going
“A little, how much wanted the job at the back?”
The still one location to all the software share , but you only transcribe one speaker.
Today there are many lands offering different versions of the strength of were furious at the higher end, J and still dominates the market in can cost from a few hundred of the few 1000 auxiliary of January is now capable of wearing your voice, your vocabulary and can be configured to work in various programs to to your needs. From the other end and onwards for use with the text programs available online the if you probably already have Belinda speech recognition software installed on your computer.
Free online voice to text software
I scoured the internet, trying to find a free online speech to text program that could transcribe pre-recorded audio. Unfortunately, I was unsuccessful, so I dictated directly into a free online program instead. In saying that, the program did a much better job at direct dictation than I would have expected. It struggled a little with punctuation, but overall its output is much more comprehensible. See for yourself:
It’s the beginning of a new decade, and in some ways, it feels as if we already living in the future Point has come so far, with some inventions being almost unbelievable Point don’t feel like driving? just order an Uber helicopter from your phone or grab yourself a Tesla with self-driving capabilities point let’s not forget 3D printers comma.ai and virtual reality Point
Ok so, surely I should be able to transcribe my interviews with a simple voice to text off work?
Answer is no, not yet
Reading to find out how speech recognition has developed over the years, it’s limitations and be the judge in the first ever Stirling transcription type battle between man and machine ellipses
History of voice-to-text
Although speech recognition has been around since 1950s, it was unrecognisable compared to how we know it today period the first device filled the entire laboratory wall and could only understand the numbers 0 to 9. new line fast forward to the 90s when dragon released the first publicly available voice-to-text program Point Dragon Naturally Speaking could comprehend up to 100 words per minute, greatly reducing the typing load for busy professionals
Although this program was quite accurate, it may not be much of a time saver. We read the article aloud from a printout, very slowly. Had we been speaking naturally without prompts, it’s likely that the program would have had a hard time keeping up, meaning more mistakes and more time editing. In saying that, free voice to text software would be an extremely useful tool for those who have difficulty typing.
MULTIPLE SPEAKERS
Windows Speech Recognition software
Even with one speaker, Windows Speech Recognition struggled to produce anything worthwhile, but just for fun, let’s see what happens when we add a second speaker.
It’s worth noting that speech recognition software isn’t designed to transcribe multiple speakers at once. However, it is a question we receive often, and the results are quite eye-opening:
Tax status slash currency has to do in a memorable hat in a servant who has enhanced version of Latin that he is leaving a minister who loves Joan Allen Lujan a hat: USC which require us to hold you have happened if documents in La Jolla GC a lasting until I have is that low-season yesterday is bush’s last summer’s day that has been malice as how Mrs. Landau has sat in his last laugh listening at the time of the Latin class that has a familiar cancers indication who say the loans to happen if Mr. M lonely at hennessey’s times the is roughly 40 left in a race for the sometimes that he has to happen until I know what where the action as if you then came the necessary is to say who dismisses any individuals as an engine has saw that has enhanced versions of women who said his daughter of William F he is the whole a designer who who designed the routing has a son who have time to follow the holiday inn and is they who
Our Typists
We gave one of our typists the same audio and here’s what they produced:
START OF TRANSCRIPT
Facilitator: Hi Cassie, thanks for joining me.
Interviewee: No problem, happy to help out.
Facilitator: Let’s jump right in. How long have you worked for Pacific Transcription?
Interviewee: I’ve worked here about two years.
Facilitator: Were you familiar with the industry before starting this job?
Interviewee: I was a little familiar. I had a course in university which required us to record an interview and then type it into a document. Similar to what we would produce here. From that experience I learnt how difficult and time-consuming it actually is, which is why I have so much respect for our typists.
Facilitator: What is your role here?
Interviewee: My role is pretty versatile. Most of the time my job is to check and edit the typists’ work before sending it out to the client. I’ve also worked in the products department so I’m familiar with transcription and dictation equipment too.
Facilitator: What’s one interesting fact you learned from this job?
Interviewee: Well the average person’s typing speed is roughly 40 words per minute. Whereas professional transcriptionists’ type at nearly double that rate.
Facilitator: What’s surprised you most about transcription as a service?
Interviewee: I think just how necessary it is for so many businesses and even individuals. I guess I’ve never given it much thought but outsourcing transcription is so important for researchers, doctors, lawyers, investigators. It’s used in court, to type novels; it’s really just so far reaching and saves so many people valuable time.
Facilitator: Well I think we’ve covered everything. Thanks again for speaking with me.
Interviewee: No, thank you.
END OF TRANSCRIPT
So, can you spot the difference?