How Speech Transcription Is Changing Our World

Not many years ago, speaking to a computer that understood your instructions was considered science fiction, yet today it is something we take for granted. We speak our searches into Google, Alexa is our new friend, and we even speak to our TV. This is all down to technology that is able to transcribe what we say into structured data: ‘text’.

As specialists in interaction recording and analytics, we have seen first-hand the maturing of transcription technology and an ever-increasing use of this technology to drive digital efficiency, enhance the customer experience and more importantly, to unlock the valuable insights contained in every conversation.

So has speech transcription reached the point where the value it delivers far outweighs its cost, turning it from a ‘nice to have’ into a ‘must have’? Let’s explore some of the widely held perceptions and challenge the myths.

Speech Transcription is not accurate

Maybe in the early days, but today it is highly effective. The speech transcription engine we leverage has a Word Error Rate (WER) of a mere 5%, which means that for every twenty words spoken it gets, on average, only one wrong. To be honest, this is as good as some people I know!

However, if you add speech pack dictionaries that are relevant to an industry, organisation or area then this WER is reduced even further.
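To make the WER figure concrete, here is a minimal sketch of how the metric is commonly calculated: the number of word substitutions, deletions and insertions needed to turn the engine's output into the true transcript, divided by the number of words actually spoken. The function name and the sample sentences are our own illustration, not part of any particular speech engine.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # a spoken word the engine missed
                curr[j - 1] + 1,     # an extra word the engine inserted
                prev[j - 1] + cost,  # a match or a substituted word
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative 20-word utterance with a single misheard word.
spoken = ("please transfer five hundred pounds from my current account to my "
          "savings account before the end of this working week")
transcribed = spoken.replace("savings", "saving")

print(f"WER: {word_error_rate(spoken, transcribed):.0%}")
```

One wrong word in twenty gives a WER of 5%, which matches the figure quoted above.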

Speech Transcription cannot cope with accents

In the early days of speech technology, extensive training of the engine was required to understand regional accents and then in environments where multiple accents occurred, there were issues in having to use multiple engines to gain the necessary accuracy.

Three factors have changed this. Firstly, specialist speech engines have developed universal English, which means a single engine is able to maintain accuracy across all variants and accents of the language. Secondly, the technology being used can not only handle the vast array of regional accents, but also continually self-learn and improve its accuracy. Thirdly, the technology is able to distinguish between the speakers in a conversation and, as such, tag and apply different rules to each party.

Coping with different languages

Speech technology today is able to automatically detect the language being used and utilise the appropriate engine to transcribe it. In the past, this was a manual task that both added to the complexity of transcription and prevented it from being real-time. This is a major step forward, especially in areas such as financial trading and compliance, where the transactions that need to be captured and transcribed can often traverse multiple languages.

The technology is still expensive

This is most definitely no longer the case. For example, we provide speech transcription as a standard capability of our recording and analytics solution. It can be licensed as an application or provided through a software-as-a-service consumption model. The benefits of improved productivity, more informed decision making and reduced business risk through compliance far outweigh the costs of transcription.

The process is slow and needs heavy computing power

The technology has advanced considerably over the past decade and today can be thought of as neither slow nor processor hungry. The speed of Speech Transcription is measured by the Real-time Factor (RTF): the time it takes to transcribe the audio divided by the duration of the audio. Anything over 1 means transcription takes longer than the conversation itself and hence prevents real-time processing. The speech engine we utilise performs against an SLA of 0.5 and on average is able to transcribe a 10-minute audio file within 4 minutes, an RTF of 0.4. In terms of processing power, the advancement of these solutions is delivering far better server optimisation, and with the capability delivered from a SaaS platform, the scalability of power is taken care of for you.
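As a quick worked example of the RTF arithmetic, the sketch below checks the figures quoted above against the SLA. The function names are our own and purely illustrative; the 0.5 threshold and the 10-minute and 4-minute timings are the ones given in the text.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Time taken to transcribe divided by the duration of the audio."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


def meets_sla(rtf: float, sla: float = 0.5) -> bool:
    """True when transcription is at least as fast as the SLA requires."""
    return rtf <= sla


# A 10-minute recording transcribed in 4 minutes.
rtf = real_time_factor(processing_seconds=4 * 60, audio_seconds=10 * 60)

print(f"RTF: {rtf:.2f}")
print(f"Faster than real time: {rtf < 1}")
print(f"Within the 0.5 SLA: {meets_sla(rtf)}")
```

An RTF of 0.5 would allow up to 5 minutes for the same 10-minute file, so the average of 4 minutes sits comfortably inside the SLA.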

Transcription is only for Speech Analytics

It is true that transcription plays a key role in the ability to leverage speech analytics. However, transcription on its own can still add considerable value. With conversations converted to text, it is possible to quickly mine data to find the appropriate interactions, for example when responding to GDPR requests or audits. Transcription also means detailed notes can be automatically generated against calls and attached to customer records in the CRM, enabling a full picture of customer engagement to be kept.
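To illustrate why text makes interactions so much easier to find, here is a minimal sketch of searching stored transcripts by customer or by phrase. The `Transcript` record, its field names and the sample data are all hypothetical; a real deployment would query its own recording platform or CRM rather than an in-memory list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Transcript:
    call_id: str       # hypothetical identifier for the recorded call
    customer_id: str   # hypothetical link to the customer record
    text: str          # the transcribed conversation


def find_interactions(transcripts: List[Transcript],
                      customer_id: Optional[str] = None,
                      phrase: Optional[str] = None) -> List[Transcript]:
    """Return transcripts matching the customer and/or containing the phrase."""
    phrase_lower = phrase.lower() if phrase else None
    matches = []
    for transcript in transcripts:
        if customer_id is not None and transcript.customer_id != customer_id:
            continue
        if phrase_lower is not None and phrase_lower not in transcript.text.lower():
            continue
        matches.append(transcript)
    return matches


# Made-up sample data for illustration only.
calls = [
    Transcript("call-001", "cust-17", "I would like to update my direct debit."),
    Transcript("call-002", "cust-42", "Please send me a copy of my personal data."),
    Transcript("call-003", "cust-17", "Can you confirm what personal data you hold?"),
]

# Every recorded interaction for one customer, e.g. for a GDPR request.
for call in find_interactions(calls, customer_id="cust-17"):
    print(call.call_id, call.text)
```

The same helper can locate every call that mentions a given phrase, for example `find_interactions(calls, phrase="personal data")`, which is the kind of search an audit typically requires.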

One Solution, Many Benefits

Speech Transcription is most definitely a transformational technology. It takes unstructured data, ‘voice’, and turns it into highly structured data, ‘text’. How this data can then be utilised opens up endless possibilities and benefits.

We are helping customers to leverage transcription to enable real-time compliance monitoring. Rather than reviewing trades and transactions that have taken place, we allow these to be analysed while they are still taking place, meaning action can be taken immediately to ensure compliance.

We are enabling contact centres to be proactive, to detect indicators of churn, frustration or vulnerability and do something about these immediately. We are helping contact centre managers to spot potential issues with agents and immediately address these with training and coaching – a critical plus during times when more agents are working from home.

We are applying Speech Transcription everywhere: teaching staff use it to automatically generate lecture notes, practitioners transcribe telephone patient consultations, and public safety operators transcribe the audio captured across radio, phone and CCTV and push the transcripts in real time to those who need them.

And across all of these applications, we are helping organisations to mine every interaction in every conversation to better understand customers, their preferences, their behaviours and the experience being delivered to them.

I truly believe that speech technology and, in particular, Speech Transcription is changing our world. It is enabling us to be more efficient and far more effective, and it is most certainly delivering the all-important insights that enable us to shape our future.

If you would like to find out more about Speech Transcription and how we have helped other organisations to maximise this, we would love to discuss this with you. Please contact the team.