Sizo Develops ™

Design and Create.

Using the Web Speech API on mobile browsers.

August 27, 2022

The Web Speech API enables you to incorporate voice data into web apps.

Have you tried adding speech synthesis to your website and run into a bug on mobile browsers, where nothing is spoken and the "Synthesis Failed" error shows up when you debug it? I suspect most people who have used this API with long text have hit the same error. In my testing, the bug is caused by a limit on how much text speech synthesis can utter at once, and it shows up in most mobile browsers that support speech synthesis, Chrome and Safari to name a few.

In this article we are going to look at a workaround for this error, which includes reproducing the bug and debugging it. We will use JavaScript to implement speech synthesis in our app. We will not discuss every function of the API, only the "speak" function.

First, let's go over how to reproduce the bug before fixing it. I'm assuming you already know how to debug your mobile browser. Let's begin.

Reproducing the synthesis failed error.

This error only happens on mobile; desktop works just fine. You can get the text to test this from Text Generator.

    const yourText = "YOUR TEXT HERE" // Make sure it's more than 1000 words
    const utterThis = new SpeechSynthesisUtterance(yourText)
    utterThis.volume = 1
    utterThis.rate = 1.6
    utterThis.pitch = 1
    utterThis.lang = "en-US"
    // No voice is set, so this should use the default voice
    // Attach the error handler before calling speak
    utterThis.onerror = (event) => {
        // This function should be called on your mobile browser
        // Your console should log the "Synthesis Failed" error (error code "synthesis-failed")
        console.log(event.error)
    }
    window.speechSynthesis.speak(utterThis)

The code above should reproduce the bug.
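If you would rather not depend on an external text generator, a small helper can build a long test string for you. This is only a minimal sketch: makeTestText and its filler words are hypothetical names introduced here, not part of the Web Speech API.

```javascript
// Hypothetical helper: builds a filler string with exactly `wordCount` words.
function makeTestText(wordCount) {
    const filler = ["the", "quick", "brown", "fox", "jumps", "over", "lazy", "dogs"]
    const words = []
    for (let i = 0; i < wordCount; i++) {
        words.push(filler[i % filler.length])
    }
    return words.join(" ")
}

// 1200 words, comfortably above the 1000 suggested for the test.
const longText = makeTestText(1200)
console.log(longText.split(" ").length) // 1200
```

The result can be passed to SpeechSynthesisUtterance in place of "YOUR TEXT HERE".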

Now that we have our bug, how do we fix it? As stated above, the bug is caused by the character limit that is set for mobile browsers. I tried testing for this limit and concluded that it is somewhere above 600 words. There may be an exact number on the internet, but for now this estimate is good enough to fix the bug.

Note: This number of words depends on the length of each word, punctuation and spaces. Everything that counts as a character can be a factor, so it's only an estimate.
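To see why a word count is only a rough stand-in for a character limit, you can compare both measurements for two strings. The sketch below uses a hypothetical helper, measureText, which is not part of any API.

```javascript
// Hypothetical helper: reports both the word count and the character count.
function measureText(text) {
    return {
        words: text.split(" ").length,
        characters: text.length
    }
}

const shortWords = "a an to of"
const longWords = "internationalization characteristically misunderstanding straightforwardness"

// Both strings have 4 words, but very different character counts.
console.log(measureText(shortWords))
console.log(measureText(longWords))
```

Text made of long words reaches a character limit with fewer words, which is why it helps to stay well below the estimate.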

The fix is to split the text into smaller pieces, an approach that is also used by some speech APIs out there. You probably have a clue by now on how this might work: we need to cut the number of words we give to the API at once to something below 600. But what if you want it to read more than 600 words? The Web Speech API has a feature called queuing: you can call the "speak" function multiple times in a row, and the API will queue the utterances and read them one after the other.

    // Queuing
    window.speechSynthesis.speak(utterance1)
    window.speechSynthesis.speak(utterance2)
    window.speechSynthesis.speak(utterance3)

These will all be read one after the other: when utterance1 completes, utterance2 follows. This feature will help us in our solution, where we need to queue an unknown number of utterances.
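Queuing an unknown number of utterances can be sketched with a loop. In the example below, queueAll is a hypothetical helper, and the speech objects are passed in as parameters (an assumption made here so the sketch also runs outside a browser).

```javascript
// Hypothetical helper: queues one utterance per string in `parts`.
// `synth` is expected to behave like window.speechSynthesis and
// `Utterance` like the SpeechSynthesisUtterance constructor.
function queueAll(parts, synth, Utterance) {
    const queued = []
    for (const part of parts) {
        const utterance = new Utterance(part) // a fresh utterance per part
        synth.speak(utterance)                // speak() adds it to the queue
        queued.push(utterance)
    }
    return queued
}
```

In a browser the call would look like queueAll(["First part.", "Second part."], window.speechSynthesis, SpeechSynthesisUtterance). Creating a new utterance for each part keeps every queued item independent of the others.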

Using a 2D array to queue the utterances.

If we had a known string that we needed to queue, this would be straightforward: we would just split it into chunks and call the "speak" function for each chunk. But you may not know how many words a string has, which can be anything from 1 to a thousand or more. To accommodate such cases we need a 2D array and a loop that counts the number of words in a given string.

First you need to predefine the length of each inner array, which is the number of words each chunk is going to contain (500 in this article). Then we work out how many inner arrays we need. For this we use the "Math.ceil" function, to make sure that we never get a length of 0 when the string has fewer words than our predefined number, and that there is always room for the leftover words.

    let text = "Your text"
    // Despite its name, numberOfWords holds the number of 500-word chunks
    let numberOfWords = Math.ceil(text.split(" ").length / 500)

This splits the string into an array of words and divides the number of words by 500, the number of words each inner array will contain. We then round the result up with "Math.ceil" to get an integer, which is a number with no decimal point. If the string has 1000 words, the code above returns 2, and that is the length of the 2D array we want to create.

Now that we have defined the length, we can loop up to that length and add empty arrays as placeholders to our 2D array, which is an array of arrays.
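A few worked values make the rounding easier to follow. The sketch below wraps the same calculation in a hypothetical helper, chunkCount, and uses a second hypothetical helper, nWords, to build strings of a known length.

```javascript
// Hypothetical helper mirroring the calculation above.
function chunkCount(text, wordsPerChunk) {
    return Math.ceil(text.split(" ").length / wordsPerChunk)
}

// Builds a string of n single-letter words for the demonstration.
const nWords = (n) => new Array(n).fill("a").join(" ")

console.log(chunkCount(nWords(1), 500))    // 1
console.log(chunkCount(nWords(500), 500))  // 1
console.log(chunkCount(nWords(501), 500))  // 2
console.log(chunkCount(nWords(1000), 500)) // 2
console.log(chunkCount("", 500))           // 1, because "".split(" ") is [""]
```

Note that even an empty string yields 1, so you may want to check for empty text separately before speaking.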

    var listOfSentences = [] // This is our array.
    for (var i = 0; i < numberOfWords; i++) {
        listOfSentences[i] = []
    }

listOfSentences is the array that will hold all the sentences for the utterances.
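The placeholder step can also be written more compactly. The sketch below uses a hypothetical helper, makePlaceholders, built on Array.from, and shows a common pitfall with fill that is worth avoiding.

```javascript
// Hypothetical helper: returns `count` separate empty arrays.
function makePlaceholders(count) {
    return Array.from({ length: count }, () => [])
}

const placeholders = makePlaceholders(2)
placeholders[0].push("hello")
console.log(placeholders[1].length) // 0, the second array is untouched

// Pitfall: fill([]) puts the SAME array in every slot.
const shared = new Array(2).fill([])
shared[0].push("hello")
console.log(shared[1].length) // 1, both slots point to one array
```

The explicit loop in the article and Array.from both create a new array per slot, which is what the 2D array needs.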

After adding all the empty arrays, which in the case of 1000 words is 2 empty arrays, we now need a loop within a loop to add the words to those two empty arrays.

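    let wordCount = 0 // A pointer to each word in the string.
    var words = text.split(" ") // Split once instead of on every iteration
    for (var x = 0; x < Math.ceil(words.length / 500); x++) {
        // Stop at the last word so no undefined entries are added
        for (var j = 0; j < 500 && wordCount < words.length; j++) {
            listOfSentences[x][j] = words[wordCount++]
        }
    }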

wordCount in the code above is what we use to get each word and also keep track of how many words we've added so far.

Now it's time to call the "speak" function, and we are going to do that inside the outer loop. As our 2D array has 2 arrays inside it, we need to create a queue of 2 utterances, and so on for longer strings. To recap the loop: we take each word from the string and put it into an array; once the number of words in one array reaches 500, we move to the next array, until we finish iterating through the string.
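The same grouping can be sketched without a separate pointer by slicing the word list. This is an alternative to the nested loop, not the approach used in the rest of the article, and chunkWords is a hypothetical helper name.

```javascript
// Hypothetical helper: splits text into arrays of at most `size` words.
function chunkWords(text, size) {
    const words = text.split(" ")
    const chunks = []
    for (let start = 0; start < words.length; start += size) {
        // slice() stops at the end of the array, so the last chunk may be shorter
        chunks.push(words.slice(start, start + size))
    }
    return chunks
}

const chunks = chunkWords("one two three four five", 2)
console.log(chunks.length)       // 3
console.log(chunks[2].join(" ")) // five
```

Because slice never reads past the end of the array, the last chunk holds only the leftover words.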

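    // Create a 2D array
    var words = text.split(" ")
    wordCount = 0 // Start again from the first word
    for (var x = 0; x < Math.ceil(words.length / 500); x++) {
        for (var j = 0; j < 500 && wordCount < words.length; j++) {
            listOfSentences[x][j] = words[wordCount++]
        }
        // This creates a queue of utterances: one new utterance per chunk
        var speaker = new SpeechSynthesisUtterance(listOfSentences[x].join(" "))
        window.speechSynthesis.speak(speaker)
    }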

Now our app is complete, and this should work on both mobile and desktop. Keep in mind that this is a fix for the word limit, not for browser support. Some browsers do not support speech synthesis, so make sure yours does before applying any fixes.
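Checking for support can be done with a simple feature test. The sketch below uses a hypothetical helper, supportsSpeechSynthesis, which takes the global object as a parameter (an assumption made here so it can also run outside a browser).

```javascript
// Hypothetical helper: true when the given global object exposes speech synthesis.
function supportsSpeechSynthesis(globalObject) {
    return Boolean(globalObject) &&
        "speechSynthesis" in globalObject &&
        "SpeechSynthesisUtterance" in globalObject
}

// In a browser you can pass `window`; globalThis keeps the snippet runnable elsewhere.
if (supportsSpeechSynthesis(globalThis)) {
    console.log("Speech synthesis is available.")
} else {
    console.log("Speech synthesis is not supported here.")
}
```

Running the check before building any utterances lets you fall back to plain text where the API is missing.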

Note: The number of words you add to each array, 500 in this case, is a trade-off: a larger number may slow down the start of your speech synthesis, but it also produces smoother transitions between sentences.

You can lower or increase the number of words as you wish to test this code.

The complete working code is in the pen below.
