This week we are continuing our theme of explaining the different sub-categories of Voice Technology. If you have not yet read our blog on the “Sonic Internet”, it covers the same theme. Today we are looking at “Smart Audio.” Is this phrase simply more tech jargon, or is there a level of differentiation that allows Smart Audio to carve out its own distinct category in the space of Voice Technology? We’ll also look ahead to see where Smart Audio is going for businesses and consumers.

Smart Audio Definition

You may have guessed that Vocool’s position would be that Smart Audio is far more than tech jargon. It is special in its own right for a couple of reasons: optimization and personalization.

Smart Audio is specifically optimized for its listeners. The “Smart” portion of the phrase points to generating what a user wants, with high, verifiable accuracy, using an algorithmic approach. The smart part is not the playback itself, but the learning of the user’s preferences. It is optimized to serve a user only things that are appropriate. Nothing out of left field will show up in your Smart Audio feed. If you hate baseball, a smart algorithm will not serve baseball information. (Don’t take offense, baseball fans, it’s just an example.)

This leads to personalization. Smart Audio can both anticipate what you’d like to hear and, when you’ve finished, generate more of what you want to hear without being asked. Spotify’s Autoplay feature is a great example. When you finish an album or playlist, it predicts what you’d like to hear next. The algorithms are not perfect, but the more you use them, the better they get. This doesn’t carry over between people, though. If your best friend listened to your feed, they might like some of it, but other parts would be totally irrelevant. Every person’s feed is different.

Today we have only scratched the surface of what Smart Audio can do for your listening experiences. But imagine a day where you heard everything you wanted, without guiding the player at all. You got all your news, all the updates on your personal interests, and your favorite podcasts, and when you finished, it seamlessly continued playing a “discover weekly” of new music and podcasts you would like. That’s sonic heaven.

Let’s move on to the differences.

How is it different from Voice Tech?

Voice Technology can be an all-encompassing phrase for all of audio, voice, or smart assistants. More specifically, though, it refers to speech-to-text and Natural Language Understanding (NLU), which take what you say to your smart speaker or assistant and turn it into decipherable text. Those assistants then take that text and attempt to match it to specific “intents,” which interpret the request and return an appropriate response.
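To make the idea of matching text to “intents” concrete, here is a minimal sketch of what happens once speech has been turned into text. It is only an illustration: the intent names and keyword lists are hypothetical, and real assistants rely on trained NLU models rather than simple keyword overlap.

```python
import re

# Hypothetical intents and trigger keywords, invented for this illustration.
INTENTS = {
    "play_news": {"news", "headlines"},
    "play_music": {"music", "song", "playlist"},
    "get_weather": {"weather", "forecast", "rain"},
}


def match_intent(transcript: str) -> str:
    """Return the intent whose keywords overlap most with the transcript."""
    # Lowercase the text and split it into simple word tokens.
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    best_intent, best_score = "unknown", 0
    for intent, keywords in INTENTS.items():
        score = len(words & keywords)  # number of shared keywords
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent


print(match_intent("What's the weather forecast today?"))
print(match_intent("Give me the headlines"))
```

A request that shares no keywords with any intent falls back to "unknown", which is roughly when an assistant tells you it did not understand.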

An easy way to think about it is to consider Voice Tech as the “input”: hearing what you say and interpreting it. Smart Audio would be considered the “output”: playing a response that is appropriate. But that is only a rough analogy. Vocool uses the term Smart Audio to refer to a generated, consumable product rather than a simple output. If you ask Alexa or Google to tell you the answer to a math problem, they will, but Smart Audio takes it a step further. It’s not just a simple call and response; it’s the next level. If you ask for the news and it plays Reuters when you prefer NPR, it will learn that. In the future, it could also learn that you care more about financial news than world events, hold on to that preference, and prioritize financial news.
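The Reuters-versus-NPR example can be sketched in a few lines. This is a toy model under stated assumptions, not how any particular assistant works: the class name, the scoring rule (listening to more than half of an item counts in a source’s favor), and the sample data are all hypothetical.

```python
from collections import defaultdict


class SourcePreferences:
    """Toy preference tracker: learns which sources a listener sticks with."""

    def __init__(self):
        self.scores = defaultdict(float)

    def record(self, source: str, listened_fraction: float) -> None:
        # Assumption: finishing most of an item is a positive signal,
        # skipping early is a negative one.
        self.scores[source] += listened_fraction - 0.5

    def ranked(self, candidates):
        # Highest-scoring sources first; unseen sources score 0.
        return sorted(candidates, key=lambda s: self.scores.get(s, 0.0), reverse=True)


prefs = SourcePreferences()
prefs.record("Reuters", 0.1)  # skipped almost immediately
prefs.record("NPR", 1.0)      # listened to the end
print(prefs.ranked(["Reuters", "NPR"]))
```

The same scoring idea could be applied to topics instead of sources, which is how a preference for financial news over world events might be held and prioritized.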

Right now, Smart Audio-type responses can be trained with Alexa and Google Assistant using “routines.” And soon Siri is joining the fray with “Shortcuts.” These features grow regularly, but it takes a lot of effort to get them working just how you would like. They are still a bit clunky, but getting better.

Future of Smart Audio


Smart Audio has the ability to drastically change the internal processes of businesses. Consider setting up alerts for when sales fall out of the “normal range” so you don’t have to check in as often. This would allow a business leader to concentrate on projects that grow the business instead of worrying about the day-to-day, especially when there’s nothing to worry about. Regular reporting could be more engaging. Because audio is a far more natural experience, it could also improve comprehension and preparation before and after meetings.
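As a rough sketch of the sales alert idea, one simple way to define a “normal range” is the recent average plus or minus a couple of standard deviations. That definition, the function names, and the sample figures below are assumptions made for illustration; a real business would choose its own thresholds.

```python
from statistics import mean, stdev


def out_of_normal_range(history, today, k=2.0):
    """True if today's sales are more than k standard deviations from the mean."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation, needs at least 2 values
    return abs(today - mu) > k * sigma


def spoken_alert(history, today, k=2.0):
    """Return a short sentence to read aloud, or None when nothing is unusual."""
    if not out_of_normal_range(history, today, k):
        return None
    direction = "above" if today > mean(history) else "below"
    return f"Heads up: today's sales of {today} are well {direction} the normal range."


recent_sales = [100, 105, 98, 102, 95]  # hypothetical daily sales figures
print(spoken_alert(recent_sales, 80))
print(spoken_alert(recent_sales, 101))
```

Returning nothing on a normal day is the point: the listener only hears about sales when there is something to hear.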


Our digital lives are becoming more personalized by the day. Smart Audio has the ability to make this even more efficient. It could take something you like and help you find more content like it. Imagine a radio station designed specifically around you. It plays not just music, but also the news when you want it and sports updates, and it reads notifications in between content. Imagine a day when you’ve already heard that new podcast before your friend recommends it, because the Smart Audio algorithms knew it suited your interests before your friend did. All the while, it seamlessly lets you know that your dinner plans have changed.