Voice Diversity Crisis
Understanding users with nonnative accents and voice user interfaces.
Voice user interfaces (VUIs) are becoming more common in today’s technology as they streamline many user experiences. Major tech companies — Amazon, Apple, Google, and Microsoft — have invested in voice-enabled AI assistants, and users of all ages now use voice to get the latest news, play games, and control smart home devices.
In recent studies of automated speech recognition (ASR) systems, which use machine-learning models to convert spoken language to text, a recurring question has been whether these tools work equally well for all subgroups of users. This independent study examines the challenges users with nonnative accents face when interacting with voice assistants.
It also argues for strategies built on more diverse datasets, including regional and nonnative accents and interlanguage, and for the voice design community to invest resources in collecting data from many varieties of English, so that these disparities can be reduced.
To explore these performance gaps with bilingual users, I conducted a qualitative study with users of voice-enabled devices. Ten participants who actively interact with Alexa (Amazon), Siri (Apple), and Google Assistant took part in the research. They identified as Caucasian American, African American, Asian American, Brazilian, Slovenian, Asian, Latino, and South Asian.
The interviews were conducted remotely across the United States, with participants in California, Illinois, New York, Oregon, Texas, and Washington. All findings are based on these interviews and focus on everyday interactions with voice assistants: the commands users give to smart devices and the interactions that don’t go well.