Designing New Speech Interfaces

(Vetenskapsrådet and STINT, McMillan, 2017-2020)

This project focuses on understanding non-system-directed audio (such as ordinary conversation and ambient environmental audio), and on understanding not only what the user says to the system but HOW they say it, in order to design new user interfaces with more interesting interactions. While this audio is much less constrained than dialogic speech directed at a device, and is consequently extremely challenging to recognise and model, it offers a rich potential resource for system input and human-computer interaction that has been almost entirely neglected. Indeed, while speech has been an active area of computing research, human-computer interaction research on speech has received much less attention, potentially limiting both the applications of speech and the research understanding of how speech systems can fit with user activity and system use. We hope to open up new opportunities for human-computer interaction using audio detection and processing.