The aim of this thesis was to examine the feasibility of using automatic speech recognition (ASR) as a control parameter in the automation of live entertainment systems. In-depth research was conducted to identify the problems associated with speech recognition in real-world scenarios.
A survey of audio professionals was conducted to establish a knowledge base on speech recognition and its current applications in live entertainment. Experiments were then carried out to examine the performance of ASR in the suggested scenarios. Several types of interfering noise were incrementally mixed with the spoken dialog in order to measure each ASR system's word error rate (WER) under noise. Real-world simulations were conducted to test the impact of additional factors such as microphone placement and room reflections.
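The measurement described above can be illustrated with a minimal sketch. It assumes the standard definition of WER (word-level substitutions, deletions and insertions divided by the number of reference words) and assumes that each noise increment is expressed as a target signal-to-noise ratio in decibels. The abstract does not state how the noise levels were set or which scoring tool was used, so the function names `word_error_rate` and `mix_at_snr` are hypothetical and this is not necessarily the procedure followed in the experiments.

```python
import math


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] is the edit distance between the reference words consumed so far
    # and the first j hypothesis words (word-level Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that speech power / noise power equals snr_db, then add.

    Assumption: noise increments are specified as SNR values in dB.
    Both inputs are plain sequences of float samples at the same sample rate.
    """
    n = min(len(speech), len(noise))
    if n == 0:
        raise ValueError("speech and noise must both contain samples")
    speech, noise = speech[:n], noise[:n]
    speech_power = sum(s * s for s in speech) / n
    noise_power = sum(x * x for x in noise) / n
    if noise_power == 0:
        raise ValueError("noise signal is silent")
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * x for s, x in zip(speech, noise)]
```

In use, a recording would be mixed at a series of decreasing SNR values, passed to the recognizer under test, and the returned transcript scored against the script with `word_error_rate`. Note that WER can exceed 1.0 when the recognizer inserts many spurious words.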
The data collected from these experiments suggest that speech recognition is applicable as a control parameter for live entertainment technologies. However, the average error rates and false-positive results found across all tested candidate systems are too unpredictable to entrust total control of any system to speech recognition. Future improvements in recognition accuracy and noise suppression should, in time, allow speech recognition to become a powerful control interface technology in real-world use.