A Lightweight Neural Network for Mobile Speaker Filtering using Spatial Information

Large streams of audio data facilitate research on everyday human behavior. A major challenge is to remove the speech of non-target persons who have not agreed to be recorded, and thereby to ensure data privacy.

In this thesis, we will develop a privacy-preserving speaker filtering system based on spatial information. In particular, we will make use of the interaural differences between the audio signals of two earbuds: the interaural time difference (ITD) describes the difference in arrival time between the two signals, and the interaural level difference (ILD) describes the difference in amplitude. These cues are also the psychoacoustic mechanism by which humans localize sound sources. The principle is already used in wearables for privacy-preserving voice commands [1] and for speech enhancement in phone calls [2]. The system will build on previous work on lightweight speech enhancement networks [2] as well as prior thesis work by TECO students.
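To make the two cues concrete, the following is a minimal sketch of how ITD and ILD could be estimated from a pair of earbud signals. It is only an illustration and not the method to be developed in the thesis: the function name `estimate_itd_ild`, the use of plain cross-correlation for the ITD, and the RMS-based level ratio for the ILD are assumptions made for this example.

```python
import numpy as np


def estimate_itd_ild(left, right, sample_rate):
    """Estimate ITD (seconds) and ILD (dB) for two equally long channels.

    Hypothetical helper for illustration only.
    A positive ITD means the left channel lags behind the right one.
    A negative ILD means the left channel is quieter than the right one.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)

    # ITD: lag (in samples) at which the cross-correlation peaks.
    corr = np.correlate(left, right, mode="full")
    lag = int(np.argmax(corr)) - (len(right) - 1)
    itd = lag / sample_rate

    # ILD: ratio of the RMS levels in decibels (eps avoids log of zero).
    eps = 1e-12
    rms_left = np.sqrt(np.mean(left ** 2))
    rms_right = np.sqrt(np.mean(right ** 2))
    ild = 20.0 * np.log10((rms_left + eps) / (rms_right + eps))

    return itd, ild


# Synthetic example: the left channel is the right channel delayed by
# 5 samples and attenuated to half the amplitude (assumed test signal).
rng = np.random.default_rng(0)
sample_rate = 16000
right = rng.standard_normal(1600)
left = 0.5 * np.concatenate([np.zeros(5), right[:-5]])

itd, ild = estimate_itd_ild(left, right, sample_rate)
print(f"ITD: {itd * 1e6:.1f} microseconds, ILD: {ild:.2f} dB")
```

In this synthetic case the estimated ITD corresponds to the 5-sample delay and the ILD is close to -6 dB. A speaker wearing the earbuds is roughly equidistant from both microphones, so both values would be expected to stay near zero for the wearer's own voice, whereas off-axis speakers typically produce larger differences.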

Core tasks for the student include adapting the model to our use case (data protection) and evaluating it on a self-recorded dataset. Hardware, such as recording equipment, will be provided.

Keywords: Earables; Audio Data Processing; Speaker Recognition; Machine Learning


Requirements:
Proactive and communicative work style (frequent updates, prepared meetings and ideas);
Python, C;
Machine Learning;
Audio Signal Processing (desired but not required);
Good English reading and writing skills;
Interest in interdisciplinary work.

Interested? Please contact: Tim Schneegans (


[1] Yan, Y., Yu, C., Shi, Y., & Xie, M. (2019, October). PrivateTalk: Activating Voice Input with Hand-On-Mouth Gesture Detected by Bluetooth Earphones. In Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology (pp. 1013-1020).

[2] Chatterjee, I., Kim, M., Jayaram, V., Gollakota, S., Kemelmacher-Shlizerman, I., Patel, S., & Seitz, S. M. (2022, June). ClearBuds: Wireless Binaural Earbuds for Learning-Based Speech Enhancement. In Proceedings of the 20th Annual International Conference on Mobile Systems, Applications and Services (pp. 384-396).