Easy listening – access to oral history collections
Dr. Willemijn Heeren – Prof.dr. Franciska de Jong
University of Twente, Human Media Interaction Group
Abstract
Over the past century, millions of hours of spoken audio have been recorded, with great potential for research, education and new creative audiovisual productions. The actual (re-)use of these collections, however, is severely hindered by limited access to them. The main cause is that annotations are too coarse: they describe whole programs or tapes (i.e. durations of up to hours) rather than user-manageable chunks, such as fragments of a few minutes or less. The HMI speech group works on improving access to multimedia archives through the application of speech and language technology (SLT); within the project CHoral-access to oral history, the focus lies on improved disclosure of spoken heritage collections. In this presentation we give an overview of how SLT is being used to generate time-stamped, multi-layered annotations that support fragment retrieval from audiovisual heritage collections, how it can be used to enrich data from spoken word archives, and how the user experience may be enhanced accordingly, both for professionals and for the general public.
Short Bio (W.F.L. Heeren)
Willemijn Heeren obtained her M.A. in Linguistics cum laude from Leiden University in 2001, and in 2006 she received her PhD from Utrecht University for a dissertation on the perceptual development of speech sounds in both child and adult listeners. Since March 2006 she has been employed as a postdoctoral researcher at the University of Twente on the NWO-funded project CHoral-access to oral history, which is part of the CATCH program. This project aims to develop spoken document retrieval technology for the disclosure of spoken word collections from the cultural heritage domain. Willemijn works on the development of user interfaces for access to spoken word documents and archives, and her main research interests are human-computer interaction and speech perception.
Short Bio (F.M.G. de Jong)
Franciska de Jong has been full professor of language technology at the University of Twente since 1992. She is also affiliated with the Erasmus University in Rotterdam, where she is managing director of the Erasmus Studio. She has a background in theoretical linguistics and started working on language technology in 1985 at Philips Research, where she worked on machine translation. Currently, her main research interests are multimedia indexing, semantic access, cross-language retrieval and the disclosure of cultural heritage collections (in particular spoken word audio archives). She is frequently involved in international program committees, expert groups and review panels, and has initiated a number of EU projects. She is project leader of the MultimediaN project on semantic multimedia access (2004-2008) and principal investigator of the CATCH project CHoral. She chairs the steering committee of the NWO IMIX project on multimodal information extraction, and since 2004 she has been a member of the board of the NWO Research Council for the Humanities.