OLAC is attempting a project of this sort for film and video credits. We are trying to teach a computer to recognize the names and roles that appear in 245$c, 260+$b, 508 and 511 (and, if we get really brave, maybe 505) and to connect them to the correct 1xx/7xx if present. The current program, which uses natural language processing (NLP) techniques, is reasonably successful with personal names and with roles given in English, and we are working on building a multilingual vocabulary. The program still tends to choke on complicated statements that involve a lot of corporate bodies.
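As a rough illustration of the task described above (not the OLAC program itself, which uses NLP techniques), here is a minimal rule-based sketch in Python. It pulls role/name pairs out of an English statement of responsibility and tries to link each name to a 1xx/7xx-style heading. The role vocabulary, the function names, and the sample data are all hypothetical and exist only for this example; real credits are far messier.

```python
import re

# Hypothetical mini-vocabulary mapping credit phrases to relator terms.
# A real vocabulary would be much larger and multilingual.
ROLE_PHRASES = {
    "directed by": "director",
    "produced by": "producer",
    "written by": "screenwriter",
    "music by": "composer",
}


def parse_credits(statement):
    """Split a statement of responsibility into (role, name) pairs."""
    pairs = []
    # ISBD punctuation separates different functions with " ; ".
    for segment in statement.split(" ; "):
        # Naive cleanup: this would also strip the period from "Jr."
        segment = segment.strip().rstrip(".")
        lowered = segment.lower()
        for phrase, role in ROLE_PHRASES.items():
            if lowered.startswith(phrase):
                names = segment[len(phrase):].strip()
                # Several names may be joined by commas or "and".
                for name in re.split(r",\s*|\s+and\s+", names):
                    if name:
                        pairs.append((role, name.strip()))
                break
    return pairs


def heading_key(heading):
    """Turn 'Surname, Forename, dates' into a 'forename surname' key."""
    parts = [p.strip() for p in heading.split(",")]
    if len(parts) >= 2:
        return f"{parts[1]} {parts[0]}".lower()
    return parts[0].lower()


def link_to_headings(pairs, headings):
    """Attach the matching heading (or None) to each (role, name) pair."""
    index = {heading_key(h): h for h in headings}
    return [(role, name, index.get(name.lower())) for role, name in pairs]
```

Calling `parse_credits("directed by John Smith ; produced by Jane Doe and Alan Roe.")` and passing the result to `link_to_headings` with the headings `["Smith, John, 1950-", "Doe, Jane"]` links the first two names and leaves Alan Roe unmatched. Unmatched or unparsed credits are exactly the kind of statement that would need human intervention. Corporate bodies, non-English roles, and names that do not invert cleanly are deliberately out of scope for this sketch.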
There are a lot of enhancements that can be made to this process, but it will never be 100% accurate. I do think it will be good enough to be useful and hopefully also effective at identifying statements of responsibility that need human intervention.

As part of this process, we are hand-annotating a large pool of credits with the correct answers. This will enable us to assess the effectiveness of different strategies and may be useful for machine learning. It would be wonderful if you could help us out, especially if you are able to translate credits from other languages into English. I am challenging people to annotate ten credits per week for six weeks. This should take less than ten minutes per week. Go to http://olac-annotator.org to get started.

(If you want to do non-English credits, please email me off-list as it is probably more effective for me to send you a list each week. The credits are separated by language of the film, but all the files have lots of English language credits from notes. We have lots of languages from Arabic to Urdu.)

I have also started a list to discuss problems with interpreting credits, which you're welcome to join. It's at https://lists.uoregon.edu/mailman/listinfo/olac-credits. Please feel free to share this information with anyone else who might be interested in contributing.

Kelley

On Mon, Nov 25, 2013 at 1:54 PM, Benjamin A Abrahamse <babra...@mit.edu> wrote:

I would be very curious to know if anyone with a systems background has thought about ways to batch-apply relators to existing records. Perhaps by making use of existing statements of responsibility? It seems to me given the number of pre-RDA records out there that no one will ever have time and/or money to update them manually.