Dear colleagues, please see below for an opportunity to contribute to OCLC 
deduplication efforts specific to records for Hebraica. I hope you'll consider
participating.

Besorot tovot, Jasmin

From: Whitacre,Cynthia
Sent: Monday, October 23, 2023 9:53 AM
Subject: Identifying duplicate records using machine learning—non-Latin data 
labeling for WorldCat

As part of continuous quality improvement efforts, earlier this year we began 
implementation of our machine learning model to identify duplicate records in 
WorldCat. The initial phase focused on records for print books and e-books 
published in Latin-script languages; the results of these efforts have improved 
the cataloging, discovery, and interlibrary loan experiences for library staff 
and end users across the world.

In our next phase, we’re expanding the focus to records that include non-Latin 
script describing print books, e-books, audiobooks, journals, and videos 
published in Chinese, Japanese, Korean, Thai, Arabic, Hebrew, and Russian. We 
invite metadata experts in those languages to validate our machine learning 
model’s understanding of duplicates by participating in a data labeling 
exercise using a simple, intuitive, online interface. Are the two records 
functionally equivalent? Do they represent the same manifestation? Or are they 
truly different manifestations that just seem the same, except for some 
important differences that only you can spot?

The interface will remain open through 15 December 2023, at which time we will 
begin analyzing the collected data.

With your help, we can better scale the resolution of duplicate records in 
WorldCat, saving countless hours and improving the experience for the 
global library community. Thanks for all you do to advance the mission of 
libraries worldwide—we appreciate your ongoing collaboration!

Cynthia M. Whitacre (she/her)

OCLC · Senior Metadata Operations Manager, Membership & Research Division

6565 Kilgour Place, Dublin, Ohio 43017, United States

T +1-800-848-5878, ext. 6183

Direct: +1-614-764-6183

Heb-naco mailing list
