Michael - I'm just about to load ebook records into our Innovative catalog, 
and I'm going to keep the ebook records separate from the print book records.  For 
ebooks, I'm going to copy the OCLC number to the 901 with a prestamp, and 
overlay on that. So only records loaded with our ebook load table will have 
this 901 to overlay on.  Then I'm going to protect the 856s and the 710s for 
the ebook collection statement.  That'll take care of adds.  For deletes... I 
haven't got that worked out yet.  I think there's a way to delete a field based 
on the incoming field.

Cindy Harper
Virginia Theological Seminary
char...@vts.edu

-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Andy 
Kohler
Sent: Thursday, August 15, 2013 2:29 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] De-dup MARC Ebook records

Are you expecting to work with two files of records, outside of your ILS?
If so, for a project like that I'd probably write Perl script(s) using 
MARC::Record (there are similar code libraries for Ruby, Python and Java at 
least).

For each record in each file, use the ISBN (and/or OCLC number and/or LCCN) as 
a key.  Compare all sets, and keep one record per key.

This assumes that the vendors are supplying records with standard identifiers, 
and not just their own record numbers.
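
A minimal sketch of that keyed comparison, in Python rather than Perl. It 
assumes the records have already been read with a MARC library and reduced to 
plain dicts; the dict shape ("id", "isbns", "oclc") and the function names are 
hypothetical, made up for illustration. ISBN-10s are converted to ISBN-13 so 
that the two forms of the same ISBN produce the same key.

```python
import re

def normalize_isbn(raw):
    """Return a 13-digit ISBN string, or None if raw holds no usable ISBN."""
    # Keep only the leading run of ISBN characters; an 020 $a often has a
    # qualifier such as "(electronic bk.)" after the number.
    match = re.match(r"[0-9Xx-]+", raw.strip())
    if not match:
        return None
    isbn = match.group(0).replace("-", "").upper()
    if len(isbn) == 13 and isbn.isdigit():
        return isbn
    if len(isbn) == 10 and isbn[:9].isdigit():
        # ISBN-10 to ISBN-13: prefix 978, drop the old check digit,
        # recompute the check digit with alternating weights 1 and 3.
        core = "978" + isbn[:9]
        total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
        return core + str((10 - total % 10) % 10)
    return None

def normalize_oclc(raw):
    """Strip prefixes like (OCoLC), ocm, ocn and leading zeros; None if empty."""
    digits = re.sub(r"\D", "", raw).lstrip("0")
    return digits or None

def record_keys(record):
    """Collect every normalized identifier of a record as (type, value) pairs."""
    keys = set()
    for raw in record.get("isbns", []):
        isbn = normalize_isbn(raw)
        if isbn:
            keys.add(("isbn", isbn))
    oclc = normalize_oclc(record.get("oclc", ""))
    if oclc:
        keys.add(("oclc", oclc))
    return keys

def dedupe(records):
    """Keep the first record seen for each key; return (kept, dropped)."""
    seen = set()
    kept, dropped = [], []
    for record in records:
        keys = record_keys(record)
        if keys & seen:
            dropped.append(record)
        else:
            # Records with no usable identifier cannot be matched, so keep them.
            kept.append(record)
        # Remember the keys of dropped records too, so chained matches are caught.
        seen |= keys
    return kept, dropped

# Hypothetical sample data standing in for the two vendor files.
ebrary = [{"id": "ebr1", "isbns": ["0-306-40615-2 (electronic bk.)"]}]
ebsco = [
    {"id": "ebs1", "isbns": ["9780306406157"]},
    {"id": "ebs2", "isbns": [], "oclc": "(OCoLC)ocn123456789"},
]
kept, dropped = dedupe(ebrary + ebsco)
```

Whichever file is listed first wins, so put the vendor whose records you 
prefer at the front.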

If you're comparing each file with what's already in your ILS, then it'll 
depend on the tools the ILS offers for matching incoming records to the 
database.  Or, export the database and compare it with the files, as above.

Andy Kohler / UCLA Library Info Tech
akoh...@library.ucla.edu / 310 206-8312

On Thu, Aug 15, 2013 at 10:11 AM, Michael Beccaria <mbecca...@paulsmiths.edu> wrote:

> Has anyone had any luck finding a good way to de-duplicate MARC 
> records from ebook vendors? We're looking to integrate the Ebrary and 
> Ebsco Academic Ebook collections, and they estimate an overlap in the tens 
> of thousands.
>
>
