On 20 August 2012 16:35, Sarah Breau <[email protected]> wrote:
>> I am not too sure about saying ISBNs for 'rejected' item types
>> should/could (not) be added to 'accepted' item types. If possible, it
>> would be nice if an ISBN only points a user to the item it was
>> attached to, not a related item. It is not possible (AFAIK) to explain
>> which ISBN is for what. On the other hand, it's sometimes hard to see
>> what an ISBN on an item identifies*, so you can't always tell whether
>> a mistake was made or a related ISBN was added.
>
> Since the motto of OL is one web page for every book, I separate out records
> that contain multiple formats. So if I am working on a record and it has
> more than one ISBN, I make a new record and move the paperback version over
> there. Sometimes they have different covers, so to me they are different
> books.
>
> The bigger question here is where did all this bad information come from? It
> strikes me as sub-optimal to import a huge amount of data automatically and
> then have humans painstakingly sort through it and discard the non-book
> items one by one. And that's the best-case scenario: at this point, the
> human workers don't even have this ability. As a user, I sometimes get
> frustrated with the amount of disorderly information in OL, especially since
> as a user I don't have the tools to clean it up. I think I would spend more
> time on the database if a) I could make meaningful changes (like removing
> non-book items or merging duplicate records), and b) I didn't feel like
> somewhere around half of the records are duplicates (why bother fixing a
> record when it has twins out there that are just as incomplete?).
>
> Sarah

Hi Sarah,

As far as I can tell, bad data was imported from bad library records.
It seems many libraries have errors in their records, ranging from bad
data (e.g. physical format ":" and Dewey Decimal Code "B" at the
Library of Congress) to bad structure (e.g. missing separators in MARC
records).
It also seems that, on import, typical MARC markup like "[Springfield,
Va]" was not normalized to "Springfield, VA". I have been working on
some automated vacuuming, but VacuumBot can only handle simple cases.
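To make the place-name point concrete, here is a rough sketch of that
kind of clean-up. The bracket-stripping and state-abbreviation rules
below are my own illustrative assumptions, not VacuumBot's actual logic:

```python
import re

# Hypothetical MARC place-name clean-up; the rules here are
# illustrative assumptions, not VacuumBot's actual behaviour.
def clean_place(raw):
    """Strip MARC-style brackets and upper-case a trailing state code."""
    place = raw.strip("[]")  # "[Springfield, Va]" -> "Springfield, Va"
    # Upper-case a two-letter state abbreviation at the end, if present.
    place = re.sub(r",\s*([A-Za-z]{2})\.?$",
                   lambda m: ", " + m.group(1).upper(),
                   place)
    return place

print(clean_place("[Springfield, Va]"))  # Springfield, VA
```

The hard part in practice is not the string handling but deciding which
bracketed values are MARC conventions and which are real data.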

I agree that users need more options for handling duplicates. But I am
afraid the effort will have to come from users: I would love to try
automatic duplicate detection on the OL records, but I have no
experience with it yet (beyond having MySQL find duplicate work
titles), and other work keeps me busy.
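For what it's worth, the duplicate-title check I mean is roughly the
following (a toy sketch in Python rather than MySQL; the normalization
rules are my own assumptions and the record IDs are made up):

```python
from collections import defaultdict

def normalize(title):
    """Crude normalization: lower-case, drop punctuation, collapse spaces."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in title)
    return " ".join(cleaned.split())

def find_duplicate_titles(records):
    """records: iterable of (record_id, title) pairs.
    Returns groups of record IDs that share a normalized title."""
    groups = defaultdict(list)
    for record_id, title in records:
        groups[normalize(title)].append(record_id)
    return [ids for ids in groups.values() if len(ids) > 1]

sample = [
    ("OL1W", "The Hobbit"),
    ("OL2W", "the hobbit."),
    ("OL3W", "The Silmarillion"),
]
print(find_duplicate_titles(sample))  # [['OL1W', 'OL2W']]
```

Exact-match grouping like this only catches the easy cases, of course;
real duplicate detection would also need fuzzy matching and a look at
authors and dates before anything gets merged.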

Ben
_______________________________________________
Ol-discuss mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
To unsubscribe from this mailing list, send email to 
[email protected]