Firstly, hello! It's my first time posting and, perhaps predictably, it's a call for help with Unicode.

I've just checked the archive and seen this thread, and I am having a similar problem: a badly encoded character is causing a while loop over MARC::Batch->next to die with:

utf8 "\x87" does not map to Unicode at /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/Encode.pm line 173.
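
To show what the error message is complaining about at the byte level, here is a minimal sketch. It is in Python rather than Perl purely for illustration, it is not part of my script or of MARC::Batch, and the function name invalid_utf8_offsets is hypothetical. A lone 0x87 byte is a UTF-8 continuation byte with no lead byte before it, so a strict decoder rejects it:

```python
from typing import List, Tuple


def invalid_utf8_offsets(data: bytes) -> List[Tuple[int, bytes]]:
    """Return (offset, bad_bytes) pairs for every span that is not valid UTF-8.

    Hypothetical helper for illustration only; it scans raw bytes and knows
    nothing about MARC structure.
    """
    bad = []
    pos = 0
    while pos < len(data):
        try:
            # Strict decode of the remainder; succeeds if no more bad bytes.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as exc:
            # exc.start / exc.end are relative to the slice being decoded.
            start = pos + exc.start
            end = pos + exc.end
            bad.append((start, data[start:end]))
            pos = end  # resume scanning just past the bad span
    return bad


sample = b"abc\x87def"  # made-up bytes containing the same stray 0x87
print(invalid_utf8_offsets(sample))  # [(3, b'\x87')]

# A lenient decode substitutes U+FFFD for the bad byte instead of failing.
print(sample.decode("utf-8", errors="replace"))
```

Run over the raw bytes of a record read in binary mode, a scan like this should report the offset of each offending byte, which can then be matched against the record's directory to find the field concerned.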

I've tried pasting Al's modified decode subroutine and package into the script, but it still fails. One of the offending records is isolated and attached.

Any suggestions for further modifying the sub, or for alternatives to MARC::Batch->next, would be welcome. For the scope of the project, I'm limited to large batch files of MARC 21.

With thanks

Ed



--
Edmund Chamberlain
Systems Development Librarian   
Electronic Services and Systems
Cambridge University Library
West Road,
Cambridge
CB3 9DR

tel: (+44) 01223 747437
fax: (+44) 01223 333160

email: em...@cam.ac.uk

Try LibrarySearch at http://search.lib.cam.ac.uk - a new way to discover Cambridge Library Collections

Attachment: dodgy_char.mrc
Description: Binary data
