Thanks for the info (I'm unable to choose font Arial Unicode MS so I
still can't display the data properly in my browser, but that's not a
problem).
Here's a small script that reproduces the problem I'm interested in:
use MARC::Charset qw(marc8_to_utf8);
binmode STDOUT, ":utf8";
my $marc8
Hi,
I'm using marc8_to_utf8() on Library of Congress data. I'm finding
that I get occasional null characters inserted in the output text, and
I'm wondering what this means.
An example is the author (personal name) of the book that can be found
at http://catalog.loc.gov/ by searching for ISBN 5040
Edward Summers wrote:
> Perhaps when you are writing out your data you aren't preparing the
> filehandle for utf8?
- you're right (as printing to STDOUT and using binmode() shows),
however I got confused because I'm actually trying to save the
transcoded data in a Berkeley-db file (via DB_File.pm
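For the DB_File case, one possible approach (a sketch only, not necessarily what was done in this thread) is to install DBM filters so that values are encoded to UTF-8 bytes on store and decoded back to characters on fetch. The temporary file path, the hash key and the sample name below are made up for illustration.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl qw(O_RDWR O_CREAT);
use Encode qw(encode decode);
use File::Temp qw(tempdir);

# Hypothetical scratch database, removed when the script exits.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/example.db";

my %h;
my $db = tie %h, 'DB_File', $file, O_RDWR | O_CREAT, 0644, $DB_HASH
    or die "Cannot open $file: $!";

# Berkeley DB stores bytes, so convert characters to UTF-8 bytes on the
# way in and back to characters on the way out.
$db->filter_store_value(sub { $_ = encode('UTF-8', $_) });
$db->filter_fetch_value(sub { $_ = decode('UTF-8', $_) });

# Illustrative value containing U+00F8 (o with stroke).
my $name = "J\x{F8}rgensen";
$h{author} = $name;
my $back = $h{author};

printf "round trip ok: %s\n", ($back eq $name ? 'yes' : 'no');

undef $db;
untie %h;
```

Without such filters the value is written using whatever internal byte representation Perl happens to hold, which may not be UTF-8.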
Hi,
I'm using MARC::Charset::marc8_to_utf8() v0.95 to transcode some
Library of Congress data to utf8, however I'm finding a problem with
character 'ø' (hex 0xB2 - lowercase Scandinavian o / Latin small
letter o with stroke); this character is transcoding to 0xF8 - which is
not valid utf8 - when i
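
A possible explanation, offered as an assumption rather than a diagnosis: 0xF8 is the Unicode code point of 'ø' (U+00F8), so a Perl character string holding it is correct internally, and the two-byte UTF-8 form only appears once the string is encoded on output. The sketch below uses the core Encode module; the variable names are illustrative and it does not call MARC::Charset at all.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One character whose ordinal is 0xF8 (LATIN SMALL LETTER O WITH STROKE).
my $chars = "\x{F8}";

# Encoding to UTF-8 yields the byte sequence 0xC3 0xB8.
my $bytes = encode('UTF-8', $chars);

printf "characters: %d, bytes: %d\n", length($chars), length($bytes);
printf "hex: %s\n",
    join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
```

If the data is written to a filehandle without a ":utf8" layer, or to a byte-oriented store, the single byte 0xF8 may be what ends up on disk.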