Hi,
I wrote a script that extracts MARC records from a file when they meet certain
conditions and writes them to a new file. When the input records are correctly
encoded in UTF-8 and I run the script from the Windows command prompt, this
warning appears: "Wide character in print at record_extraction.pl line 99"
(line 99 is where I print to the new file using as_usmarc). I compared an
extracted record before and after in MarcEdit, and the diacritic had changed.
I then ran marcdump newfile.mrc to see what would happen and got this error:
"utf8 \xF4 does not map to Unicode at C:/Perl64/lib/Encode.pm line 176." When I
run the extraction script on MARC-8 encoded data, I don't have the same
problem.
The basic outline of my script is:
my $batch = MARC::Batch->new('USMARC', $input_file);
while (my $record = $batch->next()) {
    # do some checks
    # if checks ok then
    print FILE $record->as_usmarc();
}
Do I need to add something that specifies the data should be interpreted as
UTF-8? Does MARC::Record not handle UTF-8 at all?
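
To make the question more concrete, here is a small sketch of what I suspect is
happening. This is only my guess and I have not confirmed it against
MARC::Record itself: the string "Rh\x{F4}ne" is a made-up stand-in for what
as_usmarc() returns, and bytes_written() is a hypothetical helper that writes
to an in-memory file so the resulting bytes can be inspected. It uses core Perl
only.

```perl
use strict;
use warnings;

# Assumption: stand-in for $record->as_usmarc(), i.e. a decoded Perl string
# containing U+00F4 (o with circumflex), the character named in the
# marcdump error.
my $marc_text = "Rh\x{F4}ne";

# Hypothetical helper: write the string to an in-memory file, optionally
# through an I/O layer, and return the raw bytes that were written.
sub bytes_written {
    my ($layer) = @_;
    my $buffer = '';
    open(my $fh, ">$layer", \$buffer)
        or die "Cannot open in-memory file: $!";
    print {$fh} $marc_text;
    close($fh) or die "Cannot close in-memory file: $!";
    return $buffer;
}

my $without_layer = bytes_written('');                  # like my current print
my $with_layer    = bytes_written(':encoding(UTF-8)');  # with an encoding layer

printf "no layer:   %s\n", unpack('H*', $without_layer);  # 5268f46e65
printf "utf8 layer: %s\n", unpack('H*', $with_layer);     # 5268c3b46e65
```

Without a layer the character is written as the single byte F4, which is not
valid UTF-8 on its own; with the layer it is written as C3 B4. If my guess is
right, I assume the equivalent in my script would be opening FILE with
'>:encoding(UTF-8)' or calling binmode(FILE, ':encoding(UTF-8)') before the
loop, but I would appreciate confirmation that this is the right approach.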
Thanks,
Shelley
----
Shelley Doljack
E-Resources Metadata Librarian
Metadata and Library Systems
Stanford University Libraries
[email protected]
650-725-0167