For the life of me I can't figure out how to read and write UTF-8 with MARC::Batch.

I have a UTF-8 encoded file of MARC records. Dumping the records and grepping for a particular string shows that the file is valid:

  $ marcdump und.marc | grep Sainte-Face
  und.marc
  1000 records
  2000 records
  3000 records
  4000 records
  5000 records
  6000 records
  7000 records
  8000 records
  9000 records
  10000 records
  11000 records
  12000 records
  245 00 _aAnnales de l'Archiconfrérie de la Sainte-Face
  610 20 _aArchiconfrérie de la Sainte-Face
  13000 records
  $

I then run a Perl script that simply reads each record and dumps it to STDOUT. Notice how I define both my input and output as UTF-8:

  #!/shared/perl/current/bin/perl

  # configure
  use constant MARC => './und.marc';

  # require
  use strict;
  use MARC::Batch;

  # initialize
  binmode ( MARC, ":utf8" );
  my $batch = MARC::Batch->new( 'USMARC', MARC );
  $batch->strict_off;
  $batch->warnings_off;
  binmode( STDOUT, ":utf8" );

  # read & write
  while ( my $marc = $batch->next ) { print $marc->as_usmarc }

  # done
  exit;

But my output is munged:

  $ ./marc.pl > und.mrc
  $ marcdump und.mrc | grep Sainte-Face
  und.mrc
  1000 records
  2000 records
  3000 records
  4000 records
  5000 records
  6000 records
  7000 records
  8000 records
  9000 records
  10000 records
  11000 records
  12000 records
  245 00 _aAnnales de l'Archiconfrérie de la Sainte-Face
  610 _aArchiconfrérie de la Sainte-Face
  13000 records
  $

What am I doing wrong!?

-- 
Eric Lease Morgan
University of Notre Dame
574/631-8604