Hey, that’s my post! Anyways, I haven’t really looked into what your problem is, but when you said that the copyright character is getting transformed to A9 even though it is supposedly stored as C2 A9 in the database, it made me think of how some characters in certain parts of Unicode can be represented in more than one way (a precomposed character versus a base character plus a combining mark), so the same character can end up as different UTF-8 bytes. I wonder if that is somehow happening for you.
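
A minimal sketch of that distinction, assuming a stock Perl with the core Encode and Unicode::Normalize modules (the helper hex_bytes and the variable names are made up for illustration). It compares the bytes for the copyright sign with those for a character that really does have two forms:

```perl
use strict;
use warnings;
use Encode qw(encode);
use Unicode::Normalize qw(NFC NFD);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
}

my $copyright = "\x{A9}";    # U+00A9 COPYRIGHT SIGN
my $e_acute   = "\x{E9}";    # U+00E9, precomposed e with acute accent

# The copyright sign in two encodings, and after decomposition.
my $copy_utf8   = hex_bytes(encode('UTF-8',      $copyright));
my $copy_latin1 = hex_bytes(encode('ISO-8859-1', $copyright));
my $copy_nfd    = hex_bytes(encode('UTF-8',      NFD($copyright)));

# A character with both a composed and a decomposed form.
my $e_nfc = hex_bytes(encode('UTF-8', NFC($e_acute)));
my $e_nfd = hex_bytes(encode('UTF-8', NFD($e_acute)));

print "copyright, UTF-8:      $copy_utf8\n";
print "copyright, ISO-8859-1: $copy_latin1\n";
print "copyright, NFD, UTF-8: $copy_nfd\n";
print "e-acute, NFC, UTF-8:   $e_nfc\n";
print "e-acute, NFD, UTF-8:   $e_nfd\n";
```

In this sketch the e-acute comes out as different bytes under NFC and NFD, but the copyright sign has no decomposition and stays C2 A9 either way. A lone A9 matches the ISO-8859-1 byte for the same character, so that may be worth checking too.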

Shelley

Shelley Doljack
Discovery Metadata Librarian
Metadata Dept., Lathrop Library
Stanford University Libraries
650-725-0167
sdolj...@stanford.edu

From: Highsmith, Anne L [mailto:hism...@library.tamu.edu]
Sent: Friday, November 13, 2015 2:05 PM
To: perl4lib@perl.org
Subject: Opening & writing to UTF-8 files; copyright symbol again -- solution

I should probably say, “apparent solution” ’cause character set issues never seem to end. However, combining Jon Gorman’s recommendation with some Googling, I get:

    my $outfile='4788022.edited.bib';
    open (my $output_marc, '>', $outfile) or die "Couldn't open file $!" ;
    binmode($output_marc, ':utf8');

The open statement may not be quite correct, as I am not familiar with the more current techniques for opening file handles that Jon mentioned. However, when I use those instructions to open the output file rather than what I had before, the copyright symbol does indeed come across as C2 A9 as it was in the original record. I didn’t want to use the utf8, because I’ve tried that before and ended up with double-encoding (and a real mess). But I’ll continue testing.

The results of the googling I referred to can be found at:
https://groups.google.com/forum/#!topic/perl.perl4lib/sy7hqiBQ1yM

Anne L. Highsmith
Director, Consortia Systems
TAMU Libraries
5000 TAMU
College Station, TX 77843-5000
979 862 4234
hism...@tamu.edu
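
The open-then-binmode pattern in the quoted message can also be written with the encoding layer named in the open call itself. This is a minimal sketch and not the poster's actual script: the helper write_utf8_and_read_bytes, the temporary file, and the sample string are all assumptions for illustration. It uses :encoding(UTF-8), the strict layer, where the original uses :utf8; for writing already-decoded text both should produce the same bytes.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write a decoded character string to $path as UTF-8,
# then return the raw bytes that ended up in the file.
sub write_utf8_and_read_bytes {
    my ($path, $text) = @_;

    # The layer is part of the open mode, so no separate binmode is needed.
    open(my $out, '>:encoding(UTF-8)', $path)
        or die "Couldn't open $path for writing: $!";
    print {$out} $text;
    close($out) or die "Couldn't close $path: $!";

    # Read back with no decoding layer so the bytes can be inspected.
    open(my $in, '<:raw', $path)
        or die "Couldn't open $path for reading: $!";
    local $/;    # slurp the whole file
    my $bytes = <$in>;
    close($in);
    return $bytes;
}

# Scratch file standing in for the real output file.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

# "\x{A9}" is the copyright sign as a decoded character, not as UTF-8 bytes.
my $bytes = write_utf8_and_read_bytes($path, "\x{A9} 2015");
my $hex   = join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
print "$hex\n";
```

If the string handed to print were already UTF-8 encoded bytes (C2 A9) rather than decoded characters, the layer would encode each byte again and the file would contain C3 82 C2 A9, which is the double-encoding mess described above.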