I suggest breaking this into 2 problems.

1) Check the Unicode/UTF FAQ for Perl 5.8.8, or whichever version is appropriate (perldoc.perl.org). It sounds like, in your case, multibyte chars are being handled as 1-byte chars because the text was read or forced raw at one point.
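To illustrate point 1, here is a minimal sketch using only the core Encode module. It assumes the text reached your code as undecoded UTF-8 bytes; the sample string and the variable names ($bytes, $text, $ascii) are made up for the example and are not taken from the original script.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample: the raw UTF-8 bytes for U+201C, U+201D and U+2019
# (curly quotes), as they would look if read without a decoding layer.
my $bytes = "\xE2\x80\x9Chi\xE2\x80\x9D it\xE2\x80\x99s";

# Undecoded, Perl treats every byte as its own character.
# Decoding collapses each 3-byte sequence into one real character.
my $text = decode('UTF-8', $bytes);

# Once the string holds characters, a substitution on the code points
# themselves can match.
(my $ascii = $text) =~ s/[\x{201C}\x{201D}]/"/g;
$ascii =~ s/\x{2019}/'/g;

# Encode again on the way out so the terminal or file gets valid UTF-8.
binmode(STDOUT, ':encoding(UTF-8)');
print "$text\n$ascii\n";
```

If the strings are already decoded when you get them, the decode step is unnecessary, and the thing to check is the output side (the binmode line) instead.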
2) If reading the input differently does not fix it, repair the strings themselves: either (2a) do (1) before Twig parses, or (2b) have Twig apply the fix in place to each element/text() you're extracting, and also to any attributes you're keeping.

Bill @ <XML2007 />
Bill, typing with thumbs

----- Original Message -----
From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
To: boston-pm@mail.pm.org <boston-pm@mail.pm.org>
Sent: Wed Dec 05 14:21:55 2007
Subject: [Boston.pm] converting utf-8 to unicode from XML text gathered by XML::Twig

Hi All,

I am currently using XML::Twig to read in some XML. This XML's text is in utf-8. So there are smart-quotes and such in there. I need to unicode-ify the text. I tried using most of the methods that are part of XML::Twig, but came up dry. The best I could do is convert all unsupported chars to question marks.

Without any XML::Twig conversion the smart quotes come out looking like: “ ” or ’

I tried doing a simple

$val =~ s/’/'/gs;

But that didn't work either.

Does anyone have any suggestions on how I can do this conversion either manually OR with XML::Twig methods?

Thanks.
--Alex
_______________________________________________
Boston-pm mailing list
Boston-pm@mail.pm.org
http://mail.pm.org/mailman/listinfo/boston-pm