Hi, I'm one of the people who actually like the way perl 5.6 and up is moving wrt utf8. Most data coming into my programs is already in UTF-8 nowadays: XML or database data, where we also use it inside the database. XML::Parser already sets the right bit, but DBD::Oracle doesn't yet.
I've hacked something in so that it checks the idiotic NLS_LANG Oracle environment variable for UTF8, and if it finds it, it does an SvUTF8_on() on string data coming in from Oracle, which looks like it does the right thing.

Another module which I thought could benefit from some hacking is Text::Unaccent, which uses iconv in its original form. It turned out to be very easy to make it into a utf8-only module, which is probably a lot faster and smaller too, with no dependencies on external libraries. In both cases I still need to clean the modules up and see whether their authors like what I did to them :-)

When I want to get my data *out* of perl, utf-8 is fine unless it has to be HTML for old browsers. For that it's easy to use a regexp to search for [^\x{00}-\x{7f}] and replace each match with either a named entity like &euml; or a numeric entityref like &#902;. Note that I don't even *care* whether the UTF8 bit is still on after that transform, since I know it's all 7-bit ASCII anyway.

It would be nice if the above transform were easy using Encode, and if it were, it would be nice if Encode would work with 5.6.1, but I'll manage regardless.

-- Bart.
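
For what it's worth, here is a minimal sketch of the entity transform described above, in plain perl. The sub name ascii_html() and the %named_entity table are made up for this example (they are not part of any module), and the table is deliberately tiny; anything without a named entity falls back to a decimal numeric reference.

```perl
use strict;
use warnings;

# Hypothetical lookup table: code point => HTML named entity.
# Only a few entries are shown; a real table would be much larger.
my %named_entity = (
    0xE9 => 'eacute',
    0xEB => 'euml',
    0xFC => 'uuml',
);

# ascii_html() is an illustrative helper, not an existing API.
# Every character outside 7-bit ASCII becomes a named entity when
# one is known, and a decimal numeric reference otherwise.
sub ascii_html {
    my ($text) = @_;
    $text =~ s{([^\x{00}-\x{7f}])}{
        my $cp = ord $1;
        exists $named_entity{$cp}
            ? '&' . $named_entity{$cp} . ';'
            : sprintf('&#%d;', $cp);
    }ge;
    return $text;
}

# U+00EB has a named entity in the table, U+0386 does not.
print ascii_html("No\x{EB}l \x{386}"), "\n";   # prints "No&euml;l &#902;"
```

The result contains only 7-bit ASCII characters, so it no longer matters to the consumer whether the UTF8 bit is set on it.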