On 25 Sep 2009, at 10:09, Philip Newton wrote:
On Fri, Sep 25, 2009 at 09:54, Dirk Koopman <d...@tobit.co.uk> wrote:
Dirk Koopman wrote:
Now, is there a reasonably reliable way of determining what we have, on a
string-by-string basis, to at least tell whether we are dealing with utf8 or
iso-8859 (not caring which variant), so that I can drive Encode appropriately
to avoid crashes of the above type? Or how do I completely switch off utf8
encoding/decoding - everywhere - in an 80,000 line Perl app?
As no-one seems interested in this, or maybe no-one else has had these
problems themselves, can anyone suggest a better mailing list to poll?
I was going to suggest Encode::is_utf8 and/or utf8::is_utf8, but I
wasn't sure whether it would actually solve your problem so I thought
I'd rather stay quiet and hope someone with real-world experience in
utf8 woes would pipe up.
Cheers,
Philip
--
Philip Newton <philip.new...@gmail.com>
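http://search.cpan.org/perldoc?Encode::Detect might be of some use to
you. As a general rule, if you have is_utf8 in your code you have a
bug. It does not do what you think it does: it reports whether Perl's
internal UTF8 flag is set on the scalar, not whether the data is valid
UTF-8.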
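
For the string-by-string case, a common fallback is to attempt a strict
UTF-8 decode and treat anything that fails as ISO-8859-1. Here is a minimal
sketch using only core Encode. The helper name decode_guess is made up for
illustration, and the assumption that every non-UTF-8 string is Latin-1 is
mine, not something Encode can verify. A Latin-1 string whose bytes happen
to form valid UTF-8 will be misread, though that is rare in real text.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: returns (decoded text, encoding name used).
# Tries strict UTF-8 first; on failure assumes ISO-8859-1, which
# accepts every possible byte sequence and so cannot fail.
sub decode_guess {
    my ($bytes) = @_;
    # FB_CROAK makes decode die on malformed input instead of
    # substituting characters; LEAVE_SRC keeps $bytes unmodified.
    my $text = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return ($text, 'UTF-8') if defined $text;
    return (decode('ISO-8859-1', $bytes), 'ISO-8859-1');
}

# By contrast, is_utf8 only reflects the internal flag:
my $latin1 = "caf\xe9";          # four characters, flag off
my $before = utf8::is_utf8($latin1) ? 1 : 0;
utf8::upgrade($latin1);          # same characters, different storage
my $after  = utf8::is_utf8($latin1) ? 1 : 0;
```

Call decode_guess once at the point where bytes enter the program and work
with the decoded text from then on. Pure ASCII input is valid UTF-8, so it
takes the first branch.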