Dirk Koopman wrote:
It appears that, with the increasing prevalence of 5.10, perl is getting pickier about whether or not data is utf8.

I have a well established, networked app that has upwards of 250 nodes and about 4000 users at one time (on certain weekends double that) all over the world. These users are running mainly Windows based clients (which may include quite a lot of Windows telnet). The nominal character set is ASCII, as interpreted by the client's host operating system.

To date, I have managed to avoid the tribulations of Encode and utf8 et al. But I am now getting occasional errors, on 5.10 perl, of the ilk:-

 Wide character in null operation at /spider/perl/DXDupe.pm line 47.
 at /spider/perl/DXDupe.pm line 47
DXDupe::find('X14163|UA0KEF|RZ6HV|������� �������') called at /spider/perl/Spot.pm line 420

And also something similar on print or syswrite.

Studying the data, what I am receiving is a mixture of utf8 and iso-8859-*, the reason for this being that older perls happily take what they are given and just pass it along. Some clients are emitting utf8, others iso-8859 and yet others (running Win95/8) some kind of codepage. In addition, there are older, usually Windows based, packages acting as nodes, together with yet more clients that are also adding data to this network in who knows what character set.

Up until recently, this has not been a problem because the important stuff is in 7 bit ascii and the remarks section (the usual source of problems), if it is unreadable, doesn't matter 'cos you can't translate it anyway.

Now, is there a reasonably reliable way of determining what we have, on a string by string basis, to at least tell whether we are dealing with utf8 or iso-8859 (not caring which variant), so that I can drive Encode appropriately to avoid crashes of the above type? Or how do I completely switch off utf8 encoding/decoding - everywhere - in an 80,000 line perl app?
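
To make the first option concrete, here is a minimal sketch of the kind of per-string check I have in mind, not a tested fix for this app. It assumes each incoming string is a raw byte string, tries a strict UTF-8 decode first, and falls back to iso-8859-1 when that fails. The name decode_guess is hypothetical, and the choice of iso-8859-1 as the fallback is an assumption: a check like this cannot tell the iso-8859 variants or the Windows codepages apart, and a short iso-8859 string can occasionally happen to be valid UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: turn raw network bytes into a Perl character string.
sub decode_guess {
    my ($bytes) = @_;

    # Pure 7-bit ASCII needs no decoding at all.
    return $bytes unless $bytes =~ /[^\x00-\x7f]/;

    # Strict decode: FB_CROAK makes decode() die on malformed UTF-8,
    # LEAVE_SRC stops it from modifying $bytes while checking.
    my $text = eval {
        decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return $text if defined $text;

    # Assumption: anything that is not valid UTF-8 is treated as iso-8859-1,
    # which maps every byte to a character and so cannot fail.
    return decode('iso-8859-1', $bytes);
}

# On the way out, encode explicitly so that print/syswrite only see bytes.
my $text  = decode_guess("caf\xc3\xa9");
my $bytes = encode('UTF-8', $text);
```

With every string decoded on input and encoded on output like this, nothing containing wide characters should reach print or syswrite undecided, which is where I would expect the warnings to come from.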


As no-one seems interested in this, or maybe no-one else has had these problems themselves, can anyone suggest a better mailing list to poll?

Dirk