On Thursday, June 19, 2003, at 02:00 AM, Tim Bunce wrote:

But can someone summarise the causes/issues into something we can
all understand? [I don't have time to try to do that for myself.]

I see it this way:


In the past, even with UTF-8-enabled versions of Perl, UTF-8 was never
an issue because there was so little code out there to "use utf8;".

XML and SOAP appear to have pushed this issue to the top of the heap.

We're now getting UTF-8 data back from these modules. Data that was
previously single-byte is "magically" becoming UTF-8 when combined in
various ways with UTF-8 data. Most of us still use plain old ASCII,
so this isn't too much of an issue: UTF-8 is a superset of ASCII.

If, however, the single-byte data contains characters in the \x80-\xff
range, then each of these characters will expand to 2 bytes as
part of the UTF-8 data upgrade.
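
Here is a minimal sketch of that expansion, assuming Perl 5.8 or later. The
sample strings and the helper utf8_byte_length() are made up for
illustration; only the built-in utf8::encode() and utf8::is_utf8() functions
are used, and no DBI code is involved.

```perl
use strict;
use warnings;

# Hypothetical helper: how many bytes the string occupies once it is
# written out as UTF-8. Works on a copy so the caller's data is untouched.
sub utf8_byte_length {
    my ($str) = @_;
    my $copy = $str;
    utf8::encode($copy);    # characters -> UTF-8 octets, in place
    return length $copy;
}

# Made-up sample data: "cafe" with e-acute, held as 4 single bytes (Latin-1).
my $latin1 = "caf\xe9";

# Made-up sample data: one character above 0xFF, which Perl has to
# store as UTF-8 internally.
my $wide = "\x{263A}";

# Combining the two upgrades the single-byte data; "use warnings" says nothing.
my $combined = $latin1 . $wide;

printf "latin1 bytes before:    %d\n", length $latin1;               # 4
printf "latin1 bytes as UTF-8:  %d\n", utf8_byte_length($latin1);    # 5
printf "combined characters:    %d\n", length $combined;             # 5
printf "combined bytes (UTF-8): %d\n", utf8_byte_length($combined);  # 8
printf "combined is_utf8:       %d\n", utf8::is_utf8($combined) ? 1 : 0;
```

The character count stays the same, but the \xe9 byte becomes the two bytes
\xc3\xa9 once the string is stored or written as UTF-8, which is what a
database column sized in bytes would see.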

Perl is changing our data without asking or warning. At the very
least, -w should be emitting a warning when this happens!

I don't see this as a DBI/DBD issue AT ALL. It's a Perl issue, plain
and simple.

--
-- Tom Mornini, InfoMania Printing and Prepress
--
-- ICQ: 113526784, AOL, Yahoo, MSN and Jabber: tmornini
-- PGP: http://www.mornini.com/tmornini_infomania.asc


