But can someone summarise the causes/issues into something we can all understand? [I don't have time to try to do that for myself.]
I see it this way:
In the past, even with UTF-8-enabled versions of Perl, UTF-8 was never an issue because there was so little code out there that said "use utf8;"
XML and SOAP appear to have pushed this issue to the top of the heap.
We're now getting UTF-8 data back from these modules. Data that used to be single-byte is "magically" becoming UTF-8 when it is combined in various ways with UTF-8 data. Since most of us still use plain old ASCII, this usually isn't much of an issue, because UTF-8 is a superset of ASCII.
If, however, the single-byte data contains characters in the \x80-\xff range, then each of these characters will expand to 2 bytes as part of the UTF-8 data upgrade.
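Here is a rough sketch of what I mean, not a definitive test case. The sample strings and variable names are made up for illustration, and the smiley character just stands in for whatever an XML or SOAP module might hand back. It only uses the core Encode module and the built-in utf8:: functions.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Single-byte data with one character in the \x80-\xff range
# (\xe9 is e-acute in Latin-1). Hypothetical sample value.
my $legacy = "caf\xe9";
my $legacy_was_utf8 = utf8::is_utf8($legacy) ? 1 : 0;

# Stand-in for character data handed back by an XML or SOAP module.
my $from_xml = "\x{263A}";

# Plain concatenation upgrades $legacy's characters to Perl's internal
# UTF-8 form. No warning is raised here, even under "use warnings".
my $combined = $legacy . $from_xml;
my $combined_is_utf8 = utf8::is_utf8($combined) ? 1 : 0;

# Compare sizes: 4 bytes as stored originally, 5 once \xe9 is
# written out as UTF-8 (\xc3\xa9).
my $legacy_bytes   = length $legacy;
my $upgraded_bytes = length encode('UTF-8', $legacy);

print "legacy flagged as UTF-8:   $legacy_was_utf8\n";
print "combined flagged as UTF-8: $combined_is_utf8\n";
print "bytes before: $legacy_bytes, after: $upgraded_bytes\n";
```

The ASCII characters "caf" stay at one byte each; only the \xe9 grows, which is why pure-ASCII data never shows the problem.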
Perl is changing our data without asking or warning. At the very least, -w should be emitting a warning when this happens!
I don't see this as a DBI/DBD issue AT ALL. It's a Perl issue, plain and simple.
-- -- Tom Mornini, InfoMania Printing and Prepress -- -- ICQ: 113526784, AOL, Yahoo, MSN and Jabber: tmornini -- PGP: http://www.mornini.com/tmornini_infomania.asc