Hi,

On Monday 16 June 2003 08:37 am, Brigitte Jellinek wrote:
> i'm trying to use perl + dbi + dbd::mysql + mysql with unicode.
>
> as far as i can tell i can write a utf8 string into the database,
> and get back the same sequence of bits, only now it's a 'classical'
> perl-string, not flagged as utf-8.

The crux of the problem is that mysql thinks it knows what it's doing: it assumes incoming data is latin1*, and stores your bytes as though they were latin1. When you retrieve the string, it hands back the same bytes with nothing to say they are utf-8, so perl treats them as an ordinary byte string (effectively latin1), hence your output.

We're doing the same thing here (storing utf-8 bytes in mysql strings), but since we have to use perl 5.6, we're using the pack/unpack method of upgrading the string to utf-8. It seems Encode's decode_utf8() should work too (decode rather than encode_utf8(), since the goal is to turn bytes into a flagged character string), but I haven't had the pleasure of using the "new" perl 5.8 stuff in production yet, so I don't know what the problem is there.

What happens if you change your code to use something like the following?

    $f = pack('U*', unpack('U0U*', $f)) if defined $f;
    # where $f is the data in the field you just pulled

(OT: Actually, we've subclassed DBI, so this upgrade is done transparently. This makes things somewhat nicer; however, SQL operations that depend on character semantics [such as sorting with ORDER BY] still cannot be relied upon.)

Cheers,
nate

*or some such 1-byte encoding; mysql doesn't support utf-8, even in version 4, despite whatever claims they may make on their website. I'm not bitter. No, sir.

-- 
Nathaniel W. Turner
http://www.houseofnate.net/
Tel: +1 508 579 1948 (mobile)
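
P.S. For the perl 5.8 route, here is a minimal sketch of the same upgrade step using the Encode module that ships with 5.8. It is an illustration under assumptions, not what we run: upgrade_fields is a made-up helper name (it is not part of DBI or DBD::mysql), and it assumes every non-NULL field really does hold valid utf-8 bytes. Note that it calls decode, because the direction is bytes to characters.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: takes the raw byte strings a fetch returned and
# hands back character strings that perl knows are utf-8.
# undef (SQL NULL) is passed through untouched.
sub upgrade_fields {
    my @fields = @_;                      # copy, so the caller's data is not modified
    for my $f (@fields) {
        next unless defined $f;
        # FB_CROAK makes decode die on malformed utf-8 instead of guessing;
        # LEAVE_SRC stops decode from emptying its input as it goes.
        $f = decode('UTF-8', $f, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    }
    return @fields;
}

# "\xC3\xA9" is the two-byte utf-8 encoding of U+00E9 (e with acute accent),
# standing in for what comes back from the database.
my ($name, $missing) = upgrade_fields("caf\xC3\xA9", undef);

printf "length=%d utf8=%d\n", length($name), utf8::is_utf8($name) ? 1 : 0;
```

Something along these lines could be called on each row inside a fetch wrapper, so the rest of the code never sees the raw bytes. Dying on malformed input is a deliberate choice in this sketch: a field that is not valid utf-8 usually means something other than utf-8 got into the table.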