On Mon, Jun 16, 2003 at 02:37:23PM +0200, Brigitte Jellinek wrote:
>
> hi!
>
> i'm trying to use perl + dbi + dbd::mysql + mysql with unicode.
>
> as far as i can tell i can write a utf8 string into the database,
> and get back the same sequence of bits, only now it's a 'classical'
> perl-string, not flagged as utf-8.
>
> the string i write into the db is 6 characters long:
> "ABc\N{greek:alpha}\x{00df}\N{cyrillic:e}"
>
> character          unicode  utf8
>                    hex      binary
>
> A                  0041     01000001
> B                  0042     01000010
> c                  0063     01100011
> greek alpha        03B1     11001110 10110001
> german scharfes s  00DF     11000011 10011111
> cyrillic e         044D     11010001 10001101
>
> what i get back from the db is

I've reformatted this slightly:

>                    binary
> A                  01000001
> B                  01000010
> c                  01100011
>                    11001110 10110001
>                    11000011 00111111
>                    11010001 00111111

Some of those bytes have been changed: the second byte of each of
the last two characters has come back as 00111111 (0x3F, an ASCII
'?') instead of 10011111 and 10001101. Probably need to solve that
before worrying about flagging the string as utf8 (for which
Encode::_utf8_on(...) is okay). Right now that'll 'work', but the
utf8 bytes have been corrupted, so the result won't be valid utf8.

Perhaps the dbi-users mailing list would be a better place for this.
I'm sure others have been here before.

Tim.

p.s. Extending the DBI spec to cover utf8 is high on my to-do list.
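
A minimal sketch of checking the bytes before flagging them, using the
byte values from the tables above. bytes_to_chars is a made-up helper
for illustration, not part of DBI or DBD::mysql. It uses
Encode::decode with FB_CROAK instead of Encode::_utf8_on, because
_utf8_on only sets the flag and does not validate the bytes.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of DBI or DBD::mysql): turn bytes read
# from the database into a character string, but only if they are
# well-formed UTF-8. Returns undef for corrupted input.
sub bytes_to_chars {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a CHECK value may modify its argument
    my $chars = eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK) };
    return $chars;        # undef if decode() croaked on malformed input
}

# The bytes that were written: A, B, c, alpha, scharfes s, cyrillic e.
my $written  = "ABc\xCE\xB1\xC3\x9F\xD1\x8D";
# The bytes that came back: two continuation bytes are now 0x3F ('?').
my $returned = "ABc\xCE\xB1\xC3\x3F\xD1\x3F";

my $good = bytes_to_chars($written);
my $bad  = bytes_to_chars($returned);

printf "written:  %s\n",
    defined $good ? length($good) . " characters" : "not valid UTF-8";
printf "returned: %s\n",
    defined $bad  ? length($bad)  . " characters" : "not valid UTF-8";
```

The written bytes decode to a 6 character string, while the returned
bytes are rejected, which is the corruption that needs fixing first.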