On Mon, Jun 16, 2003 at 02:37:23PM +0200, Brigitte Jellinek wrote:
>
> hi!
>
> i'm trying to use perl + dbi + dbd::mysql + mysql with unicode.
>
> as far as i can tell i can write a utf8 string into the database,
> and get back the same sequence of bits, only now it's a 'classical'
> perl-string, not flagged as utf-8.
>
> the string i write into the db is 6 characters long:
> "ABc\N{greek:alpha}\x{00df}\N{cyrillic:e}"
>
> character          unicode  utf8
>                    hex      binary
>
> A                  0041     01000001
> B                  0042     01000010
> c                  0063     01100011
> greek alpha        03B1     11001110 10110001
> german scharfes s  00DF     11000011 10011111
> cyrillic e         044D     11010001 10001101
>
> what i get back from the db is

I've reformatted this slightly:

>                    binary
> A                  01000001
> B                  01000010
> c                  01100011
>                    11001110 10110001
>                    11000011 00111111
>                    11010001 00111111

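Two of those bytes have been corrupted: 10011111 (0x9F) and 10001101 (0x8D)
both came back as 00111111 (0x3F, an ASCII '?'). The high bit is gone, but
other bits have changed too, so it looks more like something along the way
replaced those bytes with '?' than a plain 7-bit strip.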
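
A quick way to see exactly which bytes changed is to dump both versions as bit patterns. This is only a sketch: the received bytes are typed in by hand from the table above rather than fetched from a database, changed_bytes() is a helper name made up for this example, and the long character names are assumed to be the ones the short \N{greek:alpha} and \N{cyrillic:e} forms refer to.

```perl
use strict;
use warnings;
use charnames ':full';
use Encode qw(encode);

# The string that was written, spelled with full Unicode character names.
my $sent = "ABc\N{GREEK SMALL LETTER ALPHA}\x{00df}\N{CYRILLIC SMALL LETTER E}";
my $sent_bytes = encode('UTF-8', $sent);

# The bytes that came back, copied by hand from the table above.
my $got_bytes = pack('B8' x 9, qw(
    01000001 01000010 01100011
    11001110 10110001
    11000011 00111111
    11010001 00111111
));

# Return [offset, sent byte, received byte] for every position that differs.
# Assumes both byte strings have the same length, as they do here.
sub changed_bytes {
    my ($x, $y) = @_;
    my @x = unpack('C*', $x);
    my @y = unpack('C*', $y);
    return map { [ $_, $x[$_], $y[$_] ] } grep { $x[$_] != $y[$_] } 0 .. $#x;
}

for my $diff (changed_bytes($sent_bytes, $got_bytes)) {
    printf "offset %d: sent %08b (0x%02X), got %08b (0x%02X)\n",
        $diff->[0], ($diff->[1]) x 2, ($diff->[2]) x 2;
}
```

With the data from the tables this reports differences at byte offsets 6 and 8 only, i.e. the second byte of the scharfes s and the second byte of the cyrillic e.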
You'll probably need to solve that before worrying about flagging the
string as utf8 (for which Encode::_utf8_on(...) is okay).
Right now flagging will 'work', but the utf8 bytes themselves have been
corrupted, so the result still won't be the string you stored.
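
For the flagging step, something along these lines should do once the bytes come back intact. It is a sketch only: $fetched is a hard-coded stand-in for a value returned by DBI, and Encode::decode is shown next to Encode::_utf8_on as a checked alternative, not as anything DBI or DBD::mysql does for you.

```perl
use strict;
use warnings;
use Encode qw(decode is_utf8);

# Stand-in for a fetched value: the raw UTF-8 bytes of "ABc" followed by
# GREEK SMALL LETTER ALPHA, with no utf8 flag set.
my $fetched = "ABc\xCE\xB1";

# Checked route: decode validates the bytes and returns a character string.
my $checked = decode('UTF-8', $fetched, Encode::FB_CROAK | Encode::LEAVE_SRC);

# Unchecked route: turn the flag on in place on a copy, trusting the bytes.
my $flagged = $fetched;
Encode::_utf8_on($flagged);

printf "bytes: %d, decoded chars: %d, flagged chars: %d\n",
    length($fetched), length($checked), length($flagged);

# A corrupted pair like the one in the table (0xC3 followed by 0x3F)
# is not valid UTF-8, so the checked route refuses it.
my $corrupt = "\xC3\x3F";
my $ok = eval {
    decode('UTF-8', $corrupt, Encode::FB_CROAK | Encode::LEAVE_SRC);
    1;
};
print "corrupted bytes rejected by decode\n" unless $ok;
```

Encode::_utf8_on does no validation at all, so with the corrupted bytes it would quietly hand back a malformed string, whereas decode with FB_CROAK dies on it.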

Perhaps the dbi-users mailing list would be a better place for this.
I'm sure others have been here before.

Tim.

p.s. Extending the DBI spec to cover utf8 is high on my to-do list.