On Mon, Jun 16, 2003 at 02:37:23PM +0200, Brigitte Jellinek wrote:
> 
> hi!
> 
> i'm trying to use perl + dbi + dbd::mysql + mysql with unicode.
> 
> as far as i can tell i can write a utf8 string into the database,
> and get back the same sequence of bits, only now it's a 'classical'
> perl-string, not flagged as utf-8.
> 
> the string i write into the db is 6 characters long:
> "ABc\N{greek:alpha}\x{00df}\N{cyrillic:e}"
> 
>     character         unicode utf8
>                       hex     binary
> 
>     A                 0041    01000001
>     B                 0042    01000010
>     c                 0063    01100011
>     greek alpha       03B1    11001110 10110001
>     german scharfes s 00DF    11000011 10011111
>     cyrillic e        044D    11010001 10001101
> 
> what i get back from the db is

I've reformatted this slightly:

>                               binary
>     A                         01000001
>     B                         01000010
>     c                         01100011
>                               11001110 10110001
>                               11000011 00111111
>                               11010001 00111111

Some of those bytes have been changed: the trailing bytes 10011111 and
10001101 both came back as 00111111 (0x3F, an ASCII '?').
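
A minimal sketch for checking this kind of damage, assuming the returned
bytes are typed in by hand from the dump above rather than fetched
through DBI. The helper names bits_of and changed_offsets are
hypothetical, not part of any module.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render each byte of a byte string as 8 binary digits.
sub bits_of {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%08b', $_ } unpack 'C*', $bytes;
}

# Hypothetical helper: 0-based offsets at which two byte strings differ.
# Assumes both strings have the same length.
sub changed_offsets {
    my ($sent, $got) = @_;
    my @s = unpack 'C*', $sent;
    my @g = unpack 'C*', $got;
    return grep { $s[$_] != $g[$_] } 0 .. $#s;
}

# The six characters from the original message, written as code points.
my $sent_bytes = encode('UTF-8', "ABc\x{03B1}\x{00DF}\x{044D}");

# The bytes reported as coming back from the database, typed in by hand.
my $got_bytes = "ABc\xCE\xB1\xC3\x3F\xD1\x3F";

print 'sent: ', bits_of($sent_bytes), "\n";
print 'got:  ', bits_of($got_bytes),  "\n";
print 'changed at offsets: ',
    join(', ', changed_offsets($sent_bytes, $got_bytes)), "\n";
```

With the data above, the only differences are at offsets 6 and 8, the
second byte of the scharfes s and of the cyrillic e.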

Probably need to solve that before worrying about flagging the
string as utf8 (for which Encode::_utf8_on(...) is okay).
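
For illustration, a rough sketch of the flagging step, assuming the
bytes are hard-coded stand-ins for what the driver returns. It also
shows Encode::decode with FB_CROAK as a stricter alternative (my
suggestion here, not something the flag-only approach does): it
validates the bytes and dies on input like the corrupted string above.
The helper name decode_strict is hypothetical.

```perl
use strict;
use warnings;
use Encode ();

# Intact UTF-8 bytes for the six test characters (9 bytes).
my $good = "ABc\xCE\xB1\xC3\x9F\xD1\x8D";

# The bytes reported back from the database, with 0x3F in two places.
my $bad = "ABc\xCE\xB1\xC3\x3F\xD1\x3F";

# Hypothetical helper: strict decode that dies on malformed UTF-8.
# LEAVE_SRC keeps decode() from modifying the caller's byte string.
sub decode_strict {
    my ($bytes) = @_;
    return Encode::decode('UTF-8', $bytes,
        Encode::FB_CROAK() | Encode::LEAVE_SRC());
}

my $chars = decode_strict($good);
printf "decoded length: %d\n", length $chars;

# Flag-only approach: same bytes, just marked as UTF-8, no validation.
my $flagged = $good;
Encode::_utf8_on($flagged);
printf "flagged length: %d\n", length $flagged;

# The corrupted bytes are rejected by the strict decode.
my $ok = eval { decode_strict($bad); 1 };
print "corrupted bytes rejected\n" unless $ok;
```

Both approaches give a 6-character string for intact bytes; only the
strict decode notices when the bytes are not valid UTF-8.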

Right now that'll 'work' but the utf8 bytes have been corrupted.
Perhaps the dbi-users mailing list would be a better place for this.
I'm sure others have been here before.

Tim.

p.s. Extending the DBI spec to cover utf8 is high on my to-do list.
