I had the same problem, and worked around it by calling _utf8_on() from
Encode on the MySQL query results.  In my version, it was not exported
by default, so I added '_utf8_on' to @EXPORT.

However, the Encode documentation states that _utf8_on is an internal
function, and "Do not use unless you know that the STRING is well-formed
UTF-8."

Is there a better way to do this?
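
One candidate, as a rough sketch only: decode the bytes with a strict
CHECK value, so that Encode validates them and reports the first bad
byte instead of trusting them blindly.  The helper name try_decode_utf8
is made up for illustration, and the two byte strings are typed in by
hand from the tables in the quoted message; nothing here touches DBI or
DBD::mysql.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: returns (characters, undef) on success, or
# (undef, error message) when the bytes are not well-formed UTF-8.
sub try_decode_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;   # decode() with a CHECK value may modify its argument
    my $chars = eval { decode('UTF-8', $copy, Encode::FB_CROAK()) };
    return ($chars, undef) if defined $chars;
    return (undef, $@);
}

# The six characters from the quoted message, as well-formed UTF-8 bytes.
my $good = "ABc\xCE\xB1\xC3\x9F\xD1\x8D";
# The bytes reported as coming back from the database (0x3F is '?').
my $bad  = "ABc\xCE\xB1\xC3\x3F\xD1\x3F";

my ($chars, $err) = try_decode_utf8($good);
print "good: ", length($chars), " characters\n";

($chars, $err) = try_decode_utf8($bad);
print "bad: $err" unless defined $chars;
```

If the second call fails, the error text names the offending byte, which
would suggest the data was already damaged before Perl saw it (two
continuation bytes replaced by '?'), in which case turning the flag on
with _utf8_on would only hide the problem.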

Also, as a suggestion to the authors/documentors of Encode:  it would
be helpful to have more explanation of (& warnings about) the UTF-8
flag, how/why it works, functions that manipulate it, and warnings
about common problems, such as the current one.

Mark

> -----Original Message-----
> From: Brigitte Jellinek [mailto:[EMAIL PROTECTED] 
> Sent: 2003 06 16 8:37
> To: [EMAIL PROTECTED]
> Subject: i know it's utf-8, how can i force perl to see it that way
> 
> 
> 
> hi!
> 
> i'm trying to use perl + dbi + dbd::mysql + mysql with unicode.
> 
> as far as i can tell i can write a utf8 string into the database,
> and get back the same sequence of bits, only now it's a 'classical'
> perl-string, not flagged as utf-8.
> 
> the string i write into the db is 6 characters long:
> "ABc\N{greek:alpha}\x{00df}\N{cyrillic:e}"
> 
> 
>     character           unicode   utf8
>                         hex       binary
> 
>     A                   0041      01000001
>     B                   0042      01000010
>     c                   0063      01100011
>     greek alpha         03B1      1100111010110001
>     german scharfes s   00DF      1100001110011111
>     cyrillic e          044D      1101000110001101
> 
> 
> what i get back from the db is
> 
>                               binary
> 
>     A                         01000001
>     B                         01000010
>     c                         01100011
>     ?                         11001110
>     ?                         10110001
>     ?                         11000011
>     ?                         00111111
>     ?                         11010001
>     ?                         00111111
> 
> 
> I have tried to convert this using 
>       $new = decode_utf8( $fromdb );
> but all i get is an empty string.  is there
> some way to find out *why* this won't decode?
> 
> or is my debugging stuff that shows me the bits in the
> string just wrong:
> 
> 
> sub showbits 
> {
>     my ($template, $utf, $result, $i);
>     $utf =  is_utf8  $_[0];
>     $template = $utf ? "U*" : "C*";
>     foreach ( unpack($template, $_[0] ) )
>     {
>         $result .= "\n" ;
>         $result .= substr( $_[0], $i, 1 ) . "=" . sprintf("%04X", $_) . "=";
>         if ( $utf and $_ > 127) {
>                 $b = unpack("B*", substr( $_[0], $i, 1 ));
>         }
>         else {
>                 $b = unpack("B*", pack("N", $_ ));
>         }
>         $b =~ s/^0{32}//;  # leading zeros
>         $b =~ s/^0{16}//;
>         $b =~ s/^0{8}//;
>         $result .= $b;
>         $i++;
>     }
>     return $result;
> }
> 
> -- 
> Brigitte        'I never met a chocolate I didnt like'        Jellinek
> [EMAIL PROTECTED]                         http://www.horus.com/~bjelli/
> http://perlwelt.horus.at http://www.perlmonks.org/index.pl?node=bjelli
> 
