Re: big trubble perl encoding DBD/DBI
On Wed, 24 Sep 2003 12:12:37 +0200
udo <[EMAIL PROTECTED]> wrote:

> hello you,
>
> excuse me, please help!
>
> precondition:
> * linux redhat 9.0
> * perl, v5.8.0
> * dbms sybase 11.9.x (support iso-8859-1 )
> * string contains german special chars "äöü ß Ä Ö Ü"
>
> the string will be written into the database through perls DBD/DBI
>
> strange problem:
> different process handling between encoding the data stream in the
> file-descriptor and in the socket-descriptor (connecton via sybase-driver)
> in perl ?
>
> if I run my program with pragma "encoding" then I will got different
> encoding datas between SOCKET and STDOUT

This seems to be caused by an LC_* environment variable containing the
substring "utf8"; cf.
http://www.xray.mpe.mpg.de/mailing-lists/perl-unicode/2003-05/msg00003.html

This problem has been fixed in newer perl.
Perhaps it works fine with the recently released Perl 5.8.1.

Regards,
SADAHIRO Tomoyuki
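To see whether the locale is a plausible cause, one can look for "utf8" in the locale variables before running the program. The following is only a minimal sketch: the helper name utf8_locale_vars is hypothetical, and the choice of LC_ALL, LC_CTYPE and LANG as the variables to inspect is an assumption, not something stated in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the names of the locale variables whose
# value mentions "utf8" or "utf-8" (case-insensitive).
# Assumption: LC_ALL, LC_CTYPE and LANG are the variables worth checking.
sub utf8_locale_vars {
    my ($env) = @_;
    return grep {
        defined $env->{$_} && $env->{$_} =~ /utf-?8/i
    } qw(LC_ALL LC_CTYPE LANG);
}

# Report which variables of the current environment match, if any.
my @hits = utf8_locale_vars(\%ENV);
print @hits
    ? "utf8 found in: @hits\n"
    : "no utf8 in LC_ALL, LC_CTYPE or LANG\n";
```

If a variable is reported, running the program once with that variable set to a non-UTF-8 locale is a cheap way to confirm or rule out the locale as the source of the difference between SOCKET and STDOUT.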
[PATCH] piconv -C 512 doesn't work.
It is unfortunate that this very handy utility breaks:

    $ piconv -f utf8 -t ascii -C 1024
    Can't open 1024: No such file or directory at piconv line 60.

Patch follows.

Thanks,
/Autrijus/

--- piconv.orig Fri Sep 26 21:57:32 2003
+++ piconv      Fri Sep 26 22:07:09 2003
@@ -14,6 +14,7 @@
 my $name = basename($0);
 use Getopt::Long;
+$Getopt::Long::ignorecase = 0;
 my %Opt;
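Why a case-sensitivity switch helps can be sketched with a small stand-alone parser. The option table below ('check|C=i' plus a bare 'c' flag) is a hypothetical one modelled on the error message, not a copy of piconv's real table, and parse_args is an illustrative name. With case-sensitive matching, -C takes the integer that follows it; with Getopt::Long's default case-insensitive matching the two one-letter names can collide, which would leave 1024 behind as a file name.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical option table: an integer option with an upper-case
# one-letter alias, plus a lower-case flag of the same letter.
sub parse_args {
    my ($case_sensitive, @args) = @_;

    # Same effect as setting $Getopt::Long::ignorecase to 0 or 1.
    Getopt::Long::Configure(
        $case_sensitive ? 'no_ignore_case' : 'ignore_case'
    );

    my %opt;
    GetOptionsFromArray(\@args, \%opt, 'check|C=i', 'c');

    # Whatever is left in @args would be treated as input file names.
    return (\%opt, \@args);
}

for my $mode (1, 0) {
    my ($opt, $rest) = parse_args($mode, qw(-C 1024));
    printf "case_sensitive=%d options={%s} leftover=[%s]\n",
        $mode,
        join(', ', map { "$_=$opt->{$_}" } sort keys %$opt),
        join(', ', @$rest);
}
```

The patch sets the package variable directly; Getopt::Long::Configure is used here only so that the sketch can switch modes between calls.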
Re: [PATCH] piconv -C 512 doesn't work.
On Friday, Sep 26, 2003, at 23:18 Asia/Tokyo, Autrijus Tang wrote:

> It is unfortunate that this very handy utility breaks:
>
>     $ piconv -f utf8 -t ascii -C 1024
>     Can't open 1024: No such file or directory at piconv line 60.

Thanks, applied in my repository.

Dan the Encode Maintainer
Re: Prototype for decode_utf8 incorrect?
On Fri, Sep 26, 2003 at 06:32:41PM +0900, Dan Kogai wrote:

> so we can make decode_utf8() as follows;
>
> sub decode_utf8($;$)
> {
>     my ($str, $check) = @_;
>     if ($check){
>         return decode("utf8", @_);
>     }else{
>         return undef unless utf8::decode($str);
>         return $str;
>     }
> }

Sure, worksforme. Thanks!

/Autrijus/
Prototype for decode_utf8 incorrect?
    $ perl -MEncode -e'print Encode::decode_utf8(1, 1)'
    Too many arguments for Encode::decode_utf8 at -e line 1, at end of line

    $ perldoc Encode |grep decode_utf8
    $string = decode_utf8($octets [, CHECK]);

Thanks,
/Autrijus/
Re: Prototype for decode_utf8 incorrect?
Autrijus,

Thanks for the report :) -- murphy's law strikes :(

On Friday, Sep 26, 2003, at 17:23 Asia/Tokyo, Autrijus Tang wrote:

>     $ perl -MEncode -e'print Encode::decode_utf8(1, 1)'
>     Too many arguments for Encode::decode_utf8 at -e line 1, at end of line
>
>     $ perldoc Encode |grep decode_utf8
>     $string = decode_utf8($octets [, CHECK]);

A tricky bug you have found. Here is what the document says.

    $string = decode_utf8($octets [, CHECK]);
        equivalent to "$string = decode("utf8", $octets [, CHECK])".
        The sequence of octets represented by $octets is decoded from
        UTF-8 into a sequence of logical characters. Not all sequences
        of octets form valid UTF-8 encodings, so it is possible for
        this call to fail. For CHECK, see "Handling Malformed Data".

and here is how it is really implemented:

    sub decode_utf8($)
    {
        my ($str) = @_;
        return undef unless utf8::decode($str);
        return $str;
    }

which is RIGHT so long as the prototype of utf8::decode() is '$'

    % perl -e 'print utf8::decode()'
    Usage: utf8::decode(sv) at -e line 1.
    % perl -e 'print utf8::decode(1)'
    1
    % perl -le 'print utf8::decode(1,1)'
    Usage: utf8::decode(sv) at -e line 1.

and utf8::decode is not designed to return status.

    % perl -MEncode -e 'print decode_utf8("\xC2\x80")' | hexdump -C
    80|.| 0001
    % perl -MEncode -e 'print decode_utf8("\x80")' | hexdump -C
    % perl -MEncode -e 'print decode_utf8("\x7f")' | hexdump -C
    7f|.| 0001

I consider this a feature bug rather than a documentation bug. But I
wonder how I should fix it. Fixing utf8::decode() involves tweaking the
core, so it would be nice if it could be fixed on the Encode side.

Fortunately Encode::decode("utf8" => $str) works.

    % perl -MEncode -e '$a="\xC2\x80"; print decode("utf8"=>$a, 1)' | hexdump -C
    80|.| 0001
    % perl -MEncode -e '$a="\x80"; print decode("utf8"=>$a, 1)' | hexdump -C
    utf8 "\x80" does not map to Unicode at /usr/local/lib/perl5/5.8.0/i386-freebsd/Encode.pm line 164.
    % perl -MEncode -e '$a="\x7f"; print decode("utf8"=>$a, 1)' | hexdump -C
    7f|.| 0001

so we can make decode_utf8() as follows;

    sub decode_utf8($;$)
    {
        my ($str, $check) = @_;
        if ($check){
            return decode("utf8", @_);
        }else{
            return undef unless utf8::decode($str);
            return $str;
        }
    }

Dan the Encode Maintainer
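For anyone who wants to try the two code paths side by side, here is a minimal stand-alone sketch of the proposal. The name my_decode_utf8 is hypothetical and the function is not the one shipped in Encode; it copies the octets into a lexical first, so it can be called on string literals, and it has no prototype, so a second argument is accepted.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical wrapper following the proposal above:
#   with CHECK    -> delegate to Encode::decode, which honours CHECK
#   without CHECK -> decode in place and return undef on malformed input
sub my_decode_utf8 {
    my ($octets, $check) = @_;
    if ($check) {
        return Encode::decode('utf8', $octets, $check);
    }
    return undef unless utf8::decode($octets);
    return $octets;
}

# Well-formed input: the two octets C2 80 become the character U+0080.
my $ok = my_decode_utf8("\xC2\x80");
printf "decoded to U+%04X\n", ord $ok;

# Malformed input without CHECK: undef is returned instead of dying.
print "no CHECK: ",
    (defined my_decode_utf8("\x80") ? "defined" : "undef"), "\n";

# Malformed input with CHECK set to FB_CROAK: decode() dies.
my $res = eval { my_decode_utf8("\x80", Encode::FB_CROAK) };
print "FB_CROAK: ", (defined $res ? "returned a value" : "died: $@");
```

The sketch relies on utf8::decode returning false for malformed input, which is what the decode_utf8("\x80") run above shows in effect.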