Re: big trubble perl encoding DBD/DBI

2003-09-26 Thread SADAHIRO Tomoyuki

On Wed, 24 Sep 2003 12:12:37 +0200
udo <[EMAIL PROTECTED]> wrote:

> hello you,
> 
> excuse me, please help!
> 
> precondition:
> * linux  redhat 9.0
> * perl, v5.8.0
> * dbms sybase 11.9.x (support iso-8859-1 )
> * string contains german special chars "äöü ß Ä Ö Ü"
> 
> the string will be written into the database through perl's DBD/DBI
> 
> strange problem:
> does perl handle the encoding of the data stream differently for a
> file descriptor and for a socket descriptor (connection via sybase-driver)?
> 
> if I run my program with the "encoding" pragma, I get differently
> encoded data on the SOCKET and on STDOUT

It seems to be due to LC_ containing the substring "utf8", ...
cf. http://www.xray.mpe.mpg.de/mailing-lists/perl-unicode/2003-05/msg00003.html
This problem has been fixed in newer perl.

Perhaps it works fine with the recently released Perl 5.8.1.

Regards,
SADAHIRO Tomoyuki
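
A minimal sketch of one way to take the locale out of the picture, assuming the goal is to hand ISO-8859-1 octets to the database: encode explicitly with Encode instead of relying on the "encoding" pragma. The sample string and the helper hex_octets() are illustrative only, and the DBI call itself is left out; $latin1 is simply what one would pass to the driver.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Illustrative character string: a-umlaut, o-umlaut, u-umlaut, space, sharp s.
my $text = "\x{E4}\x{F6}\x{FC} \x{DF}";

# Explicit conversions; neither depends on LC_* or on the encoding pragma.
my $latin1 = encode("iso-8859-1", $text);   # one octet per character
my $utf8   = encode("UTF-8", $text);        # two octets per non-ASCII character

# Hypothetical helper: render a byte string as space-separated hex octets.
sub hex_octets { return join " ", map { sprintf "%02x", ord } split //, $_[0] }

print "iso-8859-1: ", hex_octets($latin1), "\n";
print "UTF-8:      ", hex_octets($utf8), "\n";
```

Comparing the two dumps makes it easy to see which of the two encodings actually arrived on the socket and which on STDOUT.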

[PATCH] piconv -C 512 doesn't work.

2003-09-26 Thread Autrijus Tang
It is unfortunate that this very handy utility breaks:

$ piconv -f utf8 -t ascii -C 1024
Can't open 1024: No such file or directory at piconv line 60.

Patch follows.

Thanks,
/Autrijus/

--- piconv.orig Fri Sep 26 21:57:32 2003
+++ piconv  Fri Sep 26 22:07:09 2003
@@ -14,6 +14,7 @@
 my $name = basename($0);
 
 use Getopt::Long;
+$Getopt::Long::ignorecase = 0;
 
 my %Opt;
 





Re: [PATCH] piconv -C 512 doesn't work.

2003-09-26 Thread Dan Kogai
On Friday, Sep 26, 2003, at 23:18 Asia/Tokyo, Autrijus Tang wrote:
> It is unfortunate that this very handy utility breaks:
> 
> $ piconv -f utf8 -t ascii -C 1024
> Can't open 1024: No such file or directory at piconv line 60.

Thanks, applied in my repository.

Dan the Encode Maintainer
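
A small sketch of what the one-line patch changes, assuming an option table in which an upper-case -C takes a number and a lower-case -c is a plain flag. The table below is hypothetical and is not copied from piconv; only the $Getopt::Long::ignorecase setting comes from the patch.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Same setting as the patch: make option names case-sensitive.
$Getopt::Long::ignorecase = 0;

# Hypothetical option table, not piconv's real one.
my @spec = ('from|f=s', 'to|t=s', 'check|C=i', 'c');

my %opt;
my @args = qw(-f utf8 -t ascii -C 1024);
GetOptionsFromArray(\@args, \%opt, @spec)
    or die "bad options\n";

print "check = $opt{check}\n";
print "leftover arguments: ", scalar(@args), "\n";
```

With case-sensitive matching, 1024 is consumed as the value of -C and nothing is left over to be mistaken for a file name.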




Re: Prototype for decode_utf8 incorrect?

2003-09-26 Thread Autrijus Tang
On Fri, Sep 26, 2003 at 06:32:41PM +0900, Dan Kogai wrote:
> so we can make decode_utf8() as follows;
> 
> sub decode_utf8($;$)
> {
>     my ($str, $check) = @_;
>     if ($check){
>         return decode("utf8", $str, $check);
>     }else{
>         return undef unless utf8::decode($str);
>         return $str;
>     }
> }

Sure, worksforme.  Thanks!

/Autrijus/




Prototype for decode_utf8 incorrect?

2003-09-26 Thread Autrijus Tang

$ perl -MEncode -e'print Encode::decode_utf8(1, 1)'
Too many arguments for Encode::decode_utf8 at -e line 1, at end of line

$ perldoc Encode |grep decode_utf8
   $string = decode_utf8($octets [, CHECK]);


Thanks,
/Autrijus/
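
For readers unfamiliar with prototypes, here is a minimal sketch of why the call is rejected at compile time. The two subs are hypothetical stand-ins and not the real Encode functions; one_arg() mimics a ($) prototype and two_args() a ($;$) prototype.

```perl
use strict;
use warnings;

# Hypothetical stand-ins, not the real Encode functions.
sub one_arg($)    { return scalar @_ }   # exactly one scalar argument
sub two_args($;$) { return scalar @_ }   # second argument is optional

# String eval defers compilation, so the prototype error lands in $@
# instead of aborting the whole script.
my $count = eval q{ two_args("x", 1) };
my $none  = eval q{ one_arg("x", 1) };
my $error = $@;

print "two_args saw $count arguments\n";
print "one_arg failed: $error";
```

The second eval never runs its code: perl refuses to compile the call, which is the same "Too many arguments" diagnostic shown in the report.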




Re: Prototype for decode_utf8 incorrect?

2003-09-26 Thread Dan Kogai
Autrijus,

Thanks for the report :) -- murphy's law strikes :(

On Friday, Sep 26, 2003, at 17:23 Asia/Tokyo, Autrijus Tang wrote:
> $ perl -MEncode -e'print Encode::decode_utf8(1, 1)'
> Too many arguments for Encode::decode_utf8 at -e line 1, at end of line
> 
> $ perldoc Encode |grep decode_utf8
>    $string = decode_utf8($octets [, CHECK]);

A tricky bug you have found.  Here is what the document says.

   $string = decode_utf8($octets [, CHECK]);
     equivalent to "$string = decode("utf8", $octets [, CHECK])".  The
     sequence of octets represented by $octets is decoded from UTF-8 into
     a sequence of logical characters. Not all sequences of octets form
     valid UTF-8 encodings, so it is possible for this call to fail.  For
     CHECK, see "Handling Malformed Data".

and here is how it is really implemented:

sub decode_utf8($)
{
    my ($str) = @_;
    return undef unless utf8::decode($str);
    return $str;
}

which is RIGHT so long as the prototype of utf8::decode() is '$'

% perl -e 'print utf8::decode()'
Usage: utf8::decode(sv) at -e line 1.
% perl -e 'print utf8::decode(1)'
1
% perl -le 'print utf8::decode(1,1)'
Usage: utf8::decode(sv) at -e line 1.

and utf8::decode is not designed to return status.

% perl -MEncode -e 'print decode_utf8("\xC2\x80")' | hexdump -C
00000000  80                                                |.|
00000001
% perl -MEncode -e 'print decode_utf8("\x80")' | hexdump -C
% perl -MEncode -e 'print decode_utf8("\x7f")' | hexdump -C
00000000  7f                                                |.|
00000001

I consider this a feature bug rather than a documentation bug.  But I
wonder how I should fix it.  Fixing utf8::decode() involves tweaking the
core, so it would be nice if it could be fixed on the Encode side.
Fortunately, Encode::decode("utf8" => $str) works.

% perl -MEncode -e '$a="\xC2\x80"; print decode("utf8"=>$a, 1)' | hexdump -C
00000000  80                                                |.|
00000001
% perl -MEncode -e '$a="\x80"; print decode("utf8"=>$a, 1)' | hexdump -C
utf8 "\x80" does not map to Unicode at /usr/local/lib/perl5/5.8.0/i386-freebsd/Encode.pm line 164.
% perl -MEncode -e '$a="\x7f"; print decode("utf8"=>$a, 1)' | hexdump -C
00000000  7f                                                |.|
00000001

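so we can make decode_utf8() as follows:

sub decode_utf8($;$)
{
    my ($str, $check) = @_;
    if ($check){
        # pass the scalars, not @_: decode() is prototyped ($$;$),
        # which would evaluate @_ in scalar context
        return decode("utf8", $str, $check);
    }else{
        return undef unless utf8::decode($str);
        return $str;
    }
}
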
Dan the Encode Maintainer
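
A runnable sketch of the proposed behaviour, written under the hypothetical name sketch_decode_utf8() so that it does not clash with Encode's own function. It passes $str and $check to decode() explicitly, on the assumption that decode() carries a ($$;$) prototype, under which @_ would be taken in scalar context. This is an illustration of the idea in this thread, not the code that ships in Encode.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical name, to avoid clashing with Encode's own decode_utf8.
sub sketch_decode_utf8($;$) {
    my ($str, $check) = @_;
    # With CHECK set, defer to decode(), which honours it.
    return Encode::decode("utf8", $str, $check) if $check;
    # Otherwise keep the fast path; undef signals malformed input.
    return undef unless utf8::decode($str);
    return $str;
}

my $good  = sketch_decode_utf8("\xC2\x80");
my $bad   = sketch_decode_utf8("\x80");
my $croak = eval { sketch_decode_utf8("\x80", Encode::FB_CROAK()); 1 } ? "" : $@;

printf "good: U+%04X\n", ord $good;
print  "bad:  ", (defined $bad ? "defined" : "undef"), "\n";
print  "with CHECK: $croak";
```

Without CHECK the malformed octet yields undef quietly; with FB_CROAK the same input raises the "does not map to Unicode" error seen in the transcript above.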