Hi,
BÁRTHÁZI András wrote:
> It's interesting, and it can be the problem, but I think, the CGI.pm
> way is not the good solution to decode the URL encoded string: if you
> say chr(0xE2)~chr(0x82)~chr(0xA2), then they are 3 characters, and
s:g/A2/AC/?
I think we've discovered a bug in Pugs, but as I don't know that much
about UTF-8, I'd like to see the following confirmed first :).
# This is what *should* happen:
my $x = chr(0xE2)~chr(0x82)~chr(0xAC);
say $x.bytes; # 3
say $x.chars; # 1
# This is what currently happens:
my $x = chr(0xE2)~chr(0x82)~chr(0xAC);
say $x.bytes; # 6
say $x.chars; # 3
Comparison with Perl 5:
$ perl -MEncode -we '
my $x = decode "utf-8", chr(0xE2).chr(0x82).chr(0xAC);
print length $x;
'
1 # (chars)
$ perl -we '
my $x = chr(0xE2).chr(0x82).chr(0xAC);
print length $x;
'
3 # (bytes)
--Ingo