On Sat, 02 Dec 2006 12:00:12 -0800, you wrote:
On a Japanese version of Windows, when you run a Perl script, the length()
function returns the wrong number of characters for anything you pass in as
$ARGV[0], and the split() function seems to behave the same way.
Using some of the samples shown in perluniintro, we do not get the same
results, so something is wrong.
Using ActivePerl 5.8.8 Build 819. Using Win2003 Server, Japanese. No
emulation, all default Japanese installation.
Here is what we are doing:
perl script.pl テスト
(there are three characters in $ARGV[0], the Japanese word for 'test')
The perl script does this:
print length($ARGV[0]); # returns 6
If one tries to use split(//, $ARGV[0]) there are 6 iterations.
Tried use encoding UTF8, the -C6 flag and a ton of other stuff.
Oddly, if one does 'print $ARGV[0]' the output is テスト.
Even used something from perluniintro:
$Unicode_string = pack("U*", unpack("W*", $ARGV[0]));
print $Unicode_string; # prints テスト
print length($Unicode_string); # returns 6
We need to capture each character in テスト (3 of them) and get the hex or
Unicode value for each character. Since Perl thinks the length is 6, we cannot
get correct hex/Unicode values using pack/unpack or anything else for that
matter.
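
For reference, here is a minimal sketch of the per-character step described
above, assuming the argument has already been turned into a character string.
The variable $word is a hypothetical stand-in for a decoded $ARGV[0], written
out as code points so the example does not depend on the console encoding.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an already-decoded argument: the three
# katakana characters of the word for 'test', given as code points.
my $word = "\x{30C6}\x{30B9}\x{30C8}";

# On a character string, split // yields one element per character.
my @chars = split //, $word;

# ord() returns the code point; sprintf formats it as hex.
my @hex = map { sprintf 'U+%04X', ord($_) } @chars;

print scalar(@chars), "\n";   # 3
print "@hex\n";               # U+30C6 U+30B9 U+30C8
```

If length() reports 6 instead of 3, the string is most likely still a byte
string, and ord() would then return byte values rather than code points.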
I may be missing something, but wouldn't -CA or -C32 do what you want?
According to perlrun, it means the elements of @ARGV are strings
encoded in UTF-8.
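
As a rough sketch of what that switch amounts to, the following decodes a
UTF-8 byte string by hand with the Encode module. The variable $raw is a
hypothetical stand-in for $ARGV[0], and it assumes the command line really
does deliver UTF-8 bytes, which may not hold on a Japanese Windows console.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for $ARGV[0]: the raw UTF-8 bytes of the
# three-character word (assumption: the shell passes UTF-8).
my $raw = "\xE3\x83\x86\xE3\x82\xB9\xE3\x83\x88";

# Roughly what -CA arranges for each element of @ARGV.
my $decoded = decode('UTF-8', $raw);

print length($raw), "\n";      # 9 (bytes)
print length($decoded), "\n";  # 3 (characters)
```

Three katakana characters take 9 bytes in UTF-8, so a reported length of 6
would be more consistent with a two-byte encoding such as code page 932. If
that guess is right, decode('cp932', $ARGV[0]) would be the call to try
instead of -CA.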
--
Eric Amick
Columbia, MD