Hi Nick,
Thank you so much for solving that problem! I didn't know that
"Unicode" is a valid canonical name of an available encoding, since
the list printed by

  use Encode;
  my @all_encodings = Encode->encodings(":all");
  print join("\n", @all_encodings);

does not include it on my machine.
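
A name can also be tested directly instead of scanning that list. The following is only a minimal sketch: it assumes that find_encoding() returns undef for a name it cannot resolve, and the three names are just examples. Whether "Unicode" resolves may depend on the installed Encode version, so no output is shown.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Example names only: a canonical name, an alias, and the name from this thread.
my @names = ("shiftjis", "sjis", "Unicode");

for my $name (@names) {
    # find_encoding() returns an encoding object, or undef if nothing matches.
    my $enc = find_encoding($name);
    if (defined $enc) {
        printf "%-10s resolves to %s\n", $name, $enc->name;
    }
    else {
        printf "%-10s is not known here\n", $name;
    }
}
```

Aliases are accepted by find_encoding() even though encodings() returns only canonical names, which is one way a usable name can be missing from the printed list.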
best,
rob
--
On Thu, 19 Sep 2002, 13:35 GMT+01 (14:35 local time) Nick Ing-Simmons
wrote:
> Robert Allerstorfer <[EMAIL PROTECTED]> writes:
>>Hello,
>>
>>I want to convert source code written in the Japanese shift_jis
>>character set into its Unicode code point numbers. For instance,
>>"検" should result in "U+691C" (which is 26908 in decimal). I tried
>>using the Encode module of Perl 5.8 with something like this:
>>
>> use Encode::JP;
>> my $string = "検";
>> Encode::from_to($string, "shiftjis", "utf8");
>> my $ord = join("\n", unpack('U*', $string));
>> print "$string\n$ord";
> from_to does what it says. In that case you took shiftjis, decoded
> it to Unicode, then re-encoded it as UTF-8 octets.
> What you might have meant was to get Unicode rather than the re-encoded form:
> use Encode::JP;
> my $string = "検";
> Encode::from_to($string, "shiftjis", "Unicode");
> binmode STDOUT,':utf8';
> print length($string)," chars '$string'\n";
> my $ord = join("\n", map( ord($_),split(//,$string)));
> print "$ord";
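
The same conversion can also be sketched with decode(), which returns a character string directly instead of modifying the variable in place. This is an illustration under assumptions, not the code used in the thread: the octets \x8C\x9F are taken to be the Shift_JIS encoding of U+691C and are written as escapes so that the example does not depend on the encoding of the source file.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Shift_JIS octets for the single character U+691C (assumed input).
my $octets = "\x8C\x9F";

# decode() turns the octets into a Perl character string.
my $chars = decode("shiftjis", $octets);

# Format each character's code point as "U+XXXX".
my @code_points = map { sprintf("U+%04X", ord $_) } split //, $chars;

print join("\n", @code_points), "\n";
```

With this input the script prints U+691C; ord() alone gives the decimal value 26908 mentioned in the original question.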