On Friday, April 12, 2002, at 02:30, Nick Ing-Simmons wrote:
> Having hacked RFC2047 support into tkmail I have now seen some
> non-latin1 characters in a "real" perl/Tk app.
>
> There seem to be a few snags with mime's iso-2022-jp:
>
> - It failed to demand load given upper-case form ISO-2022-JP
What does Encode->VERSION say? Here is the current status on this one.
I wrote an ad hoc script as follows:
use Encode;
my $jp = "ISO-2022-JP";
Encode::encode($jp, "foo"); # should croak if you are right, NI-S
# List every Encode module that got demand-loaded.
# (No trailing newline here, so the next line of output runs on.)
print join("\n", map{"\$INC{$_} == $INC{$_}"} grep m,^Encode/,o, keys %INC);
printf "$jp => %s\n", find_encoding($jp)->name;
# Lowercase one character at a time and look each variant up.
for (my $i = 0; $i < length($jp); $i++){
    my $alias = $jp;
    my $char = substr($alias,$i,1);
    substr($alias, $i, 1) = lc($char);
    printf "$alias => %s\n", find_encoding($alias)->name;
}
__END__
And here is the outcome.
% perl5.7.3 foo
$INC{Encode/Alias.pm} == /Users/dankogai/lib/perl5/5.7.3/darwin/Encode/Alias.pm
$INC{Encode/JP.pm} == /Users/dankogai/lib/perl5/5.7.3/darwin/Encode/JP.pm
$INC{Encode/Config.pm} == /Users/dankogai/lib/perl5/5.7.3/darwin/Encode/Config.pm
$INC{Encode/Encoding.pm} == /Users/dankogai/lib/perl5/5.7.3/darwin/Encode/Encoding.pm
$INC{Encode/JP/2022_JP.pm} == /Users/dankogai/lib/perl5/5.7.3/Encode/JP/2022_JP.pm
$INC{Encode/XS.pm} == /Users/dankogai/lib/perl5/5.7.3/darwin/Encode/XS.pm
$INC{Encode/JP/JIS.pm} == /Users/dankogai/lib/perl5/5.7.3/Encode/JP/JIS.pm
$INC{Encode/CJKConstants.pm} == /Users/dankogai/lib/perl5/5.7.3/darwin/Encode/CJKConstants.pm
$INC{Encode/JP/2022_JP1.pm} == /Users/dankogai/lib/perl5/5.7.3/Encode/JP/2022_JP1.pm
$INC{Encode/JP/H2Z.pm} == /Users/dankogai/lib/perl5/5.7.3/Encode/JP/H2Z.pmISO-2022-JP => iso-2022-jp
iSO-2022-JP => iso-2022-jp
IsO-2022-JP => iso-2022-jp
ISo-2022-JP => iso-2022-jp
ISO-2022-JP => iso-2022-jp
ISO-2022-JP => iso-2022-jp
ISO-2022-JP => iso-2022-jp
ISO-2022-JP => iso-2022-jp
ISO-2022-JP => iso-2022-jp
ISO-2022-JP => iso-2022-jp
ISO-2022-jP => iso-2022-jp
ISO-2022-Jp => iso-2022-jp
> - euc-jp "\xDC" does not map to Unicode (3) at
> /tools/perls/lib/5.7.3/i686-linux-multi/Encode.pm line 142.
>
> Will try and convert latter to a test when I have figured out what
> the offending source data is (and checking for bugs in my RFC2047
> hack).
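For anyone who wants to poke at that second snag in the meantime, here is a minimal sketch of how a message of that shape can be provoked. The input is hypothetical (a 0xDC lead byte followed by a byte that cannot be a trail byte), not the data Nick actually hit, and it assumes an Encode that accepts Encode::FB_CROAK as the CHECK argument of decode().

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical malformed EUC-JP: lead byte 0xDC with an invalid trail byte.
my $octets = "\xDC\x20";

# FB_CROAK asks decode() to die instead of substituting a replacement
# character; eval catches the exception so we can look at the message.
my $result = eval { decode("euc-jp", $octets, Encode::FB_CROAK) };

if (defined $result) {
    print "decoded without complaint\n";
}
else {
    print "decode croaked: $@";
}
```

Wrapping the call in eval is probably what a mail reader wants anyway, since one bad header should not take the whole app down.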
Well, now that we have raw encodings we don't have to go through EUC to
decode iso-2022-jp (which saves a tr//), but there must still be a way to
tell which character set a given character belongs to when you encode to
iso-2022-jp. EUC still comes in handy there.
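
To make the encode-side problem concrete, here is a small sketch (not the module's internals, just the public interface) that encodes a string mixing ASCII and one JIS X 0208 character and makes the escape sequences visible. In ISO-2022-JP, ESC $ B switches to JIS X 0208 and ESC ( B switches back to ASCII, so the encoder has to know the character set of every character to emit the right escapes. The sample string is my own choice.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# "abc" followed by HIRAGANA LETTER A (U+3042), which lives in JIS X 0208.
my $text  = "abc\x{3042}";
my $bytes = encode("iso-2022-jp", $text);

# Show the escape sequences that mark each character-set switch.
(my $visible = $bytes) =~ s/\e/<ESC>/g;
print "$visible\n";

# Decoding should give back the original string.
my $back = decode("iso-2022-jp", $bytes);
print $back eq $text ? "round trip ok\n" : "round trip FAILED\n";
```

The output stays 7-bit clean; only the escapes tell the reader which set the following bytes belong to.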
At any rate, I wanted to clean up 7bit-jis, ISO-2022-JP and
ISO-2022-JP1 anyway. I'll make this today's assignment.
Dan