On Monday, April 29, 2002, at 07:38 , SADAHIRO Tomoyuki wrote:
> I doubt whether users of 'euc-jp' will
> assume it to be a combination with JIS X 0213.

They don't have to, because 'euc-jp' behaves exactly the same as before
so long as the text stays within ASCII and JIS X 0201/0208/0212.

> Such a mixing would prevent warning/croaking
> for appearance of code points that are not defined
> originally (meaning w/o X 0213), wouldn't it?

That was my biggest concern, but I have decided to go ahead and let
euc-jp (partially) support JIS X 0213. The reason is simple: Encode::JP
is already too big to differentiate between the various flavors of
euc-jp. In such cases, we should settle for the most 'comprehensive'
version.

Even the term 'euc-jp' is too ambiguous for many. At first it didn't
include G3, and some say the variants must be clearly marked as
something like 'euc-jp-classic' (no 0212 support) vs 'euc-jp-modern' and
so forth (then our current euc-jp should be marked as
'euc-jp-postmodern' :). It would be nice if we could go that way, as
with 7bit-JIS/ISO-2022-JP/ISO-2022-JP-1, but for euc-jp we would have to
have a whole ucm for each.

This is definitely a todo for Perl 5.8.1 and up, and I have already come
up with a solution: the future Encode (Encode II) will support a
"CES-generator"; that is, you can express euc-jp not as one whole big
table but as a combination of tables. That will also reduce the
duplicates found in vendor mappings. It will be a complete rewrite of
encengine.c.
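
To illustrate what "a combination of tables" could mean, here is a rough
sketch in Python. It is not Encode's code and not the actual
CES-generator design, which is not spelled out here. The names
make_euc_decoder, euc_jp and euc_jp_classic are made up for this example,
and the three tables are tiny hand-picked excerpts, not real ucm data.
The EUC framing (ASCII in G0, a two-byte set in G1, SS2 0x8E for G2,
SS3 0x8F for G3) is written once and the charset tables are plugged in.

```python
# Tiny illustrative excerpts (JIS code -> Unicode); real tables are far larger.
JISX0201_KANA = {0xB1: "\uFF71"}                   # halfwidth katakana A
JISX0208 = {0x2422: "\u3042", 0x3021: "\u4E9C"}    # hiragana A, first kanji
JISX0212 = {0x3021: "\u4E02"}                      # first JIS X 0212 kanji

SS2, SS3 = 0x8E, 0x8F  # single-shift bytes selecting G2 and G3


def make_euc_decoder(g1, g2, g3):
    """Build an EUC decoder from three independent charset tables.

    Sketch only: an unmapped code raises KeyError and truncated input
    raises IndexError; there is no error recovery.
    """
    def decode(data: bytes) -> str:
        out = []
        i = 0
        while i < len(data):
            b = data[i]
            if b < 0x80:                 # G0: ASCII, one byte
                out.append(chr(b))
                i += 1
            elif b == SS2:               # G2: one byte after SS2
                out.append(g2[data[i + 1]])
                i += 2
            elif b == SS3:               # G3: two bytes after SS3
                code = ((data[i + 1] & 0x7F) << 8) | (data[i + 2] & 0x7F)
                out.append(g3[code])
                i += 3
            else:                        # G1: two bytes with the high bit set
                code = ((b & 0x7F) << 8) | (data[i + 1] & 0x7F)
                out.append(g1[code])
                i += 2
        return "".join(out)
    return decode


# Variants differ only in which tables are plugged in, not in the framing.
euc_jp = make_euc_decoder(JISX0208, JISX0201_KANA, JISX0212)
euc_jp_classic = make_euc_decoder(JISX0208, JISX0201_KANA, {})  # no 0212
```

With this shape, a 'classic' variant and a 0212-aware variant share the
same JIS X 0208 table instead of each carrying a full copy, which is the
kind of duplication a per-variant ucm would otherwise require.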

But that requires not only code but also an extension of the UCM format,
so give me more time (and Perl 5.8.0!).

Dan the Encode Maintainer
