On Monday, April 29, 2002, at 07:38, SADAHIRO Tomoyuki wrote:
> I doubt whether users of 'euc-jp' will
> assume it to be a combination with JIS X 0213.

They don't have to, because 'euc-jp' behaves exactly the same as before so long as the charset is in ASCII/JISX(0201|0208|0212).

> Such a mixing would prevent warning/croaking
> for appearance of code points that are not defined
> originally (meaning w/o X 0213), wouldn't it?

That was my biggest concern, but I have decided to go ahead and have euc-jp (partially) support JIS X 0213. The reason is simple: Encode::JP is already too big to differentiate between the various flavors of euc-jp. In such cases, we should settle for the most 'comprehensive' version.

Even the term 'euc-jp' is too ambiguous for many. At first it didn't include G3, and some say the variants must be clearly marked as something like 'euc-jp-classic' (no 0212 support) vs 'euc-jp-modern' and so forth (then our current euc-jp should be marked as 'euc-jp-postmodern' :). It would be nice if we could go that way, as with 7bit-JIS/ISO-2022-JP/ISO-2022-JP-1, but for euc-jp we would need a whole ucm for each variant.

This is definitely a todo for Perl 5.8.1 and up, and I have already come up with a solution: the future Encode (Encode II) will support a "CES-generator"; that is, you can express euc-jp not as one big table but as a combination of tables. That will also reduce the duplicates found in vendor mappings. It will be a complete rewrite of encengine.c. But that requires not only code but also an extension of the UCM format, so give me more time (and Perl 5.8.0!).

Dan the Encode Maintainer
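
To illustrate the "combination of tables" idea, here is a minimal sketch in Python. It is not Encode code and does not reflect how encengine.c or the UCM format work; the names (`make_euc_decoder`, `euc_jp_classic`, `euc_jp_modern`) are hypothetical, and each table holds only a few sample entries. It assumes the usual EUC-JP layout: ASCII in G0, JIS X 0208 in G1 as two high-bit bytes, JIS X 0201 kana in G2 after SS2 (0x8E), and JIS X 0212 in G3 after SS3 (0x8F).

```python
# Toy sketch: EUC-JP built from per-charset tables instead of one big table.
# All names are hypothetical; the tables hold sample entries only.

SS2, SS3 = 0x8E, 0x8F  # single-shift bytes selecting G2 and G3

# Keys are the bytes as they appear in the EUC stream (high bit set).
JISX0201_KANA = {0xB1: "\uFF71"}        # halfwidth katakana A
JISX0208 = {(0xA4, 0xA2): "\u3042",     # hiragana A
            (0xB0, 0xA1): "\u4E9C"}     # first kanji of row 16
JISX0212 = {(0xB0, 0xA1): "\u4E02"}     # first kanji of row 16


def make_euc_decoder(g1, g2=None, g3=None):
    """Return a decoder composed from the given G1/G2/G3 tables."""

    def lookup(table, key, offset):
        # A missing table or missing entry is reported, not silently mapped.
        if table is None or key not in table:
            raise ValueError(f"undefined code point at byte offset {offset}")
        return table[key]

    def decode(data):
        out, i = [], 0
        while i < len(data):
            b = data[i]
            if b < 0x80:                      # G0: ASCII, one byte
                out.append(chr(b))
                width = 1
            elif b == SS2:                    # G2: SS2 + one byte
                key = data[i + 1] if i + 1 < len(data) else None
                out.append(lookup(g2, key, i))
                width = 2
            elif b == SS3:                    # G3: SS3 + two bytes
                out.append(lookup(g3, tuple(data[i + 1:i + 3]), i))
                width = 3
            else:                             # G1: two bytes
                out.append(lookup(g1, tuple(data[i:i + 2]), i))
                width = 2
            i += width
        return "".join(out)

    return decode


# Two variants share the same tables; only the combination differs.
euc_jp_classic = make_euc_decoder(JISX0208, JISX0201_KANA)
euc_jp_modern = make_euc_decoder(JISX0208, JISX0201_KANA, JISX0212)
```

In this sketch the 'classic' variant raises an error on any SS3 sequence, while the 'modern' one decodes it. That is the kind of distinction that otherwise needs a separate full mapping per variant.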