On Monday, March 25, 2002, at 06:59, Nick Ing-Simmons wrote:
> It should not be too hard to take the .ucm file parsing from 'compile'
> and teach Encode::Tcl-like all-perl code to read .ucm-s.
> We can then rename it Encode::Perl ;-)

   I am considering that kind of option, but I am not sure whether it 
should go into the perl dist.  Thanks to your compile script, Encode is 
now smart enough to handle most of the major encodings without the help 
of Encode::Tcl (ISO-2022 types are so far handled individually by perl 
modules, such as Encode::JP::JIS).
   We can go even wilder.  I am thinking of developing something like 
Unicode::DataBase to implement full support for ISO-2022-(INT|JP-2).  
The current problem in implementing ISO-2022 is encoding: you have to 
know what character set a given (Unicode) character maps to, but because 
of the character unification rule, this is impossible just by looking at 
the character.
   The solution is to have a database and look up each character to find 
which character sets have corresponding codepoints, then pick one by a 
given precedence (for instance, "try JIS X 0208, then GB 2312, then 
KSC 5601" for ISO-2022-JP-2).  But we need a database to begin with....
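
   To make the lookup-then-precedence idea concrete, here is a rough 
sketch.  It is in Python rather than Perl, and it is not Encode's API 
or the proposed Unicode::DataBase.  It only assumes that Python's 
built-in euc_jp, gb2312 and euc_kr codecs can stand in for the 
database.  The names CHARSET_CODECS, charsets_for and pick_charset are 
made up for illustration.

```python
# Hypothetical stand-in for the per-character database: Python's
# built-in EUC codecs tell us whether a character set has a codepoint
# for a given Unicode character.
CHARSET_CODECS = {
    "JIS X 0208": "euc_jp",
    "GB 2312": "gb2312",
    "KSC 5601": "euc_kr",
}


def charsets_for(char):
    """Return the names of the character sets that contain `char`."""
    found = []
    for name, codec in CHARSET_CODECS.items():
        try:
            raw = char.encode(codec)
        except UnicodeEncodeError:
            continue  # this set has no codepoint for the character
        # In the EUC forms, a character from the 94x94 set is exactly
        # two bytes, both in 0xA1..0xFE.  This filters out ASCII and
        # the extra sets reachable through euc_jp (0x8E/0x8F prefixes).
        if len(raw) == 2 and all(0xA1 <= b <= 0xFE for b in raw):
            found.append(name)
    return found


def pick_charset(char, precedence):
    """Pick the first set in `precedence` that contains `char`."""
    available = charsets_for(char)
    for name in precedence:
        if name in available:
            return name
    return None  # no listed set can represent the character


# Precedence taken from the example in the text above.
ISO_2022_JP_2_ORDER = ["JIS X 0208", "GB 2312", "KSC 5601"]
```

   Hiragana shows the unification problem: charsets_for("あ") reports 
all three sets, so only the precedence list decides which one the 
encoder designates.  A real encoder would then still have to emit the 
matching escape sequence, which this sketch leaves out.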

Dan the Man with Too Many Encodings to Support
