...
> >> Of course, no compression format applied to jamos could
> >> even do as well as UTF-16 applied to syllables, i.e. 2 bytes per
> >> syllable.

I wonder why Hangul would need compression over and above any other alphabetic script... It already has quite a lot of compression in the form of precomposed syllables.

I think we had better start a project for allocating precomposed "syllables" for many other scripts: precomposed Latin script syllables, precomposed Greek script syllables, precomposed Tamil script syllables (most of the Brahmic-derived scripts are especially disadvantaged, from a 'compression' viewpoint, by the virama characters), etc. That should take up much of the excess space in the unused planes (3-13, decimal). Unfortunately that means 4 bytes per non-Hangul syllable (before byte-oriented compression is done), but that could be compensated for by using an SCSU-like approach, just with bigger windows.

No, this was not serious ;-)

/kent k

PS Hangul syllables are "LVT" (actually (L+)(V+)(T*)), not TLV.
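
For anyone who wants to check the byte counts discussed in this thread, here is a minimal Python sketch. It assumes the standard `unicodedata` module; the helper `utf16_len` and the sample syllable U+D55C are illustrative choices, not anything taken from the thread. It decomposes one precomposed syllable into its L, V, T jamos and compares UTF-16 sizes, and also measures a code point in plane 3.

```python
import unicodedata


def utf16_len(s: str) -> int:
    """Return the number of bytes s occupies in UTF-16 (no BOM)."""
    return len(s.encode("utf-16-be"))


# Sample precomposed syllable (illustrative choice): U+D55C.
syllable = "\uD55C"

# Canonical decomposition (NFD) yields the conjoining jamos in L, V, T order.
jamos = unicodedata.normalize("NFD", syllable)

print([f"U+{ord(c):04X}" for c in jamos])  # leading consonant, vowel, trailing consonant
print(utf16_len(syllable))                 # one BMP code unit
print(utf16_len(jamos))                    # three BMP code units
print(utf16_len("\U00030000"))             # plane 3 code point: a surrogate pair
```

Under these assumptions the precomposed syllable takes 2 bytes, the same syllable as jamos takes 6, and any code point outside the BMP takes 4, which is the cost the tongue-in-cheek proposal above would impose on each non-Hangul "syllable".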