As a by-product of our recent work on collation, we developed a method of
Unicode compression that is similar to SCSU, in that small alphabets are
about a byte per character and large alphabets are about two bytes per
character.

The main difference from SCSU is that this method preserves binary order. As
this is a hot topic right now, I thought it might be of interest. The latest
draft description is on http://oss.software.ibm.com/icu/develop/bocu.htm.
Comments are welcome.

Mark
—————

πάντων μέτρον ἄνθρωπος — Πρωταγόρας

[http://www.macchiato.com]


Reply via email to