Peter Kirk <peterkirk at qaya dot org> wrote:

> Yes, the compressor can make any canonically equivalent change, not
> just composing composition exclusions but reordering combining marks
> in different classes. The only flaw I see is that the compressor does
> not have to undo these changes on decompression; at least no other
> process is allowed to rely on it having done so.

I agree with Peter here.  I don't think the burden should be on the
decompressor to reverse any operation that the compressor performs,
except for the compression itself.  After all, if we are letting the
compressor change the normalization form of the input text, the
decompressor cannot possibly know what the original form was, and is in
no position to try to re-create it.
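
As a rough illustration of why the original form is unrecoverable,
here is a minimal Python sketch using the standard unicodedata module.
The function normalize_before_compress is hypothetical and stands in
for whatever normalizing front end a compressor might use; NFC is
assumed only for the sake of the example.

```python
import unicodedata

def normalize_before_compress(text):
    # Hypothetical compressor front end: normalize to NFC before encoding.
    # (Assumption for illustration; the thread does not fix a form.)
    return unicodedata.normalize("NFC", text)

# Two canonically equivalent spellings of e-acute.
precomposed = "\u00e9"      # U+00E9
decomposed = "e\u0301"      # U+0065 U+0301

# Two canonically equivalent orderings of combining marks in
# different combining classes (dot above = 230, dot below = 220).
marks_a = "q\u0307\u0323"
marks_b = "q\u0323\u0307"

# After normalization the inputs are indistinguishable, so nothing
# downstream can tell which spelling the sender started with.
same_accent = normalize_before_compress(precomposed) == normalize_before_compress(decomposed)
same_marks = normalize_before_compress(marks_a) == normalize_before_compress(marks_b)
```

Both comparisons come out equal, which is the point: once the
distinction is gone, the decompressor has nothing to restore it from.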

I'm particularly concerned about having the compressor produce any
normalization form other than NFC or NFD, such as the partial
normalization Philippe originally described, or (most definitely) any
form of so-called "normalization" that ignores the composition
exclusions.  The output of the compressor *is* Unicode text; it just
happens to be in another format.  It must follow all the conformance
rules that normally apply to Unicode text.
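
To make the composition-exclusion concern concrete, here is a small
Python sketch, again using only the standard unicodedata module.
U+0958 DEVANAGARI LETTER QA is used as an example of a composition
exclusion; the helper is_nfc_or_nfd is a hypothetical name, not part
of any library.

```python
import unicodedata

def is_nfc_or_nfd(text):
    # Hypothetical helper: True if text is already in NFC or in NFD.
    return (unicodedata.normalize("NFC", text) == text
            or unicodedata.normalize("NFD", text) == text)

# U+0958 decomposes canonically to U+0915 U+093C and, being a
# composition exclusion, is NOT recomposed by NFC.
qa_precomposed = "\u0958"
qa_decomposed = "\u0915\u093c"

nfc_of_qa = unicodedata.normalize("NFC", qa_precomposed)

# A "normalization" that ignored the exclusions and emitted U+0958
# would produce text that is in neither standard form.
excluded_form_is_standard = is_nfc_or_nfd(qa_precomposed)
decomposed_form_is_standard = is_nfc_or_nfd(qa_decomposed)
```

Here nfc_of_qa equals the decomposed sequence, and the precomposed
character on its own is in neither NFC nor NFD.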

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/

