Peter Kirk <peterkirk at qaya dot org> wrote:

> Yes, the compressor can make any canonically equivalent change, not
> just composing composition exclusions but reordering combining marks
> in different classes. The only flaw I see is that the compressor does
> not have to undo these changes on decompression; at least no other
> process is allowed to rely on it having done so.

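For illustration, the two kinds of canonically equivalent change mentioned above can be sketched with Python's standard unicodedata module. The choice of Python, the helper name canonically_equivalent, and the sample strings are assumptions made for this example, not anything from the thread.

```python
import unicodedata


def canonically_equivalent(s: str, t: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", s) == unicodedata.normalize("NFD", t)


# 1. Reordering combining marks in different combining classes.
#    U+0307 COMBINING DOT ABOVE has class 230,
#    U+0323 COMBINING DOT BELOW has class 220.
dot_above_first = "q\u0307\u0323"
dot_below_first = "q\u0323\u0307"

# 2. Composing a composition exclusion.
#    U+0958 DEVANAGARI LETTER QA decomposes to U+0915 U+093C and is
#    listed in the composition exclusions, so NFC never produces it.
decomposed = "\u0915\u093C"
composed_excluded = "\u0958"

# Both pairs differ code point by code point, yet are equivalent text.
print(canonically_equivalent(dot_above_first, dot_below_first))
print(canonically_equivalent(decomposed, composed_excluded))

# Once text is normalized, the original ordering cannot be recovered:
# both inputs map to the same NFC string.
print(
    unicodedata.normalize("NFC", dot_above_first)
    == unicodedata.normalize("NFC", dot_below_first)
)

# The precomposed exclusion is in neither NFC nor NFD; both forms
# decompose it.
print(unicodedata.normalize("NFC", composed_excluded) == decomposed)
print(unicodedata.normalize("NFD", composed_excluded) == decomposed)
```

The third check is the reason a receiving process cannot be expected to restore the original form: two different inputs collapse to one normalized string.
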
I agree with Peter here. I don't think the burden should be on the decompressor to reverse any operation that the compressor performs, except for the compression itself. After all, if we are letting the compressor change the normalization form of the input text, the decompressor cannot possibly know what the original form was, and is in no position to try to re-create it.

I'm particularly concerned about having the compressor produce any normalization form other than NFC or NFD, such as the partial normalization Philippe originally described, or (most definitely) any form of so-called "normalization" that ignores the composition exclusions.

The output of the compressor *is* Unicode text; it just happens to be in another format. It must follow all the conformance rules that normally apply to Unicode text.

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/