From: "Peter Kirk" <[EMAIL PROTECTED]> > I wonder if it would in fact be possible to merge certain adjacent > combining classes, as from a future numbered version N of the standard. > That would not affect the normalisation of existing text; text > normalised before version N would remain normalised in version N and > later, although not vice versa. I know that this would break the letter > of the current stability policy, but is this kind of backward > compatibility actually necessary? The change could be sold to others as > required for the internal consistency of Unicode.
The problem with this solution is that stability is not guaranteed backwards, across versions of Unicode: if a tool A implements the new combining classes and normalizes its input, it will keep the relative ordering of characters as defined by the new version. If its output is injected into a tool B that still uses the legacy classes, tool B may either reject the input (as not normalized) or force a renormalization. Then, if the text comes back to tool A, A will see a modified text. One could argue that a CCO control could be generated when converting for backward versions of Unicode, but will tool A know which version of Unicode the legacy tool B uses, if B is a remote service that does not report this version information to A?

The problem would then be the interoperability of Unicode-compliant systems using distinct versions of Unicode (for example between XML processors, text editors, input methods, renderers, text converters, and full-text search engines). This may even be critical for sorting, in applications that require and expect their input to be sorted according to its locale in a predictable way (for example, applications using binary searches over sorted lists of text items, such as authentication against a list of user names, or a filename index).
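To make the round-trip failure concrete, here is a minimal Python sketch of the canonical reordering step under two combining-class tables standing in for two Unicode versions. The characters 'x', 'a', 'b', the class values, and the table names are invented for illustration; they are not real assignments from UnicodeData.txt.

    # Hypothetical combining-class tables standing in for two Unicode versions.
    # 'x' is a starter (class 0); 'a' and 'b' are combining marks. The values
    # are made up for this illustration, not real Unicode assignments.
    LEGACY_CCC = {'x': 0, 'a': 10, 'b': 20}   # tool B: 'a' and 'b' in distinct classes
    MERGED_CCC = {'x': 0, 'a': 10, 'b': 10}   # tool A: the two classes merged

    def canonical_reorder(text, ccc):
        """Canonical ordering: stable sort of adjacent non-starters by
        combining class (marks with equal classes are never swapped)."""
        chars = list(text)
        for i in range(1, len(chars)):
            j = i
            # bubble a non-starter left past non-starters with a higher class
            while j > 0 and ccc[chars[j]] != 0 and ccc[chars[j - 1]] > ccc[chars[j]]:
                chars[j - 1], chars[j] = chars[j], chars[j - 1]
                j -= 1
        return ''.join(chars)

    source = 'xba'                                               # base letter + two marks
    as_normalized_by_A = canonical_reorder(source, MERGED_CCC)   # 'xba': equal classes, kept as-is
    as_renormalized_by_B = canonical_reorder(source, LEGACY_CCC) # 'xab': 10 < 20, reordered

    assert as_normalized_by_A == 'xba'
    assert as_renormalized_by_B == 'xab'
    # A considers 'xba' normalized; B renormalizes it to 'xab'; when the text
    # comes back, A no longer sees the string it produced.

The same kind of divergence would also defeat a binary search over a list sorted by one of the two tools, since the other tool no longer agrees on the keys it is searching for.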