On Fri, 16 Feb 2018 18:05:41 -0800 James Kass via Unicode <unicode@unicode.org> wrote:
> Richard Wordingham wrote:
>
> > One can argue that once the compound ideograph have been encoded,
> > the IDS should no longer be interpreted.
>
> Wouldn't that break existing data? If this sort of thing were done at
> OS or app level, it might be possible to replace the IDS string with
> the appropriate character upon file save in some kind of automatic
> fashion. But I'd sure hate for that to happen to any of my text files
> without warning.

TUS allows one to use an IDS in place of an unencoded character, but
not in place of an encoded character. Once the character is encoded,
the IDS substitutions should be weeded out. Of course, there is the
problem that upgrades to a new version of Unicode can be a mosaic
process, with data tables, fonts and rendering engines out of
alignment. At least it's a graceful break, unlike the prospect of PUA
mappings simply vanishing or, worse, changing.

Ideally, a plain search (one that does no replacement) would use a
collation to equate a character with its IDS. There may be a problem
in that two distinct characters could have the same IDS. Search and
automatic replacement is more of a problem.

I strongly suspect that the rule not to use an IDS in place of an
encoded character would only be applied to an input method. There is
the very common interpretation that 'should' in the principal clause
of a requirement cancels the requirement; formally the justification
is that it would be too much work. Enforcing the rule for an
unsupported encoded character would be a hostile act.

Richard.
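To make the search point above concrete, here is a minimal Python sketch of equating a character with its IDS at search time. It is not a real collation tailoring: the table, the function names, and the placeholder entries "X" and "Y" (two made-up characters sharing the made-up IDS "⿰AB") are all assumptions for illustration, and only the twelve description characters U+2FF0..U+2FFB are recognised.

```python
# Illustrative table only (character -> IDS). Real data would come from an
# IDS database; "X" and "Y" are placeholders for two distinct characters
# that happen to share one IDS.
TABLE = {
    "好": "⿰女子",
    "明": "⿰日月",
    "X": "⿰AB",
    "Y": "⿰AB",
}

TERNARY = {"\u2FF2", "\u2FF3"}  # the two operators that take three operands


def is_idc(ch):
    """True for an ideographic description character (U+2FF0..U+2FFB)."""
    return "\u2FF0" <= ch <= "\u2FFB"


def read_unit(text, i):
    """Return the index just past the IDS, or single character, at text[i]."""
    ch = text[i]
    i += 1
    if is_idc(ch):
        for _ in range(3 if ch in TERNARY else 2):
            if i >= len(text):
                raise ValueError("truncated IDS")
            i = read_unit(text, i)  # operands may themselves be IDSes
    return i


def tokens(text, table=TABLE):
    """Split text into units, folding each tabled character to its IDS."""
    out, i = [], 0
    while i < len(text):
        j = read_unit(text, i)
        unit = text[i:j]
        out.append(table.get(unit, unit))
        i = j
    return out


def find(needle, haystack, table=TABLE):
    """Search that treats an encoded character and its IDS as equal."""
    n, h = tokens(needle, table), tokens(haystack, table)
    return any(h[k:k + len(n)] == n for k in range(len(h) - len(n) + 1))


def candidates(ids, table=TABLE):
    """Characters an IDS could be replaced by; more than one means ambiguity."""
    return sorted(ch for ch, seq in table.items() if seq == ids)
```

With this table, find("好", "你⿰女子吗") succeeds while find("女", "好") does not, because matching is done on whole units rather than on raw code points. The collision shows up in two ways: find("X", "Y") also succeeds, and candidates("⿰AB") returns both placeholders, which is why automatic replacement is the harder case.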