Eric Muller asked:

> Is it correct that the sequences U+x U+0360 U+y and U+x U+034F U+y
> U+0303 should display the same? Would it be worth putting some words
> about those situations in section 13.2 of PDUTR #28?
I think that should be the case, given the current definitions. In particular, if U+x = U+006E "n" and U+y = U+0067 "g", you would get the following three possibilities for writing the Tagalog ng-tilde:

1. <U+006E, U+0360, U+0067>
2. <U+006E, U+FE22, U+0067, U+FE23>
3. <U+006E, U+034F, U+0067, U+0303>

1 uses the double-diacritic tilde, which nominally applies only to the U+006E, but would be designed to lie over the top of a following base character on display.

2 uses the compatibility combining double-tilde halves. These occur in legacy bibliographic data records. In principle, 2 should display in the same way as 1, but would be recommended only for interoperating with the legacy data.

3 uses the grapheme joiner to create a "grapheme cluster", which in this case would be the digraph "ng". A rendering engine savvy to grapheme cluster status should then attempt to apply a following combining mark, in this case a regular combining tilde, to the entire grapheme cluster, rather than simply to the preceding base character.

While these are three alternative ways of representing the "same thing", we aren't talking about canonical equivalences here.

3 creates a grapheme cluster (which could have implications for other processing), while 1 and 2 do not. For example, if I added U+0301 (combining acute) after each of the above sequences, 1 would put the acute on the "g" (and might result in overlap with the right half of the double tilde); 2 would put the acute over the right-half tilde on the "g"; 3 should put the acute midships over the stretched tilde applying to the digraph.

2 is used for interoperating with legacy bibliographic data, while 1 and 3 are not.

And there are quite likely to be other small formatting differences between the three options. In the real world it is unlikely that you will run into a "perfect" rendering engine that would produce exactly the same image from each of the sequences.
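To make the point about canonical equivalence concrete, here is a minimal Python sketch using the standard unicodedata module. It normalizes the three sequences under each normalization form and counts how many distinct strings remain; it also lists the canonical combining class of each character. The names SEQUENCES, normalized_forms, and describe are just illustrative, and the results depend on the Unicode data shipped with your Python, so treat this as a check you can run rather than a statement about any particular rendering engine.

```python
import unicodedata

# The three candidate spellings of the Tagalog ng-tilde discussed above.
SEQUENCES = {
    "double tilde": "\u006E\u0360\u0067",
    "tilde halves": "\u006E\uFE22\u0067\uFE23",
    "grapheme joiner": "\u006E\u034F\u0067\u0303",
}


def normalized_forms(form):
    """Return the set of distinct strings after normalizing each sequence."""
    return {unicodedata.normalize(form, s) for s in SEQUENCES.values()}


def describe(s):
    """List (code point, canonical combining class) for each character."""
    return [(f"U+{ord(c):04X}", unicodedata.combining(c)) for c in s]


if __name__ == "__main__":
    for form in ("NFC", "NFD", "NFKC", "NFKD"):
        # A count of 3 means no two of the sequences normalize to the same string.
        print(form, len(normalized_forms(form)))
    for label, s in SEQUENCES.items():
        print(label, describe(s))
```

If the three sequences stay distinct under every form, then any agreement in their display has to come from the rendering engine, not from normalization.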
The combining grapheme joiner is the best answer that Unicode currently has for the extensibility problem for unusual accent placements over (or under) groups of letters, where the existing compatibility answers (U+0360..U+0362 for double diacritics; U+FE20..U+FE23 for diacritic halves) aren't sufficient. For example, it makes it possible to represent a double breve or a double macron over (as seen in some American dictionary orthographies) or a double (or triple) underline under (as seen in some transliterations).

--Ken