Re: Combining Marks and Variation Selectors
On 2/2/2020 5:22 PM, Richard Wordingham via Unicode wrote:
> On Sun, 2 Feb 2020 16:20:07 -0800 Eric Muller via Unicode wrote:
> > That would imply some coordination among variation sequences on
> > different code points, right?
> >
> > E.g. <0B48> ≡ <0B47, 0B56>, so a variation sequence on 0B56 (Mn,
> > ccc=0) would imply the existence of a variation sequence on 0B48 with
> > the same variation selector, and the same effect.
>
> That particular case oughtn't to be impossible, as in NFD everything in
> sight has ccc=0. However TUS 12.0 Section 23.4 does contain an
> additional prohibition against meaningfully applying a variation
> selector to a 'canonical decomposable character'. (Scare quotes because
> 'ly' seems to be missing from the phrase.)
>
> Richard.

So, let's look at what that would look like with some variation selector:

<0B48, Fxxx> ≡ <0B47, 0B56, Fxxx>

If the variant in the shape of 0B48 is well-described by a variation on the contribution due to 0B56 in the decomposed sequence, then this might make sense.

But if the variant would be better described as a variation in the 0B47 component, then it would be a prime example of poor "pseudo encoding": where some random sequence is assigned to a shape (in this case) without being properly analyzable into constituent characters with their own identity.

Which would it be in this example?

And this example only works, of course, because with ccc=0, 0B56 cannot be reordered.

The prohibition as worded may perhaps be slightly more broad than necessary, but I can understand that the UTC didn't want to parse it more finely in the absence of any good examples that could be used to better understand what the actual limitations should be. Better safe than sorry, and all that.

A./

> > On 2/2/2020 11:43 AM, Mark Davis ☕️ via Unicode wrote:
> > I don't think there is a technical reason for disallowing variation
> > selectors after any starters (ccc=000); the normalization algorithm
> > doesn't care about the general category of characters.
> >
> > Mark
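The equivalence <0B48, Fxxx> ≡ <0B47, 0B56, Fxxx> discussed above can be checked mechanically. What follows is only a minimal Python sketch using the standard unicodedata module. U+FE00 (VARIATION SELECTOR-1) is an assumed stand-in for the unspecified selector written "Fxxx", and nothing here implies that such a variation sequence is actually registered. The helper name canonically_equivalent is likewise made up for illustration.

```python
import unicodedata

AI = "\u0B48"         # ORIYA VOWEL SIGN AI
E = "\u0B47"          # ORIYA VOWEL SIGN E
AI_LENGTH = "\u0B56"  # ORIYA AI LENGTH MARK
VS = "\uFE00"         # assumption: stand-in for the unspecified "Fxxx"


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


composed = AI + VS               # <0B48, VS>
decomposed = E + AI_LENGTH + VS  # <0B47, 0B56, VS>

print(canonically_equivalent(composed, decomposed))
# Code points of the NFD form of the composed sequence
print([f"{ord(c):04X}" for c in unicodedata.normalize("NFD", composed)])
```

The check says nothing about which component the variant shape "belongs" to; it only shows that normalization treats the two spellings as the same text, so the selector ends up adjacent to 0B56 in decomposed form.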
Re: Combining Marks and Variation Selectors
On Sun, 2 Feb 2020 16:20:07 -0800 Eric Muller via Unicode wrote:
> That would imply some coordination among variation sequences on
> different code points, right?
>
> E.g. <0B48> ≡ <0B47, 0B56>, so a variation sequence on 0B56 (Mn,
> ccc=0) would imply the existence of a variation sequence on 0B48 with
> the same variation selector, and the same effect.

That particular case oughtn't to be impossible, as in NFD everything in sight has ccc=0. However TUS 12.0 Section 23.4 does contain an additional prohibition against meaningfully applying a variation selector to a 'canonical decomposable character'. (Scare quotes because 'ly' seems to be missing from the phrase.)

Richard.

> On 2/2/2020 11:43 AM, Mark Davis ☕️ via Unicode wrote:
> I don't think there is a technical reason for disallowing variation
> selectors after any starters (ccc=000); the normalization algorithm
> doesn't care about the general category of characters.
>
> Mark
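The remark that "in NFD everything in sight has ccc=0" can be inspected directly. This is a small Python sketch, assuming the Unicode data bundled with the interpreter's unicodedata module; the helper name describe is invented for the example.

```python
import unicodedata


def describe(s: str):
    """Return (code point, general category, canonical combining class) per character."""
    return [
        (f"{ord(c):04X}", unicodedata.category(c), unicodedata.combining(c))
        for c in s
    ]


# Canonical decomposition of U+0B48 ORIYA VOWEL SIGN AI
nfd = unicodedata.normalize("NFD", "\u0B48")
rows = describe(nfd)
all_starters = all(ccc == 0 for _, _, ccc in rows)

for row in rows:
    print(row)
print("all starters:", all_starters)
```

Both components are combining marks by general category, yet both have combining class 0, which is why canonical reordering leaves them alone.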
Re: Combining Marks and Variation Selectors
That would imply some coordination among variation sequences on different code points, right?

E.g. <0B48> ≡ <0B47, 0B56>, so a variation sequence on 0B56 (Mn, ccc=0) would imply the existence of a variation sequence on 0B48 with the same variation selector, and the same effect.

Eric.

On 2/2/2020 11:43 AM, Mark Davis ☕️ via Unicode wrote:
> I don't think there is a technical reason for disallowing variation
> selectors after any starters (ccc=000); the normalization algorithm
> doesn't care about the general category of characters.
>
> Mark
>
> On Sun, Feb 2, 2020 at 10:09 AM Richard Wordingham via Unicode wrote:
> > On Sun, 2 Feb 2020 07:51:56 -0800 Ken Whistler via Unicode wrote:
> > > What it comes down to is avoidance of conundrums involving canonical
> > > reordering for normalization. The effect of variation selectors is
> > > defined in terms of an immediate adjacency. If you allowed variation
> > > selectors to be defined for combining marks of ccc!=0, then
> > > normalization of sequences could, in principle, move the two apart.
> > > That would make implementation of the intended rendering much more
> > > difficult.
> >
> > I can understand that for non-starters. However, a lot of non-spacing
> > combining marks are starters (i.e. ccc=0), so they would not be a
> > problem. is an unbreakable block in canonical equivalence-preserving
> > changes. Is this restriction therefore just a holdover from when
> > canonical equivalence could be corrected?
> >
> > Richard.
Re: Combining Marks and Variation Selectors
I don't think there is a technical reason for disallowing variation selectors after any starters (ccc=000); the normalization algorithm doesn't care about the general category of characters.

Mark

On Sun, Feb 2, 2020 at 10:09 AM Richard Wordingham via Unicode <unicode@unicode.org> wrote:
> On Sun, 2 Feb 2020 07:51:56 -0800
> Ken Whistler via Unicode wrote:
>
> > What it comes down to is avoidance of conundrums involving canonical
> > reordering for normalization. The effect of variation selectors is
> > defined in terms of an immediate adjacency. If you allowed variation
> > selectors to be defined for combining marks of ccc!=0, then
> > normalization of sequences could, in principle, move the two apart.
> > That would make implementation of the intended rendering much more
> > difficult.
>
> I can understand that for non-starters. However, a lot of non-spacing
> combining marks are starters (i.e. ccc=0), so they would not be a
> problem. is an unbreakable block in
> canonical equivalence-preserving changes. Is this restriction therefore
> just a holdover from when canonical equivalence could be corrected?
>
> Richard.
Re: Combining Marks and Variation Selectors
On Sun, 2 Feb 2020 07:51:56 -0800 Ken Whistler via Unicode wrote:
> What it comes down to is avoidance of conundrums involving canonical
> reordering for normalization. The effect of variation selectors is
> defined in terms of an immediate adjacency. If you allowed variation
> selectors to be defined for combining marks of ccc!=0, then
> normalization of sequences could, in principle, move the two apart.
> That would make implementation of the intended rendering much more
> difficult.

I can understand that for non-starters. However, a lot of non-spacing combining marks are starters (i.e. ccc=0), so they would not be a problem. is an unbreakable block in canonical equivalence-preserving changes. Is this restriction therefore just a holdover from when canonical equivalence could be corrected?

Richard.
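The observation that many non-spacing marks are starters can be explored with a short scan of the code space. The following Python sketch relies on whatever Unicode version ships with the interpreter's unicodedata module, so the count it prints will vary between versions. The function name nonspacing_starters is an illustrative assumption.

```python
import sys
import unicodedata


def nonspacing_starters():
    """Code points with General_Category=Mn and Canonical_Combining_Class=0."""
    return [
        cp
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == "Mn"
        and unicodedata.combining(chr(cp)) == 0
    ]


starters = nonspacing_starters()
print(len(starters), "non-spacing marks are starters in Unicode",
      unicodedata.unidata_version)
print(0x0B56 in starters)  # ORIYA AI LENGTH MARK
print(0x0301 in starters)  # COMBINING ACUTE ACCENT (ccc=230)
```

General category and combining class are independent properties, and the normalization algorithm consults only the latter when reordering.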
Re: Combining Marks and Variation Selectors
Richard,

What it comes down to is avoidance of conundrums involving canonical reordering for normalization. The effect of variation selectors is defined in terms of an immediate adjacency. If you allowed variation selectors to be defined for combining marks of ccc!=0, then normalization of sequences could, in principle, move the two apart. That would make implementation of the intended rendering much more difficult.

That is basically why the UTC, from the start, ruled out using variation selectors to try to make graphic distinctions between different styles of acute accent marks explicit, for example.

--Ken

On 2/1/2020 7:30 PM, Richard Wordingham via Unicode wrote:
> Ah, I missed that change from Version 5.0, where the restriction was,
> 'The base character in a variation sequence is never a combining
> character or a decomposable character'. I now need to rephrase the
> question. Why are marks other than spacing marks prohibited?
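One way to see the kind of conundrum involved is to watch canonical reordering interact with a selector placed after a mark of non-zero combining class. This Python sketch is an illustration under stated assumptions, not a statement of what the UTC had in mind: U+FE00 stands in for a hypothetical selector on the acute accent, and no such variation sequence exists. With the selector's actual ccc=0 it is not moved away from the accent; instead it blocks reordering, so two mark orders that are equivalent without the selector stop being equivalent with it.

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, ccc=230
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, ccc=220
VS = "\uFE00"         # ccc=0; assumption: hypothetical selector on the acute


def nfd(s: str) -> str:
    return unicodedata.normalize("NFD", s)


# Without a selector, the two mark orders normalize to the same string.
plain_a = "a" + ACUTE + DOT_BELOW
plain_b = "a" + DOT_BELOW + ACUTE
print(nfd(plain_a) == nfd(plain_b))

# With a selector after the acute, the selector (a starter) blocks
# reordering, so the two spellings no longer normalize to the same string.
with_vs_a = "a" + ACUTE + VS + DOT_BELOW
with_vs_b = "a" + DOT_BELOW + ACUTE + VS
print(nfd(with_vs_a) == nfd(with_vs_b))
```

Either way, a renderer or matcher that expects the accent and its selector to behave like the bare accent under normalization would have extra cases to handle.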