From: "Philippe Verdy" <[EMAIL PROTECTED]>
> So, <NBSP,CC> must not be treated as if it was:
>     <WJ,SP,WJ,CC>
> but really rather as:
>     <WJ,SP,CC,WJ>
> Note here the inversion.

The inversion here acts as if WJ were a combining character with combining
class 256 (i.e. a class higher than the combining class of every actual
"Mn" combining character) and canonical reordering were applied to the
sequence.

Of course this is not a standard normalization form, but using this pseudo
combining class may help renderers treat the last two coded sequences (in my
quote above) as equivalent.
This works even when there are multiple diacritics (denoted CC1 and CC2
below):
    <NBSP,CC1,CC2>
is then treated as if it was:
    <WJ,SP,WJ,CC1,CC2>
and the pseudo-normalization then gives:
    <WJ,SP,CC1,CC2,WJ>
or:
    <WJ,SP,CC2,CC1,WJ>
(depending on the canonical reordering of CC1 and CC2, i.e. on their
relative combining classes)
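
As a rough sketch only, the pseudo-normalization described above could look
like this in Python. The names expand_nbsp, pseudo_reorder, pseudo_ccc and
PSEUDO_CCC_WJ are made up for this illustration; only unicodedata.combining
is a real standard-library call. The sketch assumes that NBSP is first
expanded to <WJ,SP,WJ> and that each run of non-starters is then sorted
stably by (pseudo) combining class, as in canonical reordering:

```python
import unicodedata

WJ = "\u2060"    # WORD JOINER
SP = "\u0020"    # SPACE
NBSP = "\u00A0"  # NO-BREAK SPACE

# Hypothetical pseudo combining class for WJ (assumption of this sketch):
# higher than any combining class returned by unicodedata.combining().
PSEUDO_CCC_WJ = 256


def pseudo_ccc(ch):
    """Combining class of ch, except that WJ gets the pseudo class 256."""
    return PSEUDO_CCC_WJ if ch == WJ else unicodedata.combining(ch)


def expand_nbsp(text):
    """Rewrite every NBSP as <WJ,SP,WJ>."""
    return text.replace(NBSP, WJ + SP + WJ)


def pseudo_reorder(text):
    """Canonical-style reordering using pseudo_ccc instead of the real class.

    Each maximal run of characters with a non-zero (pseudo) class is sorted
    by that class; sorted() is stable, so equal classes keep their order.
    """
    out = []
    run = []
    for ch in text:
        if pseudo_ccc(ch) == 0:
            # A starter ends the current run of non-starters.
            out.extend(sorted(run, key=pseudo_ccc))
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=pseudo_ccc))
    return "".join(out)


def treat_nbsp(text):
    """<NBSP,CC...> -> <WJ,SP,WJ,CC...> -> <WJ,SP,CC...,WJ>."""
    return pseudo_reorder(expand_nbsp(text))
```

For example, with CC1 = U+0301 (class 230) and CC2 = U+0323 (class 220),
treat_nbsp(NBSP + CC1 + CC2) yields <WJ,SP,CC2,CC1,WJ>, the second of the
two orders shown above. The leading WJ stays in place because the SP after
it is a starter and ends its run.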

