On 19/11/2003 16:26, Philippe Verdy wrote:

From: "Philippe Verdy" <[EMAIL PROTECTED]>


So, <NBSP,CC> must not be treated as if it were:
<WJ,SP,WJ,CC>
but rather as:
<WJ,SP,CC,WJ>
Note the inversion here: the trailing WJ moves after the CC.



The inversion here acts as if WJ were a combining character of combining class 256 (i.e. a class higher than that of every other "Mn" combining character) and canonical reordering were applied to the sequence.

Of course this is not a standard normalization form, but using this pseudo
combining class may help render the last two coded strings (in my quote
above) equivalently in renderers.
This works even in the case where there are multiple diacritics (noted CC1
and CC2 below):
   <NBSP,CC1,CC2>
is then treated as if it was:
   <WJ,SP,WJ,CC1,CC2>
and the pseudo-normalization then gives:
   <WJ,SP,CC1,CC2,WJ>
or:
   <WJ,SP,CC2,CC1,WJ>
(depending on the canonical reordering of CC1 and CC2, i.e. of their
relative combining class)
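
To make the reordering concrete, here is a minimal Python sketch of the idea, not a standard normalization form. The names PSEUDO_CCC, ccc, expand_nbsp and pseudo_reorder are hypothetical, and the class 256 assigned to WJ (U+2060) is the pseudo class proposed above, not a real Unicode property value. All other classes come from Python's unicodedata module.

```python
import unicodedata

WJ = "\u2060"    # WORD JOINER
NBSP = "\u00A0"  # NO-BREAK SPACE

# Hypothetical override: WJ really has combining class 0 in Unicode.
PSEUDO_CCC = {WJ: 256}

def ccc(ch):
    """Combining class of ch, with the pseudo class applied to WJ."""
    return PSEUDO_CCC.get(ch, unicodedata.combining(ch))

def expand_nbsp(s):
    """Rewrite each NBSP as <WJ,SP,WJ>."""
    return s.replace(NBSP, WJ + " " + WJ)

def pseudo_reorder(s):
    """Canonical-style reordering: swap adjacent characters A,B
    whenever ccc(A) > ccc(B) > 0, until nothing changes."""
    chars = list(s)
    changed = True
    while changed:
        changed = False
        for i in range(len(chars) - 1):
            if ccc(chars[i]) > ccc(chars[i + 1]) > 0:
                chars[i], chars[i + 1] = chars[i + 1], chars[i]
                changed = True
    return "".join(chars)

# CC1 = U+0301 (class 230), CC2 = U+0323 (class 220)
one = pseudo_reorder(expand_nbsp(NBSP + "\u0301"))
two = pseudo_reorder(expand_nbsp(NBSP + "\u0301\u0323"))
print([f"U+{ord(c):04X}" for c in one])
print([f"U+{ord(c):04X}" for c in two])
```

With these two diacritics the second result is the <WJ,SP,CC2,CC1,WJ> case, because U+0323 has the lower combining class and sorts first.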






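This trick doesn't work if any of the CCs are in combining class zero: canonical reordering never moves a character past one of class zero, so the WJ would stay in front of that CC.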
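
A small Python check illustrates the point. It is only a sketch: PSEUDO_WJ_CLASS and wj_can_move_past are hypothetical names, and 256 is the pseudo class suggested in the quoted message. U+20DD COMBINING ENCLOSING CIRCLE is used as an example of a combining mark in class zero.

```python
import unicodedata

PSEUDO_WJ_CLASS = 256  # hypothetical pseudo class for WJ, not a Unicode value

def wj_can_move_past(mark):
    """True if the reordering rule ccc(A) > ccc(B) > 0 would swap
    a WJ of pseudo class 256 with the mark that follows it."""
    return PSEUDO_WJ_CLASS > unicodedata.combining(mark) > 0

print(wj_can_move_past("\u0301"))  # COMBINING ACUTE ACCENT, class 230
print(wj_can_move_past("\u20DD"))  # COMBINING ENCLOSING CIRCLE, class 0
```

The first call reports that the swap happens; the second reports that it does not, so <WJ,SP,WJ,U+20DD> would be left unchanged.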

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




