On Mon, 11 Feb 2013 02:45:27 +0100
Philippe Verdy <verd...@wanadoo.fr> wrote:

> 2013/2/10 Richard Wordingham <richard.wording...@ntlworld.com>:

> The term "pathological" could apply to these cases where a "naive"
> implementation may in fact break the expectations. How then can a
> collator become a "conforming" process if it has to differentiate
> canonically equivalent input strings?

There is a UCA collation option, 'normalization' set to 'off', which
allows such incorrect operation if strings are not FCD.  (Both NFC and
NFD strings are FCD.) The UCA and LDML definitions *still* together
falsely claim that omitting normalisation will give the correct result
on FCD strings; counter-examples include default collation <U+0F71
TIBETAN VOWEL SIGN AA, U+0F73 TIBETAN VOWEL SIGN II> and Danish (still
at CLDR Version 22.1) <U+0061 LATIN SMALL LETTER A, U+00E5 LATIN SMALL
LETTER A WITH RING ABOVE>.
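
For anyone who wants to check the FCD status of those two strings, here
is a minimal sketch in Python. It assumes the usual FCD condition: for
each adjacent pair of characters, the combining class of the last
character of the first one's canonical decomposition must not exceed
that of the first character of the second one's decomposition, unless
the latter is zero. The function names are my own, and the code only
tests FCD; it does no collation, so it does not itself demonstrate the
wrong ordering.

```python
import unicodedata


def lead_trail_ccc(ch):
    """Return the combining classes of the first and last characters
    of the full canonical decomposition of a single character."""
    decomposed = unicodedata.normalize("NFD", ch)
    return (unicodedata.combining(decomposed[0]),
            unicodedata.combining(decomposed[-1]))


def is_fcd(s):
    """Hypothetical helper: True if s passes the FCD test."""
    prev_trail = 0
    for ch in s:
        lead, trail = lead_trail_ccc(ch)
        # A non-zero leading class lower than the previous trailing
        # class means normalisation would have to reorder marks.
        if lead != 0 and lead < prev_trail:
            return False
        prev_trail = trail
    return True


tibetan = "\u0f71\u0f73"  # VOWEL SIGN AA, VOWEL SIGN II
danish = "a\u00e5"        # a, a with ring above

for text in (tibetan, danish):
    print(is_fcd(text), unicodedata.normalize("NFD", text) == text)
```

Both strings pass the FCD test although neither is in NFD, which is the
situation in which a collator with normalization off never sees the
decomposed sequence that a contraction is defined on.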

Richard.