At 07:53 PM 3/27/2004, [EMAIL PROTECTED] wrote:


> >What does the collation standard say to do with unassigned codepoints
> >anyhow?
>
> Variation selectors are not unassigned characters.

But, they might be regarded as such by any application predating VSs.  And,
likewise for any VS sequences approved after the application was created.

While applications predating VSs have no choice but to treat them as what
they are (in that context), i.e. unassigned characters, applications of later
date have no business treating unapproved VS sequences as unassigned *characters*.


The intent of VSs is to mark a difference that falls below the distinction
between separately encoded characters. Therefore I would expect that by default
all VS characters are ignored in a full-blown collation implementation, leaving
open the choice of supporting, say, a fourth-level difference between specific
known variation sequences.
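
A minimal sketch of that default, in Python. It is only an illustration, not
a real collation implementation: plain code point order stands in for the
actual collation weights, and the names VS_RANGES, is_variation_selector and
sort_key are hypothetical. It covers only the two variation selector blocks
(U+FE00..U+FE0F and U+E0100..U+E01EF) and does not check whether a sequence
is a known, approved one.

```python
# Sketch only: code point order stands in for real collation weights.
# Variation selector blocks: VS1..VS16 and VS17..VS256.
VS_RANGES = ((0xFE00, 0xFE0F), (0xE0100, 0xE01EF))


def is_variation_selector(ch):
    """Return True if ch is in one of the variation selector blocks."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in VS_RANGES)


def sort_key(text, distinguish_variants=False):
    """Build a sort key that ignores variation selectors by default.

    With distinguish_variants=True, the selectors (position and code point)
    are appended as a final tie-breaking level, so strings that differ only
    in their selectors still sort next to each other but not as equal.
    """
    base = "".join(ch for ch in text if not is_variation_selector(ch))
    if not distinguish_variants:
        return (base,)
    selectors = tuple(
        (i, ord(ch)) for i, ch in enumerate(text) if is_variation_selector(ch)
    )
    return (base, selectors)
```

By default "a" followed by U+FE00 gets the same key as plain "a"; with
distinguish_variants=True the two differ only at the last level.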


They are also best ignored in any kind of identifier or name matching, as otherwise
the presence of invisible characters can change the lookup--with all the consequences
for spoofing and security.
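
For identifier matching, the same idea might look like the following Python
sketch. The names identifier_key and same_identifier are hypothetical, and a
real matcher would typically also apply normalization and case folding,
which are left out here.

```python
import re

# Matches any character in the two variation selector blocks
# (U+FE00..U+FE0F and U+E0100..U+E01EF).
_VARIATION_SELECTORS = re.compile("[\uFE00-\uFE0F\U000E0100-\U000E01EF]")


def identifier_key(name):
    """Return name with all variation selectors removed, for use as a lookup key."""
    return _VARIATION_SELECTORS.sub("", name)


def same_identifier(a, b):
    """Compare two identifiers while ignoring invisible variation selectors."""
    return identifier_key(a) == identifier_key(b)
```

Storing and looking up names under identifier_key(name) means that a name
with an embedded selector cannot be registered as something distinct from
the visually identical name without it.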


A./




