Lars Kristan asked:

> Is there a character (codepoint), that is guaranteed to be sorted (collated)
> after all other codepoints?
> 
> Like:
> 
> _WantThisOneOnTop
> Able
> Baker
> NoMatterWhat
> ^WantThisOneOnBottom
> ^^and_so_on
> 
> Where _ is the underscore, which is usually collated 'quite high'.
> And ^ is the hypothetical character I am querying about.

ISO/IEC 14651 contains a special symbol "<SFFFF>" which is deliberately
left at the end of the list of all other primary-weighted symbols,
so that there will be a "highest weight". There is no highest
character, per se, in the list, so you would still have to tailor
the table to give a particular character a high weight, using either
<SFFFF> itself or a weight tailored with respect to <SFFFF>. In the
amendment to 14651 currently under ballot, <SFFFF> is still present.
In the default table, the highest weighted characters before <SFFFF>
are the Han characters, so the last Extension B character would
carry the highest weight of any actual character.

In the Unicode Collation Algorithm (UTS #10), there is no explicit
weight assigned corresponding to <SFFFF>, but a primary weight
assignment of 0xFFFF is guaranteed to be higher than that of
any Han character. (The Han character weights are constructed
synthetically based on first element primary weights in the
range 0xFF40..0xFFBF.) Once again, if you want a *character* to
correspond to that highest weight, then you have to tailor the
table to do so. But then, of course, you could assign any character
you want to have that highest weight value, including a private
use character or even a noncharacter code point.
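
To make the tailoring idea concrete, here is a rough Python sketch. It
is a toy model, not an implementation of UTS #10 or ISO/IEC 14651:
every ordinary character gets a synthetic two-element primary weight
whose lead element starts at 0xFF40 (taken from the range quoted above;
treat the exact base as an assumption), and one chosen character is
"tailored" to the single weight 0xFFFF. The names HIGHEST_PRIMARY,
IMPLICIT_BASE, BOTTOM_CHAR, primary_weights and sort_key are
hypothetical, and the choice of U+E000 (a private use character) is
only for illustration.

```python
# Toy model only: this is NOT the Unicode Collation Algorithm.
HIGHEST_PRIMARY = 0xFFFF   # weight meant to sort above everything else
IMPLICIT_BASE = 0xFF40     # assumption: start of the lead-weight range
BOTTOM_CHAR = "\uE000"     # hypothetical choice: a private use character


def primary_weights(ch, tailored=BOTTOM_CHAR):
    """Return toy primary weights for a single character."""
    if ch == tailored:
        # The tailoring: this one character gets the highest weight.
        return [HIGHEST_PRIMARY]
    cp = ord(ch)
    # Synthetic weights derived from the code point. The lead element
    # stays below 0xFFFF for every code point up to U+10FFFF.
    lead = IMPLICIT_BASE + (cp >> 15)
    trail = (cp & 0x7FFF) | 0x8000
    return [lead, trail]


def sort_key(text, tailored=BOTTOM_CHAR):
    """Concatenate the toy primary weights of every character."""
    key = []
    for ch in text:
        key.extend(primary_weights(ch, tailored))
    return key


names = [
    BOTTOM_CHAR + "WantThisOneOnBottom",
    "Baker",
    "\U0002A6D6",                      # a CJK Extension B ideograph
    "Able",
    BOTTOM_CHAR * 2 + "and_so_on",
]
ordered = sorted(names, key=sort_key)
```

In this sketch the two strings that begin with the tailored character
end up after everything else, including the Extension B ideograph, and
the string with two leading tailored characters sorts last of all. A
real tailoring would be expressed in the collation table or in the
tailoring syntax of whatever collation library you use.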

--Ken
