I disagree. What you want is a merged database field. See
http://www.macchiato.com/slides/icu_collation.ppt

Mark
—————

Πόλλ’ ἠπίστατο ἔργα, κακῶς δ’ ἠπίστατο 
πάντα — Όμήρου Μαργίτῃ
[http://www.macchiato.com]
----- Original Message -----
From: "Asmus Freytag" <[EMAIL PROTECTED]>
To: "David Gallardo" <[EMAIL PROTECTED]>; "Ayers, Mike"
<[EMAIL PROTECTED]>; "'David Starner'" <[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>
Sent: Friday, September 07, 2001 11:50
Subject: Re: [OT] o-circumflex


> At 01:06 PM 9/7/01 -0400, David Gallardo wrote:
> >As a practical matter, you need to take the diacritics into account when
> >sorting, even in English where they (may or may not) have linguistic
> >significance, otherwise you'll get nondeterministic behaviour. In other
> >words, résumé and resume should fall together, but always in the same
> >order.
>
> Stated absolutely, this is patent, but oft-repeated nonsense. For example,
> it does not always make sense for a list of names. An old friend of mine,
> Jon Proppe, who is an Icelandic art critic, spells his name with an accent
> grave on the first o and an acute accent on the e. In a campus directory
> of the US university he attended (assuming it did not strip the accents),
> it would make no sense to have his name show up after all the Proppes, or
> all the Jons without an accent (depending on whether it's sorted by first
> or last name).
>
> If I sort a list of single words which contains non-unique entries, a
> stable sort would sort the non-unique subsets in the order of their
> appearance in the input. If it's not important to distinguish between naive
> and naïve (e.g. in a machine-generated index that spans multiple documents
> with differences in the use of accents) it's hard to see what's gained in
> splitting the list in two for this case.
>
> On the other hand, if San Jose and San José are correctly and consistently
> distinguished in my input, they should probably sort separately.
>
> The two cases of resume are different yet again, as noted, since one could
> be a verb form.
>
> It all depends not on whether a distinction can be made, but whether it is
> meaningful in the context of the list being sorted.
>
> A./
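
The two behaviours discussed above can be put side by side in a short
Python sketch. It is illustrative only: the helper names are made up for
this example, and the tie-break compares decomposed code points, which is
an assumption standing in for a real collation library rather than an
implementation of one.

```python
import unicodedata


def strip_accents(text):
    """Decompose to NFD and drop combining marks (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def primary_key(word):
    """Accent- and case-insensitive key: 'naïve' and 'naive' compare equal."""
    return strip_accents(word).casefold()


def sort_ignoring_accents(words):
    # sorted() is stable, so words with equal keys keep their input order.
    return sorted(words, key=primary_key)


def sort_with_accent_tiebreak(words):
    # Same primary key, but ties are broken by the decomposed spelling,
    # so the result no longer depends on the order of the input.
    return sorted(
        words,
        key=lambda w: (primary_key(w), unicodedata.normalize("NFD", w)),
    )


words = ["naïve", "resume", "naive", "San José", "résumé", "San Jose"]
print(sort_ignoring_accents(words))
print(sort_with_accent_tiebreak(words))
```

With the first function the accented and unaccented spellings stay together
in whatever order the input had them. With the second they also stay
together, but the relative order is fixed regardless of the input. Which
one is appropriate depends on whether the distinction is meaningful for the
list being sorted.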

