In the latest ICU, we took the work we did for Java collation and extended
it substantially (and made it many times faster). It also allows arbitrary
customization at runtime.
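
A minimal sketch of what runtime customization can look like. It uses the JDK's java.text.RuleBasedCollator rather than ICU's own API, and the class name, the helper method, and the rule string are all made up for illustration:

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
class RuntimeTailoring {

    // Builds a collator from a rule string that is only known at runtime
    // and returns a sorted copy of the given words.
    static String[] sortWith(String rules, String... words) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(rules);
            String[] sorted = words.clone();
            Arrays.sort(sorted, collator);
            return sorted;
        } catch (ParseException e) {
            // Wrap the checked exception so callers can pass arbitrary rule strings.
            throw new IllegalArgumentException("Malformed collation rules: " + rules, e);
        }
    }

    public static void main(String[] args) {
        // "ch" is declared as a single letter that sorts after "c".
        String rules = "< a < b < c < ch < d < e < h < i < o < u";
        System.out.println(Arrays.toString(sortWith(rules, "dedo", "chico", "cubo")));
    }
}
```

Because the rules treat "ch" as one letter after "c", "cubo" should sort before "chico" here, which a plain string comparison would not do.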

I happen to be giving a presentation on it in a few hours at the conference.
For more information, see the draft collation chapter in the User Guide. The
presentation (a slightly older draft) is on my site.


Πόλλ’ ἠπίστατο ἔργα, κακῶς δ’ ἠπίστατο 
πάντα — Όμήρου Μαργίτῃ
----- Original Message -----
From: "David Gallardo" <[EMAIL PROTECTED]>
To: "Edward Cherlin" <[EMAIL PROTECTED]>;
Sent: Thursday, September 13, 2001 8:35 AM
Subject: Re: Collation (was RE: [OT] o-circumflex)

> Java's collation class has a rule-based collator that is in effect
> programmable using a little language. Here is an example from Sun's
> docs for Norwegian:
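> String Norwegian = "< a,A< b,B< c,C< d,D< e,E< f,F< g,G< h,H< i,I< j,J" +
>                    "< k,K< l,L< m,M< n,N< o,O< p,P< q,Q< r,R< s,S< t,T" +
>                    "< u,U< v,V< w,W< x,X< y,Y< z,Z" +
>                    "< å=a\u030A,Å=A\u030A" +  // å equals a + combining ring above
>                    ";aa,AA< æ,Æ< ø,Ø";
> // The constructor throws java.text.ParseException if the rules are malformed.
> RuleBasedCollator myNorwegian = new RuleBasedCollator(Norwegian);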
> There is also syntax for things such as specifying reverse order (for
> accents for example), contraction and expansion.
> - David Gallardo
> ----- Original Message -----
> From: "Edward Cherlin" <[EMAIL PROTECTED]>
> Sent: Thursday, September 13, 2001 3:40 AM
> Subject: Collation (was RE: [OT] o-circumflex)
> > English and several other languages have dozens of collations. Compare
> > telephone books, library catalogs, book indexes (sic), and other sorted
> > data. Knuth vol. 3, Sorting and Searching, gives an example of a set of
> > library sorting rules that runs to more than a page, and suggests
> > programming it as an exercise. ;-) Among the rules are to spell out
> > numbers. For example,
> >
> > 1984 (Nineteen Eighty Four)
> > 1066 and all that (Ten Sixty Six)
> > 3001 (Three Thousand One)
> > 2050 (Twenty Fifty)
> > 2010 (Twenty Ten)
> > 2001, A Space Odyssey (Two Thousand One)
> >
> > Bell Labs invented a whole programming language, Snobol, to deal with
> > telephone listing conversions, matches, and sorts. Many phone books sort
> > Mc- and Mac- together, others one after the other but separate from other
> >
> > Edward Cherlin
> > Generalist
> > "A knot! Oh, do let me help to undo it."
> > Alice in Wonderland
> >
> >
