Henning Hraban Ramm wrote:
> On 2006-01-23 at 01:08, Vit Zyka wrote:
> 
>>> ä (adiaeresis) is identical to a, ö (odiaeresis) identical to o, ü
>>> (udiaeresis) identical to u, the same for uppercase. ß (ssharp) is
>>> identical to "ss" (same for uppercase, but in uppercase it's written
>>> as SS anyway).
>>
>>
>> Hmmm, that is not complete: I understand that every ü, Ü, u, U goes
>> into a single group, but is u<ü<U<Ü? Let's say yes. Then try
> 
> 
> I didn't test your code, but u, ü, U and Ü should be treated as the same
> (in "normal German order"),
> and u=U, ü=ue=Ü=Ue=UE in "German phone book order".

Hmmm, I feel that the situation is more complicated (same as in Czech).
Proper sorting needs several passes (3 or 4, perhaps more for some languages):

1. pass: division - define which letters belong to the same group (a group
can also consist of several letters) - this is defined for newtexutil.rb

2. pass: sorting with the simplified rules, e.g. ü=ue=Ü=Ue=UE

3. pass: if all letters are the same according to the 2. pass, then apply
a finer order, e.g. ü<ue<Ü<Ue<UE

4. ??? (perhaps problems with Czech 'Ch').

After that:

'Üb' < 'üz' < 'Üz'
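
A minimal sketch of passes 2 and 3 in Python, only to illustrate the idea. The table and function names are hypothetical and are not taken from newtexutil.rb; the rules assume "German phone book order", and pass 1 (grouping) and the Czech 'Ch' case are not handled.

```python
# Hypothetical expansion table for German phone book order (an assumption,
# not the table used by newtexutil.rb).
EXPAND = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def primary_key(word):
    """Pass 2: simplified rules, so that ü = ue = Ü = Ue = UE."""
    return "".join(EXPAND.get(ch, ch) for ch in word.lower())


def tie_break_key(word):
    """Pass 3: used only when the pass-2 keys are equal.

    Each character gets a rank: lowercase before uppercase, and within
    the same case the special letter (ü) before its spelled-out form (ue).
    This gives ü < ue < Ü < Ue < UE.
    """
    ranks = []
    for ch in word:
        rank = 0 if ch.lower() in EXPAND else 1
        if ch.isupper():
            rank += 2
        ranks.append(rank)
    return tuple(ranks)


def sort_key(word):
    # Tuples compare element by element, so pass 3 only matters
    # when pass 2 cannot separate two words.
    return (primary_key(word), tie_break_key(word))


words = ["Üz", "üz", "Üb"]
print(sorted(words, key=sort_key))
```

Because the key is a tuple, a single call to sorted() is enough; separate sorting runs per pass are not needed as long as each pass can be expressed as one component of the key.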

> Greetlings from Lake Constance!

Enjoy it.
Vit

