Maciej Jaros (2011-01-18 15:42):
> Tim Starling (2011-01-18 02:03):
>> On 18/01/11 07:41, Amir E. Aharoni wrote:
>>> 2011/1/17 Tim Starling<tstarl...@wikimedia.org>:
>>>> * It automatically drops accents, since accented letters sort the same
>>>> as unaccented letters (at the primary level).
>>> How locale aware is it? For example, in Swedish accented letters come
>>> at the end of the alphabet and in Lithuanian I, Į and Y are collated
>>> together as if they were one letter. There are many quirks of this
>>> kind in other languages.
>> It's not locale-aware. As I said, it's a compromise collation. I was
>> hoping that other people might be interested in adding support for
>> specific locales, that's part of the reason for my post. ICU supports
>> lots of different locales, and there is locale-specific collation data
>> in the CLDR.
>>
>>> And i don't know what to do when in the Lithuanian Wikipedia you sort
>>> names of places in the UK - should Islington come before or after
>>> York?
>> Before.
>>
>>> $collator = new Collator('lt')
>>> print $collator->compare( 'Islington', 'York' )
>> -1
>>
>> But more interestingly, York goes before London:
>>
>>> print $collator->compare( 'York', 'London' )
>> -1
>>
>> I think attempting to do it any other way would be a lot of trouble,
>> and not what is wanted anyway. To put the question another way: on the
>> English Wikipedia, should Kybartai sort before Klaipėda? I would think
>> not.
> I've seen accent-insensitive sorting, where for example "Bańka" would be
> sorted as if it were "Banka", but I haven't yet seen phone-insensitive
> sorting, or whatever you call it. What I mean is that in Poland "rz" is
> pronounced the same (almost the same) as "ż", but "rz" is nowhere near
> "ż" when it comes to sorting. In fact, that would be very
> counterintuitive for me (as would 'York' < 'London'). I think it would
> not be helpful, especially for foreigners. I've also said that I've
> _seen_ accent-insensitive dictionaries, but _most_ are case sensitive,
> so "ą" > "a", not "ą" = "a". Also, when it comes to the first letter,
> all dictionaries I know have "Ż" separate from "Z". You might see our
> collation as: without accent first
> and with accent second. /This is the why we say are ABC. And it would be
> intuitive for to have English collation by it's ABC with Y coming just
> before Z./

Sorry, sometimes I type phonetically :-). The last sentences were 
supposed to be:

This is the way we say our ABC. And it would be intuitive for me to have 
English collation by its ABC with Y coming just before Z.
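
To make "the way we say our ABC" concrete, here is a rough sketch in
Python. It is only an illustration, not what ICU or MediaWiki does: the
alphabet string (with q, v and x slotted in for foreign words) and the
name polish_key are my own assumptions. Each accented letter is treated
as a separate letter placed right after its unaccented base.

```python
import unicodedata

# Assumed ordering: Polish alphabet, with q, v, x added for foreign words.
ALPHABET = "aąbcćdeęfghijklłmnńoópqrsśtuvwxyzźż"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}

def polish_key(word):
    """Sort key: compare letter by letter using the position in ALPHABET."""
    # NFC makes sure "ż" is one code point, not "z" plus a combining dot.
    text = unicodedata.normalize("NFC", word).lower()
    # Characters outside ALPHABET sort after all known letters, by code point.
    return [(RANK.get(ch, len(RANK)), ch) for ch in text]

words = ["Żaba", "Zebra", "Bańka", "Banka", "York", "London", "Ząb", "Rzeka"]
print(sorted(words, key=polish_key))
```

With this key "Banka" comes just before "Bańka", "London" comes before
"York", and "Żaba" lands after every word starting with plain "Z".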


> I think the problem should only be solved for letters which are not just
> a Latin character + accent: how to sort them among Latin (and
> Latin-based) characters.
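
One possible way to tell the two groups apart, sketched in Python under
the assumption that "Latin character + accent" means "decomposes under
Unicode NFD into an ASCII letter followed by combining marks". The
function name is_base_plus_accent is hypothetical, and the sample
letters are just examples I picked.

```python
import unicodedata

def is_base_plus_accent(ch):
    """True if ch decomposes (NFD) into an ASCII letter plus combining marks."""
    decomposed = unicodedata.normalize("NFD", ch)
    if len(decomposed) < 2:
        return False  # no decomposition: not a plain letter-plus-accent
    base, marks = decomposed[0], decomposed[1:]
    return (base.isascii() and base.isalpha()
            and all(unicodedata.combining(m) for m in marks))

for ch in "ńżąłøßæþ":
    print(ch, is_base_plus_accent(ch))
```

Letters such as "ń", "ż" and "ą" pass the check, while "ł", "ø", "ß",
"æ" and "þ" do not, so they would be the ones that need an explicit
decision about where they sort.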
>
> Regards,
> Nux.
>
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l



