Re: [Wikitech-l] More aggressive DEFAULTSORT

Gerard Meijssen Wed, 13 May 2009 07:11:45 -0700

Hoi,
The introduction demonstrates that Unicode indeed deals with collation.


When you look at the characters in Unicode, you will find that the Unicode
standard is very much a work in progress. When you look at the CLDR you will
find that it is also very much a work in progress. However, for many
languages the collation has been well defined and is unlikely to change.
For African languages, there is a project called Afrigen that is collecting
the information needed to include them in the CLDR.

I am not impressed by your argument that you will have to rebuild the
sorting order whenever a collation order changes. First of all, standards
like the CLDR have releases, so these iterations only happen when a new
release becomes available. Second, it seems weird to me to refuse to
implement an improved collation order when the current one is wrong in the
first place.
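
To illustrate the point about releases, here is a minimal sketch in Python.
Everything in it is an assumption made for illustration: the two "releases"
are made-up toy alphabets, not real CLDR data, and sort_key and
rebuild_if_needed are hypothetical helpers, not MediaWiki functions. It only
shows that when stored sort keys are tagged with the release they were built
from, a rebuild is needed only when the release actually changes.

```python
# Toy illustration only: these "collation releases" are made up,
# they are not real CLDR data.
COLLATIONS = {
    "release-1": "abcdefghijklmnopqrstuvwxyz",
    # Hypothetical correction: "y" now sorts directly after "i".
    "release-2": "abcdefghiyjklmnopqrstuvwxz",
}


def sort_key(title, release):
    """Map each character to its rank in the given release's alphabet."""
    alphabet = COLLATIONS[release]
    # Characters outside the toy alphabet sort after everything else.
    return tuple(
        alphabet.index(c) if c in alphabet else len(alphabet)
        for c in title.lower()
    )


def rebuild_if_needed(rows, current_release):
    """Recompute stored keys only for rows built with another release.

    Returns the number of rows that were rebuilt.
    """
    rebuilt = 0
    for row in rows:
        if row["release"] != current_release:
            row["key"] = sort_key(row["title"], current_release)
            row["release"] = current_release
            rebuilt += 1
    return rebuilt


# Stored rows, each tagged with the release its key was built from.
rows = [
    {"title": t, "key": sort_key(t, "release-1"), "release": "release-1"}
    for t in ["jonas", "yla", "ieva"]
]
```

Calling rebuild_if_needed(rows, "release-1") touches nothing, while calling
it once with "release-2" rebuilds every row and a second call is again a
no-op. A real deployment would use an actual collation library instead of a
toy alphabet, but the bookkeeping would look much the same.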

I have always been told that we develop and implement open source in order
to create open content using open standards. In my opinion you have not
provided any argument why any other approach is preferable. In this case the
CLDR is an applicable open standard.

When, as a consequence of an improved collation order for particular
languages, we have to rebuild databases every now and again, then it is
tough but it needs to be done. It is all part of normal and acceptable
system management.
Thanks,
      GerardM

http://o2.it46.se/afrigen/statistics.php

2009/5/13 Domas Mituzas <midom.li...@gmail.com>

> Hi!
>
> > http://www.unicode.org/reports/tr10/#Introduction
>
> This is not CLDR, this is general collation algorithm.
>
> > http://cldr.unicode.org/index/cldr-spec/collation-guidelines
>
> CLDR is a repository/process for LDMLs (that's what I was referring to
> with people sending us that data, in case the current data is wrong or
> missing). Currently it has mistakes and multiple versions even for the
> same locales - doesn't seem to be too stable nor correct.
>
> An example:
>
> http://unicode.org/cldr/data/common/collation/lt.xml?rev=1.26&content-type=text/vnd.viewcvs-markup
>  ;-)
>
> Do note, that such unstable changes require database rebuilds at each
> iteration.. So, we'd have to have someone reviewing it all, comparing
> with different sources, and then pushing it once every few years into
> some data staging environment where we do data conversions all the
> time? :) riiight...
>
> Domas
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
