On 14 May 2011 06:33, Andrew Dunbar <hippytr...@gmail.com> wrote:

> I'm almost positive Azeri has the same dotless i issue and perhaps
> some of the other Turkic languages of Central Asia. One solution is to
> do accent/diacritic normalization too as part of the canonicalization.

It's good to think about these issues beforehand, but we already do
enough mindless killing of diacritics, and it doesn't work across all
languages. In Finnish, saa and sää are different words, and ä is not
the letter "a" with something added to it.
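
To make the problem concrete, here is a minimal sketch in Python. It
assumes a naive canonicalization that decomposes text to NFD and drops
the combining marks; strip_diacritics is a hypothetical helper written
for illustration, not a function from MediaWiki or any library.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Hypothetical helper: decompose to NFD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Two distinct Finnish words collapse into the same string.
print(strip_diacritics("sää") == strip_diacritics("saa"))

# Dotless i (U+0131) has no canonical decomposition, so this kind of
# stripping leaves it unchanged and does not address that issue at all.
print(strip_diacritics("ı") == "ı")
```

Both lines should print True: the stripping merges words that must stay
distinct in Finnish, and it does nothing for the dotless i.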

  -Niklas


-- 
Niklas Laxström

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
