On Fri, Dec 21, 2012 at 11:35 AM, Jukka K. Korpela <jkorp...@cs.tut.fi> wrote:
> 2012-12-21 21:05, Leif Halvard Silli wrote:
>
>> My Moscow Russian-Norwegian from 1987 and my Pocket Oxford Russian
>> Dictionary from 2003 agree that both list words on Ё and Е under the
>> same category – namely, under the letter Е.
>
> This appears to be the case in any serious dictionary.

You're right. In an influential orthographic dictionary the difference
is secondary, e.g. ёлка is between елисейский дворец and ёлки-палки:
http://lopatina-slovar.com/description/elka/34736
(The site database has been built by scanning a printed dictionary.)

However, the preferences could change, as electronic dictionaries seem
to demonstrate.

> It is of course possible that some people would prefer treating “ё” as a
> primarily different letter. But it’s rather illogical to require that it be
> treated that way at the start of a word only. I don’t think collation rules
> need to accommodate such preferences.

Granted, not yet, but by itself the argument is invalid. Unicode
collation rules are descriptive. If, for example, a language happens to
sort accents backwards, this rule has to be, and is, accommodated
despite its apparent illogicality. Along the same lines, if a language
happens to make the distinction discussed in this thread, it has to be
accommodated just as well.

Also, "In several languages the rules have changed over time, and so
*older dictionaries may use a different order than modern ones* [emph.
mine - LB]. Furthermore, collation may depend on use. For example,
German dictionaries and telephone directories use different
approaches." [http://en.wikipedia.org/wiki/Collation]

The distinction between the two collation methods in German (secondary
vs expanded umlauts) is prominent enough to be mentioned in UCA. Luckily
for Germans, both methods are covered by the algorithm thanks to the
requirements of other languages.

My question is as follows: does UCA have to be modified (e.g. by adding
another bit flag "word-initial primary" next to the existing "backward
secondary") to support this feature, should it be implemented, or is
there a way to achieve the "new Russian online collation" within the
existing UCA, without modifying the strings to be sorted before the
algorithm is applied?

Leo
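
For reference, the ordering asked about above can be pinned down with a
small Python sketch. It only states the target order: ё counts as a
separate primary letter at the start of a word and as a secondary
variant of е elsewhere. It is not UCA, ICU, or CLDR code, and it says
nothing about whether a UCA tailoring can express the same thing. The
names ALPHABET, RANK and word_initial_yo_key are made up for this
illustration, and the two-level key is a simplification of real
collation weights.

```python
import unicodedata

# Hypothetical helper data: the 33-letter Russian alphabet in its usual order.
ALPHABET = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}


def word_initial_yo_key(text):
    """Two-level sort key (an assumption-laden sketch, not UCA).

    Primary level: a word-initial "ё" keeps its own rank between "е" and
    "ж"; a non-initial "ё" is folded into "е".
    Secondary level: breaks ties so that "е" sorts before a folded "ё".
    """
    # Compose "е" + combining diaeresis into a single "ё" before comparing.
    s = unicodedata.normalize("NFC", text).lower()
    primary, secondary = [], []
    for i, ch in enumerate(s):
        word_initial = i == 0 or not s[i - 1].isalpha()
        if ch == "ё" and not word_initial:
            primary.append(RANK["е"])
            secondary.append(1)
        else:
            # Characters outside the alphabet sort after all letters here.
            primary.append(RANK.get(ch, len(ALPHABET)))
            secondary.append(0)
    return (primary, secondary)


words = ["ёлка", "ель", "еда", "жук", "мёд", "мел", "мех"]
print(sorted(words, key=word_initial_yo_key))
```

With this key, ёлка lands after ель (word-initial ё follows every word
in е), while мёд still sorts before мел, as it would in a printed
dictionary. Note that the key function inspects each string to decide
what is word-initial, which is exactly the kind of preprocessing the
question hopes to avoid.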