On Wed, Dec 26, 2012 at 11:18 AM, Whistler, Ken ken.whist...@sap.com wrote:
Leo asked:
My question was narrower: assuming that the strings being compared are
words, could it be supported without any markup?
... where it refers to conditional weighting based on the (identified) word
The UCA algorithm itself has no opinion on this issue. It is simply a
specification of *how* to compare strings at multiple levels, given a
multi-level collation weight table.
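
[A minimal sketch of what "comparing at multiple levels, given a weight table" can look like, for readers following along. The table `TOY_WEIGHTS` and the function `toy_sort_key` are hypothetical names with invented weights; they are not DUCET values, and the sketch leaves out most of what UTS #10 actually specifies (ignorables, variable weighting, tertiary and further levels, contractions).]

```python
import unicodedata

# Toy two-level weight table: character -> (primary, secondary).
# The numbers are invented for illustration and are NOT taken from DUCET.
TOY_WEIGHTS = {
    "а": (1, 0),
    "е": (2, 0),
    "ё": (2, 1),  # same primary weight as е; differs only at the secondary level
    "к": (3, 0),
    "л": (4, 0),
    "т": (5, 0),
    "ь": (6, 0),
}

def toy_sort_key(word):
    """Build a key that compares all primary weights before any secondary weight."""
    # Normalize so that a decomposed ё (е + U+0308) becomes the precomposed letter.
    word = unicodedata.normalize("NFC", word).lower()
    weights = [TOY_WEIGHTS[ch] for ch in word]
    primary = tuple(w[0] for w in weights)
    secondary = tuple(w[1] for w in weights)
    # Tuples compare left to right, so the secondary level only breaks primary ties.
    return (primary, secondary)

words = ["ёлка", "ель", "елка"]
print(sorted(words, key=toy_sort_key))
```

[With this toy table, ёлка and елка tie at the primary level and are separated only at the secondary level, so both sort before ель.]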
The UCA *does* have a default behavior, of course, based on the DUCET table.
And the DUCET table puts all Unicode
Leo asked:
My question was narrower: assuming that the strings being compared are
words, could it be supported without any markup?
... where it refers to conditional weighting based on the (identified) word
boundary. And the answer to that is no, unless the word boundary was explicitly
Hello,
Leo Broukhis wrote:
In Russian, the difference between Е and Ё is primary at the beginning
of a word as they are considered distinct letters of the alphabet, yet
secondary in the middle of a word, as the dieresis over Ё is not
mandatory. As an example, ель ёлка, but тёлка
Ken,
A basic question: does the UCA algorithm consider the Russian Ye and the
Russian Yo as equal with regard to sort order? Or is it not meant to solve
that issue?
Leif Halvard Silli
--- Original message ---
From: Whistler, Ken ken.whist...@sap.com
To: l...@mailcom.com,
My opinion is that BOTH the UCA algorithm AND the LDML formal description of
collations are just Best known practices to accommodate the collation
(i.e. dictionary ordering AND string searches AND string comparisons).
But neither of them can accommodate all possible orders or weak comparisons
On Fri, Dec 21, 2012 at 1:49 PM, Whistler, Ken ken.whist...@sap.com wrote:
Leo Broukhis said:
Granted, not yet, but by itself the argument is invalid. Unicode
collation rules are descriptive;
I'm not sure what you mean by that. UTS #10 is a *specification* of an
algorithm, with various
Leo Broukhis, Fri, 21 Dec 2012 01:31:18 -0800:
In Russian, the difference between Е and Ё is primary at the beginning
of a word as they are considered distinct letters of the alphabet, yet
secondary in the middle of a word, as the dieresis over Ё is not
mandatory.
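
[A minimal sketch of the position-dependent rule described above, assuming each string being compared is a single word. It is plain Python, not a UCA or LDML tailoring; `ALPHABET`, `RANK` and `word_key` are hypothetical names introduced only for this illustration.]

```python
import unicodedata

# Russian alphabet with Ё as its own letter, used for the word-initial position.
ALPHABET = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}

def word_key(word):
    """Key for a single word: Ё is a distinct letter at position 0,
    and only a secondary variant of Е at any later position."""
    word = unicodedata.normalize("NFC", word).lower()
    primary, secondary = [], []
    for pos, ch in enumerate(word):
        if ch == "ё" and pos > 0:
            # Inside the word: same primary weight as е, secondary difference only.
            primary.append(RANK["е"])
            secondary.append(1)
        else:
            # Word-initial ё keeps its own alphabet rank (between е and ж).
            primary.append(RANK[ch])
            secondary.append(0)
    return (tuple(primary), tuple(secondary))

words = ["тёлка", "ёлка", "ель", "телка"]
print(sorted(words, key=word_key))
```

[Under these assumptions ель sorts before ёлка, because the initial letters differ at the primary level, while телка and тёлка share a primary key and differ only at the secondary level.]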
As an example, ель ёлка,
[Philippe tells me that his message that I'm quoting could have been
rejected by the mailing list as spam; my answer is below.]
On Fri, Dec 21, 2012 at 5:13 AM, Philippe Verdy verd...@wanadoo.fr wrote:
This is an interesting case. A solution would be to be able define a
distinct collation
On Fri, Dec 21, 2012 at 4:56 AM, Leif Halvard Silli
xn--mlform-...@xn--mlform-iua.no wrote:
You say that the difference is primary in the beginning of a word but
elsewhere secondary. And yes, that orthographic dictionary that you
link to above, looks as you describe.
However, in reality, the
Resending my earlier reply. Apparently, by default, Gmail sends subject
lines in KOI8-R if they contain Cyrillic, and unicode.org rejects those as
likely spam. I just changed my Gmail settings to "Use Unicode (UTF-8)
encoding for outgoing messages" and hope this goes through. (*Please change
the
Leo Broukhis, Fri, 21 Dec 2012 08:57:11 -0800:
On Fri, Dec 21, 2012 at 4:56 AM, Leif Halvard Silli wrote:
You say that the difference is primary in the beginning of a word but
elsewhere secondary. And yes, that orthographic dictionary that you
link to above, looks as you describe.
2012-12-21 21:05, Leif Halvard Silli wrote:
My Moscow Russian-Norwegian from 1987 and my Pocket Oxford Russian
Dictionary from 2003 agree that both list words on Ё and Е under the
same category – namely, under the letter Е.
This appears to be the case in any serious dictionary.
The use of
Fact is, again, that ёлка - in the wild - can be written ёлка and елка
Though you need a better dictionary: it's the diminutive of ель (as in
Yel'tsin) meaning fir tree, and is the 4-letter word for Christmas tree.
С Рождеством,
Joe
On Fri, Dec 21, 2012 at 11:35 AM, Jukka K. Korpela jkorp...@cs.tut.fi wrote:
2012-12-21 21:05, Leif Halvard Silli wrote:
My Moscow Russian-Norwegian from 1987 and my Pocket Oxford Russian
Dictionary from 2003 agree that both list words on Ё and Е under the
same category – namely, under the
Jukka K. Korpela, Fri, 21 Dec 2012 21:35:16 +0200:
2012-12-21 21:05, Leif Halvard Silli wrote:
My Moscow Russian-Norwegian from 1987 and my Pocket Oxford Russian
Dictionary from 2003 agree that both list words on Ё and Е under the
same category – namely, under the letter Е.
This appears
On Fri, Dec 21, 2012 at 1:08 PM, Leif Halvard Silli
xn--mlform-...@xn--mlform-iua.no wrote:
«Tolkovïj slovar’ sovremennogo russkogo jazïka» from 2005
(«Dictionary of the contemporary Russian language») has placed words on Ё
in a separate category, consisting of exactly one word: Ёмкость.
Leo Broukhis said:
Granted, not yet, but by itself the argument is invalid. Unicode
collation rules are descriptive;
I'm not sure what you mean by that. UTS #10 is a *specification* of an
algorithm, with various options for tailoring and parameterization which make
it possible to
Joe, Fri, 21 Dec 2012 12:48:47 -0800:
Fact is, again, that ёлка - in the wild - can be written ёлка and елка
Though you need a better dictionary: it's the diminutive of ель (as
in Yel'tsin) meaning fir tree, and is the 4-letter word for
Christmas tree.
The dictionary of Dal,[1] says:
Leo Broukhis, Fri, 21 Dec 2012 13:43:14 -0800:
On Fri, Dec 21, 2012 at 1:08 PM, Leif Halvard Silli
xn--mlform-...@xn--mlform-iua.no wrote:
«Tolkovïj slovar’ sovremennogo russkogo jazïka» from 2005
(«Dictionary of the contemporary Russian language») has placed words on Ё
in a separate