Thanks to John for pointing me in the right direction; the normalization
charts were not helpful, but after spending some time with UAX#15 and
looking at the actual Unicode database, I see what is going on here.
It seems strange to me that the Unicode book (where I initially looked)
simply gives
U+03AC and U+1F71 both have canonical decompositions to U+03B1 followed
by U+0301. (There are other similar pairs in the Greek blocks.) If an
application applies normalisation form C both decompose to the same
string; will the resulting recomposed character be 03AC or 1F71? I
suspect the
David J. Perry scripsit:
U+03AC and U+1F71 both have canonical decompositions to U+03B1 followed
by U+0301. (There are other similar pairs in the Greek blocks.) If an
application applies normalisation form C both decompose to the same
string; will the resulting recomposed character be 03AC
[EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Saturday, March 15, 2003 09:36
Subject: Normalisation and Greek characters
U+03AC and U+1F71 both have canonical decompositions to U+03B1 followed
by U+0301. (There are other similar pairs in the Greek blocks.) If an
application applies normalisation
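
The question in this thread can be checked against the Unicode database with a short script. What follows is only a minimal sketch using Python's standard unicodedata module, not anything taken from the messages themselves; the helper name describe is hypothetical, and the results depend on the Unicode version bundled with the interpreter. In the database, U+1F71 carries a singleton canonical decomposition to U+03AC, which in turn decomposes to U+03B1 U+0301, so both characters should normalize to U+03AC under form C.

```python
import unicodedata


def describe(ch):
    """Return the raw decomposition field and the NFD/NFC forms of one character.

    Hypothetical helper, for illustration only.
    """
    def codepoints(s):
        return [f"U+{ord(c):04X}" for c in s]

    return {
        # Raw decomposition mapping as stored in UnicodeData.txt (one step only)
        "decomposition": unicodedata.decomposition(ch),
        # Full canonical decomposition
        "nfd": codepoints(unicodedata.normalize("NFD", ch)),
        # Canonical decomposition followed by canonical composition
        "nfc": codepoints(unicodedata.normalize("NFC", ch)),
    }


tonos = describe("\u03ac")  # GREEK SMALL LETTER ALPHA WITH TONOS
oxia = describe("\u1f71")   # GREEK SMALL LETTER ALPHA WITH OXIA

print("U+03AC:", tonos)
print("U+1F71:", oxia)
```

If the two "nfc" lists match, the application cannot recover which of the two characters was in the original text once form C has been applied.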