2013/8/6 Richard Wordingham <richard.wording...@ntlworld.com>

> For example, I think the proper
> upper-casing of <U+1FB3 GREEK SMALL LETTER ALPHA WITH YPOGEGRAMMENI,
> U+0359 COMBINING ASTERISK BELOW> is <U+0391 GREEK CAPITAL LETTER ALPHA,
> U+0359, U+0196 LATIN CAPITAL LETTER IOTA, U+0359>.

Why do you use U+0196 LATIN CAPITAL LETTER IOTA instead of U+0399 GREEK CAPITAL LETTER IOTA? I'm also not convinced that duplicating the combining asterisk below is correct here.

My opinion is that it should be:

  <U+0391 GREEK CAPITAL LETTER ALPHA, DOUBLE COMBINING ASTERISK BELOW, U+0399 GREEK CAPITAL LETTER IOTA>

with a new "double" diacritic encoded between the two letters (it would be shown as a single asterisk, centered below the gap between the two capital letters). There's no such "double combining asterisk" character in the UCS. But if you replace the asterisk with a macron (below or above), such a double diacritic does exist (U+035E COMBINING DOUBLE MACRON, U+035F COMBINING DOUBLE MACRON BELOW). The problem is that collation at a strength that ignores case differences will not compare these strings as equal.

Or it could also be:

  <U+0391, WJ, U+0359, U+0399>

using a zero-width word joiner (U+2060) to hold the simple combining asterisk below (this will create three grapheme clusters, with the second one kerned below the two surrounding letters). I think this solution is preferable, because collation at a strength that ignores case differences (and treats WJ as ignorable) will compare the uppercased string as equal to the original lowercase string.

But now if you replace the asterisk with a macron, the macron will not be doubled: its visual length won't be increased to cover the whole width of the ALPHA+IOTA pair. It will only cover the gap between the two letters plus a tiny part of the right side of ALPHA, and it may extend up to the horizontal position of the vertical stroke of IOTA (or maybe not, depending on the inter-letter spacing and the default side bearings of the individual glyphs when letter-spacing is zero).

In both solutions, this requires a complex case conversion pattern where:

- <U+1FB3, combining characters> maps to <U+0391, combining characters converted to their "double" version, U+0399>

or (simpler, because you don't need to convert the combining characters):

- <U+1FB3, combining characters> maps to <U+0391, WJ, combining diacritics, U+0399>

Note that there will be additional difficulties if U+1FB3 is already followed by a double diacritic (such as a double tie above) and another letter, because the capitalized string should have the double diacritic changed into a triple one (to cover ALPHA+IOTA+the next letter) with the first solution, or a quadruple one (to cover ALPHA+WJ+IOTA+the next letter) with the second solution!

For this reason, I do think that the capitalization of YPOGEGRAMMENI would better be a COMBINING GREEK SMALL CAPITAL IOTA RIGHT and not a standard GREEK CAPITAL LETTER IOTA, even if it is not visually distinct (when not using a "small-caps" style for the combining capital iota).
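
To make the second (WJ-based) mapping pattern above concrete, here is a minimal Python sketch. The function name `upper_with_wj` is hypothetical, it handles only U+1FB3 (not the eta and omega forms), it treats any character of general category M* as a "combining character", and falling back to plain <U+0391, U+0399> when no marks follow is my own assumption. For comparison, Python's built-in `str.upper()` applies the standard full case mapping, which leaves the asterisk after the IOTA.

```python
import unicodedata

WJ = "\u2060"           # WORD JOINER
SMALL_ALPHA_YPO = "\u1fb3"  # GREEK SMALL LETTER ALPHA WITH YPOGEGRAMMENI
CAPITAL_ALPHA = "\u0391"
CAPITAL_IOTA = "\u0399"


def upper_with_wj(text: str) -> str:
    """Uppercase text, mapping <U+1FB3, marks> to <U+0391, WJ, marks, U+0399>.

    Hypothetical helper illustrating the proposal in this message;
    it is not a standard Unicode case mapping.
    """
    out = []
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == SMALL_ALPHA_YPO:
            # Collect the run of combining marks that follows the letter.
            j = i + 1
            while j < n and unicodedata.category(text[j]).startswith("M"):
                j += 1
            marks = text[i + 1:j]
            if marks:
                out.append(CAPITAL_ALPHA + WJ + marks + CAPITAL_IOTA)
            else:
                # Assumption: with no marks, keep the usual two-letter result.
                out.append(CAPITAL_ALPHA + CAPITAL_IOTA)
            i = j
        else:
            out.append(ch.upper())
            i += 1
    return "".join(out)


def show(s: str) -> str:
    """Render a string as a list of U+XXXX code points for inspection."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


sample = "\u1fb3\u0359"  # alpha with ypogegrammeni + combining asterisk below
print(show(sample.upper()))         # standard full case mapping
print(show(upper_with_wj(sample)))  # WJ-based mapping sketched above
```

The sketch does not attempt the double/triple diacritic conversion of the first solution, nor the case where U+1FB3 is already followed by a double diacritic.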