At 11:15 AM 12/30/2003, Peter Kirk wrote:
I understand this, and, as I answered separately, I don't think this is the appropriate mechanism in this case, as the suggested ligature is not fully equivalent to the sequence.
Even if it were verified, it isn't a good case for encoding a separate character *equivalent* to a combination of two existing characters: that's a glyph variant ligature.
Actually, I don't think so. The separate character was not formed by merging the dot into the letter, rather the distinction was made in a different way.
In modern digital font development, ligation refers to the mechanism of display, not the visual appearance, which is largely irrelevant. A ligature is any glyph that represents two or more characters, typically arrived at by a ligation lookup. If I wanted a special sin glyph *equivalent* to the character sequence <shin, sindot>, I would ligate the two characters to that single glyph, either directly
shin sindot -> sin
or via a two-stage stylistic variant lookup associated with a different typographic feature
shin sindot -> shin_sindot and then shin_sindot -> sin
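The two routes above can be sketched as a toy model in Python. This is only an illustration of many-to-one glyph substitution, not how any real shaping engine or font compiler is implemented; the glyph names are the ones used in the rules above, while `apply_lookup`, `shape_direct`, `shape_two_stage` and the `stylistic_feature_on` switch are hypothetical names introduced for the sketch.

```python
# Toy model of ligature (many-to-one) glyph substitution.
# Assumption: a "lookup" is just a dict mapping a tuple of glyph names
# to a single replacement glyph name. Real font lookups are richer.

def apply_lookup(glyphs, rules):
    """Scan left to right, replacing the longest matching sequence."""
    ordered = sorted(rules.items(), key=lambda kv: -len(kv[0]))
    out = []
    i = 0
    while i < len(glyphs):
        for seq, replacement in ordered:
            if tuple(glyphs[i:i + len(seq)]) == seq:
                out.append(replacement)
                i += len(seq)
                break
        else:
            # No rule matched at this position: keep the glyph unchanged.
            out.append(glyphs[i])
            i += 1
    return out

# Route 1: ligate the two characters directly to the single glyph.
DIRECT = {("shin", "sindot"): "sin"}

# Route 2: form the default ligature first, then swap in the variant
# under a separate (hypothetical) stylistic feature.
STAGE_1 = {("shin", "sindot"): "shin_sindot"}
STAGE_2 = {("shin_sindot",): "sin"}

def shape_direct(glyphs):
    return apply_lookup(glyphs, DIRECT)

def shape_two_stage(glyphs, stylistic_feature_on=True):
    glyphs = apply_lookup(glyphs, STAGE_1)
    if stylistic_feature_on:
        glyphs = apply_lookup(glyphs, STAGE_2)
    return glyphs

run = ["shin", "sindot", "alef"]
direct_result = shape_direct(run)
variant_result = shape_two_stage(run, stylistic_feature_on=True)
default_result = shape_two_stage(run, stylistic_feature_on=False)
```

In this model both routes end at the same `sin` glyph; the two-stage route differs only in that the variant form can be switched off while the default `shin_sindot` ligature is kept.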
But if it were, this ligature would be very interesting and problematic, because it is a ligature between a base character and a diacritic. This is not a problem if the ligature is always used in a particular font, but it is problematic if the ligature is optional: ZWNJ and ZWJ cannot be used between base characters and diacritics, because they break the combining sequence. We came across this problem before with Hebrew script, but in a rather different (and less ambiguous) context, that of the need for a ligature between meteg and hataf vowels.
I wonder if there are other, better defined, cases of ligatures between base characters and diacritics in other scripts, i.e. cases where there is an optional alternative to base character plus diacritic which does not look like the base character plus the diacritic. Candidates like � as an alternative for � are ruled out because they are already separately encoded. I have certainly seen glyphs rather like U+0255 used for c cedilla. In the light of recent discussions, I can easily imagine a script or style like Sütterlin having a special ligated form for u umlaut. That ligature would have to be avoided in the name Saül, where the dots represent a diaeresis rather than an umlaut; there, two dots should be written above the letter as in normal Latin script.
OpenType and similar font technologies are currently able to make these distinctions consistently, with the mechanisms John described above; but these mechanisms fail if the ligature needs to be optional, as ZWNJ and ZWJ cannot be used.
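One side effect of putting a joiner between a base character and a diacritic can be observed with Python's standard `unicodedata` module. This is only a sketch of normalization behaviour, using u plus diaeresis as a stand-in example; it says nothing about how a particular font or rendering engine treats the sequence, and the variable names are illustrative.

```python
import unicodedata

ZWJ = "\u200d"        # ZERO WIDTH JOINER
DIAERESIS = "\u0308"  # COMBINING DIAERESIS

plain = "u" + DIAERESIS
with_joiner = "u" + ZWJ + DIAERESIS

# NFC composes a base letter with a directly following mark when a
# precomposed character exists.
composed_plain = unicodedata.normalize("NFC", plain)

# With ZWJ in between, the mark no longer directly follows the letter,
# so NFC leaves the three code points as they are.
composed_joiner = unicodedata.normalize("NFC", with_joiner)

# ZWJ is a format character with canonical combining class 0.
zwj_class = unicodedata.combining(ZWJ)
zwj_category = unicodedata.category(ZWJ)
```

Here `composed_plain` is the single precomposed letter, while `composed_joiner` is unchanged, so the two spellings are not canonically equivalent.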
Are there any real examples where this might be necessary?
As this is a more general issue, I am copying it back to the main Unicode list.
-- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/

