At 15:09 11/3/2002, Doug Ewell wrote:

This is what I am proposing be changed: fonts and/or rendering engines
(wherever the intelligence lies, depending on the vendor technology)
should be updated to recognize "letter + ZWJ + letter" (and similar
combinations of 3 or more letters) as a request to ligate the characters
if possible.

I am *not* suggesting that fonts and rendering engines and intelligent
text processing tools like InDesign be stripped of all power to control
ligation.  They are probably in an excellent position to do so.  (I
wish, oh how I wish, that Microsoft Word had some facility for
generating ligatures.)  And I am *not* suggesting that user overrides of
the default ligation behavior be limited to inserting ZWJ or ZWNJ.  If
programs like InDesign give the user a convenient option to turn
ligation on and off, globally or locally, more power to them.  What I am
suggesting is that the Unicode ZWJ and ZWNJ *also* be honored as a way
to control ligation.  That is how I read the Unicode Standard.

I basically agree with you, Doug, and my proposal for handling ZWJ ligation in OpenType would provide exactly what you describe, if implemented in fonts and supported by rendering engines. There are, however, a number of issues that need to be resolved. In order for a font lookup sequence involving ZWJ to be processed during layout, a *glyph* for the ZWJ character has to be painted in the glyph string, since font lookups work at the glyph level. Because ZWJ already had a function as a control character, e.g. in Indic script processing, prior to being press-ganged into service for ligation, existing implementations do not paint a glyph for this character unless the user invokes an option to display control characters, e.g. in MS Office. In order to permit the latter option, these characters, if they are supported in a font at all, are represented by a special glyph: a vertical bar with a little x at the top, set on a zero advance width. A visible glyph of this kind obviously presents various problems if it is painted in ordinary text, and should be a warning to the UTC to avoid repurposing characters that have already been implemented for other purposes: such implementations might not be compatible with the intended new purpose.

So we have a quandary: do we stop treating ZWJ as a control character and always paint a glyph so that it can be used in lookup sequences? If we do this, we run the risk of a visible glyph appearing in text anywhere that a font does not provide a ligature glyph or lookup sequence. Do we avoid this by making the ZWJ glyph a blank, zero-width glyph? If we do this, we can no longer use current methods to provide users with the option of displaying control characters (I can think of various ways to solve this particular problem, including glyph substitution, e.g. a 'Control Display Forms' layout feature that would map the blank glyphs to visible forms). We also lose the ability to kern the glyphs on either side of the ZWJ if a ligature is not available (this could be solved with a lot of contextual kerning data, but that would be a serious pain).

I'm not saying that any of these problems are insoluble, or that software developers should not rewrite all their existing rendering engines and rethink their approach to control characters in order to implement ZWJ ligation. I just think people should be aware that supporting ZWJ ligation is considerably more difficult than it would have been if, for example, Michael Everson's initial proposal for a separate Zero-Width Ligator had been accepted. Implementing something new is a lot easier than completely changing an existing implementation for a character whose purpose has suddenly been redefined. The more widely implemented Unicode becomes, the more the UTC will need to consider the impact of its decisions on existing implementations.
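
To make the glyph-level problem concrete, here is a rough Python sketch of a toy layout step. It is not how any real rendering engine or OpenType library works: the character map, the glyph names ("zwj", "c_t") and the single ligature lookup are all hypothetical, invented only for illustration. It shows that a lookup keyed on "c + ZWJ + t" can only match if the ZWJ glyph is kept in the glyph string, and that a kept ZWJ glyph is left over whenever the font has no matching ligature.

```python
ZWJ = "\u200d"

# Hypothetical toy font: character -> glyph name, plus one ligature lookup
# defined on glyph sequences (all names here are assumptions).
CMAP = {"a": "a", "c": "c", "t": "t", ZWJ: "zwj"}
LIGATURES = {("c", "zwj", "t"): "c_t"}


def to_glyphs(text, paint_zwj):
    """Map characters to glyph names; drop ZWJ when it is treated
    as an unpainted control character."""
    return [CMAP[ch] for ch in text if paint_zwj or ch != ZWJ]


def apply_ligatures(glyphs):
    """Left-to-right substitution of matching glyph sequences."""
    out, i = [], 0
    while i < len(glyphs):
        for seq, lig in LIGATURES.items():
            if tuple(glyphs[i:i + len(seq)]) == seq:
                out.append(lig)
                i += len(seq)
                break
        else:
            # No lookup matched at this position: keep the glyph as-is.
            out.append(glyphs[i])
            i += 1
    return out


def shape(text, paint_zwj):
    return apply_ligatures(to_glyphs(text, paint_zwj))


print(shape("c" + ZWJ + "t", paint_zwj=True))   # lookup can match
print(shape("c" + ZWJ + "t", paint_zwj=False))  # ZWJ dropped, no match
print(shape("a" + ZWJ + "t", paint_zwj=True))   # no ligature: ZWJ glyph remains
```

In the third case the leftover "zwj" glyph sits between "a" and "t", which is where the choice between a visible control glyph and a blank zero-width glyph, and the kerning problem across it, comes in.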

John Hudson

Tiro Typeworks www.tiro.com
Vancouver, BC [EMAIL PROTECTED]

It is necessary that by all means and cunning,
the cursed owners of books should be persuaded
to make them available to us, either by argument
or by force. - Michael Apostolis, 1467

