John H. Jenkins <jenkins at apple dot com> wrote:

> Remember, though that the Unicode approach is that ZWJ is *not* the
> preferred Unicode way to support things like a discretionary ct
> ligature in Latin text. The standard says that the preferred way to
> handle this is through higher-level protocols.
>
> I know that you and I disagree with to what extent ligation control
> belongs in plain text, but the standard clearly allows both
> approaches. The ZWJ mechanism is not *the* Unicode approach.

Once again, I have done a poor job of expressing myself on this topic. Sometimes misunderstandings are the speaker's fault, sometimes the listener's, and sometimes both. In this case it is clearly my fault for not communicating well.

I should not have implied that ZWJ was the only way to effect ligation in Unicode Latin text, or that the user (or even the software) should have to insert ZWJ everywhere ligatures are desired. Rendering subsystems can certainly use their own judgement to ligate or not.

The way I read the ZWJ in regard to ligation is as a request to the renderer to override the default, in effect saying, "Look, dammit, I want a ligature here." The renderer (possibly influenced by the capability of the font) still has the right to decline that request.

Let's consider our good old friend, the "ct" ligature. Courier is a good example of a font that had better darned well *not* have a ct ligature; it would just look too weird. Helvetica (≈ Arial) might or might not have a "ct" ligature, but rendering systems using Helvetica probably would not use it by default. If Baskerville is used instead, the chances of using the ligature by default might be somewhat higher.

(Note that I am deliberately avoiding the question of "default modes" of fonts, or any mention of specific font technologies. Also note that I am steering way clear of the language-dependent "fi" ligature.)

So if the text contains the letters "ct", a Courier rendition definitely would not ligate them by default, and a Helvetica rendition probably would not, but a Baskerville rendition might. This is all up to the designers of the font and rendering engine, of course. (Please, if you are a font designer and know that one of these examples is wrong, be gentle and just treat them as examples.)

Now, if the text contains c + ZWJ + t, that should tell the renderer that the user would really, really like to see a ligature if possible.
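
To make that reading concrete, here is a minimal sketch in Python. It is only an illustration, not any vendor's rendering logic: the font table, its `has_ct` and `default` flags, and the function name `ligate_ct` are all hypothetical, and the per-font values are the same guesses used in the examples above.

```python
ZWJ = "\u200D"  # U+200D ZERO WIDTH JOINER

# Hypothetical font data (assumptions for illustration only):
#   has_ct  - the font contains a "ct" ligature glyph
#   default - the renderer ligates plain "ct" without being asked
FONTS = {
    "Courier":     {"has_ct": False, "default": False},
    "Helvetica":   {"has_ct": True,  "default": False},
    "Baskerville": {"has_ct": True,  "default": True},
}

def ligate_ct(text, font_name):
    """Return True if `text` should be drawn with the "ct" ligature."""
    font = FONTS[font_name]
    if not font["has_ct"]:
        # The renderer declines: there is no ligature glyph to show.
        return False
    if text == "c" + ZWJ + "t":
        # Explicit request: override the default and ligate.
        return True
    if text == "ct":
        # No request either way: the font/renderer default applies.
        return font["default"]
    # Anything else is not treated as a "ct" ligature candidate here.
    return False

for name in FONTS:
    print(name, ligate_ct("ct", name), ligate_ct("c" + ZWJ + "t", name))
```

The only case where ZWJ changes the outcome in this sketch is Helvetica, where the ligature exists but is off by default.
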
In the case of Courier, it *isn't* possible, so you still get a "c" and a "t". In the case of Helvetica and Baskerville, assuming those fonts have a "ct" ligature, the default (whatever it was) should be overridden and the ligature should be displayed.

The same thing is true for ZWNJ. That is, if the default behavior for Baskerville is to ligate "ct", then c + ZWNJ + t should result in two discrete letters.

Now, we know that fonts and renderers already do this without being told, because ZWNJ breaks up the combination that would otherwise be ligated, and that behavior (while accidental) is correct. My point is that, if fonts and renderers are *also* breaking up potential ligatures because of an intervening ZWJ, that is NOT correct according to Unicode. The accidental, naïve behavior that does the right thing for ZWNJ does not do the right thing for ZWJ.

This is what I am proposing be changed: fonts and/or rendering engines (wherever the intelligence lies, depending on the vendor technology) should be updated to recognize "letter + ZWJ + letter" (and similar combinations of 3 or more letters) as a request to ligate the characters if possible.

I am *not* suggesting that fonts and rendering engines and intelligent text processing tools like InDesign be stripped of all power to control ligation. They are probably in an excellent position to do so. (I wish, oh how I wish, that Microsoft Word had some facility for generating ligatures.)

And I am *not* suggesting that user overrides of the default ligation behavior be limited to inserting ZWJ or ZWNJ. If programs like InDesign give the user a convenient option to turn ligation on and off, globally or locally, more power to them.

What I am suggesting is that the Unicode ZWJ and ZWNJ *also* be honored as a way to control ligation. That is how I read the Unicode Standard.

-Doug Ewell
 Fullerton, California