> > and Philippe Verdy responded with another question:
> > 
> > > Isn't there a "Grapheme Disjoiner" format control character to
> > > force the absence of a ligature like <fi>, i.e. <f, GDJ, i>?
> > 
> > The answer to Philippe's rejoinder question is no, there is not
> > a "Grapheme Disjoiner" format control character.
> 
> I did not refer to a specific unicode character, I knew that there
> is one already dedicated, but I did not want to comment about
> this choice.
> 
> There's no contradiction. The Grapheme Disjoiner, for you is
> ZWNJ. OK.

<ad hominem>

Every so often, Philippe, it would be refreshing if, when someone
points out an error in your claims about the Unicode Standard,
you would simply acknowledge the error and discontinue
making the claim, instead of coming back trying to claim that
the error was just another way of being right.

</ad hominem>

There is a separate character, U+034F COMBINING GRAPHEME JOINER,
which is the "grapheme joiner", abbreviation "CGJ" in the
standard. That character has nothing to do with ligation
control. There has also been debate, on several occasions,
within the UTC, regarding the advisability of encoding
a "grapheme non-joiner", as a pair with the "grapheme joiner".
But again, such a grapheme non-joiner -- which has *not* been
encoded, by the way -- would have nothing to do with ligation
control.

So it is a disservice to the list, perpetuating confusion, to
invent the term "Grapheme Disjoiner" and use it in a series
of notes regarding ligation control, when the standard already
designates the ZWJ and the ZWNJ as the relevant format controls
for that purpose.

So it is not that for me "the Grapheme Disjoiner is the ZWNJ";
rather, it is for the Unicode Standard that the ZWNJ is the
designated, standardized format control for ligation control
of the sort you are talking about. Please learn the terminology
and make correct use of it.
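
To keep the three characters apart, here is a minimal sketch using
Python's standard unicodedata module. The variable names are mine,
purely for illustration. It only looks up the standardized names and
builds the sequence <f, ZWNJ, i>; whether a given renderer or font
actually honors the ZWNJ is outside what this code can show.

```python
import unicodedata

# The two ligation-related format controls, plus the unrelated CGJ.
ZWNJ = "\u200c"
ZWJ = "\u200d"
CGJ = "\u034f"

# Map each code point to its standardized character name.
names = {f"U+{ord(c):04X}": unicodedata.name(c) for c in (ZWNJ, ZWJ, CGJ)}

# The sequence <f, ZWNJ, i>, i.e. a request not to ligate f and i.
no_ligature_request = "f" + ZWNJ + "i"
code_points = [f"U+{ord(c):04X}" for c in no_ligature_request]

for code_point, name in names.items():
    print(code_point, name)
print(code_points)
```

None of the names contains "disjoiner", and the only one containing
"GRAPHEME" is the CGJ, which is not a ligation control.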

> A font that would automatically select a <fi> ligature to represent
> a sequence of <f, i> codepoints, from the fact that the <fi>
> codepoint is canonically equivalent

U+FB01 LATIN SMALL LIGATURE FI is not a *canonical* equivalent to
<f, i>; it is *compatibility* equivalent. That is an important
distinction.
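
The difference is easy to check. The following is a small sketch
using Python's standard unicodedata module (variable names are
illustrative only): the canonical normalization forms leave U+FB01
alone, while the compatibility forms replace it with <f, i>.

```python
import unicodedata

FI_LIGATURE = "\ufb01"  # U+FB01 LATIN SMALL LIGATURE FI

# Canonical normalization forms leave the ligature untouched.
nfd = unicodedata.normalize("NFD", FI_LIGATURE)
nfc = unicodedata.normalize("NFC", FI_LIGATURE)

# Compatibility normalization forms replace it with <f, i>.
nfkd = unicodedata.normalize("NFKD", FI_LIGATURE)
nfkc = unicodedata.normalize("NFKC", FI_LIGATURE)

# A "<compat>" tag marks a compatibility decomposition mapping.
decomposition = unicodedata.decomposition(FI_LIGATURE)

print(nfd == FI_LIGATURE, nfc == FI_LIGATURE)
print(nfkd == "fi", nfkc == "fi")
print(decomposition)
```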

> is probably  defective and not
> conforming. 

Wrong. There is nothing nonconformant about fonts automatically
ligating <f, i> (or any other sequence). Such automatic
ligation may not always be appropriate or the desired result
for an end user, but that has nothing to do with the conformance
requirements of the standard.

> Such selection of ligature must be put under the
                             ^^^^
                             
Wrong. "must" --> "may"

> control of the renderer with additional markup, which can in fact
> select among three ligatures in Turkish: the <fi> ligature glyph
> where the f is ligated with the dot above i (normal ligature for
> languages other than Turkish/Azeri), the <f-dotted-i> and
> <f-fotted-i> ligatures for Turkish/Azeri.

It is unclear that any such ligatures are required or desirable
for Turkish/Azeri, in any case.

> Markup is necessary to select the appropriate glyph, or this
  ^^^^^^^^^^^^^^^^^^^
  
Wrong. A higher-level protocol is needed, and that may involve
markup. But the Turkish requirements can equally well be
met by simply setting "no ligature" style settings for
the relevant fonts.

> can be selected by using the "Grapheme Disjoiner" (ZWNJ)
                               ^^^^^^^^^^^^^^^^^^^^
                               
Wrong term. See above.

> or the "Grapheme Joiner" (ZWJ) in addition to the use of
         ^^^^^^^^^^^^^^^^^
         
Wrong term. See above.

> a <i> or <dotless-i> codepoint eventually followed by the
> <i-above> diacritic.

And in any case, it is inadvisable to be suggesting use of
ZWJ and ZWNJ in this way to solve the problem of assuring that
Turkish texts don't ligate inappropriately on rendering. 

> All this enrichment of text is assumed
> to be under the control of the markup added to the original
> text which does not need to specify whether ligatures should
> or should not be used.

This last clause I agree with. But the implication that
markup has to be added to Turkish text in order to get it
to render correctly regarding ligature usage is incorrect.
Adding markup to the text is "adding to the original text"
as surely as adding ZWNJ format controls would be. In any
case it is unnecessary, since alternatives exist which simply
specify suppression (or use) of ligatures stylistically in
the fonts.

--Ken