Peter Kirk asked:

> > In Turkish and Azeri the sequences f - i and f - dotless i both occur,
> > and are fairly frequent. So it is inappropriate in these languages to
> > use fi ligatures in which the dot on the i is lost or invisible, at
> > least where the second character is a dotted i. Has any thought been
> > given to this issue? Is it possible to block such ligation on a
> > language-dependent basis?
> 

and Philippe Verdy responded with another question:

> Isn't there a "Grapheme Disjoiner" format control character to force the
> absence of a ligature like <fi>, i.e. <f, GDJ, i>?

The answer to Philippe's question is no: there is no
"Grapheme Disjoiner" format control character.

What Philippe has in mind, however, is covered in the standard
by the interaction of the joiner and non-joiner characters
with ligature control:

"U+200C ZERO WIDTH NON-JOINER is intended to break both cursive
connections and ligatures in rendering.

"ZWNJ requests that glyphs in the lowest available category
(for the given font) be used."

      -- Unicode 4.0, Section 15.2, Layout Controls

The categories referred to, from lowest to highest, are:

1. unconnected
2. cursively connected
3. ligated

As Peter pointed out, however, it is neither expected nor reasonable
to have to go back through and drop in ZWNJs at every relevant
location in existing Turkish or Azeri text, simply to prevent
fi ligation. Such use of ZWNJ is intended to be exceptional,
to deal with special cases.
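
To make the exceptional case concrete, here is a minimal Python
sketch of the <f, ZWNJ, i> sequence. The helper name
block_fi_ligature is made up for illustration, and the sketch only
shows the character sequence; whether the ligature is actually
suppressed still depends on the renderer and the font.

```python
ZWNJ = "\u200C"  # U+200C ZERO WIDTH NON-JOINER


def block_fi_ligature(text: str) -> str:
    """Insert ZWNJ between 'f' and a following dotted 'i' (U+0069).

    Hypothetical helper, for illustration only. Sequences of 'f' plus
    dotless i (U+0131) are left untouched, since they do not match "fi".
    """
    return text.replace("fi", "f" + ZWNJ + "i")


marked = block_fi_ligature("fikir")  # Turkish word containing f + dotted i
print([f"U+{ord(c):04X}" for c in marked[:3]])
# prints ['U+0066', 'U+200C', 'U+0069']
```

Doing this across an entire corpus is exactly the kind of
retrofitting described above as unreasonable, so it is shown only
for the special case.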

The general solution depends instead on the use of fonts (or, more
generally, renderers) that block such ligation across the
board. It is my understanding that modern font technologies
allow the choice of ligation to be essentially a style selection
for the font. How well various applications take advantage
of that, and how easily they make the choice available to end
users, may still be an open issue, but the fundamental pieces to
do this correctly are available.

--Ken

