On Monday, July 1, 2002, at 02:08 PM, Asmus Freytag wrote:

> At 11:34 AM 6/30/02 -0600, John H. Jenkins wrote:
>> Remember, Unicode is aiming at encoding *plain text*.  For the bulk of 
>> Latin-based languages, ligation control is simply not a matter of *plain 
>> text*—that is, the message is still perfectly correct whether ligatures 
>> are on or off.  There are some exceptional cases.  The ZWJ/ZWNJ is 
>> available for such exceptional cases.
>
> Remember also that the simplistic model you present already breaks down 
> for German, since the same character pair may or may not allow ligation 
> depending on the content and meaning of the text - features that in the 
> Unicode model are relegated to *plain* text.
>

*sigh*  I'm clearly not expressing myself well here.

I'm trying to state the general rule.  Each time I do, I say there are 
exceptions.  German is an excellent example of an exception.  Michael's 
exceptional cases are exceptional cases.  We put ZWJ/ZWNJ in charge of 
plain-text ligature formation to handle these cases.  I'm fine with that.
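
As a rough illustration of what that looks like in practice, here is a minimal Python sketch. The helper names (prohibit_ligature, strip_ligature_controls) are made up for this example and are not from any library. The German word "Auflage" stands in for the usual case, on the assumption that an "fl" ligature is unwanted across the "Auf" + "lage" boundary. Only the code points U+200C and U+200D come from Unicode; whether a ligature is actually suppressed depends on the font and the layout engine.

```python
# Hypothetical helpers for illustration only; not part of any standard API.
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER: marks a point where ligation is prohibited
ZWJ = "\u200D"   # ZERO WIDTH JOINER: marks a point where ligation is requested


def prohibit_ligature(word: str, index: int) -> str:
    """Return word with ZWNJ inserted immediately before word[index]."""
    return word[:index] + ZWNJ + word[index:]


def strip_ligature_controls(text: str) -> str:
    """Remove ZWJ/ZWNJ, e.g. before a naive comparison or search."""
    return text.replace(ZWNJ, "").replace(ZWJ, "")


# Assumed example: "Auf" + "lage", so the mark goes between "f" and "l".
marked = prohibit_ligature("Auflage", 3)

print(repr(marked))
print(strip_ligature_controls(marked) == "Auflage")
```

The point of the sketch is that the mark travels with the text itself, which is why it is reserved for the exceptional cases rather than for general stylistic preferences.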

Turkish is another exception, BTW, where the typical "fi" ligature of 
Latin typography should not be formed.

The issue -- as I see it -- is not whether or not *any* ligature control 
belongs in plain text, or whether or not mandatory/prohibited ligation 
points should be marked in plain text.  I'm not aware of anyone who is 
arguing against that position.

We started out with a discussion of whether or not we should add more 
Latin ligatures (whether in the PUA or elsewhere) so that people can, in 
essence, create a plain-text representation of an older book where such 
were more common.  (And, as always, if my memory is inaccurate please feel 
free to correct me here.)  This is not an appropriate use of plain text 
IMHO.  I do not believe, moreover, that the ZWJ/ZWNJ mechanism is 
appropriate for this sort of thing.  This is rich text, and other ligation 
controls should be used.

> Therefore, I would be much happier if the discussion of the 'standard' 
> case wasn't as anglo-centric and allowed more directly for the fact that 
> while fonts are in control of what ligatures are provided, layout engines 
> may be in control of what and how many optional ligatures to use, the 
> text (!) must be in control of where ligatures are mandatory or 
> prohibited.
>

Which is what Unicode 3.2 says.  (You said it very nicely here, though.)

(The standard case, BTW, seems to be Anglo-centric largely because this is 
an English-speaking list and people always seem to start out with the "ct" 
ligature they'd like to put in words like "respectfully."  Sorry about 
that.)

==========
John H. Jenkins
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/

