Re: Unicode issues

Simon Pepping Mon, 15 Jan 2007 11:29:41 -0800

On Mon, Jan 15, 2007 at 04:42:12PM +0100, J.Pietschmann wrote:
 
> As for Ligatures and character shaping: an algorithm for automatically
> detecting ligature points may use a pattern lookup similar to the
> pattern based hyphenation. The pattern dictionary should store only
> either NFD or NFC forms, for the same reason this is advisable for
> hyphenation.


Aren't ligatures a feature of the font, e.g. the GSUB table of an
OpenType font? That is, one font may have a specific ligature, while
another font does not.

> We should choose either NFD or NFC as a canonical representation for
> hyphenation patterns (and, in the future, for similar things), so that
> hyphenation patterns containing umlauts can be found regardless of
> the representation of the umlaut in the source file. Currently, we
> don't care much, which works but may break suddenly.
> There is obviously a slight space vs. run time tradeoff (NFC ought to
> be more compact but NFC'ing the source text may be more expensive
> than NFD'ing).

NFC is the standard for the web. Does that carry any weight?
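
To make the representation problem concrete, here is a minimal sketch in
Python (not FOP code). It assumes a hypothetical pattern dictionary keyed by
the NFC form of a word; the names canonical_key and patterns are made up for
illustration. The same umlaut spelled as one code point or as base letter
plus combining diaeresis only finds its entry if both the stored key and the
source text are normalized to the same form.

```python
import unicodedata


def canonical_key(text: str, form: str = "NFC") -> str:
    """Return text in one canonical form (NFC assumed here) for lookups."""
    return unicodedata.normalize(form, text)


# The same word in two representations of the umlaut.
composed = "M\u00e4dchen"      # U+00E4, precomposed a with diaeresis
decomposed = "Ma\u0308dchen"   # 'a' followed by U+0308 combining diaeresis

# Hypothetical pattern dictionary: keys are stored in NFC only.
patterns = {canonical_key("M\u00e4dchen"): "M\u00e4d-chen"}

# A raw lookup depends on how the source file happened to encode the umlaut.
raw_hit_composed = composed in patterns
raw_hit_decomposed = decomposed in patterns

# Normalizing the source text first makes the lookup representation-independent.
hit_composed = patterns.get(canonical_key(composed))
hit_decomposed = patterns.get(canonical_key(decomposed))

# Size difference behind the space tradeoff: NFC is shorter in code points.
nfc_len = len(unicodedata.normalize("NFC", decomposed))
nfd_len = len(unicodedata.normalize("NFD", composed))
```

Choosing NFD instead would only change the form argument, as long as the
dictionary and the source text agree on it.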

Simon

-- 
Simon Pepping
home page: http://www.leverkruid.eu
