> However, in the class of languages for which I am trying to
> provide support, certain characters are meant to be produced
> by an ordered combination of other characters.  For example,
> the general sequence in Devanagari script (and this extends
> to the other scripts as well) is that
> consonant+virama+consonant produces
> half-consonant+consonant, where the half-consonant has no
> other unicode specification.  As a concrete case in
> Devanagari, na virama sa (viz., \u0928\u094d\u0938) should
> produce the nsa character (this sequence can be seen in any
> unicode representation of the word "Sanskrit" in Devanagari
> script).
> 
> It seems to me that TTF font specifications (i.e., those I
> converted to subfonts using Federico's ttf2subf) include
> these sequence definitions, which are then processed by each
> application providing support for the fonts.  Plan 9
> subfonts are much too simple for this.

yes.  this is a problem.  unfortunately the unicode guys
took the position that codepoints are divorced from glyphs,
and therefore one needs to implement some sort of character
layout engine to render unicode.  that's pretty bogus.
unfortunately, this case isn't as bad as it gets.  e.g. archaic cyrillic
letters have transliterations like ^^A in unicode.  would
three hats on an A be illegal?  i don't see what would prevent it.
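
a quick way to poke at the three-hats question, assuming the hats are
combining circumflexes (U+0302), which is only my guess at the case.
this is a sketch using python's standard unicodedata module; the
function name stack_hats is made up for illustration.

```python
import unicodedata

HAT = "\u0302"  # COMBINING CIRCUMFLEX ACCENT (assumed to be the "hat" in question)

def stack_hats(base, n):
    """return base followed by n combining hats, normalized to NFC."""
    return unicodedata.normalize("NFC", base + HAT * n)

s = stack_hats("A", 3)
# the first hat folds into the precomposed U+00C2; the other two stay
# behind as separate combining marks, so a renderer still has to stack them.
print([f"U+{ord(c):04X}" for c in s])
```

nothing in the normalizer rejects the sequence; it is left to whatever
draws the text to deal with the leftover marks.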

what is the total number of stealth characters like nsa?
if it's not too unreasonable, it might be good enough to steal part of
the operating system or application reserved areas.
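
roughly what i have in mind, as a sketch only.  it assumes the reserved
area is the private use area (U+E000..U+F8FF) and that the subfont
carries a glyph for nsa at whatever codepoint gets picked.  the table,
the codepoint assignment and the function name compose are all made up
for illustration.

```python
PUA_BASE = 0xE000  # start of the private use area, U+E000..U+F8FF

# hypothetical table: codepoint sequence -> private-use stand-in.
# a real table would need one entry per conjunct the font provides.
CONJUNCTS = {
    "\u0928\u094d\u0938": chr(PUA_BASE + 0),  # na virama sa -> "nsa" glyph
}

def compose(text, table=CONJUNCTS):
    """replace each known sequence with its stand-in, longest match first."""
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for k in keys:
            if text.startswith(k, i):
                out.append(table[k])
                i += len(k)
                break
        else:
            # no sequence starts here; pass the character through unchanged
            out.append(text[i])
            i += 1
    return "".join(out)

print(repr(compose("\u0928\u094d\u0938")))
```

the text would have to be run through something like this before it is
drawn, and mapped back before it is written out, since the private-use
codepoints mean nothing to anyone else.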

i hope my ignorance of the particular script in question
isn't leading to silly suggestions!

- erik
