Peter Constable wrote:
> There is a potential concern in Uniscribe/OpenType: substitution and
> positioning rules in OT are organised hierarchically by script then by
> individual writing system / typographic groups (the label used is
> languages, but the intent is really groups of writing systems that share
> common typographic behaviours). Thus, a rule that handles positioning of a
> glyph for 0950 (or whatever) relative to some member of some class of
> glyphs must be entered somewhere under some particular script. Now, there
> is nothing that prohibits a font developer from creating multiple
> positioning rules for 0950 with different classes of base glyphs and to
> have a different one placed in the hierarchy under several different
> scripts.
Fully agreed so far.
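
To make the hierarchy concrete, here is a minimal sketch in Python. It is
a toy model only, not the real OpenType table layout: the MarkRule type,
the glyph names and the find_rule helper are all hypothetical, and only
the script tags "deva" and "telu" and the default language tag "dflt"
are taken from OpenType usage.

```python
from typing import Dict, List, NamedTuple, Optional


class MarkRule(NamedTuple):
    """Hypothetical positioning rule: one mark glyph against a class of bases."""
    mark_glyph: str
    base_class: frozenset


# script tag -> language-system tag -> rules (a toy stand-in for the hierarchy)
LOOKUPS: Dict[str, Dict[str, List[MarkRule]]] = {
    "deva": {"dflt": [MarkRule("accent", frozenset({"deva_ka", "deva_ga"}))]},
    "telu": {"dflt": [MarkRule("accent", frozenset({"telu_ka", "telu_ga"}))]},
}


def find_rule(script: str, language: str, mark: str, base: str) -> Optional[MarkRule]:
    """Return the rule filed under this script/language, or None if there is none."""
    langs = LOOKUPS.get(script, {})
    # Fall back to the script's default language system when the tag is unknown.
    rules = langs.get(language, langs.get("dflt", []))
    for rule in rules:
        if rule.mark_glyph == mark and base in rule.base_class:
            return rule
    return None
```

The same mark glyph has one rule per script, each with its own class of
base glyphs, so the rule that gets found depends entirely on which
script tag the engine has assigned to the run.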


> But there may yet be an issue on the Uniscribe side: given a
> string of characters, which it will begin by mapping into a string of
> initial glyphs, it has to decide which script tag(s) to apply to portions
> of the string. What I don't know is whether it generally assumes combining
> marks belong to a specific script, or whether it allows combining marks to
> inherit their script from the base characters with which they combine.
Look: in current Uniscribe, leading ZWJ and ZWNJ are discarded (i.e., with
input U+200B U+093E, you still get the circle meaning "incorrect combining"),
even though this is perfectly correct Unicode as far as I understand.
So clearly, they have a problem with "backtracking" when the script is
not determined by the first character in the stream. I can understand that.
OTOH, when ZWJ or ZWNJ come second or later in conjuncts, they are handled
properly, in every script where this is relevant. What I would like to see
is the Indic accents handled in the same way. And when I spoke about that
with MS people (and not only me, but also Pothana's designer), MS answered
that the Unicode standard seemed to imply that these accents apply to
the Devanagari script only.
It looks to me as if Scripts.txt just confirms the MS point of view.
If this is as intended, that is fine, but it means that a bunch of new
characters (with little or no added value) will have to be added in some
future revision of Unicode.
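
The two behaviours at stake can be sketched as follows. This is a toy
itemizer in Python, not Uniscribe's actual algorithm: the block ranges
stand in for Scripts.txt, and the choice of U+0951 and U+0952 as the
shared accents is an assumption made for illustration.

```python
from typing import List, Tuple

# Hypothetical stand-in for Scripts.txt: block ranges only.
BLOCKS = [(0x0900, 0x097F, "Devanagari"), (0x0C00, 0x0C7F, "Telugu")]

# Assumption for this sketch: the accents that should inherit their script.
SHARED_MARKS = {0x0951, 0x0952}


def own_script(ch: str) -> str:
    """Script of a character taken from its block alone."""
    cp = ord(ch)
    for lo, hi, name in BLOCKS:
        if lo <= cp <= hi:
            return name
    return "Common"


def itemize(text: str, inherit_marks: bool) -> List[Tuple[str, str]]:
    """Split text into (script, substring) runs.

    With inherit_marks=True, a shared accent takes the script of the
    character before it instead of the script of its own block.
    """
    runs: List[Tuple[str, str]] = []
    prev = None
    for ch in text:
        script = own_script(ch)
        if inherit_marks and ord(ch) in SHARED_MARKS and prev is not None:
            script = prev
        if runs and runs[-1][0] == script:
            runs[-1] = (script, runs[-1][1] + ch)
        else:
            runs.append((script, ch))
        prev = script
    return runs
```

For Telugu KA followed by the accent (U+0C15 U+0951), the strict reading
splits the pair into a Telugu run and a Devanagari run, so the accent
never meets a Telugu positioning rule; with inheritance the pair stays
in one Telugu run.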

By the way, the situation is similar with the dandas (U+0964 and U+0965):
they only appear in the Devanagari and Myanmar blocks, but are used for many
other (all?) South Asian scripts as well. Worse, they are in frequent use, so
there is already much material encoded with these code points.
Luckily, dandas do not need special handling from complex script engines,
so it does not matter whether Uniscribe decides they are Devanagari or
script-less (except perhaps for the selection of the font).


Antoine

