On 24/04/2004 21:01, Ernest Cline wrote:

... My point here was that adding a category of characters
that was tightly bound to the preceding character without using the
existing combining class mechanism would cause problems
for normalization that could not be avoided, and as such, it is
impossible to add variation selectors for combining marks
unless the variation selector for a combining mark is of the
same canonical combining class. That would cause any
proposal for such variation selectors to have to add variation
selectors for each canonical combining class, and thus
increase the cost of implementing such a proposal.



Let us remember that problems arise with class 0 VSs only when the VS is preceded by more than one combining mark. So it would be possible to specify that a VS may be preceded by no more than one combining mark. A base character with two combining marks, one of which has a variant glyph, must then be encoded B CM2 VS CM1, irrespective of the canonical order. This is stable under normalisation because the VS is class 0.

Even if the canonical order without the VS is B CM1 CM2 (i.e. cc(CM2) > cc(CM1)), the sequence B CM1 CM2 VS and its unnormalised canonical equivalent B CM2 CM1 VS can be defined as illegal (just as at present any sequence of CM VS is illegal), and the sequence with the variant glyph would have to be B CM2 VS CM1. This avoids any problems with normalisation by defining as illegal the sequences which can be reordered.

(There is a small problem when variants of BOTH combining marks are required, as B CM1 VS1 CM2 VS2 and B CM2 VS2 CM1 VS1 are equivalent in meaning but not canonically equivalent. This could happen in Hebrew, e.g. if a VS is used for dagesh hazaq as well as for qamats qatan, but it should be rare enough to be a marginal problem.)
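
The reordering behaviour described here can be tried out with Python's standard unicodedata module. This is only a sketch under stated assumptions: bet (U+05D1) stands in for B, qamats (U+05B8, class 18) for CM1, dagesh (U+05BC, class 21) for CM2, and VARIATION SELECTOR-1 and -2 (U+FE00, U+FE01, both class 0) for the VSs. These characters are chosen for illustration; no such variation sequences are actually defined.

```python
import unicodedata

B = "\u05D1"    # HEBREW LETTER BET, stands in for the base character B
CM1 = "\u05B8"  # HEBREW POINT QAMATS, canonical combining class 18
CM2 = "\u05BC"  # HEBREW POINT DAGESH OR MAPIQ, canonical combining class 21
VS1 = "\uFE00"  # VARIATION SELECTOR-1, class 0 (hypothetical use with a mark)
VS2 = "\uFE01"  # VARIATION SELECTOR-2, class 0 (hypothetical use with a mark)


def nfd(s):
    """Return the canonical decomposition (NFD) of s."""
    return unicodedata.normalize("NFD", s)


# Without a VS the marks are sorted by class: B CM2 CM1 becomes B CM1 CM2.
plain_reordered = nfd(B + CM2 + CM1) == B + CM1 + CM2

# A VS after two marks: both orders normalise to the same string, so the
# VS cannot be tied reliably to either mark.
trailing_vs_ambiguous = nfd(B + CM2 + CM1 + VS1) == nfd(B + CM1 + CM2 + VS1)

# A VS directly after one mark: the class 0 VS blocks reordering, so
# B CM2 VS CM1 is left unchanged by normalisation.
single_vs_stable = nfd(B + CM2 + VS1 + CM1) == B + CM2 + VS1 + CM1

# Variants of both marks: each spelling is stable, but the two spellings
# stay distinct, i.e. they are not canonically equivalent.
a = B + CM1 + VS1 + CM2 + VS2
b = B + CM2 + VS2 + CM1 + VS1
both_stable = nfd(a) == a and nfd(b) == b
both_distinct = nfd(a) != nfd(b)

print(plain_reordered, trailing_vs_ambiguous, single_vs_stable)
print(both_stable, both_distinct)
```

NFD is used rather than NFC so that only canonical reordering is in play and composition does not enter the picture.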


It might make sense to relax the restriction on allowable
variation sequences to include combining marks of class 0,
and maybe even to provide variation selectors for the two
big classes of combining characters, 220 and 230, given
that those two classes are far and away the largest non-0
classes at present and are likely to remain so.



In principle this makes sense. In practice it fails to solve the specific problem with Hebrew, because most of the combining marks which have variants are not in classes 220 or 230.
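
The classes of the Hebrew points can be checked against the character database. A minimal sketch in Python, assuming the points of interest lie in U+05B0..U+05BC (sheva through dagesh); that range is chosen here for illustration and is not a list of the marks for which variants have been proposed.

```python
import unicodedata

# Map each Hebrew point in U+05B0..U+05BC to its canonical combining class.
# The range is an assumption made for this example.
hebrew_points = {
    unicodedata.name(chr(cp)): unicodedata.combining(chr(cp))
    for cp in range(0x05B0, 0x05BD)
}

for name, ccc in hebrew_points.items():
    print(f"{ccc:3d}  {name}")

# Points that a VS of class 220 or 230 could follow without being reordered.
covered = [name for name, ccc in hebrew_points.items() if ccc in (220, 230)]
print("points in class 220 or 230:", covered)
```

Each of these points has its own fixed-position class, so VSs of classes 220 and 230 alone would not match any of them.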

Earlier, Ernest wrote:

Adding Variation Selectors with non-zero canonical
combining classes is possible, but I fail to see the benefits
from adding new Variation Selectors on the SSP outweighing
the benefits of defining new vowel marks in the Hebrew
block.


The benefits of using variation selectors rather than new code points in this case are exactly the same as those for variation selectors for base characters, as expressed in TUS section 15.3:


Occasionally the need arises in text processing to restrict or change the set of glyphs that are to be used to represent a character. ... In special circumstances, such a variation from the normal range of appearance needs to be expressed side-by-side in the same document in plain text contexts, where it is impossible or inconvenient to exchange formatted text. ... The variation selectors are used when characters have essentially the same semantic.


Variation selectors provide a mechanism for specifying a restriction on the set of glyphs that are used to represent a particular character. They also provide a mechanism for specifying variants ... that have essentially the same semantic but substantially different ranges of glyphs.


I accept that there is some continuing debate (for which the Hebrew list is the proper place) over whether the particular variant characters I have in mind do "have essentially the same semantic". But in principle these conditions may be true of combining characters just as much as of base characters. And so the reasons for which VSs are defined for base characters are just as valid for combining characters.

As for the new variation selectors being in the SSP, is this actually necessary, or could they be in the Hebrew block, space permitting? After all, if we are talking about VSs with the fixed combining classes of Hebrew points, they are useful only with Hebrew script.

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/



