On 16/01/2004 11:17, Rick McGowan wrote:

The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page:

http://www.unicode.org/review/

The review period for the new item closes on January 27, 2004.

Please see the page for links to discussion and relevant documents. Briefly, the new issue is:


Issue #27 Joiner/Nonjoiner in Combining Character Sequences


Unicode 4.0 describes the structure of Khmer syllables, saying that they may contain an interior ZWJ. There is a problem with this that needs to be resolved in 4.0.1, because some of the characters later in the syllable can be combining characters. This paper describes a proposal to fix this problem. As a part of the proposal, a choice has to be made between two alternatives.



Although this issue has been brought up for review in the light of the problem with Khmer, it also has a significant impact on Hebrew, and for that reason I am bringing it to the attention of the Hebrew list as well.

I support the main proposal, which is to allow the ZWJ and ZWNJ characters to occur within combining character sequences. When they occur between two combining marks, they will indicate joining and non-joining forms respectively of those two combining marks. In Hebrew, this will provide a convenient mechanism for requesting or inhibiting ligatures between meteg and hataf vowels (see http://www.qaya.org/academic/hebrew/Issues-Hebrew-Unicode.html section 3.5). Previously there was no such mechanism that was strictly compatible with Unicode definitions. With this change, the following distinctions can be made:

<vowel, ZWJ, meteg> - medial meteg preferred, but only possible if the vowel is a hataf vowel (ZWJ must be ignored for other vowels)

<vowel, ZWNJ, meteg> - left meteg preferred

<vowel, meteg> - no preference, font default should be used (probably left meteg with all vowels)

<meteg, CGJ, vowel> - right meteg preferred - or should this last one be <meteg, ZWNJ, vowel>, considering that ZWNJ will have the same effect as CGJ of blocking canonical reordering?
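The sequences above can be checked against canonical ordering with a minimal Python sketch. It assumes alef as the base letter and hataf patah as the vowel, and the variable names are only illustrative. It shows that the sequences survive NFD normalization unchanged because ZWJ, ZWNJ and CGJ all have combining class 0, whereas an unprotected <meteg, vowel> is reordered. It says nothing about rendering, which is what the proposal itself would define.

```python
import unicodedata

# Illustrative choices: alef as base, hataf patah as the hataf vowel.
ALEF = "\u05D0"         # HEBREW LETTER ALEF, ccc 0
HATAF_PATAH = "\u05B2"  # HEBREW POINT HATAF PATAH, ccc 12
METEG = "\u05BD"        # HEBREW POINT METEG, ccc 22
ZWJ = "\u200D"          # ZERO WIDTH JOINER, ccc 0
ZWNJ = "\u200C"         # ZERO WIDTH NON-JOINER, ccc 0
CGJ = "\u034F"          # COMBINING GRAPHEME JOINER, ccc 0


def nfd(text):
    """Apply canonical decomposition, which includes canonical reordering."""
    return unicodedata.normalize("NFD", text)


# The distinctions listed above, plus the ZWNJ alternative for right meteg.
sequences = {
    "medial meteg": ALEF + HATAF_PATAH + ZWJ + METEG,
    "left meteg": ALEF + HATAF_PATAH + ZWNJ + METEG,
    "font default": ALEF + HATAF_PATAH + METEG,
    "right meteg (CGJ)": ALEF + METEG + CGJ + HATAF_PATAH,
    "right meteg (ZWNJ)": ALEF + METEG + ZWNJ + HATAF_PATAH,
}

for label, seq in sequences.items():
    # Each sequence is already in canonical order, so NFD leaves it alone.
    print(label, "stable under NFD:", nfd(seq) == seq)

# Without a class-0 character in between, meteg (22) sorts after the vowel (12).
unprotected = ALEF + METEG + HATAF_PATAH
print("unprotected reordered:", nfd(unprotected) == ALEF + HATAF_PATAH + METEG)
```

The results depend on the Unicode data shipped with the Python interpreter, although the combining classes of these characters have been stable for a long time.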

I have a small concern that at least potentially there might be a need to promote or inhibit a ligature between combining marks which do not come together in canonical order. For example, in principle a single Hebrew base character might be combined with a hataf vowel (ccc 11-13), dagesh (ccc 21) and meteg (ccc 22). In canonical order the dagesh would be reordered between the hataf vowel and the meteg, either before or after ZWJ/ZWNJ, and would interfere with the mechanism. It might be necessary to code <dagesh, CGJ, hataf vowel, ZW(N)J, meteg> or <hataf vowel, ZW(N)J, meteg, CGJ, dagesh>. No such combination actually occurs in the standard text of the Hebrew Bible, but in principle one might be found in other texts.
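The dagesh concern can be illustrated in the same way. The following is a sketch only, again assuming alef and hataf patah, with names chosen for illustration. It shows the dagesh landing either after or before the ZWJ under canonical reordering, depending on where it was typed, and shows that the two CGJ encodings suggested above stay in the order written.

```python
import unicodedata

ALEF = "\u05D0"         # base letter, ccc 0
HATAF_PATAH = "\u05B2"  # hataf vowel, ccc 12
DAGESH = "\u05BC"       # HEBREW POINT DAGESH OR MAPIQ, ccc 21
METEG = "\u05BD"        # ccc 22
ZWJ = "\u200D"          # ccc 0
CGJ = "\u034F"          # ccc 0


def nfd(text):
    """Apply canonical decomposition, which includes canonical reordering."""
    return unicodedata.normalize("NFD", text)


# Dagesh typed last: it is sorted ahead of meteg, landing between ZWJ and meteg.
dagesh_last = ALEF + HATAF_PATAH + ZWJ + METEG + DAGESH
after_zwj = ALEF + HATAF_PATAH + ZWJ + DAGESH + METEG
print("dagesh moves after ZWJ:", nfd(dagesh_last) == after_zwj)

# Dagesh typed first: it is sorted behind the vowel, landing between vowel and ZWJ.
dagesh_first = ALEF + DAGESH + HATAF_PATAH + ZWJ + METEG
before_zwj = ALEF + HATAF_PATAH + DAGESH + ZWJ + METEG
print("dagesh moves before ZWJ:", nfd(dagesh_first) == before_zwj)

# The two CGJ encodings suggested above keep vowel, ZWJ and meteg adjacent.
protected_a = ALEF + DAGESH + CGJ + HATAF_PATAH + ZWJ + METEG
protected_b = ALEF + HATAF_PATAH + ZWJ + METEG + CGJ + DAGESH
print("CGJ forms stable:", nfd(protected_a) == protected_a,
      nfd(protected_b) == protected_b)
```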

At first sight I see no reason to express a preference between option A and option B in the review issue, for Hebrew or any other reason.

Please note the following if you wish to make official feedback to the UTC on this matter.

If you have comments for official UTC consideration, please post them by submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please use the following link to subscribe (if necessary). Please be aware that discussion comments on the Unicode mail list are not automatically recorded as input to the UTC. You must use the reporting link above to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Let me take this opportunity also to remind everyone that the closing date for comment on several other public review issues is approaching, so if you have comments, please try to send them in soon.

Note: If you are a liaison representative, please forward this message as appropriate within your organization.

Regards,
        Rick McGowan
        Unicode, Inc.










--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




