Re: Public Review Issue #27
Peter Kirk wrote:

> One question here which is more of principle. Last year there was a long discussion of the appropriate method of inhibiting undesirable canonical reordering e.g. between meteg and vowels but potentially in other scripts. The mechanism agreed on, I think formally by the UTC, was to use CGJ. But one reason for using CGJ was that ZWJ and ZWNJ were not then available in this position. Now that they are available, would it be better to use them rather than CGJ?

The point of using CGJ for this purpose is that CGJ is not intended to affect rendering, so may be inserted as a neutral re-ordering inhibitor. If you insert ZWJ or ZWNJ, one presumably intends them to affect the rendering.

Of course, if you do want to affect the rendering, insertion of ZWJ or ZWNJ will also have the effect of inhibiting reordering, so one should never need to insert both CGJ and ZWJ/ZWNJ.

John Hudson
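The reordering behaviour discussed above can be checked with a minimal sketch using Python's standard `unicodedata` module. The choice of alef, patah, and meteg as the test sequence and all variable names are assumptions made for illustration; they do not come from the thread. The module reflects whatever Unicode version ships with the interpreter, not necessarily the version under discussion here.

```python
import unicodedata

ALEF = "\u05D0"   # HEBREW LETTER ALEF (base letter, chosen for illustration)
PATAH = "\u05B7"  # HEBREW POINT PATAH, canonical combining class 17
METEG = "\u05BD"  # HEBREW POINT METEG, canonical combining class 22
CGJ = "\u034F"    # COMBINING GRAPHEME JOINER, combining class 0
ZWJ = "\u200D"    # ZERO WIDTH JOINER, combining class 0
ZWNJ = "\u200C"   # ZERO WIDTH NON-JOINER, combining class 0


def nfd(text: str) -> str:
    """Return the canonical decomposition (NFD) of text."""
    return unicodedata.normalize("NFD", text)


# Meteg typed before the vowel: canonical ordering sorts the marks by
# combining class, so the vowel (17) moves ahead of the meteg (22).
plain = ALEF + METEG + PATAH

# A character with combining class 0 between the two marks splits them
# into separate runs, so they are no longer sorted against each other.
with_cgj = ALEF + METEG + CGJ + PATAH
with_zwj = ALEF + METEG + ZWJ + PATAH
with_zwnj = ALEF + METEG + ZWNJ + PATAH

for label, text in [("plain", plain), ("CGJ", with_cgj),
                    ("ZWJ", with_zwj), ("ZWNJ", with_zwnj)]:
    changed = nfd(text) != text
    print(f"{label}: reordered by normalization = {changed}")
```

All three characters block the reordering equally; as the message above notes, the difference is that only CGJ is meant to leave rendering untouched.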
Re: Public Review Issue #27
On 09/02/2004 18:25, Asmus Freytag wrote:

> At 04:12 PM 2/9/2004, Kenneth Whistler wrote:
>
> > That leaves item A. And it is mostly a matter of determining what is the best mechanism for getting people to know how they should "spell" the metegs with the minimum of confusion. Putting something in the Unicode Standard might be appropriate, or there might be better venues to document the conventions.
>
> I'm of the opinion that conventions that use characters of general category Cf are best documented in the standard. Otherwise, a consistent treatment of such characters across implementations depends too much on context (e.g. use of a particular font).
>
> "Fine typography" issues for other characters, incl. combining are a different matter. These will legitimately differ even among uses of the same character (viz. math and text handling of accents, for example).
>
> A./

Thank you, Ken and Asmus. I think I will take this back to the Unicode Hebrew list and see if we can all agree on a convention for using ZWJ, ZWNJ etc with meteg, now that the way is open for their use. And we may look again at some other issues which are more than just spelling conventions. Then we can formally propose something to the UTC, and the UTC can decide whether this is appropriate for inclusion in the standard.

One question here which is more of principle. Last year there was a long discussion of the appropriate method of inhibiting undesirable canonical reordering e.g. between meteg and vowels but potentially in other scripts. The mechanism agreed on, I think formally by the UTC, was to use CGJ. But one reason for using CGJ was that ZWJ and ZWNJ were not then available in this position. Now that they are available, would it be better to use them rather than CGJ?

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/
Re: Public Review Issue #27
At 04:12 PM 2/9/2004, Kenneth Whistler wrote:

> That leaves item A. And it is mostly a matter of determining what is the best mechanism for getting people to know how they should "spell" the metegs with the minimum of confusion. Putting something in the Unicode Standard might be appropriate, or there might be better venues to document the conventions.

I'm of the opinion that conventions that use characters of general category Cf are best documented in the standard. Otherwise, a consistent treatment of such characters across implementations depends too much on context (e.g. use of a particular font).

"Fine typography" issues for other characters, incl. combining are a different matter. These will legitimately differ even among uses of the same character (viz. math and text handling of accents, for example).

A./
Re: Public Review Issue #27
Peter C opined:

> >Well, perhaps there is a step that's needed to propose representations
> >for the alternate positions of meteg, one of these making use of ZWJ or
> >ZWNJ (whichever) and to get UTC to approve that so that it's formally a
> >part of the standard and, hence, an interoperable representation.

and Peter K queried:

> I was always a bit confused about this aspect. I understand that there
> is a bit more to using ZWJ/ZWNJ in this way than a private decision; but
> it is one which has already been proposed and implemented by several
> font providers. But I was told a few months ago in effect that Unicode
> doesn't specify such things because they are the spelling conventions
> for individual languages. Where is the boundary between an
> "interoperable representation" to be agreed by the UTC and a spelling
> convention to be left to individuals?

It's a grey area in this case.

The Unicode Standard should not step into the area of standardization of spelling conventions and orthography. However, in the case of format controls like the ZWJ and ZWNJ there *are* no natural spelling conventions outside the context of the Unicode Standard, and the use of such controls has to be specified at least to some level of detail in the standard for their usage to be interoperable.

But this shouldn't be either a consideration of "turf" or something to be decided on a priori or logical merits. The desired effects are:

A. That all users of Biblical Hebrew know, understand, and use the same spelling conventions for representation of meteg in Unicode text data.

B. That such conventions not violate, knowingly or unknowingly, some architectural constraints of the Unicode Standard that could cause problems and surprises in implementations.

C. That designers of fonts and/or rendering systems buy into the conventions in such a way as to properly display the data using such spelling conventions in widely available fonts and systems.

What the UTC just did was to take care of item B, to formally eliminate the ambiguity in the standard that was causing doubt about whether ZWJ/ZWNJ could occur in the middle of a combining character sequence. Implementers are now on notice that such a sequence is expected and interpretable, and should implement accordingly. There may be some shakedown difficulties as software adjusts, but at least we know where we're headed here.

It sounds like the Biblical Hebrew font providers are already bought in on item C, but it wouldn't be appropriate for the Unicode Standard to try to enforce particular approaches on them in any case.

That leaves item A. And it is mostly a matter of determining what is the best mechanism for getting people to know how they should "spell" the metegs with the minimum of confusion. Putting something in the Unicode Standard might be appropriate, or there might be better venues to document the conventions.

--Ken
Re: Public Review Issue #27
On 09/02/2004 14:37, Peter Constable wrote:

> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Kirk
> >
> > >The UTC decided to allow ZWJ/ZWNJ to occur in combining character sequences.
> >
> > Thank you. So I can do what I was wanting to do (Hebrew meteg combining sequences) with a clear conscience!
>
> Well, perhaps there is a step that's needed to propose representations for the alternate positions of meteg, one of these making use of ZWJ or ZWNJ (whichever) and to get UTC to approve that so that it's formally a part of the standard and, hence, an interoperable representation.

I was always a bit confused about this aspect. I understand that there is a bit more to using ZWJ/ZWNJ in this way than a private decision; but it is one which has already been proposed and implemented by several font providers. But I was told a few months ago in effect that Unicode doesn't specify such things because they are the spelling conventions for individual languages. Where is the boundary between an "interoperable representation" to be agreed by the UTC and a spelling convention to be left to individuals?

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/
RE: Public Review Issue #27
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
> Of Peter Kirk

> >The UTC decided to allow ZWJ/ZWNJ to occur in combining character
> >sequences.

> Thank you. So I can do what I was wanting to do (Hebrew meteg combining
> sequences) with a clear conscience!

Well, perhaps there is a step that's needed to propose representations for the alternate positions of meteg, one of these making use of ZWJ or ZWNJ (whichever) and to get UTC to approve that so that it's formally a part of the standard and, hence, an interoperable representation.

Peter

Peter Constable
Globalization Infrastructure and Font Technologies
Microsoft Windows Division
Re: Public Review Issue #27
On 09/02/2004 14:25, Kenneth Whistler wrote:

> > Was any decision made at the UTC meeting concerning Public Review Issue #27? I ask because I am waiting to encode a text which needs to use ZWJ and ZWNJ within combining character sequences (and for which there is already publicly available font support!), but I don't want to do anything which is not acceptable Unicode.
>
> Yes, that issue was decided.
>
> The UTC decided to allow ZWJ/ZWNJ to occur in combining character sequences.
>
> The UTC decided *not* to change the General Category of ZWJ/ZWNJ, so their General Category will stay Cf, and not shift to Mn.
>
> The upshot of that is that the formal definition of "combining character sequence" will need to be modified slightly to account for the possible occurrence of ZWJ/ZWNJ in one. Several other modifications will be made to data files to make things consistent for Unicode 4.0.1.
>
> The Public Review Issues page will be updated soon with all the resolutions of these issues from the UTC meeting last week.
>
> --Ken

Thank you. So I can do what I was wanting to do (Hebrew meteg combining sequences) with a clear conscience!

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/
Re: Public Review Issue #27
> Was any decision made at the UTC meeting concerning Public Review Issue
> #27? I ask because I am waiting to encode a text which needs to use ZWJ
> and ZWNJ within combining character sequences (and for which there is
> already publicly available font support!), but I don't want to do
> anything which is not acceptable Unicode.

Yes, that issue was decided.

The UTC decided to allow ZWJ/ZWNJ to occur in combining character sequences.

The UTC decided *not* to change the General Category of ZWJ/ZWNJ, so their General Category will stay Cf, and not shift to Mn.

The upshot of that is that the formal definition of "combining character sequence" will need to be modified slightly to account for the possible occurrence of ZWJ/ZWNJ in one. Several other modifications will be made to data files to make things consistent for Unicode 4.0.1.

The Public Review Issues page will be updated soon with all the resolutions of these issues from the UTC meeting last week.

--Ken
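The General Category values mentioned in this message can be inspected with a short sketch using Python's standard `unicodedata` module. The dictionary and variable names are illustrative assumptions, and the values reported are those of the Unicode version bundled with the interpreter rather than Unicode 4.0.1 specifically.

```python
import unicodedata

# The two format controls discussed in the message, plus CGJ for comparison.
CONTROLS = {
    "ZWNJ": "\u200C",  # ZERO WIDTH NON-JOINER
    "ZWJ": "\u200D",   # ZERO WIDTH JOINER
    "CGJ": "\u034F",   # COMBINING GRAPHEME JOINER
}

# Map each name to (General Category, canonical combining class).
properties = {
    name: (unicodedata.category(ch), unicodedata.combining(ch))
    for name, ch in CONTROLS.items()
}

for name, (category, ccc) in properties.items():
    print(f"{name}: General Category {category}, combining class {ccc}")
```

ZWJ and ZWNJ report Cf, consistent with the decision described above, while CGJ is a nonspacing mark (Mn). Code that detects combining character sequences by testing only for mark categories would therefore stop at ZWJ/ZWNJ unless it handles them explicitly.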
Public Review Issue #27
Was any decision made at the UTC meeting concerning Public Review Issue #27? I ask because I am waiting to encode a text which needs to use ZWJ and ZWNJ within combining character sequences (and for which there is already publicly available font support!), but I don't want to do anything which is not acceptable Unicode.

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/