Re: Public Review Issue #27

2004-02-10 Thread John Hudson
Peter Kirk wrote:

> One question here which is more of principle. Last year there was a long
> discussion of the appropriate method of inhibiting undesirable canonical
> reordering e.g. between meteg and vowels but potentially in other
> scripts. The mechanism agreed on, I think formally by the UTC, was to
> use CGJ. But one reason for using CGJ was that ZWJ and ZWNJ were not
> then available in this position. Now that they are available, would it
> be better to use them rather than CGJ?

The point of using CGJ for this purpose is that CGJ is not intended to affect rendering,
so it may be inserted as a neutral reordering inhibitor. If you insert ZWJ or ZWNJ, you
presumably intend them to affect the rendering.

Of course, if you do want to affect the rendering, insertion of ZWJ or ZWNJ will also have 
the effect of inhibiting reordering, so one should never need to insert both CGJ and ZWJ/ZWNJ.

John Hudson
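
The reordering behaviour described above can be checked with Python's standard unicodedata module. The following is only an illustrative sketch, not part of the original message: the base letter (alef) and vowel (patah) are arbitrary choices, and the result relies on the canonical combining classes in the Unicode data bundled with Python (meteg 22, patah 17, and 0 for CGJ, ZWNJ, and ZWJ).

```python
import unicodedata

# Code points chosen for illustration only.
ALEF, PATAH, METEG = "\u05D0", "\u05B7", "\u05BD"
CGJ, ZWNJ, ZWJ = "\u034F", "\u200C", "\u200D"


def nfd(text):
    """Return the canonical decomposition, which applies canonical reordering."""
    return unicodedata.normalize("NFD", text)


# Without an inhibitor, meteg (ccc 22) is reordered after patah (ccc 17).
plain = ALEF + METEG + PATAH
reordered = nfd(plain) == ALEF + PATAH + METEG

# Any character with combining class 0 placed between the two marks
# blocks the reordering, so the typed order survives normalization.
preserved = {
    unicodedata.name(inhibitor): nfd(ALEF + METEG + inhibitor + PATAH)
    == ALEF + METEG + inhibitor + PATAH
    for inhibitor in (CGJ, ZWNJ, ZWJ)
}

print("plain sequence reordered:", reordered)
for name, kept in preserved.items():
    print(name, "keeps order:", kept)
```

All three characters block reordering equally; the difference John points to is in rendering intent, which normalization does not show.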



Re: Public Review Issue #27

2004-02-10 Thread Peter Kirk
On 09/02/2004 18:25, Asmus Freytag wrote:

> At 04:12 PM 2/9/2004, Kenneth Whistler wrote:
>
> > That leaves item A. And it is mostly a matter of determining
> > what is the best mechanism for getting people to know how
> > they should "spell" the metegs with the minimum of confusion.
> > Putting something in the Unicode Standard might be appropriate,
> > or there might be better venues to document the conventions.
>
> I'm of the opinion that conventions that use characters of
> general category Cf are best documented in the standard. Otherwise,
> a consistent treatment of such characters across implementations
> depends too much on context (e.g. use of a particular font).
> "Fine typography" issues for other characters, incl. combining
> are a different matter. These will legitimately differ even
> among uses of the same character (viz. math and text handling
> of accents, for example).
> A./

Thank you, Ken and Asmus. I think I will take this back to the Unicode 
Hebrew list and see if we can all agree on a convention for using ZWJ, 
ZWNJ etc with meteg, now that the way is open for their use. And we may 
look again at some other issues which are more than just spelling 
conventions. Then we can formally propose something to the UTC, and the
UTC can decide whether this is appropriate for inclusion in the standard.

One question here which is more of principle. Last year there was a long 
discussion of the appropriate method of inhibiting undesirable canonical 
reordering e.g. between meteg and vowels but potentially in other 
scripts. The mechanism agreed on, I think formally by the UTC, was to 
use CGJ. But one reason for using CGJ was that ZWJ and ZWNJ were not 
then available in this position. Now that they are available, would it 
be better to use them rather than CGJ?

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/



Re: Public Review Issue #27

2004-02-09 Thread Asmus Freytag
At 04:12 PM 2/9/2004, Kenneth Whistler wrote:

> That leaves item A. And it is mostly a matter of determining
> what is the best mechanism for getting people to know how
> they should "spell" the metegs with the minimum of confusion.
> Putting something in the Unicode Standard might be appropriate,
> or there might be better venues to document the conventions.

I'm of the opinion that conventions that use characters of
general category Cf are best documented in the standard. Otherwise,
a consistent treatment of such characters across implementations
depends too much on context (e.g. use of a particular font).
"Fine typography" issues for other characters, incl. combining
are a different matter. These will legitimately differ even
among uses of the same character (viz. math and text handling
of accents, for example).
A./





Re: Public Review Issue #27

2004-02-09 Thread Kenneth Whistler
Peter C opined:

> >Well, perhaps there is a step that's needed to propose representations
> >for the alternate positions of meteg, one of these making use of ZWJ or
> >ZWNJ (whichever) and to get UTC to approve that so that it's formally a
> >part of the standard and, hence, an interoperable representation.

and Peter K queried:

> I was always a bit confused about this aspect. I understand that there 
> is a bit more to using ZWJ/ZWNJ in this way than a private decision; but 
> it is one which has already been proposed and implemented by several 
> font providers. But I was told a few months ago in effect that Unicode 
> doesn't specify such things because they are the spelling conventions 
> for individual languages. Where is the boundary between an 
> "interoperable representation" to be agreed by the UTC and a spelling 
> convention to be left to individuals?

It's a grey area in this case.

The Unicode Standard should not step into the area of standardization
of spelling conventions and orthography.

However, in the case of format controls like the ZWJ and ZWNJ
there *are* no natural spelling conventions outside the context of the
Unicode Standard, and the use of such controls has to be specified
at least to some level of detail in the standard for their usage
to be interoperable.

But this should be neither a consideration of "turf" nor
something to be decided on a priori or logical merits.

The desired effects are:

A. That all users of Biblical Hebrew know, understand, and use the
same spelling conventions for representation of meteg in
Unicode text data.

B. That such conventions not violate, knowingly or unknowingly,
some architectural constraints of the Unicode Standard that
could cause problems and surprises in implementations.

C. That designers of fonts and/or rendering systems buy into
the conventions in such a way as to properly display the data
using such spelling conventions in widely available fonts
and systems.

What the UTC just did was to take care of item B, to formally
eliminate the ambiguity in the standard that was causing
doubt about whether ZWJ/ZWNJ could occur in the middle of
a combining character sequence. Implementers are now on notice
that such a sequence is expected and interpretable, and should
implement accordingly. There may be some shakedown difficulties
as software adjusts, but at least we know where we're headed
here.

It sounds like the Biblical Hebrew font providers are already
bought in on item C, but it wouldn't be appropriate for
the Unicode Standard to try to enforce particular approaches
on them in any case.

That leaves item A. And it is mostly a matter of determining
what is the best mechanism for getting people to know how
they should "spell" the metegs with the minimum of confusion.
Putting something in the Unicode Standard might be appropriate,
or there might be better venues to document the conventions.

--Ken
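
As a rough illustration of what the item B decision means for implementers, the Python sketch below tests whether a string is a single base followed only by combining marks, with ZWJ/ZWNJ optionally accepted inside the sequence. It is a deliberate simplification and not the formal definition in the standard: the function names and the allow_joiners flag are hypothetical, any non-mark first code point is treated as a base, and the sample text (alef, meteg, ZWJ, patah) is arbitrary rather than an agreed spelling.

```python
import unicodedata

ZWNJ, ZWJ = "\u200C", "\u200D"


def extends_sequence(ch, allow_joiners=True):
    """True if ch may continue a combining character sequence in this sketch."""
    # General categories Mn, Mc, and Me are the combining marks.
    if unicodedata.category(ch).startswith("M"):
        return True
    # Under the UTC decision, ZWJ and ZWNJ may also occur in the sequence.
    return allow_joiners and ch in (ZWJ, ZWNJ)


def is_single_sequence(text, allow_joiners=True):
    """True if text is one base code point followed only by extenders."""
    if not text or extends_sequence(text[0], allow_joiners):
        return False
    return all(extends_sequence(ch, allow_joiners) for ch in text[1:])


# Hypothetical sample: alef + meteg + ZWJ + patah.
sample = "\u05D0\u05BD\u200D\u05B7"

print("joiners allowed:", is_single_sequence(sample, allow_joiners=True))
print("joiners not allowed:", is_single_sequence(sample, allow_joiners=False))
```

Software written to the stricter reading would split the sample at the ZWJ, which is the kind of shakedown difficulty mentioned above.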





Re: Public Review Issue #27

2004-02-09 Thread Peter Kirk
On 09/02/2004 14:37, Peter Constable wrote:

> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Kirk
>
> > > The UTC decided to allow ZWJ/ZWNJ to occur in combining character
> > > sequences.
>
> > Thank you. So I can do what I was wanting to do (Hebrew meteg combining
> > sequences) with a clear conscience!
>
> Well, perhaps there is a step that's needed to propose representations
> for the alternate positions of meteg, one of these making use of ZWJ or
> ZWNJ (whichever) and to get UTC to approve that so that it's formally a
> part of the standard and, hence, an interoperable representation.

I was always a bit confused about this aspect. I understand that there 
is a bit more to using ZWJ/ZWNJ in this way than a private decision; but 
it is one which has already been proposed and implemented by several 
font providers. But I was told a few months ago in effect that Unicode 
doesn't specify such things because they are the spelling conventions 
for individual languages. Where is the boundary between an 
"interoperable representation" to be agreed by the UTC and a spelling 
convention to be left to individuals?

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/



RE: Public Review Issue #27

2004-02-09 Thread Peter Constable
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Kirk

> > The UTC decided to allow ZWJ/ZWNJ to occur in combining character
> > sequences.

> Thank you. So I can do what I was wanting to do (Hebrew meteg combining
> sequences) with a clear conscience!

Well, perhaps there is a step that's needed to propose representations
for the alternate positions of meteg, one of these making use of ZWJ or
ZWNJ (whichever) and to get UTC to approve that so that it's formally a
part of the standard and, hence, an interoperable representation.



Peter
 
Peter Constable
Globalization Infrastructure and Font Technologies
Microsoft Windows Division



Re: Public Review Issue #27

2004-02-09 Thread Peter Kirk
On 09/02/2004 14:25, Kenneth Whistler wrote:

> > Was any decision made at the UTC meeting concerning Public Review Issue
> > #27? I ask because I am waiting to encode a text which needs to use ZWJ
> > and ZWNJ within combining character sequences (and for which there is
> > already publicly available font support!), but I don't want to do
> > anything which is not acceptable Unicode.
>
> Yes, that issue was decided.
>
> The UTC decided to allow ZWJ/ZWNJ to occur in combining character
> sequences.
>
> The UTC decided *not* to change the General Category of
> ZWJ/ZWNJ, so their General Category will stay Cf, and not
> shift to Mn.
>
> The upshot of that is that the formal definition of
> "combining character sequence" will need to be modified slightly
> to account for the possible occurrence of ZWJ/ZWNJ in one.
> Several other modifications will be made to data files to
> make things consistent for Unicode 4.0.1.
>
> The Public Review Issues page will be updated soon with all the
> resolutions of these issues from the UTC meeting last week.
>
> --Ken

Thank you. So I can do what I was wanting to do (Hebrew meteg combining 
sequences) with a clear conscience!

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/



Re: Public Review Issue #27

2004-02-09 Thread Kenneth Whistler

> Was any decision made at the UTC meeting concerning Public Review Issue 
> #27? I ask because I am waiting to encode a text which needs to use ZWJ 
> and ZWNJ within combining character sequences (and for which there is 
> already publicly available font support!), but I don't want to do 
> anything which is not acceptable Unicode.

Yes, that issue was decided.

The UTC decided to allow ZWJ/ZWNJ to occur in combining character
sequences.

The UTC decided *not* to change the General Category of
ZWJ/ZWNJ, so their General Category will stay Cf, and not
shift to Mn.

The upshot of that is that the formal definition of
"combining character sequence" will need to be modified slightly
to account for the possible occurrence of ZWJ/ZWNJ in one.
Several other modifications will be made to data files to
make things consistent for Unicode 4.0.1.

The Public Review Issues page will be updated soon with all the
resolutions of these issues from the UTC meeting last week.

--Ken
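
The properties mentioned above can be inspected with Python's standard unicodedata module. This is only a sketch for readers who want to verify the values: the helper name describe is hypothetical, and the output reflects whatever version of the Unicode Character Database ships with the Python in use, which is assumed to postdate this decision.

```python
import unicodedata


def describe(ch):
    """Return (name, General Category, canonical combining class) for ch."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.combining(ch),
    )


# ZWNJ and ZWJ remain format characters (Cf); CGJ is included for
# comparison because it is a nonspacing mark (Mn).
for ch in ("\u200C", "\u200D", "\u034F"):
    name, category, ccc = describe(ch)
    print(f"U+{ord(ch):04X} {name}: gc={category}, ccc={ccc}")
```

Because ZWJ and ZWNJ are Cf rather than Mn, a definition of "combining character sequence" built only on the mark categories would not cover them, which is why the definition itself has to name them.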




Public Review Issue #27

2004-02-09 Thread Peter Kirk
Was any decision made at the UTC meeting concerning Public Review Issue 
#27? I ask because I am waiting to encode a text which needs to use ZWJ 
and ZWNJ within combining character sequences (and for which there is 
already publicly available font support!), but I don't want to do 
anything which is not acceptable Unicode.

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/