Jony took the words right out of my mouth:

> How about RLM?
> 
> Jony

This already belongs, naturally, in the context of the Hebrew
text handling, which is going to have to handle bidi controls.

Another possibility to consider is U+2060 WORD JOINER, the
version of the zero width non-breaking space unfreighted with
the BOM confusion of U+FEFF.

WJ is also (gc=Cf, cc=0), so would block canonical reordering
of a sequence it was inserted into. Unlike ZWJ, it should have no 
potentially conflicting semantics regarding ligation or anything
else for display. It is *defined* only as specifying no break
opportunity at its position:

  "...inserting a word joiner between two characters has no
  effect on their ligating and cursive joining behavior. The
  word joiner should be ignored in contexts other than word
  or line breaking."
  
Well, as before, we already know that <lamed, patah, hiriq>
                                                    ^
is not a word or line break opportunity, so inserting a WJ
there should have no effect. And by definition, it should also
have no effect on any glyph ligation (or any other aspect of
the display). But it *would* break up the sequence that
gets canonically reordered for normalization, thus enabling
a textual distinction to be preserved.
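
A minimal sketch of the point above, assuming Python's standard
unicodedata module (the variable and helper names are only for
illustration, and the result depends on the Unicode data shipped with
the interpreter). It compares how the sequence normalizes with and
without a WJ between the two points:

```python
import unicodedata

LAMED = "\u05DC"  # HEBREW LETTER LAMED
PATAH = "\u05B7"  # HEBREW POINT PATAH
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ
WJ = "\u2060"     # WORD JOINER


def codepoints(text):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


plain = LAMED + PATAH + HIRIQ         # <lamed, patah, hiriq>
joined = LAMED + PATAH + WJ + HIRIQ   # <lamed, patah, WJ, hiriq>

# Normalization sorts adjacent combining marks by canonical combining
# class; a character with class 0 stops that reordering.
plain_nfc = unicodedata.normalize("NFC", plain)
joined_nfc = unicodedata.normalize("NFC", joined)

print("WJ:", unicodedata.category(WJ), unicodedata.combining(WJ))
print("patah cc:", unicodedata.combining(PATAH))
print("hiriq cc:", unicodedata.combining(HIRIQ))
print("without WJ:", codepoints(plain), "->", codepoints(plain_nfc))
print("with WJ:   ", codepoints(joined), "->", codepoints(joined_nfc))
```

If the combining classes are as I expect, the first sequence comes
back with the two points swapped and the second comes back unchanged.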

One might even want to suggest that if RichEdit or some other
text control causes a display problem when WJ is inserted between
two Hebrew points, that should be considered a bug in the
implementation of the WORD JOINER for that text control.

Of course, I'm not privy to the internals of such implementations
and don't understand the font lookup issues in the kind of
detail that John clearly does, but if WORD JOINER cannot
be implemented as the standard says it should be, then we've
got a more serious problem on our hands than just the
Biblical Hebrew vocalization issue.

--Ken

> > 
> > At 04:26 AM 6/26/2003, Jony Rosenne wrote:
> > 
> > >I don't think we need any new characters, ZERO WIDTH SPACE would do and
> > >it requires no new semantics.
> > 
> > ZERO WIDTH SPACE would screw up search and sort algorithms, I think,
> > because it is not a control character per se and may not be ignored as
> > desired.
> > 
> > I've made some tests using Ken's ZWJ suggestion and, as feared, it messes
> > with the glyph positioning lookups. The results varied slightly between MS
> > RichText clients and InDesign ME, but both displayed marks incorrectly when
> > ZWJ was inserted. I strongly suspect that this is not something that can
> > easily be resolved in the glyph shaping model.
> > 
> > John Hudson

