Let's try to be clear on the terms. Look at the definition of combining character sequences:

    D17 Combining character sequence: A character sequence consisting of either a base character followed by a sequence of one or more combining characters, or a sequence of one or more combining characters.
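To make the definition concrete, here is a minimal Python sketch of a D17 check. It rests on assumptions that are not spelled out above: "combining character" is taken to mean General Category M* (Mn, Mc, Me), and "base character" is taken to exclude combining marks, all C* categories (controls, format characters such as ZWJ, and so on), and the line and paragraph separators. The function names are hypothetical, invented for this illustration.

```python
import unicodedata


def is_combining(ch: str) -> bool:
    # Assumption: combining characters are those with General Category M*.
    return unicodedata.category(ch).startswith("M")


def is_base(ch: str) -> bool:
    # Assumption: a base character is neither a combining mark nor a
    # control/format character (C*), nor a line/paragraph separator.
    cat = unicodedata.category(ch)
    return not cat.startswith(("M", "C")) and cat not in ("Zl", "Zp")


def is_combining_character_sequence(s: str) -> bool:
    # D17 as quoted above: an optional base character followed by
    # one or more combining characters, and nothing else.
    if not s:
        return False
    marks = s[1:] if is_base(s[0]) else s
    return len(marks) > 0 and all(is_combining(c) for c in marks)


print(is_combining_character_sequence("e\u0301"))        # base + acute
print(is_combining_character_sequence("e\u200d\u0301"))  # ZWJ in between
```

Under these assumptions the first call returns True and the second returns False, because U+200D ZERO WIDTH JOINER has General Category Cf and so is neither a base nor a combining character. The trailing U+0301 on its own still passes the check, as a sequence with no base character.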
Thus a combining character sequence *cannot* contain a ZWJ or any other Cf (format) character. Any use of a ZWJ before a combining mark produces a *defective* combining character sequence (D17a), which isolates the combining mark from any preceding base character.

And as I said earlier:

> - *Default* grapheme clusters do not include ZWJ; as a matter of fact, default
> grapheme clusters, except for Hangul Jamo Syllables and a few exceptional cases,
> are identical with combining sequences.
> http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries
> - *Tailored* grapheme clusters may include longer sequences, but it is not at
> all obvious whether they would ever contain ZWJ or ZWNJ.

I'll expand on the latter. What constitutes a tailored grapheme cluster is up to a particular process, and so one could contain a ZWJ. However, any combining mark after a ZWJ does *not* apply to a previous base character within that tailored grapheme cluster, so the use of a ZWJ would isolate that combining mark. Such a sequence would not correspond to anything used in a natural language.

Mark
__________________________________
http://www.macchiato.com

----- Original Message -----
From: "Peter Kirk" <[EMAIL PROTECTED]>
To: "Mark Davis" <[EMAIL PROTECTED]>
Cc: "Unicode List" <[EMAIL PROTECTED]>
Sent: Sun, 2003 Nov 09 09:19
Subject: Re: ZWJ, ZWNJ, CGJ and combination

> On 08/11/2003 17:09, Mark Davis wrote:
>
> >I agree with the first part of your analysis. By the phrase "requesting ligation
> >of combining characters" it is unclear to me what you mean, and whether that is
> >the right solution to whatever problem you are referring to.
> >
> >Mark
> >__________________________________
> >http://www.macchiato.com
> >
>
> A further reply to this one:
>
> On the bidi list Paul Nelson pointed out that in Khmer ZWJ and ZWNJ do
> not break combining sequences; or at least they do not break grapheme
> clusters, which is not quite the same thing.
> And the same may be true of
> Indic scripts, although in the examples I found ZWJ/ZWNJ is always at
> the end of a combining sequence. Are ZWJ and ZWNJ actually used within
> combining character sequences (or what would be such sequences if not
> technically broken)? Is there some tension here with the general
> definition of combining character sequences?
>
> If Khmer really does do this, and unless there are any real objections
> to this practice, perhaps the best way ahead, rather than defining a new
> COMBINING CHARACTER JOINER and changing the Khmer encoding, is to adjust
> the definition of combining character sequences to allow ZWJ, ZWNJ and
> perhaps some other suitable layout control characters to be included
> within such sequences. This would allow the Hebrew issue to be solved in
> a way analogous to the Khmer issue.
>
> --
> Peter Kirk
> [EMAIL PROTECTED] (personal)
> [EMAIL PROTECTED] (work)
> http://www.qaya.org/