On 26/11/2004 21:27, Doug Ewell wrote:

...

One useful litmus (or lackmus) test for this Hebrew example would be
whether the text in question is still legible, with its original
meaning, when reduced to plain text representable in today's Unicode.
If the special Ketiv/Qere handling is needed only because It Is The
Word, and This Is How It Was Written, then this is probably a
paleographic distinction and out of scope for plain text.  If it
genuinely changes the spelling, that is another matter.



Well, for a start we need to define what might be meant by "reduced to plain text". In this case there is simply no logical way to describe what is written as plain text plus markup. I suppose some kind of markup like <ketiv>KKKK</ketiv><qere>QqQqQ</qere> could be used (K = Ketiv base character, Q = Qere base character, q = Qere diacritical mark), and this would preserve the original meaning, but it would not show how the individual Ketiv base characters and Qere combining marks are graphically combined, i.e. it would not distinguish the written "blended" forms KqKqKK and KqKKqK, which are graphically distinct. And certainly if the markup were simply stripped from this the resulting form KKKKQqQqQ would not be legible.
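
To make this concrete with the placeholder letters above, here is a minimal Python sketch. The helper names to_markup and strip_markup are hypothetical, invented only for this illustration, and the strings are the K/Q/q placeholders, not real Hebrew text. It shows that the two distinct blended forms collapse to the same markup, and what is left once the markup is stripped.

```python
import re

def strip_markup(text):
    """Remove every <tag> or </tag>, keeping only the character data."""
    return re.sub(r"</?[A-Za-z]+>", "", text)

def to_markup(blended, qere):
    """Tag the Ketiv base letters (each 'K' of the blended form) and the Qere.

    The positions of the 'q' marks within the blended form are discarded,
    which is exactly the information the markup cannot carry.
    """
    ketiv = "".join(ch for ch in blended if ch == "K")
    return f"<ketiv>{ketiv}</ketiv><qere>{qere}</qere>"

QERE = "QqQqQ"
form_a = to_markup("KqKqKK", QERE)
form_b = to_markup("KqKKqK", QERE)

print(form_a == form_b)      # the two blended forms are no longer distinguishable
print(strip_markup(form_a))  # the run of Ketiv letters followed by the whole Qere
```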

But fortunately this whole issue is a storm in a teacup. For Unicode does provide quite adequate ways of representing every known Ketiv and Qere blended form - since we sorted out the Yerushala(y)im issue more than a year ago. The only real problem comes when the Qere is longer than the Ketiv and the blended form looks something like qKqKqKq, so starting with a combining mark. It is well established that such a combining mark with a blank base character may be represented by NBSP followed by the combining mark (and the alternative with SPACE is now apparently deprecated). And it seems that the UTC in rejecting the INVISIBLE LETTER proposal, and in proposing instead certain changes to the properties of NBSP which are currently out for public review, has reaffirmed this usage.

So I raised this issue only to clarify exactly how NBSP should be used in such cases. Although I have been rather confused by the responses I have received, I think the situation is as follows: NBSP may be used as the base for a combining mark at the start of a word, but it should be preceded by ZWSP to ensure a break opportunity before the word (although this should become unnecessary if the proposed revision to UTR #14 is accepted) and also by RLM to ensure correct bidi behaviour.
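
As a minimal sketch of that sequence in Python: the function name leading_mark_word is hypothetical, and the relative order of ZWSP and RLM is my own assumption for illustration, since the description above says only that both precede NBSP. The sample mark and letter (SHEVA and ALEF) are likewise arbitrary.

```python
import unicodedata

ZWSP = "\u200B"  # ZERO WIDTH SPACE: break opportunity before the word
RLM = "\u200F"   # RIGHT-TO-LEFT MARK: keeps the bidi behaviour right-to-left
NBSP = "\u00A0"  # NO-BREAK SPACE: blank base character for the combining mark

def leading_mark_word(mark, rest):
    """Build a word that starts with a combining mark on a blank base.

    Assumed order: ZWSP, RLM, NBSP, mark, then the rest of the word.
    """
    if unicodedata.combining(mark) == 0:
        raise ValueError("expected a combining mark")
    return ZWSP + RLM + NBSP + mark + rest

# Example: HEBREW POINT SHEVA (U+05B0) before HEBREW LETTER ALEF (U+05D0).
word = leading_mark_word("\u05B0", "\u05D0")
print([f"U+{ord(ch):04X}" for ch in word])
```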

Please let me know if any of you disagree with this conclusion.

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




