> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Peter Constable
> Sent: Tuesday, November 30, 2004 1:20 AM
> To: Unicode Mailing List
> Subject: RE: No Invisible Character - NBSP at the start of a word
>
>
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
> > Of Jony Rosenne
> ...
>
> Jony, where you and I have had a different worldview is that, it seems
> to me, you view characters as encoding language, and I view characters
> as encoding letterforms; or, put another way, for you, text is
> necessarily linguistic, whereas for me text is text, independent of
> linguistic interpretation. To make this concrete, the fact that a qere
> sequence involves the vowel points of word A rather than word B is
> linguistically interesting, but irrelevant as far as encoding is
> concerned. If the displayed letterforms consist of a lamed with two
> vowel points, then the encoded character sequence IMO should be lamed
> with two vowel points -- and I would not consider that a hack.

When I look at the text, even with a magnifying glass, I do not see a
Lamed with two points. The displayed form, from my point of view, is a
Lamed with a single point and another point without a base character.
The Hiriq is not under the Lamed, it is between the Lamed and the Mem.
The linguistic approach is just the explanation; the displayed
letterforms are quite clear.

Even when I look at old Latin manuscripts, which I did once again when I
visited the flea market in Milan a few months ago, they are not plain
text and they cannot be faithfully reproduced in Unicode without markup.
Although the nature of Hebrew manuscripts is different, I do not
understand the desire to make Hebrew different, and I cannot accept it
if it makes the computerized handling of Hebrew unnecessarily more
complicated than it already is.

To make it very clear: the use of CGJ approved by the UTC is fine by me,
and I have no objection to anyone using it, but it is not required for
Hebrew, and we do not have a standard plain text solution for Qere and
Ketiv and for Yerushala(y)im. Regarding the latter, the UTC discussion
was based on a mistaken or incomplete presentation of the problem.
Yes, for those who need two vowels on a single letter, CGJ would do it,
but since this is not my question, CGJ is not the answer. The hack
needed here is an invisible base character.

If anyone wants to use CGJ, or any other Unicode characters that are not
included in the standard Hebrew subset, to encode Hebrew texts, they
should do their users a favor and explain to them that they require
specific implementations, operating systems and fonts. (Unicode does not
define subsets, but other bodies do, and implementers necessarily have
to.)

Jony

...
>
>
> Peter Constable
>
>
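
For readers following the thread, the two encodings being argued over can be written out as code point sequences. The following is only a sketch in Python using the standard `unicodedata` module. The choice of patah as the Lamed's own vowel, and of NO-BREAK SPACE as the "invisible base character", are assumptions made for illustration (NBSP is taken from the thread's subject line); the message itself says there is no standard plain text solution, so none is implied here.

```python
import unicodedata

# Code points used for illustration. PATAH as the lamed's first vowel is an
# assumption; the message does not name it.
LAMED = "\u05DC"      # HEBREW LETTER LAMED
PATAH = "\u05B7"      # HEBREW POINT PATAH, canonical combining class 17
HIRIQ = "\u05B4"      # HEBREW POINT HIRIQ, canonical combining class 14
FINAL_MEM = "\u05DD"  # HEBREW LETTER FINAL MEM
CGJ = "\u034F"        # COMBINING GRAPHEME JOINER, combining class 0
NBSP = "\u00A0"       # NO-BREAK SPACE, hypothetical stand-in for an invisible base

# Both vowel points on the lamed, nothing between them.
two_marks = LAMED + PATAH + HIRIQ + FINAL_MEM
# Both vowel points on the lamed, separated by CGJ.
with_cgj = LAMED + PATAH + CGJ + HIRIQ + FINAL_MEM
# The hiriq carried by its own (invisible) base between lamed and mem.
with_base = LAMED + PATAH + NBSP + HIRIQ + FINAL_MEM


def describe(label, text):
    """Print each code point and whether NFC normalization leaves the text unchanged."""
    nfc = unicodedata.normalize("NFC", text)
    print(label)
    for ch in text:
        print(f"  U+{ord(ch):04X} ccc={unicodedata.combining(ch):>2} {unicodedata.name(ch)}")
    print("  unchanged by NFC:", nfc == text)


describe("two marks on lamed", two_marks)
describe("two marks with CGJ", with_cgj)
describe("hiriq on invisible base", with_base)
```

Without CGJ, canonical reordering moves the hiriq ahead of the patah, so the typed order does not survive normalization; the other two sequences pass through NFC unchanged. Note that NFKC would replace NBSP with an ordinary space, and that the code says nothing about rendering: whether either sequence displays the hiriq between the Lamed and the Mem depends on the font and rendering engine.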

