On 25/07/2003 23:24, Jony Rosenne wrote:
> This explanation makes me unhappy with CGJ.
> Ken says: "The important things are that it is a) invisible, b) a combining
> mark, and c) has combining class zero".
> And: "There is no need for an invisible base character here".
> On the contrary, to represent the text we do need an invisible base
> character for the Hiriq, representing the unwritten Yod.
> Another possibility is to encode the Yod with a complex text (meaning
> non-plain-text) control saying that the Yod is invisible.
> I think it is important, whatever solution is chosen, to represent the real
> situation, rather than just a sequence of codes that happens to be able to
> produce the desired visual output.
> Jony
If we are talking about introducing markup, surely the correct approach
would be to encode the word twice with separate markup, once for the
Qere form ending lamed patah yod hiriq mem and once for the Ketiv form
(probably consonants only) ending lamed mem. But I can also see the need
for a third form, likewise distinguishable by markup, which is the form
actually seen in print, as at least some users are more concerned with
reproducing this accurately than with the underlying linguistics. But if
this principle is applied generally it does lead to further
complications, for example the need to encode "Qere without Ketiv" i.e.
words which, on the page, consist of points only with no base characters.
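
For the printed form (lamed patah hiriq mem, with no yod), the three
properties of CGJ that Ken lists in the quoted text can be checked
directly. What follows is only a minimal sketch using Python's standard
unicodedata module. The variable names are illustrative, and the use of
final mem (U+05DD) and of NFC as the normalization form are assumptions
made for the example, not something specified in this thread.

```python
import unicodedata

# Code points used in this sketch (final mem is an assumption for the example).
LAMED = "\u05DC"   # HEBREW LETTER LAMED
MEM = "\u05DD"     # HEBREW LETTER FINAL MEM
PATAH = "\u05B7"   # HEBREW POINT PATAH, canonical combining class 17
HIRIQ = "\u05B4"   # HEBREW POINT HIRIQ, canonical combining class 14
CGJ = "\u034F"     # COMBINING GRAPHEME JOINER

# Properties (b) and (c): a combining mark with combining class zero.
cgj_category = unicodedata.category(CGJ)
cgj_class = unicodedata.combining(CGJ)

# The printed ending with and without CGJ between the two points.
plain = LAMED + PATAH + HIRIQ + MEM
with_cgj = LAMED + PATAH + CGJ + HIRIQ + MEM

# Normalization sorts adjacent marks by combining class, so without CGJ
# the hiriq (14) moves in front of the patah (17).
plain_nfc = unicodedata.normalize("NFC", plain)

# CGJ has class zero, so it separates the two points and no reordering occurs.
with_cgj_nfc = unicodedata.normalize("NFC", with_cgj)

print(cgj_category, cgj_class)
print(plain_nfc == plain)
print(with_cgj_nfc == with_cgj)
```

This shows only that the sequence with CGJ keeps the printed order of the
points under normalization. It does not settle Jony's objection that such
a sequence fails to represent the unwritten yod.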
--
Peter Kirk
http://web.onetel.net.uk/~peterkirk/