On 29/07/2003 11:20, Ted Hopp wrote:

Okay -- there are two Hebrew vowels that are not encoded in Unicode. Their
(transliterated) Hebrew names are (caps indicate syllable accent): khoLAM
maLE and shuRUQ. The kholam male LOOKS like a "vav with holam" [05D5.05B9]
or the alphabetic presentation form FB4B (HEBREW LETTER VAV WITH HOLAM) and
the shuruq LOOKS like a vav with dagesh [05D5.05BC] or the alphabetic
presentation form FB35 (HEBREW LETTER VAV WITH DAGESH). (For the record, the
Unicode HEBREW POINT HOLAM [05B9] is usually called khoLAM khaSER in
Hebrew.)

The two vowels kholam male and shuruq have nothing to do with the consonant
vav (HEBREW LETTER VAV) other than that they are written with the same
glyph. In unpointed Hebrew text, the vav glyph is used to represent these
vowels but, outside of ketiv male, the use is often optional (although
sometimes strictly determined by tradition). (For instance, the name Aharon
appears in Hebrew bible scrolls sometimes with a vav glyph after the resh
and sometimes without. It would be nice if I could search for all
occurrences of the name by doing a "match consonants only" search instead of
having to resort to regular expressions.) In some texts (e.g., many of the
books published by ArtScroll), the kholam male and vav with kholam are
rendered differently--the former with the dot centered above the vav and the
latter with the dot somewhat more to the left. I have not seen a text that
renders a shuruq differently than a vav with dagesh. (However, a dagesh has
nothing to do with a shuruq, despite the nice little note in the Unicode
code chart. A consonantal vav with a dagesh is NOT a shuruq.)

Thanks for this useful information.
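
As a rough illustration of the search problem described above, here is a minimal Python sketch. It strips the points and then allows an optional vav between consonants. The helper names strip_points and loose_pattern are made up for this example, and treating every nonspacing mark (category Mn) as a point is an assumption. Since Unicode does not distinguish a vowel vav from a consonantal one, this pattern cannot tell them apart either, so it is still a regular-expression workaround and not a true "match consonants only" search.

```python
import re
import unicodedata

VAV = "\u05d5"  # HEBREW LETTER VAV

def strip_points(text):
    """Drop nonspacing marks (points, accents) after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

def loose_pattern(skeleton):
    """Regex for a consonant skeleton, allowing an optional vav between letters."""
    return re.compile((VAV + "?").join(re.escape(ch) for ch in skeleton))

# Aharon as a consonant skeleton: alef, he, resh, final nun
AHARON = loose_pattern("\u05d0\u05d4\u05e8\u05df")

without_vav = "\u05d0\u05b7\u05d4\u05b2\u05e8\u05b9\u05df"     # holam on the resh
with_vav = "\u05d0\u05b7\u05d4\u05b2\u05e8\u05d5\u05b9\u05df"  # vav carries the holam

for word in (without_vav, with_vav):
    print(bool(AHARON.fullmatch(strip_points(word))))
```

Both spellings match, but so would a word with a genuinely consonantal vav in the same position.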


Furthermore, context cannot be used to distinguish vav with kholam vs.
kholam male. As I posted once before, at least one major dictionary uses a
single consonant with both a patah and a kholam male (NOT a consonantal vav
with kholam) to transliterate foreign words. Hebrew characters are used for
much more than spelling Hebrew words.


Good point. The algorithm I suggested works only for orthographically regular Hebrew.

These different uses for the same (or approximately same) glyphs cannot, as
far as I know, be distinguished in Unicode. (Putting a HEBREW POINT HOLAM in
front of a HEBREW LETTER VAV would just associate the kholam with the
preceding letter.) It might be nice if there were different code points for
them. Alphabetic presentation forms don't quite do the trick. When I first
saw it, I had assumed that FB4B was supposed to be used for kholam male (and
that's what we use it for in our code). Of course, I could have assumed that
it was intended for (consonantal) vav with kholam. However, that sequence
automatically renders with the dot more to the left, so (for us) a
presentation form was unnecessary in that case. Will all font designers who
include Hebrew alphabetic presentation forms conform to my assumptions? Can
anyone authoritatively say what was intended? I don't think so. This is a
problem.
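
The point about ordering can be seen with a small Python sketch that groups each base letter with the marks following it. The clusters helper is hypothetical and only approximates how a renderer attaches combining marks. The lamed is an arbitrary consonant chosen for the example.

```python
import unicodedata

def clusters(text):
    """Group each base character with the nonspacing marks that follow it."""
    out = []
    for ch in text:
        if out and unicodedata.category(ch) == "Mn":
            out[-1] += ch  # a mark attaches to the preceding base
        else:
            out.append(ch)
    return out

LAMED, VAV, HOLAM = "\u05dc", "\u05d5", "\u05b9"

def describe(text):
    """List the character names in each cluster."""
    return [[unicodedata.name(c) for c in cl] for cl in clusters(text)]

# Holam placed before the vav: it belongs to the lamed, not to the vav.
print(describe(LAMED + HOLAM + VAV))
# Holam placed after the vav: it belongs to the vav.
print(describe(LAMED + VAV + HOLAM))
```

Neither sequence says whether the vav is a consonant or part of the vowel; the encoding only records which base letter carries the dot.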

U+FB4B has a canonical decomposition into vav holam, so it cannot be used for anything distinct from vav holam. Maybe it was originally intended for holam male, but if so the people who defined the decomposition forgot that. But there is nothing to stop the UTC from defining a new character HEBREW LETTER HOLAM MALE with no canonical decomposition (but perhaps a compatibility one), a glyph with the holam clearly to the right, and a note explaining the distinction from vav plus holam. That would be one sensible way ahead.
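
The decomposition can be checked with Python's standard unicodedata module. This is only a quick sketch; the exact output depends on the Unicode data shipped with the interpreter, though these mappings are stable. A decomposition string with no tag such as <compat> is a canonical one.

```python
import unicodedata

# The two presentation forms mentioned in this thread
for cp in ("\ufb4b", "\ufb35"):
    print(unicodedata.name(cp), "->", unicodedata.decomposition(cp))

VAV_HOLAM = "\u05d5\u05b9"  # HEBREW LETTER VAV + HEBREW POINT HOLAM

# Normalization replaces U+FB4B with the plain sequence, so the two
# are canonically equivalent and cannot carry different meanings.
print(unicodedata.normalize("NFD", "\ufb4b") == VAV_HOLAM)
print(unicodedata.normalize("NFC", "\ufb4b") == VAV_HOLAM)
```

Any process that normalizes text will therefore erase a distinction encoded by choosing U+FB4B over the two-character sequence.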


Other typographic curiosities: The HEBREW POINT QAMATS [05B8] is used for two Hebrew vowels: qamats katan (pronounced in Israeli Hebrew like the 'o' in American English 'corn', as is kholam male) and qamats gadol (pronounced like 'a' in American English 'father', as is patah when not under a final HE, HET, or AYIN). Dictionaries usually list the two as separate vowels but render them identically. HOWEVER, some text publishers now distinguish these two vowels typographically (e.g., Revised Siddur Sim Shalom published by the Rabbinical Assembly). Perhaps there should be an alphabetic presentation form for qamats katan.

The two qamatses were distinguished as early as 1850 in Benjamin Davidson's "The Analytical Hebrew and Chaldee Lexicon", of which I have a facsimile edition. But Davidson did not distinguish the holam vavs or the shevas.


The same comment goes for HEBREW POINT SHEVA [05B0]: in pronunciation it comes in two flavors, called sheva na ("moving sheva" -- pronounced something like the vowel segol) and sheva nakh ("resting sheva" -- silent). Again, most dictionaries list these as separate vowels but render them identically, while some publishers now distinguish them typographically (e.g., Tikkun Korim Simanim, published by Feldheim). Again, should there be an alphabetic presentation form for sheva na?

With that, I'll leave off.

Ted (not content with a focussed discussion)






--
Peter Kirk
http://web.onetel.net.uk/~peterkirk/




