Problem: We have here one character sequence with two alternate renditions: the common rendition, in which the two meanings look the same, and a distinguished rendition, which uses two separate glyphs for the two meanings.
On paper, which is two-dimensional, it is a Vav with a Holam point somewhere above it. Unicode decided that in the encoding, which is one-dimensional, the marks follow the base character. Any solution should accommodate both kinds of users and both renditions.

Solution: Suggestions, please.

Jony

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Ted Hopp
> Sent: Wednesday, July 30, 2003 6:43 PM
> To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Subject: SPAM: Re: Back to Hebrew -holem-waw vs waw-holem
>
> On Wednesday, July 30, 2003 11:57 AM, [EMAIL PROTECTED] wrote:
> > I agree 100% with your description of the characters that have not
> > been encoded in Unicode. There are certainly marks and consonants
> > that mean two completely different things, as you have so
> > accurately described. But there are two approaches to encoding.
> > There is "Code what you see" and "Code what is meant". In your
> > analysis and in the way SIL encoded the original SIL Ezra font, we
> > went with "Code what is meant". This means that we have two shevas
> > (one pronounced and one silent), a holemwaw character and a shureq
> > character. Unicode, on the other hand, is totally "Code what you
> > see". It is attempting to make no analysis of the marks on the
> > page. If there is a mark, code it. If it is identical to another
> > mark, then it gets the same codepoint. (Of course, there are
> > exceptions, but this is the general rule.)
>
> One of the key points some of us are trying to make is that vav with
> kholam khaser is a different mark on the page than a kholam male.
> Different semantics AND different appearance, but no separate Unicode
> encoding. What more do we need?
>
> Besides, what's all this that I keep reading about Unicode encodes
> characters, not glyphs? From Chapter 1: "[T]he standard defines how
> characters are interpreted, not how glyphs are rendered." The "code
> what you see" approach, while probably the reality of Unicode, seems
> somewhat contrary to this statement of principle.
>
> > So with Unicode, there is no way to separate even vowels and
> > consonants, since a waw in a shureq, a holem-waw, and just a plain
> > waw will always be encoded the same. Some of us are trying to make
> > this approach usable by allowing at least a holem-waw to be
> > distinguished from waw holem, by placing the holem first.
> >
> > For the encoders, it is fairly straight-forward. For the people
> > trying to actually use the encoding, it's going to take a lot of
> > context to determine what you've got.
>
> Yes, indeed. Nothing like an encoding that can't be decoded. :)
>
> Ted
>
> Ted Hopp, Ph.D.
> ZigZag, Inc.
> [EMAIL PROTECTED]
> +1-301-990-7453
>
> newSLATE is your personal learning workspace
> ...on the web at http://www.newSLATE.com/
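
For anyone who wants to see the two orderings under discussion as actual code points, here is a minimal Python sketch. It only illustrates the convention mentioned in the quoted message (holam placed before the vav for holem-waw, after it for vav with holam haser); it is not an agreed solution. The names `describe`, `holam_male` and `vav_holam_haser` are made up for this example.

```python
import unicodedata

VAV = "\u05D5"    # HEBREW LETTER VAV (base character)
HOLAM = "\u05B9"  # HEBREW POINT HOLAM (combining mark)

def describe(text):
    """Hypothetical helper: list the code points and Unicode names in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

# Usual Unicode order: the mark follows its base character.
# Assumption for this sketch: this order stands for vav with holam haser.
vav_holam_haser = VAV + HOLAM

# Order suggested in the quoted message: holam first, then vav.
# Assumption for this sketch: this order stands for holam male (holem-waw).
holam_male = HOLAM + VAV

for label, seq in (("vav + holam", vav_holam_haser),
                   ("holam + vav", holam_male)):
    print(label, describe(seq))

# Both sequences contain the same two characters; only the order differs,
# so any software reading the text has to look at the order to tell them apart.
print(sorted(vav_holam_haser) == sorted(holam_male))
print(vav_holam_haser == holam_male)
```

Since the two strings differ only in order, a font or a search tool would need to inspect that order (and probably the surrounding consonants) to decide which rendition or meaning is intended, which is the decoding burden complained about above.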