Re: Biblical Hebrew

2003-06-27 Thread Philippe Verdy
On Saturday, June 28, 2003 1:15 AM, Kenneth Whistler <[EMAIL PROTECTED]> wrote: > Philippe Verdy said: > > > I understand the frustration: if Unicode had not attempted to define > > combining classes, which were not necessary to Unicode, all > > existing combining characters would have been given

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Kenneth Whistler
Peter responded: > Kenneth Whistler wrote on 06/26/2003 05:36:34 PM: > > > Why is making use of the existing behavior of existing characters > > a "groanable kludge", if it has the desired effect and makes > > the required distinctions in text? > > Why is it a kludge to insert some cc=0 control

Re: Biblical Hebrew (U+034F Combining Grapheme Joiner works)

2003-06-27 Thread Kenneth Whistler
Peter countered: > > > Could this finally be the missing "killer ap" for the CGJ? > > > > It will be perfect to allow an application like XML to encode Hebrew > > text using Unicode 4.0 rules (and before). > > It is not perfect. CGJ is supposed to be significant (and kept in the > text) for a v

Mongolian Rant (was: Biblical Hebrew... was: Tibetan... was: ...)

2003-06-27 Thread Kenneth Whistler
Andrew West wrote: > I have to agree 100% with Peter on this. The potential fiasco with regards to > Mongolian Free Variation Selectors is another area where our grandchildren are > going to be weeping with despair if we are not careful. Well, I doubt that our grandchildren will be quite *that*

Re: Biblical Hebrew

2003-06-27 Thread Kenneth Whistler
Philippe Verdy said: > I understand the frustration: if Unicode had not attempted to define > combining classes, which were not necessary to Unicode, all > existing combining characters would have been given a CC=0 > (or all the same 220 or 230 value). Uh, no. Under this scheme, would be di

Re: Biblical Hebrew

2003-06-27 Thread John Hudson
At 01:45 PM 6/27/2003, Philippe Verdy wrote: I understand the frustration: Similar to the frustration of having private, off-list messages replied to in public. if Unicode had not attempted to define combining classes, which were not necessary to Unicode, all existing combining characters would

Re: Unicode Public Review Issues update

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 10:29 PM, Rick McGowan <[EMAIL PROTECTED]> wrote: > The Unicode Technical Committee has posted a new issue for public > review and comment. Details are on the following web page: > > http://www.unicode.org/review/ > > Briefly, the new issue is: > > Issue #11 Soft Dott

Re: Biblical Hebrew

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 10:28 PM, John Hudson <[EMAIL PROTECTED]> wrote: > I don't think it would break any modern Hebrew document, because it > is not in any way essential to modern Hebrew that the vowels have > fixed position combining classes as in Unicode. That is part of the > frustration: th

Re: Major Defect in Combining Classes of Tibetan Vowels

2003-06-27 Thread Christopher John Fynn
Rick McGowan <[EMAIL PROTECTED]> has privately suggested moving the discussion of Combining Classes of *Tibetan* Characters from the main Unicode list [EMAIL PROTECTED] to the TIBEX list [EMAIL PROTECTED] - an "experts" list which was set up several years ago specifically to discuss proposals for

Re: Biblical Hebrew

2003-06-27 Thread Karljürgen Feuerherm
Kenneth Whistler said on June 27, 2003 at 4:08 PM >Karljürgen, >> 2. Consequently ANY OTHER solution than 'FIX the obvious mistake(s)' is a >> kludge (contra Philippe's (?) recent comment). One *pays* for all kludges, >> one way or the other. >Digital encoding of writing systems is a kludge. An

Unicode Public Review Issues update

2003-06-27 Thread Rick McGowan
The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page: http://www.unicode.org/review/ Review periods for the new items close on August 18, 2003. Please see the page for links to discussion and relevant documents. Brief

Re: Biblical Hebrew

2003-06-27 Thread Kenneth Whistler
Karljürgen, > 2. Consequently ANY OTHER solution than 'FIX the obvious mistake(s)' is a > kludge (contra Philippe's (?) recent comment). One *pays* for all kludges, > one way or the other. Digital encoding of writing systems is a kludge. And boy, do we seem to be paying for the Unicode version o

RE: SPAM: About combining classes

2003-06-27 Thread Peter_Constable
Jony Rosenne wrote on 06/27/2003 08:32:11 AM: > I am under the impression that the existing scientific encodings of the > Bible are encoded with the help of some kind of mark up, and maybe this is > how they should continue. The existing eBHS texts use an encoding in which the order of characters

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread John Hudson
At 10:20 AM 6/27/2003, John Cowan wrote: > What if the request to change the Hebrew combining classes came *from* W3C > and/or IETF? I'm not saying that this is likely, but I'm wondering whether > they might, in fact, not insist on stability for characters for which > normalisation is currently br

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Karljürgen Feuerherm
Peter replied: > Karljürgen Feuerherm wrote on 06/27/2003 08:23:08 AM: > > > Now, Q: I take it the combining classes are linked to the script, rather > > than say to a dialect > > They're linked to the character. > > --e.g. one can't define BH as a separate dialect from > > MH with its own set o

Fw: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Karljürgen Feuerherm
(repost. last word missing, sorry) > John Cowan said on June 27, 2003 at 12:56 PM > > Michael Everson had said: > > > This is not analogous to the present situation, it seems to me. In > > > the first place, what else is the \ for? :-) > > > > Escaping special characters, since you ask. > > But

Re: Biblical Hebrew

2003-06-27 Thread Karljürgen Feuerherm
Jony Rosenne said on June 27, 2003 at 2:17 PM > > 1. Everyone is more or less agreed that the present combining > > class rules as they apply to BH contain mistakes. > > I don't. I do agree that there are some cases that are not handled. Well, ok, 'omissions,' then. If BH was intended to be covered in

Re: Yerushala(y)im - or Biblical Hebrew

2003-06-27 Thread John Cowan
[EMAIL PROTECTED] scripsit: > Of course, the point is, this is a particular situation where > > > > is canonically equivalent to > > No, Mark had it right. These two are canonically equivalent and therefore no normal Unicode process (including rendering) can treat them differently. That's
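For readers following the canonical-equivalence point, here is a minimal sketch using Python's standard `unicodedata` module. The sequences quoted in this snippet were lost in archiving, so the example assumes they are the lamed + patah + hiriq spelling from the Yerushala(y)im discussion; the variable names are illustrative only.

```python
import unicodedata

LAMED = "\u05DC"   # HEBREW LETTER LAMED
PATAH = "\u05B7"   # HEBREW POINT PATAH, canonical combining class 17
HIRIQ = "\u05B4"   # HEBREW POINT HIRIQ, canonical combining class 14

# Assumed example: the two orders in which the points might be entered.
patah_first = LAMED + PATAH + HIRIQ
hiriq_first = LAMED + HIRIQ + PATAH

# Look up the combining classes instead of trusting the comments above.
classes = {unicodedata.name(c): unicodedata.combining(c) for c in (PATAH, HIRIQ)}

# Canonical ordering sorts adjacent marks by combining class, so both
# spellings normalize to the same string and the entered order is lost.
nfd_a = unicodedata.normalize("NFD", patah_first)
nfd_b = unicodedata.normalize("NFD", hiriq_first)
same_after_nfd = nfd_a == nfd_b

print(classes)
print(same_after_nfd)
```

Because the two strings are indistinguishable after normalization, a conformant process has no basis for rendering them differently, which is the point being argued in the thread.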

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread John Hudson
Philippe said on June 27, 2003 at 10:25 AM Do you then propose to create a specific character, for use within the Hebrew script only, as a way to specify an alternate order for hebrew cantillation? In that case, it would be more appropriate to define new standard variants of these cantillation mar

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Peter_Constable
Karljürgen Feuerherm wrote on 06/27/2003 08:23:08 AM: > Now, Q: I take it the combining classes are linked to the script, rather > than say to a dialect They're linked to the character. > --e.g. one can't define BH as a separate dialect from > MH with its own set of rules? No, not unless BH is

Re: About combining classes

2003-06-27 Thread Rick McGowan
Philippe wrote: > When I just look at the history of combining classes, they did not exist in > the first Unicode standard, and they still don't exist in ISO10646 as well. > This was a technology developed by IBM and offered for free to the community Excuse me Philippe, but you are wrong. Please

Re: [cowan: Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)]

2003-06-27 Thread Peter_Constable
John Cowan wrote on 06/27/2003 06:29:12 AM: > Since the use of non-ASCII characters in things like XML and the DNS I suspect the users of Biblical Hebrew would rather be told they can't use Hebrew vowels and accents in markup or URIs than deal with a hack to fix errors in the combining classes.

Re: Biblical Hebrew (U+034F Combining Grapheme Joiner works)

2003-06-27 Thread Peter_Constable
Philippe Verdy wrote on 06/27/2003 04:46:56 AM: > > Could this finally be the missing "killer ap" for the CGJ? > > It will be perfect to allow an application like XML to encode Hebrew > text using Unicode 4.0 rules (and before). It is not perfect. CGJ is supposed to be significant (and kept in t

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Karljürgen Feuerherm
Philippe Verdy said on June 27, 2003 at 12:38 PM Subject: Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels) > On Friday, June 27, 2003 5:53 PM, Karljürgen Feuerherm <[EMAIL PROTECTED]> wrote: > > And in any case this should NOT muck things up which aren't broken, > >

Re: [cowan: Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)]

2003-06-27 Thread Peter_Constable
Michael Everson wrote on 06/27/2003 09:39:16 AM: > But you might trot on over with a white flag to parley about a problem. > > They [IETF] 're only human beings over there, just as we are over here. Every time I have referred to IETF as "them" in his presence, Misha Wolf has reminded me, "WE ar

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Peter_Constable
John Cowan wrote on 06/27/2003 08:24:35 AM: > The IETF has an explicit contract with Unicode: "We' > ll use your normalization algorithm if you promise NEVER, NEVER to change > the normalization status of a single character." Unicode has already > broken that promise four times, so its credibili

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Karljürgen Feuerherm
John Cowan said on June 27, 2003 at 12:56 PM Michael Everson had said: > > This is not analogous to the present situation, it seems to me. In > > the first place, what else is the \ for? :-) > > Escaping special characters, since you ask. But in a completely different. K

Re: Fw: Biblical Hebrew: possible solution for XML

2003-06-27 Thread John Cowan
Philippe Verdy scripsit: > Given that XML will require normalization for texts identified as > being Unicode encoded (UTF-8 and others), couldn't a document be > labelled so that the normalization step be removed from the XML > processing, using a "ISO-10646-8" encoding name (for the UTF-8 > encod

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread John Hudson
At 05:48 AM 6/27/2003, Michael Everson wrote: The W3C would also hit the roof if Unicode normalization changed radically. I don't think anyone is proposing a *radical* change. I have uploaded the relevant draft pages of the SBL Hebrew user manual to http://www.tiro.com/transfer/SBLappendi

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread John Cowan
John Hudson scripsit: > What if the request to change the Hebrew combining classes came *from* W3C > and/or IETF? I'm not saying that this is likely, but I'm wondering whether > they might, in fact, not insist on stability for characters for which > normalisation is currently broken anyway? Th

Biblical Hebrew

2003-06-27 Thread Jony Rosenne
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Karljürgen Feuerherm > Sent: Friday, June 27, 2003 3:23 PM > To: [EMAIL PROTECTED] > Subject: SPAM: Re: Biblical Hebrew (Was: Major Defect in > Combining Classes of Tibetan Vowels) > > > > 1. Ever

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Karljürgen Feuerherm
John Cowan said on June 27, 2003 at 12:48 PM > Karljürgen Feuerherm scripsit: > > > > Several people have expressed reasons why this can't be (practically) be > > > > done--which mainly seem to stem from political concerns. > > > > > > All concerns involving human beings -- ho bios politikos -- a

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread John Hudson
At 03:12 AM 6/27/2003, Michael Everson wrote: Who is it who will kill the Unicode Consortium if UAX #15 were to be revised? Did it occur to anyone to *ask* about the possible revision of classes for the dozen or so instances that would be affected? My understanding is that stability promises hav

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread John Hudson
At 02:53 AM 6/27/2003, [EMAIL PROTECTED] wrote: ISO: Then, obviously they need to correct their errors. I mean, it's not like the wrong characters got encoded or something. Tell them to just fix the errors; that can't be difficult to do, and is obviously the right thing to do. That seems to be exa

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread John Cowan
Michael Everson scripsit: > No, but you're not making a technical argument, either. "The life of [Unicode] has not been logic but experience." --Oliver Wendell Holmes, somewhat mutated > >Not when their core values -- correctness vs. stability -- are made to > >be at odds. > > And shift

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread John Cowan
Michael Everson scripsit: > And sometimes not, then. What four characters have been corrected so > far? Were they "important" characters to some company? Are there no > Christians or Jews in the IETF who might care about a problem like > this, where a simple solution might be effected? Particul

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread John Cowan
Karljürgen Feuerherm scripsit: > > The use of > > the backslash character in DOS/Windows systems as a path separator is > > arguably a mistake > > I hardly think so. It was a matter of a necessary alternative. It could only > be viewed as a mistake on the assumption that somehow the Unix way was

Fw: Biblical Hebrew: possible solution for XML

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 6:01 PM, Philippe Verdy <[EMAIL PROTECTED]> wrote: Given that XML will require normalization for texts identified as being Unicode encoded (UTF-8 and others), couldn't a document be labelled so that the normalization step be removed from the XML processing, using a "ISO-1

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 5:53 PM, Karljürgen Feuerherm <[EMAIL PROTECTED]> wrote: > And in any case this should NOT muck things up which aren't broken, > like MH. Not breaking Modern Hebrew means not changing the combining classes of the characters it uses. Adding a distinct set for Traditional

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 5:05 PM, Michael Everson <[EMAIL PROTECTED]> wrote: > At 10:40 -0400 2003-06-27, John Cowan wrote: > > Karljürgen Feuerherm scripsit: > > > > > 1. Everyone is more or less agreed that the present combining > > > class rules as they apply to BH contain mistakes. The clea

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread John Cowan
Philippe Verdy scripsit: > Maybe Unicode should be more prudent with Normalization Forms: if > new characters are added, their combining classes should be > documented as informative before there is a consensus and > experimentation. This will not break the stability pact with XML, which > will s

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Karljürgen Feuerherm
Philippe said on June 27, 2003 at 10:25 AM > On Friday, June 27, 2003 3:23 PM, Karljürgen Feuerherm <[EMAIL PROTECTED]> wrote: > > I REALLY think that option 1 [FIX the combining classes] should be beaten to death with a stick, > > then beaten to death again, before settling for one of the others.

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 4:40 PM, John Cowan <[EMAIL PROTECTED]> wrote: > Not so. Sometimes stability is more important than correctness. Very well answered. I don't see why we need to sacrifice stability when correcting something. As the error is not in ISO10646, it is definitely not reasonable

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Doug Ewell
Andrew C. West wrote: > I have to agree 100% with Peter on this. The potential fiasco with > regards to Mongolian Free Variation Selectors is another area where > our grandchildren are going to be weeping with despair if we are > not careful. The standardized variants for Mongolian were set in >

Re: [cowan: Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)]

2003-06-27 Thread John Cowan
Michael Everson scripsit: > But you might trot on over with a white flag to parley about a problem. > > They're only human beings over there, just as we are over here. Michael, I *am* the guy carrying the white flag to the W3C, and I have made promises about what the Unicode Consortium will and

Re: Plain-text search algorithms: normalization, decomposition, case mapping, word breaks

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 4:44 PM, Ben Dougall <[EMAIL PROTECTED]> wrote: > i'm a bit confused. i thought that this type of thing was already > pretty well covered by the various unicode resources? (i guess there's > a strong chance not, if you're asking this question). I'm not discussing about ho

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Michael Everson
At 10:40 -0400 2003-06-27, John Cowan wrote: Karljürgen Feuerherm scripsit: 1. Everyone is more or less agreed that the present combining class rules as they apply to BH contain mistakes. The clearly preferential way to deal with mistakes in any technological/computing software environment is t

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Michael Everson
At 09:24 -0400 2003-06-27, John Cowan wrote: Michael Everson scripsit: So, you're saying, no one has asked IETF whether or not they would be able to countenance a dozen or so changes for unimplemented things like biblical accents. The IETF has an explicit contract with Unicode: "We'll use your

Re: Plain-text search algorithms: normalization, decomposition, case mapping, word breaks

2003-06-27 Thread Ben Dougall
i'm a bit confused. i thought that this type of thing was already pretty well covered by the various unicode resources? (i guess there's a strong chance not, if you're asking this question). this is the way i see it: it's for you to decide which format you internally normalise to (i'm not even

Re: Biblical Hebrew (U+034F Combining Grapheme Joiner works)

2003-06-27 Thread Doug Ewell
Philippe Verdy wrote: > The current use of CGJ is for sequences like: > and > which still encode the French words "boeuf" and "effet", where the > author gives a hint to display the sequence "oe" as a single ligated > form instead of two separate grapheme clusters, despite this > corres

Re: [cowan: Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)]

2003-06-27 Thread Michael Everson
At 09:16 -0400 2003-06-27, John Cowan wrote: Michael Everson scripsit: Oh, come on. Let's not put words in people's mouths. Ifs and mights are not facts. Expressed attitudes are facts, and it's reasonable to extrapolate people's future behaviors, at least the general trend thereof, from their ex

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread John Cowan
Karljürgen Feuerherm scripsit: > 1. Everyone is more or less agreed that the present combining class rules as > they apply to BH contain mistakes. The clearly preferential way to deal with > mistakes in any technological/computing software environment is to FIX them. Not so. Sometimes stability

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 3:23 PM, Karljürgen Feuerherm <[EMAIL PROTECTED]> wrote: > > At 04:22 -0500 2003-06-27, [EMAIL PROTECTED] wrote: > Now, Q: I take it the combining classes are linked to the script, > rather than say to a dialect--e.g. one can't define BH as a separate > dialect from MH with

Re: Plain-text search algorithms: normalization, decomposition, case mapping, word breaks

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 3:36 PM, Jony Rosenne <[EMAIL PROTECTED]> wrote: > For Hebrew and Arabic, add a step: Find the root, remove prefixes, > suffixes and other grammatical artifacts and obtain the base form of > the word. Removing common suffixes is a separate issue (this requires unificatio

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Karljürgen Feuerherm
(Regret I hadn't yet read this post prior to my last post) Peter said, in response to Ken: > Why is it a kludge to insert some cc=0 control character into the text for > the sole purpose of preventing reordering during canonical ordering of two > combining marks that do interact typographically an

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Karljürgen Feuerherm
> At 04:22 -0500 2003-06-27, [EMAIL PROTECTED] wrote: > > >I just have a hard time believing that 50 years from now our > >grandchildren won't look back [...] I am in complete agreement with the spirit of what Peter says, though realistically, 50 years from now, this is likely to be all neither h

Re: [cowan: Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)]

2003-06-27 Thread John Cowan
Michael Everson scripsit: > Oh, come on. Let's not put words in people's mouths. Ifs and mights > are not facts. Expressed attitudes are facts, and it's reasonable to extrapolate people's future behaviors, at least the general trend thereof, from their expressed attitudes. When someone draws a

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread John Cowan
Michael Everson scripsit: > So, you're saying, no one has asked IETF whether or not they would be > able to countenance a dozen or so changes for unimplemented things > like biblical accents. The IETF has an explicit contract with Unicode: "We'll use your normalization algorithm if you promise

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Michael Everson
At 07:28 -0400 2003-06-27, John Cowan wrote: Michael Everson scripsit: Who is it who will kill the Unicode Consortium if UAX #15 were to be revised? Did it occur to anyone to *ask* about the possible revision of classes for the dozen or so instances that would be affected? The IETF, for one. I

Re: [cowan: Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)]

2003-06-27 Thread Michael Everson
At 14:34 +0200 2003-06-27, Philippe Verdy wrote: On Friday, June 27, 2003 1:29 PM, John Cowan <[EMAIL PROTECTED]> wrote: Michael Everson scripsit: Change the character classes in Unicode 4.1, and they *might* decide to freeze support at, say, Unicode 3.0. Or they may simply opt to define their *

RE: Plain-text search algorithms: normalization, decomposition, case mapping, word breaks

2003-06-27 Thread Jony Rosenne
For Hebrew and Arabic, add a step: Find the root, remove prefixes, suffixes and other grammatical artifacts and obtain the base form of the word. Nearly nobody does it, and searches in these languages are less useful than parallel searches in other languages. Jony > -Original Message- >

Re: [cowan: Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)]

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 1:29 PM, John Cowan <[EMAIL PROTECTED]> wrote: > Michael Everson scripsit: > Change the character classes in Unicode 4.1, and they *might* decide > to freeze support at, say, Unicode 3.0. Or they may simply opt to define their *OWN* normalization standard, distinct from U

RE: SPAM: About combining classes

2003-06-27 Thread Jony Rosenne
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Philippe Verdy > Sent: Friday, June 27, 2003 12:31 PM > To: [EMAIL PROTECTED] > Subject: SPAM: About combining classes > > > When I just look at the history of combining classes, they > did not exi

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Andrew C. West
On Fri, 27 Jun 2003 04:22:30 -0500, [EMAIL PROTECTED] wrote: > I just have a hard time believing that 50 years from now our grandchildren > won't look back, "What were they thinking? So it took them a couple of > years to figure out canonical ordering and normalization; why on earth > didn't th

Plain-text search algorithms: normalization, decomposition, case mapping, word breaks

2003-06-27 Thread Philippe Verdy
In order to implement a plain-text search algorithm, in a language-neutral way that would still work with all scripts, I am searching for advice on how this can be done "safely" (notably for automated search engines), to allow searching for text matching some basic encoding styles. My first a
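The subject line names normalization, decomposition and case mapping as the ingredients; the message body is cut off here, so the following is not the poster's algorithm. It is only a rough sketch of one common way to combine those three steps into a loose search key, using Python's standard `unicodedata` module. The function names `search_key` and `contains` are hypothetical, dropping all combining marks is a deliberately lossy assumption, and word breaking is not covered.

```python
import unicodedata

def search_key(text: str) -> str:
    """Build a loose matching key: decompose, case-fold, drop combining marks."""
    # Compatibility decomposition splits precomposed letters and ligatures.
    decomposed = unicodedata.normalize("NFKD", text)
    # Full case folding (stronger than lower()) for caseless comparison.
    folded = decomposed.casefold()
    # Re-normalize after folding in case folding produced characters
    # that decompose further; harmless when nothing changed.
    folded = unicodedata.normalize("NFKD", folded)
    # Assumption: every mark (general category M*) is ignored for matching.
    return "".join(ch for ch in folded
                   if not unicodedata.category(ch).startswith("M"))

def contains(haystack: str, needle: str) -> bool:
    """Substring search on the loose keys (no word-break handling)."""
    return search_key(needle) in search_key(haystack)

print(search_key("\u00C9ffet"))                        # accented capital E
print(contains("Karlj\u00FCrgen Feuerherm", "KARLJURGEN"))
```

Stripping marks makes pointed and unpointed Hebrew compare equal, which may or may not be wanted; a stricter search could keep the marks and rely on canonical normalization alone.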

[cowan: Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)]

2003-06-27 Thread John Cowan
Michael Everson scripsit: > Who is it who will kill the Unicode Consortium if UAX #15 were to be > revised? Did it occur to anyone to *ask* about the possible revision > of classes for the dozen or so instances that would be affected? The IETF, for one. IETF is already very wary of Unicode, ev

About combining classes

2003-06-27 Thread Philippe Verdy
When I just look at the history of combining classes, they did not exist in the first Unicode standard, and they still don't exist in ISO10646 as well. This was a technology developed by IBM and offered for free to the community to allow a simplified management of encoded texts, and it has long b

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Michael Everson
At 04:53 -0500 2003-06-27, [EMAIL PROTECTED] wrote: If they're so unaware of combining classes, might it not seem reasonable to think the dialog might continue as follows? - [gives explanation of combining classes and the related problem for Hebrew] ISO: So, you're saying you're coming to us

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Michael Everson
At 04:22 -0500 2003-06-27, [EMAIL PROTECTED] wrote: Are we saying that ISO doesn't give a rip for implementation issues? Duplication of characters is not the way to fix (forgive me, UTC) *Unicode's* error in combining characters. Or that their notion of ordering distinctions is different from U

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Michael Everson
At 04:22 -0500 2003-06-27, [EMAIL PROTECTED] wrote: I just have a hard time believing that 50 years from now our grandchildren won't look back, "What were they thinking? So it took them a couple of years to figure out canonical ordering and normalization; why on earth didn't they work that o

Re: Biblical Hebrew

2003-06-27 Thread Michael Everson
At 04:22 -0500 2003-06-27, [EMAIL PROTECTED] wrote: In discussing these issues among Biblical Hebrew implementers, content providers and users, I have had to explain repeatedly why UTC doesn't want to consider this. It is completely obvious to them that this is the right solution. Even on ex

Re: Biblical Hebrew (U+034F Combining Grapheme Joiner works)

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 3:54 AM, Kenneth Whistler <[EMAIL PROTECTED]> wrote: > John, > > > At 03:36 PM 6/26/2003, Kenneth Whistler wrote: > > > > > Why is making use of the existing behavior of existing characters > > > a "groanable kludge", if it has the desired effect and makes > > > the requ

Re: Yerushala(y)im - Biblical Hebrew

2003-06-27 Thread Michael Everson
At 10:09 +0200 2003-06-27, Jony Rosenne wrote: Whatever you do, any new characters designed for solving these problems should not be in the Hebrew block. Add a new Biblical Hebrew block, clearly labeled as not intended for regular Hebrew use. And I suggest that whenever a proposal comes up to the U

Re: Biblical Hebrew

2003-06-27 Thread Michael Everson
At 23:59 -0700 2003-06-26, John Hudson wrote: I think there is a reasonable case to be made for treating modern Hebrew and Biblical Hebrew as separate languages for pretty much all purposes. The existing codepoints with the fixed position combining classes work fine for Modern Hebrew, and there

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Peter_Constable
Kenneth Whistler wrote on 06/26/2003 05:36:34 PM: > But in the 10646 WG2 context... You can always come in > with the proposal to encode BIBLICAL HEBREW POINT PATAH and > say, even though the glyph is identical, see, the name is > different, so the character is different. But this is a pretty > th

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Peter_Constable
Kenneth Whistler wrote on 06/26/2003 05:36:34 PM: > Why is making use of the existing behavior of existing characters > a "groanable kludge", if it has the desired effect and makes > the required distinctions in text? Why is it a kludge to insert some cc=0 control character into the text for the

RE: Question about Unicode Ranges in TrueType fonts

2003-06-27 Thread Peter_Constable
> but premature standardization can > also be a problem if the wrong choices get codified too soon. As in canonical combining classes? :-) - Peter --- Peter Constable Non-Roman Script Initiative, SIL International 7500 W

Re: Biblical Hebrew

2003-06-27 Thread Peter_Constable
Kenneth Whistler wrote on 06/26/2003 10:15:12 PM: > How does a user of pointed Hebrew text know whether they are > dealing with the legacy points... Ken, corresponding arguments apply equally to your suggestion of putting CGJ everywhere and letting software make it transparent to the user: how

Re: Yerushala(y)im - or Biblical Hebrew (was Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Peter_Constable
Ken Whistler wrote on 06/26/2003 05:04:55 PM: > Another possibility to consider is U+2060 WORD JOINER, the > version of the zero width non-breaking space unfreighted with > the BOM confusion of U+FEFF. It wouldn't allow line breaks, but it would indicate an unwanted word boundary, no? (I don't h

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Peter_Constable
Kenneth Whistler wrote on 06/26/2003 08:54:08 PM: > Actually, in casting around for the solution to the problem of > introduction of format controls creating defective combining > character sequences, it finally occurred to me that: > > U+034F COMBINING GRAPHEME JOINER > > has the requisite prop
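The properties being alluded to can be checked directly. The sketch below, using Python's standard `unicodedata` module, looks up the general category and combining class of U+034F and shows that placing it between two Hebrew points keeps canonical ordering from swapping them. The lamed + patah + hiriq test string is an assumed example borrowed from the Yerushala(y)im thread, not something stated in this message.

```python
import unicodedata

CGJ = "\u034F"     # COMBINING GRAPHEME JOINER
LAMED = "\u05DC"   # HEBREW LETTER LAMED
PATAH = "\u05B7"   # HEBREW POINT PATAH, combining class 17
HIRIQ = "\u05B4"   # HEBREW POINT HIRIQ, combining class 14

# A nonspacing mark (so the sequence is not defective) with class 0
# (so canonical reordering cannot move marks across it).
cgj_props = (unicodedata.category(CGJ), unicodedata.combining(CGJ))

plain = LAMED + PATAH + HIRIQ          # assumed test sequence
guarded = LAMED + PATAH + CGJ + HIRIQ  # same, with CGJ between the points

# Without CGJ the points are swapped; with CGJ the order survives.
plain_kept = unicodedata.normalize("NFC", plain) == plain
guarded_kept = unicodedata.normalize("NFC", guarded) == guarded

print(cgj_props)
print(plain_kept, guarded_kept)
```

The cost, raised elsewhere in the thread, is that the guarded string is no longer canonically equivalent to the unguarded one, so searching and comparison have to account for the extra character.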

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Peter_Constable
John Hudson wrote on 06/26/2003 03:19:44 PM: > >That is a potential solution, thought it would have to be *two* additional > >metegs. > > Can you explain your thinking here, Peter? I was thinking of the three-way distinction for hataf vowels, but you were correct in pointing out earlier that c

Re: Biblical Hebrew

2003-06-27 Thread Peter_Constable
Rick McGowan wrote on 06/26/2003 05:52:32 PM: > The *best* thing to do, in my personal opinion and I know it'll get shot > down so don't bother telling me so, is to fix the combining classes of the > Hebrew points. In discussing these issues among Biblical Hebrew implementers, content provi

Re: Yerushala(y)im - Biblical Hebrew

2003-06-27 Thread Jony Rosenne
Whatever you do, any new characters designed for solving these problems should not be in the Hebrew block. Add a new Biblical Hebrew block, clearly labeled as not intended for regular Hebrew use. And I suggest that whenever a proposal comes up to the UTC, it would be advantageous to involve Israel

Re: Yerushala(y)im - or Biblical Hebrew (was Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Jony Rosenne
John, You just discovered one more shortcoming of UniScribe. As you say, the authors did not consider this particular case. I suppose it will be fixed sooner or later. I don't see how this affects the discussion, though. UniScribe and most current fonts do not process the simple case of Holam cor

Re: Biblical Hebrew

2003-06-27 Thread John Hudson
At 08:15 PM 6/26/2003, Kenneth Whistler wrote: But who then does end up carrying the can eventually, if we go the cloning route? Cloning 14 characters creates a *new* normalization problem, and forces non-Biblical-scholar users of pointed Hebrew text to carry *that* particular can. ... I think if

Yerushala(y)im - or Biblical Hebrew (was Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Jony Rosenne
It is not a problem, this is how it should be. Jony > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Mark Davis > Sent: Thursday, June 26, 2003 11:46 PM > To: Kenneth Whistler; [EMAIL PROTECTED] > Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] > Subject: