RE: No Invisible Character - NBSP at the start of a word

2004-11-26 Thread Jony Rosenne
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of John Hudson > Sent: Saturday, November 27, 2004 1:21 AM > To: 'Unicode Mailing List' > Subject: Re: No Invisible Character - NBSP at the start of a word > > > Jony Rosenne wrote: > > > One of the p

Re: Relationship between Unicode and 10646

2004-11-26 Thread John Cowan
Peter Kirk scripsit: > I don't want to go along with Philippe entirely on this, but surely he > must be right on this last point. Formally, Unicode is effectively the > agent of just one national body in this decision-making process. The Unicode Consortium is not an agent of the USNB, althoug

Re: Misuse of 8th bit [Was: My Querry]

2004-11-26 Thread Doug Ewell
John Cowan wrote: > No, I don't agree with this part. Unicode just isn't going to expand > past 0x10FFFF unless Earth joins the Galactic Empire. So the upper > bits are indeed free for private uses. A few years ago there was the "Whistler Constant," which basically stated that at current growt
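
A rough illustration of the "upper bits" point, not taken from the thread itself: the Unicode code space ends at U+10FFFF, so a code point fits in 21 bits and a 32-bit storage unit has 11 bits left over. The sketch below assumes a purely internal use of those spare bits; the names pack and unpack are hypothetical, and nothing here is a serialized encoding form.

```python
# Illustrative sketch only: pack() and unpack() are hypothetical helpers,
# meant for in-memory use, not for data interchange.
MAX_CODE_POINT = 0x10FFFF                  # last code point in the Unicode code space
CODE_POINT_BITS = MAX_CODE_POINT.bit_length()
SPARE_BITS = 32 - CODE_POINT_BITS          # bits left over in a 32-bit unit
CODE_POINT_MASK = (1 << CODE_POINT_BITS) - 1


def pack(code_point: int, tag: int) -> int:
    """Store a private tag in the bits above the code point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError("not a Unicode code point")
    if not 0 <= tag < (1 << SPARE_BITS):
        raise ValueError("tag does not fit in the spare bits")
    return (tag << CODE_POINT_BITS) | code_point


def unpack(word: int) -> tuple[int, int]:
    """Split a packed word back into (code_point, tag)."""
    return word & CODE_POINT_MASK, word >> CODE_POINT_BITS


if __name__ == "__main__":
    word = pack(ord("A"), 5)
    print(CODE_POINT_BITS, SPARE_BITS, unpack(word))
```

A packed word is no longer a valid code point once the tag is nonzero, so the tag would have to be stripped with unpack before the value is handed to anything that expects plain Unicode.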

Re: Misuse of 8th bit [Was: My Querry]

2004-11-26 Thread John Cowan
Antoine Leca scripsit: > In a similar vein, I cannot be in agreement that it could be advisable to > use the 22nd, 23rd, 32nd, 63rd, etc., the upper bits of the storage of a > Unicode codepoint. Right now, nobody is seeing any use for them as part of > characters, but history should have learned u

Re: Relationship between Unicode and 10646

2004-11-26 Thread Peter Kirk
On 26/11/2004 14:04, Philippe Verdy wrote: From: "Doug Ewell" <[EMAIL PROTECTED]> My impression is that Unicode and ISO/IEC 10646 are two distinct standards, administered respectively by UTC and ISO/IEC JTC1/SC2/WG2, which have pledged to work together to keep the standards perfectly aligned and in

Re: No Invisible Character - NBSP at the start of a word

2004-11-26 Thread Peter Kirk
On 26/11/2004 21:27, Doug Ewell wrote: ... One useful litmus (or lackmus) test for this Hebrew example would be whether the text in question is still legible, with its original meaning, when reduced to plain text representable in today's Unicode. If the special Ketiv/Qere handling is needed only be

Re: No Invisible Character - NBSP at the start of a word

2004-11-26 Thread Peter Kirk
On 26/11/2004 23:24, Doug Ewell wrote: ... Most "break opportunities" are between words, a concept often indicated by an ordinary space (U+0020). So you wouldn't generally have to precede *every* combination of NBSP+combining mark with ZWSP "to ensure a break opportunity," only those combinations

Re: No Invisible Character - NBSP at the start of a word

2004-11-26 Thread Doug Ewell
Peter Kirk wrote: > So I only raised this issue to clarify exactly how NBSP should be used > in such cases. Although I have been rather confused by the responses I > have received, I think the situation is clear as follows: NBSP may be > used with a combining mark at the start of a word, but shou

Re: No Invisible Character - NBSP at the start of a word

2004-11-26 Thread John Hudson
Jony Rosenne wrote: One of the problems in this context is the phrase "original meaning". What we have is a juxtaposition of two words, which is indicated by writing the letters of one with the vowels of the other. In many cases this does not cause much of a problem, because the vowels fit the lett

Re: Misuse of 8th bit [Was: My Querry]

2004-11-26 Thread Asmus Freytag
The fact is, once you dedicate the top bits in a pipe to some purposes, you've narrowed the width of the pipe. That's what happened to those systems that implemented a 7-bit pipe for ASCII by using the top bit for other purposes. And everybody seems to agree that when you serialize such an enco

Re: CGJ , RLM

2004-11-26 Thread Doug Ewell
Philippe Verdy wrote: > If encoding ligation opportunity is not plain-text, why then have it in > Unicode? > If hyphenation opportunity is not plain-text, why then have it in > Unicode? Both of these capabilities are arguably plain-text. There is such a thing as over-using them to the point wher

Re: CGJ , RLM

2004-11-26 Thread Philippe Verdy
From: "Doug Ewell" <[EMAIL PROTECTED]> Philippe Verdy wrote: If I want to encode explicit ligatures for the "ffi" cluster, if it is not hyphenated, I need to add ZWJ: "ef"+ZWJ+SHY+"f"+ZWJ+"i"+SHY+"ca"+SHY+"ce"(1) Great Scott! You can use ZWJ to suggest a ligation opportunity, and SHY to sugge

Re: CGJ , RLM

2004-11-26 Thread Philippe Verdy
From: "Doug Ewell" <[EMAIL PROTECTED]> Perhaps a better question to ask would be why you need to indicate both hyphenation points and ligation points in text that is going to be collated. Because one would want to: - prepare documents for correct rendering (including both ligatures and hyphenation

Re: CGJ , RLM

2004-11-26 Thread Philippe Verdy
Which "statements"? My message is mostly to be read as a question, not as an affirmation... I also took the precaution of using terms like "not sure if...", or "i don't know if...", which mean that it's a problem for which I can't find easy solutions, i.e. the interaction of ligature opportunities

Re: CGJ , RLM

2004-11-26 Thread Doug Ewell
Philippe Verdy wrote: > If I want to encode explicit ligatures for the "ffi" cluster, if it is > not hyphenated, I need to add ZWJ: > "ef"+ZWJ+SHY+"f"+ZWJ+"i"+SHY+"ca"+SHY+"ce"(1) Great Scott! You can use ZWJ to suggest a ligation opportunity, and SHY to suggest a hyphenation opportunity, b
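
For readers who want to see the quoted sequence concretely, here is a small sketch that builds it, using U+200D for ZWJ and U+00AD for SHY. The helper strip_hints is hypothetical and only shows that the underlying word is unchanged once the hints are removed; it is not a substitute for a real collation implementation.

```python
# Illustrative sketch only: strip_hints() is a hypothetical helper, not a
# collation algorithm.
ZWJ = "\u200D"  # ZERO WIDTH JOINER, suggests a ligation opportunity
SHY = "\u00AD"  # SOFT HYPHEN, suggests a hyphenation opportunity

# The sequence quoted in the message: "ef"+ZWJ+SHY+"f"+ZWJ+"i"+SHY+"ca"+SHY+"ce"
marked = "ef" + ZWJ + SHY + "f" + ZWJ + "i" + SHY + "ca" + SHY + "ce"


def strip_hints(text: str) -> str:
    """Drop ZWJ and SHY so that only the letters remain."""
    return "".join(ch for ch in text if ch not in (ZWJ, SHY))


if __name__ == "__main__":
    print(len(marked), strip_hints(marked))
```

Comparing strip_hints(marked) against the unmarked word is one naive way to check that the hints did not alter the text; whether such hints should be ignored in collation is exactly what the thread is debating.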

RE: No Invisible Character - NBSP at the start of a word

2004-11-26 Thread Jony Rosenne
> -Original Message- > From: Doug Ewell [mailto:[EMAIL PROTECTED] > Sent: Friday, November 26, 2004 11:28 PM > To: Unicode Mailing List > Cc: Jony Rosenne; Peter Kirk > Subject: Re: No Invisible Character - NBSP at the start of a word > > > Jony Rosenne wrote: > > > Normal printed te

Re: No Invisible Character - NBSP at the start of a word

2004-11-26 Thread Doug Ewell
Jony Rosenne wrote: > Normal printed text is hardly ever plain text. It contains headings, > highlighted phrases, paragraphs etc. Headings and highlighted text, when stripped of their formatting, are still legible, and paragraph boundaries can usually be indicated in plain text. One useful litm

Re: CGJ , RLM

2004-11-26 Thread Mark Davis
The statements below are incorrect, but I don't have the time to correct them all. -Mark - Original Message - From: "Philippe Verdy" <[EMAIL PROTECTED]> To: "Mark Davis" <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]> Sent: Friday, November 26, 2004 11:13 Subject: Re: CGJ , RLM > From: "Ma

Re: CGJ , RLM

2004-11-26 Thread Philippe Verdy
From: "Mark Davis" <[EMAIL PROTECTED]> I want to correct some misperceptions about CGJ; it should not be used for ligatures. True. CGJ is a combining character that extends the grapheme cluster started before it, but it does not imply any linking with the next grapheme cluster starting at a base

RE: No Invisible Character - NBSP at the start of a word

2004-11-26 Thread Jony Rosenne
Normal printed text is hardly ever plain text. It contains headings, highlighted phrases, paragraphs etc. The Hebrew Bible has its unique non-plain text artifacts, such as Ketiv/Qere. If standardization is necessary, take it to the SGML people. Simple cases of Ketiv/Qere can be managed without ma

Re: Misuse of 8th bit [Was: My Querry]

2004-11-26 Thread Philippe Verdy
From: "Antoine Leca" <[EMAIL PROTECTED]> On Thursday, November 25th, 2004 08:05Z Philippe Verdy va escriure: In ASCII, or in all other ISO 646 charsets, code positions are ALL in the range 0 to 127. Nothing is defined outside of this range, exactly like Unicode does not define or mandate anything f

Re: Relationship between Unicode and 10646 (was: Re: Shift-JIS conversion.)

2004-11-26 Thread Philippe Verdy
From: "Doug Ewell" <[EMAIL PROTECTED]> My impression is that Unicode and ISO/IEC 10646 are two distinct standards, administered respectively by UTC and ISO/IEC JTC1/SC2/WG2, which have pledged to work together to keep the standards perfectly aligned and interoperable, because it would be destructiv

Re: No Invisible Character - NBSP at the start of a word

2004-11-26 Thread Peter Kirk
On 26/11/2004 03:40, Mark E. Shoulson wrote: ... I think part of what makes Biblical Hebrew so contentious is the unstated assumption that "the BHS text of the Bible *must* be considered plain-text." It's not necessarily so. It isn't necessarily a bad rule to work with, but it isn't one we sho

UTC rejects invisible character

2004-11-26 Thread Dean Snyder
Where can I read about the reasoning behind the UTC's rejection of the proposal to encode an invisible letter? Respectfully, Dean A. Snyder Assistant Research Scholar Manager, Digital Hammurabi Project Computer Science Department Whiting School of Engineering 218C New Engineering Building 3400

Hanunoo, Tagbanwa

2004-11-26 Thread Johannes Bergerhausen
We are looking for free or shareware fonts of Hanunoo and Tagbanwa. ...thank you for your help. Johannes, Mainz, Germany www.decodeunicode.org

Re: Misuse of 8th bit [Was: My Querry]

2004-11-26 Thread Antoine Leca
On Thursday, November 25th, 2004 08:05Z Philippe Verdy va escriure: > > In ASCII, or in all other ISO 646 charsets, code positions are ALL in > the range 0 to 127. Nothing is defined outside of this range, exactly > like Unicode does not define or mandate anything for code points > larger than 0x10

LocalisationDev.org invites input

2004-11-26 Thread Donald Z. Osborn
FYI, a recent localization development workshop in Warsaw (20-22/11/04) discussed a number of issues relating to software and documentation localization and produced a wiki and other resources. The appended announcement and request for input may be of interest to list members involved in this field