date:20031027

Re: Unicode and Script Encoding Initiative in San Jose Mercury News

2003-10-27 Thread Doug Ewell

Eric Muller wrote: > Doug Ewell wrote: > >> [...] about "You see, boys and girls, computers think only in >> numbers" -- in a Silicon Valley paper, > > > [...] Should we tell them about “real” quotes? > > “real quotes” are not just for Web publication; they are also for > email. > Throw in real d

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Philippe Verdy

From: "Peter Kirk" <[EMAIL PROTECTED]> > Thanks for the clarification. In principle we might be able to go a > little further: we could define both and as > canonically equivalent to c for all c in combining class zero. This > would have to be some kind of decomposition exception so that c is

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Philippe Verdy

From: "Peter Kirk" <[EMAIL PROTECTED]> > On 27/10/2003 10:31, Philippe Verdy wrote: > > > ... > > > >The bad thing is that there's no way to say that a superfluous > >CGJ character can be "safely" removed if CC(char1) <= CC(char2), > >so that it will preserve the semantic of the encoded text even

RE: unicode on Linux

2003-10-27 Thread Shao, Yiying

>>On Red Hat Linux, if UTF-8 is not made as the default encoding for >>Chinese/Japanese/Korean, what it is using for those double byte languages? >The old multi-byte character sets. For CJK, can UTF-8 be set as the locale for an app programmatically without affecting other apps? >>Does la

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Philippe Verdy

From: "Peter Kirk" <[EMAIL PROTECTED]> > each possible individually as a contraction. The Logical_Order_Exception > property (see http://www.unicode.org/reports/tr10/ section 3.1.3) just One bug report note here: The UTS#10 contains all references to several character properties, pointing to http

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Peter Kirk

On 27/10/2003 16:16, Philippe Verdy wrote: ... So, all we can do is to define compatibility equivalence between: and: if and only if: CC(c1) > CC(c2) > 0. This won't affect the NFC and NFD conversion algorithms, but it can affect the NFKC and NFKD conversion algorithms. This means that

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Philippe Verdy

From: "Peter Kirk" <[EMAIL PROTECTED]> > On 27/10/2003 12:28, Mark Davis wrote: > > >Collation is very different, and already has mechanisms for dealing with > >sequences. So no CGJ is needed there (except for case 2). > > > >Mark > > > > > > > Mark, can you outline what these mechanisms are or po

Re: [OT by now] Re: Traditional dollar sign

2003-10-27 Thread Kenneth Whistler

> ... Ironically, > in 1943-45 nickels were actually minted in silver, as nickel was considered > strategic for the war effort. Current nickels are 75% copper and 25% > nickel, the same as the cladding of the other coins. (Pennies are > copper-clad zinc, however.) Prior to 1982, pennies were a

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Philippe Verdy

> So, all we can do is to define compatibility equivalence between: > > and: > > if and only if: > CC(c1) > CC(c2) > 0. Oops! Of course, I really meant: All we can do is to define compatibility equivalence (NFK*) between: and: unless: CC(c1)

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Philippe Verdy

From: "Peter Kirk" <[EMAIL PROTECTED]> > On 27/10/2003 10:31, Philippe Verdy wrote: > > > ... > > > >The bad thing is that there's no way to say that a superfluous > >CGJ character can be "safely" removed if CC(char1) <= CC(char2), > >so that it will preserve the semantic of the encoded text even

Re: UAX #29 beta update (text breaks): apostrophe ./. H

2003-10-27 Thread Philippe Verdy

- Original Message - From: "Peter Kirk" <[EMAIL PROTECTED]> To: "Philippe Verdy" <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]> Sent: Tuesday, October 28, 2003 12:16 AM Subject: Re: UAX #29 beta update (text breaks): apostrophe ./. H > On 27/10/2003 13:34, Philippe Verdy wrote: > > >The pr

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Peter Kirk

On 27/10/2003 10:31, Philippe Verdy wrote: ... The bad thing is that there's no way to say that a superfluous CGJ character can be "safely" removed if CC(char1) <= CC(char2), so that it will preserve the semantic of the encoded text even though such filtered text would not be canonically equivale

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Peter Kirk

On 27/10/2003 12:28, Mark Davis wrote: Collation is very different, and already has mechanisms for dealing with sequences. So no CGJ is needed there (except for case 2). Mark Mark, can you outline what these mechanisms are or point me to a definition e.g. in a section of UTR #10? As I had und

Re: UAX #29 beta update (text breaks): apostrophe ./. H

2003-10-27 Thread Peter Kirk

On 27/10/2003 13:34, Philippe Verdy wrote: The proposed update to UAX#29 contains this text: Apostrophe is another tricky case. Usually considered part of one word ("can 't", "aujourd'hui") ... ... So in French we also have the additional word break rule: hyphens ÷ LatinLetterH This case is

Re: UAX #29 beta update (text breaks): apostrophe ./. H

2003-10-27 Thread Philippe Verdy

I wrote: > So in French we also have the additional word break rule: > > hyphens ÷ LatinLetterH > > This case is not documented... But I forgot to speak about the common exception "aujourd'hui" (today) where the apostrophe was originally an elision resulting from the contraction of "au jour de

UAX #29 beta update (text breaks): apostrophe ./. H

2003-10-27 Thread Philippe Verdy

The proposed update to UAX#29 contains this text: Apostrophe is another tricky case. Usually considered part of one word ("can 't", "aujourd'hui") it may also be considered two ("l'objectif"). Also, one cannot easily distinguish the cases where it is used as a quotation mark from those where it is

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Philippe Verdy

From: "Peter Constable" <[EMAIL PROTECTED]> > There is no problem requiring a solution for combining marks used with > Latin script,* including IPA and Vietnamese, because all of the marks > that occupy a comparable space relative to the base have the same > combining class, meaning that normaliza

Re: U+0BA3, U+0BA9

2003-10-27 Thread Kenneth Whistler

Peter Jacobi asked: > Doug, Kenneth, All, > > I'm somewhat confused. I assume I'm lacking a lot > of background, but I can't interpolate successfully between > your answers: > > "Doug Ewell" <[EMAIL PROTECTED]> wrote: > > The Unicode character names attempt to be (a) unique and (b) reasonably >

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Philippe Verdy

From: "Mark Davis" <[EMAIL PROTECTED]> > the UTC decision: > > [96-C20] Consensus: Add text to Unicode 4.0.1 which points out that combining > grapheme joiner has the effect of preventing the canonical re-ordering of > combining marks during normalization. [L2/03-235, L2/03-236, L2/03-234] > > [96-
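
[Editor's illustration, not part of the archived thread] The effect named in decision 96-C20 above can be sketched in a few lines. This is a minimal sketch, assuming Python's standard unicodedata module and an arbitrarily chosen Hebrew example (alef with patah and hiriq); the variable names and the helper code_points are hypothetical, introduced only for this illustration.

```python
import unicodedata

# Characters chosen for illustration only (an assumption, not from the thread).
CGJ = "\u034F"    # COMBINING GRAPHEME JOINER, canonical combining class 0
ALEF = "\u05D0"   # HEBREW LETTER ALEF, a base letter (class 0)
PATAH = "\u05B7"  # HEBREW POINT PATAH, combining class 17
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ, combining class 14


def code_points(text):
    """Return the text as a list of U+XXXX labels, for readable comparison."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Marks typed in the order patah, hiriq: classes 17 then 14, so out of canonical order.
plain = ALEF + PATAH + HIRIQ
# Same marks, separated by CGJ, whose class 0 blocks reordering across it.
with_cgj = ALEF + PATAH + CGJ + HIRIQ

reordered = unicodedata.normalize("NFD", plain)
preserved = unicodedata.normalize("NFD", with_cgj)

print(code_points(plain), "->", code_points(reordered))
print(code_points(with_cgj), "->", code_points(preserved))
```

Without the joiner, normalization sorts the two marks by combining class; with it, the typed order survives, at the cost of an extra character that the result is no longer canonically equivalent to the plain sequence.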

Re: Traditional dollar sign

2003-10-27 Thread Kenneth Whistler

Doug Ewell noted: > The dollar sign was used > occasionally for decoration on large-sized (pre-1929) U.S. currency, but > not on small-sized issues (except for the bank-only $100,000 note). And very rarely even at that. See: http://www.money.org/bebeeexhibit.html for many exhibits of all kinds

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Mark Davis

Collation is very different, and already has mechanisms for dealing with sequences. So no CGJ is needed there (except for case 2). Mark __ http://www.macchiato.com - Original Message - From: "Peter Kirk" <[EMAIL PROTECTED]> To: "M

RE: Unicode support for Khmer

2003-10-27 Thread Sue and Maurice Bauhahn

According to my understanding OpenType fonts for Khmer Unicode are available from: Om Mony ([EMAIL PROTECTED]) Danh Hong ([EMAIL PROTECTED]) Masavang Sean ([EMAIL PROTECTED]) These should work in Microsoft Office 2003 on Windows (especially Microsoft Publisher) for display of Khmer characters bec

RE: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Peter Constable

Philippe Verdy wrote: > This principle may help solve the ambiguities in all those affected > scripts > (maybe there are similar issues in the Latin script for Vietnamese, which > would like to better fit the phonetics of words that may be incorrectly > rendered by the currently required normaliz

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Philippe Verdy

From: "Peter Kirk" <[EMAIL PROTECTED]> > I don't see any difference between your proposed generic CCO and CGJ. As > you say, the same function may be needed in several scripts, including > perhaps IPA which uses complex diacritic stacking. So why not simply use > CGJ? Why not effectively, but

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Philippe Verdy

From: "Peter Kirk" <[EMAIL PROTECTED]> > I am not sure what you mean by "further normalization steps for Hebrew". Of course I don't mean that NF* algorithms must be changed. See below. > If this means that users will be expected to input Hebrew in this order, > perhaps with a keyboard driver wh

Re: Traditional dollar sign

2003-10-27 Thread Michael Everson

At 20:45 -0800 2003-10-26, Doug Ewell wrote: The European Commission might have chosen to follow this example 30 years later, instead of trying to mandate that the Euro glyph remain invariant in all fonts and contexts. Doug, give that one a rest, OK? That was in 1996. -- Michael Everson * * Everso

Unicode support for Khmer

2003-10-27 Thread Patrick Andries

Someone has asked me what Unicode support for Khmer is currently available in major software products (OpenType fonts, MS Office, Windows, Mac, browsers and keyboards). Could someone point me to current material on this topic or -- better -- summarize this support? P. A.

RE: Traditional dollar sign

2003-10-27 Thread jim

Simon Butcher wrote: My bank (ANZ) recently gave me literature related to obtaining foreign currency, and used the form $A (that is, with the double-bar form of the dollar sign, not the single-bar form). Considering the small glossy leaflet was about the rising Australian dollar, it's evidently

Re: New contribution N2676

2003-10-27 Thread Richard Peevers

Raymond, Apropos 10186 G GREEK ARTABE SIGN The identity of one glyph variant of ‘zero’ and one of ‘artabe’ raises an interesting problem. If you look at e.g. ‘Siglae’ in RE 2.2 (1923) 2279-2315 you’ll see that Bilabel lists 16 glyph variants for the Artabe. The most common varia

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Peter Kirk

On 27/10/2003 08:45, Mark Davis wrote: Thank you for the interesting thoughts. As I understand your suggestion, and bearing in mind that dagesh (and the rare rafe) are also consonant modifiers, you are effectively suggesting an order (already normalised): consonant dagesh rafe shin/sin-dot CGJ rig

Re: Traditional dollar sign

2003-10-27 Thread Norbert Lindenberg

The holographic strip on the Euro notes shows the Euro symbol when viewed at certain angles. Norbert Peter Kirk wrote: > > The latest issue of UK banknotes do carry the pound sterling sign (with > one crossbar), but this is quite new. At least the more recent former > issues did not, if I rememb

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Mark Davis

> Thank you for the interesting thoughts. As I understand your suggestion, > and bearing in mind that dagesh (and the rare rafe) are also consonant > modifiers, you are effectively suggesting an order (already normalised): > > consonant dagesh rafe shin/sin-dot CGJ right-meteg CGJ vowel accent CGJ

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Peter Kirk

On 27/10/2003 06:54, Philippe Verdy wrote: Thanks a lot for these precisions on Hebrew usages that need those combining order overrides. This demonstrates that this occurs relatively infrequently, and so introducing an ignorable "combining order override" control makes sense, without needing to ad

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Peter Kirk

On 27/10/2003 07:28, Philippe Verdy wrote: From: "Peter Kirk" <[EMAIL PROTECTED]> So the logical order is . But the canonical order is ; up to three (and in theory more, at least in biblical Hebrew) other characters may appear between the base letter and the dot which fundamentally modifies it

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Philippe Verdy

From: "Peter Kirk" <[EMAIL PROTECTED]> > So the logical order is > . > But the canonical order is > ; > up to three (and in theory > more, at least in biblical Hebrew) other characters may appear between > the base letter and the dot which fundamentally modifies it. Ohh, I forgot the case of the

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Philippe Verdy

Thanks a lot for these precisions on Hebrew usages that need those combining order overrides. This demonstrates that this occurs relatively infrequently, and so introducing an ignorable "combining order override" control makes sense, without needing to add duplicate codepoints with corrected proper

RE: Traditional dollar sign

2003-10-27 Thread Simon Butcher

Hi! > However, the presence of two opposing conventions serves as a strong > hint that there was no consensus in 1966, nor now, as to how glyph > variants of the dollar sign were to be used to stand for > different types > of dollars. I went to school in the 1980's, and both in Victoria and Ta

Re: [OT by now] Re: Traditional dollar sign

2003-10-27 Thread John Cowan

Asmus Freytag scripsit: > Many monetary systems have coin sizes and weights that are based on > the traditional precious or semi-precious metals once used. The nick- > name for the nickel gives that away, associating it with a different > metal than the (presumably once) silver-based dime/quarter/

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread rosennej

I am on a business trip abroad with only limited e-mail access. I will try to respond next week when I'm back home. Jony

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Peter Kirk

On 26/10/2003 19:58, John Hudson wrote: ... Functionally, inserting a CGJ here resolves the problem fine. I'm just not convinced that CGJ is a good general solution to the normalisation problem: it works, but it requires deliberate insertion in every place where unwanted mark re-ordering may oc

Re: Merging combining classes, was: New contribution N2676

2003-10-27 Thread Peter Kirk

On 26/10/2003 12:51, Jony Rosenne wrote: While the current combining classes may cause some difficulties for Biblical scholars (and this isn't cut and dry yet - it isn't certain whether these are Unicode problem, implementation problems, missing characters or mis-identified characters), I have yet

Re: [OT by now] Re: Traditional dollar sign

2003-10-27 Thread Peter Kirk

On 26/10/2003 21:30, Doug Ewell wrote: ... In my limited experience, that word DIME has done more to confuse furriners than anything else about the U.S. and Canadian monetary systems. The dime is the smallest coin in the set physically, weighing less than half as much as a nickel, and made of (a

Re: Traditional dollar sign

2003-10-27 Thread Peter Kirk

On 26/10/2003 20:08, John Cowan wrote: Kevin Brown scripsit: Incidentally, as far as I know, neither the dollar symbol nor cent symbol have ever appeared on Australia's paper money or coinage. Is this unusual? I can't speak for the whole of the last two centuries, but certainly current

Re: [OT by now] Re: Traditional dollar sign

2003-10-27 Thread Asmus Freytag

At 09:30 PM 10/26/03 -0800, Doug Ewell wrote: > I can't speak for the whole of the last two centuries, but certainly > current American bills and coins do not use either symbol. The bills > in common use say ONE DOLLAR, FIVE DOLLARS, TEN DOLLARS, and TWENTY > DOLLARS; the coins say ONE CENT, FIVE

44 matches
