Unicode Public Review Issues update

2003-11-26 Thread Rick McGowan
The Unicode Technical Committee has posted new issues for public review and comment. Details are on the following web page: http://www.unicode.org/review/ Review periods for the new items close on dates as early as December 31, 2003. Please see the page for relevant details, links, an

Unicode Membership Collateral

2003-11-26 Thread Mark Davis
People interested in having their companies join the Unicode Consortium have asked for collateral material to help explain to their management the value of joining. The consortium has put together a short document that provides background material that should be useful in doing this. It has been de

Oriya: nndda / nnta?

2003-11-26 Thread Peter Constable
 The Indian gov’t doc at http://tdil.mit.gov.in/ori-guru-telu.pdf describes the conjunct shown in the attached PNG as being pronounced as though NNA + VIRAMA + DDA (0B21). The component attached to the NNA otherwise represents TA (0B24), however. My question is this: should this conjunct be enco

Re: Exclamation mark comma

2003-11-26 Thread Rick McGowan
Theodore H. Smith asked: > I've often wanted to type a symbol, that's like an exclamation mark, > and a comma at the same time. That is, instead of the "." on the bottom > of a "!", it has a "," instead. > Is there such a Unicode code point? Just out of curiosity! Or I > suppose, is there a way to

Re: Exclamation mark comma

2003-11-26 Thread Michael Everson
At 23:19 + 2003-11-26, Theodore H. Smith wrote: I've often wanted to type a symbol, that's like an exclamation mark, and a comma at the same time. That is, instead of the "." on the bottom of a "!", it has a "," instead. Is there such a Unicode code point? No. Or I suppose, is there a way t

Exclamation mark comma

2003-11-26 Thread Theodore H. Smith
I've often wanted to type a symbol, that's like an exclamation mark, and a comma at the same time. That is, instead of the "." on the bottom of a "!", it has a "," instead. Is there such a Unicode code point? Just out of curiosity! Or I suppose, is there a way to compose such a character.

Re: What is a process?

2003-11-26 Thread Timothy Partridge
Peter Kirk wrote: > As there hasn't been a rush of on-list responses to this one, and partly > in reply to the one off-list response, let me clarify the issue I am > have in mind. > > Instance A of a program P, version X, writes a Unicode character string > S, in a particular normalisation form

RE: Compression through normalization

2003-11-26 Thread D. Starner
> Use Base64 - it is stable through all normalisation forms. The problem with Base64 (and worse yet, PUA characters for bytes), is that it's inefficient. Base64 offers 6 bits per 8 (75%) on UTF-8, 6 bits per 16 (37%) on UTF-16. You can get 15 bits per 16 (93%) on UTF-16 and 15 bits per 24 (62%) on
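
The percentages quoted in this message can be checked with a few lines of arithmetic. The sketch below only recomputes payload bits divided by encoded bits for the four ratios given; the function name and labels are illustrative and not from the original post.

```python
from fractions import Fraction


def efficiency(payload_bits: int, encoded_bits: int) -> Fraction:
    """Fraction of the encoded size that carries actual payload."""
    return Fraction(payload_bits, encoded_bits)


# The four ratios mentioned in the message (labels are illustrative).
cases = {
    "6 bits per 8": efficiency(6, 8),
    "6 bits per 16": efficiency(6, 16),
    "15 bits per 16": efficiency(15, 16),
    "15 bits per 24": efficiency(15, 24),
}

for label, ratio in cases.items():
    print(f"{label}: {float(ratio):.2%}")
```

The exact values are 75%, 37.5%, 93.75% and 62.5%, so the figures in the post are rounded down.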

RE: Definitions

2003-11-26 Thread Peter Constable
> -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf > Of Peter Kirk > a sequence of combining characters > following ZWNJ is a defective combining sequence. For now, yes. This may change. Peter Peter Constable Globalization Infrastructure and Font Techn

Re: numeric properties of Nl characters in the UCD

2003-11-26 Thread Michael Everson
At 08:40 -0800 2003-11-26, Andrew C. West wrote: No-one's disputing the origins of U+10341 and U+1034A. All I'm saying is that these two letters are neither needed nor actually used for writing Gothic words, but were devised (i.e. borrowed from Greek) with the sole purpose of representing the n

Re: Definitions

2003-11-26 Thread Peter Kirk
On 26/11/2003 07:43, [EMAIL PROTECTED] wrote: ... In all I would rather ban all defective sequences, by enforcing the W3C character model. I don't see much point for them. The only possible reason I can think of right now is to allow description of the character itself, though that would possi

RE: Compression through normalization

2003-11-26 Thread jon
> The whole point of such a tool would be to send binary data on a transport > that > only allowed Unicode text. In practice, you'd also have to remap C0 and C1 > characters; but even then 0x00-0x1F -> U+0250-026F and 0x80-0x9F to > U+0270-U+028F > wouldn't be too complex. Unless you've added a Uni

Re: numeric properties of Nl characters in the UCD

2003-11-26 Thread Peter Kirk
On 26/11/2003 07:27, D. Starner wrote: ... And let's be honest - every word written in Gothic, ever, fits on 68 pages of paper (small font, and both sides, but still.) ... Strictly, you mean all that survives and has been discovered in Gothic. It is known that much more was written e.g. most of

Re: numeric properties of Nl characters in the UCD

2003-11-26 Thread Mark Davis
I agree that the numeric values should be set properly where they exist, following the precedent of other scripts. In practice, however, with non-decimal systems the programmer will need to know much more about how the numbering system works than just simply the numeric values, so the fact that we
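
To illustrate the point about non-decimal systems, here is a minimal sketch using the Roman numeral characters U+2160, U+2164 and U+2169, which carry numeric values in the UCD. Roman numerals are my choice of example, not one taken from the message, and the subtractive rule coded here is a simplified assumption about how such numerals are read.

```python
import unicodedata


def naive_sum(text: str) -> float:
    """Add up the per-character numeric values, ignoring position."""
    return sum(unicodedata.numeric(ch) for ch in text)


def roman_value(text: str) -> float:
    """Apply the subtractive rule: a smaller value before a larger one is subtracted."""
    values = [unicodedata.numeric(ch) for ch in text]
    total = 0.0
    for i, value in enumerate(values):
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


sample = "\u2169\u2160\u2164"  # ROMAN NUMERAL TEN, ONE, FIVE (reads as XIV)
print(naive_sum(sample), roman_value(sample))
```

The per-character values alone give 16, while the numbering system's own rule gives 14, which is the kind of extra knowledge the message refers to.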

RE: Definitions

2003-11-26 Thread jon
> In all I would rather ban all defective sequences, by enforcing the W3C > character model. rect: by enforcing the use of full normalisation as defined in the W3C character model.

Re: numeric properties of Nl characters in the UCD

2003-11-26 Thread Andrew C. West
On Wed, 26 Nov 2003 08:04:33 -0800, Peter Kirk wrote: > > On 26/11/2003 04:40, Andrew C. West wrote: > >Is this perhaps because all the other Gothic letters > >can also be used to represent numbers in exactly the same way that U+10341 and > >U+1034A are used (these two letters were devised specific

RE: numeric properties of Nl characters in the UCD

2003-11-26 Thread Peter Constable
> -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf > Of [EMAIL PROTECTED] > Can anyone let me know how I can create OTF fonts for Windows CE and have a > keyboard driver for input characters like Bangla. For information on developing OpenType fonts, see htt

Re: Compression through normalization

2003-11-26 Thread Peter Kirk
On 26/11/2003 07:05, D. Starner wrote: ... The whole point of such a tool would be to send binary data on a transport that only allowed Unicode text. In practice, you'd also have to remap C0 and C1 characters; but even then 0x00-0x1F -> U+0250-026F and 0x80-0x9F to U+0270-U+028F wouldn't be too c
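
The remapping quoted above (0x00-0x1F to U+0250-U+026F, 0x80-0x9F to U+0270-U+028F) can be sketched as follows. Mapping every other byte to the code point of the same value is my assumption, as are the function names; the quoted text only specifies the two control ranges.

```python
def byte_to_char(b: int) -> str:
    """Map one byte to a character, moving C0 and C1 controls out of the way."""
    if 0x00 <= b <= 0x1F:
        return chr(0x0250 + b)
    if 0x80 <= b <= 0x9F:
        return chr(0x0270 + (b - 0x80))
    return chr(b)  # assumption: all other bytes map to U+0020-U+007F / U+00A0-U+00FF


def char_to_byte(ch: str) -> int:
    """Inverse of byte_to_char; rejects characters the encoder never emits."""
    cp = ord(ch)
    if 0x0250 <= cp <= 0x026F:
        return cp - 0x0250
    if 0x0270 <= cp <= 0x028F:
        return cp - 0x0270 + 0x80
    if 0x20 <= cp <= 0x7F or 0xA0 <= cp <= 0xFF:
        return cp
    raise ValueError(f"unexpected character U+{cp:04X}")


def encode(data: bytes) -> str:
    return "".join(byte_to_char(b) for b in data)


def decode(text: str) -> bytes:
    return bytes(char_to_byte(ch) for ch in text)


payload = bytes(range(256))
assert decode(encode(payload)) == payload
```

Note that this sketch leaves 0x7F (DEL) unmapped and is not stable under normalisation: characters such as U+00C0 decompose under NFD, which is the concern raised elsewhere in this thread.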

Re: numeric properties of Nl characters in the UCD

2003-11-26 Thread Peter Kirk
On 26/11/2003 04:40, Andrew C. West wrote: On Tue, 25 Nov 2003 16:16:15 -0800, "Doug Ewell" wrote: Well, one reason could be that there is no such character. (Did you mean U+1034A GOTHIC LETTER NINE HUNDRED?) But why do U+10341 [GOTHIC LETTER NINETY] and U+1034A [GOTHIC LETTER NINE HUNDRE

Re: numeric properties of Nl characters in the UCD

2003-11-26 Thread Mark Davis
I agree that it would be best to have a formal policy on that; but we could not assign different numbers without a contradiction in the way that number_value and number_type are defined, so the relationship is stable. Mark http://www.macchiato.com

RE: numeric properties of Nl characters in the UCD

2003-11-26 Thread Arcane Jill
In full agreement with Philippe here. But also, ever since I first discovered Unicode, I have had the opinion that the descriptions in what is now UCD.html are very confusingly worded. For a start, the three types of numeric property are called "decimal digit", "digit", and "numeric". Now, as
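
As a concrete look at the three property types named above, the following sketch queries them through Python's standard `unicodedata` module. The helper name and the three sample characters are my own choices for illustration.

```python
import unicodedata


def numeric_properties(ch: str):
    """Return (decimal digit, digit, numeric) values, with None where a value is absent."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


for ch in ("5", "\u00b2", "\u00bd"):  # DIGIT FIVE, SUPERSCRIPT TWO, VULGAR FRACTION ONE HALF
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {numeric_properties(ch)}")
```

DIGIT FIVE has all three values, SUPERSCRIPT TWO has only the digit and numeric values, and VULGAR FRACTION ONE HALF has only the numeric value. Wherever a character has more than one of them, the values agree.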

RE: Definitions

2003-11-26 Thread jon
Quoting Philippe Verdy <[EMAIL PROTECTED]>: > Peter Kirk [mailto:[EMAIL PROTECTED] writes: > > Why is this a problem? Quotes and ">" with combining marks are > > presumably not legal HTML or XML; > > You're wrong: it is legal in both HTML and XML. What is not specified > correctly is the behavio

Re: Definitions

2003-11-26 Thread Peter Kirk
On 26/11/2003 06:17, Philippe Verdy wrote: Peter Kirk [mailto:[EMAIL PROTECTED] writes: Why is this a problem? Quotes and ">" with combining marks are presumably not legal HTML or XML; You're wrong: it is legal in both HTML and XML. What is not specified correctly is the behavior of HTML

RE: numeric properties of Nl characters in the UCD

2003-11-26 Thread D. Starner
> The cost of such exceptions is that an application cannot reliably use the > general categories to detect, evaluate or create numbers in a relevant > script. So this requires a separate table for each supported script. It's not a generally solvable problem. What's "C"? In the Latin script, that

Re: How can I have OTF for MacOS

2003-11-26 Thread John Jenkins
On Nov 26, 2003, at 7:26 AM, [EMAIL PROTECTED] wrote: But what about devnagri or Bangla. Devanagari and Bangla cannot be supported on Mac OS X through QuickDraw text rendering. Since Office on the Mac is currently restricted to QuickDraw text rendering, it cannot support them. John H

Re: Definitions

2003-11-26 Thread John Cowan
Peter Kirk scripsit: > There could of course be > problems if there were any precomposed combinations of quotes or ">" > with combining characters, but I don't think there are any, are there? Just one: U+226F NOT GREATER THAN is canonically equivalent to > followed by U+0338 COMBINING LONG SOL
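
The equivalence mentioned here can be checked with Python's standard `unicodedata` module. This is a minimal sketch; the variable names are illustrative.

```python
import unicodedata

sequence = ">\u0338"     # GREATER-THAN SIGN followed by the combining overlay U+0338
precomposed = "\u226f"   # the single precomposed character U+226F

# NFC composes the two-character sequence into the precomposed character.
composed = unicodedata.normalize("NFC", sequence)
# NFD decomposes the precomposed character back into the sequence.
decomposed = unicodedata.normalize("NFD", precomposed)

print(composed == precomposed, decomposed == sequence)
```

A parser that normalises to NFC before scanning for markup delimiters would therefore no longer see a literal ">" at that position.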

RE: Compression through normalization

2003-11-26 Thread D. Starner
> I see no reason why you accept some limitations for this > encapsulation, but not ALL the limitations. Because I can convert the data from binary to Unicode text in UTF-16 in a few lines of code if I don't worry about normalization. Suddenly the rules become much more complex if I have to worry

RE: numeric properties of Nl characters in the UCD

2003-11-26 Thread Michael Everson
At 15:40 +0100 2003-11-26, Philippe Verdy wrote: The cost of such exceptions is that an application cannot reliably use the general categories to detect, evaluate or create numbers in a relevant script. So this requires a separate table for each supported script. Um. It's not as if anyone does com

RE: numeric properties of Nl characters in the UCD

2003-11-26 Thread Philippe Verdy
Michael Everson writes: > >But why do U+10341 [GOTHIC LETTER NINETY] and U+1034A [GOTHIC LETTER NINE > >HUNDRED], which are letters that are only ever used to represent the > >numbers 90 and 900 respectively (they have no intrinsic phonetic > >value), not have a numeric value assigned to them? >

RE: How can I have OTF for MacOS

2003-11-26 Thread mjabbar
But what about devnagri or Bangla. Thanks and regards Mustafa Jabbar Quoting Peter Constable <[EMAIL PROTECTED]>: > > -Original Message- > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > On Behalf > > Of Michael Everson > > > Microsoft Office on OS X does not support Unicode. > >

Re: numeric properties of Nl characters in the UCD

2003-11-26 Thread mjabbar
Can anyone let me know how I can create OTF fonts for Windows CE and have a keyboard driver for input characters like Bangla. Thanks and regards Mustafa Jabbar - This mail sent through bangla.net, The First Online Internet Service Provider In Bang

RE: Definitions

2003-11-26 Thread Philippe Verdy
Peter Kirk [mailto:[EMAIL PROTECTED] writes: > Why is this a problem? Quotes and ">" with combining marks are > presumably not legal HTML or XML; You're wrong: it is legal in both HTML and XML. What is not specified correctly is the behavior of HTML and XML parsers face to a XML or HTML document

RE: Compression through normalization

2003-11-26 Thread Philippe Verdy
D. Starner writes: > > In the case of GIF versus JPG, which are usually regarded as "lossless" > > versus "lossy", please note that there /is/ no "original", in the sense > > of a stream of bytes. Why not? Because an image is not a stream of > > bytes. Period. > > GIF isn't a compression scheme

RE: Compression through normalization

2003-11-26 Thread Philippe Verdy
Peter Kirk [peterkirk at qaya dot org] writes: > On 25/11/2003 16:38, Doug Ewell wrote: > > >Philippe Verdy wrote: > > > >>So SCSU and BOCU-* formats are NOT general purpose compressors. As > >>they are defined only in terms of stream of Unicode code points, they > >>are assumed to follow the co

Re: numeric properties of Nl characters in the UCD

2003-11-26 Thread Michael Everson
At 04:40 -0800 2003-11-26, Andrew C. West wrote: But why do U+10341 [GOTHIC LETTER NINETY] and U+1034A [GOTHIC LETTER NINE HUNDRED], which are letters that are only ever used to represent the numbers 90 and 900 respectively (they have no intrinsic phonetic value), not have a numeric value assign

RE: Compression through normalization

2003-11-26 Thread D. Starner
> In the case of GIF versus JPG, which are usually regarded as "lossless" > versus "lossy", please note that there /is/ no "original", in the sense > of a stream of bytes. Why not? Because an image is not a stream of > bytes. Period. GIF isn't a compression scheme; it uses the LZW compression s

Re: numeric properties of Nl characters in the UCD

2003-11-26 Thread Andrew C. West
On Tue, 25 Nov 2003 16:16:15 -0800, "Doug Ewell" wrote: > > Well, one reason could be that there is no such character. (Did you > mean U+1034A GOTHIC LETTER NINE HUNDRED?) > But why do U+10341 [GOTHIC LETTER NINETY] and U+1034A [GOTHIC LETTER NINE HUNDRED], which are letters that are only ever

RE: Compression through normalization

2003-11-26 Thread jon
> In the case of GIF versus JPG, which are usually regarded as "lossless" > versus "lossy", please note that there /is/ no "original", in the sense > of a stream of bytes. Why not? Because an image is not a stream of > bytes. Period. What is being compressed here is a rectangular array of > pixe

Re: How can I have OTF for MacOS

2003-11-26 Thread Michael Everson
At 11:20 -0800 2003-11-25, Peter Kirk wrote: There has been a long running chorus of complaints about the lack of support for Hebrew and Arabic in MS Office on the Mac, which is a symptom of the lack of proper Unicode support. Microsoft and Apple seem to blame one another. It seems likely that

Re: Definitions

2003-11-26 Thread Peter Kirk
On 26/11/2003 02:29, Philippe Verdy wrote: [EMAIL PROTECTED] wrote: Briefly, it's my opinion that applications which claim to support and comply with Unicode should not 'step on' Unicode text. Any loopholes in the 'letter of the law' which allow applications to mung or reject Unicode text shou

Re: Compression through normalization

2003-11-26 Thread Peter Kirk
On 25/11/2003 16:38, Doug Ewell wrote: Philippe Verdy wrote: So SCSU and BOCU-* formats are NOT general purpose compressors. As they are defined only in terms of stream of Unicode code points, they are assumed to follow the conformance clauses of Unicode. As they recognize their input as Unic

RE: numeric properties of Nl characters in the UCD

2003-11-26 Thread Philippe Verdy
-Original Message- Arcane Jill writes: > But actually, there is one small difference between what > I said and what you said. I merely observed that no characters > have different non-null values for the various number-related > properties. But you state (emphasis on *cannot*) that this i

RE: Compression through normalization

2003-11-26 Thread Arcane Jill
In the case of GIF versus JPG, which are usually regarded as "lossless" versus "lossy", please note that there is no "original", in the sense of a stream of bytes. Why not? Because an image is not a stream of bytes. Period. What is being compressed here is a rectangular array of pixels, and tha

RE: Definitions

2003-11-26 Thread Philippe Verdy
[EMAIL PROTECTED] wrote: > Briefly, it's my opinion that applications which claim to support > and comply with Unicode should not 'step on' Unicode text. Any > loopholes in the 'letter of the law' which allow applications to > mung or reject Unicode text should be plugged. If this "plugging" reque

RE: numeric properties of Nl characters in the UCD

2003-11-26 Thread Arcane Jill
That is almost precisely what I said. You repeated it perfectly. Thanks. But actually, there is one small difference between what I said and what you said. I merely observed that no characters have different non-null values for the various number-related properties. But you state (emphasis on

RE: Request

2003-11-26 Thread jon
> >Most font developers restrict rights on their fonts. Obtaining a > >legal copy of a font only grants the user the right to use the font; > >not to make changes. > > Actually, a lot of font developers -- probably the majority -- explicitly > allow modifications for personal use. What they do n

Vá: Re: Unicode 4.0 Poster

2003-11-26 Thread Géza Rózsa
It is great! I have two PCs and three monitors, and in this configuration I see it well, but: I have only the normal Windows on my PCs, and my Internet Explorer shows only little boxes in many of the cells in the table. What can I do? Where can I find a font (TTF) with all of the characters that a