The Unicode Technical Committee has posted new issues for public review
and comment. Details are on the following web page:
http://www.unicode.org/review/
Review periods for the new items close on dates as early as December 31,
2003. Please see the page for relevant details, links, an
People interested in having their companies join the Unicode Consortium have
asked for collateral material to help explain to their management the value of
joining. The consortium has put together a short document that provides
background material that should be useful in doing this. It has been de
The Indian gov’t doc at http://tdil.mit.gov.in/ori-guru-telu.pdf describes the
conjunct shown in the attached PNG as being pronounced as though NNA + VIRAMA + DDA
(0B21). The component attached to the NNA otherwise represents TA (0B24), however.
My question is this: should this conjunct be enco
Theodore H. Smith asked:
> I've often wanted to type a symbol, that's like an exclamation mark,
> and a comma at the same time. That is, instead of the "." on the bottom
> of a "!", it has a "," instead.
> Is there such a Unicode code point? Just out of curiosity! Or I
> suppose, is there a way to
At 23:19 + 2003-11-26, Theodore H. Smith wrote:
I've often wanted to type a symbol, that's like an exclamation mark,
and a comma at the same time. That is, instead of the "." on the
bottom of a "!", it has a "," instead.
Is there such a Unicode code point?
No.
Or I suppose, is there a way t
I've often wanted to type a symbol, that's like an exclamation mark,
and a comma at the same time. That is, instead of the "." on the bottom
of a "!", it has a "," instead.
Is there such a Unicode code point? Just out of curiosity! Or I
suppose, is there a way to compose such a character.
Peter Kirk wrote:
> As there hasn't been a rush of on-list responses to this one, and partly
> in reply to the one off-list response, let me clarify the issue I
> have in mind.
>
> Instance A of a program P, version X, writes a Unicode character string
> S, in a particular normalisation form
> Use Base64 - it is stable through all normalisation forms.
The problem with Base64 (and worse yet, PUA characters for bytes), is that
it's inefficient. Base64 offers 6 bits per 8 (75%) on UTF-8, 6 bits per 16 (37%)
on UTF-16. You can get 15 bits per 16 (93%) on UTF-16 and 15 bits per 24 (62%)
on
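The percentages quoted above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the labels and the helper name `payload_ratio` are made up here, and the encoding that goes with "15 bits per 24" is left unnamed because the message is cut off at that point.

```python
from fractions import Fraction


def payload_ratio(payload_bits: int, encoded_bits: int) -> Fraction:
    """Fraction of the encoded size that carries actual payload."""
    return Fraction(payload_bits, encoded_bits)


# Figures taken from the message; the labels are illustrative only.
RATIOS = {
    "Base64, 6 bits per 8 (UTF-8)": payload_ratio(6, 8),
    "Base64, 6 bits per 16 (UTF-16)": payload_ratio(6, 16),
    "15 bits per 16 (UTF-16)": payload_ratio(15, 16),
    "15 bits per 24": payload_ratio(15, 24),
}

for label, ratio in RATIOS.items():
    # The message appears to round these percentages down to whole numbers.
    print(f"{label}: {float(ratio) * 100:.2f}%")
```

The exact values are 75%, 37.5%, 93.75% and 62.5%, which match the message's 75%, 37%, 93% and 62% once the fractions are dropped.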
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf
> Of Peter Kirk
> a sequence of combining characters
> following ZWNJ is a defective combining sequence.
For now, yes. This may change.
Peter
Peter Constable
Globalization Infrastructure and Font Techn
At 08:40 -0800 2003-11-26, Andrew C. West wrote:
No-one's disputing the origins of U+10341 and U+1034A. All I'm
saying is that these two letters are neither needed nor actually
used for writing Gothic words, but were devised (i.e. borrowed from
Greek) with the sole purpose of representing the n
On 26/11/2003 07:43, [EMAIL PROTECTED] wrote:
...
In all I would rather ban all defective sequences, by enforcing the W3C
character model. I don't see much point for them. The only possible reason I
can think of right now is to allow description of the character itself, though
that would possi
> The whole point of such a tool would be to send binary data on a transport
> that
> only allowed Unicode text. In practice, you'd also have to remap C0 and C1
> characters; but even then 0x00-0x1F -> U+0250-026F and 0x80-0x9F to
> U+0270-U+028F
> wouldn't be too complex. Unless you've added a Uni
On 26/11/2003 07:27, D. Starner wrote:
...
And let's be honest - every word written in Gothic, ever, fits on 68 pages of paper
(small font, and both sides, but still.) ...
Strictly, you mean all that survives and has been discovered in Gothic.
It is known that much more was written e.g. most of
I agree that the numeric values should be set properly where they exist,
following the precedent of other scripts. In practice, however, with non-decimal
systems the programmer will need to know much more about how the numbering
system works than just simply the numeric values, so the fact that we
> In all I would rather ban all defective sequences, by enforcing the W3C
> character model.
rect: by enforcing the use of full normalisation as defined in the W3C
character model.
On Wed, 26 Nov 2003 08:04:33 -0800, Peter Kirk wrote:
>
> On 26/11/2003 04:40, Andrew C. West wrote:
> >Is this perhaps because all the other Gothic letters
> >can also be used to represent numbers in exactly the same way that U+10341 and
> >U+1034A are used (these two letters were devised specific
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf
> Of [EMAIL PROTECTED]
> Can anyone let me know how I can create OTF fonts for Windows CE and
have a
> keyboard driver for input characters like Bangla.
For information on developing OpenType fonts, see
htt
On 26/11/2003 07:05, D. Starner wrote:
...
The whole point of such a tool would be to send binary data on a transport that
only allowed Unicode text. In practice, you'd also have to remap C0 and C1
characters; but even then 0x00-0x1F -> U+0250-026F and 0x80-0x9F to U+0270-U+028F
wouldn't be too c
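The remapping described above might look roughly like the following. This is a sketch, not the poster's code: the function names are hypothetical, and the choice to pass every other byte through as the code point of the same value (Latin-1 style) is an assumption, since the message only specifies the two control ranges. As the thread notes, a mapping like this ignores normalization.

```python
def bytes_to_text(data: bytes) -> str:
    """Map each byte to one code point, moving C0/C1 values out of the way."""
    out = []
    for b in data:
        if b <= 0x1F:                  # C0 range: 0x00-0x1F -> U+0250-U+026F
            out.append(chr(0x0250 + b))
        elif 0x80 <= b <= 0x9F:        # C1 range: 0x80-0x9F -> U+0270-U+028F
            out.append(chr(0x0270 + (b - 0x80)))
        else:                          # ASSUMPTION: all other bytes map to themselves
            out.append(chr(b))
    return "".join(out)


def text_to_bytes(text: str) -> bytes:
    """Inverse of bytes_to_text; rejects code points the mapping never emits."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if 0x0250 <= cp <= 0x026F:
            out.append(cp - 0x0250)
        elif 0x0270 <= cp <= 0x028F:
            out.append(0x80 + (cp - 0x0270))
        elif (0x20 <= cp <= 0x7F) or (0xA0 <= cp <= 0xFF):
            out.append(cp)
        else:
            raise ValueError(f"unexpected code point U+{cp:04X}")
    return bytes(out)
```

A round trip such as `text_to_bytes(bytes_to_text(b"\x00\x9fA"))` returns the original bytes. Note that 0x7F (DEL) is passed through unchanged here because the message does not mention it.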
On 26/11/2003 04:40, Andrew C. West wrote:
On Tue, 25 Nov 2003 16:16:15 -0800, "Doug Ewell" wrote:
Well, one reason could be that there is no such character. (Did you
mean U+1034A GOTHIC LETTER NINE HUNDRED?)
But why do U+10341 [GOTHIC LETTER NINETY] and U+1034A [GOTHIC LETTER NINE
HUNDRE
I agree that it would be best to have a formal policy on that;
but we could not assign different numbers without a contradiction in the way
that number_value and number_type are defined, so the relationship is
stable.
Mark
http://www.macchiato.com
In full agreement with Philippe here. But also, ever since I first
discovered Unicode, I have had the opinion that the descriptions in
what is now UCD.html are very confusingly worded.
For a start, the three types of numeric property are called "decimal
digit", "digit", and "numeric". Now, as
Quoting Philippe Verdy <[EMAIL PROTECTED]>:
> Peter Kirk [mailto:[EMAIL PROTECTED] writes:
> > Why is this a problem? Quotes and ">" with combining marks are
> > presumably not legal HTML or XML;
>
> You're wrong: it is legal in both HTML and XML. What is not specified
> correctly is the behavio
On 26/11/2003 06:17, Philippe Verdy wrote:
Peter Kirk [mailto:[EMAIL PROTECTED] writes:
Why is this a problem? Quotes and ">" with combining marks are
presumably not legal HTML or XML;
You're wrong: it is legal in both HTML and XML. What is not specified
correctly is the behavior of HTML
> The cost of such exceptions is that an application cannot reliably use the
> general categories to detect, evaluate or create numbers in a relevant
> script. So this requires a separate table for each supported script.
It's not a generally solvable problem. What's "C"? In the Latin script, that
On Nov 26, 2003, at 7:26 AM, [EMAIL PROTECTED] wrote:
But what about Devanagari or Bangla?
Devanagari and Bangla cannot be supported on Mac OS X through QuickDraw
text rendering. Since Office on the Mac is currently restricted to
QuickDraw text rendering, it cannot support them.
John H
Peter Kirk scripsit:
> There could of course be
> problems if there were any precomposed combinations of quotes or ">"
> with combining characters, but I don't think there are any, are there?
Just one: U+226F NOT GREATER-THAN is canonically equivalent to > followed
by U+0338 COMBINING LONG SOL
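The equivalence mentioned above can be observed with Python's standard `unicodedata` module. This is a small sketch with made-up variable names; it assumes a Python build whose Unicode data includes this long-standing decomposition.

```python
import unicodedata

# ">" followed by the combining overlay U+0338, as it might appear if a
# combining mark directly followed a tag-closing ">" in markup.
sequence = ">\u0338"

composed = unicodedata.normalize("NFC", sequence)       # composes to one code point
decomposed = unicodedata.normalize("NFD", "\u226f")     # decomposes back again

print([f"U+{ord(c):04X}" for c in composed])
print([f"U+{ord(c):04X}" for c in decomposed])
```

After NFC the literal ">" is no longer present as a separate code point, which is why a normalizing process applied to markup could change how a parser sees the text.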
> I see no reason why you accept some limitations for this
> encapsulation, but not ALL the limitations.
Because I can convert the data from binary to Unicode text in UTF-16
in a few lines of code if I don't worry about normalization. Suddenly
the rules become much more complex if I have to worry
At 15:40 +0100 2003-11-26, Philippe Verdy wrote:
The cost of such exceptions is that an application cannot reliably use the
general categories to detect, evaluate or create numbers in a relevant
script. So this requires a separate table for each supported script.
Um. It's not as if anyone does com
Michael Everson writes:
> >But why do U+10341 [GOTHIC LETTER NINETY] and U+1034A [GOTHIC LETTER NINE
> >HUNDRED], which are letters that are only ever used to represent the
> >numbers 90 and 900 respectively (they have no intrinsic phonetic
> >value), not have a numeric value assigned to them?
>
But what about Devanagari or Bangla?
Thanks and regards
Mustafa Jabbar
Quoting Peter Constable <[EMAIL PROTECTED]>:
> > -Original Message-
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> On Behalf
> > Of Michael Everson
>
> > Microsoft Office on OS X does not support Unicode.
>
>
Can anyone let me know how I can create OTF fonts for Windows CE and get a
keyboard driver for entering characters in scripts like Bangla?
Thanks and regards
Mustafa Jabbar
-
This mail sent through bangla.net, The First Online Internet Service Provider In
Bang
Peter Kirk [mailto:[EMAIL PROTECTED] writes:
> Why is this a problem? Quotes and ">" with combining marks are
> presumably not legal HTML or XML;
You're wrong: it is legal in both HTML and XML. What is not specified
correctly is the behavior of HTML and XML parsers when faced with an XML or HTML
document
D. Starner writes:
> > In the case of GIF versus JPG, which are usually regarded as "lossless"
> > versus "lossy", please note that there /is/ no "original", in the sense
> > of a stream of bytes. Why not? Because an image is not a stream of
> > bytes. Period.
>
> GIF isn't a compression scheme
Peter Kirk [peterkirk at qaya dot org] writes:
> On 25/11/2003 16:38, Doug Ewell wrote:
>
> >Philippe Verdy wrote:
> >
> >>So SCSU and BOCU-* formats are NOT general purpose compressors. As
> >>they are defined only in terms of stream of Unicode code points, they
> >>are assumed to follow the co
At 04:40 -0800 2003-11-26, Andrew C. West wrote:
But why do U+10341 [GOTHIC LETTER NINETY] and U+1034A [GOTHIC LETTER NINE
HUNDRED], which are letters that are only ever used to represent the
numbers 90 and 900 respectively (they have no intrinsic phonetic
value), not have a numeric value assign
> In the case of GIF versus JPG, which are usually regarded as "lossless"
> versus "lossy", please note that there /is/ no "original", in the sense
> of a stream of bytes. Why not? Because an image is not a stream of
> bytes. Period.
GIF isn't a compression scheme; it uses the LZW compression s
On Tue, 25 Nov 2003 16:16:15 -0800, "Doug Ewell" wrote:
>
> Well, one reason could be that there is no such character. (Did you
> mean U+1034A GOTHIC LETTER NINE HUNDRED?)
>
But why do U+10341 [GOTHIC LETTER NINETY] and U+1034A [GOTHIC LETTER NINE
HUNDRED], which are letters that are only ever
> In the case of GIF versus JPG, which are usually regarded as "lossless"
> versus "lossy", please note that there /is/ no "original", in the sense
> of a stream of bytes. Why not? Because an image is not a stream of
> bytes. Period. What is being compressed here is a rectangular array of
> pixe
At 11:20 -0800 2003-11-25, Peter Kirk wrote:
There has been a long running chorus of complaints about the lack of
support for Hebrew and Arabic in MS Office on the Mac, which is a
symptom of the lack of proper Unicode support. Microsoft and Apple
seem to blame one another. It seems likely that
On 26/11/2003 02:29, Philippe Verdy wrote:
[EMAIL PROTECTED] wrote:
Briefly, it's my opinion that applications which claim to support
and comply with Unicode should not 'step on' Unicode text. Any
loopholes in the 'letter of the law' which allow applications to
mung or reject Unicode text shou
On 25/11/2003 16:38, Doug Ewell wrote:
Philippe Verdy wrote:
So SCSU and BOCU-* formats are NOT general purpose compressors. As
they are defined only in terms of stream of Unicode code points, they
are assumed to follow the conformance clauses of Unicode. As they
recognize their input as Unic
-Message d'origine-
Arcane Jill writes:
> But actually, there is one small difference between what
> I said and what you said. I merely observed that no characters
> have different non-null values for the various number-related
> properties. But you state (emphasis on *cannot*) that this i
In the case of GIF versus JPG, which are usually regarded as "lossless"
versus "lossy", please note that there is no "original", in the
sense of a stream of bytes. Why not? Because an image is not a stream
of bytes. Period. What is being compressed here is a rectangular array
of pixels, and tha
[EMAIL PROTECTED] wrote:
> Briefly, it's my opinion that applications which claim to support
> and comply with Unicode should not 'step on' Unicode text. Any
> loopholes in the 'letter of the law' which allow applications to
> mung or reject Unicode text should be plugged.
If this "plugging" reque
That is almost precisely what I said. You repeated it perfectly. Thanks.
But actually, there is one small difference between what I said and
what you said. I merely observed that no characters have
different non-null values for the various number-related properties.
But you state (emphasis on
> >Most font developers restrict rights on their fonts. Obtaining a
> >legal copy of a font only grants the user the right to use the font;
> >not to make changes.
>
> Actually, a lot of font developers -- probably the majority -- explicitly
> allow modifications for personal use. What they do n
It is great! I have two PCs and three monitors, and in this configuration I see it
well, but:
I have only the normal Windows on my PCs, and my Internet Explorer shows only little
boxes in many of the cells in the table.
What can I do? Where can I find a font (TTF) with all of the characters that a