I want to use those for Desktop Publishing including MS Office for Mac, Quark
XPress for Mac, Adobe apps, etc.
Thanks and regards
Mustafa Jabbar
Quoting Peter Constable [EMAIL PROTECTED]:
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of
On Mon, Nov 24, 2003 at 06:53:25PM -0800,
Mark Davis [EMAIL PROTECTED] wrote
a message of 24 lines which said:
Good test of your browser!
(Mozilla Firebird croaks on it. Opera works, but has ugly formatting. IE works.
Haven't tried any others.)
On my machine it works fine; it is not even
At 15:25 +0600 2003-11-25, [EMAIL PROTECTED] wrote:
I want to use those for Desktop Publishing including MS Office for Mac, Quark
XPress for Mac, Adobe apps, etc.
Microsoft Office on OS X does not support Unicode.
Quark XPress on OS X does not support Unicode.
Adobe InDesign on OS X does not
Mark Davis wrote:
I remembered that I had done something with making a Unicode
Poster some time
ago. Dusted it off, and posted the results.
Voila, every Unicode character in 4.0:
http://www.macchiato.com/unicode/UnicodeChart.zip
Columns: 256, Rows: 410
all unassigned rows are
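A guess at the layout arithmetic (my own reconstruction, not Mark's actual
generator): with 256 columns, each row covers one 256-code-point block, and
fully unassigned rows are dropped, which is how 410 rows can reach code
points far beyond 410 * 256.

    def chart_cell(cp):
        # Column is the low byte; the row is the code point's
        # 256-code-point block, minus however many unassigned
        # blocks were dropped before it.
        return cp >> 8, cp & 0xFF

    print(chart_cell(0x1D11E))  # MUSICAL SYMBOL G CLEF -> (0x1D1, 0x1E)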
Christopher John Fynn wrote:
Peter Kirk [EMAIL PROTECTED] wrote:
This approach would certainly have simplified pointed Hebrew a lot, so
much so that it could well be serious. After all, Ethiopic was encoded
as a syllabary just because the vowel points happen to have become
attached to
I'm pretty sure it depends on whether you regard a text document as a
sequence of characters, or as a sequence of glyphs. (Er - I mean
default grapheme clusters of course). Regarded as a sequence of
characters, normalisation changes that sequence. But regarded as a
sequence of glyphs,
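The "sequence of characters" point is easy to demonstrate with Python's
unicodedata module (a sketch, not from the thread):

    import unicodedata

    s = "e\u0301"                          # 'e' + COMBINING ACUTE ACCENT
    nfc = unicodedata.normalize("NFC", s)  # the single code point U+00E9
    print(len(s), len(nfc))                # 2 1
    # The code point sequence changed, but the two strings remain
    # canonically equivalent: decomposing the NFC form recovers the
    # original sequence.
    assert unicodedata.normalize("NFD", nfc) == s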
Actually, I don't understand why UnicodeData.txt has no less than three
different fields for numerical value anyway. I mean, it's not as though
there exists EVEN A SINGLE CODEPOINT for which two or more of these
fields exist and are defined differently from each other. One never
sees, for
At 11:53 +0100 2003-11-25, Bertrand Laidain wrote:
On OS X the Hebrew keyboard is part of the Unicode bundle, so it is
inputting Hebrew with Unicode, right?
And this works with InDesign ME 2.
What is that? A special Middle Eastern version?
--
Michael Everson * * Everson Typography * *
On OS X the Hebrew keyboard is part of the Unicode bundle, so it is
inputting Hebrew with Unicode, right?
And this works with InDesign ME 2.
So I had the impression that ID ME 2 supports Unicode input (at
least with this keyboard).
Bertrand
On 25 Nov 03, at 11:01, Michael Everson wrote:
I am not sure if this is a point that really involves Unicode
blocks, but someone on this list might have a comment.
In Word 2002 there is one bug that is cleared up in Word 2003
(at least in the Beta, which I have been playing with).
In Word 2002 the Style may assign one particular font for
Yes, there is a special ME version for Arabic and Hebrew since PageMaker
(5.x, I think)
See http://www.adobeme.com/ or http://www.winsoft.fr/
(there is also CE version for Central European and Cyrillic)
Bertrand
On 25 Nov 03, at 11:52, Michael Everson wrote:
At 11:53 +0100 2003-11-25, Bertrand
On 24/11/2003 16:56, Philippe Verdy wrote:
Peter Kirk writes:
If conformance clause C10 is taken to be operable at all levels, this
makes a nonsense of the concept of normalisation stability within
databases etc.
I don't think that the stability of normalization influences this: as long
On 24/11/2003 17:56, Christopher John Fynn wrote:
Peter Kirk [EMAIL PROTECTED] wrote:
This approach would certainly have simplified pointed Hebrew a lot, so
much so that it could well be serious. After all, Ethiopic was encoded
as a syllabary just because the vowel points happen to have
At 12:16 +0100 2003-11-25, Bertrand Laidain wrote:
Yes, there is a special ME version for Arabic and Hebrew since
PageMaker (5.x, I think)
See http://www.adobeme.com/ or http://www.winsoft.fr/
(there is also CE version for Central European and Cyrillic)
Adobe is on its way to Unicode support,
Michael Everson writes
Eudora on OS X does not support Unicode.
Eudora doesn't support Unicode anywhere, surely? To my knowledge on a PC
the only mail handler that is Unicode compliant is Outlook Express.
Raymond Mercier
At 03:41 -0800 2003-11-25, Peter Kirk wrote:
After all, Ethiopic was encoded as a syllabary just because the
vowel points happen to have become attached to the base characters.
Ridiculous. This happened centuries ago, and it is not why Ethiopic
was encoded as a syllabary. It was encoded as a
On 25/11/2003 03:54, Michael Everson wrote:
At 03:41 -0800 2003-11-25, Peter Kirk wrote:
...
But the floodgates have already been opened - not just Ethiopic but
Greek extended, much of Latin extended, the Korean syllables which
started this discussion, the small amount of precomposed Hebrew
Raymond Mercier scripsit:
Michael Everson writes
Eudora doesn't support Unicode anywhere, surely? To my knowledge on a PC
the only mail handler that is Unicode compliant is Outlook Express.
Mozilla and Mozilla Thunderbird.
--
A poetical purist named Cowan [that's me: [EMAIL
I was wondering: what exactly does GB-18030 certification consist of?
I guess that some tests are done on the software, but what exactly? Also, where
and who performs this certification? Does the Chinese government do it
directly, or is it out-sourced to external agencies? Does this have to be in
Eudora doesn't support Unicode anywhere, surely? To my knowledge on a PC
the only mail handler that is Unicode compliant is Outlook Express.
Raymond Mercier
I'm pretty sure Mozilla does on Windows. Of course, if we're actually
talking about the PC, I believe all the mailers on BeOS always
On Mon, 24 Nov 2003 15:47:16 +0100 (CET), Philippe VERDY wrote:
[HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\International\Scripts\42]
IEFixedFontName=Code2001
IEPropFontName=Code2001
This setting is incorrect: the script IDs go between 3 and 40,
See
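For anyone wanting to script such a change, a hypothetical sketch using
Python's winreg module (Windows only); the script ID 3 below is merely a
placeholder for whichever slot in the valid 3-40 range is actually intended:

    import winreg

    # Write IE's per-script font overrides under a script ID inside the
    # valid range (3..40). Which ID maps to which script is not shown in
    # the thread, so 3 here is just a placeholder.
    key = winreg.CreateKey(
        winreg.HKEY_CURRENT_USER,
        r"Software\Microsoft\Internet Explorer\International\Scripts\3",
    )
    winreg.SetValueEx(key, "IEFixedFontName", 0, winreg.REG_SZ, "Code2001")
    winreg.SetValueEx(key, "IEPropFontName", 0, winreg.REG_SZ, "Code2001")
    winreg.CloseKey(key)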
OK, I stand corrected on Mozilla!
Raymond Mercier
Actually, I don't understand why UnicodeData.txt has no less than
/three/ different fields for numerical value anyway. I mean, it's not as
though there exists EVEN A SINGLE CODEPOINT for which two or more of
these fields exist and are defined differently from each other. One
never sees,
On 25/11/2003 03:42, Raymond Mercier wrote:
Michael Everson writes
Eudora on OS X does not support Unicode.
Eudora doesn't support Unicode anywhere, surely? To my knowledge on a PC
the only mail handler that is Unicode compliant is Outlook Express.
Raymond Mercier
Both Mozilla and
Michael Everson scripsit:
Ridiculous. This happened centuries ago, and it is not why Ethiopic
was encoded as a syllabary. It was encoded as a syllabary because it
is a syllabary.
Structurally it's an abugida, like Indic and UCAS.
You are, because the floodgates, while once open, have been
At 08:23 -0500 2003-11-25, John Cowan wrote:
Michael Everson scripsit:
Ridiculous. This happened centuries ago, and it is not why Ethiopic
was encoded as a syllabary. It was encoded as a syllabary because it
is a syllabary.
Structurally it's an abugida, like Indic and UCAS.
I disagree. And I
I've been looking at the Vietnamese readings given in the Unihan database
recently, and although I don't know Vietnamese, I think there may be something
not quite right with some of them, and so I wondered if anyone on this list who
knows Vietnamese could confirm the validity of the Unihan
So it's the absence of stability which would make this rearrangement
of normalization forms impossible...
Canonical equivalence is unaffected if combining classes are rearranged,
though not if they are split or joined. It is only the normalised forms
of strings which may be changed. So
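The role the class values play is visible in how normalization sorts marks;
a sketch: if the relative values 220 and 230 were swapped, the normalized
form would change, but the two inputs would still normalize identically.

    import unicodedata

    # a + COMBINING DOT BELOW (ccc=220) + COMBINING ACUTE (ccc=230),
    # and the same marks typed in the opposite order.
    s1 = "a\u0323\u0301"
    s2 = "a\u0301\u0323"
    # Canonical ordering sorts the marks by combining class, so both
    # inputs normalize to the same string: they are canonically equivalent.
    assert unicodedata.normalize("NFD", s1) == unicodedata.normalize("NFD", s2)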
On 25/11/2003 07:22, Philippe Verdy wrote:
...
Composition exclusions, as well as the relative order of canonical classes,
have a lower impact, as they don't affect canonical equivalence of strings,
and thus won't affect applications based on the Unicode C10 definition; they
are important only to
Normalization may or may not have an effect on compression. It has
definitely been shown to have an effect on Hebrew combining marks.
I must ask, however, that we try to keep these issues separate in
discussion, and not let the compression topic, if there is to be any,
degenerate into a wing of
From: Peter Kirk [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 25 November 2003 17:06
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: Normalisation stability, was: Compression through
normalization
On 25/11/2003 07:22, Philippe Verdy wrote:
...
Composition exclusions have a
John Cowan writes:
You are, because the floodgates, while once open, have been closed by
normalization.
Indeed, they were opened in Unicode 1.1, as a result of the merger with
FDIS 10646; since then, only 46 characters with canonical decompositions
have been added to Unicode (excepting
Philippe Verdy scripsit:
The case of the Latin letters with two diacritics added in Latin Extended-B
does not seem to respect this constraint, as it is not justified by the
Vietnamese VISCII standard, which does not contain characters with two
diacritics but already composes them with
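For a concrete example of a Vietnamese letter carrying two diacritics (these
live in the Latin Extended Additional block), a quick unicodedata check:

    import unicodedata

    # U+1EBF LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE decomposes
    # fully into a base letter plus two combining marks.
    nfd = unicodedata.normalize("NFD", "\u1ebf")
    print([f"U+{ord(c):04X}" for c in nfd])  # ['U+0065', 'U+0302', 'U+0301']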
Philippe Verdy scripsit:
I just wonder however why it was crucial (as Unicode says in its
Definitions chapter) to expect a relative order of distinct non-zero
combining classes. For me these combining classes are arbitrary, not only in
their absolute value as they are now, but even in their
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Michael Everson
Microsoft Office on OS X does not support Unicode.
My understanding is that Word for Mac, in MS Office Mac versions since
Office 98, has used the same file format as Windows versions --
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Arcane Jill
Actually, I don't understand why UnicodeData.txt has no less than
three different
fields for numerical value anyway...
Not all characters representing numbers are digits. Not all characters
representing digits are
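That distinction maps directly onto the three lookups exposed by Python's
unicodedata module, an illustrative sketch rather than part of the original
exchange:

    import unicodedata

    # U+00B2 SUPERSCRIPT TWO is a digit, but not a decimal digit: it does
    # not combine positionally with others to write decimal numbers.
    two = "\u00b2"
    print(unicodedata.decimal(two, None))  # None: no decimal-digit value
    print(unicodedata.digit(two))          # 2
    print(unicodedata.numeric(two))        # 2.0

    # U+00BD VULGAR FRACTION ONE HALF represents a number but not a digit,
    # so only the general numeric field is defined.
    half = "\u00bd"
    print(unicodedata.digit(half, None))   # None
    print(unicodedata.numeric(half))       # 0.5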
John Cowan writes:
Since it adds efficiency to normalize only once,
it is worthwhile to define a few normalization forms and urge
people to produce text in one of them, so that receivers need not
normalize but need only check for normalization, typically much cheaper.
I'm not convinced that
At 09:51 -0800 2003-11-25, Peter Constable wrote:
My understanding is that Word for Mac, in MS Office Mac versions since
Office 98, has used the same file format as Windows versions -- Word 97
and later. That means that Word for Mac can read files containing any
Unicode characters.
It doesn't
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Ritu Malhotra
Could someone kindly help me by providing an exe (font utility) that
will not only edit OpenType fonts (e.g. Mangal.ttf)...
Making changes to mangal.ttf or other Microsoft fonts would be in
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
The case of the Latin letters with two diacritics added in Latin
Extended-B does not seem to respect this constraint, as it is not
justified by the Vietnamese VISCII standard, which does not
contain characters with two
Peter Kirk peterkirk at qaya dot org wrote:
Well, Doug, I see your point; different topics should be kept
separate. But I changed the subject line precisely because the thread
has shifted from discussion of compression to a general discussion of
normalisation stability.
That's true; most
On 25/11/2003 10:03, John Cowan wrote:
... And as for
canonical equivalence, the most efficient way to compare strings for
it is to normalize both of them in some way and then do a raw
binary compare. Since it adds efficiency to normalize only once,
it is worthwhile to define a few
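A sketch of that scheme, assuming Python 3.8+, whose
unicodedata.is_normalized provides the cheap check:

    import unicodedata

    def canonically_equal(a, b):
        # Normalize both sides, then do a plain binary comparison.
        return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

    def accept(text):
        # A receiver expecting normalized input runs the cheap check
        # first and pays for normalization only when the check fails.
        if unicodedata.is_normalized("NFC", text):
            return text
        return unicodedata.normalize("NFC", text)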
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
I'm not convinced that there's a significant improvement when only
checking for normalization but not performing it. It requires at least
a list of the characters that are acceptable in a normalization form,
as well as their combining
Here's a summary of the responses so far:
* Philippe Verdy and Jill Ramonsky say YES, a compressor can
normalize, because it knows it is operating on Unicode character data
and can take advantage of Unicode properties.
* Peter Kirk and Mark Shoulson say NO, it can't, because all the
The fields are the way they are for backwards compatibility.
If you look at the UCD.html, you will see that the actual properties are
separated:
http://www.unicode.org/Public/UNIDATA/UCD.html#Numeric_Type
I'd like to remind people again that you
should read the documentation in UCD.html
Peter Kirk scripsit:
If receivers are expected to check for normalisation, they are
presumably expected also to normalise
Not so. An alternative behavior, which is preferred in certain circumstances,
is to reject the input, or at least to advise higher layers that the input
may be invalid.
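That reject-rather-than-normalize alternative, as a minimal sketch (again
assuming Python 3.8+ for is_normalized):

    import unicodedata

    def require_nfc(text):
        # Reject input that is not already in NFC instead of silently
        # fixing it, so higher layers can decide how to treat
        # possibly-invalid data.
        if not unicodedata.is_normalized("NFC", text):
            raise ValueError("input is not NFC-normalized")
        return text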
On 25/11/2003 10:26, Michael Everson wrote:
At 09:51 -0800 2003-11-25, Peter Constable wrote:
My understanding is that Word for Mac, in MS Office Mac versions since
Office 98, has used the same file format as Windows versions -- Word 97
and later. That means that Word for Mac can read files
On 25/11/2003 11:15, John Cowan wrote:
Peter Kirk scripsit:
If receivers are expected to check for normalisation, they are
presumably expected also to normalise
Not so. An alternative behavior, which is preferred in certain circumstances,
is to reject the input, or at least to advise
Michael Everson [EMAIL PROTECTED] wrote:
At 09:51 -0800 2003-11-25, Peter Constable wrote:
My understanding is that Word for Mac, in MS Office Mac versions since
Office 98, has used the same file format as Windows versions -- Word 97
and later. That means that Word for Mac can read files
On my home page I have a link to a brief paper on minimal size for an NFC
normalizer.
http://www.macchiato.com/, see Normalization Footprint
It was for Unicode 3.0, but the sizes shouldn't have changed much since then. It
would add a bit of extra code for supplementaries.
Mark
I would say that a compressor can normalize, if (a) when decompressing it
produces NFC, and (b) it advertises that it normalizes.
Mark
__
http://www.macchiato.com
----- Original Message -----
From: Doug Ewell [EMAIL PROTECTED]
To: Unicode Mailing List [EMAIL
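As an illustration of those two conditions, a toy sketch pairing NFC with an
ordinary byte compressor; zlib is my stand-in here, not anything discussed
in the thread:

    import unicodedata
    import zlib

    ADVERTISES_NORMALIZATION = True  # condition (b): the behavior is documented

    def compress(text):
        # Normalizing before compression is safe precisely because
        # decompression is defined to yield NFC (condition (a)).
        return zlib.compress(unicodedata.normalize("NFC", text).encode("utf-8"))

    def decompress(data):
        return zlib.decompress(data).decode("utf-8")  # NFC by construction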
John Cowan suggested...
We will never come close to exceeding this limit. Essentially all new
combining characters are either class 0 or fall into one of the 200-range
positional classes.
Or 9, for viramas.
One take-home point is that there won't be any more fixed position
classes added
On 25/11/2003 08:55, Doug Ewell wrote:
Normalization may or may not have an effect on compression. It has
definitely been shown to have an effect on Hebrew combining marks.
I must ask, however, that we try to keep these issues separate in
discussion, and not let the compression topic, if there
The Unicode conformance clauses, in TUS 4.0 section 3.2, are written in
terms of what 'a process' may or may not do, sometimes in relation to
another process. But there doesn't seem to be a definition, either in
this section or in the glossary, of 'process'. Is this to be understood
in a general
Peter Constable wrote,
On Behalf Of Ritu Malhotra
Could someone kindly help me by providing an exe (font utility) that
will not only edit OpenType fonts (e.g. Mangal.ttf)...
Making changes to mangal.ttf or other Microsoft fonts would be in
violation of the end-user license agreement
Mozilla Firebird 0.7/WinXP had no problem with the Chart, though it
was a little slow to open and even slower to print it. I got four pages
of decidedly small type; only columns 0-63 appeared in the printout
(I wish Mozilla had a mode to print stuff wider than the print margin
on separate pages).
Of course, as usual, this is my opinion. UTC hasn't actually made any
proclamations about what will or won't be done in terms of the classes or
what kinds of classes might be assigned in the future.
Rick
John Cowan suggested...
We will never come close to exceeding this limit.
John Hudson wrote,
If in doubt, check your license agreement.
Windows users can check the licensing material on many newer fonts
with a program called TTFEXT.EXE, freely available from Microsoft:
http://www.microsoft.com/typography/property/property.htm
It's too bad that this feature is
Peter Constable wrote,
James:
Inside a program, for instance...
This is *very* faulty logic. ...
Jeepers!
... Variable names exist in source code only,
and have nothing whatsoever to do with the data actually processed.
Exactly. Variable names are always internal while data may be
[EMAIL PROTECTED] wrote:
I don't know about Chinese, but it appears that one is limited to
WorldScript. Word hasn't been updated for Mac OS since 2001.
I would enjoy hearing otherwise, but as far as I know, the only MS products
for the Mac which are not like this (and can actually do Unicode
At 12:07 PM 11/25/2003, [EMAIL PROTECTED] wrote:
Most font developers restrict rights on their fonts. Obtaining a
legal copy of a font only grants the user the right to use the font;
not to make changes.
Actually, a lot of font developers -- probably the majority -- explicitly
allow
There's still my unanswered question about the third numeric field not
being filled for some numeric characters (notably Nl characters, i.e. number
letters).
I accepted the fact that it cannot be defined for a numerator one less
than the denominator, but the Roman numeral for 900 has NO defined
Doug Ewell writes:
* Philippe Verdy and Jill Ramonsky say YES, a compressor can
normalize, because it knows it is operating on Unicode character data
and can take advantage of Unicode properties.
I say YES only for compressors that are supposed to work on Unicode text
(this applies to
Mark Davis writes:
I would say that a compressor can normalize, if (a) when decompressing it
produces NFC, and (b) it advertises that it normalizes.
Why condition (a) ? NFD could be used as well, and even another
normalization where combining characters are sorted differently, or partly
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
I say YES only for compressors that are supposed to work on Unicode
text (this applies to BOCU-1 and SCSU, which are not intended to
compress anything other than Unicode text), but NO of course for
general purpose compressors (like
Peter Kirk peterkirk at qaya dot org wrote:
The Unicode conformance clauses, in TUS 4.0 section 3.2, are written
in terms of what 'a process' may or may not do, sometimes in relation
to another process. But there doesn't seem to be a definition,
either in this section or in the glossary, of
Rick McGowan writes:
John Cowan suggested...
We will never come close to exceeding this limit. Essentially all new
combining characters are either class 0 or fall into one of the
200-range positional classes.
Or 9, for viramas.
Or 1, for overlays. Don't forget them...
Or 7, for
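Those class numbers are easy to verify; a small sketch with representative
characters (my own choice of examples):

    import unicodedata

    examples = {
        "\u094D": "DEVANAGARI SIGN VIRAMA",   # ccc 9
        "\u0334": "COMBINING TILDE OVERLAY",  # ccc 1
        "\u093C": "DEVANAGARI SIGN NUKTA",    # ccc 7
        "\u0301": "COMBINING ACUTE ACCENT",   # ccc 230, a positional class
    }
    for ch, name in examples.items():
        print(f"U+{ord(ch):04X} {name}: ccc={unicodedata.combining(ch)}")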
Hi Andrew,
Thanks for stumbling into this problem. We can confirm that the Unihan readings are
incorrect, and will generate a correct mapping between the 164 CJK characters in
question and their kVietnamese values for submission to the UTC.
Best,
James
On 25/11/2003 12:02, Peter Kirk wrote:
The Unicode conformance clauses, in TUS 4.0 section 3.2, are written
in terms of what 'a process' may or may not do, sometimes in relation
to another process. But there doesn't seem to be a definition,
either in this section or in the glossary, of 'process'.
Doug Ewell writes:
Yes, you can take SCSU- or BOCU-1-encoded text and recompress it using a
GP compression scheme. Atkin and Stansifer's paper from last year is
all about that, and I spend a few pages on it in my paper as well. You
can also re-Zip a Zip file, though, so I don't know what
Peter Kirk writes:
The Unicode conformance clauses, in TUS 4.0 section 3.2, are written in
terms of what 'a process' may or may not do, sometimes in relation to
another process. But there doesn't seem to be a definition, either in
this section or in the glossary, of 'process'. Is this to be
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
There's still my unanswered question about the third numeric field not
being filled for some numeric characters (notably Nl characters, i.e. number
letters).
I accepted the fact that it cannot be defined for a numerator one less
than
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
So SCSU and BOCU-* formats are NOT general purpose compressors. As
they are defined only in terms of streams of Unicode code points, they
are assumed to follow the conformance clauses of Unicode. As they
recognize their input as
Peter Kirk peterkirk at qaya dot org wrote:
Instance A of a program P, version X, writes a Unicode character
string S, in a particular normalisation form, to a storage medium Z.
Some time later (maybe seconds, maybe years) instance B of version Y
of that same program P reads that string from
"Doug Ewell" [EMAIL PROTECTED] crivait en ce
25/XI/2003
All the Roman numerals I can find in the
standard, except U+2183 ROMAN NUMERAL REVERSED ONE HUNDRED, have a value
in the "numeric value" field. (Perhaps the actual numeric value of
U+2183 is not known.)
I think it is rather
Doug Ewell scripsit:
All the Roman numerals I can find in the standard, except U+2183 ROMAN
NUMERAL REVERSED ONE HUNDRED, have a value in the numeric value field.
(Perhaps the actual numeric value of U+2183 is not known.)
It has no definite numeric value. The notation CI), where ) means
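The gap is visible programmatically; Python's unicodedata reads these same
UnicodeData.txt fields:

    import unicodedata

    print(unicodedata.numeric("\u2167"))        # ROMAN NUMERAL EIGHT -> 8.0
    print(unicodedata.numeric("\u2180"))        # ROMAN NUMERAL ONE THOUSAND C D -> 1000.0
    print(unicodedata.numeric("\u2183", None))  # REVERSED ONE HUNDRED -> None: no value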
I tried the chart where I teach, on RedHat Linux 9 and Mozilla 1.2 or
1.4 (I forget which) and it came through fine, if small.
~mark
On 11/25/03 15:05, John Cowan wrote:
Mozilla Firebird 0.7/WinXP had no problem with the Chart, though it
was a little slow to open and even slower to print it.
On Nov 25, 2003, at 12:05 PM, Christopher John Fynn wrote:
It doesn't work. They seem always to get converted into underscores.
Do the characters actually get converted to underscore characters or
do they
simply get displayed with an underscore glyph?
They get converted.
Input and rendering,