Manchu/Mongolian in Unicode

2002-10-13 Thread Tom Gewecke

The latest Mac OS X upgrade has fonts that include the classic
Mongolian/Manchu range, 1800-18AF.

Displaying these scripts correctly seems to be loaded with problems:  They
should run top-to-bottom and left-to-right, with ligatures and positional
variants similar to Arabic.

I assume that ligatures and positional variants would be handled by font
tables and rendering software operating on text encoded with the basic
codepoints. I'm wondering, however, how the directional questions of
display would be dealt with.

I gather that vertical display is for markup and not part of Unicode.  I've
found what appears to be the appropriate stuff in the "writing-mode"
property of XSL and CSS3. Does anyone know of any browsers, Mac or Windows,
that support this?

I've also seen examples of the scripts written horizontally, both
left-to-right and right-to-left.  Since the standard font glyphs have a
vertical orientation, they must be individually rotated minus or plus 90
degrees for displaying horizontally.  Is this also purely a markup issue,
and are there any browsers that support it?

Thanks for the help!  Any pointers to online or other information on this
topic would be appreciated.
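
For anyone who wants to see exactly what the 1800-18AF range contains, here is a minimal Python sketch. It assumes only the standard `unicodedata` module; the helper name `named_characters` is made up for illustration, and how many characters it reports depends on the Unicode version bundled with the interpreter.

```python
import unicodedata

# Range quoted in the message above: U+1800..U+18AF (the Mongolian block).
MONGOLIAN_START, MONGOLIAN_END = 0x1800, 0x18AF


def named_characters(start, end):
    """Return (code point, name) pairs for the assigned characters in a range."""
    result = []
    for cp in range(start, end + 1):
        # unicodedata.name() returns the default (None) for unassigned code points.
        name = unicodedata.name(chr(cp), None)
        if name is not None:
            result.append((cp, name))
    return result


if __name__ == "__main__":
    for cp, name in named_characters(MONGOLIAN_START, MONGOLIAN_END):
        print(f"U+{cp:04X}  {name}")
```

This only lists the encoded characters; it says nothing about whether a given font can shape or orient them correctly.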








Re: Manchu/Mongolian in Unicode

2002-10-14 Thread Michael Everson

At 11:26 -0700 2002-10-13, Tom Gewecke wrote:
>The latest Mac OS X upgrade has fonts that include the classic
>Mongolian/Manchu range, 1800-18AF

Where?
-- 
Michael Everson * * Everson Typography *  * http://www.evertype.com
48B Gleann na Carraige; Cill Fhionntain; Baile Átha Cliath 13; Éire
Telephone +353 86 807 9169 * * Fax +353 1 832 2189 (by arrangement)




Re: Manchu/Mongolian in Unicode

2002-10-14 Thread Michael Everson

At 13:29 +0200 2002-10-14, Herbert Elbrecht wrote:
>Hi Michael -
>
>Apple "Character Palette" has special "Show only fonts containing 
>selected character" - click triangle below, select Mongolian Unicode 
>Block first, and a Mongolian character then; now activate "Show only 
>fonts containing selected character" and now you get: "STFangsong, 
>STHeiti, STKaiti & STSong" for Mongolian!!!

Apple's Character Palette will not load on my machine, a bug which I 
have been filing for a good while now.




Re: Manchu/Mongolian in Unicode

2002-10-14 Thread Andrew C. West

On Mon, 14 Oct 2002, Tom Gewecke wrote:

> 
> The latest Mac OS X upgrade has fonts that include the classic
> Mongolian/Manchu range, 1800-18AF.
> 
> Displaying these scripts correctly seems to be loaded with problems:  They
> should run top-to-bottom and left-to-right, with ligatures and positional
> variants similar to Arabic.
> 
You're not kidding !!

Microsoft's SimSun-18030 font (available at ) also covers the Mongolian
block, but as with the Mac font, whilst this font provides glyphs for the
individual code points of the Mongolian block, there are no glyphs for the
many standard variants, and there is no mechanism for ligating the glyphs to
form typographically correct Mongolian/Todo/Sibe/Manchu text (ditto for
Tibetan). Under a new Chinese law all new software sold in China must support
the new GB18030 character standard and operating systems must provide support
for the Tibetan, Mongolian, Uighur and Yi scripts. Although totally unusable
for typesetting Mongolian, Manchu or Tibetan text, the provision of glyphs
with a one-to-one correspondence with Mongolian and Tibetan block codepoints
ensures compliance with the new law, and thus allows Microsoft and Apple to
continue to sell their operating systems in China.

As far as I am aware there is nobody out there currently engaged in the
production of a proper Unicode Mongolian/Manchu font - but I'd be pleased if
someone could prove me wrong on this.

> 
> Thanks for the help!  Any pointers to online or other information on this
> topic would be appreciated.

The 1999 report "Traditional Mongolian Script in the ISO/IEC 10646 and Unicode
Standards" by Myatav Erdenechimeg, Richard Moore and Yumbayar Namsrai,
available in PostScript form at , is the key document to understanding how to
write Mongolian and Manchu in Unicode, and should be read in conjunction with
Unicode's "Standardized Variants" page at .
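
As a rough illustration of what a variant looks like in encoded text, the following Python sketch pairs each base letter with a free variation selector that follows it. The helper `describe` and the sample string are made up for illustration; whether any particular letter-plus-selector pair is actually registered has to be checked against the Standardized Variants data, which this sketch does not consult.

```python
import unicodedata

# The three free variation selectors encoded in the Mongolian block.
FVS = {"\u180B": "FVS1", "\u180C": "FVS2", "\u180D": "FVS3"}


def describe(text):
    """Return (character, selector label or None) pairs for inspection."""
    pairs = []
    for ch in text:
        if ch in FVS and pairs and pairs[-1][1] is None:
            # Attach the selector to the character immediately before it.
            pairs[-1] = (pairs[-1][0], FVS[ch])
        else:
            pairs.append((ch, None))
    return pairs


if __name__ == "__main__":
    # Illustrative input only: MONGOLIAN LETTER A + FVS1, then MONGOLIAN LETTER E.
    sample = "\u1820\u180B\u1821"
    for ch, selector in describe(sample):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)} selector={selector}")
```

The selectors are ordinary characters in the text stream; choosing the right glyph for the pair is still left to the font and the rendering software.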

Andrew




Re: Manchu/Mongolian in Unicode

2002-10-14 Thread Tom Gewecke

>At 11:26 -0700 2002-10-13, Tom Gewecke wrote:
>>The latest Mac OS X upgrade has fonts that include the classic
>>Mongolian/Manchu range, 1800-18AF
>
>Where?

STFangsong, STHeiti, STKaiti, STSong

If you feel like "typing" the characters, I have a keyboard at

http://homepage.mac.com/thgewecke/fs/






Re: Manchu/Mongolian in Unicode

2002-10-14 Thread Dean Snyder

Michael Everson wrote the following at 10:18 AM on Mon, Oct 14, 2002:

>At 11:26 -0700 2002-10-13, Tom Gewecke wrote:
>>The latest Mac OS X upgrade has fonts that include the classic
>>Mongolian/Manchu range, 1800-18AF
>
>Where?

You can view all supported characters by using the Unicode character
palette invoked from the keyboard menu in Mac OS X 10.2.

This is a great Unicode tool (and one I have seen no mention of
elsewhere). You can view characters by Unicode block or the entire
Unicode character set; you can view the glyph catalog of whatever font
you select; you can add characters to the favorite tab. When you select a
character it shows the Unicode name, number, and related characters (like
the box drawing characters).

Respectfully,

Dean A. Snyder
Scholarly Technology Specialist
Center For Scholarly Resources, Sheridan Libraries
Garrett Room, MSE Library, 3400 N. Charles St.
The Johns Hopkins University
Baltimore, Maryland, USA 21218

office: 410 516-6850 mobile: 410 245-7168 fax: 410-516-6229
Digital Hammurabi: www.jhu.edu/digitalhammurabi
Initiative for Cuneiform Encoding: www.jhu.edu/ice






Re: Manchu/Mongolian in Unicode

2002-10-14 Thread John H. Jenkins


On Sunday, October 13, 2002, at 12:26 PM, Tom Gewecke wrote:

> The latest Mac OS X upgrade has fonts that include the classic
> Mongolian/Manchu range, 1800-18AF.
>

Well, yes, but they're not ready for prime time.  They're included 
because of PRC requirements which expect the glyphs but don't really 
insist that they "do the right thing."  The same is true of Tibetan.  
Even the PRC's own fonts have this problem.  This is an unfortunate 
bind we were put in and I hope we can correct it in a not-too-distant 
release.

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://www.tejat.net/





Re: Manchu/Mongolian in Unicode

2002-10-14 Thread Michael Everson

At 06:49 -0700 2002-10-14, Peter Lofting wrote:
>At 1:35 PM +0100 10/14/02, Michael Everson wrote:
>>Apple's Character Palette will not load on my machine, a bug which I
>>have been filing for a good while now.
>
>Works on my iBook. What machine have you got?
>
>Did you get a copy of 10.2 from Cork?

No, I get seeded 'cause I send in so many bugs. :-)

Actually a kind soul told me to delete

user//library/preferences/com.apple.CharPaletteServer.plist

Which does bring it back for me, though I find its functionality 
isn't working properly; when you choose a new font what's in the 
Window doesn't update. Thus I have Lucida Grande on screen and select 
the character for Yogh which has no glyph in Lucida Grande, but when 
I switch to Evertimes or Lucida Grande ME ze font, she does not 
change




Re: Manchu/Mongolian in Unicode

2002-10-14 Thread Stefan Persson

- Original Message -
From: "Andrew C. West" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Monday, October 14, 2002 2:58 PM
Subject: Re: Manchu/Mongolian in Unicode

> Microsoft's SimSun-18030 font

That font also includes some characters mapped to the PUA: A € sign, and
several 漢 character, many of which look like radicals. Why? Is that
something that's also required by that law?

Stefan






Re: Manchu/Mongolian in Unicode

2002-10-14 Thread Michael Everson

At 12:52 -0700 2002-10-14, Peter Lofting wrote:
>At 4:33 PM +0100 10/14/02, Michael Everson wrote:
>>No, I get seeded 'cause I send in so many bugs. :-)
>
>Good ! So you got the OSX 10.2 developer seed disks

Naaah I have to download them from Apple with a 56K modem with 
metered access




Re: Manchu/Mongolian in Unicode

2002-10-14 Thread Michael Everson

At 14:36 -0700 2002-10-14, tom wrote:
>To get Character Palette to actually change fonts in the panel, use 
>the Glyph Catalog tab.

Nope. This doesn't work for me. Changing the font in Glyph Catalogue 
does not change all my fonts. Actually, *my* fonts don't seem to 
appear. It is most strange.

Also I had to delete the .plist again. No idea why. Time to file 
another bug I think.




Re: Manchu/Mongolian in Unicode

2002-10-15 Thread Andrew C. West

On Tue, 15 Oct 2002, "Stefan Persson" wrote:

> That font also includes some characters mapped to the PUA: A € sign, and
> several 漢 character, many of which look like radicals. Why? Is that
> something that's also required by that law?
> 

It's my experience that many fonts include gunk in the Private Use Area. A
quick check of some of the CJK glyphs in the PUA of SimSun-18030 shows that
they are not unique, but are also mapped to codepoints in the CJK Radical
Supplement and CJK-A blocks for example.

I believe that it is intended to maintain a one-to-one correspondence between
the GB18030 standard and Unicode, and so there should be no need for any
supplementary glyphs in the PUA.

The new PRC law is, as you hint, overly restrictive and prescriptive, and is,
I think, a serious setback for popularisation of Unicode on the Web. The
intent is that GB18030 should replace GB2312 and Big5, and so that instead of
the current mishmash of GB2312 (SC) and Big5 (TC) websites, in the future
Traditional and Simplified Chinese sites (at least those hosted in China) will
use the same GB18030 encoding.

Where does this leave websites written in Unicode Chinese ? Out in the cold !

At present web pages written in Unicode Chinese (some of mine for example) are
not being indexed by Google, and are ignored by both Yahoo China (SC) and
Chinese Yahoo (TC). The situation will certainly not be improved by the
replacement of GB2312 and Big5 with GB18030.

Andrew




Re: Manchu/Mongolian in Unicode

2002-10-15 Thread Markus Scherer

Andrew C. West wrote:

> On Tue, 15 Oct 2002, "Stefan Persson" wrote:
> 
>>That font also includes some characters mapped to the PUA: A € sign, and
>>several 漢 character, many of which look like radicals. Why? Is that
>>something that's also required by that law?
> 
> It's my experience that many fonts include gunk in the Private Use Area. A
> quick check of some of the CJK glyphs in the PUA of SimSun-18030 shows that
> they are not unique, but are also mapped to codepoints in the CJK Radical
> Supplement and CJK-A blocks for example.


I may be able to shed some light on this.

GB 18030 is really an extension not only of GB 2312, but also of GBK.
GBK contained all ideographs from Unicode 2.0, plus of course many other characters.

GB 18030 is based on Unicode 3.0. Between 2.0 and 3.0 some characters were added to 
Unicode that GBK had mapped to the Unicode Private Use Area. GB 18030 maps those 
characters to their Unicode 3.0 code points instead of PUA ones, and the PUA ones now 
map instead to linearly enumerated 4-byte sequences.
About 80 such characters are affected, among them the Euro sign and the Ideographic 
Description Characters. (Listed in Appendix E of the GB 18030 standard.)

I assume that the font shows glyphs for those 80 or so characters at both the old 
GBK/Unicode PUA position and the new GB 18030/Unicode 3.0 real code point.
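
To see the two kinds of mapping side by side, here is a small Python sketch. It assumes the `gb18030` codec that ships with Python; which revision of the standard that codec tracks is not something I have checked, so treat the result as an illustration rather than a conformance test. The helper name `gb18030_bytes` is made up.

```python
def gb18030_bytes(text):
    """Encode text with Python's bundled gb18030 codec."""
    return text.encode("gb18030")


if __name__ == "__main__":
    samples = {
        "EURO SIGN": "\u20AC",           # inherited from the GBK two-byte area
        "MONGOLIAN LETTER A": "\u1820",  # not in GBK, so a four-byte sequence
    }
    for label, ch in samples.items():
        encoded = gb18030_bytes(ch)
        # Round-trip to confirm the codec maps the bytes back to the same character.
        assert encoded.decode("gb18030") == ch
        print(f"{label}: U+{ord(ch):04X} -> {encoded.hex()} ({len(encoded)} bytes)")
```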


See http://oss.software.ibm.com/icu/docs/papers/gb18030.html


> I believe that it is intended to maintain a one-to-one correspondence
> between the GB18030 standard and Unicode, and so there should be no need for
> any supplementary glyphs in the PUA.

> The new PRC law is, as you hint, overly restrictive and prescriptive, and
> is, I think, a serious setback for popularisation of Unicode on the Web. The
> intent is that GB18030 should replace GB2312


... and GBK ...


> and Big5, and so that instead of the current mishmash of GB2312 (SC) and
> Big5 (TC) websites, in the future Traditional and Simplified Chinese sites
> (at least those hosted in China) will use the same GB18030 encoding.


I am not sure about this. GB 18030 requires software to _support_ its new encoding, 
but I believe it does not require anyone to _use_ it.
Most implementations have a converter to/from Unicode, and GB 18030 works quite well 
for that because it is defined _in terms of_ Unicode.
As such, it actually boosts the spread of Unicode-based software. The drawback is of 
course that a GB 18030 converter requires special code on top of a large mapping table.
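
As a sketch of the "special code" part, the four-byte sequences for the supplementary planes (U+10000 and above) can be computed arithmetically instead of being looked up. The Python below assumes the usual byte ranges for four-byte GB 18030 (bytes 1 and 3 in 0x81..0xFE, bytes 2 and 4 in 0x30..0x39) and that U+10000 corresponds to 0x90 0x30 0x81 0x30. The function names are made up, and BMP characters would still need the mapping table, which is not shown.

```python
# Linear index of the four-byte sequence 0x90 0x30 0x81 0x30 (U+10000):
# (0x90 - 0x81) * 10 * 126 * 10 = 189000.
SUPPLEMENTARY_BASE = 189000


def four_bytes_from_linear(linear):
    """Turn a linear index into a GB 18030 four-byte sequence."""
    linear, d4 = divmod(linear, 10)   # byte 4 runs over 0x30..0x39
    linear, d3 = divmod(linear, 126)  # byte 3 runs over 0x81..0xFE
    d1, d2 = divmod(linear, 10)       # byte 2 runs over 0x30..0x39
    return bytes([0x81 + d1, 0x30 + d2, 0x81 + d3, 0x30 + d4])


def encode_supplementary(code_point):
    """Encode a supplementary-plane code point without any mapping table."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only supplementary-plane code points are handled here")
    return four_bytes_from_linear(SUPPLEMENTARY_BASE + code_point - 0x10000)


if __name__ == "__main__":
    for cp in (0x10000, 0x1F600, 0x10FFFF):
        ours = encode_supplementary(cp)
        # Compare against Python's bundled codec as a sanity check.
        theirs = chr(cp).encode("gb18030")
        print(f"U+{cp:06X}: {ours.hex()} (codec agrees: {ours == theirs})")
```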


> Where does this leave websites written in Unicode Chinese ? Out in the cold !
> 
> At present web pages written in Unicode Chinese (some of mine for example)
> are not being indexed by Google, and are ignored by both Yahoo China (SC) and
> Chinese Yahoo (TC). The situation will certainly not be improved by the
> replacement of GB2312 and Big5 with GB18030.


There is no reason for that. You should contact Google to get that fixed.

markus
-- 
Opinions expressed here may not reflect my company's positions unless otherwise noted.





Tolkienian Sarati (Formerly: RE: Manchu/Mongolian in Unicode)

2002-10-24 Thread Robert
 




 --- On Sun 10/13, Tom Gewecke < [EMAIL PROTECTED] > wrote:
From: Tom Gewecke [mailto: [EMAIL PROTECTED]]
To: [EMAIL PROTECTED]
Date: Sun, 13 Oct 2002 11:26:04 -0700
Subject: Manchu/Mongolian in Unicode

> Displaying these scripts correctly seems to be loaded with problems: They
> should run top-to-bottom and left-to-right, with ligatures and positional
> variants similar to Arabic.
> 
--Reply--
Another alphabetic script that is read vertically from top to bottom, in columns running left to right, is Sarati, one of the fantasy scripts from the late J. R. R. Tolkien's *Lord of the Rings* book series.  In Sarati the consonant symbols (sarat) form the backbone of the vertical reading line; the vowels are small marks that go on either side of the consonant concerned (left for before, right for after). The name *Illuvatar*, for example, would be written thus in Sarati:
 

 
 
 

the  holding the initial  mark to its left.  Carriers long and short are used to hold vowel signs that form syllables by themselves (like the initial  in *Illuvatar*, above).
There's even a sarat for the blend , to boot!  Vowel marks include those for lengthened doubles  and <əə> (the 2nd being a double shəwa).
Thank You!

Robert Lloyd Wheelock
Augusta, ME  USA
