RE: Ruby Annotation and XHTML 1.1 are W3C Proposed Recommendations

2001-04-10 Thread Martin Duerst
At 10:00 01/04/09 -0700, Carl W. Brown wrote: >I am wondering how in the absence of a sub language how one should render >Chinese ruby. Mandarin ruby will not do a Cantonese reader much good. Can >I specify multiple ruby and then have one displayed depending on the spoken >language? Maybe that'

Re: The Code2000 font.

2001-04-10 Thread James Kass
Marco Cimarosti wrote about the Code2000 font, and so did 11digitboy. There is room for improvement in the font and I will consider your helpful comments. I'll respond to specific issues off-list and request that anyone who wants to discuss the font's merits and/or shortcomings contact me priv

Re: Byte Order Marks

2001-04-10 Thread DougEwell2
In a message dated 2001-04-10 3:04:09 Pacific Daylight Time, [EMAIL PROTECTED] writes: > When looking at a document would it be safe to assume that if you found any > of the following Byte Order Marks > *0xFFFE (UCS-2 Little Endian) > *0xFEFE (UCS-2 Big Endian) should be 0xFEFF >
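
[Editorial note] The check being discussed can be sketched in a few lines of Python. This is only an illustration, not code from the thread: the function name `sniff_bom` and the returned labels are made up for the example, and the byte values come from Python's standard `codecs` constants (FE FF for big-endian, FF FE for little-endian, EF BB BF for UTF-8). A leading BOM is a strong hint about the encoding, not proof, and its absence says nothing.

```python
import codecs

# BOM byte sequences taken from the standard library constants:
#   codecs.BOM_UTF8     == b'\xef\xbb\xbf'
#   codecs.BOM_UTF16_BE == b'\xfe\xff'
#   codecs.BOM_UTF16_LE == b'\xff\xfe'
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def sniff_bom(data: bytes):
    """Return a guessed encoding label if data starts with a known BOM, else None.

    Hypothetical helper for illustration only. The result is a hint:
    arbitrary binary data can happen to start with the same bytes.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None
```

For example, `sniff_bom(b"\xfe\xfe")` returns `None`, since FE FE is not a byte order mark in either byte order.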

Re: gb2312

2001-04-10 Thread Jungshik Shin
On Tue, 10 Apr 2001, Tomas McGuinness wrote: > Is the character set gb2312 encoded in a two octet scheme? If so does it pad > out its ascii characters to two octets e.g. the character < is 0x3C in ascii > so does it become 0x003C in gb2312? No !! In EUC-CN(which is a better name for what y
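
[Editorial note] The point that ASCII characters are not padded to two octets can be checked with a short Python sketch. It assumes Python's built-in `gb2312` codec, which implements the EUC-CN encoding of GB 2312; the helper name `byte_lengths` is invented for the example. The last lines contrast this with big-endian UCS-2/UTF-16, where `<` really does become 0x003C.

```python
def byte_lengths(text: str, encoding: str = "gb2312"):
    """Return (character, encoded bytes) pairs for each character in text.

    Hypothetical helper for illustration; it relies on Python's
    built-in "gb2312" codec (EUC-CN).
    """
    return [(ch, ch.encode(encoding)) for ch in text]


# ASCII stays one byte; a Chinese character takes two bytes,
# both with the high bit set.
pairs = byte_lengths("<\u4e2d")          # "<" followed by U+4E2D
lt_bytes = pairs[0][1]                   # b'<'  (0x3C, one byte)
han_bytes = pairs[1][1]                  # two bytes, each >= 0xA1

# By contrast, big-endian UTF-16 uses two bytes for both characters.
utf16 = "<\u4e2d".encode("utf-16-be")    # b'\x00<N-'
```

A converter from this encoding to UCS-2 therefore has to decode the byte stream character by character; it cannot simply treat the input as fixed-width 16-bit units.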

Re: Unicode savvy concordance software?

2001-04-10 Thread Otto Stolz
On 2001-04-06 at 7:50 h UCT, Richard Kunst wrote: > Perhaps you could post to the list a brief summary in English of > the extent to which TUSTEP does support Unicode. On 2001-04-07 at 9:22 h UCT, Janusz S. Bien' wrote: > Do you mean TUSTEP supports UNICODE? Since October 1999,

RE: gb2312

2001-04-10 Thread Marco Cimarosti
Tomas McGuinness wrote: > Is the character set gb2312 encoded in a two octet scheme? It is one of the so-called "double byte character sets" (DBCS), but this name is misleading: "multibyte character set" (MBCS) is a better definition. > If so does it pad out its ascii characters to two octets >

gb2312

2001-04-10 Thread Tomas McGuinness
Hi, Is the character set gb2312 encoded in a two octet scheme? If so does it pad out its ascii characters to two octets e.g. the character < is 0x3C in ascii so does it become 0x003C in gb2312? Regards, Tom. Tomas McGuinness Consultant > -

Byte Order Marks

2001-04-10 Thread Tomas McGuinness
Hi, When looking at a document would it be safe to assume that if you found any of the following Byte Order Marks * 0xFFFE (UCS-2 Little Endian) * 0xFEFE (UCS-2 Big Endian) * 0xEFBBBF (UTF-8) That the document is encoded with that encoding format. That means that if I found the

Re: Digits in Unicode Names

2001-04-10 Thread Antoine Leca
Nelson H. F. Beebe wrote: > > Yves Arrouye <[EMAIL PROTECTED]> writes on Fri, 6 Apr 2001 15:52:59 -0700: > > >> Does anybody know if the C++ standard specified > Here is what Thanks Nelson for quoting the relevant citations. > However, it is not clear to me on a quick skim that wchar_t > ne

RE: The Code 2000 font.

2001-04-10 Thread Marco Cimarosti
[EMAIL PROTECTED] wrote > To the author of the Code 2000 font: > 1) The top stroke of a capital J does not require serifs. > It is a serif. Well, font designers have the freedom to do what they want with their glyphs, and often they choose to go against conventions to make their design more origi

RE: Why win32 ANSI api does not work with Indic Scripts?

2001-04-10 Thread Marco Cimarosti
Gharesh wrote (on Sat Apr 7, 2001 11:45am): > There is an article in MSJ Nov98 publication on 'Supporting > Multilingual text layout and complex scripts on Windows NT 5.0" > http://www.microsoft.com/MSJ/1198/multilang/multilang.htm > > It says following; > > "Indic scripts must be handled separate

RE: Code charts

2001-04-10 Thread Marco Cimarosti
Tomás McGuinness wrote: > I am working on a project that involves converting WML and HTML > documents from a character set to UCS-2. The problem is that > the UCS-2 hex representation for say 0x003C (<) is not present > in GB2312 [the same glyph I mean]. Notice that some characters normally have