Re: GB18030 and super font

2004-04-22 Thread Eric Muller
Raymond Mercier wrote: Mark Shoulson writes >their Super Font is bundled with Microsoft Office XP, and > even Microsoft's prices haven't gotten that high! >From Microsoft, http://www.microsoft.com/globaldev/DrIntl/columns/015/default.mspx : "A font that contains Simp

Re: GB18030 and super font

2004-04-22 Thread Frank Yung-Fong Tang
In case you want to test your GB18030 font, you can use Netscape 7 (or latest Mozilla) and then visit my GB18030 test pages at http://people.netscape.com/ftang/testscript/gb18030/gb18030.cgi?page=10 It should be page-to-page compatible with the paper copy of the GB18030-2000 standard. I also

Re: GB18030 and super font

2004-04-22 Thread Peter Kirk
On 22/04/2004 10:04, Raymond Mercier wrote: Eric, Amazin' Amazon!! Now why didn't I think of that? In fact the UK Amazon.co.uk say it is discontinued, so I would have to get it from Amazon in the US. It is not the first time that the two Amazons fail to connect. Many thanks for the tip, Raymo

Re: GB18030 and super font

2004-04-22 Thread Frank Yung-Fong Tang
Raymond Mercier wrote on 4/22/2004, 7:35 AM: > I enquired about the 'super font' created by a Beijing foundry, > http://font.founder.com.cn/english/web/index.htm, and am fairly > astonished > at the prices, as you see from the attached. The cost of producing these fonts is much higher than p

Re: GB18030 and super font

2004-04-22 Thread Raymond Mercier
riginal Message - From: Eric Muller To: [EMAIL PROTECTED] Sent: Thursday, April 22, 2004 5:40 PM Subject: Re: GB18030 and super font Raymond Mercier wrote: But that link to proofing tools leads nowhere. Maybe it's not so easy to get the CHS version. In

Re: GB18030 and super font

2004-04-22 Thread Eric Muller
Raymond Mercier wrote: But that link to proofing tools leads nowhere. Maybe it's not so easy to get the CHS version. Includes ~140 fonts, mostly for CJK, Arabic, Hebrew but other scripts as well. Includes "Simsun (Founder Extended)" aka "宋体-方正超大字符集", with 65,531 glyph

Re: GB18030 and super font

2004-04-22 Thread Philippe Verdy
From: "Mark E. Shoulson" <[EMAIL PROTECTED]> > Raymond Mercier wrote: > > >I am intrigued by GB18030 encoding. There is a table of equivalences in > >http://oss.software.ibm.com/cvs/icu/~checkout~/charset/data/xml/gb-18030-2000.xml > >No doubt Unih

Re: GB18030 and super font

2004-04-22 Thread Ernest Cline
Possibly they were quoting the price for one to be able to bundle their font with software that you would sell. Judging by the website, I don't think that their intent is to sell directly to individual users. In that context, the price doesn't seem unreasonable at all. When you consider that hig

GB18030 and super font

2004-04-22 Thread Raymond Mercier
Mark Shoulson writes >their Super Font is bundled with Microsoft Office XP, and > even Microsoft's prices haven't gotten that high! From Microsoft, http://www.microsoft.com/globaldev/DrIntl/columns/015/default.mspx : "A font that contains Simplified Chinese glyphs from both CJK Extension A and B s

Re: GB18030 and super font

2004-04-22 Thread Mark E. Shoulson
Raymond Mercier wrote: I am intrigued by GB18030 encoding. There is a table of equivalences in http://oss.software.ibm.com/cvs/icu/~checkout~/charset/data/xml/gb-18030-2000.xml No doubt Unihan will at some stage include these 2 & 4 byte values. I enquired about the 'super font'

GB18030 and super font

2004-04-22 Thread Raymond Mercier
I am intrigued by GB18030 encoding. There is a table of equivalences in http://oss.software.ibm.com/cvs/icu/~checkout~/charset/data/xml/gb-18030-2000.xml No doubt Unihan will at some stage include these 2 & 4 byte values. I enquired about the 'super font' created by a Beijing

Re: commandline converter for gb18030 -> utf8 in *nix

2004-03-05 Thread Frank Yung-Fong Tang
You can also use 'nsconv', which comes with the Mozilla source code and handles GB18030. See http://www.mozilla.org/projects/l10n/mlp_tools.html for details. Zhang Weiwu wrote on 3/5/2004, 6:43 AM: > Hello. I believe this must be a frequent question, but I googled around > and I didn

Re: commandline converter for gb18030 -> utf8 in *nix

2004-03-05 Thread Zhang Weiwu
Peter Jacobi wrote: Hello. I believe this must be a frequent question, but I googled around and I didn't find a satisfying tool. It seems most converters do GB2312 but not GB18030. Both GNU libc iconv and GNU libiconv support GB18030. I assume the libiconv distribution includes the co

Re: commandline converter for gb18030 -> utf8 in *nix

2004-03-05 Thread Peter Jacobi
> Hello. I believe this must be a frequent question, but I googled around > and I didn't find a satisfying tool. It seems most converters do GB2312 > but not GB18030. Both GNU libc iconv and GNU libiconv support GB18030. I assume the libiconv distribution includes the comman
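
For a batch of files like the 100+ mentioned in this thread, a scripted loop is one option next to the iconv command-line tool (whose usual invocation is along the lines of `iconv -f GB18030 -t UTF-8`). The following is only a minimal sketch, assuming Python 3 and its built-in `gb18030` codec; the function names, the directory layout, and the `*.txt` pattern are illustrative assumptions, not something taken from the messages above.

```python
from pathlib import Path


def gb18030_to_utf8(src: Path, dst: Path) -> None:
    """Re-encode a single file from GB18030 to UTF-8."""
    # Strict decoding: a UnicodeDecodeError here means the input is not valid GB18030.
    text = src.read_bytes().decode("gb18030")
    dst.write_bytes(text.encode("utf-8"))


def convert_tree(src_dir: Path, dst_dir: Path, pattern: str = "*.txt") -> int:
    """Convert every matching file under src_dir into dst_dir; return the count."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in sorted(src_dir.glob(pattern)):
        gb18030_to_utf8(src, dst_dir / src.name)
        count += 1
    return count
```

Working on raw bytes (rather than text-mode file handles) keeps line endings untouched, which matters when the converted files are compared against the originals.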

commandline converter for gb18030 -> utf8 in *nix

2004-03-05 Thread Zhang Weiwu
Hello. I believe this must be a frequent question, but I googled around and I didn't find a satisfying tool. It seems most converters do GB2312 but not GB18030. I have 100+ files to convert; normal graphical/web-based converters won't do the work well. On my FreeBSD there is a p

GB18030 mapping table....

2003-08-19 Thread Addison Phillips [wM]
Hi Will, The ICU library is a good source for information like this. See: http://oss.software.ibm.com/icu/charset/ The data table is located here: http://oss.software.ibm.com/cvs/icu/charset/data/xml/gb-18030-2000.xml Read the note on the first page. There are official sources as well, but I

RE: UTF-16 vs UTF-32 (was IBM AIX 5 and GB18030

2002-11-15 Thread John McConnell
xt section, which taught me that I shouldn't care. John Microsoft -Original Message- From: Doug Ewell [mailto:dewell@;adelphia.net] Sent: Thursday, November 14, 2002 8:26 PM To: Unicode Mailing List Cc: Carl W. Brown Subject: Re: UTF-16 vs UTF-32 (was IBM AIX 5 and GB18030 Carl W

Re: IBM AIX 5 and GB18030

2002-11-15 Thread Markus Scherer
Jane, you are right, I over-simplified. I tried to make the point that you need not _process_ text in GB18030 but that Unicode processing and conversion to/from GB18030 fulfills the requirement to be able to read and write GB18030 text. Yes, you need to have font support for all the characters

Re: IBM AIX 5 and GB18030

2002-11-15 Thread Markus Scherer
Michael Yau wrote: Markus, >The standard does _not_ require to _process_ internally in GB18030. It is sufficient to have a converter and to process in Unicode, which does contain all of >the characters. Just curious, do you have this in writing from the China standards body? I

RE: UTF-16 vs UTF-32 (was IBM AIX 5 and GB18030

2002-11-15 Thread Carl W. Brown
Doug, > > However, 16 bit characters were a hard enough sell in the good old > > days. If we had started out withug 2bit characters we would still be > > dreaming about Unicode. > > I think Carl meant "with 32-bit characters." I don't know what kind of > word "withug" is (Old English?), but I li

Re: UTF-16 vs UTF-32 (was IBM AIX 5 and GB18030

2002-11-14 Thread Doug Ewell
Carl W. Brown wrote: > Converting from UCS-2 to UTF-16 is just like converting from SBCS to > DBCS. For folks who think DBCS it is no problem. Those who went from > DBCS to Unicode to simplify their lives I am sure are not happy. Ken made me laugh last March by referring to this as "... a

UTF-16 vs UTF-32 (was IBM AIX 5 and GB18030

2002-11-14 Thread Carl W. Brown
Markus, > You seem to suggest that there is a problem with 16-bit Unicode. > It does take some effort to adapt > UCS-2-designed functions for UTF-16, but it's not "rocket > science" and works very well thanks to the > Unicode allocation practice (common characters in the BMP). > Making UTF-8/32 fu

RE: IBM AIX 5 and GB18030

2002-11-14 Thread Carl W. Brown
[EMAIL PROTECTED] [mailto:unicode-bounce@;unicode.org]On > Behalf Of Markus Scherer > Sent: Thursday, November 14, 2002 9:18 AM > To: unicode > Subject: Re: IBM AIX 5 and GB18030 > > > Carl W. Brown wrote: > > Some Unix systems adapted faster because the later Unicode > adopter

Re: IBM AIX 5 and GB18030

2002-11-14 Thread Joe Ross
:59 AM To: Markus Scherer <[EMAIL PROTECTED]>, unicode <[EMAIL PROTECTED]> cc: Subject: Re: IBM AIX 5 and GB18030 Thanks Mark! That may mean IBM AIX 5 supports conversion between GB18030 and Unicode, but I don't see this is a system l

Re: IBM AIX 5 and GB18030

2002-11-14 Thread Jane Liu
Mark, I think only "converter" is not sufficient. How about the following support : - IME (to input CJK Ext.A characters through GB18030/Unicode code) - X-Windows fonts support. - iconv support - mbtowc(), mbstowcs(), mblen()... - and so on... You need be able to do like what you

Re: IBM AIX 5 and GB18030

2002-11-14 Thread Michael Yau
Markus, >The standard does _not_ require to _process_ internally in GB18030. It is sufficient to have a converter and to process in Unicode, which does contain all of >the characters. Just curious, do you have this in writing from the China standards body? - Michael Markus S

Re: IBM AIX 5 and GB18030

2002-11-14 Thread Michael \(michka\) Kaplan
From: "Carl W. Brown" <[EMAIL PROTECTED]> > Other companies > like Microsoft took a very big gamble and implemented the code for surrogate > support into Windows 2000 based on early drafts of the Unicode standard. If > they had not done it this way or had guessed wrong they might not even have > s

Re: IBM AIX 5 and GB18030

2002-11-14 Thread Markus Scherer
Jane Liu wrote: That may mean IBM AIX 5 supports conversion between GB18030 and Unicode, but I don't see this is a system level of support because there is no locale names for GB18030 in the doc of AIX 5 : The GB 18030 standard requires software to be able to _read and write_ text in the GB

Re: IBM AIX 5 and GB18030

2002-11-14 Thread Markus Scherer
string handling assume that the single-code-point type is the same as the string base unit. This one design point requires 32-bit wchar_t not just for Unicode but also for the character sets of EUC-TW and GB18030. You seem to suggest that there is a problem with 16-bit Unicode. It does take some

RE: IBM AIX 5 and GB18030

2002-11-14 Thread Carl W. Brown
Jane, One of the problems is that early Unicode adopters used the 16-bit UCS-2 encoding form of Unicode. Converting to UTF-16 requires surrogate support. Some of the GB18030 characters require this support. ICU is dedicated to Unicode support so a lot of effort is put into ICU to keep it up to
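
The surrogate support mentioned here comes down to a small piece of arithmetic: a code point above U+FFFF is split into two 16-bit units. The following is a minimal sketch of that calculation as defined for UTF-16; the function name is an illustrative assumption.

```python
from typing import Tuple


def to_surrogates(cp: int) -> Tuple[int, int]:
    """Split a supplementary code point into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF use surrogates")
    offset = cp - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)  # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits
    return high, low
```

For example, U+20000 (the first CJK Extension B character) becomes the pair D840 DC00. Code written for UCS-2 that treats each 16-bit unit as a whole character is what has to be adapted.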

Re: IBM AIX 5 and GB18030

2002-11-14 Thread Jane Liu
Thanks Mark! That may mean IBM AIX 5 supports conversion between GB18030 and Unicode, but I don't see this is a system level of support because there is no locale names for GB18030 in the doc of AIX 5 : http://publibn.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixbman/admnconc/locale.htm

Re: IBM AIX 5 and GB18030

2002-11-13 Thread Markus Scherer
xjliu_ca wrote: I have searched all the web on IBM about the support of GB18030 in OS AIX 4.3 and 5, but didn't find anything. I only can see they support GB2312 and GBK. Google found something for me: http://www-3.ibm.com/software/ts/mqseries/support/readme/aix530_read.html Search for &

IBM AIX 5 and GB18030

2002-11-13 Thread xjliu_ca
Dear I18N experts, I have searched all the web on IBM about the support of GB18030 in OS AIX 4.3 and 5, but didn't find anything. I only can see they support GB2312 and GBK. I know IBM was one of the pioneers to support GB18030, i.e. their ICU. But it doesn't make sense their A

Re: is GB18030 a combination of CJK and CJK extension?

2002-07-19 Thread James Kass
Sorry, second post, this looks like the standard can be downloaded now from on-line once you are a registered member of this site: (all-on-one-line:) http://www.sun.com/developers/gadc/technicalpublications/articles/gb18030.html Best regards, James Kass. - Original Message - From

Re: is GB18030 a combination of CJK and CJK extension?

2002-07-19 Thread James Kass
Zhang Weiwu wrote, > I cannot find GB18030 standard in local library, neither can I find it > anywhere on the Internet. I wish to know the standard itself. > > GB18030 contains about 27000 characters. CJK contains about 21000 characters > and CJK Extension A 6000 characters. (

is GB18030 a combination of CJK and CJK extension?

2002-07-19 Thread Zhang Weiwu
I cannot find the GB18030 standard in the local library, neither can I find it anywhere on the Internet. I wish to know the standard itself. GB18030 contains about 27000 characters. CJK contains about 21000 characters and CJK Extension A 6000 characters. (I don't remember the actual number.) It

Re: GB18030

2001-09-27 Thread David Starner
On Thu, Sep 27, 2001 at 03:03:22PM -0700, Yung-Fong Tang wrote: > David Starner wrote: > > > If you can't recognize the > > character, then just don't convert it. > > It could be the quality of other's software, we have higher standard however. Higher standard? If I'm working on "Old High Germa

Re: GB18030

2001-09-27 Thread Michael \(michka\) Kaplan
From: Yung-Fong Tang > Case mapping ? You have no way to generate mapping table for > case mapping with knowing the character unless you already > define those character have no case or only one case. Um, Unicode defines a behavior and even properties for unassigned code points. If you choose no

Re: GB18030

2001-09-27 Thread Yung-Fong Tang
  Markus Scherer wrote: Yung-Fong Tang wrote: > ... But you > still need to know what U+4ff3a to define such mapping table, right? Wrong. You just need to know the mapping between code points, whether assigned, used, or whatever. > ... So, whatever the software the user currently have today, with

Re: GB18030

2001-09-27 Thread Yung-Fong Tang
ok... you beat me :) David Starner wrote: > On Thu, Sep 27, 2001 at 12:27:11PM -0700, Yung-Fong Tang wrote: > > looks like I beat ICU by checkin my mapping table at April 9 (to > > mozilla) , 10 days before they check in their first version of GB18030 > > xml mapping tab

Re: GB18030

2001-09-27 Thread Yung-Fong Tang
you have the > > access to the specification and DOES it specify so? > > Do you not have access to the web? It took me 4 minutes to find the > information on the web. Start with www.google.com and type in GB18030, > and you'll find most of the information right there. Others

Re: GB18030

2001-09-27 Thread Markus Scherer
Yung-Fong Tang wrote: > ... But you > still need to know what U+4ff3a to define such mapping table, right? Wrong. You just need to know the mapping between code points, whether assigned, used, or whatever. > ... So, whatever the software the user currently have today, without an > upgrade (eith

Re: GB18030

2001-09-27 Thread Markus Scherer
orry for the confusion. > ... > looks like I beat ICU by checkin my mapping table at April 9 (to > mozilla) , 10 days before they check in their first version of GB18030 > xml mapping table :) I am sorry to disappoint you. ICU 1.7, released in December 2000, had the GB 18030 converter.

Re: GB18030

2001-09-27 Thread Yung-Fong Tang
I have filed a bug against Mozilla for this. See http://bugzilla.mozilla.org/show_bug.cgi?id=101998 I also submitted a patch there (see the bug report). Unfortunately, I don't have time to test it yet. It will be nice if someone can code review that change for me. Sun folks, do you care abou

Re: GB18030

2001-09-27 Thread David Starner
On Thu, Sep 27, 2001 at 12:27:11PM -0700, Yung-Fong Tang wrote: > looks like I beat ICU by checkin my mapping table at April 9 (to > mozilla) , 10 days before they check in their first version of GB18030 > xml mapping table :) I probably can still claim the first open source > p

Re: GB18030

2001-09-27 Thread David Starner
don't have the access to THE specification >itself and asking help to get one. Do you have the > access to the specification and DOES it specify so? Do you not have access to the web? It took me 4 minutes to find the information on the web. Start with www.google.com and type in GB18030, and you

Re: GB18030

2001-09-27 Thread Yung-Fong Tang
  Kenneth Whistler wrote: Frank, > You don't need to explain to me > the concept of GB18030. The question I have is about details mapping > information. Now, now, there's no need to get snippy with me. It sounded like you were unclear from the kinds of questions you were ask

Re: GB18030

2001-09-27 Thread Yung-Fong Tang
w how can you do that. > In particular, DOES GB18030 define code point to > code point mapping (beyond BMP) between Unicode? Unless you can said that is YES and show me the specification how to map between > them, there are no way people can implement code set conversion between GB18030

Re: GB18030

2001-09-27 Thread Michael \(michka\) Kaplan
From: "Yung-Fong Tang" <[EMAIL PROTECTED]> > Can anyone tell me where can I find a online version of the GB18030 > standard (yes, I want the STANDARD itself. Not someone's paper talk > about the standard) . Or anyone could tell me where to get a copy of the >

Re: GB18030

2001-09-27 Thread Kenneth Whistler
Frank, > You don't need to explain to me > the concept of GB18030. The question I have is about details mapping > information. Now, now, there's no need to get snippy with me. It sounded like you were unclear from the kinds of questions you were asking.

Re: GB18030

2001-09-27 Thread Yung-Fong Tang
ges to convert all the > other code points. I know. I already implement the Unicode BMP to GB18030 conversion (back and forth) in Mozilla. The 4 bytes GB18030 to Unicode BMP conversion only take me about 1488 bytes (see http://lxr.mozilla.org/seamonkey/source/intl/uconv/ucvcn/gb180304bytes.ut

Re: GB18030

2001-09-27 Thread Yung-Fong Tang
Sure I know it could (and will) be implemented by a mapping table. But you still need to know what U+4ff3a is to define such a mapping table, right? And the mapping table will still be part of the software package, right? And the user still won't get your new version of the mapping table until they upgra

Re: GB18030

2001-09-27 Thread Tom Emerson
GB 18030 is aligned to ISO 10646, which does not define the semantic properties that Unicode does. -- Tom Emerson Basis Technology Corp. Sr. Sinostringologist http://www.basistech.com "Beware the lollipop of mediocrity: lick

Re: GB18030

2001-09-26 Thread David Starner
do you do that for BMP characters? There's a whole lot you can do without knowing the identity of a character. You can draw the glyph from a font, which will suffice for a lot of purposes. > In particular, DOES GB18030 define code point to > code point mapping (beyond BMP) between Uni

Re: GB18030

2001-09-26 Thread David Starner
with the character if you don't have to (C10). GB18030, if it claims to support Unicode, needs to round-trip both characters. -- David Starner - [EMAIL PROTECTED] Pointless website: http://dvdeug.dhis.org When the aliens come, when the deathrays hum, when the bombers bomb, we'll still b

Re: GB18030

2001-09-26 Thread Michael \(michka\) Kaplan
From: "Geoffrey Waigh" <[EMAIL PROTECTED]> > It shouldn't require honest-to-goodness we-weren't-kidding > see-here's-one-defined-now characters In many cases, it did. > for developers to slap themselves on the head They did -- and they are slapping others around them, too. > and start devel

Re: GB18030

2001-09-26 Thread Kenneth Whistler
nd the code points associated with the characters, and not the encoded characters per se. (And this is a disease that was inflicted on the world 23 years ago when Kernighan and Ritchie published a certain language that unfortunately chose to call its 8-bit numeric data type a "char".

Re: GB18030

2001-09-26 Thread Geoffrey Waigh
On Wed, 26 Sep 2001, Yung-Fong Tang wrote: > how can you implement tolower(U+4ff3a) without knowing what U+4ff3a is ? With a data table. One set of debugged code that handles surrogates, composing characters, bidirectionality etc. coupled with a datafile that gets upgraded with each release of
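
The data-table approach described in this reply can be illustrated in a few lines. This is only a sketch with a tiny hand-picked table (a real implementation would load the full case-mapping data shipped with each Unicode release); the names `LOWER_MAP` and `to_lower` are illustrative assumptions.

```python
# Tiny illustrative excerpt of a case-mapping table, keyed by code point.
LOWER_MAP = {
    0x0041: 0x0061,    # LATIN CAPITAL LETTER A
    0x00C9: 0x00E9,    # LATIN CAPITAL LETTER E WITH ACUTE
    0x0410: 0x0430,    # CYRILLIC CAPITAL LETTER A
    0x10400: 0x10428,  # DESERET CAPITAL LETTER LONG I (beyond the BMP)
}


def to_lower(cp: int) -> int:
    """Return the lowercase mapping of a code point, or the code point itself."""
    # Code points without an entry, including unassigned ones, map to themselves.
    return LOWER_MAP.get(cp, cp)
```

With this shape, `to_lower(0x4FF3A)` simply returns `0x4FF3A`: the code does not need to know what the character is, and a later data update can add a mapping without any change to the logic.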

Re: GB18030

2001-09-26 Thread Yung-Fong Tang
David Starner wrote: > On Mon, Sep 24, 2001 at 06:18:19PM -0700, Yung-Fong Tang wrote: > > Markus Scherer wrote: > > > > > Correction: "to encode _all_ of Unicode", not just "all Unicode BMP" - GB 18030 >covers all 17 planes, not just the BMP. &

Re: GB18030

2001-09-26 Thread Yung-Fong Tang
Do you know where I can get the mapping table between GB18030 and Planes 1 to 16? I can only get the mapping between Plane 0 and GB18030. Tom Emerson wrote: > Yung-Fong Tang writes: > > Does GB18030 DEFINED the mapping between GB18030 and the rest of 11 > > planes? I don&#x

Re: GB18030

2001-09-26 Thread Yung-Fong Tang
how can you implement tolower(U+4ff3a) without knowing what U+4ff3a is ? [EMAIL PROTECTED] wrote: > In a message dated 2001-09-24 20:50:25 Pacific Daylight Time, > [EMAIL PROTECTED] writes: > > >> Does GB18030 DEFINED the mapping between GB18030 and the rest of 11 planes? >

Re: GB18030

2001-09-24 Thread DougEwell2
In a message dated 2001-09-24 20:50:25 Pacific Daylight Time, [EMAIL PROTECTED] writes: >> Does GB18030 DEFINED the mapping between GB18030 and the rest of 11 planes? >> I don't think so, since Unicode have not define them yet, right ? > > Unicode defined all the plane

Re: GB18030

2001-09-24 Thread Tom Emerson
Yung-Fong Tang writes: > Does GB18030 DEFINED the mapping between GB18030 and the rest of 11 > planes? I don't think so, since Unicode have not define them yet, > right ? Sure it does. We know what the code points are, even if they don't have characters assigned to them yet.
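
The point that the code points are known even without assigned characters can be made concrete: GB18030 maps U+10000..U+10FFFF onto consecutive four-byte sequences starting at 90 30 81 30, so the mapping is pure arithmetic. The following is a minimal sketch of that calculation; the function names and the constant name are illustrative assumptions.

```python
# Linear index of the four-byte sequence 90 30 81 30, which maps to U+10000.
FOUR_BYTE_BASE = 189000


def supplementary_to_gb18030(cp: int) -> bytes:
    """Encode a code point in U+10000..U+10FFFF as four GB18030 bytes."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("expected a supplementary code point")
    linear = FOUR_BYTE_BASE + (cp - 0x10000)
    linear, b4 = divmod(linear, 10)   # 4th byte: 0x30..0x39
    linear, b3 = divmod(linear, 126)  # 3rd byte: 0x81..0xFE
    b1, b2 = divmod(linear, 10)       # 2nd byte: 0x30..0x39, 1st byte: 0x81..
    return bytes([0x81 + b1, 0x30 + b2, 0x81 + b3, 0x30 + b4])


def gb18030_to_supplementary(seq: bytes) -> int:
    """Decode four GB18030 bytes back to a supplementary code point."""
    b1, b2, b3, b4 = seq
    linear = (((b1 - 0x81) * 10 + (b2 - 0x30)) * 126 + (b3 - 0x81)) * 10 + (b4 - 0x30)
    cp = linear - FOUR_BYTE_BASE + 0x10000
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("sequence is outside the supplementary range")
    return cp
```

The unassigned U+4FF3A discussed elsewhere in this thread goes through the same arithmetic as any assigned character, so no character table is involved for Planes 1 to 16.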

Re: GB18030

2001-09-24 Thread David Starner
On Mon, Sep 24, 2001 at 06:18:19PM -0700, Yung-Fong Tang wrote: > Markus Scherer wrote: > > > Correction: "to encode _all_ of Unicode", not just "all Unicode BMP" - GB 18030 >covers all 17 planes, not just the BMP. > > Does GB18030 DEFINED the mapping

Re: GB18030

2001-09-24 Thread Yung-Fong Tang
Markus Scherer wrote: > Yung-Fong Tang wrote: > > basically GB18030 is designed to encode all Unicode BMP in an encoding which is > > backward compatible with GB2312 and GBK. > > Correction: "to encode _all_ of Unicode", not just "all Unicode BMP" - GB 1

Re: GB18030

2001-09-24 Thread Markus Scherer
Yung-Fong Tang wrote: > basically GB18030 is designed to encode all Unicode BMP in an encoding which is > backward compatible with GB2312 and GBK. Correction: "to encode _all_ of Unicode", not just "all Unicode BMP" - GB 18030 covers all 17 planes, not just the BMP. markus

Re: GB18030

2001-09-21 Thread Yung-Fong Tang
Basically GB18030 is designed to encode all Unicode BMP in an encoding which is backward compatible with GB2312 and GBK. GB18030 was born because of those characters which are encoded in Unicode but in neither GB2312 nor GBK. Thierry Sourbier wrote: > Charlie, > > > In wh

RE: GB18030

2001-09-21 Thread Murray Sargent
I think I've figured out a way to find the beginning of a GB18030 character starting anywhere in a document. The algorithm is similar to finding the beginning of a DBCS character in that you scan backward until you find a byte that can only come at the start of a character. The main diffe
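
The message above is cut off before the details, so the following is not that algorithm but one possible sketch of the same general idea, under stated assumptions: the input is valid GB18030, bytes 0x00..0x2F, 0x3A..0x3F and 0x7F can never be trail bytes (so they always start a character), and after backing up to such a byte the text is walked forward to the character containing the requested position. All names are illustrative assumptions.

```python
def _never_trail(b: int) -> bool:
    """True for byte values that cannot be the 2nd, 3rd or 4th byte of a character."""
    return b <= 0x2F or 0x3A <= b <= 0x3F or b == 0x7F


def _char_len(data: bytes, i: int) -> int:
    """Length of the character starting at i, assuming i is a character boundary."""
    if data[i] < 0x80:
        return 1
    # A lead byte followed by a digit byte starts a four-byte sequence.
    if i + 1 < len(data) and 0x30 <= data[i + 1] <= 0x39:
        return 4
    return 2


def char_start(data: bytes, pos: int) -> int:
    """Return the offset of the first byte of the character containing data[pos]."""
    i = pos
    # Scan backward to a byte that is certainly a boundary (or the buffer start).
    while i > 0 and not _never_trail(data[i]):
        i -= 1
    # Walk forward one character at a time until the character covers pos.
    while i + _char_len(data, i) <= pos:
        i += _char_len(data, i)
    return i
```

The backward scan has to step over ASCII digits and letters as well, because in GB18030 those byte values also occur inside multi-byte characters; in text with frequent spaces or line breaks the scan stays short.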

RE: GB18030

2001-09-21 Thread Sampo Syreeni
On Fri, 21 Sep 2001, Carl W. Brown wrote: >Most systems that handle GB18030 will want to convert it to Unicode first >to reduce processing overhead. Unless we start seeing Chinese software which is designed to utilize the compatibility between 18030 and GBK -- font rendering apps a

RE: GB18030

2001-09-21 Thread Carl W. Brown
Charlie, GB18030 is designed to support all Unicode characters. It has the capacity to also encode additional characters. I know of no plans to do so. I don't think it will have much effect on Unicode. Most systems that handle GB18030 will want to convert it to Unicode first to r

Re: GB18030

2001-09-21 Thread Thierry Sourbier
wer your question on the relationship between GB18030 and Unicode. Cheers, Thierry. <><><><><><><><><><><><><><><><><><><><><><> www.i18ngurus.com - Open Internationalization Resources Directory

GB18030

2001-09-21 Thread Charlie Jolly
GB18030 In what ways will this affect Unicode? Does it contain anything that Unicode doesn't?

GB18030 summary and issues

2000-10-13 Thread Markus Scherer
Dear Uni-encoders and -decoders, Dirk Meyer from Adobe has put together an extensive summary of the Chinese GB 18030 encoding standard that was published on 2000-mar-17. Ken Lunde and I assisted Dirk with reviews and comments. The summary is on the web site of Ken's famous CJKV book "with the