Raymond Mercier wrote:
Mark Shoulson writes
>their Super Font is bundled with Microsoft Office XP, and
> even Microsoft's prices haven't gotten that high!
From Microsoft,
http://www.microsoft.com/globaldev/DrIntl/columns/015/default.mspx :
"A font that contains Simp
In case you want to test
your GB18030 font, you can use Netscape 7 (or the latest Mozilla) and then
visit my GB18030 test pages at
http://people.netscape.com/ftang/testscript/gb18030/gb18030.cgi?page=10
It should be page-for-page compatible with the paper copy of the GB18030-2000
standard. I also
On 22/04/2004 10:04, Raymond Mercier wrote:
Eric,
Amazin' Amazon!! Now why didn't I think of that ?
In fact Amazon.co.uk says it is discontinued, so I would have to
get it from Amazon in the US. It is not the first time that the two
Amazons fail to connect.
Many thanks for the tip,
Raymo
Raymond Mercier wrote on 4/22/2004, 7:35 AM:
> I enquired about the 'super font' created by a Beijing foundry,
> http://font.founder.com.cn/english/web/index.htm, and am fairly
> astonished
> at the prices, as you see from the attached.
The cost of producing these fonts is much higher than p
riginal Message -
From:
Eric Muller
To: [EMAIL PROTECTED]
Sent: Thursday, April 22, 2004 5:40
PM
Subject: Re: GB18030 and super font
Raymond Mercier wrote:
But that
link to proofing tools leads nowhere. Maybe it's not so easy to get
the CHS version. In
Raymond Mercier wrote:
But that link to proofing tools leads nowhere. Maybe it's not so
easy to
get the CHS version.
Includes ~140 fonts, mostly for CJK, Arabic, Hebrew but other scripts
as well. Includes "Simsun (Founder Extended)" aka "宋体-方正超大字符集", with
65,531 glyph
From: "Mark E. Shoulson" <[EMAIL PROTECTED]>
> Raymond Mercier wrote:
>
> >I am intrigued by GB18030 encoding. There is a table of equivalences in
> >http://oss.software.ibm.com/cvs/icu/~checkout~/charset/data/xml/gb-18030-200
> >0.xml
> >No doubt Unih
Possibly they were quoting the price for one to be able
to bundle their font with software that you would sell.
Judging by the website, I don't think that their intent is
to sell directly to individual users. In that context, the
price doesn't seem unreasonable at all. When you
consider that hig
Mark Shoulson writes
>their Super Font is bundled with Microsoft Office XP, and
> even Microsoft's prices haven't gotten that high!
From Microsoft,
http://www.microsoft.com/globaldev/DrIntl/columns/015/default.mspx :
"A font that contains Simplified Chinese glyphs from
both CJK Extension A and B s
Raymond Mercier wrote:
I am intrigued by GB18030 encoding. There is a table of equivalences in
http://oss.software.ibm.com/cvs/icu/~checkout~/charset/data/xml/gb-18030-200
0.xml
No doubt Unihan will at some stage include these 2 & 4 byte values.
I enquired about the 'super font'
I am intrigued by GB18030 encoding. There is a table of equivalences in
http://oss.software.ibm.com/cvs/icu/~checkout~/charset/data/xml/gb-18030-200
0.xml
No doubt Unihan will at some stage include these 2 & 4 byte values.
I enquired about the 'super font' created by a Beijing
You can also use 'nsconv', which comes with the Mozilla source code, for GB18030.
See http://www.mozilla.org/projects/l10n/mlp_tools.html for details.
Zhang Weiwu wrote on 3/5/2004, 6:43 AM:
> Hello. I believe this must be a frequent question, but I googled around
> and I didn
Peter Jacobi wrote:
Hello. I believe this must be a frequent question, but I googled around
and I didn't find a satisfying tool. It seems most converters do GB2312
but not GB18030.
Both GNU libc iconv and GNU libiconv support GB18030. I assume the libiconv
distribution includes the co
> Hello. I believe this must be a frequent question, but I googled around
> and I didn't find a satisfying tool. It seems most converters do GB2312
> but not GB18030.
Both GNU libc iconv and GNU libiconv support GB18030. I assume the libiconv
distribution includes the comman
Hello. I believe this must be a frequent question, but I googled around
and I didn't find a satisfying tool. It seems most converters do GB2312
but not GB18030.
I have 100+ files to convert; the usual graphical or web-based converters
won't do the job well.
On my FreeBSD there is a p
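For a batch of files like the one described in this question, a scripted conversion is one option. What follows is only a minimal sketch and not something posted in this thread: it assumes Python 3 (whose standard codecs include `gb18030`), assumes UTF-8 is the wanted target, and uses a hypothetical helper name `convert_tree` and a hypothetical `*.txt` file pattern.

```python
from pathlib import Path


def convert_tree(src_dir, dst_dir, pattern="*.txt"):
    """Convert every matching GB18030 file under src_dir to UTF-8 under dst_dir.

    The helper name and the default pattern are illustrative assumptions.
    Returns the list of files written.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    written = []
    # sorted() materialises the file list before anything is written.
    for src in sorted(src_dir.rglob(pattern)):
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        # Decoding raises UnicodeDecodeError if a file is not valid GB18030,
        # so a damaged file stops the run instead of being silently mangled.
        text = src.read_bytes().decode("gb18030")
        dst.write_bytes(text.encode("utf-8"))
        written.append(dst)
    return written
```

Writing into a separate output directory keeps the originals untouched, which seems the safer choice when a hundred or more files are involved.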
Hi Will,
The ICU library is a good source for information like this. See:
http://oss.software.ibm.com/icu/charset/
The data table is located here:
http://oss.software.ibm.com/cvs/icu/charset/data/xml/gb-18030-2000.xml
Read the note on the first page.
There are official sources as well, but I
xt section, which taught me that I shouldn't
care.
John
Microsoft
-Original Message-
From: Doug Ewell [mailto:dewell@;adelphia.net]
Sent: Thursday, November 14, 2002 8:26 PM
To: Unicode Mailing List
Cc: Carl W. Brown
Subject: Re: UTF-16 vs UTF-32 (was IBM AIX 5 and GB18030
Carl W
Jane, you are right, I over-simplified. I tried to make the point that you need not _process_ text
in GB18030 but that Unicode processing and conversion to/from GB18030 fulfills the requirement to be
able to read and write GB18030 text.
Yes, you need to have font support for all the characters
Michael Yau wrote:
Markus,
> The standard does _not_ require to _process_ internally in GB18030. It
> is sufficient to have a converter and to process in Unicode, which does
> contain all of the characters.
Just curious, do you have this in writing from the China standards body?
I
Doug,
> > However, 16 bit characters were a hard enough sell in the good old
> > days. If we had started out withug 2bit characters we would still be
> > dreaming about Unicode.
>
> I think Carl meant "with 32-bit characters." I don't know what kind of
> word "withug" is (Old English?), but I li
Carl W. Brown wrote:
> Converting from UCS-2 to UTF-16 is just like converting from SBCS to
> DBCS. For folks who think DBCS it is no problem. Those who went from
> DBCS to Unicode to simplify their lives I am sure are not happy.
Ken made me laugh last March by referring to this as
"... a
Markus,
> You seem to suggest that there is a problem with 16-bit Unicode.
> It does take some effort to adapt
> UCS-2-designed functions for UTF-16, but it's not "rocket
> science" and works very well thanks to the
> Unicode allocation practice (common characters in the BMP).
> Making UTF-8/32 fu
[EMAIL PROTECTED] [mailto:unicode-bounce@;unicode.org]On
> Behalf Of Markus Scherer
> Sent: Thursday, November 14, 2002 9:18 AM
> To: unicode
> Subject: Re: IBM AIX 5 and GB18030
>
>
> Carl W. Brown wrote:
> > Some Unix systems adapted faster because the later Unicode
> adopter
:59 AM
To: Markus Scherer <[EMAIL PROTECTED]>, unicode <[EMAIL PROTECTED]>
cc:
Subject: Re: IBM AIX 5 and GB18030
Thanks Mark !
That may mean IBM AIX 5 supports conversion between GB18030 and
Unicode, but I don't see this as a system l
Mark,
I think a "converter" alone is not sufficient. How about the following
support:
- IME (to input CJK Ext.A characters through GB18030/Unicode code)
- X-Windows fonts support.
- iconv support
- mbtowc(), mbstowcs(), mblen()...
- and so on...
You need be able to do like what you
Markus,
> The standard does _not_ require to _process_ internally in GB18030. It
> is sufficient to have a converter and to process in Unicode, which does
> contain all of the characters.
Just curious, do you have this in writing from the China standards body?
- Michael
Markus S
From: "Carl W. Brown" <[EMAIL PROTECTED]>
> Other companies
> like Microsoft took a very big gamble and implemented the code for surrogate
> support into Windows 2000 based on early drafts of the Unicode standard. If
> they had not done it this way or had guessed wrong they might not even have
> s
Jane Liu wrote:
That may mean IBM AIX 5 supports conversion between GB18030 and
Unicode, but I don't see this as system-level support because
there are no locale names for GB18030 in the AIX 5 documentation:
The GB 18030 standard requires software to be able to _read and write_ text in the GB
string handling assume that the single-code-point type is the same as the string base unit.
This one design point requires 32-bit wchar_t not just for Unicode but also for the character sets
of EUC-TW and GB18030.
You seem to suggest that there is a problem with 16-bit Unicode. It does take some
Jane,
One of the problems is that early Unicode adopters used the 16-bit UCS-2
encoding form of Unicode. Converting to UTF-16 requires surrogate support.
Some of the GB18030 characters require this support. ICU is dedicated to
Unicode support so a lot of effort is put into ICU to keep it up to
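To make the surrogate requirement concrete, here is a small sketch of how a code point beyond the BMP is split into a UTF-16 surrogate pair. It is an illustration only, written in Python 3 as an assumption; the function name `to_surrogate_pair` is hypothetical and does not come from ICU or from this thread.

```python
from typing import Tuple


def to_surrogate_pair(cp: int) -> Tuple[int, int]:
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF need surrogates")
    offset = cp - 0x10000             # 20-bit value
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low


# U+20000, the first CJK Extension B code point, becomes D840 DC00.
print([f"{unit:04X}" for unit in to_surrogate_pair(0x20000)])
```

Code written for UCS-2 treats each 16-bit unit as a whole character, so it sees such a pair as two unknown characters unless it is adapted.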
Thanks Mark !
That may mean IBM AIX 5 supports conversion between GB18030 and
Unicode, but I don't see this as system-level support because
there are no locale names for GB18030 in the AIX 5 documentation:
http://publibn.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixbman/admnconc/locale.htm
xjliu_ca wrote:
I have searched all over IBM's web site for information about GB18030 support in
AIX 4.3 and 5, but didn't find anything. I can only see that they support
GB2312 and GBK.
Google found something for me:
http://www-3.ibm.com/software/ts/mqseries/support/readme/aix530_read.html
Search for &
Dear I18N experts,
I have searched all over IBM's web site for information about GB18030 support in
AIX 4.3 and 5, but didn't find anything. I can only see that they support
GB2312 and GBK.
I know IBM was one of the pioneers in supporting GB18030, i.e. with their ICU.
But it doesn't make sense their A
Sorry, second post: it looks like the standard can
now be downloaded online once you are a
registered member of this site:
(all-on-one-line:)
http://www.sun.com/developers/gadc/technicalpublications/articles/gb18030.html
Best regards,
James Kass.
- Original Message -
From
Zhang Weiwu wrote,
> I cannot find the GB18030 standard in the local library, neither can I find it
> anywhere on the Internet. I wish to know the standard itself.
>
> GB18030 contains about 27000 characters. CJK contains about 21000 characters
> and CJK Extension A 6000 characters. (
I cannot find the GB18030 standard in the local library, neither can I find it
anywhere on the Internet. I wish to know the standard itself.
GB18030 contains about 27000 characters. CJK contains about 21000 characters
and CJK Extension A 6000 characters. (I don't remember the actual number.) It
On Thu, Sep 27, 2001 at 03:03:22PM -0700, Yung-Fong Tang wrote:
> David Starner wrote:
>
> > If you can't recognize the
> > character, then just don't convert it.
>
> It could be the quality of other's software, we have higher standard however.
Higher standard? If I'm working on "Old High Germa
From: Yung-Fong Tang
> Case mapping? You have no way to generate a mapping table for
> case mapping without knowing the character, unless you already
> define those characters to have no case or only one case.
Um, Unicode defines a behavior and even properties for unassigned code
points. If you choose no
Markus Scherer wrote:
Yung-Fong Tang wrote:
> ... But you
> still need to know what U+4ff3a to define such mapping table, right?
Wrong. You just need to know the mapping between code points, whether
assigned, used, or whatever.
> ... So, whatever the software the user currently have today, with
ok... you beat me :)
David Starner wrote:
> On Thu, Sep 27, 2001 at 12:27:11PM -0700, Yung-Fong Tang wrote:
> > looks like I beat ICU by checkin my mapping table at April 9 (to
> > mozilla) , 10 days before they check in their first version of GB18030
> > xml mapping tab
you have the
> > access to the specification and DOES it specify so?
>
> Do you not have access to the web? It took me 4 minutes to find the
> information on the web. Start with www.google.com and type in GB18030,
> and you'll find most of the information right there. Others
Yung-Fong Tang wrote:
> ... But you
> still need to know what U+4ff3a to define such mapping table, right?
Wrong. You just need to know the mapping between code points, whether assigned, used,
or whatever.
> ... So, whatever the software the user currently have today, without an
> upgrade (eith
orry for the
confusion.
> ...
> looks like I beat ICU by checkin my mapping table at April 9 (to
> mozilla) , 10 days before they check in their first version of GB18030
> xml mapping table :)
I am sorry to disappoint you. ICU 1.7, released in December 2000, had the GB 18030
converter.
I have filed a bug against Mozilla for this; see
http://bugzilla.mozilla.org/show_bug.cgi?id=101998. I also submitted a patch there
(see the bug report). Unfortunately, I haven't had time to test it yet.
It would be nice if someone could code review that change for me.
Sun folks, do you care abou
On Thu, Sep 27, 2001 at 12:27:11PM -0700, Yung-Fong Tang wrote:
> looks like I beat ICU by checkin my mapping table at April 9 (to
> mozilla) , 10 days before they check in their first version of GB18030
> xml mapping table :) I probably can still claim the first open source
> p
don't have the access to THE specification
>itself and asking help to get one. Do you have the
> access to the specification and DOES it specify so?
Do you not have access to the web? It took me 4 minutes to find the
information on the web. Start with www.google.com and type in GB18030,
and you
Kenneth Whistler wrote:
Frank,
> You don't need to explain to me
> the concept of GB18030. The question I have is about details mapping
> information.
Now, now, there's no need to get snippy with me. It sounded
like you were unclear from the kinds of questions you were
ask
w how can you do that.
> In particular, DOES GB18030 define code point to
> code point mapping (beyond BMP) between Unicode? Unless you can said
> that is YES and show me the specification how to map between
> them, there are no way people can implement code set conversion between
> GB18030
From: "Yung-Fong Tang" <[EMAIL PROTECTED]>
> Can anyone tell me where can I find a online version of the GB18030
> standard (yes, I want the STANDARD itself. Not someone's paper talk
> about the standard) . Or anyone could tell me where to get a copy of the
>
Frank,
> You don't need to explain to me
> the concept of GB18030. The question I have is about details mapping
> information.
Now, now, there's no need to get snippy with me. It sounded
like you were unclear from the kinds of questions you were
asking.
ges to convert all the
> other code points.
I know. I have already implemented the Unicode BMP to GB18030 conversion (back
and forth) in Mozilla. The 4-byte GB18030 to Unicode BMP conversion
takes only about 1488 bytes (see
http://lxr.mozilla.org/seamonkey/source/intl/uconv/ucvcn/gb180304bytes.ut
Sure, I know it could (and will) be implemented by a mapping table. But you
still need to know what U+4ff3a is to define such a mapping table, right? And the
mapping table will still be part of the software package, right? And the user
still won't get your new version of the mapping table until they upgra
GB 18030 is aligned to ISO 10646, which does not define the semantic
properties that Unicode does.
--
Tom Emerson Basis Technology Corp.
Sr. Sinostringologist http://www.basistech.com
"Beware the lollipop of mediocrity: lick
do you do that for BMP characters? There's a whole lot you can do
without knowing the identity of a character. You can draw the glyph from
a font, which will suffice for a lot of purposes.
> In particular, DOES GB18030 define code point to
> code point mapping (beyond BMP) between Uni
with the character if you don't have to (C10). GB18030, if it
claims to support Unicode, needs to round-trip both characters.
--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
When the aliens come, when the deathrays hum, when the bombers bomb,
we'll still b
From: "Geoffrey Waigh" <[EMAIL PROTECTED]>
> It shouldn't require honest-to-goodness we-weren't-kidding
> see-here's-one-defined-now characters
In many cases, it did.
> for developers to slap themselves on the head
They did -- and they are slapping others around them, too.
> and start devel
nd the code points
associated with the characters, and not the encoded characters
per se. (And this is a disease that was inflicted on the world
23 years ago when Kernighan and Ritchie published a certain
language that unfortunately chose to call its 8-bit numeric
data type a "char".
On Wed, 26 Sep 2001, Yung-Fong Tang wrote:
> how can you implement tolower(U+4ff3a) without knowing what U+4ff3a is ?
With a data table. One set of debugged code that handles surrogates,
composing characters, bidirectionality etc. coupled with a datafile that
gets upgraded with each release of
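A tiny sketch of the data-table idea, for illustration only: the code never needs to know what a code point "is", it just looks it up, and anything absent from the table maps to itself. The table here is a hand-picked three-entry excerpt and the names `LOWERCASE_TABLE` and `to_lower` are hypothetical; a real table would be generated from the Unicode Character Database and shipped as data. Python 3 is assumed.

```python
# Hand-picked excerpt for illustration; real data would come from the
# Unicode Character Database and be replaced with each Unicode release.
LOWERCASE_TABLE = {
    0x0041: 0x0061,    # LATIN CAPITAL LETTER A
    0x0391: 0x03B1,    # GREEK CAPITAL LETTER ALPHA
    0x10400: 0x10428,  # DESERET CAPITAL LETTER LONG I (outside the BMP)
}


def to_lower(cp: int, table=LOWERCASE_TABLE) -> int:
    """Return the lowercase mapping of a code point, or the code point itself."""
    return table.get(cp, cp)


# An unassigned code point such as U+4FF3A simply maps to itself.
print(hex(to_lower(0x4FF3A)))
```

When a later Unicode version assigns a cased letter at a new code point, only the table changes; the lookup code stays the same.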
David Starner wrote:
> On Mon, Sep 24, 2001 at 06:18:19PM -0700, Yung-Fong Tang wrote:
> > Markus Scherer wrote:
> >
> > > Correction: "to encode _all_ of Unicode", not just "all Unicode BMP" - GB 18030
>covers all 17 planes, not just the BMP.
&
Do you know where I can get the mapping table between GB18030 and Planes 1 to
16? I can only get the mapping between Plane 0 and GB18030.
Tom Emerson wrote:
> Yung-Fong Tang writes:
> > Does GB18030 DEFINED the mapping between GB18030 and the rest of 11
> > planes? I don
how can you implement tolower(U+4ff3a) without knowing what U+4ff3a is ?
[EMAIL PROTECTED] wrote:
> In a message dated 2001-09-24 20:50:25 Pacific Daylight Time,
> [EMAIL PROTECTED] writes:
>
> >> Does GB18030 DEFINED the mapping between GB18030 and the rest of 11 planes?
>
In a message dated 2001-09-24 20:50:25 Pacific Daylight Time,
[EMAIL PROTECTED] writes:
>> Does GB18030 DEFINED the mapping between GB18030 and the rest of 11 planes?
>> I don't think so, since Unicode have not define them yet, right ?
>
> Unicode defined all the plane
Yung-Fong Tang writes:
> Does GB18030 DEFINED the mapping between GB18030 and the rest of 11
> planes? I don't think so, since Unicode have not define them yet,
> right ?
Sure it does. We know what the code points are, even if they don't
have characters assigned to them yet.
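As an illustration of why unassigned code points are not a problem: the four-byte GB18030 range 90 30 81 30 through E3 32 9A 35 maps linearly onto U+10000 through U+10FFFF, so the conversion for planes 1 to 16 can be computed rather than looked up. The sketch below assumes Python 3; the function name `supplementary_to_gb18030` and the constant name are hypothetical, and the result is checked against Python's built-in `gb18030` codec rather than against the printed standard.

```python
# Linear index of the four-byte sequence 90 30 81 30, which maps to U+10000.
SUPPLEMENTARY_BASE = 189000


def supplementary_to_gb18030(cp: int) -> bytes:
    """Compute the four-byte GB18030 form of a code point in planes 1..16."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("expected a code point in U+10000..U+10FFFF")
    index = SUPPLEMENTARY_BASE + (cp - 0x10000)
    index, b4 = divmod(index, 10)    # 4th byte: 0x30..0x39
    index, b3 = divmod(index, 126)   # 3rd byte: 0x81..0xFE
    b1, b2 = divmod(index, 10)       # 2nd byte: 0x30..0x39, 1st: 0x81..0xFE
    return bytes([0x81 + b1, 0x30 + b2, 0x81 + b3, 0x30 + b4])


# U+4FF3A has no character assigned, yet its byte sequence is well defined
# and agrees with the built-in codec.
cp = 0x4FF3A
print(supplementary_to_gb18030(cp).hex(), chr(cp).encode("gb18030").hex())
```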
On Mon, Sep 24, 2001 at 06:18:19PM -0700, Yung-Fong Tang wrote:
> Markus Scherer wrote:
>
> > Correction: "to encode _all_ of Unicode", not just "all Unicode BMP" - GB 18030
>covers all 17 planes, not just the BMP.
>
> Does GB18030 DEFINED the mapping
Markus Scherer wrote:
> Yung-Fong Tang wrote:
> > Basically, GB18030 is designed to encode all of the Unicode BMP in an encoding which is
> > backward compatible with GB2312 and GBK.
>
> Correction: "to encode _all_ of Unicode", not just "all Unicode BMP" - GB 1
Yung-Fong Tang wrote:
> Basically, GB18030 is designed to encode all of the Unicode BMP in an encoding which is
> backward compatible with GB2312 and GBK.
Correction: "to encode _all_ of Unicode", not just "all Unicode BMP" - GB 18030 covers
all 17 planes, not just the BMP.
markus
Basically, GB18030 is designed to encode all of the Unicode BMP in an encoding which is
backward compatible with GB2312 and GBK.
GB18030 was born because of those characters which are encoded in Unicode
but encoded in neither GB2312 nor GBK.
Thierry Sourbier wrote:
> Charlie,
>
> > In wh
I think I've figured out a way to find the beginning of a GB18030 character starting
anywhere in a document. The algorithm is similar to finding the beginning of a DBCS
character in that you scan backward until you find a byte that can only come at the
start of a character. The main diffe
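The message above is cut off, so the sketch below is not the poster's algorithm. It is a simpler substitute that rests on one property of valid GB18030 text: the byte values 0x00-0x2F, 0x3A-0x3F and 0x7F never occur as the second, third or fourth byte of a character. It backs up to such a byte (or to the start of the buffer) and then walks forward character by character. Python 3 is assumed, the function names are hypothetical, and malformed input is not handled.

```python
def _never_trailing(b: int) -> bool:
    """True for byte values that cannot be a 2nd, 3rd or 4th byte."""
    return b <= 0x2F or 0x3A <= b <= 0x3F or b == 0x7F


def _char_length(data: bytes, i: int) -> int:
    """Length of the character starting at data[i] (i must be a true start)."""
    b = data[i]
    if b < 0x81 or b == 0xFF:
        return 1    # ASCII, or a byte that is not a valid lead byte
    if i + 1 < len(data) and 0x30 <= data[i + 1] <= 0x39:
        return 4    # lead byte followed by a digit-range byte
    return 2


def char_start(data: bytes, pos: int) -> int:
    """Return the index of the first byte of the character containing data[pos]."""
    sync = pos
    while sync > 0 and not _never_trailing(data[sync]):
        sync -= 1   # back up to a byte that must begin a character
    i = sync
    while i + _char_length(data, i) <= pos:
        i += _char_length(data, i)   # skip characters that end before pos
    return i
```

The backward scan can be long when the text contains no spaces, punctuation or line breaks, which is the cost of keeping the logic this simple.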
On Fri, 21 Sep 2001, Carl W. Brown wrote:
>Most systems that handle GB18030 will want to convert it to Unicode first
>to reduce processing overhead.
Unless we start seeing Chinese software which is designed to utilize the
compatibility between 18030 and GBK -- font rendering apps a
Charlie,
GB18030 is designed to support all Unicode characters. It has the capacity
to also encode additional characters. I know of no plans to do so.
I don't think it will have much effect on Unicode. Most systems that handle
GB18030 will want to convert it to Unicode first to r
wer your question on the relationship between GB18030
and Unicode.
Cheers,
Thierry.
<><><><><><><><><><><><><><><><><><><><><><>
www.i18ngurus.com - Open Internationalization Resources Directory
GB18030
In what ways will this affect Unicode?
Does it contain anything that Unicode doesn't?
Dear Uni-encoders and -decoders,
Dirk Meyer from Adobe has put together an extensive summary of the Chinese GB 18030
encoding standard that was published on 2000-mar-17. Ken Lunde and I assisted Dirk
with reviews and comments.
The summary is on the web site of Ken's famous CJKV book "with the