On top of that, it looks like 950 maps a bogus symbol or punctuation 
character to U+2574. (U+2574 is one of a set of 4, and only 1 is mapped, for 
starters.) Fonts covering CP950 give a way different image for that 
character than you'd expect from either the charts or the names...
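
For anyone who wants to poke at this, here is a rough sketch, not a definitive check. It assumes Python's built-in "cp950" codec follows the Microsoft table closely enough to be representative, which may not hold for every version. The helper name cp950_sources is made up for illustration. It lists every two-byte sequence that decodes to a given code point, so more than one hit would also be a duplicate mapping of the kind that breaks roundtripping.

```python
import unicodedata

def cp950_sources(code_point):
    """Return every two-byte sequence that Python's cp950 codec decodes
    to the given code point (hypothetical helper, brute-force scan)."""
    target = chr(code_point)
    hits = []
    # Scan a generous double-byte range; unassigned pairs simply fail to decode.
    trail_bytes = list(range(0x40, 0x7F)) + list(range(0x80, 0xFF))
    for lead in range(0x81, 0xFF):
        for trail in trail_bytes:
            raw = bytes([lead, trail])
            try:
                if raw.decode("cp950") == target:
                    hits.append(raw)
            except UnicodeDecodeError:
                pass  # not assigned in this codec
    return hits

if __name__ == "__main__":
    # U+2574 and its three siblings, U+2575 through U+2577
    for cp in range(0x2574, 0x2578):
        sources = [raw.hex() for raw in cp950_sources(cp)]
        print(f"U+{cp:04X} {unicodedata.name(chr(cp))}: {sources}")
```

Comparing the printed character names against what a CP950 font actually draws at those byte values should show the mismatch.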

I let some people know about this, but fixing it would cause even more 
problems, one assumes.
A./

At 11:13 PM 12/18/01 -0500, Tex Texin wrote:
>Ken,
>
>Thanks for commiserating.
>Yes, I noticed the differences in mapping tables.
>I am glad Sybase gave different character sets different names.
>I am curious how you deal with Unicode and HKSCS in the private use
>area, sometimes....
>For that matter I wonder what a user in HK does when their Windows
>operating system is upgraded and their files that had HKSCS characters
>in the private use area now expect them in other locations.
>
>With respect to messy tables, and HKSCS and GB18030 in particular, it is
>a damn shame that there is no entity making a case to governments and
>others creating character set standards, that they not consider the set
>defined until it is registered to ISO and Unicode, so some of the silly
>mistakes get worked out first. A little press relations here, with
>recent history and resulting problems as evidence and the corrections
>that came about once registration was attempted, would show that working
>these things out in committee is helpful and not a threat to national
>sovereignty.
>
>Oh well. Surely this won't happen again in 2002....
>tex
>
>
>
>Kenneth Whistler wrote:
> >
> > Tex,
> >
> > >
> > > Thanks for this and the several private responses.
> > >
> > > For anyone interested, in addition to the Microsoft page:
> > > http://www.microsoft.com/hk/hkscs/
> > >
> > > The HK Gov't has a web page, fonts and mapping tables:
> > > http://www.info.gov.hk/digital21/eng/hkscs/introduction.html
> >
> > And to add to the chaos and confusion, note that the HKSCS
> > patch for Windows Code Page 950 does not map exactly the
> > same as the HK Government mapping table. And that the HK
> > Government mapping table has at least a couple of blatant
> > errors in it. And that the HKSCS patch for Windows Code Page 950
> > (like Code Page 950 without the extension, but even more so)
> > has duplicate mappings in it that need to be resolved in
> > order to roundtrip through Unicode. And you have no guarantee
> > that various vendors' attempts to sort out the HK Government
> > mapping table and Windows Code Page 950 + HKSCS patch behavior
> > will themselves produce matching results.
> >
> > >
> > > Oracle gave a nice paper at a recent Unicode conference:
> > > http://www.unicode.org/iuc/iuc18/papers/b19.ppt
> > >
> > > It amazes me that in the year 2000, organizations are still creating
> > > chaos by amending definitions of standards especially code pages,
> > > without giving the new creation its own name or some other way of
> > > distinguishing it, and then on top of that creating multiple mapping
> > > tables.
> > >
> > > I understand the desire to get new functionality into users hands, but
> > > would it have been a problem to rename either big5 or 950 to something
> > > like big-6 or big-5hk or 950HK or 951?
> >
> > Sybase is now supporting "cp950" (+euro, by the way -- another addition
> > that may or may not be supported in a particular Windows implementation,
> > depending on date) and a separate "big5hk", so if you interoperate
> > with Sybase, you should know what you are getting. However, like
> > everybody else, it is hit or miss for us when a platform or other
> > data announces itself to us as "cp950" or "big-5", whether it
> > is with or without the HKSCS extensions.
> >
> > > So now we can't tell if big-5 or 950 will or won't have this data, or
> > > even whether Unicode data will have these characters in the private use
> > > area or elsewhere, or whether software that may be on the other end of
> > > the pipe supports HKSCS or not, or even if their operating system has
> > > the patch or not.
> > >
> > > Although "that which we call a rose by any other name would smell as
> > > sweet",
> > > calling everything a rose, makes it hard to know when you are getting a
> > > rose.
> >
> > I think this was all part of a conspiracy for Chinese to catch up
> > with Japanese, since the Chinese code pages (until now) didn't have
> > a mess the scale of SJIS. But between HKSCS and GB 18030, they are
> > making up for lost time.
> >
> > --Ken
> >
> > >
> > > Here's hoping for less chaos in 2002!
> > > tex
>
>--
>-------------------------------------------------------------
>Tex Texin                    Director, International Business
>mailto:[EMAIL PROTECTED]    Tel: +1-781-280-4271
>the Progress Company         Fax: +1-781-280-4655
>-------------------------------------------------------------
>For a compelling demonstration for Unicode:
>http://www.geocities.com/i18nguy/unicode-example.html

