Re: GB18030

Markus Scherer Thu, 27 Sep 2001 15:16:22 -0700

Yung-Fong Tang wrote:
> ... But you
> still need to know what U+4ff3a to define such mapping table, right?


Wrong. You just need to know the mapping between code points, whether they are 
assigned, in use, or neither.

> ... So, whatever the software the user currently have today, without an
> upgrade (either upgrade the code or mapping table) still won't know how to
> convert U+4ff3a to lower case or upper case, right ?

No, but that's irrelevant for character conversion. Once you update the Unicode 
character database in your product, your software will do it - if it knows how to deal 
with supplementary characters in general. (That part is a technicality which is, 
again, independent of whether there _are_ assigned characters.)
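To illustrate the "technicality" part, here is a minimal sketch in Python of the UTF-16 surrogate-pair arithmetic. The function name utf16_surrogates is made up for this example. The calculation works for any supplementary code point and never consults the character database, so it does not matter whether U+4FF3A is assigned.

```python
def utf16_surrogates(cp):
    """Return the (lead, trail) UTF-16 surrogate pair for a supplementary code point.

    Hypothetical helper for illustration; it is pure arithmetic on the
    code point value and does not check whether a character is assigned.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point: U+%04X" % cp)
    offset = cp - 0x10000            # 20-bit value
    lead = 0xD800 + (offset >> 10)   # top 10 bits
    trail = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits
    return lead, trail


lead, trail = utf16_surrogates(0x4FF3A)
print("U+4FF3A -> %04X %04X" % (lead, trail))
```

Case mapping for such a code point is a separate lookup that depends on the character database version; the encoding step above does not.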

> But how can you generate such mapping table without knowing that character ?

By specifying which _code point_ in one encoding gets mapped to which other _code 
point_ in the other encoding.
Character conversion never looks at whether the code points that it maps are actual 
_characters_.
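As a rough sketch of what this means for GB18030: the four-byte sequences from 90 30 81 30 through E3 32 9A 35 correspond linearly to U+10000 through U+10FFFF, so the mapping is plain arithmetic on code points. The Python function below (gb18030_four_byte is a name invented for this example) assumes only that linear range and knows nothing about which code points are assigned.

```python
def gb18030_four_byte(cp):
    """Map a supplementary Unicode code point to its GB18030 four-byte form.

    Hypothetical helper for illustration. Assumes the linear range in which
    90 30 81 30 corresponds to U+10000. Byte 1 and byte 3 each span 126
    values, byte 2 and byte 4 each span the 10 values 0x30..0x39.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point: U+%04X" % cp)
    index = cp - 0x10000
    index, b4 = divmod(index, 10)    # fourth byte: 0x30..0x39
    index, b3 = divmod(index, 126)   # third byte:  0x81..0xFE
    b1, b2 = divmod(index, 10)       # second byte: 0x30..0x39
    return bytes([0x90 + b1, 0x30 + b2, 0x81 + b3, 0x30 + b4])


# Works the same for an unassigned code point as for any other.
print(gb18030_four_byte(0x4FF3A).hex())
```

The result can be compared with Python's own codec via chr(0x4FF3A).encode("gb18030").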

When you map between the GBK or Shift-JIS user-defined areas and Unicode PUA or 
similar, then you also map code points that don't have characters. What's new?

> ...
> How many years does it take for people to realize that give a new mapping to
> their customer still need a complete life cycle of QA and distribution?  And
> there will be a new version number attach to the software for that.

Is this about the existence of supplementary characters again?
They have existed since 1996, and a vendor who followed the UTC/ISO negotiations could 
have seen them coming since 1993.
Surely nearly everyone has had time, in more than five years, to roll out a new 
release of their software with support for them?

(I know that few actually worked on this in time. But time there was.)


markus
