On Wed, Sep 26, 2001 at 06:17:15PM -0700, Yung-Fong Tang wrote:
> Sure Unicode defined those planes, but defining planes without defining
> the characters in it mean not too much to people. How can you implement
> case conversion, property mapping without knowing what is inside.
How do you do that for BMP characters? There's a whole lot you can do
without knowing the identity of a character. You can draw the glyph from
a font, which will suffice for a lot of purposes.
> In particular, DOES GB18030 define code point to code point mapping
> (beyond BMP) between Unicode? Unless you can said that is YES and show
> me the specification how to map between them, there are no way people
> can implement code set conversion between GB18030 and Unicode.
Have you looked for the specification? Or are you just going to complain
on the list?
According to GNU libc, the algorithm for converting a Unicode character
ch outside the BMP into the four GB18030 bytes outptr (1 .. 4) is:
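idx := ch + 16#1E248#;                 -- linear index of the 4-byte sequence
outptr (4) := (idx mod 10) + 16#30#;   -- remainder gives the last byte
idx := idx / 10;
outptr (3) := (idx mod 126) + 16#81#;
idx := idx / 126;
outptr (2) := (idx mod 10) + 16#30#;
outptr (1) := (idx / 10) + 16#81#;     -- what is left is the first byte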
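
For anyone who wants to try the conversion, here is a minimal Python
sketch of it. Each trailing byte is the remainder of the running index,
and the quotient carries over to the next byte. The function name is my
own and is not taken from glibc; the range check is an added assumption
about how out-of-range input should be handled.

```python
def ucs_to_gb18030_supplementary(ch: int) -> bytes:
    """Encode a code point in U+10000..U+10FFFF as four GB18030 bytes."""
    if not 0x10000 <= ch <= 0x10FFFF:
        raise ValueError("expected a code point outside the BMP")
    idx = ch + 0x1E248           # linear index of the four-byte sequence
    idx, b4 = divmod(idx, 10)    # fourth byte: a digit, 0x30..0x39
    idx, b3 = divmod(idx, 126)   # third byte: 0x81..0xFE
    b1, b2 = divmod(idx, 10)     # second byte: a digit; first byte is the rest
    return bytes([b1 + 0x81, b2 + 0x30, b3 + 0x81, b4 + 0x30])


if __name__ == "__main__":
    # Print the encoding of the first supplementary code point as hex.
    print(ucs_to_gb18030_supplementary(0x10000).hex())
```

The result can be cross-checked against Python's built-in "gb18030"
codec, e.g. chr(0x10000).encode("gb18030").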
--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
When the aliens come, when the deathrays hum, when the bombers bomb,
we'll still be freakin' friends. - "Freakin' Friends"