On Wed, Sep 26, 2001 at 06:17:15PM -0700, Yung-Fong Tang wrote:
> Sure Unicode defined those planes, but defining planes without defining the
> characters in it mean not too much to people. How can you implement case
> conversion, property mapping without knowing what is inside.

How do you do that for BMP characters? There's a whole lot you can do
without knowing the identity of a character. You can draw the glyph from
a font, which will suffice for a lot of purposes. 

> In particular, DOES GB18030 define code point to code point mapping (beyond
> BMP) between Unicode? Unless you can said that is YES and show me the
> specification how to map between them, there are no way people can implement
> code set conversion between GB18030 and Unicode.

Have you looked for the specification? Or are you just going to complain
on the list?

According to GNU libc, the algorithm for converting a Unicode character
ch outside the BMP to the four GB18030 bytes outptr (1 .. 4) is:

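        idx := ch + 16#1E248#;
        outptr (4) := (idx mod 10) + 16#30#;    -- fourth byte: 16#30# .. 16#39#
        idx := idx / 10;
        outptr (3) := (idx mod 126) + 16#81#;   -- third byte: 16#81# .. 16#FE#
        idx := idx / 126;
        outptr (2) := (idx mod 10) + 16#30#;    -- second byte: 16#30# .. 16#39#
        outptr (1) := (idx / 10) + 16#81#;      -- first byte; "/" is integer division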
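
For anyone who wants to check the arithmetic, here is a minimal Python
sketch of the same steps. The function name is made up for illustration
and is not anything in glibc; I am also assuming the remainder is what is
wanted for bytes 2 through 4, since each of them has to stay inside its
own range. It covers only U+10000 and above, not the BMP tables.

```python
def gb18030_from_supplementary(ch: int) -> bytes:
    """Encode a code point in U+10000..U+10FFFF as four GB18030 bytes.

    Hypothetical helper that mirrors the arithmetic in the message above;
    it is not glibc's actual function.
    """
    if not 0x10000 <= ch <= 0x10FFFF:
        raise ValueError("expected a code point outside the BMP")

    # Linear index into the four-byte space, counted from 0x81 0x30 0x81 0x30.
    idx = ch + 0x1E248

    idx, b4 = divmod(idx, 10)    # fourth byte cycles through 0x30..0x39
    idx, b3 = divmod(idx, 126)   # third byte cycles through 0x81..0xFE
    b1, b2 = divmod(idx, 10)     # second byte 0x30..0x39, first byte is the rest

    return bytes([b1 + 0x81, b2 + 0x30, b3 + 0x81, b4 + 0x30])


if __name__ == "__main__":
    # U+10000 is the first code point outside the BMP.
    print(gb18030_from_supplementary(0x10000).hex())  # 90308130
```

The result can be compared against Python's built-in "gb18030" codec,
for example chr(0x10000).encode("gb18030"), as an independent check.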
 

-- 
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
When the aliens come, when the deathrays hum, when the bombers bomb,
we'll still be freakin' friends. - "Freakin' Friends"
