> Lars Garshol wrote:
> 
> * Tom Emerson
> | 
> | As far as mapping tables go, the best one you'll find is the
> | Microsoft or ICU mapping tables. I personally have not seen an
> | official mapping table from GB 13000. As others have noted,
> | Microsoft has extended the "pure" GBK with Euro, and perhaps other
> | code points.
> 
> Hmmm. Does this mean that it is best to support the Microsoft
> extensions, or that it is best not to do so?  I guess we will be
> forced to support them sooner or later, and that we might as well do
> it now to save everyone some bother.

As others have already indirectly noted, the problem is that the Euro
is then "double-defined" within GBK, at code points GB 0x80 and GB 0xA2E3.
Consequently, round-trip conversion between GBK and the Unicode Euro
(U+20AC) is not possible without some form of code value transformation
on the return trip for one of these two GBK values.
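
To make the collision concrete, here is a minimal sketch in Python. The
two-entry table is a hypothetical excerpt that uses only the byte values
quoted in this thread; it is not taken from any published mapping table,
and the names DECODE, ENCODE and round_trip are made up for illustration.

```python
# Hypothetical two-entry excerpt of a GBK-to-Unicode table. The byte values
# are the ones quoted in this thread, not checked against a published table.
EURO = "\u20ac"
DECODE = {b"\x80": EURO, b"\xa2\xe3": EURO}

# The reverse table can keep only one byte sequence per Unicode character.
# Here the first entry seen wins, which is an arbitrary choice for the sketch.
ENCODE = {}
for raw, char in DECODE.items():
    ENCODE.setdefault(char, raw)


def round_trip(raw: bytes) -> bytes:
    """Decode GBK bytes to Unicode, then encode the result back to GBK."""
    return ENCODE[DECODE[raw]]


for raw in DECODE:
    status = "preserved" if round_trip(raw) == raw else "changed"
    print(raw.hex(), "->", round_trip(raw).hex(), status)
```

Whichever entry the reverse table keeps, the other GBK value comes back
changed, which is the round-trip loss described above.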

The alternative is to distinguish between the two forms of GBK,
supporting two forms of conversion: one to cp936 and the other to
"pure" GBK.
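
A rough sketch of that split, again in Python. Which Euro position
belongs to which variant is an assumption made purely for this
illustration, and the variant names and the helpers decode and encode are
hypothetical; real converters would load full mapping tables.

```python
EURO = "\u20ac"

# Hypothetical one-entry table excerpts. Assigning 0x80 to cp936 and 0xA2E3
# to "pure" GBK is an assumption for this sketch, not a statement of fact.
TABLES = {
    "cp936": {b"\x80": EURO},
    "gbk-pure": {b"\xa2\xe3": EURO},
}


def decode(raw: bytes, variant: str) -> str:
    """Map GBK bytes to a Unicode character using the named variant."""
    return TABLES[variant][raw]


def encode(char: str, variant: str) -> bytes:
    """Map a Unicode character back to GBK bytes using the named variant."""
    inverse = {c: r for r, c in TABLES[variant].items()}
    return inverse[char]


for variant, table in TABLES.items():
    for raw in table:
        # Each variant has a single Euro entry, so its round trip is exact.
        print(variant, raw.hex(), encode(decode(raw, variant), variant) == raw)
```

The cost is that the caller has to know which variant the data is in
before converting; the tables themselves cannot tell you.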

---

Out of curiosity, what does GB-18030 define for the Euro?  Does it 
define both a single-width and a double-width form?

If so, does it include any reference to how interoperability should 
be handled in conversions with Unicode (or for that matter, any 
character set which defines a single code value for this character)?  

(Lastly, throwing a lighted match onto gasoline...) If two forms are 
specified in GB-18030, should Unicode consider adding another code 
point in the fullwidth variant region to accommodate this?

- Sue
 
