SADAHIRO-san and cp9?? experts,

On Thursday, Mar 27, 2003, at 00:44 Asia/Tokyo, SADAHIRO Tomoyuki wrote:
+<U20AC> \x80 |0 # EURO SIGN

Is this right? Yes, U+20AC is indeed missing from cp936.ucm, but see this:


grep U20AC ucm/cp*.ucm
/Users/dankogai/work/Encode/ucm/cp1250.ucm:<U20AC> \x80 |0 # EURO SIGN
/Users/dankogai/work/Encode/ucm/cp1251.ucm:<U20AC> \x88 |0 # EURO SIGN
/Users/dankogai/work/Encode/ucm/cp1252.ucm:<U20AC> \x80 |0 # EURO SIGN
/Users/dankogai/work/Encode/ucm/cp1253.ucm:<U20AC> \x80 |0 # EURO SIGN
/Users/dankogai/work/Encode/ucm/cp1254.ucm:<U20AC> \x80 |0 # EURO SIGN
/Users/dankogai/work/Encode/ucm/cp1255.ucm:<U20AC> \x80 |0 # EURO SIGN
/Users/dankogai/work/Encode/ucm/cp1256.ucm:<U20AC> \x80 |0 # EURO SIGN
/Users/dankogai/work/Encode/ucm/cp1257.ucm:<U20AC> \x80 |0 # EURO SIGN
/Users/dankogai/work/Encode/ucm/cp1258.ucm:<U20AC> \x80 |0 # EURO SIGN
/Users/dankogai/work/Encode/ucm/cp874.ucm:<U20AC> \x80 |0 # EURO SIGN
/Users/dankogai/work/Encode/ucm/cp949.ucm:<U20AC> \xA2\xE6 |0 # EURO SIGN
/Users/dankogai/work/Encode/ucm/cp950.ucm:<U20AC> \xA3\xE1 |0 # EURO SIGN

\x80 SEEMS right for the single-byte CPs (cp1251 is the exception, with \x88), but EURO SIGN is mapped differently in CP949 (\xA2\xE6) and CP950 (\xA3\xE1).
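
To make that comparison easier to eyeball, here is a minimal Python sketch that groups the grep output above by the byte sequence assigned to <U20AC>. The names (GREP_OUTPUT, LINE_RE, group_by_bytes) are hypothetical and not part of Encode; the sample lines are copied from the output above with the directory part dropped.

```python
import re
from collections import defaultdict

# Lines copied from the grep output above (directory prefix removed).
GREP_OUTPUT = r"""
cp1250.ucm:<U20AC> \x80 |0 # EURO SIGN
cp1251.ucm:<U20AC> \x88 |0 # EURO SIGN
cp1252.ucm:<U20AC> \x80 |0 # EURO SIGN
cp1253.ucm:<U20AC> \x80 |0 # EURO SIGN
cp1254.ucm:<U20AC> \x80 |0 # EURO SIGN
cp1255.ucm:<U20AC> \x80 |0 # EURO SIGN
cp1256.ucm:<U20AC> \x80 |0 # EURO SIGN
cp1257.ucm:<U20AC> \x80 |0 # EURO SIGN
cp1258.ucm:<U20AC> \x80 |0 # EURO SIGN
cp874.ucm:<U20AC> \x80 |0 # EURO SIGN
cp949.ucm:<U20AC> \xA2\xE6 |0 # EURO SIGN
cp950.ucm:<U20AC> \xA3\xE1 |0 # EURO SIGN
"""

# file:<Uxxxx> \xNN[\xNN...] |flag
LINE_RE = re.compile(
    r"^(?P<file>[^:]+):<U(?P<cp>[0-9A-Fa-f]{4,6})>\s+"
    r"(?P<bytes>(?:\\x[0-9A-Fa-f]{2})+)\s+\|(?P<flag>\d)"
)


def group_by_bytes(grep_text):
    """Map each byte sequence (as written in the .ucm file) to the files using it."""
    groups = defaultdict(list)
    for line in grep_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # blank or unrecognized line
        name = m.group("file").rsplit("/", 1)[-1]
        groups[m.group("bytes")].append(name)
    return dict(groups)


groups = group_by_bytes(GREP_OUTPUT)
for byte_seq, files in sorted(groups.items()):
    print(byte_seq, "->", ", ".join(files))
```

Piping the real `grep U20AC ucm/cp*.ucm` output into group_by_bytes() should work the same way, since the file name is taken from whatever follows the last slash.
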
As far as I can tell from Microsoft's pages


http://www.microsoft.com/typography/unicode/cscp.htm ->
http://www.microsoft.com/globaldev/reference/wincp.mspx ->
http://www.microsoft.com/globaldev/reference/dbcs/936.htm

it indeed does use \x80 (though only \x00-\xFF are covered; where the heck is the FULL MAP!?). But it seems this applies only to 936. 932 (Japanese; Shift_JIS-based), 949 (Korean; EUC-KR-based) and 950 (Traditional Chinese; Big5-based) all leave \x80 blank.

I would like more confirmation from experts. cp936.ucm was overhauled with the help of MORIYAMA-san, and at that time the FULL map was available from the URIs above. I think \x80 was not used for EURO SIGN back then.

Oh, I still have a copy of the full mapping that was once available via the URIs above. Let's see...

cp936.txt says...
CODEPAGE 936 ; PRC GBK (XGB) - ANSI, OEM

CPINFO 2 0x3f 0x003f ; DBCS CP, Default Char = Question Mark

MBTABLE 130

0x00    0x0000  ;Null
[snip]
0x20    0x0020  ;Space
[snip]
0x7f    0x007f  ;^?
0x80    0x0080  ;<80>
0xff    0xf8f5  ;<FF>

\x80 is listed, but it maps to U+0080, not to EURO SIGN (U+20AC).
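
The same check can be done mechanically. Below is a rough Python sketch that parses only the MBTABLE lines quoted above and looks up 0x80. The helper name parse_mbtable and the constant names are made up for illustration, and the excerpt is just the snipped portion shown here, not the full table.

```python
# Excerpt of the MBTABLE section quoted above; "[snip]" lines are skipped.
MBTABLE_EXCERPT = """
0x00    0x0000  ;Null
[snip]
0x20    0x0020  ;Space
[snip]
0x7f    0x007f  ;^?
0x80    0x0080  ;<80>
0xff    0xf8f5  ;<FF>
"""

EURO_SIGN = 0x20AC


def parse_mbtable(text):
    """Return {byte: code point} for lines shaped like '0xNN  0xNNNN  ;comment'."""
    table = {}
    for line in text.splitlines():
        fields = line.split(";", 1)[0].split()
        if len(fields) != 2:
            continue  # blank line, "[snip]", or anything else that is not a pair
        try:
            byte, codepoint = (int(f, 16) for f in fields)
        except ValueError:
            continue  # not hexadecimal, ignore
        table[byte] = codepoint
    return table


table = parse_mbtable(MBTABLE_EXCERPT)
print("0x80 -> U+%04X" % table[0x80])
print("maps to EURO SIGN:", table[0x80] == EURO_SIGN)
```

Run against a complete cp936.txt, the same lookup would show whether any single byte is assigned to U+20AC in that version of the table.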


Could somebody please tell me where to find the FULL map?

Dan the Encode Maintainer with Too Many (Dead) Links to Follow


