Ma Lin added the comment:

I examined all the Chinese codecs I mentioned above, but I forgot that Taiwan and Hong Kong use Chinese as well.
BIG5 and CP950 are using a wrong conversion table. Test this:

>>> u = b'\xC6\xA1'.decode('big5')
>>> hex(ord(u))
'0x30fe'

This should not happen: 0xC6A1 is in neither BIG5 nor CP950. In BIG5-2003 and HKSCS-2008, 0xC6A1 is mapped to U+2460.

I only had a rough look, so please check further. I won't check the Hong Kong codec any more; I suggest checking it as well.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24117>
_______________________________________
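For anyone who wants to repeat the check above across the Traditional Chinese codecs, here is a minimal sketch. The helper name `probe` and the `CODECS` tuple are hypothetical and not part of CPython; the codec names `big5`, `cp950` and `big5hkscs` are the standard-library ones. It only reports what each codec returns on the running interpreter, and the result may differ between Python versions.

```python
# Hypothetical helper for illustration; not part of CPython.
# Codec names are the standard-library names for the encodings discussed.
CODECS = ("big5", "cp950", "big5hkscs")


def probe(raw, codec_names=CODECS):
    """Decode `raw` with each codec.

    Returns a dict mapping codec name to a string of code points
    such as 'U+30FE', or to None if the codec rejects the bytes.
    """
    results = {}
    for name in codec_names:
        try:
            text = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
            continue
        results[name] = " ".join(f"U+{ord(ch):04X}" for ch in text)
    return results


if __name__ == "__main__":
    # The byte pair from the report above.
    for name, mapped in probe(b"\xC6\xA1").items():
        print(f"{name:10} {mapped or 'not decodable'}")
```

Passing other byte pairs to `probe` gives a quick way to spot-check further entries of the table against the published BIG5-2003 and HKSCS-2008 mappings.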