Xiang Zhang added the comment:

The table in Wikipedia is somewhat complex. I found ftp://ftp.software.ibm.com/software/globalization/documents/gb18030m.pdf, and the table in it is the same as the one at https://pan.baidu.com/share/link?shareid=2606985291&uk=3341026630 (except for 0x80), but in English. I agree with Ma Lin that byte sequences like b'\x81\x30\xFF\x30' are invalid.
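To make the point concrete, here is a rough sketch of a structural check for a GB18030 four-byte sequence. It assumes the usual layout (first and third bytes in 0x81-0xFE, second and fourth bytes in 0x30-0x39). The helper name is_wellformed_four_byte is hypothetical and is not part of the codec. Passing this check is necessary but not sufficient, since it does not verify that the sequence maps to an assigned code point.

```python
def is_wellformed_four_byte(seq: bytes) -> bool:
    """Return True if seq has the byte-range layout of a GB18030 four-byte sequence.

    Hypothetical helper for illustration only; it checks byte ranges and
    nothing else (no lookup against the mapping table).
    """
    if len(seq) != 4:
        return False
    b1, b2, b3, b4 = seq
    return (
        0x81 <= b1 <= 0xFE      # first byte: lead byte range
        and 0x30 <= b2 <= 0x39  # second byte: ASCII digit range
        and 0x81 <= b3 <= 0xFE  # third byte: same range as the first
        and 0x30 <= b4 <= 0x39  # fourth byte: ASCII digit range
    )


# The sequence from this issue fails because its third byte is 0xFF.
print(is_wellformed_four_byte(b'\x81\x30\xff\x30'))
# A sequence with all four bytes in range passes the structural check.
print(is_wellformed_four_byte(b'\x81\x30\x81\x30'))
```

A decoder that applied a range check like this to the third byte would reject the sequence instead of accepting it.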
With the current implementation, you can see that the sequence does not round-trip:

>>> invalid = b'\x81\x30\xff\x30'
>>> invalid.decode('gb18030').encode('gb18030') == invalid
False

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29990>
_______________________________________