Serhiy Storchaka added the comment:

It seems to me there is something wrong with your test. For example, decoding b'\x81\x8d' from CP1251 (as well as from any other code page!) gives you u'\x81\x8d', but codes 0x81 and 0x8D are assigned to different characters: 'Ѓ' (U+0403) and 'Ќ' (U+040C).
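As a rough illustration of the expected mapping, here is a minimal sketch that decodes the same two bytes with Python's built-in cp1251 codec and looks up the resulting character names. It assumes Python 3 (the u'...' notation above is Python 2 style), and the variable names are only illustrative. The Latin-1 decode at the end is included for comparison only, since Latin-1 maps every byte to the code point with the same number; it is not a claim about what the original test did.

```python
import unicodedata

# The two bytes discussed above.
raw = b'\x81\x8d'

# Decode with Python's built-in cp1251 codec.
decoded = raw.decode('cp1251')
names = [unicodedata.name(ch) for ch in decoded]

for byte, ch, name in zip(raw, decoded, names):
    print(f"0x{byte:02X} -> U+{ord(ch):04X} {name}")

# For comparison only: Latin-1 maps each byte to the code point
# with the same numeric value, so the bytes pass through unchanged.
latin1 = raw.decode('latin-1')
print(ascii(latin1))
```

If a test reports u'\x81\x8d' for these bytes, the result looks like a byte-for-byte passthrough rather than a CP1251 decode.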
0x81    0x0403    #CYRILLIC CAPITAL LETTER GJE
0x8D    0x040C    #CYRILLIC CAPITAL LETTER KJE

[1] https://en.wikipedia.org/wiki/Windows-1251
[2] http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1251.TXT
[3] http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit1251.txt

----------
nosy: +serhiy.storchaka

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue28712>
_______________________________________