STINNER Victor added the comment: > PAYLOAD.decode('utf8') passes in P2.7.* and fails in P3.4
Well, Python 2 decoder didn't respect the Unicode standard. Please see: http://unicodebook.readthedocs.org/issues.html#non-strict-utf-8-decoder-overlong-byte-sequences-and-surrogates Python 3 is now stricted. You can still decode surrogate characters if you need them *for a good reason* using: >>> b'\xed\xa0\x80'.decode('utf-8', 'surrogatepass') '\ud800' By they way, there is also: >>> b'\xed\xa0\x80'.decode('utf-8', 'surrogateescape') '\udced\udca0\udc80' which is very different but may also help. I suggest to close the issue as NOT A BUG. ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue26260> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com