STINNER Victor <victor.stin...@haypocalc.com> added the comment:

> >> I also found out that, according to RFC 3629, surrogates
> >> are considered invalid and they can't be encoded/decoded,
> >> but the UTF-8 codec actually does it.
> >
> > Python2 does, but Python3 raises an error.
> > (...)
>
> I wonder how that change got into the 3.x branch - I would certainly
> not have approved it for the reasons given further up on this ticket.
>
> I think we should revert that change for Python 3.2.

See r72208 and issue #3672. pitrou wrote "We could fix it for 3.1, and
perhaps leave 2.7 unchanged if some people rely on this (for whatever
reason)."

----------
title: str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0 -> str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8271>
_______________________________________
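
For reference, the Python 3 behaviour discussed in this message (the strict UTF-8 codec rejecting surrogates) can be checked with a minimal sketch like the one below. It assumes a Python 3 interpreter recent enough to provide the "surrogatepass" error handler, which is not mentioned in this thread; the helper names are hypothetical and only serve the illustration.

```python
# A lone surrogate code point (U+D800) and the three bytes a lenient
# UTF-8 encoder would produce for it.
lone_surrogate = "\ud800"
surrogate_bytes = b"\xed\xa0\x80"


def strict_encode_fails(text):
    """Return True if the strict UTF-8 codec refuses to encode text."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


def strict_decode_fails(data):
    """Return True if the strict UTF-8 codec refuses to decode data."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


# Strict mode rejects surrogates in both directions, as RFC 3629 requires.
print(strict_encode_fails(lone_surrogate))
print(strict_decode_fails(surrogate_bytes))

# Assumption: the "surrogatepass" handler is available; it opts back in
# to the lenient behaviour for code that relies on it.
print(lone_surrogate.encode("utf-8", "surrogatepass") == surrogate_bytes)
```

Code that depends on the old lenient behaviour would have to request it explicitly through the error handler rather than get it from the default strict codec.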