Toshio Kuratomi added the comment:

Nick and I talked about this at a recent conference and came at it from different directions. On the one hand, Nick made the point that encoding surrogateescape'd text to bytes via a different encoding than the one used to decode it corrupts the data as a whole. On the other hand, I made the point that raising an exception when doing something as basic as printing a value of the text type reintroduces the issues that Python 2 had with unicode, bytes, and encodings, particularly because the exception is raised far from the source of the problem (the point where the data is introduced into the program).
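The two positions can be illustrated with a minimal sketch. The sample bytes and variable names below are assumptions chosen for illustration, not taken from the issue; the snippet only exercises the standard `surrogateescape` error handler.

```python
# Hypothetical sample: valid UTF-8 for "é" followed by a stray 0xff byte,
# standing in for a "string" handed over by the OS that is not valid text.
raw = b"\xc3\xa9\xff"

# Decoding with surrogateescape maps the undecodable byte to a lone surrogate.
text = raw.decode("utf-8", errors="surrogateescape")

# Same codec on the way out: the original bytes come back unchanged.
roundtrip = text.encode("utf-8", errors="surrogateescape")

# A different codec on the way out: no exception is raised, but the
# resulting bytes are no longer the data we were given.
mismatched = text.encode("latin-1", errors="surrogateescape")

# Strict encoding (what a plain write to a strict stream would attempt):
# the failure shows up here, at output time, not where the data came in.
try:
    text.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

Here `roundtrip` equals `raw`, while `mismatched` is a different byte sequence, and the strict encode raises `UnicodeEncodeError`.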
After some thought, Nick came up with this solution. The idea is that surrogateescape was originally accepted to allow roundtripping data from the OS and back when the OS considers it to be a "string" but Python does not consider it to be "text". When that's the case, we know which encoding was used in the attempt to construct the text in Python. If that same encoding is used to re-encode the data on the way back to the OS, then we're successfully roundtripping the data we were given in the first place. So this is just applying the original goal to another API.

----------
nosy: +a.badger

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18713>
_______________________________________