And Clover <a...@doxdesk.com> added the comment:

> The problem is that codecs.open() forces binary mode on the underlying
> file object, and this defeats the U mode.

Actually, the problem is that it doesn't defeat it!

The function is documented to force binary, but it actually only does
"mode = mode + 'b'", which can leave you with a mode of 'rUb'. This mode
should be invalid but in practice the 'U' wins out, and causes the
expected problems for UTF-16 and some East Asian codecs.
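
To illustrate the paragraph above, here is a minimal sketch. It does not call
codecs.open() itself; forced_mode() and translate_newlines() are hypothetical
helpers that only imitate the mode handling described here and a byte-level
universal-newline translation, so the details are assumptions.

```python
def forced_mode(mode, encoding="utf-16-le"):
    # Hypothetical stand-in for the mode handling described above:
    # 'b' is appended, but an existing 'U' is left in place.
    if encoding is not None and "b" not in mode:
        mode = mode + "b"
    return mode


def translate_newlines(raw):
    # Rough imitation of universal-newline translation applied to raw
    # bytes before any decoding has happened.
    return raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


print(forced_mode("rU"))

# U+010D encodes in UTF-16-LE as the bytes 0x0D 0x01, so the first byte
# looks like a carriage return to a byte-level newline translator.
original = "\u010d"
raw = original.encode("utf-16-le")
damaged = translate_newlines(raw).decode("utf-16-le")
print(repr(original), "->", repr(damaged))
```

In this sketch the 0x0D byte is rewritten to 0x0A, so the text decodes to a
different character than the one that was written.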

Until such time as text/universal mode is supported at the overlying
decoded stream level, I suggest that 'U' should be .replace()d out of
the mode as well as 'b' being added, as the documentation would imply.
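
A sketch of what I mean, again with a hypothetical helper (sanitized_mode) in
place of the real code in codecs.open(), and assuming the caller passes an
ordinary mode such as 'r', 'rU' or 'rUb':

```python
def sanitized_mode(mode, encoding="utf-16"):
    # Hypothetical helper, not the actual codecs.open() source.
    if encoding is not None:
        # Drop 'U' so no newline translation happens on the raw bytes.
        mode = mode.replace("U", "")
        # Force binary mode, as the documentation says.
        if "b" not in mode:
            mode = mode + "b"
    return mode


for requested in ("r", "rU", "rUb", "w"):
    print(requested, "->", sanitized_mode(requested))
```

With that change the underlying file would always be opened in plain binary
mode when an encoding is given, whatever the caller asked for.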

----------
nosy: +aclover

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue691291>
_______________________________________