Marc-Andre Lemburg <m...@egenix.com> added the comment:

Ryan McGuire wrote:
>
> New submission from Ryan McGuire <python....@enigmacurry.com>:
>
> Opening a UTF-8 encoded file with unix newlines ("\n") on Win32:
>
>     codecs.open("whatever.txt","r","utf-8").read()
>
> replaces the newlines ("\n") with CR+LF ("\r\n").
>
> The docs specifically say that:
>
> "Files are always opened in binary mode, even if no binary mode was
> specified. This is done to avoid data loss due to encodings using 8-bit
> values. This means that no automatic conversion of '\n' is done on
> reading and writing."
>
> And yet, opening the file with an explicit binary mode resolves the
> situation:
>
>     codecs.open("whatever.txt","rb","utf-8").read()
>
> This reads the file with the original newlines unmodified.
>
> The implementation of codecs.open and the documentation are out of sync.
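
One way to check the report is a short script along these lines. This is only a sketch: the helper name read_both_ways and the sample bytes are made up for illustration and are not taken from the report. It writes a file with bare "\n" newlines in binary mode, then reads it back through codecs.open with mode "r" and with mode "rb".

```python
import codecs
import os
import tempfile


def read_both_ways(raw=b"first line\nsecond line\n"):
    """Write raw bytes to a temp file, then read them back through
    codecs.open using mode "r" and mode "rb".

    Hypothetical helper for illustration; returns (text_r, text_rb).
    """
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        # Write in binary mode so the file really contains bare "\n".
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        # Mode without an explicit "b", as in the report.
        with codecs.open(path, "r", "utf-8") as f:
            text_r = f.read()
        # Mode with an explicit "b", the reported workaround.
        with codecs.open(path, "rb", "utf-8") as f:
            text_rb = f.read()
    finally:
        os.remove(path)
    return text_r, text_rb


text_r, text_rb = read_both_ways()
print(repr(text_r))
print(repr(text_rb))
print("identical:", text_r == text_rb)
```

If codecs.open forces binary mode as documented, both reads should return the same text with "\n" left alone. A difference between the two on Win32 would confirm the report.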
The implementation looks like this:

    if encoding is not None and \
       'b' not in mode:
        # Force opening of the file in binary mode
        mode = mode + 'b'

in both Python 2 and 3, so I'm not sure what could be causing this.

----------
nosy: +lemburg

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6788>
_______________________________________