Marc-Andre Lemburg <m...@egenix.com> added the comment:

Ryan McGuire wrote:
> 
> New submission from Ryan McGuire <python....@enigmacurry.com>:
> 
> Opening a UTF-8 encoded file with unix newlines ("\n") on Win32:
> 
> codecs.open("whatever.txt","r","utf-8").read()
> 
> replaces the newlines ("\n") with CR+LF ("\r\n").
> 
> The docs specifically say that:
> 
> "Files are always opened in binary mode, even if no binary mode was
> specified. This is done to avoid data loss due to encodings using 8-bit
> values. This means that no automatic conversion of '\n' is done on
> reading and writing."
> 
> And yet, opening the file with an explicit binary mode resolves the
> situation:
> 
> codecs.open("whatever.txt","rb","utf-8").read()
> 
> This reads the file with the original newlines unmodified.
> 
> The implementation of codecs.open and the documentation are out of sync.

The implementation looks like this:

    if encoding is not None and \
       'b' not in mode:
        # Force opening of the file in binary mode
        mode = mode + 'b'

in both Python 2 and 3, so I'm not sure what could be causing
this.
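
One way to narrow this down would be to compare the two modes directly
on the affected machine. The following is only a sketch: the helper name
read_modes and the sample data are made up for illustration, and it
assumes the test file is written in binary mode so that it really does
contain bare "\n" bytes on disk.

```python
import codecs
import os
import tempfile


def read_modes(data=b"line one\nline two\n"):
    """Write `data` byte-for-byte, then read it back through
    codecs.open() using mode 'r' and mode 'rb'."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        # Binary write: no newline translation can happen here.
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        results = {}
        for mode in ("r", "rb"):
            with codecs.open(path, mode, "utf-8") as f:
                results[mode] = f.read()
        return results
    finally:
        os.remove(path)


results = read_modes()
print(repr(results["r"]))
print(repr(results["rb"]))
print("modes agree:", results["r"] == results["rb"])
```

If the mode rewrite quoted above is taking effect, both reads should
return the same text. If they differ, it would be worth checking which
codecs.py is actually being imported on that installation.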

----------
nosy: +lemburg

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6788>
_______________________________________
_______________________________________________
Python-bugs-list mailing list