[issue6788] codecs.open on Win32 does not force binary mode
Changes by Terry J. Reedy tjre...@udel.edu:

--
status: pending -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6788
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue6788] codecs.open on Win32 does not force binary mode
Amaury Forgeot d'Arc amaur...@gmail.com added the comment:

I think your test is invalid: it creates the file in 'w' mode, so each \n is written to disk as the two bytes \r\n. codecs.open just reads them back.

--
nosy: +amaury.forgeotdarc
resolution: -> invalid
status: open -> pending
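The point about the test can be checked with a small sketch. It assumes Python 3, where passing newline="\r\n" to the built-in open() mimics on any platform what Win32 text mode does on write. The helper name roundtrip is hypothetical and not part of the uploaded codecs_bug.py.

```python
import codecs
import os
import tempfile


def roundtrip(write_text_mode):
    """Write two lines, then read them back through codecs.open."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        if write_text_mode:
            # Assumption: newline="\r\n" stands in for Win32 text mode,
            # so every "\n" written becomes "\r\n" on disk.
            with open(path, "w", encoding="utf-8", newline="\r\n") as f:
                f.write("one\ntwo\n")
        else:
            # Binary mode writes the bytes exactly as given.
            with open(path, "wb") as f:
                f.write("one\ntwo\n".encode("utf-8"))
        # codecs.open only decodes; it does no newline translation.
        with codecs.open(path, "r", "utf-8") as f:
            return f.read()
    finally:
        os.remove(path)


text_mode_result = roundtrip(write_text_mode=True)
binary_mode_result = roundtrip(write_text_mode=False)
print(repr(text_mode_result))
print(repr(binary_mode_result))
```

If the file is created in binary mode, the \n characters come back unchanged, which suggests the \r\n seen in the report was already on disk before codecs.open read it.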
[issue6788] codecs.open on Win32 does not force binary mode
Marc-Andre Lemburg m...@egenix.com added the comment:

Ryan McGuire wrote:
> New submission from Ryan McGuire python@enigmacurry.com:
>
> Opening a UTF-8 encoded file with unix newlines (\n) on Win32:
>
>     codecs.open("whatever.txt","r","utf-8").read()
>
> replaces the newlines (\n) with CR+LF (\r\n).
>
> The docs specifically say that:
>
> "Files are always opened in binary mode, even if no binary mode was
> specified. This is done to avoid data loss due to encodings using
> 8-bit values. This means that no automatic conversion of '\n' is done
> on reading and writing."
>
> And yet, opening the file with an explicit binary mode resolves the
> situation:
>
>     codecs.open("whatever.txt","rb","utf-8").read()
>
> This reads the file with the original newlines unmodified.
>
> The implementation of codecs.open and the documentation are out of sync.

The implementation looks like this:

    if encoding is not None and \
       'b' not in mode:
        # Force opening of the file in binary mode
        mode = mode + 'b'

in both Python 2 and 3, so I'm not sure what could be causing this.

--
nosy: +lemburg
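To make the quoted mode check easier to try in isolation, here is a minimal sketch that wraps the same condition in a standalone function. The name force_binary is hypothetical; the real check sits inline in codecs.open rather than in a separate helper.

```python
def force_binary(mode, encoding):
    """Return mode with 'b' appended when an encoding is given.

    Hypothetical helper mirroring the condition quoted from codecs.open.
    """
    if encoding is not None and "b" not in mode:
        # Force opening of the file in binary mode
        mode = mode + "b"
    return mode


print(force_binary("r", "utf-8"))   # encoding given, 'b' gets added
print(force_binary("rb", "utf-8"))  # already binary, left alone
print(force_binary("r", None))      # no encoding, left alone
```

Under this logic a mode of "r" with an encoding already ends up as "rb", so the "r" and "rb" calls in the report would be expected to open the file the same way.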
[issue6788] codecs.open on Win32 does not force binary mode
Ryan McGuire python@enigmacurry.com added the comment:

Uploading a doctest for this. The tests pass on Linux with Python 2.6. They fail on Win32 with Python 2.6.

--
Added file: http://bugs.python.org/file14788/codecs_bug.py
[issue6788] codecs.open on Win32 does not force binary mode
New submission from Ryan McGuire python@enigmacurry.com:

Opening a UTF-8 encoded file with unix newlines (\n) on Win32:

    codecs.open("whatever.txt","r","utf-8").read()

replaces the newlines (\n) with CR+LF (\r\n).

The docs specifically say that:

"Files are always opened in binary mode, even if no binary mode was specified. This is done to avoid data loss due to encodings using 8-bit values. This means that no automatic conversion of '\n' is done on reading and writing."

And yet, opening the file with an explicit binary mode resolves the situation:

    codecs.open("whatever.txt","rb","utf-8").read()

This reads the file with the original newlines unmodified.

The implementation of codecs.open and the documentation are out of sync.

--
assignee: georg.brandl
components: Documentation, Library (Lib)
messages: 91995
nosy: EnigmaCurry, georg.brandl
severity: normal
status: open
title: codecs.open on Win32 does not force binary mode
type: behavior
versions: Python 2.6, Python 3.1