[issue12855] open() and codecs.open() treat form-feed differently
New submission from Matthew Boehm boehm.matt...@gmail.com:

A file opened with codecs.open() splits on a form feed character (\x0c) while a file opened with open() does not.

>>> with open('formfeed.txt', 'w') as f:
...     f.write('line \fone\nline two\n')
...
>>> with open('formfeed.txt', 'r') as f:
...     s = f.read()
...
>>> s
'line \x0cone\nline two\n'
>>> print s
line one
line two
>>> import codecs
>>> with open('formfeed.txt', 'rb') as f:
...     lines = f.readlines()
...
>>> lines
['line \x0cone\n', 'line two\n']
>>> with codecs.open('formfeed.txt', 'r', encoding='ascii') as f:
...     lines2 = f.readlines()
...
>>> lines2
[u'line \x0c', u'one\n', u'line two\n']

Note that lines contains two items while lines2 has 3.

Issue 7643 has a good discussion on newlines in python, but I did not see this discrepancy mentioned.

--
components: Interpreter Core
messages: 143182
nosy: Matthew.Boehm
priority: normal
severity: normal
status: open
title: open() and codecs.open() treat form-feed differently
type: behavior
versions: Python 2.7

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12855
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12855] open() and codecs.open() treat form-feed differently
STINNER Victor victor.stin...@haypocalc.com added the comment:

U+000C (Form feed) is treated as a line boundary in a Unicode string (unicode type), but not in a byte string (str type).

Example:

>>> u'line \x0cone\nline two\n'.splitlines(True)
[u'line \x0c', u'one\n', u'line two\n']
>>> 'line \x0cone\nline two\n'.splitlines(True)
['line \x0cone\n', 'line two\n']

--
nosy: +haypo
[issue12855] open() and codecs.open() treat form-feed differently
Matthew Boehm boehm.matt...@gmail.com added the comment:

Thanks for explaining the reasoning. Perhaps I should add this to the Python wiki (http://wiki.python.org/moin/Unicode)? It would be nice if it fit in the docs somewhere, but I'm not sure where.

I'm curious how (or if) 2to3 would handle this as well, but I'm closing this issue as it's now clear to me why these two are expected to act differently.
[issue12855] open() and codecs.open() treat form-feed differently
Changes by Matthew Boehm boehm.matt...@gmail.com:

--
resolution:  -> wont fix
status: open -> closed
[issue12855] open() and codecs.open() treat form-feed differently
STINNER Victor victor.stin...@haypocalc.com added the comment:

> It would be nice if it fit in the docs somewhere, but I'm not sure where.

See: http://docs.python.org/library/codecs.html#codecs.StreamReader.readline

Can you suggest a patch for the documentation? Source code of this document: http://hg.python.org/cpython/file/bb7b14dd5ded/Doc/library/codecs.rst
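For illustration, here is a minimal sketch of the behavior such a documentation note would have to describe. It is written for Python 3, which is an assumption, since the thread itself concerns Python 2.7. It reads the same bytes through a codecs stream reader and through io.TextIOWrapper, the text layer behind the built-in open() in Python 3. The variable names are made up for the example.

```python
import codecs
import io

data = b"line \x0cone\nline two\n"

# codecs stream reader: readlines() decodes the data and then splits it
# the way str.splitlines() does, so the form feed ends a line.
codec_lines = codecs.getreader("ascii")(io.BytesIO(data)).readlines()

# io text layer: with the default newline handling, only \n, \r and \r\n
# end a line, so the form feed stays inside the first line.
text_lines = io.TextIOWrapper(io.BytesIO(data), encoding="ascii").readlines()

print(codec_lines)
print(text_lines)
```

The two readers return three and two lines respectively, which mirrors the lines2 and lines results from the original report.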
[issue12855] open() and codecs.open() treat form-feed differently
Matthew Boehm boehm.matt...@gmail.com added the comment:

I'll suggest a patch for the documentation when I get to my home computer in an hour or two.

--
assignee:  -> docs@python
components: +Documentation -Interpreter Core
nosy: +docs@python
resolution: wont fix -> 
status: closed -> open
[issue12855] open() and codecs.open() treat form-feed differently
Matthew Boehm boehm.matt...@gmail.com added the comment:

I'm taking a look at the docs now. I'm considering adding a table/list of characters Python treats as newlines, but it seems like this might fit better as a note in http://docs.python.org/library/stdtypes.html#str.splitlines or somewhere else in stdtypes.

I'll start working on it now, but please let me know what you think about this. This is my first attempt at a patch, so I greatly appreciate your help so far.
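One way to draft such a table is to ask the interpreter directly rather than copy a list by hand. The sketch below assumes Python 3, where str is the Unicode type and bytes is the byte-string type. The helper names are hypothetical and not part of any proposed patch. It probes every single code point and every single byte value to see which ones splitlines() treats as a line boundary.

```python
import sys


def str_line_boundaries():
    """Return the single characters at which str.splitlines() splits."""
    return [chr(cp) for cp in range(sys.maxunicode + 1)
            if len(("a" + chr(cp) + "b").splitlines()) == 2]


def bytes_line_boundaries():
    """Return the single bytes at which bytes.splitlines() splits."""
    return [bytes([b]) for b in range(256)
            if len((b"a" + bytes([b]) + b"b").splitlines()) == 2]


str_boundaries = str_line_boundaries()
bytes_boundaries = bytes_line_boundaries()

# Print one boundary per row, as raw material for a documentation table.
for ch in str_boundaries:
    print("U+%04X" % ord(ch), repr(ch))
print(bytes_boundaries)
```

The probe covers single characters only, so the two-character sequence \r\n has to be listed separately in any resulting table.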