[issue1159051] Handle corrupted gzip files with unexpected EOF
Roundup Robot added the comment: New changeset 854ba6f414a8 by Georg Brandl in branch '3.2': Issue #1159051: Back out a fix for handling corrupted gzip files that http://hg.python.org/cpython/rev/854ba6f414a8 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1159051 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue1159051] Handle corrupted gzip files with unexpected EOF
Changes by Georg Brandl ge...@python.org:
--
priority: release blocker -> normal
[issue1159051] Handle corrupted gzip files with unexpected EOF
Changes by Georg Brandl ge...@python.org:
--
versions: -Python 2.7, Python 3.2, Python 3.3
[issue1159051] Handle corrupted gzip files with unexpected EOF
Roundup Robot added the comment:

New changeset 9c2831fe84e9 by Georg Brandl in branch '3.3':
Back out patch for #1159051, which caused backwards compatibility problems.
http://hg.python.org/cpython/rev/9c2831fe84e9

New changeset 5400e8fbc1de by Georg Brandl in branch 'default':
null-merge reversion of #1159051 patch from 3.3
http://hg.python.org/cpython/rev/5400e8fbc1de
[issue1159051] Handle corrupted gzip files with unexpected EOF
Roundup Robot added the comment:

New changeset abc780332b60 by Benjamin Peterson in branch '2.7':
backout 214d8909513d for regressions (#1159051)
http://hg.python.org/cpython/rev/abc780332b60
[issue1159051] Handle corrupted gzip files with unexpected EOF
Serhiy Storchaka added the comment: In both cases the broken applications rely on undocumented implementation details. But such changes are too drastic for a bugfix release, and they should not be made without a special need. I propose reverting these changes in 2.7, 3.2 and 3.3 (possibly leaving them in the default branch). Unfortunately, I was not online right before the latest bugfix release and failed to do this. Fortunately, it is now possible to fix this in the regression fix releases.
--
nosy: +benjamin.peterson, georg.brandl
priority: normal -> release blocker
type: enhancement -> behavior
versions: +Python 2.7, Python 3.2, Python 3.3
[issue1159051] Handle corrupted gzip files with unexpected EOF
Matthias Klose added the comment: another test case failure with this patch: https://launchpad.net/ubuntu/+archive/test-rebuild-20130329/+build/4416983 reproducible with feedparser 5.1.3 from pypi, on x86 (but not x86_64).

ERROR: test_gzip_struct_error (__main__.TestCompression)
--
Traceback (most recent call last):
  File "./feedparsertest.py", line 433, in test_gzip_struct_error
    f = feedparser.parse('http://localhost:8097/tests/compression/gzip-struct-error.gz')
  File "/build/buildd/feedparser-5.1.2/feedparser/feedparser.py", line 3836, in parse
    data = gzip.GzipFile(fileobj=_StringIO(data)).read()
  File "/usr/lib/python2.7/gzip.py", line 253, in read
    while self._read(readsize):
  File "/usr/lib/python2.7/gzip.py", line 323, in _read
    self._read_eof()
  File "/usr/lib/python2.7/gzip.py", line 340, in _read_eof
    crc32, isize = struct.unpack("<II", self._read_exact(8))
  File "/usr/lib/python2.7/gzip.py", line 189, in _read_exact
    raise EOFError("Compressed file ended before the "
EOFError: Compressed file ended before the end-of-stream marker was reached
--
Ran 4237 tests in 5.190s

FAILED (errors=1)
Exception happened during processing of request from ('127.0.0.1', 43939)
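The failure above is the kind of breakage a caller sees when it handles only the older, undocumented exception. A minimal sketch of caller-side handling that tolerates both behaviours follows. It is written for current Python 3, the helper name `gunzip_or_none` is hypothetical, and it assumes the caller previously caught `struct.error`; it is not taken from feedparser.

```python
import gzip
import io
import struct


def gunzip_or_none(data):
    """Hypothetical helper: decompress gzip bytes, or return None if truncated.

    Catches EOFError (the documented error for a truncated stream) as well as
    struct.error (the older, unintentional error discussed in this thread).
    """
    try:
        return gzip.GzipFile(fileobj=io.BytesIO(data)).read()
    except (EOFError, struct.error):
        return None


good = gzip.compress(b"This is a test")
bad = good[:-2]  # cut into the 8-byte CRC32/ISIZE trailer

print(gunzip_or_none(good))
print(gunzip_or_none(bad))
```

Catching a tuple keeps the caller independent of which Python release it runs on, at the cost of also swallowing unrelated `struct.error` exceptions raised inside the `try` body.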
[issue1159051] Handle corrupted gzip files with unexpected EOF
Changes by Matthias Klose d...@debian.org:
--
priority: release blocker -> normal
[issue1159051] Handle corrupted gzip files with unexpected EOF
Serhiy Storchaka added the comment: I will be offline for some time. Feel free to revert these changes in 2.7-3.3 if necessary.
[issue1159051] Handle corrupted gzip files with unexpected EOF
Serhiy Storchaka added the comment: tuned_gzip does dangerous things: it overrides private methods of GzipFile. From the Bazaar 2.3 Release Notes:

* Stop using ``bzrlib.tuned_gzip.GzipFile``. It is incompatible with python-2.7 and was only used for Knit format repositories, which haven't been recommended since 2007. The file itself will be removed in the next release. (John Arbash Meinel)

The current version is 2.6b2. bzrlib.tuned_gzip.GzipFile should have been removed two releases ago.
[issue1159051] Handle corrupted gzip files with unexpected EOF
Matthias Klose added the comment: this change breaks a test case in the bzr testsuite; will try to get to it next week. See https://launchpad.net/bugs/1116079
--
nosy: +doko, larry
priority: normal -> release blocker
[issue1159051] Handle corrupted gzip files with unexpected EOF
Nadeem Vawda added the comment: I think the new behavior should be controlled by a constructor flag, maybe named defer_errors. I don't like the idea of adding the flag to read(), since that makes us diverge from the standard file interface. Making a distinction between size<0 and size=None seems confusing and error-prone, not to mention that we (again) would have read() work differently from most other file classes. I'd prefer it if the new behavior is not enabled by default for size>=0, even if this wouldn't break well-behaved code. Having a flag that only controls the size<0 case is inelegant, and I don't think we should change the default behavior unless there is a clear benefit to doing so.
[issue1159051] Handle corrupted gzip files with unexpected EOF
Serhiy Storchaka added the comment: Actually, the previous patch doesn't fix the original problem; it only ensures that GzipFile is consistent with BZ2File and LZMAFile. To fix the original problem we need another patch, and that patch looks like a new feature for 3.4. Here is a sample patch for LZMAFile. The BZ2File patch will be similar, and the GzipFile patch will be more different and complex. With it, an error is no longer raised immediately when the file being read ends unexpectedly, provided some data can be read. Instead, the largest possible part of the data read is returned, and raising the exception is deferred to the next read (see tests). Perhaps we need a new flag for the constructor or for read() that enables the new behavior (what would be a good name for this?). Or we can use a special value of the size argument that means "read to the end as much as possible" (we can differentiate the behavior for size<0 and size=None). Unconditionally enabling the new behavior for size>=0 is safe.
--
type: behavior -> enhancement
versions: -Python 2.7, Python 3.2, Python 3.3
Added file: http://bugs.python.org/file28809/lzma_deferred_error.patch
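The deferred-error idea described above can be sketched outside the standard library classes. The following is only an illustration under stated assumptions, not the attached lzma_deferred_error.patch: the class name `DeferredErrorReader` is hypothetical, it works on an in-memory bytes object, and it uses `zlib.decompressobj` in gzip mode rather than GzipFile.

```python
import gzip
import zlib


class DeferredErrorReader:
    """Hypothetical sketch: return salvageable data first, raise on the next read."""

    def __init__(self, data):
        self._data = data
        # MAX_WBITS | 16 makes zlib expect a gzip header and trailer.
        self._decomp = zlib.decompressobj(zlib.MAX_WBITS | 16)
        self._pending_error = None
        self._finished = False

    def read(self):
        # A previous read() returned partial data; report the error now.
        if self._pending_error is not None:
            err, self._pending_error = self._pending_error, None
            raise err
        if self._finished:
            return b""
        out = self._decomp.decompress(self._data)
        self._data = b""
        self._finished = True
        if not self._decomp.eof:
            # The stream stopped before its end-of-stream marker.
            err = EOFError("Compressed file ended before the "
                           "end-of-stream marker was reached")
            if not out:
                raise err
            self._pending_error = err  # defer until the next read()
        return out


payload = b"This is a test"
truncated = gzip.compress(payload)[:-2]

reader = DeferredErrorReader(truncated)
first = reader.read()  # the data that could be recovered
try:
    reader.read()
    deferred = None
except EOFError as exc:
    deferred = exc  # the error surfaces on the second call
```

The design choice mirrors the comment: a caller that only wants whatever data survived can stop after the first `read()`, while a caller that keeps reading still learns that the stream was truncated.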
[issue1159051] Handle corrupted gzip files with unexpected EOF
Roundup Robot added the comment:

New changeset 174332b89a0d by Serhiy Storchaka in branch '3.2':
Issue #1159051: GzipFile now raises EOFError when reading a corrupted file
http://hg.python.org/cpython/rev/174332b89a0d

New changeset 87171e88847b by Serhiy Storchaka in branch '3.3':
Issue #1159051: GzipFile now raises EOFError when reading a corrupted file
http://hg.python.org/cpython/rev/87171e88847b

New changeset f2f947cdc5fe by Serhiy Storchaka in branch 'default':
Issue #1159051: GzipFile now raises EOFError when reading a corrupted file
http://hg.python.org/cpython/rev/f2f947cdc5fe

New changeset 214d8909513d by Serhiy Storchaka in branch '2.7':
Issue #1159051: GzipFile now raises EOFError when reading a corrupted file
http://hg.python.org/cpython/rev/214d8909513d
--
nosy: +python-dev
[issue1159051] Handle corrupted gzip files with unexpected EOF
Serhiy Storchaka added the comment: Here is an updated patch addressing Nadeem Vawda's comments. Thank you.
--
Added file: http://bugs.python.org/file28795/gzip_eof-3.4_2.patch
[issue1159051] Handle corrupted gzip files with unexpected EOF
Nadeem Vawda added the comment: The updated patch looks good to me.
[issue1159051] Handle corrupted gzip files with unexpected EOF
Nadeem Vawda added the comment: I've reviewed the patch and posted some comments on Rietveld.

> I have doubts about backward compatibility. It's obvious that struct.error and TypeError are unintentional, and that EOFError is intended for this case. However, users may catch the undocumented but de facto exceptions and not expect EOFError.

I think it's fine for us to change it to raise EOFError in these cases.
[issue1159051] Handle corrupted gzip files with unexpected EOF
Serhiy Storchaka added the comment: At the moment gzip can raise two errors on unexpected EOF: struct.error from struct.unpack() or TypeError from ord(). Both bz2 and lzma raise EOFError in such cases. The proposed patch converts both truncated-gzip errors to EOFError, as for bz2 and lzma. I added similar tests for gzip, bz2 and lzma. I have doubts about backward compatibility. It's obvious that struct.error and TypeError are unintentional, and that EOFError is intended for this case. However, users may catch the undocumented but de facto exceptions and not expect EOFError.
--
nosy: +nadeem.vawda
stage: needs patch -> patch review
versions: +Python 3.3, Python 3.4 -Python 3.1
Added file: http://bugs.python.org/file28727/gzip_eof-3.4.patch
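For reference, the behaviour the patch aims for can be observed on a current Python 3 interpreter, where a truncated gzip stream is reported as EOFError. This is a small sketch rather than one of the tests from the patch; the function name `read_or_error` is made up for the illustration.

```python
import gzip
import io


def read_or_error(data):
    """Return the decompressed bytes, or the EOFError instance if truncated."""
    try:
        return gzip.GzipFile(fileobj=io.BytesIO(data)).read()
    except EOFError as exc:
        return exc


payload = b"This is a test"
compressed = gzip.compress(payload)
truncated = compressed[:-2]  # drop 2 bytes of the 8-byte CRC32/ISIZE trailer

print(type(read_or_error(compressed)).__name__)
print(type(read_or_error(truncated)).__name__)
```

Note that `read()` raises without handing back the bytes it had already decompressed, which is the gap the later deferred-error proposal in this thread tries to address.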
[issue1159051] Handle corrupted gzip files with unexpected EOF
Changes by Serhiy Storchaka storch...@gmail.com:
--
assignee: -> serhiy.storchaka
nosy: +serhiy.storchaka
[issue1159051] Handle corrupted gzip files with unexpected EOF
Yoav Weiss yee...@gmail.com added the comment: Why is the currently submitted patch not good enough, and why is the current stage "needs patch"? The current patch seems to solve this issue, which is a very common one when dealing with gzip files coming from the Internet. In any case, an indication of *why* the current patch is not good enough would help in creating a better patch that may be good enough.
--
nosy: +yv
[issue1159051] Handle corrupted gzip files with unexpected EOF
Changes by Mark Lawrence breamore...@yahoo.co.uk:
--
stage: -> needs patch
type: -> behavior
versions: +Python 2.7, Python 3.1, Python 3.2
[issue1159051] Handle corrupted gzip files with unexpected EOF
Daniel Diniz aja...@gmail.com added the comment: Confirmed on trunk with test_gzip_error.py:

struct.error: unpack requires a string argument of length 4
--
nosy: +ajaksu2
[issue1159051] Handle corrupted gzip files with unexpected EOF
Wummel added the comment: Here is a new test script that works with simple strings and no file objects. It reproduces the error by cutting off the last two bytes of the GZIP data. The resulting struct error occurs because the read() methods lack a check that the requested amount of data is actually returned. In this case read(4) returned 2 bytes instead of 4, and struct raises an error. I think the easiest way to handle this is to introduce a read_save(fileobj, size) method that checks that the read() data is of the requested size, and otherwise raises an error (perhaps an IOError?). By the way, you can remove the t.{gz,py} files; test_gzip_error.py replaces them.

Added file: http://bugs.python.org/file8610/test_gzip_error.py

    # test corrupted GZIP data
    import gzip
    import StringIO

    uncompressed = "This is a test"
    fileobj = StringIO.StringIO()
    gzipobj = gzip.GzipFile("test.gz", 'wb', 9, fileobj)
    gzipobj.write(uncompressed)
    gzipobj.close()
    # corrupt the .gz data: remove the last 2 bytes
    compressed = fileobj.getvalue()[:-2]
    # now uncompress again
    fileobj = StringIO.StringIO(compressed)
    print gzip.GzipFile('', 'rb', 9, fileobj).read()
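The size-checking helper suggested in this message might look roughly like the sketch below, written for Python 3. The name `read_exact` and the choice of EOFError are assumptions made for the illustration (the message itself proposes read_save and perhaps IOError; later comments in this thread settle on EOFError).

```python
import io


def read_exact(fileobj, size):
    """Read exactly `size` bytes from fileobj, or raise EOFError.

    A plain read(size) may return fewer bytes than requested, so keep
    reading until the request is satisfied or the file is exhausted.
    """
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = fileobj.read(remaining)
        if not chunk:
            raise EOFError("expected %d bytes, got only %d"
                           % (size, size - remaining))
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


print(read_exact(io.BytesIO(b"abcdef"), 4))
```

With such a helper, a short trailer is detected at the read itself, before a partial buffer reaches struct.unpack().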