[issue10694] zipfile.py end of central directory detection not robust

2018-09-25 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

This change significantly increased the probability of false positive detection 
in is_zipfile(). It was around 1/2**32, now it is around 1/2**16. It looks 
small, but not enough small if you test hundreds of thousands files. See 
issue28494.

--
nosy: +serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2011-06-09 Thread Roundup Robot

Roundup Robot devnull@devnull added the comment:

New changeset 7530b141c20f by R David Murray in branch '3.2':
#10694: zipfile now ignores garbage at the end of a zipfile.
http://hg.python.org/cpython/rev/7530b141c20f

New changeset 2e49722c7263 by R David Murray in branch 'default':
Merge #10694: zipfile now ignores garbage at the end of a zipfile.
http://hg.python.org/cpython/rev/2e49722c7263

New changeset cc3255a707c7 by R David Murray in branch '2.7':
#10694: zipfile now ignores garbage at the end of a zipfile.
http://hg.python.org/cpython/rev/cc3255a707c7

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2011-06-09 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

While I don't understand the zip file structure enough to do a definitive 
review of the patch, I'm willing to take processing 50K zip files successfully 
as a definitive test :)

--
resolution:  - fixed
stage: patch review - committed/rejected
status: open - closed
versions:  -Python 3.1

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2011-06-05 Thread Amar Takhar

Amar Takhar v...@aegisub.org added the comment:

This patch fixed every BadZipfile exception (~300) I got while testing over 
50,000 zipfiles. (Applied to 2.7)

It would be nice to see this bug resolved.

--
nosy: +verm

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2011-03-22 Thread bcroq

Changes by bcroq bertrand.c...@gmail.com:


--
nosy: +bcroq

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2011-03-22 Thread Bertrand Croq

Bertrand Croq bertrand.c...@gmail.com added the comment:

The last patch (manually applied to Python 2.6) fixed the problem with a ZIP 
file used in my project.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2011-03-22 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
nosy: +amaury.forgeotdarc
versions: +Python 3.3

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2011-03-21 Thread Xuanji Li

Xuanji Li xua...@gmail.com added the comment:

Hi, can someone look at the patch? There doesn't seem to be any objections to 
it and it'll solve a long-standing issue.

Thanks!

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2011-03-21 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@haypocalc.com:


--
nosy: +haypo

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2011-02-10 Thread rep

rep m...@mwcollect.org added the comment:

Xuanji: Patch looks good now with the test cases.

Regarding the other fix suggested by Kevin I think that we should not introduce 
custom behaviour here like appending the eofdata to the comment even though the 
comment size in the record does not say so. In the end we want to make sure 
that correct zip files end up being processed correctly with the specified 
comment size and incorrect zip files with eofdata should not be rejected / 
throw exceptions.

So I agree with Xuanji that his patch (or mine) is the way to go.

You probably agree that we can mark issue 10298 to be superseded by this one 
here.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2011-02-09 Thread Xuanji Li

Xuanji Li xua...@gmail.com added the comment:

I've attached a patch copying rep's patch from issue 10298, as i think it is 
cleaner (but it seems to do the same thing as kevin's patch) together with 2 
test cases. 

David Murray: I'm not really sure what you mean by test comments with trailing 
data; as I understand, the way to fix the problem is simply to remove the 
excessively strict check that the length the zip file claims to have is equal 
to the actual amount of comments found by truncating comments to the claimed 
size; if that's the case my (new) test should be enough to exercise that logic. 
Please correct me if I misunderstand.

--
Added file: http://bugs.python.org/file20723/reps_patch_and_two_tests.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2011-02-09 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

Xuanji: yes, your test_ignores_stuff_appended_past_comments is exactly what I 
was asking for.  I've put this patch on my review list, but I may not get to it 
until after 3.2 final.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2011-01-15 Thread Georg Brandl

Georg Brandl ge...@python.org added the comment:

Not important enough to block release.

--
priority: deferred blocker - critical

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2011-01-14 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
stage: unit test needed - patch review

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-31 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

I finally got around to researching this issue in the tracker.

Issue 10298 is a close relative to this issue.  The fix from that issue make 
the test that Xuanji added here pass.  That issue contains no testsit would 
be ideal to have tests that test the behavior in the face of actual comments in 
the zipfile, but even if all we have is Xuanji's test IMO we should apply one 
of the two fixes.

The 10298 patch takes the approach of ignoring the excess data but preserving 
the comment if any.  The author implies that that is what other tools do, so in 
the absence of input from Alan or other zipfile experts that's probably what we 
should go with.

Rep, could you look over this issue and indicate if you agree?

Note also issue 9239, which fixes one way that zipfile could create a zipfile 
with garbage at the end.

Then there is issue 1757072, where we hear some of Alan's thinking about this: 
a non-strict mode...but it is perhaps too late for a feature request, and 
there is the fact that ignoring the trailing data appears to be a de-facto 
standard.

And then we have issue 1757072, which is identical to this one and was closed 
won't fix, but apparently only because the source of the corrupted zip files 
wasn't identified, which this issue does do.

Interestingly, issue 669036 claims that zipfile.py supports garbage at the 
start, which makes tolerating garbage at the end seem sensibly symmetric.

Finally, comment support was added by the patch in issue 611760.  It would be 
interesting to know if garbage at the end was supported before that patch.  My 
guess is that it was, by ignoring it, but I haven't tested it.

Summary: if someone can review the actual patch, I think we should apply the 
issue 10298 patch along with Xuanji's test.  Xuanji, if you have the time and 
desire to add some additional tests that test comments with trailing data, that 
would be a bonus.  You could look at the tests in issue 9239 for ideas.

--
nosy: +rep, rfk

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-30 Thread Xuanji Li

Xuanji Li xua...@gmail.com added the comment:

Ok, new patch that creates a zipfile (actually I used TESTFN2, all the other 
tests seem to also use it)

--
Added file: http://bugs.python.org/file20200/Issue10694_2.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-21 Thread Xuanji Li

Xuanji Li xua...@gmail.com added the comment:

Hi KevinH, sorry if I gave the wrong impression that you must submit a patch 
before anyone will look at the bug. I thought you wanted to provide a patch for 
this. If you only want to report a bug that is ok, thanks for that.

I have confirmed your bug on the py3k branch. Attached is a patch that adds a 
test case to test_zipfile.py (as requested by others) and ports Kevin's patch 
to py3k. I have confirmed that his patch works: without it the test fails, and 
with it the test passes. Note: You have to create a file README.zip in the 
Lib/test directory, because svn diff cannot display added files.

--
Added file: http://bugs.python.org/file20127/Issue10694.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-21 Thread Xuanji Li

Xuanji Li xua...@gmail.com added the comment:

Sorry, forgot to add: README.zip should have a \r\n at the end. Actually I just 
zipped up README, ran echo \r\n  README.zip and put it in Lib/test.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-21 Thread Georg Brandl

Georg Brandl ge...@python.org added the comment:

High is not near high enough to get noticed before the release :)

--
nosy: +georg.brandl
priority: high - deferred blocker

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-21 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

Xuanji: thanks for taking a crack at the test.

Rather than adding another data file to the test directory, how about creating 
a zipfile using the zipfile module in the test, closing it, opening it as a 
file, writing the /r/n to it, and then opening it back up for read as a zipfile 
for the test.  That's more parallel to how the other zipfile tests work.

Given what we know I'd be inclined to accept Kevin's suggested fix, unless Alan 
speaks up against it.  In that case the test may as well check that the comment 
is what we expect, as well.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-20 Thread Kevin Hendricks

Kevin Hendricks kevin.hendri...@sympatico.ca added the comment:

I have not looked at how other tools handle this.  They could simply ignore 
what comes after a valid endrecdata is found, they could strip it out (truncate 
it) or make it into a final comment.  I guess for simply unpacking a zip 
archive, all of these are equivalent (it goes unused).

But if you are copying a zip archive to another archive then ignoring it and 
truncating it may be safer in some sense (since you have no idea what this 
extra data is for and why it might be attached) but then you are not being 
faithful to the original but at the same time you do not want to create 
improper zip archives.  If you change the extra data into a final comment, then 
at least none of the original data is actually lost (just moved slightly in the 
copied zip and protected as a comment) and could be recovered if it turns out 
to have been useful.  With so many things using/making the zip archive format 
(jars, normal zips, epubs, etc) you never know what might have been left over 
at the end of the zip file and if it was important.

So I am not really sure how to deal with this properly.  Also I know nothing 
about _EndRecData64 and if it needs to somehow be handled in a different way.

So someone who is very familiar with the code should look at this and tell us 
what is the right thing to do and even if the approach I took is correct (it 
works fine for me and I have taken to including my own zipfile.py in my own 
projects until this gets worked out) but it might not be the right thing to do.

As for a test case, I know nothing about that but will look at test_zipfile.py. 
 I am a Mac OS X user/developer so all of my code is targeted to running on 
both Python 2.5 (Mac OS X 10.5.X) and Python 2.6 (Mac OS 10.6.X). Python 3.X 
and even Python 2.7 are not on my horizon and not even on my build machine 
(trying to get Mac OS X users to install either would be an experience in 
frustration). I simply looked at the source in Python 2.7 and Python 3.1.3 
(from the official Python releases from python.org) to see that the problem 
still exists (and it does).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-20 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

It's pretty easy, really, to do an SVN checkout of python and compile it on a 
mac, if you are at all familiar with the unix command line.  If you don't have 
the time or desire for that, though, someone will eventually get to it, we just 
don't know when ;)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-20 Thread Kevin Hendricks

Kevin Hendricks kevin.hendri...@sympatico.ca added the comment:

Been programming on unix/vax and then linux since the mid 80s and on punch 
cards in the late 70s.  Grew my first beard writing 8080 and Z80 assembler.  
All of that was over 30 years ago. 

All I want to do is report a damn bug!

Then I get nudged for a test case (although how to recreate the bug was already 
described) so I add the two lines to create a test case.  

Then I get nudged for a patch, so I give a patch even though there are many 
ways to deal with the issue.

Then I get nudged for patches for other branches, then I get nudged for 
official test_zipfile.py patches.

All of this **before** the damn owner has even bothered to look at it and say 
if he/she even wants it or if the patch is even correct.

I have my own working code for the epub ebook stuff I am working on so this 
issue no longer impacts me.

How did I go from a simple bug report to having to build python's latest 
checkout just to get someone to look at the bug.

You have got to be kidding me!

I am done here,  do what you want with the bug.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-20 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

Sorry, I thought I was being clear that if you *wanted* to help further here 
was how you could, but if you didn't then we'd get to it eventually.  We're all 
volunteers here, just like you, so every bit of help...helps, and we thank you 
sincerely for taking the time to report the bug and provide the preliminary 
patch and test.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-19 Thread Ned Deily

Ned Deily n...@acm.org added the comment:

Thanks for the suggested code. As suggested, it would be helpful to supply a 
complete patch (or patches) ready to be reviewed and applied to the tips of the 
currently maintained source trees (py3k, 3.1, and 2.7) including adding an 
appropriate test case to Lib/test/test_zipfile.py.

--
nosy: +alanmcintyre, ned.deily
stage:  - needs patch
versions:  -Python 2.5, Python 2.6

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-19 Thread Kevin Hendricks

Kevin Hendricks kevin.hendri...@sympatico.ca added the comment:

Final patches against the trees make no sense as no developer has decided which 
way they want to actually handle the problem. 

My patch is only one way and I note it may not be the way the owners of the 
code want.

Also, this patch is very straight forward (one hunk) and should apply to 2.6, 
2.7, and 3.1 (although I have not tried it with 3.1) with only line offsets.

So if the owner of this code actually looks that the patch and the bug report 
and makes a decision on how they want to handle this issue and they like the 
patch I have suggested, then I would be happy to diff it against whatever 
zipfile.py versions he/she wants.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-19 Thread Ned Deily

Ned Deily n...@acm.org added the comment:

Not to belabor the point but the most useful part of a ready-to-go patch would 
be to have a fully automated test case that fits into the existing 
test_zipfile.py structure and that demonstrates the failure. Somebody has to 
write a test case one way or another because that's the first step towards 
getting a fix applied. That can be a big help as there are many more modules in 
the standard library than there are volunteer core developers to look at 
problems.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-19 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

Actually our normal procedure currently (this will change a bit after the 
migration to mercurial) is a patch against the py3k branch, and the committer 
will do the backport to the other active branches.  If the 2.7 code is very 
different, a separate 2.7 patch is sometimes helpful.  In this case it looks 
like a patch that applies to py3k is enough.

Alan hasn't spoken up yet, and he's listed as maintainer.  If he doesn't speak 
up someone else will hopefully make a decision before the end of the beta 
period for 3.2.  I'm bumping up the priority of this issue because I think not 
being able to handle commonly produced archives that other zip tools can handle 
is a relatively serious defect.

My own opinion is that truncation if the extra data doesn't look like a comment 
makes the most sense.  But absent an opinion from Alan I think I'd be guided by 
whatever other zip tools do in this case.  Kevin, am I correct in guessing that 
that is truncation?

Ned is definitely correct that absence of a test case is something that can 
delay getting a patch applied.  If a patch includes a unit test, a committer 
can often do a quick review-and-apply, while absent a unit test the committer 
would first need to write one, and therefore will often move on to some other, 
easier to process issue :)  In this case it seems to me that the unit test will 
only differ in one detail depending on the fix chosen, so it may be worth 
writing it even before a final decision is made, if you are willing.

--
nosy: +r.david.murray
priority: normal - high
stage: needs patch - unit test needed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-18 Thread Kevin Hendricks

Kevin Hendricks kevin.hendri...@sympatico.ca added the comment:

Same problem exists in Python 3.1.3 and in later versions as well.

Same patch should also work.

--
versions: +Python 3.1, Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-14 Thread Xuanji Li

Xuanji Li xua...@gmail.com added the comment:

Hi, I would like to take a look at whether your code works, can you provide an 
zip file that is currently not read by zipfile but should be?

Also, I suggest you submit the code changes as a patch 
(http://www.python.org/dev/patches/)

thanks!

--
nosy: +xuanji

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-14 Thread Kevin Hendricks

Kevin Hendricks kevin.hendri...@sympatico.ca added the comment:

If you read the bug report it explains how to generate a testcase (i.e. append 
any data to the end of a zip archive)

Here it is as a step by step process 

1. simply take any working zip and call it testcase.zip 

2. do the following:

echo \r\n  testcase.zip 

If you run unzip -t on testcase.zip it will pass with flying colors and will 
properly unzip on every piece of zip software I have tried.

However if you try to use python to copy the zip archive to another zip archive

python ./zipfix.py testcase.zip junk.zip
Error Occurred  File is not a zip file

All because of the appended carriage return / linefeed at the end.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-14 Thread Kevin Hendricks

Kevin Hendricks kevin.hendri...@sympatico.ca added the comment:

Here is one potential patch.  It simply incorporates and non-comment extraneous 
data into a final comment so that nothing is lost.  Another solution that might 
be safer, would be to truncate the zip archive immediately after the endrec is 
found if the extraneous data is not a properly formatted comment.  The right 
solution is obviously up to the developers of zipfile.py

--- zipfile_orig.py 2010-12-14 10:23:58.0 -0500
+++ zipfile.py  2010-12-14 10:30:21.0 -0500
@@ -228,6 +228,13 @@
 # structure present, so go look for it
 return _EndRecData64(fpin, start - filesize, endrec)
 return endrec
+else :
+# be robust to non-comment extaneous data after endrec
+# by making it a comment so that nothing is ever lost
+endrec[_ECD_COMMENT_SIZE] = len(comment)
+endrec.append(comment)
+endrec.append(maxCommentStart + start)
+return endrec
 
 # Unable to find a valid end of central directory structure
 return

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-14 Thread Kevin Hendricks

Changes by Kevin Hendricks kevin.hendri...@sympatico.ca:


--
keywords: +patch
versions: +Python 2.5, Python 2.6
Added file: http://bugs.python.org/file20040/more_robust.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10694] zipfile.py end of central directory detection not robust

2010-12-13 Thread Kevin Hendricks

New submission from Kevin Hendricks kevin.hendri...@sympatico.ca:

The current version of zipfile.py is not robust to slight errors at the end of 
zip archives.  Many file servers **improperly** append a new line to the end of 
files that do not have a new line when they are uploaded from a browser.  This 
bug ends up adding 0x0d 0xa to the end of the zip archive.  This in turn makes 
zipfile.py eventually throw a Not a zip file exception when no other zip 
tools seem to have trouble with them.  Even unzip -t passes these problem zip 
archives with flying colours.

I hate to have to extract and create my own zipfile.py script just to be robust 
to zip archives that are commonly found on the net and that are handled more 
robustly by other software.

So please consider changing this code from _EndRecData below to simply ignore 
any trailing data after the proper stringEndArchive and structEndArchive are 
found instead of looking for the comment and verifying if the comment is 
properly formatted and throwing an exception if not correct.  Ignoring the 
comment seems to be more robust in this case as everything needed to unpack 
the zip archive has been found.


# Either this is not a ZIP file, or it is a ZIP file with an archive
# comment.  Search the end of the file for the end of central directory
# record signature. The comment is the last item in the ZIP file and may be
# up to 64K long.  It is assumed that the end of central directory magic
# number does not appear in the comment.
maxCommentStart = max(filesize - (1  16) - sizeEndCentDir, 0)
fpin.seek(maxCommentStart, 0)
data = fpin.read()
start = data.rfind(stringEndArchive)
if start = 0:
# found the magic number; attempt to unpack and interpret
recData = data[start:start+sizeEndCentDir]
endrec = list(struct.unpack(structEndArchive, recData))
comment = data[start+sizeEndCentDir:]
# check that comment length is correct
if endrec[_ECD_COMMENT_SIZE] == len(comment):
# Append the archive comment and start offset
endrec.append(comment)
endrec.append(maxCommentStart + start)
if endrec[_ECD_OFFSET] == 0x:
# There is apparently a Zip64 end of central directory
# structure present, so go look for it
return _EndRecData64(fpin, start - filesize, endrec)
return endrec


This will in turn make the Python implementation of zipfile.py more robust to 
data improperly appended when some zip archives are uploaded or downloaded 
(similar to how other zip tools handle this issue).

Thank you for your time and consideration.

--
messages: 123891
nosy: KevinH
priority: normal
severity: normal
status: open
title: zipfile.py end of central directory detection not robust
type: behavior
versions: Python 2.7

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10694
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com