[issue34606] Unable to read zip file with extra

2019-06-25 Thread Andreas Gäer

Andreas Gäer  added the comment:

I'm a little bit puzzled that this bug is closed as "not a bug" despite the 
fact that all versions of python allow the user to create "invalid" ZIP files 
and all version up to 3.7 are able to read those "invalid" files.

Currently we're stuck with a lot of already created ZIP files that cannot be 
read or fixed after an update to 3.7.

Would it be an option to make decoding of extra data optional eg. with an extra 
keyword arg to __init__?

--
nosy: +Andreas.Gäer

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34606] Unable to read zip file with extra

2018-09-12 Thread Kyle Altendorf


Kyle Altendorf  added the comment:

Vladimir, if compression were the cause wouldn't the extra bytes I added 
(signature and length) not have any effect?  If you've found an issue it seems 
like it would be a different one than I was triggering.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34606] Unable to read zip file with extra

2018-09-11 Thread Vladimir Matveev


Vladimir Matveev  added the comment:

In this particular case looks like a crux of the problem was in the fact that 
compression encodes extra fields only if either zip64 is set or length of the 
field is larger than threshold but decompression always tries to decode it. 
Attached PR switches decoding to be conditioned on the presence of zip64 end of 
central directory record.

--
nosy: +v2m

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34606] Unable to read zip file with extra

2018-09-07 Thread Vladimir Matveev


Change by Vladimir Matveev :


--
pull_requests: +8561

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34606] Unable to read zip file with extra

2018-09-07 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

Right, the correct section is 4.5. All integers in ZIP files are unsigned and 
in the little endian format.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34606] Unable to read zip file with extra

2018-09-07 Thread Kyle Altendorf


Kyle Altendorf  added the comment:

Python 3.7 works with 2-byte elements, I managed to find the wrong section in 
the doc-linked docs.

4.5 Extensible data fields
--

   4.5.1 In order to allow different programs and different types
   of information to be stored in the 'extra' field in .ZIP
   files, the following structure MUST be used for all
   programs storing data in this field:

   header1+data1 + header2+data2 . . .

   Each header should consist of:

   Header ID - 2 bytes
   Data Size - 2 bytes

   Note: all fields stored in Intel low-byte/high-byte order.


And the test showing it working.


import sys
import zipfile

print(sys.version)

fn = sys.argv[1]
print(fn)

options = {
'endianness': ('little', 'big'),
'header_element_bytes': (2, 4),
'additional_size': (0, 4, 4 + 4),
}

for endianness in options['endianness']:
for additional_size in options['additional_size']:
for header_element_bytes in options['header_element_bytes']:
print('\n\n --- trying endianness: {}, additional_size: {}, 
header_element_bytes: {}'.format(endianness, additional_size, 
header_element_bytes))
with zipfile.ZipFile(fn, 'w') as zf:
zi = zipfile.ZipInfo("0")
extra_data = b"hello, extra, and some more just to make it 
longer and such so yeah"
zi.extra = (
(42).to_bytes(header_element_bytes, endianness)
+ (len(extra_data) + 
additional_size).to_bytes(header_element_bytes, endianness)
+ extra_data
)
zf.writestr(zi, b"the real data")

try:
zipfile.ZipFile(fn)
except Exception as e:
print(e)
else:
print('success')


altendky@lt:~/twisted$ python3.7 ../z.py 37.py
3.7.0 (default, Jul  7 2018, 15:49:24)
[GCC 6.3.0 20170516]
37.py


 --- trying endianness: little, additional_size: 0, header_element_bytes: 2
success


 --- trying endianness: little, additional_size: 0, header_element_bytes: 4
Corrupt extra field 6568 (size=27756)


 --- trying endianness: little, additional_size: 4, header_element_bytes: 2
Corrupt extra field 002a (size=71)


 --- trying endianness: little, additional_size: 4, header_element_bytes: 4
Corrupt extra field 6568 (size=27756)


 --- trying endianness: little, additional_size: 8, header_element_bytes: 2
Corrupt extra field 002a (size=75)


 --- trying endianness: little, additional_size: 8, header_element_bytes: 4
Corrupt extra field 6568 (size=27756)


 --- trying endianness: big, additional_size: 0, header_element_bytes: 2
Corrupt extra field 2a00 (size=17152)


 --- trying endianness: big, additional_size: 0, header_element_bytes: 4
Corrupt extra field  (size=10752)


 --- trying endianness: big, additional_size: 4, header_element_bytes: 2
Corrupt extra field 2a00 (size=18176)


 --- trying endianness: big, additional_size: 4, header_element_bytes: 4
Corrupt extra field  (size=10752)


 --- trying endianness: big, additional_size: 8, header_element_bytes: 2
Corrupt extra field 2a00 (size=19200)


 --- trying endianness: big, additional_size: 8, header_element_bytes: 4
Corrupt extra field  (size=10752)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34606] Unable to read zip file with extra

2018-09-07 Thread Kyle Altendorf


Kyle Altendorf  added the comment:

Turns out the docs do document this.  My apologies.

https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT

   4.3.11  Archive extra data record: 

archive extra data signature4 bytes  (0x08064b50)
extra field length  4 bytes
extra field data(variable size)

Aside from the discrepancy between 16-bits and 4 bytes, it seems like something 
should happen, even if it's something other than 'fixing' the code to handle 
the malformed data.  Isn't it a bug for zipfile to create a non-compliant file? 
 Shouldn't it either check or provide an interface by which a compliant file 
could sensibly be created?  It doesn't seem great to just expect users to 
rewrite this each time they call.

(42).to_bytes(4, 'little') + len(data).to_bytes(4, 'little') + data

or, should it be 'big'?  and would it be (len(data) + 4 + 4)?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34606] Unable to read zip file with extra

2018-09-07 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

The value of the extra attribute is not an arbitrary bytes object. It is a 
sequence of blocks in the following format: 16-bit identifier followed by 
16-bit size of the block followed by the specified amount of bytes. Your 
example creates invalid ZIP file.

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34606] Unable to read zip file with extra

2018-09-07 Thread Karthikeyan Singaravelan

Karthikeyan Singaravelan  added the comment:

It's also reproducible on master. git blame tells me this is introduced with 
feccdb2a249a71be330765be77dee57121866779 (https://bugs.python.org/issue29774)

➜  cpython git:(master)
commit feccdb2a249a71be330765be77dee57121866779
Author: Serhiy Storchaka 
Date:   Thu Mar 9 18:34:03 2017 +0200

bpo-29774: Improve error reporting for corrupted extra field in ZIP file. 
(#583)

diff --git a/Lib/zipfile.py b/Lib/zipfile.py
index b5c16dbc12..8a19ca246b 100644
--- a/Lib/zipfile.py
+++ b/Lib/zipfile.py
@@ -438,7 +438,9 @@ class ZipInfo (object):
 unpack = struct.unpack
 while len(extra) >= 4:
 tp, ln = unpack(' len(extra):
+raise BadZipFile("Corrupt extra field %04x (size=%d)" % (tp, 
ln))
+if tp == 0x0001:
 if ln >= 24:
 counts = unpack('>>
➜  cpython git:(master) ./python.exe ../backups/bpo34606.py 36.zip
3.8.0a0 (heads/master:874809ea38, Sep  7 2018, 21:03:18)
[Clang 7.0.2 (clang-700.1.81)]
36.zip
Traceback (most recent call last):
  File "../backups/bpo34606.py", line 14, in 
zipfile.ZipFile(fn)
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/zipfile.py", 
line 1204, in __init__
self._RealGetContents()
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/zipfile.py", 
line 1327, in _RealGetContents
x._decodeExtra()
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/zipfile.py", 
line 440, in _decodeExtra
raise BadZipFile("Corrupt extra field %04x (size=%d)" % (tp, ln))
zipfile.BadZipFile: Corrupt extra field 6568 (size=27756)
➜  cpython git:(master) ✗ rm 36.zip

# checkout feccdb2a249a71be330765be77dee57121866779

➜  cpython git:(master) git checkout feccdb2a249a71be330765be77dee57121866779 
Lib/zipfile.py
➜  cpython git:(master) ✗ ./python.exe ../backups/bpo34606.py 36.zip
3.8.0a0 (heads/master:874809ea38, Sep  7 2018, 21:03:18)
[Clang 7.0.2 (clang-700.1.81)]
36.zip
Traceback (most recent call last):
  File "../backups/bpo34606.py", line 14, in 
zipfile.ZipFile(fn)
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/zipfile.py", 
line 1105, in __init__
self._RealGetContents()
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/zipfile.py", 
line 1230, in _RealGetContents
x._decodeExtra()
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/zipfile.py", 
line 442, in _decodeExtra
raise BadZipFile("Corrupt extra field %04x (size=%d)" % (tp, ln))
zipfile.BadZipFile: Corrupt extra field 6568 (size=27756)
➜  cpython git:(master) ✗ rm 36.zip

# checkout feccdb2a249a71be330765be77dee57121866779~1

➜  cpython git:(master) ✗ git checkout 
feccdb2a249a71be330765be77dee57121866779~1 Lib/zipfile.py
➜  cpython git:(master) ✗ ./python.exe ../backups/bpo34606.py 36.zip
3.8.0a0 (heads/master:874809ea38, Sep  7 2018, 21:03:18)
[Clang 7.0.2 (clang-700.1.81)]
36.zip
➜  cpython git:(master) ✗ gsha256sum 36.zip
0f54bd6ab84facfeefc2c38f12c30eb84101b3be3d91f8826f6fa36e73b86cb6  36.zip


Hope this helps. 

I am adding Serhiy who might have a better explanation. @Serhiy Feel free to 
unassign yourself if this is irrelevant.

Thanks

--
nosy: +serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34606] Unable to read zip file with extra

2018-09-07 Thread Karthikeyan Singaravelan


Change by Karthikeyan Singaravelan :


--
nosy: +xtreak

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34606] Unable to read zip file with extra

2018-09-07 Thread Kyle Altendorf


New submission from Kyle Altendorf :

This was first found over in Twisted tests.  We probably aren't too terribly 
worried about it but I wanted to report here anyways.

https://twistedmatrix.com/trac/ticket/9525

Both 3.6 and 3.7 write the same file (sha at the end) based on the script in 
the first snippet.  3.6 can read both files, 3.7 can't read either.



altendky@lt:~/twisted$ cat ../z.py
import sys
import zipfile

print(sys.version)

fn = sys.argv[1]
print(fn)

with zipfile.ZipFile(fn, 'w') as zf:
zi = zipfile.ZipInfo("0")
zi.extra = b"hello, extra"
zf.writestr(zi, b"the real data")

zipfile.ZipFile(fn)


altendky@lt:~/twisted$ venv36/bin/python ../z.py 36.zip
3.6.6 (default, Jul 24 2018, 16:23:12)
[GCC 6.3.0 20170516]
36.zip


altendky@lt:~/twisted$ venv37/bin/python ../z.py 37.zip
3.7.0 (default, Jul  7 2018, 15:49:24)
[GCC 6.3.0 20170516]
37.zip
Traceback (most recent call last):
  File "../z.py", line 14, in 
zipfile.ZipFile(fn)
  File "/home/altendky/.pyenv/versions/3.7.0/lib/python3.7/zipfile.py", 
line 1200, in __init__
self._RealGetContents()
  File "/home/altendky/.pyenv/versions/3.7.0/lib/python3.7/zipfile.py", 
line 1323, in _RealGetContents
x._decodeExtra()
  File "/home/altendky/.pyenv/versions/3.7.0/lib/python3.7/zipfile.py", 
line 440, in _decodeExtra
raise BadZipFile("Corrupt extra field %04x (size=%d)" % (tp, ln))
zipfile.BadZipFile: Corrupt extra field 6568 (size=27756)


altendky@lt:~/twisted$ cat ../z.py
import sys
import zipfile

print(sys.version)

fn = sys.argv[1]
print(fn)

#with zipfile.ZipFile(fn, 'w') as zf:
#zi = zipfile.ZipInfo("0")
#zi.extra = b"hello, extra"
#zf.writestr(zi, b"the real data")

zipfile.ZipFile(fn)


altendky@lt:~/twisted$ venv36/bin/python ../z.py 37.zip
3.6.6 (default, Jul 24 2018, 16:23:12)
[GCC 6.3.0 20170516]
37.zip


altendky@lt:~/twisted$ venv37/bin/python ../z.py 36.zip
3.7.0 (default, Jul  7 2018, 15:49:24)
[GCC 6.3.0 20170516]
36.zip
Traceback (most recent call last):
  File "../z.py", line 14, in 
zipfile.ZipFile(fn)
  File "/home/altendky/.pyenv/versions/3.7.0/lib/python3.7/zipfile.py", 
line 1200, in __init__
self._RealGetContents()
  File "/home/altendky/.pyenv/versions/3.7.0/lib/python3.7/zipfile.py", 
line 1323, in _RealGetContents
x._decodeExtra()
  File "/home/altendky/.pyenv/versions/3.7.0/lib/python3.7/zipfile.py", 
line 440, in _decodeExtra
raise BadZipFile("Corrupt extra field %04x (size=%d)" % (tp, ln))
zipfile.BadZipFile: Corrupt extra field 6568 (size=27756)


altendky@lt:~/twisted$ sha256sum 36.zip
0f54bd6ab84facfeefc2c38f12c30eb84101b3be3d91f8826f6fa36e73b86cb6  36.zip


altendky@lt:~/twisted$ sha256sum 37.zip
0f54bd6ab84facfeefc2c38f12c30eb84101b3be3d91f8826f6fa36e73b86cb6  37.zip

--
components: Library (Lib)
messages: 324743
nosy: altendky
priority: normal
severity: normal
status: open
title: Unable to read zip file with extra
type: crash
versions: Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com