[issue40172] ZipInfo corrupts file names in some old zip archives

2022-03-21 Thread Daniel Hillier
Daniel Hillier added the comment: Related to issue https://bugs.python.org/issue28080 which has a patch that covers a bit of this issue -- ___ Python tracker <https://bugs.python.org/issue40

[issue45981] Get raw file name in bytes from ZipFile

2021-12-05 Thread Daniel Hillier
Daniel Hillier added the comment: Handling different character sets is not completely supported yet. There are a couple of open issues relating to this: https://bugs.python.org/issue40407 (reading file names), https://bugs.python.org/issue41928 (support for reading and writing filenames

[issue39359] zipfile: add missing "pwd: expected bytes, got str" exception message

2021-09-27 Thread Daniel Hillier
Daniel Hillier added the comment: I agree it is bad form but I would accidentally do it when I couldn't remember the proper API and took a stab in the dark without looking up the docs. I unfortunately used it in an example in the docs for pyzipper and started getting a few bug reports even

[issue39359] zipfile: add missing "pwd: expected bytes, got str" exception message

2021-09-23 Thread Daniel Hillier
Daniel Hillier added the comment: Thanks Łukasz!

[issue40172] ZipInfo corrupts file names in some old zip archives

2021-05-26 Thread Daniel Hillier
Daniel Hillier added the comment: Looking into this more and it appears that while Appendix D of https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT says "If general purpose bit 11 is unset, the file name and comment SHOULD conform to the original ZIP character encoding"

[issue40172] ZipInfo corrupts file names in some old zip archives

2021-05-24 Thread Daniel Hillier
Daniel Hillier added the comment: zipfile decodes filenames using cp437 or UTF-8 and encodes using ASCII or UTF-8. It seems like zipfile has a preference for writing filenames in UTF-8 rather than cp437. Is zipfile's preference for writing filenames in UTF-8 rather than cp437
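The encoding choice described in this comment can be observed with a short sketch. It assumes an in-memory archive; the helper name `name_flags` and the sample file names are made up for illustration. General purpose bit 11 (0x800) marks a UTF-8 file name, and when it is unset zipfile falls back to cp437 on reading.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: file name and comment are UTF-8


def name_flags(names):
    """Write empty members named `names`, reopen the archive, and report
    for each stored name whether the UTF-8 flag is set."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    buf.seek(0)
    with zipfile.ZipFile(buf) as zf:
        return {zi.filename: bool(zi.flag_bits & UTF8_FLAG) for zi in zf.infolist()}


# An ASCII-only name is stored without the flag; a non-ASCII name is
# stored as UTF-8 with the flag set.
flags = name_flags(["plain.txt", "caf\u00e9.txt"])
print(flags)

# Without the flag, raw name bytes are interpreted as cp437 on reading.
print(b"caf\x82.txt".decode("cp437"))
```

The sketch only covers the default behaviour; it does not attempt to write cp437 names.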

[issue44129] zipfile: Add descriptive global variables for general purpose bit flags

2021-05-13 Thread Daniel Hillier
Change by Daniel Hillier: keywords: +patch; pull_requests: +24763; stage: -> patch review; pull_request: https://github.com/python/cpython/pull/26118

[issue44129] zipfile: Add descriptive global variables for general purpose bit flags

2021-05-13 Thread Daniel Hillier
New submission from Daniel Hillier: In the zipfile module, bit flags are masked against hex literals, e.g. if flags & 0x800... To increase readability I suggest we replace these with global variables named for the purpose of the flag. From the example above: if flags & 0x800
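A minimal sketch of what named flag constants might look like. The constant names and the `describe_flags` helper are hypothetical and are not taken from the linked pull request; the bit positions follow the general purpose bit flag table in the ZIP application note.

```python
# Hypothetical names; only the bit positions come from the ZIP spec.
FLAG_ENCRYPTED = 1 << 0         # 0x001: member data is encrypted
FLAG_DATA_DESCRIPTOR = 1 << 3   # 0x008: CRC and sizes follow the data
FLAG_UTF8_FILENAME = 1 << 11    # 0x800: file name and comment are UTF-8


def describe_flags(flags):
    """Return the names of the known flags that are set in `flags`."""
    known = {
        "encrypted": FLAG_ENCRYPTED,
        "data descriptor": FLAG_DATA_DESCRIPTOR,
        "utf-8 filename": FLAG_UTF8_FILENAME,
    }
    return [name for name, mask in known.items() if flags & mask]


print(describe_flags(0x800))
print(describe_flags(0x009))
```

With names like these, a test such as `if flags & FLAG_UTF8_FILENAME` states its intent without a comment.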

[issue44128] zipfile: Deduplicate ZipExtFile code for init and resetting when seeking

2021-05-13 Thread Daniel Hillier
Change by Daniel Hillier: keywords: +patch; pull_requests: +24761; stage: -> patch review; pull_request: https://github.com/python/cpython/pull/26116

[issue44128] zipfile: Deduplicate ZipExtFile code for init and resetting when seeking

2021-05-13 Thread Daniel Hillier
New submission from Daniel Hillier : Integrating a refactor suggested in https://bugs.python.org/issue38334 The logic for preparing a ZipExtFile for reading (setting CRC state, read positions, etc) is currently in two locations: first initialisation and when seeking back to the start

[issue40301] zipfile module: new feature (two lines of code), useful for test, security and forensics

2020-04-18 Thread Daniel Hillier
Daniel Hillier added the comment: Hi Massimo, Unless I'm missing something about your requirements, the advantage is that it already works in Python 2.7 so there is no need to patch Python. Just bundle the above function with your analysis tool and you're good to go. Cheers, Dan

[issue40301] zipfile module: new feature (two lines of code), useful for test, security and forensics

2020-04-17 Thread Daniel Hillier
Daniel Hillier added the comment: Could something similar be achieved by looking for the earliest file header offset? def find_earliest_header_offset(zf): earliest_offset = None for zinfo in zf.infolist(): if earliest_offset is None: earliest_offset
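The snippet above is cut off in this listing. One plausible completion, assuming the loop simply keeps the minimum of `ZipInfo.header_offset`, could look like the following sketch; the usage part (the 7-byte prefix and the member names) is invented for illustration.

```python
import io
import zipfile


def find_earliest_header_offset(zf):
    """Return the smallest local file header offset in `zf`, or None
    for an archive with no members."""
    earliest_offset = None
    for zinfo in zf.infolist():
        if earliest_offset is None or zinfo.header_offset < earliest_offset:
            earliest_offset = zinfo.header_offset
    return earliest_offset


# Usage: put some non-zip bytes in front of the archive data.
buf = io.BytesIO()
buf.write(b"PREFIX!")
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", b"hello")
    zf.writestr("b.txt", b"world")

with zipfile.ZipFile(buf) as zf:
    offset = find_earliest_header_offset(zf)
print(offset)
```

Any bytes before that offset are not part of a member recorded in the central directory, which is the kind of information a forensic tool would want to inspect.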

[issue39294] zipfile.ZipInfo objects contain invalid 'extra' fields.

2020-02-01 Thread Daniel Hillier
Daniel Hillier added the comment: This looks to be expected behaviour for the zip64 extension in the zip spec (for handling large files or large archives). Section 4.4.1.4 of the zip spec outlines when the zip64 extra fields are used (https://pkware.cachefly.net/webdocs/casestudies

[issue39359] zipfile: add missing "pwd: expected bytes, got str" exception message

2020-01-16 Thread Daniel Hillier
Change by Daniel Hillier: keywords: +patch; pull_requests: +17427; stage: -> patch review; pull_request: https://github.com/python/cpython/pull/18031

[issue39359] zipfile: add missing "pwd: expected bytes, got str" exception message

2020-01-16 Thread Daniel Hillier
New submission from Daniel Hillier: Setting the ZipFile.pwd attribute directly skips the check that ensures the password is a bytes object and that otherwise raises a user-friendly TypeError("pwd: expected bytes, got ") informing the user of that. -- components: Library (Lib) messag
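One way such a check could be applied to direct attribute assignment is a property setter. The class below is a hypothetical stand-in written for illustration, not zipfile's code and not necessarily how the linked pull request does it; only the error message format is taken from the issue title.

```python
class PasswordHolder:
    """Hypothetical stand-in for an object with a `pwd` attribute."""

    def __init__(self):
        self._pwd = None

    @property
    def pwd(self):
        return self._pwd

    @pwd.setter
    def pwd(self, value):
        # The check runs on every assignment, not only in a setter method.
        if value is not None and not isinstance(value, bytes):
            raise TypeError(
                "pwd: expected bytes, got %s" % type(value).__name__
            )
        self._pwd = value


holder = PasswordHolder()
holder.pwd = b"secret"      # bytes are accepted
try:
    holder.pwd = "secret"   # str instead of bytes
except TypeError as exc:
    message = str(exc)
    print(message)
```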

[issue37523] zipfile: Raise ValueError for i/o operations on closed zipfile.ZipExtFile

2019-10-29 Thread Daniel Hillier
Daniel Hillier added the comment: Good point. Thanks for the advice. I've updated it to use timeit. Does that give a better indication? import zipfile test_zip = "time_test.zip" test_name = "test_name.txt" # with zipfile.ZipFile(test_zip, "w") as zf: #
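The script is truncated in this listing. As a rough sketch of how a timeit-based measurement of member reads might be set up: it assumes an in-memory archive rather than the `time_test.zip` file used in the original, and the payload size, chunk size and repeat counts are arbitrary.

```python
import io
import timeit
import zipfile

test_name = "test_name.txt"

# Build a small archive in memory (the original script used a file on disk).
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr(test_name, b"x" * 100_000)


def read_member():
    """Open the archive and read the member in small chunks."""
    with zipfile.ZipFile(archive) as zf:
        with zf.open(test_name) as f:
            while f.read(1024):
                pass


# The best of several repeats is less sensitive to background noise
# than a single run.
best = min(timeit.repeat(read_member, number=20, repeat=3))
print(f"best of 3: {best:.4f}s for 20 reads")
```

Running the same function against builds with and without the closed check would give the comparison discussed in the thread.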

[issue37523] zipfile: Raise ValueError for i/o operations on closed zipfile.ZipExtFile

2019-10-28 Thread Daniel Hillier
Daniel Hillier added the comment: Here's the script I used for profiling and the results I observed with and without the closed check in read: import zipfile test_zip = "time_test.zip" test_name = "test_name.txt" with zipfile.ZipFile(test_zip, "w") as zf:

[issue38334] zipfile: Seeking encrypted file breaks after seeking backwards

2019-10-28 Thread Daniel Hillier
Daniel Hillier added the comment: Thanks for your help! Good point, I'll create a new change for the refactoring.

[issue38334] zipfile: Seeking encrypted file breaks after seeking backwards

2019-10-26 Thread Daniel Hillier
Daniel Hillier added the comment: I also think that the `read_init` method in my PR is a useful refactor as it locates all the state that needs to be (re)set when starting a read into the same location. At the moment this state is set in 1) __init__ and 2) the seek method when seeking back

[issue38334] zipfile: Seeking encrypted file breaks after seeking backwards

2019-10-26 Thread Daniel Hillier
Daniel Hillier added the comment: Thanks for looking at the PR. I got carried away refactoring the decrypter for a future scenario where there could be different decrypters (possibly using certificates too) :) Your PR is much simpler. Would you also be able to take a look at some other

[issue38334] zipfile: Seeking encrypted file breaks after seeking backwards

2019-10-25 Thread Daniel Hillier
Daniel Hillier added the comment: Hi, I have another patch I would like to contribute to the zipfile module but would like to request a review of this one to minimise conflicts with later patches. If anyone is able to review the patch, I would really appreciate it :) Also, with regards

[issue38334] zipfile: Seeking encrypted file breaks after seeking backwards

2019-10-13 Thread Daniel Hillier
Change by Daniel Hillier: nosy: +serhiy.storchaka

[issue38334] zipfile: Seeking encrypted file breaks after seeking backwards

2019-10-01 Thread Daniel Hillier
Change by Daniel Hillier: keywords: +patch; pull_requests: +16120; stage: -> patch review; pull_request: https://github.com/python/cpython/pull/16529

[issue38334] zipfile: Seeking encrypted file breaks after seeking backwards

2019-10-01 Thread Daniel Hillier
New submission from Daniel Hillier : Seeking back beyond the decrypted / unzipped buffer doesn't reset the decrypter's crc key values. All data read after seeking back beyond the buffer is garbled. -- components: Library (Lib) messages: 353646 nosy: dhillier priority: normal severity

[issue37538] Refactor zipfile to ease subclassing and enhancement

2019-07-26 Thread Daniel Hillier
Change by Daniel Hillier: keywords: +patch; pull_requests: +14725; stage: -> patch review; pull_request: https://github.com/python/cpython/pull/14957

[issue37538] Refactor zipfile to ease subclassing and enhancement

2019-07-15 Thread Daniel Hillier
Daniel Hillier added the comment: Hi, Here is a pull request against my fork: https://github.com/danifus/cpython/pull/1/files The overall behaviour of zipfile remains the same and I've tried to call out any behaviour changes in the extended commit messages (usually with ** markers

[issue37538] Refactor zipfile to ease subclassing and enhancement

2019-07-10 Thread Daniel Hillier
Daniel Hillier added the comment: I've started a branch on my github fork if anyone wants to follow along. https://github.com/danifus/cpython/tree/zipfile_refactor Is there a better way to manage this in terms of review and suggestions as I add more commits

[issue37538] Refactor zipfile to ease subclassing and enhancement

2019-07-10 Thread Daniel Hillier
New submission from Daniel Hillier : I've written https://github.com/danifus/pyzipper which incorporates a refactor of zipfile.py in order to support winzip AES encryption. I don't intend to include the crypto code but I would like to incorporate the refactor to help others subclass and extend

[issue37523] zipfile: Raise ValueError for i/o operations on closed zipfile.ZipExtFile

2019-07-09 Thread Daniel Hillier
Change by Daniel Hillier: keywords: +patch; pull_requests: +14465; stage: -> patch review; pull_request: https://github.com/python/cpython/pull/14658

[issue37523] zipfile: Raise ValueError for i/o operations on closed zipfile.ZipExtFile

2019-07-09 Thread Daniel Hillier
New submission from Daniel Hillier : After closing a file object opened from a ZipFile, attempting i/o operations raises AttributeError because the underlying fd has been set to None. We should be raising ValueErrors consistent with io.FileIO behaviour. Similar inconsistencies exist
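The behaviour being asked for can be illustrated with a small wrapper. `GuardedReader` is a hypothetical class written for this sketch and is not zipfile's implementation; it shows a read path that checks `closed` first and raises the same kind of ValueError that built-in file objects raise, instead of failing later on a cleared attribute.

```python
import io


class GuardedReader(io.BufferedIOBase):
    """Hypothetical reader that refuses I/O after close()."""

    def __init__(self, raw):
        self._raw = raw

    def readable(self):
        return True

    def read(self, n=-1):
        # Check first, so callers get ValueError rather than an
        # AttributeError from touching the cleared self._raw.
        if self.closed:
            raise ValueError("I/O operation on closed file.")
        return self._raw.read(n)

    def close(self):
        try:
            self._raw = None
        finally:
            super().close()


reader = GuardedReader(io.BytesIO(b"abc"))
first = reader.read(1)
reader.close()
try:
    reader.read(1)
except ValueError as exc:
    error = exc
    print(type(exc).__name__, exc)
```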

[issue36993] zipfile: tuple IndexError on extract

2019-07-09 Thread Daniel Hillier
Daniel Hillier added the comment: I've pushed a PR which adds a test that generates corrupt zip64 files with different combinations of zip64 extra data lengths and zip64 flags (which determines how many fields are required in the extra data). It now raises a BadZipFile with a message naming

[issue36993] zipfile: tuple IndexError on extract

2019-07-08 Thread Daniel Hillier
Change by Daniel Hillier: pull_requests: +14463; pull_request: https://github.com/python/cpython/pull/14656