[issue40172] ZipInfo corrupts file names in some old zip archives

2022-03-22 Thread Gregory P. Smith
Gregory P. Smith added the comment: Examining Lib/zipfile.py code, the existing code makes sense. Python's zipfile module produces modern zipfiles when writing by setting the utf-8 flag and storing the filename as utf-8 when it is not ASCII. This is desirable for use with all normal zip
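The write-side behaviour described in this comment can be checked with a small sketch. It assumes only the standard library; the helper name `utf8_flags` and the member names are made up for illustration. It writes two empty members, re-reads the archive, and reports whether general purpose bit 11 (0x800, the UTF-8 flag) is set for each name.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: file name and comment are UTF-8


def utf8_flags(names):
    """Write empty members with the given names, re-read, and report bit 11 per name.

    Hypothetical helper for illustration; not part of zipfile.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    buf.seek(0)
    with zipfile.ZipFile(buf) as zf:
        return {info.filename: bool(info.flag_bits & UTF8_FLAG) for info in zf.infolist()}


# "caf\u00e9.txt" is "café.txt"; the escape keeps this source file pure ASCII.
flags = utf8_flags(["plain.txt", "caf\u00e9.txt"])
print(flags)
```

If the comment's description holds, the ASCII-only name should come back without the flag and the non-ASCII name with it.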

[issue40172] ZipInfo corrupts file names in some old zip archives

2022-03-21 Thread Daniel Hillier
Daniel Hillier added the comment: Related to issue https://bugs.python.org/issue28080 which has a patch that covers a bit of this issue

[issue40172] ZipInfo corrupts file names in some old zip archives

2022-03-05 Thread Yudi Levi
Yudi Levi added the comment: The main issue is that when extracting older zip files, files are actually written to disk with corrupted (altered) names. Unfortunately it's been a while since I saw this issue and I can't tell if it was fixed or if I simply can't reproduce it. I do see that

[issue40172] ZipInfo corrupts file names in some old zip archives

2021-05-26 Thread Daniel Hillier
Daniel Hillier added the comment: Looking into this more and it appears that while Appendix D of https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT says "If general purpose bit 11 is unset, the file name and comment SHOULD conform to the original ZIP character encoding" where the
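The rule quoted from Appendix D can be written down as a tiny decoder. This is only a sketch: `decode_member_name` is a hypothetical helper, and it assumes that the "original ZIP character encoding" means IBM code page 437, which is how the comments in this thread treat it.

```python
UTF8_FLAG = 0x800  # general purpose bit 11


def decode_member_name(raw: bytes, flag_bits: int) -> str:
    """Pick the codec from bit 11: UTF-8 when set, cp437 otherwise.

    Hypothetical helper; assumes cp437 is the legacy encoding.
    """
    codec = "utf-8" if flag_bits & UTF8_FLAG else "cp437"
    return raw.decode(codec)


# The same visible name stored two ways: one cp437 byte vs. two UTF-8 bytes.
legacy_name = decode_member_name(b"caf\x82.txt", 0)
modern_name = decode_member_name(b"caf\xc3\xa9.txt", UTF8_FLAG)
print(legacy_name, modern_name)
```

A name whose bytes were really written in some other code page would still decode under this rule, just to the wrong characters, since the flag alone does not say which legacy code page the writer used.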

[issue40172] ZipInfo corrupts file names in some old zip archives

2021-05-24 Thread Daniel Hillier
Daniel Hillier added the comment: zipfile decodes filenames using cp437 or unicode and encodes using ascii or unicode. It seems like zipfile has a preference for writing filenames in unicode rather than cp437. Is zipfile's preference for writing filenames in unicode rather than cp437
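The asymmetry described here (read as cp437, write as UTF-8) can be observed with a round trip. The sketch below is an assumption-laden illustration, not a test from the tracker: it fakes a legacy archive by writing an ASCII placeholder name and patching one byte to 0x82 (the cp437 code for "é") in both the local header and the central directory, then reads the member and writes it back with zipfile.

```python
import io
import zipfile


def make_legacy_zip() -> bytes:
    """Build an archive whose member name holds a raw cp437 byte and no UTF-8 flag.

    Hypothetical helper: file names are not covered by the CRC, so swapping a
    placeholder byte in the raw archive leaves it structurally valid.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("caf_.txt", b"data")
    # Patches both copies of the name (local header and central directory).
    return buf.getvalue().replace(b"caf_.txt", b"caf\x82.txt")


legacy = make_legacy_zip()

# Reading: bit 11 is unset, so zipfile decodes the name as cp437.
with zipfile.ZipFile(io.BytesIO(legacy)) as src:
    name = src.namelist()[0]
    payload = src.read(name)

# Writing: the decoded name is not ASCII, so zipfile stores it as UTF-8.
out = io.BytesIO()
with zipfile.ZipFile(out, "w") as dst:
    dst.writestr(name, payload)
rewritten = out.getvalue()

print(repr(name))
print(b"caf\x82.txt" in legacy, b"caf\xc3\xa9.txt" in rewritten)
```

The member's visible name survives the round trip, but its on-disk bytes change from the single cp437 byte to the two-byte UTF-8 sequence.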

[issue40172] ZipInfo corrupts file names in some old zip archives

2021-05-16 Thread Yudilevi
Yudilevi added the comment: Hey :) Sorry that I'm not responsive, just busy. I'll add one soon. Yudi
On Mon, May 17, 2021 at 12:08 AM Irit Katriel wrote:
> Can you suggest a unit test for this?

[issue40172] ZipInfo corrupts file names in some old zip archives

2021-05-16 Thread Irit Katriel
Irit Katriel added the comment: Can you suggest a unit test for this?
nosy: +iritkatriel

[issue40172] ZipInfo corrupts file names in some old zip archives

2020-04-03 Thread Yudi
Change by Yudi:
keywords: +patch
pull_requests: +18697
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/19335

[issue40172] ZipInfo corrupts file names in some old zip archives

2020-04-03 Thread Yudi
New submission from Yudi: Some old zip files that don't yet use unicode file names might have entries with characters beyond the ascii range. ZipInfo seems to encode these file names with the 'cp437' code page (correct for old zips) but decode them back with the 'ascii' code page, which might corrupt