Devourer Station added the comment:
I do think providing a rawfile field in the ZipInfo struct helps.
As a library, ZipFile should let users know what they are dealing with.
Users get data out of zip files, and ZipFile shouldn't corrupt it.
I don't mean that we should provide everything
Devourer Station added the comment:
Null bytes appear in abnormal zip files. (I haven't seen any multibyte encoding
that represents a character with null bytes.)
But non-UTF-8 encodings are common in normal zip files, as Windows uses
different encodings for different language settings.
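
To illustrate the encoding point, here is a minimal sketch of how a caller might recover such a name today. It assumes the reader already knows the archive's real encoding (GBK is used here purely as an example) and that zipfile fell back to cp437 because the UTF-8 flag (bit 11, 0x800) was not set. The helper name recover_name is hypothetical and is not part of zipfile.

```python
import zipfile


def recover_name(info, real_encoding):
    """Hypothetical helper: undo zipfile's cp437 fallback for one entry."""
    if info.flag_bits & 0x800:
        # Bit 11 set: the name was stored as UTF-8 and is already correct.
        return info.filename
    # Otherwise the raw bytes were decoded as cp437. Python's cp437 codec
    # maps every byte value, so encoding back yields the original bytes,
    # which can then be decoded with the encoding the archiver really used.
    return info.filename.encode("cp437").decode(real_encoding)


# Simulate an entry whose name was written as GBK without the UTF-8 flag.
raw = "中文.txt".encode("gbk")
legacy = zipfile.ZipInfo(raw.decode("cp437"))  # flag_bits defaults to 0
print(recover_name(legacy, "gbk"))
```

This round trip only works while the decoded string is left untouched. Any later rewriting of characters in the name, such as the separator replacement reported in this issue, can make the original bytes unrecoverable, which is the case a raw-bytes field would cover.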
Devourer Station added the comment:
In file Lib/zipfile.py (lines 1357-1363):
    flags = centdir[5]
    if flags & 0x800:
        # UTF-8 file names extension
        filename = filename.decode('utf-8')
    else:
        # Historical ZIP filename encoding
        filename
New submission from Devourer Station:
It's quite annoying that ZipFile corrupts the filename by simply replacing '\\'
with '/', and does not give us the raw file name as bytes.
--
components: Library (Lib)
messages: 407665
nosy: accelerator0099
priority: normal
severity: normal
status