[issue10972] zipfile: add unicode option to the choose filename encoding

2011-01-21 Thread STINNER Victor
STINNER Victor added the comment: 7zip and WinRAR use the same algorithm as ZipFile._encodeFilename(): try cp437, and fall back to UTF-8 if that fails. E.g. if a filename contains ∞ (U+221E), it is encoded to UTF-8. WinZIP encodes all filenames to cp437: ∞ (U+221E) is replaced by 8 (U+0038), ☺ (U+263A) is replaced

[issue10972] zipfile: add unicode option to the choose filename encoding

2011-01-21 Thread STINNER Victor
STINNER Victor added the comment: Oh, this patch also fixes a bug: ZipFile._RealGetContents() doesn't keep the unicode flag, so opening a ZIP file and then writing it somewhere else may change the unicode flag if the flag was set but the filename is also encodable to UTF-8 (e.g. an ASCII filename)
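
To make the "unicode flag" in this comment concrete, here is a minimal sketch that inspects it with the public zipfile API. It assumes the flag is bit 11 (0x800) of the general purpose bit flags, exposed as ZipInfo.flag_bits; the helper names make_archive and utf8_flags are hypothetical and are not part of the patch.

```python
import io
import zipfile

# Bit 11 of the general purpose bit flags: names are stored as UTF-8.
UTF8_FLAG = 0x800


def make_archive(names):
    """Build an in-memory ZIP with one empty member per name (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    return buf.getvalue()


def utf8_flags(data):
    """Map each member name to whether its UTF-8 flag is set."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {info.filename: bool(info.flag_bits & UTF8_FLAG)
                for info in zf.infolist()}


if __name__ == "__main__":
    archive = make_archive(["plain.txt", "\u4e2d.txt"])
    for name, flagged in utf8_flags(archive).items():
        print(name, flagged)
```

Comparing the result of utf8_flags() before and after copying members into a second archive is one way to check whether the flag survives a round trip.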

[issue10972] zipfile: add unicode option to the choose filename encoding

2011-01-21 Thread STINNER Victor
New submission from STINNER Victor: ZipInfo._encodeFilename() tries the cp437 encoding and otherwise uses UTF-8. It is not possible to choose the encoding. To work around #10955 (bootstrap issue with python32.zip), it would be nice to be able to create a ZIP file using only UTF-8 filenames. Attached patch ad
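
The fallback described in this submission can be sketched as follows. This is an illustration of the idea only, not the attached patch or the zipfile implementation: the function name encode_filename and the force_utf8 parameter are hypothetical, and 0x800 is assumed to be the UTF-8 flag bit.

```python
UTF8_FLAG = 0x800  # general purpose bit 11: filename is stored as UTF-8


def encode_filename(name, force_utf8=False):
    """Return (encoded_name, flag_bits) for a ZIP member name.

    Tries cp437 first and falls back to UTF-8, unless force_utf8 is set,
    in which case UTF-8 is always used (hypothetical option).
    """
    if not force_utf8:
        try:
            return name.encode("cp437"), 0
        except UnicodeEncodeError:
            pass  # not representable in cp437, fall through to UTF-8
    return name.encode("utf-8"), UTF8_FLAG


if __name__ == "__main__":
    print(encode_filename("readme.txt"))
    print(encode_filename("\u4e2d.txt"))
    print(encode_filename("readme.txt", force_utf8=True))
```

With force_utf8=True even a pure ASCII name carries the UTF-8 flag, which is the "only UTF-8 filenames" behaviour the submission asks for.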