New submission from Martin von Gagern:

https://docs.python.org/3/library/zipfile.html#zipfile.ZipFile.write writes:

“Note: There is no official file name encoding for ZIP files. If you have 
unicode file names, you must convert them to byte strings in your desired 
encoding before passing them to write(). WinZip interprets all file names as 
encoded in CP437, also known as DOS Latin.”

I think this is wrong in several ways. Firstly, APPNOTE.TXT used to explicitly 
define CP437 as the standard, and it's still the standard in the absence of 
general purpose bit 11 and a more specific description using the 0x0008 Extra 
Field. Secondly, we do have that general purpose bit these days, so there are 
now not just one but two well-defined file name encodings: CP437 and, when 
bit 11 is set, UTF-8. And thirdly, encoding the string to bytes as suggested 
will in fact lead to a runtime error, since ZipInfo expects to do this 
conversion itself.
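
A minimal sketch of the last two points, as far as I can tell from current 
CPython 3 behaviour. The helper names flag_bit_11_set and bytes_name_rejected 
are made up for illustration and are not part of zipfile; the assumption is 
that a bytes name fails with TypeError inside the ZipInfo constructor.

```python
import io
import zipfile


def flag_bit_11_set(name):
    """Write one member called `name` to an in-memory archive and report
    whether general purpose bit 11 (0x800, the UTF-8 flag) is set for it."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, b"data")
    # Re-open the archive and inspect the flags stored in the central directory.
    with zipfile.ZipFile(buf) as zf:
        return bool(zf.infolist()[0].flag_bits & 0x800)


def bytes_name_rejected():
    """Return True if ZipInfo refuses a pre-encoded (bytes) file name."""
    try:
        # UTF-8 bytes for "café.txt", encoded by hand as the docs suggest.
        zipfile.ZipInfo(b"caf\xc3\xa9.txt")
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print("ASCII name sets bit 11:    ", flag_bit_11_set("plain.txt"))
    print("non-ASCII name sets bit 11:", flag_bit_11_set("caf\u00e9.txt"))
    print("bytes name rejected:       ", bytes_name_rejected())
```

So passing a str and letting zipfile choose the encoding seems to be the only 
thing that works; the advice to encode beforehand cannot be followed as written.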

See work towards issue1734346, starting at commit 8e33f316ce14, for details on 
when this was addressed in the source code.

----------
assignee: docs@python
components: Documentation
messages: 257567
nosy: docs@python, gagern
priority: normal
severity: normal
status: open
title: documentation of ZipFile file name encoding
type: behavior
versions: Python 3.5

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26018>
_______________________________________