New submission from STINNER Victor <victor.stin...@haypocalc.com>:

tarfile is unable to open a tar archive in PAX format that contains an invalid filename, that is, a filename that is not encoded in UTF-8 and therefore cannot be decoded. The attached file is an example: it contains the file b'z/\xff', which is not decodable as UTF-8.
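To make the failure concrete, here is a minimal sketch of what happens when the filename bytes from the example are decoded as UTF-8. It does not use tarfile's own code path; decode_name is a hypothetical helper written only for this illustration. It shows that strict decoding rejects the name, while Python's surrogateescape error handler maps the undecodable byte to a lone surrogate and restores the original bytes on encoding.

```python
# Illustrative sketch only: decode_name is a hypothetical helper,
# not a function from the tarfile module.
def decode_name(raw, errors="strict"):
    """Decode raw filename bytes as UTF-8 with the given error handler."""
    return raw.decode("utf-8", errors)


raw = b"z/\xff"  # the filename stored in the example archive

# Strict decoding fails because 0xff is never valid in UTF-8.
try:
    decode_name(raw)
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps the undecodable byte 0xff to the lone surrogate U+DCFF.
escaped = decode_name(raw, "surrogateescape")

# Encoding with the same handler gives back the original bytes.
roundtrip = escaped.encode("utf-8", "surrogateescape")
```

Because the round trip is lossless, a name decoded this way can be passed back to the operating system unchanged.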
The PAX specification has an "invalid" option with four values: bypass (default), rename, UTF-8, and write.
http://www.opengroup.org/onlinepubs/009695399/utilities/pax.html

As was done for the other formats in issue #8390, PAX can use Python's surrogateescape error handler to store undecodable bytes as Unicode surrogates. I think that PAX should be strict by default, but have an option to enable surrogateescape mode.

----------
components: Library (Lib)
files: z-pax.tar
messages: 105094
nosy: haypo, lars.gustaebel, loewis
priority: normal
severity: normal
status: open
title: tarfile doesn't support undecodable filename in PAX format
versions: Python 3.2
Added file: http://bugs.python.org/file17230/z-pax.tar

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8633>
_______________________________________