Patrik Dufresne added the comment:

> Is the tarfile module designed to support bytes for file names in general?
> The documentation doesn’t seem to mention bytes anywhere relevant. This seems
> more like a new feature rather than a bug to me.

I'm using bytes on Unix to represent a path. From the `os.path` docs:

    The path parameters can be passed as either strings, or bytes.
    Applications are encouraged to represent file names as (Unicode)
    character strings. Unfortunately, some file names may not be
    representable as strings on Unix, so applications that need to support
    arbitrary file names on Unix should use bytes objects to represent path
    names. Vice versa, using bytes objects cannot represent all file names
    on Windows (in the standard mbcs encoding), hence Windows applications
    should use string objects to access all files.

As such, I expect to be able to use bytes to represent a path with tarfile.
Also, the tar file format doesn't define any specific encoding for file
names. I expect to be able to put any kind of bytes data in a given file
name, since this was working in tarfile with py2.

> Does using a surrogateescape encoded filename work? (You won't get the error
> you report...my question is, does that do the right thing when building the
> archive?)

I will need to take a further look at surrogateescape. I read somewhere it
was an experimental feature, so I didn't try it.

Thanks to you both for your quick feedback during the holidays.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue25997>
_______________________________________
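
For reference, a minimal sketch of the surrogateescape approach raised in the quoted question. It assumes a UTF-8 archive encoding and the GNU tar format, and builds the archive in memory; `raw_name` and `payload` are made-up illustration values, not taken from the report. It is only meant to show that a byte string that is not valid UTF-8 can pass through a `str` member name and come back unchanged, not to describe how tarfile should handle bytes names.

```python
import io
import tarfile

# A file name as raw bytes that is not valid UTF-8 (hypothetical example;
# 0xE9 would be "e with acute accent" in Latin-1).
raw_name = b"caf\xe9.txt"

# Decoding with the surrogateescape error handler maps the undecodable
# byte 0xE9 to the lone surrogate U+DCE9, so no information is lost.
str_name = raw_name.decode("utf-8", "surrogateescape")

payload = b"hello"
buf = io.BytesIO()

# Write an in-memory archive. The encoding and errors arguments are passed
# explicitly here so the behaviour does not depend on platform defaults.
with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT,
                  encoding="utf-8", errors="surrogateescape") as tar:
    info = tarfile.TarInfo(name=str_name)
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# Read the archive back and recover the original bytes from the member name.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r",
                  encoding="utf-8", errors="surrogateescape") as tar:
    member = tar.getmembers()[0]
    roundtrip = member.name.encode("utf-8", "surrogateescape")

print(roundtrip == raw_name)
```

On Unix, `os.fsdecode()` and `os.fsencode()` perform the same kind of conversion using the filesystem encoding, so a bytes path obtained from the `os` module could be converted that way before being handed to tarfile.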