Patrik Dufresne added the comment:

> Is the tarfile module designed to support bytes for file names in general? 
> The documentation doesn’t seem to mention bytes anywhere relevant. This seems 
> more like a new feature rather than a bug to me.

I'm using bytes on Unix to represent a path. From the `os.path` docs: The path 
parameters can be passed as either strings, or bytes. Applications are 
encouraged to represent file names as (Unicode) character strings. 
Unfortunately, some file names may not be representable as strings on Unix, so 
applications that need to support arbitrary file names on Unix should use bytes 
objects to represent path names. Vice versa, using bytes objects cannot 
represent all file names on Windows (in the standard mbcs encoding), hence 
Windows applications should use string objects to access all files.

As such, I expect to be able to use bytes to represent a path with tarfile.

Also, the tar file format doesn't define any specific encoding for file names. 
I'm expecting to be able to put any kind of bytes data in a given file name, 
since this was working in tarfile with py2.

> Does using a surrogateescape encoded filename work?  (You won't get the error 
> you report...my question is, does that do the right thing when building the 
> archive?)

I will need to have a further look into surrogateescape. I read somewhere it 
was an experimental feature, so I didn't try it.
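
For reference, the surrogateescape approach would presumably look something 
like the sketch below. It is only an illustration: the file name, the in-memory 
archive, and the explicit GNU format are my own assumptions, not taken from the 
failing code. The bytes name is decoded with the surrogateescape error handler 
before being handed to tarfile, and encoded the same way after reading it back.

```python
import io
import tarfile

# Hypothetical file name: Latin-1 bytes that are not valid UTF-8.
raw_name = b"caf\xe9.txt"

# Undecodable bytes become lone surrogates instead of raising an error.
name = raw_name.decode("utf-8", "surrogateescape")

# Write an in-memory archive; GNU format is chosen here only so the name
# is stored directly in the header.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT,
                  encoding="utf-8", errors="surrogateescape") as tar:
    data = b"hello"
    info = tarfile.TarInfo(name)
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

archive_bytes = buf.getvalue()

# Read the archive back and recover the original bytes name.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:", encoding="utf-8",
                  errors="surrogateescape") as tar:
    stored = tar.getnames()[0]

round_tripped = stored.encode("utf-8", "surrogateescape")
```

If this behaves as expected, `round_tripped` equals `raw_name` and the raw 
bytes appear unchanged in the archive header.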


Thanks to you both for your quick feedback during the holidays.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue25997>
_______________________________________