New submission from Mark Grandi:

So I ran into this problem today: it is nearly impossible to create a 
tarfile.TarFile object and then add files to the archive when the files are 
in-memory file-like objects (like io.BytesIO, io.StringIO, etc.).

code example:

###################
import tarfile, io

tarFileIo = io.BytesIO()

tarFileObj = tarfile.open(fileobj=tarFileIo, mode="w:xz")

fileToAdd = io.BytesIO("hello world!".encode("utf-8"))

# fixes "AttributeError: '_io.BytesIO' object has no attribute 'name'"
fileToAdd.name="helloworld.txt"

# fails with 'io.UnsupportedOperation: fileno'
tarInfo = tarFileObj.gettarinfo(arcname="helloworld.txt", fileobj=fileToAdd)

# never runs
tarFileObj.addfile(tarInfo, fileobj=fileToAdd)
###################

This was previously reported as http://bugs.python.org/issue10369, but I am 
unhappy with the resolution of "it's not a bug" and with the 'hack' that Lars 
posted as a solution. My reasons:

1: The zipfile module supports writing in-memory files / bytes, using the 
following code (which is weird, but it works):

import io
import zipfile

tmp = zipfile.ZipFile("tmp.zip", mode="w")
x = io.BytesIO("hello world!".encode("utf-8"))
# writestr() accepts the raw buffer, so no file on disk is needed for the member
tmp.writestr("helloworld.txt", x.getbuffer())
tmp.close()

2: The 'hack' that Lars posted works, but it is unintuitive and confusing, and 
it isn't the intended behavior. What happens if your script is cross-platform? 
What file do you open to give to os.stat()? The code posted uses 
open('/etc/passwd/') for the fileobj parameter to gettarinfo(), but that file 
doesn't exist on Windows. So not only are you doing this silly hack, you also 
need code that checks platform.system() to find a file that is known to exist 
on every system, or you use sys.executable, except that the documentation for 
it says it can be None or an empty string.

3: It is easy to fix (at least to me). In TarFile.gettarinfo(), if fileobj is 
passed in and it doesn't have a fileno, then to create the TarInfo object you 
set 'name' to the arcname parameter and size = len(fileobj), then use default 
values (maybe overridden by keyword args to gettarinfo()) for 
uid/gid/uname/gname.

On a random tar.gz file that I downloaded from SourceForge, the uid/gid are 
'500' (when my gid is 20 and my uid is 501), and the gname/uname are just empty 
strings. So it's obvious that those don't matter most of the time, and when 
they do matter, you can modify the TarInfo object after creation or pass in 
values for them through a hypothetical keyword argument to gettarinfo().
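
A rough sketch of what point 3 could look like from user code today, without 
touching gettarinfo(). The helper name tarinfo_from_memory and its keyword 
defaults are made up for illustration and are not part of the tarfile module; 
it assumes a seekable binary object such as io.BytesIO and measures the size by 
seeking rather than calling len(). Only tarfile.TarInfo, TarFile.addfile() and 
the TarInfo attributes are existing API.

```python
import io
import tarfile
import time


def tarinfo_from_memory(fileobj, arcname, uid=0, gid=0, uname="", gname=""):
    """Hypothetical helper: build a TarInfo for a seekable in-memory file."""
    # Measure the remaining bytes by seeking to the end, then restore position.
    start = fileobj.tell()
    size = fileobj.seek(0, io.SEEK_END) - start
    fileobj.seek(start)

    info = tarfile.TarInfo(name=arcname)
    info.size = size
    info.mtime = int(time.time())
    # Ownership defaults are assumptions; callers can override them.
    info.uid, info.gid = uid, gid
    info.uname, info.gname = uname, gname
    return info


archive = io.BytesIO()
payload = io.BytesIO("hello world!".encode("utf-8"))

with tarfile.open(fileobj=archive, mode="w:xz") as tar:
    tar.addfile(tarinfo_from_memory(payload, "helloworld.txt"), fileobj=payload)

# Read the archive back to confirm the member was stored.
archive.seek(0)
with tarfile.open(fileobj=archive, mode="r:xz") as tar:
    member = tar.getmember("helloworld.txt")
    content = tar.extractfile(member).read()
```

No fileno() or os.stat() call is involved, so the same code should behave the 
same way on every platform. An io.StringIO would have to be encoded to bytes 
first, since addfile() copies raw bytes.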

If no one wants to code this I can provide a patch, I just want the previous 
bug report's status of "not a bug" to be reconsidered.

----------
components: Library (Lib)
messages: 225374
nosy: markgrandi
priority: normal
severity: normal
status: open
title: tarfile can't add in memory files (reopened)
versions: Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue22208>
_______________________________________