New submission from Dan Stromberg <strom...@gmail.com>:

The tarfile module's gettarinfo callable insists on stat'ing the file in 
question, preventing one from dynamically generating file content by passing a 
file-like object for addfile's fileobj argument.

I believe the attached patch fixes this issue.  I generated the patch against 
2.7 and tested it with 2.7, but it applies cleanly against 3.1 and "feels 
innocuous".  I've also included my test code at the bottom of this comment.

Why would you want to do this?  Imagine you've stored a file in three smaller 
files (perhaps to save the pieces on small external media, or as part of a 
deduplication system), with the content divided up into thirds.  To 
subsequently put this file as a whole into a tar archive, it'd be nice if you 
could just create a file-like object to emit the catenation, rather than having 
to create a temporary file holding that catenation.
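
To make the use case concrete, here is a minimal sketch of such a catenating file-like object, written for Python 3. ConcatReader and add_concatenated are hypothetical names used only for illustration. The sketch also assumes the TarInfo is built by hand with tarfile.TarInfo rather than obtained through the patched gettarinfo, and it uses in-memory pieces in place of the three on-disk files.

```python
import io
import tarfile
import time


class ConcatReader:
    """File-like object that emits several binary streams back to back.

    Hypothetical helper for illustration; each part only needs a read() method.
    """

    def __init__(self, parts):
        self._parts = list(parts)
        self._index = 0

    def read(self, length=-1):
        if length is None or length < 0:
            # Drain everything that is left in the remaining parts.
            data = b"".join(p.read() for p in self._parts[self._index:])
            self._index = len(self._parts)
            return data
        chunks = []
        remaining = length
        while remaining > 0 and self._index < len(self._parts):
            chunk = self._parts[self._index].read(remaining)
            if chunk:
                chunks.append(chunk)
                remaining -= len(chunk)
            else:
                # Current part is exhausted; move on to the next one.
                self._index += 1
        return b"".join(chunks)


def add_concatenated(tar, name, parts, total_size):
    """Add the catenation of `parts` to `tar` as a single member."""
    info = tarfile.TarInfo(name)
    info.size = total_size          # must equal the number of bytes emitted
    info.mtime = int(time.time())
    tar.addfile(info, ConcatReader(parts))


# Stand-ins for the three smaller files.
pieces = [b"first third, ", b"second third, ", b"last third"]

archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode="w") as tar:
    add_concatenated(tar, "whole.txt",
                     [io.BytesIO(p) for p in pieces],
                     sum(len(p) for p in pieces))

# Read the member back to confirm it holds the joined content.
archive.seek(0)
with tarfile.open(fileobj=archive, mode="r") as tar:
    restored = tar.extractfile("whole.txt").read()
```

The size has to be known up front because the tar header is written before the data, which is why the test code below sets st_size explicitly.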

It's occurred to me that this should be done in a more object oriented style, 
but that feels a bit inconsistent given that fstat is in the os module, and not 
provided as an attribute of a file(-like) object.  Comments?
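
A small Python 3 sketch of the asymmetry mentioned above: os.fstat works on a real file through its descriptor, while an in-memory file-like object has no descriptor to stat. The variable names are illustrative only.

```python
import io
import os
import tempfile

# A real file exposes a descriptor, so os.fstat can report its size.
with tempfile.TemporaryFile() as real_file:
    real_file.write(b"abc")
    real_file.flush()
    real_size = os.fstat(real_file.fileno()).st_size

# An in-memory file-like object has no descriptor, so there is nothing to stat.
try:
    os.fstat(io.BytesIO(b"abc").fileno())
    in_memory_has_stat = True
except io.UnsupportedOperation:
    in_memory_has_stat = False
```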

Here's the test code:

#!/usr/local/cpython-2.7/bin/python

import os
import sys
import copy
import array
import stat_tarfile

def my_stat(filename):
        class mutable_stat:
                pass
        readonly_statobj = os.lstat(filename)
        mutable_statobj = mutable_stat()
        for attribute in dir(readonly_statobj):
                if not attribute.startswith('_'):
                        value = getattr(readonly_statobj, attribute)
                        setattr(mutable_statobj, attribute, value)
        return mutable_statobj

class generate_file_content:
        def __init__(self, number):
                self._multiplier = 100
                self._multipleno = 0
                self._number = str(number)
                self._buffer = ''

        def read(self, length):
                while self._multipleno < self._multiplier and len(self._buffer) < length:
                        self._buffer += self._number
                        self._multipleno += 1
                if self._buffer == '':
                        return ''
                else:
                        result = self._buffer[:length]
                        self._buffer = self._buffer[length:]
                        return result

def main():
        with stat_tarfile.open(fileobj = sys.stdout, mode = "w|") as tar:
                for number in xrange(100):
                        #string = str(number) * 100
                        fileobj = generate_file_content(number)
                        statobj = my_stat('/etc/passwd')
                        statobj.st_size = len(str(number)) * 100
                        filename = 'file-%d.txt' % number
                        tarinfo = tar.gettarinfo(filename, statobj = statobj)
                        tarinfo.uid = 1000
                        tarinfo.gid = 1000
                        tarinfo.uname = "dstromberg"
                        tarinfo.gname = "dstromberg"
                        tar.addfile(tarinfo, fileobj)

main()

----------
components: Library (Lib)
files: tarfile.diff
keywords: patch
messages: 120822
nosy: strombrg
priority: normal
severity: normal
status: open
title: tarfile requires an actual file on disc; a file-like object is insufficient
versions: Python 2.7, Python 3.1
Added file: http://bugs.python.org/file19549/tarfile.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10369>
_______________________________________