Lars Gustäbel added the comment:
tarfile does not use the `format` argument for reading; the format is detected automatically.
You can even mix different formats in one archive and tarfile will be fine with
it.
--
nosy: +lars.gustaebel
___
Python tracker
<ht
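As a rough illustration of the point about format detection, the sketch below writes the same member in each of the three formats tarfile can create and reads every archive back without passing `format`. The helper name `build_archive` and the member name `hello.txt` are made up for this example.

```python
import io
import tarfile


def build_archive(fmt):
    """Return the bytes of a one-member archive written in the given format."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        data = b"hello"
        info = tarfile.TarInfo("hello.txt")  # hypothetical member name
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


names_by_format = {}
for fmt in (tarfile.USTAR_FORMAT, tarfile.GNU_FORMAT, tarfile.PAX_FORMAT):
    raw = build_archive(fmt)
    # No `format` argument on the reading side: it is worked out from the headers.
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as tar:
        names_by_format[fmt] = tar.getnames()
```

The `format` keyword only matters on the writing side here; the reading call is identical for all three archives.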
Lars Gustäbel added the comment:
Actually, it is not prohibited to add the same file to the same archive more
than once.
--
nosy: +lars.gustaebel
___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/i
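A small sketch of what "the same file more than once" looks like in practice, using an in-memory archive. The member name `same.txt` is invented for the example; per the tarfile documentation, `getmember()` resolves a repeated name to its last occurrence.

```python
import io
import tarfile

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    # Add two members under the same (hypothetical) name.
    for payload in (b"first", b"second"):
        info = tarfile.TarInfo("same.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    names = tar.getnames()  # both entries are listed
    # getmember() picks the last occurrence of a repeated name.
    latest = tar.extractfile(tar.getmember("same.txt")).read()
```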
Lars Gustäbel added the comment:
The question is what you're trying to accomplish. If you just want to prevent
tarfile from stopping at the first invalid header in order to extract
everything following it, you may use the ignore_zeros=True keyword argument
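The effect of `ignore_zeros=True` can be sketched with two archives glued together: the end-of-archive zero blocks of the first one normally stop the reader, while `ignore_zeros` lets it continue. The helper `one_member_archive` and the member names are assumptions made for this example, and concatenation merely stands in for whatever produced the empty blocks in the original report.

```python
import io
import tarfile


def one_member_archive(name, payload):
    """Build a complete, uncompressed tar archive holding a single member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


# Two archives back to back: the first one's trailing zero blocks sit in the middle.
joined = one_member_archive("a.txt", b"a") + one_member_archive("b.txt", b"b")

with tarfile.open(fileobj=io.BytesIO(joined), mode="r:") as tar:
    default_names = tar.getnames()  # reading stops at the first zero block

with tarfile.open(fileobj=io.BytesIO(joined), mode="r:", ignore_zeros=True) as tar:
    lenient_names = tar.getnames()  # zero blocks are skipped, reading continues
```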
Lars Gustäbel added the comment:
I suck :-) It is hg revision bb94f6222fef.
--
___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/i
Lars Gustäbel added the comment:
TarFile.makelink() has a fallback mode in case the platform does not support
links. Instead of a symlink or a hardlink it extracts the file it points to as
long as it exists in the current archive.
More precisely, makelink() calls os.symlink() and if one
Lars Gustäbel added the comment:
Please give us some example test code that shows us what goes wrong exactly.
--
___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/i
Lars Gustäbel added the comment:
Closed after years of inactivity.
--
resolution: -> works for me
stage: -> resolved
status: open -> closed
___
Python tracker <rep...@bugs.python.org>
<http://bugs.py
Lars Gustäbel added the comment:
Sorry for the glitch, I suppose everything works fine now.
--
status: open -> closed
___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.or
Lars Gustäbel added the comment:
Closing after six years of inactivity.
--
resolution: -> wont fix
stage: -> resolved
status: open -> closed
___
Python tracker <rep...@bugs.python.org>
<http://bugs.pyt
Changes by Lars Gustäbel <l...@gustaebel.de>:
--
resolution: -> fixed
stage: test needed -> resolved
status: open -> closed
versions: -Python 3.2, Python 3.3, Python 3.4
___
Python tracker <rep...@bugs.python.org>
<http://bu
Lars Gustäbel added the comment:
Thanks for the detailed report and the patch. I haven't checked yet, but I
suppose that the entire 3.x branch is affected. The first thing I have to do
now is to come up with a comprehensive testcase.
--
assignee: -> lars.gustaebel
components: +Library
Changes by Lars Gustäbel l...@gustaebel.de:
--
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24259
Changes by Lars Gustäbel l...@gustaebel.de:
--
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24514
Lars Gustäbel added the comment:
Martin, I followed your suggestion to raise ReadError. This needed an
additional change in copyfileobj() because it is used both for adding file data
to an archive and extracting file data from an archive.
But I think the patch is in good shape now
Lars Gustäbel added the comment:
I think a simple addition to the existing unittest for nti() will be enough.
itn() seems well-tested, and nts() and stn() are not affected, because they
don't operate on numbers.
--
Added file: http://bugs.python.org/file39832/issue24514.diff
Lars Gustäbel added the comment:
The problem is that the tar archive has empty uid and gid fields, i.e. 7 spaces
terminated with a null-byte.
I attached a patch that solves the problem.
--
keywords: +patch
Added file: http://bugs.python.org/file39815/issue24514.diff
Lars Gustäbel added the comment:
You're welcome :-D
--
assignee: -> lars.gustaebel
priority: normal -> low
stage: -> patch review
type: -> behavior
versions: +Python 3.5, Python 3.6
___
Python tracker rep...@bugs.python.org
http://bugs.python.org
Lars Gustäbel added the comment:
Yes, Python 2.7 still gets bugfixes.
However, there's still some work to do on the patch (maybe clean the code,
write a test, add a NEWS entry).
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org
Lars Gustäbel added the comment:
The patch would change behaviour for all tarfile users by the back door, that's
why I am a little reluctant. And if the same can be achieved by a reasonably
simple change to shutil I think it's just as well
Lars Gustäbel added the comment:
You don't need to patch the tarfile module. You could use os.walk() in
shutil._make_tarball() and add each file with TarFile.add(recursive=False).
--
nosy: +lars.gustaebel
___
Python tracker rep...@bugs.python.org
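The suggestion of walking the tree and adding entries one by one might look roughly like the sketch below. It is not the actual `shutil._make_tarball()` code; the function name `add_tree`, its `arcroot` parameter and the demo file names are all made up for illustration.

```python
import os
import tarfile
import tempfile


def add_tree(tar, root, arcroot=""):
    """Add everything below `root` to `tar`, one entry at a time."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        rel = os.path.relpath(dirpath, root)
        arcdir = arcroot if rel == "." else os.path.join(arcroot, rel)
        if arcdir:
            # recursive=False adds only the directory entry itself.
            tar.add(dirpath, arcname=arcdir, recursive=False)
        for name in sorted(filenames):
            tar.add(os.path.join(dirpath, name),
                    arcname=os.path.join(arcdir, name),
                    recursive=False)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    os.makedirs(os.path.join(src, "sub"))
    for rel in ("top.txt", os.path.join("sub", "inner.txt")):
        with open(os.path.join(src, rel), "w") as fh:
            fh.write(rel)

    archive = os.path.join(tmp, "out.tar")
    with tarfile.open(archive, "w") as tar:
        add_tree(tar, src, arcroot="pkg")
    with tarfile.open(archive) as tar:
        names = tar.getnames()
```

Because each entry goes through its own `add()` call, the caller can skip or rewrite individual paths inside the loop without touching tarfile itself.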
Lars Gustäbel added the comment:
@Martin:
This is actually a nice idea that I hadn't thought of. I updated the Python 3
patch to use a seek() that moves to one byte before the next header block,
reads the remaining byte and raises an error if it hits eof. The code looks
rather clean compared
Changes by Lars Gustäbel l...@gustaebel.de:
Added file: http://bugs.python.org/file39580/issue24259-2.x-2.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24259
Lars Gustäbel added the comment:
@Thomas:
I think your proposal adds a little too much complexity. Also, ExFileObject is
not used during iteration, and we would like to detect broken archives without
unpacking all the data segments first.
I have written patches for Python 2 and 3
Lars Gustäbel added the comment:
I have written a test for the issue, so that we have a basis for discussion.
There are four different scenarios where an unexpected eof can occur: inside a
metadata block, directly after a metadata block, inside a data segment or
directly after a data segment
Lars Gustäbel added the comment:
I agree with David that there is no need for tarfile to be thread-safe. There
is nothing to be gained from distributing one TarFile object among multiple
threads because it operates on a single resource which has to be accessed
sequentially anyway. So
Lars Gustäbel added the comment:
I would argue that a serious alternative to this patch is to simply override
the TarFile.chown() method in a subclass. However, I'm not sure if this expects
too much of the user.
--
___
Python tracker rep
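The subclassing alternative mentioned here could be sketched as follows. The class name `NoChownTarFile` is hypothetical, and the override accepts extra positional and keyword arguments because the signature of `chown()` has differed between Python versions.

```python
import io
import os
import tarfile
import tempfile


class NoChownTarFile(tarfile.TarFile):
    """Hypothetical subclass that never changes ownership on extraction."""

    def chown(self, tarinfo, targetpath, *args, **kwargs):
        # Deliberately do nothing: extracted files keep the current user's ownership.
        return None


buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    data = b"payload"
    info = tarfile.TarInfo("owned.txt")
    info.size = len(data)
    info.uid = info.gid = 12345  # arbitrary owner recorded in the archive
    tar.addfile(info, io.BytesIO(data))

buf.seek(0)
with tempfile.TemporaryDirectory() as dest:
    # open() is a classmethod, so calling it on the subclass returns a subclass instance.
    with NoChownTarFile.open(fileobj=buf, mode="r") as tar:
        used_subclass = isinstance(tar, NoChownTarFile)
        tar.extractall(dest)
    extracted = os.path.exists(os.path.join(dest, "owned.txt"))
```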
Lars Gustäbel added the comment:
Please provide a patch which allows easy addition of file-like objects (not
only io.BytesIO) and directories, preferably hard and symbolic links, too. It
would be nice to still be able to change attributes of a TarInfo before
addition. Please also add tests
Lars Gustäbel added the comment:
I don't have an idea how to make it easier and still meet all/most requirements
and without cluttering up the api. The way it currently works allows the
programmer to control every tiny aspect of a tar member. Maybe it's best to
simply add a new entry
Lars Gustäbel added the comment:
tarfile needs to know the size of a file object beforehand because the tar
header is written first followed by the file object's data. If the file object
is not based on a real file descriptor, tarfile cannot simply use os.fstat()
but the user has to pass
Lars Gustäbel added the comment:
Why overcomplicate things?
import io
import tarfile

with tarfile.open("foo.tar", mode="w") as tar:
    b = "hello world!".encode("utf-8")
    t = tarfile.TarInfo("helloworld.txt")
    t.size = len(b)  # this is crucial: the header is written before the data
    tar.addfile(t, io.BytesIO(b))
My answer
Lars Gustäbel added the comment:
Apparently, the problem is located in TarInfo._proc_gnulong(). I attached a
patch.
When tarfile reads an archive, it strips trailing slashes from all filenames,
except GNUTYPE_LONGNAME headers, which is a bug. tarfile creates GNU_FORMAT tar
files by default
Lars Gustäbel added the comment:
The size of the buffer returned by TarInfo.fromtarfile() is checked by
TarInfo.frombuf() which raises either an EmptyHeaderError or
TruncatedHeaderError respectively.
--
assignee: -> lars.gustaebel
resolution: -> not a bug
stage: -> resolved
status
Lars Gustäbel added the comment:
IIRC, tarfile under 2.7 has never been explicitly unicode-safe, support for
unicode objects is heterogeneous at best. The obvious work-around is to work
exclusively with str objects.
What we can't do is to decode the utf-8 pathname from the archive
Lars Gustäbel added the comment:
Let me present for discussion a proposal (and a patch with documentation) with
an approach that is a little different, but in my opinion the most effective. I
hope that it will appeal to all involved.
My proposal consists of a new class SafeTarFile
Lars Gustäbel added the comment:
tarfile.open() actually supports a compresslevel argument for gzip and bzip2
and a preset argument for lzma compression.
--
nosy: +lars.gustaebel
___
Python tracker rep...@bugs.python.org
http://bugs.python.org
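A minimal sketch of passing compression settings through `tarfile.open()`. In the tarfile API the gzip/bzip2 keyword is spelled `compresslevel`, and `preset` applies to xz; the helper `pack` and the member name are invented here, and the example assumes the `bz2` and `lzma` modules are available in the running interpreter.

```python
import io
import tarfile


def pack(mode, **options):
    """Write a one-member archive with the given mode and compression options."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode, **options) as tar:
        data = b"x" * 4096
        info = tarfile.TarInfo("blob.bin")  # hypothetical member name
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


archives = {
    "gz": pack("w:gz", compresslevel=1),
    "bz2": pack("w:bz2", compresslevel=9),
    "xz": pack("w:xz", preset=0),
}

listing = {}
for label, raw in archives.items():
    # Plain "r" lets tarfile detect the compression transparently.
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as tar:
        listing[label] = tar.getnames()
```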
Lars Gustäbel added the comment:
That's right. But it is there.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21404
___
___
Python-bugs-list
Lars Gustäbel added the comment:
That was a design decision. What would be the advantage of having the TarFile
class offer the compression itself?
--
assignee: -> lars.gustaebel
___
Python tracker rep...@bugs.python.org
http://bugs.python.org
Lars Gustäbel added the comment:
You can pass keyword arguments to tarfile.open(), which will be passed to the
TarFile constructor. You can also pass fileobj arguments to tarfile.open().
--
___
Python tracker rep...@bugs.python.org
http
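To make the keyword pass-through concrete, here is a small sketch that hands a file object plus a few TarFile constructor options to `tarfile.open()` and then inspects the resulting object. The particular options chosen are only examples.

```python
import io
import tarfile

buf = io.BytesIO()
# format, encoding and dereference are TarFile constructor keywords;
# tarfile.open() forwards them unchanged.
with tarfile.open(fileobj=buf, mode="w",
                  format=tarfile.PAX_FORMAT,
                  encoding="utf-8",
                  dereference=True) as tar:
    opts = (tar.format, tar.encoding, tar.dereference)
```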
Lars Gustäbel added the comment:
Jup. That's it.
--
priority: normal -> low
resolution: -> not a bug
stage: -> resolved
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21369
Lars Gustäbel added the comment:
Okay, let me tell you why I reject your contribution at this point.
The patch you submitted may be well-suited for your purposes but it does not
meet the requirements of a standard library implementation because it is not
generic and comprehensive enough
Lars Gustäbel added the comment:
[...] but remember, we split a volume only in the middle of a big file, not
in any other case (AFAIK). Hopefully you don't get huge pax headers or
anything strange. [...]
Hopefully? Sorry, but have you tested this? I did. I let GNU tar create a two
volume
Lars Gustäbel added the comment:
In the past, our answer to these kinds of bug reports has always been that you
must not extract an archive from an untrusted source without making sure that
it has no malicious contents. And that tarfile conforms to the posix
specifications with respect
Lars Gustäbel added the comment:
It's also consistent with how the tar command works afaik, just listing the
contents of the current volume.
No, GNU tar operates on the entirety of the archive and asks for the filename
of the subsequent volume every time it hits eof in the current volume
Lars Gustäbel added the comment:
I had the following idea: What about a separate class, let's call it
TarVolumeSet for now, that maps a set of (virtual) volumes onto one big
file-like object. This TarVolumeSet will be passed to a TarFile constructor as
the fileobj argument. It is subclassable
Lars Gustäbel added the comment:
First, I'd like to take back my comment on this patch being too complex for
too little benefit. That is no real argument.
Okay, I gave it a shot and I have a few more remarks:
The patch does not support iterating over a multi-volume tar archive, e.g
Lars Gustäbel added the comment:
I cannot yet go into the details, because I have not tested the patch.
The comments, docstrings and quoting are not very consistent with the rest of
the module. There are a few spelling mistakes. The open_volume() method is more
or less a copy of the open
Lars Gustäbel added the comment:
I'd like to re-emphasize that it is best to keep the whole thing as simple and
straight-forward as possible. Offer some basic operations and that's it.
Although I am pretty accustomed to the original tar command line, I think we
should copy zipfile's interface
New submission from Lars Gustäbel:
Today I accidentally did this:
open(True).read()
Passing True as a file argument to open() does not fail, because a bool value
is treated like an integer file descriptor (stdout in this case). Even worse is
that the read() call hangs in an endless loop
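Why `open(True)` is accepted can be sketched without touching the standard streams: `bool` is a subclass of `int`, and `open()` treats any integer as an already-open file descriptor. The pipe below is just a harmless descriptor to demonstrate with; it is not part of the original report.

```python
import os

# bool is an int subclass, so True compares equal to file descriptor 1 (stdout).
bool_is_int = isinstance(True, int)
true_as_fd = int(True)

# open() accepts an integer file descriptor in place of a path.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"data")
os.close(write_fd)  # closing the write end lets read() see end-of-file
with open(read_fd, "rb") as fh:
    payload = fh.read()
```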
Lars Gustäbel added the comment:
I prepared a patch that fixes this issue and adds a few tests. Please check whether it
works for you.
--
keywords: +patch
stage: -> patch review
Added file: http://bugs.python.org/file27152/issue15875.diff
___
Python
Changes by Lars Gustäbel l...@gustaebel.de:
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
versions: +Python 3.3
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15875
Lars Gustäbel added the comment:
Could you provide some sample data and code? I see the problem, but I cannot
quite reproduce the behaviour you describe. In all of my testcases tarfile
either throws an exception or successfully reads the archive, but never
silently stops.
--
assignee
Changes by Lars Gustäbel l...@gustaebel.de:
--
assignee: -> lars.gustaebel
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14810
___
___
Python-bugs
Lars Gustäbel l...@gustaebel.de added the comment:
This issue is related to issue13158 which deals with a GNU tar specific
extension to the original tar format. In that issue a negative number in the
uid/gid fields caused problems. In your case the problem is a negative mtime
field.
Reading
Changes by Lars Gustäbel l...@gustaebel.de:
--
nosy: +lars.gustaebel
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14807
___
___
Python-bugs-list
Lars Gustäbel l...@gustaebel.de added the comment:
Okay, I am closing this issue now, as I think the problems are now resolved.
--
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13815
Lars Gustäbel l...@gustaebel.de added the comment:
Okay, I attached a patch that I hope we can all agree upon. It restores the
ExFileObject class as a small subclass of BufferedReader as Amaury suggested.
Does the documentation have to be changed, too? It states that an
io.BufferedReader
Lars Gustäbel l...@gustaebel.de added the comment:
In an earlier draft of my patch, I had kept ExFileObject as a subclass of
BufferedReader, but I later decided against it. To use BufferedReader directly
is in my opinion the cleaner solution.
I admit that the change is not fully backward
Lars Gustäbel l...@gustaebel.de added the comment:
I did some tarfile spring cleaning: I removed the ExFileObject class completely
as it was more or less a leftover from the old days. io.BufferedReader now does
the job. So, as a side-effect, I close this issue as fixed.
(BTW, this makes
Lars Gustäbel l...@gustaebel.de added the comment:
Fixed. Thanks for the report.
--
resolution: -> fixed
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14160
Changes by Lars Gustäbel l...@gustaebel.de:
--
resolution: -> invalid
stage: -> committed/rejected
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10369
Changes by Lars Gustäbel l...@gustaebel.de:
--
assignee: -> lars.gustaebel
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14160
___
___
Python-bugs
Lars Gustäbel l...@gustaebel.de added the comment:
Thanks for the report. Attached is a patch (against 3.2) that is supposed to
fix the problem.
--
keywords: +patch
stage: -> patch review
Added file: http://bugs.python.org/file24735/issue14160.diff
Lars Gustäbel l...@gustaebel.de added the comment:
a) Good point, a case of sloppy naming.
b) IMO a table is a tad too much. The number of different compression methods
is still quite small. My patch proposes a simpler approach.
c) A link to shutil is very useful.
BTW, thanks for the effort
Lars Gustäbel l...@gustaebel.de added the comment:
I updated your patch:
- I removed the "import as" bit completely and changed all occurrences of
_open() to builtins.open() which is more readable and explanatory.
- I object to changing the error messages in the 3.2 branch due to backwards
Changes by Lars Gustäbel l...@gustaebel.de:
--
assignee: -> lars.gustaebel
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14012
___
___
Python-bugs
Lars Gustäbel l...@gustaebel.de added the comment:
I think this is a reasonable proposal. I think it is good style to let tarfile
figure out which supported compression methods are available instead of shutil
or the user. So far I have no objections.
Following 3.3's crypt module, I think
Lars Gustäbel l...@gustaebel.de added the comment:
This has been fixed (issue13158,
http://hg.python.org/cpython/rev/341008eab87d). Thanks anyway for the report.
--
resolution: -> duplicate
stage: -> committed/rejected
status: open -> closed
Changes by Lars Gustäbel l...@gustaebel.de:
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13935
Changes by Lars Gustäbel l...@gustaebel.de:
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13815
Lars Gustäbel l...@gustaebel.de added the comment:
You actually hit two bugs at the same time here: The target of the created
symlink was not translated from unix to windows path delimiters and is
therefore broken. The second bug is issue12926 which leads to the error in
TarFile.makefile
Lars Gustäbel l...@gustaebel.de added the comment:
The dereference option is only used for archive creation, so the contents of
the file a symbolic link is pointing to is added instead of the symbolic link
itself.
--
___
Python tracker rep
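The difference the `dereference` option makes at creation time can be sketched like this. The example assumes a platform where `os.symlink()` is permitted, and the file names are invented.

```python
import io
import os
import tarfile
import tempfile

kinds = {}
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target.txt")
    link = os.path.join(tmp, "link.txt")
    with open(target, "w") as fh:
        fh.write("content")
    os.symlink(target, link)  # assumes the platform allows creating symlinks

    for deref in (False, True):
        buf = io.BytesIO()
        with tarfile.open(fileobj=buf, mode="w", dereference=deref) as tar:
            tar.add(link, arcname="link.txt")
        buf.seek(0)
        with tarfile.open(fileobj=buf, mode="r") as tar:
            member = tar.getmember("link.txt")
            # Without dereference the link itself is stored; with it, the target's contents.
            kinds[deref] = "symlink" if member.issym() else "file"
```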
Lars Gustäbel l...@gustaebel.de added the comment:
This should be fixed now, thanks.
--
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed
versions: +Python 3.3
___
Python tracker rep...@bugs.python.org
http://bugs.python.org
Changes by Lars Gustäbel l...@gustaebel.de:
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
versions: +Python 3.3
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13702
Lars Gustäbel l...@gustaebel.de added the comment:
I think we should wrap this up as soon as possible, because it has already
absorbed too much of our time. The issue we discuss here is a tiny glitch
triggered by a corner-case. My original idea was to fix it in a minimal sort of
way
Lars Gustäbel l...@gustaebel.de added the comment:
I thought about that myself, too. It is clearly no new feature, it is really
more some kind of a fix.
Unicode pathnames given to tarfile.open() are just passed through to the open()
function, which is why this always has been working, except
Lars Gustäbel l...@gustaebel.de added the comment:
Wouldn't it be better then to use a default compresslevel of 6 in tarfile? I
used level 9 in my patch without a particular reason, just because I thought 9
must be better than 6 ;-)
--
Added file: http://bugs.python.org/file24084/lzma
Changes by Lars Gustäbel l...@gustaebel.de:
Removed file: http://bugs.python.org/file24084/lzma-preset.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5689
Lars Gustäbel l...@gustaebel.de added the comment:
Yes, that's much better. Thanks for the tip.
--
Added file: http://bugs.python.org/file24086/lzma-preset.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5689
Lars Gustäbel l...@gustaebel.de added the comment:
Is there a good reason why the tarfile mode that is used is w|gz? It seems to
me that this is not necessary, w:gz should be enough. w|gz is for special
operations only (see the tarfile docs).
--
nosy: +lars.gustaebel
Added file: http
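For comparison, the two write modes discussed here can be sketched side by side: `w:gz` writes through a regular gzip file object, while `w|gz` writes a strictly sequential stream (the tarfile docs describe it for non-seekable targets such as pipes or sockets). The helper name and member name are made up.

```python
import io
import tarfile


def write_archive(mode):
    """Write a one-member gzip-compressed archive using the given mode string."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        data = b"hello"
        info = tarfile.TarInfo("hello.txt")  # hypothetical member name
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


archives = {"w:gz": write_archive("w:gz"), "w|gz": write_archive("w|gz")}

listing = {}
for mode, raw in archives.items():
    # Both results are ordinary gzip-compressed tar archives on the reading side.
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:gz") as tar:
        listing[mode] = tar.getnames()
```

When the target is an ordinary file on disk, as in the case discussed, the `w:gz` variant is enough.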
Lars Gustäbel l...@gustaebel.de added the comment:
tarfile under Python 2.x is not particularly designed to support unicode
filenames (the gzip module does not support them either), but that should not
be too hard to fix.
--
keywords: +patch
Added file:
http://bugs.python.org
Lars Gustäbel l...@gustaebel.de added the comment:
Just for the record:
The gzip format (defined in RFC 1952) allows storing the original filename
(without the .gz suffix) in an additional field in the header (the FNAME
field). Latin-1 (iso-8859-1) is required. It is ironic that this causes
Lars Gustäbel l...@gustaebel.de added the comment:
See http://bugs.python.org/issue11638#msg150029
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13639
Lars Gustäbel l...@gustaebel.de added the comment:
Please, go ahead!
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5689
___
___
Python-bugs
Lars Gustäbel l...@gustaebel.de added the comment:
Thanks for the review, guys! I can't close this issue yet because it depends on
#6715.
--
resolution: -> fixed
stage: needs patch -> committed/rejected
___
Python tracker rep...@bugs.python.org
http
Lars Gustäbel l...@gustaebel.de added the comment:
For those who want to test it first, I post the current state of the patch
here. It is ready for commit, there are no failing tests. If nobody objects, I
will apply it this weekend.
--
Added file: http://bugs.python.org/file23880/2011
Lars Gustäbel l...@gustaebel.de added the comment:
I will be happy to, but my spare time is limited right now, so this could take
about a week. If this is a problem, please go ahead.
--
___
Python tracker rep...@bugs.python.org
http
Lars Gustäbel l...@gustaebel.de added the comment:
This is not a bad idea. I recommend keeping it as simple as possible. I would
definitely not be supportive of a full tar clone. List, extract, create - that
should be enough. There are two possible command line choices: do what the
zipfile
Lars Gustäbel l...@gustaebel.de added the comment:
Some testing reveals that the bz2 module in 3.3 cannot fully decompress the file
in question. Only the first 900k are decompressed. Thus, this issue is not
related to issue13158 or the tarfile module.
--
nosy: +lars.gustaebel
Lars Gustäbel l...@gustaebel.de added the comment:
Thanks for the report. There was a problem decoding a special and rare kind of
header field in the archive. The format of the archive is of very bad quality
BTW ;-)
--
resolution: -> fixed
stage: -> committed/rejected
status: open
Changes by Lars Gustäbel l...@gustaebel.de:
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
versions: +Python 3.3
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13158
Changes by Lars Gustäbel l...@gustaebel.de:
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
priority: normal -> low
versions: +Python 3.3 -Python 2.7, Python 3.2
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13031
Lars Gustäbel l...@gustaebel.de added the comment:
Attached is a patch with the current state of my work on lzma integration into
tarfile (17 test errors).
--
assignee: -> lars.gustaebel
keywords: +patch
Added file: http://bugs.python.org/file23162/2011-09-15-tarfile-lzma.diff
Lars Gustäbel l...@gustaebel.de added the comment:
Today I played around with lzma support for tarfile based on your last patch
(see issue5689). There are a few minor issues that I just wanted to mention, as
they break the tarfile testsuite:
- LZMAFile does not expose a name attribute
Changes by Lars Gustäbel l...@gustaebel.de:
--
assignee: -> lars.gustaebel
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12800
___
___
Python-bugs
Changes by Lars Gustäbel l...@gustaebel.de:
--
assignee: -> lars.gustaebel
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12926
___
___
Python-bugs
Lars Gustäbel l...@gustaebel.de added the comment:
It's the low-level operating system aspects of tarfile that are very difficult
to test, e.g. filesystem and operating system dependent features such as
symbolic links, hard links, file permissions, ownership. It is not even
possible
Lars Gustäbel l...@gustaebel.de added the comment:
Closing as fixed. Thanks all!
--
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12841
Lars Gustäbel l...@gustaebel.de added the comment:
Issue #12841 is a duplicate of this one, but I give it precedence because it
comes with a working patch.
--
resolution: -> duplicate
status: open -> closed
versions: +Python 2.7, Python 3.3
Changes by Lars Gustäbel l...@gustaebel.de:
--
versions: +Python 2.7, Python 3.3
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12841
Lars Gustäbel l...@gustaebel.de added the comment:
Yes, it should be fixed in all affected branches.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12841