Nadeem Vawda added the comment:

> How does one create a multi-stream bzip2 file in the first place?
If you didn't do so deliberately, I would guess that you used a parallel compression tool like pbzip2 or lbzip2 to create your bz2 file. These tools work by splitting the input into chunks, compressing each chunk as a separate stream, and then concatenating the streams afterward.

Another possibility is that you just concatenated two existing bz2 files, e.g.:

    $ cat first.bz2 second.bz2 >multi.bz2

> And how do I tell it's multi-stream.

I don't know of any pre-existing tools to do this, but you can write a script for it yourself by feeding the file's data through a BZ2Decompressor. Once the decompressor reaches the end of the first stream, its eof attribute becomes true (Python 3.3 and later), and any further call to decompress() raises EOFError. If the decompressor's unused_data attribute is non-empty at that point, or there is data that has not yet been read from the input file, then the file is either (a) a multi-stream bz2 file or (b) a bz2 file with other data tacked on to the end.

To distinguish between cases (a) and (b), take unused_data + rest_of_input_file and feed it into a new BZ2Decompressor. If you don't get an IOError (an alias of OSError since Python 3.3), then you've got a multi-stream bz2 file. If you *do* get an IOError, then that's case (b): someone has appended non-bz2 data to the end of a bz2 file. For example, Gentoo and Sabayon Linux packages are bz2 files with package metadata appended, according to issue 19839.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20781>
_______________________________________
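
For illustration, the check described in the comment above could be sketched roughly as follows. This is only a minimal sketch, not an existing tool: the function name classify_bz2 and its three return strings are made up for this example, it assumes Python 3.3 or later (for the eof attribute), and it assumes the whole file fits in memory. The demo builds its inputs in memory with bz2.compress() instead of reading real files.

```python
import bz2


def classify_bz2(data: bytes) -> str:
    """Hypothetical helper: classify bz2 data by what follows the first stream.

    Returns 'single-stream', 'multi-stream', or 'trailing-data'.
    """
    first = bz2.BZ2Decompressor()
    # Raises OSError (IOError) if the data does not start with a bz2 stream.
    first.decompress(data)
    if not first.eof:
        # The input ended before the first stream's end-of-stream marker.
        raise ValueError("truncated bz2 stream")

    # Everything after the end of the first stream lands in unused_data.
    rest = first.unused_data
    if not rest:
        return "single-stream"

    try:
        # Case (a): the leftover bytes decode as another bz2 stream.
        bz2.BZ2Decompressor().decompress(rest)
    except OSError:
        # Case (b): non-bz2 data was appended after the stream.
        return "trailing-data"
    return "multi-stream"


# Demo inputs, equivalent to `cat first.bz2 second.bz2 >multi.bz2`.
first_stream = bz2.compress(b"first file contents")
second_stream = bz2.compress(b"second file contents")

print(classify_bz2(first_stream))
print(classify_bz2(first_stream + second_stream))
print(classify_bz2(first_stream + b"package metadata"))
```

For a file on disk, something like classify_bz2(open(path, "rb").read()) would apply the same check. Note that the sketch only examines what immediately follows the first stream; it does not verify that the second stream is complete or look for further streams after it.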