Diez B. Roggisch wrote:
Glenn Maynard wrote:
I want to do something fairly simple: read files from one ZIP and add
them to another, so I can remove and replace files.  This led me to a
couple things that seem to be missing from the API.

<snip>

The correct approach is to copy the data directly, so it's not
recompressed.  This would need two new API calls: rawopen(), acting
like open() but returning a direct file slice and not decompressing
data; and rawwrite(zinfo, file), to pass in pre-compressed data, where
the compression method in zinfo matches the compression type used.

I was surprised that I couldn't find the former.  The latter is an
advanced one, important for implementing any tool that modifies large
ZIPs.  Short-term, at least, I'll probably implement these externally.
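
For illustration only, the read half of this idea could be approximated outside the module roughly as follows. This is a sketch, not an existing zipfile API: read_raw_member is a hypothetical name, and it assumes an unencrypted, non-ZIP64 archive. It relies on ZipInfo.header_offset and ZipInfo.compress_size, and on the fixed 30-byte layout of the local file header.

```python
import io
import struct
import zipfile
import zlib

LOCAL_HEADER_SIZE = 30  # fixed-size part of a ZIP local file header


def read_raw_member(fileobj, zinfo):
    """Return the still-compressed bytes of one member (hypothetical helper)."""
    fileobj.seek(zinfo.header_offset)
    header = fileobj.read(LOCAL_HEADER_SIZE)
    if header[:4] != b"PK\x03\x04":
        raise zipfile.BadZipFile("bad local file header signature")
    # The file name and extra field lengths are two little-endian uint16
    # values at offsets 26 and 28; the member data follows those fields.
    name_len, extra_len = struct.unpack("<HH", header[26:30])
    fileobj.seek(name_len + extra_len, io.SEEK_CUR)
    return fileobj.read(zinfo.compress_size)


# Build a small archive in memory to try the helper out.
buf = io.BytesIO()
payload = b"hello zip " * 100
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("a.txt", payload)

with zipfile.ZipFile(buf) as zf:
    info = zf.getinfo("a.txt")
    raw = read_raw_member(buf, info)

# ZIP members hold raw deflate streams (no zlib header), hence wbits=-15.
restored = zlib.decompress(raw, -15)
```

The write half (rawwrite) would additionally have to emit a local header and a central directory entry carrying the original CRC and sizes, which is not attempted here.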

<snip>

And regarding your second idea: can that really work? Intuitively, I would have thought that compression is adaptive and depends on what was previously added to the file. I might be wrong about this, though.


I'm pretty sure that the ZIP format uses independent compression for each contained file (member). You can add and remove members from an existing ZIP, and use several different compression methods within the same file. So the adaptive tables start over for each new member.
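
A quick way to check this with the standard zipfile module is sketched below. The member names are made up for the example, and it assumes Python 3.7 or later for the compress_type argument of writestr(). If members are compressed independently, the same data should compress to the same size no matter what was added to the archive before it.

```python
import io
import zipfile

buf = io.BytesIO()
text = b"abc" * 1000

# Two members, two different compression methods, one archive.
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("stored.bin", text, compress_type=zipfile.ZIP_STORED)
    zf.writestr("deflated.bin", text, compress_type=zipfile.ZIP_DEFLATED)

# Append a third member later; the existing member data is not rewritten.
with zipfile.ZipFile(buf, "a") as zf:
    zf.writestr("late.bin", text, compress_type=zipfile.ZIP_DEFLATED)

with zipfile.ZipFile(buf) as zf:
    methods = {i.filename: i.compress_type for i in zf.infolist()}
    sizes = {i.filename: i.compress_size for i in zf.infolist()}
    contents = {name: zf.read(name) for name in zf.namelist()}
```

Here sizes["deflated.bin"] and sizes["late.bin"] come out equal, which is what you would expect if the compressor starts from scratch for every member.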

What isn't so convenient is that the central directory, which records each member's sizes and offset, is at the end of the file. So if you're trying to unzip "over the wire" you can't readily list or locate the members without somehow seeking to the end. That same feature is a good thing when it comes to spanning zip files across multiple disks.
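
The layout can be inspected directly, as in the sketch below. It assumes a small archive with no archive comment and no ZIP64 extensions, so that the end-of-central-directory record is exactly the last 22 bytes. The local header in front of each member usually repeats the sizes, but a writer that could not seek back may defer them to a data descriptor after the member data, so a streaming reader cannot always count on them.

```python
import io
import struct
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("a.txt", b"first member")
    zf.writestr("b.txt", b"second member")

data = buf.getvalue()

# With no archive comment, the end-of-central-directory (EOCD) record is
# the final 22 bytes of the file.
eocd = data[-22:]
(signature, disk_no, cd_disk, entries_here, entries_total,
 cd_size, cd_offset, comment_len) = struct.unpack("<4sHHHHIIH", eocd)

# The EOCD says where the central directory starts; that directory in turn
# lists every member along with its sizes and local header offset.
cd_start = data[cd_offset:cd_offset + 4]
```

The fields disk_no and cd_disk are the ones used for archives spanning several disks; for an ordinary single-file archive both are zero.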

The zip file format is documented on the net, but I haven't read the spec in at least 15 years.

DaveA

--
http://mail.python.org/mailman/listinfo/python-list
