On 2018-03-31, kiwiwings wrote:

> Stefan Bodewig wrote
>> ... then the compressed and uncompressed sizes of each
>> ZipArchiveEntry are known before you try to read the stream. Can't you
>> simply reject reading entries whose uncompressed size is too big?
> Are those sizes taken from the central directory?
> Isn't it possible to tamper with those entries?
> If so, would you rely on those?

You are certainly correct. In that case, wrapping the InputStream in a
BoundedInputStream capped at the claimed uncompressed size would work. If
the uncompressed size hasn't been tampered with, then all is well;
otherwise the stream will simply stop reading any further.

> To prevent false positives with people who put an insane amount of
> data into Excel sheets, I've chosen the ratio way, i.e. comparing how
> many bytes of the compressed stream are read vs. the uncompressed
> size. There's another limit check which does exactly what you've
> explained, but this defaults to around 2GB. Although POI anyway has a
> problem dealing with such big files in its normal processing mode,
> there's also a SAX streaming mode which can benefit from this.

> I haven't dived into the compress code yet, but my hope is that the
> ratio logic can also be applied there.

The CompressorInputStreams provide counts of the compressed bytes read,
but at least the InputStream provided by ZipFile#getInputStream does not.
So I'm afraid there currently is no easy way to tell how many compressed
bytes have been processed at any point in time.

> For a reference I've looked through the archives to find related
> discussions [1], but maybe missed the private ones.

I'm not aware of any private discussion.

In this thread as well as in COMPRESS-445 we are only talking about ZIP,
but all the other archive formats (and compression formats) can certainly
also be used for "bombs". Compress' test resources contain tar.gz files of
about 8MB that expand to 8+GB - these are the tests for really big tars.
What I'd really want to do is provide a solution that works for all our
formats, not just for ZIP.

> So the discussions were similar to this thread, i.e. about taking the
> metadata/sizes from the central directory.
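As an aside, the capping idea can be sketched without any Commons Compress
API at all. The BoundedInput class below is a minimal stand-in for Commons
IO's BoundedInputStream (assumed not to be on the classpath), and the
"claimed" size of 1024 bytes simulates tampered metadata - the class and
method names are made up for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class BoundedRead {

    /** Minimal stand-in for Commons IO's BoundedInputStream: stops after 'limit' bytes. */
    static final class BoundedInput extends FilterInputStream {
        private long remaining;

        BoundedInput(InputStream in, long limit) {
            super(in);
            this.remaining = limit;
        }

        @Override
        public int read() throws IOException {
            if (remaining <= 0) {
                return -1;
            }
            int b = super.read();
            if (b != -1) {
                remaining--;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            if (remaining <= 0) {
                return -1;
            }
            int n = super.read(buf, off, (int) Math.min(len, remaining));
            if (n > 0) {
                remaining -= n;
            }
            return n;
        }
    }

    /** Inflates a deflated 1 MiB payload through a cap of 'claimed' bytes; returns bytes read. */
    static long readCapped(long claimed) throws IOException {
        byte[] payload = new byte[1024 * 1024]; // expands to 1 MiB ...
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(compressed, new Deflater())) {
            dos.write(payload); // ... but deflates to a few KB
        }
        long total = 0;
        try (InputStream in = new BoundedInput(
                new InflaterInputStream(new ByteArrayInputStream(compressed.toByteArray())),
                claimed)) {
            byte[] buf = new byte[512];
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Pretend the central directory claimed only 1024 uncompressed bytes.
        System.out.println(readCapped(1024)); // prints 1024, not 1048576
    }
}
```

Whether the metadata was honest or not, the reader never produces more
than the claimed number of bytes.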
True, and you are absolutely correct that this is short-sighted when
somebody who crafts a malicious input may just put the wrong values in
there.

> So my approach would be to try to move to commons-compress, check if the
> ratio logic can be added and, if this is possible, we might also need to
> look at the performance trade-off (if any?) and decide if we want a
> switchable implementation ...

The first thing you'd need is a way to learn how much of the uncompressed
data has been read so far. Given an API of

    InputStream getInputStream(ZipArchiveEntry e)

there isn't much room for providing that information apart from
documenting "the InputStream will be a FooInputStream" and having
FooInputStream provide the information.

An alternative may be a progress-listener approach which would frequently
report the number of bytes processed. This has been discussed before (on
this list a few days ago and in COMPRESS-207).

I'm absolutely open to providing the means to implement the protection;
the dev list might be the better place. Right now I'm not convinced it
should be enabled by default, but we can discuss that once we get there.

Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
For additional commands, e-mail: user-h...@commons.apache.org
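For what it's worth, when the caller controls the Inflater itself,
java.util.zip already exposes the two counters the ratio check needs
(Inflater#getBytesRead and Inflater#getBytesWritten). A rough sketch of a
POI-style ratio check - the threshold constants and method names here are
made up, not anything from Commons Compress or POI:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

public class RatioCheck {

    static final double MAX_RATIO = 100.0; // hypothetical expansion threshold
    static final long GRACE_BYTES = 4096;  // don't flag tiny streams

    /** Inflates 'compressed', failing once output/input exceeds MAX_RATIO. */
    static long inflateChecked(byte[] compressed) throws IOException {
        Inflater inf = new Inflater();
        long total = 0;
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed), inf)) {
            byte[] buf = new byte[512];
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
                long read = inf.getBytesRead();       // compressed bytes consumed so far
                long written = inf.getBytesWritten(); // uncompressed bytes produced so far
                if (written > GRACE_BYTES && read > 0
                        && (double) written / read > MAX_RATIO) {
                    throw new IOException(
                        "suspicious expansion ratio: " + written + "/" + read);
                }
            }
        } finally {
            inf.end(); // release native resources for the caller-supplied Inflater
        }
        return total;
    }

    static byte[] deflate(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(out, new Deflater())) {
            dos.write(data);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] bomb = deflate(new byte[4 * 1024 * 1024]); // 4 MiB of zeros: huge ratio
        try {
            inflateChecked(bomb);
            System.out.println("not detected");
        } catch (IOException e) {
            System.out.println("detected");
        }
    }
}
```

This only works because the Inflater is ours; as noted above, the stream
handed out by ZipFile#getInputStream doesn't expose such counters, which
is exactly the gap being discussed.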
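And to make the listener idea concrete: the interface and wrapper below
are invented names, not an API from Commons Compress, but they show how a
caller could be notified of progress and abort once too many bytes have
been produced:

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ProgressDemo {

    /** Hypothetical callback, sketching the listener idea from COMPRESS-207. */
    interface ProgressListener {
        void bytesProcessed(long totalBytes);
    }

    /** Stream wrapper that reports the running byte count to the listener. */
    static final class ProgressInputStream extends FilterInputStream {
        private final ProgressListener listener;
        private long count;

        ProgressInputStream(InputStream in, ProgressListener listener) {
            super(in);
            this.listener = listener;
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b != -1) {
                listener.bytesProcessed(++count);
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) {
                count += n;
                listener.bytesProcessed(count);
            }
            return n;
        }
    }

    /** Drains 'dataSize' bytes, aborting once the listener sees more than 'limit'. */
    static String drain(int dataSize, long limit) throws IOException {
        try (InputStream in = new ProgressInputStream(
                new ByteArrayInputStream(new byte[dataSize]),
                total -> {
                    if (total > limit) {
                        throw new IllegalStateException("too much data: " + total);
                    }
                })) {
            byte[] buf = new byte[64];
            while (in.read(buf) != -1) {
                // just consume the stream
            }
            return "completed";
        } catch (IllegalStateException e) {
            return "aborted";
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(drain(1000, 100));   // prints "aborted"
        System.out.println(drain(1000, 10000)); // prints "completed"
    }
}
```

The attraction of this shape is that it needs no commitment to a concrete
stream class in the getInputStream contract - the listener is orthogonal
to whatever stream type is returned.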