Hi,

I think there are two distinct types of security vulnerability that we
are talking about here: one is usually called a "Zip bomb", the other an
"XML bomb". Both try to get you to open a malicious file which expands
hugely in memory, so that your application runs out of memory and suffers
a denial of service. The first attacks at the ZIP-file level, i.e. when
the OOXML file is uncompressed during reading. The second attacks at the
level of the XML content which resides inside the ZIP-file.

An "XML bomb" is a file which uses XML features to make a small file
expand to a much larger one in memory. Typically multiple levels of
entity expansion are used to achieve this. Apache POI protects against
this by disabling the XML parser features that would allow such expansion
to take place at all.
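
To make the idea concrete, here is a minimal sketch of that kind of parser
hardening using only the JDK's built-in XML parser. This is not POI's actual
code; the class name XmlBombGuard and its helper methods are made up for
illustration, and the "disallow-doctype-decl" feature URI is the Xerces
feature which the JDK's default parser understands.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper, not part of Apache POI.
final class XmlBombGuard {

    // A tiny nested-entity document: each entity expands to ten copies
    // of the previous one, so the expanded text grows tenfold per level.
    static final String NESTED_ENTITIES =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE bomb [\n"
        + "  <!ENTITY a \"aaaaaaaaaa\">\n"
        + "  <!ENTITY b \"&a;&a;&a;&a;&a;&a;&a;&a;&a;&a;\">\n"
        + "  <!ENTITY c \"&b;&b;&b;&b;&b;&b;&b;&b;&b;&b;\">\n"
        + "]>\n"
        + "<bomb>&c;</bomb>";

    // Builds a factory that refuses any document containing a DOCTYPE,
    // which is where entity definitions live.
    static DocumentBuilderFactory hardenedFactory() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setFeature(
                "http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.setExpandEntityReferences(false);
            return factory;
        } catch (Exception e) {
            throw new IllegalStateException("Could not harden XML parser", e);
        }
    }

    // Returns true if the hardened parser refuses to parse the document.
    static boolean isRejected(String xml) {
        try {
            DocumentBuilder builder = hardenedFactory().newDocumentBuilder();
            builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (Exception e) {
            return true;
        }
    }
}
```

With this setup the parser fails as soon as it sees the DOCTYPE, so the
entities are never expanded, while ordinary XML without a DOCTYPE still
parses normally.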

However, you are actually looking at the code which protects Apache POI
against a "Zip bomb", i.e. a ZIP-file crafted so that it expands to a much
larger amount of memory when uncompressed. This is probably done by
writing lots of similar data, which compresses very well, but I have not
looked into the details of this yet.

While extracting the ZIP-file (all OOXML file types are actually compressed
ZIP-files), Apache POI counts the compressed bytes read and the resulting
uncompressed bytes. If the ratio of compressed to uncompressed bytes is
lower than a given threshold (i.e. the compressed data expands a lot), it
stops processing the file with an error to avoid this type of attack.
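
Roughly, such a ratio check can be sketched with the JDK's own
Deflater/Inflater classes as below. This is only an illustration of the
technique, not POI's implementation: the class InflateRatioGuard, the
method names and the "grace" parameter (how many uncompressed bytes to
allow before the ratio is enforced) are assumptions of mine, and the
values 0.01 and 100 KB in main are just example numbers.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of Apache POI.
final class InflateRatioGuard {

    // Compresses data with plain deflate, only to produce test input.
    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Inflates chunk by chunk and aborts once compressed/uncompressed
    // drops below minRatio after graceBytes of output were produced.
    // Returns the total number of uncompressed bytes on success.
    static long inflateGuarded(byte[] compressed, double minRatio, long graceBytes) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] buf = new byte[8192];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("Truncated or unsupported stream");
                }
                long in = inflater.getBytesRead();      // compressed bytes consumed
                long out = inflater.getBytesWritten();  // uncompressed bytes produced
                if (out > graceBytes && (double) in / out < minRatio) {
                    throw new IllegalStateException(
                        "Possible zip bomb: ratio " + ((double) in / out)
                        + " is below the minimum of " + minRatio);
                }
            }
            return inflater.getBytesWritten();
        } catch (DataFormatException e) {
            throw new IllegalStateException("Corrupt deflate stream", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        // 10 MB of zero bytes compresses extremely well, like a zip bomb.
        byte[] bomb = deflate(new byte[10 * 1024 * 1024]);
        try {
            inflateGuarded(bomb, 0.01, 100 * 1024);
        } catch (IllegalStateException e) {
            System.out.println("Stopped: " + e.getMessage());
        }
    }
}
```

The important point is that the check runs while the data is being
inflated, so processing stops long before the full 10 MB is held in memory.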

If the file in question was produced by yourself, it is probably safe to
lower the threshold somewhat via the API. If it is an external file from
an untrusted source, you likely don't want to process it at all. Only a
close look at the actual ZIP data will tell you for sure.
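
For a first look at the ZIP data, something like the following sketch could
list the compressed/uncompressed ratio of each entry using java.util.zip
from the JDK. The class ZipEntryRatios and its methods are hypothetical
names of mine, and the sample file it writes only stands in for a real
document. Note that the sizes come from the ZIP directory, which an attacker
controls, so this is a quick inspection and not a safety guarantee.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of Apache POI.
final class ZipEntryRatios {

    // Maps each entry name to compressedSize / uncompressedSize as
    // declared in the ZIP central directory.
    static Map<String, Double> ratios(Path zip) {
        Map<String, Double> result = new LinkedHashMap<>();
        try (ZipFile zipFile = new ZipFile(zip.toFile())) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                long size = entry.getSize();
                long compressed = entry.getCompressedSize();
                if (size > 0 && compressed >= 0) {
                    result.put(entry.getName(), (double) compressed / size);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    // Writes a small ZIP with one very repetitive entry, for demonstration.
    static Path writeSampleZip() {
        try {
            Path zip = Files.createTempFile("sample", ".zip");
            zip.toFile().deleteOnExit();
            byte[] repetitive = new byte[1024 * 1024];
            Arrays.fill(repetitive, (byte) 'a');
            try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip))) {
                out.putNextEntry(new ZipEntry("word/document.xml"));
                out.write(repetitive);
                out.closeEntry();
            }
            return zip;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ratios(writeSampleZip()).forEach(
            (name, ratio) -> System.out.println(name + " -> " + ratio));
    }
}
```

Entries with a very small ratio are the ones to look at more closely before
deciding whether to relax any limit.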

Dominik

On Mon, Mar 25, 2019, 15:26 Scott Gardner <[email protected]> wrote:

> I understand that, but specifically what is it in a .zip file that will
> cause this if statement to throw the IllegalStateException?  I don't
> understand where the values of text.length() and string.length() are coming
> from.
>        int size = text.length() + string.length();
>        if(size > ZipSecureFile.getMaxTextSize()) {
> I'm getting this exception and I don't know what (in the .zip file) is
> causing this to be thrown.
> The text would exceed the max allowed overall size of extracted text. By
> default this is prevented as some documents may exhaust available memory
> and it may indicate that the file is used to inflate memory usage and thus
> could pose a security risk. You can adjust this limit via
> ZipSecureFile.setMaxTextSize() if you need to work with files which have a
> lot of text. Size: 10485785, limit: MAX_TEXT_SIZE: 10485760
>
> On 2019/03/22 18:39:06, Scott Gardner <[email protected]> wrote:
> > Can someone explain what causes IllegalStateException to be thrown in
> > POIXMLTextExtractor.java?
> >
> > In the file org/apache/poi/POIXMLTextExtractor.java is this if statement
> >
> >    if(size > ZipSecureFile.getMaxTextSize()) {
> >       throw new IllegalStateException("The text would exceed the max allowed overall size of extracted text. "
> >         + "By default this is prevented as some documents may exhaust available memory and it may indicate that the file is used to inflate memory usage and thus could pose a security risk. "
> >         + "You can adjust this limit via ZipSecureFile.setMaxTextSize() if you need to work with files which have a lot of text. "
> >         + "Size: " + size + ", limit: MAX_TEXT_SIZE: " + ZipSecureFile.getMaxTextSize());
> >    }
> >
> > Can someone tell me exactly what causes this message to be printed? What
> > does "The text" mean in the context of that message?
> > Can someone give me a .zip file that will cause this message to appear and
> > explain to me what it is about the contents of the .zip file
> > causes that message to be printed?
> >
>
