[ https://issues.apache.org/jira/browse/TIKA-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16502296#comment-16502296 ]
Robin Schimpf commented on TIKA-2446:
-------------------------------------

I also see this error from time to time with Tika 1.15. This is a regression from the changes made in TIKA-2311.

[~talli...@apache.org] since you made the changes for the truncated zips can you provide any help or information? I tried to fix the issue but always broke some existing tests.

This might also be forwarded to the POI project. Maybe they can provide a fix in their code.

> Tainted Zip file can provoke OOM errors
> ---------------------------------------
>
>                 Key: TIKA-2446
>                 URL: https://issues.apache.org/jira/browse/TIKA-2446
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.16
>            Reporter: Thorsten Schäfer
>            Priority: Major
>         Attachments: corrupt_zip.zip
>
>
> Hi,
> using Tika 1.16 with embedded POI 3.17-beta1 we experienced an OutOfMemory
> error on a Zip file. The suspicious code is in the constructor of
> FakeZipEntry in line 125. Here a ByteArrayOutputStream of up to 2 GiB in size
> is opened which will most probably lead to an OutOfMemory. The entry size in
> the zip file can be easily faked by an attacker.
> The code path to FakeZipEntry will be used only if the native
> java.util.zip.ZipFile implementation already failed to open the (possibly
> corrupted) Zip. Possibly a more fine grained error analysis could be done in
> ZipPackage.
> I have attached a tweaked zip file that will provoke this error.
> {code:java}
> public FakeZipEntry(ZipEntry entry, InputStream inp) throws IOException {
>     super(entry.getName());
>
>     // Grab the de-compressed contents for later
>     ByteArrayOutputStream baos;
>     long entrySize = entry.getSize();
>     if (entrySize !=-1) {
>         if (entrySize>=Integer.MAX_VALUE) {
>             throw new IOException("ZIP entry size is too large");
>         }
>         baos = new ByteArrayOutputStream((int) entrySize);
>     } else {
>         baos = new ByteArrayOutputStream();
>     }
> {code}
> Kinds,
> Thorsten

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
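
For reference, one possible way to avoid trusting the declared entry size in the quoted FakeZipEntry constructor is sketched below. This is only an illustration, not the fix adopted by POI or Tika: the class name BoundedEntryReader, the constant MAX_INITIAL_CAPACITY and the maxBytes parameter are all hypothetical. The idea is to use the declared size only as a capped hint for the initial buffer and to enforce a limit on the bytes actually read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; not part of POI or Tika.
final class BoundedEntryReader {
    // Assumed cap on the up-front allocation (64 KiB); pick a value that suits the application.
    static final int MAX_INITIAL_CAPACITY = 64 * 1024;

    // Treat the declared size as an untrusted hint: never pre-allocate more than the cap.
    static int initialCapacity(long declaredSize) {
        if (declaredSize < 0) {
            return 32; // size unknown; 32 is the ByteArrayOutputStream default
        }
        return (int) Math.min(declaredSize, MAX_INITIAL_CAPACITY);
    }

    // Read the stream fully, failing once more than maxBytes have actually been read.
    static byte[] readBounded(InputStream in, long declaredSize, long maxBytes) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream(initialCapacity(declaredSize));
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
            if (total > maxBytes) {
                throw new IOException("ZIP entry exceeds limit of " + maxBytes + " bytes");
            }
            baos.write(buf, 0, n);
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // An entry that claims to be almost 2 GiB but really holds 3 bytes.
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3});
        byte[] data = readBounded(in, Integer.MAX_VALUE - 1L, 1024);
        System.out.println(data.length);
    }
}
```

With this approach the buffer grows only as real data arrives, so a faked size field alone cannot trigger a 2 GiB allocation; the trade-off is some extra array copying for entries that are genuinely large.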