[ 
https://issues.apache.org/jira/browse/TIKA-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475557#comment-17475557
 ] 

Tim Allison edited comment on TIKA-3644 at 1/13/22, 5:26 PM:
-------------------------------------------------------------

It looks like the package-depth detector (the 5 in your config) is not 
triggered if the embeddedDocument extractor calls the parse with 
outputHTML=false.  In the MSOffice parser, outputHTML=true; in the 
OOXMLParser, however, outputHTML=false.  I propose that we set 
outputHTML=true throughout the parsers.

 

That said, I did get a zip bomb exception when I set the maxDepth to 5 on both 
MSOffice and ooxml files.

 

-Looking at the SecureContentHandler, I'm frankly not certain what the 
difference between packageDepth and depth is.- :P

 

It looks like the difference is that maxPackageDepth should cover embedded 
items, whereas maxDepth covers all HTML elements.
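
To illustrate the distinction, here is a simplified, self-contained model of 
the two counters.  It is not the actual SecureContentHandler code: the class 
name DepthLimitSketch and the helper tripsLimit are made up for illustration, 
the exact comparison (>=) is a guess, and it assumes that an embedded document 
is wrapped in a <div class="package-entry"> element only when outputHTML=true.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Simplified model of the two limits; NOT the real SecureContentHandler.
class DepthLimitSketch extends DefaultHandler {
    private final int maxDepth;             // limit on nesting of all elements
    private final int maxPackageEntryDepth; // limit on nesting of embedded items
    private int currentDepth = 0;
    // Element depth at which each currently open package-entry div started.
    private final Deque<Integer> packageEntryDepths = new ArrayDeque<>();

    DepthLimitSketch(int maxDepth, int maxPackageEntryDepth) {
        this.maxDepth = maxDepth;
        this.maxPackageEntryDepth = maxPackageEntryDepth;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        // Every element counts towards the general depth limit.
        currentDepth++;
        if (currentDepth >= maxDepth) {
            throw new SAXException("Suspected zip bomb: " + currentDepth
                    + " levels of element nesting");
        }
        // Only the wrapper div of an embedded item counts towards the package limit.
        if ("div".equals(qName) && "package-entry".equals(atts.getValue("class"))) {
            packageEntryDepths.addLast(currentDepth);
            if (packageEntryDepths.size() >= maxPackageEntryDepth) {
                throw new SAXException("Suspected zip bomb: "
                        + packageEntryDepths.size() + " levels of package entry nesting");
            }
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Pop the package entry if this end tag closes its wrapper div.
        if (!packageEntryDepths.isEmpty() && packageEntryDepths.getLast() == currentDepth) {
            packageEntryDepths.removeLast();
        }
        currentDepth--;
    }

    // Hypothetical helper: simulates `levels` nested embedded items, each with one
    // paragraph of content, and reports whether either limit was tripped.
    static boolean tripsLimit(int levels, boolean outputHtml,
                              int maxDepth, int maxPackageEntryDepth) {
        DepthLimitSketch handler = new DepthLimitSketch(maxDepth, maxPackageEntryDepth);
        AttributesImpl entry = new AttributesImpl();
        entry.addAttribute("", "class", "class", "CDATA", "package-entry");
        AttributesImpl none = new AttributesImpl();
        try {
            for (int i = 0; i < levels; i++) {
                if (outputHtml) {
                    // The wrapper is only written when outputHTML=true.
                    handler.startElement("", "div", "div", entry);
                }
                handler.startElement("", "p", "p", none);
                handler.endElement("", "p", "p");
            }
            if (outputHtml) {
                for (int i = 0; i < levels; i++) {
                    handler.endElement("", "div", "div");
                }
            }
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Ten nested embedded items, package limit 5, generous element limit.
        System.out.println("outputHTML=true:  " + tripsLimit(10, true, 100, 5));
        System.out.println("outputHTML=false: " + tripsLimit(10, false, 100, 5));
    }
}
```

In this model the package counter never moves when the wrapper div is not 
written, which would be consistent with the package-depth limit not firing 
when outputHTML=false.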



> OfficeParser can not detect embedded zip bomb in the office documents
> ---------------------------------------------------------------------
>
>                 Key: TIKA-3644
>                 URL: https://issues.apache.org/jira/browse/TIKA-3644
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 2.2.1
>            Reporter: Sergen Bağ
>            Priority: Minor
>         Attachments: 10_2_2_2_2.zip, tika_exception.PNG, zipbomb.doc, 
> zipbomb.docx, zipbomb.ppt, zipbomb.pptx, zipbomb.xls, zipbomb.xlsx
>
>
> Hi, I am trying to trigger the "zip bomb detection" exception but I can't. I 
> used the attachments below and observed the following:
> When I send "zipbomb.xls" and "zipbomb.doc" to Tika, Tika throws an exception.
> When I send "zipbomb.xlsx", "zipbomb.docx", "zipbomb.ppt" and "zipbomb.pptx" to 
> Tika, Tika doesn't throw an exception.
> Thanks.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
