[ 
https://issues.apache.org/jira/browse/TIKA-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103558#comment-16103558
 ] 

Nicholas DiPiazza commented on TIKA-860:
----------------------------------------

It's now totally configurable.

See: 

https://github.com/apache/tika/blob/master/tika-core/src/main/java/org/apache/tika/sax/SecureContentHandler.java#L157
https://github.com/apache/tika/blob/master/tika-core/src/main/java/org/apache/tika/sax/SecureContentHandler.java#L177
https://github.com/apache/tika/blob/master/tika-core/src/main/java/org/apache/tika/sax/SecureContentHandler.java#L137
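
For readers who have not looked at that class, here is a minimal sketch of what a configurable nesting limit on a SAX handler can look like. It is not Tika's code and does not use Tika's API: DepthLimitSketch, DepthLimitingHandler, setMaximumDepth and parsesWithinLimit are hypothetical names made up for this illustration, and it uses only the JDK's SAX classes. It counts XML element nesting, which is only an analogy for the nested-archive case in the report below. The default of 10 just echoes the number mentioned there.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch (not Tika code): a SAX handler with a configurable nesting limit. */
public class DepthLimitSketch {

    /** Counts open elements and fails once the configured limit is exceeded. */
    static class DepthLimitingHandler extends DefaultHandler {
        // Assumed default, chosen only to echo the "nesting level of 10" in the report.
        private int maximumDepth = 10;
        private int currentDepth = 0;

        /** Lets the caller raise or lower the limit instead of hardcoding it. */
        void setMaximumDepth(int maximumDepth) {
            this.maximumDepth = maximumDepth;
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) throws SAXException {
            currentDepth++;
            if (currentDepth > maximumDepth) {
                throw new SAXException("Nesting depth " + currentDepth
                        + " exceeds configured limit " + maximumDepth);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            currentDepth--;
        }
    }

    /** Returns true if the XML parses without exceeding the given depth limit. */
    static boolean parsesWithinLimit(String xml, int maximumDepth) throws Exception {
        DepthLimitingHandler handler = new DepthLimitingHandler();
        handler.setMaximumDepth(maximumDepth);
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    handler);
            return true;
        } catch (SAXException e) {
            // The handler aborted the parse because the limit was exceeded.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b><c><d/></c></b></a>"; // four levels of nesting
        System.out.println(parsesWithinLimit(xml, 3)); // false
        System.out.println(parsesWithinLimit(xml, 4)); // true
    }
}
```

The point of the setter is that a caller with legitimately deep input (such as JARs inside WARs inside a ZIP) can raise the limit rather than filter the content out.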

> Make ZIP bomb detection configurable
> ------------------------------------
>
>                 Key: TIKA-860
>                 URL: https://issues.apache.org/jira/browse/TIKA-860
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.0
>            Reporter: Uwe Schindler
>
> The detection of ZIP bombs is nice and the original issue says it's 
> configurable, but I found no way to change the ParseContext of the 
> AutoDetectParser to e.g. allow deeper nesting levels. The 
> SecureContentHandler instantiation is hardcoded and there is no point of 
> intervention.
> In my case a simple ZIP of an Eclipse project: 
> http://store.pangaea.de/Publications/AltaweelM_2011/Salinization.zip 
> triggered the bomb detection, but it is of course no bomb. It's just because 
> the JAR/WAR files in this project themselves contain other JAR files and class 
> files :-) This overflows the nesting level of 10 - maybe even the TIKA OSGI 
> bundle triggers the bomb detection (not tested).
> In my case I would like to raise the nesting level, but there is no way to 
> do so. My workaround was to simply filter out JAR files by using a custom 
> DocumentSelector in my ParseContext (they contain no metadata we are 
> interested in for our own development; we already removed e.g. CLASS file 
> parsers from our TIKA config, so we have a very simple parser structure only 
> allowing pdf, office documents, txt files,...).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
