[
https://issues.apache.org/jira/browse/TIKA-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18004450#comment-18004450
]
Hudson commented on TIKA-4453:
------------------------------
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk17 #813 (See
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk17/813/])
TIKA-4453 -- decrement embedded depth in rpwh via Stephen H (#2277) (github:
[https://github.com/apache/tika/commit/388097b687560a6f5337eab895273b1c1e6654ed])
* (add) tika-core/src/test/resources/test-documents/massive_embedded.xml
* (edit) tika-core/src/test/java/org/apache/tika/fork/ForkParserTest.java
* (edit) tika-core/src/main/java/org/apache/tika/fork/RecursiveMetadataContentHandlerProxy.java
* (edit) tika-core/src/main/java/org/apache/tika/sax/AbstractRecursiveParserWrapperHandler.java
> ForkParser fails on documents with more than 100 embedded documents
> -------------------------------------------------------------------
>
> Key: TIKA-4453
> URL: https://issues.apache.org/jira/browse/TIKA-4453
> Project: Tika
> Issue Type: Bug
> Components: core
> Affects Versions: 3.2.1
> Reporter: Stephen H
> Priority: Minor
> Fix For: 4.0.0, 3.2.2
>
> Attachments: forkparser-patch.txt
>
>
> ForkParser uses RecursiveMetadataContentHandlerProxy, which overrides
> endEmbeddedDocument() but does not call the superclass method. Because of
> this, the embeddedDepth in AbstractRecursiveParserWrapperHandler gets
> incremented with each new embedded document but never decremented. Once the
> depth reaches the maximum of 100 embedded documents, a SAXException is thrown
> by AbstractRecursiveParserWrapperHandler.startEmbeddedDocument().
> The attached patch adds a new method to AbstractRecursiveParserWrapperHandler
> that decrements the depth; RecursiveMetadataContentHandlerProxy calls it from
> endEmbeddedDocument(). There is a new ForkParser test for a document with 110
> embedded documents.
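
The failure mode described above can be sketched in isolation. This is not the Tika source: BaseHandler, LeakyProxy and FixedProxy are hypothetical stand-ins for AbstractRecursiveParserWrapperHandler and RecursiveMetadataContentHandlerProxy, the method signatures are simplified, IllegalStateException stands in for SAXException, and the exact boundary check (depth > 100) is an assumption.

```java
public class DepthSketch {
    // Assumed limit, taken from the "100 embedded documents" in the report.
    static final int MAX_DEPTH = 100;

    // Hypothetical stand-in for AbstractRecursiveParserWrapperHandler.
    public static class BaseHandler {
        private int embeddedDepth = 0;

        public void startEmbeddedDocument() {
            embeddedDepth++;
            if (embeddedDepth > MAX_DEPTH) {
                // The real handler is reported to throw SAXException here.
                throw new IllegalStateException("max embedded depth exceeded");
            }
        }

        public void endEmbeddedDocument() {
            decrementEmbeddedDepth();
        }

        // Hypothetical name for the new method the patch is said to add.
        protected void decrementEmbeddedDepth() {
            embeddedDepth--;
        }
    }

    // Overrides endEmbeddedDocument() without touching the depth: the bug.
    public static class LeakyProxy extends BaseHandler {
        @Override
        public void endEmbeddedDocument() {
            // proxy-specific work only; depth is never decremented
        }
    }

    // Same override, but it also decrements the depth: the fix.
    public static class FixedProxy extends BaseHandler {
        @Override
        public void endEmbeddedDocument() {
            // proxy-specific work, then restore the depth
            decrementEmbeddedDepth();
        }
    }

    // Feeds n sibling embedded documents (all at depth 1) to the handler and
    // returns how many were fully processed before the limit tripped.
    public static int countAccepted(BaseHandler handler, int n) {
        int accepted = 0;
        try {
            for (int i = 0; i < n; i++) {
                handler.startEmbeddedDocument();
                handler.endEmbeddedDocument();
                accepted++;
            }
        } catch (IllegalStateException e) {
            // limit reached; stop counting
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + countAccepted(new LeakyProxy(), 110));
        System.out.println("fixed: " + countAccepted(new FixedProxy(), 110));
    }
}
```

Under these assumptions the leaky variant stops after 100 sibling documents even though none of them is nested, while the fixed variant processes all 110, which mirrors the new ForkParser test mentioned above.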
--
This message was sent by Atlassian Jira
(v8.20.10#820010)