[ https://issues.apache.org/jira/browse/TIKA-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505358#comment-14505358 ]

Hudson commented on TIKA-1611:
------------------------------

SUCCESS: Integrated in tika-trunk-jdk1.7 #639 (See [https://builds.apache.org/job/tika-trunk-jdk1.7/639/])
TIKA-1611 -- allow RecursiveParserWrapper to catch exceptions caused by embedded documents (tallison: http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1675159)
* /tika/trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/RecursiveParserWrapperFSConsumer.java
* /tika/trunk/tika-batch/src/main/java/org/apache/tika/util/TikaExceptionFilter.java
* /tika/trunk/tika-batch/src/test/java/org/apache/tika/util
* /tika/trunk/tika-core/src/main/java/org/apache/tika/parser/RecursiveParserWrapper.java
* /tika/trunk/tika-core/src/main/java/org/apache/tika/utils/ExceptionUtils.java
* /tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/RecursiveParserWrapperTest.java
* /tika/trunk/tika-parsers/src/test/resources/test-documents/test_recursive_embedded_npe.docx
* /tika/trunk/tika-server/src/test/java/org/apache/tika/server/RecursiveMetadataResourceTest.java


> Allow RecursiveParserWrapper to catch exceptions from embedded documents
> ------------------------------------------------------------------------
>
>                 Key: TIKA-1611
>                 URL: https://issues.apache.org/jira/browse/TIKA-1611
>             Project: Tika
>          Issue Type: Improvement
>          Components: core
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Minor
>             Fix For: 1.9
>
>
> Currently, when a parser hits an EncryptedDocumentException or anything
> wrapped in a TikaException while parsing an embedded document, the
> exception is swallowed by {{ParsingEmbeddedDocumentExtractor}}:
> {noformat}
>             DELEGATING_PARSER.parse(
>                                     newStream,
>                                     new EmbeddedContentHandler(new BodyContentHandler(handler)),
>                                     metadata, context);
>         } catch (EncryptedDocumentException ede) {
>             // TODO: can we log a warning that we lack the password?
>             // For now, just skip the content
>         } catch (TikaException e) {
>             // TODO: can we log a warning somehow?
>             // Could not parse the entry, just skip the content
>         } finally {
>             tmp.close();
>         }
> {noformat}
> For some applications, it might be better to record the stack trace of the
> exception that an attachment caused.
> The proposal is to include that stack trace in the metadata object for
> that particular attachment.
> Users will be able to specify whether or not to store stack traces; the
> default will be to store them.  This is a small change to the legacy
> behavior.
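
The proposal quoted above could be sketched roughly as follows. This is a minimal illustration only, not the committed Tika change: a plain Map stands in for Tika's Metadata object, and the class name, the EmbeddedParser interface, the EMBEDDED_EXCEPTION_KEY key, and the storeStackTraces flag are all hypothetical names invented for the example. It also assumes Java 8 lambdas for brevity.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;

final class EmbeddedExceptionSketch {

    // Hypothetical metadata key; not taken from the Tika source.
    static final String EMBEDDED_EXCEPTION_KEY = "X-Sketch:embedded_exception";

    /** Hypothetical stand-in for a parser handling one embedded document. */
    interface EmbeddedParser {
        void parse(Map<String, String> metadata) throws Exception;
    }

    /** Renders the full stack trace of the given throwable as a String. */
    static String stackTraceOf(Throwable t) {
        StringWriter sw = new StringWriter();
        PrintWriter pw = new PrintWriter(sw);
        t.printStackTrace(pw);
        pw.flush();
        return sw.toString();
    }

    /**
     * Parses one embedded document. If the parser throws, the exception is
     * not propagated; when storeStackTraces is true, its stack trace is
     * recorded in that document's metadata instead of being discarded.
     */
    static Map<String, String> parseEmbedded(EmbeddedParser parser,
                                             boolean storeStackTraces) {
        // A plain map stands in for the per-attachment metadata object.
        Map<String, String> metadata = new LinkedHashMap<>();
        try {
            parser.parse(metadata);
        } catch (Exception e) {
            if (storeStackTraces) {
                metadata.put(EMBEDDED_EXCEPTION_KEY, stackTraceOf(e));
            }
            // Otherwise keep the legacy behavior: skip the content silently.
        }
        return metadata;
    }

    public static void main(String[] args) {
        EmbeddedParser failing = md -> {
            throw new IllegalStateException("simulated embedded failure");
        };
        Map<String, String> withTrace = parseEmbedded(failing, true);
        Map<String, String> legacy = parseEmbedded(failing, false);
        System.out.println(withTrace.containsKey(EMBEDDED_EXCEPTION_KEY));
        System.out.println(legacy.containsKey(EMBEDDED_EXCEPTION_KEY));
    }
}
```

Keeping the trace on the attachment's own metadata, rather than rethrowing, lets the remaining embedded documents still be processed while the failure stays visible to the caller.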



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
