[ https://issues.apache.org/jira/browse/TIKA-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505358#comment-14505358 ]

Hudson commented on TIKA-1611:
------------------------------

SUCCESS: Integrated in tika-trunk-jdk1.7 #639 (See [https://builds.apache.org/job/tika-trunk-jdk1.7/639/])
TIKA-1611 -- allow RecursiveParserWrapper to catch exceptions caused by embedded documents (tallison: http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1675159)
* /tika/trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/RecursiveParserWrapperFSConsumer.java
* /tika/trunk/tika-batch/src/main/java/org/apache/tika/util/TikaExceptionFilter.java
* /tika/trunk/tika-batch/src/test/java/org/apache/tika/util
* /tika/trunk/tika-core/src/main/java/org/apache/tika/parser/RecursiveParserWrapper.java
* /tika/trunk/tika-core/src/main/java/org/apache/tika/utils/ExceptionUtils.java
* /tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/RecursiveParserWrapperTest.java
* /tika/trunk/tika-parsers/src/test/resources/test-documents/test_recursive_embedded_npe.docx
* /tika/trunk/tika-server/src/test/java/org/apache/tika/server/RecursiveMetadataResourceTest.java


> Allow RecursiveParserWrapper to catch exceptions from embedded documents
> ------------------------------------------------------------------------
>
>                 Key: TIKA-1611
>                 URL: https://issues.apache.org/jira/browse/TIKA-1611
>             Project: Tika
>          Issue Type: Improvement
>          Components: core
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Minor
>             Fix For: 1.9
>
>
> While parsing embedded documents, currently, if a parser hits an
> EncryptedDocumentException or anything wrapped in a TikaException, the
> Exception is swallowed by {{ParsingEmbeddedDocumentExtractor}}:
> {noformat}
>     DELEGATING_PARSER.parse(
>             newStream,
>             new EmbeddedContentHandler(new BodyContentHandler(handler)),
>             metadata, context);
> } catch (EncryptedDocumentException ede) {
>     // TODO: can we log a warning that we lack the password?
>     // For now, just skip the content
> } catch (TikaException e) {
>     // TODO: can we log a warning somehow?
>     // Could not parse the entry, just skip the content
> } finally {
>     tmp.close();
> }
> {noformat}
> For some applications, it might be better to store the stack trace of the
> attachment that caused an exception.
> The proposal would be to include the stack trace in the metadata object for
> that particular attachment.
> The user will be able to specify whether or not to store stack traces, and
> the default will be to store stack traces. This will be a small change to
> the legacy behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)