[ https://issues.apache.org/jira/browse/TIKA-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison resolved TIKA-1611.
-------------------------------
Resolution: Fixed
Fixed in r1675159.
Nothing like testing to see behavior, rather than assumptions. :(
> Allow RecursiveParserWrapper to catch exceptions from embedded documents
> ------------------------------------------------------------------------
>
> Key: TIKA-1611
> URL: https://issues.apache.org/jira/browse/TIKA-1611
> Project: Tika
> Issue Type: Improvement
> Components: core
> Reporter: Tim Allison
> Assignee: Tim Allison
> Priority: Minor
> Fix For: 1.9
>
>
> Currently, if a parser hits an EncryptedDocumentException or anything
> wrapped in a TikaException while parsing embedded documents, the
> exception is swallowed by {{ParsingEmbeddedDocumentExtractor}}:
> {noformat}
> try {
>     DELEGATING_PARSER.parse(
>             newStream,
>             new EmbeddedContentHandler(new BodyContentHandler(handler)),
>             metadata, context);
> } catch (EncryptedDocumentException ede) {
>     // TODO: can we log a warning that we lack the password?
>     // For now, just skip the content
> } catch (TikaException e) {
>     // TODO: can we log a warning somehow?
>     // Could not parse the entry, just skip the content
> } finally {
>     tmp.close();
> }
> {noformat}
> For some applications, it might be better to store the stack trace of the
> exception that an attachment caused.
> The proposal is to include the stack trace in the metadata object for
> that particular attachment.
> The user will be able to specify whether or not to store stack traces, and
> the default will be to store them. This will be a small change to
> the legacy behavior.
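
The proposal above could be sketched roughly as follows. This is only an illustration of the idea, not the committed change: it uses no Tika classes, and the metadata key, the EmbeddedParser interface, and the storeStackTraces flag are all hypothetical names chosen for the example. A plain Map stands in for the attachment's metadata object.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;

final class EmbeddedExceptionSketch {

    // Hypothetical metadata key; the key used by the actual fix is not stated in this message.
    static final String EMBEDDED_EXCEPTION_KEY = "sketch:embedded_exception";

    /** Hypothetical stand-in for a parser that handles one embedded document. */
    interface EmbeddedParser {
        void parse(Map<String, String> metadata) throws Exception;
    }

    /**
     * Parses one embedded document. If parsing fails and storeStackTraces is true,
     * the stack trace is recorded in that attachment's metadata. If the flag is
     * false, the exception is skipped silently, as in the legacy behavior.
     */
    static void parseEmbedded(EmbeddedParser parser,
                              Map<String, String> metadata,
                              boolean storeStackTraces) {
        try {
            parser.parse(metadata);
        } catch (Exception e) {
            if (storeStackTraces) {
                StringWriter trace = new StringWriter();
                e.printStackTrace(new PrintWriter(trace, true));
                metadata.put(EMBEDDED_EXCEPTION_KEY, trace.toString());
            }
            // Either way, the failure of one attachment does not abort the caller.
        }
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        parseEmbedded(md -> { throw new IllegalStateException("no password"); },
                metadata, true);
        System.out.println(metadata.containsKey(EMBEDDED_EXCEPTION_KEY));
    }
}
```

Keeping the trace in the per-attachment metadata, rather than rethrowing, means the container document and its other attachments are still processed, while callers that care can inspect the failures afterwards.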