[ 
https://issues.apache.org/jira/browse/TIKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584229#comment-13584229
 ] 

Jukka Zitting commented on TIKA-1074:
-------------------------------------

bq. InterruptedException is never thrown in these places today, so I can't add 
the separate catch clause (compiler is angry).

It's a checked exception, so if POI doesn't declare it, it shouldn't get 
thrown here (even though the JVM doesn't strictly prohibit that). So in that 
case the extra catch clause shouldn't even be needed.
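
To illustrate the parenthetical about the VM, here is a minimal sketch. It is 
not POI or Tika code: the class and method names are hypothetical, and 
undeclaredThrower() only stands in for a library call that declares no checked 
exceptions. The compiler rejects a catch clause for InterruptedException here, 
yet the exception can still arrive at runtime through a generic "sneaky throw".

```java
class CheckedExceptionSketch {

    // Generic trick: the call site picks E = RuntimeException, so the compiler
    // demands no throws clause, and the cast to E is erased at runtime.
    @SuppressWarnings("unchecked")
    static <E extends Throwable> void sneakyThrow(Throwable t) throws E {
        throw (E) t;
    }

    // Hypothetical stand-in for a library method that declares no checked
    // exceptions but throws one anyway.
    static void undeclaredThrower() {
        CheckedExceptionSketch.<RuntimeException>sneakyThrow(
                new InterruptedException("undeclared"));
    }

    // Returns true if an InterruptedException escaped the undeclared call.
    static boolean sawInterruptedException() {
        try {
            undeclaredThrower();
            return false;
        // A "catch (InterruptedException e)" clause here would not compile,
        // because the try body never declares that exception.
        } catch (Exception e) {
            return e instanceof InterruptedException;
        }
    }
}
```

Well-behaved code never does this, which is why the extra clause should be 
unnecessary in practice.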

bq. I think it's cleaner to set the interrupt bit and let the next place that 
waits see the interrupt bit and throw IE?

I don't really like this approach. We're essentially saying: "Yes, you asked me 
to stop what I'm doing, but instead I'll just finish up what I was doing and 
ask the next guy to stop." Instead, when receiving an IE I'd prefer Tika to 
stop immediately, either by letting the IE bubble up or (where necessary) by 
throwing a TikaException that wraps the IE.
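
A rough sketch of the two options, under stated assumptions: 
ParseAbortedException is a hypothetical stand-in for TikaException so the 
snippet compiles without Tika on the classpath, and the Callable stands in for 
whatever parses an embedded document. Neither method is taken from the patch.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for TikaException (message plus cause).
class ParseAbortedException extends Exception {
    ParseAbortedException(String message, Throwable cause) {
        super(message, cause);
    }
}

class InterruptHandlingSketch {

    // Option A: re-set the interrupt bit and carry on, leaving it to the next
    // blocking call to notice the interrupt.
    static String deferInterrupt(Callable<String> embeddedParse) {
        try {
            return embeddedParse.call();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "";
        } catch (Exception e) {
            // Other failures in an embedded document are suppressed so that
            // extraction of the rest of the document continues.
            return "";
        }
    }

    // Option B: stop immediately by wrapping the InterruptedException.
    static String stopImmediately(Callable<String> embeddedParse)
            throws ParseAbortedException {
        try {
            return embeddedParse.call();
        } catch (InterruptedException e) {
            throw new ParseAbortedException(
                    "Interrupted while parsing embedded document", e);
        } catch (Exception e) {
            // Same suppression as above for non-interrupt failures.
            return "";
        }
    }
}
```

With option B the caller sees the failure right away and can get at the 
original exception through getCause().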
                
> Extraction should continue if an exception is hit visiting an embedded 
> document
> -------------------------------------------------------------------------------
>
>                 Key: TIKA-1074
>                 URL: https://issues.apache.org/jira/browse/TIKA-1074
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 1.4
>
>         Attachments: TIKA-1074.patch, TIKA-1074.patch
>
>
> Spinoff from TIKA-1072.
> In that issue, a problematic document (still not sure if document is corrupt, 
> or possible POI bug) caused an exception when visiting the embedded documents.
> If I change Tika to suppress that exception, the rest of the document 
> extracts fine.
> So somehow I think we should be more robust here, and maybe log the 
> exception, or save/record the exception(s) somewhere so after parsing the app 
> could decide what to do about them ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira