[ 
https://issues.apache.org/jira/browse/TIKA-645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022803#comment-13022803
 ] 

Nick Burch commented on TIKA-645:
---------------------------------

One solution that springs to mind is to place the hasFile and getFile methods 
onto an interface. TikaInputStream, TaggedInputStream and CountingInputStream 
could then all implement this, with the two wrappers delegating to the stream 
they wrap. That way, if the underlying stream is a TikaInputStream, the parser 
can still find out and grab the file. If it isn't, then nothing changes.
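
A rough sketch of the idea, using stand-in classes rather than the real Tika 
ones. The names FileBacked, SketchFileStream and SketchWrapper are made up for 
illustration, and exception handling is left out; only hasFile and getFile come 
from the proposal above.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical interface name; the proposal only says "an interface".
interface FileBacked {
    boolean hasFile();
    File getFile();
}

// Stand-in for a file-aware stream such as TikaInputStream (assumption).
class SketchFileStream extends FilterInputStream implements FileBacked {
    private final File file; // null when the stream is not backed by a file

    SketchFileStream(InputStream in, File file) {
        super(in);
        this.file = file;
    }

    @Override public boolean hasFile() { return file != null; }
    @Override public File getFile() { return file; }
}

// Stand-in for a wrapper such as TaggedInputStream or CountingInputStream.
// It answers by asking the stream it wraps, so the question passes down
// through any number of wrappers.
class SketchWrapper extends FilterInputStream implements FileBacked {
    SketchWrapper(InputStream in) { super(in); }

    @Override public boolean hasFile() {
        return in instanceof FileBacked && ((FileBacked) in).hasFile();
    }

    @Override public File getFile() {
        return hasFile() ? ((FileBacked) in).getFile() : null;
    }
}

class FileBackedSketchDemo {
    public static void main(String[] args) {
        InputStream raw = new ByteArrayInputStream(new byte[0]);
        InputStream original = new SketchFileStream(raw, new File("example.doc"));
        // Mimics the tagged-wrapping-counting-wrapping-original chain.
        InputStream seenByParser = new SketchWrapper(new SketchWrapper(original));

        // The parser checks for the interface, not for a concrete class.
        if (seenByParser instanceof FileBacked
                && ((FileBacked) seenByParser).hasFile()) {
            System.out.println(((FileBacked) seenByParser).getFile().getName());
        }
    }
}
```

With this shape the parser never needs to know how many wrappers sit in 
between, and a wrapper around a plain InputStream simply reports no file.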

> Parsers can't get at an underlying TikaInputStream to get the file if they 
> wanted one
> -------------------------------------------------------------------------------------
>
>                 Key: TIKA-645
>                 URL: https://issues.apache.org/jira/browse/TIKA-645
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.9
>            Reporter: Nick Burch
>
> Spotted this with the office parser, but it should be general. The user 
> creates a TikaInputStream, and passes that off to the parser framework. The 
> Parser that is called may wish to spot that the input is a File-backed 
> TikaInputStream, and take a shortcut to use the file instead of the 
> InputStream.
> However, what the parser gets is a TaggedInputStream wrapping a 
> CountingInputStream wrapping the original TikaInputStream. As such, it can't 
> get at the file.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
