[ 
https://issues.apache.org/jira/browse/TIKA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856660#action_12856660
 ] 

Chris A. Mattmann commented on TIKA-153:
----------------------------------------

{quote}
The current Tika APIs are already pretty good, and I'd hate to complicate the 
clean Parser interface with extra methods for different kinds of inputs. 
Instead I'm thinking of adding a TikaInputStream utility class that extends 
InputStream with methods that allow accessing the input document as a File.

The TikaInputStream class would have at least the following constructors:

    public TikaInputStream(InputStream stream) { ... }
    public TikaInputStream(File file) { ... }

{quote}

+100!! :) I could have used this for TIKA-400, since NetCDF expects its input 
as a File (and provides no other means of reading it). This happens a lot: 
streaming doesn't make much sense for data-intensive files with a huge memory 
footprint...
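
For illustration, here is a minimal sketch of the idea quoted above. It is not 
the actual Tika class: the name FileBackedInputStream, the getFile() accessor, 
and the spool-to-temp-file behaviour are all assumptions made for this example. 
It also assumes getFile() is called before any bytes have been read.

```java
import java.io.File;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch only; the class and method names are illustrative
// assumptions, not the API proposed in TIKA-153.
class FileBackedInputStream extends FilterInputStream {
    private File file;          // backing file, if known or once spooled
    private boolean temporary;  // true if this class created the file itself

    // Wraps a plain stream; no file exists until getFile() is called.
    FileBackedInputStream(InputStream stream) {
        super(stream);
    }

    // Wraps an existing file; getFile() returns it without any copying.
    FileBackedInputStream(File file) throws IOException {
        super(Files.newInputStream(file.toPath()));
        this.file = file;
    }

    // Returns the document as a File. For a stream-backed instance the
    // remaining bytes are spooled to a temporary file on first use, and
    // further reads are served from the start of that file.
    File getFile() throws IOException {
        if (file == null) {
            File tmp = File.createTempFile("doc-", ".tmp");
            Files.copy(in, tmp.toPath(), StandardCopyOption.REPLACE_EXISTING);
            in.close();
            in = Files.newInputStream(tmp.toPath());
            file = tmp;
            temporary = true;
        }
        return file;
    }

    // Closes the stream and deletes the file only if it was a temp copy.
    @Override
    public void close() throws IOException {
        try {
            super.close();
        } finally {
            if (temporary && file != null) {
                Files.deleteIfExists(file.toPath());
            }
        }
    }
}
```

With something along these lines, a File-only library such as NetCDF could 
call getFile() and skip the extra copy whenever the client already passed in 
a File, while stream-only callers would still work through the same class.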

Cheers,
Chris

> Allow passing of files or memory buffers to parsers
> ---------------------------------------------------
>
>                 Key: TIKA-153
>                 URL: https://issues.apache.org/jira/browse/TIKA-153
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>            Priority: Minor
>
> Some of our parsers need to be able to go back and forth within a source 
> document, so they need either a file or (for smaller documents) an in-memory 
> buffer that contains the full document. Currently we use temporary files for 
> such cases, which sometimes means making an extra copy of a file before it 
> gets parsed. We should come up with some way for clients to pass in a file or 
> a memory buffer if one is available.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
