[ https://issues.apache.org/jira/browse/TIKA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856660#action_12856660 ]
Chris A. Mattmann commented on TIKA-153:
----------------------------------------
{quote}
The current Tika APIs are already pretty good, and I'd hate to complicate the
clean Parser interface with extra methods for different kinds of inputs.
Instead I'm thinking of adding a TikaInputStream utility class that extends
InputStream with methods that allow accessing the input document as a File.
The TikaInputStream class would have at least the following constructors:
public TikaInputStream(InputStream stream) { ... }
public TikaInputStream(File file) { ... }
{quote}
+100!! :) I could have used this for TIKA-400, since NetCDF expects its input
as a File and provides no other means of reading it. This comes up a lot:
streaming doesn't make much sense for data-intensive files with a huge memory
footprint...
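To make the idea concrete, here is a rough sketch of how such a wrapper might behave. It is only an illustration under stated assumptions, not the proposed Tika class: the name FileBackedInputStream, the getFile() method, and the spool-to-temp-file behaviour are all hypothetical. When constructed from a File it hands that file back without copying; when constructed from a plain stream it copies the remaining bytes to a temporary file the first time a File is requested.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch only; the class and method names are assumptions.
public class FileBackedInputStream extends InputStream {
    private InputStream in;     // stream that read() delegates to
    private File file;          // backing file, if one is known
    private boolean temporary;  // true if we created the file ourselves

    public FileBackedInputStream(InputStream stream) {
        this.in = stream;
    }

    public FileBackedInputStream(File file) throws IOException {
        this.file = file;
        this.in = new BufferedInputStream(new FileInputStream(file));
    }

    // Returns the document as a File. If the stream was not created from
    // a file, the remaining bytes are spooled to a temporary file first.
    // Limitation: bytes already consumed through read() are not included.
    public File getFile() throws IOException {
        if (file == null) {
            File tmp = File.createTempFile("sketch-", ".tmp");
            Files.copy(in, tmp.toPath(), StandardCopyOption.REPLACE_EXISTING);
            in.close();
            file = tmp;
            temporary = true;
            // Keep the stream usable by reading from the spooled copy.
            in = new BufferedInputStream(new FileInputStream(tmp));
        }
        return file;
    }

    @Override
    public int read() throws IOException {
        return in.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        return in.read(b, off, len);
    }

    // Closes the stream and removes the temporary file, if any.
    // A file supplied by the caller is never deleted.
    @Override
    public void close() throws IOException {
        in.close();
        if (temporary) {
            file.delete();
        }
    }
}
```

A File-only parser like the NetCDF one could then call getFile() on whatever it is handed, and clients that already have a file on disk would avoid the extra copy.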
Cheers,
Chris
> Allow passing of files or memory buffers to parsers
> ---------------------------------------------------
>
> Key: TIKA-153
> URL: https://issues.apache.org/jira/browse/TIKA-153
> Project: Tika
> Issue Type: New Feature
> Components: parser
> Reporter: Jukka Zitting
> Priority: Minor
>
> Some of our parsers need to be able to go back and forth within a source
> document, so need either a file or (for smaller documents) an in-memory
> buffer that contains the full document. Currently we use temporary files for
> such cases, which in some cases means doing an extra copy of a file before it
> gets parsed. We should come up with some way for clients to pass in a file or
> a memory buffer if one is available.