[
https://issues.apache.org/jira/browse/TIKA-17?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Keith R. Bennett updated TIKA-17:
---------------------------------
Attachment: tika-17.patch
PLEASE NOTE THAT tika-17.patch IS THE MOST RECENT PATCH (not tika-17-2.patch).
I am attaching a third patch file, giving it the original name tika-17.patch as
per Jukka's advice.
This patch addresses issues 17, 20, and 24.
These changes are based on conversations on the forum and on conversations with
Chris Mattmann yesterday.
A new class, ParseUtils, has methods to create parsers and to parse documents
in a single call. Where practical, each method comes in two forms: one that
determines the MIME type itself, and one that takes the MIME type as a
parameter (to save processing time, or for callers who want to use their own
detection scheme).
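To illustrate the two forms, here is a minimal sketch of the overload pattern. It is not the code in the patch: the class name MimeOverloadSketch, the methods detectMimeType and describe, and the extension table are all hypothetical and stand in for the real parser-creation logic.

```java
import java.net.URL;
import java.util.Locale;
import java.util.Map;

final class MimeOverloadSketch {

    // Hypothetical extension table, for illustration only.
    private static final Map<String, String> BY_EXTENSION = Map.of(
            "txt", "text/plain",
            "html", "text/html",
            "pdf", "application/pdf");

    /** Guesses a MIME type from the file extension in the URL's path. */
    static String detectMimeType(URL url) {
        String path = url.getPath();
        int dot = path.lastIndexOf('.');
        String ext = dot < 0 ? "" : path.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, "application/octet-stream");
    }

    /** Convenience form: detects the MIME type, then delegates. */
    static String describe(URL url) {
        return describe(url, detectMimeType(url));
    }

    /** Explicit form: the caller supplies the MIME type, so no detection runs. */
    static String describe(URL url, String mimeType) {
        return url + " -> " + mimeType;
    }
}
```

The convenience form only delegates, so a caller with its own detection scheme can bypass detection entirely by calling the two-argument form.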
Methods that parse a document in a single call open and close the input stream
themselves, whereas methods that return a parser open the stream but leave the
caller responsible for closing it.
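The difference in stream ownership can be sketched as follows. This is only an illustration under assumptions, not the patch itself: StreamOwnershipSketch, readAll, and open are hypothetical names, and plain text reading stands in for actual parsing.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;

final class StreamOwnershipSketch {

    /** Single-call style: opens the stream, consumes it, and closes it. */
    static String readAll(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Returning style: opens the stream and hands it back still open. */
    static InputStream open(URL url) throws IOException {
        return url.openStream();
    }
}
```

With the second style the caller owns the stream, so wrapping the call in try-with-resources is the simplest way to make sure it gets closed:

```java
try (InputStream in = StreamOwnershipSketch.open(url)) {
    // read from the stream here
}
```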
I have not provided methods that take a String identifying a file, since a
String can be used to express both a URL and a File, and creating either from a
String is trivial for the caller to do.
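For reference, the conversions a caller would do look roughly like this. The class and method names are hypothetical; only the JDK calls (File, File.toURI, URI.create, URI.toURL) are real.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

final class StringToResourceSketch {

    /** Treats the string as a filesystem path. */
    static File asFile(String path) {
        return new File(path);
    }

    /** Treats the string as an absolute URL such as "http://host/doc.txt". */
    static URL asUrl(String spec) throws MalformedURLException {
        return URI.create(spec).toURL();
    }

    /** Converts a File to a file: URL, so one URL-based method can serve both. */
    static URL fileAsUrl(File file) throws MalformedURLException {
        return file.toURI().toURL();
    }
}
```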
For the methods that take InputStreams as parameters, I required that the MIME
type be specified, because (at this time) we cannot determine the URL from a
Stream.
I've used the name getStringContent rather than getStrContent because I think
for a public API the greater clarity is worth the extra characters; change it
back if you like. (I did not change the name in the Parser class.)
> Need to support URL's for input resources.
> ------------------------------------------
>
> Key: TIKA-17
> URL: https://issues.apache.org/jira/browse/TIKA-17
> Project: Tika
> Issue Type: Improvement
> Components: general
> Affects Versions: 0.1-incubator
> Reporter: Keith R. Bennett
> Assignee: Chris A. Mattmann
> Fix For: 0.1-incubator
>
> Attachments: tika-17-2.patch, tika-17.patch, tika-17.patch
>
>
> It would be extremely helpful to support URL's instead of just File's for
> input resources. This would enable us to use class loaders to find
> resources, and in general support resources that are not available via the
> filesystem.
> Patch coming...