[ https://issues.apache.org/jira/browse/LABS-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631309#action_12631309 ]
Thorsten Scherler commented on LABS-118:
----------------------------------------
I do not want to create a dependency on Tika in the API. If your solution does
not introduce this dependency, it is good as gold.
As I see it, the integration should be:
-> Parser produces SAX events and metadata
-> LinkExtractor [new component -> see LABS-149] uses the SAX events to extract
   links/tasks
-> Handler (can use SAX events, metadata, and the original stream)
Does that make sense?
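
To make the proposed flow concrete, here is a rough sketch that uses only JDK
types, so nothing in it depends on Tika. All names (PipelineSketch, Parser,
XmlParser, LinkExtractor, Handler, run) are hypothetical and are not taken
from the Droids or Tika code; the JDK SAX parser stands in for whatever parser
implementation would be plugged in behind the interface.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class PipelineSketch {

    /** Hypothetical API-level contract: JDK types only, no Tika types. */
    public interface Parser {
        void parse(InputStream in, ContentHandler sax, Map<String, String> metadata)
                throws Exception;
    }

    /** Stand-in parser backed by the JDK SAX parser (assumption: well-formed XHTML input). */
    public static class XmlParser implements Parser {
        @Override
        public void parse(InputStream in, ContentHandler sax, Map<String, String> metadata)
                throws Exception {
            metadata.put("Content-Type", "application/xhtml+xml");
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(sax);
            reader.parse(new InputSource(in));
        }
    }

    /** Collects href values of <a> elements from the SAX event stream. */
    public static class LinkExtractor extends DefaultHandler {
        final List<String> links = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            String href = atts.getValue("href");
            if ("a".equals(localName) && href != null) {
                links.add(href);
            }
        }
    }

    /** Hypothetical handler: here it only sees the original stream and the metadata. */
    public interface Handler {
        void handle(InputStream original, Map<String, String> metadata) throws Exception;
    }

    /** Wires the three steps together: parse, extract links, then handle. */
    public static List<String> run(byte[] document, Parser parser, Handler handler)
            throws Exception {
        Map<String, String> metadata = new HashMap<>();
        LinkExtractor extractor = new LinkExtractor();
        parser.parse(new ByteArrayInputStream(document), extractor, metadata);
        handler.handle(new ByteArrayInputStream(document), metadata);
        return extractor.links;
    }

    public static void main(String[] args) throws Exception {
        String doc = "<html><body><a href=\"http://example.org/a\">A</a>"
                + "<a href=\"/b\">B</a></body></html>";
        List<String> links = run(doc.getBytes(StandardCharsets.UTF_8), new XmlParser(),
                (in, md) -> System.out.println("metadata: " + md));
        System.out.println("links: " + links);
    }
}
```

In this sketch a Tika-backed parser would simply be another implementation of
the Parser interface living in its own module, which keeps the API itself free
of the dependency.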
> Create tied integration with Apache Tika (for parser and handler)
> -----------------------------------------------------------------
>
> Key: LABS-118
> URL: https://issues.apache.org/jira/browse/LABS-118
> Project: Labs
> Issue Type: New Feature
> Components: Droids
> Reporter: Thorsten Scherler
>
> http://incubator.apache.org/tika/
> Apache Tika is a toolkit for detecting and extracting metadata and structured
> text content from various documents using existing parser libraries.