[ 
https://issues.apache.org/jira/browse/LABS-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Javier Puerto updated LABS-118:
-------------------------------

    Attachment: tikaparser.diff

A first Tika parser implementation. It works with the default crawler and worker.

The parser simply wraps the Tika parser, using the LinkExtractor to get the
OutLinks and the EchoHandler to save the parsed data, and then returns a ParseImpl.
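
To illustrate the shape of this wrapping, here is a rough sketch that uses only
the JDK's SAX API as a stand-in for the Tika parser. It is not the code from
tikaparser.diff: the names ParseSketch, Result and LinkAndTextHandler are
hypothetical, Result only approximates what a ParseImpl would carry, and the
Droids LinkExtractor and EchoHandler are folded into a single handler for
brevity.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ParseSketch {

    /** Hypothetical stand-in for a ParseImpl: extracted text plus outlinks. */
    public static final class Result {
        public final String text;
        public final List<String> outlinks;

        Result(String text, List<String> outlinks) {
            this.text = text;
            this.outlinks = outlinks;
        }
    }

    /** One SAX handler doing both jobs: collect hrefs and echo character data. */
    private static final class LinkAndTextHandler extends DefaultHandler {
        final List<String> links = new ArrayList<>();
        final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) {
            // The default factory is not namespace-aware, so qName holds the tag name.
            if ("a".equalsIgnoreCase(qName)) {
                String href = atts.getValue("href");
                if (href != null) {
                    links.add(href);
                }
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Parses well-formed XHTML and returns its text and outlinks. */
    public static Result parse(String xhtml) throws Exception {
        LinkAndTextHandler handler = new LinkAndTextHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xhtml)), handler);
        return new Result(handler.text.toString().trim(), handler.links);
    }

    public static void main(String[] args) throws Exception {
        Result r = parse("<html><body><p>Hello "
                + "<a href=\"http://example.org/\">world</a></p></body></html>");
        System.out.println(r.text);
        System.out.println(r.outlinks);
    }
}
```

In the real integration the SAX events would come from Tika rather than from
the JDK XML parser, so input formats other than well-formed XHTML could be
handled the same way.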

> Create tied integration with Apache Tika (for parser and handler)
> -----------------------------------------------------------------
>
>                 Key: LABS-118
>                 URL: https://issues.apache.org/jira/browse/LABS-118
>             Project: Labs
>          Issue Type: New Feature
>          Components: Droids
>            Reporter: Thorsten Scherler
>         Attachments: tikaparser.diff
>
>
> http://incubator.apache.org/tika/
> Apache Tika is a toolkit for detecting and extracting metadata and structured 
> text content from various documents using existing parser libraries.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


