[ 
https://issues.apache.org/jira/browse/CONNECTORS-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17551454#comment-17551454
 ] 

Julien Massiera commented on CONNECTORS-1667:
---------------------------------------------

Hi [~cguzel], no, the Tika service connector does not correctly handle Tika 
Server 2.x, and the metadata keys are indeed the reason. You should consider 
using the tika-service-rmeta-connector instead, which performs better, is more 
stable, and has been updated to be compatible with the latest version of 
Tika Server (see CONNECTORS-1703).

By the way, that is the only version of the Tika service connector I am 
currently maintaining because, as you said, the maintenance cost is very 
limited, and an external Tika is more reliable than an embedded one.
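
For anyone who wants to check what the 'rmeta' endpoint returns for a given 
document, here is a minimal sketch in Python. It is not the connector's code. 
It assumes a Tika Server reachable on http://localhost:9998, uses the 
'/rmeta/text' variant of the endpoint, and assumes the extracted text is 
returned under the "X-TIKA:content" key of each record. The function names 
are hypothetical and only serve the illustration.

```python
import json
import urllib.request

# Assumption: a Tika Server is running locally on its usual port.
TIKA_URL = "http://localhost:9998"


def build_rmeta_request(data: bytes, base_url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT request that sends raw document bytes to /rmeta/text."""
    return urllib.request.Request(
        f"{base_url}/rmeta/text",
        data=data,
        method="PUT",
        headers={"Accept": "application/json"},
    )


def split_rmeta(body: str):
    """Split an rmeta JSON reply into (metadata, content) for the first record.

    The reply is a JSON array with one object per document; the first object
    describes the container document, later ones describe embedded documents.
    """
    records = json.loads(body)
    if not records:
        return {}, ""
    first = dict(records[0])
    # Assumption: the extracted text sits next to the metadata under this key.
    content = first.pop("X-TIKA:content", "")
    return first, content


def extract(data: bytes, base_url: str = TIKA_URL):
    """Send a document to Tika Server and return (metadata, content)."""
    with urllib.request.urlopen(build_rmeta_request(data, base_url)) as resp:
        return split_rmeta(resp.read().decode("utf-8"))
```

Calling extract() on the bytes of a file and printing the keys of the returned 
metadata is a quick way to see which key names a given Tika Server version 
emits.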

 

> New Tika Service Connector
> --------------------------
>
>                 Key: CONNECTORS-1667
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1667
>             Project: ManifoldCF
>          Issue Type: New Feature
>          Components: Tika service connector
>            Reporter: Julien Massiera
>            Assignee: Julien Massiera
>            Priority: Major
>             Fix For: ManifoldCF 2.20
>
>
> The current Tika Service Connector uses the '/unpack/all' endpoint of a 
> Tika Server. This endpoint is not optimal for extracting only a document's 
> metadata and content. We should develop a new connector based on the 'rmeta' 
> endpoint, which is better suited to our needs.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
