[ https://issues.apache.org/jira/browse/CONNECTORS-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17551454#comment-17551454 ]

Julien Massiera commented on CONNECTORS-1667:
---------------------------------------------

Hi [~cguzel],

No, the Tika service connector does not correctly handle Tika Server 2.x, and this is indeed because of the metadata keys.

You should consider using the tika-service-rmeta-connector instead. It performs better, is more stable, and has been updated to be compatible with the latest version of Tika Server (see CONNECTORS-1703).

By the way, that is currently the only version of the Tika service connector I maintain because, as you said, the maintenance cost is very limited, and an external Tika is more reliable than an embedded one.

> New Tika Service Connector
> --------------------------
>
>                 Key: CONNECTORS-1667
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1667
>             Project: ManifoldCF
>          Issue Type: New Feature
>          Components: Tika service connector
>            Reporter: Julien Massiera
>            Assignee: Julien Massiera
>            Priority: Major
>             Fix For: ManifoldCF 2.20
>
>
> The current Tika Service Connector uses the '/unpack/all' endpoint of a
> Tika Server. This endpoint is not optimal for extracting only a document's
> metadata and content. We should develop a new connector based on the 'rmeta'
> endpoint, which is better suited to our needs.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)