[ https://issues.apache.org/jira/browse/CONNECTORS-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17547939#comment-17547939 ]
Karl Wright commented on CONNECTORS-1667:
-----------------------------------------

[~cguzel], this ticket is about an EXTERNAL service where Tika runs as a separate stand-alone process, and the connector communicates with it. I don't think there is any difference from a service standpoint whether you run Tika 1.x or 2.x as that service - the protocol is likely the same, although I haven't researched it.

What you seem to be thinking is that the internal Tika connector should go to Tika 2.0. This is a major, major deal because most of the connector dependencies we have to update are due to Tika. I looked at it and found we'd need 4-5 weeks of a dedicated individual to do the port.

Are you volunteering? If so I can advise you. Otherwise we will be staying current with Tika 1.x releases for now, and that is all.

> New Tika Service Connector
> --------------------------
>
>                 Key: CONNECTORS-1667
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1667
>             Project: ManifoldCF
>          Issue Type: New Feature
>          Components: Tika service connector
>            Reporter: Julien Massiera
>            Assignee: Julien Massiera
>            Priority: Major
>             Fix For: ManifoldCF 2.20
>
>
> The current Tika Service Connector exploits the '/unpack/all' endpoint of a
> Tika Server. This endpoint is not optimal for extracting only a document's
> metadata and content. We should develop a new connector based on the 'rmeta'
> endpoint, which is better suited to our needs.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
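
For readers unfamiliar with the endpoint named in the issue description, here is a minimal sketch of what a call to a stand-alone Tika Server's 'rmeta' endpoint might look like from Java 11+. It is not the connector's actual code: the class name `TikaRmetaSketch`, the helper `buildRmetaRequest`, the 60-second timeout, and the base URL (Tika Server's default port 9998 on localhost) are all assumptions made for illustration. The `/rmeta/text` variant asks the server to return a JSON array of metadata objects, with the extracted plain text of each document under the `X-TIKA:content` key.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;
import java.time.Duration;

// Hypothetical illustration only; not taken from the ManifoldCF code base.
public class TikaRmetaSketch {

    // Assumption: a Tika Server running locally on its default port.
    static final String DEFAULT_BASE_URL = "http://localhost:9998";

    // Builds (but does not send) a PUT request to the /rmeta/text endpoint.
    static HttpRequest buildRmetaRequest(String baseUrl, HttpRequest.BodyPublisher body) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/rmeta/text"))
                .timeout(Duration.ofSeconds(60)) // arbitrary value chosen for the sketch
                .header("Accept", "application/json")
                .PUT(body)
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // Without a file argument, only show the request that would be sent.
            HttpRequest preview =
                    buildRmetaRequest(DEFAULT_BASE_URL, HttpRequest.BodyPublishers.noBody());
            System.out.println(preview.method() + " " + preview.uri());
            return;
        }
        // With a file argument, stream the file to the server and print the raw JSON reply.
        HttpRequest request = buildRmetaRequest(
                DEFAULT_BASE_URL, HttpRequest.BodyPublishers.ofFile(Path.of(args[0])));
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Because the connector only speaks HTTP to the external process, a request like this looks the same to the client whichever Tika version sits behind the server, provided the endpoint behaves the same way, which the comment above notes has not been verified.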