[ 
https://issues.apache.org/jira/browse/TIKA-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071637#comment-14071637
 ] 

Rob Tulloh commented on TIKA-1371:
----------------------------------

Also, we have been using Tika to pre-process documents externally before we 
index them into Solr. We are using Solr 4.7, and it seems that the additional 
metadata that Tika 1.5 now produces is not being recognized by Solr. We 
purposely decouple Tika from Solr because we don't want document processing to 
destabilize indexing or search (separation of concerns).

When we use the supported URL format, we get back both metadata and content, 
and this confuses Solr's language detection. This is a separate question, but I 
mention it because it may explain more about how we have been using Tika and 
the problems we have encountered in upgrading to Tika 1.5.

> passing parameters via URL no longer works (regression)
> -------------------------------------------------------
>
>                 Key: TIKA-1371
>                 URL: https://issues.apache.org/jira/browse/TIKA-1371
>             Project: Tika
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 1.5
>            Reporter: Rob Tulloh
>
> In Tika 1.1 and 1.2, it was possible to append values to the URL that get 
> logged, like this:
> http://localhost:9998/tika/GUID/FILENAME
> This was very useful for correlating client and server activity in a 
> distributed compute environment. In 1.5 and in the nightly builds (for 1.6), 
> this feature no longer works. Not having it makes it very difficult to 
> troubleshoot problems with document processing in a distributed environment. 
> Please add this feature back so that operations and development teams can 
> more easily figure out which Tika instance is processing which document and 
> what the result of that processing was.
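
For illustration only, here is a minimal sketch of how a client might build a 
URL of the form quoted above (http://localhost:9998/tika/GUID/FILENAME). The 
helper name correlation_url and the sample file name are hypothetical, and 
percent-encoding the file name is an assumption, not something the report 
specifies. Whether a given Tika server version accepts the extra path segments 
is exactly what this issue is about, so the sketch only constructs the string 
and sends nothing.

```python
import uuid
from urllib.parse import quote


def correlation_url(base, guid, filename):
    """Return BASE/tika/GUID/FILENAME (hypothetical helper, not a Tika API)."""
    # Assumption: percent-encode the file name so that spaces or slashes in it
    # cannot change the path structure of the URL.
    return "{}/tika/{}/{}".format(base.rstrip("/"), guid, quote(filename, safe=""))


if __name__ == "__main__":
    # Assumption: the server listens on localhost:9998, as in the report.
    guid = uuid.uuid4()  # a fresh random identifier for each document
    print(correlation_url("http://localhost:9998", guid, "annual report.pdf"))
```

Logging the same GUID on the client side would then allow a client log line to 
be matched against the server log entry for the same document.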



--
This message was sent by Atlassian JIRA
(v6.2#6252)
