[ 
https://issues.apache.org/jira/browse/SOLR-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13040969#comment-13040969
 ] 

Liam O'Boyle commented on SOLR-2424:
------------------------------------

Hi, sorry for the slow response; I don't seem to be receiving notifications of 
updates.

You are correct; I used the Tika 0.9 command line tool, which worked correctly. 
When I tried the 0.8 version, the same problem occurred as described in this 
ticket, so it appears that the bug is in Tika and that it is already resolved 
in the 0.9 release.
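
For anyone wanting to repeat the comparison, here is a rough sketch of how the 
output of two Tika versions could be checked. It is not what I actually ran: 
the jar path, the helper names, and the 5% threshold are all assumptions, and 
it presumes the tika-app jar accepts the --text option.

```python
import subprocess


def whitespace_ratio(text: str) -> float:
    """Return the fraction of characters in text that are whitespace."""
    if not text:
        return 0.0
    return sum(ch.isspace() for ch in text) / len(text)


def looks_space_stripped(text: str, threshold: float = 0.05) -> bool:
    """Heuristic: flag non-empty text that contains almost no whitespace.

    The 5% threshold is an arbitrary assumption; ordinary English prose
    usually sits well above it.
    """
    if not text.strip():
        return False  # nothing extracted, so nothing to judge
    return whitespace_ratio(text) < threshold


def extract_with_tika(jar_path: str, document: str) -> str:
    """Run a tika-app jar on a document and return the plain text.

    jar_path is hypothetical (for example a local tika-app-0.9.jar);
    this assumes the jar's --text option writes the text to stdout.
    """
    result = subprocess.run(
        ["java", "-jar", jar_path, "--text", document],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling looks_space_stripped(extract_with_tika(jar, pdf)) once per Tika jar 
should show which version drops the spaces.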

I'll try to update the version of Tika in use in my installation, although it's 
something that has caused more problems than it has solved when I've tried it 
in the past.

> extracted text from tika has no spaces
> --------------------------------------
>
>                 Key: SOLR-2424
>                 URL: https://issues.apache.org/jira/browse/SOLR-2424
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - Solr Cell (Tika extraction)
>    Affects Versions: 3.1
>            Reporter: Yonik Seeley
>         Attachments: ET2000 Service Manual.pdf
>
>
> Try this:
> curl "http://localhost:8983/solr/update/extract?extractOnly=true&wt=json&indent=true" -F "tutorial=@tutorial.pdf"
> And you get text output w/o spaces: 
> "ThisdocumentcoversthebasicsofrunningSolru"...

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
