You need to overwrite ALL the relevant jars of the updated components.
These are supposed to be:
tika 0.9:
commons-codec-1.2.jar
geronimo-stax-api_1.0_spec-1.0.1.jar
jdom-1.0.jar
slf4j-api-1.5.6.jar
tika-core-0.9.jar
tika-parsers-0.9.jar
pdfbox 1.6.0:
fontbox-1.6.0.jar
jempbox-1.6.0.jar
pdfbox-1
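
Overwriting only some of the jars can leave two versions of the same library side by side. A minimal sketch of a check for that, assuming jar files are named `<artifact>-<version>.jar` as in the list above; the helper names `split_jar_name` and `find_version_conflicts` are made up for illustration and are not part of Tika or Solr:

```python
import re
from collections import defaultdict

# Assumption: the version starts at the first "-<digit>" in the file name,
# e.g. "tika-core-0.9.jar" -> ("tika-core", "0.9").
JAR_NAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d.*)\.jar$")


def split_jar_name(filename):
    """Return (artifact, version) for a jar file name, or None if it does not fit."""
    match = JAR_NAME.match(filename)
    if match is None:
        return None
    return match.group("artifact"), match.group("version")


def find_version_conflicts(filenames):
    """Map each artifact that appears in more than one version to its sorted versions."""
    versions = defaultdict(set)
    for name in filenames:
        parsed = split_jar_name(name)
        if parsed is not None:
            artifact, version = parsed
            versions[artifact].add(version)
    return {a: sorted(v) for a, v in versions.items() if len(v) > 1}
```

To run it against a real directory, pass `os.listdir(...)` of the lib folder that holds the extraction jars. An empty result only means that no artifact is present twice; it does not prove that the versions are compatible with each other.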
Hi,
On Mon, Aug 22, 2011 at 11:08 AM, nirnaydewan wrote:
> Please let me know how I can get rid of this exception.
It looks like you have a dependency version mismatch. Instead of POI
version 3.8-beta3-20110606, use the earlier 3.6 version, as listed on
the Tika 0.9 getting started page [1].
If
Please let me know how I can get rid of this exception.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Tika-0-9-integration-in-Solr-3-3-0-tp3267799p3274463.html
Sent from the Apache Tika - Development mailing list archive at Nabble.com.
Thanks again, Tom, for trying to help me out, but it didn't work on my side.
What I did:
poi-3.8-beta3-20110606
pdfbox-app-1.6.0
tika-core-0.9
Replaced all the jars above as you said.
The following jars were also necessary, because it was giving many errors:
poi-scratchpad-3.8-beta3-20110606
poi-oo
On 08/19/2011 03:20 PM, nirnaydewan wrote:
> Thanks for your suggestion. Do I just need to replace these jars?
Yes, that is all you need to do.
> How do I build again? I am just using start.jar as of now.
Look at your solrconfig.xml. There should be a line saying:
or something similar. Al
Thanks for your suggestion. Do I just need to replace these jars?
How do I build again? I am just using start.jar as of now.
Hi,
The steps shown at
http://wiki.apache.org/solr/ExtractingRequestHandler#Upgrading_Tika
worked for me. I use the following versions for extracting MS-Office and
PDF files:
tika 0.9
pdfbox 1.6.0
poi 3.8-beta3
This combination turned out to be the most failsafe at the moment.
Cheers
-Tom
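
A rough sketch of how the combination above could be checked against the jars actually present. The jar names in `EXPECTED` are taken from elsewhere in this thread; the function name `check_versions` and the assumption that files are named `<artifact>-<version>.jar` are illustrative only:

```python
import re

# Versions reported as working together in this thread (an assumption for
# any other setup; adjust to the jars you actually deploy).
EXPECTED = {
    "tika-core": "0.9",
    "tika-parsers": "0.9",
    "pdfbox": "1.6.0",
    "fontbox": "1.6.0",
    "jempbox": "1.6.0",
    "poi": "3.8-beta3-20110606",
}


def check_versions(filenames, expected=EXPECTED):
    """Return {artifact: versions_found} for every expected artifact that is
    missing, duplicated, or present in a different version."""
    found = {}
    for name in filenames:
        # Assumption: the version starts at the first "-<digit>" in the name.
        match = re.match(r"^(.+?)-(\d.*)\.jar$", name)
        if match:
            found.setdefault(match.group(1), []).append(match.group(2))
    problems = {}
    for artifact, wanted in expected.items():
        present = found.get(artifact, [])
        if present != [wanted]:
            problems[artifact] = present
    return problems
```

An empty list in the result means the jar is missing altogether; a list with another version, or with several versions, points to a leftover from the previous installation.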
On
Since the formatting issues for extracting content from PDF & DOC files
have been fixed in Tika 0.9, I want to integrate it into my existing Solr project.
Please let me know the steps.
All I have is the downloaded folder of Solr 3.3.0, and I am currently using only the
bundled Jetty server. This version