Issue using the text extraction with lucene

Stephan Becker Sat, 23 Jan 2016 08:05:33 -0800

Hi Oak dev team,

I was trying to use the method described at
http://jackrabbit.apache.org/oak/docs/query/lucene.html#text-extraction to
reduce the time needed to index PDFs.


When I run the following command:

 java -cp tika-app-1.11.jar:oak-run-1.2.4.jar \
   org.apache.jackrabbit.oak.run.Main tika \
   --fds-path /opt/cq/crx-quickstart/repository/repository/datastore \
   --nodestore /opt/cq/crx-quickstart/repository/segmentstore \
   --data-file dump.csv generate

I get this error message:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.commons.csv.CSVFormat.withIgnoreSurroundingSpaces()Lorg/apache/commons/csv/CSVFormat;
        at org.apache.jackrabbit.oak.plugins.tika.CSVFileBinaryResourceProvider.<clinit>(CSVFileBinaryResourceProvider.java:51)
        at org.apache.jackrabbit.oak.plugins.tika.CSVFileGenerator.generate(CSVFileGenerator.java:46)
        at org.apache.jackrabbit.oak.plugins.tika.TextExtractorMain.main(TextExtractorMain.java:180)
        at org.apache.jackrabbit.oak.run.Main.main(Main.java:192)
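
A NoSuchMethodError generally means that the class loaded at run time is a
different version from the one the calling code was compiled against. One
possible cause here, stated as an assumption and not verified, is that both
jars on the classpath bundle their own copy of commons-csv and the first one
wins. The following is a minimal diagnostic sketch, not part of Oak or Tika:
the class name ClasspathCheck and its helper methods are hypothetical, and it
only reports where a class was loaded from and whether it exposes a given
no-argument method.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

// Hypothetical helper for diagnosing classpath conflicts; not part of Oak or Tika.
public class ClasspathCheck {

    /** Returns the jar or directory a class was loaded from, if the JVM exposes it. */
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            // Typical for classes loaded by the bootstrap class loader.
            return "(bootstrap or unknown)";
        }
        return src.getLocation().toString();
    }

    /** Returns true if the class has a public method with this name and no parameters. */
    public static boolean hasNoArgMethod(Class<?> cls, String name) {
        for (Method m : cls.getMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults are taken from the stack trace above; override them via arguments.
        String className = args.length > 0 ? args[0] : "org.apache.commons.csv.CSVFormat";
        String methodName = args.length > 1 ? args[1] : "withIgnoreSurroundingSpaces";

        Class<?> cls = Class.forName(className);
        System.out.println("Class:       " + className);
        System.out.println("Loaded from: " + locationOf(cls));
        System.out.println("Has " + methodName + "(): " + hasNoArgMethod(cls, methodName));
    }
}
```

Compiled into the current directory, it could be run with the same classpath
order as the failing command, for example
java -cp tika-app-1.11.jar:oak-run-1.2.4.jar:. ClasspathCheck
and then again with the two jars swapped, to see whether the reported location
and the method check change.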

If anyone could provide a solution on how to make this work that would be
amazing!

Thanks
-- 
With kind regards

Mit freundlichen Grüßen

*Stephan Becker* | Senior System Engineer
Netcentric Deutschland GmbH
M D: +49 (0) 175 2238120
Skype: stephanhs.b

stephan.bec...@netcentric.biz | www.netcentric.biz
Other disclosures according to §35a GmbhG, §161, 125a HGB:
www.netcentric.biz/imprint.html
