Plan C: if you’re willing to store a mirror set of directories with the text 
versions of the files, just run tika-app.jar on your “input” directory and run 
your SolrJ loader on the “text/export” directory:

java -jar tika-app.jar <input> <output>

And, if you’re feeling jsonic:

java -jar tika-app.jar -J -t -i <input> -o <output>


This method of running Tika will be robust to OOM, permanent hangs and 
OS-destroying-your-process-out-of-self-preservation incidents.
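
For the second half of Plan C, here is a minimal sketch of walking the output directory and pairing each extracted file with the document it came from, using only the JDK. It assumes the text output is written under the same relative path as the input with ".txt" appended (e.g. sub/report.pdf.txt); check what your Tika version actually produces. ExportWalker and readExtracted are hypothetical names, and the SolrJ call is left as a placeholder rather than guessed at.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ExportWalker {

    // Assumption: each extracted file is named <original name> + ".txt".
    static final String SUFFIX = ".txt";

    /**
     * Walks outputDir recursively and returns a map from the original
     * relative path (suffix stripped, '/' separators) to the extracted text.
     */
    static Map<String, String> readExtracted(Path outputDir) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(outputDir)) {
            files = walk.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().endsWith(SUFFIX))
                        .sorted()
                        .collect(Collectors.toList());
        }
        Map<String, String> docs = new LinkedHashMap<>();
        for (Path p : files) {
            String rel = outputDir.relativize(p).toString().replace('\\', '/');
            String id = rel.substring(0, rel.length() - SUFFIX.length());
            // Assumption: the extracted text is UTF-8 encoded.
            String text = new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
            docs.put(id, text);
        }
        return docs;
    }

    public static void main(String[] args) throws IOException {
        Path outputDir = Paths.get(args.length > 0 ? args[0] : "output");
        if (!Files.isDirectory(outputDir)) {
            System.err.println("Not a directory: " + outputDir);
            return;
        }
        for (Map.Entry<String, String> e : readExtracted(outputDir).entrySet()) {
            // Placeholder: hand e.getKey() and e.getValue() to your SolrJ loader here.
            System.out.println(e.getKey() + " (" + e.getValue().length() + " chars)");
        }
    }
}
```

Reading everything into a map keeps the sketch short; for a large corpus you would more likely index each file as you visit it instead of holding all the text in memory.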


From: Steven White [mailto:swhite4...@gmail.com]
Sent: Thursday, February 11, 2016 10:18 AM
To: user@tika.apache.org
Subject: Re: Using tika-app-1.11.jar

Thank you Nick and everyone who has helped me with my questions.

I now understand Tika much better than I did last week when I first 
looked at it.

Steve

On Thu, Feb 11, 2016 at 8:18 AM, Nick Burch 
<apa...@gagravarr.org> wrote:
On Wed, 10 Feb 2016, Steven White wrote:
I'm including tika-app-1.11.jar with my application and see that Tika
includes "slf4j".

The Tika App single jar is intended for standalone use. It's not generally 
recommended to be included as part of a wider application, as it tends to 
include everything and the kitchen sink, to allow for easy standalone use.

Generally, you should just tell Maven / Gradle / Ivy that you want to depend on 
Tika Core + Tika Parsers, then your build tool will fetch + bundle all the 
dependencies for you. That lets you have proper control over conflicting 
versions of jars etc.

Nick
