I have one answer here [0], but I'd be interested to hear what Solr users/devs/integrators have experienced on this topic.

[0] http://mail-archives.apache.org/mod_mbox/tika-user/201602.mbox/%3CCY1PR09MB0795EAED947B53965BC86874C7D70%40CY1PR09MB0795.namprd09.prod.outlook.com%3E

-----Original Message-----
From: Steven White [mailto:swhite4...@gmail.com]
Sent: Tuesday, February 09, 2016 6:33 PM
To: solr-user@lucene.apache.org
Subject: Re: How is Tika used with Solr

Thank you Erick and Alex.

My main question is about a long-running process that uses Tika in the same JVM as my application. I'm running my file-system-crawler in its own JVM (not Solr's). On the Tika mailing list, it was suggested that I run Tika's code in its own JVM and invoke it from my file-system-crawler using Runtime.getRuntime().exec().

I fully understand, from Alex's suggestion and the link Erick provided, that I should use Tika outside Solr. But what about using Tika within the same JVM as my file-system-crawler application? Or should I be making a system call to invoke another JAR, one that runs in its own JVM, to extract the raw text? Are there known issues with Tika when it is used in a long-running process?

Steve
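
For anyone weighing the separate-JVM option described above, here is a minimal sketch of what the call from the crawler might look like. It uses ProcessBuilder rather than Runtime.getRuntime().exec() so the child's output can be redirected to a file and a timeout applied. The class name, the heap size, the timeout handling, and the path to the Tika app jar are all illustrative assumptions, not anything taken from the Tika or Solr code; the only Tika-specific part is the app's --text option.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs text extraction in a child JVM so that a hang or
// OutOfMemoryError in the parser cannot take down the crawler's own JVM.
final class ChildJvmExtractor {

    // Builds the command line for the child JVM. The jar location is an
    // assumption supplied by the caller; "--text" asks the Tika app for plain text.
    public static List<String> buildCommand(String tikaAppJar, Path input) {
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        return Arrays.asList(javaBin, "-Xmx512m", "-jar", tikaAppJar, "--text", input.toString());
    }

    // Runs the command, kills it if it exceeds the timeout, and returns its stdout.
    public static String run(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        // Writing stdout to a temp file avoids blocking on a full pipe buffer
        // while we wait for the child to exit.
        Path out = Files.createTempFile("extract-", ".txt");
        try {
            Process child = new ProcessBuilder(command)
                    .redirectOutput(out.toFile())
                    .redirectError(ProcessBuilder.Redirect.INHERIT)
                    .start();
            if (!child.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                child.destroyForcibly();
                child.waitFor(); // let the OS release the output file
                throw new IOException("child JVM timed out after " + timeoutSeconds + "s");
            }
            if (child.exitValue() != 0) {
                throw new IOException("child JVM exited with code " + child.exitValue());
            }
            // Decoding as UTF-8 is an assumption about the child's output encoding.
            return new String(Files.readAllBytes(out), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(out);
        }
    }
}
```

The crawler would then call something like ChildJvmExtractor.run(ChildJvmExtractor.buildCommand("/path/to/tika-app.jar", file), 60) per document and treat an IOException as "skip this file and log it". The trade-off is the JVM start-up cost on every call, which is why a long-lived child process or a server is often preferred when there are many files.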