Re: Upgrading Tika in Solr

2010-02-18 Thread Christian Vogler
Just a word of caution: I've been bitten by this bug, which affects Tika 0.6: https://issues.apache.org/jira/browse/PDFBOX-541 It causes the parser to go into an infinite loop, which isn't exactly great for server stability. Tika 0.4 is not affected in the same way - as far as I remember, the
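
For anyone worried about a runaway parser taking the server down with it, one common defensive pattern is to run the parse on a worker thread and give up after a time limit. The following is only a minimal sketch using the standard java.util.concurrent classes. The class name ParseGuard and the method runWithTimeout are hypothetical, not part of Tika or Solr, and the Callable stands in for whatever parse call you actually make.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not a Tika or Solr API): runs a task with a time limit.
final class ParseGuard {

    // Runs the task on a daemon worker thread and waits at most timeoutMillis.
    // Throws TimeoutException if the task has not finished in time.
    static <T> T runWithTimeout(Callable<T> task, long timeoutMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "parse-guard");
            // Daemon thread: a stuck task will not keep the JVM alive on shutdown.
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Requests interruption. Code that never checks the interrupt flag
            // will keep running in the background, so this only frees the caller.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Note the limitation: a loop that never checks for interruption will keep burning CPU on the worker thread even after the caller has moved on, so this protects the request thread and nothing else. Running the extraction in a separate process that can be killed is the sturdier option if that matters to you.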

Re: Upgrading Tika in Solr

2010-02-17 Thread Liam O'Boyle
I just copied in the newer .jars and got rid of the old ones and everything seemed to work smoothly enough. Liam On Tue, 2010-02-16 at 13:11 -0500, Grant Ingersoll wrote: I've got a task open to upgrade to 0.6. Will try to get to it this week. Upgrading is usually pretty trivial. On

Re: Upgrading Tika in Solr

2010-02-16 Thread Grant Ingersoll
I've got a task open to upgrade to 0.6. Will try to get to it this week. Upgrading is usually pretty trivial. On Feb 14, 2010, at 12:37 AM, Liam O'Boyle wrote: Afternoon, I've got a large collection of documents which I'm attempting to add to a Solr index using Tika via the

Upgrading Tika in Solr

2010-02-13 Thread Liam O'Boyle
Afternoon, I've got a large collection of documents which I'm attempting to add to a Solr index using Tika via the ExtractingRequestHandler, but there are a large number that it has problems with (mainly PDF, PPTX and XLS documents). I've tried them with the most recent standalone version