Hi Luis,

Yes, threads can only mitigate (not solve) some (not all) of the problems.

But they do help with hangs, which is why we use them when parsing docs with 
Tika in Hadoop jobs.
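
A minimal sketch of that thread-with-timeout idea, using only the JDK. The class name TimedParse, the method runWithTimeout and the timeout values are made up for illustration; the Callable stands in for whatever Tika parse call the job makes, and this is not the actual code from our Hadoop jobs. It only bounds how long the caller waits. A parser that ignores interrupts keeps running on its worker thread, so this mitigates hangs rather than solving them.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs one parse task on its own thread with a deadline.
class TimedParse {

    // Returns the task's result, or Optional.empty() if it did not finish
    // within timeoutMillis. A task that returns null is also reported as empty.
    static <T> Optional<T> runWithTimeout(Callable<T> task, long timeoutMillis)
            throws ExecutionException, InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "parse-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // Only requests an interrupt; code that never checks it keeps running.
            future.cancel(true);
            return Optional.empty();
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a real parse call that returns extracted text.
        Optional<String> text = runWithTimeout(() -> "parsed text", 1000);
        System.out.println(text.orElse("timed out"));
    }
}
```

An exception thrown by the task surfaces as an ExecutionException, so a parse failure can be told apart from a timeout. None of this helps with OutOfMemoryErrors or native crashes, which take down the whole JVM.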

That's why I was curious why ERH (Solr's ExtractingRequestHandler) doesn't take
that approach; it's better than not using threads at all.

— Ken

> On May 29, 2018, at 12:18 PM, Luís Filipe Nassif <lfcnas...@gmail.com> wrote:
> 
> Hi Ken,
> 
> Threads will not help with OutOfMemoryErrors or crashes caused by native
> libs. ForkParser can help, after the refactoring started by Tim to handle
> some of its limitations. See TIKA-2653
> 
> 2018-05-29 16:11 GMT-03:00 Ken Krugler <kkrugler_li...@transpac.com>:
> 
>> Thanks for the ref, Tim.
>> 
>> I’m curious why SolrCell doesn’t fire up threads when parsing docs with
>> Tika (or use the fork parser), to mitigate issues with hangs & crashes?
>> 
>> — Ken
>> 
>>> On May 29, 2018, at 11:54 AM, Tim Allison <talli...@apache.org> wrote:
>>> 
>>> All,
>>> 
>>> Over the weekend, Shawn Heisey very kindly drafted a wikipage about the
>>> challenges of using Solr's ExtractingRequestHandler and the guidance to
>>> avoid it in production.
>>> 
>>>  I completely agree with this point, and I think that Shawn did a very
>>> nice job of capturing some of the challenges.  If you have any feedback or
>>> would like to make edits, see:
>>> 
>>> https://wiki.apache.org/solr/RecommendCustomIndexingWithTika
>>> 
>>>  Cheers,
>>> 
>>>                Tim

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
Custom big data solutions & training
Flink, Solr, Hadoop, Cascading & Cassandra
