Thanks Jörg,

I can increase the ping_timeout to 60s for now. However, shouldn't the goal 
be to minimize the time GC runs? Is the node blocked while GC runs, delaying 
any requests to it? If so, allowing long GC runs would be very bad.
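
For reference, here is roughly what I am planning to put in elasticsearch.yml 
for the timeout change (assuming the zen discovery fault-detection setting is 
the one you mean):

```yaml
# elasticsearch.yml — raise the fault-detection ping timeout as a band-aid
# (setting name assumed: discovery.zen.fd.ping_timeout in zen discovery)
discovery.zen.fd.ping_timeout: 60s
# leaving retries at the default of 3 for now
discovery.zen.fd.ping_retries: 3
```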

Regarding the bulk thread pool: I specifically set this to a higher value 
to avoid errors when we perform bulk indexing (we occasionally saw errors 
when the queue was full at a size of 50; I was also going to increase the 
"index" queue for the same reason). I will try keeping the lower limit and 
giving more heap space to indexing instead, as you suggested.
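
Concretely, I would revert the pool size and instead grow the queue and the 
indexing buffer, something like this (setting names assumed for our 1.x 
setup; the exact values are guesses on my part):

```yaml
# elasticsearch.yml — smaller bulk pool, larger queue, more heap for indexing
threadpool.bulk.type: fixed
threadpool.bulk.size: 50          # back down from 100, as suggested
threadpool.bulk.queue_size: 500   # absorb bursts instead of rejecting them
# give indexing 50% of the heap instead of 20%
indices.memory.index_buffer_size: 50%
```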

Regarding Java 8: We're currently running Java 7 and haven't tweaked any 
GC-specific settings. Do you think it makes sense to switch to Java 8 in 
production already and enable the G1 garbage collector?
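
If we do try it, I would expect the change to look something like the 
following (the -XX:+UseG1GC flag is standard HotSpot, but removing the CMS 
defaults from elasticsearch.in.sh is my guess at the mechanics):

```shell
# Sketch: enable G1 on Java 8 instead of the default CMS collector.
# In ES 1.x the GC flags live in bin/elasticsearch.in.sh, so the CMS flags
# (-XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75) would be
# removed there and replaced with:
JAVA_OPTS="$JAVA_OPTS -XX:+UseG1GC -XX:MaxGCPauseMillis=200"
```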

Thanks again,
Thomas

On Thursday, March 27, 2014 9:41:10 PM UTC+1, Jörg Prante wrote:
>
> It seems you run into trouble because you changed some of the default 
> settings, worsening your situation.
>
> Increase ping_timeout from 9s to 60s as a first band-aid - you have GCs 
> running for 35 seconds.
>
> You should reduce the bulk thread pool from 100 to 50; this reduces the 
> high memory pressure on the 20% of memory you allow. Give more heap space 
> to indexing: use 50% instead of 20%.
>
> It would help more to diagnose whether the nodes exceed their capacity for 
> search and index operations. If so, think about adding nodes.
>
> More fine-tuning after adding nodes could include the G1 GC with Java 8, 
> which is targeted at minimizing GC stalls. This would not solve node 
> capacity problems, though.
>
> Jörg
>
>
> On Thu, Mar 27, 2014 at 4:46 PM, Binh Ly <binh...@yahoo.com> wrote:
>
>> I would probably not master-enable any node that can potentially GC for a 
>> couple of seconds. You want your master-eligible nodes to make decisions 
>> as quickly as possible.
>>
>> About your GC situation, I'd find out what the underlying cause is:
>>
>> 1) Do you have bootstrap.mlockall set to true?
>>
>> 2) Is it usually triggered while running queries? Or is there a pattern 
>> to when it triggers?
>>
>> 3) Is there anything else running on these nodes that would overload and 
>> affect normal ES operations?
>>  
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/cd594a91-00c4-43ae-97d8-bbda35618d8e%40googlegroups.com.
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
