Forgot to reply to your questions, Binh:
1) No, I haven't set this. However, I wonder whether it would have any
significant effect, since swap space is barely used.
2) It seems to happen when the cluster is under high load, but I haven't
seen any specific pattern so far.
3) No, there's not. There's a very smal
Thanks Jörg,
I can increase the ping_timeout to 60s for now. However, shouldn't the goal
be to minimize the time GC runs? Is the node blocked while GC runs, so that
any requests to it are delayed? If so, then it would be very bad to allow
long GC runs.
Regarding the bulk thread pool: I specifically s
It seems you run into trouble because you changed some of the default
settings, worsening your situation.
Increase ping_timeout from 9s to 60s as a first band-aid - you have GCs
running for 35 seconds.
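
To illustrate why 9s is too short, here is a rough sketch in Python. It
assumes, as a simplification, that a node counts as unresponsive whenever a
single GC pause outlasts the ping timeout (ping retries are ignored). The
function name and all pause values except the 35 seconds mentioned above are
made up for the example.

```python
def pauses_exceeding_timeout(gc_pauses_s, ping_timeout_s):
    """Return the GC pauses (in seconds) that last longer than the ping timeout."""
    return [pause for pause in gc_pauses_s if pause > ping_timeout_s]


# Hypothetical pause durations; only the 35 s value comes from this thread.
observed_pauses = [2.0, 35.0, 8.5]

# With a 9 s timeout the 35 s pause is long enough to look like a dead node.
print(pauses_exceeding_timeout(observed_pauses, 9))

# With a 60 s timeout none of these pauses outlasts the timeout.
print(pauses_exceeding_timeout(observed_pauses, 60))
```

This only shows that the longer timeout hides the pause from the other
nodes; the pause itself is still there.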
You should reduce the bulk thread pool from 100 to 50; this reduces the high
memory pressure on the 20% mem
I would probably not make any node master-eligible if it can potentially GC
for a couple of seconds. You want your master-eligible nodes to make
decisions as quickly as possible.
About your GC situation, I'd find out what the underlying cause is:
1) Do you have bootstrap.mlockall set to true?
2) Does it us
Hi,
We have repeatedly run into a problem where our search cluster ended up in
an inconsistent state. We have 3 nodes (all running 1.0.1), where nodes 2+3
hold the data (each holds all the shards, i.e. one replica per shard).
Sometimes, a long GC run happens on one of the nodes (here on node 3), causing it