Hi,
To minimize GC pauses, try using G1GC and turning on the
'ParallelRefProcEnabled' JVM flag. G1GC works much better for heaps
larger than 4 GB. Lowering 'InitiatingHeapOccupancyPercent' will also
help avoid long GC pauses, at the cost of more short ones.
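In solr.in.sh that could look something like this (the occupancy value
here is just an illustration, tune it against your own GC logs):

    GC_TUNE="-XX:+UseG1GC \
      -XX:+ParallelRefProcEnabled \
      -XX:InitiatingHeapOccupancyPercent=45"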
On 3 November 2015 at 12:12, Björn Häuser wrote:
One other item to look into is increasing the ZooKeeper client timeout
in Solr's solr.xml. This would help with timeouts caused by long GC
pauses.
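That's the zkClientTimeout setting in the <solrcloud> section, e.g.
(the 30 seconds here is just an example value):

    <solrcloud>
      <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
    </solrcloud>

Keep in mind that ZooKeeper itself caps sessions at its server-side
maxSessionTimeout (20 * tickTime by default), so raising the client
side alone may not be enough.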
On 11/3/15 9:12 AM, Björn Häuser wrote:
Hi,
thank you for your answer.
1> No OOM hit, the log does not contain any hint of that. Also Solr
wasn't restarted automatically. But the GC log has some pauses which
are longer than 15 seconds.
2> So, if we need to recover a system, we need to stop ingesting data into it?
3> The JVMs currently
The GC logs don't really show anything interesting; if GC were the
culprit I'd expect to see 15+ second pauses in there. The Zookeeper
log isn't actually very interesting. As far as OOM errors go, I was
thinking of the _solr_ logs.
As to why the cluster doesn't self-heal, a couple of things:
1> Once you hit an OOM, all bets are off.
Hi!
Thank you for your super fast answer.
I can provide more data, the question is which data :-)
These are the config parameters Solr runs with:
https://gist.github.com/bjoernhaeuser/24e7080b9ff2a8785740 (taken from
the admin UI)
These are the log files:
https://gist.github.com/bjoernhaeuser/
Without more data, I'd guess one of two things:
1> You're seeing stop-the-world GC pauses that cause Zookeeper to
think the node is unresponsive, which puts the node into recovery, and
things go bad from there (the GC-logging flags after this list will
let you confirm that).
2> Somewhere in your Solr logs you'll see OutOfMemory errors, which
can also cascade a bunch of other problems.
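If GC logging isn't already on, these JVM flags (JDK 7/8 style; the
log path is just an example) will show the stop-the-world pauses
directly:

    -verbose:gc -Xloggc:/var/solr/logs/solr_gc.log \
    -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
    -XX:+PrintGCApplicationStoppedTime

Grep the resulting log for "Total time for which application threads
were stopped" and compare the long ones against the timestamps of the
Zookeeper disconnects.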
Hey there,
we are running a SolrCloud cluster with 4 nodes, all with the same
config. Each node has 8 GB of memory, with 6 GB assigned to the JVM.
This is maybe too much, but it worked for a long time.
We currently run with 2 shards, 2 replicas, and 11 collections. The
complete data dir is about 5.3 GB.
I think we should mov