I profiled the memory of each Solr node. GC activity climbs as high as 98%, and there are many instances where it goes above 10%. On one of the Solr nodes I can see it reaching 45%. The heap is fully used and has reached its configured maximum of 24g. During other searches I see the error *org.apache.solr.common.SolrException: no servers hosting shard.* A few nodes are in the "gone" state. There are also many instances of *org.apache.solr.common.SolrException: org.apache.zookeeper.KeeperException$SessionExpiredException.* The GC logs show very busy garbage collection. Please provide your inputs.
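For reference, here is a rough sketch of the memory arithmetic Shawn described in the quoted message: how much RAM is left for the OS disk cache once the heap is taken out, and what fraction of the index that could cover at best. The function names are hypothetical, and it assumes the 200GB index figure applies to a single 32GB server, which the thread does not state explicitly.

```python
def page_cache_headroom_gb(total_ram_gb: float, heap_max_gb: float) -> float:
    """Upper bound on RAM left for the OS disk cache after the JVM heap.

    This ignores memory used by the OS itself and other processes,
    so the real figure will be lower.
    """
    return max(total_ram_gb - heap_max_gb, 0.0)


def cached_fraction(index_size_gb: float, total_ram_gb: float,
                    heap_max_gb: float) -> float:
    """Best-case fraction of the index that fits in the disk cache."""
    if index_size_gb <= 0:
        raise ValueError("index_size_gb must be positive")
    headroom = page_cache_headroom_gb(total_ram_gb, heap_max_gb)
    return min(headroom / index_size_gb, 1.0)


if __name__ == "__main__":
    # Figures from this thread: 32g servers, -Xmx24g, 200GB of index
    # (assumed here to be the index size on one server).
    headroom = page_cache_headroom_gb(32, 24)
    fraction = cached_fraction(200, 32, 24)
    print(f"disk cache headroom: {headroom:.0f} GB")
    print(f"best-case share of index cached: {fraction:.0%}")
```

With the numbers from this thread, at most 8GB is left for caching, which is a small share of a 200GB index. Lowering the heap would raise that figure, but only if jconsole shows the heap is actually larger than needed.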
On Tue, Aug 18, 2015 at 10:38 AM, Modassar Ather <modather1...@gmail.com> wrote:

> Shawn! The container I am using is jetty only and the JVM setting I am
> using is the default one which comes with Solr startup scripts. Yes I have
> changed the JVM memory setting as mentioned.
> Kindly help me understand, even if there is a a GC pause why the solr node
> will go down. At least for other queries is should not throw exception of
> *org.apache.solr.common.SolrException: no servers hosting shard.*
> Why the node will throw above exception even a huge query is time out or
> may have taken lot of resources. Kindly help me understand in what
> conditions such exception can arise as I am not fully aware of it.
>
> Daniel! The error logs do not say if it was JVM crash or just solr. But by
> the exception I understand that it might have gone to a state from where it
> recovered after sometime. I did not restart the Solr.
>
> On Mon, Aug 17, 2015 at 10:12 PM, Daniel Collins <danwcoll...@gmail.com>
> wrote:
>
>> When you say "the solr node goes down", what do you mean by that? From
>> your comment on the logs, you obviously lose the solr core at best (you do
>> realize only having a single replica is inherently susceptible to failure,
>> right?)
>> But do you mean the Solr Core drops out of the collection (ZK timeout),
>> the JVM stops, the whole machine crashes?
>>
>> On 17 August 2015 at 14:17, Shawn Heisey <apa...@elyograg.org> wrote:
>>
>> > On 8/17/2015 5:45 AM, Modassar Ather wrote:
>> > > The servers have 32g memory each. Solr JVM memory is set to -Xms20g
>> > > -Xmx24g. There are no OOM in logs.
>> >
>> > Are you starting Solr 5.2.1 with the included start script, or have you
>> > installed it into another container?
>> >
>> > Assuming you're using the download's "bin/solr" script, that will
>> > normally set Xms and Xmx to the same value, so if you have overridden
>> > the memory settings such that you can have different values in Xms and
>> > Xmx, have you also overridden the garbage collection parameters? If you
>> > have, what are they set to now? You can see all arguments used on
>> > startup in the "JVM" section of the admin UI dashboard.
>> >
>> > If you've installed in an entirely different container, or you have
>> > overridden the garbage collection settings, then a 24GB heap might have
>> > extreme garbage collection pauses, lasting long enough to exceed the
>> > timeout.
>> >
>> > Giving 24 out of 32GB to Solr will mean that there is only (at most) 8GB
>> > left over for caching the index. With 200GB of index, this is nowhere
>> > near enough, and is another likely source of Solr performance problems
>> > that cause timeouts. This is what Upayavira was referring to in his
>> > reply. For good performance with 200GB of index, you may need a lot
>> > more than 32GB of total RAM.
>> >
>> > https://wiki.apache.org/solr/SolrPerformanceProblems
>> >
>> > This wiki page also describes how you can use jconsole to judge how much
>> > heap you actually need. 24GB may be too much.
>> >
>> > Thanks,
>> > Shawn