So, Toke/Daniel, the node shows as *gone* on the Solr Cloud dashboard
because of the GC pause? It is not actually gone, but ZK is unable to get
its correct state?
The issue is caused by a huge query with many wildcards and phrases in it.
As I mentioned earlier, the error was (*The request took too long to
iterate over terms.*). Does that mean the terms being expanded are what
take up the memory? I am just trying to understand what consumes so much
memory.
I am trying to reproduce the OOM by executing multiple queries in parallel
but have not been able to, although I do see the memory usage of the Solr
JVM going above 90%. So what happens to the other queries executed in
parallel? Do they wait for the query that is taking a lot of time and
resources to time out or complete?
We also have the migration to Java 8 on our to-do list and will try
different GC settings.
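
A rough sketch, in Python, of the GC pause explanation quoted below: a
node that is paused cannot send heartbeats, so once the silence outlasts
the session timeout the external observer treats it as gone. This is a
simplified model, not ZooKeeper's actual protocol. The function name
first_expiry, the 1-second heartbeat interval and the 15-second timeout
are illustrative assumptions (in Solr the timeout is governed by
zkClientTimeout).

```python
def first_expiry(pauses, session_timeout_ms, heartbeat_ms, run_ms):
    """Return the time (ms) at which an external observer would declare
    the node gone, or None if every pause fits inside the timeout.

    pauses: list of (start_ms, length_ms) stop-the-world pauses.
    Simplified model: the node heartbeats every heartbeat_ms unless paused.
    """
    last_heartbeat = 0
    t = 0
    while t <= run_ms:
        paused = any(start <= t < start + length for start, length in pauses)
        if not paused:
            # Node is running, so the heartbeat goes out.
            last_heartbeat = t
        elif t - last_heartbeat > session_timeout_ms:
            # Silence has outlasted the timeout: node is treated as dead.
            return t
        t += heartbeat_ms
    return None


# A 20 s pause against a 15 s timeout expires the session.
print(first_expiry([(10_000, 20_000)], 15_000, 1_000, 60_000))  # 25000
# A 5 s pause stays within the timeout.
print(first_expiry([(10_000, 5_000)], 15_000, 1_000, 60_000))   # None
```

In this model the node is perfectly healthy after the pause ends, which
matches the behaviour I am asking about: the node is not really gone, it
was only silent for too long.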



On Tue, Aug 18, 2015 at 2:08 PM, Daniel Collins <danwcoll...@gmail.com>
wrote:

> Ah ok, it's a ZK timeout then
> (org.apache.zookeeper.KeeperException$SessionExpiredException),
> which is because of your GC pause.
>
> The page Shawn mentioned earlier has several links on how to investigate GC
> issues and some common GC settings, sounds like you need to tweak those.
>
> Generally speaking, I believe Java 8 is considered better for GC
> performance than 7, so you probably want to investigate that.  GC tuning is
> very dependent on the load on your system. You may be running close to the
> limit under normal load, and that 1 big query is enough to tip it over the
> edge.  We have seen similar issues from time to time. We are still running
> an older Java 7 build with G1GC which we found worked well for us (though
> CMS seems to be the general consensus on the list here), migrating to Java
> 8 is on our "list of things to do", so our settings are probably not that
> relevant.
>
>
> On 18 August 2015 at 09:04, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
>
> > On Tue, 2015-08-18 at 10:38 +0530, Modassar Ather wrote:
> > > Kindly help me understand, even if there is a GC pause, why the Solr
> > > node will go down.
> >
> > If a stop-the-world GC is in progress, it is not possible for an
> > external service to know if this is because a GC is in progress or the
> > node is dead. If the GC takes longer than the relevant timeouts, the
> > external conclusion is that it is dead.
> >
> > In your next post you state that there is very heavy GC going on, so it
> > would seem that your main problem is that your heap is too small for
> > your setup.
> >
> > Getting OOM for a 200GB index with 24GB heap is not at all impossible,
> > but it is a bit of a red flag. If you have very high values for your
> > caches or perform faceting on a lot of different fields, that might be
> > the cause. If you describe your setup in more detail, we might be able
> > to help find the cause for your relatively high heap requirement.
> >
> > - Toke Eskildsen, State and University Library, Denmark
> >
> >
> >
>
