bq: The issue is caused by a huge query with many wildcards and phrases in it.
Well, the very first thing I'd do is look at whether this is necessary.
For instance: leading and trailing wildcards are an anti-pattern. You
should investigate using ngrams instead. Trailing wildcards usually
resolve to term queries and are usually quite space-efficient; leading
wildcards are usually best handled by ReversedWildcardFilterFactory.

Very often, large, complex wildcarded queries are a holdover from SQL
searching, which is limited to things like %whatever% and doesn't take
into account things like the Solr analysis chain. A classic example is
people searching for run* to find runner, running, runs etc., all of
which can be handled by stemming.

FWIW,
Erick

On Tue, Aug 18, 2015 at 2:06 AM, Modassar Ather <modather1...@gmail.com> wrote:
> So Toke/Daniel, is the node showing *gone* on the Solr cloud dashboard
> because of the GC pause, and it is actually not gone but ZK is not able to
> get the correct state?
> The issue is caused by a huge query with many wildcards and phrases in it.
> If you see, I have mentioned (*The request took too long to iterate
> over terms.*). So does it mean that the terms which are getting expanded
> have taken that amount of memory? Just trying to understand what consumes so
> much memory.
> I am trying to reproduce the OOM by executing multiple queries in parallel
> but am not able to, whereas I am seeing the memory usage going up by more
> than 90% for the Solr JVM. So what happens to the queries which are executed
> in parallel? Do they wait for such a query, which is taking a lot of time
> and resources, to time out/complete?
> We also have migration to Java 8 on our things-to-do list and will try with
> different GC settings.
>
>
>
> On Tue, Aug 18, 2015 at 2:08 PM, Daniel Collins <danwcoll...@gmail.com>
> wrote:
>
>> Ah ok, it's a ZK timeout then
>> (org.apache.zookeeper.KeeperException$SessionExpiredException)
>> which is because of your GC pause.
>>
>> The page Shawn mentioned earlier has several links on how to investigate GC
>> issues and some common GC settings; sounds like you need to tweak those.
>>
>> Generally speaking, I believe Java 8 is considered better for GC
>> performance than 7, so you probably want to investigate that. GC tuning is
>> very dependent on the load on your system. You may be running close to the
>> limit under normal load, and that 1 big query is enough to tip it over the
>> edge. We have seen similar issues from time to time. We are still running
>> an older Java 7 build with G1GC which we found worked well for us (though
>> CMS seems to be the general consensus on the list here). Migrating to Java
>> 8 is on our "list of things to do", so our settings are probably not that
>> relevant.
>>
>>
>> On 18 August 2015 at 09:04, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
>>
>> > On Tue, 2015-08-18 at 10:38 +0530, Modassar Ather wrote:
>> > > Kindly help me understand, even if there is a GC pause why the solr node
>> > > will go down.
>> >
>> > If a stop-the-world GC is in progress, it is not possible for an
>> > external service to know if this is because a GC is in progress or the
>> > node is dead. If the GC takes longer than the relevant timeouts, the
>> > external conclusion is that it is dead.
>> >
>> > In your next post you state that there is very heavy GC going on, so it
>> > would seem that your main problem is that your heap is too small for
>> > your setup.
>> >
>> > Getting OOM for a 200GB index with 24GB heap is not at all impossible,
>> > but it is a bit of a red flag. If you have very high values for your
>> > caches or perform faceting on a lot of different fields, that might be
>> > the cause. If you describe your setup in more detail, we might be able
>> > to help find the cause for your relatively high heap requirement.
>> >
>> > - Toke Eskildsen, State and University Library, Denmark
>> >
>> >
>> >
>>
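
P.S. To make the run* point at the top of this message a bit more concrete, here is a rough sketch in plain Python, not Solr code. It assumes a toy in-memory inverted index and a crude hand-written suffix stripper (toy_stem, build_index and the sample documents are all made up for illustration; a real setup would use a stemming filter in the Solr analysis chain). It only aims to show that a wildcard has to be expanded against every matching term in the dictionary, while a stemmed query is a single term lookup.

```python
# Toy illustration only: a crude suffix stripper stands in for a real
# stemmer. Every name and document here is hypothetical.
from fnmatch import fnmatchcase

SUFFIXES = ("ning", "ner", "ing", "s")  # hand-picked for this example


def toy_stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def build_index(docs, analyze):
    """Map each analyzed term to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index.setdefault(analyze(token), set()).add(doc_id)
    return index


docs = {1: "runner wins", 2: "running late", 3: "she runs", 4: "rundown shack"}

# Wildcard approach: every dictionary term matching run* must be expanded.
raw_index = build_index(docs, lambda t: t)
expanded = sorted(t for t in raw_index if fnmatchcase(t, "run*"))
wildcard_hits = set().union(*(raw_index[t] for t in expanded))

# Stemming approach: the query goes through the same analysis, one lookup.
stemmed_index = build_index(docs, toy_stem)
stem_hits = stemmed_index.get(toy_stem("running"), set())

print(expanded)
print(sorted(wildcard_hits), sorted(stem_hits))
```

Note that in this toy data the wildcard also drags in "rundown", which the stemmed lookup does not, so the two approaches are not strictly equivalent; whether that matters depends on what the users actually mean by run*.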