Re: Solr server becomes non-responsive.

Modassar Ather Thu, 25 Dec 2014 22:05:49 -0800

Thanks for your suggestions Erick.

> This may be one of those situations where you really have to
> push back at the users and understand why they insist on these
> kinds of queries. They must be very patient since it won't be
> very performant. That said, I've seen this pattern; there are
> certainly valid conditions under which response times can be
> many seconds if there are few users and they are doing very
> complex/expert-level things.


We have tried educating the users, but it did not work because they are used
to the old way. They feel that wildcards give them more control over the
results, and they may not fully understand stemming.

Regards,
Modassar

On Thu, Dec 25, 2014 at 3:17 AM, Erick Erickson <erickerick...@gmail.com>
wrote:

> There's no magic bullet here that I know of. If your requirements
> are to support these huge, many-wildcard queries then you only
> have a few choices:
>
> 1> redo the index. I was surprised at how little it bloated the
> index as far as memory required is concerned to add ngrams.
> The key here is that there really aren't very many unique terms.
> If you use bigrams, then there are only maybe 36^2 distinct
> combinations. (assuming English and including numbers).
>
> 2> Increase the number of shards, putting many fewer docs
> on each shard.
>
> 3> give each shard a lot more memory. This isn't actually one
> of my preferred solutions since GC issues may raise their ugly
> heads here.
>
> 4> insert creative solution here.
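
The bigram arithmetic in option 1> can be sketched in a few lines of Python. This is only an illustration of why the number of distinct terms stays small, not Solr's n-gram filter; the `bigrams` helper and the 36-symbol alphabet (a-z plus 0-9, following the assumption stated above) are hypothetical names chosen for the example.

```python
import string

# Assumed alphabet: 26 lowercase letters + 10 digits = 36 symbols.
ALPHABET = string.ascii_lowercase + string.digits


def bigrams(token):
    """Return the overlapping 2-character grams of a lowercased token."""
    token = token.lower()
    return [token[i:i + 2] for i in range(len(token) - 1)]


# Upper bound on the number of distinct bigram terms over that alphabet.
max_distinct_bigrams = len(ALPHABET) ** 2

print(bigrams("solr"))
print(max_distinct_bigrams)
```

The count of distinct terms is bounded, but every token still emits one gram per adjacent character pair, so the postings for each gram grow with the corpus.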
>
> This may be one of those situations where you really have to
> push back at the users and understand why they insist on these
> kinds of queries. They must be very patient since it won't be
> very performant. That said, I've seen this pattern; there are
> certainly valid conditions under which response times can be
> many seconds if there are few users and they are doing very
> complex/expert-level things.
>
> Now, all that said, wildcards are often examples of poor habits
> or habits learned in DB systems where the only hammer was
> %whatever%. I've seen situations where users didn't
> understand that Solr broke the input stream up into words. And
> stemmed. And WordDelimiterFilterFactory did all the magic
> for finding, say D.C. and DC. So it's worth looking at the actual
> queries that are sent, perhaps talking to users and understanding
> what they _want_ out of the system, then perhaps educating them
> as to better ways to get what they want.
>
> Literally I've seen people insist on entering queries that
> wildcarded _everything_ both pre and post wildcards because
> they didn't realize that Solr tokenizes...
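
To illustrate the point about tokenization, here is a toy inverted index in Python. It is a sketch under simplifying assumptions: the `tokenize` and `build_index` helpers are hypothetical, and the analyzer only lowercases and splits on non-alphanumeric characters. It does not reproduce stemming or WordDelimiterFilterFactory.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (a toy analyzer)."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


docs = {
    1: "Offices in Washington, D.C. and Boston",
    2: "The Boston office closed",
}
index = build_index(docs)

# A plain term lookup finds the word wherever it occurs in the text,
# so wrapping it in wildcards on both sides adds nothing.
print(sorted(index["boston"]))
print(sorted(index["washington"]))
```

Because the text is indexed word by word, a query for `boston` already matches both documents; a user coming from `%whatever%` habits does not need `*boston*` to get that result.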
>
> Once you hit an OOM, all bets are off as Shawn outlined.
>
> Best,
> Erick
>
> On Wed, Dec 24, 2014 at 1:57 AM, Modassar Ather <modather1...@gmail.com>
> wrote:
> > Thanks for your response.
> >
> > How many items are in the collection?
> > There are about 100 million documents.
> >
> > How are the caches configured in solrconfig.xml?
> > Each cache has its size attribute set to 128.
> >
> > Can you provide a sample of the query?
> > Does it fail immediately after SolrCloud startup or after several hours?
> > It is a query with many terms (more than a thousand) and phrases, where
> > the phrases contain many wildcards.
> > Once such a query is executed there are many ZooKeeper-related exceptions,
> > and after a couple of such queries it runs into OutOfMemory.
> >
> > Thanks,
> > Modassar
> >
> >
> > On Wed, Dec 24, 2014 at 1:37 PM, Dominique Bejean <dominique.bej...@eolya.fr> wrote:
> >
> >> And you didn't say how much RAM each server has.
> >>
> >> 2014-12-24 8:17 GMT+01:00 Dominique Bejean <dominique.bej...@eolya.fr>:
> >>
> >> > Modassar,
> >> >
> >> > How many items are in the collection?
> >> > I mean how many documents per collection? 1 million, 10 million, …?
> >> >
> >> > How are the caches configured in solrconfig.xml?
> >> > What is the size attribute value for each cache?
> >> >
> >> > Can you provide a sample of the query?
> >> > Does it fail immediately after SolrCloud startup or after several hours?
> >> >
> >> > Dominique
> >> >
> >> > 2014-12-24 6:20 GMT+01:00 Modassar Ather <modather1...@gmail.com>:
> >> >
> >> >> Thanks for your suggestions.
> >> >>
> >> >> I will look into the link provided.
> >> >> http://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap
> >> >>
> >> >> This is usually an anti-pattern. The very first thing
> >> >> I'd be doing is trying to not do this. See ngrams for infix
> >> >> queries, or shingles or ReversedWildcardFilterFactory or.....
> >> >>
> >> >> We cannot avoid multiple wildcards since that is our users' requirement.
> >> >> We try to discourage it, but the users insist on firing such queries.
> >> >> Also, ngrams etc. can be tried, but our index is already huge and ngrams
> >> >> may add a lot more to it. We are OK with such queries failing as long as
> >> >> other queries are not affected.
> >> >>
> >> >>
> >> >> Please find the details below.
> >> >>
> >> >> So, how many nodes are in the cluster?
> >> >> There are 4 nodes in total in the cluster.
> >> >>
> >> >> How many shards and replicas does the collection have?
> >> >> There are 4 shards and no replicas for any of them.
> >> >>
> >> >> How many items are in the collection?
> >> >> If I understand the question correctly, there are two collections on
> >> >> each node, and their sizes on each node are approximately 190GB and
> >> >> 130GB.
> >> >>
> >> >> What is the size of the index?
> >> >> There are two collections on each node, and their sizes on each node
> >> >> are approximately 190GB and 130GB.
> >> >>
> >> >> How is the collection updated (frequency, how many items per day, what
> >> >> is your hard commit strategy)?
> >> >> It is an optimized, read-only index. There are no intermediate updates.
> >> >>
> >> >> How are the caches configured in solrconfig.xml?
> >> >> Filter cache, query result cache and document cache are enabled.
> >> >> Auto-warming is also done.
> >> >>
> >> >> Can you provide all other JVM parameters?
> >> >> -Xms20g -Xmx24g -XX:+UseConcMarkSweepGC
> >> >>
> >> >> Thanks again,
> >> >> Modassar
> >> >>
> >> >> On Wed, Dec 24, 2014 at 2:29 AM, Dominique Bejean <dominique.bej...@eolya.fr> wrote:
> >> >>
> >> >> > Hi,
> >> >> >
> >> >> > I agree with Erick, it would be a good thing to have more details
> >> >> > about your configuration and collection.
> >> >> >
> >> >> > Your heap size is 32Gb. How much RAM is on each server?
> >> >> >
> >> >> > By « 4 shard Solr cluster », do you mean 4 Solr server nodes or a
> >> >> > collection with 4 shards?
> >> >> >
> >> >> > So, how many nodes are in the cluster?
> >> >> > How many shards and replicas does the collection have?
> >> >> > How many items are in the collection?
> >> >> > What is the size of the index?
> >> >> > How is the collection updated (frequency, how many items per day,
> >> >> > what is your hard commit strategy)?
> >> >> > How are the caches configured in solrconfig.xml?
> >> >> > Can you provide all other JVM parameters?
> >> >> >
> >> >> > Regards
> >> >> >
> >> >> > Dominique
> >> >> >
> >> >> > 2014-12-23 17:50 GMT+01:00 Erick Erickson <erickerick...@gmail.com>:
> >> >> >
> >> >> > > Second most important part of your message:
> >> >> > > "When executing a huge query with many wildcards inside it the server"
> >> >> > >
> >> >> > > This is usually an anti-pattern. The very first thing
> >> >> > > I'd be doing is trying to not do this. See ngrams for infix
> >> >> > > queries, or shingles or ReversedWildcardFilterFactory or.....
> >> >> > >
> >> >> > > And if your corpus is very large with many unique terms it's even
> >> >> > > worse, but you haven't really told us about that yet.
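
The reversed-token idea behind the wildcard filter factory mentioned above can be sketched in plain Python. This is a toy model, not Solr's implementation: the term list, the `prefix_matches` helper and the `leading_wildcard` function are all hypothetical. The point is only that a leading wildcard such as `*ing` becomes a cheap prefix lookup once each term is also stored reversed.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    """Return all terms starting with prefix, via binary search on a sorted list."""
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches


terms = ["indexing", "searching", "sharding", "solr", "zookeeper"]

# Store every term reversed, kept sorted like a term dictionary.
reversed_terms = sorted(t[::-1] for t in terms)


def leading_wildcard(suffix):
    """Answer *suffix by running a prefix lookup over the reversed terms."""
    hits = prefix_matches(reversed_terms, suffix[::-1])
    return sorted(h[::-1] for h in hits)


print(leading_wildcard("ing"))
```

Without the reversed copies, answering `*ing` means checking every term in the dictionary, which is the expensive case the anti-pattern warning refers to.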
> >> >> > >
> >> >> > > Best,
> >> >> > > Erick
> >> >> > >
> >> >> > > On Tue, Dec 23, 2014 at 8:30 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> >> >> > > > On 12/23/2014 4:34 AM, Modassar Ather wrote:
> >> >> > > >> Hi,
> >> >> > > >>
> >> >> > > >> I have a setup of a 4 shard Solr cluster with embedded ZooKeeper
> >> >> > > >> on one of them. The zkClient timeout is set to 30 seconds, -Xms
> >> >> > > >> is 20g and -Xmx is 24g.
> >> >> > > >> When executing a huge query with many wildcards inside it the
> >> >> > > >> server crashes and becomes non-responsive. Even the dashboard
> >> >> > > >> does not respond and shows a connection lost error. This requires
> >> >> > > >> me to restart the servers.
> >> >> > > >
> >> >> > > > Here's the important part of your message:
> >> >> > > >
> >> >> > > > *Caused by: java.lang.OutOfMemoryError: Java heap space*
> >> >> > > >
> >> >> > > >
> >> >> > > > Your heap is not big enough for what Solr has been asked to do.
> >> >> > > > You need to either increase your heap size or change your
> >> >> > > > configuration so that it uses less memory.
> >> >> > > >
> >> >> > > > http://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap
> >> >> > > >
> >> >> > > > Most programs have pretty much undefined behavior when an OOME
> >> >> > > > occurs. Lucene's IndexWriter has been hardened so that it tries
> >> >> > > > extremely hard to avoid index corruption when OOME strikes, and I
> >> >> > > > believe that works well enough that we can call it nearly
> >> >> > > > bulletproof ... but the rest of Lucene and Solr will make no
> >> >> > > > guarantees.
> >> >> > > >
> >> >> > > > It's very difficult to have definable program behavior when an
> >> >> > > > OOME happens, because you simply cannot know the precise point
> >> >> > > > during program execution where it will happen, or what isn't
> >> >> > > > going to work because Java did not have memory space to create an
> >> >> > > > object. Going unresponsive is not surprising.
> >> >> > > >
> >> >> > > > If you can solve your heap problem, note that you may run into
> >> >> > > > other performance issues discussed on the wiki page that I linked.
> >> >> > > >
> >> >> > > > Thanks,
> >> >> > > > Shawn
> >> >> > > >
> >> >> > >
> >> >> >
> >> >>
> >> >>
> >> >
> >> >
> >>
>
