Hello,

2011/3/14 Markus Jelsma <markus.jel...@openindex.io>

> Hi Doğacan,
>
> Are you, at some point, running out of heap space? In my experience, that's
> the common cause of increased load and excessively high response times (or
> timeouts).
>
>
How much heap would be enough? Our index size is growing slowly, but we did
not have this problem a couple of weeks ago, when the index was maybe 100 MB
smaller.

We left most of the caches in solrconfig.xml at their defaults and only
increased filterCache to 1024. We only request the "id" field (which is
unique) and no other fields during queries (though we do faceting). By the
way, about 1.6 GB of our index is stored fields (we store everything for now,
even though we do not fetch them during queries), and about 1 GB is the rest
of the index.
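
For reference, the filterCache entry in our solrconfig.xml looks roughly like
this (the cache class and autowarmCount below are from memory, so take them as
an approximation rather than the exact file):

    <filterCache class="solr.FastLRUCache"
                 size="1024"
                 initialSize="1024"
                 autowarmCount="256"/>

and a typical request from us is along the lines of (field names made up for
illustration):

    /solr/select?q=...&fl=id&rows=20&facet=true&facet.field=category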

Anyway, Xmx was 4000m; we tried increasing it to 8000m but did not see any
improvement in load. I can try monitoring with JConsole with 8 GB of heap to
see if it helps.
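
If I run that test, I plan to start the slave's Jetty roughly like this (a
sketch; the GC-logging and JMX flags are ones I intend to add for the test,
not what we run today):

    java -Xms8000m -Xmx8000m \
         -verbose:gc -XX:+PrintGCDetails -Xloggc:gc.log \
         -Dcom.sun.management.jmxremote \
         -Dcom.sun.management.jmxremote.port=18983 \
         -Dcom.sun.management.jmxremote.authenticate=false \
         -Dcom.sun.management.jmxremote.ssl=false \
         -jar start.jar

That should let us correlate the load spikes with what the collector is doing.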


> Cheers,
>
> > Hello everyone,
> >
> > First of all here is our Solr setup:
> >
> > - Solr nightly build 986158
> > - Running Solr inside the default Jetty that comes with the Solr build
> > - 1 write-only master, 4 read-only slaves (quad core 5640 with 24 GB of RAM)
> > - Index replicated (on optimize) to slaves via Solr replication
> > - Size of index is around 2.5 GB
> > - No incremental writes; the index is rebuilt from scratch (delete old
> > documents -> commit new documents -> optimize) every 6 hours
> > - Avg # of requests per second is around 60 (for a single slave)
> > - Avg time per request is around 25 ms (before having problems)
> > - Load on each slave is around 2
> >
> > We have been using this setup for months without any problem. However, last
> > week we started to experience very weird performance problems, such as:
> >
> > - Avg time per request increased from 25 ms to 200-300 ms (even higher if we
> > don't restart the slaves)
> > - Load on each slave increased from 2 to 15-20 (Solr uses 400%-600% CPU)
> >
> > When we profile solr we see two very strange things :
> >
> > 1 - This is the JConsole output:
> >
> > https://skitch.com/meralan/rwwcf/mail-886x691
> >
> > As you can see, GC runs every 10-15 seconds and collects more than 1 GB of
> > memory. (Actually, if you wait more than 10 minutes, you see spikes up to
> > 4 GB consistently.)
> >
> > 2 - This is the New Relic output:
> >
> > https://skitch.com/meralan/rwwci/solr-requests-solr-new-relic-rpm
> >
> > As you can see, Solr spends a ridiculously long time in the
> > SolrDispatchFilter.doFilter() method.
> >
> >
> > Apart from these, when we clean the index directory, re-replicate, and
> > restart each slave one by one, we see some relief in the system, but after
> > some time the servers start to melt down again. Although deleting the index
> > and re-replicating doesn't solve the problem, we think these problems are
> > somehow related to replication, because the symptoms started after a
> > replication and the system temporarily heals itself right after one. I also
> > see lucene-write.lock files on the slaves (we don't have write.lock files
> > on the master), which I think we shouldn't see.
> >
> >
> > If anyone has any ideas, we would appreciate them.
> >
> > Regards,
> > Dogacan Guney
>



-- 
Doğacan Güney
