On 3/25/2015 9:08 AM, pavelhladik wrote:
> Our data change frequently, which is why there are so many deletedDocs.
> The optimized core takes around 50GB on disk, we are now at almost
> 100GB, and I'm looking for the best way to optimize this huge core
> without downtime. I know optimization runs in the background, but while
> it is running our search system is slow and I sometimes receive errors -
> that behavior is like downtime for us.
>
> I would like to switch to SolrCloud. Performance is not an issue, so I
> don't need the sharding feature at this time. I'm more interested in
> replication and distributing requests through an Nginx proxy. The idea
> is:
>
> 1) the proxy forwards requests to node1 while cores are optimized on node2
> 2) the proxy forwards requests to node2 while cores are optimized on node1
>
> But when I optimize on node2, node1 does the optimization as well, even
> if I use "distrib=false" with curl.
You are correct - with SolrCloud, any optimize command will optimize the
entire collection, one shard replica at a time, regardless of any distrib
parameter. It does NOT optimize multiple replicas or shards in parallel.
I thought we had an issue in Jira asking to make optimize honor a
"distrib=false" parameter, but I can't find it. Even if that were fixed,
it would not help you, because SolrCloud only optimizes one shard replica
at any given moment.

Optimization does NOT directly result in downtime ... but because an
optimize generates a very large amount of disk I/O, it can be disruptive
if your server does not have enough spare resources. I don't have enough
information to say for sure, but I am betting that you don't have enough
RAM in your machine to effectively cache your index, so anything that
hurts performance as much as an optimize does is too much for your server
to handle at the same time as ongoing queries or indexing.

The info on this wiki page can help you determine how much total RAM you
might need:

http://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn
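P.S. For reference, the kind of optimize request we're talking about
looks something like this with curl (hostname, port, and collection name
here are placeholders for whatever your setup uses):

  # Ask one node to optimize; maxSegments=1 merges the index down to a
  # single segment, which is what optimize does by default anyway.
  curl "http://node2:8983/solr/mycollection/update?optimize=true&maxSegments=1&distrib=false"

In SolrCloud, a request like this still optimizes every shard replica in
the whole collection, one replica at a time - the distrib=false parameter
has no effect on optimize.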