bq:  It does NOT optimize multiple replicas or shards in parallel.

This behavior changed in 4.10, though; see:
https://issues.apache.org/jira/browse/SOLR-6264

So with 5.0, I bet Pavel is seeing the result of that JIRA.

I have to agree with Shawn: the optimization step should proceed
invisibly in the background. I suspect you have something else
going on here.
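
For reference, here is a minimal sketch of the per-core optimize request
discussed in this thread, written in Python rather than curl. The host
name (node2), port (8983), and core name (core1) are assumptions for
illustration only, and as the thread notes, SolrCloud may still forward
the optimize to other replicas regardless of the distrib parameter. The
sketch only builds the URL; it does not send anything.

```python
from urllib.parse import urlencode


def build_optimize_url(base_url, core, distrib=False, max_segments=None):
    """Build the URL for an explicit optimize request against one core.

    base_url and core are placeholders for your own deployment.
    """
    params = {
        "optimize": "true",
        # Whether SolrCloud honors this for optimize depends on the version.
        "distrib": "true" if distrib else "false",
    }
    if max_segments is not None:
        # Optional: merge down to N segments instead of a single one.
        params["maxSegments"] = str(max_segments)
    return f"{base_url.rstrip('/')}/{core}/update?{urlencode(params)}"


# Hypothetical node and core names, for illustration only.
print(build_optimize_url("http://node2:8983/solr", "core1"))
```

Passing max_segments asks for a partial merge, which generates less disk
I/O than a full optimize and may be less disruptive to running queries.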

FWIW,
Erick

On Wed, Mar 25, 2015 at 9:54 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> On 3/25/2015 9:08 AM, pavelhladik wrote:
>> Our data change frequently, which is why there are so many deletedDocs.
>> The optimized core takes around 50GB on disk; we are now at almost 100GB,
>> and I'm looking for the best way to optimize this huge core without
>> downtime. I know optimization runs in the background, but while it is
>> running our search system is slow and I sometimes receive errors. For us,
>> this behavior amounts to downtime.
>>
>> I would like to switch to SolrCloud. Performance is not an issue, so I
>> don't need the sharding feature at this time. I'm more interested in
>> replication and in distributing requests through an Nginx proxy. The idea is:
>>
>> 1) the proxy forwards requests to node1 while I optimize the cores on node2
>> 2) the proxy forwards requests to node2 while I optimize the cores on node1
>>
>> But when I optimize on node2, node1 runs an optimization as well,
>> even if I use "distrib=false" with curl.
>
> You are correct - with SolrCloud, any optimize command will optimize the
> entire collection, one shard replica at a time, regardless of any
> distrib parameter.  It does NOT optimize multiple replicas or shards in
> parallel.  I thought we had an issue in Jira asking to make optimize
> honor a "distrib=false" parameter, but I can't find it.  Even if that
> were fixed, it would not help you, because SolrCloud is only optimizing
> one shard replica at any given moment.
>
> Optimization does NOT directly result in downtime ... but because
> optimize generates a very large amount of disk I/O, it can be disruptive
> if your server does not have enough resources.
>
> I don't have enough information to say for sure, but I am betting that
> you don't have enough RAM in your machine to effectively cache your
> index, so anything that negatively affects performance, like an
> optimize, is too much for your server to handle at the same time as
> ongoing queries or indexing.  The info on this wiki page can help you
> determine how much total RAM you might need:
>
> http://wiki.apache.org/solr/SolrPerformanceProblems
>
> Thanks,
> Shawn
>
