bq: It does NOT optimize multiple replicas or shards in parallel.

This behavior was changed in 4.10 though, see:
https://issues.apache.org/jira/browse/SOLR-6264
So with 5.0 Pavel is seeing the result of that JIRA I bet.

I have to agree with Shawn, the optimization step should proceed
invisibly in the background, I suspect you have something else going
on here.

FWIW,
Erick

On Wed, Mar 25, 2015 at 9:54 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> On 3/25/2015 9:08 AM, pavelhladik wrote:
>> Our data are changing frequently so that's why so many deletedDocs.
>> Optimized core takes around 50GB on disk, we are now almost on 100GB and I'm
>> looking for best solution howto optimize this huge core without downtime. I
>> know optimization working in background, but anyway when the optimization is
>> running our search system is slow and sometimes I receive errors - this
>> behavior is like a downtime for us.
>>
>> I would like to switch to SolrCloud, the performance is not a issue, so I
>> don't need the sharding feature at this time. I'm more interested with
>> replication and distribute requests by some Nginx proxy. Idea is:
>>
>> 1) proxy forward requests to node1 and optimize cores on node2
>> 2) proxy forward requests to node2 and optimize cores on node1
>>
>> But when I do optimize on node2, the node1 is doing optimization as well,
>> even if I use the "distrib=false" with curl.
>
> You are correct - with SolrCloud, any optimize command will optimize the
> entire collection, one shard replica at a time, regardless of any
> distrib parameter. It does NOT optimize multiple replicas or shards in
> parallel. I thought we had an issue in Jira asking to make optimize
> honor a "distrib=false" parameter, but I can't find it. Even if that
> were fixed, it would not help you, because SolrCloud is only optimizing
> one shard replica at any given moment.
>
> Optimization does NOT directly result in downtime ... but because
> optimize generates a very large amount of disk I/O, it can be disruptive
> if your server does not have enough resources.
>
> I don't have enough information to say for sure, but I am betting that
> you don't have enough RAM in your machine to effectively cache your
> index, so anything that negatively affects performance, like an
> optimize, is too much for your server to handle at the same time as
> ongoing queries or indexing. The info on this wiki page can help you
> determine how much total RAM you might need:
>
> http://wiki.apache.org/solr/SolrPerformanceProblems
>
> Thanks,
> Shawn
>
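
For reference, here is a rough sketch of the kind of per-core optimize request the thread is talking about (Pavel's curl call with "distrib=false"). It only builds the URL for Solr's update handler and does not send anything. The host "node2", port 8983, and core name "mycore" are made-up placeholders, not values from the thread. Keep in mind that, per Shawn's reply, SolrCloud may ignore the distrib parameter for optimize, so treat this as an illustration of the request rather than a working fix.

```python
from urllib.parse import urlencode


def build_optimize_url(base_url, core, max_segments=1, distrib=False):
    """Build an update-handler URL asking Solr to optimize a single core.

    base_url and core are hypothetical placeholders. Per the thread,
    SolrCloud may not honor distrib=false for optimize.
    """
    params = {
        "optimize": "true",                 # ask the update handler to optimize
        "maxSegments": str(max_segments),   # merge down to this many segments
        "distrib": "true" if distrib else "false",
    }
    return f"{base_url.rstrip('/')}/{core}/update?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical node and core names, for illustration only.
    print(build_optimize_url("http://node2:8983/solr", "mycore"))
```

The resulting URL can be passed to curl or any HTTP client. Raising max_segments above 1 asks for a lighter merge, which may generate less disk I/O than a full optimize.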