We have a SolrCloud cluster (5 shards and 2 replicas) on 10 boxes with 500 
million documents. We're using custom sharding where we direct all documents 
with a specific business date to a specific shard.

With Solr 3.6 we used this command to optimize on the master and then let 
replication take care of updating documents on slave1 and slave2:

curl --proxy "" \
  'http://prod-solr-master.xyz.com:8983/solr/core1/update?optimize=true&waitFlush=false&maxSegments=1'

How do we optimize documents for all shards in SolrCloud? Do we have to fire 
five separate optimize commands, one at each of the five leaders? Also, it looks 
like optimize will be going away and might no longer be necessary, see 
SOLR-3141 <https://issues.apache.org/jira/browse/SOLR-3141>. Is that true? With 
Solr 3.6 we purge millions of documents every month and then run optimize. 
We're planning to do the same with the SolrCloud setup.
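
If one optimize per leader does turn out to be necessary, the loop we have in 
mind would look roughly like the sketch below. The host names are hypothetical 
placeholders for our five leaders, the function names are ours, and the URL 
parameters are simply carried over from the Solr 3.6 command above; whether 
SolrCloud needs this at all is exactly what we are asking.

```shell
#!/usr/bin/env bash
# Sketch only: these host names are hypothetical placeholders, not real leaders.
LEADERS="prod-solr-shard1.xyz.com prod-solr-shard2.xyz.com prod-solr-shard3.xyz.com prod-solr-shard4.xyz.com prod-solr-shard5.xyz.com"

# Build the optimize URL for one host (same parameters as our Solr 3.6 command).
optimize_url() {
  printf 'http://%s:8983/solr/core1/update?optimize=true&waitFlush=false&maxSegments=1\n' "$1"
}

# Visit each host in turn. With DRY_RUN=1 (the default) the curl commands are
# only printed; set DRY_RUN=0 to actually send them.
optimize_all() {
  local host
  for host in $LEADERS; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "curl --proxy \"\" '$(optimize_url "$host")'"
    else
      curl --proxy "" "$(optimize_url "$host")"
    fi
  done
}
```

Running `optimize_all` prints the five commands for review; `DRY_RUN=0 optimize_all` 
would send them one after another.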

With Solr 3.6 we used the following curl command to purge documents. Now, with 
multiple shards, can we still use the same command? We will definitely 
experiment with our QA setup of 500 million documents.

curl --proxy "" \
  'http://prod-solr-master.xyz.com:8983/solr/core1/update?commit=true' \
  -H "Content-Type: text/xml" \
  --data-binary '<delete><query>busdate_i:[* TO 20130208]</query></delete>'
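
Since the cutoff date changes every month, the delete payload could be built by 
a small helper like the one sketched here. The function name 
`build_purge_payload` is hypothetical, and the YYYYMMDD format is assumed from 
the `busdate_i` value in the command above.

```shell
# Hypothetical helper: build the delete-by-query XML for a cutoff date (YYYYMMDD).
build_purge_payload() {
  local cutoff="$1"
  # Accept exactly eight digits so a malformed date cannot widen the delete range.
  case "$cutoff" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
    *) echo "cutoff must be YYYYMMDD" >&2; return 1 ;;
  esac
  printf '<delete><query>busdate_i:[* TO %s]</query></delete>\n' "$cutoff"
}
```

The result would then be passed to the same curl command, for example 
`--data-binary "$(build_purge_payload 20130208)"`.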

Thanks!





