Comments inline:

On Fri, Nov 8, 2013 at 8:09 PM, michael.boom <my_sky...@yahoo.com> wrote:
> Here's the background of this topic:
> I have set up a collection with 4 shards, replicationFactor=2, on two
> machines.
> I started to index documents, but after hitting some update deadlocks and
> restarting servers my shard ranges in ZK state got nulled (I'm using
> implicit routing). Indexing continued without me noticing and all new
> documents were indexed in shard1, creating huge disproportions with
> shards2,3,4.
> Of course, I want to fix this and get my index into 4 shards, evenly
> distributed.
>

If you are using implicit routing then the shard ranges should be null.
Shard ranges are only used when the router is compositeId.

>
> What I'm thinking to do is:
> 1. on machine 1, merge shards2,3,4 into shard1 using
> http://wiki.apache.org/solr/MergingSolrIndexes
> (at this point what happens to the replica of shard1 on machine2? will
> SolrCloud try to replicate shard1 from machine1?)
>

Index merge is a core admin command. It is not SolrCloud-aware. Therefore
I think that merging will not automatically replicate shard1 on machine1
to other replicas unless a recovery is requested for some reason.

> 2. on machine 2, unload the shard1,2,3,4 cores
> 3. on machine 1, split shard1 in shard1_0 and shard1_1. Again split
> shard1_0
> and 1_1 getting 4 equal shards 1_0_0, 1_0_1, 1_1_0, 1_1_1
> (will now the shard range for the newborns be correct if in the beginning
> shard1's range was "null"?)
>

No, shard splitting does not work with implicit routing. It works only if
the router is compositeId.

> 4. on machine 1 unload shard1
> 5. rename shards 1_0_0, 1_0_1, 1_1_0, 1_1_1 to 1,2,3,4.
> 6. replicate shard 1,2,3,4 to machine 2
>
> Do you see any problems with this scenario? Anything that could be done in a
> more efficient way?
> Thank you
>
>
>

Unfortunately no. If you had only inserts on your index and you were
always searching across the entire cluster (i.e. you don't care where a
document ends up), then you could have used the core admin split API to
re-balance the cluster. As it is, I think you should just re-index
everything and start again.

>
> -----
> Thanks,
> Michael
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Merging-shards-and-replicating-changes-in-SolrCloud-tp4099997.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

--
Regards,
Shalin Shekhar Mangar.
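
P.S. For reference, here is a rough sketch of how the two core admin requests discussed in this thread (index merge and core split) could be put together. The host, port and core names are made up for illustration, and the helper core_admin_url is hypothetical, not part of any Solr client. The snippet only builds the URLs and sends nothing. It assumes the target cores of the split already exist, and, as noted in the replies, neither command is SolrCloud-aware.

```python
from urllib.parse import urlencode


def core_admin_url(base_url, action, **params):
    """Build a CoreAdmin request URL (hypothetical helper).

    List or tuple values are emitted as repeated query parameters,
    which is how multi-valued parameters such as targetCore are passed.
    """
    query = [("action", action)]
    for name, value in params.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        query.extend((name, v) for v in values)
    return f"{base_url}/admin/cores?{urlencode(query)}"


# Assumed host and core names, for illustration only.
BASE = "http://machine1:8983/solr"

# Merge the indexes of three source cores into one existing core.
merge_url = core_admin_url(
    BASE, "mergeindexes",
    core="shard1_core",
    srcCore=["shard2_core", "shard3_core", "shard4_core"],
)

# Split one core's index into two target cores that already exist.
split_url = core_admin_url(
    BASE, "SPLIT",
    core="shard1_core",
    targetCore=["shard1_0_core", "shard1_1_core"],
)

print(merge_url)
print(split_url)
```

Issuing these with any HTTP client would run the commands on that node only; replicas on the other machine would not be updated by them.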