Re: Merging shards and replicating changes in SolrCloud

Shalin Shekhar Mangar Sat, 09 Nov 2013 08:32:56 -0800

Comments inline:

On Fri, Nov 8, 2013 at 8:09 PM, michael.boom <my_sky...@yahoo.com> wrote:


> Here's the background of this topic:
> I have set up a collection with 4 shards, replicationFactor=2, on two
> machines.
> I started to index documents, but after hitting some update deadlocks and
> restarting servers, my shard ranges in the ZK state got nulled (I'm using
> implicit routing). Indexing continued without me noticing, and all new
> documents were indexed in shard1, creating a huge disproportion with
> shards 2, 3 and 4.
> Of course, I want to fix this and get my index into 4 shards, evenly
> distributed.
>

If you are using implicit routing then the shard ranges should be null.
Shard ranges are only used when the router is compositeId.
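
To illustrate what a shard range is under compositeId, here is a minimal
Python sketch that divides the signed 32-bit hash space into equal
contiguous ranges and prints them in the hex style that usually appears in
the cluster state. It is an illustration only, not Solr's actual code; the
function names are made up for this example.

```python
def split_hash_range(num_shards):
    """Split the signed 32-bit hash space into contiguous, near-equal ranges."""
    lo, hi = -2**31, 2**31 - 1
    step = (hi - lo + 1) // num_shards
    ranges = []
    start = lo
    for i in range(num_shards):
        # The last shard absorbs any remainder so the whole space is covered.
        end = hi if i == num_shards - 1 else start + step - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def to_hex(value):
    """Render a signed 32-bit value as unsigned hex without padding."""
    return format(value & 0xFFFFFFFF, "x")


def describe(num_shards):
    """Return one 'min-max' hex string per shard."""
    return ["%s-%s" % (to_hex(a), to_hex(b)) for a, b in split_hash_range(num_shards)]


if __name__ == "__main__":
    for name, rng in zip(["shard1", "shard2", "shard3", "shard4"], describe(4)):
        print(name, rng)
```

With the implicit router there is nothing like this to compute: documents
go to whichever shard they are sent to, so a null range is expected.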


>
> What I'm thinking to do is:
> 1. on machine 1, merge shards2,3,4 into shard1 using
> http://wiki.apache.org/solr/MergingSolrIndexes
> (at this point, what happens to the replica of shard1 on machine2? Will
> SolrCloud try to replicate shard1 from machine1?)
>

Index merge is a core admin command. It is not SolrCloud-aware. Therefore
I think that merging will not automatically replicate shard1 on machine1 to
the other replicas unless a recovery is requested for some reason.
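
For reference, a merge is requested per core through the CoreAdmin handler.
Below is a minimal Python sketch that only builds the request URL using the
mergeindexes action with srcCore parameters. The host, port and core names
are placeholders assumed for illustration; real core names in a SolrCloud
setup will differ.

```python
from urllib.parse import urlencode


def merge_indexes_url(base_url, target_core, source_cores):
    """Build a CoreAdmin mergeindexes URL that merges source cores into target_core."""
    params = [("action", "mergeindexes"), ("core", target_core)]
    # srcCore is repeated once per source core on the same Solr instance.
    params += [("srcCore", name) for name in source_cores]
    return base_url.rstrip("/") + "/admin/cores?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical host and core names, for illustration only.
    print(merge_indexes_url("http://machine1:8983/solr", "shard1",
                            ["shard2", "shard3", "shard4"]))
```

Because the request goes to a single core on a single node, nothing in it
tells the other replica of shard1 that its leader's index has changed.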


> 2. on machine 2, unload the shard1,2,3,4 cores
> 3. on machine 1, split shard1 into shard1_0 and shard1_1, then split
> shard1_0 and shard1_1 again, getting 4 equal shards 1_0_0, 1_0_1, 1_1_0,
> 1_1_1
> (will the shard ranges for the new shards be correct if shard1's range
> was "null" in the beginning?)
>

No, shard splitting does not work with implicit routing. It works only if
the router is compositeId.


> 4. on machine 1 unload shard1
> 5. rename shards 1_0_0, 1_0_1, 1_1_0, 1_1_1 to 1,2,3,4.
> 6. replicate shard 1,2,3,4 to machine 2
>
> Do you see any problems with this scenario? Anything that could be done in
> a more efficient way?
> Thank you
>
>
>
Unfortunately no. If you had only inserts on your index and you were
always searching across the entire cluster (i.e. you don't care where a
document ends up), then you could have used the core admin split API to
re-balance the cluster. I think you should just re-index everything and
start again.
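
For completeness, the core admin split mentioned above is also a per-core
request. A minimal Python sketch that builds such a URL follows, assuming
the SPLIT action with repeated targetCore parameters and target cores that
already exist; the host and core names are placeholders, not taken from
your setup.

```python
from urllib.parse import urlencode


def split_core_url(base_url, source_core, target_cores):
    """Build a CoreAdmin SPLIT URL that splits source_core into existing target cores."""
    params = [("action", "SPLIT"), ("core", source_core)]
    # targetCore is repeated once per destination core.
    params += [("targetCore", name) for name in target_cores]
    return base_url.rstrip("/") + "/admin/cores?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical host and core names, for illustration only.
    print(split_core_url("http://machine1:8983/solr", "shard1",
                         ["shard1_0", "shard1_1"]))
```

This only partitions the index of one core. It does not update shard
ranges or routing in ZooKeeper, which is why it would only have helped in
the insert-only, search-everything case described above.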


>
> -----
> Thanks,
> Michael
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Merging-shards-and-replicating-changes-in-SolrCloud-tp4099997.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Regards,
Shalin Shekhar Mangar.
