SolrCloud uses ZooKeeper sequence flags to keep track of the order in which
nodes register themselves as leader candidates. The node with the lowest
sequence number wins as leader of the shard.
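
As a rough illustration of that rule, here is a minimal sketch in plain Java. It assumes shortened, hypothetical node names such as "S1-n_0000000000" (real election znode names carry a longer prefix before "-n_"). The class and method names are made up for this example and are not part of Solr.

```java
import java.util.Arrays;
import java.util.List;

public class LowestSequence {
    // Extracts the numeric suffix after the last "-n_" marker,
    // e.g. "S3-n_0000000002" -> 2. The name format is an assumption.
    static long sequenceOf(String nodeName) {
        int idx = nodeName.lastIndexOf("-n_");
        if (idx < 0) {
            throw new IllegalArgumentException("no sequence suffix: " + nodeName);
        }
        return Long.parseLong(nodeName.substring(idx + 3));
    }

    // Returns the candidate with the lowest sequence number,
    // i.e. the node expected to win the election under the rule above.
    static String expectedLeader(List<String> electionNodes) {
        String leader = null;
        long best = Long.MAX_VALUE;
        for (String node : electionNodes) {
            long seq = sequenceOf(node);
            if (leader == null || seq < best) {
                best = seq;
                leader = node;
            }
        }
        if (leader == null) {
            throw new IllegalStateException("no candidates");
        }
        return leader;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList(
                "S2-n_0000000001", "S1-n_0000000000", "S3-n_0000000002");
        System.out.println(expectedLeader(nodes)); // prints S1-n_0000000000
    }
}
```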

What I'm trying to do is keep leader re-assignments to a minimum during a
rolling restart. To that end, I change the zk sequence numbers on the
SolrCloud nodes once all nodes of the cluster are up and active. I'm using
Solr 4.10.0 and I'm aware of SOLR-6491, which has a similar purpose, but I'm
trying to do it from the "outside", using the existing APIs without editing
the Solr source code.

== TYPICAL SCENARIO ==
Suppose we have 3 Solr instances S1, S2, S3. They are started in that
order and the zk sequences are assigned as follows:
S1:-n_0000000000 (LEADER)
S2:-n_0000000001
S3:-n_0000000002

In a rolling restart we'll get S2 as leader (after S1 shutdown), then S3
(after S2 shutdown) and finally S1 (after S3 shutdown), 3 changes in total.

== MY ATTEMPT ==
By using SolrZkClient and the ZooKeeper multi API I found a way to get rid
of the old znodes that participate in a shard's leader election and write
new ones to which we can assign sequence numbers of our liking:

S1:-n_0000000000 (no code running here)
S2:-n_0000000004 (code deleting zknode -n_0000000001 and creating -n_0000000004)
S3:-n_0000000003 (code deleting zknode -n_0000000002 and creating -n_0000000003)

In a rolling restart I'd expect to have S3 as leader (after S1 shutdown), no
change (after S2 shutdown) and finally S1 (after S3 shutdown), that is 2
changes. This number stays constant no matter how many servers are added to
SolrCloud, while in the first scenario the number of re-assignments equals
the number of Solr servers.
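
To make the counting explicit, here is a small simulation of the behaviour I expect, written as a sketch in plain Java. It is an idealized model and not Solr code: it assumes the lowest sequence number always wins, that a restarted node rejoins with the next unused sequence number, and that nodes are restarted one at a time. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RollingRestartSim {
    // Node with the lowest sequence number, or null if no node is up.
    static String leaderOf(Map<String, Long> seqs) {
        String leader = null;
        long best = Long.MAX_VALUE;
        for (Map.Entry<String, Long> e : seqs.entrySet()) {
            if (leader == null || e.getValue() < best) {
                best = e.getValue();
                leader = e.getKey();
            }
        }
        return leader;
    }

    // Counts leader changes while each node in restartOrder is shut down
    // and then brought back with a fresh, higher sequence number.
    static int countLeaderChanges(Map<String, Long> initial, List<String> restartOrder) {
        Map<String, Long> seqs = new LinkedHashMap<String, Long>(initial);
        long next = 0;
        for (long s : seqs.values()) {
            next = Math.max(next, s + 1);
        }
        String leader = leaderOf(seqs);
        int changes = 0;
        for (String node : restartOrder) {
            seqs.remove(node);                 // node shuts down
            String newLeader = leaderOf(seqs); // election among remaining nodes
            if (newLeader != null && !newLeader.equals(leader)) {
                changes++;
                leader = newLeader;
            }
            seqs.put(node, next++);            // node rejoins at the back of the line
        }
        return changes;
    }

    public static void main(String[] args) {
        List<String> order = Arrays.asList("S1", "S2", "S3");

        Map<String, Long> typical = new LinkedHashMap<String, Long>();
        typical.put("S1", 0L);
        typical.put("S2", 1L);
        typical.put("S3", 2L);
        System.out.println(countLeaderChanges(typical, order)); // prints 3

        Map<String, Long> resequenced = new LinkedHashMap<String, Long>();
        resequenced.put("S1", 0L);
        resequenced.put("S2", 4L);
        resequenced.put("S3", 3L);
        System.out.println(countLeaderChanges(resequenced, order)); // prints 2
    }
}
```

Under this model the trick is simply to give the node that is restarted last the lowest sequence number among the non-leaders.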

The problem occurs when S1 (LEADER) is shut down. The election that takes
place still sets S2 as leader, as if the new sequence numbers were ignored.
When I go to /solr/#/~cloud?view=tree, the new sequence numbers are listed
under "/collections", and based on them S3 should have become the leader.
Do you have any idea why the new state is not acknowledged during the
elections? Is something cached? Or, to put it bluntly, do I have any chance
down this path? If not, what are my options? Is it possible to apply all
patches under SOLR-6491 in isolation and continue from there?

Thank you. 

Extra info which might help follows
1. Some logging related to leader elections after S1 has been shut down
    S2 - org.apache.solr.cloud.SyncStrategy Leader's attempt to sync with shard failed, moving to the next candidate
    S2 - org.apache.solr.cloud.ShardLeaderElectionContext We failed sync, but we have no versions - we can't sync in that case - we were active before, so become leader anyway

    S3 - org.apache.solr.cloud.LeaderElector Our node is no longer in line to be leader

2. And some sample code on how I perform the ZK re-sequencing
   // Read current zk nodes for a specific collection
   solrServer.getZkStateReader().getZkClient().getSolrZooKeeper().getChildren("/collections/core/leader_elect/shard1/election", true)
   // node deletion
   Op.delete(path, -1)
   // node creation
   Op.create(createPath, new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
   // Perform operations
   solrServer.getZkStateReader().getZkClient().getSolrZooKeeper().multi(opsList);
   solrServer.getZkStateReader().updateClusterState(true);

