Thanks for this. For our system we cannot use AUTO_REBALANCE, since we need the MASTER to be on a particular machine. But I'll try changing the preference list.
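To make sure I understand the preference-list idea, here is a quick sketch of how I read the AUTO-mode placement rule (plain Python, not the Helix API; the election rule below is my simplified assumption, namely that the first live node in the preference list gets MASTER):

```python
# Simplified model of AUTO-mode placement: the first live node in the
# preference list is assigned MASTER, remaining live nodes become SLAVE.
# This is an illustrative sketch, not Helix code.

def assign_states(preference_list, live_nodes):
    """Return {node: state} for live nodes, in preference-list order."""
    states = {}
    master_assigned = False
    for node in preference_list:
        if node not in live_nodes:
            continue
        states[node] = "SLAVE" if master_assigned else "MASTER"
        master_assigned = True
    return states

# With the original list, a restarted node_1 reclaims MASTER:
prefs = ["node_1", "node_2"]
print(assign_states(prefs, {"node_1", "node_2"}))  # node_1 is MASTER
print(assign_states(prefs, {"node_2"}))            # node_1 down -> node_2 is MASTER
print(assign_states(prefs, {"node_1", "node_2"}))  # node_1 back -> MASTER again

# Swapping the list when node_2 takes over keeps node_2 as MASTER:
prefs = ["node_2", "node_1"]
print(assign_states(prefs, {"node_1", "node_2"}))  # node_2 stays MASTER
```

So if the transition handler rewrites the preference list when node_2 is promoted, node_1 should come back as a SLAVE.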
I did more debugging today and I think I know what's causing problems for us. The problem is the sequence of transitions:

1. node_1 = MASTER and node_2 = SLAVE
2. node_1 is killed/dies
3. node_2 transitions to MASTER
4. node_1 is restarted -- here is the problem
5. node_2 transitions to SLAVE
6. node_1 transitions to SLAVE (yes, both node_1 and node_2 are SLAVEs)
7. node_1 transitions to MASTER

The problem is in step 5. Whenever a node comes up it needs to talk to the MASTER to sync up its state. But node_2 transitions to SLAVE before node_1 has become a SLAVE, leaving no active MASTER in the cluster. Is there a way to get steps 5 and 6 reversed?

On Mar 31, 2013, at 1:56 AM, kishore g <[email protected]> wrote:

> No, you don't have to change the state model to achieve this.
>
> Instead of AUTO, AUTO_REBALANCE should work in your case. Simply change the
> ideal state to look like this:
>
> {
>   "id": "Cluster",
>   "simpleFields": {
>     "IDEAL_STATE_MODE": "AUTO_REBALANCE",
>     "NUM_PARTITIONS": "1",
>     "REPLICAS": "2",
>     "STATE_MODEL_DEF_REF": "MasterSlave"
>   },
>   "mapFields": {
>   },
>   "listFields": {
>     "Partition_0": [ ]
>   }
> }
>
> You can also achieve this in AUTO mode: when a node becomes master for a
> partition, change the preference list in the idealstate as part of the
> transition. So in this case change "Partition_0" : [ "node_1", "node_2" ] to
> "Partition_0" : [ "node_2", "node_1" ] when node_2 becomes master.
>
> In the next release, you will be able to add custom rebalancer code which
> will allow you to make this change in the controller easily.
>
> thanks,
> Kishore G
>
>
> On Sat, Mar 30, 2013 at 10:21 PM, Ming Fang <[email protected]> wrote:
> Hi Kishore
>
> Our system requires deterministic placement of the MASTER and SLAVE.
> This is a sample of the idealstate file we're using:
>
> {
>   "id": "Cluster",
>   "simpleFields": {
>     "IDEAL_STATE_MODE": "AUTO",
>     "NUM_PARTITIONS": "1",
>     "REPLICAS": "2",
>     "STATE_MODEL_DEF_REF": "MasterSlave"
>   },
>   "mapFields": {
>   },
>   "listFields": {
>     "Partition_0": [ "node_1", "node_2" ]
>   }
> }
>
> In this example, node_1 is the MASTER.
> If node_1 dies then node_2 will take over.
> But if node_1 then gets restarted, it will try to become MASTER again.
> We normally keep the dead node down to avoid this problem.
> But I was hoping for a more elegant solution.
>
> One solution would be for node_1 to come up and realize that node_2 has
> taken over due to the previous failure.
> In that case node_1 would decide to remain a SLAVE node instead.
> Should this be done by the Controller instead?
> Should I create a new state model other than MASTER/SLAVE?
>
> On Mar 31, 2013, at 12:50 AM, kishore g <[email protected]> wrote:
>
>> Hi Ming,
>>
>> There are a couple of ways you can achieve that. Before providing an answer,
>> how many partitions do you have? Did you generate the idealstate yourself or
>> use Helix to come up with the initial idealstate?
>>
>> The reason the old master tries to become a master again is to distribute the
>> load among the nodes currently alive. Otherwise the old node that comes back
>> will never become a master for any partition and will remain idle until
>> another failure happens in the system.
>>
>> thanks,
>> Kishore G
>>
>>
>> On Sat, Mar 30, 2013 at 8:01 PM, Ming Fang <[email protected]> wrote:
>> We're using MASTER/SLAVE in AUTO mode.
>> When the MASTER is killed, the failover works properly: the SLAVE
>> transitions to become MASTER.
>> However, if the failed MASTER is restarted, it will try to become MASTER
>> again.
>> This is causing a problem in our business logic.
>> Is there a way to prevent the failed instance from becoming MASTER again?
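For reference, the transition ordering described at the top of the thread can be replayed with a small simulation (illustrative Python only, not Helix code; the step comments follow the numbered sequence above, and the no-MASTER window between steps 5 and 7 shows up directly):

```python
# Replay of the reported transition sequence. After each transition we
# snapshot the cluster and can check whether any MASTER is present.
# Illustrative sketch only; not Helix code.

def replay(transitions):
    """Apply (node, new_state_or_None) transitions; return state snapshots."""
    cluster = {}
    snapshots = []
    for node, state in transitions:
        if state is None:
            cluster.pop(node, None)   # node killed/removed
        else:
            cluster[node] = state
        snapshots.append(dict(cluster))
    return snapshots

sequence = [
    ("node_1", "MASTER"),   # step 1: initial placement
    ("node_2", "SLAVE"),    # step 1 (cont.)
    ("node_1", None),       # step 2: node_1 killed
    ("node_2", "MASTER"),   # step 3: failover
    ("node_1", "OFFLINE"),  # step 4: node_1 restarted
    ("node_2", "SLAVE"),    # step 5: node_2 demoted -- no MASTER from here...
    ("node_1", "SLAVE"),    # step 6: ...both nodes are now SLAVE
    ("node_1", "MASTER"),   # step 7: node_1 promoted, MASTER restored
]

for snap in replay(sequence):
    has_master = "MASTER" in snap.values()
    print(snap, "" if has_master else "<- no active MASTER")
```

Reversing steps 5 and 6 in the sequence would let node_1 reach SLAVE (and sync from node_2) while node_2 is still MASTER, which is what the question asks for.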
