Hi Matthieu,

I think the code avoids placing more than one replica of a partition on the same node. So if you have only 1 node, it will create only LEADERs and no replicas. We could add a configuration option to allow co-located replicas. But I do see something odd: when you add a node, only 1 additional replica gets created, which does not make sense. I will take a look at that.

thanks,
Kishore G
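For reference, here is a minimal sketch of where the replica count enters the picture, using the public HelixAdmin API. The ZooKeeper address, cluster name, and resource name below are placeholders, not taken from Matthieu's repo:

import org.apache.helix.HelixAdmin;
import org.apache.helix.manager.zk.ZKHelixAdmin;
import org.apache.helix.model.IdealState.RebalanceMode;

// Placeholders: adjust the ZK address and names for your setup.
HelixAdmin admin = new ZKHelixAdmin("localhost:2181");

// 3 partitions, LEADER_REPLICA state model, full-auto rebalancing.
admin.addResource("MY_CLUSTER", "MY_RESOURCE", 3,
    "LEADER_REPLICA", RebalanceMode.FULL_AUTO.toString());

// 2 replicas per partition in total: 1 LEADER + 1 REPLICA. Since Helix
// will not put two replicas of one partition on the same node, a
// single-node cluster can only ever host the LEADER of each partition.
admin.rebalance("MY_CLUSTER", "MY_RESOURCE", 2);

With 2 or more live nodes, full-auto rebalancing should then be able to satisfy both the LEADER and the REPLICA placement for every partition.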
On Tue, Oct 15, 2013 at 1:31 AM, Matthieu Morel <[email protected]> wrote:

> Thanks for your prompt answers!
>
> I used the latest version from the master branch and applied the code
> changes suggested by Jason.
>
> The good news is that:
> - the update was trivial, at least for the small code example I provided
> - I always get 3 leader states for the 3 partitions
>
> The bad news is that:
> - I either don't get enough replicas (I want 1 replica for each partition,
>   and initially I only have replicas for 2 partitions)
> - or I simply get no replicas at all (after removing 1 node from the
>   cluster, I have 3 leaders, 0 replicas)
>
> I updated my simple example
> https://github.com/matthieumorel/helix-balancing so you can reproduce
> that behavior.
>
> // with only 1 node, I have 3 leaders, 0 replicas:
>
> Starting instance Node:myhost:10000
> Assigning MY_RESOURCE_1 to Node:myhost:10000
> Assigning MY_RESOURCE_0 to Node:myhost:10000
> Assigning MY_RESOURCE_2 to Node:myhost:10000
> OFFLINE -> REPLICA (Node:myhost:10000, MY_RESOURCE_2)
> OFFLINE -> REPLICA (Node:myhost:10000, MY_RESOURCE_1)
> OFFLINE -> REPLICA (Node:myhost:10000, MY_RESOURCE_0)
> REPLICA -> LEADER (Node:myhost:10000, MY_RESOURCE_0)
> REPLICA -> LEADER (Node:myhost:10000, MY_RESOURCE_1)
> REPLICA -> LEADER (Node:myhost:10000, MY_RESOURCE_2)
>
> // adding 1 node adds a replica:
>
> Starting instance Node:myhost:10001
> Assigning MY_RESOURCE_1 to Node:myhost:10001
> OFFLINE -> REPLICA (Node:myhost:10001, MY_RESOURCE_1)
>
> // adding another node adds a new replica:
>
> Starting instance Node:myhost:10002
> Assigning MY_RESOURCE_0 to Node:myhost:10002
> OFFLINE -> REPLICA (Node:myhost:10002, MY_RESOURCE_0)
>
> // removing a node rebalances things, but we end up with 3 leaders,
> // 0 replicas:
>
> Stopping instance Node:myhost:10000
> Assigning MY_RESOURCE_2 to Node:myhost:10002
> REPLICA -> LEADER (Node:myhost:10002, MY_RESOURCE_0)
> OFFLINE -> REPLICA (Node:myhost:10002, MY_RESOURCE_2)
> REPLICA -> LEADER (Node:myhost:10002, MY_RESOURCE_2)
> REPLICA -> LEADER (Node:myhost:10001, MY_RESOURCE_1)
>
> I would like to get 1 leader and 1 replica for each partition, regardless
> of the number of nodes. Is that possible?
>
> Thanks!
>
> Matthieu
>
>
> On Oct 15, 2013, at 02:30, Kanak Biscuitwala <[email protected]> wrote:
>
> Hi Matthieu,
>
> I have just pushed a patch to the master branch (i.e. trunk) that should
> fix the issue. Please let me know if the problem persists.
>
> Thanks,
> Kanak
>
> ________________________________
>
> From: [email protected]
> To: [email protected]
> Subject: Re: Getting auto_rebalance right
> Date: Mon, 14 Oct 2013 21:32:41 +0000
>
> Hi Matthieu, this is a known bug in the 0.6.1 release. We have fixed it
> in trunk. If you are building from trunk, change ClusterConfigInit#init()
> from
>
> admin.addResource(DEFAULT_CLUSTER_NAME,
>                   RESOURCE,
>                   PARTITIONS,
>                   "LEADER_REPLICA",
>                   IdealStateModeProperty.AUTO_REBALANCE.toString());
>
> to
>
> admin.addResource(DEFAULT_CLUSTER_NAME,
>                   RESOURCE,
>                   PARTITIONS,
>                   "LEADER_REPLICA",
>                   RebalanceMode.FULL_AUTO.toString());
>
> It should work. We are planning to make a 0.6.2 release with a few fixes,
> including this one.
> Thanks,
> Jason
>
> From: Matthieu Morel <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Monday, October 14, 2013 12:09 PM
> To: "[email protected]" <[email protected]>
> Subject: Getting auto_rebalance right
>
> Hi,
>
> I'm trying to use the auto-rebalance mode in Helix.
>
> The use case is the following (a standard leader-standby scenario, a bit
> like the rsync example in the Helix codebase):
> - the dataspace is partitioned
> - for a given partition, we have:
>   - a leader that is responsible for writing and serving data, logging
>     operations into a journal
>   - a replica that fetches updates from the journal and applies them
>     locally, but does not serve data
>
> Upon failure, the replica becomes leader, applies pending updates, and
> can then write and serve data. Ideally, we also get a new replica
> assigned.
>
> We'd like to use the auto_rebalance mode in Helix so that partitions are
> automatically assigned and re-assigned, and so that leaders are
> automatically elected.
>
> Unfortunately, I can't really get the balancing right. I might be doing
> something wrong, so I uploaded an example here:
> https://github.com/matthieumorel/helix-balancing
>
> In this application I would like to get exactly 1 leader and 1 replica
> for each of the partitions.
>
> In this example we don't reach that result, and when removing a node, we
> even get to a situation where there is no leader for a given partition.
>
> Do I have wrong expectations? Is there something wrong with the code, or
> is it something with Helix?
>
> Thanks!
>
> Matthieu
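As an aside on the state model itself: the "1 leader + 1 replica per partition" semantics described in the thread are encoded in the state model definition. Below is a minimal sketch of what a LEADER_REPLICA definition could look like with Helix's StateModelDefinition.Builder; the actual definition in Matthieu's repo may differ:

import org.apache.helix.model.StateModelDefinition;

// A sketch, assuming at most 1 LEADER and 1 REPLICA per partition.
StateModelDefinition.Builder builder =
    new StateModelDefinition.Builder("LEADER_REPLICA");
builder.addState("LEADER", 1);   // lower number = higher priority
builder.addState("REPLICA", 2);
builder.addState("OFFLINE", 3);
builder.addState("DROPPED");
builder.initialState("OFFLINE");
builder.addTransition("OFFLINE", "REPLICA");
builder.addTransition("REPLICA", "LEADER");
builder.addTransition("LEADER", "REPLICA");
builder.addTransition("REPLICA", "OFFLINE");
builder.addTransition("OFFLINE", "DROPPED");
builder.upperBound("LEADER", 1);            // at most one leader per partition
builder.dynamicUpperBound("REPLICA", "R");  // "R" = replica count on the resource
StateModelDefinition leaderReplica = builder.build();

The "R" dynamic bound ties the number of REPLICAs to the replica count configured on the resource, so the same definition works whatever replica count you rebalance with.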

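For completeness, here is a sketch of the participant-side callbacks that would emit transition logs like the ones in the thread (OFFLINE -> REPLICA, REPLICA -> LEADER). The class name, comments, and log format are illustrative, not taken from Matthieu's repo:

import org.apache.helix.NotificationContext;
import org.apache.helix.model.Message;
import org.apache.helix.participant.statemachine.StateModel;
import org.apache.helix.participant.statemachine.StateModelInfo;
import org.apache.helix.participant.statemachine.Transition;

@StateModelInfo(initialState = "OFFLINE", states = { "LEADER", "REPLICA", "OFFLINE" })
public class LeaderReplicaStateModel extends StateModel {

  @Transition(from = "OFFLINE", to = "REPLICA")
  public void onBecomeReplicaFromOffline(Message message, NotificationContext context) {
    // start tailing the journal for this partition
    System.out.println("OFFLINE -> REPLICA (" + message.getTgtName()
        + ", " + message.getPartitionName() + ")");
  }

  @Transition(from = "REPLICA", to = "LEADER")
  public void onBecomeLeaderFromReplica(Message message, NotificationContext context) {
    // apply pending journal updates, then start serving reads and writes
    System.out.println("REPLICA -> LEADER (" + message.getTgtName()
        + ", " + message.getPartitionName() + ")");
  }

  @Transition(from = "LEADER", to = "REPLICA")
  public void onBecomeReplicaFromLeader(Message message, NotificationContext context) {
    // stop serving; go back to following the journal
  }

  @Transition(from = "REPLICA", to = "OFFLINE")
  public void onBecomeOfflineFromReplica(Message message, NotificationContext context) {
    // release resources held for this partition
  }
}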