Hi Matthieu, this is a known bug in the 0.6.1 release; it has been fixed in trunk.
If you are building from trunk, change ClusterConfigInit#init() from

admin.addResource(DEFAULT_CLUSTER_NAME,
                RESOURCE,
                PARTITIONS,
                "LEADER_REPLICA",
                IdealStateModeProperty.AUTO_REBALANCE.toString());
to

admin.addResource(DEFAULT_CLUSTER_NAME,
                RESOURCE,
                PARTITIONS,
                "LEADER_REPLICA",
                RebalanceMode.FULL_AUTO.toString());


It should work. We are planning to make a 0.6.2 release with a few fixes, including
this one.


Thanks,

Jason


From: Matthieu Morel <[email protected]>
Reply-To: [email protected]
Date: Monday, October 14, 2013 12:09 PM
To: [email protected]
Subject: Getting auto_rebalance right

Hi,

I'm trying to use the auto-rebalance mode in Helix.

The use case is the following (a standard leader-standby scenario, a bit like the
rsync example in the Helix codebase):
- the dataspace is partitioned
- for a given partition, we have:
  - a leader that is responsible for writing and serving data, logging operations
    into a journal
  - a replica that fetches updates from the journal and applies them locally, but
    does not serve data
Upon failure, the replica becomes leader, applies pending updates, and can then
write and serve data. Ideally, a new replica is also assigned.

We'd like to use the auto_rebalance mode in Helix so that partitions are 
automatically assigned and re-assigned, and so that leaders are automatically 
elected.
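
To make the target concrete, here is a toy sketch (plain Java, not Helix code or its actual rebalancing algorithm) of the placement we would expect FULL_AUTO rebalancing to converge to: exactly one leader and one replica per partition, on distinct nodes, recomputed over the live nodes after a failure. The node names, partition count, and round-robin strategy are all made up for illustration.

```java
import java.util.*;

// Toy model of the desired placement: partition -> [leader, replica].
// Leaders rotate round-robin over live nodes; the replica is placed on
// the next node so leader and replica never share a node.
public class ToyRebalance {
    static Map<Integer, List<String>> assign(int partitions, List<String> nodes) {
        Map<Integer, List<String>> map = new LinkedHashMap<>();
        for (int p = 0; p < partitions; p++) {
            String leader = nodes.get(p % nodes.size());
            String replica = nodes.get((p + 1) % nodes.size());
            map.put(p, Arrays.asList(leader, replica));
        }
        return map;
    }

    public static void main(String[] args) {
        List<String> nodes = new ArrayList<>(Arrays.asList("n1", "n2", "n3"));
        System.out.println(assign(4, nodes));
        // Simulate losing n1: recomputing over the survivors still yields
        // one leader and one replica per partition, with no partition
        // left leaderless.
        nodes.remove("n1");
        System.out.println(assign(4, nodes));
    }
}
```

This is only the invariant we are after, not how Helix computes it; the point is that after a node is removed, every partition should still end up with exactly one leader and one replica.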


Unfortunately, I can't quite get the balancing right. I might be doing
something wrong, so I uploaded an example here:
https://github.com/matthieumorel/helix-balancing


In this application I would like to get exactly 1 leader and 1 replica for each
of the partitions.

In this example we don't reach that result, and when removing a node we even
end up in a situation where a given partition has no leader at all.


Do I have the wrong expectations? Is there something wrong with my code, or is
it an issue in Helix?


Thanks!

Matthieu
