Thanks Kanak,
Note that it's probably also wrong with a replication factor of 1, because we
actually get both leaders and replicas when we should only get 3 leaders.
Probably related to the other issue.
E.g. in my example, with 3 nodes deployed, 3 partitions, replication factor 1,
we actually get 3 leaders and 2 replicas:
MY_RESOURCE_0: {
Node:myhost:10000: "LEADER",
Node:myhost:10002: "REPLICA"
},
MY_RESOURCE_1: {
Node:myhost:10000: "LEADER",
Node:myhost:10001: "REPLICA"
},
MY_RESOURCE_2: {
Node:myhost:10000: "LEADER"
}
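
To make the expected behavior concrete, here is a small, Helix-independent Java sketch of what I think a correct FULL_AUTO assignment should produce. The `assign` helper and its round-robin placement are purely illustrative assumptions, not Helix internals: with replicaCount = 1, each partition gets exactly one LEADER and no REPLICA; with replicaCount = 2, each partition gets one LEADER plus one REPLICA on a different node.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch (not Helix code) of the assignment I would expect.
// replicaCount is the total number of copies per partition, leader included,
// and no two copies of the same partition may share a node.
public class ExpectedAssignment {
    static Map<String, Map<String, String>> assign(List<String> nodes,
                                                   int partitions,
                                                   int replicaCount) {
        Map<String, Map<String, String>> view = new TreeMap<>();
        // Can't place two copies on one node, so cap copies at the node count.
        int copies = Math.min(replicaCount, nodes.size());
        for (int p = 0; p < partitions; p++) {
            Map<String, String> states = new LinkedHashMap<>();
            for (int r = 0; r < copies; r++) {
                String node = nodes.get((p + r) % nodes.size());
                states.put(node, r == 0 ? "LEADER" : "REPLICA");
            }
            view.put("MY_RESOURCE_" + p, states);
        }
        return view;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("Node:myhost:10000",
                                           "Node:myhost:10001",
                                           "Node:myhost:10002");
        // replicaCount = 1: one LEADER per partition, no REPLICAs at all.
        System.out.println(assign(nodes, 3, 1));
        // replicaCount = 2: 3 LEADERs and 3 REPLICAs, never colocated.
        System.out.println(assign(nodes, 3, 2));
    }
}
```

With 3 nodes, 3 partitions and replication factor 1, this yields 3 leaders and 0 replicas, which is what I'd expect instead of the 3 leaders and 2 replicas shown above.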
Regards,
Matthieu
On Oct 15, 2013, at 19:39 , Kanak Biscuitwala <[email protected]> wrote:
> Hi Matthieu,
>
> Please change line 39 in ClusterConfigInit to:
>
> admin.rebalance(DEFAULT_CLUSTER_NAME, RESOURCE, 2);
>
> Basically, the leader counts as a replica, so if you want a replica in
> addition to the leader, you need to specify 2 for the replica count.
>
> There is a bug where, when there are 3 nodes, partition 0 has 2 instances in
> the REPLICA state even though one of them should be dropped. I'll keep
> investigating that.
>
> Kanak
>
> From: Matthieu Morel <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Tuesday, October 15, 2013 8:06 AM
> To: "[email protected]" <[email protected]>
> Subject: Re: Getting auto_rebalance right
>
> Hi Kishore,
>
> On Oct 15, 2013, at 16:48 , kishore g <[email protected]> wrote:
>
>> Hi Matthieu,
>>
>> I think the code avoids placing more than one replica of a partition on the
>> same node. So if you have only 1 node, it will create only LEADERS, no
>> replicas. We could add a configuration option to allow colocation.
>
> Actually, preventing the leader and replica of a partition from being on the
> same node makes sense: such a placement defeats the purpose of the replica.
>
>> But I do see something weird when you add a node, only 1 additional replica
>> gets created, that does not make sense. I will take a look at that.
>
> Yes, with 3 nodes and 3 partitions we should expect 3 leaders and 3 replicas.
> An additional requirement, related to the above comment, would be that
> leaders and replicas are never colocated. Should I open a JIRA for that?
>
> Let me know if you need more feedback.
>
> Thanks!
>
> Matthieu
>
>
>
>>
>> thanks,
>> Kishore G
>>
>>
>> On Tue, Oct 15, 2013 at 1:31 AM, Matthieu Morel <[email protected]> wrote:
>>> Thanks for your prompt answers!
>>>
>>> I used the latest version from the master branch and applied the code
>>> changes suggested by Jason.
>>>
>>> The good news is that:
>>> - the update was trivial - at least for the small code example I provided.
>>> - I always get 3 leader states for the 3 partitions
>>>
>>> The bad news is that:
>>> - I either don't get enough replicas (I want 1 replica for each partition,
>>> and initially I only have replicas for 2 partitions)
>>> - or I get no replicas at all (after removing 1 node from the
>>> cluster, I have 3 leaders, 0 replicas)
>>>
>>> I updated my simple example
>>> https://github.com/matthieumorel/helix-balancing so you can reproduce that
>>> behavior.
>>>
>>> // with only 1 node, I have 3 leaders, 0 replicas:
>>>
>>> Starting instance Node:myhost:10000
>>> Assigning MY_RESOURCE_1 to Node:myhost:10000
>>> Assigning MY_RESOURCE_0 to Node:myhost:10000
>>> Assigning MY_RESOURCE_2 to Node:myhost:10000
>>> OFFLINE -> REPLICA (Node:myhost:10000, MY_RESOURCE_2)
>>> OFFLINE -> REPLICA (Node:myhost:10000, MY_RESOURCE_1)
>>> OFFLINE -> REPLICA (Node:myhost:10000, MY_RESOURCE_0)
>>> REPLICA -> LEADER (Node:myhost:10000, MY_RESOURCE_0)
>>> REPLICA -> LEADER (Node:myhost:10000, MY_RESOURCE_1)
>>> REPLICA -> LEADER (Node:myhost:10000, MY_RESOURCE_2)
>>>
>>>
>>> // adding 1 node adds a replica:
>>>
>>> Starting instance Node:myhost:10001
>>> Assigning MY_RESOURCE_1 to Node:myhost:10001
>>> OFFLINE -> REPLICA (Node:myhost:10001, MY_RESOURCE_1)
>>>
>>>
>>> // adding another node adds a new replica:
>>>
>>> Starting instance Node:myhost:10002
>>> Assigning MY_RESOURCE_0 to Node:myhost:10002
>>> OFFLINE -> REPLICA (Node:myhost:10002, MY_RESOURCE_0)
>>>
>>>
>>> // removing a node rebalances things but we end up with 3 leaders, 0 replicas
>>>
>>> Stopping instance Node:myhost:10000
>>> Assigning MY_RESOURCE_2 to Node:myhost:10002
>>> REPLICA -> LEADER (Node:myhost:10002, MY_RESOURCE_0)
>>> OFFLINE -> REPLICA (Node:myhost:10002, MY_RESOURCE_2)
>>> REPLICA -> LEADER (Node:myhost:10002, MY_RESOURCE_2)
>>> REPLICA -> LEADER (Node:myhost:10001, MY_RESOURCE_1)
>>>
>>>
>>> I would like to get 1 leader and 1 replica for each partition, regardless
>>> of the number of nodes. Is that possible?
>>>
>>> Thanks!
>>>
>>> Matthieu
>>>
>>>
>>>
>>> On Oct 15, 2013, at 02:30 , Kanak Biscuitwala <[email protected]> wrote:
>>>
>>>> Hi Matthieu,
>>>>
>>>> I have just pushed a patch to the master branch (i.e. trunk) that should
>>>> fix the issue. Please let me know if the problem persists.
>>>>
>>>> Thanks,
>>>> Kanak
>>>>
>>>>> From: [email protected]
>>>>> To: [email protected]
>>>>> Subject: Re: Getting auto_rebalance right
>>>>> Date: Mon, 14 Oct 2013 21:32:41 +0000
>>>>>
>>>>> Hi Matthieu, this is a known bug in 0.6.1 release. We have fixed it in
>>>>> trunk. If you are building from trunk, change ClusterConfigInit#init()
>>>>>
>>>>> admin.addResource(DEFAULT_CLUSTER_NAME,
>>>>>                   RESOURCE,
>>>>>                   PARTITIONS,
>>>>>                   "LEADER_REPLICA",
>>>>>                   IdealStateModeProperty.AUTO_REBALANCE.toString());
>>>>>
>>>>> to
>>>>>
>>>>> admin.addResource(DEFAULT_CLUSTER_NAME,
>>>>>                   RESOURCE,
>>>>>                   PARTITIONS,
>>>>>                   "LEADER_REPLICA",
>>>>>                   RebalanceMode.FULL_AUTO.toString());
>>>>>
>>>>>
>>>>> It should work. We are planning to make a 0.6.2 release with a few fixes,
>>>>> including this one.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Jason
>>>>>
>>>>>
>>>>> From: Matthieu Morel <[email protected]>
>>>>> Reply-To: "[email protected]" <[email protected]>
>>>>> Date: Monday, October 14, 2013 12:09 PM
>>>>> To: "[email protected]" <[email protected]>
>>>>> Subject: Getting auto_rebalance right
>>>>>
>>>>> Hi,
>>>>>
>>>>> I'm trying to use the auto-rebalance mode in Helix.
>>>>>
>>>>> The use case is the following (standard leader-standby scenario, a bit
>>>>> like the rsync example in the helix codebase):
>>>>> - the dataspace is partitioned
>>>>> - for a given partition, we have
>>>>> - a leader that is responsible for writing and serving data, logging
>>>>> operations into a journal
>>>>> - a replica that fetches updates from a journal and applies them
>>>>> locally, but does not serve data
>>>>> Upon failure, the replica becomes leader, applies pending updates and
>>>>> can write and serve data. Ideally we also get a new replica assigned.
>>>>>
>>>>> We'd like to use the auto_rebalance mode in Helix so that partitions
>>>>> are automatically assigned and re-assigned, and so that leaders are
>>>>> automatically elected.
>>>>>
>>>>>
>>>>> Unfortunately, I can't really get the balancing right. I might be doing
>>>>> something wrong, so I uploaded an example here
>>>>> : https://github.com/matthieumorel/helix-balancing
>>>>>
>>>>>
>>>>> In this application I would like to get exactly 1 leader and 1 replica
>>>>> for each of the partitions
>>>>>
>>>>> In this example we don't reach that result, and when removing a node,
>>>>> we even get to a situation where there is no leader for a given
>>>>> partition.
>>>>>
>>>>>
>>>>> Do I have wrong expectations? Is there something wrong with the code, or
>>>>> is it something with Helix?
>>>>>
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Matthieu
>>>
>>
>