Re: Controller fault tolerance

Lance Co Ting Keh Fri, 21 Jun 2013 13:52:38 -0700

Thank you very much for the quick response, guys.


On Fri, Jun 21, 2013 at 1:49 PM, Zhen Zhang <[email protected]> wrote:

>  yes. Using different names for the controllers is a quick workaround.
>
>   From: Lance Co Ting Keh <[email protected]>
> Reply-To: "[email protected]" <
> [email protected]>
> Date: Friday, June 21, 2013 1:47 PM
>
> To: "[email protected]" <[email protected]>
> Subject: Re: Controller fault tolerance
>
>   Okay, thank you. So for now, the quick fix is to give the
> controllers different names?
>
>
> On Fri, Jun 21, 2013 at 1:44 PM, Zhen Zhang <[email protected]> wrote:
>
>>  This is a known bug in Helix:
>> https://issues.apache.org/jira/browse/HELIX-123
>>
>>  The problem is that we compare the instance name of the controller but
>> not the session id, so if you start two controllers with the same name,
>> isLeader() returns true for both. We will fix it shortly.
>>
>>  Thanks,
>> Jason
>>
>>   From: Lance Co Ting Keh <[email protected]>
>> Reply-To: "[email protected]" <
>> [email protected]>
>> Date: Friday, June 21, 2013 1:39 PM
>> To: "[email protected]" <[email protected]>
>> Subject: Re: Controller fault tolerance
>>
>>   Hi Kishore,
>>
>>  I tried starting two controllers programmatically like you mentioned:
>>
>>  controllerManager = HelixControllerMain.startHelixController(zkAddress,
>>           clusterName, "controller", HelixControllerMain.STANDALONE);
>>
>>
>> I then called isLeader() on both managers
>> (http://helix.incubator.apache.org/apidocs/reference/org/apache/helix/HelixManager.html#isLeader())
>> and both of them returned true. They're obviously both on the same
>> ZooKeeper instance and in the same cluster. The controllers are running, so
>> I'm not sure whether leader election is actually working properly or I'm
>> misinterpreting the isLeader() function.
>>
>>
>> Thanks
>> Lance
>>
>>
>>
>> On Mon, Jun 17, 2013 at 9:22 AM, Manikumar Reddy <[email protected]> wrote:
>>
>>> Hi Kishore,
>>>
>>> Thanks for the quick response.
>>>
>>> Regards,
>>> Kumar
>>>
>>>
>>> On Mon, Jun 17, 2013 at 8:18 PM, kishore g <[email protected]> wrote:
>>>
>>>> Hi Kumar,
>>>>
>>>>  You can start multiple controllers; only one of them will be
>>>> active and the rest will be in standby mode. If the active controller
>>>> fails, one of the standbys will become active and start managing the
>>>> cluster.
>>>>
>>>>  You can start the controllers either from the command line or
>>>> programmatically.
>>>>
>>>>  command line
>>>>
>>>> ./run-helix-controller.sh --zkSvr localhost:2199 --cluster <clustername>
>>>>
>>>>  using the Helix API
>>>>
>>>> controllerManager = HelixControllerMain.startHelixController(zkAddress,
>>>>           clusterName, "controller", HelixControllerMain.STANDALONE);
>>>>
>>>> Hope this helps.
>>>>
>>>> thanks,
>>>> Kishore G
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Jun 17, 2013 at 7:01 AM, Manikumar Reddy 
>>>> <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to understand the Helix controller/cluster manager fault
>>>>> tolerance mechanism.
>>>>> A single controller would be a single point of failure. So what are the
>>>>> available options/techniques to
>>>>> achieve controller fault tolerance? Any pointers, recipes, or code
>>>>> snippets?
>>>>>
>>>>> Regards,
>>>>> Kumar
>>>>
>>>>
>>>>
>>>
>>
>
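
To make the bug discussed in this thread concrete, here is a minimal, self-contained Java sketch. It does not use the Helix API and is not how Helix implements the check: LeaderCheckSketch, LeaderRecord, Controller, isLeaderByName and isLeaderBySession are hypothetical names invented for illustration, and the session ids are made up. It only models the idea from the thread that comparing instance names alone cannot tell two same-named controllers apart, whereas also comparing session ids can.

```java
public class LeaderCheckSketch {

    /** Hypothetical stand-in for the record describing the current leader. */
    static final class LeaderRecord {
        final String instanceName;
        final String sessionId;

        LeaderRecord(String instanceName, String sessionId) {
            this.instanceName = instanceName;
            this.sessionId = sessionId;
        }
    }

    /** Hypothetical stand-in for one controller and its own session. */
    static final class Controller {
        final String instanceName;
        final String sessionId;

        Controller(String instanceName, String sessionId) {
            this.instanceName = instanceName;
            this.sessionId = sessionId;
        }

        /** Name-only comparison, as the thread describes the buggy check. */
        boolean isLeaderByName(LeaderRecord leader) {
            return leader != null && instanceName.equals(leader.instanceName);
        }

        /** Assumed stricter check: the session id must match as well. */
        boolean isLeaderBySession(LeaderRecord leader) {
            return isLeaderByName(leader) && sessionId.equals(leader.sessionId);
        }
    }

    public static void main(String[] args) {
        // Two controllers started with the same name but separate sessions.
        Controller first = new Controller("controller", "session-1");
        Controller second = new Controller("controller", "session-2");

        // Assume the first controller won the election.
        LeaderRecord leader = new LeaderRecord(first.instanceName, first.sessionId);

        System.out.println("by name:    " + first.isLeaderByName(leader)
                + " / " + second.isLeaderByName(leader));
        System.out.println("by session: " + first.isLeaderBySession(leader)
                + " / " + second.isLeaderBySession(leader));
    }
}
```

In this sketch the name-only check reports both controllers as leader, while the session-aware check reports only the first. Giving the second controller a different name (the workaround mentioned above) makes the name-only check return false for it too.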
