This is a known bug in Helix:
https://issues.apache.org/jira/browse/HELIX-123

The problem is that we compare the controller's instance name but not its 
session id, so if you start two controllers with the same name, isLeader() 
returns true on both of them. We will fix it shortly.
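
To illustrate the difference, here is a minimal, self-contained sketch. It is a toy model, not Helix code: LeaderRecord, LeaderCheckSketch, and the session strings are hypothetical names made up for the example, and the name-plus-session check is only an assumption about what the fix would look like.

```java
import java.util.Objects;

// Toy model only: these types are hypothetical and are not Helix classes.
final class LeaderRecord {
    final String instanceName;
    final String sessionId;

    LeaderRecord(String instanceName, String sessionId) {
        this.instanceName = instanceName;
        this.sessionId = sessionId;
    }
}

final class LeaderCheckSketch {
    // Name-only comparison: any controller sharing the leader's name passes.
    static boolean isLeaderByName(LeaderRecord leader, String myName) {
        return leader != null && Objects.equals(leader.instanceName, myName);
    }

    // Name plus session id: only the session that owns the leader record passes.
    static boolean isLeaderByNameAndSession(LeaderRecord leader, String myName, String mySession) {
        return leader != null
                && Objects.equals(leader.instanceName, myName)
                && Objects.equals(leader.sessionId, mySession);
    }

    public static void main(String[] args) {
        // Two controllers share the name "controller" but hold different sessions.
        LeaderRecord leader = new LeaderRecord("controller", "session-A");
        // The second controller (session-B) is wrongly accepted by the name-only check.
        System.out.println(isLeaderByName(leader, "controller"));                        // true
        System.out.println(isLeaderByNameAndSession(leader, "controller", "session-A")); // true
        System.out.println(isLeaderByNameAndSession(leader, "controller", "session-B")); // false
    }
}
```

With the name-only check, both controllers named "controller" see themselves as leader, which matches the behavior reported below.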

Thanks,
Jason

From: Lance Co Ting Keh <[email protected]>
Reply-To: <[email protected]>
Date: Friday, June 21, 2013 1:39 PM
To: <[email protected]>
Subject: Re: Controller fault tolerance

Hi Kishore,

I tried starting two controllers programmatically like you mentioned:


controllerManager = HelixControllerMain.startHelixController(zkAddress,
          clusterName, "controller", HelixControllerMain.STANDALONE);


I then called isLeader() on both managers 
(http://helix.incubator.apache.org/apidocs/reference/org/apache/helix/HelixManager.html#isLeader()),
and both of them returned true. They are both on the same ZooKeeper 
instance and in the same cluster. The controllers are running, so I'm not 
sure whether leader election is failing or I'm misinterpreting the 
isLeader() function.


Thanks
Lance



On Mon, Jun 17, 2013 at 9:22 AM, Manikumar Reddy <[email protected]> wrote:
Hi Kishore,

Thanks for the quick response.

Regards,
Kumar


On Mon, Jun 17, 2013 at 8:18 PM, kishore g <[email protected]> wrote:
Hi Kumar,

You can start multiple controllers; only one of them will be active and the 
rest will be in standby mode. If the active controller fails, one of the 
standbys will become active and start managing the cluster.
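
As a rough picture of the active/standby behavior, here is a toy in-memory sketch. It is not Helix code and does not use ZooKeeper. FailoverSketch and the controller names are hypothetical, and picking the longest-waiting standby as the next active controller is an assumption made only for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy in-memory model of active/standby controllers; not Helix code.
final class FailoverSketch {
    // Controllers in the order they joined; the head of the queue is active.
    private final Deque<String> controllers = new ArrayDeque<>();

    // A newly started controller waits behind the ones already running.
    void join(String name) {
        controllers.addLast(name);
    }

    // Removing a controller models a crash; if it was the head,
    // the next standby in line becomes the head and therefore active.
    void fail(String name) {
        controllers.remove(name);
    }

    // Returns the active controller, or null when none are left.
    String active() {
        return controllers.peekFirst();
    }

    public static void main(String[] args) {
        FailoverSketch cluster = new FailoverSketch();
        cluster.join("controller-1");
        cluster.join("controller-2");
        System.out.println(cluster.active()); // controller-1
        cluster.fail("controller-1");
        System.out.println(cluster.active()); // controller-2
    }
}
```

Note that the controllers in this sketch have distinct names.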

You can start the controllers either from the command line or programmatically.

Command line:

./run-helix-controller.sh --zkSvr localhost:2199 --cluster <clustername>

Using the Helix API:

controllerManager = HelixControllerMain.startHelixController(zkAddress,
          clusterName, "controller", HelixControllerMain.STANDALONE);

Hope this helps.

thanks,
Kishore G



On Mon, Jun 17, 2013 at 7:01 AM, Manikumar Reddy 
<[email protected]<mailto:[email protected]>> wrote:
Hi,

I am trying to understand the fault tolerance mechanism of the Helix 
controller/cluster manager.
A single controller is a single point of failure. What options or techniques 
are available to achieve controller fault tolerance? Any pointers, recipes, 
or code snippets?

Regards,
Kumar


