Thank you very much for the quick response, guys.

On Fri, Jun 21, 2013 at 1:49 PM, Zhen Zhang <[email protected]> wrote:

> Yes. Using different names for the controllers is a quick workaround.
>
> From: Lance Co Ting Keh <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Friday, June 21, 2013 1:47 PM
> To: "[email protected]" <[email protected]>
> Subject: Re: Controller fault tolerance
>
> Okay, thank you. But for now the quick fix is to make sure to name the
> controllers differently?
>
>
> On Fri, Jun 21, 2013 at 1:44 PM, Zhen Zhang <[email protected]> wrote:
>
>> This is a known bug in Helix.
>> https://issues.apache.org/jira/browse/HELIX-123
>>
>> The problem is that we are comparing the instance name of the controller
>> but not the session id, so if you start two controllers with the same
>> name, isLeader() returns true for both. We will fix it shortly.
>>
>> Thanks,
>> Jason
>>
>> From: Lance Co Ting Keh <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Friday, June 21, 2013 1:39 PM
>> To: "[email protected]" <[email protected]>
>> Subject: Re: Controller fault tolerance
>>
>> Hi Kishore,
>>
>> I tried starting two controllers programmatically like you mentioned:
>>
>> controllerManager = HelixControllerMain.startHelixController(zkAddress,
>>     clusterName, "controller", HelixControllerMain.STANDALONE);
>>
>> I then called isLeader() on both managers
>> (http://helix.incubator.apache.org/apidocs/reference/org/apache/helix/HelixManager.html#isLeader()),
>> and both of them returned true. They're obviously both on the same
>> ZooKeeper instance and on the same cluster. The controllers are running,
>> so I'm not sure whether it's actually electing a leader properly or I'm
>> misinterpreting the isLeader() function.
>>
>> Thanks
>> Lance
>>
>>
>> On Mon, Jun 17, 2013 at 9:22 AM, Manikumar Reddy <[email protected]> wrote:
>>
>>> Hi Kishore,
>>>
>>> Thanks for the quick response.
>>>
>>> Regards,
>>> Kumar
>>>
>>>
>>> On Mon, Jun 17, 2013 at 8:18 PM, kishore g <[email protected]> wrote:
>>>
>>>> Hi Kumar,
>>>>
>>>> You can start multiple controllers; only one of them will be active
>>>> and the rest will be in standby mode. If the active controller fails,
>>>> one of the standbys will become active and start managing the
>>>> cluster.
>>>>
>>>> You can start the controllers either from the command line or
>>>> programmatically.
>>>>
>>>> Command line:
>>>>
>>>> ./run-helix-controller.sh --zkSvr localhost:2199 --cluster <clustername>
>>>>
>>>> Using the Helix API:
>>>>
>>>> controllerManager = HelixControllerMain.startHelixController(zkAddress,
>>>>     clusterName, "controller", HelixControllerMain.STANDALONE);
>>>>
>>>> Hope this helps.
>>>>
>>>> thanks,
>>>> Kishore G
>>>>
>>>>
>>>> On Mon, Jun 17, 2013 at 7:01 AM, Manikumar Reddy <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to understand the Helix Controller/Cluster manager fault
>>>>> tolerance mechanism.
>>>>> A single controller becomes a single point of failure. So what are the
>>>>> available options/techniques to achieve controller fault tolerance?
>>>>> Any pointers/recipes/code snippets?
>>>>>
>>>>> Regards,
>>>>> Kumar
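
For anyone reading this thread later, here is a rough sketch of why two controllers with the same name can both report themselves as leader, and why distinct names work around it. This is not Helix's actual code. LeaderCheckSketch, LeaderRecord, Controller, isLeaderByName and isLeaderBySession are hypothetical names made up for illustration, and the session ids are plain strings standing in for ZooKeeper session ids. It only models the comparison Jason describes (instance name only versus instance name plus session id).

```java
public class LeaderCheckSketch {

    /** Hypothetical stand-in for the leader information stored in ZooKeeper. */
    static final class LeaderRecord {
        final String instanceName;
        final String sessionId;

        LeaderRecord(String instanceName, String sessionId) {
            this.instanceName = instanceName;
            this.sessionId = sessionId;
        }
    }

    /** Hypothetical stand-in for one running controller process. */
    static final class Controller {
        final String instanceName;
        final String sessionId;

        Controller(String instanceName, String sessionId) {
            this.instanceName = instanceName;
            this.sessionId = sessionId;
        }

        /** The check as described in the thread: compares the instance name only. */
        boolean isLeaderByName(LeaderRecord leader) {
            return leader != null && instanceName.equals(leader.instanceName);
        }

        /** A stricter check (assumed shape of a fix): also compares the session id. */
        boolean isLeaderBySession(LeaderRecord leader) {
            return leader != null
                    && instanceName.equals(leader.instanceName)
                    && sessionId.equals(leader.sessionId);
        }
    }

    public static void main(String[] args) {
        // Two controllers started with the same name, as in Lance's test.
        Controller a = new Controller("controller", "session-1");
        Controller b = new Controller("controller", "session-2");
        LeaderRecord leader = new LeaderRecord(a.instanceName, a.sessionId); // a won the election

        System.out.println("by name:    a=" + a.isLeaderByName(leader)
                + " b=" + b.isLeaderByName(leader));       // a=true b=true
        System.out.println("by session: a=" + a.isLeaderBySession(leader)
                + " b=" + b.isLeaderBySession(leader));    // a=true b=false

        // Workaround from the thread: give each controller a distinct name.
        Controller c1 = new Controller("controller-1", "session-1");
        Controller c2 = new Controller("controller-2", "session-3");
        LeaderRecord leader2 = new LeaderRecord(c1.instanceName, c1.sessionId);
        System.out.println("distinct names, by name: c1=" + c1.isLeaderByName(leader2)
                + " c2=" + c2.isLeaderByName(leader2));    // c1=true c2=false
    }
}
```

With the name-only check, the standby "controller" matches the leader record simply because it shares the leader's name. Once the names differ, the name-only check is enough to tell the two apart, which is why naming the controllers differently works as a stopgap.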
