Hi Lance,

Yes, manager.isConnected() will be false. For now you can periodically poll this.

What are you planning to do after you detect a disconnect? Two scenarios might result in this: a network partition and GC. If it's a network partition, you may not be able to reach any other box in the cluster; in the case of GC, the process is mostly not responding.

Yes, when the node is disabled we invoke the transitions so that the partitions come back to OFFLINE.

Thanks,
Kishore G

On Jul 25, 2013 3:57 PM, "Lance Co Ting Keh" <[email protected]> wrote:

> Thank you for the response.
>
> I will definitely file a ticket once I have a good understanding of how
> the participant does it-- just so i can phrase the ticket properly.
>
> You mentioned that you detect the disconnection from Zk in the
> participant. How should i best be informed of this disconnection (in
> advance of the ephemeral node in /LIVEINSTANCES going away?)
>
> 1. Looking at ZkStateChangeListener line 76, it looks like
> manager.isConnected() will be false when the state goes into *Disconnected*,
> even before *Expired*, which works for me. Should i then be
> periodically calling manager.isConnected()?
>
> 2. The addHealthStateChangeListener on line 358 of ZkHelixManager only
> seems to be listening for EventTypes and not KeeperStates
>
> You also mentioned that "if we notice many disconnects in a short period
> we disable the node". When the node is disabled do you call the
> @Transition(from = "OFFLINE", to = "ONLINE") method?
>
> Sincerely,
> Lance
>
> On Wed, Jul 24, 2013 at 12:45 PM, kishore g <[email protected]> wrote:
>
>> Hi Lance,
>>
>> Unfortunately the controller does not know about the disconnection from
>> ZK. However we detect that in the participant and if we notice many
>> disconnects in a short period we disable the node.
>>
>> After we detect a disconnect we can potentially inform the controller
>> about it and have an alert. Can you please file a jira for this.
>>
>> thanks,
>> Kishore G
>>
>> On Tue, Jul 23, 2013 at 6:50 PM, Lance Co Ting Keh <[email protected]> wrote:
>>
>>> I see what you mean by alerts on live instances. In fact there is an
>>> "onLiveInstanceChange" under GenericHelixController (
>>> http://helix.incubator.apache.org/apidocs/reference/org/apache/helix/controller/GenericHelixController.html
>>> )
>>>
>>> The question is can i register for an alert to myself? If the agent that
>>> is being alerted is the one that loses connection to zk, does the alert
>>> trigger?
>>>
>>> More importantly, it seems that setting an alert for
>>> onLiveInstanceChange happens when the zookeeper session expires (in which
>>> case master controller already remaps), and not immediately when a zk
>>> connection falters (but ephemeral node on LIVEINSTANCES is still there). I
>>> was hoping to get an alert not when the ephemeral node expires but
>>> immediately right when a zk connection falters.
>>>
>>> Thank you
>>> Lance
>>>
>>> On Tue, Jul 23, 2013 at 6:00 PM, Shi Lu <[email protected]> wrote:
>>>
>>>> Hi Lance:
>>>>
>>>> The helix controller exposes jmx beans that reflects the number of
>>>> liveInstances under the jmx domain ClusterStatus:cluster=<clusterName>, in
>>>> which it will report
>>>> number of down instances, disabled instances and disabled partitions.
>>>> You can set alerts on those jmx beans.
>>>>
>>>> On Tue, Jul 23, 2013 at 2:32 PM, Lance Co Ting Keh <[email protected]> wrote:
>>>>
>>>>> Hi guys,
>>>>>
>>>>> I was trying to look for how I can most cleanly get alerted when a
>>>>> helix participant temporary and permanently loses its session with
>>>>> Zookeeper. What is the best way to do this?
>>>>>
>>>>> Sincerely,
>>>>> Lance
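
For reference, the "periodically poll manager.isConnected()" suggestion from the top of this thread could be sketched roughly as below. This is only an illustration, not Helix code: ConnectionPoller, its callbacks, and the thread name are hypothetical, and the connectivity check is passed in as a plain BooleanSupplier on the assumption that something like manager::isConnected would be supplied by the caller.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a connectivity check and reports state changes. */
public class ConnectionPoller {
    // Assumption: the caller passes something like manager::isConnected here.
    private final BooleanSupplier isConnected;
    private final Runnable onDisconnect;
    private final Runnable onReconnect;
    private boolean lastSeenConnected = true;

    public ConnectionPoller(BooleanSupplier isConnected,
                            Runnable onDisconnect,
                            Runnable onReconnect) {
        this.isConnected = isConnected;
        this.onDisconnect = onDisconnect;
        this.onReconnect = onReconnect;
    }

    /** One poll step; a callback fires only when the observed state changes. */
    public synchronized void pollOnce() {
        boolean now = isConnected.getAsBoolean();
        if (lastSeenConnected && !now) {
            onDisconnect.run();
        } else if (!lastSeenConnected && now) {
            onReconnect.run();
        }
        lastSeenConnected = now;
    }

    /** Polls at a fixed interval on a daemon thread; the caller shuts the executor down. */
    public ScheduledExecutorService start(long periodMillis) {
        ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "connection-poller");
                t.setDaemon(true);
                return t;
            });
        executor.scheduleAtFixedRate(() -> {
            try {
                pollOnce();
            } catch (RuntimeException e) {
                // Swallow so that one failing callback does not cancel future polls.
            }
        }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return executor;
    }
}
```

A caller would construct it with the connectivity check and two callbacks, then call start(1000) to poll once per second. Note that a poller running inside the same JVM is itself paused during a stop-the-world GC, so it can only observe a GC-related disconnect after the pause ends; what to do in the callbacks (the question raised above) is left open here.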
