My general recommendation is to handle SUSPENDED and LOST in the same way. In the case of LeaderLatch, your code should exit whatever critical section it is executing as leader.
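As a rough illustration of the recommendation above, here is a minimal, dependency-free sketch. LeadershipGuard and its State enum are hypothetical stand-ins; the enum only mirrors the names of Curator's ConnectionState values so the example runs on its own. In a real application, onStateChanged would presumably be driven from a ConnectionStateListener registered on the client, and onLeadershipAcquired from whatever notices that LeaderLatch.hasLeadership() has become true.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical helper, not part of Curator. It tracks whether this
 * process may keep running its leader-only critical section.
 */
final class LeadershipGuard {
    /** Stand-in for Curator's ConnectionState (same value names). */
    enum State { CONNECTED, SUSPENDED, RECONNECTED, LOST, READ_ONLY }

    private final AtomicBoolean leader = new AtomicBoolean(false);

    /** Call when the latch reports that this participant became leader. */
    void onLeadershipAcquired() {
        leader.set(true);
    }

    /** Call from the connection state callback. */
    void onStateChanged(State newState) {
        switch (newState) {
            case SUSPENDED:
            case LOST:
                // Treat both states the same way: stop acting as leader now.
                leader.set(false);
                break;
            default:
                // Reconnecting does not restore leadership by itself
                // (an assumption of this sketch): wait for re-election.
                break;
        }
    }

    /** Leader-only work should check this before each unit of work. */
    boolean mayActAsLeader() {
        return leader.get();
    }

    public static void main(String[] args) {
        LeadershipGuard guard = new LeadershipGuard();
        guard.onLeadershipAcquired();
        System.out.println("leader after election: " + guard.mayActAsLeader());
        guard.onStateChanged(State.SUSPENDED);
        System.out.println("leader after SUSPENDED: " + guard.mayActAsLeader());
        guard.onStateChanged(State.RECONNECTED);
        System.out.println("leader after RECONNECTED: " + guard.mayActAsLeader());
    }
}
```

The design choice here is that only a fresh election result sets the flag back to true. What to do if the connection stays LOST for a long time remains application specific, as discussed in the thread.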
-JZ

On Jul 2, 2013, at 12:29 PM, chao chu <[email protected]> wrote:

> sorry for the confusion. I knew that LeaderLatch will retry when it got
> connection 'LOST' event and one of the participant should be elected as a new
> leader when re-connected, however, I meant to ask what if it stayed in 'LOST'
> for quite a long time (for example, the zk ensemble become unavailable). I
> can understand that how to handle this situation should be very application
> specific, I was just trying to know what's your reaction for this in your
> code (want to see if there are any ideas we can borrow).
>
> not sure If i explained this clearly enough, thanks a lot for your reply
> though.
>
>
> On Tue, Jul 2, 2013 at 6:07 AM, Eric Tschetter <[email protected]> wrote:
> My understanding is that LeaderLatch already handles those cases for you.
> The unit tests in TestLeaderLatch definitely have something that tries to
> test the LOST case. If there's a case that is not handled, it'd probably be
> best if you could provide a unit test that shows what's not handled to help
> shape the conversation.
>
> --Eric
>
>
> On Fri, Jun 28, 2013 at 8:03 AM, chao chu <[email protected]> wrote:
> Hi Eric,
>
> Thanks for your sharing, by looking into your code, it's not very clear to me
> that how do you handle the 'SUSPEND' or 'LOST' events of LeaderLatch? Could
> you please shed some lights here? Thanks
>
>
> On Wed, Jun 26, 2013 at 11:52 PM, Eric Tschetter <[email protected]> wrote:
> ChuChao,
>
> We use it in the Druid project (http://www.github.com/metamx/druid/)
>
> You can see its use in the class com.metamx.druid.master.DruidMaster
>
> The class has a bunch of other stuff in it as well that is not specific to
> the LeaderLatch, but you can just ignore that and see how it handles the
> latch.
>
> --Eric
>
>
> On Wednesday, June 26, 2013, chao chu wrote:
> Thanks a lot for your reply. Could you please name a few open source projects
> that used LeaderLatch if you are aware of any? I'd like to take a look at the
> code.
>
> btw, What about issues reported in the links I mentioned? are they actual
> bugs or just used in an unexpected way?
>
>
> On Wed, Jun 26, 2013 at 7:29 AM, Jordan Zimmerman
> <[email protected]> wrote:
> Curator is being used at major companies (i.e. Netflix, eBay, etc.). Bugs are
> quickly fixed when reported. In particular, LeaderLatch is widely used.
>
> -JZ
>
>
> On Jun 25, 2013, at 11:03 AM, chao chu <[email protected]> wrote:
>
>> Hi,
>>
>> I have been trying to use the LeaderLatch to implement Leader Election in my
>> project and had written some scripts to simulate the situations when the zk
>> ensemble become unstable due to network problems. It worked well and as
>> expected so far.
>>
>> However, by digging into both zookeeper-users and curator-users mailing
>> lists, there are indeed some bugs/edge cases reported, like
>> LeaderLatch bug causing extra znodes appearing in Zookeeper and multiple
>> participants thought they are leader which worried me about the reliability
>> of this.
>>
>> So, my question is that: are there any real world projects are using this
>> recipe which have proved the quality of it, or are there any other known
>> edge cases or open issues?
>>
>>
>> Thanks & Regards,
>>
>> --
>> ChuChao
>
> --
> ChuChao
>
> --
> ChuChao
>
> --
> ChuChao
