My general recommendation is to handle SUSPENDED and LOST in the same way. In 
the case of LeaderLatch, your code should exit whatever critical section it is 
executing as leader.
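
A minimal sketch of that advice, under stated assumptions: LeaderGuard and 
ConnState are hypothetical names, not part of Curator. ConnState is a local 
stand-in for the four relevant constants of Curator's ConnectionState enum, so 
the example compiles without Curator on the classpath. In a real application 
onStateChanged would be called from a ConnectionStateListener, and 
onLeadershipAcquired after the latch reports leadership.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper (not a Curator class): tracks whether leader-only work may run.
final class LeaderGuard {
    // Local stand-in for the relevant constants of Curator's ConnectionState enum.
    enum ConnState { CONNECTED, SUSPENDED, RECONNECTED, LOST }

    private final AtomicBoolean leader = new AtomicBoolean(false);
    private final AtomicBoolean connected = new AtomicBoolean(true);

    // Call once the latch reports leadership (e.g. after LeaderLatch.await() returns).
    void onLeadershipAcquired() {
        leader.set(true);
    }

    // Call from a connection state listener on every state change.
    void onStateChanged(ConnState newState) {
        switch (newState) {
            case SUSPENDED:
            case LOST:
                // Handle both the same way: assume leadership is gone
                // and stop any leader-only work.
                connected.set(false);
                leader.set(false);
                break;
            case CONNECTED:
            case RECONNECTED:
                // The connection is back, but leadership has to be
                // re-acquired through the latch before resuming.
                connected.set(true);
                break;
        }
    }

    // Check this before and during the critical section; exit when it turns false.
    boolean mayRunCriticalSection() {
        return connected.get() && leader.get();
    }
}
```

The design choice here is that a reconnect alone never restores leadership: 
after SUSPENDED or LOST the guard stays closed until the latch grants 
leadership again, which also covers the case where the ensemble stays 
unavailable for a long time.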

-JZ

On Jul 2, 2013, at 12:29 PM, chao chu <[email protected]> wrote:

> Sorry for the confusion. I know that LeaderLatch will retry when it gets a 
> connection 'LOST' event, and that one of the participants should be elected 
> as the new leader once reconnected. What I meant to ask is: what if it stays 
> in 'LOST' for quite a long time (for example, the zk ensemble becomes 
> unavailable)? I understand that how to handle this situation is very 
> application specific; I was just trying to find out how you react to it in 
> your code (to see if there are any ideas we can borrow).
> 
> Not sure if I explained this clearly enough; thanks a lot for your reply 
> though.
> 
> 
> On Tue, Jul 2, 2013 at 6:07 AM, Eric Tschetter <[email protected]> wrote:
> My understanding is that LeaderLatch already handles those cases for you.  
> The unit tests in TestLeaderLatch definitely have something that tries to 
> test the LOST case.  If there's a case that is not handled, it'd probably be 
> best if you could provide a unit test that shows what's not handled to help 
> shape the conversation.
> 
> --Eric
> 
> 
> On Fri, Jun 28, 2013 at 8:03 AM, chao chu <[email protected]> wrote:
> Hi Eric,
> 
> Thanks for sharing. Looking at your code, it's not very clear to me how you 
> handle the 'SUSPENDED' or 'LOST' events of LeaderLatch. Could you please shed 
> some light here? Thanks
> 
> 
> On Wed, Jun 26, 2013 at 11:52 PM, Eric Tschetter <[email protected]> wrote:
> ChuChao,
> 
> We use it in the Druid project (http://www.github.com/metamx/druid/)
> 
> You can see its use in the class com.metamx.druid.master.DruidMaster
> 
> The class has a bunch of other stuff in it as well that is not specific to 
> the LeaderLatch, but you can just ignore that and see how it handles the 
> latch.
> 
> --Eric
> 
> 
> On Wednesday, June 26, 2013, chao chu wrote:
> Thanks a lot for your reply. Could you please name a few open source projects 
> that use LeaderLatch, if you are aware of any? I'd like to take a look at the 
> code. 
> 
> By the way, what about the issues reported in the links I mentioned? Are they 
> actual bugs, or was the recipe just used in an unexpected way?
> 
> 
> 
> On Wed, Jun 26, 2013 at 7:29 AM, Jordan Zimmerman 
> <[email protected]> wrote:
> Curator is being used at major companies (e.g. Netflix, eBay). Bugs are 
> quickly fixed when reported. In particular, LeaderLatch is widely used. 
> -JZ
> 
> 
> On Jun 25, 2013, at 11:03 AM, chao chu <[email protected]> wrote:
> 
>> Hi,
>> 
>> I have been trying to use LeaderLatch to implement leader election in my 
>> project and have written some scripts to simulate situations where the zk 
>> ensemble becomes unstable due to network problems. It has worked well and as 
>> expected so far.
>> 
>> However, digging into both the zookeeper-users and curator-users mailing 
>> lists, I found some bugs/edge cases reported, like the LeaderLatch bug 
>> causing extra znodes to appear in ZooKeeper and multiple participants 
>> thinking they are leader, which made me worry about the reliability of this 
>> recipe.
>> 
>> So my question is: are there any real-world projects using this recipe that 
>> have proved its quality, and are there any other known edge cases or open 
>> issues?
>> 
>> 
>> Thanks & Regards,
>> 
>> -- 
>> ChuChao
