Also, knowing the specific Zookeeper 3.4.X version where the loss of quorum occurred would help.
3.4.5 fixed some pretty serious issues around hanging.
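
For anyone collecting that detail, here is a minimal sketch of reading the exact version from a running server. It assumes the server answers the "srvr" four-letter command on its client port and that the reply begins with a "Zookeeper version:" line. The function names, the default port 2181, and the sample reply in the usage note are illustrative, not taken from any installation discussed here.

```python
import re
import socket


def parse_zk_version(srvr_output):
    """Return the x.y.z version found in 'srvr' output, or None if absent."""
    match = re.search(r"Zookeeper version:\s*(\d+\.\d+\.\d+)", srvr_output)
    return match.group(1) if match else None


def query_zk_version(host, port=2181, timeout=5.0):
    """Send the 'srvr' four-letter command and parse the version from the reply.

    Assumes the server accepts four-letter commands on its client port.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    reply = b"".join(chunks).decode("utf-8", errors="replace")
    return parse_zk_version(reply)
```

Usage: query_zk_version("zk1.example.com") against each member of the ensemble (hostname hypothetical). The parser can be tried offline on a made-up reply such as "Zookeeper version: 3.4.5-1392090, built on 09/30/2012 17:52 GMT".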

Gwen

On Mon, Aug 4, 2014 at 9:29 AM, Gwen Shapira <gshap...@cloudera.com> wrote:
> Thanks for the heads-up, Joe.
>
> We've been shipping Zookeeper 3.4.X for over two years now (since
> CDH4.0) and have many production customers. I'll check if there are
> any known issues with breaking quorum. In any case I will take your
> comments into account and see if I can arrange for extra testing.
>
> Can you share more information about the 3.4.X issues you were seeing?
> Were especially large clusters involved? A large number of
> consumers?
>
> Also, I'm curious to hear more about the reasons for a separate ZK
> cluster. I can see why you'd want it if you have thousands of
> consumers, but are there other reasons? Multiple zookeeper installs
> can be a pain to manage.
>
> Gwen
>
>
>
> On Mon, Aug 4, 2014 at 7:52 AM, Joe Stein <joe.st...@stealth.ly> wrote:
>> I have heard of issues from installations running 3.4.X that I have not heard
>> of from installations running 3.3.X (i.e. zk breaking quorum and the cluster
>> going down).
>>
>> In none of these cases did I have an opportunity to isolate, reproduce,
>> and confirm that the issue was happening and caused by 3.4.X. Moving to 3.3.X
>> was agreed to be the lower risk/cost solution to the problem. Once on 3.3.X
>> the issues didn't happen again.
>>
>> So I can't say for sure if there are issues with running 3.4.X but I would
>> suggest some due diligence in testing and production operation to validate
>> that every case that Kafka requires operates correctly (and over some
>> time).  There is a cost to this, so some company(s) will have to make that
>> investment and weigh the cost against the benefit of moving to 3.4.X.
>>
>> I currently recommend running a separate ZK cluster for Kafka production
>> and not chroot into an existing one except for test/qa/dev.
>>
>> I don't know what others' experience is with 3.4.X; as I said, the issues I
>> have seen could have been coincidence.
>>
>> /*******************************************
>>  Joe Stein
>>  Founder, Principal Consultant
>>  Big Data Open Source Security LLC
>>  http://www.stealth.ly
>>  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
>> ********************************************/
>>
>>
>> On Mon, Aug 4, 2014 at 12:56 AM, Gwen Shapira <gshap...@cloudera.com> wrote:
>>
>>> Hi,
>>>
>>> Kafka currently builds against Zookeeper 3.3.4, which is quite old.
>>>
>>> Perhaps we should move to the more recent 3.4.x branch?
>>>
>>> I tested the change on my system and the only impact is to
>>> EmbeddedZookeeper used in tests (it uses NIOServerCnxn.Factory, which
>>> was refactored into its own class in 3.4).
>>>
>>> Here's what the change looks like:
>>> https://gist.github.com/gwenshap/d95b36e0bced53cab5bb
>>>
>>> Gwen
>>>
