Thanks for the heads-up, Joe. We've been shipping ZooKeeper 3.4.X for over two years now (since CDH4.0) and have many production customers. I'll check whether there are any known issues with breaking quorum. In any case, I will take your comments into account and see if I can arrange for extra testing.
Can you share more information about the 3.4.X issues you were seeing? Were especially large clusters involved? A large number of consumers?

Also, I'm curious to hear more about the reasons for a separate ZK cluster. I can see why you'd want one if you have thousands of consumers, but are there other reasons? Multiple ZooKeeper installs can be a pain to manage.

Gwen

On Mon, Aug 4, 2014 at 7:52 AM, Joe Stein <joe.st...@stealth.ly> wrote:

> I have heard issues from installations running 3.4.X that I have not heard
> from installations running 3.3.X (i.e. zk breaking quorum and cluster going
> down).
>
> In none of these cases did I have an opportunity to isolate and reproduce
> and confirm the issue happening and caused by 3.4.X. Moving to 3.3.x was
> agreed to being a lower risk/cost solution to the problem. Once on 3.3.X
> the issues didn't happen again.
>
> So I can't say for sure if there are issues with running 3.4.X but I would
> suggest some due diligence in testing and production operation to validate
> that every case that Kafka requires operates correctly (and over some
> time). There is a cost to this so some company(s) will have to take that
> investment and do some cost vs the benefit of moving to 3.4.x.
>
> I currently recommend running a separate ZK cluster for Kafka production
> and not chroot into an existing one except for test/qa/dev.
>
> I don't know what others experience is with 3.4.X as I said the issues I
> have seen could have been coincidence.
>
> /*******************************************
> Joe Stein
> Founder, Principal Consultant
> Big Data Open Source Security LLC
> http://www.stealth.ly
> Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> ********************************************/
>
>
> On Mon, Aug 4, 2014 at 12:56 AM, Gwen Shapira <gshap...@cloudera.com> wrote:
>
>> Hi,
>>
>> Kafka currently builds against Zookeeper 3.3.4, which is quite old.
>>
>> Perhaps we should move to the more recent 3.4.x branch?
>>
>> I tested the change on my system and the only impact is to
>> EmbeddedZookeeper used in tests (it uses NIOServerCnxn.factory, which
>> was refactored into its own class in 3.4).
>>
>> Here's what the change looks like:
>> https://gist.github.com/gwenshap/d95b36e0bced53cab5bb
>>
>> Gwen
>>