@michal

My interpretation is that he's running 2 instances of ZooKeeper, not 6
(one on the "4 broker machine" and one on the other).

I'm not sure where that leaves you in ZooKeeper land, i.e. if you happen to
have a timeout between the two ZooKeepers, will you be out of service or
will you have a split-brain problem? Neither alternative is good.
That said, it should be visible in the logs.
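
If it helps, here is a minimal Python sketch of how one might scan a
controller log for the events in question. The marker strings are copied
from the log excerpt quoted further down in this thread; the function name
count_markers and the file path in the usage comment are made up for
illustration, so adjust them to your setup.

```python
from collections import Counter

# Marker strings taken from the controller log quoted in this thread.
MARKERS = (
    "ZK expired; shut down all controller components",
    "No broker in ISR is alive for",
    "Broker failure callback",
)


def count_markers(lines):
    """Count how many lines contain each marker string."""
    counts = Counter()
    for line in lines:
        for marker in MARKERS:
            if marker in line:
                counts[marker] += 1
    return counts


# Usage (the path is hypothetical; point it at your broker's log directory):
#
#   with open("logs/controller.log", encoding="utf-8", errors="replace") as fh:
#       for marker, n in count_markers(fh).items():
#           print(n, marker)
```

Comparing the timestamps of any hits with the ZooKeeper server logs from
the same period should show whether the session expiry lines up with a
timeout between the two ZooKeeper instances.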

Anyway, two ZK instances is not a good config: stick to one or go to three.
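
To put numbers on that, here is a quick sketch of the majority arithmetic,
assuming the default majority quorum. The helper names are mine, not part
of any ZooKeeper API.

```python
def quorum_size(ensemble_size):
    """Smallest strict majority of an ensemble of the given size."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """Servers that can be lost while a majority can still be formed."""
    return ensemble_size - quorum_size(ensemble_size)


# Print ensemble size, quorum size and tolerated failures for 1..6 servers.
for n in range(1, 7):
    print(n, quorum_size(n), tolerated_failures(n))
```

With two servers the quorum is two, so losing either one stops the
ensemble: no better than a single server, and with twice the chance of
something failing. Three servers is the smallest ensemble that survives
the loss of one.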





2017-04-30 15:41 GMT+02:00 Michal Borowiecki <michal.borowie...@openbet.com>:

> Hi Jan,
>
> Correct. As I said before it's not common or recommended practice to run
> an even number, and I wouldn't recommend it myself. I hope it didn't sound
> as if I did.
>
> However, I don't see how this would cause the issue at hand unless at
> least 3 out of the 6 zookeepers died, but that could also have happened in
> a 5 node setup.
>
> In either case, changing the number of zookeepers is not a prerequisite to
> progress debugging this issue further.
>
> Cheers,
>
> Michal
>
> On 30/04/17 13:35, jan wrote:
>
> I looked this up yesterday when I read the grandparent, as my old
> company ran two and I needed to know.
> Your link is a bit ambiguous but it has a link to the zookeeper
> Getting Started guide which says this:
>
> "
> For replicated mode, a minimum of three servers are required, and it
> is strongly recommended that you have an odd number of servers. If you
> only have two servers, then you are in a situation where if one of
> them fails, there are not enough machines to form a majority quorum.
> Two servers is inherently less stable than a single server, because
> there are two single points of failure.
> "
> <https://zookeeper.apache.org/doc/r3.4.10/zookeeperStarted.html> 
>
> cheers
>
> jan
>
>
> On 30/04/2017, Michal Borowiecki <michal.borowie...@openbet.com> wrote:
>
> Svante, I don't share your opinion.
> Having an even number of zookeepers is not a problem in itself, it
> simply means you don't get any better resilience than if you had one
> fewer instance.
> Yes, it's not common or recommended practice, but you are allowed to
> have an even number of zookeepers and it's most likely not related to
> the problem at hand and does NOT need to be addressed first.
> https://zookeeper.apache.org/doc/r3.4.10/zookeeperAdmin.html#sc_zkMulitServerSetup
>
> Abhit, I'm afraid the log snippet is not enough for me to help.
> Maybe someone else in the community with more experience can recognize
> the symptoms but in the meantime, if you haven't already done so, you
> may want to search for similar issues:
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20KAFKA%20AND%20text%20~%20%22ZK%20expired%3B%20shut%20down%20all%20controller%22
>
> searching for text like "ZK expired; shut down all controller" or "No
> broker in ISR is alive for" or other interesting events from the log.
>
> Hope that helps,
> Michal
>
>
> On 26/04/17 21:40, Svante Karlsson wrote:
>
> You are not supposed to run an even number of zookeepers. Fix that first
>
> On Apr 26, 2017 20:59, "Abhit Kalsotra" <abhit...@gmail.com> wrote:
>
>
> Any pointers please....
>
>
> Abhi
>
> On Wed, Apr 26, 2017 at 11:03 PM, Abhit Kalsotra <abhit...@gmail.com>
> wrote:
>
>
> Hi *
>
> My Kafka setup:
>
> * OS: Windows machine
> * 6 broker nodes, 4 on one machine and 2 on the other machine
> * One ZK instance on the 4-broker machine and another ZK on the
>   2-broker machine
> * 2 topics with partition size = 50 and replication factor = 3
>
> I am producing on average around 500 messages/sec, with each message
> close to 98 bytes.
>
> The message rate stays more or less constant throughout, but after
> running the setup for close to 2 weeks my Kafka cluster broke, and this
> has happened twice in a month. I am not able to understand what the
> issue is. Kafka gurus, please do share your inputs...
>
> The controller.log file at the time Kafka broke looks like this:
>
>
>
>
> [2017-04-26 12:03:34,998] INFO [Controller 0]: Broker failure callback for 0,1,3,5,6 (kafka.controller.KafkaController)
> [2017-04-26 12:03:34,998] INFO [Controller 0]: Removed ArrayBuffer() from list of shutting down brokers. (kafka.controller.KafkaController)
> [2017-04-26 12:03:34,998] INFO [Partition state machine on Controller 0]: Invoking state change to OfflinePartition for partitions
> [__consumer_offsets,19],[mytopic,11],[__consumer_offsets,30],
> [mytopicOLD,18],[mytopic,13],[__consumer_offsets,47],[mytopicOLD,26],
> [__consumer_offsets,29],[mytopicOLD,0],[__consumer_offsets,41],
> [mytopic,44],[mytopicOLD,38],[mytopicOLD,2],[__consumer_offsets,17],
> [__consumer_offsets,10],[mytopic,20],[mytopic,23],[mytopic,30],
> [__consumer_offsets,14],[__consumer_offsets,40],[mytopic,31],
> [mytopicOLD,43],[mytopicOLD,19],[mytopicOLD,35],
> [__consumer_offsets,18],[mytopic,43],[__consumer_offsets,26],
> [__consumer_offsets,0],[mytopic,32],[__consumer_offsets,24],
> [mytopicOLD,3],[mytopic,2],[mytopic,3],[mytopicOLD,45],[mytopic,35],
> [__consumer_offsets,20],[mytopic,1],[mytopicOLD,33],
> [__consumer_offsets,5],[mytopicOLD,47],[__consumer_offsets,22],
> [mytopicOLD,8],[mytopic,33],[mytopic,36],[mytopicOLD,11],[mytopic,47],
> [mytopicOLD,20],[mytopic,48],[__consumer_offsets,12],[mytopicOLD,32],
> [__consumer_offsets,8],[mytopicOLD,39],[mytopicOLD,27],
> [mytopicOLD,49],[mytopicOLD,42],[mytopic,21],[mytopicOLD,31],
> [mytopic,29],[__consumer_offsets,23],[mytopicOLD,21],
> [__consumer_offsets,48],[__consumer_offsets,11],[mytopic,18],
> [__consumer_offsets,13],[mytopic,45],[mytopic,5],[mytopicOLD,25],
> [mytopic,6],[mytopicOLD,23],[mytopicOLD,37],
> [__consumer_offsets,6],[__consumer_offsets,49],[mytopicOLD,13],
> [__consumer_offsets,28],[__consumer_offsets,4],[__consumer_offsets,37],
> [mytopic,12],[mytopicOLD,30],[__consumer_offsets,31],
> [__consumer_offsets,44],[mytopicOLD,15],[mytopicOLD,29],[mytopic,37],
> [mytopic,38],[__consumer_offsets,42],[mytopic,27],[mytopic,26],
> [mytopic,15],[__consumer_offsets,34],[mytopic,42],
> [__consumer_offsets,46],[mytopic,14],[mytopicOLD,12],[mytopicOLD,1],
> [mytopic,7],[__consumer_offsets,25],[mytopicOLD,24],[mytopicOLD,44],
> [mytopicOLD,14],[__consumer_offsets,32],[mytopic,0],
> [__consumer_offsets,43],[mytopic,39],[mytopicOLD,5],[mytopic,9],
> [mytopic,24],[__consumer_offsets,36],[mytopic,25],[mytopicOLD,36],
> [mytopic,19],[__consumer_offsets,35],[__consumer_offsets,7],
> [mytopic,8],[__consumer_offsets,38],[mytopicOLD,48],[mytopicOLD,9],
> [__consumer_offsets,1],[mytopicOLD,6],[mytopic,41],[mytopicOLD,41],
> [mytopicOLD,7],[mytopic,17],[mytopicOLD,17],[mytopic,49],
> [__consumer_offsets,16],[__consumer_offsets,2]
> (kafka.controller.PartitionStateMachine)
> [2017-04-26 12:03:35,045] INFO [SessionExpirationListener on 1], ZK expired; shut down all controller components and try to re-elect (kafka.controller.KafkaController$SessionExpirationListener)
> [2017-04-26 12:03:35,045] DEBUG [Controller 1]: Controller resigning, broker id 1 (kafka.controller.KafkaController)
> [2017-04-26 12:03:35,045] DEBUG [Controller 1]: De-registering IsrChangeNotificationListener (kafka.controller.KafkaController)
> [2017-04-26 12:03:35,060] INFO [Partition state machine on Controller 1]: Stopped partition state machine (kafka.controller.PartitionStateMachine)
> [2017-04-26 12:03:35,060] INFO [Replica state machine on controller 1]: Stopped replica state machine (kafka.controller.ReplicaStateMachine)
> [2017-04-26 12:03:35,060] INFO [Controller 1]: Broker 1 resigned as the controller (kafka.controller.KafkaController)
> [2017-04-26 12:03:36,013] DEBUG [OfflinePartitionLeaderSelector]: No broker in ISR is alive for [__consumer_offsets,19]. Pick the leader from the alive assigned replicas: (kafka.controller.OfflinePartitionLeaderSelector)
> [2017-04-26 12:03:36,029] DEBUG [OfflinePartitionLeaderSelector]: [mytopic,11]. Pick the leader from the alive assigned replicas: (kafka.controller.OfflinePartitionLeaderSelector)
> [2017-04-26 12:03:36,029] DEBUG [OfflinePartitionLeaderSelector]: No broker in ISR is alive for [__consumer_offsets,30]. Pick the leader from the alive assigned replicas: (kafka.controller.OfflinePartitionLeaderSelector)
> [2017-04-26 12:03:37,811] DEBUG [OfflinePartitionLeaderSelector]: Some broker in ISR is alive for [mytopicOLD,18]. Select 2 from ISR 2 to be the leader. (kafka.controller.OfflinePartitionLeaderSelector)
>
> Typical broker config attached.. Please do share some valid inputs...
>
> Abhi
>
>
> *-- *
> If you can't succeed, call it version 1.0
>
>
>
>
>
> --
> Michal Borowiecki
> Senior Software Engineer L4
> T: +44 208 742 1600
>    +44 203 249 8448
> E: michal.borowie...@openbet.com
> W: www.openbet.com <http://www.openbet.com/>
>
> OpenBet Ltd
> Chiswick Park Building 9
> 566 Chiswick High Rd
> London
> W4 5XT
> UK
>
> This message is confidential and intended only for the addressee. If you
> have received this message in error, please immediately notify the
> postmas...@openbet.com and delete it from your system as well as any
> copies. The content of e-mails as well as traffic data may be monitored
> by OpenBet for employment and security purposes. To protect the
> environment please do not print this e-mail unless necessary. OpenBet
> Ltd. Registered Office: Chiswick Park Building 9, 566 Chiswick High
> Road, London, W4 5XT, United Kingdom. A company registered in England
> and Wales. Registered no. 3134634. VAT no. GB927523612
>
>
>
>
