Ah, yes, you're right. I misread it.

My bad. Apologies.

Michal

On 30/04/17 16:02, Svante Karlsson wrote:
@michal

My interpretation is that he's running 2 instances of ZooKeeper, not 6 (one on the "4 broker machine" and one on the other).

I'm not sure where that leaves you in ZooKeeper land, i.e. if you happen to have a timeout between the two ZooKeepers, will you be out of service or will you have a split-brain problem? Neither alternative is good. That said, it should be visible in the logs.

Anyway, two ZK instances is not a good config - stick to one or go to three.
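
On the timeout question: with ZooKeeper's default majority quorum, a side of a partition can only keep serving if it holds more than half of the ensemble. Below is a minimal sketch of that check; the function names are made up for illustration and are not part of any ZooKeeper API.

```python
def has_quorum(reachable: int, ensemble_size: int) -> bool:
    """True if `reachable` servers form a strict majority of the ensemble."""
    return reachable > ensemble_size // 2


def sides_with_quorum(partition_sizes):
    """Return the indices of the partition sides that can still form a quorum."""
    ensemble_size = sum(partition_sizes)
    return [
        index
        for index, size in enumerate(partition_sizes)
        if has_quorum(size, ensemble_size)
    ]


# Two servers that cannot reach each other: neither side has a majority.
print(sides_with_quorum([1, 1]))
# Three servers split 2/1: only the side with two servers has a majority.
print(sides_with_quorum([2, 1]))
```

Under that assumption, a two-server ensemble that loses the link between its members would be out of service rather than split-brained, since neither server alone is a majority of two.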





2017-04-30 15:41 GMT+02:00 Michal Borowiecki <michal.borowie...@openbet.com>:

    Hi Jan,

    Correct. As I said before it's not common or recommended practice
    to run an even number, and I wouldn't recommend it myself. I hope
    it didn't sound as if I did.

    However, I don't see how this would cause the issue at hand unless
    at least 3 out of the 6 zookeepers died, but that could also have
    happened in a 5 node setup.

    In either case, changing the number of zookeepers is not a
    prerequisite to progress debugging this issue further.

    Cheers,

    Michal


    On 30/04/17 13:35, jan wrote:
    I looked this up yesterday when I read the grandparent, as my old
    company ran two and I needed to know.
    Your link is a bit ambiguous but it has a link to the zookeeper
    Getting Started guide which says this:

    "
    For replicated mode, a minimum of three servers are required, and it
    is strongly recommended that you have an odd number of servers. If you
    only have two servers, then you are in a situation where if one of
    them fails, there are not enough machines to form a majority quorum.
    Two servers is inherently less stable than a single server, because
    there are two single points of failure.
    "

    <https://zookeeper.apache.org/doc/r3.4.10/zookeeperStarted.html>
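
The majority arithmetic behind the quoted passage can be sketched in a few lines. This is only an illustration, assuming the default majority quorum; the helper names are hypothetical.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that is a strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can fail while a majority quorum is still possible."""
    return ensemble_size - quorum_size(ensemble_size)


for n in range(1, 7):
    print(f"{n} servers: quorum={quorum_size(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Two servers tolerate no failures, the same as one, and six tolerate two, the same as five, which is why an even-sized ensemble adds no resilience over the next smaller odd size.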

    cheers

    jan


    On 30/04/2017, Michal Borowiecki <michal.borowie...@openbet.com> wrote:
    Svante, I don't share your opinion.
    Having an even number of zookeepers is not a problem in itself, it
    simply means you don't get any better resilience than if you had one
    fewer instance.
    Yes, it's not common or recommended practice, but you are allowed to
    have an even number of zookeepers and it's most likely not related to
    the problem at hand and does NOT need to be addressed first.
    
    https://zookeeper.apache.org/doc/r3.4.10/zookeeperAdmin.html#sc_zkMulitServerSetup

    Abhit, I'm afraid the log snippet is not enough for me to help.
    Maybe someone else in the community with more experience can recognize
    the symptoms but in the meantime, if you haven't already done so, you
    may want to search for similar issues:

    
    https://issues.apache.org/jira/issues/?jql=project%20%3D%20KAFKA%20AND%20text%20~%20%22ZK%20expired%3B%20shut%20down%20all%20controller%22

    searching for text like "ZK expired; shut down all controller" or "No
    broker in ISR is alive for" or other interesting events from the log.
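
To see how often those messages occur locally before searching JIRA, something like the following could be used. It is a rough sketch: the two phrases come from the log in this thread, while the function names and the `logs/controller.log` path in the comment are assumptions.

```python
from collections import Counter

# Phrases taken from the controller log quoted in this thread.
PATTERNS = [
    "ZK expired; shut down all controller",
    "No broker in ISR is alive for",
]


def count_events(lines, patterns=PATTERNS):
    """Count how many lines contain each phrase (plain substring match)."""
    counts = Counter()
    for line in lines:
        for pattern in patterns:
            if pattern in line:
                counts[pattern] += 1
    return counts


def scan_file(path):
    """Count the phrases in a log file, e.g. scan_file("logs/controller.log")."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_events(handle)


# Two entries copied from the log in this thread, used as a small demo.
sample = [
    "[2017-04-26 12:03:35,045] INFO [SessionExpirationListener on 1], "
    "ZK expired; shut down all controller components and try to re-elect "
    "(kafka.controller.KafkaController$SessionExpirationListener)",
    "[2017-04-26 12:03:36,013] DEBUG [OfflinePartitionLeaderSelector]: "
    "No broker in ISR is alive for [__consumer_offsets,19]. Pick the leader "
    "from the alive assigned replicas: "
    "(kafka.controller.OfflinePartitionLeaderSelector)",
]
for pattern, count in count_events(sample).items():
    print(f"{count:4d}  {pattern}")
```

Running `scan_file` over the controller log of each broker and comparing timestamps of the session expirations may help narrow down which node lost its ZooKeeper session first.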

    Hope that helps,
    Michal


    On 26/04/17 21:40, Svante Karlsson wrote:
    You are not supposed to run an even number of zookeepers. Fix that first

    On Apr 26, 2017 20:59, "Abhit Kalsotra" <abhit...@gmail.com> wrote:

    Any pointers please....


    Abhi

    On Wed, Apr 26, 2017 at 11:03 PM, Abhit Kalsotra <abhit...@gmail.com> wrote:

    Hi *

    My kafka setup


    * OS: Windows machine
    * 6 broker nodes, 4 on one machine and 2 on the other machine
    * One ZK instance on the 4-broker machine and another ZK instance on the
      2-broker machine
    * 2 topics with 50 partitions each and replication factor = 3

    I am producing on average around 500 messages/sec, with each message
    close to 98 bytes in size.

    The message rate stays more or less constant throughout, but after
    running the setup for close to 2 weeks, my Kafka cluster broke, and this
    has happened twice in a month. I am not able to understand what the
    issue is. Kafka gurus, please do share your inputs...

    The controller.log file at the time the Kafka cluster broke looks like this:




    [2017-04-26 12:03:34,998] INFO [Controller 0]: Broker failure callback for 0,1,3,5,6 (kafka.controller.KafkaController)
    [2017-04-26 12:03:34,998] INFO [Controller 0]: Removed ArrayBuffer() from list of shutting down brokers. (kafka.controller.KafkaController)
    [2017-04-26 12:03:34,998] INFO [Partition state machine on Controller 0]: Invoking state change to OfflinePartition for partitions [__consumer_offsets,19],[mytopic,11],[__consumer_offsets,30],[mytopicOLD,18],[mytopic,13],[__consumer_offsets,47],[mytopicOLD,26],[__consumer_offsets,29],[mytopicOLD,0],[__consumer_offsets,41],[mytopic,44],[mytopicOLD,38],[mytopicOLD,2],[__consumer_offsets,17],[__consumer_offsets,10],[mytopic,20],[mytopic,23],[mytopic,30],[__consumer_offsets,14],[__consumer_offsets,40],[mytopic,31],[mytopicOLD,43],[mytopicOLD,19],[mytopicOLD,35],[__consumer_offsets,18],[mytopic,43],[__consumer_offsets,26],[__consumer_offsets,0],[mytopic,32],[__consumer_offsets,24],[mytopicOLD,3],[mytopic,2],[mytopic,3],[mytopicOLD,45],[mytopic,35],[__consumer_offsets,20],[mytopic,1],[mytopicOLD,33],[__consumer_offsets,5],[mytopicOLD,47],[__consumer_offsets,22],[mytopicOLD,8],[mytopic,33],[mytopic,36],[mytopicOLD,11],[mytopic,47],[mytopicOLD,20],[mytopic,48],[__consumer_offsets,12],[mytopicOLD,32],[__consumer_offsets,8],[mytopicOLD,39],[mytopicOLD,27],[mytopicOLD,49],[mytopicOLD,42],[mytopic,21],[mytopicOLD,31],[mytopic,29],[__consumer_offsets,23],[mytopicOLD,21],[__consumer_offsets,48],[__consumer_offsets,11],[mytopic,18],[__consumer_offsets,13],[mytopic,45],[mytopic,5],[mytopicOLD,25],[mytopic,6],[mytopicOLD,23],[mytopicOLD,37],[__consumer_offsets,6],[__consumer_offsets,49],[mytopicOLD,13],[__consumer_offsets,28],[__consumer_offsets,4],[__consumer_offsets,37],[mytopic,12],[mytopicOLD,30],[__consumer_offsets,31],[__consumer_offsets,44],[mytopicOLD,15],[mytopicOLD,29],[mytopic,37],[mytopic,38],[__consumer_offsets,42],[mytopic,27],[mytopic,26],[mytopic,15],[__consumer_offsets,34],[mytopic,42],[__consumer_offsets,46],[mytopic,14],[mytopicOLD,12],[mytopicOLD,1],[mytopic,7],[__consumer_offsets,25],[mytopicOLD,24],[mytopicOLD,44],[mytopicOLD,14],[__consumer_offsets,32],[mytopic,0],[__consumer_offsets,43],[mytopic,39],[mytopicOLD,5],[mytopic,9],[mytopic,24],[__consumer_offsets,36],[mytopic,25],[mytopicOLD,36],[mytopic,19],[__consumer_offsets,35],[__consumer_offsets,7],[mytopic,8],[__consumer_offsets,38],[mytopicOLD,48],[mytopicOLD,9],[__consumer_offsets,1],[mytopicOLD,6],[mytopic,41],[mytopicOLD,41],[mytopicOLD,7],[mytopic,17],[mytopicOLD,17],[mytopic,49],[__consumer_offsets,16],[__consumer_offsets,2] (kafka.controller.PartitionStateMachine)
    [2017-04-26 12:03:35,045] INFO [SessionExpirationListener on 1], ZK expired; shut down all controller components and try to re-elect (kafka.controller.KafkaController$SessionExpirationListener)
    [2017-04-26 12:03:35,045] DEBUG [Controller 1]: Controller resigning, broker id 1 (kafka.controller.KafkaController)
    [2017-04-26 12:03:35,045] DEBUG [Controller 1]: De-registering IsrChangeNotificationListener (kafka.controller.KafkaController)
    [2017-04-26 12:03:35,060] INFO [Partition state machine on Controller 1]: Stopped partition state machine (kafka.controller.PartitionStateMachine)
    [2017-04-26 12:03:35,060] INFO [Replica state machine on controller 1]: Stopped replica state machine (kafka.controller.ReplicaStateMachine)
    [2017-04-26 12:03:35,060] INFO [Controller 1]: Broker 1 resigned as the controller (kafka.controller.KafkaController)
    [2017-04-26 12:03:36,013] DEBUG [OfflinePartitionLeaderSelector]: No broker in ISR is alive for [__consumer_offsets,19]. Pick the leader from the alive assigned replicas: (kafka.controller.OfflinePartitionLeaderSelector)
    [2017-04-26 12:03:36,029] DEBUG [OfflinePartitionLeaderSelector]: [mytopic,11]. Pick the leader from the alive assigned replicas: (kafka.controller.OfflinePartitionLeaderSelector)
    [2017-04-26 12:03:36,029] DEBUG [OfflinePartitionLeaderSelector]: No broker in ISR is alive for [__consumer_offsets,30]. Pick the leader from the alive assigned replicas: (kafka.controller.OfflinePartitionLeaderSelector)
    [2017-04-26 12:03:37,811] DEBUG [OfflinePartitionLeaderSelector]: Some broker in ISR is alive for [mytopicOLD,18]. Select 2 from ISR 2 to be the leader. (kafka.controller.OfflinePartitionLeaderSelector)
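
Controller log entries pasted into mail tend to run together, which makes them hard to read. Here is a minimal sketch for splitting such text back into entries and counting the affected partitions per topic. The regular expressions are assumptions based on the timestamp and `[topic,partition]` shapes visible in this thread, and the helper names are hypothetical.

```python
import re
from collections import Counter

# Every entry starts with a timestamp such as [2017-04-26 12:03:34,998].
TIMESTAMP = re.compile(r"\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\]")
# Partitions are printed as [topic,partition], e.g. [mytopic,11].
PARTITION = re.compile(r"\[([\w.-]+),(\d+)\]")


def split_entries(text):
    """Split run-together log text into one string per entry."""
    flat = " ".join(text.split())  # collapse line wraps into single spaces
    starts = [match.start() for match in TIMESTAMP.finditer(flat)]
    ends = starts[1:] + [len(flat)]
    return [flat[a:b].strip() for a, b in zip(starts, ends)]


def partitions_by_topic(entry):
    """Count the [topic,partition] pairs in one entry, grouped by topic."""
    return Counter(topic for topic, _ in PARTITION.findall(entry))


# Abbreviated sample based on the log in this thread.
sample = (
    "[2017-04-26 12:03:34,998] INFO [Controller 0]: Broker failure callback "
    "for 0,1,3,5,6 (kafka.controller.KafkaController)[2017-04-26\n"
    "12:03:34,998] INFO [Partition state machine on Controller 0]: Invoking "
    "state change to OfflinePartition for partitions "
    "[__consumer_offsets,19],[mytopic,11],[mytopicOLD,18] "
    "(kafka.controller.PartitionStateMachine)"
)
entries = split_entries(sample)
for entry in entries:
    print(entry[:60], dict(partitions_by_topic(entry)))
```

Note that this does not repair names that were wrapped in the middle of a token (for example `[__consumer_` followed by `offsets,30]` on the next line); it is better run against the original log file than against mail text.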

    Typical broker config attached.. Please do share some valid inputs...

    Abhi
    !wq


    --
    If you can't succeed, call it version 1.0


    --
    Michal Borowiecki
    Senior Software Engineer L4
    T: +44 208 742 1600
       +44 203 249 8448
    E: michal.borowie...@openbet.com
    W: www.openbet.com

    OpenBet Ltd
    Chiswick Park Building 9
    566 Chiswick High Rd
    London
    W4 5XT
    UK

    This message is confidential and intended only for the addressee. If you
    have received this message in error, please immediately notify the
    postmas...@openbet.com and delete it from your system as well as any
    copies. The content of e-mails as well as traffic data may be monitored
    by OpenBet for employment and security purposes. To protect the
    environment please do not print this e-mail unless necessary. OpenBet
    Ltd. Registered Office: Chiswick Park Building 9, 566 Chiswick High
    Road, London, W4 5XT, United Kingdom. A company registered in England
    and Wales. Registered no. 3134634. VAT no. GB927523612


