Re: Re: Re: ZK and Kafka failover testing

2017-04-20 Thread Jun Rao
icas: 4,5,1,2,3 Isr: 3,4,2,5 > Topic: ${topic-name} Partition: 10 Leader: 5 Replicas: 5,2,3,4,1 Isr: 5,2,3,4 > Topic: ${topic-name} Partition: 11 Leader: 3 Replicas: 1,3,4,5,2 Isr: 5,2,3,4 > Topic: ${topic-name} Parti

Re: Re: Re: ZK and Kafka failover testing

2017-04-19 Thread Hans Jespersen
367.4302 > Enterprise Architecture Team > PDX-NHIN > > > -Original Message- > From: Shrikant Patel > Sent: Wednesday, April 19, 2017 5:49 PM > To: users@kafka.apache.org > Subject: RE: [EXTERNAL] Re: Re: ZK and Kafka failover testing > > Thanks Jeff, Onur, Jun, Han

RE: Re: Re: ZK and Kafka failover testing

2017-04-19 Thread Shrikant Patel
Patel Sent: Wednesday, April 19, 2017 5:49 PM To: users@kafka.apache.org Subject: RE: [EXTERNAL] Re: Re: ZK and Kafka failover testing Thanks Jeff, Onur, Jun, Hans. I am learning a lot from your responses. To briefly summarize my steps: 5-node Kafka and ZK cluster. 1. ZK cluster has all node

RE: Re: Re: ZK and Kafka failover testing

2017-04-19 Thread Shrikant Patel
4,2,5,3 Thanks, Shri -Original Message- From: Jeff Widman [mailto:j...@netskope.com] Sent: Wednesday, April 19, 2017 4:11 PM To: users@kafka.apache.org Subject: [EXTERNAL] Re: Re: ZK and Kafka failover testing * Notice: This email was received from an external source * Oops, I l

Re: Re: ZK and Kafka failover testing

2017-04-19 Thread Jeff Widman
Oops, I linked to the wrong ticket; this is the one we hit: https://issues.apache.org/jira/browse/KAFKA-3042 On Wed, Apr 19, 2017 at 1:45 PM, Jeff Widman wrote: > *As Onur explained, if ZK is down, Kafka can still work, but won't be able to react to actual broker failures until ZK is

Re: Re: ZK and Kafka failover testing

2017-04-19 Thread Jeff Widman
*As Onur explained, if ZK is down, Kafka can still work, but won't be able to react to actual broker failures until ZK is up again. So if a broker is down in that window, some of the partitions may not be ready for read or write.* We had a production scenario where ZK had a long GC pause and Kafka

Re: Re: ZK and Kafka failover testing

2017-04-19 Thread Jun Rao
Hi, Shri, As Onur explained, if ZK is down, Kafka can still work, but won't be able to react to actual broker failures until ZK is up again. So if a broker is down in that window, some of the partitions may not be ready for read or write. As for the duplicates in the consumer, Hans had a good poi

Re: Re: ZK and Kafka failover testing

2017-04-19 Thread Hans Jespersen
The OP was asking about duplicate messages, not lost messages, so I think we are discussing two different possible scenarios. Whenever someone says they see duplicate messages, it's always good practice to first double-check ack mode, in-flight messages, and retries. Also, it's important to check if

Re: Re: ZK and Kafka failover testing

2017-04-19 Thread Onur Karaman
If this is what I think it is, it has nothing to do with acks, max.in.flight.requests.per.connection, or anything client-side and is purely about the Kafka cluster. Here's a simple example involving a single ZooKeeper instance, 3 brokers, a KafkaConsumer and KafkaProducer (neither of these clients

RE: Re: ZK and Kafka failover testing

2017-04-19 Thread Shrikant Patel
While we were testing, our producer had the following configuration: max.in.flight.requests.per.connection=1, acks=all, and retries=3. The entire producer-side set is below. The consumer uses manual offset commit; it commits the offset after it has successfully processed the message. Producer setting boot
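
For readers who want to see the three producer settings named in this message in one place, here is a minimal sketch using only java.util.Properties. The class name ProducerSettingsSketch and the method build() are hypothetical and not from the thread. Only the three values quoted above are set; the rest of the poster's producer settings are cut off in this excerpt and are deliberately left out.

```java
import java.util.Properties;

// Hypothetical helper for illustration; not code from the thread.
class ProducerSettingsSketch {

    // Collects only the three producer settings quoted in the message.
    // Other settings a real producer needs are omitted here because
    // the excerpt is truncated before they are listed.
    static Properties build() {
        Properties props = new Properties();
        // Wait for all in-sync replicas to acknowledge each write.
        props.setProperty("acks", "all");
        // Retry a failed send up to three times.
        props.setProperty("retries", "3");
        // Allow one unacknowledged request per connection, so a retry
        // cannot be reordered behind a later batch.
        props.setProperty("max.in.flight.requests.per.connection", "1");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build();
        // Iteration order of property names is not guaranteed.
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + "=" + props.getProperty(key));
        }
    }
}
```

A Properties object like this would normally be passed to a KafkaProducer constructor together with connection and serializer settings, which this sketch does not attempt to reproduce.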