That could be the cause, and it can be verified by changing the acks to -1 and checking the data loss ratio then.
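For reference, a minimal sketch of that verification with the 0.8 sync producer API; the broker list, topic, and payload below are placeholders:

    import java.util.Properties;

    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class AcksAllProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Placeholder broker list; point this at the test cluster.
            props.put("metadata.broker.list", "broker1:9092,broker2:9092,broker3:9092");
            props.put("producer.type", "sync");
            // -1 = wait for every in-sync replica, instead of only 2 of 3.
            props.put("request.required.acks", "-1");
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            props.put("message.send.max.retries", "60");
            props.put("retry.backoff.ms", "300");

            Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
            producer.send(new KeyedMessage<String, String>("p1r3", "key-1", "m1"));
            producer.close();
        }
    }

With acks=-1 the leader acknowledges only after every replica currently in the ISR has the message, so a clean leader election cannot promote a replica that is missing an acknowledged write.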
Guozhang

On Tue, Jul 15, 2014 at 12:49 PM, Jiang Wu (Pricehistory) (BLOOMBERG/ 731 LEX -) <jwu...@bloomberg.net> wrote:

> Guozhang,
>
> My coworker came up with an explanation: at one moment the leader L and the
> two followers F1, F2 are all in ISR. The producer sends a message m1 and
> receives acks from L and F1. Before the message is replicated to F2, L goes
> down. In the following leader election, F2, instead of F1, becomes the
> leader, and loses m1 somehow.
>
> Could that be the root cause?
>
> Thanks,
> Jiang
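To make that sequence concrete, here is a toy model of it in plain Java (not Kafka code; the three lists simply stand in for the replica logs):

    import java.util.ArrayList;
    import java.util.List;

    public class AcksTwoLossSketch {
        public static void main(String[] args) {
            // In-memory stand-ins for leader L and followers F1, F2.
            List<String> l = new ArrayList<String>();
            List<String> f1 = new ArrayList<String>();
            List<String> f2 = new ArrayList<String>();

            // Producer sends m1 with request.required.acks=2: the send returns once
            // the leader and one follower have it. F2 has not caught up yet.
            l.add("m1");
            f1.add("m1");

            // L is killed. All three replicas were in ISR, so electing F2 is a
            // clean election, yet F2's log does not contain m1.
            List<String> newLeader = f2;

            // F1 truncates its log to match the new leader, so m1 disappears everywhere.
            f1.retainAll(newLeader);

            System.out.println("new leader (F2) log: " + newLeader); // []
            System.out.println("F1 log after truncation: " + f1);    // []
            // m1 was acknowledged to the producer, but no surviving replica has it.
        }
    }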
> From: users@kafka.apache.org At: Jul 15 2014 15:05:25
> To: users@kafka.apache.org
> Subject: Re: message loss for sync producer, acks=2, topic replicas=3
>
> Guozhang,
>
> Please find the config below:
>
> Producer:
> props.put("producer.type", "sync");
> props.put("request.required.acks", 2);
> props.put("serializer.class", "kafka.serializer.StringEncoder");
> props.put("partitioner.class", "kafka.producer.DefaultPartitioner");
> props.put("message.send.max.retries", "60");
> props.put("retry.backoff.ms", "300");
>
> Consumer:
> props.put("zookeeper.session.timeout.ms", "400");
> props.put("zookeeper.sync.time.ms", "200");
> props.put("auto.commit.interval.ms", "1000");
>
> Broker:
> num.network.threads=2
> num.io.threads=8
> socket.send.buffer.bytes=1048576
> socket.receive.buffer.bytes=1048576
> socket.request.max.bytes=104857600
> num.partitions=2
> log.retention.hours=168
> log.retention.bytes=20000000
> log.segment.bytes=536870912
> log.retention.check.interval.ms=60000
> log.cleaner.enable=false
> zookeeper.connection.timeout.ms=1000000
>
> Topic:
> Topic:p1r3  PartitionCount:1  ReplicationFactor:3  Configs:retention.bytes=10000000000
>
> Thanks,
> Jiang
>
> From: users@kafka.apache.org At: Jul 15 2014 13:59:03
> To: JIANG WU (PRICEHISTORY) (BLOOMBERG/ 731 LEX -), users@kafka.apache.org
> Subject: Re: message loss for sync producer, acks=2, topic replicas=3
>
> What config property values did you use on producer/consumer/broker?
>
> Guozhang
>
> On Tue, Jul 15, 2014 at 10:32 AM, Jiang Wu (Pricehistory) (BLOOMBERG/ 731
> LEX -) <jwu...@bloomberg.net> wrote:
>
> > Guozhang,
> >
> > I'm testing on 0.8.1.1; just kill pid, no -9.
> >
> > Regards,
> > Jiang
> >
> > From: users@kafka.apache.org At: Jul 15 2014 13:27:50
> > To: JIANG WU (PRICEHISTORY) (BLOOMBERG/ 731 LEX -), users@kafka.apache.org
> > Subject: Re: message loss for sync producer, acks=2, topic replicas=3
> >
> > Hello Jiang,
> >
> > Which version of Kafka are you using, and did you kill the broker with -9?
> >
> > Guozhang
> >
> > On Tue, Jul 15, 2014 at 9:23 AM, Jiang Wu (Pricehistory) (BLOOMBERG/ 731
> > LEX -) <jwu...@bloomberg.net> wrote:
> >
> > > Hi,
> > >
> > > I observed some unexpected message loss in a Kafka fault-tolerance test.
> > > In the test, a topic with 3 replicas is created. A sync producer with
> > > acks=2 publishes to the topic. A consumer consumes from the topic and
> > > tracks message ids. During the test, the leader is killed. Both producer
> > > and consumer continue to run for a while. After the producer stops, the
> > > consumer reports whether all messages were received.
> > >
> > > The test was repeated for multiple rounds; message loss happened in about
> > > 10% of the tests. A typical scenario is as follows: before the leader is
> > > killed, all 3 replicas are in ISR. After the leader is killed, one
> > > follower becomes the leader, and 2 replicas (including the new leader)
> > > are in ISR. Both the producer and consumer pause for several seconds
> > > during that time, and then continue. Message loss happens after the
> > > leader is killed.
> > >
> > > Because the new leader is in ISR before the old leader is killed, unclean
> > > leader election doesn't explain the message loss.
> > >
> > > I'm wondering whether anyone else has also observed such message loss. Is
> > > there any known issue that may cause message loss in the above scenario?
> > >
> > > Thanks,
> > > Jiang
> >
> > --
> > -- Guozhang
>
> --
> -- Guozhang

--
-- Guozhang
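The id-tracking consumer used in the test above is not shown in the thread; a minimal sketch of that kind of completeness check, assuming each message payload is just a sequential integer id (the class and method names here are hypothetical), could look like:

    import java.util.BitSet;

    public class CompletenessCheck {
        private final BitSet seen = new BitSet();
        private int maxId = -1;

        // Called by the consumer for every message it receives.
        public void onMessage(String payload) {
            int id = Integer.parseInt(payload);
            seen.set(id);
            if (id > maxId) maxId = id;
        }

        // Called after the producer has stopped; reports any gaps in the ids.
        public void report() {
            int missing = 0;
            for (int id = 0; id <= maxId; id++) {
                if (!seen.get(id)) {
                    System.out.println("missing message id " + id);
                    missing++;
                }
            }
            System.out.println(missing == 0 ? "all messages received"
                                            : missing + " messages lost");
        }

        public static void main(String[] args) {
            CompletenessCheck check = new CompletenessCheck();
            check.onMessage("0");
            check.onMessage("1");
            check.onMessage("3"); // id 2 never arrives
            check.report();       // prints the missing id 2, then "1 messages lost"
        }
    }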