That could be the cause. It can be verified by changing acks to -1 and
then checking the data loss ratio again.
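
For reference, a minimal sketch of that change against the producer
properties quoted below (with the old sync producer, -1 makes the leader
wait for every replica currently in the ISR before acknowledging):

   props.put("producer.type", "sync");
   props.put("request.required.acks", "-1"); // wait for all ISR replicas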

Guozhang


On Tue, Jul 15, 2014 at 12:49 PM, Jiang Wu (Pricehistory) (BLOOMBERG/ 731
LEX -) <jwu...@bloomberg.net> wrote:

> Guozhang,
> My coworker came up with an explanation: at one moment the leader L and
> two followers F1, F2 are all in ISR. The producer sends a message m1 and
> receives acks from L and F1. Before the message is replicated to F2, L
> goes down. In the following leader election, F2, instead of F1, becomes
> the leader; since F2 never received m1, the message is lost.
> Could that be the root cause?
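>
> To make the scenario concrete, a tiny sketch in Java (L/F1/F2 are just
> the hypothetical replica names from above):
>
>    import java.util.*;
>
>    // With acks=2 out of 3 replicas, the set of replicas that acked m1
>    // need not contain the replica that wins the next leader election.
>    Set<String> ackedM1 = new HashSet<>(Arrays.asList("L", "F1")); // acks=2
>    String newLeader = "F2";                          // elected after L dies
>    boolean m1Survives = ackedM1.contains(newLeader); // false -> m1 is lost
>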
> Thanks,
> Jiang
>
> From: users@kafka.apache.org At: Jul 15 2014 15:05:25
> To: users@kafka.apache.org
> Subject: Re: message loss for sync producer, acks=2, topic replicas=3
>
> Guozhang,
>
> Please find the config below:
>
> Producer:
>
>    props.put("producer.type", "sync");
>
>    props.put("request.required.acks", 2);
>
>    props.put("serializer.class", "kafka.serializer.StringEncoder");
>
>    props.put("partitioner.class", "kafka.producer.DefaultPartitioner");
>
>    props.put("message.send.max.retries", "60");
>
>    props.put("retry.backoff.ms", "300");
>
> Consumer:
>
>    props.put("zookeeper.session.timeout.ms", "400");
>
>    props.put("zookeeper.sync.time.ms", "200");
>
>    props.put("auto.commit.interval.ms", "1000");
>
> Broker:
> num.network.threads=2
> num.io.threads=8
> socket.send.buffer.bytes=1048576
> socket.receive.buffer.bytes=1048576
> socket.request.max.bytes=104857600
> num.partitions=2
> log.retention.hours=168
> log.retention.bytes=20000000
> log.segment.bytes=536870912
> log.retention.check.interval.ms=60000
> log.cleaner.enable=false
> zookeeper.connection.timeout.ms=1000000
>
> Topic:
> Topic:p1r3      PartitionCount:1        ReplicationFactor:3     Configs:retention.bytes=10000000000
>
> Thanks,
> Jiang
>
> From: users@kafka.apache.org At: Jul 15 2014 13:59:03
> To: JIANG WU (PRICEHISTORY) (BLOOMBERG/ 731 LEX -), users@kafka.apache.org
> Subject: Re: message loss for sync producer, acks=2, topic replicas=3
>
> What config property values did you use on producer/consumer/broker?
>
> Guozhang
>
>
> On Tue, Jul 15, 2014 at 10:32 AM, Jiang Wu (Pricehistory) (BLOOMBERG/ 731
> LEX -) <jwu...@bloomberg.net> wrote:
>
> > Guozhang,
> > I'm testing on 0.8.1.1; just kill pid, no -9.
> > Regards,
> > Jiang
> >
> > From: users@kafka.apache.org At: Jul 15 2014 13:27:50
> > To: JIANG WU (PRICEHISTORY) (BLOOMBERG/ 731 LEX -), users@kafka.apache.org
> > Subject: Re: message loss for sync producer, acks=2, topic replicas=3
> >
> > Hello Jiang,
> >
> > Which version of Kafka are you using, and did you kill the broker with -9?
> >
> > Guozhang
> >
> >
> > On Tue, Jul 15, 2014 at 9:23 AM, Jiang Wu (Pricehistory) (BLOOMBERG/ 731
> > LEX -) <jwu...@bloomberg.net> wrote:
> >
> > > Hi,
> > > I observed some unexpected message loss in a Kafka fault-tolerance
> > > test. In the test, a topic with 3 replicas is created. A sync producer
> > > with acks=2 publishes to the topic. A consumer consumes from the topic
> > > and tracks message ids. During the test, the leader is killed. Both
> > > producer and consumer continue to run for a while. After the producer
> > > stops, the consumer reports whether all messages were received.
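> > >
> > > (For illustration, one way to do that final check, assuming each
> > > message carries a producer-assigned sequence id; the names below are
> > > made up:)
> > >
> > >    import java.util.*;
> > >
> > >    // received: ids seen by the consumer; lastSent: last id produced
> > >    static List<Long> missingIds(Set<Long> received, long lastSent) {
> > >        List<Long> missing = new ArrayList<>();
> > >        for (long id = 0; id <= lastSent; id++) {
> > >            if (!received.contains(id)) missing.add(id);
> > >        }
> > >        return missing;
> > >    }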
> > >
> > > The test was repeated multiple rounds; message loss happened in about
> > > 10% of the tests. A typical scenario is as follows: before the leader
> > > is killed, all 3 replicas are in ISR. After the leader is killed, one
> > > follower becomes the leader, and 2 replicas (including the new leader)
> > > are in ISR. Both the producer and consumer pause for several seconds
> > > during that time, and then continue. Message loss happens after the
> > > leader is killed.
> > >
> > > Because the new leader is in ISR before the old leader is killed,
> > > unclean leader election doesn't explain the message loss.
> > >
> > > Has anyone else observed such message loss? Is there any known issue
> > > that may cause it in the above scenario?
> > >
> > > Thanks,
> > > Jiang
> >
> >
> > --
> > -- Guozhang
> >
> >
> >
>
>
> --
> -- Guozhang
>
>
>


-- 
-- Guozhang
