Thanks, Gwen, for your excellent slides.

I will test it again based on your suggestions.




Best regards
Hawin

On Thu, Nov 12, 2015 at 6:35 PM, Gwen Shapira <g...@confluent.io> wrote:

> Hi,
>
> First, here's a handy slide-deck on avoiding data loss in Kafka:
>
> http://www.slideshare.net/gwenshap/kafka-reliability-when-it-absolutely-positively-has-to-be-there
>
> Note configuration parameters like the number of retries.
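>
> For example, a minimal sketch of the reliability-related settings for the
> new producer (the exact values here are only illustrative):
>
> # wait for all in-sync replicas to acknowledge each write
> acks=all
> # retry transient send failures instead of silently dropping the record
> retries=3
> retry.backoff.ms=100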
>
> Also, it looks like you are sending data to Kafka asynchronously, but you
> don't have a callback - so if send() fails, you have no way of updating the
> count, logging the error, or anything else. I recommend taking a look at the
> callback API for sending data and improving your test to at least record
> send failures.
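>
> For instance, here is a minimal sketch of a send with a callback, assuming
> the new org.apache.kafka.clients.producer API with a KafkaProducer<Integer,
> String> already configured (the topic name and the counter are illustrative):
>
> import java.util.concurrent.atomic.AtomicLong;
> import org.apache.kafka.clients.producer.*;
>
> final AtomicLong sendFailures = new AtomicLong();
> producer.send(new ProducerRecord<Integer, String>("kafka-test", messageStr),
>         new Callback() {
>             @Override
>             public void onCompletion(RecordMetadata metadata, Exception e) {
>                 // e is non-null exactly when the send failed
>                 if (e != null) {
>                     sendFailures.incrementAndGet();
>                     e.printStackTrace();
>                 }
>             }
>         });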
>
> Gwen
>
>
> On Thu, Nov 12, 2015 at 11:10 AM, Hawin Jiang <hawin.ji...@gmail.com> wrote:
>
> > Hi  Pradeep
> >
> > Here is my configuration
> >
> > ############################# Producer Basics #############################
> >
> > # list of brokers used for bootstrapping knowledge about the rest of the cluster
> > # format: host1:port1,host2:port2 ...
> > metadata.broker.list=localhost:9092
> >
> > # name of the partitioner class for partitioning events; default partition spreads data randomly
> > #partitioner.class=
> >
> > # specifies whether the messages are sent asynchronously (async) or synchronously (sync)
> > producer.type=sync
> >
> > # specify the compression codec for all data generated: none, gzip, snappy, lz4.
> > # the old config values work as well: 0, 1, 2, 3 for none, gzip, snappy, lz4, respectively
> > compression.codec=none
> >
> > # message encoder
> > serializer.class=kafka.serializer.DefaultEncoder
> >
> > # allow topic level compression
> > #compressed.topics=
> >
> > ############################# Async Producer #############################
> > # maximum time, in milliseconds, for buffering data on the producer queue
> > #queue.buffering.max.ms=
> >
> > # the maximum size of the blocking queue for buffering on the producer
> > #queue.buffering.max.messages=
> >
> > # Timeout for event enqueue:
> > # 0: events will be enqueued immediately or dropped if the queue is full
> > # -ve: enqueue will block indefinitely if the queue is full
> > # +ve: enqueue will block up to this many milliseconds if the queue is full
> > #queue.enqueue.timeout.ms=
> >
> > # the number of messages batched at the producer
> > #batch.num.messages=
> >
> >
> >
> > @Jinxing
> >
> > I could not find Kafka 0.8.3.0 on
> > http://mvnrepository.com/artifact/org.apache.kafka/kafka_2.10.
> > The latest release is 0.8.2.2.
> > If we flush the producer all the time, I think it will hurt our
> > performance.
> > I just want to make sure that every message counted on the producer side
> > can also be found on the consumer side.
> > We cannot afford to lose this many messages in our system.
> >
> >
> >
> > Best regards
> > Hawin
> >
> >
> > On Thu, Nov 12, 2015 at 7:02 AM, jinxing <jinxing6...@126.com> wrote:
> >
> > > I have 3 brokers;
> > > the acks configuration is -1 (all), meaning a send succeeds only after
> > > every in-sync replica has acknowledged the message;
> > > is this a bug?
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > At 2015-11-12 21:08:49, "Pradeep Gollakota" <pradeep...@gmail.com> wrote:
> > > >What is your producer configuration? Specifically, how many acks are
> > > >you requesting from Kafka?
> > > >
> > > >On Thu, Nov 12, 2015 at 2:03 AM, jinxing <jinxing6...@126.com> wrote:
> > > >
> > > >> in kafka_0.8.3.0:
> > > >> kafkaProducer = new KafkaProducer<>(properties,
> > > >>         new ByteArraySerializer(), new ByteArraySerializer());
> > > >> kafkaProducer.flush();
> > > >> you can call flush() after sending every few messages;
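> > > >>
> > > >> for example, a minimal sketch of flushing every few messages (the
> > > >> topic name comes from this thread; the batch size of 1000 is
> > > >> arbitrary):
> > > >>
> > > >> for (int i = 1; i <= 12000; i++) {
> > > >>     kafkaProducer.send(new ProducerRecord<byte[], byte[]>("kafka-test",
> > > >>             ("KAFKA_" + i).getBytes()));
> > > >>     if (i % 1000 == 0) {
> > > >>         // block until every buffered record is acknowledged or failed
> > > >>         kafkaProducer.flush();
> > > >>     }
> > > >> }
> > > >> // drain anything still buffered before closing
> > > >> kafkaProducer.flush();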
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> At 2015-11-12 17:36:24, "Hawin Jiang" <hawin.ji...@gmail.com> wrote:
> > > >> >Hi  Prabhjot
> > > >> >
> > > >> >The messages are "Thread1_kafka_1" and "Thread2_kafka_1", something
> > > >> >like that.
> > > >> >
> > > >> >The GetOffsetShell report is below:
> > > >> >
> > > >> >[kafka@dn-01 bin]$ ./kafka-run-class.sh kafka.tools.GetOffsetShell
> > > >> >--broker-list dn-01:9092 --time -1 --topic kafka-test
> > > >> >kafka-test:0:12529261
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> >@Jinxing
> > > >> >
> > > >> >Can you share your flush example with me? How can I avoid losing my
> > > >> >messages?
> > > >> >Thanks.
> > > >> >
> > > >> >
> > > >> >
> > > >> >Best regards
> > > >> >Hawin
> > > >> >
> > > >> >
> > > >> >
> > > >> >On Thu, Nov 12, 2015 at 1:00 AM, jinxing <jinxing6...@126.com> wrote:
> > > >> >
> > > >> >> there is a flush API on the producer; you can call it to prevent
> > > >> >> message loss;
> > > >> >>
> > > >> >>
> > > >> >> maybe it can help;
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> At 2015-11-12 16:43:54, "Hawin Jiang" <hawin.ji...@gmail.com> wrote:
> > > >> >> >Hi  Jinxing
> > > >> >> >
> > > >> >> >I don't think we can resolve this issue by adding more producers;
> > > >> >> >if I add more producers, we will lose even more messages.
> > > >> >> >
> > > >> >> >I just tested two producers:
> > > >> >> >Thread Producer 1 had 83064 messages on the producer side and 82273
> > > >> >> >messages on the consumer side.
> > > >> >> >Thread Producer 2 had 89844 messages on the producer side and 88892
> > > >> >> >messages on the consumer side.
> > > >> >> >
> > > >> >> >Thanks.
> > > >> >> >
> > > >> >> >
> > > >> >> >
> > > >> >> >Best regards
> > > >> >> >Hawin
> > > >> >> >
> > > >> >> >
> > > >> >> >On Thu, Nov 12, 2015 at 12:24 AM, jinxing <jinxing6...@126.com> wrote:
> > > >> >> >
> > > >> >> >> maybe there were some changes in 0.9.0.0;
> > > >> >> >>
> > > >> >> >> but you can still try increasing the producer sending rate and see
> > > >> >> >> whether messages are lost without any exception;
> > > >> >> >>
> > > >> >> >> note that, to increase the producer sending rate, you must have
> > > >> >> >> enough producer 'power';
> > > >> >> >>
> > > >> >> >> in my case, I have 50 producers sending messages at the same time :)
> > > >> >> >>
> > > >> >> >>
> > > >> >> >>
> > > >> >> >>
> > > >> >> >>
> > > >> >> >>
> > > >> >> >>
> > > >> >> >> At 2015-11-12 16:16:23, "Hawin Jiang" <hawin.ji...@gmail.com> wrote:
> > > >> >> >> >Hi  Jinxing
> > > >> >> >> >
> > > >> >> >> >I am using kafka_2.10-0.9.0.0-SNAPSHOT. I downloaded the source
> > > >> >> >> >code and installed it last week.
> > > >> >> >> >
> > > >> >> >> >I saw that 97446 messages were sent to Kafka successfully.
> > > >> >> >> >
> > > >> >> >> >So far, I have not found any failed messages.
> > > >> >> >> >
> > > >> >> >> >
> > > >> >> >> >
> > > >> >> >> >Best regards
> > > >> >> >> >Hawin
> > > >> >> >> >
> > > >> >> >> >On Thu, Nov 12, 2015 at 12:07 AM, jinxing <jinxing6...@126.com> wrote:
> > > >> >> >> >
> > > >> >> >> >> Hi, what version are you using?
> > > >> >> >> >>
> > > >> >> >> >> I am using 0.8.2.0, and I encountered this problem before;
> > > >> >> >> >>
> > > >> >> >> >> it seems that if the message rate on the producer side is too
> > > >> >> >> >> high, some of the messages will be lost;
> > > >> >> >> >>
> > > >> >> >> >> also, I found that the callback of the producer's send API is
> > > >> >> >> >> not reliable;
> > > >> >> >> >>
> > > >> >> >> >> only successfully sent messages trigger the callback; the failed
> > > >> >> >> >> ones don't;
> > > >> >> >> >>
> > > >> >> >> >> did you see this too?
> > > >> >> >> >>
> > > >> >> >> >>
> > > >> >> >> >>
> > > >> >> >> >>
> > > >> >> >> >>
> > > >> >> >> >>
> > > >> >> >> >>
> > > >> >> >> >>
> > > >> >> >> >> At 2015-11-12 16:01:17, "Hawin Jiang" <hawin.ji...@gmail.com> wrote:
> > > >> >> >> >> >Hi  All
> > > >> >> >> >> >
> > > >> >> >> >> >I have sent messages to Kafka for one minute. For Case 1, I found
> > > >> >> >> >> >97446 messages on the producer side and 96896 messages on the
> > > >> >> >> >> >consumer side.
> > > >> >> >> >> >I also tried Case 2 and faced the same issue: the counts do not
> > > >> >> >> >> >match between producer and consumer.
> > > >> >> >> >> >Can someone take a look at this issue?
> > > >> >> >> >> >Thanks.
> > > >> >> >> >> >
> > > >> >> >> >> >
> > > >> >> >> >> >Case 1:
> > > >> >> >> >> >
> > > >> >> >> >> >long startTime = System.currentTimeMillis();
> > > >> >> >> >> >long maxDurationInMilliseconds = 1 * 60 * 1000;
> > > >> >> >> >> >int messageNo = 0;
> > > >> >> >> >> >while (true) {
> > > >> >> >> >> >    if (System.currentTimeMillis() <= startTime + maxDurationInMilliseconds) {
> > > >> >> >> >> >        messageNo++;
> > > >> >> >> >> >        String messageStr = "KAFKA_" + messageNo;
> > > >> >> >> >> >        System.out.println("Message: " + messageNo);
> > > >> >> >> >> >        producer.send(new KeyedMessage<Integer, String>(topic, messageStr));
> > > >> >> >> >> >    } else {
> > > >> >> >> >> >        producer.close();
> > > >> >> >> >> >        System.out.println("Total kafka Message: " + messageNo);
> > > >> >> >> >> >        break;
> > > >> >> >> >> >    }
> > > >> >> >> >> >}
> > > >> >> >> >> >
> > > >> >> >> >> >
> > > >> >> >> >> >Case 2:
> > > >> >> >> >> >
> > > >> >> >> >> >for (int i = 1; i <= 12000; i++) {
> > > >> >> >> >> >    String messageStr = "KAFKA_" + i;
> > > >> >> >> >> >    producer.send(new KeyedMessage<Integer, String>(topic, messageStr));
> > > >> >> >> >> >}
> > > >> >> >> >> >
> > > >> >> >> >> >
> > > >> >> >> >> >
> > > >> >> >> >> >Best regards
> > > >> >> >> >> >Hawin
> > > >> >> >> >>
> > > >> >> >>
> > > >> >>
> > > >>
> > >
> >
>
