Yes. I could hardly believe there would be message loss in such a case, so I
checked my test code very carefully. Unfortunately, I cannot provide any logs
because of company policy. This is a big risk for our operations, as all the
servers in one data center must be rebooted regularly at the same time by the
system admins, so a rolling restart is not an option for us.
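
For illustration only, here is a minimal sketch of the application-level
catch-and-retry loop discussed earlier in this thread. The Sender interface
and the sendWithRetry name are hypothetical stand-ins for the real producer's
send() and are not part of the Kafka API. Note that a wrapper like this only
helps when send() actually throws; it does not address the silent loss
reported here.

```java
public class RetryingSend {
    /** Hypothetical stand-in for the producer's send(); not a Kafka API. */
    public interface Sender {
        void send(String message) throws Exception;
    }

    /**
     * Calls sender.send(message) up to maxAttempts times, sleeping backoffMs
     * between failed attempts. Returns the attempt number that succeeded, or
     * rethrows the last exception once all attempts have failed.
     */
    public static int sendWithRetry(Sender sender, String message,
                                    int maxAttempts, long backoffMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                sender.send(message);
                return attempt; // success: report how many attempts it took
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMs); // back off before the next attempt
                }
            }
        }
        throw last; // every attempt failed: surface the error to the caller
    }
}
```

In a test harness, the caller would log or persist any message for which
sendWithRetry finally throws, so that losses caused by exceptions can be told
apart from losses where send() returned normally.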

> Date: Sat, 7 Jun 2014 16:02:35 -0700
> Subject: Re: question about synchronous producer
> From: wangg...@gmail.com
> To: users@kafka.apache.org
> 
> I see. So previously you ran the test with acks=1?
> 
> Guozhang
> 
> 
> On Sat, Jun 7, 2014 at 7:24 AM, Libo Yu <yu_l...@hotmail.com> wrote:
> 
> > Hi Guozhang,
> >
> > The issue does not reproduce every time, but it is fairly easy to reproduce.
> > It seems there is some kind of failure that has not been captured by the
> > Kafka code and no exception has been thrown.
> >
> > I did a new test with request.required.acks set to -1. The number of lost
> > messages dropped significantly, but there was still message loss.
> >
> > > Date: Fri, 6 Jun 2014 08:12:17 -0700
> > > Subject: Re: question about synchronous producer
> > > From: wangg...@gmail.com
> > > To: users@kafka.apache.org
> > >
> > > Libo,
> > >
> > > I have double checked the code. With sync producers all failures should
> > > be either thrown as exceptions or logged in warning/error log entries.
> > >
> > > Guozhang
> > >
> > >
> > > On Thu, Jun 5, 2014 at 6:38 PM, Libo Yu <yu_l...@hotmail.com> wrote:
> > >
> > > > Not really. The issue was reported by a client. I added a lot of logging
> > > > to make sure no exception was thrown from send() when the message was
> > > > lost. It is not hard to reproduce. This is a critical issue for
> > > > operations. It may not be possible for brokers and producers to be
> > > > restarted at the same time.
> > > >
> > > > > Date: Thu, 5 Jun 2014 16:53:29 -0700
> > > > > Subject: Re: question about synchronous producer
> > > > > From: wangg...@gmail.com
> > > > > To: users@kafka.apache.org
> > > > >
> > > > > Libo, did you see any exception/error entries on the producer log?
> > > > >
> > > > > Guozhang
> > > > >
> > > > >
> > > > > On Thu, Jun 5, 2014 at 10:33 AM, Libo Yu <yu_l...@hotmail.com> wrote:
> > > > >
> > > > > > Yes. I used three sync producers with request.required.acks=1. I let
> > > > > > them publish 2k short messages, and during publishing I restarted all
> > > > > > ZooKeeper and Kafka processes (3 hosts in a cluster). Normally there
> > > > > > is message loss after 3 restarts. After the 3 restarts, I use a
> > > > > > consumer to retrieve the messages and do the verification.
> > > > > >
> > > > > > > Date: Thu, 5 Jun 2014 10:15:18 -0700
> > > > > > > Subject: Re: question about synchronous producer
> > > > > > > From: wangg...@gmail.com
> > > > > > > To: users@kafka.apache.org
> > > > > > >
> > > > > > > Libo,
> > > > > > >
> > > > > > > For clarification, can you reproduce this issue with the sync
> > > > > > > producer?
> > > > > > >
> > > > > > > Guozhang
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Jun 5, 2014 at 10:03 AM, Libo Yu <yu_l...@hotmail.com> wrote:
> > > > > > >
> > > > > > > > When all the brokers are down, the producer should retry a few
> > > > > > > > times and then throw FailedToSendMessageException. User code can
> > > > > > > > then catch the exception and retry after a backoff. However, in
> > > > > > > > my tests, no exception was caught and the message was lost
> > > > > > > > silently. My broker is 0.8.1.1 and my client is 0.8.0. It is
> > > > > > > > fairly easy to reproduce. Any insight on this issue?
> > > > > > > >
> > > > > > > > Libo
> > > > > > > >
> > > > > > > > > Date: Thu, 5 Jun 2014 09:05:27 -0700
> > > > > > > > > Subject: Re: question about synchronous producer
> > > > > > > > > From: wangg...@gmail.com
> > > > > > > > > To: users@kafka.apache.org
> > > > > > > > >
> > > > > > > > > When the producer has exhausted all the retries, it will drop
> > > > > > > > > the message on the floor. So when the broker is down for too
> > > > > > > > > long, there will be data loss.
> > > > > > > > >
> > > > > > > > > Guozhang
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Jun 5, 2014 at 6:20 AM, Libo Yu <yu_l...@hotmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > I want to know why there will be message loss when brokers
> > > > > > > > > > are down for too long. I've noticed message loss when brokers
> > > > > > > > > > are restarted during publishing. It is a sync producer with
> > > > > > > > > > request.required.acks set to 1.
> > > > > > > > > >
> > > > > > > > > > Libo
> > > > > > > > > >
> > > > > > > > > > > Date: Thu, 29 May 2014 20:11:48 -0700
> > > > > > > > > > > Subject: Re: question about synchronous producer
> > > > > > > > > > > From: wangg...@gmail.com
> > > > > > > > > > > To: users@kafka.apache.org
> > > > > > > > > > >
> > > > > > > > > > > Libo,
> > > > > > > > > > >
> > > > > > > > > > > That is correct. You may want to increase retry.backoff.ms
> > > > > > > > > > > in this case. In practice, if the brokers are down for too
> > > > > > > > > > > long, then data loss is usually inevitable.
> > > > > > > > > > >
> > > > > > > > > > > Guozhang
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Thu, May 29, 2014 at 2:55 PM, Libo Yu <yu_l...@hotmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi team,
> > > > > > > > > > > >
> > > > > > > > > > > > Assume I am using a synchronous producer and it has the
> > > > > > > > > > > > following default properties:
> > > > > > > > > > > >
> > > > > > > > > > > > message.send.max.retries = 3
> > > > > > > > > > > > retry.backoff.ms = 100
> > > > > > > > > > > >
> > > > > > > > > > > > I use the Java API Producer.send(message) to send a
> > > > > > > > > > > > message. While send() is being called, if the brokers are
> > > > > > > > > > > > shut down, what happens? Will send() retry 3 times with a
> > > > > > > > > > > > 100 ms interval and then fail silently? If I don't want to
> > > > > > > > > > > > lose any messages when the brokers are back online, what
> > > > > > > > > > > > should I do? Thanks.
> > > > > > > > > > > >
> > > > > > > > > > > > Libo
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > -- Guozhang
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > -- Guozhang
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > -- Guozhang
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > >
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> >
> >
> 
> 
> 
> -- 
> -- Guozhang
                                          
