Re: produce request failed: due to Leader not local for partition

Joel Koshy Mon, 24 Jun 2013 01:58:28 -0700

After we implement non-blocking IO for the producer, there may not be much
incentive left to use ack = 0, but this is an interesting idea - not just
for the controlled shutdown case, but also when leadership moves due to
say, a broker's zk session expiring. Will have to think about it a bit more.




On Mon, Jun 24, 2013 at 12:22 AM, Jason Rosenberg <j...@squareup.com> wrote:

> Yeah I am using ack = 0, so that makes sense.  I'll need to rethink that,
> it would seem.  It would be nice, wouldn't it, in this case, for the broker
> to realize this and just forward the messages to the correct leader.  Would
> that be possible?
>
> Also, it would be nice to have a second option to the controlled shutdown
> (e.g. controlled.shutdown.quiescence.ms), to allow the broker to wait after
> the controlled shutdown, a prescribed amount of time before actually
> shutting down the server. Then, I could set this value to something a
> little greater than the producer's 'topic.metadata.refresh.interval.ms'.
>  This would help with hitless rolling restarts too.  Currently, every
> producer gets a very loud "Connection Reset" with a tall stack trace each
> time I restart a broker.  Would be nicer to have the producers still be
> able to produce until the metadata refresh interval expires, then get the
> word that the leader has moved due to the controlled shutdown, and then
> start producing to the new leader, all before the shutting down server
> actually shuts down.  Does that seem feasible?
>
> Jason
>
>
> On Sun, Jun 23, 2013 at 8:23 PM, Jun Rao <jun...@gmail.com> wrote:
>
> > Jason,
> >
> > Are you using ack = 0 in the producer? This mode doesn't work well with
> > controlled shutdown (this is explained in the FAQ in
> > https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#).
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Sun, Jun 23, 2013 at 1:45 AM, Jason Rosenberg <j...@squareup.com> wrote:
> >
> > > I'm working on getting seamless rolling restarts for my kafka
> > > servers, running 0.8.  I have it so that each server will be restarted
> > > sequentially.  Each server takes itself out of the load balancer (e.g.
> > > sets a status that the lb will recognize, and then waits more than long
> > > enough for the lb to stop sending meta-data requests to that server).
> > > Then I initiate the shutdown (with controlled.shutdown.enable=true).
> > > This seems to work well, however, I occasionally see warnings like this
> > > in the log from the server, after restart:
> > >
> > > 2013-06-23 08:28:46,770  WARN [kafka-request-handler-2] server.KafkaApis -
> > > [KafkaApi-508818741] Produce request with correlation id 7136261 from
> > > client  on partition [mytopic,0] failed due to Leader not local for
> > > partition [mytopic,0] on broker 508818741
> > >
> > > This WARN seems to persistently repeat, until the producer client
> > > initiates a new meta-data request (e.g. every 10 minutes, by default).
> > > However, the producer doesn't log any errors/exceptions when the server
> > > is logging this WARN.
> > >
> > > What's happening here?  Is the message silently being forwarded on to
> > > the correct leader for the partition?  Is the message dropped?  Are
> > > these WARNS particularly useful?
> > >
> > > Thanks,
> > >
> > > Jason
> > >
> >
>
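
For readers following the thread, here is a minimal sketch of the producer
settings under discussion. It assumes the 0.8 producer property names
request.required.acks (the "ack" setting Jun and Joel refer to) and
topic.metadata.refresh.interval.ms (the 10 minute default Jason mentions).
The broker addresses and the class name ProducerConfigSketch are hypothetical,
and the values are illustrative rather than a recommendation from anyone in
the thread. With acks set to 1 the producer waits for the leader's response,
so a "Leader not local" error should reach the client instead of only being
logged on the broker.

```java
import java.util.Properties;

// Sketch only: builds the settings as plain java.util.Properties so that it
// compiles and runs without the Kafka jars on the classpath.
final class ProducerConfigSketch {

    static Properties build() {
        Properties props = new Properties();
        // Hypothetical broker addresses, used only to bootstrap metadata.
        props.put("metadata.broker.list",
                  "broker1.example.com:9092,broker2.example.com:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // 0 = do not wait for any broker response (the mode used in the thread);
        // 1 = wait until the partition leader has acknowledged the request.
        props.put("request.required.acks", "1");
        // Periodic metadata refresh, in milliseconds (600000 ms = 10 minutes).
        props.put("topic.metadata.refresh.interval.ms", "600000");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build();
        // Print the resulting settings for inspection.
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + "=" + props.getProperty(key));
        }
    }
}
```

In a real 0.8 client these properties would typically be wrapped in a
kafka.producer.ProducerConfig and handed to the producer; that step is left
out here so the sketch stays self-contained.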
