Re: Broker Shutdown / Leader handoff issue

Jun Rao Mon, 23 Feb 2015 16:57:07 -0800

Hmm, if you shut down the leader, the follower should get a socket
exception immediately.
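
A minimal sketch of that expectation at the OS socket level, not Kafka code: a blocking read returns as soon as the peer closes its end, long before the read timeout. The function name, the 0.2 s delay and the 30 s timeout are illustrative assumptions. Python reports the close as an empty read; other runtimes may surface it as an exception instead.

```python
import socket
import threading
import time


def read_after_peer_close(read_timeout_s=30.0):
    """Return (data, elapsed_seconds) for a blocking read whose peer closes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def leader():
        # Stand-in for the leader: accept one connection, then close it.
        conn, _ = listener.accept()
        time.sleep(0.2)
        conn.close()

    t = threading.Thread(target=leader, daemon=True)
    t.start()

    # Stand-in for the follower: a blocking read with a long timeout.
    follower = socket.create_connection(("127.0.0.1", port))
    follower.settimeout(read_timeout_s)
    start = time.monotonic()
    try:
        data = follower.recv(1024)  # blocks until data, EOF, error or timeout
    finally:
        elapsed = time.monotonic() - start
        follower.close()
        t.join()
        listener.close()
    return data, elapsed


if __name__ == "__main__":
    data, elapsed = read_after_peer_close()
    print(repr(data), round(elapsed, 2))
```

If the read only returns once the timeout expires, the leader's end of the connection was probably never closed.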


Thanks,

Jun

On Wed, Feb 18, 2015 at 7:01 AM, Philippe Laflamme <plafla...@hopper.com>
wrote:

> After further investigation, I've figured out that the issue is caused by
> the follower not processing messages from the controller until its
> ReplicaFetcherThread has shut down completely (which only happens when the
> socket times out).
>
> If the test waits for the socket to time out, the logs show that the
> ReplicaFetcherThread shuts down completely, and immediately thereafter, the
> UpdateMetadata requests get processed.
>
> Strangely, this happens even when controlled shutdown is enabled.
>
> This sounds related to KAFKA-612[1], which seems to have been fixed in
> 0.8.0. Are there other edge cases not covered by the fix? Is this a known
> problem in 0.8.1.1?
>
> Thanks,
> Philippe
> [1] https://issues.apache.org/jira/browse/KAFKA-612
>
> On Wed, Feb 18, 2015 at 12:21 AM, Philippe Laflamme <plafla...@hopper.com>
> wrote:
>
> > Hi,
> >
> > I'm trying to replicate a broker shutdown in unit tests. I've got a
> > simple cluster running with 2 brokers (and one ZK). I'm successfully able
> > to create a topic with a single partition and replication factor of 2.
> >
> > I'd like to test shutting down the current leader for the partition and
> > make sure my code handles the exceptions thrown such as
> > NotLeaderForPartitionException.
> >
> > I can't seem to shut down a broker and have the remaining one report that
> > it is now the leader for the partition. It looks as though the controller
> > successfully changes leadership, but the broker itself is unaware of the
> > change.
> >
> > Here's a gist of the (convoluted) logs[1].
> >
> > The sequence is as follows:
> > 1- start 1 ZK and 2 brokers
> > 2- create a topic (test-bogus) with 1 partition and a replication factor of 2
> > 3- wait for leadership
> > 4- ask the controller who is the leader
> > 5- ask all brokers who is the leader
> > 6- shut down the leader
> > 7- wait for leadership
> > 8- ask the controller who is the leader
> > 9- ask the remaining broker who is the leader
> >
> > Steps 4-6 appear here in the logs[2]
> > Steps 8-9 appear here[3]
> >
> > As you can see, the controller is aware of the leadership change, but not
> > the broker. I've activated controlled shutdown and this is still
> > happening. Any idea what may be causing this?
> >
> > I'm using Kafka 0.8.1.1 and ZK 3.4.5-cdh4.6
> >
> > I'm using a TopicMetadataRequest for asking the brokers and inspecting
> > ControllerContext.partitionLeadershipInfo to fetch leadership from the
> > Controller.
> >
> > Thanks
> > Philippe
> > [1] https://gist.github.com/plaflamme/60805bfe15ae0106304a
> > [2] https://gist.github.com/plaflamme/60805bfe15ae0106304a#file-gistfile1-txt-L153-L158
> > [3] https://gist.github.com/plaflamme/60805bfe15ae0106304a#file-gistfile1-txt-L227-L228
> >
>
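
The "wait for leadership" steps (3 and 7) in the quoted sequence can be sketched as a polling helper with an explicit deadline. This is only an illustration: `wait_for_leader` and `get_leader` are hypothetical names, and `get_leader` stands in for whatever lookup the test uses (for example a TopicMetadataRequest against the remaining broker). Given the follow-up above, the deadline would need to exceed the socket timeout for the broker to catch up.

```python
import time


def wait_for_leader(get_leader, expected, timeout_s=10.0, interval_s=0.1):
    """Poll get_leader() until it returns `expected` or the deadline passes.

    get_leader is a hypothetical zero-argument callable returning the
    broker id that the queried broker currently believes is the leader.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        leader = get_leader()
        if leader == expected:
            return leader
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"leader is {leader!r}, expected {expected!r} after {timeout_s}s"
            )
        time.sleep(interval_s)


if __name__ == "__main__":
    # Simulated broker view: reports the old leader (1) three times,
    # then the new leader (2).
    answers = iter([1, 1, 1, 2])
    print(wait_for_leader(lambda: next(answers), expected=2, interval_s=0.01))
```

Polling the controller's view and the broker's view separately with the same helper makes it clear which of the two is lagging when the test fails.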
