Hmm, if you shut down the leader, the follower should get a socket exception immediately.
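
As a point of reference, here is a minimal sketch in plain Java of the expectation above. It is not Kafka code, and the class and method names (`PeerCloseDemo`, `readAfterPeerClose`) are made up for illustration. It assumes a loopback connection where the "leader" side closes its socket cleanly: a blocking read on the other side would normally be expected to return end-of-stream (or fail with a reset) right away rather than wait for the configured read timeout.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class PeerCloseDemo {

    /**
     * Connects a client to a local server socket, closes the server side,
     * then performs a blocking read on the client.
     *
     * Returns { outcome, elapsedMillis } where outcome is:
     *   -1  end-of-stream (peer closed cleanly)
     *   -2  some other IOException (for example a connection reset)
     *   -3  the read timed out (peer close was NOT noticed in time)
     *  >=0  a byte was read (not expected here, nothing is ever written)
     */
    static long[] readAfterPeerClose(int soTimeoutMs) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {

            // Read timeout on the "follower" side of the connection.
            client.setSoTimeout(soTimeoutMs);

            // The "leader" accepts the connection and then goes away.
            Socket accepted = server.accept();
            accepted.close();

            long start = System.nanoTime();
            long outcome;
            try {
                outcome = client.getInputStream().read();
            } catch (SocketTimeoutException e) {
                outcome = -3;
            } catch (IOException e) {
                outcome = -2;
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            return new long[] { outcome, elapsedMs };
        }
    }

    public static void main(String[] args) throws IOException {
        long[] r = readAfterPeerClose(30_000);
        System.out.println("outcome=" + r[0] + " elapsedMs=" + r[1]);
    }
}
```

If a test instead sees the read block until the timeout, the peer's socket was probably never actually closed (for example, the broker process or its network threads were still alive).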
Thanks,

Jun

On Wed, Feb 18, 2015 at 7:01 AM, Philippe Laflamme <plafla...@hopper.com> wrote:

> After further investigation, I've figured out that the issue is caused by
> the follower not processing messages from the controller until its
> ReplicaFetcherThread has shutdown completely (which only happens when the
> socket times out).
>
> If the test waits for the socket to timeout, the logs show that the
> ReplicaFetcherThread shuts down completely, and immediately thereafter, the
> UpdateMetadata requests get processed.
>
> Strangely, this happens even when controlled shutdown is enabled.
>
> Sounds related to this[1] which seems to have been fixed in 0.8.0. Are
> there other edge cases not covered by the fix? Is this a known problem in
> 0.8.1.1?
>
> Thanks,
> Philippe
>
> [1] https://issues.apache.org/jira/browse/KAFKA-612
>
> On Wed, Feb 18, 2015 at 12:21 AM, Philippe Laflamme <plafla...@hopper.com>
> wrote:
>
> > Hi,
> >
> > I'm trying to replicate a broker shutdown in unit tests. I've got a simple
> > cluster running with 2 brokers (and one ZK). I'm successfully able to
> > create a topic with a single partition and replication factor of 2.
> >
> > I'd like to test shutting down the current leader for the partition and
> > make sure my code handles the exceptions thrown such as
> > NotLeaderForPartitionException.
> >
> > I can't seem to shutdown a broker and have the remaining one report that
> > it is now the leader for the partition. It looks as though the controller
> > successfully changes leadership, but the broker itself is unaware of the
> > change.
> >
> > Here's a gist of the (convoluted) logs[1].
> >
> > The sequence is as follows:
> > 1- start 1 ZK and 2 brokers
> > 2- create a topic (test-bogus) with 1 partition and 2 replication factor
> > 3- wait for leadership
> > 4- ask the controller who is the leader
> > 5- ask all brokers who is the leader
> > 6- shutdown leader
> > 7- wait for leadership
> > 8- ask the controller who is the leader
> > 9- ask the remaining broker who is the leader
> >
> > Steps 4-6 appear here in the logs[2]
> > Steps 8-9 appear here[3]
> >
> > As you can see, the controller is aware of the leadership change, but not
> > the broker. I've activated controlled shutdown and this is still happening.
> > Any idea what may be causing this?
> >
> > I'm using Kafka 0.8.1.1 and ZK 3.4.5-cdh4.6
> >
> > I'm using a TopicMetadataRequest for asking the brokers and inspecting
> > ControllerContext.partitionLeadershipInfo to fetch leadership from the
> > Controller.
> >
> > Thanks
> > Philippe
> >
> > [1] https://gist.github.com/plaflamme/60805bfe15ae0106304a
> > [2] https://gist.github.com/plaflamme/60805bfe15ae0106304a#file-gistfile1-txt-L153-L158
> > [3] https://gist.github.com/plaflamme/60805bfe15ae0106304a#file-gistfile1-txt-L227-L228