[jira] [Commented] (KAFKA-999) Controlled shutdown never succeeds until the broker is killed
[ https://issues.apache.org/jira/browse/KAFKA-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730439#comment-13730439 ]

Swapnil Ghike commented on KAFKA-999:

Since we need the leader broker's host:port to create a ReplicaFetcherThread, the easiest fix for this ticket's purpose seems to be to pass all leaders through LeaderAndIsrRequest:

val leaders = liveOrShuttingDownBrokers.filter(b => leaderIds.contains(b.id))
val leaderAndIsrRequest = new LeaderAndIsrRequest(partitionStateInfos, leaders, controllerId, controllerEpoch, correlationId, clientId)

Any suggestions on avoiding a wire protocol change?

Controlled shutdown never succeeds until the broker is killed
Key: KAFKA-999
URL: https://issues.apache.org/jira/browse/KAFKA-999
Project: Kafka
Issue Type: Bug
Components: controller
Affects Versions: 0.8
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Priority: Critical

A race condition between the broker's handling of the leaderAndIsr request and controlled shutdown can lead to a situation where controlled shutdown never succeeds and the only way to bounce the broker is to kill it. The root cause is that the broker uses a smart check to avoid fetching from a leader that is not alive according to the controller. This leads to the broker aborting a become-follower request. In cases where the replication factor is 2, leadership can never be transferred to the follower, since the follower keeps rejecting the become-follower request and stays out of the ISR. This causes controlled shutdown to fail forever.

One sequence of events that led to this bug is as follows:
- Broker 2 is leader and controller
- Broker 2 is bounced (uncontrolled shutdown)
- Controller fails over
- Controlled shutdown is invoked on broker 1
- Controller starts leader election for partitions that broker 2 led
- Controller sends a become-follower request with broker 1 as the leader to broker 2. At the same time, it does not include broker 1 in the alive broker list sent as part of the leader and ISR request
- Broker 2 rejects the leaderAndIsr request since the leader is not in the list of alive brokers
- Broker 1 fails to transfer leadership to broker 2 since broker 2 is not in the ISR
- Controlled shutdown can never succeed on broker 1

Since controlled shutdown is a config option, if there are bugs in controlled shutdown, there is no option but to kill the broker.
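A minimal sketch of the idea, using simplified stand-in classes (these are not the actual Kafka 0.8 request or cluster classes, only an illustration of the shape of the change): the controller filters its live-or-shutting-down broker set down to the brokers that lead at least one partition in the request, and the receiving broker resolves the leader's host:port from that list instead of rejecting the request because the leader is missing from the alive-broker list.
---
// Illustration only: simplified stand-ins, not Kafka 0.8's real classes.
case class Broker(id: Int, host: String, port: Int)
case class PartitionStateInfo(leaderId: Int, isr: Set[Int])
case class LeaderAndIsrRequest(partitionStateInfos: Map[(String, Int), PartitionStateInfo],
                               leaders: Set[Broker],
                               controllerId: Int,
                               controllerEpoch: Int,
                               correlationId: Int,
                               clientId: String)

object LeaderAndIsrSketch {
  // Controller side: include every broker that leads some partition in this request,
  // even one that is shutting down, so followers can still locate it.
  def buildRequest(partitionStateInfos: Map[(String, Int), PartitionStateInfo],
                   liveOrShuttingDownBrokers: Set[Broker],
                   controllerId: Int, controllerEpoch: Int,
                   correlationId: Int, clientId: String): LeaderAndIsrRequest = {
    val leaderIds = partitionStateInfos.values.map(_.leaderId).toSet
    val leaders = liveOrShuttingDownBrokers.filter(b => leaderIds.contains(b.id))
    LeaderAndIsrRequest(partitionStateInfos, leaders, controllerId, controllerEpoch,
                        correlationId, clientId)
  }

  // Broker side: a follower resolves the leader's endpoint from the request itself,
  // so it can start a fetcher even if the leader is not in the alive-broker list.
  def leaderEndpoint(request: LeaderAndIsrRequest, leaderId: Int): Option[(String, Int)] =
    request.leaders.find(_.id == leaderId).map(b => (b.host, b.port))
}
---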
Fixes for 0.8 in trunk, absent from 0.8 branch
Hey guys,

There are a few fixes which are marked for 0.8 and fixed in JIRA, but have only been pushed to trunk and are missing from 0.8:

https://issues.apache.org/jira/browse/KAFKA-925
https://issues.apache.org/jira/browse/KAFKA-852
https://issues.apache.org/jira/browse/KAFKA-995
https://issues.apache.org/jira/browse/KAFKA-985 (also affectsVersion is 0.9 and fixVersion is 0.8.1)
https://issues.apache.org/jira/browse/KAFKA-615

Were these mislabeled in JIRA, or should they be present on the 0.8 branch? At the same time there seem to be issues on the 0.8 branch which are absent from trunk (e.g. KAFKA-989).

Cosmin
[jira] [Commented] (KAFKA-347) change number of partitions of a topic online
[ https://issues.apache.org/jira/browse/KAFKA-347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730616#comment-13730616 ]

Cosmin Lehene commented on KAFKA-347:

Is there some operational documentation on how to use this?

change number of partitions of a topic online
Key: KAFKA-347
URL: https://issues.apache.org/jira/browse/KAFKA-347
Project: Kafka
Issue Type: Improvement
Components: core
Affects Versions: 0.8
Reporter: Jun Rao
Assignee: Sriram Subramanian
Labels: features
Fix For: 0.8.1
Attachments: kafka-347.patch, kafka-347-v2.patch, KAFKA-347-v2-rebased.patch, KAFKA-347-v3.patch, KAFKA-347-v4.patch, KAFKA-347-v5.patch

We will need an admin tool to change the number of partitions of a topic online.
[jira] [Commented] (KAFKA-989) Race condition shutting down high-level consumer results in spinning background thread
[ https://issues.apache.org/jira/browse/KAFKA-989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730787#comment-13730787 ]

Phil Hargett commented on KAFKA-989:

Thank you!

Race condition shutting down high-level consumer results in spinning background thread
Key: KAFKA-989
URL: https://issues.apache.org/jira/browse/KAFKA-989
Project: Kafka
Issue Type: Bug
Affects Versions: 0.8
Environment: Ubuntu Linux x64
Reporter: Phil Hargett
Assignee: Phil Hargett
Fix For: 0.8
Attachments: KAFKA-989-failed-to-find-leader.patch, KAFKA-989-failed-to-find-leader-patch2.patch, KAFKA-989-failed-to-find-leader-patch3.patch

An application that uses the Kafka client under load can often hit this issue within a few hours. High-level consumers come and go over this application's lifecycle, but there are a variety of defenses that ensure each high-level consumer lasts several seconds before being shut down. Nevertheless, some race is causing this background thread to continue long after the ZKClient it is using has been disconnected. Since the thread was spawned by a consumer that has already been shut down, the application has no way to find this thread and stop it.

Reported on the users-kafka mailing list 6/25 as "0.8 throwing exception 'Failed to find leader' and high-level consumer fails to make progress". The only remedy is to shut down the application and restart it. Externally detecting that this state has occurred is not pleasant: one needs to grep the log for repeated occurrences of the same exception.

Stack trace:

Failed to find leader for Set([topic6,0]): java.lang.NullPointerException
        at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:416)
        at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:413)
        at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
        at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:413)
        at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:409)
        at kafka.utils.ZkUtils$.getChildrenParentMayNotExist(ZkUtils.scala:438)
        at kafka.utils.ZkUtils$.getAllBrokersInCluster(ZkUtils.scala:75)
        at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:63)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
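For context on where a fix for this kind of race heads, a sketch (illustrative names, not the actual kafka.utils.ShutdownableThread code) of the pattern a background leader-finder loop needs so it cannot keep spinning after its consumer is shut down: re-check a shutdown flag on every iteration, and have shutdown wake the thread and wait until the loop has actually exited before the ZK client is closed.
---
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicBoolean

// Sketch of a shutdown-aware background loop (illustrative names, not Kafka's).
abstract class StoppableWorker(name: String) extends Thread(name) {
  private val running = new AtomicBoolean(true)
  private val done = new CountDownLatch(1)

  def doWork(): Unit   // one iteration, e.g. find leaders and add fetchers

  override def run(): Unit = {
    try {
      while (running.get) doWork()   // the flag is re-checked before every iteration
    } finally {
      done.countDown()
    }
  }

  def shutdown(): Unit = {
    running.set(false)
    interrupt()   // wake the thread if it is blocked inside doWork()
    done.await()  // return only after the loop has exited, so the ZK client can be
                  // closed afterwards without leaving a spinning thread behind
  }
}
---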
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730916#comment-13730916 ]

Jun Rao commented on KAFKA-992:

Thinking about this more. The same ZK issue can affect the controller and the consumer re-registration too. Should those be handled too?

Double Check on Broker Registration to Avoid False NodeExist Exception
Key: KAFKA-992
URL: https://issues.apache.org/jira/browse/KAFKA-992
Project: Kafka
Issue Type: Bug
Reporter: Neha Narkhede
Assignee: Guozhang Wang
Attachments: KAFKA-992.v1.patch, KAFKA-992.v2.patch, KAFKA-992.v3.patch, KAFKA-992.v4.patch

The current behavior of zookeeper for ephemeral nodes is that session expiration and ephemeral node deletion are not an atomic operation. The side effect of this zookeeper behavior in Kafka, for certain corner cases, is that ephemeral nodes can be lost even if the session is not expired. The sequence of events that can lead to lossy ephemeral nodes is as follows:

1. The session expires on the client; it assumes the ephemeral nodes are deleted, so it establishes a new session with zookeeper and tries to re-create the ephemeral nodes.
2. However, when it tries to re-create the ephemeral node, zookeeper throws back a NodeExists error code. This is legitimate during a session disconnect event (since zkclient automatically retries the operation and raises a NodeExists error). Also, by design, Kafka doesn't have multiple zookeeper clients create the same ephemeral node, so the Kafka server assumes the NodeExists is normal.
3. However, after a few seconds zookeeper deletes that ephemeral node. So from the client's perspective, even though the client has a new valid session, its ephemeral node is gone.

This behavior is triggered by very long fsync operations on the zookeeper leader. When the leader wakes up from such a long fsync operation, it has several sessions to expire, and the time between the session expiration and the ephemeral node deletion is magnified. Between these two operations, a zookeeper client can issue an ephemeral node creation operation that appears to have succeeded, but the leader later deletes the ephemeral node, leading to permanent ephemeral node loss from the client's perspective.

Thread from zookeeper mailing list: http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results
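A rough sketch of the double check the attached patches aim at (simplified; the real KAFKA-992 change lives in the broker's ZK utilities and handles more cases): on a NodeExists error, read the existing znode, and if it carries this broker's own host:port, treat it as the stale node left over from the expired session, wait for ZooKeeper to delete it, and retry instead of assuming the registration succeeded.
---
import org.I0Itec.zkclient.ZkClient
import org.I0Itec.zkclient.exception.ZkNodeExistsException

// Sketch only: the registration data, retry count and sleep are illustrative.
object BrokerRegistrationSketch {
  def registerBroker(zk: ZkClient, path: String, hostPort: String, maxRetries: Int = 10): Unit = {
    var attempts = 0
    var registered = false
    while (!registered && attempts < maxRetries) {
      try {
        zk.createEphemeral(path, hostPort)
        registered = true
      } catch {
        case _: ZkNodeExistsException =>
          val existing: String = zk.readData(path, true)  // null if the node is already gone
          if (existing == null || existing == hostPort) {
            // Either the stale node from our own expired session, or it was just deleted:
            // wait for ZooKeeper to finish the deletion and retry the creation.
            Thread.sleep(1000)
            attempts += 1
          } else {
            throw new RuntimeException("Registration node " + path + " is owned by another broker: " + existing)
          }
      }
    }
    if (!registered)
      throw new RuntimeException("Could not register broker at " + path + " after " + maxRetries + " retries")
  }
}
---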
Re: Fixes for 0.8 in trunk, absent from 0.8 branch
Yes. What is perhaps confusing is that 0.8 is essentially the release branch; only fixes required for 0.8 final are going there. All other development is on trunk, which will be kafka.next, which we are calling 0.8.1. Hope that helps.

-Jay

On Tue, Aug 6, 2013 at 9:25 AM, Jun Rao jun...@gmail.com wrote:
Actually, all of them are marked as fixed in 0.8.1.
Thanks,
Jun

On Tue, Aug 6, 2013 at 3:43 AM, Cosmin Lehene cleh...@adobe.com wrote:
Hey guys, There are a few fixes which are marked for 0.8 and fixed in JIRA but have only been pushed to trunk and missing from 0.8:
https://issues.apache.org/jira/browse/KAFKA-925
https://issues.apache.org/jira/browse/KAFKA-852
https://issues.apache.org/jira/browse/KAFKA-995
https://issues.apache.org/jira/browse/KAFKA-985 (also affectsVersion is 0.9 and fixVersion is 0.8.1)
https://issues.apache.org/jira/browse/KAFKA-615
Were this mislabeled in JIRA or should they be present on the 0.8 branch. At the same time there seem to be issues on the 0.8 branch which are absent from trunk (e.g. KAFKA-989).
Cosmin
[jira] [Commented] (KAFKA-999) Controlled shutdown never succeeds until the broker is killed
[ https://issues.apache.org/jira/browse/KAFKA-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731174#comment-13731174 ]

Swapnil Ghike commented on KAFKA-999:

Actually that's not needed, will get a patch out in a couple hours.
[jira] [Issue Comment Deleted] (KAFKA-999) Controlled shutdown never succeeds until the broker is killed
[ https://issues.apache.org/jira/browse/KAFKA-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swapnil Ghike updated KAFKA-999:
Comment: was deleted (was: Actually that's not needed, will get a patch out in a couple hours.)
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731187#comment-13731187 ]

Joel Koshy commented on KAFKA-992:

Delayed review - looks good to me, although I still don't see a benefit in storing the timestamp; the approach of retrying on NodeExists if the host and port are the same would remain the same, so it seems to be more for informative purposes. Let me know if I'm missing something.

@Jun, you have a point about the controller. It seems it may not be a problem there, since controller re-election will happen only after the data is actually deleted. For consumers it may not be an issue either, given that the consumer id string includes a random UUID.
[jira] [Updated] (KAFKA-990) Fix ReassignPartitionCommand and improve usability
[ https://issues.apache.org/jira/browse/KAFKA-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-990:
Attachment: KAFKA-990-v1.patch

Fix ReassignPartitionCommand and improve usability
Key: KAFKA-990
URL: https://issues.apache.org/jira/browse/KAFKA-990
Project: Kafka
Issue Type: Bug
Reporter: Sriram Subramanian
Assignee: Sriram Subramanian
Attachments: KAFKA-990-v1.patch

1. The tool does not register for the IsrChangeListener on controller failover.
2. There is a race condition where the previous listener can fire on controller failover and the replicas can be in ISR. Even after re-registering the ISR listener after failover, it will never be triggered.
3. The input to the tool is a static list, which is very hard to use. To improve this, as a first step the tool needs to take a list of topics and a list of brokers to do the assignment to, and then generate the reassignment plan (see the sketch after this message).
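On point 3, a small sketch of what "take a list of topics and a list of brokers and generate the reassignment plan" could look like (purely illustrative round-robin placement, not the actual KAFKA-990 patch):
---
// Illustrative only: a naive round-robin reassignment plan generator.
// topics maps topic name -> (number of partitions, replication factor).
object ReassignmentPlanSketch {
  def generatePlan(topics: Map[String, (Int, Int)],
                   targetBrokers: Seq[Int]): Map[(String, Int), Seq[Int]] = {
    require(targetBrokers.nonEmpty, "need at least one target broker")
    topics.flatMap { case (topic, (numPartitions, replicationFactor)) =>
      require(replicationFactor <= targetBrokers.size, "not enough brokers for the replication factor")
      (0 until numPartitions).map { partition =>
        // Start each partition at a different broker so leadership spreads out,
        // then take the next replicationFactor brokers in ring order.
        val replicas = (0 until replicationFactor)
          .map(r => targetBrokers((partition + r) % targetBrokers.size))
        (topic, partition) -> replicas
      }
    }
  }

  // Example: topic "events" with 4 partitions and replication factor 2 onto brokers 1, 2, 3
  // yields ("events",0)->Seq(1,2), ("events",1)->Seq(2,3), ("events",2)->Seq(3,1), ("events",3)->Seq(1,2).
  def main(args: Array[String]): Unit =
    println(generatePlan(Map("events" -> (4, 2)), Seq(1, 2, 3)))
}
---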
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731282#comment-13731282 ]

Joel Koshy commented on KAFKA-992:

ok nm the comment about timestamp. I had forgotten that nodeexists wouldn't be thrown if the data is the same.
[jira] [Updated] (KAFKA-999) Controlled shutdown never succeeds until the broker is killed
[ https://issues.apache.org/jira/browse/KAFKA-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swapnil Ghike updated KAFKA-999:
Attachment: kafka-999-v1.patch
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731337#comment-13731337 ]

Joel Koshy commented on KAFKA-992:

and nm for my comments about controller/consumers as well. For consumers, we don't regenerate the consumer id string. For controller, what can end up happening is:
- controller session expires and becomes the controller again (with the stale ephemeral node)
- another broker (whose session may not have expired) receives a watch when the stale ephemeral node is actually deleted
- so we can end up with two controllers in this scenario.
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731352#comment-13731352 ]

Neha Narkhede commented on KAFKA-992:

We just found a way to reliably reproduce the zookeeper bug and verify that the KAFKA-992 fix works. Now we can fix the controller and consumer the same way.
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731353#comment-13731353 ]

Guozhang Wang commented on KAFKA-992:

The zookeeper bug can be reproduced as follows:

1. Check out a clean 0.8 branch and revert the KAFKA-992 fix.
2. Build and create a server connecting to a Zookeeper instance (make sure maxClientCnxns=0 in the ZK config so that one IP address can create as many connections as wanted).
3. Load the Zookeeper with dummy sessions, each of which creates and maintains a thousand ephemeral nodes.
4. Write a script that pauses and resumes the Zookeeper process continuously, for example:
---
while true
do
  kill -STOP $1
  sleep 8
  kill -CONT $1
  sleep 60
done
---
5. When the Zookeeper process resumes, it will mark all those sessions as timed out, but since there are so many ephemeral nodes to delete, the server's registration node may not have been deleted yet when the server tries to re-register itself, so the server thinks it has registered successfully.
6. Later, Zookeeper deletes the server's registration node without the server being aware of it.
7. If we re-apply the KAFKA-992 patch and redo the same test setup, under similar conditions the server will wait and retry.
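Step 3 above ("load the Zookeeper with dummy sessions") can be approximated with something along these lines (a sketch; paths, counts and timeouts are arbitrary), using the same zkclient library the broker uses, so that the ZK leader has a large backlog of ephemeral nodes to clean up when the paused sessions finally expire:
---
import org.I0Itec.zkclient.ZkClient

// Sketch: open many sessions, each owning many ephemeral nodes, so that session
// expiry after the pause takes the ZooKeeper leader a long time to process.
object DummyZkLoad {
  def main(args: Array[String]): Unit = {
    val zkConnect = if (args.nonEmpty) args(0) else "localhost:2181"
    val clients = (0 until 50).map { s =>
      val zk = new ZkClient(zkConnect, 30000, 30000)      // session timeout, connection timeout
      zk.createPersistent("/loadtest/session-" + s, true) // create the persistent parent path
      (0 until 1000).foreach(i => zk.createEphemeral("/loadtest/session-" + s + "/node-" + i, "x"))
      zk
    }
    println("Holding " + clients.size + " sessions open; kill the process to release them.")
    Thread.sleep(Long.MaxValue)
  }
}
---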
[jira] [Comment Edited] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731353#comment-13731353 ]

Guozhang Wang edited comment on KAFKA-992 at 8/6/13 10:26 PM. The edited comment repeats the reproduction steps above and adds:

Since we can now reproduce the bug and verify the fix, the same fix will be applied to Controller and Consumer.
[jira] [Commented] (KAFKA-999) Controlled shutdown never succeeds until the broker is killed
[ https://issues.apache.org/jira/browse/KAFKA-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731427#comment-13731427 ]

Neha Narkhede commented on KAFKA-999:

Thanks for the patch, Swapnil. Overall, well thought through. A few minor comments:
1. ControllerChannelManager: leaderIds is not used anymore.
2. LeaderAndIsrRequest: the field actually means all the brokers in the cluster, so can we rename it from leaders to allBrokers? Same for Partition.scala and ReplicaManager.scala.
[jira] [Updated] (KAFKA-999) Controlled shutdown never succeeds until the broker is killed
[ https://issues.apache.org/jira/browse/KAFKA-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swapnil Ghike updated KAFKA-999:
Attachment: kafka-999-v2.patch

Thanks for pointing that out. Actually, in ControllerChannelManager we should rather pass liveOrShuttingDownBrokers.filter(b => leaderIds.contains(b.id)) as the leaders to LeaderAndIsrRequest. Attached patch v2.
[jira] [Updated] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang updated KAFKA-992:
Attachment: KAFKA-992.v5.patch
[jira] [Updated] (KAFKA-999) Controlled shutdown never succeeds until the broker is killed
[ https://issues.apache.org/jira/browse/KAFKA-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swapnil Ghike updated KAFKA-999:
Attachment: kafka-999-v3.patch
[jira] [Updated] (KAFKA-999) Controlled shutdown never succeeds until the broker is killed
[ https://issues.apache.org/jira/browse/KAFKA-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swapnil Ghike updated KAFKA-999:
Attachment: (was: LIKAFKA-269-v3.patch)
[jira] [Created] (KAFKA-1003) ConsumerFetcherManager should pass clientId as metricsPrefix to AbstractFetcherManager
Swapnil Ghike created KAFKA-1003:

Summary: ConsumerFetcherManager should pass clientId as metricsPrefix to AbstractFetcherManager
Key: KAFKA-1003
URL: https://issues.apache.org/jira/browse/KAFKA-1003
Project: Kafka
Issue Type: Bug
Reporter: Swapnil Ghike
Assignee: Swapnil Ghike

For consistency. We use clientId in the metric names elsewhere on clients.
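A sketch of the naming convention being asked for (the class and metric names here are hypothetical, only to illustrate the prefix): the consumer passes its clientId down as the metrics prefix, so fetcher-manager metrics carry the same client-scoped prefix used by the other client metrics.
---
// Hypothetical sketch, not the actual Kafka classes: the fetcher manager simply
// prefixes its metric names with whatever prefix it is constructed with.
class FetcherManagerMetricsSketch(metricsPrefix: String) {
  def metricName(metric: String): String = metricsPrefix + "-" + metric
}

object MetricsPrefixExample extends App {
  // Before: a generic prefix such as "ConsumerFetcherManager".
  // After: the consumer passes its clientId, so the metric lines up with other client metrics.
  val metrics = new FetcherManagerMetricsSketch("my-client-0")
  println(metrics.metricName("MaxLag"))   // prints: my-client-0-MaxLag
}
---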
[jira] [Updated] (KAFKA-1003) ConsumerFetcherManager should pass clientId as metricsPrefix to AbstractFetcherManager
[ https://issues.apache.org/jira/browse/KAFKA-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swapnil Ghike updated KAFKA-1003:
Attachment: kafka-1003.patch
[jira] [Updated] (KAFKA-999) Controlled shutdown never succeeds until the broker is killed
[ https://issues.apache.org/jira/browse/KAFKA-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede updated KAFKA-999:
Resolution: Fixed
Status: Resolved (was: Patch Available)

Committed v3 to 0.8.