[jira] [Comment Edited] (KAFKA-1120) Controller could miss a broker state change
[ https://issues.apache.org/jira/browse/KAFKA-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331093#comment-16331093 ]

Jeff Widman edited comment on KAFKA-1120 at 1/18/18 7:58 PM:

The issue description says "the broker will be in this weird state until it is restarted." Could this also be fixed by simply forcing a controller re-election by removing the /controller znode, since the new controller will re-identify the leaders? In some scenarios that seems like a lighter-weight workaround. I understand this does not fix the root cause in the code; I just want to be sure I understand what options I have if we hit this in an emergency.

was (Author: jeffwidman):
The issue description says "the broker will be in this weird state until it is restarted." Couldn't this also be fixed by simply forcing a controller re-election, since it will re-identify the leaders?

> Controller could miss a broker state change
>
> Key: KAFKA-1120
> URL: https://issues.apache.org/jira/browse/KAFKA-1120
> Project: Kafka
> Issue Type: Sub-task
> Components: core
> Affects Versions: 0.8.1
> Reporter: Jun Rao
> Assignee: Mickael Maison
> Priority: Major
> Labels: reliability
> Fix For: 1.1.0
>
> When the controller is in the middle of processing a task (e.g., preferred leader election, broker change), it holds a controller lock. During this time, a broker could have de-registered and re-registered itself in ZK. After the controller finishes processing the current task, it will start processing the logic in the broker change listener. However, it will see no broker change and therefore won't do anything to the restarted broker. This broker will be in a weird state since the controller doesn't inform it to become the leader of any partition. Yet, the cached metadata in other brokers could still list that broker as the leader for some partitions. Client requests routed to that broker will then get a TopicOrPartitionNotExistException. This broker will continue to be in this bad state until it's restarted again.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
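The znode-deletion workaround described above amounts to one ZooKeeper shell invocation. As a minimal sketch, the snippet below only builds and prints the command it would run (the ensemble address is a placeholder; adjust it to your deployment before actually executing anything):

```python
# Sketch only: construct the zookeeper-shell command that forces a controller
# re-election by deleting the ephemeral /controller znode. Nothing is executed;
# the ensemble address is a hypothetical placeholder.
import shlex


def controller_reelection_cmd(zk_connect="localhost:2181"):
    """Return the shell command that would delete the /controller znode."""
    return shlex.join(
        ["bin/zookeeper-shell.sh", zk_connect, "delete", "/controller"]
    )


print(controller_reelection_cmd())
```

Deleting /controller makes the surviving brokers race to re-create the ephemeral znode, and the winner runs a fresh controller failover, which is why it can clear the stale state described in this issue.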
[jira] [Comment Edited] (KAFKA-1120) Controller could miss a broker state change

[ https://issues.apache.org/jira/browse/KAFKA-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277146#comment-16277146 ]

Ramnatthan Alagappan edited comment on KAFKA-1120 at 12/4/17 5:46 PM:

I ran into this issue and have a reproducible setup irrespective of the number of partitions or nodes. [~onurkaraman]'s analysis in comment @ [#comment-16113645] is correct. The root cause is that the shutdown broker restarts and re-registers with ZK within a short interval. When the broker shuts down, ZK delivers a callback for the deletion of the broker's znode. Before ZKClient can re-establish the callback (by issuing a stat call), the broker registers with ZK again. By the time ZKClient reads the /brokers/ids node from ZK, the shutdown broker already appears in /brokers/ids. As a result, the shutdown broker appears in both curBrokerIds and liveOrShuttingDownBrokerIds, causing newBrokerIds to be empty, which causes this problem.
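The failure mode in the comment above boils down to a set difference. A minimal sketch of that computation (simplified from Kafka's Scala controller code; the function is illustrative, with names taken from the comment):

```python
# Illustrative sketch of how the controller derives "newly started" brokers.
# In the real (Scala) code this is a set difference between the broker ids
# just read from /brokers/ids and the controller's cached live set.
def new_broker_ids(cur_broker_ids, live_or_shutting_down_broker_ids):
    """Brokers the controller will treat as newly started."""
    return cur_broker_ids - live_or_shutting_down_broker_ids


# Normal restart: the watch fires while broker 3 is absent, so 3 is missing
# from the controller's cached set and is correctly detected as new.
assert new_broker_ids({1, 2, 3}, {1, 2}) == {3}

# The race described above: broker 3 re-registers before ZKClient re-reads
# /brokers/ids, so it appears on BOTH sides and the difference is empty.
# The controller therefore never sends the restarted broker any leadership
# state, leaving it in the bad state this issue describes.
assert new_broker_ids({1, 2, 3}, {1, 2, 3}) == set()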
[jira] [Comment Edited] (KAFKA-1120) Controller could miss a broker state change

[ https://issues.apache.org/jira/browse/KAFKA-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16142304#comment-16142304 ]

James Cheng edited comment on KAFKA-1120 at 8/25/17 10:04 PM:

[~onurkaraman], while you are figuring out a fix, can you recommend anything we can do to avoid triggering the scenario? We have high partition counts, and so encounter long ControlledShutdown events, which means we hit this quite often.

It seems the bug is triggered when the broker gets back into ZooKeeper before the controlled shutdown has finished being processed. So I should either make ControlledShutdown get processed faster, or make the broker get back into ZooKeeper more slowly. Would that be right? I can make ControlledShutdown faster by reducing the partition count, for example. I can make the broker get into ZooKeeper more slowly either by making shutdown take longer (increasing controlled.shutdown.retry.backoff.ms?) or by delaying startup ("sleep 60 && ./bin/kafka-server-start.sh").
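The "delay the restart" lever from the question above can be sketched as follows. This is a hypothetical illustration, not a tested recommendation: the delay value and start command are assumptions, and the demo stubs out the actual process launch.

```python
# Hypothetical sketch of the delayed-restart mitigation: sleep long enough
# for the controller to finish processing the ControlledShutdown before the
# broker re-registers in ZooKeeper. Roughly: sleep 60 && kafka-server-start.sh
import subprocess
import time


def delayed_restart(start_cmd, delay_s=60, runner=subprocess.run):
    """Wait delay_s seconds, then launch the broker via runner(start_cmd)."""
    time.sleep(delay_s)
    return runner(start_cmd)


# Stubbed demo: zero delay, and the runner just echoes the command back
# instead of starting a real broker.
print(delayed_restart(["./bin/kafka-server-start.sh", "config/server.properties"],
                      delay_s=0, runner=lambda cmd: cmd))
```

The right delay depends on how long the controller needs to drain the ControlledShutdown work for your partition count; 60 seconds is only the figure mentioned in the comment above.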
[jira] [Comment Edited] (KAFKA-1120) Controller could miss a broker state change

[ https://issues.apache.org/jira/browse/KAFKA-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16078619#comment-16078619 ]

Ismael Juma edited comment on KAFKA-1120 at 8/1/17 7:58 AM:

[~wushujames], it seems the problem you reproduced should have been fixed by KAFKA-5028. With that jira, a single thread processes every broker change and controlled shutdown request, so they won't interleave.
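The single-threaded model referred to above can be sketched with a queue drained by one worker, so no two events are ever handled concurrently. This is an illustrative toy, not Kafka's actual (Scala) implementation; the event names are placeholders:

```python
# Toy sketch of a single-threaded controller event loop: all events go
# through one queue and one thread, so a BrokerChange and a
# ControlledShutdown can never interleave.
import queue
import threading

events = queue.Queue()
log = []  # records the order in which events were handled


def controller_loop():
    while True:
        event = events.get()
        if event is None:      # shutdown sentinel
            break
        log.append(event)      # handlers run strictly one at a time


t = threading.Thread(target=controller_loop)
t.start()
for e in ["BrokerChange", "ControlledShutdown", "BrokerChange"]:
    events.put(e)
events.put(None)
t.join()
print(log)  # handled in submission order, never concurrently
```

Serializing events this way removes the lock-based interleaving that the issue description blames for the missed broker state change, at the cost of head-of-line blocking behind slow events.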
[jira] [Comment Edited] (KAFKA-1120) Controller could miss a broker state change

[ https://issues.apache.org/jira/browse/KAFKA-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107857#comment-16107857 ]

James Cheng edited comment on KAFKA-1120 at 7/31/17 7:53 PM:

[~noslowerdna] [~junrao], I retested this with Kafka 0.11. The problem still exists. I followed the steps from my 24/Feb/17 22:57 comment and ran them maybe 10 times in a row. Every single time, the broker I restarted came back up and did not take leadership for any partitions. In addition, it became a follower for only about half the partitions. The fact that it became a follower for half the partitions shows that the controller is at least aware the broker exists (that is, the controller successfully saw the broker come back online), but the controller didn't tell the broker to follow all the partitions it should have.