[ https://issues.apache.org/jira/browse/KAFKA-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Raoufeh Hashemian updated KAFKA-5780: ------------------------------------- Attachment: shutdown.log shutdown_statechange.log shutdown_controller.log > Long shutdown time when updated to 0.11.0 > ----------------------------------------- > > Key: KAFKA-5780 > URL: https://issues.apache.org/jira/browse/KAFKA-5780 > Project: Kafka > Issue Type: Bug > Components: core > Affects Versions: 0.11.0.0 > Environment: CentOS Linux release 7.3.1611 , Kernel 3.10 > Reporter: Raoufeh Hashemian > Assignee: Apurva Mehta > Fix For: 0.11.0.1 > > Attachments: broker_shutdown.png, shutdown_controller.log, > shutdown.log, shutdown_statechange.log > > > When we switched from Kafka 0.10.2 to Kafka 0.11.0 , We faced a problem with > stopping the kafka service on a broker node. > Our cluster consists of 6 broker nodes. We had an existing topic when > switched to Kafka 0.11.0 . Since then, gracefully stoping the service on a > Kafka broker node results in the following warning message being repeated > every 100 ms in the broker log, and the shutdown takes approximately 45 > minutes to complete. > {code:java} > @40000000599714da1e582e4c [2017-08-18 16:24:48,509] WARN Connection to node > 1002 could not be established. Broker may not be available. > (org.apache.kafka.clients.NetworkClient) > @40000000599714da245483a4 [2017-08-18 16:24:48,609] WARN Connection to node > 1002 could not be established. Broker may not be available. > (org.apache.kafka.clients.NetworkClient) > @40000000599714da2a51177c [2017-08-18 16:24:48,709] WARN Connection to node > 1002 could not be established. Broker may not be available. > (org.apache.kafka.clients.NetworkClient) > {code} > Below is the last log lines when the shutdown is complete : > {code:java} > @4000000059971afd31113dbc [2017-08-18 16:50:59,823] WARN Connection to node > 1002 could not be established. Broker may not be available. > (org.apache.kafka.clients.NetworkClient) > @4000000059971afd361200bc [2017-08-18 16:50:59,907] INFO Shutdown complete. > (kafka.log.LogManager) > @4000000059971afd36afa04c [2017-08-18 16:50:59,917] INFO Terminate ZkClient > event thread. (org.I0Itec.zkclient.ZkEventThread) > @4000000059971afd36dd6edc [2017-08-18 16:50:59,920] INFO Session: > 0x35d68c9e76702a4 closed (org.apache.zookeeper.ZooKeeper) > @4000000059971afd36deca84 [2017-08-18 16:50:59,920] INFO EventThread shut > down for session: 0x35d68c9e76702a4 (org.apache.zookeeper.ClientCnxn) > @4000000059971afd36f6afb4 [2017-08-18 16:50:59,922] INFO [Kafka Server 1002], > shut down completed (kafka.server.KafkaServer) > {code} > I should note that I stopped the producers before shutting down the broker. > If I repeat the process after brining up the service, the shutdown takes less > than a minute. However, if I start the producers even for a short time and > repeat the process, it will again take around 45 minutes to do a graceful > shutdown. > Attached files shows the brokers CPU usage during the shutdown period (light > blue curve is the node in which the broker service is shutting down). > The size of the topic is 2.3 TB per broker. > I was wondering if this is an expected behaviour or It is a bug or a > misconfiguration? -- This message was sent by Atlassian JIRA (v6.4.14#64029)