[
https://issues.apache.org/jira/browse/KAFKA-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673651#comment-13673651
]
Jun Rao commented on KAFKA-927:
-------------------------------
Thanks for patch v3. A few more comments:
30. KafkaServer:
30.1 Could you combine isShuttingDown and startupComplete?
30.2 In controlledShutdown(), it's not clear if it's worth caching the socket
channel. Technically, it's possible for a controller to come back on the broker
with the same id, but with a different broker host/port. It's simpler to just
always close the socket channel on each ControlledShutdownRequest and create a
new channel on retry.
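
A minimal sketch of the suggestion in 30.2, written in Java to keep it self-contained (this is not the code in the patch). Each attempt looks up the controller address again, opens a fresh channel, and closes it before the next retry, so a controller that returns on a different host/port is never reached through a stale cached channel. The names Channel, ChannelFactory, controllerLookup and maxRetries are hypothetical stand-ins for illustration only.

```java
import java.util.function.Supplier;

public class ControlledShutdownRetry {

    /** Hypothetical stand-in for a blocking channel to the controller. */
    public interface Channel extends AutoCloseable {
        /** Returns true if the controller acknowledged the shutdown request. */
        boolean sendControlledShutdown(int brokerId) throws Exception;

        @Override
        void close();
    }

    /** Hypothetical factory that opens a new channel to "host:port". */
    public interface ChannelFactory {
        Channel connect(String endpoint) throws Exception;
    }

    /**
     * Tries a controlled shutdown up to maxRetries times.
     * Nothing is cached between attempts: the controller endpoint is looked up
     * again and a new channel is created and closed on every attempt.
     */
    public static boolean controlledShutdown(int brokerId,
                                             int maxRetries,
                                             Supplier<String> controllerLookup,
                                             ChannelFactory factory) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            // Assumption: the lookup returns the current controller address.
            String endpoint = controllerLookup.get();
            // try-with-resources closes the channel whether or not the request succeeds.
            try (Channel channel = factory.connect(endpoint)) {
                if (channel.sendControlledShutdown(brokerId)) {
                    return true;
                }
            } catch (Exception e) {
                // Connection or request failure: fall through and retry with a new channel.
            }
            // A real implementation would likely back off here before retrying.
        }
        return false;
    }
}
```

The cost of this approach is one extra connection setup per retry, which seems small next to the risk of talking to the wrong endpoint.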
31. KafkaController:
31.1 Remove the unused import java.util.concurrent.{Semaphore
31.2 I think we still need to reset shuttingDownBrokerIds to empty in
onControllerFailover(). A controller may fail over during a controlled shutdown
and later regain the controllership. onBrokerFailure() is only called while the
controller is active, so shuttingDownBrokerIds may not be empty when the
controllership switches back.
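
To illustrate the scenario in 31.2, here is a small Java sketch (not the actual KafkaController code, which is Scala). It assumes a simplified controller whose callbacks are ignored while it is inactive. The active flag, resign(), shutdownBroker() and the snapshot accessor are hypothetical names; only shuttingDownBrokerIds, onControllerFailover() and onBrokerFailure() come from the comment above.

```java
import java.util.HashSet;
import java.util.Set;

public class ControllerStateSketch {
    private final Set<Integer> shuttingDownBrokerIds = new HashSet<>();
    private boolean active = false;

    /** Called when this broker becomes the controller (again). */
    public synchronized void onControllerFailover() {
        // Drop entries left over from an earlier term as controller.
        shuttingDownBrokerIds.clear();
        active = true;
    }

    /** Hypothetical: called when this broker loses the controllership. */
    public synchronized void resign() {
        active = false;
    }

    /** Hypothetical: records that a broker has started a controlled shutdown. */
    public synchronized void shutdownBroker(int brokerId) {
        if (active) {
            shuttingDownBrokerIds.add(brokerId);
        }
    }

    /** Only acts while the controller is active, as described in 31.2. */
    public synchronized void onBrokerFailure(int brokerId) {
        if (!active) {
            return; // missed while inactive, so the id would otherwise linger
        }
        shuttingDownBrokerIds.remove(brokerId);
    }

    /** Returns a defensive copy of the current set, for inspection. */
    public synchronized Set<Integer> shuttingDownBrokerIdsSnapshot() {
        return new HashSet<>(shuttingDownBrokerIds);
    }
}
```

Without the clear() call, a broker id recorded before the controller lost the controllership would still be in the set after it regains it, because the broker failure that should have removed it was never delivered.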
> Integrate controlled shutdown into kafka shutdown hook
> ------------------------------------------------------
>
> Key: KAFKA-927
> URL: https://issues.apache.org/jira/browse/KAFKA-927
> Project: Kafka
> Issue Type: Bug
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Attachments: KAFKA-927.patch, KAFKA-927-v2.patch,
> KAFKA-927-v2-revised.patch, KAFKA-927-v3.patch
>
>
> The controlled shutdown mechanism should be integrated into the software for
> better operational benefits. Also, a few optimizations can be made to reduce
> unnecessary RPC and ZooKeeper calls. This patch has been tested in a
> production-like environment by doing rolling bounces continuously for a day.
> Without this patch, the average time of a rolling bounce with controlled
> shutdown on a 7-node cluster is 340 seconds. With this patch it drops to 220
> seconds. The patch also ensures correctness in scenarios where the controller
> shrinks the ISR and the new leader could place the broker being shut down
> back into the ISR.