[
https://issues.apache.org/jira/browse/KAFKA-340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joel Koshy updated KAFKA-340:
-----------------------------
Attachment: KAFKA-340-v1.patch
Short summary of this implementation of clean shutdown:
- Shutdown is triggered through a JMX operation on the controller.
- Steps during shutdown:
  - Record the broker as shutting down in the controller context.
    - This set will contain all shutting-down brokers until they are
      actually taken down. The liveBrokers set will mask these out
      (through the custom getter/setter).
  - Send a "wildcard" StopReplica request to the broker to stop its
    replica fetchers. This causes the broker to fall out of ISR sooner
    (as explained in the previous comment).
  - Identify partitions with replication factor > 1 that are led by the
    broker.
  - Transition leadership of each such partition to another broker in its
    ISR.
  - Return the number of partitions that are still led by the broker.
    (A rough sketch of this flow follows this list.)
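
To make the flow concrete, here is a minimal sketch of the controller-side
logic in Scala. Every name in it (PartitionState, ControllerSketch,
sendWildcardStopReplica, ...) is an illustrative stand-in, not the actual
code in the patch:

    import scala.collection.mutable

    // Toy model of a partition's replica assignment and ISR.
    case class PartitionState(topicAndPartition: String, replicas: Set[Int],
                              isr: Set[Int], var leader: Int)

    class ControllerSketch(partitions: Seq[PartitionState]) {
      // Shutting-down brokers; masked out of liveBrokers in the patch.
      val shuttingDownBrokerIds = mutable.Set[Int]()

      // Stand-in for the "wildcard" StopReplica request (no partitions
      // listed), which stops the broker's replica fetchers so that it
      // falls out of ISR sooner.
      private def sendWildcardStopReplica(brokerId: Int): Unit =
        println(s"StopReplica(all partitions) -> broker $brokerId")

      // Returns the number of partitions still led by the broker;
      // 0 corresponds to the "complete" shutdown status.
      def shutdownBroker(brokerId: Int): Int = {
        shuttingDownBrokerIds += brokerId
        sendWildcardStopReplica(brokerId)
        for (p <- partitions if p.leader == brokerId && p.replicas.size > 1) {
          // Move leadership to any other in-sync replica, if one exists.
          (p.isr - brokerId).headOption.foreach(b => p.leader = b)
        }
        partitions.count(_.leader == brokerId)
      }
    }
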
In practice, the way you would do a clean shutdown is:
- Use the admin tool: ./bin/kafka-run-class.sh kafka.admin.ShutdownBroker
  --broker <bid> --zookeeper <zkconnect>
- If the shutdown status that it prints is "complete", then broker <bid>
  has stopped its replica fetchers and does not lead any partitions. In
  this case, send a SIGTERM to the Kafka process to actually take down
  the broker.
- If the shutdown status that it prints is "incomplete", then you may want
  to wait a bit before retrying - which would typically make sense in a
  rolling bounce.
- If you are bringing down the entire cluster, you will eventually hit the
"incomplete" status - since there will be insufficient brokers to move the
partition leadership to. In this case the operator presumably knows the
situation and will proceed to do an "unclean" shutdown on the remaining
brokers.
- If the JMX operation itself fails (say, due to a controller failover),
  simply retry. (A hypothetical sketch of the JMX invocation follows this
  list.)
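
For reference, here is a hypothetical sketch (in Scala) of the kind of JMX
invocation the admin tool would make against the controller. The MBean
name, operation name, parameter signature, and host/port are assumptions
for illustration, not taken from the patch:

    import javax.management.ObjectName
    import javax.management.remote.{JMXConnectorFactory, JMXServiceURL}

    object ShutdownBrokerJmxSketch {
      def main(args: Array[String]): Unit = {
        val brokerId = 2  // broker to shut down
        // Assumed JMX endpoint of the current controller.
        val url = new JMXServiceURL(
          "service:jmx:rmi:///jndi/rmi://controller-host:9999/jmxrmi")
        val connector = JMXConnectorFactory.connect(url)
        try {
          val mbs = connector.getMBeanServerConnection
          // Assumed MBean and operation names.
          val mbean = new ObjectName("kafka.controller:type=KafkaController")
          // The operation returns the number of partitions still led by
          // the broker; 0 corresponds to the "complete" status.
          val remaining = mbs.invoke(mbean, "shutdownBroker",
            Array[AnyRef](Int.box(brokerId)), Array("int")).asInstanceOf[Int]
          println(if (remaining == 0) "complete"
                  else s"incomplete ($remaining partitions still led)")
        } finally connector.close()
      }
    }

If the connection or the invocation throws (e.g., on a controller
failover), the caller would simply retry against the new controller, as
noted above.
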
Other comments:
- I initially thought to return a boolean from handleStateChange, but
  needed to query the actual count of moved partitions, so I did away with
  that.
- Also, I considered using a zkpath (instead of JMX), but did not do this
  because it would effectively lock the zkclient event thread until all
  partition leadership moves had been attempted. In this implementation,
  the controller context's lock is relinquished after each partition move
  (see the sketch after this list). Another benefit of JMX over the zkpath
  is that it is convenient to return the shutdown status, so there is no
  need for a follow-up status check.
- For stopping the replica fetchers, I simply used a "wildcard" StopReplica
  request - i.e., one without any partitions listed. The broker will not
  receive any more LeaderAndIsr requests (since it is no longer exposed
  under liveBrokers), so the fetchers will not restart.
- I added a slightly dumb unit test (in addition to local stand-alone
testing), but we will need a more rigorous system test for this.
- Please let me know if you can think of corner cases to test for.
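
As a minimal sketch of the per-partition locking pattern mentioned above
(the lock type and all names here are hypothetical; the patch uses the
controller context's own lock):

    import java.util.concurrent.locks.ReentrantLock

    class PerPartitionLocking(controllerLock: ReentrantLock) {
      // Acquire and release the lock around each individual leadership
      // move, rather than holding it across the whole shutdown, so other
      // operations that need the controller lock can interleave.
      def moveLeadership(partitions: Seq[String])
                        (moveOne: String => Unit): Unit =
        for (partition <- partitions) {
          controllerLock.lock()
          try moveOne(partition)           // attempt one leadership move
          finally controllerLock.unlock()  // relinquish before the next
        }
    }
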
> Implement clean shutdown in 0.8
> -------------------------------
>
> Key: KAFKA-340
> URL: https://issues.apache.org/jira/browse/KAFKA-340
> Project: Kafka
> Issue Type: Sub-task
> Components: core
> Affects Versions: 0.8
> Reporter: Jun Rao
> Assignee: Joel Koshy
> Priority: Blocker
> Labels: bugs
> Attachments: KAFKA-340-v1.patch
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> If we are shutting down a broker when the ISR of a partition includes only
> that broker, we could lose some messages that have been previously committed.
> For clean shutdown, we need to guarantee that there is at least 1 other
> broker in ISR after the broker is shut down.