[ 
https://issues.apache.org/jira/browse/KAFKA-340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-340:
-----------------------------

    Attachment: KAFKA-340-v1.patch

Short summary of this implementation of clean shutdown:

- Shutdown is triggered through a JMX operation on the controller.
- Steps during shutdown:
  - Record the broker as shutting down in controller context.
    - This set will contain all shutting-down brokers until they are
      actually taken down. The liveBroker set will mask these (through the
      custom getter/setter).
  - Send a "wildcard" StopReplica request to the broker to stop its replica
    fetchers. This will cause it to fall out of ISR sooner (as explained in
    the previous comment).
  - Identify partitions with replication factor > 1 that are led by the
    broker
  - Transition leadership to another broker in ISR
- Return the number of remaining partitions that are led by the broker.
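
The controller-side steps above can be sketched roughly as follows. This
is an illustrative Python sketch, not the actual Scala patch; all names
(ControllerContext, shutdown_broker, the dict-based partition state) are
hypothetical stand-ins for the real controller data structures.

```python
# Hypothetical sketch of the controlled-shutdown steps described above.
# The real implementation lives in the Scala controller code.

class ControllerContext:
    def __init__(self, brokers, partitions):
        self.all_brokers = set(brokers)   # brokers that are up
        self.shutting_down = set()        # brokers being cleanly shut down
        # partition -> {"leader": id, "isr": [ids], "replicas": [ids]}
        self.partitions = partitions

    @property
    def live_brokers(self):
        # The live set masks shutting-down brokers, so they receive no
        # further leaderAndIsr requests (the "custom getter" above).
        return self.all_brokers - self.shutting_down


def shutdown_broker(ctx, broker_id):
    """Return the number of partitions still led by the broker."""
    # 1. Record the broker as shutting down in controller context.
    ctx.shutting_down.add(broker_id)

    # 2. (Real controller: send a wildcard StopReplica request here so
    #    the broker stops its replica fetchers.)

    # 3. For partitions with replication factor > 1 led by this broker,
    #    transition leadership to another broker in ISR.
    for state in ctx.partitions.values():
        if state["leader"] == broker_id and len(state["replicas"]) > 1:
            candidates = [b for b in state["isr"]
                          if b != broker_id and b in ctx.live_brokers]
            if candidates:
                state["leader"] = candidates[0]
                state["isr"] = [b for b in state["isr"] if b != broker_id]

    # 4. Report how many partitions the broker still leads.
    return sum(1 for s in ctx.partitions.values()
               if s["leader"] == broker_id)
```

Note that a replication-factor-1 partition has nowhere to move its leader,
which is why the returned count can stay non-zero.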

In practice, the way you would do clean shutdown is:
- Use the admin tool: ./bin/kafka-run-class.sh kafka.admin.ShutdownBroker
  --broker <bid> --zookeeper <zkconnect>
- If the shutdown status that it prints out is "complete" then it means
  broker <bid> has stopped its replica fetchers, and does not lead any
  partitions. In this case, send a SIGTERM to the Kafka process to actually
  take down the broker.
- If the shutdown status that it prints is "incomplete" then you may want
  to wait a bit before retrying - which would typically make sense in a
  rolling bounce.
- If you are bringing down the entire cluster, you will eventually hit the
  "incomplete" status - since there will be insufficient brokers to move the
  partition leadership to. In this case the operator presumably knows the
  situation and will proceed to do an "unclean" shutdown on the remaining
  brokers.
- If the jmx operation itself fails (say due to a controller failover),
  simply retry.
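
The retry procedure above could be scripted along these lines. This is a
hypothetical sketch: invoke_shutdown stands in for the JMX operation (or
the ShutdownBroker admin tool), and sigterm stands in for actually
signalling the Kafka process; neither is a real Kafka API.

```python
import time

def controlled_bounce(broker_id, invoke_shutdown, sigterm,
                      max_retries=10, backoff_secs=5):
    """Retry clean shutdown until the broker leads no partitions, then
    actually take it down.

    invoke_shutdown(broker_id) is a hypothetical stand-in for the JMX
    operation; it returns "complete" once the broker has stopped its
    fetchers and leads no partitions, and may raise on controller
    failover, in which case we simply retry.
    """
    for _ in range(max_retries):
        try:
            status = invoke_shutdown(broker_id)
        except Exception:
            time.sleep(backoff_secs)   # e.g. controller failover: retry
            continue
        if status == "complete":
            sigterm(broker_id)         # now safe to SIGTERM the process
            return True
        time.sleep(backoff_secs)       # "incomplete": wait, then retry
    return False   # give up; operator decides on an "unclean" shutdown
```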

Other comments:

- I initially thought of having handleStateChange return a boolean, but
  needed to query the actual count of moved partitions, so I did away with
  that.
- I also considered using a zkpath (instead of jmx), but did not do this
  because we would effectively lock the zkclient event thread until all
  partition leadership moves are attempted. In this implementation the
  controller context's lock is relinquished after moving each partition.
  Another benefit of jmx over the zkpath is that it is convenient to return
  the shutdown status, so there is no need for a follow-up status check.
- For stopping the replica fetchers, I simply used a "wildcard" StopReplica
  request - i.e., without any partitions listed. The broker will not get any
  more leaderAndIsr requests (since it is no longer exposed under
  liveBrokers) so the fetchers will not restart.
- I added a slightly dumb unit test (in addition to local stand-alone
  testing), but we will need a more rigorous system test for this.
- Please let me know if you can think of corner cases to test for.
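
The wildcard StopReplica semantics described above can be sketched as
follows. This is a hypothetical broker-side sketch, not the actual Kafka
0.8 request-handling code; the function name and set-based fetcher model
are illustrative.

```python
def handle_stop_replica(request_partitions, fetchers):
    """Treat an empty partition list as a wildcard that stops all
    replica fetchers; otherwise stop only the listed partitions.

    fetchers is modelled as a mutable set of partition names whose
    fetchers are currently running (an illustrative simplification).
    Returns the set of partitions whose fetchers were stopped.
    """
    if not request_partitions:          # wildcard StopReplica request
        stopped = set(fetchers)
        fetchers.clear()
        return stopped
    stopped = {p for p in request_partitions if p in fetchers}
    for p in stopped:
        fetchers.remove(p)
    return stopped
```

Since the shutting-down broker is no longer exposed under liveBrokers, it
receives no further leaderAndIsr requests, so the fetchers stopped here do
not restart.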

                
> Implement clean shutdown in 0.8
> -------------------------------
>
>                 Key: KAFKA-340
>                 URL: https://issues.apache.org/jira/browse/KAFKA-340
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: core
>    Affects Versions: 0.8
>            Reporter: Jun Rao
>            Assignee: Joel Koshy
>            Priority: Blocker
>              Labels: bugs
>         Attachments: KAFKA-340-v1.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> If we are shutting down a broker when the ISR of a partition includes only 
> that broker, we could lose some messages that have been previously committed. 
> For clean shutdown, we need to guarantee that there is at least 1 other 
> broker in ISR after the broker is shut down.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
