[ 
https://issues.apache.org/jira/browse/KAFKA-340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479056#comment-13479056
 ] 

Jun Rao commented on KAFKA-340:
-------------------------------

Thanks for the patch. Looks good overall. Some comments:

10. KafkaController.shutdownBroker:
10.1 Just stopping the follower on the broker being shut down doesn't make it 
fall out of the leader's ISR any faster, because the leader still has to wait 
for the timeout before dropping that broker from the ISR. Instead, the 
controller needs to shrink the ISR itself and send a leaderAndIsr request to 
each of the leaders. If we do this, there is probably no need for the wildcard 
stopReplica request.
10.2 It's better to use partitionsToMove in the following statement:
    debug("Partitions with replication factor > 1 for which broker %d is leader: %s"
          .format(id, replicatedPartitionsBrokerLeads.mkString(",")))
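
To make 10.1 concrete, here is a rough sketch of the ISR shrink the controller 
could compute before sending leaderAndIsr requests. The LeaderAndIsr case class 
and the shrinkIsr helper below are simplified, hypothetical stand-ins, not the 
actual controller code, and partitions are keyed by a plain string for brevity.

```scala
// Hypothetical, simplified state; not the real Kafka controller classes.
case class LeaderAndIsr(leader: Int, isr: List[Int])

object IsrShrinkSketch {
  // For every partition where the shutting-down broker is a follower that is
  // still in the ISR, return the shrunk state the controller would send to
  // that partition's leader. Partitions it leads are handled by leader election.
  def shrinkIsr(state: Map[String, LeaderAndIsr], brokerId: Int): Map[String, LeaderAndIsr] =
    state.collect {
      case (partition, LeaderAndIsr(leader, isr))
          if leader != brokerId && isr.contains(brokerId) =>
        partition -> LeaderAndIsr(leader, isr.filterNot(_ == brokerId))
    }
}
```

Only the partitions returned here would need a leaderAndIsr request, so the 
leader does not have to wait for its own timeout to drop the follower.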

11. IsrPartitionLeaderSelector:
11.1 The name seems very general. Could we rename it to something like 
controlledShutdownLeaderElector?
11.2 In the leader election logic, there is no need to check that the new 
leader is not the current leader. The customized 
controllerContext.liveBrokerIds should already have filtered out the current 
leader (the broker being shut down by the JMX operation).
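
A minimal sketch of what 11.2 implies, assuming liveBrokerIds has already had 
the shutting-down broker removed. The object name and the selectLeader 
signature are made up for illustration and are not taken from the patch.

```scala
object ControlledShutdownLeaderElectorSketch {
  // Pick the first ISR member that is still live. Because liveBrokerIds is
  // assumed to exclude the broker being shut down, no separate
  // "new leader != current leader" check is needed.
  def selectLeader(isr: Seq[Int], liveBrokerIds: Set[Int]): Option[Int] =
    isr.find(liveBrokerIds.contains)
}
```

If the ISR contains only the broker being shut down, this returns None, which 
is the case where shutdown cannot proceed cleanly for that partition.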

12. StopReplicaRequest: Agree with Neha here. We need to add a flag to 
distinguish between the case where we just want to stop the replica and the 
case where we want to stop the replica and delete its data. The latter will be 
used when reassigning partitions (and when deleting topics in the future).
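
One possible shape for the flag in 12, purely as an illustration. The field 
name deletePartitions and the actions helper are assumptions, not the actual 
request format.

```scala
// Hypothetical request shape; field names are illustrative only.
case class StopReplicaRequest(deletePartitions: Boolean, partitions: Set[(String, Int)])

object StopReplicaSketch {
  // Describe what a broker would do for each (topic, partition) in the request.
  def actions(req: StopReplicaRequest): Seq[String] =
    req.partitions.toSeq.sorted.map { case (topic, partition) =>
      val action =
        if (req.deletePartitions) "stop replica and delete data"
        else "stop replica only"
      s"$topic-$partition: $action"
    }
}
```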


                
> Implement clean shutdown in 0.8
> -------------------------------
>
>                 Key: KAFKA-340
>                 URL: https://issues.apache.org/jira/browse/KAFKA-340
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: core
>    Affects Versions: 0.8
>            Reporter: Jun Rao
>            Assignee: Joel Koshy
>            Priority: Blocker
>              Labels: bugs
>         Attachments: KAFKA-340-v1.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> If we are shutting down a broker when the ISR of a partition includes only 
> that broker, we could lose some messages that have been previously committed. 
> For clean shutdown, we need to guarantee that there is at least 1 other 
> broker in ISR after the broker is shut down.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
