[ https://issues.apache.org/jira/browse/KAFKA-343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yang Ye updated KAFKA-343: -------------------------- Attachment: kafka_343.diff.3 20. KafkaController: 20.1 tryToBecomeController(): It seems that we don't need to first call ZkUtils.readDataMaybeNull and check if the controller exists or not. Instead, we can just call ZkUtils.createEphemeralPathExpectConflict directly. did that 20.2 allTopicPartitionAssignment and allPartitionReplicaAssignment are representing the same data in a slightly different form. Can we just keep one of them? Did that 20.3 leaderAndISRRecovery(): We need to turn on the Init flag for LeaderAndISRRequest since this method is the first request sent during controller failover. Also, instead of including all partitions in the request, we should only include partitions assigned to each broker. did it 20.4 initLeaders(): Instead of sending the same leaderAndISRRequest to each broker, we should only send to a broker partitions assigned to it. changed 20.5 onBrokerChanges: 20.5.1 It seems that the logic for handling new brokers in the same as leaderAndISRRecovery() and we can just reuse the logic. merged into a function "recoverLeaderAndISRFromZookeeper" 20.5.2 liveBrokerIds is allBrokerIds. fixed 20.5.3 Instead of trying to elect the leader of all partitions from ZK, the controller should cache the current leader of each partition and only try to elect the leader for partitions whose current leader is no long alive. This will save the # of ZK reads during broker failure. fixed 20.5.4 similar to initLeaders(), instead of sending the same leaderAndISRRequest to each broker, we should only send to a broker partitions assigned to it. fixed 21. BrokerChangeListener: 21.1 handleChildChange(): should we remove the TODO comment? removed 21.2 handleDeletedTopics(): similar to initLeaders(), instead of sending the same StopReplicaRequest to each broker, we should only send to a broker partitions assigned to it. fixed 22. KafkaApis.handleLeaderAndISRRequest(): If the IsInit flag is on, we should just call stopReplicaCbk to remove partitions that are to be deleted, instead of rewriting the logic already in stopRelicaCbk. fixed 23. KafkaZookeeper: remove unused imports fixed 24. LeaderAndISR: Just to be consistent with ZK versioning, should initialLeaderEpoc start from 0? changed to start from 0 25. Log.deleteWholeLog() is not needed since there is already LogManager.deleteLog This is not a problem since LogManager.deleteLog calls Log.deleteWholeLog() internally 26. Replica: isLeader() is not used. removed 27. ReplicaManager.maybeShrinkISR(): fix the indentation of the closing bracket fixed 28. StopReplicaRequest: DefaultAckTimeout: 1000 ms seems too long. How about 100ms? chagned 29. FetcherTest: tearDown(): there is no need to call fetcher.stopAllConnections() since fetcher.shutdown() does that already. changed 30. LeaderElectionTest.testLeaderElectionAndEpoch(): instead of adding sleep, could we change waitUntilLeaderIsElected so that it waits until the leader is a live broker. If you feel this is better handled in a separate jira, that fine too. Just create a new jira and provide enough details there. I'm going to open a new jira for it. > revisit the become leader and become follower state change operations using > V3 design > ------------------------------------------------------------------------------------- > > Key: KAFKA-343 > URL: https://issues.apache.org/jira/browse/KAFKA-343 > Project: Kafka > Issue Type: Sub-task > Components: core > Affects Versions: 0.8 > Reporter: Jun Rao > Assignee: Yang Ye > Fix For: 0.8 > > Attachments: kafka_343.diff.2, kafka_343.diff.3, kafka_343.patch > > > We need to reimplement become leader/follower using the controller model > described in > https://cwiki.apache.org/confluence/display/KAFKA/kafka+Detailed+Replication+Design+V3 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira