[ https://issues.apache.org/jira/browse/KAFKA-901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neha Narkhede updated KAFKA-901:
--------------------------------

    Attachment: kafka-901-v2.patch

Thanks for the great review!

1. KafkaController:
1.1 This is a good suggestion; however, it wouldn't suffice in the following cases:
- New broker startup: here we have to send the metadata for all partitions to the new brokers, whereas the leader and isr request only covers the affected partitions.
- Controller failover: here we have to send metadata for all partitions to all brokers.
- Partition reassignment: here we have to send another metadata request just to communicate the change in the isr and the other replicas.
For now, I've left the old calls to sendUpdateMetadataRequest commented out to show what has changed; I will remove those comments before check-in. I still think the update metadata request handling can be optimized to reach the brokers sooner, but every optimization comes with some risk. So I suggest we first focus on correctness and then optimize if it works on large deployments. (A sketch of the three propagation paths follows this comment.)
1.2 ControllerContext: I thought it was unintuitive to send only the broker id, without the broker information, for brokers that are offline. A bug was filed where users complained about exactly this. However, this change needs more thought to do correctly, so I might follow up with another patch to fix it properly; this patch doesn't include the change.
1.3 Fixed.
1.4.1 Fixed.
1.4.2 Correct. It fixes updating the partition leadership info while shrinking the isr, since the leader can also change in those cases; we use the partition leadership info when sending the update metadata request, so it should always be kept current (see the cache sketch below).
2, 3. Same concern as 1.2.
4. PartitionStateInfo:
4.1 We still need to send the number of replicas, i.e. the replication factor, to be able to deserialize the replica list correctly (see the serialization sketch below).
4.2 Good point, changed that.
5. Good observation. This was somehow left over in all controller state change requests. It didn't make sense, so I removed it from LeaderAndIsrRequest, StopReplicaRequest and UpdateMetadataRequest.
6. It really didn't make sense to me that the producer throws NoBrokersForPartitionException when in reality it had failed to fetch metadata. This will help us read errors better.
7. Good point, moved it to the toString() API of TopicMetadata.
8. AdminUtils:
8.1 Because it is a duplicate of the test in TopicMetadataTest.
8.2 Good point, added that and changed all tests to use it.
9. Ideally yes, but the old code was not retrying for any exception, including UnknownTopicOrPartitionException. I've changed DefaultEventHandler to retry no matter what exception it hits, so the test is changed to reflect that it shouldn't give up on UnknownTopicOrPartitionException but should instead retry sending the message and succeed (a retry sketch follows).
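To make the three cases in 1.1 concrete, here is a minimal sketch of the propagation paths, assuming a simplified sendUpdateMetadataRequest(brokerIds, partitions) helper. All class and method names below are illustrative, not the ones in the patch:

{code:scala}
// Illustrative sketch only -- not the actual KafkaController code.
class ControllerContextSketch {
  var liveBrokerIds: Seq[Int] = Seq.empty
  var allPartitions: Set[(String, Int)] = Set.empty
}

class ControllerSketch(context: ControllerContextSketch) {

  // Batches one UpdateMetadataRequest per target broker (details omitted).
  def sendUpdateMetadataRequest(brokerIds: Seq[Int],
                                partitions: Set[(String, Int)]): Unit = ()

  // New broker startup: the new broker has no metadata cache yet, so it
  // needs state for ALL partitions, not just the ones it will host.
  def onBrokerStartup(newBrokerIds: Seq[Int]): Unit =
    sendUpdateMetadataRequest(newBrokerIds, context.allPartitions)

  // Controller failover: the new controller cannot assume any broker's
  // cache is current, so every live broker gets every partition.
  def onControllerFailover(): Unit =
    sendUpdateMetadataRequest(context.liveBrokerIds, context.allPartitions)

  // Partition reassignment: a dedicated metadata request communicates the
  // changed isr and replica set for just the reassigned partition.
  def onPartitionReassignment(topic: String, partition: Int): Unit =
    sendUpdateMetadataRequest(context.liveBrokerIds, Set((topic, partition)))
}
{code}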
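For 1.4.2, a rough sketch of the caching concern, with invented names (LeaderIsrSketch, partitionLeadershipInfo) standing in for the real controller state:

{code:scala}
import scala.collection.mutable

// Illustrative sketch only; not the real controller cache.
case class LeaderIsrSketch(leader: Int, isr: List[Int])

class LeadershipCacheSketch {
  private val partitionLeadershipInfo =
    mutable.Map.empty[(String, Int), LeaderIsrSketch]

  // Shrinking the isr can also move the leader, so the cached entry is
  // replaced wholesale; a later UpdateMetadataRequest is built from this
  // cache, and a stale leader here would be propagated to every broker.
  def onIsrShrink(topic: String, partition: Int,
                  newLeader: Int, newIsr: List[Int]): Unit =
    partitionLeadershipInfo((topic, partition)) = LeaderIsrSketch(newLeader, newIsr)

  def get(topic: String, partition: Int): Option[LeaderIsrSketch] =
    partitionLeadershipInfo.get((topic, partition))
}
{code}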
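For 4.1, a hedged sketch of why the replication factor has to travel on the wire. This is not the real PartitionStateInfo serialization, just the general length-prefixed pattern it relies on:

{code:scala}
import java.nio.ByteBuffer

// Illustrative wire-format sketch -- not the real PartitionStateInfo.
// Both lists are variable-length, so each is preceded by its size; for
// the full replica list that size is exactly the replication factor.
case class PartitionStateInfoSketch(leader: Int,
                                    isr: List[Int],
                                    allReplicas: List[Int]) {
  def writeTo(buffer: ByteBuffer): Unit = {
    buffer.putInt(leader)
    buffer.putInt(isr.size)
    isr.foreach(r => buffer.putInt(r))
    buffer.putInt(allReplicas.size)            // replication factor
    allReplicas.foreach(r => buffer.putInt(r))
  }
}

object PartitionStateInfoSketch {
  def readFrom(buffer: ByteBuffer): PartitionStateInfoSketch = {
    val leader = buffer.getInt
    val isrSize = buffer.getInt
    val isr = List.fill(isrSize)(buffer.getInt)
    val replicationFactor = buffer.getInt      // must be read first...
    val allReplicas =
      List.fill(replicationFactor)(buffer.getInt)  // ...to know where the list ends
    PartitionStateInfoSketch(leader, isr, allReplicas)
  }
}
{code}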
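For 9, a minimal sketch of the retry semantics, assuming a generic send() callback rather than the real DefaultEventHandler internals:

{code:scala}
// Illustrative sketch of the retry behavior in 9 -- not the actual
// DefaultEventHandler. The point: no exception type, including
// UnknownTopicOrPartitionException, is exempt from retry.
object RetrySketch {
  def sendWithRetries[T](maxRetries: Int, backoffMs: Long)(send: () => T): T = {
    var lastError: Exception = null
    var attempt = 0
    while (attempt <= maxRetries) {
      try {
        return send()             // success on any attempt ends the loop
      } catch {
        case e: Exception =>      // retry regardless of the exception type
          lastError = e
          attempt += 1
          Thread.sleep(backoffMs) // the real handler also refreshes
                                  // metadata before the next attempt
      }
    }
    throw new RuntimeException("Failed to send after " + attempt + " attempts", lastError)
  }
}
{code}

Under this behavior, a message that initially hits UnknownTopicOrPartitionException (e.g. before metadata has propagated) is retried and eventually succeeds, which is what the updated test asserts.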
> Kafka server can become unavailable if clients send several metadata requests
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-901
>                 URL: https://issues.apache.org/jira/browse/KAFKA-901
>             Project: Kafka
>          Issue Type: Bug
>          Components: replication
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Blocker
>         Attachments: kafka-901.patch, kafka-901-v2.patch, metadata-request-improvement.patch
>
>
> Currently, if a broker is bounced without controlled shutdown and there are several clients talking to the Kafka cluster, each of the clients realizes the unavailability of leaders for some partitions. This leads to several metadata requests sent to the Kafka brokers. Since metadata requests are pretty slow, all the I/O threads quickly become busy serving them. This leads to a full request queue, which stalls the handling of finished responses, since the same network thread handles both requests and responses. In this situation, clients time out on metadata requests and send even more of them. This quickly makes the Kafka cluster unavailable.