[
https://issues.apache.org/jira/browse/KAFKA-901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Neha Narkhede updated KAFKA-901:
--------------------------------
Attachment: kafka-901-v2.patch
Thanks for the great review!
1. KafkaController:
1.1 This is a good suggestion; however, it wouldn't suffice in the following
cases (see the sketch below) -
- new broker startup - Here we have to send the metadata for all partitions to
the new brokers. The leader-and-isr request only carries the relevant partitions
- controller failover - Here we have to send metadata for all partitions to all
brokers
- partition reassignment - Here we have to send another metadata request just
to communicate the change in isr and other replicas.
For now, I've left the old calls to sendUpdateMetadataRequest commented out to
show what has changed. I will remove those comments before check-in. I still
think that the update metadata request handling can be optimized to make it
reach the brokers sooner, but every optimization will come with a risk. So I
suggest we first focus on correctness and then optimize if it works on large
deployments.
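Purely as a sketch of the reasoning above (the event and method names here are illustrative, not the actual KafkaController API): the first three events need metadata for all partitions, whereas a plain leader change is already covered by the partitions in the leader-and-isr request.
{code:scala}
sealed trait ControllerEvent
case object NewBrokerStartup extends ControllerEvent
case object ControllerFailover extends ControllerEvent
case object PartitionReassignment extends ControllerEvent
case class LeaderChange(affected: Set[String]) extends ControllerEvent

def partitionsToPropagate(event: ControllerEvent,
                          allPartitions: Set[String]): Set[String] =
  event match {
    // Full metadata must be pushed: the receiving broker(s) have no prior
    // state, or replica/isr info changed outside a leader election.
    case NewBrokerStartup | ControllerFailover | PartitionReassignment =>
      allPartitions
    // A leader-and-isr request already carries the affected partitions.
    case LeaderChange(affected) => affected
  }
{code}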
1.2 ControllerContext: I thought it was unintuitive to send only the broker id,
and not the full broker information, for brokers that are offline. There was a
bug filed for this where users complained about the same. However, this change
will need more thought to do correctly, so I might include another patch to fix
it properly. This patch doesn't have that change.
1.3 Fixed
1.4.1 Fixed
1.4.2 Correct, it is to fix updating the partition leadership info while
shrinking the isr, since the leader can also change in those cases. We use the
partition leadership info when sending the update metadata request, so it
should always be kept current (see the sketch below).
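A hedged sketch of that invariant, using hypothetical types (the real controller keeps much richer state than this): when the isr shrinks, the cached leadership info is rewritten in full because the leader itself may be the replica that dropped out.
{code:scala}
import scala.collection.mutable

// Hypothetical cache entry; illustrative only.
case class LeaderIsr(leader: Int, isr: List[Int])

def shrinkIsr(leadershipInfo: mutable.Map[String, LeaderIsr],
              partition: String, failedReplica: Int): Unit = {
  leadershipInfo.get(partition).foreach { current =>
    val newIsr = current.isr.filterNot(_ == failedReplica)
    // The leader may be the replica leaving the isr, so the leader field
    // must be refreshed too; -1 stands in for "no leader".
    val newLeader =
      if (current.leader == failedReplica) newIsr.headOption.getOrElse(-1)
      else current.leader
    // Keep the cache current: update metadata requests are built from it.
    leadershipInfo(partition) = LeaderIsr(newLeader, newIsr)
  }
}
{code}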
2,3. Same concern as 1.2
4. PartitionStateInfo:
4.1 We still need to send the total number of replicas (i.e. the replication
factor) to be able to deserialize the replica list correctly; a sketch follows.
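To illustrate why the count has to precede the list on the wire (field names are made up, this is not the actual PartitionStateInfo wire format):
{code:scala}
import java.nio.ByteBuffer

// Writer: the count (replication factor) precedes the replica ids.
def writeReplicas(buffer: ByteBuffer, replicas: Seq[Int]): Unit = {
  buffer.putInt(replicas.size)
  replicas.foreach(r => buffer.putInt(r))
}

// Reader: without the leading count there is no way to know where the
// replica list ends in the byte stream.
def readReplicas(buffer: ByteBuffer): Seq[Int] = {
  val replicationFactor = buffer.getInt
  (0 until replicationFactor).map(_ => buffer.getInt)
}
{code}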
4.2 Good point, changed that
5. Good observation. This was somehow left over in all controller state change
requests. It didn't make sense, so I removed it from LeaderAndIsrRequest,
StopReplicaRequest and UpdateMetadataRequest.
6. It really didn't make sense to me that the producer throws
NoBrokersForPartitionException when in reality it had failed to fetch metadata.
This change will make the errors easier to read (illustrated below).
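Roughly the idea, with hypothetical names (not the actual patch code): fail with an exception that names the real problem, the metadata fetch, instead of a misleading no-brokers error.
{code:scala}
// Illustrative exception type, not from the Kafka codebase.
class FetchMetadataFailedException(message: String, cause: Throwable)
  extends RuntimeException(message, cause)

// Wrap whatever fetch logic the caller supplies and rethrow with a
// message that says what actually went wrong.
def fetchTopicMetadata[T](topics: Set[String], fetch: Set[String] => T): T =
  try fetch(topics)
  catch {
    case e: Exception =>
      throw new FetchMetadataFailedException(
        "fetching topic metadata for topics %s failed".format(topics), e)
  }
{code}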
7. Good point, moved it to the toString() API of TopicMetadata (sketched below).
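A minimal sketch of that design choice (the fields are made up): the human-readable formatting lives in the metadata class's own toString instead of being rebuilt at every call site.
{code:scala}
case class TopicMetadataSketch(topic: String, partitions: Seq[Int],
                               errorCode: Short) {
  // Call sites just log the object; formatting is defined once here.
  override def toString: String =
    "TopicMetadata(topic: %s, partitions: %s, error: %d)"
      .format(topic, partitions.mkString(","), errorCode)
}
{code}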
8. AdminUtils:
8.1 Because it is a duplicate of the test in TopicMetadataTest.
8.2 Good point, added that and changed all tests to use it
9. Ideally yes. But the old code was not retrying for any exception, including
UnknownTopicOrPartitionException. I've changed DefaultEventHandler to retry no
matter what exception it hits (see the sketch below). So the test is changed to
reflect that it shouldn't give up on UnknownTopicOrPartitionException but
instead should retry sending the message and succeed.
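A hedged sketch of that retry policy, not the actual DefaultEventHandler code: retry on any exception, including UnknownTopicOrPartitionException, until the retry budget runs out.
{code:scala}
def sendWithRetries[T](maxRetries: Int)(send: () => T): T = {
  var attempts = 0
  var lastError: Exception = null
  while (attempts <= maxRetries) {
    try {
      return send()
    } catch {
      case e: Exception => // retry no matter which exception was hit
        lastError = e
        attempts += 1
    }
  }
  // Budget exhausted: surface the last failure to the caller.
  throw lastError
}
{code}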
> Kafka server can become unavailable if clients send several metadata requests
> -----------------------------------------------------------------------------
>
> Key: KAFKA-901
> URL: https://issues.apache.org/jira/browse/KAFKA-901
> Project: Kafka
> Issue Type: Bug
> Components: replication
> Affects Versions: 0.8
> Reporter: Neha Narkhede
> Assignee: Neha Narkhede
> Priority: Blocker
> Attachments: kafka-901.patch, kafka-901-v2.patch,
> metadata-request-improvement.patch
>
>
> Currently, if a broker is bounced without controlled shutdown and there are
> several clients talking to the Kafka cluster, each of the clients realizes the
> unavailability of leaders for some partitions. This leads to several metadata
> requests sent to the Kafka brokers. Since metadata requests are pretty slow,
> all the I/O threads quickly become busy serving them. This leads to a full
> request queue that stalls the handling of finished responses, since the same
> network thread handles requests as well as responses. In this situation,
> clients time out on metadata requests and send more metadata requests. This
> quickly makes the Kafka cluster unavailable.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira