[ https://issues.apache.org/jira/browse/KAFKA-901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede updated KAFKA-901:
--------------------------------

    Attachment: kafka-901-v2.patch

Thanks for the great review!

1. KafkaController:
1.1 This is a good suggestion; however, it wouldn't suffice in the following 
cases (a rough sketch of these call sites follows this item) -
- new broker startup - Here we have to send the metadata for all partitions to 
the new brokers. The leader and isr request only sends the relevant partitions.
- controller failover - Here we have to send metadata for all partitions to all 
brokers.
- partition reassignment - Here we have to send another metadata request just 
to communicate the change in isr and other replicas.

For now, I've left the old calls to sendUpdateMetadataRequest commented out to 
show what has changed. I will remove those comments before check-in. I still 
think that the update metadata request handling can be optimized so that it 
reaches the brokers sooner, but every optimization will come with a risk. So I 
suggest we first focus on correctness and then optimize if it works on large 
deployments.
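To make the three cases above concrete, here is a rough Scala sketch of where the 
controller would send the update metadata request in each case. The names used here 
(ControllerContextSketch, partitionLeadershipInfo, liveBrokerIds, and the 
sendUpdateMetadataRequest signature) are simplified stand-ins for illustration, not 
the actual KafkaController code.

{code}
import scala.collection.mutable

// Simplified stand-ins; not the real controller data structures.
case class TopicAndPartition(topic: String, partition: Int)

case class ControllerContextSketch(
  liveBrokerIds: Set[Int],
  partitionLeadershipInfo: mutable.Map[TopicAndPartition, Int]) // partition -> current leader

class KafkaControllerSketch(ctx: ControllerContextSketch) {

  // New broker startup: the new brokers need metadata for *all* partitions,
  // not just the partitions carried by the LeaderAndIsr request.
  def onBrokerStartup(newBrokerIds: Seq[Int]): Unit =
    sendUpdateMetadataRequest(newBrokerIds, ctx.partitionLeadershipInfo.keySet.toSet)

  // Controller failover: every live broker gets metadata for all partitions.
  def onControllerFailover(): Unit =
    sendUpdateMetadataRequest(ctx.liveBrokerIds.toSeq, ctx.partitionLeadershipInfo.keySet.toSet)

  // Partition reassignment: only the reassigned partitions changed, but the
  // isr/replica change still has to reach every live broker.
  def onPartitionReassignment(reassigned: Set[TopicAndPartition]): Unit =
    sendUpdateMetadataRequest(ctx.liveBrokerIds.toSeq, reassigned)

  private def sendUpdateMetadataRequest(brokerIds: Seq[Int],
                                        partitions: Set[TopicAndPartition]): Unit = {
    // In the real controller this would batch a PartitionStateInfo per partition
    // and send it through the controller-to-broker channel.
  }
}
{code}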

1.2 ControllerContext: I think it is unintuitive to send only the broker id, 
and not the full broker information, for brokers that are offline. There was a 
bug filed for this where users complained about exactly that. However, this 
change will need more thought to do correctly, so I might include another 
patch to fix it properly. This patch doesn't include that change.

1.3 Fixed
1.4.1 Fixed
1.4.2 Correct, it is to fix updating the partition leadership info while 
shrinking the isr, since the leader can also change in those cases. We use the 
partition leadership info while sending the update metadata request, so it 
should always be kept current.
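As a small illustration of 1.4.2, here is a hedged sketch of that write-back. The 
types below (LeaderAndIsrSketch, a plain (topic, partition) key) are simplified 
stand-ins, not the real controller classes.

{code}
import scala.collection.mutable

object IsrShrinkSketch {
  case class LeaderAndIsrSketch(leader: Int, isr: List[Int]) // simplified stand-in
  val NoLeader = -1

  val partitionLeadershipInfo = mutable.Map.empty[(String, Int), LeaderAndIsrSketch]

  // Shrinking the isr after a replica dies: the leader can change too, so the
  // in-memory leadership info must be updated, not just the isr itself.
  def removeReplicaFromIsr(partition: (String, Int), deadReplica: Int): Unit =
    partitionLeadershipInfo.get(partition).foreach { current =>
      val newIsr = current.isr.filterNot(_ == deadReplica)
      val newLeader =
        if (current.leader == deadReplica) newIsr.headOption.getOrElse(NoLeader)
        else current.leader
      // Without this write-back, a later update metadata request (which reads
      // partitionLeadershipInfo) would advertise a stale leader and isr.
      partitionLeadershipInfo.put(partition, LeaderAndIsrSketch(newLeader, newIsr))
    }
}
{code}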

2,3. Same concern as 1.2

4. PartitionStateInfo:
4.1 We still need to send the total number of replicas (i.e. the replication 
factor) to be able to deserialize the replica list correctly. A minimal sketch 
of this follows below.
4.2 Good point, changed that.
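For 4.1, a minimal sketch of why the count has to be on the wire; the layout below 
is illustrative only, not the actual PartitionStateInfo serialization format.

{code}
import java.nio.ByteBuffer

object ReplicaListWireFormat {
  // Write the replica list prefixed by its size (the replication factor).
  def write(buffer: ByteBuffer, replicas: Seq[Int]): Unit = {
    buffer.putInt(replicas.size)
    replicas.foreach(r => buffer.putInt(r))
  }

  // Without the size prefix, the reader cannot tell how many replica ids
  // to pull off the buffer before the next field starts.
  def read(buffer: ByteBuffer): Seq[Int] = {
    val replicationFactor = buffer.getInt
    (0 until replicationFactor).map(_ => buffer.getInt)
  }
}
{code}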

5. Good observation. This was somehow left over in all controller state change 
requests. It didn't make sense, so I removed it from LeaderAndIsrRequest, 
StopReplicaRequest and UpdateMetadataRequest.

6. It really didn't make sense to me that the producer threw 
NoBrokersForPartitionException when in reality it had failed to fetch metadata. 
This will make such errors easier to interpret.
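To illustrate the direction of point 6: the names below (MetadataFetchException, 
fetchTopicMetadata) are hypothetical stand-ins for the producer's metadata path, 
not the actual patch.

{code}
// Hypothetical stand-ins for the producer's metadata path.
class MetadataFetchException(msg: String, cause: Throwable) extends RuntimeException(msg, cause)

object ProducerMetadataSketch {
  def fetchTopicMetadata(topics: Set[String]): Unit =
    throw new RuntimeException("broker round-trip placeholder")

  def updateInfo(topics: Set[String]): Unit =
    try fetchTopicMetadata(topics)
    catch {
      case t: Throwable =>
        // Surface the real cause (the metadata fetch failed) instead of letting the
        // caller later hit a misleading NoBrokersForPartitionException.
        throw new MetadataFetchException(s"Failed to fetch topic metadata for $topics", t)
    }
}
{code}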

7. Good point, moved it to the toString() API of TopicMetadata.
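A tiny sketch of the idea in point 7, with simplified fields; the real TopicMetadata 
carries richer per-partition metadata.

{code}
// Keeping the human-readable formatting inside the class itself means every
// log line that prints the metadata stays consistent.
case class TopicMetadataSketch(topic: String, leaderByPartition: Map[Int, Int]) {
  override def toString: String = {
    val leaders = leaderByPartition.map { case (p, l) => s"$p -> $l" }.mkString(", ")
    s"TopicMetadata(topic = $topic, leaders = [$leaders])"
  }
}
{code}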

8. AdminUtils:
8.1 Because it is a duplicate of the test in TopicMetadataTest.
8.2 Good point, added that and changed all tests to use it

9. Ideally yes. But the old code was not retrying on any exception, including 
UnknownTopicOrPartitionException. I've changed DefaultEventHandler to retry no 
matter what exception it hits. So the test is changed to reflect that it 
shouldn't give up on UnknownTopicOrPartitionException but instead should 
retry sending the message and succeed.
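A rough sketch of that retry behaviour; the signature and backoff handling below are 
simplified stand-ins for DefaultEventHandler, not the actual patch.

{code}
object RetrySketch {
  // attemptSend returns the messages that still failed; any exception (including
  // UnknownTopicOrPartitionException) is treated the same way and retried.
  def sendWithRetries[T](messages: Seq[T], retries: Int, backoffMs: Long)
                        (attemptSend: Seq[T] => Seq[T]): Unit = {
    var remaining = messages
    var attemptsLeft = retries + 1
    while (remaining.nonEmpty && attemptsLeft > 0) {
      attemptsLeft -= 1
      remaining =
        try attemptSend(remaining)
        catch { case _: Throwable => remaining } // retry no matter which exception was hit
      if (remaining.nonEmpty && attemptsLeft > 0) Thread.sleep(backoffMs)
    }
    if (remaining.nonEmpty)
      throw new RuntimeException(s"Failed to send ${remaining.size} messages after $retries retries")
  }
}
{code}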
> Kafka server can become unavailable if clients send several metadata requests
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-901
>                 URL: https://issues.apache.org/jira/browse/KAFKA-901
>             Project: Kafka
>          Issue Type: Bug
>          Components: replication
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Blocker
>         Attachments: kafka-901.patch, kafka-901-v2.patch, 
> metadata-request-improvement.patch
>
>
> Currently, if a broker is bounced without controlled shutdown and there are 
> several clients talking to the Kafka cluster, each of the clients realize the 
> unavailability of leaders for some partitions. This leads to several metadata 
> requests sent to the Kafka brokers. Since metadata requests are pretty slow, 
> all the I/O threads quickly become busy serving the metadata requests. This 
> leads to a full request queue, which stalls handling of finished responses 
> since the same network thread handles requests as well as responses. In this 
> situation, clients timeout on metadata requests and send more metadata 
> requests. This quickly makes the Kafka cluster unavailable. 
