[ 
https://issues.apache.org/jira/browse/KAFKA-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932125#comment-13932125
 ] 

Jay Kreps commented on KAFKA-1303:
----------------------------------

I don't think this is actually a problem. A metadata request could be slow for 
any number of reasons, in which case this will happen. There is no guarantee 
that failover + metadata refresh completes before the retries are exhausted.

I would be opposed to having special connections for metadata requests.

However, if we want to try to improve this, we could include a smarter 
heuristic in Sender.selectMetadataDestination. Currently we prefer a node that 
we already have a connection to and which has no requests currently being sent; 
however, many requests could have been sent but not yet processed. We could 
instead prefer the connected node with the fewest in-flight requests.
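
A rough sketch of what that heuristic might look like, purely for illustration. 
NodeState, its fields, and leastLoadedConnectedNode are hypothetical names made 
up here; they are not the actual Sender code, which would have to read this 
state from the producer's own connection and in-flight bookkeeping.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class MetadataDestinationSketch {

    /** Hypothetical per-node view (assumption, not a Kafka class). */
    static final class NodeState {
        final boolean connected;     // do we already have a connection?
        final int inFlightRequests;  // sent but not yet responded to

        NodeState(boolean connected, int inFlightRequests) {
            this.connected = connected;
            this.inFlightRequests = inFlightRequests;
        }
    }

    /**
     * Returns the id of the connected node with the fewest in-flight
     * requests, or empty if no node is connected. Ties go to the node
     * seen first in the map's iteration order.
     */
    static Optional<Integer> leastLoadedConnectedNode(Map<Integer, NodeState> nodes) {
        Integer best = null;
        int fewest = Integer.MAX_VALUE;
        for (Map.Entry<Integer, NodeState> entry : nodes.entrySet()) {
            NodeState state = entry.getValue();
            if (state.connected && state.inFlightRequests < fewest) {
                best = entry.getKey();
                fewest = state.inFlightRequests;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        Map<Integer, NodeState> nodes = new LinkedHashMap<>();
        nodes.put(0, new NodeState(true, 5));   // connected, busy
        nodes.put(1, new NodeState(true, 2));   // connected, least loaded
        nodes.put(2, new NodeState(false, 0));  // idle but not connected
        System.out.println(leastLoadedConnectedNode(nodes)); // prints Optional[1]
    }
}
```

Note this only lowers the chance of the metadata request queueing behind a long 
backlog; it still gives no guarantee that the refresh completes before retries 
are exhausted.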

> metadata request in the new producer can be delayed
> ---------------------------------------------------
>
>                 Key: KAFKA-1303
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1303
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.2
>            Reporter: Jun Rao
>
> While debugging a system test, I observed the following.
> 1. A broker side configuration 
> (replica.fetch.wait.max.ms=500,replica.fetch.min.bytes=4096) made the time to 
> complete a produce request long (each taking about 500ms with ack=-1).
> 2. The producer client has a bunch of outstanding produce requests queued up 
> on the brokers.
> 3. One of the brokers fails and we force updating the metadata.
> 4. The metadata request is queued up behind those outstanding producer 
> requests.
> 5. By the time the metadata response comes back, some messages have failed 
> all retries because of stale metadata.


