[ https://issues.apache.org/jira/browse/KAFKA-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941237#comment-13941237 ]
Jay Kreps commented on KAFKA-1303:
----------------------------------

Jun and I chatted briefly. The old approach is quite bad, as it suddenly breaks something running live. It does this unexpectedly, since your connections are working fine and you can send data. Even if this is rare, it is really, really, really bad. I am -1 on anything that has this property. This has happened multiple times and people keep emailing us about it.

The current approach has the possibility of sending a request to a broker that has outstanding requests. This may make the metadata request slower than it otherwise would be. Let's try to understand how significant this is. I did a load test writing 100-byte messages as fast as possible with no linger time; this should be the worst case. Here are the metrics I see:
- 422,176 bytes/request
- 28.5 ms/request
- 4 in-flight requests with the default socket buffer size (which is surprising; I would expect only two to be possible with a 1 MB buffer)

So in other words, if we happened to choose the most heavily loaded broker in this worst-case scenario, we would wait 4 * 28.5 ms = 114 ms for the metadata update.

Jun and I discussed a bit. For sure we should implement a preference for nodes with no in-flight requests (a rough sketch of this policy is appended after the issue summary below). The question is what to do when no such node exists. At that point you can either create a new connection or send to an existing broker with in-flight requests. This case should be very rare once we start sending back the full node list from the broker, since even a fully loaded client will not be fully loaded against every broker. My preference would be to send to an existing broker, since the penalty seems to be small and there are many other cases which cause periods of unavailability--GC, hard failures, etc. I don't think Jun agrees. He was going to look into an approach that will create a new connection for these cases using the original bootstrap URLs. I'm still not wild about this, as it means more metadata requests will go to those brokers. (In a good deployment you would want a standard config across all clients; you might have 50 brokers but only 3 bootstrap URLs, yet all metadata requests would go to those three whenever no other connection was free, which creates imbalance.)

> metadata request in the new producer can be delayed
> ---------------------------------------------------
>
>                 Key: KAFKA-1303
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1303
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.2
>            Reporter: Jun Rao
>
> While debugging a system test, I observed the following.
> 1. A broker-side configuration (replica.fetch.wait.max.ms=500, replica.fetch.min.bytes=4096) made the time to complete a produce request long (each taking about 500 ms with ack=-1).
> 2. The producer client has a bunch of outstanding produce requests queued up on the brokers.
> 3. One of the brokers fails and we force updating the metadata.
> 4. The metadata request is queued up behind those outstanding produce requests.
> 5. By the time the metadata response comes back, some messages have failed all retries because of stale metadata.
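Below is a minimal sketch of the node-selection preference discussed in the comment above: prefer a connected node with no in-flight requests, and otherwise fall back to the least loaded existing connection. The class and method names (MetadataNodeSelector, selectNode) are hypothetical illustrations, not the actual producer code.

    import java.util.List;
    import java.util.Map;

    // Hypothetical sketch (not the real producer code): prefer a node with no
    // in-flight requests; otherwise fall back to the connected node carrying
    // the fewest in-flight requests, per the preference described above.
    class MetadataNodeSelector {

        // In-flight request counts per broker id, maintained elsewhere by the client.
        private final Map<Integer, Integer> inFlightRequests;

        MetadataNodeSelector(Map<Integer, Integer> inFlightRequests) {
            this.inFlightRequests = inFlightRequests;
        }

        // Returns the broker id to send the metadata request to, or null if no
        // connected node is known.
        Integer selectNode(List<Integer> connectedNodes) {
            Integer leastLoaded = null;
            int fewest = Integer.MAX_VALUE;
            for (Integer node : connectedNodes) {
                int inFlight = inFlightRequests.getOrDefault(node, 0);
                if (inFlight == 0)
                    return node;          // ideal: an idle connection, no queueing delay
                if (inFlight < fewest) {  // otherwise remember the least loaded node
                    fewest = inFlight;
                    leastLoaded = node;
                }
            }
            // The fallback debated in the comment: reuse the least loaded existing
            // connection rather than opening a new one to a bootstrap broker.
            return leastLoaded;
        }
    }

With the worst-case numbers measured above, even falling back to the busiest broker means waiting roughly 4 * 28.5 ms = 114 ms for the metadata response, which is the penalty being weighed against opening a new connection to a bootstrap broker.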