[ https://issues.apache.org/jira/browse/KAFKA-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15750082#comment-15750082 ]
Jeff Widman commented on KAFKA-3135:
------------------------------------

Can the tags on this issue be updated to note that it applies to 0.10.x versions as well? At least, that's what it seems from reading through the issue description.

> Unexpected delay before fetch response transmission
> ---------------------------------------------------
>
>                 Key: KAFKA-3135
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3135
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.9.0.0
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Critical
>             Fix For: 0.10.2.0
>
>
> From the user list, Krzysztof Ciesielski reports the following:
> {quote}
> Scenario description:
> First, a producer writes 500000 elements into a topic
> Then, a consumer starts to read, polling in a loop.
> When "max.partition.fetch.bytes" is set to a relatively small value, each
> "consumer.poll()" returns a batch of messages.
> If this value is left as default, the output tends to look like this:
> Poll returned 13793 elements
> Poll returned 13793 elements
> Poll returned 13793 elements
> Poll returned 13793 elements
> Poll returned 0 elements
> Poll returned 0 elements
> Poll returned 0 elements
> Poll returned 0 elements
> Poll returned 13793 elements
> Poll returned 13793 elements
> As we can see, there are weird "gaps" when poll returns 0 elements for some
> time. What is the reason for that? Maybe there are some good practices
> about setting "max.partition.fetch.bytes" which I don't follow?
> {quote}
> The gist to reproduce this problem is here:
> https://gist.github.com/kciesielski/054bb4359a318aa17561.
> After some initial investigation, the delay appears to be in the server's
> networking layer. Basically I see a delay of 5 seconds from the time that
> Selector.send() is invoked in SocketServer.Processor with the fetch response
> to the time that the send is completed.
> Using netstat in the middle of the
> delay shows the following output:
> {code}
> tcp4       0      0  10.191.0.30.55455      10.191.0.30.9092       ESTABLISHED
> tcp4       0 102400  10.191.0.30.9092       10.191.0.30.55454      ESTABLISHED
> {code}
> From this, it looks like the data reaches the send buffer, but needs to be
> flushed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
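
A note on reading the netstat output quoted in the issue: the sketch below parses the two lines and reports which socket still has bytes queued for sending. It assumes the BSD-style column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, state), which the quoted output does not show as a header; the names `TcpSocketLine` and `parse_netstat_line` are illustrative only and are not part of Kafka or netstat.

```python
from typing import NamedTuple


class TcpSocketLine(NamedTuple):
    """One row of netstat output (assumed BSD-style column order)."""
    proto: str
    recv_q: int   # bytes received but not yet read by the application
    send_q: int   # bytes queued for sending
    local: str
    foreign: str
    state: str


def parse_netstat_line(line: str) -> TcpSocketLine:
    # Split on whitespace; the assumed layout has exactly six columns.
    proto, recv_q, send_q, local, foreign, state = line.split()
    return TcpSocketLine(proto, int(recv_q), int(send_q), local, foreign, state)


# The two lines quoted in the issue description.
NETSTAT_OUTPUT = """\
tcp4 0 0 10.191.0.30.55455 10.191.0.30.9092 ESTABLISHED
tcp4 0 102400 10.191.0.30.9092 10.191.0.30.55454 ESTABLISHED
"""

sockets = [parse_netstat_line(row) for row in NETSTAT_OUTPUT.splitlines() if row.strip()]

# Keep only sockets that still have data sitting in the send queue.
pending = [s for s in sockets if s.send_q > 0]

for s in pending:
    print(f"{s.local} -> {s.foreign}: {s.send_q} bytes queued in send buffer")
```

Under that assumption, the only socket with a non-zero Send-Q is the one whose local port is 9092 (the broker side), holding 102400 bytes, which is consistent with the observation that the fetch response reached the send buffer but had not yet been flushed.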