[ https://issues.apache.org/jira/browse/KAFKA-2486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723913#comment-14723913 ]
ASF GitHub Bot commented on KAFKA-2486:
---------------------------------------

GitHub user hachikuji opened a pull request:

    https://github.com/apache/kafka/pull/180

    KAFKA-2486; fix performance regression in new consumer

    The sleep() in KafkaConsumer's poll blocked any pending IO from being
    completed and created a performance bottleneck. It was intended to
    implement the fetch backoff behavior, but that was a misunderstanding
    of the setting "retry.backoff.ms", which should only affect failed
    fetches.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hachikuji/kafka KAFKA-2486

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/180.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #180

----
commit 8bec099900fdb511cf9f4b5a31c63d695e6a9c49
Author: Jason Gustafson <ja...@confluent.io>
Date:   2015-08-31T19:23:30Z

    KAFKA-2486; fix performance regression in new consumer

----

> New consumer performance
> ------------------------
>
>                 Key: KAFKA-2486
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2486
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: consumer
>            Reporter: Ewen Cheslack-Postava
>            Assignee: Jason Gustafson
>             Fix For: 0.8.3
>
>
> The new consumer was previously getting good performance. However, a
> recent report on the mailing list indicates it has dropped significantly.
> After evaluation, even with a local broker it seems to reach only
> 2-10MB/s, compared to 600+MB/s previously. Before release, we should get
> the performance back on par.
> Some details about where the regression occurred, from the mailing list
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201508.mbox/%3CCAAdKFaE8bPSeWZf%2BF9RuA-xZazRpBrZG6vo454QLVHBAk_VOJg%40mail.gmail.com%3E
> :
> bq. At 49026f11781181c38e9d5edb634be9d27245c961 (May 14th), we went from
> good performance -> an error due to broker apparently not accepting the
> partition assignment strategy. Since this commit seems to add heartbeats
> and the server side code for partition assignment strategies, I assume we
> were missing something on the client side and by filling in the server
> side, things stopped working.
> bq. On either 84636272422b6379d57d4c5ef68b156edc1c67f8 or
> a5b11886df8c7aad0548efd2c7c3dbc579232f03 (July 17th), I am able to run the
> perf test again, but it's slow -- ~10MB/s for me vs the 2MB/s Jay was
> seeing, but that's still far less than the 600MB/s I saw on the earlier
> commits.
> Ideally we would also at least have a system test in place for the new
> consumer, even if regressions weren't automatically detected. It would at
> least allow for manually checking for regressions. This should not be
> difficult since there are already old consumer performance tests.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
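
As a rough illustration of the behavior described in the pull request (the "retry.backoff.ms" delay applying only to failed fetches, rather than a sleep on every poll), here is a minimal sketch in Python. It is not the Kafka client code: the class `FetchBackoff`, its methods, and the partition keys are all hypothetical names chosen for this example.

```python
import time


class FetchBackoff:
    """Hypothetical helper (not part of Kafka): per-partition fetch backoff."""

    def __init__(self, retry_backoff_ms, clock=time.monotonic):
        self.retry_backoff_s = retry_backoff_ms / 1000.0
        self.clock = clock        # injectable so the logic can be tested
        self.next_allowed = {}    # partition -> earliest retry time (seconds)

    def record_failure(self, partition):
        # Only a failed fetch opens a backoff window for that partition.
        self.next_allowed[partition] = self.clock() + self.retry_backoff_s

    def record_success(self, partition):
        # A successful fetch clears any backoff; the next fetch is not delayed.
        self.next_allowed.pop(partition, None)

    def can_fetch(self, partition):
        # Partitions with no recorded failure are always eligible.
        return self.clock() >= self.next_allowed.get(partition, float("-inf"))
```

In this sketch nothing ever sleeps: the caller checks `can_fetch` before sending a request, so pending IO for other partitions is never held up by one partition's backoff.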