[
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eno Thereska updated KAFKA-4405:
--------------------------------
Description:
In KafkaConsumer.poll(), the code always calls "pollNoWakeup", which turns out
to be expensive. When max.poll.records=1, for example, that call adds about 50%
performance overhead. The code should avoid calling that function when there
are no outstanding prefetches.
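As a rough illustration of the idea, here is a minimal sketch in Java. It is a
toy model, not the Kafka source and not the actual patch: FakeClient,
FakeFetcher, and countCalls are hypothetical names invented for this example,
and the model assumes a prefetch is only sent once the local buffer is empty.
It counts how often the costly call runs when records are handed out one at a
time (as with max.poll.records=1), with and without a guard.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only: none of these classes are Kafka's own.
class PollNoWakeupSketch {

    // Hypothetical stand-in for the network client; counts the costly calls.
    static class FakeClient {
        int pendingRequests = 0;
        int pollNoWakeupCalls = 0;

        void pollNoWakeup() {
            pollNoWakeupCalls++;
            pendingRequests = 0; // pretend one I/O pass completes all requests
        }
    }

    // Hypothetical stand-in for the fetcher that buffers fetched records.
    static class FakeFetcher {
        final Deque<Integer> buffered = new ArrayDeque<>();

        // Assumption: a prefetch is sent only when nothing is buffered.
        // Returns the number of requests sent.
        int sendFetches(FakeClient client) {
            if (!buffered.isEmpty()) {
                return 0;
            }
            client.pendingRequests++;
            return 1;
        }
    }

    // Drains the buffer one record per poll and returns how many times
    // pollNoWakeup() ran. With guarded=true the call is skipped whenever
    // no fetch was sent and no request is pending.
    static int countCalls(int bufferedRecords, boolean guarded) {
        FakeClient client = new FakeClient();
        FakeFetcher fetcher = new FakeFetcher();
        for (int i = 0; i < bufferedRecords; i++) {
            fetcher.buffered.add(i);
        }
        while (!fetcher.buffered.isEmpty()) {
            fetcher.buffered.poll(); // hand one record to the caller
            int sent = fetcher.sendFetches(client);
            if (!guarded || sent > 0 || client.pendingRequests > 0) {
                client.pollNoWakeup();
            }
        }
        return client.pollNoWakeupCalls;
    }

    public static void main(String[] args) {
        System.out.println("always:  " + countCalls(100, false));
        System.out.println("guarded: " + countCalls(100, true));
    }
}
```

In this model, draining 100 buffered records triggers the call 100 times
without the guard and once with it, when the buffer runs dry and a prefetch
actually goes out. The numbers describe the toy model only and say nothing
about the real overhead.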
Old JIRA description (discarded because it turned out not to be the case):
---------------------------------------------------------------------------------------------------
The Kafka consumer has added max.poll.records to limit the number of messages
returned by poll().
According to KIP-41, to implement max.poll.records, the prefetch request
should only be sent when the total number of retained records is less than
max.poll.records.
But in the 0.10.0.1 code, the consumer sends a prefetch request if it has
retained any records, and never checks whether the total number of retained
records is less than max.poll.records.
If max.poll.records is set much lower than the number of messages fetched,
the poll() loop sends many more requests than expected, and more and more
records are fetched and stored in memory before they can be consumed.
So before sending a prefetch request, the consumer must check whether the
total number of retained records is less than max.poll.records.
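For reference, the check proposed in this old description can be sketched as
follows. This is a toy model in Java and not Kafka code: shouldPrefetch,
peakRetained, and the fetchSize parameter are hypothetical names made up for
the example. The unchecked branch models the behaviour the old description
assumed, which, as noted above, turned out not to be the case in the real
consumer.

```java
// Toy model only: illustrates the proposed check, not the Kafka consumer.
class PrefetchCheckSketch {

    // The proposed rule: prefetch only while fewer than max.poll.records
    // records are retained locally.
    static boolean shouldPrefetch(int retained, int maxPollRecords) {
        return retained < maxPollRecords;
    }

    // Simulates a poll() loop and returns the peak number of retained records.
    // Each prefetch is assumed to deliver fetchSize records at once, and each
    // poll hands at most maxPollRecords records to the caller.
    static int peakRetained(int polls, int maxPollRecords, int fetchSize, boolean check) {
        int retained = 0;
        int peak = 0;
        for (int i = 0; i < polls; i++) {
            if (!check || shouldPrefetch(retained, maxPollRecords)) {
                retained += fetchSize; // prefetched records land in the buffer
            }
            peak = Math.max(peak, retained);
            retained -= Math.min(retained, maxPollRecords); // records returned by poll()
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("with check:    " + peakRetained(10, 1, 500, true));
        System.out.println("without check: " + peakRetained(10, 1, 500, false));
    }
}
```

With the check, the buffer in this model never holds more than one fetch worth
of records plus at most max.poll.records - 1 leftovers. Without it, the buffer
grows on every poll.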
> Avoid calling pollNoWakeup unnecessarily
> ----------------------------------------
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.10.0.1
> Reporter: ysysberserk
> Assignee: Eno Thereska
> Fix For: 0.10.2.0