[
https://issues.apache.org/jira/browse/KAFKA-19652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18018722#comment-18018722
]
Ritvik Gupta commented on KAFKA-19652:
--------------------------------------
Hi,
Updating here on behalf of [~gargsha] -
*Topic configs :*
{{cleanup.policy=compact,delete; retention.ms=1200000; min.insync.replicas=2}}
*Consumer configs :*
{{fetch.min.bytes=131072; fetch.max.wait.ms=500}}
( acks=1 for the Producer )
I have generally observed this when the topic has one or more high-throughput
producers running on it at upwards of *590 MB/sec* ( *600 K msgs/sec* with a
*1000 byte* message size ). The case can also be simulated by running multiple
producers on the topic to reach close to 600 MB/sec.
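As a rough sanity check on the consumer configs above, the sketch below estimates how long a producer at this rate takes to write {{fetch.min.bytes}} worth of data. It assumes uncompressed 1000-byte messages and 1 MB = 10^6 bytes; the function name is only illustrative, and this is arithmetic on the quoted numbers, not a measurement.

```python
# Values taken from the configs and rates quoted in this comment.
FETCH_MIN_BYTES = 131072      # fetch.min.bytes
FETCH_MAX_WAIT_MS = 500       # fetch.max.wait.ms
MSG_RATE_PER_SEC = 600_000    # ~600 K msgs/sec
MSG_SIZE_BYTES = 1000         # assumed: "1000 b" means 1000 bytes per message


def fill_time_ms(min_bytes: int, msgs_per_sec: int, msg_size: int) -> float:
    """Time (ms) for the producer to write min_bytes at the given rate."""
    bytes_per_sec = msgs_per_sec * msg_size
    return min_bytes / bytes_per_sec * 1000.0


t_ms = fill_time_ms(FETCH_MIN_BYTES, MSG_RATE_PER_SEC, MSG_SIZE_BYTES)
print(f"time to produce fetch.min.bytes: {t_ms:.3f} ms "
      f"(fetch.max.wait.ms = {FETCH_MAX_WAIT_MS} ms)")
```

Under these assumptions the fill time is well under a millisecond, so the 500 ms wait should rarely be reached as long as the produced data is actually available to the consumer.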
In the above test scenario, the consumer throughput drops to {*}< 90 MB/sec{*}.
This is for a 1 Producer + 1 Consumer test, which should be the ideal test
scenario and was giving me *250+ MB/sec* consumer throughput on the old
{*}v3.5.1 Kafka cluster{*}.
My best guess is some form of loss of *in-sync replicas* ( with the high
throughput producers using *acks=1* ), as only the leader replica might stay
in sync under high throughput on the topic. Could this have a performance
impact on the consumer in some way?
To reason about this, I conducted the tests below ( *min.insync.replicas=2* vs
*min.insync.replicas=1* ) -
|*Test*|*Test name*|*Observed message rate (K msgs/sec)*|*Test description*|
|1| | | |
|Consumer|Throughput test run ( Java client ) - Min ISR = 2 - 1 producer & 1 consumer|*{color:#de350b}89.21{color}*|Test Configs = acks:1, partitions:1, rf:3, min-isr:2, producers:1, consumers:1, batch_size:256KB|
|Producer|Throughput test run ( Java client ) - Min ISR = 2 - 1 producer & 1 consumer|617.20|Test Configs = acks:1, partitions:1, rf:3, min-isr:2, producers:1, consumers:1, batch_size:256KB|
|2| | | |
|Consumer|Throughput test run ( Java client ) - Min ISR = 1 - 1 producer & 1 consumer|*{color:#00875a}298.99{color}*|Test Configs = acks:1, partitions:1, rf:3, min-isr:1, producers:1, consumers:1, batch_size:256KB|
|Producer|Throughput test run ( Java client ) - Min ISR = 1 - 1 producer & 1 consumer|597.20|Test Configs = acks:1, partitions:1, rf:3, min-isr:1, producers:1, consumers:1, batch_size:256KB|
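To make the comparison in the table easier to read, here is a small sketch that converts the observed rates to MB/sec and computes the consumer rate as a share of the producer rate. It assumes 1000-byte messages and 1 MB = 10^6 bytes; the helper names are illustrative and the numbers are copied from the table, not re-measured.

```python
MSG_SIZE_BYTES = 1000  # assumed: "1000 b" means 1000 bytes per message

# Observed rates in K msgs/sec, copied from the table.
results = {
    "min-isr:2": {"consumer": 89.21, "producer": 617.20},
    "min-isr:1": {"consumer": 298.99, "producer": 597.20},
}


def to_mb_per_sec(k_msgs_per_sec: float, msg_size: int = MSG_SIZE_BYTES) -> float:
    """Convert a rate in K msgs/sec to MB/sec (1 MB = 10^6 bytes)."""
    return k_msgs_per_sec * 1000 * msg_size / 1_000_000


def consumer_share(row: dict) -> float:
    """Consumer rate as a fraction of the producer rate in the same test."""
    return row["consumer"] / row["producer"]


for name, row in results.items():
    print(f"{name}: consumer {to_mb_per_sec(row['consumer']):.2f} MB/sec, "
          f"{consumer_share(row):.1%} of the producer rate")

# How much faster the consumer ran with min.insync.replicas=1.
speedup = results["min-isr:1"]["consumer"] / results["min-isr:2"]["consumer"]
print(f"consumer speedup with min-isr:1: {speedup:.2f}x")
```

With these numbers the consumer keeps up with about 14% of the producer rate at min-isr:2 and about 50% at min-isr:1, a difference of roughly 3.4x between the two runs.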
Note that I have also observed this heavy drop in consumer throughput in a
*KRaft mode cluster ( v3.9.0 )*.
Let me know if I should share more details about the tests I conducted, or any
other configs, to help debug this.
> Consumer throughput drops by 10 times with Kafka v3.9.0 in ZK mode
> ------------------------------------------------------------------
>
> Key: KAFKA-19652
> URL: https://issues.apache.org/jira/browse/KAFKA-19652
> Project: Kafka
> Issue Type: Bug
> Components: clients, consumer
> Affects Versions: 3.9.0
> Reporter: Sharad Garg
> Priority: Blocker
>
> Kafka consumer throughput in best-case drops by ~10 times after upgrading to
> kafka v3.9.0 from v3.5.1. Note that this is in ZK mode and KRAFT migration is
> not done yet.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)