[ 
https://issues.apache.org/jira/browse/KAFKA-19652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18018722#comment-18018722
 ] 

Ritvik Gupta commented on KAFKA-19652:
--------------------------------------

Hi,

Updating here on behalf of [~gargsha] -

 

*Topic configs :*

{{cleanup.policy=compact,delete; retention.ms=1200000; min.insync.replicas=2}}

*Consumer configs :*

{{fetch.min.bytes=131072; fetch.max.wait.ms=500}}

(acks=1 for the producer)

 

I have generally observed this when the topic has one or more high-throughput 
producers writing to it at upwards of *590 MB/sec* (*600 K msgs/sec* with a 
message size of *1000 bytes* each). The case can also be simulated by running 
multiple producers on the topic to reach close to 600 MB/sec.
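
As a rough sanity check on these figures, here is a minimal sketch of the 
arithmetic. It assumes 1 MB = 10^6 bytes and ignores record and batch overhead; 
the function name is only illustrative and is not part of any Kafka tooling.

```python
def throughput_mb_per_sec(msgs_per_sec: float, msg_size_bytes: int) -> float:
    """Convert a message rate into payload throughput in MB/sec.

    Assumption: 1 MB = 1_000_000 bytes; protocol overhead is ignored.
    """
    return msgs_per_sec * msg_size_bytes / 1_000_000


# 600 K msgs/sec at 1000 bytes per message
print(throughput_mb_per_sec(600_000, 1000))  # prints 600.0
```

Under these assumptions the nominal rate works out to 600 MB/sec, which is in 
line with the "close to 600 MB/sec" seen in the tests.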

 

In the above test scenario, the consumer throughput drops to {*}< 
90 MB/sec{*}. This is for a 1 producer + 1 consumer test, which should be the 
ideal test scenario and previously gave me up to *250+ MB/sec* of consumer 
throughput on the old {*}v3.5.1 Kafka cluster{*}.

 

My best guess is some loss of *in-sync replicas* (with the high-throughput 
producers using *acks=1*): under high throughput on the topic, only the leader 
replica may stay in sync. Could this have a performance impact on the consumer 
in some way?

To test this reasoning, I ran the tests below (*min.insync.replicas=2* vs 
*min.insync.replicas=1*):

 
|*Test*|*Test name*|*Observed message rate (K msgs/sec)*|*Test description*|
|1| | | |
|Consumer|Throughput test run (Java client) - Min ISR = 2 - 1 producer & 1 consumer|*{color:#de350b}89.21{color}*|Test configs = acks:1, partitions:1, rf:3, min-isr:2, producers:1, consumers:1, batch_size:256KB|
|Producer|Throughput test run (Java client) - Min ISR = 2 - 1 producer & 1 consumer|617.20|Test configs = acks:1, partitions:1, rf:3, min-isr:2, producers:1, consumers:1, batch_size:256KB|
|2| | | |
|Consumer|Throughput test run (Java client) - Min ISR = 1 - 1 producer & 1 consumer|*{color:#00875a}298.99{color}*|Test configs = acks:1, partitions:1, rf:3, min-isr:1, producers:1, consumers:1, batch_size:256KB|
|Producer|Throughput test run (Java client) - Min ISR = 1 - 1 producer & 1 consumer|597.20|Test configs = acks:1, partitions:1, rf:3, min-isr:1, producers:1, consumers:1, batch_size:256KB|
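
To put the numbers in the table above side by side, here is a small sketch 
that derives the ratios directly from the observed rates. The values are copied 
from the table; the variable and function names are illustrative only.

```python
# Observed message rates from the table above, in K msgs/sec.
results = {
    "min_isr_2": {"consumer": 89.21, "producer": 617.20},
    "min_isr_1": {"consumer": 298.99, "producer": 597.20},
}


def consumer_to_producer_ratio(test: dict) -> float:
    """Fraction of the producer rate that the consumer keeps up with."""
    return test["consumer"] / test["producer"]


# How much faster the consumer is with min.insync.replicas=1 than with 2.
speedup = results["min_isr_1"]["consumer"] / results["min_isr_2"]["consumer"]

for name, test in results.items():
    print(f"{name}: consumer/producer = {consumer_to_producer_ratio(test):.2f}")
print(f"consumer speedup (min-isr 1 vs 2) = {speedup:.2f}x")
```

In these two runs the consumer reaches about 14% of the producer rate with 
min-isr:2 and about 50% with min-isr:1, i.e. roughly 3.35 times the consumer 
rate, while the producer rate stays within a few percent.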

 

Note that I have also observed this heavy drop in consumer throughput in a 
*KRaft mode cluster (v3.9.0)*.

Let me know if I should share more details about the tests I conducted, or any 
other configs, to help debug this.

> Consumer throughput drops by 10 times with Kafka v3.9.0 in ZK mode
> ------------------------------------------------------------------
>
>                 Key: KAFKA-19652
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19652
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, consumer
>    Affects Versions: 3.9.0
>            Reporter: Sharad Garg
>            Priority: Blocker
>
> Kafka consumer throughput in best-case drops by ~10 times after upgrading to 
> kafka v3.9.0 from v3.5.1. Note that this is in ZK mode and KRAFT migration is 
> not done yet.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
