[ https://issues.apache.org/jira/browse/KAFKA-16226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mayank Shekhar Narula updated KAFKA-16226:
------------------------------------------
Description:
h1. Background
https://issues.apache.org/jira/browse/KAFKA-15415 implemented an optimisation in the Java client to skip the backoff period for a produce batch being retried when the client knows of a newer leader.
h1. What changed
The implementation introduced a regression, noticed on a Trogdor benchmark running with a high partition count (36,000 partitions). With the regression, the following metrics changed on the produce side:
# record-queue-time-avg: increased from 20ms to 30ms.
# request-latency-avg: increased from 50ms to 100ms.
h1. Why it happened
As can be seen in the original [PR|https://github.com/apache/kafka/pull/14384], RecordAccumulator.partitionReady() and drainBatchesForOneNode() started using the synchronised method Metadata.currentLeader(). This increased synchronisation between the KafkaProducer's application thread, which calls send(), and the background thread that sends producer batches to leaders. The lock profiles below clearly show the increased synchronisation in the KAFKA-15415 PR (highlighted in {color:#de350b}red{color}) versus the baseline. Note that the synchronisation is much worse for partitionReady(), since it is called for every partition, and this benchmark has 36k partitions!
h3. Lock Profile: KAFKA-15415
!kafka_15415_lock_profile.png!
h3. Lock Profile: Baseline
!baseline_lock_profile.png!
h1. Fix
Synchronisation between the two threads has to be reduced to address this. [https://github.com/apache/kafka/pull/15323] fixes it by avoiding Metadata.currentLeader() and instead relying on Cluster.leaderFor(). With the fix, the lock profile and metrics are similar to the baseline.
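For illustration, the locking pattern behind the regression and the fix can be sketched as follows. This is a simplified stand-in, not the actual client code: the class and field names are invented for the sketch. The "before" shape mirrors a synchronised lookup like Metadata.currentLeader(), where every per-partition call takes the metadata monitor; the "after" shape mirrors a read against an immutable snapshot, the role Cluster.leaderFor() plays in the fix.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only -- NOT the actual Kafka client code. Names are
// simplified stand-ins to show the two locking patterns involved.
public class LeaderLookupSketch {

    // Before the fix: leader lookups go through a synchronized method, so the
    // sender's per-partition lookups contend on the same monitor as metadata
    // updates from the application thread.
    static class SynchronizedMetadata {
        private final Map<Integer, Integer> leaderByPartition = new HashMap<>();

        public synchronized void update(int partition, int leader) {
            leaderByPartition.put(partition, leader);
        }

        public synchronized Integer currentLeader(int partition) {
            return leaderByPartition.get(partition); // monitor held per call
        }
    }

    // After the fix: leader lookups read an immutable snapshot, so the hot
    // path takes no lock at all.
    static final class ClusterSnapshot {
        private final Map<Integer, Integer> leaderByPartition;

        ClusterSnapshot(Map<Integer, Integer> leaders) {
            this.leaderByPartition = Map.copyOf(leaders); // immutable copy
        }

        Integer leaderFor(int partition) {
            return leaderByPartition.get(partition); // lock-free read
        }
    }

    public static void main(String[] args) {
        SynchronizedMetadata metadata = new SynchronizedMetadata();
        metadata.update(0, 101);
        metadata.update(1, 102);
        // With 36k partitions, a synchronized lookup like this runs once per
        // partition on the hot path -- hence the contention in the lock profile.
        System.out.println(metadata.currentLeader(0));

        // The snapshot is rebuilt only when metadata changes; per-partition
        // lookups against it are uncontended.
        ClusterSnapshot cluster = new ClusterSnapshot(Map.of(0, 101, 1, 102));
        System.out.println(cluster.leaderFor(1));
    }
}
```

The key design point is that the contention scales with partition count in the "before" shape (one monitor acquisition per partition per ready-check), while in the "after" shape the cost of coordination is paid once per metadata update rather than once per lookup.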
> Java client: Performance regression in Trogdor benchmark with high partition
> counts
> -----------------------------------------------------------------------------------
>
>                 Key: KAFKA-16226
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16226
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 3.7.0, 3.6.1
>            Reporter: Mayank Shekhar Narula
>            Assignee: Mayank Shekhar Narula
>            Priority: Major
>              Labels: kip-951
>             Fix For: 3.6.2, 3.8.0, 3.7.1
>         Attachments: baseline_lock_profile.png, kafka_15415_lock_profile.png
--
This message was sent by Atlassian Jira
(v8.20.10#820010)