novosibman opened a new pull request, #13768:
URL: https://github.com/apache/kafka/pull/13768

   Related issue https://issues.apache.org/jira/browse/KAFKA-9693
   
   The issue with repeating latency spikes during Kafka log segments rolling 
still reproduced on the latest versions including  kafka_2.13-3.4.0.
   
   It was found that flushing Kafka snapshot file during segments rolling 
blocks producer request handling thread for some time:
   
https://github.com/apache/kafka/blob/3.4/core/src/main/scala/kafka/log/ProducerStateManager.scala#L452
   ```
     private def writeSnapshot(file: File, entries: mutable.Map[Long, 
ProducerStateEntry]): Unit = {
   ...
       val fileChannel = FileChannel.open(file.toPath, 
StandardOpenOption.CREATE, StandardOpenOption.WRITE)
       try {
         fileChannel.write(buffer)
         fileChannel.force(true)     <- here
       } finally {
         fileChannel.close()
       }...
   
   ```
   More partitions - more cumulative latency effect observed.
   
   Suggested fix offloads flush (fileChannel.force) operation to the background 
thread similar to (but not exactly) how it was done in the UnifiedLog.scala:
   ```
     def roll(
      ...
       // Schedule an asynchronous flush of the old segment
       scheduler.schedule("flush-log", () => 
flushUptoOffsetExclusive(newSegment.baseOffset))
     }
   ```
   The benchmarking using this fix shows significant reduction in repeating 
latency spikes:
   
   test config: 
   AWS
   3 node cluster (i3en.2xlarge)
   zulu11.62.17-ca-jdk11.0.18-linux_x64, heap 6G per broker
   1 loadgen (m5n.8xlarge) - OpenMessaging benchmark 
([OMB](https://github.com/openmessaging/benchmark)) 
   1  zookeeper (t2.small)
   acks=all batchSize=1048510 consumers=4 insyncReplicas=2 lingerMs=1 mlen=1024 
producers=4 rf=3 subscriptions=1 targetRate=200k time=12m topics=1 warmup=1m 
   
   ### variation 1:
   partitions=10 
   <span ng-repeat="item in report" ng-show="!item.visible || item.visible()" 
ng-class="item.cls + (item.sticky ? 'sticky' : '')" class="ng-scope"><span 
ng-if="item.table &amp;&amp; item.displayTable.show" ng-class="item.cls" 
class="ng-scope">
   
   metric | kafka_2.13-3.4.0 | kafka_2.13-3.4.0 patched
   -- | -- | --
   endToEnd service_time (ms) p50 max | 2.00 | 2.00 
   endToEnd service_time (ms) p75 max | 3.00 | 2.00 
   endToEnd service_time (ms) p95 max | 94.0 | 3.00 
   endToEnd service_time (ms) p99 max | 290 | 6.00  
   endToEnd service_time (ms) p99.9 max | 355 | 21.0
   endToEnd service_time (ms) p99.99 max | 372 | 34.0
   endToEnd service_time (ms) p100 max | 374 | 36.0 
   publish service_time (ms) p50 max | 1.70 | 1.67 
   publish service_time (ms) p75 max | 2.23 | 2.09 
   publish service_time (ms) p95 max | 90.7 | 2.82 
   publish service_time (ms) p99 max | 287 | 4.69  
   publish service_time (ms) p99.9 max | 353 | 19.6
   publish service_time (ms) p99.99 max | 369 | 31.3
   publish service_time (ms) p100 max | 371 | 33.5 
   
   kafka | endToEnd chart
   -- | --
   kafka_2.13-3.4.0 | 
![image](https://github.com/apache/kafka/assets/6793713/ec329711-47d4-459f-92d7-06310b770023)
 
   kafka_2.13-3.4.0 patched | 
![image](https://github.com/apache/kafka/assets/6793713/e5aeb6a6-d33a-4d57-be80-c916fa1f05be)
   
   latency score improved up to 10x times in high percentiles ^^^, spikes 
almost invisible
   
   ### variation 2:
   partitions=100
   metric | kafka_2.13-3.4.0 | kafka_2.13-3.4.0 patched
   -- | -- | --
   endToEnd service_time (ms) p50 max | 91.0 | 2.00 
   endToEnd service_time (ms) p75 max | 358 | 3.00 
   endToEnd service_time (ms) p95 max | 1814 | 4.00
   endToEnd service_time (ms) p99 max | 2777 | 21.0
   endToEnd service_time (ms) p99.9 max | 3643 | 119
   endToEnd service_time (ms) p99.99 max | 3724 | 141
   endToEnd service_time (ms) p100 max | 3726 | 143  
   publish service_time (ms) p50 max | 77.4 | 1.92 
   publish service_time (ms) p75 max | 352 | 2.35  
   publish service_time (ms) p95 max | 1748 | 3.80
   publish service_time (ms) p99 max | 2740 | 18.9
   publish service_time (ms) p99.9 max | 3619 | 116
   publish service_time (ms) p99.99 max | 3720 | 139
   publish service_time (ms) p100 max | 3722 | 141
   endToEnd service_time  
   
   kafka | endToEnd chart
   -- | --
   kafka_2.13-3.4.0 | 
![image](https://github.com/apache/kafka/assets/6793713/dc6e9820-c3b7-4bd0-8dac-bce9f3886d91)
 
   kafka_2.13-3.4.0 patched | 
![image](https://github.com/apache/kafka/assets/6793713/113b4480-97a4-4dd5-8d5c-f7dc87a3d7a5)
   
   latency score improved up to 25x times in high percentiles ^^^
   
   The fix was done for 3.4 branch  - scala version of ProducerStateManager. 
Trunk needs corresponding fix for ProducerStateManager.java.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to