jianbin.chen created KAFKA-15264:
------------------------------------

             Summary: Compared with 1.1.0zk, the peak throughput of 3.5.1kraft is very jitter
                 Key: KAFKA-15264
                 URL: https://issues.apache.org/jira/browse/KAFKA-15264
             Project: Kafka
          Issue Type: Bug
            Reporter: jianbin.chen

I am preparing to upgrade from 1.1.0 to 3.5.1 in KRaft mode (a new cluster deployment). In a recent comparison test with the following stress-test command, I found an obvious throughput gap:
{code:java}
./kafka-producer-perf-test.sh --topic test321 --num-records 30000000 --record-size 1024 --throughput -1 --producer-props bootstrap.servers=xxx:xxxx acks=1
419813 records sent, 83962.6 records/sec (81.99 MB/sec), 241.1 ms avg latency, 588.0 ms max latency.
555300 records sent, 111015.6 records/sec (108.41 MB/sec), 275.1 ms avg latency, 460.0 ms max latency.
552795 records sent, 110536.9 records/sec (107.95 MB/sec), 265.9 ms avg latency, 1120.0 ms max latency.
552600 records sent, 110520.0 records/sec (107.93 MB/sec), 284.5 ms avg latency, 1097.0 ms max latency.
538500 records sent, 107656.9 records/sec (105.13 MB/sec), 277.5 ms avg latency, 610.0 ms max latency.
511545 records sent, 102309.0 records/sec (99.91 MB/sec), 304.1 ms avg latency, 1892.0 ms max latency.
511890 records sent, 102337.1 records/sec (99.94 MB/sec), 288.4 ms avg latency, 3000.0 ms max latency.
519165 records sent, 103812.2 records/sec (101.38 MB/sec), 262.1 ms avg latency, 1781.0 ms max latency.
513555 records sent, 102669.9 records/sec (100.26 MB/sec), 338.2 ms avg latency, 2590.0 ms max latency.
463329 records sent, 92665.8 records/sec (90.49 MB/sec), 276.8 ms avg latency, 1463.0 ms max latency.
494248 records sent, 98849.6 records/sec (96.53 MB/sec), 327.2 ms avg latency, 2362.0 ms max latency.
506272 records sent, 101254.4 records/sec (98.88 MB/sec), 322.1 ms avg latency, 2986.0 ms max latency.
393758 records sent, 78735.9 records/sec (76.89 MB/sec), 387.0 ms avg latency, 2958.0 ms max latency.
426435 records sent, 85252.9 records/sec (83.25 MB/sec), 363.3 ms avg latency, 1959.0 ms max latency.
412560 records sent, 82298.0 records/sec (80.37 MB/sec), 374.1 ms avg latency, 1995.0 ms max latency.
370137 records sent, 73997.8 records/sec (72.26 MB/sec), 396.8 ms avg latency, 1496.0 ms max latency.
391781 records sent, 78340.5 records/sec (76.50 MB/sec), 410.7 ms avg latency, 2446.0 ms max latency.
355901 records sent, 71166.0 records/sec (69.50 MB/sec), 397.5 ms avg latency, 2715.0 ms max latency.
385410 records sent, 77082.0 records/sec (75.28 MB/sec), 417.5 ms avg latency, 2702.0 ms max latency.
381160 records sent, 76232.0 records/sec (74.45 MB/sec), 407.7 ms avg latency, 1846.0 ms max latency.
333367 records sent, 66660.1 records/sec (65.10 MB/sec), 456.2 ms avg latency, 1414.0 ms max latency.
376251 records sent, 75175.0 records/sec (73.41 MB/sec), 401.9 ms avg latency, 1897.0 ms max latency.
354434 records sent, 70886.8 records/sec (69.23 MB/sec), 425.8 ms avg latency, 1601.0 ms max latency.
353795 records sent, 70744.9 records/sec (69.09 MB/sec), 411.7 ms avg latency, 1563.0 ms max latency.
321993 records sent, 64360.0 records/sec (62.85 MB/sec), 447.3 ms avg latency, 1975.0 ms max latency.
404075 records sent, 80750.4 records/sec (78.86 MB/sec), 408.4 ms avg latency, 1753.0 ms max latency.
384526 records sent, 76905.2 records/sec (75.10 MB/sec), 406.0 ms avg latency, 1833.0 ms max latency.
387652 records sent, 77483.9 records/sec (75.67 MB/sec), 397.3 ms avg latency, 1927.0 ms max latency.
343286 records sent, 68629.7 records/sec (67.02 MB/sec), 455.6 ms avg latency, 1685.0 ms max latency.
333300 records sent, 66646.7 records/sec (65.08 MB/sec), 456.6 ms avg latency, 2146.0 ms max latency.
361191 records sent, 72238.2 records/sec (70.55 MB/sec), 409.4 ms avg latency, 2125.0 ms max latency.
357525 records sent, 71490.7 records/sec (69.82 MB/sec), 436.0 ms avg latency, 1502.0 ms max latency.
340238 records sent, 68047.6 records/sec (66.45 MB/sec), 427.9 ms avg latency, 1932.0 ms max latency.
390016 records sent, 77956.4 records/sec (76.13 MB/sec), 418.5 ms avg latency, 1807.0 ms max latency.
352830 records sent, 70523.7 records/sec (68.87 MB/sec), 439.4 ms avg latency, 1892.0 ms max latency.
354526 records sent, 70905.2 records/sec (69.24 MB/sec), 429.6 ms avg latency, 2128.0 ms max latency.
356670 records sent, 71305.5 records/sec (69.63 MB/sec), 408.9 ms avg latency, 1329.0 ms max latency.
309204 records sent, 60687.7 records/sec (59.27 MB/sec), 438.6 ms avg latency, 2566.0 ms max latency.
366715 records sent, 72316.1 records/sec (70.62 MB/sec), 474.5 ms avg latency, 2169.0 ms max latency.
375174 records sent, 75034.8 records/sec (73.28 MB/sec), 429.9 ms avg latency, 1722.0 ms max latency.
359400 records sent, 70346.4 records/sec (68.70 MB/sec), 432.1 ms avg latency, 1961.0 ms max latency.
312276 records sent, 62430.2 records/sec (60.97 MB/sec), 477.4 ms avg latency, 2006.0 ms max latency.
361875 records sent, 72360.5 records/sec (70.66 MB/sec), 441.2 ms avg latency, 1618.0 ms max latency.
342449 records sent, 68462.4 records/sec (66.86 MB/sec), 446.7 ms avg latency, 2233.0 ms max latency.
338163 records sent, 67619.1 records/sec (66.03 MB/sec), 454.4 ms avg latency, 1839.0 ms max latency.
369139 records sent, 73798.3 records/sec (72.07 MB/sec), 388.3 ms avg latency, 1753.0 ms max latency.
362476 records sent, 72495.2 records/sec (70.80 MB/sec), 438.4 ms avg latency, 2037.0 ms max latency.
321426 records sent, 62267.7 records/sec (60.81 MB/sec), 475.5 ms avg latency, 2059.0 ms max latency.
389137 records sent, 77286.4 records/sec (75.47 MB/sec), 359.7 ms avg latency, 1547.0 ms max latency.
298050 records sent, 59586.2 records/sec (58.19 MB/sec), 563.9 ms avg latency, 2761.0 ms max latency.
325530 records sent, 65028.0 records/sec (63.50 MB/sec), 503.3 ms avg latency, 2950.0 ms max latency.
347306 records sent, 69419.5 records/sec (67.79 MB/sec), 404.0 ms avg latency, 2095.0 ms max latency.
361035 records sent, 72192.6 records/sec (70.50 MB/sec), 429.5 ms avg latency, 1698.0 ms max latency.
334539 records sent, 66907.8 records/sec (65.34 MB/sec), 461.1 ms avg latency, 1731.0 ms max latency.
367423 records sent, 73455.2 records/sec (71.73 MB/sec), 433.1 ms avg latency, 2089.0 ms max latency.
350940 records sent, 68947.0 records/sec (67.33 MB/sec), 434.8 ms avg latency, 1317.0 ms max latency.
351653 records sent, 70316.5 records/sec (68.67 MB/sec), 452.0 ms avg latency, 2948.0 ms max latency.
298410 records sent, 58834.8 records/sec (57.46 MB/sec), 479.2 ms avg latency, 2279.0 ms max latency.
351750 records sent, 70350.0 records/sec (68.70 MB/sec), 460.2 ms avg latency, 2496.0 ms max latency.
355367 records sent, 71073.4 records/sec (69.41 MB/sec), 416.3 ms avg latency, 2120.0 ms max latency.
238517 records sent, 47693.9 records/sec (46.58 MB/sec), 678.9 ms avg latency, 3072.0 ms max latency.
362347 records sent, 72469.4 records/sec (70.77 MB/sec), 423.8 ms avg latency, 1714.0 ms max latency.
308901 records sent, 61767.8 records/sec (60.32 MB/sec), 490.7 ms avg latency, 2339.0 ms max latency.
338280 records sent, 66919.9 records/sec (65.35 MB/sec), 422.8 ms avg latency, 1882.0 ms max latency.
311888 records sent, 61894.8 records/sec (60.44 MB/sec), 516.1 ms avg latency, 3857.0 ms max latency.
319164 records sent, 63832.8 records/sec (62.34 MB/sec), 494.3 ms avg latency, 2250.0 ms max latency.
291160 records sent, 58197.1 records/sec (56.83 MB/sec), 468.7 ms avg latency, 2250.0 ms max latency.
297599 records sent, 55834.7 records/sec (54.53 MB/sec), 472.1 ms avg latency, 3019.0 ms max latency.
314198 records sent, 62814.5 records/sec (61.34 MB/sec), 600.0 ms avg latency, 2863.0 ms max latency.
332534 records sent, 66440.4 records/sec (64.88 MB/sec), 479.2 ms avg latency, 3337.0 ms max latency.
320974 records sent, 64194.8 records/sec (62.69 MB/sec), 470.8 ms avg latency, 2644.0 ms max latency.
364638 records sent, 72825.6 records/sec (71.12 MB/sec), 408.4 ms avg latency, 2095.0 ms max latency.
350255 records sent, 70037.0 records/sec (68.40 MB/sec), 422.9 ms avg latency, 3059.0 ms max latency.
342961 records sent, 68592.2 records/sec (66.98 MB/sec), 461.5 ms avg latency, 1779.0 ms max latency.
348809 records sent, 69733.9 records/sec (68.10 MB/sec), 454.7 ms avg latency, 2621.0 ms max latency.
345438 records sent, 69032.4 records/sec (67.41 MB/sec), 439.0 ms avg latency, 2662.0 ms max latency.
306454 records sent, 61192.9 records/sec (59.76 MB/sec), 504.6 ms avg latency, 2513.0 ms max latency.
300053 records sent, 59843.0 records/sec (58.44 MB/sec), 415.6 ms avg latency, 1655.0 ms max latency.
332067 records sent, 66413.4 records/sec (64.86 MB/sec), 527.9 ms avg latency, 2409.0 ms max latency.
312132 records sent, 62426.4 records/sec (60.96 MB/sec), 463.3 ms avg latency, 2042.0 ms max latency.
30000000 records sent, 73963.402908 records/sec (72.23 MB/sec), 410.86 ms avg latency, 3857.00 ms max latency, 264 ms 50th, 1259 ms 95th, 2102 ms 99th, 2955 ms 99.9th.
{code}
On 1.1.0, with exactly the same command, the stress test is basically jitter-free. I have run it many times and the result is always the same:
{code:java}
30000000 records sent, 108280.576630 records/sec (105.74 MB/sec), 279.05 ms avg latency, 1426.00 ms max latency, 185 ms 50th, 646 ms 95th, 758 ms 99th, 865 ms 99.9th.{code}
I have not yet tested 3.5.1 deployed with ZooKeeper; I will complete that test as soon as possible. Still, it is surprising how obvious the throughput jitter is under KRaft in an extreme stress test. The topic has 30 partitions, and no obvious jitter shows up in CPU usage or GC. The tests ran a 3.5.1 client against a 3.5.1 broker and a 1.1.0 client against a 1.1.0 broker.

Hardware: 4c8g*3 (three nodes, each with 4 cores and 8 GB of memory)

1.1.0 config
{code:java}
####
log.cleanup.policy=delete
log.cleaner.enable=true
log.cleaner.delete.retention.ms=300000
listeners=PLAINTEXT://:9092
broker.id=1
num.network.threads=5
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
message.max.bytes=5242880
replica.fetch.max.bytes=5242880
log.dirs=/data01/kafka110-logs
num.partitions=3
default.replication.factor=2
delete.topic.enable=true
auto.create.topics.enable=true
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=2
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=1
offsets.retention.minutes=1440
log.retention.minutes=30
log.segment.bytes=104857600
log.retention.check.interval.ms=300000
zookeeper.connect=/kafka110-test2
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=2000
num.replica.fetchers=1{code}
3.5.1 conf
{code:java}
####
listeners=PLAINTEXT://:9092,CONTROLLER://:9093

# Name of listener used for communication between brokers.
inter.broker.listener.name=PLAINTEXT

# Listener name, hostname and port the broker will advertise to clients.
# If not set, it uses the value for "listeners".
advertised.listeners=PLAINTEXT://10.58.16.231:9092

# A comma-separated list of the names of the listeners used by the controller.
# If no explicit mapping set in `listener.security.protocol.map`, default will be using PLAINTEXT protocol
# This is required if running in KRaft mode.
controller.listener.names=CONTROLLER
process.roles=broker,controller
broker.id=1
num.network.threads=5
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
message.max.bytes=52428800
replica.fetch.max.bytes=52428800
log.dirs=/data01/kafka-logs-351
node.id=1
controller.quorum.voters=1@:9093,2@:9093,3@:9093
num.partitions=3
default.replication.factor=2
delete.topic.enable=true
auto.create.topics.enable=false
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=3
transaction.state.log.min.isr=1
offsets.retention.minutes=4320
log.retention.hours=72
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
num.replica.fetchers=1
{code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
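
To put a number on the jitter described in this report instead of judging it by eye, the per-interval lines printed by kafka-producer-perf-test.sh could be summarized with a small script. The following is only a minimal sketch: the function name `throughput_stats`, the regular expression, and the choice of the coefficient of variation as the jitter measure are illustrative assumptions, not part of Kafka. The sample text is the first three interval lines and the final summary line of the 3.5.1 run quoted above.

```python
import re
import statistics

# Matches one per-interval line of kafka-producer-perf-test.sh output.
# The trailing "\." after "max latency" skips the final summary line,
# which continues with ", <n> ms 50th, ..." instead of ending there.
INTERVAL_RE = re.compile(
    r"(\d+) records sent, ([\d.]+) records/sec \(([\d.]+) MB/sec\), "
    r"([\d.]+) ms avg latency, ([\d.]+) ms max latency\."
)


def throughput_stats(report: str) -> dict:
    """Summarize per-interval throughput (MB/sec) found in a perf-test report.

    Hypothetical helper: "cv" (stdev / mean) is one possible jitter measure.
    """
    mb_per_sec = [float(m.group(3)) for m in INTERVAL_RE.finditer(report)]
    if len(mb_per_sec) < 2:
        raise ValueError("need at least two interval lines")
    mean = statistics.mean(mb_per_sec)
    stdev = statistics.stdev(mb_per_sec)
    return {
        "intervals": len(mb_per_sec),
        "min": min(mb_per_sec),
        "max": max(mb_per_sec),
        "mean": mean,
        "stdev": stdev,
        "cv": stdev / mean,
    }


# Sample copied from the 3.5.1 run in this report (first three intervals
# plus the final summary line, which the regex deliberately ignores).
SAMPLE = """\
419813 records sent, 83962.6 records/sec (81.99 MB/sec), 241.1 ms avg latency, 588.0 ms max latency.
555300 records sent, 111015.6 records/sec (108.41 MB/sec), 275.1 ms avg latency, 460.0 ms max latency.
552795 records sent, 110536.9 records/sec (107.95 MB/sec), 265.9 ms avg latency, 1120.0 ms max latency.
30000000 records sent, 73963.402908 records/sec (72.23 MB/sec), 410.86 ms avg latency, 3857.00 ms max latency, 264 ms 50th, 1259 ms 95th, 2102 ms 99th, 2955 ms 99.9th.
"""

if __name__ == "__main__":
    for key, value in throughput_stats(SAMPLE).items():
        print(f"{key}: {value}")
```

Running the same function over the full output of both the 1.1.0 and the 3.5.1 runs would give two comparable `cv` values, which may make it easier to track whether a configuration change reduces the jitter.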