Re: [DISCUSS] KIP-1299 Use key range in ProducerPerformance

黃竣陽 Tue, 26 May 2026 08:51:13 -0700

Hello chia,

Thanks for the feedback. I have updated the KIP.


Best Regards,
Jiunn-Yang

> On May 26, 2026, at 11:42 PM, Chia-Ping Tsai <[email protected]> wrote:
> 
> hi Jiunn-Yang
> 
> Thanks for the KIP. It looks like this proposal enables testing on compacted 
> topics. If so, would you mind updating the Motivation section to include this?
> 
> Best,
> Chia-Ping
> 
> On 2026/05/26 11:46:16 黃竣陽 wrote:
>> Hello PoAn,
>> 
>> Thanks for the feedback.
>> 
>> poan_00: In range mode, keys are generated by `recordIndex % keyRange`,
>> which is fully deterministic and not affected by `--random-seed`. In that
>> case, the seed only controls the PRNG used for random payload generation.
>> The example is misleading, so I will remove it.
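
The range-mode behavior described above can be illustrated with a minimal sketch. It is not the KIP's implementation: the class and method names (`RangeKeySketch`, `rangeKeyIndex`, `payload`) and the payload alphabet are hypothetical, and only the `recordIndex % keyRange` formula and the seed's role in payload generation come from this thread.

```java
import java.util.Arrays;
import java.util.SplittableRandom;

final class RangeKeySketch {
    // Range mode: the key index depends only on the record index and the key range.
    // No random number generator is involved, so a seed cannot change the result.
    static long rangeKeyIndex(long recordIndex, long keyRange) {
        return recordIndex % keyRange;
    }

    // Hypothetical payload generator: the seed only affects these bytes.
    static byte[] payload(long seed, int size) {
        SplittableRandom random = new SplittableRandom(seed);
        byte[] bytes = new byte[size];
        for (int i = 0; i < size; i++) {
            bytes[i] = (byte) ('A' + random.nextInt(26));
        }
        return bytes;
    }

    public static void main(String[] args) {
        long keyRange = 3;
        // Keys cycle through 0..keyRange-1 in order, regardless of any seed.
        for (long recordIndex = 0; recordIndex < 6; recordIndex++) {
            System.out.println(recordIndex + " -> key " + rangeKeyIndex(recordIndex, keyRange));
        }
        // The same seed reproduces the same payload bytes.
        System.out.println(Arrays.equals(payload(0L, 8), payload(0L, 8)));
    }
}
```

Under these assumptions, changing the seed alters only the payload bytes. The key sequence stays the same.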
>> 
>> poan_01: According to the JDK documentation, `SplittableRandom` generates
>> uniformly distributed pseudorandom values. With a sufficiently large number
>> of records, each key in random mode appears roughly the same number of
>> times, so the partition distribution statistically converges toward that
>> of range mode.
>> 
>> The main difference is that random mode introduces short-term burstiness,
>> where the same key may appear consecutively for a period of time, while
>> range mode produces a perfectly even round-robin pattern. However, neither
>> mode inherently creates a truly skewed (hot-partition) distribution.
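
The convergence argument above can be checked with a small counting sketch. The names (`KeyDistributionSketch`, `rangeCounts`, `randomCounts`) are hypothetical, and drawing random-mode keys with `SplittableRandom.nextInt(keyRange)` is an assumption about how the tool might pick keys, not something the KIP confirms.

```java
import java.util.Arrays;
import java.util.SplittableRandom;

final class KeyDistributionSketch {
    // Range mode: round-robin over the key space, so counts are exactly even
    // whenever the record count is a multiple of keyRange.
    static long[] rangeCounts(long records, int keyRange) {
        long[] counts = new long[keyRange];
        for (long i = 0; i < records; i++) {
            counts[(int) (i % keyRange)]++;
        }
        return counts;
    }

    // Random mode (assumed): each key index is drawn uniformly from [0, keyRange).
    static long[] randomCounts(long records, int keyRange, long seed) {
        SplittableRandom random = new SplittableRandom(seed);
        long[] counts = new long[keyRange];
        for (long i = 0; i < records; i++) {
            counts[random.nextInt(keyRange)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int keyRange = 10;
        long records = 1_000_000;
        System.out.println("range:  " + Arrays.toString(rangeCounts(records, keyRange)));
        System.out.println("random: " + Arrays.toString(randomCounts(records, keyRange, 0L)));
    }
}
```

With a large record count, the per-key totals in the random case should sit close to `records / keyRange`, so neither mode concentrates load on a single key. The modes differ in the order in which keys arrive, not in the long-run totals.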
>> 
>> I’ll update the motivation section to remove the hot-partition claim for 
>> random mode.
>> 
>> Best Regards,
>> Jiunn-Yang
>> 
>>> On May 26, 2026, at 7:18 PM, PoAn Yang <[email protected]> wrote:
>>> 
>>> Hi Jiunn,
>>> 
>>> Thanks for the KIP.
>>> 
>>> poan_00: In the example usage, one case uses --key-distribution range
>>> together with --random-seed. Does the --random-seed parameter take effect
>>> in this case? If not, can we remove it?
>>> 
>>> poan_01: In the Motivation section, one use case for random distribution
>>> is the hot-partition scenario. However, according to the JDK documentation,
>>> SplittableRandom is a generator of uniform pseudorandom values [0].
>>> If the hot-partition scenario arises only from a small key range, can we
>>> achieve it with the range key distribution directly?
>>> 
>>> [0] https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/SplittableRandom.html
>>> 
>>> Best,
>>> PoAn
>>> 
>>>> On May 20, 2026, at 8:56 PM, 黃竣陽 <[email protected]> wrote:
>>>> 
>>>> Hi chia,
>>>> 
>>>> Thanks for the feedback.
>>>> 
>>>> chia_00: I have added a new optional argument --random-seed <SEED>
>>>> (default: 0) to let users set the seed manually. The default value of 0
>>>> keeps benchmark runs deterministic and reproducible unless the user
>>>> overrides it.
>>>> 
>>>> chia_01: I have updated the Motivation section in the KIP to elaborate on 
>>>> the practical 
>>>> use cases for each key distribution mode.
>>>> 
>>>> Best Regards,
>>>> Jiunn-Yang
>>>> 
>>>>> On May 20, 2026, at 11:48 AM, Chia-Ping Tsai <[email protected]> wrote:
>>>>> 
>>>>> hi Jiunn
>>>>> 
>>>>> thanks for this KIP!
>>>>> 
>>>>> chia_00: Regarding the random seed, what are your thoughts on its 
>>>>> initialization?
>>>>> 
>>>>> chia_01: Could you elaborate on the practical use cases for each key 
>>>>> distribution mode in the Motivation section?
>>>>> 
>>>>> Best,
>>>>> Chia-Ping
>>>>> 
>>>>> On 2026/03/30 13:06:05 黃竣陽 wrote:
>>>>>> Hello everyone, 
>>>>>> 
>>>>>> I would like to start a discussion on KIP-1299 Use key range in 
>>>>>> ProducerPerformance
>>>>>> <https://cwiki.apache.org/confluence/x/XpQ8G>
>>>>>> 
>>>>>> This proposal aims to add configurable key distribution support to
>>>>>> kafka-producer-perf-test. Currently, the tool always produces records
>>>>>> with null keys, which does not reflect real-world keyed workloads.
>>>>>> This KIP introduces two new arguments, --key-distribution and
>>>>>> --message-key-range, which let engineers benchmark with round-robin or
>>>>>> random key strategies over a bounded key space and obtain more
>>>>>> realistic performance measurements.
>>>>>> 
>>>>>> Best regards,
>>>>>> Jiunn-Yang
