Re: [DISCUSS] KIP-1299 Use key range in ProducerPerformance

Chia-Ping Tsai Tue, 26 May 2026 08:42:14 -0700

hi Jiunn-Yang

Thanks for the KIP. It looks like this proposal enables testing on compacted 
topics. If so, would you mind updating the Motivation section to include this?


Best,
Chia-Ping

On 2026/05/26 11:46:16 黃竣陽 wrote:
> Hello PoAn,
> 
> Thanks for the feedback
> 
> poan_00: In range mode, keys are generated by `recordIndex % keyRange`,
> which is fully deterministic and not affected by `--random-seed`. In that
> case the seed only controls the PRNG used for random payload generation.
> The example is misleading; I will remove it.
> 
> poan_01: According to the JDK documentation, `SplittableRandom` generates
> uniformly distributed pseudorandom values. With a sufficiently large number
> of records, each key in random mode appears roughly the same number of
> times, so the partition distribution statistically converges toward
> behavior similar to range mode.
> 
> The main difference is that random mode introduces short-term burstiness,
> where the same key may appear consecutively for a period of time, while
> range mode produces a perfectly even round-robin pattern. However, neither
> mode inherently creates a truly skewed (hot-partition) distribution.
> 
> I’ll update the motivation section to remove the hot-partition claim for 
> random mode.
> 
> Best Regards,
> Jiunn-Yang
> 
> > PoAn Yang <[email protected]> wrote on May 26, 2026, at 7:18 PM:
> > 
> > Hi Jiunn,
> > 
> > Thanks for the KIP.
> > 
> > poan_00: In the example usage, one case uses --key-distribution range
> > together with --random-seed. Does the --random-seed parameter take effect
> > in this case? If not, can we remove it?
> > 
> > poan_01: In the Motivation section, one use case for the random
> > distribution is the hot-partition scenario. However, according to the JDK
> > documentation, SplittableRandom is a generator of uniform pseudorandom
> > values [0]. If the hot-partition scenario arises only from a small key
> > range, can we achieve it with the range key distribution directly?
> > 
> > [0] https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/SplittableRandom.html
> > 
> > Best,
> > PoAn
> > 
> >> On May 20, 2026, at 8:56 PM, 黃竣陽 <[email protected]> wrote:
> >> 
> >> Hi chia,
> >> 
> >> Thanks for the feedback,
> >> 
> >> chia_00: I have added a new optional argument --random-seed <SEED>
> >> (default: 0) to let users set the seed manually. The default value of 0
> >> ensures deterministic, reproducible benchmark runs.
> >> 
> >> chia_01: I have updated the Motivation section in the KIP to elaborate on 
> >> the practical 
> >> use cases for each key distribution mode.
> >> 
> >> Best Regards,
> >> Jiunn-Yang
> >> 
> >>> Chia-Ping Tsai <[email protected]> wrote on May 20, 2026, at 11:48 AM:
> >>> 
> >>> hi Jiunn
> >>> 
> >>> thanks for this KIP!
> >>> 
> >>> chia_00: Regarding the random seed, what are your thoughts on its 
> >>> initialization?
> >>> 
> >>> chia_01: Could you elaborate on the practical use cases for each key 
> >>> distribution mode in the Motivation section?
> >>> 
> >>> Best,
> >>> Chia-Ping
> >>> 
> >>> On 2026/03/30 13:06:05 黃竣陽 wrote:
> >>>> Hello everyone, 
> >>>> 
> >>>> I would like to start a discussion on KIP-1299 Use key range in 
> >>>> ProducerPerformance
> >>>> <https://cwiki.apache.org/confluence/x/XpQ8G>
> >>>> 
> >>>> This proposal aims to add configurable key distribution support to
> >>>> kafka-producer-perf-test. Currently, the tool always produces records
> >>>> with null keys, which does not reflect real-world keyed workloads. This
> >>>> KIP introduces two new arguments, --key-distribution and
> >>>> --message-key-range, enabling engineers to benchmark with round-robin or
> >>>> random key strategies over a bounded key space, providing more realistic
> >>>> performance measurements.
> >>>> 
> >>>> Best regards,
> >>>> Jiunn-Yang
> >> 
> > 
> 
>
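
To make the poan_00 / poan_01 exchange above concrete, here is a minimal Java sketch of the two key-generation modes. It is not the KIP's implementation. The formula `recordIndex % keyRange`, the use of `SplittableRandom`, and the seed default of 0 come from the thread. The class name, the method names, and the choice of `nextLong(bound)` for random mode are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.SplittableRandom;

// Hypothetical sketch; class and method names are illustrative, not from the KIP.
class KeyDistributionSketch {

    // Range mode, as stated in the thread: key = recordIndex % keyRange.
    static long rangeKey(long recordIndex, long keyRange) {
        return recordIndex % keyRange;
    }

    // Random mode (assumption): a uniform draw from [0, keyRange).
    static long randomKey(SplittableRandom rng, long keyRange) {
        return rng.nextLong(keyRange);
    }

    // Counts how often each key appears in range mode.
    static long[] countRange(int records, int keyRange) {
        long[] counts = new long[keyRange];
        for (long i = 0; i < records; i++) {
            counts[(int) rangeKey(i, keyRange)]++;
        }
        return counts;
    }

    // Counts how often each key appears in random mode for a given seed.
    static long[] countRandom(int records, int keyRange, long seed) {
        SplittableRandom rng = new SplittableRandom(seed);
        long[] counts = new long[keyRange];
        for (int i = 0; i < records; i++) {
            counts[(int) randomKey(rng, keyRange)]++;
        }
        return counts;
    }

    // Longest run of identical consecutive keys in random mode (burstiness).
    static int longestRandomRun(int records, int keyRange, long seed) {
        SplittableRandom rng = new SplittableRandom(seed);
        int longest = 0;
        int current = 0;
        long previous = -1;
        for (int i = 0; i < records; i++) {
            long key = randomKey(rng, keyRange);
            current = (key == previous) ? current + 1 : 1;
            longest = Math.max(longest, current);
            previous = key;
        }
        return longest;
    }

    public static void main(String[] args) {
        int records = 100_000;
        int keyRange = 10;
        long seed = 0L; // the default seed value mentioned in the thread
        System.out.println("range:  " + Arrays.toString(countRange(records, keyRange)));
        System.out.println("random: " + Arrays.toString(countRandom(records, keyRange, seed)));
        System.out.println("longest random run: " + longestRandomRun(records, keyRange, seed));
    }
}
```

In this sketch, range mode never repeats a key back to back when `keyRange` is greater than 1, and its per-key counts are exactly equal whenever `records` is a multiple of `keyRange`. Random mode gives per-key counts that are only approximately equal and shows runs of repeated keys, which matches the "short-term burstiness without real skew" point made in the reply.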
