Re: [DISCUSS] KIP-1299 Use key range in ProducerPerformance

Andrew Schofield Wed, 27 May 2026 08:47:37 -0700

Hi Jiunn-Yang,
Thanks for the KIP. One small comment.

AS1: Why --message-key-range instead of just --key-range? We've not used 
--message previously in this tool, and actually prefer --record.


Thanks,
Andrew

On 2026/05/26 15:49:42 黃竣陽 wrote:
> Hello chia,
> 
> Thanks for the feedback. I have updated the KIP.
> 
> Best Regards,
> Jiunn-Yang
> 
> > Chia-Ping Tsai <[email protected]> wrote on May 26, 2026, at 11:42 PM:
> > 
> > hi Jiunn-Yang
> > 
> > Thanks for the KIP. It looks like this proposal enables testing on 
> > compacted topics. If so, would you mind updating the Motivation section to 
> > include this?
> > 
> > Best,
> > Chia-Ping
> > 
> > On 2026/05/26 11:46:16 黃竣陽 wrote:
> >> Hello PoAn,
> >> 
> >> Thanks for the feedback
> >> 
> >> poan_00: In range mode, keys are generated by `recordIndex % keyRange`,
> >> which is fully deterministic and not affected by `--random-seed`. In that
> >> case the seed only controls the PRNG used for random payload generation.
> >> The example is misleading, so I will remove it.
> >> 
> >> poan_01: According to the JDK documentation, `SplittableRandom` generates
> >> uniformly distributed pseudorandom values. With a sufficiently large number
> >> of records, each key in random mode appears roughly the same number of
> >> times, so the partition distribution statistically converges toward
> >> behavior similar to range mode.
> >> 
> >> The main difference is that random mode introduces short-term burstiness,
> >> where the same key may appear consecutively for a period of time, while
> >> range mode produces a perfectly even round-robin pattern. However, neither
> >> mode inherently creates a truly skewed (hot-partition) distribution.
> >> 
> >> I’ll update the motivation section to remove the hot-partition claim for 
> >> random mode.
> >> 
> >> Best Regards,
> >> Jiunn-Yang
> >> 
> >>> PoAn Yang <[email protected]> wrote on May 26, 2026, at 7:18 PM:
> >>> 
> >>> Hi Jiunn,
> >>> 
> >>> Thanks for the KIP.
> >>> 
> >>> poan_00: In the example usage, one case uses --key-distribution range
> >>> together with --random-seed. Does the --random-seed parameter take effect
> >>> in this case? If not, can we remove it?
> >>> 
> >>> poan_01: In the Motivation section, one use case of random distribution is
> >>> the hot-partition scenario. However, according to the JDK documentation,
> >>> SplittableRandom is a generator of uniform pseudorandom values [0].
> >>> If the hot-partition scenario arises only from a small key range, can we
> >>> achieve it with the range key distribution directly?
> >>> 
> >>> [0] https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/SplittableRandom.html
> >>> 
> >>> Best,
> >>> PoAn
> >>> 
> >>>> On May 20, 2026, at 8:56 PM, 黃竣陽 <[email protected]> wrote:
> >>>> 
> >>>> Hi chia,
> >>>> 
> >>>> Thanks for the feedback,
> >>>> 
> >>>> chia_00: I have added a new optional argument --random-seed <SEED>
> >>>> (default: 0) to let users set the seed manually. The default value of 0
> >>>> keeps benchmark runs deterministic and reproducible.
> >>>> 
> >>>> chia_01: I have updated the Motivation section in the KIP to elaborate 
> >>>> on the practical 
> >>>> use cases for each key distribution mode.
> >>>> 
> >>>> Best Regards,
> >>>> Jiunn-Yang
> >>>> 
> >>>>> Chia-Ping Tsai <[email protected]> wrote on May 20, 2026, at 11:48 AM:
> >>>>> 
> >>>>> hi Jiunn
> >>>>> 
> >>>>> thanks for this KIP!
> >>>>> 
> >>>>> chia_00: Regarding the random seed, what are your thoughts on its 
> >>>>> initialization?
> >>>>> 
> >>>>> chia_01: Could you elaborate on the practical use cases for each key 
> >>>>> distribution mode in the Motivation section?
> >>>>> 
> >>>>> Best,
> >>>>> Chia-Ping
> >>>>> 
> >>>>> On 2026/03/30 13:06:05 黃竣陽 wrote:
> >>>>>> Hello everyone, 
> >>>>>> 
> >>>>>> I would like to start a discussion on KIP-1299 Use key range in 
> >>>>>> ProducerPerformance
> >>>>>> <https://cwiki.apache.org/confluence/x/XpQ8G>
> >>>>>> 
> >>>>>> This proposal aims to add configurable key distribution support to 
> >>>>>> kafka-producer-perf-test. 
> >>>>>> Currently, the tool always produces records with null keys, which does 
> >>>>>> not reflect real-world 
> >>>>>> keyed workloads. This KIP introduces two new arguments — 
> >>>>>> --key-distribution and --message-key-range 
> >>>>>> — enabling engineers to benchmark with round-robin or random key 
> >>>>>> strategies over a bounded 
> >>>>>> key space, providing more realistic performance measurements.
> >>>>>> 
> >>>>>> Best regards,
> >>>>>> Jiunn-Yang
> >>>> 
> >>> 
> >> 
> >> 
> 
>
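
For readers following the poan_00 and poan_01 exchange quoted above, the two key-generation modes might be sketched roughly as follows. This is only an illustration, not the KIP's implementation: the class `KeyDistributionSketch` and the methods `rangeKey`, `randomKey`, and `countKeys` are hypothetical names, and drawing random keys with `SplittableRandom.nextLong(bound)` is an assumption. The thread itself only states that range mode uses `recordIndex % keyRange` and that `SplittableRandom` produces uniformly distributed values.

```java
import java.util.Arrays;
import java.util.SplittableRandom;

// Hypothetical sketch; names and structure are not taken from the KIP.
public class KeyDistributionSketch {

    // Range mode: the key depends only on the record index, so no seed is involved.
    static long rangeKey(long recordIndex, long keyRange) {
        return recordIndex % keyRange;
    }

    // Random mode (assumed): draw a key uniformly from [0, keyRange) with a seeded PRNG.
    static long randomKey(SplittableRandom random, long keyRange) {
        return random.nextLong(keyRange);
    }

    // Counts how often each key appears across numRecords generated records.
    static long[] countKeys(boolean rangeMode, long seed, int keyRange, int numRecords) {
        SplittableRandom random = new SplittableRandom(seed);
        long[] counts = new long[keyRange];
        for (int i = 0; i < numRecords; i++) {
            long key = rangeMode ? rangeKey(i, keyRange) : randomKey(random, keyRange);
            counts[(int) key]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int keyRange = 4;
        int numRecords = 1_000_000;
        // Range mode yields an exactly even split; random mode should come out close to even.
        System.out.println("range:  " + Arrays.toString(countKeys(true, 0L, keyRange, numRecords)));
        System.out.println("random: " + Arrays.toString(countKeys(false, 0L, keyRange, numRecords)));
    }
}
```

In this sketch the seed changes nothing in range mode, and with a fixed seed random mode repeats the same key sequence on every run. Neither mode produces a skewed distribution, which matches the conclusion reached in the thread.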
