Re: [DISCUSS] KIP-1299 Use key range in ProducerPerformance

Andrew Schofield Wed, 27 May 2026 09:05:20 -0700

Alignment is important so people don't have to guess. Works for me too.


On 2026/05/27 15:55:31 Chia-Ping Tsai wrote:
> > AS1: Why --message-key-range instead of just --key-range? We've not used 
> > --message previously in this tool, and actually prefer --record.
> 
> If the goal is to align the naming, I prefer "--record-key-range", as many 
> existing arguments already have the prefix "--record".
> 
> On 2026/05/27 15:47:29 Andrew Schofield wrote:
> > Hi Jiunn-Yang,
> > Thanks for the KIP. One small comment.
> > 
> > AS1: Why --message-key-range instead of just --key-range? We've not used 
> > --message previously in this tool, and actually prefer --record.
> > 
> > Thanks,
> > Andrew
> > 
> > On 2026/05/26 15:49:42 黃竣陽 wrote:
> > > Hello chia,
> > > 
> > > Thanks for the feedback. I have updated the KIP.
> > > 
> > > Best Regards,
> > > Jiunn-Yang
> > > 
> > > > On May 26, 2026, at 11:42 PM, Chia-Ping Tsai <[email protected]> wrote:
> > > > 
> > > > hi Jiunn-Yang
> > > > 
> > > > Thanks for the KIP. It looks like this proposal enables testing on 
> > > > compacted topics. If so, would you mind updating the Motivation section 
> > > > to include this?
> > > > 
> > > > Best,
> > > > Chia-Ping
> > > > 
> > > > On 2026/05/26 11:46:16 黃竣陽 wrote:
> > > >> Hello PoAn,
> > > >> 
> > > > >> Thanks for the feedback.
> > > > >> 
> > > > >> poan_00: In range mode, keys are generated by `recordIndex % keyRange`,
> > > > >> which is fully deterministic and not affected by `--random-seed`. In that
> > > > >> case the seed only controls the PRNG used for random payload generation.
> > > > >> The example is misleading, so I will remove it.
> > > >> 
> > > > >> poan_01: According to the JDK documentation, `SplittableRandom` generates
> > > > >> uniformly distributed pseudorandom values. With a sufficiently large number
> > > > >> of records, each key in random mode appears roughly the same number of
> > > > >> times, so the partition distribution statistically converges toward
> > > > >> behavior similar to range mode.
> > > > >> 
> > > > >> The main difference is that random mode introduces short-term burstiness,
> > > > >> where the same key may appear consecutively for a period of time, while
> > > > >> range mode produces a perfectly even round-robin pattern. However, neither
> > > > >> mode inherently creates a truly skewed (hot-partition) distribution.
> > > > >> 
> > > > >> I’ll update the motivation section to remove the hot-partition claim for
> > > > >> random mode.
> > > >> 
> > > >> Best Regards,
> > > >> Jiunn-Yang
> > > >> 
> > > > >>> On May 26, 2026, at 7:18 PM, PoAn Yang <[email protected]> wrote:
> > > >>> 
> > > >>> Hi Jiunn,
> > > >>> 
> > > >>> Thanks for the KIP.
> > > >>> 
> > > > >>> poan_00: In the example usage, one case uses --key-distribution range
> > > > >>> together with --random-seed. Does the --random-seed parameter take effect
> > > > >>> in this case? If not, can we remove it?
> > > > >>> 
> > > > >>> poan_01: In the Motivation section, one use case given for the random
> > > > >>> distribution is the hot-partition scenario. However, according to the JDK
> > > > >>> documentation, SplittableRandom is a generator of uniform pseudorandom
> > > > >>> values [0]. If the hot-partition scenario arises only from a small key
> > > > >>> range, can we achieve it with the range key distribution directly?
> > > > >>> 
> > > > >>> [0] https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/SplittableRandom.html
> > > >>> 
> > > >>> Best,
> > > >>> PoAn
> > > >>> 
> > > >>>> On May 20, 2026, at 8:56 PM, 黃竣陽 <[email protected]> wrote:
> > > >>>> 
> > > >>>> Hi chia,
> > > >>>> 
> > > > >>>> Thanks for the feedback.
> > > >>>> 
> > > >>>> chia_00: I have added a new optional argument --random-seed <SEED> 
> > > >>>> (default: 0) 
> > > >>>> to let users set the seed manually. The default value of 0 ensures 
> > > >>>> deterministic, reproducible 
> > > >>>> benchmark runs by default. 
> > > >>>> 
> > > >>>> chia_01: I have updated the Motivation section in the KIP to 
> > > >>>> elaborate on the practical 
> > > >>>> use cases for each key distribution mode.
> > > >>>> 
> > > >>>> Best Regards,
> > > >>>> Jiunn-Yang
> > > >>>> 
> > > > >>>>> On May 20, 2026, at 11:48 AM, Chia-Ping Tsai <[email protected]> wrote:
> > > >>>>> 
> > > >>>>> hi Jiunn
> > > >>>>> 
> > > >>>>> thanks for this KIP!
> > > >>>>> 
> > > >>>>> chia_00: Regarding the random seed, what are your thoughts on its 
> > > >>>>> initialization?
> > > >>>>> 
> > > >>>>> chia_01: Could you elaborate on the practical use cases for each 
> > > >>>>> key distribution mode in the Motivation section?
> > > >>>>> 
> > > > >>>>> Best,
> > > > >>>>> Chia-Ping
> > > >>>>> 
> > > >>>>> On 2026/03/30 13:06:05 黃竣陽 wrote:
> > > >>>>>> Hello everyone, 
> > > >>>>>> 
> > > >>>>>> I would like to start a discussion on KIP-1299 Use key range in 
> > > >>>>>> ProducerPerformance
> > > >>>>>> <https://cwiki.apache.org/confluence/x/XpQ8G>
> > > >>>>>> 
> > > >>>>>> This proposal aims to add configurable key distribution support to 
> > > >>>>>> kafka-producer-perf-test. 
> > > >>>>>> Currently, the tool always produces records with null keys, which 
> > > >>>>>> does not reflect real-world 
> > > >>>>>> keyed workloads. This KIP introduces two new arguments — 
> > > >>>>>> --key-distribution and --message-key-range 
> > > >>>>>> — enabling engineers to benchmark with round-robin or random key 
> > > >>>>>> strategies over a bounded 
> > > >>>>>> key space, providing more realistic performance measurements.
> > > >>>>>> 
> > > >>>>>> Best regards,
> > > >>>>>> Jiunn-Yang
> > > >>>> 
> > > >>> 
> > > >> 
> > > >> 
> > > 
> > > 
> > 
>
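
To make the comparison of the two key modes in the quoted thread concrete, here is a minimal Java sketch. It is not the KIP's implementation: the class and method names (KeyDistributionSketch, rangeKey, randomKey, keys, counts, longestRun) are hypothetical, and the shape of random mode is an assumption. Only the `recordIndex % keyRange` rule, the use of `SplittableRandom`, and the default seed of 0 come from the discussion.

```java
import java.util.Arrays;
import java.util.SplittableRandom;

// Hypothetical illustration only; names and structure are not taken from the KIP.
final class KeyDistributionSketch {

    // Range mode (rule quoted in the thread): key index = recordIndex % keyRange.
    // No random generator is involved, so a seed cannot affect the result.
    static long rangeKey(long recordIndex, long keyRange) {
        return recordIndex % keyRange;
    }

    // Random mode (assumed shape): a uniform draw from [0, keyRange).
    static long randomKey(SplittableRandom random, long keyRange) {
        return random.nextLong(keyRange);
    }

    // Generates the key sequence for numRecords records.
    // A null generator selects range mode; otherwise random mode is used.
    static long[] keys(int numRecords, long keyRange, SplittableRandom random) {
        long[] out = new long[numRecords];
        for (int i = 0; i < numRecords; i++) {
            out[i] = (random == null) ? rangeKey(i, keyRange) : randomKey(random, keyRange);
        }
        return out;
    }

    // Counts how many times each key index occurs in the sequence.
    static long[] counts(long[] keys, int keyRange) {
        long[] c = new long[keyRange];
        for (long k : keys) {
            c[(int) k]++;
        }
        return c;
    }

    // Length of the longest run of identical consecutive keys,
    // used here as a simple measure of short-term burstiness.
    static int longestRun(long[] keys) {
        int best = 0;
        int run = 0;
        for (int i = 0; i < keys.length; i++) {
            run = (i > 0 && keys[i] == keys[i - 1]) ? run + 1 : 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int numRecords = 100_000;
        int keyRange = 10;
        long seed = 0L; // mirrors the default discussed for --random-seed

        long[] range = keys(numRecords, keyRange, null);
        long[] random = keys(numRecords, keyRange, new SplittableRandom(seed));

        System.out.println("range  counts: " + Arrays.toString(counts(range, keyRange)));
        System.out.println("random counts: " + Arrays.toString(counts(random, keyRange)));
        System.out.println("range  longest run: " + longestRun(range));
        System.out.println("random longest run: " + longestRun(random));
    }
}
```

Under these assumptions, range mode should report identical counts for every key and a longest run of 1, while random mode should report counts that are close but not identical, along with longer runs of repeated keys. Neither sequence concentrates traffic on a single key, which matches the point made in the thread that a small key range alone does not produce a hot partition.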
