RE: [DISCUSS] KIP-1052: Enable warmup in producer performance test

Welch, Matt Tue, 02 Jul 2024 13:39:37 -0700

Hi Federico, 

Thanks for your response.  I have a few questions.


> You mean, no existing public interfaces right? Anyway, this content should be 
> in the "Compatibility, Deprecation, and Migration Plan"
section.
Just a question here: Should the contents of the "Public Interfaces" section 
after that first sentence be moved to "Compatibility, Deprecation, and 
Migration Plan"? I guess I wasn't thinking of the tools as an interface.

> I would also say that, if we specify warmup-records, we also want separate 
> stats. With this change it would be really straightforward IMO, and we 
> wouldn't need the additional separated-warmup option.
> Wdyt?
I no longer think we really need a separated-warmup option and I would prefer 
to leave the separated-warmup option out for simplicity. My preference is to 
have as much data as available presented, so the user has maximum information 
to make decisions about warmups and performance. That would imply having 
*three* printed summary lines for "whole test", "warmup only", and "steady 
state only".  Each of these would have a slightly different message like 
"records sent", "warmup records sent", and "steady state records sent" to 
enable the user to differentiate between them. I haven't modified the KIP to 
reflect this yet because there seems to be some motivation for having this as 
an option. What is the preference of the community here? Would having all three 
summary lines printed at end of test be confusing, informative, or other?

> Have you considered having a sort of autopilot for computing the warmup size 
> based on the tool's output information (window's p99)?
> Once p99 is stable enough, the tool could start the steady-state phase 
> printing out the computed warmup size. In case we decide this is tricky or 
> undesired behavior, we can list the autopilot mode in the rejected 
> alternatives, along with motivations.
I like the idea of a producer autopilot, but it's sufficiently complex that it 
needs its own KIP. I've added a description of the autopilot feature to the 
Rejected Alternatives.

> Just a nit, but I think we miss the --payload-file option in all snippets.
Nit or not, I really appreciate the thorough review!  I've updated the command 
lines to contain the payload-file option.

Thanks,
Matt

-----Original Message-----
From: Federico Valeri <fedeval...@gmail.com> 
Sent: Monday, July 1, 2024 1:04 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-1052: Enable warmup in producer performance test

Hi Matt, thanks for the updates. Snippets are really useful.

> No public interfaces are affected.

You mean, no existing public interfaces right? Anyway, this content should be 
in the "Compatibility, Deprecation, and Migration Plan"
section.

> The first option, --warmup-records, will be added to the producer performance 
> test to request that the initial records sent in a test be gathered into a 
> separate Stats object from the steady-state records to follow.

I would also say that, if we specify warmup-records, we also want separate 
stats. With this change it would be really straightforward IMO, and we wouldn't 
need the additional separated-warmup option.
Wdyt?

> Although the producer performance test output should provide 
> sufficient information to set a warmup

Have you considered having a sort of autopilot for computing the warmup size 
based on the tool's output information (window's p99)?
Once p99 is stable enough, the tool could start the steady-state phase printing 
out the computed warmup size. In case we decide this is tricky or undesired 
behavior, we can list the autopilot mode in the rejected alternatives, along 
with motivations.

> bin/kafka-producer-perf-test.sh --num-records 1000000 --throughput 
> 50000

Just a nit, but I think we miss the --payload-file option in all snippets.

On Sat, Jun 29, 2024 at 1:19 AM Welch, Matt <matt.we...@intel.com> wrote:
>
> Hi Luke and Federico,
>
> Thank you for your responses.  Your questions seemed to be along similar 
> lines so I've combined the responses.
> Please let me know if you need more clarification.
>
> 1. I've updated the KIP to describe both new command line options, now 
> '--warmup-records' and '--separated-warmup'.  After reading Federico's email, 
> I realized the parameter '--combined-summary' didn't make sense in its 
> intended use and the revised parameter name 'separated-warmup' more 
> accurately reflects its purpose: to separate warmup statistics from 
> steady-state statistics. I could easily be convinced that 
> 'separated-warmup-statistics' would be a better parameter name, but that 
> seemed too verbose at first. The default for 'separated-warmup' is false and 
> now has a better description and help output example. The intent for these 
> options is definitely to avoid any breakage of existing producer performance 
> system tests.  I've also revised some of the language used for better clarity.
>
> 2. I've added example snippets of tool invocation and output for the producer 
> performance test for three cases:
>     (A) typical (existing) usage without invoking 'warmup-records'
>     (B) 'warmup-records' without using 'separated-warmup'
>     (C) 'warmup-records' with 'separated-warmup'
>
> Also, I completely agree that a consumer test warmup would be a great feature 
> and should be described in a future KIP.
>
> Thanks,
> Matt
>
> -----Original Message-----
> From: Federico Valeri <fedeval...@gmail.com>
> Sent: Friday, June 21, 2024 8:02 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-1052: Enable warmup in producer performance 
> test
>
> Hi Matt, I thanks for the KIP, this is a really useful feature.
>
> In public interfaces, you say that the output won't change by default, so I 
> guess this means that --combined-summary will be false by default, otherwise 
> we would break the producer_performance system test. Is that correct? I think 
> a couple of command line snippets would help here.
>
> I think it would be great to also add a warmup phase to the consumer perf 
> tool, but this probably deserves it's own KIP as we don't have latency stats 
> there.
>
>
> On Fri, Jun 21, 2024 at 2:16 PM Luke Chen <show...@gmail.com> wrote:
> >
> > Hi Matt,
> >
> > Thanks for the KIP!
> > I agree having the warm-up records could help correctly analyze the 
> > performance.
> >
> > Some questions:
> > 1. It looks like we will add 2 more options to producer perf tool:
> >  - --warmup-records
> >  - --combined-summary
> >
> > Is this correct?
> > In the "public interface" section, only 1 is mentioned. Could you update it?
> > Also, in the KIP, you use the word: "An option such as "--warmup-records"
> > should be added...", it sounds like it is not decided, yet.
> > I suggest you update to say, we will add "--warmup-records" option 
> > for ...." to make it clear.
> >
> > 2. What will be the output containing both warm-up and steady-state results?
> > Could you give an example directly?
> >
> > For better understanding, I'd suggest you refer to KIP-1057 
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-1057%3A+Add+r
> > em
> > ote+log+metadata+flag+to+the+dump+log+tool>
> > to add some examples using `kafka-producer-perf-test.sh` with the 
> > new option, to show what it will output.
> >
> > Thank you.
> > Luke
> >
> > On Fri, Jun 21, 2024 at 10:39 AM Welch, Matt <matt.we...@intel.com> wrote:
> >
> > > Hi Divij,
> > >
> > > Thanks for your response.  You raise some very important points.
> > > I've updated the KIP to clarify the changes discussed here.
> > >
> > > 1. I agree that warmup stats should be printed separately.  I see 
> > > two cases here, both of which would have two summary lines printed 
> > > at the end of the producer perf test.  In the first case, 
> > > warmup-separate, the warmup stats are printed first as 
> > > warmup-only, followed by a second print of the steady state 
> > > performance. In the second case, warmup-combined, the first print 
> > > would look identical to the summary line that's currently used and 
> > > would reflect the "whole test", with a second summary print of the 
> > > steady-state performance.  This second case would allow for users 
> > > to compare what the test would have looked like without a warmup 
> > > to results of the test with a warmup. Although I've been looking 
> > > at the second case lately, I can see merits of both cases and 
> > > would be happy to support the warmup-separate case if that's the 
> > > preference of the community.
> > > Regarding the JMX metrics accumulated by Kafka, we need to decide 
> > > if we should reset the JMX metrics between the warmup and steady state.
> > > While I like the idea of having separate JMX buckets for warmup 
> > > and steady state, these statistics are usually heavily windowed, so 
> > > should naturally settle toward steady-state values after a warmup.
> > >
> > > 2. The total number of records sent by the producer and defined by 
> > > '--num-records' MUST be strictly greater than the '--warmup-records'
> > > or an error will be thrown. Negative warmup records should 
> > > similarly throw an error.  Specifying warmup-records of "0" should 
> > > have behavior identical to the current implementation.
> > >
> > > 3.  You're correct that choosing the warmup duration can have a 
> > > significant impact on the test output if care is not taken.  I've 
> > > updated the proposed change to describe a simplistic process to 
> > > choose how many warmup records to use.  Without understanding all 
> > > the factors that go into a warmup, a user could run a test and 
> > > watch the time series output of the producer test to determine 
> > > when steady state has been reached and warmup has completed.  The 
> > > number of records at which the producer hits steady state could 
> > > then be used in subsequent tests. In practice, we find that 1 
> > > minute is a good warmup for most cases, since aside from 
> > > networking and storage initialization, even the JVM should be 
> > > warmed up by then and using compiled code rather than interpreted 
> > > byte code. This is more a heuristic, however, and measured latency and 
> > > throughput of the system should be used to determine steady state.
> > >
> > > 4.  The current design has the user specifying the warmup records 
> > > like they would specify the number of records for the test. While 
> > > this is related to the throughput, it seemed a better option to 
> > > have the user specify the number of records in the warmup, rather 
> > > than some kind of duration which would be more complex to track. I 
> > > completely agree with your concern of warmup affecting steady 
> > > state, however, especially in short tests. With a warmup 
> > > "removing" some of the high latency from steady state results, it 
> > > could be tempting for users to run very short tests since they no 
> > > longer need to wait long to achieve a repeatable steady-state 
> > > result. I would consider this a case of insufficient warmup since 
> > > Kafka could still be processing the warmup records as you mention. 
> > > Best practice for warmup duration would be to hit steady state 
> > > during the warmup and only then consider it a successful warmup. 
> > > Our preferred process is to monitor producer latency until it hits 
> > > steady state in a first test, then double that duration for the 
> > > warmup in subsequent testing. One minute is usually sufficient. A 
> > > problem does occur when using unlimited throughput since the user 
> > > does not yet know how fast the producers will send so can't 
> > > estimate warmup records. If the iterative testing described above is not 
> > > possible to estimate a warmup, the user must choose a fairly large number 
> > > of records for the warmup.
> > >
> > > Best Regards,
> > > Matt Welch
> > >
> > >
> > > -----Original Message-----
> > > From: Divij Vaidya <divijvaidy...@gmail.com>
> > > Sent: Sunday, June 16, 2024 11:21 AM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [DISCUSS] KIP-1052: Enable warmup in producer 
> > > performance test
> > >
> > > Thank you for the KIP, Matt.
> > >
> > > Totally agree on having a warm-up for benchmark testing. The 
> > > initial producer setup time could involve things such as network 
> > > connection setup (including authN, SSL handshake etc), DNS 
> > > resolution, metadata fetching etc which could impact the result of 
> > > steady-state performance.
> > >
> > > May I suggest adding some more clarity to the KIP about the following:
> > >
> > > 1. Should we also include the metrics for warm-up separately 
> > > (instead of having them as 0)? This would have the advantage of 
> > > reporting both warm up performance and steady state performance in 
> > > the same benchmark run. A similar report is followed by JMH
> > > (https://github.com/openjdk/jmh) as well.
> > > You can look at it for some inspiration.
> > >
> > > 2. Please add validation that num-records should be greater than 
> > > warm-up records. Else report an error.
> > >
> > > 3. Please add a recommendation in the docs for the tool on what an 
> > > ideal value for warm up should be. For users who may not be 
> > > completely familiar with producer buffering / back-pressure, it 
> > > would be useful to understand a good value to set. In my opinion,
> > >
> > > 4. I wonder how the --throughput parameter works with the warmup!
> > > Could we have a situation where the "steady-state" is impacted by 
> > > the warm-up traffic? As an example, we could land in a situation 
> > > where the slow processing of warm-up messages could impact the 
> > > measurement of steady-state. This could happen in a situation when 
> > > warm-up messages are waiting to be processed on the server (or 
> > > maybe on the producer buffer) but we have started recording end-to-end 
> > > latency for the steady-state messages.
> > > I imagine this should be ok because it achieves the purpose of 
> > > removing bootstrap times, but I haven't been able to reason about it in 
> > > my head.
> > > What are your thoughts on this?
> > >
> > > --
> > > Divij Vaidya
> > >
> > >
> > >
> > > On Fri, Jun 14, 2024 at 12:23 AM Eric Lu 
> > > <erickandmorty2...@gmail.com>
> > > wrote:
> > >
> > > > Hi Matt,
> > > >
> > > > Yes I forgot to update the KIP counter after creating a KIP. I 
> > > > changed mine to 1053. We should be all good now.
> > > >
> > > > Cheers,
> > > > Eric
> > > >
> > > > On Thu, Jun 13, 2024 at 3:08 PM Welch, Matt 
> > > > <matt.we...@intel.com>
> > > wrote:
> > > >
> > > > > Hello again Kafka devs,
> > > > >
> > > > > I'd like to again call attention to this KIP for discussion.
> > > > > Apparently, we encountered a race condition when choosing KIP 
> > > > > numbers,
> > > > but
> > > > > hopefully it's straightened out now.
> > > > >
> > > > > Regards,
> > > > > Matt
> > > > >
> > > > >
> > > > > -----Original Message-----
> > > > > From: Welch, Matt <matt.we...@intel.com>
> > > > > Sent: Thursday, June 6, 2024 4:44 PM
> > > > > To: dev@kafka.apache.org
> > > > > Subject: [DISCUSS] KIP-1052: Enable warmup in producer 
> > > > > performance test
> > > > >
> > > > > Hello all,
> > > > >
> > > > > I'd like to propose a change that would allow the producer 
> > > > > performance test to have a warmup phase where the statistics 
> > > > > gathered could be separated from statistics gathered during 
> > > > > steady
> > > state.
> > > > >
> > > > > Although startup is an important phase of Kafka operations and 
> > > > > special attention should be paid to optimizing startup 
> > > > > performance, often we
> > > > would
> > > > > like to understand Kafka performance during steady-state 
> > > > > operation, separate from its performance during producer 
> > > > > startup.  It's common for
> > > > new
> > > > > producers, like in a fresh producer performance test run, to 
> > > > > have high latency during startup. This high latency can 
> > > > > complicate the
> > > > understanding
> > > > > of steady-state performance, even when collecting long-running 
> > > > > tests.  If we want to understand steady-state latency separate 
> > > > > from startup latency, we can collect measurements for each 
> > > > > phase in disjoint sets then present statistics on each set 
> > > > > independently or as a combined population of measurements.  
> > > > > This feature would be completely optional and could be 
> > > > > represented by a new command line flag for the producer performance 
> > > > > test, '--warmup-records'.
> > > > >
> > > > > KIP: KIP-1052: Enable warmup in producer performance test - 
> > > > > Apache Kafka
> > > > -
> > > > > Apache Software Foundation<
> > > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1052%3A+En
> > > > ab
> > > > le+w
> > > > armup+in+producer+performance+test
> > > > > >
> > > > >
> > > > > Thank you,
> > > > > Matt Welch
> > > > >
> > > > >
> > > >
> > >

RE: [DISCUSS] KIP-1052: Enable warmup in producer performance test

Reply via email to