RE: [VOTE] KIP-1052: Enable warmup in producer performance test

2024-09-27 Thread Welch, Matt
Thank you David and Chia-Ping for voting.
@David I added the mention of the new command line argument in the public 
interfaces section.

This brings our total to +3 binding, +1 non-binding so I will conclude the vote 
as *approved*:
Federico Valeri: +1 (non-binding)
Justine Olshan: +1 (binding)
David Jacot: +1 (binding)
Chia-Ping Tsai: +1 (binding)

Many thanks to Divij, Luke, Federico, Justine, David, and Chia-Ping for the 
great discussion and support!

Best Regards,
Matt

-----Original Message-----
From: David Jacot  
Sent: Tuesday, September 24, 2024 12:44 AM
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-1052: Enable warmup in producer performance test

Hi Matt,

Thanks for the KIP. I have a minor nit. Could you please explicitly mention the 
new command line argument in the public interfaces section?

Otherwise, I am +1 (binding).

Best,
David


On Tue, Sep 24, 2024 at 1:48 AM Welch, Matt  wrote:

> Hi Kafka Devs,
>
> Bumping VOTE thread again for visibility.
>
> Thanks to Justine and Federico who've cast their votes on KIP-1052!
> Current vote tally:
> Justine Olshan: +1 (binding)
> Federico Valeri: +1 (non-binding)
>
> Let's keep it going and complete this vote!
>
> Thanks,
> Matt
>
> -----Original Message-----
> From: Justine Olshan 
> Sent: Wednesday, September 11, 2024 4:36 PM
> To: dev@kafka.apache.org
> Subject: Re: [VOTE] KIP-1052: Enable warmup in producer performance 
> test
>
> +1 (binding) from me.
>
> Thanks,
> Justine
>
> On Wed, Sep 4, 2024 at 3:22 PM Welch, Matt  wrote:
>
> > Hi Kafka devs,
> >
> > Bumping this VOTE thread again for visibility.
> >
> > Thanks,
> > Matt
> >
> > -----Original Message-----
> > From: Welch, Matt 
> > Sent: Friday, August 23, 2024 4:26 PM
> > To: dev@kafka.apache.org
> > Subject: RE: [VOTE] KIP-1052: Enable warmup in producer performance 
> > test
> >
> > Hi Kafka devs,
> >
> > Bumping this VOTE thread for visibility.
> >
> > Thanks,
> > Matt
> >
> > -----Original Message-----
> > From: Federico Valeri 
> > Sent: Monday, August 19, 2024 12:38 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [VOTE] KIP-1052: Enable warmup in producer performance 
> > test
> >
> > Hi Matt, +1 (non binding) from me. Thanks!
> >
> > Just a suggestion: I think that the following output line does not 
> > add much value and could be removed.
> >
> > "Warmup first 10 records. Steady-state results will print after 
> > the complete-test summary."
> >
> > On Wed, Aug 14, 2024 at 8:06 PM Welch, Matt wrote:
> > >
> > >
> > > Hi all,
> > >
> > > It seems discussion has been quiet for a couple of weeks so I'd 
> > > like to call a vote on KIP-1052 
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1052%3A+Enable+warmup+in+producer+performance+test
> > >
> > > Thanks,
> > > Matt Welch
> > >
> >
>


RE: [VOTE] KIP-1052: Enable warmup in producer performance test

2024-09-26 Thread Welch, Matt
Hi Chia-Ping

Earlier in the discussion phase, Federico Valeri also proposed the idea of 
a "producer autopilot" which could switch to a steady state mode once the p99 
was acceptably stable.  
I agree with both of you that an automatic warmup would be a very useful 
addition to the producer performance tooling, but creating a universal 
definition of "stable" and the automatic detection of stability are both 
somewhat complex. For these reasons, I think it's best that automatic warmup is 
contained in a follow-on KIP. I've attempted to include your ideas in the 
Rejected Alternatives section of the KIP.

Thanks,
Matt

-----Original Message-----
From: Chia-Ping Tsai  
Sent: Tuesday, September 24, 2024 4:22 AM
To: dev@kafka.apache.org
Subject: RE: [VOTE] KIP-1052: Enable warmup in producer performance test

hi Matt

Apologies for the delayed response. I completely agree that ProducerPerformance 
should separate warmup statistics from steady-state metrics. Overall +1, but I 
have a small question:

Have you considered implementing an explicit check for warmup instead of 
relying on sending a set number of records? My concern is that users may not 
know how many records are necessary to complete the warmup. Rather than relying 
on a rule of thumb, ProducerPerformance could check the metadata and node 
latency (via metrics) to ensure that the node information (such as connection, 
DNS, and metadata) is ready. 

In summary, this approach introduces a flag called enable-warmup instead of 
using warmup-records. The advantage is that users no longer need to specify the 
number of warmup records. When the flag is enabled, ProducerPerformance will 
continue sending warmup records until the node information is fully ready.
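
The flag-based warmup described above could be sketched roughly as follows. 
This is a minimal illustration in Python rather than the tool's Java, and every 
name in it is hypothetical: send_record stands in for a producer send, and 
is_ready stands in for whatever metadata or node-latency check would decide that 
the node information is ready. The record cap is also an assumption, added so 
the loop cannot run forever if readiness is never reported.

```python
from typing import Callable


def send_until_ready(send_record: Callable[[], None],
                     is_ready: Callable[[], bool],
                     max_warmup_records: int = 100_000) -> int:
    """Send warmup records until is_ready() is true; return how many were sent.

    Both callables are placeholders (assumptions), not Kafka APIs.
    """
    sent = 0
    while not is_ready():
        if sent >= max_warmup_records:
            # Give up rather than loop forever if readiness never arrives.
            raise RuntimeError("warmup did not complete within the record limit")
        send_record()
        sent += 1
    return sent
```

With this shape the user passes no record count; the number of warmup records 
actually sent is a result of the run and could be reported in the output.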

Best,
Chia-Ping

On 2024/09/04 22:20:15 "Welch, Matt" wrote:
> Hi Kafka devs,
> 
> Bumping this VOTE thread again for visibility.
> 
> Thanks,
> Matt
> 
> -----Original Message-----
> From: Welch, Matt 
> Sent: Friday, August 23, 2024 4:26 PM
> To: dev@kafka.apache.org
> Subject: RE: [VOTE] KIP-1052: Enable warmup in producer performance 
> test
> 
> Hi Kafka devs,
> 
> Bumping this VOTE thread for visibility.
> 
> Thanks,
> Matt
> 
> -----Original Message-----
> From: Federico Valeri 
> Sent: Monday, August 19, 2024 12:38 AM
> To: dev@kafka.apache.org
> Subject: Re: [VOTE] KIP-1052: Enable warmup in producer performance 
> test
> 
> Hi Matt, +1 (non binding) from me. Thanks!
> 
> Just a suggestion: I think that the following output line does not add much 
> value and could be removed.
> 
> "Warmup first 10 records. Steady-state results will print after the 
> complete-test summary."
> 
> On Wed, Aug 14, 2024 at 8:06 PM Welch, Matt  wrote:
> >
> >
> > Hi all,
> >
> > It seems discussion has been quiet for a couple of weeks so I'd like 
> > to call a vote on KIP-1052 
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1052%3A+Enable+warmup+in+producer+performance+test
> >
> > Thanks,
> > Matt Welch
> >
> 


RE: [VOTE] KIP-1052: Enable warmup in producer performance test

2024-09-23 Thread Welch, Matt
Hi Kafka Devs, 

Bumping VOTE thread again for visibility.

Thanks to Justine and Federico who've cast their votes on KIP-1052!
Current vote tally:
Justine Olshan: +1 (binding)
Federico Valeri: +1 (non-binding)

Let's keep it going and complete this vote!

Thanks,
Matt

-----Original Message-----
From: Justine Olshan  
Sent: Wednesday, September 11, 2024 4:36 PM
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-1052: Enable warmup in producer performance test

+1 (binding) from me.

Thanks,
Justine

On Wed, Sep 4, 2024 at 3:22 PM Welch, Matt  wrote:

> Hi Kafka devs,
>
> Bumping this VOTE thread again for visibility.
>
> Thanks,
> Matt
>
> -----Original Message-----
> From: Welch, Matt 
> Sent: Friday, August 23, 2024 4:26 PM
> To: dev@kafka.apache.org
> Subject: RE: [VOTE] KIP-1052: Enable warmup in producer performance 
> test
>
> Hi Kafka devs,
>
> Bumping this VOTE thread for visibility.
>
> Thanks,
> Matt
>
> -----Original Message-----
> From: Federico Valeri 
> Sent: Monday, August 19, 2024 12:38 AM
> To: dev@kafka.apache.org
> Subject: Re: [VOTE] KIP-1052: Enable warmup in producer performance 
> test
>
> Hi Matt, +1 (non binding) from me. Thanks!
>
> Just a suggestion: I think that the following output line does not add 
> much value and could be removed.
>
> "Warmup first 10 records. Steady-state results will print after 
> the complete-test summary."
>
> On Wed, Aug 14, 2024 at 8:06 PM Welch, Matt  wrote:
> >
> >
> > Hi all,
> >
> > It seems discussion has been quiet for a couple of weeks so I'd like 
> > to call a vote on KIP-1052 
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1052%3A+Enable+warmup+in+producer+performance+test
> >
> > Thanks,
> > Matt Welch
> >
>


RE: [VOTE] KIP-1052: Enable warmup in producer performance test

2024-08-23 Thread Welch, Matt
Hi Kafka devs, 

Bumping this VOTE thread for visibility.

Thanks,
Matt

-----Original Message-----
From: Federico Valeri  
Sent: Monday, August 19, 2024 12:38 AM
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-1052: Enable warmup in producer performance test

Hi Matt, +1 (non binding) from me. Thanks!

Just a suggestion: I think that the following output line does not add much 
value and could be removed.

"Warmup first 10 records. Steady-state results will print after the 
complete-test summary."

On Wed, Aug 14, 2024 at 8:06 PM Welch, Matt  wrote:
>
>
> Hi all,
>
> It seems discussion has been quiet for a couple of weeks so I'd like 
> to call a vote on KIP-1052 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1052%3A+Enable+warmup+in+producer+performance+test
>
> Thanks,
> Matt Welch
>


[VOTE] KIP-1052: Enable warmup in producer performance test

2024-08-14 Thread Welch, Matt


Hi all, 
 
It seems discussion has been quiet for a couple of weeks so I'd like to call a 
vote on KIP-1052 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1052%3A+Enable+warmup+in+producer+performance+test
 
Thanks,
Matt Welch



RE: [DISCUSS] KIP-1052: Enable warmup in producer performance test

2024-08-09 Thread Welch, Matt
Hi again Kafka devs, 

I think we're about ready to vote on KIP-1052, but just want to make sure 
there are no additional concerns with the KIP.
Apologies if it's poor form, but do the devs who commented previously have any 
additional input? @Divij, @Luke, @Federico

I will call a vote next week if nobody has any additional input or concerns.

Thanks and best regards,
Matt

-----Original Message-----
From: Welch, Matt  
Sent: Monday, August 5, 2024 2:54 PM
To: dev@kafka.apache.org
Subject: RE: [DISCUSS] KIP-1052: Enable warmup in producer performance test

Hi all, 

After much consideration on the options, I've updated the KIP with what I hope 
to be the final version.
Separated-warmup has been moved to Rejected Alternatives because warmup-only 
data is likely to have little value in normal operations and can already be 
gathered with very short tests.

If nobody has any more comments on this proposal, I think we might have 
consensus so is it appropriate to call a vote now?

Thanks,
Matt

-----Original Message-----
From: Welch, Matt 
Sent: Monday, July 22, 2024 4:13 PM
To: dev@kafka.apache.org
Subject: RE: [DISCUSS] KIP-1052: Enable warmup in producer performance test

Hi Federico,

Bumping for visibility.  
I've had some additional thoughts on the points below over the last couple of weeks.

I really don't think we need the option to have combined/separated stats and I 
think printing warmup-only, followed by steady-state only at the end of the run 
would be my preferred choice.  
As I see it there are a few possibilities: 
1) print warmup stats, followed by steady-state stats (simplest implementation, 
no separated-warmup option is necessary)
2) print warmup+steady-state (whole test), followed by steady-state stats (good 
for users transitioning to tests with warmups, no separated-warmup option is 
necessary)
3) option for "separated-warmup" which prints (1) when used on command line, or 
(2) when not used (default)
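
As a rough illustration of options (1) and (2), the sketch below splits a list 
of per-record latencies at the warmup boundary and builds the two summary lines 
for each case. It is written in Python for brevity rather than in the tool's 
Java, the function names are hypothetical, and the output format is invented 
for the example; only the labels "records sent", "warmup records sent", and 
"steady state records sent" come from this thread.

```python
from statistics import mean


def summarize(label, latencies_ms):
    """Format one summary line for a non-empty list of latencies in ms."""
    return (f"{len(latencies_ms)} {label}, "
            f"{mean(latencies_ms):.2f} ms avg latency, "
            f"{max(latencies_ms):.2f} ms max latency")


def summaries(latencies_ms, warmup_records, separated_warmup=False):
    """Return the two summary lines printed at the end of a run.

    Assumes 0 < warmup_records < len(latencies_ms).
    separated_warmup=True  -> option (1): warmup only, then steady state.
    separated_warmup=False -> option (2): whole test, then steady state.
    """
    warmup = latencies_ms[:warmup_records]
    steady = latencies_ms[warmup_records:]
    if separated_warmup:
        first = summarize("warmup records sent", warmup)
    else:
        first = summarize("records sent", latencies_ms)
    return [first, summarize("steady state records sent", steady)]
```

Option (3) then amounts to exposing separated_warmup as a command line switch 
that defaults to false.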

I'm not sure what the preferred implementation would be for the community, 
however, so some feedback on that point would be helpful.

Thanks,
Matt

-----Original Message-----
From: Welch, Matt 
Sent: Tuesday, July 2, 2024 1:38 PM
To: dev@kafka.apache.org
Subject: RE: [DISCUSS] KIP-1052: Enable warmup in producer performance test

Hi Federico, 

Thanks for your response.  I have a few questions.

> You mean, no existing public interfaces right? Anyway, this content should be 
> in the "Compatibility, Deprecation, and Migration Plan"
> section.
Just a question here: Should the contents of the "Public Interfaces" section 
after that first sentence be moved to "Compatibility, Deprecation, and 
Migration Plan"? I guess I wasn't thinking of the tools as an interface.

> I would also say that, if we specify warmup-records, we also want separate 
> stats. With this change it would be really straightforward IMO, and we 
> wouldn't need the additional separated-warmup option.
> Wdyt?
I no longer think we really need a separated-warmup option and I would prefer 
to leave the separated-warmup option out for simplicity. My preference is to 
have as much data as available presented, so the user has maximum information 
to make decisions about warmups and performance. That would imply having 
*three* printed summary lines for "whole test", "warmup only", and "steady 
state only".  Each of these would have a slightly different message like 
"records sent", "warmup records sent", and "steady state records sent" to 
enable the user to differentiate between them. I haven't modified the KIP to 
reflect this yet because there seems to be some motivation for having this as 
an option. What is the preference of the community here? Would having all three 
summary lines printed at end of test be confusing, informative, or other?

> Have you considered having a sort of autopilot for computing the warmup size 
> based on the tool's output information (window's p99)?
> Once p99 is stable enough, the tool could start the steady-state phase 
> printing out the computed warmup size. In case we decide this is tricky or 
> undesired behavior, we can list the autopilot mode in the rejected 
> alternatives, along with motivations.
I like the idea of a producer autopilot, but it's sufficiently complex that it 
needs its own KIP. I've added a description of the autopilot feature to the 
Rejected Alternatives.

> Just a nit, but I think we miss the --payload-file option in all snippets.
Nit or not, I really appreciate the thorough review!  I've updated the command 
lines to contain the payload-file option.

Thanks,
Matt

-----Original Message-----
From: Federico Valeri 
Sent: Monday, July 1, 2024 1:04 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-1052: Enable warmup in producer performance test

Hi Matt, thanks for the upda

RE: [DISCUSS] KIP-1052: Enable warmup in producer performance test

2024-08-05 Thread Welch, Matt
Hi all, 

After much consideration on the options, I've updated the KIP with what I hope 
to be the final version.
Separated-warmup has been moved to Rejected Alternatives because warmup-only 
data is likely to have little value in normal operations and can already be 
gathered with very short tests.

If nobody has any more comments on this proposal, I think we might have 
consensus so is it appropriate to call a vote now?

Thanks,
Matt

-----Original Message-----
From: Welch, Matt  
Sent: Monday, July 22, 2024 4:13 PM
To: dev@kafka.apache.org
Subject: RE: [DISCUSS] KIP-1052: Enable warmup in producer performance test

Hi Federico,

Bumping for visibility.  
I've had some additional thoughts on the points below over the last couple of weeks.

I really don't think we need the option to have combined/separated stats and I 
think printing warmup-only, followed by steady-state only at the end of the run 
would be my preferred choice.  
As I see it there are a few possibilities: 
1) print warmup stats, followed by steady-state stats (simplest implementation, 
no separated-warmup option is necessary)
2) print warmup+steady-state (whole test), followed by steady-state stats (good 
for users transitioning to tests with warmups, no separated-warmup option is 
necessary)
3) option for "separated-warmup" which prints (1) when used on command line, or 
(2) when not used (default)

I'm not sure what the preferred implementation would be for the community, 
however, so some feedback on that point would be helpful.

Thanks,
Matt

-----Original Message-----
From: Welch, Matt 
Sent: Tuesday, July 2, 2024 1:38 PM
To: dev@kafka.apache.org
Subject: RE: [DISCUSS] KIP-1052: Enable warmup in producer performance test

Hi Federico, 

Thanks for your response.  I have a few questions.

> You mean, no existing public interfaces right? Anyway, this content should be 
> in the "Compatibility, Deprecation, and Migration Plan"
> section.
Just a question here: Should the contents of the "Public Interfaces" section 
after that first sentence be moved to "Compatibility, Deprecation, and 
Migration Plan"? I guess I wasn't thinking of the tools as an interface.

> I would also say that, if we specify warmup-records, we also want separate 
> stats. With this change it would be really straightforward IMO, and we 
> wouldn't need the additional separated-warmup option.
> Wdyt?
I no longer think we really need a separated-warmup option and I would prefer 
to leave the separated-warmup option out for simplicity. My preference is to 
have as much data as available presented, so the user has maximum information 
to make decisions about warmups and performance. That would imply having 
*three* printed summary lines for "whole test", "warmup only", and "steady 
state only".  Each of these would have a slightly different message like 
"records sent", "warmup records sent", and "steady state records sent" to 
enable the user to differentiate between them. I haven't modified the KIP to 
reflect this yet because there seems to be some motivation for having this as 
an option. What is the preference of the community here? Would having all three 
summary lines printed at end of test be confusing, informative, or other?

> Have you considered having a sort of autopilot for computing the warmup size 
> based on the tool's output information (window's p99)?
> Once p99 is stable enough, the tool could start the steady-state phase 
> printing out the computed warmup size. In case we decide this is tricky or 
> undesired behavior, we can list the autopilot mode in the rejected 
> alternatives, along with motivations.
I like the idea of a producer autopilot, but it's sufficiently complex that it 
needs its own KIP. I've added a description of the autopilot feature to the 
Rejected Alternatives.

> Just a nit, but I think we miss the --payload-file option in all snippets.
Nit or not, I really appreciate the thorough review!  I've updated the command 
lines to contain the payload-file option.

Thanks,
Matt

-----Original Message-----
From: Federico Valeri 
Sent: Monday, July 1, 2024 1:04 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-1052: Enable warmup in producer performance test

Hi Matt, thanks for the updates. Snippets are really useful.

> No public interfaces are affected.

You mean, no existing public interfaces right? Anyway, this content should be 
in the "Compatibility, Deprecation, and Migration Plan"
section.

> The first option, --warmup-records, will be added to the producer performance 
> test to request that the initial records sent in a test be gathered into a 
> separate Stats object from the steady-state records to follow.

I would also say that, if we specify warmup-records, we also want separate 
stats. With this change it would be r

RE: [DISCUSS] KIP-1052: Enable warmup in producer performance test

2024-07-22 Thread Welch, Matt
Hi Federico,

Bumping for visibility.  
I've had some additional thoughts on the points below over the last couple of weeks.

I really don't think we need the option to have combined/separated stats and I 
think printing warmup-only, followed by steady-state only at the end of the run 
would be my preferred choice.  
As I see it there are a few possibilities: 
1) print warmup stats, followed by steady-state stats (simplest implementation, 
no separated-warmup option is necessary)
2) print warmup+steady-state (whole test), followed by steady-state stats (good 
for users transitioning to tests with warmups, no separated-warmup option is 
necessary)
3) option for "separated-warmup" which prints (1) when used on command line, or 
(2) when not used (default)

I'm not sure what the preferred implementation would be for the community, 
however, so some feedback on that point would be helpful.

Thanks,
Matt

-----Original Message-----
From: Welch, Matt  
Sent: Tuesday, July 2, 2024 1:38 PM
To: dev@kafka.apache.org
Subject: RE: [DISCUSS] KIP-1052: Enable warmup in producer performance test

Hi Federico, 

Thanks for your response.  I have a few questions.

> You mean, no existing public interfaces right? Anyway, this content should be 
> in the "Compatibility, Deprecation, and Migration Plan"
> section.
Just a question here: Should the contents of the "Public Interfaces" section 
after that first sentence be moved to "Compatibility, Deprecation, and 
Migration Plan"? I guess I wasn't thinking of the tools as an interface.

> I would also say that, if we specify warmup-records, we also want separate 
> stats. With this change it would be really straightforward IMO, and we 
> wouldn't need the additional separated-warmup option.
> Wdyt?
I no longer think we really need a separated-warmup option and I would prefer 
to leave the separated-warmup option out for simplicity. My preference is to 
have as much data as available presented, so the user has maximum information 
to make decisions about warmups and performance. That would imply having 
*three* printed summary lines for "whole test", "warmup only", and "steady 
state only".  Each of these would have a slightly different message like 
"records sent", "warmup records sent", and "steady state records sent" to 
enable the user to differentiate between them. I haven't modified the KIP to 
reflect this yet because there seems to be some motivation for having this as 
an option. What is the preference of the community here? Would having all three 
summary lines printed at end of test be confusing, informative, or other?

> Have you considered having a sort of autopilot for computing the warmup size 
> based on the tool's output information (window's p99)?
> Once p99 is stable enough, the tool could start the steady-state phase 
> printing out the computed warmup size. In case we decide this is tricky or 
> undesired behavior, we can list the autopilot mode in the rejected 
> alternatives, along with motivations.
I like the idea of a producer autopilot, but it's sufficiently complex that it 
needs its own KIP. I've added a description of the autopilot feature to the 
Rejected Alternatives.

> Just a nit, but I think we miss the --payload-file option in all snippets.
Nit or not, I really appreciate the thorough review!  I've updated the command 
lines to contain the payload-file option.

Thanks,
Matt

-----Original Message-----
From: Federico Valeri 
Sent: Monday, July 1, 2024 1:04 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-1052: Enable warmup in producer performance test

Hi Matt, thanks for the updates. Snippets are really useful.

> No public interfaces are affected.

You mean, no existing public interfaces right? Anyway, this content should be 
in the "Compatibility, Deprecation, and Migration Plan"
section.

> The first option, --warmup-records, will be added to the producer performance 
> test to request that the initial records sent in a test be gathered into a 
> separate Stats object from the steady-state records to follow.

I would also say that, if we specify warmup-records, we also want separate 
stats. With this change it would be really straightforward IMO, and we wouldn't 
need the additional separated-warmup option.
Wdyt?

> Although the producer performance test output should provide 
> sufficient information to set a warmup

Have you considered having a sort of autopilot for computing the warmup size 
based on the tool's output information (window's p99)?
Once p99 is stable enough, the tool could start the steady-state phase printing 
out the computed warmup size. In case we decide this is tricky or undesired 
behavior, we can list the autopilot mode in the rejected alternatives, along 
with motivations.
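
One way such an autopilot might decide that p99 is "stable enough" is sketched 
below, in Python for brevity rather than in the tool's Java. Everything here is 
an assumption made for illustration: the function names, the fixed window size, 
and the rule itself, which declares stability once the p99 of several 
consecutive windows agree to within a relative tolerance. Nothing in the thread 
settles on a definition.

```python
import math


def p99(values):
    """Nearest-rank 99th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(0.99 * len(ordered))
    return ordered[rank - 1]


def detect_warmup_size(latencies_ms, window=100, stable_windows=3,
                       tolerance=0.10):
    """Return the number of records sent before p99 became stable, or None.

    p99 counts as stable when the last `stable_windows` window p99 values
    differ by at most `tolerance` relative to the largest of them.
    """
    p99s = []
    for start in range(0, len(latencies_ms) - window + 1, window):
        p99s.append(p99(latencies_ms[start:start + window]))
        recent = p99s[-stable_windows:]
        if len(recent) == stable_windows:
            lo, hi = min(recent), max(recent)
            if hi == 0 or (hi - lo) / hi <= tolerance:
                # Warmup is everything before the first window of the stable run.
                return start + window - stable_windows * window
    return None
```

Note that this rule would also report a warmup size of zero for a run whose 
latency is uniformly high from the first record, which shows how much the 
result depends on the chosen definition of "stable".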

> bin/kafka-pr

RE: [DISCUSS] KIP-1052: Enable warmup in producer performance test

2024-07-02 Thread Welch, Matt
Hi Federico, 

Thanks for your response.  I have a few questions.

> You mean, no existing public interfaces right? Anyway, this content should be 
> in the "Compatibility, Deprecation, and Migration Plan"
> section.
Just a question here: Should the contents of the "Public Interfaces" section 
after that first sentence be moved to "Compatibility, Deprecation, and 
Migration Plan"? I guess I wasn't thinking of the tools as an interface.

> I would also say that, if we specify warmup-records, we also want separate 
> stats. With this change it would be really straightforward IMO, and we 
> wouldn't need the additional separated-warmup option.
> Wdyt?
I no longer think we really need a separated-warmup option and I would prefer 
to leave the separated-warmup option out for simplicity. My preference is to 
have as much data as available presented, so the user has maximum information 
to make decisions about warmups and performance. That would imply having 
*three* printed summary lines for "whole test", "warmup only", and "steady 
state only".  Each of these would have a slightly different message like 
"records sent", "warmup records sent", and "steady state records sent" to 
enable the user to differentiate between them. I haven't modified the KIP to 
reflect this yet because there seems to be some motivation for having this as 
an option. What is the preference of the community here? Would having all three 
summary lines printed at end of test be confusing, informative, or other?

> Have you considered having a sort of autopilot for computing the warmup size 
> based on the tool's output information (window's p99)?
> Once p99 is stable enough, the tool could start the steady-state phase 
> printing out the computed warmup size. In case we decide this is tricky or 
> undesired behavior, we can list the autopilot mode in the rejected 
> alternatives, along with motivations.
I like the idea of a producer autopilot, but it's sufficiently complex that it 
needs its own KIP. I've added a description of the autopilot feature to the 
Rejected Alternatives.

> Just a nit, but I think we miss the --payload-file option in all snippets.
Nit or not, I really appreciate the thorough review!  I've updated the command 
lines to contain the payload-file option.

Thanks,
Matt

-----Original Message-----
From: Federico Valeri  
Sent: Monday, July 1, 2024 1:04 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-1052: Enable warmup in producer performance test

Hi Matt, thanks for the updates. Snippets are really useful.

> No public interfaces are affected.

You mean, no existing public interfaces right? Anyway, this content should be 
in the "Compatibility, Deprecation, and Migration Plan"
section.

> The first option, --warmup-records, will be added to the producer performance 
> test to request that the initial records sent in a test be gathered into a 
> separate Stats object from the steady-state records to follow.

I would also say that, if we specify warmup-records, we also want separate 
stats. With this change it would be really straightforward IMO, and we wouldn't 
need the additional separated-warmup option.
Wdyt?

> Although the producer performance test output should provide 
> sufficient information to set a warmup

Have you considered having a sort of autopilot for computing the warmup size 
based on the tool's output information (window's p99)?
Once p99 is stable enough, the tool could start the steady-state phase printing 
out the computed warmup size. In case we decide this is tricky or undesired 
behavior, we can list the autopilot mode in the rejected alternatives, along 
with motivations.

> bin/kafka-producer-perf-test.sh --num-records 100 --throughput 
> 5

Just a nit, but I think we miss the --payload-file option in all snippets.

On Sat, Jun 29, 2024 at 1:19 AM Welch, Matt  wrote:
>
> Hi Luke and Federico,
>
> Thank you for your responses.  Your questions seemed to be along similar 
> lines so I've combined the responses.
> Please let me know if you need more clarification.
>
> 1. I've updated the KIP to describe both new command line options, now 
> '--warmup-records' and '--separated-warmup'.  After reading Federico's email, 
> I realized the parameter '--combined-summary' didn't make sense in its 
> intended use and the revised parameter name 'separated-warmup' more 
> accurately reflects its purpose: to separate warmup statistics from 
> steady-state statistics. I could easily be convinced that 
> 'separated-warmup-statistics' would be a better parameter name, but that 
> seemed too verbose at first. The default for 'separated-warmup' is false and 
>

RE: [DISCUSS] KIP-1052: Enable warmup in producer performance test

2024-06-28 Thread Welch, Matt
Hi Luke and Federico, 

Thank you for your responses.  Your questions seemed to be along similar lines 
so I've combined the responses.  
Please let me know if you need more clarification.

1. I've updated the KIP to describe both new command line options, now 
'--warmup-records' and '--separated-warmup'.  After reading Federico's email, I 
realized the parameter '--combined-summary' didn't make sense in its intended 
use and the revised parameter name 'separated-warmup' more accurately reflects 
its purpose: to separate warmup statistics from steady-state statistics. I 
could easily be convinced that 'separated-warmup-statistics' would be a better 
parameter name, but that seemed too verbose at first. The default for 
'separated-warmup' is false and now has a better description and help output 
example. The intent for these options is definitely to avoid any breakage of 
existing producer performance system tests.  I've also revised some of the 
language used for better clarity.  

2. I've added example snippets of tool invocation and output for the producer 
performance test for three cases: 
(A) typical (existing) usage without invoking 'warmup-records'
(B) 'warmup-records' without using 'separated-warmup'
(C) 'warmup-records' with 'separated-warmup'

Also, I completely agree that a consumer test warmup would be a great feature 
and should be described in a future KIP.

Thanks,
Matt

-----Original Message-----
From: Federico Valeri  
Sent: Friday, June 21, 2024 8:02 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-1052: Enable warmup in producer performance test

Hi Matt, thanks for the KIP, this is a really useful feature.

In public interfaces, you say that the output won't change by default, so I 
guess this means that --combined-summary will be false by default, otherwise we 
would break the producer_performance system test. Is that correct? I think a 
couple of command line snippets would help here.

I think it would be great to also add a warmup phase to the consumer perf tool, 
but this probably deserves its own KIP as we don't have latency stats there.


On Fri, Jun 21, 2024 at 2:16 PM Luke Chen  wrote:
>
> Hi Matt,
>
> Thanks for the KIP!
> I agree having the warm-up records could help correctly analyze the 
> performance.
>
> Some questions:
> 1. It looks like we will add 2 more options to producer perf tool:
>  - --warmup-records
>  - --combined-summary
>
> Is this correct?
> In the "public interface" section, only 1 is mentioned. Could you update it?
> Also, in the KIP, you use the word: "An option such as "--warmup-records"
> should be added...", it sounds like it is not decided, yet.
> I suggest you update to say, we will add "--warmup-records" option for 
> " to make it clear.
>
> 2. What will be the output containing both warm-up and steady-state results?
> Could you give an example directly?
>
> For better understanding, I'd suggest you refer to KIP-1057 
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-1057%3A+Add+rem
> ote+log+metadata+flag+to+the+dump+log+tool>
> to add some examples using `kafka-producer-perf-test.sh` with the new 
> option, to show what it will output.
>
> Thank you.
> Luke
>
> On Fri, Jun 21, 2024 at 10:39 AM Welch, Matt  wrote:
>
> > Hi Divij,
> >
> > Thanks for your response.  You raise some very important points.
> > I've updated the KIP to clarify the changes discussed here.
> >
> > 1. I agree that warmup stats should be printed separately.  I see 
> > two cases here, both of which would have two summary lines printed 
> > at the end of the producer perf test.  In the first case, 
> > warmup-separate, the warmup stats are printed first as warmup-only, 
> > followed by a second print of the steady state performance. In the 
> > second case, warmup-combined, the first print would look identical 
> > to the summary line that's currently used and would reflect the 
> > "whole test", with a second summary print of the steady-state 
> > performance.  This second case would allow for users to compare what 
> > the test would have looked like without a warmup to results of the 
> > test with a warmup. Although I've been looking at the second case 
> > lately, I can see merits of both cases and would be happy to support 
> > the warmup-separate case if that's the preference of the community.  
> > Regarding the JMX metrics accumulated by Kafka, we need to decide if 
> > we should reset the JMX metrics between the warmup and steady state. 
> > While I like the idea of having 

RE: [DISCUSS] KIP-1052: Enable warmup in producer performance test

2024-06-20 Thread Welch, Matt
look at it for some inspiration.

2. Please add validation that num-records is greater than warm-up records; 
otherwise, report an error.

3. Please add a recommendation in the docs for the tool on what an ideal value 
for warm up should be. For users who may not be completely familiar with 
producer buffering / back-pressure, it would be useful to understand a good 
value to set. In my opinion,

4. I wonder how the --throughput parameter works with the warmup! Could we have 
a situation where the "steady-state" is impacted by the warm-up traffic? As an 
example, we could land in a situation where the slow processing of warm-up 
messages could impact the measurement of steady-state. This could happen in a 
situation when warm-up messages are waiting to be processed on the server (or 
maybe on the producer buffer) but we have started recording end-to-end latency 
for the steady-state messages.
I imagine this should be ok because it achieves the purpose of removing 
bootstrap times, but I haven't been able to reason about it in my head. What 
are your thoughts on this?
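
A minimal sketch of the check requested in point 2, in Java. The class and 
method names (WarmupArgs, validate) are hypothetical and not taken from the 
Kafka code base; only the rule itself (num-records must be greater than the 
warmup record count) comes from this thread. Rejecting a negative warmup 
count is an added assumption.

```java
// Hypothetical helper; not the actual producer performance tool code.
final class WarmupArgs {
    private WarmupArgs() { }

    /**
     * Returns warmupRecords if the combination is valid, otherwise throws.
     * With numRecords >= 1, a warmup of 0 (no warmup) is accepted.
     */
    static long validate(long numRecords, long warmupRecords) {
        // Assumption: a negative warmup count is treated as a user error.
        if (warmupRecords < 0) {
            throw new IllegalArgumentException(
                "--warmup-records must be >= 0, got " + warmupRecords);
        }
        // Rule from the thread: num-records must exceed warmup records.
        if (warmupRecords >= numRecords) {
            throw new IllegalArgumentException(
                "--num-records (" + numRecords + ") must be greater than "
                + "--warmup-records (" + warmupRecords + ")");
        }
        return warmupRecords;
    }
}
```

Failing fast with a message that names both options keeps the error 
actionable; whether the real tool should reject or merely warn is left to the 
KIP.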

--
Divij Vaidya



On Fri, Jun 14, 2024 at 12:23 AM Eric Lu 
wrote:

> Hi Matt,
>
> Yes I forgot to update the KIP counter after creating a KIP. I changed 
> mine to 1053. We should be all good now.
>
> Cheers,
> Eric
>
> On Thu, Jun 13, 2024 at 3:08 PM Welch, Matt  wrote:
>
> > Hello again Kafka devs,
> >
> > I'd like to again call attention to this KIP for discussion.
> > Apparently, we encountered a race condition when choosing KIP 
> > numbers, but hopefully it's straightened out now.
> >
> > Regards,
> > Matt
> >
> >
> > -Original Message-
> > From: Welch, Matt 
> > Sent: Thursday, June 6, 2024 4:44 PM
> > To: dev@kafka.apache.org
> > Subject: [DISCUSS] KIP-1052: Enable warmup in producer performance 
> > test
> >
> > Hello all,
> >
> > I'd like to propose a change that would allow the producer 
> > performance test to have a warmup phase where the statistics 
> > gathered could be separated from statistics gathered during steady state.
> >
> > Although startup is an important phase of Kafka operations and 
> > special attention should be paid to optimizing startup performance, 
> > often we would like to understand Kafka performance during 
> > steady-state operation, separate from its performance during 
> > producer startup.  It's common for new producers, like in a fresh 
> > producer performance test run, to have high latency during startup. 
> > This high latency can complicate the understanding of steady-state 
> > performance, even when collecting long-running tests.  If we want 
> > to understand steady-state latency separate from startup latency, 
> > we can collect measurements for each phase in disjoint sets then 
> > present statistics on each set independently or as a combined 
> > population of measurements.  This feature would be completely 
> > optional and could be represented by a new command line flag for 
> > the producer performance test, '--warmup-records'.
> >
> > KIP: KIP-1052: Enable warmup in producer performance test - Apache 
> > Kafka - Apache Software 
> > Foundation<https://cwiki.apache.org/confluence/display/KAFKA/KIP-1052%3A+Enable+warmup+in+producer+performance+test>
> >
> > Thank you,
> > Matt Welch
> >
> >
>


RE: [QUESTION] What to do about conflicted KIP numbers?

2024-06-17 Thread Welch, Matt
Thanks for the input Matthias,

I guess we will keep things as they are to prevent yet another layer added on 
top.
Eric has already taken the next available number and it's recorded on the wiki 
so I suppose there's no additional work required here.

Regards,
-Matt

-Original Message-
From: Matthias J. Sax  
Sent: Friday, June 14, 2024 3:35 PM
To: dev@kafka.apache.org
Subject: Re: [QUESTION] What to do about conflicted KIP numbers?

I don't think that there is an official guideline.

Personally, I would suggest that the corresponding KIP owners agree who is 
keeping the conflicting number, and who is changing it.

For the ones changing the number, I would propose to restart a new DISCUSS 
thread using the new number to separate the KIP threads.

Not sure if there is a better way to handle this... Just an idea on how I would 
do it.


Not sure if we can improve the wiki instructions to make the race 
condition less likely? It seems this would happen if two people look at 
the next KIP number, let's say X, but don't bump it right away to X+1 and 
publish their KIP with X a few hours/days later without verifying that X 
is still the next available KIP number.


-Matthias


On 6/14/24 3:10 PM, Welch, Matt wrote:
> Hi Kafka devs,
> 
> I submitted a KIP last week and encountered a KIP-process race condition 
> where my KIP number was consumed by another dev without updating the wiki 
> page containing KIPs: Kafka Improvement Proposals - Apache Kafka - Apache 
> Software 
> Foundation<https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals>
> 
> There are now at least three separate dev-list threads referencing this 
> conflicted KIP number so I'm concerned that discussion around this number 
> will now be permanently confusing due to the conflict and multiple concurrent 
> unrelated threads referencing the same KIP number.  I've intentionally kept 
> the KIP numbers out of this email to prevent yet another thread referencing 
> them.
> 
> While I'm happy to keep going with my existing KIP number, I was wondering if 
> I should "abandon" it and create a new one.
> This solution seems like it could create extra confusion, however, so what is 
> the best course of action here?
> 
> Thanks,
> Matt
> 


[QUESTION] What to do about conflicted KIP numbers?

2024-06-14 Thread Welch, Matt
Hi Kafka devs,

I submitted a KIP last week and encountered a KIP-process race condition where 
my KIP number was consumed by another dev without updating the wiki page 
containing KIPs: Kafka Improvement Proposals - Apache Kafka - Apache Software 
Foundation<https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals>

There are now at least three separate dev-list threads referencing this conflicted 
KIP number so I'm concerned that discussion around this number will now be 
permanently confusing due to the conflict and multiple concurrent unrelated 
threads referencing the same KIP number.  I've intentionally kept the KIP 
numbers out of this email to prevent yet another thread referencing them.

While I'm happy to keep going with my existing KIP number, I was wondering if I 
should "abandon" it and create a new one.
This solution seems like it could create extra confusion, however, so what is 
the best course of action here?

Thanks,
Matt


RE: [DISCUSS] KIP-1052: Enable warmup in producer performance test

2024-06-13 Thread Welch, Matt
Hello again Kafka devs, 

I'd like to again call attention to this KIP for discussion.
Apparently, we encountered a race condition when choosing KIP numbers, but 
hopefully it's straightened out now.

Regards,
Matt


-Original Message-
From: Welch, Matt  
Sent: Thursday, June 6, 2024 4:44 PM
To: dev@kafka.apache.org
Subject: [DISCUSS] KIP-1052: Enable warmup in producer performance test

Hello all,

I'd like to propose a change that would allow the producer performance test to 
have a warmup phase where the statistics gathered could be separated from 
statistics gathered during steady state.

Although startup is an important phase of Kafka operations and special 
attention should be paid to optimizing startup performance, often we would like 
to understand Kafka performance during steady-state operation, separate from 
its performance during producer startup.  It's common for new producers, like 
in a fresh producer performance test run, to have high latency during startup. 
This high latency can complicate the understanding of steady-state performance, 
even when collecting long-running tests.  If we want to understand steady-state 
latency separate from startup latency, we can collect measurements for each 
phase in disjoint sets then present statistics on each set independently or as 
a combined population of measurements.  This feature would be completely 
optional and could be represented by a new command line flag for the producer 
performance test, '--warmup-records'.

KIP: KIP-1052: Enable warmup in producer performance test - Apache Kafka - 
Apache Software 
Foundation<https://cwiki.apache.org/confluence/display/KAFKA/KIP-1052%3A+Enable+warmup+in+producer+performance+test>

Thank you,
Matt Welch



RE: [DISCUSS] KIP-1052: Align the naming convention for config and default variables in *Config classes

2024-06-13 Thread Welch, Matt
Hi Eric, 

Apologies for the KIP number conflict.  I'm not sure how the community handles 
these race conditions, but I think we've got it sorted.
Also not sure if this is the right action, but it might be less confusing if 
you re-send your DISCUSS email to the dev list with the correct title and link 
to your KIP.
Also, please add your KIP-1053 to the wiki table as it's still missing.

Best Regards,
Matt

-Original Message-
From: Eric Lu  
Sent: Thursday, June 6, 2024 6:09 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-1052: Align the naming convention for config and 
default variables in *Config classes

Hi Divij,

Thanks for the response, you are right that changing the external interface 
leads to some repetition. The main benefit is that the variables would follow 
the normalized naming convention. It was a concern brought up in this ticket.

Also, I forgot to increment the KIP number and another dev took KIP-1052. I 
changed mine to 1053.

Another thing, I also did not receive the Pony Mail link. Did I do something 
wrong?

Cheers,

Eric

On Thu, Jun 6, 2024 at 5:37 PM Divij Vaidya  wrote:

> Hi Eric
>
> Thank you for writing the KIP.
>
> Standardizing the internal variables and classes as per a convention 
> is a good idea. Even better would be to enforce that convention using 
> check style rules so that the convention is enforced via a mechanism 
> in the future code. You don’t need a KIP for it.
>
> However, I am not able to appreciate the benefit of changing the 
> external interfaces for the sake of alignment. Keeping two similar 
> names, as you proposed for backward compatibility, only adds to the 
> additional overhead in code maintenance (reduces readability and adds 
> to confusion). This cost, just to get a better-aligned convention, does 
> not seem worthwhile to me.
>
> Is there an obvious benefit that I am missing here which would make 
> this proposal a good trade off with the cost?
>
> —
> Divij Vaidya
>
>
>
> On Thu 6. Jun 2024 at 21:13, Eric Lu  wrote:
>
> >  Hi,
> >
> > I wanted to follow up on the discussion thread since I have not 
> > received anything yet.
> >
> > Best regards,
> >
> > Eric
> >
> > On Thu, Jun 6, 2024 at 12:39 PM Eric Lu wrote:
> >
> > > Hi,
> > >
> > > I'd like to start a discussion thread for my KIP:
> > > KIP-1052: Align the naming convention for config and default 
> > > variables in *Config classes
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1052%3A+Align+the+naming+convention+for+config+and+default+variables+in+*Config+classes
> > >
> > >
> > > Thanks,
> > >
> > > Eric
> > >
> >
>


[DISCUSS] KIP-1052: Enable warmup in producer performance test

2024-06-06 Thread Welch, Matt
Hello all,

I'd like to propose a change that would allow the producer performance test to 
have a warmup phase where the statistics gathered could be separated from 
statistics gathered during steady state.

Although startup is an important phase of Kafka operations and special 
attention should be paid to optimizing startup performance, often we would like 
to understand Kafka performance during steady-state operation, separate from 
its performance during producer startup.  It's common for new producers, like 
in a fresh producer performance test run, to have high latency during startup. 
This high latency can complicate the understanding of steady-state performance, 
even when collecting long-running tests.  If we want to understand steady-state 
latency separate from startup latency, we can collect measurements for each 
phase in disjoint sets then present statistics on each set independently or as 
a combined population of measurements.  This feature would be completely 
optional and could be represented by a new command line flag for the producer 
performance test, '--warmup-records'.
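
To make the "disjoint sets" idea concrete, here is a minimal Java sketch, not 
the tool's actual implementation. The class PhasedLatencies and its methods 
are hypothetical names. It assumes the first N recorded latencies (in 
recording order) count as warmup, and it keeps only running sums, so it 
covers averages only; percentile statistics would need the individual 
samples retained per phase.

```java
// Hypothetical sketch; not the producer performance tool's stats class.
final class PhasedLatencies {
    private final long warmupRecords; // assumed value of '--warmup-records'
    private long warmupCount;
    private long warmupSumMs;
    private long steadyCount;
    private long steadySumMs;

    PhasedLatencies(long warmupRecords) {
        this.warmupRecords = warmupRecords;
    }

    /** Adds one latency sample (ms) to the warmup or steady-state accumulator. */
    void record(long latencyMs) {
        if (warmupCount < warmupRecords) {
            warmupCount++;
            warmupSumMs += latencyMs;
        } else {
            steadyCount++;
            steadySumMs += latencyMs;
        }
    }

    double warmupAvgMs() { return avg(warmupSumMs, warmupCount); }

    double steadyAvgMs() { return avg(steadySumMs, steadyCount); }

    /** Average over the combined population of both phases. */
    double combinedAvgMs() {
        return avg(warmupSumMs + steadySumMs, warmupCount + steadyCount);
    }

    private static double avg(long sumMs, long count) {
        return count == 0 ? 0.0 : (double) sumMs / count;
    }
}
```

For example, with a warmup of 2 records and latencies of 100, 80, 10, and 
20 ms, the warmup average is 90 ms, the steady-state average is 15 ms, and 
the combined average is 52.5 ms, which shows how startup latency can inflate 
a whole-test figure.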

KIP: KIP-1052: Enable warmup in producer performance test - Apache Kafka - 
Apache Software 
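Foundation<https://cwiki.apache.org/confluence/display/KAFKA/KIP-1052%3A+Enable+warmup+in+producer+performance+test>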

Thank you,
Matt Welch



Request permission to contribute to Apache Kafka

2024-06-05 Thread Welch, Matt
Hi Kafka devs,

I'd like to request permission to contribute to Apache Kafka for a new KIP.

My Wiki ID is: mattw4
My JIRA ID is: mattw4
Email: matt.we...@intel.com

Thanks in advance!
--Matt