Re: [DISCUSS] KIP-824 Allowing dumping segmentlogs limiting the batches in the output

2022-03-04 Thread Luke Chen
Hi Sergio,

Thanks for the KIP!

One question:
I saw there's a `max-message-size` argument that seems to do the same thing
as you want.
Could you help explain what's the difference between `max-message-size` and
`max-batches-size`?

Thank you.
Luke

On Sat, Mar 5, 2022 at 3:21 AM Kirk True  wrote:

> Hi Sergio,
>
> Thanks for the KIP. I don't know anything about the log segment internals,
> but the logic and implementation seem sound.
>
> Three questions:
>  1. Since the --max-batches-size unit is bytes, does it matter if that
> size doesn't align to a record boundary?
>  2. Can you add a check to make sure that --max-batches-size doesn't allow
> the user to pass in a negative number?
>  3. Can you add/update any unit tests related to the DumpLogSegments
> arguments?
> Thanks,
> Kirk
>
> On Thu, Mar 3, 2022, at 1:32 PM, Sergio Daniel Troiano wrote:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-824%3A+Allowing+dumping+segmentlogs+limiting+the+batches+in+the+output
> >
>


Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-04 Thread Luke Chen
Hi Artem,

Thanks for your explanation and update to the KIP.
Some comments:

5. In the description for `enable.adaptive.partitioning`, the `false` case,
you said:
> the producer will try to distribute messages uniformly.
I think we should describe the possible skewing distribution. Otherwise,
user might be confused about why adaptive partitioning is important.

6. In the description for `partition.availability.timeout.ms`, I think we
should mention in the last sentence about if `enable.adaptive.partitioning`
is disabled this logic is also disabled.

7. Similar thoughts as Ismael, I think we should have a POC and test to
prove that this adaptive partitioning algorithm can have better uniform
partitioning, compared with original sticky one.

Thank you.
Luke

On Fri, Mar 4, 2022 at 9:22 PM Ismael Juma  wrote:

> Regarding `3`, we should only deprecate it if we're sure the new approach
> handles all cases better. Are we confident about that for both of the
> previous partitioners?
>
> Ismael
>
> On Fri, Mar 4, 2022 at 1:37 AM David Jacot 
> wrote:
>
> > Hi Artem,
> >
> > Thanks for the KIP! I have a few comments:
> >
> > 1. In the preamble of the proposed change section, there is still a
> > mention of the
> > -1 approach. My understanding is that we have moved away from it now.
> >
> > 2. I am a bit concerned by the trick suggested about the
> > DefaultPartitioner and
> > the UniformStickyPartitioner. I do agree that implementing the logic in
> the
> > producer itself is a good thing. However, it is weird from a user
> > perspective
> > that he can set a class as partitioner that is not used in the end. I
> > think that
> > this will be confusing for our users. Have we considered changing the
> > default
> > value of partitioner.class to null to indicate that the new built-in
> > partitioner
> > must be used? By default, the built-in partitioner would be used unless
> the
> > user explicitly specify one. The downside is that the new default
> behavior
> > would not work if the user explicitly specify the partitioner but we
> could
> > mitigate this with my next point.
> >
> > 3. Related to the previous point, I think that we could deprecate both
> the
> > DefaultPartitioner and the UniformStickyPartitioner. I would also add a
> > warning if one of them is explicitly provided by the user to inform them
> > that they should switch to the new built-in one. I am pretty sure that
> most
> > of the folks use the default configuration anyway.
> >
> > 4. It would be great if we could explain why the -1 way was rejected. At
> > the moment, the rejected alternative only explain the idea but does not
> > say why we rejected it.
> >
> > Best,
> > David
> >
> > On Fri, Mar 4, 2022 at 6:03 AM Artem Livshits
> >  wrote:
> > >
> > > Hi Jun,
> > >
> > > 2. Removed the option from the KIP.  Now the sticky partitioning
> > threshold
> > > is hardcoded to batch.size.
> > >
> > > 20. Added the corresponding wording to the KIP.
> > >
> > > -Artem
> > >
> > > On Thu, Mar 3, 2022 at 10:52 AM Jun Rao 
> > wrote:
> > >
> > > > Hi, Artem,
> > > >
> > > > Thanks for the reply.
> > > >
> > > > 1. Sounds good.
> > > >
> > > > 2. If we don't expect users to change it, we probably could just
> leave
> > out
> > > > the new config. In general, it's easy to add a new config, but hard
> to
> > > > remove an existing config.
> > > >
> > > > 20. The two new configs enable.adaptive.partitioning and
> > > > partition.availability.timeout.ms only apply to the two built-in
> > > > partitioners DefaultPartitioner and UniformStickyPartitioner, right?
> It
> > > > would be useful to document that in the KIP.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Thu, Mar 3, 2022 at 9:47 AM Artem Livshits
> > > >  wrote:
> > > >
> > > > > Hi Jun,
> > > > >
> > > > > Thank you for the suggestions.
> > > > >
> > > > > 1. As we discussed offline, we can hardcode the logic for
> > > > > DefaultPartitioner and UniformStickyPartitioner in the
> KafkaProducer
> > > > (i.e.
> > > > > the DefaultPartitioner.partition won't get called, instead
> > KafkaProducer
> > > > > would check if the partitioner is an instance of DefaultPartitioner
> > and
> > > > > then run the actual partitioning logic itself).  Then the change to
> > the
> > > > > Partitioner wouldn't be required.  I'll update the KIP to reflect
> > that.
> > > > >
> > > > > 2. I don't expect users to change this too often, as changing it
> > would
> > > > > require a bit of studying of the production patterns.  As a general
> > > > > principle, if I can think of a model that requires a deviation from
> > > > > default, I tend to add a configuration option.  It could be that
> > it'll
> > > > > never get used in practice, but I cannot prove that.  I'm ok with
> > > > removing
> > > > > the option, let me know what you think.
> > > > >
> > > > > -Artem
> > > > >
> > > > > On Mon, Feb 28, 2022 at 2:06 PM Jun Rao 
> > > > wrote:
> > > > >
> > > > > > Hi, Artem,
> > > > > >
> > > > > > Thanks for 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #739

2022-03-04 Thread Apache Jenkins Server
See 




Jenkins build is unstable: Kafka » Kafka Branch Builder » 3.0 #181

2022-03-04 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-13709) Document exactly-once support for source connectors

2022-03-04 Thread Chris Egerton (Jira)
Chris Egerton created KAFKA-13709:
-

 Summary: Document exactly-once support for source connectors
 Key: KAFKA-13709
 URL: https://issues.apache.org/jira/browse/KAFKA-13709
 Project: Kafka
  Issue Type: Task
  Components: KafkaConnect
Reporter: Chris Egerton
Assignee: Chris Egerton


Add documentation for the support for exactly-once source connectors introduced 
in 
[KIP-618|https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors].
 This includes but is not limited to:
 * How to safely perform a rolling upgrade to enable exactly-once source 
support for an existing cluster
 * How to perform a soft rollback (i.e., disable exactly-once source support by 
reconfiguring workers on a cluster)
 * How to perform a hard rollback (i.e., disable exactly-once source support by 
reverting workers to an earlier version that does not include code changes for 
KIP-618)
 * Any new APIs that connector authors can/should leverage for their source 
connectors that need clarification beyond what can be included in a Javadoc 
(for example, how to know what to return from 
{{{}SourceConnector::exactlyOnceSupport{}}}, and an example on how to define 
custom transaction boundaries for a connector)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #738

2022-03-04 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.1 #83

2022-03-04 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-824 Allowing dumping segmentlogs limiting the batches in the output

2022-03-04 Thread Kirk True
Hi Sergio,

Thanks for the KIP. I don't know anything about the log segment internals, but 
the logic and implementation seem sound.

Three questions:
 1. Since the --max-batches-size unit is bytes, does it matter if that size 
doesn't align to a record boundary?
 2. Can you add a check to make sure that --max-batches-size doesn't allow the 
user to pass in a negative number?
 3. Can you add/update any unit tests related to the DumpLogSegments arguments?
Thanks,
Kirk

On Thu, Mar 3, 2022, at 1:32 PM, Sergio Daniel Troiano wrote:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-824%3A+Allowing+dumping+segmentlogs+limiting+the+batches+in+the+output
> 


[jira] [Resolved] (KAFKA-13706) org.apache.kafka.test.MockSelector doesn't remove closed connections from its 'ready' field

2022-03-04 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-13706.
-
Fix Version/s: 3.2.0
 Reviewer: David Jacot
 Assignee: Vincent Jiang
   Resolution: Fixed

> org.apache.kafka.test.MockSelector doesn't remove closed connections from its 
> 'ready' field
> ---
>
> Key: KAFKA-13706
> URL: https://issues.apache.org/jira/browse/KAFKA-13706
> Project: Kafka
>  Issue Type: Bug
>  Components: unit tests
>Reporter: Vincent Jiang
>Assignee: Vincent Jiang
>Priority: Minor
> Fix For: 3.2.0
>
>
> MockSelector.close(String id) method doesn't remove closed connection from 
> "ready" field.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-04 Thread Ismael Juma
Regarding `3`, we should only deprecate it if we're sure the new approach
handles all cases better. Are we confident about that for both of the
previous partitioners?

Ismael

On Fri, Mar 4, 2022 at 1:37 AM David Jacot 
wrote:

> Hi Artem,
>
> Thanks for the KIP! I have a few comments:
>
> 1. In the preamble of the proposed change section, there is still a
> mention of the
> -1 approach. My understanding is that we have moved away from it now.
>
> 2. I am a bit concerned by the trick suggested about the
> DefaultPartitioner and
> the UniformStickyPartitioner. I do agree that implementing the logic in the
> producer itself is a good thing. However, it is weird from a user
> perspective
> that he can set a class as partitioner that is not used in the end. I
> think that
> this will be confusing for our users. Have we considered changing the
> default
> value of partitioner.class to null to indicate that the new built-in
> partitioner
> must be used? By default, the built-in partitioner would be used unless the
> user explicitly specify one. The downside is that the new default behavior
> would not work if the user explicitly specify the partitioner but we could
> mitigate this with my next point.
>
> 3. Related to the previous point, I think that we could deprecate both the
> DefaultPartitioner and the UniformStickyPartitioner. I would also add a
> warning if one of them is explicitly provided by the user to inform them
> that they should switch to the new built-in one. I am pretty sure that most
> of the folks use the default configuration anyway.
>
> 4. It would be great if we could explain why the -1 way was rejected. At
> the moment, the rejected alternative only explain the idea but does not
> say why we rejected it.
>
> Best,
> David
>
> On Fri, Mar 4, 2022 at 6:03 AM Artem Livshits
>  wrote:
> >
> > Hi Jun,
> >
> > 2. Removed the option from the KIP.  Now the sticky partitioning
> threshold
> > is hardcoded to batch.size.
> >
> > 20. Added the corresponding wording to the KIP.
> >
> > -Artem
> >
> > On Thu, Mar 3, 2022 at 10:52 AM Jun Rao 
> wrote:
> >
> > > Hi, Artem,
> > >
> > > Thanks for the reply.
> > >
> > > 1. Sounds good.
> > >
> > > 2. If we don't expect users to change it, we probably could just leave
> out
> > > the new config. In general, it's easy to add a new config, but hard to
> > > remove an existing config.
> > >
> > > 20. The two new configs enable.adaptive.partitioning and
> > > partition.availability.timeout.ms only apply to the two built-in
> > > partitioners DefaultPartitioner and UniformStickyPartitioner, right? It
> > > would be useful to document that in the KIP.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Thu, Mar 3, 2022 at 9:47 AM Artem Livshits
> > >  wrote:
> > >
> > > > Hi Jun,
> > > >
> > > > Thank you for the suggestions.
> > > >
> > > > 1. As we discussed offline, we can hardcode the logic for
> > > > DefaultPartitioner and UniformStickyPartitioner in the KafkaProducer
> > > (i.e.
> > > > the DefaultPartitioner.partition won't get called, instead
> KafkaProducer
> > > > would check if the partitioner is an instance of DefaultPartitioner
> and
> > > > then run the actual partitioning logic itself).  Then the change to
> the
> > > > Partitioner wouldn't be required.  I'll update the KIP to reflect
> that.
> > > >
> > > > 2. I don't expect users to change this too often, as changing it
> would
> > > > require a bit of studying of the production patterns.  As a general
> > > > principle, if I can think of a model that requires a deviation from
> > > > default, I tend to add a configuration option.  It could be that
> it'll
> > > > never get used in practice, but I cannot prove that.  I'm ok with
> > > removing
> > > > the option, let me know what you think.
> > > >
> > > > -Artem
> > > >
> > > > On Mon, Feb 28, 2022 at 2:06 PM Jun Rao 
> > > wrote:
> > > >
> > > > > Hi, Artem,
> > > > >
> > > > > Thanks for the reply. A few more comments.
> > > > >
> > > > > 1. Since we control the implementation and the usage of
> > > > DefaultPartitioner,
> > > > > another way is to instantiate the DefaultPartitioner with a special
> > > > > constructor, which allows it to have more access to internal
> > > information.
> > > > > Then we could just change the behavior of  DefaultPartitioner such
> that
> > > > it
> > > > > can use the internal infoamtion when choosing the partition. This
> seems
> > > > > more intuitive than having DefaultPartitioner return -1 partition.
> > > > >
> > > > > 2. I guess partitioner.sticky.batch.size is introduced because the
> > > > > effective batch size could be less than batch.size and we want to
> align
> > > > > partition switching with the effective batch size. How would a user
> > > know
> > > > > the effective batch size to set partitioner.sticky.batch.size
> properly?
> > > > If
> > > > > the user somehow knows the effective batch size, does setting
> > > batch.size
> > > > to
> > > > > the effective batch size achieve the same result?
> 

[jira] [Created] (KAFKA-13708) The metrics-core-2.2.0.jar on which Kafak depends has the open-source vulnerability CVE-2022-20621.

2022-03-04 Thread caoguangjie (Jira)
caoguangjie created KAFKA-13708:
---

 Summary: The metrics-core-2.2.0.jar on which Kafak depends has the 
open-source vulnerability CVE-2022-20621.
 Key: KAFKA-13708
 URL: https://issues.apache.org/jira/browse/KAFKA-13708
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: caoguangjie


|h2. CVE-2022-20621 Detail
h3. Current Description
Jenkins Metrics Plugin 4.0.2.8 and earlier stores an access key unencrypted in 
its global configuration file on the Jenkins controller where it can be viewed 
by users with access to the Jenkins controller file system.


[https://nvd.nist.gov/vuln/detail/CVE-2022-20621]


|



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #737

2022-03-04 Thread Apache Jenkins Server
See 




RE: Proposal to add IBM Power (ppc64le) to apache kafka jenkins CI

2022-03-04 Thread Abhijit Mane
Thank you Mickael.

Regards,
Abhijit​

From: Mickael Maison 
Sent: 03 March 2022 22:06
To: dev 
Subject: [EXTERNAL] Re: Proposal to add IBM Power (ppc64le) to apache kafka 
jenkins CI

Hi Abhijit,

Nobody raised concerns so I went ahead an opened a PR to add ppc64le
to the Kafka CI:
https://github.com/apache/kafka/pull/11833

As a starter I reused the same settings as the ARM build (only running
unitTest) and confirmed it's building fine. We can reconsider later if
we also want to run integrationTest.

Thanks,
Mickael

On Thu, Feb 10, 2022 at 9:10 AM Abhijit Mane  wrote:
>
> Hi Mickael,
>
> Thanks for validating the ppc64le VM spec for suitability of kafka builds.
>
> If there are no more comments, can I proceed with working with Infra
> on further steps?
> https://issues.apache.org/jira/browse/INFRA-22612
>
> Regards,
> Abhijit
>
> On Wed, Jan 26, 2022 at 8:40 PM Mickael Maison  
> wrote:
> >
> > Hi Abhijit,
> >
> > Thanks for offering a ppc64le VM. Looking at the INFRA ticket, it
> > looks like it was added to Jenkins successfully.
> > The spec looks appropriate for Kafka builds.
> >
> > Let's wait a few days to see if anybody has questions or concerns.
> >
> > Thanks,
> > Mickael
> >
> >
> > On Mon, Jan 24, 2022 at 11:03 AM Abhijit Mane  wrote:
> > >
> > > Hello,
> > >
> > > We would like to work with the community to enable IBM Power (ppc64le)
> > > architecture support for Apache Jenkins CI so we can enable Apache
> > > Kafka on IBM Power systems. An IBM Power (ppc64le) VM available to the
> > > community for integrating into Jenkins cluster. The VM has also been
> > > set up as advised by Infra using the script below and verified.
> > > https://github.com/apache/cassandra-builds/blob/trunk/ASF-jenkins-agents.md
> > >
> > > ---
> > > Ubuntu 20.04 VM
> > > Root vol: 100GB,  Data vol: 500GB, mounted at /home/jenkins
> > > 8 cores / 16GB RAM
> > > Infra discussion with community:
> > > https://issues.apache.org/jira/browse/INFRA-22612
> > > ---
> > >
> > > Kindly review Jira to let me know if you have any questions/concerns
> > > and if it's okay to proceed further.
> > >
> > > Regards,
> > > Abhijit
> > >
> >
>


Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-04 Thread David Jacot
Hi Artem,

Thanks for the KIP! I have a few comments:

1. In the preamble of the proposed change section, there is still a
mention of the
-1 approach. My understanding is that we have moved away from it now.

2. I am a bit concerned by the trick suggested about the DefaultPartitioner and
the UniformStickyPartitioner. I do agree that implementing the logic in the
producer itself is a good thing. However, it is weird from a user perspective
that he can set a class as partitioner that is not used in the end. I think that
this will be confusing for our users. Have we considered changing the default
value of partitioner.class to null to indicate that the new built-in partitioner
must be used? By default, the built-in partitioner would be used unless the
user explicitly specify one. The downside is that the new default behavior
would not work if the user explicitly specify the partitioner but we could
mitigate this with my next point.

3. Related to the previous point, I think that we could deprecate both the
DefaultPartitioner and the UniformStickyPartitioner. I would also add a
warning if one of them is explicitly provided by the user to inform them
that they should switch to the new built-in one. I am pretty sure that most
of the folks use the default configuration anyway.

4. It would be great if we could explain why the -1 way was rejected. At
the moment, the rejected alternative only explain the idea but does not
say why we rejected it.

Best,
David

On Fri, Mar 4, 2022 at 6:03 AM Artem Livshits
 wrote:
>
> Hi Jun,
>
> 2. Removed the option from the KIP.  Now the sticky partitioning threshold
> is hardcoded to batch.size.
>
> 20. Added the corresponding wording to the KIP.
>
> -Artem
>
> On Thu, Mar 3, 2022 at 10:52 AM Jun Rao  wrote:
>
> > Hi, Artem,
> >
> > Thanks for the reply.
> >
> > 1. Sounds good.
> >
> > 2. If we don't expect users to change it, we probably could just leave out
> > the new config. In general, it's easy to add a new config, but hard to
> > remove an existing config.
> >
> > 20. The two new configs enable.adaptive.partitioning and
> > partition.availability.timeout.ms only apply to the two built-in
> > partitioners DefaultPartitioner and UniformStickyPartitioner, right? It
> > would be useful to document that in the KIP.
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, Mar 3, 2022 at 9:47 AM Artem Livshits
> >  wrote:
> >
> > > Hi Jun,
> > >
> > > Thank you for the suggestions.
> > >
> > > 1. As we discussed offline, we can hardcode the logic for
> > > DefaultPartitioner and UniformStickyPartitioner in the KafkaProducer
> > (i.e.
> > > the DefaultPartitioner.partition won't get called, instead KafkaProducer
> > > would check if the partitioner is an instance of DefaultPartitioner and
> > > then run the actual partitioning logic itself).  Then the change to the
> > > Partitioner wouldn't be required.  I'll update the KIP to reflect that.
> > >
> > > 2. I don't expect users to change this too often, as changing it would
> > > require a bit of studying of the production patterns.  As a general
> > > principle, if I can think of a model that requires a deviation from
> > > default, I tend to add a configuration option.  It could be that it'll
> > > never get used in practice, but I cannot prove that.  I'm ok with
> > removing
> > > the option, let me know what you think.
> > >
> > > -Artem
> > >
> > > On Mon, Feb 28, 2022 at 2:06 PM Jun Rao 
> > wrote:
> > >
> > > > Hi, Artem,
> > > >
> > > > Thanks for the reply. A few more comments.
> > > >
> > > > 1. Since we control the implementation and the usage of
> > > DefaultPartitioner,
> > > > another way is to instantiate the DefaultPartitioner with a special
> > > > constructor, which allows it to have more access to internal
> > information.
> > > > Then we could just change the behavior of  DefaultPartitioner such that
> > > it
> > > > can use the internal infoamtion when choosing the partition. This seems
> > > > more intuitive than having DefaultPartitioner return -1 partition.
> > > >
> > > > 2. I guess partitioner.sticky.batch.size is introduced because the
> > > > effective batch size could be less than batch.size and we want to align
> > > > partition switching with the effective batch size. How would a user
> > know
> > > > the effective batch size to set partitioner.sticky.batch.size properly?
> > > If
> > > > the user somehow knows the effective batch size, does setting
> > batch.size
> > > to
> > > > the effective batch size achieve the same result?
> > > >
> > > > 4. Thanks for the explanation. Makes sense to me.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Fri, Feb 25, 2022 at 8:26 PM Artem Livshits
> > > >  wrote:
> > > >
> > > > > Hi Jun,
> > > > >
> > > > > 1. Updated the KIP to add a couple paragraphs about implementation
> > > > > necessities in the Proposed Changes section.
> > > > >
> > > > > 2. Sorry if my reply was confusing, what I meant to say (and I
> > >