Re: [ANNOUNCE] New committer: Igor Soarez

2024-04-24 Thread Viktor Somogyi-Vass
Congrats Igor!

On Thu, Apr 25, 2024, 07:01 Bruno Cadonna  wrote:

> Congrats!
>
> Best,
> Bruno
>
> On 25 April 2024 at 05:18:19 CEST, Yash Mayya wrote:
> >Congratulations Igor!
> >
> >On Wed, 24 Apr, 2024, 23:36 Colin McCabe,  wrote:
> >
> >> Hi all,
> >>
> >> The PMC of Apache Kafka is pleased to announce a new Kafka committer,
> Igor
> >> Soarez.
> >>
> >> Igor has been a Kafka contributor since 2019. In addition to being a
> >> regular contributor and reviewer, he has made significant contributions
> to
> >> improving Kafka's JBOD support in KRaft mode. He has also contributed to
> >> discussing and reviewing many KIPs such as KIP-690, KIP-554, KIP-866,
> and
> >> KIP-938.
> >>
> >> Congratulations, Igor!
> >>
> >> Thanks,
> >>
> >> Colin (on behalf of the Apache Kafka PMC)
> >>
>


Re: [Confluence] Request for an account

2024-04-24 Thread Luke Chen
Hi Arpit,

You should be able to check the status in
https://issues.apache.org/jira/browse/INFRA-25451 .
It's infra's issue, not ours.

Thanks.
Luke

On Thu, Apr 25, 2024 at 12:49 PM Arpit Goyal 
wrote:

> Hi All,
> Is this issue resolved ?
> Thanks and Regards
> Arpit Goyal
> 8861094754
>
>
> On Tue, Mar 26, 2024 at 9:11 AM Luke Chen  wrote:
>
> > Hi Johnny,
> >
> > Currently, there is an infra issue about this:
> > https://issues.apache.org/jira/browse/INFRA-25451 , and unfortunately
> it's
> > not fixed, yet.
> > I think alternatively, maybe you could put your proposal in a shared
> google
> > doc for discussion. (without comment enabled since we want to keep all
> the
> > discussion history in apache email threads).
> > After discussion is completed, committers can help you add the content
> into
> > confluence wiki.
> >
> > Thanks.
> > Luke
> >
> > On Mon, Mar 25, 2024 at 8:56 PM ChengHan Hsu 
> > wrote:
> >
> > > Hi all,
> > >
> > > I have sent an email to infrastruct...@apache.org for registering an
> > > account
> > > of Confluence, I am contributing to Kafka and would like to update some
> > > wiki.
> > > May I know if anyone can help with this?
> > >
> > > Thanks in advance!
> > >
> > > Best,
> > > Johnny
> > >
> >
>


Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-24 Thread Matthias J. Sax
104: I also don't feel super strongly about it. Not sure if 
`onAssignment()` might overload the name in a confusing way? In the end, 
when the method is called, we don't assign anything. -- I guess I am fine 
with whatever Rohan picks as a name from the suggestions we have so far.



107: Did not think about how to do it yet. Just raised the question to 
see if I am even heading in the right direction or not... I did not 
propose to remove the method; it's clear that we need it.


Thinking about it a little more, what we actually want to convey is a 
certain "processing capacity" an instance has? Thus, 
`numConsumerClients()` might not reflect this in the future? Should we 
just generically call it `processingCapacity()` or similar and for now 
explain in the JavaDocs that it maps to the number of `StreamThread`s 
(currently running, because users can dynamically add/remove 
them...). We can later update the JavaDocs when we have 
"processing threads" and point to number of processing threads? Maybe 
Lucas/Bruno can provide more input on what/how we plan the future 
threading model and configs.
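To make the idea concrete, here is a tiny hand-rolled sketch -- the interface and method names below are illustrative stand-ins, not the actual KIP-924 API:

```java
public class ProcessingCapacityDemo {
    interface KafkaStreamsState {
        /**
         * Processing capacity of this client. For now this would map to the
         * number of currently running StreamThreads; only the javadoc (not
         * the method name) needs updating once processing threads land.
         */
        int processingCapacity();
    }

    // Helper producing a state with a fixed capacity, purely for the demo.
    static KafkaStreamsState fixedCapacity(int threads) {
        return () -> threads;
    }

    public static void main(String[] args) {
        if (fixedCapacity(4).processingCapacity() != 4) throw new AssertionError();
        System.out.println("ok");
    }
}
```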


Nit: not sure if we need the "NOTE" section at all? If we think we want 
it, maybe remove from the KIP and we can discuss in more detail on the 
PR (think it could be improved). Don't think the JavaDocs on the KIP 
need to be 100% accurate to what we put into the code later.



108: I guess my question is two-fold. (1) Does user-code need to decide 
to schedule a probing rebalance to begin with? Or could the 
non-customizable part of `StreamsPartitionAssignor` decide it? (2) If 
custom code really needs to make this decision, why would it not just 
return a boolean? It seems unnecessary to compute a deadline, given that 
the probing rebalance interval is a config? -- Or maybe I am missing 
something? If it's about regular probing rebalance vs immediate 
rebalance vs no follow up, maybe an enum would do the trick?
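For illustration, the enum idea could look roughly like this (all names are hypothetical, and the deadline math is just one possible mapping the non-customizable code could apply using the configured probing interval):

```java
public class FollowupRebalanceDemo {
    // Hypothetical enum for the three cases named above; not part of KIP-924.
    enum FollowupRebalance { NONE, PROBING, IMMEDIATE }

    // The non-customizable assignor code translates the enum into a concrete
    // deadline itself, so custom assignors never compute deadlines directly.
    static long deadlineMs(FollowupRebalance f, long nowMs, long probingIntervalMs) {
        switch (f) {
            case IMMEDIATE: return nowMs;
            case PROBING:   return nowMs + probingIntervalMs;
            default:        return Long.MAX_VALUE; // no follow-up requested
        }
    }

    public static void main(String[] args) {
        if (deadlineMs(FollowupRebalance.IMMEDIATE, 100L, 50L) != 100L) throw new AssertionError();
        if (deadlineMs(FollowupRebalance.PROBING, 100L, 50L) != 150L) throw new AssertionError();
        System.out.println("ok");
    }
}
```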



115: Thanks for the explanation. This does make sense. I am wondering if 
we need the new utility method though? Would it not be possible to 
encapsulate all this inside the non-customizable code? The method 
`kafkaStreamsStates(boolean computeTaskLags)` will be provided by us and 
called by the user code. Thus, if we cannot compute the lag, we could 
still throw an exception -- the user code does not need to know anything 
about it, and is not supposed to catch this exception. Hence, it should 
bubble up and get back to our code from 
`TaskAssignor#assign(ApplicationState applicationState)` which is called 
by us, and we can catch our own exception here, and do what we do 
currently: we return the old assignment, and request an immediate follow 
up rebalance? For this case, the user code does not need to know 
anything about it, and does not need to do anything special, and it 
would become a provided built-in feature, which seems desirable?
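As a rough sketch of that control flow -- all names below are hypothetical stand-ins for the real KIP-924 types, just to show the exception bubbling through user code and being caught by the framework:

```java
public class LagFallbackDemo {
    // Stand-in for the internal exception thrown when lag computation fails.
    static class TaskLagException extends RuntimeException {}

    // Simulates user assignor code that (indirectly) triggers lag computation;
    // it does not catch the exception, so it bubbles up.
    static String userAssign(boolean lagComputationFails) {
        if (lagComputationFails) throw new TaskLagException();
        return "new-assignment";
    }

    // Simulates the non-customizable framework code: it catches its own
    // exception, keeps the old assignment, and requests an immediate follow-up.
    static String frameworkAssign(boolean lagComputationFails) {
        try {
            return userAssign(lagComputationFails);
        } catch (TaskLagException e) {
            return "old-assignment+immediate-followup";
        }
    }

    public static void main(String[] args) {
        if (!frameworkAssign(false).equals("new-assignment")) throw new AssertionError();
        if (!frameworkAssign(true).equals("old-assignment+immediate-followup")) throw new AssertionError();
        System.out.println("ok");
    }
}
```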



117: not sure myself... Let's see what others think. I'll think about it 
a little bit more and follow up again later. It's a tricky one.



-Matthias


On 4/24/24 5:08 PM, Sophie Blee-Goldman wrote:

Now to respond to Matthias:

FYI, I'm following the numbering scheme from your email but added * to
mark responses with further questions or feedback and/or that aren't yet
addressed in the KIP and need to be followed up on. You can more or less
just skip over the ones without stars to save time.

100: I think this is leftover from a previous approach we considered.
Removed this line

101: I agree. Perhaps we hadn't fully committed to this decision when the
KIP was first written ;P Added a "Consumer Assignments" section under
"Public Changes" to address this more carefully and explicitly

102: fixed

103: fixed

104: I do agree with Bruno on the structure of this callback name, ie
that it should start with "on", but any of the suggestions with that sound
good to me. I really don't feel too strongly but just to throw in a bit of
extra context, there is an analogous callback on the
ConsumerPartitionAssignor that is called #onAssignment. So personally I
would slightly prefer to just call it #onAssignment. However I'm happy to
go with whatever the consensus is -- @Bruno/@Matthias WDYT?

105: done (fyi I did leave the config doc string which is technically a
private variable but is part of the public contract)

106: good point about the numKafkaStreamsClients and toString methods --
removed those

107: I guess this reflects how long it's been since the KIP was first
written :P But that's a fair point -- and yes, we should write this in a
forward looking manner. However, we can't build it in a way that's so
forward-looking that it doesn't work for the current version. Are you
proposing to remove this API altogether, or just rename it to
#numConsumerClients (or something like that) and update the javadocs
accordingly?  Assuming the latter, I totally agree, and have made the
change. But we definitely can't just remove it altogether (it may even be
relevant in 

Re: [ANNOUNCE] New committer: Igor Soarez

2024-04-24 Thread Bruno Cadonna
Congrats!

Best,
Bruno

On 25 April 2024 at 05:18:19 CEST, Yash Mayya wrote:
>Congratulations Igor!
>
>On Wed, 24 Apr, 2024, 23:36 Colin McCabe,  wrote:
>
>> Hi all,
>>
>> The PMC of Apache Kafka is pleased to announce a new Kafka committer, Igor
>> Soarez.
>>
>> Igor has been a Kafka contributor since 2019. In addition to being a
>> regular contributor and reviewer, he has made significant contributions to
>> improving Kafka's JBOD support in KRaft mode. He has also contributed to
>> discussing and reviewing many KIPs such as KIP-690, KIP-554, KIP-866, and
>> KIP-938.
>>
>> Congratulations, Igor!
>>
>> Thanks,
>>
>> Colin (on behalf of the Apache Kafka PMC)
>>


Re: [Confluence] Request for an account

2024-04-24 Thread Arpit Goyal
Hi All,
Is this issue resolved ?
Thanks and Regards
Arpit Goyal
8861094754


On Tue, Mar 26, 2024 at 9:11 AM Luke Chen  wrote:

> Hi Johnny,
>
> Currently, there is an infra issue about this:
> https://issues.apache.org/jira/browse/INFRA-25451 , and unfortunately it's
> not fixed, yet.
> I think alternatively, maybe you could put your proposal in a shared google
> doc for discussion. (without comment enabled since we want to keep all the
> discussion history in apache email threads).
> After discussion is completed, committers can help you add the content into
> confluence wiki.
>
> Thanks.
> Luke
>
> On Mon, Mar 25, 2024 at 8:56 PM ChengHan Hsu 
> wrote:
>
> > Hi all,
> >
> > I have sent an email to infrastruct...@apache.org for registering an
> > account
> > of Confluence, I am contributing to Kafka and would like to update some
> > wiki.
> > May I know if anyone can help with this?
> >
> > Thanks in advance!
> >
> > Best,
> > Johnny
> >
>


Permissions to contribute to Apache Kafka

2024-04-24 Thread Rajdeep Sahoo
Hi team,
Please find my wiki id and jira id below. Requesting you to grant
access so that I will be able to contribute to Apache Kafka.

*wiki id*: rajdeepsahoo2012
*jira id*: rajdeep_sahoo

Thanks,
Rajdeep sahoo


Re: [ANNOUNCE] New committer: Igor Soarez

2024-04-24 Thread Yash Mayya
Congratulations Igor!

On Wed, 24 Apr, 2024, 23:36 Colin McCabe,  wrote:

> Hi all,
>
> The PMC of Apache Kafka is pleased to announce a new Kafka committer, Igor
> Soarez.
>
> Igor has been a Kafka contributor since 2019. In addition to being a
> regular contributor and reviewer, he has made significant contributions to
> improving Kafka's JBOD support in KRaft mode. He has also contributed to
> discussing and reviewing many KIPs such as KIP-690, KIP-554, KIP-866, and
> KIP-938.
>
> Congratulations, Igor!
>
> Thanks,
>
> Colin (on behalf of the Apache Kafka PMC)
>


Re: [ANNOUNCE] New committer: Igor Soarez

2024-04-24 Thread Kamal Chandraprakash
Congratulations, Igor!

On Thu, Apr 25, 2024, 06:11 Luke Chen  wrote:

> Congrats, Igor!
>
> On Thu, Apr 25, 2024 at 6:10 AM Matthias J. Sax  wrote:
>
> > Congrats!
> >
> > On 4/24/24 2:29 PM, Bill Bejeck wrote:
> > > Congrats Igor!
> > >
> > > -Bill
> > >
> > > On Wed, Apr 24, 2024 at 2:37 PM Tom Bentley 
> wrote:
> > >
> > >> Congratulations Igor!
> > >>
> > >> On Thu, 25 Apr 2024 at 6:27 AM, Chia-Ping Tsai 
> > wrote:
> > >>
> > >>> Congratulations, Igor! You are one of the best Kafka developers!!!
> > >>>
> > >>> Mickael Maison  wrote on Thursday, April 25, 2024 at 2:16 AM:
> > >>>
> >  Congratulations Igor!
> > 
> >  On Wed, Apr 24, 2024 at 8:06 PM Colin McCabe 
> > >> wrote:
> > >
> > > Hi all,
> > >
> > > The PMC of Apache Kafka is pleased to announce a new Kafka
> committer,
> >  Igor Soarez.
> > >
> > > Igor has been a Kafka contributor since 2019. In addition to being
> a
> >  regular contributor and reviewer, he has made significant
> > contributions
> > >>> to
> >  improving Kafka's JBOD support in KRaft mode. He has also
> contributed
> > >> to
> >  discussing and reviewing many KIPs such as KIP-690, KIP-554,
> KIP-866,
> > >> and
> >  KIP-938.
> > >
> > > Congratulations, Igor!
> > >
> > > Thanks,
> > >
> > > Colin (on behalf of the Apache Kafka PMC)
> > 
> > >>>
> > >>
> > >
> >
>


[jira] [Resolved] (KAFKA-16610) Replace "Map#entrySet#forEach" by "Map#forEach"

2024-04-24 Thread Igor Soarez (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Soarez resolved KAFKA-16610.
-
Resolution: Resolved

> Replace "Map#entrySet#forEach" by "Map#forEach"
> ---
>
> Key: KAFKA-16610
> URL: https://issues.apache.org/jira/browse/KAFKA-16610
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: TengYao Chi
>Priority: Minor
> Fix For: 3.8.0
>
>
> {quote}
> Targets
>     Occurrences of 'entrySet().forEach' in Project
> Found occurrences in Project  (16 usages found)
>     Unclassified  (16 usages found)
>         kafka.core.main  (9 usages found)
>             kafka.server  (4 usages found)
>                 ControllerApis.scala  (2 usages found)
>                     ControllerApis  (2 usages found)
>                         handleIncrementalAlterConfigs  (1 usage found)
>                             774 controllerResults.entrySet().forEach(entry => 
> response.responses().add(
>                         handleLegacyAlterConfigs  (1 usage found)
>                             533 controllerResults.entrySet().forEach(entry => 
> response.responses().add(
>                 ControllerConfigurationValidator.scala  (2 usages found)
>                     ControllerConfigurationValidator  (2 usages found)
>                         validate  (2 usages found)
>                             99 config.entrySet().forEach(e => {
>                             114 config.entrySet().forEach(e => 
> properties.setProperty(e.getKey, e.getValue))
>             kafka.server.metadata  (5 usages found)
>                 AclPublisher.scala  (1 usage found)
>                     AclPublisher  (1 usage found)
>                         onMetadataUpdate  (1 usage found)
>                             73 aclsDelta.changes().entrySet().forEach(e =>
>                 ClientQuotaMetadataManager.scala  (3 usages found)
>                     ClientQuotaMetadataManager  (3 usages found)
>                         handleIpQuota  (1 usage found)
>                             119 quotaDelta.changes().entrySet().forEach { e =>
>                         update  (2 usages found)
>                             54 quotasDelta.changes().entrySet().forEach { e =>
>                             99 quotaDelta.changes().entrySet().forEach { e =>
>                 KRaftMetadataCache.scala  (1 usage found)
>                     KRaftMetadataCache  (1 usage found)
>                         getClusterMetadata  (1 usage found)
>                             491 topic.partitions().entrySet().forEach { entry 
> =>
>         kafka.core.test  (1 usage found)
>             unit.kafka.integration  (1 usage found)
>                 KafkaServerTestHarness.scala  (1 usage found)
>                     KafkaServerTestHarness  (1 usage found)
>                         getTopicNames  (1 usage found)
>                             349 
> controllerServer.controller.findAllTopicIds(ANONYMOUS_CONTEXT).get().entrySet().forEach
>  {
>         kafka.metadata.main  (3 usages found)
>             org.apache.kafka.controller  (2 usages found)
>                 QuorumFeatures.java  (1 usage found)
>                     toString()  (1 usage found)
>                         144 localSupportedFeatures.entrySet().forEach(f -> 
> features.add(f.getKey() + ": " + f.getValue()));
>                 ReplicationControlManager.java  (1 usage found)
>                     createTopic(ControllerRequestContext, CreatableTopic, 
> List, Map, 
> List, boolean)  (1 usage found)
>                         732 newParts.entrySet().forEach(e -> 
> assignments.put(e.getKey(),
>             org.apache.kafka.metadata.properties  (1 usage found)
>                 MetaPropertiesEnsemble.java  (1 usage found)
>                     toString()  (1 usage found)
>                         610 logDirProps.entrySet().forEach(
>         kafka.metadata.test  (1 usage found)
>             org.apache.kafka.controller  (1 usage found)
>                 ReplicationControlManagerTest.java  (1 usage found)
>                     createTestTopic(String, int[][], Map, 
> short)  (1 usage found)
>                         307 configs.entrySet().forEach(e -> 
> topic.configs().add(
>         kafka.streams.main  (1 usage found)
>             org.apache.kafka.streams.processor.internals  (1 usage found)
>                 StreamsMetadataState.java  (1 usage found)
>                     onChange(Map>, 
> Map>, Map)  (1 
> usage found)
>                         317 topicPartitionInfo.entrySet().forEach(entry -> 
> this.partitionsByTopic
>         kafka.tools.main  (1 usage found)
>             org.apache.kafka.tools  (1 usage found)
>                 LeaderElectionCommand.java  (1 usage found)
>                     electLeaders(Admin, ElectionType, 
> 
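The refactor the ticket tracks is mechanical; in sketch form, both variants below produce the same result, but the second passes key and value to the lambda directly instead of unpacking each entry by hand:

```java
import java.util.Map;

public class MapForEachRefactor {
    // Before: iterate the entry set and unpack each entry manually.
    static String viaEntrySet(Map<String, Integer> m) {
        StringBuilder sb = new StringBuilder();
        m.entrySet().forEach(e -> sb.append(e.getKey()).append('=').append(e.getValue()));
        return sb.toString();
    }

    // After: Map#forEach hands key and value to the lambda directly.
    static String viaMapForEach(Map<String, Integer> m) {
        StringBuilder sb = new StringBuilder();
        m.forEach((k, v) -> sb.append(k).append('=').append(v));
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> m = Map.of("retries", 3);
        if (!viaEntrySet(m).equals(viaMapForEach(m))) throw new AssertionError();
        System.out.println(viaMapForEach(m)); // prints retries=3
    }
}
```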

[jira] [Created] (KAFKA-16617) Add KRaft info for the `advertised.listeners` doc description

2024-04-24 Thread Luke Chen (Jira)
Luke Chen created KAFKA-16617:
-

 Summary: Add KRaft info for the `advertised.listeners` doc 
description
 Key: KAFKA-16617
 URL: https://issues.apache.org/jira/browse/KAFKA-16617
 Project: Kafka
  Issue Type: Improvement
Reporter: Luke Chen


Currently, the `advertised.listeners` doc description only covers the 
ZooKeeper handler:

> Listeners to publish to ZooKeeper for clients to use, if different than the 
> listeners config property.

We should also add the KRaft handler info to the doc.

ref: https://kafka.apache.org/documentation/#brokerconfigs_advertised.listeners
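For context, an illustrative broker config snippet (host names and ports are made up); the same property applies in both modes, but in KRaft mode the endpoints are published via broker registration with the controller quorum rather than via ZooKeeper:

```properties
# Illustrative values only -- see the official broker config reference.
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://broker1.example.com:9092
```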



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] New committer: Igor Soarez

2024-04-24 Thread Luke Chen
Congrats, Igor!

On Thu, Apr 25, 2024 at 6:10 AM Matthias J. Sax  wrote:

> Congrats!
>
> On 4/24/24 2:29 PM, Bill Bejeck wrote:
> > Congrats Igor!
> >
> > -Bill
> >
> > On Wed, Apr 24, 2024 at 2:37 PM Tom Bentley  wrote:
> >
> >> Congratulations Igor!
> >>
> >> On Thu, 25 Apr 2024 at 6:27 AM, Chia-Ping Tsai 
> wrote:
> >>
> >>> Congratulations, Igor! You are one of the best Kafka developers!!!
> >>>
> >>> Mickael Maison  wrote on Thursday, April 25, 2024 at 2:16 AM:
> >>>
>  Congratulations Igor!
> 
>  On Wed, Apr 24, 2024 at 8:06 PM Colin McCabe 
> >> wrote:
> >
> > Hi all,
> >
> > The PMC of Apache Kafka is pleased to announce a new Kafka committer,
>  Igor Soarez.
> >
> > Igor has been a Kafka contributor since 2019. In addition to being a
>  regular contributor and reviewer, he has made significant
> contributions
> >>> to
>  improving Kafka's JBOD support in KRaft mode. He has also contributed
> >> to
>  discussing and reviewing many KIPs such as KIP-690, KIP-554, KIP-866,
> >> and
>  KIP-938.
> >
> > Congratulations, Igor!
> >
> > Thanks,
> >
> > Colin (on behalf of the Apache Kafka PMC)
> 
> >>>
> >>
> >
>


Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-24 Thread Sophie Blee-Goldman
Now to respond to Matthias:

FYI, I'm following the numbering scheme from your email but added * to
mark responses with further questions or feedback and/or that aren't yet
addressed in the KIP and need to be followed up on. You can more or less
just skip over the ones without stars to save time.

100: I think this is leftover from a previous approach we considered.
Removed this line

101: I agree. Perhaps we hadn't fully committed to this decision when the
KIP was first written ;P Added a "Consumer Assignments" section under
"Public Changes" to address this more carefully and explicitly

102: fixed

103: fixed

104: I do agree with Bruno on the structure of this callback name, ie
that it should start with "on", but any of the suggestions with that sound
good to me. I really don't feel too strongly but just to throw in a bit of
extra context, there is an analogous callback on the
ConsumerPartitionAssignor that is called #onAssignment. So personally I
would slightly prefer to just call it #onAssignment. However I'm happy to
go with whatever the consensus is -- @Bruno/@Matthias WDYT?

105: done (fyi I did leave the config doc string which is technically a
private variable but is part of the public contract)

106: good point about the numKafkaStreamsClients and toString methods --
removed those

107: I guess this reflects how long it's been since the KIP was first
written :P But that's a fair point -- and yes, we should write this in a
forward looking manner. However, we can't build it in a way that's so
forward-looking that it doesn't work for the current version. Are you
proposing to remove this API altogether, or just rename it to
#numConsumerClients (or something like that) and update the javadocs
accordingly?  Assuming the latter, I totally agree, and have made the
change. But we definitely can't just remove it altogether (it may even be
relevant in the new threading model if we eventually allow configuring the
number of consumer clients independently of the processing threads -- but
that's a different conversation. The important thing is that this KIP be
compatible with the old/current threading model, in which case we need this
API).
Anyways, please take a look at the new javadocs and method name and lmk if
that makes sense to you

108: The #followupRebalanceDeadline allows the custom assignor to
request followup rebalances, for example in order to probe for restoration
progress or other conditions for task assignment. This is fundamental to
the HighAvailabilityTaskAssignor (ie the default assignor) and may be
useful to custom assignors with similar such approaches. So it's definitely
necessary -- are you asking why we have it at all, or why it's an API on
the KafkaStreamsAssignment class and not, say, the TaskAssignment class?

109: Makes sense to me -- done

110: ack -- updated all mentions of "node" to "KafkaStreams client"

111: ack -- updated name from "consumer" to "consumerClientId"

112: that's fair -- I do think it's valuable to have "#allTasks" since some
assignors may not care about the stateful vs stateless distinction, but
it's weird to have #statefulTasks without #statelessTasks. Let's just have
all three. Added to the KIP

113: this makes sense to me. Updated computeTaskLags  to be an input
parameter instead of a mutating API. Also noted that it can throw a
TimeoutException in this case (this is relevant for point 115 below)

114: fixed

115: Reasonable question -- I think the way it's described right now is
a bit awkward, and there's a better way to approach the issue. Ultimately
the "issue" is how we can handle failures in things like the task lag
computation, which notably makes a remote call to the brokers and can/has
been known to fail at times. Right now if this API fails, the
StreamsPartitionAssignor will just return the same assignment as the
previous one and trigger an immediate followup rebalance. This exception
was meant to be a "utility" that can be thrown to indicate to the
StreamsPartitionAssignor to just return the old assignment and trigger an
immediate followup.
That said, this exception does feel like an awkward way to do it,
especially since the TaskAssignor can already do all of this via native
APIs: it can request a followup rebalance (and better yet, determine for
itself what a reasonable retry/backoff interval would be). It can also just
return the same assignment -- the only issue is that implementing this is
kind of annoying. So I would propose that instead of throwing a
RetryableException, we should just add an additional utility method to the
TaskAssignmentUtils that does this "fallback" assignment and returns the
same tasks to their previous clients.
Added TaskAssignmentUtils#identityAssignment for this (though I'm happy to
take other name suggestions)

That's a long paragraph, sorry -- but hopefully it makes sense?
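A rough sketch of what such an identity/fallback utility could compute -- the types below are simplified stand-ins, not the KIP-924 classes:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class IdentityAssignmentDemo {
    // Hypothetical utility: maps each client id to the task ids it owned in
    // the previous assignment and simply hands the same tasks back, which is
    // the "fallback" assignment described above.
    static Map<String, Set<Integer>> identityAssignment(Map<String, Set<Integer>> previous) {
        return new HashMap<>(previous);
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> prev =
                Map.of("client-1", Set.of(0, 1), "client-2", Set.of(2));
        if (!identityAssignment(prev).equals(prev)) throw new AssertionError();
        System.out.println("ok");
    }
}
```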

116: fair point. Updated the wording -- hopefully it makes more sense now.

117: This is an interesting question. I think it's 

RE:DISCUSS KIP-984 Add pluggable compression interface to Kafka

2024-04-24 Thread Diop, Assane
Hi, 

I would like to bring attention back to 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-984%3A+Add+pluggable+compression+interface+to+Kafka
I have made significant changes to the design to accommodate the concerns and 
would like some feedback from the community to re-engage the discussion. 

Assane  

-Original Message-
From: Diop, Assane 
Sent: Friday, March 1, 2024 4:45 PM
To: dev@kafka.apache.org
Subject: RE: DISCUSS KIP-984 Add pluggable compression interface to Kafka

Hi Luke, 

The proposal doesn't preclude supporting multiple clients but each client would 
need an implementation of the pluggable architecture.
At the very least we envision other clients such as librdkafka and kafka-python 
could be supported by C implementations. 

We agree with community feedback regarding the need to support these clients, 
and we are looking at alternative approaches for brokers and clients to 
coordinate the plugin. 

One way to do this coordination is each client should have a configuration 
mapping of the plugin name to its implementation.
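As an illustration of such a name-to-implementation mapping -- the `PluginCodec` interface and registry below are hypothetical sketches, not part of KIP-984 or any client:

```java
import java.util.Map;
import java.util.function.Supplier;

public class CodecRegistry {
    // Hypothetical plugin interface; a real codec would also expose decompress.
    interface PluginCodec {
        byte[] compress(byte[] input);
    }

    // Client-side configuration mapping of plugin name -> implementation.
    private static final Map<String, Supplier<PluginCodec>> CODECS =
            Map.of("identity", () -> input -> input);

    static PluginCodec forName(String name) {
        Supplier<PluginCodec> s = CODECS.get(name);
        if (s == null) throw new IllegalArgumentException("Unknown codec: " + name);
        return s.get();
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        if (forName("identity").compress(data) != data) throw new AssertionError();
        System.out.println("ok");
    }
}
```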

Assane 






-Original Message-
From: Luke Chen 
Sent: Monday, February 26, 2024 7:50 PM
To: dev@kafka.apache.org
Subject: Re: DISCUSS KIP-984 Add pluggable compression interface to Kafka

Hi Assane,

I also share the same concern as Greg has, which is that the KIP is not kafka 
ecosystem friendly.
And this will tightly couple the Kafka client and broker: once you use the 
pluggable compression interface, the producer must be a Java client.
This seems to go against the original Kafka's design.

If the proposal can support all kinds of clients, that would be great.

Thanks.
Luke

On Tue, Feb 27, 2024 at 7:44 AM Diop, Assane  wrote:

> Hi Greg,
>
> Thanks for taking the time to give some feedback. It was very insightful.
>
> I have some answers:
>
> 1. The current proposal is Java centric. We want to figure it out with 
> Java first and then later incorporate other languages. We will get there.
>
> 2. The question of where the plugins would live is an important one. I 
> would like to get the community engagement on where a plugin would live.
>Officially supported plugins could be part of Kafka and others 
> could live in a plugin repository. Is there currently a way to store 
> plugins in Kafka and load them into the classpath? If such a space 
> could be allowed then it would provide a standard way of installing 
> officially supported plugins.
>In OpenSearch for example, there is a plugin utility that takes the 
> jar and installs it across the cluster; privileges can be granted by an admin.
> Such utility could be implemented in Kafka.
>
> 3. There are many ways to look at this; we could change the message 
> format that uses the pluggable interface to be, for example, v3 and 
> synchronize against that.
>In order to use the pluggable codec, you will have to be at message 
> version 3 for example.
>
> 4. Passing the class name as metadata is one way to have the producer 
> talk to the broker about which plugin to use. However, there could be 
> other implementations where everything needed about the topic is set via 
> topic-level compression. In this case, for example, a rule could be that 
> in order to use the pluggable interface, you should use topic-level 
> compression.
>
>  I would like to have your valuable inputs on this!!
>
> Thanks in advance,
> Assane
>
> -Original Message-
> From: Greg Harris 
> Sent: Wednesday, February 14, 2024 2:36 PM
> To: dev@kafka.apache.org
> Subject: Re: DISCUSS KIP-984 Add pluggable compression interface to 
> Kafka
>
> Hi Assane,
>
> Thanks for the KIP!
> Looking back, it appears that the project has only ever added 
> compression types twice: lz4 in 2014 and zstd in 2018, and perhaps 
> Kafka has fallen behind the state-of-the-art compression algorithms.
> Thanks for working to fix that!
>
> I do have some concerns:
>
> 1. I think this is a very "java centric" proposal, and doesn't take 
> non-java clients into enough consideration. librdkafka [1] is a great 
> example of an implementation of the Kafka protocol which doesn't have 
> the same classloading and plugin infrastructure that Java has, which 
> would make implementing this feature much more difficult.
>
> 2. By making the interface pluggable, it puts the burden of 
> maintaining individual compression codecs onto external developers, 
> which may not be willing to maintain a codec for the service-lifetime of such 
> a codec.
> An individual developer can easily implement a plugin to allow them to 
> use a cutting-edge compression algorithm without consulting the Kafka 
> project, but as soon as data is compressed using that algorithm, they 
> are on the hook to support that plugin going forward by the 
> organizations using their implementation.
> Part of the collective benefits of the Kafka project is to ensure the 
> ongoing maintenance of such codecs, and provide a long deprecation 
> window when a 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.7 #146

2024-04-24 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-16616) refactor mergeWith in MetadataSnapshot

2024-04-24 Thread Alyssa Huang (Jira)
Alyssa Huang created KAFKA-16616:


 Summary: refactor mergeWith in MetadataSnapshot
 Key: KAFKA-16616
 URL: https://issues.apache.org/jira/browse/KAFKA-16616
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 3.7.0
Reporter: Alyssa Huang


Right now we keep track of the topic ids and partition metadata to add/update 
separately in mergeWith (i.e. two maps passed as arguments). This means we 
iterate over topic metadata twice, which could be costly when we're dealing with 
a large number of updates. 
`updatePartitionLeadership`, which calls `mergeWith`, does something similar 
(generates the map of topic ids to update in a loop separate from the list of 
partition metadata to update) and should be refactored as well.
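For illustration, a single-pass merge over one combined per-topic structure could look like this -- `TopicUpdate` and all names are hypothetical, not Kafka's actual types:

```java
import java.util.HashMap;
import java.util.Map;

public class SinglePassMerge {
    // One record carries both pieces of per-topic state, so callers pass a
    // single map instead of separate topic-id and partition-metadata maps.
    record TopicUpdate(String topicId, int partitionCount) {}

    // The merge then iterates the topic updates only once.
    static Map<String, TopicUpdate> merge(Map<String, TopicUpdate> base,
                                          Map<String, TopicUpdate> updates) {
        Map<String, TopicUpdate> merged = new HashMap<>(base);
        merged.putAll(updates);
        return merged;
    }

    public static void main(String[] args) {
        var base = Map.of("a", new TopicUpdate("id-a", 1));
        var updates = Map.of("a", new TopicUpdate("id-a", 3));
        if (merge(base, updates).get("a").partitionCount() != 3) throw new AssertionError();
        System.out.println("ok");
    }
}
```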



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] New committer: Igor Soarez

2024-04-24 Thread Matthias J. Sax

Congrats!

On 4/24/24 2:29 PM, Bill Bejeck wrote:

Congrats Igor!

-Bill

On Wed, Apr 24, 2024 at 2:37 PM Tom Bentley  wrote:


Congratulations Igor!

On Thu, 25 Apr 2024 at 6:27 AM, Chia-Ping Tsai  wrote:


Congratulations, Igor! You are one of the best Kafka developers!!!

Mickael Maison  wrote on Thursday, April 25, 2024 at 2:16 AM:


Congratulations Igor!

On Wed, Apr 24, 2024 at 8:06 PM Colin McCabe 

wrote:


Hi all,

The PMC of Apache Kafka is pleased to announce a new Kafka committer,

Igor Soarez.


Igor has been a Kafka contributor since 2019. In addition to being a

regular contributor and reviewer, he has made significant contributions

to

improving Kafka's JBOD support in KRaft mode. He has also contributed

to

discussing and reviewing many KIPs such as KIP-690, KIP-554, KIP-866,

and

KIP-938.


Congratulations, Igor!

Thanks,

Colin (on behalf of the Apache Kafka PMC)










Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-24 Thread Sophie Blee-Goldman
Responding to Bruno first:

(1) I actually think "KafkaStreams" is exactly right here -- for the reason
you said, ultimately this is describing a literal instance of the
"KafkaStreams" class. Glad we hashed this out! (I saw that Rohan went with
StreamsClient but I also prefer KafkaStreams)

(4) Rohan is right about what I was saying -- but I'm now realizing that I
completely misinterpreted what your concern was. Sorry for the long-winded
and ultimately irrelevant answer. I'm completely fine with having the
return type be a simple Set with additional info such as TaskId in the
AssignedTask class (and I see Rohan already made this change so we're all
good)

(5) I don't insist either way :)   ApplicationState works for me

On Fri, Apr 19, 2024 at 9:37 PM Matthias J. Sax  wrote:

> One more thing. It might be good to clearly call out which interfaces a
> user would implement, vs. the ones Kafka Streams implements and the
> TaskAssignor only uses.
>
> My understanding is, that users would implement `TaskAssignor`,
> `TaskAssignment`, and `StreamsClientAssignment`.
>
> For `AssignedTask` it seems that users would actually only need to
> instantiate them. Should we add a public constructor?
>
> Also wondering if we should add an empty default implementation for
> `onAssignmentComputed()` as it seems not to be strictly necessary to use
> this method?
>
>
> -Matthias
>
> On 4/19/24 7:30 PM, Matthias J. Sax wrote:
> > Great KIP. I have some minor comments/questions:
> >
> >
> > 100 The KIP says: "In the future, additional plugins can use the same
> > partition.assignor  prefix". What does this mean?
> >
> >
> > 101 (nit) The KIP says: "Note that the thread-level assignment will
> > remain an un-configurable internal implementation detail of the
> > partition assignor (see "Rejected Alternatives" for further thoughts and
> > reasoning)." -- When I was reading this the first time, I did not
> > understand it, and it did only become clear later (eg while reading the
> > discussion thread). I think it would be good to be a little bit more
> > explicit, because this is not just some minor thing, but a core design
> > decision (which I, btw, support).
> >
> >
> > 102 (nit/typo): taskAssignor -> TaskAssignor (somewhere in the text).
> >
> >
> > 103 (nit): "new non-internal package" -> replace 'non-internal' with
> > 'public' :)
> >
> >
> > 104: Method name `TaskAssignor#onAssignmentComputed()` -> the name seems
> > to be a little bit clumsy? I kinda like the original `finalAssignment()`
> > -- I would also be happy with `onFinalAssignment` to address Bruno's
> > line of thinking (which I think is a good call out). (Btw:
> > `finalAssignment` is still used in the text on the KIP and should also
> > be updated.)
> >
> >
> > 105: Please remove all `private` variables. We should only show public
> > stuff on the KIP. Everything else is an implementation detail.
> >
> >
> > 106: `TaskAssignment#numStreamsClients()` -- why do we need this method?
> > Seems calling `assignment()` gives us a collection and we can just call
> > size() on it to get the same value? -- Also, why do we explicitly call
> > out the overwrite of `toString()`; seems unnecessary?
> >
> >
> > 107 `StreamsClientState#numStreamThreads` JavaDocs says: "Returns the
> > number of StreamThreads on this client, which is equal to the number of
> > main consumers and represents its overall capacity." -- Given our
> > planned thread refactoring, this might not hold true for long (and I
> > am sure we will forget to update the JavaDocs later). Talking to Lucas
> > the plan is to cut down `StreamsThread` to host the consumer (and there
> > will be only one, and it won't be configurable any longer), and we would
> > introduce a number of configurable "processing threads". Can/should we
> > build this API in a forward looking manner?
> >
> >
> > 108: Why do we need
> > `StreamsClientAssignment#followupRebalanceDeadline()` -- wondering how
> > this would be useful?
> >
> >
> > 109 `StreamsClientState#consumers`: should we rename this to
> > `#consumerClientIds()`?
> >
> >
> > 110 (nit) `StreamsClientState#previousl[Active|Standby]Tasks`: JavaDoc
> > says 'owned by consumers on this node' -- Should we just say `owned by
> > the Streams client`?
> >
> >
> > 111 `StreamsClientState#prevTasksByLag()`: it takes a `String consumer`
> > parameter -- not clear what this is -- I guess it's a consumer's
> > client.id? If yes, should we rename the parameter `consumerClientId`?
> >
> >
> > 112 `ApplicationState`: what is the reason to have `allTasks()` and
> > `statefulTasks()` -- why not have `statelessTasks()` and
> > `statefulTasks()` instead? Or all three?
> >
> >
> > 113 `ApplicationState#computeTaskLags()`: I understand the intent/reason
> > why we have this one, but it seems to be somewhat difficult to use
> > correctly, as it triggers an internal side-effect... Would it be
> > possible to replace this method in favor of passing in a `boolean
> > computeTaskLag` 
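
As an aside for readers following this thread, here is a tiny self-contained sketch of the client-level assignment shape KIP-924 discusses: a pluggable assignor balances tasks across Streams client instances, while thread-level placement stays an internal framework detail. All class and method names below are illustrative stand-ins, not the actual KIP-924 interfaces.

```java
import java.util.*;

// Illustrative sketch only: KIP-924's real interfaces differ. This shows
// the *shape* under discussion -- a pluggable assignor distributes tasks
// across Streams client instances and returns a per-client assignment;
// mapping tasks to threads stays an internal framework detail.
public class RoundRobinClientAssignor {

    // Assign each task to a client in round-robin order.
    static Map<String, Set<String>> assign(List<String> clients, List<String> tasks) {
        Map<String, Set<String>> assignment = new LinkedHashMap<>();
        for (String client : clients) assignment.put(client, new TreeSet<>());
        for (int i = 0; i < tasks.size(); i++) {
            assignment.get(clients.get(i % clients.size())).add(tasks.get(i));
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Five tasks spread over two clients.
        System.out.println(assign(
                List.of("clientA", "clientB"),
                List.of("0_0", "0_1", "1_0", "1_1", "1_2")));
    }
}
```

A real assignor would of course consider lags, standbys, and rack awareness rather than a plain round-robin; this only shows where the plugin boundary sits.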

Re: [ANNOUNCE] New committer: Igor Soarez

2024-04-24 Thread Bill Bejeck
Congrats Igor!

-Bill

On Wed, Apr 24, 2024 at 2:37 PM Tom Bentley  wrote:

> Congratulations Igor!
>
> On Thu, 25 Apr 2024 at 6:27 AM, Chia-Ping Tsai  wrote:
>
> > Congratulations, Igor! you are one of the best Kafka developers!!!
> >
> > Mickael Maison  wrote on Thursday, April 25, 2024 at 2:16 AM:
> >
> > > Congratulations Igor!
> > >
> > > On Wed, Apr 24, 2024 at 8:06 PM Colin McCabe 
> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > The PMC of Apache Kafka is pleased to announce a new Kafka committer,
> > > Igor Soarez.
> > > >
> > > > Igor has been a Kafka contributor since 2019. In addition to being a
> > > regular contributor and reviewer, he has made significant contributions
> > to
> > > improving Kafka's JBOD support in KRaft mode. He has also contributed
> to
> > > discussing and reviewing many KIPs such as KIP-690, KIP-554, KIP-866,
> and
> > > KIP-938.
> > > >
> > > > Congratulations, Igor!
> > > >
> > > > Thanks,
> > > >
> > > > Colin (on behalf of the Apache Kafka PMC)
> > >
> >
>


Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2842

2024-04-24 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 352470 lines...]
[2024-04-24T20:25:36.475Z] > Task :connect:json:testSrcJar
[2024-04-24T20:25:36.475Z] > Task 
:connect:json:publishMavenJavaPublicationToMavenLocal
[2024-04-24T20:25:36.475Z] > Task :connect:json:publishToMavenLocal
[2024-04-24T20:25:36.475Z] > Task :storage:storage-api:compileTestJava
[2024-04-24T20:25:36.475Z] > Task :storage:storage-api:testClasses
[2024-04-24T20:25:36.475Z] > Task :server:compileTestJava
[2024-04-24T20:25:36.475Z] > Task :server:testClasses
[2024-04-24T20:25:36.475Z] > Task :server-common:compileTestJava
[2024-04-24T20:25:36.475Z] > Task :server-common:testClasses
[2024-04-24T20:25:38.731Z] > Task :raft:compileTestJava
[2024-04-24T20:25:38.731Z] > Task :raft:testClasses
[2024-04-24T20:25:40.285Z] > Task :core:compileScala
[2024-04-24T20:25:41.643Z] > Task 
:streams:generateMetadataFileForMavenJavaPublication
[2024-04-24T20:25:43.186Z] > Task :group-coordinator:compileTestJava
[2024-04-24T20:25:43.186Z] > Task :group-coordinator:testClasses
[2024-04-24T20:25:43.186Z] 
[2024-04-24T20:25:43.186Z] > Task :clients:javadoc
[2024-04-24T20:25:43.186Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/clients/src/main/java/org/apache/kafka/clients/admin/ScramMechanism.java:32:
 warning - Tag @see: missing final '>': "https://cwiki.apache.org/confluence/display/KAFKA/KIP-554%3A+Add+Broker-side+SCRAM+Config+API;>KIP-554:
 Add Broker-side SCRAM Config API
[2024-04-24T20:25:43.186Z] 
[2024-04-24T20:25:43.186Z]  This code is duplicated in 
org.apache.kafka.common.security.scram.internals.ScramMechanism.
[2024-04-24T20:25:43.186Z]  The type field in both files must match and must 
not change. The type field
[2024-04-24T20:25:43.186Z]  is used both for passing ScramCredentialUpsertion 
and for the internal
[2024-04-24T20:25:43.186Z]  UserScramCredentialRecord. Do not change the type 
field."
[2024-04-24T20:25:44.417Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/clients/src/main/java/org/apache/kafka/common/security/oauthbearer/secured/package-info.java:21:
 warning - Tag @link: reference not found: 
org.apache.kafka.common.security.oauthbearer
[2024-04-24T20:25:44.417Z] 2 warnings
[2024-04-24T20:25:45.830Z] 
[2024-04-24T20:25:45.830Z] > Task :clients:javadocJar
[2024-04-24T20:25:45.830Z] > Task :clients:srcJar
[2024-04-24T20:25:45.830Z] > Task :metadata:compileTestJava
[2024-04-24T20:25:45.830Z] > Task :metadata:testClasses
[2024-04-24T20:25:47.406Z] > Task :clients:testJar
[2024-04-24T20:25:47.406Z] > Task :clients:testSrcJar
[2024-04-24T20:25:47.406Z] > Task 
:clients:publishMavenJavaPublicationToMavenLocal
[2024-04-24T20:25:47.406Z] > Task :clients:publishToMavenLocal
[2024-04-24T20:25:47.406Z] > Task 
:connect:api:generateMetadataFileForMavenJavaPublication
[2024-04-24T20:25:47.406Z] > Task :connect:api:compileTestJava UP-TO-DATE
[2024-04-24T20:25:47.406Z] > Task :connect:api:testClasses UP-TO-DATE
[2024-04-24T20:25:47.406Z] > Task :connect:api:testJar
[2024-04-24T20:25:47.406Z] > Task :connect:api:testSrcJar
[2024-04-24T20:25:47.406Z] > Task 
:connect:api:publishMavenJavaPublicationToMavenLocal
[2024-04-24T20:25:47.406Z] > Task :connect:api:publishToMavenLocal
[2024-04-24T20:25:53.044Z] > Task :streams:javadoc
[2024-04-24T20:25:53.044Z] > Task :streams:javadocJar
[2024-04-24T20:25:54.827Z] > Task :streams:srcJar
[2024-04-24T20:25:54.827Z] > Task :streams:processTestResources UP-TO-DATE
[2024-04-24T20:26:39.386Z] > Task :core:classes
[2024-04-24T20:26:39.386Z] > Task :core:compileTestJava NO-SOURCE
[2024-04-24T20:27:11.105Z] > Task :core:compileTestScala
[2024-04-24T20:28:12.582Z] > Task :core:testClasses
[2024-04-24T20:28:37.939Z] > Task :streams:compileTestJava
[2024-04-24T20:30:10.184Z] > Task :streams:testClasses
[2024-04-24T20:30:10.184Z] > Task :streams:testJar
[2024-04-24T20:30:10.184Z] > Task :streams:testSrcJar
[2024-04-24T20:30:10.184Z] > Task 
:streams:publishMavenJavaPublicationToMavenLocal
[2024-04-24T20:30:10.184Z] > Task :streams:publishToMavenLocal
[2024-04-24T20:30:10.184Z] 
[2024-04-24T20:30:10.184Z] Deprecated Gradle features were used in this build, 
making it incompatible with Gradle 9.0.
[2024-04-24T20:30:10.184Z] 
[2024-04-24T20:30:10.184Z] You can use '--warning-mode all' to show the 
individual deprecation warnings and determine if they come from your own 
scripts or plugins.
[2024-04-24T20:30:10.184Z] 
[2024-04-24T20:30:10.184Z] For more on this, please refer to 
https://docs.gradle.org/8.7/userguide/command_line_interface.html#sec:command_line_warnings
 in the Gradle documentation.
[2024-04-24T20:30:10.184Z] 
[2024-04-24T20:30:10.184Z] BUILD SUCCESSFUL in 5m 20s
[2024-04-24T20:30:10.184Z] 96 actionable tasks: 41 executed, 55 up-to-date
[2024-04-24T20:30:10.184Z] 
[2024-04-24T20:30:10.184Z] Publishing build scan...
[2024-04-24T20:30:10.184Z] https://ge.apache.org/s/3fratvsyx36j4

Re: [DISCUSS] KIP-1023: Follower fetch from tiered offset

2024-04-24 Thread Jun Rao
Hi, Abhijeet,

Thanks for the updated KIP. It looks good to me.

Jun

On Mon, Apr 22, 2024 at 12:08 PM Abhijeet Kumar 
wrote:

> Hi Jun,
>
> Please find my comments inline.
>
>
> On Thu, Apr 18, 2024 at 3:26 AM Jun Rao  wrote:
>
> > Hi, Abhijeet,
> >
> > Thanks for the reply.
> >
> > 1. I am wondering if we could achieve the same result by just lowering
> > local.retention.ms and local.retention.bytes. This also allows the newly
> > started follower to build up the local data before serving the consumer
> > traffic.
> >
>
> I am not sure I fully followed this. Do you mean we could lower the
> local.retention (by size and time)
> so that there is little data on the leader's local storage and the
> follower can quickly catch up with the leader?
>
> In that case, we will need to set small local retention across brokers in
> the cluster. It will have the undesired
> effect where there will be increased remote log fetches for serving consume
> requests, and this can cause
> degradations. Also, this behaviour (of increased remote fetches) will
> happen on all brokers at all times, whereas in
> the KIP we are restricting the behavior only to the newly bootstrapped
> brokers and only until the time it fully builds
> the local logs as per retention defined at the cluster level.
> (Deprioritization of the broker could help reduce the impact
>  even further)
>
>
> >
> > 2. Have you updated the KIP?
> >
>
> The KIP has been updated now.
>
>
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Apr 9, 2024 at 3:36 AM Satish Duggana 
> > wrote:
> >
> > > +1 to Jun for adding the consumer fetching from a follower scenario
> > > also to the existing section that talked about the drawback when a
> > > node built with last-tiered-offset has become a leader. As Abhijeet
> > > mentioned, we plan to have a follow-up KIP that will address these by
> > > having a deprioritization of these brokers. The deprioritization of
> > > those brokers can be removed once they catch up to the local log
> > > retention.
> > >
> > > Thanks,
> > > Satish.
> > >
> > > On Tue, 9 Apr 2024 at 14:08, Luke Chen  wrote:
> > > >
> > > > Hi Abhijeet,
> > > >
> > > > Thanks for the KIP to improve the tiered storage feature!
> > > >
> > > > Questions:
> > > > 1. We could also get the "pending-upload-offset" and epoch via remote
> > log
> > > > metadata, instead of adding a new API to fetch from the leader. Could
> > you
> > > > explain why you chose the latter approach, instead of the former?
> > > > 2.
> > > > > We plan to have a follow-up KIP that will address both the
> > > > deprioritization
> > > > of these brokers from leadership, as well as
> > > > for consumption (when fetching from followers is allowed).
> > > >
> > > > I agree with Jun that we might need to make clear all possible
> > > drawbacks
> > > > that this could have. So, could we add the drawbacks that Jun mentioned
> > about
> > > > the performance issue when consumers fetch from followers?
> > > >
> > > > 3. Could we add "Rejected Alternatives" section to the end of the KIP
> > to
> > > > add some of them?
> > > > Like the "ListOffsetRequest" approach VS
> > "Earliest-Pending-Upload-Offset"
> > > > approach, or getting the "Earliest-Pending-Upload-Offset" from remote
> > log
> > > > metadata... etc.
> > > >
> > > > Thanks.
> > > > Luke
> > > >
> > > >
> > > > On Tue, Apr 9, 2024 at 2:25 PM Abhijeet Kumar <
> > > abhijeet.cse@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Christo,
> > > > >
> > > > > Please find my comments inline.
> > > > >
> > > > > On Fri, Apr 5, 2024 at 12:36 PM Christo Lolov <
> > christolo...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hello Abhijeet and Jun,
> > > > > >
> > > > > > I have been mulling this KIP over a bit more in recent days!
> > > > > >
> > > > > > re: Jun
> > > > > >
> > > > > > I wasn't aware we apply 2.1 and 2.2 for reserving new timestamps
> -
> > in
> > > > > > retrospect it should have been fairly obvious. I would need to go
> > and
> > > > > update
> > > > > > KIP-1005 myself then, thank you for giving the useful reference!
> > > > > >
> > > > > > 4. I think Abhijeet wants to rebuild state from
> > latest-tiered-offset
> > > and
> > > > > > fetch from latest-tiered-offset + 1 only for new replicas (or
> > > replicas
> > > > > > which experienced a disk failure) to decrease the time a
> partition
> > > spends
> > > > > > in under-replicated state. In other words, a follower which has
> > just
> > > > > fallen
> > > > > > out of ISR, but has local data will continue using today's Tiered
> > > Storage
> > > > > > replication protocol (i.e. fetching from earliest-local). I
> further
> > > > > believe
> > > > > > he has taken this approach so that local state of replicas which
> > have
> > > > > just
> > > > > > fallen out of ISR isn't forcefully wiped thus leading to
> situation
> > 1.
> > > > > > Abhijeet, have I understood (and summarised) what you are
> proposing
> > > > > > correctly?
> > > > > >
> > > > > > Yes, your understanding is 

[jira] [Created] (KAFKA-16615) JoinGroup API for upgrading ConsumerGroup

2024-04-24 Thread Dongnuo Lyu (Jira)
Dongnuo Lyu created KAFKA-16615:
---

 Summary: JoinGroup API for upgrading ConsumerGroup
 Key: KAFKA-16615
 URL: https://issues.apache.org/jira/browse/KAFKA-16615
 Project: Kafka
  Issue Type: Sub-task
Reporter: Dongnuo Lyu






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-877: Mechanism for plugins and connectors to register metrics

2024-04-24 Thread Mickael Maison
Hi Matthias,

I'm not sure making the Monitorable interface Closeable really solves the issue.
Ultimately you need to understand the lifecycle of a plugin to
determine when it makes sense to close it and which part of the code is
responsible for doing it. I'd rather have this described properly in
the interface of the plugin itself than it being a side effect of
implementing Monitorable.

Looking at Streams, as far as I can tell the only pluggable interfaces
that are Closeable today are the Serdes. It seems Streams can accept
Serdes instances created by the user [0]. In that case, I think it's
probably best to ignore Streams in this KIP. Nothing should prevent
Streams from adopting it, in a way that makes sense for Streams, in a
future KIP if needed.

0: 
https://github.com/apache/kafka/blob/trunk/streams/examples/src/main/java/org/apache/kafka/streams/examples/wordcount/WordCountDemo.java#L84

Thanks,
Mickael





On Fri, Feb 9, 2024 at 1:15 AM Greg Harris  wrote:
>
> Hi Mickael,
>
> Thanks for the KIP, this looks like a great change!
>
> 1. I see that one of my concerns was already discussed, and appears to
> have been concluded with:
>
> > I considered Chris' idea of automatically removing metrics but decided to 
> > leave that responsibility to the plugins.
>
> After chasing resource leaks for the last few years, I've internalized
> that preventing leaks through careful implementation is always
> inadequate, and that leaks need to be prevented by design.
> If a leak is possible in a design, then we should count on it
> happening somewhere as a certainty, and should be prepared for the
> behavior afterwards.
>
> Chris already brought up one of the negative behaviors: Connect
> plugins which are cancelled may keep their metrics open past the point
> that a replacement plugin is instantiated.
> This will have the effect of showing incorrect metrics, which is
> harmful and confusing for operators.
> If you are constantly skeptical of the accuracy of your metrics, and
> there is no "source of truth" to verify against, then what use are the
> metrics?
>
> I think that managing the lifecycle of the PluginMetrics on the
> framework side would be acceptable if we had an internal class like
> the following, to keep a reference to the metrics adjacent to the
> plugin:
> class MonitorablePlugin<T> implements Supplier<T>, Closeable {
>     MonitorablePlugin(T plugin, PluginMetrics metrics);
> }
> I already believe that we need similar wrapper classes in Connect [1]
> to manage classloader swapping & exception safety, and this simpler
> interface could be applied to non-connect call-sites that don't need
> to swap the classloader.
>
> 2. Your "MyInterceptor" class doesn't have a "metrics" field, and
> doesn't perform a null-check on the field in close().
> Keeping the PluginMetrics as an non-final instance variable in every
> plugin implementation is another burden on the plugin implementations,
> as they will need to perform null checks in-case the metrics are never
> initialized, such as in a test environment.
> It also shows that without the Closeable interface, plugins may not
> need the PluginMetrics object after the initial setup if they only
> have a fixed set of sensors that need to be made instance fields.
>
> 3. I realized when writing the above that this implicitly sets a
> minimum framework version for plugins, as the Monitorable interface
> must exist in order to be able to load the plugin classes. This is a
> test which confirms that behavior: [2] [3]
> There's no way for plugins to at-runtime disable plugin metrics when
> put in an environment without support for them, which was one of the
> features which was incorporated in KIP-610: [4] which meaningfully
> changes the control flow of the connector.
>
> 4. For plugins which themselves use AbstractConfig (nearly all of
> them) and use the getConfiguredInstance methods to instantiate plugins
> of their own (e.g. MirrorMaker2 [5]), could these plugins also use
> PluginMetrics from the framework, with a small change to the
> signatures or class hierarchy?
>
> Thanks so much!
> Greg
>
> [1] https://issues.apache.org/jira/browse/KAFKA-14670
> [2] 
> https://github.com/apache/kafka/blob/116bc000c8c6533552321fe1d395629c2aa00bd9/connect/runtime/src/test/resources/test-plugins/bad-packaging/test/plugins/MissingSuperclassConverter.java#L28-L30
> [3] 
> https://github.com/apache/kafka/blob/116bc000c8c6533552321fe1d395629c2aa00bd9/connect/runtime/src/test/java/org/apache/kafka/connect/runtime/isolation/PluginsTest.java#L233-L239
> [4] 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-610%3A+Error+Reporting+in+Sink+Connectors
> [5] 
> https://github.com/apache/kafka/blob/116bc000c8c6533552321fe1d395629c2aa00bd9/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorCheckpointConfig.java#L114
>
> On Thu, Feb 8, 2024 at 11:49 AM Matthias J. Sax  wrote:
> >
> > Still need to digest the KIP, but one thing coming to mind:
> >
> > Instead of requiring 
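
For readers skimming this thread, here is a self-contained sketch of the wrapper idea Greg floats above: tie the plugin and its metrics together on the framework side so that closing the wrapper reliably releases the metrics even if the plugin implementation forgets. `PluginMetrics` below is a stand-in interface, not the KIP-877 class, and all names are illustrative.

```java
import java.io.Closeable;
import java.util.function.Supplier;

// Illustrative sketch of the wrapper idea from this thread: the framework
// keeps a plugin and its metrics together so that closing the wrapper
// reliably releases the metrics, even if the plugin implementation forgets.
// PluginMetrics here is a stand-in interface, not the KIP-877 class.
public class MonitorablePluginSketch {

    interface PluginMetrics {
        void close(); // removes every metric registered by this plugin
    }

    static final class MonitorablePlugin<T> implements Supplier<T>, Closeable {
        private final T plugin;
        private final PluginMetrics metrics;

        MonitorablePlugin(T plugin, PluginMetrics metrics) {
            this.plugin = plugin;
            this.metrics = metrics;
        }

        @Override public T get() { return plugin; }

        @Override public void close() {
            metrics.close(); // metrics lifecycle owned by the framework, not the plugin
            if (plugin instanceof Closeable) {
                try { ((Closeable) plugin).close(); } catch (Exception e) { /* log and continue */ }
            }
        }
    }

    public static void main(String[] args) {
        boolean[] metricsClosed = {false};
        MonitorablePlugin<String> wrapped =
                new MonitorablePlugin<>("my-converter", () -> metricsClosed[0] = true);
        wrapped.close();
        System.out.println("metrics closed: " + metricsClosed[0]);
    }
}
```

The design point is that the leak prevention lives in one place (`close()` on the wrapper) rather than being re-implemented by every plugin author.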

[jira] [Created] (KAFKA-16614) Disallow `@ClusterTemplate("")`

2024-04-24 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-16614:
--

 Summary: Disallow `@ClusterTemplate("")`
 Key: KAFKA-16614
 URL: https://issues.apache.org/jira/browse/KAFKA-16614
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


`@ClusterTemplate` enables us to create dynamic configs; it expects the name of 
a method that creates server configs at runtime. It throws an error when we pass 
a nonexistent method name, but it silently accepts an empty name.





Re: [ANNOUNCE] New committer: Igor Soarez

2024-04-24 Thread Gaurav Narula
Congrats Igor!

> On 24 Apr 2024, at 19:37, Tom Bentley  wrote:
> 
> Congratulations Igor!
> 
>> On Thu, 25 Apr 2024 at 6:27 AM, Chia-Ping Tsai  wrote:
>> 
>> Congratulations, Igor! you are one of the best Kafka developers!!!
>> 
>> Mickael Maison  wrote on Thursday, April 25, 2024 at 2:16 AM:
>> 
>>> Congratulations Igor!
>>> 
>>> On Wed, Apr 24, 2024 at 8:06 PM Colin McCabe  wrote:
>>>>
>>>> Hi all,
>>>>
>>>> The PMC of Apache Kafka is pleased to announce a new Kafka committer,
>>>> Igor Soarez.
>>>>
>>>> Igor has been a Kafka contributor since 2019. In addition to being a
>>>> regular contributor and reviewer, he has made significant contributions to
>>>> improving Kafka's JBOD support in KRaft mode. He has also contributed to
>>>> discussing and reviewing many KIPs such as KIP-690, KIP-554, KIP-866, and
>>>> KIP-938.
>>>>
>>>> Congratulations, Igor!
>>>>
>>>> Thanks,
>>>>
>>>> Colin (on behalf of the Apache Kafka PMC)
>>> 
>> 


Re: [ANNOUNCE] New committer: Igor Soarez

2024-04-24 Thread Tom Bentley
Congratulations Igor!

On Thu, 25 Apr 2024 at 6:27 AM, Chia-Ping Tsai  wrote:

> Congratulations, Igor! you are one of the best Kafka developers!!!
>
> Mickael Maison  wrote on Thursday, April 25, 2024 at 2:16 AM:
>
> > Congratulations Igor!
> >
> > On Wed, Apr 24, 2024 at 8:06 PM Colin McCabe  wrote:
> > >
> > > Hi all,
> > >
> > > The PMC of Apache Kafka is pleased to announce a new Kafka committer,
> > Igor Soarez.
> > >
> > > Igor has been a Kafka contributor since 2019. In addition to being a
> > regular contributor and reviewer, he has made significant contributions
> to
> > improving Kafka's JBOD support in KRaft mode. He has also contributed to
> > discussing and reviewing many KIPs such as KIP-690, KIP-554, KIP-866, and
> > KIP-938.
> > >
> > > Congratulations, Igor!
> > >
> > > Thanks,
> > >
> > > Colin (on behalf of the Apache Kafka PMC)
> >
>


Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.7 #145

2024-04-24 Thread Apache Jenkins Server
See 




Re: [ANNOUNCE] New committer: Igor Soarez

2024-04-24 Thread Chia-Ping Tsai
Congratulations, Igor! You are one of the best Kafka developers!!!

Mickael Maison  wrote on Thursday, April 25, 2024 at 2:16 AM:

> Congratulations Igor!
>
> On Wed, Apr 24, 2024 at 8:06 PM Colin McCabe  wrote:
> >
> > Hi all,
> >
> > The PMC of Apache Kafka is pleased to announce a new Kafka committer,
> Igor Soarez.
> >
> > Igor has been a Kafka contributor since 2019. In addition to being a
> regular contributor and reviewer, he has made significant contributions to
> improving Kafka's JBOD support in KRaft mode. He has also contributed to
> discussing and reviewing many KIPs such as KIP-690, KIP-554, KIP-866, and
> KIP-938.
> >
> > Congratulations, Igor!
> >
> > Thanks,
> >
> > Colin (on behalf of the Apache Kafka PMC)
>


[jira] [Resolved] (KAFKA-16613) Remove TestUtils#subscribeAndWaitForRecords

2024-04-24 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-16613.

Fix Version/s: 3.8.0
   Resolution: Fixed

> Remove TestUtils#subscribeAndWaitForRecords
> ---
>
> Key: KAFKA-16613
> URL: https://issues.apache.org/jira/browse/KAFKA-16613
> Project: Kafka
>  Issue Type: Test
>Reporter: PoAn Yang
>Assignee: PoAn Yang
>Priority: Minor
> Fix For: 3.8.0
>
>
> After KAFKA-16483, we removed most usages of 
> TestUtils#subscribeAndWaitForRecords. The only remaining usage is in 
> PlaintextAdminIntegrationTest#testDeleteConsumerGroupOffsets. We can also 
> remove it because Consumer#poll already takes a timeout, so we don't need 
> TestUtils#subscribeAndWaitForRecords to provide another waiting wrapper.





Re: [ANNOUNCE] New committer: Igor Soarez

2024-04-24 Thread Mickael Maison
Congratulations Igor!

On Wed, Apr 24, 2024 at 8:06 PM Colin McCabe  wrote:
>
> Hi all,
>
> The PMC of Apache Kafka is pleased to announce a new Kafka committer, Igor 
> Soarez.
>
> Igor has been a Kafka contributor since 2019. In addition to being a regular 
> contributor and reviewer, he has made significant contributions to improving 
> Kafka's JBOD support in KRaft mode. He has also contributed to discussing and 
> reviewing many KIPs such as KIP-690, KIP-554, KIP-866, and KIP-938.
>
> Congratulations, Igor!
>
> Thanks,
>
> Colin (on behalf of the Apache Kafka PMC)


Re: [DISCUSS] KIP-932: Queues for Kafka

2024-04-24 Thread Jun Rao
Hi, Andrew,

Thanks for the response.

123. I thought the network issue could be covered with the group epoch.
Basically, if the assignment is to be changed, GC bumps up the epoch first,
but doesn't expose the new epoch to members until the assignment is
complete (including initializing the sharePartitionState). Once the
assignment is complete, GC includes the bumped up epoch and the new
assignment in the heartbeatResponse. If the next heartbeat Request includes
the new epoch, it means that the member has received the new assignment and
GC can exclude the assignment in the heartbeatResponse.
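
The epoch handshake sketched in that paragraph can be written out as a tiny state machine — bump the epoch first, resend the assignment while the member's echoed epoch lags, and omit it once the member has caught up. All names below are illustrative stand-ins, not the actual KIP-932 protocol classes.

```java
// Illustrative sketch of the epoch handshake described above: the group
// coordinator bumps the group epoch when recomputing an assignment, keeps
// sending the assignment while the member's echoed epoch lags behind, and
// drops it once the member echoes the new epoch. All names are stand-ins,
// not the actual KIP-932 protocol classes.
public class EpochHandshakeSketch {

    static final class GroupCoordinator {
        int groupEpoch = 1;
        String assignment = "p0,p1";

        // Recompute the assignment: bump the epoch first, then install the
        // new assignment (exposed only after state initialization completes).
        void rebalance(String newAssignment) {
            groupEpoch++;
            assignment = newAssignment;
        }

        // Heartbeat response: include the assignment only while the member
        // has not yet acknowledged the current epoch.
        String heartbeat(int memberEpoch) {
            return memberEpoch < groupEpoch
                    ? groupEpoch + ":" + assignment  // member behind: resend
                    : groupEpoch + ":";              // caught up: omit assignment
        }
    }

    public static void main(String[] args) {
        GroupCoordinator gc = new GroupCoordinator();
        gc.rebalance("p0");
        System.out.println(gc.heartbeat(1)); // member still on epoch 1
        System.out.println(gc.heartbeat(2)); // member echoed epoch 2
    }
}
```

This also makes the lost-response scenario visible: if the response carrying `2:p0` is dropped, the member keeps heartbeating with epoch 1 and the coordinator keeps resending until the echo arrives.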

125. "If AlterShareGroupOffsets is called for a topic-partition which is
not yet in ShareGroupPartitionMetadataValue, it’s not really part of the
group yet. So I think it’s permitted to fail the AlterShareGroupOffsets
because the topic-partition would not be part of ListShareGroupOffsets
result."
  A user may want to initialize SPSO (e.g. based on timestamp) before the
application is first used. If we reject AlterShareGroupOffsets when
ShareGroupPartitionMetadataValue is empty, the user will be forced to start
the application with the wrong SPSO first and then reset it, which will be
inconvenient.

140. AdminClient.deleteShareGroupOffsets allows deletion of SPSO for
individual partitions, but ShareGroupPartitionMetadataValue only tracks
consecutive partitions.

141. Delete share group: Should GC also write a tombstone for each
ShareGroupMemberMetadata record?

142. kafka-share-groups.sh
142.1 What happens if neither --dry-run nor --execute is specified for
reset-offsets?
142.2 Should it support --delete-offsets and --reset-offsets at the
partition level to match the AdminClient api?
142.3 How useful is --dry-run? The corresponding RPCs don't carry the
dry-run flag.
142.4 --state [String]: What are the valid state values?

143. DescribeShareGroupOffsets is served by the share-partition leader. How
could a user describe the share group offset when there is no member?

144. kafka-console-share-consumer.sh: Is there an option to accept consumed
messages?

145. It would be useful to add a broker side metric that measures the
pressure from group.share.partition.max.record.locks, e.g. the fraction of
the time that SPL is blocked because group.share.partition.max.record.locks
is reached.

146. heartbeat-response-time-max: What's the unit?

147. rebalance related metrics: How does a share consumer know when there
is a rebalance and how does it measure the rebalance time?

148. records-per-request-avg: Should we pair it with
records-per-request-max?

149. If the shareGroupState is not available, what error code is used in
WriteShareGroupStateResponse?

Jun

On Wed, Apr 24, 2024 at 7:12 AM Andrew Schofield 
wrote:

> Hi Jun,
> Thanks for your reply.
>
> 123. The GC only sends the assignment in ShareGroupHeartbeat
> response if the member has just joined, or the assignment has been
> recalculated. This latter condition is met when the GC fails over.
>
> With a consumer group, it works a little differently. The heartbeating
> occurs asynchronously with the reconciliation process in the client.
> The client has reconciliation callbacks, as well as offsets to commit
> before it can confirm that revocation has occurred. So, it might take
> multiple heartbeats to complete the installation of a new target
> assignment in the client.
>
> With a share group, it’s more brutal. The GC sends the assignment
> in a heartbeat response, and the client is assumed to have acted on
> it immediately. Since share group assignment is really about balancing
> consumers across the available partitions, rather than safely handing
> ownership of partitions between consumers, there is less coordination.
>
> I wonder whether there is one situation that does need considering.
> If the network connection between the client and the GC is lost, it is
> possible that a response containing the assignment is lost. Then,
> the connection will be reestablished, and the assignment will only be
> sent when it’s recalculated. Adding the equivalent of
> ConsumerGroupHeartbeatRequest.TopicPartitions would be one
> way to close that.
>
> 125. Yes, you are right. The reconciliation that I describe in the
> answer to (54) below is related. If AlterShareGroupOffsets is called
> for a topic-partition which is not yet in ShareGroupPartitionMetadataValue,
> it’s not really part of the group yet. So I think it’s permitted to fail
> the AlterShareGroupOffsets because the topic-partition would not be
> part of ListShareGroupOffsets result.
>
> If AlterShareGroupOffsets is called for a topic-partition which is in
> ShareGroupPartitionMetadataValue, then just calling
> InitializeShareGroupState to set its offset would be sufficient without
> writing ShareGroupPartitionMetadata again.
>
> 138. Added.
>
> 139. This week, I have been experimenting with using the existing
> consumer metrics with share consumer code. That was my plan to get
> started with metrics. While they work to some 

[ANNOUNCE] New committer: Igor Soarez

2024-04-24 Thread Colin McCabe
Hi all,

The PMC of Apache Kafka is pleased to announce a new Kafka committer, Igor 
Soarez.

Igor has been a Kafka contributor since 2019. In addition to being a regular 
contributor and reviewer, he has made significant contributions to improving 
Kafka's JBOD support in KRaft mode. He has also contributed to discussing and 
reviewing many KIPs such as KIP-690, KIP-554, KIP-866, and KIP-938.

Congratulations, Igor!

Thanks,

Colin (on behalf of the Apache Kafka PMC)


Re: [PR] [MINOR] Add youtube to content security policy [kafka-site]

2024-04-24 Thread via GitHub


raboof commented on PR #597:
URL: https://github.com/apache/kafka-site/pull/597#issuecomment-2075421541

   unfortunately it seems https://w3c.github.io/webappsec-csp/#meta-element is 
correct that the meta element can only further restrict the policy, not widen 
it.
   
   I wonder if you could do a `Header add` in a `.htaccess` file like in 
https://github.com/apache/privacy-website/tree/main/examples/youtube-html ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Reopened] (KAFKA-15743) KRaft support in ReplicationQuotasTest

2024-04-24 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano reopened KAFKA-15743:
---

The {{registerBroker}} function in this test file needs to use the directory 
UUIDs assigned to the broker and not random UUIDs. If the directory UUIDs do 
not match then when the broker reregisters as part of startup, any topics with 
replicas on the broker which were created with the random UUIDs will actually 
be offline. [~soarez] can comment more here.

> KRaft support in ReplicationQuotasTest
> --
>
> Key: KAFKA-15743
> URL: https://issues.apache.org/jira/browse/KAFKA-15743
> Project: Kafka
>  Issue Type: Task
>  Components: core
>Reporter: Sameer Tejani
>Assignee: Dmitry Werner
>Priority: Minor
>  Labels: kraft, kraft-test, newbie
> Fix For: 3.8.0
>
>
> The following tests in ReplicationQuotasTest in 
> core/src/test/scala/unit/kafka/server/ReplicationQuotasTest.scala need to be 
> updated to support KRaft
> 59 : def shouldBootstrapTwoBrokersWithLeaderThrottle(): Unit = {
> 64 : def shouldBootstrapTwoBrokersWithFollowerThrottle(): Unit = {
> 171 : def shouldThrottleOldSegments(): Unit = {
> Scanned 240 lines. Found 0 KRaft tests out of 3 tests



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2841

2024-04-24 Thread Apache Jenkins Server
See 




[jira] [Reopened] (KAFKA-16606) JBOD support in KRaft does not seem to be gated by the metadata version

2024-04-24 Thread Igor Soarez (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Soarez reopened KAFKA-16606:
-
  Assignee: Igor Soarez

> JBOD support in KRaft does not seem to be gated by the metadata version
> ---
>
> Key: KAFKA-16606
> URL: https://issues.apache.org/jira/browse/KAFKA-16606
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.7.0
>Reporter: Jakub Scholz
>Assignee: Igor Soarez
>Priority: Major
>
> JBOD support in KRaft should be supported since Kafka 3.7.0. The Kafka 
> [source 
> code|https://github.com/apache/kafka/blob/1b301b30207ed8fca9f0aea5cf940b0353a1abca/server-common/src/main/java/org/apache/kafka/server/common/MetadataVersion.java#L194-L195]
>  suggests that it is supported with the metadata version {{{}3.7-IV2{}}}. 
> However, it seems to be possible to run KRaft cluster with JBOD even with 
> older metadata versions such as {{{}3.6{}}}. For example, I have a cluster 
> using the {{3.6}} metadata version:
> {code:java}
> bin/kafka-features.sh --bootstrap-server localhost:9092 describe
> Feature: metadata.version       SupportedMinVersion: 3.0-IV1    
> SupportedMaxVersion: 3.7-IV4    FinalizedVersionLevel: 3.6-IV2  Epoch: 1375 
> {code}
> Yet a KRaft cluster with JBOD seems to run fine:
> {code:java}
> bin/kafka-log-dirs.sh --bootstrap-server localhost:9092 --describe
> Querying brokers for log directories information
> Received log directory information from brokers 2000,3000,1000
> 

Re: [VOTE] KIP-1031: Control offset translation in MirrorSourceConnector

2024-04-24 Thread Omnia Ibrahim
Thanks Igor, I’ll conclude the vote with 3 binding votes from Mickael Maison, 
Chris Egerton and Igor Soarez.
Thanks everyone 

> On 24 Apr 2024, at 15:11, Igor Soarez  wrote:
> 
> Hi Omnia,
> 
> Thanks for your answers, and I see you've updated the KIP so thanks for the 
> changes too.
> 
> +1 (binding), thanks for the KIP
> 
> --
> Igor



Re: [DISCUSS] KIP-932: Queues for Kafka

2024-04-24 Thread Andrew Schofield
Hi Jun,
Thanks for your reply.

123. The GC only sends the assignment in ShareGroupHeartbeat
response if the member has just joined, or the assignment has been
recalculated. This latter condition is met when the GC fails over.

With a consumer group, it works a little differently. The heartbeating
occurs asynchronously with the reconciliation process in the client.
The client has reconciliation callbacks, as well as offsets to commit
before it can confirm that revocation has occurred. So, it might take
multiple heartbeats to complete the installation of a new target
assignment in the client.

With a share group, it’s more brutal. The GC sends the assignment
in a heartbeat response, and the client is assumed to have acted on
it immediately. Since share group assignment is really about balancing
consumers across the available partitions, rather than safely handing
ownership of partitions between consumers, there is less coordination.

I wonder whether there is one situation that does need considering.
If the network connection between the client and the GC is lost, it is
possible that a response containing the assignment is lost. Then,
the connection will be reestablished, and the assignment will only be
sent when it’s recalculated. Adding the equivalent of
ConsumerGroupHeartbeatRequest.TopicPartitions would be one
way to close that.

125. Yes, you are right. The reconciliation that I describe in the
answer to (54) below is related. If AlterShareGroupOffsets is called
for a topic-partition which is not yet in ShareGroupPartitionMetadataValue,
it’s not really part of the group yet. So I think it’s permitted to fail
the AlterShareGroupOffsets because the topic-partition would not be
part of ListShareGroupOffsets result.

If AlterShareGroupOffsets is called for a topic-partition which is in
ShareGroupPartitionMetadataValue, then just calling
InitializeShareGroupState to set its offset would be sufficient without
writing ShareGroupPartitionMetadata again.

138. Added.

139. This week, I have been experimenting with using the existing
consumer metrics with share consumer code. That was my plan to get
started with metrics. While they work to some extent, I am not entirely
happy with the result.

I have added a basic set of share consumer-specific metrics to KIP-932
which I think is a step in the right direction. I anticipate a future KIP
that defines many more metrics and tackles concepts such as what
“lag” means for a share group. The metrics I’ve included are simply
counting and measuring the operations, which is a good start.

54. I think either with or without InitializingTopics/DeletingTopics in
ShareGroupPartitionMetadataValue works. However, I think there is
a deeper point behind what you said. The GC needs to be responsive
to topic creation and deletion, and addition of partitions in cases where
it doesn’t agree with InitializingTopics/DeletingTopics. So, we are
probably not saving anything by having those fields. As a result, I have
removed InitializingTopics and DeletingTopics and updated the KIP
accordingly. The GC needs to reconcile its view of the initialised
topic-partitions with ShareGroupPartitionMetadataValue and it will
initialize or delete share-group state accordingly.


I have made one more change to the KIP. The SharePartitionAssignor
interface has been renamed to
o.a.k.coordinator.group.assignor.ShareGroupPartitionAssignor and it
now extends the PartitionAssignor interface. It’s essentially a marker
of which partition assignors can be used with share groups.

Thanks,
Andrew

> On 23 Apr 2024, at 18:27, Jun Rao  wrote:
> 
> Hi, Andrew,
> 
> Thanks for the reply.
> 
> 123. "it doesn’t need to confirm the assignment back to the GC."
>  Hmm, I thought the member needs to confirm the assignment to GC to
> avoid GC including the assignment in the heartbeat response continuously. I
> assume this is done by including the new group epoch in the heartbeat
> response.
> 
> 125. It's possible that the share partition has never been initialized when
> AlterShareGroupOffsets is called. If GC doesn't write
> ShareGroupPartitionMetadata and the GC fails over, it would reinitialize
> the share partition and lose the effect of AlterShareGroupOffsets. If the
> partition has already been initialized and it's recorded
> in ShareGroupPartitionMetadata, it's possible not to write
> ShareGroupPartitionMetadata again when handling AlterShareGroupOffsets.
> 
> 138. Could we add the flow in GC when a topic is deleted?
> 
> 139. Do we need to add any metrics in KafkaShareConsumer?
> 
> 54. "I don’t think there is any redundancy. The ShareGroupMemberMetadata
> does include a list of subscribed topics. However, if there is a period of
> time in which no members are subscribed to a particular topic, it does not
> mean that the topic’s ShareGroupState should be immediately removed, but it
> does mean that there will be no ShareGroupMemberMetadata records containing
> that topic."
>  I am still trying to understand the 

Re: [VOTE] KIP-1031: Control offset translation in MirrorSourceConnector

2024-04-24 Thread Igor Soarez
Hi Omnia,

Thanks for your answers, and I see you've updated the KIP so thanks for the 
changes too.

+1 (binding), thanks for the KIP

--
Igor


Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-04-24 Thread Omnia Ibrahim
Hi Claude, sorry it took me a while to respond. I finally had time to look 
into your implementation here 
https://github.com/Claudenw/kafka/blob/KIP-936/storage/src/main/java/org/apache/kafka/storage/internals/log/ProducerIDQuotaManager.java#L121
 and so far it makes sense. 

> So an early PID is added to the first filter and the associated metric is
> updated.
> that PID is seen multiple times over the next 60 minutes, but is not added
> to the Bloom filters again.
> once the 60 minutes elapses the first filter is cleared, or removed and a
> new one started.  In any case the PID is no longer recorded in any extant
> Bloom filter.
> the PID is seen again and is added to the newest bloom filter and the
> associated metric is updated.
> 
> I believe at this point the metric is incorrect, the PID has been counted
> 2x, when it has been in use for the entire time.

You are right! My original design with a sliding-window Bloom filter carries this 
risk of duplication. I had a look into your track method and it makes sense.
> 1. p.i.q.window.size.seconds the length of time that a window will
>   exist.  This is also the maximum time between PID uses where the PID is
>   considered to be the same.  Reuse of the PID after p.iq.window.size.seconds
>   triggers recording the PID as a new PID.
>   Define a new configuration option "producer.id.quota.window.count" as
>   the number of windows active in window.size.seconds.
This is correct. 

>   3. p.i.q.window.num, specified as 11 in the KIP.  I thought this was how
>   many PIDs were expected in each window.  In the original KIP this means
>   that we expect to see the PID 22 times (2 windows x 11).  In the Layered
>   Bloom filter this would be the N value for the Shape.
producer.id.quota.window.num is how many metric samples we retain, not how 
many PIDs we store in memory. 
>   2. p.i.q.window.count (the new one), how many sections to break
>   p.i.q.window.size.seconds into.  In the initial KIP this is 2.  This value
>   gives us the timing for creating a new layer in the layered bloom filter.
>   So every (p.i.q.window.size.seconds / p.i.q.window.count) seconds a new
>   layer will be created and an old layer or layers will be removed.  The
>   layered bloom filter will add layers to keep the probability of false
>   positives in range.
>   4. p.i.q.window.fpr (needs to be specified) the expected false positive
>   rate.  Not sure how to express this in the config in a way that makes sense
>   but something like 0.06 or the like.  This is the P value for the
>   Shape.  See https://hur.st/bloomfilter for a Bloom filter 

For these two we might need to introduce them. How would they interact with 
each other in the layered bloom filter? 
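
To make the layer lifecycle concrete, here is a minimal, self-contained sketch (not Kafka code; `LayeredPidTracker`, its method names, and the use of `HashSet` in place of a Bloom filter layer are all assumptions for illustration) of how rotating layers plus re-insertion into the newest layer avoid double-counting a PID within the window:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Sketch of the layered-window idea: a PID increments the metric only if no
// live layer has seen it, and layers rotate every
// window.size.seconds / window.count, expiring after window.count rotations.
// A HashSet stands in for a Bloom filter layer to keep this self-contained.
public class LayeredPidTracker {
    private final int windowCount;
    private final Deque<Set<Long>> layers = new ArrayDeque<>();
    private long uniquePids = 0; // the metric being protected from double counting

    public LayeredPidTracker(int windowCount) {
        this.windowCount = windowCount;
        layers.addLast(new HashSet<>());
    }

    /** Returns true if this PID was counted as new. */
    public boolean track(long pid) {
        boolean seen = false;
        for (Set<Long> layer : layers) {
            if (layer.contains(pid)) { seen = true; break; }
        }
        // Re-insert into the newest layer so a continuously used PID keeps
        // sliding forward and is never recounted (the duplication fix above).
        layers.peekLast().add(pid);
        if (!seen) uniquePids++;
        return !seen;
    }

    /** Called every window.size.seconds / window.count: new layer in, oldest out. */
    public void rotate() {
        layers.addLast(new HashSet<>());
        if (layers.size() > windowCount) layers.removeFirst();
    }

    public long uniquePids() { return uniquePids; }
}
```

With windowCount = 2, a PID seen in every sub-window is counted once; it is only counted again after it has been absent for a full window and its layers have expired.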

Thanks 

> On 16 Apr 2024, at 08:00, Claude Warren  wrote:
> 
> The difference between p.i.q.window.count and p.i.q.window.num:
> 
> To be honest, I may have misunderstood your definition of window num.  But
> here is what I have in mind:
> 
> 
>   1. p.i.q.window.size.seconds the length of time that a window will
>   exist.  This is also the maximum time between PID uses where the PID is
>   considered to be the same.  Reuse of the PID after p.iq.window.size.seconds
>   triggers recording the PID as a new PID.
>   Define a new configuration option "producer.id.quota.window.count" as
>   the number of windows active in window.size.seconds.
>   2. p.i.q.window.count (the new one), how many sections to break
>   p.i.q.window.size.seconds into.  In the initial KIP this is 2.  This value
>   gives us the timing for creating a new layer in the layered bloom filter.
>   So every (p.i.q.window.size.seconds / p.i.q.window.count) seconds a new
>   layer will be created and an old layer or layers will be removed.  The
>   layered bloom filter will add layers to keep the probability of false
>   positives in range.
>   3. p.i.q.window.num, specified as 11 in the KIP.  I thought this was how
>   many PIDs were expected in each window.  In the original KIP this means
>   that we expect to see the PID 22 times (2 windows x 11).  In the Layered
>   Bloom filter this would be the N value for the Shape.
>   4. p.i.q.window.fpr (needs to be specified) the expected false positive
>   rate.  Not sure how to express this in the config in a way that makes sense
>   but something like 0.06 or the like.  This is the P value for the
>   Shape.  See https://hur.st/bloomfilter for a Bloom filter calculator.
> 
> Once we have the N and P for the Shape the shape can be instantiated as
> "Shape s = Shape.fromNP( int n, double p );"
> 
> In the layered filter once N items have been added to the layer a new layer
> is created.  Layers are removed after p.i.q.window.size.seconds so if there
> is a burst of PIDs the number of layers will expand and then shrink back as
> the PIDs expire.  While running there is always at least 1 layer.
> 
> Some calculated points:
> 
>   - No layer will span more than p.i.q.window.size.seconds /
>   p.i.q.window.count 

[jira] [Resolved] (KAFKA-16592) ConfigKey constructor update can break clients using it

2024-04-24 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-16592.

Fix Version/s: 3.8.0
   Resolution: Fixed

> ConfigKey constructor update can break clients using it
> ---
>
> Key: KAFKA-16592
> URL: https://issues.apache.org/jira/browse/KAFKA-16592
> Project: Kafka
>  Issue Type: Bug
>Reporter: Sagar Rao
>Assignee: Sagar Rao
>Priority: Major
>  Labels: clients
> Fix For: 3.8.0
>
>
> In [KAFKA-14957|https://issues.apache.org/jira/browse/KAFKA-14957], the 
> constructor of ConfigDef.ConfigKey was updated to add a new argument called 
> {*}alternativeString{*}. As part of the PR, new *define* methods were also 
> added which makes sense. However, since the constructor of 
> *ConfigDef.ConfigKey* itself can be used directly by other clients which 
> import the dependency, this can break all clients who were using the older 
> constructor w/o the *alternativeString* argument. 
> I bumped into this when I was testing 
> the[kafka-connect-redis|[https://github.com/jcustenborder/kafka-connect-redis/tree/master]]
>  connector. It starts up correctly against the official 3.7 release, but 
> fails with the following error when run against a 3.8 snapshot
>  
>  
> {code:java}
> Caused by: java.lang.NoSuchMethodError: 
> org.apache.kafka.common.config.ConfigDef$ConfigKey.<init>(Ljava/lang/String;Lorg/apache/kafka/common/config/ConfigDef$Type;Ljava/lang/Object;Lorg/apache/kafka/common/config/ConfigDef$Validator;Lorg/apache/kafka/common/config/ConfigDef$Importance;Ljava/lang/String;Ljava/lang/String;ILorg/apache/kafka/common/config/ConfigDef$Width;Ljava/lang/String;Ljava/util/List;Lorg/apache/kafka/common/config/ConfigDef$Recommender;Z)V
>  at 
> com.github.jcustenborder.kafka.connect.utils.config.ConfigKeyBuilder.build(ConfigKeyBuilder.java:62)
>  at 
> com.github.jcustenborder.kafka.connect.redis.RedisConnectorConfig.config(RedisConnectorConfig.java:133)
>  at 
> com.github.jcustenborder.kafka.connect.redis.RedisSinkConnectorConfig.config(RedisSinkConnectorConfig.java:46)
>  at 
> com.github.jcustenborder.kafka.connect.redis.RedisSinkConnector.config(RedisSinkConnector.java:73)
>  at 
> org.apache.kafka.connect.runtime.AbstractHerder.validateConnectorConfig(AbstractHerder.java:538)
>  at 
> org.apache.kafka.connect.runtime.AbstractHerder.lambda$validateConnectorConfig$3(AbstractHerder.java:412)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ... 1 more
>  
> {code}
>  
> The reason for that is that the connector uses another library called 
> connect-utils which invokes the old constructor 
> [directly|https://github.com/jcustenborder/connect-utils/blob/master/connect-utils/src/main/java/com/github/jcustenborder/kafka/connect/utils/config/ConfigKeyBuilder.java#L62]
> It is not expected for connector invocations to fail across versions so this 
> would cause confusion.
> We could ask why the constructor is being invoked directly instead of 
> using the *define* method, but there might be other clients doing the same. 
> We should add the old constructor back, delegating to the new one with the 
> *alternativeString* set to null.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
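
The compatibility fix described in the ticket can be illustrated with a simplified, hypothetical analog (the real ConfigDef.ConfigKey constructor takes many more arguments; the class and field names here are trimmed for illustration): restoring the old constructor so it delegates to the new one with alternativeString set to null preserves binary compatibility for callers compiled against the pre-3.8 signature.

```java
// Simplified, hypothetical analog of ConfigDef.ConfigKey (not the real class).
public class ConfigKeyCompat {
    public final String name;
    public final String documentation;
    public final String alternativeString; // argument added by KAFKA-14957

    // New constructor, analogous to the post-KAFKA-14957 signature.
    public ConfigKeyCompat(String name, String documentation, String alternativeString) {
        this.name = name;
        this.documentation = documentation;
        this.alternativeString = alternativeString;
    }

    // Restored old constructor: callers compiled against the pre-3.8
    // signature keep linking, and the new field defaults to null.
    public ConfigKeyCompat(String name, String documentation) {
        this(name, documentation, null);
    }
}
```

Because NoSuchMethodError is raised at link time, recompiling the connector is not required once the old overload is restored.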


Re: [DISCUSS] KIP-1036: Extend RecordDeserializationException exception

2024-04-24 Thread Frédérik Rouleau
Hi,

I have updated the KIP now and the latest version of PR is available.

About Kirk's questions:

K11: Yes, both can have a deserialization exception, but we deserialize the
key first, so if an error occurs there, we do not try to deserialize the
value. So the exception raised is always for either the key or the value.

K12: Not sure of concrete usage for now, just a sugar feature. I suppose we
can imagine some use case where you need/want only the first bytes and do
not want to waste memory allocating the whole payload (SchemaRegistry's
schema Id or something similar).

K13: The old constructor is not needed anymore. It is just kept for
compatibility until removed in a major version. As it is public, we might
have some users using it, even if I cannot see any valid reason for this.

Thanks,
Fred


Re: [VOTE] KIP-1037: Allow WriteTxnMarkers API with Alter Cluster Permission

2024-04-24 Thread Luke Chen
Hi Nikhil,
Thanks for the KIP.

+1 from me.

Luke

On Mon, Apr 22, 2024 at 7:41 PM Andrew Schofield 
wrote:

> Hi Nikhil,
> Thanks for the KIP. Looks good to me.
>
> +1 (non-binding)
>
> Thanks,
> Andrew
>
> > On 22 Apr 2024, at 09:17, Christo Lolov  wrote:
> >
> > Heya Nikhil,
> >
> > Thanks for the proposal, as mentioned before it makes sense to me!
> >
> > +1 (binding)
> >
> > Best,
> > Christo
> >
> > On Sat, 20 Apr 2024 at 00:25, Justine Olshan
> 
> > wrote:
> >
> >> Hey Nikhil,
> >>
> >> I meant to comment on the discussion thread, but my draft took so long,
> you
> >> opened the vote.
> >>
> >> Regardless, I just wanted to say that it makes sense to me. +1 (binding)
> >>
> >> Justine
> >>
> >> On Fri, Apr 19, 2024 at 7:22 AM Nikhil Ramakrishnan <
> >> ramakrishnan.nik...@gmail.com> wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> I would like to start a voting thread for KIP-1037: Allow
> >>> WriteTxnMarkers API with Alter Cluster Permission
> >>> (
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1037%3A+Allow+WriteTxnMarkers+API+with+Alter+Cluster+Permission
> >>> )
> >>> as there have been no objections on the discussion thread.
> >>>
> >>> For comments or feedback please check the discussion thread here:
> >>> https://lists.apache.org/thread/bbkyt8mrc8xp3jfyvhph7oqtjxl29xmn
> >>>
> >>> Thanks,
> >>> Nikhil
> >>>
> >>
>
>


[jira] [Created] (KAFKA-16613) Remove TestUtils#subscribeAndWaitForRecords

2024-04-24 Thread PoAn Yang (Jira)
PoAn Yang created KAFKA-16613:
-

 Summary: Remove TestUtils#subscribeAndWaitForRecords
 Key: KAFKA-16613
 URL: https://issues.apache.org/jira/browse/KAFKA-16613
 Project: Kafka
  Issue Type: Test
Reporter: PoAn Yang
Assignee: PoAn Yang


After KAFKA-16483, we removed most usages of 
TestUtils#subscribeAndWaitForRecords. The only remaining case that uses it is 
PlaintextAdminIntegrationTest#testDeleteConsumerGroupOffsets. We can also 
remove it because Consumer#poll already takes a timeout. We don't need 
TestUtils#subscribeAndWaitForRecords to provide another waiting wrapper.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16612) Talking to controllers via AdminClient requires reconfiguring controller listener

2024-04-24 Thread Gantigmaa Selenge (Jira)
Gantigmaa Selenge created KAFKA-16612:
-

 Summary: Talking to controllers via AdminClient requires 
reconfiguring controller listener
 Key: KAFKA-16612
 URL: https://issues.apache.org/jira/browse/KAFKA-16612
 Project: Kafka
  Issue Type: Improvement
Reporter: Gantigmaa Selenge


After KIP-919, Kafka controllers register themselves with the active controller 
once they  start up. This registration includes information about the endpoints 
which the controller listener is configured with. This endpoint is then sent to 
admin clients (via DescribeClusterResponse) so that clients send requests to 
the active controller. If the controller listener is configured with 
"CONTROLLER://0.0.0.0:9093", this will result in admin client requests 
failing (trying to connect to localhost). This was not clearly stated in the 
KIP or the documentation.

When clients talk to brokers, advertised.listeners is used; however, 
advertised.listeners is forbidden for controllers. Should we allow 
advertised.listeners for controllers so that the admin client can use it to talk to 
controllers, in the same way it uses it to talk to brokers? Or should the 
endpoints provided in controller.quorum.voters be returned to the admin client?

If the intention is to use the regular "listeners" configuration of controller 
for clients, this should be clearly documented. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-16483) Apply `ClusterTestExtensions` to DeleteOffsetsConsumerGroupCommandIntegrationTest

2024-04-24 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-16483.

Fix Version/s: 3.8.0
   Resolution: Fixed

> Apply `ClusterTestExtensions` to 
> DeleteOffsetsConsumerGroupCommandIntegrationTest
> -
>
> Key: KAFKA-16483
> URL: https://issues.apache.org/jira/browse/KAFKA-16483
> Project: Kafka
>  Issue Type: Test
>Reporter: Chia-Ping Tsai
>Assignee: PoAn Yang
>Priority: Minor
> Fix For: 3.8.0
>
>
> By using ClusterTestExtensions, 
> DeleteOffsetsConsumerGroupCommandIntegrationTest can get away from the 
> KafkaServerTestHarness dependency



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-1031: Control offset translation in MirrorSourceConnector

2024-04-24 Thread Omnia Ibrahim
I updated the KIP as well to briefly explain how offset.lag.max would help 
latency. Please let me know if the KIP looks better now. 

> On 24 Apr 2024, at 11:49, Omnia Ibrahim  wrote:
> 
> Thanks Igor for the suggestions. I updated the KIP with some of them. 
> 
>> s/recored's offset/recorded offsets
> I actually mean record’s offset and not recorded offset. 
> MirrorSourceConnector store the offset of the committed record on the 
> destination cluster in an internal topic. 
> 
>> 12. The use of the word "customers" seems a bit odd to me in this context.
>> Did you perhaps mean one of "use-cases", "users" or "operators”?
> Changed this to use-cases 
> 
>> 13. "They still left with part#1 of this feature which add cost to
>> the progress of their replication."
>> I'm unsure what this means. Do you mean to say that
>> MirrorCheckpointConnector is disabled but MirrorSourceConnector is not?
>> Could you also clarify where the additional cost comes from?
> Yes currently MM2 can disable step 2 of this feature by not running 
> MirrorCheckpointConnector or by disabling emit.checkpoints.enabled  and 
> sync.group.offsets.enabled  however as long as MirrorSourceConnector is 
> running this 1st step of the feature will never be disabled and this feature 
> is costing 1. Create internal topic 2. Latency as we need to produce/queue 
> translated offsets 
>> 14. This is probably more ignorance of mine: it doesn't seem obvious in
>> the KIP how increasing offset.lag.max  to INT_MAX helps reduce latency.
> 
> Based on my knowledge and understanding, offset.lag.max controls the size of 
> queued offsets to be synced which are stored in `pendingOffsetSyncs`. Then
> 1. MM2 will try to send these periodically in `commitRecord` using 
> "MirrorSourceTask ::firePendingOffsetSyncs”. However any as `commitRecord` 
> isn’t blocker then any failed offset sync will be skipped.
> 2. To fix this then in `commit` the connector which is a blocker method will 
> retry publish any skipped offsets.   
> 
> Now If we sit  offset.lag.max  to high value then translated queued offset 
> wouldn’t be considered stale and as a result wouldn’t be added to 
> `pendingOffsetSyncs` which what `firePendingOffsetSyncs` will try to commit. 
> Hope this clarify it.
> 
>> On 20 Apr 2024, at 09:41, Igor Soarez  wrote:
>> 
>> Hi Omnia,
>> 
>> Thanks for this KIP.
>> 
>> 11. These seem to me to be small misspellings, please double-check:
>> s/MM2 main features/MM2's main features
>> s/syncing consumer group offset/syncing consumer group offsets
>> s/relays/relies
>> s/recored's offset/recorded offsets
>> s/clusters without need for/clusters without the need for
>> s/creating internal topic./creating an internal topic.
>> s/This KIP propose that/This KIP proposes that
>> 
>> 12. The use of the word "customers" seems a bit odd to me in this context.
>> Did you perhaps mean one of "use-cases", "users" or "operators"?
>> 
>> 13. "They still left with part#1 of this feature which add cost to
>> the progress of their replication."
>> I'm unsure what this means. Do you mean to say that
>> MirrorCheckpointConnector is disabled but MirrorSourceConnector is not?
>> Could you also clarify where the additional cost comes from?
>> 
>> 14. This is probably more ignorance of mine: it doesn't seem obvious in
>> the KIP how increasing offset.lag.max  to INT_MAX helps reduce latency.
>> I'm guessing it's related to KAFKA-14610 but after having a look I
>> still couldn't understand why.
>> 
>> 
>> --
>> Igor
>> 
>> On Wed, Apr 17, 2024, at 3:22 PM, Omnia Ibrahim wrote:
>>> Thanks Chris and Mickael for the votes. 
>>> Can I please get one last +1 binding vote please?
>>> 
>>> Thanks
>>> Omnia
>>> 
 On 12 Apr 2024, at 13:21, Chris Egerton  wrote:
 
 +1 (binding), thanks Omnia!
 
 On Fri, Apr 12, 2024, 03:46 Mickael Maison  
 wrote:
 
> Hi Omnia,
> 
> +1 (binding), thanks for the KIP!
> 
> Mickael
> 
> On Fri, Apr 12, 2024 at 9:01 AM Omnia Ibrahim 
> wrote:
>> 
>> Hi everyone, I would like to start a voting thread for KIP-1031: Control
> offset translation in MirrorSourceConnector
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1031%3A+Control+offset+translation+in+MirrorSourceConnector
>> 
>> For comments or feedback please check the discussion thread here
> https://lists.apache.org/thread/ym6zr0wrhglft5c000x9c8ych098s7h6
>> 
>> Thanks
>> Omnia
>> 
> 
>>> 
>>> 
> 



[jira] [Created] (KAFKA-16611) Consider adding test name to "client.id" of Admin in testing

2024-04-24 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-16611:
--

 Summary: Consider adding test name to "client.id" of Admin in 
testing
 Key: KAFKA-16611
 URL: https://issues.apache.org/jira/browse/KAFKA-16611
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


I observed the following errors many times.

{quote}

org.opentest4j.AssertionFailedError: Found 16 unexpected threads during 
@BeforeAll: `kafka-admin-client-thread | 
adminclient-287,kafka-admin-client-thread | 
adminclient-276,kafka-admin-client-thread | 
adminclient-271,kafka-admin-client-thread | 
adminclient-293,kafka-admin-client-thread | 
adminclient-281,kafka-admin-client-thread | 
adminclient-302,kafka-admin-client-thread | 
adminclient-334,kafka-admin-client-thread | 
adminclient-323,kafka-admin-client-thread | 
adminclient-257,kafka-admin-client-thread | 
adminclient-336,kafka-admin-client-thread | 
adminclient-308,kafka-admin-client-thread | 
adminclient-263,kafka-admin-client-thread | 
adminclient-273,kafka-admin-client-thread | 
adminclient-278,kafka-admin-client-thread | 
adminclient-283,kafka-admin-client-thread | adminclient-317` ==> expected: 
 but was: 

{quote}

That could be caused by exceptional shutdown. Or we do have resource leaks in 
some failed tests. Adding the test name to "client.id" can give hints about that



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-1031: Control offset translation in MirrorSourceConnector

2024-04-24 Thread Omnia Ibrahim
Thanks Igor for the suggestions. I updated the KIP with some of them. 

> s/recored's offset/recorded offsets
I actually mean record’s offset and not recorded offset. MirrorSourceConnector 
stores the offset of the committed record on the destination cluster in an 
internal topic. 

> 12. The use of the word "customers" seems a bit odd to me in this context.
> Did you perhaps mean one of "use-cases", "users" or "operators”?
Changed this to use-cases 

> 13. "They still left with part#1 of this feature which add cost to
> the progress of their replication."
> I'm unsure what this means. Do you mean to say that
> MirrorCheckpointConnector is disabled but MirrorSourceConnector is not?
> Could you also clarify where the additional cost comes from?
Yes, currently MM2 can disable step 2 of this feature by not running 
MirrorCheckpointConnector, or by disabling emit.checkpoints.enabled and 
sync.group.offsets.enabled. However, as long as MirrorSourceConnector is running, 
the 1st step of the feature can never be disabled, and this feature costs: 
1. creating an internal topic, and 2. latency, as we need to produce/queue 
translated offsets. 
> 14. This is probably more ignorance of mine: it doesn't seem obvious in
> the KIP how increasing offset.lag.max  to INT_MAX helps reduce latency.

Based on my knowledge and understanding, offset.lag.max controls the size of 
queued offsets to be synced, which are stored in `pendingOffsetSyncs`. Then:
1. MM2 will try to send these periodically in `commitRecord` using 
`MirrorSourceTask::firePendingOffsetSyncs`. However, as `commitRecord` 
isn’t blocking, any failed offset sync will be skipped.
2. To fix this, the connector’s `commit`, which is a blocking method, will 
retry publishing any skipped offsets.

Now if we set offset.lag.max to a high value, a translated queued offset 
wouldn’t be considered stale and as a result wouldn’t be added to 
`pendingOffsetSyncs`, which is what `firePendingOffsetSyncs` will try to commit. 
Hope this clarifies it.
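
A rough sketch of the gating behavior described above (assumed names and simplified logic, not the actual MirrorSourceTask code): a translated offset is only queued for syncing once it has drifted more than offset.lag.max past the last synced offset, so a very large value effectively stops anything from entering `pendingOffsetSyncs`:

```java
// Sketch only: simplified stand-in for MM2's offset-sync gating; the class
// and method names are invented for illustration.
public class OffsetSyncGate {
    private final long offsetLagMax; // maps to the offset.lag.max config
    private long lastSyncedOffset = -1;

    public OffsetSyncGate(long offsetLagMax) {
        this.offsetLagMax = offsetLagMax;
    }

    // Returns true when the offset should be queued in pendingOffsetSyncs.
    public boolean shouldSync(long upstreamOffset) {
        if (lastSyncedOffset < 0 || upstreamOffset - lastSyncedOffset > offsetLagMax) {
            lastSyncedOffset = upstreamOffset;
            return true;
        }
        return false;
    }
}
```

Under this sketch, with offsetLagMax set to Integer.MAX_VALUE only the very first record per partition would ever be queued, which matches the "set offset.lag.max to INT_MAX" suggestion earlier in the thread.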

> On 20 Apr 2024, at 09:41, Igor Soarez  wrote:
> 
> Hi Omnia,
> 
> Thanks for this KIP.
> 
> 11. These seem to me to be small misspellings, please double-check:
> s/MM2 main features/MM2's main features
> s/syncing consumer group offset/syncing consumer group offsets
> s/relays/relies
> s/recored's offset/recorded offsets
> s/clusters without need for/clusters without the need for
> s/creating internal topic./creating an internal topic.
> s/This KIP propose that/This KIP proposes that
> 
> 12. The use of the word "customers" seems a bit odd to me in this context.
> Did you perhaps mean one of "use-cases", "users" or "operators"?
> 
> 13. "They still left with part#1 of this feature which add cost to
> the progress of their replication."
> I'm unsure what this means. Do you mean to say that
> MirrorCheckpointConnector is disabled but MirrorSourceConnector is not?
> Could you also clarify where the additional cost comes from?
> 
> 14. This is probably more ignorance of mine: it doesn't seem obvious in
> the KIP how increasing offset.lag.max  to INT_MAX helps reduce latency.
> I'm guessing it's related to KAFKA-14610 but after having a look I
> still couldn't understand why.
> 
> 
> --
> Igor
> 
> On Wed, Apr 17, 2024, at 3:22 PM, Omnia Ibrahim wrote:
>> Thanks Chris and Mickael for the votes. 
>> Can I please get one last +1 binding vote please?
>> 
>> Thanks
>> Omnia
>> 
>>> On 12 Apr 2024, at 13:21, Chris Egerton  wrote:
>>> 
>>> +1 (binding), thanks Omnia!
>>> 
>>> On Fri, Apr 12, 2024, 03:46 Mickael Maison  wrote:
>>> 
 Hi Omnia,
 
 +1 (binding), thanks for the KIP!
 
 Mickael
 
 On Fri, Apr 12, 2024 at 9:01 AM Omnia Ibrahim 
 wrote:
> 
> Hi everyone, I would like to start a voting thread for KIP-1031: Control
 offset translation in MirrorSourceConnector
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-1031%3A+Control+offset+translation+in+MirrorSourceConnector
> 
> For comments or feedback please check the discussion thread here
 https://lists.apache.org/thread/ym6zr0wrhglft5c000x9c8ych098s7h6
> 
> Thanks
> Omnia
> 
 
>> 
>> 



[jira] [Created] (KAFKA-16610) Replace "Map#entrySet#forEach" by "Map#forEach"

2024-04-24 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-16610:
--

 Summary: Replace "Map#entrySet#forEach" by "Map#forEach"
 Key: KAFKA-16610
 URL: https://issues.apache.org/jira/browse/KAFKA-16610
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


{quote}
Targets
    Occurrences of 'entrySet().forEach' in Project
Found occurrences in Project  (16 usages found)
    Unclassified  (16 usages found)
        kafka.core.main  (9 usages found)
            kafka.server  (4 usages found)
                ControllerApis.scala  (2 usages found)
                    ControllerApis  (2 usages found)
                        handleIncrementalAlterConfigs  (1 usage found)
                            774 controllerResults.entrySet().forEach(entry => 
response.responses().add(
                        handleLegacyAlterConfigs  (1 usage found)
                            533 controllerResults.entrySet().forEach(entry => 
response.responses().add(
                ControllerConfigurationValidator.scala  (2 usages found)
                    ControllerConfigurationValidator  (2 usages found)
                        validate  (2 usages found)
                            99 config.entrySet().forEach(e => {
                            114 config.entrySet().forEach(e => 
properties.setProperty(e.getKey, e.getValue))
            kafka.server.metadata  (5 usages found)
                AclPublisher.scala  (1 usage found)
                    AclPublisher  (1 usage found)
                        onMetadataUpdate  (1 usage found)
                            73 aclsDelta.changes().entrySet().forEach(e =>
                ClientQuotaMetadataManager.scala  (3 usages found)
                    ClientQuotaMetadataManager  (3 usages found)
                        handleIpQuota  (1 usage found)
                            119 quotaDelta.changes().entrySet().forEach { e =>
                        update  (2 usages found)
                            54 quotasDelta.changes().entrySet().forEach { e =>
                            99 quotaDelta.changes().entrySet().forEach { e =>
                KRaftMetadataCache.scala  (1 usage found)
                    KRaftMetadataCache  (1 usage found)
                        getClusterMetadata  (1 usage found)
                            491 topic.partitions().entrySet().forEach { entry =>
        kafka.core.test  (1 usage found)
            unit.kafka.integration  (1 usage found)
                KafkaServerTestHarness.scala  (1 usage found)
                    KafkaServerTestHarness  (1 usage found)
                        getTopicNames  (1 usage found)
                            349 
controllerServer.controller.findAllTopicIds(ANONYMOUS_CONTEXT).get().entrySet().forEach
 {
        kafka.metadata.main  (3 usages found)
            org.apache.kafka.controller  (2 usages found)
                QuorumFeatures.java  (1 usage found)
                    toString()  (1 usage found)
                        144 localSupportedFeatures.entrySet().forEach(f -> 
features.add(f.getKey() + ": " + f.getValue()));
                ReplicationControlManager.java  (1 usage found)
                    createTopic(ControllerRequestContext, CreatableTopic, 
List, Map, 
List, boolean)  (1 usage found)
                        732 newParts.entrySet().forEach(e -> 
assignments.put(e.getKey(),
            org.apache.kafka.metadata.properties  (1 usage found)
                MetaPropertiesEnsemble.java  (1 usage found)
                    toString()  (1 usage found)
                        610 logDirProps.entrySet().forEach(
        kafka.metadata.test  (1 usage found)
            org.apache.kafka.controller  (1 usage found)
                ReplicationControlManagerTest.java  (1 usage found)
                    createTestTopic(String, int[][], Map, 
short)  (1 usage found)
                        307 configs.entrySet().forEach(e -> topic.configs().add(
        kafka.streams.main  (1 usage found)
            org.apache.kafka.streams.processor.internals  (1 usage found)
                StreamsMetadataState.java  (1 usage found)
                    onChange(Map>, Map>, Map)  (1 usage found)
                        317 topicPartitionInfo.entrySet().forEach(entry -> 
this.partitionsByTopic
        kafka.tools.main  (1 usage found)
            org.apache.kafka.tools  (1 usage found)
                LeaderElectionCommand.java  (1 usage found)
                    electLeaders(Admin, ElectionType, 
Optional>)  (1 usage found)
                        178 failed.entrySet().forEach(entry -> {
{quote}

origin discussion: 
[https://github.com/apache/kafka/pull/15786#discussion_r1577656938]
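The replacement the ticket describes is mechanical; a small standalone example (names are illustrative, not taken from the Kafka codebase) of the before/after:

```java
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

public class ForEachExample {
    // Before: iterate via the entry set and unwrap each Map.Entry.
    static String viaEntrySet(Map<String, Integer> m) {
        StringJoiner sj = new StringJoiner(",");
        m.entrySet().forEach(e -> sj.add(e.getKey() + "=" + e.getValue()));
        return sj.toString();
    }

    // After: Map#forEach hands over key and value directly, no Entry unwrapping.
    static String viaMapForEach(Map<String, Integer> m) {
        StringJoiner sj = new StringJoiner(",");
        m.forEach((k, v) -> sj.add(k + "=" + v));
        return sj.toString();
    }

    public static void main(String[] args) {
        // TreeMap for a deterministic iteration order in this demo.
        Map<String, Integer> counts = new TreeMap<>(Map.of("a", 1, "b", 2));
        System.out.println(viaEntrySet(counts));   // a=1,b=2
        System.out.println(viaMapForEach(counts)); // a=1,b=2
    }
}
```

Both forms are equivalent; Map#forEach is simply shorter and avoids materializing the Entry in the lambda.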

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2840

2024-04-24 Thread Apache Jenkins Server
See 




[jira] [Resolved] (KAFKA-16424) truncated logs will be left undeleted after alter dir completion

2024-04-24 Thread Luke Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Chen resolved KAFKA-16424.
---
Fix Version/s: 3.8.0
   Resolution: Fixed

> truncated logs will be left undeleted after alter dir completion
> 
>
> Key: KAFKA-16424
> URL: https://issues.apache.org/jira/browse/KAFKA-16424
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.7.0
>Reporter: Luke Chen
>Assignee: PoAn Yang
>Priority: Major
> Fix For: 3.8.0
>
>
> When doing log dir movement, we'll create a temp future replica with the dir 
> named: topic-partition.uniqueId-future, ex: 
> t3-0.9af8e054dbe249cf9379a210ec449af8-future. After the log dir movement 
> completed, we'll rename the future log dir to the normal log dir, in the 
> above case, it'll be "t3" only.
> So, if there are some logs to be deleted during the log dir movement, we'll 
> send for a scheduler to do the deletion later 
> ([here|https://github.com/apache/kafka/blob/2d4abb85bf4a3afb1e3359a05786ab8f3fda127e/core/src/main/scala/kafka/log/LocalLog.scala#L926]).
>  However, when the log dir movement completed, the future log is renamed, the 
> async log deletion will fail with no file existed error:
>  
> {code:java}
> [2024-03-26 17:35:10,809] INFO [LocalLog partition=t3-0, 
> dir=/tmp/kraft-broker-logs] Deleting segment files LogSegment(baseOffset=0, 
> size=0, lastModifiedTime=0, largestRecordTimestamp=-1) (kafka.log.LocalLog$)
> [2024-03-26 17:35:10,810] INFO Failed to delete log 
> /tmp/kraft-broker-logs/t3-0.9af8e054dbe249cf9379a210ec449af8-future/.log.deleted
>  because it does not exist. 
> (org.apache.kafka.storage.internals.log.LogSegment)
> [2024-03-26 17:35:10,811] INFO Failed to delete offset index 
> /tmp/kraft-broker-logs/t3-0.9af8e054dbe249cf9379a210ec449af8-future/.index.deleted
>  because it does not exist. 
> (org.apache.kafka.storage.internals.log.LogSegment)
> [2024-03-26 17:35:10,811] INFO Failed to delete time index 
> /tmp/kraft-broker-logs/t3-0.9af8e054dbe249cf9379a210ec449af8-future/.timeindex.deleted
>  because it does not exist. 
> (org.apache.kafka.storage.internals.log.LogSegment) {code}
> I think we could consider falling back to the normal log dir if the files 
> cannot be found in the future log dir. That is, when the file cannot be found 
> under the "t3-0.9af8e054dbe249cf9379a210ec449af8-future" dir, try the "t3" 
> folder and delete the file there. Because the file already has the ".deleted" 
> suffix, it should be fine to delete it.
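A minimal sketch of the fallback idea proposed above (illustrative names and a toy demo; the real fix would live in LocalLog's async deletion path):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class FallbackDelete {

    // Try to delete a ".deleted" segment file from the future log dir; if it
    // is no longer there (the future dir was renamed to the normal dir when
    // the alter-dir movement completed), fall back to the normal log dir.
    public static boolean deleteWithFallback(Path futureDir, Path normalDir, String fileName)
            throws IOException {
        if (Files.deleteIfExists(futureDir.resolve(fileName))) {
            return true;
        }
        return Files.deleteIfExists(normalDir.resolve(fileName));
    }

    // Self-contained demo: the file only exists under the "normal" dir,
    // mimicking the state after the future dir content has been renamed.
    public static boolean demo() {
        try {
            Path future = Files.createTempDirectory("t3-0.uuid-future");
            Path normal = Files.createTempDirectory("t3-0");
            Files.createFile(normal.resolve("00000000.log.deleted"));
            boolean deleted = deleteWithFallback(future, normal, "00000000.log.deleted");
            return deleted && !Files.exists(normal.resolve("00000000.log.deleted"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("fallback delete worked: " + demo());
    }
}
```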





Re: [PR] MINOR: Add Igor to committers [kafka-site]

2024-04-24 Thread via GitHub


soarez commented on PR #598:
URL: https://github.com/apache/kafka-site/pull/598#issuecomment-2074419156

   Thank you all  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINOR: Add Igor to committers [kafka-site]

2024-04-24 Thread via GitHub


soarez merged PR #598:
URL: https://github.com/apache/kafka-site/pull/598





Re: [PR] MINOR: Add Igor to committers [kafka-site]

2024-04-24 Thread via GitHub


dajac commented on PR #598:
URL: https://github.com/apache/kafka-site/pull/598#issuecomment-2074387386

   Congrats, Igor!

