Re: [ANNOUNCE] New committer: Igor Soarez

2024-04-24 Thread Viktor Somogyi-Vass
Congrats Igor!

On Thu, Apr 25, 2024, 07:01 Bruno Cadonna  wrote:

> Congrats!
>
> Best,
> Bruno
>
> Am 25. April 2024 05:18:19 MESZ schrieb Yash Mayya :
> >Congratulations Igor!
> >
> >On Wed, 24 Apr, 2024, 23:36 Colin McCabe,  wrote:
> >
> >> Hi all,
> >>
> >> The PMC of Apache Kafka is pleased to announce a new Kafka committer,
> Igor
> >> Soarez.
> >>
> >> Igor has been a Kafka contributor since 2019. In addition to being a
> >> regular contributor and reviewer, he has made significant contributions
> to
> >> improving Kafka's JBOD support in KRaft mode. He has also contributed to
> >> discussing and reviewing many KIPs such as KIP-690, KIP-554, KIP-866,
> and
> >> KIP-938.
> >>
> >> Congratulations, Igor!
> >>
> >> Thanks,
> >>
> >> Colin (on behalf of the Apache Kafka PMC)
> >>
>


[jira] [Resolved] (KAFKA-16596) Flaky test – org.apache.kafka.clients.ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup()

2024-04-22 Thread Viktor Somogyi-Vass (Jira)


 [ https://issues.apache.org/jira/browse/KAFKA-16596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viktor Somogyi-Vass resolved KAFKA-16596.
-----------------------------------------
    Fix Version/s: 3.8.0
          Assignee: Andras Katona
        Resolution: Fixed

> Flaky test – 
> org.apache.kafka.clients.ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup()
>  
> ---
>
> Key: KAFKA-16596
> URL: https://issues.apache.org/jira/browse/KAFKA-16596
> Project: Kafka
>  Issue Type: Test
>Reporter: Igor Soarez
>Assignee: Andras Katona
>Priority: Major
>  Labels: GoodForNewContributors, good-first-issue
> Fix For: 3.8.0
>
>
> org.apache.kafka.clients.ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup()
>  failed in the following way:
>  
> {code:java}
> org.opentest4j.AssertionFailedError: Unexpected addresses [93.184.215.14, 
> 2606:2800:21f:cb07:6820:80da:af6b:8b2c] ==> expected: <true> but was: <false>
>   at 
> app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>at 
> app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>at app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)   
>  at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36) 
> at app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214)  
>   at 
> app//org.apache.kafka.clients.ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup(ClientUtilsTest.java:65)
>  {code}
> As a result of the following assertions:
>  
> {code:java}
> // With lookup of example.com, either one or two addresses are expected 
> depending on
> // whether ipv4 and ipv6 are enabled
> List<InetSocketAddress> validatedAddresses = 
> checkWithLookup(asList("example.com:1"));
> assertTrue(validatedAddresses.size() >= 1, "Unexpected addresses " + 
> validatedAddresses);
> List<String> validatedHostNames = 
> validatedAddresses.stream().map(InetSocketAddress::getHostName)
> .collect(Collectors.toList());
> List<String> expectedHostNames = asList("93.184.216.34", 
> "2606:2800:220:1:248:1893:25c8:1946"); {code}
> It seems that the DNS result has changed for example.com.
>  
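For illustration only (not part of the ticket), one way the assertion could avoid hard-coding the resolved addresses is to resolve example.com at test time and compare against that live answer. This is a minimal sketch, assuming the usual java.net and java.util.stream imports are available in ClientUtilsTest and that the reverse lookup returns the literal address when no PTR record exists:

{code:java}
// Sketch: derive the expected host names from a live DNS lookup instead of
// hard-coding the addresses example.com happened to resolve to at one point.
Set<String> liveAnswer = Arrays.stream(InetAddress.getAllByName("example.com"))
        .map(InetAddress::getHostAddress)
        .collect(Collectors.toSet());

Set<String> validatedHostNames = validatedAddresses.stream()
        .map(InetSocketAddress::getHostName)
        .collect(Collectors.toSet());

assertEquals(liveAnswer, validatedHostNames,
        "Unexpected addresses " + validatedHostNames);
{code}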



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] New Kafka PMC Member: Greg Harris

2024-04-13 Thread Viktor Somogyi-Vass
Congrats Greg! :)

On Sun, Apr 14, 2024, 00:35 Bill Bejeck  wrote:

> Congrats Greg!
>
> -Bill
>
> On Sat, Apr 13, 2024 at 4:25 PM Boudjelda Mohamed Said 
> wrote:
>
> > Congratulations Greg
> >
> > On Sat 13 Apr 2024 at 20:42, Chris Egerton  wrote:
> >
> > > Hi all,
> > >
> > > Greg Harris has been a Kafka committer since July 2023. He has remained
> > > very active and instructive in the community since becoming a
> committer.
> > > It's my pleasure to announce that Greg is now a member of Kafka PMC.
> > >
> > > Congratulations, Greg!
> > >
> > > Chris, on behalf of the Apache Kafka PMC
> > >
> >
>


Re: [ANNOUNCE] New committer: Christo Lolov

2024-03-26 Thread Viktor Somogyi-Vass
Congrats Christo! Well deserved!

On Tue, Mar 26, 2024, 13:27 Kamal Chandraprakash <
kamal.chandraprak...@gmail.com> wrote:

> Congrats Christo!
>
> On Tue, Mar 26, 2024, 17:42 Bruno Cadonna  wrote:
>
> > Congrats, Christo!
> >
> > Well deserved!
> >
> > Best,
> > Bruno
> >
> > On 3/26/24 1:04 PM, Luke Chen wrote:
> > > Hi, Everyone,
> > >
> > > The PMC of Apache Kafka is pleased to announce a new Kafka committer:
> > > Christo Lolov.
> > >
> > > Christo has been a Kafka contributor since 2021. He has made over 50
> > > commits. He authored KIP-902, KIP-963, and KIP-1005, as well as many
> > tiered
> > > storage related tasks. He also co-drives the migration from EasyMock to
> > > Mockito and from Junit 4 to JUnit 5.
> > >
> > > Congratulations, Christo!
> > >
> > > Thanks,
> > > Luke (on behalf of the Apache Kafka PMC)
> > >
> >
>


Re: [DISCUSS] KIP-1010: Topic Partition Quota

2024-03-05 Thread Viktor Somogyi-Vass
Hi Afshin,

A couple of observations:
1. The image you inserted doesn't get shown, please fix it.
2. I'd like to clarify your proposal a bit. So far we have had the (user,
client), (user) and (client) combinations. You'd like to introduce
topic-partitions into this framework. Would it extend the current behavior,
so the existing combinations are joined by (tp, user, client), (tp, user),
(tp, client) and (tp) (see the sketch after this list)? Or do these tp
quotas behave differently?
3. How would your implementation work when the aggregate of topic quotas
exceeds the available bandwidth? Do topics get fair access, or is it
possible that some partitions can't be consumed because others eat up the
bandwidth?
4. I'm a bit confused about the motivation section. If you have a topic
with 6 partitions and a quota set to 2MB/s, are you expecting 4MB/s of
throughput on a broker that hosts 2 of that topic's partition leaders?
Wouldn't that violate backward compatibility, since a client could now
produce at a 4MB/s rate?
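To illustrate question 2, this is roughly how quota entities are expressed through the Admin API today (KIP-546). It is only a sketch: the "topic-partition" entity type in the second alteration is hypothetical, shown purely to visualize how the KIP's dimension might slot into the existing model; it is not an existing Kafka entity type.

{code:java}
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.common.quota.ClientQuotaAlteration;
import org.apache.kafka.common.quota.ClientQuotaEntity;

import java.util.List;
import java.util.Map;

public class QuotaEntitySketch {
    public static void main(String[] args) throws Exception {
        try (Admin admin = Admin.create(Map.<String, Object>of("bootstrap.servers", "localhost:9092"))) {
            // Existing behavior: a (user, client-id) scoped produce quota.
            ClientQuotaEntity userClient = new ClientQuotaEntity(Map.of(
                    ClientQuotaEntity.USER, "alice",
                    ClientQuotaEntity.CLIENT_ID, "reporting-app"));
            ClientQuotaAlteration existing = new ClientQuotaAlteration(userClient,
                    List.of(new ClientQuotaAlteration.Op("producer_byte_rate", 2_000_000.0)));

            // Hypothetical: a topic-partition scoped quota, as KIP-1010 might express it.
            ClientQuotaEntity topicPartition = new ClientQuotaEntity(Map.of(
                    "topic-partition", "payments-0")); // assumed entity type, not part of Kafka today
            ClientQuotaAlteration proposed = new ClientQuotaAlteration(topicPartition,
                    List.of(new ClientQuotaAlteration.Op("producer_byte_rate", 2_000_000.0)));

            admin.alterClientQuotas(List.of(existing, proposed)).all().get();
        }
    }
}
{code}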

Thanks,
Viktor

On Wed, Feb 14, 2024 at 9:27 PM Afshin Moazami
 wrote:

> Thanks Viktor,
>
> Hi folks,
> I would like to propose a new feature to extend the quota management in
> Kafka to support topic-partition based quotas. The following is the link to
> the KIP
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1010%3A+Topic+Partition+Quota
>
>
> Best,
> Afshin Moazami
>
> On Wed, Feb 7, 2024 at 5:25 AM Viktor Somogyi-Vass
>  wrote:
>
> > Hi Afshin,
> >
> > We keep KIP discussions on dev@kafka.apache.org so please post this over
> > there too. I'll go over this later this week but devs usually monitor
> that
> > list more frequently and you'll have better chances of getting a reply
> > there.
> >
> > Regards,
> > Viktor
> >
> > On Wed, Jan 17, 2024 at 12:03 AM Afshin Moazami
> >  wrote:
> >
> > > Hi folks,
> > > I am not sure what is the KIP life-cycle and how we can get more
> > attention
> > > on them, so I just reply to this thread with the hope to get some
> > > discussion started.
> > >
> > > Thanks,
> > > Afshin
> > >
> > > On Mon, Dec 11, 2023 at 10:43 AM Afshin Moazami <
> amoaz...@salesforce.com
> > >
> > > wrote:
> > >
> > > > Hi folks,
> > > > I would like to propose a new feature to extend the quota management
> in
> > > > Kafka to support topic-partition based quotas. The following is the
> > link
> > > to
> > > > the KIP
> > > >
> > > >
> > >
> >
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/KAFKA/KIP-1010*3A*Topic*Partition*Quota__;JSsrKw!!DCbAVzZNrAf4!BK-888ZjIeh53cmPcRZ_ZIpA6-02xIk5LXsT4cl82ieHRjWN31a-xsi36sN9I3P3LOhhpYCJU2FpbYkfg2YpGX2RXtPFAIjsHv0$
> > > >
> > > >
> > > > Best,
> > > > Afshin Moazami
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-1016 Make MM2 heartbeats topic name configurable

2024-01-17 Thread Viktor Somogyi-Vass
Hi Bertalan,

Thanks for creating this KIP.
A couple of observations/questions:
1. If I have multiple source->target pairs, can I set this property per
replication flow by prefixing it with "source->target", as with many other
configs, or is it global? (See the sketch after this list.)
2. The replication policy must be set in MirrorClient as well. Is your
change applicable to both MirrorClient and the connectors?
3. It might be worth pointing out (both in the docs and the KIP) that if
the user overrides the replication policy to anything other than
DefaultReplicationPolicy, then this config has no effect.
4. With regard to integration tests, I lean towards not needing them if we
can cover this well with unit tests and mocking.
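To make question 1 concrete, here is a sketch of the two interpretations in MM2 properties form. The property name below is a placeholder, not necessarily the name chosen by the KIP:

{code}
# Interpretation A: a single global setting applied to every flow
heartbeats.topic.name = mm2-heartbeats

# Interpretation B: per-flow overrides via the usual "source->target" prefixing
primary->backup.heartbeats.topic.name = primary-heartbeats
{code}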

Thanks,
Viktor

On Wed, Jan 17, 2024 at 12:23 AM Ryanne Dolan  wrote:

> Makes sense to me, +1.
>
> On Tue, Jan 16, 2024 at 5:04 PM Kondrát Bertalan 
> wrote:
>
>> Hey Team,
>>
>> I would like to start a discussion thread about the *KIP-1016 Make MM2
>> heartbeats topic name configurable
>> <
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1016+Make+MM2+heartbeats+topic+name+configurable
>> >*
>> .
>>
>> This KIP aims to make the default heartbeat topic name (`heartbeats`) in
>> the DefaultReplicationPolicy configurable via a property.
>> Since this is my first KIP and the change is small, I implemented it in
>> advance so, I can include the PR
>> <https://github.com/apache/kafka/pull/15200> as well.
>>
>> I appreciate all your feedbacks and comments.
>>
>> Special thanks to Viktor Somogyi-Vass  and
>> Daniel
>> Urban  for the original idea and help.
>> Thank you,
>> Berci
>>
>> --
>> *Bertalan Kondrat* | Founder, SWE
>> servy.hu <https://www.servy.hu/>
>>
>>
>>
>> <https://www.cloudera.com/>
>> --
>>
>


Re: [PROPOSAL] Add commercial support page on website

2024-01-15 Thread Viktor Somogyi-Vass
Hi all,

I think that making and updating such a list raises tricky questions that
we would have to deal with. Matthias gave good examples; that was my first
thought as well before reading his response.

On the other hand, I think a good alternative would be to create a list of
companies that are regular or meaningful contributors to the project,
provide other help such as testing infrastructure, or sponsor the project
in other ways. Such a list would likely include the companies that offer
support too, and displaying them would also be a form of appreciation.

What do you all think?

Best,
Viktor


On Mon, Jan 15, 2024 at 10:08 AM Francois Papon <
francois.pa...@openobject.fr> wrote:

> Hi Tison,
>
> Publishing a dedicated website for that can be a good idea, however if
> the link of the website could not be mention in the official Apache
> Kafka website I'm afraid that it will not be relevant.
>
> BTW, as I understand after all the feedback of the Apache Kafka PMC and
> community, my proposal is not a good idea for the project so I will
> close the PR.
>
> Thanks all for the feedback.
>
> regards,
>
> François
>
> On 14/01/2024 12:56, tison wrote:
> > FWIW - even if it's rejected by the Kafka PMC, you can maintain your
> > own page for such information and provide your personal comments on
> > them. If the object is to provide information and help users to make
> > decisions, it should help. Although you should do the SEO by yourself,
> > if the information is somehow neutral and valuable, you can ask the
> > @apachekafka Twitter (X) account to propagate it and provide a blog
> > for Kafka blogs.
> >
> > This is the common way third-party "evangelists" produce content
> > and get it promoted.
> >
> > Best,
> > tison.
> >
> > Matthias J. Sax  于2024年1月13日周六 07:35写道:
> >> François,
> >>
> >> thanks for starting this initiative. Personally, I don't think it's
> >> necessarily harmful for the project to add such a new page, however, I
> >> share the same concerns others raised already.
> >>
> >> I understand your motivation that people had issues finding commercial
> >> support, but I am not sure we can address this issue that way. I am also
> >> "worried" (for the lack of a better word) that the page might become
> >> long and unwieldy. In the end, any freelancer/consultant offering Kafka
> >> services would be able to get on the page, so we might get hundreds of
> >> entries, what also makes it impossible for users to find what they are
> >> looking for. Also, the services of different companies might vary
> >> drastically; should users read all these descriptions? I can also
> >> imagine that some companies offer their services only in some
> >> countries/regions making it even harder for user to find what they are
> >> looking for?
> >>
> >> Overall, it sounds more like a search optimization problem, and thus it
> >> seems out-of-scope what we can solve. As I said, I am not strictly
> >> against it, but I just don't see much value either.
> >>
> >>
> >> -Matthias
> >>
> >> On 1/11/24 12:55 PM, Francois Papon wrote:
> >>> Hi Justine,
> >>>
> >>> You're right, Kafka is a part of my business (training, consulting,
> >>> architecture design, sla...) and most of the time, users/customers said
> >>> that it was hard for them to find a commercial support (in France for
> my
> >>> case) after searching on the Kafka website (Google didn't help them).
> >>>
> >>> As an ASF member and PMC of several ASF projects, I know that this kind
> >>> of page exist so this is why I made this proposal for the Kafka project
> >>> because I really think that it can help users.
> >>>
> >>> As you suggest, I can submit a PR to be added on the "powered by" page.
> >>>
> >>> Thanks,
> >>>
> >>> François
> >>>
> >>> On 11/01/2024 21:00, Justine Olshan wrote:
>  Hey François,
> 
>  My point was that the companies on that page use kafka as part of
> their
>  business. If you use Kafka as part of your business feel free to
> submit a
>  PR to be added.
> 
>  I second Chris's point that other projects are not enough to require
>  Kafka
>  having such a support page.
> 
>  Justine
> 
>  On Thu, Jan 11, 2024 at 11:57 AM Chris Egerton <
> fearthecel...@gmail.com>
>  wrote:
> 
> > Hi François,
> >
> > Is it an official policy of the ASF that projects provide a listing
> of
> > commercial support options for themselves? I understand that other
> > projects
> > have chosen to provide one, but this doesn't necessarily imply that
> all
> > projects should do the same, and I can't say I find this point very
> > convincing as a rebuttal to some of the good-faith concerns raised by
> > the
> > PMC and members of the community so far. However, if there's an
> official
> > ASF stance on this topic, then I acknowledge that Apache Kafka should
> > align
> > with it.
> >
> > Best,
> >
> > Chris
> >
> >
> > On Thu, 

Re: Logging in Kafka

2024-01-10 Thread Viktor Somogyi-Vass
Hi Mickael,

Reacting to your points:
1. I think it's somewhat unfortunate that we provide an appender tied to a
chosen logger implementation, and I don't think it should be part of the
project in its current form. However, there is the slf4j 2.x fluent API (a
small sketch below), which may solve our problem and turn
KafkaLog4jAppender into a generic implementation that doesn't depend on a
specific library, given that we can upgrade to slf4j 2. That is worth
considering.
2. Since KIP-1013 we'd move to Java 17 anyway by 4.0, so I don't think it's
a problem if a specific dependency has Java 17 as its minimum supported
version. As I read from your email thread with the log4j2 folks, it'll be
supported for years to come, and log4j3 isn't yet stable. Since we already
use log4j2 in our fork, I'm happy to contribute to this, review PRs or
drive it if needed.
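As a point of reference for point 1, this is a minimal sketch of the slf4j 2.x fluent API in plain slf4j-api, independent of the backend bound at runtime; it only illustrates the API, not a proposed design for the appender:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class FluentLoggingSketch {
    private static final Logger LOG = LoggerFactory.getLogger(FluentLoggingSketch.class);

    public static void main(String[] args) {
        // The fluent builder attaches structured key/value pairs to the event
        // without committing the caller to a specific logging backend.
        LOG.atInfo()
           .addKeyValue("topic", "heartbeats")
           .addKeyValue("partition", 3)
           .setMessage("produced record to {}")
           .addArgument("heartbeats-3")
           .log();
    }
}
{code}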

Thanks,
Viktor

On Wed, Jan 10, 2024 at 3:58 PM Mickael Maison 
wrote:

> I asked for details about the future of log4j2 on the logging user list:
> https://lists.apache.org/thread/6n6bkgwj8tglgdgzz8wxhkx1p1xpwodl
>
> Let's see what they say.
>
> Thanks,
> Mickael
>
> On Wed, Jan 10, 2024 at 3:23 PM Ismael Juma  wrote:
> >
> > Hi Mickael,
> >
> > Thanks for starting the discussion and for summarizing the state of
> play. I
> > agree with you that it would be important to understand how long log4j2
> > will be supported for. An alternative would be sl4fj 2.x and logback.
> >
> > Ismael
> >
> > On Wed, Jan 10, 2024 at 2:17 PM Mickael Maison  >
> > wrote:
> >
> > > Hi,
> > >
> > > Starting a new thread to discuss the current logging situation in
> > > Kafka. I'll restate everything we know but see the [DISCUSS] Road to
> > > Kafka 4.0 if you are interested in what has already been said. [0]
> > >
> > > Currently Kafka uses SLF4J and reload4j as the logging backend. We had
> > > to adopt reload4j in 3.2.0 as log4j was end of life and has a few
> > > security issues.
> > >
> > > In 2020 we adopted KIP-653 to upgrade to log4j2. Due to
> > > incompatibilities in the configuration mechanism with log4j/reload4j
> > > we decided to delay the upgrade to the next major release, Kafka 4.0.
> > >
> > > Kafka also currently provides a log4j appender. In 2022, we adopted
> > > KIP-719 to deprecate it since we wanted to switch to log4j2. At the
> > > time Apache Logging also had a Kafka appender that worked with log4j2.
> > > They since deprecated that appender in log4j2 and it is not part of
> > > log4j3. [1]
> > >
> > > Log4j3 is also nearing release but it seems it will require Java 17.
> > > The website states Java 11 [2] but the artifacts from the latest 3.0.0
> > > beta are built for Java 17. I was not able to find clear maintenance
> > > statement about log4j2 once log4j3 gets released.
> > >
> > > The question is where do we go from here?
> > > We can stick with our plans:
> > > 1. Deprecate the appender in the next 3.x release and plan to remove
> it in
> > > 4.0
> > > 2. Do the necessary work to switch to log4j2 in 4.0
> > > If so we need people to drive these work items. We have PRs for these
> > > with hopefully the bulk of the code but they need
> > > rebasing/completing/reviewing.
> > >
> > > Otherwise we can reconsider KIP-653 and/or KIP-719.
> > >
> > > Assuming log4j2 does not go end of life in the near future (We can
> > > reach out to Apache Logging to clarify that point.), I think it still
> > > makes sense to adopt it. I would also go ahead and deprecate our
> > > appender.
> > >
> > > Thanks,
> > > Mickael
> > >
> > > 0: https://lists.apache.org/thread/q0sz910o1y9mhq159oy16w31d6dzh79f
> > > 1: https://github.com/apache/logging-log4j2/issues/1951
> > > 2: https://logging.apache.org/log4j/3.x/#requirements
> > >
>


Re: [VOTE] KIP-971: Expose replication-offset-lag MirrorMaker2 metric

2024-01-10 Thread Viktor Somogyi-Vass
Hi Elxan,

+1 (binding).

Thanks,
Viktor

On Mon, Jan 8, 2024 at 5:57 PM Dániel Urbán  wrote:

> Hi Elxan,
> +1 (non-binding)
> Thanks for the KIP, this will be a very useful metric for MM!
> Daniel
>
> Elxan Eminov  wrote (on Sun, Jan 7, 2024,
> at 2:17):
>
> > Hi all,
> > Bumping this for visibility
> >
> > On Wed, 3 Jan 2024 at 18:13, Elxan Eminov 
> wrote:
> >
> > > Hi All,
> > > I'd like to initiate a vote for KIP-971.
> > > This KIP is about adding a new metric to the MirrorSourceTask that
> tracks
> > > the offset lag between a source and a target partition.
> > >
> > > KIP link:
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-971%3A+Expose+replication-offset-lag+MirrorMaker2+metric
> > >
> > > Discussion thread:
> > > https://lists.apache.org/thread/gwq9jd75dnm8htmpqkn17bnks6h3wqwp
> > >
> > > Thanks!
> > >
> >
>


Re: Adding reviewers with Github actions

2024-01-03 Thread Viktor Somogyi-Vass
Hey all,

So I finally had some time for some lower-priority items since everyone is
still on holiday. I've implemented this GitHub Action check in a PR:
https://github.com/apache/kafka/pull/15115
It uses the GitHub API to fetch the name and email of the user who
submitted the review, based on their username.
If this is still something the community wants, then please let me know
with a review :).
I've tested it on my own fork here:
https://github.com/viktorsomogyi/kafka/pull/4.
I think you can also try it by raising a PR against the trunk of my fork
(although I don't know whether I need to add you as a collaborator; if so,
I'll happily do that for this test).
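For anyone curious before opening the PR, the lookup the action performs is roughly the following (a standalone sketch against the public GitHub REST endpoint, not the code from the PR; note that the "email" field is only populated when the user has made it public on their profile):

{code:java}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ReviewerLookupSketch {
    public static void main(String[] args) throws Exception {
        String username = "viktorsomogyi";
        // GET /users/{username} returns a JSON document with "name" and "email" fields
        HttpRequest request = HttpRequest.newBuilder(
                        URI.create("https://api.github.com/users/" + username))
                .header("Accept", "application/vnd.github+json")
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
{code}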

Thanks,
Viktor

On Wed, May 3, 2023 at 9:41 AM Viktor Somogyi-Vass <
viktor.somo...@cloudera.com> wrote:

> Yes, perhaps this can be used in the github action too, I think this is a
> very useful tool. Sadly I couldn't get to the github action but hopefully I
> will get there soon.
>
> On Fri, Apr 28, 2023 at 8:48 AM David Jacot 
> wrote:
>
>> Thanks, David. This is a nice addition!
>>
>> Coming back to the original proposal of using github actions, it may be
>> possible to run David's script automatically. For instance, we could
>> trigger an action which pulls the folks who have approved the PR and feed
>> the script when a comment with `reviewers` is posted. Then the action
>> would
>> post a comment with the "Reviewers: ". This way, we could do
>> everything from within the PR.
>>
>> Cheers,
>> David
>>
>> On Thu, Apr 27, 2023 at 8:35 PM David Arthur
>>  wrote:
>>
>> > I just merged the "reviewers" script I wrote a while ago:
>> > https://github.com/apache/kafka/pull/11096
>> >
>> > It works by finding previous occurrences of "Reviewers: ...", so it only
>> > works for people who have reviewed something before. I do suspect this
>> is
>> > largely the common case.
>> >
>> > E.g., searching for "Ismael" gives:
>> >
>> > Possible matches (in order of most recent):
>> > [1] Ismael Juma ism...@juma.me.uk (1514)
>> > [2] Ismael Juma ij...@apache.org (3)
>> > [3] Ismael Juma mli...@juma.me.uk (4)
>> > [4] Ismael Juma ism...@confluent.io (19)
>> > [5] Ismael Juma git...@juma.me.uk (7)
>> >
>> > it shows them in order of most recently occurring along with the number
>> of
>> > occurrences. Now that it's merged, it should be easier for folks to try
>> it
>> > out.
>> >
>> > Cheers,
>> > David
>> >
>> > On Thu, Apr 20, 2023 at 1:02 PM Justine Olshan
>> > 
>> > wrote:
>> >
>> > > I've tried the script, but it's not quite complete.
>> > > I've had issues finding folks -- if they haven't reviewed in kafka, we
>> > can
>> > > not find an email for them. I also had some issues with finding folks
>> who
>> > > had reviewed before.
>> > >
>> > > Right now, my strategy is to use GitHub to search previous commits for
>> > > folks' emails, but that isn't the most optimal solution -- especially
>> if
>> > > the reviewer has no public email.
>> > > I do think it is useful to have in the commit though, so if anyone has
>> > some
>> > > ideas on how to improve, I'd be happy to hear.
>> > >
>> > > Justine
>> > >
>> > > On Wed, Apr 19, 2023 at 6:53 AM Ismael Juma 
>> wrote:
>> > >
>> > > > It's a lot more convenient to have it in the commit than having to
>> > follow
>> > > > links, etc.
>> > > >
>> > > > David Arthur also wrote a script to help with this step, I believe.
>> > > >
>> > > > Ismael
>> > > >
>> > > > On Tue, Apr 18, 2023, 9:29 AM Divij Vaidya > >
>> > > > wrote:
>> > > >
>> > > > > Do we even need a manual attribution for a reviewer in the commit
>> > > > message?
>> > > > > GitHub automatically marks the folks as "reviewers" who have used
>> the
>> > > > > "review-changes" button on the top left corner and left feedback.
>> > > GitHub
>> > > > > also has searchability for such reviews done by a particular
>> person
>> > > using
>> > > > > the following link:
>> > > > >
>> > > > > https://github.com/search?q=is%3

Re: [VOTE] KIP-1013: Drop broker and tools support for Java 11 in Kafka 4.0 (deprecate in 3.7)

2024-01-03 Thread Viktor Somogyi-Vass
Hi Ismael,

I think it's important to make this change, the youtube video you posted on
the discussion thread makes very good arguments and so does the KIP. Java 8
is almost a liability and Java 11 already has smaller (and decreasing)
adoption than 17. It's a +1 (binding) from me.

Thanks,
Viktor

On Wed, Jan 3, 2024 at 7:00 AM Kamal Chandraprakash <
kamal.chandraprak...@gmail.com> wrote:

> +1 (non-binding).
>
> On Wed, Jan 3, 2024 at 8:01 AM Satish Duggana 
> wrote:
>
> > Thanks Ismael for the proposal.
> >
> > Adopting JDK 17 enhances developer productivity and has reached a
> > level of maturity that has led to its adoption by several other major
> > projects, signifying its reliability and effectiveness.
> >
> > +1 (binding)
> >
> >
> > ~Satish.
> >
> > On Wed, 3 Jan 2024 at 06:59, Justine Olshan
> >  wrote:
> > >
> > > Thanks for driving this.
> > >
> > > +1 (binding) from me.
> > >
> > > Justine
> > >
> > > On Tue, Jan 2, 2024 at 4:30 PM Ismael Juma  wrote:
> > >
> > > > Hi all,
> > > >
> > > > I would like to start a vote on KIP-1013.
> > > >
> > > > As stated in the discussion thread, this KIP was proposed after the
> KIP
> > > > freeze for Apache Kafka 3.7, but it is purely a documentation update
> > (if we
> > > > decide to adopt it) and I believe it would serve our users best if we
> > > > communicate the deprecation for removal sooner (i.e. 3.7) rather than
> > later
> > > > (i.e. 3.8).
> > > >
> > > > Please take a look and cast your vote.
> > > >
> > > > Link:
> > > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=284789510
> > > >
> > > > Ismael
> > > >
> >
>


Re: [ANNOUNCE] New Kafka PMC Member: Divij Vaidya

2023-12-29 Thread Viktor Somogyi-Vass
Congrats Divij, well deserved!

On Fri, Dec 29, 2023, 09:36 Lucas Brutschy 
wrote:

> Congratulations, Divij!
>
> On Fri, Dec 29, 2023 at 1:32 AM Colin McCabe  wrote:
> >
> > Congratulations, Divij!
> >
> > best,
> > Colin
> >
> > On Thu, Dec 28, 2023, at 11:38, Bruno Cadonna wrote:
> > > Congratulations Divij! Well deserved!
> > >
> > > Best,
> > > Bruno
> > >
> > > On 12/27/23 12:45 PM, Luke Chen wrote:
> > >> Hi, Everyone,
> > >>
> > >> Divij has been a Kafka committer since June, 2023. He has remained
> very
> > >> active and instructive in the community since becoming a committer.
> It's my
> > >> pleasure to announce that Divij is now a member of Kafka PMC.
> > >>
> > >> Congratulations Divij!
> > >>
> > >> Luke
> > >> on behalf of Apache Kafka PMC
> > >>
>


Re: Kafka trunk test & build stability

2023-12-19 Thread Viktor Somogyi-Vass
Hey Stan,

I also experienced this; some of the tests indeed take a long time. As an
immediate workaround, do you think we could enforce a global timeout of,
let's say, 10 minutes?
I don't know whether these tests take so long because of some race
condition or because of a lack of resources, and we would probably need to
investigate them individually, but a test case shouldn't run for more than
a couple of minutes.
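For illustration, one way to express such a global per-test timeout is the JUnit 5 default-timeout configuration parameter; this is just a sketch of the mechanism, and the exact value and scope would need discussion:

{code}
# junit-platform.properties (or passed as a system property from the Gradle build):
# applies to every testable method unless overridden with @Timeout
junit.jupiter.execution.timeout.default = 10 m
{code}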

As a bit more of a strategic solution, I can imagine running only the tests
of the changed module and of the modules that functionally depend on it.
For instance, we would still likely need to run all tests for a client or
core change (as they can affect other parts of the project), but we may
only need to run the Connect and MM2 tests if the change is in Connect, or
only the MM2 tests if the change is in MM2.

Best,
Viktor

On Tue, Dec 19, 2023 at 3:33 PM Stanislav Kozlovski
 wrote:

> Hey everybody,
> I've heard various complaints that build times in trunk are taking too
> long, some taking as much as 8 hours (the timeout) - and this is slowing us
> down from being able to meet the code freeze deadline for 3.7.
>
> I took it upon myself to gather up some data in Gradle Enterprise to see if
> there are any outlier tests that are causing this slowness. Turns out there
> are a few, in this particular build -
> https://ge.apache.org/s/un2hv7n6j374k/
> - which took 10 hours and 29 minutes in total.
>
> I have compiled the tests that took a disproportionately large amount of
> time (20m+), alongside their time, error message and a link to their full
> log output here -
> https://gist.github.com/stanislavkozlovski/8959f7ee59434f774841f4ae2f5228c2
>
> It includes failures from core, streams, storage and clients.
> Interestingly, some other tests that don't fail also take a long time in
> what is apparently the test harness framework. See the gist for more
> information.
>
> I am starting this thread with the intention of getting the discussion
> started and brainstorming what we can do to get the build times back under
> control.
>
>
> --
> Best,
> Stanislav
>


Re: Apache Kafka 3.7.0 Release

2023-12-19 Thread Viktor Somogyi-Vass
Hi all,

I was wondering what people think about extending the code freeze date to
early January.
The reason I'm asking is that there are still a couple of testing gaps in
JBOD (https://issues.apache.org/jira/browse/KAFKA-14127), which I think it
is very important to close to ensure a high-quality release (after all,
this is supposed to be the last 3.x). Secondly, the year-end holidays are
coming fast for many people, which means we'll likely have fewer people
working on testing and validation. In my opinion it would strengthen the
release if we could spend a week in January to really finish off JBOD and
then do a two-week stabilization period.

What do you all think?

Best,
Viktor

On Tue, Dec 12, 2023 at 2:59 PM Stanislav Kozlovski
 wrote:

> Hey!
>
> Just notifying everybody on this thread that I have cut the 3.7 branch and
> sent a new email thread titled "New Release Branch 3.7" to the mailing list
> .
>
> Best,
> Stanislav
>
> On Wed, Dec 6, 2023 at 11:10 AM Stanislav Kozlovski <
> stanis...@confluent.io>
> wrote:
>
> > Hello again,
> >
> > Time is flying by! It is feature freeze day!
> >
> > By today, we expect to have major features merged and to begin working on
> > their stabilisation. Minor features should have PRs.
> >
> > I am planning to cut the release branch soon - on Monday EU daytime. When
> > I do that, I will create a new e-mail thread titled "New release branch
> > 3.7.0" to notify you, so be on the lookout for that. I will also notify
> > this thread.
> >
> > Thank you for your contributions. Let's get this release shipped!
> >
> > Best,
> > Stanislav
> >
> >
> > On Fri, Nov 24, 2023 at 6:11 PM Stanislav Kozlovski <
> > stanis...@confluent.io> wrote:
> >
> >> Hey all,
> >>
> >> The KIP Freeze has passed. I count 31 KIPs that will be going into the
> >> 3.7 Release. Thank you all for your hard work!
> >>
> >> They are the following (some of these were accepted in previous releases
> >> and have minor parts going out, some targeting a Preview release and the
> >> rest being fully released as regular.):
> >>  - KIP-1000: List Client Metrics Configuration Resources
> >>  - KIP-1001: Add CurrentControllerId Metric
> >>  - KIP-405: Kafka Tiered Storage
> >>  - KIP-580: Exponential Backoff for Kafka Clients
> >>  - KIP-714: Client metrics and observability
> >>  - KIP-770: Replace "buffered.records.per.partition" &
> >> "cache.max.bytes.buffering" with
> >> "{statestore.cache}/{input.buffer}.max.bytes"
> >>  - KIP-848: The Next Generation of the Consumer Rebalance Protocol
> >>  - KIP-858: Handle JBOD broker disk failure in KRaft
> >>  - KIP-890: Transactions Server-Side Defense
> >>  - KIP-892: Transactional StateStores
> >>  - KIP-896: Remove old client protocol API versions in Kafka 4.0 -
> >> metrics/request log changes to identify deprecated apis
> >>  - KIP-925: Rack aware task assignment in Kafka Streams
> >>  - KIP-938: Add more metrics for measuring KRaft performance
> >>  - KIP-951 - Leader discovery optimizations for the client
> >>  - KIP-954: expand default DSL store configuration to custom types
> >>  - KIP-959: Add BooleanConverter to Kafka Connect
> >>  - KIP-960: Single-key single-timestamp IQv2 for state stores
> >>  - KIP-963: Additional metrics in Tiered Storage
> >>  - KIP-968: Support single-key_multi-timestamp Interactive Queries
> (IQv2)
> >> for Versioned State Stores
> >>  - KIP-970: Deprecate and remove Connect's redundant task configurations
> >> endpoint
> >>  - KIP-975: Docker Image for Apache Kafka
> >>  - KIP-976: Cluster-wide dynamic log adjustment for Kafka Connect
> >>  - KIP-978: Allow dynamic reloading of certificates with different DN /
> >> SANs
> >>  - KIP-979: Allow independently stop KRaft processes
> >>  - KIP-980: Allow creating connectors in a stopped state
> >>  - KIP-985: Add reverseRange and reverseAll query over kv-store in IQv2
> >>  - KIP-988: Streams Standby Update Listener
> >>  - KIP-992: Proposal to introduce IQv2 Query Types: TimestampedKeyQuery
> >> and TimestampedRangeQuery
> >>  - KIP-998: Give ProducerConfig(props, doLog) constructor protected
> access
> >>
> >> Notable KIPs that didn't make the Freeze were KIP-977 - it only got 2/3
> >> votes.
> >>
> >> For the full list and latest source of truth, refer to the Release Plan
> >> 3.7.0 Document
> >> .
> >>
> >> Thanks for your contributions once again!
> >> Best,
> >> Stan
> >>
> >>
> >> On Thu, Nov 23, 2023 at 2:27 PM Nick Telford 
> >> wrote:
> >>
> >>> Hi Stan,
> >>>
> >>> I'd like to propose including KIP-892 in the 3.7 release. The KIP has
> >>> been
> >>> accepted and I'm just working on rebasing the implementation against
> >>> trunk
> >>> before I open a PR.
> >>>
> >>> Regards,
> >>> Nick
> >>>
> >>> On Tue, 21 Nov 2023 at 11:27, Mayank Shekhar Narula <
> >>> mayanks.nar...@gmail.com> wrote:
> >>>
> >>> > Hi Stan
> >>> >
> >>> > Can you include KIP-951 to the 

[jira] [Created] (KAFKA-15992) Make MM2 heartbeats topic name configurable

2023-12-11 Thread Viktor Somogyi-Vass (Jira)
Viktor Somogyi-Vass created KAFKA-15992:
---

 Summary: Make MM2 heartbeats topic name configurable
 Key: KAFKA-15992
 URL: https://issues.apache.org/jira/browse/KAFKA-15992
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 3.7.0
Reporter: Viktor Somogyi-Vass
Assignee: Viktor Somogyi-Vass


With DefaultReplicationPolicy, the heartbeats topic name is hard-coded. 
Instead, this should be configurable, so users can avoid collisions with the 
"heartbeats" topics of other systems.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-971 Expose replication-offset-lag MirrorMaker2 metric

2023-12-04 Thread Viktor Somogyi-Vass
Elkhan, do you want to propose a vote for this KIP or do you have any other
ideas to include?

On Tue, Oct 17, 2023 at 2:47 PM Viktor Somogyi-Vass <
viktor.somo...@cloudera.com> wrote:

> Hi hudeqi,
>
> Good thinking about the OOM and resource leaks.
> The "update.replication.lag.interval.time" I think is almost good but we
> should include that it is about a metric (like
> "replication.lag.interval.metric.update.time") so it's obvious without the
> docs too.
>
> Thanks,
> Viktor
>
> On Sat, Oct 7, 2023 at 8:53 AM hudeqi <16120...@bjtu.edu.cn> wrote:
>
>> Hi, Elkhan, Viktor.
>>
>> I took a look at the updated KIP. I think Viktor mentioned that he did
>> not see the relevant configuration, which refers to "(Optional) -
>> MirrorConnectorConfig - a configuration to control the poll interval for
>> the Consumer.endOffsets() call at LEO acquisition mentioned below". I think
>> we can introduce the name of this configuration here, such as
>> "update.replication.lag.interval.time", which means that in a separate
>> periodic scheduling thread, the lag is calculated by this interval time
>> through "consumer.endOffsets - LRO". In addition, for the LRO cache, you
>> can add an expired time attribute for each partition. If this expired
>> interval time is exceeded before next updated, the LRO of this partition
>> can be removed from the cache to avoid possible leaks and OOM.
>>
>> best,
>> hudeqi
>
>


Re: [VOTE] KIP-858: Handle JBOD broker disk failure in KRaft

2023-12-04 Thread Viktor Somogyi-Vass
Hi Igor,

I'm just reading through your KIP and noticed that it doesn't say anything
about ACLs for the new AssignReplicasToDirs API. Would it make sense to
authorize these requests the way other inter-broker protocol calls are
usually authorized, that is, with ClusterAction on the Cluster resource?
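For readers less familiar with that convention, this is roughly what such an ACL looks like through the Admin API; it only illustrates the existing ClusterAction-on-Cluster pattern, not anything defined by the KIP:

{code:java}
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.common.acl.AccessControlEntry;
import org.apache.kafka.common.acl.AclBinding;
import org.apache.kafka.common.acl.AclOperation;
import org.apache.kafka.common.acl.AclPermissionType;
import org.apache.kafka.common.resource.PatternType;
import org.apache.kafka.common.resource.ResourcePattern;
import org.apache.kafka.common.resource.ResourceType;

import java.util.List;
import java.util.Map;

public class ClusterActionAclSketch {
    public static void main(String[] args) throws Exception {
        try (Admin admin = Admin.create(Map.<String, Object>of("bootstrap.servers", "localhost:9092"))) {
            // Allow the broker principal to perform ClusterAction on the cluster resource,
            // the usual authorization model for inter-broker APIs.
            AclBinding binding = new AclBinding(
                    new ResourcePattern(ResourceType.CLUSTER, "kafka-cluster", PatternType.LITERAL),
                    new AccessControlEntry("User:broker", "*", AclOperation.CLUSTER_ACTION, AclPermissionType.ALLOW));
            admin.createAcls(List.of(binding)).all().get();
        }
    }
}
{code}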

Thanks,
Viktor

On Tue, Nov 28, 2023 at 4:18 PM Igor Soarez  wrote:

> Hi everyone,
>
> There have been a number of further changes.
>
> I have updated the KIP to reflect them, but for reference,
> I'd also like to update this thread with a summary.
>
> 1. The reserved Uuids and their names for directories have changed.
> The first 100 Uuids are reserved for future use.
>
> 2. During the ZooKeeper to KRaft migration, if a broker still
> configured in ZK mode has any log directory offline, it will
> shutdown and refuse to startup. The expectation is that this
> escalation from a log directory's unavailability to the entire
> broker's unavailability will be temporary, limited to the migration
> period. And that the simplification will help develop, test and
> support this feature.
>
> 3. The representation of replica directories in metadata records is
> no longer tightly coupled with the respective broker IDs. Instead of
> replacing the int[] replicas field in PartitionRecord and
> PartitionChangeRecord, we are instead introducing a new field Uuid[]
> named directories, that should be kept in the same size and order as
> the existing replicas field. See
> https://github.com/apache/kafka/pull/14290 for further details.
>
> 4. Assignments that are respective to failed log directories are no
> longer prioritized. Previously the KIP proposed prioritizing
> assignments that related to failed log directories, aiming to
> synchronize the necessary replica to directory mapping on the
> controller before handling the directory failure. Recently, we have
> decided to remove any prioritization of these assignments, as
> delaying the reporting of directory failures is considered
> detrimental for any reason
>
> 5. Uuids for log directories that failed after startup are always
> included in every broker heartbeat request. Previously the KIP
> proposed sending Uuids for failed directories in the broker
> heartbeat until a successful reply is received. However, due to the
> overload mode handling of broker heartbeats, because broker
> heartbeat requests may receive a successful response without being
> fully processed, it is preferable to always send the cumulative list
> of directory IDs that have failed since startup. In the future, this
> list can be trimmed to remove directory IDs that are seen to be
> removed from the broker registration, as the broker catches up with
> metadata.
>
> 6. The proposal to shutdown the broker log.dir.failure.timeout.ms
> after not being able to communicate that some log directory is
> offline is now more of an open question. It's unclear if this will
> actually be necessary.
>
> Please share if you have any thoughts.
>
> Best,
>
> --
> Igor
>
>
> On Tue, Oct 10, 2023, at 5:28 AM, Igor Soarez wrote:
> > Hi Colin,
> >
> > Thanks for the renaming suggestions. UNASSIGNED is better then
> > UNKNOWN, MIGRATING is also better than SELECTED and I don't
> > expect it to be used outside of the migration phase.
> > LOST can also work instead of OFFLINE, but I think there will
> > be other uses for this value outside of the migration, like
> > in the intra-broker replica movement edge cases described in the KIP.
> > I've updated the KIP and also filed a tiny PR with your suggestion,
> > except I'm keeping the description of LOST more broad than just
> > scoped to the migration.
> >
> >   https://github.com/apache/kafka/pull/14517
> >
> >
> > The KIP already proposes that the broker does not want to unfence
> > until it has confirmed all the assignments are communicated
> > with the controller. And you're right about the interaction
> > with ReplicaManager, we definitely don't want RPCs coming
> > out of there. My intention is to introduce a new manager, as you
> > suggest, with its own event loop, that batches and prioritizes
> > assignment and dir failure events, called DirectoryEventManager.
> > There's already an open PR, perhaps you could have a look?
> >
> >   KAFKA-15357: Aggregate and propagate assignments and logdir failures
> >   https://github.com/apache/kafka/pull/14369
> >
> >
> > > With regard to the failure detection "gap" during hybrid mode: the
> > > kraft controller sends a full LeaderAndIsrRequest to the brokers
> > > that are in hybrid mode, right? And there is a per-partition
> > > response as well. Right now, we don't pay attention to the error
> > > codes sent back in the response. But we could. Any replica with an
> > > error could be transitioned from MIGRATING -> LOST, right? That
> > > would close the failure detection gap.
> >
> > Almost. The missing bit is that the controller would also need to
> > watch the /log_dir_event_notification znode, and on any change
> > 

Re: [VOTE] KIP-978: Allow dynamic reloading of certificates with different DN / SANs

2023-11-03 Thread Viktor Somogyi-Vass
+1 (binding)

Thank you for working on this, feel free to add me to the code review too.

On Fri, Oct 27, 2023 at 2:54 PM Mickael Maison 
wrote:

> Hi,
>
> +1 (binding)
> Thanks for the KIP
>
> Mickael
>
> On Wed, Oct 25, 2023 at 5:34 PM Federico Valeri 
> wrote:
> >
> > Hi Jakub, thanks for this KIP.
> >
> > +1 (non binding)
> >
> > Thanks
> > Fede
> >
> > On Wed, Oct 25, 2023 at 4:45 PM Manikumar 
> wrote:
> > >
> > > Hi,
> > >
> > > Thanks for the KIP.
> > >
> > > +1 (binding)
> > >
> > >
> > > Thanks.
> > >
> > > On Wed, Oct 25, 2023 at 1:37 AM Jakub Scholz  wrote:
> > >
> > > > Hi all,
> > > >
> > > > I would like to start a vote for the KIP-978: Allow dynamic
> reloading of
> > > > certificates with different DN / SANs
> > > > <
> > > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263429128
> > > > >
> > > > .
> > > >
> > > > Thanks & Regards
> > > > Jakub
> > > >
>


Re: [ANNOUNCE] New Kafka PMC Member: Satish Duggana

2023-10-27 Thread Viktor Somogyi-Vass
Congrats Satish!

On Fri, Oct 27, 2023, 18:48 Ivan Yurchenko  wrote:

> Congrats Satish!
>
> Ivan
>
> On Fri, Oct 27, 2023, at 19:02, Kamal Chandraprakash wrote:
> > Congratulations Satish!
> >
> > On Fri, Oct 27, 2023, 21:10 Jorge Esteban Quilcate Otoya <
> > quilcate.jo...@gmail.com> wrote:
> >
> > > Congratulations Satish!!
> > >
> > > On Fri, 27 Oct 2023 at 18:38, Mickael Maison  >
> > > wrote:
> > >
> > > > Congratulations Satish!
> > > >
> > > > On Fri, Oct 27, 2023 at 5:18 PM Lucas Brutschy
> > > >  wrote:
> > > > >
> > > > > Congrats!
> > > > >
> > > > > On Fri, Oct 27, 2023 at 5:06 PM Manikumar <
> manikumar.re...@gmail.com>
> > > > wrote:
> > > > > >
> > > > > > Congrats!
> > > > > >
> > > > > > On Fri, Oct 27, 2023 at 8:35 PM Jun Rao  >
> > > > wrote:
> > > > > >
> > > > > > > Hi, Everyone,
> > > > > > >
> > > > > > > Satish Duggana has been a Kafka committer since 2022. He has
> been
> > > > very
> > > > > > > instrumental to the community since becoming a committer. It's
> my
> > > > pleasure
> > > > > > > to announce that Satish is now a member of Kafka PMC.
> > > > > > >
> > > > > > > Congratulations Satish!
> > > > > > >
> > > > > > > Jun
> > > > > > > on behalf of Apache Kafka PMC
> > > > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-971 Expose replication-offset-lag MirrorMaker2 metric

2023-10-17 Thread Viktor Somogyi-Vass
Hi hudeqi,

Good thinking about the OOM and resource leaks.
The "update.replication.lag.interval.time" I think is almost good but we
should include that it is about a metric (like
"replication.lag.interval.metric.update.time") so it's obvious without the
docs too.

Thanks,
Viktor

On Sat, Oct 7, 2023 at 8:53 AM hudeqi <16120...@bjtu.edu.cn> wrote:

> Hi, Elkhan, Viktor.
>
> I took a look at the updated KIP. I think Viktor mentioned that he did not
> see the relevant configuration, which refers to "(Optional) -
> MirrorConnectorConfig - a configuration to control the poll interval for
> the Consumer.endOffsets() call at LEO acquisition mentioned below". I think
> we can introduce the name of this configuration here, such as
> "update.replication.lag.interval.time", which means that in a separate
> periodic scheduling thread, the lag is calculated by this interval time
> through "consumer.endOffsets - LRO". In addition, for the LRO cache, you
> can add an expired time attribute for each partition. If this expired
> interval time is exceeded before next updated, the LRO of this partition
> can be removed from the cache to avoid possible leaks and OOM.
>
> best,
> hudeqi
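For readers following this thread, the calculation being discussed is roughly: for each replicated partition, lag = source log end offset (from Consumer.endOffsets) minus the last replicated offset (LRO). The sketch below is only an illustration; lastReplicatedOffsets is a placeholder name, not a field from the KIP or MirrorSourceTask:

{code:java}
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.TopicPartition;

import java.util.HashMap;
import java.util.Map;

public class ReplicationLagSketch {
    // replication-offset-lag = source end offset (LEO) - last replicated offset (LRO)
    public static Map<TopicPartition, Long> computeLag(Consumer<byte[], byte[]> sourceConsumer,
                                                       Map<TopicPartition, Long> lastReplicatedOffsets) {
        Map<TopicPartition, Long> endOffsets = sourceConsumer.endOffsets(lastReplicatedOffsets.keySet());
        Map<TopicPartition, Long> lag = new HashMap<>();
        endOffsets.forEach((tp, leo) ->
                lag.put(tp, Math.max(0L, leo - lastReplicatedOffsets.getOrDefault(tp, leo))));
        return lag;
    }
}
{code}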


Re: [DISCUSS] KIP-978: Allow dynamic reloading of certificates with different DN / SANs

2023-10-17 Thread Viktor Somogyi-Vass
Hi Jakub,

I think the KIP looks good overall, and I have one question for now.
Would it make sense to split the config you want to introduce
(ssl.allow.dn.and.san.changes) into two configs? Would users want to enable
one but not the other?

Thanks,
Viktor

On Wed, Sep 13, 2023 at 10:00 PM Jakub Scholz  wrote:

> Hi all,
>
> I would like to start the discussion about the KIP-978: Allow dynamic
> reloading of certificates with different DN / SANs
> <
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263429128
> >.
> It proposes adding an option to disable the current validation of the DN
> and SANs when dynamically changing the keystore. Please have a look and let
> me know your thoughts ...
>
> Thanks & Regards
> Jakub
>


Re: Re: [DISCUSS] KIP-986: Cross-Cluster Replication

2023-10-17 Thread Viktor Somogyi-Vass
Hi Greg,

> I don't think I understand what you mean here. Are you suggesting an
alternative to the Admin API? An external project could certainly
build such a component with the Admin API.

So I was thinking of something more complex than the Admin API: an external
service that can instruct clusters (using the Admin API) to start or stop
replication for certain partitions, list their replication flows, throttle
them, and so on. With this you wouldn't control just one cluster; you could
register your clusters and control them in a centralized fashion. It would
be similar to what the REST server is to Connect, except that here you
wouldn't create connectors but replication flows.

I've updated the KIP with my remote log storage thoughts (in data
semantics).

Best,
Viktor

On Sat, Oct 7, 2023 at 7:22 PM Greg Harris 
wrote:

> Hello hudeqi,
>
> I apologize if the KIP and discussion have diverged, as I've been
> trying to add detail rather than propose changes.
>
> > Why can't we use the follower fetch protocol?
>
> What you've described sounds like a very reasonable implementation of
> CCR. I purposely have not specified any implementation details so far
> and have been focusing only on the user-facing semantics of the
> feature. You are of course welcome to add details of how the
> follower-fetch implementation would work to the KIP.
>
> I think maybe this wording in the KIP was ambiguous: "Are not eligible
> for fetch-from-follower on the source cluster". I clarified the
> justification for this earlier in Tom's point D3. But to make the
> statement itself more clear:
>
> Consumers on the source cluster will not see a cross-cluster leader or
> followers as valid replicas to fetch from for load-sharing or latency
> optimization. Consumers on the target cluster will see the
> cross-cluster followers as valid replicas to fetch from, as they will
> appear as normal replicas. This was only meant to describe the
> relationship of the remote replicas with Consumers, not with each
> other.
>
> I hope this is more clear.
> Greg
>
> On Sat, Oct 7, 2023 at 1:38 AM hudeqi <16120...@bjtu.edu.cn> wrote:
> >
> > Hi, Greg.
> >
> > After reading this KIP and your discussion, I feel that it is very
> divergent. I can only start from one of them:
> > Why can't we use the follower fetch protocol? The leader of the target
> cluster topic partition can be treated as the follower of the source
> cluster topic partition leader, and the fetched data is directly appended
> to the local log (the remote fetch thread is inherited to the follower
> fetch thread, thereby retaining the offset of the log), so that consumer/
> producer client can be omitted. Of course, this is just data replication. I
> may have to think more about group offset/acl/config replication.
> >
> > best,
> > hudeqi
>


Re: [DISCUSS] KIP-986: Cross-Cluster Replication

2023-10-05 Thread Viktor Somogyi-Vass
Hi Greg,

Sure, I'll expand it with my thoughts. Is it fine if I add it to the KIP
and update this discussion?

Another thing that crossed my mind is that in MM2 you can handle configs
and replication flows in a central place because it is a separate
component. I think that for use cases with many replication flows this
aspect can be useful (much like Kafka itself is for microservices). For CCR
too it could be useful to have some kind of separate service that collects
this information. It could also serve as an admin endpoint (Swagger maybe?)
for managing flows and configuration. With this you could instruct clusters
to create/pause/delete replication flows. What do you think?

Thanks,
Viktor



On Wed, Oct 4, 2023 at 6:20 PM Greg Harris 
wrote:

> Hey Viktor,
>
> Thanks for thinking about Tiered Storage. I'm not so familiar there,
> so if you could add some of your expectations about how the two
> features will interact, I would appreciate that.
>
> It appears to me that follower-fetch-from-remote is a significant
> optimization within TS, and so similar optimizations to support
> cross-cluster-replicate-from-remote and out-of-band remote replication
> could also be desirable.
> I think we can explore the idea further, and make sure that CCR is
> extensible to tiered topics if it doesn't make it into the initial
> implementation.
>
> Thanks!
> Greg
>
> On Wed, Oct 4, 2023 at 6:13 AM Viktor Somogyi-Vass
>  wrote:
> >
> > Hi Greg,
> >
> > Thanks for the answers. I think they all make sense.
> >
> > Another point I realized last evening is that now that tiered storage
> (TS)
> > is available, it might complicate things with CCR. What I'm thinking of
> is
> > that if you have multiple clusters in multiple regions, enabling the
> object
> > storage's replication between zones could be much more cost efficient
> than
> > replicating local+remote offsets through Kafka. You'd only need to copy
> > local segments over and remote partition replication would be done by the
> > remote layer. Or the user could simply choose to not replicate remote
> > segments between regions but instead just reference them (so that the
> > backup cluster's remote offsets point to the original region). These
> > options however likely require bigger coordination between clusters than
> in
> > pre-TS Kafka. Do you think we should take this into consideration in the
> > design and in the UX?
> >
> > Thanks,
> > Viktor
> >
> > On Tue, Oct 3, 2023 at 6:30 PM Greg Harris  >
> > wrote:
> >
> > > Hi Viktor,
> > >
> > > Thanks for your questions! I agree, replication is very fundamental in
> > > Kafka, so it's been implemented in many different ways by different
> > > people. I hope that this is the last implementation we'll need, but
> > > every software engineer says that :)
> > >
> > > GT-1: I think as this KIP is very focused on the UX of the feature,
> > > that user stories are appropriate to include. I think it isn't
> > > necessary to explain how the different applications are accomplished
> > > with MM2 or other solutions, but describing what they will look like
> > > after this KIP would be a wonderful addition. +1
> > >
> > > MM2-1: I think that replacing the consumer is insufficient, as we need
> > > a more expressive producer as well. This is not possible within the
> > > design constraints of MM2 as a Connector, as MM2 uses the
> > > connect-managed producer. This could be implemented in MM3 as a new
> > > process that can use more expressive "internal clients", but then
> > > we've thrown away the Connect runtime that made MM2 easier to run for
> > > some users.
> > > MM2-2: This is technically possible, but sounds operationally
> hazardous to
> > > me.
> > > MM2-3: From the user perspective, I believe that CCR can be made more
> > > simple to use and operate than MM2, while providing better guarantees.
> > > From the implementation standpoint, I think that CCR will be
> > > significantly more complex, as the architecture of MM2 leverages a lot
> > > of the Connect infrastructure.
> > >
> > > LaK-1: Yes, I think you understand what I was going for.
> > > LaK-2: I don't think that this is a user experience that we could add
> > > to CCR without changing the Kafka clients to be aware of both clusters
> > > concurrently. In order to redirect clients away from a failed cluster
> > > with a metadata refresh, the cluster that they're currently connected
> > > to must give them that data. But be

Re: [DISCUSS] KIP-986: Cross-Cluster Replication

2023-10-04 Thread Viktor Somogyi-Vass
Hi Greg,

Thanks for the answers. I think they all make sense.

Another point I realized last evening is that now that tiered storage (TS)
is available, it might complicate things with CCR. What I'm thinking of is
that if you have multiple clusters in multiple regions, enabling the object
storage's replication between zones could be much more cost efficient than
replicating local+remote offsets through Kafka. You'd only need to copy
local segments over and remote partition replication would be done by the
remote layer. Or the user could simply choose to not replicate remote
segments between regions but instead just reference them (so that the
backup cluster's remote offsets point to the original region). These
options however likely require bigger coordination between clusters than in
pre-TS Kafka. Do you think we should take this into consideration in the
design and in the UX?

Thanks,
Viktor

On Tue, Oct 3, 2023 at 6:30 PM Greg Harris 
wrote:

> Hi Viktor,
>
> Thanks for your questions! I agree, replication is very fundamental in
> Kafka, so it's been implemented in many different ways by different
> people. I hope that this is the last implementation we'll need, but
> every software engineer says that :)
>
> GT-1: I think as this KIP is very focused on the UX of the feature,
> that user stories are appropriate to include. I think it isn't
> necessary to explain how the different applications are accomplished
> with MM2 or other solutions, but describing what they will look like
> after this KIP would be a wonderful addition. +1
>
> MM2-1: I think that replacing the consumer is insufficient, as we need
> a more expressive producer as well. This is not possible within the
> design constraints of MM2 as a Connector, as MM2 uses the
> connect-managed producer. This could be implemented in MM3 as a new
> process that can use more expressive "internal clients", but then
> we've thrown away the Connect runtime that made MM2 easier to run for
> some users.
> MM2-2: This is technically possible, but sounds operationally hazardous to
> me.
> MM2-3: From the user perspective, I believe that CCR can be made more
> simple to use and operate than MM2, while providing better guarantees.
> From the implementation standpoint, I think that CCR will be
> significantly more complex, as the architecture of MM2 leverages a lot
> of the Connect infrastructure.
>
> LaK-1: Yes, I think you understand what I was going for.
> LaK-2: I don't think that this is a user experience that we could add
> to CCR without changing the Kafka clients to be aware of both clusters
> concurrently. In order to redirect clients away from a failed cluster
> with a metadata refresh, the cluster that they're currently connected
> to must give them that data. But because the cluster failed, that
> refresh will not be reliable. With a proxy between the client and
> Kafka, that proxy can be available while the original Kafka cluster is
> not. Failovers would happen between distinct sets of clients that are
> part of the same logical application.
>
> Thanks for taking a look at the rejected alternatives!
> Greg
>
> On Tue, Oct 3, 2023 at 3:24 AM Viktor Somogyi-Vass
>  wrote:
> >
> > Hi Greg,
> >
> > Seems like finding the perfect replication solution is a never ending
> story
> > for Kafka :).
> >
> > Some general thoughts:
> > GT-1. While as you say it would be good to have some kind of built-in
> > replication in Kafka, we definitely need to understand the problem better
> > to provide a better solution. Replication has lots of user stories as you
> > iterated over a few and I think it's very well worth the time to detail
> > each one in the KIP. This may help understanding the problem on a deeper
> > level to others who may want to contribute, somewhat sets the scope and
> > describes the problem in a way that a good solution can be deduced from
> it.
> >
> > I also have a few questions regarding some of the rejected solutions:
> >
> > MM2:
> > I think your points about MM2 are fair (offset transparency and
> operational
> > complexity), however I think it needs more reasoning about why are we
> > moving in a different direction?
> > A few points I can think about what we could improve in MM2 that'd
> > transform it into more like a solution that you aim for:
> > MM2-1. What if we consider replacing the client based mechanism with a
> > follower fetch protocol?
> > MM2-2. Operating an MM2 cluster might be familiar to those who operate
> > Connect anyway. For those who don't, can we provide a "built-in" version
> > that runs in the same process as Kafka, like an embedded dedicated MM2
> > cluster?
> > MM2-3. Will

Re: [DISCUSS] KIP-986: Cross-Cluster Replication

2023-10-03 Thread Viktor Somogyi-Vass
Hi Greg,

Seems like finding the perfect replication solution is a never ending story
for Kafka :).

Some general thoughts:
GT-1. While, as you say, it would be good to have some kind of built-in
replication in Kafka, we definitely need to understand the problem better
to provide a better solution. Replication has lots of user stories, as you
iterated over a few of them, and I think it's well worth the time to detail
each one in the KIP. This may help others who want to contribute understand
the problem on a deeper level, somewhat sets the scope, and describes the
problem in a way that a good solution can be deduced from it.

I also have a few questions regarding some of the rejected solutions:

MM2:
I think your points about MM2 are fair (offset transparency and operational
complexity), however I think the KIP needs more reasoning about why we are
moving in a different direction.
A few points I can think of about what we could improve in MM2 that would
transform it into something more like the solution you aim for:
MM2-1. What if we consider replacing the client based mechanism with a
follower fetch protocol?
MM2-2. Operating an MM2 cluster might be familiar to those who operate
Connect anyway. For those who don't, can we provide a "built-in" version
that runs in the same process as Kafka, like an embedded dedicated MM2
cluster?
MM2-3. Will we actually be able to achieve less complexity with a built-in
solution?

Layer above Kafka:
LaK-1. Would you please add more details about this? What I can currently
think of is that this "layer above Kafka" would be some kind of proxy
which would proactively send an incoming request to multiple clusters,
essentially "broadcasting" it. Is that a correct assumption?
LaK-2. In case of a cluster failover a client needs to change bootstrap
servers to a different cluster. A layer above Kafka or a proxy can solve
this by abstracting away the cluster itself. It could force out a metadata
refresh and from that point on clients can fetch from the other cluster. Is
this problem within the scope of this KIP or not?

Thanks,
Viktor


On Tue, Oct 3, 2023 at 2:55 AM Greg Harris 
wrote:

> Hey Tom,
>
> Thanks for the high-level questions, as I am certainly approaching
> this KIP differently than I've seen before.
>
> I think that ideally this KIP will expand to include lots of
> requirements and possible implementations, and that through discussion
> we can narrow the scope and form a roadmap for implementation across
> multiple KIPs. I don't plan to be the decision-maker for this project,
> as I'm more interested in building consensus among the co-authors. I
> can certainly poll that consensus and update the KIP to keep the
> project moving, and any other co-author can do the same. And to set an
> example, I'll clarify your questions and for anything that I agree
> with, I'll ask that you make the update to the KIP, so that the KIP
> captures your understanding of the problem and your requirements. If
> you don't get the chance to make the changes yourself, I'll make sure
> they get included eventually, as they're very good ideas :)
>
> For your remaining questions:
>
> M1: I was trying to draw analogies to databases, but your suggested
> properties are much more compelling and informative. I'd love it if
> you added some formalism here, so that we have a better grasp on what
> we're trying to accomplish. +1
> M2: I think the "asynchronous" problem corresponds to the goal of
> "exactly once semantics" but the two are not obviously opposites. I
> think the MM2 deficiencies could focus less on the architecture
> (asynchronicity) and more on the user-facing effect (semantics). +1
> M3: I had a "non-goals" section that ended up becoming the "rejected
> alternatives" section instead. If you have some non-goals in mind,
> please add them.
> M4+M5: I think it's too early to nail down the assumptions directly,
> but if you believe that "separate operators of source and target" is a
> requirement, that would be good to write down. +1
> M6: That is a concerning edge case, and I don't know how to handle it.
> I was imagining that there would be a many:many relationship of
> clusters and links, but I understand that the book-keeping of that
> decision may be significant.
> M7: I think this may be appropriate to cover in a "user story" or
> "example usages". I naturally thought that the feature would describe
> some minimal way of linking two topics, and the applications
> (combining multiple links, performing failovers, or running
> active-active, etc) would be left to users to define. I included the
> regex configurations because I imagine that creating 100s or 1000s of
> links would be unnecessarily tedious. The feature may also encode
> those use-cases directly as first-class citizens as well.
>
> U1: These are states that can happen in reality, and I meant for that
> section to imply that we should expect these states and model them for
> operations and observability.
>
> D1: I think I may have introduced this 

Re: [DISCUSS] KIP-971 Expose replication-offset-lag MirrorMaker2 metric

2023-09-27 Thread Viktor Somogyi-Vass
Hi Elkhan,

I don't see the config you introduced on the KIP. Otherwise it looks
good to me. If no one has objections or anything to add then I think you
can start the vote after updating the KIP.

>I added the configurable poll interval as an optional - I believe it is
useful to let users to configure it but I doubt whether it would actually
be used.

Yes, it definitely won't be a high priority config but it's defensive. In
case of very large clusters I can imagine users might want longer periods
to avoid any unnecessary pressure from MM2.

Viktor

On Wed, Sep 27, 2023 at 12:50 AM Elxan Eminov 
wrote:

> Hi Team,
> Just pinging this
>
> Thanks,
> Elkhan
>
> On Thu, 21 Sep 2023 at 23:09, Elxan Eminov 
> wrote:
>
> > Hi Viktor and huqedi,
> > I've made the changes to the KIP.
> > I added the configurable poll interval as an optional - I believe it is
> > useful to let users to configure it but I doubt whether it would actually
> > be used.
> > Please let me know your thoughts on this,
> > Thanks!
> >
> > On Sat, 26 Aug 2023 at 15:49, Elxan Eminov 
> > wrote:
> >
> >> Relatively minor change with a new metric for MM2
> >>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-971%3A+Expose+replication-offset-lag+MirrorMaker2+metric
> >>
> >
>


Re: [DISCUSS] KIP-975 Docker Image for Apache Kafka

2023-09-27 Thread Viktor Somogyi-Vass
Hi Krishna,

Thanks for the answer. I've seen multiple such images where environment
variables are used and I think they are generally fine, but they are unsafe
for passing around secrets, JAAS configs and so on. Perhaps for secrets we
could recommend using the file config provider: users could then create and
mount secured properties file(s) with the configs that are considered
secrets, along the lines of the sketch below. What do you think? Did you
already have something in mind regarding this?
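
Just to illustrate, a minimal sketch of what I mean (the mount path is only
an example; FileConfigProvider ships with Kafka):

  # in server.properties (or any client/Connect config)
  config.providers=file
  config.providers.file.class=org.apache.kafka.common.config.provider.FileConfigProvider
  # secrets.properties is mounted into the container, e.g. as a Docker secret/volume
  ssl.keystore.password=${file:/etc/kafka/secrets/secrets.properties:keystore.password}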

Thanks,
Viktor

On Tue, Sep 26, 2023 at 3:05 PM Krishna Agarwal <
krishna0608agar...@gmail.com> wrote:

> Hi Ismael,
> Apologies for missing the mailing list in the last reply.
>
> Thank you for the suggestions.
> Just to clarify, the sizes mentioned in the previous email are of the
> uncompressed base images, not the resulting Apache Kafka docker images:
>
>1. eclipse-temurin:17-jre -- 263MB (They should release JRE 21 images
>soon)
>2. registry.access.redhat.com/ubi8/openjdk-17-runtime:1.17-1.1693366274
>-- 375MB
>
> Regards,
> Krishna
>
>
> On Tue, Sep 26, 2023 at 9:24 AM Ismael Juma  wrote:
>
> > Hi Krishna,
> >
> > Looks like you sent the response to me and not the mailing list,
> > please include the mailing list in the replies. Comments below.
> >
> > On Mon, Sep 25, 2023 at 11:45 AM Krishna Agarwal <
> > krishna0608agar...@gmail.com> wrote:
> >
> >> Hi Ismael,
> >> Thanks for the questions.
> >>
> >>1. We intend to support only the latest Java supported by Apache
> >>Kafka(As per this documentation
> >> Apache Kafka
> currently
> >>supports Java 8, Java 11, and Java 17) which currently is Java 17. If
> >>Apache Kafka supports Java 21 in the future, we will align with it.
> >>
> >> We are already building and testing with Java 21 (
> > https://github.com/apache/kafka/pull/14451 updates `README.md` to
> > indicate that). By 3.7.0 (the next release), we'll have Java 21 as one of
> > the officially supported versions. I think we should start with that
> > version for both docker image KIPs.
> >
> >>
> >>1. For users seeking a Docker image with an alternative Java version,
> >>they will have the flexibility to build their own Docker image
> utilising
> >>the Dockerfiles we provide. In our documentation, we will provide
> clear
> >>guidance on the designated base images for various Java versions.
> >>
> >> This sounds good to me. We should include these details as part of the
> > KIP and also the documentation for the docker images. More specifically,
> we
> > should state that we will update the Java major version as part of minor
> > Apache Kafka releases. The implication is that users who include broker
> > plugins alongside the broker should use custom images to ensure their
> > custom code is not broken by Java upgrades.
> >
> >>
> >>1. Apache Kafka only requires JRE, not JDK, for operation. Utilizing
> >>a base image with only JRE, rather than JDK, is a logical choice as
> it
> >>significantly reduces the size of the docker image.
> >>Upon further investigation, I discovered the eclipse-temurin
> >><
> https://hub.docker.com/layers/library/eclipse-temurin/17-jre/images/sha256-d1dfb065ae433fe1b43ac7e50a1ed03660f487c73ec256c686b126c37fd4d086?context=explore
> >
> >>docker image, which is notably smaller than Redhat’s ubi8 docker
> image (263
> >>MB vs 375 MB). Additionally, the fact that Apache Flink relies on
> >>eclipse-temurin base images
> >><
> https://github.com/apache/flink-docker/blob/master/1.17/scala_2.12-java11-ubuntu/Dockerfile#L19
> >
> >>further increases our confidence in their dependability(Will make
> this
> >>change in the KIP).
> >>
> >> Yes, eclipse-temurin looks like a good choice to me. Nice size
> reduction!
> >
> >>
> >>1. I'll conduct comparisons between our docker image and existing
> >>ones, and incorporate the findings into the KIP. I'll keep you
> posted on
> >>the same.
> >>
> >> Excellent, thanks!
> >
> > Ismael
> >
> >
> >> On Wed, Sep 20, 2023 at 11:26 PM Ismael Juma  wrote:
> >>
> >>> Hi Krishna,
> >>>
> >>> Thanks for the KIP. A few quick questions:
> >>>
> >>> 1. Since this will only be available for Kafka 3.7 in the best case, I
> >>> suggest we go with Java 21 instead of Java 17. Also, we should be clear
> >>> about Java version expectations. Are we allowed to change the Java
> >>> version
> >>> whenever we want? Or can it only be done in major versions? I prefer
> the
> >>> former option by default, but a subset of users may have their own
> >>> plugins
> >>> running within the broker. Such users may need to have control over the
> >>> Java version they use - how would they achieve tha
> >>> 2. We state that ubi-minimal would be the base image - it would be good
> >>> to
> >>> provide some details on why this was the choice.
> >>> 3. How does this docker image compare to the existing popular images?
> >>> Perhaps we can compare with the 3 most popular 

Re: [DISCUSS] KIP-983: Full speed async processing during rebalance

2023-09-25 Thread Viktor Somogyi-Vass
Hi Erik,

I'm still trying to wrap my head around the KIP; however, I have a few
questions that weren't clear to me regarding offset commits:
1. Would auto-commits interfere with the behavior defined in your KIP, or
would they work the same way as manual commits?
2. As far as I can see, you don't separate offset commits by whether they're
sync or async. For sync commits timing isn't really a problem, but how would
your change work in the case of async offset commits? There can be a few
caveats there, as you may not know whether a commit has finished until your
callback is called (see the sketch below).
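
To make the second point concrete, a rough sketch with the plain consumer
API (nothing KIP-specific; it assumes an existing consumer, an
offsetsToCommit map and a logger):

  consumer.commitAsync(offsetsToCommit, (offsets, exception) -> {
      if (exception != null) {
          // e.g. RetriableCommitFailedException; only here do we learn
          // that the commit did not go through
          log.warn("Async commit of {} failed", offsets, exception);
      } else {
          // only at this point do we know the offsets were committed
      }
  });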

Thanks,
Viktor

On Sat, Sep 23, 2023 at 4:00 PM Erik van Oosten
 wrote:

> Hi all,
>
> I would like to start the discussion on KIP-983: Full speed async
> processing during rebalance [1].
>
> The idea is that we can prevent the drop in throughput during a
> cooperative rebalance.
>
> I am curious to your ideas and comments.
>
> Kind regards,
>  Erik.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-983%3A+Full+speed+async+processing+during+rebalance
>
>
> --
> Erik van Oosten
> e.vanoos...@grons.nl
> https://day-to-day-stuff.blogspot.com
>
>


Re: [ANNOUNCE] New Kafka PMC Member: Justine Olshan

2023-09-24 Thread Viktor Somogyi-Vass
Congrats Justine!

On Sun, Sep 24, 2023, 17:45 Kirk True  wrote:

> Congratulations Justine! Thanks for all your great work!
>
> > On Sep 24, 2023, at 8:37 AM, John Roesler  wrote:
> >
> > Congratulations, Justine!
> > -John
> >
> > On Sun, Sep 24, 2023, at 05:05, Mickael Maison wrote:
> >> Congratulations Justine!
> >>
> >> On Sun, Sep 24, 2023 at 5:04 AM Sophie Blee-Goldman
> >>  wrote:
> >>>
> >>> Congrats Justine!
> >>>
> >>> On Sat, Sep 23, 2023, 4:36 PM Tom Bentley  wrote:
> >>>
>  Congratulations!
> 
>  On Sun, 24 Sept 2023 at 12:32, Satish Duggana <
> satish.dugg...@gmail.com>
>  wrote:
> 
> > Congratulations Justine!!
> >
> > On Sat, 23 Sept 2023 at 15:46, Bill Bejeck 
> wrote:
> >>
> >> Congrats Justine!
> >>
> >> -Bill
> >>
> >> On Sat, Sep 23, 2023 at 6:23 PM Greg Harris
>   >>
> >> wrote:
> >>
> >>> Congratulations Justine!
> >>>
> >>> On Sat, Sep 23, 2023 at 5:49 AM Boudjelda Mohamed Said
> >>>  wrote:
> 
>  Congrats Justin !
> 
>  On Sat 23 Sep 2023 at 14:44, Randall Hauch 
>  wrote:
> 
> > Congratulations, Justine!
> >
> > On Sat, Sep 23, 2023 at 4:25 AM Kamal Chandraprakash <
> > kamal.chandraprak...@gmail.com> wrote:
> >
> >> Congrats Justine!
> >>
> >> On Sat, Sep 23, 2023, 13:28 Divij Vaidya <
> > divijvaidy...@gmail.com>
> > wrote:
> >>
> >>> Congratulations Justine!
> >>>
> >>> On Sat 23. Sep 2023 at 07:06, Chris Egerton <
> >>> fearthecel...@gmail.com>
> >>> wrote:
> >>>
>  Congrats Justine!
>  On Fri, Sep 22, 2023, 20:47 Guozhang Wang <
> > guozhang.wang...@gmail.com>
>  wrote:
> 
> > Congratulations!
> >
> > On Fri, Sep 22, 2023 at 8:44 PM Tzu-Li (Gordon) Tai <
> >>> tzuli...@apache.org
> >
> > wrote:
> >>
> >> Congratulations Justine!
> >>
> >> On Fri, Sep 22, 2023, 19:25 Philip Nee <
> > philip...@gmail.com>
> >> wrote:
> >>
> >>> Congrats Justine!
> >>>
> >>> On Fri, Sep 22, 2023 at 7:07 PM Luke Chen <
> >>> show...@gmail.com>
> >>> wrote:
> >>>
>  Hi, Everyone,
> 
>  Justine Olshan has been a Kafka committer since
>  Dec.
> >>> 2022.
> > She
> >>> has
> > been
>  very active and instrumental to the community since
> >>> becoming
> > a
> > committer.
>  It's my pleasure to announce that Justine is now a
> >>> member of
> >>> Kafka
> > PMC.
> 
>  Congratulations Justine!
> 
>  Luke
>  on behalf of Apache Kafka PMC
> 
> >>>
> >
> 
> >>>
> >>
> >
> >>>
> >
> >
> 
>
>


Re: [ANNOUNCE] New committer: Lucas Brutschy

2023-09-21 Thread Viktor Somogyi-Vass
Congrats Lucas!

On Thu, Sep 21, 2023 at 7:12 PM Alexander Sorokoumov
 wrote:

> Congratulations, Lucas!
>
> On Thu, Sep 21, 2023 at 10:09 AM Walker Carlson
>  wrote:
>
> > Congrats Lucas!
> >
> > On Thu, Sep 21, 2023 at 11:42 AM Kamal Chandraprakash <
> > kamal.chandraprak...@gmail.com> wrote:
> >
> > > Congrats Lucas!
> > >
> > > On Thu, Sep 21, 2023, 22:05 Boudjelda Mohamed Said 
> > > wrote:
> > >
> > > > Congratulations, Lucas!!
> > > >
> > > > On Thu 21 Sep 2023 at 18:34, Lianet M.  wrote:
> > > >
> > > > > Congratulations Lucas!
> > > > >
> > > > > On Thu, Sept 21, 2023, 11:45 a.m. Bruno Cadonna <
> cado...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > The PMC of Apache Kafka is pleased to announce a new Kafka
> > committer
> > > > > > Lucas Brutschy.
> > > > > >
> > > > > > Lucas' major contributions are around Kafka Streams.
> > > > > >
> > > > > > Lucas' significantly contributed to the state updater
> > > > > > (https://issues.apache.org/jira/browse/KAFKA-10199) and he
> drives
> > > the
> > > > > > implementation of the new threading model for Kafka Streams
> > > > > > (https://issues.apache.org/jira/browse/KAFKA-15326).
> > > > > >
> > > > > > Lucas' contributions to KIP discussions and PR reviews are very
> > > > > thoughtful.
> > > > > >
> > > > > > Congratulations, Lucas!
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Bruno (on behalf of the Apache Kafka PMC)
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [ANNOUNCE] New committer: Yash Mayya

2023-09-21 Thread Viktor Somogyi-Vass
Congrats Yash!

On Thu, Sep 21, 2023 at 7:04 PM Josep Prat 
wrote:

> Congrats Yash!
>
> ———
> Josep Prat
>
> Aiven Deutschland GmbH
>
> Alexanderufer 3-7, 10117 Berlin
>
> Amtsgericht Charlottenburg, HRB 209739 B
>
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
>
> m: +491715557497
>
> w: aiven.io
>
> e: josep.p...@aiven.io
>
> On Thu, Sep 21, 2023, 18:55 Raymond Ng  wrote:
>
> > Congrats Yash! Well-deserved!
> >
> > /Ray
> >
> > On Thu, Sep 21, 2023 at 9:40 AM Kamal Chandraprakash <
> > kamal.chandraprak...@gmail.com> wrote:
> >
> > > Congratulations Yash!
> > >
> > > On Thu, Sep 21, 2023, 22:03 Bill Bejeck  wrote:
> > >
> > > > Congrats Yash!
> > > >
> > > > On Thu, Sep 21, 2023 at 12:26 PM Divij Vaidya <
> divijvaidy...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > Congratulations Yash!
> > > > >
> > > > > Divij Vaidya
> > > > >
> > > > >
> > > > > On Thu, Sep 21, 2023 at 6:18 PM Sagar 
> > > wrote:
> > > > > >
> > > > > > Congrats Yash !
> > > > > > On Thu, 21 Sep 2023 at 9:38 PM, Ashwin
> >  > > >
> > > > > wrote:
> > > > > >
> > > > > > > Awesome ! Congratulations Yash !!
> > > > > > >
> > > > > > > On Thu, Sep 21, 2023 at 9:25 PM Edoardo Comar <
> > > edoardli...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Congratulations Yash
> > > > > > > >
> > > > > > > > On Thu, 21 Sept 2023 at 16:28, Bruno Cadonna <
> > cado...@apache.org
> > > >
> > > > > wrote:
> > > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > The PMC of Apache Kafka is pleased to announce a new Kafka
> > > > > committer
> > > > > > > > > Yash Mayya.
> > > > > > > > >
> > > > > > > > > Yash's major contributions are around Connect.
> > > > > > > > >
> > > > > > > > > Yash authored the following KIPs:
> > > > > > > > >
> > > > > > > > > KIP-793: Allow sink connectors to be used with
> topic-mutating
> > > > SMTs
> > > > > > > > > KIP-882: Kafka Connect REST API configuration validation
> > > timeout
> > > > > > > > > improvements
> > > > > > > > > KIP-970: Deprecate and remove Connect's redundant task
> > > > > configurations
> > > > > > > > > endpoint
> > > > > > > > > KIP-980: Allow creating connectors in a stopped state
> > > > > > > > >
> > > > > > > > > Overall, Yash is known for insightful and friendly input to
> > > > > discussions
> > > > > > > > > and his high quality contributions.
> > > > > > > > >
> > > > > > > > > Congratulations, Yash!
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Bruno (on behalf of the Apache Kafka PMC)
> > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-975 Docker Image for Apache Kafka

2023-09-19 Thread Viktor Somogyi-Vass
Hi Ismael,

I'm not trying to advocate against the docker image; I just pointed out
that the current scoping of the KIP may be a bit too generic, and thought
that KIP-974 and KIP-975 were aiming for mostly the same thing and could be
discussed under one umbrella. Apologies if this was rooted in a
misunderstanding.

Krishna,

I think we need to refine the KIP a bit more. There are some interfaces
that we should include in the KIP, as Kafka has plugin points where users
are expected to provide an implementation, and it's worth discussing these
in the KIP since they're effectively interfaces for users. Here are my
questions in order:
1. In what environments do you want the image to be used? As I understand
it, it would replace the current testing image and serve as a basis for
development, but would it aim at production use cases too (docker-compose,
Kubernetes, etc.)?
2. How do you plan to forward configs to the broker? Do we expect a
populated server.properties file placed in a certain location, or should the
docker image create this file based on some input such as env vars (see the
sketch after this list)?
3. Certain parts can be pluggable, like metric reporters or the remote log
implementations that were just introduced by KIP-405. These come as jar
files that must be put on Kafka's classpath, while certain class names have
to be configured. How do you plan to implement this, and how do we
allow users to configure such things?
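
To make question 2 a bit more concrete, the two options I can think of look
roughly like this (the env var names and the image name are hypothetical,
this is only to illustrate the kind of user experience, not a proposal of
exact names):

  # option a: a KAFKA_* env var convention mapped to server.properties keys
  docker run \
    -e KAFKA_NODE_ID=1 \
    -e KAFKA_PROCESS_ROLES=broker,controller \
    -e KAFKA_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093 \
    apache/kafka:latest

  # option b: mounting a prepared config file into a well-known location
  docker run -v "$(pwd)/server.properties:/etc/kafka/server.properties" apache/kafka:latest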

Thanks,
Viktor




On Thu, Sep 14, 2023 at 4:59 PM Kenneth Eversole
 wrote:

> Hello,
>
> I think this would be a wonderful improvement to the ecosystem. While
> Viktor is correct that most Docker pipelines eventually lead to a
> kubernetes deployment, that should not stop us from creating an
> Official Docker Image. Creating a Docker image would allow us to ensure a
> level of quality and support for people who want to deploy Kafka as a
> container on baremetal machines, it could allow us to create
> a sandbox/developer environment for new contributors and developers to test
> and have a single agreed upon environment that kafka works in for future
> KIPs and would most likely spawn more contributions from people wanting to
> optimize kafka for k8s.
>
>
> I am 100% for this and will gladly help if approved.
>
> Kenneth
>
> On Thu, Sep 14, 2023 at 5:47 AM Ismael Juma  wrote:
>
> > Hi Viktor,
> >
> > I disagree. Docker is a very popular deployment tool and it's not only
> used
> > with Kubernetes.
> >
> > Ismael
> >
> > On Thu, Sep 14, 2023, 1:14 AM Viktor Somogyi-Vass
> >  wrote:
> >
> > > Hi Krishna,
> > >
> > > I think you should merge this KIP and KIP-974
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-974> as there are
> overlaps as
> > > Federico pointed out on KIP-974
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-974>. I think you
> should keep that one as it
> > > has well defined goals (improve tests) while I feel this one is too
> > > generic. Docker is usually just a tool for either testing or
> Kubernetes,
> > so
> > > they have very well defined use-cases. In the case of Flink for
> instance
> > > the image is used for its kubernetes operator. The use case would
> > determine
> > > a lot of things and I think a generic image would likely not fit the
> > needs
> > > of all use-cases.
> > >
> > > Best,
> > > Viktor
> > >
> > > On Fri, Sep 8, 2023 at 9:58 AM Krishna Agarwal <
> > > krishna0608agar...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > > Apache Kafka does not have an official docker image currently.
> > > > I want to submit a KIP to publish a docker image for Apache Kafka.
> > > >
> > > > KIP-975 <https://cwiki.apache.org/confluence/display/KAFKA/KIP-975>:
> Docker Image for Apache Kafka
> > > > <
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-975%3A+Docker+Image+for+Apache+Kafka
> > > > >
> > > >
> > > > Regards,
> > > > Krishna
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-975 Docker Image for Apache Kafka

2023-09-14 Thread Viktor Somogyi-Vass
Hi Krishna,

I think you should merge this KIP and KIP-974, as there are overlaps, as
Federico pointed out on KIP-974. I think you should keep that one, as it
has well-defined goals (improve tests), while I feel this one is too
generic. Docker is usually just a tool for either testing or Kubernetes, so
the use cases are very well defined. In the case of Flink, for instance,
the image is used for its Kubernetes operator. The use case would determine
a lot of things, and I think a generic image would likely not fit the needs
of all use cases.

Best,
Viktor

On Fri, Sep 8, 2023 at 9:58 AM Krishna Agarwal 
wrote:

> Hi,
> Apache Kafka does not have an official docker image currently.
> I want to submit a KIP to publish a docker image for Apache Kafka.
>
> KIP-975: Docker Image for Apache Kafka
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-975%3A+Docker+Image+for+Apache+Kafka
> >
>
> Regards,
> Krishna
>


Re: Re: Re: Re: [DISCUSS] KIP-971 Expose replication-offset-lag MirrorMaker2 metric

2023-09-13 Thread Viktor Somogyi-Vass
Elkhan, do you think making yours similar would make sense?

On Wed, Sep 6, 2023 at 4:12 AM hudeqi <16120...@bjtu.edu.cn> wrote:

> Hey, Viktor.
> As far as my implementation is concerned, the default setting is 30s, but
> I added it to `MirrorConnectorConfig`, which can be adjusted freely
> according to the load of the source cluster and the number of tasks.
>
> best,
> hudeqi
>
> Viktor Somogyi-Vass viktor.somo...@cloudera.com.INVALID
> wrote:
> > Hey Elkhan and hudeqi,
> >
> > I'm reading your debate around the implementation. I also think a
> > scheduled task would be better in overall accuracy and performance
> > (compared to calling endOffsets with every poll).
> > Hudeqi, do you have any experience of what works best for you in terms of
> > time intervals? I would think refreshing the metric every 5-10sec would
> be
> > overall good and sufficient for the users (as short intervals can be
> quite
> > noisy anyways).
> >
> > Best,
> > Viktor
> >
> > On Mon, Sep 4, 2023 at 11:41 AM hudeqi <16120...@bjtu.edu.cn> wrote:
> >
> > > My approach is to create another thread to regularly request and update
> > > the end offset of each partition for the `keySet` in the collection
> > > `lastReplicatedSourceOffsets` mentioned by your kip (if there is no
> update
> > > for a long time, it will be removed from
> `lastReplicatedSourceOffsets`).
> > > Obviously, such processing makes the calculation of the partition
> offset
> > > lag less real-time and accurate.
> > > But this also meets our needs, because we need the partition offset
> lag to
> > > analyze the replication performance of the task and which task may have
> > > performance problems; and if you monitor the overall offset lag of the
> > > topic, then using the
> > > "kafka_consumer_consumer_fetch_manager_metrics_records_lag" metric
> will be
> > > more real-time and accurate.
> > > This is my suggestion. I hope to be able to throw bricks and start
> jade,
> > > we can come up with a better solution.
> > >
> > > best,
> > > hudeqi
> > >
> > > Elxan Eminov elxanemino...@gmail.com wrote:
> > > > @huqedi replying to your comment on the PR (
> > > > https://github.com/apache/kafka/pull/14077#discussion_r1314592488),
> > > quote:
> > > >
> > > > "I guess we have a disagreement about lag? My understanding of lag
> is:
> > > the
> > > > real LEO of the source cluster partition minus the LEO that has been
> > > > written to the target cluster. It seems that your definition of lag
> is:
> > > the
> > > > lag between the mirror task getting data from consumption and
> writing it
> > > to
> > > > the target cluster?"
> > > >
> > > > Yes, this is the case. I've missed the fact that the consumer itself
> > > might
> > > > be lagging behind the actual data in the partition.
> > > > I believe your definition of the lag is more precise, but:
> > > > Implementing it this way will come at the cost of an extra
> listOffsets
> > > > request, introducing the overhead that you mentioned in your initial
> > > > comment.
> > > >
> > > > If you have enough insights about this, what would you say is the
> chances
> > > > of the task consumer lagging behind the LEO of the partition?
> > > > Are they big enough to justify the extra call to listOffsets?
> > > > @Viktor,  any thoughts?
> > > >
> > > > Thanks,
> > > > Elkhan
> > > >
> > > > On Mon, 4 Sept 2023 at 09:36, Elxan Eminov 
> > > wrote:
> > > >
> > > > > I already have the PR for this so if it will make it easier to
> discuss,
> > > > > feel free to take a look:
> https://github.com/apache/kafka/pull/14077
> > > > >
> > > > > On Mon, 4 Sept 2023 at 09:17, hudeqi <16120...@bjtu.edu.cn> wrote:
> > > > >
> > > > >> But does the offset of the last `ConsumerRecord` obtained in poll
> not
> > > > >> only represent the offset of this record in the source cluster? It
> > > seems
> > > > >> that it cannot represent the LEO of the source cluster for this
> > > partition.
> > > > >> I understand that the offset lag introduced here should be the
> LEO of
> > > the
> > > > >> source cluster minus the offset of the last record to be polled?
> > > > >>
> > > > >> best,
> > > > >> hudeqi
> > > > >>
> > > > >>
> > > > >>  -原始邮件-
> > > > >>  发件人: "Elxan Eminov" 
> > > > >>  发送时间: 2023-09-04 14:52:08 (星期一)
> > > > >>  收件人: dev@kafka.apache.org
> > > > >>  抄送:
> > > > >>  主题: Re: [DISCUSS] KIP-971 Expose replication-offset-lag
> > > MirrorMaker2
> > > > >> metric
> > > > >> 
> > > > >> 
> > > > >
> > > > >
> > >
>


Re: Re: Re: [DISCUSS] KIP-971 Expose replication-offset-lag MirrorMaker2 metric

2023-09-05 Thread Viktor Somogyi-Vass
Hey Elkhan and hudeqi,

I'm reading your debate around the implementation. I also think a
scheduled task would be better overall in terms of accuracy and performance
(compared to calling endOffsets on every poll).
Hudeqi, do you have any experience with what works best for you in terms of
time intervals? I would think refreshing the metric every 5-10 seconds would
be good and sufficient for the users overall (as short intervals can be
quite noisy anyway).
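
Just to sketch what I have in mind with a scheduled task (the names, the map
and the 10-second interval are only examples, not actual MM2 code):

  // refresh end offsets for the replicated partitions periodically,
  // instead of calling endOffsets() on every poll
  ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
  Map<TopicPartition, Long> latestEndOffsets = new ConcurrentHashMap<>();
  scheduler.scheduleAtFixedRate(() -> {
      try {
          latestEndOffsets.putAll(offsetConsumer.endOffsets(replicatedPartitions));
      } catch (Exception e) {
          // keep the previous values; the metric is best-effort anyway
      }
  }, 0, 10, TimeUnit.SECONDS);
  // the lag is then simply latestEndOffsets.get(tp) - lastReplicatedOffset(tp)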

Best,
Viktor

On Mon, Sep 4, 2023 at 11:41 AM hudeqi <16120...@bjtu.edu.cn> wrote:

> My approach is to create another thread to regularly request and update
> the end offset of each partition for the `keySet` in the collection
> `lastReplicatedSourceOffsets` mentioned by your kip (if there is no update
> for a long time, it will be removed from `lastReplicatedSourceOffsets`).
> Obviously, such processing makes the calculation of the partition offset
> lag less real-time and accurate.
> But this also meets our needs, because we need the partition offset lag to
> analyze the replication performance of the task and which task may have
> performance problems; and if you monitor the overall offset lag of the
> topic, then using the
> "kafka_consumer_consumer_fetch_manager_metrics_records_lag" metric will be
> more real-time and accurate.
> This is my suggestion. I hope to be able to throw bricks and start jade,
> we can come up with a better solution.
>
> best,
> hudeqi
>
> Elxan Eminov elxanemino...@gmail.com wrote:
> > @huqedi replying to your comment on the PR (
> > https://github.com/apache/kafka/pull/14077#discussion_r1314592488),
> quote:
> >
> > "I guess we have a disagreement about lag? My understanding of lag is:
> the
> > real LEO of the source cluster partition minus the LEO that has been
> > written to the target cluster. It seems that your definition of lag is:
> the
> > lag between the mirror task getting data from consumption and writing it
> to
> > the target cluster?"
> >
> > Yes, this is the case. I've missed the fact that the consumer itself
> might
> > be lagging behind the actual data in the partition.
> > I believe your definition of the lag is more precise, but:
> > Implementing it this way will come at the cost of an extra listOffsets
> > request, introducing the overhead that you mentioned in your initial
> > comment.
> >
> > If you have enough insights about this, what would you say is the chances
> > of the task consumer lagging behind the LEO of the partition?
> > Are they big enough to justify the extra call to listOffsets?
> > @Viktor,  any thoughts?
> >
> > Thanks,
> > Elkhan
> >
> > On Mon, 4 Sept 2023 at 09:36, Elxan Eminov 
> wrote:
> >
> > > I already have the PR for this so if it will make it easier to discuss,
> > > feel free to take a look: https://github.com/apache/kafka/pull/14077
> > >
> > > On Mon, 4 Sept 2023 at 09:17, hudeqi <16120...@bjtu.edu.cn> wrote:
> > >
> > >> But does the offset of the last `ConsumerRecord` obtained in poll not
> > >> only represent the offset of this record in the source cluster? It
> seems
> > >> that it cannot represent the LEO of the source cluster for this
> partition.
> > >> I understand that the offset lag introduced here should be the LEO of
> the
> > >> source cluster minus the offset of the last record to be polled?
> > >>
> > >> best,
> > >> hudeqi
> > >>
> > >>
> > >>  -Original Message-
> > >>  From: "Elxan Eminov" 
> > >>  Sent: 2023-09-04 14:52:08 (Monday)
> > >>  To: dev@kafka.apache.org
> > >>  Cc:
> > >>  Subject: Re: [DISCUSS] KIP-971 Expose replication-offset-lag
> MirrorMaker2
> > >> metric
> > >> 
> > >> 
> > >
> > >
>


Re: [DISCUSS] KIP-971 Expose replication-offset-lag MirrorMaker2 metric

2023-08-30 Thread Viktor Somogyi-Vass
Hi Elkhan,

I think this is quite a useful improvement. A few questions and suggestions:
1. How do you calculate the min, max and avg variants? If I understand
correctly, the metric itself is partition-based
(where replication-offset-lag is the lag of the replica that is being
consumed) and these are the min, max and avg across replicas?
2. You briefly mention replication-latency-ms at the end, but I think it'd
be worth writing a bit more about it: what it does currently, how it is
calculated, and therefore why it doesn't fit.

Thanks,
Viktor

On Sat, Aug 26, 2023 at 3:49 PM Elxan Eminov 
wrote:

> Relatively minor change with a new metric for MM2
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-971%3A+Expose+replication-offset-lag+MirrorMaker2+metric
>


[jira] [Created] (KAFKA-15219) Support delegation tokens in KRaft

2023-07-19 Thread Viktor Somogyi-Vass (Jira)
Viktor Somogyi-Vass created KAFKA-15219:
---

 Summary: Support delegation tokens in KRaft
 Key: KAFKA-15219
 URL: https://issues.apache.org/jira/browse/KAFKA-15219
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 3.6.0
Reporter: Viktor Somogyi-Vass
Assignee: Viktor Somogyi-Vass


Delegation tokens were introduced in KIP-48 and improved in KIP-373. KIP-900 
paved the way to supporting them in KRaft by adding SCRAM support, but 
delegation tokens still aren't supported in KRaft.

There are multiple issues:
- TokenManager would still try to create tokens in ZooKeeper. Instead, we 
should forward admin requests to the controller, which would store them in the 
metadata, similarly to SCRAM.
- TokenManager should run on controller nodes only (or in mixed mode).
- Integration tests will need to be adapted as well and parameterized with 
ZooKeeper/KRaft.
- Documentation needs to be improved to cover KRaft.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-660: Pluggable ReplicaPlacer

2023-07-13 Thread Viktor Somogyi-Vass
Mickael, have you had some time to review this by any chance?

On Tue, Jun 20, 2023 at 5:23 PM Viktor Somogyi-Vass <
viktor.somo...@cloudera.com> wrote:

> Hey all,
>
> I'd like to revive this discussion. I've created
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-879%3A+Multi-level+Rack+Awareness
> last November and it seems to be that there is a nice overlap between the
> two and would be good to merge. Should we revive KIP-660 and merge the two
> KIPs?
> If you don't have time for this Mickael currently, I'm happy to take it
> over from you and merge the two interfaces, it seems like they're somewhat
> similar (and also with the current internal interface).
>
> Best,
> Viktor
>
> On Tue, May 31, 2022 at 3:57 PM Mickael Maison 
> wrote:
>
>> Hi Vikas,
>>
>> You make some very good points and most importantly I agree that being
>> able to prevent putting new partitions on a broker should be part of
>> Kafka itself and not require a plugin.
>>
>> This feature would addresses 2 out of the 3 scenarios mentioned in the
>> motivation section. The last one "When adding brokers to a cluster,
>> Kafka currently does not necessarily place new partitions on new
>> brokers" is clearly less important.
>>
>> So I think I'll retire this KIP and I'll follow up with a new KIP to
>> focus on that feature.
>>
>> Thanks,
>> Mickael
>>
>>
>> On Mon, May 9, 2022 at 8:11 PM Vikas Singh 
>> wrote:
>> >
>> > Hi Mickael,
>> >
>> > It's a nice proposal. It's appealing to have a pluggable way to override
>> > default kafka placement decisions, and the motivation section lists
>> some of
>> > them. Here are few comments:
>> >
>> > * The motivation section has "When adding brokers to a cluster, Kafka
>> > currently does not necessarily place new partitions on new brokers". I
>> am
>> > not sure how valuable doing this will be. A newly created kafka topic
>> takes
>> > time to reach the same usage level as existing topics, say because the
>> > topic created by a new workload that is getting onboarded, or the
>> expansion
>> > was done to relieve disk pressure on existing nodes etc. While new
>> topics
>> > catch up to existing workload, the new brokers are not sharing equal
>> load
>> > in the cluster, which probably defeats the purpose of adding new
>> brokers.
>> > In addition to that clustering new topics like this on new brokers have
>> > implications from fault domain perspective. A reasonable way to
>> approach it
>> > is to indeed use CruiseControl to move things around so that the newly
>> > added nodes become immediately involved and share cluster load.
>> > * Regarding "When administrators want to remove brokers from a cluster,
>> > there is no way to prevent Kafka from placing partitions on them", this
>> is
>> > indeed an issue. I would argue that this is needed by everyone and
>> should
>> > be part of Kafka, instead of being implemented as part of a plugin
>> > interface by multiple teams.
>> > * For "When some brokers are near their storage/throughput limit, Kafka
>> > could avoid putting new partitions on them", while this can help relieve
>> > short term overload I think again the correct solution here is something
>> > like CruiseControl where the system is monitored and things moved
>> around to
>> > maintain a balanced cluster. A new topic will not take any disk space,
>> so
>> > placing them anywhere normally isn't going to add to the storage
>> overload.
>> > Similar to the previous case, maybe a mechanism in Kafka to put nodes
>> in a
>> > quarantine state is a better way to approach this.
>> >
>> > In terms of the proposed api, I have a couple of comments:
>> >
>> > * It is not clear if the proposal applies to partitions of new topics or
>> > addition on partitions to an existing topic. Explicitly stating that
>> will
>> > be helpful.
>> > * Regarding part "To address the use cases identified in the motivation
>> > section, some knowledge about the current state of the cluster is
>> > necessary. Details whether a new broker has just been added or is being
>> > decommissioned are not part of the cluster metadata. Therefore such
>> > knowledge has to be provided via an external means to the ReplicaPlacer,
>> > for example via the configuration". It's not clear how this will be
>> done.
>> 

Re: [ANNOUNCE] New committer: Greg Harris

2023-07-10 Thread Viktor Somogyi-Vass
Congrats Greg!

On Mon, Jul 10, 2023, 17:54 Yash Mayya  wrote:

> Congrats Greg!
>
> On Mon, Jul 10, 2023 at 9:15 PM Chris Egerton  wrote:
>
> > Hi all,
> >
> > The PMC for Apache Kafka has invited Greg Harris to become a committer,
> and
> > we are happy to announce that he has accepted!
> >
> > Greg has been contributing to Kafka since 2019. He has made over 50
> commits
> > mostly around Kafka Connect and Mirror Maker 2. His most notable
> > contributions include KIP-898: "Modernize Connect plugin discovery" and a
> > deep overhaul of the offset syncing logic in MM2 that addressed several
> > technically-difficult, long-standing, high-impact issues.
> >
> > He has also been an active participant in discussions and reviews on the
> > mailing lists and on GitHub.
> >
> > Thanks for all of your contributions, Greg. Congratulations!
> >
>


[jira] [Created] (KAFKA-15161) InvalidReplicationFactorException at connect startup

2023-07-07 Thread Viktor Somogyi-Vass (Jira)
Viktor Somogyi-Vass created KAFKA-15161:
---

 Summary: InvalidReplicationFactorException at connect startup
 Key: KAFKA-15161
 URL: https://issues.apache.org/jira/browse/KAFKA-15161
 Project: Kafka
  Issue Type: Improvement
  Components: clients, KafkaConnect
Affects Versions: 3.6.0
Reporter: Viktor Somogyi-Vass


h2. Problem description

In our system test environment, Connect may in certain cases fail to start up 
due to a very specific timing issue around the start/restart of the Kafka 
cluster and Connect. In these cases, when the broker doesn't have metadata yet 
and a consumer in Connect starts and asks for topic metadata, the call fails 
with the following exception:
{noformat}
[2023-07-07 13:56:47,994] ERROR [Worker clientId=connect-1, 
groupId=connect-cluster] Uncaught exception in herder work thread, exiting:  
(org.apache.kafka.connect.runtime.distributed.DistributedHerder)
org.apache.kafka.common.KafkaException: Unexpected error fetching metadata for 
topic connect-offsets
at 
org.apache.kafka.clients.consumer.internals.TopicMetadataFetcher.getTopicMetadata(TopicMetadataFetcher.java:130)
at 
org.apache.kafka.clients.consumer.internals.TopicMetadataFetcher.getTopicMetadata(TopicMetadataFetcher.java:66)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.partitionsFor(KafkaConsumer.java:2001)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.partitionsFor(KafkaConsumer.java:1969)
at 
org.apache.kafka.connect.util.KafkaBasedLog.start(KafkaBasedLog.java:251)
at 
org.apache.kafka.connect.storage.KafkaOffsetBackingStore.start(KafkaOffsetBackingStore.java:242)
at org.apache.kafka.connect.runtime.Worker.start(Worker.java:230)
at 
org.apache.kafka.connect.runtime.AbstractHerder.startServices(AbstractHerder.java:151)
at 
org.apache.kafka.connect.runtime.distributed.DistributedHerder.run(DistributedHerder.java:363)
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.kafka.common.errors.InvalidReplicationFactorException: 
Replication factor is below 1 or larger than the number of available brokers.
{noformat}

Due to this error the Connect node stops and has to be manually restarted 
(and of course it fails the test scenarios as well).

h2. Reproduction

In my test scenario I had:
- 1 broker
- 1 connect distributed node
- I also had a patch that I applied on the broker to make sure we don't have 
metadata

Steps to repro:
# start up a zookeeper based broker without the patch
# put a breakpoint here: 
https://github.com/apache/kafka/blob/1d8b07ed6435568d3daf514c2d902107436d2ac8/clients/src/main/java/org/apache/kafka/clients/consumer/internals/TopicMetadataFetcher.java#L94
# start up a distributed connect node
# restart the kafka broker with the patch to make sure there is no metadata
# once the broker is started, release the debugger in connect

It should run into the error cited above and shut down.

This is not desirable: the Connect cluster should retry to ensure its 
continuous operation, or the broker should handle this case differently, 
for instance by returning a RetriableException.
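
A possible Connect-side mitigation (sketch only, not a proposed final fix; the 
backoff values are made up and a real fix would bound the total wait time) would 
be to retry the metadata lookup instead of letting the herder thread die:

{code:java}
// Sketch: retry partitionsFor() while the broker has no metadata yet.
// `consumer` and `topic` stand for the consumer and topic used by KafkaBasedLog.
List<PartitionInfo> partitions = null;
long backoffMs = 100;
while (partitions == null) {
    try {
        partitions = consumer.partitionsFor(topic);
    } catch (KafkaException e) {
        boolean metadataNotReady = e instanceof InvalidReplicationFactorException
                || e.getCause() instanceof InvalidReplicationFactorException;
        if (!metadataNotReady)
            throw e;
        // back off and retry
        Time.SYSTEM.sleep(backoffMs);
        backoffMs = Math.min(backoffMs * 2, 10_000L);
    }
}
{code}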

The earliest version I've tried this on is 2.8, but I think it affects 
earlier (and later) versions as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-15059) Exactly-once source tasks fail to start during pending rebalances

2023-06-21 Thread Viktor Somogyi-Vass (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi-Vass resolved KAFKA-15059.
-
Resolution: Fixed

[~ChrisEgerton] Since the PR is merged, I'm resolving this ticket.

> Exactly-once source tasks fail to start during pending rebalances
> -
>
> Key: KAFKA-15059
> URL: https://issues.apache.org/jira/browse/KAFKA-15059
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect, mirrormaker
>Affects Versions: 3.6.0
>Reporter: Chris Egerton
>Assignee: Chris Egerton
>Priority: Blocker
> Fix For: 3.6.0
>
>
> When asked to perform a round of zombie fencing, the distributed herder will 
> [reject the 
> request|https://github.com/apache/kafka/blob/17fd30e6b457f097f6a524b516eca1a6a74a9144/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/DistributedHerder.java#L1249-L1250]
>  if a rebalance is pending, which can happen if (among other things) a config 
> for a new connector or a new set of task configs has been recently read from 
> the config topic.
> Normally this can be alleviated with a simple task restart, which isn't great 
> but isn't terrible.
> However, when running MirrorMaker 2 in dedicated mode, there is no API to 
> restart failed tasks, and it can be more common to see this kind of failure 
> on a fresh cluster because three connector configurations are written in 
> rapid succession to the config topic.
>  
> In order to provide a better experience for users of both vanilla Kafka 
> Connect and dedicated MirrorMaker 2 clusters, we can retry (likely with the 
> same exponential backoff introduced with KAFKA-14732) zombie fencing attempts 
> that fail due to a pending rebalance.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-660: Pluggable ReplicaPlacer

2023-06-20 Thread Viktor Somogyi-Vass
Hey all,

I'd like to revive this discussion. I created
https://cwiki.apache.org/confluence/display/KAFKA/KIP-879%3A+Multi-level+Rack+Awareness
last November, and it seems that there is a nice overlap between the
two, so it would be good to merge them. Should we revive KIP-660 and merge the
two KIPs?
If you don't have time for this currently, Mickael, I'm happy to take it
over from you and merge the two interfaces; they seem somewhat
similar (and also similar to the current internal interface).

Best,
Viktor

On Tue, May 31, 2022 at 3:57 PM Mickael Maison 
wrote:

> Hi Vikas,
>
> You make some very good points and most importantly I agree that being
> able to prevent putting new partitions on a broker should be part of
> Kafka itself and not require a plugin.
>
> This feature would addresses 2 out of the 3 scenarios mentioned in the
> motivation section. The last one "When adding brokers to a cluster,
> Kafka currently does not necessarily place new partitions on new
> brokers" is clearly less important.
>
> So I think I'll retire this KIP and I'll follow up with a new KIP to
> focus on that feature.
>
> Thanks,
> Mickael
>
>
> On Mon, May 9, 2022 at 8:11 PM Vikas Singh 
> wrote:
> >
> > Hi Mickael,
> >
> > It's a nice proposal. It's appealing to have a pluggable way to override
> > default kafka placement decisions, and the motivation section lists some
> of
> > them. Here are few comments:
> >
> > * The motivation section has "When adding brokers to a cluster, Kafka
> > currently does not necessarily place new partitions on new brokers". I am
> > not sure how valuable doing this will be. A newly created kafka topic
> takes
> > time to reach the same usage level as existing topics, say because the
> > topic created by a new workload that is getting onboarded, or the
> expansion
> > was done to relieve disk pressure on existing nodes etc. While new topics
> > catch up to existing workload, the new brokers are not sharing equal load
> > in the cluster, which probably defeats the purpose of adding new brokers.
> > In addition to that clustering new topics like this on new brokers have
> > implications from fault domain perspective. A reasonable way to approach
> it
> > is to indeed use CruiseControl to move things around so that the newly
> > added nodes become immediately involved and share cluster load.
> > * Regarding "When administrators want to remove brokers from a cluster,
> > there is no way to prevent Kafka from placing partitions on them", this
> is
> > indeed an issue. I would argue that this is needed by everyone and should
> > be part of Kafka, instead of being implemented as part of a plugin
> > interface by multiple teams.
> > * For "When some brokers are near their storage/throughput limit, Kafka
> > could avoid putting new partitions on them", while this can help relieve
> > short term overload I think again the correct solution here is something
> > like CruiseControl where the system is monitored and things moved around
> to
> > maintain a balanced cluster. A new topic will not take any disk space, so
> > placing them anywhere normally isn't going to add to the storage
> overload.
> > Similar to the previous case, maybe a mechanism in Kafka to put nodes in
> a
> > quarantine state is a better way to approach this.
> >
> > In terms of the proposed api, I have a couple of comments:
> >
> > * It is not clear if the proposal applies to partitions of new topics or
> > addition on partitions to an existing topic. Explicitly stating that will
> > be helpful.
> > * Regarding part "To address the use cases identified in the motivation
> > section, some knowledge about the current state of the cluster is
> > necessary. Details whether a new broker has just been added or is being
> > decommissioned are not part of the cluster metadata. Therefore such
> > knowledge has to be provided via an external means to the ReplicaPlacer,
> > for example via the configuration". It's not clear how this will be done.
> > If I have to implement this interface, it will be helpful to have clear
> > guidance/examples here which hopefully ties to the use cases in the
> > motivation section. It also allows us to figure out if the proposed
> > interface is complete and helps future implementers of the interface.
> >
> > Couple of minor comments:
> > * The KIP is not listed in the main KIP page (
> >
> https://cwiki-test.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
> ).
> > Can you please add it there.
> > * The page has "This is especially true for the 4 scenarios listed in the
> > Motivation section", but there are only 3 scenarios listed.
> >
> > Regards,
> > Vikas
> >
> >
> > On Tue, May 3, 2022 at 5:51 PM Colin McCabe  wrote:
> >
> > > Hi Mickael,
> > >
> > > We did discuss this earlier, and I remember not being too enthusiastic
> > > about a pluggable policy here :)
> > >
> > > There have been several changes to the placement code in the last few
> > > weeks. (These are examples of the kind 

Re: [ANNOUNCE] New committer: Divij Vaidya

2023-06-13 Thread Viktor Somogyi-Vass
Congrats Divij!

On Tue, Jun 13, 2023, 20:27 Philip Nee  wrote:

> Congrats!
>
> On Tue, Jun 13, 2023 at 8:17 PM Randall Hauch  wrote:
>
> > Congratulations!
> >
> > On Tue, Jun 13, 2023 at 12:48 PM Matthias J. Sax 
> wrote:
> >
> > > Congrats!
> > >
> > > On 6/13/23 10:24 AM, Satish Duggana wrote:
> > > > Congratulations Divij!!
> > > >
> > > > On Tue, 13 Jun 2023 at 22:41, Manyanda Chitimbo
> > > >  wrote:
> > > >>
> > > >> Congratulations Divij.
> > > >>
> > > >> On Tue 13 Jun 2023 at 17:50, Bruno Cadonna 
> > wrote:
> > > >>
> > > >>> Hi all,
> > > >>>
> > > >>> The PMC of Apache Kafka is pleased to announce a new Kafka
> committer
> > > >>> Divij Vaidya.
> > > >>>
> > > >>> Divij's major contributions are:
> > > >>>
> > > >>> GDPR compliance enforcement of kafka-site -
> > > >>> https://issues.apache.org/jira/browse/KAFKA-13868
> > > >>>
> > > >>> Performance improvements:
> > > >>>
> > > >>> Improve performance of VarInt encoding and decoding -
> > > >>> https://github.com/apache/kafka/pull/13312
> > > >>>
> > > >>> Reduce data copy & buffer allocation during decompression -
> > > >>> https://github.com/apache/kafka/pull/13135
> > > >>>
> > > >>> He also was heavily involved in the migration to Mockito.
> > > >>>
> > > >>> Furthermore, Divij is very active on the mailing lists as well as
> in
> > > >>> maintaining and reviewing pull requests.
> > > >>>
> > > >>> Congratulations, Divij!
> > > >>>
> > > >>> Thanks,
> > > >>>
> > > >>> Bruno (on behalf of the Apache Kafka PMC)
> > > >>>
> > > >>>
> > > >>> --
> > > >> Manyanda Chitimbo.
> > >
> >
>


[jira] [Resolved] (KAFKA-12384) Flaky Test ListOffsetsRequestTest.testResponseIncludesLeaderEpoch

2023-05-24 Thread Viktor Somogyi-Vass (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi-Vass resolved KAFKA-12384.
-
Fix Version/s: 3.6.0
   Resolution: Fixed

> Flaky Test ListOffsetsRequestTest.testResponseIncludesLeaderEpoch
> -
>
> Key: KAFKA-12384
> URL: https://issues.apache.org/jira/browse/KAFKA-12384
> Project: Kafka
>  Issue Type: Test
>  Components: core, unit tests
>Reporter: Matthias J. Sax
>Assignee: Chia-Ping Tsai
>Priority: Critical
>  Labels: flaky-test
> Fix For: 3.6.0, 3.0.0
>
>
> {quote}org.opentest4j.AssertionFailedError: expected: <(0,0)> but was: 
> <(-1,-1)> at 
> org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at 
> org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62) at 
> org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182) at 
> org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177) at 
> org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1124) at 
> kafka.server.ListOffsetsRequestTest.testResponseIncludesLeaderEpoch(ListOffsetsRequestTest.scala:172){quote}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Adding reviewers with Github actions

2023-05-03 Thread Viktor Somogyi-Vass
Yes, perhaps this can be used in the GitHub action too; I think this is a
very useful tool. Sadly I couldn't get to the GitHub action yet, but hopefully
I will get there soon.

On Fri, Apr 28, 2023 at 8:48 AM David Jacot 
wrote:

> Thanks, David. This is a nice addition!
>
> Coming back to the original proposal of using github actions, it may be
> possible to run David's script automatically. For instance, we could
> trigger an action which pulls the folks who have approved the PR and feed
> the script when a comment with `reviewers` is posted. Then the action would
> post a comment with the "Reviewers: ". This way, we could do
> everything from within the PR.
>
> Cheers,
> David
>
> On Thu, Apr 27, 2023 at 8:35 PM David Arthur
>  wrote:
>
> > I just merged the "reviewers" script I wrote a while ago:
> > https://github.com/apache/kafka/pull/11096
> >
> > It works by finding previous occurrences of "Reviewers: ...", so it only
> > works for people who have reviewed something before. I do suspect this is
> > largely the common case.
> >
> > E.g., searching for "Ismael" gives:
> >
> > Possible matches (in order of most recent):
> > [1] Ismael Juma ism...@juma.me.uk (1514)
> > [2] Ismael Juma ij...@apache.org (3)
> > [3] Ismael Juma mli...@juma.me.uk (4)
> > [4] Ismael Juma ism...@confluent.io (19)
> > [5] Ismael Juma git...@juma.me.uk (7)
> >
> > it shows them in order of most recently occurring along with the number
> of
> > occurrences. Now that it's merged, it should be easier for folks to try
> it
> > out.
> >
> > Cheers,
> > David
> >
> > On Thu, Apr 20, 2023 at 1:02 PM Justine Olshan
> > 
> > wrote:
> >
> > > I've tried the script, but it's not quite complete.
> > > I've had issues finding folks -- if they haven't reviewed in kafka, we
> > can
> > > not find an email for them. I also had some issues with finding folks
> who
> > > had reviewed before.
> > >
> > > Right now, my strategy is to use GitHub to search previous commits for
> > > folks' emails, but that isn't the most optimal solution -- especially
> if
> > > the reviewer has no public email.
> > > I do think it is useful to have in the commit though, so if anyone has
> > some
> > > ideas on how to improve, I'd be happy to hear.
> > >
> > > Justine
> > >
> > > On Wed, Apr 19, 2023 at 6:53 AM Ismael Juma  wrote:
> > >
> > > > It's a lot more convenient to have it in the commit than having to
> > follow
> > > > links, etc.
> > > >
> > > > David Arthur also wrote a script to help with this step, I believe.
> > > >
> > > > Ismael
> > > >
> > > > On Tue, Apr 18, 2023, 9:29 AM Divij Vaidya 
> > > > wrote:
> > > >
> > > > > Do we even need a manual attribution for a reviewer in the commit
> > > > message?
> > > > > GitHub automatically marks the folks as "reviewers" who have used
> the
> > > > > "review-changes" button on the top left corner and left feedback.
> > > GitHub
> > > > > also has searchability for such reviews done by a particular person
> > > using
> > > > > the following link:
> > > > >
> > > > > https://github.com/search?q=is%3Apr+reviewed-by%3A
> > > > >
> > > >
> > >
> >
> +repo%3Aapache%2Fkafka+repo%3Aapache%2Fkafka-site=issues
> > > > >
> > > > > (replace  with the GitHub username)
> > > > >
> > > > > --
> > > > > Divij Vaidya
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Apr 18, 2023 at 4:09 PM Viktor Somogyi-Vass
> > > > >  wrote:
> > > > >
> > > > > > I'm not that familiar with Actions either, it just seemed like a
> > tool
> > > > for
> > > > > > this purpose. :)
> > > > > > I Did some digging and what I have in mind is that on pull
> request
> > > > review
> > > > > > it can trigger a workflow:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#pull_request_review
> > > > > >
> > > > > > We could in theory use Github CLI to edit the

[jira] [Resolved] (KAFKA-14929) Flaky KafkaStatusBackingStoreFormatTest#putTopicStateRetriableFailure

2023-04-27 Thread Viktor Somogyi-Vass (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi-Vass resolved KAFKA-14929.
-
Resolution: Fixed

> Flaky KafkaStatusBackingStoreFormatTest#putTopicStateRetriableFailure
> -
>
> Key: KAFKA-14929
> URL: https://issues.apache.org/jira/browse/KAFKA-14929
> Project: Kafka
>  Issue Type: Test
>  Components: KafkaConnect
>Reporter: Greg Harris
>Assignee: Sagar Rao
>Priority: Major
>  Labels: flaky-test
> Fix For: 3.5.0
>
>
> This test recently started flaky-failing with the following stack trace:
> {noformat}
> org.mockito.exceptions.verification.TooFewActualInvocations: 
> kafkaBasedLog.send(, , );
> Wanted 2 times:->
>  at org.apache.kafka.connect.util.KafkaBasedLog.send(KafkaBasedLog.java:376)
> But was 1 time:->
>  at 
> org.apache.kafka.connect.storage.KafkaStatusBackingStore.sendTopicStatus(KafkaStatusBackingStore.java:315)
>   at 
> app//org.apache.kafka.connect.util.KafkaBasedLog.send(KafkaBasedLog.java:376)
>   at 
> app//org.apache.kafka.connect.storage.KafkaStatusBackingStoreFormatTest.putTopicStateRetriableFailure(KafkaStatusBackingStoreFormatTest.java:219)
>   at 
> java.base@11.0.16.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>  Method)
>   at 
> java.base@11.0.16.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base@11.0.16.1/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base@11.0.16.1/java.lang.reflect.Method.invoke(Method.java:566)
> ...{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] New PMC chair: Mickael Maison

2023-04-21 Thread Viktor Somogyi-Vass
Jun, thank you for all your hard work! Also, congrats Mickael, it is very
well deserved :)

Best,
Viktor

On Fri, Apr 21, 2023, 18:15 Adam Bellemare  wrote:

> Thank you for all your hard work Jun - that's a decade-long legacy!
> And congratulations to you Mickael!
>
> On Fri, Apr 21, 2023 at 11:20 AM Josep Prat 
> wrote:
>
> > Thanks Jun for your work as Chair all these years!
> > Congratulations Mickael!
> >
> > Best,
> >
> > ———
> > Josep Prat
> >
> > Aiven Deutschland GmbH
> >
> > Alexanderufer 3-7, 10117 Berlin
> >
> > Amtsgericht Charlottenburg, HRB 209739 B
> >
> > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> >
> > m: +491715557497
> >
> > w: aiven.io
> >
> > e: josep.p...@aiven.io
> >
> > On Fri, Apr 21, 2023, 17:10 Jun Rao  wrote:
> >
> > > Hi, everyone,
> > >
> > > After more than 10 years, I am stepping down as the PMC chair of Apache
> > > Kafka. We now have a new chair Mickael Maison, who has been a PMC
> member
> > > since 2020. I plan to continue to contribute to Apache Kafka myself.
> > >
> > > Congratulations, Mickael!
> > >
> > > Jun
> > >
> >
>


Re: [DISCUSS] KIP-916: MM2 distributed mode flow log context

2023-04-21 Thread Viktor Somogyi-Vass
Hi Daniel,

I think this is a useful addition; it helps resolve issues and escalations,
and improves overall traceability.
Changing the logging context carries the risk that some existing log parsers
will no longer work on the new logs. As I see it, the feature is disabled by
default, which avoids that problem; however, disabling it by default also
limits its usefulness, because users may not know about the configuration and
so would not benefit from it when they face problems. Overall I'd prefer to go
with default=true, and I wanted to put this out here for the community to
discuss whether that is a problem.
Also, what was the reasoning behind rejecting the second alternative? As I see
it, that would be a viable option and perhaps a bit more idiomatic to the
logging framework.

A minor note: please update the JIRA link in the KIP to point to the right
one.

Best,
Viktor

On Thu, Apr 13, 2023 at 2:19 PM Dániel Urbán  wrote:

> Hi everyone,
>
> I would like to bump this thread. I think this would be very useful for any
> MM2 users, as the current logs with certain architectures (e.g. fan-out)
> are impossible to decipher.
> I already submitted a PR to demonstrate the proposed solution:
> https://github.com/apache/kafka/pull/13475
>
> Thanks for your comments in advance,
> Daniel
>
> Dániel Urbán  wrote (on 2023-03-30, Thu, 18:24):
>
> > Hello everyone,
> >
> > I would like to kick off a discussion about KIP-916:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-916%3A+MM2+distributed+mode+flow+log+context
> >
> > The KIP aims to enhance the diagnostic information for MM2 distributed
> > mode. MM2 relies on multiple Connect worker instances nested in a single
> > process. In Connect, Connector names are guaranteed to be unique in a
> > single process, but in MM2, this is not true. Because of this, the
> > diagnostics provided by Connect (client.ids, log context) do not ensure
> > that logs are distinguishable for different flows (Connect workers)
> inside
> > an MM2 process.
> >
> > Thanks for all you input in advance,
> > Daniel
> >
>


Re: [DISCUSS] KIP-918: MM2 Topic And Group Listener

2023-04-21 Thread Viktor Somogyi-Vass
Hi all,

A couple of comments:
1) Regarding the motivation: is the motivation simply monitoring-related, or
are there other reasons for it?
2) Can we make monitoring identical to the filters, so that we monitor exactly
the topics and groups that are actively filtered? (So group metrics are added
when the filter changes, rather than lazily when a checkpoint is created.)
3) Not sure if we want to widen the scope, but since these are interfaces I'd
use TopicPartition and some kind of GroupDescription class (not sure if the
latter exists) instead of Strings. If we need extra properties for these later
on, they can be added more easily. A rough sketch of what I mean follows below.
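
A purely hypothetical sketch of such a typed listener (not the KIP's actual
API; GroupDescription is only a stand-in and may not exist yet):

    import java.util.Set;
    import org.apache.kafka.common.TopicPartition;

    public interface ReplicationListenerSketch {
        // Invoked when the set of topic-partitions picked up by the topic filter changes.
        void onTopicPartitionsChanged(Set<TopicPartition> replicatedPartitions);

        // Invoked when the set of replicated consumer groups changes; a richer
        // GroupDescription-style type could replace the plain group name later on.
        void onGroupsChanged(Set<String> replicatedGroups);
    }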

Best,
Viktor

On Wed, Apr 19, 2023 at 1:42 PM Dániel Urbán  wrote:

> I wouldn't really include a non-existent group (same as we don't care about
> a non-existent topic), that doesn't really matter.
> I think having an existing group which doesn't have an offset to checkpoint
> is equivalent to a topic having no records to replicate from the monitoring
> perspective.
>
> I think the precise way to put it is to monitor the topics and groups
> picked up by the filtering logic of MM2. "The list currently replicated" is
> not a good definition, as an empty topic would still be interesting for
> monitoring purposes, even if there is no message to replicate.
> I think the core motivation is to capture the output of the
> TopicFilter/GroupFilter + the extra, built-in logic of MM2 related to
> filtering (e.g. internal topics are never replicated, the heartbeats topics
> are always replicated, and so on). This logic is too complex to reproduce
> in an external monitoring system, as it would need to use the exact same
> TopicFilter/GroupFilter configs as MM2 is using, and then implement the
> additional built-in logic of MM2 to finally get the topics and groups
> picked up by the replication.
>
> I think this would be useful in any replication setups (finding the
> effective list of filtered topics and groups), but especially useful when
> using the IdentityReplicationPolicy. One gap related to the identity policy
> is that we cannot find the replica topics of a specific flow, even when
> using MirrorClient, or having access to the source and target Kafka
> clusters, as the "traditional" way of finding replica topics is based on
> topic naming and the ReplicationPolicy.
>
> Thanks,
> Daniel
>
> hudeqi <16120...@bjtu.edu.cn> wrote (on 2023-04-19, Wed, 10:58):
>
> > Thanks for your reply, Daniel.
> > Regarding the group list, do you mean that if the group of the source
> > cluster has not committed an offset (the group does not exist or the
> group
> > has not committed an offset to the topic being replicated), then the
> > current metric cannot be collected? Then this involves the question of
> > motivation: Do we want to monitor the topic list and group list we
> > configured, or the topic list and group list that are currently being
> > replicated? If it is the latter, shouldn't it be detected for a group
> that
> > has not committed an offset? I don't know if I understand correctly.
> >
> > best,
> > hudeqi
> >
> >
>  -----Original Message-----
>  From: "Dániel Urbán" 
>  Sent: 2023-04-19 15:50:01 (Wednesday)
>  To: dev@kafka.apache.org
>  Cc:
>  Subject: Re: Re: [DISCUSS] KIP-918: MM2 Topic And Group Listener
> > 
> > 
>


Re: Adding reviewers with Github actions

2023-04-18 Thread Viktor Somogyi-Vass
I'm not that familiar with Actions either; it just seemed like a tool for
this purpose. :)
I did some digging, and what I have in mind is that a pull request review can
trigger a workflow:
https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#pull_request_review

We could in theory use the GitHub CLI to edit the description of the PR when
someone gives a review (or we could perhaps enable this to simply comment
too):
https://docs.github.com/en/actions/using-workflows/using-github-cli-in-workflows

So the action definition would look something like the one below. Note that
the "run" part is very basic; it's just here to show the idea. We'll probably
need a shell script instead of that line to format it better. But the point
is that it edits the PR and adds the reviewer:

name: Add reviewer
on:
  pull_request_review:
    types:
      - submitted
jobs:
  comment:
    runs-on: ubuntu-latest
    steps:
      # Very basic "run" step, just to show the idea; a real version would likely
      # use a small shell script to format the new body properly.
      - run: gh pr edit "$PR_ID" --title "$PR_TITLE" --body "$PR_BODY\n\nReviewers: $SENDER"
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PR_ID: ${{ github.event.pull_request.number }}
          PR_TITLE: ${{ github.event.pull_request.title }}
          PR_BODY: ${{ github.event.pull_request.body }}
          SENDER: ${{ github.event.sender.login }}

I'll take a look and see if I can try this out on my fork, and get back if it
leads to anything.

Viktor

On Tue, Apr 18, 2023 at 10:12 AM Josep Prat 
wrote:

> Hi all,
> Unless I miss something, wouldn't this GitHub action either amend the
> commit (breaking signature if any) or directly do the commit itself
> (meaning the action would be the one squashing and merging and not the
> maintainer anymore)?
>
> Let me know if I'm missing something or if there are some nice hidden
> tricks in GitHub that I didn't know :)
>
> Best,
> On Tue, Apr 18, 2023 at 9:48 AM Viktor Somogyi-Vass
>  wrote:
>
> > Hi all,
> >
> > Unfortunately I forgot to add myself as a reviewer *again *on a PR when
> > merging. Shame on me.
> > However I was thinking about looking into Github actions whether we can
> > automate this process or at least prevent PRs from merging that don't
> have
> > "reviewers" in the description.
> >
> > Has anyone ever looked at it, is it worth chasing this or does anyone
> know
> > anything that'd prevent us from using it?
> >
> > Viktor
> >
>
>
> --
> [image: Aiven] <https://www.aiven.io>
>
> *Josep Prat*
> Open Source Engineering Director, *Aiven*
> josep.p...@aiven.io   |   +491715557497
> aiven.io <https://www.aiven.io>   |   <https://www.facebook.com/aivencloud
> >
>   <https://www.linkedin.com/company/aiven/>   <
> https://twitter.com/aiven_io>
> *Aiven Deutschland GmbH*
> Alexanderufer 3-7, 10117 Berlin
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> Amtsgericht Charlottenburg, HRB 209739 B
>


Adding reviewers with Github actions

2023-04-18 Thread Viktor Somogyi-Vass
Hi all,

Unfortunately I forgot to add myself as a reviewer *again* on a PR when
merging. Shame on me.
However, I was thinking about looking into GitHub Actions to see whether we can
automate this process, or at least prevent PRs that don't have "reviewers" in
the description from being merged.

Has anyone ever looked at this? Is it worth chasing, or does anyone know of
anything that would prevent us from using it?

Viktor


Re: [ANNOUNCE] New Kafka PMC Member: David Arthur

2023-03-10 Thread Viktor Somogyi-Vass
Congrats David!

On Fri, Mar 10, 2023 at 9:12 AM Tom Bentley  wrote:

> Congratulations!
>
> On Fri, 10 Mar 2023 at 03:36, John Roesler  wrote:
>
> > Congratulations, David!
> > -John
> >
> > On Thu, Mar 9, 2023, at 20:18, ziming deng wrote:
> > > Congrats David!
> > >
> > > Ziming
> > >
> > >> On Mar 10, 2023, at 10:02, Luke Chen  wrote:
> > >>
> > >> Congratulations, David!
> > >>
> > >> On Fri, Mar 10, 2023 at 9:56 AM Yash Mayya 
> > wrote:
> > >>
> > >>> Congrats David!
> > >>>
> > >>> On Thu, Mar 9, 2023, 23:42 Jun Rao  wrote:
> > >>>
> >  Hi, Everyone,
> > 
> >  David Arthur has been a Kafka committer since 2013. He has been very
> >  instrumental to the community since becoming a committer. It's my
> > >>> pleasure
> >  to announce that David is now a member of Kafka PMC.
> > 
> >  Congratulations David!
> > 
> >  Jun
> >  on behalf of Apache Kafka PMC
> > 
> > >>>
> >
> >
>


Re: [ANNOUNCE] New Kafka PMC Member: Chris Egerton

2023-03-10 Thread Viktor Somogyi-Vass
Congrats Chris!

On Fri, Mar 10, 2023 at 12:12 PM Sagar  wrote:

> Congrats Chris! Absolutely well deserved!
>
> Sagar.
>
> On Fri, Mar 10, 2023 at 1:42 PM Tom Bentley  wrote:
>
> > Congratulations!
> >
> > On Fri, 10 Mar 2023 at 03:35, John Roesler  wrote:
> >
> > > Congratulations, Chris!
> > > -John
> > >
> > > On Thu, Mar 9, 2023, at 20:02, Luke Chen wrote:
> > > > Congratulations, Chris!
> > > >
> > > > On Fri, Mar 10, 2023 at 9:57 AM Yash Mayya 
> > wrote:
> > > >
> > > >> Congratulations Chris!
> > > >>
> > > >> On Thu, Mar 9, 2023, 23:42 Jun Rao 
> wrote:
> > > >>
> > > >> > Hi, Everyone,
> > > >> >
> > > >> > Chris Egerton has been a Kafka committer since July 2022. He has
> > been
> > > >> very
> > > >> > instrumental to the community since becoming a committer. It's my
> > > >> pleasure
> > > >> > to announce that Chris is now a member of Kafka PMC.
> > > >> >
> > > >> > Congratulations Chris!
> > > >> >
> > > >> > Jun
> > > >> > on behalf of Apache Kafka PMC
> > > >> >
> > > >>
> > >
> > >
> >
>


Re: [ANNOUNCE] New committer: Stanislav Kozlovski

2023-01-24 Thread Viktor Somogyi-Vass
Congrats Stan! :)

On Fri, Jan 20, 2023 at 12:35 AM Colin McCabe  wrote:

> Congratulations, Stan! Well deserved.
>
> best,
> Colin
>
> On Tue, Jan 17, 2023, at 07:50, Jun Rao wrote:
> > Hi, Everyone,
> >
> > The PMC of Apache Kafka is pleased to announce a new Kafka committer
> > Stanislav Kozlovski.
> >
> > Stan has been contributing to Apache Kafka since June 2018. He made
> various
> > contributions including the following KIPs.
> >
> > KIP-455: Create an Administrative API for Replica Reassignment
> > KIP-412: Extend Admin API to support dynamic application log levels
> >
> > Congratulations, Stan!
> >
> > Thanks,
> >
> > Jun (on behalf of the Apache Kafka PMC)
>


Re: [VOTE] KIP-710: Full support for distributed mode in dedicated MirrorMaker 2.0 clusters

2023-01-10 Thread Viktor Somogyi-Vass
Ok, then +1 (binding) :)

On Mon, Jan 9, 2023 at 3:44 PM John Roesler  wrote:

> Yes, you are!
>
> Congrats again :)
> -John
>
> On Mon, Jan 9, 2023, at 08:25, Viktor Somogyi-Vass wrote:
> > Hey all,
> >
> > Now that I'm a committer am I allowed to change my non-binding vote to
> > binding to pass the KIP? :)
> >
> > On Thu, Nov 10, 2022 at 6:13 PM Greg Harris  >
> > wrote:
> >
> >> +1 (non-binding)
> >>
> >> Thanks for the KIP, this is an important improvement.
> >>
> >> Greg
> >>
> >> On Thu, Nov 10, 2022 at 7:21 AM John Roesler 
> wrote:
> >>
> >> > Thanks for the KIP, Daniel!
> >> >
> >> > I'm no MM expert, but I've read over the KIP and discussion, and it
> seems
> >> > reasonable to me.
> >> >
> >> > I'm +1 (binding).
> >> >
> >> > Thanks,
> >> > -John
> >> >
> >> > On 2022/10/22 07:38:38 Urbán Dániel wrote:
> >> > > Hi everyone,
> >> > >
> >> > > I would like to start a vote on KIP-710 which aims to support
> running a
> >> > > dedicated MM2 cluster in distributed mode:
> >> > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-710%3A+Full+support+for+distributed+mode+in+dedicated+MirrorMaker+2.0+clusters
> >> > >
> >> > > Regards,
> >> > > Daniel
> >> > >
> >> > >
> >> > > --
> > > This e-mail has been checked by Avast AntiVirus software.
> > > www.avast.com
> >> > >
> >> >
> >>
>


Re: [VOTE] KIP-710: Full support for distributed mode in dedicated MirrorMaker 2.0 clusters

2023-01-09 Thread Viktor Somogyi-Vass
Hey all,

Now that I'm a committer am I allowed to change my non-binding vote to
binding to pass the KIP? :)

On Thu, Nov 10, 2022 at 6:13 PM Greg Harris 
wrote:

> +1 (non-binding)
>
> Thanks for the KIP, this is an important improvement.
>
> Greg
>
> On Thu, Nov 10, 2022 at 7:21 AM John Roesler  wrote:
>
> > Thanks for the KIP, Daniel!
> >
> > I'm no MM expert, but I've read over the KIP and discussion, and it seems
> > reasonable to me.
> >
> > I'm +1 (binding).
> >
> > Thanks,
> > -John
> >
> > On 2022/10/22 07:38:38 Urbán Dániel wrote:
> > > Hi everyone,
> > >
> > > I would like to start a vote on KIP-710 which aims to support running a
> > > dedicated MM2 cluster in distributed mode:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-710%3A+Full+support+for+distributed+mode+in+dedicated+MirrorMaker+2.0+clusters
> > >
> > > Regards,
> > > Daniel
> > >
> > >
> > > --
> > > This e-mail has been checked by Avast AntiVirus software.
> > > www.avast.com
> > >
> >
>


Re: [ANNOUNCE] New committer: Edoardo Comar

2023-01-07 Thread Viktor Somogyi-Vass
Congrats Edoardo!

On Sat, Jan 7, 2023, 18:15 Bill Bejeck  wrote:

> Congratulations, Edoardo!
>
> -Bill
>
> On Sat, Jan 7, 2023 at 12:11 PM John Roesler  wrote:
>
> > Congrats, Edoardo!
> > -John
> >
> > On Fri, Jan 6, 2023, at 20:47, Matthias J. Sax wrote:
> > > Congrats!
> > >
> > > On 1/6/23 5:15 PM, Luke Chen wrote:
> > >> Congratulations, Edoardo!
> > >>
> > >> Luke
> > >>
> > >> On Sat, Jan 7, 2023 at 7:58 AM Mickael Maison <
> mickael.mai...@gmail.com
> > >
> > >> wrote:
> > >>
> > >>> Congratulations Edo!
> > >>>
> > >>>
> > >>> On Sat, Jan 7, 2023 at 12:05 AM Jun Rao 
> > wrote:
> > 
> >  Hi, Everyone,
> > 
> >  The PMC of Apache Kafka is pleased to announce a new Kafka committer
> > >>> Edoardo
> >  Comar.
> > 
> >  Edoardo has been a long time Kafka contributor since 2016. His major
> >  contributions are the following.
> > 
> >  KIP-302: Enable Kafka clients to use all DNS resolved IP addresses
> >  KIP-277: Fine Grained ACL for CreateTopics API
> >  KIP-136: Add Listener name to SelectorMetrics tags
> > 
> >  Congratulations, Edoardo!
> > 
> >  Thanks,
> > 
> >  Jun (on behalf of the Apache Kafka PMC)
> > >>>
> > >>
> >
>


Re: [DISCUSS] KIP-896: Remove old client protocol API versions in Kafka 4.0

2023-01-04 Thread Viktor Somogyi-Vass
Hi Ismael,

I think this is a good idea; it can also simplify the code quite a bit.
I was going to ask whether you think it would be useful to automate protocol
deprecation, but I see you'll have a separate KIP for that, so I'll just
wait :).

I see that in multiple places clients implement protocol versions older than
the baseline. In your opinion, would it make sense to also document that
protocols below the baseline will be deprecated in 4.0 and may be removed in
later versions (perhaps with the deprecation/removal automation)?

Viktor

On Tue, Jan 3, 2023 at 5:18 PM Ismael Juma  wrote:

> Hi all,
>
> I would like to start a discussion regarding the removal of very old client
> protocol API versions in Apache Kafka 4.0 to improve maintainability &
> supportability of Kafka. Please take a look at the proposal:
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-896%3A+Remove+old+client+protocol+API+versions+in+Kafka+4.0
>
> Ismael
>


Re: [ANNOUNCE] New committer: Justine Olshan

2022-12-29 Thread Viktor Somogyi-Vass
Congrats Justine!

On Thu, Dec 29, 2022, 22:05 Chris Egerton  wrote:

> Congrats, Justine!
>
> On Thu, Dec 29, 2022, 15:58 David Jacot  wrote:
>
> > Hi all,
> >
> > The PMC of Apache Kafka is pleased to announce a new Kafka committer
> > Justine
> > Olshan.
> >
> > Justine has been contributing to Kafka since June 2019. She contributed
> 53
> > PRs including the following KIPs.
> >
> > KIP-480: Sticky Partitioner
> > KIP-516: Topic Identifiers & Topic Deletion State Improvements
> > KIP-854: Separate configuration for producer ID expiry
> > KIP-890: Transactions Server-Side Defense (in progress)
> >
> > Congratulations, Justine!
> >
> > Thanks,
> >
> > David (on behalf of the Apache Kafka PMC)
> >
>


Re: [ANNOUNCE] New committer: Satish Duggana

2022-12-23 Thread Viktor Somogyi-Vass
Congrats Satish!

On Fri, Dec 23, 2022, 19:38 Mickael Maison  wrote:

> Congratulations Satish!
>
> On Fri, Dec 23, 2022 at 7:36 PM Divij Vaidya 
> wrote:
> >
> > Congratulations Satish! 
> >
> > On Fri 23. Dec 2022 at 19:32, Josep Prat 
> > wrote:
> >
> > > Congrats Satish!
> > >
> > > ———
> > > Josep Prat
> > >
> > > Aiven Deutschland GmbH
> > >
> > > Immanuelkirchstraße 26, 10405 Berlin
> > >
> > > Amtsgericht Charlottenburg, HRB 209739 B
> > >
> > > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > >
> > > m: +491715557497
> > >
> > > w: aiven.io
> > >
> > > e: josep.p...@aiven.io
> > >
> > > On Fri, Dec 23, 2022, 19:23 Chris Egerton 
> wrote:
> > >
> > > > Congrats, Satish!
> > > >
> > > > On Fri, Dec 23, 2022, 13:19 Arun Raju  wrote:
> > > >
> > > > > Congratulations 
> > > > >
> > > > > On Fri, Dec 23, 2022, 1:08 PM Jun Rao 
> > > wrote:
> > > > >
> > > > > > Hi, Everyone,
> > > > > >
> > > > > > The PMC of Apache Kafka is pleased to announce a new Kafka
> committer
> > > > > Satish
> > > > > > Duggana.
> > > > > >
> > > > > > Satish has been a long time Kafka contributor since 2017. He is
> the
> > > > main
> > > > > > driver behind KIP-405 that integrates Kafka with remote storage,
> a
> > > > > > significant and much anticipated feature in Kafka.
> > > > > >
> > > > > > Congratulations, Satish!
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun (on behalf of the Apache Kafka PMC)
> > > > > >
> > > > >
> > > >
> > >
> > --
> > Divij Vaidya
>


Re: [ANNOUNCE] New committer: Josep Prat

2022-12-20 Thread Viktor Somogyi-Vass
Congrats Josep!

On Tue, Dec 20, 2022, 21:56 Matthias J. Sax  wrote:

> Congrats!
>
> On 12/20/22 12:01 PM, Josep Prat wrote:
> > Thank you all!
> >
> > ———
> > Josep Prat
> >
> > Aiven Deutschland GmbH
> >
> > Immanuelkirchstraße 26, 10405 Berlin
> >
> > Amtsgericht Charlottenburg, HRB 209739 B
> >
> > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> >
> > m: +491715557497
> >
> > w: aiven.io
> >
> > e: josep.p...@aiven.io
> >
> > On Tue, Dec 20, 2022, 20:42 Bill Bejeck  wrote:
> >
> >> Congratulations Josep!
> >>
> >> -Bill
> >>
> >> On Tue, Dec 20, 2022 at 1:11 PM Mickael Maison <
> mickael.mai...@gmail.com>
> >> wrote:
> >>
> >>> Congratulations Josep!
> >>>
> >>> On Tue, Dec 20, 2022 at 6:55 PM Bruno Cadonna 
> >> wrote:
> 
>  Congrats, Josep!
> 
>  Well deserved!
> 
>  Best,
>  Bruno
> 
>  On 20.12.22 18:40, Kirk True wrote:
> > Congrats Josep!
> >
> > On Tue, Dec 20, 2022, at 9:33 AM, Jorge Esteban Quilcate Otoya wrote:
> >> Congrats Josep!!
> >>
> >> On Tue, 20 Dec 2022, 17:31 Greg Harris,
> >>  
> >> wrote:
> >>
> >>> Congratulations Josep!
> >>>
> >>> On Tue, Dec 20, 2022 at 9:29 AM Chris Egerton <
> >>> fearthecel...@gmail.com>
> >>> wrote:
> >>>
>  Congrats Josep! Well-earned.
> 
>  On Tue, Dec 20, 2022, 12:26 Jun Rao 
> >>> wrote:
> 
> > Hi, Everyone,
> >
> > The PMC of Apache Kafka is pleased to announce a new Kafka
> >>> committer
>  Josep
> >Prat.
> >
> > Josep has been contributing to Kafka since May 2021. He
> >>> contributed 20
>  PRs
> > including the following 2 KIPs.
> >
> > KIP-773 Differentiate metric latency measured in ms and ns
> > KIP-744: Migrate TaskMetadata and ThreadMetadata to an interface
> >>> with
> > internal implementation
> >
> > Congratulations, Josep!
> >
> > Thanks,
> >
> > Jun (on behalf of the Apache Kafka PMC)
> >
> 
> >>>
> >>
> >
> >>>
> >>
> >
>


Re: [ANNOUNCE] New committer: Viktor Somogyi-Vass

2022-12-17 Thread Viktor Somogyi-Vass
Thanks again everyone!

On Fri, Dec 16, 2022, 18:36 Bill Bejeck  wrote:

> Congratulations, Viktor!
>
> -Bill
>
> On Fri, Dec 16, 2022 at 12:32 PM Matthias J. Sax  wrote:
>
> > Congrats!
> >
> > On 12/15/22 7:10 AM, Rajini Sivaram wrote:
> > > Congratulations, Viktor!
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> > >
> > > On Thu, Dec 15, 2022 at 11:41 AM Ron Dagostino 
> > wrote:
> > >
> > >> Congrats to you too, Victor!
> > >>
> > >> Ron
> > >>
> > >>> On Dec 15, 2022, at 4:59 AM, Viktor Somogyi-Vass <
> > >> viktor.somo...@cloudera.com.invalid> wrote:
> > >>>
> > >>> Thank you everyone! :)
> > >>>
> > >>>> On Thu, Dec 15, 2022 at 10:22 AM Mickael Maison <
> > >> mickael.mai...@gmail.com>
> > >>>> wrote:
> > >>>>
> > >>>> Congratulations Viktor!
> > >>>>
> > >>>>> On Thu, Dec 15, 2022 at 10:06 AM Tamas Barnabas Egyed
> > >>>>>  wrote:
> > >>>>>
> > >>>>> Congratulations, Viktor!
> > >>>>
> > >>
> > >
> >
>


Re: [ANNOUNCE] New Kafka PMC Member: Luke Chen

2022-12-16 Thread Viktor Somogyi-Vass
Congrats Luke! :)

On Fri, Dec 16, 2022, 21:26 Randall Hauch  wrote:

> Congratulations, Luke!
>
> On Fri, Dec 16, 2022 at 2:08 PM Josep Prat 
> wrote:
>
> > Congrats Luke!
> >
> > On Fri, Dec 16, 2022 at 8:55 PM Bill Bejeck  wrote:
> >
> > > Congratulations Luke!
> > >
> > > -Bill
> > >
> > > On Fri, Dec 16, 2022 at 2:47 PM Chris Egerton  >
> > > wrote:
> > >
> > > > Congrats Luke, well-deserved!
> > > >
> > > > On Fri, Dec 16, 2022, 14:41 Jun Rao 
> wrote:
> > > >
> > > > > Hi, Everyone,
> > > > >
> > > > > Luke Chen has been a Kafka committer since Feb. 9, 2022. He has
> been
> > > very
> > > > > instrumental to the community since becoming a committer. It's my
> > > > pleasure
> > > > > to announce that Luke  is now a member of Kafka PMC.
> > > > >
> > > > > Congratulations Luke!
> > > > >
> > > > > Jun
> > > > > on behalf of Apache Kafka PMC
> > > > >
> > > >
> > >
> >
> >
> > --
> > [image: Aiven] 
> >
> > *Josep Prat*
> > Open Source Engineering Director, *Aiven*
> > josep.p...@aiven.io   |   +491715557497
> > aiven.io    |   <
> https://www.facebook.com/aivencloud
> > >
> >      <
> > https://twitter.com/aiven_io>
> > *Aiven Deutschland GmbH*
> > Immanuelkirchstraße 26, 10405 Berlin
> > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > Amtsgericht Charlottenburg, HRB 209739 B
> >
>


Re: [ANNOUNCE] New committer: Viktor Somogyi-Vass

2022-12-15 Thread Viktor Somogyi-Vass
Thank you everyone! :)

On Thu, Dec 15, 2022 at 10:22 AM Mickael Maison 
wrote:

> Congratulations Viktor!
>
> On Thu, Dec 15, 2022 at 10:06 AM Tamas Barnabas Egyed
>  wrote:
> >
> > Congratulations, Viktor!
>


Re: [ANNOUNCE] New committer: Ron Dagostino

2022-12-15 Thread Viktor Somogyi-Vass
Congrats Ron! :)

On Thu, Dec 15, 2022 at 10:22 AM Mickael Maison 
wrote:

> Congratulations Ron!
>
> On Thu, Dec 15, 2022 at 9:41 AM Eslam Farag  wrote:
> >
> > Congratulations, Ron ☺️
> >
> > On Thu, 15 Dec 2022 at 10:40 AM Tom Bentley  wrote:
> >
> > > Congratulations!
> > >
> > > On Thu, 15 Dec 2022 at 07:40, Satish Duggana  >
> > > wrote:
> > >
> > > > Congratulations, Ron!!
> > > >
> > > > On Thu, 15 Dec 2022 at 07:48, ziming deng 
> > > > wrote:
> > > >
> > > > > Congratulations, Ron!
> > > > > Well deserved!
> > > > >
> > > > > --
> > > > > Ziming
> > > > >
> > > > > > On Dec 15, 2022, at 09:16, Luke Chen  wrote:
> > > > > >
> > > > > > Congratulations, Ron!
> > > > > > Well deserved!
> > > > > >
> > > > > > Luke
> > > > >
> > > > >
> > > >
> > >
>


Re: [DISCUSS] KIP-879: Multi-level Rack Awareness

2022-12-13 Thread Viktor Somogyi-Vass
Hi Ziming,

Thanks for the reply. It may not be a public interface, yes, but it is
annotated with the @InterfaceStability.Unstable annotation, from which I
presumed it will become public at some point. Therefore in my KIP I proposed
making it public, as we've seen significant interest from our users in stretch
clusters and the kind of replica placement I described. In fact, in our latest
release we've already implemented something like this, and we'd like to
contribute it back if the community also feels the need for it.
May I ask why you don't want to make this public?

Thanks,
Viktor

On Thu, Dec 8, 2022 at 3:06 PM Andrew Otto  wrote:

> FWIW, the Wikimedia Foundation would find this change really helpful.  We
> are going to soon experiment with a stretched Kafka cluster, and it would
> be nice to be able to target datacenter AND racks for replica placement.
>
> On Thu, Dec 8, 2022 at 3:37 AM ziming deng 
> wrote:
>
> > Hi Viktor,
> >
> > As far as I know, we haven't make ReplicaPlacer a public interface, and
> we
> > have no plan to make it public. I think you can submit a discussion or
> > create a JIRA ticket directly without KIP if you have ideas on improving
> > it, right?
> >
> > --
> > Best,
> > Ziming
> >
> > > On Nov 29, 2022, at 21:52, Viktor Somogyi-Vass <
> > viktor.somo...@cloudera.com.INVALID> wrote:
> > >
> > > Hi All,
> > >
> > > I'd like to bump this. I've also updated the KIP to incorporate the new
> > > KRaft changes (ReplicaPlacer). Luckily my proposals were quite similar
> to
> > > that, so mostly I've made some minor rewording, naming changes, etc.
> > >
> > > Again, the brief summary of the KIP:
> > > - expose replica placement strategies with a new config
> > > - create an admin API and protocol to expose replica placement
> > > functionality (mainly for the reassignment tool)
> > > - create a new multi-level rack awareness strategy which improves
> > > availability on stretch clusters
> > >
> > > I'm happy for any feedback.
> > >
> > > Best,
> > > Viktor
> > >
> > > On Fri, Oct 28, 2022 at 4:14 PM Viktor Somogyi-Vass <
> > > viktor.somo...@cloudera.com> wrote:
> > >
> > >> Hey all,
> > >>
> > >> I'd like to propose a new broker side replica assignment strategy and
> an
> > >> interface that generalizes replica assignment on brokers and makes
> them
> > >> pluggable.
> > >>
> > >> Briefly, the motivation for the new replica assignment strategy is
> that
> > >> more and more of our customers would want to run their clusters in a
> > >> stretched environment, where for instance a cluster is running over
> > >> multiple regions (and multiple racks inside a region). Since this
> seems
> > >> like a more common need, we'd like to contribute back our
> implementation
> > >> and also make a generalized interface, so that new strategies that
> > people
> > >> may come up with could be served better.
> > >>
> > >> I welcome any feedback on this KIP.
> > >>
> > >> The link:
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-879%3A+Multi-level+Rack+Awareness
> > >>
> > >> Best to all,
> > >> Viktor
> > >>
> >
> >
>


Re: [DISCUSS] KIP-879: Multi-level Rack Awareness

2022-11-29 Thread Viktor Somogyi-Vass
Hi All,

I'd like to bump this. I've also updated the KIP to incorporate the new
KRaft changes (ReplicaPlacer). Luckily my proposals were quite similar to
that, so mostly I've made some minor rewording, naming changes, etc.

Again, the brief summary of the KIP:
- expose replica placement strategies with a new config
- create an admin API and protocol to expose replica placement
functionality (mainly for the reassignment tool)
- create a new multi-level rack awareness strategy which improves
availability on stretch clusters

I'm happy for any feedback.

Best,
Viktor

On Fri, Oct 28, 2022 at 4:14 PM Viktor Somogyi-Vass <
viktor.somo...@cloudera.com> wrote:

> Hey all,
>
> I'd like to propose a new broker side replica assignment strategy and an
> interface that generalizes replica assignment on brokers and makes them
> pluggable.
>
> Briefly, the motivation for the new replica assignment strategy is that
> more and more of our customers would want to run their clusters in a
> stretched environment, where for instance a cluster is running over
> multiple regions (and multiple racks inside a region). Since this seems
> like a more common need, we'd like to contribute back our implementation
> and also make a generalized interface, so that new strategies that people
> may come up with could be served better.
>
> I welcome any feedback on this KIP.
>
> The link:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-879%3A+Multi-level+Rack+Awareness
>
> Best to all,
> Viktor
>


Re: [DISCUSS] KIP-881: Rack-aware Partition Assignment for Kafka Consumers

2022-11-03 Thread Viktor Somogyi-Vass
Hi Rajini,

If I understand correctly, the client.rack config would stay supported after
KIP-848, but does this config also expand the scope of that KIP? I mean that
you currently propose using ConsumerProtocolSubscription, but that protocol
won't be available there, and we would need to transfer the config to the
coordinator by other means. Should this be added to that KIP?

Thanks,
Viktor
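
For reference, the client.rack setting mentioned above is an ordinary consumer
config; a minimal sketch of a consumer advertising its rack might look like the
following (broker address, group id and rack value are placeholders):

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class RackAwareConsumerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "example-group");
            // The rack of this client; KIP-881 proposes propagating this value in the
            // group protocol so that assignors can take locality into account.
            props.put(ConsumerConfig.CLIENT_RACK_CONFIG, "rack-a");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // Subscribe and poll as usual; only the config above is rack-specific.
            }
        }
    }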

On Wed, Nov 2, 2022 at 9:50 PM Rajini Sivaram 
wrote:

> Hi Jun,
>
> Thank you for the review. Yes, we should add rack id to Subscription, had
> missed that part. Updated the KIP, thank you for pointing that out!
>
> Regards,
>
> Rajini
>
> On Wed, Nov 2, 2022 at 7:06 PM Jun Rao  wrote:
>
> > Hi, Rajini,
> >
> > Thanks for the KIP. Just one comment.
> >
> > Should we add rackId to GroupSubscription.Subscription for each member?
> >
> > Thanks,
> >
> > Jun
> >
> > On Wed, Nov 2, 2022 at 4:57 AM Rajini Sivaram 
> > wrote:
> >
> > > Hi all,
> > >
> > > I have submitted KIP-881 to implement rack-aware partition assignment
> for
> > > consumers:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-881%3A+Rack-aware+Partition+Assignment+for+Kafka+Consumers
> > > .
> > > It adds rack id to the consumer group protocol to propagate rack
> > > information so that rack-aware assignors can be added to benefit from
> > > locality.
> > >
> > > Feedback and suggestions are welcome!
> > >
> > > Thank you,
> > >
> > > Rajini
> > >
> >
>


[DISCUSS] KIP-879: Multi-level Rack Awareness

2022-10-28 Thread Viktor Somogyi-Vass
Hey all,

I'd like to propose a new broker-side replica assignment strategy and an
interface that generalizes replica assignment on brokers and makes it
pluggable.

Briefly, the motivation for the new replica assignment strategy is that more
and more of our customers want to run their clusters in a stretched
environment, where for instance a cluster runs over multiple regions (and
multiple racks inside a region). Since this seems to be an increasingly common
need, we'd like to contribute back our implementation and also provide a
generalized interface, so that new strategies that people come up with can be
served better.

I welcome any feedback on this KIP.

The link:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-879%3A+Multi-level+Rack+Awareness

Best to all,
Viktor


Re: [VOTE] KIP-710: Full support for distributed mode in dedicated MirrorMaker 2.0 clusters

2022-10-27 Thread Viktor Somogyi-Vass
+1 (non-binding). Thanks for the KIP!

On Mon, Oct 24, 2022 at 3:23 PM Chris Egerton 
wrote:

> +1 (binding). Thanks for the KIP!
>
> On Sat, Oct 22, 2022 at 3:38 AM Urbán Dániel 
> wrote:
>
> > Hi everyone,
> >
> > I would like to start a vote on KIP-710 which aims to support running a
> > dedicated MM2 cluster in distributed mode:
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-710%3A+Full+support+for+distributed+mode+in+dedicated+MirrorMaker+2.0+clusters
> >
> > Regards,
> > Daniel
> >
> >
> > --
> > This e-mail has been checked by Avast AntiVirus software.
> > www.avast.com
> >
>


[jira] [Resolved] (KAFKA-14331) Upgrade to Scala 2.13.10

2022-10-25 Thread Viktor Somogyi-Vass (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi-Vass resolved KAFKA-14331.
-
Resolution: Duplicate

> Upgrade to Scala 2.13.10
> 
>
> Key: KAFKA-14331
> URL: https://issues.apache.org/jira/browse/KAFKA-14331
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 3.4.0
>    Reporter: Viktor Somogyi-Vass
>Priority: Major
>
> There are some CVEs in Scala 2.13.8, so we should upgrade to the latest.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14331) Upgrade to Scala 2.13.10

2022-10-24 Thread Viktor Somogyi-Vass (Jira)
Viktor Somogyi-Vass created KAFKA-14331:
---

 Summary: Upgrade to Scala 2.13.10
 Key: KAFKA-14331
 URL: https://issues.apache.org/jira/browse/KAFKA-14331
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 3.4.0
Reporter: Viktor Somogyi-Vass


There are some CVEs in Scala 2.13.8, so we should upgrade to the latest.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14281) Multi-level rack awareness

2022-10-05 Thread Viktor Somogyi-Vass (Jira)
Viktor Somogyi-Vass created KAFKA-14281:
---

 Summary: Multi-level rack awareness
 Key: KAFKA-14281
 URL: https://issues.apache.org/jira/browse/KAFKA-14281
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 3.4.0
Reporter: Viktor Somogyi-Vass
Assignee: Viktor Somogyi-Vass


h1. Motivation

With replication services, data can be replicated across independent Kafka 
clusters in multiple data centers. In addition, many customers need "stretch 
clusters" - a single Kafka cluster that spans multiple data centers. This 
architecture has the following useful characteristics:
 - Data is natively replicated into all data centers by Kafka topic replication.
 - No data is lost when 1 DC is lost and no configuration change is required - 
the design implicitly relies on native Kafka replication.
 - From an operational point of view, it is much easier to configure and operate 
such a topology than a replication scenario via MM2.

Kafka should provide "native" support for stretch clusters, covering any 
special aspects of operating a stretch cluster.

h2. Multi-level rack awareness

Additionally, stretch clusters are implemented using the rack awareness 
feature, where each DC is represented as a rack. This ensures that replicas are 
spread across DCs evenly. Unfortunately, there are cases where this is too 
limiting - in case there are actual racks inside the DCs, we cannot specify 
those. Consider having 3 DCs with 2 racks each:

/DC1/R1, /DC1/R2
/DC2/R1, /DC2/R2
/DC3/R1, /DC3/R2

If we were to use racks as DC1, DC2, DC3, we lose the rack-level information of 
the setup. This means that when using RF=6, the 2 replicas assigned to DC1 may 
both end up in the same rack.

If we were to use racks as /DC1/R1, /DC1/R2, etc, then when using RF=3, it is 
possible that 2 replicas end up in the same DC, e.g. /DC1/R1, /DC1/R2, /DC2/R1.

Because of this, Kafka should support "multi-level" racks, which means that 
rack IDs should be able to describe some kind of a hierarchy. With this 
feature, brokers should be able to:
 # spread replicas evenly based on the top level of the hierarchy (i.e. first, 
between DCs)
 # then inside a top-level unit (DC), if there are multiple replicas, they 
should be spread evenly among lower-level units (i.e. between racks, then 
between physical hosts, and so on)
 ## repeat for all levels



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14250) Exception during normal operation in MirrorSourceTask causes the task to fail instead of shutting down gracefully

2022-09-21 Thread Viktor Somogyi-Vass (Jira)
Viktor Somogyi-Vass created KAFKA-14250:
---

 Summary: Exception during normal operation in MirrorSourceTask 
causes the task to fail instead of shutting down gracefully
 Key: KAFKA-14250
 URL: https://issues.apache.org/jira/browse/KAFKA-14250
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect, mirrormaker
Affects Versions: 3.3
Reporter: Viktor Somogyi-Vass
Assignee: Viktor Somogyi-Vass


In MirrorSourceTask we are loading offsets for the topic partitions. At this 
point, while we are fetching the partitions, it is possible for the offset 
reader to be stopped by a parallel thread. Stopping the reader causes a 
CancellationException to be thrown, due to KAFKA-9051.

Currently this exception is not caught in MirrorSourceTask, so it propagates 
up and causes the task to go into the FAILED state. We only need it to go to 
the STOPPED state so that it can be restarted later.

This can be achieved by catching the exception and stopping the task directly.
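
A minimal sketch of that idea (illustrative only, not the actual patch; the 
OffsetReader interface below is just a stand-in for the Connect offset reader):

    import java.util.Map;
    import java.util.concurrent.CancellationException;

    public class OffsetLoadingSketch {
        interface OffsetReader {
            Map<String, Object> offsets(Map<String, Object> sourcePartition);
        }

        // Returns the stored offset, or null when the reader was cancelled because the
        // task is being stopped, so the caller can shut down gracefully instead of failing.
        static Map<String, Object> loadOffsetOrStop(OffsetReader reader, Map<String, Object> sourcePartition) {
            try {
                return reader.offsets(sourcePartition);
            } catch (CancellationException e) {
                return null;
            }
        }
    }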



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] New Committer: Chris Egerton

2022-07-25 Thread Viktor Somogyi-Vass
Congrats Chris!

On Mon, Jul 25, 2022, 18:33 Matthew Benedict de Detrich
 wrote:

> Congratulations!
>
> --
> Matthew de Detrich
> Aiven Deutschland GmbH
> Immanuelkirchstraße 26, 10405 Berlin
> Amtsgericht Charlottenburg, HRB 209739 B
>
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> m: +491603708037
> w: aiven.io e: matthew.dedetr...@aiven.io
> On 25. Jul 2022, 18:26 +0200, Mickael Maison , wrote:
> > Hi all,
> >
> > The PMC for Apache Kafka has invited Chris Egerton as a committer, and
> > we are excited to announce that he accepted!
> >
> > Chris has been contributing to Kafka since 2017. He has made over 80
> > commits mostly around Kafka Connect. His most notable contributions
> > include KIP-507: Securing Internal Connect REST Endpoints and KIP-618:
> > Exactly-Once Support for Source Connectors.
> >
> > He has been an active participant in discussions and reviews on the
> > mailing lists and on Github.
> >
> > Thanks for all of your contributions Chris. Congratulations!
> >
> > -- Mickael, on behalf of the Apache Kafka PMC
>


Re: Transactions, delivery timeout and changing transactional producer behavior

2022-07-25 Thread Viktor Somogyi-Vass
Hi Luke & Artem,

We have prepared the fix; could you please help us find a committer to review
it so that this issue can be resolved?

Thanks,
Viktor
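
For anyone hitting the same symptom before a fix lands: the mitigation
mentioned further down this thread (fewer retries, longer delivery timeout) is
plain producer configuration. The values below are placeholders only, not
recommendations:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.ProducerConfig;

    public class TransactionalProducerConfigSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "example-txn-id");
            // Cap the retries instead of relying on effectively infinite retries, and give
            // in-flight batches more time before the client-side delivery timeout fires.
            props.put(ProducerConfig.RETRIES_CONFIG, "5");
            props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, "300000");
            System.out.println(props);
        }
    }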

On Fri, Jul 8, 2022 at 12:57 PM Dániel Urbán  wrote:

> Submitted a PR with the fix: https://github.com/apache/kafka/pull/12392
> In the PR I tried keeping the producer in a usable state after the forced
> bump. I understand that it might be the cleanest solution, but the only
> other option I know of is to transition into a fatal state, meaning that
> the producer has to be recreated after a delivery timeout. I think that is
> still fine compared to the out-of-order messages.
>
> Looking forward to your reviews,
> Daniel
>
> Dániel Urbán  wrote (on 2022-07-07, Thu, 12:04):
>
> > Thanks for the feedback, I created
> > https://issues.apache.org/jira/browse/KAFKA-14053 and started working on
> > a PR.
> >
> > Luke, for the workaround, we used the transaction admin tool released in
> > 3.0 to "abort" these hanging batches manually.
> > Naturally, the cluster health should be stabilized. This issue popped up
> > most frequently around times when some partitions went into a few minute
> > window of unavailability. The infinite retries on the producer side
> caused
> > a situation where the last retry was still in-flight, but the delivery
> > timeout was triggered on the client side. We reduced the retries and
> > increased the delivery timeout to avoid such situations.
> > Still, the issue can occur in other scenarios, for example a client
> > queueing up many batches in the producer buffer, and causing those
> batches
> > to spend most of the delivery timeout window in the client memory.
> >
> > Thanks,
> > Daniel
> >
> Luke Chen  wrote (on 2022-07-07, Thu, 5:13):
> >
> >> Hi Daniel,
> >>
> >> Thanks for reporting the issue, and the investigation.
> >> I'm curious, so, what's your workaround for this issue?
> >>
> >> I agree with Artem, it makes sense. Please file a bug in JIRA.
> >> And looking forward to your PR! :)
> >>
> >> Thank you.
> >> Luke
> >>
> >> On Thu, Jul 7, 2022 at 3:07 AM Artem Livshits
> >>  wrote:
> >>
> >> > Hi Daniel,
> >> >
> >> > What you say makes sense.  Could you file a bug and put this info
> there
> >> so
> >> > that it's easier to track?
> >> >
> >> > -Artem
> >> >
> >> > On Wed, Jul 6, 2022 at 8:34 AM Dániel Urbán 
> >> wrote:
> >> >
> >> > > Hello everyone,
> >> > >
> >> > > I've been investigating some transaction related issues in a very
> >> > > problematic cluster. Besides finding some interesting issues, I had
> >> some
> >> > > ideas about how transactional producer behavior could be improved.
> >> > >
> >> > > My suggestion in short is: when the transactional producer
> encounters
> >> an
> >> > > error which doesn't necessarily mean that the in-flight request was
> >> > > processed (for example a client side timeout), the producer should
> not
> >> > send
> >> > > an EndTxnRequest on abort, but instead it should bump the producer
> >> epoch.
> >> > >
> >> > > The long description about the issue I found, and how I came to the
> >> > > suggestion:
> >> > >
> >> > > First, the description of the issue. When I say that the cluster is
> >> "very
> >> > > problematic", I mean all kinds of different issues, be it infra
> (disks
> >> > and
> >> > > network) or throughput (high volume producers without fine tuning).
> >> > > In this cluster, Kafka transactions are widely used by many
> producers.
> >> > And
> >> > > in this cluster, partitions get "stuck" frequently (few times every
> >> > week).
> >> > >
> >> > > The exact meaning of a partition being "stuck" is this:
> >> > >
> >> > > On the client side:
> >> > > 1. A transactional producer sends X batches to a partition in a
> single
> >> > > transaction
> >> > > 2. Out of the X batches, the last few get sent, but are timed out
> >> thanks
> >> > to
> >> > > the delivery timeout config
> >> > > 3. producer.flush() is unblocked due to all batches being "finished"
> >> > > 4. Based on the errors reported in the producer.send() callback,
> >> > > producer.abortTransaction() is called
> >> > > 5. Then producer.close() is also invoked with a 5s timeout (this
> >> > > application does not reuse the producer instances optimally)
> >> > > 6. The transactional.id of the producer is never reused (it was
> >> random
> >> > > generated)
> >> > >
> >> > > On the partition leader side (what appears in the log segment of the
> >> > > partition):
> >> > > 1. The batches sent by the producer are all appended to the log
> >> > > 2. But the ABORT marker of the transaction was appended before the
> >> last 1
> >> > > or 2 batches of the transaction
> >> > >
> >> > > On the transaction coordinator side (what appears in the transaction
> >> > state
> >> > > partition):
> >> > > The transactional.id is present with the Empty state.
> >> > >
> >> > > These happenings result in the following:
> >> > > 1. The partition leader handles the first batch after the ABORT
> >> marker as
> >> > > 

Re: [DISCUSS] Apache Kafka 3.2.1 release

2022-07-22 Thread Viktor Somogyi-Vass
Thanks David :)

On Thu, Jul 21, 2022 at 6:00 PM David Arthur  wrote:

> Viktor, seeing as it's been on trunk for a while and is a very small
> change, it seems fine to include in this release. I just finished building
> the RC, but haven't started a vote thread yet. Good timing on your part :)
> I'll merge this PR shortly and start a new RC build.
>
> -David
>
> On Thu, Jul 21, 2022 at 11:42 AM Viktor Somogyi-Vass
>  wrote:
>
> > Hi David,
> >
> > Found an issue (tight loop in the consumer), fixed it on trunk and
> > backported it onto the 3.2 branch. Is it possible to include this in the
> > 3.2.1 release?
> > https://github.com/apache/kafka/pull/12417
> >
> > Thanks,
> > Viktor
> >
> > On Tue, Jul 19, 2022 at 5:57 PM Randall Hauch  wrote:
> >
> > > Hi, Chris S and Chris E,
> > >
> > > Thanks for quickly working on and reviewing the
> > > https://issues.apache.org/jira/browse/KAFKA-14079 issue mentioned
> > > above. The two PRs you created, one for the `trunk` branch and one for
> > the
> > > `3.2` branch, have both been merged, and the issue has been marked as
> > > resolved.
> > >
> > > Best regards,
> > >
> > > Randall
> > >
> > > On Sun, Jul 17, 2022 at 5:44 PM Christopher Shannon <
> > > christopher.l.shan...@gmail.com> wrote:
> > >
> > > > Hi Chris E.
> > > >
> > > > Thanks for all the feedback earlier, I updated the PR based on your
> > > > comments and also pushed a second PR for trunk for 3.3.0
> > > >
> > > > I agree the impact is high which is why I found the issue. I recently
> > > > turned on this feature and suddenly my connect worker/task kept
> > > > periodically falling over with OOM errors. Finally I took a heap dump
> > and
> > > > saw a ton of submitted record objects in memory and started
> > investigating
> > > > and that's how I figured out it was related to this new feature..
> > > >
> > > > Chris
> > > >
> > > > On Sun, Jul 17, 2022 at 1:48 PM Chris Egerton <
> fearthecel...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > Hi Chris,
> > > > >
> > > > > Good find, and thanks for filing a fix. I agree that we should get
> > this
> > > > > into 3.2.1 if possible. The risk is fairly low (the functional
> parts
> > of
> > > > the
> > > > > fix are just two lines long) and the impact of the bug is high for
> > > users
> > > > > who have configured source connectors with "errors.tolerance" set
> to
> > > > "all".
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Chris
> > > > >
> > > > > On Sat, Jul 16, 2022 at 12:26 PM Christopher Shannon <
> > > > > christopher.l.shan...@gmail.com> wrote:
> > > > >
> > > > > > HI,
> > > > > >
> > > > > > I think I found a memory leak that was introduced in 3.2.0 in a
> > > > Connector
> > > > > > SourceTask. I created a JIRA:
> > > > > > https://issues.apache.org/jira/browse/KAFKA-14079 and small PR
> > with
> > > a
> > > > > fix:
> > > > > > https://github.com/apache/kafka/pull/12412
> > > > > >
> > > > > > I think this should be included in 3.2.1. It should also go into
> > > 3.3.0
> > > > > but
> > > > > > there was a lot of refactoring done there with the source task
> code
> > > due
> > > > > to
> > > > > > KIP-618 so another PR needs to be done for that if this is
> merged.
> > > > > >
> > > > > > Chris
> > > > > >
> > > > > > On Fri, Jul 15, 2022 at 10:06 AM David Arthur 
> > > > wrote:
> > > > > >
> > > > > > > Here is the release plan for 3.2.1
> > > > > > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+3.2.1
> > > > > > >
> > > > > > > I am working on getting clarity on the one open blocker. Once
> > that
> > > is
> > > > > > > resolved (or rescheduled for a future release), I will build
> the
> > > > first
> > > > > > > release candidate.
> > > > > > >
> > > > > > > -David
> > > > > > >
> > > > > > > On Thu, Jul 14, 2022 at 3:10 AM Luke Chen 
> > > wrote:
> > > > > > >
> > > > > > > > +1, Thanks David!
> > > > > > > >
> > > > > > > > On Thu, Jul 14, 2022 at 1:16 PM David Jacot <
> > > david.ja...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > +1. Thanks David.
> > > > > > > > >
> > > > > > > > > Le mer. 13 juil. 2022 à 23:43, José Armando García Sancio
> > > > > > > > >  a écrit :
> > > > > > > > >
> > > > > > > > > > +1. Thanks for volunteering David.
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > -José
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > David Arthur
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
> --
> David Arthur
>


Re: [DISCUSS] Apache Kafka 3.2.1 release

2022-07-21 Thread Viktor Somogyi-Vass
Hi David,

Found an issue (tight loop in the consumer), fixed it on trunk and
backported it onto the 3.2 branch. Is it possible to include this in the
3.2.1 release?
https://github.com/apache/kafka/pull/12417

Thanks,
Viktor

On Tue, Jul 19, 2022 at 5:57 PM Randall Hauch  wrote:

> Hi, Chris S and Chris E,
>
> Thanks for quickly working on and reviewing the
> https://issues.apache.org/jira/browse/KAFKA-14079 issue mentioned
> above. The two PRs you created, one for the `trunk` branch and one for the
> `3.2` branch, have both been merged, and the issue has been marked as
> resolved.
>
> Best regards,
>
> Randall
>
> On Sun, Jul 17, 2022 at 5:44 PM Christopher Shannon <
> christopher.l.shan...@gmail.com> wrote:
>
> > Hi Chris E.
> >
> > Thanks for all the feedback earlier, I updated the PR based on your
> > comments and also pushed a second PR for trunk for 3.3.0
> >
> > I agree the impact is high which is why I found the issue. I recently
> > turned on this feature and suddenly my connect worker/task kept
> > periodically falling over with OOM errors. Finally I took a heap dump and
> > saw a ton of submitted record objects in memory and started investigating
> > and that's how I figured out it was related to this new feature..
> >
> > Chris
> >
> > On Sun, Jul 17, 2022 at 1:48 PM Chris Egerton 
> > wrote:
> >
> > > Hi Chris,
> > >
> > > Good find, and thanks for filing a fix. I agree that we should get this
> > > into 3.2.1 if possible. The risk is fairly low (the functional parts of
> > the
> > > fix are just two lines long) and the impact of the bug is high for
> users
> > > who have configured source connectors with "errors.tolerance" set to
> > "all".
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Sat, Jul 16, 2022 at 12:26 PM Christopher Shannon <
> > > christopher.l.shan...@gmail.com> wrote:
> > >
> > > > HI,
> > > >
> > > > I think I found a memory leak that was introduced in 3.2.0 in a
> > Connector
> > > > SourceTask. I created a JIRA:
> > > > https://issues.apache.org/jira/browse/KAFKA-14079 and small PR with
> a
> > > fix:
> > > > https://github.com/apache/kafka/pull/12412
> > > >
> > > > I think this should be included in 3.2.1. It should also go into
> 3.3.0
> > > but
> > > > there was a lot of refactoring done there with the source task code
> due
> > > to
> > > > KIP-618 so another PR needs to be done for that if this is merged.
> > > >
> > > > Chris
> > > >
> > > > On Fri, Jul 15, 2022 at 10:06 AM David Arthur 
> > wrote:
> > > >
> > > > > Here is the release plan for 3.2.1
> > > > >
> https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+3.2.1
> > > > >
> > > > > I am working on getting clarity on the one open blocker. Once that
> is
> > > > > resolved (or rescheduled for a future release), I will build the
> > first
> > > > > release candidate.
> > > > >
> > > > > -David
> > > > >
> > > > > On Thu, Jul 14, 2022 at 3:10 AM Luke Chen 
> wrote:
> > > > >
> > > > > > +1, Thanks David!
> > > > > >
> > > > > > On Thu, Jul 14, 2022 at 1:16 PM David Jacot <
> david.ja...@gmail.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > +1. Thanks David.
> > > > > > >
> > > > > > > Le mer. 13 juil. 2022 à 23:43, José Armando García Sancio
> > > > > > >  a écrit :
> > > > > > >
> > > > > > > > +1. Thanks for volunteering David.
> > > > > > > >
> > > > > > > > --
> > > > > > > > -José
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > David Arthur
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] Apache Kafka 3.3.0 Release

2022-06-22 Thread Viktor Somogyi-Vass
Hi José,

KIP-373 just went in yesterday (and we will likely add some docs today, too);
would you please add it to the list?

Thanks,
Viktor

On Fri, Jun 17, 2022 at 11:45 AM Matthew Benedict de Detrich
 wrote:

> Hi José,
>
> KIP-797 has been accepted and the associated PR is now ready, is it
> possible to add to the release?
>
> Regards
>
> On Thu, Jun 16, 2022 at 11:56 PM José Armando García Sancio
>  wrote:
>
> > Hi Divij,
> >
> > On Thu, Jun 16, 2022 at 1:37 AM Divij Vaidya 
> > wrote:
> > > *Question#1*: Do we only track the KIPs over here that are blockers for
> > > release or do we track the non-KIP JIRA tickets as well?
> >
> > This page documents the KIPs and Jira issues I am tracking for the
> > 3.3.0 release.
> > https://cwiki.apache.org/confluence/x/-xahD
> >
> > > If we don't track the JIRA tickets, please ignore the following, but if
> > we
> > > do, I would like to propose that we fix/merge the following before
> > release:
> > > 1. https://github.com/apache/kafka/pull/12228 -> Fixes multiple memory
> > > leaks.
> > > 2. https://github.com/apache/kafka/pull/12184 -> Fixes an edge case
> > where a
> > > specific configuration for quota values could lead to errors.
> >
> > This is the Jira query I am using to track issues that need to be
> > fixed by 3.3.0:
> >
> >
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20KAFKA%20AND%20fixVersion%20%3D%203.3.0%20AND%20status%20not%20in%20(resolved%2C%20closed)%20ORDER%20BY%20priority%20DESC%2C%20status%20DESC%2C%20updated%20DESC%20%20%20%20%20%20
> >
> > If you think those PRs need to be reviewed and merged before feature
> > freeze or code freeze please feel free to add 3.3.0 to the fixVersion
> > of the Jira. I don't have time to review those PRs this week but I'll
> > try to take a look next week.
> >
> > > *Question#2*: As a non-committer, is there anything that I could help
> > with
> > > for the release process?
> >
> > Thanks for volunteering to help. I would suggest looking at the issues
> > in the search above and work on any issue that interests you and it is
> > not already assigned.
> > --
> > -José
> >
>
>
> --
>
> Matthew de Detrich
>
> *Aiven Deutschland GmbH*
>
> Immanuelkirchstraße 26, 10405 Berlin
>
> Amtsgericht Charlottenburg, HRB 209739 B
>
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
>
> *m:* +491603708037
>
> *w:* aiven.io *e:* matthew.dedetr...@aiven.io
>


[jira] [Reopened] (KAFKA-6945) Add support to allow users to acquire delegation tokens for other users

2022-06-22 Thread Viktor Somogyi-Vass (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi-Vass reopened KAFKA-6945:


> Add support to allow users to acquire delegation tokens for other users
> ---
>
> Key: KAFKA-6945
> URL: https://issues.apache.org/jira/browse/KAFKA-6945
> Project: Kafka
>  Issue Type: Sub-task
>Affects Versions: 3.3.0
>Reporter: Manikumar
>    Assignee: Viktor Somogyi-Vass
>Priority: Major
>  Labels: needs-kip
> Fix For: 3.3.0
>
>
> Currently, we only allow a user to create a delegation token for themselves. 
> We should allow users to acquire delegation tokens for other users.
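
For illustration, once this is in place a privileged client could request a token on 
behalf of another user roughly as sketched below. This follows the Admin API shape 
proposed in KIP-373 (an owner option on CreateDelegationTokenOptions); treat the exact 
option name and the connection properties as illustrative rather than the final API.

{code}
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.CreateDelegationTokenOptions;
import org.apache.kafka.common.security.auth.KafkaPrincipal;
import org.apache.kafka.common.security.token.delegation.DelegationToken;

public class TokenForOtherUserSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9093");
        // ... SASL/SSL client configs for the (super)user creating the token ...
        try (Admin admin = Admin.create(props)) {
            // Request a delegation token owned by "alice" rather than by the caller.
            CreateDelegationTokenOptions options = new CreateDelegationTokenOptions()
                    .owner(new KafkaPrincipal(KafkaPrincipal.USER_TYPE, "alice"));
            DelegationToken token = admin.createDelegationToken(options)
                    .delegationToken()
                    .get();
            System.out.println("Token id for alice: " + token.tokenInfo().tokenId());
        }
    }
}
{code}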



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] Should we automatically close stale PRs?

2022-06-08 Thread Viktor Somogyi-Vass
>One thing that might make sense to do maybe is to add frequent contributors
>with the "triage" role, so they could label PRs they reviewed and they can
>be taken by committers for a further review and potential merge. What do
>you think?

In addition to labeling PRs as stale, I think the opposite suggested above
(triaging and labeling new PRs) is a very good idea too. We often review each
other's changes within the team before/after publishing them upstream. Triage
like this would filter the incoming PRs well and improve overall quality.
Would it be possible to do this? I'd be happy to be a triager.

On Sun, Feb 27, 2022 at 4:23 AM Guozhang Wang  wrote:

> Hey David,
>
> Just reviving on this thread, do you have some final decision on this now
> with all the feedbacks received so far?
>
> On Sun, Feb 13, 2022 at 8:41 PM Ismael Juma  wrote:
>
> > Hi David,
> >
> > I think it's a good idea to use the bot for auto closing stale PRs. The
> > ideal flow would be:
> >
> > 1. Write a comment and add stale label
> > 2. If user responds saying that the PR is still valid, the stale label is
> > removed
> > 3. Otherwise, the PR is closed
> >
> > Thanks,
> > Ismael
> >
> > On Sat, Feb 5, 2022, 2:22 AM David Jacot  wrote:
> >
> > > Hi team,
> > >
> > > I find our ever growing backlog of PRs a little frustrating, don't
> > > you? I just made a pass over the whole list and a huge chunk
> > > of the PRs are abandoned, outdated or irrelevant to the
> > > current code base. For instance, we still have PRs opened
> > > back in 2015.
> > >
> > > There is now a Github Action [1] for automatically marking
> > > PRs as stale and automatically closing them as well. How
> > > would the community feel about enabling this? I think that
> > > we could mark a PR as stale after one year and close it
> > > a month later if there is no new activity. Reopening a
> > > closed PR is really easy so there is no real harm in closing
> > > it.
> > >
> > > [1] https://github.com/actions/stale
> > >
> >
>
>
> --
> -- Guozhang
>


[jira] [Created] (KAFKA-13949) Connect /connectors endpoint should support querying the active topics and the task configs

2022-05-31 Thread Viktor Somogyi-Vass (Jira)
Viktor Somogyi-Vass created KAFKA-13949:
---

 Summary: Connect /connectors endpoint should support querying the 
active topics and the task configs
 Key: KAFKA-13949
 URL: https://issues.apache.org/jira/browse/KAFKA-13949
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Affects Versions: 3.2.0
Reporter: Viktor Somogyi-Vass
Assignee: Viktor Somogyi-Vass


The /connectors endpoint supports the "expand" query parameter, which acts as a 
set of queried categories, currently supporting info (config) and status 
(monitoring status).

The endpoint should also support adding the active topics of a connector, and 
adding the separate task configs, too.
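
For reference, the existing expand categories can be queried today as in the sketch 
below (an illustrative Java 11 HttpClient call; host and port are placeholders). The 
proposal would add further categories, e.g. active topics and task configs, in the 
same style.

{code}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ListConnectorsExpanded {
    public static void main(String[] args) throws Exception {
        // Ask for both the "info" (config) and "status" categories in one call.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://connect-host:8083/connectors?expand=info&expand=status"))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
{code}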



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (KAFKA-13917) Avoid calling lookupCoordinator() in tight loop

2022-05-19 Thread Viktor Somogyi-Vass (Jira)
Viktor Somogyi-Vass created KAFKA-13917:
---

 Summary: Avoid calling lookupCoordinator() in tight loop
 Key: KAFKA-13917
 URL: https://issues.apache.org/jira/browse/KAFKA-13917
 Project: Kafka
  Issue Type: Improvement
  Components: consumer
Affects Versions: 3.1.1, 3.1.0, 3.1.2
Reporter: Viktor Somogyi-Vass
Assignee: Viktor Somogyi-Vass


Currently the heartbeat thread's lookupCoordinator() is called in a tight loop 
if brokers crash and the consumer is left running. Besides flooding the logs at 
debug level, it also increases CPU usage.

The fix is easy: just add a backoff call after the coordinator lookup.
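
A minimal sketch of the intended pattern (illustrative only, not the actual consumer 
client code; the real fix lives in the heartbeat/coordinator logic):

{code}
import java.util.concurrent.TimeUnit;

public class CoordinatorLookupLoop {
    private final long retryBackoffMs;          // maps to the client's retry.backoff.ms
    private volatile boolean coordinatorKnown = false;

    public CoordinatorLookupLoop(long retryBackoffMs) {
        this.retryBackoffMs = retryBackoffMs;
    }

    // Placeholder for the real lookup; in the client this sends a FindCoordinator
    // request and sets coordinatorKnown = true once a coordinator is discovered.
    private void lookupCoordinator() {
    }

    public void run() throws InterruptedException {
        while (!coordinatorKnown) {
            lookupCoordinator();
            if (!coordinatorKnown) {
                // The essence of the fix: back off instead of spinning and flooding the logs.
                TimeUnit.MILLISECONDS.sleep(retryBackoffMs);
            }
        }
    }
}
{code}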



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (KAFKA-6084) ReassignPartitionsCommand should propagate JSON parsing failures

2022-04-28 Thread Viktor Somogyi-Vass (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi-Vass resolved KAFKA-6084.

Fix Version/s: 2.8.0
   Resolution: Fixed

> ReassignPartitionsCommand should propagate JSON parsing failures
> 
>
> Key: KAFKA-6084
> URL: https://issues.apache.org/jira/browse/KAFKA-6084
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin
>Affects Versions: 0.11.0.0
>    Reporter: Viktor Somogyi-Vass
>    Assignee: Viktor Somogyi-Vass
>Priority: Minor
>  Labels: easyfix, newbie
> Fix For: 2.8.0
>
> Attachments: Screen Shot 2017-10-18 at 23.31.22.png
>
>
> Basically looking at Json.scala it will always swallow any parsing errors:
> {code}
>   def parseFull(input: String): Option[JsonValue] =
> try Option(mapper.readTree(input)).map(JsonValue(_))
> catch { case _: JsonProcessingException => None }
> {code}
> However, while sometimes it is easy to figure out the problem by simply looking 
> at the JSON, in some cases it is not trivial: some invisible characters (like a 
> byte order mark) won't be displayed by most text editors, and people can spend a 
> lot of time figuring out what the problem is.
> As Jackson provides a really detailed exception about what failed and how, it 
> is easy to propagate the failure to the user.
> As an example I attached a BOM prefixed JSON which fails with the following 
> error which is very counterintuitive:
> {noformat}
> [root@localhost ~]# kafka-reassign-partitions --zookeeper localhost:2181 
> --reassignment-json-file /root/increase-replication-factor.json --execute
> Partitions reassignment failed due to Partition reassignment data file 
> /root/increase-replication-factor.json is empty
> kafka.common.AdminCommandFailedException: Partition reassignment data file 
> /root/increase-replication-factor.json is empty
> at 
> kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:120)
> at 
> kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:52)
> at kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
> ...
> {noformat}
> In case of the above error it would be much better to see what fails exactly:
> {noformat}
> kafka.common.AdminCommandFailedException: Admin command failed
>   at 
> kafka.admin.ReassignPartitionsCommand$.parsePartitionReassignmentData(ReassignPartitionsCommand.scala:267)
>   at 
> kafka.admin.ReassignPartitionsCommand$.parseAndValidate(ReassignPartitionsCommand.scala:275)
>   at 
> kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:197)
>   at 
> kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:193)
>   at 
> kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:64)
>   at 
> kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
> Caused by: com.fasterxml.jackson.core.JsonParseException: Unexpected 
> character ('' (code 65279 / 0xfeff)): expected a valid value (number, 
> String, array, object, 'true', 'false' or 'null')
>  at [Source: (String)"{"version":1,
>   "partitions":[
>{"topic": "test1", "partition": 0, "replicas": [1,2]},
>{"topic": "test2", "partition": 1, "replicas": [2,3]}
> ]}"; line: 1, column: 2]
>   at 
> com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1798)
>   at 
> com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:663)
>   at 
> com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:561)
>   at 
> com.fasterxml.jackson.core.json.ReaderBasedJsonParser._handleOddValue(ReaderBasedJsonParser.java:1892)
>   at 
> com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:747)
>   at 
> com.fasterxml.jackson.databind.ObjectMapper._readTreeAndClose(ObjectMapper.java:4030)
>   at 
> com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:2539)
>   at kafka.utils.Json$.kafka$utils$Json$$doParseFull(Json.scala:46)
>   at kafka.utils.Json$$anonfun$tryParseFull$1.apply(Json.scala:44)
>   at kafka.utils.Json$$anonfun$tryParseFull$1.apply(Json.scala:44)
>   at scala.util.Try$.apply(Try.scala:192)
>   at kafka.utils.Json$.tryParseFull(Json.scala:44)
>   at 
> kafka.admin.ReassignPartitionsCommand$.parsePartitionReassignmentData(ReassignPartitionsCommand.scala:241)
>   ... 5 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (KAFKA-13442) REST API endpoint for fetching a connector's config definition

2022-04-28 Thread Viktor Somogyi-Vass (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi-Vass resolved KAFKA-13442.
-
Resolution: Duplicate

> REST API endpoint for fetching a connector's config definition
> --
>
> Key: KAFKA-13442
> URL: https://issues.apache.org/jira/browse/KAFKA-13442
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 3.2.0
>    Reporter: Viktor Somogyi-Vass
>    Assignee: Viktor Somogyi-Vass
>Priority: Major
>
> To enhance UI based applications' capability to help users create new 
> connectors from default configurations, it would be very useful to have an API 
> that can fetch a connector type's configuration definition, which can then be 
> filled out by users, sent back for validation, and finally used to create a new 
> connector.
> The API should be placed under {{connector-plugins}} and since 
> {{connector-plugins/\{connectorType\}/config/validate}} already exists, 
> {{connector-plugins/\{connectorType\}/config}} might be a good option for the 
> new API.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (KAFKA-13452) MM2 creates invalid checkpoint when offset mapping is not available

2022-04-28 Thread Viktor Somogyi-Vass (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi-Vass resolved KAFKA-13452.
-
Resolution: Duplicate

> MM2 creates invalid checkpoint when offset mapping is not available
> ---
>
> Key: KAFKA-13452
> URL: https://issues.apache.org/jira/browse/KAFKA-13452
> Project: Kafka
>  Issue Type: Improvement
>  Components: mirrormaker
>Reporter: Daniel Urban
>    Assignee: Viktor Somogyi-Vass
>Priority: Major
>
> MM2 checkpointing reads the offset-syncs topic to create offset mappings for 
> committed consumer group offsets. In some corner cases, it is possible that a 
> mapping is not available in offset-syncs - in that case, MM2 simply copies 
> the source offset, which might not be a valid offset in the replica topic at 
> all.
> One possible situation is if there is an empty topic in the source cluster 
> with a non-zero endoffset (e.g. retention already removed the records), and a 
> consumer group which has a committed offset set to the end offset. If 
> replication is configured to start replicating this topic, it will not have 
> an offset mapping available in offset-syncs (as the topic is empty), causing 
> MM2 to copy the source offset.
> This can cause issues when auto offset sync is enabled, as the consumer group 
> offset can be potentially set to a high number. MM2 never rewinds these 
> offsets, so even when there is a correct offset mapping available, the offset 
> will not be updated correctly.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [VOTE] KIP-769: Connect APIs to list all connector plugins and retrieve their configuration definitions

2022-01-26 Thread Viktor Somogyi-Vass
Hi Mickael,

+1 (non-binding) from me.

Viktor

On Thu, Jan 13, 2022 at 11:32 AM Mickael Maison 
wrote:

> Bumping this vote.
>
> We have 2 non-binding votes so far. Please take a look and let me know
> if you have any feedback.
>
> Thanks,
> Mickael
>
> On Mon, Dec 13, 2021 at 10:50 PM Ryanne Dolan 
> wrote:
> >
> > +1 (non-binding)
> >
> > Ryanne
> >
> > On Mon, Dec 13, 2021, 4:18 AM Mickael Maison 
> > wrote:
> >
> > > Hi all,
> > >
> > > I'd like to start a vote on KIP-769 which proposes adding new
> > > endpoints to the Connect REST API to list all connectors plugins and
> > > retrieve their configurations.
> > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-769%3A+Connect+APIs+to+list+all+connector+plugins+and+retrieve+their+configuration+definitions
> > >
> > > Please take a look and let me know if you have any feedback.
> > >
> > > Thanks,
> > > Mickael
> > >
>


Re: [DISCUSS] KIP-719: Add Log4J2 Appender

2022-01-19 Thread Viktor Somogyi-Vass
Hi folks,

>a) I think this makes sense. But I think this shouldn't block this
KIP, it can be done after moving to log4j2.

Absolutely.

>However, this limitation may change in the future - slf4j will support
dynamic logging levels in 2.0:

This would be awesome, let's keep an eye on this.

>Kafka avoided all the log4shell vulnerabilities exactly because we DIDN'T
move to log4j 2.x.

My first reaction was how lucky we are (and all Kafka users) that we didn't
have to do anything on that weekend! Blessed are the lazy. :)

Viktor

On Tue, Jan 18, 2022 at 5:16 PM Colin McCabe  wrote:

> On Wed, Jan 12, 2022, at 02:37, Viktor Somogyi-Vass wrote:
> > Hi Dongjin,
> >
> > We're also looking into this matter as our stack was also affected by all
> > the log4j hell and users increasingly pushing us to upgrade to log4j2 or
> > logback because of the existing vulnerabilities in log4j1.
> > Regarding the points raised by Haruki:
> >
>
> I've heard the same thing from other people -- that there is now more
> interest in moving from log4j1.x to log4j2.x, after all the recent
> vulnerabilities in the latter. I found this a bit strange. Kafka avoided
> all the log4shell vulnerabilities exactly because we DIDN'T move to log4j
> 2.x. (Yes, I am aware that there is a longstanding vulnerability in that
> one log sink in log4j 1.x, but you can just not use that one.)
>
> I haven't thought about this very hard. Maybe it's still a good idea to
> move to log4j2. But it's odd that nobody is commenting about how in this
> case, not updating actually prevented a major security incident for Kafka.
>
> best,
> Colin
>
> >
> > a) In my opinion the best would be to make the dynamic logger support
> > (Log4jController and LoggingResource) pluggable for log4j2 and logback
> (so
> > an interface could be used to define the dynamic logging control methods
> > and a config to specify the implementation). That way we're not bound to
> > either logback or log4j and seems like a low-effort thing to do.
> > Additionally this could be used in Connect too in LoggingResource.
> >
> > b) I think testing dependencies aren't that important from the user
> > perspective, it's fine to either use log4j2 or logback, whichever is
> > easier. Kafka is either used from the distribution (tgz) or pulled in
> > through maven, but test dependencies shouldn't be exposed to the world.
> >
> > c) I would support deprecating the appender in favor of the log4j2 Kafka
> > appender. VerifiableLog4jAppender is intended as a testing tool anyway,
> so
> > I think it's less important to change this to logback.
> >
> > Future vulnerabilities will always be found in either logback or log4j2
> or
> > any other logging framework, so I think the safest approach is to allow
> > users to choose their implementation, while in tests I think we're free
> to
> > use whatever we want as that shouldn't be constrained by vulnerabilities.
> >
> > Viktor
> >
> > On Thu, Dec 23, 2021 at 9:37 AM Haruki Okada 
> wrote:
> >
> >> Thanks for the clarification.
> >>
> >> About 2, I wasn't aware of those concerns.
> >> Let me check them first.
> >>
> >>
> >> Thanks,
> >>
> >> 2021年12月23日(木) 13:37 Dongjin Lee :
> >>
> >> > Hi Haruki,
> >> >
> >> >
> >> > Thanks for organizing the issue.
> >> >
> >> >
> >> > If the community prefers logback, I will gladly change the dependency
> and
> >> > update the PR. However, it has the following issues:
> >> >
> >> >
> >> > 1. The log4j2 vulnerabilities seem mostly fixed, and KIP-653 + KIP-719
> >> are
> >> > not released yet. So, using log4j2 (whose recent update pace is so
> high)
> >> > will not affect the users.
> >> >
> >> >
> >> > 2. To switch to logback, the following features should be reworked:
> >> >
> >> >
> >> >   a. Dynamic logger level configuration (core, connect)
> >> >
> >> >   b. Logging tests (streams)
> >> >
> >> >   c. Kafka Appender (tools)
> >> >
> >> >
> >> > a and b are the most challenging ones since there is little
> documentation
> >> > on how to do this, so it requires analyzing the implementation itself.
> >> > (what I actually did with log4j2) About c, logback does not provide a
> >> Kafka
> >> > Appender so we have to provide an equivalent.
> >> >
> >> >
> >> > It is wh

Re: [VOTE] KIP-719: Deprecate Log4J Appender

2022-01-19 Thread Viktor Somogyi-Vass
Hi Dongjin,

+1 (non-binding) from me as well. Thanks for picking this up!

Viktor

On Mon, Jan 17, 2022 at 1:07 PM Mickael Maison 
wrote:

> Hi Dongjin,
>
> +1 (binding) Thanks for all your efforts on this KIP
>
> Thanks,
> Mickael
>
> On Tue, Dec 21, 2021 at 10:00 AM Dongjin Lee  wrote:
> >
> > Hi All,
> >
> > I hope to reboot the voting for KIP-719: Deprecate Log4J Appender.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-719%3A+Deprecate+Log4J+Appender
> >
> > Here is some context: the purpose of this proposal is to remove
> > log4j-appender from the dependency, and initially, the proposal focused
> on
> > deprecating log4j-appender and introducing a log4j2 equivalent. However,
> as
> > the discussion proceeded, it became clear that removing log4j-appender
> from
> > the dependency only is available without introducing a new subproject.
> >
> > For the reasons above, the point of the proposal was slightly changed:
> >
> > - Before: Deprecate the log4j-appender and add log4j2-appender
> > - After: Deprecate the log4j-appender and replace the current dependency
> > with log4j 2.x's Kafka appender.
> >
> > This is why I reset the voting thread.
> >
> > Best,
> > Dongjin
> >
> > --
> > *Dongjin Lee*
> >
> > *A hitchhiker in the mathematical world.*
> >
> >
> >
> > *github:  github.com/dongjinleekr
> > keybase:
> https://keybase.io/dongjinleekr
> > linkedin:
> kr.linkedin.com/in/dongjinleekr
> > speakerdeck:
> speakerdeck.com/dongjin
> > *
>


Re: [DISCUSS] KIP-719: Add Log4J2 Appender

2022-01-12 Thread Viktor Somogyi-Vass
Hi Dongjin,

We're also looking into this matter as our stack was also affected by all
the log4j hell, and users are increasingly pushing us to upgrade to log4j2 or
logback because of the existing vulnerabilities in log4j1.
Regarding the points raised by Haruki:

a) In my opinion the best approach would be to make the dynamic logger support
(Log4jController and LoggingResource) pluggable for log4j2 and logback (so
an interface could be used to define the dynamic logging control methods
and a config to specify the implementation). That way we're not bound to
either logback or log4j, and it seems like a low-effort thing to do (a rough
sketch of such an interface is at the end of this message). Additionally this
could be used in Connect too, in LoggingResource.

b) I think testing dependencies aren't that important from the user
perspective, it's fine to either use log4j2 or logback, whichever is
easier. Kafka is either used from the distribution (tgz) or pulled in
through maven, but test dependencies shouldn't be exposed to the world.

c) I would support deprecating the appender in favor of the log4j2 Kafka
appender. VerifiableLog4jAppender is intended as a testing tool anyway, so
I think it's less important to change this to logback.

Future vulnerabilities will always be found in either logback or log4j2 or
any other logging framework, so I think the safest approach is to allow
users to choose their implementation, while in tests I think we're free to
use whatever we want as that shouldn't be constrained by vulnerabilities.
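
For point a), a rough sketch of what such a pluggable interface could look like (the
name and method set are hypothetical and purely illustrative, not from any KIP or PR;
a log4j2- or logback-backed implementation would be chosen via a config, and
Log4jController / LoggingResource would simply delegate to it):

import java.util.List;
import java.util.Map;

public interface LoggingBackend {
    Map<String, String> loggers();                 // logger name -> current level
    String loggerLevel(String loggerName);         // level of a single logger
    boolean setLoggerLevel(String loggerName, String level);
    List<String> availableLevels();                // e.g. TRACE ... ERROR
}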

Viktor

On Thu, Dec 23, 2021 at 9:37 AM Haruki Okada  wrote:

> Thanks for the clarification.
>
> About 2, I wasn't aware of those concerns.
> Let me check them first.
>
>
> Thanks,
>
> 2021年12月23日(木) 13:37 Dongjin Lee :
>
> > Hi Haruki,
> >
> >
> > Thanks for organizing the issue.
> >
> >
> > If the community prefers logback, I will gladly change the dependency and
> > update the PR. However, it has the following issues:
> >
> >
> > 1. The log4j2 vulnerabilities seem mostly fixed, and KIP-653 + KIP-719
> are
> > not released yet. So, using log4j2 (whose recent update pace is so high)
> > will not affect the users.
> >
> >
> > 2. To switch to logback, the following features should be reworked:
> >
> >
> >   a. Dynamic logger level configuration (core, connect)
> >
> >   b. Logging tests (streams)
> >
> >   c. Kafka Appender (tools)
> >
> >
> > a and b are the most challenging ones since there is little documentation
> > on how to do this, so it requires analyzing the implementation itself.
> > (what I actually did with log4j2) About c, logback does not provide a
> Kafka
> > Appender so we have to provide an equivalent.
> >
> >
> > It is why I prefer to use log4j2. How do you think?
> >
> >
> > Thanks,
> >
> > Dongjin
> >
> >
> > On Thu, Dec 23, 2021 at 9:01 AM Haruki Okada 
> wrote:
> >
> > > Hi, Dongjin,
> > >
> > > Sorry for interrupting the discussion.
> > > And thank you for your hard work about KIP-653, KIP-719.
> > >
> > > I understand that KIP-653 is already accepted so log4j2 is the choice
> of
> > > the Kafka community though, I'm now feeling that logback is a better
> > choice
> > > here.
> > >
> > > Reasons:
> > >
> > > - even after "log4shell", several vulnerabilities found on log4j2 so
> new
> > > versions are released and users have to update in high-pace
> > > * actually, a CVE was also reported for logback (CVE-2021-42550)
> but
> > it
> > > requires edit-permission of the config file for an attacker so it's
> much
> > > less threatening
> > > - log4j1.x and logback are made by same developer (ceki), so
> > substantially
> > > the successor of log4j1 is logback rather than log4j2
> > > - in Hadoop project, seems similar suggestion was made from a PMC
> > > * https://issues.apache.org/jira/browse/HADOOP-12956
> > >
> > >
> > > What do you think about adopting logback instead?
> > >
> > >
> > > Thanks,
> > >
> > > 2021年12月21日(火) 18:02 Dongjin Lee :
> > >
> > > > Hi Mickael,
> > > >
> > > > > In the meantime, you may want to bump the VOTE thread too.
> > > >
> > > > Sure, I just reset the voting thread with a brief context.
> > > >
> > > > Thanks,
> > > > Dongjin
> > > >
> > > > On Tue, Dec 21, 2021 at 2:13 AM Mickael Maison <
> > mickael.mai...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Thanks Dongjin!
> > > > >
> > > > > I'll take a look soon.
> > > > > In the meantime, you may want to bump the VOTE thread too.
> > > > >
> > > > > Best,
> > > > > Mickael
> > > > >
> > > > >
> > > > > On Sat, Dec 18, 2021 at 10:00 AM Dongjin Lee 
> > > wrote:
> > > > > >
> > > > > > Hi Mickael,
> > > > > >
> > > > > > Finally, I did it! As you can see at the PR
> > > > > > , KIP-719 now uses
> > > > log4j2's
> > > > > > Kafka appender, and log4j-appender is not used by the other
> modules
> > > > > > anymore. You can see how it will work with KIP-653 at this
> preview
> > > > > > <
> http://home.apache.org/~dongjin/post/apache-kafka-log4j2-support/
> > >,
> > > > > based
> > > > > > on Apache Kafka 3.0.0. The 

Re: [DISCUSS] KIP-714: Client metrics and observability

2021-11-22 Thread Viktor Somogyi-Vass
Hi Magnus,

I think this is a very useful addition. We also have a similar (but much
more simplistic) implementation of this. Maybe I missed it in the KIP, but
what about adding metrics about the subscription cache itself? That, I think,
would improve its usability and debuggability, as we'd be able to see its
performance, hit/miss rates, eviction counts and so on.

Best,
Viktor

On Thu, Nov 18, 2021 at 5:12 PM Magnus Edenhill  wrote:

> Hi Mickael,
>
> see inline.
>
> Den ons 10 nov. 2021 kl 15:21 skrev Mickael Maison <
> mickael.mai...@gmail.com
> >:
>
> > Hi Magnus,
> >
> > I see you've addressed some of the points I raised above but some (4,
> > 5) have not been addressed yet.
> >
>
> Re 4) How will the user/app know metrics are being sent.
>
> One possibility is to add a JMX metric (thus for user consumption) for the
> number of metric pushes the
> client has performed, or perhaps the number of metrics subscriptions
> currently being collected.
> Would that be sufficient?
>
> Re 5) Metric sizes and rates
>
> A worst case scenario for a producer that is producing to 50 unique topics
> and emitting all standard metrics yields
> a serialized size of around 100KB prior to compression, which compresses
> down to about 20-30% of that depending
> on compression type and topic name uniqueness.
> The numbers for a consumer would be similar.
>
> In practice the number of unique topics would be far less, and the
> subscription set would typically be for a subset of metrics.
> So we're probably closer to 1kb, or less, compressed size per client per
> push interval.
>
> As both the subscription set and push intervals are controlled by the
> cluster operator it shouldn't be too hard
> to strike a good balance between metrics overhead and granularity.
>
>
>
> >
> > I'm really uneasy with this being enabled by default on the client
> > side. When collecting data, I think the best practice is to ensure
> > users are explicitly enabling it.
> >
>
> Requiring metrics to be explicitly enabled on clients severely cripples its
> usability and value.
>
> One of the problems that this KIP aims to solve is for useful metrics to be
> available on demand
> regardless of the technical expertise of the user. As Ryanne points, out a
> savvy user/organization
> will typically have metrics collection and monitoring in place already, and
> the benefits of this KIP
> are then more of a common set and format metrics across client
> implementations and languages.
> But that is not the typical Kafka user in my experience, they're not Kafka
> experts and they don't have the
> knowledge of how to best instrument their clients.
> Having metrics enabled by default for this user base allows the Kafka
> operators to proactively and reactively
> monitor and troubleshoot client issues, without the need for the less savvy
> user to do anything.
> It is often too late to tell a user to enable metrics when the problem has
> already occurred.
>
> Now, to be clear, even though metrics are enabled by default on clients it
> is not enabled by default
> on the brokers; the Kafka operator needs to build and set up a metrics
> plugin and add metrics subscriptions
> before anything is sent from the client.
> It is opt-out on the clients and opt-in on the broker.
>
>
>
>
> > You mentioned brokers already have
> > some(most?) of the information contained in metrics, if so then why
> > are we collecting it again? Surely there must be some new information
> > in the client metrics.
> >
>
> From the user's perspective the Kafka infrastructure extends from
> producer.send() to
> messages being returned from consumer.poll(), a giant black box where
> there's a lot going on between those
> two points. The brokers currently only see what happens once those requests
> and messages hits the broker,
> but as Kafka clients are complex pieces of machinery there's a myriad of
> queues, timers, and state
> that's critical to the operation and infrastructure that's not currently
> visible to the operator.
> Relying on the user to accurately and timely provide this missing
> information is not generally feasible.
>
>
> Most of the standard metrics listed in the KIP are data points that the
> broker does not have.
> Only a small number of metrics are duplicates (like the request counts and
> sizes), but they are included
> to ease correlation when inspecting these client metrics.
>
>
>
> > Moreover this is a brand new feature so it's even harder to justify
> > enabling it and forcing onto all our users. If disabled by default,
> > it's relatively easy to enable in a new release if we decide to, but
> > once enabled by default it's much harder to disable. Also this feature
> > will apply to all future metrics we will add.
> >
>
> I think maturity of a feature implementation should be the deciding factor,
> rather than
> the design of it (which this KIP is). I.e., if the implementation is not
> deemed mature enough
> for release X.Y it will be disabled.
>
>
>
> > Overall I think 

Re: KIP-769: Connect API to retrieve connector configuration definitions

2021-11-16 Thread Viktor Somogyi-Vass
Hi Folks,

I too think this would be a very useful feature. Some of our management
applications would provide a wizard for creating connectors. In this scenario
the user would basically fill out a sample configuration generated by the UI,
which would send it back to Connect for validation and eventually create a new
connector. The first part of this workflow could be enhanced if we had an API
that returns the configuration definition of the given connector type, as the
UI application would then be able to generate a sample for the user based on
it (nicely drawn diagram: https://imgur.com/a/7S1Xwm5).
The connector-plugins/{connectorType}/config/validate API essentially works
and returns the data that we need; however, it is an HTTP PUT API, which is a
bit unintuitive for fetch-like functionality, and it is also functionally
different as it validates the given (dummy) request. In the case of sink
connectors one would also need to provide a topic name.

A suggestion for the KIP: I think it would be useful to return the config
groups and the connector class's name, similarly to the validate API, just in
case any frontend needs them (the response would then be more like the
validate API's, but simpler).

Viktor

On Fri, Aug 20, 2021 at 4:51 PM Ryanne Dolan  wrote:

> I think it'd be worth adding a GET version, fwiw. Could be the same handler
> with just a different spelling maybe.
>
> On Fri, Aug 20, 2021, 7:44 AM Mickael Maison 
> wrote:
>
> > Hi Chris,
> >
> > You're right, you can achieve the same functionality using the
> > existing validate endpoint.
> > In my mind it was only for validation once you have build a
> > configuration but when used with an empty configuration, it basically
> > serves the same purpose as the proposed new endpoint.
> >
> > I think it's a bit easier to use a GET endpoint but I don't think it
> > really warrants a different endpoint.
> >
> > Thanks
> >
> > On Thu, Aug 19, 2021 at 2:56 PM Chris Egerton
> >  wrote:
> > >
> > > Hi Mickael,
> > >
> > > I'm wondering about the use case here. The motivation section states
> that
> > > "Connect does not provide a way to see what configurations a connector
> > > requires. Instead users have to go look at the connector documentation
> or
> > > in the worst case, look directly at the connector source code.", and
> that
> > > with this KIP, "users will be able to discover the required
> > configurations
> > > for connectors installed in a Connect cluster" and "tools will be able
> to
> > > generate wizards for configuring and starting connectors".
> > >
> > > Does the existing "PUT
> > /connector-plugins/{connector-type}/config/validate"
> > > endpoint not address these points? What will the newly-proposed
> endpoint
> > > allow users to do that they will not already be able to do with the
> > > existing endpoint?
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Thu, Aug 19, 2021 at 9:20 AM Mickael Maison <
> mickael.mai...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I've created KIP-769 to expose connector configuration definitions in
> > > > the Connect API
> > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-769%3A+Connect+API+to+retrieve+connector+configuration+definitions
> > > >
> > > > Please take a look and let me know if you have any feedback.
> > > >
> > > > Thanks
> > > >
> >
>


[jira] [Created] (KAFKA-13442) REST API endpoint for fetching a connector's config definition

2021-11-10 Thread Viktor Somogyi-Vass (Jira)
Viktor Somogyi-Vass created KAFKA-13442:
---

 Summary: REST API endpoint for fetching a connector's config 
definition
 Key: KAFKA-13442
 URL: https://issues.apache.org/jira/browse/KAFKA-13442
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Affects Versions: 3.2.0
Reporter: Viktor Somogyi-Vass
Assignee: Viktor Somogyi-Vass


To enhance UI based applications' capability to help users create new 
connectors from default configurations, it would be very useful to have an API 
that can fetch a connector type's configuration definition, which can then be 
filled out by users, sent back for validation, and finally used to create a new 
connector.

The API should be placed under {{connector-plugins}} and since 
{{connector-plugins/\{connectorType\}/config/validate}} already exists, 
{{connector-plugins/\{connectorType\}/config}} might be a good option for the 
new API.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (KAFKA-13240) HTTP TRACE should be disabled in Connect

2021-08-27 Thread Viktor Somogyi-Vass (Jira)
Viktor Somogyi-Vass created KAFKA-13240:
---

 Summary: HTTP TRACE should be disabled in Connect
 Key: KAFKA-13240
 URL: https://issues.apache.org/jira/browse/KAFKA-13240
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Reporter: Viktor Somogyi-Vass
Assignee: Viktor Somogyi-Vass


Modern browsers mostly disable HTTP TRACE to prevent XST (cross-site tracing) 
attacks. Because of this, this type of attack isn't very prevalent these days, 
but since TRACE isn't disabled in Connect, it may open up possible attack 
vectors (and it constantly pops up in security scans :) ). Therefore we'd like 
to disable it.
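
For reference, one common way to reject TRACE on a Jetty-based server (such as the 
Connect REST server) is a security constraint that denies the method on every path. 
This is only an illustrative sketch, not necessarily how the actual fix will be 
implemented:

{code}
import org.eclipse.jetty.security.ConstraintMapping;
import org.eclipse.jetty.security.ConstraintSecurityHandler;
import org.eclipse.jetty.servlet.ServletContextHandler;
import org.eclipse.jetty.util.security.Constraint;

public class DisableTraceExample {
    public static void applyTo(ServletContextHandler context) {
        // A constraint that requires authentication but allows no roles rejects requests.
        Constraint disableTrace = new Constraint();
        disableTrace.setAuthenticate(true);

        // Map the constraint to the TRACE method on every path.
        ConstraintMapping mapping = new ConstraintMapping();
        mapping.setConstraint(disableTrace);
        mapping.setMethod("TRACE");
        mapping.setPathSpec("/*");

        ConstraintSecurityHandler security = new ConstraintSecurityHandler();
        security.addConstraintMapping(mapping);
        context.setSecurityHandler(security);
    }
}
{code}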



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-12922) MirrorCheckpointTask should close topic filter

2021-06-09 Thread Viktor Somogyi-Vass (Jira)
Viktor Somogyi-Vass created KAFKA-12922:
---

 Summary: MirrorCheckpointTask should close topic filter
 Key: KAFKA-12922
 URL: https://issues.apache.org/jira/browse/KAFKA-12922
 Project: Kafka
  Issue Type: Improvement
  Components: mirrormaker
Affects Versions: 2.8.0
Reporter: Viktor Somogyi-Vass
Assignee: Viktor Somogyi-Vass


When a lot of connectors are restarted, it turned out that the underlying 
ConfigConsumers are not closed properly, and from the logs we can see that the 
old ones are still running.

Turns out that MirrorCheckpointTask utilizes a TopicFilter, but never closes 
it, leaking resources.
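
A minimal sketch of the kind of change implied (assuming TopicFilter is AutoCloseable, 
as the MM2 filter plugins are; the class and field names below are simplified, not the 
actual MirrorCheckpointTask code):

{code}
import org.apache.kafka.common.utils.Utils;

public class CheckpointTaskShutdownSketch {
    private final AutoCloseable topicFilter; // in MM2 this would be the TopicFilter plugin

    public CheckpointTaskShutdownSketch(AutoCloseable topicFilter) {
        this.topicFilter = topicFilter;
    }

    public void stop() {
        // closeQuietly logs and swallows exceptions so the rest of shutdown can proceed.
        Utils.closeQuietly(topicFilter, "topic filter");
    }
}
{code}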



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] KIP-373: Allow users to create delegation tokens for other users

2021-05-31 Thread Viktor Somogyi-Vass
The code is ready for review now: https://github.com/apache/kafka/pull/10738

Thanks,
Viktor

On Thu, May 20, 2021 at 9:58 AM Viktor Somogyi-Vass 
wrote:

> Hi Matthew,
>
> I saw your email the other day. Unfortunately this was deprioritized in our
> team back then, but now I've started to rebase and finish the solution.
> I'll create a PR (at least a draft) sometime later today and hopefully we can
> start reviewing it in a few days.
>
> Viktor
>
> On Tue, May 18, 2021 at 11:50 AM Matthew de Detrich
>  wrote:
>
>> Forgot to mention the code that was never merged in the PR
>>
>> https://github.com/omkreddy/kafka/commit/fc47aa8d06828ef1de1c12b6c33192e10e3afd0c
>>
>> On Tue, May 18, 2021 at 11:42 AM Matthew de Detrich <
>> matthew.dedetr...@aiven.io> wrote:
>>
>> > Apologies for necro/bump on this topic, but I am currently trying to
>> work
>> > on
>> > this topic and I noticed that the PR in question for KAFKA-6945 never
>> > ended up being
>> > created/merged (I have checked through git logs plus manually looking at
>> > the
>> > code).
>> >
>> > Is there a reason why this PR was never created/merged and if so would
>> > there
>> > be any issues if I was to go forward in rebasing commit for latest trunk
>> > and
>> > creating a new PR for it?
>> >
>> > --
>> > Matthew de Detrich
>> >
>> > Aiven Deutschland GmbH
>> >
>> > Immanuelkirchstraße 26, 10405 Berlin
>> >
>> > Amtsgericht Charlottenburg, HRB 209739 B
>> >
>> > m: +491603708037
>> >
>> > w: aiven.io e: matthew.dedetr...@aiven.io
>> >
>> > On 2020/01/31 09:35:18, Viktor Somogyi-Vass  wrote:
>> > > Hi All,>
>> > >
>> > > As a few days passed and we have the required number of binding votes,
>> > the>
>> > > KIP has passed it.>
>> > > Thank you all who have voted, I'll post the PR about this soon!>
>> > > Binding votes: Manikumar, Harsha, Jun>
>> > > Non-binding ones: Ryanne>
>> > >
>> > > Thanks,>
>> > > Viktor>
>> > >
>> > > On Tue, Jan 28, 2020 at 10:56 AM Viktor Somogyi-Vass <>
>> > > viktorsomo...@gmail.com> wrote:>
>> > >
>> > > > Hi Rajini,>
>> > > >>
>> > > > I rebased my older PR and double checked it. It'll work with a new>
>> > > > resource type without adding new fields the ACL admin client APIs.
>> As
>> > I>
>> > > > mentioned though, it'll be good to increment their version though to
>> > allow>
>> > > > more graceful handling of the protocol compatibilities as an older
>> > broker>
>> > > > won't know about the User resource type and probably will fail with
>> a>
>> > > > serialization error whereas if they match the protocol the client
>> > could>
>> > > > detect it's an older broker and wouldn't allow the request. I'll
>> > append>
>> > > > this to the KIP.>
>> > > > Please let me know if we're good to continue with this.>
>> > > >>
>> > > > Best,>
>> > > > Viktor>
>> > > >>
>> > > > On Mon, Jan 20, 2020 at 5:45 PM Viktor Somogyi-Vass <>
>> > > > viktorsomo...@gmail.com> wrote:>
>> > > >>
>> > > >> Hi Rajini,>
>> > > >>>
>> > > >> 1) I think we can to keep the conventions in the tool. As an
>> addition
>> > we>
>> > > >> wouldn't have to retain certain characters (for creating the
>> list).>
>> > > >> 2) Yes, so based on 1) and this --users changes to --user-principal
>> > (and>
>> > > >> accepts one single user principal).>
>> > > >> 3) Looking at it again probably we'll want to increase the version
>> of
>> > the>
>> > > >> ACL protocols as new resource and operation types are getting added
>> > and>
>> > > >> currently sending such requests to old brokers would result in>
>> > > >> serialization errors. So it would be nicer to handle them on the
>> API>
>> > > >> handshake. Besides this I don't see if we need to do anything else
>> as
>> > these>
>> > > >> operations should be able to

Re: [VOTE] KIP-373: Allow users to create delegation tokens for other users

2021-05-20 Thread Viktor Somogyi-Vass
Hi Matthew,

I saw your email the other day. Unfortunately this was deprioritized in our
team back then, but now I've started to rebase and finish the solution.
I'll create a PR (at least a draft) sometime later today and hopefully we can
start reviewing it in a few days.

Viktor

On Tue, May 18, 2021 at 11:50 AM Matthew de Detrich
 wrote:

> Forgot to mention the code that was never merged in the PR
>
> https://github.com/omkreddy/kafka/commit/fc47aa8d06828ef1de1c12b6c33192e10e3afd0c
>
> On Tue, May 18, 2021 at 11:42 AM Matthew de Detrich <
> matthew.dedetr...@aiven.io> wrote:
>
> > Apologies for necro/bump on this topic, but I am currently trying to work
> > on
> > this topic and I noticed that the PR in question for KAFKA-6945 never
> > ended up being
> > created/merged (I have checked through git logs plus manually looking at
> > the
> > code).
> >
> > Is there a reason why this PR was never created/merged and if so would
> > there
> > be any issues if I was to go forward in rebasing commit for latest trunk
> > and
> > creating a new PR for it?
> >
> > --
> > Matthew de Detrich
> >
> > Aiven Deutschland GmbH
> >
> > Immanuelkirchstraße 26, 10405 Berlin
> >
> > Amtsgericht Charlottenburg, HRB 209739 B
> >
> > m: +491603708037
> >
> > w: aiven.io e: matthew.dedetr...@aiven.io
> >
> > On 2020/01/31 09:35:18, Viktor Somogyi-Vass  wrote:
> > > Hi All,>
> > >
> > > As a few days passed and we have the required number of binding votes,
> > the>
> > > KIP has passed it.>
> > > Thank you all who have voted, I'll post the PR about this soon!>
> > > Binding votes: Manikumar, Harsha, Jun>
> > > Non-binding ones: Ryanne>
> > >
> > > Thanks,>
> > > Viktor>
> > >
> > > On Tue, Jan 28, 2020 at 10:56 AM Viktor Somogyi-Vass <>
> > > viktorsomo...@gmail.com> wrote:>
> > >
> > > > Hi Rajini,>
> > > >>
> > > > I rebased my older PR and double checked it. It'll work with a new>
> > > > resource type without adding new fields the ACL admin client APIs. As
> > I>
> > > > mentioned though, it'll be good to increment their version though to
> > allow>
> > > > more graceful handling of the protocol compatibilities as an older
> > broker>
> > > > won't know about the User resource type and probably will fail with
> a>
> > > > serialization error whereas if they match the protocol the client
> > could>
> > > > detect it's an older broker and wouldn't allow the request. I'll
> > append>
> > > > this to the KIP.>
> > > > Please let me know if we're good to continue with this.>
> > > >>
> > > > Best,>
> > > > Viktor>
> > > >>
> > > > On Mon, Jan 20, 2020 at 5:45 PM Viktor Somogyi-Vass <>
> > > > viktorsomo...@gmail.com> wrote:>
> > > >>
> > > >> Hi Rajini,>
> > > >>>
> > > >> 1) I think we can to keep the conventions in the tool. As an
> addition
> > we>
> > > >> wouldn't have to retain certain characters (for creating the list).>
> > > >> 2) Yes, so based on 1) and this --users changes to --user-principal
> > (and>
> > > >> accepts one single user principal).>
> > > >> 3) Looking at it again probably we'll want to increase the version
> of
> > the>
> > > >> ACL protocols as new resource and operation types are getting added
> > and>
> > > >> currently sending such requests to old brokers would result in>
> > > >> serialization errors. So it would be nicer to handle them on the
> API>
> > > >> handshake. Besides this I don't see if we need to do anything else
> as
> > these>
> > > >> operations should be able to handle these changes on the code level.
> > I'll>
> > > >> make sure to test this ACL scenario and report back about it
> > (although I>
> > > >> need a few days as the code I have is very old and contains a lot
> of>
> > > >> conflicts with the current trunk). Please let me know if I'm
> missing>
> > > >> something here.>
> > > >>>
> > > >> Thanks,>
> > > >> Viktor>
> > > >>>
> > > >> On Fri, Jan 17, 2020 at 5:23 PM Rajini Sivara

Re: [VOTE] KIP-567: Auditing in Kafka

2021-04-27 Thread Viktor Somogyi-Vass
Bumping this in the hope of getting some feedback.

On Fri, Mar 12, 2021 at 3:49 PM Viktor Somogyi-Vass 
wrote:

> Hi Everyone,
>
> It seems people don't have anything to add on the discussion, so I'd like
> to propose a vote now. In any case if you still have something to add,
> please feel free to write either on this thread or on the discussion.
>
> A summary of the KIP:
> I plan to add a simple interface called Auditor which has a method with
> two parameters. The first one is a request specific parameter and the
> second one is the request context. This can be used to implement any
> auditing logic that can connect to external systems. The interface is wired
> into KafkaApis and called right before sending the response back. I
> implemented it for some of the most used APIs that have user-side relevance
> (like ACLs and admin APIs). A more complete list can be found in the KIP.
>
> Link to the KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-567%3A+Kafka+Cluster+Audit
>
> The interface I plan to add:
>
> public interface Auditor extends Configurable, AutoCloseable {
>
> void audit(AuditEvent event, AuthorizableRequestContext
> requestContext);
> }
>
> I'm happy to receive any feedback on this (preferably +1 votes :) ).
>
> Thanks,
> Viktor
>
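
For illustration, a hypothetical implementation against the interface quoted above
could look like the sketch below (Auditor and AuditEvent are the types proposed in
the KIP; the class name and the logging behaviour are made up):

import java.util.Map;
import org.apache.kafka.server.authorizer.AuthorizableRequestContext;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingAuditor implements Auditor {
    private static final Logger log = LoggerFactory.getLogger("kafka.audit");

    @Override
    public void configure(Map<String, ?> configs) {
        // No configuration needed for this sketch.
    }

    @Override
    public void audit(AuditEvent event, AuthorizableRequestContext requestContext) {
        // Emit one audit line per handled request, right before the response is sent.
        log.info("api={} principal={} clientId={} event={}",
                requestContext.requestType(),
                requestContext.principal(),
                requestContext.clientId(),
                event);
    }

    @Override
    public void close() {
    }
}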


Re: [ANNOUNCE] New committer: Tom Bentley

2021-03-19 Thread Viktor Somogyi-Vass
Congrats Tom! :)

On Wed, Mar 17, 2021 at 3:47 PM Chia-Ping Tsai  wrote:

> Congratulations!!!
>
> On 2021/03/15 17:59:56, Mickael Maison  wrote:
> > Hi all,
> >
> > The PMC for Apache Kafka has invited Tom Bentley as a committer, and
> > we are excited to announce that he accepted!
> >
> > Tom first contributed to Apache Kafka in June 2017 and has been
> > actively contributing since February 2020.
> > He has accumulated 52 commits and worked on a number of KIPs. Here are
> > some of the most significant ones:
> >KIP-183: Change PreferredReplicaLeaderElectionCommand to use
> AdminClient
> >KIP-195: AdminClient.createPartitions
> >KIP-585: Filter and Conditional SMTs
> >KIP-621: Deprecate and replace DescribeLogDirsResult.all() and
> .values()
> >KIP-707: The future of KafkaFuture (still in discussion)
> >
> > In addition, he is very active on the mailing list and has helped
> > review many KIPs.
> >
> > Congratulations Tom and thanks for all the contributions!
> >
>


Re: [DISCUSS] KIP-707: The future of KafkaFuture

2021-03-19 Thread Viktor Somogyi-Vass
Hi Tom,

Have you considered directly subclassing CompletableFuture? Could we do this?
It might be a good addition to the alternatives section.
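
Just to illustrate what I mean (purely a sketch, not a concrete proposal):

import java.util.concurrent.CompletableFuture;

// If KafkaFuture extended CompletableFuture, callers would get the whole
// CompletionStage API for free; the obvious downside is that mutator methods
// such as complete() and obtrudeValue() would also become part of the public
// client API.
public abstract class SubclassedKafkaFuture<T> extends CompletableFuture<T> {
    // Admin result classes would then expose this type directly.
}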

Thanks,
Viktor

On Wed, Feb 24, 2021 at 10:13 AM Tom Bentley  wrote:

> If the next release is going to be Kafka 3.0, as seems to be the case, it
> would be a great time to decide whether and what we're doing with this API.
> So I'd be grateful for any feedback people might have.
>
> Many thanks,
>
> Tom
>
> On Tue, Feb 2, 2021 at 10:40 AM Tom Bentley  wrote:
>
> > I've previously discounted the possibility of an "Admin2" client, but
> > seeing the recent discussions on the thread for KIP-706, I wonder whether
> > this current proposal in KIP-707 would benefit from a bit more
> > discussion... I think there are broadly two approaches to evolving the
> > Admin client API to use CompletionStage directly (rather than what's
> > currently proposed in KIP-707):
> >
> > The simpler option, from a development point of view, would be to
> > introduce an alternative/parallel set of classes for each of the existing
> > result classes. E.g. ListTopicsOutcome which was the same as
> > ListTopicsResult, but using CompletionStage rather than KafkaFuture.
> Adding
> > methods to the existing Admin interface would require coming up with
> > synonym method names for every API call, and probably half of the API
> being
> > deprecated (if not immediately then in the long run). It would be cleaner
> > to have a whole new interface, let's call it Manager, using the same
> method
> > names. The existing Admin client implementation would then wrap a Manager
> > instance, and the existing Result classes could have a constructor
> > parameter of their corresponding Outcome instance which wrapped the
> > CompletionStages with KafkaFutures. The Options classes would be
> unchanged.
> > From a users point of view migrating to the new Manager client would
> mostly
> > be a matter of changing class names and adding a `.toCompletionStage()`
> to
> > those places where they were calling KafkaFuture.get()/getNow() (and even
> > this could be avoided if we used CompletableFuture rather than
> > CompletionStage in the Outcome class APIs). In the long run Admin would
> be
> > removed and we'd be left with the minor annoyance of having a client
> called
> > Manager in a package called admin.
> >
> > The more involved version would do a similar refactoring, but within a
> > different package. If we stuck with the Admin and Result class names the
> > users experience of migrating their codebase would be limited to changing
> > import statements and the same additions of `.toCompletionStage()`. On
> the
> > implementation side it would force us to duplicate all the Options
> classes
> > and also have a way of converting old Options instances to their new
> > equivalents so that the old Admin implementation could delegate to the
> new
> > one. The main benefit of this approach seems to be the slightly easier
> > experience for people porting their code to the new client.
> >
> > In doing either of these much more significant refactorings there would
> > also be the opportunity to omit the current Admin API's deprecated
> methods
> > and classes from the new API.
> >
> > Do we think this is worth biting off in order to have more long term
> > consistency between the Admin, Producer and consumer APIs?
> >
> > Kind regards,
> >
> > Tom
> >
> > On Fri, Jan 22, 2021 at 3:02 PM Tom Bentley  wrote:
> >
> >> Hi,
> >>
> >> Following a recent discussion on a PR[1], I've written KIP-707 to
> >> establish what should be done to improve the API of KafkaFuture.
> >> If you have the time, your comments would be most welcome, as some of
> the
> >> rejected alternatives are not unreasonable.
> >>
> >>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-707%3A+The+future+of+KafkaFuture
> >>
> >> Many thanks,
> >>
> >> Tom
> >>
> >> [1]: https://github.com/apache/kafka/pull/9878
> >>
> >
>

