Re: Request permissions to contribute to Apache Kafka

2023-12-21 Thread Josep Prat
Hi Jiao,

Your accounts are now all set. The Jira one was already properly set up.
Let me know if you have any problems with the accounts.

Thanks for showing your interest in Apache Kafka,

Best,

On Fri, Dec 22, 2023 at 5:50 AM Jiao Zhang  wrote:

> Hi team,
>
> May I request permissions?
> My wiki ID is "zhangjiao.thu"
> My jira ID is "Jiao-zhang"
>
> Thank you!
>
> --
> Jiao Zhang
>


-- 

*Josep Prat*
Open Source Engineering Director, *Aiven*
josep.p...@aiven.io   |   +491715557497
aiven.io
*Aiven Deutschland GmbH*
Alexanderufer 3-7, 10117 Berlin
Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
Amtsgericht Charlottenburg, HRB 209739 B


[jira] [Resolved] (KAFKA-16035) add integration test for ExpiresPerSec and RemoteLogSizeComputationTime metrics

2023-12-21 Thread Luke Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Chen resolved KAFKA-16035.
---
Fix Version/s: 3.7.0
   Resolution: Fixed

> add integration test for ExpiresPerSec and RemoteLogSizeComputationTime 
> metrics
> ---
>
> Key: KAFKA-16035
> URL: https://issues.apache.org/jira/browse/KAFKA-16035
> Project: Kafka
>  Issue Type: Test
>Reporter: Luke Chen
>Assignee: Luke Chen
>Priority: Major
> Fix For: 3.7.0
>
>
> add integration test for ExpiresPerSec and RemoteLogSizeComputationTime 
> metrics
> https://github.com/apache/kafka/pull/15015/commits/517a7c19d5a19bc94f0f79c02a239fd1ff7f6991



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: DISCUSS KIP-1011: Use incrementalAlterConfigs when updating broker configs by kafka-configs.sh

2023-12-21 Thread ziming deng
+1 for adding them to rejected alternatives. These kafka-ui tools should also 
evolve with the iterations of Kafka.
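
For readers following the thread, here is a minimal sketch of the incremental
semantics under discussion (assuming an Admin client pointed at a local test
broker; the broker id and the config entry are illustrative, not part of the
KIP):

import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class IncrementalAlterSketch {
    public static void main(String[] args) throws Exception {
        try (Admin admin = Admin.create(Map.<String, Object>of(
                AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"))) {
            ConfigResource broker = new ConfigResource(ConfigResource.Type.BROKER, "0");
            // incrementalAlterConfigs sends only the entries being changed.
            // The deprecated alterConfigs replaces the full config set, which
            // cannot be done correctly for sensitive configs because
            // describeConfigs never returns their current values.
            AlterConfigOp op = new AlterConfigOp(
                    new ConfigEntry("log.cleaner.threads", "2"),
                    AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(broker, List.of(op))).all().get();
        }
    }
}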

> On Dec 21, 2023, at 16:58, Николай Ижиков  wrote:
> 
>> In fact alterConfig and incrementalAlterConfig have different semantics: we 
>> must pass all configs when using alterConfig, whereas incrementalAlterConfigs 
>> lets us update configs incrementally, and it's not worth fixing alterConfig 
>> since it has been deprecated for a long time.
> 
> There can be third-party tools like `kafka-ui` or similar that suffer from 
> the same bug as the one you are fixing.
> If we fix `alterConfig` itself then we fix all tools and scripts that are 
> still using alterConfig.
> 
> Anyway, let’s add to the «Rejected alternatives» section the reasons why we 
> keep the buggy method as is and fix only the tools.
> 
>> I think your suggestion is nice; it should be marked as deprecated and will 
>> be removed together with `AdminClient.alterConfigs()`
> 
> Is it OK to introduce option that is deprecated from the beginning?
> 
> 
>> On Dec 21, 2023, at 06:03, ziming deng wrote:
>> 
>>> shouldn't we also introduce --disable-incremental as deprecated?
>> 
>> I think your suggestion is nice; it should be marked as deprecated and will 
>> be removed together with `AdminClient.alterConfigs()`
>> 
>> 
>>> On Dec 19, 2023, at 16:36, Federico Valeri  wrote:
>>> 
>>> Hi Ziming, thanks for the KIP. Looks good to me.
>>> 
>>> Just one question: given that alterConfig is deprecated, shouldn't we
>>> also introduce --disable-incremental as deprecated? That way we would
>>> get rid of both in Kafka 4.0. Also see:
>>> https://issues.apache.org/jira/browse/KAFKA-14705.
>>> 
>>> On Tue, Dec 19, 2023 at 9:05 AM ziming deng >> > wrote:
 
>>>> Thank you for mentioning this, Ismael.
>>>> 
>>>> I added this to the motivation section, and I think we can still update 
>>>> configs in this case by passing all sensitive configs, which is weird and 
>>>> not friendly.
>>>> 
>>>> --
>>>> Best,
>>>> Ziming
 
>>>>> On Dec 19, 2023, at 14:24, Ismael Juma wrote:
>>>>> 
>>>>> Thanks for the KIP. I think one of the main benefits of the change isn't 
>>>>> listed: sensitive configs make it impossible to make updates with the 
>>>>> current cli tool because sensitive config values are never returned.
>>>>> 
>>>>> Ismael
>>>>> 
>>>>> On Mon, Dec 18, 2023 at 7:58 PM ziming deng wrote:
>>>>>> 
>>>>>> Hello, I want to start a discussion on KIP-1011, to make the broker 
>>>>>> config change path unified with that of user/topic/client-metrics and 
>>>>>> avoid some bugs.
>>>>>> 
>>>>>> Here is the link:
>>>>>> KIP-1011: Use incrementalAlterConfigs when updating broker configs by 
>>>>>> kafka-configs.sh - Apache Kafka - Apache Software Foundation 
>>>>>> (cwiki.apache.org)
>>>>>> 
>>>>>> Best,
>>>>>> Ziming.



Request permissions to contribute to Apache Kafka

2023-12-21 Thread Jiao Zhang
Hi team,

May I request permissions?
My wiki ID is "zhangjiao.thu"
My jira ID is "Jiao-zhang"

Thank you!

-- 
Jiao Zhang


Re: [DISCUSS] Road to Kafka 4.0

2023-12-21 Thread Luke Chen
For release 3.8, I think we should also include the unclean leader election
support in KRaft.
But we can discuss more details in the KIP.

Thank you, Josep!
And thank you all for the comments!

Luke

On Fri, Dec 22, 2023 at 1:14 AM Ismael Juma  wrote:

> Thank you Josep!
>
> Ismael
>
> On Thu, Dec 21, 2023, 9:09 AM Josep Prat 
> wrote:
>
> > Hi Ismael,
> >
> > I can volunteer to write the KIP. Unless somebody else has any
> objections,
> > I'll get to write it by the end of this week.
> >
> > Best,
> >
> > Josep Prat
> > Open Source Engineering Director, Aiven
> > josep.p...@aiven.io | +491715557497 | aiven.io
> > Aiven Deutschland GmbH
> > Alexanderufer 3-7, 10117 Berlin
> > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > Amtsgericht Charlottenburg, HRB 209739 B
> >
> > On Thu, Dec 21, 2023, 17:58 Ismael Juma  wrote:
> >
> > > Hi all,
> > >
> > > After understanding the use case Josep and Anton described in more
> > detail,
> > > I think it's fair to say that quorum reconfiguration is necessary for
> > > migration of Apache Kafka users who follow this pattern. Given that, I
> > > think we should have a 3.8 release before the 4.0 release.
> > >
> > > The next question is whether we should do something special when it
> comes
> > > to timeline, parallel releases, etc. After careful consideration, I
> think
> > > we should simply follow our usual approach: regular 3.8 release around
> > > early May 2024 and regular 4.0 release around early September 2024. The
> > > community will be able to start working on items specific to 4.0 after
> > 3.8
> > > is branched in late March/early April - I don't think we need to deal
> > with
> > > the overhead of maintaining multiple long-lived branches for
> > > feature development.
> > >
> > > If the proposal above sounds reasonable, I suggest we write a KIP and
> > vote
> > > on it. Any volunteers?
> > >
> > > Ismael
> > >
> > > On Tue, Nov 21, 2023 at 8:18 PM Ismael Juma  wrote:
> > >
> > > > Hi Luke,
> > > >
> > > > I think we're conflating different things here. There are 3 separate
> > > > points in your email, but only 1 of them requires 3.8:
> > > >
> > > > 1. JBOD may have some bugs in 3.7.0. Whatever bugs exist can be fixed
> > in
> > > > 3.7.x. We have already said that we will backport critical fixes to
> > 3.7.x
> > > > for some time.
> > > > 2. Quorum reconfiguration is important to include in 4.0, the release
> > > > where ZK won't be supported. This doesn't need a 3.8 release either.
> > > > 3. Quorum reconfiguration is necessary for migration use cases and
> > hence
> > > > needs to be in a 3.x release. This one would require a 3.8 release if
> > > true.
> > > > But we should have a debate on whether it is indeed true. It's not
> > clear
> > > to
> > > > me yet.
> > > >
> > > > Ismael
> > > >
> > > > On Tue, Nov 21, 2023 at 7:30 PM Luke Chen  wrote:
> > > >
> > > >> Hi Colin and Jose,
> > > >>
> > > >> I revisited the discussion of KIP-833 here, and you
> > > >> can see I'm the first one to reply to the discussion thread to
> express
> > > my
> > > >> excitement at that time. Till now, I personally still think having
> > KRaft
> > > >> in
> > > >> Kafka is a good direction we have to move forward. But to move to
> this
> > > >> destination, we need to make our users comfortable with this
> decision.
> > > The
> > > >> worst scenario is, we said 4.0 is ready, and ZK is removed. Then,
> some
> > > >> users move to 4.0 and say, wait a minute, why does it not support
> xxx
> > > >> feature? And then start to search for other alternatives to replace
> > > Apache
> > > >> Kafka. We all don't want to see this, right? So, that's why some
> > > community
> > > >> users start to express their concern to move to 4.0 too quickly,
> > > including
> > > >> me.
> > > >>
> > > >>
> > > >> Quoting Colin:
> > > >> > While dynamic quorum reconfiguration is a nice feature, it doesn't
> > > block
> > > >> anything: not migration, not deployment.
> > > >>
> > > >> Clearly Confluent team might deploy ZooKeeper in a particular way
> and
> > > >> didn’t depend on its ability to support reconfiguration. So KRaft is
> > > ready
> > > >> from your point of view. But users of Apache Kafka might have come
> to
> > > >> depend on some ZooKeeper functionality, such as the ability to
> > > reconfigure
> > > >> ZooKeeper quorums, that is not available in KRaft, yet. I don’t
> think
> > > the
> > > >> Apache Kafka documentation has ever said “do not depend on this
> > ability
> > > of
> > > >> Apache Kafka or Zookeeper”, so it doesn’t seem unreasonable for
> users
> > to
> > > >> have deployed ZooKeeper in this way. In KIP-833
> > > >> <
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-833%3A+Mark+KRaft+as+Production+Ready#KIP833:MarkKRaftasProductionReady-MissingFeatures
> > > >> >,
> > > >> we said: “Modifying certain dynamic configurations on the standalone
> > > 

Re: [DISCUSS] KIP-954: expand default DSL store configuration to custom types

2023-12-21 Thread Almog Gavra
Hello Everyone! I updated the KIP once more following a bug
investigation: I added DslWindowParams#isTimestamped to the public API as
a result of https://issues.apache.org/jira/browse/KAFKA-16046. Please let
me know if there are any concerns with this addition.
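
For anyone trying this out, a minimal sketch of wiring in a DslStoreSuppliers
implementation through StreamsConfig (this assumes the config key and built-in
supplier names from the KIP as merged for 3.7; a custom type would implement
the DslStoreSuppliers interface instead):

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.state.BuiltInDslStoreSuppliers;

public class DslStoreSuppliersSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "kip-954-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Class-valued replacement for the old enum-valued default.dsl.store.
        props.put(StreamsConfig.DSL_STORE_SUPPLIERS_CLASS_CONFIG,
                BuiltInDslStoreSuppliers.InMemoryDslStoreSuppliers.class);
    }
}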

On Thu, Dec 14, 2023 at 5:40 PM Almog Gavra  wrote:

> Sorry for the late response to the late reply, hah! I hadn't given any
> thought to how we would want to integrate this custom store supplier
> with querying of the stores. My initial intuition suggests that we'd
> probably want a separate API for that, or just recommend that people query
> their external stores outside of the context of Kafka Streams (with the
> understanding that there are fewer semantic guarantees).
>
> On Sat, Dec 2, 2023 at 9:38 AM Guozhang Wang 
> wrote:
>
>> Hey Almog,
>>
>> Sorry for the late reply.
>>
>> Re: 2) above, maybe I'm just overthinking it. What I had in mind is
>> the case where we have, say, a remote store impl customized by the users.
>> Besides being used inside the KS app itself, the user may try to
>> access the store instance outside the KS app as well. If that's the
>> case, maybe it's still worth having an interface from KS to expose the
>> store instance directly.
>>
>>
>> Guozhang
>>
>>
>> On Sun, Nov 19, 2023 at 5:26 PM Almog Gavra 
>> wrote:
>> >
>> > Hello Guozhang,
>> >
>> > Thanks for the feedback! For 1 there are tests verifying this and I did
>> so
>> > manually as well; it does not reveal anything about the store types --
>> just
>> > the names, so I think we're good there. I've put an example at the
>> bottom
>> > of this reply for people following the conversation.
>> >
>> > I'm not sure I understand your question about 2. What's the integration
>> > point with the actual store for this external component? What does that
>> > have to do with this PR/how does it differ from what's available today
>> > (with the default.dsl.store configuration)? In either scenario, getting
>> the
>> > actual instantiated store supplier must be done only after the topology
>> is
>> > built and rewritten (it can be passed in either via
>> > Materialized/StreamJoined in the DSL code, via TopologyConfig overrides
>> or
>> > in the global StreamsConfig passed in to KafkaStreams). Today, AFAIK,
>> this
>> > isn't possible (you can't get from the built topology the instantiated
>> > store supplier).
>> >
>> > Thanks,
>> > Almog
>> >
>> > 
>> >
>> > Topologies:
>> >Sub-topology: 0
>> > Source: KSTREAM-SOURCE-00 (topics: [test_topic])
>> >   --> KSTREAM-TRANSFORMVALUES-01
>> > Processor: KSTREAM-TRANSFORMVALUES-01 (stores: [])
>> >   --> Aggregate-Prepare
>> >   <-- KSTREAM-SOURCE-00
>> > Processor: Aggregate-Prepare (stores: [])
>> >   --> KSTREAM-AGGREGATE-03
>> >   <-- KSTREAM-TRANSFORMVALUES-01
>> > Processor: KSTREAM-AGGREGATE-03 (stores:
>> > [Aggregate-Aggregate-Materialize])
>> >   --> Aggregate-Aggregate-ToOutputSchema
>> >   <-- Aggregate-Prepare
>> > Processor: Aggregate-Aggregate-ToOutputSchema (stores: [])
>> >   --> Aggregate-Project
>> >   <-- KSTREAM-AGGREGATE-03
>> > Processor: Aggregate-Project (stores: [])
>> >   --> KTABLE-TOSTREAM-06
>> >   <-- Aggregate-Aggregate-ToOutputSchema
>> > Processor: KTABLE-TOSTREAM-06 (stores: [])
>> >   --> KSTREAM-SINK-07
>> >   <-- Aggregate-Project
>> > Sink: KSTREAM-SINK-07 (topic: S2)
>> >   <-- KTABLE-TOSTREAM-06
>> >
>> > On Sat, Nov 18, 2023 at 6:05 PM Guozhang Wang <
>> guozhang.wang...@gmail.com>
>> > wrote:
>> >
>> > > Hello Almog,
>> > >
>> > > I left a comment in the PR before I got to read the newest updates
>> > > from this thread. My 2c:
>> > >
>> > > 1. I liked the idea of delaying the instantiation of StoreBuiler from
>> > > suppliers after the Topology is created. It has been a bit annoying
>> > > for many other features we were trying back then. The only thing is,
>> > > we need to check when we call Topology.describe() which gets a
>> > > TopologyDescription, does that reveal anything about the source of
>> > > truth store impl types already; if it does not, then we are good to
>> > > go.
>> > >
>> > > 2. I originally thought (and commented in the PR) that maybe we can
>> > > just add this new func "resolveDslStoreSuppliers" into StreamsConfig
>> > > directly and mark it as EVOLVING, because I was not clear that we are
>> > > trying to do 1) above. Now I'm leaning more towards what you proposed.
>> > > But I still have a question in mind: even after we've done
>> > > https://github.com/apache/kafka/pull/14548 later, don't we still need
>> > > some interface that user's can call to get the actual instantiated
>> > > store supplier for cases where some external custom logic, like an
>> > > external controller / scheduler which is developed by a different
>> > > group of people rather than the 

[jira] [Created] (KAFKA-16046) Stream Stream Joins fail after restoration with deserialization exceptions

2023-12-21 Thread Almog Gavra (Jira)
Almog Gavra created KAFKA-16046:
---

 Summary: Stream Stream Joins fail after restoration with 
deserialization exceptions
 Key: KAFKA-16046
 URL: https://issues.apache.org/jira/browse/KAFKA-16046
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.7.0
Reporter: Almog Gavra
Assignee: Almog Gavra


Before KIP-954, the `KStreamImplJoin` class would always create non-timestamped 
persistent windowed stores. After that KIP, the default was changed to create 
timestamped stores. This wasn't compatible because, during restoration, 
timestamped stores have their changelog values transformed to prepend the 
timestamp to the value. This caused serialization errors when trying to read 
from the store because the deserializers did not expect the timestamp to be 
prepended.
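
A small illustration of the incompatibility described above (the layout below
is a sketch of the idea, not the exact Streams changelog wire format):

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class TimestampedLayoutSketch {
    public static void main(String[] args) {
        byte[] plain = "v1".getBytes(StandardCharsets.UTF_8);
        // A timestamped store's changelog value prepends the 8-byte record
        // timestamp to the plain serialized value, so a deserializer written
        // for the plain layout reads timestamp bytes as payload and fails.
        ByteBuffer timestamped = ByteBuffer.allocate(Long.BYTES + plain.length);
        timestamped.putLong(1_703_116_800_000L); // record timestamp in ms
        timestamped.put(plain);
        System.out.println("plain=" + plain.length + " bytes, timestamped="
                + timestamped.capacity() + " bytes");
    }
}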



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Kafka trunk test & build stability

2023-12-21 Thread Philip Nee
Hey Sophie - I've gotten 2 inflight PRs each with more than 15 retries...
Namely: https://github.com/apache/kafka/pull/15023 and
https://github.com/apache/kafka/pull/15035

Justine filed a flaky test report here though:
https://issues.apache.org/jira/browse/KAFKA-16045

P

On Thu, Dec 21, 2023 at 3:18 PM Sophie Blee-Goldman 
wrote:

> On a related note, has anyone else had trouble getting even a single run
> with no build failures lately? I've had multiple pure-docs PRs blocked for
> days or even weeks because of miscellaneous infra, test, and timeout
> failures. I know we just had a discussion about whether it's acceptable to
> ever merge with a failing build, and the consensus (which I agree with) was
> NO -- but seriously, this is getting ridiculous. The build might be the
> worst I've ever seen it, and it just makes it really difficult to maintain
> good will with external contributors.
>
> Take for example this small docs PR:
> https://github.com/apache/kafka/pull/14949
>
> It's on its 7th replay, with the first 6 runs all having (at least) one
> build that failed completely. The issues I saw on this one PR are a good
> summary of what I've been seeing elsewhere, so here's the briefing:
>
> 1. gradle issue:
>
> > * What went wrong:
> >
> > Gradle could not start your build.
> >
> > > Cannot create service of type BuildSessionActionExecutor using method
> > LauncherServices$ToolingBuildSessionScopeServices.createActionExecutor()
> as
> > there is a problem with parameter #21 of type
> FileSystemWatchingInformation.
> >
> >> Cannot create service of type BuildLifecycleAwareVirtualFileSystem
> > using method
> >
> VirtualFileSystemServices$GradleUserHomeServices.createVirtualFileSystem()
> > as there is a problem with parameter #7 of type GlobalCacheLocations.
> >   > Cannot create service of type GlobalCacheLocations using method
> > GradleUserHomeScopeServices.createGlobalCacheLocations() as there is a
> > problem with parameter #1 of type List.
> >  > Could not create service of type FileAccessTimeJournal using
> > GradleUserHomeScopeServices.createFileAccessTimeJournal().
> > > Timeout waiting to lock journal cache
> > (/home/jenkins/.gradle/caches/journal-1). It is currently in use by
> another
> > Gradle instance.
> >
>
> 2. git issue:
>
> > ERROR: Error cloning remote repo 'origin'
> > hudson.plugins.git.GitException: java.io.IOException: Remote call on
> > builds43 failed
>
>
> 3. storage test calling System.exit (I think)
>
> > * What went wrong:
> >  Execution failed for task ':storage:test'.
> >  > Process 'Gradle Test Executor 73' finished with non-zero exit value 1
>
> This problem might be caused by incorrect test process configuration.
>
>
> 4.  3/4 builds aborted suddenly for no clear reason
>
> 5. 1 build was aborted, 1 build failed due to a gradle(?) issue with a
> storage test:
>
> Failed to map supported failure 'org.opentest4j.AssertionFailedError:
> > Failed to observe commit callback before timeout' with mapper
> >
> 'org.gradle.api.internal.tasks.testing.failure.mappers.OpenTestAssertionFailedMapper@38bb78ea
> ':
> > null
>
>
>
> * What went wrong:
> > Execution failed for task ':storage:test'.
> > > Process 'Gradle Test Executor 73' finished with non-zero exit value 1
> >   This problem might be caused by incorrect test process configuration.
> >
>
> 6.  Unknown issue with a core test:
>
> > Unexpected exception thrown.
> > org.gradle.internal.remote.internal.MessageIOException: Could not read
> > message from '/127.0.0.1:46952'.
> >   at
> >
> org.gradle.internal.remote.internal.inet.SocketConnection.receive(SocketConnection.java:94)
> >   at
> >
> org.gradle.internal.remote.internal.hub.MessageHub$ConnectionReceive.run(MessageHub.java:270)
> >   at
> >
> org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64)
> >   at
> >
> org.gradle.internal.concurrent.AbstractManagedExecutor$1.run(AbstractManagedExecutor.java:47)
> >   at
> >
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
> >   at
> >
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
> >   at java.base/java.lang.Thread.run(Thread.java:1583)
> > Caused by: java.lang.IllegalArgumentException
> >   at
> >
> org.gradle.internal.remote.internal.hub.InterHubMessageSerializer$MessageReader.read(InterHubMessageSerializer.java:72)
> >   at
> >
> org.gradle.internal.remote.internal.hub.InterHubMessageSerializer$MessageReader.read(InterHubMessageSerializer.java:52)
> >   at
> >
> org.gradle.internal.remote.internal.inet.SocketConnection.receive(SocketConnection.java:81)
> > ... 6 more
> > org.gradle.internal.remote.internal.ConnectException: Could not connect
> to
> > server [1d62bf97-6a3e-441d-93b6-093617cbbea9 port:41289, addresses:[/
> > 127.0.0.1]]. Tried addresses: [/127.0.0.1].
> >   at
> >
> 

Re: Kafka trunk test & build stability

2023-12-21 Thread Sophie Blee-Goldman
On a related note, has anyone else had trouble getting even a single run
with no build failures lately? I've had multiple pure-docs PRs blocked for
days or even weeks because of miscellaneous infra, test, and timeout
failures. I know we just had a discussion about whether it's acceptable to
ever merge with a failing build, and the consensus (which I agree with) was
NO -- but seriously, this is getting ridiculous. The build might be the
worst I've ever seen it, and it just makes it really difficult to maintain
good will with external contributors.

Take for example this small docs PR:
https://github.com/apache/kafka/pull/14949

It's on its 7th replay, with the first 6 runs all having (at least) one
build that failed completely. The issues I saw on this one PR are a good
summary of what I've been seeing elsewhere, so here's the briefing:

1. gradle issue:

> * What went wrong:
>
> Gradle could not start your build.
>
> > Cannot create service of type BuildSessionActionExecutor using method
> LauncherServices$ToolingBuildSessionScopeServices.createActionExecutor() as
> there is a problem with parameter #21 of type FileSystemWatchingInformation.
>
>> Cannot create service of type BuildLifecycleAwareVirtualFileSystem
> using method
> VirtualFileSystemServices$GradleUserHomeServices.createVirtualFileSystem()
> as there is a problem with parameter #7 of type GlobalCacheLocations.
>   > Cannot create service of type GlobalCacheLocations using method
> GradleUserHomeScopeServices.createGlobalCacheLocations() as there is a
> problem with parameter #1 of type List.
>  > Could not create service of type FileAccessTimeJournal using
> GradleUserHomeScopeServices.createFileAccessTimeJournal().
> > Timeout waiting to lock journal cache
> (/home/jenkins/.gradle/caches/journal-1). It is currently in use by another
> Gradle instance.
>

2. git issue:

> ERROR: Error cloning remote repo 'origin'
> hudson.plugins.git.GitException: java.io.IOException: Remote call on
> builds43 failed


3. storage test calling System.exit (I think)

> * What went wrong:
>  Execution failed for task ':storage:test'.
>  > Process 'Gradle Test Executor 73' finished with non-zero exit value 1

This problem might be caused by incorrect test process configuration.


4.  3/4 builds aborted suddenly for no clear reason

5. 1 build was aborted, 1 build failed due to a gradle(?) issue with a
storage test:

Failed to map supported failure 'org.opentest4j.AssertionFailedError:
> Failed to observe commit callback before timeout' with mapper
> 'org.gradle.api.internal.tasks.testing.failure.mappers.OpenTestAssertionFailedMapper@38bb78ea':
> null



* What went wrong:
> Execution failed for task ':storage:test'.
> > Process 'Gradle Test Executor 73' finished with non-zero exit value 1
>   This problem might be caused by incorrect test process configuration.
>

6.  Unknown issue with a core test:

> Unexpected exception thrown.
> org.gradle.internal.remote.internal.MessageIOException: Could not read
> message from '/127.0.0.1:46952'.
>   at
> org.gradle.internal.remote.internal.inet.SocketConnection.receive(SocketConnection.java:94)
>   at
> org.gradle.internal.remote.internal.hub.MessageHub$ConnectionReceive.run(MessageHub.java:270)
>   at
> org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64)
>   at
> org.gradle.internal.concurrent.AbstractManagedExecutor$1.run(AbstractManagedExecutor.java:47)
>   at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
>   at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
>   at java.base/java.lang.Thread.run(Thread.java:1583)
> Caused by: java.lang.IllegalArgumentException
>   at
> org.gradle.internal.remote.internal.hub.InterHubMessageSerializer$MessageReader.read(InterHubMessageSerializer.java:72)
>   at
> org.gradle.internal.remote.internal.hub.InterHubMessageSerializer$MessageReader.read(InterHubMessageSerializer.java:52)
>   at
> org.gradle.internal.remote.internal.inet.SocketConnection.receive(SocketConnection.java:81)
> ... 6 more
> org.gradle.internal.remote.internal.ConnectException: Could not connect to
> server [1d62bf97-6a3e-441d-93b6-093617cbbea9 port:41289, addresses:[/
> 127.0.0.1]]. Tried addresses: [/127.0.0.1].
>   at
> org.gradle.internal.remote.internal.inet.TcpOutgoingConnector.connect(TcpOutgoingConnector.java:67)
>   at
> org.gradle.internal.remote.internal.hub.MessageHubBackedClient.getConnection(MessageHubBackedClient.java:36)
>   at
> org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:103)
>   at
> org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:65)
>   at
> worker.org.gradle.process.internal.worker.GradleWorkerMain.run(GradleWorkerMain.java:69)
>   at
> 

Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.7 #29

2023-12-21 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 3007 lines...]
> Task :core:compileTestScala
> Task :clients:check

> Task :core:compileScala
[Warn] /home/jenkins/.gradle/workers/warning:[options] bootstrap class path not 
set in conjunction with -source 8
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.7/core/src/main/java/kafka/log/remote/RemoteLogManager.java:235:
  [removal] AccessController in java.security has been deprecated and marked 
for removal
[Warn] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.7/core/src/main/java/kafka/log/remote/RemoteLogManager.java:257:
  [removal] AccessController in java.security has been deprecated and marked 
for removal

> Task :core:classes
> Task :core:compileTestJava NO-SOURCE
> Task :core:checkstyleMain
> Task :shell:compileJava
> Task :shell:classes
> Task :shell:checkstyleMain
> Task :shell:spotbugsMain
> Task :core:compileTestScala
> Task :clients:check

> Task :core:compileTestScala
Unexpected javac output: warning: [options] bootstrap class path not set in 
conjunction with -source 8
Note: 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.7/core/src/test/java/kafka/log/remote/RemoteLogManagerTest.java
 uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 warning.

> Task :core:testClasses
> Task :core:spotbugsTest SKIPPED

> Task :core:compileTestScala
Unexpected javac output: warning: [options] bootstrap class path not set in 
conjunction with -source 8
warning: [options] source value 8 is obsolete and will be removed in a future 
release
warning: [options] target value 8 is obsolete and will be removed in a future 
release
warning: [options] To suppress warnings about obsolete options, use 
-Xlint:-options.
Note: 
/home/jenkins/workspace/Kafka_kafka_3.7/core/src/test/java/kafka/log/remote/RemoteLogManagerTest.java
 uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
4 warnings.

> Task :core:testClasses
> Task :core:spotbugsTest SKIPPED
> Task :shell:compileTestJava
> Task :shell:testClasses
> Task :shell:spotbugsTest SKIPPED
> Task :core:checkstyleTest
> Task :shell:checkstyleTest
> Task :shell:check
> Task :storage:compileTestJava
> Task :storage:testClasses
> Task :storage:spotbugsTest SKIPPED
> Task :jmh-benchmarks:compileJava
> Task :jmh-benchmarks:classes
> Task :jmh-benchmarks:compileTestJava NO-SOURCE
> Task :jmh-benchmarks:testClasses UP-TO-DATE
> Task :jmh-benchmarks:checkstyleTest NO-SOURCE
> Task :jmh-benchmarks:spotbugsTest SKIPPED
> Task :shell:compileTestJava
> Task :shell:testClasses
> Task :shell:spotbugsTest SKIPPED
> Task :core:checkstyleTest
> Task :shell:checkstyleTest
> Task :shell:check
> Task :storage:checkstyleTest
> Task :storage:check
> Task :jmh-benchmarks:checkstyleMain
> Task :storage:compileTestJava
> Task :storage:testClasses
> Task :storage:spotbugsTest SKIPPED
> Task :connect:runtime:compileTestJava
> Task :connect:runtime:testClasses
> Task :connect:runtime:spotbugsTest SKIPPED
> Task :connect:file:compileTestJava
> Task :connect:file:testClasses
> Task :connect:file:spotbugsTest SKIPPED
> Task :connect:file:checkstyleTest
> Task :connect:file:check
> Task :connect:mirror:compileTestJava
> Task :connect:mirror:testClasses
> Task :connect:mirror:spotbugsTest SKIPPED
> Task :jmh-benchmarks:compileJava
> Task :jmh-benchmarks:classes
> Task :jmh-benchmarks:compileTestJava NO-SOURCE
> Task :jmh-benchmarks:testClasses UP-TO-DATE
> Task :jmh-benchmarks:checkstyleTest NO-SOURCE
> Task :jmh-benchmarks:spotbugsTest SKIPPED
> Task :tools:compileTestJava
> Task :tools:testClasses
> Task :tools:spotbugsTest SKIPPED
> Task :storage:checkstyleTest
> Task :storage:check
> Task :jmh-benchmarks:checkstyleMain
> Task :connect:mirror:checkstyleTest
> Task :connect:mirror:check
> Task :connect:runtime:compileTestJava
> Task :connect:runtime:testClasses
> Task :connect:runtime:spotbugsTest SKIPPED
> Task :connect:file:compileTestJava
> Task :connect:file:testClasses
> Task :connect:file:spotbugsTest SKIPPED
> Task :tools:checkstyleTest
> Task :tools:check
> Task :connect:file:checkstyleTest
> Task :connect:file:check
> Task :connect:mirror:compileTestJava
> Task :connect:mirror:testClasses
> Task :connect:mirror:spotbugsTest SKIPPED
> Task :tools:compileTestJava
> Task :tools:testClasses
> Task :tools:spotbugsTest SKIPPED
> Task :connect:mirror:checkstyleTest
> Task :connect:mirror:check
> Task :streams:compileTestJava
> Task :core:spotbugsMain
> Task :tools:checkstyleTest
> Task :tools:check
> Task :connect:runtime:checkstyleTest
> Task :connect:runtime:check
> Task :streams:compileTestJava
> Task :core:spotbugsMain
> Task 

[jira] [Resolved] (KAFKA-16040) Rename `Generic` to `Classic`

2023-12-21 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-16040.
-
Resolution: Fixed

> Rename `Generic` to `Classic`
> -
>
> Key: KAFKA-16040
> URL: https://issues.apache.org/jira/browse/KAFKA-16040
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: David Jacot
>Assignee: David Jacot
>Priority: Blocker
> Fix For: 3.7.0
>
>
> People have raised concerns about using {{Generic}} as a name to designate 
> the old rebalance protocol. We considered using {{Legacy}} but discarded it 
> because there are still applications, such as Connect, using the old 
> protocol. We settled on using {{Classic}} for the {{{}Classic Rebalance 
> Protocol{}}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16045) ZkMigrationIntegrationTest.testMigrateTopicDeletion flaky

2023-12-21 Thread Justine Olshan (Jira)
Justine Olshan created KAFKA-16045:
--

 Summary: ZkMigrationIntegrationTest.testMigrateTopicDeletion flaky
 Key: KAFKA-16045
 URL: https://issues.apache.org/jira/browse/KAFKA-16045
 Project: Kafka
  Issue Type: Test
Reporter: Justine Olshan


I'm seeing ZkMigrationIntegrationTest.testMigrateTopicDeletion fail for many 
builds. I believe it is also causing a thread leak, because on most runs where 
it fails I also see ReplicaManager tests fail with extra threads. 

The test always fails with 
`org.opentest4j.AssertionFailedError: Timed out waiting for topics to be 
deleted`


gradle enterprise link:

[https://ge.apache.org/scans/tests?search.names=Git%20branch[…]lues=trunk=kafka.zk.ZkMigrationIntegrationTest|https://ge.apache.org/scans/tests?search.names=Git%20branch=P28D=kafka=America%2FLos_Angeles=trunk=kafka.zk.ZkMigrationIntegrationTest]

recent pr: 
[https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-15023/18/tests/]
trunk builds: 
[https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/2502/tests],
 
[https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/2501/tests]
 (edited) 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Road to Kafka 4.0

2023-12-21 Thread Ismael Juma
Thank you Josep!

Ismael

On Thu, Dec 21, 2023, 9:09 AM Josep Prat 
wrote:

> Hi Ismael,
>
> I can volunteer to write the KIP. Unless somebody else has any objections,
> I'll get to write it by the end of this week.
>
> Best,
>
> Josep Prat
> Open Source Engineering Director, Aiven
> josep.p...@aiven.io | +491715557497 | aiven.io
> Aiven Deutschland GmbH
> Alexanderufer 3-7, 10117 Berlin
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> Amtsgericht Charlottenburg, HRB 209739 B
>
> On Thu, Dec 21, 2023, 17:58 Ismael Juma  wrote:
>
> > Hi all,
> >
> > After understanding the use case Josep and Anton described in more
> detail,
> > I think it's fair to say that quorum reconfiguration is necessary for
> > migration of Apache Kafka users who follow this pattern. Given that, I
> > think we should have a 3.8 release before the 4.0 release.
> >
> > The next question is whether we should do something special when it comes
> > to timeline, parallel releases, etc. After careful consideration, I think
> > we should simply follow our usual approach: regular 3.8 release around
> > early May 2024 and regular 4.0 release around early September 2024. The
> > community will be able to start working on items specific to 4.0 after
> 3.8
> > is branched in late March/early April - I don't think we need to deal
> with
> > the overhead of maintaining multiple long-lived branches for
> > feature development.
> >
> > If the proposal above sounds reasonable, I suggest we write a KIP and
> vote
> > on it. Any volunteers?
> >
> > Ismael
> >
> > On Tue, Nov 21, 2023 at 8:18 PM Ismael Juma  wrote:
> >
> > > Hi Luke,
> > >
> > > I think we're conflating different things here. There are 3 separate
> > > points in your email, but only 1 of them requires 3.8:
> > >
> > > 1. JBOD may have some bugs in 3.7.0. Whatever bugs exist can be fixed
> in
> > > 3.7.x. We have already said that we will backport critical fixes to
> 3.7.x
> > > for some time.
> > > 2. Quorum reconfiguration is important to include in 4.0, the release
> > > where ZK won't be supported. This doesn't need a 3.8 release either.
> > > 3. Quorum reconfiguration is necessary for migration use cases and
> hence
> > > needs to be in a 3.x release. This one would require a 3.8 release if
> > true.
> > > But we should have a debate on whether it is indeed true. It's not
> clear
> > to
> > > me yet.
> > >
> > > Ismael
> > >
> > > On Tue, Nov 21, 2023 at 7:30 PM Luke Chen  wrote:
> > >
> > >> Hi Colin and Jose,
> > >>
> > >> I revisited the discussion of KIP-833 here, and you
> > >> can see I'm the first one to reply to the discussion thread to express
> > my
> > >> excitement at that time. Till now, I personally still think having
> KRaft
> > >> in
> > >> Kafka is a good direction we have to move forward. But to move to this
> > >> destination, we need to make our users comfortable with this decision.
> > The
> > >> worst scenario is, we said 4.0 is ready, and ZK is removed. Then, some
> > >> users move to 4.0 and say, wait a minute, why does it not support xxx
> > >> feature? And then start to search for other alternatives to replace
> > Apache
> > >> Kafka. We all don't want to see this, right? So, that's why some
> > community
> > >> users start to express their concern to move to 4.0 too quickly,
> > including
> > >> me.
> > >>
> > >>
> > >> Quoting Colin:
> > >> > While dynamic quorum reconfiguration is a nice feature, it doesn't
> > block
> > >> anything: not migration, not deployment.
> > >>
> > >> Clearly Confluent team might deploy ZooKeeper in a particular way and
> > >> didn’t depend on its ability to support reconfiguration. So KRaft is
> > ready
> > >> from your point of view. But users of Apache Kafka might have come to
> > >> depend on some ZooKeeper functionality, such as the ability to
> > reconfigure
> > >> ZooKeeper quorums, that is not available in KRaft, yet. I don’t think
> > the
> > >> Apache Kafka documentation has ever said “do not depend on this
> ability
> > of
> > >> Apache Kafka or Zookeeper”, so it doesn’t seem unreasonable for users
> to
> > >> have deployed ZooKeeper in this way. In KIP-833
> > >> <
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-833%3A+Mark+KRaft+as+Production+Ready#KIP833:MarkKRaftasProductionReady-MissingFeatures
> > >> >,
> > >> we said: “Modifying certain dynamic configurations on the standalone
> > KRaft
> > >> controller” was an important missing feature. Unfortunately it wasn’t
> as
> > >> explicit as it could have been. While no one expects KRaft to support
> > all
> > >> the features of ZooKeeper, it looks to me that users might depend on
> > this
> > >> particular feature and it’s only recently that it’s become apparent
> that
> > >> you don’t consider it a blocker.
> > >>
> > >> Quoting José:
> > >> > If we do a 3.8 release before 4.0 and we implement KIP-853 in 3.8,
> the
> > >> user will be able to migrate to a 

Re: [DISCUSS] Road to Kafka 4.0

2023-12-21 Thread Josep Prat
Hi Ismael,

I can volunteer to write the KIP. Unless somebody else has any objections,
I'll get to write it by the end of this week.

Best,

Josep Prat
Open Source Engineering Director, Aiven
josep.p...@aiven.io | +491715557497 | aiven.io
Aiven Deutschland GmbH
Alexanderufer 3-7, 10117 Berlin
Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
Amtsgericht Charlottenburg, HRB 209739 B

On Thu, Dec 21, 2023, 17:58 Ismael Juma  wrote:

> Hi all,
>
> After understanding the use case Josep and Anton described in more detail,
> I think it's fair to say that quorum reconfiguration is necessary for
> migration of Apache Kafka users who follow this pattern. Given that, I
> think we should have a 3.8 release before the 4.0 release.
>
> The next question is whether we should do something special when it comes
> to timeline, parallel releases, etc. After careful consideration, I think
> we should simply follow our usual approach: regular 3.8 release around
> early May 2024 and regular 4.0 release around early September 2024. The
> community will be able to start working on items specific to 4.0 after 3.8
> is branched in late March/early April - I don't think we need to deal with
> the overhead of maintaining multiple long-lived branches for
> feature development.
>
> If the proposal above sounds reasonable, I suggest we write a KIP and vote
> on it. Any volunteers?
>
> Ismael
>
> On Tue, Nov 21, 2023 at 8:18 PM Ismael Juma  wrote:
>
> > Hi Luke,
> >
> > I think we're conflating different things here. There are 3 separate
> > points in your email, but only 1 of them requires 3.8:
> >
> > 1. JBOD may have some bugs in 3.7.0. Whatever bugs exist can be fixed in
> > 3.7.x. We have already said that we will backport critical fixes to 3.7.x
> > for some time.
> > 2. Quorum reconfiguration is important to include in 4.0, the release
> > where ZK won't be supported. This doesn't need a 3.8 release either.
> > 3. Quorum reconfiguration is necessary for migration use cases and hence
> > needs to be in a 3.x release. This one would require a 3.8 release if
> true.
> > But we should have a debate on whether it is indeed true. It's not clear
> to
> > me yet.
> >
> > Ismael
> >
> > On Tue, Nov 21, 2023 at 7:30 PM Luke Chen  wrote:
> >
> >> Hi Colin and Jose,
> >>
> >> I revisited the discussion of KIP-833 here, and you
> >> can see I'm the first one to reply to the discussion thread to express
> my
> >> excitement at that time. Till now, I personally still think having KRaft
> >> in
> >> Kafka is a good direction we have to move forward. But to move to this
> >> destination, we need to make our users comfortable with this decision.
> The
> >> worst scenario is, we said 4.0 is ready, and ZK is removed. Then, some
> >> users move to 4.0 and say, wait a minute, why does it not support xxx
> >> feature? And then start to search for other alternatives to replace
> Apache
> >> Kafka. We all don't want to see this, right? So, that's why some
> community
> >> users start to express their concern to move to 4.0 too quickly,
> including
> >> me.
> >>
> >>
> >> Quoting Colin:
> >> > While dynamic quorum reconfiguration is a nice feature, it doesn't
> block
> >> anything: not migration, not deployment.
> >>
> >> Clearly Confluent team might deploy ZooKeeper in a particular way and
> >> didn’t depend on its ability to support reconfiguration. So KRaft is
> ready
> >> from your point of view. But users of Apache Kafka might have come to
> >> depend on some ZooKeeper functionality, such as the ability to
> reconfigure
> >> ZooKeeper quorums, that is not available in KRaft, yet. I don’t think
> the
> >> Apache Kafka documentation has ever said “do not depend on this ability
> of
> >> Apache Kafka or Zookeeper”, so it doesn’t seem unreasonable for users to
> >> have deployed ZooKeeper in this way. In KIP-833
> >> <
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-833%3A+Mark+KRaft+as+Production+Ready#KIP833:MarkKRaftasProductionReady-MissingFeatures
> >> >,
> >> we said: “Modifying certain dynamic configurations on the standalone
> KRaft
> >> controller” was an important missing feature. Unfortunately it wasn’t as
> >> explicit as it could have been. While no one expects KRaft to support
> all
> >> the features of ZooKeeper, it looks to me that users might depend on
> this
> >> particular feature and it’s only recently that it’s become apparent that
> >> you don’t consider it a blocker.
> >>
> >> Quoting José:
> >> > If we do a 3.8 release before 4.0 and we implement KIP-853 in 3.8, the
> >> user will be able to migrate to a KRaft cluster that supports
> dynamically
> >> changing the set of voters and has better support for disk failures.
> >>
> >> Yes, KIP-853 and disk failure support are both very important missing
> >> features. For the disk failure support, I don't think this is a
> >> "good-to-have-feature", it should be a "must-have" IMO. We can't
> 

Re: [DISCUSS] Road to Kafka 4.0

2023-12-21 Thread David Jacot
Thanks, Ismael. The proposal makes sense. +1

David

On Thu, Dec 21, 2023 at 5:59 PM Ismael Juma  wrote:

> Hi all,
>
> After understanding the use case Josep and Anton described in more detail,
> I think it's fair to say that quorum reconfiguration is necessary for
> migration of Apache Kafka users who follow this pattern. Given that, I
> think we should have a 3.8 release before the 4.0 release.
>
> The next question is whether we should do something special when it comes
> to timeline, parallel releases, etc. After careful consideration, I think
> we should simply follow our usual approach: regular 3.8 release around
> early May 2024 and regular 4.0 release around early September 2024. The
> community will be able to start working on items specific to 4.0 after 3.8
> is branched in late March/early April - I don't think we need to deal with
> the overhead of maintaining multiple long-lived branches for
> feature development.
>
> If the proposal above sounds reasonable, I suggest we write a KIP and vote
> on it. Any volunteers?
>
> Ismael
>
> On Tue, Nov 21, 2023 at 8:18 PM Ismael Juma  wrote:
>
> > Hi Luke,
> >
> > I think we're conflating different things here. There are 3 separate
> > points in your email, but only 1 of them requires 3.8:
> >
> > 1. JBOD may have some bugs in 3.7.0. Whatever bugs exist can be fixed in
> > 3.7.x. We have already said that we will backport critical fixes to 3.7.x
> > for some time.
> > 2. Quorum reconfiguration is important to include in 4.0, the release
> > where ZK won't be supported. This doesn't need a 3.8 release either.
> > 3. Quorum reconfiguration is necessary for migration use cases and hence
> > needs to be in a 3.x release. This one would require a 3.8 release if
> true.
> > But we should have a debate on whether it is indeed true. It's not clear
> to
> > me yet.
> >
> > Ismael
> >
> > On Tue, Nov 21, 2023 at 7:30 PM Luke Chen  wrote:
> >
> >> Hi Colin and Jose,
> >>
> >> I revisited the discussion of KIP-833 here, and you
> >> can see I'm the first one to reply to the discussion thread to express
> my
> >> excitement at that time. Till now, I personally still think having KRaft
> >> in
> >> Kafka is a good direction we have to move forward. But to move to this
> >> destination, we need to make our users comfortable with this decision.
> The
> >> worst scenario is, we said 4.0 is ready, and ZK is removed. Then, some
> >> users move to 4.0 and say, wait a minute, why does it not support xxx
> >> feature? And then start to search for other alternatives to replace
> Apache
> >> Kafka. We all don't want to see this, right? So, that's why some
> community
> >> users start to express their concern to move to 4.0 too quickly,
> including
> >> me.
> >>
> >>
> >> Quoting Colin:
> >> > While dynamic quorum reconfiguration is a nice feature, it doesn't
> block
> >> anything: not migration, not deployment.
> >>
> >> Clearly Confluent team might deploy ZooKeeper in a particular way and
> >> didn’t depend on its ability to support reconfiguration. So KRaft is
> ready
> >> from your point of view. But users of Apache Kafka might have come to
> >> depend on some ZooKeeper functionality, such as the ability to
> reconfigure
> >> ZooKeeper quorums, that is not available in KRaft, yet. I don’t think
> the
> >> Apache Kafka documentation has ever said “do not depend on this ability
> of
> >> Apache Kafka or Zookeeper”, so it doesn’t seem unreasonable for users to
> >> have deployed ZooKeeper in this way. In KIP-833
> >> <
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-833%3A+Mark+KRaft+as+Production+Ready#KIP833:MarkKRaftasProductionReady-MissingFeatures
> >> >,
> >> we said: “Modifying certain dynamic configurations on the standalone
> KRaft
> >> controller” was an important missing feature. Unfortunately it wasn’t as
> >> explicit as it could have been. While no one expects KRaft to support
> all
> >> the features of ZooKeeper, it looks to me that users might depend on
> this
> >> particular feature and it’s only recently that it’s become apparent that
> >> you don’t consider it a blocker.
> >>
> >> Quoting José:
> >> > If we do a 3.8 release before 4.0 and we implement KIP-853 in 3.8, the
> >> user will be able to migrate to a KRaft cluster that supports
> dynamically
> >> changing the set of voters and has better support for disk failures.
> >>
> >> Yes, KIP-853 and disk failure support are both very important missing
> >> features. For the disk failure support, I don't think this is a
> >> "good-to-have-feature", it should be a "must-have" IMO. We can't
> announce
> >> the 4.0 release without a good solution for disk failure in KRaft.
> >>
> >> It’s also worth thinking about how Apache Kafka users who depend on JBOD
> >> might look at the risks of not having a 3.8 release. JBOD support on
> KRaft
> >> is planned to be added in 3.7, and is still in progress so far. So it’s
> >> hard 

Re: [DISCUSS] Road to Kafka 4.0

2023-12-21 Thread Ismael Juma
Hi all,

After understanding the use case Josep and Anton described in more detail,
I think it's fair to say that quorum reconfiguration is necessary for
migration of Apache Kafka users who follow this pattern. Given that, I
think we should have a 3.8 release before the 4.0 release.

The next question is whether we should do something special when it comes
to timeline, parallel releases, etc. After careful consideration, I think
we should simply follow our usual approach: regular 3.8 release around
early May 2024 and regular 4.0 release around early September 2024. The
community will be able to start working on items specific to 4.0 after 3.8
is branched in late March/early April - I don't think we need to deal with
the overhead of maintaining multiple long-lived branches for
feature development.

If the proposal above sounds reasonable, I suggest we write a KIP and vote
on it. Any volunteers?

Ismael

On Tue, Nov 21, 2023 at 8:18 PM Ismael Juma  wrote:

> Hi Luke,
>
> I think we're conflating different things here. There are 3 separate
> points in your email, but only 1 of them requires 3.8:
>
> 1. JBOD may have some bugs in 3.7.0. Whatever bugs exist can be fixed in
> 3.7.x. We have already said that we will backport critical fixes to 3.7.x
> for some time.
> 2. Quorum reconfiguration is important to include in 4.0, the release
> where ZK won't be supported. This doesn't need a 3.8 release either.
> 3. Quorum reconfiguration is necessary for migration use cases and hence
> needs to be in a 3.x release. This one would require a 3.8 release if true.
> But we should have a debate on whether it is indeed true. It's not clear to
> me yet.
>
> Ismael
>
> On Tue, Nov 21, 2023 at 7:30 PM Luke Chen  wrote:
>
>> Hi Colin and Jose,
>>
>> I revisited the discussion of KIP-833 here, and you
>> can see I'm the first one to reply to the discussion thread to express my
>> excitement at that time. Till now, I personally still think having KRaft
>> in
>> Kafka is a good direction we have to move forward. But to move to this
>> destination, we need to make our users comfortable with this decision. The
>> worst scenario is, we said 4.0 is ready, and ZK is removed. Then, some
>> users move to 4.0 and say, wait a minute, why does it not support xxx
>> feature? And then start to search for other alternatives to replace Apache
>> Kafka. We all don't want to see this, right? So, that's why some community
>> users start to express their concern to move to 4.0 too quickly, including
>> me.
>>
>>
>> Quoting Colin:
>> > While dynamic quorum reconfiguration is a nice feature, it doesn't block
>> anything: not migration, not deployment.
>>
>> Clearly Confluent team might deploy ZooKeeper in a particular way and
>> didn’t depend on its ability to support reconfiguration. So KRaft is ready
>> from your point of view. But users of Apache Kafka might have come to
>> depend on some ZooKeeper functionality, such as the ability to reconfigure
>> ZooKeeper quorums, that is not available in KRaft, yet. I don’t think the
>> Apache Kafka documentation has ever said “do not depend on this ability of
>> Apache Kafka or Zookeeper”, so it doesn’t seem unreasonable for users to
>> have deployed ZooKeeper in this way. In KIP-833
>> <
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-833%3A+Mark+KRaft+as+Production+Ready#KIP833:MarkKRaftasProductionReady-MissingFeatures
>> >,
>> we said: “Modifying certain dynamic configurations on the standalone KRaft
>> controller” was an important missing feature. Unfortunately it wasn’t as
>> explicit as it could have been. While no one expects KRaft to support all
>> the features of ZooKeeper, it looks to me that users might depend on this
>> particular feature and it’s only recently that it’s become apparent that
>> you don’t consider it a blocker.
>>
>> Quoting José:
>> > If we do a 3.8 release before 4.0 and we implement KIP-853 in 3.8, the
>> user will be able to migrate to a KRaft cluster that supports dynamically
>> changing the set of voters and has better support for disk failures.
>>
>> Yes, KIP-853 and disk failure support are both very important missing
>> features. For the disk failure support, I don't think this is a
>> "good-to-have-feature", it should be a "must-have" IMO. We can't announce
>> the 4.0 release without a good solution for disk failure in KRaft.
>>
>> It’s also worth thinking about how Apache Kafka users who depend on JBOD
>> might look at the risks of not having a 3.8 release. JBOD support on KRaft
>> is planned to be added in 3.7, and is still in progress so far. So it’s
>> hard to say it’s a blocker or not. But in practice, even if the feature is
>> made into 3.7 in time, a lot of new code for this feature is unlikely to
>> be
>> entirely bug free. We need to maintain the confidence of those users, and
>> forcing them to migrate through 3.7 where this new code is hardly
>> battle-tested doesn’t appear 

[jira] [Created] (KAFKA-16044) Throttling using Topic Partition Quota

2023-12-21 Thread Afshin Moazami (Jira)
Afshin Moazami created KAFKA-16044:
--

 Summary: Throttling using Topic Partition Quota 
 Key: KAFKA-16044
 URL: https://issues.apache.org/jira/browse/KAFKA-16044
 Project: Kafka
  Issue Type: New Feature
Reporter: Afshin Moazami


With KAFKA-16042 introducing the topic-partition byte rate and metrics, and 
KAFKA-16043 introducing the quota limit configuration at the topic level, we 
can enforce quotas at the topic-partition level for configured topics. 

More details in the 
[KIP-1010|https://cwiki.apache.org/confluence/display/KAFKA/KIP-1010%3A+Topic+Partition+Quota]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16043) Add Quota configuration for topics

2023-12-21 Thread Afshin Moazami (Jira)
Afshin Moazami created KAFKA-16043:
--

 Summary: Add Quota configuration for topics
 Key: KAFKA-16043
 URL: https://issues.apache.org/jira/browse/KAFKA-16043
 Project: Kafka
  Issue Type: New Feature
Reporter: Afshin Moazami


To be able to have topic-partition quotas, we need to introduce two topic 
configurations for the producer-byte-rate and consumer-byte-rate. 

The assumption is that all partitions of the same topic get the same quota, so 
we define one config per topic. 

This configuration should work with both ZooKeeper and KRaft setups. 

Also, we should define a default quota value (to be discussed) and potentially 
use the same format as the user/client default configuration, using `` as 
the value. 
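
A hypothetical sketch of setting the proposed configs through the Admin API
(the config names below follow this ticket's wording and do not exist as topic
configs in released Kafka):

import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class TopicQuotaConfigSketch {
    public static void main(String[] args) throws Exception {
        try (Admin admin = Admin.create(Map.<String, Object>of(
                AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"))) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
            // Proposed per-topic quota entries; every partition of the topic
            // would be throttled at these byte rates.
            admin.incrementalAlterConfigs(Map.of(topic, List.of(
                    new AlterConfigOp(new ConfigEntry("producer-byte-rate", "1048576"),
                            AlterConfigOp.OpType.SET),
                    new AlterConfigOp(new ConfigEntry("consumer-byte-rate", "2097152"),
                            AlterConfigOp.OpType.SET)))).all().get();
        }
    }
}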



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16042) Quota Metrics based on per topic-partition produce/fetch byte rate

2023-12-21 Thread Afshin Moazami (Jira)
Afshin Moazami created KAFKA-16042:
--

 Summary: Quota Metrics based on per topic-partition produce/fetch 
byte rate
 Key: KAFKA-16042
 URL: https://issues.apache.org/jira/browse/KAFKA-16042
 Project: Kafka
  Issue Type: New Feature
  Components: core
Reporter: Afshin Moazami
Assignee: Afshin Moazami


Currently, Kafka emits the producer-byte-rate and fetch-bytes-rate for quota 
calculations. By adding a new signature to the 
`[quotaMetricTags|https://github.com/afshing/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/server/quota/ClientQuotaCallback.java#L40]`
 method that takes the individual topic-partition sizes as a parameter, we can 
define metrics based on the topic name and partition id. 

To do that, we need both `ProduceRequest` and `FetchResponse` to have a public 
`partitionSizes` method. 
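
For concreteness, a sketch of what such an overload could look like (this is
hypothetical; only the three-argument quotaMetricTags below exists today on
ClientQuotaCallback, and the partitionSizes parameter follows this ticket's
wording):

import java.util.Map;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.security.auth.KafkaPrincipal;
import org.apache.kafka.server.quota.ClientQuotaType;

public interface PartitionAwareQuotaCallbackSketch {
    // Existing method on ClientQuotaCallback, shown for reference.
    Map<String, String> quotaMetricTags(ClientQuotaType quotaType,
                                        KafkaPrincipal principal,
                                        String clientId);

    // Hypothetical overload: per-partition request sizes would let the
    // callback tag quota metrics with topic name and partition id.
    Map<String, String> quotaMetricTags(ClientQuotaType quotaType,
                                        KafkaPrincipal principal,
                                        String clientId,
                                        Map<TopicPartition, Integer> partitionSizes);
}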



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Road to Kafka 4.0

2023-12-21 Thread Justine Olshan
Hey all,

While I understand your points Divij, I am also not in favor of having two
official release branches being developed at the same time.
If we are really concerned about the metrics change or any other JIRA
ticket, we can have a separate branch for that, rather than a new release
branch. I think it will be difficult to merge with trunk when it is time to
do a real release.

I'm interested to hear Ismael's new proposal :)

Justine



On Thu, Dec 21, 2023 at 8:00 AM Divij Vaidya 
wrote:

> Fair point, David. The point of the experimental release was to allow users to
> test the initial major version and to allow developers to start working on
> the next major version. Even if we don't release it, I think that there is value
> in starting a 4.x branch (separate from trunk).
>
> Having a 4.x branch will allow us to start developing (or removing) things
> that we are currently unable to do due to constraints of having to maintain
> backward compatibility with JDK 8 and other deprecated APIs/dependencies. If
> we don't do it right now and instead choose to do it after 3.8, there is
> very limited time (~3-4 months) for that branch to bake and make the
> required changes.
>
> As an example, our metrics library (metrics-core) is still running a
> version (2.2.0) from 2012. Upgrading it is a breaking change (long story,
> not relevant to this thread) and hence, we can't merge it to trunk right
> now. So, we will have to schedule this change between 3.8 & 4.0. What if we
> don't have developer bandwidth to work on this change during that 3 month
> window? With a 4.x branch, we can start building (and more importantly,
> testing!) changes for the next major version right away. There are numerous
> other things (I came across another one
> https://issues.apache.org/jira/browse/KAFKA-16041) that we can start doing
> now for 4.x.
>
> What do you think?
>
> --
> Divij Vaidya
>
>
>
> On Thu, Dec 21, 2023 at 4:30 PM David Jacot 
> wrote:
>
> > Hi Divij,
> >
> > > Release 4.0 as an "experimental" release
> >
> > I don't think that this is something that we should do. If we need more
> > time, we should just do a 3.8 release and then release 4.0 when we are
> > ready. An experimental major release will be more confusing than anything
> > else. We should also keep in mind that major releases are also adopted
> with
> > more scrutiny in general. I don't think that many users will jump to 4.0
> > anyway. They will likely wait for 4.0.1 or even 4.1.
> >
> > Best,
> > David
> >
> > On Thu, Dec 21, 2023 at 3:59 PM Divij Vaidya 
> > wrote:
> >
> > > Hi folks
> > >
> > > I am late to the conversation but I would like to add my point of view
> > > here.
> > >
> > > I have three main concerns:
> > >
> > > 1\ Durability/availability bugs in kraft - Even though kraft has been
> > > around for a while, we keep finding bugs that impact availability and
> > data
> > > durability in it almost with every release [1] [2]. It's a complex
> > feature
> > > and such bugs are expected during the stabilization phase. But we can't
> > > remove the alternative until we see stabilization in kraft i.e. no new
> > > stability/durability bugs for at least 2 releases.
> > > 2\ Parity with Zk - There are also pending bugs [3] which are in the
> > > category of Zk parity. Removing Zk from Kafka without having full
> feature
> > > parity with Zk will leave some Kafka users with no upgrade path.
> > > 3\ Test coverage - We also don't have sufficient test coverage for
> kraft
> > > since quite a few tests are Zk only at this stage.
> > >
> > > Given these concerns, I believe we need to reach 100% Zk parity and
> allow
> > > new feature stabilisation (such as scram, JBOD) for at least 1 version
> > > (maybe more if we find bugs in that feature) before we remove Zk. I
> also
> > > agree with the point of view that we can't delay 4.0 indefinitely and
> we
> > > need a clear cut line.
> > >
> > > Hence, I propose the following:
> > > 1\ Keep trunk with 3.x. Release 3.8 and potentially 3.9 if we find
> major
> > > (durability/availability related) bugs in 3.8. This will help users
> > > continue to use their tried and tested Kafka setup until we have a
> proven
> > > alternative from feature parity & stability point of view.
> > > 2\ Release 4.0 as an "experimental" release along with 3.8 "stable"
> > > release. This will help get user feedback on the feasibility of
> removing
> > Zk
> > > completely right now.
> > > 3\ Create a criteria for moving 4.1 as "stable" release instead of
> > > "experimental". This list should include 100% Zk parity and 100% Kafka
> > > tests operating with kraft. It will also include other community
> feedback
> > > from this & other threads.
> > > 4\ When the 4.x version is "stable", move the trunk to 4.x and stop all
> > > development on the 3.x branch.
> > >
> > > I acknowledge that earlier in the community, we have decided to make
> 3.7
> > as
> > > the last release in the 3.x series. But, IMO we have learnt a lot since
> > > then 

Re: [DISCUSS] Road to Kafka 4.0

2023-12-21 Thread Divij Vaidya
Fair point David. The point of experimental release was to allow users to
test the initial major version and allow for developers to start working on
the major version. Even if we don't release, I think that there is value in
starting a 4.x branch (separate from trunk).

Having a 4.x branch will allow us to start developing (or removing) things
that we are currently unable to do due to constraints of having to maintain
backward compatibility of JDK 8 and other deprecated APIs/dependencies. If
we don't do it right now and instead choose to do it after 3.8, there is
very limited time (~3-4 months) for that branch to bake and make the
required changes.

As an example, our metrics library (metrics-core) is still running a
version (2.2.0) from 2012. Upgrading it is a breaking change (long story,
not relevant to this thread) and hence, we can't merge it to trunk right
now. So, we will have to schedule this change between 3.8 & 4.0. What if we
don't have developer bandwidth to work on this change during that 3 month
window? With a 4.x branch, we can start building (and more importantly,
testing!) changes for the next major version right away. There are numerous
other things (I came across another one
https://issues.apache.org/jira/browse/KAFKA-16041) that we can start doing
now for 4.x.

What do you think?

--
Divij Vaidya



On Thu, Dec 21, 2023 at 4:30 PM David Jacot 
wrote:

> Hi Divij,
>
> > Release 4.0 as an "experimental" release
>
> I don't think that this is something that we should do. If we need more
> time, we should just do a 3.8 release and then release 4.0 when we are
> ready. An experimental major release will be more confusing than anything
> else. We should also keep in mind that major releases are also adopted with
> more scrutiny in general. I don't think that many users will jump to 4.0
> anyway. They will likely wait for 4.0.1 or even 4.1.
>
> Best,
> David
>
> On Thu, Dec 21, 2023 at 3:59 PM Divij Vaidya 
> wrote:
>
> > Hi folks
> >
> > I am late to the conversation but I would like to add my point of view
> > here.
> >
> > I have three main concerns:
> >
> > 1\ Durability/availability bugs in kraft - Even though kraft has been
> > around for a while, we keep finding bugs that impact availability and
> data
> > durability in it almost with every release [1] [2]. It's a complex
> feature
> > and such bugs are expected during the stabilization phase. But we can't
> > remove the alternative until we see stabilization in kraft i.e. no new
> > stability/durability bugs for at least 2 releases.
> > 2\ Parity with Zk - There are also pending bugs [3] which are in the
> > category of Zk parity. Removing Zk from Kafka without having full feature
> > parity with Zk will leave some Kafka users with no upgrade path.
> > 3\ Test coverage - We also don't have sufficient test coverage for kraft
> > since quite a few tests are Zk only at this stage.
> >
> > Given these concerns, I believe we need to reach 100% Zk parity and allow
> > new feature stabilisation (such as scram, JBOD) for at least 1 version
> > (maybe more if we find bugs in that feature) before we remove Zk. I also
> > agree with the point of view that we can't delay 4.0 indefinitely and we
> > need a clear cut line.
> >
> > Hence, I propose the following:
> > 1\ Keep trunk with 3.x. Release 3.8 and potentially 3.9 if we find major
> > (durability/availability related) bugs in 3.8. This will help users
> > continue to use their tried and tested Kafka setup until we have a proven
> > alternative from feature parity & stability point of view.
> > 2\ Release 4.0 as an "experimental" release along with 3.8 "stable"
> > release. This will help get user feedback on the feasibility of removing
> Zk
> > completely right now.
> > 3\ Create a criteria for moving 4.1 as "stable" release instead of
> > "experimental". This list should include 100% Zk parity and 100% Kafka
> > tests operating with kraft. It will also include other community feedback
> > from this & other threads.
> > 4\ When the 4.x version is "stable", move the trunk to 4.x and stop all
> > development on the 3.x branch.
> >
> > I acknowledge that earlier in the community, we have decided to make 3.7
> as
> > the last release in the 3.x series. But, IMO we have learnt a lot since
> > then based on the continuous improvements in kraft. I believe we should
> be
> > flexible with our earlier stance here and allow for greater stability
> > before forcing users to a completely new functionality.
> >
> > [1] https://issues.apache.org/jira/browse/KAFKA-15495
> > [2] https://issues.apache.org/jira/browse/KAFKA-15489
> > [3] https://issues.apache.org/jira/browse/KAFKA-14874
> >
> > --
> > Divij Vaidya
> >
> >
> >
> > On Wed, Dec 20, 2023 at 4:59 PM Josep Prat 
> > wrote:
> >
> > > Hi Justine, Luke, and others,
> > >
> > > I believe a 3.8 version would make sense, and I would say KIP-853
> should
> > be
> > > part of it as well.
> > >
> > > Best,
> > >
> > > On Wed, Dec 20, 2023 at 4:11 PM Justine 

Re: [DISCUSS] Road to Kafka 4.0

2023-12-21 Thread Ismael Juma
Hi Divij,

Comments inline.

On Thu, Dec 21, 2023 at 7:00 AM Divij Vaidya 
wrote:

> 1\ Durability/availability bugs in kraft - Even though kraft has been
> around for a while, we keep finding bugs that impact availability and data
> durability in it almost with every release [1] [2]. It's a complex feature
> and such bugs are expected during the stabilization phase. But we can't
> remove the alternative until we see stabilization in kraft i.e. no new
> stability/durability bugs for at least 2 releases.
>

I strongly disagree with this description. The first reference is in
relation to a bug that affects zk mode too and has done since forever. The
second issue was known and deemed not a blocker (and has since been fixed).
KRaft mode is truly prod ready and is significantly better than zk mode
based on prod metrics for thousands of clusters.

2\ Parity with Zk - There are also pending bugs [3] which are in the
> category of Zk parity. Removing Zk from Kafka without having full feature
> parity with Zk will leave some Kafka users with no upgrade path.
>

The bug you referenced isn't a real concern since it can be fixed in a
patch release.


> 3\ Test coverage - We also don't have sufficient test coverage for kraft
> since quite a few tests are Zk only at this stage.
>

I disagree with this again, we have more than enough coverage for kraft.

Given these concerns, I believe we need to reach 100% Zk parity and allow
> new feature stabilisation (such as scram, JBOD) for at least 1 version
> (maybe more if we find bugs in that feature) before we remove Zk. I also
> agree with the point of view that we can't delay 4.0 indefinitely and we
> need a clear cut line.
>
> Hence, I propose the following:
> 1\ Keep trunk with 3.x. Release 3.8 and potentially 3.9 if we find major
> (durability/availability related) bugs in 3.8. This will help users
> continue to use their tried and tested Kafka setup until we have a proven
> alternative from feature parity & stability point of view.
> 2\ Release 4.0 as an "experimental" release along with 3.8 "stable"
> release. This will help get user feedback on the feasibility of removing Zk
> completely right now.
> 3\ Create a criteria for moving 4.1 as "stable" release instead of
> "experimental". This list should include 100% Zk parity and 100% Kafka
> tests operating with kraft. It will also include other community feedback
> from this & other threads.
> 4\ When the 4.x version is "stable", move the trunk to 4.x and stop all
> development on the 3.x branch.
>

Strong -1 to this proposal. I will send a separate email with what I think
we should do.

Ismael


[jira] [Created] (KAFKA-16041) Replace Afterburner module with Blackbird

2023-12-21 Thread Mario Fiore Vitale (Jira)
Mario Fiore Vitale created KAFKA-16041:
--

 Summary: Replace Afterburner module with Blackbird
 Key: KAFKA-16041
 URL: https://issues.apache.org/jira/browse/KAFKA-16041
 Project: Kafka
  Issue Type: Task
  Components: connect
Reporter: Mario Fiore Vitale
 Fix For: 4.0.0


[Blackbird|https://github.com/FasterXML/jackson-modules-base/blob/master/blackbird/README.md]
 is the Afterburner replacement for Java 11+.
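
For illustration, a minimal sketch of the swap on the Jackson side, assuming 
the jackson-module-blackbird dependency replaces jackson-module-afterburner on 
the classpath:

{code:java}
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.json.JsonMapper;
import com.fasterxml.jackson.module.blackbird.BlackbirdModule;

public class MapperFactory {
    public static ObjectMapper create() {
        // Blackbird speeds up (de)serialization via LambdaMetafactory rather
        // than Afterburner's runtime bytecode generation, which is fragile on
        // Java 11+ module setups.
        return JsonMapper.builder()
                .addModule(new BlackbirdModule())
                .build();
    }
}
{code}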



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Road to Kafka 4.0

2023-12-21 Thread David Jacot
Hi Divij,

> Release 4.0 as an "experimental" release

I don't think that this is something that we should do. If we need more
time, we should just do a 3.8 release and then release 4.0 when we are
ready. An experimental major release will be more confusing than anything
else. We should also keep in mind that major releases are also adopted with
more scrutiny in general. I don't think that many users will jump to 4.0
anyway. They will likely wait for 4.0.1 or even 4.1.

Best,
David

On Thu, Dec 21, 2023 at 3:59 PM Divij Vaidya 
wrote:

> Hi folks
>
> I am late to the conversation but I would like to add my point of view
> here.
>
> I have three main concerns:
>
> 1\ Durability/availability bugs in kraft - Even though kraft has been
> around for a while, we keep finding bugs that impact availability and data
> durability in it almost with every release [1] [2]. It's a complex feature
> and such bugs are expected during the stabilization phase. But we can't
> remove the alternative until we see stabilization in kraft i.e. no new
> stability/durability bugs for at least 2 releases.
> 2\ Parity with Zk - There are also pending bugs [3] which are in the
> category of Zk parity. Removing Zk from Kafka without having full feature
> parity with Zk will leave some Kafka users with no upgrade path.
> 3\ Test coverage - We also don't have sufficient test coverage for kraft
> since quite a few tests are Zk only at this stage.
>
> Given these concerns, I believe we need to reach 100% Zk parity and allow
> new feature stabilisation (such as scram, JBOD) for at least 1 version
> (maybe more if we find bugs in that feature) before we remove Zk. I also
> agree with the point of view that we can't delay 4.0 indefinitely and we
> need a clear cut line.
>
> Hence, I propose the following:
> 1\ Keep trunk with 3.x. Release 3.8 and potentially 3.9 if we find major
> (durability/availability related) bugs in 3.8. This will help users
> continue to use their tried and tested Kafka setup until we have a proven
> alternative from feature parity & stability point of view.
> 2\ Release 4.0 as an "experimental" release along with 3.8 "stable"
> release. This will help get user feedback on the feasibility of removing Zk
> completely right now.
> 3\ Create a criteria for moving 4.1 as "stable" release instead of
> "experimental". This list should include 100% Zk parity and 100% Kafka
> tests operating with kraft. It will also include other community feedback
> from this & other threads.
> 4\ When the 4.x version is "stable", move the trunk to 4.x and stop all
> development on the 3.x branch.
>
> I acknowledge that earlier in the community, we have decided to make 3.7 as
> the last release in the 3.x series. But, IMO we have learnt a lot since
> then based on the continuous improvements in kraft. I believe we should be
> flexible with our earlier stance here and allow for greater stability
> before forcing users to a completely new functionality.
>
> [1] https://issues.apache.org/jira/browse/KAFKA-15495
> [2] https://issues.apache.org/jira/browse/KAFKA-15489
> [3] https://issues.apache.org/jira/browse/KAFKA-14874
>
> --
> Divij Vaidya
>
>
>
> On Wed, Dec 20, 2023 at 4:59 PM Josep Prat 
> wrote:
>
> > Hi Justine, Luke, and others,
> >
> > I believe a 3.8 version would make sense, and I would say KIP-853 should
> be
> > part of it as well.
> >
> > Best,
> >
> > On Wed, Dec 20, 2023 at 4:11 PM Justine Olshan
> > 
> > wrote:
> >
> > > Hey Luke,
> > >
> > > I think your point is valid. This is another good reason to have a 3.8
> > > release.
> > > Would you say that implementing KIP-966 in 3.8 would be an acceptable
> way
> > > to move forward?
> > >
> > > Thanks,
> > > Justine
> > >
> > >
> > > On Tue, Dec 19, 2023 at 4:35 AM Luke Chen  wrote:
> > >
> > > > Hi Justine,
> > > >
> > > > Thanks for your reply.
> > > >
> > > > > I think that for folks that want to prioritize availability over
> > > > durability, the aggressive recovery strategy from KIP-966 should be
> > > > preferable to the old unclean leader election configuration.
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas#KIP966:EligibleLeaderReplicas-Uncleanrecovery
> > > >
> > > > Yes, I'm aware that we're going to implement the new way of leader
> > > election
> > > > in KIP-966.
> > > > But obviously, KIP-966 is not included in v3.7.0.
> > > > What I'm worried about is the users who prioritize availability over
> > > > durability and enable the unclean leader election in ZK mode.
> > > > Once they migrate to KRaft, there will be availability impact when
> > > unclean
> > > > leader election is needed.
> > > > And like you said, they can run unclean leader election via CLI, but
> > > again,
> > > > the availability is already impacted, which might be unacceptable in
> > some
> > > > cases.
> > > >
> > > > IMO, we should prioritize this missing feature and include it in 3.x
> > > > release.
> > > > Including in 3.x 

Re: [DISCUSS] Road to Kafka 4.0

2023-12-21 Thread Divij Vaidya
Hi folks

I am late to the conversation but I would like to add my point of view here.

I have three main concerns:

1\ Durability/availability bugs in kraft - Even though kraft has been
around for a while, we keep finding bugs that impact availability and data
durability in it almost with every release [1] [2]. It's a complex feature
and such bugs are expected during the stabilization phase. But we can't
remove the alternative until we see stabilization in kraft i.e. no new
stability/durability bugs for at least 2 releases.
2\ Parity with Zk - There are also pending bugs [3] which are in the
category of Zk parity. Removing Zk from Kafka without having full feature
parity with Zk will leave some Kafka users with no upgrade path.
3\ Test coverage - We also don't have sufficient test coverage for kraft
since quite a few tests are Zk only at this stage.

Given these concerns, I believe we need to reach 100% Zk parity and allow
new feature stabilisation (such as scram, JBOD) for at least 1 version
(maybe more if we find bugs in that feature) before we remove Zk. I also
agree with the point of view that we can't delay 4.0 indefinitely and we
need a clear cut line.

Hence, I propose the following:
1\ Keep trunk with 3.x. Release 3.8 and potentially 3.9 if we find major
(durability/availability related) bugs in 3.8. This will help users
continue to use their tried and tested Kafka setup until we have a proven
alternative from feature parity & stability point of view.
2\ Release 4.0 as an "experimental" release along with 3.8 "stable"
release. This will help get user feedback on the feasibility of removing Zk
completely right now.
3\ Create a criteria for moving 4.1 as "stable" release instead of
"experimental". This list should include 100% Zk parity and 100% Kafka
tests operating with kraft. It will also include other community feedback
from this & other threads.
4\ When the 4.x version is "stable", move the trunk to 4.x and stop all
development on the 3.x branch.

I acknowledge that earlier in the community, we have decided to make 3.7 as
the last release in the 3.x series. But, IMO we have learnt a lot since
then based on the continuous improvements in kraft. I believe we should be
flexible with our earlier stance here and allow for greater stability
before forcing users to a completely new functionality.

[1] https://issues.apache.org/jira/browse/KAFKA-15495
[2] https://issues.apache.org/jira/browse/KAFKA-15489
[3] https://issues.apache.org/jira/browse/KAFKA-14874

--
Divij Vaidya



On Wed, Dec 20, 2023 at 4:59 PM Josep Prat 
wrote:

> Hi Justine, Luke, and others,
>
> I believe a 3.8 version would make sense, and I would say KIP-853 should be
> part of it as well.
>
> Best,
>
> On Wed, Dec 20, 2023 at 4:11 PM Justine Olshan
> 
> wrote:
>
> > Hey Luke,
> >
> > I think your point is valid. This is another good reason to have a 3.8
> > release.
> > Would you say that implementing KIP-966 in 3.8 would be an acceptable way
> > to move forward?
> >
> > Thanks,
> > Justine
> >
> >
> > On Tue, Dec 19, 2023 at 4:35 AM Luke Chen  wrote:
> >
> > > Hi Justine,
> > >
> > > Thanks for your reply.
> > >
> > > > I think that for folks that want to prioritize availability over
> > > durability, the aggressive recovery strategy from KIP-966 should be
> > > preferable to the old unclean leader election configuration.
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas#KIP966:EligibleLeaderReplicas-Uncleanrecovery
> > >
> > > Yes, I'm aware that we're going to implement the new way of leader
> > election
> > > in KIP-966.
> > > But obviously, KIP-966 is not included in v3.7.0.
> > > What I'm worried about is the users who prioritize availability over
> > > durability and enable the unclean leader election in ZK mode.
> > > Once they migrate to KRaft, there will be availability impact when
> > unclean
> > > leader election is needed.
> > > And like you said, they can run unclean leader election via CLI, but
> > again,
> > > the availability is already impacted, which might be unacceptable in
> some
> > > cases.
> > >
> > > IMO, we should prioritize this missing feature and include it in 3.x
> > > release.
> > > Including in 3.x release means users can migrate to KRaft in dual-write
> > > mode, and run it for a while to make sure everything works fine, before
> > > they decide to upgrade to 4.0.
> > >
> > > Does that make sense?
> > >
> > > Thanks.
> > > Luke
> > >
> > > On Tue, Dec 19, 2023 at 12:15 AM Justine Olshan
> > >  wrote:
> > >
> > > > Hey Luke --
> > > >
> > > > There were some previous discussions on the mailing list about this
> but
> > > > looks like we didn't file the ticket
> > > > https://lists.apache.org/thread/sqsssos1d9whgmo92vdn81n9r5woy1wk
> > > >
> > > > When I asked some of the folks who worked on Kraft about this, they
> > > > communicated to me that it was intentional to make unclean leader
> > > election
> > > > a manual action.
> > > >
> > > > I 

[jira] [Resolved] (KAFKA-15456) Add support for OffsetFetch version 9 in consumer

2023-12-21 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-15456.
-
Resolution: Fixed

> Add support for OffsetFetch version 9 in consumer
> -
>
> Key: KAFKA-15456
> URL: https://issues.apache.org/jira/browse/KAFKA-15456
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: David Jacot
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support, kip-848-e2e, 
> kip-848-preview
> Fix For: 3.7.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-16030) new group coordinator should check if partition goes offline during load

2023-12-21 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-16030.
-
Fix Version/s: 3.7.0
   Resolution: Fixed

> new group coordinator should check if partition goes offline during load
> 
>
> Key: KAFKA-16030
> URL: https://issues.apache.org/jira/browse/KAFKA-16030
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jeff Kim
>Assignee: Jeff Kim
>Priority: Major
> Fix For: 3.7.0
>
>
> The new coordinator stops loading if the partition goes offline during load. 
> However, the partition is still considered active. Instead, we should return a 
> NOT_LEADER_OR_FOLLOWER exception during load.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16040) Rename `Generic` to `Classic`

2023-12-21 Thread David Jacot (Jira)
David Jacot created KAFKA-16040:
---

 Summary: Rename `Generic` to `Classic`
 Key: KAFKA-16040
 URL: https://issues.apache.org/jira/browse/KAFKA-16040
 Project: Kafka
  Issue Type: Sub-task
Reporter: David Jacot
Assignee: David Jacot
 Fix For: 3.7.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-16036) Add `group.coordinator.rebalance.protocols` and publish all new configs

2023-12-21 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-16036.
-
Resolution: Fixed

> Add `group.coordinator.rebalance.protocols` and publish all new configs
> ---
>
> Key: KAFKA-16036
> URL: https://issues.apache.org/jira/browse/KAFKA-16036
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: David Jacot
>Assignee: David Jacot
>Priority: Blocker
> Fix For: 3.7.0
>
>
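
For reference, a hypothetical server.properties line for the new config, 
assuming the "classic" and "consumer" protocol names from KIP-848 (the KIP 
defines the authoritative value set):

    # hypothetical example; value names per KIP-848
    group.coordinator.rebalance.protocols=classic,consumer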




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-1007: Introduce Remote Storage Not Ready Exception

2023-12-21 Thread Luke Chen
Hi Kamal,

Thanks for the KIP.
+1 (binding) from me.

Luke

On Thu, Dec 21, 2023 at 4:51 PM Christo Lolov 
wrote:

> Heya Kamal,
>
> The proposed change makes sense to me as it will be a more explicit
> behaviour than what Kafka does today - I am happy with it!
>
> +1 (non-binding) from me
>
> Best,
> Christo
>
> On Tue, 12 Dec 2023 at 09:01, Kamal Chandraprakash <
> kamal.chandraprak...@gmail.com> wrote:
>
> > Hi,
> >
> > I would like to call a vote for KIP-1007
> > <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1007%3A+Introduce+Remote+Storage+Not+Ready+Exception
> > >.
> > This KIP aims to introduce a new error code for retriable remote storage
> > errors. Thanks to everyone who reviewed the KIP!
> >
> > --
> > Kamal
> >
>


Re: DISCUSS KIP-1011: Use incrementalAlterConfigs when updating broker configs by kafka-configs.sh

2023-12-21 Thread Николай Ижиков
> In fact alterConfig and incrementalAlterConfig have different semantics: we 
> should pass all configs when using alterConfig, whereas we can update configs 
> incrementally using incrementalAlterConfigs, and it's not worth doing so 
> since alterConfig has been deprecated for a long time.

There can be third-party tools like `kafka-ui` or similar that suffer from the 
same bug as the one you are fixing.
If we fixed `alterConfig` itself, then we would fix all tools and scripts that 
still use alterConfig.

Anyway, let’s add to the «Rejected alternatives» section the reasons why we 
keep the buggy method as is and fix only the tools.
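
For readers skimming the thread, a minimal sketch of the semantic difference 
quoted above, using the Java Admin client (the broker id and config key are 
illustrative only; Admin, AlterConfigOp, Config, and ConfigEntry live in 
org.apache.kafka.clients.admin, ConfigResource in org.apache.kafka.common.config):

    // Illustrative broker id and config key.
    ConfigResource broker = new ConfigResource(ConfigResource.Type.BROKER, "1");

    // Deprecated replace-all semantics: the Config must hold the complete
    // desired config for the resource, and omitted keys are reset. Sensitive
    // values are never returned by describeConfigs, so a read-modify-write
    // round trip silently drops them.
    admin.alterConfigs(Map.of(broker,
        new Config(List.of(new ConfigEntry("log.cleaner.threads", "2")))));

    // Incremental semantics: only the named key is touched.
    admin.incrementalAlterConfigs(Map.of(broker, List.of(
        new AlterConfigOp(new ConfigEntry("log.cleaner.threads", "2"),
                          AlterConfigOp.OpType.SET))));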

> I think your suggestion is nice, it should be marked as deprecated and will 
> be removed together with `AdminClient.alterConfigs()`

Is it OK to introduce an option that is deprecated from the beginning?


> On Dec 21, 2023, at 06:03, ziming deng  wrote:
> 
>> shouldn't we also introduce --disable-incremental as deprecated?
> 
> I think your suggestion is nice, it should be marked as deprecated and will 
> be removed together with `AdminClient.alterConfigs()`
> 
> 
>> On Dec 19, 2023, at 16:36, Federico Valeri  wrote:
>> 
>> Hi Ziming, thanks for the KIP. Looks good to me.
>> 
>> Just one question: given that alterConfig is deprecated, shouldn't we
>> also introduce --disable-incremental as deprecated? That way we would
>> get rid of both in Kafka 4.0. Also see:
>> https://issues.apache.org/jira/browse/KAFKA-14705.
>> 
>> On Tue, Dec 19, 2023 at 9:05 AM ziming deng  wrote:
>>> 
>>> Thank you for mentioning this, Ismael,
>>> 
>>> I added this to the motivation section, and I think we can still update 
>>> configs in this case by passing all sensitive configs, which is weird and 
>>> not friendly.
>>> 
>>> --
>>> Best,
>>> Ziming
>>> 
 On Dec 19, 2023, at 14:24, Ismael Juma  wrote:
 
 Thanks for the KIP. I think one of the main benefits of the change isn't 
 listed: sensitive configs make it impossible to make updates with the 
 current cli tool because sensitive config values are never returned.
 
 Ismael
 
 On Mon, Dec 18, 2023 at 7:58 PM ziming deng  wrote:
> 
> Hello, I want to start a discussion on KIP-1011, to make the broker 
> config change path unified with that of user/topic/client-metrics and 
> avoid some bugs.
> 
> Here is the link:
> 
> KIP-1011: Use incrementalAlterConfigs when updating broker configs by 
> kafka-configs.sh - Apache Kafka - Apache Software Foundation
> cwiki.apache.org
> 
> 
> 
> Best,
> Ziming.
> 



[VOTE] KIP-1005: Expose EarliestLocalOffset and TieredOffset

2023-12-21 Thread Christo Lolov
Heya all!

KIP-1005 (
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1005%3A+Expose+EarliestLocalOffset+and+TieredOffset)
has been open for around a month with no further comments - I would like to
start a voting round on it!

Best,
Christo


Re: [VOTE] KIP-1007: Introduce Remote Storage Not Ready Exception

2023-12-21 Thread Christo Lolov
Heya Kamal,

The proposed change makes sense to me as it will be a more explicit
behaviour than what Kafka does today - I am happy with it!

+1 (non-binding) from me

Best,
Christo

On Tue, 12 Dec 2023 at 09:01, Kamal Chandraprakash <
kamal.chandraprak...@gmail.com> wrote:

> Hi,
>
> I would like to call a vote for KIP-1007
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1007%3A+Introduce+Remote+Storage+Not+Ready+Exception
> >.
> This KIP aims to introduce a new error code for retriable remote storage
> errors. Thanks to everyone who reviewed the KIP!
>
> --
> Kamal
>