Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-04-15 Thread Claude Warren
Let's put aside the CPC datasketch idea and just discuss the Bloom filter
approach.

I think the problem with the way the KIP is worded is that PIDs are only
added if they are not seen in either of the Bloom filters.

So an early PID is added to the first filter and the associated metric is
updated.
That PID is seen multiple times over the next 60 minutes, but is not added
to the Bloom filters again.
Once the 60 minutes elapse, the first filter is cleared or removed and a
new one is started.  In any case the PID is no longer recorded in any extant
Bloom filter.
The PID is then seen again, is added to the newest Bloom filter, and the
associated metric is updated.

I believe at this point the metric is incorrect: the PID has been counted
twice, even though it has been in use for the entire time.

The "track" method that I added solves this problem by ensuring that the
PID is always seen in the latter half of the set of Bloom filters.  In the
case of 2 filters that is always the second one, but remember that the
number of layers will grow as the filters become saturated.  So if your
filter is intended to hold 500 PIDs and the 501st PID is registered before
the expiration, a new layer (Bloom filter) is added for new PIDs to be
added into.
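
To make this concrete, here is a minimal sketch of the intent behind "track".
Plain hash sets stand in for the Bloom filter layers and the newest layer stands
in for the "latter half", so this is an illustration of the idea rather than the
Apache Commons LayeredBloomFilter API:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Sliding window of "layers"; a PID that is still active is kept present in the
// newest layer so it survives the expiry of the oldest layer and is counted once.
final class LayeredPidTracker {
    private final Deque<Set<Long>> layers = new ArrayDeque<>();

    LayeredPidTracker(int layerCount) {
        for (int i = 0; i < layerCount; i++) {
            layers.addLast(new HashSet<>());
        }
    }

    // Returns true only if the PID is new to every layer, i.e. the quota metric
    // should be incremented.
    synchronized boolean track(long pid) {
        boolean seenAnywhere = layers.stream().anyMatch(layer -> layer.contains(pid));
        layers.peekLast().add(pid);      // always re-assert presence in the newest layer
        return !seenAnywhere;
    }

    // Called when the oldest window expires: drop the oldest layer, start a new one.
    synchronized void advanceWindow() {
        layers.removeFirst();
        layers.addLast(new HashSet<>());
    }
}

With this shape, a PID that is seen continuously keeps reappearing in the newest
layer, so expiring the oldest layer never makes it look new again and the metric
is only incremented the first time the PID is seen.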

On Mon, Apr 15, 2024 at 5:00 PM Omnia Ibrahim 
wrote:

> Hi Claude,
> Thanks for the implementation of the LayeredBloomFilter in apache commons.
>
> > Define a new configuration option "producer.id.quota.window.count" as
> > the number of windows active in window.size.seconds.
> What is the difference between “producer.id.quota.window.count” and
> producer.id.quota.window.num?
>
> > Basically the kip says, if the PID is found in either of the Bloom
> filters
> > then no action is taken
> > If the PID is not found then it is added and the quota rating metrics are
> > incremented.
> > In this case long running PIDs will be counted multiple times.
>
> The PID is considered not encountered if both frames of the window don’t
> have it. If you check the diagram for `Caching layer to track active
> PIDs per KafkaPrincipal` you will see that each window will have 2 bloom
> layers and the first created one will be disposed only when we start the
> next window. Which means window2 is starting from the 2nd bloom. Basically
> the bloom filter in the KIP is trying to implement a sliding window
> pattern.
>
> > I think the question is not whether or not we have seen a given PID before
> > but rather how many unique PIDs did the principal create in the last
> hour.
> > Perhaps more exactly it is: did the Principal create more than X PIDS in
> > the last Y time units?
> We don’t really care about the count of unique PIDs per user. The KIP is
> trying to follow and build on top of ClientQuotaManager which already has
> a pattern for throttling that the producer client is aware of so we don’t
> need to upgrade old clients for brokers to throttle them and they can
> respect the throttling.
>
> The pattern for throttling is that we record the activities by
> incrementing a metric sensor and only when we catch
> `QuotaViolationException` from the quota sensor we will be sending a
> throttleTimeMs to the client.
> For bandwidth throttling for example we increment the sensor by the size
> of the request. For PID the KIP is aiming to call
> `QuotaManagers::producerIdQuotaManager::maybeRecordAndGetThrottleTimeMs` to
> increment by +1 every time we encounter a new PID, and if
> `Sensor::record` returned `QuotaViolationException` then we will send back
> to the producer the throttling time that the client should wait for before
> sending a new request with a new PID.
> I hope this makes sense.
>
> > This question can be quickly answered by a CPC datasketch [1].  The
> > solution would be something like:
> > Break the Y time units into a set of Y' smaller partitions (e.g. 60
> > 1-minute partitions for an hour).  Create a circular queue of Y' CPC
> > datasketches for each principal.  Implement a queue entry selector based
> on
> > the modulus of the system by the resolution of the Y' partitions. On each
> > call:
> I didn’t evaluate CPC datasketch or any counter solution because, as I explained
> above, the aim is not to build a counter, especially since the Kafka Sensor is
> enough to indicate whether we are violating the quota or not.
>
> Thanks
> Omnia
>
> > On 15 Apr 2024, at 10:35, Claude Warren  wrote:
> >
> > After thinking about this KIP over the weekend I think that there is
> another
> > lighter weight approach.
> >
> > I think the question is not whether or not we have seen a given PID
> before
> > but rather how many unique PIDs did the principal create in the last
> hour.
> > Perhaps more exactly it is: did the Principal create more than X PIDS in
> > the last Y time units?
> >
> > This question can be quickly answered by a CPC datasketch [1].  The
> > solution would be something like:
> > Break the Y time units into a set of Y' smaller partitions (e.g. 60
> > 1-minute partitions for an hour).  Create a circular queue of

Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2816

2024-04-15 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 472154 lines...]
[2024-04-16T03:44:32.447Z] > Task :connect:json:testClasses UP-TO-DATE
[2024-04-16T03:44:32.447Z] > Task :connect:json:testJar
[2024-04-16T03:44:32.447Z] > Task :storage:storage-api:compileTestJava
[2024-04-16T03:44:32.447Z] > Task :storage:storage-api:testClasses
[2024-04-16T03:44:32.447Z] > Task :connect:json:testSrcJar
[2024-04-16T03:44:32.447Z] > Task :server:compileTestJava
[2024-04-16T03:44:32.447Z] > Task :server:testClasses
[2024-04-16T03:44:32.447Z] > Task 
:connect:json:publishMavenJavaPublicationToMavenLocal
[2024-04-16T03:44:32.447Z] > Task :connect:json:publishToMavenLocal
[2024-04-16T03:44:33.979Z] > Task :server-common:compileTestJava
[2024-04-16T03:44:33.979Z] > Task :server-common:testClasses
[2024-04-16T03:44:39.723Z] > Task :raft:compileTestJava
[2024-04-16T03:44:39.723Z] > Task :raft:testClasses
[2024-04-16T03:44:42.681Z] > Task :group-coordinator:compileTestJava
[2024-04-16T03:44:42.681Z] > Task :group-coordinator:testClasses
[2024-04-16T03:44:47.911Z] 
[2024-04-16T03:44:47.911Z] > Task :clients:javadoc
[2024-04-16T03:44:47.911Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/clients/src/main/java/org/apache/kafka/clients/admin/ScramMechanism.java:32:
 warning - Tag @see: missing final '>': "https://cwiki.apache.org/confluence/display/KAFKA/KIP-554%3A+Add+Broker-side+SCRAM+Config+API";>KIP-554:
 Add Broker-side SCRAM Config API
[2024-04-16T03:44:47.911Z] 
[2024-04-16T03:44:47.911Z]  This code is duplicated in 
org.apache.kafka.common.security.scram.internals.ScramMechanism.
[2024-04-16T03:44:47.911Z]  The type field in both files must match and must 
not change. The type field
[2024-04-16T03:44:47.911Z]  is used both for passing ScramCredentialUpsertion 
and for the internal
[2024-04-16T03:44:47.911Z]  UserScramCredentialRecord. Do not change the type 
field."
[2024-04-16T03:44:49.824Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/clients/src/main/java/org/apache/kafka/common/security/oauthbearer/secured/package-info.java:21:
 warning - Tag @link: reference not found: 
org.apache.kafka.common.security.oauthbearer
[2024-04-16T03:44:51.043Z] 2 warnings
[2024-04-16T03:44:51.043Z] 
[2024-04-16T03:44:51.043Z] > Task :metadata:compileTestJava
[2024-04-16T03:44:51.043Z] > Task :metadata:testClasses
[2024-04-16T03:44:52.435Z] > Task :clients:javadocJar
[2024-04-16T03:45:01.760Z] > Task 
:connect:api:generateMetadataFileForMavenJavaPublication
[2024-04-16T03:45:16.112Z] > Task :core:classes
[2024-04-16T03:45:16.112Z] > Task :core:compileTestJava NO-SOURCE
[2024-04-16T03:45:18.972Z] > Task :clients:srcJar
[2024-04-16T03:45:20.601Z] > Task :clients:testJar
[2024-04-16T03:45:20.601Z] > Task :clients:testSrcJar
[2024-04-16T03:45:21.984Z] > Task 
:clients:publishMavenJavaPublicationToMavenLocal
[2024-04-16T03:45:21.984Z] > Task :clients:publishToMavenLocal
[2024-04-16T03:45:21.984Z] > Task :connect:api:compileTestJava UP-TO-DATE
[2024-04-16T03:45:21.984Z] > Task :connect:api:testClasses UP-TO-DATE
[2024-04-16T03:45:21.984Z] > Task :connect:api:testJar
[2024-04-16T03:45:23.552Z] > Task :connect:api:testSrcJar
[2024-04-16T03:45:23.552Z] > Task 
:connect:api:publishMavenJavaPublicationToMavenLocal
[2024-04-16T03:45:23.552Z] > Task :connect:api:publishToMavenLocal
[2024-04-16T03:45:32.696Z] > Task :streams:javadoc
[2024-04-16T03:45:32.696Z] > Task :streams:javadocJar
[2024-04-16T03:45:42.332Z] > Task :streams:srcJar
[2024-04-16T03:45:42.332Z] > Task :streams:processTestResources UP-TO-DATE
[2024-04-16T03:45:47.316Z] > Task :core:compileTestScala
[2024-04-16T03:46:50.108Z] > Task :core:testClasses
[2024-04-16T03:47:21.044Z] > Task :streams:compileTestJava
[2024-04-16T03:48:54.656Z] > Task :streams:testClasses
[2024-04-16T03:48:54.656Z] > Task :streams:testJar
[2024-04-16T03:48:54.656Z] > Task :streams:testSrcJar
[2024-04-16T03:48:54.656Z] > Task 
:streams:publishMavenJavaPublicationToMavenLocal
[2024-04-16T03:48:54.656Z] > Task :streams:publishToMavenLocal
[2024-04-16T03:48:54.656Z] 
[2024-04-16T03:48:54.656Z] Deprecated Gradle features were used in this build, 
making it incompatible with Gradle 9.0.
[2024-04-16T03:48:54.656Z] 
[2024-04-16T03:48:54.656Z] You can use '--warning-mode all' to show the 
individual deprecation warnings and determine if they come from your own 
scripts or plugins.
[2024-04-16T03:48:54.656Z] 
[2024-04-16T03:48:54.656Z] For more on this, please refer to 
https://docs.gradle.org/8.7/userguide/command_line_interface.html#sec:command_line_warnings
 in the Gradle documentation.
[2024-04-16T03:48:54.656Z] 
[2024-04-16T03:48:54.656Z] BUILD SUCCESSFUL in 5m 49s
[2024-04-16T03:48:54.656Z] 96 actionable tasks: 41 executed, 55 up-to-date
[2024-04-16T03:48:54.656Z] 
[2024-04-16T03:48:54.656Z] Publishing build scan...
[2024-04-16T03:48:54.656Z] https://ge.apache.org/s/r32xscic44t4e
[2024-04-16T03:

Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.7 #136

2024-04-15 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-1036: Extend RecordDeserializationException exception

2024-04-15 Thread Almog Gavra
Hi Frederik - thanks for the KIP, this will be a fantastic and elegant
addition to Kafka Streams.

I have a higher level question about this, which is that the `poll`
interface returns multiple records and yet the DeserializationException
will be thrown if any record in the batch cannot be processed. I suspect
that makes your example of a DLQ challenging, since it would skip all
records up until the record that could not be deserialized (even if they
were valid). People would have to be careful of polling up until the offset
of the "bad" record...

I don't have a great suggestion for an API that could address this, here
are a few suggestions that come to mind:
1. add an optional callback to `poll` that could specify behavior for
records that fail to deserialize (that way the callback could specify a
return value of "fail" or "ignore" individual failing records within the
batch)
2. have a version of consumer#poll that returns a version of
ConsumerRecords that has two lists: all successfully polled records and all
failed records (the obvious downside is that devs might forget to check the
failed records); a rough sketch of this shape appears after the list
3. have the serialization exception contain all successful and all failed
records (that's just not very elegant).
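
For illustration only, here is a rough sketch of the shape suggestion 2 could
take. Every name below is hypothetical and not an existing Kafka API:

import java.util.List;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.TopicPartition;

// Hypothetical return type for a poll variant that keeps successfully
// deserialized records separate from the ones that failed, so a caller can
// route failures to a DLQ without losing the valid records from the same batch.
interface PartitionedPollResult<K, V> {
    List<ConsumerRecord<K, V>> successful();
    List<FailedRecord> failed();

    interface FailedRecord {
        TopicPartition topicPartition();
        long offset();
        byte[] rawKey();       // raw bytes, since deserialization did not succeed
        byte[] rawValue();
        Throwable cause();     // the underlying deserialization error
    }
}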

Anyway, there are many people much smarter than I watching this thread --
they may have better suggestions! (Or I may have misunderstood something, in
which case please carry on...)

Cheers,
Almog

On Fri, Apr 12, 2024 at 1:26 PM Sophie Blee-Goldman 
wrote:

> I think the bigger question here is: why is checkstyle complaining about
> this import? Does anyone know?
>
> On Thu, Apr 11, 2024 at 11:12 AM Frédérik Rouleau
>  wrote:
>
> > Hi everyone,
> >
> > I have made some changes to take the comments into account. I have replaced
> > ConsumerRecord with Record. As it was not allowed by checkstyle, I have
> > modified its configuration. I hope that's OK.
> > I find this new version better. Thanks for your help.
> >
> > Regards,
> > Fred
> >
>


Re: [DISCUSS] KIP-932: Queues for Kafka

2024-04-15 Thread Jun Rao
Hi, Andrew,

Thanks for the updated KIP.

42.1 "If the share group offset is altered multiple times when the group
remains empty, it would be harmless if the same state epoch was reused to
initialize the state."
Hmm, how does the partition leader know for sure that it has received the
latest share group offset if epoch is reused?
Could we update the section "Group epoch - Trigger a rebalance" that
AdminClient.alterShareGroupOffsets causes the group epoch to be bumped too?

47,56 "my view is that BaseOffset should become FirstOffset in ALL schemas
defined in the KIP."
Yes, that seems better to me.

105. "I have another non-terminal state in mind for 3."
Should we document it?

114. session.timeout.ms in the consumer configuration is deprecated in
KIP-848. So, we need to remove it from the shareConsumer configuration.

115. I am wondering if it's a good idea to always commit acks on
ShareConsumer.poll(). In the extreme case, each batch may only contain a
single record and each poll() only returns a single batch. This will cause
each record to be committed individually. Is there a way for a user to
optimize this?

116. For each new RPC, could we list the associated acls?

117. Since this KIP changes internal records and RPCs, it would be useful
to document the upgrade process.

Jun

On Wed, Apr 10, 2024 at 7:35 AM Andrew Schofield 
wrote:

> Hi Jun,
> Thanks for your questions.
>
> 41.
> 41.1. The partition leader obtains the state epoch in the response from
> ReadShareGroupState. When it becomes a share-partition leader,
> it reads the share-group state and one of the things it learns is the
> current state epoch. Then it uses the state epoch in all subsequent
> calls to WriteShareGroupState. The fencing is to prevent writes for
> a previous state epoch, which are very unlikely but which would mean
> that a leader was using an out-of-date epoch and was likely no longer
> the current leader at all, perhaps due to a long pause for some reason.
>
> 41.2. If the group coordinator were to set the SPSO, wouldn’t it need
> to discover the initial offset? I’m trying to avoid yet another
> inter-broker
> hop.
>
> 42.
> 42.1. I think I’ve confused things. When the share group offset is altered
> using AdminClient.alterShareGroupOffsets, the group coordinator WILL
> update the state epoch. I don’t think it needs to update the group epoch
> at the same time (although it could) because the group epoch will have
> been bumped when the group became empty. If the share group offset
> is altered multiple times when the group remains empty, it would be
> harmless if the same state epoch was reused to initialize the state.
>
> When the share-partition leader updates the SPSO as a result of
> the usual flow of record delivery, it does not update the state epoch.
>
> 42.2. The share-partition leader will notice the alteration because,
> when it issues WriteShareGroupState, the response will contain the
> error code FENCED_STATE_EPOCH. This is supposed to be the
> last-resort way of catching this.
>
> When the share-partition leader handles its first ShareFetch request,
> it learns the state epoch from the response to ReadShareGroupState.
>
> In normal running, the state epoch will remain constant, but, when there
> are no consumers and the group is empty, it might change. As a result,
> I think it would be sensible when the set of share sessions transitions
> from 0 to 1, which is a reasonable proxy for the share group transitioning
> from empty to non-empty, for the share-partition leader to issue
> ReadShareGroupOffsetsState to validate the state epoch. If its state
> epoch is out of date, it can then ReadShareGroupState to re-initialize.
>
> I’ve changed the KIP accordingly.
>
> 47, 56. If I am to change BaseOffset to FirstOffset, we need to have
> a clear view of which is the correct term. Having reviewed all of the
> instances, my view is that BaseOffset should become FirstOffset in
> ALL schemas defined in the KIP. Then, BaseOffset is just used in
> record batches, which is already a known concept.
>
> Please let me know if you agree.
>
> 60. I’ve added FindCoordinator to the top level index for protocol changes.
>
> 61. OK. I expect you are correct about how users will be using the
> console share consumer. When I use the console consumer, I always get
> a new consumer group. I have changed the default group ID for console
> share consumer to “console-share-consumer” to match the console consumer
> better and give more of an idea where this mysterious group has come from.
>
> 77. I will work on a proposal that does not use compaction and we can
> make a judgement about whether it’s a better course for KIP-932.
> Personally,
> until I’ve written it down and lived with the ideas for a few days, I
> won’t be
> able to choose which I prefer.
>
> I should be able to get the proposal written by the end of this week.
>
> 100. ShareGroupHeartbeatRequest.RebalanceTimeoutMs matches
> ConsumerGroupHeartbeatRequest.RebalanceTimeoutMs

[jira] [Created] (KAFKA-16558) Implement HeartbeatRequestState.toStringBase()

2024-04-15 Thread Kirk True (Jira)
Kirk True created KAFKA-16558:
-

 Summary: Implement HeartbeatRequestState.toStringBase()
 Key: KAFKA-16558
 URL: https://issues.apache.org/jira/browse/KAFKA-16558
 Project: Kafka
  Issue Type: Bug
  Components: clients, consumer
Affects Versions: 3.7.0
Reporter: Kirk True
Assignee: Kirk True
 Fix For: 3.8.0


The code incorrectly overrides the {{toString()}} method instead of overriding 
{{{}toStringBase(){}}}. This affects debugging and troubleshooting consumer 
issues.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16557) Fix CommitRequestManager’s OffsetFetchRequestState.toString()

2024-04-15 Thread Kirk True (Jira)
Kirk True created KAFKA-16557:
-

 Summary: Fix CommitRequestManager’s 
OffsetFetchRequestState.toString()
 Key: KAFKA-16557
 URL: https://issues.apache.org/jira/browse/KAFKA-16557
 Project: Kafka
  Issue Type: Bug
  Components: clients, consumer
Affects Versions: 3.7.0
Reporter: Kirk True
Assignee: Kirk True
 Fix For: 3.8.0


The code incorrectly overrides the {{toString()}} method instead of overriding 
{{{}toStringBase(){}}}. This affects debugging and troubleshooting consumer 
issues.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16556) Race condition between ConsumerRebalanceListener and SubscriptionState

2024-04-15 Thread Kirk True (Jira)
Kirk True created KAFKA-16556:
-

 Summary: Race condition between ConsumerRebalanceListener and 
SubscriptionState
 Key: KAFKA-16556
 URL: https://issues.apache.org/jira/browse/KAFKA-16556
 Project: Kafka
  Issue Type: Bug
  Components: clients, consumer
Affects Versions: 3.7.0
Reporter: Kirk True
 Fix For: 3.8.0


There appears to be a race condition between invoking the 
{{ConsumerRebalanceListener}} callbacks on reconciliation and 
{{initWithCommittedOffsetsIfNeeded}} in the consumer.
 
The membership manager adds the newly assigned partitions to the 
{{{}SubscriptionState{}}}, but marks them as {{{}pendingOnAssignedCallback{}}}. 
Then, after the {{ConsumerRebalanceListener.onPartitionsAssigned()}} completes, 
the membership manager will invoke {{enablePartitionsAwaitingCallback}} to set 
all of those partitions' 'pending' flag to false.
 
During the main {{Consumer.poll()}} loop, {{AsyncKafkaConsumer}} may need to 
call {{initWithCommittedOffsetsIfNeeded()}} if the positions aren't already 
cached. Inside {{{}initWithCommittedOffsetsIfNeeded{}}}, the consumer calls the 
subscription's {{initializingPartitions}} method to get a set of the partitions 
for which to fetch their committed offsets. However, 
{{SubscriptionState.initializingPartitions()}} only returns partitions that 
have the {{pendingOnAssignedCallback}} flag set to false.
 
The result is:
 * If the {{MembershipManagerImpl.assignPartitions()}} future is completed on the
background thread first, the 'pending' flag is set to false. On the application
thread, when {{SubscriptionState.initializingPartitions()}} is called, it returns
the partition, and we fetch its committed offsets.
 * If instead the application thread calls
{{SubscriptionState.initializingPartitions()}} first, the partition's 'pending'
flag is still set to true, and so the partition is omitted from the returned set.
The {{updateFetchPositions()}} method then continues on and re-initializes the
partition's fetch offset to 0.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16555) Consumer's RequestState has incorrect logic to determine if inflight

2024-04-15 Thread Kirk True (Jira)
Kirk True created KAFKA-16555:
-

 Summary: Consumer's RequestState has incorrect logic to determine 
if inflight
 Key: KAFKA-16555
 URL: https://issues.apache.org/jira/browse/KAFKA-16555
 Project: Kafka
  Issue Type: Task
  Components: clients, consumer
Affects Versions: 3.7.0
Reporter: Kirk True
Assignee: Kirk True
 Fix For: 3.8.0


When running system tests for the new consumer, I've hit an issue where the 
{{HeartbeatRequestManager}} is sending out multiple concurrent 
{{CONSUMER_GROUP_REQUEST}} RPCs. The effect is the coordinator creates multiple 
members which causes downstream assignment problems.

Here's the order of events:

* Time 202: {{HeartbeatRequestManager.poll()}} determines it's OK to send a
request. In so doing, it updates the {{RequestState}}'s {{lastSentMs}} to the
current timestamp, 202
* Time 236: the response is received and the response handler is invoked, setting
the {{RequestState}}'s {{lastReceivedMs}} to the current timestamp, 236
* Time 236: {{HeartbeatRequestManager.poll()}} is invoked again, and it sees
that it's OK to send a request. It creates another request, once again updating
the {{RequestState}}'s {{lastSentMs}} to the current timestamp, 236
* Time 237: {{HeartbeatRequestManager.poll()}} is invoked again, and
ERRONEOUSLY decides it's OK to send another request, despite one already being in
flight.

Here's the problem with {{requestInFlight()}}:

{code:java}
public boolean requestInFlight() {
return this.lastSentMs > -1 && this.lastReceivedMs < this.lastSentMs;
}
{code}

In our case, {{lastReceivedMs}} is 236 and {{lastSentMs}} is _also_ 236. So the
received timestamp is _equal_ to the sent timestamp, not _less_.
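
One possible direction for a fix (a sketch only, not the actual change): track an
explicit in-flight flag instead of comparing millisecond timestamps, which collide
when a response and the next send land in the same millisecond.

{code:java}
// Illustrative sketch: an explicit flag avoids the timestamp-equality ambiguity.
public class RequestStateSketch {
    private boolean inflight = false;

    public synchronized boolean requestInFlight() {
        return inflight;
    }

    public synchronized void onSendAttempt() {
        inflight = true;     // set when the request is handed to the network layer
    }

    public synchronized void onResponseOrFailure() {
        inflight = false;    // cleared only when the matching response (or failure) arrives
    }
}
{code}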



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-1022 Formatting and Updating Features

2024-04-15 Thread Justine Olshan
Hey folks,

Thanks everyone! I will go ahead and call it.
The KIP passes with the following +1 votes:

- Andrew Schofield (non-binding)
- David Jacot (binding)
- José Armando García Sancio (binding)
- Jun Rao (binding)

Thanks again,
Justine

On Fri, Apr 12, 2024 at 11:16 AM Jun Rao  wrote:

> Hi, Justine,
>
> Thanks for the KIP. +1
>
> Jun
>
> On Wed, Apr 10, 2024 at 9:13 AM José Armando García Sancio
>  wrote:
>
> > Hi Justine,
> >
> > +1 (binding)
> >
> > Thanks for the improvement.
> > --
> > -José
> >
>


Re:[ANNOUNCE] New Kafka PMC Member: Greg Harris

2024-04-15 Thread Hector Geraldino (BLOOMBERG/ 919 3RD A)
Congrats! Well deserved

From: dev@kafka.apache.org At: 04/13/24 14:42:22 UTC-4:00 To:
dev@kafka.apache.org
Subject: [ANNOUNCE] New Kafka PMC Member: Greg Harris

Hi all,

Greg Harris has been a Kafka committer since July 2023. He has remained
very active and instructive in the community since becoming a committer.
It's my pleasure to announce that Greg is now a member of Kafka PMC.

Congratulations, Greg!

Chris, on behalf of the Apache Kafka PMC




[jira] [Created] (KAFKA-16554) Online downgrade triggering and group type conversion

2024-04-15 Thread Dongnuo Lyu (Jira)
Dongnuo Lyu created KAFKA-16554:
---

 Summary: Online downgrade triggering and group type conversion
 Key: KAFKA-16554
 URL: https://issues.apache.org/jira/browse/KAFKA-16554
 Project: Kafka
  Issue Type: Sub-task
Reporter: Dongnuo Lyu






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-932: Queues for Kafka

2024-04-15 Thread Andrew Schofield
Hi David,
Thanks for reviewing the KIP and your comments.

001: I think the `group.type` is the area of the KIP that has had the most 
comments.
I think you’re right that a better approach would be to make the creation of 
the group
explicit, for users who want that. I have removed `group.type` from this KIP and
I propose to introduce a separate KIP for how to control the type of groups.

002: Makes sense. What I was going for was essentially a GroupMetadata record,
but ConsumerGroupMetadata already exists. I’m perfectly happy creating a 
separate
ShareGroupMetadata and leaving ConsumerGroupMetadata unchanged by this KIP.
I have updated the KIP.

003: In order to be able to fence zombie writes to the state, I am putting the
responsibility of setting the state epoch for a share-partition in the group 
coordinator.
Because the GC already manages the group epoch, using that to initialize the 
state
epoch seems sensible. The SC is not the GC, and it doesn’t know the group epoch
without being told.

004: There is no group expiration for share groups in this KIP.

005: The share coordinator is an internal service which is not directly accessed
by external users. The RPCs that it serves are internal and only issued by 
brokers.

006: When KIP-932 becomes generally available, which I expect is at least a year
away, there will need to be a proper way of turning the capability on. I draw 
the
parallel with KIP-848 for which the configuration evolved right up until just 
before
the cut-off for 3.7 when `group.coordinator.rebalance.protocols` was introduced.
The final mechanism for enabling KIP-848 when it becomes the default in AK 4.0
is still being finalised (KIP-1022).

I expect to create a finalizing KIP for this feature which confirms the actual
configuration that administrators will use to enable it in production clusters.
I expect that will include adding “share” to 
`group.coordinator.rebalance.protocols`
and incrementing `group.version`. I don’t really like having 2 configurations
required, but I think that’s the implication of KIP-848 and KIP-1022.

I am viewing `group.share.enable` as the pre-GA way to enable this feature.
If you’d rather prefer me to use `group.coordinator.rebalance.protocols` in 
KIP-932,
I can do that, but I’d rather save this until we reach preview or GA status.

007: I had not intended to make the assignor interface public at this point,
but really there’s no reason not to. The interface is identical to the 
server-side
assignor used for consumer groups. However, it’s a separate interface because
we want to avoid someone using assignors for the wrong group type. I’ve
updated the KIP.

008: The SPSO and SPEO are maintained by the share-partition leader.
The SPSO is persisted using the share coordinator. The SPEO does not need
to be persisted.

009: The ShareAcknowledge API gives a less convoluted way to acknowledge
delivery for situations in which no fetching of records is required. For 
example,
when `KafkaShareConsumer.commitSync` is used, we want to acknowledge without
fetching.

010: Having a hard limit for the number of share sessions on a broker is tricky.
The KIP says that the limit is calculated based on `group.share.max.groups` and
`group.share.max.size`. I think multiplying those together gives the absolute 
limit
but the number required in practice will be somewhat less. I will make the text
a bit clearer here.

In a share group, each consumer assigned partitions which a particular broker 
leads
will need a share session on that broker. It won’t generally be the case that 
every
consumer needs a session on every broker. But the actual number needed on
a broker depends upon the distribution of leadership across the cluster.

The limit is intentionally the theoretical maximum because, without a share 
session,
a consumer cannot acquire records to consume or acknowledge them.

011: It’s a good question. I think it’s safe.

Let’s assume that in a pathological case, the records in a ShareSnapshot
are not contiguous and it is necessary to record a Base/LastOffset for each of
them. Then, it would take around 20 bytes per record to persist its state.
We could store over 50,000 records in 1MB. The configuration limit for
`group.share.record.lock.partition.limit` is 10,000.

012: Yes, we plan to implement it using the CoordinatorRuntime.

013: That makes sense. I will add `share.coordinator.threads`.

014: The `group.share.state.topic.min.isr` is used to set the min ISR
configuration to be used when the share-state partition is automatically
created. I have followed `transaction.state.log.min.isr` which is the closest
analogue I think.

015: You make a good point about the RebalanceTimeoutMs. Removed.

016: I have added some clarifying text about Start/EndPartitionIndex.
Imagine that there’s a topic with 3 partitions. The administrator increases
the partition count to 6. The StartPartitionIndex is 4 and EndPartitionIndex is
6 while the new partitions are being initialised by the share 

Re: [DISCUSS] KIP-1035: StateStore managed changelog offsets

2024-04-15 Thread Nick Telford
Hi Sophie,

Interesting idea! Although what would that mean for the StateStore
interface? Obviously we can't require that the constructor take the TaskId.
Is it enough to add the parameter to the StoreSupplier?

Would doing this be in-scope for this KIP, or are we over-complicating it?
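
A minimal sketch of what such an overload could look like, assuming the default
falls back to the existing zero-arg get(); the signature is hypothetical and not
part of the current Streams API:

import org.apache.kafka.streams.processor.StateStore;
import org.apache.kafka.streams.processor.TaskId;
import org.apache.kafka.streams.state.StoreSupplier;

// Hypothetical overload: suppliers that care about the TaskId can override the
// new method, while existing suppliers fall back to the current zero-arg get().
interface TaskAwareStoreSupplier<T extends StateStore> extends StoreSupplier<T> {
    default T get(final TaskId taskId) {
        return get();   // compatible default: ignore the taskId and build the store as before
    }
}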

Nick

On Fri, 12 Apr 2024 at 21:30, Sophie Blee-Goldman 
wrote:

> Somewhat minor point overall, but it actually drives me crazy that you
> can't get access to the taskId of a StateStore until #init is called. This
> has caused me a huge headache personally (since the same is true for
> processors and I was trying to do something that's probably too hacky to
> actually complain about here lol)
>
> Can we just change the StateStoreSupplier to receive and pass along the
> taskId when creating a new store? Presumably by adding a new version of the
> #get method that takes in a taskId parameter? We can have it default to
> invoking the old one for compatibility reasons and it should be completely
> safe to tack on.
>
> Would also prefer the same for a ProcessorSupplier, but that's definitely
> outside the scope of this KIP
>
> On Fri, Apr 12, 2024 at 3:31 AM Nick Telford 
> wrote:
>
> > On further thought, it's clear that this can't work for one simple
> reason:
> > StateStores don't know their associated TaskId (and hence, their
> > StateDirectory) until the init() call. Therefore, committedOffset() can't
> > be called before init(), unless we also added a StateStoreContext
> argument
> > to committedOffset(), which I think might be trying to shoehorn too much
> > into committedOffset().
> >
> > I still don't like the idea of the Streams engine maintaining the cache
> of
> > changelog offsets independently of stores, mostly because of the
> > maintenance burden of the code duplication, but it looks like we'll have
> to
> > live with it.
> >
> > Unless you have any better ideas?
> >
> > Regards,
> > Nick
> >
> > On Wed, 10 Apr 2024 at 14:12, Nick Telford 
> wrote:
> >
> > > Hi Bruno,
> > >
> > > Immediately after I sent my response, I looked at the codebase and came
> > to
> > > the same conclusion. If it's possible at all, it will need to be done
> by
> > > creating temporary StateManagers and StateStores during rebalance. I
> > think
> > > it is possible, and probably not too expensive, but the devil will be
> in
> > > the detail.
> > >
> > > I'll try to find some time to explore the idea to see if it's possible
> > and
> > > report back, because we'll need to determine this before we can vote on
> > the
> > > KIP.
> > >
> > > Regards,
> > > Nick
> > >
> > > On Wed, 10 Apr 2024 at 11:36, Bruno Cadonna 
> wrote:
> > >
> > >> Hi Nick,
> > >>
> > >> Thanks for reacting on my comments so quickly!
> > >>
> > >>
> > >> 2.
> > >> Some thoughts on your proposal.
> > >> State managers (and state stores) are parts of tasks. If the task is
> not
> > >> assigned locally, we do not create those tasks. To get the offsets
> with
> > >> your approach, we would need to either create kind of inactive tasks
> > >> besides active and standby tasks or store and manage state managers of
> > >> non-assigned tasks differently than the state managers of assigned
> > >> tasks. Additionally, the cleanup thread that removes unassigned task
> > >> directories needs to concurrently delete those inactive tasks or
> > >> task-less state managers of unassigned tasks. This seems all quite
> messy
> > >> to me.
> > >> Could we create those state managers (or state stores) for locally
> > >> existing but unassigned tasks on demand when
> > >> TaskManager#getTaskOffsetSums() is executed? Or have a different
> > >> encapsulation for the unused task directories?
> > >>
> > >>
> > >> Best,
> > >> Bruno
> > >>
> > >>
> > >>
> > >> On 4/10/24 11:31 AM, Nick Telford wrote:
> > >> > Hi Bruno,
> > >> >
> > >> > Thanks for the review!
> > >> >
> > >> > 1, 4, 5.
> > >> > Done
> > >> >
> > >> > 3.
> > >> > You're right. I've removed the offending paragraph. I had originally
> > >> > adapted this from the guarantees outlined in KIP-892. But it's
> > >> difficult to
> > >> > provide these guarantees without the KIP-892 transaction buffers.
> > >> Instead,
> > >> > we'll add the guarantees back into the JavaDoc when KIP-892 lands.
> > >> >
> > >> > 2.
> > >> > Good point! This is the only part of the KIP that was
> (significantly)
> > >> > changed when I extracted it from KIP-892. My prototype currently
> > >> maintains
> > >> > this "cache" of changelog offsets in .checkpoint, but doing so
> becomes
> > >> very
> > >> > messy. My intent with this change was to try to better encapsulate
> > this
> > >> > offset "caching", especially for StateStores that can cheaply
> provide
> > >> the
> > >> > offsets stored directly in them without needing to duplicate them in
> > >> this
> > >> > cache.
> > >> >
> > >> > It's clear some more work is needed here to better encapsulate this.
> > My
> > >> > immediate thought is: what if we construct *but don't initialize*
> the
> > >> > State

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-04-15 Thread Omnia Ibrahim
Hi Claude, 
Thanks for the implementation of the LayeredBloomFilter in apache commons. 

> Define a new configuration option "producer.id.quota.window.count" as
> the number of windows active in window.size.seconds.
What is the difference between “producer.id.quota.window.count” and
producer.id.quota.window.num?

> Basically the kip says, if the PID is found in either of the Bloom filters
> then no action is taken
> If the PID is not found then it is added and the quota rating metrics are
> incremented.
> In this case long running PIDs will be counted multiple times.

The PID is considered not encountered if both frames of the window don’t have 
it. If you check the diagram for `Caching layer to track active PIDs per
KafkaPrincipal` you will see that each window will have 2 bloom layers and the 
first created one will be disposed only when we start the next window. Which 
means window2 is starting from the 2nd bloom. Basically the bloom filter in the 
KIP is trying to implement a sliding window pattern. 

> I think the question is not whether or not we have seen a given PID before
> but rather how many unique PIDs did the principal create in the last hour.
> Perhaps more exactly it is: did the Principal create more than X PIDS in
> the last Y time units?
We don’t really care about the count of unique PIDs per user. The KIP is trying 
to follow and build on top of ClientQuotaManager which already has a pattern
for throttling that the producer client is aware of so we don’t need to upgrade 
old clients for brokers to throttle them and they can respect the throttling. 

The pattern for throttling is that we record the activities by incrementing a 
metric sensor and only when we catch `QuotaViolationException` from the quota 
sensor we will be sending a throttleTimeMs to the client. 
For bandwidth throttling for example we increment the sensor by the size of the 
request. For PID the KIP is aiming to call 
`QuotaManagers::producerIdQuotaManager::maybeRecordAndGetThrottleTimeMs` to 
increment by +1 every time we encounter a new PID, and if
`Sensor::record` returned `QuotaViolationException` then we will send back to 
the producer the throttling time that the client should wait for before sending a
new request with a new PID. 
I hope this makes sense.
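
As a rough illustration of that pattern only (the sensor wiring and the throttle
time calculation below are simplified assumptions, not the KIP's implementation):

import org.apache.kafka.common.metrics.QuotaViolationException;
import org.apache.kafka.common.metrics.Sensor;

// Sketch: record +1 for each newly observed PID against a quota-bound sensor and
// translate a quota violation into a throttle time for the producer.
public final class PidQuotaRecorder {
    private final Sensor newPidSensor;   // assumed to be registered with a Quota

    public PidQuotaRecorder(Sensor newPidSensor) {
        this.newPidSensor = newPidSensor;
    }

    // Returns 0 when within quota, otherwise the number of milliseconds the
    // producer should wait before retrying with a new PID.
    public int maybeRecordAndGetThrottleTimeMs(long nowMs) {
        try {
            newPidSensor.record(1.0, nowMs);   // one unit per newly encountered PID
            return 0;
        } catch (QuotaViolationException e) {
            // Placeholder: a real implementation would compute this from the
            // violating metric's value and bound, as the bandwidth quotas do.
            return 1_000;
        }
    }
}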

> This question can be quickly answered by a CPC datasketch [1].  The
> solution would be something like:
> Break the Y time units into a set of Y' smaller partitions (e.g. 60
> 1-minute partitions for an hour).  Create a circular queue of Y' CPC
> datasketches for each principal.  Implement a queue entry selector based on
> the modulus of the system by the resolution of the Y' partitions. On each
> call:
I didn’t evaluate CPC datasketch or any counter solution because, as I explained
above, the aim is not to build a counter, especially since the Kafka Sensor is
enough to indicate whether we are violating the quota or not.

Thanks 
Omnia 

> On 15 Apr 2024, at 10:35, Claude Warren  wrote:
> 
> After thinking about this KIP over the weekend I think that there is another
> lighter weight approach.
> 
> I think the question is not whether or not we have seen a given PID before
> but rather how many unique PIDs did the principal create in the last hour.
> Perhaps more exactly it is: did the Principal create more than X PIDS in
> the last Y time units?
> 
> This question can be quickly answered by a CPC datasketch [1].  The
> solution would be something like:
> Break the Y time units into a set of Y' smaller partitions (e.g. 60
> 1-minute partitions for an hour).  Create a circular queue of Y' CPC
> datasketches for each principal.  Implement a queue entry selector based on
> the modulus of the system by the resolution of the Y' partitions. On each
> call:
> 
> On queue entry selector change clear the CPC (makes it empty)
> Add the latest PID to the current queue entry.
> Sum up the CPCs and check if the max (or min) estimate of unique counts
> exceeds the limit for the user.
> 
> When the CPC returns a zero estimated count then the principal has gone
> away and the principal/CPC-queue pair can be removed from the tracking
> system.
> 
> I believe that this code solution is smaller and faster than the Bloom
> filter implementation.
> 
> [1] https://datasketches.apache.org/docs/CPC/CPC.html
> 
> 
> 
> On Fri, Apr 12, 2024 at 3:10 PM Claude Warren  wrote:
> 
>> I think there is an issue in the KIP.
>> 
>> Basically the kip says, if the PID is found in either of the Bloom filters
>> then no action is taken
>> If the PID is not found then it is added and the quota rating metrics are
>> incremented.
>> 
>> In this case long running PIDs will be counted multiple times.
>> 
>> Let's assume a 30 minute window with 2 15-minute frames.  So for the first
>> 15 minutes all PIDs are placed in the first Bloom filter and for the 2nd 15
>> minutes all new PIDs are placed in the second bloom filter.  At the 3rd 15
>> minutes the first filter is removed and a new empty one created.
>> 
>> Let

Re: [VOTE] KIP-899: Allow producer and consumer clients to rebootstrap

2024-04-15 Thread Andrew Schofield
Thanks for the KIP

+1 (non-binding)

Andrew

> On 15 Apr 2024, at 14:16, Chris Egerton  wrote:
>
> Hi Ivan,
>
> Thanks for the KIP. After the recent changes, this LGTM. +1 (binding)
>
> Cheers,
>
> Chris
>
> On Wed, Aug 2, 2023 at 12:15 AM Ivan Yurchenko 
> wrote:
>
>> Hello,
>>
>> The discussion [1] for KIP-899 [2] has been open for quite some time. I'd
>> like to put the KIP up for a vote.
>>
>> Best,
>> Ivan
>>
>> [1] https://lists.apache.org/thread/m0ncbmfxs5m87sszby2jbmtjx2bdpcdl
>> [2]
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-899%3A+Allow+producer+and+consumer+clients+to+rebootstrap
>>



Re: [DISCUSS] KIP-932: Queues for Kafka

2024-04-15 Thread David Jacot
Hi Andrew,

Thanks for the KIP. This work is really exciting.

I finally had a bit of time to go through the KIP. I need to read it a
second time in order to get into the details. I have noted a few
points/questions:

001: The dynamic config to force the group type is really weird. As you
said, groups are created on first use and so they are. If we want something
better, we should rather make the creation of the group explicit.

002: It is weird to write a ConsumerGroupMetadata to reserve the group id.
I think that we should rather have a ShareGroupMetadata for this purpose.
Similarly, I don't think that we should add a type to the
ConsumerGroupMetadataValue record. This record is meant to be used by
"consumer" groups.

003: I don't fully understand the motivation for having the
ShareGroupPartitionMetadata record and the InitializeShareGroupState API
called from the group coordinator. Could you elaborate a bit more? Isn't it
possible to lazily initialize the state in the share coordinator when the
share leader fetches the state for the first time?

004: Could you precise how the group expiration will work? I did not see it
mentioned in the KIP but I may have missed it.

005: I would like to ensure that I understand the proposal for the share
coordinator. It looks like we want it to be an internal service. By this, I
mean that it won't be directly accessed by external users. Is my
understanding correct?

006: group.share.enable: We should rather use
`group.coordinator.rebalance.protocols` with `share`.

007: SimpleShareAssignor, do we have an interface for it?

008: For my understanding, will the SPSO and SPEO be tracked in the
Partition and in the Log layer?

009: Is there a reason why we still need the ShareAcknowledge API if
acknowledging can also be done with the ShareFetch API?

010: Do we plan to limit the number of share sessions on the share leader?
The KIP mentions a limit calculated based on group.share.max.groups and
group.share.max.size but it is quite vague.

011: Do you have an idea of the size that ShareSnapshot will use in
practice? Could it get larger than the max size of the batch within a
partition (1MB by default)?

012: Regarding the share group coordinator, do you plan to implement it on
top of the CoordinatorRuntime introduced by KIP-848? I hope so in order to
reuse code.

013: Following my previous question, do we need a config similar to
`group.coordinator.threads` for the share coordinator?

014: I am not sure to understand why we need
`group.share.state.topic.min.isr`. Is the topic level configuration enough
for this?

015: ShareGroupHeartbeat API: Do we need RebalanceTimeoutMs? What's its
purpose if there is no revocation in the protocol?

016: ShareGroupPartitionMetadataValue: What are the StartPartitionIndex and
EndPartitionIndex?

017: The metric `num-partitions` with a tag called protocol does not make
sense in the group coordinator. The number of partitions is the number of
__consumer_offsets partitions here.

018: Do we need a tag for `share-acknowledgement` if the name is already
scoped to share groups?

019: Should we also scope the name of `record-acknowledgement` to follow
`share-acknowledgement`?

020: I suppose that the SPEO is always bounded by the HWM. It may be good
to call it out. Is it also bounded by the LSO?

021: WriteShareGroupState API: Is there a mechanism to prevent zombie share
leaders from committing wrong state?

Best,
David


On Fri, Apr 12, 2024 at 2:32 PM Andrew Schofield 
wrote:

> Hi,
> 77. I’ve updated the KIP to use log retention rather than log compaction.
> The basic ideas of what to persist are unchanged. It makes a few changes:
>
> * It changes the record names: ShareCheckpoint -> ShareSnapshot and
>   ShareDelta -> ShareUpdate. They’re equivalent, but renaming makes it
>   simple to check I did an atomic change to the new proposal.
> * It uses log retention and explicit pruning of elderly records using
>   ReplicaManager.deleteRecords
> * It gets rid of the nasty DeltaIndex scheme because we don’t need to worry
>   about the log compactor and key uniqueness.
>
> I have also changed the ambiguous “State” to “DeliveryState” in RPCs
> and records.
>
> And I added a clarification about how the “group.type” configuration should
> be used.
>
> Thanks,
> Andrew
>
> > On 10 Apr 2024, at 15:33, Andrew Schofield <
> andrew_schofield_j...@live.com> wrote:
> >
> > Hi Jun,
> > Thanks for your questions.
> >
> > 41.
> > 41.1. The partition leader obtains the state epoch in the response from
> > ReadShareGroupState. When it becomes a share-partition leader,
> > it reads the share-group state and one of the things it learns is the
> > current state epoch. Then it uses the state epoch in all subsequent
> > calls to WriteShareGroupState. The fencing is to prevent writes for
> > a previous state epoch, which are very unlikely but which would mean
> > that a leader was using an out-of-date epoch and was likely no longer
> > the current leader at all, perhaps due t

Re: [VOTE] KIP-899: Allow producer and consumer clients to rebootstrap

2024-04-15 Thread Chris Egerton
Hi Ivan,

Thanks for the KIP. After the recent changes, this LGTM. +1 (binding)

Cheers,

Chris

On Wed, Aug 2, 2023 at 12:15 AM Ivan Yurchenko 
wrote:

> Hello,
>
> The discussion [1] for KIP-899 [2] has been open for quite some time. I'd
> like to put the KIP up for a vote.
>
> Best,
> Ivan
>
> [1] https://lists.apache.org/thread/m0ncbmfxs5m87sszby2jbmtjx2bdpcdl
> [2]
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-899%3A+Allow+producer+and+consumer+clients+to+rebootstrap
>


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2814

2024-04-15 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-16553) log controller configs when startup

2024-04-15 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-16553:
--

 Summary: log controller configs when startup
 Key: KAFKA-16553
 URL: https://issues.apache.org/jira/browse/KAFKA-16553
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


We can't observe the controller configs from the log file. We can copy the 
solution used by broker 
(https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/BrokerServer.scala#L492).

Alternatively, this issue could be blocked by
https://issues.apache.org/jira/browse/KAFKA-13105 to wait for a more graceful
solution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-950: Tiered Storage Disablement

2024-04-15 Thread Christo Lolov
Heya Doguscan,

I believe that the state of the world after this KIP will be the following:

For Zookeeper-backed clusters there will be 3 states: ENABLED, DISABLING
and DISABLED. We want this because Zookeeper-backed clusters will await a
confirmation from the brokers that they have indeed stopped tiering-related
operations on the topic.

For KRaft-backed clusters there will be only 2 states: ENABLED and
DISABLED. KRaft takes a fire-and-forget approach for topic deletion. I
believe the same approach ought to be taken for tiered topics. To me, the
mechanism which will ensure that leftover state in remote storage due to
failures is cleaned up is the retention mechanism. In today's code, a leader
deletes all segments it finds in remote with offsets below the log start
offset. I believe this will be good enough for cleaning up leftover state
in remote due to failures.

I know that quite a few changes have been discussed so I will aim to put
them on paper in the upcoming days and let everyone know!

Best,
Christo

On Tue, 9 Apr 2024 at 14:49, Doğuşcan Namal 
wrote:

> +1 let's not introduce a new api and mark it immediately as deprecated :)
>
> On your second comment Luke, one thing we need to clarify is when do we
> consider remote storage to be DISABLED for a topic?
> Particularly, what is the state when the remote storage is being deleted
> in case of disablement.policy=delete? Is it DISABLING or DISABLED?
>
> If we move directly to the DISABLED state,
>
> a) in case of failures, the leaders should continue remote storage
> deletion even if the topic is moved to the DISABLED state, otherwise we
> risk having stray data on remote storage.
> b) on each restart, we should initiate the remote storage deletion because
> although we replayed a record with a DISABLED state, we can not be sure if
> the remote data is deleted or not.
>
> We could either consider keeping the remote topic in DISABLING state until
> all of the remote storage data is deleted, or we need an additional
> mechanism to handle the remote stray data.
>
> The existing topic deletion, for instance, handles stray logs on disk by
> detecting them on KafkaBroker startup and deleting them before the
> ReplicaManager is started.
> Maybe we need a similar mechanism here as well if we don't want a
> DISABLING state. Otherwise, we need a callback from Brokers to validate
> that remote storage data is deleted and now we could move to the DISABLED
> state.
>
> Thanks.
>
> On Tue, 9 Apr 2024 at 12:45, Luke Chen  wrote:
>
>> Hi Christo,
>>
>> > I would then opt for moving information from DisableRemoteTopic
>> within the StopReplicas API which will then disappear in KRaft world as it
>> is already scheduled for deprecation. What do you think?
>>
>> Sounds good to me.
>>
>> Thanks.
>> Luke
>>
>> On Tue, Apr 9, 2024 at 6:46 PM Christo Lolov 
>> wrote:
>>
>> > Heya Luke!
>> >
>> > I thought a bit more about it and I reached the same conclusion as you
>> for
>> > 2 as a follow-up from 1. In other words, in KRaft world I don't think
>> the
>> > controller needs to wait for acknowledgements for the brokers. All we
>> care
>> > about is that the leader (who is responsible for archiving/deleting
>> data in
>> > tiered storage) knows about the change and applies it properly. If
>> there is
>> > a leadership change halfway through the operation then the new leader
>> still
>> > needs to apply the message from the state topic and we know that a
>> > disable-message will be applied before a reenablement-message. I will
>> > change the KIP later today/tomorrow morning to reflect this reasoning.
>> >
>> > However, with this I believe that introducing a new API just for
>> > Zookeeper-based clusters (i.e. DisableRemoteTopic) becomes a bit of an
>> > overkill. I would then opt for moving information from
>> DisableRemoteTopic
>> > within the StopReplicas API which will then disappear in KRaft world as
>> it
>> > is already scheduled for deprecation. What do you think?
>> >
>> > Best,
>> > Christo
>> >
>> > On Wed, 3 Apr 2024 at 07:59, Luke Chen  wrote:
>> >
>> > > Hi Christo,
>> > >
>> > > 1. I agree with Doguscan that in KRaft mode, the controller won't send
>> > RPCs
>> > > to the brokers (except in the migration path).
>> > > So, I think we could adopt the similar way we did to
>> > `AlterReplicaLogDirs`
>> > > (
>> > > KIP-858
>> > > <
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-858%3A+Handle+JBOD+broker+disk+failure+in+KRaft#KIP858:HandleJBODbrokerdiskfailureinKRaft-Intra-brokerreplicamovement
>> > > >)
>> > > that let the broker notify controller any update, instead of
>> controller
>> > to
>> > > broker. And once the controller receives all the complete requests
>> from
>> > > brokers, it'll enter "Disabled" state. WDYT?
>> > >
>> > > 2. Why should we wait until all brokers to respond before moving to
>> > > "Disabled" state in "KRaft mode"?
>> > > Currently, only the leader node does the remote log upload/fetch
>> tasks,
>> > so
>> > > does that mean the

Re: [DISCUSS] KIP-1037: Allow WriteTxnMarkers API with Alter Cluster Permission

2024-04-15 Thread Christo Lolov
Heya Nikhil,

Thank you for raising this KIP!

Your proposal makes sense to me. In essence you are saying that the
permission required by WriteTxnMarkers should be the same as for CreateAcls
and DeleteAcls, which is reasonable. If we trust an administrator to assign
the correct permissions then we should also trust them to be able to abort
a hanging transaction.

I would support this KIP if it is put to the vote unless there are other
suggestions for improvements!

Best,
Christo

On Thu, 11 Apr 2024 at 16:48, Nikhil Ramakrishnan <
ramakrishnan.nik...@gmail.com> wrote:

> Hi everyone,
>
> I would like to start a discussion for
>
> KIP-1037: Allow WriteTxnMarkers API with Alter Cluster Permission
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1037%3A+Allow+WriteTxnMarkers+API+with+Alter+Cluster+Permission
>
> The WriteTxnMarkers API was originally used for inter-broker
> communication only. This required the ClusterAction permission on the
> Cluster resource to invoke.
>
> In KIP-664, we modified the WriteTxnMarkers API so that it could be
> invoked externally from the Kafka AdminClient to safely abort a
> hanging transaction. Such usage is more aligned with the Alter
> permission on the Cluster resource, which includes other
> administrative actions invoked from the Kafka AdminClient (i.e.
> CreateAcls and DeleteAcls). This KIP proposes allowing the
> WriteTxnMarkers API to be invoked with the Alter permission on the
> Cluster.
>
> I am looking forward to your thoughts and suggestions for improvement!
>
> Thanks,
> Nikhil
>


[jira] [Resolved] (KAFKA-15230) ApiVersions data between controllers is not reliable

2024-04-15 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-15230.

Fix Version/s: (was: 3.7.0)
   Resolution: Duplicate

this is fixed by https://issues.apache.org/jira/browse/KAFKA-15369

> ApiVersions data between controllers is not reliable
> 
>
> Key: KAFKA-15230
> URL: https://issues.apache.org/jira/browse/KAFKA-15230
> Project: Kafka
>  Issue Type: Bug
>Reporter: David Arthur
>Assignee: Colin McCabe
>Priority: Critical
>
> While testing ZK migrations, I noticed a case where the controller was not 
> starting the migration due to the missing ApiVersions data from other 
> controllers. This was unexpected because the quorum was running and the 
> followers were replicating the metadata log as expected. After examining a 
> heap dump of the leader, it was in fact the case that the ApiVersions map of 
> NodeApiVersions was empty.
>  
> After further investigation and offline discussion with [~jsancio], we 
> realized that after the initial leader election, the connection from the Raft 
> leader to the followers will become idle and eventually timeout and close. 
> This causes NetworkClient to purge the NodeApiVersions data for the closed 
> connections.
>  
> There are two main side effects of this behavior: 
> 1) If migrations are not started within the idle timeout period (10 minutes, 
> by default), then they will not be able to be started. After this timeout 
> period, I was unable to restart the controllers in such a way that the leader 
> had active connections with all followers.
> 2) Dynamically updating features, such as "metadata.version", is not 
> guaranteed to be safe
>  
> There is a partial workaround for the migration issue. If we set "
> connections.max.idle.ms" to -1, the Raft leader will never disconnect from 
> the followers. However, if a follower restarts, the leader will not 
> re-establish a connection.
>  
> The feature update issue has no safe workarounds.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (KAFKA-15230) ApiVersions data between controllers is not reliable

2024-04-15 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai reopened KAFKA-15230:


Reopening to mark it as a duplicate.






[jira] [Resolved] (KAFKA-15369) Allow AdminClient to Talk Directly with the KRaft Controller Quorum and add Controller Registration

2024-04-15 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-15369.

Fix Version/s: 3.7.0
 Assignee: Colin McCabe
   Resolution: Fixed

> Allow AdminClient to Talk Directly with the KRaft Controller Quorum and add 
> Controller Registration
> ---
>
> Key: KAFKA-15369
> URL: https://issues.apache.org/jira/browse/KAFKA-15369
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Colin McCabe
>Assignee: Colin McCabe
>Priority: Major
> Fix For: 3.7.0
>
>






Re: [ANNOUNCE] New Kafka PMC Member: Greg Harris

2024-04-15 Thread Christo Lolov
Congratulations, Greg :)

On Mon, 15 Apr 2024 at 07:34, Zhisheng Zhang <31791909...@gmail.com> wrote:

> Congratulations Greg!
>
>
> Manikumar  于2024年4月15日周一 13:49写道:
>
> > Congratulations, Greg.
> >
> > On Mon, Apr 15, 2024 at 11:18 AM Bruno Cadonna 
> wrote:
> > >
> > > Congratulations, Greg!
> > >
> > > Best,
> > > Bruno
> > >
> > > On 4/15/24 7:33 AM, Claude Warren wrote:
> > > > Congrats Greg!  All the hard work paid off.
> > > >
> > > > On Mon, Apr 15, 2024 at 6:58 AM Ivan Yurchenko 
> wrote:
> > > >
> > > >> Congrats Greg!
> > > >>
> > > >> On Sun, Apr 14, 2024, at 22:51, Sophie Blee-Goldman wrote:
> > > >>> Congrats Greg! Happy to have you
> > > >>>
> > > >>> On Sun, Apr 14, 2024 at 9:26 AM Jorge Esteban Quilcate Otoya <
> > > >>> quilcate.jo...@gmail.com> wrote:
> > > >>>
> > >  Congrats, Greg!!
> > > 
> > >  On Sun 14. Apr 2024 at 15.05, Josep Prat
> > 
> > >  wrote:
> > > 
> > > > Congrats Greg!!!
> > > >
> > > >
> > > > Best,
> > > >
> > > > Josep Prat
> > > > Open Source Engineering Director, Aiven | josep.p...@aiven.io |
> > > > +491715557497 | aiven.io
> > > > Aiven Deutschland GmbH
> > > > Alexanderufer 3-7, 10117 Berlin
> > > > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > >
> > > > On Sun, Apr 14, 2024, 12:30 Divij Vaidya <
> divijvaidy...@gmail.com>
> > >  wrote:
> > > >
> > > >> Congratulations Greg!
> > > >>
> > > >> --
> > > >> Divij Vaidya
> > > >>
> > > >>
> > > >>
> > > >> On Sun, Apr 14, 2024 at 6:39 AM Kamal Chandraprakash <
> > > >> kamal.chandraprak...@gmail.com> wrote:
> > > >>
> > > >>> Congratulations, Greg!
> > > >>>
> > > >>> On Sun, Apr 14, 2024 at 8:57 AM Yash Mayya <
> yash.ma...@gmail.com
> > > >>>
> > > > wrote:
> > > >>>
> > >  Congrats Greg!
> > > 
> > >  On Sun, 14 Apr, 2024, 05:56 Randall Hauch, 
> > >  wrote:
> > > 
> > > > Congratulations, Greg!
> > > >
> > > > On Sat, Apr 13, 2024 at 6:36 PM Luke Chen  > > >>>
> > > > wrote:
> > > >
> > > >> Congrats, Greg!
> > > >>
> > > >> On Sun, Apr 14, 2024 at 7:05 AM Viktor Somogyi-Vass
> > > >>  wrote:
> > > >>
> > > >>> Congrats Greg! :)
> > > >>>
> > > >>> On Sun, Apr 14, 2024, 00:35 Bill Bejeck <
> > > >> bbej...@gmail.com>
> > > >> wrote:
> > > >>>
> > >  Congrats Greg!
> > > 
> > >  -Bill
> > > 
> > >  On Sat, Apr 13, 2024 at 4:25 PM Boudjelda Mohamed Said
> > > >> <
> > > >>> bmsc...@gmail.com>
> > >  wrote:
> > > 
> > > > Congratulations Greg
> > > >
> > > > On Sat 13 Apr 2024 at 20:42, Chris Egerton <
> > > >>> ceger...@apache.org>
> > > >>> wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > >> Greg Harris has been a Kafka committer since July
> > > >> 2023.
> > > > He
> > > >>> has
> > > >>> remained
> > > >> very active and instructive in the community since
> > > >> becoming a
> > >  committer.
> > > >> It's my pleasure to announce that Greg is now a
> > > >> member
> > >  of
> > > >>> Kafka
> > > >> PMC.
> > > >>
> > > >> Congratulations, Greg!
> > > >>
> > > >> Chris, on behalf of the Apache Kafka PMC


[jira] [Created] (KAFKA-16552) Create an internal config to control InitialTaskDelayMs in LogManager to speed up tests

2024-04-15 Thread Luke Chen (Jira)
Luke Chen created KAFKA-16552:
-

 Summary: Create an internal config to control InitialTaskDelayMs 
in LogManager to speed up tests
 Key: KAFKA-16552
 URL: https://issues.apache.org/jira/browse/KAFKA-16552
 Project: Kafka
  Issue Type: Improvement
Reporter: Luke Chen
Assignee: Luke Chen


When LogManager starts up, we'll create scheduled tasks like the kafka-log-retention 
and kafka-recovery-point-checkpoint threads, etc. 
([here|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LogManager.scala#L629]).
All of them have public configs to control the interval, like 
`log.retention.check.interval.ms`. But in addition to the scheduler interval, 
there's a hard-coded InitialTaskDelayMs (30 seconds) for all of them. That 
might not be a problem in a production env, since it helps the Kafka server 
start up faster. But in a test env, the 30-sec delay means that tests verifying 
behaviors like log retention take an extra 30 secs to complete.

To speed up tests, we should create an internal config (ex: 
"log.initial.task.delay.ms") to control InitialTaskDelayMs in LogManager. This 
config is not intended for normal users, just for speeding up tests.
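A rough sketch of how a test could use the proposed config once it exists (the config name follows the suggestion above and is hypothetical until implemented):

import java.util.Properties;

public class FastLogManagerOverrides {
    public static Properties testOverrides() {
        Properties props = new Properties();
        // Hypothetical internal config proposed above: drop the 30s initial delay
        // so the retention/recovery-point tasks start immediately in tests.
        props.setProperty("log.initial.task.delay.ms", "0");
        // Existing public config: how often the retention task runs once scheduled.
        props.setProperty("log.retention.check.interval.ms", "1000");
        return props;
    }
}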

 

 





Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-04-15 Thread Claude Warren
After thinking about this KIP over the weekend I think that there is another
lighter weight approach.

I think the question is not whether or not we have seen a given PID before
but rather how many unique PIDs did the principal create in the last hour.
Perhaps more exactly it is: did the Principal create more than X PIDS in
the last Y time units?

This question can be quickly answered by a CPC datasketch [1].  The
solution would be something like:
Break the Y time units into a set of Y' smaller partitions (e.g. 60
1-minute partitions for an hour).  Create a circular queue of Y' CPC
datasketches for each principal.  Implement a queue entry selector based on
the system time modulo the resolution of the Y' partitions. On each call:

1. On queue entry selector change, clear the newly selected CPC (making it empty).
2. Add the latest PID to the current queue entry.
3. Sum up the CPCs and check whether the max (or min) estimate of unique counts
exceeds the limit for the user.

When the CPC returns a zero estimated count then the principal has gone
away and the principal/CPC-queue pair can be removed from the tracking
system.

I believe that this code solution is smaller and faster than the Bloom
filter implementation.

[1] https://datasketches.apache.org/docs/CPC/CPC.html
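A minimal sketch of that per-principal ring of CPC sketches, assuming the org.apache.datasketches.cpc API from datasketches-java (the slot count, lgK, and limit handling are illustrative placeholders):

import org.apache.datasketches.cpc.CpcSketch;
import org.apache.datasketches.cpc.CpcUnion;

public class PidRingEstimator {
    private final CpcSketch[] slots;   // Y' partitions, e.g. 60 one-minute slots for an hour
    private final long slotMillis;
    private final int lgK;
    private int currentSlot = -1;

    public PidRingEstimator(int numSlots, long slotMillis, int lgK) {
        this.slots = new CpcSketch[numSlots];
        this.slotMillis = slotMillis;
        this.lgK = lgK;
        for (int i = 0; i < numSlots; i++) {
            slots[i] = new CpcSketch(lgK);
        }
    }

    // Record one PID sighting and return the estimated number of unique PIDs
    // seen across the whole window.
    public synchronized double recordAndEstimate(long pid, long nowMillis) {
        int slot = (int) ((nowMillis / slotMillis) % slots.length);
        if (slot != currentSlot) {
            // Queue entry selector changed: recycle this slot with an empty sketch.
            // (A production version would also clear slots skipped while idle.)
            slots[slot] = new CpcSketch(lgK);
            currentSlot = slot;
        }
        slots[slot].update(pid);
        // Sum up the CPCs by unioning them, then read off the unique-count estimate.
        CpcUnion union = new CpcUnion(lgK);
        for (CpcSketch s : slots) {
            union.update(s);
        }
        return union.getResult().getEstimate();   // or getUpperBound(2) for a stricter check
    }
}

The throttling check would compare the returned estimate against the principal's limit, and the principal/ring pair can be dropped once the estimate falls to zero.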



On Fri, Apr 12, 2024 at 3:10 PM Claude Warren  wrote:

> I think there is an issue in the KIP.
>
> Basically the kip says, if the PID is found in either of the Bloom filters
> then no action is taken
> If the PID is not found then it is added and the quota rating metrics are
> incremented.
>
> In this case long running PIDs will be counted multiple times.
>
> Let's assume a 30 minute window with 2 15-minute frames.  So for the first
> 15 minutes all PIDs are placed in the first Bloom filter and for the 2nd 15
> minutes all new PIDs are placed in the second bloom filter.  At the 3rd 15
> minutes the first filter is removed and a new empty one created.
>
> Let's denote Bloom filters as BFn{} and indicate the contained pids
> between the braces.
>
>
> So at t0 let's insert PID0 and increment the rating metrics.  Thus we have
> BF0{PID0}
> at t0+5 let's insert PID1 and increment the rating metrics.  Thus we have
> BF0{PID0, PID1}
> at t0+10 we see PID0 again but no changes occur.
> at t0+15 we start t1  and we have BF0{PID0, PID1}, BF1{}
> at t1+5 we see PID2, increment the rating metrics, and we have BF0{PID0,
> PID1}, BF1{PID2}
> at t1+6 we see PID0 again and no changes occur
> at t1+7 we see PID1 again and no changes occur
> at t1+15 we start a new window and dispose of BF0.  Thus we have
> BF1{PID2}, BF2{}
> at t2+1 we see PID3, increment the rating metrics, and we have
> BF1{PID2}, BF2{PID3}
> at t2+6 we see PID0 again but now it is not in the list so we increment
> the rating metrics and add it BF1{PID2}, BF2{PID3, PID0}
>
> But we just saw PID0 15 minutes ago.  Well within the 30 minute window we
> are trying to track.  Or am I missing something?  It seems like we need to
> add each PID to the last bloom filter
>
> On Fri, Apr 12, 2024 at 2:45 PM Claude Warren  wrote:
>
>> Initial code is available at
>> https://github.com/Claudenw/kafka/blob/KIP-936/storage/src/main/java/org/apache/kafka/storage/internals/log/ProducerIDQuotaManager.java
>>
>> On Tue, Apr 9, 2024 at 2:37 PM Claude Warren  wrote:
>>
>>> I should also note that the probability of false positives does not rise
>>> above shape.P because as it approaches shape.P a new layer is created and
>>> filters are added to that.  So no layer in the LayeredBloomFilter exceeds
>>> shape.P thus the entire filter does not exceed shape.P.
>>>
>>> Claude
>>>
>>> On Tue, Apr 9, 2024 at 2:26 PM Claude Warren  wrote:
>>>
 The overall design for KIP-936 seems sound to me.  I would make the
 following changes:

 Replace the "TimedBloomFilter" with a "LayeredBloomFilter" from
 commons-collections v4.5

 Define the producer.id.quota.window.size.seconds to be the length of
 time that a Bloom filter of PIDs will exist.
 Define a new configuration option "producer.id.quota.window.count" as
 the number of windows active in window.size.seconds.

 Define the "Shape" (See commons-collections bloomfilters v4.5) of the
 bloom filter from the average number of PIDs expected in
 window.size.seconds/window.count (call this N) and the probability of false
 positives (call this P).  Due to the way the LayeredBloomFilter works the
 number of items can be a lower number than the max.  I'll explain that in a
 minute.

 The LayeredBloomFilter implements the standard BloomFilter interface
 but internally keeps an ordered list of filters (called layers) from oldest
 created to newest.  It adds new layers when a specified Predicate
 (checkExtend) returns true.  It will remove filters as defined by a
 specified Consumer (filterCleanup).

 Every time a BloomFilter is merged into the LayeredBloomFilter the
 filter checks to the "checkExtend
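To make the Shape / layered-filter idea above concrete, here is a hand-rolled sketch that uses only Shape, SimpleBloomFilter, and EnhancedDoubleHasher from commons-collections4 4.5. The actual proposal would delegate this bookkeeping to LayeredBloomFilter/LayerManager via the checkExtend predicate and filterCleanup consumer described above; their exact builder API is not reproduced here.

import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import org.apache.commons.collections4.bloomfilter.EnhancedDoubleHasher;
import org.apache.commons.collections4.bloomfilter.Hasher;
import org.apache.commons.collections4.bloomfilter.Shape;
import org.apache.commons.collections4.bloomfilter.SimpleBloomFilter;

public class PidWindowSketch {
    private final Shape shape;
    private final int expectedPidsPerLayer;
    private final Deque<SimpleBloomFilter> layers = new ArrayDeque<>();
    private int newestLayerCount = 0;

    public PidWindowSketch(int expectedPidsPerLayer, double falsePositiveRate) {
        // Shape derived from N (expected PIDs per frame) and P (false-positive rate).
        this.shape = Shape.fromNP(expectedPidsPerLayer, falsePositiveRate);
        this.expectedPidsPerLayer = expectedPidsPerLayer;
        layers.addLast(new SimpleBloomFilter(shape));
    }

    // Returns true if the PID was not present in any live layer, i.e. it should
    // count against the principal's quota.
    public synchronized boolean trackPid(long producerId) {
        Hasher hasher = new EnhancedDoubleHasher(
                Long.toString(producerId).getBytes(StandardCharsets.UTF_8));
        boolean seen = false;
        for (SimpleBloomFilter layer : layers) {
            if (layer.contains(hasher)) {
                seen = true;
                break;
            }
        }
        // "checkExtend": once the newest layer holds roughly N items, adding more
        // would push its false-positive rate past P, so start a fresh layer instead.
        if (newestLayerCount >= expectedPidsPerLayer) {
            layers.addLast(new SimpleBloomFilter(shape));
            newestLayerCount = 0;
        }
        layers.peekLast().merge(hasher);
        if (!seen) {
            newestLayerCount++;
        }
        return !seen;
    }

    // "filterCleanup": invoked by a frame timer to retire the oldest layer.
    public synchronized void expireOldestLayer() {
        if (layers.size() > 1) {
            layers.removeFirst();
        }
    }
}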

[jira] [Created] (KAFKA-16551) add integration test for ClusterTool

2024-04-15 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-16551:
--

 Summary: add integration test for ClusterTool
 Key: KAFKA-16551
 URL: https://issues.apache.org/jira/browse/KAFKA-16551
 Project: Kafka
  Issue Type: Test
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


as title





[jira] [Created] (KAFKA-16550) add integration test for LogDirsCommand

2024-04-15 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-16550:
--

 Summary: add integration test for LogDirsCommand
 Key: KAFKA-16550
 URL: https://issues.apache.org/jira/browse/KAFKA-16550
 Project: Kafka
  Issue Type: Test
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


Currently LogDirsCommand has only unit tests.





[jira] [Resolved] (KAFKA-16490) Upgrade gradle from 8.6 to 8.7

2024-04-15 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-16490.

Fix Version/s: 3.8.0
   Resolution: Fixed

> Upgrade gradle from 8.6 to 8.7
> --
>
> Key: KAFKA-16490
> URL: https://issues.apache.org/jira/browse/KAFKA-16490
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Chia Chuan Yu
>Priority: Major
> Fix For: 3.8.0
>
>
> gradle 8.7: 
> https://docs.gradle.org/8.7/release-notes.html?_gl=1*meg7rg*_ga*MTA4Mzk2MzA3MC4xNzEwOTI1MjQx*_ga_7W7NC6YNPT*MTcxMjY2MjM3My4yMC4wLjE3MTI2NjIzNzMuNjAuMC4w
> As there is an unresolved issue with 8.6 [0], it would be nice to test all 
> instructions in the README when running this update.
> [0] https://github.com/apache/kafka/pull/15553





[jira] [Created] (KAFKA-16549) suppress the warnings from RemoteLogManager

2024-04-15 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-16549:
--

 Summary: suppress the warnings from RemoteLogManager
 Key: KAFKA-16549
 URL: https://issues.apache.org/jira/browse/KAFKA-16549
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


{quote}
/home/chia7712/project/kafka/core/src/main/java/kafka/log/remote/RemoteLogManager.java:234:
 warning: [removal] AccessController in java.security has been deprecated and 
marked for removal
return java.security.AccessController.doPrivileged(new 
PrivilegedAction() {
^
/home/chia7712/project/kafka/core/src/main/java/kafka/log/remote/RemoteLogManager.java:256:
 warning: [removal] AccessController in java.security has been deprecated and 
marked for removal
return java.security.AccessController.doPrivileged(new 
PrivilegedAction() {
{quote}

we should add 
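Presumably the missing text is a suppression such as @SuppressWarnings("removal") around the doPrivileged calls. A generic, hedged sketch of that (not the actual RemoteLogManager code):

import java.security.AccessController;
import java.security.PrivilegedAction;

public class SuppressedPrivilegedCall {
    @SuppressWarnings("removal") // AccessController is deprecated for removal in recent JDKs
    static ClassLoader contextClassLoader() {
        return AccessController.doPrivileged(
                (PrivilegedAction<ClassLoader>) () -> Thread.currentThread().getContextClassLoader());
    }
}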





Re: [DISCUSS] KIP-1028: Docker Official Image for Apache Kafka

2024-04-15 Thread Manikumar
Hi Krish,

Thanks for the updated KIP. a few comments below.

> "These actions can be carried out by the RM or any contributor post the 
> release process."
Maybe as part of the release process, the RM can create a JIRA for this
task. It can then be taken up by the RM, any committer, or any contributor (with
some help from committers to run "Docker Image Preparation via GitHub
Actions").

> "Perform Docker build tests to ensure image integrity"
Is this done via a GitHub Actions workflow, or via manual testing?

> "The RM will manually raise the final PR to Docker Hub’s official images 
> repository using the contents of the generated file"
Is it mandatory for the RM/committers to raise the PR to Docker Hub’s
official images repository, or can it be done by any contributor?

Also, I was thinking that once the KIP gets voted in, we should try to release
a kafka:3.7.0 (or 3.7.1) Docker Official Image. This will help us
validate the process and allow us to incorporate any changes suggested by
Docker Hub before the 3.8.0 release.


Thanks,

On Mon, Apr 8, 2024 at 2:33 PM Krish Vora  wrote:
>
> Hi Manikumar and Luke.
> Thanks for the questions.
>
> 1. No, the Docker inventory files and configurations will not be the same
> for Open Source Software (OSS) Images and Docker Official Images (DOI).
>
> For OSS images, the Dockerfile located in docker/jvm/dockerfile is
> utilized. This process is integrated with the existing release pipeline as
> outlined in KIP-975,
> where the Kafka URL is provided as a build argument. This method allows for
> building, testing, and releasing OSS images dynamically. The OSS images
> will continue to be released under the standard release process.
>
> In contrast, the release process for DOIs requires providing the Docker Hub
> team with a specific directory for each version release that contains a
> standalone Dockerfile. These Dockerfiles are designed to be
> self-sufficient, hence require hardcoded values instead of relying on build
> arguments. To accommodate this, in our proposed approach, a new directory
> named docker_official_images has been created. This directory contains
> version-specific directories, having Dockerfiles with hardcoded
> configurations for each release, acting as the source of truth for DOI
> releases. The hardcoded dockerfiles will be created using the
> docker/jvm/dockerfile as a template. Thus, as part of the post-release process we will
> be creating a Dockerfile that will be reviewed by the Docker Hub community
> and might need changes as per their review. This approach ensures that DOIs
> are built consistently and meet the specific requirements set by Docker Hub.
>
> 2. Yes Manikumar, transitioning the release of Docker Official Images (DOI)
> to a post-release activity does address the concerns about complicating the
> release process. Initially, we considered incorporating DOI release
> directly into Kafka's release workflow. However, this approach
> significantly increased the RM's workload due to the addition of numerous
> steps, complicating the process. By designating the DOI release as a
> post-release task, we maintain the original release process. This
> adjustment allows for the DOI release to be done after the main release. We
> have revised the KIP to reflect that DOI releases will now occur after the
> main release phase. Please review the updated document and provide any
> feedback you might have.
>
> Thanks,
> Krish.
>
> On Wed, Apr 3, 2024 at 3:35 PM Luke Chen  wrote:
>
> > Hi Krishna,
> >
> > I also have the same question as Manikumar raised:
> > 1. Will the Docker inventory files/etc are the same for OSS Image and
> > Docker Official Images?
> > If no, then why not? Could we make them identical so that we don't have to
> > build 2 images for each release?
> >
> > Thank you.
> > Luke
> >
> > On Wed, Apr 3, 2024 at 12:41 AM Manikumar 
> > wrote:
> >
> > > Hi Krishna,
> > >
> > > Thanks for the KIP.
> > >
> > > I think Docker Official Images will be beneficial to the Kafka community.
> > > Few queries below.
> > >
> > > 1. Will the Docker inventory files/etc are the same for OSS Image and
> > > Docker Official Images
> > > 2. I am a bit worried about the new steps to the release process. Maybe
> > we
> > > should consider Docker Official Images release as Post-Release activity.
> > >
> > > Thanks,
> > >
> > > On Fri, Mar 22, 2024 at 3:29 PM Krish Vora 
> > wrote:
> > >
> > > > Hi Hector,
> > > >
> > > > Thanks for reaching out. This KIP builds on top of KIP-975
> > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-975%3A+Docker+Image+for+Apache+Kafka>
> > > > and aims to introduce a JVM-based Docker Official Image (DOI) for Apache
> > > > Kafka that will be visible under Docker Official Images
> > > >