Re: [VOTE] 3.5.1 RC1

2023-07-17 Thread Luke Chen
Hi Divij,

I've run the following checks:
1. Downloaded kafka_2.12-3.5.1.tgz
2. Ran the quickstart using KRaft mode
3. Verified the checksum
4. Sanity-checked the javadoc

All looks good.
+1 (binding)

Thanks.
Luke

On Tue, Jul 18, 2023 at 5:15 AM Chris Egerton 
wrote:

> Hi Divij,
>
> Thanks for running this release!
>
> To verify, I:
> - Built from source using Java 11 with both:
> - - the 3.5.1-rc1 tag on GitHub
> - - the kafka-3.5.1-src.tgz artifact from
> https://home.apache.org/~divijv/kafka-3.5.1-rc1/
> - Checked signatures and checksums
> - Ran the quickstart using the kafka_2.13-3.5.1.tgz artifact from
> https://home.apache.org/~divijv/kafka-3.5.1-rc1/ with Java 11 and Scala 2.13
> in KRaft mode
> - Ran all unit tests
> - Ran all integration tests for Connect and MM2
> - Verified that only version 1.1.10.1 of Snappy is present in the libs/
> directory of the unpacked kafka_2.12-3.5.1.tgz and kafka_2.13-3.5.1.tgz
> artifacts
> - Verified that case-insensitive validation of the security.protocol
> property is restored for Kafka clients by setting it to "pLAiNTexT" with
> the bin/kafka-topics.sh command (using the --command-config option), and
> with a standalone Connect worker (by adjusting the security.protocol,
> consumer.security.protocol, producer.security.protocol, and
> admin.security.protocol properties in the worker config file)
>
> Everything looks good to me!
>
> +1 (binding)
>
> Cheers,
>
> Chris
>
> On Mon, Jul 17, 2023 at 12:29 PM Federico Valeri 
> wrote:
>
> > Hi Divij, I did the following checks:
> >
> > - Checked signature, checksum, licenses
> > - Spot checked documentation and javadoc
> > - Built from source with Java 17 and Scala 2.13
> > - Ran full unit and integration test suites
> > - Ran test Java app using staging Maven artifacts
> >
> > +1 (non binding)
> >
> > Cheers
> > Fede
> >
> > On Mon, Jul 17, 2023 at 10:27 AM Divij Vaidya 
> > wrote:
> > >
> > > Hello Kafka users, developers and client-developers,
> > >
> > > This is the second candidate (RC1) for release of Apache Kafka 3.5.1.
> > First
> > > release candidate (RC0) was discarded due to incorrect license files.
> > They
> > > have been fixed since then.
> > >
> > > This release is a security patch release. It upgrades the dependency,
> > > snappy-java, to a version which is not vulnerable to CVE-2023-34455.
> You
> > > can find more information about the CVE at Kafka CVE list
> > > .
> > >
> > > Additionally, this release fixes a regression introduced in 3.3.0,
> which
> > > caused security.protocol configuration values to be restricted to upper
> > > case only. With this release, security.protocol values are
> > > case insensitive. See KAFKA-15053
> > >  for details.
> > >
> > > Release notes for the 3.5.1 release:
> > > https://home.apache.org/~divijv/kafka-3.5.1-rc1/RELEASE_NOTES.html
> > >
> > > *** Please download, test and vote by Thursday, July 20, 9am PT
> > >
> > > Kafka's KEYS file containing PGP keys we use to sign the release:
> > > https://kafka.apache.org/KEYS
> > >
> > > Release artifacts to be voted upon (source and binary):
> > > https://home.apache.org/~divijv/kafka-3.5.1-rc1/
> > >
> > > Maven artifacts to be voted upon:
> > > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > >
> > > Javadoc:
> > > https://home.apache.org/~divijv/kafka-3.5.1-rc1/javadoc/
> > >
> > > Tag to be voted upon (off 3.5 branch) is the 3.5.1 tag:
> > > https://github.com/apache/kafka/releases/tag/3.5.1-rc1
> > >
> > > Documentation:
> > > https://kafka.apache.org/35/documentation.html
> > > Please note that documentation will be updated with upgrade notes (
> > >
> >
> https://github.com/apache/kafka/commit/4c78fd64454e25e3536e8c7ed5725d3fbe944a49
> > )
> > > after the release is complete.
> > >
> > > Protocol:
> > > https://kafka.apache.org/35/protocol.html
> > >
> > > Unit/integration tests:
> > > https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/43/ (2
> > failures)
> > > https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/42/ (6
> > failures)
> > > https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/39/ (9
> > failures)
> > >
> > > In all 3 runs above, there are no common tests which are failing, which
> > > leads me to believe that they are flaky. I have also verified that
> > > unit/integration tests on my local machine successfully pass (JDK 17 +
> > > Scala 2.13)
> > >
> > > System tests:
> > > Not planning to run system tests since this is a patch release.
> > >
> > > Thank you.
> > >
> > > --
> > > Divij Vaidya
> > > Release Manager for Apache Kafka 3.5.1
> >
>


Re: [DISCUSS] KIP-953: partition method to be overloaded to accept headers as well.

2023-07-17 Thread Sagar
Hi Jack,

Thanks for the KIP! Seems like an interesting idea. I have some feedback:

1) It would be great if you could clean up the text that seems to mimic the
KIP template. It is generally not required in the KIP.

2) In the Public Interfaces where you mentioned *Partitioner method in
**org/apache/kafka/clients/producer
will have the following update*, I believe you meant the Partitioner
*interface*?

3) Staying on Public Interfaces, it is generally preferable to add a
Javadocs section along with the newly added method. You could also describe
its default behaviour of invoking the existing method.

4) The option that is mentioned in the Rejected Alternatives seems more
like a workaround to the current problem that you are describing. That
could be added to the Motivation section IMO.

5) Can you also add some more examples of scenarios where this would be
helpful? The only scenario mentioned seems to have a workaround. Just
trying to ensure that we have a strong enough motivation before adding a
public API.

6) It would also be worth noting down what happens if users override both
methods, only one method (new or old), or no methods (the default
behaviour); see the sketch below. It would help in understanding the
proposal better.
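
For illustration, a rough sketch of the shape the overload could take, with
the default implementation delegating to the existing method (hypothetical;
the KIP defines the actual signature):

```java
import java.io.Closeable;

import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.Configurable;
import org.apache.kafka.common.header.Headers;

public interface Partitioner extends Configurable, Closeable {

    // Existing method, unchanged.
    int partition(String topic, Object key, byte[] keyBytes,
                  Object value, byte[] valueBytes, Cluster cluster);

    // Sketch of the proposed overload: receives the record headers and, by
    // default, ignores them and delegates to the existing method, so
    // implementations that only override the old method keep their current
    // behaviour.
    default int partition(String topic, Object key, byte[] keyBytes,
                          Object value, byte[] valueBytes, Cluster cluster,
                          Headers headers) {
        return partition(topic, key, keyBytes, value, valueBytes, cluster);
    }
}
```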

Thanks!
Sagar.


On Mon, Jul 17, 2023 at 9:19 PM Jack Tomy  wrote:

> Hey everyone,
>
> Not seeing much discussion on the KIP. Might be because it is too
> obvious 😉.
>
> If there are no more comments, I will start the VOTE in the coming days.
>
> On Sat, Jul 15, 2023 at 8:48 PM Jack Tomy  wrote:
>
> > Hey everyone,
> >
> > Please take a look at the KIP below and provide your suggestions and
> > feedback. TIA.
> >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263424937
> >
> >
> > --
> > Best Regards
> > *Jack*
> >
>
>
> --
> Best Regards
> *Jack*
>


[GitHub] [kafka-site] mjsax commented on pull request #528: MINOR: Add statement about ZK deprecation to 3.5 release blog post

2023-07-17 Thread via GitHub


mjsax commented on PR #528:
URL: https://github.com/apache/kafka-site/pull/528#issuecomment-1639047530

   We got a PR for the ZK section: https://github.com/apache/kafka/pull/14031
   
   In case you want to take a look.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2008

2023-07-17 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 387636 lines...]
Gradle Test Run :streams:integrationTest > Gradle Test Executor 182 > 
RestoreIntegrationTest > shouldRestoreStateFromSourceTopic(boolean) > [2] false 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
OptimizedKTableIntegrationTest > shouldApplyUpdatesToStandbyStore(TestInfo) 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 182 > 
RestoreIntegrationTest > shouldRestoreStateFromSourceTopic(boolean) > [2] false 
PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 182 > 
RestoreIntegrationTest > shouldSuccessfullyStartWhenLoggingDisabled(boolean) > 
[1] true STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
OptimizedKTableIntegrationTest > shouldApplyUpdatesToStandbyStore(TestInfo) 
PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 182 > 
RestoreIntegrationTest > shouldSuccessfullyStartWhenLoggingDisabled(boolean) > 
[1] true PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 182 > 
RestoreIntegrationTest > shouldSuccessfullyStartWhenLoggingDisabled(boolean) > 
[2] false STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
RestoreIntegrationTest > 
shouldRecycleStateFromStandbyTaskPromotedToActiveTaskAndNotRestore(boolean) > 
[1] true STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 182 > 
RestoreIntegrationTest > shouldSuccessfullyStartWhenLoggingDisabled(boolean) > 
[2] false PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 182 > 
RestoreIntegrationTest > shouldRestoreStateFromChangelogTopic(boolean) > [1] 
true STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 182 > 
RestoreIntegrationTest > shouldRestoreStateFromChangelogTopic(boolean) > [1] 
true PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 182 > 
RestoreIntegrationTest > shouldRestoreStateFromChangelogTopic(boolean) > [2] 
false STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 182 > 
RestoreIntegrationTest > shouldRestoreStateFromChangelogTopic(boolean) > [2] 
false PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 182 > 
SmokeTestDriverIntegrationTest > shouldWorkWithRebalance(boolean) > [1] true 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
RestoreIntegrationTest > 
shouldRecycleStateFromStandbyTaskPromotedToActiveTaskAndNotRestore(boolean) > 
[1] true PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
RestoreIntegrationTest > 
shouldRecycleStateFromStandbyTaskPromotedToActiveTaskAndNotRestore(boolean) > 
[2] false STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
RestoreIntegrationTest > 
shouldRecycleStateFromStandbyTaskPromotedToActiveTaskAndNotRestore(boolean) > 
[2] false PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
RestoreIntegrationTest > 
shouldProcessDataFromStoresWithLoggingDisabled(boolean) > [1] true STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
RestoreIntegrationTest > 
shouldProcessDataFromStoresWithLoggingDisabled(boolean) > [1] true PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
RestoreIntegrationTest > 
shouldProcessDataFromStoresWithLoggingDisabled(boolean) > [2] false STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 182 > 
SmokeTestDriverIntegrationTest > shouldWorkWithRebalance(boolean) > [1] true 
PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 182 > 
SmokeTestDriverIntegrationTest > shouldWorkWithRebalance(boolean) > [2] false 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
RestoreIntegrationTest > 
shouldProcessDataFromStoresWithLoggingDisabled(boolean) > [2] false PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
RestoreIntegrationTest > shouldRestoreNullRecord() STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
RestoreIntegrationTest > shouldRestoreNullRecord() PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
RestoreIntegrationTest > shouldRestoreStateFromSourceTopic(boolean) > [1] true 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
RestoreIntegrationTest > shouldRestoreStateFromSourceTopic(boolean) > [1] true 
PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
RestoreIntegrationTest > shouldRestoreStateFromSourceTopic(boolean) > [2] false 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 178 > 
RestoreIntegrationTest > shouldRestoreStateFromSourceTopic(boolean) > [2] false 
PASSED

Gradle Test Run :str

Re: [VOTE] 3.5.1 RC1

2023-07-17 Thread Chris Egerton
Hi Divij,

Thanks for running this release!

To verify, I:
- Built from source using Java 11 with both:
- - the 3.5.1-rc1 tag on GitHub
- - the kafka-3.5.1-src.tgz artifact from
https://home.apache.org/~divijv/kafka-3.5.1-rc1/
- Checked signatures and checksums
- Ran the quickstart using the kafka_2.13-3.5.1.tgz artifact from
https://home.apache.org/~divijv/kafka-3.5.1-rc1/ with Java 11 and Scala 2.13
in KRaft mode
- Ran all unit tests
- Ran all integration tests for Connect and MM2
- Verified that only version 1.1.10.1 of Snappy is present in the libs/
directory of the unpacked kafka_2.12-3.5.1.tgz and kafka_2.13-3.5.1.tgz
artifacts
- Verified that case-insensitive validation of the security.protocol
property is restored for Kafka clients by setting it to "pLAiNTexT" with
the bin/kafka-topics.sh command (using the --command-config option), and
with a standalone Connect worker (by adjusting the security.protocol,
consumer.security.protocol, producer.security.protocol, and
admin.security.protocol properties in the worker config file; see the
sketch below)
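
For anyone wanting to reproduce that last check, the relevant settings look
roughly like this (file layout illustrative only; the property names are the
ones listed above):

```
# client.properties, passed to bin/kafka-topics.sh via --command-config
security.protocol=pLAiNTexT

# standalone Connect worker config
security.protocol=pLAiNTexT
consumer.security.protocol=pLAiNTexT
producer.security.protocol=pLAiNTexT
admin.security.protocol=pLAiNTexT
```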

Everything looks good to me!

+1 (binding)

Cheers,

Chris

On Mon, Jul 17, 2023 at 12:29 PM Federico Valeri 
wrote:

> Hi Divij, I did the following checks:
>
> - Checked signature, checksum, licenses
> - Spot checked documentation and javadoc
> - Built from source with Java 17 and Scala 2.13
> - Ran full unit and integration test suites
> - Ran test Java app using staging Maven artifacts
>
> +1 (non binding)
>
> Cheers
> Fede
>
> On Mon, Jul 17, 2023 at 10:27 AM Divij Vaidya 
> wrote:
> >
> > Hello Kafka users, developers and client-developers,
> >
> > This is the second candidate (RC1) for release of Apache Kafka 3.5.1.
> First
> > release candidate (RC0) was discarded due to incorrect license files.
> They
> > have been fixed since then.
> >
> > This release is a security patch release. It upgrades the dependency,
> > snappy-java, to a version which is not vulnerable to CVE-2023-34455. You
> > can find more information about the CVE at Kafka CVE list
> > .
> >
> > Additionally, this release fixes a regression introduced in 3.3.0, which
> > caused security.protocol configuration values to be restricted to upper
> > case only. With this release, security.protocol values are
> > case insensitive. See KAFKA-15053
> >  for details.
> >
> > Release notes for the 3.5.1 release:
> > https://home.apache.org/~divijv/kafka-3.5.1-rc1/RELEASE_NOTES.html
> >
> > *** Please download, test and vote by Thursday, July 20, 9am PT
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > https://kafka.apache.org/KEYS
> >
> > Release artifacts to be voted upon (source and binary):
> > https://home.apache.org/~divijv/kafka-3.5.1-rc1/
> >
> > Maven artifacts to be voted upon:
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> >
> > Javadoc:
> > https://home.apache.org/~divijv/kafka-3.5.1-rc1/javadoc/
> >
> > Tag to be voted upon (off 3.5 branch) is the 3.5.1 tag:
> > https://github.com/apache/kafka/releases/tag/3.5.1-rc1
> >
> > Documentation:
> > https://kafka.apache.org/35/documentation.html
> > Please note that documentation will be updated with upgrade notes (
> >
> https://github.com/apache/kafka/commit/4c78fd64454e25e3536e8c7ed5725d3fbe944a49
> )
> > after the release is complete.
> >
> > Protocol:
> > https://kafka.apache.org/35/protocol.html
> >
> > Unit/integration tests:
> > https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/43/ (2
> failures)
> > https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/42/ (6
> failures)
> > https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/39/ (9
> failures)
> >
> > In all 3 runs above, there are no common tests which are failing, which
> > leads me to believe that they are flaky. I have also verified that
> > unit/integration tests on my local machine successfully pass (JDK 17 +
> > Scala 2.13)
> >
> > System tests:
> > Not planning to run system tests since this is a patch release.
> >
> > Thank you.
> >
> > --
> > Divij Vaidya
> > Release Manager for Apache Kafka 3.5.1
>


[jira] [Created] (KAFKA-15202) MM2 OffsetSyncStore clears too many syncs when sync spacing is variable

2023-07-17 Thread Greg Harris (Jira)
Greg Harris created KAFKA-15202:
---

 Summary: MM2 OffsetSyncStore clears too many syncs when sync 
spacing is variable
 Key: KAFKA-15202
 URL: https://issues.apache.org/jira/browse/KAFKA-15202
 Project: Kafka
  Issue Type: Bug
  Components: mirrormaker
Affects Versions: 3.4.1, 3.5.0, 3.3.3
Reporter: Greg Harris


The spacing between OffsetSyncs can vary significantly, due to conditions in 
the upstream topic and in the replication rate of the MirrorSourceTask.

The OffsetSyncStore attempts to keep a maximal number of distinct syncs 
present, and for regularly spaced syncs it does not allow an incoming sync to 
expire more than one other unique sync. There are tests to enforce this 
property.

For variable spaced syncs, there is no such guarantee, because multiple 
fine-grained syncs may need to be expired at the same time. However, instead of 
only those fine-grained syncs being expired, the store may also expire 
coarser-grained syncs. This causes a large decrease in the number of unique 
syncs.

This is an extremely simple example:

* Syncs: 0 (start), 1, 2, 4.
The result:
```
TRACE New sync OffsetSync\{topicPartition=topic1-2, upstreamOffset=1, 
downstreamOffset=1} applied, new state is [1:1,0:0] 
(org.apache.kafka.connect.mirror.OffsetSyncStore:194)
TRACE New sync OffsetSync\{topicPartition=topic1-2, upstreamOffset=2, 
downstreamOffset=2} applied, new state is [2:2,1:1,0:0] 
(org.apache.kafka.connect.mirror.OffsetSyncStore:194)
TRACE New sync OffsetSync\{topicPartition=topic1-2, upstreamOffset=4, 
downstreamOffset=4} applied, new state is [4:4,0:0] 
(org.apache.kafka.connect.mirror.OffsetSyncStore:194)
```
Instead of being expired, the `2:2` sync should still be present in the final 
state, allowing the store to maintain 3 unique syncs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15201) When git fails, script goes into a loop

2023-07-17 Thread Divij Vaidya (Jira)
Divij Vaidya created KAFKA-15201:


 Summary: When git fails, script goes into a loop
 Key: KAFKA-15201
 URL: https://issues.apache.org/jira/browse/KAFKA-15201
 Project: Kafka
  Issue Type: Sub-task
Reporter: Divij Vaidya


When the git push to remote fails (let's say with an authentication error), 
the script runs into a loop. It should not retry, and should instead fail 
gracefully.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15200) verify pre-requisite at start of release.py

2023-07-17 Thread Divij Vaidya (Jira)
Divij Vaidya created KAFKA-15200:


 Summary: verify pre-requisite at start of release.py
 Key: KAFKA-15200
 URL: https://issues.apache.org/jira/browse/KAFKA-15200
 Project: Kafka
  Issue Type: Sub-task
Reporter: Divij Vaidya


At the start of release.py, the first thing it should do is verify that 
dependency pre-requisites are satisfied. The pre-requisites are:

1. maven should be installed.
2. sftp should be installed. Connection to @home.apache.org should be 
successful. Currently this is done manually at the step "Verify by using 
`sftp @home.apache.org`" in 
[https://cwiki.apache.org/confluence/display/KAFKA/Release+Process]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15199) remove leading and trailing spaces from user input in release.py

2023-07-17 Thread Divij Vaidya (Jira)
Divij Vaidya created KAFKA-15199:


 Summary: remove leading and trailing spaces from user input in 
release.py
 Key: KAFKA-15199
 URL: https://issues.apache.org/jira/browse/KAFKA-15199
 Project: Kafka
  Issue Type: Sub-task
Reporter: Divij Vaidya






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15198) Improve version release scripts

2023-07-17 Thread Divij Vaidya (Jira)
Divij Vaidya created KAFKA-15198:


 Summary: Improve version release scripts
 Key: KAFKA-15198
 URL: https://issues.apache.org/jira/browse/KAFKA-15198
 Project: Kafka
  Issue Type: Improvement
Reporter: Divij Vaidya


This is the parent Jira to track improvement items for version release process.

See: [https://cwiki.apache.org/confluence/display/KAFKA/Release+Process] for 
the complete release process.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15197) Flaky test MirrorConnectorsIntegrationExactlyOnceTest.testOffsetTranslationBehindReplicationFlow()

2023-07-17 Thread Divij Vaidya (Jira)
Divij Vaidya created KAFKA-15197:


 Summary: Flaky test 
MirrorConnectorsIntegrationExactlyOnceTest.testOffsetTranslationBehindReplicationFlow()
 Key: KAFKA-15197
 URL: https://issues.apache.org/jira/browse/KAFKA-15197
 Project: Kafka
  Issue Type: Test
  Components: mirrormaker
Reporter: Divij Vaidya
 Fix For: 3.6.0


As of Jul 17th, this is the second most flaky test in our CI on trunk and fails 
46% of the time.

See: 
[https://ge.apache.org/scans/tests?search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.timeZoneId=Europe/Berlin]
 

Note that MirrorConnectorsIntegrationExactlyOnceTest has multiple tests but 
testOffsetTranslationBehindReplicationFlow is the one that is the reason for 
most failures. see: 
[https://ge.apache.org/scans/tests?search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.timeZoneId=Europe/Berlin&tests.container=org.apache.kafka.connect.mirror.integration.MirrorConnectorsIntegrationExactlyOnceTest]
 

 

Reason for failure is: 
org.opentest4j.AssertionFailedError: Condition not met within timeout 2. 
Offsets for consumer group consumer-group-lagging-behind not translated from 
primary for topic primary.test-topic-1 ==> expected:  but was: 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2007

2023-07-17 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-15196) Additional ZK migration metrics

2023-07-17 Thread David Arthur (Jira)
David Arthur created KAFKA-15196:


 Summary: Additional ZK migration metrics
 Key: KAFKA-15196
 URL: https://issues.apache.org/jira/browse/KAFKA-15196
 Project: Kafka
  Issue Type: Sub-task
Reporter: David Arthur
Assignee: David Arthur


This issue is to track the remaining metrics defined in KIP-866. So far, we 
have ZkMigrationState and ZkWriteBehindLag.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-951: Leader discovery optimisations for the client

2023-07-17 Thread Philip Nee
Hey Mayank:

For #1: I think fetch and produce behave a bit differently on metadata.
Maybe it is worth highlighting the changes for each client in detail. For the
producer, did you mean the metadata timeout before sending out produce
requests? For the consumer, I think fetches require the user to retry if the
position does not exist on the leader. I don't have the details off the top of
my head, but I think we should lay out these behavioral changes.

For #3: Thanks for the clarification.

On Mon, Jul 17, 2023 at 10:39 AM Mayank Shekhar Narula <
mayanks.nar...@gmail.com> wrote:

> Philip
>
> 1. Good call out about "poll" behaviour; my understanding is the same. I am
> assuming it's about the motivation section of the KIP. There, by "async", my
> intention was to convey that the client doesn't wait for the metadata refresh
> before a subsequent retry of the produce or fetch request that failed due to
> stale metadata (i.e. going to an old leader). The only wait the client has is
> the configured retry delay.
>
> 2. Yes, in theory other APIs could benefit from this too. But that is
> outside of the scope of the KIP.
>
> 3. Do you mean the response for the Metadata RPC? I think brokers always
> have a view of the cluster; although it can be stale, it would always return
> a leader (whether old or new).
>
> Mayank
>
> On Fri, Jul 14, 2023 at 8:53 PM Philip Nee  wrote:
>
> > Hey Mayank,
> >
> > Thanks for the KIP. I think this is a great proposal, and I'm in favor
> > of this idea.  A few comments:
> >
> > 1. Claiming metadata refresh is done asynchronously is misleading.  The
> > metadata refresh requires Network Client to be physically polled, which
> is
> > done in a separate thread in Producer and Admin Client (IIRC!) but not
> > Consumer.
> > 2. There are other API calls that might result in NOT_LEADER_OR_FOLLOWER
> > response, but it seems like this KIP only wants to update on fetch and
> > produce. Do we want to make the leader information available for other
> API
> > calls?
> > 3. Do you know what would happen during a leader election? I'm not sure
> > about this process and I wonder if the current metadata response uses the
> > old leader or null as the leader isn't readily available yet.
> >
> > Thanks,
> > P
> >
> > On Fri, Jul 14, 2023 at 11:30 AM Kirk True  wrote:
> >
> > > Hi Mayank,
> > >
> > > > On Jul 14, 2023, at 11:25 AM, Mayank Shekhar Narula <
> > > mayanks.nar...@gmail.com> wrote:
> > > >
> > > > Kirk
> > > >
> > > >
> > > >> Is the requested restructuring of the response “simply” to preserve
> > > bytes,
> > > >> or is it possible that the fetch response could/should/would return
> > >> leadership changes for partitions that were specifically requested?
> > > >>
> > > >
> > > > Moving endpoints to top-level fields would preserve bytes, otherwise
> > the
> > > > endpoint-information would be duplicated if included with the
> > > > partition-data in the response. Right now, the top-level field will
> > only
> > > be
> > > > set in case leader changes for any requested partitions. But it can
> be
> > > > re-used in the future, for which Jose has a use-case in mind shared
> up
> > in
> > > the thread. KIP is now up to date with endpoint info being at
> top-level.
> > >
> > >
> > > I didn’t catch before that there was a separate section for the full
> node
> > > information, not just the ID and epoch.
> > >
> > > Thanks!
> > >
> > > >>> 3. In the future, I may use this information in the KRaft/Metadata
> > > >>> implementation of FETCH. In that implementation not all of the
> > > >>> replicas are brokers.
> > > >>
> > > >> Side point: any references to the change you’re referring to? The
> idea
> > > of
> > > >> non-brokers serving as replicas is blowing my mind a bit :)
> > > >>
> > > >>
> > > > Jose, I missed this as well, would love to know more about non-broker
> > > > serving as replica!
> > > > --
> > > > Regards,
> > > > Mayank Shekhar Narula
> > >
> > >
> >
>
>
> --
> Regards,
> Mayank Shekhar Narula
>


Re: [DISCUSS] KIP-951: Leader discovery optimisations for the client

2023-07-17 Thread Mayank Shekhar Narula
Philip

1. Good call out about "poll" behaviour; my understanding is the same. I am
assuming it's about the motivation section of the KIP. There, by "async", my
intention was to convey that the client doesn't wait for the metadata refresh
before a subsequent retry of the produce or fetch request that failed due to
stale metadata (i.e. going to an old leader). The only wait the client has is
the configured retry delay.

2. Yes, in theory other APIs could benefit from this too. But that is
outside of the scope of the KIP.

3. Do you mean the response for the Metadata RPC? I think brokers always
have a view of the cluster; although it can be stale, it would always return
a leader (whether old or new).

Mayank

On Fri, Jul 14, 2023 at 8:53 PM Philip Nee  wrote:

> Hey Mayank,
>
> Thanks for the KIP. I think this is a great proposal, and I'm in favor
> of this idea.  A few comments:
>
> 1. Claiming metadata refresh is done asynchronously is misleading.  The
> metadata refresh requires Network Client to be physically polled, which is
> done in a separate thread in Producer and Admin Client (IIRC!) but not
> Consumer.
> 2. There are other API calls that might result in NOT_LEADER_OR_FOLLOWER
> response, but it seems like this KIP only wants to update on fetch and
> produce. Do we want to make the leader information available for other API
> calls?
> 3. Do you know what would happen during a leader election? I'm not sure
> about this process and I wonder if the current metadata response uses the
> old leader or null as the leader isn't readily available yet.
>
> Thanks,
> P
>
> On Fri, Jul 14, 2023 at 11:30 AM Kirk True  wrote:
>
> > Hi Mayank,
> >
> > > On Jul 14, 2023, at 11:25 AM, Mayank Shekhar Narula <
> > mayanks.nar...@gmail.com> wrote:
> > >
> > > Kirk
> > >
> > >
> > >> Is the requested restructuring of the response “simply” to preserve
> > bytes,
> > >> or is it possible that the fetch response could/should/would return
> > >> leadership changes for partitions that were specifically requested?
> > >>
> > >
> > > Moving endpoints to top-level fields would preserve bytes, otherwise
> the
> > > endpoint-information would be duplicated if included with the
> > > partition-data in the response. Right now, the top-level field will
> only
> > be
> > > set in case leader changes for any requested partitions. But it can be
> > > re-used in the future, for which Jose has a use-case in mind shared up
> in
> > > the thread. KIP is now up to date with endpoint info being at top-level.
> >
> >
> > I didn’t catch before that there was a separate section for the full node
> > information, not just the ID and epoch.
> >
> > Thanks!
> >
> > >>> 3. In the future, I may use this information in the KRaft/Metadata
> > >>> implementation of FETCH. In that implementation not all of the
> > >>> replicas are brokers.
> > >>
> > >> Side point: any references to the change you’re referring to? The idea
> > of
> > >> non-brokers serving as replicas is blowing my mind a bit :)
> > >>
> > >>
> > > Jose, I missed this as well, would love to know more about non-broker
> > > serving as replica!
> > > --
> > > Regards,
> > > Mayank Shekhar Narula
> >
> >
>


-- 
Regards,
Mayank Shekhar Narula


Re: [DISCUSS] KIP-951: Leader discovery optimisations for the client

2023-07-17 Thread Mayank Shekhar Narula
Kirk

1. For the client, it doesn't matter whether the server is KRaft or ZK.
Client behaviour will simply be driven by the protocol changes proposed for
the FetchResponse & ProduceResponse. On the server side, there will be
differences in how a new leader is discovered depending on whether it's ZK
or KRaft. But again, that wouldn't affect the client implementation.

2. You are right: whenever new leader info advances the client's view of
leadership, the client should retry immediately, irrespective of whether that
information arrives after many retries. So in our example, the 4th attempt
would retry immediately against the new leader. For all other cases, retries
would be subject to the configured delay; so if the 4th attempt failed due to
non-leadership-related issues, the 5th attempt would be delayed. A rough
sketch of this decision is below.
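
As an illustration only (a sketch; the method and parameter names are
hypothetical, not the client's actual internals):

```java
// Sketch: pick the delay before the next retry of a produce/fetch request.
// If the response carried leader info that advances what the client already
// knows (a higher leader epoch), retry immediately; otherwise apply the
// configured retry backoff.
static long nextRetryDelayMs(int knownLeaderEpoch, int responseLeaderEpoch,
                             long retryBackoffMs) {
    boolean leadershipAdvanced = responseLeaderEpoch > knownLeaderEpoch;
    return leadershipAdvanced ? 0L : retryBackoffMs;
}
```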

Thanks
Mayank


On Fri, Jul 14, 2023 at 7:20 PM Kirk True  wrote:

> Hi Mayank,
>
> Thanks for the KIP!
>
> Questions:
>
> 1. From the standpoint of the client, does it matter if the cluster is
> running in KRaft mode vs. Zookeeper? Will the behavior somehow be subtlety
> different given that metadata propagation is handled differently between
> the two?
>
> 2. Is there anything we need to do to handle the case where the new leader
> information appears after retries? Suppose the first two attempts to send a
> produce fail, in which case we hit the backoff logic. On the third attempt,
> the broker/node returns new leader information. Would the fourth attempt
> (with the new leader) still be performed without any delay? To be honest,
> I’m not sure that case is valid, but I would assume it would retry
> immediately, right?
>
> Thanks,
> Kirk
>
> > On Jul 13, 2023, at 7:15 AM, Mayank Shekhar Narula <
> mayanks.nar...@gmail.com> wrote:
> >
> > Hi everyone
> >
> > Following KIP is up for discussion. Thanks for your feedback.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-951%3A+Leader+discovery+optimisations+for+the+client
> >
> > --
> > Regards,
> > Mayank Shekhar Narula
>
>

-- 
Regards,
Mayank Shekhar Narula


[jira] [Resolved] (KAFKA-14884) Include check transaction is still ongoing right before append

2023-07-17 Thread Justine Olshan (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justine Olshan resolved KAFKA-14884.

Resolution: Fixed

> Include check transaction is still ongoing right before append 
> ---
>
> Key: KAFKA-14884
> URL: https://issues.apache.org/jira/browse/KAFKA-14884
> Project: Kafka
>  Issue Type: Sub-task
>Affects Versions: 3.6.0
>Reporter: Justine Olshan
>Assignee: Justine Olshan
>Priority: Blocker
>
> Even after checking via AddPartitionsToTxn, the transaction could be aborted 
> after the response. We can add one more check before appending.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-15193) Allow choosing Platform specific rocksDBJNI jar

2023-07-17 Thread Utkarsh Khare (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Utkarsh Khare resolved KAFKA-15193.
---
Resolution: Not A Problem

> Allow choosing Platform specific rocksDBJNI jar
> ---
>
> Key: KAFKA-15193
> URL: https://issues.apache.org/jira/browse/KAFKA-15193
> Project: Kafka
>  Issue Type: Improvement
>  Components: build
>Reporter: Utkarsh Khare
>Assignee: Utkarsh Khare
>Priority: Major
>
> RocksDBJNI uber jar is currently at ~58MBs. 
> There are smaller platform-specific jars available for RocksDB. 
> This ticket is created to allow developers to choose platform-specific 
> RocksDB jar via build arguments, which is much smaller and can considerably 
> reduce the generated RPM size.  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] 3.5.1 RC1

2023-07-17 Thread Federico Valeri
Hi Divij, I did the following checks:

- Checked signature, checksum, licenses
- Spot checked documentation and javadoc
- Built from source with Java 17 and Scala 2.13
- Ran full unit and integration test suites
- Ran test Java app using staging Maven artifacts

+1 (non binding)

Cheers
Fede

On Mon, Jul 17, 2023 at 10:27 AM Divij Vaidya  wrote:
>
> Hello Kafka users, developers and client-developers,
>
> This is the second candidate (RC1) for release of Apache Kafka 3.5.1. First
> release candidate (RC0) was discarded due to incorrect license files. They
> have been fixed since then.
>
> This release is a security patch release. It upgrades the dependency,
> snappy-java, to a version which is not vulnerable to CVE-2023-34455. You
> can find more information about the CVE at Kafka CVE list
> .
>
> > Additionally, this release fixes a regression introduced in 3.3.0, which
> caused security.protocol configuration values to be restricted to upper
> case only. With this release, security.protocol values are
> case insensitive. See KAFKA-15053
>  for details.
>
> Release notes for the 3.5.1 release:
> https://home.apache.org/~divijv/kafka-3.5.1-rc1/RELEASE_NOTES.html
>
> *** Please download, test and vote by Thursday, July 20, 9am PT
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> https://kafka.apache.org/KEYS
>
> Release artifacts to be voted upon (source and binary):
> https://home.apache.org/~divijv/kafka-3.5.1-rc1/
>
> Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>
> Javadoc:
> https://home.apache.org/~divijv/kafka-3.5.1-rc1/javadoc/
>
> Tag to be voted upon (off 3.5 branch) is the 3.5.1 tag:
> https://github.com/apache/kafka/releases/tag/3.5.1-rc1
>
> Documentation:
> https://kafka.apache.org/35/documentation.html
> Please note that documentation will be updated with upgrade notes (
> https://github.com/apache/kafka/commit/4c78fd64454e25e3536e8c7ed5725d3fbe944a49)
> after the release is complete.
>
> Protocol:
> https://kafka.apache.org/35/protocol.html
>
> Unit/integration tests:
> https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/43/ (2 failures)
> https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/42/ (6 failures)
> https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/39/ (9 failures)
>
> In all 3 runs above, there are no common tests which are failing, which
> leads me to believe that they are flaky. I have also verified that
> unit/integration tests on my local machine successfully pass (JDK 17 +
> Scala 2.13)
>
> System tests:
> Not planning to run system tests since this is a patch release.
>
> Thank you.
>
> --
> Divij Vaidya
> Release Manager for Apache Kafka 3.5.1


[DISCUSS] KIP-952: Regenerate segment-aligned producer snapshots when upgrading to a Kafka version supporting Tiered Storage

2023-07-17 Thread Christo Lolov
Hello!

A customer upgrading from Kafka < 2.8 to the future release 3.6 and wanting
to enable tiered storage would have to take responsibility for ensuring
that all segments lacking a producer snapshot file have expired and are
deleted before enabling the feature.

In our experience customers are not aware of this limitation and expect to
be able to enable the feature as soon as their upgrade is complete. If they
do this today, however, this results in NPEs. As such, one could argue this
is a blocker for 3.6 due to the non-direct upgrade path from versions < 2.8.

I would like to start a discussion on KIP-952: Regenerate segment-aligned
producer snapshots when upgrading to a Kafka version supporting Tiered
Storage (https://cwiki.apache.org/confluence/x/dIuzDw) which aims to solve
this issue.

Best,
Christo


Re: [DISCUSS] KIP-953: partition method to be overloaded to accept headers as well.

2023-07-17 Thread Jack Tomy
Hey everyone,

Not seeing much discussion on the KIP. Might be because it is too
obvious 😉.

If there are no more comments, I will start the VOTE in the coming days.

On Sat, Jul 15, 2023 at 8:48 PM Jack Tomy  wrote:

> Hey everyone,
>
> Please take a look at the KIP below and provide your suggestions and
> feedback. TIA.
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263424937
>
>
> --
> Best Regards
> *Jack*
>


-- 
Best Regards
*Jack*


[jira] [Created] (KAFKA-15195) Regenerate segment-aligned producer snapshots when upgrading to a Kafka version supporting Tiered Storage

2023-07-17 Thread Christo Lolov (Jira)
Christo Lolov created KAFKA-15195:
-

 Summary: Regenerate segment-aligned producer snapshots when 
upgrading to a Kafka version supporting Tiered Storage
 Key: KAFKA-15195
 URL: https://issues.apache.org/jira/browse/KAFKA-15195
 Project: Kafka
  Issue Type: Sub-task
Affects Versions: 3.6.0
Reporter: Christo Lolov
Assignee: Christo Lolov


As mentioned in KIP-405: Kafka Tiered Storage#Upgrade a customer wishing to 
upgrade from a Kafka version < 2.8.0 to 3.6 and turn Tiered Storage on will 
have to wait for retention to clean up segments without an associated producer 
snapshot.

However, in our experience, customers of Kafka expect to be able to immediately 
enable tiering on a topic once their cluster upgrade is complete. Once they do 
this, however, they start seeing NPEs and no data is uploaded to Tiered Storage 
(https://github.com/apache/kafka/blob/9e50f7cdd37f923cfef4711cf11c1c5271a0a6c7/storage/api/src/main/java/org/apache/kafka/server/log/remote/storage/LogSegmentData.java#L61).

To achieve this, we propose changing Kafka to retroactively create producer 
snapshot files on upload whenever a segment is due to be archived and lacks one.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-951: Leader discovery optimisations for the client

2023-07-17 Thread José Armando García Sancio
Hi Kirk,

On Fri, Jul 14, 2023 at 10:59 AM Kirk True  wrote:
> Is the requested restructuring of the response “simply” to preserve bytes, or 
> is it possible that the fetch response could/should/would return leadership 
> changes for partitions that were specifically requested?

Both. My reasoning for the restructuring is that embedding the node
endpoint in the partition response would lead to duplicate information
being returned and as you point out the node endpoint information is
orthogonal to the partition leader.

> > 3. In the future, I may use this information in the KRaft/Metadata
> > implementation of FETCH. In that implementation not all of the
> > replicas are brokers.
>
> Side point: any references to the change you’re referring to? The idea of 
> non-brokers serving as replicas is blowing my mind a bit :)

I am especially referring to the Draft KIP for KRaft Controller
Membership Change (https://cwiki.apache.org/confluence/x/nyH1D). The
Fetch RPC is used by KRaft's cluster metadata partition which
implements a different consensus protocol that is used by both
Controllers and Brokers.

Thanks!
-- 
-José


Re: Re: [DISCUSS] KIP-951: Leader discovery optimisations for the client

2023-07-17 Thread Mayank Shekhar Narula
Emanuele

I agree with this. That's why I quoted below:

> I wonder if non-Kafka clients might benefit from not bumping the
> version. If versions are bumped, say for FetchResponse to 16, I believe
> that client would have to support all versions until 16 to fully
> utilise this feature. Whereas, if not bumped, they can simply support until
> version 12( will change to version:12 for tagged fields ), and non-AK
> clients can then implement this feature. What do you think? I am inclined
> to
> not bump.


I will explicitly highlight the reason for not bumping in the KIP. Unless I
hear otherwise, i.e. strong opposition, I am inclined to not bump it.
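
Concretely, not bumping would mean declaring the new field against the
existing version, along the lines of this sketch (adapted from the schema
quoted earlier in the thread; the KIP carries the authoritative schema):

```
{ "name": "LeaderEndpoints", "type": "[]Leader", "versions": "12+",
  "taggedVersions": "12+", "tag": 3,
  "about": "Endpoints for all current leaders enumerated in PartitionData." }
```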

On Mon, Jul 17, 2023 at 11:54 AM Emanuele Sabellico
 wrote:

> The downside of bumping the version is that clients have to have all
> the latest features implemented before being able to benefit from this
> performance improvement.
> One of the benefits of using a tagged field is to make the field
> available to previous versions too.
> Choosing a minimum value for taggedVersions could be an alternative.
>
> Emanuele
>
> On 2023/07/13 17:30:45 Andrew Schofield wrote:
>  > Hi Mayank,
>  > If we bump the version, the broker can tell whether it’s worth
> providing the leader
>  > endpoint information to the client when the leader has changed.
> That’s my reasoning.
>  >
>  > Thanks,
>  > Andrew
>  >
>  > > On 13 Jul 2023, at 18:02, Mayank Shekhar Narula wrote:
>  > >
>  > > Thanks both for looking into this.
>  > >
>  > > Jose,
>  > >
>  > > 1/2 & 4(changes for PRODUCE) & 5 makes sense, will follow
>  > >
>  > > 3. If I understood this correctly, certain replicas "aren't"
> brokers, what
>  > > are they then?
>  > >
>  > > Also how about replacing "Replica" with "Leader", this is more
> readable on
>  > > the client. so, how about this?
>  > > { "name": "LeaderEndpoints", "type": "[]Leader", "versions": "15+",
>  > > "taggedVersions": "15+", "tag": 3,
>  > > "about": "Endpoints for all current leaders enumerated in
>  > > PartitionData.", "fields": [
>  > > { "name": "NodeId", "type": "int32", "versions": "15+",
>  > > "mapKey": true, "entityType": "brokerId", "about": "The ID of the
>  > > associated leader"},
>  > > { "name": "Host", "type": "string", "versions": "15+",
>  > > "about": "The leader's hostname." },
>  > > { "name": "Port", "type": "int32", "versions": "15+",
>  > > "about": "The leader's port." },
>  > > { "name": "Rack", "type": "string", "versions": "15+", "ignorable":
>  > > true, "default": "null",
>  > > "about": "The rack of the leader, or null if it has not been
>  > > assigned to a rack." }
>  > > ]}
>  > >
>  > > Andrew
>  > >
>  > > 6. I wonder if non-Kafka clients might benefit from not bumping the
>  > > version. If versions are bumped, say for FetchResponse to 16, I
> believe
>  > > that client would have to support all versions until 16 to fully
> utilise
>  > > this feature. Whereas, if not bumped, they can simply support until
> version
>  > > 12( will change to version:12 for tagged fields ), and non-AK
> clients can
>  > > then implement this feature. What do you think? I am inclined to
> not bump.
>  > >
>  > > On Thu, Jul 13, 2023 at 5:21 PM Andrew Schofield <
>  > > andrew_schofield_j...@outlook.com> wrote:
>  > >
>  > >> Hi José,
>  > >> Thanks. Sounds good.
>  > >>
>  > >> Andrew
>  > >>
>  > >>> On 13 Jul 2023, at 16:45, José Armando García Sancio
>  > >> wrote:
>  > >>>
>  > >>> Hi Andrew,
>  > >>>
>  > >>> On Thu, Jul 13, 2023 at 8:35 AM Andrew Schofield
>  > >>> wrote:
>  >  I have a question about José’s comment (2). I can see that it’s
>  > >> possible for multiple
>  >  partitions to change leadership to the same broker/node and it’s
>  > >> wasteful to repeat
>  >  all of the connection information for each topic-partition. But, I
>  > >> think it’s important to
>  >  know which partitions are now lead by which node. That
> information at
>  > >> least needs to be
>  >  per-partition I think. I may have misunderstood, but it sounded
> like
>  > >> your comment
>  >  suggestion lost that relationship.
>  > >>>
>  > >>> Each partition in both the FETCH response and the PRODUCE response
>  > >>> will have the CurrentLeader, the tuple leader id and leader epoch.
>  > >>> Clients can use this information to update their partition to leader
>  > >>> id and leader epoch mapping.
>  > >>>
>  > >>> They can also use the NodeEndpoints to update their mapping from
>  > >>> replica id to the tuple host, port and rack so that they can connect
>  > >>> to the correct node for future FETCH requests and PRODUCE requests.
>  > >>>
>  > >>> Thanks,
>  > >>> --
>  > >>> -José
>  > >>
>  > >>
>  > >
>  > > --
>  > > Regards,
>  > > Mayank Shekhar Narula
>  >
>  >



-- 
Regards,
Mayank Shekhar Narula


Re: [DISCUSS] KIP-714: Client metrics and observability

2023-07-17 Thread Milind Luthra
Hi Andrew, thanks for this KIP.

I had a few questions regarding the "Error handling" section.

1. It mentions that "The 5 and 30 minute retries are to eventually trigger
a retry and avoid having to restart clients if the cluster metrics
configuration is disabled temporarily, e.g., by operator error, rolling
upgrades, etc."
But this 30 min interval isn't mentioned anywhere else. What is it
referring to?

2. For the actual errors:
INVALID_RECORD : The action required is to "Log a warning to the
application and schedule the next GetTelemetrySubscriptionsRequest to 5
minutes". Why is this 5 minutes, and not something like PushIntervalMs? And
also, why are we scheduling a GetTelemetrySubscriptionsRequest in this
case, if the serialization is broken?
UNKNOWN_SUBSCRIPTION_ID , UNSUPPORTED_COMPRESSION_TYPE : just to confirm,
the GetTelemetrySubscriptionsRequest needs to be scheduled immediately
after the PushTelemetry response, correct?

3. For "Subsequent GetTelemetrySubscriptionsRequests must include the
ClientInstanceId returned in the first response, regardless of broker":
Will a broker error be returned in case some implementation of this KIP
violates this accidentally and sends a request with ClientInstanceId = Null
even when it's been obtained already? Or will a new ClientInstanceId be
returned without an error?

Thanks!

On Tue, Jun 13, 2023 at 8:38 PM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi,
> I would like to start a new discussion thread on KIP-714: Client metrics
> and observability.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability
>
> I have edited the proposal significantly to reduce the scope. The overall
> mechanism for client metric subscriptions is unchanged, but the
> KIP is now based on the existing client metrics, rather than introducing
> new metrics. The purpose remains helping cluster operators
> investigate performance problems experienced by clients without requiring
> changes to the client application code or configuration.
>
> Thanks,
> Andrew


[jira] [Created] (KAFKA-15194) Rename local tiered storage segment with offset as prefix for easy navigation

2023-07-17 Thread Kamal Chandraprakash (Jira)
Kamal Chandraprakash created KAFKA-15194:


 Summary: Rename local tiered storage segment with offset as prefix 
for easy navigation
 Key: KAFKA-15194
 URL: https://issues.apache.org/jira/browse/KAFKA-15194
 Project: Kafka
  Issue Type: Task
Reporter: Kamal Chandraprakash


In LocalTieredStorage, which is an implementation of RemoteStorageManager, 
segments are saved with a random UUID. This makes navigating to a particular 
segment harder. To make a given segment navigable by offset, prepend the 
offset information to the segment filename.

https://github.com/apache/kafka/pull/13837#discussion_r1258896009
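
A possible scheme (illustrative only; the PR discussion linked above will
settle the actual format) is to zero-pad the start offset so lexicographic and
numeric order agree, mirroring how local log segment files are named:

```java
import java.util.UUID;

// Sketch: prefix the segment file name with the zero-padded start offset so
// files sort by offset, e.g. "00000000000000000042-<uuid>.log".
static String segmentFileName(long startOffset, UUID id) {
    return String.format("%020d-%s.log", startOffset, id);
}
```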



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


RE: Re: [DISCUSS] KIP-951: Leader discovery optimisations for the client

2023-07-17 Thread Emanuele Sabellico
The downside of bumping the version is that clients have to have all 
the latest features implemented before being able to benefit from this 
performance improvement.
One of the benefits of using a tagged field is to make the field 
available to previous versions too.

Choosing a minimum value for taggedVersions could be an alternative.

Emanuele

On 2023/07/13 17:30:45 Andrew Schofield wrote:
> Hi Mayank,
> If we bump the version, the broker can tell whether it’s worth providing
> the leader endpoint information to the client when the leader has changed.
> That’s my reasoning.
>
> Thanks,
> Andrew
>
> > On 13 Jul 2023, at 18:02, Mayank Shekhar Narula wrote:
> >
> > Thanks both for looking into this.
> >
> > Jose,
> >
> > 1/2 & 4(changes for PRODUCE) & 5 makes sense, will follow
> >
> > 3. If I understood this correctly, certain replicas "aren't" brokers, what
> > are they then?
> >
> > Also how about replacing "Replica" with "Leader", this is more readable on
> > the client. so, how about this?
> > { "name": "LeaderEndpoints", "type": "[]Leader", "versions": "15+",
> > "taggedVersions": "15+", "tag": 3,
> > "about": "Endpoints for all current leaders enumerated in
> > PartitionData.", "fields": [
> > { "name": "NodeId", "type": "int32", "versions": "15+",
> > "mapKey": true, "entityType": "brokerId", "about": "The ID of the
> > associated leader"},
> > { "name": "Host", "type": "string", "versions": "15+",
> > "about": "The leader's hostname." },
> > { "name": "Port", "type": "int32", "versions": "15+",
> > "about": "The leader's port." },
> > { "name": "Rack", "type": "string", "versions": "15+", "ignorable":
> > true, "default": "null",
> > "about": "The rack of the leader, or null if it has not been
> > assigned to a rack." }
> > ]}
> >
> > Andrew
> >
> > 6. I wonder if non-Kafka clients might benefit from not bumping the
> > version. If versions are bumped, say for FetchResponse to 16, I believe
> > that client would have to support all versions until 16 to fully utilise
> > this feature. Whereas, if not bumped, they can simply support until version
> > 12 (will change to version:12 for tagged fields), and non-AK clients can
> > then implement this feature. What do you think? I am inclined to not bump.

> >
> > On Thu, Jul 13, 2023 at 5:21 PM Andrew Schofield <
> > andrew_schofield_j...@outlook.com> wrote:
> >
> >> Hi José,
> >> Thanks. Sounds good.
> >>
> >> Andrew
> >>
> >>> On 13 Jul 2023, at 16:45, José Armando García Sancio
> >> wrote:
> >>>
> >>> Hi Andrew,
> >>>
> >>> On Thu, Jul 13, 2023 at 8:35 AM Andrew Schofield
> >>> wrote:
>  I have a question about José’s comment (2). I can see that it’s possible
>  for multiple partitions to change leadership to the same broker/node and
>  it’s wasteful to repeat all of the connection information for each
>  topic-partition. But, I think it’s important to know which partitions are
>  now lead by which node. That information at least needs to be per-partition
>  I think. I may have misunderstood, but it sounded like your comment
>  suggestion lost that relationship.
> >>>
> >>> Each partition in both the FETCH response and the PRODUCE response
> >>> will have the CurrentLeader, the tuple leader id and leader epoch.
> >>> Clients can use this information to update their partition to leader
> >>> id and leader epoch mapping.
> >>>
> >>> They can also use the NodeEndpoints to update their mapping from
> >>> replica id to the tuple host, port and rack so that they can connect
> >>> to the correct node for future FETCH requests and PRODUCE requests.
> >>>
> >>> Thanks,
> >>> --
> >>> -José
> >>
> >>
> >
> > --
> > Regards,
> > Mayank Shekhar Narula
>
>

[jira] [Resolved] (KAFKA-14518) Rebalance on topic/partition metadata changes

2023-07-17 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-14518.
-
Resolution: Fixed

This was done as part of https://issues.apache.org/jira/browse/KAFKA-14462.

> Rebalance on topic/partition metadata changes
> -
>
> Key: KAFKA-14518
> URL: https://issues.apache.org/jira/browse/KAFKA-14518
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: David Jacot
>Assignee: David Jacot
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15193) Allow choosing Platform specific rocksDBJNI jar

2023-07-17 Thread Utkarsh Khare (Jira)
Utkarsh Khare created KAFKA-15193:
-

 Summary: Allow choosing Platform specific rocksDBJNI jar
 Key: KAFKA-15193
 URL: https://issues.apache.org/jira/browse/KAFKA-15193
 Project: Kafka
  Issue Type: Improvement
  Components: build
Reporter: Utkarsh Khare
Assignee: Utkarsh Khare


RocksDBJNI uber jar is currently at ~58MBs. 
There are smaller platform-specific jars available for RocksDB. 

This ticket is created to allow developers to choose platform-specific RocksDB 
jar via build arguments, which is much smaller and can considerably reduce the 
generated RPM size.  

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Application for permission grants for contribution

2023-07-17 Thread Bruno Cadonna

Hi Abhinav,

You should be all set now.

Thanks for your interest in Apache Kafka!

Best,
Bruno

On 7/14/23 9:04 AM, abhinav tripathi wrote:

Hi Team,
I would like to contribute to the Kafka community. I want to work on open
issues and also have some ideas of my own.
Requesting access and grants required for myself.

My credentials:

JIRA username : shikamaru-96
Cwiki username: shikamaru-96

Email: imabhinav.tripa...@gmail.com

Thanks,
Abhinav



Re: Requesting permissions to contribute to Apache Kafka

2023-07-17 Thread Bruno Cadonna

Hi Mital,

You should be all set now.

Thanks for your interest in Apache Kafka!

Best,
Bruno

On 7/14/23 3:04 PM, Mital Awachat wrote:

Hi Team,

I am requesting permission to contribute to Apache Kafka.

Wiki:
Username: mital.awachat
Email: mital.awacha...@gmail.com

Jira:
Username: mital.awachat
Email: mital.awacha...@gmail.com
Using https://selfserve.apache.org/jira-account.html



[VOTE] 3.5.1 RC1

2023-07-17 Thread Divij Vaidya
Hello Kafka users, developers and client-developers,

This is the second candidate (RC1) for release of Apache Kafka 3.5.1. First
release candidate (RC0) was discarded due to incorrect license files. They
have been fixed since then.

This release is a security patch release. It upgrades the dependency,
snappy-java, to a version which is not vulnerable to CVE-2023-34455. You
can find more information about the CVE at Kafka CVE list
.

Additionally, this release fixes a regression introduced in 3.3.0, which
caused security.protocol configuration values to be restricted to upper
case only. With this release, security.protocol values are
case insensitive. See KAFKA-15053
 for details.

Release notes for the 3.5.1 release:
https://home.apache.org/~divijv/kafka-3.5.1-rc1/RELEASE_NOTES.html

*** Please download, test and vote by Thursday, July 20, 9am PT

Kafka's KEYS file containing PGP keys we use to sign the release:
https://kafka.apache.org/KEYS

Release artifacts to be voted upon (source and binary):
https://home.apache.org/~divijv/kafka-3.5.1-rc1/

Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/org/apache/kafka/

Javadoc:
https://home.apache.org/~divijv/kafka-3.5.1-rc1/javadoc/

Tag to be voted upon (off 3.5 branch) is the 3.5.1 tag:
https://github.com/apache/kafka/releases/tag/3.5.1-rc1

Documentation:
https://kafka.apache.org/35/documentation.html
Please note that documentation will be updated with upgrade notes (
https://github.com/apache/kafka/commit/4c78fd64454e25e3536e8c7ed5725d3fbe944a49)
after the release is complete.

Protocol:
https://kafka.apache.org/35/protocol.html

Unit/integration tests:
https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/43/ (2 failures)
https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/42/ (6 failures)
https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/39/ (9 failures)

In all 3 runs above, there are no common tests which are failing, which
leads me to believe that they are flaky. I have also verified that
unit/integration tests on my local machine successfully pass (JDK 17 +
Scala 2.13)

System tests:
Not planning to run system tests since this is a patch release.

Thank you.

--
Divij Vaidya
Release Manager for Apache Kafka 3.5.1


Re: [VOTE] 3.5.1 RC0

2023-07-17 Thread Divij Vaidya
Please ignore this. I need to re-send this email with the correct release
candidate.

--
Divij Vaidya



On Mon, Jul 17, 2023 at 10:07 AM Divij Vaidya 
wrote:

> Hello Kafka users, developers and client-developers,
>
> This is the second candidate (RC1) for release of Apache Kafka 3.5.1.
> First release candidate (RC0) was discarded due to incorrect license files.
> They have been fixed since then.
>
> This release is a security patch release. It upgrades the dependency,
> snappy-java, to a version which is not vulnerable to CVE-2023-34455. You
> can find more information about the CVE at Kafka CVE list
> .
>
> Additionally, this release fixes a regression introduced in 3.3.0, which
> caused security.protocol configuration values to be restricted to upper
> case only. With this release, security.protocol values are
> case insensitive. See KAFKA-15053
>  for details.
>
> Release notes for the 3.5.1 release:
> https://home.apache.org/~divijv/kafka-3.5.1-rc1/RELEASE_NOTES.html
>
> *** Please download, test and vote by Thursday, July 20, 9am PT
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> https://kafka.apache.org/KEYS
>
> Release artifacts to be voted upon (source and binary):
> https://home.apache.org/~divijv/kafka-3.5.1-rc1/
>
> Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>
> Javadoc:
> https://home.apache.org/~divijv/kafka-3.5.1-rc1/javadoc/
>
> Tag to be voted upon (off 3.5 branch) is the 3.5.1 tag:
> https://github.com/apache/kafka/releases/tag/3.5.1-rc1
>
> Documentation:
> https://kafka.apache.org/35/documentation.html
> Please note that documentation will be updated with upgrade notes (
> https://github.com/apache/kafka/commit/4c78fd64454e25e3536e8c7ed5725d3fbe944a49)
> after the release is complete.
>
> Protocol:
> https://kafka.apache.org/35/protocol.html
>
> Unit/integration tests:
> https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/43/ (2 failures)
> https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/42/ (6
> failures)
> https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/39/ (9 failures)
>
> In all 3 runs above, there are no common tests which are failing, which
> leads me to believe that they are flaky. I have also verified that
> unit/integration tests on my local machine successfully pass (JDK 17 +
> Scala 2.13)
>
> System tests:
> Not planning to run system tests since this is a patch release.
>
> Thank you.
>
> --
> Divij Vaidya
> Release Manager for Apache Kafka 3.5.1
>
>


[VOTE] 3.5.1 RC0

2023-07-17 Thread Divij Vaidya
Hello Kafka users, developers and client-developers,

This is the second candidate (RC1) for release of Apache Kafka 3.5.1. First
release candidate (RC0) was discarded due to incorrect license files. They
have been fixed since then.

This release is a security patch release. It upgrades the dependency,
snappy-java, to a version which is not vulnerable to CVE-2023-34455. You
can find more information about the CVE at Kafka CVE list
.

Additionally, this release fixes a regression introduced in 3.3.0, which
caused security.protocol configuration values to be restricted to upper
case only. With this release, security.protocol values are
case insensitive. See KAFKA-15053
 for details.

Release notes for the 3.5.1 release:
https://home.apache.org/~divijv/kafka-3.5.1-rc1/RELEASE_NOTES.html

*** Please download, test and vote by Thursday, July 20, 9am PT

Kafka's KEYS file containing PGP keys we use to sign the release:
https://kafka.apache.org/KEYS

Release artifacts to be voted upon (source and binary):
https://home.apache.org/~divijv/kafka-3.5.1-rc1/

Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/org/apache/kafka/

Javadoc:
https://home.apache.org/~divijv/kafka-3.5.1-rc1/javadoc/

Tag to be voted upon (off 3.5 branch) is the 3.5.1 tag:
https://github.com/apache/kafka/releases/tag/3.5.1-rc1

Documentation:
https://kafka.apache.org/35/documentation.html
Please note that documentation will be updated with upgrade notes (
https://github.com/apache/kafka/commit/4c78fd64454e25e3536e8c7ed5725d3fbe944a49)
after the release is complete.

Protocol:
https://kafka.apache.org/35/protocol.html

Unit/integration tests:
https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/43/ (2 failures)
https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/42/ (6 failures)
https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.5/39/ (9 failures)

In all 3 runs above, there are no common tests which are failing, which
leads me to believe that they are flaky. I have also verified that
unit/integration tests on my local machine successfully pass (JDK 17 +
Scala 2.13)

System tests:
Not planning to run system tests since this is a patch release.

Thank you.

--
Divij Vaidya
Release Manager for Apache Kafka 3.5.1