Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-03 Thread Hanyu (Peter) Zheng
If we use  WithDescendingKeys() to generate a RangeQuery to do the
reveseQuery, how do we achieve the methods like withRange, withUpperBound,
and withLowerBound only in this method?

On Tue, Oct 3, 2023 at 8:01 PM Hanyu (Peter) Zheng 
wrote:

> I believe there's no need to introduce a method like WithDescendingKeys().
> Instead, we can simply add a reverse flag to RangeQuery. Each method within
> RangeQuery would then accept an additional parameter. If the reverse is set
> to true, it would indicate the results should be reversed.
>
> Initially, I introduced a reverse variable. When set to false, the
> RangeQuery class behaves normally. However, when reverse is set to true,
> the RangeQuery essentially takes on the functionality of ReverseRangeQuery.
> Further details can be found in the "Rejected Alternatives" section.
>
> In my perspective, RangeQuery is a class responsible for creating a series
> of RangeQuery objects. It offers methods such as withRange, withUpperBound,
> and withLowerBound, allowing us to generate objects representing different
> queries. I'm unsure how adding a withDescendingOrder() method would be
> compatible with the other methods, especially considering that, based on
> KIP 969, WithDescendingKeys() doesn't appear to take any input variables.
> And if withDescendingOrder() doesn't accept any input, how does it return a
> RangeQuery?
>
> On Tue, Oct 3, 2023 at 4:37 PM Hanyu (Peter) Zheng 
> wrote:
>
>> Hi, Colt,
>> The underlying structure of inMemoryKeyValueStore is treeMap.
>> Sincerely,
>> Hanyu
>>
>> On Tue, Oct 3, 2023 at 4:34 PM Hanyu (Peter) Zheng 
>> wrote:
>>
>>> Hi Bill,
>>> 1. I will update the KIP in accordance with the PR and synchronize their
>>> future updates.
>>> 2. I will use that name.
>>> 3. you mean add something about ordering at the motivation section?
>>>
>>> Sincerely,
>>> Hanyu
>>>
>>>
>>> On Tue, Oct 3, 2023 at 4:29 PM Hanyu (Peter) Zheng 
>>> wrote:
>>>
 Hi, Walker,

 1. I will update the KIP in accordance with the PR and synchronize
 their future updates.
 2. I will use that name.
 3. I'll provide additional details in that section.
 4. I intend to utilize rangeQuery to achieve what we're referring to as
 reverseQuery. In essence, reverseQuery is merely a term. To clear up any
 ambiguity, I'll make necessary adjustments to the KIP.

 Sincerely,
 Hanyu



 On Tue, Oct 3, 2023 at 4:09 PM Hanyu (Peter) Zheng 
 wrote:

> Ok, I will change it back to following the code, and update them
> together.
>
> On Tue, Oct 3, 2023 at 2:27 PM Walker Carlson
>  wrote:
>
>> Hello Hanyu,
>>
>> Looking over your kip things mostly make sense but I have a couple of
>> comments.
>>
>>
>>1. You have "withDescandingOrder()". I think you mean "descending"
>> :)
>>Also there are still a few places in the do where its called
>> "setReverse"
>>2. Also I like "WithDescendingKeys()" better
>>3. I'm not sure of what ordering guarantees we are offering.
>> Perhaps we
>>can add a section to the motivation clearly spelling out the
>> current
>>ordering and the new offering?
>>4. When you say "use unbounded reverseQuery to achieve reverseAll"
>> do
>>you mean "use unbounded RangeQuery to achieve reverseAll"? as far
>> as I can
>>tell we don't have a reverseQuery as a named object?
>>
>>
>> Looking good so far
>>
>> best,
>> Walker
>>
>> On Tue, Oct 3, 2023 at 2:13 PM Colt McNealy 
>> wrote:
>>
>> > Hello Hanyu,
>> >
>> > Thank you for the KIP. I agree with Matthias' proposal to keep the
>> naming
>> > convention consistent with KIP-969. I favor the
>> `.withDescendingKeys()`
>> > name.
>> >
>> > I am curious about one thing. RocksDB guarantees that records
>> returned
>> > during a range scan are lexicographically ordered by the bytes of
>> the keys
>> > (either ascending or descending order, as specified in the query).
>> This
>> > means that results within a single partition are indeed ordered.**
>> My
>> > reading of KIP-805 suggests to me that you don't need to specify the
>> > partition number you are querying in IQv2, which means that you can
>> have a
>> > valid reversed RangeQuery over a store with "multiple partitions"
>> in it.
>> >
>> > Currently, IQv1 does not guarantee order of keys in this scenario.
>> Does
>> > IQv2 support ordering across partitions? Such an implementation
>> would
>> > require opening a rocksdb range scan** on multiple rocksdb
>> instances (one
>> > per partition), and polling the first key of each. Whether or not
>> this is
>> > ordered, could we please add that to the documentation?
>> >
>> > **(How is this implemented/guaranteed in an
>> `inMemoryKeyValueStore`? I
>>

Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2253

2023-10-03 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 315358 lines...]

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
onlyRemovePendingTaskToSuspendShouldRemoveTaskFromPendingUpdateActions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
onlyRemovePendingTaskToSuspendShouldRemoveTaskFromPendingUpdateActions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldOnlyKeepLastUpdateAction() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldOnlyKeepLastUpdateAction() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldAddAndRemovePendingTaskToRecycle() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldAddAndRemovePendingTaskToRecycle() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
onlyRemovePendingTaskToUpdateInputPartitionsShouldRemoveTaskFromPendingUpdateActions()
 STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
onlyRemovePendingTaskToUpdateInputPartitionsShouldRemoveTaskFromPendingUpdateActions()
 PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldVerifyIfPendingTaskToRecycleExist() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldVerifyIfPendingTaskToRecycleExist() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldAddAndRemovePendingTaskToUpdateInputPartitions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldAddAndRemovePendingTaskToUpdateInputPartitions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
onlyRemovePendingTaskToCloseDirtyShouldRemoveTaskFromPendingUpdateActions() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
onlyRemovePendingTaskToCloseDirtyShouldRemoveTaskFromPendingUpdateActions() 
PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldAddAndRemovePendingTaskToSuspend() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldAddAndRemovePendingTaskToSuspend() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldVerifyIfPendingTaskToInitExist() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldVerifyIfPendingTaskToInitExist() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
onlyRemovePendingTaskToCloseCleanShouldRemoveTaskFromPendingUpdateActions() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
onlyRemovePendingTaskToCloseCleanShouldRemoveTaskFromPendingUpdateActions() 
PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldDrainPendingTasksToCreate() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldDrainPendingTasksToCreate() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseClean() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseClean() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseDirty() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseDirty() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldKeepAddedTasks() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > TasksTest > 
shouldKeepAddedTasks() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > StateQueryResultTest 
> More than one query result throws IllegalArgumentException STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > StateQueryResultTest 
> More than one query result throws IllegalArgumentException PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > StateQueryResultTest 
> Zero query results shouldn't error STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > StateQueryResultTest 
> Zero query results shouldn't error PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > StateQueryResultTest 
> Valid query results still works STARTED

Gradle Test Run :streams:test > Gradle Test Executor 78 > StateQueryResultTest 
> Valid query results still works PASSED

Gradle Test Run :streams:test > Gradle Test Executor 78 > 
RocksDBBlockCache

Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-03 Thread Hanyu (Peter) Zheng
I believe there's no need to introduce a method like WithDescendingKeys().
Instead, we can simply add a reverse flag to RangeQuery. Each method within
RangeQuery would then accept an additional parameter. If the reverse is set
to true, it would indicate the results should be reversed.

Initially, I introduced a reverse variable. When set to false, the
RangeQuery class behaves normally. However, when reverse is set to true,
the RangeQuery essentially takes on the functionality of ReverseRangeQuery.
Further details can be found in the "Rejected Alternatives" section.

In my perspective, RangeQuery is a class responsible for creating a series
of RangeQuery objects. It offers methods such as withRange, withUpperBound,
and withLowerBound, allowing us to generate objects representing different
queries. I'm unsure how adding a withDescendingOrder() method would be
compatible with the other methods, especially considering that, based on
KIP 969, WithDescendingKeys() doesn't appear to take any input variables.
And if withDescendingOrder() doesn't accept any input, how does it return a
RangeQuery?

On Tue, Oct 3, 2023 at 4:37 PM Hanyu (Peter) Zheng 
wrote:

> Hi, Colt,
> The underlying structure of inMemoryKeyValueStore is treeMap.
> Sincerely,
> Hanyu
>
> On Tue, Oct 3, 2023 at 4:34 PM Hanyu (Peter) Zheng 
> wrote:
>
>> Hi Bill,
>> 1. I will update the KIP in accordance with the PR and synchronize their
>> future updates.
>> 2. I will use that name.
>> 3. you mean add something about ordering at the motivation section?
>>
>> Sincerely,
>> Hanyu
>>
>>
>> On Tue, Oct 3, 2023 at 4:29 PM Hanyu (Peter) Zheng 
>> wrote:
>>
>>> Hi, Walker,
>>>
>>> 1. I will update the KIP in accordance with the PR and synchronize their
>>> future updates.
>>> 2. I will use that name.
>>> 3. I'll provide additional details in that section.
>>> 4. I intend to utilize rangeQuery to achieve what we're referring to as
>>> reverseQuery. In essence, reverseQuery is merely a term. To clear up any
>>> ambiguity, I'll make necessary adjustments to the KIP.
>>>
>>> Sincerely,
>>> Hanyu
>>>
>>>
>>>
>>> On Tue, Oct 3, 2023 at 4:09 PM Hanyu (Peter) Zheng 
>>> wrote:
>>>
 Ok, I will change it back to following the code, and update them
 together.

 On Tue, Oct 3, 2023 at 2:27 PM Walker Carlson
  wrote:

> Hello Hanyu,
>
> Looking over your kip things mostly make sense but I have a couple of
> comments.
>
>
>1. You have "withDescandingOrder()". I think you mean "descending"
> :)
>Also there are still a few places in the do where its called
> "setReverse"
>2. Also I like "WithDescendingKeys()" better
>3. I'm not sure of what ordering guarantees we are offering.
> Perhaps we
>can add a section to the motivation clearly spelling out the current
>ordering and the new offering?
>4. When you say "use unbounded reverseQuery to achieve reverseAll"
> do
>you mean "use unbounded RangeQuery to achieve reverseAll"? as far
> as I can
>tell we don't have a reverseQuery as a named object?
>
>
> Looking good so far
>
> best,
> Walker
>
> On Tue, Oct 3, 2023 at 2:13 PM Colt McNealy 
> wrote:
>
> > Hello Hanyu,
> >
> > Thank you for the KIP. I agree with Matthias' proposal to keep the
> naming
> > convention consistent with KIP-969. I favor the
> `.withDescendingKeys()`
> > name.
> >
> > I am curious about one thing. RocksDB guarantees that records
> returned
> > during a range scan are lexicographically ordered by the bytes of
> the keys
> > (either ascending or descending order, as specified in the query).
> This
> > means that results within a single partition are indeed ordered.** My
> > reading of KIP-805 suggests to me that you don't need to specify the
> > partition number you are querying in IQv2, which means that you can
> have a
> > valid reversed RangeQuery over a store with "multiple partitions" in
> it.
> >
> > Currently, IQv1 does not guarantee order of keys in this scenario.
> Does
> > IQv2 support ordering across partitions? Such an implementation would
> > require opening a rocksdb range scan** on multiple rocksdb instances
> (one
> > per partition), and polling the first key of each. Whether or not
> this is
> > ordered, could we please add that to the documentation?
> >
> > **(How is this implemented/guaranteed in an `inMemoryKeyValueStore`?
> I
> > don't know about that implementation).
> >
> > Colt McNealy
> >
> > *Founder, LittleHorse.dev*
> >
> >
> > On Tue, Oct 3, 2023 at 1:35 PM Hanyu (Peter) Zheng
> >  wrote:
> >
> > > ok, I will update it. Thank you  Matthias
> > >
> > > Sincerely,
> > > Hanyu
> > >
> > > On Tue, Oct 3, 2023 at 11:23 AM Matthias J. Sax 
> > wrote:
> > >
>>

[jira] [Created] (KAFKA-15536) dynamically resize remoteIndexCache

2023-10-03 Thread Luke Chen (Jira)
Luke Chen created KAFKA-15536:
-

 Summary: dynamically resize remoteIndexCache
 Key: KAFKA-15536
 URL: https://issues.apache.org/jira/browse/KAFKA-15536
 Project: Kafka
  Issue Type: Improvement
  Components: Tiered-Storage
Affects Versions: 3.6.0
Reporter: Luke Chen
Assignee: hudeqi


context:
https://github.com/apache/kafka/pull/14243#discussion_r1320630057



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-10-03 Thread Artem Livshits
Hi Colin,

I think in your example "do_unclean_recovery" would need to do different
things depending on the strategy.

do_unclean_recovery() {
   if (unclean.recovery.manager.enabled) {
if (strategy == Aggressive)
  use UncleanRecoveryManager(waitLastKnownERL=false)  // just inspect
logs from whoever is available
else
  use  UncleanRecoveryManager(waitLastKnownERL=true)  // must wait for
at least last known ELR
  } else {
if (strategy == Aggressive)
  choose the last known leader if that is available, or a random leader
if not)
else
  wait for last known leader to get back
  }
}

The idea is that the Aggressive strategy would kick in as soon as we lost
the leader and would pick a leader from whoever is available; but the
Balanced will only kick in when ELR is empty and will wait for the brokers
that likely have most data to be available.

On Tue, Oct 3, 2023 at 3:04 PM Colin McCabe  wrote:

> On Tue, Oct 3, 2023, at 10:49, Jun Rao wrote:
> > Hi, Calvin,
> >
> > Thanks for the update KIP. A few more comments.
> >
> > 41. Why would a user choose the option to select a random replica as the
> > leader instead of using unclean.recovery.strateg=Aggressive? It seems
> that
> > the latter is strictly better? If that's not the case, could we fold this
> > option under unclean.recovery.strategy instead of introducing a separate
> > config?
>
> Hi Jun,
>
> I thought the flow of control was:
>
> If there is no leader for the partition {
>   If (there are unfenced ELR members) {
> choose_an_unfenced_ELR_member
>   } else if (there are fenced ELR members AND strategy=Aggressive) {
> do_unclean_recovery
>   } else if (there are no ELR members AND strategy != None) {
> do_unclean_recovery
>   } else {
> do nothing about the missing leader
>   }
> }
>
> do_unclean_recovery() {
>if (unclean.recovery.manager.enabled) {
> use UncleanRecoveryManager
>   } else {
> choose the last known leader if that is available, or a random leader
> if not)
>   }
> }
>
> However, I think this could be clarified, especially the behavior when
> unclean.recovery.manager.enabled=false. Inuitively the goal for
> unclean.recovery.manager.enabled=false is to be "the same as now, mostly"
> but it's very underspecified in the KIP, I agree.
>
> >
> > 50. ElectLeadersRequest: "If more than 20 topics are included, only the
> > first 20 will be served. Others will be returned with DesiredLeaders."
> Hmm,
> > not sure that I understand this. ElectLeadersResponse doesn't have a
> > DesiredLeaders field.
> >
> > 51. GetReplicaLogInfo: "If more than 2000 partitions are included, only
> the
> > first 2000 will be served" Do we return an error for the remaining
> > partitions? Actually, should we include an errorCode field at the
> partition
> > level in GetReplicaLogInfoResponse to cover non-existing partitions and
> no
> > authorization, etc?
> >
> > 52. The entry should matches => The entry should match
> >
> > 53. ElectLeadersRequest.DesiredLeaders: Should it be nullable since a
> user
> > may not specify DesiredLeaders?
> >
> > 54. Downgrade: Is that indeed possible? I thought earlier you said that
> > once the new version of the records are in the metadata log, one can't
> > downgrade since the old broker doesn't know how to parse the new version
> of
> > the metadata records?
> >
>
> MetadataVersion downgrade is currently broken but we have fixing it on our
> plate for Kafka 3.7.
>
> The way downgrade works is that "new features" are dropped, leaving only
> the old ones.
>
> > 55. CleanShutdownFile: Should we add a version field for future
> extension?
> >
> > 56. Config changes are public facing. Could we have a separate section to
> > document all the config changes?
>
> +1. A separate section for this would be good.
>
> best,
> Colin
>
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Sep 25, 2023 at 4:29 PM Calvin Liu 
> > wrote:
> >
> >> Hi Jun
> >> Thanks for the comments.
> >>
> >> 40. If we change to None, it is not guaranteed for no data loss. For
> users
> >> who are not able to validate the data with external resources, manual
> >> intervention does not give a better result but a loss of availability.
> So
> >> practically speaking, the Balance mode would be a better default value.
> >>
> >> 41. No, it represents how we want to do the unclean leader election. If
> it
> >> is false, the unclean leader election will be the old random way.
> >> Otherwise, the unclean recovery will be used.
> >>
> >> 42. Good catch. Updated.
> >>
> >> 43. Only the first 20 topics will be served. Others will be returned
> with
> >> InvalidRequestError
> >>
> >> 44. The order matters. The desired leader entries match with the topic
> >> partition list by the index.
> >>
> >> 45. Thanks! Updated.
> >>
> >> 46. Good advice! Updated.
> >>
> >> 47.1, updated the comment. Basically it will elect the replica in the
> >> desiredLeader field to be the leader
> >>
> >> 47.2 We can let the admin client do the conversion.

Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.6 #84

2023-10-03 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 306677 lines...]
streams-5: SMOKE-TEST-CLIENT-CLOSED
streams-6: SMOKE-TEST-CLIENT-CLOSED
streams-4: SMOKE-TEST-CLIENT-CLOSED
streams-1: SMOKE-TEST-CLIENT-CLOSED
streams-0: SMOKE-TEST-CLIENT-CLOSED
streams-3: SMOKE-TEST-CLIENT-CLOSED
streams-1: SMOKE-TEST-CLIENT-CLOSED
streams-5: SMOKE-TEST-CLIENT-CLOSED
streams-4: SMOKE-TEST-CLIENT-CLOSED
streams-3: SMOKE-TEST-CLIENT-CLOSED
streams-2: SMOKE-TEST-CLIENT-CLOSED
streams-2: SMOKE-TEST-CLIENT-CLOSED

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

For more on this, please refer to 
https://docs.gradle.org/8.2.1/userguide/command_line_interface.html#sec:command_line_warnings
 in the Gradle documentation.

BUILD SUCCESSFUL in 2h 27m 26s
296 actionable tasks: 109 executed, 187 up-to-date

Publishing build scan...
https://ge.apache.org/s/elzdzswust5vu


See the profiling report at: 
file:///home/jenkins/workspace/Kafka_kafka_3.6/build/reports/profile/profile-2023-10-03-21-19-58.html
A fine-grained performance profile is available: use the --scan option.
[Pipeline] junit
Recording test results
[Checks API] No suitable checks publisher found.
[Pipeline] echo
Skipping Kafka Streams archetype test for Java 20
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // timestamps
[Pipeline] }
[Pipeline] // timeout
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
> Task :core:testClasses
> Task :streams:compileTestJava
> Task :streams:testClasses
> Task :streams:testJar
> Task :streams:testSrcJar
> Task :streams:publishMavenJavaPublicationToMavenLocal
> Task :streams:publishToMavenLocal

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

For more on this, please refer to 
https://docs.gradle.org/8.2.1/userguide/command_line_interface.html#sec:command_line_warnings
 in the Gradle documentation.

BUILD SUCCESSFUL in 8m 16s
94 actionable tasks: 41 executed, 53 up-to-date

Publishing build scan...
https://ge.apache.org/s/4dxj2k5qri7ts

[Pipeline] sh
+ grep ^version= gradle.properties
+ cut -d= -f 2
[Pipeline] dir
Running in 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.6/streams/quickstart
[Pipeline] {
[Pipeline] sh
+ mvn clean install -Dgpg.skip
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Kafka Streams :: Quickstart[pom]
[INFO] streams-quickstart-java[maven-archetype]
[INFO] 
[INFO] < org.apache.kafka:streams-quickstart >-
[INFO] Building Kafka Streams :: Quickstart 3.6.1-SNAPSHOT[1/2]
[INFO]   from pom.xml
[INFO] [ pom ]-
[INFO] 
[INFO] --- clean:3.0.0:clean (default-clean) @ streams-quickstart ---
[INFO] 
[INFO] --- remote-resources:1.5:process (process-resource-bundles) @ 
streams-quickstart ---
[INFO] 
[INFO] --- site:3.5.1:attach-descriptor (attach-descriptor) @ 
streams-quickstart ---
[INFO] 
[INFO] --- gpg:1.6:sign (sign-artifacts) @ streams-quickstart ---
[INFO] 
[INFO] --- install:2.5.2:install (default-install) @ streams-quickstart ---
[INFO] Installing 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.6/streams/quickstart/pom.xml
 to 
/home/jenkins/.m2/repository/org/apache/kafka/streams-quickstart/3.6.1-SNAPSHOT/streams-quickstart-3.6.1-SNAPSHOT.pom
[INFO] 
[INFO] --< org.apache.kafka:streams-quickstart-java >--
[INFO] Building streams-quickstart-java 3.6.1-SNAPSHOT[2/2]
[INFO]   from java/pom.xml
[INFO] --[ maven-archetype ]---
[INFO] 
[INFO] --- clean:3.0.0:clean (default-clean) @ streams-quickstart-java ---
[INFO] 
[INFO] --- remote-resources:1.5:process (process-resource-bundles) @ 
streams-quickstart-java ---
[INFO] 
[INFO] --- resources:2.7:resources (default-resources) @ 
streams-quickstart-java ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 6 resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- resources:2.7:testResources (default-testResources) @ 
streams-quickstart-java ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 2 resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- archetype:2.2:jar (default-jar) @ streams-quickstart-java ---
[INFO] Building archetype jar: 
/home/jenkins/jenkins-age

Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-03 Thread Hanyu (Peter) Zheng
Hi, Colt,
The underlying structure of inMemoryKeyValueStore is treeMap.
Sincerely,
Hanyu

On Tue, Oct 3, 2023 at 4:34 PM Hanyu (Peter) Zheng 
wrote:

> Hi Bill,
> 1. I will update the KIP in accordance with the PR and synchronize their
> future updates.
> 2. I will use that name.
> 3. you mean add something about ordering at the motivation section?
>
> Sincerely,
> Hanyu
>
>
> On Tue, Oct 3, 2023 at 4:29 PM Hanyu (Peter) Zheng 
> wrote:
>
>> Hi, Walker,
>>
>> 1. I will update the KIP in accordance with the PR and synchronize their
>> future updates.
>> 2. I will use that name.
>> 3. I'll provide additional details in that section.
>> 4. I intend to utilize rangeQuery to achieve what we're referring to as
>> reverseQuery. In essence, reverseQuery is merely a term. To clear up any
>> ambiguity, I'll make necessary adjustments to the KIP.
>>
>> Sincerely,
>> Hanyu
>>
>>
>>
>> On Tue, Oct 3, 2023 at 4:09 PM Hanyu (Peter) Zheng 
>> wrote:
>>
>>> Ok, I will change it back to following the code, and update them
>>> together.
>>>
>>> On Tue, Oct 3, 2023 at 2:27 PM Walker Carlson
>>>  wrote:
>>>
 Hello Hanyu,

 Looking over your kip things mostly make sense but I have a couple of
 comments.


1. You have "withDescandingOrder()". I think you mean "descending" :)
Also there are still a few places in the do where its called
 "setReverse"
2. Also I like "WithDescendingKeys()" better
3. I'm not sure of what ordering guarantees we are offering. Perhaps
 we
can add a section to the motivation clearly spelling out the current
ordering and the new offering?
4. When you say "use unbounded reverseQuery to achieve reverseAll" do
you mean "use unbounded RangeQuery to achieve reverseAll"? as far as
 I can
tell we don't have a reverseQuery as a named object?


 Looking good so far

 best,
 Walker

 On Tue, Oct 3, 2023 at 2:13 PM Colt McNealy 
 wrote:

 > Hello Hanyu,
 >
 > Thank you for the KIP. I agree with Matthias' proposal to keep the
 naming
 > convention consistent with KIP-969. I favor the
 `.withDescendingKeys()`
 > name.
 >
 > I am curious about one thing. RocksDB guarantees that records returned
 > during a range scan are lexicographically ordered by the bytes of the
 keys
 > (either ascending or descending order, as specified in the query).
 This
 > means that results within a single partition are indeed ordered.** My
 > reading of KIP-805 suggests to me that you don't need to specify the
 > partition number you are querying in IQv2, which means that you can
 have a
 > valid reversed RangeQuery over a store with "multiple partitions" in
 it.
 >
 > Currently, IQv1 does not guarantee order of keys in this scenario.
 Does
 > IQv2 support ordering across partitions? Such an implementation would
 > require opening a rocksdb range scan** on multiple rocksdb instances
 (one
 > per partition), and polling the first key of each. Whether or not
 this is
 > ordered, could we please add that to the documentation?
 >
 > **(How is this implemented/guaranteed in an `inMemoryKeyValueStore`? I
 > don't know about that implementation).
 >
 > Colt McNealy
 >
 > *Founder, LittleHorse.dev*
 >
 >
 > On Tue, Oct 3, 2023 at 1:35 PM Hanyu (Peter) Zheng
 >  wrote:
 >
 > > ok, I will update it. Thank you  Matthias
 > >
 > > Sincerely,
 > > Hanyu
 > >
 > > On Tue, Oct 3, 2023 at 11:23 AM Matthias J. Sax 
 > wrote:
 > >
 > > > Thanks for the KIP Hanyu!
 > > >
 > > >
 > > > I took a quick look and it think the proposal makes sense overall.
 > > >
 > > > A few comments about how to structure the KIP.
 > > >
 > > > As you propose to not add `ReverseRangQuery` class, the code
 example
 > > > should go into "Rejected Alternatives" section, not in the
 "Proposed
 > > > Changes" section.
 > > >
 > > > For the `RangeQuery` code example, please omit all existing
 methods
 > etc,
 > > > and only include what will be added/changed. This make it simpler
 to
 > > > read the KIP.
 > > >
 > > >
 > > > nit: typo
 > > >
 > > > >  the fault value is false
 > > >
 > > > Should be "the default value is false".
 > > >
 > > >
 > > > Not sure if `setReverse()` is the best name. Maybe
 > `withDescandingOrder`
 > > > (or similar, I guess `withReverseOrder` would also work) might be
 > > > better? Would be good to align to KIP-969 proposal that suggest
 do use
 > > > `withDescendingKeys` methods for "reverse key-range"; if we go
 with
 > > > `withReverseOrder` we should change KIP-969 accordingly.
 > > >
 > > > Curious to hear what others think about naming this consistently
 across
>>>

Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-03 Thread Hanyu (Peter) Zheng
Hi Bill,
1. I will update the KIP in accordance with the PR and synchronize their
future updates.
2. I will use that name.
3. you mean add something about ordering at the motivation section?

Sincerely,
Hanyu


On Tue, Oct 3, 2023 at 4:29 PM Hanyu (Peter) Zheng 
wrote:

> Hi, Walker,
>
> 1. I will update the KIP in accordance with the PR and synchronize their
> future updates.
> 2. I will use that name.
> 3. I'll provide additional details in that section.
> 4. I intend to utilize rangeQuery to achieve what we're referring to as
> reverseQuery. In essence, reverseQuery is merely a term. To clear up any
> ambiguity, I'll make necessary adjustments to the KIP.
>
> Sincerely,
> Hanyu
>
>
>
> On Tue, Oct 3, 2023 at 4:09 PM Hanyu (Peter) Zheng 
> wrote:
>
>> Ok, I will change it back to following the code, and update them
>> together.
>>
>> On Tue, Oct 3, 2023 at 2:27 PM Walker Carlson
>>  wrote:
>>
>>> Hello Hanyu,
>>>
>>> Looking over your kip things mostly make sense but I have a couple of
>>> comments.
>>>
>>>
>>>1. You have "withDescandingOrder()". I think you mean "descending" :)
>>>Also there are still a few places in the do where its called
>>> "setReverse"
>>>2. Also I like "WithDescendingKeys()" better
>>>3. I'm not sure of what ordering guarantees we are offering. Perhaps
>>> we
>>>can add a section to the motivation clearly spelling out the current
>>>ordering and the new offering?
>>>4. When you say "use unbounded reverseQuery to achieve reverseAll" do
>>>you mean "use unbounded RangeQuery to achieve reverseAll"? as far as
>>> I can
>>>tell we don't have a reverseQuery as a named object?
>>>
>>>
>>> Looking good so far
>>>
>>> best,
>>> Walker
>>>
>>> On Tue, Oct 3, 2023 at 2:13 PM Colt McNealy  wrote:
>>>
>>> > Hello Hanyu,
>>> >
>>> > Thank you for the KIP. I agree with Matthias' proposal to keep the
>>> naming
>>> > convention consistent with KIP-969. I favor the `.withDescendingKeys()`
>>> > name.
>>> >
>>> > I am curious about one thing. RocksDB guarantees that records returned
>>> > during a range scan are lexicographically ordered by the bytes of the
>>> keys
>>> > (either ascending or descending order, as specified in the query). This
>>> > means that results within a single partition are indeed ordered.** My
>>> > reading of KIP-805 suggests to me that you don't need to specify the
>>> > partition number you are querying in IQv2, which means that you can
>>> have a
>>> > valid reversed RangeQuery over a store with "multiple partitions" in
>>> it.
>>> >
>>> > Currently, IQv1 does not guarantee order of keys in this scenario. Does
>>> > IQv2 support ordering across partitions? Such an implementation would
>>> > require opening a rocksdb range scan** on multiple rocksdb instances
>>> (one
>>> > per partition), and polling the first key of each. Whether or not this
>>> is
>>> > ordered, could we please add that to the documentation?
>>> >
>>> > **(How is this implemented/guaranteed in an `inMemoryKeyValueStore`? I
>>> > don't know about that implementation).
>>> >
>>> > Colt McNealy
>>> >
>>> > *Founder, LittleHorse.dev*
>>> >
>>> >
>>> > On Tue, Oct 3, 2023 at 1:35 PM Hanyu (Peter) Zheng
>>> >  wrote:
>>> >
>>> > > ok, I will update it. Thank you  Matthias
>>> > >
>>> > > Sincerely,
>>> > > Hanyu
>>> > >
>>> > > On Tue, Oct 3, 2023 at 11:23 AM Matthias J. Sax 
>>> > wrote:
>>> > >
>>> > > > Thanks for the KIP Hanyu!
>>> > > >
>>> > > >
>>> > > > I took a quick look and it think the proposal makes sense overall.
>>> > > >
>>> > > > A few comments about how to structure the KIP.
>>> > > >
>>> > > > As you propose to not add `ReverseRangQuery` class, the code
>>> example
>>> > > > should go into "Rejected Alternatives" section, not in the
>>> "Proposed
>>> > > > Changes" section.
>>> > > >
>>> > > > For the `RangeQuery` code example, please omit all existing methods
>>> > etc,
>>> > > > and only include what will be added/changed. This make it simpler
>>> to
>>> > > > read the KIP.
>>> > > >
>>> > > >
>>> > > > nit: typo
>>> > > >
>>> > > > >  the fault value is false
>>> > > >
>>> > > > Should be "the default value is false".
>>> > > >
>>> > > >
>>> > > > Not sure if `setReverse()` is the best name. Maybe
>>> > `withDescandingOrder`
>>> > > > (or similar, I guess `withReverseOrder` would also work) might be
>>> > > > better? Would be good to align to KIP-969 proposal that suggest do
>>> use
>>> > > > `withDescendingKeys` methods for "reverse key-range"; if we go with
>>> > > > `withReverseOrder` we should change KIP-969 accordingly.
>>> > > >
>>> > > > Curious to hear what others think about naming this consistently
>>> across
>>> > > > both KIPs.
>>> > > >
>>> > > >
>>> > > > -Matthias
>>> > > >
>>> > > >
>>> > > > On 10/3/23 9:17 AM, Hanyu (Peter) Zheng wrote:
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-985%3A+Add+reverseRange+and+reverseAll+query+over+kv-store+in+IQv2
>>> > > > >
>>> > > >
>>> > >
>>> > >
>

Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-03 Thread Hanyu (Peter) Zheng
Hi, Walker,

1. I will update the KIP in accordance with the PR and synchronize their
future updates.
2. I will use that name.
3. I'll provide additional details in that section.
4. I intend to utilize rangeQuery to achieve what we're referring to as
reverseQuery. In essence, reverseQuery is merely a term. To clear up any
ambiguity, I'll make necessary adjustments to the KIP.

Sincerely,
Hanyu



On Tue, Oct 3, 2023 at 4:09 PM Hanyu (Peter) Zheng 
wrote:

> Ok, I will change it back to following the code, and update them together.
>
> On Tue, Oct 3, 2023 at 2:27 PM Walker Carlson
>  wrote:
>
>> Hello Hanyu,
>>
>> Looking over your kip things mostly make sense but I have a couple of
>> comments.
>>
>>
>>1. You have "withDescandingOrder()". I think you mean "descending" :)
>>Also there are still a few places in the do where its called
>> "setReverse"
>>2. Also I like "WithDescendingKeys()" better
>>3. I'm not sure of what ordering guarantees we are offering. Perhaps we
>>can add a section to the motivation clearly spelling out the current
>>ordering and the new offering?
>>4. When you say "use unbounded reverseQuery to achieve reverseAll" do
>>you mean "use unbounded RangeQuery to achieve reverseAll"? as far as I
>> can
>>tell we don't have a reverseQuery as a named object?
>>
>>
>> Looking good so far
>>
>> best,
>> Walker
>>
>> On Tue, Oct 3, 2023 at 2:13 PM Colt McNealy  wrote:
>>
>> > Hello Hanyu,
>> >
>> > Thank you for the KIP. I agree with Matthias' proposal to keep the
>> naming
>> > convention consistent with KIP-969. I favor the `.withDescendingKeys()`
>> > name.
>> >
>> > I am curious about one thing. RocksDB guarantees that records returned
>> > during a range scan are lexicographically ordered by the bytes of the
>> keys
>> > (either ascending or descending order, as specified in the query). This
>> > means that results within a single partition are indeed ordered.** My
>> > reading of KIP-805 suggests to me that you don't need to specify the
>> > partition number you are querying in IQv2, which means that you can
>> have a
>> > valid reversed RangeQuery over a store with "multiple partitions" in it.
>> >
>> > Currently, IQv1 does not guarantee order of keys in this scenario. Does
>> > IQv2 support ordering across partitions? Such an implementation would
>> > require opening a rocksdb range scan** on multiple rocksdb instances
>> (one
>> > per partition), and polling the first key of each. Whether or not this
>> is
>> > ordered, could we please add that to the documentation?
>> >
>> > **(How is this implemented/guaranteed in an `inMemoryKeyValueStore`? I
>> > don't know about that implementation).
>> >
>> > Colt McNealy
>> >
>> > *Founder, LittleHorse.dev*
>> >
>> >
>> > On Tue, Oct 3, 2023 at 1:35 PM Hanyu (Peter) Zheng
>> >  wrote:
>> >
>> > > ok, I will update it. Thank you  Matthias
>> > >
>> > > Sincerely,
>> > > Hanyu
>> > >
>> > > On Tue, Oct 3, 2023 at 11:23 AM Matthias J. Sax 
>> > wrote:
>> > >
>> > > > Thanks for the KIP Hanyu!
>> > > >
>> > > >
>> > > > I took a quick look and it think the proposal makes sense overall.
>> > > >
>> > > > A few comments about how to structure the KIP.
>> > > >
>> > > > As you propose to not add `ReverseRangQuery` class, the code example
>> > > > should go into "Rejected Alternatives" section, not in the "Proposed
>> > > > Changes" section.
>> > > >
>> > > > For the `RangeQuery` code example, please omit all existing methods
>> > etc,
>> > > > and only include what will be added/changed. This make it simpler to
>> > > > read the KIP.
>> > > >
>> > > >
>> > > > nit: typo
>> > > >
>> > > > >  the fault value is false
>> > > >
>> > > > Should be "the default value is false".
>> > > >
>> > > >
>> > > > Not sure if `setReverse()` is the best name. Maybe
>> > `withDescandingOrder`
>> > > > (or similar, I guess `withReverseOrder` would also work) might be
>> > > > better? Would be good to align to KIP-969 proposal that suggest do
>> use
>> > > > `withDescendingKeys` methods for "reverse key-range"; if we go with
>> > > > `withReverseOrder` we should change KIP-969 accordingly.
>> > > >
>> > > > Curious to hear what others think about naming this consistently
>> across
>> > > > both KIPs.
>> > > >
>> > > >
>> > > > -Matthias
>> > > >
>> > > >
>> > > > On 10/3/23 9:17 AM, Hanyu (Peter) Zheng wrote:
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-985%3A+Add+reverseRange+and+reverseAll+query+over+kv-store+in+IQv2
>> > > > >
>> > > >
>> > >
>> > >
>> > > --
>> > >
>> > > [image: Confluent] 
>> > > Hanyu (Peter) Zheng he/him/his
>> > > Software Engineer Intern
>> > > +1 (213) 431-7193 <+1+(213)+431-7193>
>> > > Follow us: [image: Blog]
>> > > <
>> > >
>> >
>> https://www.confluent.io/blog?utm_source=footer&utm_medium=email&utm_campaign=ch.email-signature_type.community_content.blog
>> > > >[image:
>> > > Twitter] [image: LinkedIn]

Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-10-03 Thread Calvin Liu
Hi Jun,
Thanks for the comments. And thanks to Colin's explanation.

41. The unclean recovery manager may need time to be ready for production.
So adding a flag to disable the new feature is in case it has any problems.
The config may be deprecated or removed when we are ready.

50. Sorry for the confusion, it should be “Others will be returned with
InvalidRequestError.”

51. Good point, added the error message field. Also if more partitions are
included, it will have the InvalidRequestError

52. Done

53. Good idea, make the DesiredLeaders nullable.

54. The downgrade refers to the Metadata version downgrade, so the brokers
still have the ability to parse the records.

In the ELR case, if the MV is downgraded, the ELR fields will be dropped in
the following partition updates.

55. Good idea, adding a version field to the CleanShutdownFile

56. Will have a section to show all the config changes.

===

Hi David,

Thanks for the comments.

57. The CleanShutdownFile is removed after the log manager is initialized.
It will be created and written when the log manager is shutting down.

58. Good question, if the broker shuts down before it receives the broker
epoch, it will write -1.

59. The CleanShutdownFile is a part of the log manager shutdown. Its
presence represents the files were safely flushed to the disk before the
shutdown. So if the controlled shutdown timeouts or having any errors, it
is not necessary to affect the CleanShutdownFile.

On Tue, Oct 3, 2023 at 3:40 PM David Arthur
 wrote:

> Calvin, thanks for the KIP!
>
> I'm getting up to speed on the discussion. I had a few questions
>
> 57. When is the CleanShutdownFile removed? I think it probably happens
> after registering with the controller, but it would be good to clarify
> this.
>
> 58. Since the broker epoch comes from the controller, what would go
> into the CleanShutdownFile in the case of a broker being unable to register
> with the controller? For example:
>
> 1) Broker A registers
>
> 2) Controller sees A, gives epoch 1
>
> 3) Broker A crashes, no CleanShutdownFile
>
> 4) Broker A starts up and shuts down before registering
>
>
> During 4) is a CleanShutdownFile produced? If so, what epoch goes in it?
>
> 59. What is the expected behavior when controlled shutdown times out?
> Looking at BrokerServer, I think the logs have a chance of still being
> closed cleanly, so this could be a regular clean shutdown scenario.
>
>
>
>
> On Tue, Oct 3, 2023 at 6:04 PM Colin McCabe  wrote:
>
> > On Tue, Oct 3, 2023, at 10:49, Jun Rao wrote:
> > > Hi, Calvin,
> > >
> > > Thanks for the update KIP. A few more comments.
> > >
> > > 41. Why would a user choose the option to select a random replica as
> the
> > > leader instead of using unclean.recovery.strateg=Aggressive? It seems
> > that
> > > the latter is strictly better? If that's not the case, could we fold
> this
> > > option under unclean.recovery.strategy instead of introducing a
> separate
> > > config?
> >
> > Hi Jun,
> >
> > I thought the flow of control was:
> >
> > If there is no leader for the partition {
> >   If (there are unfenced ELR members) {
> > choose_an_unfenced_ELR_member
> >   } else if (there are fenced ELR members AND strategy=Aggressive) {
> > do_unclean_recovery
> >   } else if (there are no ELR members AND strategy != None) {
> > do_unclean_recovery
> >   } else {
> > do nothing about the missing leader
> >   }
> > }
> >
> > do_unclean_recovery() {
> >if (unclean.recovery.manager.enabled) {
> > use UncleanRecoveryManager
> >   } else {
> > choose the last known leader if that is available, or a random leader
> > if not)
> >   }
> > }
> >
> > However, I think this could be clarified, especially the behavior when
> > unclean.recovery.manager.enabled=false. Inuitively the goal for
> > unclean.recovery.manager.enabled=false is to be "the same as now, mostly"
> > but it's very underspecified in the KIP, I agree.
> >
> > >
> > > 50. ElectLeadersRequest: "If more than 20 topics are included, only the
> > > first 20 will be served. Others will be returned with DesiredLeaders."
> > Hmm,
> > > not sure that I understand this. ElectLeadersResponse doesn't have a
> > > DesiredLeaders field.
> > >
> > > 51. GetReplicaLogInfo: "If more than 2000 partitions are included, only
> > the
> > > first 2000 will be served" Do we return an error for the remaining
> > > partitions? Actually, should we include an errorCode field at the
> > partition
> > > level in GetReplicaLogInfoResponse to cover non-existing partitions and
> > no
> > > authorization, etc?
> > >
> > > 52. The entry should matches => The entry should match
> > >
> > > 53. ElectLeadersRequest.DesiredLeaders: Should it be nullable since a
> > user
> > > may not specify DesiredLeaders?
> > >
> > > 54. Downgrade: Is that indeed possible? I thought earlier you said that
> > > once the new version of the records are in the metadata log, one can't
> > > downgrade since the old broker doesn't know how to

Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-03 Thread Hanyu (Peter) Zheng
Ok, I will change it back to following the code, and update them together.

On Tue, Oct 3, 2023 at 2:27 PM Walker Carlson 
wrote:

> Hello Hanyu,
>
> Looking over your kip things mostly make sense but I have a couple of
> comments.
>
>
>1. You have "withDescandingOrder()". I think you mean "descending" :)
>Also there are still a few places in the do where its called
> "setReverse"
>2. Also I like "WithDescendingKeys()" better
>3. I'm not sure of what ordering guarantees we are offering. Perhaps we
>can add a section to the motivation clearly spelling out the current
>ordering and the new offering?
>4. When you say "use unbounded reverseQuery to achieve reverseAll" do
>you mean "use unbounded RangeQuery to achieve reverseAll"? as far as I
> can
>tell we don't have a reverseQuery as a named object?
>
>
> Looking good so far
>
> best,
> Walker
>
> On Tue, Oct 3, 2023 at 2:13 PM Colt McNealy  wrote:
>
> > Hello Hanyu,
> >
> > Thank you for the KIP. I agree with Matthias' proposal to keep the naming
> > convention consistent with KIP-969. I favor the `.withDescendingKeys()`
> > name.
> >
> > I am curious about one thing. RocksDB guarantees that records returned
> > during a range scan are lexicographically ordered by the bytes of the
> keys
> > (either ascending or descending order, as specified in the query). This
> > means that results within a single partition are indeed ordered.** My
> > reading of KIP-805 suggests to me that you don't need to specify the
> > partition number you are querying in IQv2, which means that you can have
> a
> > valid reversed RangeQuery over a store with "multiple partitions" in it.
> >
> > Currently, IQv1 does not guarantee order of keys in this scenario. Does
> > IQv2 support ordering across partitions? Such an implementation would
> > require opening a rocksdb range scan** on multiple rocksdb instances (one
> > per partition), and polling the first key of each. Whether or not this is
> > ordered, could we please add that to the documentation?
> >
> > **(How is this implemented/guaranteed in an `inMemoryKeyValueStore`? I
> > don't know about that implementation).
> >
> > Colt McNealy
> >
> > *Founder, LittleHorse.dev*
> >
> >
> > On Tue, Oct 3, 2023 at 1:35 PM Hanyu (Peter) Zheng
> >  wrote:
> >
> > > ok, I will update it. Thank you  Matthias
> > >
> > > Sincerely,
> > > Hanyu
> > >
> > > On Tue, Oct 3, 2023 at 11:23 AM Matthias J. Sax 
> > wrote:
> > >
> > > > Thanks for the KIP Hanyu!
> > > >
> > > >
> > > > I took a quick look and it think the proposal makes sense overall.
> > > >
> > > > A few comments about how to structure the KIP.
> > > >
> > > > As you propose to not add `ReverseRangQuery` class, the code example
> > > > should go into "Rejected Alternatives" section, not in the "Proposed
> > > > Changes" section.
> > > >
> > > > For the `RangeQuery` code example, please omit all existing methods
> > etc,
> > > > and only include what will be added/changed. This make it simpler to
> > > > read the KIP.
> > > >
> > > >
> > > > nit: typo
> > > >
> > > > >  the fault value is false
> > > >
> > > > Should be "the default value is false".
> > > >
> > > >
> > > > Not sure if `setReverse()` is the best name. Maybe
> > `withDescandingOrder`
> > > > (or similar, I guess `withReverseOrder` would also work) might be
> > > > better? Would be good to align to KIP-969 proposal that suggest do
> use
> > > > `withDescendingKeys` methods for "reverse key-range"; if we go with
> > > > `withReverseOrder` we should change KIP-969 accordingly.
> > > >
> > > > Curious to hear what others think about naming this consistently
> across
> > > > both KIPs.
> > > >
> > > >
> > > > -Matthias
> > > >
> > > >
> > > > On 10/3/23 9:17 AM, Hanyu (Peter) Zheng wrote:
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-985%3A+Add+reverseRange+and+reverseAll+query+over+kv-store+in+IQv2
> > > > >
> > > >
> > >
> > >
> > > --
> > >
> > > [image: Confluent] 
> > > Hanyu (Peter) Zheng he/him/his
> > > Software Engineer Intern
> > > +1 (213) 431-7193 <+1+(213)+431-7193>
> > > Follow us: [image: Blog]
> > > <
> > >
> >
> https://www.confluent.io/blog?utm_source=footer&utm_medium=email&utm_campaign=ch.email-signature_type.community_content.blog
> > > >[image:
> > > Twitter] [image: LinkedIn]
> > > [image: Slack]
> > > [image: YouTube]
> > > 
> > >
> > > [image: Try Confluent Cloud for Free]
> > > <
> > >
> >
> https://www.confluent.io/get-started?utm_campaign=tm.fm-apac_cd.inbound&utm_source=gmail&utm_medium=organic
> > > >
> > >
> >
>


-- 

[image: Confluent] 
Hanyu (Peter) Zheng he/him/his
Software Engineer Intern
+1 (213) 431-7193 <+1+(213)+431-7193>
Follow us: [image: Blog]


Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-03 Thread Hanyu (Peter) Zheng
Ok, I just talked about it with Matthias, I will change the text back to
following the code, and update them together.

Sincerely,
Hanyu

On Tue, Oct 3, 2023 at 3:49 PM Bill Bejeck  wrote:

> Hi Hanyu,
>
> Thanks for the KIP, overall it's looking good, but I have a couple of
> comments
>
>- In the “Proposed Changes” section there's a reference to
>`setReverse()` but the code example has “withDescendingOrder()” so I
> think
>the text needs an update to reflect the code example.
>- I prefer “withDescendingKeys()”  to “withDescendingOrder()”
>- I also agree that we should include a section on ordering, but it
>should be fairly straightforward.  The “StateQueryRequest” of IQv2
> allows
>users to specify a partition or partitions, so if ordering is important
>they can elect to provide a single partition in the query.
>
>
> Thanks,
> Bill
>
> On Tue, Oct 3, 2023 at 5:27 PM Walker Carlson
> 
> wrote:
>
> > Hello Hanyu,
> >
> > Looking over your kip things mostly make sense but I have a couple of
> > comments.
> >
> >
> >1. You have "withDescandingOrder()". I think you mean "descending" :)
> >Also there are still a few places in the do where its called
> > "setReverse"
> >2. Also I like "WithDescendingKeys()" better
> >3. I'm not sure of what ordering guarantees we are offering. Perhaps
> we
> >can add a section to the motivation clearly spelling out the current
> >ordering and the new offering?
> >4. When you say "use unbounded reverseQuery to achieve reverseAll" do
> >you mean "use unbounded RangeQuery to achieve reverseAll"? as far as I
> > can
> >tell we don't have a reverseQuery as a named object?
> >
> >
> > Looking good so far
> >
> > best,
> > Walker
> >
> > On Tue, Oct 3, 2023 at 2:13 PM Colt McNealy  wrote:
> >
> > > Hello Hanyu,
> > >
> > > Thank you for the KIP. I agree with Matthias' proposal to keep the
> naming
> > > convention consistent with KIP-969. I favor the `.withDescendingKeys()`
> > > name.
> > >
> > > I am curious about one thing. RocksDB guarantees that records returned
> > > during a range scan are lexicographically ordered by the bytes of the
> > keys
> > > (either ascending or descending order, as specified in the query). This
> > > means that results within a single partition are indeed ordered.** My
> > > reading of KIP-805 suggests to me that you don't need to specify the
> > > partition number you are querying in IQv2, which means that you can
> have
> > a
> > > valid reversed RangeQuery over a store with "multiple partitions" in
> it.
> > >
> > > Currently, IQv1 does not guarantee order of keys in this scenario. Does
> > > IQv2 support ordering across partitions? Such an implementation would
> > > require opening a rocksdb range scan** on multiple rocksdb instances
> (one
> > > per partition), and polling the first key of each. Whether or not this
> is
> > > ordered, could we please add that to the documentation?
> > >
> > > **(How is this implemented/guaranteed in an `inMemoryKeyValueStore`? I
> > > don't know about that implementation).
> > >
> > > Colt McNealy
> > >
> > > *Founder, LittleHorse.dev*
> > >
> > >
> > > On Tue, Oct 3, 2023 at 1:35 PM Hanyu (Peter) Zheng
> > >  wrote:
> > >
> > > > ok, I will update it. Thank you  Matthias
> > > >
> > > > Sincerely,
> > > > Hanyu
> > > >
> > > > On Tue, Oct 3, 2023 at 11:23 AM Matthias J. Sax 
> > > wrote:
> > > >
> > > > > Thanks for the KIP Hanyu!
> > > > >
> > > > >
> > > > > I took a quick look and it think the proposal makes sense overall.
> > > > >
> > > > > A few comments about how to structure the KIP.
> > > > >
> > > > > As you propose to not add `ReverseRangQuery` class, the code
> example
> > > > > should go into "Rejected Alternatives" section, not in the
> "Proposed
> > > > > Changes" section.
> > > > >
> > > > > For the `RangeQuery` code example, please omit all existing methods
> > > etc,
> > > > > and only include what will be added/changed. This make it simpler
> to
> > > > > read the KIP.
> > > > >
> > > > >
> > > > > nit: typo
> > > > >
> > > > > >  the fault value is false
> > > > >
> > > > > Should be "the default value is false".
> > > > >
> > > > >
> > > > > Not sure if `setReverse()` is the best name. Maybe
> > > `withDescandingOrder`
> > > > > (or similar, I guess `withReverseOrder` would also work) might be
> > > > > better? Would be good to align to KIP-969 proposal that suggest do
> > use
> > > > > `withDescendingKeys` methods for "reverse key-range"; if we go with
> > > > > `withReverseOrder` we should change KIP-969 accordingly.
> > > > >
> > > > > Curious to hear what others think about naming this consistently
> > across
> > > > > both KIPs.
> > > > >
> > > > >
> > > > > -Matthias
> > > > >
> > > > >
> > > > > On 10/3/23 9:17 AM, Hanyu (Peter) Zheng wrote:
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-985%3A+Add+reverseRange+and+reverseAll+query+over+kv-store+in+IQv2
> > > > > >
> > > 

Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-03 Thread Bill Bejeck
Hi Hanyu,

Thanks for the KIP, overall it's looking good, but I have a couple of
comments

   - In the “Proposed Changes” section there's a reference to
   `setReverse()` but the code example has “withDescendingOrder()” so I think
   the text needs an update to reflect the code example.
   - I prefer “withDescendingKeys()”  to “withDescendingOrder()”
   - I also agree that we should include a section on ordering, but it
   should be fairly straightforward.  The “StateQueryRequest” of IQv2 allows
   users to specify a partition or partitions, so if ordering is important
   they can elect to provide a single partition in the query.


Thanks,
Bill

On Tue, Oct 3, 2023 at 5:27 PM Walker Carlson 
wrote:

> Hello Hanyu,
>
> Looking over your kip things mostly make sense but I have a couple of
> comments.
>
>
>1. You have "withDescandingOrder()". I think you mean "descending" :)
>Also there are still a few places in the do where its called
> "setReverse"
>2. Also I like "WithDescendingKeys()" better
>3. I'm not sure of what ordering guarantees we are offering. Perhaps we
>can add a section to the motivation clearly spelling out the current
>ordering and the new offering?
>4. When you say "use unbounded reverseQuery to achieve reverseAll" do
>you mean "use unbounded RangeQuery to achieve reverseAll"? as far as I
> can
>tell we don't have a reverseQuery as a named object?
>
>
> Looking good so far
>
> best,
> Walker
>
> On Tue, Oct 3, 2023 at 2:13 PM Colt McNealy  wrote:
>
> > Hello Hanyu,
> >
> > Thank you for the KIP. I agree with Matthias' proposal to keep the naming
> > convention consistent with KIP-969. I favor the `.withDescendingKeys()`
> > name.
> >
> > I am curious about one thing. RocksDB guarantees that records returned
> > during a range scan are lexicographically ordered by the bytes of the
> keys
> > (either ascending or descending order, as specified in the query). This
> > means that results within a single partition are indeed ordered.** My
> > reading of KIP-805 suggests to me that you don't need to specify the
> > partition number you are querying in IQv2, which means that you can have
> a
> > valid reversed RangeQuery over a store with "multiple partitions" in it.
> >
> > Currently, IQv1 does not guarantee order of keys in this scenario. Does
> > IQv2 support ordering across partitions? Such an implementation would
> > require opening a rocksdb range scan** on multiple rocksdb instances (one
> > per partition), and polling the first key of each. Whether or not this is
> > ordered, could we please add that to the documentation?
> >
> > **(How is this implemented/guaranteed in an `inMemoryKeyValueStore`? I
> > don't know about that implementation).
> >
> > Colt McNealy
> >
> > *Founder, LittleHorse.dev*
> >
> >
> > On Tue, Oct 3, 2023 at 1:35 PM Hanyu (Peter) Zheng
> >  wrote:
> >
> > > ok, I will update it. Thank you  Matthias
> > >
> > > Sincerely,
> > > Hanyu
> > >
> > > On Tue, Oct 3, 2023 at 11:23 AM Matthias J. Sax 
> > wrote:
> > >
> > > > Thanks for the KIP Hanyu!
> > > >
> > > >
> > > > I took a quick look and it think the proposal makes sense overall.
> > > >
> > > > A few comments about how to structure the KIP.
> > > >
> > > > As you propose to not add `ReverseRangQuery` class, the code example
> > > > should go into "Rejected Alternatives" section, not in the "Proposed
> > > > Changes" section.
> > > >
> > > > For the `RangeQuery` code example, please omit all existing methods
> > etc,
> > > > and only include what will be added/changed. This make it simpler to
> > > > read the KIP.
> > > >
> > > >
> > > > nit: typo
> > > >
> > > > >  the fault value is false
> > > >
> > > > Should be "the default value is false".
> > > >
> > > >
> > > > Not sure if `setReverse()` is the best name. Maybe
> > `withDescandingOrder`
> > > > (or similar, I guess `withReverseOrder` would also work) might be
> > > > better? Would be good to align to KIP-969 proposal that suggest do
> use
> > > > `withDescendingKeys` methods for "reverse key-range"; if we go with
> > > > `withReverseOrder` we should change KIP-969 accordingly.
> > > >
> > > > Curious to hear what others think about naming this consistently
> across
> > > > both KIPs.
> > > >
> > > >
> > > > -Matthias
> > > >
> > > >
> > > > On 10/3/23 9:17 AM, Hanyu (Peter) Zheng wrote:
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-985%3A+Add+reverseRange+and+reverseAll+query+over+kv-store+in+IQv2
> > > > >
> > > >
> > >
> > >
> > > --
> > >
> > > [image: Confluent] 
> > > Hanyu (Peter) Zheng he/him/his
> > > Software Engineer Intern
> > > +1 (213) 431-7193 <+1+(213)+431-7193>
> > > Follow us: [image: Blog]
> > > <
> > >
> >
> https://www.confluent.io/blog?utm_source=footer&utm_medium=email&utm_campaign=ch.email-signature_type.community_content.blog
> > > >[image:
> > > Twitter] [image: LinkedIn]
> > > 

Re: [VOTE] KIP-951: Leader discovery optimisations for the client

2023-10-03 Thread Jason Gustafson
+1 Thanks for the KIP

On Tue, Oct 3, 2023 at 12:30 PM David Jacot  wrote:

> Thanks for the KIP. +1 from me as well.
>
> Best,
> David
>
> Le mar. 3 oct. 2023 à 20:54, Jun Rao  a écrit :
>
> > Hi, Mayank,
> >
> > Thanks for the detailed explanation in the KIP. +1 from me.
> >
> > Jun
> >
> > On Wed, Sep 27, 2023 at 4:39 AM Mayank Shekhar Narula <
> > mayanks.nar...@gmail.com> wrote:
> >
> > > Reviving this thread, as the discussion thread has been updated.
> > >
> > > On Fri, Jul 28, 2023 at 11:29 AM Mayank Shekhar Narula <
> > > mayanks.nar...@gmail.com> wrote:
> > >
> > > > Thanks Jose.
> > > >
> > > > On Thu, Jul 27, 2023 at 5:46 PM José Armando García Sancio
> > > >  wrote:
> > > >
> > > >> The KIP LGTM. Thanks for the design. I am looking forward to the
> > > >> implementation.
> > > >>
> > > >> +1 (binding).
> > > >>
> > > >> Thanks!
> > > >> --
> > > >> -José
> > > >>
> > > >
> > > >
> > > > --
> > > > Regards,
> > > > Mayank Shekhar Narula
> > > >
> > >
> > >
> > > --
> > > Regards,
> > > Mayank Shekhar Narula
> > >
> >
>


Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-10-03 Thread David Arthur
Calvin, thanks for the KIP!

I'm getting up to speed on the discussion. I had a few questions

57. When is the CleanShutdownFile removed? I think it probably happens
after registering with the controller, but it would be good to clarify this.

58. Since the broker epoch comes from the controller, what would go
into the CleanShutdownFile in the case of a broker being unable to register
with the controller? For example:

1) Broker A registers

2) Controller sees A, gives epoch 1

3) Broker A crashes, no CleanShutdownFile

4) Broker A starts up and shuts down before registering


During 4) is a CleanShutdownFile produced? If so, what epoch goes in it?

59. What is the expected behavior when controlled shutdown times out?
Looking at BrokerServer, I think the logs have a chance of still being
closed cleanly, so this could be a regular clean shutdown scenario.




On Tue, Oct 3, 2023 at 6:04 PM Colin McCabe  wrote:

> On Tue, Oct 3, 2023, at 10:49, Jun Rao wrote:
> > Hi, Calvin,
> >
> > Thanks for the update KIP. A few more comments.
> >
> > 41. Why would a user choose the option to select a random replica as the
> > leader instead of using unclean.recovery.strateg=Aggressive? It seems
> that
> > the latter is strictly better? If that's not the case, could we fold this
> > option under unclean.recovery.strategy instead of introducing a separate
> > config?
>
> Hi Jun,
>
> I thought the flow of control was:
>
> If there is no leader for the partition {
>   If (there are unfenced ELR members) {
> choose_an_unfenced_ELR_member
>   } else if (there are fenced ELR members AND strategy=Aggressive) {
> do_unclean_recovery
>   } else if (there are no ELR members AND strategy != None) {
> do_unclean_recovery
>   } else {
> do nothing about the missing leader
>   }
> }
>
> do_unclean_recovery() {
>if (unclean.recovery.manager.enabled) {
> use UncleanRecoveryManager
>   } else {
> choose the last known leader if that is available, or a random leader
> if not)
>   }
> }
>
> However, I think this could be clarified, especially the behavior when
> unclean.recovery.manager.enabled=false. Inuitively the goal for
> unclean.recovery.manager.enabled=false is to be "the same as now, mostly"
> but it's very underspecified in the KIP, I agree.
>
> >
> > 50. ElectLeadersRequest: "If more than 20 topics are included, only the
> > first 20 will be served. Others will be returned with DesiredLeaders."
> Hmm,
> > not sure that I understand this. ElectLeadersResponse doesn't have a
> > DesiredLeaders field.
> >
> > 51. GetReplicaLogInfo: "If more than 2000 partitions are included, only
> the
> > first 2000 will be served" Do we return an error for the remaining
> > partitions? Actually, should we include an errorCode field at the
> partition
> > level in GetReplicaLogInfoResponse to cover non-existing partitions and
> no
> > authorization, etc?
> >
> > 52. The entry should matches => The entry should match
> >
> > 53. ElectLeadersRequest.DesiredLeaders: Should it be nullable since a
> user
> > may not specify DesiredLeaders?
> >
> > 54. Downgrade: Is that indeed possible? I thought earlier you said that
> > once the new version of the records are in the metadata log, one can't
> > downgrade since the old broker doesn't know how to parse the new version
> of
> > the metadata records?
> >
>
> MetadataVersion downgrade is currently broken but we have fixing it on our
> plate for Kafka 3.7.
>
> The way downgrade works is that "new features" are dropped, leaving only
> the old ones.
>
> > 55. CleanShutdownFile: Should we add a version field for future
> extension?
> >
> > 56. Config changes are public facing. Could we have a separate section to
> > document all the config changes?
>
> +1. A separate section for this would be good.
>
> best,
> Colin
>
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Sep 25, 2023 at 4:29 PM Calvin Liu 
> > wrote:
> >
> >> Hi Jun
> >> Thanks for the comments.
> >>
> >> 40. If we change to None, it is not guaranteed for no data loss. For
> users
> >> who are not able to validate the data with external resources, manual
> >> intervention does not give a better result but a loss of availability.
> So
> >> practically speaking, the Balance mode would be a better default value.
> >>
> >> 41. No, it represents how we want to do the unclean leader election. If
> it
> >> is false, the unclean leader election will be the old random way.
> >> Otherwise, the unclean recovery will be used.
> >>
> >> 42. Good catch. Updated.
> >>
> >> 43. Only the first 20 topics will be served. Others will be returned
> with
> >> InvalidRequestError
> >>
> >> 44. The order matters. The desired leader entries match with the topic
> >> partition list by the index.
> >>
> >> 45. Thanks! Updated.
> >>
> >> 46. Good advice! Updated.
> >>
> >> 47.1, updated the comment. Basically it will elect the replica in the
> >> desiredLeader field to be the leader
> >>
> >> 47.2 We can let the admin client do the conversion. Using the
>

Re: [PR] Added a blog entry for 3.6.0 release [kafka-site]

2023-10-03 Thread via GitHub


todiaz commented on code in PR #542:
URL: https://github.com/apache/kafka-site/pull/542#discussion_r1344810163


##
blog.html:
##
@@ -22,6 +22,46 @@
 
 
 Blog
+
+
+
+Apache 
Kafka 3.6.0 Release Announcement
+
+15 Sep 2023 - Satish Duggana (https://twitter.com/0xeed";>@SatishDuggana)
+We are proud to announce the release of Apache Kafka 3.6.0. 
This release contains many new features and improvements. This blog post will 
highlight some of the more prominent features. For a full list of changes, be 
sure to check the https://downloads.apache.org/kafka/3.6.0/RELEASE_NOTES.html";>release 
notes.
+See the https://kafka.apache.org/36/documentation.html#upgrade_3_6_0";>Upgrading 
to 3.6.0 from any version 0.8.x through 3.5.x section in the documentation 
for the list of notable changes and detailed upgrade steps.
+The ability to migrate Kafka clusters from ZK to KRaft mode 
with no downtime is still an early access feature. It is currently only 
suitable for testing in non production environments. See https://cwiki.apache.org/confluence/display/KAFKA/KIP-866+ZooKeeper+to+KRaft+Migration";>KIP-866
 for more details.

Review Comment:
   non production -> non-production



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Added a blog entry for 3.6.0 release [kafka-site]

2023-10-03 Thread via GitHub


todiaz commented on code in PR #542:
URL: https://github.com/apache/kafka-site/pull/542#discussion_r1344810163


##
blog.html:
##
@@ -22,6 +22,46 @@
 
 
 Blog
+
+
+
+Apache 
Kafka 3.6.0 Release Announcement
+
+15 Sep 2023 - Satish Duggana (https://twitter.com/0xeed";>@SatishDuggana)
+We are proud to announce the release of Apache Kafka 3.6.0. 
This release contains many new features and improvements. This blog post will 
highlight some of the more prominent features. For a full list of changes, be 
sure to check the https://downloads.apache.org/kafka/3.6.0/RELEASE_NOTES.html";>release 
notes.
+See the https://kafka.apache.org/36/documentation.html#upgrade_3_6_0";>Upgrading 
to 3.6.0 from any version 0.8.x through 3.5.x section in the documentation 
for the list of notable changes and detailed upgrade steps.
+The ability to migrate Kafka clusters from ZK to KRaft mode 
with no downtime is still an early access feature. It is currently only 
suitable for testing in non production environments. See https://cwiki.apache.org/confluence/display/KAFKA/KIP-866+ZooKeeper+to+KRaft+Migration";>KIP-866
 for more details.

Review Comment:
   non production -> non-production



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-10-03 Thread Colin McCabe
On Tue, Oct 3, 2023, at 10:49, Jun Rao wrote:
> Hi, Calvin,
>
> Thanks for the update KIP. A few more comments.
>
> 41. Why would a user choose the option to select a random replica as the
> leader instead of using unclean.recovery.strateg=Aggressive? It seems that
> the latter is strictly better? If that's not the case, could we fold this
> option under unclean.recovery.strategy instead of introducing a separate
> config?

Hi Jun,

I thought the flow of control was:

If there is no leader for the partition {
  If (there are unfenced ELR members) {
choose_an_unfenced_ELR_member
  } else if (there are fenced ELR members AND strategy=Aggressive) {
do_unclean_recovery
  } else if (there are no ELR members AND strategy != None) {
do_unclean_recovery
  } else {
do nothing about the missing leader
  }
}

do_unclean_recovery() {
   if (unclean.recovery.manager.enabled) {
use UncleanRecoveryManager
  } else {
choose the last known leader if that is available, or a random leader if 
not)
  }
}

However, I think this could be clarified, especially the behavior when
unclean.recovery.manager.enabled=false. Inuitively the goal for 
unclean.recovery.manager.enabled=false is to be "the same as now, mostly"
but it's very underspecified in the KIP, I agree.

>
> 50. ElectLeadersRequest: "If more than 20 topics are included, only the
> first 20 will be served. Others will be returned with DesiredLeaders." Hmm,
> not sure that I understand this. ElectLeadersResponse doesn't have a
> DesiredLeaders field.
>
> 51. GetReplicaLogInfo: "If more than 2000 partitions are included, only the
> first 2000 will be served" Do we return an error for the remaining
> partitions? Actually, should we include an errorCode field at the partition
> level in GetReplicaLogInfoResponse to cover non-existing partitions and no
> authorization, etc?
>
> 52. The entry should matches => The entry should match
>
> 53. ElectLeadersRequest.DesiredLeaders: Should it be nullable since a user
> may not specify DesiredLeaders?
>
> 54. Downgrade: Is that indeed possible? I thought earlier you said that
> once the new version of the records are in the metadata log, one can't
> downgrade since the old broker doesn't know how to parse the new version of
> the metadata records?
>

MetadataVersion downgrade is currently broken but we have fixing it on our 
plate for Kafka 3.7.

The way downgrade works is that "new features" are dropped, leaving only the 
old ones.

> 55. CleanShutdownFile: Should we add a version field for future extension?
>
> 56. Config changes are public facing. Could we have a separate section to
> document all the config changes?

+1. A separate section for this would be good.

best,
Colin

>
> Thanks,
>
> Jun
>
> On Mon, Sep 25, 2023 at 4:29 PM Calvin Liu 
> wrote:
>
>> Hi Jun
>> Thanks for the comments.
>>
>> 40. If we change to None, it is not guaranteed for no data loss. For users
>> who are not able to validate the data with external resources, manual
>> intervention does not give a better result but a loss of availability. So
>> practically speaking, the Balance mode would be a better default value.
>>
>> 41. No, it represents how we want to do the unclean leader election. If it
>> is false, the unclean leader election will be the old random way.
>> Otherwise, the unclean recovery will be used.
>>
>> 42. Good catch. Updated.
>>
>> 43. Only the first 20 topics will be served. Others will be returned with
>> InvalidRequestError
>>
>> 44. The order matters. The desired leader entries match with the topic
>> partition list by the index.
>>
>> 45. Thanks! Updated.
>>
>> 46. Good advice! Updated.
>>
>> 47.1, updated the comment. Basically it will elect the replica in the
>> desiredLeader field to be the leader
>>
>> 47.2 We can let the admin client do the conversion. Using the desiredLeader
>> field in the json format seems easier for users.
>>
>> 48. Once the MV version is downgraded, all the ELR related fields will be
>> removed on the next partition change. The controller will also ignore the
>> ELR fields. Updated the KIP.
>>
>> 49. Yes, it would be deprecated/removed.
>>
>>
>> On Mon, Sep 25, 2023 at 3:49 PM Jun Rao  wrote:
>>
>> > Hi, Calvin,
>> >
>> > Thanks for the updated KIP. Made another pass. A few more comments below.
>> >
>> > 40. unclean.leader.election.enable.false ->
>> > unclean.recovery.strategy.Balanced: The Balanced mode could still lead to
>> > data loss. So, I am wondering if unclean.leader.election.enable.false
>> > should map to None?
>> >
>> > 41. unclean.recovery.manager.enabled: I am not sure why we introduce this
>> > additional config. Is it the same as unclean.recovery.strategy=None?
>> >
>> > 42. DescribeTopicResponse.TopicAuthorizedOperations: Should this be at
>> the
>> > topic level?
>> >
>> > 43. "Limit: 20 topics max per request": Could we describe what happens if
>> > the request includes more than 20 topics?
>> >
>> > 44. ElectLeadersRequest.DesiredLeaders: Could we describe whethe

Re: [PR] MINOR: Added missing Kafka Broker docs for metrics.jmx.(include|exclude) configs [kafka-site]

2023-10-03 Thread via GitHub


Dionakra closed pull request #517: MINOR: Added missing Kafka Broker docs for 
metrics.jmx.(include|exclude) configs
URL: https://github.com/apache/kafka-site/pull/517


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-03 Thread Walker Carlson
Hello Hanyu,

Looking over your kip things mostly make sense but I have a couple of
comments.


   1. You have "withDescandingOrder()". I think you mean "descending" :)
   Also there are still a few places in the do where its called "setReverse"
   2. Also I like "WithDescendingKeys()" better
   3. I'm not sure of what ordering guarantees we are offering. Perhaps we
   can add a section to the motivation clearly spelling out the current
   ordering and the new offering?
   4. When you say "use unbounded reverseQuery to achieve reverseAll" do
   you mean "use unbounded RangeQuery to achieve reverseAll"? as far as I can
   tell we don't have a reverseQuery as a named object?


Looking good so far

best,
Walker

On Tue, Oct 3, 2023 at 2:13 PM Colt McNealy  wrote:

> Hello Hanyu,
>
> Thank you for the KIP. I agree with Matthias' proposal to keep the naming
> convention consistent with KIP-969. I favor the `.withDescendingKeys()`
> name.
>
> I am curious about one thing. RocksDB guarantees that records returned
> during a range scan are lexicographically ordered by the bytes of the keys
> (either ascending or descending order, as specified in the query). This
> means that results within a single partition are indeed ordered.** My
> reading of KIP-805 suggests to me that you don't need to specify the
> partition number you are querying in IQv2, which means that you can have a
> valid reversed RangeQuery over a store with "multiple partitions" in it.
>
> Currently, IQv1 does not guarantee order of keys in this scenario. Does
> IQv2 support ordering across partitions? Such an implementation would
> require opening a rocksdb range scan** on multiple rocksdb instances (one
> per partition), and polling the first key of each. Whether or not this is
> ordered, could we please add that to the documentation?
>
> **(How is this implemented/guaranteed in an `inMemoryKeyValueStore`? I
> don't know about that implementation).
>
> Colt McNealy
>
> *Founder, LittleHorse.dev*
>
>
> On Tue, Oct 3, 2023 at 1:35 PM Hanyu (Peter) Zheng
>  wrote:
>
> > ok, I will update it. Thank you  Matthias
> >
> > Sincerely,
> > Hanyu
> >
> > On Tue, Oct 3, 2023 at 11:23 AM Matthias J. Sax 
> wrote:
> >
> > > Thanks for the KIP Hanyu!
> > >
> > >
> > > I took a quick look and it think the proposal makes sense overall.
> > >
> > > A few comments about how to structure the KIP.
> > >
> > > As you propose to not add `ReverseRangQuery` class, the code example
> > > should go into "Rejected Alternatives" section, not in the "Proposed
> > > Changes" section.
> > >
> > > For the `RangeQuery` code example, please omit all existing methods
> etc,
> > > and only include what will be added/changed. This make it simpler to
> > > read the KIP.
> > >
> > >
> > > nit: typo
> > >
> > > >  the fault value is false
> > >
> > > Should be "the default value is false".
> > >
> > >
> > > Not sure if `setReverse()` is the best name. Maybe
> `withDescandingOrder`
> > > (or similar, I guess `withReverseOrder` would also work) might be
> > > better? Would be good to align to KIP-969 proposal that suggest do use
> > > `withDescendingKeys` methods for "reverse key-range"; if we go with
> > > `withReverseOrder` we should change KIP-969 accordingly.
> > >
> > > Curious to hear what others think about naming this consistently across
> > > both KIPs.
> > >
> > >
> > > -Matthias
> > >
> > >
> > > On 10/3/23 9:17 AM, Hanyu (Peter) Zheng wrote:
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-985%3A+Add+reverseRange+and+reverseAll+query+over+kv-store+in+IQv2
> > > >
> > >
> >
> >
> > --
> >
> > [image: Confluent] 
> > Hanyu (Peter) Zheng he/him/his
> > Software Engineer Intern
> > +1 (213) 431-7193 <+1+(213)+431-7193>
> > Follow us: [image: Blog]
> > <
> >
> https://www.confluent.io/blog?utm_source=footer&utm_medium=email&utm_campaign=ch.email-signature_type.community_content.blog
> > >[image:
> > Twitter] [image: LinkedIn]
> > [image: Slack]
> > [image: YouTube]
> > 
> >
> > [image: Try Confluent Cloud for Free]
> > <
> >
> https://www.confluent.io/get-started?utm_campaign=tm.fm-apac_cd.inbound&utm_source=gmail&utm_medium=organic
> > >
> >
>


Re: [DISCUSS] KIP-980: Allow creating connectors in a stopped state

2023-10-03 Thread Chris Egerton
Hi Yash,

Thanks for the in-depth discussion! Continuations here:

1. Regarding delimiters (dots vs. underscores), we use dots in connector
configs for all runtime-recognized properties, including but not limited to
connector.class, tasks.max, key.converter, key.converter.*, and
transforms.*.type. Regarding the choice of name--I think it's because the
concept here is different from the "state" that we display in the status
API that a different name is warranted. Users can't directly write the
connector's actual state, they can still only specify a target state. And
there's no guarantee that a connector will ever enter a target state
(whether it's specified at creation time or later), since a failure can
always occur.

2. It still seems simpler to emit one record instead of two, especially
since we're not guaranteed that the leader will be using a transactional
producer. I guess I'm just wary of knowingly putting the config topic in an
inconsistent state (target state present without accompanying connector
config), even if it's meant to be only for a brief period. Thinking about
it some more though, a one-record approach comes with drawbacks in the
downgrade scenario: older workers wouldn't know how to handle the new
config format and would just fall back to creating the connector in the
running state. I suppose we should favor the two-record approach since the
downgrade scenario is more likely than the other failure mode, but it'd be
nice if we could think of a way to satisfy both concerns. Not a blocker,
though.

3. Standalone mode has always supported the REST API, and so far FWICT
we've maintained feature parity between the two modes for everything except
exactly-once source connectors, which would have required significant
additional work since we'd have to add support for storing source connector
offsets in a Kafka topic instead of on local storage like we currently do.
I'd really prefer if we could try to maintain feature parity wherever
possible--one way we could possibly do that with this KIP is to also add
support for JSON files with standalone mode.

4. Yeah, no need to block on that idea since there are other use cases for
creating stopped connectors. We can treat it like the option to delete
offsets along with the connector discussed in KIP-875: punt for now,
possibly implement later pending user feedback and indication of demand.
Might be worth adding to a "Future work" section as an indication that we
haven't ruled it out (in which case it'd make sense as a rejected
alternative) but have chosen not to implement yet.


And I had one new thought that's pretty implementation-oriented but may
influence the design slightly:

6. Right now we write an empty set of task configs to the config topic when
handling requests to stop a connector in distributed mode. Do we need to do
the same when creating connectors in the stopped state, or add any other
special logic besides noting the new state in the config topic? Or is it
sufficient to write a non-running target state to the config topic and then
rely on existing logic to simply refuse to generate task configs for the
newly-created connector? Is there any chance that the lack of task configs
(as opposed to an empty list of task configs) in the config topic for a
connector that exists will cause issues?


Cheers,

Chris

On Tue, Oct 3, 2023 at 3:29 AM Yash Mayya  wrote:

> Hi Chris,
>
> Thanks for taking a look at this KIP!
>
> 1. I chose to go with simply "state" as that exact term is already exposed
> via some of the existing REST API responses and would be one that users are
> already familiar with (although admittedly something like "initial_state"
> wouldn't be much of a jump). Since it's a field in the request body for the
> connector creation endpoint, wouldn't it be implied that it is the
> "initial" state just like the "config" field represents the "initial"
> configuration? Also, I don't think x.y has been established as the field
> naming convention in the Connect REST API right? From what I can tell, x_y
> is the convention being followed for fields in requests ("kafka_topic" /
> "kafka_partition" / "kafka_offset" in the offsets APIs for instance) and
> responses ("error_count", "kafka_cluster_id", "recommended_values" etc.).
>
> 2. The connector configuration record is currently used for both connector
> create requests as well as connector config update requests. Since we're
> only allowing configuring the target state for newly created connectors, I
> feel like it'll be a cleaner separation of concerns to use the existing
> records for connector configurations and connector target states rather
> than bundling the "state" and "state.v2" (or equivalent) fields into the
> connector configuration record. The additional write should be very minimal
> overhead and the two writes would be an atomic operation for Connect
> clusters that are using a transactional producer for the config topic
> anyway. Thoughts?
>
> 3. I was thinking that we'd su

[DISCUSS] KIP-988 Streams Standby Task Update Listener

2023-10-03 Thread Colt McNealy
Hi all,

We would like to propose a small KIP to improve the ability of Streams apps
to monitor the progress of their standby tasks through a callback interface.

We have a nearly-working implementation on our fork and are curious for
feedback.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-988%3A+Streams+Standby+Task+Update+Listener

Thank you,
Colt McNealy

*Founder, LittleHorse.dev*


Re: [VOTE] 3.6.0 RC2

2023-10-03 Thread Satish Duggana
Hi,
Thank you all for participating in the 3.6.0 release candidate vote threads.

This vote passes with 11 +1 votes(4 binding) and no 0 or -1 votes.

+1 votes
PMC Members:
* Luke Chen
* Bill Bejeck
* Chris Egerton
* Justine Olshan

Community:
* Jakub Scholz
* Federico Valeri
* Divij Vaidya
* Kamal Chandraprakash
* Josep Prat
* Proven Provenzano
* Greg Harris

0 votes
* No votes

-1 votes
* No votes

I will continue with the release process and the release announcement
will follow in the next few days.

Thanks,
Satish.

On Tue, 3 Oct 2023 at 12:35, Satish Duggana  wrote:
>
> Hi,
> Thank you all for participating in the 3.6.0 release candidate vote threads.
>
> This vote passes with 11 +1 votes(4 binding) and no 0 or -1 votes.
>
> +1 votes
> PMC Members:
> * Luke Chen
> * Bill Bejeck
> * Chris Egerton
> * Justine Olshan
>
> Community:
> * Jakub Scholz
> * Federico Valeri
> * Divij Vaidya
> * Kamal Chandraprakash
> * Josep Prat
> * Proven Provenzano
> * Greg Harris
>
> 0 votes
> * No votes
>
> -1 votes
> * No votes
>
> I will continue with the release process and the release announcement
> will follow in the next few days.
>
> Thanks,
> Satish.
>
> On Tue, 3 Oct 2023 at 09:54, Justine Olshan
>  wrote:
> >
> > Thanks folks for following up. Given my previous testing and the results
> > you've provided, I'm +1 (binding)
> >
> > I will also follow up with the non-blocking metrics documentation.
> >
> > Thanks!
> > Justine
> >
> > On Tue, Oct 3, 2023 at 8:17 AM Chris Egerton 
> > wrote:
> >
> > > Hi Satish,
> > >
> > > Thanks for running this release!
> > >
> > > To verify, I:
> > > - Built from source using Java 11 with both:
> > > - - the 3.6.0-rc2 tag on GitHub
> > > - - the kafka-3.6.0-src.tgz artifact from
> > > https://home.apache.org/~satishd/kafka-3.6.0-rc2/
> > > - Checked signatures and checksums
> > > - Ran the quickstart using the kafka_2.13-3.6.0.tgz artifact from
> > > https://home.apache.org/~satishd/kafka-3.6.0-rc2/ with Java 11 and Scala
> > > 13
> > > in KRaft mode
> > > - Ran all unit tests
> > > - Ran all integration tests for Connect and MM2
> > > - Verified that the connect-test-plugins module is present in the staging
> > > Maven artifacts (https://issues.apache.org/jira/browse/KAFKA-15249)
> > >
> > > Everything looks good to me!
> > >
> > > +1 (binding)
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Tue, Oct 3, 2023 at 6:43 AM Satish Duggana 
> > > wrote:
> > >
> > > > Thanks Luke for helping on running system tests on RCs and updating
> > > > the status on this email thread.
> > > >
> > > > ~Satish.
> > > >
> > > > On Tue, 3 Oct 2023 at 05:04, Luke Chen  wrote:
> > > > >
> > > > > Hi Justine and all,
> > > > >
> > > > > The system test result for 3.6.0 RC2 can be found below.
> > > > > In short, no failed tests. The flaky tests will pass in the 2nd run.
> > > > >
> > > >
> > > https://drive.google.com/drive/folders/1qwIKg-B4CBrswUeo5fBRv65KWpDsGUiS?usp=sharing
> > > > >
> > > > > Thank you.
> > > > > Luke
> > > > >
> > > > > On Tue, Oct 3, 2023 at 7:08 AM Justine Olshan
> > > > 
> > > > > wrote:
> > > > >
> > > > > > I realized Luke shared the results here for RC1
> > > > > >
> > > > https://drive.google.com/drive/folders/1S2XYd79f6_AeWj9f9qEkliRg7JtL04AC
> > > > > > Given we had some runs that looked reasonable, and we made a small
> > > > change,
> > > > > > I'm ok with this. But I wouldn't be upset if we had another set of
> > > > runs :)
> > > > > >
> > > > > > As for the validation:
> > > > > >
> > > > > >- I've compiled from source with java 17, 2.13, run the
> > > > transactional
> > > > > >produce bench
> > > > > >- Run unit tests
> > > > > >- Validated the checksums
> > > > > >- Downloaded and ran the 2.12 version of the release
> > > > > >- Briefly took a look at the documentation
> > > > > >- I was browsing through the site html files and I noticed the
> > > html
> > > > for
> > > > > >documentation.html seemed to be for 3.4. Not sure if this is a
> > > > blocker,
> > > > > > but
> > > > > >wanted to flag it. This seems to be the case for the previous
> > > > release
> > > > > >candidates as well. (As well as 3.5 release it seems)
> > > > > >
> > > > > >
> > > > > > I will hold off on voting until we figure that part out. I will also
> > > > follow
> > > > > > up with the documentation Divij mentioned outside this thread.
> > > > > >
> > > > > > Thanks,
> > > > > > Justine
> > > > > >
> > > > > > On Mon, Oct 2, 2023 at 3:05 PM Greg Harris
> > > > 
> > > > > > wrote:
> > > > > >
> > > > > > > Hey Satish,
> > > > > > >
> > > > > > > I verified KIP-898 functionality and the KAFKA-15473 patch.
> > > > > > > +1 (non-binding)
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > On Mon, Oct 2, 2023 at 1:28 PM Justine Olshan
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > Hey all -- I noticed we still have the system tests as something
> > > > that
> > > > > > > will
> > > > > > > > be updated. Did we get a run for this RC

Re: [VOTE] 3.6.0 RC2

2023-10-03 Thread Satish Duggana
Hi,
Thank you all for participating in the 3.6.0 release candidate vote threads.

This vote passes with 11 +1 votes(4 binding) and no 0 or -1 votes.

+1 votes
PMC Members:
* Luke Chen
* Bill Bejeck
* Chris Egerton
* Justine Olshan

Community:
* Jakub Scholz
* Federico Valeri
* Divij Vaidya
* Kamal Chandraprakash
* Josep Prat
* Proven Provenzano
* Greg Harris

0 votes
* No votes

-1 votes
* No votes

I will continue with the release process and the release announcement
will follow in the next few days.

Thanks,
Satish.

On Tue, 3 Oct 2023 at 09:54, Justine Olshan
 wrote:
>
> Thanks folks for following up. Given my previous testing and the results
> you've provided, I'm +1 (binding)
>
> I will also follow up with the non-blocking metrics documentation.
>
> Thanks!
> Justine
>
> On Tue, Oct 3, 2023 at 8:17 AM Chris Egerton 
> wrote:
>
> > Hi Satish,
> >
> > Thanks for running this release!
> >
> > To verify, I:
> > - Built from source using Java 11 with both:
> > - - the 3.6.0-rc2 tag on GitHub
> > - - the kafka-3.6.0-src.tgz artifact from
> > https://home.apache.org/~satishd/kafka-3.6.0-rc2/
> > - Checked signatures and checksums
> > - Ran the quickstart using the kafka_2.13-3.6.0.tgz artifact from
> > https://home.apache.org/~satishd/kafka-3.6.0-rc2/ with Java 11 and Scala
> > 13
> > in KRaft mode
> > - Ran all unit tests
> > - Ran all integration tests for Connect and MM2
> > - Verified that the connect-test-plugins module is present in the staging
> > Maven artifacts (https://issues.apache.org/jira/browse/KAFKA-15249)
> >
> > Everything looks good to me!
> >
> > +1 (binding)
> >
> > Cheers,
> >
> > Chris
> >
> > On Tue, Oct 3, 2023 at 6:43 AM Satish Duggana 
> > wrote:
> >
> > > Thanks Luke for helping on running system tests on RCs and updating
> > > the status on this email thread.
> > >
> > > ~Satish.
> > >
> > > On Tue, 3 Oct 2023 at 05:04, Luke Chen  wrote:
> > > >
> > > > Hi Justine and all,
> > > >
> > > > The system test result for 3.6.0 RC2 can be found below.
> > > > In short, no failed tests. The flaky tests will pass in the 2nd run.
> > > >
> > >
> > https://drive.google.com/drive/folders/1qwIKg-B4CBrswUeo5fBRv65KWpDsGUiS?usp=sharing
> > > >
> > > > Thank you.
> > > > Luke
> > > >
> > > > On Tue, Oct 3, 2023 at 7:08 AM Justine Olshan
> > > 
> > > > wrote:
> > > >
> > > > > I realized Luke shared the results here for RC1
> > > > >
> > > https://drive.google.com/drive/folders/1S2XYd79f6_AeWj9f9qEkliRg7JtL04AC
> > > > > Given we had some runs that looked reasonable, and we made a small
> > > change,
> > > > > I'm ok with this. But I wouldn't be upset if we had another set of
> > > runs :)
> > > > >
> > > > > As for the validation:
> > > > >
> > > > >- I've compiled from source with java 17, 2.13, run the
> > > transactional
> > > > >produce bench
> > > > >- Run unit tests
> > > > >- Validated the checksums
> > > > >- Downloaded and ran the 2.12 version of the release
> > > > >- Briefly took a look at the documentation
> > > > >- I was browsing through the site html files and I noticed the
> > html
> > > for
> > > > >documentation.html seemed to be for 3.4. Not sure if this is a
> > > blocker,
> > > > > but
> > > > >wanted to flag it. This seems to be the case for the previous
> > > release
> > > > >candidates as well. (As well as 3.5 release it seems)
> > > > >
> > > > >
> > > > > I will hold off on voting until we figure that part out. I will also
> > > follow
> > > > > up with the documentation Divij mentioned outside this thread.
> > > > >
> > > > > Thanks,
> > > > > Justine
> > > > >
> > > > > On Mon, Oct 2, 2023 at 3:05 PM Greg Harris
> > > 
> > > > > wrote:
> > > > >
> > > > > > Hey Satish,
> > > > > >
> > > > > > I verified KIP-898 functionality and the KAFKA-15473 patch.
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > On Mon, Oct 2, 2023 at 1:28 PM Justine Olshan
> > > > > >  wrote:
> > > > > > >
> > > > > > > Hey all -- I noticed we still have the system tests as something
> > > that
> > > > > > will
> > > > > > > be updated. Did we get a run for this RC?
> > > > > > >
> > > > > > > On Mon, Oct 2, 2023 at 1:24 PM Bill Bejeck 
> > > wrote:
> > > > > > >
> > > > > > > > Hi Satish,
> > > > > > > >
> > > > > > > > Thanks for running the release.
> > > > > > > > I performed the following steps:
> > > > > > > >
> > > > > > > >- Validated all the checksums, signatures, and keys
> > > > > > > >- Built the release from source
> > > > > > > >- Ran all unit tests
> > > > > > > >- Quick start validations
> > > > > > > >   - ZK and Kraft
> > > > > > > >   - Connect
> > > > > > > >   - Kafka Streams
> > > > > > > >- Spot checked java docs and documentation
> > > > > > > >
> > > > > > > > +1 (binding)
> > > > > > > >
> > > > > > > > - Bill
> > > > > > > >
> > > > > > > > On Mon, Oct 2, 2023 at 10:23 AM Proven Provenzano
> > > > > > > >  wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > 

Re: [VOTE] KIP-951: Leader discovery optimisations for the client

2023-10-03 Thread David Jacot
Thanks for the KIP. +1 from me as well.

Best,
David

Le mar. 3 oct. 2023 à 20:54, Jun Rao  a écrit :

> Hi, Mayank,
>
> Thanks for the detailed explanation in the KIP. +1 from me.
>
> Jun
>
> On Wed, Sep 27, 2023 at 4:39 AM Mayank Shekhar Narula <
> mayanks.nar...@gmail.com> wrote:
>
> > Reviving this thread, as the discussion thread has been updated.
> >
> > On Fri, Jul 28, 2023 at 11:29 AM Mayank Shekhar Narula <
> > mayanks.nar...@gmail.com> wrote:
> >
> > > Thanks Jose.
> > >
> > > On Thu, Jul 27, 2023 at 5:46 PM José Armando García Sancio
> > >  wrote:
> > >
> > >> The KIP LGTM. Thanks for the design. I am looking forward to the
> > >> implementation.
> > >>
> > >> +1 (binding).
> > >>
> > >> Thanks!
> > >> --
> > >> -José
> > >>
> > >
> > >
> > > --
> > > Regards,
> > > Mayank Shekhar Narula
> > >
> >
> >
> > --
> > Regards,
> > Mayank Shekhar Narula
> >
>


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #2252

2023-10-03 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-03 Thread Colt McNealy
Hello Hanyu,

Thank you for the KIP. I agree with Matthias' proposal to keep the naming
convention consistent with KIP-969. I favor the `.withDescendingKeys()`
name.

I am curious about one thing. RocksDB guarantees that records returned
during a range scan are lexicographically ordered by the bytes of the keys
(either ascending or descending order, as specified in the query). This
means that results within a single partition are indeed ordered.** My
reading of KIP-805 suggests to me that you don't need to specify the
partition number you are querying in IQv2, which means that you can have a
valid reversed RangeQuery over a store with "multiple partitions" in it.

Currently, IQv1 does not guarantee order of keys in this scenario. Does
IQv2 support ordering across partitions? Such an implementation would
require opening a rocksdb range scan** on multiple rocksdb instances (one
per partition), and polling the first key of each. Whether or not this is
ordered, could we please add that to the documentation?

**(How is this implemented/guaranteed in an `inMemoryKeyValueStore`? I
don't know about that implementation).

Colt McNealy

*Founder, LittleHorse.dev*


On Tue, Oct 3, 2023 at 1:35 PM Hanyu (Peter) Zheng
 wrote:

> ok, I will update it. Thank you  Matthias
>
> Sincerely,
> Hanyu
>
> On Tue, Oct 3, 2023 at 11:23 AM Matthias J. Sax  wrote:
>
> > Thanks for the KIP Hanyu!
> >
> >
> > I took a quick look and it think the proposal makes sense overall.
> >
> > A few comments about how to structure the KIP.
> >
> > As you propose to not add `ReverseRangQuery` class, the code example
> > should go into "Rejected Alternatives" section, not in the "Proposed
> > Changes" section.
> >
> > For the `RangeQuery` code example, please omit all existing methods etc,
> > and only include what will be added/changed. This make it simpler to
> > read the KIP.
> >
> >
> > nit: typo
> >
> > >  the fault value is false
> >
> > Should be "the default value is false".
> >
> >
> > Not sure if `setReverse()` is the best name. Maybe `withDescandingOrder`
> > (or similar, I guess `withReverseOrder` would also work) might be
> > better? Would be good to align to KIP-969 proposal that suggest do use
> > `withDescendingKeys` methods for "reverse key-range"; if we go with
> > `withReverseOrder` we should change KIP-969 accordingly.
> >
> > Curious to hear what others think about naming this consistently across
> > both KIPs.
> >
> >
> > -Matthias
> >
> >
> > On 10/3/23 9:17 AM, Hanyu (Peter) Zheng wrote:
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-985%3A+Add+reverseRange+and+reverseAll+query+over+kv-store+in+IQv2
> > >
> >
>
>
> --
>
> [image: Confluent] 
> Hanyu (Peter) Zheng he/him/his
> Software Engineer Intern
> +1 (213) 431-7193 <+1+(213)+431-7193>
> Follow us: [image: Blog]
> <
> https://www.confluent.io/blog?utm_source=footer&utm_medium=email&utm_campaign=ch.email-signature_type.community_content.blog
> >[image:
> Twitter] [image: LinkedIn]
> [image: Slack]
> [image: YouTube]
> 
>
> [image: Try Confluent Cloud for Free]
> <
> https://www.confluent.io/get-started?utm_campaign=tm.fm-apac_cd.inbound&utm_source=gmail&utm_medium=organic
> >
>


[jira] [Created] (KAFKA-15535) Add documentation of "remote.log.index.file.cache.total.size.bytes" configuration property.

2023-10-03 Thread Satish Duggana (Jira)
Satish Duggana created KAFKA-15535:
--

 Summary: Add documentation of 
"remote.log.index.file.cache.total.size.bytes" configuration property. 
 Key: KAFKA-15535
 URL: https://issues.apache.org/jira/browse/KAFKA-15535
 Project: Kafka
  Issue Type: Task
  Components: documentation
Reporter: Satish Duggana
 Fix For: 3.7.0


Add documentation of "remote.log.index.file.cache.total.size.bytes" 
configuration property. 

Please double check all the existing public tiered storage configurations. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15534) Propagate client response time when timeout to the request handler

2023-10-03 Thread Philip Nee (Jira)
Philip Nee created KAFKA-15534:
--

 Summary: Propagate client response time when timeout to the 
request handler
 Key: KAFKA-15534
 URL: https://issues.apache.org/jira/browse/KAFKA-15534
 Project: Kafka
  Issue Type: Bug
Reporter: Philip Nee
Assignee: Philip Nee


Currently, we don't have a good way to propagate the response time to the 
handler when timeout is thrown.
{code:java}
unsent.handler.onFailure(new TimeoutException(
"Failed to send request after " + unsent.timer.timeoutMs() + " ms.")); 
{code}
The current request manager invoke a system call to retrieve the response time, 
which is not idea because it is already available at network client

This is an example of the coordinator request manager:
{code:java}
unsentRequest.future().whenComplete((clientResponse, throwable) -> {
long responseTimeMs = time.milliseconds();
if (clientResponse != null) {
FindCoordinatorResponse response = (FindCoordinatorResponse) 
clientResponse.responseBody();
onResponse(responseTimeMs, response);
} else {
onFailedResponse(responseTimeMs, throwable);
}
}); {code}
But in the networkClientDelegate, we should utilize the currentTimeMs in the 
trySend to avoid calling time.milliseconds():
{code:java}
private void trySend(final long currentTimeMs) {
...
unsent.handler.onFailure(new TimeoutException(
"Failed to send request after " + unsent.timer.timeoutMs() + " ms."));
continue;
}
} {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-951: Leader discovery optimisations for the client

2023-10-03 Thread Jun Rao
Hi, Mayank,

Thanks for the detailed explanation in the KIP. +1 from me.

Jun

On Wed, Sep 27, 2023 at 4:39 AM Mayank Shekhar Narula <
mayanks.nar...@gmail.com> wrote:

> Reviving this thread, as the discussion thread has been updated.
>
> On Fri, Jul 28, 2023 at 11:29 AM Mayank Shekhar Narula <
> mayanks.nar...@gmail.com> wrote:
>
> > Thanks Jose.
> >
> > On Thu, Jul 27, 2023 at 5:46 PM José Armando García Sancio
> >  wrote:
> >
> >> The KIP LGTM. Thanks for the design. I am looking forward to the
> >> implementation.
> >>
> >> +1 (binding).
> >>
> >> Thanks!
> >> --
> >> -José
> >>
> >
> >
> > --
> > Regards,
> > Mayank Shekhar Narula
> >
>
>
> --
> Regards,
> Mayank Shekhar Narula
>


Re: Permission to Create KIP

2023-10-03 Thread Greg Harris
Hey Colt,

You should be all set. Looking forward to the KIP!

Greg

On Tue, Oct 3, 2023 at 11:37 AM Colt McNealy  wrote:
>
> Hello there,
>
> Could I please have access to create a Wiki page? A team member and I would
> like to jointly propose a small KIP.
>
> JIRA id: coltmcnealy-lh
>
> Thank you,
> Colt McNealy
>
> *Founder, LittleHorse.dev*


Permission to Create KIP

2023-10-03 Thread Colt McNealy
Hello there,

Could I please have access to create a Wiki page? A team member and I would
like to jointly propose a small KIP.

JIRA id: coltmcnealy-lh

Thank you,
Colt McNealy

*Founder, LittleHorse.dev*


Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-03 Thread Hanyu (Peter) Zheng
ok, I will update it. Thank you  Matthias

Sincerely,
Hanyu

On Tue, Oct 3, 2023 at 11:23 AM Matthias J. Sax  wrote:

> Thanks for the KIP Hanyu!
>
>
> I took a quick look and it think the proposal makes sense overall.
>
> A few comments about how to structure the KIP.
>
> As you propose to not add `ReverseRangQuery` class, the code example
> should go into "Rejected Alternatives" section, not in the "Proposed
> Changes" section.
>
> For the `RangeQuery` code example, please omit all existing methods etc,
> and only include what will be added/changed. This make it simpler to
> read the KIP.
>
>
> nit: typo
>
> >  the fault value is false
>
> Should be "the default value is false".
>
>
> Not sure if `setReverse()` is the best name. Maybe `withDescandingOrder`
> (or similar, I guess `withReverseOrder` would also work) might be
> better? Would be good to align to KIP-969 proposal that suggest do use
> `withDescendingKeys` methods for "reverse key-range"; if we go with
> `withReverseOrder` we should change KIP-969 accordingly.
>
> Curious to hear what others think about naming this consistently across
> both KIPs.
>
>
> -Matthias
>
>
> On 10/3/23 9:17 AM, Hanyu (Peter) Zheng wrote:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-985%3A+Add+reverseRange+and+reverseAll+query+over+kv-store+in+IQv2
> >
>


-- 

[image: Confluent] 
Hanyu (Peter) Zheng he/him/his
Software Engineer Intern
+1 (213) 431-7193 <+1+(213)+431-7193>
Follow us: [image: Blog]
[image:
Twitter] [image: LinkedIn]
[image: Slack]
[image: YouTube]


[image: Try Confluent Cloud for Free]



Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-03 Thread Matthias J. Sax

Thanks for the KIP Hanyu!


I took a quick look and it think the proposal makes sense overall.

A few comments about how to structure the KIP.

As you propose to not add `ReverseRangQuery` class, the code example 
should go into "Rejected Alternatives" section, not in the "Proposed 
Changes" section.


For the `RangeQuery` code example, please omit all existing methods etc, 
and only include what will be added/changed. This make it simpler to 
read the KIP.



nit: typo


 the fault value is false


Should be "the default value is false".


Not sure if `setReverse()` is the best name. Maybe `withDescandingOrder` 
(or similar, I guess `withReverseOrder` would also work) might be 
better? Would be good to align to KIP-969 proposal that suggest do use 
`withDescendingKeys` methods for "reverse key-range"; if we go with 
`withReverseOrder` we should change KIP-969 accordingly.


Curious to hear what others think about naming this consistently across 
both KIPs.



-Matthias


On 10/3/23 9:17 AM, Hanyu (Peter) Zheng wrote:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-985%3A+Add+reverseRange+and+reverseAll+query+over+kv-store+in+IQv2



[jira] [Created] (KAFKA-15533) Ensure HeartbeatRequestManager only send out some fields once

2023-10-03 Thread Philip Nee (Jira)
Philip Nee created KAFKA-15533:
--

 Summary: Ensure HeartbeatRequestManager only send out some fields 
once
 Key: KAFKA-15533
 URL: https://issues.apache.org/jira/browse/KAFKA-15533
 Project: Kafka
  Issue Type: Bug
Reporter: Philip Nee
Assignee: Philip Nee


We want to ensure ConsumerGroupHeartbeatRequest is as lightweight as possible, 
so a lot of fields in it don't need to be resend. An example would be the 
rebalanceTimeoutMs, currently we have the following code:

 

 
{code:java}
ConsumerGroupHeartbeatRequestData data = new ConsumerGroupHeartbeatRequestData()
.setGroupId(membershipManager.groupId())
.setMemberEpoch(membershipManager.memberEpoch())
.setMemberId(membershipManager.memberId())
.setRebalanceTimeoutMs(rebalanceTimeoutMs); {code}
 

 

We should encapsulate these once-used fields into a class such as 
HeartbeatMetdataBuilder, and it should maintain a state of whether a certain 
field needs to be sent or not.

 

Note that, currently only 3 fields are mandatory in the request:
 * groupId
 * memberEpoch
 * memberId



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-10-03 Thread Jun Rao
Hi, Calvin,

Thanks for the update KIP. A few more comments.

41. Why would a user choose the option to select a random replica as the
leader instead of using unclean.recovery.strateg=Aggressive? It seems that
the latter is strictly better? If that's not the case, could we fold this
option under unclean.recovery.strategy instead of introducing a separate
config?

50. ElectLeadersRequest: "If more than 20 topics are included, only the
first 20 will be served. Others will be returned with DesiredLeaders." Hmm,
not sure that I understand this. ElectLeadersResponse doesn't have a
DesiredLeaders field.

51. GetReplicaLogInfo: "If more than 2000 partitions are included, only the
first 2000 will be served" Do we return an error for the remaining
partitions? Actually, should we include an errorCode field at the partition
level in GetReplicaLogInfoResponse to cover non-existing partitions and no
authorization, etc?

52. The entry should matches => The entry should match

53. ElectLeadersRequest.DesiredLeaders: Should it be nullable since a user
may not specify DesiredLeaders?

54. Downgrade: Is that indeed possible? I thought earlier you said that
once the new version of the records are in the metadata log, one can't
downgrade since the old broker doesn't know how to parse the new version of
the metadata records?

55. CleanShutdownFile: Should we add a version field for future extension?

56. Config changes are public facing. Could we have a separate section to
document all the config changes?

Thanks,

Jun

On Mon, Sep 25, 2023 at 4:29 PM Calvin Liu 
wrote:

> Hi Jun
> Thanks for the comments.
>
> 40. If we change to None, it is not guaranteed for no data loss. For users
> who are not able to validate the data with external resources, manual
> intervention does not give a better result but a loss of availability. So
> practically speaking, the Balance mode would be a better default value.
>
> 41. No, it represents how we want to do the unclean leader election. If it
> is false, the unclean leader election will be the old random way.
> Otherwise, the unclean recovery will be used.
>
> 42. Good catch. Updated.
>
> 43. Only the first 20 topics will be served. Others will be returned with
> InvalidRequestError
>
> 44. The order matters. The desired leader entries match with the topic
> partition list by the index.
>
> 45. Thanks! Updated.
>
> 46. Good advice! Updated.
>
> 47.1, updated the comment. Basically it will elect the replica in the
> desiredLeader field to be the leader
>
> 47.2 We can let the admin client do the conversion. Using the desiredLeader
> field in the json format seems easier for users.
>
> 48. Once the MV version is downgraded, all the ELR related fields will be
> removed on the next partition change. The controller will also ignore the
> ELR fields. Updated the KIP.
>
> 49. Yes, it would be deprecated/removed.
>
>
> On Mon, Sep 25, 2023 at 3:49 PM Jun Rao  wrote:
>
> > Hi, Calvin,
> >
> > Thanks for the updated KIP. Made another pass. A few more comments below.
> >
> > 40. unclean.leader.election.enable.false ->
> > unclean.recovery.strategy.Balanced: The Balanced mode could still lead to
> > data loss. So, I am wondering if unclean.leader.election.enable.false
> > should map to None?
> >
> > 41. unclean.recovery.manager.enabled: I am not sure why we introduce this
> > additional config. Is it the same as unclean.recovery.strategy=None?
> >
> > 42. DescribeTopicResponse.TopicAuthorizedOperations: Should this be at
> the
> > topic level?
> >
> > 43. "Limit: 20 topics max per request": Could we describe what happens if
> > the request includes more than 20 topics?
> >
> > 44. ElectLeadersRequest.DesiredLeaders: Could we describe whether the
> > ordering matters?
> >
> > 45. GetReplicaLogInfo.TopicPartitions: "about": "The topic partitions to
> > elect leaders.": The description in "about" is incorrect.
> >
> > 46. GetReplicaLogInfoResponse: Should we nest partitions under topicId to
> > be consistent with other types of responses?
> >
> > 47. kafka-leader-election.sh:
> > 47.1 Could we explain DESIGNATION?
> > 47.2 desiredLeader: Should it be a list to match the field in
> > ElectLeadersRequest?
> >
> > 48. We could add a section on downgrade?
> >
> > 49. LastKnownLeader: This seems only needed in the first phase of
> > delivering ELR. Will it be removed when the complete KIP is delivered?
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Sep 19, 2023 at 1:30 PM Colin McCabe  wrote:
> >
> > > Hi Calvin,
> > >
> > > Thanks for the explanations. I like the idea of using none, balanced,
> > > aggressive. We also had an offline discussion about why it is good to
> > use a
> > > new config key (basically, so that we can deprecate the old one which
> had
> > > only false/true values in 4.0) With these changes, I am +1.
> > >
> > > best,
> > > Colin
> > >
> > > On Mon, Sep 18, 2023, at 15:54, Calvin Liu wrote:
> > > > Hi Colin,
> > > > Also, can we deprecate unclean.leader.election.enable in 4.0? Before
> > > 

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-10-03 Thread Walker Carlson
Hey Alieh thanks for the KIP,

Weighing in on the AsOf vs Until debate I think either is fine from a
natural language perspective. Personally AsOf makes more sense to me where
until gives me the idea that the query is making a change. It's totally a
connotative difference and not that important. I think as of is pretty
frequently used in point of time queries.

Also for these methods it makes sense to drop the "get" We don't
normally use that in getters

   * The key that was specified for this query.
   */
  public K getKey();

  /**
   * The starting time point of the query, if specified
   */
  public Optional getFromTimestamp();

  /**
   * The ending time point of the query, if specified
   */
  public Optional getAsOfTimestamp();

Other than that I didn't have too much to add. Overall I like the direction
of the KIP and think the funcatinlyt is all there!
best,
Walker



On Mon, Oct 2, 2023 at 10:46 PM Matthias J. Sax  wrote:

> Thanks for the updated KIP. Overall I like it.
>
> Victoria raises a very good point, and I personally tend to prefer (I
> believe so does Victoria, but it's not totally clear from her email) if
> a range query would not return any tombstones, ie, only two records in
> Victoria's example. Thus, it seems best to include a `validTo` ts-field
> to `VersionedRecord` -- otherwise, the retrieved result cannot be
> interpreted correctly.
>
> Not sure what others think about it.
>
> I would also be open to actually add a `includeDeletes()` (or
> `includeTombstones()`) method/flag (disabled by default) to allow users
> to get all tombstone: this would only be helpful if there are two
> consecutive tombstone though (if I got it right), so not sure if we want
> to add it or not -- it seems also possible to add it later if there is
> user demand for it, so it might be a premature addition as this point?
>
>
> Nit:
>
> > the public interface ValueIterator is used
>
> "is used" -> "is added" (otherwise it sounds like as if `ValueIterator`
> exist already)
>
>
>
> Should we also add a `.within(fromTs, toTs)` (or maybe some better
> name?) to allow specifying both bounds at once? The existing
> `RangeQuery` does the same for specifying the key-range, so might be
> good to add for time-range too?
>
>
>
> -Matthias
>
>
> On 9/6/23 5:01 AM, Bruno Cadonna wrote:
> > In my last e-mail I missed to finish a sentence.
> >
> > "I think from a KIP"
> >
> > should be
> >
> > "I think the KIP looks good!"
> >
> >
> > On 9/6/23 1:59 PM, Bruno Cadonna wrote:
> >> Hi Alieh,
> >>
> >> Thanks for the KIP!
> >>
> >> I think from a KIP
> >>
> >> 1.
> >> I propose to throw an IllegalArgumentException or an
> >> IllegalStateException for meaningless combinations. In any case, the
> >> KIP should specify what exception is thrown.
> >>
> >> 2.
> >> Why does not specifying a range return the latest version? I would
> >> expect that it returns all versions since an empty lower or upper
> >> limit is interpreted as no limit.
> >>
> >> 3.
> >> I second Matthias comment about replacing "asOf" with "until" or "to".
> >>
> >> 4.
> >> Do we need "allVersions()"? As I said above I would return all
> >> versions if no limits are specified. I think if we get rid of
> >> allVersions() there might not be any meaningless combinations anymore.
> >> If a user applies twice the same limit like for example
> >> MultiVersionedKeyQuery.with(key).from(t1).from(t2) the last one wins.
> >>
> >> 5.
> >> Could you add some more examples with time ranges to the example
> section?
> >>
> >> 6.
> >> The KIP misses the test plan section.
> >>
> >> 7.
> >> I propose to rename the class to "MultiVersionKeyQuery" since we are
> >> querying multiple versions of the same key.
> >>
> >> 8.
> >> Could you also add withAscendingTimestamps()? IMO it gives users the
> >> possibility to make their code more readable instead of only relying
> >> on the default.
> >>
> >> Best,
> >> Bruno
> >>
> >>
> >> On 8/17/23 4:13 AM, Matthias J. Sax wrote:
> >>> Thanks for splitting this part into a separate KIP!
> >>>
> >>> For `withKey()` we should be explicit that `null` is not allowed.
> >>>
> >>> (Looking into existing `KeyQuery` it seems the JavaDocs don't cover
> >>> this either -- would you like to do a tiny cleanup PR for this, or
> >>> fix on-the-side in one of your PRs?)
> >>>
> >>>
> >>>
>  The key query returns all the records that are valid in the time
>  range starting from the timestamp {@code fromTimestamp}.
> >>>
> >>> In the JavaDocs you use the phrase `are valid` -- I think we need to
> >>> explain what "valid" means? It might even be worth to add some
> >>> examples. It's annoying, but being precise if kinda important.
> >>>
> >>> With regard to KIP-962, should we allow `null` for time bounds ? The
> >>> JavaDocs should also be explicit if `null` is allowed or not and what
> >>> the semantics are if allowed.
> >>>
> >>>
> >>>
> >>> You are using `asOf()` however, because we are doing time-range
> >>> queries, to me using `until()` to des

[jira] [Created] (KAFKA-15532) ZkWriteBehindLag should not be reported by inactive controllers

2023-10-03 Thread David Arthur (Jira)
David Arthur created KAFKA-15532:


 Summary: ZkWriteBehindLag should not be reported by inactive 
controllers
 Key: KAFKA-15532
 URL: https://issues.apache.org/jira/browse/KAFKA-15532
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.6.0
Reporter: David Arthur


Since only the active controller is performing the dual-write to ZK during a 
migration, it should be the only controller to report the ZkWriteBehindLag 
metric. 

 

Currently, if the controller fails over during a migration, the previous active 
controller will incorrectly report its last value for ZkWriteBehindLag forever. 
Instead, it should report zero.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-939: Support Participation in 2PC

2023-10-03 Thread Justine Olshan
Hey Artem,

Thanks for the KIP. I had a question about epoch bumping.

Previously when we send an InitProducerId request on Producer startup, we
bump the epoch and abort the transaction. Is it correct to assume that we
will still bump the epoch, but just not abort the transaction?
If we still bump the epoch in this case, how does this interact with
KIP-890 where we also bump the epoch on every transaction. (I think this
means that we may skip epochs and the data itself will all have the same
epoch)

I may have follow ups depending on the answer to this. :)

Thanks,
Justine

On Thu, Sep 7, 2023 at 9:51 PM Artem Livshits
 wrote:

> Hi Alex,
>
> Thank you for your questions.
>
> > the purpose of having broker-level transaction.two.phase.commit.enable
>
> The thinking is that 2PC is a bit of an advanced construct so enabling 2PC
> in a Kafka cluster should be an explicit decision.  If it is set to 'false'
> InitiProducerId (and initTransactions) would
> return TRANSACTIONAL_ID_AUTHORIZATION_FAILED.
>
> > WDYT about adding an AdminClient method that returns the state of
> transaction.two.phase.commit.enable
>
> I wonder if the client could just try to use 2PC and then handle the error
> (e.g. if it needs to fall back to ordinary transactions).  This way it
> could uniformly handle cases when Kafka cluster doesn't support 2PC
> completely and cases when 2PC is restricted to certain users.  We could
> also expose this config in describeConfigs, if the fallback approach
> doesn't work for some scenarios.
>
> -Artem
>
>
> On Tue, Sep 5, 2023 at 12:45 PM Alexander Sorokoumov
>  wrote:
>
> > Hi Artem,
> >
> > Thanks for publishing this KIP!
> >
> > Can you please clarify the purpose of having broker-level
> > transaction.two.phase.commit.enable config in addition to the new ACL? If
> > the brokers are configured with
> transaction.two.phase.commit.enable=false,
> > at what point will a client configured with
> > transaction.two.phase.commit.enable=true fail? Will it happen at
> > KafkaProducer#initTransactions?
> >
> > WDYT about adding an AdminClient method that returns the state of t
> > ransaction.two.phase.commit.enable? This way, clients would know in
> advance
> > if 2PC is enabled on the brokers.
> >
> > Best,
> > Alex
> >
> > On Fri, Aug 25, 2023 at 9:40 AM Roger Hoover 
> > wrote:
> >
> > > Other than supporting multiplexing transactional streams on a single
> > > producer, I don't see how to improve it.
> > >
> > > On Thu, Aug 24, 2023 at 12:12 PM Artem Livshits
> > >  wrote:
> > >
> > > > Hi Roger,
> > > >
> > > > Thank you for summarizing the cons.  I agree and I'm curious what
> would
> > > be
> > > > the alternatives to solve these problems better and if they can be
> > > > incorporated into this proposal (or built independently in addition
> to
> > or
> > > > on top of this proposal).  E.g. one potential extension we discussed
> > > > earlier in the thread could be multiplexing logical transactional
> > > "streams"
> > > > with a single producer.
> > > >
> > > > -Artem
> > > >
> > > > On Wed, Aug 23, 2023 at 4:50 PM Roger Hoover  >
> > > > wrote:
> > > >
> > > > > Thanks.  I like that you're moving Kafka toward supporting this
> > > > dual-write
> > > > > pattern.  Each use case needs to consider the tradeoffs.  You
> already
> > > > > summarized the pros very well in the KIP.  I would summarize the
> cons
> > > > > as follows:
> > > > >
> > > > > - you sacrifice availability - each write requires both DB and
> Kafka
> > to
> > > > be
> > > > > available so I think your overall application availability is 1 -
> > p(DB
> > > is
> > > > > unavailable)*p(Kafka is unavailable).
> > > > > - latency will be higher and throughput lower - each write requires
> > > both
> > > > > writes to DB and Kafka while holding an exclusive lock in DB.
> > > > > - you need to create a producer per unit of concurrency in your app
> > > which
> > > > > has some overhead in the app and Kafka side (number of connections,
> > > poor
> > > > > batching).  I assume the producers would need to be configured for
> > low
> > > > > latency (linger.ms=0)
> > > > > - there's some complexity in managing stable transactional ids for
> > each
> > > > > producer/concurrency unit in your application.  With k8s
> deployment,
> > > you
> > > > > may need to switch to something like a StatefulSet that gives each
> > pod
> > > a
> > > > > stable identity across restarts.  On top of that pod identity which
> > you
> > > > can
> > > > > use as a prefix, you then assign unique transactional ids to each
> > > > > concurrency unit (thread/goroutine).
> > > > >
> > > > > On Wed, Aug 23, 2023 at 12:53 PM Artem Livshits
> > > > >  wrote:
> > > > >
> > > > > > Hi Roger,
> > > > > >
> > > > > > Thank you for the feedback.  You make a very good point that we
> > also
> > > > > > discussed internally.  Adding support for multiple concurrent
> > > > > > transactions in one producer could be valuable but it seems to
> be a
> > > > > fairly
> > > > > > large 

[jira] [Created] (KAFKA-15531) Ensure coordinator node is removed upon disconnection exception

2023-10-03 Thread Philip Nee (Jira)
Philip Nee created KAFKA-15531:
--

 Summary: Ensure coordinator node is removed upon disconnection 
exception
 Key: KAFKA-15531
 URL: https://issues.apache.org/jira/browse/KAFKA-15531
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Reporter: Philip Nee
Assignee: Philip Nee


In the async consumer, the coordinator isn't being removed when receiving the 
following exception:

 
{code:java}
(e instanceof DisconnectException) {
  markCoordinatorUnknown(true, e.getMessage());
}{code}
 

This should happen on all requests going to coordinator node:

1. heartbeat 2. offset fetch/commit 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] 3.6.0 RC2

2023-10-03 Thread Justine Olshan
Thanks folks for following up. Given my previous testing and the results
you've provided, I'm +1 (binding)

I will also follow up with the non-blocking metrics documentation.

Thanks!
Justine

On Tue, Oct 3, 2023 at 8:17 AM Chris Egerton 
wrote:

> Hi Satish,
>
> Thanks for running this release!
>
> To verify, I:
> - Built from source using Java 11 with both:
> - - the 3.6.0-rc2 tag on GitHub
> - - the kafka-3.6.0-src.tgz artifact from
> https://home.apache.org/~satishd/kafka-3.6.0-rc2/
> - Checked signatures and checksums
> - Ran the quickstart using the kafka_2.13-3.6.0.tgz artifact from
> https://home.apache.org/~satishd/kafka-3.6.0-rc2/ with Java 11 and Scala
> 13
> in KRaft mode
> - Ran all unit tests
> - Ran all integration tests for Connect and MM2
> - Verified that the connect-test-plugins module is present in the staging
> Maven artifacts (https://issues.apache.org/jira/browse/KAFKA-15249)
>
> Everything looks good to me!
>
> +1 (binding)
>
> Cheers,
>
> Chris
>
> On Tue, Oct 3, 2023 at 6:43 AM Satish Duggana 
> wrote:
>
> > Thanks Luke for helping on running system tests on RCs and updating
> > the status on this email thread.
> >
> > ~Satish.
> >
> > On Tue, 3 Oct 2023 at 05:04, Luke Chen  wrote:
> > >
> > > Hi Justine and all,
> > >
> > > The system test result for 3.6.0 RC2 can be found below.
> > > In short, no failed tests. The flaky tests will pass in the 2nd run.
> > >
> >
> https://drive.google.com/drive/folders/1qwIKg-B4CBrswUeo5fBRv65KWpDsGUiS?usp=sharing
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Tue, Oct 3, 2023 at 7:08 AM Justine Olshan
> > 
> > > wrote:
> > >
> > > > I realized Luke shared the results here for RC1
> > > >
> > https://drive.google.com/drive/folders/1S2XYd79f6_AeWj9f9qEkliRg7JtL04AC
> > > > Given we had some runs that looked reasonable, and we made a small
> > change,
> > > > I'm ok with this. But I wouldn't be upset if we had another set of
> > runs :)
> > > >
> > > > As for the validation:
> > > >
> > > >- I've compiled from source with java 17, 2.13, run the
> > transactional
> > > >produce bench
> > > >- Run unit tests
> > > >- Validated the checksums
> > > >- Downloaded and ran the 2.12 version of the release
> > > >- Briefly took a look at the documentation
> > > >- I was browsing through the site html files and I noticed the
> html
> > for
> > > >documentation.html seemed to be for 3.4. Not sure if this is a
> > blocker,
> > > > but
> > > >wanted to flag it. This seems to be the case for the previous
> > release
> > > >candidates as well. (As well as 3.5 release it seems)
> > > >
> > > >
> > > > I will hold off on voting until we figure that part out. I will also
> > follow
> > > > up with the documentation Divij mentioned outside this thread.
> > > >
> > > > Thanks,
> > > > Justine
> > > >
> > > > On Mon, Oct 2, 2023 at 3:05 PM Greg Harris
> > 
> > > > wrote:
> > > >
> > > > > Hey Satish,
> > > > >
> > > > > I verified KIP-898 functionality and the KAFKA-15473 patch.
> > > > > +1 (non-binding)
> > > > >
> > > > > Thanks!
> > > > >
> > > > > On Mon, Oct 2, 2023 at 1:28 PM Justine Olshan
> > > > >  wrote:
> > > > > >
> > > > > > Hey all -- I noticed we still have the system tests as something
> > that
> > > > > will
> > > > > > be updated. Did we get a run for this RC?
> > > > > >
> > > > > > On Mon, Oct 2, 2023 at 1:24 PM Bill Bejeck 
> > wrote:
> > > > > >
> > > > > > > Hi Satish,
> > > > > > >
> > > > > > > Thanks for running the release.
> > > > > > > I performed the following steps:
> > > > > > >
> > > > > > >- Validated all the checksums, signatures, and keys
> > > > > > >- Built the release from source
> > > > > > >- Ran all unit tests
> > > > > > >- Quick start validations
> > > > > > >   - ZK and Kraft
> > > > > > >   - Connect
> > > > > > >   - Kafka Streams
> > > > > > >- Spot checked java docs and documentation
> > > > > > >
> > > > > > > +1 (binding)
> > > > > > >
> > > > > > > - Bill
> > > > > > >
> > > > > > > On Mon, Oct 2, 2023 at 10:23 AM Proven Provenzano
> > > > > > >  wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > To verify the release of release 3.6.0 RC2 I did the
> following:
> > > > > > > >
> > > > > > > >- Downloaded the source, built and ran the tests.
> > > > > > > >- Validated SCRAM with KRaft including creating
> credentials
> > with
> > > > > > > >kafka-storage.
> > > > > > > >- Validated Delegation Tokens with KRaft
> > > > > > > >
> > > > > > > > +1 (non-binding)
> > > > > > > >
> > > > > > > > --Proven
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Oct 2, 2023 at 8:37 AM Divij Vaidya <
> > > > divijvaidy...@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > + 1 (non-binding)
> > > > > > > > >
> > > > > > > > > Verifications:
> > > > > > > > > 1. I ran a produce-consume workload with plaintext auth,
> > JDK17,
> > > > > zstd
> > > > > > > > > compression using

Re: [DISCUSS] KIP-714: Client metrics and observability

2023-10-03 Thread Andrew Schofield
Hi David,
Thanks for your interest in KIP-714.

Because this KIP is under development at the same time as KIP-848, it will
need to support both the existing KafkaConsumer code and the refactored code
being worked on under KIP-848. I’ve updated the Threading section accordingly.

Thanks,
Andrew

> On 30 Sep 2023, at 01:45, David Jacot  wrote:
>
> Hi Andrew,
>
> Thanks for driving this one. I haven't read all the KIP yet but I already
> have an initial question. In the Threading section, it is written
> "KafkaConsumer: the "background" thread (based on the consumer threading
> refactor which is underway)". If I understand this correctly, it means
> that KIP-714 won't work if the "old consumer" is used. Am I correct?
>
> Cheers,
> David
>
>
> On Fri, Sep 22, 2023 at 12:18 PM Andrew Schofield <
> andrew_schofield_j...@outlook.com> wrote:
>
>> Hi Philip,
>> No, I do not think it should actively search for a broker that supports
>> the new
>> RPCs. In general, either all of the brokers or none of the brokers will
>> support it.
>> In the window, where the cluster is being upgraded or client telemetry is
>> being
>> enabled, there might be a mixed situation. I wouldn’t put too much effort
>> into
>> this mixed scenario. As the client finds brokers which support the new
>> RPCs,
>> it can begin to follow the KIP-714 mechanism.
>>
>> Thanks,
>> Andrew
>>
>>> On 22 Sep 2023, at 20:01, Philip Nee  wrote:
>>>
>>> Hi Andrew -
>>>
>>> Question on top of your answers: Do you think the client should actively
>>> search for a broker that supports this RPC? As previously mentioned, the
>>> broker uses the leastLoadedNode to find its first connection (am
>>> I correct?), and what if that broker doesn't support the metric push?
>>>
>>> P
>>>
>>> On Fri, Sep 22, 2023 at 10:20 AM Andrew Schofield <
>>> andrew_schofield_j...@outlook.com> wrote:
>>>
 Hi Kirk,
 Thanks for your question. You are correct that the presence or absence
>> of
 the new RPCs in the
 ApiVersionsResponse tells the client whether to request the telemetry
 subscriptions and push
 metrics.

 This is of course tricky in practice. It would be conceivable, as a
 cluster is upgraded to AK 3.7
 or as a client metrics receiver plugin is deployed across the cluster,
 that a client connects to some
 brokers that support the new RPCs and some that do not.

 Here’s my suggestion:
 * If a client is not connected to any brokers that support in the new
 RPCs, it cannot push metrics.
 * If a client is only connected to brokers that support the new RPCs, it
 will use the new RPCs in
 accordance with the KIP.
 * If a client is connected to some brokers that support the new RPCs and
 some that do not, it will
 use the new RPCs with the supporting subset of brokers in accordance
>> with
 the KIP.

 Comments?

 Thanks,
 Andrew

> On 22 Sep 2023, at 16:01, Kirk True  wrote:
>
> Hi Andrew/Jun,
>
> I want to make sure I understand question/comment #119… In the case
 where a cluster without a metrics client receiver is later reconfigured
>> and
 restarted to include a metrics client receiver, do we want the client to
 thereafter begin pushing metrics to the cluster? From Andrew’s response
>> to
 question #119, it sounds like we’re using the presence/absence of the
 relevant RPCs in ApiVersionsResponse as the to-push-or-not-to-push
 indicator. Do I have that correct?
>
> Thanks,
> Kirk
>
>> On Sep 21, 2023, at 7:42 AM, Andrew Schofield <
 andrew_schofield_j...@outlook.com> wrote:
>>
>> Hi Jun,
>> Thanks for your comments. I’ve updated the KIP to clarify where
 necessary.
>>
>> 110. Yes, agree. The motivation section mentions this.
>>
>> 111. The replacement of ‘-‘ with ‘.’ for metric names and the
 replacement of
>> ‘-‘ with ‘_’ for attribute keys is following the OTLP guidelines. I
 think it’s a bit
>> of a debatable point. OTLP makes a distinction between a namespace
>> and a
>> multi-word component. If it was “client.id” then “client” would be a
 namespace with
>> an attribute key “id”. But “client_id” is just a key. So, it was
 intentional, but debatable.
>>
>> 112. Thanks. The link target moved. Fixed.
>>
>> 113. Thanks. Fixed.
>>
>> 114.1. If a standard metric makes sense for a client, it should use
>> the
 exact same
>> name. If a standard metric doesn’t make sense for a client, then it
>> can
 omit that metric.
>>
>> For a required metric, the situation is stronger. All clients must
 implement these
>> metrics with these names in order to implement the KIP. But the
 required metrics
>> are essentially the number of connections and the request latency,
 which do not
>> reference the underlying implementation of the client (which
 producer.record.queue.time.max
>>

Re: [DISCUSS] KIP-986: Cross-Cluster Replication

2023-10-03 Thread Greg Harris
Hi Viktor,

Thanks for your questions! I agree, replication is very fundamental in
Kafka, so it's been implemented in many different ways by different
people. I hope that this is the last implementation we'll need, but
every software engineer says that :)

GT-1: I think as this KIP is very focused on the UX of the feature,
that user stories are appropriate to include. I think it isn't
necessary to explain how the different applications are accomplished
with MM2 or other solutions, but describing what they will look like
after this KIP would be a wonderful addition. +1

MM2-1: I think that replacing the consumer is insufficient, as we need
a more expressive producer as well. This is not possible within the
design constraints of MM2 as a Connector, as MM2 uses the
connect-managed producer. This could be implemented in MM3 as a new
process that can use more expressive "internal clients", but then
we've thrown away the Connect runtime that made MM2 easier to run for
some users.
MM2-2: This is technically possible, but sounds operationally hazardous to me.
MM2-3: From the user perspective, I believe that CCR can be made more
simple to use and operate than MM2, while providing better guarantees.
>From the implementation standpoint, I think that CCR will be
significantly more complex, as the architecture of MM2 leverages a lot
of the Connect infrastructure.

LaK-1: Yes, I think you understand what I was going for.
LaK-2: I don't think that this is a user experience that we could add
to CCR without changing the Kafka clients to be aware of both clusters
concurrently. In order to redirect clients away from a failed cluster
with a metadata refresh, the cluster that they're currently connected
to must give them that data. But because the cluster failed, that
refresh will not be reliable. With a proxy between the client and
Kafka, that proxy can be available while the original Kafka cluster is
not. Failovers would happen between distinct sets of clients that are
part of the same logical application.

Thanks for taking a look at the rejected alternatives!
Greg

On Tue, Oct 3, 2023 at 3:24 AM Viktor Somogyi-Vass
 wrote:
>
> Hi Greg,
>
> Seems like finding the perfect replication solution is a never ending story
> for Kafka :).
>
> Some general thoughts:
> GT-1. While as you say it would be good to have some kind of built-in
> replication in Kafka, we definitely need to understand the problem better
> to provide a better solution. Replication has lots of user stories as you
> iterated over a few and I think it's very well worth the time to detail
> each one in the KIP. This may help understanding the problem on a deeper
> level to others who may want to contribute, somewhat sets the scope and
> describes the problem in a way that a good solution can be deduced from it.
>
> I also have a few questions regarding some of the rejected solutions:
>
> MM2:
> I think your points about MM2 are fair (offset transparency and operational
> complexity), however I think it needs more reasoning about why are we
> moving in a different direction?
> A few points I can think about what we could improve in MM2 that'd
> transform it into more like a solution that you aim for:
> MM2-1. What if we consider replacing the client based mechanism with a
> follower fetch protocol?
> MM2-2. Operating an MM2 cluster might be familiar to those who operate
> Connect anyway. For those who don't, can we provide a "built-in" version
> that runs in the same process as Kafka, like an embedded dedicated MM2
> cluster?
> MM2-3. Will we actually be able to achieve less complexity with a built-in
> solution?
>
> Layer above Kafka:
> LaK-1. Would you please add more details about this? What I can currently
> think of is that this "layer above Kafka" would be some kind of a proxy
> which would proactively send an incoming request to multiple clusters like
> "broadcast" it. Is that a correct assumption?
> LaK-2. In case of a cluster failover a client needs to change bootstrap
> servers to a different cluster. A layer above Kafka or a proxy can solve
> this by abstracting away the cluster itself. It could force out a metadata
> refresh and from that point on clients can fetch from the other cluster. Is
> this problem within the scope of this KIP or not?
>
> Thanks,
> Viktor
>
>
> On Tue, Oct 3, 2023 at 2:55 AM Greg Harris 
> wrote:
>
> > Hey Tom,
> >
> > Thanks for the high-level questions, as I am certainly approaching
> > this KIP differently than I've seen before.
> >
> > I think that ideally this KIP will expand to include lots of
> > requirements and possible implementations, and that through discussion
> > we can narrow the scope and form a roadmap for implementation across
> > multiple KIPs. I don't plan to be the decision-maker for this project,
> > as I'm more interested in building consensus among the co-authors. I
> > can certainly poll that consensus and update the KIP to keep the
> > project moving, and any other co-author can do the same. And to set

[DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-03 Thread Hanyu (Peter) Zheng
https://cwiki.apache.org/confluence/display/KAFKA/KIP-985%3A+Add+reverseRange+and+reverseAll+query+over+kv-store+in+IQv2

-- 

[image: Confluent] 
Hanyu (Peter) Zheng he/him/his
Software Engineer Intern
+1 (213) 431-7193 <+1+(213)+431-7193>
Follow us: [image: Blog]
[image:
Twitter] [image: LinkedIn]
[image: Slack]
[image: YouTube]


[image: Try Confluent Cloud for Free]



Re: [VOTE] 3.6.0 RC2

2023-10-03 Thread Chris Egerton
Hi Satish,

Thanks for running this release!

To verify, I:
- Built from source using Java 11 with both:
- - the 3.6.0-rc2 tag on GitHub
- - the kafka-3.6.0-src.tgz artifact from
https://home.apache.org/~satishd/kafka-3.6.0-rc2/
- Checked signatures and checksums
- Ran the quickstart using the kafka_2.13-3.6.0.tgz artifact from
https://home.apache.org/~satishd/kafka-3.6.0-rc2/ with Java 11 and Scala 13
in KRaft mode
- Ran all unit tests
- Ran all integration tests for Connect and MM2
- Verified that the connect-test-plugins module is present in the staging
Maven artifacts (https://issues.apache.org/jira/browse/KAFKA-15249)

Everything looks good to me!

+1 (binding)

Cheers,

Chris

On Tue, Oct 3, 2023 at 6:43 AM Satish Duggana 
wrote:

> Thanks Luke for helping on running system tests on RCs and updating
> the status on this email thread.
>
> ~Satish.
>
> On Tue, 3 Oct 2023 at 05:04, Luke Chen  wrote:
> >
> > Hi Justine and all,
> >
> > The system test result for 3.6.0 RC2 can be found below.
> > In short, no failed tests. The flaky tests will pass in the 2nd run.
> >
> https://drive.google.com/drive/folders/1qwIKg-B4CBrswUeo5fBRv65KWpDsGUiS?usp=sharing
> >
> > Thank you.
> > Luke
> >
> > On Tue, Oct 3, 2023 at 7:08 AM Justine Olshan
> 
> > wrote:
> >
> > > I realized Luke shared the results here for RC1
> > >
> https://drive.google.com/drive/folders/1S2XYd79f6_AeWj9f9qEkliRg7JtL04AC
> > > Given we had some runs that looked reasonable, and we made a small
> change,
> > > I'm ok with this. But I wouldn't be upset if we had another set of
> runs :)
> > >
> > > As for the validation:
> > >
> > >- I've compiled from source with java 17, 2.13, run the
> transactional
> > >produce bench
> > >- Run unit tests
> > >- Validated the checksums
> > >- Downloaded and ran the 2.12 version of the release
> > >- Briefly took a look at the documentation
> > >- I was browsing through the site html files and I noticed the html
> for
> > >documentation.html seemed to be for 3.4. Not sure if this is a
> blocker,
> > > but
> > >wanted to flag it. This seems to be the case for the previous
> release
> > >candidates as well. (As well as 3.5 release it seems)
> > >
> > >
> > > I will hold off on voting until we figure that part out. I will also
> follow
> > > up with the documentation Divij mentioned outside this thread.
> > >
> > > Thanks,
> > > Justine
> > >
> > > On Mon, Oct 2, 2023 at 3:05 PM Greg Harris
> 
> > > wrote:
> > >
> > > > Hey Satish,
> > > >
> > > > I verified KIP-898 functionality and the KAFKA-15473 patch.
> > > > +1 (non-binding)
> > > >
> > > > Thanks!
> > > >
> > > > On Mon, Oct 2, 2023 at 1:28 PM Justine Olshan
> > > >  wrote:
> > > > >
> > > > > Hey all -- I noticed we still have the system tests as something
> that
> > > > will
> > > > > be updated. Did we get a run for this RC?
> > > > >
> > > > > On Mon, Oct 2, 2023 at 1:24 PM Bill Bejeck 
> wrote:
> > > > >
> > > > > > Hi Satish,
> > > > > >
> > > > > > Thanks for running the release.
> > > > > > I performed the following steps:
> > > > > >
> > > > > >- Validated all the checksums, signatures, and keys
> > > > > >- Built the release from source
> > > > > >- Ran all unit tests
> > > > > >- Quick start validations
> > > > > >   - ZK and Kraft
> > > > > >   - Connect
> > > > > >   - Kafka Streams
> > > > > >- Spot checked java docs and documentation
> > > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > > - Bill
> > > > > >
> > > > > > On Mon, Oct 2, 2023 at 10:23 AM Proven Provenzano
> > > > > >  wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > To verify the release of release 3.6.0 RC2 I did the following:
> > > > > > >
> > > > > > >- Downloaded the source, built and ran the tests.
> > > > > > >- Validated SCRAM with KRaft including creating credentials
> with
> > > > > > >kafka-storage.
> > > > > > >- Validated Delegation Tokens with KRaft
> > > > > > >
> > > > > > > +1 (non-binding)
> > > > > > >
> > > > > > > --Proven
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Oct 2, 2023 at 8:37 AM Divij Vaidya <
> > > divijvaidy...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > + 1 (non-binding)
> > > > > > > >
> > > > > > > > Verifications:
> > > > > > > > 1. I ran a produce-consume workload with plaintext auth,
> JDK17,
> > > > zstd
> > > > > > > > compression using an open messaging benchmark and found 3.6
> to be
> > > > > > better
> > > > > > > > than or equal to 3.5.1 across all dimensions. Notably, 3.6
> had
> > > > > > > consistently
> > > > > > > > 6-7% lower CPU utilization, lesser spikes on P99 produce
> > > latencies
> > > > and
> > > > > > > > overall lower P99.8 latencies.
> > > > > > > >
> > > > > > > > 2. I have verified that detached signature is correct using
> > > > > > > > https://www.apache.org/info/verification.html and the
> release
> > > > manager
> > > > > > > > public keys are available at

Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.6 #83

2023-10-03 Thread Apache Jenkins Server
See 




Re: [VOTE] 3.6.0 RC2

2023-10-03 Thread Satish Duggana
Hi Justine,

Good catch on the "documentation.html" of the generated
kafka-site-<>.tgz. It has been missing updates after 3.4. But I think
it is a non blocker as the kafka-site repo is updated the
documentation with the respective release directories including 3.6.
Any pending updates on 3.6 are addressed by updating in kafka-site
repo.
We will sync 3.6 and trunk branches of kafka repo and asf-site branch
of kafka-site repo with respect to the documentation so that kafka and
kafka-site repos are in sync for future releases.

https://issues.apache.org/jira/browse/KAFKA-15530 is raised to
followup on the missing documentation of metrics introduced in
KIP-890. Please close this JIRA as a duplicate if you already have
one.

Thanks,
Satish.

On Mon, 2 Oct 2023 at 16:07, Justine Olshan
 wrote:
>
> I realized Luke shared the results here for RC1
> https://drive.google.com/drive/folders/1S2XYd79f6_AeWj9f9qEkliRg7JtL04AC
> Given we had some runs that looked reasonable, and we made a small change,
> I'm ok with this. But I wouldn't be upset if we had another set of runs :)
>
> As for the validation:
>
>- I've compiled from source with java 17, 2.13, run the transactional
>produce bench
>- Run unit tests
>- Validated the checksums
>- Downloaded and ran the 2.12 version of the release
>- Briefly took a look at the documentation
>- I was browsing through the site html files and I noticed the html for
>documentation.html seemed to be for 3.4. Not sure if this is a blocker, but
>wanted to flag it. This seems to be the case for the previous release
>candidates as well. (As well as 3.5 release it seems)
>
>
> I will hold off on voting until we figure that part out. I will also follow
> up with the documentation Divij mentioned outside this thread.
>
> Thanks,
> Justine
>
> On Mon, Oct 2, 2023 at 3:05 PM Greg Harris 
> wrote:
>
> > Hey Satish,
> >
> > I verified KIP-898 functionality and the KAFKA-15473 patch.
> > +1 (non-binding)
> >
> > Thanks!
> >
> > On Mon, Oct 2, 2023 at 1:28 PM Justine Olshan
> >  wrote:
> > >
> > > Hey all -- I noticed we still have the system tests as something that
> > will
> > > be updated. Did we get a run for this RC?
> > >
> > > On Mon, Oct 2, 2023 at 1:24 PM Bill Bejeck  wrote:
> > >
> > > > Hi Satish,
> > > >
> > > > Thanks for running the release.
> > > > I performed the following steps:
> > > >
> > > >- Validated all the checksums, signatures, and keys
> > > >- Built the release from source
> > > >- Ran all unit tests
> > > >- Quick start validations
> > > >   - ZK and Kraft
> > > >   - Connect
> > > >   - Kafka Streams
> > > >- Spot checked java docs and documentation
> > > >
> > > > +1 (binding)
> > > >
> > > > - Bill
> > > >
> > > > On Mon, Oct 2, 2023 at 10:23 AM Proven Provenzano
> > > >  wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > To verify the release of release 3.6.0 RC2 I did the following:
> > > > >
> > > > >- Downloaded the source, built and ran the tests.
> > > > >- Validated SCRAM with KRaft including creating credentials with
> > > > >kafka-storage.
> > > > >- Validated Delegation Tokens with KRaft
> > > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > --Proven
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Oct 2, 2023 at 8:37 AM Divij Vaidya  > >
> > > > > wrote:
> > > > >
> > > > > > + 1 (non-binding)
> > > > > >
> > > > > > Verifications:
> > > > > > 1. I ran a produce-consume workload with plaintext auth, JDK17,
> > zstd
> > > > > > compression using an open messaging benchmark and found 3.6 to be
> > > > better
> > > > > > than or equal to 3.5.1 across all dimensions. Notably, 3.6 had
> > > > > consistently
> > > > > > 6-7% lower CPU utilization, lesser spikes on P99 produce latencies
> > and
> > > > > > overall lower P99.8 latencies.
> > > > > >
> > > > > > 2. I have verified that detached signature is correct using
> > > > > > https://www.apache.org/info/verification.html and the release
> > manager
> > > > > > public keys are available at
> > > > > > https://keys.openpgp.org/search?q=F65DC3423D4CD7B9
> > > > > >
> > > > > > 3. I have verified that all metrics emitted in 3.5.1 (with Zk) are
> > also
> > > > > > being emitted in 3.6.0 (with Zk).
> > > > > >
> > > > > > Problems (but not blockers):
> > > > > > 1. Metrics added in
> > > > > >
> > > > > >
> > > > >
> > > >
> > https://github.com/apache/kafka/commit/2f71708955b293658cec3b27e9a5588d39c38d7e
> > > > > > aren't available in the documentation (cc: Justine). I don't
> > consider
> > > > > this
> > > > > > as a release blocker but we should add it as a fast follow-up.
> > > > > >
> > > > > > 2. Metric added in
> > > > > >
> > > > > >
> > > > >
> > > >
> > https://github.com/apache/kafka/commit/a900794ace4dcf1f9dadee27fbd8b63979532a18
> > > > > > isn't available in documentation (cc: David). I don't consider
> > this as
> > > > a
> > > > > > release blocker but we should add it as a fast follow-up.
> > > > > >
> > > > >

Re: [VOTE] 3.6.0 RC2

2023-10-03 Thread Satish Duggana
Thanks Luke for helping on running system tests on RCs and updating
the status on this email thread.

~Satish.

On Tue, 3 Oct 2023 at 05:04, Luke Chen  wrote:
>
> Hi Justine and all,
>
> The system test result for 3.6.0 RC2 can be found below.
> In short, no failed tests. The flaky tests will pass in the 2nd run.
> https://drive.google.com/drive/folders/1qwIKg-B4CBrswUeo5fBRv65KWpDsGUiS?usp=sharing
>
> Thank you.
> Luke
>
> On Tue, Oct 3, 2023 at 7:08 AM Justine Olshan 
> wrote:
>
> > I realized Luke shared the results here for RC1
> > https://drive.google.com/drive/folders/1S2XYd79f6_AeWj9f9qEkliRg7JtL04AC
> > Given we had some runs that looked reasonable, and we made a small change,
> > I'm ok with this. But I wouldn't be upset if we had another set of runs :)
> >
> > As for the validation:
> >
> >- I've compiled from source with java 17, 2.13, run the transactional
> >produce bench
> >- Run unit tests
> >- Validated the checksums
> >- Downloaded and ran the 2.12 version of the release
> >- Briefly took a look at the documentation
> >- I was browsing through the site html files and I noticed the html for
> >documentation.html seemed to be for 3.4. Not sure if this is a blocker,
> > but
> >wanted to flag it. This seems to be the case for the previous release
> >candidates as well. (As well as 3.5 release it seems)
> >
> >
> > I will hold off on voting until we figure that part out. I will also follow
> > up with the documentation Divij mentioned outside this thread.
> >
> > Thanks,
> > Justine
> >
> > On Mon, Oct 2, 2023 at 3:05 PM Greg Harris 
> > wrote:
> >
> > > Hey Satish,
> > >
> > > I verified KIP-898 functionality and the KAFKA-15473 patch.
> > > +1 (non-binding)
> > >
> > > Thanks!
> > >
> > > On Mon, Oct 2, 2023 at 1:28 PM Justine Olshan
> > >  wrote:
> > > >
> > > > Hey all -- I noticed we still have the system tests as something that
> > > will
> > > > be updated. Did we get a run for this RC?
> > > >
> > > > On Mon, Oct 2, 2023 at 1:24 PM Bill Bejeck  wrote:
> > > >
> > > > > Hi Satish,
> > > > >
> > > > > Thanks for running the release.
> > > > > I performed the following steps:
> > > > >
> > > > >- Validated all the checksums, signatures, and keys
> > > > >- Built the release from source
> > > > >- Ran all unit tests
> > > > >- Quick start validations
> > > > >   - ZK and Kraft
> > > > >   - Connect
> > > > >   - Kafka Streams
> > > > >- Spot checked java docs and documentation
> > > > >
> > > > > +1 (binding)
> > > > >
> > > > > - Bill
> > > > >
> > > > > On Mon, Oct 2, 2023 at 10:23 AM Proven Provenzano
> > > > >  wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > To verify the release of release 3.6.0 RC2 I did the following:
> > > > > >
> > > > > >- Downloaded the source, built and ran the tests.
> > > > > >- Validated SCRAM with KRaft including creating credentials with
> > > > > >kafka-storage.
> > > > > >- Validated Delegation Tokens with KRaft
> > > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > --Proven
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Oct 2, 2023 at 8:37 AM Divij Vaidya <
> > divijvaidy...@gmail.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > + 1 (non-binding)
> > > > > > >
> > > > > > > Verifications:
> > > > > > > 1. I ran a produce-consume workload with plaintext auth, JDK17,
> > > zstd
> > > > > > > compression using an open messaging benchmark and found 3.6 to be
> > > > > better
> > > > > > > than or equal to 3.5.1 across all dimensions. Notably, 3.6 had
> > > > > > consistently
> > > > > > > 6-7% lower CPU utilization, lesser spikes on P99 produce
> > latencies
> > > and
> > > > > > > overall lower P99.8 latencies.
> > > > > > >
> > > > > > > 2. I have verified that detached signature is correct using
> > > > > > > https://www.apache.org/info/verification.html and the release
> > > manager
> > > > > > > public keys are available at
> > > > > > > https://keys.openpgp.org/search?q=F65DC3423D4CD7B9
> > > > > > >
> > > > > > > 3. I have verified that all metrics emitted in 3.5.1 (with Zk)
> > are
> > > also
> > > > > > > being emitted in 3.6.0 (with Zk).
> > > > > > >
> > > > > > > Problems (but not blockers):
> > > > > > > 1. Metrics added in
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> > https://github.com/apache/kafka/commit/2f71708955b293658cec3b27e9a5588d39c38d7e
> > > > > > > aren't available in the documentation (cc: Justine). I don't
> > > consider
> > > > > > this
> > > > > > > as a release blocker but we should add it as a fast follow-up.
> > > > > > >
> > > > > > > 2. Metric added in
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> > https://github.com/apache/kafka/commit/a900794ace4dcf1f9dadee27fbd8b63979532a18
> > > > > > > isn't available in documentation (cc: David). I don't consider
> > > this as
> > > > > a
> > > > > > > release blocker but we should add it as a fast follow-up.
> > > > > > >
> > > > > > >

Re: [VOTE] 3.6.0 RC2

2023-10-03 Thread Satish Duggana
Thanks Divij for the observations.

KAFKA-15483 is added to address the missing metrics related to KRaft
and ZK to KRaft migration. Please leave a comment at
https://github.com/apache/kafka-site/pull/548 in which related metrics
documentation is added.

https://issues.apache.org/jira/browse/KAFKA-15530 is raised to
followup on the missing documentation of metrics introduced in
KIP-890.

~Satish.

On Mon, 2 Oct 2023 at 05:37, Divij Vaidya  wrote:
>
> + 1 (non-binding)
>
> Verifications:
> 1. I ran a produce-consume workload with plaintext auth, JDK17, zstd
> compression using an open messaging benchmark and found 3.6 to be better
> than or equal to 3.5.1 across all dimensions. Notably, 3.6 had consistently
> 6-7% lower CPU utilization, lesser spikes on P99 produce latencies and
> overall lower P99.8 latencies.
>
> 2. I have verified that detached signature is correct using
> https://www.apache.org/info/verification.html and the release manager
> public keys are available at
> https://keys.openpgp.org/search?q=F65DC3423D4CD7B9
>
> 3. I have verified that all metrics emitted in 3.5.1 (with Zk) are also
> being emitted in 3.6.0 (with Zk).
>
> Problems (but not blockers):
> 1. Metrics added in
> https://github.com/apache/kafka/commit/2f71708955b293658cec3b27e9a5588d39c38d7e
> aren't available in the documentation (cc: Justine). I don't consider this
> as a release blocker but we should add it as a fast follow-up.
>
> 2. Metric added in
> https://github.com/apache/kafka/commit/a900794ace4dcf1f9dadee27fbd8b63979532a18
> isn't available in documentation (cc: David). I don't consider this as a
> release blocker but we should add it as a fast follow-up.
>
> --
> Divij Vaidya
>
>
>
> On Mon, Oct 2, 2023 at 9:50 AM Federico Valeri  wrote:
>
> > Hi Satish, I did the following to verify the release:
> >
> > - Built from source with Java 17 and Scala 2.13
> > - Ran all unit and integration tests
> > - Spot checked documentation
> > - Ran custom client applications using staging artifacts on a 3-nodes
> > cluster
> > - Tested tiered storage with one of the available RSM implementations
> >
> > +1 (non binding)
> >
> > Thanks
> > Fede
> >
> > On Mon, Oct 2, 2023 at 8:50 AM Luke Chen  wrote:
> > >
> > > Hi Satish,
> > >
> > > I verified with:
> > > 1. Ran quick start in KRaft for scala 2.12 artifact
> > > 2. Making sure the checksum are correct
> > > 3. Browsing release notes, documents, javadocs, protocols.
> > > 4. Verified the tiered storage feature works well.
> > >
> > > +1 (binding).
> > >
> > > Thanks.
> > > Luke
> > >
> > >
> > >
> > > On Mon, Oct 2, 2023 at 5:23 AM Jakub Scholz  wrote:
> > >
> > > > +1 (non-binding). I used the Scala 2.13 binaries and the staged Maven
> > > > artifacts and run my tests. Everything seems to work fine for me.
> > > >
> > > > Thanks
> > > > Jakub
> > > >
> > > > On Fri, Sep 29, 2023 at 8:17 PM Satish Duggana <
> > satish.dugg...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello Kafka users, developers and client-developers,
> > > > >
> > > > > This is the third candidate for the release of Apache Kafka 3.6.0.
> > > > > Some of the major features include:
> > > > >
> > > > > * KIP-405 : Kafka Tiered Storage
> > > > > * KIP-868 : KRaft Metadata Transactions
> > > > > * KIP-875: First-class offsets support in Kafka Connect
> > > > > * KIP-898: Modernize Connect plugin discovery
> > > > > * KIP-938: Add more metrics for measuring KRaft performance
> > > > > * KIP-902: Upgrade Zookeeper to 3.8.1
> > > > > * KIP-917: Additional custom metadata for remote log segment
> > > > >
> > > > > Release notes for the 3.6.0 release:
> > > > > https://home.apache.org/~satishd/kafka-3.6.0-rc2/RELEASE_NOTES.html
> > > > >
> > > > > *** Please download, test and vote by Tuesday, October 3, 12pm PT
> > > > >
> > > > > Kafka's KEYS file containing PGP keys we use to sign the release:
> > > > > https://kafka.apache.org/KEYS
> > > > >
> > > > > * Release artifacts to be voted upon (source and binary):
> > > > > https://home.apache.org/~satishd/kafka-3.6.0-rc2/
> > > > >
> > > > > * Maven artifacts to be voted upon:
> > > > >
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > > > >
> > > > > * Javadoc:
> > > > > https://home.apache.org/~satishd/kafka-3.6.0-rc2/javadoc/
> > > > >
> > > > > * Tag to be voted upon (off 3.6 branch) is the 3.6.0-rc2 tag:
> > > > > https://github.com/apache/kafka/releases/tag/3.6.0-rc2
> > > > >
> > > > > * Documentation:
> > > > > https://kafka.apache.org/36/documentation.html
> > > > >
> > > > > * Protocol:
> > > > > https://kafka.apache.org/36/protocol.html
> > > > >
> > > > > * Successful Jenkins builds for the 3.6 branch:
> > > > > There are a few runs of unit/integration tests. You can see the
> > latest
> > > > > at https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.6/. We
> > will
> > > > > continue running a few more iterations.
> > > > > System tests:
> > > > > We will send an update once we have the results.
> > > > >
> > > > > Thanks,
> > 

[jira] [Created] (KAFKA-15530) Add missing documentation of metrics introduced as part of KAFKA-15196

2023-10-03 Thread Satish Duggana (Jira)
Satish Duggana created KAFKA-15530:
--

 Summary: Add missing documentation of metrics introduced as part 
of KAFKA-15196
 Key: KAFKA-15530
 URL: https://issues.apache.org/jira/browse/KAFKA-15530
 Project: Kafka
  Issue Type: Task
Reporter: Satish Duggana


This is a followup to the 3.6.0 RC2 verification email 
[thread|https://lists.apache.org/thread/js2nmq3ggn46qg122h4jg5p2fcq5hr2s]. Add 
the missing documentation of a few metrics added as part of the 
[change|https://github.com/apache/kafka/commit/2f71708955b293658cec3b27e9a5588d39c38d7e].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-975 Docker Image for Apache Kafka

2023-10-03 Thread Vedarth Sharma
Hey Viktor! Thanks for bringing up this use case.
I think we can take advantage of Docker volume for this.
We can allow users to mount a folder containing the secret files.
This folder can then be used to pass secrets to the container.

Thanks and regards,
Vedarth

On Wed, Sep 27, 2023 at 4:39 PM Viktor Somogyi-Vass
 wrote:

> Hi Krishna,
>
> Thanks for the answer. I've seen multiple such images where environment
> variables are used and I think they are generally good but it's unsafe for
> passing around secrets, jaas configs and so on. Perhaps for secrets we
> could recommend using the file config provider. Then users can create and
> mount secured properties file(s) with configs that are considered as
> secrets. What do you think? Did you already have something in your mind
> regarding this?
>
> Thanks,
> Viktor
>
> On Tue, Sep 26, 2023 at 3:05 PM Krishna Agarwal <
> krishna0608agar...@gmail.com> wrote:
>
> > Hi Ismael,
> > Apologies for missing the mailing list in the last reply.
> >
> > Thank you for the suggestions.
> > Just to clarify, the sizes mentioned in the previous email are of the
> > uncompressed base images, not the resulting Apache Kafka docker images:
> >
> >1. eclipse-temurin:17-jre -- 263MB (They should release JRE 21 images
> >soon)
> >2.
> registry.access.redhat.com/ubi8/openjdk-17-runtime:1.17-1.1693366274
> >-- 375MB
> >
> > Regards,
> > Krishna
> >
> >
> > On Tue, Sep 26, 2023 at 9:24 AM Ismael Juma  wrote:
> >
> > > Hi Krishna,
> > >
> > > Looks like you sent the response to me and not the mailing list,
> > > please include the mailing list in the replies. Comments below.
> > >
> > > On Mon, Sep 25, 2023 at 11:45 AM Krishna Agarwal <
> > > krishna0608agar...@gmail.com> wrote:
> > >
> > >> Hi Ismael,
> > >> Thanks for the questions.
> > >>
> > >>1. We intend to support only the latest Java supported by Apache
> > >>Kafka(As per this documentation
> > >> Apache Kafka
> > currently
> > >>supports Java 8, Java 11, and Java 17) which currently is Java 17.
> If
> > >>Apache Kafka supports Java 21 in the future, we will align with it.
> > >>
> > >> We are already building and testing with Java 21 (
> > > https://github.com/apache/kafka/pull/14451 updates `README.md` to
> > > indicate that). By 3.7.0 (the next release), we'll have Java 21 as one
> of
> > > the officially supported versions. I think we should start with that
> > > version for both docker image KIPs.
> > >
> > >>
> > >>1. For users seeking a Docker image with an alternative Java
> version,
> > >>they will have the flexibility to build their own Docker image
> > utilising
> > >>the Dockerfiles we provide. In our documentation, we will provide
> > clear
> > >>guidance on the designated base images for various Java versions.
> > >>
> > >> This sounds good to me. We should include these details as part of the
> > > KIP and also the documentation for the docker images. More
> specifically,
> > we
> > > should state that we will update the Java major version as part of
> minor
> > > Apache Kafka releases. The implication is that users who include broker
> > > plugins alongside the broker should use custom images to ensure their
> > > custom code is not broken by Java upgrades.
> > >
> > >>
> > >>1. Apache Kafka only requires JRE, not JDK, for operation.
> Utilizing
> > >>a base image with only JRE, rather than JDK, is a logical choice as
> > it
> > >>significantly reduces the size of the docker image.
> > >>Upon further investigation, I discovered the eclipse-temurin
> > >><
> >
> https://hub.docker.com/layers/library/eclipse-temurin/17-jre/images/sha256-d1dfb065ae433fe1b43ac7e50a1ed03660f487c73ec256c686b126c37fd4d086?context=explore
> > >
> > >>docker image, which is notably smaller than Redhat’s ubi8 docker
> > image (263
> > >>MB vs 375 MB). Additionally, the fact that Apache Flink relies on
> > >>eclipse-temurin base images
> > >><
> >
> https://github.com/apache/flink-docker/blob/master/1.17/scala_2.12-java11-ubuntu/Dockerfile#L19
> > >
> > >>further increases our confidence in their dependability(Will make
> > this
> > >>change in the KIP).
> > >>
> > >> Yes, eclipse-temurin looks like a good choice to me. Nice size
> > reduction!
> > >
> > >>
> > >>1. I'll conduct comparisons between our docker image and existing
> > >>ones, and incorporate the findings into the KIP. I'll keep you
> > posted on
> > >>the same.
> > >>
> > >> Excellent, thanks!
> > >
> > > Ismael
> > >
> > >
> > >> On Wed, Sep 20, 2023 at 11:26 PM Ismael Juma 
> wrote:
> > >>
> > >>> Hi Krishna,
> > >>>
> > >>> Thanks for the KIP. A few quick questions:
> > >>>
> > >>> 1. Since this will only be available for Kafka 3.7 in the best case,
> I
> > >>> suggest we go with Java 21 instead of Java 17. Also, we should be
> clear
> > >>> about Java version expectations. Are we allowed to change the Java
> > >>> vers

Re: [VOTE] 3.6.0 RC2

2023-10-03 Thread Luke Chen
Hi Justine and all,

The system test result for 3.6.0 RC2 can be found below.
In short, no failed tests. The flaky tests will pass in the 2nd run.
https://drive.google.com/drive/folders/1qwIKg-B4CBrswUeo5fBRv65KWpDsGUiS?usp=sharing

Thank you.
Luke

On Tue, Oct 3, 2023 at 7:08 AM Justine Olshan 
wrote:

> I realized Luke shared the results here for RC1
> https://drive.google.com/drive/folders/1S2XYd79f6_AeWj9f9qEkliRg7JtL04AC
> Given we had some runs that looked reasonable, and we made a small change,
> I'm ok with this. But I wouldn't be upset if we had another set of runs :)
>
> As for the validation:
>
>- I've compiled from source with java 17, 2.13, run the transactional
>produce bench
>- Run unit tests
>- Validated the checksums
>- Downloaded and ran the 2.12 version of the release
>- Briefly took a look at the documentation
>- I was browsing through the site html files and I noticed the html for
>documentation.html seemed to be for 3.4. Not sure if this is a blocker,
> but
>wanted to flag it. This seems to be the case for the previous release
>candidates as well. (As well as 3.5 release it seems)
>
>
> I will hold off on voting until we figure that part out. I will also follow
> up with the documentation Divij mentioned outside this thread.
>
> Thanks,
> Justine
>
> On Mon, Oct 2, 2023 at 3:05 PM Greg Harris 
> wrote:
>
> > Hey Satish,
> >
> > I verified KIP-898 functionality and the KAFKA-15473 patch.
> > +1 (non-binding)
> >
> > Thanks!
> >
> > On Mon, Oct 2, 2023 at 1:28 PM Justine Olshan
> >  wrote:
> > >
> > > Hey all -- I noticed we still have the system tests as something that
> > will
> > > be updated. Did we get a run for this RC?
> > >
> > > On Mon, Oct 2, 2023 at 1:24 PM Bill Bejeck  wrote:
> > >
> > > > Hi Satish,
> > > >
> > > > Thanks for running the release.
> > > > I performed the following steps:
> > > >
> > > >- Validated all the checksums, signatures, and keys
> > > >- Built the release from source
> > > >- Ran all unit tests
> > > >- Quick start validations
> > > >   - ZK and Kraft
> > > >   - Connect
> > > >   - Kafka Streams
> > > >- Spot checked java docs and documentation
> > > >
> > > > +1 (binding)
> > > >
> > > > - Bill
> > > >
> > > > On Mon, Oct 2, 2023 at 10:23 AM Proven Provenzano
> > > >  wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > To verify the release of release 3.6.0 RC2 I did the following:
> > > > >
> > > > >- Downloaded the source, built and ran the tests.
> > > > >- Validated SCRAM with KRaft including creating credentials with
> > > > >kafka-storage.
> > > > >- Validated Delegation Tokens with KRaft
> > > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > --Proven
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Oct 2, 2023 at 8:37 AM Divij Vaidya <
> divijvaidy...@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > + 1 (non-binding)
> > > > > >
> > > > > > Verifications:
> > > > > > 1. I ran a produce-consume workload with plaintext auth, JDK17,
> > zstd
> > > > > > compression using an open messaging benchmark and found 3.6 to be
> > > > better
> > > > > > than or equal to 3.5.1 across all dimensions. Notably, 3.6 had
> > > > > consistently
> > > > > > 6-7% lower CPU utilization, lesser spikes on P99 produce
> latencies
> > and
> > > > > > overall lower P99.8 latencies.
> > > > > >
> > > > > > 2. I have verified that detached signature is correct using
> > > > > > https://www.apache.org/info/verification.html and the release
> > manager
> > > > > > public keys are available at
> > > > > > https://keys.openpgp.org/search?q=F65DC3423D4CD7B9
> > > > > >
> > > > > > 3. I have verified that all metrics emitted in 3.5.1 (with Zk)
> are
> > also
> > > > > > being emitted in 3.6.0 (with Zk).
> > > > > >
> > > > > > Problems (but not blockers):
> > > > > > 1. Metrics added in
> > > > > >
> > > > > >
> > > > >
> > > >
> >
> https://github.com/apache/kafka/commit/2f71708955b293658cec3b27e9a5588d39c38d7e
> > > > > > aren't available in the documentation (cc: Justine). I don't
> > consider
> > > > > this
> > > > > > as a release blocker but we should add it as a fast follow-up.
> > > > > >
> > > > > > 2. Metric added in
> > > > > >
> > > > > >
> > > > >
> > > >
> >
> https://github.com/apache/kafka/commit/a900794ace4dcf1f9dadee27fbd8b63979532a18
> > > > > > isn't available in documentation (cc: David). I don't consider
> > this as
> > > > a
> > > > > > release blocker but we should add it as a fast follow-up.
> > > > > >
> > > > > > --
> > > > > > Divij Vaidya
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Oct 2, 2023 at 9:50 AM Federico Valeri <
> > fedeval...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Satish, I did the following to verify the release:
> > > > > > >
> > > > > > > - Built from source with Java 17 and Scala 2.13
> > > > > > > - Ran all unit and integration tests
> > > > > > > - Spot checked documentation
> > > > > > > - Ran custom client 

Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2251

2023-10-03 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 195 lines...]
FAILURE: Build failed with an exception.

* Where:
Build file '/home/jenkins/workspace/Kafka_kafka_trunk/build.gradle' line: 1914

* What went wrong:
A problem occurred evaluating root project 'kafka'.
> Project with path ':storage:api' could not be found in project ':tools'.

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Get more help at https://help.gradle.org.

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

For more on this, please refer to 
https://docs.gradle.org/8.3/userguide/command_line_interface.html#sec:command_line_warnings
 in the Gradle documentation.

BUILD FAILED in 16s

Publishing build scan...
https://ge.apache.org/s/o46qsoutbskee


FAILURE: Build failed with an exception.

* What went wrong:
Could not write to file 'reports/profile/profile-2023-10-03-11-52-13.html'.
> Unable to create directory 'reports/profile'

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Run with --scan to get full insights.
> Get more help at https://help.gradle.org.

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

For more on this, please refer to 
https://docs.gradle.org/8.3/userguide/command_line_interface.html#sec:command_line_warnings
 in the Gradle documentation.

BUILD FAILED in 17s
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // timestamps
[Pipeline] }
[Pipeline] // timeout
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
Failed in branch JDK 21 and Scala 2.13

FAILURE: Build failed with an exception.

* Where:
Build file 
'/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/build.gradle' line: 
1914

* What went wrong:
A problem occurred evaluating root project 'kafka'.
> Project with path ':storage:api' could not be found in project ':tools'.

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Get more help at https://help.gradle.org.

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

For more on this, please refer to 
https://docs.gradle.org/8.3/userguide/command_line_interface.html#sec:command_line_warnings
 in the Gradle documentation.

BUILD FAILED in 17s

Publishing build scan...
https://ge.apache.org/s/hya7rku3flqzw


See the profiling report at: 
file:///home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/reports/profile/profile-2023-10-03-11-52-12.html
A fine-grained performance profile is available: use the --scan option.
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // timestamps
[Pipeline] }
[Pipeline] // timeout
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
Failed in branch JDK 8 and Scala 2.12

> Configure project :
Starting build with version 3.7.0-SNAPSHOT (commit id 5f676cce) using Gradle 
8.3, Java 11 and Scala 2.13.12
Build properties: maxParallelForks=24, maxScalacThreads=8, maxTestRetries=0

FAILURE: Build failed with an exception.

* Where:
Build file '/home/jenkins/workspace/Kafka_kafka_trunk/build.gradle' line: 1914

* What went wrong:
A problem occurred evaluating root project 'kafka'.
> Project with path ':storage:api' could not be found in project ':tools'.

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Get more help at https://help.gradle.org.

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

For more on this, please refer to 
https://docs.gradle.org/8.3/userguide/command_line_interface.html#sec:command_line_warnings
 in the Gradle documentation.

BUILD FAILED in 27s

Publishing build scan...
https://ge.apache.org/s/a6tdth43umpay


FAILURE: Build failed with an exception.

* What went wrong:
Could not write to file 'reports/profile/profile-2023-10-03-11-52-23.html'.
> Unable to create directory 'reports/profile'

* Try:
> Run with --stacktrace option

Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2250

2023-10-03 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 195 lines...]

* What went wrong:
A problem occurred evaluating root project 'kafka'.
> Project with path ':storage:api' could not be found in project ':tools'.

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Get more help at https://help.gradle.org.

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

For more on this, please refer to 
https://docs.gradle.org/8.3/userguide/command_line_interface.html#sec:command_line_warnings
 in the Gradle documentation.

BUILD FAILED in 22s

Publishing build scan...
https://ge.apache.org/s/37vcugerwt6ig


See the profiling report at: 
file:///home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/reports/profile/profile-2023-10-03-11-31-50.html
A fine-grained performance profile is available: use the --scan option.
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // timestamps
[Pipeline] }
[Pipeline] // timeout
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
Failed in branch JDK 8 and Scala 2.12

> Configure project :
Starting build with version 3.7.0-SNAPSHOT (commit id 951a9fef) using Gradle 
8.3, Java 17 and Scala 2.13.12
Build properties: maxParallelForks=24, maxScalacThreads=8, maxTestRetries=0

FAILURE: Build failed with an exception.

* Where:
Build file 
'/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/build.gradle' line: 
1914

* What went wrong:
A problem occurred evaluating root project 'kafka'.
> Project with path ':storage:api' could not be found in project ':tools'.

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Get more help at https://help.gradle.org.

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

For more on this, please refer to 
https://docs.gradle.org/8.3/userguide/command_line_interface.html#sec:command_line_warnings
 in the Gradle documentation.

BUILD FAILED in 24s

Publishing build scan...
https://ge.apache.org/s/ds67mkjatiyae


FAILURE: Build failed with an exception.

* What went wrong:
Could not write to file 'reports/profile/profile-2023-10-03-11-32-04.html'.
> Unable to create directory 'reports/profile'

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Run with --scan to get full insights.
> Get more help at https://help.gradle.org.

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

For more on this, please refer to 
https://docs.gradle.org/8.3/userguide/command_line_interface.html#sec:command_line_warnings
 in the Gradle documentation.

BUILD FAILED in 24s

> Configure project :
Starting build with version 3.7.0-SNAPSHOT (commit id 951a9fef) using Gradle 
8.3, Java 21 and Scala 2.13.12
Build properties: maxParallelForks=24, maxScalacThreads=8, maxTestRetries=0

FAILURE: Build failed with an exception.

* Where:
Build file '/home/jenkins/workspace/Kafka_kafka_trunk/build.gradle' line: 1914

* What went wrong:
A problem occurred evaluating root project 'kafka'.
> Project with path ':storage:api' could not be found in project ':tools'.

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Get more help at https://help.gradle.org.

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

For more on this, please refer to 
https://docs.gradle.org/8.3/userguide/command_line_interface.html#sec:command_line_warnings
 in the Gradle documentation.

BUILD FAILED in 29s

Publishing build scan...
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // timestamps
[Pipeline] }
[Pipeline] // timeout
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
Failed in branch JDK 17 and Scala 2.13
https://ge.apache.org/s/e6gy77efzc4jm


FAILURE: Build failed with an exception.

* What went wrong:
Could not write to file 'reports/profile/profile-2023-10-03-11-31-58.html'.
> Unable 

Re: [DISCUSS] KIP-986: Cross-Cluster Replication

2023-10-03 Thread Viktor Somogyi-Vass
Hi Greg,

Seems like finding the perfect replication solution is a never ending story
for Kafka :).

Some general thoughts:
GT-1. While as you say it would be good to have some kind of built-in
replication in Kafka, we definitely need to understand the problem better
to provide a better solution. Replication has lots of user stories as you
iterated over a few and I think it's very well worth the time to detail
each one in the KIP. This may help understanding the problem on a deeper
level to others who may want to contribute, somewhat sets the scope and
describes the problem in a way that a good solution can be deduced from it.

I also have a few questions regarding some of the rejected solutions:

MM2:
I think your points about MM2 are fair (offset transparency and operational
complexity), however I think it needs more reasoning about why are we
moving in a different direction?
A few points I can think about what we could improve in MM2 that'd
transform it into more like a solution that you aim for:
MM2-1. What if we consider replacing the client based mechanism with a
follower fetch protocol?
MM2-2. Operating an MM2 cluster might be familiar to those who operate
Connect anyway. For those who don't, can we provide a "built-in" version
that runs in the same process as Kafka, like an embedded dedicated MM2
cluster?
MM2-3. Will we actually be able to achieve less complexity with a built-in
solution?

Layer above Kafka:
LaK-1. Would you please add more details about this? What I can currently
think of is that this "layer above Kafka" would be some kind of a proxy
which would proactively send an incoming request to multiple clusters like
"broadcast" it. Is that a correct assumption?
LaK-2. In case of a cluster failover a client needs to change bootstrap
servers to a different cluster. A layer above Kafka or a proxy can solve
this by abstracting away the cluster itself. It could force out a metadata
refresh and from that point on clients can fetch from the other cluster. Is
this problem within the scope of this KIP or not?

Thanks,
Viktor


On Tue, Oct 3, 2023 at 2:55 AM Greg Harris 
wrote:

> Hey Tom,
>
> Thanks for the high-level questions, as I am certainly approaching
> this KIP differently than I've seen before.
>
> I think that ideally this KIP will expand to include lots of
> requirements and possible implementations, and that through discussion
> we can narrow the scope and form a roadmap for implementation across
> multiple KIPs. I don't plan to be the decision-maker for this project,
> as I'm more interested in building consensus among the co-authors. I
> can certainly poll that consensus and update the KIP to keep the
> project moving, and any other co-author can do the same. And to set an
> example, I'll clarify your questions and for anything that I agree
> with, I'll ask that you make the update to the KIP, so that the KIP
> captures your understanding of the problem and your requirements. If
> you don't get the chance to make the changes yourself, I'll make sure
> they get included eventually, as they're very good ideas :)
>
> For your remaining questions:
>
> M1: I was trying to draw analogies to databases, but your suggested
> properties are much more compelling and informative. I'd love it if
> you added some formalism here, so that we have a better grasp on what
> we're trying to accomplish. +1
> M2: I think the "asynchronous" problem corresponds to the goal of
> "exactly once semantics" but the two are not obviously opposites. I
> think the MM2 deficiencies could focus less on the architecture
> (asynchronicity) and more on the user-facing effect (semantics). +1
> M3: I had a "non-goals" section that ended up becoming the "rejected
> alternatives" section instead. If you have some non-goals in mind,
> please add them.
> M4+M5: I think it's too early to nail down the assumptions directly,
> but if you believe that "separate operators of source and target" is a
> requirement, that would be good to write down. +1
> M6: That is a concerning edge case, and I don't know how to handle it.
> I was imagining that there would be a many:many relationship of
> clusters and links, but I understand that the book-keeping of that
> decision may be significant.
> M7: I think this may be appropriate to cover in a "user story" or
> "example usages". I naturally thought that the feature would describe
> some minimal way of linking two topics, and the applications
> (combining multiple links, performing failovers, or running
> active-active, etc) would be left to users to define. I included the
> regex configurations because I imagine that creating 100s or 1000s of
> links would be unnecessarily tedious. The feature may also encode
> those use-cases directly as first-class citizens as well.
>
> U1: These are states that can happen in reality, and I meant for that
> section to imply that we should expect these states and model them for
> operations and observability.
>
> D1: I think I may have introduced this conf

Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2249

2023-10-03 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 201 lines...]
 > git --version # timeout=10
 > git --version # 'git version 2.17.1'
using GIT_ASKPASS to set credentials ASF Cloudbees Jenkins ci-builds
 > git fetch --no-tags --progress -- https://github.com/apache/kafka.git 
 > +refs/heads/trunk:refs/remotes/origin/trunk # timeout=10
Checking out Revision 7553d3f562f3af6c7f9b062b9220bcad80b00478 (trunk)
Commit message: "KAFKA-14593: Move LeaderElectionCommand to tools (#13204)"
Commit message: "KAFKA-14593: Move LeaderElectionCommand to tools (#13204)"
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 7553d3f562f3af6c7f9b062b9220bcad80b00478 # timeout=10
 > git rev-list --no-walk 8f8dbad564ffd9be409bb85edadbc40659cd0a56 # timeout=10
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 7553d3f562f3af6c7f9b062b9220bcad80b00478 # timeout=10
[Pipeline] withEnv
[Pipeline] {
[Pipeline] withEnv
[Pipeline] {
[Pipeline] tool
[Pipeline] envVarsForTool
[Pipeline] tool
[Pipeline] envVarsForTool
[Pipeline] withEnv
[Pipeline] {
[Pipeline] sh
Avoid second fetch
Checking out Revision 7553d3f562f3af6c7f9b062b9220bcad80b00478 (trunk)
[Pipeline] withEnv
[Pipeline] {
[Pipeline] withEnv
[Pipeline] {
[Pipeline] tool
[Pipeline] envVarsForTool
[Pipeline] withEnv
[Pipeline] {
[Pipeline] sh
 > git config remote.origin.url https://github.com/apache/kafka.git # timeout=10
 > git config --add remote.origin.fetch 
 > +refs/heads/trunk:refs/remotes/origin/trunk # timeout=10
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 7553d3f562f3af6c7f9b062b9220bcad80b00478 # timeout=10
+ ./retry_zinc ./gradlew -PscalaVersion=2.12 clean check -x test --profile 
--continue -PxmlSpotBugsReport=true -PkeepAliveMode=session
Commit message: "KAFKA-14593: Move LeaderElectionCommand to tools (#13204)"
To honour the JVM settings for this build a single-use Daemon process will be 
forked. For more on this, please refer to 
https://docs.gradle.org/8.3/userguide/gradle_daemon.html#sec:disabling_the_daemon
 in the Gradle documentation.
+ ./retry_zinc ./gradlew -PscalaVersion=2.13 clean check -x test --profile 
--continue -PxmlSpotBugsReport=true -PkeepAliveMode=session
To honour the JVM settings for this build a single-use Daemon process will be 
forked. For more on this, please refer to 
https://docs.gradle.org/8.3/userguide/gradle_daemon.html#sec:disabling_the_daemon
 in the Gradle documentation.
[Pipeline] withEnv
[Pipeline] {
[Pipeline] withEnv
[Pipeline] {
[Pipeline] tool
[Pipeline] envVarsForTool
[Pipeline] withEnv
[Pipeline] {
[Pipeline] sh
Daemon will be stopped at the end of the build 
Daemon will be stopped at the end of the build 
+ ./retry_zinc ./gradlew -PscalaVersion=2.13 clean check -x test --profile 
--continue -PxmlSpotBugsReport=true -PkeepAliveMode=session
To honour the JVM settings for this build a single-use Daemon process will be 
forked. For more on this, please refer to 
https://docs.gradle.org/8.3/userguide/gradle_daemon.html#sec:disabling_the_daemon
 in the Gradle documentation.
Daemon will be stopped at the end of the build 

> Configure project :
Starting build with version 3.7.0-SNAPSHOT (commit id 7553d3f5) using Gradle 
8.3, Java 21 and Scala 2.13.12
Build properties: maxParallelForks=24, maxScalacThreads=8, maxTestRetries=0

FAILURE: Build failed with an exception.

* Where:
Build file 
'/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk@2/build.gradle' line: 
1914

* What went wrong:
A problem occurred evaluating root project 'kafka'.
> Project with path ':storage:api' could not be found in project ':tools'.

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Get more help at https://help.gradle.org.

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

For more on this, please refer to 
https://docs.gradle.org/8.3/userguide/command_line_interface.html#sec:command_line_warnings
 in the Gradle documentation.

BUILD FAILED in 16s

Publishing build scan...
https://ge.apache.org/s/txulfhm7huook


FAILURE: Build failed with an exception.

* What went wrong:
Could not write to file 'reports/profile/profile-2023-10-03-10-00-33.html'.
> Unable to create directory 'reports/profile'

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Run with --scan to get full insights.
> Get more help at https://help.gradle.org.

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

For more on this

[jira] [Created] (KAFKA-15529) Flaky test ReassignReplicaShrinkTest.executeTieredStorageTest

2023-10-03 Thread Divij Vaidya (Jira)
Divij Vaidya created KAFKA-15529:


 Summary: Flaky test 
ReassignReplicaShrinkTest.executeTieredStorageTest
 Key: KAFKA-15529
 URL: https://issues.apache.org/jira/browse/KAFKA-15529
 Project: Kafka
  Issue Type: Test
  Components: Tiered-Storage
Affects Versions: 3.6.0
Reporter: Divij Vaidya


Example of failed CI build - 
[https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14449/3/testReport/junit/org.apache.kafka.tiered.storage.integration/ReassignReplicaShrinkTest/Build___JDK_21_and_Scala_2_13___executeTieredStorageTest_String__quorum_kraft_2/]
  


{noformat}
org.opentest4j.AssertionFailedError: Number of fetch requests from broker 0 to 
the tier storage does not match the expected value for topic-partition topicA-1 
==> expected: <3> but was: <4>
at 
app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
at 
app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
at 
app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
at 
app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150)
at 
app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:559)
at 
app//org.apache.kafka.tiered.storage.actions.ConsumeAction.doExecute(ConsumeAction.java:128)
at 
app//org.apache.kafka.tiered.storage.TieredStorageTestAction.execute(TieredStorageTestAction.java:25)
at 
app//org.apache.kafka.tiered.storage.TieredStorageTestHarness.executeTieredStorageTest(TieredStorageTestHarness.java:112){noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-979: Allow independently stop KRaft controllers or brokers

2023-10-03 Thread Federico Valeri
Hi Hailey, thanks for the KIP.

I also agree that the two mutually exclusive args are better. In order
to be consistent with the other tools, I would suggest to use
--process-role and --node-id (hyphen instead of dot). Can you also
update the KIP?

On Mon, Oct 2, 2023 at 10:18 PM Hailey Ni  wrote:
>
> Hi Kamal,
>
> I think the broker.id property has been replaced with the `node.id` property
> in KRaft.  The documentation for `node.id` says it is required (
> https://github.com/apache/kafka/blob/72e275f6ea867747e6b4e524c80d5ebd726ac25b/core/src/main/scala/kafka/server/KafkaConfig.scala#L741),
> and the QuickStart files all use it (
> https://github.com/apache/kafka/tree/72e275f6ea867747e6b4e524c80d5ebd726ac25b/config/kraft).
> It is technically true that these two configs are treated as synonyms of
> one another (
> https://github.com/apache/kafka/blob/72e275f6ea867747e6b4e524c80d5ebd726ac25b/core/src/main/scala/kafka/server/KafkaConfig.scala#L1587-L1597),
> so if you specify either one the process will still recognize it and
> start.  But it makes sense to exclusively use `node.id` in KRaft because a
> node isn't necessarily a broker anymore; it could be a controller (or even
> a combined broker+controller).
>
> Thanks,
> Hailey
>
> On Mon, Oct 2, 2023 at 1:17 PM Hailey Ni  wrote:
>
> > Hi Ismeal,
> >
> > Thanks for the comments. I'll change the implementation to use a pair of
> > mutually exclusive args --process.roles and --node.id.
> >
> > Thanks,
> > Hailey
> >
> > On Mon, Oct 2, 2023 at 6:34 AM Ismael Juma  wrote:
> >
> >> Hi Ron,
> >>
> >> Yes, that's what I am proposing, yes.
> >>
> >> Ismael
> >>
> >> On Sat, Sep 30, 2023 at 2:30 PM Ron Dagostino  wrote:
> >>
> >> > Thanks, Ismael.  I think you are proposing a pair of mutually exclusive
> >> > args --process.roles and --node.id, right?  I agree that is more
> >> > user-friendly than the --required-config arg, and it comes at the
> >> possible
> >> > expense of generality.  So that’s the tradeoff between the two, I think.
> >> > No other config comes to mind now that we’ve identified these two.  I
> >> think
> >> > the two specific and mutually exclusive parameters would be the way to
> >> go
> >> > unless someone else identifies still more options that people might
> >> want.
> >> >
> >> > Did I get that right, or were you proposing something different?
> >> >
> >> > Ron
> >> >
> >> > > On Sep 30, 2023, at 10:42 AM, Ismael Juma  wrote:
> >> > >
> >> > > Hi,
> >> > >
> >> > > Thanks for the KIP. I think this approach based on configs is a bit
> >> too
> >> > > open ended and not very user friendly. Why don't we simply provide
> >> flags
> >> > > for the things a user may care about? So far, it seems like we have
> >> two
> >> > > good candidates (node id and process role). Are there any others?
> >> > >
> >> > > Ismael
> >> > >
> >> > >> On Fri, Sep 29, 2023 at 6:19 PM Hailey Ni 
> >> > wrote:
> >> > >>
> >> > >> Hi Ron,
> >> > >>
> >> > >> I think you made a great point, making the "name" arbitrary instead
> >> of
> >> > >> hard-coding it will make the functionality much more flexible. I've
> >> > updated
> >> > >> the KIP and the code accordingly. Thanks for the great idea!
> >> > >>
> >> > >> Thanks,
> >> > >> Hailey
> >> > >>
> >> > >>
> >> > >>> On Fri, Sep 29, 2023 at 2:34 PM Ron Dagostino 
> >> > wrote:
> >> > >>>
> >> > >>> Thanks, Hailey.  Is there a reason to restrict it to just
> >> > >>> process.roles and node.id?  Someone might want to do
> >> > >>> "--required-config any.name=whatever.value", for example, and at
> >> first
> >> > >>> glance I don't see a reason why the implementation should be any
> >> > >>> different -- it seems it would probably be easier to not have to
> >> worry
> >> > >>> about restricting to specific cases, actually.  WDYT?
> >> > >>>
> >> > >>> Ron
> >> > >>>
> >> > >>> On Fri, Sep 29, 2023 at 5:12 PM Hailey Ni  >> >
> >> > >>> wrote:
> >> > 
> >> >  Updated. Please let me know if you have any additional comments.
> >> Thank
> >> > >>> you!
> >> > 
> >> >  On Thu, Sep 21, 2023 at 3:02 PM Hailey Ni 
> >> wrote:
> >> > 
> >> > > Hi Ron. Thanks for the response. I agree with your point. I'll
> >> make
> >> > >> the
> >> > > corresponding changes in the KIP and KAFKA-15471
> >> > > .
> >> > >
> >> > > On Thu, Sep 21, 2023 at 1:40 PM Ron Dagostino 
> >> > >>> wrote:
> >> > >
> >> > >> Hi Hailey.  No, I just looked, and zookeeper-server-stop does not
> >> > >> have
> >> > >> any facility to be specific about which ZK nodes to signal.  So
> >> > >> providing the ability in kafka-server-stop to be more specific
> >> than
> >> > >> just "signal all controllers" or "signal all brokers" would be a
> >> > >> bonus
> >> > >> and therefore not necessarily required.  But if it is easy to
> >> > >> achieve
> >> > >> and doesn't add any additional cognitive load -- and at first
> >> glance
> >> > >> it does

KRaft Performance Improvements

2023-10-03 Thread Doğuşcan Namal
Do we have any performance test results showing the difference between
KRaft vs Zookeeper?

The one that I found online is from Redpanda comparing the tail latencies
https://redpanda.com/blog/kafka-kraft-vs-redpanda-performance-2023#the-test:-redpanda-23.1-vs.-kafka-3.4.0-with-kraft

Can I assume this is a valid comparison?

Also I heard that KRaft will be helpful for the number of partitions on a
cluster. Do we have any test showing the difference?

Are there any expected performance improvements?

Thanks
Doguscan


Re: [DISCUSS] KIP-980: Allow creating connectors in a stopped state

2023-10-03 Thread Yash Mayya
Hi Chris,

Thanks for taking a look at this KIP!

1. I chose to go with simply "state" as that exact term is already exposed
via some of the existing REST API responses and would be one that users are
already familiar with (although admittedly something like "initial_state"
wouldn't be much of a jump). Since it's a field in the request body for the
connector creation endpoint, wouldn't it be implied that it is the
"initial" state just like the "config" field represents the "initial"
configuration? Also, I don't think x.y has been established as the field
naming convention in the Connect REST API right? From what I can tell, x_y
is the convention being followed for fields in requests ("kafka_topic" /
"kafka_partition" / "kafka_offset" in the offsets APIs for instance) and
responses ("error_count", "kafka_cluster_id", "recommended_values" etc.).

2. The connector configuration record is currently used for both connector
create requests as well as connector config update requests. Since we're
only allowing configuring the target state for newly created connectors, I
feel like it'll be a cleaner separation of concerns to use the existing
records for connector configurations and connector target states rather
than bundling the "state" and "state.v2" (or equivalent) fields into the
connector configuration record. The additional write should be very minimal
overhead and the two writes would be an atomic operation for Connect
clusters that are using a transactional producer for the config topic
anyway. Thoughts?

3. I was thinking that we'd support standalone mode via the same connector
creation REST API endpoint changes (addition of the "state" field). If
there is further interest in adding similar support to the command line
method of creating connectors as well, perhaps we could do so in a small
follow-on KIP? I feel like ever since the standalone mode started
supporting the full Connect REST API, the command line method of creating
connectors is more of a "legacy" concept.

4. Yeah, connector / offsets migration was just used as a representative
example of how this feature could be useful - I didn't intend for it to be
the sole purpose of this KIP. However, that said, I really like the idea of
accepting an "offsets" field in the connector creation request since it'd
reduce the number of user operations required from 3 (create the connector
in a STOPPED state; alter the offsets; resume the connector) to 1. I'd be
happy to either create or review a separate small KIP for that if it sounds
acceptable to you?

5. Whoops, thanks, I hadn't even noticed that that's where I had linked to.
Fixed!

Thanks,
Yash

On Mon, Oct 2, 2023 at 11:14 PM Chris Egerton 
wrote:

> Hi Yash,
>
> Thanks for the KIP! Frankly this feels like an oversight in 875 and I'm
> glad we're closing that loop ASAP.
>
>
> Here are my thoughts:
>
> 1. (Nit): IMO "start.state", "initial.state", or "target.state" might be
> better than just "state" for the field name in the connector creation
> request.
>
> 2. Why implement this in distributed mode with two writes to the config
> topic? We could augment the existing format for connector configs in the
> config topic [1] with a new field instead.
>
> 3. Although standalone mode is mentioned in the KIP, there's no detail on
> the Java properties file format that we support for standalone mode (i.e.,
> `connect-standalone config/connect-standalone.properties
> config/connect-file-source.properties
> config/connect-file-sink.properties`). Do we plan on adding support for
> that mode as well?
>
> 4. I suspect there will be advantages for this feature beyond offsets
> migration, but if offsets migration were the only motivation for it,
> wouldn't it be simpler to accept an "offsets" field in connector creation
> requests? That way, users wouldn't have to start a connector in a different
> state and then resume it, they could just create the connector like normal.
> I think the current proposal is acceptable, but wanted to float this
> alternative in case we anticipate lots of connector migrations and want to
> optimize for them a bit more.
>
> 5. (NIt): We can link to our own Javadocs [2] instead of javadoc.io
>
>
> [1] -
>
> https://github.com/apache/kafka/blob/dcd8c7d05f2f22f2d815405e7ab3ad7439669239/connect/runtime/src/main/java/org/apache/kafka/connect/storage/KafkaConfigBackingStore.java#L234-L236
>
> [2] - https://kafka.apache.org/35/javadoc/index.html?overview-summary.html
>
>
> Cheers,
>
> Chris
>
> On Thu, Sep 21, 2023 at 2:37 AM Yash Mayya  wrote:
>
> > Hi Ashwin,
> >
> > Thanks for taking a look and sharing your thoughts!
> >
> > 1. Yes, the request / response formats of the two APIs were intentionally
> > made identical for such use-cases. [1]
> >
> > 2. I'm assuming you're referring to retaining the offset / config topic
> > records for a connector when it is deleted by a user? Firstly, a
> > connector's offsets aren't actually currently deleted when the connector
> is
> > deleted - it was list