Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.1 #16

2021-11-16 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 501283 lines...]
[2021-11-17T03:51:32.611Z] > Task :streams:test-utils:compileJava UP-TO-DATE
[2021-11-17T03:51:32.611Z] > Task :streams:jar UP-TO-DATE
[2021-11-17T03:51:32.611Z] > Task :metadata:compileJava UP-TO-DATE
[2021-11-17T03:51:32.611Z] > Task :metadata:classes UP-TO-DATE
[2021-11-17T03:51:32.611Z] > Task :core:compileJava NO-SOURCE
[2021-11-17T03:51:32.611Z] > Task :clients:compileTestJava UP-TO-DATE
[2021-11-17T03:51:32.611Z] > Task 
:streams:generateMetadataFileForMavenJavaPublication
[2021-11-17T03:51:32.611Z] > Task :clients:processTestResources UP-TO-DATE
[2021-11-17T03:51:32.611Z] > Task :clients:testClasses UP-TO-DATE
[2021-11-17T03:51:32.611Z] > Task :connect:json:compileTestJava UP-TO-DATE
[2021-11-17T03:51:32.611Z] > Task :connect:json:testClasses UP-TO-DATE
[2021-11-17T03:51:32.611Z] > Task :raft:compileTestJava UP-TO-DATE
[2021-11-17T03:51:32.611Z] > Task :raft:testClasses UP-TO-DATE
[2021-11-17T03:51:32.611Z] > Task :connect:json:testJar
[2021-11-17T03:51:32.611Z] > Task 
:clients:generateMetadataFileForMavenJavaPublication
[2021-11-17T03:51:33.650Z] > Task 
:clients:generatePomFileForMavenJavaPublication
[2021-11-17T03:51:33.650Z] > Task :connect:json:testSrcJar
[2021-11-17T03:51:33.650Z] > Task :metadata:compileTestJava UP-TO-DATE
[2021-11-17T03:51:33.650Z] > Task :metadata:testClasses UP-TO-DATE
[2021-11-17T03:51:33.650Z] > Task :core:compileScala UP-TO-DATE
[2021-11-17T03:51:33.650Z] > Task :core:classes UP-TO-DATE
[2021-11-17T03:51:33.650Z] > Task :core:compileTestJava NO-SOURCE
[2021-11-17T03:51:34.861Z] > Task :core:compileTestScala UP-TO-DATE
[2021-11-17T03:51:34.861Z] > Task :core:testClasses UP-TO-DATE
[2021-11-17T03:51:38.223Z] > Task :connect:api:javadoc
[2021-11-17T03:51:38.223Z] > Task :connect:api:copyDependantLibs UP-TO-DATE
[2021-11-17T03:51:38.223Z] > Task :connect:api:jar UP-TO-DATE
[2021-11-17T03:51:38.223Z] > Task 
:connect:api:generateMetadataFileForMavenJavaPublication
[2021-11-17T03:51:38.223Z] > Task :connect:json:copyDependantLibs UP-TO-DATE
[2021-11-17T03:51:38.223Z] > Task :connect:json:jar UP-TO-DATE
[2021-11-17T03:51:38.223Z] > Task 
:connect:json:generateMetadataFileForMavenJavaPublication
[2021-11-17T03:51:38.223Z] > Task 
:connect:json:publishMavenJavaPublicationToMavenLocal
[2021-11-17T03:51:38.223Z] > Task :connect:json:publishToMavenLocal
[2021-11-17T03:51:38.223Z] > Task :connect:api:javadocJar
[2021-11-17T03:51:38.223Z] > Task :connect:api:compileTestJava UP-TO-DATE
[2021-11-17T03:51:38.223Z] > Task :connect:api:testClasses UP-TO-DATE
[2021-11-17T03:51:38.223Z] > Task :connect:api:testJar
[2021-11-17T03:51:38.223Z] > Task :connect:api:testSrcJar
[2021-11-17T03:51:38.223Z] > Task 
:connect:api:publishMavenJavaPublicationToMavenLocal
[2021-11-17T03:51:38.223Z] > Task :connect:api:publishToMavenLocal
[2021-11-17T03:51:41.343Z] > Task :streams:javadoc
[2021-11-17T03:51:41.343Z] > Task :streams:javadocJar
[2021-11-17T03:51:41.343Z] > Task :streams:compileTestJava UP-TO-DATE
[2021-11-17T03:51:41.343Z] > Task :streams:testClasses UP-TO-DATE
[2021-11-17T03:51:42.383Z] > Task :streams:testJar
[2021-11-17T03:51:42.383Z] > Task :streams:testSrcJar
[2021-11-17T03:51:42.383Z] > Task 
:streams:publishMavenJavaPublicationToMavenLocal
[2021-11-17T03:51:42.383Z] > Task :streams:publishToMavenLocal
[2021-11-17T03:51:43.422Z] > Task :clients:javadoc
[2021-11-17T03:51:43.422Z] > Task :clients:javadocJar
[2021-11-17T03:51:44.461Z] 
[2021-11-17T03:51:44.461Z] > Task :clients:srcJar
[2021-11-17T03:51:44.461Z] Execution optimizations have been disabled for task 
':clients:srcJar' to ensure correctness due to the following reasons:
[2021-11-17T03:51:44.461Z]   - Gradle detected a problem with the following 
location: '/home/jenkins/workspace/Kafka_kafka_3.1/clients/src/generated/java'. 
Reason: Task ':clients:srcJar' uses this output of task 
':clients:processMessages' without declaring an explicit or implicit 
dependency. This can lead to incorrect results being produced, depending on 
what order the tasks are executed. Please refer to 
https://docs.gradle.org/7.2/userguide/validation_problems.html#implicit_dependency
 for more details about this problem.
[2021-11-17T03:51:45.895Z] 
[2021-11-17T03:51:45.895Z] > Task :clients:testJar
[2021-11-17T03:51:45.895Z] > Task :clients:testSrcJar
[2021-11-17T03:51:45.895Z] > Task 
:clients:publishMavenJavaPublicationToMavenLocal
[2021-11-17T03:51:45.895Z] > Task :clients:publishToMavenLocal
[2021-11-17T03:51:45.895Z] 
[2021-11-17T03:51:45.895Z] Deprecated Gradle features were used in this build, 
making it incompatible with Gradle 8.0.
[2021-11-17T03:51:45.895Z] 
[2021-11-17T03:51:45.895Z] You can use '--warning-mode all' to show the 
individual deprecation warnings and determine if they come from your own 
scripts or plugins.
[2021-11-17T03:51:45.895Z] 

Re: [DISCUSS] KIP-796: Interactive Query v2

2021-11-16 Thread Guozhang Wang
Thanks John! Some more thoughts inlined below.

On Mon, Nov 15, 2021 at 10:07 PM John Roesler  wrote:

> Thanks for the review, Guozhang!
>
> 1. This is a great point. I fell into the age-old trap of
> only considering the simplest store type and forgot about
> this extra wrinkle of the "key schema" that we use in
> Windowed and Session stores.
>
> Depending on how we want to forge forward with our provided
> queries, I think it can still work out ok. The simplest
> solution is just to have windowed versions of our queries
> for use on the windowed stores. That should work naively
> because we're basically just preserving the existing
> interactions. It might not be ideal in the long run, but at
> least it lets us make IQv2 orthogonal from other efforts to
> simplify the stores themselves.
>
> If we do that, then it would actually be correct to go ahead
> and just return the serdes that are present in the Metered
> stores today. For example, if I have a Windowed store with
> Integer keys, then the key serde I get from serdesForStore
> would just be the IntegerSerde. The query I'd use the
> serialized key with would be a RawWindowedKeyQuery, which
> takes a byte[] key and a timestamp. Then, the low-level
> store (the segmented store in this case) would have to take
> the next step to use its schema before making that last-mile
> query. Note, this is precisely how fetch is implemented
> today in RocksDBWindowStore:
>
> public byte[] fetch(final Bytes key, final long timestamp) {
>   return wrapped().get(WindowKeySchema.toStoreKeyBinary(key,
> timestamp, seqnum));
> }
>
> In other words, if we set up our provided Query types to
> stick close to the current store query methods, then
> everything "should work out" (tm).
>
> I think where things start to get more complicated would be
> if we wanted to expose the actual, raw, on-disk binary key
> to the user, along with a serde that can interpret it. Then,
> we would have to pack up the serde and the schema. If we go
> down that road, then knowing which one (the key serde or the
> windowed schema + the key serde) the user wants when they
> ask for "the serde" would be a challenge.
>
> I'm actually thinking maybe we don't need to include the
> serdesForStore method in this proposal, but instead leave it
> for follow-on work (if desired) to add it along with raw
> queries, since it's really only needed if you want raw
> queries and (as you mentioned later) there may be better
> ways to present the serdes, which is always easier to figure
> out once there's a use case.
>
>
> 2. Hmm, if I understand what you mean by the "formatted"
> layer, is that the one supplied by the
> WindowedBytesStoreSupplier or SessionBytesStoreSupplier in
> Materialized? I think the basic idea of this proposal is to
> let whatever store gets supplied there be the "last stop"
> for the query.
>
> For the case of our default windowed store, this is the
> segmented RocksDB store. It's true that this store "wraps" a
> bunch of segments, but it would be the segmented store's
> responsibility to handle the query, not defer to the
> segments. This might mean different things for different
> queries, but naively, I think it can just invoke to the
> current implementation of each of its methods.
>
> There might be future queries that require more
> sophisticated responses, but we should be able to add new
> queries for those, which have no restrictions on their
> response types. For example, we could choose to respond to a
> scan with a list of iterators, one for each segment.
>
>
For `formatted` stores, I also mean the ListValueStore which was added
recently for stream-stream joins, for example. The interface is a KV-store,
but it disables same-key overwrites and instead keeps all the values of the
same key as a list; users can only delete old values by deleting the whole
key-list (which would of course delete new values as well). ListValueStore
uses a KeyValueStore as its inner store, but converts the put call into an
append.

I think in the long run, we should have a different interface other than
KVStore for this type, and the implementation would then be at the
`formatted` store layer. That means the `query` should always be
implemented at the inner layer of the logged store (that could be the most
`inner` store, or the `formatted` store), and upper-level wrapped stores
would delegate to the inner stores.

As for serdes, here are some more second thoughts: generally speaking, it's
always more convenient for users to pass in the value as an object rather
than raw bytes. The only exception is if the query is not for exact matching
but prefix (or suffix, though we do not have such cases today) matching, in
which case we would need the raw bytes in order to pass the prefixed bytes
into the inner store. The returned value, though, could be preferred either
as raw bytes or already deserialized.
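
As a small illustration of that point (plain JDK code, not any proposed IQv2
API): an exact-match query can let the store serialize the key object, but a
prefix match has to compare in the serialized key space, so the caller needs
the raw prefix bytes up front.

    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;

    public class PrefixMatchExample {
        // True if serializedKey starts with serializedPrefix; both sides must already
        // be raw bytes, which is why a prefix query needs access to the serializer.
        static boolean hasPrefix(byte[] serializedKey, byte[] serializedPrefix) {
            return serializedKey.length >= serializedPrefix.length
                && Arrays.equals(Arrays.copyOfRange(serializedKey, 0, serializedPrefix.length),
                                 serializedPrefix);
        }

        public static void main(String[] args) {
            byte[] key = "user-42|2021-11-16".getBytes(StandardCharsets.UTF_8);
            byte[] prefix = "user-42|".getBytes(StandardCharsets.UTF_8);
            System.out.println(hasPrefix(key, prefix)); // prints true
        }
    }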

The composite-serde mostly happens at the key, but not much at the value
(we only have "value-timestamp" type which 

Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.1 #15

2021-11-16 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 500803 lines...]
[2021-11-16T23:49:00.671Z] EndToEndClusterIdTest > testEndToEnd() STARTED
[2021-11-16T23:49:00.873Z] 
[2021-11-16T23:49:00.873Z] GetOffsetShellTest > testTopicPartitionsArg() PASSED
[2021-11-16T23:49:00.873Z] 
[2021-11-16T23:49:00.873Z] GetOffsetShellTest > testInternalExcluded() STARTED
[2021-11-16T23:49:02.715Z] 
[2021-11-16T23:49:02.715Z] EndToEndClusterIdTest > testEndToEnd() PASSED
[2021-11-16T23:49:02.715Z] 
[2021-11-16T23:49:02.715Z] AdminClientWithPoliciesIntegrationTest > 
testInvalidAlterConfigs() STARTED
[2021-11-16T23:49:04.459Z] 
[2021-11-16T23:49:04.459Z] GetOffsetShellTest > testInternalExcluded() PASSED
[2021-11-16T23:49:04.459Z] 
[2021-11-16T23:49:04.459Z] GetOffsetShellTest > 
testTopicPartitionsNotFoundForNonMatchingTopicPartitionPattern() STARTED
[2021-11-16T23:49:07.016Z] 
[2021-11-16T23:49:07.016Z] AdminClientWithPoliciesIntegrationTest > 
testInvalidAlterConfigs() PASSED
[2021-11-16T23:49:07.016Z] 
[2021-11-16T23:49:07.016Z] AdminClientWithPoliciesIntegrationTest > 
testValidAlterConfigs() STARTED
[2021-11-16T23:49:09.382Z] 
[2021-11-16T23:49:09.382Z] GetOffsetShellTest > 
testTopicPartitionsNotFoundForNonMatchingTopicPartitionPattern() PASSED
[2021-11-16T23:49:09.382Z] 
[2021-11-16T23:49:09.382Z] GetOffsetShellTest > 
testTopicPartitionsNotFoundForExcludedInternalTopic() STARTED
[2021-11-16T23:49:10.460Z] 
[2021-11-16T23:49:10.460Z] AdminClientWithPoliciesIntegrationTest > 
testValidAlterConfigs() PASSED
[2021-11-16T23:49:10.460Z] 
[2021-11-16T23:49:10.460Z] AdminClientWithPoliciesIntegrationTest > 
testInvalidAlterConfigsDueToPolicy() STARTED
[2021-11-16T23:49:12.981Z] 
[2021-11-16T23:49:12.981Z] GetOffsetShellTest > 
testTopicPartitionsNotFoundForExcludedInternalTopic() PASSED
[2021-11-16T23:49:12.981Z] 
[2021-11-16T23:49:12.981Z] DeleteTopicTest > testDeleteTopicWithCleaner() 
STARTED
[2021-11-16T23:49:13.994Z] 
[2021-11-16T23:49:13.994Z] AdminClientWithPoliciesIntegrationTest > 
testInvalidAlterConfigsDueToPolicy() PASSED
[2021-11-16T23:49:13.994Z] 
[2021-11-16T23:49:13.994Z] RackAwareAutoTopicCreationTest > 
testAutoCreateTopic() STARTED
[2021-11-16T23:49:21.119Z] 
[2021-11-16T23:49:21.119Z] RackAwareAutoTopicCreationTest > 
testAutoCreateTopic() PASSED
[2021-11-16T23:49:21.119Z] 
[2021-11-16T23:49:21.119Z] SaslPlainPlaintextConsumerTest > 
testCoordinatorFailover() STARTED
[2021-11-16T23:49:29.837Z] 
[2021-11-16T23:49:29.837Z] SaslPlainPlaintextConsumerTest > 
testCoordinatorFailover() PASSED
[2021-11-16T23:49:29.837Z] 
[2021-11-16T23:49:29.837Z] SaslPlainPlaintextConsumerTest > 
testSimpleConsumption() STARTED
[2021-11-16T23:49:32.871Z] 
[2021-11-16T23:49:32.872Z] DeleteTopicTest > testDeleteTopicWithCleaner() 
SKIPPED
[2021-11-16T23:49:32.872Z] 
[2021-11-16T23:49:32.872Z] 1311 tests completed, 1 failed, 9 skipped
[2021-11-16T23:49:32.872Z] 
[2021-11-16T23:49:32.872Z] > Task :core:integrationTest FAILED
[2021-11-16T23:49:32.872Z] 
[2021-11-16T23:49:32.872Z] FAILURE: Build failed with an exception.
[2021-11-16T23:49:32.872Z] 
[2021-11-16T23:49:32.872Z] * What went wrong:
[2021-11-16T23:49:32.872Z] Execution failed for task ':core:integrationTest'.
[2021-11-16T23:49:32.872Z] > Process 'Gradle Test Executor 129' finished with 
non-zero exit value 1
[2021-11-16T23:49:32.872Z]   This problem might be caused by incorrect test 
process configuration.
[2021-11-16T23:49:32.872Z]   Please refer to the test execution section in the 
User Manual at 
https://docs.gradle.org/7.2/userguide/java_testing.html#sec:test_execution
[2021-11-16T23:49:32.872Z] 
[2021-11-16T23:49:32.872Z] * Try:
[2021-11-16T23:49:32.872Z] Run with --stacktrace option to get the stack trace. 
Run with --info or --debug option to get more log output. Run with --scan to 
get full insights.
[2021-11-16T23:49:32.872Z] 
[2021-11-16T23:49:32.872Z] * Get more help at https://help.gradle.org
[2021-11-16T23:49:32.872Z] 
[2021-11-16T23:49:32.872Z] Deprecated Gradle features were used in this build, 
making it incompatible with Gradle 8.0.
[2021-11-16T23:49:32.872Z] 
[2021-11-16T23:49:32.872Z] You can use '--warning-mode all' to show the 
individual deprecation warnings and determine if they come from your own 
scripts or plugins.
[2021-11-16T23:49:32.872Z] 
[2021-11-16T23:49:32.872Z] See 
https://docs.gradle.org/7.2/userguide/command_line_interface.html#sec:command_line_warnings
[2021-11-16T23:49:32.872Z] 
[2021-11-16T23:49:32.872Z] BUILD FAILED in 1h 54m 34s
[2021-11-16T23:49:32.872Z] 202 actionable tasks: 109 executed, 93 up-to-date
[2021-11-16T23:49:32.872Z] 
[2021-11-16T23:49:32.872Z] See the profiling report at: 
file:///home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.1/build/reports/profile/profile-2021-11-16-21-55-00.html
[2021-11-16T23:49:32.872Z] A fine-grained performance profile is available: use 
the --scan option.
[Pipeline] }
[Pipeline] // 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Colin McCabe
On Tue, Nov 16, 2021, at 15:13, Jun Rao wrote:
> Hi, David, Colin,
>
> Thanks for the reply.
>
> 16. Discussed with David offline a bit. We have 3 cases.
> a. We upgrade from an old version where the metadata.version has already
> been finalized. In this case it makes sense to stay with that feature
> version after the upgrade.

+1

> b. We upgrade from an old version where no metadata.version has been
> finalized. In this case, it makes sense to leave metadata.version disabled
> since we don't know if all brokers have been upgraded.

This is the scenario I was hoping to avoid by saying that ALL KRaft clusters 
have metadata.version of at least 1, and each subsequent level for 
metadata.version corresponds to an IBP version. The existing KRaft clusters in 
3.0 and earlier are preview (not for production) so I think this change is OK 
for 3.x (given that it affects only KRaft). Then IBP is irrelevant for KRaft 
clusters (the config is ignored, possibly with a WARN or ERROR message 
generated if it is set).

> c. We are starting from a brand new cluster and of course no
> metadata.version has been finalized. In this case, the KIP says it will
> pick the metadata.version in meta.properties. In the common case, people
> probably won't set the metadata.version in the meta.properties file
> explicitly. So, it will be useful to put a default (stable) version there
> when generating the meta.properties file.

Hmm. I was assuming that clusters where the admin didn't specify any 
metadata.version during formatting would get the latest metadata.version. 
Partly, because this is what we do for IBP today. It would be good to clarify 
this...

>
> Also, it would be useful to clarify that if a FeatureLevelRecord exists for
> metadata.version, the metadata.version in meta.properties will be ignored.
>

Yeah, I agree.

best,
Colin

> Thanks,
>
> Jun
>
>
> On Tue, Nov 16, 2021 at 12:39 PM Colin McCabe  wrote:
>
>> On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote:
>> > Hi, David,
>> >
>> > Thanks for the reply.
>> >
>> > 16. My first concern is that the KIP picks up meta.version inconsistently
>> > during the deployment. If a new cluster is started, we pick up the
>> highest
>> > version. If we upgrade, we leave the feature version unchanged.
>>
>> Hi Jun,
>>
>> Thanks again for taking a look.
>>
>> The proposed behavior in KIP-778 is consistent with how it works today.
>> Upgrading the software is distinct from upgrading the IBP.
>>
>> I think it is important to keep these two operations ("upgrading
>> IBP/metadata version" and "upgrading software version") separate. If they
>> are coupled it will create a situation where software upgrades are
>> difficult and dangerous.
>>
>> Consider a situation where you find some bug in your current software, and
>> you want to upgrade to new software that fixes the bug. If upgrades and IBP
>> bumps are coupled, you can't do this without also bumping the IBP, which is
>> usually considered a high-risk change. That means that either you have to
>> make a special build that includes only the fix (time-consuming and
>> error-prone), live with the bug for longer, or be very conservative about
>> ever introducing new IBP/metadata versions. None of those are really good
>> choices.
>>
>> > Intuitively, it seems that independent of how a cluster is deployed, we
>> > should always pick the same feature version.
>>
>> I think it makes sense to draw a distinction between upgrading an existing
>> cluster and deploying a new one. What most people want out of upgrades is
>> that things should keep working, but with bug fixes. If we change that, it
>> just makes people more reluctant to upgrade (which is always a problem...)
>>
>> > I think we need to think this through in this KIP. My second concern is
>> > that as a particular version matures, it's inconvenient for a user to
>> manually
>> > upgrade every feature version. As long as we have a path to achieve that
>> in
>> > the future, we don't need to address that in this KIP.
>>
>> If people are managing a large number of Kafka clusters, they will want to
>> do some sort of A/B testing with IBP/metadata versions. So if you have 1000
>> Kafka clusters, you roll out the new IBP version to 10 of them and see how
>> it goes. If that goes well, you roll it out to more, etc.
>>
>> So, the automation needs to be at the cluster management layer, not at the
>> Kafka layer. Each Kafka cluster doesn't know how well things went in the
>> other 999 clusters. Automatically upgrading is a bad thing for the same
>> reason Kafka automatically upgrading its own software version would be a
>> bad thing -- it could lead to a disruption to a sensitive cluster at the
>> wrong time.
>>
>> For people who are just managing one or two Kafka clusters, automatically
>> upgrading feature versions isn't a big burden and can be done manually.
>> This is all consistent with how IBP works today.
>>
>> Also, we already have a command-line option to the feature tool which
>> upgrades all features to 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Jun Rao
Hi, David, Colin,

Thanks for the reply.

16. Discussed with David offline a bit. We have 3 cases.
a. We upgrade from an old version where the metadata.version has already
been finalized. In this case it makes sense to stay with that feature
version after the upgrade.
b. We upgrade from an old version where no metadata.version has been
finalized. In this case, it makes sense to leave metadata.version disabled
since we don't know if all brokers have been upgraded.
c. We are starting from a brand new cluster and of course no
metadata.version has been finalized. In this case, the KIP says it will
pick the metadata.version in meta.properties. In the common case, people
probably won't set the metadata.version in the meta.properties file
explicitly. So, it will be useful to put a default (stable) version there
when generating the meta.properties file.

Also, it would be useful to clarify that if a FeatureLevelRecord exists for
metadata.version, the metadata.version in meta.properties will be ignored.

Thanks,

Jun


On Tue, Nov 16, 2021 at 12:39 PM Colin McCabe  wrote:

> On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote:
> > Hi, David,
> >
> > Thanks for the reply.
> >
> > 16. My first concern is that the KIP picks up meta.version inconsistently
> > during the deployment. If a new cluster is started, we pick up the
> highest
> > version. If we upgrade, we leave the feature version unchanged.
>
> Hi Jun,
>
> Thanks again for taking a look.
>
> The proposed behavior in KIP-778 is consistent with how it works today.
> Upgrading the software is distinct from upgrading the IBP.
>
> I think it is important to keep these two operations ("upgrading
> IBP/metadata version" and "upgrading software version") separate. If they
> are coupled it will create a situation where software upgrades are
> difficult and dangerous.
>
> Consider a situation where you find some bug in your current software, and
> you want to upgrade to new software that fixes the bug. If upgrades and IBP
> bumps are coupled, you can't do this without also bumping the IBP, which is
> usually considered a high-risk change. That means that either you have to
> make a special build that includes only the fix (time-consuming and
> error-prone), live with the bug for longer, or be very conservative about
> ever introducing new IBP/metadata versions. None of those are really good
> choices.
>
> > Intuitively, it seems that independent of how a cluster is deployed, we
> > should always pick the same feature version.
>
> I think it makes sense to draw a distinction between upgrading an existing
> cluster and deploying a new one. What most people want out of upgrades is
> that things should keep working, but with bug fixes. If we change that, it
> just makes people more reluctant to upgrade (which is always a problem...)
>
> > I think we need to think this through in this KIP. My second concern is
> > that as a particular version matures, it's inconvenient for a user to
> manually
> > upgrade every feature version. As long as we have a path to achieve that
> in
> > the future, we don't need to address that in this KIP.
>
> If people are managing a large number of Kafka clusters, they will want to
> do some sort of A/B testing with IBP/metadata versions. So if you have 1000
> Kafka clusters, you roll out the new IBP version to 10 of them and see how
> it goes. If that goes well, you roll it out to more, etc.
>
> So, the automation needs to be at the cluster management layer, not at the
> Kafka layer. Each Kafka cluster doesn't know how well things went in the
> other 999 clusters. Automatically upgrading is a bad thing for the same
> reason Kafka automatically upgrading its own software version would be a
> bad thing -- it could lead to a disruption to a sensitive cluster at the
> wrong time.
>
> For people who are just managing one or two Kafka clusters, automatically
> upgrading feature versions isn't a big burden and can be done manually.
> This is all consistent with how IBP works today.
>
> Also, we already have a command-line option to the feature tool which
> upgrades all features to the latest available, if that is what the
> administrator desires. Perhaps we could add documentation saying that this
> should be done as the last step of the upgrade.
>
> best,
> Colin
>
> >
> > 21. "./kafka-features.sh delete": Deleting a feature seems a bit weird
> > since the logic is always there. Would it be better to use disable?
> >
> > Jun
> >
> > On Fri, Nov 5, 2021 at 8:11 AM David Arthur
> >  wrote:
> >
> >> Colin and Jun, thanks for the additional comments!
> >>
> >> Colin:
> >>
> >> > We've been talking about having an automated RPC compatibility checker
> >>
> >> Do we have a way to mark fields in schemas as deprecated? It can stay in
> >> the RPC, it just complicates the logic a bit.
> >>
> >> > It would be nice if the active controller could validate that a
> majority
> >> of the quorum could use the proposed metadata.version. The active
> >> controller should have this 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Guozhang Wang
Yeah I agree that checking that a majority of voters support the
metadata.version is sufficient. What I was originally considering is
whether (in the future) we could consider encoding the metadata.version
value in the vote request as well, so that the elected leader is supposed
to have a version that's supported by a majority of the quorum.

Guozhang

On Tue, Nov 16, 2021 at 2:02 PM Colin McCabe  wrote:

> On Tue, Nov 16, 2021, at 13:36, Guozhang Wang wrote:
> > Hi Colin,
> >
> > If we allow downgrades which would be appended in metadata.version, then
> > the length of the __cluster_metadata log may not be safe to indicate higher
> > versions, since older version records could still be appended later than
> a
> > new version record right?
> >
>
> That's fair... the longer log could be at a lower metadata.version.
>
> However, can you think of a scenario where the upgrade / downgrade logic
> doesn't protect us here?
>
> Checking that a majority of the voters support the new metadata.version
> we're going to seems to cover all the cases I can think of. I guess there
> could be time-of-check, time-of-use race conditions, but they only happen
> if you are doing a feature downgrade at the same time as a software version
> downgrade. This is something the cluster management software should not do
> (I guess we should spell this out in the KIP). I think nobody would want to
> do that since it would mean rolling the cluster while you're messing with
> feature flags.
>
> best,
> Colin
>


-- 
-- Guozhang


[DISCUSS] Brokers disconnect intermittently with TLS1.3

2021-11-16 Thread Kokoori, Shylaja
Hi Luke,



Sorry about the miscommunication, I was not talking about making TLS1.2 the
default. My assumption is that if the JDK version is < 11, TLS 1.2 will be
used, so I wanted to come up with a solution that works for both cases.



To provide more details about the issue, the error reported in kafka.log is
given below. With a record size of 500KB it is easily reproducible.

The scenario when this happens: in the read function (in
SslTransportLayer.java), unwrapResult.getHandshakeStatus() == NEED_WRAP while
unwrapResult.getStatus() == OK, which causes the renegotiation exception to be
thrown.



A simple test I did was to turn off the renegotiation exception and I did not 
see the disconnect messages in the log and the intermittent latency spike.
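
For reference, a rough standalone sketch of the condition being described,
using only the JDK SSLEngineResult API (an illustration, not the actual
SslTransportLayer source):

    import javax.net.ssl.SSLEngineResult;
    import javax.net.ssl.SSLEngineResult.HandshakeStatus;
    import javax.net.ssl.SSLEngineResult.Status;
    import javax.net.ssl.SSLException;

    public class RenegotiationCheckSketch {
        // After the initial handshake has finished, an unwrap that succeeds (Status.OK)
        // but leaves the engine asking to wrap again (NEED_WRAP) looks like the peer is
        // trying to renegotiate, which is rejected here with an exception.
        static void checkForRenegotiation(SSLEngineResult unwrapResult) throws SSLException {
            if (unwrapResult.getStatus() == Status.OK
                    && unwrapResult.getHandshakeStatus() == HandshakeStatus.NEED_WRAP) {
                throw new SSLException("Renegotiation requested, but it is not supported");
            }
        }
    }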



Thank you,

Shylaja





ERROR [SslTransportLayer channelId=1 
key=channel=java.nio.channels.SocketChannel[connection-pending remote=/:9093], 
selector=sun.nio.ch.EPollSelectorImpl@29fc22ba, interestOps=8, 
readyOps=0] Renegotiation requested, but it is not supported, channelId 1, 
appReadBuffer pos 0, netReadBuffer pos 40, netWriteBuffer pos 147 
handshakeStatus NEED_WRAP (org.apache.kafka.common.network.SslTransportLayer)

[2021-10-05 21:03:40,042] INFO [ReplicaFetcher replicaId=0, leaderId=1, 
fetcherId=1] Error sending fetch request (sessionId=530771171, epoch=237174) to 
node 1: (org.apache.kafka.clients.FetchSessionHandler)

java.io.IOException: Connection to 1 was disconnected before the response was 
read

at 
org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:100)

at 
kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:110)

at 
kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:217)

at 
kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:317)

at 
kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:141)

at 
kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:140)

at scala.Option.foreach(Option.scala:437)

at 
kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:140)

at 
kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:123)

at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)

[2021-10-05 21:03:40,042] WARN [ReplicaFetcher replicaId=0, leaderId=1, 
fetcherId=1] Error in response for fetch request (type=FetchRequest, 
replicaId=0, maxWait=500, minBytes=1, maxBytes=10485760, 
fetchData={test_topic0-5=PartitionData(fetchOffset=267151, logStartOffset=0, 
maxBytes=1048576, currentLeaderEpoch=Optional[0], 
lastFetchedEpoch=Optional[0])}, isolationLevel=READ_UNCOMMITTED, toForget=, 
metadata=(sessionId=530771171, epoch=237174), rackId=) 
(kafka.server.ReplicaFetcherThread)

java.io.IOException: Connection to 1 was disconnected before the response was 
read

at 
org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:100)

at 
kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:110)

at 
kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:217)

at 
kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:317)

at 
kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:141)

at 
kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:140)

at scala.Option.foreach(Option.scala:437)

at 
kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:140)


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Colin McCabe
On Tue, Nov 16, 2021, at 13:36, Guozhang Wang wrote:
> Hi Colin,
>
> If we allow downgrades which would be appended in metadata.version, then
> the length of the __cluster_metadata log may not be safe to indicate higher
> versions, since older version records could still be appended later than a
> new version record right?
>

That's fair... the longer log could be at a lower metadata.version.

However, can you think of a scenario where the upgrade / downgrade logic 
doesn't protect us here?

Checking that a majority of the voters support the new metadata.version we're 
going to seems to cover all the cases I can think of. I guess there could be 
time-of-check, time-of-use race conditions, but they only happen if you are 
doing a feature downgrade at the same time as a software version downgrade. 
This is something the cluster management software should not do (I guess we 
should spell this out in the KIP) I think nobody would want to do that since it 
would mean rolling the cluster while you're messing with feature flags.

best,
Colin


Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.1 #14

2021-11-16 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 501092 lines...]
[2021-11-16T21:43:05.949Z] 
[2021-11-16T21:43:05.949Z] DeleteConsumerGroupsTest > 
testDeleteCmdWithMixOfSuccessAndError() PASSED
[2021-11-16T21:43:05.949Z] 
[2021-11-16T21:43:05.949Z] DeleteConsumerGroupsTest > 
testDeleteWithTopicOption() STARTED
[2021-11-16T21:43:08.083Z] 
[2021-11-16T21:43:08.083Z] DeleteConsumerGroupsTest > 
testDeleteWithTopicOption() PASSED
[2021-11-16T21:43:08.083Z] 
[2021-11-16T21:43:08.083Z] DeleteConsumerGroupsTest > 
testDeleteNonExistingGroup() STARTED
[2021-11-16T21:43:10.202Z] 
[2021-11-16T21:43:10.202Z] GetOffsetShellTest > testTopicNameArg() PASSED
[2021-11-16T21:43:10.202Z] 
[2021-11-16T21:43:10.202Z] GetOffsetShellTest > testTopicPatternArg() STARTED
[2021-11-16T21:43:10.381Z] 
[2021-11-16T21:43:10.381Z] DeleteConsumerGroupsTest > 
testDeleteNonExistingGroup() PASSED
[2021-11-16T21:43:10.381Z] 
[2021-11-16T21:43:10.381Z] DeleteConsumerGroupsTest > 
testDeleteCmdNonEmptyGroup() STARTED
[2021-11-16T21:43:13.957Z] 
[2021-11-16T21:43:13.957Z] DeleteConsumerGroupsTest > 
testDeleteCmdNonEmptyGroup() PASSED
[2021-11-16T21:43:13.957Z] 
[2021-11-16T21:43:13.957Z] DeleteConsumerGroupsTest > 
testDeleteCmdNonExistingGroup() STARTED
[2021-11-16T21:43:15.275Z] 
[2021-11-16T21:43:15.275Z] GetOffsetShellTest > testTopicPatternArg() PASSED
[2021-11-16T21:43:15.275Z] 
[2021-11-16T21:43:15.275Z] GetOffsetShellTest > 
testTopicPartitionsNotFoundForNonExistentTopic() STARTED
[2021-11-16T21:43:16.287Z] 
[2021-11-16T21:43:16.287Z] DeleteConsumerGroupsTest > 
testDeleteCmdNonExistingGroup() PASSED
[2021-11-16T21:43:16.287Z] 
[2021-11-16T21:43:16.287Z] DeleteConsumerGroupsTest > testDeleteEmptyGroup() 
STARTED
[2021-11-16T21:43:19.420Z] 
[2021-11-16T21:43:19.420Z] DeleteConsumerGroupsTest > testDeleteEmptyGroup() 
PASSED
[2021-11-16T21:43:19.420Z] 
[2021-11-16T21:43:19.420Z] DeleteConsumerGroupsTest > 
testDeleteWithMixOfSuccessAndError() STARTED
[2021-11-16T21:43:19.954Z] 
[2021-11-16T21:43:19.954Z] GetOffsetShellTest > 
testTopicPartitionsNotFoundForNonExistentTopic() PASSED
[2021-11-16T21:43:19.954Z] 
[2021-11-16T21:43:19.954Z] GetOffsetShellTest > testPartitionsArg() STARTED
[2021-11-16T21:43:22.717Z] 
[2021-11-16T21:43:22.717Z] DeleteConsumerGroupsTest > 
testDeleteWithMixOfSuccessAndError() PASSED
[2021-11-16T21:43:22.717Z] 
[2021-11-16T21:43:22.717Z] DeleteConsumerGroupsTest > 
testDeleteWithUnrecognizedNewConsumerOption() STARTED
[2021-11-16T21:43:24.450Z] 
[2021-11-16T21:43:24.450Z] GetOffsetShellTest > testPartitionsArg() PASSED
[2021-11-16T21:43:24.450Z] 
[2021-11-16T21:43:24.450Z] GetOffsetShellTest > 
testTopicPartitionsArgWithInternalExcluded() STARTED
[2021-11-16T21:43:24.852Z] 
[2021-11-16T21:43:24.852Z] DeleteConsumerGroupsTest > 
testDeleteWithUnrecognizedNewConsumerOption() PASSED
[2021-11-16T21:43:24.852Z] 
[2021-11-16T21:43:24.852Z] DeleteConsumerGroupsTest > testDeleteCmdAllGroups() 
STARTED
[2021-11-16T21:43:28.125Z] 
[2021-11-16T21:43:28.125Z] DeleteConsumerGroupsTest > testDeleteCmdAllGroups() 
PASSED
[2021-11-16T21:43:28.125Z] 
[2021-11-16T21:43:28.125Z] DeleteConsumerGroupsTest > testDeleteCmdEmptyGroup() 
STARTED
[2021-11-16T21:43:30.245Z] 
[2021-11-16T21:43:30.245Z] GetOffsetShellTest > 
testTopicPartitionsArgWithInternalExcluded() PASSED
[2021-11-16T21:43:30.245Z] 
[2021-11-16T21:43:30.245Z] GetOffsetShellTest > testNoFilterOptions() STARTED
[2021-11-16T21:43:30.422Z] 
[2021-11-16T21:43:30.422Z] DeleteConsumerGroupsTest > testDeleteCmdEmptyGroup() 
PASSED
[2021-11-16T21:43:30.422Z] 
[2021-11-16T21:43:30.422Z] ListConsumerGroupTest > 
testListConsumerGroupsWithStates() STARTED
[2021-11-16T21:43:35.141Z] 
[2021-11-16T21:43:35.141Z] GetOffsetShellTest > testNoFilterOptions() PASSED
[2021-11-16T21:43:35.141Z] 
[2021-11-16T21:43:35.141Z] GetOffsetShellTest > 
testTopicPartitionsFlagWithPartitionsFlagCauseExit() STARTED
[2021-11-16T21:43:39.430Z] 
[2021-11-16T21:43:39.430Z] ListConsumerGroupTest > 
testListConsumerGroupsWithStates() PASSED
[2021-11-16T21:43:39.430Z] 
[2021-11-16T21:43:39.430Z] ListConsumerGroupTest > 
testListWithUnrecognizedNewConsumerOption() STARTED
[2021-11-16T21:43:39.607Z] 
[2021-11-16T21:43:39.607Z] GetOffsetShellTest > 
testTopicPartitionsFlagWithPartitionsFlagCauseExit() PASSED
[2021-11-16T21:43:39.607Z] 
[2021-11-16T21:43:39.607Z] GetOffsetShellTest > testTopicPartitionsArg() STARTED
[2021-11-16T21:43:41.757Z] 
[2021-11-16T21:43:41.757Z] ListConsumerGroupTest > 
testListWithUnrecognizedNewConsumerOption() PASSED
[2021-11-16T21:43:41.757Z] 
[2021-11-16T21:43:41.757Z] ListConsumerGroupTest > testListConsumerGroups() 
STARTED
[2021-11-16T21:43:43.053Z] 
[2021-11-16T21:43:43.053Z] GetOffsetShellTest > testTopicPartitionsArg() PASSED
[2021-11-16T21:43:43.053Z] 
[2021-11-16T21:43:43.053Z] GetOffsetShellTest > testInternalExcluded() STARTED
[2021-11-16T21:43:48.219Z] 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Guozhang Wang
Hi Colin,

If we allow downgrades which would be appended in metadata.version, then
the length of the __cluster_metadata log may not be safe to indicate higher
versions, since older version records could still be appended later than a
new version record right?

On Tue, Nov 16, 2021 at 1:16 PM Colin McCabe  wrote:

> On Tue, Nov 16, 2021, at 06:36, David Arthur wrote:
> > An interesting case here is how to handle a version update if a majority
> of
> > the quorum supports it, but the leader doesn't. For example, if C1 was
> the
> > leader and an upgrade to version 4 was requested. Maybe this would
> trigger
> > C1 to resign and inform the client to retry the update later.
> >
>
> Hmm, wouldn't we want to just reject the version update in this case? I
> don't see what the advantage of allowing it would be.
>
> The reason for requiring a majority rather than all voters is mainly to
> cover the case where a voter is down, I thought. That clearly doesn't apply
> if the un-upgraded voter is the leader itself...
>
> >
> > We may eventually want to consider the metadata.version when electing a
> > leader, but as long as we have the majority requirement before
> committing a
> > new metadata.version, I think we should be safe.
> >
>
> Yeah, this is safe. If we elect a leader at metadata.version X then that
> means that a majority of the cluster is at least at version X. Proof by
> contradiction: assume that this is not the case. Then the newly elected
> leader must have a shorter __cluster_metadata log than a majority of the
> voters. But this is incompatible with winning a Raft election.
>
> In the case where the leader is "behind" some of the other voters, those
> voters will truncate their logs to match the new leader. This will
> downgrade them. Basically this is the case where the feature upgrade was
> proposed, but never fully completed.
>
> best,
> Colin
>
>
> > -David
> >
> > On Mon, Nov 15, 2021 at 12:52 PM Guozhang Wang 
> wrote:
> >
> >> Thanks David,
> >>
> >> 1. Got it. One thing I'm still not very clear is why it's sufficient to
> >> select a metadata.version which is supported by majority of the quorum,
> but
> >> not the whole quorum (i.e. choosing the lowest version among all the
> quorum
> >> members)? Since the leader election today does not take this value into
> >> consideration, we are not guaranteed that newly selected leaders would
> >> always be able to recognize and support the initialized metadata.version
> >> right?
> >>
> >> 2. Yeah I think I agree the behavior-but-not-RPC-change scenario is
> beyond
> >> the scope of this KIP, we can defer it to later discussions.
> >>
> >> On Mon, Nov 15, 2021 at 8:13 AM David Arthur
> >>  wrote:
> >>
> >> > Guozhang, thanks for the review!
> >> >
> >> > 1, As we've defined it so far, the initial metadata.version is set by
> an
> >> > operator via the "kafka-storage.sh" tool. It would be possible for
> >> > different values to be selected, but only the quorum leader would
> commit
> >> a
> >> > FeatureLevelRecord with the version they read locally. See the above
> >> reply
> >> > to Jun's question for a little more detail.
> >> >
> >> > We need to enable the KRaft RPCs regardless of metadata.version (vote,
> >> > heartbeat, fetch, etc) so that the quorum can be formed, a leader can
> be
> >> > elected, and followers can learn about the selected metadata.version.
> I
> >> > believe the quorum startup procedure would go something like:
> >> >
> >> > * Controller nodes start, read their logs, begin leader election
> >> > * While the elected node is becoming leader (in
> >> > QuorumMetaLogListener#handleLeaderChange), initialize
> metadata.version if
> >> > necessary
> >> > * Followers replicate the FeatureLevelRecord
> >> >
> >> > This (probably) means the quorum peers must continue to rely on
> >> ApiVersions
> >> > to negotiate compatible RPC versions and these versions cannot depend
> on
> >> > metadata.version.
> >> >
> >> > Does this make sense?
> >> >
> >> >
> >> > 2, ApiVersionResponse includes the broker's supported feature flags as
> >> well
> >> > as the cluster-wide finalized feature flags. We probably need to add
> >> > something like the feature flag "epoch" to this response payload in
> order
> >> > to see which broker is most up to date.
> >> >
> >> > If the new feature flag version included RPC changes, we are helped by
> >> the
> >> > fact that a client won't attempt to use the new RPC until it has
> >> discovered
> >> > a broker that supports it via ApiVersions. The problem is more
> difficult
> >> > for cases like you described where the feature flag upgrade changes
> the
> >> > behavior of the broker, but not its RPCs. This is actually the same
> >> problem
> >> > as upgrading the IBP. During a rolling restart, clients may hit
> different
> >> > brokers with different capabilities and not know it.
> >> >
> >> > This probably warrants further investigation, but hopefully you agree
> it
> >> is
> >> > beyond the scope of this KIP :)
> >> 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Guozhang Wang
Thanks, I think having the leader election consider the metadata.version
would be a good idea moving forward, but we do not need to include it in this
KIP.

On Tue, Nov 16, 2021 at 6:37 AM David Arthur
 wrote:

> Guozhang,
>
> 1. By requiring a majority of all controller nodes to support the version
> selected by the leader, we increase the likelihood that the next leader
> will also support it. We can't guarantee that all nodes definitely support
> the selected metadata.version because there could always be an offline or
> partitioned peer that is running old software.
>
> If a controller running old software manages to get elected, we hit this case:
>
>  In the unlikely event that an active controller encounters an unsupported
> > metadata.version, it should resign and terminate.
>
>
> So, given this, we should be able to eventually elect a controller that
> does support the metadata.version.
>
>
> Consider controllers C1, C2, C3 with this arrangement:
>
> Node  SoftwareVer  MaxMetadataVer
> C1    3.2          1
> C2    3.3          4
> C3    3.3          4
>
> If the current metadata.version is 1 and we're trying to upgrade to 4, we
> would allow it since two of the three nodes support it. If any one
> controller is down while we are attempting an upgrade, we would require
> that both of remaining alive nodes support the target metadata.version
> (since we require a majority of _all_ controller nodes, not just alive
> ones).
>
> An interesting case here is how to handle a version update if a majority of
> the quorum supports it, but the leader doesn't. For example, if C1 was the
> leader and an upgrade to version 4 was requested. Maybe this would trigger
> C1 to resign and inform the client to retry the update later.
>
> We may eventually want to consider the metadata.version when electing a
> leader, but as long as we have the majority requirement before committing a
> new metadata.version, I think we should be safe.
>
> -David
>
> On Mon, Nov 15, 2021 at 12:52 PM Guozhang Wang  wrote:
>
> > Thanks David,
> >
> > 1. Got it. One thing I'm still not very clear is why it's sufficient to
> > select a metadata.version which is supported by majority of the quorum,
> but
> > not the whole quorum (i.e. choosing the lowest version among all the
> quorum
> > members)? Since the leader election today does not take this value into
> > consideration, we are not guaranteed that newly selected leaders would
> > always be able to recognize and support the initialized metadata.version
> > right?
> >
> > 2. Yeah I think I agree the behavior-but-not-RPC-change scenario is
> beyond
> > the scope of this KIP, we can defer it to later discussions.
> >
> > On Mon, Nov 15, 2021 at 8:13 AM David Arthur
> >  wrote:
> >
> > > Guozhang, thanks for the review!
> > >
> > > 1, As we've defined it so far, the initial metadata.version is set by
> an
> > > operator via the "kafka-storage.sh" tool. It would be possible for
> > > different values to be selected, but only the quorum leader would
> commit
> > a
> > > FeatureLevelRecord with the version they read locally. See the above
> > reply
> > > to Jun's question for a little more detail.
> > >
> > > We need to enable the KRaft RPCs regardless of metadata.version (vote,
> > > heartbeat, fetch, etc) so that the quorum can be formed, a leader can
> be
> > > elected, and followers can learn about the selected metadata.version. I
> > > believe the quorum startup procedure would go something like:
> > >
> > > * Controller nodes start, read their logs, begin leader election
> > > * While the elected node is becoming leader (in
> > > QuorumMetaLogListener#handleLeaderChange), initialize metadata.version
> if
> > > necessary
> > > * Followers replicate the FeatureLevelRecord
> > >
> > > This (probably) means the quorum peers must continue to rely on
> > ApiVersions
> > > to negotiate compatible RPC versions and these versions cannot depend
> on
> > > metadata.version.
> > >
> > > Does this make sense?
> > >
> > >
> > > 2, ApiVersionResponse includes the broker's supported feature flags as
> > well
> > > as the cluster-wide finalized feature flags. We probably need to add
> > > something like the feature flag "epoch" to this response payload in
> order
> > > to see which broker is most up to date.
> > >
> > > If the new feature flag version included RPC changes, we are helped by
> > the
> > > fact that a client won't attempt to use the new RPC until it has
> > discovered
> > > a broker that supports it via ApiVersions. The problem is more
> difficult
> > > for cases like you described where the feature flag upgrade changes the
> > > behavior of the broker, but not its RPCs. This is actually the same
> > problem
> > > as upgrading the IBP. During a rolling restart, clients may hit
> different
> > > brokers with different capabilities and not know it.
> > >
> > > This probably warrants further investigation, but hopefully you agree
> it
> > is
> > > beyond the scope of this KIP :)
> > >
> > > -David
> > >

Re: Track topic deletion state without ZK

2021-11-16 Thread Colin McCabe
Hi Omnia,

Topic deletion doesn't get stuck if a broker is down, when using KRaft. There 
is no "deleting" state, only deleted or not deleted.

best,
Colin

On Tue, Nov 2, 2021, at 09:24, Omnia Ibrahim wrote:
> Hi Colin, thanks for your response.
> Regards your point that the topic gets deleted immediately, I got that we
> do this if the cluster is healthy.
> However, if there's a hardware failure with the disk or the broker is
> unreachable and has a replica; In these cases, deleting the log files from
> the failed disk or unreachable broker will be impossible to delete until we
> fix the hardware issue,
> So during troubleshooting, how will we know which topic is stuck for
> deletion because we can't delete some replicas because of hardware
> failures?
>
> Thanks
>
>
> On Mon, Nov 1, 2021 at 8:57 PM Colin McCabe  wrote:
>
>> Hi Omnia,
>>
>> It is not necessary to know which topics are marked for deletion when
>> in KRaft mode, because topic deletion happens immediately.
>>
>> best,
>> Colin
>>
>> On Thu, Oct 28, 2021, at 06:57, Omnia Ibrahim wrote:
>> > Hi,
>> >
>> > Kafka topicCommand used to report which topic is marked for deletion by
>> > checking the znode on zookeeper; this feature has been deprecated without
>> > replacement as part of KAFKA-12596 Remove deprecated --zookeeper in
>> > topicCommands .
>> >
>> > Also as far as I can see, there's no equivalent for this with KIP-500 as
>> > well.
>> >
>> > Is there any other way to know the state of deletion for Kafka with ZK
>> and
>> > Without ZK?
>> >
>> > Is possible to leverage `RemoveTopicRecord` on the metadata topic to
>> > provide the same feature?
>> >
>> >
>> > Thanks
>> >
>> > Omnia
>>


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Colin McCabe
On Tue, Nov 16, 2021, at 06:36, David Arthur wrote:
> An interesting case here is how to handle a version update if a majority of
> the quorum supports it, but the leader doesn't. For example, if C1 was the
> leader and an upgrade to version 4 was requested. Maybe this would trigger
> C1 to resign and inform the client to retry the update later.
>

Hmm, wouldn't we want to just reject the version update in this case? I don't 
see what the advantage of allowing it would be.

The reason for requiring a majority rather than all voters is mainly to cover 
the case where a voter is down, I thought. That clearly doesn't apply if the 
un-upgraded voter is the leader itself...

>
> We may eventually want to consider the metadata.version when electing a
> leader, but as long as we have the majority requirement before committing a
> new metadata.version, I think we should be safe.
>

Yeah, this is safe. If we elect a leader at metadata.version X then that means 
that a majority of the cluster is at least at version X. Proof by 
contradiction: assume that this is not the case. Then the newly elected leader 
must have a shorter __cluster_metadata log than a majority of the voters. But 
this is incompatible with winning a Raft election.

In the case where the leader is "behind" some of the other voters, those voters 
will truncate their logs to match the new leader. This will downgrade them. 
Basically this is the case where the feature upgrade was proposed, but never 
fully completed.

best,
Colin


> -David
>
> On Mon, Nov 15, 2021 at 12:52 PM Guozhang Wang  wrote:
>
>> Thanks David,
>>
>> 1. Got it. One thing I'm still not very clear is why it's sufficient to
>> select a metadata.version which is supported by majority of the quorum, but
>> not the whole quorum (i.e. choosing the lowest version among all the quorum
>> members)? Since the leader election today does not take this value into
>> consideration, we are not guaranteed that newly selected leaders would
>> always be able to recognize and support the initialized metadata.version
>> right?
>>
>> 2. Yeah I think I agree the behavior-but-not-RPC-change scenario is beyond
>> the scope of this KIP, we can defer it to later discussions.
>>
>> On Mon, Nov 15, 2021 at 8:13 AM David Arthur
>>  wrote:
>>
>> > Guozhang, thanks for the review!
>> >
>> > 1, As we've defined it so far, the initial metadata.version is set by an
>> > operator via the "kafka-storage.sh" tool. It would be possible for
>> > different values to be selected, but only the quorum leader would commit
>> a
>> > FeatureLevelRecord with the version they read locally. See the above
>> reply
>> > to Jun's question for a little more detail.
>> >
>> > We need to enable the KRaft RPCs regardless of metadata.version (vote,
>> > heartbeat, fetch, etc) so that the quorum can be formed, a leader can be
>> > elected, and followers can learn about the selected metadata.version. I
>> > believe the quorum startup procedure would go something like:
>> >
>> > * Controller nodes start, read their logs, begin leader election
>> > * While the elected node is becoming leader (in
>> > QuorumMetaLogListener#handleLeaderChange), initialize metadata.version if
>> > necessary
>> > * Followers replicate the FeatureLevelRecord
>> >
>> > This (probably) means the quorum peers must continue to rely on
>> ApiVersions
>> > to negotiate compatible RPC versions and these versions cannot depend on
>> > metadata.version.
>> >
>> > Does this make sense?
>> >
>> >
>> > 2, ApiVersionResponse includes the broker's supported feature flags as
>> well
>> > as the cluster-wide finalized feature flags. We probably need to add
>> > something like the feature flag "epoch" to this response payload in order
>> > to see which broker is most up to date.
>> >
>> > If the new feature flag version included RPC changes, we are helped by
>> the
>> > fact that a client won't attempt to use the new RPC until it has
>> discovered
>> > a broker that supports it via ApiVersions. The problem is more difficult
>> > for cases like you described where the feature flag upgrade changes the
>> > behavior of the broker, but not its RPCs. This is actually the same
>> problem
>> > as upgrading the IBP. During a rolling restart, clients may hit different
>> > brokers with different capabilities and not know it.
>> >
>> > This probably warrants further investigation, but hopefully you agree it
>> is
>> > beyond the scope of this KIP :)
>> >
>> > -David
>> >
>> >
>> > On Mon, Nov 15, 2021 at 10:26 AM David Arthur > >
>> > wrote:
>> >
>> > > Jun, thanks for the comments!
>> > >
>> > > 16, When a new cluster is deployed, we don't select the highest
>> available
>> > > metadata.version, but rather the quorum leader picks a bootstrap
>> version
>> > > defined in meta.properties. As mentioned earlier, we should add
>> > validation
>> > > here to ensure a majority of the followers could support this version
>> > > before initializing it. This would avoid a situation where a 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Colin McCabe
Hi David,

On Mon, Nov 15, 2021, at 08:13, David Arthur wrote:
> This (probably) means the quorum peers must continue to rely on ApiVersions
> to negotiate compatible RPC versions and these versions cannot depend on
> metadata.version.

I agree. Can we explicitly spell out in KIP that quorum peers use ApiVersions 
to negotiate compatible RPC versions, rather than using metadata.version?

>
> If the new feature flag version included RPC changes, we are helped by the
> fact that a client won't attempt to use the new RPC until it has discovered
> a broker that supports it via ApiVersions. The problem is more difficult
> for cases like you described where the feature flag upgrade changes the
> behavior of the broker, but not its RPCs. This is actually the same problem
> as upgrading the IBP. During a rolling restart, clients may hit different
> brokers with different capabilities and not know it.
>
> This probably warrants further investigation, but hopefully you agree it is
> beyond the scope of this KIP :)
>

Yes, this is an existing issue (at least in theory... doesn't seem to be in 
practice) and out of scope for this KIP.

> Thinking a bit more on this, we do need to define a state where we're
> running newer software, but we don't have the feature flag set. This could
> happen if we were running an older IBP that did not support KIP-778.

I took another look at the KIP and I see that it calls for IBP to continue to 
exist alongside metadata.version. In our previous offline discussions, I think 
we were assuming that metadata.version would replace the IBP completely. Also, 
there would be a 1:1 correspondence between new IBP versions and new 
metadata.version levels. Basically ZK-based clusters will keep using IBP, the 
same as they've always done, whereas KRaft-based clusters will use 
metadata.version instead.

I actually don't think it makes sense to keep IBP if we are introducing 
metadata.version. I think metadata.version should replace it completely in the 
KRaft case, and ZK-based brokers should keep working exactly like they 
currently do. This avoids all the complexity described above. What do you think?

Also, it seems like KIP-778 is not present on the main KIP wiki page... can you 
add it?

best,
Colin


> -David
>
>
> On Mon, Nov 15, 2021 at 10:26 AM David Arthur 
> wrote:
>
>> Jun, thanks for the comments!
>>
>> 16, When a new cluster is deployed, we don't select the highest available
>> metadata.version, but rather the quorum leader picks a bootstrap version
>> defined in meta.properties. As mentioned earlier, we should add validation
>> here to ensure a majority of the followers could support this version
>> before initializing it. This would avoid a situation where a failover
>> results in a new leader who can't support the selected metadata.version.
>>
>> Thinking a bit more on this, we do need to define a state where we're
>> running newer software, but we don't have the feature flag set. This could
>> happen if we were running an older IBP that did not support KIP-778.
>> Following on this, it doesn't seem too difficult to consider a case where
>> the IBP has been upgraded, but we still have not finalized a
>> metadata.version. Here are some possible version combinations (assuming
>> KIP-778 is added to Kafka 3.2):
>>
>> Case  Software  IBP  metadata.version  effective version
>> ---------------------------------------------------------
>> A     3.1       3.1  -                 0  software too old for feature flag
>> B     3.2       3.1  -                 0  feature flag supported, but IBP too old
>> C     3.2       3.2  -                 0  feature flag supported, but not initialized
>> D     3.2       3.2  1                 1  feature flag initialized to 1 (via operator or bootstrap process)
>> ...
>> E     3.8       3.1  -                 0  ...IBP too old
>> F     3.8       3.2  -                 0  ...not initialized
>> G     3.8       3.2  4                 4
>>
>>
>> Here I'm defining version 0 as "no metadata.version set". So back to your
>> comment, I think the KIP omits discussion of case C from the above table
>> which I can amend. Does that cover your concerns, or am I missing something
>> else?
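
To make the table above concrete, here is a rough sketch of how the "effective 
version" column could be derived. Nothing below is from the KIP or the Kafka code 
base; the names and the assumption that the feature flag lands in 3.2 simply 
follow the table. For example, effective("3.8", "3.2", 4) yields 4 (case G), while 
effective("3.2", "3.1", null) yields 0 (case B).

    // Hypothetical sketch of the "effective version" logic in the table above.
    final class EffectiveMetadataVersion {
        static final int UNSET = 0; // "no metadata.version set"

        // finalizedVersion is null when no metadata.version has been initialized.
        static int effective(String software, String ibp, Integer finalizedVersion) {
            boolean softwareSupportsFlag = compare(software, "3.2") >= 0; // cases B..G
            boolean ibpSupportsFlag = compare(ibp, "3.2") >= 0;           // cases C, D, F, G
            if (!softwareSupportsFlag || !ibpSupportsFlag || finalizedVersion == null) {
                return UNSET;        // cases A, B, C, E, F
            }
            return finalizedVersion; // cases D, G
        }

        // Compares dotted version strings numerically, e.g. "3.10" > "3.2".
        private static int compare(String a, String b) {
            String[] x = a.split("\\."), y = b.split("\\.");
            for (int i = 0; i < Math.max(x.length, y.length); i++) {
                int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
                int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
                if (xi != yi) return Integer.compare(xi, yi);
            }
            return 0;
        }
    }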
>>
>>
>> > it's inconvenient for a user to manually upgrade every feature version
>>
>> For this, we would probably want to extend the capabilities of KIP-584. I
>> don't think anything we've discussed for KIP-778 will preclude us from
>> adding some kind of auto-upgrade in the future.
>>
>> 21, "disable" sounds good to me. I agree "delete feature-x" sounds a bit
>> weird.
>>
>>
>>
>> On Mon, Nov 8, 2021 at 8:47 PM Guozhang Wang  wrote:
>>
>>> Hello David,
>>>
>>> Thanks for the very nice writeup! It helped me a lot to refresh my memory
>>> on KIP-630/590/584 :)
>>>
>>> I just had two clarification questions after reading through the KIP:
>>>
>>> 1. For the initialization procedure, do we guarantee that all the quorum

Re: [DISCUSS] KIP-779: Allow Source Tasks to Handle Producer Exceptions

2021-11-16 Thread Knowles Atchison Jr
Thank you all for the feedback, the KIP has been updated.

On Tue, Nov 16, 2021 at 10:46 AM Arjun Satish 
wrote:

> One more nit: the RetryWithToleranceOperator class is not a public
> interface. So we do not have to call the changes in them out in the Public
> Interfaces section.
>
>
> On Tue, Nov 16, 2021 at 10:42 AM Arjun Satish 
> wrote:
>
> > Chris' point about upgrades is valid. An existing configuration will now
> > have additional behavior. We should clearly call this out in the kip, and
> > whenever they are prepared -- the release notes. It's a bit crummy when
> > upgrading, but I do think it's better than introducing a new
> configuration
> > in the long term.
> >
> > On Mon, Nov 15, 2021 at 2:52 PM Knowles Atchison Jr <
> katchiso...@gmail.com>
> > wrote:
> >
> >> Chris,
> >>
> >> Thank you for the feedback. I can certainly update the KIP to state that
>> once exactly-once support is in place, the task would be failed even if
>> errors.tolerance were set to all. Programmatically it would still require
> >> PRs to be merged to build on top of. I also liked my original
> >> implementation of the hook as it gave the connector writers the most
> >> flexibility in handling producer errors. I changed the original
> >> implementation as the progression/changes still supported my use case
> and
> >> I
> >> thought it would move this process along faster.
> >>
> >> Knowles
> >>
> >> On Thu, Nov 11, 2021 at 3:43 PM Chris Egerton
>  >> >
> >> wrote:
> >>
> >> > Hi Knowles,
> >> >
> >> > I think this looks good for the most part but I'd still like to see an
> >> > explicit mention in the KIP (and proposed doc/Javadoc changes) that
> >> states
> >> > that, with exactly-once support enabled, producer exceptions that
> result
> >> > from failures related to exactly-once support (including but not
> >> limited to
> >> > ProducerFencedException instances (
> >> >
> >> >
> >>
> https://kafka.apache.org/30/javadoc/org/apache/kafka/common/errors/ProducerFencedException.html
> >> > ))
> >> > will not be skipped even with "errors.tolerance" set to "all", and
> will
> >> > instead unconditionally cause the task to fail. Your proposal that
> >> > "WorkerSourceTask could check the configuration before handing off the
> >> > records and exception to this function" seems great as long as we
> update
> >> > "handing off the records and exceptions to this function" to the
> >> > newly-proposed behavior of "logging the exception and continuing to
> poll
> >> > the task for data".
> >> >
> >> > I'm also a little bit wary of updating the existing "errors.tolerance"
> >> > configuration to have new behavior that users can't opt out of without
> >> also
> >> > opting out of the current behavior they get with "errors.tolerance"
> set
> >> to
> >> > "all", but I think I've found a decent argument in favor of it. One
> >> thought
> >> > that came to mind is whether this use case was originally considered
> >> when
> >> > KIP-298 was being discussed. However, it appears that KAFKA-8586 (
> >> > https://issues.apache.org/jira/browse/KAFKA-8586), the fix for which
> >> > caused
> >> > tasks to fail on non-retriable, asynchronous producer exceptions
> >> instead of
> >> > logging them and continuing, was discovered over a full year after the
> >> > changes for KIP-298 (https://github.com/apache/kafka/pull/5065) were
> >> > merged. I suspect that the current proposal aligns nicely with the
> >> original
> >> > design intent of KIP-298, and that if KAFKA-8586 were discovered
> before
> >> or
> >> > during discussion for KIP-298, non-retriable, asynchronous producer
> >> > exceptions would have been included in its scope. With that in mind,
> >> > although it may cause issues for some niche use cases, I think that
> >> this is
> >> > a valid change and would be worth the tradeoff of potentially
> >> complicating
> >> > life for a small number of users. I'd be interested in Arjun's
> thoughts
> >> on
> >> > this though (as he designed and implemented KIP-298), and if this
> >> analysis
> >> > is agreeable, we may want to document that information in the KIP as
> >> well
> >> > to strengthen our case for not introducing a new configuration
> property
> >> and
> >> > instead making this behavior tied to the existing "errors.tolerance"
> >> > property with no opt-out besides using a new value for that property.
> >> >
> >> > My last thought is that, although it may be outside the scope of this
> >> KIP,
> >> > I believe your original proposal of giving tasks a hook to handle
> >> > downstream exceptions is actually quite valid. The DLQ feature for
> sink
> >> > connectors is an extremely valuable one as it prevents data loss when
> >> > "errors.tolerance" is set to "all" by allowing users to reprocess
> >> > problematic records at a later date without stopping the flow of data
> in
> >> > their connector entirely. As others have noted, it's difficult if not
> >> > outright impossible to provide a Kafka DLQ topic for source connectors
> 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Colin McCabe
On Fri, Nov 5, 2021, at 08:10, David Arthur wrote:
> Colin and Jun, thanks for the additional comments!
>
> Colin:
>
>> We've been talking about having an automated RPC compatibility checker
>
> Do we have a way to mark fields in schemas as deprecated? It can stay in
> the RPC, it just complicates the logic a bit.
>

Hi David,

The easiest thing to do is probably just to add a comment to the "description" 
field.

Thinking about it more, I really think we should stick to the normal RPC 
versioning rules. Making exceptions just tends to lead to issues down the road.

>> It would be nice if the active controller could validate that a majority
> of the quorum could use the proposed metadata.version. The active
> controller should have this information, right? If we don't have recent
> information  from a quorum of voters, we wouldn't be active.
>
> I believe we should have this information from the ApiVersionsResponse. It
> would be good to do this validation to avoid a situation where a
> quorum leader can't be elected due to unprocessable records.
>

Sounds good... can you add this to the KIP?

>> Do we need delete as a command separate from downgrade?
>
> I think from an operator's perspective, it is nice to distinguish between
> changing a feature flag and unsetting it. It might be surprising to an
> operator to see the flag's version set to nothing when they requested the
> downgrade to version 0 (or less).
>

Fair enough. If we want to go this route, we should also specify whether setting 
the version to 0 is legal, or whether it's necessary to use "delete".

Jun's proposal of using "disable" instead of delete also seems reasonable 
here...

>
> > it seems like we should spell out that metadata.version begins at 1 in
> >KRaft clusters
>
> I added this text:
>

Thanks.

> > We probably also want an RPC implemented by both brokers and controllers
> > that will reveal the min and max supported versions for each feature level
> > supported by the server
>
> This is available in ApiVersionsResponse (we include the server's supported
> features as well as the cluster's finalized features)

Good point.

>
> Jun:
>
> 20. Indeed snapshots are not strictly necessary during an upgrade, I've
> reworded this
>

I thought we agreed that we would do a snapshot before each upgrade so that if 
there was a big problem with the new metadata version, we would at least have a 
clean snapshot to fall back on?

best,
Colin


>
> Thanks!
> David
>
>
> On Thu, Nov 4, 2021 at 6:51 PM Jun Rao  wrote:
>
>> Hi, David, Jose and Colin,
>>
>> Thanks for the reply. A few more comments.
>>
>> 12. It seems that we haven't updated the AdminClient accordingly?
>>
>> 14. "Metadata snapshot is generated and sent to the other inactive
>> controllers and to brokers". I thought we wanted each broker to generate
>> its own snapshot independently? If only the controller generates the
>> snapshot, how do we force other brokers to pick it up?
>>
>> 16. If a feature version is new, one may not want to enable it immediately
>> after the cluster is upgraded. However, if a feature version has been
>> stable, requiring every user to run a command to upgrade to that version
>> seems inconvenient. One way to improve this is for each feature to define
>> one version as the default. Then, when we upgrade a cluster, we will
>> automatically upgrade the feature to the default version. An admin could
>> use the tool to upgrade to a version higher than the default.
>>
>> 20. "The quorum controller can assist with this process by generating a
>> metadata snapshot after a metadata.version increase has been committed to
>> the metadata log. This snapshot will be a convenient way to let broker and
>> controller components rebuild their entire in-memory state following an
>> upgrade." The new version of the software could read both the new and the
>> old version. Is generating a new snapshot during upgrade needed?
>>
>> Jun
>>
>>
>> On Wed, Nov 3, 2021 at 5:42 PM Colin McCabe  wrote:
>>
>> > On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote:
>> > > Hi, David,
>> > >
>> > > One more comment.
>> > >
>> > > 16. The main reason why KIP-584 requires finalizing a feature manually
>> is
>> > > that in the ZK world, the controller doesn't know all brokers in a
>> > cluster.
>> > > A broker temporarily down is not registered in ZK. In the KRaft world,
>> > the
>> > > controller keeps track of all brokers, including those that are
>> > temporarily
>> > > down. This makes it possible for the controller to automatically
>> > finalize a
>> > > feature---it's safe to do so when all brokers support that feature.
>> This
>> > > will make the upgrade process much simpler since no manual command is
>> > > required to turn on a new feature. Have we considered this?
>> > >
>> > > Thanks,
>> > >
>> > > Jun
>> >
>> > Hi Jun,
>> >
>> > I guess David commented on this point already, but I'll comment as well.
>> I
>> > always had the perception that users viewed rolls as potentially risky
>> and
>> > were 

Re: [VOTE] KIP-779: Allow Source Tasks to Handle Producer Exceptions

2021-11-16 Thread Chris Egerton
+1 (non-binding). Thanks Knowles!

On Tue, Nov 16, 2021 at 10:48 AM Arjun Satish 
wrote:

> +1 (non-binding). Thanks for the KIP, Knowles! and appreciate the
> follow-ups!
>
> On Thu, Nov 11, 2021 at 2:55 PM John Roesler  wrote:
>
> > Thanks, Knowles!
> >
> > I'm +1 (binding)
> >
> > -John
> >
> > On Wed, 2021-11-10 at 12:42 -0500, Christopher Shannon
> > wrote:
> > > +1 (non-binding). This looks good to me and will be useful as a way to
> > > handle producer errors.
> > >
> > > On Mon, Nov 8, 2021 at 8:55 AM Knowles Atchison Jr <
> > katchiso...@gmail.com>
> > > wrote:
> > >
> > > > Good morning,
> > > >
> > > > I'd like to start a vote for KIP-779: Allow Source Tasks to Handle
> > Producer
> > > > Exceptions:
> > > >
> > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-779%3A+Allow+Source+Tasks+to+Handle+Producer+Exceptions
> > > >
> > > > The purpose of this KIP is to allow Source Tasks the option to
> "ignore"
> > > > kafka producer exceptions. After a few iterations, this is now part
> of
> > the
> > > > "errors.tolerance" configuration and provides a null RecordMetadata
> to
> > > > commitRecord() in lieu of a new SourceTask interface method or worker
> > > > configuration item.
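
As a rough illustration of what this means for connector authors: the 
commitRecord(SourceRecord, RecordMetadata) hook already exists in the Connect API; 
the null-metadata convention is what the KIP proposes, and the helper methods 
below are hypothetical.

    import org.apache.kafka.clients.producer.RecordMetadata;
    import org.apache.kafka.connect.source.SourceRecord;
    import org.apache.kafka.connect.source.SourceTask;

    // Sketch of how a task might react to the proposed behavior: under
    // errors.tolerance=all, a record whose produce failed is reported back
    // with a null RecordMetadata instead of failing the task.
    public abstract class ProducerErrorAwareTask extends SourceTask {

        @Override
        public void commitRecord(SourceRecord record, RecordMetadata metadata)
                throws InterruptedException {
            if (metadata == null) {
                // The worker could not produce this record but tolerated the error;
                // e.g. re-queue it in the source system or record it for later replay.
                handleDroppedRecord(record);
            } else {
                // Normal ack path: the record landed at metadata.topic()/partition()/offset().
                markCommitted(record, metadata.offset());
            }
        }

        protected abstract void handleDroppedRecord(SourceRecord record);

        protected abstract void markCommitted(SourceRecord record, long offset);
    }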
> > > >
> > > > PR is here:
> > > >
> > > > https://github.com/apache/kafka/pull/11382
> > > >
> > > > Any comments and feedback are welcome.
> > > >
> > > > Knowles
> > > >
> >
> >
> >
>


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Colin McCabe
On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote:
> Hi, David,
>
> Thanks for the reply.
>
> 16. My first concern is that the KIP picks up meta.version inconsistently
> during the deployment. If a new cluster is started, we pick up the highest
> version. If we upgrade, we leave the feature version unchanged.

Hi Jun,

Thanks again for taking a look.

The proposed behavior in KIP-778 is consistent with how it works today. 
Upgrading the software is distinct from upgrading the IBP.

I think it is important to keep these two operations ("upgrading IBP/metadata 
version" and "upgrading software version") separate. If they are coupled it 
will create a situation where software upgrades are difficult and dangerous.

Consider a situation where you find some bug in your current software, and you 
want to upgrade to new software that fixes the bug. If upgrades and IBP bumps 
are coupled, you can't do this without also bumping the IBP, which is usually 
considered a high-risk change. That means that either you have to make a 
special build that includes only the fix (time-consuming and error-prone), live 
with the bug for longer, or be very conservative about ever introducing new 
IBP/metadata versions. None of those are really good choices.

> Intuitively, it seems that independent of how a cluster is deployed, we
> should always pick the same feature version.

I think it makes sense to draw a distinction between upgrading an existing 
cluster and deploying a new one. What most people want out of upgrades is that 
things should keep working, but with bug fixes. If we change that, it just 
makes people more reluctant to upgrade (which is always a problem...)

> I think we need to think this through in this KIP. My second concern is
> that as a particular version matures, it's inconvenient for a user to manually
> upgrade every feature version. As long as we have a path to achieve that in
> the future, we don't need to address that in this KIP.

If people are managing a large number of Kafka clusters, they will want to do 
some sort of A/B testing with IBP/metadata versions. So if you have 1000 Kafka 
clusters, you roll out the new IBP version to 10 of them and see how it goes. 
If that goes well, you roll it out to more, etc.

So, the automation needs to be at the cluster management layer, not at the 
Kafka layer. Each Kafka cluster doesn't know how well things went in the other 
999 clusters. Automatically upgrading is a bad thing for the same reason Kafka 
automatically upgrading its own software version would be a bad thing -- it 
could lead to a disruption to a sensitive cluster at the wrong time.

For people who are just managing one or two Kafka clusters, upgrading feature 
versions manually isn't a big burden. This is all consistent with how IBP works 
today.

Also, we already have a command-line option to the feature tool which upgrades 
all features to the latest available, if that is what the administrator 
desires. Perhaps we could add documentation saying that this should be done as 
the last step of the upgrade.

best,
Colin

>
> 21. "./kafka-features.sh delete": Deleting a feature seems a bit weird
> since the logic is always there. Would it be better to use disable?
>
> Jun
>
> On Fri, Nov 5, 2021 at 8:11 AM David Arthur
>  wrote:
>
>> Colin and Jun, thanks for the additional comments!
>>
>> Colin:
>>
>> > We've been talking about having an automated RPC compatibility checker
>>
>> Do we have a way to mark fields in schemas as deprecated? It can stay in
>> the RPC, it just complicates the logic a bit.
>>
>> > It would be nice if the active controller could validate that a majority
>> of the quorum could use the proposed metadata.version. The active
>> controller should have this information, right? If we don't have recent
>> information  from a quorum of voters, we wouldn't be active.
>>
>> I believe we should have this information from the ApiVersionsResponse. It
>> would be good to do this validation to avoid a situation where a
>> quorum leader can't be elected due to unprocessable records.
>>
>> > Do we need delete as a command separate from downgrade?
>>
>> I think from an operator's perspective, it is nice to distinguish between
>> changing a feature flag and unsetting it. It might be surprising to an
>> operator to see the flag's version set to nothing when they requested the
>> downgrade to version 0 (or less).
>>
>> > it seems like we should spell out that metadata.version begins at 1 in
>> KRaft clusters
>>
>> I added this text:
>>
>> Introduce an IBP version to indicate the lowest software version that
>> > supports *metadata.version*. Below this IBP, the *metadata.version* is
>> > undefined and will not be examined. At or above this IBP, the
>> > *metadata.version* must be *0* for ZooKeeper clusters and will be
>> > initialized as *1* for KRaft clusters.
>>
>>
>> > We probably also want an RPC implemented by both brokers and controllers
>> that will reveal the 

Re: KIP-769: Connect API to retrieve connector configuration definitions

2021-11-16 Thread Gunnar Morling
Hi,

I'm +1 for adding a GET endpoint for obtaining config definitions. It
always felt odd to me that one has to issue a PUT for that purpose. If
nothing else, it'd be better in terms of discoverability of the KC REST API.

One additional feature request I'd have is to expose the valid enum
constants for enum-typed options. That'll help to display the values in a
drop-down or via radio buttons in a UI, give us tab completion in kcctl,
etc.
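
For context, the allowed values for such an option typically already live in the 
connector's ConfigDef via a validator, so an endpoint that returns config 
definitions could surface them. A small sketch (the option name and values below 
are made up for illustration):

    import org.apache.kafka.common.config.ConfigDef;

    // Example of the kind of enum-typed option referred to above: the valid
    // values are declared with a ValidString validator on the ConfigDef.
    public class ExampleConnectorConfig {
        public static final String FORMAT_CONFIG = "output.format";

        public static ConfigDef configDef() {
            return new ConfigDef()
                .define(FORMAT_CONFIG,
                        ConfigDef.Type.STRING,
                        "json",
                        ConfigDef.ValidString.in("json", "avro", "csv"), // the constants a UI could show
                        ConfigDef.Importance.MEDIUM,
                        "Serialization format for records written by this connector.");
        }
    }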

Best,

--Gunnar


On Tue, Nov 16, 2021 at 4:31 PM Chris Egerton
wrote:

> Hi Viktor,
>
> It sounds like there are three major points here in favor of a new GET
> endpoint for connector config defs.
>
> 1. You cannot issue a blank ("dummy") request for sink connectors because a
> topic list/topic regex has to be supplied (otherwise the PUT endpoint
> returns a 500 response)
> 2. A dummy request still triggers custom validations by the connector,
> which may be best to avoid if we know for sure that the config isn't worth
> validating yet
> 3. It's more ergonomic and intuitive to be able to issue a GET request
> without having to give a dummy connector config
>
> With regards to 1, this is actually a bug in Connect (
> https://issues.apache.org/jira/browse/KAFKA-13327) with a fix already
> implemented and awaiting committer review (
> https://github.com/apache/kafka/pull/11369). I think it'd be better to
> focus on fixing this bug in general instead of implementing a new REST
> endpoint in order to allow people to work around it.
>
> With regards to 2, this is technically possible but I'm unsure it'd be too
> common out in the wild given that most validations that could be expensive
> would involve things like connecting to a database, checking if a cloud
> storage bucket exists, etc., none of which are possible without some
> configuration properties from the user (db hostname, bucket name, etc.).
>
> With regards to 3, I do agree that it'd be easier for people designing UIs
> to have a GET API to work against. I'm just not sure it's worth the
> additional implementation, testing, and maintenance burden. If it were
> possible to issue a PUT request without unexpected 500s for invalid
> configs, would that suffice? AFAICT it'd basically be as simple as issuing
> a PUT request with a dummy body consisting of nothing except the connector
> class (which at this point we might even make unnecessary and just
> automatically replace with the connector class from the URL) and then
> filtering the response to just grab the "definition" field of each element
> in the "configs" array in the response.
>
> Cheers,
>
> Chris
>
> On Tue, Nov 16, 2021 at 9:52 AM Viktor Somogyi-Vass <
> viktorsomo...@gmail.com>
> wrote:
>
> > Hi Folks,
> >
> > I too think this would be a very useful feature. Some of our management
> > applications would provide a wizard for creating connectors. In this
> > scenario the user basically would fill out a sample configuration
> generated
> > by the UI which would send it back to Connect for validation and
> eventually
> > create a new connector. The first part of this workflow can be enhanced
> if
> > we had an API that can return the configuration definition of the given
> > type of connector as the UI application would be able to generate a
> sample
> > for the user based on that (nicely drawn diagram:
> > https://imgur.com/a/7S1Xwm5).
> > The connector-plugins/{connectorType}/config/validate API essentially
> works
> > and returns the data that we need, however it is a HTTP PUT API that is a
> > bit unintuitive for a fetch-like functionality and also functionally
> > different as it validates the given (dummy) request. In case of sink
> > connectors one would need to also provide a topic name.
> >
> > A suggestion for the KIP: I think it can be useful to return the config
> > groups and the connector class' name similarly to the validate API just
> in
> > case any frontend needs them (and also the response would be more like
> the
> > validate API but simpler).
> >
> > Viktor
> >
> > On Fri, Aug 20, 2021 at 4:51 PM Ryanne Dolan 
> > wrote:
> >
> > > I think it'd be worth adding a GET version, fwiw. Could be the same
> > handler
> > > with just a different spelling maybe.
> > >
> > > On Fri, Aug 20, 2021, 7:44 AM Mickael Maison  >
> > > wrote:
> > >
> > > > Hi Chris,
> > > >
> > > > You're right, you can achieve the same functionality using the
> > > > existing validate endpoint.
> > > > In my mind it was only for validation once you have build a
> > > > configuration but when used with an empty configuration, it basically
> > > > serves the same purpose as the proposed new endpoint.
> > > >
> > > > I think it's a bit easier to use a GET endpoint but I don't think it
> > > > really warrants a different endpoint.
> > > >
> > > > Thanks
> > > >
> > > > On Thu, Aug 19, 2021 at 2:56 PM Chris Egerton
> > > >  wrote:
> > > > >
> > > > > Hi Mickael,
> > > > >
> > > > > I'm wondering about the use case here. The motivation section
> states
> > > that
> > > > 

[jira] [Resolved] (KAFKA-13071) Deprecate and remove --authorizer option in kafka-acls.sh

2021-11-16 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-13071.
-
Fix Version/s: 3.1.0
   Resolution: Fixed

> Deprecate and remove --authorizer option in kafka-acls.sh
> -
>
> Key: KAFKA-13071
> URL: https://issues.apache.org/jira/browse/KAFKA-13071
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Major
>  Labels: needs-kip
> Fix For: 3.1.0
>
>
> Now that we have all of the ACL APIs implemented through the admin client, we 
> should consider deprecating and removing support for the --authorizer flag in 
> kafka-acls.sh.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.1 #13

2021-11-16 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 502245 lines...]
[2021-11-16T18:37:20.230Z] > Task :raft:testClasses UP-TO-DATE
[2021-11-16T18:37:20.230Z] > Task :connect:json:testJar
[2021-11-16T18:37:20.230Z] > Task :connect:json:testSrcJar
[2021-11-16T18:37:20.230Z] > Task :metadata:compileTestJava UP-TO-DATE
[2021-11-16T18:37:20.230Z] > Task :metadata:testClasses UP-TO-DATE
[2021-11-16T18:37:20.230Z] > Task :core:compileScala UP-TO-DATE
[2021-11-16T18:37:20.230Z] > Task :core:classes UP-TO-DATE
[2021-11-16T18:37:20.230Z] > Task :core:compileTestJava NO-SOURCE
[2021-11-16T18:37:20.230Z] > Task 
:clients:generateMetadataFileForMavenJavaPublication
[2021-11-16T18:37:20.230Z] > Task 
:clients:generatePomFileForMavenJavaPublication
[2021-11-16T18:37:20.230Z] 
[2021-11-16T18:37:20.230Z] > Task :streams:processMessages
[2021-11-16T18:37:20.230Z] Execution optimizations have been disabled for task 
':streams:processMessages' to ensure correctness due to the following reasons:
[2021-11-16T18:37:20.230Z]   - Gradle detected a problem with the following 
location: 
'/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.1@2/streams/src/generated/java/org/apache/kafka/streams/internals/generated'.
 Reason: Task ':streams:srcJar' uses this output of task 
':streams:processMessages' without declaring an explicit or implicit 
dependency. This can lead to incorrect results being produced, depending on 
what order the tasks are executed. Please refer to 
https://docs.gradle.org/7.2/userguide/validation_problems.html#implicit_dependency
 for more details about this problem.
[2021-11-16T18:37:20.230Z] MessageGenerator: processed 1 Kafka message JSON 
files(s).
[2021-11-16T18:37:20.230Z] 
[2021-11-16T18:37:20.230Z] > Task :core:compileTestScala UP-TO-DATE
[2021-11-16T18:37:20.230Z] > Task :core:testClasses UP-TO-DATE
[2021-11-16T18:37:20.230Z] > Task :streams:compileJava UP-TO-DATE
[2021-11-16T18:37:20.230Z] > Task :streams:classes UP-TO-DATE
[2021-11-16T18:37:20.230Z] > Task :streams:copyDependantLibs UP-TO-DATE
[2021-11-16T18:37:20.230Z] > Task :streams:test-utils:compileJava UP-TO-DATE
[2021-11-16T18:37:20.230Z] > Task :streams:jar UP-TO-DATE
[2021-11-16T18:37:20.230Z] > Task 
:streams:generateMetadataFileForMavenJavaPublication
[2021-11-16T18:37:23.817Z] > Task :connect:api:javadoc
[2021-11-16T18:37:23.817Z] > Task :connect:api:copyDependantLibs UP-TO-DATE
[2021-11-16T18:37:23.817Z] > Task :connect:api:jar UP-TO-DATE
[2021-11-16T18:37:23.817Z] > Task 
:connect:api:generateMetadataFileForMavenJavaPublication
[2021-11-16T18:37:23.817Z] > Task :connect:json:copyDependantLibs UP-TO-DATE
[2021-11-16T18:37:23.817Z] > Task :connect:json:jar UP-TO-DATE
[2021-11-16T18:37:23.817Z] > Task 
:connect:json:generateMetadataFileForMavenJavaPublication
[2021-11-16T18:37:23.817Z] > Task :connect:api:javadocJar
[2021-11-16T18:37:23.817Z] > Task 
:connect:json:publishMavenJavaPublicationToMavenLocal
[2021-11-16T18:37:23.817Z] > Task :connect:json:publishToMavenLocal
[2021-11-16T18:37:23.817Z] > Task :connect:api:compileTestJava UP-TO-DATE
[2021-11-16T18:37:23.817Z] > Task :connect:api:testClasses UP-TO-DATE
[2021-11-16T18:37:23.817Z] > Task :connect:api:testJar
[2021-11-16T18:37:23.817Z] > Task :connect:api:testSrcJar
[2021-11-16T18:37:23.817Z] > Task 
:connect:api:publishMavenJavaPublicationToMavenLocal
[2021-11-16T18:37:23.817Z] > Task :connect:api:publishToMavenLocal
[2021-11-16T18:37:26.142Z] > Task :streams:javadoc
[2021-11-16T18:37:26.142Z] > Task :streams:javadocJar
[2021-11-16T18:37:27.080Z] > Task :streams:compileTestJava UP-TO-DATE
[2021-11-16T18:37:27.080Z] > Task :streams:testClasses UP-TO-DATE
[2021-11-16T18:37:27.080Z] > Task :streams:testJar
[2021-11-16T18:37:27.080Z] > Task :streams:testSrcJar
[2021-11-16T18:37:27.080Z] > Task 
:streams:publishMavenJavaPublicationToMavenLocal
[2021-11-16T18:37:27.080Z] > Task :streams:publishToMavenLocal
[2021-11-16T18:37:28.019Z] > Task :clients:javadoc
[2021-11-16T18:37:28.019Z] > Task :clients:javadocJar
[2021-11-16T18:37:28.958Z] 
[2021-11-16T18:37:28.958Z] > Task :clients:srcJar
[2021-11-16T18:37:28.958Z] Execution optimizations have been disabled for task 
':clients:srcJar' to ensure correctness due to the following reasons:
[2021-11-16T18:37:28.959Z]   - Gradle detected a problem with the following 
location: 
'/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.1@2/clients/src/generated/java'.
 Reason: Task ':clients:srcJar' uses this output of task 
':clients:processMessages' without declaring an explicit or implicit 
dependency. This can lead to incorrect results being produced, depending on 
what order the tasks are executed. Please refer to 
https://docs.gradle.org/7.2/userguide/validation_problems.html#implicit_dependency
 for more details about this problem.
[2021-11-16T18:37:29.899Z] 
[2021-11-16T18:37:29.899Z] > Task :clients:testJar
[2021-11-16T18:37:29.899Z] > Task 

Build failed in Jenkins: Kafka » Kafka Branch Builder » 2.8 #91

2021-11-16 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 466165 lines...]
[Pipeline] // timestamps
[Pipeline] }
[Pipeline] // timeout
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[2021-11-16T18:25:25.890Z] 
[2021-11-16T18:25:25.890Z] TransactionsTest > testFencingOnCommit() PASSED
[2021-11-16T18:25:25.890Z] 
[2021-11-16T18:25:25.890Z] TransactionsTest > testAbortTransactionTimeout() 
STARTED
[2021-11-16T18:25:35.813Z] 
[2021-11-16T18:25:35.813Z] TransactionsTest > testAbortTransactionTimeout() 
PASSED
[2021-11-16T18:25:35.813Z] 
[2021-11-16T18:25:35.813Z] TransactionsTest > testMultipleMarkersOneLeader() 
STARTED
[2021-11-16T18:25:44.329Z] 
[2021-11-16T18:25:44.329Z] TransactionsTest > testMultipleMarkersOneLeader() 
PASSED
[2021-11-16T18:25:44.329Z] 
[2021-11-16T18:25:44.329Z] TransactionsTest > testCommitTransactionTimeout() 
STARTED
[2021-11-16T18:25:58.340Z] 
[2021-11-16T18:25:58.340Z] TransactionsTest > testCommitTransactionTimeout() 
PASSED
[2021-11-16T18:25:58.340Z] 
[2021-11-16T18:25:58.340Z] SaslClientsWithInvalidCredentialsTest > 
testTransactionalProducerWithAuthenticationFailure() STARTED
[2021-11-16T18:25:59.305Z] 
[2021-11-16T18:25:59.305Z] SaslClientsWithInvalidCredentialsTest > 
testTransactionalProducerWithAuthenticationFailure() PASSED
[2021-11-16T18:25:59.305Z] 
[2021-11-16T18:25:59.305Z] SaslClientsWithInvalidCredentialsTest > 
testManualAssignmentConsumerWithAuthenticationFailure() STARTED
[2021-11-16T18:26:02.872Z] 
[2021-11-16T18:26:02.872Z] SaslClientsWithInvalidCredentialsTest > 
testManualAssignmentConsumerWithAuthenticationFailure() PASSED
[2021-11-16T18:26:02.872Z] 
[2021-11-16T18:26:02.872Z] SaslClientsWithInvalidCredentialsTest > 
testConsumerGroupServiceWithAuthenticationSuccess() STARTED
[2021-11-16T18:26:06.607Z] 
[2021-11-16T18:26:06.607Z] SaslClientsWithInvalidCredentialsTest > 
testConsumerGroupServiceWithAuthenticationSuccess() PASSED
[2021-11-16T18:26:06.607Z] 
[2021-11-16T18:26:06.607Z] SaslClientsWithInvalidCredentialsTest > 
testProducerWithAuthenticationFailure() STARTED
[2021-11-16T18:26:09.229Z] 
[2021-11-16T18:26:09.229Z] SaslClientsWithInvalidCredentialsTest > 
testProducerWithAuthenticationFailure() PASSED
[2021-11-16T18:26:09.229Z] 
[2021-11-16T18:26:09.229Z] SaslClientsWithInvalidCredentialsTest > 
testConsumerGroupServiceWithAuthenticationFailure() STARTED
[2021-11-16T18:26:12.801Z] 
[2021-11-16T18:26:12.801Z] SaslClientsWithInvalidCredentialsTest > 
testConsumerGroupServiceWithAuthenticationFailure() PASSED
[2021-11-16T18:26:12.801Z] 
[2021-11-16T18:26:12.801Z] SaslClientsWithInvalidCredentialsTest > 
testManualAssignmentConsumerWithAutoCommitDisabledWithAuthenticationFailure() 
STARTED
[2021-11-16T18:26:16.360Z] 
[2021-11-16T18:26:16.360Z] SaslClientsWithInvalidCredentialsTest > 
testManualAssignmentConsumerWithAutoCommitDisabledWithAuthenticationFailure() 
PASSED
[2021-11-16T18:26:16.360Z] 
[2021-11-16T18:26:16.360Z] SaslClientsWithInvalidCredentialsTest > 
testKafkaAdminClientWithAuthenticationFailure() STARTED
[2021-11-16T18:26:20.030Z] 
[2021-11-16T18:26:20.030Z] SaslClientsWithInvalidCredentialsTest > 
testKafkaAdminClientWithAuthenticationFailure() PASSED
[2021-11-16T18:26:20.030Z] 
[2021-11-16T18:26:20.030Z] SaslClientsWithInvalidCredentialsTest > 
testConsumerWithAuthenticationFailure() STARTED
[2021-11-16T18:26:23.597Z] 
[2021-11-16T18:26:23.597Z] SaslClientsWithInvalidCredentialsTest > 
testConsumerWithAuthenticationFailure() PASSED
[2021-11-16T18:26:23.597Z] 
[2021-11-16T18:26:23.597Z] UserClientIdQuotaTest > 
testProducerConsumerOverrideLowerQuota() STARTED
[2021-11-16T18:26:30.759Z] 
[2021-11-16T18:26:30.759Z] UserClientIdQuotaTest > 
testProducerConsumerOverrideLowerQuota() PASSED
[2021-11-16T18:26:30.759Z] 
[2021-11-16T18:26:30.759Z] UserClientIdQuotaTest > 
testProducerConsumerOverrideUnthrottled() STARTED
[2021-11-16T18:26:37.992Z] 
[2021-11-16T18:26:37.992Z] UserClientIdQuotaTest > 
testProducerConsumerOverrideUnthrottled() PASSED
[2021-11-16T18:26:37.992Z] 
[2021-11-16T18:26:37.992Z] UserClientIdQuotaTest > 
testThrottledProducerConsumer() STARTED
[2021-11-16T18:26:56.985Z] 
[2021-11-16T18:26:56.985Z] UserClientIdQuotaTest > 
testThrottledProducerConsumer() PASSED
[2021-11-16T18:26:56.985Z] 
[2021-11-16T18:26:56.985Z] UserClientIdQuotaTest > testQuotaOverrideDelete() 
STARTED
[2021-11-16T18:27:27.510Z] 
[2021-11-16T18:27:27.510Z] UserClientIdQuotaTest > testQuotaOverrideDelete() 
PASSED
[2021-11-16T18:27:27.510Z] 
[2021-11-16T18:27:27.510Z] UserClientIdQuotaTest > testThrottledRequest() 
STARTED
[2021-11-16T18:27:31.077Z] 
[2021-11-16T18:27:31.077Z] UserClientIdQuotaTest > testThrottledRequest() PASSED
[2021-11-16T18:27:31.077Z] 
[2021-11-16T18:27:31.077Z] ZooKeeperClientTest > 
testZNodeChangeHandlerForDataChange() STARTED
[2021-11-16T18:27:32.144Z] 
[2021-11-16T18:27:32.144Z] ZooKeeperClientTest > 
testZNodeChangeHandlerForDataChange() 

[jira] [Resolved] (KAFKA-13426) Add recordMetadata to StateStoreContext

2021-11-16 Thread John Roesler (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler resolved KAFKA-13426.
--
Resolution: Fixed

> Add recordMetadata to StateStoreContext
> ---
>
> Key: KAFKA-13426
> URL: https://issues.apache.org/jira/browse/KAFKA-13426
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Patrick Stuedi
>Assignee: Patrick Stuedi
>Priority: Minor
>  Labels: kip
>
> KIP-791: 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-791%3A+Add+Record+Metadata+to+State+Store+Context]
> In order for state stores to provide stronger consistency in the future 
> (e.g., RYW consistency) they need to be able to collect record metadata 
> (e.g., offset information).
> Today, we already make record metadata available in the 
> AbstractProcessContext (recordMetadata()), but the call is not currently 
> exposed through the StateStoreContext interface that is used by the state 
> store. 
> The task of this ticket is to expose recordMetadata in the StateStoreContext. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [VOTE] KIP-779: Allow Source Tasks to Handle Producer Exceptions

2021-11-16 Thread Arjun Satish
+1 (non-binding). Thanks for the KIP, Knowles! and appreciate the
follow-ups!

On Thu, Nov 11, 2021 at 2:55 PM John Roesler  wrote:

> Thanks, Knowles!
>
> I'm +1 (binding)
>
> -John
>
> On Wed, 2021-11-10 at 12:42 -0500, Christopher Shannon
> wrote:
> > +1 (non-binding). This looks good to me and will be useful as a way to
> > handle producer errors.
> >
> > On Mon, Nov 8, 2021 at 8:55 AM Knowles Atchison Jr <
> katchiso...@gmail.com>
> > wrote:
> >
> > > Good morning,
> > >
> > > I'd like to start a vote for KIP-779: Allow Source Tasks to Handle
> Producer
> > > Exceptions:
> > >
> > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-779%3A+Allow+Source+Tasks+to+Handle+Producer+Exceptions
> > >
> > > The purpose of this KIP is to allow Source Tasks the option to "ignore"
> > > kafka producer exceptions. After a few iterations, this is now part of
> the
> > > "errors.tolerance" configuration and provides a null RecordMetadata to
> > > commitRecord() in lieu of a new SourceTask interface method or worker
> > > configuration item.
> > >
> > > PR is here:
> > >
> > > https://github.com/apache/kafka/pull/11382
> > >
> > > Any comments and feedback are welcome.
> > >
> > > Knowles
> > >
>
>
>


Re: [DISCUSS] KIP-779: Allow Source Tasks to Handle Producer Exceptions

2021-11-16 Thread Arjun Satish
One more nit: the RetryWithToleranceOperator class is not a public
interface. So we do not have to call the changes in them out in the Public
Interfaces section.


On Tue, Nov 16, 2021 at 10:42 AM Arjun Satish 
wrote:

> Chris' point about upgrades is valid. An existing configuration will now
> have additional behavior. We should clearly call this out in the kip, and
> whenever they are prepared -- the release notes. It's a bit crummy when
> upgrading, but I do think it's better than introducing a new configuration
> in the long term.
>
> On Mon, Nov 15, 2021 at 2:52 PM Knowles Atchison Jr 
> wrote:
>
>> Chris,
>>
>> Thank you for the feedback. I can certainly update the KIP to state that
>> once exactly-once support is in place, the task would be failed even if
>> errors.tolerance were set to all. Programmatically it would still require
>> PRs to be merged to build on top of. I also liked my original
>> implementation of the hook as it gave the connector writers the most
>> flexibility in handling producer errors. I changed the original
>> implementation as the progression/changes still supported my use case and
>> I
>> thought it would move this process along faster.
>>
>> Knowles
>>
>> On Thu, Nov 11, 2021 at 3:43 PM Chris Egerton > >
>> wrote:
>>
>> > Hi Knowles,
>> >
>> > I think this looks good for the most part but I'd still like to see an
>> > explicit mention in the KIP (and proposed doc/Javadoc changes) that
>> states
>> > that, with exactly-once support enabled, producer exceptions that result
>> > from failures related to exactly-once support (including but not
>> limited to
>> > ProducerFencedException instances (
>> >
>> >
>> https://kafka.apache.org/30/javadoc/org/apache/kafka/common/errors/ProducerFencedException.html
>> > ))
>> > will not be skipped even with "errors.tolerance" set to "all", and will
>> > instead unconditionally cause the task to fail. Your proposal that
>> > "WorkerSourceTask could check the configuration before handing off the
>> > records and exception to this function" seems great as long as we update
>> > "handing off the records and exceptions to this function" to the
>> > newly-proposed behavior of "logging the exception and continuing to poll
>> > the task for data".
>> >
>> > I'm also a little bit wary of updating the existing "errors.tolerance"
>> > configuration to have new behavior that users can't opt out of without
>> also
>> > opting out of the current behavior they get with "errors.tolerance" set
>> to
>> > "all", but I think I've found a decent argument in favor of it. One
>> thought
>> > that came to mind is whether this use case was originally considered
>> when
>> > KIP-298 was being discussed. However, it appears that KAFKA-8586 (
>> > https://issues.apache.org/jira/browse/KAFKA-8586), the fix for which
>> > caused
>> > tasks to fail on non-retriable, asynchronous producer exceptions
>> instead of
>> > logging them and continuing, was discovered over a full year after the
>> > changes for KIP-298 (https://github.com/apache/kafka/pull/5065) were
>> > merged. I suspect that the current proposal aligns nicely with the
>> original
>> > design intent of KIP-298, and that if KAFKA-8586 were discovered before
>> or
>> > during discussion for KIP-298, non-retriable, asynchronous producer
>> > exceptions would have been included in its scope. With that in mind,
>> > although it may cause issues for some niche use cases, I think that
>> this is
>> > a valid change and would be worth the tradeoff of potentially
>> complicating
>> > life for a small number of users. I'd be interested in Arjun's thoughts
>> on
>> > this though (as he designed and implemented KIP-298), and if this
>> analysis
>> > is agreeable, we may want to document that information in the KIP as
>> well
>> > to strengthen our case for not introducing a new configuration property
>> and
>> > instead making this behavior tied to the existing "errors.tolerance"
>> > property with no opt-out besides using a new value for that property.
>> >
>> > My last thought is that, although it may be outside the scope of this
>> KIP,
>> > I believe your original proposal of giving tasks a hook to handle
>> > downstream exceptions is actually quite valid. The DLQ feature for sink
>> > connectors is an extremely valuable one as it prevents data loss when
>> > "errors.tolerance" is set to "all" by allowing users to reprocess
>> > problematic records at a later date without stopping the flow of data in
>> > their connector entirely. As others have noted, it's difficult if not
>> > outright impossible to provide a Kafka DLQ topic for source connectors
>> with
>> > the same guarantees, and so allowing source connectors the option of
>> > storing problematic records back in the system that they came from seems
>> > like a reasonable alternative. I think we're probably past the point of
>> > making that happen in this KIP, but I don't believe the changes you've
>> > proposed make that any harder in the future than it is 

Re: [DISCUSS] KIP-779: Allow Source Tasks to Handle Producer Exceptions

2021-11-16 Thread Arjun Satish
Chris' point about upgrades is valid. An existing configuration will now
have additional behavior. We should clearly call this out in the kip, and
whenever they are prepared -- the release notes. It's a bit crummy when
upgrading, but I do think it's better than introducing a new configuration
in the long term.

On Mon, Nov 15, 2021 at 2:52 PM Knowles Atchison Jr 
wrote:

> Chris,
>
> Thank you for the feedback. I can certainly update the KIP to state that
> once exactly-once support is in place, the task would be failed even if
> errors.tolerance were set to all. Programmatically it would still require
> PRs to be merged to build on top of. I also liked my original
> implementation of the hook as it gave the connector writers the most
> flexibility in handling producer errors. I changed the original
> implementation as the progression/changes still supported my use case and I
> thought it would move this process along faster.
>
> Knowles
>
> On Thu, Nov 11, 2021 at 3:43 PM Chris Egerton  >
> wrote:
>
> > Hi Knowles,
> >
> > I think this looks good for the most part but I'd still like to see an
> > explicit mention in the KIP (and proposed doc/Javadoc changes) that
> states
> > that, with exactly-once support enabled, producer exceptions that result
> > from failures related to exactly-once support (including but not limited
> to
> > ProducerFencedException instances (
> >
> >
> https://kafka.apache.org/30/javadoc/org/apache/kafka/common/errors/ProducerFencedException.html
> > ))
> > will not be skipped even with "errors.tolerance" set to "all", and will
> > instead unconditionally cause the task to fail. Your proposal that
> > "WorkerSourceTask could check the configuration before handing off the
> > records and exception to this function" seems great as long as we update
> > "handing off the records and exceptions to this function" to the
> > newly-proposed behavior of "logging the exception and continuing to poll
> > the task for data".
> >
> > I'm also a little bit wary of updating the existing "errors.tolerance"
> > configuration to have new behavior that users can't opt out of without
> also
> > opting out of the current behavior they get with "errors.tolerance" set
> to
> > "all", but I think I've found a decent argument in favor of it. One
> thought
> > that came to mind is whether this use case was originally considered when
> > KIP-298 was being discussed. However, it appears that KAFKA-8586 (
> > https://issues.apache.org/jira/browse/KAFKA-8586), the fix for which
> > caused
> > tasks to fail on non-retriable, asynchronous producer exceptions instead
> of
> > logging them and continuing, was discovered over a full year after the
> > changes for KIP-298 (https://github.com/apache/kafka/pull/5065) were
> > merged. I suspect that the current proposal aligns nicely with the
> original
> > design intent of KIP-298, and that if KAFKA-8586 were discovered before
> or
> > during discussion for KIP-298, non-retriable, asynchronous producer
> > exceptions would have been included in its scope. With that in mind,
> > although it may cause issues for some niche use cases, I think that this
> is
> > a valid change and would be worth the tradeoff of potentially
> complicating
> > life for a small number of users. I'd be interested in Arjun's thoughts
> on
> > this though (as he designed and implemented KIP-298), and if this
> analysis
> > is agreeable, we may want to document that information in the KIP as well
> > to strengthen our case for not introducing a new configuration property
> and
> > instead making this behavior tied to the existing "errors.tolerance"
> > property with no opt-out besides using a new value for that property.
> >
> > My last thought is that, although it may be outside the scope of this
> KIP,
> > I believe your original proposal of giving tasks a hook to handle
> > downstream exceptions is actually quite valid. The DLQ feature for sink
> > connectors is an extremely valuable one as it prevents data loss when
> > "errors.tolerance" is set to "all" by allowing users to reprocess
> > problematic records at a later date without stopping the flow of data in
> > their connector entirely. As others have noted, it's difficult if not
> > outright impossible to provide a Kafka DLQ topic for source connectors
> with
> > the same guarantees, and so allowing source connectors the option of
> > storing problematic records back in the system that they came from seems
> > like a reasonable alternative. I think we're probably past the point of
> > making that happen in this KIP, but I don't believe the changes you've
> > proposed make that any harder in the future than it is now (which is
> > great!), and I wanted to voice my general support for a mechanism like
> this
> > in case you or someone following along think it'd be worth it to pursue
> at
> > a later date.
> >
> > Thanks for your KIP and thanks for your patience with the process!
> >
> > Cheers,
> >
> > Chris
> >
> > On Fri, Nov 5, 2021 at 

Re: KIP-769: Connect API to retrieve connector configuration definitions

2021-11-16 Thread Chris Egerton
Hi Viktor,

It sounds like there are three major points here in favor of a new GET
endpoint for connector config defs.

1. You cannot issue a blank ("dummy") request for sink connectors because a
topic list/topic regex has to be supplied (otherwise the PUT endpoint
returns a 500 response)
2. A dummy request still triggers custom validations by the connector,
which may be best to avoid if we know for sure that the config isn't worth
validating yet
3. It's more ergonomic and intuitive to be able to issue a GET request
without having to give a dummy connector config

With regards to 1, this is actually a bug in Connect (
https://issues.apache.org/jira/browse/KAFKA-13327) with a fix already
implemented and awaiting committer review (
https://github.com/apache/kafka/pull/11369). I think it'd be better to
focus on fixing this bug in general instead of implementing a new REST
endpoint in order to allow people to work around it.

With regards to 2, this is technically possible but I'm unsure it'd be too
common out in the wild given that most validations that could be expensive
would involve things like connecting to a database, checking if a cloud
storage bucket exists, etc., none of which are possible without some
configuration properties from the user (db hostname, bucket name, etc.).

With regards to 3, I do agree that it'd be easier for people designing UIs
to have a GET API to work against. I'm just not sure it's worth the
additional implementation, testing, and maintenance burden. If it were
possible to issue a PUT request without unexpected 500s for invalid
configs, would that suffice? AFAICT it'd basically be as simple as issuing
a PUT request with a dummy body consisting of nothing except the connector
class (which at this point we might even make unnecessary and just
automatically replace with the connector class from the URL) and then
filtering the response to just grab the "definition" field of each element
in the "configs" array in the response.
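
A rough sketch of that workaround against a locally running worker (host, port, 
and connector class are placeholders, and JSON parsing of the response is left 
out for brevity):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // PUT a minimal body to the existing validate endpoint and read the
    // "definition" entries out of the "configs" array in the response.
    public class FetchConfigDefs {
        public static void main(String[] args) throws Exception {
            String connectorClass = "org.apache.kafka.connect.file.FileStreamSourceConnector";
            String body = "{\"connector.class\": \"" + connectorClass + "\"}";

            HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8083/connector-plugins/"
                        + connectorClass + "/config/validate"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();

            HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());

            // The response JSON contains a "configs" array; each element's
            // "definition" object carries the ConfigDef metadata (name, type,
            // default value, importance, documentation, ...).
            System.out.println(response.body());
        }
    }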

Cheers,

Chris

On Tue, Nov 16, 2021 at 9:52 AM Viktor Somogyi-Vass 
wrote:

> Hi Folks,
>
> I too think this would be a very useful feature. Some of our management
> applications would provide a wizard for creating connectors. In this
> scenario the user basically would fill out a sample configuration generated
> by the UI which would send it back to Connect for validation and eventually
> create a new connector. The first part of this workflow can be enhanced if
> we had an API that can return the configuration definition of the given
> type of connector as the UI application would be able to generate a sample
> for the user based on that (nicely drawn diagram:
> https://imgur.com/a/7S1Xwm5).
> The connector-plugins/{connectorType}/config/validate API essentially works
> and returns the data that we need, however it is a HTTP PUT API that is a
> bit unintuitive for a fetch-like functionality and also functionally
> different as it validates the given (dummy) request. In case of sink
> connectors one would need to also provide a topic name.
>
> A suggestion for the KIP: I think it can be useful to return the config
> groups and the connector class' name similarly to the validate API just in
> case any frontend needs them (and also the response would be more like the
> validate API but simpler).
>
> Viktor
>
> On Fri, Aug 20, 2021 at 4:51 PM Ryanne Dolan 
> wrote:
>
> > I think it'd be worth adding a GET version, fwiw. Could be the same
> handler
> > with just a different spelling maybe.
> >
> > On Fri, Aug 20, 2021, 7:44 AM Mickael Maison 
> > wrote:
> >
> > > Hi Chris,
> > >
> > > You're right, you can achieve the same functionality using the
> > > existing validate endpoint.
> > > In my mind it was only for validation once you have build a
> > > configuration but when used with an empty configuration, it basically
> > > serves the same purpose as the proposed new endpoint.
> > >
> > > I think it's a bit easier to use a GET endpoint but I don't think it
> > > really warrants a different endpoint.
> > >
> > > Thanks
> > >
> > > On Thu, Aug 19, 2021 at 2:56 PM Chris Egerton
> > >  wrote:
> > > >
> > > > Hi Mickael,
> > > >
> > > > I'm wondering about the use case here. The motivation section states
> > that
> > > > "Connect does not provide a way to see what configurations a
> connector
> > > > requires. Instead users have to go look at the connector
> documentation
> > or
> > > > in the worst case, look directly at the connector source code.", and
> > that
> > > > with this KIP, "users will be able to discover the required
> > > configurations
> > > > for connectors installed in a Connect cluster" and "tools will be
> able
> > to
> > > > generate wizards for configuring and starting connectors".
> > > >
> > > > Does the existing "PUT
> > > /connector-plugins/{connector-type}/config/validate"
> > > > endpoint not address these points? What will the newly-proposed
> > endpoint
> > > > allow users to do that they will not already be able to do with the
> > > > 

Re: KIP-769: Connect API to retrieve connector configuration definitions

2021-11-16 Thread Viktor Somogyi-Vass
Hi Folks,

I too think this would be a very useful feature. Some of our management
applications would provide a wizard for creating connectors. In this
scenario the user basically would fill out a sample configuration generated
by the UI which would send it back to Connect for validation and eventually
create a new connector. The first part of this workflow can be enhanced if
we had an API that can return the configuration definition of the given
type of connector as the UI application would be able to generate a sample
for the user based on that (nicely drawn diagram:
https://imgur.com/a/7S1Xwm5).
The connector-plugins/{connectorType}/config/validate API essentially works
and returns the data that we need; however, it is an HTTP PUT API that is a
bit unintuitive for a fetch-like functionality and also functionally
different as it validates the given (dummy) request. In case of sink
connectors one would need to also provide a topic name.

A suggestion for the KIP: I think it can be useful to return the config
groups and the connector class' name similarly to the validate API just in
case any frontend needs them (and also the response would be more like the
validate API but simpler).

Viktor

On Fri, Aug 20, 2021 at 4:51 PM Ryanne Dolan  wrote:

> I think it'd be worth adding a GET version, fwiw. Could be the same handler
> with just a different spelling maybe.
>
> On Fri, Aug 20, 2021, 7:44 AM Mickael Maison 
> wrote:
>
> > Hi Chris,
> >
> > You're right, you can achieve the same functionality using the
> > existing validate endpoint.
> > In my mind it was only for validation once you have build a
> > configuration but when used with an empty configuration, it basically
> > serves the same purpose as the proposed new endpoint.
> >
> > I think it's a bit easier to use a GET endpoint but I don't think it
> > really warrants a different endpoint.
> >
> > Thanks
> >
> > On Thu, Aug 19, 2021 at 2:56 PM Chris Egerton
> >  wrote:
> > >
> > > Hi Mickael,
> > >
> > > I'm wondering about the use case here. The motivation section states
> that
> > > "Connect does not provide a way to see what configurations a connector
> > > requires. Instead users have to go look at the connector documentation
> or
> > > in the worst case, look directly at the connector source code.", and
> that
> > > with this KIP, "users will be able to discover the required
> > configurations
> > > for connectors installed in a Connect cluster" and "tools will be able
> to
> > > generate wizards for configuring and starting connectors".
> > >
> > > Does the existing "PUT
> > /connector-plugins/{connector-type}/config/validate"
> > > endpoint not address these points? What will the newly-proposed
> endpoint
> > > allow users to do that they will not already be able to do with the
> > > existing endpoint?
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Thu, Aug 19, 2021 at 9:20 AM Mickael Maison <
> mickael.mai...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I've created KIP-769 to expose connector configuration definitions in
> > > > the Connect API
> > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-769%3A+Connect+API+to+retrieve+connector+configuration+definitions
> > > >
> > > > Please take a look and let me know if you have any feedback.
> > > >
> > > > Thanks
> > > >
> >
>


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread David Arthur
Guozhang,

1. By requiring a majority of all controller nodes to support the version
selected by the leader, we increase the likelihood that the next leader
will also support it. We can't guarantee that all nodes definitely support
the selected metadata.version because there could always be an offline or
partitioned peer that is running old software.

If a controller running old software manages to get elected, we hit this case:

 In the unlikely event that an active controller encounters an unsupported
> metadata.version, it should resign and terminate.


So, given this, we should be able to eventually elect a controller that
does support the metadata.version.


Consider controllers C1, C2, C3 with this arrangement:

Node  SoftwareVer  MaxMetadataVer
C1    3.2          1
C2    3.3          4
C3    3.3          4

If the current metadata.version is 1 and we're trying to upgrade to 4, we
would allow it since two of the three nodes support it. If any one
controller is down while we are attempting an upgrade, we would require
that both of the remaining alive nodes support the target metadata.version
(since we require a majority of _all_ controller nodes, not just alive
ones).
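
A compact sketch of that rule, purely for illustration (none of this is KIP or 
Kafka code): an upgrade to a target metadata.version is allowed only if a 
majority of all configured controllers, reachable or not, advertise support 
for it.

    import java.util.HashMap;
    import java.util.Map;

    final class MetadataVersionUpgradeCheck {

        // maxSupported maps every configured controller id to the highest
        // metadata.version it advertises via ApiVersions, or null if unknown
        // (e.g. the node is offline or partitioned).
        static boolean canUpgrade(Map<Integer, Integer> maxSupported, int targetVersion) {
            int totalControllers = maxSupported.size();
            long supporters = maxSupported.values().stream()
                .filter(max -> max != null && max >= targetVersion)
                .count();
            return supporters > totalControllers / 2;
        }

        public static void main(String[] args) {
            Map<Integer, Integer> quorum = new HashMap<>();
            quorum.put(1, 1); // C1: old software, supports up to 1
            quorum.put(2, 4); // C2
            quorum.put(3, 4); // C3
            System.out.println(canUpgrade(quorum, 4)); // true: 2 of 3 support version 4
        }
    }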

An interesting case here is how to handle a version update if a majority of
the quorum supports it, but the leader doesn't. For example, suppose C1 was the
leader and an upgrade to version 4 was requested; this might trigger
C1 to resign and inform the client to retry the update later.

We may eventually want to consider the metadata.version when electing a
leader, but as long as we have the majority requirement before committing a
new metadata.version, I think we should be safe.

-David

On Mon, Nov 15, 2021 at 12:52 PM Guozhang Wang  wrote:

> Thanks David,
>
> 1. Got it. One thing I'm still not very clear on is why it's sufficient to
> select a metadata.version which is supported by majority of the quorum, but
> not the whole quorum (i.e. choosing the lowest version among all the quorum
> members)? Since the leader election today does not take this value into
> consideration, we are not guaranteed that newly selected leaders would
> always be able to recognize and support the initialized metadata.version,
> right?
>
> 2. Yeah I think I agree the behavior-but-not-RPC-change scenario is beyond
> the scope of this KIP, we can defer it to later discussions.
>
> On Mon, Nov 15, 2021 at 8:13 AM David Arthur
>  wrote:
>
> > Guozhang, thanks for the review!
> >
> > 1, As we've defined it so far, the initial metadata.version is set by an
> > operator via the "kafka-storage.sh" tool. It would be possible for
> > different values to be selected, but only the quorum leader would commit
> a
> > FeatureLevelRecord with the version they read locally. See the above
> reply
> > to Jun's question for a little more detail.
> >
> > We need to enable the KRaft RPCs regardless of metadata.version (vote,
> > heartbeat, fetch, etc) so that the quorum can be formed, a leader can be
> > elected, and followers can learn about the selected metadata.version. I
> > believe the quorum startup procedure would go something like:
> >
> > * Controller nodes start, read their logs, begin leader election
> > * While the elected node is becoming leader (in
> > QuorumMetaLogListener#handleLeaderChange), initialize metadata.version if
> > necessary
> > * Followers replicate the FeatureLevelRecord
> >
> > This (probably) means the quorum peers must continue to rely on
> ApiVersions
> > to negotiate compatible RPC versions and these versions cannot depend on
> > metadata.version.
> >
> > Does this make sense?
> >
> >
> > 2, ApiVersionResponse includes the broker's supported feature flags as
> well
> > as the cluster-wide finalized feature flags. We probably need to add
> > something like the feature flag "epoch" to this response payload in order
> > to see which broker is most up to date.
> >
> > If the new feature flag version included RPC changes, we are helped by
> the
> > fact that a client won't attempt to use the new RPC until it has
> discovered
> > a broker that supports it via ApiVersions. The problem is more difficult
> > for cases like you described where the feature flag upgrade changes the
> > behavior of the broker, but not its RPCs. This is actually the same
> problem
> > as upgrading the IBP. During a rolling restart, clients may hit different
> > brokers with different capabilities and not know it.
> >
> > This probably warrants further investigation, but hopefully you agree it
> is
> > beyond the scope of this KIP :)
> >
> > -David
> >
> >
> > On Mon, Nov 15, 2021 at 10:26 AM David Arthur  >
> > wrote:
> >
> > > Jun, thanks for the comments!
> > >
> > > 16, When a new cluster is deployed, we don't select the highest
> available
> > > metadata.version, but rather the quorum leader picks a bootstrap
> version
> > > defined in meta.properties. As mentioned earlier, we should add
> > validation
> > > here to ensure a majority of the followers could support this version
> > > 

Re: [DISCUSS] KIP-797 Accept duplicate listener on port for IPv4/IPv6

2021-11-16 Thread Matthew de Detrich
Since no one has commented on either this thread or the original one, I will
call a vote by the end of this week.

Regards

On Wed, Nov 10, 2021 at 5:28 PM Matthew de Detrich <
matthew.dedetr...@aiven.io> wrote:

> Hello everyone,
>
> I would like to start a discussion for KIP-797 which is about allowing
> duplicate listeners on the same port in the specific case where one host is
> an IPv4 address and the other host is an IPv6 address.
>
> The proposal is here
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726330
>
> Regards
> --
>
> Matthew de Detrich
>
> *Aiven Deutschland GmbH*
>
> Immanuelkirchstraße 26, 10405 Berlin
>
> Amtsgericht Charlottenburg, HRB 209739 B
>
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
>
> *m:* +491603708037
>
> *w:* aiven.io *e:* matthew.dedetr...@aiven.io
>


-- 

Matthew de Detrich

*Aiven Deutschland GmbH*

Immanuelkirchstraße 26, 10405 Berlin

Amtsgericht Charlottenburg, HRB 209739 B

Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen

*m:* +491603708037

*w:* aiven.io *e:* matthew.dedetr...@aiven.io
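
For context, the kind of configuration KIP-797 would permit looks roughly like
the following (listener names and addresses are illustrative; today the broker
rejects two listeners that share a port):

listeners=PLAINTEXT_V4://192.168.1.10:9092,PLAINTEXT_V6://[2001:db8::10]:9092
listener.security.protocol.map=PLAINTEXT_V4:PLAINTEXT,PLAINTEXT_V6:PLAINTEXT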


[Vote] KIP-787 - MM2 Interface to manage Kafka resources Kafka/KIPs

2021-11-16 Thread Omnia Ibrahim
Hi,
I'd like to start a vote on KIP-787
https://cwiki.apache.org/confluence/display/KAFKA/KIP-787%3A+MM2+Interface+to+manage+Kafka+resources


Thanks
Omnia


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #573

2021-11-16 Thread Apache Jenkins Server
See 




[jira] [Resolved] (KAFKA-13449) Comment optimization for parameter log.cleaner.delete.retention.ms

2021-11-16 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-13449.
-
Fix Version/s: 3.2.0
   Resolution: Fixed

> Comment optimization for parameter log.cleaner.delete.retention.ms 
> ---
>
> Key: KAFKA-13449
> URL: https://issues.apache.org/jira/browse/KAFKA-13449
> Project: Kafka
>  Issue Type: Improvement
>  Components: config
>Affects Versions: 3.0.0
>Reporter: RivenSun
>Priority: Major
> Fix For: 3.2.0
>
>
> You can view the comment for this parameter on Kafka's official website:
> https://kafka.apache.org/documentation/#brokerconfigs_log.cleaner.delete.retention.ms
> {code:java}
> log.cleaner.delete.retention.ms
> How long are delete records retained? {code}
> I think it should be consistent with the comment of the topic-level parameter
> *delete.retention.ms*.
> https://kafka.apache.org/documentation/#topicconfigs_delete.retention.ms
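
For reference, these are the two settings the issue wants documented
consistently; both control how long delete tombstones are retained for
compacted topics, and the values shown are the documented defaults (24 hours):

# broker-wide default (server.properties)
log.cleaner.delete.retention.ms=86400000
# per-topic override
delete.retention.ms=86400000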



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #572

2021-11-16 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 499591 lines...]
[2021-11-16T11:54:56.504Z] 
[2021-11-16T11:54:56.504Z] PlaintextConsumerTest > 
testConsumeMessagesWithCreateTime() PASSED
[2021-11-16T11:54:56.504Z] 
[2021-11-16T11:54:56.504Z] PlaintextConsumerTest > testAsyncCommit() STARTED
[2021-11-16T11:55:00.171Z] 
[2021-11-16T11:55:00.171Z] PlaintextConsumerTest > testAsyncCommit() PASSED
[2021-11-16T11:55:00.171Z] 
[2021-11-16T11:55:00.171Z] PlaintextConsumerTest > 
testLowMaxFetchSizeForRequestAndPartition() STARTED
[2021-11-16T11:55:39.015Z] 
[2021-11-16T11:55:39.015Z] PlaintextConsumerTest > 
testLowMaxFetchSizeForRequestAndPartition() PASSED
[2021-11-16T11:55:39.015Z] 
[2021-11-16T11:55:39.015Z] PlaintextConsumerTest > 
testMultiConsumerSessionTimeoutOnStopPolling() STARTED
[2021-11-16T11:55:55.061Z] 
[2021-11-16T11:55:55.061Z] PlaintextConsumerTest > 
testMultiConsumerSessionTimeoutOnStopPolling() PASSED
[2021-11-16T11:55:55.061Z] 
[2021-11-16T11:55:55.061Z] PlaintextConsumerTest > 
testMaxPollIntervalMsDelayInRevocation() STARTED
[2021-11-16T11:56:01.905Z] 
[2021-11-16T11:56:01.905Z] PlaintextConsumerTest > 
testMaxPollIntervalMsDelayInRevocation() PASSED
[2021-11-16T11:56:01.905Z] 
[2021-11-16T11:56:01.905Z] PlaintextConsumerTest > 
testPerPartitionLagMetricsCleanUpWithAssign() STARTED
[2021-11-16T11:56:08.311Z] 
[2021-11-16T11:56:08.311Z] PlaintextConsumerTest > 
testPerPartitionLagMetricsCleanUpWithAssign() PASSED
[2021-11-16T11:56:08.311Z] 
[2021-11-16T11:56:08.311Z] PlaintextConsumerTest > 
testPartitionsForInvalidTopic() STARTED
[2021-11-16T11:56:11.407Z] 
[2021-11-16T11:56:11.407Z] PlaintextConsumerTest > 
testPartitionsForInvalidTopic() PASSED
[2021-11-16T11:56:11.407Z] 
[2021-11-16T11:56:11.407Z] PlaintextConsumerTest > 
testPauseStateNotPreservedByRebalance() STARTED
[2021-11-16T11:56:16.909Z] 
[2021-11-16T11:56:16.909Z] PlaintextConsumerTest > 
testPauseStateNotPreservedByRebalance() PASSED
[2021-11-16T11:56:16.909Z] 
[2021-11-16T11:56:16.909Z] PlaintextConsumerTest > 
testFetchHonoursFetchSizeIfLargeRecordNotFirst() STARTED
[2021-11-16T11:56:22.370Z] 
[2021-11-16T11:56:22.370Z] PlaintextConsumerTest > 
testFetchHonoursFetchSizeIfLargeRecordNotFirst() PASSED
[2021-11-16T11:56:22.370Z] 
[2021-11-16T11:56:22.370Z] PlaintextConsumerTest > testSeek() STARTED
[2021-11-16T11:56:29.489Z] 
[2021-11-16T11:56:29.489Z] PlaintextConsumerTest > testSeek() PASSED
[2021-11-16T11:56:29.489Z] 
[2021-11-16T11:56:29.489Z] PlaintextConsumerTest > 
testConsumingWithNullGroupId() STARTED
[2021-11-16T11:56:38.218Z] 
[2021-11-16T11:56:38.218Z] PlaintextConsumerTest > 
testConsumingWithNullGroupId() PASSED
[2021-11-16T11:56:38.218Z] 
[2021-11-16T11:56:38.218Z] PlaintextConsumerTest > testPositionAndCommit() 
STARTED
[2021-11-16T11:56:43.806Z] 
[2021-11-16T11:56:43.806Z] PlaintextConsumerTest > testPositionAndCommit() 
PASSED
[2021-11-16T11:56:43.807Z] 
[2021-11-16T11:56:43.807Z] PlaintextConsumerTest > 
testFetchRecordLargerThanMaxPartitionFetchBytes() STARTED
[2021-11-16T11:56:49.221Z] 
[2021-11-16T11:56:49.221Z] PlaintextConsumerTest > 
testFetchRecordLargerThanMaxPartitionFetchBytes() PASSED
[2021-11-16T11:56:49.221Z] 
[2021-11-16T11:56:49.221Z] PlaintextConsumerTest > testUnsubscribeTopic() 
STARTED
[2021-11-16T11:56:53.578Z] 
[2021-11-16T11:56:53.578Z] PlaintextConsumerTest > testUnsubscribeTopic() PASSED
[2021-11-16T11:56:53.578Z] 
[2021-11-16T11:56:53.578Z] PlaintextConsumerTest > 
testMultiConsumerSessionTimeoutOnClose() STARTED
[2021-11-16T11:57:05.675Z] 
[2021-11-16T11:57:05.675Z] PlaintextConsumerTest > 
testMultiConsumerSessionTimeoutOnClose() PASSED
[2021-11-16T11:57:05.675Z] 
[2021-11-16T11:57:05.675Z] PlaintextConsumerTest > 
testMultiConsumerStickyAssignor() STARTED
[2021-11-16T11:57:16.497Z] 
[2021-11-16T11:57:16.497Z] PlaintextConsumerTest > 
testMultiConsumerStickyAssignor() PASSED
[2021-11-16T11:57:16.497Z] 
[2021-11-16T11:57:16.497Z] PlaintextConsumerTest > 
testFetchRecordLargerThanFetchMaxBytes() STARTED
[2021-11-16T11:57:20.737Z] 
[2021-11-16T11:57:20.737Z] PlaintextConsumerTest > 
testFetchRecordLargerThanFetchMaxBytes() PASSED
[2021-11-16T11:57:20.737Z] 
[2021-11-16T11:57:20.737Z] PlaintextConsumerTest > testAutoCommitOnClose() 
STARTED
[2021-11-16T11:57:25.597Z] 
[2021-11-16T11:57:25.598Z] PlaintextConsumerTest > testAutoCommitOnClose() 
PASSED
[2021-11-16T11:57:25.598Z] 
[2021-11-16T11:57:25.598Z] PlaintextConsumerTest > testListTopics() STARTED
[2021-11-16T11:57:30.338Z] 
[2021-11-16T11:57:30.338Z] PlaintextConsumerTest > testListTopics() PASSED
[2021-11-16T11:57:30.338Z] 
[2021-11-16T11:57:30.338Z] PlaintextConsumerTest > 
testExpandingTopicSubscriptions() STARTED
[2021-11-16T11:57:37.244Z] 
[2021-11-16T11:57:37.244Z] PlaintextConsumerTest > 
testExpandingTopicSubscriptions() PASSED
[2021-11-16T11:57:37.244Z] 
[2021-11-16T11:57:37.244Z] PlaintextConsumerTest > 

Jenkins build is unstable: Kafka » Kafka Branch Builder » 3.1 #12

2021-11-16 Thread Apache Jenkins Server
See 




Jenkins build is unstable: Kafka » Kafka Branch Builder » 3.0 #155

2021-11-16 Thread Apache Jenkins Server
See