[GitHub] [kafka] yun-yun commented on pull request #11981: KAFKA-13791: Fix FetchResponse#`fetchData` and `forgottenTopics`: Assignment of lazy-initialized members should be the last step with double-checked locking

2022-04-04 Thread GitBox


yun-yun commented on PR #11981:
URL: https://github.com/apache/kafka/pull/11981#issuecomment-1088245743

   Hi @showuon, can this PR be merged?





[GitHub] [kafka] ijuma commented on pull request #11991: KAFKA-13794: Fix comparator of inflightBatchesBySequence in TransactionManager

2022-04-04 Thread GitBox


ijuma commented on PR #11991:
URL: https://github.com/apache/kafka/pull/11991#issuecomment-1088215820

   I agree we should include this in 3.1.1. Do we know if this has always been 
like this or if it regressed at some point?





[GitHub] [kafka] ddrid commented on pull request #11991: KAFKA-13794: Fix comparator of inflightBatchesBySequence in TransactionManager

2022-04-04 Thread GitBox


ddrid commented on PR #11991:
URL: https://github.com/apache/kafka/pull/11991#issuecomment-1088209337

   @hachikuji Thanks for your review! I've addressed your comments; please 
take a look.





[GitHub] [kafka] C0urante commented on pull request #5876: KAFKA-7509: Avoid passing most non-applicable properties to producer, consumer, and admin client

2022-04-04 Thread GitBox


C0urante commented on PR #5876:
URL: https://github.com/apache/kafka/pull/5876#issuecomment-1088179915

   I've taken an alternative approach in 
https://github.com/apache/kafka/pull/11986. If anyone wants to take a look and 
either review the changes or take them for a spin locally, I'd appreciate any 
feedback. Thanks!





[GitHub] [kafka] hachikuji commented on a diff in pull request #11991: KAFKA-13794: Fix comparator of inflightBatchesBySequence in TransactionManager

2022-04-04 Thread GitBox


hachikuji commented on code in PR #11991:
URL: https://github.com/apache/kafka/pull/11991#discussion_r842258345


##
clients/src/test/java/org/apache/kafka/clients/producer/internals/TransactionManagerTest.java:
##
@@ -674,6 +674,61 @@ public void testBatchCompletedAfterProducerReset() {
     assertNull(transactionManager.nextBatchBySequence(tp0));
 }
 
+@Test
+public void testDuplicateSequenceAfterProducerReset() throws Exception {
+    initializeTransactionManager(Optional.empty());
+    initializeIdempotentProducerId(producerId, epoch);
+
+    Metrics metrics = new Metrics(time);
+
+    RecordAccumulator accumulator = new RecordAccumulator(logContext, 16 * 1024, CompressionType.NONE, 0, 0L,
+        15000, metrics, "", time, apiVersions, transactionManager,
+        new BufferPool(1024 * 1024, 16 * 1024, metrics, time, ""));
+
+    Sender sender = new Sender(logContext, this.client, this.metadata, accumulator, false,
+        MAX_REQUEST_SIZE, ACKS_ALL, MAX_RETRIES, new SenderMetricsRegistry(metrics), this.time, 1,
+        0, transactionManager, apiVersions);
+
+    assertEquals(0, transactionManager.sequenceNumber(tp0).intValue());
+
+    Future<RecordMetadata> responseFuture1 = accumulator.append(tp0, time.milliseconds(), "1".getBytes(), "1".getBytes(),
+        Record.EMPTY_HEADERS, null, MAX_BLOCK_TIMEOUT, false, time.milliseconds()).future;
+    sender.runOnce();
+    assertEquals(1, transactionManager.sequenceNumber(tp0).intValue());
+
+    time.sleep(1); // request time out
+    sender.runOnce();
+    assertEquals(0, client.inFlightRequestCount());
+    assertTrue(transactionManager.hasInflightBatches(tp0));
+    assertEquals(1, transactionManager.sequenceNumber(tp0).intValue());
+    sender.runOnce(); // retry
+    assertEquals(1, client.inFlightRequestCount());
+    assertTrue(transactionManager.hasInflightBatches(tp0));
+    assertEquals(1, transactionManager.sequenceNumber(tp0).intValue());
+
+    time.sleep(5000); // delivery time out
+    sender.runOnce(); // expired in accumulator
+    assertFalse(transactionManager.hasInFlightRequest());

Review Comment:
   The behavior here puzzled me a little when I was trying to understand the 
test. Would a comment like this help?
   ```
   // The retried request will remain inflight until the request timeout
   // is reached even though the delivery timeout has expired and the
   // future has completed exceptionally.
   ```






[GitHub] [kafka] hachikuji commented on a diff in pull request #11991: KAFKA-13794: Fix comparator of inflightBatchesBySequence in TransactionManager

2022-04-04 Thread GitBox


hachikuji commented on code in PR #11991:
URL: https://github.com/apache/kafka/pull/11991#discussion_r842248811


##
clients/src/main/java/org/apache/kafka/clients/producer/internals/TransactionManager.java:
##
@@ -184,16 +184,22 @@ private void startSequencesAtBeginning(TopicPartition topicPartition, ProducerId
 // responses which are due to the retention period elapsing, and those which are due to actual lost data.
 private long lastAckedOffset;
 
+private final Comparator<ProducerBatch> producerBatchComparator = (b1, b2) -> {

Review Comment:
   nit: could this be static? It doesn't have any state.
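   For reference, a stateless comparator like this can be hoisted into a
   static constant. A minimal sketch, assuming the comparison orders batches
   by base sequence (the hunk above cuts off before the lambda body):
   ```java
   // one shared instance; the lambda captures no per-instance state
   private static final Comparator<ProducerBatch> PRODUCER_BATCH_COMPARATOR =
       (b1, b2) -> Integer.compare(b1.baseSequence(), b2.baseSequence());
   ```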



##
clients/src/test/java/org/apache/kafka/clients/producer/internals/TransactionManagerTest.java:
##
@@ -674,6 +674,61 @@ public void testBatchCompletedAfterProducerReset() {
     assertNull(transactionManager.nextBatchBySequence(tp0));
 }
 
+@Test
+public void testDuplicateSequenceAfterProducerReset() throws Exception {
+    initializeTransactionManager(Optional.empty());
+    initializeIdempotentProducerId(producerId, epoch);
+
+    Metrics metrics = new Metrics(time);
+
+    RecordAccumulator accumulator = new RecordAccumulator(logContext, 16 * 1024, CompressionType.NONE, 0, 0L,
+        15000, metrics, "", time, apiVersions, transactionManager,
+        new BufferPool(1024 * 1024, 16 * 1024, metrics, time, ""));
+
+    Sender sender = new Sender(logContext, this.client, this.metadata, accumulator, false,
+        MAX_REQUEST_SIZE, ACKS_ALL, MAX_RETRIES, new SenderMetricsRegistry(metrics), this.time, 1,
+        0, transactionManager, apiVersions);
+
+    assertEquals(0, transactionManager.sequenceNumber(tp0).intValue());
+
+    Future<RecordMetadata> responseFuture1 = accumulator.append(tp0, time.milliseconds(), "1".getBytes(), "1".getBytes(),
+        Record.EMPTY_HEADERS, null, MAX_BLOCK_TIMEOUT, false, time.milliseconds()).future;
+    sender.runOnce();
+    assertEquals(1, transactionManager.sequenceNumber(tp0).intValue());
+
+    time.sleep(1); // request time out
+    sender.runOnce();
+    assertEquals(0, client.inFlightRequestCount());
+    assertTrue(transactionManager.hasInflightBatches(tp0));
+    assertEquals(1, transactionManager.sequenceNumber(tp0).intValue());
+    sender.runOnce(); // retry
+    assertEquals(1, client.inFlightRequestCount());
+    assertTrue(transactionManager.hasInflightBatches(tp0));
+    assertEquals(1, transactionManager.sequenceNumber(tp0).intValue());
+
+    time.sleep(5000); // delivery time out
+    sender.runOnce(); // expired in accumulator
+    assertFalse(transactionManager.hasInFlightRequest());
+    assertEquals(1, client.inFlightRequestCount()); // not reaching request timeout, so still in flight
+
+    sender.runOnce(); // bump the epoch
+    assertEquals(epoch + 1, transactionManager.producerIdAndEpoch().epoch);
+    assertEquals(0, transactionManager.sequenceNumber(tp0).intValue());
+
+    Future<RecordMetadata> responseFuture2 = accumulator.append(tp0, time.milliseconds(), "2".getBytes(), "2".getBytes(),
+        Record.EMPTY_HEADERS, null, MAX_BLOCK_TIMEOUT, false, time.milliseconds()).future;
+    sender.runOnce();
+    sender.runOnce();
+    assertEquals(0, transactionManager.firstInFlightSequence(tp0));
+    assertEquals(1, transactionManager.sequenceNumber(tp0).intValue());
+
+    time.sleep(5000); // request time out again
+    sender.runOnce();
+    assertTrue(transactionManager.hasInflightBatches(tp0)); // the latter batch failed and retried
+    assertTrue(responseFuture1.isDone());

Review Comment:
   nit: could we move this after line 710? It makes the test a little easier to 
understand if we see when the first send fails. Also, maybe we could use 
`TestUtils.assertFutureThrows(responseFuture1, TimeoutException.class)` to make 
the timeout expectation explicit?
   



[GitHub] [kafka] anatasiavela opened a new pull request, #11996: MINOR: Fix flaky testIdleConnection() test

2022-04-04 Thread GitBox


anatasiavela opened a new pull request, #11996:
URL: https://github.com/apache/kafka/pull/11996

   The test expects the connection to become idle before the mock time is 
moved forward, but the processor thread runs concurrently and may perform 
activity on the connection after the mock time is moved forward, so the 
connection never expires.
   
   The solution is to wait until the message is received on the socket, and 
only then wait until the connection is unmuted (waiting for unmute alone is 
not enough, because the channel might not have been muted yet).
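   A rough sketch of that ordering (the helper names here are illustrative, 
not the actual SocketServerTest internals):
   ```java
   // 1) wait until the request has actually been read off the socket
   TestUtils.waitForCondition(() -> receivedRequestCount() > before, "request not received");
   // 2) only then wait for the channel to be unmuted again
   TestUtils.waitForCondition(() -> !isMuted(connectionId), "channel still muted");
   // 3) now it is safe to advance the mock time past the idle timeout
   mockTime.sleep(idleTimeoutMs + 1);
   ```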





[GitHub] [kafka] RivenSun2 commented on a diff in pull request #11985: MINOR: Supplement the description of `Valid Values` in the documentation of `compression.type`

2022-04-04 Thread GitBox


RivenSun2 commented on code in PR #11985:
URL: https://github.com/apache/kafka/pull/11985#discussion_r842228787


##
clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java:
##
@@ -329,7 +330,7 @@
 in("all", "-1", "0", "1"),
 Importance.LOW,
 ACKS_DOC)
-.define(COMPRESSION_TYPE_CONFIG, Type.STRING, "none", Importance.HIGH, COMPRESSION_TYPE_DOC)
+.define(COMPRESSION_TYPE_CONFIG, Type.STRING, CompressionType.NONE.name, in(CompressionType.names().toArray(new String[0])), Importance.HIGH, COMPRESSION_TYPE_DOC)

Review Comment:
   1. The values we need to pass to the validator here are `[none, gzip, 
snappy, lz4, zstd]`, not `[NONE, GZIP, SNAPPY, LZ4, ZSTD]`, so I am not 
using the `Utils.enumOptions()` method directly.
   2. I added a static `names()` method to `CompressionType` so that it can 
be reused wherever the same requirement comes up.
   WDYT?
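   For context, the proposed helper is essentially the following (a sketch 
of the idea, not the merged code):
   ```java
   // wire names are lowercase ("none", "gzip", ...), unlike the enum constants
   public static List<String> names() {
       return Arrays.stream(CompressionType.values())
           .map(type -> type.name)
           .collect(Collectors.toList());
   }
   ```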






[jira] [Commented] (KAFKA-7509) Kafka Connect logs unnecessary warnings about unused configurations

2022-04-04 Thread Chris Egerton (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517159#comment-17517159
 ] 

Chris Egerton commented on KAFKA-7509:
--

Hmm... I'm not able to reproduce this easily with Kafka Streams. It looks like 
care is already taken to filter out most irrelevant client properties with 
logic like 
[this|https://github.com/apache/kafka/blob/74909e000aaab1f0300ae2e918a6fa361e38078e/streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java#L1580-L1598].
 I think it might still be possible to trigger the unrecognized-config warning 
by passing interceptor properties in the top-level streams config, but since 
you can already isolate client properties with separate {{{}producer.{}}}, 
{{{}consumer.{}}}, and {{admin.}} namespaces, there doesn't seem to be any need 
to make code changes to address that case.

[~mjsax] [~vvcephei] do either of you know if KAFKA-6793 is still an issue? And 
if so, do you know off the top of your head how to go about reproducing these 
warning messages with Kafka Streams?

If Kafka Streams has already taken care of everything, then that should 
simplify the review process for [https://github.com/apache/kafka/pull/11986].

> Kafka Connect logs unnecessary warnings about unused configurations
> ---
>
> Key: KAFKA-7509
> URL: https://issues.apache.org/jira/browse/KAFKA-7509
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Randall Hauch
>Assignee: Chris Egerton
>Priority: Major
>
> When running Connect, the logs contain quite a few warnings about "The 
> configuration '{}' was supplied but isn't a known config." This occurs when 
> Connect creates producers, consumers, and admin clients, because the 
> AbstractConfig is logging unused configuration properties upon construction. 
> It's complicated by the fact that the Producer, Consumer, and AdminClient all 
> create their own AbstractConfig instances within the constructor, so we can't 
> even call its {{ignore(String key)}} method.
> See also KAFKA-6793 for a similar issue with Streams.
> There are no arguments in the Producer, Consumer, or AdminClient constructors 
> to control whether the configs log these warnings, so a simpler workaround 
> is to only pass those configuration properties to the Producer, Consumer, and 
> AdminClient that the ProducerConfig, ConsumerConfig, and AdminClientConfig 
> configdefs know about.
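
The workaround described above amounts to something like the following (an illustrative sketch, not the actual Connect worker code):

{code}
// keep only the keys that the producer's ConfigDef actually defines
Map<String, Object> producerProps = workerProps.entrySet().stream()
        .filter(e -> ProducerConfig.configNames().contains(e.getKey()))
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
{code}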





[jira] [Resolved] (KAFKA-13741) Cluster IDs should not have leading dash

2022-04-04 Thread David Arthur (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Arthur resolved KAFKA-13741.
--
Resolution: Fixed

> Cluster IDs should not have leading dash
> 
>
> Key: KAFKA-13741
> URL: https://issues.apache.org/jira/browse/KAFKA-13741
> Project: Kafka
>  Issue Type: Bug
>  Components: kraft
>Affects Versions: 3.1.0, 3.0.0
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Minor
>  Labels: kip-500
> Fix For: 3.2.0
>
>
> Since we use URL-safe base64 encoded Uuid's for cluster ID, it is possible 
> for dash ("-") characters to be present in the ID string. When a cluster ID 
> has a leading dash, we can run into problems when running the Kafka bash 
> scripts.
> For example, if the ID "-Xflm1nKSfOK8QGt_AXhxw" is generated with 
> "random-uuid" sub-command of kafka-storage.sh, we would then normally format 
> the log directories like:
> {code}
> ./bin/kafka-storage.sh format --config ./config/kraft/controller.properties \
> --cluster-id -Xflm1nKSfOK8QGt_AXhxw
> {code}
> This will not parse correctly as the argument parsing library will treat the 
> cluster ID as an argument. It leads to the following error:
> {code}
> usage: kafka-storage format [-h] --config CONFIG --cluster-id CLUSTER_ID 
> [--ignore-formatted]
> kafka-storage: error: argument --cluster-id/-t: expected one argument
> {code}
> This can be worked around by putting the ID in quotes, or by using an "=" 
> between the "--cluster-id" and the Uuid.
> We can solve this going forward by not generating random Uuid's that contain 
> a leading dash.
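
For example, the "=" form of the workaround parses correctly:

{code}
./bin/kafka-storage.sh format --config ./config/kraft/controller.properties \
--cluster-id=-Xflm1nKSfOK8QGt_AXhxw
{code}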





[jira] [Updated] (KAFKA-13755) Broker heartbeat event should have deadline

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-13755:
--
Fix Version/s: (was: 3.2.0)

> Broker heartbeat event should have deadline
> ---
>
> Key: KAFKA-13755
> URL: https://issues.apache.org/jira/browse/KAFKA-13755
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Reporter: David Arthur
>Priority: Minor
>  Labels: kip-500
>
> When we schedule the event for processing the broker heartbeat request in 
> QuorumController, we do not give a deadline. This means that the event will 
> only be processed after all other events which do have a deadline. In the 
> case of the controller's queue getting filled up with deadline (i.e., 
> "deferred") events, we may not process the heartbeat before the broker 
> attempts to send another one.
>  
>  
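
A minimal sketch of the proposed change (the queue method and timeout names are assumptions for illustration, not verified QuorumController code):

{code}
// give the heartbeat event a deadline so it competes fairly with the
// other deadline-bearing ("deferred") events in the controller queue
long deadlineNs = time.nanoseconds() + TimeUnit.MILLISECONDS.toNanos(heartbeatTimeoutMs);
queue.appendWithDeadline(deadlineNs, brokerHeartbeatEvent); // rather than a plain queue.append(...)
{code}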





[jira] [Commented] (KAFKA-13755) Broker heartbeat event should have deadline

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517155#comment-17517155
 ] 

Bruno Cadonna commented on KAFKA-13755:
---

Removing from the 3.2.0 release since code freeze has passed.

> Broker heartbeat event should have deadline
> ---
>
> Key: KAFKA-13755
> URL: https://issues.apache.org/jira/browse/KAFKA-13755
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Reporter: David Arthur
>Priority: Minor
>  Labels: kip-500
> Fix For: 3.2.0
>
>
> When we schedule the event for processing the broker heartbeat request in 
> QuorumController, we do not give a deadline. This means that the event will 
> only be processed after all other events which do have a deadline. In the 
> case of the controller's queue getting filled up with deadline (i.e., 
> "deferred") events, we may not process the heartbeat before the broker 
> attempts to send another one.
>  
>  





[jira] [Updated] (KAFKA-10642) Expose the real stack trace if any exception occurred during SSL Client Trust Verification in extension

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-10642:
--
Fix Version/s: (was: 3.2.0)

> Expose the real stack trace if any exception occurred during SSL Client Trust 
> Verification in extension
> ---
>
> Key: KAFKA-10642
> URL: https://issues.apache.org/jira/browse/KAFKA-10642
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.3.0, 2.4.0, 2.3.1, 2.5.0, 2.4.1, 2.6.0, 2.5.1
>Reporter: Senthilnathan Muthusamy
>Assignee: Senthilnathan Muthusamy
>Priority: Minor
>
> If any exception occurs in the custom implementation of client 
> trust verification (i.e. using security.provider), the inner exception is 
> suppressed or hidden and not logged to the log file...
>  
> Below is an example stack trace not showing actual exception from the 
> extension/custom implementation.
>  
> [2020-05-13 14:30:26,892] ERROR [KafkaServer id=423810470] Fatal error during 
> KafkaServer startup. Prepare to shutdown 
> (kafka.server.KafkaServer)[2020-05-13 14:30:26,892] ERROR [KafkaServer 
> id=423810470] Fatal error during KafkaServer startup. Prepare to shutdown 
> (kafka.server.KafkaServer) org.apache.kafka.common.KafkaException: 
> org.apache.kafka.common.config.ConfigException: Invalid value 
> java.lang.RuntimeException: Delegated task threw Exception/Error for 
> configuration A client SSLEngine created with the provided settings can't 
> connect to a server SSLEngine created with those settings. at 
> org.apache.kafka.common.network.SslChannelBuilder.configure(SslChannelBuilder.java:71)
>  at 
> org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:146)
>  at 
> org.apache.kafka.common.network.ChannelBuilders.serverChannelBuilder(ChannelBuilders.java:85)
>  at kafka.network.Processor.<init>(SocketServer.scala:753) at 
> kafka.network.SocketServer.newProcessor(SocketServer.scala:394) at 
> kafka.network.SocketServer.$anonfun$addDataPlaneProcessors$1(SocketServer.scala:279)
>  at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) at 
> kafka.network.SocketServer.addDataPlaneProcessors(SocketServer.scala:278) at 
> kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1(SocketServer.scala:241)
>  at 
> kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1$adapted(SocketServer.scala:238)
>  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) 
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) 
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at 
> kafka.network.SocketServer.createDataPlaneAcceptorsAndProcessors(SocketServer.scala:238)
>  at kafka.network.SocketServer.startup(SocketServer.scala:121) at 
> kafka.server.KafkaServer.startup(KafkaServer.scala:265) at 
> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44) at 
> kafka.Kafka$.main(Kafka.scala:84) at kafka.Kafka.main(Kafka.scala)Caused by: 
> org.apache.kafka.common.config.ConfigException: Invalid value 
> java.lang.RuntimeException: Delegated task threw Exception/Error for 
> configuration A client SSLEngine created with the provided settings can't 
> connect to a server SSLEngine created with those settings. at 
> org.apache.kafka.common.security.ssl.SslFactory.configure(SslFactory.java:100)
>  at 
> org.apache.kafka.common.network.SslChannelBuilder.configure(SslChannelBuilder.java:69)
>  ... 18 more





[jira] [Commented] (KAFKA-10642) Expose the real stack trace if any exception occurred during SSL Client Trust Verification in extension

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517154#comment-17517154
 ] 

Bruno Cadonna commented on KAFKA-10642:
---

Removing from the 3.2.0 release since code freeze has passed.

> Expose the real stack trace if any exception occurred during SSL Client Trust 
> Verification in extension
> ---
>
> Key: KAFKA-10642
> URL: https://issues.apache.org/jira/browse/KAFKA-10642
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.3.0, 2.4.0, 2.3.1, 2.5.0, 2.4.1, 2.6.0, 2.5.1
>Reporter: Senthilnathan Muthusamy
>Assignee: Senthilnathan Muthusamy
>Priority: Minor
> Fix For: 3.2.0
>
>
> If any exception occurs in the custom implementation of client 
> trust verification (i.e. using security.provider), the inner exception is 
> suppressed or hidden and not logged to the log file...
>  
> Below is an example stack trace not showing actual exception from the 
> extension/custom implementation.
>  
> [2020-05-13 14:30:26,892] ERROR [KafkaServer id=423810470] Fatal error during 
> KafkaServer startup. Prepare to shutdown 
> (kafka.server.KafkaServer)[2020-05-13 14:30:26,892] ERROR [KafkaServer 
> id=423810470] Fatal error during KafkaServer startup. Prepare to shutdown 
> (kafka.server.KafkaServer) org.apache.kafka.common.KafkaException: 
> org.apache.kafka.common.config.ConfigException: Invalid value 
> java.lang.RuntimeException: Delegated task threw Exception/Error for 
> configuration A client SSLEngine created with the provided settings can't 
> connect to a server SSLEngine created with those settings. at 
> org.apache.kafka.common.network.SslChannelBuilder.configure(SslChannelBuilder.java:71)
>  at 
> org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:146)
>  at 
> org.apache.kafka.common.network.ChannelBuilders.serverChannelBuilder(ChannelBuilders.java:85)
>  at kafka.network.Processor.<init>(SocketServer.scala:753) at 
> kafka.network.SocketServer.newProcessor(SocketServer.scala:394) at 
> kafka.network.SocketServer.$anonfun$addDataPlaneProcessors$1(SocketServer.scala:279)
>  at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) at 
> kafka.network.SocketServer.addDataPlaneProcessors(SocketServer.scala:278) at 
> kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1(SocketServer.scala:241)
>  at 
> kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1$adapted(SocketServer.scala:238)
>  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) 
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) 
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at 
> kafka.network.SocketServer.createDataPlaneAcceptorsAndProcessors(SocketServer.scala:238)
>  at kafka.network.SocketServer.startup(SocketServer.scala:121) at 
> kafka.server.KafkaServer.startup(KafkaServer.scala:265) at 
> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44) at 
> kafka.Kafka$.main(Kafka.scala:84) at kafka.Kafka.main(Kafka.scala)Caused by: 
> org.apache.kafka.common.config.ConfigException: Invalid value 
> java.lang.RuntimeException: Delegated task threw Exception/Error for 
> configuration A client SSLEngine created with the provided settings can't 
> connect to a server SSLEngine created with those settings. at 
> org.apache.kafka.common.security.ssl.SslFactory.configure(SslFactory.java:100)
>  at 
> org.apache.kafka.common.network.SslChannelBuilder.configure(SslChannelBuilder.java:69)
>  ... 18 more





[jira] [Updated] (KAFKA-9803) Allow producers to recover gracefully from transaction timeouts

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-9803:
-
Fix Version/s: (was: 3.2.0)

> Allow producers to recover gracefully from transaction timeouts
> ---
>
> Key: KAFKA-9803
> URL: https://issues.apache.org/jira/browse/KAFKA-9803
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer , streams
>Reporter: Jason Gustafson
>Assignee: Boyang Chen
>Priority: Major
>  Labels: needs-kip
>
> Transaction timeouts are detected by the transaction coordinator. When the 
> coordinator detects a timeout, it bumps the producer epoch and aborts the 
> transaction. The epoch bump is necessary in order to prevent the current 
> producer from being able to begin writing to a new transaction which was not 
> started through the coordinator.  
> Transactions may also be aborted if a new producer with the same 
> `transactional.id` starts up. Similarly this results in an epoch bump. 
> Currently the coordinator does not distinguish these two cases. Both will end 
> up as a `ProducerFencedException`, which means the producer needs to shut 
> itself down. 
> We can improve this with the new APIs from KIP-360. When the coordinator 
> times out a transaction, it can remember that fact and allow the existing 
> producer to claim the bumped epoch and continue. Roughly the logic would work 
> like this:
> 1. When a transaction times out, set lastProducerEpoch to the current epoch 
> and do the normal bump.
> 2. Any transactional requests from the old epoch result in a new 
> TRANSACTION_TIMED_OUT error code, which is propagated to the application.
> 3. The producer recovers by sending InitProducerId with the current epoch. 
> The coordinator returns the bumped epoch.
> One issue that needs to be addressed is how to handle INVALID_PRODUCER_EPOCH 
> from Produce requests. Partition leaders will not generally know if a bumped 
> epoch was the result of a timed out transaction or a fenced producer. 
> Possibly the producer can treat these errors as abortable when they come from 
> Produce responses. In that case, the user would try to abort the transaction 
> and then we can see if it was due to a timeout or otherwise.
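
A hedged sketch of how steps 2 and 3 could look from the application side (TransactionTimedOutException is the proposed error, not an existing class, and re-running initTransactions() for recovery is part of the proposal):

{code}
try {
    producer.commitTransaction();
} catch (TransactionTimedOutException e) {
    // step 2: the old epoch's request came back TRANSACTION_TIMED_OUT
    producer.abortTransaction();
    // step 3: send InitProducerId with the current epoch; the coordinator
    // returns the bumped epoch and the producer can continue
    producer.initTransactions();
}
{code}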





[jira] [Commented] (KAFKA-9803) Allow producers to recover gracefully from transaction timeouts

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517153#comment-17517153
 ] 

Bruno Cadonna commented on KAFKA-9803:
--

Removing from the 3.2.0 release since code freeze has passed.

> Allow producers to recover gracefully from transaction timeouts
> ---
>
> Key: KAFKA-9803
> URL: https://issues.apache.org/jira/browse/KAFKA-9803
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer , streams
>Reporter: Jason Gustafson
>Assignee: Boyang Chen
>Priority: Major
>  Labels: needs-kip
> Fix For: 3.2.0
>
>
> Transaction timeouts are detected by the transaction coordinator. When the 
> coordinator detects a timeout, it bumps the producer epoch and aborts the 
> transaction. The epoch bump is necessary in order to prevent the current 
> producer from being able to begin writing to a new transaction which was not 
> started through the coordinator.  
> Transactions may also be aborted if a new producer with the same 
> `transactional.id` starts up. Similarly this results in an epoch bump. 
> Currently the coordinator does not distinguish these two cases. Both will end 
> up as a `ProducerFencedException`, which means the producer needs to shut 
> itself down. 
> We can improve this with the new APIs from KIP-360. When the coordinator 
> times out a transaction, it can remember that fact and allow the existing 
> producer to claim the bumped epoch and continue. Roughly the logic would work 
> like this:
> 1. When a transaction times out, set lastProducerEpoch to the current epoch 
> and do the normal bump.
> 2. Any transactional requests from the old epoch result in a new 
> TRANSACTION_TIMED_OUT error code, which is propagated to the application.
> 3. The producer recovers by sending InitProducerId with the current epoch. 
> The coordinator returns the bumped epoch.
> One issue that needs to be addressed is how to handle INVALID_PRODUCER_EPOCH 
> from Produce requests. Partition leaders will not generally know if a bumped 
> epoch was the result of a timed out transaction or a fenced producer. 
> Possibly the producer can treat these errors as abortable when they come from 
> Produce responses. In that case, the user would try to abort the transaction 
> and then we can see if it was due to a timeout or otherwise.





[jira] [Updated] (KAFKA-12842) Failing test: org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-12842:
--
Fix Version/s: (was: 3.2.0)

> Failing test: 
> org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic
> --
>
> Key: KAFKA-12842
> URL: https://issues.apache.org/jira/browse/KAFKA-12842
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: John Roesler
>Priority: Major
>
> This test failed during a PR build, which means that it failed twice in a 
> row, due to the test-retry logic in PR builds.
>  
> [https://github.com/apache/kafka/pull/10744/checks?check_run_id=2643417209]
>  
> {noformat}
> java.lang.NullPointerException
>   at 
> java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
>   at org.reflections.Store.getAllIncluding(Store.java:82)
>   at org.reflections.Store.getAll(Store.java:93)
>   at org.reflections.Reflections.getSubTypesOf(Reflections.java:404)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.getPluginDesc(DelegatingClassLoader.java:352)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanPluginPath(DelegatingClassLoader.java:337)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanUrlsAndAddPlugins(DelegatingClassLoader.java:268)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initPluginLoader(DelegatingClassLoader.java:216)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:209)
>   at 
> org.apache.kafka.connect.runtime.isolation.Plugins.<init>(Plugins.java:61)
>   at 
> org.apache.kafka.connect.cli.ConnectDistributed.startConnect(ConnectDistributed.java:93)
>   at 
> org.apache.kafka.connect.util.clusters.WorkerHandle.start(WorkerHandle.java:50)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.addWorker(EmbeddedConnectCluster.java:174)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.startConnect(EmbeddedConnectCluster.java:260)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.start(EmbeddedConnectCluster.java:141)
>   at 
> org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic(ConnectWorkerIntegrationTest.java:303)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> 

[jira] [Commented] (KAFKA-12842) Failing test: org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517152#comment-17517152
 ] 

Bruno Cadonna commented on KAFKA-12842:
---

Removing from the 3.2.0 release since code freeze has passed.

> Failing test: 
> org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic
> --
>
> Key: KAFKA-12842
> URL: https://issues.apache.org/jira/browse/KAFKA-12842
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: John Roesler
>Priority: Major
> Fix For: 3.2.0
>
>
> This test failed during a PR build, which means that it failed twice in a 
> row, due to the test-retry logic in PR builds.
>  
> [https://github.com/apache/kafka/pull/10744/checks?check_run_id=2643417209]
>  
> {noformat}
> java.lang.NullPointerException
>   at 
> java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
>   at org.reflections.Store.getAllIncluding(Store.java:82)
>   at org.reflections.Store.getAll(Store.java:93)
>   at org.reflections.Reflections.getSubTypesOf(Reflections.java:404)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.getPluginDesc(DelegatingClassLoader.java:352)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanPluginPath(DelegatingClassLoader.java:337)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanUrlsAndAddPlugins(DelegatingClassLoader.java:268)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initPluginLoader(DelegatingClassLoader.java:216)
>   at 
> org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:209)
>   at 
> org.apache.kafka.connect.runtime.isolation.Plugins.<init>(Plugins.java:61)
>   at 
> org.apache.kafka.connect.cli.ConnectDistributed.startConnect(ConnectDistributed.java:93)
>   at 
> org.apache.kafka.connect.util.clusters.WorkerHandle.start(WorkerHandle.java:50)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.addWorker(EmbeddedConnectCluster.java:174)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.startConnect(EmbeddedConnectCluster.java:260)
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.start(EmbeddedConnectCluster.java:141)
>   at 
> org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic(ConnectWorkerIntegrationTest.java:303)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>   at 
> 

[jira] [Updated] (KAFKA-9910) Implement new transaction timed out error

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-9910:
-
Fix Version/s: (was: 3.2.0)

> Implement new transaction timed out error
> -
>
> Key: KAFKA-9910
> URL: https://issues.apache.org/jira/browse/KAFKA-9910
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core
>Reporter: Boyang Chen
>Assignee: HaiyuanZhao
>Priority: Major
>






[jira] [Commented] (KAFKA-9910) Implement new transaction timed out error

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517151#comment-17517151
 ] 

Bruno Cadonna commented on KAFKA-9910:
--

Removing from the 3.2.0 release since code freeze has passed.

> Implement new transaction timed out error
> -
>
> Key: KAFKA-9910
> URL: https://issues.apache.org/jira/browse/KAFKA-9910
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core
>Reporter: Boyang Chen
>Assignee: HaiyuanZhao
>Priority: Major
> Fix For: 3.2.0
>
>






[jira] [Updated] (KAFKA-7493) Rewrite test_broker_type_bounce_at_start

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-7493:
-
Fix Version/s: (was: 3.2.0)

> Rewrite test_broker_type_bounce_at_start
> 
>
> Key: KAFKA-7493
> URL: https://issues.apache.org/jira/browse/KAFKA-7493
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, system tests
>Reporter: John Roesler
>Priority: Major
>
> Currently, the test test_broker_type_bounce_at_start in 
> streams_broker_bounce_test.py is ignored.
> As written, there are a couple of race conditions that lead to flakiness.
> It should be possible to re-write the test to wait on log messages, as the 
> other tests do, instead of just sleeping to more deterministically transition 
> the test from one state to the next.
> Once the test is fixed, the fix should be back-ported to all prior branches, 
> back to 0.10.





[jira] [Commented] (KAFKA-7493) Rewrite test_broker_type_bounce_at_start

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517150#comment-17517150
 ] 

Bruno Cadonna commented on KAFKA-7493:
--

Removing from the 3.2.0 release since code freeze has passed.

> Rewrite test_broker_type_bounce_at_start
> 
>
> Key: KAFKA-7493
> URL: https://issues.apache.org/jira/browse/KAFKA-7493
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, system tests
>Reporter: John Roesler
>Priority: Major
> Fix For: 3.2.0
>
>
> Currently, the test test_broker_type_bounce_at_start in 
> streams_broker_bounce_test.py is ignored.
> As written, there are a couple of race conditions that lead to flakiness.
> It should be possible to re-write the test to wait on log messages, as the 
> other tests do, instead of just sleeping to more deterministically transition 
> the test from one state to the next.
> Once the test is fixed, the fix should be back-ported to all prior branches, 
> back to 0.10.





[jira] [Updated] (KAFKA-12641) Clear RemoteLogLeaderEpochState entry when it become empty.

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-12641:
--
Fix Version/s: (was: 3.2.0)

> Clear RemoteLogLeaderEpochState entry when it become empty. 
> 
>
> Key: KAFKA-12641
> URL: https://issues.apache.org/jira/browse/KAFKA-12641
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Satish Duggana
>Assignee: Satish Duggana
>Priority: Major
>
> https://github.com/apache/kafka/pull/10218#discussion_r609895193





[jira] [Commented] (KAFKA-12641) Clear RemoteLogLeaderEpochState entry when it become empty.

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517149#comment-17517149
 ] 

Bruno Cadonna commented on KAFKA-12641:
---

Removing from the 3.2.0 release since code freeze has passed.

> Clear RemoteLogLeaderEpochState entry when it become empty. 
> 
>
> Key: KAFKA-12641
> URL: https://issues.apache.org/jira/browse/KAFKA-12641
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Satish Duggana
>Assignee: Satish Duggana
>Priority: Major
> Fix For: 3.2.0
>
>
> https://github.com/apache/kafka/pull/10218#discussion_r609895193





[jira] [Updated] (KAFKA-9579) RLM fetch implementation by adding respective purgatory

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-9579:
-
Fix Version/s: (was: 3.2.0)

> RLM fetch implementation by adding respective purgatory
> ---
>
> Key: KAFKA-9579
> URL: https://issues.apache.org/jira/browse/KAFKA-9579
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Satish Duggana
>Assignee: Ying Zheng
>Priority: Major
>






[jira] [Commented] (KAFKA-9579) RLM fetch implementation by adding respective purgatory

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517148#comment-17517148
 ] 

Bruno Cadonna commented on KAFKA-9579:
--

Removing from the 3.2.0 release since code freeze has passed.

> RLM fetch implementation by adding respective purgatory
> ---
>
> Key: KAFKA-9579
> URL: https://issues.apache.org/jira/browse/KAFKA-9579
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Satish Duggana
>Assignee: Ying Zheng
>Priority: Major
> Fix For: 3.2.0
>
>






[jira] [Commented] (KAFKA-12176) Consider changing default log.message.timestamp.difference.max.ms

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517147#comment-17517147
 ] 

Bruno Cadonna commented on KAFKA-12176:
---

Removing from the 3.2.0 release since code freeze has passed.

> Consider changing default log.message.timestamp.difference.max.ms
> -
>
> Key: KAFKA-12176
> URL: https://issues.apache.org/jira/browse/KAFKA-12176
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Priority: Major
>  Labels: needs-kip
> Fix For: 3.2.0
>
>
> The default `log.message.timestamp.difference.max.ms` is Long.MaxValue, which 
> means the broker will accept arbitrary timestamps. The broker relies on 
> timestamps internally for deciding when segments should be rolled and when 
> they should be deleted. A buggy client which is producing messages with 
> timestamps too far in the future or past can cause segments to roll 
> frequently which can lead to open file exhaustion, or it can cause segments 
> to be retained indefinitely which can lead to disk space exhaustion.
> In https://issues.apache.org/jira/browse/KAFKA-4340, it was proposed to set 
> the default equal to `log.retention.ms`, which still seems to make sense. 
> This was rejected for two reasons as far as I can tell. First was 
> compatibility, which just means we would need a major upgrade. The second is 
> that we previously did not have a way to tell the user which record had 
> violated the max timestamp difference, but now we have 
> [KIP-467|https://cwiki.apache.org/confluence/display/KAFKA/KIP-467%3A+Augment+ProduceResponse+error+messaging+for+specific+culprit+records].
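
Concretely, the proposal amounts to broker settings along these lines (the 7-day value is only an example mirroring a typical log.retention.ms):

{code}
# current default: Long.MaxValue, i.e. any timestamp is accepted
log.message.timestamp.difference.max.ms=9223372036854775807
# proposed: cap the difference at the retention window, e.g. 7 days
log.retention.ms=604800000
log.message.timestamp.difference.max.ms=604800000
{code}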





[jira] [Updated] (KAFKA-12176) Consider changing default log.message.timestamp.difference.max.ms

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-12176:
--
Fix Version/s: (was: 3.2.0)

> Consider changing default log.message.timestamp.difference.max.ms
> -
>
> Key: KAFKA-12176
> URL: https://issues.apache.org/jira/browse/KAFKA-12176
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Priority: Major
>  Labels: needs-kip
>
> The default `log.message.timestamp.difference.max.ms` is Long.MaxValue, which 
> means the broker will accept arbitrary timestamps. The broker relies on 
> timestamps internally for deciding when segments should be rolled and when 
> they should be deleted. A buggy client which is producing messages with 
> timestamps too far in the future or past can cause segments to roll 
> frequently which can lead to open file exhaustion, or it can cause segments 
> to be retained indefinitely which can lead to disk space exhaustion.
> In https://issues.apache.org/jira/browse/KAFKA-4340, it was proposed to set 
> the default equal to `log.retention.ms`, which still seems to make sense. 
> This was rejected for two reasons as far as I can tell. First was 
> compatibility, which just means we would need a major upgrade. The second is 
> that we previously did not have a way to tell the user which record had 
> violated the max timestamp difference, but now we have 
> [KIP-467|https://cwiki.apache.org/confluence/display/KAFKA/KIP-467%3A+Augment+ProduceResponse+error+messaging+for+specific+culprit+records].





[jira] [Updated] (KAFKA-10233) KafkaConsumer polls in a tight loop if group is not authorized

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-10233:
--
Fix Version/s: (was: 3.2.0)

> KafkaConsumer polls in a tight loop if group is not authorized
> --
>
> Key: KAFKA-10233
> URL: https://issues.apache.org/jira/browse/KAFKA-10233
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Major
>
> Consumer propagates GroupAuthorizationException from poll immediately when 
> trying to find coordinator even though it is a retriable exception. If the 
> application polls in a loop, ignoring retriable exceptions, the consumer 
> tries to find coordinator in a tight loop without any backoff. We should 
> apply retry backoff in this case to avoid overloading brokers.
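
From the application's point of view the symptom looks like this (a sketch; retryBackoffMs and process() are illustrative):

{code}
while (running) {
    try {
        ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(500));
        records.forEach(this::process);
    } catch (GroupAuthorizationException e) {
        // poll() surfaces this immediately while the coordinator lookup keeps
        // failing; without a backoff here (or inside the consumer, as this
        // ticket proposes) the lookup is retried in a tight cycle
        Thread.sleep(retryBackoffMs); // assumes the enclosing method declares InterruptedException
    }
}
{code}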





[jira] [Commented] (KAFKA-10233) KafkaConsumer polls in a tight loop if group is not authorized

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517146#comment-17517146
 ] 

Bruno Cadonna commented on KAFKA-10233:
---

Removing from the 3.2.0 release since code freeze has passed.

> KafkaConsumer polls in a tight loop if group is not authorized
> --
>
> Key: KAFKA-10233
> URL: https://issues.apache.org/jira/browse/KAFKA-10233
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Major
> Fix For: 3.2.0
>
>
> Consumer propagates GroupAuthorizationException from poll immediately when 
> trying to find coordinator even though it is a retriable exception. If the 
> application polls in a loop, ignoring retriable exceptions, the consumer 
> tries to find coordinator in a tight loop without any backoff. We should 
> apply retry backoff in this case to avoid overloading brokers.





[jira] [Updated] (KAFKA-6080) Transactional EoS for source connectors

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-6080:
-
Fix Version/s: (was: 3.2.0)

> Transactional EoS for source connectors
> ---
>
> Key: KAFKA-6080
> URL: https://issues.apache.org/jira/browse/KAFKA-6080
> Project: Kafka
>  Issue Type: New Feature
>  Components: KafkaConnect
>Reporter: Antony Stubbs
>Assignee: Chris Egerton
>Priority: Major
>  Labels: needs-kip
>
> Exactly once (eos) message production for source connectors.





[jira] [Commented] (KAFKA-6080) Transactional EoS for source connectors

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517145#comment-17517145
 ] 

Bruno Cadonna commented on KAFKA-6080:
--

Removing from the 3.2.0 release since code freeze has passed.

> Transactional EoS for source connectors
> ---
>
> Key: KAFKA-6080
> URL: https://issues.apache.org/jira/browse/KAFKA-6080
> Project: Kafka
>  Issue Type: New Feature
>  Components: KafkaConnect
>Reporter: Antony Stubbs
>Assignee: Chris Egerton
>Priority: Major
>  Labels: needs-kip
> Fix For: 3.2.0
>
>
> Exactly once (eos) message production for source connectors.





[jira] [Updated] (KAFKA-13093) KIP-724: Log compaction should write new segments with record version v2

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-13093:
--
Fix Version/s: (was: 3.2.0)

> KIP-724: Log compaction should write new segments with record version v2
> 
>
> Key: KAFKA-13093
> URL: https://issues.apache.org/jira/browse/KAFKA-13093
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Major
>
> This applies if IBP is 3.0 or higher. Currently, log compaction retains the record format 
> of the record batch that was retained.





[jira] [Commented] (KAFKA-13093) KIP-724: Log compaction should write new segments with record version v2

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517144#comment-17517144
 ] 

Bruno Cadonna commented on KAFKA-13093:
---

Removing from the 3.2.0 release since code freeze has passed.

> KIP-724: Log compaction should write new segments with record version v2
> 
>
> Key: KAFKA-13093
> URL: https://issues.apache.org/jira/browse/KAFKA-13093
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Major
> Fix For: 3.2.0
>
>
> This applies if IBP is 3.0 or higher. Currently, log compaction retains the record format 
> of the record batch that was retained.





[GitHub] [kafka] junrao commented on a diff in pull request #11842: KAFKA-13687: Limiting the amount of bytes to be read in a segment logs

2022-04-04 Thread GitBox


junrao commented on code in PR #11842:
URL: https://github.com/apache/kafka/pull/11842#discussion_r842151867


##
core/src/main/scala/kafka/tools/DumpLogSegments.scala:
##
@@ -430,6 +431,11 @@ object DumpLogSegments {
   .describedAs("size")
   .ofType(classOf[java.lang.Integer])
   .defaultsTo(5 * 1024 * 1024)
+    val maxBytesOpt = parser.accepts("max-bytes", "Limit the amount of total batches in bytes avoiding reading the whole file(s).")

Review Comment:
   It would be useful to make it clear this option only applies to the .log 
files.



##
core/src/test/scala/unit/kafka/tools/DumpLogSegmentsTest.scala:
##
@@ -310,6 +334,38 @@ class DumpLogSegmentsTest {
 None
   }
 
+  // Returns the total bytes of the batches specified
+  private def readPartialBatchesBytes(lines: util.ListIterator[String], limit: Int): Int = {
+val sizePattern: Regex = raw".+?size:\s(\d+).+".r
+var batchesBytes = 0
+var batchesCounter = 0
+while (lines.hasNext) {
+  if (batchesCounter >= limit){
+return batchesBytes
+  }
+  val line = lines.next()
+  if (line.startsWith("baseOffset")) {
+line match {
+  case sizePattern(size) => batchesBytes += size.toInt
+  case _ => throw new IllegalStateException(s"Failed to parse and find size value for batch line: $line")
+}
+batchesCounter += 1
+  }
+}
+return batchesBytes
+  }
+
+  private def countBatches(lines: util.ListIterator[String]): Int = {
+var countBatches = 0
+while (lines.hasNext) {
+  val line = lines.next()
+  if (line.startsWith("baseOffset")) {
+countBatches += 1
+  }
+}
+return countBatches

Review Comment:
   No need for return.



##
core/src/test/scala/unit/kafka/tools/DumpLogSegmentsTest.scala:
##
@@ -310,6 +334,38 @@ class DumpLogSegmentsTest {
 None
   }
 
+  // Returns the total bytes of the batches specified
+  private def readPartialBatchesBytes(lines: util.ListIterator[String], limit: Int): Int = {
+val sizePattern: Regex = raw".+?size:\s(\d+).+".r
+var batchesBytes = 0
+var batchesCounter = 0
+while (lines.hasNext) {
+  if (batchesCounter >= limit){
+return batchesBytes
+  }
+  val line = lines.next()
+  if (line.startsWith("baseOffset")) {
+line match {
+  case sizePattern(size) => batchesBytes += size.toInt
+  case _ => throw new IllegalStateException(s"Failed to parse and find size value for batch line: $line")
+}
+batchesCounter += 1
+  }
+}
+return batchesBytes

Review Comment:
   No need for return.






[jira] [Created] (KAFKA-13798) KafkaController should send LeaderAndIsr request when LeaderRecoveryState is altered

2022-04-04 Thread Jose Armando Garcia Sancio (Jira)
Jose Armando Garcia Sancio created KAFKA-13798:
--

 Summary: KafkaController should send LeaderAndIsr request when 
LeaderRecoveryState is altered
 Key: KAFKA-13798
 URL: https://issues.apache.org/jira/browse/KAFKA-13798
 Project: Kafka
  Issue Type: Task
  Components: controller
Affects Versions: 3.2.0
Reporter: Jose Armando Garcia Sancio
Assignee: Jose Armando Garcia Sancio


The current implementation of KIP-704 and the ZK Controller only sends a 
LeaderAndIsr request to the followers if the AlterPartition completes a 
reassignment. That means that if there are no reassignments pending, the ZK 
Controller never sends a LeaderAndIsr request to the followers. The controller 
needs to send a LeaderAndIsr request when the partition has recovered because 
of the "fetch from follower" feature.





[jira] [Updated] (KAFKA-6089) Transient failure in kafka.network.SocketServerTest.configureNewConnectionException

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-6089:
-
Fix Version/s: (was: 3.2.0)

> Transient failure in 
> kafka.network.SocketServerTest.configureNewConnectionException
> ---
>
> Key: KAFKA-6089
> URL: https://issues.apache.org/jira/browse/KAFKA-6089
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.0.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Major
>
> Stack trace:
> {quote}
> java.lang.AssertionError: Channels not removed
>   at kafka.utils.TestUtils$.fail(TestUtils.scala:347)
>   at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:861)
>   at 
> kafka.network.SocketServerTest.kafka$network$SocketServerTest$$assertProcessorHealthy(SocketServerTest.scala:888)
>   at 
> kafka.network.SocketServerTest$$anonfun$configureNewConnectionException$1.apply(SocketServerTest.scala:654)
>   at 
> kafka.network.SocketServerTest$$anonfun$configureNewConnectionException$1.apply(SocketServerTest.scala:645)
>   at 
> kafka.network.SocketServerTest.withTestableServer(SocketServerTest.scala:863)
>   at 
> kafka.network.SocketServerTest.configureNewConnectionException(SocketServerTest.scala:645)
> {quote}





[jira] [Commented] (KAFKA-6089) Transient failure in kafka.network.SocketServerTest.configureNewConnectionException

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517105#comment-17517105
 ] 

Bruno Cadonna commented on KAFKA-6089:
--

Removing from the 3.2.0 release since code freeze has passed.

> Transient failure in 
> kafka.network.SocketServerTest.configureNewConnectionException
> ---
>
> Key: KAFKA-6089
> URL: https://issues.apache.org/jira/browse/KAFKA-6089
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.0.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Major
> Fix For: 3.2.0
>
>
> Stack trace:
> {quote}
> java.lang.AssertionError: Channels not removed
>   at kafka.utils.TestUtils$.fail(TestUtils.scala:347)
>   at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:861)
>   at 
> kafka.network.SocketServerTest.kafka$network$SocketServerTest$$assertProcessorHealthy(SocketServerTest.scala:888)
>   at 
> kafka.network.SocketServerTest$$anonfun$configureNewConnectionException$1.apply(SocketServerTest.scala:654)
>   at 
> kafka.network.SocketServerTest$$anonfun$configureNewConnectionException$1.apply(SocketServerTest.scala:645)
>   at 
> kafka.network.SocketServerTest.withTestableServer(SocketServerTest.scala:863)
>   at 
> kafka.network.SocketServerTest.configureNewConnectionException(SocketServerTest.scala:645)
> {quote}





[jira] [Commented] (KAFKA-7435) Consider standardizing the config object pattern on interface/implementation.

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517103#comment-17517103
 ] 

Bruno Cadonna commented on KAFKA-7435:
--

Removing from the 3.2.0 release since code freeze has passed.

> Consider standardizing the config object pattern on interface/implementation.
> -
>
> Key: KAFKA-7435
> URL: https://issues.apache.org/jira/browse/KAFKA-7435
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: John Roesler
>Priority: Major
>
> Currently, the majority of Streams's config objects are structured as an 
> "external" builder class (with protected state) and an "internal" subclass 
> exposing getters to the state. This is serviceable, but there is an 
> alternative we can consider: to use an interface for the external API and the 
> implementation class for the internal one.
> Advantages:
>  * we could use private state, which improves maintainability
>  * the setters and getters would all be defined in the same class, improving 
> readability
>  * users browsing the public API would be able to look at an interface that 
> contains less extraneous internal details than the current class
>  * there is more flexibility in implementation
> Alternatives
>  * instead of external-class/internal-subclass, we could use an external 
> *final* class with package-protected state and an internal accessor class 
> (not a subclass, obviously). This would make it impossible for users to try 
> and create custom subclasses of our config objects, which is generally not 
> allowed already, but is currently a runtime class cast exception.
> Example implementation: [https://github.com/apache/kafka/pull/5677]
> This change would break binary, but not source, compatibility, so the 
> earliest we could consider it is 3.0.
> To be clear, I'm *not* saying this *should* be done, just calling for a 
> discussion. Otherwise, I'd make a KIP.
> Thoughts?
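For concreteness, a hand-sketched comparison of the two shapes (names are simplified stand-ins, not the actual Streams code):

{code:java}
// Current pattern: "external" builder class with protected state,
// plus an "internal" subclass exposing getters.
public class Grouped<K, V> {
    protected final String name;
    protected Grouped(String name) { this.name = name; }
    public static <K, V> Grouped<K, V> as(String name) { return new Grouped<>(name); }
}

class GroupedInternal<K, V> extends Grouped<K, V> {
    GroupedInternal(Grouped<K, V> grouped) { super(grouped.name); }
    String name() { return name; }
}

// Proposed pattern: public interface + internal implementation with private state.
interface GroupedApi<K, V> {
    static <K, V> GroupedApi<K, V> as(String name) { return new GroupedImpl<>(name); }
}

class GroupedImpl<K, V> implements GroupedApi<K, V> {
    private final String name;  // private state; setters and getters live together
    GroupedImpl(String name) { this.name = name; }
    String name() { return name; }
}
{code}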





[jira] [Updated] (KAFKA-7435) Consider standardizing the config object pattern on interface/implementation.

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-7435:
-
Fix Version/s: (was: 3.2.0)

> Consider standardizing the config object pattern on interface/implementation.
> -
>
> Key: KAFKA-7435
> URL: https://issues.apache.org/jira/browse/KAFKA-7435
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: John Roesler
>Priority: Major
>
> Currently, the majority of Streams's config objects are structured as an 
> "external" builder class (with protected state) and an "internal" subclass 
> exposing getters to the state. This is serviceable, but there is an 
> alternative we can consider: to use an interface for the external API and the 
> implementation class for the internal one.
> Advantages:
>  * we could use private state, which improves maintainability
>  * the setters and getters would all be defined in the same class, improving 
> readability
>  * users browsing the public API would be able to look at an interface that 
> contains less extraneous internal details than the current class
>  * there is more flexibility in implementation
> Alternatives
>  * instead of external-class/internal-subclass, we could use an external 
> *final* class with package-protected state and an internal accessor class 
> (not a subclass, obviously). This would make it impossible for users to try 
> and create custom subclasses of our config objects, which is generally not 
> allowed already, but is currently a runtime class cast exception.
> Example implementation: [https://github.com/apache/kafka/pull/5677]
> This change would break binary, but not source, compatibility, so the 
> earliest we could consider it is 3.0.
> To be clear, I'm *not* saying this *should* be done, just calling for a 
> discussion. Otherwise, I'd make a KIP.
> Thoughts?





[jira] [Commented] (KAFKA-13560) Load indexes and data in async manner in the critical path of replica fetcher threads.

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517100#comment-17517100
 ] 

Bruno Cadonna commented on KAFKA-13560:
---

Removing from the 3.2.0 release since code freeze has passed.

> Load indexes and data in async manner in the critical path of replica fetcher 
> threads. 
> ---
>
> Key: KAFKA-13560
> URL: https://issues.apache.org/jira/browse/KAFKA-13560
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Satish Duggana
>Assignee: Kamal Chandraprakash
>Priority: Major
> Fix For: 3.2.0
>
>
> https://github.com/apache/kafka/pull/11390#discussion_r762366976





[jira] [Updated] (KAFKA-13560) Load indexes and data in async manner in the critical path of replica fetcher threads.

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-13560:
--
Fix Version/s: (was: 3.2.0)

> Load indexes and data in async manner in the critical path of replica fetcher 
> threads. 
> ---
>
> Key: KAFKA-13560
> URL: https://issues.apache.org/jira/browse/KAFKA-13560
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Satish Duggana
>Assignee: Kamal Chandraprakash
>Priority: Major
>
> https://github.com/apache/kafka/pull/11390#discussion_r762366976





[jira] [Commented] (KAFKA-12616) Convert integration tests to use ClusterTest

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517098#comment-17517098
 ] 

Bruno Cadonna commented on KAFKA-12616:
---

Removing from the 3.2.0 release since code freeze has passed.

> Convert integration tests to use ClusterTest 
> -
>
> Key: KAFKA-12616
> URL: https://issues.apache.org/jira/browse/KAFKA-12616
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Priority: Major
>  Labels: kip-500
> Fix For: 3.2.0
>
>
> We would like to convert integration tests to use the new ClusterTest 
> annotations so that we can easily test both the Zk and KRaft implementations. 
> This will require adding a bunch of support to the ClusterTest framework as 
> we go along.





[jira] [Updated] (KAFKA-12616) Convert integration tests to use ClusterTest

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-12616:
--
Fix Version/s: (was: 3.2.0)

> Convert integration tests to use ClusterTest 
> -
>
> Key: KAFKA-12616
> URL: https://issues.apache.org/jira/browse/KAFKA-12616
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Priority: Major
>  Labels: kip-500
>
> We would like to convert integration tests to use the new ClusterTest 
> annotations so that we can easily test both the Zk and KRaft implementations. 
> This will require adding a bunch of support to the ClusterTest framework as 
> we go along.





[jira] [Commented] (KAFKA-13782) Producer may fail to add the correct partition to transaction

2022-04-04 Thread Jason Gustafson (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517085#comment-17517085
 ] 

Jason Gustafson commented on KAFKA-13782:
-

[~tombentley] Apologies for the delay. PR is up: 
https://github.com/apache/kafka/pull/11995.

> Producer may fail to add the correct partition to transaction
> -
>
> Key: KAFKA-13782
> URL: https://issues.apache.org/jira/browse/KAFKA-13782
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
> Fix For: 3.2.0, 3.1.1
>
>
> In KAFKA-13412, we changed the logic to add partitions to transactions in the 
> producer. The intention was to ensure that the partition is added in 
> `TransactionManager` before the record is appended to the 
> `RecordAccumulator`. However, this does not take into account the possibility 
> that the originally selected partition may be changed if `abortForNewBatch` 
> is set in `RecordAppendResult` in the call to `RecordAccumulator.append`. 
> When this happens, the partitioner can choose a different partition, which 
> means that the `TransactionManager` would be tracking the wrong partition.
> I think the consequence of this is that the batches sent to this partition 
> would get stuck in the `RecordAccumulator` until they timed out because we 
> validate before sending that the partition has been added correctly to the 
> transaction.
> Note that KAFKA-13412 has not been included in any release, so there are no 
> affected versions.
> Thanks to [~alivshits] for identifying the bug.





[GitHub] [kafka] hachikuji opened a new pull request, #11995: KAFKA-13782; Ensure correct partition added to txn after abort on full batch

2022-04-04 Thread GitBox


hachikuji opened a new pull request, #11995:
URL: https://github.com/apache/kafka/pull/11995

   Fixes a regression introduced in https://github.com/apache/kafka/pull/11452. 
Following 
[KIP-480](https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner),
 the `Partitioner` will receive a callback when a batch has been completed so 
that it can choose another partition. Because of this, we have to wait until 
the batch has been successfully appended to the accumulator before adding the 
partition in `TransactionManager.maybeAddPartition`. This is still safe because 
the `Sender` cannot dequeue a batch from the accumulator until it has been 
added to the transaction successfully.
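
   In simplified form, the ordering change looks roughly like this (a sketch with stand-in types and method names, not the actual producer code):

   ```java
   // Before (broken): the partition is registered with the transaction before
   // the append. If append() aborts for a new batch, the partitioner may pick a
   // different partition, leaving the TransactionManager tracking the wrong one.
   transactionManager.maybeAddPartition(partition);
   RecordAppendResult result = accumulator.append(partition, record);
   if (result.abortForNewBatch) {
       partition = choosePartition(record);      // partitioner picks again
       result = accumulator.append(partition, record);
   }

   // After (fixed): register the partition only once the append has settled.
   // Still safe: the Sender cannot drain a batch until the partition has been
   // added to the transaction.
   RecordAppendResult result2 = accumulator.append(partition, record);
   if (result2.abortForNewBatch) {
       partition = choosePartition(record);
       result2 = accumulator.append(partition, record);
   }
   transactionManager.maybeAddPartition(partition);
   ```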
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   





[jira] [Commented] (KAFKA-9990) Supporting transactions in tiered storage

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517034#comment-17517034
 ] 

Bruno Cadonna commented on KAFKA-9990:
--

Removing from the 3.2.0 release since code freeze has passed.

> Supporting transactions in tiered storage
> -
>
> Key: KAFKA-9990
> URL: https://issues.apache.org/jira/browse/KAFKA-9990
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Satish Duggana
>Assignee: Satish Duggana
>Priority: Major
>






[jira] [Assigned] (KAFKA-13668) Failed cluster authorization should not be fatal for producer

2022-04-04 Thread Philip Nee (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Nee reassigned KAFKA-13668:
--

Assignee: Philip Nee  (was: Kirk True)

> Failed cluster authorization should not be fatal for producer
> -
>
> Key: KAFKA-13668
> URL: https://issues.apache.org/jira/browse/KAFKA-13668
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Philip Nee
>Priority: Major
>
> The idempotent producer fails fatally if the initial `InitProducerId` returns 
> CLUSTER_AUTHORIZATION_FAILED. This makes the producer unusable until a new 
> instance is constructed. For some applications, it is more convenient to keep 
> the producer instance active and let the administrator fix the permission 
> problem instead of going into a crash loop. Additionally, most applications 
> will probably not be smart enough to reconstruct the producer instance, so if 
> the application does not handle the error by failing, users will have to 
> restart the application manually. 
> I think it would be better to let the producer retry the `InitProducerId` 
> request as long as the user keeps trying to use the producer. 
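
For illustration, the application-level workaround available today looks something like this (a sketch; topic name and timings are made up):

{code:java}
import java.util.Properties;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.ClusterAuthorizationException;

public class IdempotentProducerWorkaround {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("enable.idempotence", "true");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        while (true) {
            // The producer must be rebuilt from scratch each time: after the
            // fatal CLUSTER_AUTHORIZATION_FAILED, the old instance is unusable.
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("example-topic", "value")).get();
                break; // success
            } catch (ExecutionException e) {
                if (!(e.getCause() instanceof ClusterAuthorizationException)) {
                    throw e;
                }
                Thread.sleep(5_000); // wait for an admin to fix the ACLs, then retry
            }
        }
    }
}
{code}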





[jira] [Comment Edited] (KAFKA-9578) Kafka Tiered Storage - System Tests

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517032#comment-17517032
 ] 

Bruno Cadonna edited comment on KAFKA-9578 at 4/4/22 6:40 PM:
--

Removing from the 3.2.0 release since code freeze has passed.


was (Author: cadonna):
Remove from the 3.2.0 release since code freeze has passed.

> Kafka Tiered Storage - System  Tests
> 
>
> Key: KAFKA-9578
> URL: https://issues.apache.org/jira/browse/KAFKA-9578
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Harsha
>Assignee: Kamal Chandraprakash
>Priority: Major
>
> Initial test cases set up by [~Ying Zheng] 
>  
> [https://docs.google.com/spreadsheets/d/1gS0s1FOmcjpKYXBddejXAoJAjEZ7AdEzMU9wZc-JgY8/edit#gid=0]





[jira] [Updated] (KAFKA-9990) Supporting transactions in tiered storage

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-9990:
-
Fix Version/s: (was: 3.2.0)

> Supporting transactions in tiered storage
> -
>
> Key: KAFKA-9990
> URL: https://issues.apache.org/jira/browse/KAFKA-9990
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Satish Duggana
>Assignee: Satish Duggana
>Priority: Major
>






[jira] [Commented] (KAFKA-9578) Kafka Tiered Storage - System Tests

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517032#comment-17517032
 ] 

Bruno Cadonna commented on KAFKA-9578:
--

Remove from the 3.2.0 release since code freeze has passed.

> Kafka Tiered Storage - System  Tests
> 
>
> Key: KAFKA-9578
> URL: https://issues.apache.org/jira/browse/KAFKA-9578
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Harsha
>Assignee: Kamal Chandraprakash
>Priority: Major
>
> Initial test cases set up by [~Ying Zheng] 
>  
> [https://docs.google.com/spreadsheets/d/1gS0s1FOmcjpKYXBddejXAoJAjEZ7AdEzMU9wZc-JgY8/edit#gid=0]





[jira] [Updated] (KAFKA-9578) Kafka Tiered Storage - System Tests

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-9578:
-
Fix Version/s: (was: 3.2.0)

> Kafka Tiered Storage - System  Tests
> 
>
> Key: KAFKA-9578
> URL: https://issues.apache.org/jira/browse/KAFKA-9578
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Harsha
>Assignee: Kamal Chandraprakash
>Priority: Major
>
> Initial test cases set up by [~Ying Zheng] 
>  
> [https://docs.google.com/spreadsheets/d/1gS0s1FOmcjpKYXBddejXAoJAjEZ7AdEzMU9wZc-JgY8/edit#gid=0]





[jira] [Comment Edited] (KAFKA-9550) RemoteLogManager implementation

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517030#comment-17517030
 ] 

Bruno Cadonna edited comment on KAFKA-9550 at 4/4/22 6:40 PM:
--

Removing from the 3.2.0 release since code freeze has passed.


was (Author: cadonna):
Remove from the 3.2.0 release since code freeze has passed.

> RemoteLogManager implementation 
> 
>
> Key: KAFKA-9550
> URL: https://issues.apache.org/jira/browse/KAFKA-9550
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Satish Duggana
>Assignee: Satish Duggana
>Priority: Major
>
> Implementation of RLM as mentioned in the HLD section of KIP-405
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage#KIP-405:KafkaTieredStorage-High-leveldesign]





[jira] [Commented] (KAFKA-9550) RemoteLogManager implementation

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517030#comment-17517030
 ] 

Bruno Cadonna commented on KAFKA-9550:
--

Remove from the 3.2.0 release since code freeze has passed.

> RemoteLogManager implementation 
> 
>
> Key: KAFKA-9550
> URL: https://issues.apache.org/jira/browse/KAFKA-9550
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Satish Duggana
>Assignee: Satish Duggana
>Priority: Major
> Fix For: 3.2.0
>
>
> Implementation of RLM as mentioned in the HLD section of KIP-405
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage#KIP-405:KafkaTieredStorage-High-leveldesign]





[jira] [Updated] (KAFKA-9550) RemoteLogManager implementation

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-9550:
-
Fix Version/s: (was: 3.2.0)

> RemoteLogManager implementation 
> 
>
> Key: KAFKA-9550
> URL: https://issues.apache.org/jira/browse/KAFKA-9550
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Satish Duggana
>Assignee: Satish Duggana
>Priority: Major
>
> Implementation of RLM as mentioned in the HLD section of KIP-405
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage#KIP-405:KafkaTieredStorage-High-leveldesign]





[jira] [Updated] (KAFKA-12644) Add Missing Class-Level Javadoc to Descendants of org.apache.kafka.common.errors.ApiException

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-12644:
--
Fix Version/s: (was: 3.2.0)

> Add Missing Class-Level Javadoc to Descendants of 
> org.apache.kafka.common.errors.ApiException
> -
>
> Key: KAFKA-12644
> URL: https://issues.apache.org/jira/browse/KAFKA-12644
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, documentation
>Affects Versions: 2.8.1, 3.0.0
>Reporter: Israel Ekpo
>Assignee: Israel Ekpo
>Priority: Major
>  Labels: documentation
>
> I noticed that class-level Javadocs are missing from some classes in the 
> org.apache.kafka.common.errors package. This issue is for tracking the work 
> of adding the missing class-level javadocs for those Exception classes.
> https://kafka.apache.org/27/javadoc/org/apache/kafka/common/errors/package-summary.html
> https://github.com/apache/kafka/tree/trunk/clients/src/main/java/org/apache/kafka/common/errors
> Basic class-level documentation could be derived by mapping the error 
> conditions documented in the protocol
> https://kafka.apache.org/protocol#protocol_constants
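
As an illustration of the kind of class-level Javadoc being proposed (hypothetical wording for one existing class in the package):

{code:java}
package org.apache.kafka.common.errors;

/**
 * Indicates that the client is not authorized to access the requested topic.
 *
 * <p>Maps to the {@code TOPIC_AUTHORIZATION_FAILED} error code listed in the
 * protocol constants: https://kafka.apache.org/protocol#protocol_constants
 */
public class TopicAuthorizationException extends AuthorizationException {
    public TopicAuthorizationException(String message) {
        super(message);
    }
}
{code}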





[jira] [Commented] (KAFKA-12644) Add Missing Class-Level Javadoc to Descendants of org.apache.kafka.common.errors.ApiException

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517027#comment-17517027
 ] 

Bruno Cadonna commented on KAFKA-12644:
---

Removing from the 3.2.0 release since code freeze has passed. 

> Add Missing Class-Level Javadoc to Descendants of 
> org.apache.kafka.common.errors.ApiException
> -
>
> Key: KAFKA-12644
> URL: https://issues.apache.org/jira/browse/KAFKA-12644
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, documentation
>Affects Versions: 2.8.1, 3.0.0
>Reporter: Israel Ekpo
>Assignee: Israel Ekpo
>Priority: Major
>  Labels: documentation
> Fix For: 3.2.0
>
>
> I noticed that class-level Javadocs are missing from some classes in the 
> org.apache.kafka.common.errors package. This issue is for tracking the work 
> of adding the missing class-level javadocs for those Exception classes.
> https://kafka.apache.org/27/javadoc/org/apache/kafka/common/errors/package-summary.html
> https://github.com/apache/kafka/tree/trunk/clients/src/main/java/org/apache/kafka/common/errors
> Basic class-level documentation could be derived by mapping the error 
> conditions documented in the protocol
> https://kafka.apache.org/protocol#protocol_constants





[jira] [Updated] (KAFKA-9837) New RPC for notifying controller of failed replica

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-9837:
-
Fix Version/s: (was: 3.2.0)

> New RPC for notifying controller of failed replica
> --
>
> Key: KAFKA-9837
> URL: https://issues.apache.org/jira/browse/KAFKA-9837
> Project: Kafka
>  Issue Type: New Feature
>  Components: controller, core
>Reporter: David Arthur
>Assignee: dengziming
>Priority: Major
>  Labels: kip-500
>
> This is the tracking ticket for 
> [KIP-589|https://cwiki.apache.org/confluence/display/KAFKA/KIP-589+Add+API+to+update+Replica+state+in+Controller].
>  For the bridge release, brokers should no longer use ZooKeeper to notify the 
> controller that a log dir has failed. It should instead use an RPC mechanism.





[jira] [Commented] (KAFKA-9837) New RPC for notifying controller of failed replica

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517025#comment-17517025
 ] 

Bruno Cadonna commented on KAFKA-9837:
--

Removing from 3.2.0 release since code freeze has passed.

> New RPC for notifying controller of failed replica
> --
>
> Key: KAFKA-9837
> URL: https://issues.apache.org/jira/browse/KAFKA-9837
> Project: Kafka
>  Issue Type: New Feature
>  Components: controller, core
>Reporter: David Arthur
>Assignee: dengziming
>Priority: Major
>  Labels: kip-500
> Fix For: 3.2.0
>
>
> This is the tracking ticket for 
> [KIP-589|https://cwiki.apache.org/confluence/display/KAFKA/KIP-589+Add+API+to+update+Replica+state+in+Controller].
>  For the bridge release, brokers should no longer use ZooKeeper to notify the 
> controller that a log dir has failed. It should instead use an RPC mechanism.





[jira] [Updated] (KAFKA-13411) Exception while connecting from kafka client consumer producers deployed in a wildfly context to a kafka broker implementing OAUTHBEARER sasl mechanism

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-13411:
--
Fix Version/s: (was: 3.2.0)

> Exception while connecting from kafka client consumer producers deployed in a 
> wildfly context to a kafka broker implementing OAUTHBEARER sasl mechanism
> ---
>
> Key: KAFKA-13411
> URL: https://issues.apache.org/jira/browse/KAFKA-13411
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
> Environment: Windows, Linux , Wildfly Application server
>Reporter: Shankar Bhaskaran
>Priority: Major
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> I have set up a Kafka cluster on my Linux machine, secured using the Keycloak
> (OAUTHBEARER) mechanism. I can use the Kafka console consumers and
> producers to send and receive messages.
>  
> I have tried to connect to Kafka from my consumers and producers deployed
> as modules on the Wildfly app server (version 19, Java 11). I have set up
> all the required configuration (config section at the bottom).
> The SASL_JAAS_CONFIG provided as a consumer config option has the details
> like (apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule
> required LoginStringClaim_sub='kafka-client');
>  
> I am able to get authenticated with the broker, but in the client callback
> I am getting an Unsupported Callback error. I have 3 modules in Wildfly:
> 1) the Kafka producer/consumer code, dependent on 2) the OAuth jar (for
> the login callback handler and login module), dependent on 3) the kafka-client
> jar (2.8.0).
>  
> I can see that the client callback is CLIENTCREDENTIAL instead of
> OAuthBearerTokenCallback. The SASL client is getting set as
> AbstractSaslClient instead of OAuthBearerSaslClient.
> [https://www.mail-archive.com/dev@kafka.apache.org/msg120743.html]
>  
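One detail worth noting: in a working setup the login module class is fully qualified with the org. prefix. A minimal sketch of an OAUTHBEARER client configuration (illustrative values only; the callback handler class is a placeholder, and the exact settings depend on the authorization server):

{code:java}
import java.util.Properties;

public class OauthBearerClientConfig {
    public static Properties clientProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "OAUTHBEARER");
        // Fully qualified login module class (note the "org." prefix):
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required;");
        // Login callback handler that fetches tokens from the OAuth provider
        // (placeholder class name; must implement AuthenticateCallbackHandler):
        props.put("sasl.login.callback.handler.class", "com.example.OAuthTokenCallbackHandler");
        return props;
    }
}
{code}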





[jira] [Commented] (KAFKA-13411) Exception while connecting from kafka client consumer producers deployed in a wildfly context to a kafka broker implementing OAUTHBEARER sasl mechanism

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517024#comment-17517024
 ] 

Bruno Cadonna commented on KAFKA-13411:
---

Removing from the 3.2.0 release since code freeze has passed.

> Exception while connecting from kafka client consumer producers deployed in a 
> wildfly context to a kafka broker implementing OAUTHBEARER sasl mechanism
> ---
>
> Key: KAFKA-13411
> URL: https://issues.apache.org/jira/browse/KAFKA-13411
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
> Environment: Windows, Linux , Wildfly Application server
>Reporter: Shankar Bhaskaran
>Priority: Major
> Fix For: 3.2.0
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> I have set up a Kafka cluster on my Linux machine, secured using the Keycloak
> (OAUTHBEARER) mechanism. I can use the Kafka console consumers and
> producers to send and receive messages.
>  
> I have tried to connect to Kafka from my consumers and producers deployed
> as modules on the Wildfly app server (version 19, Java 11). I have set up
> all the required configuration (config section at the bottom).
> The SASL_JAAS_CONFIG provided as a consumer config option has the details
> like (apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule
> required LoginStringClaim_sub='kafka-client');
>  
> I am able to get authenticated with the broker, but in the client callback
> I am getting an Unsupported Callback error. I have 3 modules in Wildfly:
> 1) the Kafka producer/consumer code, dependent on 2) the OAuth jar (for
> the login callback handler and login module), dependent on 3) the kafka-client
> jar (2.8.0).
>  
> I can see that the client callback is CLIENTCREDENTIAL instead of
> OAuthBearerTokenCallback. The SASL client is getting set as
> AbstractSaslClient instead of OAuthBearerSaslClient.
> [https://www.mail-archive.com/dev@kafka.apache.org/msg120743.html]
>  





[jira] [Updated] (KAFKA-13331) Slow reassignments in 2.8 because of large number of UpdateMetadataResponseReceived(UpdateMetadataResponseData(errorCode=0),) Events

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-13331:
--
Fix Version/s: (was: 3.2.0)

> Slow reassignments in 2.8 because of large number of  
> UpdateMetadataResponseReceived(UpdateMetadataResponseData(errorCode=0),)
>  Events
> 
>
> Key: KAFKA-13331
> URL: https://issues.apache.org/jira/browse/KAFKA-13331
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, controller, core
>Affects Versions: 2.8.1, 3.0.0
>Reporter: GEORGE LI
>Priority: Critical
> Attachments: Screen Shot 2021-09-28 at 12.57.34 PM.png
>
>
> Slowness is observed when doing reassignments on clusters with a larger number 
> of brokers (e.g. 80 brokers).
> After investigation, it looks like the slowness occurs because, for 
> reassignments, the controller sends an UpdateMetadataRequest to all the brokers 
> for every topic partition affected by the reassignment (maybe some optimization 
> can be done). For example:
> For a reassignment with a batch size of 50 partitions, it will generate about 
> 10k-20k ControllerEventManager.EventQueueSize and the p99 EventQueueTimeMs 
> will be 1M. If the batch size is 100 partitions, it is about 40K 
> EventQueueSize and 3M p99 EventQueueTimeMs. See the screenshot of the 
> metrics below.
>  !Screen Shot 2021-09-28 at 12.57.34 PM.png! 
> It takes about 10-30 minutes to process 100 reassignments per batch, and 
> 20-30 seconds for 1 reassignment per batch even when the topic partition is 
> almost empty. In version 1.1, the reassignment is almost instant.
> Looking at what is in the ControllerEventManager.EventQueue, the majority 
> (depending on how many brokers are in the cluster, it can be 90%+) is 
> {{UpdateMetadataResponseReceived(UpdateMetadataResponseData(errorCode=0),)}}
>  events, which were introduced in this commit:
> {code}
> commit 4e431246c31170a7f632da8edfdb9cf4f882f6ef
> Author: Jason Gustafson 
> Date:   Thu Nov 21 07:41:29 2019 -0800
> 
>     MINOR: Controller should log UpdateMetadata response errors (#7717)
> 
>     Create a controller event for handling UpdateMetadata responses and log a message when a response contains an error.
> 
>     Reviewers: Stanislav Kozlovski , Ismael Juma 
> {code}
> Checking how the events in the ControllerEventManager are processed for the 
> {{UpdateMetadata response}}, it seems like the handler only checks whether 
> there is an error and simply logs it, but it takes about 3ms-60ms to 
> dequeue each event. Because it is a FIFO queue, other events are left waiting 
> in the queue.
> {code}
>   private def processUpdateMetadataResponseReceived(updateMetadataResponse: UpdateMetadataResponse, brokerId: Int): Unit = {
>     if (!isActive) return
>     if (updateMetadataResponse.error != Errors.NONE) {
>       stateChangeLogger.error(s"Received error ${updateMetadataResponse.error} in UpdateMetadata " +
>         s"response $updateMetadataResponse from broker $brokerId")
>     }
>   }
> {code}
> There might be more sophisticated logic for handling the UpdateMetadata 
> response error in the future. For the current version, would it be better to 
> check whether the response error code is Errors.NONE before putting the event 
> into the Event Queue? I added this check and saw reassignment performance 
> increase dramatically on the 80-broker cluster.
> {code}
>   val updateMetadataRequestBuilder = new UpdateMetadataRequest.Builder(updateMetadataRequestVersion,
>     controllerId, controllerEpoch, brokerEpoch, partitionStates.asJava, liveBrokers.asJava, topicIds.asJava)
>   sendRequest(broker, updateMetadataRequestBuilder, (r: AbstractResponse) => {
>     val updateMetadataResponse = r.asInstanceOf[UpdateMetadataResponse]
>     if (updateMetadataResponse.error != Errors.NONE) {  // <-- additional check on the response error code: when there is no error (almost 99.99% of cases), skip adding updateMetadataResponse to the Event Queue
>       sendEvent(UpdateMetadataResponseReceived(updateMetadataResponse, broker))
>     }
>   })
> {code}





[jira] [Commented] (KAFKA-13331) Slow reassignments in 2.8 because of large number of UpdateMetadataResponseReceived(UpdateMetadataResponseData(errorCode=0),) Events

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517022#comment-17517022
 ] 

Bruno Cadonna commented on KAFKA-13331:
---

Removing from the 3.2.0 release since code freeze has passed.

> Slow reassignments in 2.8 because of large number of  
> UpdateMetadataResponseReceived(UpdateMetadataResponseData(errorCode=0),)
>  Events
> 
>
> Key: KAFKA-13331
> URL: https://issues.apache.org/jira/browse/KAFKA-13331
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, controller, core
>Affects Versions: 2.8.1, 3.0.0
>Reporter: GEORGE LI
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: Screen Shot 2021-09-28 at 12.57.34 PM.png
>
>
> Slowness is observed when doing reassignments on clusters with a larger number 
> of brokers (e.g. 80 brokers).
> After investigation, it looks like the slowness occurs because, for 
> reassignments, the controller sends an UpdateMetadataRequest to all the brokers 
> for every topic partition affected by the reassignment (maybe some optimization 
> can be done). For example:
> For a reassignment with a batch size of 50 partitions, it will generate about 
> 10k-20k ControllerEventManager.EventQueueSize and the p99 EventQueueTimeMs 
> will be 1M. If the batch size is 100 partitions, it is about 40K 
> EventQueueSize and 3M p99 EventQueueTimeMs. See the screenshot of the 
> metrics below.
>  !Screen Shot 2021-09-28 at 12.57.34 PM.png! 
> It takes about 10-30 minutes to process 100 reassignments per batch, and 
> 20-30 seconds for 1 reassignment per batch even when the topic partition is 
> almost empty. In version 1.1, the reassignment is almost instant.
> Looking at what is in the ControllerEventManager.EventQueue, the majority 
> (depending on how many brokers are in the cluster, it can be 90%+) is 
> {{UpdateMetadataResponseReceived(UpdateMetadataResponseData(errorCode=0),)}}
>  events, which were introduced in this commit:
> {code}
> commit 4e431246c31170a7f632da8edfdb9cf4f882f6ef
> Author: Jason Gustafson 
> Date:   Thu Nov 21 07:41:29 2019 -0800
> 
>     MINOR: Controller should log UpdateMetadata response errors (#7717)
> 
>     Create a controller event for handling UpdateMetadata responses and log a message when a response contains an error.
> 
>     Reviewers: Stanislav Kozlovski , Ismael Juma 
> {code}
> Checking how the events in the ControllerEventManager are processed for the 
> {{UpdateMetadata response}}, it seems like the handler only checks whether 
> there is an error and simply logs it, but it takes about 3ms-60ms to 
> dequeue each event. Because it is a FIFO queue, other events are left waiting 
> in the queue.
> {code}
>   private def processUpdateMetadataResponseReceived(updateMetadataResponse: UpdateMetadataResponse, brokerId: Int): Unit = {
>     if (!isActive) return
>     if (updateMetadataResponse.error != Errors.NONE) {
>       stateChangeLogger.error(s"Received error ${updateMetadataResponse.error} in UpdateMetadata " +
>         s"response $updateMetadataResponse from broker $brokerId")
>     }
>   }
> {code}
> There might be more sophisticated logic for handling the UpdateMetadata 
> response error in the future. For the current version, would it be better to 
> check whether the response error code is Errors.NONE before putting the event 
> into the Event Queue? I added this check and saw reassignment performance 
> increase dramatically on the 80-broker cluster.
> {code}
>   val updateMetadataRequestBuilder = new UpdateMetadataRequest.Builder(updateMetadataRequestVersion,
>     controllerId, controllerEpoch, brokerEpoch, partitionStates.asJava, liveBrokers.asJava, topicIds.asJava)
>   sendRequest(broker, updateMetadataRequestBuilder, (r: AbstractResponse) => {
>     val updateMetadataResponse = r.asInstanceOf[UpdateMetadataResponse]
>     if (updateMetadataResponse.error != Errors.NONE) {  // <-- additional check on the response error code: when there is no error (almost 99.99% of cases), skip adding updateMetadataResponse to the Event Queue
>       sendEvent(UpdateMetadataResponseReceived(updateMetadataResponse, broker))
>     }
>   })
> {code}





[jira] [Commented] (KAFKA-12878) Support --bootstrap-server kafka-streams-application-reset

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516996#comment-17516996
 ] 

Bruno Cadonna commented on KAFKA-12878:
---

Removing from the 3.2.0 release since code freeze has passed.

> Support --bootstrap-server kafka-streams-application-reset
> --
>
> Key: KAFKA-12878
> URL: https://issues.apache.org/jira/browse/KAFKA-12878
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, tools
>Reporter: Neil Buesing
>Assignee: Neil Buesing
>Priority: Major
>  Labels: needs-kip, newbie
>
> kafka-streams-application-reset still uses --bootstrap-servers, align with 
> other tools that use --bootstrap-server.





[jira] [Updated] (KAFKA-12878) Support --bootstrap-server kafka-streams-application-reset

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-12878:
--
Fix Version/s: (was: 3.2.0)

> Support --bootstrap-server kafka-streams-application-reset
> --
>
> Key: KAFKA-12878
> URL: https://issues.apache.org/jira/browse/KAFKA-12878
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, tools
>Reporter: Neil Buesing
>Assignee: Neil Buesing
>Priority: Major
>  Labels: needs-kip
>
> kafka-streams-application-reset still uses --bootstrap-servers, align with 
> other tools that use --bootstrap-server.





[jira] [Updated] (KAFKA-12878) Support --bootstrap-server kafka-streams-application-reset

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-12878:
--
Priority: Major  (was: Critical)

> Support --bootstrap-server kafka-streams-application-reset
> --
>
> Key: KAFKA-12878
> URL: https://issues.apache.org/jira/browse/KAFKA-12878
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, tools
>Reporter: Neil Buesing
>Assignee: Neil Buesing
>Priority: Major
>  Labels: needs-kip
> Fix For: 3.2.0
>
>
> kafka-streams-application-reset still uses --bootstrap-servers, align with 
> other tools that use --bootstrap-server.





[jira] [Updated] (KAFKA-12878) Support --bootstrap-server kafka-streams-application-reset

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-12878:
--
Labels: needs-kip newbie  (was: needs-kip)

> Support --bootstrap-server kafka-streams-application-reset
> --
>
> Key: KAFKA-12878
> URL: https://issues.apache.org/jira/browse/KAFKA-12878
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, tools
>Reporter: Neil Buesing
>Assignee: Neil Buesing
>Priority: Major
>  Labels: needs-kip, newbie
>
> kafka-streams-application-reset still uses --bootstrap-servers, align with 
> other tools that use --bootstrap-server.





[jira] [Updated] (KAFKA-12878) Support --bootstrap-server kafka-streams-application-reset

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-12878:
--
Labels: needs-kip  (was: )

> Support --bootstrap-server kafka-streams-application-reset
> --
>
> Key: KAFKA-12878
> URL: https://issues.apache.org/jira/browse/KAFKA-12878
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, tools
>Reporter: Neil Buesing
>Assignee: Neil Buesing
>Priority: Critical
>  Labels: needs-kip
> Fix For: 3.2.0
>
>
> kafka-streams-application-reset still uses --bootstrap-servers, align with 
> other tools that use --bootstrap-server.





[jira] [Updated] (KAFKA-12878) Support --bootstrap-server kafka-streams-application-reset

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-12878:
--
Component/s: streams

> Support --bootstrap-server kafka-streams-application-reset
> --
>
> Key: KAFKA-12878
> URL: https://issues.apache.org/jira/browse/KAFKA-12878
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, tools
>Reporter: Neil Buesing
>Assignee: Neil Buesing
>Priority: Critical
> Fix For: 3.2.0
>
>
> kafka-streams-application-reset still uses --bootstrap-servers, align with 
> other tools that use --bootstrap-server.





[jira] [Created] (KAFKA-13797) Adding metric to indicate metadata response outgoing bytes rate

2022-04-04 Thread Lucas Wang (Jira)
Lucas Wang created KAFKA-13797:
--

 Summary: Adding metric to indicate metadata response outgoing 
bytes rate
 Key: KAFKA-13797
 URL: https://issues.apache.org/jira/browse/KAFKA-13797
 Project: Kafka
  Issue Type: Improvement
Reporter: Lucas Wang


It's not a common case, but we experienced the following problem in one of our 
clusters.

The use case involves dynamically creating and deleting topics in the cluster. 
The clients were constantly checking whether a topic exists by issuing the 
special type of Metadata request whose topics field is null, which retrieves 
all topics, and then looking for the topic in the result.

A high rate of such Metadata requests generated a heavy load on brokers in the 
cluster. 

Yet, currently, there is no metric to indicate the metadata response outgoing 
bytes rate.

We propose to add such a metric in order to make the troubleshooting of such 
cases easier.
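
For illustration, the client-side pattern that generated the load looks roughly like this (a sketch; topic and bootstrap values are placeholders):

{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;

public class TopicExistsCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            // Expensive pattern: a full-metadata request (topics field null on
            // the wire) returns every topic in the cluster just to test one name.
            boolean exists = admin.listTopics().names().get().contains("my-topic");
            System.out.println("exists: " + exists);

            // Cheaper alternative: ask about the single topic; the future fails
            // with UnknownTopicOrPartitionException if it does not exist.
            admin.describeTopics(Collections.singletonList("my-topic")).all().get();
        }
    }
}
{code}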





[GitHub] [kafka] C0urante commented on a diff in pull request #11974: KAFKA-13763 (1): Improve unit testing coverage and flexibility for IncrementalCooperativeAssignor

2022-04-04 Thread GitBox


C0urante commented on code in PR #11974:
URL: https://github.com/apache/kafka/pull/11974#discussion_r842022697


##
connect/runtime/src/test/java/org/apache/kafka/connect/runtime/distributed/IncrementalCooperativeAssignorTest.java:
##
@@ -1275,89 +1136,68 @@ private static ClusterConfigState clusterConfigState(long offset,
                                                      int connectorNum,
                                                      int taskNum) {
         int connectorNumEnd = connectorStart + connectorNum - 1;
+
+        Map<String, Integer> connectorTaskCounts = fillMap(connectorStart, connectorNumEnd, i -> "connector" + i, () -> taskNum);
+        Map<String, Map<String, String>> connectorConfigs = fillMap(connectorStart, connectorNumEnd, i -> "connector" + i, HashMap::new);
+        Map<String, TargetState> connectorTargetStates = fillMap(connectorStart, connectorNumEnd, i -> "connector" + i, () -> TargetState.STARTED);
+        Map<ConnectorTaskId, Map<String, String>> taskConfigs = fillMap(
+                0,
+                connectorNum * taskNum,
+                i -> new ConnectorTaskId("connector" + i / connectorNum + 1, i),
+                HashMap::new
+        );
+
         return new ClusterConfigState(
                 offset,
                 null,
-                connectorTaskCounts(connectorStart, connectorNumEnd, taskNum),
-                connectorConfigs(connectorStart, connectorNumEnd),
-                connectorTargetStates(connectorStart, connectorNumEnd, TargetState.STARTED),
-                taskConfigs(0, connectorNum, connectorNum * taskNum),
+                connectorTaskCounts,
+                connectorConfigs,
+                connectorTargetStates,
+                taskConfigs,
                 Collections.emptySet());
     }
 
-    private static Map<String, ExtendedWorkerState> memberConfigs(String givenLeader,
-                                                                  long givenOffset,
-                                                                  Map<String, ExtendedAssignment> givenAssignments) {
-        return givenAssignments.entrySet().stream()
-                .collect(Collectors.toMap(
-                        Map.Entry::getKey,
-                        e -> new ExtendedWorkerState(expectedLeaderUrl(givenLeader), givenOffset, e.getValue())));
-    }
-
-    private static Map<String, ExtendedWorkerState> memberConfigs(String givenLeader,
+    private Map<String, ExtendedWorkerState> memberConfigs(String givenLeader,
                                                                   long givenOffset,
                                                                   int start,
                                                                   int connectorNum) {
-        return IntStream.range(start, connectorNum + 1)
-                .mapToObj(i -> new SimpleEntry<>("worker" + i, new ExtendedWorkerState(expectedLeaderUrl(givenLeader), givenOffset, null)))
-                .collect(Collectors.toMap(SimpleEntry::getKey, SimpleEntry::getValue));
-    }
-
-    private static Map<String, Integer> connectorTaskCounts(int start,
-                                                            int connectorNum,
-                                                            int taskCounts) {
-        return IntStream.range(start, connectorNum + 1)
-                .mapToObj(i -> new SimpleEntry<>("connector" + i, taskCounts))
-                .collect(Collectors.toMap(SimpleEntry::getKey, SimpleEntry::getValue));
-    }
-
-    private static Map<String, Map<String, String>> connectorConfigs(int start, int connectorNum) {
-        return IntStream.range(start, connectorNum + 1)
-                .mapToObj(i -> new SimpleEntry<>("connector" + i, new HashMap<>()))
-                .collect(Collectors.toMap(SimpleEntry::getKey, SimpleEntry::getValue));
-    }
-
-    private static Map<String, TargetState> connectorTargetStates(int start,
-                                                                  int connectorNum,
-                                                                  TargetState state) {
-        return IntStream.range(start, connectorNum + 1)
-                .mapToObj(i -> new SimpleEntry<>("connector" + i, state))
-                .collect(Collectors.toMap(SimpleEntry::getKey, SimpleEntry::getValue));
+        return fillMap(

Review Comment:
   Agree that it's more readable, and in some ways it's actually more flexible 
since you can now specify an arbitrary set of worker names instead of having 
them generated for you. Done  



##
connect/runtime/src/test/java/org/apache/kafka/connect/runtime/distributed/IncrementalCooperativeAssignorTest.java:
##
@@ -1275,89 +1136,68 @@ private static ClusterConfigState clusterConfigState(long offset,
                                                      int connectorNum,
                                                      int taskNum) {
         int connectorNumEnd = connectorStart + connectorNum - 1;
+
+        Map<String, Integer> connectorTaskCounts = fillMap(connectorStart, connectorNumEnd, i -> "connector" + i, () -> taskNum);
+        Map<String, Map<String, String>> connectorConfigs = fillMap(connectorStart, connectorNumEnd, i -> 
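
For reference, a plausible shape for the `fillMap` helper these hunks rely on, inferred purely from its call sites (hypothetical; the actual helper in the PR may differ):

```java
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class FillMapSketch {
    // Builds a map from generated keys to freshly supplied values, matching
    // call sites like fillMap(connectorStart, connectorNumEnd, i -> "connector" + i, () -> taskNum).
    static <K, V> Map<K, V> fillMap(int start, int end, Function<Integer, K> keyFn, Supplier<V> valueSupplier) {
        return IntStream.rangeClosed(start, end)
                .boxed()
                .collect(Collectors.toMap(keyFn, i -> valueSupplier.get()));
    }
}
```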

[jira] [Commented] (KAFKA-7509) Kafka Connect logs unnecessary warnings about unused configurations

2022-04-04 Thread Chris Egerton (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516992#comment-17516992
 ] 

Chris Egerton commented on KAFKA-7509:
--

[~mjsax] thanks for taking a look. RE {{connect.kafka.cluster.id}} – this 
property is already passed in automatically by 
[ConnectUtils::addMetricsContextProperties|https://github.com/apache/kafka/blob/74909e000aaab1f0300ae2e918a6fa361e38078e/connect/runtime/src/main/java/org/apache/kafka/connect/util/ConnectUtils.java#L144-L153],
 which is used all over the place in Connect (see 
[here|https://github.com/apache/kafka/blob/74909e000aaab1f0300ae2e918a6fa361e38078e/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L664],
 
[here|https://github.com/apache/kafka/blob/74909e000aaab1f0300ae2e918a6fa361e38078e/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L697],
 and 
[here|https://github.com/apache/kafka/blob/74909e000aaab1f0300ae2e918a6fa361e38078e/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L742]
 for some cases). The changes in the PR don't add any new config properties, 
they just cause the ones injected by that method to be ignored when logging 
unused properties.

I can try to take a stab at this with Streams, will probably file a separate PR 
for that if it looks promising.

> Kafka Connect logs unnecessary warnings about unused configurations
> ---
>
> Key: KAFKA-7509
> URL: https://issues.apache.org/jira/browse/KAFKA-7509
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Randall Hauch
>Assignee: Chris Egerton
>Priority: Major
>
> When running Connect, the logs contain quite a few warnings about "The 
> configuration '{}' was supplied but isn't a known config." This occurs when 
> Connect creates producers, consumers, and admin clients, because the 
> AbstractConfig is logging unused configuration properties upon construction. 
> It's complicated by the fact that the Producer, Consumer, and AdminClient all 
> create their own AbstractConfig instances within the constructor, so we can't 
> even call its {{ignore(String key)}} method.
> See also KAFKA-6793 for a similar issue with Streams.
> There are no arguments in the Producer, Consumer, or AdminClient constructors 
> to control  whether the configs log these warnings, so a simpler workaround 
> is to only pass those configuration properties to the Producer, Consumer, and 
> AdminClient that the ProducerConfig, ConsumerConfig, and AdminClientConfig 
> configdefs know about.
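
A sketch of that workaround (ProducerConfig.configNames() is an existing helper; the wrapper class here is hypothetical):

{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.producer.ProducerConfig;

public class KnownConfigFilter {
    // Keep only the properties that the producer's ConfigDef declares, so the
    // ProducerConfig built from them has no "unused" entries to warn about.
    public static Map<String, Object> producerKnownConfigs(Map<String, Object> workerProps) {
        Map<String, Object> filtered = new HashMap<>();
        for (Map.Entry<String, Object> entry : workerProps.entrySet()) {
            if (ProducerConfig.configNames().contains(entry.getKey())) {
                filtered.put(entry.getKey(), entry.getValue());
            }
        }
        return filtered;
    }
}
{code}

The same filtering would apply via ConsumerConfig.configNames() and AdminClientConfig.configNames(); one caveat is that legitimate pass-through properties (for example, configs read by custom interceptors) would need to be preserved separately.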





[jira] [Updated] (KAFKA-12291) Fix Ignored Upgrade Tests in streams_upgrade_test.py: test_upgrade_downgrade_brokers

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-12291:
--
Fix Version/s: (was: 3.2.0)

> Fix Ignored Upgrade Tests in streams_upgrade_test.py: 
> test_upgrade_downgrade_brokers
> 
>
> Key: KAFKA-12291
> URL: https://issues.apache.org/jira/browse/KAFKA-12291
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, system tests
>Reporter: Bruno Cadonna
>Priority: Critical
>
> Fix in the oldest branch that ignores the test and cherry-pick forward.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (KAFKA-12291) Fix Ignored Upgrade Tests in streams_upgrade_test.py: test_upgrade_downgrade_brokers

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516987#comment-17516987
 ] 

Bruno Cadonna commented on KAFKA-12291:
---

Removing from release 3.2.0 since code freeze has passed.

> Fix Ignored Upgrade Tests in streams_upgrade_test.py: 
> test_upgrade_downgrade_brokers
> 
>
> Key: KAFKA-12291
> URL: https://issues.apache.org/jira/browse/KAFKA-12291
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, system tests
>Reporter: Bruno Cadonna
>Priority: Critical
> Fix For: 3.2.0
>
>
> Fix in the oldest branch that ignores the test and cherry-pick forward.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (KAFKA-13474) Regression in dynamic update of broker certificate

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-13474:
--
Fix Version/s: 3.3.0
   (was: 3.2.0)

> Regression in dynamic update of broker certificate
> --
>
> Key: KAFKA-13474
> URL: https://issues.apache.org/jira/browse/KAFKA-13474
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.7.0, 2.7.2, 2.8.1, 3.0.0
>Reporter: Igor Shipenkov
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: failed-controller-single-session-2029.pcap.gz
>
>
> h1. Problem
> It seems that, after updating a listener SSL certificate via a dynamic broker 
> configuration update, the old certificate is somehow still used for the 
> broker's client-side SSL factory. Because of this, the broker fails to create 
> a new connection to the controller after the old certificate expires.
> h1. History
> Back in KAFKA-8336 there was an issue where the client-side SSL factory wasn't 
> updating the certificate when it was changed with dynamic configuration. That 
> bug has been fixed in version 2.3, and I can confirm that dynamic update 
> worked for us with Kafka 2.4. But now we have updated clusters to 2.7 and see 
> this (or at least a similar) problem again.
> h1. Affected versions
> We first saw this on Confluent 6.1.2, which (I think) is based on Kafka 
> 2.7.0. Then I tried vanilla versions 2.7.0 and 2.7.2 and can reproduce the 
> problem on them just fine.
> h1. How to reproduce
>  * Have zookeeper somewhere (in my example it will be "10.88.0.21:2181").
>  * Get vanilla version 2.7.2 (or 2.7.0) from 
> [https://kafka.apache.org/downloads] .
>  * Make basic broker config like this (don't forget to actually create 
> log.dirs):
> {code:none}
> broker.id=1
> listeners=SSL://:9092
> advertised.listeners=SSL://localhost:9092
> log.dirs=/tmp/broker1/data
> zookeeper.connect=10.88.0.21:2181
> security.inter.broker.protocol=SSL
> ssl.protocol=TLSv1.2
> ssl.client.auth=required
> ssl.endpoint.identification.algorithm=
> ssl.keystore.type=PKCS12
> ssl.keystore.location=/tmp/broker1/secrets/broker1.keystore.p12
> ssl.keystore.password=changeme1
> ssl.key.password=changeme1
> ssl.truststore.type=PKCS12
> ssl.truststore.location=/tmp/broker1/secrets/truststore.p12
> ssl.truststore.password=changeme
> {code}
> (I use TLS 1.2 here just so I can see the client certificate in the TLS 
> handshake in a traffic dump; you will get the same error with the default 
> TLS 1.3 too)
>  ** Repeat this config for another 2 brokers, changing id, listener port and 
> certificate accordingly.
>  * Make basic client config (I use for it one of brokers' certificates):
> {code:none}
> security.protocol=SSL
> ssl.key.password=changeme1
> ssl.keystore.type=PKCS12
> ssl.keystore.location=/tmp/broker1/secrets/broker1.keystore.p12
> ssl.keystore.password=changeme1
> ssl.truststore.type=PKCS12
> ssl.truststore.location=/tmp/broker1/secrets/truststore.p12
> ssl.truststore.password=changeme
> ssl.endpoint.identification.algorithm=
> {code}
>  * Create usual local self-signed PKI for test
>  ** generate self-signed CA certificate and private key. Place certificate in 
> truststore.
>  ** create keys for broker certificates and create requests from them as 
> usual (I'll use here same subject for all brokers)
>  ** create 2 certificates as usual
> {code:bash}
> openssl x509 \
>-req -CAcreateserial -days 1 \
>-CA ca/ca-cert.pem -CAkey ca/ca-key.pem \
>-in broker1.csr -out broker1.crt
> {code}
>  ** Use "faketime" utility to make third certificate expire soon:
> {code:bash}
> # date here is some point yesterday, so certificate will expire like 10-15 
> minutes from now
> faketime "2021-11-23 10:15" openssl x509 \
>-req -CAcreateserial -days 1 \
>-CA ca/ca-cert.pem -CAkey ca/ca-key.pem \
>-in broker2.csr -out broker2.crt
> {code}
>  ** create keystores from certificates and place them according to broker 
> configs from earlier
>  * Run 3 brokers with your configs like
> {code:bash}
> ./bin/kafka-server-start.sh server2.properties
> {code}
> (I start it here without daemon mode to see logs right on terminal - just use 
> "tmux" or something to run 3 brokers simultaneously)
>  ** you can check that one broker certificate will expire soon with
> {code:bash}
> openssl s_client -connect localhost:9093  -text | grep -A2 Valid
> {code}
>  * Issue new certificate to replace one, which will expire soon, place it in 
> keystore and replace old keystore with it.
>  * Use dynamic configuration to make broker re-read keystore:
> {code:bash}
> ./bin/kafka-configs --command-config ssl.properties --bootstrap-server 
> localhost:9092 --entity-type brokers --entity-name "2" --alter --add-config 
> 

[jira] [Commented] (KAFKA-13474) Regression in dynamic update of broker certificate

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516984#comment-17516984
 ] 

Bruno Cadonna commented on KAFKA-13474:
---

Moving to the next release since code freeze for 3.2.0 has passed.

> Regression in dynamic update of broker certificate
> --
>
> Key: KAFKA-13474
> URL: https://issues.apache.org/jira/browse/KAFKA-13474
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.7.0, 2.7.2, 2.8.1, 3.0.0
>Reporter: Igor Shipenkov
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: failed-controller-single-session-2029.pcap.gz
>
>
> h1. Problem
> It seems that, after updating a listener SSL certificate via a dynamic broker 
> configuration update, the old certificate is somehow still used for the 
> broker's client-side SSL factory. Because of this, the broker fails to create 
> a new connection to the controller after the old certificate expires.
> h1. History
> Back in KAFKA-8336 there was an issue where the client-side SSL factory wasn't 
> updating the certificate when it was changed with dynamic configuration. That 
> bug has been fixed in version 2.3, and I can confirm that dynamic update 
> worked for us with Kafka 2.4. But now we have updated clusters to 2.7 and see 
> this (or at least a similar) problem again.
> h1. Affected versions
> We first saw this on Confluent 6.1.2, which (I think) is based on Kafka 
> 2.7.0. Then I tried vanilla versions 2.7.0 and 2.7.2 and can reproduce the 
> problem on them just fine.
> h1. How to reproduce
>  * Have zookeeper somewhere (in my example it will be "10.88.0.21:2181").
>  * Get vanilla version 2.7.2 (or 2.7.0) from 
> [https://kafka.apache.org/downloads] .
>  * Make basic broker config like this (don't forget to actually create 
> log.dirs):
> {code:none}
> broker.id=1
> listeners=SSL://:9092
> advertised.listeners=SSL://localhost:9092
> log.dirs=/tmp/broker1/data
> zookeeper.connect=10.88.0.21:2181
> security.inter.broker.protocol=SSL
> ssl.protocol=TLSv1.2
> ssl.client.auth=required
> ssl.endpoint.identification.algorithm=
> ssl.keystore.type=PKCS12
> ssl.keystore.location=/tmp/broker1/secrets/broker1.keystore.p12
> ssl.keystore.password=changeme1
> ssl.key.password=changeme1
> ssl.truststore.type=PKCS12
> ssl.truststore.location=/tmp/broker1/secrets/truststore.p12
> ssl.truststore.password=changeme
> {code}
> (I use TLS 1.2 here just so I can see the client certificate in the TLS 
> handshake in a traffic dump; you will get the same error with the default 
> TLS 1.3 too)
>  ** Repeat this config for another 2 brokers, changing id, listener port and 
> certificate accordingly.
>  * Make basic client config (I use for it one of brokers' certificates):
> {code:none}
> security.protocol=SSL
> ssl.key.password=changeme1
> ssl.keystore.type=PKCS12
> ssl.keystore.location=/tmp/broker1/secrets/broker1.keystore.p12
> ssl.keystore.password=changeme1
> ssl.truststore.type=PKCS12
> ssl.truststore.location=/tmp/broker1/secrets/truststore.p12
> ssl.truststore.password=changeme
> ssl.endpoint.identification.algorithm=
> {code}
>  * Create usual local self-signed PKI for test
>  ** generate self-signed CA certificate and private key. Place certificate in 
> truststore.
>  ** create keys for broker certificates and create requests from them as 
> usual (I'll use here same subject for all brokers)
>  ** create 2 certificates as usual
> {code:bash}
> openssl x509 \
>-req -CAcreateserial -days 1 \
>-CA ca/ca-cert.pem -CAkey ca/ca-key.pem \
>-in broker1.csr -out broker1.crt
> {code}
>  ** Use "faketime" utility to make third certificate expire soon:
> {code:bash}
> # date here is some point yesterday, so certificate will expire like 10-15 
> minutes from now
> faketime "2021-11-23 10:15" openssl x509 \
>-req -CAcreateserial -days 1 \
>-CA ca/ca-cert.pem -CAkey ca/ca-key.pem \
>-in broker2.csr -out broker2.crt
> {code}
>  ** create keystores from certificates and place them according to broker 
> configs from earlier
>  * Run 3 brokers with your configs like
> {code:bash}
> ./bin/kafka-server-start.sh server2.properties
> {code}
> (I start it here without daemon mode to see logs right on terminal - just use 
> "tmux" or something to run 3 brokers simultaneously)
>  ** you can check that one broker certificate will expire soon with
> {code:bash}
> openssl s_client -connect localhost:9093  -text | grep -A2 Valid
> {code}
>  * Issue new certificate to replace one, which will expire soon, place it in 
> keystore and replace old keystore with it.
>  * Use dynamic configuration to make broker re-read keystore:
> {code:bash}
> ./bin/kafka-configs --command-config ssl.properties --bootstrap-server 
> localhost:9092 --entity-type brokers --entity-name "2" --alter --add-config 
> 

[jira] [Updated] (KAFKA-13159) Enable system tests for transactions in KRaft mode

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-13159:
--
Fix Version/s: 3.2.0

> Enable system tests for transactions in KRaft mode
> --
>
> Key: KAFKA-13159
> URL: https://issues.apache.org/jira/browse/KAFKA-13159
> Project: Kafka
>  Issue Type: Test
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Critical
> Fix For: 3.2.0
>
>
> Previously, we disabled several system tests involving transactions in KRaft 
> mode. Now that KIP-730 is complete and transactions work in KRaft, we need to 
> re-enable these tests.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (KAFKA-7509) Kafka Connect logs unnecessary warnings about unused configurations

2022-04-04 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516979#comment-17516979
 ] 

Matthias J. Sax commented on KAFKA-7509:


Thanks for following up. – I briefly took a look. My first impression is that 
it might work with KS.

Adding `connect.kafka.cluster.id` would still require a KIP – it seems that 
adding a config and avoiding the warnings are two different things though – it 
might be good to extract the changes for the logging into an independent PR (and 
maybe add using it in KS right away to verify that it works)?

> Kafka Connect logs unnecessary warnings about unused configurations
> ---
>
> Key: KAFKA-7509
> URL: https://issues.apache.org/jira/browse/KAFKA-7509
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Randall Hauch
>Assignee: Chris Egerton
>Priority: Major
>
> When running Connect, the logs contain quite a few warnings about "The 
> configuration '{}' was supplied but isn't a known config." This occurs when 
> Connect creates producers, consumers, and admin clients, because the 
> AbstractConfig is logging unused configuration properties upon construction. 
> It's complicated by the fact that the Producer, Consumer, and AdminClient all 
> create their own AbstractConfig instances within the constructor, so we can't 
> even call its {{ignore(String key)}} method.
> See also KAFKA-6793 for a similar issue with Streams.
> There are no arguments in the Producer, Consumer, or AdminClient constructors 
> to control whether the configs log these warnings, so a simpler workaround 
> is to only pass those configuration properties to the Producer, Consumer, and 
> AdminClient that the ProducerConfig, ConsumerConfig, and AdminClientConfig 
> configdefs know about.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (KAFKA-13159) Enable system tests for transactions in KRaft mode

2022-04-04 Thread David Arthur (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Arthur resolved KAFKA-13159.
--
Resolution: Fixed

Resolving as these tests have been enabled on trunk for a while now: 
https://github.com/apache/kafka/blob/trunk/tests/kafkatest/tests/core/transactions_test.py#L250

> Enable system tests for transactions in KRaft mode
> --
>
> Key: KAFKA-13159
> URL: https://issues.apache.org/jira/browse/KAFKA-13159
> Project: Kafka
>  Issue Type: Test
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Critical
>
> Previously, we disabled several system tests involving transactions in KRaft 
> mode. Now that KIP-730 is complete and transactions work in KRaft, we need to 
> re-enable these tests.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [kafka] mimaison commented on a diff in pull request #11974: KAFKA-13763 (1): Improve unit testing coverage and flexibility for IncrementalCooperativeAssignor

2022-04-04 Thread GitBox


mimaison commented on code in PR #11974:
URL: https://github.com/apache/kafka/pull/11974#discussion_r841936725


##
connect/runtime/src/test/java/org/apache/kafka/connect/runtime/distributed/IncrementalCooperativeAssignorTest.java:
##
@@ -34,25 +34,30 @@
 import org.mockito.junit.MockitoJUnit;
 import org.mockito.junit.MockitoRule;
 
-import java.util.AbstractMap.SimpleEntry;
 import java.util.ArrayList;
 import java.util.Arrays;
+import java.util.Collection;
 import java.util.Collections;
 import java.util.HashMap;
 import java.util.HashSet;
 import java.util.List;
 import java.util.Map;
+import java.util.Optional;
 import java.util.Set;
+import java.util.function.Function;
+import java.util.function.Supplier;
 import java.util.stream.Collectors;
 import java.util.stream.IntStream;
 
+import static 
org.apache.kafka.connect.runtime.distributed.ExtendedAssignment.duplicate;
 import static 
org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeConnectProtocol.CONNECT_PROTOCOL_V1;
 import static 
org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeConnectProtocol.CONNECT_PROTOCOL_V2;
 import static 
org.apache.kafka.connect.runtime.distributed.WorkerCoordinator.WorkerLoad;
-import static org.hamcrest.CoreMatchers.hasItems;
 import static org.hamcrest.CoreMatchers.is;

Review Comment:
   Could we remove hamcrest now and only use junit?
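   
   For instance (illustrative only; `leaderAssignment` is a stand-in name), a 
matcher-based assertion like
   
       assertThat(leaderAssignment.connectors(), is(expectedConnectors));
   
   would become the plain JUnit form
   
       assertEquals(expectedConnectors, leaderAssignment.connectors());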



##
connect/runtime/src/test/java/org/apache/kafka/connect/runtime/distributed/IncrementalCooperativeAssignorTest.java:
##
@@ -1275,89 +1136,68 @@ private static ClusterConfigState 
clusterConfigState(long offset,
  int connectorNum,
  int taskNum) {
 int connectorNumEnd = connectorStart + connectorNum - 1;
+
+Map<String, Integer> connectorTaskCounts = fillMap(connectorStart, 
connectorNumEnd, i -> "connector" + i, () -> taskNum);
+Map<String, Map<String, String>> connectorConfigs = 
fillMap(connectorStart, connectorNumEnd, i -> "connector" + i, HashMap::new);
+Map<String, TargetState> connectorTargetStates = 
fillMap(connectorStart, connectorNumEnd, i -> "connector" + i, () -> 
TargetState.STARTED);
+Map<ConnectorTaskId, Map<String, String>> taskConfigs = fillMap(
+0,
+connectorNum * taskNum,
+i -> new ConnectorTaskId("connector" + i / connectorNum + 1, i),
+HashMap::new
+);
+
 return new ClusterConfigState(
 offset,
 null,
-connectorTaskCounts(connectorStart, connectorNumEnd, taskNum),
-connectorConfigs(connectorStart, connectorNumEnd),
-connectorTargetStates(connectorStart, connectorNumEnd, 
TargetState.STARTED),
-taskConfigs(0, connectorNum, connectorNum * taskNum),
+connectorTaskCounts,
+connectorConfigs,
+connectorTargetStates,
+taskConfigs,
 Collections.emptySet());
 }
 
-private static Map<String, ExtendedWorkerState> memberConfigs(String givenLeader,
-  long givenOffset,
-  Map<String, ExtendedAssignment> givenAssignments) {
-return givenAssignments.entrySet().stream()
-.collect(Collectors.toMap(
-Map.Entry::getKey,
-e -> new ExtendedWorkerState(expectedLeaderUrl(givenLeader), givenOffset, e.getValue())));
-}
-
-private static Map<String, ExtendedWorkerState> memberConfigs(String givenLeader,
+private Map<String, ExtendedWorkerState> memberConfigs(String givenLeader,
   long givenOffset,
   int start,
   int connectorNum) {

Review Comment:
   Should this be `workerNum` instead of `connectorNum`?



##
connect/runtime/src/test/java/org/apache/kafka/connect/runtime/distributed/IncrementalCooperativeAssignorTest.java:
##
@@ -34,25 +34,30 @@
 import org.mockito.junit.MockitoJUnit;
 import org.mockito.junit.MockitoRule;
 
-import java.util.AbstractMap.SimpleEntry;
 import java.util.ArrayList;
 import java.util.Arrays;
+import java.util.Collection;
 import java.util.Collections;
 import java.util.HashMap;
 import java.util.HashSet;
 import java.util.List;
 import java.util.Map;
+import java.util.Optional;
 import java.util.Set;
+import java.util.function.Function;
+import java.util.function.Supplier;
 import java.util.stream.Collectors;
 import java.util.stream.IntStream;
 
+import static 
org.apache.kafka.connect.runtime.distributed.ExtendedAssignment.duplicate;
 import static 
org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeConnectProtocol.CONNECT_PROTOCOL_V1;
 import static 
org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeConnectProtocol.CONNECT_PROTOCOL_V2;
 

[jira] [Comment Edited] (KAFKA-13159) Enable system tests for transactions in KRaft mode

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516947#comment-17516947
 ] 

Bruno Cadonna edited comment on KAFKA-13159 at 4/4/22 4:52 PM:
---

[~mumrah] Same question as [~mimaison]: Can we close this ticket?
In the meantime, I am removing this ticket from the 3.2.0 release since code 
freeze has passed.


was (Author: cadonna):
[~mumrah] Same question as [~mimaison]: Can we close this ticket?
In the meantime, I am removing this ticket from the 3.2.0 release.

> Enable system tests for transactions in KRaft mode
> --
>
> Key: KAFKA-13159
> URL: https://issues.apache.org/jira/browse/KAFKA-13159
> Project: Kafka
>  Issue Type: Test
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Critical
>
> Previously, we disabled several system tests involving transactions in KRaft 
> mode. Now that KIP-730 is complete and transactions work in KRaft, we need to 
> re-enable these tests.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (KAFKA-13159) Enable system tests for transactions in KRaft mode

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-13159:
--
Fix Version/s: (was: 3.2.0)

> Enable system tests for transactions in KRaft mode
> --
>
> Key: KAFKA-13159
> URL: https://issues.apache.org/jira/browse/KAFKA-13159
> Project: Kafka
>  Issue Type: Test
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Critical
>
> Previously, we disabled several system tests involving transactions in KRaft 
> mode. Now that KIP-730 is complete and transactions work in KRaft, we need to 
> re-enable these tests.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (KAFKA-13159) Enable system tests for transactions in KRaft mode

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516947#comment-17516947
 ] 

Bruno Cadonna commented on KAFKA-13159:
---

[~mumrah] Same question as [~mimaison]: Can we close this ticket?
In the meantime, I am removing this ticket from the 3.2.0 release.

> Enable system tests for transactions in KRaft mode
> --
>
> Key: KAFKA-13159
> URL: https://issues.apache.org/jira/browse/KAFKA-13159
> Project: Kafka
>  Issue Type: Test
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Critical
> Fix For: 3.2.0
>
>
> Previously, we disabled several system tests involving transactions in KRaft 
> mode. Now that KIP-730 is complete and transactions work in KRaft, we need to 
> re-enable these tests.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (KAFKA-13295) Long restoration times for new tasks can lead to transaction timeouts

2022-04-04 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516945#comment-17516945
 ] 

Bruno Cadonna commented on KAFKA-13295:
---

Removing from the 3.2.0 release since code freeze has passed.

> Long restoration times for new tasks can lead to transaction timeouts
> -
>
> Key: KAFKA-13295
> URL: https://issues.apache.org/jira/browse/KAFKA-13295
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: A. Sophie Blee-Goldman
>Assignee: Sagar Rao
>Priority: Critical
>  Labels: eos, new-streams-runtime-should-fix
>
> In some EOS applications with relatively long restoration times we've noticed 
> a series of ProducerFencedExceptions occurring during/immediately after 
> restoration. The broker logs were able to confirm these were due to 
> transactions timing out.
> In Streams, it turns out we automatically begin a new txn when calling 
> {{send}} (if there isn’t already one in flight). A {{send}} often occurs 
> outside a commit during active processing (e.g. writing to the changelog), 
> leaving the txn open until the next commit. And if a StreamThread has been 
> actively processing when a rebalance results in a new stateful task without 
> revoking any existing tasks, the thread won’t actually commit this open txn 
> before it goes back into the restoration phase while it builds up state for 
> the new task. So the in-flight transaction is left open during restoration, 
> during which the StreamThread only consumes from the changelog without 
> committing, leaving it vulnerable to timing out when restoration times exceed 
> the configured transaction.timeout.ms for the producer client.
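
A minimal mitigation sketch in the meantime (assuming the app is hitting the 
producer's default 60s transaction timeout during restoration; the 15-minute 
value and application id below are placeholders, and the value must stay below 
the broker-side transaction.max.timeout.ms):

{code:java}
import java.util.Properties;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.streams.StreamsConfig;

public class EosTimeoutWorkaround {

    public static Properties streamsProps() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-eos-app");
        // Give in-flight transactions enough headroom to survive long restorations.
        props.put(StreamsConfig.producerPrefix(ProducerConfig.TRANSACTION_TIMEOUT_CONFIG), 900_000);
        return props;
    }
}
{code}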



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (KAFKA-13295) Long restoration times for new tasks can lead to transaction timeouts

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna updated KAFKA-13295:
--
Fix Version/s: (was: 3.2.0)

> Long restoration times for new tasks can lead to transaction timeouts
> -
>
> Key: KAFKA-13295
> URL: https://issues.apache.org/jira/browse/KAFKA-13295
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: A. Sophie Blee-Goldman
>Assignee: Sagar Rao
>Priority: Critical
>  Labels: eos, new-streams-runtime-should-fix
>
> In some EOS applications with relatively long restoration times we've noticed 
> a series of ProducerFencedExceptions occurring during/immediately after 
> restoration. The broker logs were able to confirm these were due to 
> transactions timing out.
> In Streams, it turns out we automatically begin a new txn when calling 
> {{send}} (if there isn’t already one in flight). A {{send}} often occurs 
> outside a commit during active processing (e.g. writing to the changelog), 
> leaving the txn open until the next commit. And if a StreamThread has been 
> actively processing when a rebalance results in a new stateful task without 
> revoking any existing tasks, the thread won’t actually commit this open txn 
> before it goes back into the restoration phase while it builds up state for 
> the new task. So the in-flight transaction is left open during restoration, 
> during which the StreamThread only consumes from the changelog without 
> committing, leaving it vulnerable to timing out when restoration times exceed 
> the configured transaction.timeout.ms for the producer client.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (KAFKA-13684) KStream rebalance can lead to JVM process crash when network issues occur

2022-04-04 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna resolved KAFKA-13684.
---
Resolution: Not A Bug

> KStream rebalance can lead to JVM process crash when network issues occur
> --
>
> Key: KAFKA-13684
> URL: https://issues.apache.org/jira/browse/KAFKA-13684
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.8.1
>Reporter: Peter Cipov
>Priority: Critical
> Attachments: crash-dump.log, crash-logs.csv
>
>
> Hello,
> Sporadically, a KStream rebalance leads to a segmentation fault
> {code:java}
> siginfo: si_signo: 11 (SIGSEGV), si_code: 128 (SI_KERNEL), si_addr: 
> 0x {code}
> I have spotted it occurring when:
> 1) there are some intermittent connection issues. I have found 
> org.apache.kafka.common.errors.DisconnectException: in logs during rebalance
> 2) a lot of partitions are shifted due to ks cluster re-balance
>  
> crash stack:
> {code:java}
> Current thread (0x7f5bf407a000):  JavaThread "app-blue-v6-StreamThread-2" 
> [_thread_in_native, id=231, stack(0x7f5bdc2ed000,0x7f5bdc3ee000)]
> Stack: [0x7f5bdc2ed000,0x7f5bdc3ee000],  sp=0x7f5bdc3ebe30,  free 
> space=1019kNative frames: (J=compiled Java code, A=aot compiled Java code, 
> j=interpreted, Vv=VM code, C=native code)C  [libc.so.6+0x37ab7]  abort+0x297
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)J 8080  
> org.rocksdb.WriteBatch.put(J[BI[BIJ)V (0 bytes) @ 0x7f5c857ca520 
> [0x7f5c857ca4a0+0x0080]J 8835 c2 
> org.apache.kafka.streams.state.internals.RocksDBStore$SingleColumnFamilyAccessor.prepareBatchForRestore(Ljava/util/Collection;Lorg/rocksdb/WriteBatch;)V
>  (52 bytes) @ 0x7f5c858dccb4 [0x7f5c858dcb60+0x0154]J 
> 9779 c1 
> org.apache.kafka.streams.state.internals.RocksDBStore$RocksDBBatchingRestoreCallback.restoreAll(Ljava/util/Collection;)V
>  (147 bytes) @ 0x7f5c7ef7b7e4 [0x7f5c7ef7b360+0x0484]J 
> 8857 c2 
> org.apache.kafka.streams.processor.internals.StateRestoreCallbackAdapter.lambda$adapt$0(Lorg/apache/kafka/streams/processor/StateRestoreCallback;Ljava/util/Collection;)V
>  (73 bytes) @ 0x7f5c858f86dc [0x7f5c858f8500+0x01dc]J 
> 9686 c1 
> org.apache.kafka.streams.processor.internals.StateRestoreCallbackAdapter$$Lambda$937.restoreBatch(Ljava/util/Collection;)V
>  (9 bytes) @ 0x7f5c7dff7bb4 [0x7f5c7dff7b40+0x0074]J 9683 
> c1 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.restore(Lorg/apache/kafka/streams/processor/internals/ProcessorStateManager$StateStoreMetadata;Ljava/util/List;)V
>  (176 bytes) @ 0x7f5c7e71af4c [0x7f5c7e719740+0x180c]J 
> 8882 c2 
> org.apache.kafka.streams.processor.internals.StoreChangelogReader.restoreChangelog(Lorg/apache/kafka/streams/processor/internals/StoreChangelogReader$ChangelogMetadata;)Z
>  (334 bytes) @ 0x7f5c859052ec [0x7f5c85905140+0x01ac]J 
> 12689 c2 
> org.apache.kafka.streams.processor.internals.StoreChangelogReader.restore(Ljava/util/Map;)V
>  (412 bytes) @ 0x7f5c85ce98d4 [0x7f5c85ce8420+0x14b4]J 
> 12688 c2 
> org.apache.kafka.streams.processor.internals.StreamThread.initializeAndRestorePhase()V
>  (214 bytes) @ 0x7f5c85ce580c [0x7f5c85ce5540+0x02cc]J 
> 17654 c2 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce()V (725 
> bytes) @ 0x7f5c859960e8 [0x7f5c85995fa0+0x0148]j  
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop()Z+61j
> org.apache.kafka.streams.processor.internals.StreamThread.run()V+36v  
> ~StubRoutines::call_stub 
> siginfo: si_signo: 11 (SIGSEGV), si_code: 128 (SI_KERNEL), si_addr: 
> 0x{code}
> I attached the whole Java crash-dump and a digest from our logs. 
> It is executed on azul jdk11
> KS 2.8.1
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [kafka] dajac commented on pull request #11906: MINOR: Doc updates for Kafka 3.0.1

2022-04-04 Thread GitBox


dajac commented on PR #11906:
URL: https://github.com/apache/kafka/pull/11906#issuecomment-1087722280

   I usually ask them to open two PRs. The issue is that the doc in kafka might 
be way ahead of the current release (e.g. new features).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] mimaison merged pull request #11992: MINOR: Fix wrong configuration of `inter.broker.security.protocol` for communication between brokers

2022-04-04 Thread GitBox


mimaison merged PR #11992:
URL: https://github.com/apache/kafka/pull/11992


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] tombentley commented on pull request #11994: MINOR: Mention KAFKA-13748 in release notes

2022-04-04 Thread GitBox


tombentley commented on PR #11994:
URL: https://github.com/apache/kafka/pull/11994#issuecomment-1087701062

   @cadonna are you OK for me to cherry-pick this to 3.2? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] mimaison commented on a diff in pull request #11985: MINOR: Supplement the description of `Valid Values` in the documentation of `compression.type`

2022-04-04 Thread GitBox


mimaison commented on code in PR #11985:
URL: https://github.com/apache/kafka/pull/11985#discussion_r841863190


##
clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java:
##
@@ -329,7 +330,7 @@
 in("all", "-1", "0", "1"),
 Importance.LOW,
 ACKS_DOC)
-.define(COMPRESSION_TYPE_CONFIG, Type.STRING, 
"none", Importance.HIGH, COMPRESSION_TYPE_DOC)
+.define(COMPRESSION_TYPE_CONFIG, Type.STRING, 
CompressionType.NONE.name, in(CompressionType.names().toArray(new String[0])), 
Importance.HIGH, COMPRESSION_TYPE_DOC)

Review Comment:
   Could we use 
[`Utils.enumOptions()`](https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/utils/Utils.java#L1424)
 here?
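   
   For reference, the definition would then look roughly like this (a sketch 
only; it assumes `CompressionType`'s string form matches the accepted config 
values):
   
       .define(COMPRESSION_TYPE_CONFIG, Type.STRING,
               CompressionType.NONE.name,
               in(Utils.enumOptions(CompressionType.class)),
               Importance.HIGH, COMPRESSION_TYPE_DOC)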



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (KAFKA-13684) KStream rebalance can lead to JVM process crash when network issues occur

2022-04-04 Thread Peter Cipov (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516900#comment-17516900
 ] 

Peter Cipov commented on KAFKA-13684:
-

hello [~guozhang] 

 

We have investigated this further and were able to track down the cause: a lack 
of native memory for our workloads, which is consumed by RocksDB. The issue 
manifested itself differently each time (different threads, stacks), but there 
was always the same signal sent by the OS. So I guess this is not a bug in 
kstreams/rocksdb as such; nevertheless, the error message was pretty cryptic.

The issue can be closed.
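
For anyone else hitting the same symptom: one way to bound that native usage is 
a custom {{RocksDBConfigSetter}} along the lines of the bounded-memory example 
in the Streams docs (a sketch only; the 256 MB cache and 64 MB write-buffer 
sizes are placeholders to tune per workload):

{code:java}
import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.Cache;
import org.rocksdb.LRUCache;
import org.rocksdb.Options;
import org.rocksdb.WriteBufferManager;

public class BoundedMemoryRocksDBConfig implements RocksDBConfigSetter {

    // One cache and write-buffer manager shared by every store in the JVM,
    // so total RocksDB native memory stays bounded.
    private static final Cache CACHE = new LRUCache(256 * 1024 * 1024L);
    private static final WriteBufferManager WRITE_BUFFER_MANAGER =
            new WriteBufferManager(64 * 1024 * 1024L, CACHE);

    @Override
    public void setConfig(final String storeName, final Options options, final Map<String, Object> configs) {
        final BlockBasedTableConfig tableConfig = (BlockBasedTableConfig) options.tableFormatConfig();
        tableConfig.setBlockCache(CACHE);
        tableConfig.setCacheIndexAndFilterBlocks(true);
        options.setWriteBufferManager(WRITE_BUFFER_MANAGER);
        options.setTableFormatConfig(tableConfig);
    }

    @Override
    public void close(final String storeName, final Options options) {
        // Intentionally do not close the shared cache; all stores use it.
    }
}
{code}

It would be registered via the {{rocksdb.config.setter}} Streams config.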

> KStream rebalance can lead to JVM process crash when network issues occur
> --
>
> Key: KAFKA-13684
> URL: https://issues.apache.org/jira/browse/KAFKA-13684
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.8.1
>Reporter: Peter Cipov
>Priority: Critical
> Attachments: crash-dump.log, crash-logs.csv
>
>
> Hello,
> Sporadically, a KStream rebalance leads to a segmentation fault
> {code:java}
> siginfo: si_signo: 11 (SIGSEGV), si_code: 128 (SI_KERNEL), si_addr: 
> 0x {code}
> I have spotted it occurring when:
> 1) there are some intermittent connection issues. I have found 
> org.apache.kafka.common.errors.DisconnectException: in logs during rebalance
> 2) a lot of partitions are shifted due to ks cluster re-balance
>  
> crash stack:
> {code:java}
> Current thread (0x7f5bf407a000):  JavaThread "app-blue-v6-StreamThread-2" 
> [_thread_in_native, id=231, stack(0x7f5bdc2ed000,0x7f5bdc3ee000)]
> Stack: [0x7f5bdc2ed000,0x7f5bdc3ee000],  sp=0x7f5bdc3ebe30,  free 
> space=1019kNative frames: (J=compiled Java code, A=aot compiled Java code, 
> j=interpreted, Vv=VM code, C=native code)C  [libc.so.6+0x37ab7]  abort+0x297
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)J 8080  
> org.rocksdb.WriteBatch.put(J[BI[BIJ)V (0 bytes) @ 0x7f5c857ca520 
> [0x7f5c857ca4a0+0x0080]J 8835 c2 
> org.apache.kafka.streams.state.internals.RocksDBStore$SingleColumnFamilyAccessor.prepareBatchForRestore(Ljava/util/Collection;Lorg/rocksdb/WriteBatch;)V
>  (52 bytes) @ 0x7f5c858dccb4 [0x7f5c858dcb60+0x0154]J 
> 9779 c1 
> org.apache.kafka.streams.state.internals.RocksDBStore$RocksDBBatchingRestoreCallback.restoreAll(Ljava/util/Collection;)V
>  (147 bytes) @ 0x7f5c7ef7b7e4 [0x7f5c7ef7b360+0x0484]J 
> 8857 c2 
> org.apache.kafka.streams.processor.internals.StateRestoreCallbackAdapter.lambda$adapt$0(Lorg/apache/kafka/streams/processor/StateRestoreCallback;Ljava/util/Collection;)V
>  (73 bytes) @ 0x7f5c858f86dc [0x7f5c858f8500+0x01dc]J 
> 9686 c1 
> org.apache.kafka.streams.processor.internals.StateRestoreCallbackAdapter$$Lambda$937.restoreBatch(Ljava/util/Collection;)V
>  (9 bytes) @ 0x7f5c7dff7bb4 [0x7f5c7dff7b40+0x0074]J 9683 
> c1 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.restore(Lorg/apache/kafka/streams/processor/internals/ProcessorStateManager$StateStoreMetadata;Ljava/util/List;)V
>  (176 bytes) @ 0x7f5c7e71af4c [0x7f5c7e719740+0x180c]J 
> 8882 c2 
> org.apache.kafka.streams.processor.internals.StoreChangelogReader.restoreChangelog(Lorg/apache/kafka/streams/processor/internals/StoreChangelogReader$ChangelogMetadata;)Z
>  (334 bytes) @ 0x7f5c859052ec [0x7f5c85905140+0x01ac]J 
> 12689 c2 
> org.apache.kafka.streams.processor.internals.StoreChangelogReader.restore(Ljava/util/Map;)V
>  (412 bytes) @ 0x7f5c85ce98d4 [0x7f5c85ce8420+0x14b4]J 
> 12688 c2 
> org.apache.kafka.streams.processor.internals.StreamThread.initializeAndRestorePhase()V
>  (214 bytes) @ 0x7f5c85ce580c [0x7f5c85ce5540+0x02cc]J 
> 17654 c2 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce()V (725 
> bytes) @ 0x7f5c859960e8 [0x7f5c85995fa0+0x0148]j  
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop()Z+61j
> org.apache.kafka.streams.processor.internals.StreamThread.run()V+36v  
> ~StubRoutines::call_stub 
> siginfo: si_signo: 11 (SIGSEGV), si_code: 128 (SI_KERNEL), si_addr: 
> 0x{code}
> I attached the whole Java crash-dump and a digest from our logs. 
> It is executed on azul jdk11
> KS 2.8.1
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [kafka] mimaison commented on pull request #11906: MINOR: Doc updates for Kafka 3.0.1

2022-04-04 Thread GitBox


mimaison commented on PR #11906:
URL: https://github.com/apache/kafka/pull/11906#issuecomment-1087684315

   I wonder if we should prevent editing files in `kafka-site` altogether and 
ask users to always do them in `kafka`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] tombentley opened a new pull request, #11994: MINOR: Mention KAFKA-13748 in release notes

2022-04-04 Thread GitBox


tombentley opened a new pull request, #11994:
URL: https://github.com/apache/kafka/pull/11994

   KAFKA-13748 removed the FileStreamSourceConnector and FileStreamSinkConnector 
from the default classpath. Although these are provided for example purposes 
only, it makes sense to mention this in the release notes for users who use 
them in their own examples.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] tombentley commented on pull request #11994: MINOR: Mention KAFKA-13748 in release notes

2022-04-04 Thread GitBox


tombentley commented on PR #11994:
URL: https://github.com/apache/kafka/pull/11994#issuecomment-1087649367

   @kkonstantine PTAL.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] RivenSun2 commented on pull request #11976: KAFKA-13771: Support to explicitly delete delegationTokens that have expired but have not been automatically cleaned up

2022-04-04 Thread GitBox


RivenSun2 commented on PR #11976:
URL: https://github.com/apache/kafka/pull/11976#issuecomment-1087593035

   Hi @showuon @dajac and @mimaison
   could you help to review the PR?
   Thanks a lot.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] RivenSun2 commented on pull request #11947: MINOR: Improve the description of principal under different mechanisms of sasl

2022-04-04 Thread GitBox


RivenSun2 commented on PR #11947:
URL: https://github.com/apache/kafka/pull/11947#issuecomment-1087585550

   Hi @showuon @dajac and @mimaison
   could you help to review the PR?
   Thanks a lot.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] RivenSun2 commented on pull request #11985: MINOR: Supplement the description of `Valid Values` in the documentation of `compression.type`

2022-04-04 Thread GitBox


RivenSun2 commented on PR #11985:
URL: https://github.com/apache/kafka/pull/11985#issuecomment-1087584508

   Hi @showuon @dajac and @mimaison
   could you help to review the PR?
   Thanks a lot.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] RivenSun2 commented on pull request #11992: MINOR: Fix wrong configuration of `inter.broker.security.protocol` for communication between brokers

2022-04-04 Thread GitBox


RivenSun2 commented on PR #11992:
URL: https://github.com/apache/kafka/pull/11992#issuecomment-1087583883

   Hi @showuon @dajac and @mimaison 
   could you help to review the PR?
   Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] RivenSun2 closed pull request #11975: MINOR: Improve the log output information of SocketServer

2022-04-04 Thread GitBox


RivenSun2 closed pull request #11975: MINOR: Improve the log output information 
of SocketServer
URL: https://github.com/apache/kafka/pull/11975


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] RivenSun2 commented on pull request #11975: MINOR: Improve the log output information of SocketServer

2022-04-04 Thread GitBox


RivenSun2 commented on PR #11975:
URL: https://github.com/apache/kafka/pull/11975#issuecomment-1087581308

   Hi @showuon  @mimaison 
   Thanks for your review.
   let me close this PR.
   Thanks.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


