[jira] [Updated] (KAFKA-16993) Flaky test RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener.shouldInvokeUserDefinedGlobalStateRestoreListener()
[ https://issues.apache.org/jira/browse/KAFKA-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuanzhang gong updated KAFKA-16993: --- Attachment: (was: kafka权威指南中文版.pdf) > Flaky test > RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener.shouldInvokeUserDefinedGlobalStateRestoreListener() > --- > > Key: KAFKA-16993 > URL: https://issues.apache.org/jira/browse/KAFKA-16993 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Priority: Major > Attachments: > 6u4a4e27e2oh2-org.apache.kafka.streams.integration.RestoreIntegrationTest-shouldInvokeUserDefinedGlobalStateRestoreListener()-1-output.txt > > > {code:java} > org.opentest4j.AssertionFailedError: expected: but was: at > org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)at > > org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)at > org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)at > org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:183)at > org.apache.kafka.streams.integration.RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener(RestoreIntegrationTest.java:611)at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)at > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)at > java.lang.reflect.Method.invoke(Method.java:498)at > org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:728)at > > org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)at > > org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)at > > 
org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)at > > org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)at > > org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)at > > org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)at > > org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)at > > org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)at > > org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)at > > org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)at > > org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)at > > org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)at > > org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)at > > org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)at > > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:218)at > > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)at > > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:214)at > > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:139)at > > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:69)at > > 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)at > > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)at > > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)at > org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)at > > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)at > >
[jira] [Updated] (KAFKA-16993) Flaky test RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener.shouldInvokeUserDefinedGlobalStateRestoreListener()
[ https://issues.apache.org/jira/browse/KAFKA-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuanzhang gong updated KAFKA-16993: --- Attachment: kafka权威指南中文版.pdf > Flaky test > RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener.shouldInvokeUserDefinedGlobalStateRestoreListener() > --- > > Key: KAFKA-16993 > URL: https://issues.apache.org/jira/browse/KAFKA-16993 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Priority: Major > Attachments: > 6u4a4e27e2oh2-org.apache.kafka.streams.integration.RestoreIntegrationTest-shouldInvokeUserDefinedGlobalStateRestoreListener()-1-output.txt > > > {code:java} > org.opentest4j.AssertionFailedError: expected: but was: at > org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)at > > org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)at > org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)at > org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:183)at > org.apache.kafka.streams.integration.RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener(RestoreIntegrationTest.java:611)at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)at > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)at > java.lang.reflect.Method.invoke(Method.java:498)at > org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:728)at > > org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)at > > org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)at > > 
org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)at > > org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)at > > org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)at > > org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)at > > org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)at > > org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)at > > org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)at > > org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)at > > org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)at > > org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)at > > org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)at > > org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)at > > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:218)at > > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)at > > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:214)at > > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:139)at > > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:69)at > > 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)at > > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)at > > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)at > org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)at > > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)at > >
[jira] [Comment Edited] (KAFKA-16976) Improve the dynamic config handling for RemoteLogManagerConfig when a broker is restarted.
[ https://issues.apache.org/jira/browse/KAFKA-16976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856131#comment-17856131 ] Kamal Chandraprakash edited comment on KAFKA-16976 at 6/19/24 5:48 AM: --- [~chiacyu] Thanks for showing interest to work on this ticket. Since, we have to revert some of the changes that were done in the previous PRs to address this change effectively, I'll take this ticket and tag you once the patch is ready for review. was (Author: ckamal): [~chiacyu] Thanks for showing interest to work on this ticket. Since, we have to revert some of the changes that were done in the previous PRs to address this change effectively, I'll taking this ticket. I'll tag you once the patch is ready for review. > Improve the dynamic config handling for RemoteLogManagerConfig when a broker > is restarted. > -- > > Key: KAFKA-16976 > URL: https://issues.apache.org/jira/browse/KAFKA-16976 > Project: Kafka > Issue Type: Task >Reporter: Satish Duggana >Assignee: Kamal Chandraprakash >Priority: Major > Fix For: 3.9.0 > > > This is a followup on the discussion: > https://github.com/apache/kafka/pull/16353#pullrequestreview-2121953295 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16976) Improve the dynamic config handling for RemoteLogManagerConfig when a broker is restarted.
[ https://issues.apache.org/jira/browse/KAFKA-16976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856131#comment-17856131 ] Kamal Chandraprakash commented on KAFKA-16976: -- [~chiacyu] Thanks for showing interest in working on this ticket. Since we have to revert some of the changes that were done in the previous PRs to address this change effectively, I'll take this ticket and tag you once the patch is ready for review. > Improve the dynamic config handling for RemoteLogManagerConfig when a broker > is restarted. > -- > > Key: KAFKA-16976 > URL: https://issues.apache.org/jira/browse/KAFKA-16976 > Project: Kafka > Issue Type: Task >Reporter: Satish Duggana >Assignee: Kamal Chandraprakash >Priority: Major > Fix For: 3.9.0 > > > This is a followup on the discussion: > https://github.com/apache/kafka/pull/16353#pullrequestreview-2121953295 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16997) do not stop kafka when issue to delete a partition folder
Jerome Morel created KAFKA-16997: Summary: do not stop kafka when issue to delete a partition folder Key: KAFKA-16997 URL: https://issues.apache.org/jira/browse/KAFKA-16997 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 3.6.2 Reporter: Jerome Morel Context: In our project we create many different partitions, and even after we delete the segments the partitions remain; it turned out we had so many partitions that Kafka crashed due to the number of open files. We therefore want to delete those partitions regularly, but while doing so Kafka stops. The issue: after some investigation we found that the deletion process sometimes logs warnings when it cannot delete some log files: {code:java} [2024-06-17 15:52:39,590] WARN Failed atomic move of /tmp/kafka-logs-mnt/kafka-no-docker/69747657-f49d-453f-9fa2-4d4369199699-0.7b51dad41a77448d8b419c76749f0b2c-delete/0010.timeindex to /tmp/kafka-logs-mnt/kafka-no-docker/69747657-f49d-453f-9fa2-4d4369199699-0.7b51dad41a77448d8b419c76749f0b2c-delete/0010.timeindex.deleted retrying with a non-atomic move (org.apache.kafka.common.utils.Utils) java.nio.file.NoSuchFileException: /tmp/kafka-logs-mnt/kafka-no-docker/69747657-f49d-453f-9fa2-4d4369199699-0.7b51dad41a77448d8b419c76749f0b2c-delete/0010.timeindex -> /tmp/kafka-logs-mnt/kafka-no-docker/69747657-f49d-453f-9fa2-4d4369199699-0.7b51dad41a77448d8b419c76749f0b2c-delete/0010.timeindex.deleted at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92) at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106) at java.base/sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:416) at java.base/sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:266) at java.base/java.nio.file.Files.move(Files.java:1432) at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:980) at org.apache.kafka.storage.internals.log.LazyIndex$IndexFile.renameTo(LazyIndex.java:80) at 
org.apache.kafka.storage.internals.log.LazyIndex.renameTo(LazyIndex.java:202) at org.apache.kafka.storage.internals.log.LogSegment.changeFileSuffixes(LogSegment.java:666) at kafka.log.LocalLog$.$anonfun$deleteSegmentFiles$1(LocalLog.scala:912) at kafka.log.LocalLog$.$anonfun$deleteSegmentFiles$1$adapted(LocalLog.scala:910) at scala.collection.immutable.List.foreach(List.scala:431) at kafka.log.LocalLog$.deleteSegmentFiles(LocalLog.scala:910) at kafka.log.LocalLog.removeAndDeleteSegments(LocalLog.scala:289) {code} It just continues after that; but when it fails to delete a folder, it marks the replica as bad and then stops Kafka if that is the only replica available (which is our case): {code:java} [2024-06-17 15:52:39,637] ERROR Error while deleting dir for 69747657-f49d-453f-9fa2-4d4369199699-0 in dir /tmp/kafka-logs-mnt/kafka-no-docker (org.apache.kafka.storage.internals.log.LogDirFailureChannel) java.nio.file.DirectoryNotEmptyException: /tmp/kafka-logs-mnt/kafka-no-docker/69747657-f49d-453f-9fa2-4d4369199699-0.7b51dad41a77448d8b419c76749f0b2c-delete at java.base/sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:246) at java.base/sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:105) at java.base/java.nio.file.Files.delete(Files.java:1152) at org.apache.kafka.common.utils.Utils$1.postVisitDirectory(Utils.java:923) at org.apache.kafka.common.utils.Utils$1.postVisitDirectory(Utils.java:901) at java.base/java.nio.file.Files.walkFileTree(Files.java:2828) at java.base/java.nio.file.Files.walkFileTree(Files.java:2882) at org.apache.kafka.common.utils.Utils.delete(Utils.java:901) at kafka.log.LocalLog.$anonfun$deleteEmptyDir$2(LocalLog.scala:243) at kafka.log.LocalLog.deleteEmptyDir(LocalLog.scala:709) at kafka.log.UnifiedLog.$anonfun$delete$2(UnifiedLog.scala:1734) at kafka.log.UnifiedLog.delete(UnifiedLog.scala:1911) at kafka.log.LogManager.deleteLogs(LogManager.scala:1152) at 
kafka.log.LogManager.$anonfun$deleteLogs$6(LogManager.scala:1166) at org.apache.kafka.server.util.KafkaScheduler.lambda$schedule$1(KafkaScheduler.java:150) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at
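The WARN in the report comes from `Utils.atomicMoveWithFallback`, which first attempts an atomic rename and, if that fails, retries with a non-atomic move. A minimal sketch of that pattern (illustrative only, not the actual Kafka implementation; the temp directory and file names here are made up):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class MoveWithFallback {
    // Try an atomic rename first; fall back to a plain move on failure,
    // mirroring the behavior the WARN message describes.
    static void atomicMoveWithFallback(Path source, Path target) throws IOException {
        try {
            Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException outer) {
            // e.g. AtomicMoveNotSupportedException across filesystems
            Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("segments");
        Path src = Files.createFile(dir.resolve("example.timeindex"));
        Path dst = dir.resolve("example.timeindex.deleted");
        atomicMoveWithFallback(src, dst);
        System.out.println(Files.exists(dst));
    }
}
```

If the source file has already been removed by a concurrent deletion, both moves throw `NoSuchFileException`, which matches the exception shown in the log above.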
[jira] [Assigned] (KAFKA-16976) Improve the dynamic config handling for RemoteLogManagerConfig when a broker is restarted.
[ https://issues.apache.org/jira/browse/KAFKA-16976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamal Chandraprakash reassigned KAFKA-16976: Assignee: Kamal Chandraprakash > Improve the dynamic config handling for RemoteLogManagerConfig when a broker > is restarted. > -- > > Key: KAFKA-16976 > URL: https://issues.apache.org/jira/browse/KAFKA-16976 > Project: Kafka > Issue Type: Task >Reporter: Satish Duggana >Assignee: Kamal Chandraprakash >Priority: Major > Fix For: 3.9.0 > > > This is a followup on the discussion: > https://github.com/apache/kafka/pull/16353#pullrequestreview-2121953295 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16959) ConfigCommand should not allow to define both `entity-default` and `entity-name`
[ https://issues.apache.org/jira/browse/KAFKA-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856121#comment-17856121 ] Tai Le Manh commented on KAFKA-16959: - [~chia7712] [~m1a2st] I agree with Chia-Ping Tsai's point, making it work with both entities really makes sense. > ConfigCommand should not allow to define both `entity-default` and > `entity-name` > > > Key: KAFKA-16959 > URL: https://issues.apache.org/jira/browse/KAFKA-16959 > Project: Kafka > Issue Type: Improvement >Reporter: Chia-Ping Tsai >Assignee: 黃竣陽 >Priority: Minor > > When users input both `entity-default` and `entity-name`, only `entity-name` > will get evaluated. It seems to me that is error-prone. We should throw > exception directly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16995) The listeners broker parameter incorrect documentation
[ https://issues.apache.org/jira/browse/KAFKA-16995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856114#comment-17856114 ] dujian0068 commented on KAFKA-16995: Hello: From the error message, it seems that there is something wrong with your broker configuration. The expected format is: listeners = listener_name://host_name:port, and each listener_name must be different. > The listeners broker parameter incorrect documentation > --- > > Key: KAFKA-16995 > URL: https://issues.apache.org/jira/browse/KAFKA-16995 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.6.1 > Environment: Kafka 3.6.1 >Reporter: Sergey >Assignee: dujian0068 >Priority: Minor > > We are using Kafka 3.6.1 and the > [KIP-797|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726330] > describes configuring listeners with the same port and name for supporting > IPv4/IPv6 dual-stack. > Documentation link: > [https://kafka.apache.org/36/documentation.html#brokerconfigs_listeners] > As I understand it, Kafka should allow us to set the listener name and > listener port to the same value if we configure dual-stack. > But in reality, the broker returns an error if we set the listener name to > the same value. 
> Error example: > {code:java} > java.lang.IllegalArgumentException: requirement failed: Each listener must > have a different name, listeners: > CONTROLPLANE://0.0.0.0:9090,SSL://0.0.0.0:9093,SSL://[::]:9093 > at scala.Predef$.require(Predef.scala:337) > at kafka.utils.CoreUtils$.validate$1(CoreUtils.scala:214) > at kafka.utils.CoreUtils$.listenerListToEndPoints(CoreUtils.scala:268) > at kafka.server.KafkaConfig.listeners(KafkaConfig.scala:2120) > at kafka.server.KafkaConfig.(KafkaConfig.scala:1807) > at kafka.server.KafkaConfig.(KafkaConfig.scala:1604) > at kafka.Kafka$.buildServer(Kafka.scala:72) > at kafka.Kafka$.main(Kafka.scala:91) > at kafka.Kafka.main(Kafka.scala) {code} > I've tried to set the listeners to: "SSL://0.0.0.0:9093,SSL://[::]:9093" -- This message was sent by Atlassian Jira (v8.20.10#820010)
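The `IllegalArgumentException` in the report comes from a distinct-name check on the parsed listener list. A rough sketch of that check as the error message describes it (illustrative only, not the actual `kafka.utils.CoreUtils` code, which may apply additional KIP-797 dual-stack rules):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.stream.Collectors;

public class ListenerValidation {
    // Reject a listeners string whose listener names are not all distinct,
    // matching the "Each listener must have a different name" error above.
    static void validateDistinctNames(String listeners) {
        List<String> names = Arrays.stream(listeners.split(","))
                .map(l -> l.substring(0, l.indexOf("://")))
                .collect(Collectors.toList());
        if (names.size() != new HashSet<>(names).size()) {
            throw new IllegalArgumentException(
                "Each listener must have a different name, listeners: " + listeners);
        }
    }

    public static void main(String[] args) {
        validateDistinctNames("CONTROLPLANE://0.0.0.0:9090,SSL://0.0.0.0:9093");
        System.out.println("valid");
        try {
            validateDistinctNames("SSL://0.0.0.0:9093,SSL://[::]:9093");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

This reproduces the reported behavior: the duplicated `SSL` name is rejected regardless of whether the addresses are IPv4 or IPv6.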
[jira] [Commented] (KAFKA-16959) ConfigCommand should not allow to define both `entity-default` and `entity-name`
[ https://issues.apache.org/jira/browse/KAFKA-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856103#comment-17856103 ] Chia-Ping Tsai commented on KAFKA-16959: Another point is that other entity type can process both entities. Hence, broker entity should work for both entities too for consistency. > ConfigCommand should not allow to define both `entity-default` and > `entity-name` > > > Key: KAFKA-16959 > URL: https://issues.apache.org/jira/browse/KAFKA-16959 > Project: Kafka > Issue Type: Improvement >Reporter: Chia-Ping Tsai >Assignee: 黃竣陽 >Priority: Minor > > When users input both `entity-default` and `entity-name`, only `entity-name` > will get evaluated. It seems to me that is error-prone. We should throw > exception directly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16996) The leastLoadedNode() function in kafka-client may choose a faulty node during the consumer thread starting and meanwhile one of the KAFKA server node is dead.
Goufu created KAFKA-16996: - Summary: The leastLoadedNode() function in kafka-client may choose a faulty node during the consumer thread starting and meanwhile one of the KAFKA server node is dead. Key: KAFKA-16996 URL: https://issues.apache.org/jira/browse/KAFKA-16996 Project: Kafka Issue Type: Bug Components: clients Affects Versions: 3.6.0, 2.3.0, 2.0.1 Reporter: Goufu The leastLoadedNode() function has a bug during the consumer process's starting period. The function sendMetadataRequest() called by getTopicMetadataRequest() uses a random node which may be faulty, since every node's state recorded in the client thread is not ready yet. It happened in my production environment while my consumer thread was restarting and one of the Kafka server nodes was down. I'm using the kafka-client-2.0.1.jar. I have checked the source code of higher versions and the issue still exists. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-16959) ConfigCommand should not allow to define both `entity-default` and `entity-name`
[ https://issues.apache.org/jira/browse/KAFKA-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856096#comment-17856096 ] 黃竣陽 edited comment on KAFKA-16959 at 6/19/24 1:49 AM: -- In this sentence {quote}{{Alter an entity's config by adding/updating/removing config entries. The given entity name may either be a specific entity (via --entity-name) or, if supported, the default config for all entities (via --entity-default).}} {quote} I think we should focus on "either ... or ...": it means we should choose one kind of entity. If we allowed both kinds of entities, we would need to check which update mode each added/updated/removed config belongs to. Personally, I think it would make the code complex. [~chia7712] [~tailm.asf] WDYT? was (Author: JIRAUSER305187): In this sentence ??Alter an entity's config by adding/updating/removing config entries. The given entity name may either be a specific entity (via --entity-name) or, if supported, the default config for all entities (via --entity-default).?? I think that we should focus on "either ... or ...", It means we should choose one kind of entity. If we can use two kind entities, we need to check adding/updating/removing config belongs to which Update Mode. Personally, It will make code complex. [~chia7712] [~tailm.asf] WDYT? > ConfigCommand should not allow to define both `entity-default` and > `entity-name` > > > Key: KAFKA-16959 > URL: https://issues.apache.org/jira/browse/KAFKA-16959 > Project: Kafka > Issue Type: Improvement >Reporter: Chia-Ping Tsai >Assignee: 黃竣陽 >Priority: Minor > > When users input both `entity-default` and `entity-name`, only `entity-name` > will get evaluated. It seems to me that is error-prone. We should throw > exception directly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-8275) NetworkClient leastLoadedNode selection should consider throttled nodes
[ https://issues.apache.org/jira/browse/KAFKA-8275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856097#comment-17856097 ] Goufu commented on KAFKA-8275: -- The leastLoadedNode() function has a bug during the consumer process's starting period. The function sendMetadataRequest() called by getTopicMetadataRequest() uses a random node which may be faulty, since every node's state recorded in the client thread is not ready yet. It happened in my production environment while my consumer thread was restarting and one of the Kafka server nodes was down. I'm using the kafka-client-2.0.1.jar. I have checked the source code of higher versions and the issue still exists. > NetworkClient leastLoadedNode selection should consider throttled nodes > --- > > Key: KAFKA-8275 > URL: https://issues.apache.org/jira/browse/KAFKA-8275 > Project: Kafka > Issue Type: Bug > Components: clients, network >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > Fix For: 2.3.0 > > > The leastLoadedNode() function is intended to find any available node. It is > smart in the sense that it considers the number of inflight requests and > reconnect backoff, but it has not been updated to take into account client > throttling. If we have an available node which is not throttled, we should > use it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
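For readers following the discussion, the "least loaded node" idea in the ticket can be sketched as: among connected, non-throttled nodes, pick the one with the fewest in-flight requests. This is an illustrative model only, not Kafka's actual `NetworkClient` code; the `NodeState` type and its fields are invented for the sketch:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class NodeState {
    final String id;
    final boolean connected;
    final boolean throttled;
    final int inFlightRequests;

    NodeState(String id, boolean connected, boolean throttled, int inFlightRequests) {
        this.id = id;
        this.connected = connected;
        this.throttled = throttled;
        this.inFlightRequests = inFlightRequests;
    }
}

public class LeastLoadedNode {
    // Prefer a usable (connected, not throttled) node with the fewest
    // in-flight requests, per the ticket's suggested selection criteria.
    static Optional<NodeState> leastLoaded(List<NodeState> nodes) {
        return nodes.stream()
                .filter(n -> n.connected && !n.throttled)
                .min(Comparator.comparingInt(n -> n.inFlightRequests));
    }

    public static void main(String[] args) {
        List<NodeState> nodes = List.of(
                new NodeState("broker-0", true, false, 3),
                new NodeState("broker-1", true, true, 0),   // throttled: skipped
                new NodeState("broker-2", true, false, 1));
        System.out.println(leastLoaded(nodes).orElseThrow().id);
    }
}
```

The startup problem KAFKA-16996 describes would correspond to the case where no per-node state exists yet, so the filter cannot distinguish healthy from faulty nodes.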
[jira] [Comment Edited] (KAFKA-16959) ConfigCommand should not allow to define both `entity-default` and `entity-name`
[ https://issues.apache.org/jira/browse/KAFKA-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856096#comment-17856096 ] 黃竣陽 edited comment on KAFKA-16959 at 6/19/24 1:46 AM: -- In this sentence ??Alter an entity's config by adding/updating/removing config entries. The given entity name may either be a specific entity (via --entity-name) or, if supported, the default config for all entities (via --entity-default).?? I think we should focus on "either ... or ...": it means we should choose one kind of entity. If we allowed both kinds of entities, we would need to check which update mode each added/updated/removed config belongs to. Personally, I think it would make the code complex. [~chia7712] [~tailm.asf] WDYT? was (Author: JIRAUSER305187): In this sentence > Alter an entity's config by adding/updating/removing config entries. The > given entity name may either be a specific entity (via --entity-name) or, if > supported, the default config for all entities (via --entity-default). I think that we should focus on "either ... or ...", It means we should choose one kind of entity. If we can use two kind entities, we need to check adding/updating/removing config belongs to which Update Mode. Personally, It will make code complex. [~chia7712] [~tailm.asf] WDYT? > ConfigCommand should not allow to define both `entity-default` and > `entity-name` > > > Key: KAFKA-16959 > URL: https://issues.apache.org/jira/browse/KAFKA-16959 > Project: Kafka > Issue Type: Improvement >Reporter: Chia-Ping Tsai >Assignee: 黃竣陽 >Priority: Minor > > When users input both `entity-default` and `entity-name`, only `entity-name` > will get evaluated. It seems to me that is error-prone. We should throw > exception directly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16959) ConfigCommand should not allow to define both `entity-default` and `entity-name`
[ https://issues.apache.org/jira/browse/KAFKA-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856096#comment-17856096 ] 黃竣陽 commented on KAFKA-16959: - In this sentence > Alter an entity's config by adding/updating/removing config entries. The > given entity name may either be a specific entity (via --entity-name) or, if > supported, the default config for all entities (via --entity-default). I think we should focus on "either ... or ...": it means we should choose one kind of entity. If we allowed both kinds of entities, we would need to check which update mode each added/updated/removed config belongs to. Personally, I think it would make the code complex. [~chia7712] [~tailm.asf] WDYT? > ConfigCommand should not allow to define both `entity-default` and > `entity-name` > > > Key: KAFKA-16959 > URL: https://issues.apache.org/jira/browse/KAFKA-16959 > Project: Kafka > Issue Type: Improvement >Reporter: Chia-Ping Tsai >Assignee: 黃竣陽 >Priority: Minor > > When users input both `entity-default` and `entity-name`, only `entity-name` > will get evaluated. It seems to me that is error-prone. We should throw > exception directly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16957) Enable KafkaConsumerTest#configurableObjectsShouldSeeGeneratedClientId to work with CLASSIC and CONSUMER
[ https://issues.apache.org/jira/browse/KAFKA-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai resolved KAFKA-16957. Resolution: Fixed > Enable KafkaConsumerTest#configurableObjectsShouldSeeGeneratedClientId to > work with CLASSIC and CONSUMER > - > > Key: KAFKA-16957 > URL: https://issues.apache.org/jira/browse/KAFKA-16957 > Project: Kafka > Issue Type: Test > Components: clients, consumer, unit tests >Reporter: Chia-Ping Tsai >Assignee: Chia Chuan Yu >Priority: Minor > Fix For: 3.9.0 > > > `CLIENT_IDS` is a static variable, so later tests will see previous test > results. We should clear it before each test. -- This message was sent by Atlassian Jira (v8.20.10#820010)
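The fix idea in the ticket, clearing shared static state before each test so a parameterized run (CLASSIC and CONSUMER) does not see earlier results, can be sketched like this. The names are illustrative, not the actual `KafkaConsumerTest` code; in JUnit the reset would live in a `@BeforeEach` method:

```java
import java.util.ArrayList;
import java.util.List;

public class ClientIdTracking {
    // Shared static state: without a reset, a second parameterized run
    // observes the client IDs recorded by the first.
    static final List<String> CLIENT_IDS = new ArrayList<>();

    // In JUnit 5 this would be annotated @BeforeEach.
    static void setUp() {
        CLIENT_IDS.clear();
    }

    public static void main(String[] args) {
        CLIENT_IDS.add("consumer-classic-1"); // residue from a previous test
        setUp();                              // the fix: clear before the test
        CLIENT_IDS.add("consumer-async-1");
        System.out.println(CLIENT_IDS);
    }
}
```

Without the `setUp()` call, the printed list would contain both entries, which is exactly the cross-test leakage the ticket describes.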
[jira] [Resolved] (KAFKA-16989) Use StringBuilder instead of string concatenation
[ https://issues.apache.org/jira/browse/KAFKA-16989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai resolved KAFKA-16989. Fix Version/s: 3.9.0 Resolution: Fixed > Use StringBuilder instead of string concatenation > - > > Key: KAFKA-16989 > URL: https://issues.apache.org/jira/browse/KAFKA-16989 > Project: Kafka > Issue Type: Improvement >Reporter: Chia-Ping Tsai >Assignee: TengYao Chi >Priority: Minor > Fix For: 3.9.0 > > > https://github.com/apache/kafka/blob/2fd00ce53678509c9f2cfedb428e37a871e3d530/metadata/src/main/java/org/apache/kafka/image/node/ClientQuotasImageNode.java#L130 > The string concatenation will create many new strings and we can reduce the > cost by using StringBuilder -- This message was sent by Atlassian Jira (v8.20.10#820010)
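A small sketch of the suggested optimization: repeated `+=` on a `String` allocates a fresh string on every step (roughly quadratic copying), while a `StringBuilder` grows one buffer and materializes a single `String` at the end. The method and data here are illustrative, not the actual `ClientQuotasImageNode` code:

```java
import java.util.List;

public class QuotaEntityFormatter {
    // O(n^2) character copies: each += builds a brand-new String.
    static String joinWithConcat(List<String> parts) {
        String result = "";
        for (String p : parts) {
            result += p + ", ";
        }
        return result;
    }

    // O(n) total copies: append into one buffer, convert once at the end.
    static String joinWithBuilder(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p).append(", ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> parts = List.of("user=alice", "client-id=app-1");
        String a = joinWithConcat(parts);
        String b = joinWithBuilder(parts);
        if (!a.equals(b)) throw new AssertionError("outputs differ");
        System.out.println(b);
    }
}
```

Both methods produce identical output; only the allocation behavior differs, which is the cost the ticket targets.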
[jira] [Commented] (KAFKA-16959) ConfigCommand should not allow to define both `entity-default` and `entity-name`
[ https://issues.apache.org/jira/browse/KAFKA-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856093#comment-17856093 ] Chia-Ping Tsai commented on KAFKA-16959: [~tailm.asf] nice point. KIP-543 says: {quote} Alter an entity's config by adding/updating/removing config entries. The given entity name may either be a specific entity (via --entity-name) or, if supported, the default config for all entities (via --entity-default). {quote} Personally, I don't think it explicitly declares that we can input "both" kinds of entities. Also, the current implementation [0] seems to send a request for the "head" entity only. BUT, I feel the point from [~tailm.asf] is better than my previous intent. Instead of throwing an exception, making it work with both entities is more valuable. [~m1a2st] [~tailm.asf] WDYT? [0] https://github.com/apache/kafka/blob/e2060204fead9aa03f611b76c543b6824f8eb26b/core/src/main/scala/kafka/admin/ConfigCommand.scala#L401 > ConfigCommand should not allow to define both `entity-default` and > `entity-name` > > > Key: KAFKA-16959 > URL: https://issues.apache.org/jira/browse/KAFKA-16959 > Project: Kafka > Issue Type: Improvement >Reporter: Chia-Ping Tsai >Assignee: 黃竣陽 >Priority: Minor > > When users input both `entity-default` and `entity-name`, only `entity-name` > will get evaluated. It seems to me that is error-prone. We should throw > exception directly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
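The ticket's original proposal, rejecting the combination of the two flags instead of silently evaluating only `--entity-name`, can be contrasted in a small sketch. The flag names follow the ticket; the resolution logic is hypothetical, not `ConfigCommand`'s actual code:

```java
import java.util.Optional;

public class EntityFlagCheck {
    // Original proposal from the ticket: fail fast when both flags are
    // given, rather than silently ignoring --entity-default.
    static String resolveEntity(Optional<String> entityName, boolean entityDefault) {
        if (entityName.isPresent() && entityDefault) {
            throw new IllegalArgumentException(
                "Only one of --entity-name and --entity-default may be given");
        }
        return entityName.orElse("<default>");
    }

    public static void main(String[] args) {
        // A specific entity alone is fine.
        System.out.println(resolveEntity(Optional.of("0"), false));
        // Both together would be rejected under the proposed behavior.
        try {
            resolveEntity(Optional.of("0"), true);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected");
        }
    }
}
```

The later comments in the thread lean toward a third option instead: applying the change to both the named entity and the default, as other entity types already do.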
[jira] [Assigned] (KAFKA-16508) Infinite loop if output topic does not exist
[ https://issues.apache.org/jira/browse/KAFKA-16508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax reassigned KAFKA-16508: --- Assignee: Alieh Saeedi > Infinite loop if output topic does not exist > > > Key: KAFKA-16508 > URL: https://issues.apache.org/jira/browse/KAFKA-16508 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Alieh Saeedi >Priority: Major > > Kafka Streams supports `ProductionExceptionHandler` to drop records on error > when writing into an output topic. > However, if the output topic does not exist, the corresponding error cannot > be skipped over because the handler is not called. > The issue is that the producer internally retries to fetch the output topic > metadata until it times out, and a `TimeoutException` (which is a > `RetriableException`) is returned via the registered `Callback`. However, for > `RetriableException` there is a different code path and the > `ProductionExceptionHandler` is not called. > In general, Kafka Streams correctly tries to handle as many errors as possible > internally, and a `RetriableError` falls into this category (and thus there > is no need to call the handler). However, for this particular case, just > retrying does not solve the issue – it's unclear if throwing a retryable > `TimeoutException` is actually the right thing to do for the Producer. It is also > not clear what the right way to address this ticket would be (currently, we > cannot really detect this case, except if we would do some nasty error > message String comparison, which sounds hacky...) -- This message was sent by Atlassian Jira (v8.20.10#820010)
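The code-path split the ticket describes can be modeled with a self-contained sketch (all class names here are hypothetical stand-ins, not Kafka's actual producer internals): errors of a retriable type take an internal retry path and never reach the user's `ProductionExceptionHandler`, so a missing output topic, surfaced as a retriable `TimeoutException`, loops forever.

```java
import java.util.function.Function;

public class HandlerPathSketch {
    // Hypothetical stand-ins for the exception hierarchy the ticket refers to.
    static class RetriableException extends RuntimeException { }
    static class TimeoutException extends RetriableException { }

    enum Response { RETRY, DROP, FAIL }

    // Models the dispatch described in the ticket: retriable errors take an
    // internal retry path, and the user-supplied handler is never consulted.
    static Response dispatch(Exception error, Function<Exception, Response> userHandler) {
        if (error instanceof RetriableException) {
            return Response.RETRY; // handler skipped; retried forever if the topic never appears
        }
        return userHandler.apply(error);
    }

    public static void main(String[] args) {
        // A handler that wants to drop every failed record.
        Function<Exception, Response> dropAll = e -> Response.DROP;

        // Missing-topic metadata timeouts are retriable, so the drop decision
        // is unreachable -- the infinite loop the ticket describes.
        System.out.println(dispatch(new TimeoutException(), dropAll)); // RETRY
        System.out.println(dispatch(new IllegalStateException(), dropAll)); // DROP
    }
}
```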
[jira] [Assigned] (KAFKA-16995) The listeners broker parameter incorrect documentation
[ https://issues.apache.org/jira/browse/KAFKA-16995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dujian0068 reassigned KAFKA-16995: -- Assignee: dujian0068 > The listeners broker parameter incorrect documentation > --- > > Key: KAFKA-16995 > URL: https://issues.apache.org/jira/browse/KAFKA-16995 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.6.1 > Environment: Kafka 3.6.1 >Reporter: Sergey >Assignee: dujian0068 >Priority: Minor > > We are using Kafka 3.6.1 and the > [KIP-797|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726330] > describes configuring listeners with the same port and name for supporting > IPv4/IPv6 dual-stack. > Documentation link: > [https://kafka.apache.org/36/documentation.html#brokerconfigs_listeners] > As I understand it, Kafka should allow us to set the listener name and > listener port to the same value if we configure dual-stack. > But in reality, the broker returns an error if we set the listener name to > the same value. > Error example: > {code:java} > java.lang.IllegalArgumentException: requirement failed: Each listener must > have a different name, listeners: > CONTROLPLANE://0.0.0.0:9090,SSL://0.0.0.0:9093,SSL://[::]:9093 > at scala.Predef$.require(Predef.scala:337) > at kafka.utils.CoreUtils$.validate$1(CoreUtils.scala:214) > at kafka.utils.CoreUtils$.listenerListToEndPoints(CoreUtils.scala:268) > at kafka.server.KafkaConfig.listeners(KafkaConfig.scala:2120) > at kafka.server.KafkaConfig.(KafkaConfig.scala:1807) > at kafka.server.KafkaConfig.(KafkaConfig.scala:1604) > at kafka.Kafka$.buildServer(Kafka.scala:72) > at kafka.Kafka$.main(Kafka.scala:91) > at kafka.Kafka.main(Kafka.scala) {code} > I've tried to set the listeners to: "SSL://0.0.0.0:9093,SSL://[::]:9093" -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-16990) Unrecognised flag passed to kafka-storage.sh in system test
[ https://issues.apache.org/jira/browse/KAFKA-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Narula reassigned KAFKA-16990: - Assignee: Justine Olshan (was: Gaurav Narula) > Unrecognised flag passed to kafka-storage.sh in system test > --- > > Key: KAFKA-16990 > URL: https://issues.apache.org/jira/browse/KAFKA-16990 > Project: Kafka > Issue Type: Test >Reporter: Gaurav Narula >Assignee: Justine Olshan >Priority: Major > > Running > {{TC_PATHS="tests/kafkatest/tests/core/kraft_upgrade_test.py::TestKRaftUpgrade" > bash tests/docker/run_tests.sh}} on trunk (c4a3d2475f) fails with the > following: > {code:java} > [INFO:2024-06-18 09:16:03,139]: Triggering test 2 of 32... > [INFO:2024-06-18 09:16:03,147]: RunnerClient: Loading test {'directory': > '/opt/kafka-dev/tests/kafkatest/tests/core', 'file_name': > 'kraft_upgrade_test.py', 'cls_name': 'TestKRaftUpgrade', 'method_name': > 'test_isolated_mode_upgrade', 'injected_args': {'from_kafka_version': > '3.1.2', 'use_new_coordinator': True, 'metadata_quorum': 'ISOLATED_KRAFT'}} > [INFO:2024-06-18 09:16:03,151]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > on run 1/1 > [INFO:2024-06-18 09:16:03,153]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > Setting up... > [INFO:2024-06-18 09:16:03,153]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > Running... > [INFO:2024-06-18 09:16:05,999]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > Tearing down... 
> [INFO:2024-06-18 09:16:12,366]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > FAIL: RemoteCommandError({'ssh_config': {'host': 'ducker10', 'hostname': > 'ducker10', 'user': 'ducker', 'port': 22, 'password': '', 'identityfile': > '/home/ducker/.ssh/id_rsa', 'connecttimeout': None}, 'hostname': 'ducker10', > 'ssh_hostname': 'ducker10', 'user': 'ducker', 'externally_routable_ip': > 'ducker10', '_logger': kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT-2 > (DEBUG)>, 'os': 'linux', '_ssh_client': 0x85bccc70>, '_sftp_client': 0x85bccdf0>, '_custom_ssh_exception_checks': None}, > '/opt/kafka-3.1.2/bin/kafka-storage.sh format --ignore-formatted --config > /mnt/kafka/kafka.properties --cluster-id I2eXt9rvSnyhct8BYmW6-w -f > group.version=1', 1, b"usage: kafka-storage format [-h] --config CONFIG > --cluster-id CLUSTER_ID\n > [--ignore-formatted]\nkafka-storage: error: unrecognized arguments: '-f'\n") > Traceback (most recent call last): > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 186, in _do_run > data = self.run_test() > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 246, in run_test > return self.test_context.function(self.test) > File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line > 433, in wrapper > return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs) > File "/opt/kafka-dev/tests/kafkatest/tests/core/kraft_upgrade_test.py", > line 132, in test_isolated_mode_upgrade > self.run_upgrade(from_kafka_version, group_protocol) > File "/opt/kafka-dev/tests/kafkatest/tests/core/kraft_upgrade_test.py", > line 96, in run_upgrade > self.kafka.start() > File 
"/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 669, in > start > self.isolated_controller_quorum.start() > File "/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 671, in > start > Service.start(self, **kwargs) > File "/usr/local/lib/python3.9/dist-packages/ducktape/services/service.py", > line 265, in start > self.start_node(node, **kwargs) > File "/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 902, in > start_node > node.account.ssh(cmd) > File > "/usr/local/lib/python3.9/dist-packages/ducktape/cluster/remoteaccount.py", > line 35, in wrapper > return method(self, *args, **kwargs) > File > "/usr/local/lib/python3.9/dist-packages/ducktape/cluster/remoteaccount.py", > line
[jira] [Commented] (KAFKA-16990) Unrecognised flag passed to kafka-storage.sh in system test
[ https://issues.apache.org/jira/browse/KAFKA-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856078#comment-17856078 ] Gaurav Narula commented on KAFKA-16990: --- Thanks [~jolshan]! {quote}I'm wondering why this is set this way considering new group coordinator is not available for the original "old" version.{quote} IIUC, the flag is gated on {{use_new_coordinator}} and that was introduced with KAFKA-15578/[PR-14582|https://github.com/apache/kafka/commit/150b0e8290cda57df668ba89f6b422719866de5a#diff-0f57108563ac01c46ead76bc036e43a5ac988cbc4b17e314739fbd2fe7a1bf9e]. Perhaps the intention is to start the cluster with the new group coordinator _after_ the upgrade? I'm assigning the JIRA to you. Hope that's okay. > Unrecognised flag passed to kafka-storage.sh in system test > --- > > Key: KAFKA-16990 > URL: https://issues.apache.org/jira/browse/KAFKA-16990 > Project: Kafka > Issue Type: Test >Reporter: Gaurav Narula >Assignee: Gaurav Narula >Priority: Major > > Running > {{TC_PATHS="tests/kafkatest/tests/core/kraft_upgrade_test.py::TestKRaftUpgrade" > bash tests/docker/run_tests.sh}} on trunk (c4a3d2475f) fails with the > following: > {code:java} > [INFO:2024-06-18 09:16:03,139]: Triggering test 2 of 32... 
> [INFO:2024-06-18 09:16:03,147]: RunnerClient: Loading test {'directory': > '/opt/kafka-dev/tests/kafkatest/tests/core', 'file_name': > 'kraft_upgrade_test.py', 'cls_name': 'TestKRaftUpgrade', 'method_name': > 'test_isolated_mode_upgrade', 'injected_args': {'from_kafka_version': > '3.1.2', 'use_new_coordinator': True, 'metadata_quorum': 'ISOLATED_KRAFT'}} > [INFO:2024-06-18 09:16:03,151]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > on run 1/1 > [INFO:2024-06-18 09:16:03,153]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > Setting up... > [INFO:2024-06-18 09:16:03,153]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > Running... > [INFO:2024-06-18 09:16:05,999]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > Tearing down... 
> [INFO:2024-06-18 09:16:12,366]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > FAIL: RemoteCommandError({'ssh_config': {'host': 'ducker10', 'hostname': > 'ducker10', 'user': 'ducker', 'port': 22, 'password': '', 'identityfile': > '/home/ducker/.ssh/id_rsa', 'connecttimeout': None}, 'hostname': 'ducker10', > 'ssh_hostname': 'ducker10', 'user': 'ducker', 'externally_routable_ip': > 'ducker10', '_logger': kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT-2 > (DEBUG)>, 'os': 'linux', '_ssh_client': 0x85bccc70>, '_sftp_client': 0x85bccdf0>, '_custom_ssh_exception_checks': None}, > '/opt/kafka-3.1.2/bin/kafka-storage.sh format --ignore-formatted --config > /mnt/kafka/kafka.properties --cluster-id I2eXt9rvSnyhct8BYmW6-w -f > group.version=1', 1, b"usage: kafka-storage format [-h] --config CONFIG > --cluster-id CLUSTER_ID\n > [--ignore-formatted]\nkafka-storage: error: unrecognized arguments: '-f'\n") > Traceback (most recent call last): > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 186, in _do_run > data = self.run_test() > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 246, in run_test > return self.test_context.function(self.test) > File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line > 433, in wrapper > return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs) > File "/opt/kafka-dev/tests/kafkatest/tests/core/kraft_upgrade_test.py", > line 132, in test_isolated_mode_upgrade > self.run_upgrade(from_kafka_version, group_protocol) > File "/opt/kafka-dev/tests/kafkatest/tests/core/kraft_upgrade_test.py", > line 96, in run_upgrade > self.kafka.start() > File 
"/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 669, in > start > self.isolated_controller_quorum.start() > File "/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 671, in > start >
Re: [PR] KAFKA-16968: Make 3.8-IV0 a stable MetadataVersion and create 3.9-IV0 [kafka]
junrao commented on code in PR #16347: URL: https://github.com/apache/kafka/pull/16347#discussion_r1645128202 ## tools/src/test/java/org/apache/kafka/tools/FeatureCommandTest.java: ## @@ -146,7 +146,7 @@ public void testDowngradeMetadataVersionWithKRaft(ClusterInstance cluster) { ); // Change expected message to reflect possible MetadataVersion range 1-N (N increases when adding a new version) assertEquals("Could not disable metadata.version. Invalid update version 0 for feature " + -"metadata.version. Local controller 3000 only supports versions 1-21", commandOutput); +"metadata.version. Local controller 3000 only supports versions 1-23", commandOutput); Review Comment: Should we change IBP_3_7_IV4 and 3.7-IV4 in the following code to IBP_3_8_0? ``` @ClusterTest(types = {Type.KRAFT}, metadataVersion = MetadataVersion.IBP_3_7_IV4) "SupportedMaxVersion: 4.0-IV0\tFinalizedVersionLevel: 3.7-IV4\t", outputWithoutEpoch(features.get(1))); ``` ## core/src/test/scala/unit/kafka/server/ApiVersionsRequestTest.scala: ## @@ -47,7 +47,7 @@ object ApiVersionsRequestTest { List(ClusterConfig.defaultBuilder() .setTypes(java.util.Collections.singleton(Type.ZK)) .setServerProperties(serverProperties) - .setMetadataVersion(MetadataVersion.IBP_4_0_IV0) + .setMetadataVersion(MetadataVersion.latestTesting()) Review Comment: Should we change the MV in the following to the new production MV? 
``` def testApiVersionsRequestValidationV0Template(): java.util.List[ClusterConfig] = { val serverProperties: java.util.HashMap[String, String] = controlPlaneListenerProperties() serverProperties.put("unstable.api.versions.enable", "false") serverProperties.put("unstable.feature.versions.enable", "false") List(ClusterConfig.defaultBuilder() .setTypes(java.util.Collections.singleton(Type.ZK)) .setMetadataVersion(MetadataVersion.IBP_3_7_IV4) .build()).asJava } ``` ``` @ClusterTest(types = Array(Type.KRAFT, Type.CO_KRAFT), metadataVersion = MetadataVersion.IBP_3_7_IV4, serverProperties = Array( new ClusterConfigProperty(key = "unstable.api.versions.enable", value = "false"), new ClusterConfigProperty(key = "unstable.feature.versions.enable", value = "false"), )) ``` ## metadata/src/test/java/org/apache/kafka/controller/PartitionChangeBuilderTest.java: ## @@ -125,7 +125,7 @@ private static MetadataVersion metadataVersionForPartitionChangeRecordVersion(sh case (short) 1: return MetadataVersion.IBP_3_7_IV2; case (short) 2: -return MetadataVersion.IBP_3_8_IV0; +return MetadataVersion.IBP_3_9_IV1; Review Comment: Should we add 3.8-IV0 to the following? The intention seems to be testing the latest production MV for each minor release. Then, I don't understand why we include 3.6-IV0 instead of 3.6-IV2. Is there a way to avoid manually adding the latest production MV in the future? 
``` @ValueSource(strings = {"3.6-IV0", "3.7-IV4"}) public void testNoLeaderEpochBumpOnIsrShrink(String metadataVersionString) { ``` ``` @ValueSource(strings = {"3.6-IV0", "3.7-IV4"}) public void testLeaderEpochBumpOnIsrShrinkWithZkMigration(String metadataVersionString) { ``` ``` @ValueSource(strings = {"3.4-IV0", "3.5-IV2", "3.6-IV0", "3.7-IV4"}) public void testNoLeaderEpochBumpOnIsrExpansion(String metadataVersionString) { ``` ``` @ValueSource(strings = {"3.4-IV0", "3.5-IV2", "3.6-IV0", "3.7-IV4"}) public void testNoLeaderEpochBumpOnIsrExpansionDuringMigration(String metadataVersionString) { ``` ``` @ValueSource(strings = {"3.4-IV0", "3.5-IV2", "3.6-IV0", "3.7-IV4"}) public void testLeaderEpochBumpOnNewReplicaSetDisjoint(String metadataVersionString) { ``` ``` @ValueSource(strings = {"3.4-IV0", "3.5-IV2", "3.6-IV0", "3.7-IV4"}) public void testNoLeaderEpochBumpOnEmptyTargetIsr(String metadataVersionString) { ``` ## server-common/src/main/java/org/apache/kafka/server/common/MetadataVersion.java: ## @@ -331,7 +343,7 @@ public boolean isDirectoryAssignmentSupported() { } public boolean isElrSupported() { -return this.isAtLeast(IBP_3_8_IV0); +return this.isAtLeast(IBP_3_9_IV0); Review Comment: This needs to be IBP_3_9_IV1 now. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16000 Migrated MembershipManagerImplTest away from ConsumerTestBuilder [kafka]
lianetm commented on PR #16312: URL: https://github.com/apache/kafka/pull/16312#issuecomment-2176982668 Hey @brenden20, thanks for the changes! Took a first pass, left a few comments. I see some other calls around tests blocked on commits that seem confusing but I expect they may all align better if we review how we use the commit manager and are able to simplify to a mock maybe. Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16000 Migrated MembershipManagerImplTest away from ConsumerTestBuilder [kafka]
lianetm commented on code in PR #16312: URL: https://github.com/apache/kafka/pull/16312#discussion_r1645055613 ## clients/src/test/java/org/apache/kafka/clients/consumer/internals/MembershipManagerImplTest.java: ## @@ -2457,10 +2538,7 @@ private CompletableFuture mockRevocationNoCallbacks(boolean withAutoCommit doNothing().when(subscriptionState).markPendingRevocation(anySet()); when(subscriptionState.rebalanceListener()).thenReturn(Optional.empty()).thenReturn(Optional.empty()); if (withAutoCommit) { -when(commitRequestManager.autoCommitEnabled()).thenReturn(true); -CompletableFuture commitResult = new CompletableFuture<>(); - when(commitRequestManager.maybeAutoCommitSyncBeforeRevocation(anyLong())).thenReturn(commitResult); -return commitResult; Review Comment: uhm, removing this will mean that in all tests that want to block on commit we will have to repeat the same `when(commitRequestManager.maybeAutoCommitSyncBeforeRevocation...` stub we had here, I guess? I do see that you had to add this line in many places in the end. Again, seems related to how we're using an instance of commitRequestManager mentioned in comments above. (and if we are able to move to a mocked commitManager this should probably stay?) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16000 Migrated MembershipManagerImplTest away from ConsumerTestBuilder [kafka]
lianetm commented on code in PR #16312: URL: https://github.com/apache/kafka/pull/16312#discussion_r1645039449 ## clients/src/test/java/org/apache/kafka/clients/consumer/internals/MembershipManagerImplTest.java: ## @@ -141,23 +136,41 @@ private MembershipManagerImpl createMembershipManagerJoiningGroup(String groupIn } private MembershipManagerImpl createMembershipManager(String groupInstanceId) { -return spy(new MembershipManagerImpl( +return new MembershipManagerImpl( GROUP_ID, Optional.ofNullable(groupInstanceId), REBALANCE_TIMEOUT, Optional.empty(), subscriptionState, commitRequestManager, metadata, logContext, Optional.empty(), -backgroundEventHandler, time, rebalanceMetricsManager)); +backgroundEventHandler, time, rebalanceMetricsManager); } private MembershipManagerImpl createMembershipManagerJoiningGroup(String groupInstanceId, String serverAssignor) { -MembershipManagerImpl manager = spy(new MembershipManagerImpl( +MembershipManagerImpl manager = new MembershipManagerImpl( GROUP_ID, Optional.ofNullable(groupInstanceId), REBALANCE_TIMEOUT, Optional.ofNullable(serverAssignor), subscriptionState, commitRequestManager, metadata, logContext, Optional.empty(), backgroundEventHandler, time, -rebalanceMetricsManager)); +rebalanceMetricsManager); manager.transitionToJoining(); return manager; } +private void createCommitRequestManager(boolean autoCommit) { +ConsumerConfig config = mock(ConsumerConfig.class); +if (autoCommit) { + when(config.getBoolean(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG)).thenReturn(true); +} + +commitRequestManager = new CommitRequestManager( Review Comment: I would aim for not having an actual commitRequestManager here, a mock should do, but I may be missing something. Check my other comments to get rid of the `createCommitRequestManager` (and maybe we can end up removing this?) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16000 Migrated MembershipManagerImplTest away from ConsumerTestBuilder [kafka]
lianetm commented on code in PR #16312: URL: https://github.com/apache/kafka/pull/16312#discussion_r1645037904 ## clients/src/test/java/org/apache/kafka/clients/consumer/internals/MembershipManagerImplTest.java: ## @@ -81,53 +82,47 @@ import static org.mockito.ArgumentMatchers.anyCollection; import static org.mockito.ArgumentMatchers.anyLong; import static org.mockito.ArgumentMatchers.anySet; +import static org.mockito.ArgumentMatchers.eq; import static org.mockito.Mockito.clearInvocations; import static org.mockito.Mockito.doNothing; import static org.mockito.Mockito.mock; import static org.mockito.Mockito.never; -import static org.mockito.Mockito.spy; import static org.mockito.Mockito.times; import static org.mockito.Mockito.verify; import static org.mockito.Mockito.when; +@SuppressWarnings("ClassDataAbstractionCoupling") public class MembershipManagerImplTest { private static final String GROUP_ID = "test-group"; private static final String MEMBER_ID = "test-member-1"; private static final int REBALANCE_TIMEOUT = 100; private static final int MEMBER_EPOCH = 1; -private final LogContext logContext = new LogContext(); +private LogContext logContext; private SubscriptionState subscriptionState; private ConsumerMetadata metadata; - private CommitRequestManager commitRequestManager; - -private ConsumerTestBuilder testBuilder; private BlockingQueue backgroundEventQueue; private BackgroundEventHandler backgroundEventHandler; private Time time; private Metrics metrics; private RebalanceMetricsManager rebalanceMetricsManager; +@SuppressWarnings("unchecked") @BeforeEach public void setup() { -testBuilder = new ConsumerTestBuilder(ConsumerTestBuilder.createDefaultGroupInformation()); -metadata = testBuilder.metadata; -subscriptionState = testBuilder.subscriptions; -commitRequestManager = testBuilder.commitRequestManager.orElseThrow(IllegalStateException::new); -backgroundEventQueue = testBuilder.backgroundEventQueue; -backgroundEventHandler = 
testBuilder.backgroundEventHandler; +metadata = mock(ConsumerMetadata.class); +subscriptionState = mock(SubscriptionState.class); +commitRequestManager = mock(CommitRequestManager.class); +backgroundEventHandler = mock(BackgroundEventHandler.class); +backgroundEventQueue = mock(BlockingQueue.class); Review Comment: heads-up, having these as mocks means we cannot retrieve the events like we used to (backgroundEventQueue.poll), that I notice is still used to get the event to `performCallback`. I would say having it as a mock is the right thing to do, but we need to fix how we get the events from it. Ex. [this](https://github.com/apache/kafka/blob/53ec9733c2bf8b122fbd193d8d00b8e25cbd648f/clients/src/test/java/org/apache/kafka/clients/consumer/internals/MembershipManagerImplTest.java#L2165-L2169) section to get the callback event to complete should be updated to retrieve the event with an argument captor instead (similar to how it's done [here](https://github.com/apache/kafka/blob/f2a552a1ebaa0ce933f90343655b63e2472856d9/clients/src/test/java/org/apache/kafka/clients/consumer/internals/HeartbeatRequestManagerTest.java#L807-L809), but retrieving a `ConsumerRebalanceListenerCallbackNeededEvent`) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
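The captor-based retrieval the reviewer suggests would look roughly like the following sketch. Mockito's `ArgumentCaptor` API is real; the `add(BackgroundEvent)` method on the mocked `BackgroundEventHandler` is assumed from the code under discussion and may drift:

```java
// Sketch only -- depends on Mockito and Kafka's internal test classes.
ArgumentCaptor<BackgroundEvent> captor = ArgumentCaptor.forClass(BackgroundEvent.class);
verify(backgroundEventHandler).add(captor.capture());

// The captured value replaces the old backgroundEventQueue.poll() call and can
// then be cast and handed to the test's callback-completion helper.
ConsumerRebalanceListenerCallbackNeededEvent event =
    (ConsumerRebalanceListenerCallbackNeededEvent) captor.getValue();
```

With the queue mocked, `verify` plus a captor is the idiomatic way to observe what was enqueued without depending on real queue behavior.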
[jira] [Created] (KAFKA-16995) The listeners broker parameter incorrect documentation
Sergey created KAFKA-16995: -- Summary: The listeners broker parameter incorrect documentation Key: KAFKA-16995 URL: https://issues.apache.org/jira/browse/KAFKA-16995 Project: Kafka Issue Type: Bug Affects Versions: 3.6.1 Environment: Kafka 3.6.1 Reporter: Sergey We are using Kafka 3.6.1 and the [KIP-797|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726330] describes configuring listeners with the same port and name for supporting IPv4/IPv6 dual-stack. Documentation link: https://kafka.apache.org/36/documentation.html#brokerconfigs_listeners As I understand it, Kafka should allow us to set the listener name and listener port to the same value if we configure dual-stack. But in reality, the broker returns an error if we set the listener name to the same value. Error example: {code:java} java.lang.IllegalArgumentException: requirement failed: Each listener must have a different name, listeners: CONTROLPLANE://0.0.0.0:9090,SSL://0.0.0.0:9093,SSL://[::]:9093 at scala.Predef$.require(Predef.scala:337) at kafka.utils.CoreUtils$.validate$1(CoreUtils.scala:214) at kafka.utils.CoreUtils$.listenerListToEndPoints(CoreUtils.scala:268) at kafka.server.KafkaConfig.listeners(KafkaConfig.scala:2120) at kafka.server.KafkaConfig.(KafkaConfig.scala:1807) at kafka.server.KafkaConfig.(KafkaConfig.scala:1604) at kafka.Kafka$.buildServer(Kafka.scala:72) at kafka.Kafka$.main(Kafka.scala:91) at kafka.Kafka.main(Kafka.scala) {code} I've tried to set the listeners to: "SSL://0.0.0.0:9093,SSL://[::]:9093" -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16995) The listeners broker parameter incorrect documentation
[ https://issues.apache.org/jira/browse/KAFKA-16995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey updated KAFKA-16995: --- Description: We are using Kafka 3.6.1 and the [KIP-797|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726330] describes configuring listeners with the same port and name for supporting IPv4/IPv6 dual-stack. Documentation link: [https://kafka.apache.org/36/documentation.html#brokerconfigs_listeners] As I understand it, Kafka should allow us to set the listener name and listener port to the same value if we configure dual-stack. But in reality, the broker returns an error if we set the listener name to the same value. Error example: {code:java} java.lang.IllegalArgumentException: requirement failed: Each listener must have a different name, listeners: CONTROLPLANE://0.0.0.0:9090,SSL://0.0.0.0:9093,SSL://[::]:9093 at scala.Predef$.require(Predef.scala:337) at kafka.utils.CoreUtils$.validate$1(CoreUtils.scala:214) at kafka.utils.CoreUtils$.listenerListToEndPoints(CoreUtils.scala:268) at kafka.server.KafkaConfig.listeners(KafkaConfig.scala:2120) at kafka.server.KafkaConfig.(KafkaConfig.scala:1807) at kafka.server.KafkaConfig.(KafkaConfig.scala:1604) at kafka.Kafka$.buildServer(Kafka.scala:72) at kafka.Kafka$.main(Kafka.scala:91) at kafka.Kafka.main(Kafka.scala) {code} I've tried to set the listeners to: "SSL://0.0.0.0:9093,SSL://[::]:9093" was: We are using Kafka 3.6.1 and the [KIP-797|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726330] describes configuring listeners with the same port and name for supporting IPv4/IPv6 dual-stack. Documentation link: https://kafka.apache.org/36/documentation.html#brokerconfigs_listeners |{{}}| As I understand it, Kafka should allow us to set the listener name and listener port to the same value if we configure dual-stack. But in reality, the broker returns an error if we set the listener name to the same value. 
Error example: {code:java} java.lang.IllegalArgumentException: requirement failed: Each listener must have a different name, listeners: CONTROLPLANE://0.0.0.0:9090,SSL://0.0.0.0:9093,SSL://[::]:9093 at scala.Predef$.require(Predef.scala:337) at kafka.utils.CoreUtils$.validate$1(CoreUtils.scala:214) at kafka.utils.CoreUtils$.listenerListToEndPoints(CoreUtils.scala:268) at kafka.server.KafkaConfig.listeners(KafkaConfig.scala:2120) at kafka.server.KafkaConfig.(KafkaConfig.scala:1807) at kafka.server.KafkaConfig.(KafkaConfig.scala:1604) at kafka.Kafka$.buildServer(Kafka.scala:72) at kafka.Kafka$.main(Kafka.scala:91) at kafka.Kafka.main(Kafka.scala) {code} I've tried to set the listeners to: "SSL://0.0.0.0:9093,SSL://[::]:9093" > The listeners broker parameter incorrect documentation > --- > > Key: KAFKA-16995 > URL: https://issues.apache.org/jira/browse/KAFKA-16995 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.6.1 > Environment: Kafka 3.6.1 >Reporter: Sergey >Priority: Minor > > We are using Kafka 3.6.1 and the > [KIP-797|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726330] > describes configuring listeners with the same port and name for supporting > IPv4/IPv6 dual-stack. > Documentation link: > [https://kafka.apache.org/36/documentation.html#brokerconfigs_listeners] > As I understand it, Kafka should allow us to set the listener name and > listener port to the same value if we configure dual-stack. > But in reality, the broker returns an error if we set the listener name to > the same value. 
> Error example: > {code:java} > java.lang.IllegalArgumentException: requirement failed: Each listener must > have a different name, listeners: > CONTROLPLANE://0.0.0.0:9090,SSL://0.0.0.0:9093,SSL://[::]:9093 > at scala.Predef$.require(Predef.scala:337) > at kafka.utils.CoreUtils$.validate$1(CoreUtils.scala:214) > at kafka.utils.CoreUtils$.listenerListToEndPoints(CoreUtils.scala:268) > at kafka.server.KafkaConfig.listeners(KafkaConfig.scala:2120) > at kafka.server.KafkaConfig.(KafkaConfig.scala:1807) > at kafka.server.KafkaConfig.(KafkaConfig.scala:1604) > at kafka.Kafka$.buildServer(Kafka.scala:72) > at kafka.Kafka$.main(Kafka.scala:91) > at kafka.Kafka.main(Kafka.scala) {code} > I've tried to set the listeners to: "SSL://0.0.0.0:9093,SSL://[::]:9093" -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16944) Range assignor doesn't co-partition with stickiness
[ https://issues.apache.org/jira/browse/KAFKA-16944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856061#comment-17856061 ] Ritika Reddy commented on KAFKA-16944: -- Hey! Thanks for showing interest in this task! We've already started working on this internally, so I think we're good for now! > Range assignor doesn't co-partition with stickiness > --- > > Key: KAFKA-16944 > URL: https://issues.apache.org/jira/browse/KAFKA-16944 > Project: Kafka > Issue Type: Sub-task >Reporter: Ritika Reddy >Assignee: Ritika Reddy >Priority: Major > > When stickiness is considered during range assignments, it is possible that > in certain cases where co-partitioning is guaranteed we fail. > An example would be: > Consider two topics T1, T2 with 3 partitions each and three members A, B, C. > Let's say the existing assignment (for whatever reason) is: > {quote}A -> T1P0 || B -> T1P1, T2P0, T2P1, T2P2 || C -> T1P2 > {quote} > Now we trigger a rebalance with the following subscriptions, where all members > are subscribed to both topics and everything else is the same: > {quote}A -> T1, T2 || B -> T1, T2 || C -> T1, T2 > {quote} > Since all the topics have an equal number of partitions and all the members > are subscribed to the same set of topics, we would expect co-partitioning, so we would > want the final assignment returned to be: > {quote}A -> T1P0, T2P0 || B -> T1P1, T2P1 || C -> T1P2, T2P2 > {quote} > Currently the client side assignor returns the following, but only because it > doesn't assign sticky partitions: > {{C=[topic1-2, topic2-2], B=[topic1-1, topic2-1], A=[topic1-0, topic2-0]}} > > Our server side assignor returns > (the partitions in bold are the sticky partitions): > {{A=MemberAssignment(targetPartitions={topic2=[1], *topic1=[0]*}), > B=MemberAssignment(targetPartitions={*topic2=[0]*, *topic1=[1]*}), > C=MemberAssignment(targetPartitions={topic2=[2], *topic1=[2]*})}} > *As seen 
above, co-partitioning is expected but not returned.* -- This message was sent by Atlassian Jira (v8.20.10#820010)
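The co-partitioned result the ticket expects can be reproduced with a small sketch (an illustrative simplification, not Kafka's assignor code): when all subscribed topics have the same partition count, giving partition index p of every topic to the same member keeps the topics aligned.

```java
import java.util.*;

// Minimal sketch of co-partitioned range assignment (hypothetical, not the
// real assignor): partition p of every subscribed topic goes to the same
// member, so equally-partitioned topics stay co-partitioned.
class CoPartitionedRangeSketch {
    static Map<String, List<String>> assign(List<String> members,
                                            List<String> topics,
                                            int partitionsPerTopic) {
        Map<String, List<String>> assignment = new LinkedHashMap<>();
        for (String member : members)
            assignment.put(member, new ArrayList<>());
        for (int p = 0; p < partitionsPerTopic; p++) {
            // One owner per partition index, shared across all topics.
            String owner = members.get(p % members.size());
            for (String topic : topics)
                assignment.get(owner).add(topic + "-" + p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // The expected outcome from the ticket:
        // A -> T1P0, T2P0 || B -> T1P1, T2P1 || C -> T1P2, T2P2
        System.out.println(assign(Arrays.asList("A", "B", "C"),
                                  Arrays.asList("topic1", "topic2"), 3));
    }
}
```

The bug report is precisely that once stickiness is layered on top, this alignment is no longer guaranteed by the server-side assignor.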
[jira] [Resolved] (KAFKA-16961) TestKRaftUpgrade system tests fail in v3.7.1 RC1
[ https://issues.apache.org/jira/browse/KAFKA-16961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Soarez resolved KAFKA-16961. - Resolution: Fixed Verified test no longer failing with {code:java} TC_PATHS="tests/kafkatest/tests/core/kraft_upgrade_test.py::TestKRaftUpgrade" bash tests/docker/run_tests.sh {code} after KAFKA-16969 > TestKRaftUpgrade system tests fail in v3.7.1 RC1 > > > Key: KAFKA-16961 > URL: https://issues.apache.org/jira/browse/KAFKA-16961 > Project: Kafka > Issue Type: Test >Reporter: Luke Chen >Assignee: Igor Soarez >Priority: Blocker > Fix For: 3.8.0, 3.7.1 > > > > > {code:java} > > SESSION REPORT (ALL TESTS) > ducktape version: 0.11.4 > session_id: 2024-06-14--003 > run time: 86 minutes 13.705 seconds > tests run: 24 > passed: 18 > flaky: 0 > failed: 6 > ignored: 0 > > test_id: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=False.metadata_quorum=ISOLATED_KRAFT > status: PASS > run time: 3 minutes 44.680 seconds > > test_id: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT > status: PASS > run time: 3 minutes 42.627 seconds > > test_id: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.2.3.use_new_coordinator=False.metadata_quorum=ISOLATED_KRAFT > status: PASS > run time: 3 minutes 28.205 seconds > > test_id: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.2.3.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT > status: PASS > run time: 3 minutes 42.388 seconds > > test_id: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.3.2.use_new_coordinator=False.metadata_quorum=ISOLATED_KRAFT > status: PASS > run time: 2 minutes 57.679 seconds > > test_id: 
> kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.3.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT > status: PASS > run time: 2 minutes 57.238 seconds > > test_id: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.4.1.use_new_coordinator=False.metadata_quorum=ISOLATED_KRAFT > status: PASS > run time: 2 minutes 52.545 seconds > > test_id: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.4.1.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT > status: PASS > run time: 2 minutes 56.289 seconds > > test_id: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.5.2.use_new_coordinator=False.metadata_quorum=ISOLATED_KRAFT > status: PASS > run time: 2 minutes 54.953 seconds > > test_id: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.5.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT > status: PASS > run time: 2 minutes 59.579 seconds > > test_id: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=dev.use_new_coordinator=False.metadata_quorum=ISOLATED_KRAFT > status: PASS > run time: 3 minutes 21.016 seconds > > test_id: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=dev.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT > status:
[PR] Deprecate window.size.ms in StreamsConfig.java and [kafka]
Cerchie opened a new pull request, #16391: URL: https://github.com/apache/kafka/pull/16391 This is the first step towards [KIP-1020](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=290982804) Pieces of config to deprecate: - window.size.ms - windowed.inner.class.serde In the code, `WINDOWED_INNER_CLASS_SERDE` was directly below `DEFAULT_WINDOWED_VALUE_SERDE_INNER_CLASS` with no decorators or javadocs. I added a @Deprecated decorator. I added a @Deprecated decorator to `WINDOW_SIZE_MS_CONFIG` and a short note regarding deprecation to the javadoc. Normally I'd note the release it was dep'd in as well as alternate config but those are not established yet. In the documentation, I added a short note on deprecation to `window.size.ms` but again, had no further info to add. Something titled `windowed.inner.class.serde` was not present in config-streams.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
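For reference, the deprecation pattern the PR applies looks roughly like this (a sketch only: the constant name comes from the PR, while the javadoc wording is a placeholder, since the deprecating release and the replacement config are not established yet):

```java
// Sketch of the @Deprecated pattern described in the PR. The javadoc
// deliberately omits the release and replacement config, which KIP-1020
// has not finalized yet (placeholder note).
class StreamsConfigSketch {
    /**
     * {@code "window.size.ms"}
     * @deprecated a replacement configuration will be named in a future release
     */
    @Deprecated
    public static final String WINDOW_SIZE_MS_CONFIG = "window.size.ms";
}
```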
Re: [PR] KAFKA-16899 GroupRebalanceConfig: rebalanceTimeoutMs updated to commitTimeoutDuringReconciliation [kafka]
linu-shibu commented on PR #16334: URL: https://github.com/apache/kafka/pull/16334#issuecomment-2176913853 > My apologies, @linu-shibu. > > I think the consensus was to change [`MembershipManagerImpl`'s](https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/MembershipManagerImpl.java) `rebalanceTimeoutMs` instance variable to `commitTimeoutDuringReconciliation`. We don't want to change any other code at this point. > > Thanks! Updated accordingly, please review again -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16000 Migrated MembershipManagerImplTest away from ConsumerTestBuilder [kafka]
lianetm commented on code in PR #16312: URL: https://github.com/apache/kafka/pull/16312#discussion_r1645002348 ## clients/src/test/java/org/apache/kafka/clients/consumer/internals/MembershipManagerImplTest.java: ## @@ -696,21 +721,23 @@ public void testDelayedReconciliationResultAppliedWhenTargetChangedWithMetadataU // Member receives and reconciles topic1-partition0 Uuid topicId1 = Uuid.randomUuid(); String topic1 = "topic1"; + when(commitRequestManager.maybeAutoCommitSyncBeforeRevocation(anyLong())).thenReturn(CompletableFuture.completedFuture(null)); MembershipManagerImpl membershipManager = mockMemberSuccessfullyReceivesAndAcksAssignment(topicId1, topic1, Collections.singletonList(0)); membershipManager.onHeartbeatRequestSent(); assertEquals(MemberState.STABLE, membershipManager.state()); -clearInvocations(membershipManager, subscriptionState); +//clearInvocations(membershipManager, subscriptionState); Review Comment: let's remove the comment if we don't need the line anymore -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16000 Migrated MembershipManagerImplTest away from ConsumerTestBuilder [kafka]
lianetm commented on code in PR #16312: URL: https://github.com/apache/kafka/pull/16312#discussion_r1645001767 ## clients/src/test/java/org/apache/kafka/clients/consumer/internals/MembershipManagerImplTest.java: ## @@ -660,19 +688,16 @@ public void testSameAssignmentReconciledAgainWithMissingTopic() { ); membershipManager.onHeartbeatRequestSent(); assertEquals(MemberState.RECONCILING, membershipManager.state()); -clearInvocations(membershipManager); // Receive extended assignment - assignment received but no reconciliation triggered membershipManager.onHeartbeatSuccess(createConsumerGroupHeartbeatResponse(assignment2).data()); assertEquals(MemberState.RECONCILING, membershipManager.state()); -verifyReconciliationNotTriggered(membershipManager); Review Comment: similar to comment above, I would keep this kind of verification (spying on the membershipMgr only). The membership mgr holds the state machine for the member as part of a group, it's all intertwined, we've seen how small changes in one membership func affects/breaks states and transitions, so all coverage we can maintain here is important. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16000 Migrated MembershipManagerImplTest away from ConsumerTestBuilder [kafka]
lianetm commented on code in PR #16312: URL: https://github.com/apache/kafka/pull/16312#discussion_r1644996524 ## clients/src/test/java/org/apache/kafka/clients/consumer/internals/MembershipManagerImplTest.java: ## @@ -386,15 +404,16 @@ public void testNewAssignmentIgnoredWhenStateIsPrepareLeaving() { receiveAssignment(topicId, Arrays.asList(0, 1), membershipManager); assertEquals(MemberState.PREPARE_LEAVING, membershipManager.state()); assertTrue(membershipManager.topicsAwaitingReconciliation().isEmpty()); -verify(membershipManager, never()).markReconciliationInProgress(); // When callback completes member should transition to LEAVING. completeCallback(callbackEvent, membershipManager); +membershipManager.transitionToSendingLeaveGroup(false); Review Comment: uhm we shouldn't call this here ourselves. The test wants to ensure that when the callbacks complete, the membershipMgr internally calls this `transitionToSendingLeaveGroup`. If for some reason the transition to leaving was not happening we should review what's missing/failing that does not let the flow make it to [this](https://github.com/apache/kafka/blob/f2a552a1ebaa0ce933f90343655b63e2472856d9/clients/src/main/java/org/apache/kafka/clients/consumer/internals/MembershipManagerImpl.java#L695) point, where the internal call to transitionToSendingLeaveGroup should happen. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (KAFKA-16951) TransactionManager should rediscover coordinator on disconnection
[ https://issues.apache.org/jira/browse/KAFKA-16951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856057#comment-17856057 ] Graham Campbell commented on KAFKA-16951: - Yes, if the original coordinator is online the transactional request will either succeed as normal (if leader election has happened for the relevant __transaction_state partition) or quickly return a NOT_COORDINATOR error. I've made an attempt to generalize the handleServerDisconnect method used by the MetadataUpdater into a more general interface in the linked PR. Related to this ticket, I also opened KAFKA-16902 to use the socket.connection.setup.timeout.ms config to reduce the impact of attempting reconnection. > TransactionManager should rediscover coordinator on disconnection > - > > Key: KAFKA-16951 > URL: https://issues.apache.org/jira/browse/KAFKA-16951 > Project: Kafka > Issue Type: Improvement > Components: clients, producer >Affects Versions: 3.7.0 >Reporter: Graham Campbell >Priority: Major > > When a transaction coordinator for a transactional client shuts down for > restart or due to failure, the NetworkClient notices the broker disconnection > and [will automatically refresh cluster > metadata|https://github.com/apache/kafka/blob/f380cd1b64134cf81e5dab16d71a276781de890e/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java#L1182-L1183] > to get the latest partition assignments. > The TransactionManager does not notice any changes until the next > transactional request. If the broker is still offline, this is a [blocking > wait while the client attempts to reconnect to the old > coordinator|https://github.com/apache/kafka/blob/f380cd1b64134cf81e5dab16d71a276781de890e/clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java#L489-L490], > which can be up to request.timeout.ms long (default 35 seconds). Coordinator > lookup is only performed after a transactional request times out and fails. 
> The lookup is triggered in either the [Sender|#L525-L528] > or > [TransactionalManager's|https://github.com/apache/kafka/blob/f380cd1b64134cf81e5dab16d71a276781de890e/clients/src/main/java/org/apache/kafka/clients/producer/internals/TransactionManager.java#L1225-L1229] > error handling. > To support faster recovery and faster reaction to transaction coordinator > reassignments, the TransactionManager should proactively lookup the > transaction coordinator whenever the client is disconnected from the current > transaction coordinator. -- This message was sent by Atlassian Jira (v8.20.10#820010)
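The proactive rediscovery the ticket proposes can be sketched as a disconnect callback that invalidates the cached coordinator and flags a FindCoordinator lookup for the next poll. All names below are hypothetical; the actual PR generalizes NetworkClient's handleServerDisconnect hook rather than introducing these types.

```java
// Hypothetical sketch of KAFKA-16951's idea: on disconnect from the node we
// believe is the transaction coordinator, forget it and flag a coordinator
// lookup, instead of waiting for the next transactional request to time out.
interface DisconnectListener {
    void onDisconnect(int nodeId);
}

class CoordinatorTracker implements DisconnectListener {
    private Integer coordinatorId;        // cached coordinator node id, null if unknown
    private boolean lookupPending = false;

    synchronized void setCoordinator(int nodeId) {
        coordinatorId = nodeId;
        lookupPending = false;
    }

    @Override
    public synchronized void onDisconnect(int nodeId) {
        if (coordinatorId != null && coordinatorId == nodeId) {
            coordinatorId = null;   // coordinator may have moved; drop the cache
            lookupPending = true;   // trigger FindCoordinator on the next poll
        }
    }

    synchronized boolean shouldLookupCoordinator() {
        return lookupPending;
    }
}
```

Disconnects from non-coordinator nodes are ignored, so only a loss of the cached coordinator triggers the extra lookup.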
[PR] KAFKA-16951 Notify TransactionManager of disconnects to proactively rediscover Coordinator [kafka]
parafiend opened a new pull request, #16390: URL: https://github.com/apache/kafka/pull/16390 When disconnected from a transaction or consumer group coordinator, immediately perform a lookup to ensure we still have the correct node marked as the coordinator. This allows pro-active discovery of coordinator changes during broker restarts. Includes several test changes since the request/response flow has changed for disconnections so wait times, polling, and coordinator lookups all have slight adjustments. ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16788 - Fix resource leakage during connector start() failure [kafka]
ashoke-cube commented on code in PR #16095: URL: https://github.com/apache/kafka/pull/16095#discussion_r1644974636 ## connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerConnector.java: ## @@ -307,7 +307,7 @@ void doShutdown() { + " as the connector has been scheduled for shutdown"), null); } -if (state == State.STARTED) +if (state == State.STARTED || state == State.FAILED) connector.stop(); Review Comment: Done. Please do take a look. There are state transitions possible from INIT-> STOPPED/PAUSED. We don't call `stop()` in suspend, if the state is not STARTED. There shouldn't be any resource allocation in `initialize()` of a connector for this to be an issue, but wanted to confirm. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-15774: use the default dsl store supplier for fkj subscriptions [kafka]
jlprat commented on PR #16380: URL: https://github.com/apache/kafka/pull/16380#issuecomment-2176831655 Hi @ableegoldman I didn't generate the RC candidates yet, and we have at least 1 blocker. If it's low risk we could port it to 3.8. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (KAFKA-16959) ConfigCommand should not allow to define both `entity-default` and `entity-name`
[ https://issues.apache.org/jira/browse/KAFKA-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856048#comment-17856048 ] Tai Le Manh commented on KAFKA-16959: - Hm, it seems that using `entity-name` and `entity-default` at the same time is legal; it is allowed and described in [KIP-543|https://cwiki.apache.org/confluence/display/KAFKA/KIP-543%3A+Expand+ConfigCommand%27s+non-ZK+functionality] > ConfigCommand should not allow to define both `entity-default` and > `entity-name` > > > Key: KAFKA-16959 > URL: https://issues.apache.org/jira/browse/KAFKA-16959 > Project: Kafka > Issue Type: Improvement >Reporter: Chia-Ping Tsai >Assignee: 黃竣陽 >Priority: Minor > > When users input both `entity-default` and `entity-name`, only `entity-name` > will get evaluated. It seems to me that is error-prone. We should throw > exception directly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] KAFKA-15774: use the default dsl store supplier for fkj subscriptions [kafka]
ableegoldman commented on PR #16380: URL: https://github.com/apache/kafka/pull/16380#issuecomment-2176809133 hey @jlprat what's the status of 3.8? This is a fix for a bug in a 3.7 feature so it is technically not a regression, but it's low risk and fixes an edge case with the FKJ operator so it would be nice to slip into 3.8 if that's still possible -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16000 Migrated MembershipManagerImplTest away from ConsumerTestBuilder [kafka]
lianetm commented on code in PR #16312: URL: https://github.com/apache/kafka/pull/16312#discussion_r1644942652 ## clients/src/test/java/org/apache/kafka/clients/consumer/internals/MembershipManagerImplTest.java: ## @@ -366,12 +384,12 @@ public void testFencingWhenStateIsPrepareLeaving() { completeCallback(callbackEvent, membershipManager); assertEquals(MemberState.UNSUBSCRIBED, membershipManager.state()); assertEquals(ConsumerGroupHeartbeatRequest.LEAVE_GROUP_MEMBER_EPOCH, membershipManager.memberEpoch()); -verify(membershipManager).notifyEpochChange(Optional.empty(), Optional.empty()); Review Comment: Even though I totally agree with removing all the spies there were on other components, I would lean towards spying only on this membershipManager component in cases like this, where it seems valuable to verify on it. It would only mean that we spy on it when needed with `membershipManager = spy(createMemberInStableState())` on the specific tests that do verify, and not spying on membershipManager for all tests that in the end don't need it. Does that make sense? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16968: Make 3.8-IV0 a stable MetadataVersion and create 3.9-IV0 [kafka]
cmccabe commented on code in PR #16347: URL: https://github.com/apache/kafka/pull/16347#discussion_r1644941547 ## server-common/src/main/java/org/apache/kafka/server/common/MetadataVersion.java: ## @@ -202,11 +202,20 @@ public enum MetadataVersion { // Add new fetch request version for KIP-951 IBP_3_7_IV4(19, "3.7", "IV4", false), +// New version for the Kafka 3.8.0 release. +IBP_3_8_IV0(20, "3.8", "IV0", false), + +// +// NOTE: MetadataVersions after this point are unstable and may be changed. +// If users attempt to use an unstable MetadataVersion, they will get an error. +// Please move this comment when updating the LATEST_PRODUCTION constant. +// + // Add ELR related supports (KIP-966). -IBP_3_8_IV0(20, "3.8", "IV0", true), +IBP_3_9_IV0(21, "3.9", "IV0", true), Review Comment: OK. I will put that in 3.9-IV0, and we can put ELR in 3.9-IV1. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16968: Make 3.8-IV0 a stable MetadataVersion and create 3.9-IV0 [kafka]
cmccabe commented on code in PR #16347: URL: https://github.com/apache/kafka/pull/16347#discussion_r1644940428 ## server-common/src/test/java/org/apache/kafka/server/common/FeaturesTest.java: ## @@ -110,7 +110,7 @@ public void testLatestProductionMapsToLatestMetadataVersion(Features features) { @EnumSource(MetadataVersion.class) public void testDefaultTestVersion(MetadataVersion metadataVersion) { short expectedVersion; -if (!metadataVersion.isLessThan(MetadataVersion.IBP_3_8_IV0)) { +if (!metadataVersion.isLessThan(MetadataVersion.IBP_3_9_IV0)) { Review Comment: Is the one you're talking about `testDescribeWithKRaftAndBootstrapControllers` in `FeatureCommandTest.java`? I think it would be OK to leave that test case on 3.7-IV4, just to provide some variety. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16968: Make 3.8-IV0 a stable MetadataVersion and create 3.9-IV0 [kafka]
cmccabe commented on code in PR #16347: URL: https://github.com/apache/kafka/pull/16347#discussion_r1644937626 ## server-common/src/test/java/org/apache/kafka/server/common/MetadataVersionTest.java: ## @@ -184,8 +184,11 @@ public void testFromVersionString() { assertEquals(IBP_3_7_IV3, MetadataVersion.fromVersionString("3.7-IV3")); assertEquals(IBP_3_7_IV4, MetadataVersion.fromVersionString("3.7-IV4")); +assertEquals(IBP_3_8_IV0, MetadataVersion.fromVersionString("3.8")); Review Comment: The 3.7 comment is still true but I will add an a similar comment for 3.8. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16968: Make 3.8-IV0 a stable MetadataVersion and create 3.9-IV0 [kafka]
cmccabe commented on code in PR #16347: URL: https://github.com/apache/kafka/pull/16347#discussion_r1644936406 ## metadata/src/test/java/org/apache/kafka/metadata/PartitionRegistrationTest.java: ## @@ -371,7 +371,7 @@ public void testPartitionRegistrationToRecord_ElrShouldBeNullIfEmpty() { setPartitionEpoch(0); List exceptions = new ArrayList<>(); ImageWriterOptions options = new ImageWriterOptions.Builder(). -setMetadataVersion(MetadataVersion.IBP_3_8_IV0). +setMetadataVersion(MetadataVersion.IBP_3_9_IV0). Review Comment: good find, fixed -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16968: Make 3.8-IV0 a stable MetadataVersion and create 3.9-IV0 [kafka]
cmccabe commented on code in PR #16347: URL: https://github.com/apache/kafka/pull/16347#discussion_r1644935988 ## core/src/test/scala/integration/kafka/zk/ZkMigrationIntegrationTest.scala: ## @@ -73,7 +73,7 @@ object ZkMigrationIntegrationTest { MetadataVersion.IBP_3_7_IV2, MetadataVersion.IBP_3_7_IV4, MetadataVersion.IBP_3_8_IV0, - MetadataVersion.IBP_4_0_IV0 + MetadataVersion.IBP_3_9_IV0 Review Comment: 1. added 2. changed -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16968: Make 3.8-IV0 a stable MetadataVersion and create 3.9-IV0 [kafka]
cmccabe commented on code in PR #16347: URL: https://github.com/apache/kafka/pull/16347#discussion_r1644934445 ## tools/src/test/java/org/apache/kafka/tools/FeatureCommandTest.java: ## @@ -321,4 +321,4 @@ public void testHandleDisableDryRun() { "Can not disable metadata.version. Can't downgrade below 4%n" + "quux can be disabled."), disableOutput); } -} \ No newline at end of file +} Review Comment: That was unintentional, will remove -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16968: Make 3.8-IV0 a stable MetadataVersion and create 3.9-IV0 [kafka]
cmccabe commented on code in PR #16347: URL: https://github.com/apache/kafka/pull/16347#discussion_r1644933858 ## core/src/test/scala/unit/kafka/server/ApiVersionsRequestTest.scala: ## @@ -47,7 +47,7 @@ object ApiVersionsRequestTest { List(ClusterConfig.defaultBuilder() .setTypes(java.util.Collections.singleton(Type.ZK)) .setServerProperties(serverProperties) - .setMetadataVersion(MetadataVersion.IBP_4_0_IV0) + .setMetadataVersion(MetadataVersion.IBP_3_9_IV0) Review Comment: Good point. I'll just set `latestTesting` so that we don't have to keep updating it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16968: Make 3.8-IV0 a stable MetadataVersion and create 3.9-IV0 [kafka]
cmccabe commented on code in PR #16347: URL: https://github.com/apache/kafka/pull/16347#discussion_r1644933588 ## server-common/src/main/java/org/apache/kafka/server/common/MetadataVersion.java: ## @@ -331,7 +340,7 @@ public boolean isDirectoryAssignmentSupported() { } public boolean isElrSupported() { -return this.isAtLeast(IBP_3_8_IV0); +return this.isAtLeast(IBP_3_9_IV0); Review Comment: Maybe. We'll move it out again if it's not done next month -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16968: Make 3.8-IV0 a stable MetadataVersion and create 3.9-IV0 [kafka]
junrao commented on code in PR #16347: URL: https://github.com/apache/kafka/pull/16347#discussion_r1644931539 ## server-common/src/main/java/org/apache/kafka/server/common/MetadataVersion.java: ## @@ -202,11 +202,20 @@ public enum MetadataVersion { // Add new fetch request version for KIP-951 IBP_3_7_IV4(19, "3.7", "IV4", false), +// New version for the Kafka 3.8.0 release. +IBP_3_8_IV0(20, "3.8", "IV0", false), + +// +// NOTE: MetadataVersions after this point are unstable and may be changed. +// If users attempt to use an unstable MetadataVersion, they will get an error. +// Please move this comment when updating the LATEST_PRODUCTION constant. +// + // Add ELR related supports (KIP-966). -IBP_3_8_IV0(20, "3.8", "IV0", true), +IBP_3_9_IV0(21, "3.9", "IV0", true), Review Comment: Could we add a separate MV for ListOffset? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16968: Make 3.8-IV0 a stable MetadataVersion and create 3.9-IV0 [kafka]
cmccabe commented on code in PR #16347: URL: https://github.com/apache/kafka/pull/16347#discussion_r1644931292 ## clients/src/main/resources/common/message/ListOffsetsRequest.json: ## @@ -34,9 +34,12 @@ // Version 7 enables listing offsets by max timestamp (KIP-734). // // Version 8 enables listing offsets by local log start offset (KIP-405). - "validVersions": "0-8", + // + // Version 9 enables listing offsets by last tiered offset (KIP-1005). + "validVersions": "0-9", "deprecatedVersions": "0", "flexibleVersions": "6+", + "latestVersionUnstable": true, Review Comment: @junrao : right, I'll have a separate PR for 3.8 that will be a bit different (and include the revert) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16000 Migrated MembershipManagerImplTest away from ConsumerTestBuilder [kafka]
lianetm commented on code in PR #16312: URL: https://github.com/apache/kafka/pull/16312#discussion_r1644919762 ## clients/src/test/java/org/apache/kafka/clients/consumer/internals/MembershipManagerImplTest.java: ## @@ -220,6 +233,7 @@ public void testTransitionToReconcilingIfEmptyAssignmentReceived() { @Test public void testMemberIdAndEpochResetOnFencedMembers() { +createCommitRequestManager(false); Review Comment: we shouldn't need an actual `commitRequestManager` here, where we don't need anything specific related to commits and a mock should do. I guess you could remove it and just return a completed future for the autoCommit before revocation : `when(commitRequestManager.maybeAutoCommitSyncBeforeRevocation(anyLong())).thenReturn(CompletableFuture.completedFuture(null));` Probably doing it once inside the `createMemberInStableState` will allow you to remove most of the createCommitRequestManager calls -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] KAFKA-16968: Make 3.8-IV0 a stable MetadataVersion and create 3.9-IV0 [kafka]
junrao commented on code in PR #16347:
URL: https://github.com/apache/kafka/pull/16347#discussion_r1644855151

## server-common/src/main/java/org/apache/kafka/server/common/MetadataVersion.java ##
@@ -331,7 +340,7 @@ public boolean isDirectoryAssignmentSupported() {
 }
 public boolean isElrSupported() {
-    return this.isAtLeast(IBP_3_8_IV0);
+    return this.isAtLeast(IBP_3_9_IV0);

Review Comment: Since 3.9 is a short release, will ELR be supported in 3.9?

## core/src/test/scala/unit/kafka/server/ApiVersionsRequestTest.scala ##
@@ -47,7 +47,7 @@ object ApiVersionsRequestTest {
 List(ClusterConfig.defaultBuilder()
   .setTypes(java.util.Collections.singleton(Type.ZK))
   .setServerProperties(serverProperties)
-  .setMetadataVersion(MetadataVersion.IBP_4_0_IV0)
+  .setMetadataVersion(MetadataVersion.IBP_3_9_IV0)

Review Comment: Could we add some comments on which MV this test should set? Is it the first unstable MV? Could we add something like MetadataVersion.earliestTesting() to avoid having to keep changing it?

## clients/src/main/resources/common/message/ListOffsetsRequest.json ##
@@ -34,9 +34,12 @@
   // Version 7 enables listing offsets by max timestamp (KIP-734).
   //
   // Version 8 enables listing offsets by local log start offset (KIP-405).
-  "validVersions": "0-8",
+  //
+  // Version 9 enables listing offsets by last tiered offset (KIP-1005).
+  "validVersions": "0-9",
   "deprecatedVersions": "0",
   "flexibleVersions": "6+",
+  "latestVersionUnstable": true,

Review Comment: @clolov: Is this change ok with you for 3.9? @cmccabe: I guess this part won't be included in 3.8?

## server-common/src/test/java/org/apache/kafka/server/common/MetadataVersionTest.java ##
@@ -184,8 +184,11 @@ public void testFromVersionString() {
 assertEquals(IBP_3_7_IV3, MetadataVersion.fromVersionString("3.7-IV3"));
 assertEquals(IBP_3_7_IV4, MetadataVersion.fromVersionString("3.7-IV4"));
+assertEquals(IBP_3_8_IV0, MetadataVersion.fromVersionString("3.8"));

Review Comment: Should we change the following comment accordingly? `// 3.7-IV4 is the latest production version in the 3.7 line`

## metadata/src/test/java/org/apache/kafka/metadata/PartitionRegistrationTest.java ##
@@ -371,7 +371,7 @@ public void testPartitionRegistrationToRecord_ElrShouldBeNullIfEmpty() {
 setPartitionEpoch(0);
 List exceptions = new ArrayList<>();
 ImageWriterOptions options = new ImageWriterOptions.Builder().
-    setMetadataVersion(MetadataVersion.IBP_3_8_IV0).
+    setMetadataVersion(MetadataVersion.IBP_3_9_IV0).

Review Comment: We have the following code in this class. Should we change IBP_3_8_IV0 to IBP_3_9_IV0?
```
private static Stream<Arguments> metadataVersionsForTestPartitionRegistration() {
    return Stream.of(
        MetadataVersion.IBP_3_7_IV1,
        MetadataVersion.IBP_3_7_IV2,
        MetadataVersion.IBP_3_8_IV0
    ).map(mv -> Arguments.of(mv));
}
```

## core/src/test/scala/integration/kafka/zk/ZkMigrationIntegrationTest.scala ##
@@ -73,7 +73,7 @@ object ZkMigrationIntegrationTest {
 MetadataVersion.IBP_3_7_IV2,
 MetadataVersion.IBP_3_7_IV4,
 MetadataVersion.IBP_3_8_IV0,
-MetadataVersion.IBP_4_0_IV0
+MetadataVersion.IBP_3_9_IV0

Review Comment: (1) I guess we don't support ZK migration from 4.0. Could we add a comment? (2) We have the following code in this class. Should we bump IBP_3_8_IV0 to IBP_3_9_IV0? `@ClusterTest(types = Array(Type.ZK), brokers = 3, metadataVersion = MetadataVersion.IBP_3_8_IV0, serverProperties = Array(`

## tools/src/test/java/org/apache/kafka/tools/FeatureCommandTest.java ##
@@ -321,4 +321,4 @@ public void testHandleDisableDryRun() {
 "Can not disable metadata.version. Can't downgrade below 4%n" +
 "quux can be disabled."), disableOutput);
 }
-} \ No newline at end of file
+}

Review Comment: Is this change needed?

## server-common/src/test/java/org/apache/kafka/server/common/FeaturesTest.java ##
@@ -110,7 +110,7 @@ public void testLatestProductionMapsToLatestMetadataVersion(Features features) {
 @EnumSource(MetadataVersion.class)
 public void testDefaultTestVersion(MetadataVersion metadataVersion) {
 short expectedVersion;
-if (!metadataVersion.isLessThan(MetadataVersion.IBP_3_8_IV0)) {
+if (!metadataVersion.isLessThan(MetadataVersion.IBP_3_9_IV0)) {

Review Comment: Should we change 3.7-IV4 in the following code to IBP_3_8_IV0? `@ClusterTest(types = {Type.KRAFT}, metadataVersion = MetadataVersion.IBP_3_7_IV4)` ` "SupportedMaxVersion: 4.0-IV0\tFinalizedVersionLevel: 3.7-IV4\t",
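A minimal sketch of what the `earliestTesting()`-style helper suggested in the review could look like, assuming a toy enum with an `unstable` flag. The real `MetadataVersion` enum in server-common is far larger and tracks stability differently, so everything below is illustrative only:

```java
// Toy MetadataVersion-style enum: versions marked unstable=true stand in for
// versions above LATEST_PRODUCTION. With helpers like these, tests would not
// need to hard-code IBP_3_9_IV0 / IBP_4_0_IV0 and be edited every release.
enum Mv {
    IBP_3_7_IV4(false),
    IBP_3_8_IV0(false),
    IBP_3_9_IV0(true),   // unstable in this sketch
    IBP_4_0_IV0(true);

    private final boolean unstable;
    Mv(boolean unstable) { this.unstable = unstable; }

    // last declared version not marked unstable
    static Mv latestProduction() {
        Mv latest = values()[0];
        for (Mv mv : values()) {
            if (!mv.unstable) latest = mv;
        }
        return latest;
    }

    // hypothetical earliestTesting(): first version still marked unstable
    static Mv earliestTesting() {
        for (Mv mv : values()) {
            if (mv.unstable) return mv;
        }
        throw new IllegalStateException("no unstable metadata versions defined");
    }
}

public class EarliestTestingDemo {
    public static void main(String[] args) {
        System.out.println(Mv.latestProduction());
        System.out.println(Mv.earliestTesting());
    }
}
```

With this toy ordering the demo prints `IBP_3_8_IV0` then `IBP_3_9_IV0`; a test that asks for `earliestTesting()` would keep working when the unstable boundary moves.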
Re: [PR] KAFKA-16788 - Fix resource leakage during connector start() failure [kafka]
gharris1727 commented on code in PR #16095:
URL: https://github.com/apache/kafka/pull/16095#discussion_r1644895138

## connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerConnector.java ##
@@ -307,7 +307,7 @@ void doShutdown() {
     + " as the connector has been scheduled for shutdown"), null);
 }
-if (state == State.STARTED)
+if (state == State.STARTED || state == State.FAILED)
     connector.stop();

Review Comment: I think that some of the tests in WorkerConnectorTest exercising failed connectors will need some verifyCleanShutdown assertions updated. A test which exercises exceptions in both start() and stop() would also be very helpful :)
Re: [PR] KAFKA-16788 - Fix resource leakage during connector start() failure [kafka]
ashoke-cube commented on code in PR #16095:
URL: https://github.com/apache/kafka/pull/16095#discussion_r1644889388

## connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerConnector.java ##
@@ -307,7 +307,7 @@ void doShutdown() {
     + " as the connector has been scheduled for shutdown"), null);
 }
-if (state == State.STARTED)
+if (state == State.STARTED || state == State.FAILED)
     connector.stop();

Review Comment: Yes, will do. Also, does this need any separate test to be written?
Re: [PR] KAFKA-16862: Refactor ConsumerTaskTest to be deterministic and avoid tight loops [kafka]
gharris1727 commented on PR #16303:
URL: https://github.com/apache/kafka/pull/16303#issuecomment-2176717352

Thanks @xiaoqingwanga, this looks good. Can you fix up the build errors? I think it just requires a whitespace change.
Re: [PR] KAFKA-16788 - Fix resource leakage during connector start() failure [kafka]
gharris1727 commented on code in PR #16095:
URL: https://github.com/apache/kafka/pull/16095#discussion_r1644883959

## connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerConnector.java ##
@@ -307,7 +307,7 @@ void doShutdown() {
     + " as the connector has been scheduled for shutdown"), null);
 }
-if (state == State.STARTED)
+if (state == State.STARTED || state == State.FAILED)
     connector.stop();

Review Comment: connector.stop() can fail, and if that happens we don't transition to STOPPED; we re-transition to FAILED, but with a different error. I think this will shadow the exception that caused the connector to fail in the first place, which is almost certainly more confusing than helpful. In the catch clause, can you swap out the state/statusListener for onFailure? It has failure-deduplication logic that seems useful here.
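The shadowing concern raised above can be illustrated with a small stand-alone sketch. `recordShutdownFailure` is a hypothetical helper, not the actual `WorkerConnector` logic (which routes this through `onFailure` and its deduplication); it just shows how keeping the first failure as primary and attaching the later `stop()` error as a suppressed exception preserves both stack traces:

```java
// If stop() throws while the connector already FAILED during start(), reporting
// only the stop() error would hide the original cause. Keep the first failure
// as the primary error and attach the later one as a suppressed exception.
public class ShutdownFailureDemo {
    static Throwable recordShutdownFailure(Throwable originalFailure, Throwable stopFailure) {
        if (originalFailure == null) {
            return stopFailure;  // nothing to shadow; report the stop() error directly
        }
        originalFailure.addSuppressed(stopFailure);  // keep both stack traces
        return originalFailure;
    }

    public static void main(String[] args) {
        Throwable startError = new RuntimeException("start() failed");
        Throwable stopError = new RuntimeException("stop() failed");

        Throwable reported = recordShutdownFailure(startError, stopError);
        System.out.println(reported.getMessage());
        System.out.println(reported.getSuppressed()[0].getMessage());
    }
}
```

The primary message stays `start() failed` while `stop() failed` is still recoverable via `getSuppressed()`, so the root cause of the failure is not lost.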
[jira] [Commented] (KAFKA-16986) After upgrading to Kafka 3.4.1, the producer constantly produces logs related to topicId changes
[ https://issues.apache.org/jira/browse/KAFKA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856022#comment-17856022 ] Justine Olshan commented on KAFKA-16986: Thanks for clarifying. I will take a look. > After upgrading to Kafka 3.4.1, the producer constantly produces logs related > to topicId changes > > > Key: KAFKA-16986 > URL: https://issues.apache.org/jira/browse/KAFKA-16986 > Project: Kafka > Issue Type: Bug > Components: clients, producer >Affects Versions: 3.0.1, 3.6.1 >Reporter: Vinicius Vieira dos Santos >Priority: Minor > Attachments: image.png > > > When updating the Kafka broker from version 2.7.0 to 3.4.1, we noticed that > the applications began to log the message "{*}Resetting the last seen epoch > of partition PAYMENTS-0 to 0 since the associated topicId changed from null > to szRLmiAiTs8Y0nI8b3Wz1Q{*}" in a very constant, from what I understand this > behavior is not expected because the topic was not deleted and recreated so > it should simply use the cached data and not go through this client log line. > We have some applications with around 15 topics and 40 partitions which means > around 600 log lines when metadata updates occur > The main thing for me is to know if this could indicate a problem or if I can > simply change the log level of the org.apache.kafka.clients.Metadata class to > warn without worries > > There are other reports of the same behavior like this: > [https://stackoverflow.com/questions/74652231/apache-kafka-resetting-the-last-seen-epoch-of-partition-why] > > *Some log occurrences over an interval of about 7 hours, each block refers to > an instance of the application in kubernetes* > > !image.png! 
> *My scenario:* > *Application:* > - Java: 21 > - Client: 3.6.1, also tested on 3.0.1 and has the same behavior > *Broker:* > - Cluster running on Kubernetes with the bitnami/kafka:3.4.1-debian-11-r52 > image > > *Producer Config* > > acks = -1 > auto.include.jmx.reporter = true > batch.size = 16384 > bootstrap.servers = [server:9092] > buffer.memory = 33554432 > client.dns.lookup = use_all_dns_ips > client.id = producer-1 > compression.type = gzip > connections.max.idle.ms = 54 > delivery.timeout.ms = 3 > enable.idempotence = true > interceptor.classes = [] > key.serializer = class > org.apache.kafka.common.serialization.ByteArraySerializer > linger.ms = 0 > max.block.ms = 6 > max.in.flight.requests.per.connection = 1 > max.request.size = 1048576 > metadata.max.age.ms = 30 > metadata.max.idle.ms = 30 > metric.reporters = [] > metrics.num.samples = 2 > metrics.recording.level = INFO > metrics.sample.window.ms = 3 > partitioner.adaptive.partitioning.enable = true > partitioner.availability.timeout.ms = 0 > partitioner.class = null > partitioner.ignore.keys = false > receive.buffer.bytes = 32768 > reconnect.backoff.max.ms = 1000 > reconnect.backoff.ms = 50 > request.timeout.ms = 3 > retries = 3 > retry.backoff.ms = 100 > sasl.client.callback.handler.class = null > sasl.jaas.config = [hidden] > sasl.kerberos.kinit.cmd = /usr/bin/kinit > sasl.kerberos.min.time.before.relogin = 6 > sasl.kerberos.service.name = null > sasl.kerberos.ticket.renew.jitter = 0.05 > sasl.kerberos.ticket.renew.window.factor = 0.8 > sasl.login.callback.handler.class = null > sasl.login.class = null > sasl.login.connect.timeout.ms = null > sasl.login.read.timeout.ms = null > sasl.login.refresh.buffer.seconds = 300 > sasl.login.refresh.min.period.seconds = 60 > sasl.login.refresh.window.factor = 0.8 > sasl.login.refresh.window.jitter = 0.05 > sasl.login.retry.backoff.max.ms = 1 > sasl.login.retry.backoff.ms = 100 > sasl.mechanism = PLAIN > sasl.oauthbearer.clock.skew.seconds = 30 > 
sasl.oauthbearer.expected.audience = null > sasl.oauthbearer.expected.issuer = null > sasl.oauthbearer.jwks.endpoint.refresh.ms = 360 > sasl.oauthbearer.jwks.endpoint.retry.backoff.max.ms = 1 > sasl.oauthbearer.jwks.endpoint.retry.backoff.ms = 100 > sasl.oauthbearer.jwks.endpoint.url = null > sasl.oauthbearer.scope.claim.name = scope > sasl.oauthbearer.sub.claim.name = sub > sasl.oauthbearer.token.endpoint.url = null > security.protocol = SASL_PLAINTEXT > security.providers = null > send.buffer.bytes = 131072 > socket.connection.setup.timeout.max.ms = 3 > socket.connection.setup.timeout.ms = 1 > ssl.cipher.suites = null > ssl.enabled.protocols = [TLSv1.2, TLSv1.3] >
[jira] [Commented] (KAFKA-16990) Unrecognised flag passed to kafka-storage.sh in system test
[ https://issues.apache.org/jira/browse/KAFKA-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856020#comment-17856020 ] Justine Olshan commented on KAFKA-16990: Likely we are trying to set the new group coordinator with an old storage tool that doesn't recognize it. I'm wondering why this is set this way considering new group coordinator is not available for the original "old" version. > Unrecognised flag passed to kafka-storage.sh in system test > --- > > Key: KAFKA-16990 > URL: https://issues.apache.org/jira/browse/KAFKA-16990 > Project: Kafka > Issue Type: Test >Reporter: Gaurav Narula >Assignee: Gaurav Narula >Priority: Major > > Running > {{TC_PATHS="tests/kafkatest/tests/core/kraft_upgrade_test.py::TestKRaftUpgrade" > bash tests/docker/run_tests.sh}} on trunk (c4a3d2475f) fails with the > following: > {code:java} > [INFO:2024-06-18 09:16:03,139]: Triggering test 2 of 32... > [INFO:2024-06-18 09:16:03,147]: RunnerClient: Loading test {'directory': > '/opt/kafka-dev/tests/kafkatest/tests/core', 'file_name': > 'kraft_upgrade_test.py', 'cls_name': 'TestKRaftUpgrade', 'method_name': > 'test_isolated_mode_upgrade', 'injected_args': {'from_kafka_version': > '3.1.2', 'use_new_coordinator': True, 'metadata_quorum': 'ISOLATED_KRAFT'}} > [INFO:2024-06-18 09:16:03,151]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > on run 1/1 > [INFO:2024-06-18 09:16:03,153]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > Setting up... > [INFO:2024-06-18 09:16:03,153]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > Running... 
> [INFO:2024-06-18 09:16:05,999]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > Tearing down... > [INFO:2024-06-18 09:16:12,366]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > FAIL: RemoteCommandError({'ssh_config': {'host': 'ducker10', 'hostname': > 'ducker10', 'user': 'ducker', 'port': 22, 'password': '', 'identityfile': > '/home/ducker/.ssh/id_rsa', 'connecttimeout': None}, 'hostname': 'ducker10', > 'ssh_hostname': 'ducker10', 'user': 'ducker', 'externally_routable_ip': > 'ducker10', '_logger': kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT-2 > (DEBUG)>, 'os': 'linux', '_ssh_client': 0x85bccc70>, '_sftp_client': 0x85bccdf0>, '_custom_ssh_exception_checks': None}, > '/opt/kafka-3.1.2/bin/kafka-storage.sh format --ignore-formatted --config > /mnt/kafka/kafka.properties --cluster-id I2eXt9rvSnyhct8BYmW6-w -f > group.version=1', 1, b"usage: kafka-storage format [-h] --config CONFIG > --cluster-id CLUSTER_ID\n > [--ignore-formatted]\nkafka-storage: error: unrecognized arguments: '-f'\n") > Traceback (most recent call last): > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 186, in _do_run > data = self.run_test() > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 246, in run_test > return self.test_context.function(self.test) > File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line > 433, in wrapper > return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs) > File "/opt/kafka-dev/tests/kafkatest/tests/core/kraft_upgrade_test.py", > line 132, in test_isolated_mode_upgrade > 
self.run_upgrade(from_kafka_version, group_protocol) > File "/opt/kafka-dev/tests/kafkatest/tests/core/kraft_upgrade_test.py", > line 96, in run_upgrade > self.kafka.start() > File "/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 669, in > start > self.isolated_controller_quorum.start() > File "/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 671, in > start > Service.start(self, **kwargs) > File "/usr/local/lib/python3.9/dist-packages/ducktape/services/service.py", > line 265, in start > self.start_node(node, **kwargs) > File "/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 902, in > start_node > node.account.ssh(cmd) > File >
[jira] [Commented] (KAFKA-16990) Unrecognised flag passed to kafka-storage.sh in system test
[ https://issues.apache.org/jira/browse/KAFKA-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856016#comment-17856016 ] Justine Olshan commented on KAFKA-16990: I can take a look. cc: [~dajac] > Unrecognised flag passed to kafka-storage.sh in system test > --- > > Key: KAFKA-16990 > URL: https://issues.apache.org/jira/browse/KAFKA-16990 > Project: Kafka > Issue Type: Test >Reporter: Gaurav Narula >Assignee: Gaurav Narula >Priority: Major > > Running > {{TC_PATHS="tests/kafkatest/tests/core/kraft_upgrade_test.py::TestKRaftUpgrade" > bash tests/docker/run_tests.sh}} on trunk (c4a3d2475f) fails with the > following: > {code:java} > [INFO:2024-06-18 09:16:03,139]: Triggering test 2 of 32... > [INFO:2024-06-18 09:16:03,147]: RunnerClient: Loading test {'directory': > '/opt/kafka-dev/tests/kafkatest/tests/core', 'file_name': > 'kraft_upgrade_test.py', 'cls_name': 'TestKRaftUpgrade', 'method_name': > 'test_isolated_mode_upgrade', 'injected_args': {'from_kafka_version': > '3.1.2', 'use_new_coordinator': True, 'metadata_quorum': 'ISOLATED_KRAFT'}} > [INFO:2024-06-18 09:16:03,151]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > on run 1/1 > [INFO:2024-06-18 09:16:03,153]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > Setting up... > [INFO:2024-06-18 09:16:03,153]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > Running... 
> [INFO:2024-06-18 09:16:05,999]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > Tearing down... > [INFO:2024-06-18 09:16:12,366]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > FAIL: RemoteCommandError({'ssh_config': {'host': 'ducker10', 'hostname': > 'ducker10', 'user': 'ducker', 'port': 22, 'password': '', 'identityfile': > '/home/ducker/.ssh/id_rsa', 'connecttimeout': None}, 'hostname': 'ducker10', > 'ssh_hostname': 'ducker10', 'user': 'ducker', 'externally_routable_ip': > 'ducker10', '_logger': kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT-2 > (DEBUG)>, 'os': 'linux', '_ssh_client': 0x85bccc70>, '_sftp_client': 0x85bccdf0>, '_custom_ssh_exception_checks': None}, > '/opt/kafka-3.1.2/bin/kafka-storage.sh format --ignore-formatted --config > /mnt/kafka/kafka.properties --cluster-id I2eXt9rvSnyhct8BYmW6-w -f > group.version=1', 1, b"usage: kafka-storage format [-h] --config CONFIG > --cluster-id CLUSTER_ID\n > [--ignore-formatted]\nkafka-storage: error: unrecognized arguments: '-f'\n") > Traceback (most recent call last): > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 186, in _do_run > data = self.run_test() > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 246, in run_test > return self.test_context.function(self.test) > File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line > 433, in wrapper > return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs) > File "/opt/kafka-dev/tests/kafkatest/tests/core/kraft_upgrade_test.py", > line 132, in test_isolated_mode_upgrade > 
self.run_upgrade(from_kafka_version, group_protocol) > File "/opt/kafka-dev/tests/kafkatest/tests/core/kraft_upgrade_test.py", > line 96, in run_upgrade > self.kafka.start() > File "/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 669, in > start > self.isolated_controller_quorum.start() > File "/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 671, in > start > Service.start(self, **kwargs) > File "/usr/local/lib/python3.9/dist-packages/ducktape/services/service.py", > line 265, in start > self.start_node(node, **kwargs) > File "/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 902, in > start_node > node.account.ssh(cmd) > File > "/usr/local/lib/python3.9/dist-packages/ducktape/cluster/remoteaccount.py", > line 35, in wrapper > return method(self, *args, **kwargs) > File >
Re: [PR] KAFKA-16480: Bump ListOffsets version, IBP version and mark last version of ListOffsets as unstable [kafka]
cmccabe commented on PR #15673:
URL: https://github.com/apache/kafka/pull/15673#issuecomment-2176610110

I commented in the other PR that I think we should just revert KIP-1005 from 3.8 if it's not finished: https://github.com/apache/kafka/pull/16347#issuecomment-2176593301

We can implement it in 3.9-IV0. What do you think?
Re: [PR] KAFKA-16480: Bump ListOffsets version, IBP version and mark last version of ListOffsets as unstable [kafka]
cmccabe commented on code in PR #15673:
URL: https://github.com/apache/kafka/pull/15673#discussion_r1644812539

## server-common/src/main/java/org/apache/kafka/server/common/MetadataVersion.java ##
@@ -228,7 +231,7 @@ public enum MetadataVersion {
  * Think carefully before you update this value. ONCE A METADATA VERSION IS PRODUCTION,
  * IT CANNOT BE CHANGED.
  */
-public static final MetadataVersion LATEST_PRODUCTION = IBP_3_7_IV4;
+public static final MetadataVersion LATEST_PRODUCTION = IBP_3_8_IV1;

Review Comment: There is really no reason to leave unstable MVs unused. I think the issue with IBP_3_7_IV3 was a misunderstanding. The whole point of an unstable MV is you can do whatever you want (until it becomes stable).
Re: [PR] KAFKA-16968: Make 3.8-IV0 a stable MetadataVersion and create 3.9-IV0 [kafka]
cmccabe commented on PR #16347:
URL: https://github.com/apache/kafka/pull/16347#issuecomment-2176593301

@junrao @clolov @dajac @jlprat: It seems like the client side of the KIP-1005 changes is not implemented. So I would propose we revert #15213 from 3.8 since the KIP is only half-complete, and feature freeze has come and gone. This will unblock the 3.8.0 release.

We can then retarget KIP-1005 at 3.9.0 as usual. As part of that, we can bump the ListOffsets version and mark the latest version of ListOffsets as unstable. This will all be in 3.9-IV0.

I'll update this PR in a moment with those changes.
Re: [PR] KAFKA-16753: Implement share acknowledge API in partition (KIP-932) [kafka]
omkreddy merged PR #16339:
URL: https://github.com/apache/kafka/pull/16339
[jira] [Updated] (KAFKA-16994) Flaky Test SlidingWindowedKStreamIntegrationTest.shouldRestoreAfterJoinRestart
[ https://issues.apache.org/jira/browse/KAFKA-16994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-16994: Attachment: 5owo5xbyzjnao-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_CLOSE_cache_true]-1-output.txt 7jnraxqt7a52m-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_CLOSE_cache_false]-1-output.txt dujhqmgv6nzuu-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_UPDATE_cache_true]-1-output.txt fj6qia6oiob4m-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_UPDATE_cache_false]-1-output.txt > Flaky Test SlidingWindowedKStreamIntegrationTest.shouldRestoreAfterJoinRestart > -- > > Key: KAFKA-16994 > URL: https://issues.apache.org/jira/browse/KAFKA-16994 > Project: Kafka > Issue Type: Test >Reporter: Matthias J. Sax >Priority: Major > Attachments: > 5owo5xbyzjnao-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_CLOSE_cache_true]-1-output.txt, > > 7jnraxqt7a52m-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_CLOSE_cache_false]-1-output.txt, > > dujhqmgv6nzuu-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_UPDATE_cache_true]-1-output.txt, > > fj6qia6oiob4m-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_UPDATE_cache_false]-1-output.txt > > > Failed for all different parameters. 
> {code:java} > java.lang.AssertionError: Did not receive all 1 records from topic > output-shouldRestoreAfterJoinRestart_ON_WINDOW_CLOSE_cache_true_F_de0bULT5a8gQ_8lAhz8Q > within 6 msExpected: is a value equal to or greater than <1> but: > <0> was less than <1>at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueWithTimestampRecordsReceived$2(IntegrationTestUtils.java:778)at > > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)at > > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:412)at > > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueWithTimestampRecordsReceived(IntegrationTestUtils.java:774)at > > org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest.receiveMessagesWithTimestamp(SlidingWindowedKStreamIntegrationTest.java:479)at > > org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest.shouldRestoreAfterJoinRestart(SlidingWindowedKStreamIntegrationTest.java:404)at > jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at > jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)at > > jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)at > java.lang.reflect.Method.invoke(Method.java:568)at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)at > > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)at > > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)at > > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)at > > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)at > > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)at > 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)at > > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)at > java.util.concurrent.FutureTask.run(FutureTask.java:264)at > java.lang.Thread.run(Thread.java:833) {code} > {code:java} > java.lang.AssertionError: Did not receive all 2 records from topic > output-shouldRestoreAfterJoinRestart_ON_WINDOW_UPDATE_cache_true_bG_UnW1QSr_7tz2aXQNTXA > within 6 msExpected: is a value equal to or greater than <2> but: > <0> was less than <2>at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueWithTimestampRecordsReceived$2(IntegrationTestUtils.java:778)at > > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)at > >
[jira] [Created] (KAFKA-16994) Flaky Test SlidingWindowedKStreamIntegrationTest.shouldRestoreAfterJoinRestart
Matthias J. Sax created KAFKA-16994: --- Summary: Flaky Test SlidingWindowedKStreamIntegrationTest.shouldRestoreAfterJoinRestart Key: KAFKA-16994 URL: https://issues.apache.org/jira/browse/KAFKA-16994 Project: Kafka Issue Type: Test Reporter: Matthias J. Sax Attachments: 5owo5xbyzjnao-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_CLOSE_cache_true]-1-output.txt, 7jnraxqt7a52m-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_CLOSE_cache_false]-1-output.txt, dujhqmgv6nzuu-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_UPDATE_cache_true]-1-output.txt, fj6qia6oiob4m-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_UPDATE_cache_false]-1-output.txt Failed for all different parameters. {code:java} java.lang.AssertionError: Did not receive all 1 records from topic output-shouldRestoreAfterJoinRestart_ON_WINDOW_CLOSE_cache_true_F_de0bULT5a8gQ_8lAhz8Q within 6 msExpected: is a value equal to or greater than <1> but: <0> was less than <1>at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueWithTimestampRecordsReceived$2(IntegrationTestUtils.java:778)at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:412)at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueWithTimestampRecordsReceived(IntegrationTestUtils.java:774)at org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest.receiveMessagesWithTimestamp(SlidingWindowedKStreamIntegrationTest.java:479)at 
org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest.shouldRestoreAfterJoinRestart(SlidingWindowedKStreamIntegrationTest.java:404)at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)at java.lang.reflect.Method.invoke(Method.java:568)at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)at java.util.concurrent.FutureTask.run(FutureTask.java:264)at java.lang.Thread.run(Thread.java:833) {code} {code:java} java.lang.AssertionError: Did not receive all 2 records from topic output-shouldRestoreAfterJoinRestart_ON_WINDOW_UPDATE_cache_true_bG_UnW1QSr_7tz2aXQNTXA within 6 msExpected: is a value equal to or greater than <2> but: <0> was less than <2>at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueWithTimestampRecordsReceived$2(IntegrationTestUtils.java:778)at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:412)at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueWithTimestampRecordsReceived(IntegrationTestUtils.java:774)at org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest.receiveMessagesWithTimestamp(SlidingWindowedKStreamIntegrationTest.java:479)at org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest.shouldRestoreAfterJoinRestart(SlidingWindowedKStreamIntegrationTest.java:404)at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)at java.lang.reflect.Method.invoke(Method.java:580)at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)at
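For context, the failing assertion above comes out of a poll-until-timeout helper (TestUtils.retryOnExceptionWithTimeout, reached via IntegrationTestUtils.waitUntilMinKeyValueWithTimestampRecordsReceived). Below is a minimal, self-contained sketch of that pattern; the class and method names are hypothetical stand-ins, not Kafka's actual test utilities:

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of the poll-until-timeout pattern behind the failure:
// re-check a condition until it holds or the deadline passes, then let the
// caller raise the assertion error with the "Did not receive all N records" message.
public class WaitUntil {
    public static boolean waitForCondition(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            if (condition.getAsBoolean()) return true;                // condition met in time
            if (System.currentTimeMillis() >= deadline) return false; // timed out
            try {
                Thread.sleep(pollMs);                                 // back off before re-checking
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // A condition that becomes true after ~50 ms, well inside the 1 s budget.
        System.out.println(waitForCondition(() -> System.currentTimeMillis() - start >= 50, 1000, 10));
    }
}
```

A flaky test of this shape usually means the condition was genuinely not met within the budget (slow broker startup, restore not finished), so the first things to check are the timeout value and what the condition polls.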
[jira] [Comment Edited] (KAFKA-16986) After upgrading to Kafka 3.4.1, the producer constantly produces logs related to topicId changes
[ https://issues.apache.org/jira/browse/KAFKA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17855993#comment-17855993 ] Vinicius Vieira dos Santos edited comment on KAFKA-16986 at 6/18/24 5:02 PM: - [~jolshan] The logs in the previous comments were collected from a client version 3.6.1 and the broker 3.4.1, when you refer to 3.5.0 I believe it is from the client and therefore the one I am using is in a higher version Logs about client startup containing the version: 2024-06-17 08:23:50,376 [main] INFO o.a.k.clients.producer.KafkaProducer [] : [Producer clientId=producer-1] Instantiated an idempotent producer. 2024-06-17 08:23:50,387 [main] INFO o.a.kafka.common.utils.AppInfoParser [] : Kafka version: 3.6.1 2024-06-17 08:23:50,387 [main] INFO o.a.kafka.common.utils.AppInfoParser [] : Kafka commitId: 5e3c2b738d253ff5 2024-06-17 08:23:50,387 [main] INFO o.a.kafka.common.utils.AppInfoParser [] : Kafka startTimeMs: 1718623430387 I added the producer settings we are using in the description, thanks for the help was (Author: JIRAUSER305851): [~jolshan] The logs in the previous comments were collected from a client version 3.6.1 and the broker 3.4.1, when you refer to 3.5.0 I believe it is from the client and therefore the one I am using is in a higher version Logs about client startup containing the version: 2024-06-17 08:23:50,376 [main] INFO o.a.k.clients.producer.KafkaProducer [] : [Producer clientId=producer-2] Instantiated an idempotent producer. 
2024-06-17 08:23:50,387 [main] INFO o.a.kafka.common.utils.AppInfoParser [] : Kafka version: 3.6.1 2024-06-17 08:23:50,387 [main] INFO o.a.kafka.common.utils.AppInfoParser [] : Kafka commitId: 5e3c2b738d253ff5 2024-06-17 08:23:50,387 [main] INFO o.a.kafka.common.utils.AppInfoParser [] : Kafka startTimeMs: 1718623430387 I added the producer settings we are using in the description, thanks for the help > After upgrading to Kafka 3.4.1, the producer constantly produces logs related > to topicId changes > > > Key: KAFKA-16986 > URL: https://issues.apache.org/jira/browse/KAFKA-16986 > Project: Kafka > Issue Type: Bug > Components: clients, producer >Affects Versions: 3.0.1, 3.6.1 >Reporter: Vinicius Vieira dos Santos >Priority: Minor > Attachments: image.png > > > When updating the Kafka broker from version 2.7.0 to 3.4.1, we noticed that > the applications began to log the message "{*}Resetting the last seen epoch > of partition PAYMENTS-0 to 0 since the associated topicId changed from null > to szRLmiAiTs8Y0nI8b3Wz1Q{*}" in a very constant way; from what I understand, this > behavior is not expected because the topic was not deleted and recreated, so > it should simply use the cached data and not go through this client log line. > We have some applications with around 15 topics and 40 partitions, which means > around 600 log lines when metadata updates occur. > The main thing for me is to know if this could indicate a problem or if I can > simply change the log level of the org.apache.kafka.clients.Metadata class to > warn without worries > > There are other reports of the same behavior like this: > [https://stackoverflow.com/questions/74652231/apache-kafka-resetting-the-last-seen-epoch-of-partition-why] > > *Some log occurrences over an interval of about 7 hours, each block refers to > an instance of the application in Kubernetes* > > !image.png! 
> *My scenario:* > *Application:* > - Java: 21 > - Client: 3.6.1, also tested on 3.0.1 and has the same behavior > *Broker:* > - Cluster running on Kubernetes with the bitnami/kafka:3.4.1-debian-11-r52 > image > > *Producer Config* > > acks = -1 > auto.include.jmx.reporter = true > batch.size = 16384 > bootstrap.servers = [server:9092] > buffer.memory = 33554432 > client.dns.lookup = use_all_dns_ips > client.id = producer-1 > compression.type = gzip > connections.max.idle.ms = 54 > delivery.timeout.ms = 3 > enable.idempotence = true > interceptor.classes = [] > key.serializer = class > org.apache.kafka.common.serialization.ByteArraySerializer > linger.ms = 0 > max.block.ms = 6 > max.in.flight.requests.per.connection = 1 > max.request.size = 1048576 > metadata.max.age.ms = 30 > metadata.max.idle.ms = 30 > metric.reporters = [] > metrics.num.samples = 2 > metrics.recording.level = INFO > metrics.sample.window.ms = 3 > partitioner.adaptive.partitioning.enable = true > partitioner.availability.timeout.ms = 0 > partitioner.class = null > partitioner.ignore.keys = false > receive.buffer.bytes = 32768 >
[jira] [Created] (KAFKA-16993) Flaky test RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener.shouldInvokeUserDefinedGlobalStateRestoreListener()
Matthias J. Sax created KAFKA-16993: --- Summary: Flaky test RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener.shouldInvokeUserDefinedGlobalStateRestoreListener() Key: KAFKA-16993 URL: https://issues.apache.org/jira/browse/KAFKA-16993 Project: Kafka Issue Type: Test Components: streams, unit tests Reporter: Matthias J. Sax Attachments: 6u4a4e27e2oh2-org.apache.kafka.streams.integration.RestoreIntegrationTest-shouldInvokeUserDefinedGlobalStateRestoreListener()-1-output.txt {code:java} org.opentest4j.AssertionFailedError: expected: but was: at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)at org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:183)at org.apache.kafka.streams.integration.RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener(RestoreIntegrationTest.java:611)at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)at java.lang.reflect.Method.invoke(Method.java:498)at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:728)at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)at org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:218)at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:214)at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:139)at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:69)at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)at java.util.ArrayList.forEach(ArrayList.java:1259)at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)at
[jira] [Commented] (KAFKA-16887) document busy metrics value when remoteLogManager throttling
[ https://issues.apache.org/jira/browse/KAFKA-16887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856001#comment-17856001 ] Ksolves commented on KAFKA-16887: - Checked one of the attached PRs ([PR#16086|https://github.com/apache/kafka/pull/16086]) in the discussion, and it seems like the below 2 configs need to be documented - * remote-copy-throttle-time-avg (The average time in millis remote copies was throttled by a broker) * remote-copy-throttle-time-max (The max time in millis remote copies was throttled by a broker) Can you confirm that we have to add these metric values to the documentation? Will create the PR accordingly. > document busy metrics value when remoteLogManager throttling > > > Key: KAFKA-16887 > URL: https://issues.apache.org/jira/browse/KAFKA-16887 > Project: Kafka > Issue Type: Sub-task >Reporter: Luke Chen >Priority: Major > > Context: https://github.com/apache/kafka/pull/15820#discussion_r1625304008 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-16986) After upgrading to Kafka 3.4.1, the producer constantly produces logs related to topicId changes
[ https://issues.apache.org/jira/browse/KAFKA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17855993#comment-17855993 ] Vinicius Vieira dos Santos edited comment on KAFKA-16986 at 6/18/24 5:02 PM: - [~jolshan] The logs in the previous comments were collected from a client version 3.6.1 and the broker 3.4.1, when you refer to 3.5.0 I believe it is from the client and therefore the one I am using is in a higher version Logs about client startup containing the version: 2024-06-17 08:23:50,376 [main] INFO o.a.k.clients.producer.KafkaProducer [] : [Producer clientId=producer-2] Instantiated an idempotent producer. 2024-06-17 08:23:50,387 [main] INFO o.a.kafka.common.utils.AppInfoParser [] : Kafka version: 3.6.1 2024-06-17 08:23:50,387 [main] INFO o.a.kafka.common.utils.AppInfoParser [] : Kafka commitId: 5e3c2b738d253ff5 2024-06-17 08:23:50,387 [main] INFO o.a.kafka.common.utils.AppInfoParser [] : Kafka startTimeMs: 1718623430387 I added the producer settings we are using in the description, thanks for the help was (Author: JIRAUSER305851): [~jolshan] The logs in the previous comments were collected from a client version 3.6.1 and the broker 3.4.1, when you refer to 3.5.0 I believe it is from the client and therefore the one I am using is in a higher version I added the producer settings we are using in the description, thanks for the help > After upgrading to Kafka 3.4.1, the producer constantly produces logs related > to topicId changes > > > Key: KAFKA-16986 > URL: https://issues.apache.org/jira/browse/KAFKA-16986 > Project: Kafka > Issue Type: Bug > Components: clients, producer >Affects Versions: 3.0.1, 3.6.1 >Reporter: Vinicius Vieira dos Santos >Priority: Minor > Attachments: image.png > > > When updating the Kafka broker from version 2.7.0 to 3.4.1, we noticed that > the applications began to log the message "{*}Resetting the last seen epoch > of partition PAYMENTS-0 to 0 since the associated topicId changed from null > to 
szRLmiAiTs8Y0nI8b3Wz1Q{*}" in a very constant, from what I understand this > behavior is not expected because the topic was not deleted and recreated so > it should simply use the cached data and not go through this client log line. > We have some applications with around 15 topics and 40 partitions which means > around 600 log lines when metadata updates occur > The main thing for me is to know if this could indicate a problem or if I can > simply change the log level of the org.apache.kafka.clients.Metadata class to > warn without worries > > There are other reports of the same behavior like this: > [https://stackoverflow.com/questions/74652231/apache-kafka-resetting-the-last-seen-epoch-of-partition-why] > > *Some log occurrences over an interval of about 7 hours, each block refers to > an instance of the application in kubernetes* > > !image.png! > *My scenario:* > *Application:* > - Java: 21 > - Client: 3.6.1, also tested on 3.0.1 and has the same behavior > *Broker:* > - Cluster running on Kubernetes with the bitnami/kafka:3.4.1-debian-11-r52 > image > > *Producer Config* > > acks = -1 > auto.include.jmx.reporter = true > batch.size = 16384 > bootstrap.servers = [server:9092] > buffer.memory = 33554432 > client.dns.lookup = use_all_dns_ips > client.id = producer-1 > compression.type = gzip > connections.max.idle.ms = 54 > delivery.timeout.ms = 3 > enable.idempotence = true > interceptor.classes = [] > key.serializer = class > org.apache.kafka.common.serialization.ByteArraySerializer > linger.ms = 0 > max.block.ms = 6 > max.in.flight.requests.per.connection = 1 > max.request.size = 1048576 > metadata.max.age.ms = 30 > metadata.max.idle.ms = 30 > metric.reporters = [] > metrics.num.samples = 2 > metrics.recording.level = INFO > metrics.sample.window.ms = 3 > partitioner.adaptive.partitioning.enable = true > partitioner.availability.timeout.ms = 0 > partitioner.class = null > partitioner.ignore.keys = false > receive.buffer.bytes = 32768 > 
reconnect.backoff.max.ms = 1000 > reconnect.backoff.ms = 50 > request.timeout.ms = 3 > retries = 3 > retry.backoff.ms = 100 > sasl.client.callback.handler.class = null > sasl.jaas.config = [hidden] > sasl.kerberos.kinit.cmd = /usr/bin/kinit > sasl.kerberos.min.time.before.relogin = 6 > sasl.kerberos.service.name = null > sasl.kerberos.ticket.renew.jitter = 0.05 > sasl.kerberos.ticket.renew.window.factor = 0.8 > sasl.login.callback.handler.class = null > sasl.login.class = null >
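For reference, the workaround the reporter asks about (raising only org.apache.kafka.clients.Metadata to WARN) is a one-line logging-config change. A sketch in log4j 1.x properties syntax; the logger name comes from the ticket, the rest is an assumed standard setup:

```properties
# Hypothetical log4j.properties fragment: raise just this one client class to
# WARN so the per-partition "Resetting the last seen epoch..." INFO lines are
# suppressed. Whether suppressing them is safe is the open question in the ticket.
log4j.logger.org.apache.kafka.clients.Metadata=WARN
```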
[jira] [Updated] (KAFKA-16993) Flaky test RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener.shouldInvokeUserDefinedGlobalStateRestoreListener()
[ https://issues.apache.org/jira/browse/KAFKA-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-16993: Attachment: 6u4a4e27e2oh2-org.apache.kafka.streams.integration.RestoreIntegrationTest-shouldInvokeUserDefinedGlobalStateRestoreListener()-1-output.txt > Flaky test > RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener.shouldInvokeUserDefinedGlobalStateRestoreListener() > --- > > Key: KAFKA-16993 > URL: https://issues.apache.org/jira/browse/KAFKA-16993 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Priority: Major > Attachments: > 6u4a4e27e2oh2-org.apache.kafka.streams.integration.RestoreIntegrationTest-shouldInvokeUserDefinedGlobalStateRestoreListener()-1-output.txt > > > {code:java} > org.opentest4j.AssertionFailedError: expected: but was: at > org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)at > > org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)at > org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)at > org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:183)at > org.apache.kafka.streams.integration.RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener(RestoreIntegrationTest.java:611)at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)at > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)at > java.lang.reflect.Method.invoke(Method.java:498)at > org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:728)at > > org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)at > > 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)at > > org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)at > > org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)at > > org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)at > > org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)at > > org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)at > > org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)at > > org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)at > > org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)at > > org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)at > > org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)at > > org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)at > > org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)at > > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:218)at > > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)at > > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:214)at > > 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:139)at > > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:69)at > > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)at > > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)at > > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)at > org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)at > >
Re: [PR] KAFKA-16480: Bump ListOffsets version, IBP version and mark last version of ListOffsets as unstable [kafka]
junrao commented on PR #15673: URL: https://github.com/apache/kafka/pull/15673#issuecomment-2176575249 @cmccabe and @jlprat : Here is a summary of the issue that this PR is trying to fix. [KIP-1005](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1005%3A+Expose+EarliestLocalOffset+and+TieredOffset) proposes to (1) add support for the latest-tiered timestamp (-5) in the ListOffsets request and (2) expose the earliest-local (-4) and latest-tiered (-5) timestamps in AdminClient and the GetOffsetShell tool. (1) was implemented in the 3.8 branch, but incorrectly, since it didn't bump the request version (see https://issues.apache.org/jira/browse/KAFKA-16480). (2) hasn't been implemented yet. > Does this add a new feature to 3.8 or not? Some comments seem to indicate that it does, but there was also a comment that this was just "fixing a mistake". For this PR, we just want to fix (1). > Is the last version of ListOffsets unstable or not? The title of the PR indicates that it should be, but the PR itself doesn't set "latestVersionUnstable": true in ListOffsetsRequest. The latest PR makes the latest version of ListOffsets stable. So, we need to change the description of this PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
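For context, ListOffsets distinguishes these lookup targets by sentinel timestamp values rather than real epoch timestamps. The sketch below is an illustrative stand-in mirroring the constants discussed around KIP-1005, not the actual org.apache.kafka.common.requests.ListOffsetsRequest class:

```java
// Illustrative stand-in for the ListOffsets sentinel timestamps; the real
// constants live in org.apache.kafka.common.requests.ListOffsetsRequest.
public class ListOffsetsSentinels {
    public static final long LATEST_TIMESTAMP = -1L;          // latest offset
    public static final long EARLIEST_TIMESTAMP = -2L;        // earliest offset
    public static final long MAX_TIMESTAMP = -3L;             // offset of the record with max timestamp
    public static final long EARLIEST_LOCAL_TIMESTAMP = -4L;  // earliest offset still on local disk
    public static final long LATEST_TIERED_TIMESTAMP = -5L;   // latest offset in tiered storage (KIP-1005)

    /** True for the sentinel values; real lookup timestamps are non-negative epoch millis. */
    public static boolean isSentinel(long timestamp) {
        return timestamp >= -5L && timestamp <= -1L;
    }

    public static void main(String[] args) {
        System.out.println(isSentinel(LATEST_TIERED_TIMESTAMP)); // true
        System.out.println(isSentinel(1718623430387L));          // false
    }
}
```

Because -5 is a new sentinel, a broker that accepts it on an old request version (without the version bump this PR adds) is exactly the compatibility mistake described in KAFKA-16480.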
[jira] [Resolved] (KAFKA-16992) Flaky Test org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2]
[ https://issues.apache.org/jira/browse/KAFKA-16992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16992. - Resolution: Duplicate > Flaky Test > org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2] > -- > > Key: KAFKA-16992 > URL: https://issues.apache.org/jira/browse/KAFKA-16992 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Priority: Major > Attachments: > 6u4a4e27e2oh2-org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest-shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2]-1-output.txt > > > We saw this test to timeout more frequently recently: > {code:java} > org.opentest4j.AssertionFailedError: Condition not met within timeout 15000. > Expected ERROR state but driver is on RUNNING ==> expected: but was: > at > org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)at > > org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)at > org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)at > org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214)at > org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:396)at > > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)at > org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:393)at > org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:377)at > org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:350)at > org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore(EOSUncleanShutdownIntegrationTest.java:169)at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at > 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)at > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)at > java.lang.reflect.Method.invoke(Method.java:498)at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)at > > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)at > > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)at > > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)at > > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)at > > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)at > java.util.concurrent.FutureTask.run(FutureTask.java:266)at > java.lang.Thread.run(Thread.java:750) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] MINOR: Don't swallow validateReconfiguration exceptions [kafka]
ahuang98 commented on code in PR #16346: URL: https://github.com/apache/kafka/pull/16346#discussion_r1644791643

## core/src/main/scala/kafka/server/DynamicBrokerConfig.scala:
@@ -640,8 +640,8 @@ class DynamicBrokerConfig(private val kafkaConfig: KafkaConfig) extends Logging
         reconfigurable.validateReconfiguration(newConfigs)
       } catch {
         case e: ConfigException => throw e
-        case _: Exception =>
-          throw new ConfigException(s"Validation of dynamic config update of $updatedConfigNames failed with class ${reconfigurable.getClass}")
+        case e: Exception =>
+          throw new ConfigException(s"Validation of dynamic config update of $updatedConfigNames failed with class ${reconfigurable.getClass} due to: $e")

Review Comment: @cmccabe @rajinisivaram what do you think about what I suggested in my earlier comment? https://github.com/apache/kafka/pull/16346#discussion_r1643521884 i.e. additionally catching IllegalStateExceptions or wrapping some error cases in ConfigException so the stacktrace is preserved properly?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
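The concern in this review, not losing the original stack trace when re-wrapping a validation failure, can be sketched as follows. ConfigException here is a local stand-in for org.apache.kafka.common.config.ConfigException, and the cause-preserving constructor is the illustrative point, not necessarily the final shape of the fix:

```java
// Sketch of preserving the original failure as the cause when re-wrapping,
// instead of swallowing it. ConfigException is a hypothetical stand-in.
public class WrapCause {
    static class ConfigException extends RuntimeException {
        ConfigException(String message, Throwable cause) { super(message, cause); }
    }

    static void validate(Runnable reconfigurable, String updatedConfigNames) {
        try {
            reconfigurable.run();                 // stand-in for validateReconfiguration(newConfigs)
        } catch (ConfigException e) {
            throw e;                              // already the right type, rethrow as-is
        } catch (Exception e) {
            // Keep the original exception as the cause so its stack trace survives.
            throw new ConfigException(
                "Validation of dynamic config update of " + updatedConfigNames + " failed", e);
        }
    }

    public static void main(String[] args) {
        try {
            validate(() -> { throw new IllegalStateException("boom"); }, "[log.retention.ms]");
        } catch (ConfigException e) {
            System.out.println(e.getCause().getMessage()); // the original "boom" survives
        }
    }
}
```

Appending `due to: $e` (as the diff does) keeps the message but drops the stack trace; passing the exception as the cause keeps both.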
[jira] [Updated] (KAFKA-16992) Flaky Test org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2]
[ https://issues.apache.org/jira/browse/KAFKA-16992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-16992: Attachment: 6u4a4e27e2oh2-org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest-shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2]-1-output.txt > Flaky Test > org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2] > -- > > Key: KAFKA-16992 > URL: https://issues.apache.org/jira/browse/KAFKA-16992 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Priority: Major > Attachments: > 6u4a4e27e2oh2-org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest-shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2]-1-output.txt > > > We saw this test to timeout more frequently recently: > {code:java} > org.opentest4j.AssertionFailedError: Condition not met within timeout 15000. 
> Expected ERROR state but driver is on RUNNING ==> expected: but was: > at > org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)at > > org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)at > org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)at > org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214)at > org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:396)at > > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)at > org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:393)at > org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:377)at > org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:350)at > org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore(EOSUncleanShutdownIntegrationTest.java:169)at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)at > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)at > java.lang.reflect.Method.invoke(Method.java:498)at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)at > > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)at > > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)at > > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)at > > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)at > > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)at > java.util.concurrent.FutureTask.run(FutureTask.java:266)at > java.lang.Thread.run(Thread.java:750) {code} -- This 
message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16992) Flaky Test org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2]
Matthias J. Sax created KAFKA-16992: --- Summary: Flaky Test org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2] Key: KAFKA-16992 URL: https://issues.apache.org/jira/browse/KAFKA-16992 Project: Kafka Issue Type: Test Components: streams, unit tests Reporter: Matthias J. Sax Attachments: 6u4a4e27e2oh2-org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest-shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2]-1-output.txt We saw this test to timeout more frequently recently: {code:java} org.opentest4j.AssertionFailedError: Condition not met within timeout 15000. Expected ERROR state but driver is on RUNNING ==> expected: but was: at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)at org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214)at org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:396)at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:393)at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:377)at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:350)at org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore(EOSUncleanShutdownIntegrationTest.java:169)at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)at java.lang.reflect.Method.invoke(Method.java:498)at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)at java.util.concurrent.FutureTask.run(FutureTask.java:266)at java.lang.Thread.run(Thread.java:750) {code}
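The waitForCondition idiom that appears throughout these traces — poll a condition until it holds or a timeout elapses, then fail with a descriptive message — can be sketched as follows. This is a simplified illustration, not the actual org.apache.kafka.test.TestUtils implementation; the class and method names are hypothetical:

```java
import java.util.function.BooleanSupplier;

public class WaitUtil {
    // Polls `condition` every `pollIntervalMs` until it returns true,
    // or fails with an AssertionError once `timeoutMs` has elapsed.
    public static void pollCondition(BooleanSupplier condition,
                                     long timeoutMs,
                                     long pollIntervalMs,
                                     String message) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return; // condition met within the timeout
            }
            Thread.sleep(pollIntervalMs);
        }
        // One final check avoids a spurious failure when the condition
        // became true just as the deadline expired.
        if (!condition.getAsBoolean()) {
            throw new AssertionError("Condition not met within timeout " + timeoutMs + ". " + message);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // A condition that becomes true after roughly 200 ms.
        pollCondition(() -> System.currentTimeMillis() - start > 200, 5000, 50, "never became true");
        System.out.println("condition met");
    }
}
```

A flaky failure of this kind means the condition never held within the timeout on that run — which can be a real bug or simply a timeout too tight for a loaded CI machine.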
[jira] [Updated] (KAFKA-16943) Synchronously verify Connect worker startup failure in InternalTopicsIntegrationTest
[ https://issues.apache.org/jira/browse/KAFKA-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ksolves updated KAFKA-16943: Attachment: code-diff.png > Synchronously verify Connect worker startup failure in > InternalTopicsIntegrationTest > > > Key: KAFKA-16943 > URL: https://issues.apache.org/jira/browse/KAFKA-16943 > Project: Kafka > Issue Type: Improvement > Components: connect >Reporter: Chris Egerton >Priority: Minor > Labels: newbie > Attachments: code-diff.png > > > Created after PR discussion > [here|https://github.com/apache/kafka/pull/16288#discussion_r1636615220]. > In some of our integration tests, we want to verify that a Connect worker > cannot start under poor conditions (such as when its internal topics do not > yet exist and it is configured to create them with a higher replication > factor than the number of available brokers, or when its internal topics > already exist but they do not have the compaction cleanup policy). > This is currently not possible, and presents a possible gap in testing > coverage, especially for the test cases > {{testFailToCreateInternalTopicsWithMoreReplicasThanBrokers}} and > {{{}testFailToStartWhenInternalTopicsAreNotCompacted{}}}. It'd be nice if we > could have some way of synchronously awaiting the completion or failure of > worker startup in our integration tests in order to guarantee that worker > startup fails under sufficiently adverse conditions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-16943) Synchronously verify Connect worker startup failure in InternalTopicsIntegrationTest
[ https://issues.apache.org/jira/browse/KAFKA-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17855997#comment-17855997 ] Ksolves edited comment on KAFKA-16943 at 6/18/24 4:53 PM: -- In our current test cases (`testFailToCreateInternalTopicsWithMoreReplicasThanBrokers` and `testFailToStartWhenInternalTopicsAreNotCompacted`), we attempt to verify that the Connect worker fails to start. However, our mechanism for verifying the startup failure lacks synchronous waiting and precise assertion. Example Test Case (Added screenshot to show the diff of newly added changes): {code:java} @Test public void testFailToCreateInternalTopicsWithMoreReplicasThanBrokers() throws InterruptedException { workerProps.put(DistributedConfig.CONFIG_STORAGE_REPLICATION_FACTOR_CONFIG, "3"); workerProps.put(DistributedConfig.OFFSET_STORAGE_REPLICATION_FACTOR_CONFIG, "2"); workerProps.put(DistributedConfig.STATUS_STORAGE_REPLICATION_FACTOR_CONFIG, "1"); int numWorkers = 0; int numBrokers = 1; connect = new EmbeddedConnectCluster.Builder().name("connect-cluster-1") .workerProps(workerProps) .numWorkers(numWorkers) .numBrokers(numBrokers) .brokerProps(brokerProps) .build(); // Start the brokers and Connect, but Connect should fail to create config and offset topic connect.start(); log.info("Completed startup of {} Kafka broker. 
Expected Connect worker to fail", numBrokers); // Try to start a worker connect.addWorker(); // Synchronously await and verify that the worker fails during startup boolean workerStarted = waitForWorkerStartupFailure(connect, 3); // 30 seconds timeout assertFalse(workerStarted, "Worker should not have started successfully"); log.info("Verifying the internal topics for Connect"); connect.assertions().assertTopicsDoNotExist(configTopic(), offsetTopic()); // Verify that no workers are running assertFalse(connect.anyWorkersRunning()); } private boolean waitForWorkerStartupFailure(EmbeddedConnectCluster connect, long timeoutMillis) throws InterruptedException { long startTime = System.currentTimeMillis(); while (System.currentTimeMillis() - startTime < timeoutMillis) { if (!connect.anyWorkersRunning()) { return false; } Thread.sleep(500); // wait for 500 milliseconds before checking again } return true; } {code} What changes do you suggest to improve this synchronous verification mechanism? I'll create the PR accordingly. was (Author: JIRAUSER305714): In our current test cases (`testFailToCreateInternalTopicsWithMoreReplicasThanBrokers` and `testFailToStartWhenInternalTopicsAreNotCompacted`), we attempt to verify that the Connect worker fails to start. However, our mechanism for verifying the startup failure lacks synchronous waiting and precise assertion. 
Example Test Case: [Changes in existing test case marked in {color:#00875a}*green*{color}] {code:java} @Test public void testFailToCreateInternalTopicsWithMoreReplicasThanBrokers() throws InterruptedException { workerProps.put(DistributedConfig.CONFIG_STORAGE_REPLICATION_FACTOR_CONFIG, "3"); workerProps.put(DistributedConfig.OFFSET_STORAGE_REPLICATION_FACTOR_CONFIG, "2"); workerProps.put(DistributedConfig.STATUS_STORAGE_REPLICATION_FACTOR_CONFIG, "1"); int numWorkers = 0; int numBrokers = 1; connect = new EmbeddedConnectCluster.Builder().name("connect-cluster-1") .workerProps(workerProps) .numWorkers(numWorkers) .numBrokers(numBrokers) .brokerProps(brokerProps) .build(); // Start the brokers and Connect, but Connect should fail to create config and offset topic connect.start(); log.info("Completed startup of {} Kafka broker. Expected Connect worker to fail", numBrokers); // Try to start a worker connect.addWorker(); // Synchronously await and verify that the worker fails during startup boolean workerStarted = waitForWorkerStartupFailure(connect, 3); // 30 seconds timeout assertFalse(workerStarted, "Worker should not have started successfully"); log.info("Verifying the internal topics for Connect"); connect.assertions().assertTopicsDoNotExist(configTopic(), offsetTopic()); // Verify that no workers are running assertFalse(connect.anyWorkersRunning()); } private boolean waitForWorkerStartupFailure(EmbeddedConnectCluster connect, long
[jira] [Commented] (KAFKA-16943) Synchronously verify Connect worker startup failure in InternalTopicsIntegrationTest
[ https://issues.apache.org/jira/browse/KAFKA-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17855997#comment-17855997 ] Ksolves commented on KAFKA-16943: - In our current test cases (`testFailToCreateInternalTopicsWithMoreReplicasThanBrokers` and `testFailToStartWhenInternalTopicsAreNotCompacted`), we attempt to verify that the Connect worker fails to start. However, our mechanism for verifying the startup failure lacks synchronous waiting and precise assertion. Example Test Case: [Changes in existing test case marked in {color:#00875a}*green*{color}] {code:java} @Test public void testFailToCreateInternalTopicsWithMoreReplicasThanBrokers() throws InterruptedException { workerProps.put(DistributedConfig.CONFIG_STORAGE_REPLICATION_FACTOR_CONFIG, "3"); workerProps.put(DistributedConfig.OFFSET_STORAGE_REPLICATION_FACTOR_CONFIG, "2"); workerProps.put(DistributedConfig.STATUS_STORAGE_REPLICATION_FACTOR_CONFIG, "1"); int numWorkers = 0; int numBrokers = 1; connect = new EmbeddedConnectCluster.Builder().name("connect-cluster-1") .workerProps(workerProps) .numWorkers(numWorkers) .numBrokers(numBrokers) .brokerProps(brokerProps) .build(); // Start the brokers and Connect, but Connect should fail to create config and offset topic connect.start(); log.info("Completed startup of {} Kafka broker. 
Expected Connect worker to fail", numBrokers); // Try to start a worker connect.addWorker(); // Synchronously await and verify that the worker fails during startup boolean workerStarted = waitForWorkerStartupFailure(connect, 3); // 30 seconds timeout assertFalse(workerStarted, "Worker should not have started successfully"); log.info("Verifying the internal topics for Connect"); connect.assertions().assertTopicsDoNotExist(configTopic(), offsetTopic()); // Verify that no workers are running assertFalse(connect.anyWorkersRunning()); } private boolean waitForWorkerStartupFailure(EmbeddedConnectCluster connect, long timeoutMillis) throws InterruptedException { long startTime = System.currentTimeMillis(); while (System.currentTimeMillis() - startTime < timeoutMillis) { if (!connect.anyWorkersRunning()) { return false; } Thread.sleep(500); // wait for 500 milliseconds before checking again } return true; } {code} What changes do you suggest to improve this synchronous verification mechanism? I'll create the PR accordingly. > Synchronously verify Connect worker startup failure in > InternalTopicsIntegrationTest > > > Key: KAFKA-16943 > URL: https://issues.apache.org/jira/browse/KAFKA-16943 > Project: Kafka > Issue Type: Improvement > Components: connect >Reporter: Chris Egerton >Priority: Minor > Labels: newbie > > Created after PR discussion > [here|https://github.com/apache/kafka/pull/16288#discussion_r1636615220]. > In some of our integration tests, we want to verify that a Connect worker > cannot start under poor conditions (such as when its internal topics do not > yet exist and it is configured to create them with a higher replication > factor than the number of available brokers, or when its internal topics > already exist but they do not have the compaction cleanup policy). 
> This is currently not possible, and presents a possible gap in testing > coverage, especially for the test cases > {{testFailToCreateInternalTopicsWithMoreReplicasThanBrokers}} and > {{{}testFailToStartWhenInternalTopicsAreNotCompacted{}}}. It'd be nice if we > could have some way of synchronously awaiting the completion or failure of > worker startup in our integration tests in order to guarantee that worker > startup fails under sufficiently adverse conditions.
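One generic way to get the synchronous wait discussed above is to have the startup path complete a future either normally or with the startup exception, and have the test block on that future with a timeout. The sketch below is illustrative only — EmbeddedConnectCluster exposes no such hook today, and all names here are hypothetical:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class StartupAwait {
    // Completed normally when the worker finishes startup, or
    // exceptionally with the cause when startup fails.
    private final CompletableFuture<Void> started = new CompletableFuture<>();

    public void onStartupSuccess() {
        started.complete(null);
    }

    public void onStartupFailure(Throwable cause) {
        started.completeExceptionally(cause);
    }

    // Blocks until startup succeeds, fails, or the timeout elapses.
    // Returns the startup failure, or null if startup succeeded.
    public Throwable awaitStartupOutcome(long timeout, TimeUnit unit)
            throws InterruptedException, TimeoutException {
        try {
            started.get(timeout, unit);
            return null;
        } catch (ExecutionException e) {
            return e.getCause();
        }
    }

    public static void main(String[] args) throws Exception {
        StartupAwait await = new StartupAwait();
        await.onStartupFailure(new IllegalStateException("replication factor 3 but only 1 broker"));
        Throwable cause = await.awaitStartupOutcome(10, TimeUnit.SECONDS);
        System.out.println("startup failed with: " + cause.getMessage());
    }
}
```

A test could then assert that awaitStartupOutcome returns a non-null cause rather than polling a "workers running" check, which cannot distinguish "failed to start" from "has not started yet".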
[jira] [Updated] (KAFKA-16522) Admin client changes for adding and removing voters
[ https://issues.apache.org/jira/browse/KAFKA-16522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] José Armando García Sancio updated KAFKA-16522: --- Fix Version/s: 3.9.0 > Admin client changes for adding and removing voters > --- > > Key: KAFKA-16522 > URL: https://issues.apache.org/jira/browse/KAFKA-16522 > Project: Kafka > Issue Type: Sub-task >Reporter: José Armando García Sancio >Assignee: Colin McCabe >Priority: Major > Fix For: 3.9.0 > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-853%3A+KRaft+Controller+Membership+Changes#KIP853:KRaftControllerMembershipChanges-Admin
[jira] [Updated] (KAFKA-16520) Changes to DescribeQuorum response
[ https://issues.apache.org/jira/browse/KAFKA-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] José Armando García Sancio updated KAFKA-16520: --- Fix Version/s: 3.9.0 (was: 3.8.0) > Changes to DescribeQuorum response > -- > > Key: KAFKA-16520 > URL: https://issues.apache.org/jira/browse/KAFKA-16520 > Project: Kafka > Issue Type: Sub-task > Components: kraft >Reporter: José Armando García Sancio >Assignee: Nikolay Izhikov >Priority: Major > Fix For: 3.9.0 > >
[jira] [Comment Edited] (KAFKA-16986) After upgrading to Kafka 3.4.1, the producer constantly produces logs related to topicId changes
[ https://issues.apache.org/jira/browse/KAFKA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17855993#comment-17855993 ] Vinicius Vieira dos Santos edited comment on KAFKA-16986 at 6/18/24 4:47 PM: - [~jolshan] The logs in the previous comments were collected from a client version 3.6.1 and the broker 3.4.1, when you refer to 3.5.0 I believe it is from the client and therefore the one I am using is in a higher version I added the producer settings we are using in the description, thanks for the help was (Author: JIRAUSER305851): [~jolshan] The logs in the previous comments were collected from a client version 3.6.1 and the broker 3.4.1, when you refer to 3.5.0 I believe it is from the client and therefore the one I am using is in a higher version > After upgrading to Kafka 3.4.1, the producer constantly produces logs related > to topicId changes > > > Key: KAFKA-16986 > URL: https://issues.apache.org/jira/browse/KAFKA-16986 > Project: Kafka > Issue Type: Bug > Components: clients, producer >Affects Versions: 3.0.1, 3.6.1 >Reporter: Vinicius Vieira dos Santos >Priority: Minor > Attachments: image.png > > > When updating the Kafka broker from version 2.7.0 to 3.4.1, we noticed that > the applications began to log the message "{*}Resetting the last seen epoch > of partition PAYMENTS-0 to 0 since the associated topicId changed from null > to szRLmiAiTs8Y0nI8b3Wz1Q{*}" in a very constant, from what I understand this > behavior is not expected because the topic was not deleted and recreated so > it should simply use the cached data and not go through this client log line. 
> We have some applications with around 15 topics and 40 partitions which means > around 600 log lines when metadata updates occur > The main thing for me is to know if this could indicate a problem or if I can > simply change the log level of the org.apache.kafka.clients.Metadata class to > warn without worries > > There are other reports of the same behavior like this: > [https://stackoverflow.com/questions/74652231/apache-kafka-resetting-the-last-seen-epoch-of-partition-why] > > *Some log occurrences over an interval of about 7 hours, each block refers to > an instance of the application in kubernetes* > > !image.png! > *My scenario:* > *Application:* > - Java: 21 > - Client: 3.6.1, also tested on 3.0.1 and has the same behavior > *Broker:* > - Cluster running on Kubernetes with the bitnami/kafka:3.4.1-debian-11-r52 > image > > *Producer Config* > > acks = -1 > auto.include.jmx.reporter = true > batch.size = 16384 > bootstrap.servers = [server:9092] > buffer.memory = 33554432 > client.dns.lookup = use_all_dns_ips > client.id = producer-1 > compression.type = gzip > connections.max.idle.ms = 54 > delivery.timeout.ms = 3 > enable.idempotence = true > interceptor.classes = [] > key.serializer = class > org.apache.kafka.common.serialization.ByteArraySerializer > linger.ms = 0 > max.block.ms = 6 > max.in.flight.requests.per.connection = 1 > max.request.size = 1048576 > metadata.max.age.ms = 30 > metadata.max.idle.ms = 30 > metric.reporters = [] > metrics.num.samples = 2 > metrics.recording.level = INFO > metrics.sample.window.ms = 3 > partitioner.adaptive.partitioning.enable = true > partitioner.availability.timeout.ms = 0 > partitioner.class = null > partitioner.ignore.keys = false > receive.buffer.bytes = 32768 > reconnect.backoff.max.ms = 1000 > reconnect.backoff.ms = 50 > request.timeout.ms = 3 > retries = 3 > retry.backoff.ms = 100 > sasl.client.callback.handler.class = null > sasl.jaas.config = [hidden] > sasl.kerberos.kinit.cmd = /usr/bin/kinit > 
sasl.kerberos.min.time.before.relogin = 6 > sasl.kerberos.service.name = null > sasl.kerberos.ticket.renew.jitter = 0.05 > sasl.kerberos.ticket.renew.window.factor = 0.8 > sasl.login.callback.handler.class = null > sasl.login.class = null > sasl.login.connect.timeout.ms = null > sasl.login.read.timeout.ms = null > sasl.login.refresh.buffer.seconds = 300 > sasl.login.refresh.min.period.seconds = 60 > sasl.login.refresh.window.factor = 0.8 > sasl.login.refresh.window.jitter = 0.05 > sasl.login.retry.backoff.max.ms = 1 > sasl.login.retry.backoff.ms = 100 > sasl.mechanism = PLAIN > sasl.oauthbearer.clock.skew.seconds = 30 > sasl.oauthbearer.expected.audience = null > sasl.oauthbearer.expected.issuer = null > sasl.oauthbearer.jwks.endpoint.refresh.ms = 360 >
[jira] [Updated] (KAFKA-16986) After upgrading to Kafka 3.4.1, the producer constantly produces logs related to topicId changes
[ https://issues.apache.org/jira/browse/KAFKA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinicius Vieira dos Santos updated KAFKA-16986: --- Description: When updating the Kafka broker from version 2.7.0 to 3.4.1, we noticed that the applications began to log the message "{*}Resetting the last seen epoch of partition PAYMENTS-0 to 0 since the associated topicId changed from null to szRLmiAiTs8Y0nI8b3Wz1Q{*}" in a very constant, from what I understand this behavior is not expected because the topic was not deleted and recreated so it should simply use the cached data and not go through this client log line. We have some applications with around 15 topics and 40 partitions which means around 600 log lines when metadata updates occur The main thing for me is to know if this could indicate a problem or if I can simply change the log level of the org.apache.kafka.clients.Metadata class to warn without worries There are other reports of the same behavior like this: [https://stackoverflow.com/questions/74652231/apache-kafka-resetting-the-last-seen-epoch-of-partition-why] *Some log occurrences over an interval of about 7 hours, each block refers to an instance of the application in kubernetes* !image.png! 
*My scenario:* *Application:* - Java: 21 - Client: 3.6.1, also tested on 3.0.1 and has the same behavior *Broker:* - Cluster running on Kubernetes with the bitnami/kafka:3.4.1-debian-11-r52 image *Producer Config* acks = -1 auto.include.jmx.reporter = true batch.size = 16384 bootstrap.servers = [server:9092] buffer.memory = 33554432 client.dns.lookup = use_all_dns_ips client.id = producer-1 compression.type = gzip connections.max.idle.ms = 54 delivery.timeout.ms = 3 enable.idempotence = true interceptor.classes = [] key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer linger.ms = 0 max.block.ms = 6 max.in.flight.requests.per.connection = 1 max.request.size = 1048576 metadata.max.age.ms = 30 metadata.max.idle.ms = 30 metric.reporters = [] metrics.num.samples = 2 metrics.recording.level = INFO metrics.sample.window.ms = 3 partitioner.adaptive.partitioning.enable = true partitioner.availability.timeout.ms = 0 partitioner.class = null partitioner.ignore.keys = false receive.buffer.bytes = 32768 reconnect.backoff.max.ms = 1000 reconnect.backoff.ms = 50 request.timeout.ms = 3 retries = 3 retry.backoff.ms = 100 sasl.client.callback.handler.class = null sasl.jaas.config = [hidden] sasl.kerberos.kinit.cmd = /usr/bin/kinit sasl.kerberos.min.time.before.relogin = 6 sasl.kerberos.service.name = null sasl.kerberos.ticket.renew.jitter = 0.05 sasl.kerberos.ticket.renew.window.factor = 0.8 sasl.login.callback.handler.class = null sasl.login.class = null sasl.login.connect.timeout.ms = null sasl.login.read.timeout.ms = null sasl.login.refresh.buffer.seconds = 300 sasl.login.refresh.min.period.seconds = 60 sasl.login.refresh.window.factor = 0.8 sasl.login.refresh.window.jitter = 0.05 sasl.login.retry.backoff.max.ms = 1 sasl.login.retry.backoff.ms = 100 sasl.mechanism = PLAIN sasl.oauthbearer.clock.skew.seconds = 30 sasl.oauthbearer.expected.audience = null sasl.oauthbearer.expected.issuer = null sasl.oauthbearer.jwks.endpoint.refresh.ms = 360 
sasl.oauthbearer.jwks.endpoint.retry.backoff.max.ms = 1 sasl.oauthbearer.jwks.endpoint.retry.backoff.ms = 100 sasl.oauthbearer.jwks.endpoint.url = null sasl.oauthbearer.scope.claim.name = scope sasl.oauthbearer.sub.claim.name = sub sasl.oauthbearer.token.endpoint.url = null security.protocol = SASL_PLAINTEXT security.providers = null send.buffer.bytes = 131072 socket.connection.setup.timeout.max.ms = 3 socket.connection.setup.timeout.ms = 1 ssl.cipher.suites = null ssl.enabled.protocols = [TLSv1.2, TLSv1.3] ssl.endpoint.identification.algorithm = https ssl.engine.factory.class = null ssl.key.password = null ssl.keymanager.algorithm = SunX509 ssl.keystore.certificate.chain = null ssl.keystore.key = null ssl.keystore.location = null ssl.keystore.password = null ssl.keystore.type = JKS ssl.protocol = TLSv1.3 ssl.provider = null ssl.secure.random.implementation = null ssl.trustmanager.algorithm = PKIX ssl.truststore.certificates = null ssl.truststore.location = null ssl.truststore.password = null ssl.truststore.type = JKS transaction.timeout.ms = 6 transactional.id = null value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer If you need any more details, please let me know. was: When updating the Kafka
[jira] [Commented] (KAFKA-16986) After upgrading to Kafka 3.4.1, the producer constantly produces logs related to topicId changes
[ https://issues.apache.org/jira/browse/KAFKA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17855993#comment-17855993 ] Vinicius Vieira dos Santos commented on KAFKA-16986: [~jolshan] The logs in the previous comments were collected from a client version 3.6.1 and the broker 3.4.1, when you refer to 3.5.0 I believe it is from the client and therefore the one I am using is in a higher version > After upgrading to Kafka 3.4.1, the producer constantly produces logs related > to topicId changes > > > Key: KAFKA-16986 > URL: https://issues.apache.org/jira/browse/KAFKA-16986 > Project: Kafka > Issue Type: Bug > Components: clients, producer >Affects Versions: 3.0.1, 3.6.1 >Reporter: Vinicius Vieira dos Santos >Priority: Minor > Attachments: image.png > > > When updating the Kafka broker from version 2.7.0 to 3.4.1, we noticed that > the applications began to log the message "{*}Resetting the last seen epoch > of partition PAYMENTS-0 to 0 since the associated topicId changed from null > to szRLmiAiTs8Y0nI8b3Wz1Q{*}" in a very constant, from what I understand this > behavior is not expected because the topic was not deleted and recreated so > it should simply use the cached data and not go through this client log line. > We have some applications with around 15 topics and 40 partitions which means > around 600 log lines when metadata updates occur > The main thing for me is to know if this could indicate a problem or if I can > simply change the log level of the org.apache.kafka.clients.Metadata class to > warn without worries > > There are other reports of the same behavior like this: > https://stackoverflow.com/questions/74652231/apache-kafka-resetting-the-last-seen-epoch-of-partition-why > > *Some log occurrences over an interval of about 7 hours, each block refers to > an instance of the application in kubernetes* > > !image.png! 
> *My scenario:* > *Application:* > - Java: 21 > - Client: 3.6.1, also tested on 3.0.1 and has the same behavior > *Broker:* > - Cluster running on Kubernetes with the bitnami/kafka:3.4.1-debian-11-r52 > image > > If you need any more details, please let me know.
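Regarding the question of lowering the verbosity: since the "Resetting the last seen epoch ..." line is emitted at INFO by org.apache.kafka.clients.Metadata, raising just that logger to WARN suppresses it without affecting other client logging. With the log4j 1.x properties format commonly used by Kafka client applications, that would look like the fragment below (illustrative; adjust the syntax to whatever logging backend the application actually uses):

```
# Silence the per-partition "Resetting the last seen epoch ..." INFO lines
log4j.logger.org.apache.kafka.clients.Metadata=WARN
```

This only hides the symptom; whether the constant epoch resets indicate a metadata problem is a separate question.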
Re: [PR] KAFKA-16772: Introduce kraft.version to support KIP-853 [kafka]
jsancio commented on code in PR #16230: URL: https://github.com/apache/kafka/pull/16230#discussion_r1644766348 ## server-common/src/main/java/org/apache/kafka/server/common/KRaftVersion.java: ## @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.kafka.server.common; + +import java.util.Collections; +import java.util.Map; + +public enum KRaftVersion implements FeatureVersion { + +// Version 1 enables KIP-853. +KRAFT_VERSION_0(0, MetadataVersion.MINIMUM_KRAFT_VERSION), +KRAFT_VERSION_1(1, MetadataVersion.IBP_3_8_IV0); Review Comment: This should be IBP_3_9_IV0 now that we moved KIP-853 to 3.9. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (KAFKA-16991) Flaky Test org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest.shouldRestoreState
Matthias J. Sax created KAFKA-16991: --- Summary: Flaky Test org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest.shouldRestoreState Key: KAFKA-16991 URL: https://issues.apache.org/jira/browse/KAFKA-16991 Project: Kafka Issue Type: Test Components: streams, unit tests Reporter: Matthias J. Sax Attachments: 5owo5xbyzjnao-org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest-shouldRestoreState()-1-output.txt We see this test running into timeouts more frequently recently. {code:java} org.opentest4j.AssertionFailedError: Condition not met within timeout 6. Repartition topic restore-test-KSTREAM-AGGREGATE-STATE-STORE-02-repartition not purged data after 6 ms. ==> expected: but was: at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)•••at org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:396)at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:393)at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:377)at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:367)at org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest.shouldRestoreState(PurgeRepartitionTopicIntegrationTest.java:220) {code} There was no ERROR or WARN log... -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16991) Flaky Test org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest.shouldRestoreState
[ https://issues.apache.org/jira/browse/KAFKA-16991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-16991: Attachment: 5owo5xbyzjnao-org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest-shouldRestoreState()-1-output.txt > Flaky Test > org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest.shouldRestoreState > --- > > Key: KAFKA-16991 > URL: https://issues.apache.org/jira/browse/KAFKA-16991 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Priority: Major > Attachments: > 5owo5xbyzjnao-org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest-shouldRestoreState()-1-output.txt > > > We see this test running into timeouts more frequently recently. > {code:java} > org.opentest4j.AssertionFailedError: Condition not met within timeout 6. > Repartition topic > restore-test-KSTREAM-AGGREGATE-STATE-STORE-02-repartition not purged > data after 6 ms. ==> expected: but was: at > org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)•••at > > org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:396)at > > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)at > org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:393)at > org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:377)at > org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:367)at > org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest.shouldRestoreState(PurgeRepartitionTopicIntegrationTest.java:220) > {code} > There was no ERROR or WARN log... -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] KIP-966 dynamic configs server side support [kafka]
mannoopj opened a new pull request, #16389: URL: https://github.com/apache/kafka/pull/16389 (no comment)
Re: [PR] kip 966 unclean leader election dynamic configs [kafka]
mannoopj closed pull request #15603: kip 966 unclean leader election dynamic configs URL: https://github.com/apache/kafka/pull/15603
[PR] [WIP]KAFKA-16908: Refactor QuorumConfig with AbstractConfig [kafka]
johnnychhsu opened a new pull request, #16388: URL: https://github.com/apache/kafka/pull/16388 Jira: https://issues.apache.org/jira/browse/KAFKA-16908 ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes)
Re: [PR] KAFKA-16527; Implement request handling for updated KRaft RPCs [kafka]
jsancio commented on code in PR #16235: URL: https://github.com/apache/kafka/pull/16235#discussion_r1644751861 ## core/src/main/scala/kafka/server/KafkaServer.scala: ## @@ -440,6 +441,8 @@ class KafkaServer( threadNamePrefix, CompletableFuture.completedFuture(quorumVoters), QuorumConfig.parseBootstrapServers(config.quorumBootstrapServers), +// Endpoint information is only needed for controllers (voters). ZK brokers can never be controllers +Endpoints.empty(), Review Comment: Fixed the comment.
Re: [PR] MINOR: update kraft_upgrade_test to create a new topic after metadata upgrade [kafka]
gaurav-narula commented on PR #15451: URL: https://github.com/apache/kafka/pull/15451#issuecomment-2176503765 Thanks for reviewing Igor! I rebased but ran into what seems like a bug in trunk. This is therefore blocked on [KAFKA-16990](https://issues.apache.org/jira/browse/KAFKA-16990)
Re: [PR] KAFKA-16527; Implement request handling for updated KRaft RPCs [kafka]
jsancio commented on code in PR #16235: URL: https://github.com/apache/kafka/pull/16235#discussion_r1644748799 ## clients/src/main/resources/common/message/VoteRequest.json: ## @@ -18,30 +18,36 @@ "type": "request", "listeners": ["controller"], "name": "VoteRequest", - "validVersions": "0", + "validVersions": "0-1", Review Comment: Fixed. ## clients/src/main/resources/common/message/VoteResponse.json: ## @@ -17,29 +17,37 @@ "apiKey": 52, "type": "response", "name": "VoteResponse", - "validVersions": "0", + "validVersions": "0-1", Review Comment: Fixed.
[jira] [Created] (KAFKA-16990) Unrecognised flag passed to kafka-storage.sh in system test
Gaurav Narula created KAFKA-16990: - Summary: Unrecognised flag passed to kafka-storage.sh in system test Key: KAFKA-16990 URL: https://issues.apache.org/jira/browse/KAFKA-16990 Project: Kafka Issue Type: Test Reporter: Gaurav Narula Running {{TC_PATHS="tests/kafkatest/tests/core/kraft_upgrade_test.py::TestKRaftUpgrade" bash tests/docker/run_tests.sh}} on trunk (c4a3d2475f) fails with the following: {code:java} [INFO:2024-06-18 09:16:03,139]: Triggering test 2 of 32... [INFO:2024-06-18 09:16:03,147]: RunnerClient: Loading test {'directory': '/opt/kafka-dev/tests/kafkatest/tests/core', 'file_name': 'kraft_upgrade_test.py', 'cls_name': 'TestKRaftUpgrade', 'method_name': 'test_isolated_mode_upgrade', 'injected_args': {'from_kafka_version': '3.1.2', 'use_new_coordinator': True, 'metadata_quorum': 'ISOLATED_KRAFT'}} [INFO:2024-06-18 09:16:03,151]: RunnerClient: kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: on run 1/1 [INFO:2024-06-18 09:16:03,153]: RunnerClient: kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: Setting up... [INFO:2024-06-18 09:16:03,153]: RunnerClient: kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: Running... [INFO:2024-06-18 09:16:05,999]: RunnerClient: kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: Tearing down... 
[INFO:2024-06-18 09:16:12,366]: RunnerClient: kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: FAIL: RemoteCommandError({'ssh_config': {'host': 'ducker10', 'hostname': 'ducker10', 'user': 'ducker', 'port': 22, 'password': '', 'identityfile': '/home/ducker/.ssh/id_rsa', 'connecttimeout': None}, 'hostname': 'ducker10', 'ssh_hostname': 'ducker10', 'user': 'ducker', 'externally_routable_ip': 'ducker10', '_logger': , 'os': 'linux', '_ssh_client': , '_sftp_client': , '_custom_ssh_exception_checks': None}, '/opt/kafka-3.1.2/bin/kafka-storage.sh format --ignore-formatted --config /mnt/kafka/kafka.properties --cluster-id I2eXt9rvSnyhct8BYmW6-w -f group.version=1', 1, b"usage: kafka-storage format [-h] --config CONFIG --cluster-id CLUSTER_ID\n [--ignore-formatted]\nkafka-storage: error: unrecognized arguments: '-f'\n") Traceback (most recent call last): File "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", line 186, in _do_run data = self.run_test() File "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", line 246, in run_test return self.test_context.function(self.test) File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line 433, in wrapper return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs) File "/opt/kafka-dev/tests/kafkatest/tests/core/kraft_upgrade_test.py", line 132, in test_isolated_mode_upgrade self.run_upgrade(from_kafka_version, group_protocol) File "/opt/kafka-dev/tests/kafkatest/tests/core/kraft_upgrade_test.py", line 96, in run_upgrade self.kafka.start() File "/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 669, in start self.isolated_controller_quorum.start() File "/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 671, in start Service.start(self, **kwargs) File "/usr/local/lib/python3.9/dist-packages/ducktape/services/service.py", line 265, 
in start self.start_node(node, **kwargs) File "/usr/local/lib/python3.9/dist-packages/ducktape/cluster/remoteaccount.py", line 35, in wrapper return method(self, *args, **kwargs) File "/usr/local/lib/python3.9/dist-packages/ducktape/cluster/remoteaccount.py", line 310, in ssh raise RemoteCommandError(self, cmd, exit_status, stderr.read()) ducktape.cluster.remoteaccount.RemoteCommandError: ducker@ducker10: Command '/opt/kafka-3.1.2/bin/kafka-storage.sh format --ignore-formatted --config /mnt/kafka/kafka.properties --cluster-id I2eXt9rvSnyhct8BYmW6-w -f group.version=1' returned non-zero exit status 1. Remote error message: b"usage: kafka-storage format [-h] --config CONFIG --cluster-id CLUSTER_ID\n [--ignore-formatted]\nkafka-storage: error: unrecognized arguments: '-f'\n" {code} This may be related to KAFKA-16860
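The failure above comes from the test harness unconditionally appending `-f group.version=1` to a 3.1.2 `kafka-storage.sh`, which predates that flag. One plausible fix direction, sketched here with hypothetical names (the helper, its signature, and the 3.5 version threshold are assumptions, not Kafka's actual patch), is to gate the flag on the version of the tool being invoked:

```python
# Hedged sketch: build the storage-format command, passing feature flags only
# when the tool being invoked is new enough to understand them. Helper name,
# signature, and the (3, 5) threshold are illustrative assumptions.

def storage_format_cmd(kafka_home, config_path, cluster_id,
                       kafka_version, features=None):
    cmd = (f"{kafka_home}/bin/kafka-storage.sh format --ignore-formatted "
           f"--config {config_path} --cluster-id {cluster_id}")
    # kafka-storage.sh in 3.1.x only accepts --config/--cluster-id/
    # --ignore-formatted; unconditionally appending `-f key=value` makes it
    # exit with "unrecognized arguments: '-f'". Gate on the tool's version.
    supports_feature_flag = \
        tuple(int(p) for p in kafka_version.split(".")[:2]) >= (3, 5)
    if features and supports_feature_flag:
        for name, level in features.items():
            cmd += f" -f {name}={level}"
    return cmd
```

With a guard like this, formatting storage under an old from_kafka_version simply omits the feature arguments instead of crashing the upgrade test during setup.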
[jira] [Assigned] (KAFKA-16990) Unrecognised flag passed to kafka-storage.sh in system test
[ https://issues.apache.org/jira/browse/KAFKA-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Narula reassigned KAFKA-16990: - Assignee: Gaurav Narula > Unrecognised flag passed to kafka-storage.sh in system test > --- > > Key: KAFKA-16990 > URL: https://issues.apache.org/jira/browse/KAFKA-16990 > Project: Kafka > Issue Type: Test >Reporter: Gaurav Narula >Assignee: Gaurav Narula >Priority: Major > > Running > {{TC_PATHS="tests/kafkatest/tests/core/kraft_upgrade_test.py::TestKRaftUpgrade" > bash tests/docker/run_tests.sh}} on trunk (c4a3d2475f) fails with the > following: > {code:java} > [INFO:2024-06-18 09:16:03,139]: Triggering test 2 of 32... > [INFO:2024-06-18 09:16:03,147]: RunnerClient: Loading test {'directory': > '/opt/kafka-dev/tests/kafkatest/tests/core', 'file_name': > 'kraft_upgrade_test.py', 'cls_name': 'TestKRaftUpgrade', 'method_name': > 'test_isolated_mode_upgrade', 'injected_args': {'from_kafka_version': > '3.1.2', 'use_new_coordinator': True, 'metadata_quorum': 'ISOLATED_KRAFT'}} > [INFO:2024-06-18 09:16:03,151]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > on run 1/1 > [INFO:2024-06-18 09:16:03,153]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > Setting up... > [INFO:2024-06-18 09:16:03,153]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > Running... > [INFO:2024-06-18 09:16:05,999]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > Tearing down... 
> [INFO:2024-06-18 09:16:12,366]: RunnerClient: > kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT: > FAIL: RemoteCommandError({'ssh_config': {'host': 'ducker10', 'hostname': > 'ducker10', 'user': 'ducker', 'port': 22, 'password': '', 'identityfile': > '/home/ducker/.ssh/id_rsa', 'connecttimeout': None}, 'hostname': 'ducker10', > 'ssh_hostname': 'ducker10', 'user': 'ducker', 'externally_routable_ip': > 'ducker10', '_logger': kafkatest.tests.core.kraft_upgrade_test.TestKRaftUpgrade.test_isolated_mode_upgrade.from_kafka_version=3.1.2.use_new_coordinator=True.metadata_quorum=ISOLATED_KRAFT-2 > (DEBUG)>, 'os': 'linux', '_ssh_client': 0x85bccc70>, '_sftp_client': 0x85bccdf0>, '_custom_ssh_exception_checks': None}, > '/opt/kafka-3.1.2/bin/kafka-storage.sh format --ignore-formatted --config > /mnt/kafka/kafka.properties --cluster-id I2eXt9rvSnyhct8BYmW6-w -f > group.version=1', 1, b"usage: kafka-storage format [-h] --config CONFIG > --cluster-id CLUSTER_ID\n > [--ignore-formatted]\nkafka-storage: error: unrecognized arguments: '-f'\n") > Traceback (most recent call last): > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 186, in _do_run > data = self.run_test() > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 246, in run_test > return self.test_context.function(self.test) > File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line > 433, in wrapper > return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs) > File "/opt/kafka-dev/tests/kafkatest/tests/core/kraft_upgrade_test.py", > line 132, in test_isolated_mode_upgrade > self.run_upgrade(from_kafka_version, group_protocol) > File "/opt/kafka-dev/tests/kafkatest/tests/core/kraft_upgrade_test.py", > line 96, in run_upgrade > self.kafka.start() > File 
"/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 669, in > start > self.isolated_controller_quorum.start() > File "/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 671, in > start > Service.start(self, **kwargs) > File "/usr/local/lib/python3.9/dist-packages/ducktape/services/service.py", > line 265, in start > self.start_node(node, **kwargs) > File "/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 902, in > start_node > node.account.ssh(cmd) > File > "/usr/local/lib/python3.9/dist-packages/ducktape/cluster/remoteaccount.py", > line 35, in wrapper > return method(self, *args, **kwargs) > File > "/usr/local/lib/python3.9/dist-packages/ducktape/cluster/remoteaccount.py", > line 310, in ssh > raise
Re: [PR] KAFKA-16527; Implement request handling for updated KRaft RPCs [kafka]
jsancio commented on code in PR #16235: URL: https://github.com/apache/kafka/pull/16235#discussion_r1644747026 ## clients/src/main/resources/common/message/FetchSnapshotRequest.json: ## @@ -18,7 +18,7 @@ "type": "request", "listeners": ["controller"], "name": "FetchSnapshotRequest", - "validVersions": "0", + "validVersions": "0-1", Review Comment: Fixed. ## clients/src/main/resources/common/message/FetchSnapshotResponse.json: ## @@ -17,7 +17,7 @@ "apiKey": 59, "type": "response", "name": "FetchSnapshotResponse", - "validVersions": "0", + "validVersions": "0-1", Review Comment: Fixed.
Re: [PR] KAFKA-16527; Implement request handling for updated KRaft RPCs [kafka]
jsancio commented on code in PR #16235: URL: https://github.com/apache/kafka/pull/16235#discussion_r1644744740 ## clients/src/main/resources/common/message/EndQuorumEpochResponse.json: ## @@ -17,25 +17,35 @@ "apiKey": 54, "type": "response", "name": "EndQuorumEpochResponse", - "validVersions": "0", - "flexibleVersions": "none", + "validVersions": "0-1", Review Comment: Fixed.
Re: [PR] KAFKA-16527; Implement request handling for updated KRaft RPCs [kafka]
jsancio commented on code in PR #16235: URL: https://github.com/apache/kafka/pull/16235#discussion_r1644744458 ## clients/src/main/resources/common/message/EndQuorumEpochRequest.json: ## @@ -18,26 +18,41 @@ "type": "request", "listeners": ["controller"], "name": "EndQuorumEpochRequest", - "validVersions": "0", - "flexibleVersions": "none", + "validVersions": "0-1", Review Comment: Fixed.
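The review threads above all apply the same edit to KRaft RPC definitions under clients/src/main/resources/common/message/: bumping the JSON "validVersions" field from "0" to "0-1". A hedged sketch of a standalone checker for that range syntax (not part of Kafka's actual code generator; the function name is illustrative):

```python
import json

def parse_valid_versions(spec_text):
    """Parse a message-schema snippet and return its validVersions as
    a (low, high) tuple; "0" means (0, 0), "0-1" means (0, 1).

    Illustrative only -- Kafka's real generator does much more than this.
    """
    spec = json.loads(spec_text)
    raw = spec["validVersions"]
    lo, _, hi = raw.partition("-")
    low, high = int(lo), int(hi) if hi else int(lo)
    if low > high:
        raise ValueError(f"invalid validVersions range: {raw!r}")
    return low, high

snippet = '{"apiKey": 54, "name": "EndQuorumEpochResponse", "validVersions": "0-1"}'
# parse_valid_versions(snippet) -> (0, 1)
```

The pattern in the diffs (a single version "0" widening to a contiguous range "0-1") is exactly what such a parser would need to accept.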
Re: [PR] KAFKA-16968: Make 3.8-IV0 a stable MetadataVersion and create 3.9-IV0 [kafka]
junrao commented on PR #16347: URL: https://github.com/apache/kafka/pull/16347#issuecomment-2176493383 > Maybe someone can explain more about what https://github.com/apache/kafka/pull/15673 does exactly. @cmccabe : The main issue that https://github.com/apache/kafka/pull/15673 fixes is described in https://issues.apache.org/jira/browse/KAFKA-16480. ``` https://issues.apache.org/jira/browse/KAFKA-16154 introduced the changes to the ListOffsets API to accept latest-tiered-timestamp and return the corresponding offset. Those changes should have a) increased the version of the ListOffsets API b) increased the inter-broker protocol version c) hidden the latest version of the ListOffsets behind the latestVersionUnstable flag ``` So, let's hold off on merging this PR until we understand how to consolidate it with https://github.com/apache/kafka/pull/15673.