[jira] [Resolved] (KAFKA-17805) Deprecate named topologies
[ https://issues.apache.org/jira/browse/KAFKA-17805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax resolved KAFKA-17805.
-------------------------------------
    Resolution: Fixed

> Deprecate named topologies
> --------------------------
>
>                 Key: KAFKA-17805
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17805
>             Project: Kafka
>          Issue Type: Task
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Assignee: A. Sophie Blee-Goldman
>            Priority: Major
>             Fix For: 4.0.0
>
> We plan to eventually phase out the experimental "named topologies" feature,
> since new features and functionality in Streams will not be compatible with
> named topologies and continuing to support them would result in increasing
> tech debt over time.
>
> However, it is as-yet unknown how many users have deployed named topologies
> in production. We know of at least one case but hope to hear from any others
> who may be concerned about this deprecation, so that we can work on an
> alternative or even revert the deprecation if need be.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Resolved] (KAFKA-7108) "Exactly-once" stream breaks production exception handler contract
[ https://issues.apache.org/jira/browse/KAFKA-7108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax resolved KAFKA-7108.
------------------------------------
    Resolution: Duplicate

> "Exactly-once" stream breaks production exception handler contract
> ------------------------------------------------------------------
>
>                 Key: KAFKA-7108
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7108
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 1.1.0
>            Reporter: Anna O
>            Priority: Major
>              Labels: exactly-once
>
> I have a stream configured with "default.production.exception.handler" that
> is supposed to log the error and continue. When I set "processing.guarantee"
> to "exactly_once", it appeared that a retriable NotEnoughReplicasException
> that passed the production exception handler was rethrown by the
> TransactionManager, wrapped in a KafkaException, and terminated the stream
> thread:
> {code}
> org.apache.kafka.common.KafkaException: Cannot execute transactional method because we are in an error state
>     at org.apache.kafka.clients.producer.internals.TransactionManager.maybeFailWithError(TransactionManager.java:784) ~[kafka-clients-1.1.0.jar:?]
>     at org.apache.kafka.clients.producer.internals.TransactionManager.sendOffsetsToTransaction(TransactionManager.java:250) ~[kafka-clients-1.1.0.jar:?]
>     at org.apache.kafka.clients.producer.KafkaProducer.sendOffsetsToTransaction(KafkaProducer.java:617) ~[kafka-clients-1.1.0.jar:?]
>     at org.apache.kafka.streams.processor.internals.StreamTask.commitOffsets(StreamTask.java:357) ~[kafka-streams-1.1.0.jar:?]
>     at org.apache.kafka.streams.processor.internals.StreamTask.access$000(StreamTask.java:53) ~[kafka-streams-1.1.0.jar:?]
>     at org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:316) ~[kafka-streams-1.1.0.jar:?]
>     at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:208) ~[kafka-streams-1.1.0.jar:?]
>     at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:307) ~[kafka-streams-1.1.0.jar:?]
>     at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:297) ~[kafka-streams-1.1.0.jar:?]
>     at org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:67) ~[kafka-streams-1.1.0.jar:?]
>     at org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:357) [kafka-streams-1.1.0.jar:?]
>     at org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:347) [kafka-streams-1.1.0.jar:?]
>     at org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:403) [kafka-streams-1.1.0.jar:?]
>     at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:994) [kafka-streams-1.1.0.jar:?]
>     at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:811) [kafka-streams-1.1.0.jar:?]
>     at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:750) [kafka-streams-1.1.0.jar:?]
>     at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:720) [kafka-streams-1.1.0.jar:?]
> Caused by: org.apache.kafka.common.errors.NotEnoughReplicasException: Messages are rejected since there are fewer in-sync replicas than required.
> {code}
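For context, the reported setup can be sketched with plain `java.util.Properties`. The handler class name below is a hypothetical placeholder; the two config keys are the ones quoted in the report.

```java
import java.util.Properties;

class HandlerConfigSketch {
    // Sketch of the configuration described in the report above. The class
    // "com.example.LogAndContinueProductionExceptionHandler" is hypothetical;
    // the two keys are the ones the reporter quotes.
    static Properties streamsProps() {
        Properties props = new Properties();
        // Handler that is supposed to log the error and continue:
        props.put("default.production.exception.handler",
                  "com.example.LogAndContinueProductionExceptionHandler");
        // Enabling EOS is what triggered the reported behavior: the retriable
        // error the handler swallowed was still rethrown by the TransactionManager.
        props.put("processing.guarantee", "exactly_once");
        return props;
    }
}
```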
[jira] [Resolved] (KAFKA-16331) Remove Deprecated EOSv1
[ https://issues.apache.org/jira/browse/KAFKA-16331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax resolved KAFKA-16331.
-------------------------------------
    Resolution: Fixed

> Remove Deprecated EOSv1
> -----------------------
>
>                 Key: KAFKA-16331
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16331
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: streams
>            Reporter: Matthias J. Sax
>            Assignee: Matthias J. Sax
>            Priority: Blocker
>             Fix For: 4.0.0
>
> EOSv1 was deprecated in AK 3.0 via
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-732%3A+Deprecate+eos-alpha+and+replace+eos-beta+with+eos-v2]
> * remove config
> * remove Producer#sendOffsetsToTransaction
> * clean up code
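What the removal means for applications can be sketched as a config migration, assuming the values documented in KIP-732; the helper below is illustrative, not a Kafka API.

```java
import java.util.Properties;

class EosMigrationSketch {
    // Illustrative helper (not a Kafka API): with EOSv1 removed, both
    // deprecated values map to "exactly_once_v2", per KIP-732.
    static Properties migrate(Properties old) {
        Properties updated = new Properties();
        updated.putAll(old);
        String guarantee = old.getProperty("processing.guarantee");
        if ("exactly_once".equals(guarantee)           // eos-alpha (EOSv1)
                || "exactly_once_beta".equals(guarantee)) { // eos-beta
            updated.put("processing.guarantee", "exactly_once_v2");
        }
        return updated;
    }
}
```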
[jira] [Resolved] (KAFKA-12827) Remove Deprecated method KafkaStreams#setUncaughtExceptionHandler
[ https://issues.apache.org/jira/browse/KAFKA-12827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax resolved KAFKA-12827.
-------------------------------------
    Resolution: Fixed

> Remove Deprecated method KafkaStreams#setUncaughtExceptionHandler
> -----------------------------------------------------------------
>
>                 Key: KAFKA-12827
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12827
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: streams
>            Reporter: Josep Prat
>            Assignee: Abhishek Giri
>            Priority: Blocker
>             Fix For: 4.0.0
>
> Method
> org.apache.kafka.streams.KafkaStreams#setUncaughtExceptionHandler(java.lang.Thread.UncaughtExceptionHandler)
> was deprecated in 2.8.
>
> See KAFKA-9331
[jira] [Resolved] (KAFKA-17114) DefaultStateUpdater::handleRuntimeException should update isRunning before calling `addToExceptionsAndFailedTasksThenClearUpdatingAndPausedTasks`
[ https://issues.apache.org/jira/browse/KAFKA-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax resolved KAFKA-17114.
-------------------------------------
    Resolution: Cannot Reproduce

We did push some fixes for related tickets. Closing this for now.

> DefaultStateUpdater::handleRuntimeException should update isRunning before
> calling `addToExceptionsAndFailedTasksThenClearUpdatingAndPausedTasks`
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-17114
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17114
>             Project: Kafka
>          Issue Type: Test
>          Components: streams, unit tests
>    Affects Versions: 3.9.0
>            Reporter: Ao Li
>            Priority: Minor
>
> I saw a flaky test in
> DefaultStateUpdaterTest::shouldThrowIfAddingStandbyAndActiveTaskWithSameId
> recently.
> {code}
> org.opentest4j.AssertionFailedError: expected: <false> but was: <true>
>     at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>     at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>     at org.junit.jupiter.api.AssertFalse.failNotFalse(AssertFalse.java:63)
>     at org.junit.jupiter.api.AssertFalse.assertFalse(AssertFalse.java:36)
>     at org.junit.jupiter.api.AssertFalse.assertFalse(AssertFalse.java:31)
>     at org.junit.jupiter.api.Assertions.assertFalse(Assertions.java:231)
>     at org.apache.kafka.streams.processor.internals.DefaultStateUpdaterTest.shouldThrowIfAddingTasksWithSameId(DefaultStateUpdaterTest.java:294)
>     at org.apache.kafka.streams.processor.internals.DefaultStateUpdaterTest.shouldThrowIfAddingStandbyAndActiveTaskWithSameId(DefaultStateUpdaterTest.java:285)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:580)
>     at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
>     at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
> {code}
> To make the bug more reproducible, you may add `Thread.sleep(5)` after
> `addToExceptionsAndFailedTasksThenClearUpdatingAndPausedTasks(runtimeException);`
> in DefaultStateUpdater::handleRuntimeException.
> The test is flaky because
> `addToExceptionsAndFailedTasksThenClearUpdatingAndPausedTasks(runtimeException);`
> will unblock the `verifyFailedTasks(IllegalStateException.class, task1);`
> statement in DefaultStateUpdaterTest::shouldThrowIfAddingTasksWithSameId.
> If `assertFalse(stateUpdater.isRunning());` is executed before
> `isRunning.set(false);`, the test will fail.
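The ordering the ticket title asks for can be sketched self-contained (illustrative class, not Kafka's actual DefaultStateUpdater): flip the running flag before publishing the failure, so any observer unblocked by the publish already sees isRunning == false.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch of the requested ordering; not Kafka's actual code.
class UpdaterSketch {
    final AtomicBoolean isRunning = new AtomicBoolean(true);
    final List<RuntimeException> failedTasks = new ArrayList<>();

    void handleRuntimeException(RuntimeException e) {
        // 1) Make the shutdown visible first ...
        isRunning.set(false);
        // 2) ... and only then publish the failure, which is the step that
        //    unblocks waiters (like verifyFailedTasks(...) in the test).
        synchronized (failedTasks) {
            failedTasks.add(e);
            failedTasks.notifyAll();
        }
    }
}
```

With this ordering, a waiter woken by the publish can never observe `isRunning == true`, which is exactly the race the flaky assertion hit.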
[jira] [Reopened] (KAFKA-9323) Refactor Streams' upgrade system tests
[ https://issues.apache.org/jira/browse/KAFKA-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax reopened KAFKA-9323:
------------------------------------

> Refactor Streams' upgrade system tests
> --------------------------------------
>
>                 Key: KAFKA-9323
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9323
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Priority: Major
>
> With the introduction of version probing in 2.0 and cooperative rebalancing
> in 2.4, the specific upgrade path depends heavily on the to & from version.
> This can be a complex operation, and we should make sure to test a realistic
> upgrade scenario across all possible combinations. The current system tests
> have gaps, however, which have allowed bugs in the upgrade path to slip by
> unnoticed for several versions.
> Our current system tests include a "simple upgrade downgrade" test, a
> metadata upgrade test, a version probing test, and a cooperative upgrade
> test. This has a few drawbacks and current issues:
> a) only the version probing test tests "forwards compatibility" (upgrade from
> latest to future version)
> b) nothing tests version probing "backwards compatibility" (upgrade from
> older version to latest), except:
> c) the cooperative rebalancing test actually happens to involve a version
> probing step, and so coincidentally DOES test VP (but only starting with 2.4)
> d) each test itself tries to test the upgrade across different versions,
> meaning there may be overlap and/or unnecessary tests. For example, currently
> both the metadata_upgrade and cooperative_upgrade tests will test the upgrade
> of 1.0 -> 2.4
> e) worse, a number of (to, from) pairs are not tested according to the
> correct upgrade path at all, which has led to easily reproducible bugs
> slipping past for several versions.
> f) we have a test_simple_upgrade_downgrade test which does not actually do a
> downgrade, and for some reason only tests upgrading within the range [0.10.1
> - 1.1]
> g) as new versions are released, it is unclear to those not directly involved
> in these tests and/or projects whether and what needs to be updated (e.g.
> should this new version be added to the cooperative test? should the old
> version be added to the metadata test?)
> We should definitely fill in the testing gap here, but how to do so is of
> course up for discussion.
> I would propose to refactor the upgrade tests: rather than maintain
> different lists of versions to pass as input to each different test, we
> should have a single matrix that we update with each new version, which
> specifies the upgrade path each version combination actually requires. We
> can then loop through each version combination and test only the actual
> upgrade path that users would need to follow. This way we can be sure we
> are not missing anything, as each and every possible upgrade would be
> tested.
[jira] [Resolved] (KAFKA-9323) Refactor Streams' upgrade system tests
[ https://issues.apache.org/jira/browse/KAFKA-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax resolved KAFKA-9323.
------------------------------------
    Resolution: Fixed

> Refactor Streams' upgrade system tests
> --------------------------------------
>
>                 Key: KAFKA-9323
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9323
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Priority: Major
>
> With the introduction of version probing in 2.0 and cooperative rebalancing
> in 2.4, the specific upgrade path depends heavily on the to & from version.
> This can be a complex operation, and we should make sure to test a realistic
> upgrade scenario across all possible combinations. The current system tests
> have gaps, however, which have allowed bugs in the upgrade path to slip by
> unnoticed for several versions.
> Our current system tests include a "simple upgrade downgrade" test, a
> metadata upgrade test, a version probing test, and a cooperative upgrade
> test. This has a few drawbacks and current issues:
> a) only the version probing test tests "forwards compatibility" (upgrade from
> latest to future version)
> b) nothing tests version probing "backwards compatibility" (upgrade from
> older version to latest), except:
> c) the cooperative rebalancing test actually happens to involve a version
> probing step, and so coincidentally DOES test VP (but only starting with 2.4)
> d) each test itself tries to test the upgrade across different versions,
> meaning there may be overlap and/or unnecessary tests. For example, currently
> both the metadata_upgrade and cooperative_upgrade tests will test the upgrade
> of 1.0 -> 2.4
> e) worse, a number of (to, from) pairs are not tested according to the
> correct upgrade path at all, which has led to easily reproducible bugs
> slipping past for several versions.
> f) we have a test_simple_upgrade_downgrade test which does not actually do a
> downgrade, and for some reason only tests upgrading within the range [0.10.1
> - 1.1]
> g) as new versions are released, it is unclear to those not directly involved
> in these tests and/or projects whether and what needs to be updated (e.g.
> should this new version be added to the cooperative test? should the old
> version be added to the metadata test?)
> We should definitely fill in the testing gap here, but how to do so is of
> course up for discussion.
> I would propose to refactor the upgrade tests: rather than maintain
> different lists of versions to pass as input to each different test, we
> should have a single matrix that we update with each new version, which
> specifies the upgrade path each version combination actually requires. We
> can then loop through each version combination and test only the actual
> upgrade path that users would need to follow. This way we can be sure we
> are not missing anything, as each and every possible upgrade would be
> tested.
[jira] [Resolved] (KAFKA-17402) Test failure: DefaultStateUpdaterTest.shouldGetTasksFromRestoredActiveTasks expected: <2> but was: <3>
[ https://issues.apache.org/jira/browse/KAFKA-17402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax resolved KAFKA-17402.
-------------------------------------
    Resolution: Cannot Reproduce

We did put out some fixes for related tickets. Closing this for now.

> Test failure: DefaultStateUpdaterTest.shouldGetTasksFromRestoredActiveTasks
> expected: <2> but was: <3>
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-17402
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17402
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: Ao Li
>            Priority: Major
>
> I saw a test failure caused by a concurrency issue in DefaultStateUpdater.
> {code}
> org.opentest4j.AssertionFailedError: expected: <2> but was: <3>
>     at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>     at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>     at org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
>     at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150)
>     at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:145)
>     at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:531)
>     at org.apache.kafka.streams.processor.internals.DefaultStateUpdaterTest.verifyGetTasks(DefaultStateUpdaterTest.java:1688)
>     at org.apache.kafka.streams.processor.internals.DefaultStateUpdaterTest.shouldGetTasksFromRestoredActiveTasks(DefaultStateUpdaterTest.java:1517)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:580)
>     at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
>     at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
> {code}
> To reproduce the failure, you may clone this fork
> https://github.com/aoli-al/kafka/tree/KAFKA-92 and run `./gradlew
> :streams:test --tests
> DefaultStateUpdaterTest.shouldGetTasksFromRestoredActiveTasks`
> The root cause of the issue is that the function
> `DefaultStateUpdater::maybeCompleteRestoration` is not atomic.
> {code}
> // This code will unblock `verifyRestoredActiveTasks` in the test
> addToRestoredTasks(task);
> // If the test method resumes before the task is removed from
> // `updatingTasks`, the test will fail.
> updatingTasks.remove(task.id());
> {code}
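The fix shape is to perform the hand-off under one lock, so an observer synchronizing on the same lock can never see the task in both collections, or in neither. A minimal self-contained sketch with illustrative names (not Kafka's actual code):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of an atomic updating -> restored hand-off.
class RestorationSketch {
    private final Object lock = new Object();
    private final Map<String, String> updatingTasks = new HashMap<>();
    private final Set<String> restoredActiveTasks = new HashSet<>();

    void addUpdating(String taskId, String task) {
        synchronized (lock) { updatingTasks.put(taskId, task); }
    }

    // Remove from updatingTasks and add to restoredActiveTasks in one
    // critical section, so no intermediate state is observable.
    void completeRestoration(String taskId) {
        synchronized (lock) {
            String task = updatingTasks.remove(taskId);
            if (task != null) {
                restoredActiveTasks.add(task);
            }
        }
    }

    int updatingCount() {
        synchronized (lock) { return updatingTasks.size(); }
    }

    int restoredCount() {
        synchronized (lock) { return restoredActiveTasks.size(); }
    }
}
```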
[jira] [Resolved] (KAFKA-16336) Remove Deprecated metric standby-process-ratio
[ https://issues.apache.org/jira/browse/KAFKA-16336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax resolved KAFKA-16336.
-------------------------------------
    Resolution: Invalid

> Remove Deprecated metric standby-process-ratio
> ----------------------------------------------
>
>                 Key: KAFKA-16336
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16336
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: streams
>            Reporter: Matthias J. Sax
>            Assignee: TengYao Chi
>            Priority: Blocker
>
> Metric "standby-process-ratio" was deprecated in the 3.5 release via
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-869%3A+Improve+Streams+State+Restoration+Visibility
[jira] [Resolved] (KAFKA-17697) Fix flaky DefaultStateUpdaterTest.shouldRestoreSingleActiveStatefulTask
[ https://issues.apache.org/jira/browse/KAFKA-17697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax resolved KAFKA-17697.
-------------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

> Fix flaky DefaultStateUpdaterTest.shouldRestoreSingleActiveStatefulTask
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-17697
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17697
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams, unit tests
>            Reporter: Yu-Lin Chen
>            Assignee: Bruno Cadonna
>            Priority: Major
>             Fix For: 4.0.0
>
>         Attachments: 0001-reproduce-the-racing-issue.patch
>
> 28 flaky runs out of 253 trunk builds in the past 28 days (github). ([Report
> Link|https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.startTimeMax=1727973081200&search.startTimeMin=172555200&search.tags=trunk,github&search.timeZoneId=Asia%2FTaipei&tests.container=org.apache.kafka.streams.processor.internals.DefaultStateUpdaterTest&tests.test=shouldRestoreSingleActiveStatefulTask()])
> The issue can be reproduced in my local env within 20 loops. Can also be
> reproduced by the attached patch: [^0001-reproduce-the-racing-issue.patch]
> ([Oct 2 2024 at 05:39:43
> CST|https://ge.apache.org/s/5gsvq5esvbouc/tests/task/:streams:test/details/org.apache.kafka.streams.processor.internals.DefaultStateUpdaterTest/shouldRestoreSingleActiveStatefulTask()?expanded-stacktrace=WyIwIl0&top-execution=1])
> {code:java}
> org.opentest4j.AssertionFailedError: Condition not met within timeout 3.
> Did not get all restored active task within the given timeout!
> ==> expected: <true> but was: <false>
>     at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>     at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>     at org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
>     at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
>     at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214)
>     at org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:396)
>     at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)
>     at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:393)
>     at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:377)
>     at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:367)
>     at org.apache.kafka.streams.processor.internals.DefaultStateUpdaterTest.verifyRestoredActiveTasks(DefaultStateUpdaterTest.java:1715)
>     at org.apache.kafka.streams.processor.internals.DefaultStateUpdaterTest.shouldRestoreSingleActiveStatefulTask(DefaultStateUpdaterTest.java:340)
>     at java.lang.reflect.Method.invoke(Method.java:566)
>     at java.util.ArrayList.forEach(ArrayList.java:1541)
>     at java.util.ArrayList.forEach(ArrayList.java:1541)
> {code}
> *Root Cause*: Race between the two threads below:
> 1. stateUpdater.add(task) in the test thread [1]
> 2. runOnce() in the StateUpdaterThread loop [2]
> In the scenario below, the StateUpdaterThread hangs even though there are
> updatingTasks.
> *Flaky scenario*: If stateUpdater.add(task) runs after the first runOnce()
> loop, the second loop will hang. [3][4] allWorkDone() in the second
> runOnce() loop will be true [5], even though updatingTasks.isEmpty() is
> false. [6]
> Below is the flow of the flaky scenario:
> # runOnce() loop 1: completedChangelogs() returns emptySet
> # runOnce() loop 1: allChangelogsCompleted() returns false, updatingTasks is
> empty, allWorkDone() is false. {color:#de350b}Called
> tasksAndActionsCondition.await(){color}. (It will be notified by
> stateUpdater.add(task) [1][7])
> # test thread calls stateUpdater.add(task)
> # runOnce() loop 1: allChangelogsCompleted() returns false again before
> quitting the while loop. allWorkDone() is false because tasksAndActions is
> not empty. [8]
> # runOnce() loop 2: completedChangelogs() returns 1 topic partition
> # runOnce() loop 2: allChangelogsCompleted() returns true, allWorkDone() is
> true, {color:#de350b}call tasksAndActionsCondition.await() again{color} and
> it is never notified.
>
> The happy path is: (stateUpdater.add(task) ran before the end of the first
> runOnce() loop)
> # runOnce() loop 1: completedChangelogs() returns emptySet
> # runOnce() loop 1: allChangelogsCompleted() returns false, updatingTasks is
> not empty, allWorkDone() is false
> # runOnce() loop 2: completedChangelogs() returns 1 topic partition
> # runOnce() loop 2: allChangelogsCompleted() returns false, updatingTasks is
> not empty, allWorkDone() is false
> # runOnce() loop 3: completedChangelogs() returns 2 topic partitions,
> {color:#4c9aff}+move
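The lost-wakeup shape described above can be sketched self-contained with a guarded wait: the updater re-checks the full work predicate (pending adds AND in-progress tasks) under the lock on every wakeup, and only parks while there is truly no work. Names mirror the ticket, but this is not Kafka's actual code.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative guarded-wait sketch; not Kafka's DefaultStateUpdater.
class UpdaterLoopSketch {
    private final ReentrantLock tasksAndActionsLock = new ReentrantLock();
    private final Condition tasksAndActionsCondition = tasksAndActionsLock.newCondition();
    private final Queue<String> tasksAndActions = new ArrayDeque<>();
    private int updatingTasks = 0;

    void add(String task) {
        tasksAndActionsLock.lock();
        try {
            tasksAndActions.add(task);
            tasksAndActionsCondition.signalAll(); // wake the updater thread
        } finally {
            tasksAndActionsLock.unlock();
        }
    }

    // Blocks only while allWorkDone(); a signal racing with the check cannot
    // be lost because the predicate is re-evaluated under the lock on every
    // wakeup (the standard guard against spurious and missed wakeups).
    String takeOrNull() {
        tasksAndActionsLock.lock();
        try {
            while (allWorkDone()) {
                try {
                    tasksAndActionsCondition.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
            String next = tasksAndActions.poll();
            if (next != null) {
                updatingTasks++; // task is now "in progress", so not all work is done
            }
            return next;
        } finally {
            tasksAndActionsLock.unlock();
        }
    }

    private boolean allWorkDone() {
        // Both pending adds AND in-progress tasks count as work; the flaky
        // scenario above is exactly a predicate that ignored one of the two.
        return tasksAndActions.isEmpty() && updatingTasks == 0;
    }
}
```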
[jira] [Reopened] (KAFKA-17697) Fix flaky DefaultStateUpdaterTest.shouldRestoreSingleActiveStatefulTask
[ https://issues.apache.org/jira/browse/KAFKA-17697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax reopened KAFKA-17697:
-------------------------------------

> Fix flaky DefaultStateUpdaterTest.shouldRestoreSingleActiveStatefulTask
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-17697
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17697
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams, unit tests
>            Reporter: Yu-Lin Chen
>            Assignee: Yu-Lin Chen
>            Priority: Major
>
>         Attachments: 0001-reproduce-the-racing-issue.patch
>
> 28 flaky runs out of 253 trunk builds in the past 28 days (github). ([Report
> Link|https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.startTimeMax=1727973081200&search.startTimeMin=172555200&search.tags=trunk,github&search.timeZoneId=Asia%2FTaipei&tests.container=org.apache.kafka.streams.processor.internals.DefaultStateUpdaterTest&tests.test=shouldRestoreSingleActiveStatefulTask()])
> The issue can be reproduced in my local env within 20 loops. Can also be
> reproduced by the attached patch: [^0001-reproduce-the-racing-issue.patch]
> ([Oct 2 2024 at 05:39:43
> CST|https://ge.apache.org/s/5gsvq5esvbouc/tests/task/:streams:test/details/org.apache.kafka.streams.processor.internals.DefaultStateUpdaterTest/shouldRestoreSingleActiveStatefulTask()?expanded-stacktrace=WyIwIl0&top-execution=1])
> {code:java}
> org.opentest4j.AssertionFailedError: Condition not met within timeout 3.
> Did not get all restored active task within the given timeout!
> ==> expected: <true> but was: <false>
>     at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>     at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>     at org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
>     at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
>     at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214)
>     at org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:396)
>     at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)
>     at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:393)
>     at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:377)
>     at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:367)
>     at org.apache.kafka.streams.processor.internals.DefaultStateUpdaterTest.verifyRestoredActiveTasks(DefaultStateUpdaterTest.java:1715)
>     at org.apache.kafka.streams.processor.internals.DefaultStateUpdaterTest.shouldRestoreSingleActiveStatefulTask(DefaultStateUpdaterTest.java:340)
>     at java.lang.reflect.Method.invoke(Method.java:566)
>     at java.util.ArrayList.forEach(ArrayList.java:1541)
>     at java.util.ArrayList.forEach(ArrayList.java:1541)
> {code}
> *Root Cause*: Race between the two threads below:
> 1. stateUpdater.add(task) in the test thread [1]
> 2. runOnce() in the StateUpdaterThread loop [2]
> In the scenario below, the StateUpdaterThread hangs even though there are
> updatingTasks.
> *Flaky scenario*: If stateUpdater.add(task) runs after the first runOnce()
> loop, the second loop will hang. [3][4] allWorkDone() in the second
> runOnce() loop will be true [5], even though updatingTasks.isEmpty() is
> false. [6]
> Below is the flow of the flaky scenario:
> # runOnce() loop 1: completedChangelogs() returns emptySet
> # runOnce() loop 1: allChangelogsCompleted() returns false, updatingTasks is
> empty, allWorkDone() is false. {color:#de350b}Called
> tasksAndActionsCondition.await(){color}. (It will be notified by
> stateUpdater.add(task) [1][7])
> # test thread calls stateUpdater.add(task)
> # runOnce() loop 1: allChangelogsCompleted() returns false again before
> quitting the while loop. allWorkDone() is false because tasksAndActions is
> not empty. [8]
> # runOnce() loop 2: completedChangelogs() returns 1 topic partition
> # runOnce() loop 2: allChangelogsCompleted() returns true, allWorkDone() is
> true, {color:#de350b}call tasksAndActionsCondition.await() again{color} and
> it is never notified.
>
> The happy path is: (stateUpdater.add(task) ran before the end of the first
> runOnce() loop)
> # runOnce() loop 1: completedChangelogs() returns emptySet
> # runOnce() loop 1: allChangelogsCompleted() returns false, updatingTasks is
> not empty, allWorkDone() is false
> # runOnce() loop 2: completedChangelogs() returns 1 topic partition
> # runOnce() loop 2: allChangelogsCompleted() returns false, updatingTasks is
> not empty, allWorkDone() is false
> # runOnce() loop 3: completedChangelogs() returns 2 topic partitions,
> {color:#4c9aff}+move the task to restoredActiveTasks+{color} [9]
> # runOnce() loop 3: allChangelogsCo
[jira] [Resolved] (KAFKA-17408) Fix flaky testShouldCountClicksPerRegionWithNamedRepartitionTopic
[ https://issues.apache.org/jira/browse/KAFKA-17408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax resolved KAFKA-17408.
-------------------------------------
    Resolution: Won't Fix

Seems the root cause of the issue is some ZK connectivity issue. Given that we
are actively working on ZK removal, I am closing this ticket.

> Fix flaky testShouldCountClicksPerRegionWithNamedRepartitionTopic
> -----------------------------------------------------------------
>
>                 Key: KAFKA-17408
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17408
>             Project: Kafka
>          Issue Type: Test
>          Components: streams, unit tests
>            Reporter: Chia-Ping Tsai
>            Assignee: 黃竣陽
>            Priority: Major
>              Labels: flaky-test
>
> {code:java}
> org.opentest4j.AssertionFailedError: Condition not met within timeout 6.
> Did not receive all [KeyValue(americas, 101), KeyValue(europe, 109),
> KeyValue(asia, 124)] records from topic output-topic (got [])
> ==> expected: <true> but was: <false>
>     at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>     at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>     at org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
>     at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
>     at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214)
>     at org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:397)
>     at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:445)
>     at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:394)
>     at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:378)
>     at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:368)
>     at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilFinalKeyValueRecordsReceived(IntegrationTestUtils.java:873)
>     at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilFinalKeyValueRecordsReceived(IntegrationTestUtils.java:822)
>     at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilFinalKeyValueRecordsReceived(IntegrationTestUtils.java:785)
>     at org.apache.kafka.streams.scala.utils.StreamToTableJoinScalaIntegrationTestBase.produceNConsume(StreamToTableJoinScalaIntegrationTestBase.scala:139)
>     at org.apache.kafka.streams.scala.StreamToTableJoinScalaIntegrationTestImplicitSerdes.testShouldCountClicksPerRegionWithNamedRepartitionTopic(StreamToTableJoinScalaIntegrationTestImplicitSerdes.scala:118)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at java.util.ArrayList.forEach(ArrayList.java:1259)
>     at java.util.ArrayList.forEach(ArrayList.java:1259)
> {code}
[jira] [Resolved] (KAFKA-6197) Difficult to get to the Kafka Streams javadocs
[ https://issues.apache.org/jira/browse/KAFKA-6197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax resolved KAFKA-6197.
------------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

> Difficult to get to the Kafka Streams javadocs
> ----------------------------------------------
>
>                 Key: KAFKA-6197
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6197
>             Project: Kafka
>          Issue Type: Improvement
>          Components: documentation, streams
>    Affects Versions: 0.11.0.2, 1.0.0
>            Reporter: James Cheng
>            Priority: Major
>              Labels: beginner, newbie
>             Fix For: 4.0.0
>
> In order to get to the javadocs for the Kafka producer/consumer/streams, I
> typically go to http://kafka.apache.org/documentation/ and click on either
> 2.1, 2.2, or 2.3 in the table of contents to go right to the appropriate
> section.
> The link for "Streams API" now goes to the (very nice)
> http://kafka.apache.org/10/documentation/streams/. That page doesn't have a
> direct link to the Javadocs anywhere. The examples and guides actually
> frequently mention "See javadocs for details" but there are no direct links
> to it.
> If I instead go back to the main page and scroll directly to section 2.3,
> there is still the link to get to the javadocs. But it's harder to jump
> immediately to it. And it's a little confusing that section 2.3 in the table
> of contents does not link you to section 2.3 of the page.
> It would be nice if the link to the Streams javadocs was easier to get to.
[jira] [Created] (KAFKA-17649) Remove non-StateUpdater Code
Matthias J. Sax created KAFKA-17649: --- Summary: Remove non-StateUpdater Code Key: KAFKA-17649 URL: https://issues.apache.org/jira/browse/KAFKA-17649 Project: Kafka Issue Type: Task Components: streams Reporter: Matthias J. Sax With the 3.8 release, we enabled the new state-updater thread by default. We should remove the old non-state-updater code (and the internal feature flag) at some point to reduce tech debt and simplify the code base. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17558) Cleanup Kafka Streams integration tests
Matthias J. Sax created KAFKA-17558: --- Summary: Cleanup Kafka Streams integration tests Key: KAFKA-17558 URL: https://issues.apache.org/jira/browse/KAFKA-17558 Project: Kafka Issue Type: Test Components: streams, unit tests Reporter: Matthias J. Sax EosIntegrationTest is parameterized and also runs with the "at-least-once" config. Similarly, other tests run with "exactly-once", which seems unnecessary. Cf [https://github.com/apache/kafka/pull/17110] We should clean up these tests to cut down on test runtime. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17551) Remove ForeachProcessor
Matthias J. Sax created KAFKA-17551: --- Summary: Remove ForeachProcessor Key: KAFKA-17551 URL: https://issues.apache.org/jira/browse/KAFKA-17551 Project: Kafka Issue Type: Sub-task Components: streams Affects Versions: 5.0.0 Reporter: Matthias J. Sax Fix For: 5.0.0 Cf [https://cwiki.apache.org/confluence/display/KAFKA/KIP-1077%3A+Deprecate+%60ForeachProcessor%60+and+move+to+internal+package] We can remove the public `ForeachProcessor` class. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-17427) Fix leaking *_DOC variables in StreamsConfig
[ https://issues.apache.org/jira/browse/KAFKA-17427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-17427. - Fix Version/s: 4.0.0 Resolution: Fixed > Fix leaking *_DOC variables in StreamsConfig > > > Key: KAFKA-17427 > URL: https://issues.apache.org/jira/browse/KAFKA-17427 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: 黃竣陽 >Priority: Major > Labels: kip > Fix For: 4.0.0 > > > StreamsConfig has two variables per config: one for the config name and one > for the doc description. The doc-description variable should be private, but > in some cases it is public. > We should deprecate the public ones so we can make them private in a future > release. > KIP-1085: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-1085%3A+Fix+leaking+*_DOC+variables+in+StreamsConfig?src=jira] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17538) Make deprecated *_DOC variables private/package-private
Matthias J. Sax created KAFKA-17538: --- Summary: Make deprecated *_DOC variables private/package-private Key: KAFKA-17538 URL: https://issues.apache.org/jira/browse/KAFKA-17538 Project: Kafka Issue Type: Sub-task Components: streams Reporter: Matthias J. Sax Cf [https://cwiki.apache.org/confluence/display/KAFKA/KIP-1085%3A+Fix+leaking+*_DOC+variables+in+StreamsConfig] We can make the deprecated variables `private` (or package-private). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17537) Remove deprecated methods from Joined class
Matthias J. Sax created KAFKA-17537: --- Summary: Remove deprecated methods from Joined class Key: KAFKA-17537 URL: https://issues.apache.org/jira/browse/KAFKA-17537 Project: Kafka Issue Type: Sub-task Components: streams Reporter: Matthias J. Sax Cf [https://cwiki.apache.org/confluence/display/KAFKA/KIP-1078%3A+Remove+Leaking+Getter+Methods+in+Joined+Helper+Class] We can remove: * {{Joined#gracePeriod}} * {{Joined#keySerde}} * {{Joined#valueSerde}} * {{Joined#otherValueSerde}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
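A minimal sketch of how `Joined` is still configured after these getters are gone, assuming the standard factory/wither API (`Joined.with`, `withGracePeriod`): serdes and grace period are write-only configuration, and internal code reads them back via `JoinedInternal` rather than public getters. The store name and durations are illustrative.

```java
import java.time.Duration;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.kstream.Joined;

public class JoinedConfigSketch {
    public static void main(final String[] args) {
        // Configure the join via the factory and withers; there is no need to
        // read the values back through the (removed) public getters.
        final Joined<String, Long, Double> joined =
            Joined.with(Serdes.String(), Serdes.Long(), Serdes.Double(), "my-join")
                  .withGracePeriod(Duration.ofMinutes(5));
        System.out.println(joined != null);
    }
}
```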
[jira] [Resolved] (KAFKA-17253) Remove leaking getter methods in Joined helper class
[ https://issues.apache.org/jira/browse/KAFKA-17253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-17253. - Fix Version/s: 4.0.0 Resolution: Fixed > Remove leaking getter methods in Joined helper class > > > Key: KAFKA-17253 > URL: https://issues.apache.org/jira/browse/KAFKA-17253 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: TengYao Chi >Priority: Major > Labels: kip > Fix For: 4.0.0 > > > The helper class `Joined` has four getter methods `gracePeriod()`, > `keySerde()`, `valueSerde()` and `otherValueSerde()` which don't belong in > this class. > We should deprecate them for future removal – all four are already available > via JoinedInternal where they belong. > KIP-1078: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-1078%3A+Remove+Leaking+Getter+Methods+in+Joined+Helper+Class] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16332) Remove Deprecated builder methods for Time/Session/Join/SlidingWindows
[ https://issues.apache.org/jira/browse/KAFKA-16332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16332. - Resolution: Fixed > Remove Deprecated builder methods for Time/Session/Join/SlidingWindows > -- > > Key: KAFKA-16332 > URL: https://issues.apache.org/jira/browse/KAFKA-16332 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Matthias J. Sax >Assignee: Kuan Po Tseng >Priority: Blocker > Fix For: 4.0.0 > > > Deprecated in 3.0: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-633%3A+Deprecate+24-hour+Default+Grace+Period+for+Windowed+Operations+in+Streams] > > * TimeWindows#of > * TimeWindows#grace > * SessionWindows#with > * SessionWindows#grace > * -JoinWindows#of- (cf https://issues.apache.org/jira/browse/KAFKA-17531) > * -JoinWindows#grace- (cf https://issues.apache.org/jira/browse/KAFKA-17531) > * SlidingWindows#withTimeDifferenceAndGrace > We might want to hold off on cleaning up JoinWindows due to > https://issues.apache.org/jira/browse/KAFKA-13813 (open for discussion) -- This message was sent by Atlassian Jira (v8.20.10#820010)
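A hedged migration sketch for the removals listed above, using the KIP-633 replacement names introduced in 3.0 (`ofSizeAndGrace` etc.); the durations are illustrative, not prescriptive.

```java
import java.time.Duration;

import org.apache.kafka.streams.kstream.SessionWindows;
import org.apache.kafka.streams.kstream.SlidingWindows;
import org.apache.kafka.streams.kstream.TimeWindows;

public class WindowMigrationSketch {
    public static void main(final String[] args) {
        // before: TimeWindows.of(Duration.ofMinutes(5)).grace(Duration.ofMinutes(1))
        final TimeWindows timeWindows =
            TimeWindows.ofSizeAndGrace(Duration.ofMinutes(5), Duration.ofMinutes(1));

        // before: SessionWindows.with(Duration.ofMinutes(10))
        final SessionWindows sessionWindows =
            SessionWindows.ofInactivityGapWithNoGrace(Duration.ofMinutes(10));

        // before: SlidingWindows.withTimeDifferenceAndGrace(...)
        final SlidingWindows slidingWindows =
            SlidingWindows.ofTimeDifferenceAndGrace(Duration.ofSeconds(30), Duration.ofSeconds(5));

        System.out.println(timeWindows.size()); // window size in milliseconds
    }
}
```

The key change is that grace-period handling is now explicit at construction time (`...AndGrace` vs `...WithNoGrace`) instead of being a 24-hour default.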
[jira] [Created] (KAFKA-17531) Remove deprecated methods JoinWindows#of and #grace
Matthias J. Sax created KAFKA-17531: --- Summary: Remove deprecated methods JoinWindows#of and #grace Key: KAFKA-17531 URL: https://issues.apache.org/jira/browse/KAFKA-17531 Project: Kafka Issue Type: Sub-task Components: streams Affects Versions: 5.0.0 Reporter: Matthias J. Sax Deprecated in 3.0: [https://cwiki.apache.org/confluence/display/KAFKA/KIP-633%3A+Deprecate+24-hour+Default+Grace+Period+for+Windowed+Operations+in+Streams] * JoinWindows#of * JoinWindows#grace While we already removed the other methods deprecated via KIP-633 in 4.0 via KAFKA-16332, we could not remove these two because of https://issues.apache.org/jira/browse/KAFKA-13813. We can only pick up this ticket for 5.0 if KAFKA-13813 is resolved. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17527) Kafka Streams fails with NPE for missing RecordContext
Matthias J. Sax created KAFKA-17527: --- Summary: Kafka Streams fails with NPE for missing RecordContext Key: KAFKA-17527 URL: https://issues.apache.org/jira/browse/KAFKA-17527 Project: Kafka Issue Type: Bug Components: streams Affects Versions: 3.9.0 Reporter: Matthias J. Sax A crash of Kafka Streams was observed with the following stack trace: {code:java} [2024-09-10 10:59:12,301] ERROR [kafka-producer-network-thread | i-0197827b22f4d4e4c-StreamThread-1-producer] Error executing user-provided callback on message for topic-partition 'stream-soak-test-KSTREAM-AGGREGATE-STATE-STORE-47-changelog-1' (org.apache.kafka.clients.producer.internals.ProducerBatch) java.lang.NullPointerException: Cannot invoke "org.apache.kafka.streams.processor.internals.InternalProcessorContext.recordContext()" because "context" is null at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:405) at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.lambda$send$1(RecordCollectorImpl.java:285) at org.apache.kafka.clients.producer.KafkaProducer$AppendCallbacks.onCompletion(KafkaProducer.java:1574) at org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:312) at org.apache.kafka.clients.producer.internals.ProducerBatch.abort(ProducerBatch.java:200) at org.apache.kafka.clients.producer.internals.RecordAccumulator.abortUndrainedBatches(RecordAccumulator.java:1166) at org.apache.kafka.clients.producer.internals.Sender.maybeSendAndPollTransactionalRequest(Sender.java:474) at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:337) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:251) at java.base/java.lang.Thread.run(Thread.java:840) {code} It seems to be a bug introduced via KIP-1033, coming from the changelogging layer, which passes a `null` context into `RecordCollector.send(...)`. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17488) Cleanup (test) code for Kafka Streams "metric version"
Matthias J. Sax created KAFKA-17488: --- Summary: Cleanup (test) code for Kafka Streams "metric version" Key: KAFKA-17488 URL: https://issues.apache.org/jira/browse/KAFKA-17488 Project: Kafka Issue Type: Task Components: streams, unit tests Reporter: Matthias J. Sax In Kafka Streams, we moved from old metrics to new metrics and added a config for users to explicitly opt into the new metrics. While we removed the old metrics from the code base completely, we kept the config (which is desired: if we make a similar change in the future, the config will already be there). However, there is still code, especially test code, that uses the config for no good reason. It might be worth cleaning this up a little. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17485) Replace KafkaClientSupplier with KafkaClientInterceptor
Matthias J. Sax created KAFKA-17485: --- Summary: Replace KafkaClientSupplier with KafkaClientInterceptor Key: KAFKA-17485 URL: https://issues.apache.org/jira/browse/KAFKA-17485 Project: Kafka Issue Type: New Feature Components: streams Reporter: Matthias J. Sax Kafka Streams currently supports the `KafkaClientSupplier` interface, which allows users to create client instances. However, this interface is mostly used to wrap the standard Kafka clients. To set up Kafka Streams for future modifications, including KIP-1071, the Kafka Streams runtime itself needs to create client instances, and thus the `KafkaClientSupplier` interface should be deprecated and eventually removed. To still allow users to wrap clients and intercept calls, we propose to add a new `KafkaClientInterceptor` interface. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16502) Fix flaky EOSUncleanShutdownIntegrationTest#shouldWorkWithUncleanShutdownWipeOutStateStore
[ https://issues.apache.org/jira/browse/KAFKA-16502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16502. - Resolution: Fixed > Fix flaky > EOSUncleanShutdownIntegrationTest#shouldWorkWithUncleanShutdownWipeOutStateStore > -- > > Key: KAFKA-16502 > URL: https://issues.apache.org/jira/browse/KAFKA-16502 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Chia-Ping Tsai >Priority: Major > Labels: flaky-test > > org.opentest4j.AssertionFailedError: Condition not met within timeout 15000. > Expected ERROR state but driver is on RUNNING ==> expected: but was: > > at > app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151) > at > app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132) > at app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63) > at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36) > at app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214) > at > app//org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:396) > at > app//org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444) > at > app//org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:393) > at > app//org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:377) > at > app//org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:350) > at > app//org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore(EOSUncleanShutdownIntegrationTest.java:169) > at > java.base@11.0.16.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > at > java.base@11.0.16.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > 
java.base@11.0.16.1/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base@11.0.16.1/java.lang.reflect.Method.invoke(Method.java:566) > at > app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at > java.base@11.0.16.1/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base@11.0.16.1/java.lang.Thread.run(Thread.java:829) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (KAFKA-16502) Fix flaky EOSUncleanShutdownIntegrationTest#shouldWorkWithUncleanShutdownWipeOutStateStore
[ https://issues.apache.org/jira/browse/KAFKA-16502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax reopened KAFKA-16502: - > Fix flaky > EOSUncleanShutdownIntegrationTest#shouldWorkWithUncleanShutdownWipeOutStateStore > -- > > Key: KAFKA-16502 > URL: https://issues.apache.org/jira/browse/KAFKA-16502 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Chia-Ping Tsai >Priority: Major > Labels: flaky-test > > org.opentest4j.AssertionFailedError: Condition not met within timeout 15000. > Expected ERROR state but driver is on RUNNING ==> expected: but was: > > at > app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151) > at > app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132) > at app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63) > at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36) > at app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214) > at > app//org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:396) > at > app//org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444) > at > app//org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:393) > at > app//org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:377) > at > app//org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:350) > at > app//org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore(EOSUncleanShutdownIntegrationTest.java:169) > at > java.base@11.0.16.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > at > java.base@11.0.16.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base@11.0.16.1/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at 
java.base@11.0.16.1/java.lang.reflect.Method.invoke(Method.java:566) > at > app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at > java.base@11.0.16.1/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base@11.0.16.1/java.lang.Thread.run(Thread.java:829) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-17467) Flaky test shouldStayDeadAfterTwoCloses org.apache.kafka.streams.processor.internals.GlobalStreamThreadTest
[ https://issues.apache.org/jira/browse/KAFKA-17467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-17467. - Resolution: Duplicate > Flaky test shouldStayDeadAfterTwoCloses > org.apache.kafka.streams.processor.internals.GlobalStreamThreadTest > --- > > Key: KAFKA-17467 > URL: https://issues.apache.org/jira/browse/KAFKA-17467 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Igor Soarez >Priority: Minor > > First spotted on the Java 11 tests for > https://github.com/apache/kafka/pull/17004 > > {code:java} > org.opentest4j.AssertionFailedError: expected: but was: > > at > app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151) > at > app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132) > at > app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197) > at > app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182) > at > app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177) > at > app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1145) > at > app//org.apache.kafka.streams.processor.internals.GlobalStreamThreadTest.shouldStayDeadAfterTwoCloses(GlobalStreamThreadTest.java:234) >{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17474) Flaky tests in GlobalStreamThreadTest
Matthias J. Sax created KAFKA-17474: --- Summary: Flaky tests in GlobalStreamThreadTest Key: KAFKA-17474 URL: https://issues.apache.org/jira/browse/KAFKA-17474 Project: Kafka Issue Type: Test Components: streams, unit tests Reporter: Matthias J. Sax Around Aug/24, multiple tests of GlobalStreamThreadTest started to fail with high failure rate. * [shouldStayDeadAfterTwoCloses()|https://ge.apache.org/scans/tests?search.names=Git%20branch&search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.timeZoneId=Europe%2FVienna&search.values=trunk&tests.container=org.apache.kafka.streams.processor.internals.GlobalStreamThreadTest&tests.sortField=FLAKY&tests.test=shouldStayDeadAfterTwoCloses()] * [shouldCloseStateStoresOnClose()|https://ge.apache.org/scans/tests?search.names=Git%20branch&search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.timeZoneId=Europe%2FVienna&search.values=trunk&tests.container=org.apache.kafka.streams.processor.internals.GlobalStreamThreadTest&tests.sortField=FLAKY&tests.test=shouldCloseStateStoresOnClose()] * [shouldStopRunningWhenClosedByUser()|https://ge.apache.org/scans/tests?search.names=Git%20branch&search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.timeZoneId=Europe%2FVienna&search.values=trunk&tests.container=org.apache.kafka.streams.processor.internals.GlobalStreamThreadTest&tests.sortField=FLAKY&tests.test=shouldStopRunningWhenClosedByUser()] * [shouldBeRunningAfterSuccessfulStart()|https://ge.apache.org/scans/tests?search.names=Git%20branch&search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.timeZoneId=Europe%2FVienna&search.values=trunk&tests.container=org.apache.kafka.streams.processor.internals.GlobalStreamThreadTest&tests.sortField=FLAKY&tests.test=shouldBeRunningAfterSuccessfulStart()] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-12826) Remove Deprecated Class Serdes (Streams)
[ https://issues.apache.org/jira/browse/KAFKA-12826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-12826. - Resolution: Fixed > Remove Deprecated Class Serdes (Streams) > > > Key: KAFKA-12826 > URL: https://issues.apache.org/jira/browse/KAFKA-12826 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Josep Prat >Assignee: Arnav Dadarya >Priority: Blocker > Fix For: 4.0.0 > > > Class org.apache.kafka.streams.scala.Serdes was deprecated in version 2.7 > See KAFKA-10020 and KIP-616 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16335) Remove Deprecated method on StreamPartitioner
[ https://issues.apache.org/jira/browse/KAFKA-16335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16335. - Resolution: Fixed > Remove Deprecated method on StreamPartitioner > - > > Key: KAFKA-16335 > URL: https://issues.apache.org/jira/browse/KAFKA-16335 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Matthias J. Sax >Assignee: Caio César >Priority: Blocker > Fix For: 4.0.0 > > > Deprecated in 3.4 release via > [https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211883356] > * StreamPartitioner#partition (singular) -- This message was sent by Atlassian Jira (v8.20.10#820010)
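A hedged sketch of a custom partitioner against the remaining API, assuming the multi-partition variant `partitions()` that replaced the removed singular `partition()`; the even/odd routing logic is purely illustrative.

```java
import java.util.Optional;
import java.util.Set;

import org.apache.kafka.streams.processor.StreamPartitioner;

public class EvenOddPartitioner implements StreamPartitioner<String, Long> {
    @Override
    public Optional<Set<Integer>> partitions(final String topic,
                                             final String key,
                                             final Long value,
                                             final int numPartitions) {
        // Route even values to partition 0 and odd values to partition 1.
        // Returning Optional.empty() falls back to default partitioning; a set
        // with multiple entries broadcasts the record to all of them.
        return Optional.of(Set.of(value % 2 == 0 ? 0 : 1));
    }
}
```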
[jira] [Created] (KAFKA-17427) Fix leaking *_DOC variables in StreamsConfig
Matthias J. Sax created KAFKA-17427: --- Summary: Fix leaking *_DOC variables in StreamsConfig Key: KAFKA-17427 URL: https://issues.apache.org/jira/browse/KAFKA-17427 Project: Kafka Issue Type: Improvement Components: streams Reporter: Matthias J. Sax StreamsConfig has two variables per config: one for the config name and one for the doc description. The doc-description variable should be private, but in some cases it is public. We should deprecate the public ones so we can make them private in a future release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-17216) StreamsConfig STATE_DIR_CONFIG
[ https://issues.apache.org/jira/browse/KAFKA-17216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-17216. - Resolution: Invalid Cf the reply on GitHub – caused by a version mismatch. > StreamsConfig STATE_DIR_CONFIG > -- > > Key: KAFKA-17216 > URL: https://issues.apache.org/jira/browse/KAFKA-17216 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 3.8.0 >Reporter: raphaelauv >Priority: Major > > I can't use the class StreamsConfig > it fails with Caused by: java.lang.ExceptionInInitializerError at > StreamsConfig.java:866 > The problem is not present in 3.7.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16329) Remove Deprecated Task/ThreadMetadata classes and related methods
[ https://issues.apache.org/jira/browse/KAFKA-16329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16329. - Resolution: Fixed > Remove Deprecated Task/ThreadMetadata classes and related methods > - > > Key: KAFKA-16329 > URL: https://issues.apache.org/jira/browse/KAFKA-16329 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Matthias J. Sax >Assignee: Muralidhar Basani >Priority: Blocker > Fix For: 4.0.0 > > > Deprecated in AK 3.0 via > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-744%3A+Migrate+TaskMetadata+and+ThreadMetadata+to+an+interface+with+internal+implementation] > > * org.apache.kafka.streams.processor.TaskMetadata > * org.apache.kafka.streams.processor.ThreadMetadata > * org.apache.kafka.streams.KafkaStreams#localThreadMetadata > * org.apache.kafka.streams.state.StreamsMetadata > * org.apache.kafka.streams.KafkaStreams#allMetadata > * org.apache.kafka.streams.KafkaStreams#allMetadataForStore > This is related to https://issues.apache.org/jira/browse/KAFKA-16330 and both > tickets should be worked on together. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16334) Remove Deprecated command line option from reset tool
[ https://issues.apache.org/jira/browse/KAFKA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16334. - Resolution: Fixed > Remove Deprecated command line option from reset tool > - > > Key: KAFKA-16334 > URL: https://issues.apache.org/jira/browse/KAFKA-16334 > Project: Kafka > Issue Type: Sub-task > Components: streams, tools >Reporter: Matthias J. Sax >Assignee: Caio César >Priority: Blocker > Fix For: 4.0.0 > > > --bootstrap-servers (plural) was deprecated in 3.4 release via > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-865%3A+Support+--bootstrap-server+in+kafka-streams-application-reset] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-17371) Flaky test in DefaultTaskExecutorTest.shouldUnassignTaskWhenRequired
[ https://issues.apache.org/jira/browse/KAFKA-17371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-17371. - Fix Version/s: 4.0.0 Resolution: Fixed > Flaky test in DefaultTaskExecutorTest.shouldUnassignTaskWhenRequired > > > Key: KAFKA-17371 > URL: https://issues.apache.org/jira/browse/KAFKA-17371 > Project: Kafka > Issue Type: Bug >Reporter: Ao Li >Assignee: TengYao Chi >Priority: Minor > Fix For: 4.0.0 > > > Please see this fork https://github.com/aoli-al/kafka/tree/KAFKA-251 for a > deterministic reproduction. > The test failed with > {code} > expected: not > org.opentest4j.AssertionFailedError: expected: not > at > org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:152) > at > org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132) > at org.junit.jupiter.api.AssertNotNull.failNull(AssertNotNull.java:49) > at > org.junit.jupiter.api.AssertNotNull.assertNotNull(AssertNotNull.java:35) > at > org.junit.jupiter.api.AssertNotNull.assertNotNull(AssertNotNull.java:30) > at org.junit.jupiter.api.Assertions.assertNotNull(Assertions.java:304) > at > org.apache.kafka.streams.processor.internals.tasks.DefaultTaskExecutorTest.shouldUnassignTaskWhenRequired(DefaultTaskExecutorTest.java:233) > at java.base/java.lang.reflect.Method.invoke(Method.java:580) > at java.base/java.util.ArrayList.forEach(ArrayList.java:1596) > at java.base/java.util.ArrayList.forEach(ArrayList.java:1596) > {code} > The root cause of the failure is that `currentTask = > taskManager.assignNextTask(DefaultTaskExecutor.this);` is not an atomic > operation. This means that calling `taskManager.assignNextTask` will unblock > the `verify(taskManager, > timeout(VERIFICATION_TIMEOUT)).assignNextTask(taskExecutor);` statement in > the test method. > If `assertNotNull(taskExecutor.currentTask());` is executed before the > assignment `currentTask = [...]` the test will fail. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-12833) Remove Deprecated methods under TopologyTestDriver
[ https://issues.apache.org/jira/browse/KAFKA-12833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-12833. - Resolution: Duplicate > Remove Deprecated methods under TopologyTestDriver > -- > > Key: KAFKA-12833 > URL: https://issues.apache.org/jira/browse/KAFKA-12833 > Project: Kafka > Issue Type: Sub-task > Components: streams-test-utils >Reporter: Josep Prat >Assignee: Arnav Dadarya >Priority: Blocker > Fix For: 4.0.0 > > > The following methods were at least deprecated in 2.8 > * > org.apache.kafka.streams.TopologyTestDriver.KeyValueStoreFacade#init(org.apache.kafka.streams.processor.ProcessorContext, > org.apache.kafka.streams.processor.StateStore) > * > org.apache.kafka.streams.TopologyTestDriver.WindowStoreFacade#init(org.apache.kafka.streams.processor.ProcessorContext, > org.apache.kafka.streams.processor.StateStore) > > *Disclaimer,* these methods might have been deprecated for a longer time, but > they were definitely moved to this new "hierarchy position" in version 2.8 > > Move from standalone class to inner class was done under KAFKA-12435 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16333) Remove Deprecated methods KTable#join
[ https://issues.apache.org/jira/browse/KAFKA-16333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16333. - Resolution: Fixed > Remove Deprecated methods KTable#join > -- > > Key: KAFKA-16333 > URL: https://issues.apache.org/jira/browse/KAFKA-16333 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Matthias J. Sax >Assignee: Muralidhar Basani >Priority: Blocker > Fix For: 4.0.0 > > > KTable#join() methods taking a `Named` parameter got deprecated in the 3.1 > release via https://issues.apache.org/jira/browse/KAFKA-13813 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17281) Remove old PAPI methods
Matthias J. Sax created KAFKA-17281: --- Summary: Remove old PAPI methods Key: KAFKA-17281 URL: https://issues.apache.org/jira/browse/KAFKA-17281 Project: Kafka Issue Type: Sub-task Components: streams Reporter: Matthias J. Sax In 4.0, we deprecated more methods of the old PAPI via KIP-1070, which can be removed in 5.0 Cf [https://cwiki.apache.org/confluence/display/KAFKA/KIP-1070%3A+deprecate+MockProcessorContext] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17280) Remove deprecated methods from DeserializationExceptionHandler and ProductionExceptionHandler
Matthias J. Sax created KAFKA-17280: --- Summary: Remove deprecated methods from DeserializationExceptionHandler and ProductionExceptionHandler Key: KAFKA-17280 URL: https://issues.apache.org/jira/browse/KAFKA-17280 Project: Kafka Issue Type: Sub-task Components: streams Reporter: Matthias J. Sax The existing handlers got changed to take a new `ErrorHandlerContext` input parameter, and the old methods got deprecated. Cf [https://cwiki.apache.org/confluence/display/KAFKA/KIP-1033%3A+Add+Kafka+Streams+exception+handler+for+exceptions+occurring+during+processing] -- This message was sent by Atlassian Jira (v8.20.10#820010)
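A hedged sketch of what a handler looks like once only the new method remains; the signature follows the KIP-1033 description (an `ErrorHandlerContext` as the first parameter) and may differ from the released API in detail.

```java
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.streams.errors.DeserializationExceptionHandler;
import org.apache.kafka.streams.errors.ErrorHandlerContext;

public class LogAndContinueSketch implements DeserializationExceptionHandler {
    @Override
    public DeserializationHandlerResponse handle(final ErrorHandlerContext context,
                                                 final ConsumerRecord<byte[], byte[]> record,
                                                 final Exception exception) {
        // Metadata that previously came from ProcessorContext is now exposed
        // via ErrorHandlerContext (topic, partition, offset, taskId, ...).
        System.err.println("Skipping corrupt record at offset " + context.offset());
        return DeserializationHandlerResponse.CONTINUE;
    }

    @Override
    public void configure(final Map<String, ?> configs) { }
}
```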
[jira] [Resolved] (KAFKA-12828) Remove Deprecated methods under KeyQueryMetadata
[ https://issues.apache.org/jira/browse/KAFKA-12828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-12828. - Resolution: Fixed > Remove Deprecated methods under KeyQueryMetadata > > > Key: KAFKA-12828 > URL: https://issues.apache.org/jira/browse/KAFKA-12828 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Josep Prat >Assignee: Ksolves India Limited >Priority: Blocker > Fix For: 4.0.0 > > > The following methods under KeyQueryMetadata were deprecated in version 2.7 > * org.apache.kafka.streams.KeyQueryMetadata#getActiveHost > * org.apache.kafka.streams.KeyQueryMetadata#getStandbyHosts > * org.apache.kafka.streams.KeyQueryMetadata#getPartition > See KAFKA-10316 and KIP-648 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-15896) Flaky test: shouldQuerySpecificStalePartitionStores() – org.apache.kafka.streams.integration.StoreQueryIntegrationTest
[ https://issues.apache.org/jira/browse/KAFKA-15896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-15896. - Resolution: Cannot Reproduce > Flaky test: shouldQuerySpecificStalePartitionStores() – > org.apache.kafka.streams.integration.StoreQueryIntegrationTest > -- > > Key: KAFKA-15896 > URL: https://issues.apache.org/jira/browse/KAFKA-15896 > Project: Kafka > Issue Type: Bug > Components: streams, unit tests >Reporter: Apoorv Mittal >Priority: Major > Labels: flaky-test > > Flaky test: > https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-14699/21/tests/ > > > {code:java} > Error > org.apache.kafka.streams.errors.InvalidStateStorePartitionException: The > specified partition 1 for store source-table does not > exist.Stacktraceorg.apache.kafka.streams.errors.InvalidStateStorePartitionException: > The specified partition 1 for store source-table does not exist. at > app//org.apache.kafka.streams.state.internals.WrappingStoreProvider.stores(WrappingStoreProvider.java:63) > at > app//org.apache.kafka.streams.state.internals.CompositeReadOnlyKeyValueStore.get(CompositeReadOnlyKeyValueStore.java:53) > at > app//org.apache.kafka.streams.integration.StoreQueryIntegrationTest.shouldQuerySpecificStalePartitionStores(StoreQueryIntegrationTest.java:347) > at > java.base@21.0.1/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) > at java.base@21.0.1/java.lang.reflect.Method.invoke(Method.java:580)at > app//org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:728) > at > app//org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60) >at > app//org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131) > at > app//org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45) > at > 
app//org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156) > at > app//org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147) > at > app//org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86) >at > app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103) > at > app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93) > at > app//org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106) > at > app//org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64) >at > app//org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45) > at > app//org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37) > at > app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92) > {code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17253) Remove leaking getter methods in Joined helper class
Matthias J. Sax created KAFKA-17253: --- Summary: Remove leaking getter methods in Joined helper class Key: KAFKA-17253 URL: https://issues.apache.org/jira/browse/KAFKA-17253 Project: Kafka Issue Type: Improvement Components: streams Reporter: Matthias J. Sax The helper class `Joined` has three getter methods `keySerde()`, `valueSerde()` and `otherValueSerde()` which don't belong to this class. We should deprecate them for future removal and move them to `JoinedInternal` where they belong. -- This message was sent by Atlassian Jira (v8.20.10#820010)
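The deprecate-and-move idea can be sketched as follows. This is a stdlib-only illustration, not the actual Kafka Streams classes: `String` stands in for `Serde`, and the constructor/field shapes are simplified stand-ins.

```java
// Sketch of the proposed split: the public Joined config object keeps a
// deprecated getter for the transition period, while the internal subclass
// carries the accessors the library actually needs. All names here are
// simplified stand-ins for the real Kafka Streams classes.
class Joined<K, V, VO> {
    protected final String keySerde;        // String stands in for Serde<K>
    protected final String valueSerde;      // ... Serde<V>
    protected final String otherValueSerde; // ... Serde<VO>

    protected Joined(String keySerde, String valueSerde, String otherValueSerde) {
        this.keySerde = keySerde;
        this.valueSerde = valueSerde;
        this.otherValueSerde = otherValueSerde;
    }

    static <K, V, VO> Joined<K, V, VO> with(String keySerde, String valueSerde, String otherValueSerde) {
        return new Joined<>(keySerde, valueSerde, otherValueSerde);
    }

    /** @deprecated accessor slated to move to JoinedInternal */
    @Deprecated
    public String keySerde() { return keySerde; }
}

// Internal view: library code wraps the user-supplied Joined and reads the
// serdes from here, so the public surface can eventually drop the getters.
class JoinedInternal<K, V, VO> extends Joined<K, V, VO> {
    JoinedInternal(Joined<K, V, VO> joined) {
        super(joined.keySerde, joined.valueSerde, joined.otherValueSerde);
    }
    public String valueSerde() { return valueSerde; }
    public String otherValueSerde() { return otherValueSerde; }
}
```

The key property of this pattern is that user code compiled against the deprecated getters keeps working unchanged during the deprecation period.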
[jira] [Created] (KAFKA-17251) KafkaStreams.cleanup() semantics unclear
Matthias J. Sax created KAFKA-17251: --- Summary: KafkaStreams.cleanup() semantics unclear Key: KAFKA-17251 URL: https://issues.apache.org/jira/browse/KAFKA-17251 Project: Kafka Issue Type: Improvement Components: streams Reporter: Matthias J. Sax The `KafkaStreams#cleanup()` method is designed to delete an instance's local state directory, including the `app.dir` itself if it's empty (cf. the corresponding unit tests in StateDirectoryTest.java). If the top-level `app.dir` could not be deleted, a WARN is logged. However, in a later version we started to persist the `processId`, which implies that the state directory won't be empty in many cases, because the `StateDirectory#clean()` method does not explicitly delete it. It's unclear right now whether `clean()` should actually try to delete the `processId` file, too, and it's a bug that it does not, or whether there are cases for which the `processId` should actually be preserved (in this case, we should not log a WARN, as it's expected that the `app.dir` is not deleted). Maybe there is not even a strict yes/no answer to this question, but we should extend `KafkaStreams#cleanup()` with a parameter and let users pick? (This would require a KIP.) -- This message was sent by Atlassian Jira (v8.20.10#820010)
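The two candidate semantics can be made concrete with a stdlib-only sketch. Everything below is illustrative, not the actual `StateDirectory` code: the file name constant and the `clean()` signature are hypothetical, and the `deleteProcessIdFile` flag models the proposed user-facing parameter.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical sketch: clean() walks the state directory and either deletes
// everything (including the processId file, so app.dir itself can go), or
// preserves the processId file and then intentionally does not treat a
// leftover app.dir as a warning condition.
class StateDirCleaner {
    static final String PROCESS_ID_FILE = "kafka-streams-process-metadata"; // illustrative name

    /** Returns true only if app.dir itself was removed. */
    static boolean clean(Path appDir, boolean deleteProcessIdFile) throws IOException {
        try (Stream<Path> walk = Files.walk(appDir)) {
            walk.sorted(Comparator.reverseOrder())   // children before parents
                .filter(p -> deleteProcessIdFile
                        || !p.getFileName().toString().equals(PROCESS_ID_FILE))
                .forEach(p -> {
                    try {
                        Files.deleteIfExists(p);
                    } catch (DirectoryNotEmptyException ignored) {
                        // app.dir still holds the preserved processId file:
                        // expected in preserve mode, so no WARN here
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
        }
        return Files.notExists(appDir);
    }
}
```

In preserve mode the method reports "not fully deleted" as a normal outcome rather than logging a warning, which is exactly the behavioral question the ticket raises.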
[jira] [Created] (KAFKA-17224) Make ForeachProcessor internal
Matthias J. Sax created KAFKA-17224: --- Summary: Make ForeachProcessor internal Key: KAFKA-17224 URL: https://issues.apache.org/jira/browse/KAFKA-17224 Project: Kafka Issue Type: Improvement Components: streams Reporter: Matthias J. Sax `ForeachProcessor` is in the public package `org.apache.kafka.streams.kstream` but it's actually an internal class. We should deprecate it and move it into an internal package in a future release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17215) Remove get-prefix for all getters
Matthias J. Sax created KAFKA-17215: --- Summary: Remove get-prefix for all getters Key: KAFKA-17215 URL: https://issues.apache.org/jira/browse/KAFKA-17215 Project: Kafka Issue Type: Improvement Components: streams, streams-test-utils Reporter: Matthias J. Sax Kafka traditionally does not use a `get` prefix for getter methods. However, multiple public interfaces don't follow this common pattern but actually have a get-prefix. We might want to clean this up. The upcoming 4.0 release might be a good opportunity to deprecate existing methods and add them back with the "correct" name. We should maybe also do multiple smaller KIPs instead of just one big KIP. We know of the following: * StreamsConfig (getMainConsumerConfigs, getRestoreConsumerConfigs, getGlobalConsumerConfigs, getProducerConfigs, getAdminConfigs, getClientTags, getKafkaClientSupplier – for some of these, we might even consider removing them; it's questionable if it makes sense to have them in the public API, cf. https://github.com/apache/kafka/pull/14548 – we should also consider https://issues.apache.org/jira/browse/KAFKA-16945 for this work) * TopologyConfig (getTaskConfig) * KafkaClientSupplier (getAdmin, getProducer, getConsumer, getRestoreConsumer, getGlobalConsumer) * Contexts (maybe not worth it... we might deprecate the whole class soon): ** ProcessorContext (getStateStore) ** MockProcessorContext (getStateStore) ** api.ProcessingContext (getStateStore) ** api.FixedKeyProcessorContext (getStateStore) ** api.MockProcessorContext (getStateStore) * StateStore (getPosition) * IQv2: officially an evolving API (maybe we can rename in 4.0 directly w/o deprecation period, but might be nasty...) 
** KeyQuery (getKey) ** Position (getTopics, getPartitionPositions) ** QueryResult (getExecutionInfo, getPosition, getFailureReason, getFailureMessage, getResult) ** RangeQuery (getLowerBound, getUpperBound) ** StateQueryRequest (getStoreName, getPositionBound, getQuery, getPartitions) ** StateQueryResult (getPartitionResults, getOnlyPartitionResult, getGlobalResult, getPosition) ** WindowKeyQuery (getKey, getTimeFrom, getTimeTo) ** WindowRangeQuery (getKey, getTimeFrom, getTimeTo) * TopologyTestDriver (getAllStateStores, getStateStore, getKeyValueStore, getTimestampedKeyValueStore, getVersionedKeyValueStore, getWindowStore, getTimestampedWindowStore, getSessionStore) * TestOutputTopic (getQueueSize) -- This message was sent by Atlassian Jira (v8.20.10#820010)
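The deprecate-then-rename pattern the ticket proposes is mechanical. A minimal sketch, using `KeyQuery#getKey` as the example (an illustrative stand-in, not the real IQv2 class):

```java
// Rename pattern: add the new no-prefix accessor, keep the old one as a
// deprecated delegate so existing callers compile with a warning instead
// of breaking. Simplified stand-in for the real IQv2 KeyQuery.
class KeyQuery<K> {
    private final K key;
    KeyQuery(K key) { this.key = key; }

    /** New accessor following the no-get-prefix convention. */
    K key() { return key; }

    /** @deprecated since 4.0 (per the ticket's proposal), use {@link #key()} instead. */
    @Deprecated
    K getKey() { return key(); }   // delegate, so behavior stays identical
}
```

Because the deprecated method delegates to the new one, there is exactly one source of truth per accessor during the deprecation window.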
[jira] [Resolved] (KAFKA-17204) KafkaStreamsCloseOptionsIntegrationTest.before leaks AdminClient
[ https://issues.apache.org/jira/browse/KAFKA-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-17204. - Fix Version/s: 3.9.0 Resolution: Fixed > KafkaStreamsCloseOptionsIntegrationTest.before leaks AdminClient > > > Key: KAFKA-17204 > URL: https://issues.apache.org/jira/browse/KAFKA-17204 > Project: Kafka > Issue Type: Test > Components: streams >Affects Versions: 3.9.0 >Reporter: Greg Harris >Assignee: TengYao Chi >Priority: Minor > Labels: newbie > Fix For: 3.9.0 > > > The before method creates an AdminClient, but this client is never closed. It > should be closed either in `after` or `closeCluster`. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-17054) test
[ https://issues.apache.org/jira/browse/KAFKA-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-17054. - Resolution: Invalid > test > > > Key: KAFKA-17054 > URL: https://issues.apache.org/jira/browse/KAFKA-17054 > Project: Kafka > Issue Type: Test >Reporter: Xuze Yang >Priority: Major > Attachments: kafka消费者.md, 扩分区后分区不均衡问题.zip > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17178) Update KTable.transformValues to new Processor API
Matthias J. Sax created KAFKA-17178: --- Summary: Update KTable.transformValues to new Processor API Key: KAFKA-17178 URL: https://issues.apache.org/jira/browse/KAFKA-17178 Project: Kafka Issue Type: Improvement Components: streams Reporter: Matthias J. Sax The original Processor API was replaced by a type-safe version using `Record` instead of K/V pairs. The old PAPI is already largely deprecated and will for the most part be removed in the 4.0 release. However, `KTable.transformValues` is still using the old PAPI and should be updated to use the new `api.FixedKeyProcessor` instead of the old `ValueTransformerWithKey`. The method `transformValues` should be deprecated and replaced with a new `processValues` method, analogous to `KStream.processValues`. At the same time, the old interfaces `ValueTransformerWithKey` and `ValueTransformerWithKeySupplier` should be deprecated. -- This message was sent by Atlassian Jira (v8.20.10#820010)
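The shape of the migration can be sketched with heavily simplified stand-ins. These are not the actual Kafka Streams interfaces (the real `FixedKeyProcessor` forwards records through a context rather than returning them, and the real record type is not a Java record); the point is only the old-vs-new contract.

```java
// Old PAPI style: a read-only key plus a raw value, returning the new value.
interface ValueTransformerWithKey<K, V, VR> {
    VR transform(K readOnlyKey, V value);
}

// New PAPI style: a type-safe record whose key cannot be changed.
// (Simplified: the real API forwards via a ProcessorContext instead
// of returning the record.)
record FixedKeyRecord<K, V>(K key, V value) {
    <VR> FixedKeyRecord<K, VR> withValue(VR newValue) {
        return new FixedKeyRecord<>(key, newValue);
    }
}

interface FixedKeyProcessor<K, V, VR> {
    FixedKeyRecord<K, VR> process(FixedKeyRecord<K, V> rec);
}

class Migration {
    // The same logic expressed both ways: uppercase the value, key untouched.
    static final ValueTransformerWithKey<String, String, String> OLD =
            (key, value) -> value.toUpperCase();

    static final FixedKeyProcessor<String, String, String> NEW =
            rec -> rec.withValue(rec.value().toUpperCase());
}
```

The benefit of the new style is that "the key cannot change" is enforced by the types (`withValue` preserves `K`) instead of by documentation on a "read-only" parameter.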
[jira] [Created] (KAFKA-17164) Consider enforcing application.server host:port format at config level
Matthias J. Sax created KAFKA-17164: --- Summary: Consider enforcing application.server host:port format at config level Key: KAFKA-17164 URL: https://issues.apache.org/jira/browse/KAFKA-17164 Project: Kafka Issue Type: Improvement Components: streams Reporter: Matthias J. Sax Kafka Streams supports the configuration parameter `application.server`, which must be of format `host:port`. However, in `StreamsConfig` we accept any `String` w/o validation, and only validate the format inside the `KafkaStreams` constructor. It might be better to add a `ConfigDef.Validator` and move this validation into `StreamsConfig` directly. This would be a semantic change, because `new StreamsConfig(...)` might now throw an exception. Thus we need a KIP for this change, and it's technically backward incompatible... (So not sure if we can do this at all – except for a major release? – But 4.0 is close...) To be fair, the ROI is unclear. Filing this ticket mainly for documentation and to collect feedback on whether people think it would be a worthwhile thing to do or not. -- This message was sent by Atlassian Jira (v8.20.10#820010)
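The actual check is small either way. A stdlib-only sketch of the validation logic that would sit behind such a validator (the class name and the "empty means unset, hence valid" rule are assumptions for illustration, not the current Streams behavior):

```java
// Hypothetical host:port validation as it might back a config-level
// validator. Pure stdlib, no Kafka dependency.
class HostPortValidator {
    /** Returns true iff the value is unset or looks like host:port with a valid port. */
    static boolean isValid(String value) {
        if (value == null || value.isEmpty()) return true; // assumed: unset is allowed
        int idx = value.lastIndexOf(':');                  // lastIndexOf tolerates IPv6-ish hosts
        if (idx <= 0 || idx == value.length() - 1) return false; // missing host or port
        try {
            int port = Integer.parseInt(value.substring(idx + 1));
            return port >= 0 && port <= 65535;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

Wired into a `ConfigDef.Validator`, a failing value would surface as a `ConfigException` at `new StreamsConfig(...)` time, which is exactly the (backward-incompatible) semantic change the ticket flags.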
[jira] [Created] (KAFKA-17131) Cleanup `ProcessorContext` and related interfaces
Matthias J. Sax created KAFKA-17131: --- Summary: Cleanup `ProcessorContext` and related interfaces Key: KAFKA-17131 URL: https://issues.apache.org/jira/browse/KAFKA-17131 Project: Kafka Issue Type: Improvement Components: streams Reporter: Matthias J. Sax We did replace `Processor` and related classes with `api.Processor` et al. – However, there is also `ProcessorContext`: While `ProcessorContext` has a new `api.ProcessorContext` equivalent, `ProcessorContext` is still used in other places (e.g. `StateStore#init` and `DeserializationExceptionHandler`). Note that we also have `api.ProcessingContext` as well as `StateStoreContext` interfaces. Side note: In general, `StateStore` is a very leaky abstraction, as it serves two purposes. It's an interface to implement a custom `StateStore` and the top-level interface to access state stores inside a `Processor`, and thus we leak "runtime" methods (like `init()` / `close()`) into `Processor` where they should not be exposed. It might be worth looking into this and figuring out how to clean this all up. -- This message was sent by Atlassian Jira (v8.20.10#820010)
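One possible shape for fixing the leaky-abstraction side note is an interface split, sketched below with entirely hypothetical names (this is not a proposed Kafka API, just the general technique):

```java
// Split the two roles of a state store: what a Processor may call vs.
// what only the runtime may call. All names here are hypothetical.
interface ReadableStore<K, V> {          // the view handed to Processors
    V get(K key);
}

interface StoreLifecycle {               // the view used by the runtime
    void init();
    void close();
}

// A concrete store implements both; user code only ever receives the
// ReadableStore view, so init()/close() stop leaking into Processors.
class InMemoryStore implements ReadableStore<String, String>, StoreLifecycle {
    private final java.util.Map<String, String> map = new java.util.HashMap<>();
    private boolean open;

    @Override public void init() { open = true; }
    @Override public void close() { open = false; map.clear(); }
    @Override public String get(String key) {
        if (!open) throw new IllegalStateException("store not open");
        return map.get(key);
    }
    void put(String k, String v) { map.put(k, v); }
}
```

With this split, "calling `close()` from inside a `Processor`" becomes a compile error rather than a runtime footgun.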
[jira] [Resolved] (KAFKA-10356) Handle accidental deletion of sink-topics as exceptional failure
[ https://issues.apache.org/jira/browse/KAFKA-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-10356. - Resolution: Not A Problem > Handle accidental deletion of sink-topics as exceptional failure > > > Key: KAFKA-10356 > URL: https://issues.apache.org/jira/browse/KAFKA-10356 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Guozhang Wang >Assignee: Bruno Cadonna >Priority: Major > > Today when sink topics are deleted, the producer's send callback would > eventually return the UnknownTopicOrPartition exception after the configured > `delivery.timeout.ms`, whose default is 2 min if EOS is not enabled (otherwise it's > Integer.MAX_VALUE). Then, in the Streams implementation, the exception would be > handled by the ProductionExceptionHandler, which by default treats it as > `FAIL` and causes the thread to die. If it treats it as CONTINUE, the exception > is silently ignored and the sent records are silently lost. > We should improve this situation in Streams by special-handling > {{UnknownTopicOrPartition}} exception and trigger a rebalance as well, and > then in leader we can also check if the sink topic metadata exists, just like > source topic, and then follow the same logic as in > https://issues.apache.org/jira/browse/KAFKA-10355. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-10814) improving ability of handling exception in kafka
[ https://issues.apache.org/jira/browse/KAFKA-10814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-10814. - Resolution: Incomplete Unclear what the problem is. Closing. > improving ability of handling exception in kafka > > > Key: KAFKA-10814 > URL: https://issues.apache.org/jira/browse/KAFKA-10814 > Project: Kafka > Issue Type: Wish > Components: streams >Reporter: yuzhou >Priority: Major > > When I use Kafka Streams and a runtime exception such as a NullPointerException > occurs in a processor where I cannot catch it, the whole Streams application > shuts down, and other Streams instances shut down as well. > When Tomcat handles a request, an exception is simply thrown; it does not shut > down the application and does not affect other requests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
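The ticket was closed as incomplete, but the underlying wish, isolating per-record failures instead of letting one poison record kill the thread, is addressable today via Kafka Streams' `StreamsUncaughtExceptionHandler` (with its REPLACE_THREAD option), or at the application level with a guarding wrapper. The wrapper below is a generic, hypothetical helper (not a Kafka API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Application-level pattern: route poison records to a dead-letter
// collector and keep processing, instead of crashing the thread.
class GuardedProcessor<T> {
    private final Consumer<T> delegate;
    final List<T> deadLetters = new ArrayList<>();

    GuardedProcessor(Consumer<T> delegate) { this.delegate = delegate; }

    void process(T record) {
        try {
            delegate.accept(record);
        } catch (RuntimeException e) {
            deadLetters.add(record);   // capture the failure; don't propagate
        }
    }
}
```

In a real deployment the dead-letter list would typically be a dedicated Kafka topic so failed records can be inspected and replayed.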
[jira] [Resolved] (KAFKA-8629) Kafka Streams Apps to support small native images through GraalVM
[ https://issues.apache.org/jira/browse/KAFKA-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-8629. Resolution: Fixed GraalVM should work with KS now. > Kafka Streams Apps to support small native images through GraalVM > - > > Key: KAFKA-8629 > URL: https://issues.apache.org/jira/browse/KAFKA-8629 > Project: Kafka > Issue Type: Improvement > Components: streams >Affects Versions: 2.3.0 > Environment: OSX > Linux on Docker >Reporter: Andy Muir >Assignee: Andy Muir >Priority: Minor > > I'm investigating using GraalVM to help with reducing > docker image size and required resources for a simple Kafka Streams > microservice. To this end, I'm looking at running a microservice which: > 1) consumes from a Kafka topic (XML) > 2) Transforms into JSON > 3) Produces to a new Kafka topic. > The Kafka Streams app running in the JVM works fine. > When I attempt to build it to a GraalVM native image (binary executable which > does not require the JVM, hence smaller image size and less resources), I > encountered a few > [incompatibilities|https://github.com/oracle/graal/blob/master/substratevm/LIMITATIONS.md] > with the source code in Kafka. > I've implemented a workaround for each of these in a fork (link to follow) to > help establish if it is feasible. I don't intend (at this stage) for the > changes to be applied to the broker - I'm only after Kafka Streams for now. > I'm not sure whether it'd be a good idea for the broker itself to run as a > native image! 
> There were 2 issues getting the native image with kafka streams: > 1) Some Reflection use cases using MethodHandle > 2) Anything JMX > To work around these issues, I have: > 1) Replaced use of MethodHandle with alternatives > 2) Commented out the JMX code in a few places > While the first may be sustainable, I'd expect that the 2nd option should be > put behind a configuration switch to allow the existing code to be used by > default and turning off JMX if configured. > *I haven't created a PR for now, as I'd like feedback to decide if it is > going to be feasible to carry this forwards.* -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-13326) Add multi-cluster support to Kafka Streams
[ https://issues.apache.org/jira/browse/KAFKA-13326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-13326. - Resolution: Duplicate > Add multi-cluster support to Kafka Streams > -- > > Key: KAFKA-13326 > URL: https://issues.apache.org/jira/browse/KAFKA-13326 > Project: Kafka > Issue Type: New Feature > Components: streams >Reporter: Guangyuan Wang >Priority: Major > Labels: needs-kip > > Dear Kafka Team, > According to the link, > https://kafka.apache.org/28/documentation/streams/developer-guide/config-streams.html#bootstrap-servers. > Kafka Streams applications can only communicate with a single Kafka cluster > specified by this config value. Future versions of Kafka Streams will support > connecting to different Kafka clusters for reading input streams and writing > output streams. > Which version will this feature be added in the Kafka stream? This is really > a very good feature. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (KAFKA-13183) Dropping null key/value records upstream to repartition topic not tracked via metrics
[ https://issues.apache.org/jira/browse/KAFKA-13183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax reopened KAFKA-13183: - > Dropping null key/value records upstream to repartition topic not tracked via > metrics > --- > > Key: KAFKA-13183 > URL: https://issues.apache.org/jira/browse/KAFKA-13183 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Matthias J. Sax >Priority: Major > > For joins and aggregation, we consider records with null key or value as > invalid, and drop them. Inside the aggregate and join processors, we record > dropped records with a corresponding metric (cf. `droppedRecordsSensor`). > However, we also apply an upstream optimization if we need to repartition > data. As we know that the downstream aggregation / join will drop those > records anyway, we drop them _before_ we write them into the repartition > topic (we still need the drop logic in the processor for the case we don't > have a repartition topic). > We add a `KStreamFilter` (cf. `KStreamImpl#createRepartitionSource()`) > upstream, but this filter does not update the corresponding metric to record > dropped records. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-13183) Dropping null key/value records upstream to repartition topic not tracked via metrics
[ https://issues.apache.org/jira/browse/KAFKA-13183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-13183. - Resolution: Fixed > Dropping null key/value records upstream to repartition topic not tracked via > metrics > --- > > Key: KAFKA-13183 > URL: https://issues.apache.org/jira/browse/KAFKA-13183 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Matthias J. Sax >Priority: Major > > For joins and aggregation, we consider records with null key or value as > invalid, and drop them. Inside the aggregate and join processors, we record > dropped records with a corresponding metric (cf. `droppedRecordsSensor`). > However, we also apply an upstream optimization if we need to repartition > data. As we know that the downstream aggregation / join will drop those > records anyway, we drop them _before_ we write them into the repartition > topic (we still need the drop logic in the processor for the case we don't > have a repartition topic). > We add a `KStreamFilter` (cf. `KStreamImpl#createRepartitionSource()`) > upstream, but this filter does not update the corresponding metric to record > dropped records. -- This message was sent by Atlassian Jira (v8.20.10#820010)
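The fix direction is simple to illustrate: the upstream filter must bump the same dropped-records counter the downstream processors use. In this sketch an `AtomicLong` stands in for the real Streams metrics sensor; the class name is illustrative, not the actual `KStreamFilter`.

```java
import java.util.concurrent.atomic.AtomicLong;

// Upstream null-key/value filter that also records what it drops, instead
// of dropping silently. AtomicLong stands in for the metrics sensor.
class NullKeyValueFilter<K, V> {
    final AtomicLong droppedRecords = new AtomicLong();

    /** Returns true if the record may be written to the repartition topic. */
    boolean accept(K key, V value) {
        if (key == null || value == null) {
            droppedRecords.incrementAndGet();   // previously: not tracked
            return false;
        }
        return true;
    }
}
```

Routing both the upstream and the in-processor drop paths through one sensor keeps the metric meaningful regardless of whether a repartition topic exists.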
[jira] [Created] (KAFKA-17057) Add "retry" option to ProductionExceptionHandler
Matthias J. Sax created KAFKA-17057: --- Summary: Add "retry" option to ProductionExceptionHandler Key: KAFKA-17057 URL: https://issues.apache.org/jira/browse/KAFKA-17057 Project: Kafka Issue Type: Improvement Components: streams Reporter: Matthias J. Sax With KAFKA-16508 we changed the KS behavior to call the ProductionExceptionHandler for a single special case of a potentially missing output topic, to break an infinite retry loop. However, this seems not to be very flexible, as users might want to retry in some cases. We might also consider not calling the handler when writing into internal topics, as those _must_ exist. -- This message was sent by Atlassian Jira (v8.20.10#820010)
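A sketch of what a third handler outcome could look like. The current handler offers FAIL and CONTINUE; the `RETRY` value and the bounded-retry dispatch below are hypothetical illustrations of the proposal, not the existing API:

```java
// Hypothetical extension of the handler's response enum, plus a dispatcher
// showing how a bounded RETRY would break the old infinite-retry loop.
enum ProductionExceptionHandlerResponse { FAIL, CONTINUE, RETRY }

class RetryingSendDispatcher {
    interface Send { void attempt() throws Exception; }

    /** Returns true if the send ultimately succeeded, false if the record was dropped. */
    static boolean dispatch(Send send, ProductionExceptionHandlerResponse onError,
                            int maxRetries) throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                send.attempt();
                return true;
            } catch (Exception e) {
                switch (onError) {
                    case CONTINUE:
                        return false;                       // drop the record
                    case RETRY:
                        if (attempt < maxRetries) break;    // bounded, not infinite
                        throw e;                            // retries exhausted
                    case FAIL:
                    default:
                        throw e;
                }
            }
        }
    }
}
```

The essential design point is the bound: RETRY only makes sense if it cannot recreate the infinite loop KAFKA-16508 was fixing.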
[jira] [Resolved] (KAFKA-16508) Infinite loop if output topic does not exist
[ https://issues.apache.org/jira/browse/KAFKA-16508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16508. - Fix Version/s: 3.9.0 Resolution: Fixed > Infinite loop if output topic does not exist > - > > Key: KAFKA-16508 > URL: https://issues.apache.org/jira/browse/KAFKA-16508 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Alieh Saeedi >Priority: Major > Fix For: 3.9.0 > > > Kafka Streams supports `ProductionExceptionHandler` to drop records on error > when writing into an output topic. > However, if the output topic does not exist, the corresponding error cannot > be skipped over because the handler is not called. > The issue is that the producer internally retries to fetch the output topic > metadata until it times out, and a `TimeoutException` (which is a > `RetriableException`) is returned via the registered `Callback`. However, for > `RetriableException` there is a different code path and the > `ProductionExceptionHandler` is not called. > In general, Kafka Streams correctly tries to handle as many errors as possible > internally, and a `RetriableError` falls into this category (and thus there > is no need to call the handler). However, for this particular case, just > retrying does not solve the issue – it's unclear if throwing a retryable > `TimeoutException` is actually the right thing to do for the Producer? Also > not sure what the right way to address this ticket would be (currently, we > cannot really detect this case, except if we would do some nasty error-message > String comparison, which sounds hacky...) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16965) Add a "root cause" exception as a nested exception to TimeoutException for Producer
[ https://issues.apache.org/jira/browse/KAFKA-16965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16965. - Fix Version/s: 3.9.0 Resolution: Fixed > Add a "root cause" exception as a nested exception to TimeoutException for > Producer > --- > > Key: KAFKA-16965 > URL: https://issues.apache.org/jira/browse/KAFKA-16965 > Project: Kafka > Issue Type: Bug > Components: producer >Reporter: Alieh Saeedi >Assignee: Alieh Saeedi >Priority: Minor > Fix For: 3.9.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17019) Producer TimeoutException should include root cause
Matthias J. Sax created KAFKA-17019: --- Summary: Producer TimeoutException should include root cause Key: KAFKA-17019 URL: https://issues.apache.org/jira/browse/KAFKA-17019 Project: Kafka Issue Type: Improvement Components: producer Reporter: Matthias J. Sax With KAFKA-16965 we added a "root cause" to some `TimeoutException`s thrown by the producer. However, it's only a partial solution addressing a specific issue. We should consider adding the "root cause" for _all_ `TimeoutException` cases and unify/clean up the code to get a holistic solution to the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16994) Flaky Test SlidingWindowedKStreamIntegrationTest.shouldRestoreAfterJoinRestart
Matthias J. Sax created KAFKA-16994: --- Summary: Flaky Test SlidingWindowedKStreamIntegrationTest.shouldRestoreAfterJoinRestart Key: KAFKA-16994 URL: https://issues.apache.org/jira/browse/KAFKA-16994 Project: Kafka Issue Type: Test Reporter: Matthias J. Sax Attachments: 5owo5xbyzjnao-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_CLOSE_cache_true]-1-output.txt, 7jnraxqt7a52m-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_CLOSE_cache_false]-1-output.txt, dujhqmgv6nzuu-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_UPDATE_cache_true]-1-output.txt, fj6qia6oiob4m-org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest-shouldRestoreAfterJoinRestart[ON_WINDOW_UPDATE_cache_false]-1-output.txt Failed for all different parameters. {code:java} java.lang.AssertionError: Did not receive all 1 records from topic output-shouldRestoreAfterJoinRestart_ON_WINDOW_CLOSE_cache_true_F_de0bULT5a8gQ_8lAhz8Q within 6 msExpected: is a value equal to or greater than <1> but: <0> was less than <1>at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueWithTimestampRecordsReceived$2(IntegrationTestUtils.java:778)at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:412)at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueWithTimestampRecordsReceived(IntegrationTestUtils.java:774)at org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest.receiveMessagesWithTimestamp(SlidingWindowedKStreamIntegrationTest.java:479)at 
org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest.shouldRestoreAfterJoinRestart(SlidingWindowedKStreamIntegrationTest.java:404)at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)at java.lang.reflect.Method.invoke(Method.java:568)at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)at java.util.concurrent.FutureTask.run(FutureTask.java:264)at java.lang.Thread.run(Thread.java:833) {code} {code:java} java.lang.AssertionError: Did not receive all 2 records from topic output-shouldRestoreAfterJoinRestart_ON_WINDOW_UPDATE_cache_true_bG_UnW1QSr_7tz2aXQNTXA within 6 msExpected: is a value equal to or greater than <2> but: <0> was less than <2>at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueWithTimestampRecordsReceived$2(IntegrationTestUtils.java:778)at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:412)at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueWithTimestampRecordsReceived(IntegrationTestUtils.java:774)at org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest.receiveMessagesWithTimestamp(SlidingWindowedKStreamIntegrationTest.java:479)at org.apache.kafka.streams.integration.SlidingWindowedKStreamIntegrationTest.shouldRestoreAfterJoinRestart(SlidingWindowedKStreamIntegrationTest.java:404)at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)at java.lang.reflect.Method.invoke(Method.java:580)at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)at org.junit.internal.runners.statements.Ru
[jira] [Created] (KAFKA-16993) Flaky test RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener.shouldInvokeUserDefinedGlobalStateRestoreListener()
Matthias J. Sax created KAFKA-16993: --- Summary: Flaky test RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener.shouldInvokeUserDefinedGlobalStateRestoreListener() Key: KAFKA-16993 URL: https://issues.apache.org/jira/browse/KAFKA-16993 Project: Kafka Issue Type: Test Components: streams, unit tests Reporter: Matthias J. Sax Attachments: 6u4a4e27e2oh2-org.apache.kafka.streams.integration.RestoreIntegrationTest-shouldInvokeUserDefinedGlobalStateRestoreListener()-1-output.txt
{code:java}
org.opentest4j.AssertionFailedError: expected: but was:
    at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
    at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
    at org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
    at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
    at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)
    at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:183)
    at org.apache.kafka.streams.integration.RestoreIntegrationTest.shouldInvokeUserDefinedGlobalStateRestoreListener(RestoreIntegrationTest.java:611)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:728)
    at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
    at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
    at org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)
    at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
    at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
    at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
    at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
    at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
    at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
    at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
    at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
    at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
    at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
    at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
    at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:218)
    at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
    at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:214)
    at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:139)
    at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:69)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
    at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
    at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
    at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
    at java.util.ArrayList.forEach(ArrayList.java:1259)
    at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(
[jira] [Resolved] (KAFKA-16992) Flaky Test org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2]
[ https://issues.apache.org/jira/browse/KAFKA-16992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16992. - Resolution: Duplicate > Flaky Test > org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2] > -- > > Key: KAFKA-16992 > URL: https://issues.apache.org/jira/browse/KAFKA-16992 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Priority: Major > Attachments: > 6u4a4e27e2oh2-org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest-shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2]-1-output.txt > > > We saw this test to timeout more frequently recently: > {code:java} > org.opentest4j.AssertionFailedError: Condition not met within timeout 15000. > Expected ERROR state but driver is on RUNNING ==> expected: but was: > at > org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)at > > org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)at > org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)at > org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214)at > org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:396)at > > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)at > org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:393)at > org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:377)at > org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:350)at > org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore(EOSUncleanShutdownIntegrationTest.java:169)at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at > 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)at > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)at > java.lang.reflect.Method.invoke(Method.java:498)at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)at > > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)at > > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)at > > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)at > > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)at > > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)at > java.util.concurrent.FutureTask.run(FutureTask.java:266)at > java.lang.Thread.run(Thread.java:750) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16992) Flaky Test org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2]
Matthias J. Sax created KAFKA-16992: --- Summary: Flaky Test org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2] Key: KAFKA-16992 URL: https://issues.apache.org/jira/browse/KAFKA-16992 Project: Kafka Issue Type: Test Components: streams, unit tests Reporter: Matthias J. Sax Attachments: 6u4a4e27e2oh2-org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest-shouldWorkWithUncleanShutdownWipeOutStateStore[exactly_once_v2]-1-output.txt We have seen this test time out more frequently recently:
{code:java}
org.opentest4j.AssertionFailedError: Condition not met within timeout 15000. Expected ERROR state but driver is on RUNNING ==> expected: but was:
    at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
    at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
    at org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
    at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
    at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214)
    at org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:396)
    at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)
    at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:393)
    at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:377)
    at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:350)
    at org.apache.kafka.streams.integration.EOSUncleanShutdownIntegrationTest.shouldWorkWithUncleanShutdownWipeOutStateStore(EOSUncleanShutdownIntegrationTest.java:169)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.lang.Thread.run(Thread.java:750)
{code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16991) Flaky Test org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest.shouldRestoreState
Matthias J. Sax created KAFKA-16991: --- Summary: Flaky Test org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest.shouldRestoreState Key: KAFKA-16991 URL: https://issues.apache.org/jira/browse/KAFKA-16991 Project: Kafka Issue Type: Test Components: streams, unit tests Reporter: Matthias J. Sax Attachments: 5owo5xbyzjnao-org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest-shouldRestoreState()-1-output.txt We have seen this test run into timeouts more frequently recently.
{code:java}
org.opentest4j.AssertionFailedError: Condition not met within timeout 6. Repartition topic restore-test-KSTREAM-AGGREGATE-STATE-STORE-02-repartition not purged data after 6 ms. ==> expected: but was:
    at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
    •••
    at org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:396)
    at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:444)
    at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:393)
    at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:377)
    at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:367)
    at org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest.shouldRestoreState(PurgeRepartitionTopicIntegrationTest.java:220)
{code}
There was no ERROR or WARN log... -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16001) Migrate ConsumerNetworkThreadTest away from ConsumerTestBuilder
[ https://issues.apache.org/jira/browse/KAFKA-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16001. - Resolution: Fixed > Migrate ConsumerNetworkThreadTest away from ConsumerTestBuilder > --- > > Key: KAFKA-16001 > URL: https://issues.apache.org/jira/browse/KAFKA-16001 > Project: Kafka > Issue Type: Test > Components: clients, consumer, unit tests >Reporter: Lucas Brutschy >Assignee: Brenden DeLuna >Priority: Minor > Labels: consumer-threading-refactor, unit-tests > Fix For: 3.9.0 > > > We should: > # Remove spy calls to the dependencies > # Remove ConsumerNetworkThreadTest -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-13298) Improve documentation on EOS KStream requirements
[ https://issues.apache.org/jira/browse/KAFKA-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-13298. - Fix Version/s: 3.1.0 Assignee: Andy Chambers Resolution: Fixed > Improve documentation on EOS KStream requirements > - > > Key: KAFKA-13298 > URL: https://issues.apache.org/jira/browse/KAFKA-13298 > Project: Kafka > Issue Type: Improvement > Components: streams >Affects Versions: 2.8.0 >Reporter: F Méthot >Assignee: Andy Chambers >Priority: Trivial > Fix For: 3.1.0 > > > After posting question on a kafka forum, the following was revealed by kafka > developer: > {quote}Is there minimum replication factor required to enable exactly_once > for topic involved in transactions? > {quote} > "Well, technically you can configure the system to use EOS with any > replication factor. However, using a lower replication factor than 3 > effectively voids EOS. Thus, it’s strongly recommended to use a replication > factor of 3." > > This should be clearly documented in the stream doc: > [https://kafka.apache.org/28/documentation/streams/developer-guide/config-streams.html] > Which refers to this broker link > [https://kafka.apache.org/28/documentation/streams/developer-guide/config-streams.html#streams-developer-guide-processing-guarantedd] > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
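The recommendation quoted above maps to two Streams config entries. A minimal sketch using plain `java.util.Properties` — the config key strings are the real StreamsConfig names, while the wrapper class is purely illustrative:

```java
import java.util.Properties;

public class EosConfigSketch {
    // Builds a minimal EOS-related Streams config; "processing.guarantee"
    // and "replication.factor" are the actual StreamsConfig key names.
    public static Properties eosProps() {
        Properties props = new Properties();
        // exactly_once_v2 is the current EOS setting (successor of exactly_once).
        props.put("processing.guarantee", "exactly_once_v2");
        // Technically any replication factor works, but anything below 3
        // effectively voids the EOS guarantee, hence the recommended 3.
        props.put("replication.factor", "3");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(EosConfigSketch.eosProps());
    }
}
```

In a real application these properties would be passed to the `KafkaStreams` constructor.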
[jira] [Resolved] (KAFKA-13561) Consider deprecating `StreamsBuilder#build(props)` function
[ https://issues.apache.org/jira/browse/KAFKA-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-13561. - Resolution: Duplicate > Consider deprecating `StreamsBuilder#build(props)` function > --- > > Key: KAFKA-13561 > URL: https://issues.apache.org/jira/browse/KAFKA-13561 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Guozhang Wang >Priority: Major > Labels: needs-kip > > With > https://cwiki.apache.org/confluence/display/KAFKA/KIP-591%3A+Add+Kafka+Streams+config+to+set+default+state+store > being accepted that introduced the new `StreamsBuilder(TopologyConfig)` > constructor, we can consider deprecating the `StreamsBuilder#build(props)` > function now. There are still a few things we'd need to do: > 1. Copy the `StreamsConfig.TOPOLOGY_OPTIMIZATION_CONFIG` to TopologyConfig. > 2. Make sure the overloaded `StreamsBuilder()` constructor takes in default > values of TopologyConfig. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16945) Cleanup StreamsBuilder and TopologyConfig
Matthias J. Sax created KAFKA-16945: --- Summary: Cleanup StreamsBuilder and TopologyConfig Key: KAFKA-16945 URL: https://issues.apache.org/jira/browse/KAFKA-16945 Project: Kafka Issue Type: Improvement Components: streams Reporter: Matthias J. Sax Historically, Kafka Streams offers two ways to build a topology: either via the PAPI by creating a `new Topology()` explicitly, or via the `StreamsBuilder`, which returns a topology via its `build()` method. We later added an overload `build(Properties)` to enable topology optimizations for the DSL layer. Furthermore, we also added a `TopologyConfig` object, which can be passed into `new Topology(TopologyConfig)` as well as `StreamsBuilder(TopologyConfig)`. We should consider unifying the different approaches to simplify the rather complex API we have right now. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-15600) KIP-990: Capability to PAUSE Tasks on DeserializationException
[ https://issues.apache.org/jira/browse/KAFKA-15600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-15600. - Resolution: Won't Fix > KIP-990: Capability to PAUSE Tasks on DeserializationException > -- > > Key: KAFKA-15600 > URL: https://issues.apache.org/jira/browse/KAFKA-15600 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Nicholas Telford >Assignee: Nicholas Telford >Priority: Minor > Labels: kip > > Presently, Kafka Streams provides users with two options for handling a > {{DeserializationException}} via the {{DeserializationExceptionHandler}} > interface: > # {{FAIL}} - throw an Exception that causes the stream thread to fail. This > will either cause the whole application instance to exit, or the stream > thread will be replaced and restarted. Either way, the failed {{Task}} will > end up being resumed, either by the current instance or after being > rebalanced to another, causing a cascading failure until a user intervenes to > address the problem. > # {{CONTINUE}} - discard the record and continue processing with the next > record. This can cause data loss if the record triggering the > {{DeserializationException}} should be considered a valid record. This can > happen if an upstream producer changes the record schema in a way that is > incompatible with the streams application, or if there is a bug in the > {{Deserializer}} (for example, failing to handle a valid edge-case). > The user can currently choose between data loss, or a cascading failure that > usually causes all processing to slowly grind to a halt. -- This message was sent by Atlassian Jira (v8.20.10#820010)
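The trade-off between the two existing handler responses can be sketched in a few lines of plain Java. This only models the semantics described above; it is not the real Kafka Streams API, and all names here are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the two DeserializationExceptionHandler responses:
// FAIL kills the (model) stream thread, CONTINUE drops the bad record.
public class HandlerSemanticsSketch {
    public enum Response { FAIL, CONTINUE }

    // Integer.parseInt stands in for deserialization of a record.
    public static List<Integer> process(List<String> records, Response onError) {
        List<Integer> out = new ArrayList<>();
        for (String r : records) {
            try {
                out.add(Integer.parseInt(r));
            } catch (NumberFormatException e) {
                if (onError == Response.FAIL) {
                    // cascading failure: task gets resumed and fails again
                    throw new IllegalStateException("stream thread dies", e);
                }
                // CONTINUE: the record is silently lost (possible data loss)
            }
        }
        return out;
    }
}
```

KIP-990 proposed a third response, PAUSE, that would keep the task alive without discarding the record; the ticket was closed as Won't Fix.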
[jira] [Resolved] (KAFKA-14567) Kafka Streams crashes after ProducerFencedException
[ https://issues.apache.org/jira/browse/KAFKA-14567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-14567. - Assignee: (was: Kirk True) Resolution: Fixed Not 100% sure either, but I feel good enough to close this ticket for now. If we see it again, we can reopen or create a new ticket. > Kafka Streams crashes after ProducerFencedException > --- > > Key: KAFKA-14567 > URL: https://issues.apache.org/jira/browse/KAFKA-14567 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 3.7.0 >Reporter: Matthias J. Sax >Priority: Blocker > Labels: eos > Fix For: 3.8.0 > > > Running a Kafka Streams application with EOS-v2. > We first see a `ProducerFencedException`. After the fencing, the fenced > thread crashed resulting in a non-recoverable error: > {quote}[2022-12-22 13:49:13,423] ERROR [i-0c291188ec2ae17a0-StreamThread-3] > stream-thread [i-0c291188ec2ae17a0-StreamThread-3] Failed to process stream > task 1_2 due to the following error: > (org.apache.kafka.streams.processor.internals.TaskExecutor) > org.apache.kafka.streams.errors.StreamsException: Exception caught in > process. 
taskId=1_2, processor=KSTREAM-SOURCE-05, > topic=node-name-repartition, partition=2, offset=539776276, > stacktrace=java.lang.IllegalStateException: TransactionalId > stream-soak-test-72b6e57c-c2f5-489d-ab9f-fdbb215d2c86-3: Invalid transition > attempted from state FATAL_ERROR to state ABORTABLE_ERROR > at > org.apache.kafka.clients.producer.internals.TransactionManager.transitionTo(TransactionManager.java:974) > at > org.apache.kafka.clients.producer.internals.TransactionManager.transitionToAbortableError(TransactionManager.java:394) > at > org.apache.kafka.clients.producer.internals.TransactionManager.maybeTransitionToErrorState(TransactionManager.java:620) > at > org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:1079) > at > org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:959) > at > org.apache.kafka.streams.processor.internals.StreamsProducer.send(StreamsProducer.java:257) > at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl.send(RecordCollectorImpl.java:207) > at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl.send(RecordCollectorImpl.java:162) > at > org.apache.kafka.streams.processor.internals.SinkNode.process(SinkNode.java:85) > at > org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forwardInternal(ProcessorContextImpl.java:290) > at > org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:269) > at > org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:228) > at > org.apache.kafka.streams.kstream.internals.KStreamKTableJoinProcessor.process(KStreamKTableJoinProcessor.java:88) > at > org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:157) > at > org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forwardInternal(ProcessorContextImpl.java:290) > at > 
org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:269) > at > org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:228) > at > org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:84) > at > org.apache.kafka.streams.processor.internals.StreamTask.lambda$doProcess$1(StreamTask.java:791) > at > org.apache.kafka.streams.processor.internals.metrics.StreamsMetricsImpl.maybeMeasureLatency(StreamsMetricsImpl.java:867) > at > org.apache.kafka.streams.processor.internals.StreamTask.doProcess(StreamTask.java:791) > at > org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:722) > at > org.apache.kafka.streams.processor.internals.TaskExecutor.processTask(TaskExecutor.java:95) > at > org.apache.kafka.streams.processor.internals.TaskExecutor.process(TaskExecutor.java:76) > at > org.apache.kafka.streams.processor.internals.TaskManager.process(TaskManager.java:1645) > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:788) > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:607) > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:569) > at > org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:748) > at > o
[jira] [Resolved] (KAFKA-16911) Kafka Streams topology optimization docs incomplete
[ https://issues.apache.org/jira/browse/KAFKA-16911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16911. - Fix Version/s: 3.8.0 Resolution: Fixed > Kafka Streams topology optimization docs incomplete > --- > > Key: KAFKA-16911 > URL: https://issues.apache.org/jira/browse/KAFKA-16911 > Project: Kafka > Issue Type: Improvement > Components: docs, streams >Affects Versions: 3.4.0 >Reporter: Matthias J. Sax >Assignee: James Galasyn >Priority: Minor > Fix For: 3.8.0 > > > The docs for topology optimization are incomplete: > [https://kafka.apache.org/37/documentation/streams/developer-guide/config-streams.html#topology-optimization] > In 3.4.0 we added a new option via > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-862%3A+Self-join+optimization+for+stream-stream+joins] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16911) Kafka Streams topology optimization docs incomplete
Matthias J. Sax created KAFKA-16911: --- Summary: Kafka Streams topology optimization docs incomplete Key: KAFKA-16911 URL: https://issues.apache.org/jira/browse/KAFKA-16911 Project: Kafka Issue Type: Improvement Components: docs, streams Affects Versions: 3.4.0 Reporter: Matthias J. Sax The docs for topology optimization are incomplete: [https://kafka.apache.org/37/documentation/streams/developer-guide/config-streams.html#topology-optimization] In 3.4.0 we added a new option via [https://cwiki.apache.org/confluence/display/KAFKA/KIP-862%3A+Self-join+optimization+for+stream-stream+joins] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16903) Task should consider producer error previously occurred for different task
[ https://issues.apache.org/jira/browse/KAFKA-16903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16903. - Resolution: Fixed > Task should consider producer error previously occurred for different task > -- > > Key: KAFKA-16903 > URL: https://issues.apache.org/jira/browse/KAFKA-16903 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 3.7.0 >Reporter: Bruno Cadonna >Assignee: Bruno Cadonna >Priority: Major > Fix For: 3.8.0 > > > A task does not consider a producer error that occurred for a different task. > The following log messages show the issue. > Task {{0_2}} of a Streams app (EOSv2 enabled) crashes while sending records > with an {{InvalidTxnStateException}}: > {code:java} > [2024-05-30 10:20:35,881] ERROR [kafka-producer-network-thread | > i-0af25f5c2bd9bba31-StreamThread-1-producer] stream-thread > [i-0af25f5c2bd9bba31-StreamThread-1] stream-task [0_2] Error encountered > sending record to topic stream-soak-test-node-name-repartition for task 0_2 > due to: > org.apache.kafka.common.errors.InvalidTxnStateException: The producer > attempted a transactional operation in an invalid state. > Exception handler choose to FAIL the processing, no more records would be > sent. (org.apache.kafka.streams.processor.internals.RecordCollectorImpl) > org.apache.kafka.common.errors.InvalidTxnStateException: The producer > attempted a transactional operation in an invalid state. 
> [2024-05-30 10:20:35,886] ERROR [i-0af25f5c2bd9bba31-StreamThread-1] > stream-thread [i-0af25f5c2bd9bba31-StreamThread-1] Failed to process stream > task 0_2 due to the following error: > (org.apache.kafka.streams.processor.internals.TaskExecutor) > org.apache.kafka.streams.errors.StreamsException: Error encountered sending > record to topic stream-soak-test-node-name-repartition for task 0_2 due to: > org.apache.kafka.common.errors.InvalidTxnStateException: The producer > attempted a transactional operation in an invalid state. > Exception handler choose to FAIL the processing, no more records would be > sent. > at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:316) > at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl.lambda$send$1(RecordCollectorImpl.java:285) > at > org.apache.kafka.clients.producer.KafkaProducer$AppendCallbacks.onCompletion(KafkaProducer.java:1565) > at > org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:311) > at > org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:272) > at > org.apache.kafka.clients.producer.internals.ProducerBatch.completeExceptionally(ProducerBatch.java:236) > at > org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:829) > at > org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:818) > at > org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:770) > at > org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:702) > at > org.apache.kafka.clients.producer.internals.Sender.lambda$null$2(Sender.java:627) > at java.util.ArrayList.forEach(ArrayList.java:1259) > at > org.apache.kafka.clients.producer.internals.Sender.lambda$handleProduceResponse$3(Sender.java:612) > at java.lang.Iterable.forEach(Iterable.java:75) > at > 
org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:612) > at > org.apache.kafka.clients.producer.internals.Sender.lambda$sendProduceRequest$9(Sender.java:916) > at > org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:154) > at > org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:608) > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:600) > at > org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:348) > at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:250) > at java.lang.Thread.run(Thread.java:750) > Caused by: org.apache.kafka.common.errors.InvalidTxnStateException: The > producer attempted a transactional operation in an invalid state. > {code} > Just before the exception of task 0_2 also task 0_0 encountered an > exception while producing: > {code:java} > [2024-05-30 10:20:35,880] ERROR [kafka-producer-network-thread | > i-0af25f5c2bd9bba31-StreamThread-1-producer] stream-thread > [i-0af25f5c2bd9bba31-StreamThread-1] stream-task [0_0] Error encountered > sending record to topic stream-soak-test-network-id-repartition for task
[jira] [Created] (KAFKA-16863) Consider removing `default.` prefix for exception handler config
Matthias J. Sax created KAFKA-16863: --- Summary: Consider removing `default.` prefix for exception handler config Key: KAFKA-16863 URL: https://issues.apache.org/jira/browse/KAFKA-16863 Project: Kafka Issue Type: Improvement Components: streams Reporter: Matthias J. Sax Kafka Streams has a set of configs with a `default.` prefix. The intent of the default-prefix is to make a distinction between, well, the default, and in-place overwrites in the code. E.g., users can specify ts-extractors on a per-topic basis. However, for the deserialization- and production-exception handlers, no such overwrites are possible, and thus `default.` does not really make sense, because there is just one handler overall. Via KIP-1033 we added a new processing-exception handler w/o a default-prefix, too. Thus, we should consider deprecating the two existing config names and adding them back w/o the `default.` prefix. -- This message was sent by Atlassian Jira (v8.20.10#820010)
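The naming inconsistency described above becomes visible when all three handlers are configured side by side. A sketch using plain `Properties`; the config keys and handler class names reflect recent releases, but treat the exact class names as illustrative:

```java
import java.util.Properties;

public class HandlerConfigSketch {
    // The two legacy handler configs carry a "default." prefix even though
    // no per-operator override exists, while the KIP-1033 processing
    // handler does not -- the inconsistency this ticket wants to resolve.
    public static Properties handlerProps() {
        Properties props = new Properties();
        props.put("default.deserialization.exception.handler",
                "org.apache.kafka.streams.errors.LogAndContinueExceptionHandler");
        props.put("default.production.exception.handler",
                "org.apache.kafka.streams.errors.DefaultProductionExceptionHandler");
        props.put("processing.exception.handler", // no "default." prefix
                "org.apache.kafka.streams.errors.LogAndFailProcessingExceptionHandler");
        return props;
    }
}
```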
[jira] [Resolved] (KAFKA-15242) FixedKeyProcessor testing is unusable
[ https://issues.apache.org/jira/browse/KAFKA-15242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-15242. - Assignee: (was: Alexander Aghili) Resolution: Duplicate > FixedKeyProcessor testing is unusable > - > > Key: KAFKA-15242 > URL: https://issues.apache.org/jira/browse/KAFKA-15242 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Zlstibor Veljkovic >Priority: Major > > Using a mock processor context to get the forwarded message doesn't work. > Also, there is no well-documented way to test FixedKeyProcessors. > Please see the repo at [https://github.com/zveljkovic/kafka-repro]; > the most important piece is the test file with runtime and compile-time errors: > [https://github.com/zveljkovic/kafka-repro/blob/master/src/test/java/com/example/demo/MyFixedKeyProcessorTest.java] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16644) FK join emits duplicate tombstone on left-side delete
[ https://issues.apache.org/jira/browse/KAFKA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16644. - Resolution: Duplicate > FK join emits duplicate tombstone on left-side delete > - > > Key: KAFKA-16644 > URL: https://issues.apache.org/jira/browse/KAFKA-16644 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 3.7.0 >Reporter: Matthias J. Sax >Priority: Major > > We introduced a regression bug in 3.7.0 release via KAFKA-14748. When a > left-hand side record is deleted, the join now emits two tombstone records > instead of one. > The problem was not detected via unit test, because the tests use a `Map` > instead of a `List` when verifying the result topic records > ([https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/KTableKTableForeignKeyJoinIntegrationTest.java#L250-L255).] > We should update all test cases to use `List` when reading from the output > topic, and of course fix the introduced bug: The > `SubscriptionSendProcessorSupplier` is sending two subscription records > instead of just a single one: > [https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/internals/foreignkeyjoin/SubscriptionSendProcessorSupplier.java#L133-L136] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16644) FK join emits duplicate tombstone on left-side delete
Matthias J. Sax created KAFKA-16644: --- Summary: FK join emits duplicate tombstone on left-side delete Key: KAFKA-16644 URL: https://issues.apache.org/jira/browse/KAFKA-16644 Project: Kafka Issue Type: Bug Components: streams Affects Versions: 3.7.0 Reporter: Matthias J. Sax We introduced a regression bug in the 3.7.0 release via KAFKA-14748. When a left-hand side record is deleted, the join now emits two tombstone records instead of one. The problem was not detected via unit test, because the tests use a `Map` instead of a `List` when verifying the result topic records ([https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/KTableKTableForeignKeyJoinIntegrationTest.java#L250-L255).] We should update all test cases to use `List` when reading from the output topic, and of course fix the introduced bug: The `SubscriptionSendProcessorSupplier` is sending two subscription records instead of just a single one: [https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/internals/foreignkeyjoin/SubscriptionSendProcessorSupplier.java#L133-L136] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16486) Integrate metric measurability changes in metrics collector
[ https://issues.apache.org/jira/browse/KAFKA-16486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16486. - Fix Version/s: 3.8.0 Resolution: Done > Integrate metric measurability changes in metrics collector > --- > > Key: KAFKA-16486 > URL: https://issues.apache.org/jira/browse/KAFKA-16486 > Project: Kafka > Issue Type: Sub-task >Reporter: Apoorv Mittal >Assignee: Apoorv Mittal >Priority: Major > Fix For: 3.8.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16586) Test TaskAssignorConvergenceTest failing
Matthias J. Sax created KAFKA-16586: --- Summary: Test TaskAssignorConvergenceTest failing Key: KAFKA-16586 URL: https://issues.apache.org/jira/browse/KAFKA-16586 Project: Kafka Issue Type: Test Components: streams, unit tests Reporter: Matthias J. Sax
{code:java}
java.lang.AssertionError: Assertion failed in randomized test. Reproduce with: `runRandomizedScenario(-538095696758490522)`.
    at org.apache.kafka.streams.processor.internals.assignment.TaskAssignorConvergenceTest.runRandomizedScenario(TaskAssignorConvergenceTest.java:545)
    at org.apache.kafka.streams.processor.internals.assignment.TaskAssignorConvergenceTest.randomClusterPerturbationsShouldConverge(TaskAssignorConvergenceTest.java:479)
{code}
This might expose an actual bug (or an incorrect test setup) and should be reproducible (did not try it myself yet). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16280) Expose method to determine Metric Measurability
[ https://issues.apache.org/jira/browse/KAFKA-16280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16280. - Resolution: Done > Expose method to determine Metric Measurability > --- > > Key: KAFKA-16280 > URL: https://issues.apache.org/jira/browse/KAFKA-16280 > Project: Kafka > Issue Type: Bug > Components: metrics >Affects Versions: 3.8.0 >Reporter: Apoorv Mittal >Assignee: Apoorv Mittal >Priority: Major > Labels: kip > Fix For: 3.8.0 > > > The Jira is to track the development of KIP-1019: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-1019%3A+Expose+method+to+determine+Metric+Measurability] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16575) Automatically remove KTable aggregation result when group becomes empty
Matthias J. Sax created KAFKA-16575: --- Summary: Automatically remove KTable aggregation result when group becomes empty Key: KAFKA-16575 URL: https://issues.apache.org/jira/browse/KAFKA-16575 Project: Kafka Issue Type: Improvement Components: streams Reporter: Matthias J. Sax Using `KTable.groupBy(...).aggregate(...)` can handle updates (inserts, deletes, actual updates) of the input KTable, by calling the provided `Adder` and `Subtractor`. However, when all records from the input table (which map to the same group/row in the result table) get removed, the result entry is not removed automatically. For example, if we implement a "count", the count would go to zero for a group by default, instead of removing the row from the result, if all input records for this group got deleted. Users can let their `Subtractor` return `null` for this case, to actually delete the row, but it's not well documented, and it seems it should be a built-in feature of the table-aggregation to remove "empty groups" from the result, instead of relying on "correct" behavior of user-code. (Also the built-in `count()` does not return `null`, but actually zero...) An internal counter tracking how many elements are in a group should be sufficient. Of course, there are backward compatibility questions we need to answer. -- This message was sent by Atlassian Jira (v8.20.10#820010)
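The `Subtractor`-returns-`null` workaround and the proposed internal counter can be sketched in plain Java (a toy model, not the Kafka Streams API; all names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a count aggregation over groups: the subtractor emits a
// tombstone (null) once the group becomes empty, so the result row is
// deleted instead of being left at a count of zero.
class GroupCountSketch {
    private final Map<String, Long> counts = new HashMap<>();

    // Adder: called when a record is added to a group.
    Long add(String group) {
        return counts.merge(group, 1L, Long::sum);
    }

    // Subtractor: called when a record leaves a group. Returning null
    // models deleting the row from the result table.
    Long subtract(String group) {
        Long updated = counts.computeIfPresent(group, (g, c) -> c - 1);
        if (updated != null && updated == 0L) {
            counts.remove(group);
            return null; // tombstone: remove the "empty group" from the result
        }
        return updated;
    }

    boolean hasGroup(String group) {
        return counts.containsKey(group);
    }
}
```

With the built-in `count()` in Streams today, the row would instead remain with value `0`; the ticket proposes making the tombstone behavior automatic.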
[jira] [Created] (KAFKA-16508) Infinite loop if output topic does not exist
Matthias J. Sax created KAFKA-16508: --- Summary: Infinite loop if output topic does not exist Key: KAFKA-16508 URL: https://issues.apache.org/jira/browse/KAFKA-16508 Project: Kafka Issue Type: Improvement Components: streams Reporter: Matthias J. Sax Kafka Streams supports `ProductionExceptionHandler` to drop records on error when writing into an output topic. However, if the output topic does not exist, the corresponding error cannot be skipped over because the handler is not called. The issue is, that the producer internally retries to fetch the output topic metadata until it times out, and a `TimeoutException` (which is a `RetriableException`) is returned via the registered `Callback`. However, for `RetriableException` there is a different code path and the `ProductionExceptionHandler` is not called. In general, Kafka Streams correctly tries to handle as many errors as possible internally, and a `RetriableError` falls into this category (and thus there is no need to call the handler). However, for this particular case, just retrying does not solve the issue – it's unclear if throwing a retryable `TimeoutException` is actually the right thing to do for the Producer? Also not sure what the right way to address this ticket would be (currently, we cannot really detect this case, except if we would do some nasty error message String comparison which sounds hacky...) -- This message was sent by Atlassian Jira (v8.20.10#820010)
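The error routing described above can be modeled with a small sketch (class, enum, and exception names here are stand-ins for illustration, not the real `org.apache.kafka.common.errors` classes or the actual Streams code path):

```java
// Minimal stand-ins for the exception hierarchy (assumptions for
// illustration only).
class RetriableException extends RuntimeException {}
class TimeoutException extends RetriableException {}
class RecordTooLargeException extends RuntimeException {}

class SendErrorDispatch {
    enum Action { RETRY_INTERNALLY, ASK_HANDLER }

    // Model of the branching described in the ticket: retriable errors are
    // handled internally by retrying and never reach the
    // ProductionExceptionHandler. A missing output topic surfaces as a
    // retriable TimeoutException, so the handler cannot skip it and the
    // retry loop never terminates.
    static Action dispatch(RuntimeException error) {
        return (error instanceof RetriableException)
                ? Action.RETRY_INTERNALLY
                : Action.ASK_HANDLER;
    }
}
```

The sketch shows why the handler contract cannot kick in: the "skip this record" decision only happens on the `ASK_HANDLER` branch, which a metadata timeout never reaches.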
[jira] [Resolved] (KAFKA-16357) Kafka Client JAR manifest breaks javac linting
[ https://issues.apache.org/jira/browse/KAFKA-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16357. - Resolution: Duplicate > Kafka Client JAR manifest breaks javac linting > -- > > Key: KAFKA-16357 > URL: https://issues.apache.org/jira/browse/KAFKA-16357 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 3.7.0 > Environment: Linux, JDK 21 (Docker image eclipse-temurin:21-jdk-jammy) >Reporter: Jacek Wojciechowski >Priority: Critical > > I upgraded kafka-clients from 3.6.1 to 3.7.0 and discovered that my project > is not building anymore. > The reason is that kafka-clients-3.7.0.jar contains the following entry in > its JAR manifest file: > Class-Path: zstd-jni-1.5.5-6.jar lz4-java-1.8.0.jar snappy-java-1.1.10 > .5.jar slf4j-api-1.7.36.jar > I'm using Maven repo to keep my dependencies and those files are not in the > same directory as kafka-clients-3.7.0.jar, so the paths in the manifest's > Class-Path are not correct. It fails my build because we build with javac > with all linting options on, in particular -Xlint:-path. It produces the > following warnings coming from javac: > [WARNING] COMPILATION WARNING : > [INFO] - > [WARNING] [path] bad path element > "/home/ci/.m2/repository/org/apache/kafka/kafka-clients/3.7.0/zstd-jni-1.5.5-6.jar": > no such file or directory > [WARNING] [path] bad path element > "/home/ci/.m2/repository/org/apache/kafka/kafka-clients/3.7.0/lz4-java-1.8.0.jar": > no such file or directory > [WARNING] [path] bad path element > "/home/ci/.m2/repository/org/apache/kafka/kafka-clients/3.7.0/snappy-java-1.1.10.5.jar": > no such file or directory > [WARNING] [path] bad path element > "/home/ci/.m2/repository/org/apache/kafka/kafka-clients/3.7.0/slf4j-api-1.7.36.jar": > no such file or directory > Since we have also {{-Werror}} option enabled, it turns warnings into errors > and fails our build. 
> I think our setup is quite typical: using Maven repo to store dependencies, > having linting on and -Werror. Unfortunately, it doesn't work with the > latest kafka-clients because of the entries in the manifest's Class-Path. > And I think it might affect quite a lot of projects set up in a similar way. > I don't know what the reason was to add the Class-Path entry in the JAR manifest > file - but perhaps this effect was not considered. > It would be great if you removed the Class-Path entry from the JAR manifest > file. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16360) Release plan of 3.x kafka releases.
[ https://issues.apache.org/jira/browse/KAFKA-16360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-16360. - Resolution: Invalid Please don't use Jira to ask questions. Jira tickets are for bug reports and features only. Questions should be asked on the user and/or dev mailing lists: https://kafka.apache.org/contact > Release plan of 3.x kafka releases. > --- > > Key: KAFKA-16360 > URL: https://issues.apache.org/jira/browse/KAFKA-16360 > Project: Kafka > Issue Type: Improvement >Reporter: kaushik srinivas >Priority: Major > > KIP > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-833%3A+Mark+KRaft+as+Production+Ready#KIP833:MarkKRaftasProductionReady-ReleaseTimeline] > mentions: > h2. Kafka 3.7 > * January 2024 > * Final release with ZK mode > But we see in Jira, some tickets are marked for the 3.8 release. Does Apache > continue to make 3.x releases having zookeeper and kraft supported > independent of pure kraft 4.x releases ? > If yes, how many more releases can be expected on the 3.x release line ? > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16366) Refactor KTable source optimization
Matthias J. Sax created KAFKA-16366: --- Summary: Refactor KTable source optimization Key: KAFKA-16366 URL: https://issues.apache.org/jira/browse/KAFKA-16366 Project: Kafka Issue Type: Improvement Components: streams Reporter: Matthias J. Sax Kafka Streams DSL offers an optimization to re-use an input topic as table changelog, in favor of creating a dedicated changelog topic. So far, the Processor API did not support any such feature, and thus when the DSL compiles down into a Topology, we needed to access topology internal stuff to allow for this optimization. With KIP-813 (merged for AK 3.8), we added `Topology#addReadOnlyStateStore` as public API, and thus we should refactor the DSL compilation code, to use this public API to build the `Topology` instead of internal APIs. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-15576) Add 3.6.0 to broker/client and streams upgrade/compatibility tests
[ https://issues.apache.org/jira/browse/KAFKA-15576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-15576. - Resolution: Fixed > Add 3.6.0 to broker/client and streams upgrade/compatibility tests > -- > > Key: KAFKA-15576 > URL: https://issues.apache.org/jira/browse/KAFKA-15576 > Project: Kafka > Issue Type: Task >Reporter: Satish Duggana >Priority: Major > Fix For: 3.8.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16350) StateUpdater does not init transaction after canceling task close action
Matthias J. Sax created KAFKA-16350: --- Summary: StateUpdater does not init transaction after canceling task close action Key: KAFKA-16350 URL: https://issues.apache.org/jira/browse/KAFKA-16350 Project: Kafka Issue Type: Bug Components: streams Reporter: Matthias J. Sax With EOSv2, we use a thread producer shared across all tasks. We init tx on the producer with each _task_ (due to EOSv1 which uses a producer per task), and have a guard in place to only init tx a single time. If we hit an error, we close the producer and create a new one, which is still not initialized for transactions. At the same time, with the state updater, we schedule a "close task" action on error. For each task we get back, we cancel the "close task" action, to actually keep the task. If this happens for _all_ tasks, we don't have any task in state CREATED at hand, and thus we never init the producer for transactions, because we assume this was already done. On the first `send` request, we crash with an IllegalStateException: {code:java} Invalid transition attempted from state UNINITIALIZED to state IN_TRANSACTION {code} This bug is exposed via EOSIntegrationTest (logs attached). -- This message was sent by Atlassian Jira (v8.20.10#820010)
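The stale-guard problem can be illustrated with a toy model (field and method names are made up for illustration; the real logic lives in Streams' producer wrapper): the "already initialized" flag tracks the life-cycle of a particular producer instance, so replacing the producer without clearing the flag leaves the new producer uninitialized.

```java
// Toy model of the guard bug described in the ticket.
class TxProducerModel {
    private boolean guardInitDone = false;        // Streams-side "init tx once" guard
    private boolean producerTxInitialized = false; // tx state of the current producer

    void maybeInitTransactions() {
        if (!guardInitDone) {
            producerTxInitialized = true; // models producer.initTransactions()
            guardInitDone = true;
        }
    }

    // Error path: the producer is closed and a fresh one is created.
    // With resetGuard == false (the buggy behavior), the guard flag is left
    // set even though the new producer has no transaction state.
    void replaceProducerAfterError(boolean resetGuard) {
        producerTxInitialized = false;
        if (resetGuard) {
            guardInitDone = false;
        }
    }

    // First send: beginning a transaction on an uninitialized producer is the
    // illegal UNINITIALIZED -> IN_TRANSACTION transition from the stack trace.
    boolean firstSendFails() {
        maybeInitTransactions();
        return !producerTxInitialized;
    }
}
```

In the model, the fix is simply tying the guard to the producer instance, i.e. clearing it whenever the producer is replaced.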
[jira] [Resolved] (KAFKA-15417) JoinWindow does not seem to work properly with a KStream - KStream - LeftJoin()
[ https://issues.apache.org/jira/browse/KAFKA-15417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-15417. - Fix Version/s: 3.8.0 Resolution: Fixed > JoinWindow does not seem to work properly with a KStream - KStream - > LeftJoin() > > > Key: KAFKA-15417 > URL: https://issues.apache.org/jira/browse/KAFKA-15417 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 3.4.0 >Reporter: Victor van den Hoven >Assignee: Victor van den Hoven >Priority: Major > Fix For: 3.8.0 > > Attachments: Afbeelding 1-1.png, Afbeelding 1.png, > SimpleStreamTopology.java, SimpleStreamTopologyTest.java > > > In Kafka-streams 3.4.0 : > According to the javadoc of the JoinWindow: > _There are three different window configurations supported:_ > * _before = after = time-difference_ > * _before = 0 and after = time-difference_ > * _*before = time-difference and after = 0*_ > > However if I use a JoinWindow with *before = time-difference and after = 0* > on a kstream-kstream-leftjoin the *after=0* part does not seem to work. > When using _stream1.leftjoin(stream2, joinWindow)_ with > {_}joinWindow{_}.{_}after=0 and joinWindow.before=30s{_}, any new message on > stream 1 that cannot be joined with any messages on stream2 should be joined > with a null-record after the _joinWindow.after_ has ended and a new message > has arrived on stream1. > It does not. > Only if the new message arrives after the value of _joinWindow.before_ is the > previous message joined with a null-record. > > Attached you can find two files with a TopologyTestDriver unit test to > reproduce. > topology: stream1.leftjoin( stream2, joiner, joinwindow) > joinWindow has before=5000ms and after=0 > message1(key1) -> stream1 > after 4000ms message2(key2) -> stream1 -> NO null-record join was made, but > the after period had expired. > after 4900ms message2(key2) -> stream1 -> NO null-record join was made, but > the after period had expired. 
> after 5000ms message2(key2) -> stream1 -> A null-record join was made, > the before period had expired. > after 6000ms message2(key2) -> stream1 -> A null-record join was made, > the before period had expired. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
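The documented window bounds behind this report can be expressed as a small model (illustrative only; it encodes the javadoc contract, not the buggy emit logic): a left record at `t1` matches a right record at `t2` iff `t1 - before <= t2 <= t1 + after`, so with before=5000ms and after=0 the left-join result for a record should be final once stream time passes the record's own timestamp (plus grace), not 5000ms later.

```java
// Sketch of the JoinWindows contract as described in the javadoc quoted above.
class JoinWindowModel {
    final long beforeMs;
    final long afterMs;

    JoinWindowModel(long beforeMs, long afterMs) {
        this.beforeMs = beforeMs;
        this.afterMs = afterMs;
    }

    // A left record at leftTs joins a right record at rightTs iff the right
    // record falls into [leftTs - before, leftTs + after].
    boolean inWindow(long leftTs, long rightTs) {
        return leftTs - beforeMs <= rightTs && rightTs <= leftTs + afterMs;
    }

    // The earliest stream time at which no further right-side match can
    // arrive (ignoring grace): with after=0 this is the record's own
    // timestamp, which is when the null-record for a left join should
    // become emittable -- not leftTs + before, as the reproduction shows.
    long windowCloseTime(long leftTs) {
        return leftTs + afterMs;
    }
}
```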
[jira] [Resolved] (KAFKA-14747) FK join should record discarded subscription responses
[ https://issues.apache.org/jira/browse/KAFKA-14747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-14747. - Fix Version/s: 3.8.0 Resolution: Fixed > FK join should record discarded subscription responses > -- > > Key: KAFKA-14747 > URL: https://issues.apache.org/jira/browse/KAFKA-14747 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Ayoub Omari >Priority: Minor > Labels: beginner, newbie > Fix For: 3.8.0 > > > FK-joins are subject to a race condition: If the left-hand side record is > updated, a subscription is sent to the right-hand side (including a hash > value of the left-hand side record), and the right-hand side might send back > join responses (also including the original hash). The left-hand side only > processes the responses if the returned hash matches the current hash of the > left-hand side record, because a different hash implies that the left-hand > side record was updated in the meantime (including sending a new > subscription to the right-hand side), and thus the data is stale and the > response should not be processed (joining the response to the new record > could lead to incorrect results). > A similar thing can happen on a right-hand side update that triggers a > response that might be dropped if the left-hand side record was updated in > parallel. > While the behavior is correct, we don't record if this happens. We should > consider recording this using the existing "dropped record" sensor or maybe > add a new sensor. -- This message was sent by Atlassian Jira (v8.20.10#820010)
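The hash comparison and the proposed drop-recording can be sketched as follows (a hypothetical model; the real implementation sits in the FK-join response processing, and the sensor wiring here is only a placeholder):

```java
// Toy model of the staleness check in an FK-join subscription response.
class FkResponseCheck {
    static int droppedRecords = 0; // stand-in for a "dropped records" sensor

    // A response is only valid if the hash it carries still matches the
    // hash of the current left-hand side record; a mismatch means the LHS
    // record was updated in the meantime and the response is stale.
    static boolean shouldProcess(long hashInResponse, long currentLhsHash) {
        return hashInResponse == currentLhsHash;
    }

    // The ticket asks that the drop be recorded instead of being silent.
    static void onResponse(long hashInResponse, long currentLhsHash) {
        if (!shouldProcess(hashInResponse, currentLhsHash)) {
            droppedRecords++; // candidate place to hit the metrics sensor
        }
    }
}
```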
[jira] [Resolved] (KAFKA-10603) Re-design KStream.process() and K*.transform*() operations
[ https://issues.apache.org/jira/browse/KAFKA-10603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-10603. - Resolution: Fixed > Re-design KStream.process() and K*.transform*() operations > -- > > Key: KAFKA-10603 > URL: https://issues.apache.org/jira/browse/KAFKA-10603 > Project: Kafka > Issue Type: New Feature >Reporter: John Roesler >Priority: Major > Labels: needs-kip > > After the implementation of KIP-478, we have the ability to reconsider all > these APIs, and maybe just replace them with > {code:java} > // KStream > KStream process(ProcessorSupplier) > // KTable > KTable process(ProcessorSupplier){code} > > but it needs more thought and a KIP for sure. > > This ticket probably supersedes > https://issues.apache.org/jira/browse/KAFKA-8396 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16339) Remove Deprecated "transformer" methods and classes
Matthias J. Sax created KAFKA-16339: --- Summary: Remove Deprecated "transformer" methods and classes Key: KAFKA-16339 URL: https://issues.apache.org/jira/browse/KAFKA-16339 Project: Kafka Issue Type: Sub-task Components: streams Reporter: Matthias J. Sax Fix For: 4.0.0 Cf [https://cwiki.apache.org/confluence/display/KAFKA/KIP-820%3A+Extend+KStream+process+with+new+Processor+API] * KStream#transform * KStream#flatTransform * KStream#transformValues * KStream#flatTransformValues * and the corresponding Scala methods Related to https://issues.apache.org/jira/browse/KAFKA-12829, and both tickets should be worked on together. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16338) Remove Deprecated configs from StreamsConfig
Matthias J. Sax created KAFKA-16338: --- Summary: Remove Deprecated configs from StreamsConfig Key: KAFKA-16338 URL: https://issues.apache.org/jira/browse/KAFKA-16338 Project: Kafka Issue Type: Sub-task Components: streams Reporter: Matthias J. Sax Fix For: 5.0.0 * "buffered.records.per.partition" was deprecated via [https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=186878390] (KIP not fully implemented yet, so move this from the 4.0 into this 5.0 ticket) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16337) Remove Deprecated APIs of Kafka Streams in 5.0
Matthias J. Sax created KAFKA-16337: --- Summary: Remove Deprecated APIs of Kafka Streams in 5.0 Key: KAFKA-16337 URL: https://issues.apache.org/jira/browse/KAFKA-16337 Project: Kafka Issue Type: Task Components: streams, streams-test-utils Reporter: Matthias J. Sax Fix For: 5.0.0 This is an umbrella ticket that tries to collect all APIs under Kafka Streams that were deprecated in 3.6 or later. When the release schedule for 5.0 is set, we might need to remove sub-tasks if they don't hit the 1-year threshold. Each subtask will be focusing on a specific API, so it's easy to discuss if it should be removed by 5.0.0 or maybe even at a later point. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16336) Remove Deprecated metric standby-process-ratio
Matthias J. Sax created KAFKA-16336: --- Summary: Remove Deprecated metric standby-process-ratio Key: KAFKA-16336 URL: https://issues.apache.org/jira/browse/KAFKA-16336 Project: Kafka Issue Type: Sub-task Components: streams Reporter: Matthias J. Sax Fix For: 4.0.0 Metric "standby-process-ratio" was deprecated in 3.5 release via https://cwiki.apache.org/confluence/display/KAFKA/KIP-869%3A+Improve+Streams+State+Restoration+Visibility -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16335) Remove Deprecated method on StreamPartitioner
Matthias J. Sax created KAFKA-16335: --- Summary: Remove Deprecated method on StreamPartitioner Key: KAFKA-16335 URL: https://issues.apache.org/jira/browse/KAFKA-16335 Project: Kafka Issue Type: Sub-task Components: streams Reporter: Matthias J. Sax Fix For: 4.0.0 Deprecated in 3.4 release via [https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211883356] * StreamPartitioner#partition (singular) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16334) Remove Deprecated command line option from reset tool
Matthias J. Sax created KAFKA-16334: --- Summary: Remove Deprecated command line option from reset tool Key: KAFKA-16334 URL: https://issues.apache.org/jira/browse/KAFKA-16334 Project: Kafka Issue Type: Sub-task Components: streams, tools Reporter: Matthias J. Sax Fix For: 4.0.0 --bootstrap-server (singular) was deprecated in 3.4 release via [https://cwiki.apache.org/confluence/display/KAFKA/KIP-865%3A+Support+--bootstrap-server+in+kafka-streams-application-reset] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16333) Remove Deprecated methods KTable#join
Matthias J. Sax created KAFKA-16333: --- Summary: Remove Deprecated methods KTable#join Key: KAFKA-16333 URL: https://issues.apache.org/jira/browse/KAFKA-16333 Project: Kafka Issue Type: Sub-task Components: streams Reporter: Matthias J. Sax Fix For: 4.0.0 KTable#join() methods taking a `Named` parameter got deprecated in the 3.1 release via https://issues.apache.org/jira/browse/KAFKA-13813 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16332) Remove Deprecated builder methods for Time/Session/Join/SlidingWindows
Matthias J. Sax created KAFKA-16332: --- Summary: Remove Deprecated builder methods for Time/Session/Join/SlidingWindows Key: KAFKA-16332 URL: https://issues.apache.org/jira/browse/KAFKA-16332 Project: Kafka Issue Type: Sub-task Reporter: Matthias J. Sax Deprecated in 3.0: [https://cwiki.apache.org/confluence/display/KAFKA/KIP-633%3A+Deprecate+24-hour+Default+Grace+Period+for+Windowed+Operations+in+Streams] * TimeWindows#of * TimeWindows#grace * SessionWindows#with * SessionWindows#grace * JoinWindows#of * JoinWindows#grace * SlidingWindows#withTimeDifferenceAndGrace We might want to hold off on cleaning up JoinWindows due to https://issues.apache.org/jira/browse/KAFKA-13813 (open for discussion) -- This message was sent by Atlassian Jira (v8.20.10#820010)