[GitHub] [kafka] guozhangwang commented on pull request #8924: KAFKA-10198: guard against recycling dirty state
guozhangwang commented on pull request #8924: URL: https://github.com/apache/kafka/pull/8924#issuecomment-649199285

Merged to trunk and cherry-picked to 2.6

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] michael-carter-instaclustr commented on pull request #8844: KAFKA-9887 fix failed task or connector count on startup failure
michael-carter-instaclustr commented on pull request #8844: URL: https://github.com/apache/kafka/pull/8844#issuecomment-649196880

I've made changes to those tests now @C0urante. A couple of things worth noting: changing the mock of the WorkMetricsGroup to a real object needs a fair few more mocks that relate to each other, so I've organised those into a @Before method instead of injecting them via a @Mock annotation; I hope that's okay. Doing so had the benefit that the tests are now more explicit about the values of the metrics being recorded, which may make the deletion of lines in WorkerTest more palatable. Having a look at the tests I'm modifying there, most of them already have expectations on the statusListener being called, so I think in that case it's fair to say that the responsibility has simply moved to the WorkerGroupMetrics class (and therefore the WorkerGroupMetricsTest). For the test that doesn't seem to have any expectations on the status listener (testAddRemoveTask), I believe this might be because it's the WorkerTask's job to call the status listener once it's running, but that aspect is mocked away in that particular test.
[jira] [Commented] (KAFKA-10198) Dirty tasks may be recycled instead of closed
[ https://issues.apache.org/jira/browse/KAFKA-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144625#comment-17144625 ] Sophie Blee-Goldman commented on KAFKA-10198:

[~rhauch] the fix has been merged and picked to the 2.6 branch

> Dirty tasks may be recycled instead of closed
> ---------------------------------------------
>
> Key: KAFKA-10198
> URL: https://issues.apache.org/jira/browse/KAFKA-10198
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Reporter: Sophie Blee-Goldman
> Assignee: Sophie Blee-Goldman
> Priority: Blocker
> Fix For: 2.6.0
>
> We recently added a guard to `Task#closeClean` to make sure we don't accidentally clean-close a dirty task, but we forgot to also add this check to `Task#closeAndRecycleState`. This meant an otherwise dirty task could be closed clean and recycled into a new task when it should have just been closed.
> This manifested as an NPE in our test application. Specifically, task 1_0 was active on StreamThread-2 but reassigned as a standby. During handleRevocation we hit a TaskMigratedException while flushing the tasks and bailed on trying to flush and commit the remainder. This left task 1_0 with dirty keys in the suppression buffer and the `commitNeeded` flag still set to true.
> During handleAssignment, we should have closed all the tasks with pending state as dirty (i.e. any task with commitNeeded = true). Since we don't know about the TaskMigratedException we hit during handleRevocation, we rely on the guard in `Task#closeClean` to throw an exception and force the task to be closed dirty.
> Unfortunately, we left this guard out of `closeAndRecycleState`, which meant task 1_0 was able to slip through without being closed dirty. Once reinitialized as a standby task, we eventually tried to commit it. The suppression buffer of course tried to flush its remaining dirty keys from its previous life as an active task. But since it's now a standby task, it should not be sending anything to the changelog and has a null RecordCollector. We tried to access it, and hit the NPE.
>
> The fix is simple, we just need to add the guard in closeClean to closeAndRecycleState as well

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
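The guard described in this ticket can be sketched with a toy model. The class and method names below are illustrative stand-ins for Kafka Streams' internal Task interface, not the real implementation; only the shape of the check mirrors the fix.

```java
// Toy model of the guard: a task with uncommitted work (commitNeeded == true)
// must never be clean-closed OR recycled; both paths share the same check.
class TaskSketch {
    boolean commitNeeded = false;

    private void guardAgainstDirtyClose(String action) {
        if (commitNeeded) {
            throw new IllegalStateException(
                "Tried to " + action + " a task with uncommitted state");
        }
    }

    void closeClean() {
        guardAgainstDirtyClose("clean-close"); // guard that already existed
        // ... release resources cleanly
    }

    void closeAndRecycleState() {
        guardAgainstDirtyClose("recycle");     // the guard this fix adds
        // ... hand state stores over to the recycled (standby/active) task
    }
}
```

With the second guard in place, a task left dirty by a TaskMigratedException during handleRevocation throws here and gets closed dirty, instead of carrying stale suppression-buffer state into its standby incarnation.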
[GitHub] [kafka] abbccdda commented on a change in pull request #8712: KAFKA-10006: Don't create internal topics when LeaderNotAvailableException
abbccdda commented on a change in pull request #8712: URL: https://github.com/apache/kafka/pull/8712#discussion_r445277549

## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopicManager.java
@@ -179,31 +180,43 @@ public InternalTopicManager(final Admin adminClient, final StreamsConfig streams
 * Topics that were not able to get its description will simply not be returned
 */
 // visible for testing
-protected Map getNumPartitions(final Set topics) {
-log.debug("Trying to check if topics {} have been created with expected number of partitions.", topics);
-
-final DescribeTopicsResult describeTopicsResult = adminClient.describeTopics(topics);
+protected Map getNumPartitions(final Set topics,
+final HashSet tempUnknownTopics,
+final int remainingRetries) {

Review comment: We could just pass in a boolean here to indicate whether there are remaining retries

## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopicManager.java
@@ -98,9 +98,10 @@ public InternalTopicManager(final Admin adminClient, final StreamsConfig streams
 int remainingRetries = retries;
 Set topicsNotReady = new HashSet<>(topics.keySet());
 final Set newlyCreatedTopics = new HashSet<>();
+final HashSet tempUnknownTopics = new HashSet<>();

Review comment: s/HashSet/Set?

## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopicManager.java
@@ -243,10 +259,18 @@ public InternalTopicManager(final Admin adminClient, final StreamsConfig streams
 throw new StreamsException(errorMsg);
 }
 } else {
-topicsToCreate.add(topicName);
+// for the tempUnknownTopics, we'll check again later if retries > 0

Review comment: Could be merged with above `else`

## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopicManager.java
@@ -179,31 +180,43 @@ public InternalTopicManager(final Admin adminClient, final StreamsConfig streams
 * Topics that were not able to get its description will simply not be returned
 */
 // visible for testing
-protected Map getNumPartitions(final Set topics) {
-log.debug("Trying to check if topics {} have been created with expected number of partitions.", topics);
-
-final DescribeTopicsResult describeTopicsResult = adminClient.describeTopics(topics);
+protected Map getNumPartitions(final Set topics,
+final HashSet tempUnknownTopics,
+final int remainingRetries) {
+final Set allTopicsToDescribe = new HashSet<>(topics);
+allTopicsToDescribe.addAll(tempUnknownTopics);

Review comment: Why do we need `allTopicsToDescribe`? It seems only queried once locally.

## File path: streams/src/test/java/org/apache/kafka/streams/processor/internals/InternalTopicManagerTest.java
@@ -287,12 +291,41 @@ public void shouldLogWhenTopicNotFoundAndNotThrowException() {
 assertThat(
 appender.getMessages(),
-hasItem("stream-thread [" + threadName + "] Topic internal-topic is unknown or not found, hence not existed yet:" +
-" org.apache.kafka.common.errors.UnknownTopicOrPartitionException: Topic internal-topic not found.")
+hasItem("stream-thread [" + threadName + "] Topic internal-topic is unknown or not found, hence not existed yet.\n" +
+"Error message was: org.apache.kafka.common.errors.UnknownTopicOrPartitionException: Topic internal-topic not found.")
 );
 }
 }
+@Test
+public void shouldLogWhenTopicLeaderNotAvailableAndThrowException() {
+final String leaderNotAvailableTopic = "LeaderNotAvailableTopic";
+final AdminClient admin = EasyMock.createNiceMock(AdminClient.class);
+final InternalTopicManager topicManager = new InternalTopicManager(admin, new StreamsConfig(config));
+
+final KafkaFutureImpl topicDescriptionFailFuture = new KafkaFutureImpl<>();
+topicDescriptionFailFuture.completeExceptionally(new LeaderNotAvailableException("Leader Not Available!"));
+
+// simulate describeTopics got LeaderNotAvailableException
+EasyMock.expect(admin.describeTopics(Collections.singleton(leaderNotAvailableTopic)))
+.andReturn(new MockDescribeTopicsResult(

Review comment: Use 4 space format to align with other tests.

## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopicManager.java
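A hedged sketch of the retry behavior this review is discussing: topics whose describe call fails with a transient leader-not-available error get parked in a "temporarily unknown" set and re-checked on the next attempt, rather than being treated as missing and re-created. The class below is a simplified guess at that design, not the actual InternalTopicManager code; the cluster map stands in for the admin client's describeTopics response.

```java
import java.util.*;

class TopicCheckSketch {
    // Stand-in for describeTopics: partition count, or null when the
    // leader is (transiently) unavailable.
    private final Map<String, Integer> cluster;

    TopicCheckSketch(Map<String, Integer> cluster) { this.cluster = cluster; }

    Map<String, Integer> getNumPartitions(Set<String> topics,
                                          Set<String> tempUnknownTopics) {
        Map<String, Integer> result = new HashMap<>();
        Set<String> allTopicsToDescribe = new HashSet<>(topics);
        allTopicsToDescribe.addAll(tempUnknownTopics); // re-check parked topics too
        for (String topic : allTopicsToDescribe) {
            Integer partitions = cluster.get(topic);
            if (partitions == null) {
                tempUnknownTopics.add(topic);    // transient: retry later, don't recreate
            } else {
                tempUnknownTopics.remove(topic); // resolved on this pass
                result.put(topic, partitions);
            }
        }
        return result;
    }
}
```

The point of the separate set is exactly the reviewer's concern: a topic that exists but has no available leader must not be added to topicsToCreate, or it would be created a second time.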
[jira] [Updated] (KAFKA-10017) Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta
[ https://issues.apache.org/jira/browse/KAFKA-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-10017:

Priority: Critical (was: Blocker)

> Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta
> ---------------------------------------------------------------------------
>
> Key: KAFKA-10017
> URL: https://issues.apache.org/jira/browse/KAFKA-10017
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 2.6.0
> Reporter: Sophie Blee-Goldman
> Assignee: Matthias J. Sax
> Priority: Critical
> Labels: flaky-test, unit-test
> Fix For: 2.6.0
>
> Creating a new ticket for this since the root cause is different than https://issues.apache.org/jira/browse/KAFKA-9966
> With injectError = true:
> h3. Stacktrace
> java.lang.AssertionError: Did not receive all 20 records from topic multiPartitionOutputTopic within 6 ms
> Expected: is a value equal to or greater than <20>
> but: <15> was less than <20>
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$1(IntegrationTestUtils.java:563)
> at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:429)
> at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:397)
> at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:559)
> at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:530)
> at org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.readResult(EosBetaUpgradeIntegrationTest.java:973)
> at org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.verifyCommitted(EosBetaUpgradeIntegrationTest.java:961)
> at org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta(EosBetaUpgradeIntegrationTest.java:427)
[jira] [Commented] (KAFKA-10017) Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta
[ https://issues.apache.org/jira/browse/KAFKA-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144590#comment-17144590 ] Matthias J. Sax commented on KAFKA-10017:

The test is still subject to other bugs that we are currently working on, so it's hard to say atm. Feel free to cut an RC right away (I've updated the ticket to critical for now). However, if this test surfaces another bug, we might kill an RC.
[GitHub] [kafka] guozhangwang merged pull request #8924: KAFKA-10198: guard against recycling dirty state
guozhangwang merged pull request #8924: URL: https://github.com/apache/kafka/pull/8924
[GitHub] [kafka] ableegoldman commented on a change in pull request #8926: KAFKA-10166: always invoke `postCommit` before closing a task
ableegoldman commented on a change in pull request #8926: URL: https://github.com/apache/kafka/pull/8926#discussion_r445262784

## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java
@@ -813,27 +821,6 @@ void shutdown(final boolean clean) {
 tasksToCloseDirty.add(task);
 }
 }
-
-for (final Task task : tasksToCommit) {

Review comment: We can actually simplify the standby task shutdown a LOT
[GitHub] [kafka] ableegoldman commented on a change in pull request #8926: KAFKA-10166: always invoke `postCommit` before closing a task
ableegoldman commented on a change in pull request #8926: URL: https://github.com/apache/kafka/pull/8926#discussion_r445262369

## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java
@@ -242,18 +242,16 @@ public void handleAssignment(final Map> activeTasks,
 for (final Task task : tasksToClose) {
 try {
-task.suspend(); // Should be a no-op for active tasks since they're suspended in handleRevocation
-if (task.commitNeeded()) {
-if (task.isActive()) {
-log.error("Active task {} was revoked and should have already been committed", task.id());
-throw new IllegalStateException("Revoked active task was not committed during handleRevocation");

Review comment: This was another "sort-of bug": if we hit an exception in `handleRevocation` we wouldn't finish committing the active tasks, so `commitNeeded` could still be true. But of course, if we hit an exception earlier, we would have thrown it up to ConsumerCoordinator, which would only save the first exception, so this didn't really do anything.
[jira] [Reopened] (KAFKA-8197) Flaky Test kafka.server.DynamicBrokerConfigTest > testPasswordConfigEncoderSecretChange
[ https://issues.apache.org/jira/browse/KAFKA-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sophie Blee-Goldman reopened KAFKA-8197:

Saw this fail again:

kafka.server.DynamicBrokerConfigTest > testPasswordConfigEncoderSecretChange FAILED
16:33:33 org.junit.ComparisonFailure: expected:<[staticLoginModule required;]> but was:<[u`??2?e;%h>r/???8e
16:33:33 at org.junit.Assert.assertEquals(Assert.java:117)
16:33:33 at org.junit.Assert.assertEquals(Assert.java:146)
16:33:33 at kafka.server.DynamicBrokerConfigTest.testPasswordConfigEncoderSecretChange(DynamicBrokerConfigTest.scala:309)

> Flaky Test kafka.server.DynamicBrokerConfigTest > testPasswordConfigEncoderSecretChange
> ---------------------------------------------------------------------------------------
>
> Key: KAFKA-8197
> URL: https://issues.apache.org/jira/browse/KAFKA-8197
> Project: Kafka
> Issue Type: Improvement
> Components: core, unit tests
> Affects Versions: 1.1.1
> Reporter: Guozhang Wang
> Priority: Major
>
> {code}
> 09:18:23 kafka.server.DynamicBrokerConfigTest > testPasswordConfigEncoderSecretChange FAILED
> 09:18:23 org.junit.ComparisonFailure: expected:<[staticLoginModule required;]> but was:<[????O?i???A?c'??Ch?|?p]>
> 09:18:23 at org.junit.Assert.assertEquals(Assert.java:115)
> 09:18:23 at org.junit.Assert.assertEquals(Assert.java:144)
> 09:18:23 at kafka.server.DynamicBrokerConfigTest.testPasswordConfigEncoderSecretChange(DynamicBrokerConfigTest.scala:253)
> {code}
> https://builds.apache.org/job/kafka-pr-jdk7-scala2.11/13466/consoleFull
[GitHub] [kafka] ableegoldman commented on pull request #8924: KAFKA-10198: guard against recycling dirty state
ableegoldman commented on pull request #8924: URL: https://github.com/apache/kafka/pull/8924#issuecomment-649162622

Two unrelated test failures:
`MirrorConnectorsIntegrationTest.testReplication`
`DynamicBrokerConfigTest > testPasswordConfigEncoderSecretChange`
[GitHub] [kafka] ableegoldman opened a new pull request #8926: KAFKA-10166: always invoke `postCommit` before closing a task
ableegoldman opened a new pull request #8926: URL: https://github.com/apache/kafka/pull/8926

This should address at least some of the excessive TaskCorruptedExceptions we've been seeing lately. Basically, at the moment we only commit tasks if `commitNeeded` is true -- which seems right by definition. But the problem is we do some essential cleanup in `postCommit` that should always be done before a task is closed:
1. clear the PartitionGroup
2. write the checkpoint

Step 2 is actually fine to skip when `commitNeeded = false` with ALOS, as we will have already written a checkpoint during the last commit. But for EOS, we _only_ write the checkpoint before a close -- so even if there is no new pending data since the last commit, we have to write the current offsets. If we don't, the task will be assumed dirty and we will run into our friend the TaskCorruptedException during (re)initialization.

To fix this, we should just always call `prepareCommit` and `postCommit` at the TaskManager level. Within the task, it can decide whether or not to actually do something in those methods based on `commitNeeded`.
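The shutdown-path change this PR describes can be sketched as follows. This is a simplified model, not the real TaskManager/StreamTask code; all names and fields here are illustrative. The point is that prepareCommit/postCommit are invoked unconditionally, and the task itself decides how much work is needed, so the EOS checkpoint gets written even when `commitNeeded` is false.

```java
class StreamTaskSketch {
    boolean commitNeeded;
    final boolean eosEnabled;
    boolean checkpointWritten = false;
    boolean partitionGroupCleared = false;

    StreamTaskSketch(boolean commitNeeded, boolean eosEnabled) {
        this.commitNeeded = commitNeeded;
        this.eosEnabled = eosEnabled;
    }

    void prepareCommit() {
        if (commitNeeded) {
            // flush stores, stage offsets ...
        }
    }

    void postCommit() {
        partitionGroupCleared = true;      // must happen before every close
        if (commitNeeded || eosEnabled) {
            // under ALOS a checkpoint from the last commit already exists,
            // but under EOS the checkpoint is only ever written here,
            // so it must be written even with no new pending data
            checkpointWritten = true;
        }
        commitNeeded = false;
    }

    // TaskManager-level shutdown: always commit-then-close.
    static void closeTask(StreamTaskSketch task) {
        task.prepareCommit();
        task.postCommit();
        // task.closeClean() ...
    }
}
```

Skipping `postCommit` for a "clean" EOS task is exactly what left no checkpoint behind, making the task look corrupted on the next (re)initialization.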
[GitHub] [kafka] guozhangwang commented on pull request #8925: KAFKA-9974: Make produce-sync flush
guozhangwang commented on pull request #8925: URL: https://github.com/apache/kafka/pull/8925#issuecomment-649157606 test this
[GitHub] [kafka] guozhangwang opened a new pull request #8925: KAFKA-9974: Make produce-sync flush
guozhangwang opened a new pull request #8925: URL: https://github.com/apache/kafka/pull/8925

I cannot actually reproduce the failure locally, but by looking at the code I think there's an issue in `produceKeyValuesSynchronously`: when EOS is not enabled, we need to call `flush` to make sure all records are indeed sent "synchronously". If EOS is enabled, the `commitTxn` would flush the records already.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
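The described fix can be sketched against a stand-in producer interface (`ProducerSketch` here is a hypothetical stand-in for `org.apache.kafka.clients.producer.Producer`; the real helper lives in Streams' test utilities). Without EOS, `send()` is asynchronous, so the helper must `flush()` before returning to honor its "synchronously" contract; with EOS, `commitTransaction()` already flushes pending sends.

```java
import java.util.*;

interface ProducerSketch {
    void send(String key, String value);
    void flush();
    void commitTransaction();
}

class ProduceHelperSketch {
    static void produceKeyValuesSynchronously(ProducerSketch producer,
                                              List<Map.Entry<String, String>> records,
                                              boolean eosEnabled) {
        for (Map.Entry<String, String> record : records) {
            producer.send(record.getKey(), record.getValue()); // async under the hood
        }
        if (eosEnabled) {
            producer.commitTransaction(); // commitTxn flushes pending sends itself
        } else {
            producer.flush();             // the call this PR adds
        }
    }
}
```

Without the `flush()`, records could still sit in the producer's buffer when the helper returns, which would explain the flaky "missing records" assertion in the test.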
[GitHub] [kafka] guozhangwang removed a comment on pull request #8925: KAFKA-9974: Make produce-sync flush
guozhangwang removed a comment on pull request #8925: URL: https://github.com/apache/kafka/pull/8925#issuecomment-649157091 test this
[GitHub] [kafka] guozhangwang commented on pull request #8925: KAFKA-9974: Make produce-sync flush
guozhangwang commented on pull request #8925: URL: https://github.com/apache/kafka/pull/8925#issuecomment-649157235 test this
[GitHub] [kafka] guozhangwang commented on pull request #8925: KAFKA-9974: Make produce-sync flush
guozhangwang commented on pull request #8925: URL: https://github.com/apache/kafka/pull/8925#issuecomment-649157091
[jira] [Commented] (KAFKA-10173) BufferUnderflowException during Kafka Streams Upgrade
[ https://issues.apache.org/jira/browse/KAFKA-10173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144573#comment-17144573 ] Boyang Chen commented on KAFKA-10173:

cc [~rhauch]

> BufferUnderflowException during Kafka Streams Upgrade
> -----------------------------------------------------
>
> Key: KAFKA-10173
> URL: https://issues.apache.org/jira/browse/KAFKA-10173
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 2.5.0
> Reporter: Karsten Schnitter
> Assignee: John Roesler
> Priority: Blocker
> Labels: suppress
> Fix For: 2.6.0, 2.4.2, 2.5.1
>
> I migrated a Kafka Streams application from version 2.3.1 to 2.5.0. I followed the steps described in the upgrade guide and set the property {{migrate.from=2.3}}. On my dev system with just one running instance I got the following exception:
> {noformat}
> stream-thread [0-StreamThread-2] Encountered the following error during processing:
> java.nio.BufferUnderflowException: null
> at java.base/java.nio.HeapByteBuffer.get(Unknown Source)
> at java.base/java.nio.ByteBuffer.get(Unknown Source)
> at org.apache.kafka.streams.state.internals.BufferValue.extractValue(BufferValue.java:94)
> at org.apache.kafka.streams.state.internals.BufferValue.deserialize(BufferValue.java:83)
> at org.apache.kafka.streams.state.internals.InMemoryTimeOrderedKeyValueBuffer.restoreBatch(InMemoryTimeOrderedKeyValueBuffer.java:368)
> at org.apache.kafka.streams.processor.internals.CompositeRestoreListener.restoreBatch(CompositeRestoreListener.java:89)
> at org.apache.kafka.streams.processor.internals.StateRestorer.restore(StateRestorer.java:92)
> at org.apache.kafka.streams.processor.internals.StoreChangelogReader.processNext(StoreChangelogReader.java:350)
> at org.apache.kafka.streams.processor.internals.StoreChangelogReader.restore(StoreChangelogReader.java:94)
> at org.apache.kafka.streams.processor.internals.TaskManager.updateNewAndRestoringTasks(TaskManager.java:401)
> at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:779)
> at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:697)
> at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:670)
> {noformat}
> I figured out that this problem only occurs for stores where I use the suppress feature. If I rename the changelog topics during the migration, the problem will not occur.
[jira] [Commented] (KAFKA-10017) Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta
[ https://issues.apache.org/jira/browse/KAFKA-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144567#comment-17144567 ] Sophie Blee-Goldman commented on KAFKA-10017:

cc [~mjsax]
[jira] [Commented] (KAFKA-10166) Excessive TaskCorruptedException seen in testing
[ https://issues.apache.org/jira/browse/KAFKA-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144562#comment-17144562 ] Sophie Blee-Goldman commented on KAFKA-10166:

I think this should be a blocker, yes. It's a regression in 2.6 and causes Streams to unnecessarily rebuild state from the changelog, which can mean a very long stall.

One root cause just occurred to me while looking at some related code, which I'll open a PR for right away. I'm not sure it's the _only_ root cause, but I'll begin testing right away to see if it fixes the majority of the problem or not.

[~cadonna] do you want to split up this ticket? There are two kinds of TaskCorruptedException, both of which we see more than expected. It probably makes sense to look at these individually and in parallel. Can you look into the TaskCorruptedException thrown in StoreChangelogReader#restore? I'll investigate my theory for the exceptions thrown in ProcessorStateManager.

> Excessive TaskCorruptedException seen in testing
> ------------------------------------------------
>
> Key: KAFKA-10166
> URL: https://issues.apache.org/jira/browse/KAFKA-10166
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Reporter: Sophie Blee-Goldman
> Assignee: Bruno Cadonna
> Priority: Blocker
> Fix For: 2.6.0
>
> As the title indicates, long-running test applications with injected network "outages" seem to hit TaskCorruptedException more than expected.
> Seen occasionally on the ALOS application (~20 times in two days in one case, for example), and very frequently with EOS (many times per day)
[GitHub] [kafka] guozhangwang commented on pull request #8924: KAFKA-10198: guard against recycling dirty state
guozhangwang commented on pull request #8924: URL: https://github.com/apache/kafka/pull/8924#issuecomment-649137974 LGTM.
[GitHub] [kafka] junrao commented on a change in pull request #8479: KAFKA-9769: Finish operations for leaderEpoch-updated partitions up to point ZK Exception
junrao commented on a change in pull request #8479: URL: https://github.com/apache/kafka/pull/8479#discussion_r445234702

## File path: core/src/main/scala/kafka/server/ReplicaManager.scala
@@ -1556,6 +1557,11 @@ class ReplicaManager(val config: KafkaConfig,
 error(s"Error while making broker the follower for partition $partition with leader " +
 s"$newLeaderBrokerId in dir $dirOpt", e)
 responseMap.put(partition.topicPartition, Errors.KAFKA_STORAGE_ERROR)
+case e: ZooKeeperClientException =>

Review comment: It's probably better to do this in Partition.makeFollower() instead of here. That way, we only skip partitions that have incurred ZK error. Also, the same ZK exception can happen in Partition.makeLeader(). So, we want to do the same thing there as well.
[jira] [Commented] (KAFKA-10173) BufferUnderflowException during Kafka Streams Upgrade
[ https://issues.apache.org/jira/browse/KAFKA-10173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144485#comment-17144485 ] John Roesler commented on KAFKA-10173: -- Ok, I'm horrified and embarrassed to report that I know what the problem is, as well as the solution. I'll update my PR tomorrow, and I'll re-escalate this ticket to a blocker for 2.5.1 and 2.6.0. In a nutshell, although we have a version number on these changelog records so that we can deserialize old formats, I subtly changed the serialization format in 2.4.0 without updating the version number, so although the serialization format is incompatible between 2.3.1 and 2.5.0, they both claim to be at "version 2" (I was mistaken about this before). This mistake happens to also render the serialization compatibility test I had written to be ineffectual. And I've also just discovered that our system test that covers application upgrades had suffered an oversight that made it skip these versions. Needless to say, in addition to fixing this bug, I'm fixing the serialization test to avoid using the same code paths as the application, and I'm also revamping the upgrade system test to be sure we'll have a much more robust test going forward. Although this bug was introduced in 2.4.0, I'd still classify it as a blocker, since it's so severe (you simply can't upgrade the application until we fix it). Thanks for the excellently detailed report, and, again, my sincere apologies, -John > BufferUnderflowException during Kafka Streams Upgrade > - > > Key: KAFKA-10173 > URL: https://issues.apache.org/jira/browse/KAFKA-10173 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.5.0 >Reporter: Karsten Schnitter >Assignee: John Roesler >Priority: Major > Labels: suppress > Fix For: 2.5.1 > > > I migrated a Kafka Streams application from version 2.3.1 to 2.5.0. I > followed the steps described in the upgrade guide and set the property > {{migrate.from=2.3}}. 
On my dev system with just one running instance I got > the following exception: > {noformat} > stream-thread [0-StreamThread-2] Encountered the following error during > processing: > java.nio.BufferUnderflowException: null > at java.base/java.nio.HeapByteBuffer.get(Unknown Source) > at java.base/java.nio.ByteBuffer.get(Unknown Source) > at > org.apache.kafka.streams.state.internals.BufferValue.extractValue(BufferValue.java:94) > at > org.apache.kafka.streams.state.internals.BufferValue.deserialize(BufferValue.java:83) > at > org.apache.kafka.streams.state.internals.InMemoryTimeOrderedKeyValueBuffer.restoreBatch(InMemoryTimeOrderedKeyValueBuffer.java:368) > at > org.apache.kafka.streams.processor.internals.CompositeRestoreListener.restoreBatch(CompositeRestoreListener.java:89) > at > org.apache.kafka.streams.processor.internals.StateRestorer.restore(StateRestorer.java:92) > at > org.apache.kafka.streams.processor.internals.StoreChangelogReader.processNext(StoreChangelogReader.java:350) > at > org.apache.kafka.streams.processor.internals.StoreChangelogReader.restore(StoreChangelogReader.java:94) > at > org.apache.kafka.streams.processor.internals.TaskManager.updateNewAndRestoringTasks(TaskManager.java:401) > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:779) > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:697) > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:670) > {noformat} > I figured out, that this problem only occurs for stores, where I use the > suppress feature. If I rename the changelog topics during the migration, the > problem will not occur. -- This message was sent by Atlassian Jira (v8.3.4#803005)
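For context on the mechanism John describes: a version byte only protects compatibility if it is bumped whenever the byte layout changes. A minimal sketch of that discipline, using a toy format (not Kafka's actual changelog schema):

```java
import java.nio.ByteBuffer;

// Toy version-prefixed record format (NOT Kafka's actual changelog schema).
// The reader dispatches on the version byte, so the writer must bump that byte
// on every layout change. KAFKA-10173 happened because the layout changed while
// still claiming the old version number, so old records were parsed with the
// new layout and underflowed the buffer.
public class VersionedValue {

    public static ByteBuffer serializeV3(final long timestamp, final byte[] payload) {
        final ByteBuffer buf = ByteBuffer.allocate(1 + 8 + 4 + payload.length);
        buf.put((byte) 3);            // version byte: bump on every layout change
        buf.putLong(timestamp);
        buf.putInt(payload.length);
        buf.put(payload);
        buf.flip();
        return buf;
    }

    public static long deserializeTimestamp(final ByteBuffer buf) {
        final byte version = buf.get();
        switch (version) {
            case 3:
                return buf.getLong();  // layout known to be valid for version 3
            default:
                throw new IllegalArgumentException("Unknown version " + version);
        }
    }
}
```

Rejecting unknown versions up front turns a silent {{BufferUnderflowException}} deep in restoration into an immediate, diagnosable error.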
[jira] [Updated] (KAFKA-10173) BufferUnderflowException during Kafka Streams Upgrade
[ https://issues.apache.org/jira/browse/KAFKA-10173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Roesler updated KAFKA-10173: - Fix Version/s: 2.4.2 2.6.0
[jira] [Updated] (KAFKA-10173) BufferUnderflowException during Kafka Streams Upgrade
[ https://issues.apache.org/jira/browse/KAFKA-10173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Roesler updated KAFKA-10173: - Priority: Blocker (was: Major)
[jira] [Updated] (KAFKA-9313) Set default for client.dns.lookup to use_all_dns_ips
[ https://issues.apache.org/jira/browse/KAFKA-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-9313: - Fix Version/s: (was: 2.7.0) 2.6.0 > Set default for client.dns.lookup to use_all_dns_ips > > > Key: KAFKA-9313 > URL: https://issues.apache.org/jira/browse/KAFKA-9313 > Project: Kafka > Issue Type: Improvement > Components: clients >Reporter: Yeva Byzek >Assignee: Badai Aqrandista >Priority: Minor > Fix For: 2.6.0 > > > The default setting of the configuration parameter {{client.dns.lookup}} is > *not* {{use_all_dns_ips}} . Consequently, by default, if there are multiple > IP addresses and the first one fails, the connection will fail. > > It is desirable to change the default to be > {{client.dns.lookup=use_all_dns_ips}} for two reasons: > # reduce connection failure rates by > # users are often surprised that this is not already the default > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-9313) Set default for client.dns.lookup to use_all_dns_ips
[ https://issues.apache.org/jira/browse/KAFKA-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-9313: - Labels: need-kip (was: )
[jira] [Commented] (KAFKA-9313) Set default for client.dns.lookup to use_all_dns_ips
[ https://issues.apache.org/jira/browse/KAFKA-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144481#comment-17144481 ] Randall Hauch commented on KAFKA-9313: -- Actually, this looks like it was merged and this issue was simply not updated. So please ignore the previous comment.
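For reference, the behavior proposed as the new default can already be enabled explicitly on existing clients; a minimal client configuration might look like this (the broker address is a placeholder):

```properties
# Try every IP address returned by DNS before failing the connection
# (the default this ticket proposes; on older clients it must be set explicitly)
bootstrap.servers=broker.example.com:9092
client.dns.lookup=use_all_dns_ips
```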
[GitHub] [kafka] mjsax commented on pull request #8924: KAFKA-10198: guard against recycling dirty state
mjsax commented on pull request #8924: URL: https://github.com/apache/kafka/pull/8924#issuecomment-649120565 Retest this please. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (KAFKA-10198) Dirty tasks may be recycled instead of closed
[ https://issues.apache.org/jira/browse/KAFKA-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144472#comment-17144472 ] Randall Hauch commented on KAFKA-10198: --- Thanks, [~ableegoldman]. I agree this should be fixed in 2.6, so merge whenever the PR is ready and cherry-pick to the `2.6` branch. (I'm the 2.6 release manager.)
[jira] [Commented] (KAFKA-9381) kafka-streams-scala: Javadocs + Scaladocs not published on maven central
[ https://issues.apache.org/jira/browse/KAFKA-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144469#comment-17144469 ] Randall Hauch commented on KAFKA-9381: -- [~mumrah] can you share the changes you made to get the javadoc and scaladoc to build? We're running Gradle 6.5, which is supposed to have the fix for the {{MalformedURLException}}. > kafka-streams-scala: Javadocs + Scaladocs not published on maven central > > > Key: KAFKA-9381 > URL: https://issues.apache.org/jira/browse/KAFKA-9381 > Project: Kafka > Issue Type: Bug > Components: documentation, streams >Reporter: Julien Jean Paul Sirocchi >Assignee: Randall Hauch >Priority: Blocker > Fix For: 2.6.0 > > > As per title, empty (aside for MANIFEST, LICENCE and NOTICE) > javadocs/scaladocs jars on central for any version (kafka nor scala), e.g. > [http://repo1.maven.org/maven2/org/apache/kafka/kafka-streams-scala_2.12/2.3.1/] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [kafka] d8tltanc commented on pull request #8683: KAFKA-9893: Configurable TCP connection timeout and improve the initial metadata fetch
d8tltanc commented on pull request #8683: URL: https://github.com/apache/kafka/pull/8683#issuecomment-649113870 @rajinisivaram @dajac The test failures are caused by the connection state transition from `CONNECTING` to `CHECKING_API_VERSIONS`, and then to `CONNECTED`, instead of to `CONNECTED` directly. In this case, I should remove the node from the `connecting` HashSet when this transition happens. I've fixed this issue and updated the patch. Also, I've addressed the latest comments. Please let me know if you have more suggestions and whether we can re-run the Jenkins tests.
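The bookkeeping fix described above can be sketched roughly as follows (illustrative names, not the actual NetworkClient code): the node must leave the `connecting` set on any transition out of `CONNECTING`, including the intermediate `CHECKING_API_VERSIONS` state, not only on reaching `CONNECTED`.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only: a node is tracked in `connecting` solely while in
// the CONNECTING state, so it must be removed on ANY transition out of
// CONNECTING, including the intermediate CHECKING_API_VERSIONS state.
public class ConnectingTracker {

    public enum State { CONNECTING, CHECKING_API_VERSIONS, CONNECTED, DISCONNECTED }

    private final Set<String> connecting = new HashSet<>();

    public void beginConnect(final String nodeId) {
        connecting.add(nodeId);
    }

    public void transitionTo(final String nodeId, final State newState) {
        if (newState != State.CONNECTING) {
            connecting.remove(nodeId);  // covers CHECKING_API_VERSIONS as well
        }
    }

    public boolean isConnecting(final String nodeId) {
        return connecting.contains(nodeId);
    }
}
```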
[GitHub] [kafka] ableegoldman commented on a change in pull request #8924: KAFKA-10198: guard against recycling dirty state
ableegoldman commented on a change in pull request #8924: URL: https://github.com/apache/kafka/pull/8924#discussion_r445212765 ## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamTask.java ## @@ -515,17 +517,20 @@ private void writeCheckpoint() { stateMgr.checkpoint(checkpointableOffsets()); } -/** - * You must commit a task and checkpoint the state manager before closing as this will release the state dir lock - */ -private void close(final boolean clean) { -if (clean && commitNeeded) { -// It may be that we failed to commit a task during handleRevocation, but "forgot" this and tried to -// closeClean in handleAssignment. We should throw if we detect this to force the TaskManager to closeDirty +private void validateClean() { +// It may be that we failed to commit a task during handleRevocation, but "forgot" this and tried to +// closeClean in handleAssignment. We should throw if we detect this to force the TaskManager to closeDirty +if (commitNeeded) { log.debug("Tried to close clean but there was pending uncommitted data, this means we failed to" + " commit and should close as dirty instead"); throw new TaskMigratedException("Tried to close dirty task as clean"); } +} + +/** + * You must commit a task and checkpoint the state manager before closing as this will release the state dir lock + */ +private void close(final boolean clean) { Review comment: This diff turned out a bit awkward, basically I just factored this check out into a separate method that we should call at the beginning of both flavors of clean close
[GitHub] [kafka] ableegoldman opened a new pull request #8924: KAFKA-10198: guard against recycling dirty state
ableegoldman opened a new pull request #8924: URL: https://github.com/apache/kafka/pull/8924 We just needed to add the check in `StreamTask#closeClean` to `closeAndRecycleState` as well. Also renamed `closeAndRecycleState` to `closeCleanAndRecycleState` to drive this point home: it needs to be clean.
[jira] [Created] (KAFKA-10198) Dirty tasks may be recycled instead of closed
Sophie Blee-Goldman created KAFKA-10198: --- Summary: Dirty tasks may be recycled instead of closed Key: KAFKA-10198 URL: https://issues.apache.org/jira/browse/KAFKA-10198 Project: Kafka Issue Type: Bug Components: streams Reporter: Sophie Blee-Goldman Assignee: Sophie Blee-Goldman Fix For: 2.6.0 We recently added a guard to `Task#closeClean` to make sure we don't accidentally clean-close a dirty task, but we forgot to also add this check to `Task#closeAndRecycleState`. This meant an otherwise dirty task could be closed clean and recycled into a new task when it should have just been closed. This manifested as an NPE in our test application. Specifically, task 1_0 was active on StreamThread-2 but reassigned as a standby. During handleRevocation we hit a TaskMigratedException while flushing the tasks and bailed on trying to flush and commit the remainder. This left task 1_0 with dirty keys in the suppression buffer and the `commitNeeded` flag still set to true. During handleAssignment, we should have closed all the tasks with pending state as dirty (i.e. any task with commitNeeded = true). Since we don't know about the TaskMigratedException we hit during handleRevocation, we rely on the guard in `Task#closeClean` to throw an exception and force the task to be closed dirty. Unfortunately, we left this guard out of `closeAndRecycleState`, which meant task 1_0 was able to slip through without being closed dirty. Once reinitialized as a standby task, we eventually tried to commit it. The suppression buffer of course tried to flush its remaining dirty keys from its previous life as an active task. But since it's now a standby task, it should not be sending anything to the changelog and has a null RecordCollector. We tried to access it, and hit the NPE. The fix is simple: we just need to add the guard in `closeClean` to `closeAndRecycleState` as well.
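The fix described in the ticket can be sketched as follows (simplified, with hypothetical class and exception names rather than Kafka's actual StreamTask): both clean-close paths share one guard that rejects a task with uncommitted state.

```java
// Simplified sketch of the KAFKA-10198 fix; SketchTask and IllegalStateException
// stand in for Kafka's StreamTask and TaskMigratedException.
public class SketchTask {

    private final boolean commitNeeded;

    public SketchTask(final boolean commitNeeded) {
        this.commitNeeded = commitNeeded;
    }

    // Factored-out guard, called at the start of every clean-close flavor
    private void validateClean() {
        if (commitNeeded) {
            throw new IllegalStateException("Tried to close dirty task as clean");
        }
    }

    public void closeClean() {
        validateClean();
        // ... commit, checkpoint, release the state dir lock
    }

    public void closeCleanAndRecycleState() {
        validateClean();  // the guard KAFKA-10198 adds to the recycle path
        // ... hand the state over to the recycled (e.g. standby) task
    }
}
```

With the guard in both paths, a task left dirty by a failed commit is forced through the dirty-close path instead of being silently recycled.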
[GitHub] [kafka] ning2008wisc commented on pull request #7577: KAFKA-9076: support consumer offset sync across clusters in MM 2.0
ning2008wisc commented on pull request #7577: URL: https://github.com/apache/kafka/pull/7577#issuecomment-649092864 bump for attention @mimaison ^ Given that https://issues.apache.org/jira/browse/KAFKA-9076 has slipped to the next release (2.7.0) and some people may already be testing/using this feature, I hope it is possible to revisit this PR soon so that it can formally be part of Kafka. Thanks
[GitHub] [kafka] vvcephei commented on pull request #8881: KIP-557: Add emit on change support to Kafka Streams
vvcephei commented on pull request #8881: URL: https://github.com/apache/kafka/pull/8881#issuecomment-649089312 Hey @ConcurrencyPractitioner, I'm sorry for the delay. I started to look at it, but got caught up in stabilizing the 2.6.0 and 2.5.1 releases. I'll get you a review ASAP.
[GitHub] [kafka] andrewchoi5 commented on pull request #8479: KAFKA-9769: Finish operations for leaderEpoch-updated partitions up to point ZK Exception
andrewchoi5 commented on pull request #8479: URL: https://github.com/apache/kafka/pull/8479#issuecomment-649082464 Hello @junrao @hachikuji -- I have made some updates to address the comments. Thanks!
[GitHub] [kafka] mjsax commented on a change in pull request #8902: KAFKA-10179: Pass correct changelog topic to state serdes
mjsax commented on a change in pull request #8902: URL: https://github.com/apache/kafka/pull/8902#discussion_r445179411 ## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/ProcessorStateManager.java ## @@ -578,4 +577,10 @@ private StateStoreMetadata findStore(final TopicPartition changelogPartition) { return found.isEmpty() ? null : found.get(0); } + +@Override +public TopicPartition changelogTopicPartitionFor(final String storeName) { +final StateStoreMetadata storeMetadata = stores.get(storeName); +return storeMetadata == null ? null : storeMetadata.changelogPartition; Review comment: From my understanding `storeMetadata` should only be `null` if the store was not registered? Thus it seems to indicate a bug?
[jira] [Commented] (KAFKA-9846) Race condition can lead to severe lag underestimate for active tasks
[ https://issues.apache.org/jira/browse/KAFKA-9846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144398#comment-17144398 ] Sophie Blee-Goldman commented on KAFKA-9846: This is definitely a limitation of the current Affects Version/Fix Version system – this actually is fixed in 2.6.0, but has not been fixed in 2.5.0 (hence the ticket is unresolved). That said, to avoid interfering with the release process I think we can leave it as is for now and then put 2.6.0 back on the fix version once it's released so that users know this doesn't affect 2.6+ > Race condition can lead to severe lag underestimate for active tasks > > > Key: KAFKA-9846 > URL: https://issues.apache.org/jira/browse/KAFKA-9846 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.5.0 >Reporter: Sophie Blee-Goldman >Priority: Critical > Fix For: 2.7.0 > > > In KIP-535 we added the ability to query still-restoring and standby tasks. > To give users control over how out of date the data they fetch can be, we > added an API to KafkaStreams that fetches the end offsets for all changelog > partitions and computes the lag for each local state store. > During this lag computation, we check whether an active task is in RESTORING > and calculate the actual lag if so. If not, we assume it's in RUNNING and > return a lag of zero. However, tasks may be in other states besides running > and restoring; notably they first pass through the CREATED state before > getting to RESTORING. A CREATED task may happen to be caught-up to the end > offset, but in many cases it is likely to be lagging or even completely > uninitialized. > This introduces a race condition where users may be led to believe that a > task has zero lag and is "safe" to query even with the strictest correctness > guarantees, while the task is actually lagging by some unknown amount. 
> During transfer of ownership of the task between different threads on the > same machine, tasks can actually spend a while in CREATED while the new owner > waits to acquire the task directory lock. So, this race condition may not be > particularly rare in multi-threaded Streams applications -- This message was sent by Atlassian Jira (v8.3.4#803005)
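A conservative version of the lag computation described above might look like this sketch (illustrative names, not the actual KafkaStreams API): only RESTORING tasks get a computed lag, RUNNING is assumed caught up, and any other state such as CREATED reports maximal lag rather than zero.

```java
// Illustrative sketch only. Treating every non-RESTORING active task as caught
// up is the bug described above: a CREATED task may be arbitrarily far behind,
// so report an unknown/maximal lag for it instead of zero.
public class LagEstimator {

    public enum TaskState { CREATED, RESTORING, RUNNING }

    public static long lagFor(final TaskState state, final long endOffset, final long currentOffset) {
        switch (state) {
            case RUNNING:
                return 0L;                      // actively processing: caught up
            case RESTORING:
                return Math.max(0L, endOffset - currentOffset);
            default:
                return Long.MAX_VALUE;          // CREATED etc.: lag unknown, assume worst
        }
    }
}
```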
[jira] [Commented] (KAFKA-10068) Verify HighAvailabilityTaskAssignor performance with large clusters and topologies
[ https://issues.apache.org/jira/browse/KAFKA-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144394#comment-17144394 ] Sophie Blee-Goldman commented on KAFKA-10068: - Moved the fix version back to 2.7.0. I don't think it's critical to have this in 2.6.0, so long as we have the test itself and can verify that it doesn't expose a problem. > Verify HighAvailabilityTaskAssignor performance with large clusters and > topologies > -- > > Key: KAFKA-10068 > URL: https://issues.apache.org/jira/browse/KAFKA-10068 > Project: Kafka > Issue Type: Task > Components: streams >Affects Versions: 2.6.0 >Reporter: John Roesler >Assignee: Sophie Blee-Goldman >Priority: Blocker > Fix For: 2.7.0 > > > While reviewing [https://github.com/apache/kafka/pull/8668/files,] I realized > that we should have a similar test to make sure that the new task assignor > completes well within the default assignment deadline. 30 seconds is a good > upper bound.
[jira] [Updated] (KAFKA-10068) Verify HighAvailabilityTaskAssignor performance with large clusters and topologies
[ https://issues.apache.org/jira/browse/KAFKA-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sophie Blee-Goldman updated KAFKA-10068: Fix Version/s: (was: 2.6.0) 2.7.0
[jira] [Commented] (KAFKA-9076) MirrorMaker 2.0 automated consumer offset sync
[ https://issues.apache.org/jira/browse/KAFKA-9076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144388#comment-17144388 ] Ning Zhang commented on KAFKA-9076: --- [~rhauch] It is true that this jira is not a blocking factor for the 2.6.0 release, and thank you for checking it. Given that the PR has been proposed for over 6 months and some users are testing it in different use cases, I would hope the committers ([~mimaison]) and other reviewers may take another look at https://github.com/apache/kafka/pull/7577 and see if we can iterate faster, so that it can formally be part of Kafka in the 2.7.0 release. > MirrorMaker 2.0 automated consumer offset sync > -- > > Key: KAFKA-9076 > URL: https://issues.apache.org/jira/browse/KAFKA-9076 > Project: Kafka > Issue Type: Improvement > Components: mirrormaker >Affects Versions: 2.4.0 >Reporter: Ning Zhang >Assignee: Ning Zhang >Priority: Major > Labels: mirrormaker, pull-request-available > Fix For: 2.7.0 > > > To calculate the translated consumer offset in the target cluster, currently > `Mirror-client` provides a function called "remoteConsumerOffsets()" that is > used by "RemoteClusterUtils" for one-time purpose. > In order to make the consumer and stream applications migrate from source to > target cluster transparently and conveniently, e.g. in event of source > cluster failure, a background job is proposed to periodically sync the > consumer offsets from the source to target cluster, so that when the consumer > and stream applications switch to the target cluster, it will resume to > consume from where it left off at source cluster. > KIP: > https://cwiki.apache.org/confluence/display/KAFKA/KIP-545%3A+support+automated+consumer+offset+sync+across+clusters+in+MM+2.0 > [https://github.com/apache/kafka/pull/7577]
[jira] [Commented] (KAFKA-4740) Using new consumer API with a Deserializer that throws SerializationException can lead to infinite loop
[ https://issues.apache.org/jira/browse/KAFKA-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144384#comment-17144384 ] Patrick Taylor commented on KAFKA-4740: --- I agree with [Andrea's comment on 12/Dec/18|#comment-16719006] that a better solution is a customizable exception handler, so the client can choose whether/how to log and skip unserializable records, or throw an exception as it does now, etc. This would avoid the complexities discussed in earlier comments. > Using new consumer API with a Deserializer that throws SerializationException > can lead to infinite loop > --- > > Key: KAFKA-4740 > URL: https://issues.apache.org/jira/browse/KAFKA-4740 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Affects Versions: 0.9.0.0, 0.9.0.1, 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1 > Environment: Kafka broker 0.10.1.1 (but this bug is not dependent on > the broker version) > Kafka clients 0.9.0.0, 0.9.0.1, 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1 >Reporter: Sébastien Launay >Assignee: Sébastien Launay >Priority: Critical > > The old consumer supports deserializing records into typed objects and throws > a {{SerializationException}} through {{MessageAndMetadata#key()}} and > {{MessageAndMetadata#message()}} that can be caught by the client \[1\]. > When using the new consumer API with kafka-clients version < 0.10.0.1, such > an exception is swallowed by the {{NetworkClient}} class and results in an > infinite loop that the client has no control over, like: > {noformat} > DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - Resetting offset > for partition test2-0 to earliest offset. 
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - Fetched offset 0 for partition test2-0
> ERROR org.apache.kafka.clients.NetworkClient - Uncaught error in request completion:
> org.apache.kafka.common.errors.SerializationException: Size of data received by IntegerDeserializer is not 4
> ERROR org.apache.kafka.clients.NetworkClient - Uncaught error in request completion:
> org.apache.kafka.common.errors.SerializationException: Size of data received by IntegerDeserializer is not 4
> ...
> {noformat}
> Thanks to KAFKA-3977, this has been partially fixed in 0.10.1.0, but another issue remains.
> The client can now catch the {{SerializationException}}, but the next call to {{Consumer#poll(long)}} will throw the same exception indefinitely.
> The following snippet (full example available on GitHub \[2\] for most released kafka-clients versions):
> {code:java}
> try (KafkaConsumer<String, Integer> kafkaConsumer = new KafkaConsumer<>(consumerConfig,
>         new StringDeserializer(), new IntegerDeserializer())) {
>     kafkaConsumer.subscribe(Arrays.asList("topic"));
>     // Will run till the shutdown hook is called
>     while (!doStop) {
>         try {
>             ConsumerRecords<String, Integer> records = kafkaConsumer.poll(1000);
>             if (!records.isEmpty()) {
>                 logger.info("Got {} messages", records.count());
>                 for (ConsumerRecord<String, Integer> record : records) {
>                     logger.info("Message with partition: {}, offset: {}, key: {}, value: {}",
>                             record.partition(), record.offset(), record.key(), record.value());
>                 }
>             } else {
>                 logger.info("No messages to consume");
>             }
>         } catch (SerializationException e) {
>             logger.warn("Failed polling some records", e);
>         }
>     }
> }
> {code}
> when run with the following records (the third record has an invalid Integer value):
> {noformat}
> printf "\x00\x00\x00\x00\n" | bin/kafka-console-producer.sh --broker-list localhost:9092 --topic topic
> printf "\x00\x00\x00\x01\n" | bin/kafka-console-producer.sh --broker-list localhost:9092 --topic topic
> printf "\x00\x00\x00\n" | bin/kafka-console-producer.sh --broker-list localhost:9092 --topic topic
> printf "\x00\x00\x00\x02\n" | bin/kafka-console-producer.sh --broker-list localhost:9092 --topic topic
> {noformat}
> will produce the following logs:
> {noformat}
> INFO consumer.Consumer - Got 2 messages
> INFO consumer.Consumer - Message with partition: 0, offset: 0, key: null, value: 0
> INFO consumer.Consumer - Message with partition: 0, offset: 1, key: null, value: 1
> WARN consumer.Consumer - Failed polling some records
> org.apache.kafka.common.errors.SerializationException: Error deserializing key/value for partition topic-0 at offset 2
> Caused by: org.apache.kafka.common.errors.SerializationException: Size of data received by IntegerDeserializer is not 4
> WARN consumer.Consumer - Failed polling some records
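The loop above can only make progress past the poison record by seeking beyond the failing offset. In these client versions the failing topic/partition/offset appear only in the exception message text, so one workaround is to parse them out and then call `consumer.seek(new TopicPartition(topic, partition), offset + 1)`. A minimal sketch of just the parsing step, in plain Java; the class name and the assumed message format are illustrative, taken from the log output quoted above:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Workaround sketch for the endless-SerializationException loop described above:
// parse the failing topic/partition/offset out of the exception message so the
// application can seek past the poison record. The message format is assumed
// from the "Error deserializing key/value for partition topic-0 at offset 2"
// line in the logs; the actual seek() call is omitted.
public class PoisonRecordLocator {

    private static final Pattern LOCATION =
            Pattern.compile("partition (\\S+)-(\\d+) at offset (\\d+)");

    /** Returns {topic, partition, offset} parsed from the message, or null if absent. */
    public static String[] locate(String exceptionMessage) {
        Matcher m = LOCATION.matcher(exceptionMessage);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] loc = locate(
                "Error deserializing key/value for partition topic-0 at offset 2");
        System.out.println(loc[0] + "/" + loc[1] + "@" + loc[2]); // prints topic/0@2
    }
}
```

Relying on message text is brittle, which is exactly why later KIPs added a structured `RecordDeserializationException`; until then, this kind of parsing is the only way to resume from application code.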
[jira] [Commented] (KAFKA-9861) Process Simplification - Community Validation of Kafka Release Candidates
[ https://issues.apache.org/jira/browse/KAFKA-9861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144383#comment-17144383 ] Randall Hauch commented on KAFKA-9861: -- Since this is not a blocker issue, as release manager I'm trying to complete the 2.6.0 release. I'm removing `2.6.0` from the fix version and replacing with the next releases, `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Process Simplification - Community Validation of Kafka Release Candidates > - > > Key: KAFKA-9861 > URL: https://issues.apache.org/jira/browse/KAFKA-9861 > Project: Kafka > Issue Type: Improvement > Components: build, documentation, system tests >Affects Versions: 2.6.0, 2.4.2, 2.5.1 > Environment: Linux, Java 8/11, Scala 2.x >Reporter: Israel Ekpo >Assignee: Israel Ekpo >Priority: Minor > Fix For: 2.4.2, 2.5.1, 2.7.0, 2.6.1 > > > When new KAFKA release candidates are published and there is a solicitation > for the community to get involved in testing and verifying the release > candidates, it would be great to have the test process thoroughly documented > for newcomers to participate effectively. > For new contributors, this can be very daunting and it would be great to have > this process clearly documented in a way that lowers the level of effort > necessary to get started. > The goal of this task is to create the documentation and supporting artifacts > that would make this goal a reality. > Going forward for future releases, it would be great to have the link to this > documentation included in the RC announcements so that the community > (especially end users) can help test and participate in the voting process > effectively. 
> These are the items that I believe should be included in this documentation > * How to set up test environment for unit and functional tests > * Java version(s) needed for the tests > * Scala version(s) needed for the tests > * Gradle version needed > * Sample script for running sanity checks and unit tests > * Sample Helm Charts for running all the basic components on a Kubernetes cluster > * Sample Ansible Script for running all the basic components on Virtual > Machines > The first 4 items will be part of the documentation that shows how to install > these dependencies in a Linux VM. The 5th item is a script that will download > PGP keys, check signatures, validate checksums and run unit/integration > tests. The 6th item is a Helm chart with basic components necessary to > validate critical components in the ecosystem (Zookeeper, Brokers, Streams > etc) within a Kubernetes cluster. The last item is similar to the 6th item > but installs these components on virtual machines instead. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-9861) Process Simplification - Community Validation of Kafka Release Candidates
[ https://issues.apache.org/jira/browse/KAFKA-9861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-9861: - Fix Version/s: (was: 2.6.0) 2.6.1 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-9861) Process Simplification - Community Validation of Kafka Release Candidates
[ https://issues.apache.org/jira/browse/KAFKA-9861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-9861: - Fix Version/s: 2.7.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-9313) Set default for client.dns.lookup to use_all_dns_ips
[ https://issues.apache.org/jira/browse/KAFKA-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-9313: - Fix Version/s: (was: 2.6.0) 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Set default for client.dns.lookup to use_all_dns_ips > > > Key: KAFKA-9313 > URL: https://issues.apache.org/jira/browse/KAFKA-9313 > Project: Kafka > Issue Type: Improvement > Components: clients >Reporter: Yeva Byzek >Assignee: Badai Aqrandista >Priority: Minor > Fix For: 2.7.0 > > > The default setting of the configuration parameter {{client.dns.lookup}} is > *not* {{use_all_dns_ips}}. Consequently, by default, if there are multiple > IP addresses and the first one fails, the connection will fail. > > It is desirable to change the default to be > {{client.dns.lookup=use_all_dns_ips}} for two reasons: > # reduce connection failure rates by trying all resolved IP addresses instead of only the first > # users are often surprised that this is not already the default > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
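Until the default changes, the behavior proposed in this ticket can be opted into explicitly. A minimal client configuration sketch (the property name and value are exactly what the ticket discusses; the bootstrap address is a placeholder):

```properties
# producer.properties / consumer.properties (sketch)
bootstrap.servers=localhost:9092
# Try every IP address returned by DNS instead of failing after the first one
client.dns.lookup=use_all_dns_ips
```

This matters most for bootstrap endpoints backed by multiple A records, e.g. load-balanced or multi-broker DNS names.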
[jira] [Updated] (KAFKA-9587) Producer configs are omitted in the documentation
[ https://issues.apache.org/jira/browse/KAFKA-9587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-9587: - Fix Version/s: (was: 2.6.0) 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Producer configs are omitted in the documentation > - > > Key: KAFKA-9587 > URL: https://issues.apache.org/jira/browse/KAFKA-9587 > Project: Kafka > Issue Type: Improvement > Components: clients, documentation >Affects Versions: 2.4.0 >Reporter: Dongjin Lee >Assignee: Dongjin Lee >Priority: Minor > Fix For: 2.7.0 > > > As of 2.4, [the KafkaProducer > documentation|https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html] > states: > {quote}If the request fails, the producer can automatically retry, though > since we have specified retries as 0 it won't. > {quote} > {quote}... in the code snippet above, likely all 100 records would be sent in > a single request since we set our linger time to 1 millisecond. 
> {quote} > However, the code snippet (below) does not include any configuration of > '{{retries}}' or '{{linger.ms}}': > {quote}Properties props = new Properties(); > props.put("bootstrap.servers", "localhost:9092"); > props.put("acks", "all"); > props.put("key.serializer", > "org.apache.kafka.common.serialization.StringSerializer"); > props.put("value.serializer", > "org.apache.kafka.common.serialization.StringSerializer"); > {quote} > The same documentation in [version > 2.0|https://kafka.apache.org/20/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html] > includes the configs; however, > [2.1|https://kafka.apache.org/21/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html] > only includes '{{linger.ms}}' and > [2.2|https://kafka.apache.org/22/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html] > includes none. It seems the configs were removed somewhere between those releases. -- This message was sent by Atlassian Jira (v8.3.4#803005)
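The quoted javadoc prose only makes sense if the snippet also sets the two properties it discusses. A plausible reconstruction of the missing lines, with `retries` and `linger.ms` re-added; the values come from the quoted prose ("retries as 0", "linger time to 1 millisecond"), not from any current javadoc:

```java
import java.util.Properties;

// Reconstructed producer config snippet: the two properties the surrounding
// javadoc text refers to are re-added alongside the lines that survived.
public class ProducerConfigSnippet {

    public static Properties props() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("acks", "all");
        props.put("retries", 0);   // "since we have specified retries as 0 it won't"
        props.put("linger.ms", 1); // "we set our linger time to 1 millisecond"
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(props().get("linger.ms")); // prints 1
    }
}
```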
[jira] [Updated] (KAFKA-10038) ConsumerPerformance.scala supports the setting of client.id
[ https://issues.apache.org/jira/browse/KAFKA-10038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-10038: -- Fix Version/s: (was: 2.6.0) 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > ConsumerPerformance.scala supports the setting of client.id > --- > > Key: KAFKA-10038 > URL: https://issues.apache.org/jira/browse/KAFKA-10038 > Project: Kafka > Issue Type: Improvement > Components: consumer, core >Affects Versions: 2.1.1 > Environment: Trunk branch >Reporter: tigertan >Assignee: Luke Chen >Priority: Minor > Labels: newbie, performance > Fix For: 2.7.0 > > > ConsumerPerformance.scala supports the setting of "client.id", which is a > reasonable requirement, and the way "console consumer" and "console producer" > handle "client.id" can be unified. "client.id" defaults to > "perf-consumer-client". > We often use client.id in quotas, if the script of > kafka-producer-perf-test.sh supports the setting of "client.id" , we can do > quota testing through scripts without writing our own consumer programs. -- This message was sent by Atlassian Jira (v8.3.4#803005)
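As a stopgap for the quota-testing use case described above, client overrides can typically be supplied to the perf tool via a config file rather than a dedicated flag. A sketch, assuming the tool's `--consumer.config` option (the file name and client id here are illustrative):

```properties
# perf-client.properties (sketch) -- passed as:
#   bin/kafka-consumer-perf-test.sh --consumer.config perf-client.properties ...
client.id=quota-test-client
```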
[jira] [Updated] (KAFKA-8930) MM2 documentation
[ https://issues.apache.org/jira/browse/KAFKA-8930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8930: - Fix Version/s: (was: 2.6.0) 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > MM2 documentation > - > > Key: KAFKA-8930 > URL: https://issues.apache.org/jira/browse/KAFKA-8930 > Project: Kafka > Issue Type: Improvement > Components: documentation, mirrormaker >Affects Versions: 2.4.0 >Reporter: Ryanne Dolan >Assignee: Ryanne Dolan >Priority: Minor > Fix For: 2.7.0 > > > Expand javadocs for new MirrorMaker (entrypoint) and MirrorMakerConfig > classes. Include example usage and example configuration. > Expand javadocs for MirrorSourceConnector, MirrorCheckpointConnector, and > MirrorHeartbeatConnector, including example configuration for running on > Connect w/o mm2 driver. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-8929) MM2 system tests
[ https://issues.apache.org/jira/browse/KAFKA-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8929: - Fix Version/s: (was: 2.6.0) 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > MM2 system tests > > > Key: KAFKA-8929 > URL: https://issues.apache.org/jira/browse/KAFKA-8929 > Project: Kafka > Issue Type: Improvement > Components: mirrormaker >Affects Versions: 2.4.0 >Reporter: Ryanne Dolan >Assignee: Ryanne Dolan >Priority: Minor > Labels: test > Fix For: 2.7.0 > > > Add system tests for MM2 driver. Should resemble existing mirror-maker system > tests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-9018) Kafka Connect - throw clearer exceptions on serialisation errors
[ https://issues.apache.org/jira/browse/KAFKA-9018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-9018: - Fix Version/s: (was: 2.6.0) 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Kafka Connect - throw clearer exceptions on serialisation errors > > > Key: KAFKA-9018 > URL: https://issues.apache.org/jira/browse/KAFKA-9018 > Project: Kafka > Issue Type: Improvement > Components: KafkaConnect >Affects Versions: 2.5.0, 2.4.1 >Reporter: Robin Moffatt >Assignee: Mario Molina >Priority: Minor > Fix For: 2.7.0 > > > When Connect fails on a deserialisation error, it doesn't show if that's the > *key or value* that's thrown the error, nor does it give the user any > indication of the *topic/partition/offset* of the message. Kafka Connect > should be improved to return this information. > Example message that user will get (in this case caused by reading non-Avro > data with the Avro converter) > {code:java} > Caused by: org.apache.kafka.connect.errors.DataException: Failed to > deserialize data for topic sample_topic to Avro: > at > io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:110) > at > org.apache.kafka.connect.runtime.WorkerSinkTask.lambda$convertAndTransformRecord$1(WorkerSinkTask.java:487) > at > org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128) > at > org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162) > ... 
13 more > Caused by: org.apache.kafka.common.errors.SerializationException: Error > deserializing Avro message for id -1 > Caused by: org.apache.kafka.common.errors.SerializationException: Unknown > magic byte!{code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
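Related: since KIP-298, some of the record context missing from the stack trace above can be surfaced through the connector-level error-handling properties rather than the converter itself. A sink connector config fragment as a sketch (property names are the real KIP-298 ones; the DLQ topic name is a placeholder):

```properties
# Sink connector config fragment (sketch): log the failing record's
# topic/partition/offset and route bad records to a dead letter queue,
# instead of surfacing only the converter stack trace.
errors.tolerance=all
errors.log.enable=true
errors.log.include.messages=true
errors.deadletterqueue.topic.name=dlq-sample_topic
errors.deadletterqueue.context.headers.enable=true
```

This does not replace the clearer exception message the ticket asks for, but it gives operators the topic/partition/offset today.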
[jira] [Commented] (KAFKA-10068) Verify HighAvailabilityTaskAssignor performance with large clusters and topologies
[ https://issues.apache.org/jira/browse/KAFKA-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144375#comment-17144375 ] Randall Hauch commented on KAFKA-10068: --- [~ableegoldman], actually, I'm still trying to cut the first RC, but we need to decide whether this is really a blocker for the `2.6.0` release. If we think it still is a blocker, what is an ETA for completing this and is this a risky change for so late in the release cycle? > Verify HighAvailabilityTaskAssignor performance with large clusters and > topologies > -- > > Key: KAFKA-10068 > URL: https://issues.apache.org/jira/browse/KAFKA-10068 > Project: Kafka > Issue Type: Task > Components: streams >Affects Versions: 2.6.0 >Reporter: John Roesler >Assignee: Sophie Blee-Goldman >Priority: Blocker > Fix For: 2.6.0 > > > While reviewing [https://github.com/apache/kafka/pull/8668/files,] I realized > that we should have a similar test to make sure that the new task assignor > completes well within the default assignment deadline. 30 seconds is a good > upper bound. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-8264) Flaky Test PlaintextConsumerTest#testLowMaxFetchSizeForRequestAndPartition
[ https://issues.apache.org/jira/browse/KAFKA-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8264: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 > Flaky Test PlaintextConsumerTest#testLowMaxFetchSizeForRequestAndPartition > -- > > Key: KAFKA-8264 > URL: https://issues.apache.org/jira/browse/KAFKA-8264 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.0.1, 2.3.0 >Reporter: Matthias J. Sax >Assignee: Luke Chen >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://builds.apache.org/blue/organizations/jenkins/kafka-2.0-jdk8/detail/kafka-2.0-jdk8/252/tests] > {quote}org.apache.kafka.common.errors.TopicExistsException: Topic 'topic3' > already exists.{quote} > STDOUT > > {quote}[2019-04-19 03:54:20,080] ERROR [ReplicaFetcher replicaId=1, > leaderId=0, fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-19 03:54:20,080] ERROR [ReplicaFetcher replicaId=2, leaderId=0, > fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-19 03:54:20,312] ERROR [ReplicaFetcher replicaId=2, leaderId=0, > fetcherId=0] Error for partition topic-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-19 03:54:20,313] ERROR [ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Error for partition topic-1 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
> [2019-04-19 03:54:20,994] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=0] Error for partition topic-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-19 03:54:21,727] ERROR [ReplicaFetcher replicaId=0, leaderId=2, > fetcherId=0] Error for partition topicWithNewMessageFormat-1 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-19 03:54:28,696] ERROR [ReplicaFetcher replicaId=0, leaderId=2, > fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-19 03:54:28,699] ERROR [ReplicaFetcher replicaId=1, leaderId=2, > fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-19 03:54:29,246] ERROR [ReplicaFetcher replicaId=0, leaderId=2, > fetcherId=0] Error for partition topic-1 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-19 03:54:29,247] ERROR [ReplicaFetcher replicaId=0, leaderId=1, > fetcherId=0] Error for partition topic-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
> [2019-04-19 03:54:29,287] ERROR [ReplicaFetcher replicaId=1, leaderId=2, > fetcherId=0] Error for partition topic-1 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-19 03:54:33,408] ERROR [ReplicaFetcher replicaId=0, leaderId=2, > fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-19 03:54:33,408] ERROR [ReplicaFetcher replicaId=1, leaderId=2, > fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-19 03:54:33,655] ERROR
[jira] [Commented] (KAFKA-8264) Flaky Test PlaintextConsumerTest#testLowMaxFetchSizeForRequestAndPartition
[ https://issues.apache.org/jira/browse/KAFKA-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144374 ] Randall Hauch commented on KAFKA-8264: -- Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.
[jira] [Commented] (KAFKA-9943) Enable TLSv.1.3 in system tests "run all" execution.
[ https://issues.apache.org/jira/browse/KAFKA-9943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144372#comment-17144372 ] Randall Hauch commented on KAFKA-9943: -- Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Enable TLSv.1.3 in system tests "run all" execution. > > > Key: KAFKA-9943 > URL: https://issues.apache.org/jira/browse/KAFKA-9943 > Project: Kafka > Issue Type: Test >Reporter: Nikolay Izhikov >Assignee: Nikolay Izhikov >Priority: Major > Fix For: 2.7.0 > > > We need to enable system tests with the TLSv1.3 in "run all" execution. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-5453) Controller may miss requests sent to the broker when zk session timeout happens.
[ https://issues.apache.org/jira/browse/KAFKA-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144373 ] Randall Hauch commented on KAFKA-5453: -- Moving this to 2.7.0 as there is no progress. > Controller may miss requests sent to the broker when zk session timeout > happens. > > > Key: KAFKA-5453 > URL: https://issues.apache.org/jira/browse/KAFKA-5453 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 0.11.0.0 >Reporter: Jiangjie Qin >Assignee: Viktor Somogyi-Vass >Priority: Major > Fix For: 2.7.0 > > > The issue I encountered was the following: > 1. Partition reassignment was in progress, one replica of a partition was > being reassigned from broker 1 to broker 2. > 2. Controller received an ISR change notification which indicated broker 2 > had caught up. > 3. Controller was sending StopReplicaRequest to broker 1. > 4. Broker 1 zk session timeout occurred. Controller removed broker 1 from the > cluster and cleaned up the queue, i.e. the StopReplicaRequest was removed > from the ControllerChannelManager. > 5. Broker 1 reconnected to zk and acted as if it were still a follower replica of > the partition. > 6. Broker 1 will always receive an exception from the leader because it is not > in the replica list. > Not sure what the correct fix is here. It seems that broker 1 in this case > should ask the controller for the latest replica assignment. > There are two related bugs: > 1. when a {{NotAssignedReplicaException}} is thrown from > {{Partition.updateReplicaLogReadResult()}}, the other partitions in the same > request will fail to update the fetch timestamp and offset and thus also > drop out of the ISR. > 2. The {{NotAssignedReplicaException}} is not properly returned to the > replicas; instead, an UnknownServerException is returned. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-5453) Controller may miss requests sent to the broker when zk session timeout happens.
[ https://issues.apache.org/jira/browse/KAFKA-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-5453: - Fix Version/s: (was: 2.6.0) 2.7.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-9943) Enable TLSv1.3 in system tests "run all" execution.
[ https://issues.apache.org/jira/browse/KAFKA-9943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-9943:
---------------------------------
Fix Version/s: (was: 2.6.0)
               2.7.0

> Enable TLSv1.3 in system tests "run all" execution.
> ---------------------------------------------------
>
> Key: KAFKA-9943
> URL: https://issues.apache.org/jira/browse/KAFKA-9943
> Project: Kafka
> Issue Type: Test
> Reporter: Nikolay Izhikov
> Assignee: Nikolay Izhikov
> Priority: Major
> Fix For: 2.7.0
>
> We need to enable system tests with TLSv1.3 in the "run all" execution.
[jira] [Commented] (KAFKA-10185) Streams should log summarized restoration information at info level
[ https://issues.apache.org/jira/browse/KAFKA-10185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144371#comment-17144371 ] Randall Hauch commented on KAFKA-10185: --- Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Streams should log summarized restoration information at info level > --- > > Key: KAFKA-10185 > URL: https://issues.apache.org/jira/browse/KAFKA-10185 > Project: Kafka > Issue Type: Task > Components: streams >Reporter: John Roesler >Assignee: John Roesler >Priority: Major > Fix For: 2.7.0 > > > Currently, restoration progress is only visible at debug level in the > Consumer's Fetcher logs. Users can register a restoration listener and > implement their own logging, but it would substantially improve operability > to have some logs available at INFO level. > Logging each partition in each restore batch at info level would be too much, > though, so we should print summarized logs at a decreased interval, like > every 10 seconds. -- This message was sent by Atlassian Jira (v8.3.4#803005)
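The approach the KAFKA-10185 description suggests (per-batch restore progress suppressed, with one INFO-level summary at most every N seconds) can be sketched roughly as follows. This is an illustrative sketch only; the class and method names here are hypothetical and are not Streams' actual implementation.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch: accumulate per-batch restore counts and emit a
// single summary line at most once per interval (e.g. every 10 seconds),
// instead of logging every partition in every restored batch.
public class RestoreSummaryLogger {
    private final long intervalMs;
    private final AtomicLong restoredRecords = new AtomicLong();
    private volatile long lastLogMs;

    public RestoreSummaryLogger(Duration interval) {
        this.intervalMs = interval.toMillis();
        this.lastLogMs = System.currentTimeMillis();
    }

    /** Called for every restored batch; logs a summary only when the interval has elapsed. */
    public boolean maybeLog(long batchSize, long nowMs) {
        long total = restoredRecords.addAndGet(batchSize);
        if (nowMs - lastLogMs >= intervalMs) {
            lastLogMs = nowMs;
            System.out.printf("Restored %d records so far%n", total);
            return true;   // summary emitted
        }
        return false;      // suppressed: too soon since the last summary
    }

    public static void main(String[] args) {
        RestoreSummaryLogger logger = new RestoreSummaryLogger(Duration.ofSeconds(10));
        long t0 = System.currentTimeMillis();
        logger.maybeLog(100, t0 + 1_000);   // 1s in: suppressed
        logger.maybeLog(200, t0 + 11_000);  // 11s in: one summary for all 300 records
    }
}
```

A real implementation would use the logging framework rather than stdout, but the throttling logic is the core of the idea.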
[jira] [Updated] (KAFKA-9458) Kafka crashed in windows environment
[ https://issues.apache.org/jira/browse/KAFKA-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-9458: - Fix Version/s: (was: 2.6.0) 2.7.0 > Kafka crashed in windows environment > > > Key: KAFKA-9458 > URL: https://issues.apache.org/jira/browse/KAFKA-9458 > Project: Kafka > Issue Type: Bug > Components: log >Affects Versions: 2.4.0 > Environment: Windows Server 2019 >Reporter: hirik >Priority: Critical > Labels: windows > Fix For: 2.7.0 > > Attachments: Windows_crash_fix.patch, logs.zip > > > Hi, > while I was trying to validate Kafka retention policy, Kafka Server crashed > with below exception trace. > [2020-01-21 17:10:40,475] INFO [Log partition=test1-3, > dir=C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka] > Rolled new log segment at offset 1 in 52 ms. (kafka.log.Log) > [2020-01-21 17:10:40,484] ERROR Error while deleting segments for test1-3 in > dir C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka > (kafka.server.LogDirFailureChannel) > java.nio.file.FileSystemException: > C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\.timeindex > -> > C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\.timeindex.deleted: > The process cannot access the file because it is being used by another > process. 
> at > java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92) > at > java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) > at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:395) > at > java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292) > at java.base/java.nio.file.Files.move(Files.java:1425) > at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:795) > at kafka.log.AbstractIndex.renameTo(AbstractIndex.scala:209) > at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:497) > at kafka.log.Log.$anonfun$deleteSegmentFiles$1(Log.scala:2206) > at kafka.log.Log.$anonfun$deleteSegmentFiles$1$adapted(Log.scala:2206) > at scala.collection.immutable.List.foreach(List.scala:305) > at kafka.log.Log.deleteSegmentFiles(Log.scala:2206) > at kafka.log.Log.removeAndDeleteSegments(Log.scala:2191) > at kafka.log.Log.$anonfun$deleteSegments$2(Log.scala:1700) > at scala.runtime.java8.JFunction0$mcI$sp.apply(JFunction0$mcI$sp.scala:17) > at kafka.log.Log.maybeHandleIOException(Log.scala:2316) > at kafka.log.Log.deleteSegments(Log.scala:1691) > at kafka.log.Log.deleteOldSegments(Log.scala:1686) > at kafka.log.Log.deleteRetentionMsBreachedSegments(Log.scala:1763) > at kafka.log.Log.deleteOldSegments(Log.scala:1753) > at kafka.log.LogManager.$anonfun$cleanupLogs$3(LogManager.scala:982) > at kafka.log.LogManager.$anonfun$cleanupLogs$3$adapted(LogManager.scala:979) > at scala.collection.immutable.List.foreach(List.scala:305) > at kafka.log.LogManager.cleanupLogs(LogManager.scala:979) > at kafka.log.LogManager.$anonfun$startup$2(LogManager.scala:403) > at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:116) > at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:65) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) 
> at > java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:830) > Suppressed: java.nio.file.FileSystemException: > C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\.timeindex > -> > C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\.timeindex.deleted: > The process cannot access the file because it is being used by another > process. > at > java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92) > at > java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) > at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:309) > at > java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292) > at java.base/java.nio.file.Files.move(Files.java:1425) > at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:792) > ... 27 more > [2020-01-21 17:10:40,495] INFO [ReplicaManager
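The failing call in the trace above, Utils.atomicMoveWithFallback, renames a segment index file with ATOMIC_MOVE and falls back to a plain move when the filesystem rejects the atomic variant; on Windows even the fallback fails here because the memory-mapped index file is still held open. A minimal sketch of the fallback pattern (not Kafka's actual implementation) looks like this:

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch of the "atomic move with fallback" pattern: attempt an atomic
// rename first, and fall back to a non-atomic move if the filesystem
// does not support ATOMIC_MOVE for this source/target pair.
public class AtomicMove {
    public static void atomicMoveWithFallback(Path source, Path target) throws IOException {
        try {
            Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Non-atomic fallback. On Windows this can still throw a
            // FileSystemException if the source is a mapped file whose
            // mapping has not been released, which is what the bug report shows.
            Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("seg");
        Path src = Files.writeString(dir.resolve("00000000.timeindex"), "idx");
        Path dst = dir.resolve("00000000.timeindex.deleted");
        atomicMoveWithFallback(src, dst);
        System.out.println(Files.exists(dst)); // true
    }
}
```

The fallback only helps when ATOMIC_MOVE itself is unsupported; it cannot help when the OS refuses the rename because another handle still holds the file, which is why the Windows fix has to release the mapping first.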
[jira] [Updated] (KAFKA-10185) Streams should log summarized restoration information at info level
[ https://issues.apache.org/jira/browse/KAFKA-10185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-10185: -- Fix Version/s: (was: 2.6.0) > Streams should log summarized restoration information at info level > --- > > Key: KAFKA-10185 > URL: https://issues.apache.org/jira/browse/KAFKA-10185 > Project: Kafka > Issue Type: Task > Components: streams >Reporter: John Roesler >Assignee: John Roesler >Priority: Major > Fix For: 2.7.0 > > > Currently, restoration progress is only visible at debug level in the > Consumer's Fetcher logs. Users can register a restoration listener and > implement their own logging, but it would substantially improve operability > to have some logs available at INFO level. > Logging each partition in each restore batch at info level would be too much, > though, so we should print summarized logs at a decreased interval, like > every 10 seconds. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9458) Kafka crashed in windows environment
[ https://issues.apache.org/jira/browse/KAFKA-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144370#comment-17144370 ]

Randall Hauch commented on KAFKA-9458:
--------------------------------------

Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.

> Kafka crashed in windows environment
> ------------------------------------
>
> Key: KAFKA-9458
> URL: https://issues.apache.org/jira/browse/KAFKA-9458
> Project: Kafka
> Issue Type: Bug
> Components: log
> Affects Versions: 2.4.0
> Environment: Windows Server 2019
> Reporter: hirik
> Priority: Critical
> Labels: windows
> Fix For: 2.7.0
>
> Attachments: Windows_crash_fix.patch, logs.zip
>
> Hi,
> while I was trying to validate Kafka retention policy, Kafka Server crashed with below exception trace.
> [2020-01-21 17:10:40,475] INFO [Log partition=test1-3, dir=C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka] Rolled new log segment at offset 1 in 52 ms. (kafka.log.Log)
> [2020-01-21 17:10:40,484] ERROR Error while deleting segments for test1-3 in dir C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka (kafka.server.LogDirFailureChannel)
> java.nio.file.FileSystemException: C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\.timeindex -> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\.timeindex.deleted: The process cannot access the file because it is being used by another process.
[jira] [Commented] (KAFKA-10166) Excessive TaskCorruptedException seen in testing
[ https://issues.apache.org/jira/browse/KAFKA-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144358#comment-17144358 ] Randall Hauch commented on KAFKA-10166: --- [~ableegoldman], [~cadonna]: Do we want to continue treating this as a blocker for the 2.6.0 release? If so, what's the timeframe for fixing this? If this should not block the release, should we downgrade the priority and/or change the fix versions to 2.6.1 and/or 2.7.0? > Excessive TaskCorruptedException seen in testing > > > Key: KAFKA-10166 > URL: https://issues.apache.org/jira/browse/KAFKA-10166 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Sophie Blee-Goldman >Assignee: Bruno Cadonna >Priority: Blocker > Fix For: 2.6.0 > > > As the title indicates, long-running test applications with injected network > "outages" seem to hit TaskCorruptedException more than expected. > Seen occasionally on the ALOS application (~20 times in two days in one case, > for example), and very frequently with EOS (many times per day) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-10143) Can no longer change replication throttle with reassignment tool
[ https://issues.apache.org/jira/browse/KAFKA-10143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144357#comment-17144357 ] Randall Hauch commented on KAFKA-10143: --- [~hachikuji]: do we want to continue treating this as a blocker for the 2.6.0 release? If so, what's the timeframe for fixing this? If this should not block the release, should we downgrade the priority and/or change the fix versions to 2.6.1 and/or 2.7.0? > Can no longer change replication throttle with reassignment tool > > > Key: KAFKA-10143 > URL: https://issues.apache.org/jira/browse/KAFKA-10143 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Blocker > Fix For: 2.6.0 > > > Previously we could use --execute with the --throttle option in order to > change the quota of an active reassignment. We seem to have lost this with > KIP-455. The code has the following comment: > {code} > val reassignPartitionsInProgress = zkClient.reassignPartitionsInProgress() > if (reassignPartitionsInProgress) { > // Note: older versions of this tool would modify the broker quotas > here (but not > // topic quotas, for some reason). This behavior wasn't documented in > the --execute > // command line help. Since it might interfere with other ongoing > reassignments, > // this behavior was dropped as part of the KIP-455 changes. > throw new > TerseReassignmentFailureException(cannotExecuteBecauseOfExistingMessage) > } > {code} > Seems like it was a mistake to change this because it breaks compatibility. > We probably have to revert. At the same time, we can make the intent clearer > both in the code and in the command help output. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-10134) High CPU issue during rebalance in Kafka consumer after upgrading to 2.5
[ https://issues.apache.org/jira/browse/KAFKA-10134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144355#comment-17144355 ]

Randall Hauch commented on KAFKA-10134:
---------------------------------------

[~guozhang], [~ijuma], [~seanguo]: what's the status of this? Do we want to continue treating this as a blocker for the 2.6.0 release? If so, what's the timeframe for fixing this? If this should not block the release, should we downgrade the priority and/or change the fix versions to 2.6.1 and/or 2.7.0?

> High CPU issue during rebalance in Kafka consumer after upgrading to 2.5
> ------------------------------------------------------------------------
>
> Key: KAFKA-10134
> URL: https://issues.apache.org/jira/browse/KAFKA-10134
> Project: Kafka
> Issue Type: Bug
> Components: clients
> Affects Versions: 2.5.0
> Reporter: Sean Guo
> Priority: Blocker
> Fix For: 2.6.0, 2.5.1
>
> We want to utilize the new rebalance protocol to mitigate the stop-the-world effect during the rebalance, as our tasks are long-running.
> But after the upgrade, when we kill an instance to trigger a rebalance while there is some load (some tasks run longer than 30s), CPU usage goes sky-high. It reads ~700% in our metrics, so several threads must be in a tight loop. We have several consumer threads consuming from different partitions during the rebalance. This is reproducible with both the new CooperativeStickyAssignor and the old eager rebalance protocol. The difference is that with the old eager rebalance protocol, the high CPU usage drops after the rebalance is done. But with the cooperative one, the consumer threads seem to be stuck on something and can't finish the rebalance, so the high CPU usage won't drop until we stop our load. A small load without long-running tasks also doesn't cause continuous high CPU usage, since the rebalance can finish in that case.
>
> "executor.kafka-consumer-executor-4" #124 daemon prio=5 os_prio=0 cpu=76853.07ms elapsed=841.16s tid=0x7fe11f044000 nid=0x1f4 runnable [0x7fe119aab000]
> java.lang.Thread.State: RUNNABLE
> at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:467)
> at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1275)
> at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1241)
> at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1216)
>
> By debugging into the code, we found that the clients appear to be in a loop trying to find the coordinator.
> I also tried the old rebalance protocol on the new version; the issue still exists, but the CPU returns to normal when the rebalance is done.
> I also tried the same on 2.4.1, which doesn't seem to have this issue. So it appears related to something that changed between 2.4.1 and 2.5.0.
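The thread dump above shows consumer threads spinning inside ConsumerCoordinator.poll while repeatedly looking for the coordinator. The usual guard against that kind of tight retry loop is a bounded exponential backoff between attempts, sketched here in generic form; this is illustrative, not the Kafka client's actual fix:

```java
import java.util.function.Supplier;

// Generic retry helper: retries a lookup with capped exponential backoff
// so a repeatedly failing operation (e.g. "find coordinator") yields the
// CPU between attempts instead of spinning in a tight loop.
public class BackoffRetry {
    public static <T> T retryWithBackoff(Supplier<T> lookup, long initialBackoffMs,
                                         long maxBackoffMs, int maxAttempts)
            throws InterruptedException {
        long backoff = initialBackoffMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T result = lookup.get();
            if (result != null)
                return result;                                // lookup succeeded
            if (attempt < maxAttempts)
                Thread.sleep(backoff);                        // yield instead of spinning
            backoff = Math.min(backoff * 2, maxBackoffMs);    // cap the backoff
        }
        return null;                                          // gave up
    }

    public static void main(String[] args) throws InterruptedException {
        int[] calls = {0};
        // Hypothetical lookup that succeeds on the third attempt.
        String node = retryWithBackoff(
                () -> ++calls[0] >= 3 ? "coordinator-1" : null, 1, 8, 10);
        System.out.println(calls[0] + " attempts -> " + node);
    }
}
```

The real consumer already has retry.backoff.ms for this purpose; the bug report suggests some code path in 2.5.0 bypassed that pacing.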
[jira] [Commented] (KAFKA-10017) Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta
[ https://issues.apache.org/jira/browse/KAFKA-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144352#comment-17144352 ] Randall Hauch commented on KAFKA-10017: --- [~vvcephei], [~ableegoldman]: what's the status of this? Is this more than just a flaky test, and should we keep this as a blocker for the 2.6.0 release? If so, what's the timeframe for fixing this? > Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta > --- > > Key: KAFKA-10017 > URL: https://issues.apache.org/jira/browse/KAFKA-10017 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.6.0 >Reporter: Sophie Blee-Goldman >Assignee: Matthias J. Sax >Priority: Blocker > Labels: flaky-test, unit-test > Fix For: 2.6.0 > > > Creating a new ticket for this since the root cause is different than > https://issues.apache.org/jira/browse/KAFKA-9966 > With injectError = true: > h3. Stacktrace > java.lang.AssertionError: Did not receive all 20 records from topic > multiPartitionOutputTopic within 6 ms Expected: is a value equal to or > greater than <20> but: <15> was less than <20> at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$1(IntegrationTestUtils.java:563) > at > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:429) > at > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:397) > at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:559) > at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:530) > at > org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.readResult(EosBetaUpgradeIntegrationTest.java:973) > at > 
org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.verifyCommitted(EosBetaUpgradeIntegrationTest.java:961) > at > org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta(EosBetaUpgradeIntegrationTest.java:427) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-8073) Transient failure in kafka.api.UserQuotaTest.testThrottledProducerConsumer
[ https://issues.apache.org/jira/browse/KAFKA-8073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144349#comment-17144349 ] Randall Hauch commented on KAFKA-8073: -- Since this is not a blocker issue, as part of the 2.6.0 release process I'm removing `2.6.0` from the fix version and adding `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Transient failure in kafka.api.UserQuotaTest.testThrottledProducerConsumer > -- > > Key: KAFKA-8073 > URL: https://issues.apache.org/jira/browse/KAFKA-8073 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.2.0, 2.3.0 >Reporter: Bill Bejeck >Assignee: Chia-Ping Tsai >Priority: Critical > Fix For: 2.2.3, 2.7.0, 2.6.1 > > > Failed in build [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/20134/] > > Stacktrace and STDOUT > {noformat} > Error Message > java.lang.AssertionError: Client with id=QuotasTestProducer-1 should have > been throttled > Stacktrace > java.lang.AssertionError: Client with id=QuotasTestProducer-1 should have > been throttled > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.assertTrue(Assert.java:42) > at > kafka.api.QuotaTestClients.verifyThrottleTimeMetric(BaseQuotaTest.scala:229) > at > kafka.api.QuotaTestClients.verifyProduceThrottle(BaseQuotaTest.scala:215) > at > kafka.api.BaseQuotaTest.testThrottledProducerConsumer(BaseQuotaTest.scala:82) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:65) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) > at org.junit.runners.ParentRunner.run(ParentRunner.java:412) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) > at > org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62) > at > 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) > at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) > at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) > at > org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
[jira] [Updated] (KAFKA-8073) Transient failure in kafka.api.UserQuotaTest.testThrottledProducerConsumer
[ https://issues.apache.org/jira/browse/KAFKA-8073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8073: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 > Transient failure in kafka.api.UserQuotaTest.testThrottledProducerConsumer > -- > > Key: KAFKA-8073 > URL: https://issues.apache.org/jira/browse/KAFKA-8073 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.2.0, 2.3.0 >Reporter: Bill Bejeck >Assignee: Chia-Ping Tsai >Priority: Critical > Fix For: 2.2.3, 2.7.0, 2.6.1 > > > Failed in build [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/20134/] > > Stacktrace and STDOUT > {noformat} > Error Message > java.lang.AssertionError: Client with id=QuotasTestProducer-1 should have > been throttled > Stacktrace > java.lang.AssertionError: Client with id=QuotasTestProducer-1 should have > been throttled > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.assertTrue(Assert.java:42) > at > kafka.api.QuotaTestClients.verifyThrottleTimeMetric(BaseQuotaTest.scala:229) > at > kafka.api.QuotaTestClients.verifyProduceThrottle(BaseQuotaTest.scala:215) > at > kafka.api.BaseQuotaTest.testThrottledProducerConsumer(BaseQuotaTest.scala:82) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > 
[jira] [Updated] (KAFKA-10135) Extract Task#executeAndMaybeSwallow to be a general utility function into TaskManager
[ https://issues.apache.org/jira/browse/KAFKA-10135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-10135: -- Fix Version/s: 2.6.0 > Extract Task#executeAndMaybeSwallow to be a general utility function into > TaskManager > - > > Key: KAFKA-10135 > URL: https://issues.apache.org/jira/browse/KAFKA-10135 > Project: Kafka > Issue Type: Improvement > Components: streams >Affects Versions: 2.5.0, 2.6.0 >Reporter: Boyang Chen >Assignee: feyman >Priority: Major > Fix For: 2.6.0, 2.7.0 > > > We have a couple of cases where we need to swallow the exception during > operations in both Task class and TaskManager class. This utility method > should be generalized at least onto TaskManager level. See discussion comment > [here|https://github.com/apache/kafka/pull/8833#discussion_r437697665]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-7964) Flaky Test ConsumerBounceTest#testConsumerReceivesFatalExceptionWhenGroupPassesMaxSize

[ https://issues.apache.org/jira/browse/KAFKA-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-7964:
---------------------------------
    Fix Version/s:     (was: 2.6.0)
                       2.6.1
                       2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.

> Flaky Test ConsumerBounceTest#testConsumerReceivesFatalExceptionWhenGroupPassesMaxSize
> --------------------------------------------------------------------------------------
>
>                 Key: KAFKA-7964
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7964
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, consumer, unit tests
>    Affects Versions: 2.2.0
>            Reporter: Matthias J. Sax
>            Priority: Critical
>              Labels: flaky-test
>             Fix For: 2.7.0, 2.6.1
>
> To get stable nightly builds for `2.2` release, I create tickets for all observed test failures.
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/21/]
> {quote}java.lang.AssertionError: expected:<100> but was:<0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:834) at org.junit.Assert.assertEquals(Assert.java:645) at org.junit.Assert.assertEquals(Assert.java:631) at kafka.api.ConsumerBounceTest.receiveExactRecords(ConsumerBounceTest.scala:551) at kafka.api.ConsumerBounceTest.$anonfun$testConsumerReceivesFatalExceptionWhenGroupPassesMaxSize$2(ConsumerBounceTest.scala:409) at kafka.api.ConsumerBounceTest.$anonfun$testConsumerReceivesFatalExceptionWhenGroupPassesMaxSize$2$adapted(ConsumerBounceTest.scala:408) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at kafka.api.ConsumerBounceTest.testConsumerReceivesFatalExceptionWhenGroupPassesMaxSize(ConsumerBounceTest.scala:408){quote}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Updated] (KAFKA-8085) Flaky Test ResetConsumerGroupOffsetTest#testResetOffsetsByDuration

[ https://issues.apache.org/jira/browse/KAFKA-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-8085:
---------------------------------
    Fix Version/s:     (was: 2.6.0)
                       2.6.1
                       2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.

> Flaky Test ResetConsumerGroupOffsetTest#testResetOffsetsByDuration
> ------------------------------------------------------------------
>
>                 Key: KAFKA-8085
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8085
>             Project: Kafka
>          Issue Type: Bug
>          Components: admin, unit tests
>    Affects Versions: 2.2.0
>            Reporter: Matthias J. Sax
>            Priority: Critical
>              Labels: flaky-test
>             Fix For: 2.7.0, 2.6.1
>
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/62/testReport/junit/kafka.admin/ResetConsumerGroupOffsetTest/testResetOffsetsByDuration/]
> {quote}java.lang.AssertionError: Expected that consumer group has consumed all messages from topic/partition. at kafka.utils.TestUtils$.fail(TestUtils.scala:381) at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at kafka.admin.ResetConsumerGroupOffsetTest.awaitConsumerProgress(ResetConsumerGroupOffsetTest.scala:364) at kafka.admin.ResetConsumerGroupOffsetTest.produceConsumeAndShutdown(ResetConsumerGroupOffsetTest.scala:359) at kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsByDuration(ResetConsumerGroupOffsetTest.scala:146){quote}
> STDOUT
> {quote}[2019-03-09 08:39:29,856] WARN Unable to read additional data from client sessionid 0x105f6adb208, likely client has closed socket (org.apache.zookeeper.server.NIOServerCnxn:376) [2019-03-09 08:39:46,373] WARN Unable to read additional data from client sessionid 0x105f6adf4c50001, likely client has closed socket (org.apache.zookeeper.server.NIOServerCnxn:376){quote}
[jira] [Updated] (KAFKA-7540) Flaky Test ConsumerBounceTest#testClose

[ https://issues.apache.org/jira/browse/KAFKA-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-7540:
---------------------------------
    Fix Version/s:     (was: 2.6.0)
                       2.6.1
                       2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.

> Flaky Test ConsumerBounceTest#testClose
> ---------------------------------------
>
>                 Key: KAFKA-7540
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7540
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, consumer, unit tests
>    Affects Versions: 2.2.0
>            Reporter: John Roesler
>            Assignee: Jason Gustafson
>            Priority: Critical
>              Labels: flaky-test
>             Fix For: 2.7.0, 2.6.1
>
> Observed on Java 8:
> [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/17314/testReport/junit/kafka.api/ConsumerBounceTest/testClose/]
>
> Stacktrace:
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: -1
> 	at kafka.integration.KafkaServerTestHarness.killBroker(KafkaServerTestHarness.scala:146)
> 	at kafka.api.ConsumerBounceTest.checkCloseWithCoordinatorFailure(ConsumerBounceTest.scala:238)
> 	at kafka.api.ConsumerBounceTest.testClose(ConsumerBounceTest.scala:211)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> 	at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
> 	at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
> 	at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
> 	at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
> 	at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
> 	at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> 	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> 	at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
> 	at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> 	at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
> 	at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
> 	at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
> 	at
[jira] [Updated] (KAFKA-8140) Flaky Test SaslSslAdminClientIntegrationTest#testDescribeAndAlterConfigs

[ https://issues.apache.org/jira/browse/KAFKA-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-8140:
---------------------------------
    Fix Version/s:     (was: 2.6.0)
                       2.6.1
                       2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.

> Flaky Test SaslSslAdminClientIntegrationTest#testDescribeAndAlterConfigs
> ------------------------------------------------------------------------
>
>                 Key: KAFKA-8140
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8140
>             Project: Kafka
>          Issue Type: Bug
>          Components: admin, unit tests
>    Affects Versions: 2.2.0
>            Reporter: Matthias J. Sax
>            Priority: Critical
>              Labels: flaky-test
>             Fix For: 2.7.0, 2.6.1
>
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/80/testReport/junit/kafka.api/SaslSslAdminClientIntegrationTest/testDescribeAndAlterConfigs/]
> {quote}java.lang.IllegalArgumentException: Could not find a 'KafkaServer' or 'sasl_ssl.KafkaServer' entry in the JAAS configuration. System property 'java.security.auth.login.config' is not set at org.apache.kafka.common.security.JaasContext.defaultContext(JaasContext.java:133) at org.apache.kafka.common.security.JaasContext.load(JaasContext.java:98) at org.apache.kafka.common.security.JaasContext.loadServerContext(JaasContext.java:70) at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:121) at org.apache.kafka.common.network.ChannelBuilders.serverChannelBuilder(ChannelBuilders.java:85) at kafka.network.Processor.<init>(SocketServer.scala:694) at kafka.network.SocketServer.newProcessor(SocketServer.scala:344) at kafka.network.SocketServer.$anonfun$addDataPlaneProcessors$1(SocketServer.scala:253) at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) at kafka.network.SocketServer.addDataPlaneProcessors(SocketServer.scala:252) at kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1(SocketServer.scala:216) at kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1$adapted(SocketServer.scala:214) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at kafka.network.SocketServer.createDataPlaneAcceptorsAndProcessors(SocketServer.scala:214) at kafka.network.SocketServer.startup(SocketServer.scala:114) at kafka.server.KafkaServer.startup(KafkaServer.scala:253) at kafka.utils.TestUtils$.createServer(TestUtils.scala:140) at kafka.integration.KafkaServerTestHarness.$anonfun$setUp$1(KafkaServerTestHarness.scala:101) at scala.collection.Iterator.foreach(Iterator.scala:941) at scala.collection.Iterator.foreach$(Iterator.scala:941) at scala.collection.AbstractIterator.foreach(Iterator.scala:1429) at scala.collection.IterableLike.foreach(IterableLike.scala:74) at scala.collection.IterableLike.foreach$(IterableLike.scala:73) at scala.collection.AbstractIterable.foreach(Iterable.scala:56) at kafka.integration.KafkaServerTestHarness.setUp(KafkaServerTestHarness.scala:100) at kafka.api.IntegrationTestHarness.doSetup(IntegrationTestHarness.scala:81) at kafka.api.IntegrationTestHarness.setUp(IntegrationTestHarness.scala:73) at kafka.api.AdminClientIntegrationTest.setUp(AdminClientIntegrationTest.scala:79) at kafka.api.SaslSslAdminClientIntegrationTest.setUp(SaslSslAdminClientIntegrationTest.scala:64){quote}
[jira] [Updated] (KAFKA-8250) Flaky Test DelegationTokenEndToEndAuthorizationTest#testProduceConsumeViaAssign

[ https://issues.apache.org/jira/browse/KAFKA-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-8250:
---------------------------------
    Fix Version/s:     (was: 2.6.0)
                       2.6.1
                       2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.

> Flaky Test DelegationTokenEndToEndAuthorizationTest#testProduceConsumeViaAssign
> -------------------------------------------------------------------------------
>
>                 Key: KAFKA-8250
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8250
>             Project: Kafka
>          Issue Type: Bug
>          Components: core, unit tests
>    Affects Versions: 2.3.0
>            Reporter: Matthias J. Sax
>            Priority: Critical
>              Labels: flaky-test
>             Fix For: 2.7.0, 2.6.1
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk11/detail/kafka-trunk-jdk11/442/tests]
> {quote}java.lang.AssertionError: Consumed more records than expected expected:<1> but was:<2> at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.failNotEquals(Assert.java:835) at org.junit.Assert.assertEquals(Assert.java:647) at kafka.utils.TestUtils$.consumeRecords(TestUtils.scala:1288) at kafka.api.EndToEndAuthorizationTest.consumeRecords(EndToEndAuthorizationTest.scala:460) at kafka.api.EndToEndAuthorizationTest.testProduceConsumeViaAssign(EndToEndAuthorizationTest.scala:209){quote}
[jira] [Updated] (KAFKA-8138) Flaky Test PlaintextConsumerTest#testFetchRecordLargerThanFetchMaxBytes

[ https://issues.apache.org/jira/browse/KAFKA-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-8138:
---------------------------------
    Fix Version/s:     (was: 2.6.0)
                       2.6.1
                       2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.

> Flaky Test PlaintextConsumerTest#testFetchRecordLargerThanFetchMaxBytes
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-8138
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8138
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, unit tests
>    Affects Versions: 2.2.0
>            Reporter: Matthias J. Sax
>            Priority: Critical
>              Labels: flaky-test
>             Fix For: 2.7.0, 2.6.1
>
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/80/testReport/junit/kafka.api/PlaintextConsumerTest/testFetchRecordLargerThanFetchMaxBytes/]
> {quote}java.lang.AssertionError: Partition [topic,0] metadata not propagated after 15000 ms at kafka.utils.TestUtils$.fail(TestUtils.scala:381) at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:880) at kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:318) at kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:317) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at scala.collection.immutable.Range.foreach(Range.scala:158) at scala.collection.TraversableLike.map(TraversableLike.scala:237) at scala.collection.TraversableLike.map$(TraversableLike.scala:230) at scala.collection.AbstractTraversable.map(Traversable.scala:108) at kafka.utils.TestUtils$.createTopic(TestUtils.scala:317) at kafka.integration.KafkaServerTestHarness.createTopic(KafkaServerTestHarness.scala:125) at kafka.api.BaseConsumerTest.setUp(BaseConsumerTest.scala:69){quote}
> STDOUT (truncated)
> {quote}[2019-03-20 16:10:19,759] ERROR [ReplicaFetcher replicaId=2, leaderId=0, fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. [2019-03-20 16:10:19,760] ERROR [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. [2019-03-20 16:10:19,963] ERROR [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition topic-1 at offset 0 (kafka.server.ReplicaFetcherThread:76) org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. [2019-03-20 16:10:19,964] ERROR [ReplicaFetcher replicaId=1, leaderId=2, fetcherId=0] Error for partition topic-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. [2019-03-20 16:10:19,975] ERROR [ReplicaFetcher replicaId=0, leaderId=2, fetcherId=0] Error for partition topic-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.{quote}
[jira] [Updated] (KAFKA-7947) Flaky Test EpochDrivenReplicationProtocolAcceptanceTest#shouldFollowLeaderEpochBasicWorkflow

[ https://issues.apache.org/jira/browse/KAFKA-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-7947:
---------------------------------
    Fix Version/s:     (was: 2.6.0)
                       2.6.1
                       2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.

> Flaky Test EpochDrivenReplicationProtocolAcceptanceTest#shouldFollowLeaderEpochBasicWorkflow
> --------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-7947
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7947
>             Project: Kafka
>          Issue Type: Bug
>          Components: core, unit tests
>    Affects Versions: 2.2.0
>            Reporter: Matthias J. Sax
>            Priority: Critical
>              Labels: flaky-test
>             Fix For: 2.7.0, 2.6.1
>
> To get stable nightly builds for `2.2` release, I create tickets for all observed test failures.
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/17/]
> {quote}java.lang.AssertionError: expected: startOffset=0), EpochEntry(epoch=1, startOffset=1))> but was: startOffset=1))> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:834) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:144) at kafka.server.epoch.EpochDrivenReplicationProtocolAcceptanceTest.shouldFollowLeaderEpochBasicWorkflow(EpochDrivenReplicationProtocolAcceptanceTest.scala:101){quote}
[jira] [Updated] (KAFKA-7969) Flaky Test DescribeConsumerGroupTest#testDescribeOffsetsOfExistingGroupWithNoMembers

[ https://issues.apache.org/jira/browse/KAFKA-7969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-7969:
---------------------------------
    Fix Version/s:     (was: 2.6.0)
                       2.6.1
                       2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.

> Flaky Test DescribeConsumerGroupTest#testDescribeOffsetsOfExistingGroupWithNoMembers
> ------------------------------------------------------------------------------------
>
>                 Key: KAFKA-7969
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7969
>             Project: Kafka
>          Issue Type: Bug
>          Components: admin, unit tests
>    Affects Versions: 2.2.0
>            Reporter: Matthias J. Sax
>            Priority: Critical
>              Labels: flaky-test
>             Fix For: 2.7.0, 2.6.1
>
> To get stable nightly builds for `2.2` release, I create tickets for all observed test failures.
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/24/]
> {quote}java.lang.AssertionError: Expected no active member in describe group results, state: Some(Empty), assignments: Some(List()) at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.assertTrue(Assert.java:41) at kafka.admin.DescribeConsumerGroupTest.testDescribeOffsetsOfExistingGroupWithNoMembers(DescribeConsumerGroupTest.scala:278{quote}
[jira] [Updated] (KAFKA-8267) Flaky Test SaslAuthenticatorTest#testUserCredentialsUnavailableForScramMechanism

[ https://issues.apache.org/jira/browse/KAFKA-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-8267:
---------------------------------
    Fix Version/s:     (was: 2.6.0)
                       2.6.1
                       2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.

> Flaky Test SaslAuthenticatorTest#testUserCredentialsUnavailableForScramMechanism
> --------------------------------------------------------------------------------
>
>                 Key: KAFKA-8267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8267
>             Project: Kafka
>          Issue Type: Bug
>          Components: core, security, unit tests
>    Affects Versions: 2.3.0
>            Reporter: Matthias J. Sax
>            Priority: Critical
>              Labels: flaky-test
>             Fix For: 2.7.0, 2.6.1
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3925/testReport/junit/org.apache.kafka.common.security.authenticator/SaslAuthenticatorTest/testUserCredentialsUnavailableForScramMechanism/]
> {quote}java.lang.AssertionError: Metric not updated successful-reauthentication-total expected:<0.0> but was:<1.0> expected:<0.0> but was:<1.0> at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.failNotEquals(Assert.java:835) at org.junit.Assert.assertEquals(Assert.java:555) at org.apache.kafka.common.network.NioEchoServer.waitForMetrics(NioEchoServer.java:190) at org.apache.kafka.common.network.NioEchoServer.verifyReauthenticationMetrics(NioEchoServer.java:157) at org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest.testUserCredentialsUnavailableForScramMechanism(SaslAuthenticatorTest.java:501){quote}
> STDOUT
> {quote}[2019-04-19 22:15:35,524] ERROR Extensions provided in login context without a token (org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule:318) java.io.IOException: Extensions provided in login context without a token at org.apache.kafka.common.security.oauthbearer.internals.unsecured.OAuthBearerUnsecuredLoginCallbackHandler.handle(OAuthBearerUnsecuredLoginCallbackHandler.java:164) at org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule.identifyToken(OAuthBearerLoginModule.java:316) at org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule.login(OAuthBearerLoginModule.java:301) at java.base/javax.security.auth.login.LoginContext.invoke(LoginContext.java:726) at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:665) at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:663) at java.base/java.security.AccessController.doPrivileged(Native Method) at java.base/javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:663) at java.base/javax.security.auth.login.LoginContext.login(LoginContext.java:574) at org.apache.kafka.common.security.authenticator.AbstractLogin.login(AbstractLogin.java:60) at org.apache.kafka.common.security.authenticator.LoginManager.<init>(LoginManager.java:61) at org.apache.kafka.common.security.authenticator.LoginManager.acquireLoginManager(LoginManager.java:104) at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:149) at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:146) at org.apache.kafka.common.network.ChannelBuilders.serverChannelBuilder(ChannelBuilders.java:85) at org.apache.kafka.common.network.NioEchoServer.<init>(NioEchoServer.java:121) at org.apache.kafka.common.network.NioEchoServer.<init>(NioEchoServer.java:97) at org.apache.kafka.common.network.NetworkTestUtils.createEchoServer(NetworkTestUtils.java:49) at org.apache.kafka.common.network.NetworkTestUtils.createEchoServer(NetworkTestUtils.java:43) at org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest.createEchoServer(SaslAuthenticatorTest.java:1851) at org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest.createEchoServer(SaslAuthenticatorTest.java:1847) at org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest.testValidSaslOauthBearerMechanismWithoutServerTokens(SaslAuthenticatorTest.java:1586) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at
[jira] [Updated] (KAFKA-8268) Flaky Test SaslSslAdminIntegrationTest#testSeekAfterDeleteRecords

[ https://issues.apache.org/jira/browse/KAFKA-8268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-8268:
---------------------------------
    Fix Version/s:     (was: 2.6.0)
                       2.6.1
                       2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.

> Flaky Test SaslSslAdminIntegrationTest#testSeekAfterDeleteRecords
> -----------------------------------------------------------------
>
>                 Key: KAFKA-8268
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8268
>             Project: Kafka
>          Issue Type: Bug
>          Components: core, unit tests
>    Affects Versions: 2.3.0
>            Reporter: Matthias J. Sax
>            Priority: Critical
>              Labels: flaky-test
>             Fix For: 2.7.0, 2.6.1
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3570/tests]
> {quote}java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Aborted due to timeout. at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45) at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32) at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89) at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260) at kafka.api.AdminClientIntegrationTest.testSeekAfterDeleteRecords(AdminClientIntegrationTest.scala:775){quote}
[jira] [Updated] (KAFKA-8269) Flaky Test TopicCommandWithAdminClientTest#testDescribeUnderMinIsrPartitionsMixed

[ https://issues.apache.org/jira/browse/KAFKA-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-8269:
---------------------------------
    Fix Version/s:     (was: 2.6.0)
                       2.6.1
                       2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.

> Flaky Test TopicCommandWithAdminClientTest#testDescribeUnderMinIsrPartitionsMixed
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-8269
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8269
>             Project: Kafka
>          Issue Type: Bug
>          Components: admin, unit tests
>    Affects Versions: 2.3.0
>            Reporter: Matthias J. Sax
>            Priority: Critical
>              Labels: flaky-test
>             Fix For: 2.7.0, 2.6.1
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3573/tests]
> {quote}java.lang.AssertionError at org.junit.Assert.fail(Assert.java:87) at org.junit.Assert.assertTrue(Assert.java:42) at org.junit.Assert.assertTrue(Assert.java:53) at kafka.admin.TopicCommandWithAdminClientTest.testDescribeUnderMinIsrPartitionsMixed(TopicCommandWithAdminClientTest.scala:659){quote}
> It's a long LOG. This might be interesting:
> {quote}[2019-04-20 21:30:37,936] ERROR [ReplicaFetcher replicaId=4, leaderId=5, fetcherId=0] Error for partition testCreateWithReplicaAssignment-0cpsXnG35w-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. [2019-04-20 21:30:48,600] WARN Unable to read additional data from client sessionid 0x10510a59d3c0004, likely client has closed socket (org.apache.zookeeper.server.NIOServerCnxn:376) [2019-04-20 21:30:48,908] WARN Unable to read additional data from client sessionid 0x10510a59d3c0003, likely client has closed socket (org.apache.zookeeper.server.NIOServerCnxn:376) [2019-04-20 21:30:48,919] ERROR [RequestSendThread controllerId=0] Controller 0 fails to send a request to broker localhost:43520 (id: 5 rack: rack3) (kafka.controller.RequestSendThread:76) java.lang.InterruptedException at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326) at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) at kafka.utils.ShutdownableThread.pause(ShutdownableThread.scala:75) at kafka.controller.RequestSendThread.backoff$1(ControllerChannelManager.scala:224) at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:252) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:89) [2019-04-20 21:30:48,920] ERROR [RequestSendThread controllerId=0] Controller 0 fails to send a request to broker localhost:33570 (id: 4 rack: rack3) (kafka.controller.RequestSendThread:76) java.lang.InterruptedException at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326) at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) at kafka.utils.ShutdownableThread.pause(ShutdownableThread.scala:75) at kafka.controller.RequestSendThread.backoff$1(ControllerChannelManager.scala:224) at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:252) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:89) [2019-04-20 21:31:28,942] ERROR [ReplicaFetcher replicaId=3, leaderId=1, fetcherId=0] Error for partition under-min-isr-topic-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. [2019-04-20 21:31:28,973] ERROR [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition under-min-isr-topic-0 at offset 0 (kafka.server.ReplicaFetcherThread:76){quote}
[jira] [Updated] (KAFKA-8033) Flaky Test PlaintextConsumerTest#testFetchInvalidOffset

[ https://issues.apache.org/jira/browse/KAFKA-8033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-8033:
---------------------------------
    Fix Version/s:     (was: 2.6.0)
                       2.6.1
                       2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.

> Flaky Test PlaintextConsumerTest#testFetchInvalidOffset
> -------------------------------------------------------
>
>                 Key: KAFKA-8033
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8033
>             Project: Kafka
>          Issue Type: Bug
>          Components: core, unit tests
>    Affects Versions: 2.3.0
>            Reporter: Matthias J. Sax
>            Priority: Critical
>              Labels: flaky-test
>             Fix For: 2.7.0, 2.6.1
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/2829/testReport/junit/kafka.api/PlaintextConsumerTest/testFetchInvalidOffset/]
> {quote}org.scalatest.junit.JUnitTestFailedError: Expected exception org.apache.kafka.clients.consumer.NoOffsetForPartitionException to be thrown, but no exception was thrown{quote}
> STDOUT prints this over and over again:
> {quote}[2019-03-02 04:01:25,576] ERROR [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 (kafka.server.ReplicaFetcherThread:76){quote}
[jira] [Updated] (KAFKA-7957) Flaky Test DynamicBrokerReconfigurationTest#testMetricsReporterUpdate

[ https://issues.apache.org/jira/browse/KAFKA-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-7957:
---------------------------------
    Fix Version/s:     (was: 2.6.0)
                       2.6.1
                       2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.

> Flaky Test DynamicBrokerReconfigurationTest#testMetricsReporterUpdate
> ---------------------------------------------------------------------
>
>                 Key: KAFKA-7957
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7957
>             Project: Kafka
>          Issue Type: Bug
>          Components: core, unit tests
>    Affects Versions: 2.2.0
>            Reporter: Matthias J. Sax
>            Priority: Critical
>              Labels: flaky-test
>             Fix For: 2.7.0, 2.6.1
>
> To get stable nightly builds for `2.2` release, I create tickets for all observed test failures.
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/18/]
> {quote}java.lang.AssertionError: Messages not sent at kafka.utils.TestUtils$.fail(TestUtils.scala:356) at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:766) at kafka.server.DynamicBrokerReconfigurationTest.startProduceConsume(DynamicBrokerReconfigurationTest.scala:1270) at kafka.server.DynamicBrokerReconfigurationTest.testMetricsReporterUpdate(DynamicBrokerReconfigurationTest.scala:650){quote}
[jira] [Updated] (KAFKA-8115) Flaky Test CoordinatorTest#testTaskRequestWithOldStartMsGetsUpdated
[ https://issues.apache.org/jira/browse/KAFKA-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8115: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test CoordinatorTest#testTaskRequestWithOldStartMsGetsUpdated > --- > > Key: KAFKA-8115 > URL: https://issues.apache.org/jira/browse/KAFKA-8115 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.3.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3254/testReport/junit/org.apache.kafka.trogdor.coordinator/CoordinatorTest/testTaskRequestWithOldStartMsGetsUpdated/] > {quote}org.junit.runners.model.TestTimedOutException: test timed out after > 12 milliseconds at java.base@11.0.1/jdk.internal.misc.Unsafe.park(Native > Method) at > java.base@11.0.1/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234) > at > java.base@11.0.1/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2123) > at > java.base@11.0.1/java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1454) > at > java.base@11.0.1/java.util.concurrent.Executors$DelegatedExecutorService.awaitTermination(Executors.java:709) > at > app//org.apache.kafka.trogdor.rest.JsonRestServer.waitForShutdown(JsonRestServer.java:157) > at app//org.apache.kafka.trogdor.agent.Agent.waitForShutdown(Agent.java:123) > at > app//org.apache.kafka.trogdor.common.MiniTrogdorCluster.close(MiniTrogdorCluster.java:285) > at > app//org.apache.kafka.trogdor.coordinator.CoordinatorTest.testTaskRequestWithOldStartMsGetsUpdated(CoordinatorTest.java:596) > at 
> java.base@11.0.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base@11.0.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base@11.0.1/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base@11.0.1/java.lang.reflect.Method.invoke(Method.java:566) at > app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288) > at > app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282) > at java.base@11.0.1/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base@11.0.1/java.lang.Thread.run(Thread.java:834){quote} > STDOUT > {quote}[2019-03-15 09:23:41,364] INFO Creating MiniTrogdorCluster with > agents: node02 and coordinator: node01 > (org.apache.kafka.trogdor.common.MiniTrogdorCluster:135) [2019-03-15 > 09:23:41,595] INFO Logging initialized @13340ms to > org.eclipse.jetty.util.log.Slf4jLog (org.eclipse.jetty.util.log:193) > [2019-03-15 09:23:41,752] INFO Starting REST server > (org.apache.kafka.trogdor.rest.JsonRestServer:89) [2019-03-15 09:23:41,912] > INFO Registered resource > org.apache.kafka.trogdor.agent.AgentRestResource@3fa38ceb > (org.apache.kafka.trogdor.rest.JsonRestServer:94) [2019-03-15 09:23:42,178] > INFO jetty-9.4.14.v20181114; built: 2018-11-14T21:20:31.478Z; git: > c4550056e785fb5665914545889f21dc136ad9e6; jvm 11.0.1+13-LTS > (org.eclipse.jetty.server.Server:370) [2019-03-15 09:23:42,360] INFO > DefaultSessionIdManager workerName=node0 > 
(org.eclipse.jetty.server.session:365) [2019-03-15 09:23:42,362] INFO No > SessionScavenger set, using defaults (org.eclipse.jetty.server.session:370) > [2019-03-15 09:23:42,370] INFO node0 Scavenging every 66ms > (org.eclipse.jetty.server.session:149) [2019-03-15 09:23:44,412] INFO Started > o.e.j.s.ServletContextHandler@335a5293{/,null,AVAILABLE} > (org.eclipse.jetty.server.handler.ContextHandler:855) [2019-03-15 > 09:23:44,473] INFO Started > ServerConnector@79a93bf1{HTTP/1.1,[http/1.1]}{0.0.0.0:33477} > (org.eclipse.jetty.server.AbstractConnector:292) [2019-03-15 09:23:44,474] > INFO Started @16219ms (org.eclipse.jetty.server.Server:407)
[jira] [Updated] (KAFKA-8113) Flaky Test ListOffsetsRequestTest#testResponseIncludesLeaderEpoch
[ https://issues.apache.org/jira/browse/KAFKA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8113: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test ListOffsetsRequestTest#testResponseIncludesLeaderEpoch > - > > Key: KAFKA-8113 > URL: https://issues.apache.org/jira/browse/KAFKA-8113 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.3.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3468/tests] > {quote}java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:87) > at org.junit.Assert.assertTrue(Assert.java:42) > at org.junit.Assert.assertTrue(Assert.java:53) > at > kafka.server.ListOffsetsRequestTest.fetchOffsetAndEpoch$1(ListOffsetsRequestTest.scala:136) > at > kafka.server.ListOffsetsRequestTest.testResponseIncludesLeaderEpoch(ListOffsetsRequestTest.scala:151){quote} > STDOUT > {quote}[2019-03-15 17:16:13,029] ERROR [ReplicaFetcher replicaId=2, > leaderId=1, fetcherId=0] Error for partition topic-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-15 17:16:13,231] ERROR [KafkaApi-0] Error while responding to offset > request (kafka.server.KafkaApis:76) > org.apache.kafka.common.errors.ReplicaNotAvailableException: Partition > topic-0 is not available{quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-8082) Flaky Test ProducerFailureHandlingTest#testNotEnoughReplicasAfterBrokerShutdown
[ https://issues.apache.org/jira/browse/KAFKA-8082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8082: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test > ProducerFailureHandlingTest#testNotEnoughReplicasAfterBrokerShutdown > --- > > Key: KAFKA-8082 > URL: https://issues.apache.org/jira/browse/KAFKA-8082 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.2.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/61/testReport/junit/kafka.api/ProducerFailureHandlingTest/testNotEnoughReplicasAfterBrokerShutdown/] > {quote}java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.NotEnoughReplicasAfterAppendException: > Messages are written to the log, but to fewer in-sync replicas than required. > at > org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:98) > at > org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:67) > at > org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30) > at > kafka.api.ProducerFailureHandlingTest.testNotEnoughReplicasAfterBrokerShutdown(ProducerFailureHandlingTest.scala:270){quote} > STDOUT > {quote}[2019-03-09 03:59:24,897] ERROR [ReplicaFetcher replicaId=0, > leaderId=1, fetcherId=0] Error for partition topic-1-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
[2019-03-09 03:59:28,028] ERROR > [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition > topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. [2019-03-09 03:59:42,046] ERROR > [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition > minisrtest-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. [2019-03-09 03:59:42,245] ERROR > [ReplicaManager broker=1] Error processing append operation on partition > minisrtest-0 (kafka.server.ReplicaManager:76) > org.apache.kafka.common.errors.NotEnoughReplicasException: The size of the > current ISR Set(1, 0) is insufficient to satisfy the min.isr requirement of 3 > for partition minisrtest-0 [2019-03-09 04:00:01,212] ERROR [ReplicaFetcher > replicaId=1, leaderId=0, fetcherId=0] Error for partition topic-1-0 at offset > 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. [2019-03-09 04:00:02,214] ERROR > [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition > topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. [2019-03-09 04:00:03,216] ERROR > [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition > topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
[2019-03-09 04:00:23,144] ERROR > [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition > topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. [2019-03-09 04:00:24,146] ERROR > [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition > topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. [2019-03-09 04:00:25,148] ERROR > [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition > topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. [2019-03-09 04:00:44,607] ERROR > [ReplicaFetcher replicaId=1, leaderId=0,
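The NotEnoughReplicasException in the log above comes from the broker's min.insync.replicas check: with acks=all, an append is rejected when the partition's current ISR is smaller than the configured minimum. A minimal sketch of that check, assuming illustrative names (`MinIsrCheck`, `NotEnoughReplicasError` are not Kafka's actual classes):

```java
import java.util.Set;

public class MinIsrCheck {
    // Illustrative stand-in for org.apache.kafka.common.errors.NotEnoughReplicasException.
    static class NotEnoughReplicasError extends RuntimeException {
        NotEnoughReplicasError(String msg) { super(msg); }
    }

    // Reject an append when the in-sync replica set is smaller than min.insync.replicas.
    static void checkMinIsr(String partition, Set<Integer> isr, int minIsr) {
        if (isr.size() < minIsr) {
            throw new NotEnoughReplicasError(
                "The size of the current ISR " + isr
                + " is insufficient to satisfy the min.isr requirement of "
                + minIsr + " for partition " + partition);
        }
    }
}
```

This mirrors the error message in the test's STDOUT, where an ISR of size 2 fails a min.isr of 3 for partition minisrtest-0.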
[jira] [Updated] (KAFKA-8015) Flaky Test SaslGssapiSslEndToEndAuthorizationTest#testProduceConsumeTopicAutoCreateTopicCreateAcl
[ https://issues.apache.org/jira/browse/KAFKA-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8015: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test > SaslGssapiSslEndToEndAuthorizationTest#testProduceConsumeTopicAutoCreateTopicCreateAcl > - > > Key: KAFKA-8015 > URL: https://issues.apache.org/jira/browse/KAFKA-8015 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.3.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3422/tests] > {quote}java.lang.AssertionError: Partition [e2etopic,0] metadata not > propagated after 15000 ms > at kafka.utils.TestUtils$.fail(TestUtils.scala:356) > at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:766) > at kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:855) > at kafka.utils.TestUtils$$anonfun$createTopic$1.apply(TestUtils.scala:303) > at kafka.utils.TestUtils$$anonfun$createTopic$1.apply(TestUtils.scala:302) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at scala.collection.immutable.Range.foreach(Range.scala:160) > at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) > at scala.collection.AbstractTraversable.map(Traversable.scala:104) > at kafka.utils.TestUtils$.createTopic(TestUtils.scala:302) > at > kafka.integration.KafkaServerTestHarness.createTopic(KafkaServerTestHarness.scala:125) > at > kafka.api.EndToEndAuthorizationTest.setUp(EndToEndAuthorizationTest.scala:189) > 
at > kafka.api.SaslEndToEndAuthorizationTest.setUp(SaslEndToEndAuthorizationTest.scala:45){quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-8084) Flaky Test DescribeConsumerGroupTest#testDescribeMembersOfExistingGroupWithNoMembers
[ https://issues.apache.org/jira/browse/KAFKA-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8084: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test > DescribeConsumerGroupTest#testDescribeMembersOfExistingGroupWithNoMembers > > > Key: KAFKA-8084 > URL: https://issues.apache.org/jira/browse/KAFKA-8084 > Project: Kafka > Issue Type: Bug > Components: admin, unit tests >Affects Versions: 2.2.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/62/testReport/junit/kafka.admin/DescribeConsumerGroupTest/testDescribeMembersOfExistingGroupWithNoMembers/] > {quote}java.lang.AssertionError: Partition [__consumer_offsets,0] metadata > not propagated after 15000 ms at > kafka.utils.TestUtils$.fail(TestUtils.scala:381) at > kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at > kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:880) at > kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:318) at > kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:317) at > scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at > scala.collection.immutable.Range.foreach(Range.scala:158) at > scala.collection.TraversableLike.map(TraversableLike.scala:237) at > scala.collection.TraversableLike.map$(TraversableLike.scala:230) at > scala.collection.AbstractTraversable.map(Traversable.scala:108) at > kafka.utils.TestUtils$.createTopic(TestUtils.scala:317) at > kafka.utils.TestUtils$.createOffsetsTopic(TestUtils.scala:375) at > 
kafka.admin.DescribeConsumerGroupTest.testDescribeMembersOfExistingGroupWithNoMembers(DescribeConsumerGroupTest.scala:283){quote} > STDOUT > {quote}
> TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
> foo   0         0              0              0   -           -    -
> TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
> foo   0         0              0              0   -           -    -
> COORDINATOR (ID)    ASSIGNMENT-STRATEGY STATE #MEMBERS
> localhost:45812 (0)                     Empty 0{quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
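Most of the "metadata not propagated after 15000 ms" failures above originate in kafka.utils.TestUtils.waitUntilTrue, a polling helper that fails the test when a condition does not become true before a timeout. A minimal sketch of that pattern, assuming hypothetical names (this is not Kafka's actual implementation):

```java
import java.util.function.BooleanSupplier;

public class WaitUntil {
    // Re-check a condition until it holds or the timeout elapses, then fail
    // with the supplied message (the pattern behind TestUtils.waitUntilTrue).
    public static void waitUntilTrue(BooleanSupplier condition, String msg,
                                     long waitTimeMs, long pauseMs) {
        long deadline = System.currentTimeMillis() + waitTimeMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                throw new AssertionError(msg);
            }
            try {
                Thread.sleep(pauseMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError(msg, e);
            }
        }
    }
}
```

Flakiness arises when the condition (here, topic metadata propagating to all brokers) occasionally takes longer than the fixed 15-second budget on a loaded CI machine.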
[jira] [Updated] (KAFKA-8141) Flaky Test FetchRequestDownConversionConfigTest#testV1FetchWithDownConversionDisabled
[ https://issues.apache.org/jira/browse/KAFKA-8141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8141: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test > FetchRequestDownConversionConfigTest#testV1FetchWithDownConversionDisabled > - > > Key: KAFKA-8141 > URL: https://issues.apache.org/jira/browse/KAFKA-8141 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.2.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/80/testReport/junit/kafka.server/FetchRequestDownConversionConfigTest/testV1FetchWithDownConversionDisabled/] > {quote}java.lang.AssertionError: Partition [__consumer_offsets,0] metadata > not propagated after 15000 ms at > kafka.utils.TestUtils$.fail(TestUtils.scala:381) at > kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at > kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:880) at > kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:318) at > kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:317) at > scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at > scala.collection.immutable.Range.foreach(Range.scala:158) at > scala.collection.TraversableLike.map(TraversableLike.scala:237) at > scala.collection.TraversableLike.map$(TraversableLike.scala:230) at > scala.collection.AbstractTraversable.map(Traversable.scala:108) at > kafka.utils.TestUtils$.createTopic(TestUtils.scala:317) at > kafka.utils.TestUtils$.createOffsetsTopic(TestUtils.scala:375) at > kafka.api.IntegrationTestHarness.doSetup(IntegrationTestHarness.scala:95) at > 
kafka.api.IntegrationTestHarness.setUp(IntegrationTestHarness.scala:73){quote} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-7988) Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize
[ https://issues.apache.org/jira/browse/KAFKA-7988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-7988: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize > > > Key: KAFKA-7988 > URL: https://issues.apache.org/jira/browse/KAFKA-7988 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.2.0, 2.3.0 >Reporter: Matthias J. Sax >Assignee: Rajini Sivaram >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > To get stable nightly builds for `2.2` release, I create tickets for all > observed test failures. > [https://builds.apache.org/blue/organizations/jenkins/kafka-2.2-jdk8/detail/kafka-2.2-jdk8/30/] > {quote}kafka.server.DynamicBrokerReconfigurationTest > testThreadPoolResize > FAILED java.lang.AssertionError: Invalid threads: expected 6, got 5: > List(ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-1, > ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-2, ReplicaFetcherThread-0-1) > at org.junit.Assert.fail(Assert.java:88) at > org.junit.Assert.assertTrue(Assert.java:41) at > kafka.server.DynamicBrokerReconfigurationTest.verifyThreads(DynamicBrokerReconfigurationTest.scala:1260) > at > kafka.server.DynamicBrokerReconfigurationTest.maybeVerifyThreadPoolSize$1(DynamicBrokerReconfigurationTest.scala:531) > at > kafka.server.DynamicBrokerReconfigurationTest.resizeThreadPool$1(DynamicBrokerReconfigurationTest.scala:550) > at > kafka.server.DynamicBrokerReconfigurationTest.reducePoolSize$1(DynamicBrokerReconfigurationTest.scala:536) > at > kafka.server.DynamicBrokerReconfigurationTest.$anonfun$testThreadPoolResize$3(DynamicBrokerReconfigurationTest.scala:559) > at 
scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) at > kafka.server.DynamicBrokerReconfigurationTest.verifyThreadPoolResize$1(DynamicBrokerReconfigurationTest.scala:558) > at > kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize(DynamicBrokerReconfigurationTest.scala:572){quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-8329) Flaky Test LogOffsetTest#testEmptyLogsGetOffsets
[ https://issues.apache.org/jira/browse/KAFKA-8329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8329: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test LogOffsetTest#testEmptyLogsGetOffsets > > > Key: KAFKA-8329 > URL: https://issues.apache.org/jira/browse/KAFKA-8329 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.3.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/4325/testReport/junit/kafka.server/LogOffsetTest/testEmptyLogsGetOffsets/] > {quote}org.scalatest.exceptions.TestFailedException: Partition [kafka-,0] > metadata not propagated after 15000 ms at > org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530) at > org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529) > at > org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389) > at org.scalatest.Assertions.fail(Assertions.scala:1091) at > org.scalatest.Assertions.fail$(Assertions.scala:1087) at > org.scalatest.Assertions$.fail(Assertions.scala:1389) at > kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:788) at > kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:877) at > kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:320) at > kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:319) at > scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at > scala.collection.immutable.Range.foreach(Range.scala:158) at > scala.collection.TraversableLike.map(TraversableLike.scala:237) at > 
scala.collection.TraversableLike.map$(TraversableLike.scala:230) at > scala.collection.AbstractTraversable.map(Traversable.scala:108) at > kafka.utils.TestUtils$.createTopic(TestUtils.scala:319) at > kafka.integration.KafkaServerTestHarness.createTopic(KafkaServerTestHarness.scala:125) > at > kafka.server.LogOffsetTest.testEmptyLogsGetOffsets(LogOffsetTest.scala:141){quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-8059) Flaky Test DynamicConnectionQuotaTest#testDynamicConnectionQuota
[ https://issues.apache.org/jira/browse/KAFKA-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8059: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test DynamicConnectionQuotaTest#testDynamicConnectionQuota > - > > Key: KAFKA-8059 > URL: https://issues.apache.org/jira/browse/KAFKA-8059 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.2.0, 2.1.1 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://builds.apache.org/blue/organizations/jenkins/kafka-2.2-jdk8/detail/kafka-2.2-jdk8/46/tests] > {quote}org.scalatest.junit.JUnitTestFailedError: Expected exception > java.io.IOException to be thrown, but no exception was thrown > at > org.scalatest.junit.AssertionsForJUnit$class.newAssertionFailedException(AssertionsForJUnit.scala:100) > at > org.scalatest.junit.JUnitSuite.newAssertionFailedException(JUnitSuite.scala:71) > at org.scalatest.Assertions$class.intercept(Assertions.scala:822) > at org.scalatest.junit.JUnitSuite.intercept(JUnitSuite.scala:71) > at > kafka.network.DynamicConnectionQuotaTest.testDynamicConnectionQuota(DynamicConnectionQuotaTest.scala:82){quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
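The failure above comes from scalatest's Assertions.intercept, which fails the test when the block under test does not throw the expected exception type. A minimal Java sketch of that assertion pattern, with illustrative names (`Intercept`, `ThrowingRunnable` are assumptions, not scalatest's API):

```java
public class Intercept {
    @FunctionalInterface
    public interface ThrowingRunnable { void run() throws Throwable; }

    // Run the block; return the thrown exception if it matches the expected
    // type, otherwise fail (the pattern behind scalatest's intercept).
    public static <T extends Throwable> T intercept(Class<T> expected,
                                                    ThrowingRunnable block) {
        try {
            block.run();
        } catch (Throwable t) {
            if (expected.isInstance(t)) {
                return expected.cast(t);
            }
            throw new AssertionError("Unexpected exception thrown: " + t, t);
        }
        throw new AssertionError("Expected exception " + expected.getName()
            + " to be thrown, but no exception was thrown");
    }
}
```

In KAFKA-8059 the flake is the "no exception was thrown" branch: the connection quota was expected to make a connection attempt fail with an IOException, but the attempt occasionally succeeded.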
[jira] [Updated] (KAFKA-8303) Flaky Test SaslSslAdminClientIntegrationTest#testLogStartOffsetCheckpoint
[ https://issues.apache.org/jira/browse/KAFKA-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8303: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test SaslSslAdminClientIntegrationTest#testLogStartOffsetCheckpoint > - > > Key: KAFKA-8303 > URL: https://issues.apache.org/jira/browse/KAFKA-8303 > Project: Kafka > Issue Type: Bug > Components: admin, security, unit tests >Affects Versions: 2.3.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/21274/testReport/junit/kafka.api/SaslSslAdminClientIntegrationTest/testLogStartOffsetCheckpoint/] > {quote}java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.TimeoutException: Aborted due to timeout. at > org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45) > at > org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32) > at > org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89) > at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260) > at > kafka.api.AdminClientIntegrationTest$$anonfun$testLogStartOffsetCheckpoint$2.apply$mcZ$sp(AdminClientIntegrationTest.scala:820) > at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:789) at > kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpoint(AdminClientIntegrationTest.scala:813){quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-8110) Flaky Test DescribeConsumerGroupTest#testDescribeMembersWithConsumersWithoutAssignedPartitions
[ https://issues.apache.org/jira/browse/KAFKA-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8110: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test > DescribeConsumerGroupTest#testDescribeMembersWithConsumersWithoutAssignedPartitions > -- > > Key: KAFKA-8110 > URL: https://issues.apache.org/jira/browse/KAFKA-8110 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.2.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/67/testReport/junit/kafka.admin/DescribeConsumerGroupTest/testDescribeMembersWithConsumersWithoutAssignedPartitions/] > {quote}java.lang.AssertionError: Partition [__consumer_offsets,0] metadata > not propagated after 15000 ms at > kafka.utils.TestUtils$.fail(TestUtils.scala:381) at > kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at > kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:880) at > kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:318) at > kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:317) at > scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at > scala.collection.immutable.Range.foreach(Range.scala:158) at > scala.collection.TraversableLike.map(TraversableLike.scala:237) at > scala.collection.TraversableLike.map$(TraversableLike.scala:230) at > scala.collection.AbstractTraversable.map(Traversable.scala:108) at > kafka.utils.TestUtils$.createTopic(TestUtils.scala:317) at > kafka.utils.TestUtils$.createOffsetsTopic(TestUtils.scala:375) at > 
kafka.admin.DescribeConsumerGroupTest.testDescribeMembersWithConsumersWithoutAssignedPartitions(DescribeConsumerGroupTest.scala:372){quote} > STDOUT > {quote}[2019-03-14 20:01:52,347] WARN Ignoring unexpected runtime exception > (org.apache.zookeeper.server.NIOServerCnxnFactory:236) > java.nio.channels.CancelledKeyException at > sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73) at > sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87) at > org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:205) > at java.lang.Thread.run(Thread.java:748)
> TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
> foo   0         0              0              0   -           -    -
> TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
> foo   0         0              0              0   -           -    -
> COORDINATOR (ID)    ASSIGNMENT-STRATEGY STATE #MEMBERS
> localhost:44669 (0){quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-8068) Flaky Test DescribeConsumerGroupTest#testDescribeMembersOfExistingGroup
[ https://issues.apache.org/jira/browse/KAFKA-8068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8068: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test DescribeConsumerGroupTest#testDescribeMembersOfExistingGroup > --- > > Key: KAFKA-8068 > URL: https://issues.apache.org/jira/browse/KAFKA-8068 > Project: Kafka > Issue Type: Bug > Components: admin, unit tests >Affects Versions: 2.2.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/55/testReport/junit/kafka.admin/DescribeConsumerGroupTest/testDescribeMembersOfExistingGroup/] > {quote}java.lang.AssertionError: Partition [__consumer_offsets,0] metadata > not propagated after 15000 ms at > kafka.utils.TestUtils$.fail(TestUtils.scala:381) at > kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at > kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:880) at > kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:318) at > kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:317) at > scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at > scala.collection.immutable.Range.foreach(Range.scala:158) at > scala.collection.TraversableLike.map(TraversableLike.scala:237) at > scala.collection.TraversableLike.map$(TraversableLike.scala:230) at > scala.collection.AbstractTraversable.map(Traversable.scala:108) at > kafka.utils.TestUtils$.createTopic(TestUtils.scala:317) at > kafka.utils.TestUtils$.createOffsetsTopic(TestUtils.scala:375) at > 
kafka.admin.DescribeConsumerGroupTest.testDescribeMembersOfExistingGroup(DescribeConsumerGroupTest.scala:154){quote} > > STDOUT > {quote}[2019-03-07 18:55:40,194] WARN Unable to read additional data from > client sessionid 0x1006fb9a65f0001, likely client has closed socket > (org.apache.zookeeper.server.NIOServerCnxn:376) TOPIC PARTITION > CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID foo 0 0 0 0 - - > - TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST > CLIENT-ID foo 0 0 0 0 - - - COORDINATOR (ID) ASSIGNMENT-STRATEGY STATE > #MEMBERS localhost:35213 (0) Empty 0 [2019-03-07 18:58:42,206] WARN Unable to > read additional data from client sessionid 0x1006fbc6962, likely client > has closed socket (org.apache.zookeeper.server.NIOServerCnxn:376) > {quote}
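The "Partition [__consumer_offsets,0] metadata not propagated after 15000 ms" failure above is raised by a poll-until-condition helper (`kafka.utils.TestUtils.waitUntilTrue` in the stack trace). A minimal standalone sketch of that pattern follows; the object name, defaults, and signature here are illustrative, not the actual Kafka implementation:

```scala
// Sketch of a poll-until-condition test helper in the spirit of
// kafka.utils.TestUtils.waitUntilTrue (names and defaults are illustrative).
object WaitUtil {
  def waitUntilTrue(condition: () => Boolean,
                    msg: => String,
                    waitTimeMs: Long = 15000L,
                    pauseMs: Long = 100L): Unit = {
    val deadline = System.currentTimeMillis() + waitTimeMs
    while (!condition()) {
      if (System.currentTimeMillis() > deadline)
        // Mirrors the AssertionError seen in the Jenkins report above.
        throw new AssertionError(msg)
      Thread.sleep(pauseMs)
    }
  }
}
```

Flakiness of this kind typically means the polled condition (here, topic metadata propagating to every broker) legitimately exceeded the fixed time budget on a loaded CI machine, rather than a functional bug in the code under test.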
[jira] [Updated] (KAFKA-8064) Flaky Test DeleteTopicTest#testRecreateTopicAfterDeletion
[ https://issues.apache.org/jira/browse/KAFKA-8064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8064: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test DeleteTopicTest #testRecreateTopicAfterDeletion > -- > > Key: KAFKA-8064 > URL: https://issues.apache.org/jira/browse/KAFKA-8064 > Project: Kafka > Issue Type: Bug > Components: admin, unit tests >Affects Versions: 2.2.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/54/testReport/junit/kafka.admin/DeleteTopicTest/testRecreateTopicAfterDeletion/] > {quote}java.lang.AssertionError: Admin path /admin/delete_topic/test path not > deleted even after a replica is restarted at > kafka.utils.TestUtils$.fail(TestUtils.scala:381) at > kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at > kafka.utils.TestUtils$.verifyTopicDeletion(TestUtils.scala:1056) at > kafka.admin.DeleteTopicTest.testRecreateTopicAfterDeletion(DeleteTopicTest.scala:283){quote} > STDOUT > {quote}[2019-03-07 16:05:05,661] ERROR [ReplicaFetcher replicaId=1, > leaderId=0, fetcherId=0] Error for partition test-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
[2019-03-07 16:05:26,122] WARN Unable to > read additional data from client sessionid 0x1006f1dd1a60003, likely client > has closed socket (org.apache.zookeeper.server.NIOServerCnxn:376) [2019-03-07 > 16:05:36,511] ERROR [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] > Error for partition test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. [2019-03-07 16:05:36,512] ERROR > [ReplicaFetcher replicaId=2, leaderId=0, fetcherId=0] Error for partition > test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. [2019-03-07 16:05:43,418] ERROR > [ReplicaFetcher replicaId=2, leaderId=0, fetcherId=0] Error for partition > test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. [2019-03-07 16:05:43,422] ERROR > [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition > test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. [2019-03-07 16:05:47,649] ERROR > [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition > test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. [2019-03-07 16:05:47,649] ERROR > [ReplicaFetcher replicaId=2, leaderId=0, fetcherId=0] Error for partition > test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
[2019-03-07 16:05:51,668] ERROR > [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition > test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. WARNING: If partitions are increased for > a topic that has a key, the partition logic or ordering of the messages will > be affected Adding partitions succeeded! [2019-03-07 16:05:56,135] WARN > Unable to read additional data from client sessionid 0x1006f1e2abb0006, > likely client has closed socket > (org.apache.zookeeper.server.NIOServerCnxn:376) [2019-03-07 16:06:00,286] > ERROR [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for > partition test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. [2019-03-07 16:06:00,357] ERROR > [ReplicaFetcher replicaId=2, leaderId=0, fetcherId=0] Error for partition > test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) >
[jira] [Updated] (KAFKA-8701) Flaky Test SaslSslAdminClientIntegrationTest#testDescribeConfigsForTopic
[ https://issues.apache.org/jira/browse/KAFKA-8701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8701: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test SaslSslAdminClientIntegrationTest#testDescribeConfigsForTopic > > > Key: KAFKA-8701 > URL: https://issues.apache.org/jira/browse/KAFKA-8701 > Project: Kafka > Issue Type: Bug > Components: unit tests >Affects Versions: 2.4.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/477/testReport/junit/kafka.api/SaslSslAdminClientIntegrationTest/testDescribeConfigsForTopic/] > {quote}org.scalatest.exceptions.TestFailedException: Partition [topic,0] > metadata not propagated after 15000 ms at > org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530) at > org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529) > at > org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389) > at org.scalatest.Assertions.fail(Assertions.scala:1091) at > org.scalatest.Assertions.fail$(Assertions.scala:1087) at > org.scalatest.Assertions$.fail(Assertions.scala:1389) at > kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:822) at > kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:911) at > kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:337) at > kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:336) at > scala.collection.immutable.Range.map(Range.scala:59) at > kafka.utils.TestUtils$.createTopic(TestUtils.scala:336) at > kafka.integration.KafkaServerTestHarness.createTopic(KafkaServerTestHarness.scala:126) > at > 
kafka.api.AdminClientIntegrationTest.testDescribeConfigsForTopic(AdminClientIntegrationTest.scala:1008){quote}
[jira] [Updated] (KAFKA-8133) Flaky Test MetadataRequestTest#testNoTopicsRequest
[ https://issues.apache.org/jira/browse/KAFKA-8133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8133: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test MetadataRequestTest#testNoTopicsRequest > -- > > Key: KAFKA-8133 > URL: https://issues.apache.org/jira/browse/KAFKA-8133 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.1.1 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://builds.apache.org/blue/organizations/jenkins/kafka-2.1-jdk8/detail/kafka-2.1-jdk8/151/tests] > {quote}org.apache.kafka.common.errors.TopicExistsException: Topic 't1' > already exists.{quote} > STDOUT: > {quote}[2019-03-20 03:49:00,982] ERROR [ReplicaFetcher replicaId=1, > leaderId=0, fetcherId=0] Error for partition isr-after-broker-shutdown-0 at > offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-20 03:49:00,982] ERROR [ReplicaFetcher replicaId=2, leaderId=0, > fetcherId=0] Error for partition isr-after-broker-shutdown-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-20 03:49:15,319] ERROR [ReplicaFetcher replicaId=1, leaderId=2, > fetcherId=0] Error for partition replicaDown-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
> [2019-03-20 03:49:15,319] ERROR [ReplicaFetcher replicaId=0, leaderId=2, > fetcherId=0] Error for partition replicaDown-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-20 03:49:20,049] ERROR [ReplicaFetcher replicaId=0, leaderId=1, > fetcherId=0] Error for partition testAutoCreate_Topic-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-20 03:49:27,080] ERROR [ReplicaFetcher replicaId=0, leaderId=2, > fetcherId=0] Error for partition __consumer_offsets-1 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-20 03:49:27,080] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=0] Error for partition __consumer_offsets-2 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-20 03:49:27,080] ERROR [ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-20 03:49:27,538] ERROR [ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Error for partition notInternal-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
> [2019-03-20 03:49:27,538] ERROR [ReplicaFetcher replicaId=0, leaderId=2, > fetcherId=0] Error for partition notInternal-1 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-20 03:49:28,863] WARN Unable to read additional data from client > sessionid 0x102fbd81b150003, likely client has closed socket > (org.apache.zookeeper.server.NIOServerCnxn:376) > [2019-03-20 03:49:40,478] ERROR [ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Error for partition t1-2 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-20 03:49:40,921] ERROR [ReplicaFetcher replicaId=0, leaderId=1, > fetcherId=0] Error for partition t2-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This
[jira] [Updated] (KAFKA-8075) Flaky Test GroupAuthorizerIntegrationTest#testTransactionalProducerTopicAuthorizationExceptionInCommit
[ https://issues.apache.org/jira/browse/KAFKA-8075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8075: - Fix Version/s: (was: 2.6.0) 2.6.1 2.7.0 Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread. > Flaky Test > GroupAuthorizerIntegrationTest#testTransactionalProducerTopicAuthorizationExceptionInCommit > -- > > Key: KAFKA-8075 > URL: https://issues.apache.org/jira/browse/KAFKA-8075 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.2.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.7.0, 2.6.1 > > > [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/56/testReport/junit/kafka.api/GroupAuthorizerIntegrationTest/testTransactionalProducerTopicAuthorizationExceptionInCommit/] > {quote}org.apache.kafka.common.errors.TimeoutException: Timeout expired while > initializing transactional state in 3000ms.{quote} > STDOUT > {quote}[2019-03-08 01:48:45,226] ERROR [Consumer clientId=consumer-99, > groupId=my-group] Offset commit failed on partition topic-0 at offset 5: Not > authorized to access topics: [Topic authorization failed.] 
> (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:812) > [2019-03-08 01:48:45,227] ERROR [Consumer clientId=consumer-99, > groupId=my-group] Not authorized to commit to topics [topic] > (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:850) > [2019-03-08 01:48:57,870] ERROR [KafkaApi-0] Error when handling request: > clientId=0, correlationId=0, api=UPDATE_METADATA, > body=\{controller_id=0,controller_epoch=1,broker_epoch=25,topic_states=[],live_brokers=[{id=0,end_points=[{port=43610,host=localhost,listener_name=PLAINTEXT,security_protocol_type=0}],rack=null}]} > (kafka.server.KafkaApis:76) > org.apache.kafka.common.errors.ClusterAuthorizationException: Request > Request(processor=0, connectionId=127.0.0.1:43610-127.0.0.1:44870-0, > session=Session(Group:testGroup,/127.0.0.1), > listenerName=ListenerName(PLAINTEXT), securityProtocol=PLAINTEXT, > buffer=null) is not authorized. [2019-03-08 01:49:14,858] ERROR [KafkaApi-0] > Error when handling request: clientId=0, correlationId=0, > api=UPDATE_METADATA, > body=\{controller_id=0,controller_epoch=1,broker_epoch=25,topic_states=[],live_brokers=[{id=0,end_points=[{port=44107,host=localhost,listener_name=PLAINTEXT,security_protocol_type=0}],rack=null}]} > (kafka.server.KafkaApis:76) > org.apache.kafka.common.errors.ClusterAuthorizationException: Request > Request(processor=0, connectionId=127.0.0.1:44107-127.0.0.1:38156-0, > session=Session(Group:testGroup,/127.0.0.1), > listenerName=ListenerName(PLAINTEXT), securityProtocol=PLAINTEXT, > buffer=null) is not authorized. 
[2019-03-08 01:49:21,984] ERROR [KafkaApi-0] > Error when handling request: clientId=0, correlationId=0, > api=UPDATE_METADATA, > body=\{controller_id=0,controller_epoch=1,broker_epoch=25,topic_states=[],live_brokers=[{id=0,end_points=[{port=39025,host=localhost,listener_name=PLAINTEXT,security_protocol_type=0}],rack=null}]} > (kafka.server.KafkaApis:76) > org.apache.kafka.common.errors.ClusterAuthorizationException: Request > Request(processor=0, connectionId=127.0.0.1:39025-127.0.0.1:41474-0, > session=Session(Group:testGroup,/127.0.0.1), > listenerName=ListenerName(PLAINTEXT), securityProtocol=PLAINTEXT, > buffer=null) is not authorized. [2019-03-08 01:49:39,438] ERROR [KafkaApi-0] > Error when handling request: clientId=0, correlationId=0, > api=UPDATE_METADATA, > body=\{controller_id=0,controller_epoch=1,broker_epoch=25,topic_states=[],live_brokers=[{id=0,end_points=[{port=44798,host=localhost,listener_name=PLAINTEXT,security_protocol_type=0}],rack=null}]} > (kafka.server.KafkaApis:76) > org.apache.kafka.common.errors.ClusterAuthorizationException: Request > Request(processor=0, connectionId=127.0.0.1:44798-127.0.0.1:58496-0, > session=Session(Group:testGroup,/127.0.0.1), > listenerName=ListenerName(PLAINTEXT), securityProtocol=PLAINTEXT, > buffer=null) is not authorized. Error: Consumer group 'my-group' does not > exist. [2019-03-08 01:49:55,502] WARN Ignoring unexpected runtime exception > (org.apache.zookeeper.server.NIOServerCnxnFactory:236) > java.nio.channels.CancelledKeyException at > sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73) at > sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87) at >