[ https://issues.apache.org/jira/browse/KAFKA-9295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334937#comment-17334937 ]
A. Sophie Blee-Goldman commented on KAFKA-9295:
-----------------------------------------------

It's failing on
{noformat}
startApplicationAndWaitUntilRunning(kafkaStreamsList, ofSeconds(60));
{noformat}
At this point a session timeout seems unlikely, since [~showuon] observed a Streams instance dropping out on the heartbeat interval only a couple of times even when it failed, and all it has to do here is get to RUNNING once. It doesn't require that all KafkaStreams in the list get to RUNNING and then stay there, so all the instance has to do is start up and go through at least one successful rebalance in that time. There's nothing to restore, so the transition to RUNNING should be immediate after the rebalance.

Now, technically 60s is a typical timeout for startApplicationAndWaitUntilRunning in the Streams integration tests, but the difference between this and other tests is that most have only one or two KafkaStreams to start up, whereas this test has three. They're not started up and waited on sequentially, so that shouldn't _really_ matter that much, but it might still be that a longer timeout should be used in this case. I'm open to other theories, however.

Also note that we should soon have a larger default session timeout, so once Jason's KIP for that has been implemented we'll get that improvement for free. Even if we think the session timeout is the problem with this test, it probably makes more sense to wait for that KIP than to hardcode some special value.
If it starts to fail very frequently we can reconsider, but I haven't observed it doing so since the last fix was merged.

> KTableKTableForeignKeyInnerJoinMultiIntegrationTest#shouldInnerJoinMultiPartitionQueryable
> ------------------------------------------------------------------------------------------
>
> Key: KAFKA-9295
> URL: https://issues.apache.org/jira/browse/KAFKA-9295
> Project: Kafka
> Issue Type: Bug
> Components: streams, unit tests
> Affects Versions: 2.4.0, 2.6.0
> Reporter: Matthias J. Sax
> Assignee: Luke Chen
> Priority: Critical
> Labels: flaky-test
> Fix For: 3.0.0
>
>
> [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/27106/testReport/junit/org.apache.kafka.streams.integration/KTableKTableForeignKeyInnerJoinMultiIntegrationTest/shouldInnerJoinMultiPartitionQueryable/]
> {quote}java.lang.AssertionError: Did not receive all 1 records from topic output- within 60000 ms
> Expected: is a value equal to or greater than <1>
> but: <0> was less than <1>
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
> at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$1(IntegrationTestUtils.java:515)
> at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:417)
> at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:385)
> at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:511)
> at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:489)
> at org.apache.kafka.streams.integration.KTableKTableForeignKeyInnerJoinMultiIntegrationTest.verifyKTableKTableJoin(KTableKTableForeignKeyInnerJoinMultiIntegrationTest.java:200)
> at org.apache.kafka.streams.integration.KTableKTableForeignKeyInnerJoinMultiIntegrationTest.shouldInnerJoinMultiPartitionQueryable(KTableKTableForeignKeyInnerJoinMultiIntegrationTest.java:183){quote}
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
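
For readers following the comment above, here is a rough sketch of the wait condition it describes: all instances are started together, and each one only has to be observed in RUNNING once before the timeout, even if it drops back to REBALANCING afterwards. This is not the code in IntegrationTestUtils. The class RunningOnceSketch, the FakeInstance stand-in, and the method startAndWaitUntilEachReachedRunning are hypothetical names invented for illustration, and the "rebalance" is just a sleep on a background thread.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

public class RunningOnceSketch {

    enum State { REBALANCING, RUNNING }

    /** Hypothetical stand-in for a KafkaStreams instance; not the real class. */
    static final class FakeInstance {
        private final long rebalanceMillis;
        private final boolean dropsOutAfterRunning;
        private volatile Consumer<State> listener = state -> { };

        FakeInstance(long rebalanceMillis, boolean dropsOutAfterRunning) {
            this.rebalanceMillis = rebalanceMillis;
            this.dropsOutAfterRunning = dropsOutAfterRunning;
        }

        void setListener(Consumer<State> listener) {
            this.listener = listener;
        }

        /** Simulates startup: one rebalance, then RUNNING, then optionally dropping out again. */
        void start() {
            Thread thread = new Thread(() -> {
                listener.accept(State.REBALANCING);
                try {
                    Thread.sleep(rebalanceMillis);
                } catch (InterruptedException e) {
                    return;
                }
                listener.accept(State.RUNNING);
                if (dropsOutAfterRunning) {
                    listener.accept(State.REBALANCING);
                }
            });
            thread.setDaemon(true);
            thread.start();
        }
    }

    /**
     * Starts all instances without waiting in between, then blocks until each one
     * has been seen in RUNNING at least once, or until the shared timeout expires.
     */
    static boolean startAndWaitUntilEachReachedRunning(List<FakeInstance> instances, Duration timeout) {
        CountDownLatch latch = new CountDownLatch(instances.size());
        for (FakeInstance instance : instances) {
            // One-shot flag: an instance that reaches RUNNING more than once is counted once.
            AtomicBoolean seenRunning = new AtomicBoolean(false);
            instance.setListener(state -> {
                if (state == State.RUNNING && seenRunning.compareAndSet(false, true)) {
                    latch.countDown();
                }
            });
        }
        for (FakeInstance instance : instances) {
            instance.start();
        }
        try {
            return latch.await(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Three instances, as in the failing test; the second one drops out after RUNNING.
        List<FakeInstance> three = Arrays.asList(
            new FakeInstance(50, false),
            new FakeInstance(100, true),
            new FakeInstance(150, false));
        System.out.println(startAndWaitUntilEachReachedRunning(three, Duration.ofSeconds(60)));
    }
}
```

Under this model, the instance that drops out after reaching RUNNING still counts, which is why a session timeout after startup would not by itself make the wait fail. The wait fails only if some instance never completes its first rebalance within the shared timeout.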