Re: [PR] MINOR: Fix flaky test ConnectWorkerIntegrationTest::testReconfigureConnectorWithFailingTaskConfigs [kafka]
C0urante merged PR #16273: URL: https://github.com/apache/kafka/pull/16273

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINOR: Fix flaky test ConnectWorkerIntegrationTest::testReconfigureConnectorWithFailingTaskConfigs [kafka]
C0urante commented on PR #16273: URL: https://github.com/apache/kafka/pull/16273#issuecomment-2160373057

Are you seeing these failures in CI, running locally in a normal environment, or running locally in a special environment? I don't think we need to worry about them if they aren't cropping up in the wild.
Re: [PR] MINOR: Fix flaky test ConnectWorkerIntegrationTest::testReconfigureConnectorWithFailingTaskConfigs [kafka]
gharris1727 commented on PR #16273: URL: https://github.com/apache/kafka/pull/16273#issuecomment-2159746582

I still see some remaining flakiness in this test. It fails ~10% of the time at 30% CPU, rising to ~70% of the time at 15% CPU. The failures are mostly this one:

```
caught: org.apache.kafka.connect.errors.DataException: Insufficient records committed by connector simple-connector in 300 millis. Records expected=8, actual=0
	at org.apache.kafka.connect.integration.ConnectorHandle.awaitCommits(ConnectorHandle.java:213)
	at org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testReconfigureConnectorWithFailingTaskConfigs(ConnectWorkerIntegrationTest.java:1292)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
```

with a handful of these two:

```
caught: java.lang.AssertionError: Connector tasks were not restarted in time
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.assertTrue(Assert.java:42)
	at org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testReconfigureConnectorWithFailingTaskConfigs(ConnectWorkerIntegrationTest.java:1310)
```

```
caught: org.apache.kafka.connect.errors.DataException: Insufficient records committed by connector simple-connector in 300 millis. Records expected=1, actual=0
	at org.apache.kafka.connect.integration.ConnectorHandle.awaitCommits(ConnectorHandle.java:213)
	at org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testReconfigureConnectorWithFailingTaskConfigs(ConnectWorkerIntegrationTest.java:1317)
```

I'll look into this more tomorrow if you need some more info.
[PR] MINOR: Fix flaky test ConnectWorkerIntegrationTest::testReconfigureConnectorWithFailingTaskConfigs [kafka]
C0urante opened a new pull request, #16273: URL: https://github.com/apache/kafka/pull/16273

This test has been flaky since it was merged to trunk. To date, there have been 566 successful runs and 8 flaky failures (see [Gradle Enterprise analysis](https://ge.apache.org/scans/tests?search.relativeStartTime=P28D=kafka=America%2FNew_York=org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest=Wzhd=testReconfigureConnectorWithFailingTaskConfigs)).

One possible cause is that the test establishes an expectation on the number of offset commits that must take place (two per task) before reconfiguring the connector, while assuming that those commits will only take place after the tasks have been restarted. In rare cases, the commits can complete before the tasks are restarted, which causes an assertion failure with the message "java.lang.AssertionError: Source connector should have published at least one record to new Kafka topic after being reconfigured".

This patch should resolve those failures by establishing the expected number of offset commits _after_ the connector has been reconfigured and its tasks have been restarted, which guarantees that the offset commits are performed by tasks running with the updated connector configuration. In addition, the number of expected offset commits is reduced to one, since a single commit is all we need in order to expect at least one record in the new Kafka topic.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
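The race described in the PR body can be sketched with a small latch-based handle. This is a hypothetical stand-in loosely modeled on `ConnectorHandle`'s `expectedCommits`/`awaitCommits` pattern, not the actual Kafka Connect test code; the `CommitHandle` class and its method names are assumptions for illustration only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class CommitRaceSketch {

    // Hypothetical sketch of a commit-counting handle: expectedCommits() arms
    // a latch, and each commit() (wherever it comes from) counts it down.
    static class CommitHandle {
        private volatile CountDownLatch commitLatch = new CountDownLatch(0);

        // Establish the expectation; commits recorded before this call are ignored.
        void expectedCommits(int expected) {
            commitLatch = new CountDownLatch(expected);
        }

        // Called by a (simulated) task whenever it commits offsets.
        void commit() {
            commitLatch.countDown();
        }

        boolean awaitCommits(long timeoutMs) throws InterruptedException {
            return commitLatch.await(timeoutMs, TimeUnit.MILLISECONDS);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CommitHandle handle = new CommitHandle();

        // Flaky ordering: the expectation is set BEFORE reconfiguring the
        // connector, so commits from the old (pre-restart) tasks can satisfy it.
        handle.expectedCommits(2);
        handle.commit(); // commit from a pre-restart task
        handle.commit(); // another pre-restart commit
        // The latch is already at zero, so the wait passes even though zero
        // records were produced by tasks with the new configuration:
        System.out.println("flaky ordering passes early: " + handle.awaitCommits(100));

        // Fixed ordering: reconfigure/restart first, then expect one commit.
        // Only commits that arrive afterwards can count the latch down.
        handle.expectedCommits(1);
        System.out.println("no post-restart commit yet: " + handle.awaitCommits(100));
        handle.commit(); // commit from a restarted task
        System.out.println("post-restart commit observed: " + handle.awaitCommits(100));
    }
}
```

Moving `expectedCommits` after the restart (and lowering the count to one) means the latch can only be released by post-restart tasks, which is exactly the ordering guarantee the patch relies on.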