[jira] [Commented] (KAFKA-14089) Flaky ExactlyOnceSourceIntegrationTest.testSeparateOffsetsTopic
[ https://issues.apache.org/jira/browse/KAFKA-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17855668#comment-17855668 ]

Chris Egerton commented on KAFKA-14089:
---

Should be fixed by https://github.com/apache/kafka/pull/16306.

> Flaky ExactlyOnceSourceIntegrationTest.testSeparateOffsetsTopic
> ---
>
> Key: KAFKA-14089
> URL: https://issues.apache.org/jira/browse/KAFKA-14089
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 3.3.0
> Reporter: Mickael Maison
> Priority: Major
> Fix For: 3.3.0
> Attachments: failure.txt, org.apache.kafka.connect.integration.ExactlyOnceSourceIntegrationTest.testSeparateOffsetsTopic.test.stdout
>
> It looks like the sequence got broken around "65535, 65537, 65536, 65539, 65538, 65541, 65540, 65543"

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (KAFKA-14089) Flaky ExactlyOnceSourceIntegrationTest.testSeparateOffsetsTopic
[ https://issues.apache.org/jira/browse/KAFKA-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830728#comment-17830728 ]

Justine Olshan commented on KAFKA-14089:
---

I've seen this again as well.
https://ge.apache.org/scans/tests?search.names=Git%20branch=P28D=kafka=America%2FLos_Angeles=trunk=org.apache.kafka.connect.integration.ExactlyOnceSourceIntegrationTest=testSeparateOffsetsTopic
[jira] [Commented] (KAFKA-14089) Flaky ExactlyOnceSourceIntegrationTest.testSeparateOffsetsTopic
[ https://issues.apache.org/jira/browse/KAFKA-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17569715#comment-17569715 ]

Chris Egerton commented on KAFKA-14089:
---

Thanks Mickael. I put together a draft fix [here|https://github.com/apache/kafka/pull/12429], although I still haven't been able to replicate the failure locally. If you have time, would you mind giving it a try and seeing whether it has positive effects in your environment?

I can also kick off several Jenkins builds by re-triggering CI runs, although that will be more time-consuming, as it will run the build for the whole project instead of just Connect.

> Flaky ExactlyOnceSourceIntegrationTest.testSeparateOffsetsTopic
> ---
>
> Key: KAFKA-14089
> URL: https://issues.apache.org/jira/browse/KAFKA-14089
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 3.3.0
> Reporter: Mickael Maison
> Assignee: Chris Egerton
> Priority: Major
> Attachments: failure.txt, org.apache.kafka.connect.integration.ExactlyOnceSourceIntegrationTest.testSeparateOffsetsTopic.test.stdout
>
> It looks like the sequence got broken around "65535, 65537, 65536, 65539, 65538, 65541, 65540, 65543"
[jira] [Commented] (KAFKA-14089) Flaky ExactlyOnceSourceIntegrationTest.testSeparateOffsetsTopic
[ https://issues.apache.org/jira/browse/KAFKA-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17569309#comment-17569309 ]

Mickael Maison commented on KAFKA-14089:
---

I've hit this issue locally and I can reproduce it fairly consistently. See the attached logs: [^org.apache.kafka.connect.integration.ExactlyOnceSourceIntegrationTest.testSeparateOffsetsTopic.test.stdout]
[jira] [Commented] (KAFKA-14089) Flaky ExactlyOnceSourceIntegrationTest.testSeparateOffsetsTopic
[ https://issues.apache.org/jira/browse/KAFKA-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17569204#comment-17569204 ]

Chris Egerton commented on KAFKA-14089:
---

Thanks [~mimaison]. We don't assert on the order of records, just that the expected seqnos were present in any order, so the wonkiness around 65535 isn't actually an issue (and it's even present in the stringified representation of both the expected _and_ the actual seqno sets).

After doing some Bash scrubbing on the file attached to the ticket, it looks like seqnos start to go missing (i.e., they're in the expected set but not the actual set) between 114463 and 114754. Not every seqno in that range is missing, but there are 105 missing in total. After that, starting at 114755, there are 105 extra seqnos (i.e., in the actual set but not the expected set).

Given that the issues crop up at the very end of the seqno set, it seems like this could be caused by non-graceful shutdown of the worker after exactly-once support is disabled, or possibly even by the recently discovered KAFKA-14079. It's a little worrisome, though, since the results here indicate possible data loss.

If this was on Jenkins, do you have a link to the CI run that caused it? Or if it was encountered elsewhere, do you have any logs available? I'll try to kick off some local runs, but I'm in the middle of stress-testing my laptop with the latest KIP-618 system tests and may not be able to reproduce locally.
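
The expected-versus-actual comparison described in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not the test's real assertion logic, nor the Bash scrubbing that was actually run on failure.txt. The function name `diff_seqnos` and the sample numbers are hypothetical and chosen purely for the example.

```python
def diff_seqnos(expected, actual):
    """Compare two collections of sequence numbers, ignoring order.

    Returns (missing, extra) as sorted lists:
      missing = in the expected set but not the actual set
      extra   = in the actual set but not the expected set
    """
    expected_set = set(expected)
    actual_set = set(actual)
    missing = sorted(expected_set - actual_set)
    extra = sorted(actual_set - expected_set)
    return missing, extra


# Made-up data, NOT taken from the ticket's attachment.
# The out-of-order pairs (1, 0) and (3, 2) do not matter,
# because only set membership is compared.
expected = range(0, 10)
actual = [1, 0, 3, 2, 4, 5, 6, 10, 11, 12]

missing, extra = diff_seqnos(expected, actual)
print("missing:", missing)  # missing: [7, 8, 9]
print("extra:", extra)      # extra: [10, 11, 12]
```

With this kind of comparison, reordering alone (like the pattern around 65535) yields two empty lists, while dropped records show up as a non-empty `missing` list and unexpected ones as a non-empty `extra` list.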
[jira] [Commented] (KAFKA-14089) Flaky ExactlyOnceSourceIntegrationTest.testSeparateOffsetsTopic
[ https://issues.apache.org/jira/browse/KAFKA-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17568972#comment-17568972 ]

Mickael Maison commented on KAFKA-14089:
---

cc [~ChrisEgerton]