[jira] [Commented] (KAFKA-14733) Update AclAuthorizerTest to run tests for both zk and kraft mode
[ https://issues.apache.org/jira/browse/KAFKA-14733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838647#comment-17838647 ] Matthew de Detrich commented on KAFKA-14733: [~emissionnebula] Have you started working on this or can I look into it? > Update AclAuthorizerTest to run tests for both zk and kraft mode > > > Key: KAFKA-14733 > URL: https://issues.apache.org/jira/browse/KAFKA-14733 > Project: Kafka > Issue Type: Improvement >Reporter: Purshotam Chauhan >Assignee: Purshotam Chauhan >Priority: Minor > > Currently, we have two test classes AclAuthorizerTest and > StandardAuthorizerTest that are used exclusively for zk and kraft mode. > But AclAuthorizerTest has a lot of tests covering various scenarios. We > should change AclAuthorizerTest to run for both zk and kraft modes so as to > keep parity between both modes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
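The parity idea above can be sketched minimally. This is not Kafka's actual test harness (the real tests parameterize over a "quorum" argument with JUnit's @ParameterizedTest); the class and method names here are hypothetical stand-ins that just show one test body being executed under both cluster modes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: run the same test body under both "zk" and "kraft"
// modes so the two stay in parity. Kafka itself achieves this with JUnit's
// @ParameterizedTest over a quorum argument; this stand-in just loops.
class DualModeTest {
    static List<String> runForAllModes(Consumer<String> testBody) {
        List<String> ran = new ArrayList<>();
        for (String quorum : List.of("zk", "kraft")) {
            testBody.accept(quorum); // same assertions executed in both modes
            ran.add(quorum);
        }
        return ran;
    }
}
```
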
[jira] [Commented] (KAFKA-15744) KRaft support in CustomQuotaCallbackTest
[ https://issues.apache.org/jira/browse/KAFKA-15744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838644#comment-17838644 ] Matthew de Detrich commented on KAFKA-15744: [~high.lee] Are you still working on this issue, or is it okay if I take over? > KRaft support in CustomQuotaCallbackTest > > > Key: KAFKA-15744 > URL: https://issues.apache.org/jira/browse/KAFKA-15744 > Project: Kafka > Issue Type: Task > Components: core >Reporter: Sameer Tejani >Assignee: highluck >Priority: Minor > Labels: kraft, kraft-test, newbie > > The following tests in CustomQuotaCallbackTest in > core/src/test/scala/integration/kafka/api/CustomQuotaCallbackTest.scala need > to be updated to support KRaft > 90 : def testCustomQuotaCallback(): Unit = { > Scanned 468 lines. Found 0 KRaft tests out of 1 tests -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-15724) KRaft support in OffsetFetchRequestTest
[ https://issues.apache.org/jira/browse/KAFKA-15724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838643#comment-17838643 ] Matthew de Detrich commented on KAFKA-15724: [~shivsundar] Is it okay if I take over the ticket, or are you still working on it? > KRaft support in OffsetFetchRequestTest > --- > > Key: KAFKA-15724 > URL: https://issues.apache.org/jira/browse/KAFKA-15724 > Project: Kafka > Issue Type: Task > Components: core >Reporter: Sameer Tejani >Assignee: Shivsundar R >Priority: Minor > Labels: kraft, kraft-test, newbie > > The following tests in OffsetFetchRequestTest in > core/src/test/scala/unit/kafka/server/OffsetFetchRequestTest.scala need to be > updated to support KRaft > 83 : def testOffsetFetchRequestSingleGroup(): Unit = { > 130 : def testOffsetFetchRequestAllOffsetsSingleGroup(): Unit = { > 180 : def testOffsetFetchRequestWithMultipleGroups(): Unit = { > Scanned 246 lines. Found 0 KRaft tests out of 3 tests -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-15709) KRaft support in ServerStartupTest
[ https://issues.apache.org/jira/browse/KAFKA-15709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838640#comment-17838640 ] Matthew de Detrich commented on KAFKA-15709: [~linzihao1999] Are you still working on this, or can I take over? > KRaft support in ServerStartupTest > -- > > Key: KAFKA-15709 > URL: https://issues.apache.org/jira/browse/KAFKA-15709 > Project: Kafka > Issue Type: Task > Components: core >Reporter: Sameer Tejani >Assignee: Zihao Lin >Priority: Minor > Labels: kraft, kraft-test, newbie > > The following tests in ServerStartupTest in > core/src/test/scala/unit/kafka/server/ServerStartupTest.scala need to be > updated to support KRaft > 38 : def testBrokerCreatesZKChroot(): Unit = { > 51 : def testConflictBrokerStartupWithSamePort(): Unit = { > 65 : def testConflictBrokerRegistration(): Unit = { > 82 : def testBrokerSelfAware(): Unit = { > 93 : def testBrokerStateRunningAfterZK(): Unit = { > Scanned 107 lines. Found 0 KRaft tests out of 5 tests -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-15737) KRaft support in ConsumerBounceTest
[ https://issues.apache.org/jira/browse/KAFKA-15737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838639#comment-17838639 ] Matthew de Detrich edited comment on KAFKA-15737 at 4/18/24 12:39 PM: -- [~k-raina] Are you working on this, or can I have a look into it? was (Author: JIRAUSER297606): I'll have a look at this > KRaft support in ConsumerBounceTest > --- > > Key: KAFKA-15737 > URL: https://issues.apache.org/jira/browse/KAFKA-15737 > Project: Kafka > Issue Type: Task > Components: core >Reporter: Sameer Tejani >Assignee: Kaushik Raina >Priority: Minor > Labels: kraft, kraft-test, newbie > > The following tests in ConsumerBounceTest in > core/src/test/scala/integration/kafka/api/ConsumerBounceTest.scala need to be > updated to support KRaft > 81 : def testConsumptionWithBrokerFailures(): Unit = > consumeWithBrokerFailures(10) > 122 : def testSeekAndCommitWithBrokerFailures(): Unit = > seekAndCommitWithBrokerFailures(5) > 161 : def testSubscribeWhenTopicUnavailable(): Unit = { > 212 : def testClose(): Unit = { > 297 : def > testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup(): > Unit = { > 337 : def testConsumerReceivesFatalExceptionWhenGroupPassesMaxSize(): Unit = { > 370 : def testCloseDuringRebalance(): Unit = { > Scanned 535 lines. Found 0 KRaft tests out of 7 tests -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-15737) KRaft support in ConsumerBounceTest
[ https://issues.apache.org/jira/browse/KAFKA-15737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838639#comment-17838639 ] Matthew de Detrich commented on KAFKA-15737: I'll have a look at this > KRaft support in ConsumerBounceTest > --- > > Key: KAFKA-15737 > URL: https://issues.apache.org/jira/browse/KAFKA-15737 > Project: Kafka > Issue Type: Task > Components: core >Reporter: Sameer Tejani >Assignee: Kaushik Raina >Priority: Minor > Labels: kraft, kraft-test, newbie > > The following tests in ConsumerBounceTest in > core/src/test/scala/integration/kafka/api/ConsumerBounceTest.scala need to be > updated to support KRaft > 81 : def testConsumptionWithBrokerFailures(): Unit = > consumeWithBrokerFailures(10) > 122 : def testSeekAndCommitWithBrokerFailures(): Unit = > seekAndCommitWithBrokerFailures(5) > 161 : def testSubscribeWhenTopicUnavailable(): Unit = { > 212 : def testClose(): Unit = { > 297 : def > testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup(): > Unit = { > 337 : def testConsumerReceivesFatalExceptionWhenGroupPassesMaxSize(): Unit = { > 370 : def testCloseDuringRebalance(): Unit = { > Scanned 535 lines. Found 0 KRaft tests out of 7 tests -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14547) Be able to run kafka KRaft Server in tests without needing to run a storage setup script
[ https://issues.apache.org/jira/browse/KAFKA-14547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838637#comment-17838637 ] Matthew de Detrich commented on KAFKA-14547: I'll have a look into this > Be able to run kafka KRaft Server in tests without needing to run a storage > setup script > > > Key: KAFKA-14547 > URL: https://issues.apache.org/jira/browse/KAFKA-14547 > Project: Kafka > Issue Type: Improvement > Components: kraft >Affects Versions: 3.3.1 >Reporter: Natan Silnitsky >Priority: Major > > Currently kafka KRaft Server requires running kafka-storage.sh in order to > start properly. > This makes setup much more cumbersome for build tools such as bazel. > One way to mitigate this is to configure the paths via kafkaConfig... -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-14572) Migrate EmbeddedKafkaCluster used by Streams integration tests from EmbeddedZookeeper to KRaft
[ https://issues.apache.org/jira/browse/KAFKA-14572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-14572: -- Assignee: Matthew de Detrich > Migrate EmbeddedKafkaCluster used by Streams integration tests from > EmbeddedZookeeper to KRaft > -- > > Key: KAFKA-14572 > URL: https://issues.apache.org/jira/browse/KAFKA-14572 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Assignee: Matthew de Detrich >Priority: Blocker > Fix For: 4.0.0 > > > ZK will be removed in 4.0, and we need to update our tests to switch from ZK > to KRaft. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14572) Migrate EmbeddedKafkaCluster used by Streams integration tests from EmbeddedZookeeper to KRaft
[ https://issues.apache.org/jira/browse/KAFKA-14572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795182#comment-17795182 ] Matthew de Detrich commented on KAFKA-14572: I'll have a look into this > Migrate EmbeddedKafkaCluster used by Streams integration tests from > EmbeddedZookeeper to KRaft > -- > > Key: KAFKA-14572 > URL: https://issues.apache.org/jira/browse/KAFKA-14572 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Priority: Blocker > Fix For: 4.0.0 > > > ZK will be removed in 4.0, and we need to update our tests to switch from ZK > to KRaft. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14237) Kafka TLS Doesn't Present Intermediary Certificates when using PEM
[ https://issues.apache.org/jira/browse/KAFKA-14237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795181#comment-17795181 ] Matthew de Detrich commented on KAFKA-14237: I believe this issue has been resolved so it can be closed (see https://issues.apache.org/jira/browse/KAFKA-10338) > Kafka TLS Doesn't Present Intermediary Certificates when using PEM > -- > > Key: KAFKA-14237 > URL: https://issues.apache.org/jira/browse/KAFKA-14237 > Project: Kafka > Issue Type: Bug > Components: security >Affects Versions: 3.2.1 > Environment: Deployed using the Bitnami Helm > Chart(https://github.com/bitnami/charts/tree/master/bitnami/kafka) > The Bitnami Helm Chart uses Docker Image: > https://github.com/bitnami/containers/tree/main/bitnami/kafka > An issue was already opened with Bitnami and they told us to send this > upstream: https://github.com/bitnami/containers/issues/6654 >Reporter: Ryan R >Priority: Blocker > > When using PEM TLS certificates, Kafka does not present the entire > certificate chain. 
> > Our {{/opt/bitnami/kafka/config/server.properties}} file looks like this: > {code:java} > ssl.keystore.type=PEM > ssl.truststore.type=PEM > ssl.keystore.key=-----BEGIN PRIVATE KEY----- \ > > -----END PRIVATE KEY----- > ssl.keystore.certificate.chain=-----BEGIN CERTIFICATE----- \ > > -----END CERTIFICATE----- \ > -----BEGIN CERTIFICATE----- \ > MIIFFjCCAv6gAwIBAgIRAJErCErPDBinU/bWLiWnX1owDQYJKoZIhvcNAQELBQAw \ > TzELMAkGA1UEBhMCVVMxKTAnBgNVBAoTIEludGVybmV0IFNlY3VyaXR5IFJlc2Vh \ > cmNoIEdyb3VwMRUwEwYDVQQDEwxJU1JHIFJvb3QgWDEwHhcNMjAwOTA0MDAwMDAw \ > WhcNMjUwOTE1MTYwMDAwWjAyMQswCQYDVQQGEwJVUzEWMBQGA1UEChMNTGV0J3Mg \ > RW5jcnlwdDELMAkGA1UEAxMCUjMwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEK \ > AoIBAQC7AhUozPaglNMPEuyNVZLD+ILxmaZ6QoinXSaqtSu5xUyxr45r+XXIo9cP \ > R5QUVTVXjJ6oojkZ9YI8QqlObvU7wy7bjcCwXPNZOOftz2nwWgsbvsCUJCWH+jdx \ > sxPnHKzhm+/b5DtFUkWWqcFTzjTIUu61ru2P3mBw4qVUq7ZtDpelQDRrK9O8Zutm \ > NHz6a4uPVymZ+DAXXbpyb/uBxa3Shlg9F8fnCbvxK/eG3MHacV3URuPMrSXBiLxg \ > Z3Vms/EY96Jc5lP/Ooi2R6X/ExjqmAl3P51T+c8B5fWmcBcUr2Ok/5mzk53cU6cG \ > /kiFHaFpriV1uxPMUgP17VGhi9sVAgMBAAGjggEIMIIBBDAOBgNVHQ8BAf8EBAMC \ > AYYwHQYDVR0lBBYwFAYIKwYBBQUHAwIGCCsGAQUFBwMBMBIGA1UdEwEB/wQIMAYB \ > Af8CAQAwHQYDVR0OBBYEFBQusxe3WFbLrlAJQOYfr52LFMLGMB8GA1UdIwQYMBaA \ > FHm0WeZ7tuXkAXOACIjIGlj26ZtuMDIGCCsGAQUFBwEBBCYwJDAiBggrBgEFBQcw \ > AoYWaHR0cDovL3gxLmkubGVuY3Iub3JnLzAnBgNVHR8EIDAeMBygGqAYhhZodHRw \ > Oi8veDEuYy5sZW5jci5vcmcvMCIGA1UdIAQbMBkwCAYGZ4EMAQIBMA0GCysGAQQB \ > gt8TAQEBMA0GCSqGSIb3DQEBCwUAA4ICAQCFyk5HPqP3hUSFvNVneLKYY611TR6W \ > PTNlclQtgaDqw+34IL9fzLdwALduO/ZelN7kIJ+m74uyA+eitRY8kc607TkC53wl \ > ikfmZW4/RvTZ8M6UK+5UzhK8jCdLuMGYL6KvzXGRSgi3yLgjewQtCPkIVz6D2QQz \ > CkcheAmCJ8MqyJu5zlzyZMjAvnnAT45tRAxekrsu94sQ4egdRCnbWSDtY7kh+BIm \ > lJNXoB1lBMEKIq4QDUOXoRgffuDghje1WrG9ML+Hbisq/yFOGwXD9RiX8F6sw6W4 \ > avAuvDszue5L3sz85K+EC4Y/wFVDNvZo4TYXao6Z0f+lQKc0t8DQYzk1OXVu8rp2 \ > yJMC6alLbBfODALZvYH7n7do1AZls4I9d1P4jnkDrQoxB3UqQ9hVl3LEKQ73xF1O \ > yK5GhDDX8oVfGKF5u+decIsH4YaTw7mP3GFxJSqv3+0lUFJoi5Lc5da149p90Ids \ > 
hCExroL1+7mryIkXPeFM5TgO9r0rvZaBFOvV2z0gp35Z0+L4WPlbuEjN/lxPFin+ \ > HlUjr8gRsI3qfJOQFy/9rKIJR0Y/8Omwt/8oTWgy1mdeHmmjk7j1nYsvC9JSQ6Zv \ > MldlTTKB3zhThV1+XWYp6rjd5JW1zbVWEkLNxE7GJThEUG3szgBVGP7pSWTUTsqX \ > nLRbwHOoq7hHwg== \ > -----END CERTIFICATE----- \ > ssl.truststore.certificates=-----BEGIN CERTIFICATE----- \ > MIIFazCCA1OgAwIBAgIRAIIQz7DSQONZRGPgu2OCiwAwDQYJKoZIhvcNAQELBQAw \ > TzELMAkGA1UEBhMCVVMxKTAnBgNVBAoTIEludGVybmV0IFNlY3VyaXR5IFJlc2Vh \ > cmNoIEdyb3VwMRUwEwYDVQQDEwxJU1JHIFJvb3QgWDEwHhcNMTUwNjA0MTEwNDM4 \ > WhcNMzUwNjA0MTEwNDM4WjBPMQswCQYDVQQGEwJVUzEpMCcGA1UEChMgSW50ZXJu \ > ZXQgU2VjdXJpdHkgUmVzZWFyY2ggR3JvdXAxFTATBgNVBAMTDElTUkcgUm9vdCBY \ > MTCCAiIwDQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBAK3oJHP0FDfzm54rVygc \ > h77ct984kIxuPOZXoHj3dcKi/vVqbvYATyjb3miGbESTtrFj/RQSa78f0uoxmyF+ \ > 0TM8ukj13Xnfs7j/EvEhmkvBioZxaUpmZmyPfjxwv60pIgbz5MDmgK7iS4+3mX6U \ > A5/TR5d8mUgjU+g4rk8Kb4Mu0UlXjIB0ttov0DiNewNwIRt18jA8+o+u3dpjq+sW \ > T8KOEUt+zwvo/7V3LvSye0rgTBIlDHCNAymg4VMk7BPZ7hm/ELNKjD+Jo2FR3qyH \ > B5T0Y3HsLuJvW5iB4YlcNHlsdu87kGJ55tukmi8mxdAQ4Q7e2RCOFvu396j3x+UC \ > B5iPNgiV5+I3lg02dZ77DnKxHZu8A/lJBdiB3QW0KtZB6awBdpUKD9jf1b0SHzUv \ > KBds0pjBqAlkd25HN7rOrFleaJ1/ctaJxQZBKT5ZPt0m9STJEadao0xAH0ahmbWn \ > OlFuhjuefXKnEgV4We0+UXgVCwOPjdAvBbI+e0ocS3MFEvzG6uBQE3xDk3SzynTn \ > jh8BCNAw1FtxNrQHusEwMFxIt4I7mKZ9YIqioymCzLq9gwQbooMDQaHWBfEbwrbw \ > qHyGO0aoSCqI3Haadr8faqU9GY/rOPNk3sgrDQoo//fb4hVC1CLQJ13hef4Y53CI \ > rU7m2Ys6xt0nUW7/vGT1M0NPAgMBAAGjQjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNV \ > HRMBAf8EBTADAQH/MB0GA1UdDgQWBBR5tFnme7bl5AFzgAiIyBpY9umbbjANBgkq \ > hkiG9w0BAQsFAAOCAgEAVR9YqbyyqFDQDLHYGmkgJykIrGF1XIpu+ILlaS/V9lZL \ > ubhzEFnTIZd+50xx+7LSYK05qAvqFyFWhfFQDlnrzuBZ6brJFe+GnY+EgPbk6ZGQ \ >
[jira] [Assigned] (KAFKA-14245) Topic deleted during reassignment
[ https://issues.apache.org/jira/browse/KAFKA-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-14245: -- Assignee: Matthew de Detrich > Topic deleted during reassignment > - > > Key: KAFKA-14245 > URL: https://issues.apache.org/jira/browse/KAFKA-14245 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Junyang Liu >Assignee: Matthew de Detrich >Priority: Blocker > > While a partition reassignment is in progress, the topic should not be > deletable. However, "markTopicIneligibleForDeletion" only marks topics that > are already in "controllerContext.topicsToBeDeleted", which is usually not > the case at that time because the topics have not yet been queued for > deletion. As a result, topics undergoing reassignment are never added to > "topicsIneligibleForDeletion", so when a topic deletion request arrives, a > topic that is mid-reassignment can be deleted and the reassignment can never > finish. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14245) Topic deleted during reassignment
[ https://issues.apache.org/jira/browse/KAFKA-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791501#comment-17791501 ] Matthew de Detrich commented on KAFKA-14245: I'll have a look into this to see if it's still valid (and if it is, I'll try to implement a fix) > Topic deleted during reassignment > - > > Key: KAFKA-14245 > URL: https://issues.apache.org/jira/browse/KAFKA-14245 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Junyang Liu >Priority: Blocker > > While a partition reassignment is in progress, the topic should not be > deletable. However, "markTopicIneligibleForDeletion" only marks topics that > are already in "controllerContext.topicsToBeDeleted", which is usually not > the case at that time because the topics have not yet been queued for > deletion. As a result, topics undergoing reassignment are never added to > "topicsIneligibleForDeletion", so when a topic deletion request arrives, a > topic that is mid-reassignment can be deleted and the reassignment can never > finish. -- This message was sent by Atlassian Jira (v8.20.10#820010)
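The gap described in the report can be made concrete with a simplified, hypothetical model (this is not the actual controller code; class and field names merely mirror the identifiers quoted in the issue): the ineligibility mark only takes effect for topics already queued for deletion, so a topic that is merely reassigning slips through.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified model of the guard described in KAFKA-14245 (not real Kafka
// code): a topic is only marked ineligible for deletion if it is already
// queued for deletion, so topics that are merely reassigning stay deletable.
class DeletionGuard {
    final Set<String> topicsToBeDeleted = new HashSet<>();
    final Set<String> topicsIneligibleForDeletion = new HashSet<>();

    void markTopicIneligibleForDeletion(String topic) {
        // Usually false while a reassignment is starting, which is the bug.
        if (topicsToBeDeleted.contains(topic)) {
            topicsIneligibleForDeletion.add(topic);
        }
    }

    boolean mayDelete(String topic) {
        return !topicsIneligibleForDeletion.contains(topic);
    }
}
```
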
[jira] [Commented] (KAFKA-14132) Remaining PowerMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776407#comment-17776407 ] Matthew de Detrich commented on KAFKA-14132: [~ChrisEgerton] Feel free to leave it unassigned, I don't have capacity for this. > Remaining PowerMock to Mockito tests > > > Key: KAFKA-14132 > URL: https://issues.apache.org/jira/browse/KAFKA-14132 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > Fix For: 3.7.0 > > > {color:#de350b}Some of the tests below use EasyMock as well. For those > migrate both PowerMock and EasyMock to Mockito.{color} > Unless stated in brackets the tests are in the connect module. > A list of tests which still require to be moved from PowerMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > {color:#ff8b00}InReview{color} > {color:#00875a}Merged{color} > # {color:#00875a}ErrorHandlingTaskTest{color} (owner: [~shekharrajak]) > # {color:#00875a}SourceTaskOffsetCommiterTest{color} (owner: Christo) > # {color:#00875a}WorkerMetricsGroupTest{color} (owner: Divij) > # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) > # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) > # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) > # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) > # {color:#00875a}ConnectorsResourceTest{color} (owner: [~mdedetrich-aiven]) > # {color:#ff8b00}StandaloneHerderTest{color} (owner: [~mdedetrich-aiven]) > ([https://github.com/apache/kafka/pull/12728]) > # KafkaConfigBackingStoreTest (UNOWNED) > # {color:#00875a}KafkaOffsetBackingStoreTest{color} (owner: Christo) > ([https://github.com/apache/kafka/pull/12418]) > # {color:#00875a}KafkaBasedLogTest{color} (owner: @bachmanity ]) > # {color:#00875a}RetryUtilTest{color} (owner: [~yash.mayya]) > # 
{color:#00875a}RepartitionTopicTest{color} (streams) (owner: Christo) > # {color:#00875a}StateManagerUtilTest{color} (streams) (owner: Christo) > *The coverage report for the above tests after the change should be >= to > what the coverage is now.* -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-14975) Make TopicBasedRemoteLogMetadataManager methods wait for initialize to complete
[ https://issues.apache.org/jira/browse/KAFKA-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich resolved KAFKA-14975. Resolution: Won't Fix Closing issue since the core problem was solved in another way (see https://github.com/apache/kafka/pull/14127) > Make TopicBasedRemoteLogMetadataManager methods wait for initialize to > complete > --- > > Key: KAFKA-14975 > URL: https://issues.apache.org/jira/browse/KAFKA-14975 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > > In the current implementation of TopicBasedRemoteLogMetadataManager, various > methods internally call ensureInitializedAndNotClosed to ensure that the > TopicBasedRemoteLogMetadataManager is initialized. If it is not initialized, > an exception is thrown. > This is not ideal behaviour; rather than just throwing an exception, we > should wait (with a timeout) until TopicBasedRemoteLogMetadataManager is > initialised. This is the behaviour users would expect, and it's also what > other parts of Kafka that use plugin-based systems (e.g. Kafka Connect) do. -- This message was sent by Atlassian Jira (v8.20.10#820010)
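The proposed wait-with-timeout behaviour can be sketched minimally. This is a hedged, hypothetical stand-in (the class and method names are not the real TopicBasedRemoteLogMetadataManager API): callers block on a latch with a bounded timeout instead of failing immediately while initialization has not completed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the ticket's proposal: rather than throwing when
// not yet initialized, callers wait on a latch with a bounded timeout.
class InitAwareManager {
    private final CountDownLatch initialized = new CountDownLatch(1);

    void initialize() {
        // ... real setup work would happen here ...
        initialized.countDown();
    }

    // Returns true if initialization completed within the timeout.
    boolean awaitInitialized(long timeoutMs) {
        try {
            return initialized.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A caller would invoke awaitInitialized with its timeout and only raise an error if the wait fails, matching the behaviour the description asks for.
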
[jira] [Updated] (KAFKA-14975) Make TopicBasedRemoteLogMetadataManager methods wait for initialize to complete
[ https://issues.apache.org/jira/browse/KAFKA-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14975: --- Parent: KAFKA-7739 Issue Type: Sub-task (was: Task) > Make TopicBasedRemoteLogMetadataManager methods wait for initialize to > complete > --- > > Key: KAFKA-14975 > URL: https://issues.apache.org/jira/browse/KAFKA-14975 > Project: Kafka > Issue Type: Sub-task >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > > In the current implementation of TopicBasedRemoteLogMetadataManager, various > methods internally call ensureInitializedAndNotClosed to ensure that the > TopicBasedRemoteLogMetadataManager is initialized. If it is not initialized, > an exception is thrown. > This is not ideal behaviour; rather than just throwing an exception, we > should wait (with a timeout) until TopicBasedRemoteLogMetadataManager is > initialised. This is the behaviour users would expect, and it's also what > other parts of Kafka that use plugin-based systems (e.g. Kafka Connect) do. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14975) Make TopicBasedRemoteLogMetadataManager methods wait for initialize to complete
[ https://issues.apache.org/jira/browse/KAFKA-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14975: --- Description: In the current implementation of TopicBasedRemoteLogMetadataManager, various methods internally call ensureInitializedAndNotClosed to ensure that the TopicBasedRemoteLogMetadataManager is initialized. If it is not initialized, an exception is thrown. This is not ideal behaviour; rather than just throwing an exception, we should wait (with a timeout) until TopicBasedRemoteLogMetadataManager is initialised. This is the behaviour users would expect, and it's also what other parts of Kafka that use plugin-based systems (e.g. Kafka Connect) do. was:In the current implementation of > Make TopicBasedRemoteLogMetadataManager methods wait for initialize to > complete > --- > > Key: KAFKA-14975 > URL: https://issues.apache.org/jira/browse/KAFKA-14975 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > > In the current implementation of TopicBasedRemoteLogMetadataManager, various > methods internally call ensureInitializedAndNotClosed to ensure that the > TopicBasedRemoteLogMetadataManager is initialized. If it is not initialized, > an exception is thrown. > This is not ideal behaviour; rather than just throwing an exception, we > should wait (with a timeout) until TopicBasedRemoteLogMetadataManager is > initialised. This is the behaviour users would expect, and it's also what > other parts of Kafka that use plugin-based systems (e.g. Kafka Connect) do. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-14975) Make TopicBasedRemoteLogMetadataManager methods wait for initialize to complete
[ https://issues.apache.org/jira/browse/KAFKA-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-14975: -- Assignee: Matthew de Detrich > Make TopicBasedRemoteLogMetadataManager methods wait for initialize to > complete > --- > > Key: KAFKA-14975 > URL: https://issues.apache.org/jira/browse/KAFKA-14975 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > > In the current implementation of -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14975) Make TopicBasedRemoteLogMetadataManager methods wait for initialize to complete
[ https://issues.apache.org/jira/browse/KAFKA-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14975: --- Description: In the current implementation of > Make TopicBasedRemoteLogMetadataManager methods wait for initialize to > complete > --- > > Key: KAFKA-14975 > URL: https://issues.apache.org/jira/browse/KAFKA-14975 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > In the current implementation of -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14975) Make TopicBasedRemoteLogMetadataManager methods wait for initialize to complete
Matthew de Detrich created KAFKA-14975: -- Summary: Make TopicBasedRemoteLogMetadataManager methods wait for initialize to complete Key: KAFKA-14975 URL: https://issues.apache.org/jira/browse/KAFKA-14975 Project: Kafka Issue Type: Task Reporter: Matthew de Detrich -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14524) Modularize `core` monolith
[ https://issues.apache.org/jira/browse/KAFKA-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654413#comment-17654413 ] Matthew de Detrich edited comment on KAFKA-14524 at 1/4/23 11:45 AM: - I believe the main intention of this is to modularize the Kafka core (which I don't think anyone disagrees with) but the Scala removal has been tacked on without a good justification as to why these orthogonal concerns are done together. Combining such 2 massive tasks at once seems to make the scope of this too big (splitting out the core without migrating away from Scala has far less risk). I'm also wondering if the move away from Scala would qualify as consensus required changes, mostly to get the opinion of the community on this topic. For example, Flink which also decided to do such a similar change consulted the community via discussion threads (see [https://lists.apache.org/thread/voj99gkk2v1bgw8xqcbmzgvn9ffs7v7h] and [https://lists.apache.org/thread/j1fggrbh7hl4pfqcqqq7p527kvdgk35s] as examples). May I suggest that instead of doing both the modularization and the Scala to Java migration at once, we just do the modularization and do the migration later once the community is consulted? was (Author: mdedetrich-aiven): I believe the main intention of this is to modularize the Kafka core (which I don't think anyone disagrees with) but the Scala removal has been tacked on without a good justification as to why these orthogonal concerns are done together. Combining such 2 massive tasks at once seems to make the scope of this too big (splitting out the core without migrating away from Scala has far less risk). I'm also wondering if the move away from Scala would qualify as consensus required changes, mostly to get the opinion of the community on this topic. 
For example, Flink which also decided to do such a similar change consulted the community via discussion threads (see [https://lists.apache.org/thread/voj99gkk2v1bgw8xqcbmzgvn9ffs7v7h] and [https://lists.apache.org/thread/j1fggrbh7hl4pfqcqqq7p527kvdgk35s] as an example). May I suggest that instead of doing both the modularization and the Scala to Java migration at once, we just do the modularization and do the migration later once the community is consulted? > Modularize `core` monolith > -- > > Key: KAFKA-14524 > URL: https://issues.apache.org/jira/browse/KAFKA-14524 > Project: Kafka > Issue Type: Improvement >Reporter: Ismael Juma >Priority: Major > > The `core` module has grown too large and it's time to split it into multiple > modules. A much slimmer `core` module will remain in the end. > Evidence of `core` growing too large is that it takes 1m10s to compile the > main code and tests and it takes hours to run all the tests sequentially. > As part of this effort, we should rewrite the Scala code in Java to reduce > developer friction, reduce compilation time and simplify deployment (i.e. we > can remove the scala version suffix from the module name). Scala may have a > number of advantages over Java 8 (minimum version we support now) and Java 11 > (minimum version we will support in Kafka 4.0), but a mixture of Scala and > Java (as we have now) is more complex than just Java. > Another benefit is that code dependencies will be strictly enforced, which > will hopefully help ensure better abstractions. > This pattern was started with the `tools` (but not completed), `metadata` and > `raft` modules and we have (when this ticket was filed) a couple more in > progress: `group-coordinator` and `storage`. > This is an umbrella ticket and it will link to each ticket related to this > goal. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14524) Modularize `core` monolith
[ https://issues.apache.org/jira/browse/KAFKA-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654413#comment-17654413 ] Matthew de Detrich edited comment on KAFKA-14524 at 1/4/23 11:44 AM: - I believe the main intention of this is to modularize the Kafka core (which I don't think anyone disagrees with) but the Scala removal has been tacked on without a good justification as to why these orthogonal concerns are done together. Combining such 2 massive tasks at once seems to make the scope of this too big (splitting out the core without migrating away from Scala has far less risk). I'm also wondering if the move away from Scala would qualify as consensus required changes, mostly to get the opinion of the community on this topic. For example, Flink which also decided to do such a similar change consulted the community via discussion threads (see [https://lists.apache.org/thread/voj99gkk2v1bgw8xqcbmzgvn9ffs7v7h] and [https://lists.apache.org/thread/j1fggrbh7hl4pfqcqqq7p527kvdgk35s] as an example). May I suggest that instead of doing both the modularization and the Scala to Java migration at once, we just do the modularization and do the migration later once the community is consulted? was (Author: mdedetrich-aiven): I believe the main intention of this is to modularize the Kafka core (which I don't think anyone disagrees with) but the Scala removal has been tacked on without a good justification as to why these orthogonal concerns are done together. Combining such 2 massive tasks at once seems to make the scope of this too big (splitting out the core without migrating away from Scala has far less risk). I'm also wondering if the move away from Scala would qualify as consensus required changes, mostly to get the opinion of the community on this topic. 
For example, Flink which also decided to do such a similar change consulted the community via discussion threads (see https://lists.apache.org/thread/voj99gkk2v1bgw8xqcbmzgvn9ffs7v7h as an example). May I suggest that instead of doing both the modularization and the Scala to Java migration at once, we just do the modularization and do the migration later once the community is consulted? > Modularize `core` monolith > -- > > Key: KAFKA-14524 > URL: https://issues.apache.org/jira/browse/KAFKA-14524 > Project: Kafka > Issue Type: Improvement >Reporter: Ismael Juma >Priority: Major > > The `core` module has grown too large and it's time to split it into multiple > modules. A much slimmer `core` module will remain in the end. > Evidence of `core` growing too large is that it takes 1m10s to compile the > main code and tests and it takes hours to run all the tests sequentially. > As part of this effort, we should rewrite the Scala code in Java to reduce > developer friction, reduce compilation time and simplify deployment (i.e. we > can remove the scala version suffix from the module name). Scala may have a > number of advantages over Java 8 (minimum version we support now) and Java 11 > (minimum version we will support in Kafka 4.0), but a mixture of Scala and > Java (as we have now) is more complex than just Java. > Another benefit is that code dependencies will be strictly enforced, which > will hopefully help ensure better abstractions. > This pattern was started with the `tools` (but not completed), `metadata` and > `raft` modules and we have (when this ticket was filed) a couple more in > progress: `group-coordinator` and `storage`. > This is an umbrella ticket and it will link to each ticket related to this > goal. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14524) Modularize `core` monolith
[ https://issues.apache.org/jira/browse/KAFKA-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654413#comment-17654413 ] Matthew de Detrich commented on KAFKA-14524: I believe the main intention of this is to modularize the Kafka core (which I don't think anyone disagrees with) but the Scala removal has been tacked on without a good justification as to why these orthogonal concerns are done together. Combining two such massive tasks at once seems to make the scope of this too big (splitting out the core without migrating away from Scala has far less risk). I'm also wondering if the move away from Scala would qualify as a consensus-required change, mostly to get the opinion of the community on this topic. For example, Flink, which made a similar change, consulted the community via discussion threads (see https://lists.apache.org/thread/voj99gkk2v1bgw8xqcbmzgvn9ffs7v7h as an example). May I suggest that instead of doing both the modularization and the Scala-to-Java migration at once, we just do the modularization and do the migration later once the community has been consulted? > Modularize `core` monolith > -- > > Key: KAFKA-14524 > URL: https://issues.apache.org/jira/browse/KAFKA-14524 > Project: Kafka > Issue Type: Improvement >Reporter: Ismael Juma >Priority: Major > > The `core` module has grown too large and it's time to split it into multiple > modules. A much slimmer `core` module will remain in the end. > Evidence of `core` growing too large is that it takes 1m10s to compile the > main code and tests and it takes hours to run all the tests sequentially. > As part of this effort, we should rewrite the Scala code in Java to reduce > developer friction, reduce compilation time and simplify deployment (i.e. we > can remove the scala version suffix from the module name). 
Scala may have a > number of advantages over Java 8 (minimum version we support now) and Java 11 > (minimum version we will support in Kafka 4.0), but a mixture of Scala and > Java (as we have now) is more complex than just Java. > Another benefit is that code dependencies will be strictly enforced, which > will hopefully help ensure better abstractions. > This pattern was started with the `tools` (but not completed), `metadata` and > `raft` modules and we have (when this ticket was filed) a couple more in > progress: `group-coordinator` and `storage`. > This is an umbrella ticket and it will link to each ticket related to this > goal. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14332) Split out checkstyle configs between test and main
[ https://issues.apache.org/jira/browse/KAFKA-14332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14332: --- Description: Currently in Kafka we have one global import-control.xml that is used to control imports for both the main source code and tests. It would be ideal to have separate configs, one for test and one for main, so that, for example, one cannot accidentally add a test library (such as PowerMock) to the main source code. > Split out checkstyle configs between test and main > -- > > Key: KAFKA-14332 > URL: https://issues.apache.org/jira/browse/KAFKA-14332 > Project: Kafka > Issue Type: Bug > Components: system tests, unit tests >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > > Currently in Kafka we have one global import-control.xml that is used to > control imports for both the main source code and tests. It would be ideal to > have separate configs, one for test and one for main, so that, for example, one > cannot accidentally add a test library (such as PowerMock) to the main source > code. -- This message was sent by Atlassian Jira (v8.20.10#820010)
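Checkstyle's ImportControl module supports exactly this kind of split. A minimal sketch of what the test-only config could look like (the file name and the allowed packages below are illustrative, not Kafka's actual setup):

```xml
<!DOCTYPE import-control PUBLIC
    "-//Checkstyle//DTD ImportControl Configuration 1.4//EN"
    "https://checkstyle.org/dtds/import_control_1_4.dtd">
<!-- import-control-test.xml: applied only to test source sets -->
<import-control pkg="org.apache.kafka">
  <!-- test-only libraries are allowed here but deliberately absent from the
       main import-control.xml, so importing them in main code fails the build -->
  <allow pkg="org.junit"/>
  <allow pkg="org.mockito"/>
</import-control>
```

The build would then point the Checkstyle task for each source set at the matching file, rather than sharing one global config.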
[jira] [Updated] (KAFKA-14332) Split out checkstyle configs between test and main
[ https://issues.apache.org/jira/browse/KAFKA-14332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14332: --- Component/s: system tests unit tests > Split out checkstyle configs between test and main > -- > > Key: KAFKA-14332 > URL: https://issues.apache.org/jira/browse/KAFKA-14332 > Project: Kafka > Issue Type: Bug > Components: system tests, unit tests >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-14332) Split out checkstyle configs between test and main
[ https://issues.apache.org/jira/browse/KAFKA-14332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-14332: -- Assignee: Matthew de Detrich > Split out checkstyle configs between test and main > -- > > Key: KAFKA-14332 > URL: https://issues.apache.org/jira/browse/KAFKA-14332 > Project: Kafka > Issue Type: Bug >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14332) Split out checkstyle configs between test and main
Matthew de Detrich created KAFKA-14332: -- Summary: Split out checkstyle configs between test and main Key: KAFKA-14332 URL: https://issues.apache.org/jira/browse/KAFKA-14332 Project: Kafka Issue Type: Bug Reporter: Matthew de Detrich -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14132) Remaining PowerMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14132: --- Description: {color:#de350b}Some of the tests below use EasyMock as well. For those migrate both PowerMock and EasyMock to Mockito.{color} Unless stated in brackets the tests are in the connect module. A list of tests which still require to be moved from PowerMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#ff8b00}InReview{color} {color:#00875a}Merged{color} # ErrorHandlingTaskTest (owner: Divij) # SourceTaskOffsetCommiterTest (owner: Divij) # {color:#00875a}WorkerMetricsGroupTest{color} (owner: Divij) # WorkerSinkTaskTest (owner: Divij) # WorkerSinkTaskThreadedTest (owner: Divij) # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) ([https://github.com/apache/kafka/pull/12725]) # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] ) # KafkaOffsetBackingStoreTest (owner: Christo) ([https://github.com/apache/kafka/pull/12418]) # KafkaBasedLogTest (owner: [~mdedetrich-aiven] ) # RetryUtilTest (owner: [~mdedetrich-aiven] ) # RepartitionTopicTest (streams) (owner: Christo) # StateManagerUtilTest (streams) (owner: Christo) *The coverage report for the above tests after the change should be >= to what the coverage is now.* was: {color:#de350b}Some of the tests below use EasyMock as well. For those migrate both PowerMock and EasyMock to Mockito.{color} Unless stated in brackets the tests are in the connect module. 
A list of tests which still require to be moved from PowerMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#ff8b00}InReview{color} {color:#00875a}Merged{color} # ErrorHandlingTaskTest (owner: Divij) # SourceTaskOffsetCommiterTest (owner: Divij) # {color:#00875a}WorkerMetricsGroupTest{color} (owner: Divij) # WorkerSinkTaskTest (owner: Divij) # WorkerSinkTaskThreadedTest (owner: Divij) # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] )(https://github.com/apache/kafka/pull/12725) # KafkaOffsetBackingStoreTest (owner: Christo) ([https://github.com/apache/kafka/pull/12418]) # KafkaBasedLogTest (owner: [~mdedetrich-aiven] ) # RetryUtilTest (owner: [~mdedetrich-aiven] ) # RepartitionTopicTest (streams) (owner: Christo) # StateManagerUtilTest (streams) (owner: Christo) *The coverage report for the above tests after the change should be >= to what the coverage is now.* > Remaining PowerMock to Mockito tests > > > Key: KAFKA-14132 > URL: https://issues.apache.org/jira/browse/KAFKA-14132 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > > {color:#de350b}Some of the tests below use EasyMock as well. For those > migrate both PowerMock and EasyMock to Mockito.{color} > Unless stated in brackets the tests are in the connect module. 
> A list of tests which still require to be moved from PowerMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > {color:#ff8b00}InReview{color} > {color:#00875a}Merged{color} > # ErrorHandlingTaskTest (owner: Divij) > # SourceTaskOffsetCommiterTest (owner: Divij) > # {color:#00875a}WorkerMetricsGroupTest{color} (owner: Divij) > # WorkerSinkTaskTest (owner: Divij) > # WorkerSinkTaskThreadedTest (owner: Divij) > # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) > # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) > # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) > # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) > # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) > ([https://github.com/apache/kafka/pull/12725]) > # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) > # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] ) > #
[jira] [Updated] (KAFKA-14132) Remaining PowerMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14132: --- Description: {color:#de350b}Some of the tests below use EasyMock as well. For those migrate both PowerMock and EasyMock to Mockito.{color} Unless stated in brackets the tests are in the connect module. A list of tests which still require to be moved from PowerMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#ff8b00}InReview{color} {color:#00875a}Merged{color} # ErrorHandlingTaskTest (owner: Divij) # SourceTaskOffsetCommiterTest (owner: Divij) # {color:#00875a}WorkerMetricsGroupTest{color} (owner: Divij) # WorkerSinkTaskTest (owner: Divij) # WorkerSinkTaskThreadedTest (owner: Divij) # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] )(https://github.com/apache/kafka/pull/12725) # KafkaOffsetBackingStoreTest (owner: Christo) ([https://github.com/apache/kafka/pull/12418]) # KafkaBasedLogTest (owner: [~mdedetrich-aiven] ) # RetryUtilTest (owner: [~mdedetrich-aiven] ) # RepartitionTopicTest (streams) (owner: Christo) # StateManagerUtilTest (streams) (owner: Christo) *The coverage report for the above tests after the change should be >= to what the coverage is now.* was: {color:#de350b}Some of the tests below use EasyMock as well. For those migrate both PowerMock and EasyMock to Mockito.{color} Unless stated in brackets the tests are in the connect module. 
A list of tests which still require to be moved from PowerMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#ff8b00}InReview{color} {color:#00875a}Merged{color} # ErrorHandlingTaskTest (owner: Divij) # SourceTaskOffsetCommiterTest (owner: Divij) # {color:#00875a}WorkerMetricsGroupTest{color} (owner: Divij) # WorkerSinkTaskTest (owner: Divij) # WorkerSinkTaskThreadedTest (owner: Divij) # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] ) # KafkaOffsetBackingStoreTest (owner: Christo) ([https://github.com/apache/kafka/pull/12418]) # KafkaBasedLogTest (owner: [~mdedetrich-aiven] ) # RetryUtilTest (owner: [~mdedetrich-aiven] ) # RepartitionTopicTest (streams) (owner: Christo) # StateManagerUtilTest (streams) (owner: Christo) *The coverage report for the above tests after the change should be >= to what the coverage is now.* > Remaining PowerMock to Mockito tests > > > Key: KAFKA-14132 > URL: https://issues.apache.org/jira/browse/KAFKA-14132 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > > {color:#de350b}Some of the tests below use EasyMock as well. For those > migrate both PowerMock and EasyMock to Mockito.{color} > Unless stated in brackets the tests are in the connect module. 
> A list of tests which still require to be moved from PowerMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > {color:#ff8b00}InReview{color} > {color:#00875a}Merged{color} > # ErrorHandlingTaskTest (owner: Divij) > # SourceTaskOffsetCommiterTest (owner: Divij) > # {color:#00875a}WorkerMetricsGroupTest{color} (owner: Divij) > # WorkerSinkTaskTest (owner: Divij) > # WorkerSinkTaskThreadedTest (owner: Divij) > # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) > # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) > # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) > # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) > # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) > # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) > # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] > )(https://github.com/apache/kafka/pull/12725) > # KafkaOffsetBackingStoreTest (owner: Christo) >
[jira] [Updated] (KAFKA-14283) Fix connector creation authorization tests not doing anything
[ https://issues.apache.org/jira/browse/KAFKA-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14283: --- Description: Currently the testCreateConnectorWithoutHeaderAuthorization and testCreateConnectorWithHeaderAuthorization tests within ConnectorsResourceTest aren't actually doing anything. This is because in reality the requests should be forwarded to a leader and tests aren't actually testing that a leader RestClient request is made (was: Currently the testCreateConnectorWithoutHeaderAuthorization and testCreateConnectorWithHeaderAuthorization tests within ConnectorsResourceTest aren't actually anything. This is because in reality the requests should be forwarded to a leader and tests aren't actually testing that a leader RestClient request is made) > Fix connector creation authorization tests not doing anything > - > > Key: KAFKA-14283 > URL: https://issues.apache.org/jira/browse/KAFKA-14283 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > > Currently the testCreateConnectorWithoutHeaderAuthorization and > testCreateConnectorWithHeaderAuthorization tests within > ConnectorsResourceTest aren't actually doing anything. This is because in > reality the requests should be forwarded to a leader and tests aren't > actually testing that a leader RestClient request is made -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-14283) Fix connector creation authorization tests not doing anything
[ https://issues.apache.org/jira/browse/KAFKA-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich resolved KAFKA-14283. Resolution: Fixed > Fix connector creation authorization tests not doing anything > - > > Key: KAFKA-14283 > URL: https://issues.apache.org/jira/browse/KAFKA-14283 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > > Currently the testCreateConnectorWithoutHeaderAuthorization and > testCreateConnectorWithHeaderAuthorization tests within > ConnectorsResourceTest aren't actually doing anything. This is because in reality > the requests should be forwarded to a leader and tests aren't actually > testing that a leader RestClient request is made -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14283) Fix connector creation authorization tests not doing anything
Matthew de Detrich created KAFKA-14283: -- Summary: Fix connector creation authorization tests not doing anything Key: KAFKA-14283 URL: https://issues.apache.org/jira/browse/KAFKA-14283 Project: Kafka Issue Type: Bug Components: KafkaConnect Reporter: Matthew de Detrich Currently the testCreateConnectorWithoutHeaderAuthorization and testCreateConnectorWithHeaderAuthorization tests within ConnectorsResourceTest aren't actually doing anything. This is because in reality the requests should be forwarded to a leader and tests aren't actually testing that a leader RestClient request is made -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14256) Update to Scala 2.13.9
Matthew de Detrich created KAFKA-14256: -- Summary: Update to Scala 2.13.9 Key: KAFKA-14256 URL: https://issues.apache.org/jira/browse/KAFKA-14256 Project: Kafka Issue Type: Task Reporter: Matthew de Detrich -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14132) Remaining PowerMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14132: --- Description: {color:#de350b}Some of the tests below use EasyMock as well. For those migrate both PowerMock and EasyMock to Mockito.{color} Unless stated in brackets the tests are in the connect module. A list of tests which still require to be moved from PowerMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#ff8b00}InReview{color} {color:#00875a}Merged{color} # ErrorHandlingTaskTest (owner: Divij) # SourceTaskOffsetCommiterTest (owner: Divij) # WorkerMetricsGroupTest (owner: Divij) # WorkerSinkTaskTest (owner: Divij) # WorkerSinkTaskThreadedTest (owner: Divij) # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] ) # KafkaOffsetBackingStoreTest (owner: Christo) ([https://github.com/apache/kafka/pull/12418]) # KafkaBasedLogTest (owner: [~mdedetrich-aiven] ) # RetryUtilTest (owner: [~mdedetrich-aiven] ) # RepartitionTopicTest (streams) (owner: Christo) # StateManagerUtilTest (streams) (owner: Christo) *The coverage report for the above tests after the change should be >= to what the coverage is now.* was: {color:#de350b}Some of the tests below use EasyMock as well. For those migrate both PowerMock and EasyMock to Mockito.{color} Unless stated in brackets the tests are in the connect module. 
A list of tests which still require to be moved from PowerMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#FF8B00}InReview{color} {color:#00875A}Merged{color} # ErrorHandlingTaskTest (owner: Divij) # SourceTaskOffsetCommiterTest (owner: Divij) # WorkerMetricsGroupTest (owner: Divij) # WorkerSinkTaskTest (owner: Divij) # WorkerSinkTaskThreadedTest (owner: Divij) # {color:#00875A}WorkerTaskTest{color} (owner: [~yash.mayya]) # {color:#00875A}ErrorReporterTest{color} (owner: [~yash.mayya]) # {color:#00875A}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) # {color:#00875A}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) # ConnectorsResourceTest # StandaloneHerderTest # KafkaConfigBackingStoreTest # KafkaOffsetBackingStoreTest (owner: Christo) ([https://github.com/apache/kafka/pull/12418]) # KafkaBasedLogTest # RetryUtilTest # RepartitionTopicTest (streams) (owner: Christo) # StateManagerUtilTest (streams) (owner: Christo) *The coverage report for the above tests after the change should be >= to what the coverage is now.* > Remaining PowerMock to Mockito tests > > > Key: KAFKA-14132 > URL: https://issues.apache.org/jira/browse/KAFKA-14132 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > > {color:#de350b}Some of the tests below use EasyMock as well. For those > migrate both PowerMock and EasyMock to Mockito.{color} > Unless stated in brackets the tests are in the connect module. 
> A list of tests which still require to be moved from PowerMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > {color:#ff8b00}InReview{color} > {color:#00875a}Merged{color} > # ErrorHandlingTaskTest (owner: Divij) > # SourceTaskOffsetCommiterTest (owner: Divij) > # WorkerMetricsGroupTest (owner: Divij) > # WorkerSinkTaskTest (owner: Divij) > # WorkerSinkTaskThreadedTest (owner: Divij) > # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) > # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) > # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) > # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) > # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) > # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) > # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] ) > # KafkaOffsetBackingStoreTest (owner: Christo) > ([https://github.com/apache/kafka/pull/12418]) > # KafkaBasedLogTest (owner: [~mdedetrich-aiven] ) > # RetryUtilTest (owner: [~mdedetrich-aiven] ) > # RepartitionTopicTest (streams) (owner: Christo) > # StateManagerUtilTest (streams) (owner: Christo) > *The coverage report for the above tests after the change should be >= to what the coverage is now.*
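The migrations listed above replace PowerMock/EasyMock's expect/replay/verify style with Mockito's stub-and-verify style. A minimal sketch of the pattern (the `OffsetStore` collaborator below is hypothetical, for illustration only; the real tests mock Connect classes such as `RestClient` or the herder instead):

```java
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.verifyNoMoreInteractions;
import static org.mockito.Mockito.when;

public class MockitoMigrationSketch {

    // Hypothetical collaborator interface, for illustration only.
    interface OffsetStore {
        boolean persist(String key);
    }

    public static void main(String[] args) {
        // PowerMock/EasyMock style: expect(...); replay(mock); ... verify(mock);
        // Mockito drops the replay phase entirely: stub first, exercise, then
        // verify the interactions you care about.
        OffsetStore store = mock(OffsetStore.class);
        when(store.persist("offsets")).thenReturn(true);

        boolean ok = store.persist("offsets");

        verify(store).persist("offsets");
        verifyNoMoreInteractions(store);
        System.out.println("persisted=" + ok); // prints persisted=true
    }
}
```

Because Mockito verifies only the interactions the test asks about, a migrated test can silently stop asserting anything; this is why the ticket insists the coverage report stay at least as high after the change.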
[jira] [Comment Edited] (KAFKA-14221) Update Apache Kafka JVM version in CI to latest versions
[ https://issues.apache.org/jira/browse/KAFKA-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17603260#comment-17603260 ] Matthew de Detrich edited comment on KAFKA-14221 at 9/12/22 9:52 PM: - Thanks, INFRA tickets created. I checked the matrix and indeed even the latest variants of the JDKs are older than the versions of JDK where this bug has been fixed. was (Author: mdedetrich-aiven): Thanks, INFRA tickets created > Update Apache Kafka JVM version in CI to latest versions > > > Key: KAFKA-14221 > URL: https://issues.apache.org/jira/browse/KAFKA-14221 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > In a recent test (see > [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] > run the JVM crashed with the following stack trace > > {code:java} > [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java > Runtime Environment: > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, > pid=6229, tid=7342 > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment > (17.0.1+12) (build 17.0.1+12-LTS-39) > [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM > (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed > class ptrs, parallel gc, linux-amd64) > [2022-09-12T14:22:22.414Z] # Problematic frame: > [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] > PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) > [clone .part.0]+0x52 > {code} > After some research online I found that there was a JDK bug filed for the > same kind of crash, see > [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] > This bug was fixed in JDK 17.0.2 which is a newer version than the one that > is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). 
Note that locally with > the latest OpenJDK (specifically 17.0.4.1) I cannot reproduce this crash. > We should update both JDK 11 and JDK 17 to the latest version in the CI to > see if this will solve the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14221) Update Apache Kafka JVM version in CI to latest versions
[ https://issues.apache.org/jira/browse/KAFKA-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17603260#comment-17603260 ] Matthew de Detrich edited comment on KAFKA-14221 at 9/12/22 9:49 PM: - Thanks, INFRA tickets created was (Author: mdedetrich-aiven): Thanks, INFRA ticket created > Update Apache Kafka JVM version in CI to latest versions > > > Key: KAFKA-14221 > URL: https://issues.apache.org/jira/browse/KAFKA-14221 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > In a recent test (see > [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] > run the JVM crashed with the following stack trace > > {code:java} > [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java > Runtime Environment: > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, > pid=6229, tid=7342 > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment > (17.0.1+12) (build 17.0.1+12-LTS-39) > [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM > (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed > class ptrs, parallel gc, linux-amd64) > [2022-09-12T14:22:22.414Z] # Problematic frame: > [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] > PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) > [clone .part.0]+0x52 > {code} > After some research online I found that there was a JDK bug filed for the > same kind of crash, see > [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] > This bug was fixed in JDK 17.0.2 which is a newer version than the one that > is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). Note that locally with > the latest OpenJDK (specifically 17.0.4.1) I cannot reproduce this crash. 
> We should update both JDK 11 and JDK 17 to the latest version in the CI to > see if this will solve the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-14223) Update jdk_17_latest to adoptium 17.0.4+1
[ https://issues.apache.org/jira/browse/KAFKA-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich resolved KAFKA-14223. Resolution: Invalid Created in wrong project > Update jdk_17_latest to adoptium 17.0.4+1 > - > > Key: KAFKA-14223 > URL: https://issues.apache.org/jira/browse/KAFKA-14223 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > The current latest JDK 17 is outdated and Apache Kafka appeared to have hit a > bug due to this (see https://issues.apache.org/jira/browse/KAFKA-14221). > Would it be possible to update jdk_17_latest to the adoptium 17.0.4+1 release? > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14221) Update Apache Kafka JVM version in CI to latest versions
[ https://issues.apache.org/jira/browse/KAFKA-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17603260#comment-17603260 ] Matthew de Detrich commented on KAFKA-14221: Thanks, INFRA ticket created > Update Apache Kafka JVM version in CI to latest versions > > > Key: KAFKA-14221 > URL: https://issues.apache.org/jira/browse/KAFKA-14221 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > In a recent test (see > [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] > run the JVM crashed with the following stack trace > > {code:java} > [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java > Runtime Environment: > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, > pid=6229, tid=7342 > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment > (17.0.1+12) (build 17.0.1+12-LTS-39) > [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM > (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed > class ptrs, parallel gc, linux-amd64) > [2022-09-12T14:22:22.414Z] # Problematic frame: > [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] > PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) > [clone .part.0]+0x52 > {code} > After some research online I found that there was a JDK bug filed for the > same kind of crash, see > [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] > This bug was fixed in JDK 17.0.2 which is a newer version than the one that > is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). Note that locally with > the latest OpenJDK (specifically 17.0.4.1) I cannot reproduce this crash. > We should update both JDK 11 and JDK 17 to the latest version in the CI to > see if this will solve the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14223) Update jdk_17_latest to adoptium 17.0.4+1
Matthew de Detrich created KAFKA-14223: -- Summary: Update jdk_17_latest to adoptium 17.0.4+1 Key: KAFKA-14223 URL: https://issues.apache.org/jira/browse/KAFKA-14223 Project: Kafka Issue Type: Task Reporter: Matthew de Detrich The current latest JDK 17 is outdated and Apache Kafka appeared to have hit a bug due to this (see https://issues.apache.org/jira/browse/KAFKA-14221). Would it be possible to update jdk_17_latest to the adoptium 17.0.4+1 release? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14221) Update Apache Kafka JVM version in CI to latest versions
[ https://issues.apache.org/jira/browse/KAFKA-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14221: --- Description: In a recent test (see [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] run the JVM crashed with the following stack trace {code:java} [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java Runtime Environment: [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, pid=6229, tid=7342 [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment (17.0.1+12) (build 17.0.1+12-LTS-39) [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, parallel gc, linux-amd64) [2022-09-12T14:22:22.414Z] # Problematic frame: [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) [clone .part.0]+0x52 {code} After some research online I found that there was a JDK bug filed for the same kind of crash, see [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] This bug was fixed in JDK 17.0.2 which is a newer version than the one that is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). Note that locally with the latest OpenJDK (specifically 17.0.4.1) I cannot reproduce this crash. We should update both JDK 11 and JDK 17 to the latest version in the CI to see if this will solve the problem. 
was: In a recent test (see [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] run the JVM crashed with the following stack trace {code:java} [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java Runtime Environment: [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, pid=6229, tid=7342 [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment (17.0.1+12) (build 17.0.1+12-LTS-39) [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, parallel gc, linux-amd64) [2022-09-12T14:22:22.414Z] # Problematic frame: [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) [clone .part.0]+0x52 {code} After some research online I found that there was a JDK bug filed for the same kind of crash, see [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] This bug was fixed in JDK 17.0.2 which is a newer version than the one that is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). We should update both JDK 11 and JDk 17 to the latest version in the CI to see if this will solve the problem. 
> Update Apache Kafka JVM version in CI to latest versions > > > Key: KAFKA-14221 > URL: https://issues.apache.org/jira/browse/KAFKA-14221 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > In a recent test (see > [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] > run the JVM crashed with the following stack trace > > {code:java} > [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java > Runtime Environment: > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, > pid=6229, tid=7342 > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment > (17.0.1+12) (build 17.0.1+12-LTS-39) > [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM > (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed > class ptrs, parallel gc, linux-amd64) > [2022-09-12T14:22:22.414Z] # Problematic frame: > [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] > PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) > [clone .part.0]+0x52 > {code} > After some research online I found that there was a JDK bug filed for the > same kind of crash, see > [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] > This bug was fixed in JDK 17.0.2 which is a newer version than the one that > is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). Note that locally with > the latest OpenJDK (specifically 17.0.4.1) I cannot reproduce this crash. > We should update both JDK 11 and JDk 17 to the latest version in the CI to > see if this will solve the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14221) Update Apache Kafka JVM version in CI to latest versions
[ https://issues.apache.org/jira/browse/KAFKA-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17603151#comment-17603151 ] Matthew de Detrich commented on KAFKA-14221: [~ijuma] Maybe you have some idea on how to update the JDK on the CI runner? > Update Apache Kafka JVM version in CI to latest versions > > > Key: KAFKA-14221 > URL: https://issues.apache.org/jira/browse/KAFKA-14221 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > In a recent test (see > [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] > run the JVM crashed with the following stack trace > > {code:java} > [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java > Runtime Environment: > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, > pid=6229, tid=7342 > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment > (17.0.1+12) (build 17.0.1+12-LTS-39) > [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM > (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed > class ptrs, parallel gc, linux-amd64) > [2022-09-12T14:22:22.414Z] # Problematic frame: > [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] > PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) > [clone .part.0]+0x52 > {code} > After some research online I found that there was a JDK bug filed for the > same kind of crash, see > [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] > This bug was fixed in JDK 17.0.2 which is a newer version than the one that > is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). > We should update both JDK 11 and JDk 17 to the latest version in the CI to > see if this will solve the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14221) Update Apache Kafka JVM version in CI to latest versions
[ https://issues.apache.org/jira/browse/KAFKA-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14221: --- Description: In a recent test run (see [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16]) the JVM crashed with the following stack trace: {code:java} [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java Runtime Environment: [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, pid=6229, tid=7342 [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment (17.0.1+12) (build 17.0.1+12-LTS-39) [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, parallel gc, linux-amd64) [2022-09-12T14:22:22.414Z] # Problematic frame: [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) [clone .part.0]+0x52 {code} After some research online I found a JDK bug filed for the same kind of crash; see [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886]. This bug was fixed in JDK 17.0.2, which is newer than the version run in the Apache Kafka CI (17.0.1+12-LTS-39). We should update both JDK 11 and JDK 17 to the latest versions in the CI to see if this solves the problem. 
was: In a recent test (see [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] run the JVM crashed with the following stack trace [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java Runtime Environment: [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, pid=6229, tid=7342 [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment (17.0.1+12) (build 17.0.1+12-LTS-39) [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, parallel gc, linux-amd64) [2022-09-12T14:22:22.414Z] # Problematic frame: [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) [clone .part.0]+0x52 After some research online I found that there was a JDK bug filed for the same kind of crash, see [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] This bug was fixed in JDK 17.0.2 which is a newer version than the one that is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). We should update both JDK 11 and JDk 17 to the latest version in the CI to see if this will solve the problem. 
> Update Apache Kafka JVM version in CI to latest versions > > > Key: KAFKA-14221 > URL: https://issues.apache.org/jira/browse/KAFKA-14221 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > In a recent test (see > [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] > run the JVM crashed with the following stack trace > > {code:java} > [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java > Runtime Environment: > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, > pid=6229, tid=7342 > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment > (17.0.1+12) (build 17.0.1+12-LTS-39) > [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM > (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed > class ptrs, parallel gc, linux-amd64) > [2022-09-12T14:22:22.414Z] # Problematic frame: > [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] > PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) > [clone .part.0]+0x52 > {code} > After some research online I found that there was a JDK bug filed for the > same kind of crash, see > [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] > This bug was fixed in JDK 17.0.2 which is a newer version than the one that > is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). > We should update both JDK 11 and JDk 17 to the latest version in the CI to > see if this will solve the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14221) Update Apache Kafka JVM version in CI to latest versions
Matthew de Detrich created KAFKA-14221: -- Summary: Update Apache Kafka JVM version in CI to latest versions Key: KAFKA-14221 URL: https://issues.apache.org/jira/browse/KAFKA-14221 Project: Kafka Issue Type: Task Reporter: Matthew de Detrich In a recent test run (see [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16]) the JVM crashed with the following stack trace: [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java Runtime Environment: [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, pid=6229, tid=7342 [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment (17.0.1+12) (build 17.0.1+12-LTS-39) [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, parallel gc, linux-amd64) [2022-09-12T14:22:22.414Z] # Problematic frame: [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) [clone .part.0]+0x52 After some research online I found a JDK bug filed for the same kind of crash; see [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886]. This bug was fixed in JDK 17.0.2, which is newer than the version run in the Apache Kafka CI (17.0.1+12-LTS-39). We should update both JDK 11 and JDK 17 to the latest versions in the CI to see if this solves the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
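The version comparison in the report above (17.0.1+12-LTS-39 predates the 17.0.2 release that carries the fix for JDK-8270886) can be expressed with the standard library. A minimal sketch, not taken from the Kafka code base — JvmVersionCheck and hasLoopOptFix are made-up names — using java.lang.Runtime.Version (available since Java 9):

```java
// Illustrative check that a JDK 17 version string is at or above 17.0.2,
// the release containing the fix for the C2 loop-optimization crash
// (JDK-8270886) referenced above.
public class JvmVersionCheck {
    private static final Runtime.Version FIXED_IN = Runtime.Version.parse("17.0.2");

    /** Returns true if the given JDK 17 version string includes the loop-optimization fix. */
    static boolean hasLoopOptFix(String versionString) {
        Runtime.Version v = Runtime.Version.parse(versionString);
        // feature() is the major version; compareTo() orders full version numbers.
        return v.feature() == 17 && v.compareTo(FIXED_IN) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(hasLoopOptFix("17.0.1+12")); // the crashing CI build -> false
        System.out.println(hasLoopOptFix("17.0.4.1"));  // the build that did not reproduce -> true
    }
}
```

Runtime.Version.parse accepts build metadata such as "+12", so the exact CI version strings can be compared without manual string splitting.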
[jira] [Updated] (KAFKA-14133) Remaining EasyMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14133: --- Description: {color:#de350b}There are tests which use both PowerMock and EasyMock. I have put those in https://issues.apache.org/jira/browse/KAFKA-14132. Tests here rely solely on EasyMock.{color} Unless stated in brackets the tests are in the streams module. A list of tests which still require to be moved from EasyMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#ff8b00}In Review{color} # WorkerConnectorTest (connect) (owner: [~yash.mayya] ) # WorkerCoordinatorTest (connect) (owner: [~yash.mayya] ) # RootResourceTest (connect) (owner: [~yash.mayya] ) # ByteArrayProducerRecordEquals (connect) (owner: [~yash.mayya] ) # {color:#ff8b00}KStreamFlatTransformTest{color} (owner: Christo) # {color:#ff8b00}KStreamFlatTransformValuesTest{color} (owner: Christo) # {color:#ff8b00}KStreamPrintTest{color} (owner: Christo) # {color:#ff8b00}KStreamRepartitionTest{color} (owner: Christo) # {color:#ff8b00}MaterializedInternalTest{color} (owner: Christo) # {color:#ff8b00}TransformerSupplierAdapterTest{color} (owner: Christo) # {color:#ff8b00}KTableSuppressProcessorMetricsTest{color} (owner: Christo) # {color:#ff8b00}ClientUtilsTest{color} (owner: Christo) # {color:#ff8b00}HighAvailabilityStreamsPartitionAssignorTest{color} (owner: Christo) # {color:#ff8b00}StreamsRebalanceListenerTest{color} (owner: Christo) # {color:#ff8b00}TimestampedKeyValueStoreMaterializerTest{color} (owner: Christo) # {color:#ff8b00}CachingInMemoryKeyValueStoreTest{color} (owner: Christo) # {color:#ff8b00}CachingInMemorySessionStoreTest{color} (owner: Christo) # {color:#ff8b00}CachingPersistentSessionStoreTest{color} (owner: Christo) # {color:#ff8b00}CachingPersistentWindowStoreTest{color} (owner: Christo) # 
{color:#ff8b00}ChangeLoggingKeyValueBytesStoreTest{color} (owner: Christo) # {color:#ff8b00}ChangeLoggingTimestampedKeyValueBytesStoreTest{color} (owner: Christo) # {color:#ff8b00}CompositeReadOnlyWindowStoreTest{color} (owner: Christo) # {color:#ff8b00}KeyValueStoreBuilderTest{color} (owner: Christo) # {color:#ff8b00}RocksDBStoreTest{color} (owner: Christo) # {color:#ff8b00}StreamThreadStateStoreProviderTest{color} (owner: Christo) # TopologyTest (owner: Christo) # KTableSuppressProcessorTest (owner: Christo) # InternalTopicManagerTest (owner: Christo) # ProcessorContextImplTest (owner: Christo) # ProcessorStateManagerTest ({*}WIP{*} owner: Matthew) # StandbyTaskTest ({*}WIP{*} owner: Matthew) # StoreChangelogReaderTest ({*}WIP{*} owner: Matthew) # StreamTaskTest ({*}WIP{*} owner: Matthew) # StreamThreadTest ({*}WIP{*} owner: Matthew) # StreamsAssignmentScaleTest ({*}WIP{*} owner: Christo) # StreamsPartitionAssignorTest ({*}WIP{*} owner: Christo) # TaskManagerTest # WriteConsistencyVectorTest # AssignmentTestUtils ({*}WIP{*} owner: Christo) # StreamsMetricsImplTest # ChangeLoggingSessionBytesStoreTest # ChangeLoggingTimestampedWindowBytesStoreTest # ChangeLoggingWindowBytesStoreTest # MeteredTimestampedWindowStoreTest # TimeOrderedCachingPersistentWindowStoreTest # TimeOrderedWindowStoreTest *The coverage report for the above tests after the change should be >= to what the coverage is now.* was: {color:#de350b}There are tests which use both PowerMock and EasyMock. I have put those in https://issues.apache.org/jira/browse/KAFKA-14132. Tests here rely solely on EasyMock.{color} Unless stated in brackets the tests are in the streams module. 
A list of tests which still require to be moved from EasyMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#ff8b00}In Review{color} # WorkerConnectorTest (connect) (owner: [~yash.mayya] ) # WorkerCoordinatorTest (connect) (owner: [~yash.mayya] ) # RootResourceTest (connect) (owner: [~yash.mayya] ) # ByteArrayProducerRecordEquals (connect) (owner: [~yash.mayya] ) # {color:#ff8b00}KStreamFlatTransformTest{color} (owner: Christo) # {color:#ff8b00}KStreamFlatTransformValuesTest{color} (owner: Christo) # {color:#ff8b00}KStreamPrintTest{color} (owner: Christo) # {color:#ff8b00}KStreamRepartitionTest{color} (owner: Christo) # {color:#ff8b00}MaterializedInternalTest{color} (owner: Christo) # {color:#ff8b00}TransformerSupplierAdapterTest{color} (owner: Christo) # {color:#ff8b00}KTableSuppressProcessorMetricsTest{color} (owner: Christo) # {color:#ff8b00}ClientUtilsTest{color} (owner: Christo) # {color:#ff8b00}HighAvailabilityStreamsPartitionAssignorTest{color} (owner: Christo) # {color:#ff8b00}StreamsRebalanceListenerTest{color} (owner: Christo) # {color:#ff8b00}TimestampedKeyValueStoreMaterializerTest{color} (owner: Christo) #
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17573211#comment-17573211 ] Matthew de Detrich edited comment on KAFKA-14014 at 8/9/22 12:58 PM: - In case people want to reproduce the flakiness, assuming you have a working Docker installation you can do the following {code:java} docker run -it --cpus=2 --rm -u gradle -v "$PWD":/home/gradle/project -w /home/gradle/project gradle sh{code} where cpus=2 is how you toggle how many CPUs you want (there is a trade-off between how often the flakiness occurs and how fast the test runs). I wouldn't recommend going lower than cpus=2, otherwise even building the Kafka project in Gradle can take ages. The above command will put you into a shell, at which point you can do {code:java} while [ $? -eq 0 ]; do ./gradlew :streams:test --tests org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets; done{code} This will re-run the test until there is a failure. Note that because Gradle caches test runs, you need to do something like [https://github.com/gradle/gradle/issues/9151#issue-434212465] or put {code:java} test.outputs.upToDateWhen {false}{code} in your build script in order to force Gradle to re-run the test every time. was (Author: mdedetrich-aiven): In case people want to reproduce the flakiness, assuming you have a working docker installation you can do the following {code:java} docker run -it --cpus=2 --rm -u gradle -v "$PWD":/home/gradle/project -w /home/gradle/project gradle sh{code} where cpus=2 is how you can toggle how many cpus you want (there is a tradeoff between higher occurrence to encounter the flakiness vs how fast the test runs). I wouldn't recommend doing lower than cpus=2 otherwise even building the kafak project in gradle can take ages. The above command will put you into a shell at which point you can do {code:java} while [ $?
-eq 0 ]; do ./gradlew :streams:test --tests org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets; done{code} Which will re-run the tests until there is a failure. Note that due to how Gradle cache's test runs, you need to do something like [https://github.com/gradle/gradle/issues/9151#issue-434212465] in order to force gradle to re-run the test every time. > Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) >
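The shell loop in the comment above (`while [ $? -eq 0 ]; do ...; done`) can also be expressed in-process when iterating on a flaky test body. A minimal sketch under the assumption that the flaky body is callable as a Runnable — RunUntilFailure and its parameters are illustrative names, not Kafka utilities:

```java
// Plain-Java version of the "re-run until it fails" loop used to reproduce
// the flaky test, reporting which iteration first failed.
public class RunUntilFailure {

    /** Runs body repeatedly; returns the 1-based iteration that threw, or -1 if none failed within limit. */
    static int runUntilFailure(Runnable body, int limit) {
        for (int i = 1; i <= limit; i++) {
            try {
                body.run();
            } catch (Throwable t) {
                return i; // first failing iteration: the flakiness reproduced
            }
        }
        return -1; // never failed within the budget
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in for the flaky assertion: fails on its third invocation.
        int failedAt = runUntilFailure(() -> {
            if (++calls[0] == 3) throw new AssertionError("reproduced");
        }, 10);
        System.out.println(failedAt); // 3
    }
}
```

Unlike the shell loop, this avoids Gradle's test-run caching entirely, since the body is invoked directly rather than through the `test` task.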
[jira] [Commented] (KAFKA-14132) Remaining PowerMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575616#comment-17575616 ] Matthew de Detrich commented on KAFKA-14132: Ah misunderstood, I will review them to help out. > Remaining PowerMock to Mockito tests > > > Key: KAFKA-14132 > URL: https://issues.apache.org/jira/browse/KAFKA-14132 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > > {color:#DE350B}Some of the tests below use EasyMock as well. For those > migrate both PowerMock and EasyMock to Mockito.{color} > Unless stated in brackets the tests are in the connect module. > A list of tests which still require to be moved from PowerMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > # ErrorHandlingTaskTest (owner: Divij) > # SourceTaskOffsetCommiterTest (owner: Divij) > # WorkerMetricsGroupTest (owner: Divij) > # WorkerSinkTaskTest (owner: Divij) > # WorkerSinkTaskThreadedTest (owner: Divij) > # WorkerTaskTest > # ErrorReporterTest > # RetryWithToleranceOperatorTest > # WorkerErrantRecordReporterTest > # ConnectorsResourceTest > # StandaloneHerderTest > # KafkaConfigBackingStoreTest > # KafkaOffsetBackingStoreTest (owner: Christo) > (https://github.com/apache/kafka/pull/12418) > # KafkaBasedLogTest > # RetryUtilTest > # RepartitionTopicTest (streams) (owner: Christo) > # StateManagerUtilTest (streams) (owner: Christo) > *The coverage report for the above tests after the change should be >= to > what the coverage is now.* -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14133) Remaining EasyMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575617#comment-17575617 ] Matthew de Detrich commented on KAFKA-14133: No worries, I will review them to help out. > Remaining EasyMock to Mockito tests > --- > > Key: KAFKA-14133 > URL: https://issues.apache.org/jira/browse/KAFKA-14133 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > > {color:#DE350B}There are tests which use both PowerMock and EasyMock. I have > put those in https://issues.apache.org/jira/browse/KAFKA-14132. Tests here > rely solely on EasyMock.{color} > Unless stated in brackets the tests are in the streams module. > A list of tests which still require to be moved from EasyMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > # WorkerConnectorTest (connect) (owner: Yash) > # WorkerCoordinatorTest (connect) > # RootResourceTest (connect) > # ByteArrayProducerRecordEquals (connect) > # TopologyTest > # KStreamFlatTransformTest > # KStreamFlatTransformValuesTest > # KStreamPrintTest > # KStreamRepartitionTest > # MaterializedInternalTest > # TransformerSupplierAdapterTest > # KTableSuppressProcessorMetricsTest > # KTableSuppressProcessorTest > # ClientUtilsTest > # HighAvailabilityStreamsPartitionAssignorTest > # InternalTopicManagerTest > # ProcessorContextImplTest > # ProcessorStateManagerTest > # StandbyTaskTest > # StoreChangelogReaderTest > # StreamTaskTest > # StreamThreadTest > # StreamsAssignmentScaleTest > # StreamsPartitionAssignorTest > # StreamsRebalanceListenerTest > # TaskManagerTest > # TimestampedKeyValueStoreMaterializerTest > # WriteConsistencyVectorTest > # AssignmentTestUtils > # StreamsMetricsImplTest > # CachingInMemoryKeyValueStoreTest > # CachingInMemorySessionStoreTest > # CachingPersistentSessionStoreTest > # CachingPersistentWindowStoreTest > # 
ChangeLoggingKeyValueBytesStoreTest > # ChangeLoggingSessionBytesStoreTest > # ChangeLoggingTimestampedKeyValueBytesStoreTest > # ChangeLoggingTimestampedWindowBytesStoreTest > # ChangeLoggingWindowBytesStoreTest > # CompositeReadOnlyWindowStoreTest > # KeyValueStoreBuilderTest > # MeteredTimestampedWindowStoreTest > # RocksDBStoreTest > # StreamThreadStateStoreProviderTest > # TimeOrderedCachingPersistentWindowStoreTest > # TimeOrderedWindowStoreTest > *The coverage report for the above tests after the change should be >= to > what the coverage is now.* -- This message was sent by Atlassian Jira (v8.20.10#820010)
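For the migrations tracked in this list, the mechanical shape of an EasyMock-to-Mockito conversion is roughly as follows. A hedged sketch with a made-up Greeter interface (not a class from the Kafka code base), showing EasyMock's record/replay/verify lifecycle replaced by Mockito's independent stubbing and verification:

```java
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

// Hypothetical collaborator standing in for whatever the test under migration mocks.
interface Greeter { String greet(String name); }

public class MockitoMigrationSketch {
    public static void main(String[] args) {
        // EasyMock style being removed (record/replay/verify lifecycle):
        //   Greeter g = EasyMock.createMock(Greeter.class);
        //   EasyMock.expect(g.greet("kafka")).andReturn("hello kafka");
        //   EasyMock.replay(g);
        //   ... exercise the code under test ...
        //   EasyMock.verify(g);

        // Mockito equivalent: stubbing and verification are separate,
        // explicit calls with no replay() step in between.
        Greeter g = mock(Greeter.class);
        when(g.greet("kafka")).thenReturn("hello kafka");

        System.out.println(g.greet("kafka")); // prints "hello kafka"
        verify(g).greet("kafka");
    }
}
```

Because Mockito mocks are lenient by default (unstubbed calls return null/zero rather than failing), coverage should be re-checked after each migration, which is presumably why the list insists the coverage report stay >= the current level.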
[jira] [Commented] (KAFKA-14133) Remaining EasyMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575435#comment-17575435 ] Matthew de Detrich commented on KAFKA-14133: [~divijvaidya] [~christo_lolov] As per the email on the mailing list, apparently this ticket is looking for an owner and I am happy to take it up. Everyone okay with that? > Remaining EasyMock to Mockito tests > --- > > Key: KAFKA-14133 > URL: https://issues.apache.org/jira/browse/KAFKA-14133 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > > {color:#DE350B}There are tests which use both PowerMock and EasyMock. I have > put those in https://issues.apache.org/jira/browse/KAFKA-14132. Tests here > rely solely on EasyMock.{color} > Unless stated in brackets the tests are in the streams module. > A list of tests which still require to be moved from EasyMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > # WorkerConnectorTest (connect) (owner: Yash) > # WorkerCoordinatorTest (connect) > # RootResourceTest (connect) > # ByteArrayProducerRecordEquals (connect) > # TopologyTest > # KStreamFlatTransformTest > # KStreamFlatTransformValuesTest > # KStreamPrintTest > # KStreamRepartitionTest > # MaterializedInternalTest > # TransformerSupplierAdapterTest > # KTableSuppressProcessorMetricsTest > # KTableSuppressProcessorTest > # ClientUtilsTest > # HighAvailabilityStreamsPartitionAssignorTest > # InternalTopicManagerTest > # ProcessorContextImplTest > # ProcessorStateManagerTest > # StandbyTaskTest > # StoreChangelogReaderTest > # StreamTaskTest > # StreamThreadTest > # StreamsAssignmentScaleTest > # StreamsPartitionAssignorTest > # StreamsRebalanceListenerTest > # TaskManagerTest > # TimestampedKeyValueStoreMaterializerTest > # WriteConsistencyVectorTest > # AssignmentTestUtils > # StreamsMetricsImplTest > # CachingInMemoryKeyValueStoreTest 
> # CachingInMemorySessionStoreTest > # CachingPersistentSessionStoreTest > # CachingPersistentWindowStoreTest > # ChangeLoggingKeyValueBytesStoreTest > # ChangeLoggingSessionBytesStoreTest > # ChangeLoggingTimestampedKeyValueBytesStoreTest > # ChangeLoggingTimestampedWindowBytesStoreTest > # ChangeLoggingWindowBytesStoreTest > # CompositeReadOnlyWindowStoreTest > # KeyValueStoreBuilderTest > # MeteredTimestampedWindowStoreTest > # RocksDBStoreTest > # StreamThreadStateStoreProviderTest > # TimeOrderedCachingPersistentWindowStoreTest > # TimeOrderedWindowStoreTest > *The coverage report for the above tests after the change should be >= to > what the coverage is now.* -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14132) Remaining PowerMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575434#comment-17575434 ] Matthew de Detrich commented on KAFKA-14132: [~divijvaidya] [~christo_lolov] As per the email on the mailing list, apparently this ticket is looking for an owner and I am happy to take it up. Everyone okay with that? > Remaining PowerMock to Mockito tests > > > Key: KAFKA-14132 > URL: https://issues.apache.org/jira/browse/KAFKA-14132 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > > {color:#DE350B}Some of the tests below use EasyMock as well. For those > migrate both PowerMock and EasyMock to Mockito.{color} > Unless stated in brackets the tests are in the connect module. > A list of tests which still require to be moved from PowerMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > # ErrorHandlingTaskTest (owner: Divij) > # SourceTaskOffsetCommiterTest (owner: Divij) > # WorkerMetricsGroupTest (owner: Divij) > # WorkerSinkTaskTest (owner: Divij) > # WorkerSinkTaskThreadedTest (owner: Divij) > # WorkerTaskTest > # ErrorReporterTest > # RetryWithToleranceOperatorTest > # WorkerErrantRecordReporterTest > # ConnectorsResourceTest > # StandaloneHerderTest > # KafkaConfigBackingStoreTest > # KafkaOffsetBackingStoreTest (owner: Christo) > (https://github.com/apache/kafka/pull/12418) > # KafkaBasedLogTest > # RetryUtilTest > # RepartitionTopicTest (streams) (owner: Christo) > # StateManagerUtilTest (streams) (owner: Christo) > *The coverage report for the above tests after the change should be >= to > what the coverage is now.* -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-13530) Flaky test ReplicaManagerTest
[ https://issues.apache.org/jira/browse/KAFKA-13530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17574920#comment-17574920 ] Matthew de Detrich commented on KAFKA-13530: So I ran the tests inside a docker container to simulate limited resources and couldn't replicate the flakiness > Flaky test ReplicaManagerTest > - > > Key: KAFKA-13530 > URL: https://issues.apache.org/jira/browse/KAFKA-13530 > Project: Kafka > Issue Type: Test > Components: core, unit tests >Reporter: Matthias J. Sax >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > kafka.server.ReplicaManagerTest.[1] usesTopicIds=true > {quote}org.opentest4j.AssertionFailedError: expected: but was: > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:35) at > org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:162) at > kafka.server.ReplicaManagerTest.assertFetcherHasTopicId(ReplicaManagerTest.scala:3502) > at > kafka.server.ReplicaManagerTest.testReplicaAlterLogDirsWithAndWithoutIds(ReplicaManagerTest.scala:3572){quote} > STDOUT > {quote}[2021-12-07 16:19:35,906] ERROR Error while reading checkpoint file > /tmp/kafka-6310287969113820536/replication-offset-checkpoint > (kafka.server.LogDirFailureChannel:76) java.nio.file.NoSuchFileException: > /tmp/kafka-6310287969113820536/replication-offset-checkpoint at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) at > sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) > at java.nio.file.Files.newByteChannel(Files.java:361) at > java.nio.file.Files.newByteChannel(Files.java:407) at > java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384) > 
at java.nio.file.Files.newInputStream(Files.java:152) at > java.nio.file.Files.newBufferedReader(Files.java:2784) at > java.nio.file.Files.newBufferedReader(Files.java:2816) at > org.apache.kafka.server.common.CheckpointFile.read(CheckpointFile.java:104) > at > kafka.server.checkpoints.CheckpointFileWithFailureHandler.read(CheckpointFileWithFailureHandler.scala:48) > at > kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:70) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets$lzycompute(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.fetch(OffsetCheckpointFile.scala:97) > at > kafka.server.checkpoints.LazyOffsetCheckpoints.fetch(OffsetCheckpointFile.scala:89) > at kafka.cluster.Partition.updateHighWatermark$1(Partition.scala:348) at > kafka.cluster.Partition.createLog(Partition.scala:361) at > kafka.cluster.Partition.maybeCreate$1(Partition.scala:334) at > kafka.cluster.Partition.createLogIfNotExists(Partition.scala:341) at > kafka.cluster.Partition.$anonfun$makeLeader$1(Partition.scala:546) at > kafka.cluster.Partition.makeLeader(Partition.scala:530) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$3(ReplicaManager.scala:2163) > at scala.Option.foreach(Option.scala:437) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2(ReplicaManager.scala:2160) > at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2$adapted(ReplicaManager.scala:2159) > at > kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355) > at > 
scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309) > at > kafka.server.ReplicaManager.applyLocalLeadersDelta(ReplicaManager.scala:2159) > at kafka.server.ReplicaManager.applyDelta(ReplicaManager.scala:2136) at > kafka.server.ReplicaManagerTest.testDeltaToLeaderOrFollowerMarksPartitionOfflineIfLogCantBeCreated(ReplicaManagerTest.scala:3349){quote} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-13513) Flaky test AdjustStreamThreadCountTest
[ https://issues.apache.org/jira/browse/KAFKA-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-13513: -- Assignee: Matthew de Detrich > Flaky test AdjustStreamThreadCountTest > -- > > Key: KAFKA-13513 > URL: https://issues.apache.org/jira/browse/KAFKA-13513 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > org.apache.kafka.streams.integration.AdjustStreamThreadCountTest.testConcurrentlyAccessThreads > {quote}java.lang.AssertionError: expected null, but > was: at > org.junit.Assert.fail(Assert.java:89) at > org.junit.Assert.failNotNull(Assert.java:756) at > org.junit.Assert.assertNull(Assert.java:738) at > org.junit.Assert.assertNull(Assert.java:748) at > org.apache.kafka.streams.integration.AdjustStreamThreadCountTest.testConcurrentlyAccessThreads(AdjustStreamThreadCountTest.java:367) > {quote} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-14014: -- Assignee: Matthew de Detrich > Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_17_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-13530) Flaky test ReplicaManagerTest
[ https://issues.apache.org/jira/browse/KAFKA-13530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17574099#comment-17574099 ] Matthew de Detrich commented on KAFKA-13530: Okay, so I have done some testing with docker to see if I can reproduce the flakiness under constrained resources, and I couldn't reproduce it, although progress on this is incredibly slow (for some reason, running tests under the Kafka core gradle project takes a *really* long time per run; it spends ~5-10 minutes in the configuring stage before even starting to compile the sources). [~mjsax] When was the last time you experienced this flaky test? > Flaky test ReplicaManagerTest > - > > Key: KAFKA-13530 > URL: https://issues.apache.org/jira/browse/KAFKA-13530 > Project: Kafka > Issue Type: Test > Components: core, unit tests >Reporter: Matthias J. Sax >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > kafka.server.ReplicaManagerTest.[1] usesTopicIds=true > {quote}org.opentest4j.AssertionFailedError: expected: but was: > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:35) at > org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:162) at > kafka.server.ReplicaManagerTest.assertFetcherHasTopicId(ReplicaManagerTest.scala:3502) > at > kafka.server.ReplicaManagerTest.testReplicaAlterLogDirsWithAndWithoutIds(ReplicaManagerTest.scala:3572){quote} > STDOUT > {quote}[2021-12-07 16:19:35,906] ERROR Error while reading checkpoint file > /tmp/kafka-6310287969113820536/replication-offset-checkpoint > (kafka.server.LogDirFailureChannel:76) java.nio.file.NoSuchFileException: > /tmp/kafka-6310287969113820536/replication-offset-checkpoint at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at 
> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) at > sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) > at java.nio.file.Files.newByteChannel(Files.java:361) at > java.nio.file.Files.newByteChannel(Files.java:407) at > java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384) > at java.nio.file.Files.newInputStream(Files.java:152) at > java.nio.file.Files.newBufferedReader(Files.java:2784) at > java.nio.file.Files.newBufferedReader(Files.java:2816) at > org.apache.kafka.server.common.CheckpointFile.read(CheckpointFile.java:104) > at > kafka.server.checkpoints.CheckpointFileWithFailureHandler.read(CheckpointFileWithFailureHandler.scala:48) > at > kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:70) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets$lzycompute(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.fetch(OffsetCheckpointFile.scala:97) > at > kafka.server.checkpoints.LazyOffsetCheckpoints.fetch(OffsetCheckpointFile.scala:89) > at kafka.cluster.Partition.updateHighWatermark$1(Partition.scala:348) at > kafka.cluster.Partition.createLog(Partition.scala:361) at > kafka.cluster.Partition.maybeCreate$1(Partition.scala:334) at > kafka.cluster.Partition.createLogIfNotExists(Partition.scala:341) at > kafka.cluster.Partition.$anonfun$makeLeader$1(Partition.scala:546) at > kafka.cluster.Partition.makeLeader(Partition.scala:530) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$3(ReplicaManager.scala:2163) > at scala.Option.foreach(Option.scala:437) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2(ReplicaManager.scala:2160) > at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2$adapted(ReplicaManager.scala:2159) > at > 
kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355) > at > scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309) > at > kafka.server.ReplicaManager.applyLocalLeadersDelta(ReplicaManager.scala:2159) > at kafka.server.ReplicaManager.applyDelta(ReplicaManager.scala:2136) at > kafka.server.ReplicaManagerTest.testDeltaToLeaderOrFollowerMarksPartitionOfflineIfLogCantBeCreated(ReplicaManagerTest.scala:3349){quote} -- This message was sent by Atlassian Jira
[jira] [Assigned] (KAFKA-13530) Flaky test ReplicaManagerTest
[ https://issues.apache.org/jira/browse/KAFKA-13530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-13530: -- Assignee: Matthew de Detrich > Flaky test ReplicaManagerTest > - > > Key: KAFKA-13530 > URL: https://issues.apache.org/jira/browse/KAFKA-13530 > Project: Kafka > Issue Type: Test > Components: core, unit tests >Reporter: Matthias J. Sax >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > kafka.server.ReplicaManagerTest.[1] usesTopicIds=true > {quote}org.opentest4j.AssertionFailedError: expected: but was: > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:35) at > org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:162) at > kafka.server.ReplicaManagerTest.assertFetcherHasTopicId(ReplicaManagerTest.scala:3502) > at > kafka.server.ReplicaManagerTest.testReplicaAlterLogDirsWithAndWithoutIds(ReplicaManagerTest.scala:3572){quote} > STDOUT > {quote}[2021-12-07 16:19:35,906] ERROR Error while reading checkpoint file > /tmp/kafka-6310287969113820536/replication-offset-checkpoint > (kafka.server.LogDirFailureChannel:76) java.nio.file.NoSuchFileException: > /tmp/kafka-6310287969113820536/replication-offset-checkpoint at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) at > sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) > at java.nio.file.Files.newByteChannel(Files.java:361) at > java.nio.file.Files.newByteChannel(Files.java:407) at > java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384) > at java.nio.file.Files.newInputStream(Files.java:152) at > 
java.nio.file.Files.newBufferedReader(Files.java:2784) at > java.nio.file.Files.newBufferedReader(Files.java:2816) at > org.apache.kafka.server.common.CheckpointFile.read(CheckpointFile.java:104) > at > kafka.server.checkpoints.CheckpointFileWithFailureHandler.read(CheckpointFileWithFailureHandler.scala:48) > at > kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:70) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets$lzycompute(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.fetch(OffsetCheckpointFile.scala:97) > at > kafka.server.checkpoints.LazyOffsetCheckpoints.fetch(OffsetCheckpointFile.scala:89) > at kafka.cluster.Partition.updateHighWatermark$1(Partition.scala:348) at > kafka.cluster.Partition.createLog(Partition.scala:361) at > kafka.cluster.Partition.maybeCreate$1(Partition.scala:334) at > kafka.cluster.Partition.createLogIfNotExists(Partition.scala:341) at > kafka.cluster.Partition.$anonfun$makeLeader$1(Partition.scala:546) at > kafka.cluster.Partition.makeLeader(Partition.scala:530) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$3(ReplicaManager.scala:2163) > at scala.Option.foreach(Option.scala:437) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2(ReplicaManager.scala:2160) > at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2$adapted(ReplicaManager.scala:2159) > at > kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355) > at > scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309) > at > 
kafka.server.ReplicaManager.applyLocalLeadersDelta(ReplicaManager.scala:2159) > at kafka.server.ReplicaManager.applyDelta(ReplicaManager.scala:2136) at > kafka.server.ReplicaManagerTest.testDeltaToLeaderOrFollowerMarksPartitionOfflineIfLogCantBeCreated(ReplicaManagerTest.scala:3349){quote} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-13530) Flaky test ReplicaManagerTest
[ https://issues.apache.org/jira/browse/KAFKA-13530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17573282#comment-17573282 ] Matthew de Detrich commented on KAFKA-13530: So I tried to simulate this test failure under normal circumstances and couldn't do so. I will retry the tests using docker to limit the available resources, to see if I can make the test fail under those circumstances. > Flaky test ReplicaManagerTest > - > > Key: KAFKA-13530 > URL: https://issues.apache.org/jira/browse/KAFKA-13530 > Project: Kafka > Issue Type: Test > Components: core, unit tests >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > > kafka.server.ReplicaManagerTest.[1] usesTopicIds=true > {quote}org.opentest4j.AssertionFailedError: expected: but was: > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:35) at > org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:162) at > kafka.server.ReplicaManagerTest.assertFetcherHasTopicId(ReplicaManagerTest.scala:3502) > at > kafka.server.ReplicaManagerTest.testReplicaAlterLogDirsWithAndWithoutIds(ReplicaManagerTest.scala:3572){quote} > STDOUT > {quote}[2021-12-07 16:19:35,906] ERROR Error while reading checkpoint file > /tmp/kafka-6310287969113820536/replication-offset-checkpoint > (kafka.server.LogDirFailureChannel:76) java.nio.file.NoSuchFileException: > /tmp/kafka-6310287969113820536/replication-offset-checkpoint at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) at > sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) > at java.nio.file.Files.newByteChannel(Files.java:361) at > java.nio.file.Files.newByteChannel(Files.java:407) at > 
java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384) > at java.nio.file.Files.newInputStream(Files.java:152) at > java.nio.file.Files.newBufferedReader(Files.java:2784) at > java.nio.file.Files.newBufferedReader(Files.java:2816) at > org.apache.kafka.server.common.CheckpointFile.read(CheckpointFile.java:104) > at > kafka.server.checkpoints.CheckpointFileWithFailureHandler.read(CheckpointFileWithFailureHandler.scala:48) > at > kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:70) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets$lzycompute(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.fetch(OffsetCheckpointFile.scala:97) > at > kafka.server.checkpoints.LazyOffsetCheckpoints.fetch(OffsetCheckpointFile.scala:89) > at kafka.cluster.Partition.updateHighWatermark$1(Partition.scala:348) at > kafka.cluster.Partition.createLog(Partition.scala:361) at > kafka.cluster.Partition.maybeCreate$1(Partition.scala:334) at > kafka.cluster.Partition.createLogIfNotExists(Partition.scala:341) at > kafka.cluster.Partition.$anonfun$makeLeader$1(Partition.scala:546) at > kafka.cluster.Partition.makeLeader(Partition.scala:530) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$3(ReplicaManager.scala:2163) > at scala.Option.foreach(Option.scala:437) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2(ReplicaManager.scala:2160) > at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2$adapted(ReplicaManager.scala:2159) > at > kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355) > at > 
scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309) > at > kafka.server.ReplicaManager.applyLocalLeadersDelta(ReplicaManager.scala:2159) > at kafka.server.ReplicaManager.applyDelta(ReplicaManager.scala:2136) at > kafka.server.ReplicaManagerTest.testDeltaToLeaderOrFollowerMarksPartitionOfflineIfLogCantBeCreated(ReplicaManagerTest.scala:3349){quote} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17573211#comment-17573211 ] Matthew de Detrich edited comment on KAFKA-14014 at 7/30/22 5:57 AM: - In case people want to reproduce the flakiness, assuming you have a working docker installation, you can do the following {code:java} docker run -it --cpus=2 --rm -u gradle -v "$PWD":/home/gradle/project -w /home/gradle/project gradle sh{code} where cpus=2 is how you toggle how many cpus you want (there is a tradeoff between how often the flakiness occurs and how fast the test runs). I wouldn't recommend going lower than cpus=2, otherwise even building the Kafka project in gradle can take ages. The above command will put you into a shell, at which point you can do {code:java} while [ $? -eq 0 ]; do ./gradlew :streams:test --tests org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets; done{code} which will re-run the tests until there is a failure. Note that due to how Gradle caches test runs, you need to do something like [https://github.com/gradle/gradle/issues/9151#issue-434212465] in order to force gradle to re-run the test every time. was (Author: mdedetrich-aiven): In case people want to reproduce the flakiness, assuming you have a working docker installation, you can do the following {code:java} docker run -it --cpus=2 --rm -u gradle -v "$PWD":/home/gradle/project -w /home/gradle/project gradle sh{code} where cpus=2 is how you toggle how many cpus you want (there is a tradeoff between how often the flakiness occurs and how fast the test runs). I wouldn't recommend going lower than cpus=2, otherwise even building the Kafka project in gradle can take ages. The above command will put you into a shell, at which point you can do {code:java} while [ $? 
-eq 0 ]; do ./gradlew :streams:test --tests org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets; done{code} Which will re-run the tests until there is a failure. > Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} >
[jira] [Commented] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17573211#comment-17573211 ] Matthew de Detrich commented on KAFKA-14014: In case people want to reproduce the flakiness, assuming you have a working docker installation, you can do the following {code:java} docker run -it --cpus=2 --rm -u gradle -v "$PWD":/home/gradle/project -w /home/gradle/project gradle sh{code} where cpus=2 is how you toggle how many cpus you want (there is a tradeoff between how often the flakiness occurs and how fast the test runs). I wouldn't recommend going lower than cpus=2, otherwise even building the Kafka project in gradle can take ages. The above command will put you into a shell, at which point you can do {code:java} while [ $? -eq 0 ]; do ./gradlew :streams:test --tests org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets; done{code} which will re-run the tests until there is a failure. 
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_17_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
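The rerun-until-failure loop from the comment above can be sketched generically. In the sketch below, `run_test` is a hypothetical stand-in for the real `./gradlew :streams:test --tests ...` invocation; it simulates a flaky test that fails on its third run, so the loop's behaviour is visible without a Kafka checkout:

```shell
# Re-run a (possibly flaky) command until it fails, counting attempts.
# run_test is a stand-in for the real gradle test invocation; here it
# simulates a flaky test that passes twice and then fails.
runs=0
run_test() {
  runs=$((runs + 1))
  [ "$runs" -lt 3 ]   # exit status 0 (pass) on runs 1 and 2, non-zero on run 3
}

while run_test; do
  : # test passed; try again
done
echo "test failed after $runs runs"
```

With the real gradle invocation substituted in, remember to defeat Gradle's test-result caching (e.g. via the workaround linked in the comment above), or the loop will just replay cached results instead of re-executing the test.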
[jira] [Commented] (KAFKA-13467) Clients never refresh cached bootstrap IPs
[ https://issues.apache.org/jira/browse/KAFKA-13467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17573144#comment-17573144 ] Matthew de Detrich commented on KAFKA-13467: [~ijuma] are you talking about https://issues.apache.org/jira/browse/KAFKA-13405? If so, it's already closed. > Clients never refresh cached bootstrap IPs > -- > > Key: KAFKA-13467 > URL: https://issues.apache.org/jira/browse/KAFKA-13467 > Project: Kafka > Issue Type: Improvement > Components: clients, network >Reporter: Matthias J. Sax >Priority: Minor > > Follow-up ticket to https://issues.apache.org/jira/browse/KAFKA-13405. > For certain broker rolling upgrade scenarios, it would be beneficial to > expire cached bootstrap server IP addresses and re-resolve those IPs to > allow clients to re-connect to the cluster without the need to restart the > client. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-13467) Clients never refresh cached bootstrap IPs
[ https://issues.apache.org/jira/browse/KAFKA-13467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17572788#comment-17572788 ] Matthew de Detrich edited comment on KAFKA-13467 at 7/29/22 7:18 AM: - I am currently having a look at this issue, although due to the nature of the ticket I am diving a bit in the deep end. [~rsivaram] , [~ijuma] [~mimaison] would you be able to give me some pointers on how best to approach this? Following what [~dengziming] said earlier, I am currently looking at SocketServer, which seems to be the core of where all of the netty connections are initialized, but this may be the wrong place to make the change. Furthermore, I may be missing something, but naively implementing the "broker disconnecting a connection when the client connects to force an IP change" would create an infinite loop unless one checks for a condition (i.e. only on broker upgrade, is this possible) or maybe some other condition? was (Author: mdedetrich-aiven): I am currently having a look at this issue, although due to the nature of the ticket I am diving a bit in the deep end. [~rsivaram] , [~ijuma] [~mimaison] would you be able to give me some pointers on how best to approach this. Following what [~dengziming] said earlier, I am currently looking at SocketServer, which seems to be the core of where all of the netty connections are initialized, but this may be the wrong place to make the change. Furthermore, I may be missing something, but naively implementing the "broker disconnecting a connection when the client connects to force an IP change" would create an infinite loop unless one checks for a condition (i.e. only on broker upgrade, is this possible) or maybe some other condition? > Clients never refresh cached bootstrap IPs > -- > > Key: KAFKA-13467 > URL: https://issues.apache.org/jira/browse/KAFKA-13467 > Project: Kafka > Issue Type: Improvement > Components: clients, network >Reporter: Matthias J. 
Sax >Priority: Minor > > Follow-up ticket to https://issues.apache.org/jira/browse/KAFKA-13405. > For certain broker rolling upgrade scenarios, it would be beneficial to > expire cached bootstrap server IP addresses and re-resolve those IPs to > allow clients to re-connect to the cluster without the need to restart the > client. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-13467) Clients never refresh cached bootstrap IPs
[ https://issues.apache.org/jira/browse/KAFKA-13467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17572788#comment-17572788 ] Matthew de Detrich edited comment on KAFKA-13467 at 7/29/22 7:18 AM: - I am currently having a look at this issue, although due to the nature of the ticket I am diving in a bit at the deep end. [~rsivaram] , [~ijuma] [~mimaison] would you be able to give me some pointers on how best to approach this? Following what [~dengziming] said earlier, I am currently looking at SocketServer, which seems to be the core of where all of the netty connections are initialized, but this may be the wrong place to make the change. Furthermore, I may be missing something, but naively implementing the "broker disconnecting a connection when the client connects to force a cache refresh to get new IPs from the FQDN" would create an infinite loop unless one checks for a condition on that disconnect (i.e. only on broker upgrade, is this possible?) or maybe some other condition? was (Author: mdedetrich-aiven): I am currently having a look at this issue, although due to the nature of the ticket I am diving in a bit at the deep end. [~rsivaram] , [~ijuma] [~mimaison] would you be able to give me some pointers on how best to approach this? Following what [~dengziming] said earlier, I am currently looking at SocketServer, which seems to be the core of where all of the netty connections are initialized, but this may be the wrong place to make the change. Furthermore, I may be missing something, but naively implementing the "broker disconnecting a connection when the client connects to force an IP change" would create an infinite loop unless one checks for a condition (i.e. only on broker upgrade, is this possible?) or maybe some other condition? > Clients never refresh cached bootstrap IPs > -- > > Key: KAFKA-13467 > URL: https://issues.apache.org/jira/browse/KAFKA-13467 > Project: Kafka > Issue Type: Improvement > Components: clients, network >Reporter: Matthias J. 
Sax >Priority: Minor > > Follow up ticket to https://issues.apache.org/jira/browse/KAFKA-13405. > For certain broker rolling upgrade scenarios, it would be beneficial to > expire cached bootstrap server IP addresses and re-resolve those IPs to > allow clients to re-connect to the cluster without the need to restart the > client. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-13467) Clients never refresh cached bootstrap IPs
[ https://issues.apache.org/jira/browse/KAFKA-13467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17572788#comment-17572788 ] Matthew de Detrich commented on KAFKA-13467: I am currently having a look at this issue, although due to the nature of the ticket I am diving in a bit at the deep end. [~rsivaram] , [~ijuma] [~mimaison] would you be able to give me some pointers on how best to approach this. Following what [~dengziming] said earlier, I am currently looking at SocketServer, which seems to be the core of where all of the netty connections are initialized, but this may be the wrong place to make the change. Furthermore, I may be missing something, but naively implementing the "broker disconnecting a connection when the client connects to force an IP change" would create an infinite loop unless one checks for a condition (i.e. only on broker upgrade, is this possible?) or maybe some other condition? > Clients never refresh cached bootstrap IPs > -- > > Key: KAFKA-13467 > URL: https://issues.apache.org/jira/browse/KAFKA-13467 > Project: Kafka > Issue Type: Improvement > Components: clients, network >Reporter: Matthias J. Sax >Priority: Minor > > Follow up ticket to https://issues.apache.org/jira/browse/KAFKA-13405. > For certain broker rolling upgrade scenarios, it would be beneficial to > expire cached bootstrap server IP addresses and re-resolve those IPs to > allow clients to re-connect to the cluster without the need to restart the > client. -- This message was sent by Atlassian Jira (v8.20.10#820010)
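One possible direction for the re-resolution side of this ticket, sketched very loosely: when a connection attempt against a cached bootstrap IP fails, the client could re-resolve the bootstrap hostname through DNS rather than reusing the stale address. This is a minimal illustration only; `BootstrapReresolver` and `resolve` are hypothetical names, not part of Kafka's client internals.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class BootstrapReresolver {
    // Re-resolve a bootstrap hostname to its current IP addresses instead of
    // reusing addresses cached at client construction time.
    public static List<String> resolve(String host) {
        List<String> ips = new ArrayList<>();
        try {
            for (InetAddress addr : InetAddress.getAllByName(host)) {
                ips.add(addr.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // DNS failure: return an empty list so the caller can fall back
            // to previously cached addresses or retry later.
        }
        return ips;
    }

    public static void main(String[] args) {
        // "localhost" always resolves, so the returned list is non-empty.
        System.out.println(!resolve("localhost").isEmpty());
    }
}
```

The open question raised in the comment remains: deciding *when* to trigger such a re-resolution (e.g. only after a broker-initiated disconnect) is exactly the condition that would prevent the infinite-loop scenario.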
[jira] [Created] (KAFKA-14103) Check for hostnames in CoreUtils.scala
Matthew de Detrich created KAFKA-14103: -- Summary: Check for hostnames in CoreUtils.scala Key: KAFKA-14103 URL: https://issues.apache.org/jira/browse/KAFKA-14103 Project: Kafka Issue Type: Improvement Components: core Reporter: Matthew de Detrich In the process of working on [https://github.com/apache/kafka/pull/11478] we realized that Kafka does not do any hostname validation when parsing listener configurations. It would be ideal to investigate hostname validation so that we can eagerly short-circuit on invalid hostnames rather than deferring to the current behaviour (which still needs to be verified). -- This message was sent by Atlassian Jira (v8.20.10#820010)
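As a rough illustration of what eager hostname validation at config-parsing time could look like (a sketch only; `HostnameValidator` is a hypothetical name, not the actual CoreUtils API), the RFC 1123 label rules can be checked with a small regex:

```java
import java.util.regex.Pattern;

public class HostnameValidator {
    // RFC 1123 label: alphanumeric start and end, hyphens allowed inside,
    // at most 63 characters per label.
    private static final Pattern LABEL =
        Pattern.compile("[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?");

    public static boolean isValidHostname(String host) {
        if (host == null || host.isEmpty() || host.length() > 253) {
            return false;
        }
        // Split keeps empty strings for "a..b" or a trailing dot, so those
        // malformed names are rejected by the per-label check below.
        for (String label : host.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidHostname("broker-1.example.com"));
        System.out.println(isValidHostname("-bad.example.com"));
    }
}
```

A check like this would let listener parsing fail fast with a clear error instead of surfacing the problem later at bind or connect time.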
[jira] [Commented] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564158#comment-17564158 ] Matthew de Detrich commented on KAFKA-14014: Okay, so another update: I think my async theory doesn't hold up. After running the above code over a few days, it doesn't appear to be making a difference if you account for the extra time that the test is taking due to the Thread.sleep. I did, however, notice something interesting: there are actually two types of stack traces that you can get, the one mentioned in the ticket and this one {code:java} org.apache.kafka.streams.integration.NamedTopologyIntegrationTest > shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets FAILED java.lang.AssertionError: Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1), KeyValue(C, 2)]> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) at org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:544){code} As you can see, in this case C appears to be counted twice, once with a value of 1 and once with a value of 2. Due to this I suspect that there may be some issue with the underlying stream that only appears under heavy load. 
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_17_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563162#comment-17563162 ] Matthew de Detrich edited comment on KAFKA-14014 at 7/6/22 11:27 AM: - So I think I have gotten somewhere. I tried adjusting various timeouts to no avail, so I moved on to another theory, which is that some topology/stream-based methods are asynchronous, i.e. they are doing something in the background even after the calling function has finished. To test this theory I have adjusted the flaky test by adding some manual Thread.sleep calls to account for this supposed asynchronicity, i.e. {code:java} final AddNamedTopologyResult result1 = streams.addNamedTopology(topology2Client1); Thread.sleep(500); streams2.addNamedTopology(topology2Client2).all().get(); Thread.sleep(500); result1.all().get(); Thread.sleep(500);{code} This appears to be greatly reducing the flakiness. Originally I used 200 millis, which roughly increased the amount of time until failure by 4x; however, I still got a failure, so now I am testing it with the 500 millis as shown above. was (Author: mdedetrich-aiven): So I think I have gotten somewhere. I tried adjusting various timeouts to no avail, so I moved on to another theory, which is that some topology/stream-based methods are asynchronous, i.e. they are doing something in the background even after the calling function has finished. To test this theory I have adjusted the flaky test by adding some manual sleeps, i.e. {code:java} final AddNamedTopologyResult result1 = streams.addNamedTopology(topology2Client1); Thread.sleep(500); streams2.addNamedTopology(topology2Client2).all().get(); Thread.sleep(500); result1.all().get(); Thread.sleep(500);{code} This appears to be greatly reducing the flakiness. Originally I used 200 millis, which roughly increased the amount of time until failure by 4x; however, I still got a failure, so now I am testing it with the 500 millis as shown above. 
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ >
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563004#comment-17563004 ] Matthew de Detrich edited comment on KAFKA-14014 at 7/6/22 11:24 AM: - [~cadonna] So I did some debugging on this ticket over the past week and I found out some interesting things. To start off with, I did manage to predictably replicate the test's flakiness, and it does appear to be related to load, i.e. the test is more flaky the fewer CPU resources it has. I am using docker (i.e. running the tests within the docker gradle image) with the --cpu flag to limit resources. Interestingly, I have gone up to 5 CPUs and it's still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further. Note that 5 CPUs is already considered "high" (at least for a machine that would reasonably run kafka streams). The machine I am running on has 12 CPUs (6 cores, 12 threads), and at least when running with all of the resources on the machine I couldn't replicate the flaky test. was (Author: mdedetrich-aiven): [~cadonna] So I did some debugging on this ticket over the past week and I found out some interesting things. To start off with, I did manage to predictably replicate the test's flakiness, and it does appear to be related to load, i.e. the test is more flaky the fewer CPU resources it has. I am using docker (i.e. running the tests within the docker gradle image) with the --cpu flag to limit resources. Interestingly, I have gone up to 5 CPUs and it's still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further. Note that 5 CPUs is already considered "high" (at least for a machine that would reasonably run kafka streams). 
The machine I am running on has 12 CPUs (6 cores, 12 threads), and at least when running with all of the resources on the machine I couldn't replicate the flaky test. > Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ >
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563162#comment-17563162 ] Matthew de Detrich edited comment on KAFKA-14014 at 7/6/22 11:23 AM: - So I think I have gotten somewhere. I tried adjusting various timeouts to no avail, so I moved on to another theory, which is that some topology/stream-based methods are asynchronous, i.e. they are doing something in the background even after the calling function has finished. To test this theory I have adjusted the flaky test by adding some manual sleeps, i.e. {code:java} final AddNamedTopologyResult result1 = streams.addNamedTopology(topology2Client1); Thread.sleep(500); streams2.addNamedTopology(topology2Client2).all().get(); Thread.sleep(500); result1.all().get(); Thread.sleep(500);{code} This appears to be greatly reducing the flakiness. Originally I used 200 millis, which roughly increased the amount of time until failure by 4x; however, I still got a failure, so now I am testing it with the 500 millis as shown above. was (Author: mdedetrich-aiven): So I think I have gotten somewhere. I tried adjusting various timeouts to no avail, so I moved on to another theory, which is that some topology/stream-based methods are asynchronous, i.e. they are doing something in the background even after the calling function has finished. To test this theory I have adjusted the flaky test by adding some manual sleeps, i.e. {code:java} final AddNamedTopologyResult result1 = streams.addNamedTopology(topology2Client1); Thread.sleep(500); streams2.addNamedTopology(topology2Client2).all().get(); Thread.sleep(500); result1.all().get(); Thread.sleep(500);{code} This appears to be greatly reducing the flakiness. Originally I used 200 millis, which roughly increased the amount of time until failure by 4x, and now I am testing it with the 500 millis as shown above. 
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_17_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ --
[jira] [Commented] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563162#comment-17563162 ] Matthew de Detrich commented on KAFKA-14014: So I think I have gotten somewhere. I tried adjusting various timeouts to no avail, so I moved on to another theory, which is that some topology/stream-based methods are asynchronous, i.e. they are doing something in the background even after the calling function has finished. To test this theory I have adjusted the flaky test by adding some manual sleeps, i.e. {code:java} final AddNamedTopologyResult result1 = streams.addNamedTopology(topology2Client1); Thread.sleep(500); streams2.addNamedTopology(topology2Client2).all().get(); Thread.sleep(500); result1.all().get(); Thread.sleep(500);{code} This appears to be greatly reducing the flakiness. Originally I used 200 millis, which roughly increased the amount of time until failure by 4x, and now I am testing it with the 500 millis as shown above. > Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_17_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
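Rather than fixed Thread.sleep calls, a more robust pattern for timing-dependent tests like this one is to poll for the expected condition with a deadline, similar in spirit to the wait-for-condition helpers commonly found in test utilities. The sketch below is illustrative only; `WaitUtil` and `waitForCondition` are assumed names, not the project's actual helper.

```java
import java.util.function.BooleanSupplier;

public class WaitUtil {
    // Poll the condition until it holds or the timeout elapses. Returns
    // true if the condition was observed to hold before the deadline.
    public static boolean waitForCondition(BooleanSupplier condition,
                                           long timeoutMs,
                                           long pollIntervalMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            Thread.sleep(pollIntervalMs);
        }
        // One last check at the deadline to avoid a missed final poll.
        return condition.getAsBoolean();
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // This condition becomes true after ~100 ms; the wait returns early
        // instead of always burning a fixed 500 ms sleep.
        boolean ok = waitForCondition(
            () -> System.currentTimeMillis() - start >= 100, 5000, 10);
        System.out.println(ok);
    }
}
```

Polling with a generous timeout keeps the test fast on an idle machine while tolerating the CPU-starved conditions under which the flakiness was reproduced.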
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563004#comment-17563004 ] Matthew de Detrich edited comment on KAFKA-14014 at 7/6/22 7:29 AM: [~cadonna] So I did some debugging on this ticket over the past week and I found out some interesting things. To start off with, I did manage to predictably replicate the test's flakiness, and it does appear to be related to load, i.e. the test is more flaky the fewer CPU resources it has. I am using docker (i.e. running the tests within the docker gradle image) with the --cpu flag to limit resources. Interestingly, I have gone up to 5 CPUs and it's still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further. Note that 5 CPUs is already considered "high" (at least for a machine that would reasonably run kafka streams). The machine I am running on has 12 CPUs (6 cores, 12 threads), and at least when running with all of the resources on the machine I couldn't replicate the flaky test. was (Author: mdedetrich-aiven): [~cadonna] So I did some debugging on this ticket over the past week and I found out some interesting things. To start off with, I did manage to predictably replicate the test, and its flakiness does appear to be related to load, i.e. the test is more flaky the fewer CPU resources it has. I am using docker (i.e. running the tests within the docker gradle image) with the --cpu flag to limit resources. Interestingly, I have gone up to 5 CPUs and it's still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further. Note that 5 CPUs is already considered "high" (at least for a machine that would reasonably run kafka streams). 
The machine I am running on has 12 CPUs (6 cores, 12 threads), and at least when running with all of the resources on the machine I couldn't replicate the flaky test. > Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) 
> at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ >
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563004#comment-17563004 ] Matthew de Detrich edited comment on KAFKA-14014 at 7/6/22 7:27 AM: [~cadonna] So I did some debugging on this ticket over the past week and I found out some interesting things. To start off with, I did manage to predictably replicate the test, and its flakiness does appear to be related to load, i.e. the test is more flaky the fewer CPU resources it has. I am using docker (i.e. running the tests within the docker gradle image) with the --cpu flag to limit resources. Interestingly, I have gone up to 5 CPUs and it's still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further. Note that 5 CPUs is already considered "high" (at least for a machine that would reasonably run kafka streams). The machine I am running on has 12 CPUs (6 cores, 12 threads), and at least when running with all of the resources on the machine I couldn't replicate the flaky test. was (Author: mdedetrich-aiven): [~cadonna] So I did some debugging on this ticket over the past week and I found out some interesting things. To start off with, the test is more flaky the fewer CPU resources it has. I am using docker (i.e. running the tests within the docker gradle image) with the --cpu flag to limit resources. Interestingly, I have gone up to 5 CPUs and it's still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further. Note that 5 CPUs is already considered "high" (at least for a machine that would reasonably run kafka streams). The machine I am running on has 12 CPUs (6 cores, 12 threads), and at least when running with all of the resources on the machine I couldn't replicate the flaky test. 
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ >
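The CPU-limited Docker setup described in the comment above can be sketched roughly as follows. The Gradle image tag, mount path, and test filter are illustrative assumptions, not taken from the ticket; the command is printed rather than executed so the sketch has no side effects:

```shell
# Sketch: cap CPU resources so load-dependent flakiness becomes reproducible.
# Assumptions (not from the ticket): the Kafka checkout is the current
# directory, and the official Gradle JDK 11 image tag is "gradle:jdk11".
CPU_LIMIT=5   # the comment above still saw (rarer) failures at 5 CPUs

# --cpus is Docker's hard CPU cap for the container.
DOCKER_CMD="docker run --rm --cpus=$CPU_LIMIT -v $PWD:/kafka -w /kafka"
GRADLE_CMD="./gradlew :streams:test --tests '*NamedTopologyIntegrationTest*'"
FULL_CMD="$DOCKER_CMD gradle:jdk11 $GRADLE_CMD"

# Print rather than execute, so the sketch is safe to run anywhere:
echo "$FULL_CMD"
```

Lowering `CPU_LIMIT` should, per the observations above, make the failure more frequent.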
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563004#comment-17563004 ] Matthew de Detrich edited comment on KAFKA-14014 at 7/6/22 7:26 AM: [~cadonna] So I did some debugging on this ticket over the past week and found out some interesting things. To start off with, the test is more flaky the fewer CPU resources it has. I am using Docker (i.e. running the tests within the Docker Gradle image), using the --cpus flag to limit resources. Interestingly, I have gone up to 5 CPUs and it is still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further. Note that 5 CPUs is already considered "high" (at least for a machine that would reasonably run Kafka Streams). The machine I am running on has 12 "CPUs" (6 cores, 12 threads), and at least when running with all of the resources of the machine I couldn't replicate the flaky test. was (Author: mdedetrich-aiven): [~cadonna] So I did some debugging on this ticket over the past week and found out some interesting things. To start off with, the test is more flaky the fewer CPU resources it has. I am using Docker (i.e. running the tests within the Docker Gradle image), using the --cpus flag to limit resources. Interestingly, I have gone up to 5 CPUs and it is still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further.
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563004#comment-17563004 ] Matthew de Detrich commented on KAFKA-14014: [~cadonna] So I did some debugging on this ticket over the past week and found out some interesting things. To start off with, the test is more flaky the fewer CPU resources it has. I am using Docker (i.e. running the tests within the Docker Gradle image), using the --cpus flag to limit resources. Interestingly, I have gone up to 5 CPUs and it is still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further.
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17557842#comment-17557842 ] Matthew de Detrich commented on KAFKA-14014: So I ran the tests overnight with JDK 11 and had no failures in ~10k runs; I suspect that JDK 17 will give the same result. This opens up a few more possibilities as to why I may not be able to reproduce the failure locally:
1. It is Gradle specific (going to create a runner in Gradle).
2. The JVM on the CI is being run with different options that could be causing the failure (I have to check up on how the CI works).
3. Other tests may be running concurrently on the CI, and the extra load causes certain parts of the test setup to take longer, which may be causing race conditions.
I will look into this a bit later; any other hints/clues about possible reasons behind the failure would be greatly appreciated!
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() -- This message was sent by Atlassian Jira (v8.20.7#820007)
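The "run it overnight until it fails" experiment described in the comments above can be sketched as a small shell helper; the Gradle module and test filter in the usage comment are assumptions, not from the ticket:

```shell
# Sketch: rerun a command until it fails or a run budget is exhausted,
# mirroring the ~10k-run overnight experiment described above.
run_until_failure() {
  max_runs=$1
  shift
  i=1
  while [ "$i" -le "$max_runs" ]; do
    if ! "$@"; then
      echo "failed on run $i"
      return 1
    fi
    i=$((i + 1))
  done
  echo "no failure in $max_runs runs"
  return 0
}

# Hypothetical usage (module and test filter are assumptions):
# run_until_failure 10000 ./gradlew :streams:test \
#     --tests '*NamedTopologyIntegrationTest*'
```

Stopping at the first failing run keeps the logs of that run available for inspection, which is the main advantage over a fixed-count loop.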
[jira] [Commented] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17557564#comment-17557564 ] Matthew de Detrich commented on KAFKA-14014: Okay, so I just noticed that the linked builds use different JDK versions (11/17 versus my 16), so I will rerun the tests using those JDK versions to see if I can simulate the CI.
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17557525#comment-17557525 ] Matthew de Detrich edited comment on KAFKA-14014 at 6/22/22 3:29 PM: - So interestingly I worked on another ticket that reported the exact same test as being flaky (see https://issues.apache.org/jira/browse/KAFKA-13531) and I couldn't reproduce any flakiness (hence the reason why that ticket is closed). I rebased my fork of Kafka to the latest version in trunk and started running this test, and so far I have 1.6k runs without a failure (continuously running the tests in a loop as I post this comment). [~cadonna] Would it be possible to state how you are running the tests and also which JDK version (and which branch, if you are not running latest trunk)? Personally I am using Java 16, and to run the tests I am using the JUnit (not Gradle) runner with IntelliJ (primarily because it gives the ability to run a test until it fails). was (Author: mdedetrich-aiven): So interestingly I worked on another ticket that reported the exact same test as being flaky (see https://issues.apache.org/jira/browse/KAFKA-13531) and I couldn't reproduce any flakiness. I rebased my fork of Kafka to the latest version in trunk and started running this test, and so far I have 1.6k runs without a failure (re-running the tests in a loop as I post this comment). [~cadonna] Would it be possible to state how you are running the tests and also which JDK version (and which branch, if you are not running latest trunk)? Personally I am using Java 16, and to run the tests I am using the JUnit (not Gradle) runner with IntelliJ (primarily because it gives the ability to run a test until it fails).
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17557525#comment-17557525 ] Matthew de Detrich commented on KAFKA-14014: So interestingly I worked on another ticket that reported the exact same test as being flaky (see https://issues.apache.org/jira/browse/KAFKA-13531) and I couldn't reproduce any flakiness. I rebased my fork of Kafka to the latest version in trunk and started running this test, and so far I have 1.6k runs without a failure (re-running the tests in a loop as I post this comment). [~cadonna] Would it be possible to state how you are running the tests and also which JDK version (and which branch, if you are not running latest trunk)? Personally I am using Java 16, and to run the tests I am using the JUnit (not Gradle) runner with IntelliJ (primarily because it gives the ability to run a test until it fails).
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (KAFKA-6527) Transient failure in DynamicBrokerReconfigurationTest.testDefaultTopicConfig
[ https://issues.apache.org/jira/browse/KAFKA-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-6527: -- Labels: flakey (was: ) > Transient failure in DynamicBrokerReconfigurationTest.testDefaultTopicConfig > > > Key: KAFKA-6527 > URL: https://issues.apache.org/jira/browse/KAFKA-6527 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Blocker > Labels: flakey > Fix For: 3.3.0 > > > {code:java} > java.lang.AssertionError: Log segment size increase not applied > at kafka.utils.TestUtils$.fail(TestUtils.scala:355) > at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:865) > at > kafka.server.DynamicBrokerReconfigurationTest.testDefaultTopicConfig(DynamicBrokerReconfigurationTest.scala:348) > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (KAFKA-13897) Add 3.1.1 to system tests and streams upgrade tests
[ https://issues.apache.org/jira/browse/KAFKA-13897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-13897: --- Labels: flakey (was: ) > Add 3.1.1 to system tests and streams upgrade tests > --- > > Key: KAFKA-13897 > URL: https://issues.apache.org/jira/browse/KAFKA-13897 > Project: Kafka > Issue Type: Task > Components: streams, system tests >Reporter: Tom Bentley >Priority: Blocker > Labels: flakey > Fix For: 3.3.0, 3.1.2, 3.2.1 > > > Per the penultimate bullet on the [release > checklist|https://cwiki.apache.org/confluence/display/KAFKA/Release+Process#ReleaseProcess-Afterthevotepasses], > Kafka v3.1.1 is released. We should add this version to the system tests. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (KAFKA-13897) Add 3.1.1 to system tests and streams upgrade tests
[ https://issues.apache.org/jira/browse/KAFKA-13897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-13897: --- Labels: (was: flakey) > Add 3.1.1 to system tests and streams upgrade tests > --- > > Key: KAFKA-13897 > URL: https://issues.apache.org/jira/browse/KAFKA-13897 > Project: Kafka > Issue Type: Task > Components: streams, system tests >Reporter: Tom Bentley >Priority: Blocker > Fix For: 3.3.0, 3.1.2, 3.2.1 > > > Per the penultimate bullet on the [release > checklist|https://cwiki.apache.org/confluence/display/KAFKA/Release+Process#ReleaseProcess-Afterthevotepasses], > Kafka v3.1.1 is released. We should add this version to the system tests. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (KAFKA-8280) Flaky Test DynamicBrokerReconfigurationTest#testUncleanLeaderElectionEnable
[ https://issues.apache.org/jira/browse/KAFKA-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-8280: -- Labels: flakey (was: ) > Flaky Test DynamicBrokerReconfigurationTest#testUncleanLeaderElectionEnable > --- > > Key: KAFKA-8280 > URL: https://issues.apache.org/jira/browse/KAFKA-8280 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.3.0 >Reporter: John Roesler >Priority: Blocker > Labels: flakey > Fix For: 3.3.0 > > > I saw this fail again on > https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3979/testReport/junit/kafka.server/DynamicBrokerReconfigurationTest/testUncleanLeaderElectionEnable/ > {noformat} > Error Message > java.lang.AssertionError: Unclean leader not elected > Stacktrace > java.lang.AssertionError: Unclean leader not elected > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > kafka.server.DynamicBrokerReconfigurationTest.testUncleanLeaderElectionEnable(DynamicBrokerReconfigurationTest.scala:510) > {noformat} > {noformat} > Standard Output > Completed Updating config for entity: brokers '0'. > Completed Updating config for entity: brokers '1'. > Completed Updating config for entity: brokers '2'. > [2019-04-23 01:17:11,690] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=1] Error for partition testtopic-6 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:11,690] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=1] Error for partition testtopic-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
> [2019-04-23 01:17:11,722] ERROR [ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Error for partition testtopic-7 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:11,722] ERROR [ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Error for partition testtopic-1 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:11,760] ERROR [ReplicaFetcher replicaId=2, leaderId=0, > fetcherId=1] Error for partition testtopic-6 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:11,761] ERROR [ReplicaFetcher replicaId=2, leaderId=0, > fetcherId=1] Error for partition testtopic-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:11,765] ERROR [ReplicaFetcher replicaId=2, leaderId=0, > fetcherId=0] Error for partition testtopic-3 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:11,765] ERROR [ReplicaFetcher replicaId=2, leaderId=0, > fetcherId=0] Error for partition testtopic-9 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
> [2019-04-23 01:17:13,779] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=1] Error for partition __consumer_offsets-8 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:13,780] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=1] Error for partition __consumer_offsets-38 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:13,780] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=1] Error for partition __consumer_offsets-2 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:13,780] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=1] Error for partition __consumer_offsets-14 at offset 0 > (kafka.server.ReplicaFetcherThread:76) >
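The "Unclean leader not elected" assertion above comes out of a wait-until-condition helper that polls a predicate until a deadline. A minimal sketch of that retry pattern, with illustrative names rather than the real Kafka TestUtils signature:

```java
import java.util.function.BooleanSupplier;

public class WaitUntil {
    // Polls `condition` every pollIntervalMs until it holds or timeoutMs elapses.
    // The flaky failure above corresponds to this returning false: the unclean
    // leader was never observed within the timeout.
    static boolean waitUntilTrue(BooleanSupplier condition, long timeoutMs, long pollIntervalMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean())
                return true;
            try {
                Thread.sleep(pollIntervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return condition.getAsBoolean(); // final check at the deadline
    }
}
```

In the real test, a false result from this kind of wait surfaces as the assertTrue failure in the stack trace.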
[jira] [Updated] (KAFKA-13736) Flaky kafka.network.SocketServerTest.closingChannelWithBufferedReceives
[ https://issues.apache.org/jira/browse/KAFKA-13736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-13736: --- Labels: flakey (was: ) > Flaky kafka.network.SocketServerTest.closingChannelWithBufferedReceives > --- > > Key: KAFKA-13736 > URL: https://issues.apache.org/jira/browse/KAFKA-13736 > Project: Kafka > Issue Type: Bug >Reporter: Guozhang Wang >Priority: Blocker > Labels: flakey > Fix For: 3.3.0 > > > Examples: > https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-11895/1/tests > {code} > java.lang.AssertionError: receiveRequest timed out > at > kafka.network.SocketServerTest.receiveRequest(SocketServerTest.scala:140) > at > kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$6(SocketServerTest.scala:1521) > at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) > at > kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$1(SocketServerTest.scala:1520) > at > kafka.network.SocketServerTest.verifyRemoteCloseWithBufferedReceives(SocketServerTest.scala:1483) > at > kafka.network.SocketServerTest.closingChannelWithBufferedReceives(SocketServerTest.scala:1431) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
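The "receiveRequest timed out" failure above is a timed poll on a request queue that came back empty. A toy model of that shape (illustrative names, not the actual SocketServerTest helper):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class ReceiveRequest {
    // Waits up to timeoutMs for a request to arrive on the queue; fails the
    // test (AssertionError, as in the stack trace) when nothing shows up.
    static String receiveRequest(BlockingQueue<String> requestQueue, long timeoutMs) {
        String request;
        try {
            request = requestQueue.poll(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new AssertionError("interrupted while waiting for request", e);
        }
        if (request == null)
            throw new AssertionError("receiveRequest timed out");
        return request;
    }
}
```

Flakiness in this pattern usually means the producer side (here, the channel with buffered receives) occasionally never enqueues the request within the timeout.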
[jira] [Commented] (KAFKA-13531) Flaky test NamedTopologyIntegrationTest
[ https://issues.apache.org/jira/browse/KAFKA-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554979#comment-17554979 ] Matthew de Detrich commented on KAFKA-13531: [~mjsax] I tried to replicate the flakiness of the test and I was unable to do so (I had my laptop running the test until failure overnight, and with over 20k runs I didn't get a single failure). I also noticed that the method/test is named slightly differently (i.e. currently it's called shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets) and has been modified since this ticket was created, so perhaps those changes removed the flakiness? > Flaky test NamedTopologyIntegrationTest > --- > > Key: KAFKA-13531 > URL: https://issues.apache.org/jira/browse/KAFKA-13531 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets > {quote}java.lang.AssertionError: Did not receive all 3 records from topic > output-stream-2 within 6 ms, currently accumulated data is [] Expected: > is a value equal to or greater than <3> but: <0> was less than <3> at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$1(IntegrationTestUtils.java:648) > at > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:368) > at > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:336) > at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:644) > at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:617) > 
at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:439){quote} > STDERR > {quote}java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.GroupSubscribedToTopicException: Deleting > offsets of a topic is forbidden while the consumer group is actively > subscribed to it. at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) > at > org.apache.kafka.streams.processor.internals.namedtopology.KafkaStreamsNamedTopologyWrapper.lambda$removeNamedTopology$3(KafkaStreamsNamedTopologyWrapper.java:213) > at > org.apache.kafka.common.internals.KafkaFutureImpl.lambda$whenComplete$2(KafkaFutureImpl.java:107) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) > at > org.apache.kafka.common.internals.KafkaCompletableFuture.kafkaComplete(KafkaCompletableFuture.java:39) > at > org.apache.kafka.common.internals.KafkaFutureImpl.complete(KafkaFutureImpl.java:122) > at > org.apache.kafka.streams.processor.internals.TopologyMetadata.maybeNotifyTopologyVersionWaiters(TopologyMetadata.java:154) > at > org.apache.kafka.streams.processor.internals.StreamThread.checkForTopologyUpdates(StreamThread.java:916) > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:598) > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:575) > Caused by: 
org.apache.kafka.common.errors.GroupSubscribedToTopicException: > Deleting offsets of a topic is forbidden while the consumer group is actively > subscribed to it. java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.GroupSubscribedToTopicException: Deleting > offsets of a topic is forbidden while the consumer group is actively > subscribed to it. at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) > at >
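The GroupSubscribedToTopicException in the STDERR above encodes a broker-side ordering rule: offsets for a topic may only be deleted once the consumer group no longer subscribes to it. A toy model of that invariant (not the broker implementation, names are illustrative):

```java
import java.util.HashSet;
import java.util.Set;

public class GroupOffsets {
    private final Set<String> subscribedTopics = new HashSet<>();
    private final Set<String> topicsWithOffsets = new HashSet<>();

    void subscribe(String topic) {
        subscribedTopics.add(topic);
        topicsWithOffsets.add(topic);
    }

    void unsubscribe(String topic) {
        subscribedTopics.remove(topic);
    }

    // Mirrors the rule behind GroupSubscribedToTopicException: the delete is
    // rejected while the group is still actively subscribed to the topic.
    void deleteOffsets(String topic) {
        if (subscribedTopics.contains(topic))
            throw new IllegalStateException(
                "Deleting offsets of a topic is forbidden while the consumer group is actively subscribed to it.");
        topicsWithOffsets.remove(topic);
    }
}
```

The flaky test hits this when removeNamedTopology issues the offset delete before the stream threads have fully unsubscribed.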
[jira] [Assigned] (KAFKA-13531) Flaky test NamedTopologyIntegrationTest
[ https://issues.apache.org/jira/browse/KAFKA-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-13531: -- Assignee: Matthew de Detrich > Flaky test NamedTopologyIntegrationTest > --- > > Key: KAFKA-13531 > URL: https://issues.apache.org/jira/browse/KAFKA-13531 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets > {quote}java.lang.AssertionError: Did not receive all 3 records from topic > output-stream-2 within 6 ms, currently accumulated data is [] Expected: > is a value equal to or greater than <3> but: <0> was less than <3> at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$1(IntegrationTestUtils.java:648) > at > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:368) > at > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:336) > at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:644) > at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:617) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:439){quote} > STDERR > {quote}java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.GroupSubscribedToTopicException: Deleting > offsets of a topic is forbidden while the consumer group is actively > subscribed to it. 
at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) > at > org.apache.kafka.streams.processor.internals.namedtopology.KafkaStreamsNamedTopologyWrapper.lambda$removeNamedTopology$3(KafkaStreamsNamedTopologyWrapper.java:213) > at > org.apache.kafka.common.internals.KafkaFutureImpl.lambda$whenComplete$2(KafkaFutureImpl.java:107) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) > at > org.apache.kafka.common.internals.KafkaCompletableFuture.kafkaComplete(KafkaCompletableFuture.java:39) > at > org.apache.kafka.common.internals.KafkaFutureImpl.complete(KafkaFutureImpl.java:122) > at > org.apache.kafka.streams.processor.internals.TopologyMetadata.maybeNotifyTopologyVersionWaiters(TopologyMetadata.java:154) > at > org.apache.kafka.streams.processor.internals.StreamThread.checkForTopologyUpdates(StreamThread.java:916) > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:598) > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:575) > Caused by: org.apache.kafka.common.errors.GroupSubscribedToTopicException: > Deleting offsets of a topic is forbidden while the consumer group is actively > subscribed to it. java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.GroupSubscribedToTopicException: Deleting > offsets of a topic is forbidden while the consumer group is actively > subscribed to it. 
at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) > at > org.apache.kafka.streams.processor.internals.namedtopology.KafkaStreamsNamedTopologyWrapper.lambda$removeNamedTopology$3(KafkaStreamsNamedTopologyWrapper.java:213) > at > org.apache.kafka.common.internals.KafkaFutureImpl.lambda$whenComplete$2(KafkaFutureImpl.java:107) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) > at >
[jira] [Commented] (KAFKA-13531) Flaky test NamedTopologyIntegrationTest
[ https://issues.apache.org/jira/browse/KAFKA-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554010#comment-17554010 ] Matthew de Detrich commented on KAFKA-13531: I will have a look into this. > Flaky test NamedTopologyIntegrationTest > --- > > Key: KAFKA-13531 > URL: https://issues.apache.org/jira/browse/KAFKA-13531 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets > {quote}java.lang.AssertionError: Did not receive all 3 records from topic > output-stream-2 within 6 ms, currently accumulated data is [] Expected: > is a value equal to or greater than <3> but: <0> was less than <3> at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$1(IntegrationTestUtils.java:648) > at > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:368) > at > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:336) > at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:644) > at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:617) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:439){quote} > STDERR > {quote}java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.GroupSubscribedToTopicException: Deleting > offsets of a topic is forbidden while the consumer group is actively > subscribed to it. 
at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) > at > org.apache.kafka.streams.processor.internals.namedtopology.KafkaStreamsNamedTopologyWrapper.lambda$removeNamedTopology$3(KafkaStreamsNamedTopologyWrapper.java:213) > at > org.apache.kafka.common.internals.KafkaFutureImpl.lambda$whenComplete$2(KafkaFutureImpl.java:107) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) > at > org.apache.kafka.common.internals.KafkaCompletableFuture.kafkaComplete(KafkaCompletableFuture.java:39) > at > org.apache.kafka.common.internals.KafkaFutureImpl.complete(KafkaFutureImpl.java:122) > at > org.apache.kafka.streams.processor.internals.TopologyMetadata.maybeNotifyTopologyVersionWaiters(TopologyMetadata.java:154) > at > org.apache.kafka.streams.processor.internals.StreamThread.checkForTopologyUpdates(StreamThread.java:916) > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:598) > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:575) > Caused by: org.apache.kafka.common.errors.GroupSubscribedToTopicException: > Deleting offsets of a topic is forbidden while the consumer group is actively > subscribed to it. java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.GroupSubscribedToTopicException: Deleting > offsets of a topic is forbidden while the consumer group is actively > subscribed to it. 
at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) > at > org.apache.kafka.streams.processor.internals.namedtopology.KafkaStreamsNamedTopologyWrapper.lambda$removeNamedTopology$3(KafkaStreamsNamedTopologyWrapper.java:213) > at > org.apache.kafka.common.internals.KafkaFutureImpl.lambda$whenComplete$2(KafkaFutureImpl.java:107) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) > at >
[jira] [Commented] (KAFKA-13957) Flaky Test StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores
[ https://issues.apache.org/jira/browse/KAFKA-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553810#comment-17553810 ] Matthew de Detrich commented on KAFKA-13957: I have fixed the test at https://github.com/apache/kafka/pull/12289 > Flaky Test StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores > - > > Key: KAFKA-13957 > URL: https://issues.apache.org/jira/browse/KAFKA-13957 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: A. Sophie Blee-Goldman >Assignee: Matthew de Detrich >Priority: Major > Labels: flaky-test > Attachments: > StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores.rtf > > > Failed on a local build so I have the full logs (attached) > {code:java} > java.lang.AssertionError: Unexpected exception thrown while getting the value > from store. > Expected: is (a string containing "Cannot get state store source-table > because the stream thread is PARTITIONS_ASSIGNED, not RUNNING" or a string > containing "The state store, source-table, may have migrated to another > instance" or a string containing "Cannot get state store source-table because > the stream thread is STARTING, not RUNNING") > but: was "The specified partition 1 for store source-table does not > exist." 
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.verifyRetrievableException(StoreQueryIntegrationTest.java:539) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.lambda$shouldQuerySpecificActivePartitionStores$5(StoreQueryIntegrationTest.java:241) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.until(StoreQueryIntegrationTest.java:557) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores(StoreQueryIntegrationTest.java:183) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
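The assertion above accepts any of several "transient" rebalance messages via a Hamcrest anyOf(containsString(...)) matcher; the flake is an unexpected message that matches none of them. The check reduces to plain substring matching, sketched here with the literal messages from the failure:

```java
import java.util.List;

public class ExpectedMessages {
    // The messages the test treats as acceptable transient states.
    static final List<String> TRANSIENT = List.of(
        "Cannot get state store source-table because the stream thread is PARTITIONS_ASSIGNED, not RUNNING",
        "The state store, source-table, may have migrated to another instance",
        "Cannot get state store source-table because the stream thread is STARTING, not RUNNING");

    // True when the actual exception message contains any expected substring,
    // i.e. the anyOf(containsString(...)) shape of the Hamcrest assertion.
    static boolean isExpected(String message) {
        return TRANSIENT.stream().anyMatch(message::contains);
    }
}
```

The observed failure message, "The specified partition 1 for store source-table does not exist.", matches none of these, which is exactly why the assertion failed.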
[jira] [Commented] (KAFKA-13957) Flaky Test StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores
[ https://issues.apache.org/jira/browse/KAFKA-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553562#comment-17553562 ] Matthew de Detrich commented on KAFKA-13957: I am working on this issue, I managed to replicate the exact same exception. > Flaky Test StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores > - > > Key: KAFKA-13957 > URL: https://issues.apache.org/jira/browse/KAFKA-13957 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: A. Sophie Blee-Goldman >Assignee: Matthew de Detrich >Priority: Major > Labels: flaky-test > Attachments: > StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores.rtf > > > Failed on a local build so I have the full logs (attached) > {code:java} > java.lang.AssertionError: Unexpected exception thrown while getting the value > from store. > Expected: is (a string containing "Cannot get state store source-table > because the stream thread is PARTITIONS_ASSIGNED, not RUNNING" or a string > containing "The state store, source-table, may have migrated to another > instance" or a string containing "Cannot get state store source-table because > the stream thread is STARTING, not RUNNING") > but: was "The specified partition 1 for store source-table does not > exist." 
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.verifyRetrievableException(StoreQueryIntegrationTest.java:539) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.lambda$shouldQuerySpecificActivePartitionStores$5(StoreQueryIntegrationTest.java:241) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.until(StoreQueryIntegrationTest.java:557) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores(StoreQueryIntegrationTest.java:183) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (KAFKA-13957) Flaky Test StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores
[ https://issues.apache.org/jira/browse/KAFKA-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-13957: -- Assignee: Matthew de Detrich > Flaky Test StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores > - > > Key: KAFKA-13957 > URL: https://issues.apache.org/jira/browse/KAFKA-13957 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: A. Sophie Blee-Goldman >Assignee: Matthew de Detrich >Priority: Major > Labels: flaky-test > Attachments: > StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores.rtf > > > Failed on a local build so I have the full logs (attached) > {code:java} > java.lang.AssertionError: Unexpected exception thrown while getting the value > from store. > Expected: is (a string containing "Cannot get state store source-table > because the stream thread is PARTITIONS_ASSIGNED, not RUNNING" or a string > containing "The state store, source-table, may have migrated to another > instance" or a string containing "Cannot get state store source-table because > the stream thread is STARTING, not RUNNING") > but: was "The specified partition 1 for store source-table does not > exist." 
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.verifyRetrievableException(StoreQueryIntegrationTest.java:539) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.lambda$shouldQuerySpecificActivePartitionStores$5(StoreQueryIntegrationTest.java:241) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.until(StoreQueryIntegrationTest.java:557) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores(StoreQueryIntegrationTest.java:183) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (KAFKA-13980) Update to Scala 2.12.16
Matthew de Detrich created KAFKA-13980: -- Summary: Update to Scala 2.12.16 Key: KAFKA-13980 URL: https://issues.apache.org/jira/browse/KAFKA-13980 Project: Kafka Issue Type: Improvement Components: core Reporter: Matthew de Detrich Assignee: Matthew de Detrich -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Comment Edited] (KAFKA-8420) Graceful handling when consumer switches from subscribe to manual assign
[ https://issues.apache.org/jira/browse/KAFKA-8420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17551082#comment-17551082 ] Matthew de Detrich edited comment on KAFKA-8420 at 6/7/22 1:20 PM: --- So in order to work on this issue I tried making a test to replicate what you are describing, and I came across something interesting; the test that I wrote looks like this
{code:java}
@Test
public void gracefulHandlingSwitchSubscribeToManualAssign() {
    ConsumerMetadata metadata = createMetadata(subscription);
    MockClient client = new MockClient(time, metadata);
    initMetadata(client, Collections.singletonMap(topic, 1));
    Node node = metadata.fetch().nodes().get(0);
    KafkaConsumer consumer = newConsumer(time, client, subscription, metadata, assignor, true, groupInstanceId);
    consumer.subscribe(singleton(topic), getConsumerRebalanceListener(consumer));
    prepareRebalance(client, node, singleton(topic), assignor, singletonList(tp0), null);
    ConsumerRecords initialConsumerRecords = consumer.poll(Duration.ofMillis(0));
    assertTrue(initialConsumerRecords.isEmpty());
    consumer.unsubscribe();
    consumer.assign(singleton(tp0));
    client.prepareResponseFrom(syncGroupResponse(singletonList(tp0), Errors.NONE), coordinator);
    consumer.poll(Duration.ofSeconds(1));
}
{code}
The problem that I am currently getting is that the {{consumer.poll(Duration.ofSeconds(1));}} is causing an infinite loop/deadlock (note that originally I had a {{consumer.poll(Duration.ofSeconds(0));}}, however this caused the {{consumer.poll}} method to short circuit due to {{timer.notExpired()}} never executing, and hence it just immediately returned a {{ConsumerRecords.empty();}} without the consumer ever sending a request to trigger a sync-group response). After spending some time debugging, this is the piece of code that is not terminating: [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L250-L251]. 
What I am finding highly confusing is the fact that {{lookupCoordinator()}} does actually complete (in this case it immediately returns {{findCoordinatorFuture}} at [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L294]), yet for some reason the loop at [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java#L215] never terminates. It doesn't appear to detect that the future has finished, which I believe to be the case. I am not sure if this is related to what you mentioned, i.e. {quote}In the worst case (i.e. the leader keeps sending an incompatible assignment), this would cause the consumer to fall into endless re-joins.{quote} but it looks like I have either found something else or I am barking up the wrong tree. Do you have any insights into this, [~guozhang]?
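The loop the comment points at waits for a coordinator future while polling the network. A minimal stdlib model of that shape (illustrative, not the actual ConsumerNetworkClient code) shows why the loop can only terminate if it re-checks the future's completion on every iteration:

```java
import java.util.concurrent.CompletableFuture;

public class PollLoop {
    // Polls until the future completes or the deadline passes; returns whether
    // the future was observed as done. If the isDone() check were skipped (or
    // completion were never signalled to the poller), the loop would spin for
    // the full timeout even for an already-completed future.
    static boolean pollUntilDone(CompletableFuture<?> future, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (System.nanoTime() < deadline) {
            if (future.isDone())
                return true;          // the progress condition the loop must observe
            Thread.onSpinWait();      // stand-in for one network poll iteration
        }
        return future.isDone();
    }
}
```

Under this model, a completed future exits immediately, while one that never completes burns the entire timeout, matching the apparent deadlock described in the comment.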
[jira] [Commented] (KAFKA-8420) Graceful handling when consumer switches from subscribe to manual assign
[ https://issues.apache.org/jira/browse/KAFKA-8420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17551082#comment-17551082 ] Matthew de Detrich commented on KAFKA-8420: --- So in order to work on this issue I tried writing a test to replicate what you are describing, and I came across something interesting; the test I wrote looks like this
{code:java}
@Test
public void gracefulHandlingSwitchSubscribeToManualAssign() {
    ConsumerMetadata metadata = createMetadata(subscription);
    MockClient client = new MockClient(time, metadata);
    initMetadata(client, Collections.singletonMap(topic, 1));
    Node node = metadata.fetch().nodes().get(0);
    KafkaConsumer consumer = newConsumer(time, client, subscription, metadata, assignor, true, groupInstanceId);
    consumer.subscribe(singleton(topic), getConsumerRebalanceListener(consumer));
    prepareRebalance(client, node, singleton(topic), assignor, singletonList(tp0), null);
    ConsumerRecords initialConsumerRecords = consumer.poll(Duration.ofMillis(0));
    assertTrue(initialConsumerRecords.isEmpty());
    consumer.unsubscribe();
    consumer.assign(singleton(tp0));
    client.prepareResponseFrom(syncGroupResponse(singletonList(tp0), Errors.NONE), coordinator);
    consumer.poll(Duration.ofSeconds(1));
}
{code}
The problem I am currently getting is that the {{consumer.poll(Duration.ofSeconds(1));}} call is causing an infinite loop/deadlock (note that originally I had {{consumer.poll(Duration.ofSeconds(0));}}, however this caused the {{consumer.poll}} method to short-circuit due to {{timer.notExpired()}} never being satisfied, hence it immediately returned {{ConsumerRecords.empty()}} without the consumer ever sending a request that would trigger a sync-group response). After spending some time debugging, this is the piece of code that is not terminating [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L250-L251]. 
What I am finding highly confusing is the fact that {{lookupCoordinator()}} does actually complete (in this case it immediately returns {{findCoordinatorFuture}} at [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L294]), however for some reason the loop at [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java#L215] never terminates. It doesn't appear to detect that the future has finished, even though I believe it has. I am not sure if this is related to what you mentioned, i.e. {quote}In the worst case (i.e. leader keeps sending incompatible assignment), this would cause the consumer to fall into endless re-joins.{quote} but it looks like I have either found something else or I am barking up the wrong tree. > Graceful handling when consumer switches from subscribe to manual assign > > > Key: KAFKA-8420 > URL: https://issues.apache.org/jira/browse/KAFKA-8420 > Project: Kafka > Issue Type: Improvement > Components: consumer >Reporter: Guozhang Wang >Assignee: Matthew de Detrich >Priority: Major > > Today if a consumer switches between subscribe (and hence relies on group > rebalance to get assignment) and manual assign, it may cause unnecessary > rebalances. For example: > 1. consumer.subscribe(); > 2. consumer.poll(); // join-group request sent, returns empty because > poll timeout > 3. consumer.unsubscribe(); > 4. consumer.assign(..); > 5. consumer.poll(); // sync-group request received, and the assigned > partitions do not match the current subscription-state. In this case it > will try to re-join, which is not necessary. > In the worst case (i.e. leader keeps sending incompatible assignment), this > would cause the consumer to fall into endless re-joins. > Although it is not a very common usage scenario, it is still worth being better > handled than the status-quo. 
-- This message was sent by Atlassian Jira (v8.20.7#820007)
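To reason about the hang described above in isolation, the following is a minimal, self-contained model of a "poll until the future is done or the timer expires" loop. All names here are hypothetical stand-ins, not the actual {{ConsumerNetworkClient}} internals; the point is that the loop's only exits are the completion check and the deadline, so if completion is never observed the call can only return once the timer runs out, and with an effectively unbounded timer it never returns.

```java
import java.util.concurrent.CompletableFuture;

public class PollLoopSketch {
    // Simplified model of polling until a future completes or a deadline passes.
    // Returns true if the future was observed as done before the deadline.
    static boolean pollUntilDone(CompletableFuture<Void> future, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        // The completion check must be re-evaluated every iteration; if the
        // future completes but isDone() is never consulted again, this loop
        // can only exit via the deadline -- or never, with an unbounded timer.
        while (!future.isDone() && System.currentTimeMillis() < deadline) {
            Thread.yield(); // stand-in for the network client doing I/O
        }
        return future.isDone();
    }

    public static void main(String[] args) {
        // An already-completed future is detected immediately.
        System.out.println(pollUntilDone(CompletableFuture.completedFuture(null), 1000));
        // A future that never completes forces the loop to run out the clock.
        System.out.println(pollUntilDone(new CompletableFuture<>(), 100));
    }
}
```

This also illustrates why a zero-duration poll short-circuits: with {{timeoutMs = 0}} the deadline condition fails on the first check, so the loop body never runs at all.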
[jira] [Commented] (KAFKA-8420) Graceful handling when consumer switches from subscribe to manual assign
[ https://issues.apache.org/jira/browse/KAFKA-8420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17545423#comment-17545423 ] Matthew de Detrich commented on KAFKA-8420: --- I am currently working on this issue; right now I am trying to reproduce it with a test, and then I will implement the fix. > Graceful handling when consumer switches from subscribe to manual assign > > > Key: KAFKA-8420 > URL: https://issues.apache.org/jira/browse/KAFKA-8420 > Project: Kafka > Issue Type: Improvement > Components: consumer >Reporter: Guozhang Wang >Assignee: Matthew de Detrich >Priority: Major > > Today if a consumer switches between subscribe (and hence relies on group > rebalance to get assignment) and manual assign, it may cause unnecessary > rebalances. For example: > 1. consumer.subscribe(); > 2. consumer.poll(); // join-group request sent, returns empty because > poll timeout > 3. consumer.unsubscribe(); > 4. consumer.assign(..); > 5. consumer.poll(); // sync-group request received, and the assigned > partitions do not match the current subscription-state. In this case it > will try to re-join, which is not necessary. > In the worst case (i.e. leader keeps sending incompatible assignment), this > would cause the consumer to fall into endless re-joins. > Although it is not a very common usage scenario, it is still worth being better > handled than the status-quo. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (KAFKA-8420) Graceful handling when consumer switches from subscribe to manual assign
[ https://issues.apache.org/jira/browse/KAFKA-8420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-8420: - Assignee: Matthew de Detrich > Graceful handling when consumer switches from subscribe to manual assign > > > Key: KAFKA-8420 > URL: https://issues.apache.org/jira/browse/KAFKA-8420 > Project: Kafka > Issue Type: Improvement > Components: consumer >Reporter: Guozhang Wang >Assignee: Matthew de Detrich >Priority: Major > > Today if a consumer switches between subscribe (and hence relies on group > rebalance to get assignment) and manual assign, it may cause unnecessary > rebalances. For example: > 1. consumer.subscribe(); > 2. consumer.poll(); // join-group request sent, returns empty because > poll timeout > 3. consumer.unsubscribe(); > 4. consumer.assign(..); > 5. consumer.poll(); // sync-group request received, and the assigned > partitions do not match the current subscription-state. In this case it > will try to re-join, which is not necessary. > In the worst case (i.e. leader keeps sending incompatible assignment), this > would cause the consumer to fall into endless re-joins. > Although it is not a very common usage scenario, it is still worth being better > handled than the status-quo. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (KAFKA-13299) Accept listeners that have the same port but use IPv4 vs IPv6
[ https://issues.apache.org/jira/browse/KAFKA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441164#comment-17441164 ] Matthew de Detrich commented on KAFKA-13299: PR is open at https://github.com/apache/kafka/pull/11478 > Accept listeners that have the same port but use IPv4 vs IPv6 > - > > Key: KAFKA-13299 > URL: https://issues.apache.org/jira/browse/KAFKA-13299 > Project: Kafka > Issue Type: Improvement >Reporter: Matthew de Detrich >Priority: Major > > Currently we are going through a process where we want to migrate Kafka > brokers from IPv4 to IPv6. The simplest way for us to do this would be to > allow Kafka to have two listeners on the same port, where one listener has an > IPv4 address allocated and the other listener has an IPv6 address allocated. > Currently this is not possible in Kafka because it validates that all of the > listeners have a unique port. With some rudimentary testing, if this > validation is removed (so we are able to have two listeners on the same port > but with different IP versions) there don't seem to be any immediate > problems; the Kafka cluster works fine. > Is there some fundamental reason behind this limitation of having unique > ports? Consequently, would there be any problems in loosening this limitation > (i.e. duplicate ports are allowed if the IP versions are different) or just > altogether removing the restriction? -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (KAFKA-13299) Accept listeners that have the same port but use IPv4 vs IPv6
[ https://issues.apache.org/jira/browse/KAFKA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441105#comment-17441105 ] Matthew de Detrich commented on KAFKA-13299: [~showuon] Thanks for the reply. I am working on this right now and will send an example PR today. > Accept listeners that have the same port but use IPv4 vs IPv6 > - > > Key: KAFKA-13299 > URL: https://issues.apache.org/jira/browse/KAFKA-13299 > Project: Kafka > Issue Type: Improvement >Reporter: Matthew de Detrich >Priority: Major > > Currently we are going through a process where we want to migrate Kafka > brokers from IPv4 to IPv6. The simplest way for us to do this would be to > allow Kafka to have two listeners on the same port, where one listener has an > IPv4 address allocated and the other listener has an IPv6 address allocated. > Currently this is not possible in Kafka because it validates that all of the > listeners have a unique port. With some rudimentary testing, if this > validation is removed (so we are able to have two listeners on the same port > but with different IP versions) there don't seem to be any immediate > problems; the Kafka cluster works fine. > Is there some fundamental reason behind this limitation of having unique > ports? Consequently, would there be any problems in loosening this limitation > (i.e. duplicate ports are allowed if the IP versions are different) or just > altogether removing the restriction? -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (KAFKA-13299) Accept listeners that have the same port but use IPv4 vs IPv6
Matthew de Detrich created KAFKA-13299: -- Summary: Accept listeners that have the same port but use IPv4 vs IPv6 Key: KAFKA-13299 URL: https://issues.apache.org/jira/browse/KAFKA-13299 Project: Kafka Issue Type: Improvement Reporter: Matthew de Detrich Currently we are going through a process where we want to migrate Kafka brokers from IPv4 to IPv6. The simplest way for us to do this would be to allow Kafka to have two listeners on the same port, where one listener has an IPv4 address allocated and the other listener has an IPv6 address allocated. Currently this is not possible in Kafka because it validates that all of the listeners have a unique port. With some rudimentary testing, if this validation is removed (so we are able to have two listeners on the same port but with different IP versions) there don't seem to be any immediate problems; the Kafka cluster works fine. Is there some fundamental reason behind this limitation of having unique ports? Consequently, would there be any problems in loosening this limitation (i.e. duplicate ports are allowed if the IP versions are different) or just altogether removing the restriction? -- This message was sent by Atlassian Jira (v8.3.4#803005)
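If the unique-port validation were loosened, the intent could be written as the following {{server.properties}} sketch. This is hypothetical (today Kafka rejects it precisely because both listeners use port 9092), and the listener names and addresses are illustrative; listener names must still be unique, which is why the second name is mapped to the PLAINTEXT protocol via {{listener.security.protocol.map}}:

```properties
# Hypothetical: two listeners sharing port 9092, one bound to an IPv4
# address and one to an IPv6 address. Current validation rejects this.
listeners=PLAINTEXT://192.0.2.10:9092,PLAINTEXT_V6://[2001:db8::10]:9092
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,PLAINTEXT_V6:PLAINTEXT
advertised.listeners=PLAINTEXT://192.0.2.10:9092,PLAINTEXT_V6://[2001:db8::10]:9092
```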
[jira] [Created] (KAFKA-12913) Make Scala Case class's final
Matthew de Detrich created KAFKA-12913: -- Summary: Make Scala Case class's final Key: KAFKA-12913 URL: https://issues.apache.org/jira/browse/KAFKA-12913 Project: Kafka Issue Type: Improvement Components: core Reporter: Matthew de Detrich Assignee: Matthew de Detrich It's considered best practice to make case classes final, since Scala code that uses a case class relies on equals/hashCode/unapply functioning correctly (which breaks if users can override this behaviour) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-12430) emit.heartbeats.enabled = false should disable heartbeats topic creation
[ https://issues.apache.org/jira/browse/KAFKA-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17351617#comment-17351617 ] Matthew de Detrich commented on KAFKA-12430: I think I might just create a second property to disable topic creation, and document that it will break upstream clusters. > emit.heartbeats.enabled = false should disable heartbeats topic creation > > > Key: KAFKA-12430 > URL: https://issues.apache.org/jira/browse/KAFKA-12430 > Project: Kafka > Issue Type: Bug > Components: mirrormaker >Reporter: Ivan Yurchenko >Assignee: Matthew de Detrich >Priority: Minor > > Currently, whether MirrorMaker 2's {{MirrorHeartbeatConnector}} emits > heartbeats or not is based on the {{emit.heartbeats.enabled}} setting. However, > the {{heartbeats}} topic is created unconditionally. It seems that the same > setting should really disable the topic creation as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-12430) emit.heartbeats.enabled = false should disable heartbeats topic creation
[ https://issues.apache.org/jira/browse/KAFKA-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-12430: -- Assignee: Matthew de Detrich > emit.heartbeats.enabled = false should disable heartbeats topic creation > > > Key: KAFKA-12430 > URL: https://issues.apache.org/jira/browse/KAFKA-12430 > Project: Kafka > Issue Type: Bug > Components: mirrormaker >Reporter: Ivan Yurchenko >Assignee: Matthew de Detrich >Priority: Minor > > Currently, whether MirrorMaker 2's {{MirrorHeartbeatConnector}} emits > heartbeats or not is based on the {{emit.heartbeats.enabled}} setting. However, > the {{heartbeats}} topic is created unconditionally. It seems that the same > setting should really disable the topic creation as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-12430) emit.heartbeats.enabled = false should disable heartbeats topic creation
[ https://issues.apache.org/jira/browse/KAFKA-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17351093#comment-17351093 ] Matthew de Detrich commented on KAFKA-12430: [~ryannedolan] I am going to look into this; do you have any comments/context to add on this topic (i.e. is there some deliberate reason why, if emit.heartbeats.enabled is false, the heartbeat topics are still created)? > emit.heartbeats.enabled = false should disable heartbeats topic creation > > > Key: KAFKA-12430 > URL: https://issues.apache.org/jira/browse/KAFKA-12430 > Project: Kafka > Issue Type: Bug > Components: mirrormaker >Reporter: Ivan Yurchenko >Priority: Minor > > Currently, whether MirrorMaker 2's {{MirrorHeartbeatConnector}} emits > heartbeats or not is based on the {{emit.heartbeats.enabled}} setting. However, > the {{heartbeats}} topic is created unconditionally. It seems that the same > setting should really disable the topic creation as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
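For reference, the flag under discussion is configured per replication flow in the MM2 properties file. A minimal sketch (the cluster aliases are illustrative):

```properties
# MirrorMaker 2 fragment: disable heartbeat emission for the A->B flow.
# As described in the issue above, the heartbeats topic is still created
# unconditionally even with this set to false; tying topic creation to this
# flag (or to a separate, hypothetical new property) is what is proposed.
clusters = A, B
A->B.enabled = true
A->B.emit.heartbeats.enabled = false
```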
[jira] [Created] (KAFKA-12819) Quality of life improvements for tests
Matthew de Detrich created KAFKA-12819: -- Summary: Quality of life improvements for tests Key: KAFKA-12819 URL: https://issues.apache.org/jira/browse/KAFKA-12819 Project: Kafka Issue Type: Improvement Reporter: Matthew de Detrich Assignee: Matthew de Detrich Minor improvements to various tests, such as using assertObject instead of assertEquals (when comparing objects), filling in missing messages in asserts, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-9726) LegacyReplicationPolicy for MM2 to mimic MM1
[ https://issues.apache.org/jira/browse/KAFKA-9726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-9726: - Assignee: Matthew de Detrich (was: Ivan Yurchenko) > LegacyReplicationPolicy for MM2 to mimic MM1 > > > Key: KAFKA-9726 > URL: https://issues.apache.org/jira/browse/KAFKA-9726 > Project: Kafka > Issue Type: Improvement > Components: mirrormaker >Reporter: Ryanne Dolan >Assignee: Matthew de Detrich >Priority: Minor > > Per KIP-382, we should support MM2 in "legacy mode", i.e. with behavior > similar to MM1. A key requirement for this is a ReplicationPolicy that does > not rename topics. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9726) LegacyReplicationPolicy for MM2 to mimic MM1
[ https://issues.apache.org/jira/browse/KAFKA-9726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346969#comment-17346969 ] Matthew de Detrich commented on KAFKA-9726: --- Would it be possible to assign the ticket to me? I am taking over the task from [~ivanyu] (I have already created a PR at [https://github.com/apache/kafka/pull/10648]). [~ivanyu] tried changing the assignee to me but it didn't work. > LegacyReplicationPolicy for MM2 to mimic MM1 > > > Key: KAFKA-9726 > URL: https://issues.apache.org/jira/browse/KAFKA-9726 > Project: Kafka > Issue Type: Improvement > Components: mirrormaker >Reporter: Ryanne Dolan >Assignee: Ivan Yurchenko >Priority: Minor > > Per KIP-382, we should support MM2 in "legacy mode", i.e. with behavior > similar to MM1. A key requirement for this is a ReplicationPolicy that does > not rename topics. -- This message was sent by Atlassian Jira (v8.3.4#803005)
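The "ReplicationPolicy that does not rename topics" can be illustrated with a self-contained sketch. The interface below is a cut-down stand-in for MM2's real {{org.apache.kafka.connect.mirror.ReplicationPolicy}} (the real interface has more methods), so treat the exact shape as an assumption; the point is the contrast between the default alias-prefixing behaviour and the legacy/identity behaviour requested here.

```java
// Cut-down stand-in for MM2's ReplicationPolicy (illustrative, not the real API).
interface ReplicationPolicy {
    String formatRemoteTopic(String sourceClusterAlias, String topic);
    String topicSource(String topic); // source cluster of a replicated topic, if recoverable
}

// Default MM2 behaviour: prefix replicated topics with the source cluster alias.
class DefaultPolicy implements ReplicationPolicy {
    public String formatRemoteTopic(String sourceClusterAlias, String topic) {
        return sourceClusterAlias + "." + topic;
    }
    public String topicSource(String topic) {
        int dot = topic.indexOf('.');
        return dot < 0 ? null : topic.substring(0, dot);
    }
}

// Legacy (MM1-like) behaviour: replicate without renaming.
class IdentityPolicy implements ReplicationPolicy {
    public String formatRemoteTopic(String sourceClusterAlias, String topic) {
        return topic; // no prefix
    }
    public String topicSource(String topic) {
        return null; // the source cluster can no longer be derived from the name
    }
}

public class ReplicationPolicySketch {
    public static void main(String[] args) {
        System.out.println(new DefaultPolicy().formatRemoteTopic("us-east", "orders"));
        System.out.println(new IdentityPolicy().formatRemoteTopic("us-east", "orders"));
    }
}
```

The trade-off this makes visible: with identity naming, the origin of a replicated topic is no longer encoded in its name, which is part of why the default policy renames in the first place.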