[jira] [Commented] (KAFKA-14733) Update AclAuthorizerTest to run tests for both zk and kraft mode
[ https://issues.apache.org/jira/browse/KAFKA-14733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838647#comment-17838647 ] Matthew de Detrich commented on KAFKA-14733: [~emissionnebula] Have you started working on this or can I look into it? > Update AclAuthorizerTest to run tests for both zk and kraft mode > > > Key: KAFKA-14733 > URL: https://issues.apache.org/jira/browse/KAFKA-14733 > Project: Kafka > Issue Type: Improvement >Reporter: Purshotam Chauhan >Assignee: Purshotam Chauhan >Priority: Minor > > Currently, we have two test classes AclAuthorizerTest and > StandardAuthorizerTest that are used exclusively for zk and kraft mode. > But AclAuthorizerTest has a lot of tests covering various scenarios. We > should change AclAuthorizerTest to run for both zk and kraft modes so as to > keep parity between both modes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
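The parity idea above can be sketched minimally. This is not Kafka's actual test harness (the real tests parameterize over a "quorum" argument with JUnit's @ParameterizedTest); the class and method names here are hypothetical stand-ins that just show one test body being executed under both cluster modes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: run the same test body under both "zk" and "kraft"
// modes so the two stay in parity. Kafka itself achieves this with JUnit's
// @ParameterizedTest over a quorum argument; this stand-in just loops.
class DualModeTest {
    static List<String> runForAllModes(Consumer<String> testBody) {
        List<String> ran = new ArrayList<>();
        for (String quorum : List.of("zk", "kraft")) {
            testBody.accept(quorum); // same assertions executed in both modes
            ran.add(quorum);
        }
        return ran;
    }
}
```
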
[jira] [Commented] (KAFKA-15744) KRaft support in CustomQuotaCallbackTest
[ https://issues.apache.org/jira/browse/KAFKA-15744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838644#comment-17838644 ] Matthew de Detrich commented on KAFKA-15744: [~high.lee] Are you still working on this issue, or is it okay if I take over? > KRaft support in CustomQuotaCallbackTest > > > Key: KAFKA-15744 > URL: https://issues.apache.org/jira/browse/KAFKA-15744 > Project: Kafka > Issue Type: Task > Components: core >Reporter: Sameer Tejani >Assignee: highluck >Priority: Minor > Labels: kraft, kraft-test, newbie > > The following tests in CustomQuotaCallbackTest in > core/src/test/scala/integration/kafka/api/CustomQuotaCallbackTest.scala need > to be updated to support KRaft > 90 : def testCustomQuotaCallback(): Unit = { > Scanned 468 lines. Found 0 KRaft tests out of 1 tests -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-15724) KRaft support in OffsetFetchRequestTest
[ https://issues.apache.org/jira/browse/KAFKA-15724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838643#comment-17838643 ] Matthew de Detrich commented on KAFKA-15724: [~shivsundar] Is it okay if I take over the ticket, or are you still working on it? > KRaft support in OffsetFetchRequestTest > --- > > Key: KAFKA-15724 > URL: https://issues.apache.org/jira/browse/KAFKA-15724 > Project: Kafka > Issue Type: Task > Components: core >Reporter: Sameer Tejani >Assignee: Shivsundar R >Priority: Minor > Labels: kraft, kraft-test, newbie > > The following tests in OffsetFetchRequestTest in > core/src/test/scala/unit/kafka/server/OffsetFetchRequestTest.scala need to be > updated to support KRaft > 83 : def testOffsetFetchRequestSingleGroup(): Unit = { > 130 : def testOffsetFetchRequestAllOffsetsSingleGroup(): Unit = { > 180 : def testOffsetFetchRequestWithMultipleGroups(): Unit = { > Scanned 246 lines. Found 0 KRaft tests out of 3 tests -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-15709) KRaft support in ServerStartupTest
[ https://issues.apache.org/jira/browse/KAFKA-15709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838640#comment-17838640 ] Matthew de Detrich commented on KAFKA-15709: [~linzihao1999] Are you still working on this, or can I take over? > KRaft support in ServerStartupTest > -- > > Key: KAFKA-15709 > URL: https://issues.apache.org/jira/browse/KAFKA-15709 > Project: Kafka > Issue Type: Task > Components: core >Reporter: Sameer Tejani >Assignee: Zihao Lin >Priority: Minor > Labels: kraft, kraft-test, newbie > > The following tests in ServerStartupTest in > core/src/test/scala/unit/kafka/server/ServerStartupTest.scala need to be > updated to support KRaft > 38 : def testBrokerCreatesZKChroot(): Unit = { > 51 : def testConflictBrokerStartupWithSamePort(): Unit = { > 65 : def testConflictBrokerRegistration(): Unit = { > 82 : def testBrokerSelfAware(): Unit = { > 93 : def testBrokerStateRunningAfterZK(): Unit = { > Scanned 107 lines. Found 0 KRaft tests out of 5 tests -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-15737) KRaft support in ConsumerBounceTest
[ https://issues.apache.org/jira/browse/KAFKA-15737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838639#comment-17838639 ] Matthew de Detrich edited comment on KAFKA-15737 at 4/18/24 12:39 PM: -- [~k-raina] Are you working on this, or can I have a look into it? was (Author: JIRAUSER297606): I'll have a look at this > KRaft support in ConsumerBounceTest > --- > > Key: KAFKA-15737 > URL: https://issues.apache.org/jira/browse/KAFKA-15737 > Project: Kafka > Issue Type: Task > Components: core >Reporter: Sameer Tejani >Assignee: Kaushik Raina >Priority: Minor > Labels: kraft, kraft-test, newbie > > The following tests in ConsumerBounceTest in > core/src/test/scala/integration/kafka/api/ConsumerBounceTest.scala need to be > updated to support KRaft > 81 : def testConsumptionWithBrokerFailures(): Unit = > consumeWithBrokerFailures(10) > 122 : def testSeekAndCommitWithBrokerFailures(): Unit = > seekAndCommitWithBrokerFailures(5) > 161 : def testSubscribeWhenTopicUnavailable(): Unit = { > 212 : def testClose(): Unit = { > 297 : def > testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup(): > Unit = { > 337 : def testConsumerReceivesFatalExceptionWhenGroupPassesMaxSize(): Unit = { > 370 : def testCloseDuringRebalance(): Unit = { > Scanned 535 lines. Found 0 KRaft tests out of 7 tests -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-15737) KRaft support in ConsumerBounceTest
[ https://issues.apache.org/jira/browse/KAFKA-15737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838639#comment-17838639 ] Matthew de Detrich commented on KAFKA-15737: I'll have a look at this > KRaft support in ConsumerBounceTest > --- > > Key: KAFKA-15737 > URL: https://issues.apache.org/jira/browse/KAFKA-15737 > Project: Kafka > Issue Type: Task > Components: core >Reporter: Sameer Tejani >Assignee: Kaushik Raina >Priority: Minor > Labels: kraft, kraft-test, newbie > > The following tests in ConsumerBounceTest in > core/src/test/scala/integration/kafka/api/ConsumerBounceTest.scala need to be > updated to support KRaft > 81 : def testConsumptionWithBrokerFailures(): Unit = > consumeWithBrokerFailures(10) > 122 : def testSeekAndCommitWithBrokerFailures(): Unit = > seekAndCommitWithBrokerFailures(5) > 161 : def testSubscribeWhenTopicUnavailable(): Unit = { > 212 : def testClose(): Unit = { > 297 : def > testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup(): > Unit = { > 337 : def testConsumerReceivesFatalExceptionWhenGroupPassesMaxSize(): Unit = { > 370 : def testCloseDuringRebalance(): Unit = { > Scanned 535 lines. Found 0 KRaft tests out of 7 tests -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14547) Be able to run kafka KRaft Server in tests without needing to run a storage setup script
[ https://issues.apache.org/jira/browse/KAFKA-14547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838637#comment-17838637 ] Matthew de Detrich commented on KAFKA-14547: I'll have a look into this > Be able to run kafka KRaft Server in tests without needing to run a storage > setup script > > > Key: KAFKA-14547 > URL: https://issues.apache.org/jira/browse/KAFKA-14547 > Project: Kafka > Issue Type: Improvement > Components: kraft >Affects Versions: 3.3.1 >Reporter: Natan Silnitsky >Priority: Major > > Currently kafka KRaft Server requires running kafka-storage.sh in order to > start properly. > This makes setup much more cumbersome for build tools such as bazel. > One way to mitigate this is to configure the paths via kafkaConfig... -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-14572) Migrate EmbeddedKafkaCluster used by Streams integration tests from EmbeddedZookeeper to KRaft
[ https://issues.apache.org/jira/browse/KAFKA-14572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-14572: -- Assignee: Matthew de Detrich > Migrate EmbeddedKafkaCluster used by Streams integration tests from > EmbeddedZookeeper to KRaft > -- > > Key: KAFKA-14572 > URL: https://issues.apache.org/jira/browse/KAFKA-14572 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Assignee: Matthew de Detrich >Priority: Blocker > Fix For: 4.0.0 > > > ZK will be removed in 4.0, and we need to update our tests to switch from ZK > to KRaft. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14572) Migrate EmbeddedKafkaCluster used by Streams integration tests from EmbeddedZookeeper to KRaft
[ https://issues.apache.org/jira/browse/KAFKA-14572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795182#comment-17795182 ] Matthew de Detrich commented on KAFKA-14572: I'll have a look into this > Migrate EmbeddedKafkaCluster used by Streams integration tests from > EmbeddedZookeeper to KRaft > -- > > Key: KAFKA-14572 > URL: https://issues.apache.org/jira/browse/KAFKA-14572 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Priority: Blocker > Fix For: 4.0.0 > > > ZK will be removed in 4.0, and we need to update our tests to switch from ZK > to KRaft. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14237) Kafka TLS Doesn't Present Intermediary Certificates when using PEM
[ https://issues.apache.org/jira/browse/KAFKA-14237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795181#comment-17795181 ] Matthew de Detrich commented on KAFKA-14237: I believe this issue has been resolved so it can be closed (see https://issues.apache.org/jira/browse/KAFKA-10338) > Kafka TLS Doesn't Present Intermediary Certificates when using PEM > -- > > Key: KAFKA-14237 > URL: https://issues.apache.org/jira/browse/KAFKA-14237 > Project: Kafka > Issue Type: Bug > Components: security >Affects Versions: 3.2.1 > Environment: Deployed using the Bitnami Helm > Chart(https://github.com/bitnami/charts/tree/master/bitnami/kafka) > The Bitnami Helm Chart uses Docker Image: > https://github.com/bitnami/containers/tree/main/bitnami/kafka > An issue was already opened with Bitnami and they told us to send this > upstream: https://github.com/bitnami/containers/issues/6654 >Reporter: Ryan R >Priority: Blocker > > When using PEM TLS certificates, Kafka does not present the entire > certificate chain. 
> > Our {{/opt/bitnami/kafka/config/server.properties}} file looks like this: > {code:java} > ssl.keystore.type=PEM > ssl.truststore.type=PEM > ssl.keystore.key=-----BEGIN PRIVATE KEY----- \ > > -----END PRIVATE KEY----- > ssl.keystore.certificate.chain=-----BEGIN CERTIFICATE----- \ > > -----END CERTIFICATE----- \ > -----BEGIN CERTIFICATE----- \ > MIIFFjCCAv6gAwIBAgIRAJErCErPDBinU/bWLiWnX1owDQYJKoZIhvcNAQELBQAw \ > TzELMAkGA1UEBhMCVVMxKTAnBgNVBAoTIEludGVybmV0IFNlY3VyaXR5IFJlc2Vh \ > cmNoIEdyb3VwMRUwEwYDVQQDEwxJU1JHIFJvb3QgWDEwHhcNMjAwOTA0MDAwMDAw \ > WhcNMjUwOTE1MTYwMDAwWjAyMQswCQYDVQQGEwJVUzEWMBQGA1UEChMNTGV0J3Mg \ > RW5jcnlwdDELMAkGA1UEAxMCUjMwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEK \ > AoIBAQC7AhUozPaglNMPEuyNVZLD+ILxmaZ6QoinXSaqtSu5xUyxr45r+XXIo9cP \ > R5QUVTVXjJ6oojkZ9YI8QqlObvU7wy7bjcCwXPNZOOftz2nwWgsbvsCUJCWH+jdx \ > sxPnHKzhm+/b5DtFUkWWqcFTzjTIUu61ru2P3mBw4qVUq7ZtDpelQDRrK9O8Zutm \ > NHz6a4uPVymZ+DAXXbpyb/uBxa3Shlg9F8fnCbvxK/eG3MHacV3URuPMrSXBiLxg \ > Z3Vms/EY96Jc5lP/Ooi2R6X/ExjqmAl3P51T+c8B5fWmcBcUr2Ok/5mzk53cU6cG \ > /kiFHaFpriV1uxPMUgP17VGhi9sVAgMBAAGjggEIMIIBBDAOBgNVHQ8BAf8EBAMC \ > AYYwHQYDVR0lBBYwFAYIKwYBBQUHAwIGCCsGAQUFBwMBMBIGA1UdEwEB/wQIMAYB \ > Af8CAQAwHQYDVR0OBBYEFBQusxe3WFbLrlAJQOYfr52LFMLGMB8GA1UdIwQYMBaA \ > FHm0WeZ7tuXkAXOACIjIGlj26ZtuMDIGCCsGAQUFBwEBBCYwJDAiBggrBgEFBQcw \ > AoYWaHR0cDovL3gxLmkubGVuY3Iub3JnLzAnBgNVHR8EIDAeMBygGqAYhhZodHRw \ > Oi8veDEuYy5sZW5jci5vcmcvMCIGA1UdIAQbMBkwCAYGZ4EMAQIBMA0GCysGAQQB \ > gt8TAQEBMA0GCSqGSIb3DQEBCwUAA4ICAQCFyk5HPqP3hUSFvNVneLKYY611TR6W \ > PTNlclQtgaDqw+34IL9fzLdwALduO/ZelN7kIJ+m74uyA+eitRY8kc607TkC53wl \ > ikfmZW4/RvTZ8M6UK+5UzhK8jCdLuMGYL6KvzXGRSgi3yLgjewQtCPkIVz6D2QQz \ > CkcheAmCJ8MqyJu5zlzyZMjAvnnAT45tRAxekrsu94sQ4egdRCnbWSDtY7kh+BIm \ > lJNXoB1lBMEKIq4QDUOXoRgffuDghje1WrG9ML+Hbisq/yFOGwXD9RiX8F6sw6W4 \ > avAuvDszue5L3sz85K+EC4Y/wFVDNvZo4TYXao6Z0f+lQKc0t8DQYzk1OXVu8rp2 \ > yJMC6alLbBfODALZvYH7n7do1AZls4I9d1P4jnkDrQoxB3UqQ9hVl3LEKQ73xF1O \ > yK5GhDDX8oVfGKF5u+decIsH4YaTw7mP3GFxJSqv3+0lUFJoi5Lc5da149p90Ids \ > 
hCExroL1+7mryIkXPeFM5TgO9r0rvZaBFOvV2z0gp35Z0+L4WPlbuEjN/lxPFin+ \ > HlUjr8gRsI3qfJOQFy/9rKIJR0Y/8Omwt/8oTWgy1mdeHmmjk7j1nYsvC9JSQ6Zv \ > MldlTTKB3zhThV1+XWYp6rjd5JW1zbVWEkLNxE7GJThEUG3szgBVGP7pSWTUTsqX \ > nLRbwHOoq7hHwg== \ > -----END CERTIFICATE----- \ > ssl.truststore.certificates=-----BEGIN CERTIFICATE----- \ > MIIFazCCA1OgAwIBAgIRAIIQz7DSQONZRGPgu2OCiwAwDQYJKoZIhvcNAQELBQAw \ > TzELMAkGA1UEBhMCVVMxKTAnBgNVBAoTIEludGVybmV0IFNlY3VyaXR5IFJlc2Vh \ > cmNoIEdyb3VwMRUwEwYDVQQDEwxJU1JHIFJvb3QgWDEwHhcNMTUwNjA0MTEwNDM4 \ > WhcNMzUwNjA0MTEwNDM4WjBPMQswCQYDVQQGEwJVUzEpMCcGA1UEChMgSW50ZXJu \ > ZXQgU2VjdXJpdHkgUmVzZWFyY2ggR3JvdXAxFTATBgNVBAMTDElTUkcgUm9vdCBY \ > MTCCAiIwDQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBAK3oJHP0FDfzm54rVygc \ > h77ct984kIxuPOZXoHj3dcKi/vVqbvYATyjb3miGbESTtrFj/RQSa78f0uoxmyF+ \ > 0TM8ukj13Xnfs7j/EvEhmkvBioZxaUpmZmyPfjxwv60pIgbz5MDmgK7iS4+3mX6U \ > A5/TR5d8mUgjU+g4rk8Kb4Mu0UlXjIB0ttov0DiNewNwIRt18jA8+o+u3dpjq+sW \ > T8KOEUt+zwvo/7V3LvSye0rgTBIlDHCNAymg4VMk7BPZ7hm/ELNKjD+Jo2FR3qyH \ > B5T0Y3HsLuJvW5iB4YlcNHlsdu87kGJ55tukmi8mxdAQ4Q7e2RCOFvu396j3x+UC \ > B5iPNgiV5+I3lg02dZ77DnKxHZu8A/lJBdiB3QW0KtZB6awBdpUKD9jf1b0SHzUv \ > KBds0pjBqAlkd25HN7rOrFleaJ1/ctaJxQZBKT5ZPt0m9STJEadao0xAH0ahmbWn \ > OlFuhjuefXKnEgV4We0+UXgVCwOPjdAvBbI+e0ocS3MFEvzG6uBQE3xDk3SzynTn \ > jh8BCNAw1FtxNrQHusEwMFxIt4I7mKZ9YIqioymCzLq9gwQbooMDQaHWBfEbwrbw \ > qHyGO0aoSCqI3Haadr8faqU9GY/rOPNk3sgrDQoo//fb4hVC1CLQJ13hef4Y53CI \ > rU7m2Ys6xt0nUW7/vGT1M0NPAgMBAAGjQjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNV \ > HRMBAf8EBTADAQH/MB0GA1UdDgQWBBR5tFnme7bl5AFzgAiIyBpY9umbbjANBgkq \ > hkiG9w0BAQsFAAOCAgEAVR9YqbyyqFDQDLHYGmkgJykIrGF1XIpu+ILlaS/V9lZL \ > ubhzEFnTIZd+50xx+7LSYK05qAvqFyFWhfFQDlnrzuBZ6brJFe+GnY+EgPbk6ZGQ \ >
[jira] [Assigned] (KAFKA-14245) Topic deleted during reassignment
[ https://issues.apache.org/jira/browse/KAFKA-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-14245: -- Assignee: Matthew de Detrich > Topic deleted during reassignment > - > > Key: KAFKA-14245 > URL: https://issues.apache.org/jira/browse/KAFKA-14245 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Junyang Liu >Assignee: Matthew de Detrich >Priority: Blocker > > While a partition reassignment is in progress, the topic should not be > deletable. However, "markTopicIneligibleForDeletion" only marks topics that > are already in "controllerContext.topicsToBeDeleted", which is usually not > the case at that time because the topics have not yet been queued for > deletion. As a result, topics undergoing reassignment are never added to > "topicsIneligibleForDeletion", so when a topic deletion request arrives, a > topic that is mid-reassignment can be deleted and the reassignment can never > finish. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14245) Topic deleted during reassignment
[ https://issues.apache.org/jira/browse/KAFKA-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791501#comment-17791501 ] Matthew de Detrich commented on KAFKA-14245: I'll have a look into this to see if it's still valid (and if it is, I'll try to implement a fix) > Topic deleted during reassignment > - > > Key: KAFKA-14245 > URL: https://issues.apache.org/jira/browse/KAFKA-14245 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Junyang Liu >Priority: Blocker > > While a partition reassignment is in progress, the topic should not be > deletable. However, "markTopicIneligibleForDeletion" only marks topics that > are already in "controllerContext.topicsToBeDeleted", which is usually not > the case at that time because the topics have not yet been queued for > deletion. As a result, topics undergoing reassignment are never added to > "topicsIneligibleForDeletion", so when a topic deletion request arrives, a > topic that is mid-reassignment can be deleted and the reassignment can never > finish. -- This message was sent by Atlassian Jira (v8.20.10#820010)
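The gap described in the report can be made concrete with a simplified, hypothetical model (this is not the actual controller code; class and field names merely mirror the identifiers quoted in the issue): the ineligibility mark only takes effect for topics already queued for deletion, so a topic that is merely reassigning slips through.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified model of the guard described in KAFKA-14245 (not real Kafka
// code): a topic is only marked ineligible for deletion if it is already
// queued for deletion, so topics that are merely reassigning stay deletable.
class DeletionGuard {
    final Set<String> topicsToBeDeleted = new HashSet<>();
    final Set<String> topicsIneligibleForDeletion = new HashSet<>();

    void markTopicIneligibleForDeletion(String topic) {
        // Usually false while a reassignment is starting, which is the bug.
        if (topicsToBeDeleted.contains(topic)) {
            topicsIneligibleForDeletion.add(topic);
        }
    }

    boolean mayDelete(String topic) {
        return !topicsIneligibleForDeletion.contains(topic);
    }
}
```
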
[jira] [Commented] (KAFKA-14132) Remaining PowerMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776407#comment-17776407 ] Matthew de Detrich commented on KAFKA-14132: [~ChrisEgerton] Feel free to leave it unassigned, I don't have capacity for this. > Remaining PowerMock to Mockito tests > > > Key: KAFKA-14132 > URL: https://issues.apache.org/jira/browse/KAFKA-14132 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > Fix For: 3.7.0 > > > {color:#de350b}Some of the tests below use EasyMock as well. For those > migrate both PowerMock and EasyMock to Mockito.{color} > Unless stated in brackets the tests are in the connect module. > A list of tests which still require to be moved from PowerMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > {color:#ff8b00}InReview{color} > {color:#00875a}Merged{color} > # {color:#00875a}ErrorHandlingTaskTest{color} (owner: [~shekharrajak]) > # {color:#00875a}SourceTaskOffsetCommiterTest{color} (owner: Christo) > # {color:#00875a}WorkerMetricsGroupTest{color} (owner: Divij) > # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) > # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) > # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) > # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) > # {color:#00875a}ConnectorsResourceTest{color} (owner: [~mdedetrich-aiven]) > # {color:#ff8b00}StandaloneHerderTest{color} (owner: [~mdedetrich-aiven]) > ([https://github.com/apache/kafka/pull/12728]) > # KafkaConfigBackingStoreTest (UNOWNED) > # {color:#00875a}KafkaOffsetBackingStoreTest{color} (owner: Christo) > ([https://github.com/apache/kafka/pull/12418]) > # {color:#00875a}KafkaBasedLogTest{color} (owner: @bachmanity ]) > # {color:#00875a}RetryUtilTest{color} (owner: [~yash.mayya]) > # 
{color:#00875a}RepartitionTopicTest{color} (streams) (owner: Christo) > # {color:#00875a}StateManagerUtilTest{color} (streams) (owner: Christo) > *The coverage report for the above tests after the change should be >= to > what the coverage is now.* -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-14975) Make TopicBasedRemoteLogMetadataManager methods wait for initialize to complete
[ https://issues.apache.org/jira/browse/KAFKA-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich resolved KAFKA-14975. Resolution: Won't Fix Closing issue since the core problem was solved in another way (see https://github.com/apache/kafka/pull/14127) > Make TopicBasedRemoteLogMetadataManager methods wait for initialize to > complete > --- > > Key: KAFKA-14975 > URL: https://issues.apache.org/jira/browse/KAFKA-14975 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > > In the current implementation of TopicBasedRemoteLogMetadataManager, various > methods internally call ensureInitializedAndNotClosed to ensure that the > TopicBasedRemoteLogMetadataManager is initialized. If it is not initialized, > an exception is thrown. > This is not ideal behaviour; rather than just throwing an exception, we > should wait (with a timeout) until TopicBasedRemoteLogMetadataManager is > initialised. This is the behaviour users would expect, and it's also what > other parts of Kafka that use plugin-based systems (e.g. Kafka Connect) do. -- This message was sent by Atlassian Jira (v8.20.10#820010)
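The proposed wait-with-timeout behaviour can be sketched minimally. This is a hedged, hypothetical stand-in (the class and method names are not the real TopicBasedRemoteLogMetadataManager API): callers block on a latch with a bounded timeout instead of failing immediately while initialization has not completed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the ticket's proposal: rather than throwing when
// not yet initialized, callers wait on a latch with a bounded timeout.
class InitAwareManager {
    private final CountDownLatch initialized = new CountDownLatch(1);

    void initialize() {
        // ... real setup work would happen here ...
        initialized.countDown();
    }

    // Returns true if initialization completed within the timeout.
    boolean awaitInitialized(long timeoutMs) {
        try {
            return initialized.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A caller would invoke awaitInitialized with its timeout and only raise an error if the wait fails, matching the behaviour the description asks for.
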
[jira] [Updated] (KAFKA-14975) Make TopicBasedRemoteLogMetadataManager methods wait for initialize to complete
[ https://issues.apache.org/jira/browse/KAFKA-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14975: --- Parent: KAFKA-7739 Issue Type: Sub-task (was: Task) > Make TopicBasedRemoteLogMetadataManager methods wait for initialize to > complete > --- > > Key: KAFKA-14975 > URL: https://issues.apache.org/jira/browse/KAFKA-14975 > Project: Kafka > Issue Type: Sub-task >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > > In the current implementation of TopicBasedRemoteLogMetadataManager, various > methods internally call ensureInitializedAndNotClosed to ensure that the > TopicBasedRemoteLogMetadataManager is initialized. If it is not initialized, > an exception is thrown. > This is not ideal behaviour; rather than just throwing an exception, we > should wait (with a timeout) until TopicBasedRemoteLogMetadataManager is > initialised. This is the behaviour users would expect, and it's also what > other parts of Kafka that use plugin-based systems (e.g. Kafka Connect) do. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14975) Make TopicBasedRemoteLogMetadataManager methods wait for initialize to complete
[ https://issues.apache.org/jira/browse/KAFKA-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14975: --- Description: In the current implementation of TopicBasedRemoteLogMetadataManager, various methods internally call ensureInitializedAndNotClosed to ensure that the TopicBasedRemoteLogMetadataManager is initialized. If it is not initialized, an exception is thrown. This is not ideal behaviour; rather than just throwing an exception, we should wait (with a timeout) until TopicBasedRemoteLogMetadataManager is initialised. This is the behaviour users would expect, and it's also what other parts of Kafka that use plugin-based systems (e.g. Kafka Connect) do. was:In the current implementation of > Make TopicBasedRemoteLogMetadataManager methods wait for initialize to > complete > --- > > Key: KAFKA-14975 > URL: https://issues.apache.org/jira/browse/KAFKA-14975 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > > In the current implementation of TopicBasedRemoteLogMetadataManager, various > methods internally call ensureInitializedAndNotClosed to ensure that the > TopicBasedRemoteLogMetadataManager is initialized. If it is not initialized, > an exception is thrown. > This is not ideal behaviour; rather than just throwing an exception, we > should wait (with a timeout) until TopicBasedRemoteLogMetadataManager is > initialised. This is the behaviour users would expect, and it's also what > other parts of Kafka that use plugin-based systems (e.g. Kafka Connect) do. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-14975) Make TopicBasedRemoteLogMetadataManager methods wait for initialize to complete
[ https://issues.apache.org/jira/browse/KAFKA-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-14975: -- Assignee: Matthew de Detrich > Make TopicBasedRemoteLogMetadataManager methods wait for initialize to > complete > --- > > Key: KAFKA-14975 > URL: https://issues.apache.org/jira/browse/KAFKA-14975 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > > In the current implementation of -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14975) Make TopicBasedRemoteLogMetadataManager methods wait for initialize to complete
[ https://issues.apache.org/jira/browse/KAFKA-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14975: --- Description: In the current implementation of > Make TopicBasedRemoteLogMetadataManager methods wait for initialize to > complete > --- > > Key: KAFKA-14975 > URL: https://issues.apache.org/jira/browse/KAFKA-14975 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > In the current implementation of -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14975) Make TopicBasedRemoteLogMetadataManager methods wait for initialize to complete
Matthew de Detrich created KAFKA-14975: -- Summary: Make TopicBasedRemoteLogMetadataManager methods wait for initialize to complete Key: KAFKA-14975 URL: https://issues.apache.org/jira/browse/KAFKA-14975 Project: Kafka Issue Type: Task Reporter: Matthew de Detrich -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14524) Modularize `core` monolith
[ https://issues.apache.org/jira/browse/KAFKA-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654413#comment-17654413 ] Matthew de Detrich edited comment on KAFKA-14524 at 1/4/23 11:45 AM: - I believe the main intention of this is to modularize the Kafka core (which I don't think anyone disagrees with) but the Scala removal has been tacked on without a good justification as to why these orthogonal concerns are done together. Combining such 2 massive tasks at once seems to make the scope of this too big (splitting out the core without migrating away from Scala has far less risk). I'm also wondering if the move away from Scala would qualify as consensus required changes, mostly to get the opinion of the community on this topic. For example, Flink which also decided to do such a similar change consulted the community via discussion threads (see [https://lists.apache.org/thread/voj99gkk2v1bgw8xqcbmzgvn9ffs7v7h] and [https://lists.apache.org/thread/j1fggrbh7hl4pfqcqqq7p527kvdgk35s] as examples). May I suggest that instead of doing both the modularization and the Scala to Java migration at once, we just do the modularization and do the migration later once the community is consulted? was (Author: mdedetrich-aiven): I believe the main intention of this is to modularize the Kafka core (which I don't think anyone disagrees with) but the Scala removal has been tacked on without a good justification as to why these orthogonal concerns are done together. Combining such 2 massive tasks at once seems to make the scope of this too big (splitting out the core without migrating away from Scala has far less risk). I'm also wondering if the move away from Scala would qualify as consensus required changes, mostly to get the opinion of the community on this topic. 
For example, Flink which also decided to do such a similar change consulted the community via discussion threads (see [https://lists.apache.org/thread/voj99gkk2v1bgw8xqcbmzgvn9ffs7v7h] and [https://lists.apache.org/thread/j1fggrbh7hl4pfqcqqq7p527kvdgk35s] as an example). May I suggest that instead of doing both the modularization and the Scala to Java migration at once, we just do the modularization and do the migration later once the community is consulted? > Modularize `core` monolith > -- > > Key: KAFKA-14524 > URL: https://issues.apache.org/jira/browse/KAFKA-14524 > Project: Kafka > Issue Type: Improvement >Reporter: Ismael Juma >Priority: Major > > The `core` module has grown too large and it's time to split it into multiple > modules. A much slimmer `core` module will remain in the end. > Evidence of `core` growing too large is that it takes 1m10s to compile the > main code and tests and it takes hours to run all the tests sequentially. > As part of this effort, we should rewrite the Scala code in Java to reduce > developer friction, reduce compilation time and simplify deployment (i.e. we > can remove the scala version suffix from the module name). Scala may have a > number of advantages over Java 8 (minimum version we support now) and Java 11 > (minimum version we will support in Kafka 4.0), but a mixture of Scala and > Java (as we have now) is more complex than just Java. > Another benefit is that code dependencies will be strictly enforced, which > will hopefully help ensure better abstractions. > This pattern was started with the `tools` (but not completed), `metadata` and > `raft` modules and we have (when this ticket was filed) a couple more in > progress: `group-coordinator` and `storage`. > This is an umbrella ticket and it will link to each ticket related to this > goal. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14524) Modularize `core` monolith
[ https://issues.apache.org/jira/browse/KAFKA-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654413#comment-17654413 ] Matthew de Detrich edited comment on KAFKA-14524 at 1/4/23 11:44 AM: - I believe the main intention of this is to modularize the Kafka core (which I don't think anyone disagrees with) but the Scala removal has been tacked on without a good justification as to why these orthogonal concerns are done together. Combining such 2 massive tasks at once seems to make the scope of this too big (splitting out the core without migrating away from Scala has far less risk). I'm also wondering if the move away from Scala would qualify as consensus required changes, mostly to get the opinion of the community on this topic. For example, Flink which also decided to do such a similar change consulted the community via discussion threads (see [https://lists.apache.org/thread/voj99gkk2v1bgw8xqcbmzgvn9ffs7v7h] and [https://lists.apache.org/thread/j1fggrbh7hl4pfqcqqq7p527kvdgk35s] as an example). May I suggest that instead of doing both the modularization and the Scala to Java migration at once, we just do the modularization and do the migration later once the community is consulted? was (Author: mdedetrich-aiven): I believe the main intention of this is to modularize the Kafka core (which I don't think anyone disagrees with) but the Scala removal has been tacked on without a good justification as to why these orthogonal concerns are done together. Combining such 2 massive tasks at once seems to make the scope of this too big (splitting out the core without migrating away from Scala has far less risk). I'm also wondering if the move away from Scala would qualify as consensus required changes, mostly to get the opinion of the community on this topic. 
For example, Flink which also decided to do such a similar change consulted the community via discussion threads (see https://lists.apache.org/thread/voj99gkk2v1bgw8xqcbmzgvn9ffs7v7h as an example). May I suggest that instead of doing both the modularization and the Scala to Java migration at once, we just do the modularization and do the migration later once the community is consulted? > Modularize `core` monolith > -- > > Key: KAFKA-14524 > URL: https://issues.apache.org/jira/browse/KAFKA-14524 > Project: Kafka > Issue Type: Improvement >Reporter: Ismael Juma >Priority: Major > > The `core` module has grown too large and it's time to split it into multiple > modules. A much slimmer `core` module will remain in the end. > Evidence of `core` growing too large is that it takes 1m10s to compile the > main code and tests and it takes hours to run all the tests sequentially. > As part of this effort, we should rewrite the Scala code in Java to reduce > developer friction, reduce compilation time and simplify deployment (i.e. we > can remove the scala version suffix from the module name). Scala may have a > number of advantages over Java 8 (minimum version we support now) and Java 11 > (minimum version we will support in Kafka 4.0), but a mixture of Scala and > Java (as we have now) is more complex than just Java. > Another benefit is that code dependencies will be strictly enforced, which > will hopefully help ensure better abstractions. > This pattern was started with the `tools` (but not completed), `metadata` and > `raft` modules and we have (when this ticket was filed) a couple more in > progress: `group-coordinator` and `storage`. > This is an umbrella ticket and it will link to each ticket related to this > goal. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14524) Modularize `core` monolith
[ https://issues.apache.org/jira/browse/KAFKA-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654413#comment-17654413 ] Matthew de Detrich commented on KAFKA-14524: I believe the main intention of this is to modularize the Kafka core (which I don't think anyone disagrees with) but the Scala removal has been tacked on without a good justification as to why these orthogonal concerns are done together. Combining two such massive tasks at once seems to make the scope of this too big (splitting out the core without migrating away from Scala has far less risk). I'm also wondering if the move away from Scala would qualify as a consensus-required change, mostly to get the opinion of the community on this topic. For example, Flink, which made a similar change, consulted the community via discussion threads (see https://lists.apache.org/thread/voj99gkk2v1bgw8xqcbmzgvn9ffs7v7h as an example). May I suggest that instead of doing both the modularization and the Scala-to-Java migration at once, we just do the modularization and do the migration later once the community has been consulted? > Modularize `core` monolith > -- > > Key: KAFKA-14524 > URL: https://issues.apache.org/jira/browse/KAFKA-14524 > Project: Kafka > Issue Type: Improvement >Reporter: Ismael Juma >Priority: Major > > The `core` module has grown too large and it's time to split it into multiple > modules. A much slimmer `core` module will remain in the end. > Evidence of `core` growing too large is that it takes 1m10s to compile the > main code and tests and it takes hours to run all the tests sequentially. > As part of this effort, we should rewrite the Scala code in Java to reduce > developer friction, reduce compilation time and simplify deployment (i.e. we > can remove the scala version suffix from the module name). 
Scala may have a > number of advantages over Java 8 (minimum version we support now) and Java 11 > (minimum version we will support in Kafka 4.0), but a mixture of Scala and > Java (as we have now) is more complex than just Java. > Another benefit is that code dependencies will be strictly enforced, which > will hopefully help ensure better abstractions. > This pattern was started with the `tools` (but not completed), `metadata` and > `raft` modules and we have (when this ticket was filed) a couple more in > progress: `group-coordinator` and `storage`. > This is an umbrella ticket and it will link to each ticket related to this > goal. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14332) Split out checkstyle configs between test and main
[ https://issues.apache.org/jira/browse/KAFKA-14332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14332: --- Description: Currently in Kafka we have one global import-control.xml that is used to control imports for both the main source code and tests. It would be ideal to have separate configs, one for test and one for main, so that, for example, one cannot accidentally add a test library (such as PowerMock) to the main source code. > Split out checkstyle configs between test and main > -- > > Key: KAFKA-14332 > URL: https://issues.apache.org/jira/browse/KAFKA-14332 > Project: Kafka > Issue Type: Bug > Components: system tests, unit tests >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > > Currently in Kafka we have one global import-control.xml that is used to > control imports for both the main source code and tests. It would be ideal to > have separate configs, one for test and one for main, so that, for example, one > cannot accidentally add a test library (such as PowerMock) to the main source > code. -- This message was sent by Atlassian Jira (v8.20.10#820010)
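Checkstyle's ImportControl module supports exactly this kind of split. A minimal sketch of what the test-only config could look like (the file name and the allowed packages below are illustrative, not Kafka's actual setup):

```xml
<!DOCTYPE import-control PUBLIC
    "-//Checkstyle//DTD ImportControl Configuration 1.4//EN"
    "https://checkstyle.org/dtds/import_control_1_4.dtd">
<!-- import-control-test.xml: applied only to test source sets -->
<import-control pkg="org.apache.kafka">
  <!-- test-only libraries are allowed here but deliberately absent from the
       main import-control.xml, so importing them in main code fails the build -->
  <allow pkg="org.junit"/>
  <allow pkg="org.mockito"/>
</import-control>
```

The build would then point the Checkstyle task for each source set at the matching file, rather than sharing one global config.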
[jira] [Updated] (KAFKA-14332) Split out checkstyle configs between test and main
[ https://issues.apache.org/jira/browse/KAFKA-14332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14332: --- Component/s: system tests unit tests > Split out checkstyle configs between test and main > -- > > Key: KAFKA-14332 > URL: https://issues.apache.org/jira/browse/KAFKA-14332 > Project: Kafka > Issue Type: Bug > Components: system tests, unit tests >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-14332) Split out checkstyle configs between test and main
[ https://issues.apache.org/jira/browse/KAFKA-14332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-14332: -- Assignee: Matthew de Detrich > Split out checkstyle configs between test and main > -- > > Key: KAFKA-14332 > URL: https://issues.apache.org/jira/browse/KAFKA-14332 > Project: Kafka > Issue Type: Bug >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14332) Split out checkstyle configs between test and main
Matthew de Detrich created KAFKA-14332: -- Summary: Split out checkstyle configs between test and main Key: KAFKA-14332 URL: https://issues.apache.org/jira/browse/KAFKA-14332 Project: Kafka Issue Type: Bug Reporter: Matthew de Detrich -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14132) Remaining PowerMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14132: --- Description: {color:#de350b}Some of the tests below use EasyMock as well. For those migrate both PowerMock and EasyMock to Mockito.{color} Unless stated in brackets the tests are in the connect module. A list of tests which still require to be moved from PowerMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#ff8b00}InReview{color} {color:#00875a}Merged{color} # ErrorHandlingTaskTest (owner: Divij) # SourceTaskOffsetCommiterTest (owner: Divij) # {color:#00875a}WorkerMetricsGroupTest{color} (owner: Divij) # WorkerSinkTaskTest (owner: Divij) # WorkerSinkTaskThreadedTest (owner: Divij) # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) ([https://github.com/apache/kafka/pull/12725]) # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] ) # KafkaOffsetBackingStoreTest (owner: Christo) ([https://github.com/apache/kafka/pull/12418]) # KafkaBasedLogTest (owner: [~mdedetrich-aiven] ) # RetryUtilTest (owner: [~mdedetrich-aiven] ) # RepartitionTopicTest (streams) (owner: Christo) # StateManagerUtilTest (streams) (owner: Christo) *The coverage report for the above tests after the change should be >= to what the coverage is now.* was: {color:#de350b}Some of the tests below use EasyMock as well. For those migrate both PowerMock and EasyMock to Mockito.{color} Unless stated in brackets the tests are in the connect module. 
A list of tests which still require to be moved from PowerMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#ff8b00}InReview{color} {color:#00875a}Merged{color} # ErrorHandlingTaskTest (owner: Divij) # SourceTaskOffsetCommiterTest (owner: Divij) # {color:#00875a}WorkerMetricsGroupTest{color} (owner: Divij) # WorkerSinkTaskTest (owner: Divij) # WorkerSinkTaskThreadedTest (owner: Divij) # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] )(https://github.com/apache/kafka/pull/12725) # KafkaOffsetBackingStoreTest (owner: Christo) ([https://github.com/apache/kafka/pull/12418]) # KafkaBasedLogTest (owner: [~mdedetrich-aiven] ) # RetryUtilTest (owner: [~mdedetrich-aiven] ) # RepartitionTopicTest (streams) (owner: Christo) # StateManagerUtilTest (streams) (owner: Christo) *The coverage report for the above tests after the change should be >= to what the coverage is now.* > Remaining PowerMock to Mockito tests > > > Key: KAFKA-14132 > URL: https://issues.apache.org/jira/browse/KAFKA-14132 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > > {color:#de350b}Some of the tests below use EasyMock as well. For those > migrate both PowerMock and EasyMock to Mockito.{color} > Unless stated in brackets the tests are in the connect module. 
> A list of tests which still require to be moved from PowerMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > {color:#ff8b00}InReview{color} > {color:#00875a}Merged{color} > # ErrorHandlingTaskTest (owner: Divij) > # SourceTaskOffsetCommiterTest (owner: Divij) > # {color:#00875a}WorkerMetricsGroupTest{color} (owner: Divij) > # WorkerSinkTaskTest (owner: Divij) > # WorkerSinkTaskThreadedTest (owner: Divij) > # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) > # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) > # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) > # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) > # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) > ([https://github.com/apache/kafka/pull/12725]) > # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) > # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] ) > #
[jira] [Updated] (KAFKA-14132) Remaining PowerMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14132: --- Description: {color:#de350b}Some of the tests below use EasyMock as well. For those migrate both PowerMock and EasyMock to Mockito.{color} Unless stated in brackets the tests are in the connect module. A list of tests which still require to be moved from PowerMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#ff8b00}InReview{color} {color:#00875a}Merged{color} # ErrorHandlingTaskTest (owner: Divij) # SourceTaskOffsetCommiterTest (owner: Divij) # {color:#00875a}WorkerMetricsGroupTest{color} (owner: Divij) # WorkerSinkTaskTest (owner: Divij) # WorkerSinkTaskThreadedTest (owner: Divij) # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] )(https://github.com/apache/kafka/pull/12725) # KafkaOffsetBackingStoreTest (owner: Christo) ([https://github.com/apache/kafka/pull/12418]) # KafkaBasedLogTest (owner: [~mdedetrich-aiven] ) # RetryUtilTest (owner: [~mdedetrich-aiven] ) # RepartitionTopicTest (streams) (owner: Christo) # StateManagerUtilTest (streams) (owner: Christo) *The coverage report for the above tests after the change should be >= to what the coverage is now.* was: {color:#de350b}Some of the tests below use EasyMock as well. For those migrate both PowerMock and EasyMock to Mockito.{color} Unless stated in brackets the tests are in the connect module. 
A list of tests which still require to be moved from PowerMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#ff8b00}InReview{color} {color:#00875a}Merged{color} # ErrorHandlingTaskTest (owner: Divij) # SourceTaskOffsetCommiterTest (owner: Divij) # {color:#00875a}WorkerMetricsGroupTest{color} (owner: Divij) # WorkerSinkTaskTest (owner: Divij) # WorkerSinkTaskThreadedTest (owner: Divij) # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] ) # KafkaOffsetBackingStoreTest (owner: Christo) ([https://github.com/apache/kafka/pull/12418]) # KafkaBasedLogTest (owner: [~mdedetrich-aiven] ) # RetryUtilTest (owner: [~mdedetrich-aiven] ) # RepartitionTopicTest (streams) (owner: Christo) # StateManagerUtilTest (streams) (owner: Christo) *The coverage report for the above tests after the change should be >= to what the coverage is now.* > Remaining PowerMock to Mockito tests > > > Key: KAFKA-14132 > URL: https://issues.apache.org/jira/browse/KAFKA-14132 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > > {color:#de350b}Some of the tests below use EasyMock as well. For those > migrate both PowerMock and EasyMock to Mockito.{color} > Unless stated in brackets the tests are in the connect module. 
> A list of tests which still require to be moved from PowerMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > {color:#ff8b00}InReview{color} > {color:#00875a}Merged{color} > # ErrorHandlingTaskTest (owner: Divij) > # SourceTaskOffsetCommiterTest (owner: Divij) > # {color:#00875a}WorkerMetricsGroupTest{color} (owner: Divij) > # WorkerSinkTaskTest (owner: Divij) > # WorkerSinkTaskThreadedTest (owner: Divij) > # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) > # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) > # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) > # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) > # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) > # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) > # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] > )(https://github.com/apache/kafka/pull/12725) > # KafkaOffsetBackingStoreTest (owner: Christo) >
[jira] [Updated] (KAFKA-14283) Fix connector creation authorization tests not doing anything
[ https://issues.apache.org/jira/browse/KAFKA-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14283: --- Description: Currently the testCreateConnectorWithoutHeaderAuthorization and testCreateConnectorWithHeaderAuthorization tests within ConnectorsResourceTest aren't actually doing anything. This is because in reality the requests should be forwarded to a leader and tests aren't actually testing that a leader RestClient request is made (was: Currently the testCreateConnectorWithoutHeaderAuthorization and testCreateConnectorWithHeaderAuthorization tests within ConnectorsResourceTest aren't actually anything. This is because in reality the requests should be forwarded to a leader and tests aren't actually testing that a leader RestClient request is made) > Fix connector creation authorization tests not doing anything > - > > Key: KAFKA-14283 > URL: https://issues.apache.org/jira/browse/KAFKA-14283 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > > Currently the testCreateConnectorWithoutHeaderAuthorization and > testCreateConnectorWithHeaderAuthorization tests within > ConnectorsResourceTest aren't actually doing anything. This is because in > reality the requests should be forwarded to a leader and tests aren't > actually testing that a leader RestClient request is made -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-14283) Fix connector creation authorization tests not doing anything
[ https://issues.apache.org/jira/browse/KAFKA-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich resolved KAFKA-14283. Resolution: Fixed > Fix connector creation authorization tests not doing anything > - > > Key: KAFKA-14283 > URL: https://issues.apache.org/jira/browse/KAFKA-14283 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Reporter: Matthew de Detrich >Assignee: Matthew de Detrich >Priority: Major > > Currently the testCreateConnectorWithoutHeaderAuthorization and > testCreateConnectorWithHeaderAuthorization tests within > ConnectorsResourceTest aren't actually doing anything. This is because in reality > the requests should be forwarded to a leader and tests aren't actually > testing that a leader RestClient request is made -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14283) Fix connector creation authorization tests not doing anything
Matthew de Detrich created KAFKA-14283: -- Summary: Fix connector creation authorization tests not doing anything Key: KAFKA-14283 URL: https://issues.apache.org/jira/browse/KAFKA-14283 Project: Kafka Issue Type: Bug Components: KafkaConnect Reporter: Matthew de Detrich Currently the testCreateConnectorWithoutHeaderAuthorization and testCreateConnectorWithHeaderAuthorization tests within ConnectorsResourceTest aren't actually doing anything. This is because in reality the requests should be forwarded to a leader and tests aren't actually testing that a leader RestClient request is made -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14256) Update to Scala 2.13.9
Matthew de Detrich created KAFKA-14256: -- Summary: Update to Scala 2.13.9 Key: KAFKA-14256 URL: https://issues.apache.org/jira/browse/KAFKA-14256 Project: Kafka Issue Type: Task Reporter: Matthew de Detrich -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14132) Remaining PowerMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14132: --- Description: {color:#de350b}Some of the tests below use EasyMock as well. For those migrate both PowerMock and EasyMock to Mockito.{color} Unless stated in brackets the tests are in the connect module. A list of tests which still require to be moved from PowerMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#ff8b00}InReview{color} {color:#00875a}Merged{color} # ErrorHandlingTaskTest (owner: Divij) # SourceTaskOffsetCommiterTest (owner: Divij) # WorkerMetricsGroupTest (owner: Divij) # WorkerSinkTaskTest (owner: Divij) # WorkerSinkTaskThreadedTest (owner: Divij) # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] ) # KafkaOffsetBackingStoreTest (owner: Christo) ([https://github.com/apache/kafka/pull/12418]) # KafkaBasedLogTest (owner: [~mdedetrich-aiven] ) # RetryUtilTest (owner: [~mdedetrich-aiven] ) # RepartitionTopicTest (streams) (owner: Christo) # StateManagerUtilTest (streams) (owner: Christo) *The coverage report for the above tests after the change should be >= to what the coverage is now.* was: {color:#de350b}Some of the tests below use EasyMock as well. For those migrate both PowerMock and EasyMock to Mockito.{color} Unless stated in brackets the tests are in the connect module. 
A list of tests which still require to be moved from PowerMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#FF8B00}InReview{color} {color:#00875A}Merged{color} # ErrorHandlingTaskTest (owner: Divij) # SourceTaskOffsetCommiterTest (owner: Divij) # WorkerMetricsGroupTest (owner: Divij) # WorkerSinkTaskTest (owner: Divij) # WorkerSinkTaskThreadedTest (owner: Divij) # {color:#00875A}WorkerTaskTest{color} (owner: [~yash.mayya]) # {color:#00875A}ErrorReporterTest{color} (owner: [~yash.mayya]) # {color:#00875A}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) # {color:#00875A}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) # ConnectorsResourceTest # StandaloneHerderTest # KafkaConfigBackingStoreTest # KafkaOffsetBackingStoreTest (owner: Christo) ([https://github.com/apache/kafka/pull/12418]) # KafkaBasedLogTest # RetryUtilTest # RepartitionTopicTest (streams) (owner: Christo) # StateManagerUtilTest (streams) (owner: Christo) *The coverage report for the above tests after the change should be >= to what the coverage is now.* > Remaining PowerMock to Mockito tests > > > Key: KAFKA-14132 > URL: https://issues.apache.org/jira/browse/KAFKA-14132 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > > {color:#de350b}Some of the tests below use EasyMock as well. For those > migrate both PowerMock and EasyMock to Mockito.{color} > Unless stated in brackets the tests are in the connect module. 
> A list of tests which still require to be moved from PowerMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > {color:#ff8b00}InReview{color} > {color:#00875a}Merged{color} > # ErrorHandlingTaskTest (owner: Divij) > # SourceTaskOffsetCommiterTest (owner: Divij) > # WorkerMetricsGroupTest (owner: Divij) > # WorkerSinkTaskTest (owner: Divij) > # WorkerSinkTaskThreadedTest (owner: Divij) > # {color:#00875a}WorkerTaskTest{color} (owner: [~yash.mayya]) > # {color:#00875a}ErrorReporterTest{color} (owner: [~yash.mayya]) > # {color:#00875a}RetryWithToleranceOperatorTest{color} (owner: [~yash.mayya]) > # {color:#00875a}WorkerErrantRecordReporterTest{color} (owner: [~yash.mayya]) > # ConnectorsResourceTest (owner: [~mdedetrich-aiven] ) > # StandaloneHerderTest (owner: [~mdedetrich-aiven] ) > # KafkaConfigBackingStoreTest (owner: [~mdedetrich-aiven] ) > # KafkaOffsetBackingStoreTest (owner: Christo) > ([https://github.com/apache/kafka/pull/12418]) > # KafkaBasedLogTest (owner: [~mdedetrich-aiven] ) > # RetryUtilTest (owner: [~mdedetrich-aiven] ) > # RepartitionTopicTest (streams) (owner: Christo) > # StateManagerUtilTest (streams) (owner: Christo) > *The coverage report for the above tests after the change should be >= to what the coverage is now.*
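The migrations listed above replace PowerMock/EasyMock's expect/replay/verify style with Mockito's stub-and-verify style. A minimal sketch of the pattern (the `OffsetStore` collaborator below is hypothetical, for illustration only; the real tests mock Connect classes such as `RestClient` or the herder instead):

```java
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.verifyNoMoreInteractions;
import static org.mockito.Mockito.when;

public class MockitoMigrationSketch {

    // Hypothetical collaborator interface, for illustration only.
    interface OffsetStore {
        boolean persist(String key);
    }

    public static void main(String[] args) {
        // PowerMock/EasyMock style: expect(...); replay(mock); ... verify(mock);
        // Mockito drops the replay phase entirely: stub first, exercise, then
        // verify the interactions you care about.
        OffsetStore store = mock(OffsetStore.class);
        when(store.persist("offsets")).thenReturn(true);

        boolean ok = store.persist("offsets");

        verify(store).persist("offsets");
        verifyNoMoreInteractions(store);
        System.out.println("persisted=" + ok); // prints persisted=true
    }
}
```

Because Mockito verifies only the interactions the test asks about, a migrated test can silently stop asserting anything; this is why the ticket insists the coverage report stay at least as high after the change.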
[jira] [Comment Edited] (KAFKA-14221) Update Apache Kafka JVM version in CI to latest versions
[ https://issues.apache.org/jira/browse/KAFKA-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17603260#comment-17603260 ] Matthew de Detrich edited comment on KAFKA-14221 at 9/12/22 9:52 PM: - Thanks, INFRA tickets created. I checked the matrix and indeed even the latest variants of the JDKs are older than the versions of JDK where this bug has been fixed. was (Author: mdedetrich-aiven): Thanks, INFRA tickets created > Update Apache Kafka JVM version in CI to latest versions > > > Key: KAFKA-14221 > URL: https://issues.apache.org/jira/browse/KAFKA-14221 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > In a recent test (see > [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] > run the JVM crashed with the following stack trace > > {code:java} > [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java > Runtime Environment: > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, > pid=6229, tid=7342 > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment > (17.0.1+12) (build 17.0.1+12-LTS-39) > [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM > (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed > class ptrs, parallel gc, linux-amd64) > [2022-09-12T14:22:22.414Z] # Problematic frame: > [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] > PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) > [clone .part.0]+0x52 > {code} > After some research online I found that there was a JDK bug filed for the > same kind of crash, see > [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] > This bug was fixed in JDK 17.0.2 which is a newer version than the one that > is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). 
Note that locally with > the latest OpenJDK (specifically 17.0.4.1) I cannot reproduce this crash. > We should update both JDK 11 and JDK 17 to the latest version in the CI to > see if this will solve the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14221) Update Apache Kafka JVM version in CI to latest versions
[ https://issues.apache.org/jira/browse/KAFKA-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17603260#comment-17603260 ] Matthew de Detrich edited comment on KAFKA-14221 at 9/12/22 9:49 PM: - Thanks, INFRA tickets created was (Author: mdedetrich-aiven): Thanks, INFRA ticket created > Update Apache Kafka JVM version in CI to latest versions > > > Key: KAFKA-14221 > URL: https://issues.apache.org/jira/browse/KAFKA-14221 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > In a recent test (see > [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] > run the JVM crashed with the following stack trace > > {code:java} > [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java > Runtime Environment: > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, > pid=6229, tid=7342 > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment > (17.0.1+12) (build 17.0.1+12-LTS-39) > [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM > (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed > class ptrs, parallel gc, linux-amd64) > [2022-09-12T14:22:22.414Z] # Problematic frame: > [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] > PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) > [clone .part.0]+0x52 > {code} > After some research online I found that there was a JDK bug filed for the > same kind of crash, see > [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] > This bug was fixed in JDK 17.0.2 which is a newer version than the one that > is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). Note that locally with > the latest OpenJDK (specifically 17.0.4.1) I cannot reproduce this crash. 
> We should update both JDK 11 and JDK 17 to the latest version in the CI to > see if this will solve the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-14223) Update jdk_17_latest to adoptium 17.0.4+1
[ https://issues.apache.org/jira/browse/KAFKA-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich resolved KAFKA-14223. Resolution: Invalid Created in wrong project > Update jdk_17_latest to adoptium 17.0.4+1 > - > > Key: KAFKA-14223 > URL: https://issues.apache.org/jira/browse/KAFKA-14223 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > The current latest JDK 17 is outdated and Apache Kafka appeared to have hit a > bug due to this (see https://issues.apache.org/jira/browse/KAFKA-14221). > Would it be possible to update jdk_17_latest to the adoptium 17.0.4+1 release? > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14221) Update Apache Kafka JVM version in CI to latest versions
[ https://issues.apache.org/jira/browse/KAFKA-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17603260#comment-17603260 ] Matthew de Detrich commented on KAFKA-14221: Thanks, INFRA ticket created > Update Apache Kafka JVM version in CI to latest versions > > > Key: KAFKA-14221 > URL: https://issues.apache.org/jira/browse/KAFKA-14221 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > In a recent test (see > [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] > run the JVM crashed with the following stack trace > > {code:java} > [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java > Runtime Environment: > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, > pid=6229, tid=7342 > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment > (17.0.1+12) (build 17.0.1+12-LTS-39) > [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM > (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed > class ptrs, parallel gc, linux-amd64) > [2022-09-12T14:22:22.414Z] # Problematic frame: > [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] > PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) > [clone .part.0]+0x52 > {code} > After some research online I found that there was a JDK bug filed for the > same kind of crash, see > [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] > This bug was fixed in JDK 17.0.2 which is a newer version than the one that > is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). Note that locally with > the latest OpenJDK (specifically 17.0.4.1) I cannot reproduce this crash. > We should update both JDK 11 and JDK 17 to the latest version in the CI to > see if this will solve the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14223) Update jdk_17_latest to adoptium 17.0.4+1
Matthew de Detrich created KAFKA-14223: -- Summary: Update jdk_17_latest to adoptium 17.0.4+1 Key: KAFKA-14223 URL: https://issues.apache.org/jira/browse/KAFKA-14223 Project: Kafka Issue Type: Task Reporter: Matthew de Detrich The current latest JDK 17 is outdated and Apache Kafka appeared to have hit a bug due to this (see https://issues.apache.org/jira/browse/KAFKA-14221). Would it be possible to update jdk_17_latest to the adoptium 17.0.4+1 release? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14221) Update Apache Kafka JVM version in CI to latest versions
[ https://issues.apache.org/jira/browse/KAFKA-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14221: --- Description: In a recent test (see [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] run the JVM crashed with the following stack trace {code:java} [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java Runtime Environment: [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, pid=6229, tid=7342 [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment (17.0.1+12) (build 17.0.1+12-LTS-39) [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, parallel gc, linux-amd64) [2022-09-12T14:22:22.414Z] # Problematic frame: [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) [clone .part.0]+0x52 {code} After some research online I found that there was a JDK bug filed for the same kind of crash, see [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] This bug was fixed in JDK 17.0.2 which is a newer version than the one that is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). Note that locally with the latest OpenJDK (specifically 17.0.4.1) I cannot reproduce this crash. We should update both JDK 11 and JDK 17 to the latest version in the CI to see if this will solve the problem. 
was: In a recent test (see [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] run the JVM crashed with the following stack trace {code:java} [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java Runtime Environment: [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, pid=6229, tid=7342 [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment (17.0.1+12) (build 17.0.1+12-LTS-39) [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, parallel gc, linux-amd64) [2022-09-12T14:22:22.414Z] # Problematic frame: [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) [clone .part.0]+0x52 {code} After some research online I found that there was a JDK bug filed for the same kind of crash, see [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] This bug was fixed in JDK 17.0.2 which is a newer version than the one that is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). We should update both JDK 11 and JDk 17 to the latest version in the CI to see if this will solve the problem. 
> Update Apache Kafka JVM version in CI to latest versions > > > Key: KAFKA-14221 > URL: https://issues.apache.org/jira/browse/KAFKA-14221 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > In a recent test (see > [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] > run the JVM crashed with the following stack trace > > {code:java} > [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java > Runtime Environment: > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, > pid=6229, tid=7342 > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment > (17.0.1+12) (build 17.0.1+12-LTS-39) > [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM > (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed > class ptrs, parallel gc, linux-amd64) > [2022-09-12T14:22:22.414Z] # Problematic frame: > [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] > PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) > [clone .part.0]+0x52 > {code} > After some research online I found that there was a JDK bug filed for the > same kind of crash, see > [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] > This bug was fixed in JDK 17.0.2 which is a newer version than the one that > is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). Note that locally with > the latest OpenJDK (specifically 17.0.4.1) I cannot reproduce this crash. > We should update both JDK 11 and JDk 17 to the latest version in the CI to > see if this will solve the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14221) Update Apache Kafka JVM version in CI to latest versions
[ https://issues.apache.org/jira/browse/KAFKA-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17603151#comment-17603151 ] Matthew de Detrich commented on KAFKA-14221: [~ijuma] Maybe you have some idea on how to update the JDK on the CI runner? > Update Apache Kafka JVM version in CI to latest versions > > > Key: KAFKA-14221 > URL: https://issues.apache.org/jira/browse/KAFKA-14221 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > In a recent test (see > [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] > run the JVM crashed with the following stack trace > > {code:java} > [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java > Runtime Environment: > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, > pid=6229, tid=7342 > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment > (17.0.1+12) (build 17.0.1+12-LTS-39) > [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM > (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed > class ptrs, parallel gc, linux-amd64) > [2022-09-12T14:22:22.414Z] # Problematic frame: > [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] > PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) > [clone .part.0]+0x52 > {code} > After some research online I found that there was a JDK bug filed for the > same kind of crash, see > [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] > This bug was fixed in JDK 17.0.2 which is a newer version than the one that > is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). > We should update both JDK 11 and JDk 17 to the latest version in the CI to > see if this will solve the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14221) Update Apache Kafka JVM version in CI to latest versions
[ https://issues.apache.org/jira/browse/KAFKA-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14221: --- Description: In a recent test run (see [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16]) the JVM crashed with the following stack trace: {code:java} [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java Runtime Environment: [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, pid=6229, tid=7342 [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment (17.0.1+12) (build 17.0.1+12-LTS-39) [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, parallel gc, linux-amd64) [2022-09-12T14:22:22.414Z] # Problematic frame: [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) [clone .part.0]+0x52 {code} After some research online I found a JDK bug filed for the same kind of crash; see [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886]. This bug was fixed in JDK 17.0.2, which is newer than the version run in the Apache Kafka CI (17.0.1+12-LTS-39). We should update both JDK 11 and JDK 17 to the latest versions in the CI to see if this solves the problem. 
was: In a recent test (see [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] run the JVM crashed with the following stack trace [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java Runtime Environment: [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, pid=6229, tid=7342 [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment (17.0.1+12) (build 17.0.1+12-LTS-39) [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, parallel gc, linux-amd64) [2022-09-12T14:22:22.414Z] # Problematic frame: [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) [clone .part.0]+0x52 After some research online I found that there was a JDK bug filed for the same kind of crash, see [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] This bug was fixed in JDK 17.0.2 which is a newer version than the one that is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). We should update both JDK 11 and JDk 17 to the latest version in the CI to see if this will solve the problem. 
> Update Apache Kafka JVM version in CI to latest versions > > > Key: KAFKA-14221 > URL: https://issues.apache.org/jira/browse/KAFKA-14221 > Project: Kafka > Issue Type: Task >Reporter: Matthew de Detrich >Priority: Major > > In a recent test (see > [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16)] > run the JVM crashed with the following stack trace > > {code:java} > [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java > Runtime Environment: > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, > pid=6229, tid=7342 > [2022-09-12T14:22:22.414Z] # > [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment > (17.0.1+12) (build 17.0.1+12-LTS-39) > [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM > (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed > class ptrs, parallel gc, linux-amd64) > [2022-09-12T14:22:22.414Z] # Problematic frame: > [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] > PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) > [clone .part.0]+0x52 > {code} > After some research online I found that there was a JDK bug filed for the > same kind of crash, see > [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886] > This bug was fixed in JDK 17.0.2 which is a newer version than the one that > is run in Apache Kafka CI (which is 17.0.1+12-LTS-39). > We should update both JDK 11 and JDk 17 to the latest version in the CI to > see if this will solve the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14221) Update Apache Kafka JVM version in CI to latest versions
Matthew de Detrich created KAFKA-14221: -- Summary: Update Apache Kafka JVM version in CI to latest versions Key: KAFKA-14221 URL: https://issues.apache.org/jira/browse/KAFKA-14221 Project: Kafka Issue Type: Task Reporter: Matthew de Detrich In a recent test run (see [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1215/pipeline/16]) the JVM crashed with the following stack trace: [2022-09-12T14:22:22.414Z] # A fatal error has been detected by the Java Runtime Environment: [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # SIGSEGV (0xb) at pc=0x7f982a771b12, pid=6229, tid=7342 [2022-09-12T14:22:22.414Z] # [2022-09-12T14:22:22.414Z] # JRE version: Java(TM) SE Runtime Environment (17.0.1+12) (build 17.0.1+12-LTS-39) [2022-09-12T14:22:22.414Z] # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0.1+12-LTS-39, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, parallel gc, linux-amd64) [2022-09-12T14:22:22.414Z] # Problematic frame: [2022-09-12T14:22:22.415Z] # V [libjvm.so+0xcc6b12] PhaseIdealLoop::spinup(Node*, Node*, Node*, Node*, Node*, small_cache*) [clone .part.0]+0x52 After some research online I found a JDK bug filed for the same kind of crash; see [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8270886]. This bug was fixed in JDK 17.0.2, which is newer than the version run in the Apache Kafka CI (17.0.1+12-LTS-39). We should update both JDK 11 and JDK 17 to the latest versions in the CI to see if this solves the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
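The version comparison in the report above (17.0.1+12-LTS-39 predates the 17.0.2 release that carries the fix for JDK-8270886) can be expressed with the standard library. A minimal sketch, not taken from the Kafka code base — JvmVersionCheck and hasLoopOptFix are made-up names — using java.lang.Runtime.Version (available since Java 9):

```java
// Illustrative check that a JDK 17 version string is at or above 17.0.2,
// the release containing the fix for the C2 loop-optimization crash
// (JDK-8270886) referenced above.
public class JvmVersionCheck {
    private static final Runtime.Version FIXED_IN = Runtime.Version.parse("17.0.2");

    /** Returns true if the given JDK 17 version string includes the loop-optimization fix. */
    static boolean hasLoopOptFix(String versionString) {
        Runtime.Version v = Runtime.Version.parse(versionString);
        // feature() is the major version; compareTo() orders full version numbers.
        return v.feature() == 17 && v.compareTo(FIXED_IN) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(hasLoopOptFix("17.0.1+12")); // the crashing CI build -> false
        System.out.println(hasLoopOptFix("17.0.4.1"));  // the build that did not reproduce -> true
    }
}
```

Runtime.Version.parse accepts build metadata such as "+12", so the exact CI version strings can be compared without manual string splitting.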
[jira] [Updated] (KAFKA-14133) Remaining EasyMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-14133: --- Description: {color:#de350b}There are tests which use both PowerMock and EasyMock. I have put those in https://issues.apache.org/jira/browse/KAFKA-14132. Tests here rely solely on EasyMock.{color} Unless stated in brackets the tests are in the streams module. A list of tests which still require to be moved from EasyMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#ff8b00}In Review{color} # WorkerConnectorTest (connect) (owner: [~yash.mayya] ) # WorkerCoordinatorTest (connect) (owner: [~yash.mayya] ) # RootResourceTest (connect) (owner: [~yash.mayya] ) # ByteArrayProducerRecordEquals (connect) (owner: [~yash.mayya] ) # {color:#ff8b00}KStreamFlatTransformTest{color} (owner: Christo) # {color:#ff8b00}KStreamFlatTransformValuesTest{color} (owner: Christo) # {color:#ff8b00}KStreamPrintTest{color} (owner: Christo) # {color:#ff8b00}KStreamRepartitionTest{color} (owner: Christo) # {color:#ff8b00}MaterializedInternalTest{color} (owner: Christo) # {color:#ff8b00}TransformerSupplierAdapterTest{color} (owner: Christo) # {color:#ff8b00}KTableSuppressProcessorMetricsTest{color} (owner: Christo) # {color:#ff8b00}ClientUtilsTest{color} (owner: Christo) # {color:#ff8b00}HighAvailabilityStreamsPartitionAssignorTest{color} (owner: Christo) # {color:#ff8b00}StreamsRebalanceListenerTest{color} (owner: Christo) # {color:#ff8b00}TimestampedKeyValueStoreMaterializerTest{color} (owner: Christo) # {color:#ff8b00}CachingInMemoryKeyValueStoreTest{color} (owner: Christo) # {color:#ff8b00}CachingInMemorySessionStoreTest{color} (owner: Christo) # {color:#ff8b00}CachingPersistentSessionStoreTest{color} (owner: Christo) # {color:#ff8b00}CachingPersistentWindowStoreTest{color} (owner: Christo) # 
{color:#ff8b00}ChangeLoggingKeyValueBytesStoreTest{color} (owner: Christo) # {color:#ff8b00}ChangeLoggingTimestampedKeyValueBytesStoreTest{color} (owner: Christo) # {color:#ff8b00}CompositeReadOnlyWindowStoreTest{color} (owner: Christo) # {color:#ff8b00}KeyValueStoreBuilderTest{color} (owner: Christo) # {color:#ff8b00}RocksDBStoreTest{color} (owner: Christo) # {color:#ff8b00}StreamThreadStateStoreProviderTest{color} (owner: Christo) # TopologyTest (owner: Christo) # KTableSuppressProcessorTest (owner: Christo) # InternalTopicManagerTest (owner: Christo) # ProcessorContextImplTest (owner: Christo) # ProcessorStateManagerTest ({*}WIP{*} owner: Matthew) # StandbyTaskTest ({*}WIP{*} owner: Matthew) # StoreChangelogReaderTest ({*}WIP{*} owner: Matthew) # StreamTaskTest ({*}WIP{*} owner: Matthew) # StreamThreadTest ({*}WIP{*} owner: Matthew) # StreamsAssignmentScaleTest ({*}WIP{*} owner: Christo) # StreamsPartitionAssignorTest ({*}WIP{*} owner: Christo) # TaskManagerTest # WriteConsistencyVectorTest # AssignmentTestUtils ({*}WIP{*} owner: Christo) # StreamsMetricsImplTest # ChangeLoggingSessionBytesStoreTest # ChangeLoggingTimestampedWindowBytesStoreTest # ChangeLoggingWindowBytesStoreTest # MeteredTimestampedWindowStoreTest # TimeOrderedCachingPersistentWindowStoreTest # TimeOrderedWindowStoreTest *The coverage report for the above tests after the change should be >= to what the coverage is now.* was: {color:#de350b}There are tests which use both PowerMock and EasyMock. I have put those in https://issues.apache.org/jira/browse/KAFKA-14132. Tests here rely solely on EasyMock.{color} Unless stated in brackets the tests are in the streams module. 
A list of tests which still require to be moved from EasyMock to Mockito as of 2nd of August 2022 which do not have a Jira issue and do not have pull requests I am aware of which are opened: {color:#ff8b00}In Review{color} # WorkerConnectorTest (connect) (owner: [~yash.mayya] ) # WorkerCoordinatorTest (connect) (owner: [~yash.mayya] ) # RootResourceTest (connect) (owner: [~yash.mayya] ) # ByteArrayProducerRecordEquals (connect) (owner: [~yash.mayya] ) # {color:#ff8b00}KStreamFlatTransformTest{color} (owner: Christo) # {color:#ff8b00}KStreamFlatTransformValuesTest{color} (owner: Christo) # {color:#ff8b00}KStreamPrintTest{color} (owner: Christo) # {color:#ff8b00}KStreamRepartitionTest{color} (owner: Christo) # {color:#ff8b00}MaterializedInternalTest{color} (owner: Christo) # {color:#ff8b00}TransformerSupplierAdapterTest{color} (owner: Christo) # {color:#ff8b00}KTableSuppressProcessorMetricsTest{color} (owner: Christo) # {color:#ff8b00}ClientUtilsTest{color} (owner: Christo) # {color:#ff8b00}HighAvailabilityStreamsPartitionAssignorTest{color} (owner: Christo) # {color:#ff8b00}StreamsRebalanceListenerTest{color} (owner: Christo) # {color:#ff8b00}TimestampedKeyValueStoreMaterializerTest{color} (owner: Christo) #
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17573211#comment-17573211 ] Matthew de Detrich edited comment on KAFKA-14014 at 8/9/22 12:58 PM: - In case people want to reproduce the flakiness, assuming you have a working Docker installation you can do the following {code:java} docker run -it --cpus=2 --rm -u gradle -v "$PWD":/home/gradle/project -w /home/gradle/project gradle sh{code} where cpus=2 is how you toggle how many CPUs you want (there is a trade-off between how often the flakiness occurs and how fast the test runs). I wouldn't recommend going lower than cpus=2, otherwise even building the Kafka project in Gradle can take ages. The above command will put you into a shell, at which point you can do {code:java} while [ $? -eq 0 ]; do ./gradlew :streams:test --tests org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets; done{code} This will re-run the test until there is a failure. Note that because Gradle caches test runs, you need to do something like [https://github.com/gradle/gradle/issues/9151#issue-434212465] or put {code:java} test.outputs.upToDateWhen {false}{code} in your build script in order to force Gradle to re-run the test every time. was (Author: mdedetrich-aiven): In case people want to reproduce the flakiness, assuming you have a working docker installation you can do the following {code:java} docker run -it --cpus=2 --rm -u gradle -v "$PWD":/home/gradle/project -w /home/gradle/project gradle sh{code} where cpus=2 is how you can toggle how many cpus you want (there is a tradeoff between higher occurrence to encounter the flakiness vs how fast the test runs). I wouldn't recommend doing lower than cpus=2 otherwise even building the kafak project in gradle can take ages. The above command will put you into a shell at which point you can do {code:java} while [ $?
-eq 0 ]; do ./gradlew :streams:test --tests org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets; done{code} Which will re-run the tests until there is a failure. Note that due to how Gradle cache's test runs, you need to do something like [https://github.com/gradle/gradle/issues/9151#issue-434212465] in order to force gradle to re-run the test every time. > Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) >
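The shell loop in the comment above (`while [ $? -eq 0 ]; do ...; done`) can also be expressed in-process when iterating on a flaky test body. A minimal sketch under the assumption that the flaky body is callable as a Runnable — RunUntilFailure and its parameters are illustrative names, not Kafka utilities:

```java
// Plain-Java version of the "re-run until it fails" loop used to reproduce
// the flaky test, reporting which iteration first failed.
public class RunUntilFailure {

    /** Runs body repeatedly; returns the 1-based iteration that threw, or -1 if none failed within limit. */
    static int runUntilFailure(Runnable body, int limit) {
        for (int i = 1; i <= limit; i++) {
            try {
                body.run();
            } catch (Throwable t) {
                return i; // first failing iteration: the flakiness reproduced
            }
        }
        return -1; // never failed within the budget
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in for the flaky assertion: fails on its third invocation.
        int failedAt = runUntilFailure(() -> {
            if (++calls[0] == 3) throw new AssertionError("reproduced");
        }, 10);
        System.out.println(failedAt); // 3
    }
}
```

Unlike the shell loop, this avoids Gradle's test-run caching entirely, since the body is invoked directly rather than through the `test` task.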
[jira] [Commented] (KAFKA-14132) Remaining PowerMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575616#comment-17575616 ] Matthew de Detrich commented on KAFKA-14132: Ah misunderstood, I will review them to help out. > Remaining PowerMock to Mockito tests > > > Key: KAFKA-14132 > URL: https://issues.apache.org/jira/browse/KAFKA-14132 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > > {color:#DE350B}Some of the tests below use EasyMock as well. For those > migrate both PowerMock and EasyMock to Mockito.{color} > Unless stated in brackets the tests are in the connect module. > A list of tests which still require to be moved from PowerMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > # ErrorHandlingTaskTest (owner: Divij) > # SourceTaskOffsetCommiterTest (owner: Divij) > # WorkerMetricsGroupTest (owner: Divij) > # WorkerSinkTaskTest (owner: Divij) > # WorkerSinkTaskThreadedTest (owner: Divij) > # WorkerTaskTest > # ErrorReporterTest > # RetryWithToleranceOperatorTest > # WorkerErrantRecordReporterTest > # ConnectorsResourceTest > # StandaloneHerderTest > # KafkaConfigBackingStoreTest > # KafkaOffsetBackingStoreTest (owner: Christo) > (https://github.com/apache/kafka/pull/12418) > # KafkaBasedLogTest > # RetryUtilTest > # RepartitionTopicTest (streams) (owner: Christo) > # StateManagerUtilTest (streams) (owner: Christo) > *The coverage report for the above tests after the change should be >= to > what the coverage is now.* -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14133) Remaining EasyMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575617#comment-17575617 ] Matthew de Detrich commented on KAFKA-14133: No worries, I will review them to help out. > Remaining EasyMock to Mockito tests > --- > > Key: KAFKA-14133 > URL: https://issues.apache.org/jira/browse/KAFKA-14133 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > > {color:#DE350B}There are tests which use both PowerMock and EasyMock. I have > put those in https://issues.apache.org/jira/browse/KAFKA-14132. Tests here > rely solely on EasyMock.{color} > Unless stated in brackets the tests are in the streams module. > A list of tests which still require to be moved from EasyMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > # WorkerConnectorTest (connect) (owner: Yash) > # WorkerCoordinatorTest (connect) > # RootResourceTest (connect) > # ByteArrayProducerRecordEquals (connect) > # TopologyTest > # KStreamFlatTransformTest > # KStreamFlatTransformValuesTest > # KStreamPrintTest > # KStreamRepartitionTest > # MaterializedInternalTest > # TransformerSupplierAdapterTest > # KTableSuppressProcessorMetricsTest > # KTableSuppressProcessorTest > # ClientUtilsTest > # HighAvailabilityStreamsPartitionAssignorTest > # InternalTopicManagerTest > # ProcessorContextImplTest > # ProcessorStateManagerTest > # StandbyTaskTest > # StoreChangelogReaderTest > # StreamTaskTest > # StreamThreadTest > # StreamsAssignmentScaleTest > # StreamsPartitionAssignorTest > # StreamsRebalanceListenerTest > # TaskManagerTest > # TimestampedKeyValueStoreMaterializerTest > # WriteConsistencyVectorTest > # AssignmentTestUtils > # StreamsMetricsImplTest > # CachingInMemoryKeyValueStoreTest > # CachingInMemorySessionStoreTest > # CachingPersistentSessionStoreTest > # CachingPersistentWindowStoreTest > # 
ChangeLoggingKeyValueBytesStoreTest > # ChangeLoggingSessionBytesStoreTest > # ChangeLoggingTimestampedKeyValueBytesStoreTest > # ChangeLoggingTimestampedWindowBytesStoreTest > # ChangeLoggingWindowBytesStoreTest > # CompositeReadOnlyWindowStoreTest > # KeyValueStoreBuilderTest > # MeteredTimestampedWindowStoreTest > # RocksDBStoreTest > # StreamThreadStateStoreProviderTest > # TimeOrderedCachingPersistentWindowStoreTest > # TimeOrderedWindowStoreTest > *The coverage report for the above tests after the change should be >= to > what the coverage is now.* -- This message was sent by Atlassian Jira (v8.20.10#820010)
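For the migrations tracked in this list, the mechanical shape of an EasyMock-to-Mockito conversion is roughly as follows. A hedged sketch with a made-up Greeter interface (not a class from the Kafka code base), showing EasyMock's record/replay/verify lifecycle replaced by Mockito's independent stubbing and verification:

```java
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

// Hypothetical collaborator standing in for whatever the test under migration mocks.
interface Greeter { String greet(String name); }

public class MockitoMigrationSketch {
    public static void main(String[] args) {
        // EasyMock style being removed (record/replay/verify lifecycle):
        //   Greeter g = EasyMock.createMock(Greeter.class);
        //   EasyMock.expect(g.greet("kafka")).andReturn("hello kafka");
        //   EasyMock.replay(g);
        //   ... exercise the code under test ...
        //   EasyMock.verify(g);

        // Mockito equivalent: stubbing and verification are separate,
        // explicit calls with no replay() step in between.
        Greeter g = mock(Greeter.class);
        when(g.greet("kafka")).thenReturn("hello kafka");

        System.out.println(g.greet("kafka")); // prints "hello kafka"
        verify(g).greet("kafka");
    }
}
```

Because Mockito mocks are lenient by default (unstubbed calls return null/zero rather than failing), coverage should be re-checked after each migration, which is presumably why the list insists the coverage report stay >= the current level.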
[jira] [Commented] (KAFKA-14133) Remaining EasyMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575435#comment-17575435 ] Matthew de Detrich commented on KAFKA-14133: [~divijvaidya] [~christo_lolov] As per the email on the mailing list, apparently this ticket is looking for an owner and I am happy to take it up. Everyone okay with that? > Remaining EasyMock to Mockito tests > --- > > Key: KAFKA-14133 > URL: https://issues.apache.org/jira/browse/KAFKA-14133 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > > {color:#DE350B}There are tests which use both PowerMock and EasyMock. I have > put those in https://issues.apache.org/jira/browse/KAFKA-14132. Tests here > rely solely on EasyMock.{color} > Unless stated in brackets the tests are in the streams module. > A list of tests which still require to be moved from EasyMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > # WorkerConnectorTest (connect) (owner: Yash) > # WorkerCoordinatorTest (connect) > # RootResourceTest (connect) > # ByteArrayProducerRecordEquals (connect) > # TopologyTest > # KStreamFlatTransformTest > # KStreamFlatTransformValuesTest > # KStreamPrintTest > # KStreamRepartitionTest > # MaterializedInternalTest > # TransformerSupplierAdapterTest > # KTableSuppressProcessorMetricsTest > # KTableSuppressProcessorTest > # ClientUtilsTest > # HighAvailabilityStreamsPartitionAssignorTest > # InternalTopicManagerTest > # ProcessorContextImplTest > # ProcessorStateManagerTest > # StandbyTaskTest > # StoreChangelogReaderTest > # StreamTaskTest > # StreamThreadTest > # StreamsAssignmentScaleTest > # StreamsPartitionAssignorTest > # StreamsRebalanceListenerTest > # TaskManagerTest > # TimestampedKeyValueStoreMaterializerTest > # WriteConsistencyVectorTest > # AssignmentTestUtils > # StreamsMetricsImplTest > # CachingInMemoryKeyValueStoreTest 
> # CachingInMemorySessionStoreTest > # CachingPersistentSessionStoreTest > # CachingPersistentWindowStoreTest > # ChangeLoggingKeyValueBytesStoreTest > # ChangeLoggingSessionBytesStoreTest > # ChangeLoggingTimestampedKeyValueBytesStoreTest > # ChangeLoggingTimestampedWindowBytesStoreTest > # ChangeLoggingWindowBytesStoreTest > # CompositeReadOnlyWindowStoreTest > # KeyValueStoreBuilderTest > # MeteredTimestampedWindowStoreTest > # RocksDBStoreTest > # StreamThreadStateStoreProviderTest > # TimeOrderedCachingPersistentWindowStoreTest > # TimeOrderedWindowStoreTest > *The coverage report for the above tests after the change should be >= to > what the coverage is now.* -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14132) Remaining PowerMock to Mockito tests
[ https://issues.apache.org/jira/browse/KAFKA-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575434#comment-17575434 ] Matthew de Detrich commented on KAFKA-14132: [~divijvaidya] [~christo_lolov] As per the email on the mailing list, apparently this ticket is looking for an owner and I am happy to take it up. Everyone okay with that? > Remaining PowerMock to Mockito tests > > > Key: KAFKA-14132 > URL: https://issues.apache.org/jira/browse/KAFKA-14132 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > > {color:#DE350B}Some of the tests below use EasyMock as well. For those > migrate both PowerMock and EasyMock to Mockito.{color} > Unless stated in brackets the tests are in the connect module. > A list of tests which still require to be moved from PowerMock to Mockito as > of 2nd of August 2022 which do not have a Jira issue and do not have pull > requests I am aware of which are opened: > # ErrorHandlingTaskTest (owner: Divij) > # SourceTaskOffsetCommiterTest (owner: Divij) > # WorkerMetricsGroupTest (owner: Divij) > # WorkerSinkTaskTest (owner: Divij) > # WorkerSinkTaskThreadedTest (owner: Divij) > # WorkerTaskTest > # ErrorReporterTest > # RetryWithToleranceOperatorTest > # WorkerErrantRecordReporterTest > # ConnectorsResourceTest > # StandaloneHerderTest > # KafkaConfigBackingStoreTest > # KafkaOffsetBackingStoreTest (owner: Christo) > (https://github.com/apache/kafka/pull/12418) > # KafkaBasedLogTest > # RetryUtilTest > # RepartitionTopicTest (streams) (owner: Christo) > # StateManagerUtilTest (streams) (owner: Christo) > *The coverage report for the above tests after the change should be >= to > what the coverage is now.* -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-13530) Flaky test ReplicaManagerTest
[ https://issues.apache.org/jira/browse/KAFKA-13530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17574920#comment-17574920 ] Matthew de Detrich commented on KAFKA-13530: So I ran the tests inside a docker container to simulate limited resources and couldn't replicate the flakiness > Flaky test ReplicaManagerTest > - > > Key: KAFKA-13530 > URL: https://issues.apache.org/jira/browse/KAFKA-13530 > Project: Kafka > Issue Type: Test > Components: core, unit tests >Reporter: Matthias J. Sax >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > kafka.server.ReplicaManagerTest.[1] usesTopicIds=true > {quote}org.opentest4j.AssertionFailedError: expected: but was: > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:35) at > org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:162) at > kafka.server.ReplicaManagerTest.assertFetcherHasTopicId(ReplicaManagerTest.scala:3502) > at > kafka.server.ReplicaManagerTest.testReplicaAlterLogDirsWithAndWithoutIds(ReplicaManagerTest.scala:3572){quote} > STDOUT > {quote}[2021-12-07 16:19:35,906] ERROR Error while reading checkpoint file > /tmp/kafka-6310287969113820536/replication-offset-checkpoint > (kafka.server.LogDirFailureChannel:76) java.nio.file.NoSuchFileException: > /tmp/kafka-6310287969113820536/replication-offset-checkpoint at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) at > sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) > at java.nio.file.Files.newByteChannel(Files.java:361) at > java.nio.file.Files.newByteChannel(Files.java:407) at > java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384) > 
at java.nio.file.Files.newInputStream(Files.java:152) at > java.nio.file.Files.newBufferedReader(Files.java:2784) at > java.nio.file.Files.newBufferedReader(Files.java:2816) at > org.apache.kafka.server.common.CheckpointFile.read(CheckpointFile.java:104) > at > kafka.server.checkpoints.CheckpointFileWithFailureHandler.read(CheckpointFileWithFailureHandler.scala:48) > at > kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:70) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets$lzycompute(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.fetch(OffsetCheckpointFile.scala:97) > at > kafka.server.checkpoints.LazyOffsetCheckpoints.fetch(OffsetCheckpointFile.scala:89) > at kafka.cluster.Partition.updateHighWatermark$1(Partition.scala:348) at > kafka.cluster.Partition.createLog(Partition.scala:361) at > kafka.cluster.Partition.maybeCreate$1(Partition.scala:334) at > kafka.cluster.Partition.createLogIfNotExists(Partition.scala:341) at > kafka.cluster.Partition.$anonfun$makeLeader$1(Partition.scala:546) at > kafka.cluster.Partition.makeLeader(Partition.scala:530) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$3(ReplicaManager.scala:2163) > at scala.Option.foreach(Option.scala:437) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2(ReplicaManager.scala:2160) > at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2$adapted(ReplicaManager.scala:2159) > at > kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355) > at > 
scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309) > at > kafka.server.ReplicaManager.applyLocalLeadersDelta(ReplicaManager.scala:2159) > at kafka.server.ReplicaManager.applyDelta(ReplicaManager.scala:2136) at > kafka.server.ReplicaManagerTest.testDeltaToLeaderOrFollowerMarksPartitionOfflineIfLogCantBeCreated(ReplicaManagerTest.scala:3349){quote} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-13513) Flaky test AdjustStreamThreadCountTest
[ https://issues.apache.org/jira/browse/KAFKA-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-13513: -- Assignee: Matthew de Detrich > Flaky test AdjustStreamThreadCountTest > -- > > Key: KAFKA-13513 > URL: https://issues.apache.org/jira/browse/KAFKA-13513 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > org.apache.kafka.streams.integration.AdjustStreamThreadCountTest.testConcurrentlyAccessThreads > {quote}java.lang.AssertionError: expected null, but > was: at > org.junit.Assert.fail(Assert.java:89) at > org.junit.Assert.failNotNull(Assert.java:756) at > org.junit.Assert.assertNull(Assert.java:738) at > org.junit.Assert.assertNull(Assert.java:748) at > org.apache.kafka.streams.integration.AdjustStreamThreadCountTest.testConcurrentlyAccessThreads(AdjustStreamThreadCountTest.java:367) > {quote} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-14014: -- Assignee: Matthew de Detrich > Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_17_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-13530) Flaky test ReplicaManagerTest
[ https://issues.apache.org/jira/browse/KAFKA-13530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17574099#comment-17574099 ] Matthew de Detrich commented on KAFKA-13530: Okay, so I have done some testing with docker to see if I can reproduce the flakiness under constrained resources, and I couldn't reproduce it, although progress on this is incredibly slow (for some reason, running tests under the Kafka core gradle project takes a *really* long time per run; it spends ~5-10 minutes in the configuring stage before even starting to compile the sources). [~mjsax] When was the last time you experienced this flaky test? > Flaky test ReplicaManagerTest > - > > Key: KAFKA-13530 > URL: https://issues.apache.org/jira/browse/KAFKA-13530 > Project: Kafka > Issue Type: Test > Components: core, unit tests >Reporter: Matthias J. Sax >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > kafka.server.ReplicaManagerTest.[1] usesTopicIds=true > {quote}org.opentest4j.AssertionFailedError: expected: but was: > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:35) at > org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:162) at > kafka.server.ReplicaManagerTest.assertFetcherHasTopicId(ReplicaManagerTest.scala:3502) > at > kafka.server.ReplicaManagerTest.testReplicaAlterLogDirsWithAndWithoutIds(ReplicaManagerTest.scala:3572){quote} > STDOUT > {quote}[2021-12-07 16:19:35,906] ERROR Error while reading checkpoint file > /tmp/kafka-6310287969113820536/replication-offset-checkpoint > (kafka.server.LogDirFailureChannel:76) java.nio.file.NoSuchFileException: > /tmp/kafka-6310287969113820536/replication-offset-checkpoint at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at 
> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) at > sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) > at java.nio.file.Files.newByteChannel(Files.java:361) at > java.nio.file.Files.newByteChannel(Files.java:407) at > java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384) > at java.nio.file.Files.newInputStream(Files.java:152) at > java.nio.file.Files.newBufferedReader(Files.java:2784) at > java.nio.file.Files.newBufferedReader(Files.java:2816) at > org.apache.kafka.server.common.CheckpointFile.read(CheckpointFile.java:104) > at > kafka.server.checkpoints.CheckpointFileWithFailureHandler.read(CheckpointFileWithFailureHandler.scala:48) > at > kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:70) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets$lzycompute(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.fetch(OffsetCheckpointFile.scala:97) > at > kafka.server.checkpoints.LazyOffsetCheckpoints.fetch(OffsetCheckpointFile.scala:89) > at kafka.cluster.Partition.updateHighWatermark$1(Partition.scala:348) at > kafka.cluster.Partition.createLog(Partition.scala:361) at > kafka.cluster.Partition.maybeCreate$1(Partition.scala:334) at > kafka.cluster.Partition.createLogIfNotExists(Partition.scala:341) at > kafka.cluster.Partition.$anonfun$makeLeader$1(Partition.scala:546) at > kafka.cluster.Partition.makeLeader(Partition.scala:530) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$3(ReplicaManager.scala:2163) > at scala.Option.foreach(Option.scala:437) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2(ReplicaManager.scala:2160) > at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2$adapted(ReplicaManager.scala:2159) > at > 
kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355) > at > scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309) > at > kafka.server.ReplicaManager.applyLocalLeadersDelta(ReplicaManager.scala:2159) > at kafka.server.ReplicaManager.applyDelta(ReplicaManager.scala:2136) at > kafka.server.ReplicaManagerTest.testDeltaToLeaderOrFollowerMarksPartitionOfflineIfLogCantBeCreated(ReplicaManagerTest.scala:3349){quote} -- This message was sent by Atlassian Jira
[jira] [Assigned] (KAFKA-13530) Flaky test ReplicaManagerTest
[ https://issues.apache.org/jira/browse/KAFKA-13530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-13530: -- Assignee: Matthew de Detrich > Flaky test ReplicaManagerTest > - > > Key: KAFKA-13530 > URL: https://issues.apache.org/jira/browse/KAFKA-13530 > Project: Kafka > Issue Type: Test > Components: core, unit tests >Reporter: Matthias J. Sax >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > kafka.server.ReplicaManagerTest.[1] usesTopicIds=true > {quote}org.opentest4j.AssertionFailedError: expected: but was: > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:35) at > org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:162) at > kafka.server.ReplicaManagerTest.assertFetcherHasTopicId(ReplicaManagerTest.scala:3502) > at > kafka.server.ReplicaManagerTest.testReplicaAlterLogDirsWithAndWithoutIds(ReplicaManagerTest.scala:3572){quote} > STDOUT > {quote}[2021-12-07 16:19:35,906] ERROR Error while reading checkpoint file > /tmp/kafka-6310287969113820536/replication-offset-checkpoint > (kafka.server.LogDirFailureChannel:76) java.nio.file.NoSuchFileException: > /tmp/kafka-6310287969113820536/replication-offset-checkpoint at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) at > sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) > at java.nio.file.Files.newByteChannel(Files.java:361) at > java.nio.file.Files.newByteChannel(Files.java:407) at > java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384) > at java.nio.file.Files.newInputStream(Files.java:152) at > 
java.nio.file.Files.newBufferedReader(Files.java:2784) at > java.nio.file.Files.newBufferedReader(Files.java:2816) at > org.apache.kafka.server.common.CheckpointFile.read(CheckpointFile.java:104) > at > kafka.server.checkpoints.CheckpointFileWithFailureHandler.read(CheckpointFileWithFailureHandler.scala:48) > at > kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:70) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets$lzycompute(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.fetch(OffsetCheckpointFile.scala:97) > at > kafka.server.checkpoints.LazyOffsetCheckpoints.fetch(OffsetCheckpointFile.scala:89) > at kafka.cluster.Partition.updateHighWatermark$1(Partition.scala:348) at > kafka.cluster.Partition.createLog(Partition.scala:361) at > kafka.cluster.Partition.maybeCreate$1(Partition.scala:334) at > kafka.cluster.Partition.createLogIfNotExists(Partition.scala:341) at > kafka.cluster.Partition.$anonfun$makeLeader$1(Partition.scala:546) at > kafka.cluster.Partition.makeLeader(Partition.scala:530) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$3(ReplicaManager.scala:2163) > at scala.Option.foreach(Option.scala:437) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2(ReplicaManager.scala:2160) > at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2$adapted(ReplicaManager.scala:2159) > at > kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355) > at > scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309) > at > 
kafka.server.ReplicaManager.applyLocalLeadersDelta(ReplicaManager.scala:2159) > at kafka.server.ReplicaManager.applyDelta(ReplicaManager.scala:2136) at > kafka.server.ReplicaManagerTest.testDeltaToLeaderOrFollowerMarksPartitionOfflineIfLogCantBeCreated(ReplicaManagerTest.scala:3349){quote} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-13530) Flaky test ReplicaManagerTest
[ https://issues.apache.org/jira/browse/KAFKA-13530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17573282#comment-17573282 ] Matthew de Detrich commented on KAFKA-13530: So I tried to simulate this test failure under normal circumstances and couldn't do so. I will retry the tests using docker to limit the available resources, to see if I can make the test fail under those circumstances. > Flaky test ReplicaManagerTest > - > > Key: KAFKA-13530 > URL: https://issues.apache.org/jira/browse/KAFKA-13530 > Project: Kafka > Issue Type: Test > Components: core, unit tests >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > > kafka.server.ReplicaManagerTest.[1] usesTopicIds=true > {quote}org.opentest4j.AssertionFailedError: expected: but was: > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:35) at > org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:162) at > kafka.server.ReplicaManagerTest.assertFetcherHasTopicId(ReplicaManagerTest.scala:3502) > at > kafka.server.ReplicaManagerTest.testReplicaAlterLogDirsWithAndWithoutIds(ReplicaManagerTest.scala:3572){quote} > STDOUT > {quote}[2021-12-07 16:19:35,906] ERROR Error while reading checkpoint file > /tmp/kafka-6310287969113820536/replication-offset-checkpoint > (kafka.server.LogDirFailureChannel:76) java.nio.file.NoSuchFileException: > /tmp/kafka-6310287969113820536/replication-offset-checkpoint at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) at > sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) > at java.nio.file.Files.newByteChannel(Files.java:361) at > java.nio.file.Files.newByteChannel(Files.java:407) at > 
java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384) > at java.nio.file.Files.newInputStream(Files.java:152) at > java.nio.file.Files.newBufferedReader(Files.java:2784) at > java.nio.file.Files.newBufferedReader(Files.java:2816) at > org.apache.kafka.server.common.CheckpointFile.read(CheckpointFile.java:104) > at > kafka.server.checkpoints.CheckpointFileWithFailureHandler.read(CheckpointFileWithFailureHandler.scala:48) > at > kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:70) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets$lzycompute(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.offsets(OffsetCheckpointFile.scala:94) > at > kafka.server.checkpoints.LazyOffsetCheckpointMap.fetch(OffsetCheckpointFile.scala:97) > at > kafka.server.checkpoints.LazyOffsetCheckpoints.fetch(OffsetCheckpointFile.scala:89) > at kafka.cluster.Partition.updateHighWatermark$1(Partition.scala:348) at > kafka.cluster.Partition.createLog(Partition.scala:361) at > kafka.cluster.Partition.maybeCreate$1(Partition.scala:334) at > kafka.cluster.Partition.createLogIfNotExists(Partition.scala:341) at > kafka.cluster.Partition.$anonfun$makeLeader$1(Partition.scala:546) at > kafka.cluster.Partition.makeLeader(Partition.scala:530) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$3(ReplicaManager.scala:2163) > at scala.Option.foreach(Option.scala:437) at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2(ReplicaManager.scala:2160) > at > kafka.server.ReplicaManager.$anonfun$applyLocalLeadersDelta$2$adapted(ReplicaManager.scala:2159) > at > kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355) > at > 
scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309) > at > kafka.server.ReplicaManager.applyLocalLeadersDelta(ReplicaManager.scala:2159) > at kafka.server.ReplicaManager.applyDelta(ReplicaManager.scala:2136) at > kafka.server.ReplicaManagerTest.testDeltaToLeaderOrFollowerMarksPartitionOfflineIfLogCantBeCreated(ReplicaManagerTest.scala:3349){quote} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17573211#comment-17573211 ] Matthew de Detrich edited comment on KAFKA-14014 at 7/30/22 5:57 AM: - In case people want to reproduce the flakiness, assuming you have a working docker installation, you can do the following {code:java} docker run -it --cpus=2 --rm -u gradle -v "$PWD":/home/gradle/project -w /home/gradle/project gradle sh{code} where cpus=2 is how you toggle how many cpus you want (there is a tradeoff between how often the flakiness occurs and how fast the test runs). I wouldn't recommend going lower than cpus=2, otherwise even building the Kafka project in gradle can take ages. The above command will put you into a shell, at which point you can do {code:java} while [ $? -eq 0 ]; do ./gradlew :streams:test --tests org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets; done{code} which will re-run the tests until there is a failure. Note that due to how Gradle caches test runs, you need to do something like [https://github.com/gradle/gradle/issues/9151#issue-434212465] in order to force gradle to re-run the test every time. was (Author: mdedetrich-aiven): In case people want to reproduce the flakiness, assuming you have a working docker installation, you can do the following {code:java} docker run -it --cpus=2 --rm -u gradle -v "$PWD":/home/gradle/project -w /home/gradle/project gradle sh{code} where cpus=2 is how you toggle how many cpus you want (there is a tradeoff between how often the flakiness occurs and how fast the test runs). I wouldn't recommend going lower than cpus=2, otherwise even building the Kafka project in gradle can take ages. The above command will put you into a shell, at which point you can do {code:java} while [ $? 
-eq 0 ]; do ./gradlew :streams:test --tests org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets; done{code} Which will re-run the tests until there is a failure. > Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} >
[jira] [Commented] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17573211#comment-17573211 ] Matthew de Detrich commented on KAFKA-14014: In case people want to reproduce the flakiness, assuming you have a working docker installation, you can do the following {code:java} docker run -it --cpus=2 --rm -u gradle -v "$PWD":/home/gradle/project -w /home/gradle/project gradle sh{code} where cpus=2 is how you toggle how many cpus you want (there is a tradeoff between how often the flakiness occurs and how fast the test runs). I wouldn't recommend going lower than cpus=2, otherwise even building the Kafka project in gradle can take ages. The above command will put you into a shell, at which point you can do {code:java} while [ $? -eq 0 ]; do ./gradlew :streams:test --tests org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets; done{code} which will re-run the tests until there is a failure. 
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_17_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
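The rerun-until-failure loop from the comment above can be sketched generically. In the sketch below, `run_test` is a hypothetical stand-in for the real `./gradlew :streams:test --tests ...` invocation; it simulates a flaky test that fails on its third run, so the loop's behaviour is visible without a Kafka checkout:

```shell
# Re-run a (possibly flaky) command until it fails, counting attempts.
# run_test is a stand-in for the real gradle test invocation; here it
# simulates a flaky test that passes twice and then fails.
runs=0
run_test() {
  runs=$((runs + 1))
  [ "$runs" -lt 3 ]   # exit status 0 (pass) on runs 1 and 2, non-zero on run 3
}

while run_test; do
  : # test passed; try again
done
echo "test failed after $runs runs"
```

With the real gradle invocation substituted in, remember to defeat Gradle's test-result caching (e.g. via the workaround linked in the comment above), or the loop will just replay cached results instead of re-executing the test.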
[jira] [Commented] (KAFKA-13467) Clients never refresh cached bootstrap IPs
[ https://issues.apache.org/jira/browse/KAFKA-13467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17573144#comment-17573144 ] Matthew de Detrich commented on KAFKA-13467: [~ijuma] are you talking about https://issues.apache.org/jira/browse/KAFKA-13405? If so, it's already closed. > Clients never refresh cached bootstrap IPs > -- > > Key: KAFKA-13467 > URL: https://issues.apache.org/jira/browse/KAFKA-13467 > Project: Kafka > Issue Type: Improvement > Components: clients, network >Reporter: Matthias J. Sax >Priority: Minor > > Follow-up ticket to https://issues.apache.org/jira/browse/KAFKA-13405. > For certain broker rolling upgrade scenarios, it would be beneficial to > expire cached bootstrap server IP addresses and re-resolve those IPs to > allow clients to re-connect to the cluster without the need to restart the > client. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-13467) Clients never refresh cached bootstrap IPs
[ https://issues.apache.org/jira/browse/KAFKA-13467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17572788#comment-17572788 ] Matthew de Detrich edited comment on KAFKA-13467 at 7/29/22 7:18 AM: - I am currently having a look at this issue, although due to the nature of the ticket I am diving a bit in the deep end. [~rsivaram] , [~ijuma] [~mimaison] would you be able to give me some pointers on how best to approach this? Following what [~dengziming] said earlier, I am currently looking at SocketServer, which seems to be the core of where all of the netty connections are initialized, but this may be the wrong place to make the change. Furthermore, I may be missing something, but naively implementing the "broker disconnecting a connection when the client connects to force an IP change" would create an infinite loop unless one checks for a condition (i.e. only on broker upgrade, is this possible) or maybe some other condition? was (Author: mdedetrich-aiven): I am currently having a look at this issue, although due to the nature of the ticket I am diving a bit in the deep end. [~rsivaram] , [~ijuma] [~mimaison] would you be able to give me some pointers on how best to approach this. Following what [~dengziming] said earlier, I am currently looking at SocketServer, which seems to be the core of where all of the netty connections are initialized, but this may be the wrong place to make the change. Furthermore, I may be missing something, but naively implementing the "broker disconnecting a connection when the client connects to force an IP change" would create an infinite loop unless one checks for a condition (i.e. only on broker upgrade, is this possible) or maybe some other condition? > Clients never refresh cached bootstrap IPs > -- > > Key: KAFKA-13467 > URL: https://issues.apache.org/jira/browse/KAFKA-13467 > Project: Kafka > Issue Type: Improvement > Components: clients, network >Reporter: Matthias J. 
Sax >Priority: Minor > > Follow-up ticket to https://issues.apache.org/jira/browse/KAFKA-13405. > For certain broker rolling upgrade scenarios, it would be beneficial to > expire cached bootstrap server IP addresses and re-resolve those IPs to > allow clients to re-connect to the cluster without the need to restart the > client. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-13467) Clients never refresh cached bootstrap IPs
[ https://issues.apache.org/jira/browse/KAFKA-13467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17572788#comment-17572788 ] Matthew de Detrich edited comment on KAFKA-13467 at 7/29/22 7:18 AM: - I am currently having a look at this issue, although due to the nature of the ticket I am diving in a bit at the deep end. [~rsivaram] , [~ijuma] [~mimaison] would you be able to give me some pointers on how best to approach this? Following what [~dengziming] said earlier, I am currently looking at SocketServer, which seems to be the core of where all of the netty connections are initialized, but this may be the wrong place to make the change. Furthermore, I may be missing something, but naively implementing the "broker disconnecting a connection when the client connects to force a cache refresh to get new IPs from the FQDN" would create an infinite loop unless one checks for a condition on that disconnect (i.e. only on broker upgrade, is this possible?) or maybe some other condition? was (Author: mdedetrich-aiven): I am currently having a look at this issue, although due to the nature of the ticket I am diving in a bit at the deep end. [~rsivaram] , [~ijuma] [~mimaison] would you be able to give me some pointers on how best to approach this? Following what [~dengziming] said earlier, I am currently looking at SocketServer, which seems to be the core of where all of the netty connections are initialized, but this may be the wrong place to make the change. Furthermore, I may be missing something, but naively implementing the "broker disconnecting a connection when the client connects to force an IP change" would create an infinite loop unless one checks for a condition (i.e. only on broker upgrade, is this possible?) or maybe some other condition? > Clients never refresh cached bootstrap IPs > -- > > Key: KAFKA-13467 > URL: https://issues.apache.org/jira/browse/KAFKA-13467 > Project: Kafka > Issue Type: Improvement > Components: clients, network >Reporter: Matthias J. 
Sax >Priority: Minor > > Follow up ticket to https://issues.apache.org/jira/browse/KAFKA-13405. > For certain broker rolling upgrade scenarios, it would be beneficial to > expire cached bootstrap server IP addresses and re-resolve those IPs to > allow clients to re-connect to the cluster without the need to restart the > client. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-13467) Clients never refresh cached bootstrap IPs
[ https://issues.apache.org/jira/browse/KAFKA-13467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17572788#comment-17572788 ] Matthew de Detrich commented on KAFKA-13467: I am currently having a look at this issue, although due to the nature of the ticket I am diving in a bit at the deep end. [~rsivaram] , [~ijuma] [~mimaison] would you be able to give me some pointers on how best to approach this. Following what [~dengziming] said earlier, I am currently looking at SocketServer, which seems to be the core of where all of the netty connections are initialized, but this may be the wrong place to make the change. Furthermore, I may be missing something, but naively implementing the "broker disconnecting a connection when the client connects to force an IP change" would create an infinite loop unless one checks for a condition (i.e. only on broker upgrade, is this possible?) or maybe some other condition? > Clients never refresh cached bootstrap IPs > -- > > Key: KAFKA-13467 > URL: https://issues.apache.org/jira/browse/KAFKA-13467 > Project: Kafka > Issue Type: Improvement > Components: clients, network >Reporter: Matthias J. Sax >Priority: Minor > > Follow up ticket to https://issues.apache.org/jira/browse/KAFKA-13405. > For certain broker rolling upgrade scenarios, it would be beneficial to > expire cached bootstrap server IP addresses and re-resolve those IPs to > allow clients to re-connect to the cluster without the need to restart the > client. -- This message was sent by Atlassian Jira (v8.20.10#820010)
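One possible direction for the re-resolution side of this ticket, sketched very loosely: when a connection attempt against a cached bootstrap IP fails, the client could re-resolve the bootstrap hostname through DNS rather than reusing the stale address. This is a minimal illustration only; `BootstrapReresolver` and `resolve` are hypothetical names, not part of Kafka's client internals.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class BootstrapReresolver {
    // Re-resolve a bootstrap hostname to its current IP addresses instead of
    // reusing addresses cached at client construction time.
    public static List<String> resolve(String host) {
        List<String> ips = new ArrayList<>();
        try {
            for (InetAddress addr : InetAddress.getAllByName(host)) {
                ips.add(addr.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // DNS failure: return an empty list so the caller can fall back
            // to previously cached addresses or retry later.
        }
        return ips;
    }

    public static void main(String[] args) {
        // "localhost" always resolves, so the returned list is non-empty.
        System.out.println(!resolve("localhost").isEmpty());
    }
}
```

The open question raised in the comment remains: deciding *when* to trigger such a re-resolution (e.g. only after a broker-initiated disconnect) is exactly the condition that would prevent the infinite-loop scenario.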
[jira] [Created] (KAFKA-14103) Check for hostnames in CoreUtils.scala
Matthew de Detrich created KAFKA-14103: -- Summary: Check for hostnames in CoreUtils.scala Key: KAFKA-14103 URL: https://issues.apache.org/jira/browse/KAFKA-14103 Project: Kafka Issue Type: Improvement Components: core Reporter: Matthew de Detrich In the process of working on [https://github.com/apache/kafka/pull/11478] we realized that Kafka does not do any hostname validation when parsing listener configurations. It would be ideal to investigate hostname validation so that we can eagerly short-circuit on invalid hostnames rather than deferring to the current behaviour (which still needs to be verified). -- This message was sent by Atlassian Jira (v8.20.10#820010)
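As a rough illustration of what eager hostname validation at config-parsing time could look like (a sketch only; `HostnameValidator` is a hypothetical name, not the actual CoreUtils API), the RFC 1123 label rules can be checked with a small regex:

```java
import java.util.regex.Pattern;

public class HostnameValidator {
    // RFC 1123 label: alphanumeric start and end, hyphens allowed inside,
    // at most 63 characters per label.
    private static final Pattern LABEL =
        Pattern.compile("[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?");

    public static boolean isValidHostname(String host) {
        if (host == null || host.isEmpty() || host.length() > 253) {
            return false;
        }
        // Split keeps empty strings for "a..b" or a trailing dot, so those
        // malformed names are rejected by the per-label check below.
        for (String label : host.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidHostname("broker-1.example.com"));
        System.out.println(isValidHostname("-bad.example.com"));
    }
}
```

A check like this would let listener parsing fail fast with a clear error instead of surfacing the problem later at bind or connect time.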
[jira] [Commented] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564158#comment-17564158 ] Matthew de Detrich commented on KAFKA-14014: Okay, so another update: I think my async theory doesn't hold up. After running the above code over a few days, it doesn't appear to be making a difference if you account for the extra time that the test is taking due to the Thread.sleep. I did, however, notice something interesting: there are actually two types of stack traces that you can get, the one mentioned in the ticket and this one {code:java} org.apache.kafka.streams.integration.NamedTopologyIntegrationTest > shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets FAILED java.lang.AssertionError: Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1), KeyValue(C, 2)]> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) at org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:544){code} As you can see, in this case C appears to be counted twice, once with a value of 1 and once with a value of 2. Due to this I suspect that there may be some issue with the underlying stream that only appears under heavy load. 
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_17_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563162#comment-17563162 ] Matthew de Detrich edited comment on KAFKA-14014 at 7/6/22 11:27 AM: - So I think I have gotten somewhere. I tried adjusting various timeouts to no avail, so I moved on to another theory, which is that some topology/stream-based methods are asynchronous, i.e. they are doing something in the background even after the calling function has finished. To test this theory I have adjusted the flaky test by adding some manual Thread.sleep calls to account for this supposed asynchronicity, i.e. {code:java} final AddNamedTopologyResult result1 = streams.addNamedTopology(topology2Client1); Thread.sleep(500); streams2.addNamedTopology(topology2Client2).all().get(); Thread.sleep(500); result1.all().get(); Thread.sleep(500);{code} This appears to be greatly reducing the flakiness. Originally I used 200 millis, which roughly increased the amount of time until failure by 4x; however, I still got a failure, so now I am testing it with the 500 millis as shown above. was (Author: mdedetrich-aiven): So I think I have gotten somewhere. I tried adjusting various timeouts to no avail, so I moved on to another theory, which is that some topology/stream-based methods are asynchronous, i.e. they are doing something in the background even after the calling function has finished. To test this theory I have adjusted the flaky test by adding some manual sleeps, i.e. {code:java} final AddNamedTopologyResult result1 = streams.addNamedTopology(topology2Client1); Thread.sleep(500); streams2.addNamedTopology(topology2Client2).all().get(); Thread.sleep(500); result1.all().get(); Thread.sleep(500);{code} This appears to be greatly reducing the flakiness. Originally I used 200 millis, which roughly increased the amount of time until failure by 4x; however, I still got a failure, so now I am testing it with the 500 millis as shown above. 
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ >
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563004#comment-17563004 ] Matthew de Detrich edited comment on KAFKA-14014 at 7/6/22 11:24 AM: - [~cadonna] So I did some debugging on this ticket over the past week and I found out some interesting things. To start off with, I did manage to predictably replicate the test's flakiness, and it does appear to be related to load, i.e. the test is more flaky the fewer CPU resources it has. I am using docker (i.e. running the tests within the docker gradle image) with the --cpu flag to limit resources. Interestingly, I have gone up to 5 CPUs and it's still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further. Note that 5 CPUs is already considered "high" (at least for a machine that would reasonably run kafka streams). The machine I am running on has 12 CPUs (6 cores, 12 threads), and at least when running with all of the resources on the machine I couldn't replicate the flaky test. was (Author: mdedetrich-aiven): [~cadonna] So I did some debugging on this ticket over the past week and I found out some interesting things. To start off with, I did manage to predictably replicate the test's flakiness, and it does appear to be related to load, i.e. the test is more flaky the fewer CPU resources it has. I am using docker (i.e. running the tests within the docker gradle image) with the --cpu flag to limit resources. Interestingly, I have gone up to 5 CPUs and it's still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further. Note that 5 CPUs is already considered "high" (at least for a machine that would reasonably run kafka streams). 
The machine I am running on has 12 CPUs (6 cores, 12 threads), and at least when running with all of the resources on the machine I couldn't replicate the flaky test. > Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ >
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563162#comment-17563162 ] Matthew de Detrich edited comment on KAFKA-14014 at 7/6/22 11:23 AM: - So I think I have gotten somewhere. I tried adjusting various timeouts to no avail, so I moved on to another theory, which is that some topology/stream-based methods are asynchronous, i.e. they are doing something in the background even after the calling function has finished. To test this theory I have adjusted the flaky test by adding some manual sleeps, i.e. {code:java} final AddNamedTopologyResult result1 = streams.addNamedTopology(topology2Client1); Thread.sleep(500); streams2.addNamedTopology(topology2Client2).all().get(); Thread.sleep(500); result1.all().get(); Thread.sleep(500);{code} This appears to be greatly reducing the flakiness. Originally I used 200 millis, which roughly increased the amount of time until failure by 4x; however, I still got a failure, so now I am testing it with the 500 millis as shown above. was (Author: mdedetrich-aiven): So I think I have gotten somewhere. I tried adjusting various timeouts to no avail, so I moved on to another theory, which is that some topology/stream-based methods are asynchronous, i.e. they are doing something in the background even after the calling function has finished. To test this theory I have adjusted the flaky test by adding some manual sleeps, i.e. {code:java} final AddNamedTopologyResult result1 = streams.addNamedTopology(topology2Client1); Thread.sleep(500); streams2.addNamedTopology(topology2Client2).all().get(); Thread.sleep(500); result1.all().get(); Thread.sleep(500);{code} This appears to be greatly reducing the flakiness. Originally I used 200 millis, which roughly increased the amount of time until failure by 4x, and now I am testing it with the 500 millis as shown above. 
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_17_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ --
[jira] [Commented] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563162#comment-17563162 ] Matthew de Detrich commented on KAFKA-14014: So I think I have gotten somewhere. I tried adjusting various timeouts to no avail, so I moved on to another theory, which is that some topology/stream-based methods are asynchronous, i.e. they are doing something in the background even after the calling function has finished. To test this theory I have adjusted the flaky test by adding some manual sleeps, i.e. {code:java} final AddNamedTopologyResult result1 = streams.addNamedTopology(topology2Client1); Thread.sleep(500); streams2.addNamedTopology(topology2Client2).all().get(); Thread.sleep(500); result1.all().get(); Thread.sleep(500);{code} This appears to be greatly reducing the flakiness. Originally I used 200 millis, which roughly increased the amount of time until failure by 4x, and now I am testing it with the 500 millis as shown above. > Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > Labels: flaky-test > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_17_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
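Rather than fixed Thread.sleep calls, a more robust pattern for timing-dependent tests like this one is to poll for the expected condition with a deadline, similar in spirit to the wait-for-condition helpers commonly found in test utilities. The sketch below is illustrative only; `WaitUtil` and `waitForCondition` are assumed names, not the project's actual helper.

```java
import java.util.function.BooleanSupplier;

public class WaitUtil {
    // Poll the condition until it holds or the timeout elapses. Returns
    // true if the condition was observed to hold before the deadline.
    public static boolean waitForCondition(BooleanSupplier condition,
                                           long timeoutMs,
                                           long pollIntervalMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            Thread.sleep(pollIntervalMs);
        }
        // One last check at the deadline to avoid a missed final poll.
        return condition.getAsBoolean();
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // This condition becomes true after ~100 ms; the wait returns early
        // instead of always burning a fixed 500 ms sleep.
        boolean ok = waitForCondition(
            () -> System.currentTimeMillis() - start >= 100, 5000, 10);
        System.out.println(ok);
    }
}
```

Polling with a generous timeout keeps the test fast on an idle machine while tolerating the CPU-starved conditions under which the flakiness was reproduced.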
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563004#comment-17563004 ] Matthew de Detrich edited comment on KAFKA-14014 at 7/6/22 7:29 AM: [~cadonna] So I did some debugging on this ticket over the past week and I found out some interesting things. To start off with, I did manage to predictably replicate the test's flakiness, and it does appear to be related to load, i.e. the test is more flaky the fewer CPU resources it has. I am using docker (i.e. running the tests within the docker gradle image) with the --cpu flag to limit resources. Interestingly, I have gone up to 5 CPUs and it's still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further. Note that 5 CPUs is already considered "high" (at least for a machine that would reasonably run kafka streams). The machine I am running on has 12 CPUs (6 cores, 12 threads), and at least when running with all of the resources on the machine I couldn't replicate the flaky test. was (Author: mdedetrich-aiven): [~cadonna] So I did some debugging on this ticket over the past week and I found out some interesting things. To start off with, I did manage to predictably replicate the test, and its flakiness does appear to be related to load, i.e. the test is more flaky the fewer CPU resources it has. I am using docker (i.e. running the tests within the docker gradle image) with the --cpu flag to limit resources. Interestingly, I have gone up to 5 CPUs and it's still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further. Note that 5 CPUs is already considered "high" (at least for a machine that would reasonably run kafka streams). 
The machine I am running on has 12 CPUs (6 cores, 12 threads), and at least when running with all of the resources on the machine I couldn't replicate the flaky test. > Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) 
> at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ >
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563004#comment-17563004 ] Matthew de Detrich edited comment on KAFKA-14014 at 7/6/22 7:27 AM: [~cadonna] So I did some debugging on this ticket over the past week and I found out some interesting things. To start off with, I did manage to predictably replicate the test, and its flakiness does appear to be related to load, i.e. the test is more flaky the fewer CPU resources it has. I am using docker (i.e. running the tests within the docker gradle image) with the --cpu flag to limit resources. Interestingly, I have gone up to 5 CPUs and it's still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further. Note that 5 CPUs is already considered "high" (at least for a machine that would reasonably run kafka streams). The machine I am running on has 12 CPUs (6 cores, 12 threads), and at least when running with all of the resources on the machine I couldn't replicate the flaky test. was (Author: mdedetrich-aiven): [~cadonna] So I did some debugging on this ticket over the past week and I found out some interesting things. To start off with, the test is more flaky the fewer CPU resources it has. I am using docker (i.e. running the tests within the docker gradle image) with the --cpu flag to limit resources. Interestingly, I have gone up to 5 CPUs and it's still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further. Note that 5 CPUs is already considered "high" (at least for a machine that would reasonably run kafka streams). The machine I am running on has 12 CPUs (6 cores, 12 threads), and at least when running with all of the resources on the machine I couldn't replicate the flaky test. 
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() > > > Key: KAFKA-14014 > URL: https://issues.apache.org/jira/browse/KAFKA-14014 > Project: Kafka > Issue Type: Test > Components: streams >Reporter: Bruno Cadonna >Priority: Critical > > {code:java} > java.lang.AssertionError: > Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]> > but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) > {code} > https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/ >
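The CPU-limited Docker setup described in the comment above can be sketched roughly as follows. The Gradle image tag, mount path, and test filter are illustrative assumptions, not taken from the ticket; the command is printed rather than executed so the sketch has no side effects:

```shell
# Sketch: cap CPU resources so load-dependent flakiness becomes reproducible.
# Assumptions (not from the ticket): the Kafka checkout is the current
# directory, and the official Gradle JDK 11 image tag is "gradle:jdk11".
CPU_LIMIT=5   # the comment above still saw (rarer) failures at 5 CPUs

# --cpus is Docker's hard CPU cap for the container.
DOCKER_CMD="docker run --rm --cpus=$CPU_LIMIT -v $PWD:/kafka -w /kafka"
GRADLE_CMD="./gradlew :streams:test --tests '*NamedTopologyIntegrationTest*'"
FULL_CMD="$DOCKER_CMD gradle:jdk11 $GRADLE_CMD"

# Print rather than execute, so the sketch is safe to run anywhere:
echo "$FULL_CMD"
```

Lowering `CPU_LIMIT` should, per the observations above, make the failure more frequent.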
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563004#comment-17563004 ] Matthew de Detrich edited comment on KAFKA-14014 at 7/6/22 7:26 AM: [~cadonna] So I did some debugging on this ticket over the past week and found out some interesting things. To start off with, the test is more flaky the fewer CPU resources it has. I am using Docker (i.e. running the tests within the Docker Gradle image), using the --cpus flag to limit resources. Interestingly, I have gone up to 5 CPUs and it is still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further. Note that 5 CPUs is already considered "high" (at least for a machine that would reasonably run Kafka Streams). The machine I am running on has 12 "CPUs" (6 cores, 12 threads), and at least when running with all of the resources of the machine I couldn't replicate the flaky test. was (Author: mdedetrich-aiven): [~cadonna] So I did some debugging on this ticket over the past week and found out some interesting things. To start off with, the test is more flaky the fewer CPU resources it has. I am using Docker (i.e. running the tests within the Docker Gradle image), using the --cpus flag to limit resources. Interestingly, I have gone up to 5 CPUs and it is still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further.
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563004#comment-17563004 ] Matthew de Detrich commented on KAFKA-14014: [~cadonna] So I did some debugging on this ticket over the past week and found out some interesting things. To start off with, the test is more flaky the fewer CPU resources it has. I am using Docker (i.e. running the tests within the Docker Gradle image), using the --cpus flag to limit resources. Interestingly, I have gone up to 5 CPUs and it is still flaking out, albeit less often. I attempted to increase the various timeouts that are used in the test, but this had no effect, so I am going to dig a bit further.
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17557842#comment-17557842 ] Matthew de Detrich commented on KAFKA-14014: So I ran the tests overnight with JDK 11 and had no failures in ~10k runs; I suspect that JDK 17 will give the same result. This opens up a few more possibilities as to why I may not be able to reproduce the failure locally:
1. It is Gradle specific (going to create a runner in Gradle).
2. The JVM on the CI is being run with different options that could be causing the failure (I have to check up on how the CI works).
3. Other tests may be running concurrently on the CI, and the extra load causes certain parts of the test setup to take longer, which may be causing race conditions.
I will look into this a bit later; any other hints/clues about possible reasons behind the failure would be greatly appreciated!
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() -- This message was sent by Atlassian Jira (v8.20.7#820007)
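The "run it overnight until it fails" experiment described in the comments above can be sketched as a small shell helper; the Gradle module and test filter in the usage comment are assumptions, not from the ticket:

```shell
# Sketch: rerun a command until it fails or a run budget is exhausted,
# mirroring the ~10k-run overnight experiment described above.
run_until_failure() {
  max_runs=$1
  shift
  i=1
  while [ "$i" -le "$max_runs" ]; do
    if ! "$@"; then
      echo "failed on run $i"
      return 1
    fi
    i=$((i + 1))
  done
  echo "no failure in $max_runs runs"
  return 0
}

# Hypothetical usage (module and test filter are assumptions):
# run_until_failure 10000 ./gradlew :streams:test \
#     --tests '*NamedTopologyIntegrationTest*'
```

Stopping at the first failing run keeps the logs of that run available for inspection, which is the main advantage over a fixed-count loop.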
[jira] [Commented] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17557564#comment-17557564 ] Matthew de Detrich commented on KAFKA-14014: Okay, so I just noticed that the linked builds use different JDK versions (11/17 versus my 16), so I will rerun the tests using those JDK versions to see if I can simulate the CI.
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Comment Edited] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17557525#comment-17557525 ] Matthew de Detrich edited comment on KAFKA-14014 at 6/22/22 3:29 PM: - So interestingly I worked on another ticket that reported the exact same test as being flaky (see https://issues.apache.org/jira/browse/KAFKA-13531) and I couldn't reproduce any flakiness (hence the reason why that ticket is closed). I rebased my fork of Kafka to the latest version in trunk and started running this test, and so far I have 1.6k runs without a failure (continuously running the tests in a loop as I post this comment). [~cadonna] Would it be possible to state how you are running the tests and also which JDK version (and which branch, if you are not running latest trunk)? Personally I am using Java 16, and to run the tests I am using the JUnit (not Gradle) runner with IntelliJ (primarily because it gives the ability to run a test until it fails). was (Author: mdedetrich-aiven): So interestingly I worked on another ticket that reported the exact same test as being flaky (see https://issues.apache.org/jira/browse/KAFKA-13531) and I couldn't reproduce any flakiness. I rebased my fork of Kafka to the latest version in trunk and started running this test, and so far I have 1.6k runs without a failure (re-running the tests in a loop as I post this comment). [~cadonna] Would it be possible to state how you are running the tests and also which JDK version (and which branch, if you are not running latest trunk)? Personally I am using Java 16, and to run the tests I am using the JUnit (not Gradle) runner with IntelliJ (primarily because it gives the ability to run a test until it fails).
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (KAFKA-14014) Flaky test NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
[ https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17557525#comment-17557525 ] Matthew de Detrich commented on KAFKA-14014: So interestingly I worked on another ticket that reported the exact same test as being flaky (see https://issues.apache.org/jira/browse/KAFKA-13531) and I couldn't reproduce any flakiness. I rebased my fork of Kafka to the latest version in trunk and started running this test, and so far I have 1.6k runs without a failure (re-running the tests in a loop as I post this comment). [~cadonna] Would it be possible to state how you are running the tests and also which JDK version (and which branch, if you are not running latest trunk)? Personally I am using Java 16, and to run the tests I am using the JUnit (not Gradle) runner with IntelliJ (primarily because it gives the ability to run a test until it fails).
> Flaky test > NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets() -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (KAFKA-6527) Transient failure in DynamicBrokerReconfigurationTest.testDefaultTopicConfig
[ https://issues.apache.org/jira/browse/KAFKA-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-6527: -- Labels: flakey (was: ) > Transient failure in DynamicBrokerReconfigurationTest.testDefaultTopicConfig > > > Key: KAFKA-6527 > URL: https://issues.apache.org/jira/browse/KAFKA-6527 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Blocker > Labels: flakey > Fix For: 3.3.0 > > > {code:java} > java.lang.AssertionError: Log segment size increase not applied > at kafka.utils.TestUtils$.fail(TestUtils.scala:355) > at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:865) > at > kafka.server.DynamicBrokerReconfigurationTest.testDefaultTopicConfig(DynamicBrokerReconfigurationTest.scala:348) > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (KAFKA-13897) Add 3.1.1 to system tests and streams upgrade tests
[ https://issues.apache.org/jira/browse/KAFKA-13897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-13897: --- Labels: flakey (was: ) > Add 3.1.1 to system tests and streams upgrade tests > --- > > Key: KAFKA-13897 > URL: https://issues.apache.org/jira/browse/KAFKA-13897 > Project: Kafka > Issue Type: Task > Components: streams, system tests >Reporter: Tom Bentley >Priority: Blocker > Labels: flakey > Fix For: 3.3.0, 3.1.2, 3.2.1 > > > Per the penultimate bullet on the [release > checklist|https://cwiki.apache.org/confluence/display/KAFKA/Release+Process#ReleaseProcess-Afterthevotepasses], > Kafka v3.1.1 is released. We should add this version to the system tests. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (KAFKA-13897) Add 3.1.1 to system tests and streams upgrade tests
[ https://issues.apache.org/jira/browse/KAFKA-13897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-13897: --- Labels: (was: flakey) > Add 3.1.1 to system tests and streams upgrade tests > --- > > Key: KAFKA-13897 > URL: https://issues.apache.org/jira/browse/KAFKA-13897 > Project: Kafka > Issue Type: Task > Components: streams, system tests >Reporter: Tom Bentley >Priority: Blocker > Fix For: 3.3.0, 3.1.2, 3.2.1 > > > Per the penultimate bullet on the [release > checklist|https://cwiki.apache.org/confluence/display/KAFKA/Release+Process#ReleaseProcess-Afterthevotepasses], > Kafka v3.1.1 is released. We should add this version to the system tests. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (KAFKA-8280) Flaky Test DynamicBrokerReconfigurationTest#testUncleanLeaderElectionEnable
[ https://issues.apache.org/jira/browse/KAFKA-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-8280: -- Labels: flakey (was: ) > Flaky Test DynamicBrokerReconfigurationTest#testUncleanLeaderElectionEnable > --- > > Key: KAFKA-8280 > URL: https://issues.apache.org/jira/browse/KAFKA-8280 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.3.0 >Reporter: John Roesler >Priority: Blocker > Labels: flakey > Fix For: 3.3.0 > > > I saw this fail again on > https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3979/testReport/junit/kafka.server/DynamicBrokerReconfigurationTest/testUncleanLeaderElectionEnable/ > {noformat} > Error Message > java.lang.AssertionError: Unclean leader not elected > Stacktrace > java.lang.AssertionError: Unclean leader not elected > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > kafka.server.DynamicBrokerReconfigurationTest.testUncleanLeaderElectionEnable(DynamicBrokerReconfigurationTest.scala:510) > {noformat} > {noformat} > Standard Output > Completed Updating config for entity: brokers '0'. > Completed Updating config for entity: brokers '1'. > Completed Updating config for entity: brokers '2'. > [2019-04-23 01:17:11,690] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=1] Error for partition testtopic-6 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:11,690] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=1] Error for partition testtopic-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
> [2019-04-23 01:17:11,722] ERROR [ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Error for partition testtopic-7 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:11,722] ERROR [ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Error for partition testtopic-1 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:11,760] ERROR [ReplicaFetcher replicaId=2, leaderId=0, > fetcherId=1] Error for partition testtopic-6 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:11,761] ERROR [ReplicaFetcher replicaId=2, leaderId=0, > fetcherId=1] Error for partition testtopic-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:11,765] ERROR [ReplicaFetcher replicaId=2, leaderId=0, > fetcherId=0] Error for partition testtopic-3 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:11,765] ERROR [ReplicaFetcher replicaId=2, leaderId=0, > fetcherId=0] Error for partition testtopic-9 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
> [2019-04-23 01:17:13,779] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=1] Error for partition __consumer_offsets-8 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:13,780] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=1] Error for partition __consumer_offsets-38 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:13,780] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=1] Error for partition __consumer_offsets-2 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-04-23 01:17:13,780] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=1] Error for partition __consumer_offsets-14 at offset 0 > (kafka.server.ReplicaFetcherThread:76) >
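The "Unclean leader not elected" assertion above comes out of a wait-until-condition helper that polls a predicate until a deadline. A minimal sketch of that retry pattern, with illustrative names rather than the real Kafka TestUtils signature:

```java
import java.util.function.BooleanSupplier;

public class WaitUntil {
    // Polls `condition` every pollIntervalMs until it holds or timeoutMs elapses.
    // The flaky failure above corresponds to this returning false: the unclean
    // leader was never observed within the timeout.
    static boolean waitUntilTrue(BooleanSupplier condition, long timeoutMs, long pollIntervalMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean())
                return true;
            try {
                Thread.sleep(pollIntervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return condition.getAsBoolean(); // final check at the deadline
    }
}
```

In the real test, a false result from this kind of wait surfaces as the assertTrue failure in the stack trace.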
[jira] [Updated] (KAFKA-13736) Flaky kafka.network.SocketServerTest.closingChannelWithBufferedReceives
[ https://issues.apache.org/jira/browse/KAFKA-13736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich updated KAFKA-13736: --- Labels: flakey (was: ) > Flaky kafka.network.SocketServerTest.closingChannelWithBufferedReceives > --- > > Key: KAFKA-13736 > URL: https://issues.apache.org/jira/browse/KAFKA-13736 > Project: Kafka > Issue Type: Bug >Reporter: Guozhang Wang >Priority: Blocker > Labels: flakey > Fix For: 3.3.0 > > > Examples: > https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-11895/1/tests > {code} > java.lang.AssertionError: receiveRequest timed out > at > kafka.network.SocketServerTest.receiveRequest(SocketServerTest.scala:140) > at > kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$6(SocketServerTest.scala:1521) > at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) > at > kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$1(SocketServerTest.scala:1520) > at > kafka.network.SocketServerTest.verifyRemoteCloseWithBufferedReceives(SocketServerTest.scala:1483) > at > kafka.network.SocketServerTest.closingChannelWithBufferedReceives(SocketServerTest.scala:1431) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
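The "receiveRequest timed out" failure above is a timed poll on a request queue that came back empty. A toy model of that shape (illustrative names, not the actual SocketServerTest helper):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class ReceiveRequest {
    // Waits up to timeoutMs for a request to arrive on the queue; fails the
    // test (AssertionError, as in the stack trace) when nothing shows up.
    static String receiveRequest(BlockingQueue<String> requestQueue, long timeoutMs) {
        String request;
        try {
            request = requestQueue.poll(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new AssertionError("interrupted while waiting for request", e);
        }
        if (request == null)
            throw new AssertionError("receiveRequest timed out");
        return request;
    }
}
```

Flakiness in this pattern usually means the producer side (here, the channel with buffered receives) occasionally never enqueues the request within the timeout.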
[jira] [Commented] (KAFKA-13531) Flaky test NamedTopologyIntegrationTest
[ https://issues.apache.org/jira/browse/KAFKA-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554979#comment-17554979 ] Matthew de Detrich commented on KAFKA-13531: [~mjsax] I tried to replicate the flakiness of the test and I was unable to do so (I had my laptop running the test until failure overnight, and with over 20k runs I didn't get a single failure). I also noticed that the method/test is named slightly differently (i.e. currently it's called shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets) and has been modified since this ticket was created, so perhaps those changes removed the flakiness? > Flaky test NamedTopologyIntegrationTest > --- > > Key: KAFKA-13531 > URL: https://issues.apache.org/jira/browse/KAFKA-13531 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets > {quote}java.lang.AssertionError: Did not receive all 3 records from topic > output-stream-2 within 6 ms, currently accumulated data is [] Expected: > is a value equal to or greater than <3> but: <0> was less than <3> at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$1(IntegrationTestUtils.java:648) > at > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:368) > at > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:336) > at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:644) > at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:617) > 
at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:439){quote} > STDERR > {quote}java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.GroupSubscribedToTopicException: Deleting > offsets of a topic is forbidden while the consumer group is actively > subscribed to it. at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) > at > org.apache.kafka.streams.processor.internals.namedtopology.KafkaStreamsNamedTopologyWrapper.lambda$removeNamedTopology$3(KafkaStreamsNamedTopologyWrapper.java:213) > at > org.apache.kafka.common.internals.KafkaFutureImpl.lambda$whenComplete$2(KafkaFutureImpl.java:107) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) > at > org.apache.kafka.common.internals.KafkaCompletableFuture.kafkaComplete(KafkaCompletableFuture.java:39) > at > org.apache.kafka.common.internals.KafkaFutureImpl.complete(KafkaFutureImpl.java:122) > at > org.apache.kafka.streams.processor.internals.TopologyMetadata.maybeNotifyTopologyVersionWaiters(TopologyMetadata.java:154) > at > org.apache.kafka.streams.processor.internals.StreamThread.checkForTopologyUpdates(StreamThread.java:916) > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:598) > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:575) > Caused by: 
org.apache.kafka.common.errors.GroupSubscribedToTopicException: > Deleting offsets of a topic is forbidden while the consumer group is actively > subscribed to it. java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.GroupSubscribedToTopicException: Deleting > offsets of a topic is forbidden while the consumer group is actively > subscribed to it. at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) > at >
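The GroupSubscribedToTopicException in the STDERR above encodes a broker-side ordering rule: offsets for a topic may only be deleted once the consumer group no longer subscribes to it. A toy model of that invariant (not the broker implementation, names are illustrative):

```java
import java.util.HashSet;
import java.util.Set;

public class GroupOffsets {
    private final Set<String> subscribedTopics = new HashSet<>();
    private final Set<String> topicsWithOffsets = new HashSet<>();

    void subscribe(String topic) {
        subscribedTopics.add(topic);
        topicsWithOffsets.add(topic);
    }

    void unsubscribe(String topic) {
        subscribedTopics.remove(topic);
    }

    // Mirrors the rule behind GroupSubscribedToTopicException: the delete is
    // rejected while the group is still actively subscribed to the topic.
    void deleteOffsets(String topic) {
        if (subscribedTopics.contains(topic))
            throw new IllegalStateException(
                "Deleting offsets of a topic is forbidden while the consumer group is actively subscribed to it.");
        topicsWithOffsets.remove(topic);
    }
}
```

The flaky test hits this when removeNamedTopology issues the offset delete before the stream threads have fully unsubscribed.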
[jira] [Assigned] (KAFKA-13531) Flaky test NamedTopologyIntegrationTest
[ https://issues.apache.org/jira/browse/KAFKA-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-13531: -- Assignee: Matthew de Detrich > Flaky test NamedTopologyIntegrationTest > --- > > Key: KAFKA-13531 > URL: https://issues.apache.org/jira/browse/KAFKA-13531 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Assignee: Matthew de Detrich >Priority: Critical > Labels: flaky-test > > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets > {quote}java.lang.AssertionError: Did not receive all 3 records from topic > output-stream-2 within 6 ms, currently accumulated data is [] Expected: > is a value equal to or greater than <3> but: <0> was less than <3> at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$1(IntegrationTestUtils.java:648) > at > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:368) > at > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:336) > at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:644) > at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:617) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:439){quote} > STDERR > {quote}java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.GroupSubscribedToTopicException: Deleting > offsets of a topic is forbidden while the consumer group is actively > subscribed to it. 
at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) > at > org.apache.kafka.streams.processor.internals.namedtopology.KafkaStreamsNamedTopologyWrapper.lambda$removeNamedTopology$3(KafkaStreamsNamedTopologyWrapper.java:213) > at > org.apache.kafka.common.internals.KafkaFutureImpl.lambda$whenComplete$2(KafkaFutureImpl.java:107) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) > at > org.apache.kafka.common.internals.KafkaCompletableFuture.kafkaComplete(KafkaCompletableFuture.java:39) > at > org.apache.kafka.common.internals.KafkaFutureImpl.complete(KafkaFutureImpl.java:122) > at > org.apache.kafka.streams.processor.internals.TopologyMetadata.maybeNotifyTopologyVersionWaiters(TopologyMetadata.java:154) > at > org.apache.kafka.streams.processor.internals.StreamThread.checkForTopologyUpdates(StreamThread.java:916) > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:598) > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:575) > Caused by: org.apache.kafka.common.errors.GroupSubscribedToTopicException: > Deleting offsets of a topic is forbidden while the consumer group is actively > subscribed to it. java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.GroupSubscribedToTopicException: Deleting > offsets of a topic is forbidden while the consumer group is actively > subscribed to it. 
at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) > at > org.apache.kafka.streams.processor.internals.namedtopology.KafkaStreamsNamedTopologyWrapper.lambda$removeNamedTopology$3(KafkaStreamsNamedTopologyWrapper.java:213) > at > org.apache.kafka.common.internals.KafkaFutureImpl.lambda$whenComplete$2(KafkaFutureImpl.java:107) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) > at >
[jira] [Commented] (KAFKA-13531) Flaky test NamedTopologyIntegrationTest
[ https://issues.apache.org/jira/browse/KAFKA-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554010#comment-17554010 ] Matthew de Detrich commented on KAFKA-13531: I will have a look into this. > Flaky test NamedTopologyIntegrationTest > --- > > Key: KAFKA-13531 > URL: https://issues.apache.org/jira/browse/KAFKA-13531 > Project: Kafka > Issue Type: Test > Components: streams, unit tests >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets > {quote}java.lang.AssertionError: Did not receive all 3 records from topic > output-stream-2 within 6 ms, currently accumulated data is [] Expected: > is a value equal to or greater than <3> but: <0> was less than <3> at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$1(IntegrationTestUtils.java:648) > at > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:368) > at > org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:336) > at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:644) > at > org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:617) > at > org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:439){quote} > STDERR > {quote}java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.GroupSubscribedToTopicException: Deleting > offsets of a topic is forbidden while the consumer group is actively > subscribed to it. 
at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) > at > org.apache.kafka.streams.processor.internals.namedtopology.KafkaStreamsNamedTopologyWrapper.lambda$removeNamedTopology$3(KafkaStreamsNamedTopologyWrapper.java:213) > at > org.apache.kafka.common.internals.KafkaFutureImpl.lambda$whenComplete$2(KafkaFutureImpl.java:107) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) > at > org.apache.kafka.common.internals.KafkaCompletableFuture.kafkaComplete(KafkaCompletableFuture.java:39) > at > org.apache.kafka.common.internals.KafkaFutureImpl.complete(KafkaFutureImpl.java:122) > at > org.apache.kafka.streams.processor.internals.TopologyMetadata.maybeNotifyTopologyVersionWaiters(TopologyMetadata.java:154) > at > org.apache.kafka.streams.processor.internals.StreamThread.checkForTopologyUpdates(StreamThread.java:916) > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:598) > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:575) > Caused by: org.apache.kafka.common.errors.GroupSubscribedToTopicException: > Deleting offsets of a topic is forbidden while the consumer group is actively > subscribed to it. java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.GroupSubscribedToTopicException: Deleting > offsets of a topic is forbidden while the consumer group is actively > subscribed to it. 
at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) > at > org.apache.kafka.streams.processor.internals.namedtopology.KafkaStreamsNamedTopologyWrapper.lambda$removeNamedTopology$3(KafkaStreamsNamedTopologyWrapper.java:213) > at > org.apache.kafka.common.internals.KafkaFutureImpl.lambda$whenComplete$2(KafkaFutureImpl.java:107) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) > at >
[jira] [Commented] (KAFKA-13957) Flaky Test StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores
[ https://issues.apache.org/jira/browse/KAFKA-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553810#comment-17553810 ] Matthew de Detrich commented on KAFKA-13957: I have fixed the test at https://github.com/apache/kafka/pull/12289 > Flaky Test StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores > - > > Key: KAFKA-13957 > URL: https://issues.apache.org/jira/browse/KAFKA-13957 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: A. Sophie Blee-Goldman >Assignee: Matthew de Detrich >Priority: Major > Labels: flaky-test > Attachments: > StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores.rtf > > > Failed on a local build so I have the full logs (attached) > {code:java} > java.lang.AssertionError: Unexpected exception thrown while getting the value > from store. > Expected: is (a string containing "Cannot get state store source-table > because the stream thread is PARTITIONS_ASSIGNED, not RUNNING" or a string > containing "The state store, source-table, may have migrated to another > instance" or a string containing "Cannot get state store source-table because > the stream thread is STARTING, not RUNNING") > but: was "The specified partition 1 for store source-table does not > exist." 
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.verifyRetrievableException(StoreQueryIntegrationTest.java:539) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.lambda$shouldQuerySpecificActivePartitionStores$5(StoreQueryIntegrationTest.java:241) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.until(StoreQueryIntegrationTest.java:557) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores(StoreQueryIntegrationTest.java:183) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
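The assertion above accepts any of several "transient" rebalance messages via a Hamcrest anyOf(containsString(...)) matcher; the flake is an unexpected message that matches none of them. The check reduces to plain substring matching, sketched here with the literal messages from the failure:

```java
import java.util.List;

public class ExpectedMessages {
    // The messages the test treats as acceptable transient states.
    static final List<String> TRANSIENT = List.of(
        "Cannot get state store source-table because the stream thread is PARTITIONS_ASSIGNED, not RUNNING",
        "The state store, source-table, may have migrated to another instance",
        "Cannot get state store source-table because the stream thread is STARTING, not RUNNING");

    // True when the actual exception message contains any expected substring,
    // i.e. the anyOf(containsString(...)) shape of the Hamcrest assertion.
    static boolean isExpected(String message) {
        return TRANSIENT.stream().anyMatch(message::contains);
    }
}
```

The observed failure message, "The specified partition 1 for store source-table does not exist.", matches none of these, which is exactly why the assertion failed.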
[jira] [Commented] (KAFKA-13957) Flaky Test StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores
[ https://issues.apache.org/jira/browse/KAFKA-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553562#comment-17553562 ] Matthew de Detrich commented on KAFKA-13957: I am working on this issue, I managed to replicate the exact same exception. > Flaky Test StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores > - > > Key: KAFKA-13957 > URL: https://issues.apache.org/jira/browse/KAFKA-13957 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: A. Sophie Blee-Goldman >Assignee: Matthew de Detrich >Priority: Major > Labels: flaky-test > Attachments: > StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores.rtf > > > Failed on a local build so I have the full logs (attached) > {code:java} > java.lang.AssertionError: Unexpected exception thrown while getting the value > from store. > Expected: is (a string containing "Cannot get state store source-table > because the stream thread is PARTITIONS_ASSIGNED, not RUNNING" or a string > containing "The state store, source-table, may have migrated to another > instance" or a string containing "Cannot get state store source-table because > the stream thread is STARTING, not RUNNING") > but: was "The specified partition 1 for store source-table does not > exist." 
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.verifyRetrievableException(StoreQueryIntegrationTest.java:539) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.lambda$shouldQuerySpecificActivePartitionStores$5(StoreQueryIntegrationTest.java:241) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.until(StoreQueryIntegrationTest.java:557) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores(StoreQueryIntegrationTest.java:183) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (KAFKA-13957) Flaky Test StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores
[ https://issues.apache.org/jira/browse/KAFKA-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-13957: -- Assignee: Matthew de Detrich > Flaky Test StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores > - > > Key: KAFKA-13957 > URL: https://issues.apache.org/jira/browse/KAFKA-13957 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: A. Sophie Blee-Goldman >Assignee: Matthew de Detrich >Priority: Major > Labels: flaky-test > Attachments: > StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores.rtf > > > Failed on a local build so I have the full logs (attached) > {code:java} > java.lang.AssertionError: Unexpected exception thrown while getting the value > from store. > Expected: is (a string containing "Cannot get state store source-table > because the stream thread is PARTITIONS_ASSIGNED, not RUNNING" or a string > containing "The state store, source-table, may have migrated to another > instance" or a string containing "Cannot get state store source-table because > the stream thread is STARTING, not RUNNING") > but: was "The specified partition 1 for store source-table does not > exist." 
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.verifyRetrievableException(StoreQueryIntegrationTest.java:539) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.lambda$shouldQuerySpecificActivePartitionStores$5(StoreQueryIntegrationTest.java:241) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.until(StoreQueryIntegrationTest.java:557) > at > org.apache.kafka.streams.integration.StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores(StoreQueryIntegrationTest.java:183) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:833) {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (KAFKA-13980) Update to Scala 2.12.16
Matthew de Detrich created KAFKA-13980: -- Summary: Update to Scala 2.12.16 Key: KAFKA-13980 URL: https://issues.apache.org/jira/browse/KAFKA-13980 Project: Kafka Issue Type: Improvement Components: core Reporter: Matthew de Detrich Assignee: Matthew de Detrich -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Comment Edited] (KAFKA-8420) Graceful handling when consumer switches from subscribe to manual assign
[ https://issues.apache.org/jira/browse/KAFKA-8420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17551082#comment-17551082 ] Matthew de Detrich edited comment on KAFKA-8420 at 6/7/22 1:20 PM: --- So in order to work on this issue I tried making a test to replicate what you are describing, and I came across something interesting; the test that I wrote looks like this
{code:java}
@Test
public void gracefulHandlingSwitchSubscribeToManualAssign() {
    ConsumerMetadata metadata = createMetadata(subscription);
    MockClient client = new MockClient(time, metadata);
    initMetadata(client, Collections.singletonMap(topic, 1));
    Node node = metadata.fetch().nodes().get(0);
    KafkaConsumer consumer = newConsumer(time, client, subscription, metadata, assignor, true, groupInstanceId);
    consumer.subscribe(singleton(topic), getConsumerRebalanceListener(consumer));
    prepareRebalance(client, node, singleton(topic), assignor, singletonList(tp0), null);
    ConsumerRecords initialConsumerRecords = consumer.poll(Duration.ofMillis(0));
    assertTrue(initialConsumerRecords.isEmpty());
    consumer.unsubscribe();
    consumer.assign(singleton(tp0));
    client.prepareResponseFrom(syncGroupResponse(singletonList(tp0), Errors.NONE), coordinator);
    consumer.poll(Duration.ofSeconds(1));
}
{code}
The problem that I am currently getting is that the {{consumer.poll(Duration.ofSeconds(1));}} is causing an infinite loop/deadlock (note that originally I had a {{consumer.poll(Duration.ofSeconds(0));}}, however this caused the {{consumer.poll}} method to short circuit due to {{timer.notExpired()}} never executing, and hence it just immediately returned a {{ConsumerRecords.empty();}} without the consumer ever sending a request to trigger a sync-group response). After spending some time debugging, this is the piece of code that is not terminating: [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L250-L251]. 
What I am finding highly confusing is the fact that {{lookupCoordinator()}} does actually complete (in this case it immediately returns {{findCoordinatorFuture}} at [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L294]), yet for some reason the loop at [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java#L215] never terminates. It doesn't appear to detect that the future has finished, which I believe to be the case. I am not sure if this is related to what you mentioned, i.e. {quote}In the worst case (i.e. the leader keeps sending an incompatible assignment), this would cause the consumer to fall into endless re-joins.{quote} but it looks like I have either found something else or I am barking up the wrong tree. Do you have any insights into this, [~guozhang]?
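The loop the comment points at waits for a coordinator future while polling the network. A minimal stdlib model of that shape (illustrative, not the actual ConsumerNetworkClient code) shows why the loop can only terminate if it re-checks the future's completion on every iteration:

```java
import java.util.concurrent.CompletableFuture;

public class PollLoop {
    // Polls until the future completes or the deadline passes; returns whether
    // the future was observed as done. If the isDone() check were skipped (or
    // completion were never signalled to the poller), the loop would spin for
    // the full timeout even for an already-completed future.
    static boolean pollUntilDone(CompletableFuture<?> future, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (System.nanoTime() < deadline) {
            if (future.isDone())
                return true;          // the progress condition the loop must observe
            Thread.onSpinWait();      // stand-in for one network poll iteration
        }
        return future.isDone();
    }
}
```

Under this model, a completed future exits immediately, while one that never completes burns the entire timeout, matching the apparent deadlock described in the comment.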
[jira] [Commented] (KAFKA-8420) Graceful handling when consumer switches from subscribe to manual assign
[ https://issues.apache.org/jira/browse/KAFKA-8420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17551082#comment-17551082 ] Matthew de Detrich commented on KAFKA-8420: --- So in order to work on this issue I tried writing a test to replicate what you are describing, and I came across something interesting; the test I wrote looks like this
{code:java}
@Test
public void gracefulHandlingSwitchSubscribeToManualAssign() {
    ConsumerMetadata metadata = createMetadata(subscription);
    MockClient client = new MockClient(time, metadata);
    initMetadata(client, Collections.singletonMap(topic, 1));
    Node node = metadata.fetch().nodes().get(0);
    KafkaConsumer consumer = newConsumer(time, client, subscription, metadata, assignor, true, groupInstanceId);
    consumer.subscribe(singleton(topic), getConsumerRebalanceListener(consumer));
    prepareRebalance(client, node, singleton(topic), assignor, singletonList(tp0), null);
    ConsumerRecords initialConsumerRecords = consumer.poll(Duration.ofMillis(0));
    assertTrue(initialConsumerRecords.isEmpty());
    consumer.unsubscribe();
    consumer.assign(singleton(tp0));
    client.prepareResponseFrom(syncGroupResponse(singletonList(tp0), Errors.NONE), coordinator);
    consumer.poll(Duration.ofSeconds(1));
}
{code}
The problem I am currently getting is that the {{consumer.poll(Duration.ofSeconds(1));}} call is causing an infinite loop/deadlock (note that originally I had {{consumer.poll(Duration.ofSeconds(0));}}, however this caused the {{consumer.poll}} method to short-circuit due to {{timer.notExpired()}} never being satisfied, hence it immediately returned {{ConsumerRecords.empty()}} without the consumer ever sending a request that would trigger a sync-group response). After spending some time debugging, this is the piece of code that is not terminating [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L250-L251]. 
What I am finding highly confusing is the fact that {{lookupCoordinator()}} does actually complete (in this case it immediately returns {{findCoordinatorFuture}} at [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L294]), however for some reason the loop at [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java#L215] never terminates. It doesn't appear to detect that the future has finished, even though I believe it has. I am not sure if this is related to what you mentioned, i.e. {quote}In the worst case (i.e. leader keeps sending incompatible assignment), this would cause the consumer to fall into endless re-joins.{quote} but it looks like I have either found something else or I am barking up the wrong tree. > Graceful handling when consumer switches from subscribe to manual assign > > > Key: KAFKA-8420 > URL: https://issues.apache.org/jira/browse/KAFKA-8420 > Project: Kafka > Issue Type: Improvement > Components: consumer >Reporter: Guozhang Wang >Assignee: Matthew de Detrich >Priority: Major > > Today if a consumer switches between subscribe (and hence relies on group > rebalance to get assignment) and manual assign, it may cause unnecessary > rebalances. For example: > 1. consumer.subscribe(); > 2. consumer.poll(); // join-group request sent, returns empty because > poll timeout > 3. consumer.unsubscribe(); > 4. consumer.assign(..); > 5. consumer.poll(); // sync-group request received, and the assigned > partitions do not match the current subscription-state. In this case it > will try to re-join, which is not necessary. > In the worst case (i.e. leader keeps sending incompatible assignment), this > would cause the consumer to fall into endless re-joins. > Although it is not a very common usage scenario, it is still worth being better > handled than the status-quo. 
-- This message was sent by Atlassian Jira (v8.20.7#820007)
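To reason about the hang described above in isolation, the following is a minimal, self-contained model of a "poll until the future is done or the timer expires" loop. All names here are hypothetical stand-ins, not the actual {{ConsumerNetworkClient}} internals; the point is that the loop's only exits are the completion check and the deadline, so if completion is never observed the call can only return once the timer runs out, and with an effectively unbounded timer it never returns.

```java
import java.util.concurrent.CompletableFuture;

public class PollLoopSketch {
    // Simplified model of polling until a future completes or a deadline passes.
    // Returns true if the future was observed as done before the deadline.
    static boolean pollUntilDone(CompletableFuture<Void> future, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        // The completion check must be re-evaluated every iteration; if the
        // future completes but isDone() is never consulted again, this loop
        // can only exit via the deadline -- or never, with an unbounded timer.
        while (!future.isDone() && System.currentTimeMillis() < deadline) {
            Thread.yield(); // stand-in for the network client doing I/O
        }
        return future.isDone();
    }

    public static void main(String[] args) {
        // An already-completed future is detected immediately.
        System.out.println(pollUntilDone(CompletableFuture.completedFuture(null), 1000));
        // A future that never completes forces the loop to run out the clock.
        System.out.println(pollUntilDone(new CompletableFuture<>(), 100));
    }
}
```

This also illustrates why a zero-duration poll short-circuits: with {{timeoutMs = 0}} the deadline condition fails on the first check, so the loop body never runs at all.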
[jira] [Commented] (KAFKA-8420) Graceful handling when consumer switches from subscribe to manual assign
[ https://issues.apache.org/jira/browse/KAFKA-8420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17545423#comment-17545423 ] Matthew de Detrich commented on KAFKA-8420: --- I am currently working on this issue; right now I am trying to reproduce it with a test, and then I will implement the fix. > Graceful handling when consumer switches from subscribe to manual assign > > > Key: KAFKA-8420 > URL: https://issues.apache.org/jira/browse/KAFKA-8420 > Project: Kafka > Issue Type: Improvement > Components: consumer >Reporter: Guozhang Wang >Assignee: Matthew de Detrich >Priority: Major > > Today if a consumer switches between subscribe (and hence relies on group > rebalance to get assignment) and manual assign, it may cause unnecessary > rebalances. For example: > 1. consumer.subscribe(); > 2. consumer.poll(); // join-group request sent, returns empty because > poll timeout > 3. consumer.unsubscribe(); > 4. consumer.assign(..); > 5. consumer.poll(); // sync-group request received, and the assigned > partitions do not match the current subscription-state. In this case it > will try to re-join, which is not necessary. > In the worst case (i.e. leader keeps sending incompatible assignment), this > would cause the consumer to fall into endless re-joins. > Although it is not a very common usage scenario, it is still worth being better > handled than the status-quo. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (KAFKA-8420) Graceful handling when consumer switches from subscribe to manual assign
[ https://issues.apache.org/jira/browse/KAFKA-8420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-8420: - Assignee: Matthew de Detrich > Graceful handling when consumer switches from subscribe to manual assign > > > Key: KAFKA-8420 > URL: https://issues.apache.org/jira/browse/KAFKA-8420 > Project: Kafka > Issue Type: Improvement > Components: consumer >Reporter: Guozhang Wang >Assignee: Matthew de Detrich >Priority: Major > > Today if a consumer switches between subscribe (and hence relies on group > rebalance to get assignment) and manual assign, it may cause unnecessary > rebalances. For example: > 1. consumer.subscribe(); > 2. consumer.poll(); // join-group request sent, returns empty because > poll timeout > 3. consumer.unsubscribe(); > 4. consumer.assign(..); > 5. consumer.poll(); // sync-group request received, and the assigned > partitions do not match the current subscription-state. In this case it > will try to re-join, which is not necessary. > In the worst case (i.e. leader keeps sending incompatible assignment), this > would cause the consumer to fall into endless re-joins. > Although it is not a very common usage scenario, it is still worth being better > handled than the status-quo. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (KAFKA-13299) Accept listeners that have the same port but use IPv4 vs IPv6
[ https://issues.apache.org/jira/browse/KAFKA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441164#comment-17441164 ] Matthew de Detrich commented on KAFKA-13299: PR is open at https://github.com/apache/kafka/pull/11478 > Accept listeners that have the same port but use IPv4 vs IPv6 > - > > Key: KAFKA-13299 > URL: https://issues.apache.org/jira/browse/KAFKA-13299 > Project: Kafka > Issue Type: Improvement >Reporter: Matthew de Detrich >Priority: Major > > Currently we are going through a process where we want to migrate Kafka > brokers from IPv4 to IPv6. The simplest way for us to do this would be to > allow Kafka to have two listeners on the same port, where one listener has an > IPv4 address allocated and the other listener has an IPv6 address allocated. > Currently this is not possible in Kafka because it validates that all of the > listeners have a unique port. With some rudimentary testing, if this > validation is removed (so we are able to have two listeners on the same port > but with different IP versions) there don't seem to be any immediate > problems; the Kafka cluster works fine. > Is there some fundamental reason behind this limitation of having unique > ports? Consequently, would there be any problems in loosening this limitation > (i.e. duplicate ports are allowed if the IP versions are different) or just > altogether removing the restriction? -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (KAFKA-13299) Accept listeners that have the same port but use IPv4 vs IPv6
[ https://issues.apache.org/jira/browse/KAFKA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441105#comment-17441105 ] Matthew de Detrich commented on KAFKA-13299: [~showuon] Thanks for the reply. I am working on this right now and will send an example PR today. > Accept listeners that have the same port but use IPv4 vs IPv6 > - > > Key: KAFKA-13299 > URL: https://issues.apache.org/jira/browse/KAFKA-13299 > Project: Kafka > Issue Type: Improvement >Reporter: Matthew de Detrich >Priority: Major > > Currently we are going through a process where we want to migrate Kafka > brokers from IPv4 to IPv6. The simplest way for us to do this would be to > allow Kafka to have two listeners on the same port, where one listener has an > IPv4 address allocated and the other listener has an IPv6 address allocated. > Currently this is not possible in Kafka because it validates that all of the > listeners have a unique port. With some rudimentary testing, if this > validation is removed (so we are able to have two listeners on the same port > but with different IP versions) there don't seem to be any immediate > problems; the Kafka cluster works fine. > Is there some fundamental reason behind this limitation of having unique > ports? Consequently, would there be any problems in loosening this limitation > (i.e. duplicate ports are allowed if the IP versions are different) or just > altogether removing the restriction? -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (KAFKA-13299) Accept listeners that have the same port but use IPv4 vs IPv6
Matthew de Detrich created KAFKA-13299: -- Summary: Accept listeners that have the same port but use IPv4 vs IPv6 Key: KAFKA-13299 URL: https://issues.apache.org/jira/browse/KAFKA-13299 Project: Kafka Issue Type: Improvement Reporter: Matthew de Detrich Currently we are going through a process where we want to migrate Kafka brokers from IPv4 to IPv6. The simplest way for us to do this would be to allow Kafka to have two listeners on the same port, where one listener has an IPv4 address allocated and the other listener has an IPv6 address allocated. Currently this is not possible in Kafka because it validates that all of the listeners have a unique port. With some rudimentary testing, if this validation is removed (so we are able to have two listeners on the same port but with different IP versions) there don't seem to be any immediate problems; the Kafka cluster works fine. Is there some fundamental reason behind this limitation of having unique ports? Consequently, would there be any problems in loosening this limitation (i.e. duplicate ports are allowed if the IP versions are different) or just altogether removing the restriction? -- This message was sent by Atlassian Jira (v8.3.4#803005)
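If the unique-port validation were loosened, the intent could be written as the following {{server.properties}} sketch. This is hypothetical (today Kafka rejects it precisely because both listeners use port 9092), and the listener names and addresses are illustrative; listener names must still be unique, which is why the second name is mapped to the PLAINTEXT protocol via {{listener.security.protocol.map}}:

```properties
# Hypothetical: two listeners sharing port 9092, one bound to an IPv4
# address and one to an IPv6 address. Current validation rejects this.
listeners=PLAINTEXT://192.0.2.10:9092,PLAINTEXT_V6://[2001:db8::10]:9092
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,PLAINTEXT_V6:PLAINTEXT
advertised.listeners=PLAINTEXT://192.0.2.10:9092,PLAINTEXT_V6://[2001:db8::10]:9092
```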
[jira] [Created] (KAFKA-12913) Make Scala Case class's final
Matthew de Detrich created KAFKA-12913: -- Summary: Make Scala Case class's final Key: KAFKA-12913 URL: https://issues.apache.org/jira/browse/KAFKA-12913 Project: Kafka Issue Type: Improvement Components: core Reporter: Matthew de Detrich Assignee: Matthew de Detrich It's considered best practice to make case classes final, since Scala code that uses a case class relies on equals/hashCode/unapply functioning correctly (which breaks if users can override this behaviour) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-12430) emit.heartbeats.enabled = false should disable heartbeats topic creation
[ https://issues.apache.org/jira/browse/KAFKA-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17351617#comment-17351617 ] Matthew de Detrich commented on KAFKA-12430: I think I might just create a second property to disable topic creation, and document that it will break upstream clusters. > emit.heartbeats.enabled = false should disable heartbeats topic creation > > > Key: KAFKA-12430 > URL: https://issues.apache.org/jira/browse/KAFKA-12430 > Project: Kafka > Issue Type: Bug > Components: mirrormaker >Reporter: Ivan Yurchenko >Assignee: Matthew de Detrich >Priority: Minor > > Currently, whether MirrorMaker 2's {{MirrorHeartbeatConnector}} emits > heartbeats or not is based on the {{emit.heartbeats.enabled}} setting. However, > the {{heartbeats}} topic is created unconditionally. It seems that the same > setting should really disable the topic creation as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-12430) emit.heartbeats.enabled = false should disable heartbeats topic creation
[ https://issues.apache.org/jira/browse/KAFKA-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-12430: -- Assignee: Matthew de Detrich > emit.heartbeats.enabled = false should disable heartbeats topic creation > > > Key: KAFKA-12430 > URL: https://issues.apache.org/jira/browse/KAFKA-12430 > Project: Kafka > Issue Type: Bug > Components: mirrormaker >Reporter: Ivan Yurchenko >Assignee: Matthew de Detrich >Priority: Minor > > Currently, whether MirrorMaker 2's {{MirrorHeartbeatConnector}} emits > heartbeats or not is based on the {{emit.heartbeats.enabled}} setting. However, > the {{heartbeats}} topic is created unconditionally. It seems that the same > setting should really disable the topic creation as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-12430) emit.heartbeats.enabled = false should disable heartbeats topic creation
[ https://issues.apache.org/jira/browse/KAFKA-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17351093#comment-17351093 ] Matthew de Detrich commented on KAFKA-12430: [~ryannedolan] I am going to look into this; do you have any comments/context to add on this topic (i.e. is there some deliberate reason why, if emit.heartbeats.enabled is false, the heartbeat topics are still created)? > emit.heartbeats.enabled = false should disable heartbeats topic creation > > > Key: KAFKA-12430 > URL: https://issues.apache.org/jira/browse/KAFKA-12430 > Project: Kafka > Issue Type: Bug > Components: mirrormaker >Reporter: Ivan Yurchenko >Priority: Minor > > Currently, whether MirrorMaker 2's {{MirrorHeartbeatConnector}} emits > heartbeats or not is based on the {{emit.heartbeats.enabled}} setting. However, > the {{heartbeats}} topic is created unconditionally. It seems that the same > setting should really disable the topic creation as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
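For reference, the flag under discussion is configured per replication flow in the MM2 properties file. A minimal sketch (the cluster aliases are illustrative):

```properties
# MirrorMaker 2 fragment: disable heartbeat emission for the A->B flow.
# As described in the issue above, the heartbeats topic is still created
# unconditionally even with this set to false; tying topic creation to this
# flag (or to a separate, hypothetical new property) is what is proposed.
clusters = A, B
A->B.enabled = true
A->B.emit.heartbeats.enabled = false
```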
[jira] [Created] (KAFKA-12819) Quality of life improvements for tests
Matthew de Detrich created KAFKA-12819: -- Summary: Quality of life improvements for tests Key: KAFKA-12819 URL: https://issues.apache.org/jira/browse/KAFKA-12819 Project: Kafka Issue Type: Improvement Reporter: Matthew de Detrich Assignee: Matthew de Detrich Minor improvements to various tests, such as using assertObject instead of assertEquals (when comparing objects), filling in missing messages in asserts, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-9726) LegacyReplicationPolicy for MM2 to mimic MM1
[ https://issues.apache.org/jira/browse/KAFKA-9726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew de Detrich reassigned KAFKA-9726: - Assignee: Matthew de Detrich (was: Ivan Yurchenko) > LegacyReplicationPolicy for MM2 to mimic MM1 > > > Key: KAFKA-9726 > URL: https://issues.apache.org/jira/browse/KAFKA-9726 > Project: Kafka > Issue Type: Improvement > Components: mirrormaker >Reporter: Ryanne Dolan >Assignee: Matthew de Detrich >Priority: Minor > > Per KIP-382, we should support MM2 in "legacy mode", i.e. with behavior > similar to MM1. A key requirement for this is a ReplicationPolicy that does > not rename topics. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9726) LegacyReplicationPolicy for MM2 to mimic MM1
[ https://issues.apache.org/jira/browse/KAFKA-9726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346969#comment-17346969 ] Matthew de Detrich commented on KAFKA-9726: --- Would it be possible to assign the ticket to me? I am taking over the task from [~ivanyu] (I have already created a PR at [https://github.com/apache/kafka/pull/10648]). [~ivanyu] tried changing the assignee to me but it didn't work. > LegacyReplicationPolicy for MM2 to mimic MM1 > > > Key: KAFKA-9726 > URL: https://issues.apache.org/jira/browse/KAFKA-9726 > Project: Kafka > Issue Type: Improvement > Components: mirrormaker >Reporter: Ryanne Dolan >Assignee: Ivan Yurchenko >Priority: Minor > > Per KIP-382, we should support MM2 in "legacy mode", i.e. with behavior > similar to MM1. A key requirement for this is a ReplicationPolicy that does > not rename topics. -- This message was sent by Atlassian Jira (v8.3.4#803005)
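The "ReplicationPolicy that does not rename topics" can be illustrated with a self-contained sketch. The interface below is a cut-down stand-in for MM2's real {{org.apache.kafka.connect.mirror.ReplicationPolicy}} (the real interface has more methods), so treat the exact shape as an assumption; the point is the contrast between the default alias-prefixing behaviour and the legacy/identity behaviour requested here.

```java
// Cut-down stand-in for MM2's ReplicationPolicy (illustrative, not the real API).
interface ReplicationPolicy {
    String formatRemoteTopic(String sourceClusterAlias, String topic);
    String topicSource(String topic); // source cluster of a replicated topic, if recoverable
}

// Default MM2 behaviour: prefix replicated topics with the source cluster alias.
class DefaultPolicy implements ReplicationPolicy {
    public String formatRemoteTopic(String sourceClusterAlias, String topic) {
        return sourceClusterAlias + "." + topic;
    }
    public String topicSource(String topic) {
        int dot = topic.indexOf('.');
        return dot < 0 ? null : topic.substring(0, dot);
    }
}

// Legacy (MM1-like) behaviour: replicate without renaming.
class IdentityPolicy implements ReplicationPolicy {
    public String formatRemoteTopic(String sourceClusterAlias, String topic) {
        return topic; // no prefix
    }
    public String topicSource(String topic) {
        return null; // the source cluster can no longer be derived from the name
    }
}

public class ReplicationPolicySketch {
    public static void main(String[] args) {
        System.out.println(new DefaultPolicy().formatRemoteTopic("us-east", "orders"));
        System.out.println(new IdentityPolicy().formatRemoteTopic("us-east", "orders"));
    }
}
```

The trade-off this makes visible: with identity naming, the origin of a replicated topic is no longer encoded in its name, which is part of why the default policy renames in the first place.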