[GitHub] [kafka] ableegoldman commented on pull request #9671: KAFKA-10793: move handling of FindCoordinatorFuture to fix race condition
ableegoldman commented on pull request #9671: URL: https://github.com/apache/kafka/pull/9671#issuecomment-859044497 It should be sufficient to upgrade just the consumers, this is a client-side fix only -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] ableegoldman commented on pull request #9671: KAFKA-10793: move handling of FindCoordinatorFuture to fix race condition
ableegoldman commented on pull request #9671: URL: https://github.com/apache/kafka/pull/9671#issuecomment-770017240 Cherrypicked to 2.6 as well
[GitHub] [kafka] ableegoldman commented on pull request #9671: KAFKA-10793: move handling of FindCoordinatorFuture to fix race condition
ableegoldman commented on pull request #9671: URL: https://github.com/apache/kafka/pull/9671#issuecomment-767988592 Merged to trunk and cherrypicked to 2.7
[GitHub] [kafka] ableegoldman commented on pull request #9671: KAFKA-10793: move handling of FindCoordinatorFuture to fix race condition
ableegoldman commented on pull request #9671: URL: https://github.com/apache/kafka/pull/9671#issuecomment-767983217 A few tests failed, but no hanging this time:
```
StoreQueryIntegrationTest.shouldQuerySpecificStalePartitionStores -- known to be flaky
FetcherTest.testEarlierOffsetResetArrivesLate -- hit "TimeoutException: testEarlierOffsetResetArrivesLate() timed out after 10 seconds", I haven't seen this fail before, on this PR or on any other, so I believe it's unrelated. But I ran it 10 times locally to be sure and all passed
MirrorConnectorsIntegrationSSLTest.testReplication -- in Connect, seems to be unrelated
StoreQueryIntegrationTest.shouldQuerySpecificStalePartitionStores -- known flaky, looks environmental (slow startup)
```
[GitHub] [kafka] ableegoldman commented on pull request #9671: KAFKA-10793: move handling of FindCoordinatorFuture to fix race condition
ableegoldman commented on pull request #9671: URL: https://github.com/apache/kafka/pull/9671#issuecomment-767808575 Ugh, looks like the JDK 15 build still timed out. But I think this was probably environmental based on inspecting the output -- I also tracked down and verified that every run of the previously-hanging `#testCoordinatorFailover` test did complete (and pass) so that does seem to be fixed. Will retrigger the build to be safe
[GitHub] [kafka] ableegoldman commented on pull request #9671: KAFKA-10793: move handling of FindCoordinatorFuture to fix race condition
ableegoldman commented on pull request #9671: URL: https://github.com/apache/kafka/pull/9671#issuecomment-767278329 Ok I think I've gotten to the bottom of this hanging test, and pushed a fix. Tests seem to be passing reliably for me locally. Aiming to get this merged in the next day or so, so let me know if you have any concerns about the latest changes @guozhangwang
[GitHub] [kafka] ableegoldman commented on pull request #9671: KAFKA-10793: move handling of FindCoordinatorFuture to fix race condition
ableegoldman commented on pull request #9671: URL: https://github.com/apache/kafka/pull/9671#issuecomment-764053539 Weird -- these changes seem to be causing the `SaslXConsumerTest` family of tests to hang. I'm not very (or at all) familiar with these tests so I haven't found anything yet but I'm actively looking into it
[GitHub] [kafka] ableegoldman commented on pull request #9671: KAFKA-10793: move handling of FindCoordinatorFuture to fix race condition
ableegoldman commented on pull request #9671: URL: https://github.com/apache/kafka/pull/9671#issuecomment-747732868 @ijuma sorry, I missed your earlier response. It's definitely not a trivial bug, yes. The main reasons I didn't formally propose this as a blocker were that it's been around forever, and I'm not confident that the fix is low-risk. WDYT @bbejeck ?
[GitHub] [kafka] ableegoldman commented on pull request #9671: KAFKA-10793: move handling of FindCoordinatorFuture to fix race condition
ableegoldman commented on pull request #9671: URL: https://github.com/apache/kafka/pull/9671#issuecomment-741977437 @ijuma no, I don't think it should be a 2.7 blocker. It's definitely not a regression, AFAICT this has been around since the beginning. And a restart of the client will get it out of the bad state
[GitHub] [kafka] ableegoldman commented on pull request #9671: KAFKA-10793: move handling of FindCoordinatorFuture to fix race condition
ableegoldman commented on pull request #9671: URL: https://github.com/apache/kafka/pull/9671#issuecomment-740339466 Kicked off 30 runs of the system test that has seemed to be flaky due to this bug: https://jenkins.confluent.io/job/system-test-kafka-branch-builder/4302/
[GitHub] [kafka] ableegoldman commented on pull request #9671: KAFKA-10793: move handling of FindCoordinatorFuture to fix race condition
ableegoldman commented on pull request #9671: URL: https://github.com/apache/kafka/pull/9671#issuecomment-736964914 Waiting to add tests until I get some sanity checks on this proposal