[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171101#comment-17171101 ]

Yifan Cai commented on CASSANDRA-15338:
---

Just took another look. The exception is not expected. Pasting the stack trace.

{code:java}
java.util.concurrent.TimeoutException
	at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
	at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
	at org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
	at org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
	at org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)
{code}

In the test, testMessagePurging, the inbound connection has already been closed. The test does not throw at that line, so I assume the inbound was closed successfully. If so, we should not see the TimeoutException from {{doTestManual(ConnectionTest.java:268)}}, because the inbound is already closed.

I suspect there might be a hidden issue in the {{close()}} method. I will open a new ticket if that turns out to be the case.
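To illustrate the reasoning in the comment above: the stack trace ends in a timed {{get}} on a future, which throws {{TimeoutException}} only when the future has not completed within the wait. The sketch below is a stand-in built on the JDK's {{CompletableFuture}}, not Cassandra's {{AsyncPromise}}. The class name {{TimedGetSketch}}, the helper {{timesOut}} and the 50 ms wait are hypothetical and chosen only for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Stand-in sketch: uses the JDK CompletableFuture, NOT Cassandra's AsyncPromise.
final class TimedGetSketch
{
    /** Returns true if the timed get gave up before the future completed. */
    static boolean timesOut(CompletableFuture<Void> closeFuture, long millis)
    {
        try
        {
            closeFuture.get(millis, TimeUnit.MILLISECONDS);
            return false; // the future completed within the wait
        }
        catch (TimeoutException e)
        {
            return true; // still pending when the wait expired
        }
        catch (InterruptedException | ExecutionException e)
        {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args)
    {
        // A future that nobody ever completes models a close that never finishes.
        CompletableFuture<Void> neverCompleted = new CompletableFuture<>();
        // A future that is already done models a close that has finished.
        CompletableFuture<Void> alreadyClosed = CompletableFuture.completedFuture(null);

        System.out.println(timesOut(neverCompleted, 50)); // prints "true"
        System.out.println(timesOut(alreadyClosed, 50));  // prints "false"
    }
}
```

Under this model, a timeout at that line would mean the awaited future was still pending, which is consistent with the suspicion that something in {{close()}} does not complete.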
> Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
> ---
>
> Key: CASSANDRA-15338
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15338
> Project: Cassandra
> Issue Type: Bug
> Components: Test/unit
> Reporter: David Capwell
> Assignee: Yifan Cai
> Priority: Normal
> Labels: pull-request-available
> Fix For: 4.0, 4.0-beta1
>
> Attachments: CASS-15338-Docker.zip
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Example failure:
> [https://circleci.com/gh/dcapwell/cassandra/11#artifacts/containers/1]
>
> {code:java}
> Testcase: testMessagePurging(org.apache.cassandra.net.ConnectionTest): FAILED
> expected:<0> but was:<1>
> junit.framework.AssertionFailedError: expected:<0> but was:<1>
>   at org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$38(ConnectionTest.java:625)
>   at org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:258)
>   at org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:231)
>   at org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:584){code}
>
> Looking closer at org.apache.cassandra.net.OutboundConnection.Delivery#stopAndRun it seems that the run method is called before org.apache.cassandra.net.OutboundConnection.Delivery#doRun which may lead to a test race condition where the CountDownLatch completes before executing

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166765#comment-17166765 ]

Yifan Cai commented on CASSANDRA-15338:
---

Thanks for reporting, [~dcapwell]. The test failure in the link is a timeout (30 seconds) when closing the inbound connection. I will take a closer look later.
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166759#comment-17166759 ]

David Capwell commented on CASSANDRA-15338:
---

[~yifanc] this test is still flakey, but now hitting TimeoutException: https://app.circleci.com/pipelines/github/dcapwell/cassandra/362/workflows/c04020b0-d13e-4e18-ae27-0277e636b73d/jobs/1858
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17084750#comment-17084750 ]

Andres de la Peña commented on CASSANDRA-15338:
---

Committed to trunk as [753b40eb0f570fc88b5211b9bcea04761a240071|https://github.com/apache/cassandra/commit/753b40eb0f570fc88b5211b9bcea04761a240071].
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17084233#comment-17084233 ]

Yifan Cai commented on CASSANDRA-15338:
---

Thanks [~adelapena]. That will be great. Please proceed.
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17084228#comment-17084228 ]

Andres de la Peña commented on CASSANDRA-15338:
---

Looks good to me. [~yifanc] do you need me to commit it?
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17082478#comment-17082478 ]

David Capwell commented on CASSANDRA-15338:
---

Sounds great thanks!
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17082227#comment-17082227 ]

Andres de la Peña commented on CASSANDRA-15338:
---

[~dcapwell] I can start reviewing this one tomorrow, if it's ok with you
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080639#comment-17080639 ]

David Capwell commented on CASSANDRA-15338:
---

nope, still too much on my plate. This patch requires me to review closer which I am struggling with the time on atm =(
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080602#comment-17080602 ]

Ekaterina Dimitrova commented on CASSANDRA-15338:
-

I [~dcapwell] are you still on that?
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076727#comment-17076727 ]

David Capwell commented on CASSANDRA-15338:
---

I have a few things on my plate, I should be able to look end of the week?
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076633#comment-17076633 ]

Yifan Cai commented on CASSANDRA-15338:
---

[~benedict][~dcapwell], do you want to take a look?
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061964#comment-17061964 ]

David Capwell commented on CASSANDRA-15338:
---

not started yet, looked at the other one only.
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061759#comment-17061759 ]

Ekaterina Dimitrova commented on CASSANDRA-15338:
-

[~yifanc] [~dcapwell][~ifesdjeen] is anyone reviewing this one?
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061613#comment-17061613 ]

ZhaoYang commented on CASSANDRA-15338:
--

this patch should fix CASSANDRA-15629/CASSANDRA-15628 as well..
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055533#comment-17055533 ]

Yifan Cai commented on CASSANDRA-15338:
---

The test failure could not be reproduced by simply running the test on my laptop. However, it can be easily reproduced when running in a Docker container with limited CPUs (i.e., 2).

Across multiple runs, the test failed only when testing with LargeMessage, which suggests that the failures are related to {{LargeMessageDelivery}}.

The following is what I think happens.
# When the {{inbound}} has just opened and the first message is queued into the {{outbound}}, the handshake happens and execution is deferred until the connection is established ({{executeAgain}}).
# Since enqueue is not blocking, the next line, {{unsafeRunOnDelivery}}, runs immediately. The effect is that the runnable gets registered, but is not run yet.
# The connection is established, so we {{executeAgain()}}. Because the runnable {{stopAndRun}} is present and the {{inProgress}} flag is still false at this point, the runnable runs and counts down {{deliveryDone}} unexpectedly.
# Delivery proceeds to flush the message. In {{LargeMessageDelivery}}, the flush is asynchronous, so a race condition can happen:
## The inbound has received the message (and counted down {{receiveDone}}).
## {{LargeMessageDelivery}} is still polling for the completion of the flush, so it has not yet released capacity.

Therefore, the assertion on {{pendingCount}} fails.

Two places in the test flow are (or can go) wrong: step 3 and step 4.

Regarding step 3, the runnable {{stopAndRun}} should not be registered while the connection is still being established. In production, is there a case where a {{stopAndRun}} is registered this early? Probably not.

Regarding step 4, the {{outbound}} has no knowledge of whether the {{inbound}} has received any message.

The test should register the runnable {{stopAndRun}} in the message handler to count down {{deliveryDone}}. That way, the runnable correctly waits for the current delivery to complete before it runs.

PR is here: https://github.com/apache/cassandra/pull/466

As mentioned, I reproduced the failure using Docker. Here is a bundle that one can simply download and run: [^CASS-15338-Docker.zip]
It runs {{ConnectionTest}} repeatedly until it fails. I have included the patch in the image too.

To reproduce, run
{code:bash}
bash build_and_run.sh
{code}

To see the runs with the patch, run
{code:bash}
bash build_and_run.sh patched
{code}

> Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
> ---
>
> Key: CASSANDRA-15338
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15338
> Project: Cassandra
> Issue Type: Bug
> Components: Test/unit
> Reporter: David Capwell
> Priority: Normal
> Labels: pull-request-available
> Fix For: 4.0-alpha
>
> Attachments: CASS-15338-Docker.zip
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Example failure:
> [https://circleci.com/gh/dcapwell/cassandra/11#artifacts/containers/1]
>
> {code:java}
> Testcase: testMessagePurging(org.apache.cassandra.net.ConnectionTest): FAILED
> expected:<0> but was:<1>
> junit.framework.AssertionFailedError: expected:<0> but was:<1>
>   at org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$38(ConnectionTest.java:625)
>   at org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:258)
>   at org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:231)
>   at org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:584){code}
>
> Looking closer at org.apache.cassandra.net.OutboundConnection.Delivery#stopAndRun it seems that
> the run method is called before org.apache.cassandra.net.OutboundConnection.Delivery#doRun which may lead to
> a test race condition where the CountDownLatch completes before executing

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
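The ordering problem described in the comment above (#comment-17055533), and the proposed fix of registering the runnable from the message handler, can be illustrated with a toy model. This is only a sketch under simplifying assumptions: it is single-threaded and deterministic, and it is not Cassandra's {{OutboundConnection}}. The names {{ToyDelivery}}, {{runOnDelivery}}, {{onMessage}} and the two scenario methods are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Toy, single-threaded model of the ordering issue; NOT Cassandra's OutboundConnection. */
public class DeliveryRaceSketch {

    /** Hypothetical stand-in for a delivery loop with a "stop and run" hook. */
    static final class ToyDelivery {
        private final List<String> queue = new ArrayList<>();
        private Runnable stopAndRun = null;           // pending hook, if any
        private Consumer<String> handler = m -> {};   // receiver-side message handler

        int pendingCount() { return queue.size(); }

        void enqueue(String message) { queue.add(message); }

        void onMessage(Consumer<String> handler) { this.handler = handler; }

        /** Registers the hook without running it (assumed analogue of unsafeRunOnDelivery). */
        void runOnDelivery(Runnable hook) { stopAndRun = hook; }

        /** One pass of the delivery loop (assumed analogue of executeAgain). */
        void execute() {
            // A hook registered before delivery started runs before anything is flushed.
            runHookIfPresent();
            while (!queue.isEmpty()) {
                handler.accept(queue.get(0)); // the receiver sees the message first
                queue.remove(0);              // the sender releases capacity afterwards
            }
            // A hook registered during delivery runs only after delivery completed.
            runHookIfPresent();
        }

        private void runHookIfPresent() {
            Runnable hook = stopAndRun;
            stopAndRun = null;
            if (hook != null) hook.run();
        }
    }

    /** Hook registered right after enqueue, before the connection is established. */
    public static int observeWithEarlyRegistration() {
        ToyDelivery delivery = new ToyDelivery();
        int[] observed = { -1 };
        delivery.enqueue("large message");
        delivery.runOnDelivery(() -> observed[0] = delivery.pendingCount());
        delivery.execute();
        return observed[0];
    }

    /** Hook registered from the message handler, as the comment proposes. */
    public static int observeWithHandlerRegistration() {
        ToyDelivery delivery = new ToyDelivery();
        int[] observed = { -1 };
        delivery.onMessage(message ->
            delivery.runOnDelivery(() -> observed[0] = delivery.pendingCount()));
        delivery.enqueue("large message");
        delivery.execute();
        return observed[0];
    }

    public static void main(String[] args) {
        System.out.println("early registration, pendingCount seen by hook:   "
            + observeWithEarlyRegistration());
        System.out.println("handler registration, pendingCount seen by hook: "
            + observeWithHandlerRegistration());
    }
}
```

In this model the early-registered hook observes a pending count of 1, which mirrors the {{expected:<0> but was:<1>}} assertion failure, while the hook registered from the handler observes 0 because it only runs after the delivery pass has released the message. The real code is concurrent, so this sketch shows the ordering only, not the timing.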
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039510#comment-17039510 ] Yifan Cai commented on CASSANDRA-15338: --- The test failure reported here is caused by the flaky {{testAcquireReleaseOutbound}} in CASSANDRA-15308. After fixing {{testAcquireReleaseOutbound}}, all tests in {{ConnectionTest}} can pass happily. The repeated local runs of {{ConnectionTest}} using either Java 8 and 11 proved the test failure as described in this ticket did not show up. {code:java} Switched to Java 8 12:42:00 in cassandra on CASSANDRA-15308 ➜ while [[ "$(ant testclasslist -Dtest.classlistfile=<(echo org/apache/cassandra/net/ConnectionTest.java) | grep -c 'BUILD SUCCESSFUL')" == "1" ]]; do echo "It was a good run."; done It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. ^C% 13:10:47 in cassandra on CASSANDRA-15308 took 28m 39s ➜ j11 Switched to Java 11 13:10:49 in cassandra on CASSANDRA-15308 ➜ while [[ "$(ant testclasslist -Dtest.classlistfile=<(echo org/apache/cassandra/net/ConnectionTest.java) -Duse.jdk11=true | grep -c 'BUILD SUCCESSFUL')" == "1" ]]; do echo "It was a good run."; done It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. It was a good run. 
^C% {code} > Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest > --- > > Key: CASSANDRA-15338 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15338 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: David Capwell >Priority: Normal > Fix For: 4.0-alpha > > > Example failure: > [https://circleci.com/gh/dcapwell/cassandra/11#artifacts/containers/1] > > {code:java} > Testcase: testMessagePurging(org.apache.cassandra.net.ConnectionTest): FAILED > expected:<0> but was:<1> > junit.framework.AssertionFailedError: expected:<0> but was:<1> > at > org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$38(ConnectionTest.java:625) > at > org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:258) > at > org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:231) > at > org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:584){code} > > Looking closer at > org.apache.cassandra.net.OutboundConnection.Delivery#stopAndRun it seems that > the run method is called before > org.apache.cassandra.net.OutboundConnection.Delivery#doRun which may lead to > a test race condition where the CountDownLatch completes before executing -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940769#comment-16940769 ]

Benedict Elliott Smith commented on CASSANDRA-15338:

cc [~ifesdjeen]

> Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
> ---
>
> Key: CASSANDRA-15338
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15338
> Project: Cassandra
> Issue Type: Bug
> Components: Test/unit
> Reporter: David Capwell
> Priority: Normal
>
> Example failure:
> [https://circleci.com/gh/dcapwell/cassandra/11#artifacts/containers/1]
>
> {code:java}
> Testcase: testMessagePurging(org.apache.cassandra.net.ConnectionTest): FAILED
> expected:<0> but was:<1>
> junit.framework.AssertionFailedError: expected:<0> but was:<1>
>   at org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$38(ConnectionTest.java:625)
>   at org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:258)
>   at org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:231)
>   at org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:584){code}
>
> Looking closer at org.apache.cassandra.net.OutboundConnection.Delivery#stopAndRun it seems that
> the run method is called before org.apache.cassandra.net.OutboundConnection.Delivery#doRun which may lead to
> a test race condition where the CountDownLatch completes before executing