[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-10-05 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17207934#comment-17207934
 ] 

Benjamin Lerer commented on CASSANDRA-15958:


Could not mark the ticket as committed. Will try again once the problem has 
been fixed.

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-10-05 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17207931#comment-17207931
 ] 

Benjamin Lerer commented on CASSANDRA-15958:


I apparently forgot to put my +1 for the patch.

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-09-24 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201511#comment-17201511
 ] 

Ekaterina Dimitrova commented on CASSANDRA-15958:
-

Is a second reviewer needed?

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-09-10 Thread Adam Holmberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193902#comment-17193902
 ] 

Adam Holmberg commented on CASSANDRA-15958:
---

Thanks for kicking that off.

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-09-10 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193861#comment-17193861
 ] 

Yifan Cai commented on CASSANDRA-15958:
---

Unit test and dtest passed.

[https://app.circleci.com/pipelines/github/yifan-c/cassandra/96/workflows/ddf05402-e19d-4a74-a14b-90a105dea475]

 

+1 to the patch.

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-09-10 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193813#comment-17193813
 ] 

Yifan Cai commented on CASSANDRA-15958:
---

Not seeing the link to the CI run.

Attaching one: 
[https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=C-15958]

 

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-09-09 Thread Adam Holmberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193202#comment-17193202
 ] 

Adam Holmberg commented on CASSANDRA-15958:
---

Thanks for the link Aleksey. 

Returning to this, option #1 seems a bit absurd to me: changing a test to work 
around a flaw in the thing it was meant to test. I also think this class could 
benefit from some simplification, but I don't think we have the appetite to 
bite off and perf test a change like that at this point (and I think we've 
agreed this particular case is not a danger in a running system). 

Here's a [patch|https://github.com/aholmberg/cassandra/pull/4/files], updated 
according to Yifan's input, and augmented to make sure the test will not race 
with the background pruning, and timeout.

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-09-01 Thread Aleksey Yeschenko (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188398#comment-17188398
 ] 

Aleksey Yeschenko commented on CASSANDRA-15958:
---

Ping [~sbtourist] / CASSANDRA-15700

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-08-31 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17187866#comment-17187866
 ] 

Yifan Cai commented on CASSANDRA-15958:
---

Hi Adam. Yes. I agree with option 1. 

IMO, the periodic prune task and the condition to guard entering the actual 
prune operation is a best effort optimization to reduce the messages in the 
outbound queue. 
The contention for the lock (between poll and prune) would be increased, if 
removing the condition. The periodic prune task will try to acquire the lock 
every time it is triggered (every 100 ms). Prune is an O(n) operation. It could 
become expensive when the outbound queue is large and block the poll more 
likely. So having a condition to guard the entry makes sense to me, even though 
the condition could become inaccurate (, best effort). The exact impact on the 
queue performance need to be benchmarked. Intuitively, I think having the guard 
provides benefit. 

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-08-27 Thread Adam Holmberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186105#comment-17186105
 ] 

Adam Holmberg commented on CASSANDRA-15958:
---

Sorry I wasn't more precise on the third option. There, I was suggesting that 
we stop trying to calculate the earliest possible prune, and just make it prune 
periodically -- say every 10ms. Then there's no deadline to adjust forward or 
backward and synchronize relative to new messages coming in -- it's just a 
periodic task.

>Therefore, it seems a special case in the test. It should not harm in 
>production.
It sound like in this paragraph you're agreeing with my option #1. If that is 
the case (and we don't hear any opposition), when I get back to it I'll update 
the test to be robust against this.

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-08-27 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185693#comment-17185693
 ] 

Yifan Cai commented on CASSANDRA-15958:
---

Nice finding! 
I am able to reproduce the race, as long as the 
{{nextExpirationDeadlineUpdater}} that sets {{earliestExpiresAt}} to 
{{Long.MAX_VALUE}} in {{pruneInternalQueueWithLock}} happens after the updater 
in the *last* {{add()}} message call in the test. 

The impact is that the prune task might not be able to prune the queue when 
there is no new message with expiration time that can trigger 
{{pruneWithLock()}} enqueued. But in an operating system, there should be new 
messages enqueued continuously. Therefore, the chance that messages not being 
pruned should be low. 
In the worst case that an expired message is not pruned in the queue, the 
message is still discarded when outbound connection sends the message. It 
checks {{shouldSend()}} when polling from the queue. 
Therefore, it seems a special case in the test. It should not harm in 
production.

Regarding the third alternative, if I understand correctly, without updating 
the {{nextExpirationDeadline}} in the {{add()}} message method, no prune task 
will be scheduled. Because the {{nextExpirationDeadline}} remains 
{{Long.MAX_VALUE}} and {{clock.isAfter(nowNanos, nextExpirationDeadline)}} 
always returns false in the {{maybePruneExpired()}} method.

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-08-26 Thread Adam Holmberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185385#comment-17185385
 ] 

Adam Holmberg commented on CASSANDRA-15958:
---

A third alternative would be to skip synchronization and computation and just 
make it periodic. I'm not sure what the genesis of the more precise calculation 
was.

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-08-26 Thread Adam Holmberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185251#comment-17185251
 ] 

Adam Holmberg commented on CASSANDRA-15958:
---

Looked at the {{OutboundMessageQueue}} issue a bit more. Put more simply, 
there's a race [adding to this queue and updating the expiration 
deadline|https://github.com/apache/cassandra/blob/405e2dd8b5610208596ab4cb0bb6b9be7a159f5e/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L89-L92]
 while another thread is draining 
([1|https://github.com/apache/cassandra/blob/405e2dd8b5610208596ab4cb0bb6b9be7a159f5e/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L256][2|https://github.com/apache/cassandra/blob/405e2dd8b5610208596ab4cb0bb6b9be7a159f5e/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L148][3|https://github.com/apache/cassandra/blob/405e2dd8b5610208596ab4cb0bb6b9be7a159f5e/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L484])
 and also updating 
([1|https://github.com/apache/cassandra/blob/405e2dd8b5610208596ab4cb0bb6b9be7a159f5e/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L288][2|https://github.com/apache/cassandra/blob/405e2dd8b5610208596ab4cb0bb6b9be7a159f5e/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L488]).
 The race is there, but I'm not certain it would be a problem in an operating 
server, since nothing is spinning on an inactive queue waiting for messages to 
be evacuated, like this test is. In other words, new incoming messages and 
ongoing delivery would break this loose naturally.

Two ways to proceed:
1.) We can agree that it's not a problem, and I can make this test not 
susceptible to the timeout.
2.) We try to fix by adding synchronization around both the external queue and 
expiry update. I would need to expand the analysis quite a bit to understand 
what performance implications that might have (since the apparent point of the 
two queue design is efficiency).

[~yifanc] I'm interested in your take.
also /cc [~benedict] [~aleksey] for ideas since this is part of your newish 
messaging rewrite.

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-08-25 Thread Adam Holmberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17184097#comment-17184097
 ] 

Adam Holmberg commented on CASSANDRA-15958:
---

Thanks, Yifan, for taking a look. I'll revisit that before submitting for 
review.

In the mean time, I looked some at the other flakiness.

{noformat}
java.lang.RuntimeException: java.util.concurrent.TimeoutException
at 
org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$36(ConnectionTest.java:700)
at 
org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$40(ConnectionTest.java:730)
at 
org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:263)
at 
org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
at 
org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Caused by: java.util.concurrent.TimeoutException
at 
java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
at 
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
at 
org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$36(ConnectionTest.java:693)
{noformat}

Timeout waiting for message to be purged. It will never purge.

Identified another race condition where the next expiration can be set to 
Long.MAX_VALUE, and pruning will never happen.

[Periodic 
pruning|https://github.com/aholmberg/cassandra/blob/eb1c9bf59c7faba3e65be7fc45359611c7242672/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L203-L209]
 runs in the background. It might run against an empty queue, producing a 
[pruner|https://github.com/aholmberg/cassandra/blob/eb1c9bf59c7faba3e65be7fc45359611c7242672/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L288]
 with {{earliestExpiresAt = Long.MAX_VALUE}}. In the mean time a message is 
added and the current deadline matches that message. If our nowNanos is was 
established more than the clock resolution after the message expire time was 
set, the earliestExpiresAt time is 
[set|https://github.com/aholmberg/cassandra/blob/eb1c9bf59c7faba3e65be7fc45359611c7242672/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L228-L229]
 to {{Long.MAX_VALUE}}.

I have another tweak for this issue. Still looking at some other related unit 
tests.

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-08-24 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183614#comment-17183614
 ] 

Yifan Cai commented on CASSANDRA-15958:
---

Hi Adam, thanks for the analysis! 
I agree with the cause, and I can reproduce the timeout exception (at closing) 
by inserting a pause between those 2 close call sites. 

The patch looks good to me. The {{closeFuture}} is updated/read within the 
synchronized block.

Only one thing to bring up is that there is a slight side effect introduced. 
With the patch, only the {{Consumer 
shutdownExecutors}} in the first call site will be registered. The other call 
sites, if supplying a different consumer, are all ignored. Maybe update the 
method docs for such behavior. 

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-08-24 Thread Adam Holmberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183533#comment-17183533
 ] 

Adam Holmberg commented on CASSANDRA-15958:
---

Looks like there are some additional sources of flakiness in this test.
https://app.circleci.com/pipelines/github/aholmberg/cassandra/25/workflows/536c8ec1-51d3-4ef0-95c6-7dd9677225dc/jobs/207

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-08-24 Thread Adam Holmberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183526#comment-17183526
 ] 

Adam Holmberg commented on CASSANDRA-15958:
---

[patch|https://github.com/apache/cassandra/compare/trunk...aholmberg:CASSANDRA-15958?expand=1]

The timeout occurs [waiting for this close 
future|https://github.com/aholmberg/cassandra/blob/CASSANDRA-15958/test/unit/org/apache/cassandra/net/ConnectionTest.java#L268],
 failing intermittently due to a race condition. The test 
[closes|https://github.com/aholmberg/cassandra/blob/CASSANDRA-15958/test/unit/org/apache/cassandra/net/ConnectionTest.java#L723]
 the inbound connection 
[twice|https://github.com/aholmberg/cassandra/blob/CASSANDRA-15958/test/unit/org/apache/cassandra/net/ConnectionTest.java#L268].
 If the first execution finishes and [shuts down the 
executor|https://github.com/aholmberg/cassandra/blob/CASSANDRA-15958/src/java/org/apache/cassandra/net/InboundSockets.java#L128]
 before the second, the 
[FutureCombiner|https://github.com/aholmberg/cassandra/blob/CASSANDRA-15958/src/java/org/apache/cassandra/net/InboundSockets.java#L125-L126]
 fails to 
[addListener|https://github.com/aholmberg/cassandra/blob/CASSANDRA-15958/src/java/org/apache/cassandra/net/FutureCombiner.java#L83]
 to the channel, and the [done 
future|https://github.com/aholmberg/cassandra/blob/CASSANDRA-15958/src/java/org/apache/cassandra/net/InboundSockets.java#L131]
 will never complete.

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-07-23 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163823#comment-17163823
 ] 

David Capwell commented on CASSANDRA-15958:
---

Failed again: 
https://app.circleci.com/pipelines/github/dcapwell/cassandra/307/workflows/a5d97246-2f81-489e-81e9-55c311ad65ae/jobs/1542

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org