[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185251#comment-17185251
 ] 

Adam Holmberg commented on CASSANDRA-15958:
-------------------------------------------

I looked at the {{OutboundMessageQueue}} issue a bit more. Put simply, 
there's a race when one thread is [adding to this queue and updating the expiration 
deadline|https://github.com/apache/cassandra/blob/405e2dd8b5610208596ab4cb0bb6b9be7a159f5e/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L89-L92]
 while another thread is draining the queue 
([1|https://github.com/apache/cassandra/blob/405e2dd8b5610208596ab4cb0bb6b9be7a159f5e/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L256], [2|https://github.com/apache/cassandra/blob/405e2dd8b5610208596ab4cb0bb6b9be7a159f5e/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L148], [3|https://github.com/apache/cassandra/blob/405e2dd8b5610208596ab4cb0bb6b9be7a159f5e/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L484])
 and also updating the deadline 
([1|https://github.com/apache/cassandra/blob/405e2dd8b5610208596ab4cb0bb6b9be7a159f5e/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L288], [2|https://github.com/apache/cassandra/blob/405e2dd8b5610208596ab4cb0bb6b9be7a159f5e/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L488]).
 The race is real, but I'm not certain it would be a problem in a running 
server, since nothing there spins on an inactive queue waiting for messages to 
be evacuated the way this test does. In other words, new incoming messages and 
ongoing delivery would shake this loose naturally.
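
To make the interleaving concrete, here is a minimal sketch of how I understand the lost update. It is a simplified model, not the actual {{OutboundMessageQueue}} code: the class {{DeadlineRaceSketch}} and its methods are hypothetical names, messages are reduced to their expiry timestamps, and the interleaving is replayed by hand on one thread so the outcome is deterministic.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model of a queue with a separately tracked
// "next expiration deadline". Not the real Cassandra class.
class DeadlineRaceSketch {
    static final long NONE = Long.MAX_VALUE;

    final ConcurrentLinkedQueue<Long> queue = new ConcurrentLinkedQueue<>();
    final AtomicLong nextDeadline = new AtomicLong(NONE);

    // Producer side: enqueue, then lower the deadline. Each step is atomic,
    // but the pair is not.
    void add(long expiresAt) {
        queue.offer(expiresAt);
        nextDeadline.accumulateAndGet(expiresAt, Math::min);
    }

    // Drainer side, step 1: scan the queue for the earliest expiry.
    long scanEarliest() {
        long earliest = NONE;
        for (long e : queue) earliest = Math.min(earliest, e);
        return earliest;
    }

    // Drainer side, step 2: publish the value computed in step 1.
    void publish(long earliest) {
        nextDeadline.set(earliest);
    }

    // Pruning is only attempted once the tracked deadline has passed.
    boolean pruneWouldRun(long now) {
        return now >= nextDeadline.get();
    }

    public static void main(String[] args) {
        DeadlineRaceSketch q = new DeadlineRaceSketch();
        long stale = q.scanEarliest(); // drainer scans an empty queue
        q.add(100L);                   // producer enqueues and lowers the deadline
        q.publish(stale);              // drainer overwrites it with the stale value
        System.out.println("queued=" + q.queue.size()
                + " deadlineLost=" + (q.nextDeadline.get() == NONE)
                + " pruneWouldRun=" + q.pruneWouldRun(1000L));
    }
}
```

In this model the message stays queued with no deadline recorded for it, so nothing prunes it until another {{add}} or drain happens to recompute the deadline. That matches the point above: a busy server would recover on its own, while a test that waits on an idle queue would not.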

Two ways to proceed:
1.) We agree that it's not a problem, and I make this test not 
susceptible to the timeout.
2.) We try to fix it by adding synchronization around both the external queue and 
the expiry update. I would need to expand the analysis quite a bit to understand 
what performance implications that might have (since the apparent point of the 
two-queue design is efficiency).
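
As a rough illustration of option 2, the sketch below puts the enqueue and the deadline update under one lock, and has pruning recompute the deadline under that same lock. Again this is a hypothetical model ({{LockedDeadlineQueueSketch}} is an invented name, and messages are reduced to expiry timestamps), not a proposed patch.

```java
import java.util.ArrayDeque;
import java.util.Iterator;

// Hypothetical model: queue contents and deadline always change together.
class LockedDeadlineQueueSketch {
    static final long NONE = Long.MAX_VALUE;

    private final ArrayDeque<Long> queue = new ArrayDeque<>();
    private long nextDeadline = NONE;

    // Enqueue and deadline update happen as one critical section.
    synchronized void add(long expiresAt) {
        queue.addLast(expiresAt);
        nextDeadline = Math.min(nextDeadline, expiresAt);
    }

    // Remove expired entries and recompute the deadline under the same lock,
    // so a concurrent add cannot be overwritten by a stale value.
    synchronized int pruneExpired(long now) {
        int pruned = 0;
        long earliest = NONE;
        for (Iterator<Long> it = queue.iterator(); it.hasNext();) {
            long e = it.next();
            if (e <= now) {
                it.remove();
                pruned++;
            } else {
                earliest = Math.min(earliest, e);
            }
        }
        nextDeadline = earliest;
        return pruned;
    }

    synchronized long nextDeadline() { return nextDeadline; }

    synchronized int size() { return queue.size(); }

    public static void main(String[] args) {
        LockedDeadlineQueueSketch q = new LockedDeadlineQueueSketch();
        q.add(100L);
        q.add(500L);
        int pruned = q.pruneExpired(200L);
        System.out.println("pruned=" + pruned + " next=" + q.nextDeadline() + " size=" + q.size());
    }
}
```

The cost is that every producer now contends on the same monitor as the drainer, which is the performance question raised in option 2.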

[~yifanc] I'm interested in your take.
Also /cc [~benedict] [~aleksey] for ideas, since this is part of your newish 
messaging rewrite.

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-15958
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/unit
>            Reporter: David Capwell
>            Assignee: Adam Holmberg
>            Priority: Normal
>             Fix For: 4.0-beta
>
>
> Build: https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>       at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>       at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>       at org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>       at org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>       at org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



