[ 
https://issues.apache.org/jira/browse/CASSANDRA-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17070970#comment-17070970
 ] 

ZhaoYang edited comment on CASSANDRA-15666 at 3/30/20, 6:47 PM:
----------------------------------------------------------------

{quote}1) Only the "follower" is allowed to send the CompleteMessage.
 2) Only the "initiator" is allowed to close the session and its channels after 
receiving the CompleteMessage.
{quote}
At the end, I reverted the above implementation, as I think it doesn't fix the 
racy cases here. Improving synchronization on "maybeCompleted()" should be 
sufficient to avoid the StreamSession racy cases and less invasive for 4.0..

[Patch|https://github.com/apache/cassandra/pull/497]:
 * Make sure {{maybeCompleted()}} is executed at most once as it may be 
triggered by different threads asynchronously.
 * Synchronize {{maybeCompleted()}} so that it won't race with itself or 
{{complete()}}.
 * Added {{MessageStateSink}} in {{StreamSession}} to capture received stream 
messages and state transitions to improve testing.
 * Unified terminologies to use {{Initiator}} and {{Follower}}.
 * Ported some simplifications from CASSANDRA-15754 here:
 ** Simplify StreamSession with State.isFinalState
 ** Remove unused StreamSession from StreamMessage.Serializer.deserialize
 ** Change QueryState to use ClientState instead of clientState

[~blerer] [~sbtourist] WDYT?


was (Author: jasonstack):
{quote}1) Only the "follower" is allowed to send the CompleteMessage.
 2) Only the "initiator" is allowed to close the session and its channels after 
receiving the CompleteMessage.
{quote}
At the end, I reverted the above implementation, as I think it may cause 
backward compatibility issue on cross-version streaming and doesn't fix the 
racy cases here. Improving synchronization on "maybeCompleted()" should be 
sufficient to avoid the StreamSession racy cases and less invasive for 4.0..

[Patch|https://github.com/apache/cassandra/pull/497]:
* Make sure {{maybeCompleted()}} is executed at most once as it may be 
triggered by different threads asynchronously.
* Synchronize {{maybeCompleted()}} so that it won't race with itself or 
{{complete()}}.
* Added {{MessageStateSink}} in {{StreamSession}} to capture received stream 
messages and state transitions to improve testing.
* Unified terminologies to use {{Initiator}} and {{Follower}}.
* Ported some simplifications from CASSANDRA-15754 here:
    ** Simplify StreamSession with State.isFinalState
    ** Remove unused StreamSession from StreamMessage.Serializer.deserialize
    ** Change QueryState to use ClientState instead of clientState

[~blerer] [~sbtourist] WDYT?

> Race condition when completing stream sessions
> ----------------------------------------------
>
>                 Key: CASSANDRA-15666
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15666
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Streaming and Messaging
>            Reporter: Sergio Bossa
>            Assignee: ZhaoYang
>            Priority: Normal
>             Fix For: 4.0
>
>
> {{StreamSession#prepareAsync()}} executes, as the name implies, 
> asynchronously from the IO thread: this opens up for race conditions between 
> the sending of the {{PrepareSynAckMessage}} and the call to 
> {{StreamSession#maybeCompleted()}}. I.e., the following could happen:
> 1) Node A sends {{PrepareSynAckMessage}} from the {{prepareAsync()}} thread.
> 2) Node B receives it and starts streaming.
> 3) Node A receives the streamed file and sends {{ReceivedMessage}}.
> 4) At this point, if this was the only file to stream, both nodes are ready 
> to close the session via {{maybeCompleted()}}, but:
> a) Node A will call it twice from both the IO thread and the thread at #1, 
> closing the session and its channels.
> b) Node B will attempt to send a {{CompleteMessage}}, but will fail because 
> the session has been closed in the meantime.
> There are other subtle variations of the pattern above, depending on the 
> order of concurrently sent/received messages.
> I believe the best fix would be to modify the message exchange so that:
> 1) Only the "follower" is allowed to send the {{CompleteMessage}}.
> 2) Only the "initiator" is allowed to close the session and its channels 
> after receiving the {{CompleteMessage}}.
> By doing so, the message exchange logic would be easier to reason about, 
> which is overall a win anyway.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to