[jira] [Commented] (CASSANDRA-14503) Internode connection management is race-prone

2019-03-25 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801295#comment-16801295
 ] 

Benedict commented on CASSANDRA-14503:
--

Thanks for the patch! And sorry for taking so long to get back to you. We did 
find [some 
bugs|https://gist.github.com/belliottsmith/0d12df678d8e9ab06776e29116d56b91], 
and other areas of improvement; I've filed CASSANDRA-15066 as a follow-up, and 
look forward to hearing your feedback.

 

> Internode connection management is race-prone
> -
>
> Key: CASSANDRA-14503
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14503
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: Jason Brown
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Following CASSANDRA-8457, internode connection management has been rewritten 
> to rely on Netty, but the new implementation in 
> {{OutboundMessagingConnection}} seems quite race prone to me, in particular 
> on those two cases:
> * {{#finishHandshake()}} racing with {{#close()}}: i.e. in such case the 
> former could run into an NPE if the latter nulls the {{channelWriter}} (but 
> this is just an example, other conflicts might happen).
> * Connection timeout and retry racing with state changing methods: 
> {{connectionRetryFuture}} and {{connectionTimeoutFuture}} are cancelled when 
> handshaking or closing, but there's no guarantee those will be actually 
> cancelled (as they might be already running), so they might end up changing 
> the connection state concurrently with other methods (i.e. by unexpectedly 
> closing the channel or clearing the backlog).
> Overall, the thread safety of {{OutboundMessagingConnection}} is very 
> difficult to assess given the current implementation: I would suggest to 
> refactor it into a single-thread model, where all connection state changing 
> actions are enqueued on a single threaded scheduler, so that state 
> transitions can be clearly defined and checked.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14503) Internode connection management is race-prone

2019-01-23 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750032#comment-16750032
 ] 

Benedict commented on CASSANDRA-14503:
--

It's next on my docket.

> Internode connection management is race-prone
> -
>
> Key: CASSANDRA-14503
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14503
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: Jason Brown
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Following CASSANDRA-8457, internode connection management has been rewritten 
> to rely on Netty, but the new implementation in 
> {{OutboundMessagingConnection}} seems quite race prone to me, in particular 
> on those two cases:
> * {{#finishHandshake()}} racing with {{#close()}}: i.e. in such case the 
> former could run into an NPE if the latter nulls the {{channelWriter}} (but 
> this is just an example, other conflicts might happen).
> * Connection timeout and retry racing with state changing methods: 
> {{connectionRetryFuture}} and {{connectionTimeoutFuture}} are cancelled when 
> handshaking or closing, but there's no guarantee those will be actually 
> cancelled (as they might be already running), so they might end up changing 
> the connection state concurrently with other methods (i.e. by unexpectedly 
> closing the channel or clearing the backlog).
> Overall, the thread safety of {{OutboundMessagingConnection}} is very 
> difficult to assess given the current implementation: I would suggest to 
> refactor it into a single-thread model, where all connection state changing 
> actions are enqueued on a single threaded scheduler, so that state 
> transitions can be clearly defined and checked.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14503) Internode connection management is race-prone

2019-01-22 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749383#comment-16749383
 ] 

Dinesh Joshi commented on CASSANDRA-14503:
--

[~jasobrown] I can take a pass at it but I'd rather go after [~benedict]'s done 
his review.

> Internode connection management is race-prone
> -
>
> Key: CASSANDRA-14503
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14503
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: Jason Brown
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Following CASSANDRA-8457, internode connection management has been rewritten 
> to rely on Netty, but the new implementation in 
> {{OutboundMessagingConnection}} seems quite race prone to me, in particular 
> on those two cases:
> * {{#finishHandshake()}} racing with {{#close()}}: i.e. in such case the 
> former could run into an NPE if the latter nulls the {{channelWriter}} (but 
> this is just an example, other conflicts might happen).
> * Connection timeout and retry racing with state changing methods: 
> {{connectionRetryFuture}} and {{connectionTimeoutFuture}} are cancelled when 
> handshaking or closing, but there's no guarantee those will be actually 
> cancelled (as they might be already running), so they might end up changing 
> the connection state concurrently with other methods (i.e. by unexpectedly 
> closing the channel or clearing the backlog).
> Overall, the thread safety of {{OutboundMessagingConnection}} is very 
> difficult to assess given the current implementation: I would suggest to 
> refactor it into a single-thread model, where all connection state changing 
> actions are enqueued on a single threaded scheduler, so that state 
> transitions can be clearly defined and checked.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14503) Internode connection management is race-prone

2019-01-22 Thread Jason Brown (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749356#comment-16749356
 ] 

Jason Brown commented on CASSANDRA-14503:
-

[~benedict] / [~djoshi3] Any update on reviewing this latest patch? This seems 
to be a blocker for 4.0.

> Internode connection management is race-prone
> -
>
> Key: CASSANDRA-14503
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14503
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: Jason Brown
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Following CASSANDRA-8457, internode connection management has been rewritten 
> to rely on Netty, but the new implementation in 
> {{OutboundMessagingConnection}} seems quite race prone to me, in particular 
> on those two cases:
> * {{#finishHandshake()}} racing with {{#close()}}: i.e. in such case the 
> former could run into an NPE if the latter nulls the {{channelWriter}} (but 
> this is just an example, other conflicts might happen).
> * Connection timeout and retry racing with state changing methods: 
> {{connectionRetryFuture}} and {{connectionTimeoutFuture}} are cancelled when 
> handshaking or closing, but there's no guarantee those will be actually 
> cancelled (as they might be already running), so they might end up changing 
> the connection state concurrently with other methods (i.e. by unexpectedly 
> closing the channel or clearing the backlog).
> Overall, the thread safety of {{OutboundMessagingConnection}} is very 
> difficult to assess given the current implementation: I would suggest to 
> refactor it into a single-thread model, where all connection state changing 
> actions are enqueued on a single threaded scheduler, so that state 
> transitions can be clearly defined and checked.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14503) Internode connection management is race-prone

2018-11-06 Thread Vinay Chella (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677289#comment-16677289
 ] 

Vinay Chella commented on CASSANDRA-14503:
--

Thank you [~jasobrown] for the patch.

[~jolynch] and I benchmarked Jason's 14503-v2 branch, our benchmark results 
show [trunk-Jason's 
branch|https://github.com/jasobrown/cassandra/tree/14503-v2] is significantly 
out-performing 3.0.17 in terms of mean, 99th, and 95th percentile during a pure 
write benchmark. When systems are under heavy load, we have seen coordinator 
mean latencies are ~14x better, 99th latencies are ~4x better and 95th 
latencies are ~6x better on the trunk.

When both trunk and 3.0.17 had 67k write QPS applied, throughput is steady on 
the trunk and 3.0.17 fell over. Note that we have only tested writes in this 
benchmark. However, the trunk is accumulating more hints than 3.0.17 and 
dropping messages compared to 3.0.17, these issues are yet to troubleshoot. For 
a detailed analysis of this benchmarking, find attached document [Cassandra 4.0 
testing with CASSANDRA-14503 fixes]

> Internode connection management is race-prone
> -
>
> Key: CASSANDRA-14503
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14503
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: Jason Brown
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Following CASSANDRA-8457, internode connection management has been rewritten 
> to rely on Netty, but the new implementation in 
> {{OutboundMessagingConnection}} seems quite race prone to me, in particular 
> on those two cases:
> * {{#finishHandshake()}} racing with {{#close()}}: i.e. in such case the 
> former could run into an NPE if the latter nulls the {{channelWriter}} (but 
> this is just an example, other conflicts might happen).
> * Connection timeout and retry racing with state changing methods: 
> {{connectionRetryFuture}} and {{connectionTimeoutFuture}} are cancelled when 
> handshaking or closing, but there's no guarantee those will be actually 
> cancelled (as they might be already running), so they might end up changing 
> the connection state concurrently with other methods (i.e. by unexpectedly 
> closing the channel or clearing the backlog).
> Overall, the thread safety of {{OutboundMessagingConnection}} is very 
> difficult to assess given the current implementation: I would suggest to 
> refactor it into a single-thread model, where all connection state changing 
> actions are enqueued on a single threaded scheduler, so that state 
> transitions can be clearly defined and checked.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14503) Internode connection management is race-prone

2018-10-24 Thread Jason Brown (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662265#comment-16662265
 ] 

Jason Brown commented on CASSANDRA-14503:
-

Based on testing conducted with [~jolynch] and [~vinaykumarcse], here's an 
updated branch with performance fixes and code improvements:

||v2||
|[branch|https://github.com/jasobrown/cassandra/tree/14503-v2]|
|[utests  
dtests|https://circleci.com/gh/jasobrown/workflows/cassandra/tree/14503-v2]|
||PR|https://github.com/apache/cassandra/pull/289|

The major change in this branch is I experimented with aggregating the messages 
to send into a single ByteBuf, instead of sending the messages individually to 
the netty pipeline. Since we can send up to (the current hard-coded size of) 64 
messages per each iteration of OMC.dequeueMessages(), that's 64 times to invoke 
the pipeline mechanics, 64 ByteBuf allocations (and releases), and 64 times to 
fulfill the promise corresponding to each message. If, instead, we send one 
ByteBuf (with data serialized into it) then it's just one message into the 
pipeline, one allocation, and one promise fulfillment. The primary trade-off is 
that the single buffer will be, of course, large; perhaps large enough to not 
be efficient with the netty allocator. To that end, wrote a JMH benchmark, and 
the results are compelling: TL;DR a single buffer is significantly faster than 
multiple smaller buffers. The closest case is a single buffer is twice as fast, 
with the typical percentile difference being about 10-20 times faster for the 
single buffer (1.5 micros vs. 23 micros).

To make this work, I need the allocation and serialization code to be moved 
outside of the pipeline handler (as it now needs to be invoked from OMC). I had 
already done this work with CASSANDRA-13630. Thus, I pulled that patch into 
this branch. That patch also greatly reduced the need for the ChannelWriter 
abstraction, and combined with the outstanding work in this branch, I am able 
to eliminate ChannelWriter and the confusion it added. However, I still need to 
handle large messages separately (as we don't want to use our blocking 
serializers on the event loop), so I've preserved the "move large message 
serialization on a separate thread" behavior from CASSANDRA-13630 by creating a 
new abstraction in OMC by adding (the not cleverly named) MessageDequeuer 
interface, with implementations for large messages and "small messages" 
(basically the current behavior of this patch that we've been riffing on).

One feature that we've been debating again is the whether to include the 
message coalescing feature. The current branch does not include it - mostly due 
to the fact that we've been iterating quite quickly over this code, and I broke 
it when incorporating the CASSANDRA-13630 patch (and killing off 
ChannelWriter). There is some testing happening to reevaluate the efficacy of 
message coalescing with the netty internode messaging.

Some other points of interest:

- switch OMC#backlog from ConcurrentLinkedQueue to MpscLinkedQueue from 
jctools. MpscLinkedQueue is dramtically better, and 
ConcurrentLinkedQueue#isEmpty was a CPU drain.
- improved scheduling of the consumerTask in OutboundMessagingConnection, 
though still needs a bit more refinement
- ditched the OMC.State from the last branch
- added [~jolynch]'s fixes wrt not setting a default SO_SNDBUF value
- OMC - introduced consumerTaskThread vs eventLoop member field
- ditched the auto-read in RebufferingByteBufDataInputPlus - I need to document 
this

In general I have a small bit of documenting to add, but the branch is ready 
for review.

> Internode connection management is race-prone
> -
>
> Key: CASSANDRA-14503
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14503
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: Jason Brown
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Following CASSANDRA-8457, internode connection management has been rewritten 
> to rely on Netty, but the new implementation in 
> {{OutboundMessagingConnection}} seems quite race prone to me, in particular 
> on those two cases:
> * {{#finishHandshake()}} racing with {{#close()}}: i.e. in such case the 
> former could run into an NPE if the latter nulls the {{channelWriter}} (but 
> this is just an example, other conflicts might happen).
> * Connection timeout and retry racing with state changing methods: 
> {{connectionRetryFuture}} and {{connectionTimeoutFuture}} are cancelled when 
> handshaking or closing, but there's no guarantee those will be actually 
> cancelled (as they might be already running), so they might 

[jira] [Commented] (CASSANDRA-14503) Internode connection management is race-prone

2018-09-10 Thread Jason Brown (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609099#comment-16609099
 ] 

Jason Brown commented on CASSANDRA-14503:
-

Patch available here:

||14503||
|[branch|https://github.com/jasobrown/cassandra/tree/14503]|
|[utests  
dtests|https://circleci.com/gh/jasobrown/workflows/cassandra/tree/14503]|
||

Additionally, I've [created a Pull 
Request|https://github.com/apache/cassandra/pull/264] for review, as well.

Note: this patch will need to be rebased when CASSANDRA-13630 is committed, and 
incorprate the changes ChannelWriter for large messages, but that should not 
affect this patch much (I've been keeping that in mind as I worked on this)

- OutboundMessagingConnection changes 
-- All producer threads queue messages into the backlog, and messages are only 
consumed by a task on a fixed thread (the event loop). Producers will contend 
to schedule the consumer, 
but have no further involvement in sending a message (unlike the current 
implementation).
-- All netty-related activity (setting up a remote connection, 
connection-related callbacks and time outs, consuming form the backlog and 
writing to the channel and associated callbacks)
are all handled on the event loop. OutboundMessagingConnection gets a reference 
to a event loop in it's constructor, and uses that for the duration of it's 
lifetime.
-- Finally forward-ported the queue bounding functionality of CASSANDRA-13265. 
In short, we want to limit the size of queued messages in order to not OOM. 
Thus, we schedule a task for the consumer thread
 that examines the queue looking for elements to prune. Further, I've added a 
naive upper bound to the queue so that producers drop messages before enqueuing 
if the backlog is in a *really* bad state.
@djoshi3 has recomended bounding by message size rather than by message count, 
which I agree with, but propose saving that for a followup ticket.
-- Cleaner, more documented, and better tested State machine to manage state 
transitions for the class.

- ChannelWriter and MessageOutHandler became much simpler as we can control the 
flush behaviors from the OMC (instead of the previous complicated CW/MOH dance) 
because we're already on the event loop
when consuming from the backlog and writing to the channel.

- I was able to clean up/remove a bunch of extra code due to this 
simplification, as well (ExpiredException, OutboundMessagingParameters, 
MessageResult)

- Updated all the javadoc documentation for these changes (mostly OMC and 
ChannelWriter)

> Internode connection management is race-prone
> -
>
> Key: CASSANDRA-14503
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14503
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: Jason Brown
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Following CASSANDRA-8457, internode connection management has been rewritten 
> to rely on Netty, but the new implementation in 
> {{OutboundMessagingConnection}} seems quite race prone to me, in particular 
> on those two cases:
> * {{#finishHandshake()}} racing with {{#close()}}: i.e. in such case the 
> former could run into an NPE if the latter nulls the {{channelWriter}} (but 
> this is just an example, other conflicts might happen).
> * Connection timeout and retry racing with state changing methods: 
> {{connectionRetryFuture}} and {{connectionTimeoutFuture}} are cancelled when 
> handshaking or closing, but there's no guarantee those will be actually 
> cancelled (as they might be already running), so they might end up changing 
> the connection state concurrently with other methods (i.e. by unexpectedly 
> closing the channel or clearing the backlog).
> Overall, the thread safety of {{OutboundMessagingConnection}} is very 
> difficult to assess given the current implementation: I would suggest to 
> refactor it into a single-thread model, where all connection state changing 
> actions are enqueued on a single threaded scheduler, so that state 
> transitions can be clearly defined and checked.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14503) Internode connection management is race-prone

2018-08-22 Thread Jason Brown (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588759#comment-16588759
 ] 

Jason Brown commented on CASSANDRA-14503:
-

Working on a solution that will also resolve the problems [~sbtourist] raised 
with CASSANDRA-14507. I've spoken with some of the netty maintainers and 
they've given me some solid advice about to rework 
{{OutboubndMessageConnection}}.

> Internode connection management is race-prone
> -
>
> Key: CASSANDRA-14503
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14503
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: Jason Brown
>Priority: Major
>
> Following CASSANDRA-8457, internode connection management has been rewritten 
> to rely on Netty, but the new implementation in 
> {{OutboundMessagingConnection}} seems quite race prone to me, in particular 
> on those two cases:
> * {{#finishHandshake()}} racing with {{#close()}}: i.e. in such case the 
> former could run into an NPE if the latter nulls the {{channelWriter}} (but 
> this is just an example, other conflicts might happen).
> * Connection timeout and retry racing with state changing methods: 
> {{connectionRetryFuture}} and {{connectionTimeoutFuture}} are cancelled when 
> handshaking or closing, but there's no guarantee those will be actually 
> cancelled (as they might be already running), so they might end up changing 
> the connection state concurrently with other methods (i.e. by unexpectedly 
> closing the channel or clearing the backlog).
> Overall, the thread safety of {{OutboundMessagingConnection}} is very 
> difficult to assess given the current implementation: I would suggest to 
> refactor it into a single-thread model, where all connection state changing 
> actions are enqueued on a single threaded scheduler, so that state 
> transitions can be clearly defined and checked.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org