[ https://issues.apache.org/jira/browse/CASSANDRA-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609099#comment-16609099 ]
Jason Brown commented on CASSANDRA-14503: ----------------------------------------- Patch available here: ||14503|| |[branch|https://github.com/jasobrown/cassandra/tree/14503]| |[utests & dtests|https://circleci.com/gh/jasobrown/workflows/cassandra/tree/14503]| || Additionally, I've [created a Pull Request|https://github.com/apache/cassandra/pull/264] for review, as well. Note: this patch will need to be rebased when CASSANDRA-13630 is committed, and incorprate the changes ChannelWriter for large messages, but that should not affect this patch much (I've been keeping that in mind as I worked on this) - OutboundMessagingConnection changes -- All producer threads queue messages into the backlog, and messages are only consumed by a task on a fixed thread (the event loop). Producers will contend to schedule the consumer, but have no further involvement in sending a message (unlike the current implementation). -- All netty-related activity (setting up a remote connection, connection-related callbacks and time outs, consuming form the backlog and writing to the channel and associated callbacks) are all handled on the event loop. OutboundMessagingConnection gets a reference to a event loop in it's constructor, and uses that for the duration of it's lifetime. -- Finally forward-ported the queue bounding functionality of CASSANDRA-13265. In short, we want to limit the size of queued messages in order to not OOM. Thus, we schedule a task for the consumer thread that examines the queue looking for elements to prune. Further, I've added a naive upper bound to the queue so that producers drop messages before enqueuing if the backlog is in a *really* bad state. @djoshi3 has recomended bounding by message size rather than by message count, which I agree with, but propose saving that for a followup ticket. -- Cleaner, more documented, and better tested State machine to manage state transitions for the class. - ChannelWriter and MessageOutHandler became much simpler as we can control the flush behaviors from the OMC (instead of the previous complicated CW/MOH dance) because we're already on the event loop when consuming from the backlog and writing to the channel. - I was able to clean up/remove a bunch of extra code due to this simplification, as well (ExpiredException, OutboundMessagingParameters, MessageResult) - Updated all the javadoc documentation for these changes (mostly OMC and ChannelWriter) > Internode connection management is race-prone > --------------------------------------------- > > Key: CASSANDRA-14503 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14503 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging > Reporter: Sergio Bossa > Assignee: Jason Brown > Priority: Major > Labels: pull-request-available > Fix For: 4.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Following CASSANDRA-8457, internode connection management has been rewritten > to rely on Netty, but the new implementation in > {{OutboundMessagingConnection}} seems quite race prone to me, in particular > on those two cases: > * {{#finishHandshake()}} racing with {{#close()}}: i.e. in such case the > former could run into an NPE if the latter nulls the {{channelWriter}} (but > this is just an example, other conflicts might happen). > * Connection timeout and retry racing with state changing methods: > {{connectionRetryFuture}} and {{connectionTimeoutFuture}} are cancelled when > handshaking or closing, but there's no guarantee those will be actually > cancelled (as they might be already running), so they might end up changing > the connection state concurrently with other methods (i.e. by unexpectedly > closing the channel or clearing the backlog). > Overall, the thread safety of {{OutboundMessagingConnection}} is very > difficult to assess given the current implementation: I would suggest to > refactor it into a single-thread model, where all connection state changing > actions are enqueued on a single threaded scheduler, so that state > transitions can be clearly defined and checked. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org