[ https://issues.apache.org/jira/browse/CASSANDRA-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094911#comment-16094911 ]
Joel Knighton commented on CASSANDRA-13700:
-------------------------------------------

Thanks, Jason!

In this case, I agree the first option is safer for this issue. Something like the second likely makes sense eventually, at least as part of a larger audit of correctness issues in gossip.

I believe your volatile suggestion is correct.

I don't have a lot of helpful information to reproduce this; it reproduces in larger clusters, particularly with higher latency levels. We can see the effects locally with a few well-timed sleeps in MessagingService, but that isn't terribly representative.

Branches pushed here:
||branch||
|[13700-2.1|https://github.com/jkni/cassandra/tree/13700-2.1]||
|[13700-2.2|https://github.com/jkni/cassandra/tree/13700-2.2]||
|[13700-3.0|https://github.com/jkni/cassandra/tree/13700-3.0]||
|[13700-3.11|https://github.com/jkni/cassandra/tree/13700-3.11]||
|[13700-trunk|https://github.com/jkni/cassandra/tree/13700-trunk]||

There's a somewhat conceptually similar issue when we bump the gossip generation in the middle of constructing a reply - I believe that's the cause in [CASSANDRA-11825], which presents similar problems. I'm choosing to address them separately because they're indeed distinct problems and 11825 requires an additional trigger (enabling and disabling gossip during runtime).

> Heartbeats can cause gossip information to go permanently missing on certain
> nodes
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13700
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13700
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Distributed Metadata
>            Reporter: Joel Knighton
>            Assignee: Joel Knighton
>            Priority: Critical
>
> In {{Gossiper.getStateForVersionBiggerThan}}, we add the {{HeartBeatState}}
> from the corresponding {{EndpointState}} to the {{EndpointState}} to send.
> When we're getting state for ourselves, this means that we add a reference to
> the local {{HeartBeatState}}. Then, once we've built a message (in either the
> Syn or Ack handler), we send it through the {{MessagingService}}. In the case
> that the {{MessagingService}} is sufficiently slow, the {{GossipTask}} may
> run before serialization of the Ack or Ack2. This means that when the
> {{GossipTask}} acquires the gossip {{taskLock}}, it may increment the
> {{HeartBeatState}} version of the local node as stored in the endpoint state
> map. Then, when we finally serialize the Ack or Ack2, we'll follow the
> reference to the {{HeartBeatState}} and serialize it with a higher version
> than we saw when constructing the Ack or Ack2.
> Consider the case where we see {{HeartBeatState}} with version 4 when
> constructing an Ack and send it through the {{MessagingService}}. Then, we
> add some piece of state with version 5 to our local {{EndpointState}}. If
> {{GossipTask}} runs and increases the {{HeartBeatState}} version to 6 before
> the {{MessageOut}} containing the Ack is serialized, the node receiving the
> Ack will believe it is current to version 6, despite the fact that it has
> never received a message containing the {{ApplicationState}} tagged with
> version 5.
> I've reproduced this in several versions; so far, I believe this is
> possible in all versions.
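
The version 4 / 5 / 6 scenario from the issue description can be replayed on a single thread, which makes the aliasing easier to see than a timing-dependent reproduction. The sketch below is illustrative only: `Heartbeat`, `OutgoingAck`, and `serializedVersion` are hypothetical stand-ins, not Cassandra's `HeartBeatState`, `EndpointState`, or `MessageOut`. Copying the heartbeat when the reply is built is shown as one possible way to decouple the message from later increments; it is an assumption for illustration and not necessarily what the linked branches do.

```java
// Sketch only: Heartbeat and OutgoingAck are hypothetical stand-ins,
// not Cassandra's HeartBeatState, EndpointState, or MessageOut.
final class HeartbeatAliasingSketch {

    /** Mutable version holder, playing the role of the local heartbeat. */
    static final class Heartbeat {
        private volatile int version;

        Heartbeat(int version) { this.version = version; }

        int getVersion() { return version; }

        void setVersion(int newVersion) { version = newVersion; }

        Heartbeat copy() { return new Heartbeat(version); }
    }

    /** A reply that has been built but not yet serialized. */
    static final class OutgoingAck {
        private final Heartbeat heartbeat;

        OutgoingAck(Heartbeat heartbeat) { this.heartbeat = heartbeat; }

        /** Serialization reads the heartbeat when it runs, not when the Ack was built. */
        int serializeHeartbeatVersion() { return heartbeat.getVersion(); }
    }

    /**
     * Replays the sequence from the description on a single thread:
     * build the Ack at version 4, add state at version 5, bump the
     * heartbeat to 6, and only then serialize.
     */
    static int serializedVersion(boolean copyHeartbeat) {
        Heartbeat local = new Heartbeat(4);

        // The Ack is built while the heartbeat is at version 4.
        OutgoingAck ack = new OutgoingAck(copyHeartbeat ? local.copy() : local);

        // Application state tagged with version 5 is added locally; it is not in the Ack.
        int applicationStateVersion = 5;

        // The periodic gossip task bumps the heartbeat past that state.
        local.setVersion(applicationStateVersion + 1);

        // The delayed serialization finally happens.
        return ack.serializeHeartbeatVersion();
    }

    public static void main(String[] args) {
        System.out.println("shared reference: " + serializedVersion(false));
        System.out.println("copied heartbeat: " + serializedVersion(true));
    }
}
```

With the shared reference, the serialized heartbeat carries version 6 even though the Ack was built at version 4, so a receiver would consider itself current past the version 5 state it never saw. With the copy, the serialized version stays at 4 and the version 5 state is still requested in a later round.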