[ 
https://issues.apache.org/jira/browse/CASSANDRA-16525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-16525:
----------------------------------
    Description: 
In 4.0, new application states were added to Gossip and the corresponding old 
ones were deprecated, e.g. {{STATUS}} and its successor {{STATUS_WITH_PORT}}.

The jvm (upgrade) dtests uncovered 2 issues. First, the {{STATUS}} field of a 
peer can be missing on a lower-version (e.g. 3.0) node. Second, {{STATUS}} can 
coexist with the new state {{STATUS_WITH_PORT}} on 4.0 nodes after the cluster 
is fully upgraded, and the {{STATUS}} field can become stale because a 4.0 node 
filters the legacy field out when applying new state. 

The first issue can happen in this scenario. During upgrade, node1 and node2 
are on v4 and node3 is still on v3. If node3 only gets the gossip info 
regarding node2 from node1, the {{STATUS}} field of node2 will be missing in 
node3's local state, which is unexpected. There could be many reasons why 
node3 does not exchange gossip with node2 directly, e.g. a network issue 
between node2 and node3, or simply that node2 does not select node3 when 
initiating the gossip round. I have a [jvm upgrade 
dtest|https://github.com/yifan-c/cassandra/blob/CASSANDRA-16525/trunk/test/distributed/org/apache/cassandra/distributed/upgrade/MixedModeGossipTest.java#L81]
 to demonstrate the unexpected behavior.
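
The scenario above can be sketched with a toy model. This is only an illustration and not the Gossiper code: the class {{MissingStatusSketch}}, its methods and the {{"NORMAL"}} values are hypothetical, and it assumes that a 4.0 node drops {{STATUS}} whenever {{STATUS_WITH_PORT}} is present and that a 3.0 node ignores the state it does not know.

```java
import java.util.EnumMap;
import java.util.Map;

public class MissingStatusSketch {
    enum AppState { STATUS, STATUS_WITH_PORT }

    // Hypothetical model of a 4.0 receiver: when the successor state is
    // present, the legacy STATUS is filtered out before saving.
    static Map<AppState, String> applyOnV4(Map<AppState, String> incoming) {
        Map<AppState, String> saved = new EnumMap<>(AppState.class);
        saved.putAll(incoming);
        if (saved.containsKey(AppState.STATUS_WITH_PORT)) {
            saved.remove(AppState.STATUS);
        }
        return saved;
    }

    // Hypothetical model of a 3.0 receiver: it only understands STATUS and
    // ignores the state it does not know.
    static Map<AppState, String> applyOnV3(Map<AppState, String> incoming) {
        Map<AppState, String> saved = new EnumMap<>(AppState.class);
        if (incoming.containsKey(AppState.STATUS)) {
            saved.put(AppState.STATUS, incoming.get(AppState.STATUS));
        }
        return saved;
    }

    // What node2 (v4) announces about itself: both the legacy and new state.
    static Map<AppState, String> node2OwnState() {
        Map<AppState, String> own = new EnumMap<>(AppState.class);
        own.put(AppState.STATUS, "NORMAL");
        own.put(AppState.STATUS_WITH_PORT, "NORMAL");
        return own;
    }

    // Returns true if node3 (v3) ends up with a STATUS entry for node2.
    static boolean node3SeesStatus(boolean relayedThroughNode1) {
        Map<AppState, String> source = relayedThroughNode1
                ? applyOnV4(node2OwnState())   // node1's filtered copy of node2
                : node2OwnState();             // direct exchange with node2
        return applyOnV3(source).containsKey(AppState.STATUS);
    }

    public static void main(String[] args) {
        System.out.println("direct exchange: " + node3SeesStatus(false));
        System.out.println("relayed via node1: " + node3SeesStatus(true));
    }
}
```

In this model node3 only learns {{STATUS}} when it talks to node2 directly. The copy relayed by node1 has already lost the legacy field.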

The cause of the second issue is more subtle. Heartbeat updates happen as part 
of the Gossip task, outside of the GossipStage. When node2 has just updated its 
local application state and receives a {{SYN}} from node1, node2 replies with 
its gossip state without having updated the heartbeat version. When node1 
receives the reply, it first filters out the legacy {{STATUS}} field and saves 
only the new one. So far so good. However, node2 soon updates its heartbeat, 
and in the next gossip round node1 realizes that its local version is lower 
than the remote (node2) version. So node2 sends {{STATUS}} along to node1. 
Because it does not arrive together with the new field, node1 does not filter 
it out on receipt. Boo! Node1 now has the {{STATUS}} field from node2. Such a 
field can become stale and diverge from its successor in a live cluster. The 
jvm upgrade test 
[testStatusFieldShouldExistInOldVersionNodes|https://github.com/yifan-c/cassandra/blob/CASSANDRA-16525/trunk/test/distributed/org/apache/cassandra/distributed/upgrade/MixedModeGossipTest.java#L47]
 reproduces it fairly easily once the entire cluster is upgraded. There is also 
another jvm dtest (with source changes to make the result deterministic, see 
the attached 
Demonstrate-a-scenario-that-a-node-may-hold-the-stale-status.patch) that 
demonstrates that {{STATUS}} can be replicated to the peer and become stale. 
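
The race can be walked through with a simplified model. It is a sketch under assumptions, not the real code path: {{StaleStatusSketch}}, {{View}}, {{delta}} and {{apply}} are hypothetical names, the version numbers are made up, and it assumes that {{STATUS}} carries a higher version than {{STATUS_WITH_PORT}} and that a peer only receives states newer than the max version it already holds.

```java
import java.util.EnumMap;
import java.util.Map;

public class StaleStatusSketch {
    enum AppState { STATUS, STATUS_WITH_PORT }

    // Versions known for one endpoint: app states plus the heartbeat.
    static final class View {
        final Map<AppState, Integer> states = new EnumMap<>(AppState.class);
        int heartbeat;

        int maxVersion() {
            int max = heartbeat;
            for (int v : states.values()) max = Math.max(max, v);
            return max;
        }
    }

    // The owner sends every state newer than what the peer already has.
    static Map<AppState, Integer> delta(View source, int peerMaxVersion) {
        Map<AppState, Integer> out = new EnumMap<>(AppState.class);
        source.states.forEach((k, v) -> { if (v > peerMaxVersion) out.put(k, v); });
        return out;
    }

    // The receiver drops STATUS only if STATUS_WITH_PORT is in the same message.
    static void apply(View local, Map<AppState, Integer> incoming, int heartbeat) {
        Map<AppState, Integer> filtered = new EnumMap<>(AppState.class);
        filtered.putAll(incoming);
        if (filtered.containsKey(AppState.STATUS_WITH_PORT)) {
            filtered.remove(AppState.STATUS);
        }
        local.states.putAll(filtered);
        local.heartbeat = Math.max(local.heartbeat, heartbeat);
    }

    // Returns true if node1 ends up holding node2's legacy STATUS.
    static boolean node1HoldsLegacyStatus(int heartbeatAtFirstReply) {
        View node2 = new View();
        node2.heartbeat = heartbeatAtFirstReply;
        node2.states.put(AppState.STATUS_WITH_PORT, 5); // assumed versions
        node2.states.put(AppState.STATUS, 6);

        View node1ViewOfNode2 = new View(); // node1 knows nothing about node2 yet

        // Round 1: both states arrive together, so STATUS is filtered out.
        apply(node1ViewOfNode2, delta(node2, node1ViewOfNode2.maxVersion()), node2.heartbeat);

        // node2 updates its heartbeat before the next round.
        node2.heartbeat += 10;

        // Round 2: only states newer than node1's max version are sent.
        apply(node1ViewOfNode2, delta(node2, node1ViewOfNode2.maxVersion()), node2.heartbeat);
        return node1ViewOfNode2.states.containsKey(AppState.STATUS);
    }

    public static void main(String[] args) {
        System.out.println("heartbeat behind app states: " + node1HoldsLegacyStatus(4));
        System.out.println("heartbeat ahead of app states: " + node1HoldsLegacyStatus(7));
    }
}
```

With a heartbeat of 4 in the first reply, node1's max version for node2 stays at 5, so the second round delivers {{STATUS}} (version 6) alone and the filter does not fire. With a heartbeat of 7, node1 already holds a version higher than both states and {{STATUS}} is never sent again.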

The fix is to:
1) retain the legacy fields while the cluster is still in mixed mode
2) remove the legacy fields once the cluster is fully upgraded
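
The two rules can be sketched as follows. This is a minimal illustration and not the actual patch: {{LegacyStateFixSketch}}, {{isMixedMode}} and {{apply}} are hypothetical, and deciding mixed mode from a list of major versions is an assumption made only for this example.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class LegacyStateFixSketch {
    enum AppState { STATUS, STATUS_WITH_PORT }

    // Assumption: the cluster counts as mixed mode while any known node
    // still runs a major version below 4.
    static boolean isMixedMode(List<Integer> majorVersions) {
        return majorVersions.stream().anyMatch(v -> v < 4);
    }

    // Rule 1: in mixed mode, keep whatever legacy fields arrive.
    // Rule 2: once fully upgraded, drop the legacy field.
    static Map<AppState, String> apply(Map<AppState, String> local,
                                       Map<AppState, String> incoming,
                                       boolean mixedMode) {
        Map<AppState, String> merged = new EnumMap<>(AppState.class);
        merged.putAll(local);
        merged.putAll(incoming);
        if (!mixedMode) {
            merged.remove(AppState.STATUS);
        }
        return merged;
    }

    // Returns true if STATUS survives for a cluster with the given versions.
    static boolean keepsStatus(List<Integer> majorVersions) {
        Map<AppState, String> incoming = new EnumMap<>(AppState.class);
        incoming.put(AppState.STATUS, "NORMAL");
        incoming.put(AppState.STATUS_WITH_PORT, "NORMAL");
        Map<AppState, String> empty = new EnumMap<>(AppState.class);
        return apply(empty, incoming, isMixedMode(majorVersions))
                .containsKey(AppState.STATUS);
    }

    public static void main(String[] args) {
        System.out.println("mixed (4, 4, 3): " + keepsStatus(Arrays.asList(4, 4, 3)));
        System.out.println("upgraded (4, 4, 4): " + keepsStatus(Arrays.asList(4, 4, 4)));
    }
}
```

Because the removal in this sketch no longer depends on {{STATUS_WITH_PORT}} arriving in the same message, a {{STATUS}} that arrives alone after the upgrade is dropped as well.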


> Gossip STATUS can be either missing during upgrade or stale after upgrade
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16525
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16525
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>         Attachments: 
> Demonstrate-a-scenario-that-a-node-may-hold-the-stale-status.patch
>


