[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15069845#comment-15069845 ]

T. David Hudson commented on CASSANDRA-8113:
--------------------------------------------

I'm seeing 3 nodes of a 4-node Cassandra 2.1.1 (upgraded from 2.Xs a while back) cluster reporting ancient generations and refusing to accept a modern generation from the fourth. Could the generation check allow a timestamp far past the local generation but nevertheless reasonable w.r.t. the current system clock?

> Gossip should ignore generation numbers too far in the future
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-8113
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8113
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Richard Low
>            Assignee: Jason Brown
>             Fix For: 2.1.1
>
>         Attachments: 8113-v1.txt, 8113-v2.txt, 8113-v3.txt, 8113-v4.txt, 8133-fix.txt
>
>
> If a node sends corrupted gossip, it could set the generation numbers for
> other nodes to arbitrarily large values. This is dangerous since one bad node
> (e.g. with bad memory) could in theory bring down the cluster. Nodes should
> refuse to accept generation numbers that are too far in the future.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
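The relaxation asked about in this comment could look roughly like the sketch below. It is only an illustration and not the committed patch: it assumes generations are startup timestamps in seconds, reuses the one-year window discussed elsewhere in the thread, and combines the "relative to the local generation" test with a "relative to the current clock" test. The class name, method name, and the `nowSeconds` parameter are hypothetical.

```java
// Hypothetical sketch: accept a remote generation if it is either close to the
// generation we already know, or plausible with respect to the local clock.
final class ClockAwareGenerationCheck {
    // About one year expressed in seconds (86400 * 365 = 31,536,000).
    static final long MAX_GENERATION_DIFFERENCE = 86400L * 365;

    // All arguments are assumed to be non-negative seconds since the epoch.
    static boolean accept(long localGeneration, long remoteGeneration, long nowSeconds) {
        // Test 1: remote generation is within the window above the known one.
        boolean nearLocal = remoteGeneration - localGeneration <= MAX_GENERATION_DIFFERENCE;
        // Test 2 (the suggested relaxation): remote generation is no more than
        // one year ahead of the current system clock.
        boolean plausibleForClock = remoteGeneration <= nowSeconds + MAX_GENERATION_DIFFERENCE;
        return nearLocal || plausibleForClock;
    }
}
```

With an ancient local generation, a recent remote generation passes the clock test even though it fails the local-difference test, while a value decades ahead of the clock is still rejected. The trade-off raised in the thread still applies: the clock test is only as trustworthy as the local clock.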
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14175487#comment-14175487 ]

Jason Brown commented on CASSANDRA-8113:
----------------------------------------

+1 to @driftx's fix-it patch
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14171097#comment-14171097 ]

Brandon Williams commented on CASSANDRA-8113:
---------------------------------------------

On the whole, +1, though I think MAX_VERSION_DIFFERENCE should be MAX_GENERATION_DIFFERENCE so it's more correct, since we have both generations and versions in gossip.
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14171299#comment-14171299 ]

Brandon Williams commented on CASSANDRA-8113:
---------------------------------------------

I'm more leery about the max version check since technically the version has no bounds, and possibly you could someday have something minor get out of whack, like severity, but it shouldn't effectively make the node be ignored. Plus, if you get a bit flipped in a version making it artificially high, simply restarting the node would fix it since it will have a superior generation after that. What you can't fix as easily is an artificially high generation, since you basically have to assassinate the node at that point.

So, I would say drop the version check since there's some risk of a regression there and it's a little overly zealous, since that's trivial to solve if it does happen. (And then rename max_version since it's not looking at versions.)
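The reasoning here, that a restart cures a corrupted version but not a corrupted generation, follows from generation being compared before version. A minimal sketch of that ordering, assuming heartbeat state is compared as a (generation, version) pair; the class and method names are hypothetical and are not taken from the patch:

```java
// Hypothetical sketch of (generation, version) ordering for heartbeat state.
final class HeartbeatOrder {
    // Returns true when state A should supersede state B.
    static boolean isNewer(long generationA, long versionA, long generationB, long versionB) {
        // A higher generation always wins, whatever the versions are.
        if (generationA != generationB) {
            return generationA > generationB;
        }
        // Within the same generation, the higher version wins.
        return versionA > versionB;
    }
}
```

Under this ordering, a restarted node with generation 101 and version 1 supersedes a stale entry with generation 100 and an absurdly high version. No such escape exists for an inflated generation, because nothing legitimate the node can announce compares higher.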
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14171304#comment-14171304 ]

Brandon Williams commented on CASSANDRA-8113:
---------------------------------------------

Also there's no real contract for how often you can update your gossip state if you want to, so we could have an accidental loop blow the version up and then run into further problems because now it's being ignored.
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14171576#comment-14171576 ]

Brandon Williams commented on CASSANDRA-8113:
---------------------------------------------

+1 with minor nit: the continue used in the generation isn't needed
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14171584#comment-14171584 ]

Jason Brown commented on CASSANDRA-8113:
----------------------------------------

ahh, will remove that continue. I'm planning on applying this to 2.0 and up, wdyt?
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14171590#comment-14171590 ]

Brandon Williams commented on CASSANDRA-8113:
---------------------------------------------

I was thinking 2.1.1, since this is an improvement rather than a bug.
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14171598#comment-14171598 ]

Jason Brown commented on CASSANDRA-8113:
----------------------------------------

wfm, committed to 2.1 and trunk
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169922#comment-14169922 ]

Jason Brown commented on CASSANDRA-8113:
----------------------------------------

What happens when you have a network partition that lasts days (or a week or two)? The heartbeat is updated, more or less, once a second. So the version of a given node can increment by ~86400 per day (minus a few for GC pauses, thread scheduling, etc.). Depending on what you think a "too far in the future" value is, if you set that high water mark too low, you will doom the cluster to never converging as well. If we want to consider a high water mark of difference of a couple million or so, that might be reasonable.
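The arithmetic in this comment can be checked with a small sketch. It assumes exactly one heartbeat update per second, which is the upper bound the comment describes; the class and method names are hypothetical.

```java
// Hypothetical sketch: how far a node's heartbeat version can drift during a
// partition, assuming one update per second.
final class HeartbeatDrift {
    // 24 hours * 60 minutes * 60 seconds = 86400 updates per day.
    static final long UPDATES_PER_DAY = 24L * 60 * 60;

    // Upper bound on the version increase after a partition of the given length.
    static long expectedDrift(long partitionDays) {
        return partitionDays * UPDATES_PER_DAY;
    }

    // True when the drift stays within the high water mark, so the nodes can
    // still accept each other's state once the partition heals.
    static boolean convergesAfter(long partitionDays, long highWaterMark) {
        return expectedDrift(partitionDays) <= highWaterMark;
    }
}
```

A two-week partition yields a drift of at most 14 * 86400 = 1,209,600, which fits under a high water mark of two million. A 30-day partition yields 2,592,000 and would not, which is the convergence risk raised above.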
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169933#comment-14169933 ]

Brandon Williams commented on CASSANDRA-8113:
---------------------------------------------

> Depending on what you think a too far in the future value is

I was thinking we could just take current time + one year.
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169950#comment-14169950 ]

Jason Brown commented on CASSANDRA-8113:
----------------------------------------

What if your local clock is borked? Perhaps, a remote node's last version + an approximation of the number of updates in a year (86400 * 365) = 31,536,000. wdyt?
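The bound proposed here can be sketched as follows. This is an illustration of the idea, not the committed patch: the constant name MAX_GENERATION_DIFFERENCE comes from the review discussion in this thread, while the class name, method name, and parameters are hypothetical.

```java
// Hypothetical sketch: reject a remote value that is more than about one
// year's worth of per-second updates above the last value known for that node.
final class GenerationBound {
    // 86400 updates per day * 365 days = 31,536,000.
    static final long MAX_GENERATION_DIFFERENCE = 86400L * 365;

    // Both values are assumed to be non-negative, so the subtraction cannot overflow.
    static boolean isAcceptable(long lastKnown, long remote) {
        return remote - lastKnown <= MAX_GENERATION_DIFFERENCE;
    }
}
```

Because the comparison is anchored on the last value seen from the remote node rather than on the local clock, a misconfigured local clock does not affect the decision, which is the concern raised in this comment.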
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169956#comment-14169956 ]

Brandon Williams commented on CASSANDRA-8113:
---------------------------------------------

> What if your local clock is borked?

Then you have bigger problems ;)

> Perhaps, a remote node's last version + an approximation of the number of updates in a year (86400 * 365) = 31,536,000.

Sounds reasonable to me.