[ https://issues.apache.org/jira/browse/CASSANDRA-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-5171:
--------------------------------------
             Priority: Critical  (was: Trivial)
    Affects Version/s:     (was: 1.2.0)
                       0.7.1
        Fix Version/s:     (was: 1.2.1)
                       1.2.6
           Issue Type: Bug  (was: Improvement)
              Summary: Save EC2Snitch topology information in system table  (was: Enhance Ec2Snitch gossip info.)

This was reverted in CASSANDRA-5432, but I think the problem it solves is actually pretty severe, so I'm reopening it.

The problem is that pretty much everything from TokenMetadata to NetworkTopologyStrategy assumes that once we see a node, the snitch can tell us where it lives, and in particular that once the snitch tells us where a node lives it won't change its answer.

So this is problematic:

{code}
    public String getDatacenter(InetAddress endpoint)
    {
        if (endpoint.equals(FBUtilities.getBroadcastAddress()))
            return ec2region;
        EndpointState state = Gossiper.instance.getEndpointStateForEndpoint(endpoint);
        if (state == null || state.getApplicationState(ApplicationState.DC) == null)
            return DEFAULT_DC;
        return state.getApplicationState(ApplicationState.DC).value;
    }
{code}

That is, if we don't know where a node belongs (e.g., we just restarted and haven't been gossiped to yet), assume it's in {{DEFAULT_DC}}.

This can lead to data loss. Consider node X in DC1, where keyspace KS is replicated. Suddenly X is yanked out of DC1 and placed in DC2, where KS is not replicated. Nobody will bother querying X for the data in KS that was formerly replicated to it. Even repair will not see it.
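To make the failure mode concrete, here is a minimal sketch of the lookup order the new summary suggests: live gossip state first, then topology saved from an earlier run, and only then {{DEFAULT_DC}}. This is not the attached patch. The class {{DatacenterLookupSketch}}, its two maps, and the methods {{onGossip}}, {{restart}}, {{getDatacenterGossipOnly}} and {{getDatacenterWithSavedFallback}} are all hypothetical stand-ins, and plain in-memory maps take the place of the Gossiper and the system table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a snitch; none of these names are Cassandra APIs.
class DatacenterLookupSketch {
    static final String DEFAULT_DC = "UNKNOWN-DC";

    // Stand-in for live gossip state: endpoint -> datacenter.
    final Map<String, String> gossipState = new HashMap<>();
    // Stand-in for topology persisted in a system table: endpoint -> datacenter.
    final Map<String, String> savedTopology = new HashMap<>();

    // Called when gossip reports where an endpoint lives; the answer is also persisted.
    void onGossip(String endpoint, String dc) {
        gossipState.put(endpoint, dc);
        savedTopology.put(endpoint, dc);
    }

    // Simulates a restart: gossip state is lost, saved topology survives.
    void restart() {
        gossipState.clear();
    }

    // Same shape as the quoted snippet: gossip only, then DEFAULT_DC.
    String getDatacenterGossipOnly(String endpoint) {
        String dc = gossipState.get(endpoint);
        return dc != null ? dc : DEFAULT_DC;
    }

    // With the fallback: gossip, then saved topology, then DEFAULT_DC.
    String getDatacenterWithSavedFallback(String endpoint) {
        String dc = gossipState.get(endpoint);
        if (dc == null)
            dc = savedTopology.get(endpoint);
        return dc != null ? dc : DEFAULT_DC;
    }
}
```

In this sketch, after {{restart()}} the gossip-only lookup changes its answer for a known endpoint to {{DEFAULT_DC}}, while the fallback lookup keeps returning the datacenter it reported before the restart. An endpoint that has never been seen still falls through to {{DEFAULT_DC}} in both.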
> Save EC2Snitch topology information in system table
> ---------------------------------------------------
>
>                 Key: CASSANDRA-5171
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5171
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.1
>         Environment: EC2
>            Reporter: Vijay
>            Assignee: Vijay
>            Priority: Critical
>             Fix For: 1.2.6
>
>         Attachments: 0001-CASSANDRA-5171.patch
>
>
> EC2Snitch currently waits for the Gossip information to understand the
> cluster information every time we restart. It will be nice to use already
> available system table info similar to GPFS.