[ 
https://issues.apache.org/jira/browse/CASSANDRA-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5171:
--------------------------------------

             Priority: Critical  (was: Trivial)
    Affects Version/s:     (was: 1.2.0)
                       0.7.1
        Fix Version/s:     (was: 1.2.1)
                       1.2.6
           Issue Type: Bug  (was: Improvement)
              Summary: Save EC2Snitch topology information in system table  
(was: Enhance Ec2Snitch gossip info.)

This was reverted in CASSANDRA-5432, but I think the problem it solves is 
actually pretty severe, so I'm reopening it.

The problem is that pretty much everything from TokenMetadata to 
NetworkTopologyStrategy assumes that once we see a node, the snitch can tell us 
where it lives, and in particular that once the snitch tells us where a node 
lives it won't change its answer.

So this is problematic:

{code}
    public String getDatacenter(InetAddress endpoint)
    {
        if (endpoint.equals(FBUtilities.getBroadcastAddress()))
            return ec2region;
        EndpointState state = Gossiper.instance.getEndpointStateForEndpoint(endpoint);
        if (state == null || state.getApplicationState(ApplicationState.DC) == null)
            return DEFAULT_DC;
        return state.getApplicationState(ApplicationState.DC).value;
    }
{code}

That is, if we don't know where a node belongs (e.g., we just restarted and 
haven't been gossiped to yet), we assume it's in {{DEFAULT_DC}}.
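
For illustration only, here is a minimal sketch of the lookup order the new summary implies: live gossip state first, then the last location we saved, and {{DEFAULT_DC}} only for a node we have never seen. The class, its methods, the in-memory maps standing in for gossip and the system table, and the placeholder {{DEFAULT_DC}} value are all hypothetical; this is not the attached patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a snitch; none of these names are Cassandra's real API.
class SavedTopologySnitch {
    // Placeholder value, assumed for the sketch.
    static final String DEFAULT_DC = "UNKNOWN-DC";

    // Stands in for live gossip state, which is empty right after a restart.
    private final Map<String, String> gossipDc = new ConcurrentHashMap<>();
    // Stands in for topology persisted in a system table, which survives restarts.
    private final Map<String, String> savedDc = new ConcurrentHashMap<>();

    // Record what gossip told us and persist it for later restarts.
    void onGossip(String endpoint, String dc) {
        gossipDc.put(endpoint, dc);
        savedDc.put(endpoint, dc);
    }

    // Simulate a restart: gossip state is lost, saved state is not.
    void simulateRestart() {
        gossipDc.clear();
    }

    String getDatacenter(String endpoint) {
        String dc = gossipDc.get(endpoint);
        if (dc != null)
            return dc;
        // Fall back to the last known location before giving up.
        dc = savedDc.get(endpoint);
        if (dc != null)
            return dc;
        return DEFAULT_DC;
    }
}
```

With this ordering, a node seen before a restart keeps its answer across the restart; only a node that was never seen gets the default.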

This can lead to data loss.  Consider node X in DC1, where keyspace KS is 
replicated.  Suddenly X is yanked out of DC1 and placed in DC2, where KS is not 
replicated.  Nobody will bother querying X for the data in KS that was formerly 
replicated to it.  Even repair will not see it.
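
To make the scenario concrete, here is a toy model of why X drops out of sight. It is not NetworkTopologyStrategy; the class and method names are hypothetical, and it only captures the idea that nodes are considered for a keyspace when the datacenter the snitch reports for them has a nonzero replication factor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical toy model; not Cassandra's replication strategy code.
class ReplicaFilterDemo {
    // Returns the nodes whose reported datacenter has a replication factor > 0
    // for the keyspace, i.e. the only nodes that would be consulted for its data.
    static List<String> eligibleNodes(Map<String, String> nodeToDc,
                                      Map<String, Integer> rfByDc) {
        List<String> eligible = new ArrayList<>();
        for (Map.Entry<String, String> e : nodeToDc.entrySet()) {
            int rf = rfByDc.getOrDefault(e.getValue(), 0);
            if (rf > 0)
                eligible.add(e.getKey());
        }
        return eligible;
    }
}
```

If KS has a replication factor only in DC1 and the snitch starts reporting X as being in DC2, X disappears from the eligible list even though it still holds KS data on disk.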
                
> Save EC2Snitch topology information in system table
> ---------------------------------------------------
>
>                 Key: CASSANDRA-5171
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5171
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.1
>         Environment: EC2
>            Reporter: Vijay
>            Assignee: Vijay
>            Priority: Critical
>             Fix For: 1.2.6
>
>         Attachments: 0001-CASSANDRA-5171.patch
>
>
> EC2Snitch currently waits for the Gossip information to understand the 
> cluster information every time we restart. It will be nice to use already 
> available system table info similar to GPFS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
