[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049724#comment-13049724
 ] 

Mck SembWever edited comment on CASSANDRA-2388 at 6/18/11 8:32 AM:
-------------------------------------------------------------------

bq. [snip] One possibility is to use the ip octets like the 
RackInferringSnitch. 

In our use case we have three nodes defined via PropertyFileSnitch:{noformat}
152.90.241.22=DC1:RAC1 #node1
152.90.241.23=DC2:RAC1 #node2
152.90.241.24=DC1:RAC1 #node3{noformat}
The only thing to infer here is that even addresses belong to one DC and odd addresses to the other. 
This is not how RackInferringSnitch works.
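
To make the mismatch concrete, here is a minimal sketch that compares the configured datacenter with what octet-based inference would give. It assumes the inference takes the second octet as the datacenter, which is how RackInferringSnitch is documented; the class and method names are hypothetical and this is not Cassandra code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SnitchComparison {
    // Topology copied from the PropertyFileSnitch entries above.
    static final Map<String, String> PROPERTY_FILE_DC = new LinkedHashMap<>();
    static {
        PROPERTY_FILE_DC.put("152.90.241.22", "DC1");
        PROPERTY_FILE_DC.put("152.90.241.23", "DC2");
        PROPERTY_FILE_DC.put("152.90.241.24", "DC1");
    }

    // Assumption: octet-based inference names the datacenter after the second octet.
    static String inferDatacenter(String ip) {
        return ip.split("\\.")[1];
    }

    public static void main(String[] args) {
        // Print configured versus inferred datacenter for each node.
        for (Map.Entry<String, String> e : PROPERTY_FILE_DC.entrySet()) {
            System.out.println(e.getKey()
                    + " configured=" + e.getValue()
                    + " inferred=" + inferDatacenter(e.getKey()));
        }
    }
}
```

All three addresses share the same second octet, so the inference cannot separate node2 from node1 and node3, whereas the property file places them in different datacenters.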

When we make the connection through the "other" (node2) endpoint, the 
rack-inferring approach, going by "152.90.", will say it's in DC2. Again, this 
is the wrong DC, and the node will return itself as a valid endpoint.

Step (3) seems to me to be too specific to be included here.
If I go only with steps (1), (2), and (4) we get this code:{noformat}
    public String[] sort_endpoints_by_proximity(String endpoint, String[] endpoints, boolean restrictToSameDC)
            throws TException, InvalidRequestException
    {
        try
        {
            List<String> results = new ArrayList<String>();
            InetAddress address = InetAddress.getByName(endpoint);
            // Use the given endpoint's datacenter only if gossip knows the endpoint;
            // otherwise fall back to the local node's datacenter.
            boolean endpointValid = null != Gossiper.instance.getEndpointStateForEndpoint(address);
            String datacenter = DatabaseDescriptor.getEndpointSnitch()
                    .getDatacenter(endpointValid ? address : FBUtilities.getLocalAddress());
            // Resolve each candidate endpoint.
            List<InetAddress> addresses = new ArrayList<InetAddress>();
            for (String ep : endpoints)
            {
                addresses.add(InetAddress.getByName(ep));
            }
            DatabaseDescriptor.getEndpointSnitch().sortByProximity(address, addresses);
            // Keep live endpoints, optionally restricted to the same datacenter.
            for (InetAddress ep : addresses)
            {
                String dc = DatabaseDescriptor.getEndpointSnitch().getDatacenter(ep);
                if (FailureDetector.instance.isAlive(ep) && (!restrictToSameDC || datacenter.equals(dc)))
                {
                    results.add(ep.getHostName());
                }
            }
            return results.toArray(new String[results.size()]);
        }
        catch (UnknownHostException e)
        {
            throw new InvalidRequestException(e.getMessage());
        }
    }{noformat}

I'm happy with this (except that 
{{Gossiper.instance.getEndpointStateForEndpoint(address)}} is only my guess on 
how to tell if an endpoint is valid as such).
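
To see what the filtering step does on the three-node example, here is a self-contained sketch. Everything in it is a hypothetical stand-in: a plain map replaces the snitch's datacenter lookup, a set of live addresses replaces the failure detector, the proximity sort is left out (input order is kept), and there is no fallback to the local address, so an unknown endpoint matches no datacenter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class EndpointFilterSketch {
    // Hypothetical stand-in for the snitch's datacenter lookup.
    static final Map<String, String> DC = new HashMap<>();
    static {
        DC.put("152.90.241.22", "DC1");
        DC.put("152.90.241.23", "DC2");
        DC.put("152.90.241.24", "DC1");
    }

    // Keep candidates that are alive and, if requested, in the endpoint's datacenter.
    static List<String> filter(String endpoint, List<String> candidates,
                               Set<String> alive, boolean restrictToSameDC) {
        String datacenter = DC.get(endpoint);
        List<String> results = new ArrayList<>();
        for (String ep : candidates) {
            boolean sameDc = datacenter != null && datacenter.equals(DC.get(ep));
            if (alive.contains(ep) && (!restrictToSameDC || sameDc)) {
                results.add(ep);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("152.90.241.22", "152.90.241.23", "152.90.241.24");
        // node3 is treated as down in this example.
        Set<String> alive = new HashSet<>(Arrays.asList("152.90.241.22", "152.90.241.23"));
        System.out.println(filter("152.90.241.22", all, alive, true));
        System.out.println(filter("152.90.241.22", all, alive, false));
    }
}
```

With node3 down and the request made through node1, restricting to the same datacenter leaves only node1, while the unrestricted call also returns node2.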

  
> ColumnFamilyRecordReader fails for a given split because a host is down, even 
> if records could reasonably be read from other replica.
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2388
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Eldon Stegall
>            Assignee: Mck SembWever
>              Labels: hadoop, inputformat
>             Fix For: 0.8.2
>
>         Attachments: 0002_On_TException_try_next_split.patch, 
> CASSANDRA-2388.patch, CASSANDRA-2388.patch
>
>
> ColumnFamilyRecordReader only tries the first location for a given split. We 
> should try multiple locations for a given split.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
