[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

T Jake Luciani (JIRA) Tue, 14 Jun 2011 06:10:52 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049169#comment-13049169
 ]


T Jake Luciani edited comment on CASSANDRA-2388 at 6/14/11 1:07 PM:
--------------------------------------------------------------------

I think the core issue is that you can't assume the Hadoop node is running on 
a Cassandra node...

If it is, then the logic is straightforward; if not, the connection could 
cross DC boundaries. One possibility is to use the IP octets, as the 
RackInferringSnitch does.

How's this proposal then? Keep the sort_endpoints_by_proximity signature as 
is, pass the client endpoint along with the list of data endpoints, and add 
the following logic:

1) Sort the endpoints using the endpoint_snitch.
2) If the client endpoint *is* a valid Cassandra node, get the node's DC and 
prune nodes outside of this DC.
3) If the client endpoint *is not* a valid Cassandra node, try to infer the DC 
from its IP and prune data endpoint nodes in a different DC. If no Cassandra 
nodes are left in the DC list, go to 4).
4) If all else fails, return the sorted endpoint list.
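
The four steps could be sketched roughly as follows. This is only an 
illustration, not the actual Cassandra code: EndpointPruning, sortAndPrune, 
inferDcFromIp, the nodeDc map (known Cassandra node IP to DC name) and the 
proximity comparator are all hypothetical stand-ins for the snitch. It also 
assumes DC names are the second IP octet, which is how RackInferringSnitch 
derives them, so that an inferred client DC can be compared with a node's DC.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the proposed logic; none of these names exist in Cassandra.
final class EndpointPruning {

    // Infer a DC label from the second IPv4 octet (assumption: octet-based DC names).
    // Returns null when the address is not a dotted quad.
    static String inferDcFromIp(String ip) {
        String[] octets = ip.split("\\.");
        return octets.length == 4 ? octets[1] : null;
    }

    // nodeDc maps known Cassandra node addresses to their DC; proximity stands in
    // for the ordering the endpoint snitch would apply.
    static List<String> sortAndPrune(String client, List<String> dataEndpoints,
                                     Map<String, String> nodeDc,
                                     Comparator<String> proximity) {
        // 1) sort the endpoints
        List<String> sorted = new ArrayList<>(dataEndpoints);
        sorted.sort(proximity);

        // 2) client is a known node: use its DC; 3) otherwise infer the DC from its IP
        String clientDc = nodeDc.containsKey(client) ? nodeDc.get(client)
                                                     : inferDcFromIp(client);
        if (clientDc == null) {
            return sorted; // 4) nothing to prune against
        }

        // keep only data endpoints in the client's DC, preserving the sorted order
        List<String> sameDc = new ArrayList<>();
        for (String ep : sorted) {
            String dc = nodeDc.containsKey(ep) ? nodeDc.get(ep) : inferDcFromIp(ep);
            if (clientDc.equals(dc)) {
                sameDc.add(ep);
            }
        }

        // 4) if pruning removed everything, fall back to the full sorted list
        return sameDc.isEmpty() ? sorted : sameDc;
    }

    public static void main(String[] args) {
        Map<String, String> nodeDc = Map.of(
                "10.20.0.1", "20", "10.20.0.2", "20", "10.30.0.1", "30");
        List<String> replicas = List.of("10.30.0.1", "10.20.0.2", "10.20.0.1");
        // A Hadoop-only client whose DC has to be inferred from its IP.
        System.out.println(sortAndPrune("10.30.9.9", replicas, nodeDc,
                Comparator.naturalOrder()));
    }
}
```

Falling back to the full sorted list in the last step means a client in a DC 
with no replicas still gets something to connect to, at the cost of a 
cross-DC read.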



  
> ColumnFamilyRecordReader fails for a given split because a host is down, even 
> if records could reasonably be read from other replica.
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2388
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Eldon Stegall
>            Assignee: Mck SembWever
>              Labels: hadoop, inputformat
>             Fix For: 0.8.1
>
>         Attachments: 0002_On_TException_try_next_split.patch, 
> CASSANDRA-2388.patch, CASSANDRA-2388.patch
>
>
> ColumnFamilyRecordReader only tries the first location for a given split. We 
> should try multiple locations for a given split.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
