[jira] [Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

Lanny Ripple (JIRA) Wed, 31 Oct 2012 09:35:14 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487922#comment-13487922
 ]


Lanny Ripple edited comment on CASSANDRA-2388 at 10/31/12 4:34 PM:
-------------------------------------------------------------------

We would very much like a fix for this.  We have a 40-node ring running two 
Hadoop clusters on 20 nodes each.  One cluster is on systems that are flakier 
than the other (bad batch of memory).  When building a split on the first 
cluster, if a ring node in the second cluster's part of the ring is down, we 
get timeouts with no way to blacklist the offending node, even though we have 
replicas local to the first cluster.

The ring is partitioned into DC1:2, DC2:2, with a Hadoop cluster over each DC.
                
      was (Author: lannyripple):
    Would very much like a fix to this.  We have a 40 node ring running 2x 
hadoop clusters on 20 nodes each.  One cluster is on systems that are more 
flaky than the other (bad batch of memory).  When building a split on the first 
cluster if a ring node is down in the area of the second cluster we get 
timeouts with no way to blacklist the offending node even though we have 
replicas local to the first cluster.
                  
> ColumnFamilyRecordReader fails for a given split because a host is down, even 
> if records could reasonably be read from other replica.
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2388
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.6
>            Reporter: Eldon Stegall
>            Assignee: Mck SembWever
>            Priority: Minor
>              Labels: hadoop, inputformat
>             Fix For: 1.1.7
>
>         Attachments: 0002_On_TException_try_next_split.patch, 
> CASSANDRA-2388-addition1.patch, CASSANDRA-2388-extended.patch, 
> CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, 
> CASSANDRA-2388.patch
>
>
> ColumnFamilyRecordReader only tries the first location for a given split. We 
> should try multiple locations for a given split.
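
The behaviour requested in the issue description (try the remaining locations 
of a split when the first one is down) might look roughly like the sketch 
below. It is only an illustration and is not taken from any of the attached 
patches: the class SplitFailover, the Connector interface and the method 
connectToFirstAvailable are hypothetical names, and IOException stands in for 
whatever exception the real client raises when a host is unreachable.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Cassandra's API. */
final class SplitFailover {

    /** Opens a connection to one replica host; fails if the host is down. */
    interface Connector<C> {
        C connect(String host) throws IOException;
    }

    /**
     * Tries each location of a split in order and returns the first
     * connection that succeeds. Fails only when every location has failed.
     */
    static <C> C connectToFirstAvailable(List<String> locations, Connector<C> connector)
            throws IOException {
        List<IOException> failures = new ArrayList<>();
        for (String host : locations) {
            try {
                return connector.connect(host);
            } catch (IOException e) {
                // This replica is unreachable; remember why and try the next one.
                failures.add(e);
            }
        }
        IOException allFailed =
                new IOException("no replica reachable for split; tried " + locations);
        for (IOException e : failures) {
            allFailed.addSuppressed(e);
        }
        throw allFailed;
    }
}
```

Ordering the locations list so that replicas local to the task's own cluster 
come first would be one way to cover the multi-DC case described in the 
comment above, but that ordering is an assumption and is not shown here.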
