[ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487922#comment-13487922 ]
Lanny Ripple edited comment on CASSANDRA-2388 at 10/31/12 4:34 PM:
-------------------------------------------------------------------

Would very much like a fix to this. We have a 40 node ring running 2x hadoop clusters on 20 nodes each. One cluster is on systems that are more flaky than the other (bad batch of memory). When building a split on the first cluster if a ring node is down in the area of the second cluster we get timeouts with no way to blacklist the offending node even though we have replicas local to the first cluster.

The ring is partitioned into DC1:2, DC2:2 with a hadoop cluster over each DC.

was (Author: lannyripple):
    Would very much like a fix to this. We have a 40 node ring running 2x hadoop clusters on 20 nodes each. One cluster is on systems that are more flaky than the other (bad batch of memory). When building a split on the first cluster if a ring node is down in the area of the second cluster we get timeouts with no way to blacklist the offending node even though we have replicas local to the first cluster.

> ColumnFamilyRecordReader fails for a given split because a host is down, even
> if records could reasonably be read from other replica.
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2388
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.6
>            Reporter: Eldon Stegall
>            Assignee: Mck SembWever
>            Priority: Minor
>              Labels: hadoop, inputformat
>             Fix For: 1.1.7
>
>         Attachments: 0002_On_TException_try_next_split.patch,
> CASSANDRA-2388-addition1.patch, CASSANDRA-2388-extended.patch,
> CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch,
> CASSANDRA-2388.patch
>
>
> ColumnFamilyRecordReader only tries the first location for a given split. We
> should try multiple locations for a given split.
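
For illustration only, the behaviour the ticket asks for (try each location of a split in turn instead of only the first) might look roughly like the sketch below. It is not taken from any of the attached patches. The `Connector` interface and the `connectToFirstAvailable` helper are hypothetical names standing in for whatever the record reader actually uses to open a client connection to a replica, and the host names in `main` are made up.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class SplitLocationFailover {

    /** Hypothetical stand-in for "open a client connection to this replica". */
    interface Connector<C> {
        C connect(String host) throws IOException;
    }

    /**
     * Walks the split's locations in order and returns the first connection
     * that can be opened. A failure on one host is remembered and the next
     * host is tried; only when every host fails is an exception thrown.
     */
    static <C> C connectToFirstAvailable(List<String> locations, Connector<C> connector)
            throws IOException {
        IOException lastFailure = null;
        for (String host : locations) {
            try {
                return connector.connect(host);
            } catch (IOException e) {
                lastFailure = e; // this replica is unreachable, fall through to the next one
            }
        }
        throw new IOException("no replica reachable for split, tried " + locations, lastFailure);
    }

    public static void main(String[] args) throws IOException {
        // Made-up hosts: the first replica is "down", the second answers.
        List<String> locations = Arrays.asList("node-a", "node-b");
        String connection = SplitLocationFailover.<String>connectToFirstAvailable(locations, host -> {
            if (host.equals("node-a")) {
                throw new IOException("connection refused: " + host);
            }
            return "connection to " + host;
        });
        System.out.println(connection);
    }
}
```

In a two-datacenter ring like the one described in the comment, the order of `locations` matters: putting replicas local to the job's own datacenter first would keep a down node in the other datacenter from being contacted at all. How that ordering is obtained is outside this sketch.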