Hi, I'm running cassandra .6.2 on a dedicated 4 node cluster and I also have a dedicated 4 node hadoop cluster. I'm trying to run a simple map reduce job against a single column family and it only takes 32 map tasks before I get floods of thrift timeouts. That would make sense to me if the cassandra was stressing the hardware or the network, but it's not. Each box has 8 cores/16G ram. During the job CPU averages 150-250% (1/5 utilization on 8 cores), network IO hovers around 15% throughput, iostat < 15%.
The hadoop machines are taking even less of a beating. The simpler I make the job, the faster it hits cassandra, the faster it throws timeouts & vice versa. I'm guessing there's a software/config related bottleneck I'm hitting well before tapping out the hardware. Any idea what that might be? java.lang.RuntimeException: TimedOutException() at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:174) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:224) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:101) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:95) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423) at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305) at org.apache.hadoop.mapred.Child.main(Child.java:170) Caused by: TimedOutException() at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11015) at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:623) at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:597) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:151) ... 11 more