Hi all! I've a test cluster of 4 nodes running cassandra 0.7.8, with one keyspace with RF=1, each node owns 25% of the data. As long as all nodes are alive, there is no problem, but when I shut down just one node I get UnavailableException in my application. cassandra-cli returns "null" and hadoop mapreduce task won't start at all.
Loosing one node is not a problem for me, the data are not important, loosing even half the cluster is not a problem as long as everything runs just as with a full cluster. The error from hadoop is like this: Exception in thread "main" java.io.IOException: Could not get input splits at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:120) at cz.xxx.yyy.zzz.DelegatingInputFormat.getSplits(DelegatingInputFormat.java:111) at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:944) at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:961) at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833) at org.apache.hadoop.mapreduce.Job.submit(Job.java:476) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506) at cz.xxx.yyy.zzz.ContextIndexer.run(ContextIndexer.java:663) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at cz.xxx.yyy.zzz.ContextIndexer.main(ContextIndexer.java:94) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) Caused by: java.util.concurrent.ExecutionException: java.io.IOException: failed connecting to all endpoints 10.0.18.87 at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:116) ... 20 more Caused by: java.io.IOException: failed connecting to all endpoints 10.0.18.87 at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:197) at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:67) at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:153) at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:138) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662)