I have a freshly started 3-node cluster with a replication factor of 2. If I take down two nodes, I can no longer do any writes, even at a consistency level of ONE. I tried a variety of keys to ensure that for at least one of them the live node was responsible for one of the replicas. I have not yet tried trunk. On Cassandra 0.4.1, I get an UnavailableException:
DEBUG [pool-1-thread-1] 2009-10-29 18:53:24,371 CassandraServer.java (line 408) insert
WARN [pool-1-thread-1] 2009-10-29 18:53:24,388 AbstractReplicationStrategy.java (line 135) Unable to find a live Endpoint we might be out of live nodes , This is dangerous !!!!
ERROR [pool-1-thread-1] 2009-10-29 18:53:24,390 StorageProxy.java (line 179) error writing key 1
UnavailableException()
        at org.apache.cassandra.service.StorageProxy.insertBlocking(StorageProxy.java:156)
        at org.apache.cassandra.service.CassandraServer.doInsert(CassandraServer.java:468)
        at org.apache.cassandra.service.CassandraServer.insert(CassandraServer.java:421)
        at org.apache.cassandra.service.Cassandra$Processor$insert.process(Cassandra.java:824)
        at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:627)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

Moreover, if I bring another of the nodes back up, do some writes, take it back down again, and try to do writes with a consistency level of one, I get an ApplicationException with the error text "unknown result". There's nothing in the debug logs about this new error:

DEBUG [pool-1-thread-7] 2009-10-29 19:08:26,411 CassandraServer.java (line 258) get
DEBUG [pool-1-thread-7] 2009-10-29 19:08:26,411 CassandraServer.java (line 307) multiget
DEBUG [pool-1-thread-7] 2009-10-29 19:08:26,413 StorageProxy.java (line 239) weakreadlocal reading SliceByNamesReadCommand(table='Keyspace1', key='3', columnParent='QueryPath(columnFamilyName='Standard1', superColumnName='null', columnName='null')', columns=[[118, 97, 108, 117, 101],])

I would've instead expected the node to accept the write and then have the key repaired on subsequent reads when the other nodes get back up.
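To make the expectation concrete, here is a toy sketch of the placement reasoning above (this is not Cassandra's actual partitioner or replication strategy; the node names, ring size, and token function are all made up for illustration). With 3 nodes and RF=2, each node is a replica for roughly 2/3 of the keys, so with two nodes down a write at consistency ONE should still be acceptable for any key whose replica set includes the surviving node:

```python
# Toy ring, 3 nodes, replication factor 2 (illustrative only).
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names
RF = 2

def token(key: str) -> int:
    # Toy token: hash the key into a position on a 0..99 ring.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % 100

def replicas(key: str) -> list:
    # Owner = node whose ring slice contains the token; replicas are
    # the owner plus the next RF-1 nodes clockwise.
    owner = token(key) * len(NODES) // 100
    return [NODES[(owner + i) % len(NODES)] for i in range(RF)]

def writable_at_one(key: str, live: set) -> bool:
    # A ConsistencyLevel.ONE write needs at least one live replica.
    return any(n in live for n in replicas(key))

live = {"node-a"}  # two of the three nodes are down
served = [k for k in map(str, range(100)) if writable_at_one(k, live)]
# Keys with node-a in their replica set should still accept writes at
# ONE; only keys whose two replicas are both down legitimately fail.
```

Under this model, an UnavailableException would be the correct answer only for keys where both replicas are down, which is why trying a variety of keys should have found at least one writable one.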
Along the same lines, how does Cassandra handle a network partition where two writes for the same key hit two different partitions, neither of which is able to form a quorum? Dynamo maintained version vectors and put the burden of conflict resolution on the client, but there's no similar interface in the Thrift API.

Edmond
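For readers unfamiliar with the Dynamo approach mentioned above, here is a minimal sketch of version vectors (this is not part of the Cassandra 0.4 Thrift API; the function names are made up for illustration). Each write carries a {node: counter} map; if neither map dominates the other, the writes were concurrent, e.g. made on opposite sides of a partition, and the conflict is surfaced to the client to resolve:

```python
# Minimal Dynamo-style version vectors (illustrative, not Cassandra API).

def dominates(a: dict, b: dict) -> bool:
    # a dominates b if a >= b on every node's counter and a != b.
    keys = set(a) | set(b)
    return all(a.get(k, 0) >= b.get(k, 0) for k in keys) and a != b

def compare(a: dict, b: dict) -> str:
    if dominates(a, b):
        return "a-newer"
    if dominates(b, a):
        return "b-newer"
    return "identical" if a == b else "concurrent"

def merge(a: dict, b: dict) -> dict:
    # After the client reconciles the values, the new vector is the
    # pointwise maximum of the two conflicting vectors.
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)}

# Two sides of a partition each accept a write for the same key:
left  = {"node-a": 2}               # written on one side of the split
right = {"node-a": 1, "node-b": 1}  # written on the other side
# compare(left, right) == "concurrent": neither side saw the other's
# write, so the client must pick or merge the values, then continue
# from merge(left, right).
```

Cassandra instead resolves such conflicts with per-column timestamps (last write wins), which is why the Thrift API exposes no vector for the client to reconcile.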