Question about consistency levels
I’m trying to be more succinct this time, since my last attempt got no answers.

We are currently running 2.0.2 in test (no C* in production yet) and use (LOCAL_)QUORUM CL on reads and writes, which guarantees (if successful) that we read the latest data. That said, it is highly likely that (LOCAL_)ONE would return our data, since it isn’t read for quite some time after the write.

Given that we must do our best to return data, we want to see what options we have when a quorum read fails (say 2 of 3 replicas go down with a replication factor of 3; note we have also seen this happen with bugs related to CF deletion/re-creation during compaction, or load causing data corruption, in which case one bad node can screw things up).

One option is to fall back to (LOCAL_)ONE if we detect the right exception from a (LOCAL_)QUORUM read on the client side, but that obviously degrades your consistency. That said, we ONLY ever do idempotent writes, and NEVER delete. So once again I wonder: is there a (reasonable) use case for a CL whereby you accept the first non-empty response from any replica?
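The client-side fallback described above can be sketched as follows. This is a hypothetical, self-contained illustration: the names (ReadClient, Consistency, readWithFallback) are invented for the sketch and are not any real driver or Pelops API.

```java
// Hypothetical sketch of the fallback described above: try a
// (LOCAL_)QUORUM read first, and on an availability failure retry
// the same read at (LOCAL_)ONE. All names here are illustrative.
public class QuorumFallbackRead {

    enum Consistency { LOCAL_QUORUM, LOCAL_ONE }

    static class UnavailableException extends RuntimeException {}

    // Stand-in for a Cassandra client read call.
    interface ReadClient {
        String read(String key, Consistency cl);
    }

    // Falling back to ONE trades consistency for availability; as noted
    // above, this is tolerable only because all writes are idempotent
    // and rows are never deleted.
    static String readWithFallback(ReadClient client, String key) {
        try {
            return client.read(key, Consistency.LOCAL_QUORUM);
        } catch (UnavailableException e) {
            return client.read(key, Consistency.LOCAL_ONE);
        }
    }

    public static void main(String[] args) {
        // Simulate 2 of 3 replicas down: quorum reads fail, ONE succeeds.
        ReadClient degraded = (key, cl) -> {
            if (cl == Consistency.LOCAL_QUORUM) {
                throw new UnavailableException();
            }
            return "value-for-" + key;
        };
        System.out.println(readWithFallback(degraded, "k1")); // prints value-for-k1
    }
}
```

The catch-and-retry shape keeps the consistency decision in one place, so it is easy to log every fallback and alert on how often the cluster is actually serving degraded reads.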
A lot of MUTATION and REQUEST_RESPONSE messages dropped
I recently upgraded to 1.2.9 and I am seeing a lot of REQUEST_RESPONSE and MUTATION messages being dropped. This happens when I have multiple nodes in the cluster (about 3 nodes) and I send traffic to only one node. I don't think the traffic is that high; it is around 400 msg/sec with 100 threads. When I take down the other two nodes, I don't see any errors (at least on the client side). I am using Pelops.

On the client I get UnavailableException, but the nodes are up. Initially I thought I was hitting CASSANDRA-6297 (gossip thread blocking), so I changed memtable_flush_writers to 3. Still no luck.

UnavailableException:

    org.scale7.cassandra.pelops.exceptions.UnavailableException: null
        at org.scale7.cassandra.pelops.exceptions.IExceptionTranslator$ExceptionTranslator.translate(IExceptionTranslator.java:61) ~[na:na]
        at

In the debug log on the Cassandra node, this is the exception I see:

    DEBUG [Thrift:78] 2013-11-09 16:47:28,212 CustomTThreadPoolServer.java Thrift transport error occurred during processing of message.
    org.apache.thrift.transport.TTransportException
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
        at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:22)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:662)

Could this be because of high load? With Cassandra 1.0.11 I did not see this issue.

Thanks,
Sandeep
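If the drops do turn out to be load-related, one common client-side mitigation while investigating is retrying failed calls with exponential backoff, so a briefly overloaded coordinator is not hammered with immediate retries. The sketch below is a generic, hypothetical illustration (withBackoff is an invented helper, not a Pelops API):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch (not Pelops API): retry a failing call with
// exponential backoff, giving an overloaded node time to drain its
// queues instead of retrying immediately.
public class BackoffRetry {

    // Retries call up to maxAttempts times, sleeping baseMillis,
    // 2*baseMillis, 4*baseMillis, ... between failed attempts.
    static <T> T withBackoff(Supplier<T> call, int maxAttempts, long baseMillis)
            throws InterruptedException {
        RuntimeException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts - 1) {
                    Thread.sleep(baseMillis << attempt);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate a call that fails twice (e.g. UnavailableException)
        // and then succeeds on the third attempt.
        AtomicInteger calls = new AtomicInteger();
        String result = withBackoff(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new RuntimeException("unavailable");
            }
            return "ok";
        }, 5, 10);
        System.out.println(result + " after " + calls.get() + " attempts"); // ok after 3 attempts
    }
}
```

Note that retries only help with transient overload; if messages are dropped steadily, the fix is on the server side (capacity, compaction, or the bug being investigated), not in the client.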