I recently upgraded to 1.2.9 and I am seeing a lot of REQUEST_RESPONSE and MUTATION messages being dropped.
This happens when I have multiple nodes in the cluster (about 3 nodes) and I send traffic to only one node. I don't think the traffic is that high: it is around 400 msg/sec with 100 threads. When I take down the other two nodes I don't see any errors (at least on the client side). I am using Pelops.

On the client I get UnavailableException, but the nodes are up. Initially I thought I was hitting CASSANDRA-6297 (gossip thread blocking), so I changed memtable_flush_writers to 3. Still no luck.

UnavailableException:

org.scale7.cassandra.pelops.exceptions.UnavailableException: null
    at org.scale7.cassandra.pelops.exceptions.IExceptionTranslator$ExceptionTranslator.translate(IExceptionTranslator.java:61) ~[na:na]
    at

In the debug log on the Cassandra node, this is the exception I see:

DEBUG [Thrift:78] 2013-11-09 16:47:28,212 CustomTThreadPoolServer.java Thrift transport error occurred during processing of message.
org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
    at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:22)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:662)

Could this be because of high load? With Cassandra 1.0.011 I did not see this issue.

Thanks,
Sandeep