A lot of MUTATION and REQUEST_RESPONSE messages dropped

srmore Sat, 09 Nov 2013 16:03:16 -0800

I recently upgraded to 1.2.9 and I am seeing a lot of REQUEST_RESPONSE and
MUTATION messages being dropped.


This happens when I have multiple nodes in the cluster (about 3 nodes) and
I send traffic to only one node. I don't think the traffic is that high; it
is around 400 msg/sec with 100 threads. When I take down the other two nodes I
don't see any errors (at least on the client side). I am using Pelops.

On the client I get UnavailableException, but the nodes are up. Initially I
thought I was hitting CASSANDRA-6297 (gossip thread blocking), so I changed
memtable_flush_writers to 3. Still no luck.
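
For reference, the change amounts to a one-line edit, sketched here on the
assumption that it was made in each node's cassandra.yaml (the file location
varies by install, and all other settings are left untouched):

```yaml
# cassandra.yaml (excerpt). Only the setting mentioned above is shown;
# the value 3 is the one described in this message, not a recommendation.
memtable_flush_writers: 3
```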

UnavailableException:
org.scale7.cassandra.pelops.exceptions.UnavailableException: null
        at org.scale7.cassandra.pelops.exceptions.IExceptionTranslator$ExceptionTranslator.translate(IExceptionTranslator.java:61) ~[na:na]
        at

In the debug log on the Cassandra node, this is the exception I see:

DEBUG [Thrift:78] 2013-11-09 16:47:28,212 CustomTThreadPoolServer.java
Thrift transport error occurred during processing of message.
org.apache.thrift.transport.TTransportException
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
        at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:22)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:662)

Could this be because of high load? With Cassandra 1.0.011 I did not see
this issue.

Thanks,
Sandeep
