Can you turn the logging up to DEBUG level and look for a message from 
CassandraServer that says "... timed out"?
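
One way to do that search once DEBUG logging is on is a small script like the one below. It is only a sketch: the function name find_timeouts is made up for illustration, and the log file location depends on your log4j configuration, so pass in whatever path your node actually writes to.

```python
def find_timeouts(log_path, needle="timed out"):
    """Return (line_number, line) pairs for log lines that contain `needle`."""
    hits = []
    # errors="replace" keeps the scan going if the log holds odd bytes.
    with open(log_path, errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if needle in line:
                hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_timeouts.py <path to the Cassandra log file>
    for number, line in find_timeouts(sys.argv[1]):
        print(number, line)
```

Comparing the timestamps on the matching lines with the time the stress run died should show whether the server logged the timeout itself.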

Also check the thread pool stats with "nodetool tpstats" to see if the node is 
keeping up. 
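
If you want to watch this over time, a rough sketch like the following picks out the pools with a backlog from saved tpstats output. It assumes each data row ends with three integer columns (Active, Pending, Completed); other versions may print a different layout, and the name pending_pools is just illustrative.

```python
def pending_pools(tpstats_text):
    """Return {pool_name: pending} for pools whose Pending count is non-zero.

    Assumes every data row ends with three integers:
    active, pending, completed. Rows that do not fit are skipped.
    """
    backlog = {}
    for line in tpstats_text.splitlines():
        parts = line.split()
        if len(parts) < 4 or not all(p.isdigit() for p in parts[-3:]):
            continue  # header, blank line, or a row in another layout
        name = " ".join(parts[:-3])
        pending = int(parts[-2])
        if pending > 0:
            backlog[name] = pending
    return backlog


if __name__ == "__main__":
    import sys

    # Usage: nodetool -h <host> tpstats | python pending_pools.py
    for name, pending in sorted(pending_pools(sys.stdin.read()).items()):
        print(name, pending)
```

A pending count that keeps growing between samples suggests that stage is not keeping up.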
 
Aaron

On 7 Apr 2011, at 13:43, Sheng Chen wrote:

> Thank you Aaron.
> 
> It does not seem to be an overload problem.
> 
> I have 16 cores and 48G of RAM on the single node, and I reduced the number 
> of concurrent threads to 1. 
> Still, it suddenly dies with a timeout, while CPU, RAM, and disk load are all 
> below 10% and write latency has been about 0.5ms for the past 10 minutes, 
> which is really fast.
> 
> No log entries about dropped messages were found.
> 
> 2011/4/7 aaron morton <aa...@thelastpickle.com>
> TimedOutException means that fewer than the CL number of nodes responded to 
> the coordinator before the rpc_timeout.
> 
> So it was overloaded. Which makes sense when you say it only happens with 
> secondary indexes. Consider things like
> - reducing the throughput
> - reducing the number of clients
> - ensuring the clients are connecting to all nodes in the cluster.
> 
> You will probably find some logs about dropped messages on some nodes.
> Aaron
> 
> On 6 Apr 2011, at 20:39, Sheng Chen wrote:
> 
> > I used the py_stress module to insert 10m rows of test data with a secondary index.
> > I got the following exceptions.
> >
> > # python stress.py -d xxx -o insert -n 10000000 -c 5 -s 34 -C 5 -x keys
> > total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
> > 265322,26532,26541,0.00186140829433,10
> > 630300,36497,36502,0.00129331431204,20
> > 986781,35648,35640,0.0013310986218,30
> > 1332190,34540,34534,0.00135942295893,40
> > 1473578,14138,14138,0.00142941070007,50
> > Process Inserter-38:
> > Traceback (most recent call last):
> >   File "/usr/lib64/python2.4/site-packages/multiprocessing/process.py", 
> > line 237, in _bootstrap
> >     self.run()
> >   File "stress.py", line 242, in run
> >     self.cclient.batch_mutate(cfmap, consistency)
> >   File 
> > "/root/apache-cassandra-0.7.4-src/interface/thrift/gen-py/cassandra/Cassandra.py",
> >  line 784, in batch_mutate
> > TimedOutException: TimedOutException(args=())
> >     self.run()
> >   File "stress.py", line 242, in run
> >     self.recv_batch_mutate()
> >   File 
> > "/root/apache-cassandra-0.7.4-src/interface/thrift/gen-py/cassandra/Cassandra.py",
> >  line 810, in recv_batch_mutate
> >     raise result.te
> >
> >
> > Tests without a secondary index run fine at about 40k ops/sec.
> >
> > A `GC for ParNew` of about 200ms takes place every second. Does 
> > it matter?
> > The same GC, lasting about 400ms, happens every 2 seconds in the runs 
> > without a secondary index and does not hurt those inserts.
> >
> > Thanks in advance for any advice.
> >
> > Sheng
> 
> 
