Also, looking at the GC log, I see messages like this across different
servers before they start dropping messages:

  2012-07-04T10:48:20.336+0000: 96771.117: [GC 96771.118: [ParNew:
  1367297K->57371K(1474560K), 0.0617350 secs]
  6641571K->5340088K(12419072K), 0.0634460 secs]
  [Times: user=0.56 sys=0.01, real=0.06 secs]
  Total time for which application threads were stopped: 0.0850010 seconds
  Total time for which application threads were stopped: 16.7663710 seconds

The 16-second pause doesn't seem to be caused by the minor/major GCs,
which are quite fast and are also logged. The "Total time for which ..."
messages come from the PrintGCApplicationStoppedTime parameter, which
logs a line whenever threads reach a safepoint. Is there any way I can
figure out what caused the Java threads to pause?

Thanks
Rohit

On Thu, Jul 5, 2012 at 12:19 PM, rohit bhatia <rohit2...@gmail.com> wrote:
> Our Cassandra cluster consists of 8 nodes (16 cores, 32 GB RAM, 12 GB
> heap, 1600 MB young gen, Cassandra 1.0.5, JDK 1.7, 128 concurrent
> writer threads). The replication factor is 2 with 10 column families,
> and we serve counter-incrementing, write-intensive tasks (CL=ONE).
>
> I am trying to figure out the bottleneck.
>
> 1) Is using JDK 1.7 in any way detrimental to Cassandra?
>
> 2) What is the max write-operation qps that should be expected? Is the
> Netflix benchmark also applicable to counter-incrementing tasks?
>
> http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
>
> 3) At around 50,000 qps for the cluster (~12,500 qps per node), the
> CPU idle time is around 30%, Cassandra is not disk-bound
> (insignificant read operations, and CPU iowait is around 0.05%), and
> it is not swapping (around 15 GB of RAM is free or inactive). The
> average GC pause for ParNew is 100ms, occurring every second, so
> Cassandra spends 10% of its time in the stop-the-world collector.
> The OS load is around 16-20, and the average write latency is 3ms.
> tpstats does not show any significant pending tasks.
>
> At this point, several nodes suddenly start dropping "Mutation"
> messages, and lots of pending MutationStage and ReplicateOnWriteStage
> tasks show up in tpstats.
> The number of threads in the Java process jumps to around 25,000 from
> the usual 300-400; almost all of the new threads are named
> "pool-2-thread-*".
> The OS load jumps to around 30-40, and the "write request latency"
> starts spiking to more than 500ms (even to several tens of seconds
> sometimes). Even the "local write latency" quadruples, from 50
> microseconds to 200 microseconds. This happens across all the nodes
> within around 2-3 minutes.
> My guess is that this might be due to the 128 writer threads not being
> able to perform more writes (though with an average local write
> latency of 100-150 microseconds, each thread should be able to serve
> ~10,000 qps, and with 128 writer threads, ~1,280,000 qps per node).
> Could there be any other reason for this? What else should I monitor,
> given that system.log says nothing conclusive before the messages are
> dropped?
>
> Thanks
> Rohit
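P.S. To figure out what is triggering the non-GC safepoint pauses, I am
planning to enable HotSpot's safepoint statistics next. A sketch of the
extra JVM options, assuming a HotSpot JDK (the exact output format
varies between builds):

  # added to conf/cassandra-env.sh, next to the existing GC logging flags
  JVM_OPTS="$JVM_OPTS -XX:+PrintSafepointStatistics"
  JVM_OPTS="$JVM_OPTS -XX:PrintSafepointStatisticsCount=1"
  JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationConcurrentTime"

If I understand the flags correctly, each safepoint entry names the VM
operation that triggered it, which should separate GC stops from other
safepoint operations such as deoptimization or biased-lock revocation.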
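For the "pool-2-thread-*" explosion, I will also take a thread dump the
next time it happens and look at what those threads are blocked on
(standard JDK jstack; <pid> stands for the Cassandra process id):

  jstack <pid> > threads.txt
  grep -c 'pool-2-thread' threads.txt          # how many of them exist
  grep -A 12 '"pool-2-thread-1"' threads.txt   # a sample stack trace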
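And to catch the MutationStage/ReplicateOnWriteStage buildup as it
happens, a minimal polling loop over nodetool on each node (assuming
nodetool is on the PATH and JMX is on the default port):

  while true; do
    date
    nodetool -h localhost tpstats | egrep 'MutationStage|ReplicateOnWrite'
    sleep 5
  done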