Also, looking at the GC log, I see messages like this across different
servers before they start dropping messages:

  2012-07-04T10:48:20.336+0000: 96771.117: [GC 96771.118: [ParNew:
  1367297K->57371K(1474560K), 0.0617350 secs]
  6641571K->5340088K(12419072K), 0.0634460 secs]
  [Times: user=0.56 sys=0.01, real=0.06 secs]
  Total time for which application threads were stopped: 0.0850010 seconds
  Total time for which application threads were stopped: 16.7663710 seconds

The 16-second pause doesn't seem to be caused by the minor/major GCs,
which are quite fast and are also logged. The "Total time for which ..."
messages come from the PrintGCApplicationStoppedTime parameter, which
logs a line whenever threads reach a safepoint. Is there any way I can
figure out what caused the Java threads to pause?

Thanks
Rohit

On Thu, Jul 5, 2012 at 12:19 PM, rohit bhatia <rohit2...@gmail.com> wrote:
> Our Cassandra cluster consists of 8 nodes (16 cores, 32 GB RAM, 12 GB
> heap, 1600 MB young gen, Cassandra 1.0.5, JDK 1.7, 128 concurrent
> writer threads). The replication factor is 2 with 10 column families,
> and we serve counter-incrementing, write-intensive tasks (CL=ONE).
>
> I am trying to figure out the bottleneck.
>
> 1) Is using JDK 1.7 in any way detrimental to Cassandra?
>
> 2) What is the max write-operation qps that should be expected? Is the
> Netflix benchmark also applicable to counter-incrementing tasks?
>
> http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
>
> 3) At around 50,000 qps for the cluster (~12,500 qps per node), the
> CPU idle time is around 30%, Cassandra is not disk-bound
> (insignificant read operations, and CPU iowait is around 0.05%), and
> it is not swapping (around 15 GB of RAM is free or inactive). The
> average GC pause for ParNew is 100ms, occurring every second, so
> Cassandra spends 10% of its time in the stop-the-world collector.
> The OS load is around 16-20, and the average write latency is 3ms.
> tpstats does not show any significant pending tasks.
>
> At this point, several nodes suddenly start dropping "Mutation"
> messages, and lots of pending MutationStage and ReplicateOnWriteStage
> tasks show up in tpstats.
> The number of threads in the Java process jumps to around 25,000 from
> the usual 300-400; almost all of the new threads are named
> "pool-2-thread-*".
> The OS load jumps to around 30-40, and the "write request latency"
> starts spiking to more than 500ms (even to several tens of seconds
> sometimes). Even the "local write latency" quadruples, from 50
> microseconds to 200 microseconds. This happens across all the nodes
> within around 2-3 minutes.
> My guess is that this might be due to the 128 writer threads not being
> able to perform more writes (though with an average local write
> latency of 100-150 microseconds, each thread should be able to serve
> ~10,000 qps, and with 128 writer threads, ~1,280,000 qps per node).
> Could there be any other reason for this? What else should I monitor,
> given that system.log says nothing conclusive before the messages are
> dropped?
>
> Thanks
> Rohit
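P.S. To figure out what is triggering the non-GC safepoint pauses, I am
planning to enable HotSpot's safepoint statistics next. A sketch of the
extra JVM options, assuming a HotSpot JDK (the exact output format
varies between builds):

  # added to conf/cassandra-env.sh, next to the existing GC logging flags
  JVM_OPTS="$JVM_OPTS -XX:+PrintSafepointStatistics"
  JVM_OPTS="$JVM_OPTS -XX:PrintSafepointStatisticsCount=1"
  JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationConcurrentTime"

If I understand the flags correctly, each safepoint entry names the VM
operation that triggered it, which should separate GC stops from other
safepoint operations such as deoptimization or biased-lock revocation.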
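For the "pool-2-thread-*" explosion, I will also take a thread dump the
next time it happens and look at what those threads are blocked on
(standard JDK jstack; <pid> stands for the Cassandra process id):

  jstack <pid> > threads.txt
  grep -c 'pool-2-thread' threads.txt          # how many of them exist
  grep -A 12 '"pool-2-thread-1"' threads.txt   # a sample stack trace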
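And to catch the MutationStage/ReplicateOnWriteStage buildup as it
happens, a minimal polling loop over nodetool on each node (assuming
nodetool is on the PATH and JMX is on the default port):

  while true; do
    date
    nodetool -h localhost tpstats | egrep 'MutationStage|ReplicateOnWrite'
    sleep 5
  done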