> It is the write latency; read latency is OK. Interestingly, the latency is low 
> when there is one node; when I join other nodes, performance drops by about 1/3. 
> To be specific, when I start sending traffic to the other nodes, the latency 
> for all the nodes increases; if I stop traffic to the other nodes, the latency 
> drops again. I checked, and this is not node specific; it happens on any node.
Is this the local write latency or the cluster-wide write request latency?

What sort of numbers are you seeing?
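
For reference, one way to tell the two apart (a sketch; MyKeyspace/MyCF below are placeholder names):

    # per-node local read/write latency for a column family
    nodetool cfstats
    nodetool cfhistograms MyKeyspace MyCF

    # coordinator-level (cluster-wide) request latency
    nodetool proxyhistograms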

Cheers

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 12/12/2013, at 3:39 pm, srmore <comom...@gmail.com> wrote:

> Thanks Aaron
> 
> 
> On Wed, Dec 11, 2013 at 8:15 PM, Aaron Morton <aa...@thelastpickle.com> wrote:
>> Changed memtable_total_space_in_mb to 1024 still no luck.
> 
> Reducing memtable_total_space_in_mb will increase the frequency of flushing 
> to disk, which will create more work for compaction and result in increased 
> IO. 
> 
> You should return it to the default.
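> 
> For reference, a minimal sketch of what the default looks like in a stock 
> 1.2 cassandra.yaml (left commented out, Cassandra uses 1/3 of the heap; the 
> exact commented value varies by version):
> 
>     # conf/cassandra.yaml
>     # memtable_total_space_in_mb: 2048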
> 
> You are right; I had to revert it back to the default.
>  
> 
>> when I send traffic to one node, its performance is 2x better than when I 
>> send traffic to all the nodes.
>>  
> What are you measuring, request latency or local read/write latency?
> 
> If it’s write latency, it’s probably GC; if it’s read latency, it’s probably 
> IO or the data model. 
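> 
> A quick way to check the GC theory (assuming the packaged install's default 
> log location):
> 
>     # long ParNew/CMS pauses show up as GCInspector lines
>     grep GCInspector /var/log/cassandra/system.log | tail -20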
> 
> It is the write latency; read latency is OK. Interestingly, the latency is low 
> when there is one node; when I join other nodes, performance drops by about 1/3. 
> To be specific, when I start sending traffic to the other nodes, the latency 
> for all the nodes increases; if I stop traffic to the other nodes, the latency 
> drops again. I checked, and this is not node specific; it happens on any node.
> 
> I don't see any GC activity in the logs. I tried to control compaction by 
> reducing the number of threads, but it did not help much.
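> 
> For reference, the two compaction knobs I'd expect to matter here (values 
> are illustrative, not recommendations):
> 
>     # cassandra.yaml: limit the number of parallel compactions
>     concurrent_compactors: 2
> 
>     # or throttle compaction IO at runtime (MB/s)
>     nodetool setcompactionthroughput 16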
> 
> 
> Hope that helps. 
> 
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
> 
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
> 
> On 7/12/2013, at 8:05 am, srmore <comom...@gmail.com> wrote:
> 
>> Changed memtable_total_space_in_mb to 1024 still no luck.
>> 
>> 
>> On Fri, Dec 6, 2013 at 11:05 AM, Vicky Kak <vicky....@gmail.com> wrote:
>> Can you set the memtable_total_space_in_mb value? It defaults to 1/3 of the 
>> heap, which is 8/3 ~ 2.6 GB here:
>> http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management
>> 
>> Flushing 2.6 GB to disk might slow performance if it happens frequently; 
>> maybe you have lots of write operations going on.
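>> 
>> One way to see how often flushes actually happen (the log message wording 
>> is from memory, so treat it as approximate):
>> 
>>     grep -c "Enqueuing flush of Memtable" /var/log/cassandra/system.log
>>     nodetool tpstats   # watch the FlushWriter pool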
>> 
>> 
>> 
>> On Fri, Dec 6, 2013 at 10:06 PM, srmore <comom...@gmail.com> wrote:
>> 
>> 
>> 
>> On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak <vicky....@gmail.com> wrote:
>> You have passed the JVM configuration, not the Cassandra configuration, 
>> which lives in cassandra.yaml.
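>> 
>> (For a packaged 1.2 install the split is roughly as follows; paths may 
>> differ per install:)
>> 
>>     conf/cassandra-env.sh   # JVM flags: heap size, GC settings
>>     conf/cassandra.yaml     # Cassandra settings: memtables, compaction, caches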
>> 
>> Apologies, I was tuning the JVM and that's what was on my mind. 
>> Here are the cassandra settings http://pastebin.com/uN42GgYT
>> 
>>  
>> The spikes are not that significant in our case, and we are running the 
>> cluster with a 1.7 GB heap.
>> 
>> Are these spikes causing any issue at your end?
>> 
>> There are no big spikes; the overall performance seems to be about 40% lower.
>>  
>> 
>> 
>> 
>> 
>> On Fri, Dec 6, 2013 at 9:10 PM, srmore <comom...@gmail.com> wrote:
>> 
>> 
>> 
>> On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak <vicky....@gmail.com> wrote:
>> Hard to say much without knowing the Cassandra configuration.
>>  
>> The cassandra configuration is 
>> -Xms8G
>> -Xmx8G
>> -Xmn800m
>> -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC
>> -XX:+CMSParallelRemarkEnabled
>> -XX:SurvivorRatio=4
>> -XX:MaxTenuringThreshold=2
>> -XX:CMSInitiatingOccupancyFraction=75
>> -XX:+UseCMSInitiatingOccupancyOnly
>> 
>>  
>> Yes, compactions/GCs could spike the CPU; I had similar behavior with my 
>> setup.
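>> 
>> Two standard commands that can help attribute the CPU to compaction versus 
>> something else:
>> 
>>     nodetool compactionstats   # pending and active compactions
>>     nodetool tpstats           # look for backed-up or blocked thread pools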
>> 
>> Were you able to get around it?
>>  
>> 
>> -VK
>> 
>> 
>> On Fri, Dec 6, 2013 at 7:40 PM, srmore <comom...@gmail.com> wrote:
>> We have a 3-node cluster running Cassandra 1.2.12. They are pretty big 
>> machines, 64 GB RAM with 16 cores, and the Cassandra heap is 8 GB.
>> 
>> The interesting observation is that when I send traffic to one node, its 
>> performance is 2x better than when I send traffic to all the nodes. We ran 
>> 1.0.11 on the same box and observed a slight dip, but not half as seen with 
>> 1.2.12. In both cases we were writing with LOCAL_QUORUM. Changing the CL to 
>> ONE made a slight improvement, but not much.
>> 
>> The read_repair_chance is 0.1. We see some compactions running.
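>> 
>> (For reference, how those settings would be applied; keyspace/table names 
>> are placeholders:)
>> 
>>     -- cqlsh (CQL3)
>>     ALTER TABLE MyKeyspace.MyCF WITH read_repair_chance = 0.1;
>>     -- the consistency level is chosen per request by the client,
>>     -- e.g. LOCAL_QUORUM or ONE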
>> 
>> Following is my iostat -x output; sda is the SSD (for the commit log) and 
>> sdb is the spinner.
>> 
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>           66.46    0.00    8.95    0.01    0.00   24.58
>> 
>> Device:         rrqm/s   wrqm/s   r/s    w/s   rsec/s   wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
>> sda               0.00    27.60  0.00   4.40     0.00   256.00     58.18      0.01    2.55   1.32   0.58
>> sda1              0.00     0.00  0.00   0.00     0.00     0.00      0.00      0.00    0.00   0.00   0.00
>> sda2              0.00    27.60  0.00   4.40     0.00   256.00     58.18      0.01    2.55   1.32   0.58
>> sdb               0.00     0.00  0.00   0.00     0.00     0.00      0.00      0.00    0.00   0.00   0.00
>> sdb1              0.00     0.00  0.00   0.00     0.00     0.00      0.00      0.00    0.00   0.00   0.00
>> dm-0              0.00     0.00  0.00   0.00     0.00     0.00      0.00      0.00    0.00   0.00   0.00
>> dm-1              0.00     0.00  0.00   0.60     0.00     4.80      8.00      0.00    5.33   2.67   0.16
>> dm-2              0.00     0.00  0.00   0.00     0.00     0.00      0.00      0.00    0.00   0.00   0.00
>> dm-3              0.00     0.00  0.00  24.80     0.00   198.40      8.00      0.24    9.80   0.13   0.32
>> dm-4              0.00     0.00  0.00   6.60     0.00    52.80      8.00      0.01    1.36   0.55   0.36
>> dm-5              0.00     0.00  0.00   0.00     0.00     0.00      0.00      0.00    0.00   0.00   0.00
>> dm-6              0.00     0.00  0.00  24.80     0.00   198.40      8.00      0.29   11.60   0.13   0.32
>> 
>> 
>> 
>> I can see I am CPU bound here, but I couldn't figure out exactly what is 
>> causing it. Is this caused by GC or compaction? I am thinking it is 
>> compaction; I see a lot of context switches and interrupts in my vmstat 
>> output.
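>> 
>> (For anyone reproducing this, the relevant vmstat columns, as a sketch:)
>> 
>>     vmstat 1 10   # watch "cs" (context switches/s) and "in" (interrupts/s)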
>> 
>> I don't see GC activity in the logs but do see some compaction activity. Has 
>> anyone seen this, or does anyone know what can be done to free up the CPU?
>> 
>> Thanks,
>> Sandeep
>> 
> 
> 
