Re: Write performance with 1.2.12

2013-12-17 Thread Aaron Morton
 With a single node I get 3K for cassandra 1.0.12 and 1.2.12. So I suspect 
 there is some network chatter. I have started looking at the sources, hoping 
 to find something.
1.2 is pretty stable; I doubt there is anything in there that makes it run 
slower than 1.0. It’s probably something in your configuration or network.

Compare the local write time from nodetool cfhistograms with the request latency 
from nodetool proxyhistograms. Write request latency should be a bit below 1ms 
and local write latency should be around 0.5ms or better. If there is a wide 
difference between the two it’s wait time + network time. 
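For example (Keyspace1 and Standard1 below are placeholder names, not from 
this thread):

  nodetool -h <node> cfhistograms Keyspace1 Standard1   # local write latency on that node
  nodetool -h <node> proxyhistograms                    # coordinator-level request latency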

As a general rule you should get around 3k to 4k writes per second per core.
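For scale, with the 16-core machines described later in this thread, that 
rule of thumb suggests a rough per-node ceiling of

  16 cores x 3K-4K writes/sec/core ≈ 48K-64K writes/sec

well above the ~6K/sec reported for the whole cluster, so raw per-core write 
throughput is unlikely to be the limit.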

Cheers


-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 13/12/2013, at 8:06 pm, Rahul Menon ra...@apigee.com wrote:

 Quote from 
 http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
 
 Murmur3Partitioner is NOT compatible with RandomPartitioner, so if you’re 
 upgrading and using the new cassandra.yaml file, be sure to change the 
 partitioner back to RandomPartitioner
 
 
 On Thu, Dec 12, 2013 at 10:57 PM, srmore comom...@gmail.com wrote:
 
 
 
 On Thu, Dec 12, 2013 at 11:15 AM, J. Ryan Earl o...@jryanearl.us wrote:
 Why did you switch to RandomPartitioner away from Murmur3Partitioner?  Have 
 you tried with Murmur3?
 
 # partitioner: org.apache.cassandra.dht.Murmur3Partitioner
 partitioner: org.apache.cassandra.dht.RandomPartitioner
 
 
 Since I am comparing the two versions I am keeping all the settings the 
 same. I see Murmur3Partitioner has some performance improvement, but 
 switching back to RandomPartitioner should not cause performance to tank, 
 right? Or am I missing something? 
 
 Also, is there an easier way to update the data from RandomPartitioner to 
 Murmur3? (upgradesstables?)
 
 
  
 
 On Fri, Dec 6, 2013 at 10:36 AM, srmore comom...@gmail.com wrote:
 
 
 
 On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak vicky@gmail.com wrote:
 You have passed the JVM configurations and not the cassandra configurations, 
 which are in cassandra.yaml.
 
 Apologies, I was tuning the JVM and that's what was on my mind. 
 Here are the cassandra settings: http://pastebin.com/uN42GgYT
 
  
 The spikes are not that significant in our case and we are running the 
 cluster with 1.7 gb heap.
 
 Are these spikes causing any issue at your end?
 
 There are no big spikes; the overall performance seems to be about 40% lower.
  
 
 
 
 
 On Fri, Dec 6, 2013 at 9:10 PM, srmore comom...@gmail.com wrote:
 
 
 
 On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak vicky@gmail.com wrote:
 Hard to say much without knowing about the cassandra configurations.
  
 The cassandra configuration is 
 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC
 -XX:+CMSParallelRemarkEnabled
 -XX:SurvivorRatio=4
 -XX:MaxTenuringThreshold=2
 -XX:CMSInitiatingOccupancyFraction=75
 -XX:+UseCMSInitiatingOccupancyOnly
 
  
 Yes, compactions/GCs could spike the CPU; I had similar behavior with my 
 setup.
 
 Were you able to get around it ?
  
 
 -VK
 
 
 On Fri, Dec 6, 2013 at 7:40 PM, srmore comom...@gmail.com wrote:
 We have a 3 node cluster running cassandra 1.2.12; they are pretty big 
 machines, 64G RAM with 16 cores, and the cassandra heap is 8G. 
 
 The interesting observation is that when I send traffic to one node its 
 performance is 2x what it is when I send traffic to all the nodes. We ran 
 1.0.11 on the same box and observed a slight dip, but not the halving seen 
 with 1.2.12. In both cases we were writing with LOCAL_QUORUM. Changing CL to 
 ONE makes a slight improvement but not much.
 
 The read_repair_chance is 0.1. We see some compactions running.
 
 Following is my iostat -x output; sda is the SSD (for the commit log) and sdb is 
 the spinner.
 
 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   66.460.008.950.010.00   24.58
 
 Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz 
 avgqu-sz   await  svctm  %util
 sda   0.0027.60  0.00  4.40 0.00   256.0058.18 
 0.012.55   1.32   0.58
 sda1  0.00 0.00  0.00  0.00 0.00 0.00 0.00 
 0.000.00   0.00   0.00
 sda2  0.0027.60  0.00  4.40 0.00   256.0058.18 
 0.012.55   1.32   0.58
 sdb   0.00 0.00  0.00  0.00 0.00 0.00 0.00 
 0.000.00   0.00   0.00
 sdb1  0.00 0.00  0.00  0.00 0.00 0.00 0.00 
 0.000.00   0.00   0.00
 dm-0  0.00 0.00  0.00  0.00 0.00 0.00 0.00 
 0.000.00   0.00   0.00
 dm-1  0.00 0.00  0.00  0.60 0.00 4.80 8.00 
 0.005.33   2.67   0.16
 dm-2  0.00 0.00  0.00  0.00 0.00 0.00 0.00 
 0.000.00   0.00   0.00
 dm-3  0.00

Re: Write performance with 1.2.12

2013-12-12 Thread srmore
On Wed, Dec 11, 2013 at 10:49 PM, Aaron Morton aa...@thelastpickle.com wrote:

 It is the write latency; read latency is OK. Interestingly the latency is
 low when there is one node. When I join other nodes the performance drops
 by about 1/3. To be specific, when I start sending traffic to the other
 nodes the latency for all the nodes increases, and if I stop traffic to the
 other nodes the latency drops again. I checked, and this is not node
 specific; it happens on any node.

 Is this the local write latency or the cluster-wide write request latency?


This is the cluster-wide write latency.



 What sort of numbers are you seeing ?



I have a custom application that writes data to the cassandra node, so the
numbers might be different from the standard stress test, but it should be
good enough for comparison. With the previous release, 1.0.12, I was getting
around 10K requests/sec, and with 1.2.12 I am getting around 6K requests/sec.
Everything else is the same. This is a three node cluster.

With a single node I get 3K for cassandra 1.0.12 and 1.2.12. So I suspect
there is some network chatter. I have started looking at the sources,
hoping to find something.

-sandeep


 Cheers

 -
 Aaron Morton
 New Zealand
 @aaronmorton

 Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com

 On 12/12/2013, at 3:39 pm, srmore comom...@gmail.com wrote:

 Thanks Aaron


 On Wed, Dec 11, 2013 at 8:15 PM, Aaron Morton aa...@thelastpickle.com wrote:

 Changed memtable_total_space_in_mb to 1024 still no luck.

 Reducing memtable_total_space_in_mb will increase the frequency of
 flushing to disk, which will create more work for compaction to do and
 result in increased IO.

 You should return it to the default.


 You are right, had to revert it back to default.



 when I send traffic to one node its performance is 2x more than when I
 send traffic to all the nodes.



 What are you measuring, request latency or local read/write latency ?

 If it’s write latency it’s probably GC; if it’s read latency it’s probably
 IO or the data model.


 It is the write latency; read latency is OK. Interestingly the latency is
 low when there is one node. When I join other nodes the performance drops
 by about 1/3. To be specific, when I start sending traffic to the other
 nodes the latency for all the nodes increases, and if I stop traffic to the
 other nodes the latency drops again. I checked, and this is not node
 specific; it happens on any node.

 I don't see any GC activity in logs. Tried to control the compaction by
 reducing the number of threads, did not help much.


 Hope that helps.

  -
 Aaron Morton
 New Zealand
 @aaronmorton

 Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com

 On 7/12/2013, at 8:05 am, srmore comom...@gmail.com wrote:

 Changed memtable_total_space_in_mb to 1024 still no luck.


 On Fri, Dec 6, 2013 at 11:05 AM, Vicky Kak vicky@gmail.com wrote:

 Can you set the memtable_total_space_in_mb value? It is defaulting to 1/3
 of the heap, which is 8/3 ~ 2.6 GB in capacity:
 http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management
 
 Flushing 2.6 GB to disk might slow performance if it happens frequently;
 maybe you have lots of write operations going on.



 On Fri, Dec 6, 2013 at 10:06 PM, srmore comom...@gmail.com wrote:




 On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak vicky@gmail.com wrote:

 You have passed the JVM configurations and not the cassandra
 configurations, which are in cassandra.yaml.
 
 Apologies, I was tuning the JVM and that's what was on my mind.
 Here are the cassandra settings: http://pastebin.com/uN42GgYT



 The spikes are not that significant in our case and we are running the
 cluster with 1.7 gb heap.

 Are these spikes causing any issue at your end?


 There are no big spikes; the overall performance seems to be about 40% lower.






 On Fri, Dec 6, 2013 at 9:10 PM, srmore comom...@gmail.com wrote:




 On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak vicky@gmail.comwrote:

 Hard to say much without knowing about the cassandra configurations.


 The cassandra configuration is
 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC
 -XX:+CMSParallelRemarkEnabled
 -XX:SurvivorRatio=4
 -XX:MaxTenuringThreshold=2
 -XX:CMSInitiatingOccupancyFraction=75
 -XX:+UseCMSInitiatingOccupancyOnly



 Yes, compactions/GCs could spike the CPU; I had similar behavior with my 
 setup.


 Were you able to get around it ?



 -VK


 On Fri, Dec 6, 2013 at 7:40 PM, srmore comom...@gmail.com wrote:

 We have a 3 node cluster running cassandra 1.2.12; they are pretty
 big machines, 64G RAM with 16 cores, and the cassandra heap is 8G.
 
 The interesting observation is that when I send traffic to one node
 its performance is 2x what it is when I send traffic to all the nodes.
 We ran 1.0.11 on the same box and observed a slight dip, but not the
 halving seen with 1.2.12. In both cases we were writing with 

Re: Write performance with 1.2.12

2013-12-12 Thread J. Ryan Earl
Why did you switch to RandomPartitioner away from Murmur3Partitioner?  Have
you tried with Murmur3?


   # partitioner: org.apache.cassandra.dht.Murmur3Partitioner
   partitioner: org.apache.cassandra.dht.RandomPartitioner



On Fri, Dec 6, 2013 at 10:36 AM, srmore comom...@gmail.com wrote:




 On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak vicky@gmail.com wrote:

 You have passed the JVM configurations and not the cassandra
 configurations, which are in cassandra.yaml.
 
 Apologies, I was tuning the JVM and that's what was on my mind.
 Here are the cassandra settings: http://pastebin.com/uN42GgYT



 The spikes are not that significant in our case and we are running the
 cluster with 1.7 gb heap.

 Are these spikes causing any issue at your end?


 There are no big spikes; the overall performance seems to be about 40% lower.






 On Fri, Dec 6, 2013 at 9:10 PM, srmore comom...@gmail.com wrote:




 On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak vicky@gmail.com wrote:

 Hard to say much without knowing about the cassandra configurations.


 The cassandra configuration is
 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC
 -XX:+CMSParallelRemarkEnabled
 -XX:SurvivorRatio=4
 -XX:MaxTenuringThreshold=2
 -XX:CMSInitiatingOccupancyFraction=75
 -XX:+UseCMSInitiatingOccupancyOnly



 Yes, compactions/GCs could spike the CPU; I had similar behavior with
 my setup.


 Were you able to get around it ?



 -VK


 On Fri, Dec 6, 2013 at 7:40 PM, srmore comom...@gmail.com wrote:

 We have a 3 node cluster running cassandra 1.2.12; they are pretty big
 machines, 64G RAM with 16 cores, and the cassandra heap is 8G.
 
 The interesting observation is that when I send traffic to one node
 its performance is 2x what it is when I send traffic to all the nodes. We
 ran 1.0.11 on the same box and observed a slight dip, but not the halving
 seen with 1.2.12. In both cases we were writing with LOCAL_QUORUM.
 Changing CL to ONE makes a slight improvement but not much.
 
 The read_repair_chance is 0.1. We see some compactions running.
 
 Following is my iostat -x output; sda is the SSD (for the commit log) and
 sdb is the spinner.

 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   66.460.008.950.010.00   24.58

 Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz
 avgqu-sz   await  svctm  %util
 sda   0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sda1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sda2  0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sdb   0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sdb1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-0  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-1  0.00 0.00  0.00  0.60 0.00 4.80
 8.00 0.005.33   2.67   0.16
 dm-2  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-3  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.249.80   0.13   0.32
 dm-4  0.00 0.00  0.00  6.60 0.0052.80
 8.00 0.011.36   0.55   0.36
 dm-5  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-6  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.29   11.60   0.13   0.32



 I can see I am CPU bound here but couldn't figure out exactly what is
 causing it. Is this caused by GC or compaction? I am thinking it is
 compaction; I see a lot of context switches and interrupts in my vmstat
 output.
 
 I don't see GC activity in the logs but see some compaction activity.
 Has anyone seen this, or does anyone know what can be done to free up the CPU?

 Thanks,
 Sandeep









Re: Write performance with 1.2.12

2013-12-12 Thread srmore
On Thu, Dec 12, 2013 at 11:15 AM, J. Ryan Earl o...@jryanearl.us wrote:

 Why did you switch to RandomPartitioner away from Murmur3Partitioner?
  Have you tried with Murmur3?


# partitioner: org.apache.cassandra.dht.Murmur3Partitioner
partitioner: org.apache.cassandra.dht.RandomPartitioner



Since I am comparing the two versions I am keeping all the settings the
same. I see Murmur3Partitioner has some performance improvement, but
switching back to RandomPartitioner should not cause performance to tank,
right? Or am I missing something?

Also, is there an easier way to update the data from RandomPartitioner to
Murmur3? (upgradesstables?)
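One caveat on that last question: a partitioner change cannot be done in
place, and nodetool upgradesstables only rewrites sstables into the current
format; it does not re-partition them. A hedged sketch of one migration path
is to stream the existing sstables into a separate cluster configured with
Murmur3Partitioner (hosts and paths below are placeholders):

  # stream sstables into a ring that uses Murmur3Partitioner
  sstableloader -d <murmur3-node1>,<murmur3-node2> /var/lib/cassandra/data/Keyspace1/Standard1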





 On Fri, Dec 6, 2013 at 10:36 AM, srmore comom...@gmail.com wrote:




 On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak vicky@gmail.com wrote:

 You have passed the JVM configurations and not the cassandra
 configurations, which are in cassandra.yaml.
 
 Apologies, I was tuning the JVM and that's what was on my mind.
 Here are the cassandra settings: http://pastebin.com/uN42GgYT



 The spikes are not that significant in our case and we are running the
 cluster with 1.7 gb heap.

 Are these spikes causing any issue at your end?


 There are no big spikes; the overall performance seems to be about 40% lower.






 On Fri, Dec 6, 2013 at 9:10 PM, srmore comom...@gmail.com wrote:




 On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak vicky@gmail.com wrote:

 Hard to say much without knowing about the cassandra configurations.


 The cassandra configuration is
 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC
 -XX:+CMSParallelRemarkEnabled
 -XX:SurvivorRatio=4
 -XX:MaxTenuringThreshold=2
 -XX:CMSInitiatingOccupancyFraction=75
 -XX:+UseCMSInitiatingOccupancyOnly



  Yes, compactions/GCs could spike the CPU; I had similar behavior with
 my setup.


 Were you able to get around it ?



 -VK


 On Fri, Dec 6, 2013 at 7:40 PM, srmore comom...@gmail.com wrote:

 We have a 3 node cluster running cassandra 1.2.12; they are pretty
 big machines, 64G RAM with 16 cores, and the cassandra heap is 8G.
 
 The interesting observation is that when I send traffic to one node
 its performance is 2x what it is when I send traffic to all the nodes. We
 ran 1.0.11 on the same box and observed a slight dip, but not the halving
 seen with 1.2.12. In both cases we were writing with LOCAL_QUORUM.
 Changing CL to ONE makes a slight improvement but not much.
 
 The read_repair_chance is 0.1. We see some compactions running.
 
 Following is my iostat -x output; sda is the SSD (for the commit log) and
 sdb is the spinner.

 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   66.460.008.950.010.00   24.58

 Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s
 avgrq-sz avgqu-sz   await  svctm  %util
 sda   0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sda1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sda2  0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sdb   0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sdb1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-0  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-1  0.00 0.00  0.00  0.60 0.00 4.80
 8.00 0.005.33   2.67   0.16
 dm-2  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-3  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.249.80   0.13   0.32
 dm-4  0.00 0.00  0.00  6.60 0.0052.80
 8.00 0.011.36   0.55   0.36
 dm-5  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-6  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.29   11.60   0.13   0.32



 I can see I am CPU bound here but couldn't figure out exactly what is
 causing it. Is this caused by GC or compaction? I am thinking it is
 compaction; I see a lot of context switches and interrupts in my vmstat
 output.
 
 I don't see GC activity in the logs but see some compaction activity.
 Has anyone seen this, or does anyone know what can be done to free up the CPU?

 Thanks,
 Sandeep










Re: Write performance with 1.2.12

2013-12-12 Thread Rahul Menon
Quote from
http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2

*Murmur3Partitioner is NOT compatible with RandomPartitioner, so if you’re
upgrading and using the new cassandra.yaml file, be sure to change the
partitioner back to RandomPartitioner*


On Thu, Dec 12, 2013 at 10:57 PM, srmore comom...@gmail.com wrote:




 On Thu, Dec 12, 2013 at 11:15 AM, J. Ryan Earl o...@jryanearl.us wrote:

 Why did you switch to RandomPartitioner away from Murmur3Partitioner?
  Have you tried with Murmur3?


# partitioner: org.apache.cassandra.dht.Murmur3Partitioner
partitioner: org.apache.cassandra.dht.RandomPartitioner



 Since I am comparing the two versions I am keeping all the settings the
 same. I see Murmur3Partitioner has some performance improvement, but
 switching back to RandomPartitioner should not cause performance to tank,
 right? Or am I missing something?

 Also, is there an easier way to update the data from RandomPartitioner to
 Murmur3? (upgradesstables?)





 On Fri, Dec 6, 2013 at 10:36 AM, srmore comom...@gmail.com wrote:




 On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak vicky@gmail.com wrote:

 You have passed the JVM configurations and not the cassandra
 configurations, which are in cassandra.yaml.
 
 Apologies, I was tuning the JVM and that's what was on my mind.
 Here are the cassandra settings: http://pastebin.com/uN42GgYT



 The spikes are not that significant in our case and we are running the
 cluster with 1.7 gb heap.

 Are these spikes causing any issue at your end?


 There are no big spikes; the overall performance seems to be about 40% lower.






 On Fri, Dec 6, 2013 at 9:10 PM, srmore comom...@gmail.com wrote:




 On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak vicky@gmail.com wrote:

 Hard to say much without knowing about the cassandra configurations.


 The cassandra configuration is
 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC
 -XX:+CMSParallelRemarkEnabled
 -XX:SurvivorRatio=4
 -XX:MaxTenuringThreshold=2
 -XX:CMSInitiatingOccupancyFraction=75
 -XX:+UseCMSInitiatingOccupancyOnly



  Yes, compactions/GCs could spike the CPU; I had similar behavior
 with my setup.


 Were you able to get around it ?



 -VK


 On Fri, Dec 6, 2013 at 7:40 PM, srmore comom...@gmail.com wrote:

 We have a 3 node cluster running cassandra 1.2.12; they are pretty
 big machines, 64G RAM with 16 cores, and the cassandra heap is 8G.
 
 The interesting observation is that when I send traffic to one node
 its performance is 2x what it is when I send traffic to all the nodes. We
 ran 1.0.11 on the same box and observed a slight dip, but not the halving
 seen with 1.2.12. In both cases we were writing with LOCAL_QUORUM.
 Changing CL to ONE makes a slight improvement but not much.
 
 The read_repair_chance is 0.1. We see some compactions running.
 
 Following is my iostat -x output; sda is the SSD (for the commit log)
 and sdb is the spinner.

 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   66.460.008.950.010.00   24.58

 Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s
 avgrq-sz avgqu-sz   await  svctm  %util
 sda   0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sda1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sda2  0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sdb   0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sdb1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-0  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-1  0.00 0.00  0.00  0.60 0.00 4.80
 8.00 0.005.33   2.67   0.16
 dm-2  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-3  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.249.80   0.13   0.32
 dm-4  0.00 0.00  0.00  6.60 0.0052.80
 8.00 0.011.36   0.55   0.36
 dm-5  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-6  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.29   11.60   0.13   0.32



 I can see I am CPU bound here but couldn't figure out exactly what is
 causing it. Is this caused by GC or compaction? I am thinking it is
 compaction; I see a lot of context switches and interrupts in my vmstat
 output.
 
 I don't see GC activity in the logs but see some compaction activity.
 Has anyone seen this, or does anyone know what can be done to free up the
 CPU?

 Thanks,
 Sandeep











Re: Write performance with 1.2.12

2013-12-11 Thread Aaron Morton
 Changed memtable_total_space_in_mb to 1024 still no luck.
Reducing memtable_total_space_in_mb will increase the frequency of flushing to 
disk, which will create more work for compaction to do and result in increased IO. 

You should return it to the default.

 when I send traffic to one node its performance is 2x more than when I send 
 traffic to all the nodes.
  
What are you measuring, request latency or local read/write latency ? 

If it’s write latency it’s probably GC; if it’s read latency it’s probably IO or 
the data model. 

Hope that helps. 

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 7/12/2013, at 8:05 am, srmore comom...@gmail.com wrote:

 Changed memtable_total_space_in_mb to 1024 still no luck.
 
 
 On Fri, Dec 6, 2013 at 11:05 AM, Vicky Kak vicky@gmail.com wrote:
 Can you set the memtable_total_space_in_mb value? It is defaulting to 1/3 
 of the heap, which is 8/3 ~ 2.6 GB in capacity:
 http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management
 
 Flushing 2.6 GB to disk might slow performance if it happens frequently; 
 maybe you have lots of write operations going on.
 
 
 
 On Fri, Dec 6, 2013 at 10:06 PM, srmore comom...@gmail.com wrote:
 
 
 
 On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak vicky@gmail.com wrote:
 You have passed the JVM configurations and not the cassandra configurations, 
 which are in cassandra.yaml.
 
 Apologies, I was tuning the JVM and that's what was on my mind. 
 Here are the cassandra settings: http://pastebin.com/uN42GgYT
 
  
 The spikes are not that significant in our case and we are running the 
 cluster with 1.7 gb heap.
 
 Are these spikes causing any issue at your end?
 
 There are no big spikes; the overall performance seems to be about 40% lower.
  
 
 
 
 
 On Fri, Dec 6, 2013 at 9:10 PM, srmore comom...@gmail.com wrote:
 
 
 
 On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak vicky@gmail.com wrote:
 Hard to say much without knowing about the cassandra configurations.
  
 The cassandra configuration is 
 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC
 -XX:+CMSParallelRemarkEnabled
 -XX:SurvivorRatio=4
 -XX:MaxTenuringThreshold=2
 -XX:CMSInitiatingOccupancyFraction=75
 -XX:+UseCMSInitiatingOccupancyOnly
 
  
 Yes, compactions/GCs could spike the CPU; I had similar behavior with my 
 setup.
 
 Were you able to get around it ?
  
 
 -VK
 
 
 On Fri, Dec 6, 2013 at 7:40 PM, srmore comom...@gmail.com wrote:
 We have a 3 node cluster running cassandra 1.2.12; they are pretty big 
 machines, 64G RAM with 16 cores, and the cassandra heap is 8G. 
 
 The interesting observation is that when I send traffic to one node its 
 performance is 2x what it is when I send traffic to all the nodes. We ran 
 1.0.11 on the same box and observed a slight dip, but not the halving seen with 
 1.2.12. In both cases we were writing with LOCAL_QUORUM. Changing CL to 
 ONE makes a slight improvement but not much.
 
 The read_repair_chance is 0.1. We see some compactions running.
 
 Following is my iostat -x output; sda is the SSD (for the commit log) and sdb is 
 the spinner.
 
 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   66.460.008.950.010.00   24.58
 
 Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz 
 avgqu-sz   await  svctm  %util
 sda   0.0027.60  0.00  4.40 0.00   256.0058.18 
 0.012.55   1.32   0.58
 sda1  0.00 0.00  0.00  0.00 0.00 0.00 0.00 
 0.000.00   0.00   0.00
 sda2  0.0027.60  0.00  4.40 0.00   256.0058.18 
 0.012.55   1.32   0.58
 sdb   0.00 0.00  0.00  0.00 0.00 0.00 0.00 
 0.000.00   0.00   0.00
 sdb1  0.00 0.00  0.00  0.00 0.00 0.00 0.00 
 0.000.00   0.00   0.00
 dm-0  0.00 0.00  0.00  0.00 0.00 0.00 0.00 
 0.000.00   0.00   0.00
 dm-1  0.00 0.00  0.00  0.60 0.00 4.80 8.00 
 0.005.33   2.67   0.16
 dm-2  0.00 0.00  0.00  0.00 0.00 0.00 0.00 
 0.000.00   0.00   0.00
 dm-3  0.00 0.00  0.00 24.80 0.00   198.40 8.00 
 0.249.80   0.13   0.32
 dm-4  0.00 0.00  0.00  6.60 0.0052.80 8.00 
 0.011.36   0.55   0.36
 dm-5  0.00 0.00  0.00  0.00 0.00 0.00 0.00 
 0.000.00   0.00   0.00
 dm-6  0.00 0.00  0.00 24.80 0.00   198.40 8.00 
 0.29   11.60   0.13   0.32
 
 
 
 I can see I am CPU bound here but couldn't figure out exactly what is causing 
 it. Is this caused by GC or compaction? I am thinking it is compaction; I 
 see a lot of context switches and interrupts in my vmstat output.
 
 I don't see GC activity in the logs but see some compaction activity. Has 
 anyone seen this? 

Re: Write performance with 1.2.12

2013-12-11 Thread srmore
Thanks Aaron


On Wed, Dec 11, 2013 at 8:15 PM, Aaron Morton aa...@thelastpickle.com wrote:

 Changed memtable_total_space_in_mb to 1024 still no luck.

 Reducing memtable_total_space_in_mb will increase the frequency of
 flushing to disk, which will create more work for compaction to do and
 result in increased IO.

 You should return it to the default.


You are right, had to revert it back to default.



 when I send traffic to one node its performance is 2x more than when I
 send traffic to all the nodes.



 What are you measuring, request latency or local read/write latency ?

 If it’s write latency it’s probably GC; if it’s read latency it’s probably
 IO or the data model.


It is the write latency; read latency is OK. Interestingly the latency is
low when there is one node. When I join other nodes the performance drops by
about 1/3. To be specific, when I start sending traffic to the other nodes
the latency for all the nodes increases, and if I stop traffic to the other
nodes the latency drops again. I checked, and this is not node specific; it
happens on any node.

I don't see any GC activity in logs. Tried to control the compaction by
reducing the number of threads, did not help much.


 Hope that helps.

 -
 Aaron Morton
 New Zealand
 @aaronmorton

 Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com

 On 7/12/2013, at 8:05 am, srmore comom...@gmail.com wrote:

 Changed memtable_total_space_in_mb to 1024 still no luck.


 On Fri, Dec 6, 2013 at 11:05 AM, Vicky Kak vicky@gmail.com wrote:

 Can you set the memtable_total_space_in_mb value? It is defaulting to
 1/3 of the heap, which is 8/3 ~ 2.6 GB in capacity:
 
 http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management
 
 Flushing 2.6 GB to disk might slow performance if it happens
 frequently; maybe you have lots of write operations going on.



 On Fri, Dec 6, 2013 at 10:06 PM, srmore comom...@gmail.com wrote:




 On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak vicky@gmail.com wrote:

 You have passed the JVM configurations and not the cassandra
 configurations, which are in cassandra.yaml.
 
 Apologies, I was tuning the JVM and that's what was on my mind.
 Here are the cassandra settings: http://pastebin.com/uN42GgYT



 The spikes are not that significant in our case and we are running the
 cluster with 1.7 gb heap.

 Are these spikes causing any issue at your end?


 There are no big spikes; the overall performance seems to be about 40%
 lower.






 On Fri, Dec 6, 2013 at 9:10 PM, srmore comom...@gmail.com wrote:




 On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak vicky@gmail.com wrote:

 Hard to say much without knowing about the cassandra configurations.


 The cassandra configuration is
 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC
 -XX:+CMSParallelRemarkEnabled
 -XX:SurvivorRatio=4
 -XX:MaxTenuringThreshold=2
 -XX:CMSInitiatingOccupancyFraction=75
 -XX:+UseCMSInitiatingOccupancyOnly



 Yes, compactions/GCs could spike the CPU; I had similar behavior with
 my setup.


 Were you able to get around it ?



 -VK


 On Fri, Dec 6, 2013 at 7:40 PM, srmore comom...@gmail.com wrote:

 We have a 3 node cluster running cassandra 1.2.12; they are pretty
 big machines, 64G RAM with 16 cores, and the cassandra heap is 8G.
 
 The interesting observation is that when I send traffic to one node
 its performance is 2x what it is when I send traffic to all the nodes. We
 ran 1.0.11 on the same box and observed a slight dip, but not the halving
 seen with 1.2.12. In both cases we were writing with LOCAL_QUORUM.
 Changing CL to ONE makes a slight improvement but not much.
 
 The read_repair_chance is 0.1. We see some compactions running.
 
 Following is my iostat -x output; sda is the SSD (for the commit log)
 and sdb is the spinner.

 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   66.460.008.950.010.00   24.58

 Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s
 avgrq-sz avgqu-sz   await  svctm  %util
 sda   0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sda1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sda2  0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sdb   0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sdb1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-0  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-1  0.00 0.00  0.00  0.60 0.00 4.80
 8.00 0.005.33   2.67   0.16
 dm-2  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-3  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.249.80   0.13   0.32
 dm-4  0.00 0.00  0.00  6.60 0.0052.80
 

Re: Write performance with 1.2.12

2013-12-11 Thread Aaron Morton
 It is the write latency; read latency is OK. Interestingly the latency is low 
 when there is one node. When I join other nodes the performance drops by about 
 1/3. To be specific, when I start sending traffic to the other nodes the latency 
 for all the nodes increases, and if I stop traffic to the other nodes the latency 
 drops again. I checked, and this is not node specific; it happens on any node.
Is this the local write latency or the cluster-wide write request latency? 

What sort of numbers are you seeing ? 

Cheers

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 12/12/2013, at 3:39 pm, srmore comom...@gmail.com wrote:

 Thanks Aaron
 
 
 On Wed, Dec 11, 2013 at 8:15 PM, Aaron Morton aa...@thelastpickle.com wrote:
 Changed memtable_total_space_in_mb to 1024 still no luck.
 
 Reducing memtable_total_space_in_mb will increase the frequency of flushing 
 to disk, which will create more work for compaction to do and result in 
 increased IO. 
 
 You should return it to the default.
 
 You are right, had to revert it back to default.
  
 
 when I send traffic to one node its performance is 2x more than when I send 
 traffic to all the nodes.
  
 What are you measuring, request latency or local read/write latency ? 
 
 If it’s write latency it’s probably GC; if it’s read latency it’s probably IO 
 or the data model. 
 
 It is the write latency; read latency is OK. Interestingly the latency is low 
 when there is one node. When I join other nodes the performance drops by about 
 1/3. To be specific, when I start sending traffic to the other nodes the latency 
 for all the nodes increases, and if I stop traffic to the other nodes the latency 
 drops again. I checked, and this is not node specific; it happens on any node.
 
 I don't see any GC activity in logs. Tried to control the compaction by 
 reducing the number of threads, did not help much.
 
 
 Hope that helps. 
 
 -
 Aaron Morton
 New Zealand
 @aaronmorton
 
 Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com
 
 On 7/12/2013, at 8:05 am, srmore comom...@gmail.com wrote:
 
 Changed memtable_total_space_in_mb to 1024 still no luck.
 
 
 On Fri, Dec 6, 2013 at 11:05 AM, Vicky Kak vicky@gmail.com wrote:
 Can you set the memtable_total_space_in_mb value? It is defaulting to 1/3 
 of the heap, which is 8/3 ~ 2.6 GB in capacity:
 http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management
 
 Flushing 2.6 GB to disk might slow performance if it happens frequently; 
 maybe you have lots of write operations going on.
 
 
 
 On Fri, Dec 6, 2013 at 10:06 PM, srmore comom...@gmail.com wrote:
 
 
 
 On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak vicky@gmail.com wrote:
 You have passed the JVM configurations and not the cassandra configurations, 
 which are in cassandra.yaml.
 
 Apologies, I was tuning the JVM and that's what was on my mind. 
 Here are the cassandra settings: http://pastebin.com/uN42GgYT
 
  
 The spikes are not that significant in our case and we are running the 
 cluster with 1.7 gb heap.
 
 Are these spikes causing any issue at your end?
 
 There are no big spikes; the overall performance seems to be about 40% lower.
  
 
 
 
 
 On Fri, Dec 6, 2013 at 9:10 PM, srmore comom...@gmail.com wrote:
 
 
 
 On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak vicky@gmail.com wrote:
 Hard to say much without knowing about the cassandra configurations.
  
 The cassandra configuration is 
 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC
 -XX:+CMSParallelRemarkEnabled
 -XX:SurvivorRatio=4
 -XX:MaxTenuringThreshold=2
 -XX:CMSInitiatingOccupancyFraction=75
 -XX:+UseCMSInitiatingOccupancyOnly
 
  
 Yes, compactions/GCs could spike the CPU; I had similar behavior with my 
 setup.
 
 Were you able to get around it ?
  
 
 -VK
 
 
 On Fri, Dec 6, 2013 at 7:40 PM, srmore comom...@gmail.com wrote:
 We have a 3 node cluster running cassandra 1.2.12; they are pretty big 
 machines, 64G RAM with 16 cores, and the cassandra heap is 8G. 
 
 The interesting observation is that when I send traffic to one node its 
 performance is 2x what it is when I send traffic to all the nodes. We ran 
 1.0.11 on the same box and observed a slight dip, but not the halving seen 
 with 1.2.12. In both cases we were writing with LOCAL_QUORUM. Changing 
 CL to ONE makes a slight improvement but not much.
 
 The read_repair_chance is 0.1. We see some compactions running.
 
 Following is my iostat -x output; sda is the SSD (for the commit log) and sdb is 
 the spinner.
 
 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   66.460.008.950.010.00   24.58
 
 Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz 
 avgqu-sz   await  svctm  %util
 sda   0.0027.60  0.00  4.40 0.00   256.0058.18 
 0.012.55   1.32   0.58
 sda1  0.00 0.00  0.00  0.00 0.00 0.00  

Write performance with 1.2.12

2013-12-06 Thread srmore
We have a 3 node cluster running cassandra 1.2.12; they are pretty big
machines, 64G RAM with 16 cores, and the cassandra heap is 8G.

The interesting observation is that when I send traffic to one node its
performance is 2x what it is when I send traffic to all the nodes. We ran
1.0.11 on the same box and observed a slight dip, but not the halving seen
with 1.2.12. In both cases we were writing with LOCAL_QUORUM. Changing
CL to ONE makes a slight improvement but not much.

The read_repair_chance is 0.1. We see some compactions running.

Following is my iostat -x output; sda is the SSD (for the commit log) and sdb
is the spinner.

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
  66.460.008.950.010.00   24.58

Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz
avgqu-sz   await  svctm  %util
sda   0.0027.60  0.00  4.40 0.00   256.0058.18
0.012.55   1.32   0.58
sda1  0.00 0.00  0.00  0.00 0.00 0.00 0.00
0.000.00   0.00   0.00
sda2  0.0027.60  0.00  4.40 0.00   256.0058.18
0.012.55   1.32   0.58
sdb   0.00 0.00  0.00  0.00 0.00 0.00 0.00
0.000.00   0.00   0.00
sdb1  0.00 0.00  0.00  0.00 0.00 0.00 0.00
0.000.00   0.00   0.00
dm-0  0.00 0.00  0.00  0.00 0.00 0.00 0.00
0.000.00   0.00   0.00
dm-1  0.00 0.00  0.00  0.60 0.00 4.80 8.00
0.005.33   2.67   0.16
dm-2  0.00 0.00  0.00  0.00 0.00 0.00 0.00
0.000.00   0.00   0.00
dm-3  0.00 0.00  0.00 24.80 0.00   198.40 8.00
0.249.80   0.13   0.32
dm-4  0.00 0.00  0.00  6.60 0.0052.80 8.00
0.011.36   0.55   0.36
dm-5  0.00 0.00  0.00  0.00 0.00 0.00 0.00
0.000.00   0.00   0.00
dm-6  0.00 0.00  0.00 24.80 0.00   198.40 8.00
0.29   11.60   0.13   0.32



I can see I am CPU bound here but couldn't figure out exactly what is
causing it. Is this caused by GC or compaction? I am thinking it is
compaction; I see a lot of context switches and interrupts in my vmstat
output.

I don't see GC activity in the logs but see some compaction activity. Has
anyone seen this, or does anyone know what can be done to free up the CPU?

Thanks,
Sandeep


Re: Write performance with 1.2.12

2013-12-06 Thread Vicky Kak
Hard to say much without knowing about the cassandra configurations.
Yes, compactions/GCs could spike the CPU; I had similar behavior with my
setup.

-VK


On Fri, Dec 6, 2013 at 7:40 PM, srmore comom...@gmail.com wrote:

 We have a 3 node cluster running cassandra 1.2.12; they are pretty big
 machines, 64G RAM with 16 cores, and the cassandra heap is 8G.
 
 The interesting observation is that when I send traffic to one node its
 performance is 2x what it is when I send traffic to all the nodes. We ran
 1.0.11 on the same box and observed a slight dip, but not the halving seen
 with 1.2.12. In both cases we were writing with LOCAL_QUORUM. Changing
 CL to ONE makes a slight improvement but not much.
 
 The read_repair_chance is 0.1. We see some compactions running.
 
 Following is my iostat -x output; sda is the SSD (for the commit log) and sdb
 is the spinner.

 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   66.460.008.950.010.00   24.58

 Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz
 avgqu-sz   await  svctm  %util
 sda   0.0027.60  0.00  4.40 0.00   256.0058.18
 0.012.55   1.32   0.58
 sda1  0.00 0.00  0.00  0.00 0.00 0.00 0.00
 0.000.00   0.00   0.00
 sda2  0.0027.60  0.00  4.40 0.00   256.0058.18
 0.012.55   1.32   0.58
 sdb   0.00 0.00  0.00  0.00 0.00 0.00 0.00
 0.000.00   0.00   0.00
 sdb1  0.00 0.00  0.00  0.00 0.00 0.00 0.00
 0.000.00   0.00   0.00
 dm-0  0.00 0.00  0.00  0.00 0.00 0.00 0.00
 0.000.00   0.00   0.00
 dm-1  0.00 0.00  0.00  0.60 0.00 4.80 8.00
 0.005.33   2.67   0.16
 dm-2  0.00 0.00  0.00  0.00 0.00 0.00 0.00
 0.000.00   0.00   0.00
 dm-3  0.00 0.00  0.00 24.80 0.00   198.40 8.00
 0.249.80   0.13   0.32
 dm-4  0.00 0.00  0.00  6.60 0.0052.80 8.00
 0.011.36   0.55   0.36
 dm-5  0.00 0.00  0.00  0.00 0.00 0.00 0.00
 0.000.00   0.00   0.00
 dm-6  0.00 0.00  0.00 24.80 0.00   198.40 8.00
 0.29   11.60   0.13   0.32



 I can see I am CPU bound here but couldn't figure out exactly what is
 causing it. Is this caused by GC or compaction? I am thinking it is
 compaction; I see a lot of context switches and interrupts in my vmstat
 output.
 
 I don't see GC activity in the logs but see some compaction activity. Has
 anyone seen this, or does anyone know what can be done to free up the CPU?

 Thanks,
 Sandeep





Re: Write performance with 1.2.12

2013-12-06 Thread srmore
On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak vicky@gmail.com wrote:

 Hard to say much without knowing about the cassandra configurations.


The cassandra configuration is
-Xms8G
-Xmx8G
-Xmn800m
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=4
-XX:MaxTenuringThreshold=2
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
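For readers less familiar with these HotSpot flags, a brief gloss (standard
flag semantics, not advice specific to this thread):

  -Xms8G -Xmx8G                  fixed 8 GB heap (min = max, so no resizing)
  -Xmn800m                       800 MB young generation
  -XX:+UseParNewGC               parallel collector for the young generation
  -XX:+UseConcMarkSweepGC        CMS collector for the old generation
  -XX:+CMSParallelRemarkEnabled  parallelize the CMS remark pause
  -XX:SurvivorRatio=4            eden is 4x the size of each survivor space
  -XX:MaxTenuringThreshold=2     objects promoted after surviving 2 young GCs
  -XX:CMSInitiatingOccupancyFraction=75 (+UseCMSInitiatingOccupancyOnly)
                                 start CMS cycles at 75% old-gen occupancy, and only then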



 Yes, compactions/GCs could spike the CPU; I had similar behavior with my
 setup.


Were you able to get around it ?



 -VK


 On Fri, Dec 6, 2013 at 7:40 PM, srmore comom...@gmail.com wrote:

 We have a 3 node cluster running cassandra 1.2.12; they are pretty big
 machines, 64G RAM with 16 cores, and the cassandra heap is 8G.
 
 The interesting observation is that when I send traffic to one node its
 performance is 2x what it is when I send traffic to all the nodes. We ran
 1.0.11 on the same box and observed a slight dip, but not the halving seen
 with 1.2.12. In both cases we were writing with LOCAL_QUORUM. Changing
 CL to ONE makes a slight improvement but not much.
 
 The read_repair_chance is 0.1. We see some compactions running.
 
 Following is my iostat -x output; sda is the SSD (for the commit log) and sdb
 is the spinner.

 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   66.460.008.950.010.00   24.58

 Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz
 avgqu-sz   await  svctm  %util
 sda   0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sda1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sda2  0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sdb   0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sdb1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-0  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-1  0.00 0.00  0.00  0.60 0.00 4.80
 8.00 0.005.33   2.67   0.16
 dm-2  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-3  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.249.80   0.13   0.32
 dm-4  0.00 0.00  0.00  6.60 0.0052.80
 8.00 0.011.36   0.55   0.36
 dm-5  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-6  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.29   11.60   0.13   0.32



 I can see I am CPU bound here but couldn't figure out exactly what is
 causing it. Is this caused by GC or compaction? I am thinking it is
 compaction; I see a lot of context switches and interrupts in my vmstat
 output.
 
 I don't see GC activity in the logs but see some compaction activity. Has
 anyone seen this, or does anyone know what can be done to free up the CPU?

 Thanks,
 Sandeep






Re: Write performance with 1.2.12

2013-12-06 Thread Jason Wee
Hi srmore,

Perhaps you could use jconsole and connect to the JVM using JMX. Then, under
the MBeans tab, start inspecting the GC metrics.

/Jason
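
A programmatic variant of the same check; a minimal Java sketch, assuming
Cassandra's default JMX port 7199 and no JMX authentication (the host is a
placeholder, and the MBean names match the ones quoted later in this thread):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class GcCheck {
    public static void main(String[] args) throws Exception {
        // Cassandra exposes JMX on port 7199 by default.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://<node>:7199/jmxrmi");
        try (JMXConnector jmxc = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            for (String name : new String[] {"ParNew", "ConcurrentMarkSweep"}) {
                ObjectName gc = new ObjectName(
                        "java.lang:type=GarbageCollector,name=" + name);
                // Cumulative counters since JVM start, in collections and ms.
                long count = (Long) mbs.getAttribute(gc, "CollectionCount");
                long timeMs = (Long) mbs.getAttribute(gc, "CollectionTime");
                System.out.printf("%s: collections=%d, totalTime=%dms, avg=%.1fms%n",
                        name, count, timeMs,
                        count == 0 ? 0.0 : (double) timeMs / count);
            }
        }
    }
}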


On Fri, Dec 6, 2013 at 11:40 PM, srmore comom...@gmail.com wrote:




 On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak vicky@gmail.com wrote:

 Hard to say much without knowing about the cassandra configurations.


 The cassandra configuration is
 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC
 -XX:+CMSParallelRemarkEnabled
 -XX:SurvivorRatio=4
 -XX:MaxTenuringThreshold=2
 -XX:CMSInitiatingOccupancyFraction=75
 -XX:+UseCMSInitiatingOccupancyOnly



 Yes, compactions/GCs could spike the CPU; I had similar behavior with my
 setup.


 Were you able to get around it ?



 -VK


 On Fri, Dec 6, 2013 at 7:40 PM, srmore comom...@gmail.com wrote:

 We have a 3 node cluster running cassandra 1.2.12; they are pretty big
 machines, 64G RAM with 16 cores, and the cassandra heap is 8G.
 
 The interesting observation is that when I send traffic to one node its
 performance is 2x what it is when I send traffic to all the nodes. We ran
 1.0.11 on the same box and observed a slight dip, but not the halving seen
 with 1.2.12. In both cases we were writing with LOCAL_QUORUM. Changing
 CL to ONE makes a slight improvement but not much.
 
 The read_repair_chance is 0.1. We see some compactions running.
 
 Following is my iostat -x output; sda is the SSD (for the commit log) and
 sdb is the spinner.

 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   66.460.008.950.010.00   24.58

 Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz
 avgqu-sz   await  svctm  %util
 sda   0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sda1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sda2  0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sdb   0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sdb1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-0  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-1  0.00 0.00  0.00  0.60 0.00 4.80
 8.00 0.005.33   2.67   0.16
 dm-2  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-3  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.249.80   0.13   0.32
 dm-4  0.00 0.00  0.00  6.60 0.0052.80
 8.00 0.011.36   0.55   0.36
 dm-5  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-6  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.29   11.60   0.13   0.32



 I can see I am CPU bound here but couldn't figure out exactly what is
 causing it. Is this caused by GC or compaction? I am thinking it is
 compaction; I see a lot of context switches and interrupts in my vmstat
 output.
 
 I don't see GC activity in the logs but see some compaction activity.
 Has anyone seen this, or does anyone know what can be done to free up the CPU?

 Thanks,
 Sandeep







Re: Write performance with 1.2.12

2013-12-06 Thread Vicky Kak
You have passed the JVM configurations and not the cassandra configurations,
which are in cassandra.yaml.
The spikes are not that significant in our case and we are running the
cluster with 1.7 gb heap.

Are these spikes causing any issue at your end?




On Fri, Dec 6, 2013 at 9:10 PM, srmore comom...@gmail.com wrote:




 On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak vicky@gmail.com wrote:

 Hard to say much without knowing about the cassandra configurations.


 The cassandra configuration is
 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC
 -XX:+CMSParallelRemarkEnabled
 -XX:SurvivorRatio=4
 -XX:MaxTenuringThreshold=2
 -XX:CMSInitiatingOccupancyFraction=75
 -XX:+UseCMSInitiatingOccupancyOnly



 Yes, compactions/GCs could spike the CPU; I had similar behavior with my
 setup.


 Were you able to get around it ?



 -VK


 On Fri, Dec 6, 2013 at 7:40 PM, srmore comom...@gmail.com wrote:

 We have a 3 node cluster running cassandra 1.2.12; they are pretty big
 machines, 64G RAM with 16 cores, and the cassandra heap is 8G.
 
 The interesting observation is that when I send traffic to one node its
 performance is 2x what it is when I send traffic to all the nodes. We
 ran 1.0.11 on the same box and observed a slight dip, but not the halving
 seen with 1.2.12. In both cases we were writing with LOCAL_QUORUM.
 Changing CL to ONE makes a slight improvement but not much.
 
 The read_repair_chance is 0.1. We see some compactions running.
 
 Following is my iostat -x output; sda is the SSD (for the commit log) and
 sdb is the spinner.

 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   66.460.008.950.010.00   24.58

 Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz
 avgqu-sz   await  svctm  %util
 sda   0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sda1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sda2  0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sdb   0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sdb1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-0  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-1  0.00 0.00  0.00  0.60 0.00 4.80
 8.00 0.005.33   2.67   0.16
 dm-2  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-3  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.249.80   0.13   0.32
 dm-4  0.00 0.00  0.00  6.60 0.0052.80
 8.00 0.011.36   0.55   0.36
 dm-5  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-6  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.29   11.60   0.13   0.32



 I can see I am CPU bound here but couldn't figure out exactly what is
 causing it. Is this caused by GC or compaction? I am thinking it is
 compaction; I see a lot of context switches and interrupts in my vmstat
 output.
 
 I don't see GC activity in the logs but see some compaction activity.
 Has anyone seen this, or does anyone know what can be done to free up the CPU?

 Thanks,
 Sandeep







Re: Write performance with 1.2.12

2013-12-06 Thread srmore
On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak vicky@gmail.com wrote:

 You have passed the JVM configurations and not the cassandra
 configurations, which are in cassandra.yaml.
 
 Apologies, I was tuning the JVM and that's what was on my mind.
 Here are the cassandra settings: http://pastebin.com/uN42GgYT



 The spikes are not that significant in our case and we are running the
 cluster with 1.7 gb heap.

 Are these spikes causing any issue at your end?


There are no big spikes; the overall performance seems to be about 40% lower.






 On Fri, Dec 6, 2013 at 9:10 PM, srmore comom...@gmail.com wrote:




 On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak vicky@gmail.com wrote:

 Hard to say much without knowing about the cassandra configurations.


 The cassandra configuration is
 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC
 -XX:+CMSParallelRemarkEnabled
 -XX:SurvivorRatio=4
 -XX:MaxTenuringThreshold=2
 -XX:CMSInitiatingOccupancyFraction=75
 -XX:+UseCMSInitiatingOccupancyOnly



 Yes, compactions/GCs could spike the CPU; I had similar behavior with my
 setup.


 Were you able to get around it ?



 -VK


 On Fri, Dec 6, 2013 at 7:40 PM, srmore comom...@gmail.com wrote:

 We have a 3 node cluster running cassandra 1.2.12; they are pretty big
 machines, 64G RAM with 16 cores, and the cassandra heap is 8G.
 
 The interesting observation is that when I send traffic to one node
 its performance is 2x what it is when I send traffic to all the nodes. We
 ran 1.0.11 on the same box and observed a slight dip, but not the halving
 seen with 1.2.12. In both cases we were writing with LOCAL_QUORUM.
 Changing CL to ONE makes a slight improvement but not much.
 
 The read_repair_chance is 0.1. We see some compactions running.
 
 Following is my iostat -x output; sda is the SSD (for the commit log) and
 sdb is the spinner.

 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   66.460.008.950.010.00   24.58

 Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz
 avgqu-sz   await  svctm  %util
 sda   0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sda1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sda2  0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sdb   0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sdb1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-0  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-1  0.00 0.00  0.00  0.60 0.00 4.80
 8.00 0.005.33   2.67   0.16
 dm-2  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-3  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.249.80   0.13   0.32
 dm-4  0.00 0.00  0.00  6.60 0.0052.80
 8.00 0.011.36   0.55   0.36
 dm-5  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-6  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.29   11.60   0.13   0.32



 I can see I am CPU bound here but couldn't figure out exactly what is
 causing it. Is this caused by GC or compaction? I am thinking it is
 compaction; I see a lot of context switches and interrupts in my vmstat
 output.
 
 I don't see GC activity in the logs but see some compaction activity.
 Has anyone seen this, or does anyone know what can be done to free up the CPU?

 Thanks,
 Sandeep








Re: Write performance with 1.2.12

2013-12-06 Thread Vicky Kak
Can you set the memtable_total_space_in_mb value? It is defaulting to 1/3
of the heap, which is 8/3 ~ 2.6 GB in capacity:
http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management

Flushing 2.6 GB to disk might slow performance if it happens frequently;
maybe you have lots of write operations going on.
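
In cassandra.yaml this is a single setting; the value below is illustrative
only, not a recommendation:

  # cassandra.yaml
  memtable_total_space_in_mb: 2048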



On Fri, Dec 6, 2013 at 10:06 PM, srmore comom...@gmail.com wrote:




 On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak vicky@gmail.com wrote:

 You have passed the JVM configurations and not the cassandra
 configurations, which are in cassandra.yaml.
 
 Apologies, I was tuning the JVM and that's what was on my mind.
 Here are the cassandra settings: http://pastebin.com/uN42GgYT



 The spikes are not that significant in our case and we are running the
 cluster with 1.7 gb heap.

 Are these spikes causing any issue at your end?


 There are no big spikes; the overall performance seems to be about 40% lower.






 On Fri, Dec 6, 2013 at 9:10 PM, srmore comom...@gmail.com wrote:




 On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak vicky@gmail.com wrote:

 Hard to say much without knowing about the cassandra configurations.


 The cassandra configuration is
 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC
 -XX:+CMSParallelRemarkEnabled
 -XX:SurvivorRatio=4
 -XX:MaxTenuringThreshold=2
 -XX:CMSInitiatingOccupancyFraction=75
 -XX:+UseCMSInitiatingOccupancyOnly



 Yes, compactions/GCs could spike the CPU; I had similar behavior with
 my setup.


 Were you able to get around it ?



 -VK


 On Fri, Dec 6, 2013 at 7:40 PM, srmore comom...@gmail.com wrote:

 We have a 3 node cluster running cassandra 1.2.12; they are pretty big
 machines, 64G RAM with 16 cores, and the cassandra heap is 8G.
 
 The interesting observation is that when I send traffic to one node
 its performance is 2x what it is when I send traffic to all the nodes. We
 ran 1.0.11 on the same box and observed a slight dip, but not the halving
 seen with 1.2.12. In both cases we were writing with LOCAL_QUORUM.
 Changing CL to ONE makes a slight improvement but not much.
 
 The read_repair_chance is 0.1. We see some compactions running.
 
 Following is my iostat -x output; sda is the SSD (for the commit log) and
 sdb is the spinner.

 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   66.460.008.950.010.00   24.58

 Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz
 avgqu-sz   await  svctm  %util
 sda   0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sda1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sda2  0.0027.60  0.00  4.40 0.00   256.00
 58.18 0.012.55   1.32   0.58
 sdb   0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 sdb1  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-0  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-1  0.00 0.00  0.00  0.60 0.00 4.80
 8.00 0.005.33   2.67   0.16
 dm-2  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-3  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.249.80   0.13   0.32
 dm-4  0.00 0.00  0.00  6.60 0.0052.80
 8.00 0.011.36   0.55   0.36
 dm-5  0.00 0.00  0.00  0.00 0.00 0.00
 0.00 0.000.00   0.00   0.00
 dm-6  0.00 0.00  0.00 24.80 0.00   198.40
 8.00 0.29   11.60   0.13   0.32



 I can see I am CPU bound here but couldn't figure out exactly what is
 causing it. Is this caused by GC or compaction? I am thinking it is
 compaction; I see a lot of context switches and interrupts in my vmstat
 output.
 
 I don't see GC activity in the logs but see some compaction activity.
 Has anyone seen this, or does anyone know what can be done to free up the CPU?

 Thanks,
 Sandeep









Re: Write performance with 1.2.12

2013-12-06 Thread srmore
Looks like I am spending some time in GC.

java.lang:type=GarbageCollector,name=ConcurrentMarkSweep

CollectionTime = 51707;
CollectionCount = 103;

java.lang:type=GarbageCollector,name=ParNew

 CollectionTime = 466835;
 CollectionCount = 21315;
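
Assuming CollectionTime is reported in milliseconds (as these GarbageCollector
MXBean attributes are), the averages work out to roughly:

  ParNew:              466835 ms / 21315 collections ≈ 22 ms per young GC (~467 s total)
  ConcurrentMarkSweep:  51707 ms /   103 collections ≈ 502 ms per CMS cycle

Whether those totals are significant depends on how long the JVM has been up,
which is what the next reply asks.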


On Fri, Dec 6, 2013 at 9:58 AM, Jason Wee peich...@gmail.com wrote:

 Hi srmore,

 Perhaps you could use jconsole and connect to the JVM using JMX. Then, under
 the MBeans tab, start inspecting the GC metrics.

 /Jason



Re: Write performance with 1.2.12

2013-12-06 Thread Vicky Kak
How long has the server been up: hours, days, months?



Re: Write performance with 1.2.12

2013-12-06 Thread srmore
Not long: Uptime (seconds) : 6828

Token: 56713727820156410577229101238628035242
ID   : c796609a-a050-48df-bf56-bb09091376d9
Gossip active: true
Thrift active: true
Native Transport active: false
Load : 49.71 GB
Generation No: 1386344053
Uptime (seconds) : 6828
Heap Memory (MB) : 2409.71 / 8112.00
Data Center  : DC
Rack : RAC-1
Exceptions   : 0
Key Cache: size 56154704 (bytes), capacity 104857600 (bytes), 27
hits, 155669426 requests, 0.000 recent hit rate, 14400 save period in
seconds
Row Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests,
NaN recent hit rate, 0 save period in seconds
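
Set against that uptime, the GC numbers above are substantial: 466.8 s of
ParNew time over 6,828 s of uptime is roughly 6.8% of wall-clock time spent
in young-generation pauses, at a rate of about three ParNew collections per
second (21,315/6,828). That is consistent with GC contributing noticeably to
the CPU load, even though only about 2.4 GB of the 8 GB heap is in use.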


On Fri, Dec 6, 2013 at 11:15 AM, Vicky Kak vicky@gmail.com wrote:

 How long has the server been up: hours, days, months?




Re: Write performance with 1.2.12

2013-12-06 Thread srmore
Changed memtable_total_space_in_mb to 1024; still no luck.


On Fri, Dec 6, 2013 at 11:05 AM, Vicky Kak vicky@gmail.com wrote:

 Can you set the memtable_total_space_in_mb value? It defaults to 1/3 of the
 heap, which here is 8/3 ~ 2.7 GB of capacity.

 http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management

 Flushing ~2.7 GB to disk might slow performance if it happens frequently;
 maybe you have lots of write operations going on.
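
For concreteness: with the 8 GB heap above, the default works out to
8192/3 ~ 2730 MB of memtable space, so dropping the setting to 1024 MB
roughly triples flush frequency while capping each flush at about 1 GB.
Given that the iostat output shows the disks nearly idle, it is perhaps
unsurprising that smaller, more frequent flushes did not help; the
bottleneck looks like CPU rather than IO.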


