Fwd: Re: virtual memory of all cassandra-nodes is growing extremely since Cassandra 1.1.0

2012-08-02 Thread Thomas Spengler
we monitor the standard *nix stuff via Zabbix,
and Cassandra, like all our other Java services, via MBeans and Zabbix

Best
Tom
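
For readers wanting to reproduce this kind of MBean-based monitoring, the sketch below shows how a poller (a JMX collector such as Zabbix's Java gateway does essentially the same thing) could read the JVM heap numbers from a Cassandra node over JMX. It is only an illustration: the class name is made up, Cassandra's default JMX port 7199 is assumed, and JMX authentication is assumed to be disabled.

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.openmbean.CompositeData;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class HeapProbe {
        public static void main(String[] args) throws Exception {
            String host = args.length > 0 ? args[0] : "localhost";
            // Cassandra exposes JMX on port 7199 by default (assumed unauthenticated here).
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":7199/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection mbs = connector.getMBeanServerConnection();
                // HeapMemoryUsage is a CompositeData with used/committed/max fields.
                CompositeData heap = (CompositeData) mbs.getAttribute(
                        new ObjectName("java.lang:type=Memory"), "HeapMemoryUsage");
                System.out.printf("heap used=%d max=%d%n",
                        (Long) heap.get("used"), (Long) heap.get("max"));
            }
        }
    }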


-------- Original Message --------
Subject: Re: virtual memory of all cassandra-nodes is growing extremely
since Cassandra 1.1.0
Date: Wed, 1 Aug 2012 14:43:17 -0500
From: Greg Fausak g...@named.com
Reply-To: user@cassandra.apache.org
To: user@cassandra.apache.org

Mina,

Thanks for that post.  Very interesting :-)

What sort of things are you graphing?  Standard *nix stuff
(mem/cpu/etc)?  Or do you
have some hooks into the C* process? (I saw something about port 1414
in the .yaml file.)

Best,

-g



Re: virtual memory of all cassandra-nodes is growing extremely since Cassandra 1.1.0

2012-08-01 Thread Thomas Spengler
Just for information:

We are running on 1.1.2.
JNA or not made no difference.
Manually calling a full GC made no difference.

But in my case,
the reduction of
commitlog_total_space_in_mb to 2048 (from the default 4096)
makes the difference.





Re: virtual memory of all cassandra-nodes is growing extremely since Cassandra 1.1.0

2012-08-01 Thread Greg Fausak
Mina,

Thanks for that post.  Very interesting :-)

What sort of things are you graphing?  Standard *nix stuff
(mem/cpu/etc)?  Or do you
have some hooks into the C* process? (I saw something about port 1414
in the .yaml file.)

Best,

-g


On Thu, Jul 26, 2012 at 9:27 AM, Mina Naguib
mina.nag...@bloomdigital.com wrote:

 Hi Thomas

 On a modern 64bit server, I recommend you pay little attention to the virtual 
 size.  It's made up of almost everything within the process's address space, 
 including on-disk files mmap()ed in for zero-copy access.  It's not 
 unreasonable for a machine with N amount of RAM to have a process whose virtual 
 size is several times the value of N.  That in and of itself is not 
 problematic.

 In a default cassandra 1.1.x setup, the bulk of that will be your sstables' 
 data and index files.  On linux you can invoke the pmap tool on the 
 cassandra process's PID to see what's in there.  Much of it will be anonymous 
 memory allocations (the JVM heap itself, off-heap data structures, etc), but 
 lots of it will be references to files on disk (binaries, libraries, mmap()ed 
 files, etc).
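
As a concrete way to see the virtual-vs-resident split described above without wading through full pmap output, here is a small Linux-only sketch; the class name and argument handling are illustrative and not from the thread. It reads VmSize (virtual) and VmRSS (resident) for a given PID from /proc, which is the same information pmap and top aggregate. Run it against the Cassandra PID and the gap between the two numbers is largely the mmap()ed sstable data referred to above.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class ProcMem {
        public static void main(String[] args) throws IOException {
            String pid = args.length > 0 ? args[0] : "self";
            // /proc/<pid>/status reports VmSize (virtual) and VmRSS (resident), both in kB.
            for (String line : Files.readAllLines(Paths.get("/proc/" + pid + "/status"))) {
                if (line.startsWith("VmSize:") || line.startsWith("VmRSS:")) {
                    System.out.println(line.trim());
                }
            }
        }
    }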

 What's more important to keep an eye on is the JVM heap - typically 
 statically allocated to a fixed size at cassandra startup.  You can get info 
 about its used/capacity values via nodetool -h localhost info.  You can 
 also hook up jconsole and trend it over time.

 The other critical piece is the process's RESident memory size, which 
 includes the JVM heap but also other off-heap data structures and 
 miscellanea.  Cassandra has recently been making more use of off-heap 
 structures (for example, row caching via SerializingCacheProvider).  This is 
 done as a matter of efficiency - a serialized off-heap row is much smaller 
 than a classical object sitting in the JVM heap - so you can do more with 
 less.

 Unfortunately, in my experience, it's not perfect.  They still have a cost, 
 in terms of on-heap usage, as well as off-heap growth over time.

 Specifically, my experience with cassandra 1.1.0 showed that off-heap row 
 caches incurred a very high on-heap cost (ironic) - see my post at 
 http://mail-archives.apache.org/mod_mbox/cassandra-user/201206.mbox/%3c6feb097f-287b-471d-bea2-48862b30f...@bloomdigital.com%3E
  - as documented in that email, I managed that with regularly scheduled full 
 GC runs via System.gc()
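
For what it's worth, one way such regularly scheduled full GC runs could be triggered from outside the node (this is a sketch under assumptions, not necessarily how Mina did it) is to invoke the no-argument gc operation on the standard java.lang:type=Memory MBean over JMX, which is equivalent to calling System.gc() inside the target JVM. The class name is made up; Cassandra's default JMX port 7199 and disabled JMX authentication are assumed. A cron job could run something like this on a schedule.

    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class RemoteFullGc {
        public static void main(String[] args) throws Exception {
            String host = args.length > 0 ? args[0] : "localhost";
            // Assumes Cassandra's default JMX port 7199 with authentication disabled.
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":7199/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                // java.lang:type=Memory exposes a no-arg "gc" operation that runs
                // System.gc() inside the connected JVM.
                connector.getMBeanServerConnection()
                         .invoke(new ObjectName("java.lang:type=Memory"), "gc", null, null);
            }
        }
    }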

 I have, since then, moved away from scheduled System.gc() to scheduled row 
 cache invalidations.  While this had the same effect as System.gc() I 
 described in my email, it eliminated the 20-30 second pause associated with 
 it.  It did, however, introduce (or maybe I never noticed it earlier) a slow creep 
 in memory usage outside of the heap.

 It's typical in my case for example for a process configured with 6G of JVM 
 heap to start up, stabilize at 6.5 - 7GB RESident usage, then creep up slowly 
 throughout a week to the 10-11GB range.  Depending on what else the box is doing, 
 I've experienced the linux OOM killer killing cassandra as you've described, 
 or heavy swap usage bringing everything down (we're latency-sensitive), etc..

 And now for the good news.  Since I've upgraded to 1.1.2:
 1. There's no more need for regularly scheduled System.gc()
 2. There's no more need for regularly scheduled row cache invalidation
 3. The HEAP usage within the JVM is stable over time
 4. The RESident size of the process appears also stable over time

 Point #4 above is still pending as I only have 3 days of graphs since the 
 upgrade, but they show promising results compared to the slope of the same 
 graph before the upgrade to 1.1.2.

 So my advice is give 1.1.2 a shot - just be mindful of 
 https://issues.apache.org/jira/browse/CASSANDRA-4411



Re: virtual memory of all cassandra-nodes is growing extremely since Cassandra 1.1.0

2012-08-01 Thread Mina Naguib

All our servers (cassandra and otherwise) get monitored with nagios + get many 
basic metrics graphed by pnp4nagios.  This covers a large chunk of a box's 
health, as well as cassandra basics (specifically the pending tasks, JVM heap 
state).  IMO it's not possible to clearly debug a cassandra issue if you don't 
have a good holistic view of the boxes' health (CPU, RAM, swap, disk 
throughput, etc.)

Separate from that we have an operational dashboard.  It's a bunch of 
manually-defined RRD files and custom scripts that grab metrics, store, and 
graph the health of various layers in the infrastructure in an 
easy-to-digest way (for example, each data center gets a color scheme - stacked 
machines within multiple DCs can just be eyeballed).  There we can see for 
example our total read volume, total write volume, struggling boxes, dynamic 
endpoint snitch reaction, etc...

Finally, almost all the software we write integrates with statsd + graphite.  
In graphite we have more metrics than we know what to do with, but it's better 
than the other way around.  From there for example we can see cassandra's 
response time including things cassandra itself can't measure (network, thrift, 
etc.), across the various client programs that talk to it.  Within 
graphite we have several dashboards defined (users make their own, some 
infrastructure components have shared dashboards.)
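
As an illustration of how lightweight that statsd integration can be, the sketch below sends a single gauge over UDP using the plain statsd wire format ("name:value|g"). The host, metric name and value are made up for the example; statsd's default port 8125 is assumed.

    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.net.InetAddress;
    import java.nio.charset.StandardCharsets;

    public class StatsdGauge {
        public static void main(String[] args) throws Exception {
            // statsd line protocol: "<metric>:<value>|g" for a gauge, sent over UDP (port 8125 by default).
            byte[] payload = "cassandra.client.read_latency_ms:42|g"
                    .getBytes(StandardCharsets.UTF_8);
            try (DatagramSocket socket = new DatagramSocket()) {
                socket.send(new DatagramPacket(payload, payload.length,
                        InetAddress.getByName("statsd.example.com"), 8125));
            }
        }
    }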


--
Mina Naguib :: Director, Infrastructure Engineering
Bloom Digital Platforms :: T 514.394.7951 #208
http://bloom-hq.com/




Re: virtual memory of all cassandra-nodes is growing extremely since Cassandra 1.1.0

2012-07-26 Thread Thomas Spengler
I saw this.

All works fine up to version 1.1.0:
0.8.x takes 5GB of memory on an 8GB machine,
1.0.x takes between 6 and 7GB on an 8GB machine,
and
1.1.0 takes it all.

And it is a problem.
For me it is no solution to wait for the Linux kernel's OOM killer
and then restart the Cassandra process.

When my machine has less than 100MB of RAM available, I have a problem.




-- 
Thomas Spengler
Chief Technology Officer


TopTarif Internet GmbH, Pappelallee 78-79, D-10437 Berlin
Tel.: (030) 2000912 0 | Fax: (030) 2000912 100
thomas.speng...@toptarif.de | www.toptarif.de

Amtsgericht Charlottenburg, HRB 113287 B
Managing Directors: Dr. Rainer Brosch, Dr. Carolin Gabor
-




Re: virtual memory of all cassandra-nodes is growing extremely since Cassandra 1.1.0

2012-07-25 Thread Tyler Hobbs
Are you actually seeing any problems from this? High virtual memory usage
on its own really doesn't mean anything. See
http://wiki.apache.org/cassandra/FAQ#mmap

On Wed, Jul 25, 2012 at 1:21 AM, Thomas Spengler 
thomas.speng...@toptarif.de wrote:

 No one has any idea?

 We tried:

 update to 1.1.2
 DiskAccessMode standard, indexAccessMode standard
 row_cache_size_in_mb: 0
 key_cache_size_in_mb: 0


 Our next try will be to change

 SerializingCacheProvider to ConcurrentLinkedHashCacheProvider

 Any other proposals are welcome.

 On 07/04/2012 02:13 PM, Thomas Spengler wrote:
  Hi @all,
 
  since our upgrade from Cassandra 1.0.3 to 1.1.0, the virtual memory usage
  of the Cassandra nodes has exploded
 
  our setup is:
  * 5 CentOS 5.8 nodes
  * 4 CPUs and 8GB RAM each
  * each node holds about 100GB of data
  * each JVM uses 2GB of RAM
  * DiskAccessMode is standard, indexAccessMode is standard
 
  The memory usage grows until all of the memory is used.
 
  Just for information, with Cassandra 1.0.3 we used
  * DiskAccessMode standard, indexAccessMode mmap
  * and the RAM usage was ~4GB
 
 
  Can anyone help?
 
 
  With regards
 


 --
 Thomas Spengler
 Chief Technology Officer
 

 TopTarif Internet GmbH, Pappelallee 78-79, D-10437 Berlin
 Tel.: (030) 2000912 0 | Fax: (030) 2000912 100
 thomas.speng...@toptarif.de | www.toptarif.de

 Amtsgericht Charlottenburg, HRB 113287 B
 Managing Directors: Dr. Rainer Brosch, Dr. Carolin Gabor
 -





-- 
Tyler Hobbs
DataStax http://datastax.com/