Fwd: Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0
We monitor the standard *nix metrics via Zabbix, and Cassandra, like all our other Java services, via MBeans and Zabbix.

Best,
Tom

Original Message
Subject: Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0
Date: Wed, 1 Aug 2012 14:43:17 -0500
From: Greg Fausak g...@named.com
Reply-To: user@cassandra.apache.org
To: user@cassandra.apache.org

Mina,

Thanks for that post. Very interesting :-)

What sort of things are you graphing? Standard *nux stuff (mem/cpu/etc)? Or do you have some hooks into the C* process (I saw something about port 1414 in the .yaml file)?

Best,

-g

On Thu, Jul 26, 2012 at 9:27 AM, Mina Naguib mina.nag...@bloomdigital.com wrote:

Hi Thomas

On a modern 64-bit server, I recommend you pay little attention to the virtual size. It's made up of almost everything within the process's address space, including on-disk files mmap()ed in for zero-copy access. It's not unreasonable for a machine with N amount of RAM to have a process whose virtual size is several times the value of N. That in and of itself is not problematic.

In a default cassandra 1.1.x setup, the bulk of that will be your sstables' data and index files. On linux you can invoke the pmap tool on the cassandra process's PID to see what's in there. Much of it will be anonymous memory allocations (the JVM heap itself, off-heap data structures, etc), but lots of it will be references to files on disk (binaries, libraries, mmap()ed files, etc).

What's more important to keep an eye on is the JVM heap - typically statically allocated to a fixed size at cassandra startup. You can get info about its used/capacity values via nodetool -h localhost info. You can also hook up jconsole and trend it over time.

The other critical piece is the process's RESident memory size, which includes the JVM heap but also other off-heap data structures and miscellanea.

Cassandra has recently been making more use of off-heap structures (for example, row caching via SerializingCacheProvider). This is done as a matter of efficiency - a serialized off-heap row is much smaller than a classical object sitting in the JVM heap - so you can do more with less.

Unfortunately, in my experience, it's not perfect. They still have a cost, in terms of on-heap usage, as well as off-heap growth over time.

Specifically, my experience with cassandra 1.1.0 showed that off-heap row caches incurred a very high on-heap cost (ironic) - see my post at http://mail-archives.apache.org/mod_mbox/cassandra-user/201206.mbox/%3c6feb097f-287b-471d-bea2-48862b30f...@bloomdigital.com%3E - as documented in that email, I managed that with regularly scheduled full GC runs via System.gc().

I have, since then, moved away from scheduled System.gc() to scheduled row cache invalidations. While this had the same effect as the System.gc() I described in my email, it eliminated the 20-30 second pause associated with it. It did however introduce (or maybe I never noticed it earlier) a slow creep in memory usage outside of the heap.

It's typical in my case, for example, for a process configured with 6G of JVM heap to start up, stabilize at 6.5 - 7GB RESident usage, then creep up slowly throughout a week to the 10-11GB range. Depending on what else the box is doing, I've experienced the linux OOM killer killing cassandra as you've described, or heavy swap usage bringing everything down (we're latency-sensitive), etc.

And now for the good news. Since I've upgraded to 1.1.2:

1. There's no more need for regularly scheduled System.gc()
2. There's no more need for regularly scheduled row cache invalidation
3. The HEAP usage within the JVM is stable over time
4. The RESident size of the process also appears stable over time

Point #4 above is still pending as I only have 3-day graphs since the upgrade, but they show promising results compared to the slope of the same graph before the upgrade to 1.1.2.

So my advice is to give 1.1.2 a shot - just be mindful of https://issues.apache.org/jira/browse/CASSANDRA-4411

On 2012-07-26, at 2:18 AM, Thomas Spengler wrote:

I saw this. Everything worked fine up to version 1.1.0:

* 0.8.x takes 5 GB of memory on an 8 GB machine
* 1.0.x takes between 6 and 7 GB on an 8 GB machine
* 1.1.0 takes all of it

That is a problem for me. Waiting for the linux kernel's OOM killer and then restarting the cassandra process is no solution. When my machine has less than 100 MB of RAM available, I have a problem.

On 07/25/2012 07:06 PM, Tyler Hobbs wrote:

Are you actually seeing any problems from this? High virtual memory usage on its own really doesn't mean anything. See http://wiki.apache.org/cassandra/FAQ#mmap

On Wed, Jul 25, 2012 at 1:21 AM, Thomas Spengler thomas.speng...@toptarif.de wrote:

No one has any idea?

We tried:

* update to 1.1.2
* DiskAccessMode standard, indexAccessMode standard
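The distinction Mina draws between virtual size and RESident size can be checked directly on Linux. The following is a minimal sketch, assuming a Linux host where /proc/<pid>/status exposes the VmSize and VmRSS fields (both reported in kB); the function names parse_vm_fields and read_vm_fields are hypothetical and not part of Cassandra or any tool mentioned in the thread.

```python
from pathlib import Path


def parse_vm_fields(status_text):
    """Extract VmSize (virtual) and VmRSS (resident) in kB from the
    text of a Linux /proc/<pid>/status file."""
    fields = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # Lines look like "VmRSS:\t  7340032 kB"; keep the number only.
            fields[key] = int(rest.split()[0])
    return fields


def read_vm_fields(pid):
    """Read the fields for a live process (Linux only).

    Pass the PID of the cassandra JVM here; finding that PID is left
    to the caller.
    """
    return parse_vm_fields(Path(f"/proc/{pid}/status").read_text())
```

Sampling read_vm_fields() periodically and graphing VmRSS, not VmSize, gives the kind of trend Mina describes; a VmSize several times larger than physical RAM is expected when sstables are mmap()ed.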
Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0
Just for information: we are running on 1.1.2.

* JNA or not made no difference.
* Manually calling a full GC made no difference.

In my case, reducing commitlog_total_space_in_mb to 2048 (from the default 4096) made the difference.
Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0
Mina,

Thanks for that post. Very interesting :-)

What sort of things are you graphing? Standard *nux stuff (mem/cpu/etc)? Or do you have some hooks into the C* process (I saw something about port 1414 in the .yaml file)?

Best,

-g
Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0
All our servers (cassandra and otherwise) get monitored with nagios + get many basic metrics graphed by pnp4nagios. This covers a large chunk of a box's health, as well as cassandra basics (specifically the pending tasks and JVM heap state). IMO it's not possible to clearly debug a cassandra issue if you don't have a good holistic view of the boxes' health (CPU, RAM, swap, disk throughput, etc.)

Separate from that we have an operational dashboard. It's a bunch of manually-defined RRD files and custom scripts that grab metrics, store them, and graph the health of various layers in the infrastructure in an easy-to-digest way (for example, each data center gets a color scheme - stacked machines within multiple DCs can just be eyeballed). There we can see for example our total read volume, total write volume, struggling boxes, dynamic endpoint snitch reaction, etc.

Finally, almost all the software we write integrates with statsd + graphite. In graphite we have more metrics than we know what to do with, but it's better than the other way around. From there for example we can see cassandra's response time including things cassandra itself can't measure (network, thrift, etc), across the various client software that talks to it. Within graphite we have several dashboards defined (users make their own, some infrastructure components have shared dashboards.)

--
Mina Naguib :: Director, Infrastructure Engineering
Bloom Digital Platforms :: T 514.394.7951 #208
http://bloom-hq.com/
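The client-side timing Mina mentions (response time as seen by the software talking to cassandra, pushed to statsd + graphite) might look roughly like the sketch below. It assumes a statsd daemon listening on UDP port 8125 on localhost, which is the common default but an assumption here; the helper names and the metric name are hypothetical, and this is not the code Mina's team runs.

```python
import socket
import time


def format_timing(metric, millis):
    """Build a statsd timing datagram of the form '<name>:<value>|ms'.
    The value is rounded to whole milliseconds."""
    return f"{metric}:{millis:.0f}|ms".encode("ascii")


def send_timing(metric, millis, host="127.0.0.1", port=8125):
    """Send one timing sample over UDP (fire-and-forget).
    host/port are assumed defaults, adjust for your statsd instance."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_timing(metric, millis), (host, port))


def timed_call(metric, func, *args, **kwargs):
    """Run func and report its wall-clock duration, so the sample
    includes network and client overhead the server cannot see."""
    start = time.monotonic()
    try:
        return func(*args, **kwargs)
    finally:
        send_timing(metric, (time.monotonic() - start) * 1000.0)
```

A client would wrap each read or write, for example timed_call("myapp.cassandra.read", fetch_row, key), where fetch_row stands in for whatever function performs the request.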
Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0
I saw this. Everything worked fine up to version 1.1.0:

* 0.8.x takes 5 GB of memory on an 8 GB machine
* 1.0.x takes between 6 and 7 GB on an 8 GB machine
* 1.1.0 takes all of it

That is a problem for me. Waiting for the linux kernel's OOM killer and then restarting the cassandra process is no solution. When my machine has less than 100 MB of RAM available, I have a problem.

On 07/25/2012 07:06 PM, Tyler Hobbs wrote:

Are you actually seeing any problems from this? High virtual memory usage on its own really doesn't mean anything. See http://wiki.apache.org/cassandra/FAQ#mmap

On Wed, Jul 25, 2012 at 1:21 AM, Thomas Spengler thomas.speng...@toptarif.de wrote:

No one has any idea?

We tried:

* update to 1.1.2
* DiskAccessMode standard, indexAccessMode standard
* row_cache_size_in_mb: 0
* key_cache_size_in_mb: 0

Our next try will be to change SerializingCacheProvider to ConcurrentLinkedHashCacheProvider.

Any other proposals are welcome.

On 07/04/2012 02:13 PM, Thomas Spengler wrote:

Hi @all,

since our upgrade from cassandra 1.0.3 to 1.1.0, the virtual memory usage of the cassandra nodes has exploded.

Our setup is:

* 5 CentOS 5.8 nodes
* each with 4 CPUs and 8 GB RAM
* each node holds about 100 GB of data
* each JVM uses 2 GB RAM
* DiskAccessMode is standard, indexAccessMode is standard

The memory usage grows until all of the memory is used.

Just for information: when we ran cassandra 1.0.3, we used

* DiskAccessMode is standard, indexAccessMode is mmap
* and the RAM usage was ~4 GB

Can anyone help?

With Regards

--
Thomas Spengler
Chief Technology Officer
TopTarif Internet GmbH, Pappelallee 78-79, D-10437 Berlin
Tel.: (030) 2000912 0 | Fax: (030) 2000912 100
thomas.speng...@toptarif.de | www.toptarif.de
Amtsgericht Charlottenburg, HRB 113287 B
Geschäftsführer: Dr. Rainer Brosch, Dr. Carolin Gabor
Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0
Are you actually seeing any problems from this? High virtual memory usage on its own really doesn't mean anything. See http://wiki.apache.org/cassandra/FAQ#mmap

--
Tyler Hobbs
DataStax
http://datastax.com/