Fwd: Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0

2012-08-02 Thread Thomas Spengler

We monitor the standard *nix metrics via Zabbix,
and Cassandra, like all our other Java services, via MBeans and Zabbix.

Best
Tom


-------- Original Message --------
Subject: Re: virtual memory of all cassandra-nodes is growing extremly
since Cassandra 1.1.0
Date: Wed, 1 Aug 2012 14:43:17 -0500
From: Greg Fausak g...@named.com
Reply-To: user@cassandra.apache.org
To: user@cassandra.apache.org

Mina,

Thanks for that post.  Very interesting :-)

What sort of things are you graphing?  Standard *nix stuff
(mem/cpu/etc)?  Or do you
have some hooks into the C* process (I saw something about port 1414
in the .yaml file)?

Best,

-g


On Thu, Jul 26, 2012 at 9:27 AM, Mina Naguib
mina.nag...@bloomdigital.com wrote:

 Hi Thomas

 On a modern 64bit server, I recommend you pay little attention to the virtual
 size.  It's made up of almost everything within the process's address space,
 including on-disk files mmap()ed in for zero-copy access.  It's not
 unreasonable for a machine with N amount of RAM to have a process whose virtual
 size is several times the value of N.  That in and of itself is not
 problematic.

 In a default cassandra 1.1.x setup, the bulk of that will be your sstables' 
 data and index files.  On linux you can invoke the pmap tool on the 
 cassandra process's PID to see what's in there.  Much of it will be anonymous 
 memory allocations (the JVM heap itself, off-heap data structures, etc), but 
 lots of it will be references to files on disk (binaries, libraries, mmap()ed 
 files, etc).
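As a rough illustration of the pmap point above, the same split between anonymous and file-backed mappings can be approximated from /proc/<pid>/maps on Linux. This is a minimal Python sketch and not something from the original thread: the function name summarize_maps, the sample text, and the sstable file name in it are made up for illustration.

```python
def summarize_maps(maps_text):
    """Sum mapping sizes (in bytes) from /proc/<pid>/maps text.

    Mappings whose pathname starts with "/" are counted as file-backed;
    everything else (no pathname, [heap], [stack], ...) counts as anonymous.
    """
    totals = {"file": 0, "anon": 0}
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start, end = (int(x, 16) for x in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        kind = "file" if path.startswith("/") else "anon"
        totals[kind] += end - start
    return totals


# Hypothetical sample; real input would come from open(f"/proc/{pid}/maps").
SAMPLE_MAPS = """\
00400000-00401000 r-xp 00000000 08:01 131 /usr/bin/java
7f0000000000-7f0000100000 r--s 00000000 08:11 977 /var/lib/cassandra/data/ks/cf-hd-1-Data.db
7f0000200000-7f0000300000 rw-p 00000000 00:00 0
7ffd00000000-7ffd00021000 rw-p 00000000 00:00 0 [stack]
"""

if __name__ == "__main__":
    print(summarize_maps(SAMPLE_MAPS))
```

A large "file" total mostly reflects mmap()ed sstables and libraries, which is why a big virtual size alone says little about actual memory pressure.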

 What's more important to keep an eye on is the JVM heap - typically 
 statically allocated to a fixed size at cassandra startup.  You can get info 
 about its used/capacity values via nodetool -h localhost info.  You can 
 also hook up jconsole and trend it over time.
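To trend the heap numbers mentioned above without jconsole, one option is to scrape the output of nodetool info. The sketch below is only an assumption-laden example: it assumes the output contains a line shaped like "Heap Memory (MB) : used / total", which should be checked against your Cassandra version, and the names HEAP_RE and parse_heap are hypothetical.

```python
import re

# Assumed line shape: "Heap Memory (MB) : 3072.00 / 6144.00"
HEAP_RE = re.compile(
    r"^Heap Memory \(MB\)\s*:\s*([\d.]+)\s*/\s*([\d.]+)", re.MULTILINE
)


def parse_heap(info_text):
    """Return (used_mb, total_mb, used_fraction), or None if no heap line is found."""
    match = HEAP_RE.search(info_text)
    if match is None:
        return None
    used, total = float(match.group(1)), float(match.group(2))
    return used, total, used / total


# Hypothetical sample output; real input would be captured from
# "nodetool -h localhost info".
SAMPLE_INFO = """\
Load             : 98.7 GB
Heap Memory (MB) : 3072.00 / 6144.00
"""

if __name__ == "__main__":
    print(parse_heap(SAMPLE_INFO))
```

Sampling this every minute or so and graphing the used fraction gives the same trend view as jconsole, in whatever monitoring system is already in place.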

 The other critical piece is the process's RESident memory size, which 
 includes the JVM heap but also other off-heap data structures and 
 miscellanea.  Cassandra has recently been making more use of off-heap 
 structures (for example, row caching via SerializingCacheProvider).  This is 
 done as a matter of efficiency - a serialized off-heap row is much smaller 
 than a classical object sitting in the JVM heap - so you can do more with 
 less.

 Unfortunately, in my experience, it's not perfect.  Off-heap structures still
 have a cost, in terms of on-heap usage as well as off-heap growth over time.

 Specifically, my experience with cassandra 1.1.0 showed that off-heap row 
 caches incurred a very high on-heap cost (ironic) - see my post at 
 http://mail-archives.apache.org/mod_mbox/cassandra-user/201206.mbox/%3c6feb097f-287b-471d-bea2-48862b30f...@bloomdigital.com%3E
  - as documented in that email, I managed that with regularly scheduled full 
 GC runs via System.gc()

 I have, since then, moved away from scheduled System.gc() to scheduled row
 cache invalidations.  While this had the same effect as the System.gc() I
 described in my email, it eliminated the 20-30 second pause associated with
 it.  It did however introduce (or maybe I never noticed it earlier) a slow
 creep in memory usage outside of the heap.

 It's typical in my case, for example, for a process configured with 6GB of JVM
 heap to start up, stabilize at 6.5 - 7GB RESident usage, then creep up slowly
 throughout a week to the 10-11GB range.  Depending on what else the box is
 doing, I've experienced the linux OOM killer killing cassandra as you've
 described, or heavy swap usage bringing everything down (we're
 latency-sensitive), etc.
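To watch for the kind of RESident creep described above, the virtual and resident sizes can be read from /proc/<pid>/status on Linux. The following is a minimal sketch: parse_status and non_heap_estimate_kb are hypothetical helper names, the sample text is invented, and subtracting the configured heap from the resident size is only a crude upper-bound style estimate, since the heap is not necessarily fully resident.

```python
def parse_status(status_text):
    """Extract VmSize and VmRSS (both in kB) from /proc/<pid>/status text."""
    wanted = {"VmSize", "VmRSS"}
    result = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            result[key] = int(rest.split()[0])  # value is reported in kB
    return result


def non_heap_estimate_kb(rss_kb, heap_mb):
    """Rough resident memory outside the JVM heap, assuming the heap is fully resident."""
    return rss_kb - heap_mb * 1024


# Hypothetical sample; real input would come from open(f"/proc/{pid}/status").
SAMPLE_STATUS = "Name:\tjava\nVmSize:\t104857600 kB\nVmRSS:\t 7340032 kB\n"

if __name__ == "__main__":
    sizes = parse_status(SAMPLE_STATUS)
    print(sizes, non_heap_estimate_kb(sizes["VmRSS"], 6 * 1024))
```

Graphing VmRSS over several days makes a slow week-long creep visible well before the OOM killer or swap becomes a problem.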

 And now for the good news.  Since I've upgraded to 1.1.2:
 1. There's no more need for regularly scheduled System.gc()
 2. There's no more need for regularly scheduled row cache invalidation
 3. The HEAP usage within the JVM is stable over time
 4. The RESident size of the process appears also stable over time

 Point #4 above is still pending as I only have 3 day graphs since the 
 upgrade, but they show promising results compared to the slope of the same 
 graph before the upgrade to 1.1.2

 So my advice is give 1.1.2 a shot - just be mindful of 
 https://issues.apache.org/jira/browse/CASSANDRA-4411


 On 2012-07-26, at 2:18 AM, Thomas Spengler wrote:

 I saw this.

 All works fine up to version 1.1.0:
 0.8.x takes 5GB of memory on an 8GB machine,
 1.0.x takes between 6 and 7GB on an 8GB machine,
 and
 1.1.0 takes all of it.

 And that is a problem:
 for me, waiting for the Linux kernel's OOM killer
 and then restarting the Cassandra process is no solution.

 When my machine has less than 100MB of RAM available, I have a problem.



 On 07/25/2012 07:06 PM, Tyler Hobbs wrote:
 Are you actually seeing any problems from this? High virtual memory usage
 on its own really doesn't mean anything. See
 http://wiki.apache.org/cassandra/FAQ#mmap

 On Wed, Jul 25, 2012 at 1:21 AM, Thomas Spengler 
 thomas.speng...@toptarif.de wrote:

 No one has any idea?

 We tried:

 update to 1.1.2
 DiskAccessMode standard, indexAccessMode standard

Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0

2012-08-01 Thread Thomas Spengler

Just for information:

we are running 1.1.2.
With or without JNA, there was no difference.
Manually triggering a full GC made no difference either.

But in my case,
reducing commitlog_total_space_in_mb to 2048 (from the default of 4096)
made the difference.


Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0

2012-08-01 Thread Greg Fausak

Mina,

Thanks for that post. Very interesting :-)

What sort of things are you graphing? Standard *nix stuff
(mem/cpu/etc)? Or do you
have some hooks into the C* process (I saw something about port 1414
in the .yaml file)?

Best,

-g


Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0

2012-08-01 Thread Mina Naguib

All our servers (cassandra and otherwise) get monitored with nagios + get many
basic metrics graphed by pnp4nagios. This covers a large chunk of a box's
health, as well as cassandra basics (specifically the pending tasks, JVM heap
state). IMO it's not possible to clearly debug a cassandra issue if you don't
have a good holistic view of the boxes' health (CPU, RAM, swap, disk
throughput, etc.)

Separate from that we have an operational dashboard. It's a bunch of
manually-defined RRD files and custom scripts that grab metrics, store, and
graph the health of various layers in the infrastructure in an
easy-to-digest way (for example, each data center gets a color scheme - stacked
machines within multiple DCs can just be eyeballed). There we can see for
example our total read volume, total write volume, struggling boxes, dynamic
endpoint snitch reaction, etc.

Finally, almost all the software we write integrates with statsd + graphite.
In graphite we have more metrics than we know what to do with, but it's better
than the other way around. From there for example we can see cassandra's
response time including things cassandra itself can't measure (network, thrift,
etc), across the various client software that talks to it. Within
graphite we have several dashboards defined (users make their own, some
infrastructure components have shared dashboards).
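For readers unfamiliar with the statsd side of this, a gauge can be sent with a single UDP datagram in the plain-text "name:value|g" form. The sketch below is illustrative only and is not the tooling described in this message: the function names, the metric name, and the default host are assumptions, and 8125 is used because it is the usual statsd port.

```python
import socket


def format_gauge(name, value):
    """Build a statsd gauge line, e.g. 'cassandra.node1.heap_used_mb:3072|g'."""
    return f"{name}:{value}|g"


def send_gauge(name, value, host="127.0.0.1", port=8125):
    """Fire-and-forget one gauge to a statsd daemon over UDP."""
    payload = format_gauge(name, value).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Hypothetical metric name; pick a naming scheme that suits your graphite tree.
    send_gauge("cassandra.node1.heap_used_mb", 3072)
```

Because UDP is connectionless, a missing or slow statsd daemon does not block the client, which is the main reason this pattern is cheap enough to sprinkle through application code.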

--
Mina Naguib :: Director, Infrastructure Engineering
Bloom Digital Platforms :: T 514.394.7951 #208
http://bloom-hq.com/


Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0

2012-07-26 Thread Thomas Spengler

I saw this.

All works fine up to version 1.1.0:
0.8.x takes 5GB of memory on an 8GB machine,
1.0.x takes between 6 and 7GB on an 8GB machine,
and
1.1.0 takes all of it.

And that is a problem:
for me, waiting for the Linux kernel's OOM killer
and then restarting the Cassandra process is no solution.

When my machine has less than 100MB of RAM available, I have a problem.



On 07/25/2012 07:06 PM, Tyler Hobbs wrote:
 Are you actually seeing any problems from this? High virtual memory usage
 on its own really doesn't mean anything. See
 http://wiki.apache.org/cassandra/FAQ#mmap
 
 On Wed, Jul 25, 2012 at 1:21 AM, Thomas Spengler 
 thomas.speng...@toptarif.de wrote:
 
 No one has any idea?

 We tried:

 update to 1.1.2
 DiskAccessMode standard, indexAccessMode standard
 row_cache_size_in_mb: 0
 key_cache_size_in_mb: 0


 Our next try will be to change

 SerializingCacheProvider to ConcurrentLinkedHashCacheProvider

 Any other proposals are welcome.

 On 07/04/2012 02:13 PM, Thomas Spengler wrote:
 Hi @all,

 since our upgrade from Cassandra 1.0.3 to 1.1.0, the virtual memory usage
 of the Cassandra nodes has exploded.

 Our setup is:
 * 5 CentOS 5.8 nodes
 * each with 4 CPUs and 8 GB RAM
 * each node holds about 100 GB of data
 * each JVM uses 2 GB RAM
 * DiskAccessMode is standard, indexAccessMode is standard

 The memory usage grows until all of the memory is used.

 Just for information, when we were on Cassandra 1.0.3, we used
 * DiskAccessMode is standard, indexAccessMode is mmap
 * and the RAM usage was ~4GB


 Can anyone help?


 With Regards



 --
 Thomas Spengler
 Chief Technology Officer
 

 TopTarif Internet GmbH, Pappelallee 78-79, D-10437 Berlin
 Tel.: (030) 2000912 0 | Fax: (030) 2000912 100
 thomas.speng...@toptarif.de | www.toptarif.de

 Amtsgericht Charlottenburg, HRB 113287 B
 Geschäftsführer: Dr. Rainer Brosch, Dr. Carolin Gabor
 -



 
 


-- 
Thomas Spengler
Chief Technology Officer


TopTarif Internet GmbH, Pappelallee 78-79, D-10437 Berlin
Tel.: (030) 2000912 0 | Fax: (030) 2000912 100
thomas.speng...@toptarif.de | www.toptarif.de

Amtsgericht Charlottenburg, HRB 113287 B
Geschäftsführer: Dr. Rainer Brosch, Dr. Carolin Gabor
-

Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0

2012-07-25 Thread Tyler Hobbs

Are you actually seeing any problems from this? High virtual memory usage
on its own really doesn't mean anything. See
http://wiki.apache.org/cassandra/FAQ#mmap


-- 
Tyler Hobbs
DataStax http://datastax.com/
