Thanks for your thoughts. Answers below:

On Sun, Jun 20, 2010 at 2:21 PM, Peter Schuller <peter.schul...@infidyne.com> wrote:

> > The memory problems I've posted about before have gotten much worse and
> > our nodes are becoming incredibly slow/unusable every 24 hours or so.
> > Basically, the JVM reports that only 14GB is committed, but the RSS of
> > the process is 22GB, and cassandra is completely unresponsive, but still
> > having requests routed to it internally, so it completely destroys
> > performance.
> > I'm at a loss for how to diagnose this issue.
>
> Sorry, I don't know the history of this (you mentioned you've alluded
> to the problems before), so maybe I am being redundant or missing
> something, but:
>
> (1) Is the machine swapping? (Actively swapping in/out as reported by
> e.g. vmstat)
>

Yes, somewhat, although swappiness is set to 0.
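
For reference, a rough way to confirm active swapping over an interval is to
sample the pswpin/pswpout counters in /proc/vmstat twice and compare them.
This is only a sketch, assuming Linux; the helper names (parse_vmstat,
swap_delta, sample) are made up for illustration.

```python
import time


def parse_vmstat(text):
    """Turn the 'name value' lines of /proc/vmstat into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Pages swapped in and out between two parsed snapshots."""
    return {key: after[key] - before[key] for key in ("pswpin", "pswpout")}


def sample(interval=5.0):
    """Read /proc/vmstat twice, `interval` seconds apart (Linux only)."""
    with open("/proc/vmstat") as f:
        first = parse_vmstat(f.read())
    time.sleep(interval)
    with open("/proc/vmstat") as f:
        second = parse_vmstat(f.read())
    return swap_delta(first, second)
```

Calling sample() on the affected node and seeing non-zero deltas would mean
pages are actively moving to or from swap during that window, independent of
what /proc/sys/vm/swappiness is set to.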


> (2) Do the logs indicate that GC is running excessively, thus
> indicating an almost-out-of-heap condition?
>

It runs, but I wouldn't say excessively.


> (3) mmap():ed memory that is currently resident will count towards
> RSS; if you're using mmap():ed I/O (the default), that is to be
> expected.
>

This is where I'm a little confused. I thought that mmap()'d IO didn't
actually allocate memory. I thought it was just IO through a faster code
path.
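
One way to check this directly is a small experiment: map a file, touch every
page, and compare the process's resident size before and after. The sketch
below assumes Linux and /proc/self/statm; the function names are hypothetical
and it is a rough measurement, not a precise accounting.

```python
import mmap

PAGE = mmap.PAGESIZE


def resident_bytes(statm_text, page_size=PAGE):
    """Parse /proc/<pid>/statm text; return (rss, resident shared/file) in bytes."""
    fields = statm_text.split()
    # Field 2 is resident pages, field 3 is resident shared (file-backed) pages.
    return int(fields[1]) * page_size, int(fields[2]) * page_size


def touch_pages(buf):
    """Read one byte per page so the kernel has to fault each page in."""
    touched = 0
    for offset in range(0, len(buf), PAGE):
        _ = buf[offset]
        touched += 1
    return touched


def read_statm():
    """Raw contents of /proc/self/statm (Linux only)."""
    with open("/proc/self/statm") as f:
        return f.read()


def rss_around_mapping(path):
    """Return (pages touched, (rss, shared) before, (rss, shared) after)."""
    before = resident_bytes(read_statm())
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            pages = touch_pages(m)
            # Measure while the mapping is still open.
            after = resident_bytes(read_statm())
    return pages, before, after
```

Running rss_around_mapping() against a large data file and comparing the
before and after figures should show whether resident mapped pages are
counted in RSS, which is the claim in (3).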


> (4) If you are using mmap():ed I/O, that is also in and of itself
> something which can cause trouble if the operating system decides to
> swap your application out in favor of the mmap()
>
> (5) If you are swapping (see (1)), try switching from mmap():ed to
> standard I/O (due to (4)), and/or try decreasing the swappiness if
> you're on Linux (see /proc/sys/vm/swappiness).
>

I tried switching to standard IO mode, but it was very, very slow. What I'm
confused about here is that if mmap()'d IO actually allocates memory that
can put pressure on other processes' memory, is there no way to bound that?
If not, how can anybody safely use mmap()'d IO on the JVM without risking
pushing their process's important pages out of memory?

Swappiness is already at 0.


> (6) Is Cassandra CPU bound or disk bound in general, regardless of
> swapping?
>

Hard to tell because of the paging.


>
> --
> / Peter Schuller
>
