What is my "live set"? Is the system CPU bound given the few statements
below? This is from running 4 concurrent processes against the node...do I
need to throttle back the concurrent read/writers?

I do all reads and writes at QUORUM (replication factor of 3).
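
For reference, the quorum arithmetic at RF=3 (a throwaway sketch; class and
variable names are purely illustrative):

    // Quorum arithmetic for reads/writes at CL.QUORUM; names are illustrative.
    public class QuorumMath {
        static int quorum(int replicationFactor) {
            return replicationFactor / 2 + 1;   // integer division: 3 -> 2
        }

        public static void main(String[] args) {
            int rf = 3;   // our replication factor
            System.out.println("QUORUM waits on " + quorum(rf) + " of " + rf + " replicas");
            // Prints 2 for RF=3, so every read and write blocks on two nodes,
            // which is why four concurrent clients can keep much of the ring busy.
        }
    }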

The memtable threshold is the default of 256.

All caching is turned off.

The database is pretty small, maybe a few million keys (2-3 million) across 4
CFs, and the keys themselves are small. Some of the rows are quite fat, though
(fatter than I thought). I am storing secondary indexes in separate CFs, and
those are the large rows that I think might be part of the problem. I will
restart testing with them turned off and see if I notice any difference.
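
A back-of-the-envelope on how big one of those index rows could be (the column
count and per-column overhead below are assumptions for illustration, not
measurements):

    // Back-of-the-envelope size of one secondary-index row; the counts and
    // per-column overhead are placeholder assumptions, not measured values.
    public class IndexRowGuess {
        public static void main(String[] args) {
            long columnsPerIndexRow = 1000000L; // assumed: keys sharing one index value
            long nameBytes = 16;                // assumed average column name size
            long valueBytes = 8;                // assumed average column value size
            long overheadBytes = 32;            // assumed per-column overhead
            long rowBytes = columnsPerIndexRow * (nameBytes + valueBytes + overheadBytes);
            System.out.printf("~%.1f MB for one index row%n",
                    rowBytes / (1024.0 * 1024.0));
            // A row this wide has to be handled as a unit on reads and during
            // compaction, which is one way a fat row can spike memory use.
        }
    }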

Would an extra-fat row explain repeated, back-to-back OOM crashes? I have
finally got the system relatively stable, and I even ran a compaction on the
bad node without a problem (still no row size stats, though).

I now have several other nodes flapping, each with only the following error in
cassandra.log:
Error: Exception thrown by the agent : java.lang.NullPointerException

I assume this is an unrelated problem?

Thanks for all of your help!

On Thu, Aug 19, 2010 at 10:26 PM, Peter Schuller <peter.schul...@infidyne.com> wrote:

> So, these:
>
> > INFO [GC inspection] 2010-08-19 16:34:46,656 GCInspector.java (line 116)
> > GC for ConcurrentMarkSweep: 41615 ms, 192522712 reclaimed leaving
> > 8326856720 used; max is 8700035072
> [snip]
> > INFO [GC inspection] 2010-08-19 16:36:00,786 GCInspector.java (line 116)
> > GC for ConcurrentMarkSweep: 37122 ms, 157488 reclaimed leaving
> > 8342836376 used; max is 8700035072
>
> ...show that your live set is indeed very close to the heap maximum,
> so the concurrent mark/sweep phases run often and free very little
> memory. In addition, the fact that it seems to take 35-45 seconds to
> do a concurrent mark/sweep on an 8 gig heap on a modern system
> suggests that you are probably CPU bound in cassandra at the time
> (which makes GC slower).
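
Working those two GC lines out against the heap max (just arithmetic on the
figures quoted above; the class name is illustrative):

    // Arithmetic on the two GC lines quoted above: how full the heap still is
    // after each ConcurrentMarkSweep cycle.
    public class HeapOccupancy {
        public static void main(String[] args) {
            long max = 8700035072L;                           // "max is" in both lines
            long[] usedAfterCms = {8326856720L, 8342836376L}; // "used" after each CMS
            for (long used : usedAfterCms) {
                System.out.printf("%.1f%% of the heap still live after CMS%n",
                        100.0 * used / max);
            }
            // Both cycles leave roughly 96% of an ~8.1 GiB heap occupied, which
            // is why CMS runs back to back and reclaims almost nothing.
        }
    }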
>
> In short, you're using too much memory in comparison to the maximum
> heap size. The expected result is either an OOM, or becoming too slow
> due to excessive GC activity (usually the latter followed by the
> former).
>
> Now, the question is what memory is used *for*, and why. First off, to
> get that out of the way, are you inserting with consistency level
> ZERO? I am not sure whether it applies to 0.6.4 or not, but there used
> to be an issue involving writes at consistency level ZERO not being
> throttled at all, meaning that if you threw writes at the system
> faster than it could handle them, memory use would accumulate. I
> don't believe this is a problem with CL.ONE and above, even in 0.6.4
> (but someone correct me if I'm wrong).
>
> (As an aside: I'm not sure whether the behavior was such that it might
> explain an OOM on restart, as a result of accumulated commitlogs being
> replayed faster than the memtables can be flushed. Perhaps not; I'm
> not sure.)
>
> In any case, the most important factors are what you're actually doing
> with the cluster, but you don't say much about the data, in particular
> how many rows and columns you're populating it with.
>
> The primary users of large amounts of memory in cassandra include
> (hopefully I'm not missing something major):
>
> * bloom filters that are used to efficiently avoid doing I/O on
> sstables that do not contain relevant data. the size of the bloom
> filters scales linearly with the number of row keys (not columns,
> right? I don't remember). so here we have an expected permanent, but
> low, memory use as a result of a large database. how large is your
> database? 100 million keys? 1 billion? 10 billion?
>
> * the memtables; the currently active memtable and any memtables
> currently undergoing flushing. the sizes of these are directly
> controllable in the configuration file. make sure they are reasonable.
> (If you're not sure at all, with an 8 gig heap I'd say <= 512 mb is a
> reasonable recommendation unless you have a reason to make them
> larger.)
>
> * row cache and key cache, both controllable in the configuration. in
> particular the row cache can be huge if you have configured it as
> such.
>
> * to some extent unflushed commitlogs; the commit log rotation
> threshold controls this. the default value is low enough that it
> should not be your culprit.
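
On the bloom filter point above: a rough sketch of that scaling, with our key
count thrown in (the bits-per-key figure is an assumed ballpark for
illustration, not the exact value Cassandra uses):

    // Bloom filter memory grows linearly with the number of row keys. The
    // bits-per-key figure is an assumed ballpark, not Cassandra's exact value.
    public class BloomFilterGuess {
        public static void main(String[] args) {
            double bitsPerKey = 15.0;  // assumed ballpark
            long[] keyCounts = {3000000L, 100000000L, 1000000000L, 10000000000L};
            for (long keys : keyCounts) {
                double gib = keys * bitsPerKey / 8 / (1024.0 * 1024 * 1024);
                System.out.printf("%d keys -> ~%.2f GiB of bloom filters%n", keys, gib);
            }
            // A few million keys costs a handful of MB; billions of keys cost
            // gigabytes, so this only matters for much larger databases.
        }
    }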
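
On the memtable point above: plugging in our 4 CFs and the 256 threshold
(assuming the threshold is in MB; the in-memory overhead multiplier is just an
assumption):

    // Rough memtable heap budget: per-CF threshold x number of CFs x an assumed
    // JVM overhead multiplier (live memtables cost more heap than the configured
    // threshold suggests; the multiplier here is an assumption).
    public class MemtableBudget {
        public static void main(String[] args) {
            int columnFamilies = 4;       // CFs mentioned earlier in the thread
            int thresholdMb = 256;        // per-CF threshold, assumed to be in MB
            double overheadFactor = 3.0;  // assumed in-memory overhead multiplier
            double worstCaseMb = columnFamilies * thresholdMb * overheadFactor;
            System.out.printf("Worst-case memtable heap: ~%.0f MB of an ~8 GB heap%n",
                    worstCaseMb);
            // Roughly 3 GB before counting memtables still being flushed, which
            // is a big slice of the heap; a smaller threshold leaves more headroom.
        }
    }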
>
> So the question is what your usage is like. How many unique rows do
> you have? How many columns? The data size in and of itself should not
> matter much to memory use, except of course that extremely large
> individual values will be relevant to transient high memory use when
> they are read/written.
>
> In general, lacking large row caches and such things, you should be
> able to have hundreds of millions of entries on an 8 gb heap, assuming
> reasonably sized keys.
>
> --
> / Peter Schuller
>
