> I don't know what you are basing that on. It seems unlikely to me that
> the working set of a compaction is 600 MB. However, it may very well
> be that the allocation rate is such that it contributes to an
> additional 600 MB average heap usage after a CMS phase has completed.
I will investigate the situation more closely by watching GC via
jconsole, but isn't the bloom filter for a new sstable held entirely in
memory? On disk there are only 2 files, Index and Data.
-rw-r--r-- 1 root wheel 1388969984 Dec 27 09:25 sipdb-tmp-hc-4634-Index.db
-rw-r--r-- 1 root wheel 10965221376 Dec 27 09:25 sipdb-tmp-hc-4634-Data.db
The bloom filter can be that big. In my experience, if I trigger a major
compaction on a 180 GB CF (Compacted row mean size: 130), it will OOM the
node after 10 seconds, so I am sure that compaction eats memory pretty well.
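A rough back-of-the-envelope sketch of why I think the bloom filter can get that large. It assumes the "Compacted row mean size" is in bytes and that the filter costs about 10 bits per row key; that bits-per-key figure is my assumption for illustration, not a value taken from the Cassandra source, and the function name is made up.

```python
def estimate_bloom_filter_bytes(cf_bytes, mean_row_bytes, bits_per_key):
    """Estimate row count and bloom filter size for one column family.

    bits_per_key is an assumed cost per row key, not Cassandra's real setting.
    """
    rows = cf_bytes // mean_row_bytes      # approximate number of rows
    filter_bytes = rows * bits_per_key // 8  # bits converted to bytes
    return rows, filter_bytes


# 180 GB CF with a mean row size of 130 bytes, assuming 10 bits per key
rows, filter_bytes = estimate_bloom_filter_bytes(180 * 10**9, 130, 10)
print(rows)          # 1384615384 rows
print(filter_bytes)  # 1730769230 bytes, roughly 1.7 GB
```

Under those assumptions a single major compaction of that CF would need a filter well over 1 GB for the new sstable, which is in the same range as the heap itself on many nodes.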
> Also, you say it's "pretty dead". What exactly does that mean? Does
> it OOM?
Yes, it prints messages saying the heap is almost full, and after some
time it usually OOMs during a large compaction.
> The easiest fix is probably to increase the heap size. I know this
> e-mail doesn't begin to explain details but it's such a long story.
Actually, there is a lack of decent documentation about Cassandra memory
and GC tuning.
DataStax recommends this: (memtable_total_space_in_mb) + 1GB +
(key_cache_size_estimate), which will work only for small tables.
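To make the formula concrete, here is a minimal sketch of it. The function name and the example numbers (2048 MB of memtable space, 100 MB of key cache) are hypothetical and only there for illustration.

```python
def recommended_heap_mb(memtable_total_space_in_mb, key_cache_size_estimate_mb):
    """Heap size in MB per the formula above: memtables + 1 GB + key cache."""
    fixed_overhead_mb = 1024  # the "+ 1GB" term
    return memtable_total_space_in_mb + fixed_overhead_mb + key_cache_size_estimate_mb


# Hypothetical node: 2048 MB for memtables, 100 MB key cache estimate
heap_mb = recommended_heap_mb(2048, 100)
print(heap_mb)  # 3172
```

Note that none of the three terms grows with the amount of data on disk, so anything that scales with the number of rows, such as bloom filters, has to fit into the fixed 1 GB.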