Hi

We have 8 Cassandra 1.0.5 nodes with 16 cores and 32G RAM each. Heap
size is 12G, and memtable_total_space_in_mb is one third of that, i.e.
4G. There are 12 hot CFs (write-to-read ratio of 10).
memtable_flush_queue_size = 4 and memtable_flush_writers = 2.

I got this log entry: "MeteredFlusher.java (line 74) estimated
4239999318 bytes used by all memtables pre-flush", following which
Cassandra flushed several of its largest memtables.
I understand that this message means the memtable_total_space_in_mb
limit is being reached, but I do not understand what the remedy should
be. Is increasing this setting my only option?
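For what it is worth, here is the back-of-the-envelope arithmetic I did
to compare the logged estimate with the configured limit (just my own
Python check, assuming memtable_total_space_in_mb works out to
12288 / 3 = 4096 MB):

    # My own arithmetic, not Cassandra code: compare the MeteredFlusher
    # estimate from the log against memtable_total_space_in_mb.
    heap_mb = 12 * 1024                         # 12G heap
    memtable_total_space_mb = heap_mb // 3      # one third = 4096 MB
    estimated_bytes = 4239999318                # from the log line
    estimated_mb = estimated_bytes / (1024 * 1024)
    print(round(estimated_mb), memtable_total_space_mb)   # ~4044 vs 4096

So the estimate is sitting right at the configured total, which is why
I assume the limit is what triggers these flushes.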

Also, in the standard MeteredFlusher flushes (the ones triggered by the
"if my entire flush pipeline were full of memtables of this size, how
big could I allow them to be" logic), I see memtables with a serialized
size of 100-200 MB and an estimated live size of around 500 MB get
flushed into sstables of only about 10-15 MB.
Are these factors expected: roughly 10-20 between the in-memory
serialized size and the resulting sstable on disk, and 3-5 for
liveRatio?
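To be explicit about how I am computing those factors (my own
arithmetic, using 150 MB and 12 MB as representative values from the
ranges above):

    # Illustrative numbers picked from the ranges I observe.
    serialized_mb = 150     # serialized memtable size (100-200 MB range)
    live_mb = 500           # estimated live size reported for the memtable
    sstable_mb = 12         # resulting sstable size (roughly 10-15 MB)

    live_ratio = live_mb / serialized_mb                 # ~3.3  -> the 3-5 factor
    serialized_vs_sstable = serialized_mb / sstable_mb   # ~12.5 -> the 10-20 factor
    print(round(live_ratio, 1), round(serialized_vs_sstable, 1))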

Also, this very informative article
http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/ has
this to say:
"For example if memtable_total_space_in_mb is 100MB, and
memtable_flush_writers is the default 1 (with one data directory), and
memtable_flush_queue_size is the default 4, and a Column Family has no
secondary indexes. The CF will not be allowed to get above one seventh
of 100MB or 14MB, as if the CF filled the flush pipeline with 7
memtables of this size it would take 98MB".
Since the stated formula is "CF Count + Secondary Index Count +
memtable_flush_queue_size (defaults to 4) + memtable_flush_writers
(defaults to 1 per data directory) memtables in memory in the JVM at
once", shouldn't the limit in that example be 6 (and not 7) memtables
in memory?


Thanks
Rohit
