Hi,

We have 8 Cassandra 1.0.5 nodes, each with 16 cores and 32 GB of RAM. The heap size is 12 GB, and memtable_total_space_in_mb is one third of that, i.e. 4 GB. There are 12 hot CFs (write-to-read ratio of about 10:1), with memtable_flush_queue_size = 4 and memtable_flush_writers = 2.
I got this log entry: "MeteredFlusher.java (line 74) estimated 4239999318 bytes used by all memtables pre-flush", following which Cassandra flushed several of its "largest" memtables. I understand that this message means the memtable_total_space_in_mb limit was reached, but I do not understand the remedy. Is increasing that setting my only option?

Also, in the standard MeteredFlusher flushes (the ones triggered by the "if my entire flush pipeline were full of memtables of this size, how big could I allow them to be" logic), I see memtables with a serialized size of 100-200 MB and an estimated live size of around 500 MB get flushed, producing SSTables of only 10-15 MB. Are these factors expected: 10-20x between the in-memory serialized size and the SSTable on disk, and a liveRatio of 3-5?

Finally, the very informative article http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/ says: "For example if memtable_total_space_in_mb is 100MB, and memtable_flush_writers is the default 1 (with one data directory), and memtable_flush_queue_size is the default 4, and a Column Family has no secondary indexes. The CF will not be allowed to get above one seventh of 100MB or 14MB, as if the CF filled the flush pipeline with 7 memtables of this size it would take 98MB". But the same article gives the pipeline capacity as "CF Count + Secondary Index Count + memtable_flush_queue_size (defaults to 4) + memtable_flush_writers (defaults to 1 per data directory) memtables in memory the JVM at once." By that formula, shouldn't the limit be one sixth (6 memtables), not one seventh (7)? (I have put small sketches of both calculations in the P.S. below.)

Thanks,
Rohit
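
P.S. To make the questions concrete, here are two small sketches in plain Java that I wrote for this mail. This is NOT Cassandra source; the class names, CF names, and sizes are all made up. First, my mental model of the total-space flush behind the log line above, assuming the flusher simply flushes the largest memtables until the estimated total is back under memtable_total_space_in_mb:

import java.util.*;

public class TotalSpaceFlushSketch
{
    public static void main(String[] args)
    {
        long allowed = 4096L * 1024 * 1024; // memtable_total_space_in_mb = 4G, in bytes

        // Hypothetical estimated live sizes (serialized bytes * liveRatio) per CF:
        Map<String, Long> liveSizes = new LinkedHashMap<>();
        liveSizes.put("cf_a", 1_800_000_000L);
        liveSizes.put("cf_b", 1_500_000_000L);
        liveSizes.put("cf_c", 1_200_000_000L);

        long estimatedTotal = 0;
        for (long s : liveSizes.values())
            estimatedTotal += s;
        System.out.println("estimated " + estimatedTotal + " bytes used by all memtables pre-flush");

        // Flush the largest memtables first until we are back under the limit:
        List<Map.Entry<String, Long>> bySize = new ArrayList<>(liveSizes.entrySet());
        bySize.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        for (Map.Entry<String, Long> cf : bySize)
        {
            if (estimatedTotal <= allowed)
                break;
            System.out.println("flushing " + cf.getKey() + " (" + cf.getValue() + " bytes)");
            estimatedTotal -= cf.getValue();
        }
    }
}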
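
Second, the per-CF limit arithmetic under both readings of the article: the article's 14 MB matches a 7-memtable pipeline, while the formula it quotes gives 6 memtables and hence 16 MB, which is exactly the discrepancy I am asking about:

public class MemtableLimitArithmetic
{
    public static void main(String[] args)
    {
        int totalSpaceMb     = 100; // memtable_total_space_in_mb in the article's example
        int cfCount          = 1;   // the single CF under consideration
        int secondaryIndexes = 0;   // no secondary indexes
        int flushQueueSize   = 4;   // memtable_flush_queue_size default
        int flushWriters     = 1;   // memtable_flush_writers default (one data directory)

        // Pipeline size per the quoted formula:
        int perFormula = cfCount + secondaryIndexes + flushQueueSize + flushWriters; // = 6
        // Pipeline size per the article's worked example:
        int perExample = 7;

        System.out.println("formula: " + perFormula + " memtables -> limit "
                           + totalSpaceMb / perFormula + " MB"); // 6 -> 16 MB
        System.out.println("example: " + perExample + " memtables -> limit "
                           + totalSpaceMb / perExample + " MB"); // 7 -> 14 MB
    }
}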