Try reducing the memtable_total_space_in_mb config setting. If the problem
is incorrect memory metering, that should help.
It does not help much, because the difference between the correct size and
Cassandra's assumed calculation is way too high. It would require me to
shrink memtables to about 10% of their correct size, leading to too many
compactions.
I have 3 workload types running in batch: delete-only, insert-only, and
heavy update (lots of overwrites).
Are you saying you do a lot of deletes, followed by a lot of inserts,
and then updates, all for the same CF?
No. The most common workload type is insert-only. From time to time there
are batch jobs doing lots of overwrites in memtables, and occasionally
cleanup jobs doing only deletes. This breaks the liveRatio calculation too,
because Cassandra assumes not only that the average column size stored in
the memtable is constant, but also that the overwrite ratio in the memtable
is constant. If you overwrite too much, Cassandra starts to make very tiny
sstables; if you delete too much, there is a risk of OOM.
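The failure mode can be sketched numerically. This is a toy model, not
Cassandra's actual code: it only assumes the estimate has the shape
"serialized bytes written × liveRatio", and all sizes below are made up
for illustration.

```python
# Toy model: Cassandra estimates a memtable's heap footprint roughly as
# bytes_written * liveRatio, where liveRatio was measured earlier, under
# a different workload mix. (Hypothetical sketch; sizes are assumptions.)
def estimated_heap(bytes_written, live_ratio):
    return bytes_written * live_ratio

# liveRatio as measured during an insert-only phase:
live_ratio = 10.0

# Heavy-overwrite phase: 100 MB written, but overwrites keep live data at
# ~20 MB of serialized payload, whose real heap cost is ~10x => ~200 MB.
bytes_written = 100 * 1024**2
real_heap = 20 * 1024**2 * 10
print(estimated_heap(bytes_written, live_ratio) / real_heap)
# 5.0 -> a 5x overestimate: premature flushes and very tiny sstables

# Delete-heavy phase: tombstones serialize tiny but still cost real heap
# (~20 serialized bytes vs ~400 heap bytes each, both assumed).
tombstones = 1_000_000
bytes_written = tombstones * 20
real_heap = tombstones * 400
print(estimated_heap(bytes_written, live_ratio) / real_heap)
# 0.5 -> a 2x underestimate: OOM risk
```

The point is that a single cached ratio cannot be right for both phases at
once: the same liveRatio that overshoots under overwrites undershoots under
deletes.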
Yes. The record is about 120, but that is rare; 80 should be good enough.
The default of 10 (if not using jamm) is way too low.
Can you provide some information on what is stored in the CF and what
sort of workload it sees? It would be interesting to understand why the
real memory usage is 120 times the serialised size.
super column family:
and column_metadata = [
    {column_name : 'crc32',
     validation_class : LongType},
    {column_name : 'id',
     validation_class : LongType},
    {column_name : 'name',
     validation_class : AsciiType},
    {column_name : 'size',
     validation_class : LongType}];
But this is not important; the problem is that you do not recalculate the
liveRatio frequently enough. If the workload changes, the ratio looks like:
INFO [MemoryMeter:1] 2012-05-12 21:11:51,649 Memtable.java (line 186)
CFS(Keyspace='dedup', ColumnFamily='resultcache') liveRatio is 64.0
(just-counted was 4.633391051722882). calculation took 111ms for 4465
columns
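The log line is consistent with a clamp-and-ratchet update rule. The sketch
below is an assumption based on Cassandra 1.x behavior, not something stated
in this thread: the measured ratio is clamped to [1.0, 64.0], and the stored
liveRatio is never allowed to decrease, so a just-counted 4.63 cannot pull a
stale 64.0 back down.

```python
# Hypothetical reconstruction of the liveRatio update rule (assumption,
# not Cassandra source): clamp the fresh measurement, then ratchet upward.
MIN_RATIO, MAX_RATIO = 1.0, 64.0

def update_live_ratio(current, just_counted):
    clamped = max(MIN_RATIO, min(MAX_RATIO, just_counted))
    return max(current, clamped)   # never decreases

print(update_live_ratio(64.0, 4.633391051722882))  # 64.0, as in the log
```

Under such a rule, one burst of overwrites can pin the ratio at the ceiling
for the memtable's lifetime, no matter what later measurements say.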
Why not recalculate it every 5 or 10 minutes? The calculation takes just a
few seconds.
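That suggestion could be sketched like this. Everything here is a toy
stand-in: `meter_live_ratio` and the dict-based memtable are hypothetical,
standing in for Cassandra's jamm-based measurement, and the point is only
that the recalculation runs on a fixed timer instead of at operation-count
thresholds.

```python
import threading

# Hypothetical stand-in for the real heap measurement.
def meter_live_ratio(memtable):
    return max(1.0, memtable['heap_bytes'] / max(1, memtable['serialized_bytes']))

def schedule_recalc(memtable, interval_s=300.0):
    """Remeasure liveRatio now, then again every interval_s seconds."""
    memtable['live_ratio'] = meter_live_ratio(memtable)
    t = threading.Timer(interval_s, schedule_recalc, args=(memtable, interval_s))
    t.daemon = True
    t.start()
    return t

mt = {'heap_bytes': 400 * 1024**2, 'serialized_bytes': 40 * 1024**2}
timer = schedule_recalc(mt, interval_s=300.0)
print(mt['live_ratio'])  # 10.0 with the sizes assumed above
timer.cancel()
```

A time-based trigger would bound how long a stale ratio can survive a
workload change, at the cost of a few seconds of metering every interval.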