The "MemtableSSTable" page has been changed by JonathanEllis.
http://wiki.apache.org/cassandra/MemtableSSTable?action=diff&rev1=14&rev2=15

--------------------------------------------------

  == Compaction ==
  To bound the number of SSTable files that must be consulted on reads, and to 
reclaim [[DistributedDeletes|space taken by unused data]], Cassandra performs 
compactions: merging multiple old SSTable files into a single new one. 
Compactions are triggered when at least N SSTables have been flushed to disk, 
where N is tunable and defaults to 4. Four similar-sized SSTables are merged 
into a single one. SSTables start out the same size as your memtable flush 
size, and then form a hierarchy with each level doubling in size. So you'll 
have up to N of the same size as your memtable, then up to N double that size, 
then up to N double that again, etc.
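
  The triggering rule above can be sketched as a small simulation. This is an 
illustration only, not Cassandra's implementation: the function names, the 
64 MB flush size, and the 0.5x to 1.5x "similar size" window are all 
assumptions made for the example. It also models the worst case of no 
overwrites or deletes, where a merged SSTable is exactly the sum of its 
inputs; real workloads with overwrites produce smaller merged files.

```python
def bucket_by_size(sizes, low=0.5, high=1.5):
    """Group SSTable sizes into buckets of similar size.

    A size joins a bucket if it lies within [low, high] times the bucket's
    current average. The window bounds are assumed for this sketch.
    """
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            buckets.append([size])
    return buckets


def compact_once(sizes, threshold=4):
    """Merge `threshold` similar-sized SSTables, if any bucket is full.

    Returns the new list of sizes, or None if nothing needed compacting.
    """
    for bucket in bucket_by_size(sizes):
        if len(bucket) >= threshold:
            merged = bucket[:threshold]
            remaining = list(sizes)
            for s in merged:
                remaining.remove(s)
            # Worst case: no overwrites or deletes, so nothing shrinks.
            remaining.append(sum(merged))
            return remaining
    return None


def flush_and_compact(num_flushes, flush_size=64, threshold=4):
    """Simulate memtable flushes, compacting whenever a bucket fills up."""
    sizes = []
    for _ in range(num_flushes):
        sizes.append(flush_size)
        while True:
            result = compact_once(sizes, threshold)
            if result is None:
                break
            sizes = result
    return sorted(sizes)
```

  Calling `flush_and_compact(7)` leaves three flush-sized SSTables plus one 
larger merged one, which is the tiered hierarchy described above.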
  
- "Minor" compactions merge only sstables of similar size; "major" compactions 
merge all sstables in a given !ColumnFamily.  Only major compactions can clean 
out obsolete [[DistributedDeletes|tombstones]].
+ "Minor" compactions merge only sstables of similar size; "major" compactions 
merge all sstables in a given !ColumnFamily.  Prior to Cassandra 0.6.6/0.7.0, 
only major compactions could clean out obsolete 
[[DistributedDeletes|tombstones]].
  
  Since the input SSTables are all sorted by key, merging can be done 
efficiently and still requires no random I/O.  Once compaction is finished, the 
old SSTable files may be deleted. Note that in the worst case (a workload 
consisting of no overwrites or deletes) the new file is as large as all of its 
inputs combined, so compaction temporarily requires up to twice your existing 
on-disk space.  In today's world of multi-TB disks this is usually not a 
problem, but it is good to keep in mind when you are setting alert thresholds.
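
  The sorted merge described above can be sketched with a streaming k-way 
merge. This is a minimal illustration, not Cassandra's code: the 
`merge_sstables` name and the flat `(key, timestamp, value)` entry layout are 
assumptions made for the example (real SSTables store rows of columns). Each 
input is read front to back exactly once, which is why no random I/O is needed.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_sstables(*tables):
    """Merge key-sorted tables, keeping the newest entry for each key.

    Each table is an iterable of (key, timestamp, value) tuples sorted by
    key. This layout is hypothetical and chosen for the sketch.
    """
    # heapq.merge consumes each sorted input sequentially and lazily.
    merged = heapq.merge(*tables, key=itemgetter(0))
    for _, entries in groupby(merged, key=itemgetter(0)):
        # Several inputs may hold the same key; the highest timestamp wins.
        yield max(entries, key=itemgetter(1))


older = [("a", 1, "x"), ("c", 1, "y")]
newer = [("a", 2, "z"), ("b", 2, "w")]
compacted = list(merge_sstables(older, newer))
```

  Here the overwritten key `"a"` appears only once in `compacted`, so the 
output is smaller than the inputs combined. With no overwrites at all, the 
output would be as large as the inputs together, which is the 2x worst case 
noted above.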
  
