[Cassandra Wiki] Trivial Update of "MemtableSSTable" by JonathanEllis

Apache Wiki Thu, 22 Apr 2010 07:56:53 -0700

The "MemtableSSTable" page has been changed by JonathanEllis.
http://wiki.apache.org/cassandra/MemtableSSTable?action=diff&rev1=8&rev2=9

--------------------------------------------------

  Cassandra writes are first written to the [[Durability|CommitLog]], and then 
to a per-!ColumnFamily structure called a Memtable.  A Memtable is basically a 
write-back cache of data rows that can be looked up by key -- that is, unlike a 
write-through cache, writes are batched up in the Memtable until it is full, 
before being written to disk as an SSTable.
  
- The process of turning a Memtable into an SSTable is called flushing.  You can 
manually trigger a flush via JMX (e.g. with bin/nodetool), which you may want to 
do before restarting nodes since it will reduce !CommitLog replay time.  
Memtables are sorted by key and then written out sequentially.
+ The process of turning a Memtable into an SSTable is called flushing.  You can 
manually trigger a flush via JMX (e.g. with bin/nodetool), which you may want to 
do before restarting nodes since it will reduce !CommitLog replay time.  
Memtables are sorted by key and then written out sequentially.  Thus, writes 
are extremely fast, costing only a commitlog append and an amortized sequential 
write for the flush!
- 
- Thus, writes are extremely fast, costing only a commitlog append and an 
amortized sequential write for the flush!
  
  Once flushed, SSTable files are immutable; no further writes may be done.  
So, on the read path, the server may have to combine row fragments from all 
the SSTables on disk, as well as any unflushed Memtables, to produce the 
requested data (although it uses tricks like bloom filters to avoid doing so 
unnecessarily).
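
The write and read paths described above can be sketched as a toy in-memory model. This is only an illustration, not Cassandra's implementation: the ToyStore class, its memtable_limit parameter, and the rule that a newer fragment simply overwrites an older one (real Cassandra reconciles columns by timestamp) are all assumptions made for the example.

```python
import bisect


class ToyStore:
    """Toy model: commit log append, in-memory Memtable, immutable SSTables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []      # append-only list standing in for the CommitLog
        self.memtable = {}        # key -> {column: value}
        self.sstables = []        # (sorted_keys, rows) pairs, oldest first
        self.memtable_limit = memtable_limit  # hypothetical "full" threshold

    def write(self, key, columns):
        # A write costs a commit log append plus an in-memory update.
        self.commit_log.append((key, dict(columns)))
        self.memtable.setdefault(key, {}).update(columns)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Sort the Memtable by key and write it out in one sequential pass.
        if not self.memtable:
            return
        keys = sorted(self.memtable)
        rows = [dict(self.memtable[k]) for k in keys]
        self.sstables.append((keys, rows))
        self.memtable = {}
        self.commit_log.clear()   # flushed data no longer needs replay

    def read(self, key):
        # Combine row fragments from every SSTable, then the Memtable.
        row = {}
        for keys, rows in self.sstables:          # oldest to newest
            i = bisect.bisect_left(keys, key)     # binary search on sorted keys
            if i < len(keys) and keys[i] == key:
                row.update(rows[i])
        row.update(self.memtable.get(key, {}))    # unflushed data is newest
        return row
```

Here a read consults every SSTable; a bloom filter per SSTable would let the loop skip files that cannot contain the key.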
  
@@ -12, +10 @@

  
  Since the input SSTables are all sorted by key, merging can be done 
efficiently, still requiring no random I/O.  Once compaction is finished, the 
old SSTable files may be deleted.  Note that in the worst case (a workload 
consisting of no overwrites or deletes) this will temporarily require twice 
your existing on-disk space.  In today's world of multi-TB disks this is 
usually not a problem, but it is good to keep in mind when you are setting 
alert thresholds.
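
As a rough sketch of why merging key-sorted inputs needs only sequential reads, the following uses Python's heapq.merge to combine several sorted SSTables in one pass. The compact function, the (key, columns) representation, and the "newer fragment overwrites older" rule are assumptions for illustration; Cassandra's actual compaction reconciles columns by timestamp and also handles deletes.

```python
import heapq


def _tagged(age, table):
    # Tag each entry with its SSTable's age so equal keys merge oldest first.
    for key, cols in table:
        yield key, age, cols


def compact(sstables):
    """Merge key-sorted SSTables (oldest first) into one key-sorted SSTable.

    Each SSTable is a list of (key, columns) pairs sorted by key.
    """
    streams = [_tagged(age, table) for age, table in enumerate(sstables)]
    merged = []
    # heapq.merge reads each input front to back, never seeking backwards.
    for key, _age, cols in heapq.merge(*streams, key=lambda e: (e[0], e[1])):
        if merged and merged[-1][0] == key:
            merged[-1][1].update(cols)        # newer fragment overwrites older
        else:
            merged.append((key, dict(cols)))  # copy so inputs stay untouched
    return merged
```

With no overlapping keys the output is as large as all inputs combined, which is the worst case mentioned above: the inputs and the output coexist on disk until the old files are deleted.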
  
- SSTables that are obsoleted by a compaction are deleted asynchronously when 
the JVM performs a GC.  You can force a GC from jconsole if necessary but this 
is not necessary; Cassandra will force one itself if it detects that it is low 
on space.  A compaction marker is also added to obsolete sstables so they can 
be deleted on startup if the server does not perform a GC before being 
restarted.
+ SSTables that are obsoleted by a compaction are deleted asynchronously when 
the JVM performs a GC.  You can force a GC from jconsole if necessary, but 
Cassandra will force one itself if it detects that it is low on space.  A 
compaction marker is also added to obsolete sstables so they can be deleted on 
startup if the server does not perform a GC before being restarted.
  
  CFStoreMBean exposes sstable space used as getLiveDiskSpaceUsed (only 
includes size of non-obsolete files) and getTotalDiskSpaceUsed (includes 
everything).
