Data from writes lives in Memtables until they are flushed to SSTables on
disk.  A flush is triggered by whichever of the ops/size/time thresholds is
exceeded first; all three are configurable.
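
To make the flush rule concrete, here is a rough Python sketch of a
Memtable that flushes as soon as any one of the three thresholds is crossed.
This is a toy model, not Cassandra's code: the class, the parameter names
(max_ops, max_bytes, max_age_seconds) and the list standing in for SSTables
on disk are all made up for illustration.

```python
import time


class Memtable:
    """Toy write buffer that flushes when the first threshold is exceeded."""

    def __init__(self, max_ops=3, max_bytes=1024, max_age_seconds=60.0,
                 clock=time.monotonic):
        # All three limits are hypothetical names for the configurable
        # ops/size/time thresholds.
        self.max_ops = max_ops
        self.max_bytes = max_bytes
        self.max_age_seconds = max_age_seconds
        self.clock = clock
        self.sstables = []  # stand-in for immutable SSTables on disk
        self._reset()

    def _reset(self):
        self.rows = {}
        self.ops = 0
        self.bytes = 0
        self.created = self.clock()

    def should_flush(self):
        # Whichever limit is hit first triggers the flush.
        return (self.ops >= self.max_ops
                or self.bytes >= self.max_bytes
                or self.clock() - self.created >= self.max_age_seconds)

    def write(self, key, column, value):
        # Only writes put data into the Memtable.
        self.rows.setdefault(key, {})[column] = value
        self.ops += 1
        self.bytes += len(key) + len(column) + len(value)
        if self.should_flush():
            self.flush()

    def flush(self):
        if self.rows:
            # An SSTable is written sorted by key and never modified again.
            self.sstables.append(dict(sorted(self.rows.items())))
        self._reset()
```

In this sketch the time threshold is only checked when a write arrives; a
real system would also check it in the background.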

There is a keycache and a rowcache.  The keycache caches the offsets of rows
within SSTables.  The rowcache caches entire rows.  There is also the OS
page cache, which is heavily used.

When a read happens, the keycache is updated with the row's position in each
SSTable the row was eventually found in.  If the keycache now holds too many
entries, some are ejected.  Overall the keycache uses very little memory per
entry and can cut your disk IO in half, so it's a pretty big win.
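
A minimal sketch of that behaviour, assuming a least-recently-used ejection
policy (the policy is my assumption here; the point is only that the cache
is bounded).  KeyCache, get and put are hypothetical names, not Cassandra's
API.

```python
from collections import OrderedDict


class KeyCache:
    """Toy cache mapping (sstable_id, row_key) to an offset in that SSTable."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first

    def get(self, sstable_id, row_key):
        k = (sstable_id, row_key)
        if k not in self.entries:
            return None  # miss: the reader has to consult the index on disk
        self.entries.move_to_end(k)  # mark as recently used
        return self.entries[k]

    def put(self, sstable_id, row_key, offset):
        # Called after a read has located the row in an SSTable.
        k = (sstable_id, row_key)
        self.entries[k] = offset
        self.entries.move_to_end(k)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # eject least recently used
```

Each entry is just a small key plus an offset, which is why the per-entry
memory cost is low compared with caching whole rows.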

If you read an entire row, it goes into the rowcache.  As with the keycache,
this may result in older entries being ejected from the cache.  If you put
lots of really large rows in the rowcache you can OOM your JVM.  The
rowcache is kept up to date with the Memtables as writes come in.
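
Roughly, the "kept up to date" part can be pictured like this.  It is a
sketch under one assumption of mine: a write only updates a row that is
already cached and never adds a new row to the cache.  Ejection is left out
to keep it short, and the names are hypothetical.

```python
class RowCache:
    """Toy row cache: populated by reads, updated in place by writes."""

    def __init__(self):
        self.rows = {}

    def on_read(self, key, full_row):
        # A read of the entire row puts a copy of it into the cache.
        self.rows[key] = dict(full_row)

    def on_write(self, key, column, value):
        # Assumption: writes refresh rows that are already cached,
        # but do not populate the cache themselves.
        if key in self.rows:
            self.rows[key][column] = value

    def get(self, key):
        return self.rows.get(key)
```

Because whole rows are held on the heap, a handful of very large rows is
enough to cause the OOM mentioned above.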

When a read comes in, C* collects the data from the SSTables and Memtables
and merges it.  Data only goes into Memtables from writes, never from
reads.
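
As an illustration of the merge, here is a small sketch that assumes each
column carries a (value, timestamp) pair and that the newest timestamp wins
when the same column shows up in more than one place.  The merge_row
function and the dict layout are invented for the example.

```python
def merge_row(key, memtable, sstables):
    """Collect one row from the Memtable and every SSTable, newest wins."""
    merged = {}
    for source in [memtable, *sstables]:
        for column, (value, ts) in source.get(key, {}).items():
            # Keep the version of each column with the highest timestamp.
            if column not in merged or ts > merged[column][1]:
                merged[column] = (value, ts)
    return merged


# Hypothetical data: the Memtable holds a newer email than the SSTable.
memtable = {"user1": {"email": ("new@example.com", 20)}}
sstables = [{"user1": {"email": ("old@example.com", 10),
                       "name": ("Ann", 5)}}]

row = merge_row("user1", memtable, sstables)
```

Note that merge_row only reads from the Memtable; the merged result is
returned to the caller and is not written back into it.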

On Tue, Feb 22, 2011 at 3:32 AM, Viktor Jevdokimov <vjevdoki...@gmail.com> wrote:

> Hello,
>
> Write path is perfectly documented in architecture overview.
>
> I need Reads to be clarified:
>
> How memory is used
> 1. When data is in the Memtable
> 2. When data is in the SSTable
>
> How cache is used alongside with Memtable?
>
> Are records created in the Memtable from writes only or from reads also?
>
> What I need to know is, how Cassandra uses memory and Memtables for reads?
>
>
> Thank you,
> Viktor
>
