On Tue, Sep 10, 2013 at 2:30 AM, Jan Algermissen <jan.algermis...@nordsc.com> wrote:
> So in a sense, C* is designed to maximize IO write efficiency by
> pre-organizing write queries in memory. The more memory, the better the
> organization works (caveat GC).

http://en.wikipedia.org/wiki/Log-structured_merge-tree
"
The LSM-tree is a hybrid data structure. It is composed of two tree-like
<http://en.wikipedia.org/wiki/Tree_(data_structure)> structures, known as
the C0 and C1 components. C0 is smaller and entirely resident in memory,
whereas C1 is resident on disk. New records are inserted into the
memory-resident C0 component. If the insertion causes the C0 component to
exceed a certain size threshold, a contiguous segment of entries is removed
from C0 and merged into C1 on disk. The performance characteristics of
LSM-trees stem from the fact that each component is tuned to the
characteristics of its underlying storage medium, and that data is
efficiently migrated across media in rolling batches, using an algorithm
reminiscent of merge sort <http://en.wikipedia.org/wiki/Merge_sort>.
"

> Cassandra takes this eagerness for consuming writes and organizing the
> writes in memory to such an extreme, that any given node will rather die
> than stop consuming writes.

Perhaps more simply: "RAM is faster than disk" and "Cassandra does not
prevent a given node from writing to RAM faster than it can flush to disk"?

=Rob
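
To make the C0/C1 description concrete, here is a minimal sketch in Python. It is a toy, not Cassandra's storage engine: the names ToyLSM, c0_limit, merge_sorted and backlog_after are made up for illustration, C1 is just a sorted in-memory list standing in for disk, and a flush moves all of C0 rather than a contiguous segment. The second function only illustrates the "writes land in RAM faster than they flush" point with simple arithmetic.

```python
import bisect


def merge_sorted(old, new):
    """Merge two key-sorted lists of (key, value); on equal keys the newer value wins."""
    out, i, j = [], 0, 0
    while i < len(old) and j < len(new):
        if old[i][0] < new[j][0]:
            out.append(old[i])
            i += 1
        elif old[i][0] > new[j][0]:
            out.append(new[j])
            j += 1
        else:
            out.append(new[j])  # same key: keep the more recent write
            i += 1
            j += 1
    out.extend(old[i:])
    out.extend(new[j:])
    return out


class ToyLSM:
    """Toy two-component LSM: C0 is a dict in memory, C1 a sorted list standing in for disk."""

    def __init__(self, c0_limit):
        self.c0_limit = c0_limit  # hypothetical size threshold for C0
        self.c0 = {}
        self.c1 = []

    def put(self, key, value):
        self.c0[key] = value  # writes only ever touch memory
        if len(self.c0) > self.c0_limit:
            self.flush()

    def flush(self):
        # Sort the in-memory batch once, then merge it into C1 in one pass.
        self.c1 = merge_sorted(self.c1, sorted(self.c0.items()))
        self.c0.clear()

    def get(self, key):
        if key in self.c0:  # newest data first
            return self.c0[key]
        pos = bisect.bisect_left(self.c1, (key,))
        if pos < len(self.c1) and self.c1[pos][0] == key:
            return self.c1[pos][1]
        return None


def backlog_after(ticks, writes_per_tick, flushes_per_tick):
    """Unflushed entries left in memory after `ticks` steps of a fixed write and flush rate."""
    backlog = 0
    for _ in range(ticks):
        backlog += writes_per_tick
        backlog -= min(backlog, flushes_per_tick)
    return backlog
```

With these definitions, backlog_after(10, 5, 3) keeps growing by 2 entries per tick, while backlog_after(10, 3, 5) stays at zero. Nothing in the toy write path pushes back on the writer, which is the behaviour described above.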