Throughput and RAM

2013-09-10 Thread Jan Algermissen
Based on my tuning work with C* over the last days, I think I have arrived at 
the following insights.

Maybe someone can confirm whether they make sense:

The more heap I give to Cassandra (up to the GC tipping point of ~8GB), the 
more writes it can accumulate in memtables before doing IO.

The more writes are accumulated in memtables, the closer the IO gets to the 
maximum possible write throughput (because there will be fewer flushes, each 
writing a larger sstable).

So in a sense, C* is designed to maximize IO write efficiency by pre-organizing 
write queries in memory. The more memory, the better the organization works 
(caveat GC).
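To make sure I understand my own claim, here is a toy model of the 
accumulate-then-flush idea (names are mine, not Cassandra's actual classes; 
just a sketch of the principle, not the real implementation):

```python
# Toy model of a memtable: writes collect in an in-memory buffer and are
# flushed as one large sorted "sstable" only when a size threshold is
# exceeded. More memory -> fewer, larger flushes.

class Memtable:
    def __init__(self, threshold_bytes):
        self.threshold = threshold_bytes
        self.data = {}          # row key -> value, merged in memory
        self.size = 0
        self.flushes = []       # each flush = one sorted run written at once

    def write(self, key, value):
        self.size += len(key) + len(value)
        self.data[key] = value  # overwrites are absorbed in RAM, no disk IO
        if self.size >= self.threshold:
            self.flush()

    def flush(self):
        # One big sequential write instead of many small random ones.
        self.flushes.append(sorted(self.data.items()))
        self.data, self.size = {}, 0

# A larger threshold (more heap) means fewer, larger sstables.
small = Memtable(threshold_bytes=64)
large = Memtable(threshold_bytes=1024)
for i in range(100):
    small.write("key%03d" % i, "value")
    large.write("key%03d" % i, "value")
print(len(small.flushes), len(large.flushes))  # small flushes far more often
```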

Cassandra takes this eagerness for consuming writes and organizing the writes 
in memory to such an extreme, that any given node will rather die than stop 
consuming writes.

In particular, I am looking for confirmation of the last point.

Jan

Re: Throughput and RAM

2013-09-10 Thread Robert Coli
On Tue, Sep 10, 2013 at 2:30 AM, Jan Algermissen jan.algermis...@nordsc.com wrote:

 So in a sense, C* is designed to maximize IO write efficiency by
 pre-organizing write queries in memory. The more memory, the better the
 organization works (caveat GC).


http://en.wikipedia.org/wiki/Log-structured_merge-tree

The LSM-tree is a hybrid data structure. It is composed of two tree-like
(http://en.wikipedia.org/wiki/Tree_(data_structure)) structures, known as the
C0 and C1 components. C0 is smaller and entirely resident in memory, whereas
C1 is resident on disk. New records are inserted into the memory-resident C0
component. If the insertion causes the C0 component to exceed a certain size
threshold, a contiguous segment of entries is removed from C0 and merged into
C1 on disk. The performance characteristics of LSM-trees stem from the fact
that each component is tuned to the characteristics of its underlying storage
medium, and that data is efficiently migrated across media in rolling batches,
using an algorithm reminiscent of merge sort
(http://en.wikipedia.org/wiki/Merge_sort).
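The rolling C0-into-C1 merge described above can be sketched as a merge of two
sorted runs (a toy illustration of the idea, not Cassandra's code; the
function name is mine):

```python
import heapq

def lsm_merge(c0, c1):
    """Merge the in-memory C0 run into the on-disk C1 run.

    Both inputs are sorted lists of (key, value) pairs; on a key
    collision the newer C0 entry wins. heapq.merge provides the
    merge-sort-style sequential pass the LSM-tree description refers to.
    """
    merged = []
    # Tag entries so that C0 (newer, tag 0) sorts ahead of C1 on equal keys.
    for key, _src, value in heapq.merge(
            ((k, 0, v) for k, v in c0),
            ((k, 1, v) for k, v in c1)):
        if merged and merged[-1][0] == key:
            continue  # older C1 duplicate; the C0 version was already kept
        merged.append((key, value))
    return merged

c0 = [("a", "new"), ("c", "3")]
c1 = [("a", "old"), ("b", "2")]
print(lsm_merge(c0, c1))  # [('a', 'new'), ('b', '2'), ('c', '3')]
```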


Cassandra takes this eagerness for consuming writes and organizing the
 writes in memory to such an extreme, that any given node will rather die
 than stop consuming writes.


Perhaps more simply: RAM is faster than disk, and Cassandra does not prevent a
given node from writing to RAM faster than it can flush to disk?

=Rob


Re: Throughput and RAM

2013-09-10 Thread Jan Algermissen

On 10.09.2013, at 19:37, Robert Coli rc...@eventbrite.com wrote:

 Cassandra does not prevent a given node from writing to RAM faster than it 
 can flush to disk? 

Yes, that is what I meant.

What remains unclear to me is the operational strategy for handling an 
increase in writes, or write peaks.

Seems to be: wait until nodes die and then add capacity.

I guess what I am looking for is the switch so that *I* can tell C* not to 
write more to RAM than it is able to flush.
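The memtable-related knobs I am aware of look roughly like this (cassandra.yaml,
1.2-era settings; the values are just examples, and whether any of them acts as
a hard back-pressure switch is exactly my question):

```yaml
# cassandra.yaml -- memtable/flush settings (example values only)

# Total heap the memtables may occupy before a flush is forced.
memtable_total_space_in_mb: 2048

# Number of threads flushing memtables to disk in parallel.
memtable_flush_writers: 1

# How many full memtables may wait in line for a flush writer.
# My understanding (unconfirmed) is that once this queue is full,
# writes to the affected column families block.
memtable_flush_queue_size: 4
```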

I have a hunch that coordinators pile up incoming requests and that the memory 
used by them causes the node to stop flushing completely.

I tried reducing the number of rpc connections and/or reducing write timeouts, 
but both had no effect.

Can anybody provide a direction in which to look?

This image (http://twitpic.com/dcwlmn) shows the typical situation for me, no 
matter what switches I work with. There is always this segment of an arc which 
shows the increasing unflushed memtables.

Jan