[jira] [Commented] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes

Ariel Weisberg (JIRA) Mon, 13 Jun 2016 17:09:07 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15328674#comment-15328674
 ]


Ariel Weisberg commented on CASSANDRA-7937:
-------------------------------------------

Looking at this a little harder it looks like 3.0 might be better off than 
later versions because of changes that were part of CASSANDRA-6696 which appear 
to be what reduced the number of concurrent flushes that could occur.

[See this 
change|https://github.com/apache/cassandra/commit/e2c6341898fa43b0e262ef031f267587050b8d0f#diff-98f5acb96aa6d684781936c141132e2aR121]
 which was actually a surprise to me because I thought it worked the old way. 
[~krummas]

> Apply backpressure gently when overloaded with writes
> -----------------------------------------------------
>
>                 Key: CASSANDRA-7937
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
>             Project: Cassandra
>          Issue Type: Improvement
>         Environment: Cassandra 2.0
>            Reporter: Piotr Kołaczkowski
>              Labels: performance
>
> When writing huge amounts of data into C* cluster from analytic tools like 
> Hadoop or Apache Spark, we can see that often C* can't keep up with the load. 
> This is because analytic tools typically write data "as fast as they can" in 
> parallel, from many nodes and they are not artificially rate-limited, so C* 
> is the bottleneck here. Also, increasing the number of nodes doesn't really 
> help, because in a collocated setup this also increases number of 
> Hadoop/Spark nodes (writers) and although possible write performance is 
> higher, the problem still remains.
> We observe the following behavior:
> 1. data is ingested at an extreme fast pace into memtables and flush queue 
> fills up
> 2. the available memory limit for memtables is reached and writes are no 
> longer accepted
> 3. the application gets hit by "write timeout", and retries repeatedly, in 
> vain 
> 4. after several failed attempts to write, the job gets aborted 
> Desired behaviour:
> 1. data is ingested at an extreme fast pace into memtables and flush queue 
> fills up
> 2. after exceeding some memtable "fill threshold", C* applies adaptive rate 
> limiting to writes - the more the buffers are filled-up, the less writes/s 
> are accepted, however writes still occur within the write timeout.
> 3. thanks to slowed down data ingestion, now flush can finish before all the 
> memory gets used
> Of course the details how rate limiting could be done are up for a discussion.
> It may be also worth considering putting such logic into the driver, not C* 
> core, but then C* needs to expose at least the following information to the 
> driver, so we could calculate the desired maximum data rate:
> 1. current amount of memory available for writes before they would completely 
> block
> 2. total amount of data queued to be flushed and flush progress (amount of 
> data to flush remaining for the memtable currently being flushed)
> 3. average flush write speed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes

Reply via email to