[ https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15328674#comment-15328674 ]
Ariel Weisberg commented on CASSANDRA-7937: ------------------------------------------- Looking at this a little harder it looks like 3.0 might be better off than later versions because of changes that were part of CASSANDRA-6696 which appear to be what reduced the number of concurrent flushes that could occur. [See this change|https://github.com/apache/cassandra/commit/e2c6341898fa43b0e262ef031f267587050b8d0f#diff-98f5acb96aa6d684781936c141132e2aR121] which was actually a surprise to me because I thought it worked the old way. [~krummas] > Apply backpressure gently when overloaded with writes > ----------------------------------------------------- > > Key: CASSANDRA-7937 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7937 > Project: Cassandra > Issue Type: Improvement > Environment: Cassandra 2.0 > Reporter: Piotr Kołaczkowski > Labels: performance > > When writing huge amounts of data into C* cluster from analytic tools like > Hadoop or Apache Spark, we can see that often C* can't keep up with the load. > This is because analytic tools typically write data "as fast as they can" in > parallel, from many nodes and they are not artificially rate-limited, so C* > is the bottleneck here. Also, increasing the number of nodes doesn't really > help, because in a collocated setup this also increases number of > Hadoop/Spark nodes (writers) and although possible write performance is > higher, the problem still remains. > We observe the following behavior: > 1. data is ingested at an extreme fast pace into memtables and flush queue > fills up > 2. the available memory limit for memtables is reached and writes are no > longer accepted > 3. the application gets hit by "write timeout", and retries repeatedly, in > vain > 4. after several failed attempts to write, the job gets aborted > Desired behaviour: > 1. data is ingested at an extreme fast pace into memtables and flush queue > fills up > 2. after exceeding some memtable "fill threshold", C* applies adaptive rate > limiting to writes - the more the buffers are filled-up, the less writes/s > are accepted, however writes still occur within the write timeout. > 3. thanks to slowed down data ingestion, now flush can finish before all the > memory gets used > Of course the details how rate limiting could be done are up for a discussion. > It may be also worth considering putting such logic into the driver, not C* > core, but then C* needs to expose at least the following information to the > driver, so we could calculate the desired maximum data rate: > 1. current amount of memory available for writes before they would completely > block > 2. total amount of data queued to be flushed and flush progress (amount of > data to flush remaining for the memtable currently being flushed) > 3. average flush write speed -- This message was sent by Atlassian JIRA (v6.3.4#6332)