[ 
https://issues.apache.org/jira/browse/CASSANDRA-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15212642#comment-15212642
 ] 

Wei Deng commented on CASSANDRA-11380:
--------------------------------------

bq. but one simple client mechanism, especially in bulk loading scenarios, is 
to set a slightly higher consistency level.

That's exactly based on the load shedding approach mentioned in the first 
paragraph, and is not always effective.

> Client visible backpressure mechanism
> -------------------------------------
>
>                 Key: CASSANDRA-11380
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11380
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Coordination
>            Reporter: Wei Deng
>
> Cassandra currently lacks a sophisticated back pressure mechanism to prevent 
> clients ingesting data at too high throughput. One of the reasons why it 
> hasn't done so is because of its SEDA (Staged Event Driven Architecture) 
> design. With SEDA, an overloaded thread pool can drop those droppable 
> messages (in this case, MutationStage can drop mutation or counter mutation 
> messages) when they exceed the 2-second timeout. This can save the JVM from 
> running out of memory and crash. However, one downside from this kind of 
> load-shedding based backpressure approach is that increased number of dropped 
> mutations will increase the chance of inconsistency among replicas and will 
> likely require more repair (hints can help to some extent, but it's not 
> designed to cover all inconsistencies); another downside is that excessive 
> writes will also introduce much more pressure on compaction (especially LCS), 
>  and backlogged compaction will increase read latency and cause more frequent 
> GC pauses, and depending on the type of compaction, some backlog can take a 
> long time to clear up even after the write is removed. It seems that the 
> current load-shedding mechanism is not adequate to address a common bulk 
> loading scenario, where clients are trying to ingest data at highest 
> throughput possible. We need a more direct way to tell the client drivers to 
> slow down.
> It appears that HBase had suffered similar situation as discussed in 
> HBASE-5162, and they introduced some special exception type to tell the 
> client to slow down when a certain "overloaded" criteria is met. If we can 
> leverage a similar mechanism, our dropped mutation event can be used to 
> trigger such exceptions to push back on the client; at the same time, 
> backlogged compaction (when the number of pending compactions exceeds a 
> certain threshold) can also be used for the push back and this can prevent 
> vicious cycle mentioned in 
> https://issues.apache.org/jira/browse/CASSANDRA-11366?focusedCommentId=15198786&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15198786.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to