[jira] [Commented] (CASSANDRA-11380) Client visible backpressure mechanism
[ https://issues.apache.org/jira/browse/CASSANDRA-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15927952#comment-15927952 ]

Corentin Chary commented on CASSANDRA-11380:

From my tests I didn't find a way to create a setup where there would be fair backpressure using this (which is an issue when you have a cluster shared by multiple clients/workloads).

> Client visible backpressure mechanism
> -------------------------------------
>
>                 Key: CASSANDRA-11380
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11380
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Coordination
>            Reporter: Wei Deng
>
> Cassandra currently lacks a sophisticated back pressure mechanism to prevent clients from ingesting data at too high a throughput. One of the reasons it hasn't done so is its SEDA (Staged Event Driven Architecture) design. With SEDA, an overloaded thread pool can drop droppable messages (in this case, MutationStage can drop mutation or counter mutation messages) when they exceed the 2-second timeout. This can save the JVM from running out of memory and crashing. However, one downside of this kind of load-shedding based backpressure approach is that an increased number of dropped mutations will increase the chance of inconsistency among replicas and will likely require more repair (hints can help to some extent, but they are not designed to cover all inconsistencies); another downside is that excessive writes will also put much more pressure on compaction (especially LCS), and backlogged compaction will increase read latency and cause more frequent GC pauses, and depending on the type of compaction, some backlog can take a long time to clear up even after the write load is removed. It seems that the current load-shedding mechanism is not adequate to address a common bulk loading scenario, where clients are trying to ingest data at the highest throughput possible. We need a more direct way to tell the client drivers to slow down.
>
> It appears that HBase suffered from a similar situation, as discussed in HBASE-5162, and they introduced a special exception type to tell the client to slow down when a certain "overloaded" criterion is met. If we can leverage a similar mechanism, our dropped mutation event can be used to trigger such exceptions to push back on the client; at the same time, backlogged compaction (when the number of pending compactions exceeds a certain threshold) can also be used for the push back, and this can prevent the vicious cycle mentioned in https://issues.apache.org/jira/browse/CASSANDRA-11366?focusedCommentId=15198786&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15198786.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
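To make the proposal in the issue description concrete, here is a minimal sketch of an "overloaded" push back and a client that backs off when it sees one. Everything named here is an assumption made for illustration: `OverloadedError`, `is_overloaded`, `write_with_backoff` and the threshold values are hypothetical and are not part of Cassandra, HBase, or any driver.

```python
import time


class OverloadedError(Exception):
    """Hypothetical signal a server could return to ask clients to slow down."""


def is_overloaded(dropped_mutations, pending_compactions,
                  max_dropped=0, max_pending=100):
    """Return True when either overload criterion from the description is met.

    The thresholds are illustrative defaults, not values taken from Cassandra.
    """
    return dropped_mutations > max_dropped or pending_compactions > max_pending


def write_with_backoff(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() and retry with exponentially growing pauses on OverloadedError.

    The last OverloadedError is re-raised once max_attempts is exhausted.
    """
    delay = base_delay
    for attempt in range(max_attempts):
        try:
            return send()
        except OverloadedError:
            if attempt == max_attempts - 1:
                raise
            sleep(delay)   # give the server time to drain its backlog
            delay *= 2     # back off harder on each consecutive rejection
```

A bulk loader would wrap each write in `write_with_backoff`, so a single overloaded node slows the loader down instead of silently dropping mutations.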
[jira] [Commented] (CASSANDRA-11380) Client visible backpressure mechanism
[ https://issues.apache.org/jira/browse/CASSANDRA-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15554366#comment-15554366 ]

Corentin Chary commented on CASSANDRA-11380:

Looks like a good start. I'll try to test this with my workload and publish the results. Thanks for the link.
[jira] [Commented] (CASSANDRA-11380) Client visible backpressure mechanism
[ https://issues.apache.org/jira/browse/CASSANDRA-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15553677#comment-15553677 ]

Stefania commented on CASSANDRA-11380:

CASSANDRA-9318 added a backpressure mechanism at the coordinator. It's not client visible but it indirectly slows down clients by delaying mutations at the coordinator based on the rate at which replicas acknowledge mutations. It also allows the possibility of implementing new strategies. It's a long discussion, this is the initial [comment|https://issues.apache.org/jira/browse/CASSANDRA-9318?focusedCommentId=15344958&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15344958] (Sergio Bossa - 23 Jun 16) that describes the backpressure mechanism, but then it got changed again a few times so for a quick summary you can read the documentation in [cassandra.yaml|https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L1176].
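The idea of delaying mutations according to the replica acknowledgement rate can be illustrated with a small sketch. This is not the CASSANDRA-9318 implementation; `mutation_delay`, `min_ratio` and `max_delay` are hypothetical names and values chosen only to show the principle of pacing sends to the observed ack rate.

```python
def mutation_delay(sent_per_sec, acked_per_sec, min_ratio=0.9, max_delay=2.0):
    """Seconds to hold back each outgoing mutation (illustrative only).

    No delay is applied while the replica acknowledges at least min_ratio of
    what is sent. Otherwise the extra gap between sends is chosen so that the
    send rate drops to the ack rate, capped at max_delay.
    """
    if sent_per_sec <= 0 or acked_per_sec >= min_ratio * sent_per_sec:
        return 0.0
    if acked_per_sec <= 0:
        return max_delay
    # Interval between sends at the ack rate, minus the current interval.
    extra_gap = 1.0 / acked_per_sec - 1.0 / sent_per_sec
    return min(extra_gap, max_delay)
```

For example, a coordinator sending 1000 mutations per second to a replica that acknowledges only 500 per second would add about one millisecond per mutation, which halves the send rate.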
[jira] [Commented] (CASSANDRA-11380) Client visible backpressure mechanism
[ https://issues.apache.org/jira/browse/CASSANDRA-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15553373#comment-15553373 ]

Corentin Chary commented on CASSANDRA-11380:

This might be naive, but couldn't we just stop reading from the client's socket (and maybe slow down compaction) if memory pressure / load becomes too high? TCP will eventually do its job of telling the client to "slow down". I currently have a setup with pretty frequent issues of ever-growing queues of pending mutations from which the server never seems able to recover.
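The "stop reading from the socket" idea relies on kernel flow control: once the receiver stops reading, buffers fill and the sender can no longer write. A minimal sketch of that effect follows. As an assumption for the sake of a self-contained example it uses a local `socket.socketpair()` rather than a real TCP connection, with the kernel socket buffer standing in for the TCP window; `fill_until_blocked` and `drain` are hypothetical helper names.

```python
import socket


def fill_until_blocked(sender, chunk=b"x" * 4096, limit=64 * 1024 * 1024):
    """Send without blocking until the kernel refuses more data.

    Returns (bytes_accepted, blocked). blocked is True when the send buffer
    filled up because the peer was not reading.
    """
    sender.setblocking(False)
    sent = 0
    while sent < limit:
        try:
            sent += sender.send(chunk)
        except BlockingIOError:
            return sent, True
    return sent, False


def drain(receiver):
    """Read everything currently buffered on the receiver; return byte count."""
    receiver.setblocking(False)
    total = 0
    while True:
        try:
            data = receiver.recv(65536)
        except BlockingIOError:
            return total
        if not data:
            return total
        total += len(data)


if __name__ == "__main__":
    client, server = socket.socketpair()
    accepted, blocked = fill_until_blocked(client)
    print("client stalled:", blocked, "after", accepted, "bytes")
    print("server drained", drain(server), "bytes; client can send again")
    client.close()
    server.close()
```

A blocking client would simply pause inside `send` at the point where this sketch gets `BlockingIOError`, which is the implicit "slow down" the comment refers to. It throttles per connection, so it does not by itself address fairness between clients.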
[jira] [Commented] (CASSANDRA-11380) Client visible backpressure mechanism
[ https://issues.apache.org/jira/browse/CASSANDRA-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15328048#comment-15328048 ]

Wei Deng commented on CASSANDRA-11380:

Added a link to CASSANDRA-7937, as there is quite a lot of discussion from the dev team on this issue (and much of it is worth reading to understand what people have considered). As long as this general problem is still on people's radar, I'm OK with closing this one as a duplicate (assuming 7937 can be re-opened).
[jira] [Commented] (CASSANDRA-11380) Client visible backpressure mechanism
[ https://issues.apache.org/jira/browse/CASSANDRA-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15212642#comment-15212642 ]

Wei Deng commented on CASSANDRA-11380:

bq. but one simple client mechanism, especially in bulk loading scenarios, is to set a slightly higher consistency level.

That's exactly based on the load shedding approach mentioned in the first paragraph, and is not always effective.
[jira] [Commented] (CASSANDRA-11380) Client visible backpressure mechanism
[ https://issues.apache.org/jira/browse/CASSANDRA-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15212353#comment-15212353 ]

Jeremy Hanna commented on CASSANDRA-11380:

Not to discourage work on this ticket at all, but one simple client mechanism, especially in bulk loading scenarios, is to set a slightly higher consistency level. See also https://datastax-oss.atlassian.net/browse/SPARKC-262
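A rough way to see why a higher consistency level throttles a synchronous writer: the coordinator must collect more replica acknowledgements before it answers, so the client is paced by slower replicas. The sketch below only computes the number of acknowledgements required; `required_acks` is a hypothetical helper, not a driver API, and it covers only a few of the consistency levels.

```python
def required_acks(consistency_level, replication_factor):
    """Replica acknowledgements needed before a write is reported successful."""
    levels = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": replication_factor // 2 + 1,  # strict majority of replicas
        "ALL": replication_factor,
    }
    acks = levels[consistency_level]
    if acks > replication_factor:
        raise ValueError("consistency level needs more replicas than exist")
    return acks
```

With a replication factor of 3, moving from ONE to QUORUM means waiting for two replicas instead of one, and ALL means waiting for the slowest of the three. As the reply to this comment in the thread notes, this still rests on load shedding and is not always effective.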