[ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ariel Weisberg resolved CASSANDRA-9318.
---------------------------------------
    Resolution: Won't Fix

This ticket was specifically scoped to an implementation strategy that isn't going to solve the issue of clients being able to submit more work than a cluster can handle, which results in timeouts and in nodes appearing unresponsive because they can't do the work in time. We can stop the server from running out of memory and crashing, but we can't stop the client from submitting more requests than the server can handle, because we need nodes to effectively operate as write buffers for slow nodes to maintain availability.

At this point I am kind of with [Jonathan Shook|https://issues.apache.org/jira/browse/CASSANDRA-9318?focusedCommentId=14536846&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14536846] that shedding load (and writing hints) inside the DB is less useful for dealing with overload. I think it is useful for dealing with temporarily slow ranges on the hash ring, and it's part of the overall nodes-as-write-buffers strategy C* uses to maintain availability.

I found some ways to OOM the server (CASSANDRA-10971 and CASSANDRA-10972) and have patches out for those. The number of in-flight requests already has bounds, depending on the bottleneck, that prevent the server from crashing, so adding an explicit one isn't useful right now. When TPC is implemented we will have to implement a bound, since there is no thread pool to exhaust, but that is later work.
> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths, Streaming and Messaging
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 2.1.x, 2.2.x
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding
> bytes and requests and if it reaches a high watermark disable read on client
> connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't
> introduce other issues.
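
For readers unfamiliar with the high/low watermark idea in the ticket description, here is a minimal, language-agnostic sketch in Python. It is not the patch discussed on this ticket and not Cassandra code: the class name `InflightLimiter`, its method names, and the threshold values are all hypothetical, and a real coordinator would need thread-safe counters and an actual hook to pause reads on client connections.

```python
class InflightLimiter:
    """Hypothetical tracker for outstanding requests and request bytes.

    Reads are disabled once either counter reaches its high watermark and
    re-enabled only after both counters fall below their low watermarks.
    Single-threaded sketch; not thread-safe.
    """

    def __init__(self, high_requests, low_requests, high_bytes, low_bytes):
        if low_requests > high_requests or low_bytes > high_bytes:
            raise ValueError("low watermarks must not exceed high watermarks")
        self.high_requests = high_requests
        self.low_requests = low_requests
        self.high_bytes = high_bytes
        self.low_bytes = low_bytes
        self.requests = 0          # outstanding request count
        self.bytes = 0             # outstanding request bytes
        self.reads_enabled = True  # stand-in for "read enabled on client connections"

    def on_request_start(self, size):
        """Record a newly accepted request; return whether reads stay enabled."""
        self.requests += 1
        self.bytes += size
        if self.requests >= self.high_requests or self.bytes >= self.high_bytes:
            self.reads_enabled = False
        return self.reads_enabled

    def on_request_done(self, size):
        """Record a completed request; return whether reads are enabled."""
        self.requests -= 1
        self.bytes -= size
        if (not self.reads_enabled
                and self.requests < self.low_requests
                and self.bytes < self.low_bytes):
            self.reads_enabled = True
        return self.reads_enabled


if __name__ == "__main__":
    limiter = InflightLimiter(high_requests=3, low_requests=1,
                              high_bytes=1000, low_bytes=100)
    for _ in range(3):
        limiter.on_request_start(10)
    print("reads enabled after 3 requests:", limiter.reads_enabled)
```

The gap between the two watermarks keeps reads from flapping on and off when load hovers near the limit. As the resolution comment notes, a bound like this limits memory use on the coordinator but does not stop clients from offering more work than the cluster can complete.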