[ 
https://issues.apache.org/jira/browse/CASSANDRA-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16663:
----------------------------------------
    Test and Documentation Plan: 
There are two new {{cassandra.yaml}} options in this patch:

{{native_transport_rate_limiting_enabled}} - Whether or not to apply a rate 
limit on CQL requests. (Default: false)

{{native_transport_max_requests_per_second}} - The limit itself. (Default: 
Integer.MAX_VALUE)
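For reference, enabling the limiter in {{cassandra.yaml}} might look like the following. The option names come from the patch; the 10000 value is an arbitrary example, not a recommendation:

```yaml
# Enable request-based rate limiting on the native transport (CQL) path.
native_transport_rate_limiting_enabled: true

# Per-coordinator cap on CQL requests per second. (Example value only;
# the shipped default is Integer.MAX_VALUE, i.e. effectively unlimited.)
native_transport_max_requests_per_second: 10000
```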

In terms of testing, there is a new suite of overload tests in 
{{RateLimitingTest}}, and this provides at least basic coverage along the major 
axes of message size, back-pressure vs. throw-on-overload, and behavior in the 
face of configuration changes at runtime (although part of that last bit is in 
{{ClientResourceLimitsTest}}).

As this moves into review, I've also used {{SimpleClientPerfTest}} across 
small/large messages and all 4 active protocol versions to make sure the 
limiter maintains throughput in a reasonably tight band around the configured 
limit. More stress testing, likely w/ tlp-stress and with multi-node clusters, 
will happen concurrently w/ review.
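The "tight band around the configured limit" behavior can be illustrated with a generic token-bucket limiter. This is a sketch only, not Cassandra's actual limiter (which lives in the patch); the class and rate below are hypothetical:

```python
import time

class TokenBucket:
    """Generic token-bucket rate limiter (illustrative, not Cassandra's)."""

    def __init__(self, rate_per_second):
        self.rate = rate_per_second
        self.tokens = float(rate_per_second)  # start with a full bucket
        self.last_refill = time.monotonic()

    def try_acquire(self):
        """Return True if a request may proceed, False on overload."""
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at one second's worth.
        self.tokens = min(self.rate,
                          self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

limiter = TokenBucket(rate_per_second=100)
# An instantaneous burst of 1000 requests admits roughly one second's quota.
admitted = sum(limiter.try_acquire() for _ in range(1000))
```

A breach (a {{False}} return) is the point where the server would either throw OverloadedException or stop reading from the channel, depending on the connection's STARTUP options.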

Finally, from an operational perspective, this is a coordinator-level limit, 
and assuming RF=N, any time a replica is unavailable, cluster-wide capacity to 
query a given partition (assuming a token-aware client) decreases by 1/N.
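The 1/N capacity loss is simple arithmetic; a sketch under assumed example numbers (RF=N=3 and a 1000 req/s per-coordinator limit, neither taken from the patch):

```python
# Assumed example numbers: RF = N = 3 nodes, each coordinator capped
# at 1000 requests/second via native_transport_max_requests_per_second.
replication_factor = 3
per_coordinator_limit = 1000

# All replicas up: a token-aware client can spread requests for a given
# partition across all N coordinators.
full_capacity = replication_factor * per_coordinator_limit

# One replica down: only N-1 coordinators remain for that partition.
degraded_capacity = (replication_factor - 1) * per_coordinator_limit

loss_fraction = (full_capacity - degraded_capacity) / full_capacity  # == 1/N
```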

  was:
There are two new {{cassandra.yaml}} options in this patch:

{{native_transport_rate_limiting_enabled}} - Whether or not to apply a rate 
limit on CQL requests. (Default: false)

{{native_transport_max_requests_per_second}} - The limit itself. (Default: 
Double.MAX_VALUE)

In terms of testing, there is a new suite of overload tests in 
{{RateLimitingTest}}, and this provides at least basic coverage along the major 
axes of message size, back-pressure vs. throw-on-overload, and behavior in the 
face of configuration changes at runtime (although part of that last bit is in 
{{ClientResourceLimitsTest}}).

As this moves into review, I've also used {{SimpleClientPerfTest}} across 
small/large messages and all 4 active protocol versions to make sure the 
limiter maintains throughput in a reasonably tight band around the configured 
limit. More stress testing, likely w/ tlp-stress and with multi-node clusters, 
will happen concurrently w/ review.

Finally, from an operational perspective, this is a coordinator-level limit, 
and assuming RF=N, any time a replica is unavailable, cluster-wide capacity to 
query a given partition (assuming a token-aware client) decreases by a factor 
of 1/N.


> Request-Based Native Transport Rate-Limiting
> --------------------------------------------
>
>                 Key: CASSANDRA-16663
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16663
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Messaging/Client
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 4.x
>
>          Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> Together, CASSANDRA-14855, CASSANDRA-15013, and CASSANDRA-15519 added support 
> for a runtime-configurable, per-coordinator limit on the number of bytes 
> allocated for concurrent requests over the native protocol. It supports 
> channel back-pressure by default, and optionally supports throwing 
> OverloadedException if that is requested in the relevant connection’s STARTUP 
> message.
> This can be an effective tool to prevent the coordinator from running out of 
> memory, but it may not correspond to how expensive queries are or provide a 
> direct conceptual mapping to how users think about request capacity. I 
> propose adding the option of request-based (or perhaps more correctly 
> message-based) back-pressure, coexisting with (and reusing the logic that 
> supports) the current bytes-based back-pressure.
> _We can roll this forward in phases_, where the server’s cost accounting 
> becomes more accurate, we segment limits by operation type/keyspace/etc., and 
> the client/driver reacts more intelligently to (especially non-back-pressure) 
> overload, _but something minimally viable could look like this_:
> 1.) Reuse most of the existing logic in Limits, et al. to support a simple 
> per-coordinator limit only on native transport requests per second. Under 
> this limit will be CQL reads and writes, but also auth requests, prepare 
> requests, and batches. This is obviously simplistic, and it does not account 
> for the variation in cost between individual queries, but even a fixed cost 
> model should be useful in aggregate.
>  * If the client specifies THROW_ON_OVERLOAD in its STARTUP message at 
> connection time, a breach of the per-node limit will result in an 
> OverloadedException being propagated to the client, and the server will 
> discard the request.
>  * If THROW_ON_OVERLOAD is not specified, the server will stop consuming 
> messages from the channel/socket, which should back-pressure the client, 
> while the message continues to be processed.
> 2.) This limit is infinite by default (or simply disabled), and can be 
> enabled via the YAML config or JMX at runtime. (It might be cleaner to have a 
> no-op rate limiter that's used when the feature is disabled entirely.)
> 3.) The current value of the limit is available via JMX, and metrics around 
> coordinator operations/second are already available to compare against it.
> 4.) The request-based limit will intersect with the existing byte-based 
> limits. (i.e. A breach of any limit, bytes- or request-based, will actuate 
> back-pressure or OverloadedExceptions.)
> In this first pass, explicitly out of scope would be any work on the 
> client/driver side.
> In terms of validation/testing, our biggest concern with anything that adds 
> overhead on a very hot path is performance. In particular, we want to fully 
> understand how the client and server perform along two axes constituting 4 
> scenarios. Those are a.) whether or not we are breaching the request limit 
> and b.) whether the server is throwing on overload at the behest of the 
> client. Having said that, query execution should dwarf the cost of limit 
> accounting.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
