[ https://issues.apache.org/jira/browse/CASSANDRA-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363913#comment-17363913 ]

Caleb Rackliffe commented on CASSANDRA-16663:
---------------------------------------------

This is a coordinator rate limit, so I kicked off a few single-node tests w/ 
tlp-stress. The goals here are a.) to determine how responsive back-pressure is 
around the configured limit, b.) to make sure there isn't a visible regression 
in performance while under the limit, and c.) to see what happens with a 
pathologically overloaded node.

ex.

{noformat}
bin/tlp-stress run --compaction "{'class': 
'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 
'tombstone_compaction_interval': '864000', 'tombstone_threshold': '0.2'}" 
--compression "{'sstable_compression': 'LZ4Compressor'}" --readrate 1.0  
--partitions 10000 --populate 10000 --rate 1000 --duration 5m KeyValue
{noformat}

*Native Protocol V4*

|Backpressure|Limit (requests/second)|Client-Requested Rate|Client p99 (millis)|Client-Observed Rate|
|disabled|n/a|1000|0.45|993.01|
|enabled|1000|999|0.40|992.02|
|enabled|1000|1001|171.93|992.23|
|disabled|n/a|2000|0.63|1986.48|
|enabled|1000|2000|220.05|993.75|

*Native Protocol V5*

|Backpressure|Limit (requests/second)|Client-Requested Rate|Client p99 (millis)|Client-Observed Rate|
|disabled|n/a|1000|0.42|993.01|
|enabled|1000|999|0.42|992.03|
|enabled|1000|1001|161.82|993.17|
|disabled|n/a|2000|0.36|1986.47|
|enabled|1000|2000|162.61|993.26|

In short, the behavior around the back-pressure thresholds here is a.) more or 
less the same between protocol versions and b.) exactly what we would expect. 
When the rate of client requests is below the limit, there is no visible 
degradation in p99 latency, and when the rate of client requests is above the 
limit, latencies increase over the course of the test (not visible in the 
tables above, as those are p99s from the final 1-minute window) with observed 
throughput remaining constant.

Next, I'll look at how things behave when we're throwing on overload...
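As an aside, the dispatch decision being exercised in these tests (process the request, throw OverloadedException, or pause channel reads) can be sketched roughly like this. This is a hand-rolled illustration, not the actual patch: the class name, method names, and token-bucket details are all assumptions.

```java
/**
 * Hypothetical sketch of a per-coordinator native-transport request rate
 * limit. Not the real Cassandra implementation; names and the token-bucket
 * refill logic are illustrative assumptions only.
 */
public class NativeRateLimiter
{
    private final double permitsPerSecond;
    private double available;
    private long lastRefillNanos;

    public NativeRateLimiter(double permitsPerSecond)
    {
        this.permitsPerSecond = permitsPerSecond;
        this.available = permitsPerSecond; // allow an initial one-second burst
        this.lastRefillNanos = System.nanoTime();
    }

    /** Refill permits based on elapsed time, then try to take one. */
    public synchronized boolean tryAcquire()
    {
        long now = System.nanoTime();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        available = Math.min(permitsPerSecond, available + elapsedSeconds * permitsPerSecond);
        lastRefillNanos = now;

        if (available >= 1.0)
        {
            available -= 1.0;
            return true;
        }
        return false;
    }

    /**
     * The decision described in the ticket: on a breach, either propagate
     * OverloadedException (when the client sent THROW_ON_OVERLOAD at STARTUP)
     * or stop reading from the channel to back-pressure the client.
     */
    public static String onRequest(NativeRateLimiter limiter, boolean throwOnOverload)
    {
        if (limiter.tryAcquire())
            return "PROCESS";
        return throwOnOverload ? "THROW_OVERLOADED" : "PAUSE_CHANNEL_READS";
    }

    public static void main(String[] args)
    {
        NativeRateLimiter limiter = new NativeRateLimiter(2.0); // 2 requests/second
        System.out.println(onRequest(limiter, true));  // within limit
        System.out.println(onRequest(limiter, true));  // within limit
        System.out.println(onRequest(limiter, true));  // breach -> throw
        System.out.println(onRequest(limiter, false)); // breach -> back-pressure
    }
}
```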

> Request-Based Native Transport Rate-Limiting
> --------------------------------------------
>
>                 Key: CASSANDRA-16663
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16663
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Messaging/Client
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 4.x
>
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> Together, CASSANDRA-14855, CASSANDRA-15013, and CASSANDRA-15519 added support 
> for a runtime-configurable, per-coordinator limit on the number of bytes 
> allocated for concurrent requests over the native protocol. It supports 
> channel back-pressure by default, and optionally supports throwing 
> OverloadedException if that is requested in the relevant connection’s STARTUP 
> message.
> This can be an effective tool to prevent the coordinator from running out of 
> memory, but it may not correspond to how expensive queries are or provide a 
> direct conceptual mapping to how users think about request capacity. I 
> propose adding the option of request-based (or perhaps more correctly 
> message-based) back-pressure, coexisting with (and reusing the logic that 
> supports) the current bytes-based back-pressure.
> _We can roll this forward in phases_, where the server’s cost accounting 
> becomes more accurate, we segment limits by operation type/keyspace/etc., and 
> the client/driver reacts more intelligently to (especially non-back-pressure) 
> overload, _but something minimally viable could look like this_:
> 1.) Reuse most of the existing logic in Limits, et al. to support a simple 
> per-coordinator limit only on native transport requests per second. Under 
> this limit will be CQL reads and writes, but also auth requests, prepare 
> requests, and batches. This is obviously simplistic, and it does not account 
> for the variation in cost between individual queries, but even a fixed cost 
> model should be useful in aggregate.
>  * If the client specifies THROW_ON_OVERLOAD in its STARTUP message at 
> connection time, a breach of the per-node limit will result in an 
> OverloadedException being propagated to the client, and the server will 
> discard the request.
>  * If THROW_ON_OVERLOAD is not specified, the server will stop consuming 
> messages from the channel/socket, which should back-pressure the client, 
> while messages already in flight continue to be processed.
> 2.) This limit is infinite by default (or simply disabled), and can be 
> enabled via the YAML config or JMX at runtime. (It might be cleaner to have a 
> no-op rate limiter that's used when the feature is disabled entirely.)
> 3.) The current value of the limit is available via JMX, and metrics around 
> coordinator operations/second are already available to compare against it.
> 4.) Any interaction with existing byte-based limits will intersect. (i.e. A 
> breach of any limit, bytes or request-based, will actuate back-pressure or 
> OverloadedExceptions.)
> In this first pass, explicitly out of scope would be any work on the 
> client/driver side.
> In terms of validation/testing, our biggest concern with anything that adds 
> overhead on a very hot path is performance. In particular, we want to fully 
> understand how the client and server perform along two axes constituting 4 
> scenarios. Those are a.) whether or not we are breaching the request limit 
> and b.) whether the server is throwing on overload at the behest of the 
> client. Having said that, query execution should dwarf the cost of limit 
> accounting.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
