[ 
https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15388813#comment-15388813
 ] 

Stefania commented on CASSANDRA-9318:
-------------------------------------

bq. any strategy that doesn't take into account per-node state is IMHO not a 
proper back-pressure strategy, and can be implemented in the same way as hints 
overflowing is implemented (that is, outside the back-pressure APIs).

OK, so we are basically saying other coordinator-based approaches won't be 
implemented as back-pressure strategies. Provided [~jbellis] and [~slebresne] 
are fine with this, that's OK with me.

bq. Anyway, I've made the BackPressureState interface more generic so it can 
react on messages when they're about to be sent, which should allow 
implementing different strategies as you mentioned. This should also make the 
current RateBased implementation clearer (see below), which is nice.

I like what you did here, thanks.

bq. Apologies, the SP method went through several rewrites in our back and 
forth and I didn't pay enough attention during the last one, totally stupid 
mistake. Should be fixed now, but I've kept the back-pressure application on 
top, as, in case the strategy implementation wants to terminate the request 
straight away (i.e. by throwing an exception), it doesn't make sense to send 
messages partially (that is, all or nothing is IMHO better).

It looks good now. Agreed on keeping back-pressure on top.

bq. The RateBased implementation doesn't "count" outgoing requests when they're 
sent, which would cause the bug you mentioned, ...

I totally forgot this yesterday when I was reasoning about non-local replicas, 
apologies.

bq. This can only happen when a new replica is injected into the system, or if 
it comes back alive after being dead, otherwise we always send mutations to all 
replicas, or am I missing something? There's no lock-free way to remove such 
replicas, because "0 outgoing" is a very transient state, and that's actually 
the point: the algorithm is designed to work at steady state, and after a 
single "window" (rpc timeout seconds) the replica state will have received 
enough updates.

Yes, you are correct: we always receive replies, even from non-local replicas.

I was simply suggesting to exclude replicas with a positive-infinity rate from 
the sorted states used to calculate the back-pressure, but that's no longer 
relevant.
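For anyone following along, the idea can be sketched roughly like this (a simplified toy, not the actual RateBased implementation in the patch; the class and method names here are hypothetical): each replica's state counts outgoing requests and incoming replies per window, the rate is their ratio, and a replica that has never been contacted reports a positive-infinity rate and therefore never drives the back-pressure calculation.

```java
import java.util.ArrayList;
import java.util.List;

// Per-replica windowed state: the reply rate is incoming / outgoing.
final class ReplicaRateState {
    long outgoing;  // requests sent in the current window
    long incoming;  // replies received in the current window

    void onMessageSent()      { outgoing++; }
    void onResponseReceived() { incoming++; }

    /** Reply rate for the window; +Infinity when nothing was sent yet. */
    double rate() {
        return outgoing == 0 ? Double.POSITIVE_INFINITY
                             : (double) incoming / outgoing;
    }
}

public class RateSketch {
    /**
     * The lowest finite rate across replicas drives the back-pressure;
     * replicas with an infinite rate (no traffic yet) never win the
     * comparison, so they are implicitly excluded.
     */
    static double minFiniteRate(List<ReplicaRateState> states) {
        double min = Double.POSITIVE_INFINITY;
        for (ReplicaRateState s : states)
            if (s.rate() < min) min = s.rate();
        return min;
    }

    public static void main(String[] args) {
        ReplicaRateState fresh = new ReplicaRateState(); // never contacted
        ReplicaRateState slow  = new ReplicaRateState();
        for (int i = 0; i < 10; i++) slow.onMessageSent();
        for (int i = 0; i < 4;  i++) slow.onResponseReceived();

        List<ReplicaRateState> states = new ArrayList<>();
        states.add(fresh);
        states.add(slow);
        System.out.println(minFiniteRate(states)); // 0.4, fresh is ignored
    }
}
```

As the comment above notes, this only matters transiently: after one window (rpc timeout seconds) of steady-state traffic, every live replica will have a finite rate.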

bq. Please have a look at my changes, I will do one more review pass myself as 
I think some aspects of the rate-based algorithm can be improved, then if we're 
all happy I will go ahead with rebasing to trunk and then moving into testing.

+1 on starting the tests; just two more easy things:

* Something went wrong with the IDE imports at this line 
[here|https://github.com/apache/cassandra/commit/3041d58184a4ee63a04204d3c6228cbaa1c4fd05#diff-0cbd76290040064cf257d855fb5dd779R42]:
 there's a bunch of repeated, unused imports that need to be removed.

* There's another overload of {{sendRR}} [just 
above|https://github.com/apache/cassandra/commit/3041d58184a4ee63a04204d3c6228cbaa1c4fd05#diff-af09288f448c37a525e831ee90ea49f9R766];
 I think we should technically cover this one too, even though mutations go 
through the other overload.

One final thought: we now have N rate limiters, and we call acquire(1) on only 
one of them. For the next mutation we may pick another one, and so forth. So 
the rate limiters of the individual replicas have not 'registered' the actual 
permits (one per message) and may not throttle as effectively. Have you 
considered this at all?
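To make the concern concrete, here is a toy counting sketch (hypothetical names, not the patch's code, and deliberately ignoring timing): with N per-replica limiters, one message sent to each replica per mutation, and acquire(1) issued on only one limiter per mutation, each limiter registers only about 1/N of the messages it is nominally guarding.

```java
public class LimiterSketch {
    /** acquire(1) on a single limiter per mutation, rotating among them. */
    static long[] roundRobinAcquire(int limiters, int mutations) {
        long[] permits = new long[limiters];
        for (int m = 0; m < mutations; m++)
            permits[m % limiters]++;          // one limiter sees this mutation
        return permits;
    }

    /** acquire(1) on every replica's limiter, one permit per message sent. */
    static long[] acquireOnAll(int limiters, int mutations) {
        long[] permits = new long[limiters];
        for (int m = 0; m < mutations; m++)
            for (int r = 0; r < limiters; r++)
                permits[r]++;                 // every limiter sees its message
        return permits;
    }

    public static void main(String[] args) {
        // 3 replicas, 300 mutations, one message per replica per mutation:
        // round-robin registers 100 permits per limiter instead of 300.
        long[] one = roundRobinAcquire(3, 300);
        long[] all = acquireOnAll(3, 300);
        System.out.println(one[0] + " vs " + all[0]); // 100 vs 300
    }
}
```

In other words, each limiter believes traffic is N times lighter than it really is, so its configured rate effectively gets multiplied by N.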

> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths, Streaming and Messaging
>            Reporter: Ariel Weisberg
>            Assignee: Sergio Bossa
>         Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, 
> limit.btm, no_backpressure.png
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster 
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding 
> bytes and requests and if it reaches a high watermark disable read on client 
> connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't 
> introduce other issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
