[ 
https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365817#comment-15365817
 ] 

Sylvain Lebresne commented on CASSANDRA-9318:
---------------------------------------------

A first concern I have, that has been brought up earlier in the conversation 
already, is that this will create a lot more OverloadedExceptions than we're 
currently used to, with 2 concrete issues imo:
# the new OverloadedException has a different meaning than the old one. The old 
one meant the coordinator was overloaded (by too much hint writing), so client 
drivers will tend to try another coordinator when that happens. That's not a 
good idea here.
# I think in many cases that's not what the user wants. If you write at CL.ONE, 
and 1 of your 3 replicas is struggling (but the other nodes are perfectly fine), 
having all your queries rejected (concretely breaking availability) until that 
one replica catches up (or dies) feels quite wrong. What I think you'd want is 
to consider the node dead for that query and hint that slow node (and if a 
coordinator gets overwhelmed by those hints, we already throw an OE), but 
proceed with the write; see the sketch below. I'll note in particular that when 
a node dies, it's not detected as such right away, and detection can actually 
take a few seconds, which might well mean a node that just died will be 
considered overloaded temporarily by some coordinators. So treating overloaded 
as roughly the same as dead makes some sense (in particular, you wouldn't want 
writes to start failing because of this mechanism when a node dies, if you have 
enough nodes for your CL).
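
To make that concrete, here is a rough, purely hypothetical sketch of the 
write-path behaviour I have in mind (the helper names like {{isOverloaded()}}, 
{{submitHint()}} and {{sendToHealthyReplicas()}} are illustrative, not the 
actual StorageProxy code):
{noformat}
// Hypothetical sketch: treat an overloaded replica like a dead one for this
// write, hint it, and only fail if the remaining replicas can't satisfy the CL.
private static void mutateWithBackPressure(Mutation mutation,
                                           Collection<InetAddress> targets,
                                           ConsistencyLevel cl,
                                           Keyspace keyspace)
{
    List<InetAddress> healthyTargets = new ArrayList<>();
    for (InetAddress replica : targets)
    {
        if (isOverloaded(replica))
            submitHint(mutation, replica);   // hint the slow replica instead of rejecting the write
        else
            healthyTargets.add(replica);
    }

    if (healthyTargets.size() < cl.blockFor(keyspace))
        throw new UnavailableException(cl, cl.blockFor(keyspace), healthyTargets.size());

    sendToHealthyReplicas(mutation, healthyTargets); // otherwise, proceed with the write as usual
}
{noformat}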

bq. For trunk, we can enable it by default provided we can test it on a medium 
size cluster before committing

We *should* test it on a medium-sized cluster before committing, but even then 
I'm really reluctant to make it the default on commit. I don't think making a 
new feature that is untested in production, and that can have a huge impact, 
the default right away is smart, even though we have done it numerous times 
(unsuccessfully most of the time, I'd say). I'd much rather leave it opt-in for 
a few releases so users can test it, and wait for more feedback before we make 
it the default (I know the counter argument, that no-one will test it unless 
it's the default, but I doubt that's true if people are genuinely put off by 
the existing behavior).

In general, as was already brought up earlier in the discussion, I suspect 
fine-tuning the parameters won't be trivial, and I think we'll need a fair 
amount of testing under different conditions to make sure the defaults for 
those parameters are sane.

bq. back_pressure_timeout_override: should we call it for what it is, the 
back_pressure_window_size and recommend that they increase the write timeout in 
the comments

I concur that I don't love the name either, nor the concept of having an 
override for another setting. I'd also prefer splitting the double meaning: 
leave {{write_request_timeout_in_ms}} to always be the timeout, and add a 
{{window_size}} in the strategy parameters. We can additionally do 2 things to 
get roughly the same defaults as in the current patch:
# make {{window_size}} be equal to the write timeout by default (see the sketch 
below).
# make the {{*_request_timeout_in_ms}} options "silent" defaults (i.e. commented 
out by default), and make the default for the write one depend on whether 
back-pressure is on or not.
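
As a purely illustrative sketch of point 1, the strategy could default its 
window size to the write timeout when the parameter isn't provided (the class 
name below is a placeholder, and I'm assuming 
{{DatabaseDescriptor.getWriteRpcTimeout()}} as the accessor):
{noformat}
// Hypothetical strategy constructor: if window_size isn't set in the strategy
// parameters, fall back to the configured write timeout.
public SomeBackPressureStrategy(Map<String, Object> parameters)
{
    Object window = parameters.get("window_size");
    this.windowSizeInMs = window != null
                        ? Long.parseLong(window.toString())
                        : DatabaseDescriptor.getWriteRpcTimeout();
}
{noformat}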

To finish on the yaml options: if we move {{window_size}} to the strategy 
parameters, we could also get rid of {{back_pressure_enabled}} and instead have 
a {{NoBackPressure}} strategy (which we can totally special-case internally). I 
don't care terribly about it, but it's imo slightly more in line with other 
strategies.

I'd also suggest abstracting the {{BackPressureState}} as an interface and 
making the strategy responsible for creating new instances, since most of the 
details of the state are likely somewhat strategy-specific. As far as I can 
tell, all we'd need as an interface is:
{noformat}
interface BackPressureState
{
    void onMessageSent();
    void onResponseReceived();
    double currentBackPressureRate();
}
{noformat}
This won't provide more info for custom implementations just yet, but at least 
it'll give them more flexibility, and it's imo a bit clearer.
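
To illustrate both that factory idea and the {{NoBackPressure}} strategy 
mentioned above, a hypothetical sketch (the interface and class names are 
placeholders, and the "never throttle" rate is placeholder semantics):
{noformat}
// Hypothetical sketch: the strategy creates its own state objects, so each
// strategy can keep whatever per-host bookkeeping it needs.
interface BackPressureStrategy
{
    BackPressureState newState(InetAddress host);
}

// A "disabled" strategy whose state does nothing; this could replace the
// back_pressure_enabled yaml flag.
class NoBackPressure implements BackPressureStrategy
{
    public BackPressureState newState(InetAddress host)
    {
        return new BackPressureState()
        {
            public void onMessageSent() {}
            public void onResponseReceived() {}
            public double currentBackPressureRate() { return Double.POSITIVE_INFINITY; } // i.e. never throttle
        };
    }
}
{noformat}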

> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths, Streaming and Messaging
>            Reporter: Ariel Weisberg
>            Assignee: Sergio Bossa
>         Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, 
> limit.btm, no_backpressure.png
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster 
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding 
> bytes and requests and if it reaches a high watermark disable read on client 
> connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't 
> introduce other issues.


