[ https://issues.apache.org/jira/browse/CASSANDRA-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739184#comment-14739184 ]
Ariel Weisberg edited comment on CASSANDRA-7392 at 9/10/15 5:37 PM:
--------------------------------------------------------------------

bq. Ok, but does a lazySet() really buy a lot compared to a CAS? I mean we are introducing a race, albeit with no consequences at the moment.

Truthfully, not much. This is a CAS once per operation on a non-contended memory location that is hopefully in a nearby cache. Things that occur only once per operation aren't going to be super high yield; the biggest benefits come from tight loops where the thing you made faster or eliminated occurs N, lg N, or N lg N times per operation. You do get to eliminate a CAS loop, though, so it's not just a performance benefit. I don't feel strongly about it either way since this is a once-per-operation optimization.

The CAS is going to stall execution while the CPU waits for the store buffers to flush to L1. lazySet() on x86 does not cause a stall; it just prevents instruction reordering by the compiler. x86 is a TSO architecture, so stores to shared memory are never reordered from program order, and the CPU does no additional work compared to what it always does. Waiting for the store buffers to drain is maybe a minimum of 10-15 nanoseconds; I have never seen measurements of store buffer flushing time, so I can't give an exact answer. A fast operation is going to be tens of microseconds; in C* I don't know how long operations take, could be hundreds, it really depends on the operation. We are talking about a fraction of a percent of total operation time, and more importantly there is no impact on scalability. -I have some knowledge gaps as to what the store buffer can/can't do in practice so I can't be very accurate-

Another thing I didn't mention until I got confirmation from Benedict: a word-aligned blind store (a write to a field without reading the value) can be performed asynchronously by the store buffer even if the cache line is not resident. So in the best case you eliminate a stall to L2/L3/main memory.

bq. Properties created.
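As an aside, the lazySet()-versus-CAS tradeoff above can be sketched in a few lines. This is purely illustrative; the class and field names are hypothetical, not from the patch:

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: publishing a per-operation terminal state. The CAS
// version pays for a LOCK'd instruction (a store-buffer drain) once per
// operation; the lazySet version is a blind ordered store with a benign race.
class OperationState {
    enum State { RUNNING, TIMED_OUT, COMPLETED }

    private final AtomicReference<State> state = new AtomicReference<>(State.RUNNING);

    // CAS loop: retries until it wins or observes a terminal state.
    void completeWithCas() {
        State s;
        do {
            s = state.get();
            if (s != State.RUNNING)
                return; // already timed out; nothing to do
        } while (!state.compareAndSet(s, State.COMPLETED));
    }

    // lazySet: no stall waiting for the store buffers to flush to L1, but a
    // concurrent transition to TIMED_OUT could be overwritten (harmless here).
    void completeWithLazySet() {
        state.lazySet(State.COMPLETED);
    }

    State current() { return state.get(); }
}
```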
bq. Should they be user visible in the yaml too? What about MessagingService.LOG_DROPPED_INTERVAL_IN_MS, should this be replaced with the property we just created or shall we leave it alone? Basically these are messages that weren't even processed, as opposed to started and then timed out. As for a default value for checking, the default read timeout is 5 seconds, so let's use 1% of this, 50 milliseconds?

I see options in the YAML as a bad thing, and I also don't like undocumented options. I also fear shipping with hard-coded constants, because that leaves no option other than recompiling in the field. I see properties as bridging the gap to the next release, where we either fix the implementation so that tuning the defaults isn't necessary, or make it an option in the YAML if we can't make it just work without operator intervention.

WRT the default: I would say 1% of the timeout is pretty high precision. You do have some insight there in thinking about the frequency as a % of the timeout. I would go with that and set the check frequency as a % of the timeout unless overridden by a property, but also set a minimum value. I think 50 milliseconds is good as a minimum. I would say 10% is good enough, so we would be off by around 1/10th of the timeout, but I don't feel strongly.

bq. What about MessagingService.LOG_DROPPED_INTERVAL_IN_MS, should this be replaced with the property we just created or shall we leave it alone?

That log message is fixed size (% of verbs) and covers all dropped messages, not just the subset of timeouts you are working on right now. I would say leave it alone just by virtue of being out of scope. It might benefit from backoff.

bq. Something like NoSpamLogger except it needs to cache similar log messages rather than identical messages and it should be configurable?

I have run out of time today, so I haven't thought about this much so far. I think that's it.
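To make the default concrete: the scheme suggested above (check frequency as a fraction of the timeout, floored at a minimum, overridable by a property) could look roughly like this. The property name and constants are illustrative, not the ones in the patch:

```java
// Illustrative only: derive the abort-check interval from the operation
// timeout as 10% of the timeout, floored at 50 ms, unless a system property
// overrides it. The property name here is hypothetical.
class CheckInterval {
    static final long MIN_CHECK_INTERVAL_MS = 50;
    static final double TIMEOUT_FRACTION = 0.10;

    static long checkIntervalMs(long timeoutMs) {
        String override = System.getProperty("cassandra.test.check_interval_ms");
        if (override != null)
            return Long.parseLong(override);
        return Math.max(MIN_CHECK_INTERVAL_MS, (long) (timeoutMs * TIMEOUT_FRACTION));
    }
}
```

With the default 5-second read timeout this yields a 500 ms check interval; only timeouts below 500 ms would hit the 50 ms floor.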
NoSpamLogger doesn't allow you to specify a policy for backoff, as opposed to fixed intervals; I think that is a missing capability. Have it support a policy, and then have it tell you whether it is time to log so you can decide whether to clear the stats. One caveat that occurs to me about logging this kind of thing at a variable rate is that absolute counts of events are no longer informative. You need to log a rate, so readers can compare without having to do their own math. Even then there is some harm, because you could grow the reporting interval to include time where nothing is going wrong, distorting the reported rate. I think there is some tension between my pony and providing precise data. What do you think?
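A minimal sketch of the policy idea, assuming exponential backoff and rate-based reporting; none of these names are existing Cassandra APIs:

```java
// Hypothetical sketch of a pluggable backoff policy for a NoSpamLogger-like
// facility: the gap between log statements doubles up to a cap, and stats are
// reported as a rate so intervals of different lengths stay comparable.
class BackoffLogPolicy {
    private final long maxIntervalNanos;
    private long currentIntervalNanos;
    private long lastLogNanos = Long.MIN_VALUE;
    private long eventsSinceLastReport = 0;

    BackoffLogPolicy(long initialIntervalMs, long maxIntervalMs) {
        currentIntervalNanos = initialIntervalMs * 1_000_000L;
        maxIntervalNanos = maxIntervalMs * 1_000_000L;
    }

    // Record an event and ask whether it is time to log; the caller decides
    // whether to clear its stats when this returns true.
    boolean shouldLog(long nowNanos) {
        eventsSinceLastReport++;
        if (lastLogNanos != Long.MIN_VALUE && nowNanos - lastLogNanos < currentIntervalNanos)
            return false;
        lastLogNanos = nowNanos;
        currentIntervalNanos = Math.min(currentIntervalNanos * 2, maxIntervalNanos);
        return true;
    }

    // Report events per second over the interval, then clear the counter, so
    // readers compare rates instead of raw counts over growing intervals.
    double reportRate(long nowNanos, long intervalStartNanos) {
        double seconds = (nowNanos - intervalStartNanos) / 1e9;
        double rate = seconds > 0 ? eventsSinceLastReport / seconds : 0.0;
        eventsSinceLastReport = 0;
        return rate;
    }
}
```

The caveat above still applies: if nothing goes wrong for a while, the interval passed to reportRate includes quiet time and dilutes the rate, which is the tension between backoff and precise data.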
> Abort in-progress queries that time out
> ---------------------------------------
>
>                 Key: CASSANDRA-7392
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7392
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Stefania
>            Priority: Critical
>             Fix For: 3.x
>
> Currently we drop queries that time out before we get to them (because node is overloaded) but not queries that time out while being processed. (Particularly common for index queries on data that shouldn't be indexed.) Adding the latter and logging when we have to interrupt one gets us a poor man's "slow query log" for free.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)