[ 
https://issues.apache.org/jira/browse/CASSANDRA-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815405#comment-16815405
 ] 

Marcus Olsson commented on CASSANDRA-14983:
-------------------------------------------

No worries, thanks for the update and suggestions for the testing.

I tried out 4 but testing the complete flow of this behavior still seems a bit 
problematic. I think this comes from a few different factors. One is that 
background requests are required to trigger this unless CASSANDRA-15022 gets 
in. Another problem is that when the fast path is used, neither speculative 
retries nor timeouts seem to be properly respected, as they are executed 
sequentially after the local read. If the timeout is defined as 100 ms and the 
request takes 150 ms locally, then no exception is thrown at either 100 ms or 
150 ms (unless the remote request was even slower). As we already have the 
data after 150 ms, I do think it makes sense to return it (rather than throw an 
exception), but IMO it should have thrown an exception after ~100 ms. A similar 
scenario occurs for speculative retries as well: they won't be triggered 
until after the local request has finished.
I think this would leave request latency as the only basis for verifying the 
behavior, but that could make the test case quite unstable and sensitive to the 
testing environment.
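
To make the timing scenario above concrete, here is a minimal sketch of it as a 
pure simulation. It is not Cassandra code: the class, the method names and the 
"DATA@"/"TIMEOUT@" strings are hypothetical, and it assumes a read that needs 
the local response plus one remote response, with the timeout and the 
speculative retry only being evaluated once the coordinator thread starts 
waiting for responses.

```java
public final class FastPathTimelineSketch {
    // Hypothetical model, not Cassandra code. All times are ms since the query started.

    /** Time at which the coordinator thread starts waiting for responses. */
    static long awaitStartMs(long localReadMs, boolean fastPath) {
        // Fast path: the local read runs inline, so waiting cannot start before it finishes.
        // Otherwise (assumption): the local read runs on another thread and waiting starts at once.
        return fastPath ? localReadMs : 0L;
    }

    /** Outcome of a read that needs both the local response and one remote response. */
    static String outcome(long timeoutMs, long localReadMs, long remoteReadMs, boolean fastPath) {
        long awaitStart = awaitStartMs(localReadMs, fastPath);
        long allResponsesAt = Math.max(localReadMs, remoteReadMs);
        // The timeout can only be noticed once the thread is actually waiting.
        long timeoutNoticedAt = Math.max(timeoutMs, awaitStart);
        if (allResponsesAt <= timeoutNoticedAt) {
            return "DATA@" + allResponsesAt;
        }
        return "TIMEOUT@" + timeoutNoticedAt;
    }

    /** Earliest time at which a speculative retry could be sent. */
    static long speculativeRetryAtMs(long thresholdMs, long localReadMs, boolean fastPath) {
        return Math.max(thresholdMs, awaitStartMs(localReadMs, fastPath));
    }

    public static void main(String[] args) {
        // Timeout 100 ms, local read 150 ms, remote read 20 ms.
        System.out.println(outcome(100, 150, 20, true));
        System.out.println(outcome(100, 150, 20, false));
        // Same local read, but the remote read is even slower (300 ms).
        System.out.println(outcome(100, 150, 300, true));
        // Speculative retry threshold of 50 ms.
        System.out.println(speculativeRetryAtMs(50, 150, true));
    }
}
```

Under these assumptions the fast-path case returns data at 150 ms instead of 
timing out at 100 ms, and the speculative retry moves from 50 ms to 150 ms. 
This is also why a test would have to rely on measured latency.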

From a correctness perspective this feature seems to have some edge cases with 
request timeouts and speculative retries. In CASSANDRA-6995 it was suggested 
to add this feature specifically for LOCAL_ONE/ONE requests which I think 
would narrow some of the edge cases. And as this is probably a quite narrow 
failure case as it is, I'm not sure if we should aim to solve it for all cases 
as that might limit the use case significantly. I.e. I don't think we can 
solve the timeout issue without a significant rework of the request path.

However, I do think we should consider the effects this has on speculative 
retries and whether this feature should be enabled in combination with them.

> Local reads potentially blocking remote reads
> ---------------------------------------------
>
>                 Key: CASSANDRA-14983
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14983
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination
>            Reporter: Marcus Olsson
>            Assignee: Marcus Olsson
>            Priority: Low
>         Attachments: graph_local_read.html, graph_local_read_trunk.html, 
> local_read_trace.log
>
>
> Since CASSANDRA-4718 there is a fast path allowing local requests to continue 
> to [work in the same 
> thread|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/reads/AbstractReadExecutor.java#L157]
>  rather than being sent over to the read stage.
> Based on the comment
> {code:java}
> // We delay the local (potentially blocking) read till the end to avoid stalling remote requests.
> {code}
> it seems like this should be performed last in the chain to avoid blocking 
> remote requests but that does not seem to be the case when the local request 
> is a data request. The digest request(s) are sent after the data requests are 
> sent (and now the transient replica requests as well). When the fast path is 
> used for local data/transient data requests this will block the next type of 
> request from being sent away until the local read is finished and add 
> additional latency to the request.
> In addition to this it seems like local requests are *always* data requests 
> (might not be a problem), but the log message can say either ["digest" or 
> "data"|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/reads/AbstractReadExecutor.java#L156]
>  as the type of request.
> I have tried to run performance measurements to see the impact of this in 3.0 
> (by moving local requests to the end of ARE#executeAsync()) but I haven't 
> seen any big difference yet. I'll continue to run some more tests to see if I 
> can find a use case affected by this.
> Attaching a trace (3.0) where this happens. Reproduction:
>  # Create a three node CCM cluster
>  # Provision data with stress (rf=3)
>  # In parallel:
>  ## Start stress read run
>  ## Run multiple manual read queries in cqlsh with tracing on and 
> local_quorum (as this does not always happen)
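
The ordering problem in the issue description can be pictured with a small 
simulation. This is only a sketch with hypothetical names (it does not mirror 
the real AbstractReadExecutor code): it assumes that sending a remote request 
costs no time on the coordinator thread, and that only the inline local read 
blocks it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public final class DispatchOrderSketch {
    // Hypothetical step names, used only for this illustration.
    enum Step { REMOTE_DATA, LOCAL_DATA_INLINE, REMOTE_DIGEST }

    /** Order described in the issue: the inline local data read runs before digests are sent. */
    static List<Step> currentOrder() {
        return Arrays.asList(Step.REMOTE_DATA, Step.LOCAL_DATA_INLINE, Step.REMOTE_DIGEST);
    }

    /** Order tried in the issue: the local read is moved to the very end. */
    static List<Step> localLastOrder() {
        return Arrays.asList(Step.REMOTE_DATA, Step.REMOTE_DIGEST, Step.LOCAL_DATA_INLINE);
    }

    /** Returns "STEP@startMs" for each step, in execution order. */
    static List<String> timeline(List<Step> order, long localReadMs) {
        List<String> out = new ArrayList<>();
        long now = 0;
        for (Step step : order) {
            out.add(step + "@" + now);
            if (step == Step.LOCAL_DATA_INLINE) {
                now += localReadMs; // only the inline local read blocks the thread
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(timeline(currentOrder(), 150));
        System.out.println(timeline(localLastOrder(), 150));
    }
}
```

With a 150 ms local read, the digest requests in this model leave at 150 ms in 
the first ordering and at 0 ms in the second, which is the extra latency the 
description refers to.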


