[ 
https://issues.apache.org/jira/browse/CASSANDRA-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16992941#comment-16992941
 ] 

Jordan West commented on CASSANDRA-15442:
-----------------------------------------

Thanks for the patch [~yifanc]. I've actually wondered about it for a while but 
never thought of it as a bug (assumed it was by design). Glad to see it 
addressed. A few minor nits, all optional:

 
 * I agree that we should only use the repair timeout for the reason you 
mentioned. I find it confusing that the write timeout is involved since the 
client has no idea if read repair will occur when submitting the read. 
 * Since `awaitRepairs` is only called from `awaitRepairsUntil` consider 
merging the two functions. It might make the code more clear and require less 
jumping around.
 * I agree with [~bdeggleston] that setting the write timeout very high in your 
test can show how the change ensures independence from it – regardless of any 
other code changes. It may also give you a more flexible way to write the 
assertion at the end of `readRepairTimeoutTest()`.
 * Additionally, re: the two `Assert.assertTrue`s at the end of 
`readRepairTimeoutTest`: consider adding failure messages to make them more 
descriptive, as well as collapsing it in to one expression (e.g 
`actualTimeTaken < reducedReadTimeout + magicDelayAmount && actualTimeTaken > 
reducedReadTimeout`). 

> Read repair implicitly increases read timeout value
> ---------------------------------------------------
>
>                 Key: CASSANDRA-15442
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15442
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Core
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>
> When read repair occurs during a read, internally, it starts several 
> _blocking_ operations in sequence. See 
> {{org.apache.cassandra.service.StorageProxy#fetchRows}}. 
>  The timeline of the blocking operations
>  # Regular read, wait for full data/digest read response to complete. 
> {{reads[*].awaitResponses();}}
>  # Read repair read, wait for full data read response to complete. 
> {{reads[*].awaitReadRepair();}}
>  # Read repair write, wait for write response to complete. 
> {{concatAndBlockOnRepair(results, repairs);}}
> Step 1 and 2 share the same timeout, and wait for the duration of read 
> timeout, say 5 s.
> Step 3 waits for the duration of write timeout, say 2 s.
> In the worse case, the actual time taken for a read could accumulate to ~7 s, 
> if each individual step does not exceed the timeout value.
> From the client perspective, it may not expect a request taken higher than 
> the database configured timeout value. 
> Such scenario is especially bad for the clients that have set up client-side 
> timeout monitoring close to the configured one. The clients think the 
> operations timed out and abort, but they are in fact still running on server.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to