[ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16574643#comment-16574643
 ] 

Alex Petrov edited comment on CASSANDRA-10726 at 8/10/18 7:53 AM:
------------------------------------------------------------------

It seems that there still might be a problem with {{Accumulator}} during 
{{BlockingReadRepair}}. 

{{DataResolver}} is created in {{BlockingReadRepair#startRepair}}, which 
gets {{allReplicas}} from {{ReadCallback#endpoints}}, which in turn gets them through 
{{AbstractReadExecutor#getReadExecutor}}, where they represent 
{{consistencyLevel.filterForQuery(keyspace, allReplicas)}}.

This means that, when we're sending more data requests, we will call 
{{getLiveSortedEndpoints}} and can end up with more nodes, but since 
{{Accumulator}} was initialised with just "target" nodes, if we keep getting 
responses (e.g. if the node was slow, not dead, which is more often the case), 
{{Accumulator}} will overflow. Unfortunately, testing with RF3 won't reveal 
that. 
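
To illustrate the failure mode described above, here is a minimal sketch. {{FixedAccumulator}} and {{OverflowDemo}} are hypothetical stand-ins written for this comment, not the actual Cassandra classes, and the counts used (3 targets, 5 responses) are arbitrary. It only assumes that the container is sized once, up front, for the expected number of responses.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a fixed-capacity, append-only container.
// This is NOT Cassandra's Accumulator; it only mimics "sized once, up front".
final class FixedAccumulator<E> {
    private final Object[] values;
    private final AtomicInteger next = new AtomicInteger();

    FixedAccumulator(int capacity) {
        values = new Object[capacity];
    }

    // Claims the next slot; fails once more items arrive than it was sized for.
    void add(E item) {
        int idx = next.getAndIncrement();
        if (idx >= values.length) {
            throw new IllegalStateException(
                "sized for " + values.length + " responses, got " + (idx + 1));
        }
        values[idx] = item;
    }
}

final class OverflowDemo {
    // Returns true if `responses` responses overflow a container sized for `targets`.
    static boolean overflows(int targets, int responses) {
        FixedAccumulator<String> acc = new FixedAccumulator<>(targets);
        try {
            for (int i = 0; i < responses; i++) {
                acc.add("response-" + i);
            }
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Sized for the initial targets only, and only those respond.
        System.out.println(overflows(3, 3));
        // Extra data requests went to more nodes, and the slow nodes also answered.
        System.out.println(overflows(3, 5));
    }
}
```

In this sketch the second call overflows because the container was sized before the additional requests were sent, which is the situation the paragraph above is concerned about.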

Some comments on the patch itself:
  * We might want to simplify the code a little by not caching versions 
[here|https://github.com/apache/cassandra/compare/trunk...bdeggleston:10726-v4#diff-b677a5a6a3f1a90a889bcf906c1f8001R211].
  * {{BlockingDigestRepair}} has methods named 
[awaitRepair|https://github.com/apache/cassandra/compare/trunk...bdeggleston:10726-v4#diff-0246c72855070863c2fdbee6d97f494dR123]
 and 
[awaitRepairs|https://github.com/apache/cassandra/compare/trunk...bdeggleston:10726-v4#diff-0246c72855070863c2fdbee6d97f494dR174],
 which might be a bit counter-intuitive.
  * Partition range code path is also affected (since 
{{StorageProxy#fetchRows}} is changed). It'd be great to have dtests for 
partition ranges as well.


was (Author: ifesdjeen):
It seems that there still might be a problem with {{Accumulator}} during 
{{BlockingReadRepair}}. 

{{DataResolver}} is created in {{BlockingReadRepair#startRepair}}, which 
gets {{allReplicas}} from {{ReadCallback#endpoints}}, which in turn gets them through 
{{AbstractReadExecutor#getReadExecutor}}, where they represent 
{{consistencyLevel.filterForQuery(keyspace, allReplicas)}}.

This means that, when we're sending more data requests, we will call 
{{getLiveSortedEndpoints}} and can end up with more nodes, but since 
{{Accumulator}} was initialised with just "target" nodes, if we keep getting 
responses (e.g. if the node was slow, not dead, which is more often the case), 
{{Accumulator}} will overflow. Unfortunately, testing with RF3 won't reveal 
that. 

> Read repair inserts should not be blocking
> ------------------------------------------
>
>                 Key: CASSANDRA-10726
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Coordination
>            Reporter: Richard Low
>            Assignee: Blake Eggleston
>            Priority: Major
>             Fix For: 4.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any replica that's
> // behind on writes in case the out-of-sync row is read multiple times in quick succession
> {code}
> but the bad side effect is that reads time out. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.
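
As a rough illustration of the second option in the description (return success for the read even if the repair write times out), here is a minimal sketch. {{ReadRepairSketch}} and {{readWithBestEffortRepair}} are hypothetical names invented for this example and do not correspond to anything in the patch; the repair acknowledgements are modelled as a plain {{CompletableFuture}}.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class ReadRepairSketch {
    // Hypothetical helper: waits up to timeoutMillis for the repair write acks,
    // but returns the already-resolved read result whether or not they arrive.
    static <T> T readWithBestEffortRepair(T resolved,
                                          CompletableFuture<Void> repairAcks,
                                          long timeoutMillis) {
        try {
            repairAcks.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Repair write was not acknowledged in time; the read still succeeds.
        } catch (ExecutionException e) {
            // Repair write failed; in this sketch that is treated like a timeout.
        } catch (InterruptedException e) {
            // Preserve the interrupt flag for the caller.
            Thread.currentThread().interrupt();
        }
        return resolved;
    }

    public static void main(String[] args) {
        // A replica that never acknowledges the repair write.
        CompletableFuture<Void> neverAcked = new CompletableFuture<>();
        System.out.println(readWithBestEffortRepair("row", neverAcked, 10));
    }
}
```

The sketch still waits for the acknowledgement up to the timeout, so it keeps the benefit the code comment describes when replicas are healthy, and only gives up on the wait when a replica is dropping writes.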


