[ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068248#comment-15068248
 ] 

Jonathan Ellis commented on CASSANDRA-10726:
--------------------------------------------

Seeing reads "go backwards in time" is one of the most confusing aspects of 
eventual consistency for people, so I do think it's important that quorum reads 
avoid that, even more so because users tend to oversimplify quorum reads as 
"strong consistency that means I don't have to think about EC."  So to the 
degree we can make that assumption true, we should, especially if that's been 
our behavior already for 4+ years.
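
To make the "go backwards in time" case concrete, here is a minimal sketch of 
two successive quorum reads against three replicas. It is a toy model, not 
Cassandra code: the class name, the integer "versions", and the starting state 
(a newer write that reached only one replica) are all assumptions made for 
illustration.

```java
import java.util.Arrays;

public class QuorumReadSketch {
    // Each replica holds one version number for a single row; higher is newer.
    // A quorum read contacts two of the three replicas and returns the newest
    // version it sees.
    static int quorumRead(int[] replicas, int first, int second, boolean blockOnRepair) {
        int newest = Math.max(replicas[first], replicas[second]);
        if (blockOnRepair) {
            // Blocking read repair: both contacted replicas hold the newest
            // version before the read returns to the client.
            replicas[first] = newest;
            replicas[second] = newest;
        }
        // Without blocking, the repair may still be in flight (or dropped)
        // when the read returns; the stale replica is modelled as unchanged.
        return newest;
    }

    // Returns {firstRead, secondRead} for two quorum reads in a row.
    static int[] twoReads(boolean blockOnRepair) {
        // Hypothetical starting state: version 2 reached only replica 0.
        int[] replicas = {2, 1, 1};
        int firstRead = quorumRead(replicas, 0, 1, blockOnRepair);
        int secondRead = quorumRead(replicas, 1, 2, blockOnRepair);
        return new int[] {firstRead, secondRead};
    }

    public static void main(String[] args) {
        System.out.println("blocking:     " + Arrays.toString(twoReads(true)));
        System.out.println("non-blocking: " + Arrays.toString(twoReads(false)));
    }
}
```

In this model the blocking variant returns version 2 on both reads, while the 
non-blocking variant returns 2 and then 1, because the second quorum happens 
to miss the only replica that has the newer write.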

It seems like there are two primary problem scenarios:

* When a node is overloaded for writes, this stops reads as well.  First, 
delaying reads when we're behind on writes is arguably a good thing that will 
help you recover faster.  Second, the right way to tackle this is with better 
handling of the write overload as in CASSANDRA-9318.
* When data is read-only because disks are failing.  I agree with Sylvain that 
half-broken is often worse than completely broken, and in this specific case if 
a disk puts itself in read-only mode then it won't be long until it isn't 
readable either.  This is another case where "mark a disk bad and broadcast to 
other nodes not to send me requests for tokens pinned to it" as envisioned in 
CASSANDRA-6696 would be useful, along with an option for "promote write errors 
to blacklist on reads as well."

> Read repair inserts should not be blocking
> ------------------------------------------
>
>                 Key: CASSANDRA-10726
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Coordination
>            Reporter: Richard Low
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any replica that's
> // behind on writes in case the out-of-sync row is read multiple times in quick succession
> {code}
> but the bad side effect is that reads time out. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.
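
The difference between the behavior described above and the second proposal 
(return success for the read even if the repair write times out) can be 
sketched as follows. This is a hedged illustration only: the class, the 
Outcome enum, and the method names are hypothetical and do not correspond to 
Cassandra's actual read path. The repair acknowledgement is modelled as a 
CompletableFuture that a struggling replica never completes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ReadRepairSketch {
    enum Outcome { SUCCESS, READ_TIMEOUT }

    // Waits up to timeoutMillis for the repair write to be acknowledged.
    // Returns true if the ack arrived in time, false on timeout.
    static boolean awaitAck(CompletableFuture<Void> repairAck, long timeoutMillis) {
        try {
            repairAck.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Behavior described in the issue: an unacknowledged repair write
    // turns into a failed (timed-out) read.
    static Outcome blockingRead(CompletableFuture<Void> repairAck, long timeoutMillis) {
        return awaitAck(repairAck, timeoutMillis) ? Outcome.SUCCESS : Outcome.READ_TIMEOUT;
    }

    // Second proposal: still wait for the ack, but report the read as
    // successful even if the repair write times out.
    static Outcome bestEffortRead(CompletableFuture<Void> repairAck, long timeoutMillis) {
        awaitAck(repairAck, timeoutMillis);
        return Outcome.SUCCESS;
    }

    public static void main(String[] args) {
        // A replica that is dropping writes never acknowledges the repair.
        CompletableFuture<Void> neverAcked = new CompletableFuture<>();
        System.out.println("blocking:    " + blockingRead(neverAcked, 50));
        System.out.println("best effort: " + bestEffortRead(neverAcked, 50));
    }
}
```

Note that the best-effort variant still pays the full timeout before 
returning; it changes the result of the read, not its latency.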



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
