[ 
https://issues.apache.org/jira/browse/CASSANDRA-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023151#comment-13023151
 ] 

Peter Schuller commented on CASSANDRA-2494:
-------------------------------------------

I don't think anyone is claiming otherwise, unless I'm misunderstanding. The 
problem is that while the "if successfully written to quorum, subsequent quorum 
reads will see it" guarantee is indeed maintained, it is possible for quorum 
reads to see data go backwards (on a timeline) in the event of a *failed* 
attempted quorum write. This includes the possibility of reads seeing data that 
then permanently vanishes, even though you only lost, say, 1 node, a loss that 
your cluster was designed to survive (RF >= 3, QUORUM). ("lost 1 node" can be 
substituted with "killed 1 node in periodic commit mode".)
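
A toy illustration of that scenario, as a minimal sketch rather than Cassandra 
code. The replica names follow the issue description (X, Y, Z); the 
`quorum_read` helper, the (timestamp, value) tuples, and the assumption that 
read repair has not yet reached Y when X is lost are all hypothetical.

```python
# Toy model: each replica holds one (timestamp, value) pair for a single column.
# This is an illustration only and does not use any Cassandra API.
replicas = {"X": (1, "old"), "Y": (1, "old"), "Z": (1, "old")}


def quorum_read(replicas, contacted):
    """Return the newest (timestamp, value) among the contacted replicas."""
    assert len(contacted) >= 2  # quorum for RF = 3
    return max(replicas[name] for name in contacted)


# A QUORUM write of "new" at timestamp 2 reaches only X, so the write fails.
replicas["X"] = (2, "new")

# A quorum read that happens to include X sees the failed write.
first = quorum_read(replicas, ["X", "Y"])

# X is lost before the write has propagated to Y or Z (assumed here).
del replicas["X"]

# A later quorum read can only see the older value: the data went backwards.
second = quorum_read(replicas, ["Y", "Z"])

print(first, second)
```

In this model the first read returns the newer value and the second read 
returns the older one, even though only one node was lost.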

I still don't think this is a violation of what was promised, but I can see how 
making the further guarantee would make for more useful consistency semantics 
in some cases.

With respect to implicit write: An alternative is to adjust the reconciliation 
logic, when it is applied as part of reads (as opposed to AES, hinted hand-off, 
writes), to take consistency level into account and only consider columns whose 
timestamp is >= the greatest timestamp that has quorum (off the top of my head 
I think that should be correct in all cases, but I haven't thought this through 
very carefully).
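
One step of that idea, finding "the greatest timestamp that has quorum" among 
the replies to a read, could be sketched as follows. This is only a sketch 
under stated assumptions: a timestamp is taken to "have quorum" when at least 
a quorum of the replies carry exactly that timestamp, and the function name 
and arguments are hypothetical, not part of Cassandra. How the result would 
then feed into reconciliation is left as described above.

```python
from collections import Counter


def greatest_quorum_timestamp(reply_timestamps, replication_factor):
    """Return the greatest timestamp reported by at least a quorum of replies.

    reply_timestamps: one column timestamp per replica that answered the read.
    Returns None when no single timestamp is backed by a quorum.
    (Hypothetical helper for illustration; not a Cassandra API.)
    """
    quorum = replication_factor // 2 + 1
    counts = Counter(reply_timestamps)
    backed = [ts for ts, n in counts.items() if n >= quorum]
    return max(backed) if backed else None


# Only X holds timestamp 2; timestamp 1 is the newest one backed by a quorum.
print(greatest_quorum_timestamp([2, 1, 1], replication_factor=3))
```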


> Quorum reads are not consistent
> -------------------------------
>
>                 Key: CASSANDRA-2494
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2494
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sean Bridges
>
> As discussed in this thread,
> http://www.mail-archive.com/user@cassandra.apache.org/msg12421.html
> Quorum reads should be consistent.  Assume we have a cluster of 3 nodes 
> (X,Y,Z) and a replication factor of 3. If a write of N is committed to X, but 
> not Y and Z, then a read from X should not return N unless the read is 
> committed to at least two nodes.  To ensure this, a read from X should wait 
> for an ack of the read repair write from either Y or Z before returning.
> Are there system tests for cassandra?  If so, there should be a test similar 
> to the original post in the email thread.  One thread should write 1,2,3... 
> at consistency level ONE.  Another thread should read at consistency level 
> QUORUM from a random host, and verify that each read is >= the last read.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
