[ https://issues.apache.org/jira/browse/CASSANDRA-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023151#comment-13023151 ]
Peter Schuller commented on CASSANDRA-2494:
-------------------------------------------

I don't think anyone is claiming otherwise, unless I'm misunderstanding. The problem is that while the "if successfully written to quorum, subsequent quorum reads will see it" guarantee is indeed maintained, it is possible for quorum reads to see data go backwards (on a timeline) in the event of a *failed* attempted quorum write.

This includes the possibility of reads seeing data that then permanently vanishes, even though you only lost, say, 1 node, which you designed your cluster to survive (RF >= 3, QUORUM). ("lost 1 node" can be substituted with "killed 1 node in periodic commit mode".)

I still don't think this is a violation of what was promised, but I can see how making the further guarantee would make for more useful consistency semantics in some cases.

With respect to implicit write: An alternative is to adjust the reconciliation logic when applied as part of reads (as opposed to AES, hinted hand-off, writes) to take consistency level into account and only consider columns whose timestamp is >= the greatest timestamp that has quorum (off the top of my head I think that should be correct in all cases, but I haven't thought this through thoroughly).

> Quorum reads are not consistent
> -------------------------------
>
>                 Key: CASSANDRA-2494
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2494
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sean Bridges
>
> As discussed in this thread,
> http://www.mail-archive.com/user@cassandra.apache.org/msg12421.html
> Quorum reads should be consistent. Assume we have a cluster of 3 nodes
> (X,Y,Z) and a replication factor of 3. If a write of N is committed to X, but
> not Y and Z, then a read from X should not return N unless the read is
> committed to at least two nodes. To ensure this, a read from X should wait
> for an ack of the read repair write from either Y or Z before returning.
> Are there system tests for cassandra? If so, there should be a test similar
> to the original post in the email thread. One thread should write 1,2,3...
> at consistency level ONE. Another thread should read at consistency level
> QUORUM from a random host, and verify that each read is >= the last read.
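
For illustration, the test described in the issue (write 1,2,3... at ONE, read at QUORUM, check that reads never go backwards) can be approximated with a toy in-memory model. This is only a sketch under stated assumptions: it does not talk to Cassandra, it is single-threaded rather than using two threads, and the names Replica, write_one, quorum_read and count_backward_reads are hypothetical helpers invented for this example. The blocking_repair flag loosely mimics the proposal of waiting for the read repair write before returning.

```python
import random


class Replica:
    """Toy replica holding one (timestamp, value) cell; not a real Cassandra node."""

    def __init__(self):
        self.cell = (0, 0)

    def write(self, ts, value):
        # Last-write-wins: keep the cell with the highest timestamp.
        if ts > self.cell[0]:
            self.cell = (ts, value)


def write_one(replicas, ts, value, rng):
    """Write at consistency level ONE: here only a single random replica gets it."""
    rng.choice(replicas).write(ts, value)


def quorum_read(replicas, rng, blocking_repair):
    """Contact 2 of 3 replicas and return the newest value seen.

    With blocking_repair=True the winning cell is written back to the
    contacted replicas before the read returns.
    """
    contacted = rng.sample(replicas, 2)
    newest = max(r.cell for r in contacted)
    if blocking_repair:
        for r in contacted:
            r.write(*newest)
    return newest[1]


def count_backward_reads(blocking_repair, steps=1000, seed=0):
    """Count how often a quorum read returns less than the previous read."""
    rng = random.Random(seed)
    replicas = [Replica() for _ in range(3)]
    last, violations = 0, 0
    for n in range(1, steps + 1):
        write_one(replicas, n, n, rng)
        value = quorum_read(replicas, rng, blocking_repair)
        if value < last:
            violations += 1
        last = value
    return violations


if __name__ == "__main__":
    print("backward reads without blocking repair:", count_backward_reads(False))
    print("backward reads with blocking repair:   ", count_backward_reads(True))
```

In this model, blocking repair leaves the returned cell on two of the three replicas, so any later quorum of two must overlap one of them and cannot return an older value. Without it, a read can land on the two replicas that missed the latest write, which is the backwards step the issue describes.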