[
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315894#comment-17315894
]
Vytenis Silgalis edited comment on CASSANDRA-12126 at 4/6/21, 11:53 PM:
Just a note for people looking for reasons why SERIAL or LOCAL_SERIAL reads are
timing out on versions >3.11.10: the bug that this fixes usually shows up as
the timeout below. Setting the flag to the opt-out option will `fix` it, but you
probably shouldn't be reading at this consistency level if you run into this.
{code:java}
! com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout
during read query at consistency LOCAL_SERIAL (2 responses were required but
only 0 replica responded)
{code}
> CAS Reads Inconsistencies
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/Lightweight Transactions, Legacy/Coordination
>Reporter: Sankalp Kohli
>Assignee: Sylvain Lebresne
>Priority: Normal
> Labels: LWT, pull-request-available
> Fix For: 3.0.24, 3.11.10, 4.0, 4.0-beta4
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> While looking at the CAS code in Cassandra, I found a potential issue with
> CAS Reads. Here is how it can happen with RF=3:
> 1) You issue a CAS Write and it fails in the propose phase. Machine A replies
> true to a propose and saves the commit in the accepted field. The other two
> machines, B and C, do not get to the accept phase.
> Current state is that machine A has this commit in the paxos table as
> accepted but not committed, and B and C do not.
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read
> the value written in step 1. This step behaves as if nothing is in flight.
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that
> there is something inflight from A and will propose and commit it with the
> current ballot. Now we can read the value written in step 1 as part of this
> CAS read.
> If we skip step 3 and instead run step 4, we will never learn about the
> value written in step 1.
> 4) Issue a CAS Write and it involves only B and C. This will succeed and
> commit a different value than step 1. The step 1 value will never be seen
> again and was never seen before.
> Section 2.3 of Lamport's “Paxos Made Simple” paper talks about this issue,
> which is how learners can find out whether a majority of the acceptors have
> accepted the proposal.
> In step 3, it is correct that we propose the value again, since we don't
> know if it was accepted by a majority of acceptors. When we ask a majority of
> acceptors, and one or more acceptors, but not a majority, have something in
> flight, we have no way of knowing if it was accepted by a majority of
> acceptors. So this behavior is correct.
> However, we need to fix step 2, since it causes reads to not be linearizable
> with respect to writes and other reads. In this case, we know that a majority
> of acceptors have no in-flight commit, which means a majority can confirm
> that nothing was accepted by a majority. I think we should run a propose step
> here with an empty commit, which will cause the write from step 1 to never be
> visible afterwards.
> With this fix, we will either see the data written in step 1 on the next
> serial read or never see it, which is what we want.
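
For readers who want to trace steps 1 to 3 of the description, here is a toy
model in Java. It is only a sketch under heavy assumptions: the Replica class,
the integer ballots, the initial value "v0" and the serialRead helper are all
hypothetical and are not Cassandra's actual classes or code paths. Setting
proposeEmptyCommit to true models the proposal in the description (run a
propose step with an empty commit when a quorum reports nothing in flight),
and a null accepted value stands in for that empty commit. Failure handling
and real quorum logic are left out.

```java
import java.util.Arrays;
import java.util.List;

public class CasReadSketch {
    // Hypothetical, heavily simplified replica state (not Cassandra's classes).
    static final class Replica {
        int promised = 0;            // highest ballot this replica has promised
        int acceptedBallot = 0;      // ballot of the accepted, uncommitted proposal (0 = none)
        String acceptedValue = null; // null with acceptedBallot > 0 models an empty commit
        String committed = "v0";     // assumed initial committed value

        boolean prepare(int ballot) {
            if (ballot <= promised) return false;
            promised = ballot;
            return true;
        }

        boolean propose(int ballot, String value) {
            if (ballot < promised) return false;
            promised = ballot;
            acceptedBallot = ballot;
            acceptedValue = value;
            return true;
        }

        void commit(String value) {
            committed = value;
            acceptedBallot = 0;
            acceptedValue = null;
        }
    }

    // Serial read against the given quorum. If the newest in-flight proposal
    // carries a value, finish it with the current ballot (step 3). If nothing
    // is in flight and proposeEmptyCommit is set, propose an empty commit.
    static String serialRead(int ballot, List<Replica> quorum, boolean proposeEmptyCommit) {
        Replica newest = null;
        for (Replica r : quorum) {
            if (!r.prepare(ballot)) throw new IllegalStateException("stale ballot");
            if (r.acceptedBallot > 0 && (newest == null || r.acceptedBallot > newest.acceptedBallot)) {
                newest = r;
            }
        }
        if (newest != null && newest.acceptedValue != null) {
            String value = newest.acceptedValue;
            for (Replica r : quorum) r.propose(ballot, value);
            for (Replica r : quorum) r.commit(value);
        } else if (newest == null && proposeEmptyCommit) {
            for (Replica r : quorum) r.propose(ballot, null);
        }
        return quorum.get(0).committed;
    }

    // Runs steps 1-3 and returns the values seen by the two serial reads.
    static List<String> scenario(boolean proposeEmptyCommit) {
        Replica a = new Replica(), b = new Replica(), c = new Replica();
        // Step 1: the CAS write prepares everywhere, but only A sees the propose.
        for (Replica r : Arrays.asList(a, b, c)) r.prepare(1);
        a.propose(1, "v1");
        // Step 2: serial read that reaches only B and C.
        String read2 = serialRead(2, Arrays.asList(b, c), proposeEmptyCommit);
        // Step 3: serial read that reaches A and B.
        String read3 = serialRead(3, Arrays.asList(a, b), proposeEmptyCommit);
        return Arrays.asList(read2, read3);
    }

    public static void main(String[] args) {
        System.out.println("without empty commit: " + scenario(false)); // [v0, v1]
        System.out.println("with empty commit:    " + scenario(true));  // [v0, v0]
    }
}
```

In this model, without the empty commit the second read surfaces the step 1
value that the first read could not see. With it, B's newer empty proposal
supersedes A's stale one, so the step 1 value is never seen.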