Blake Eggleston created CASSANDRA-20514:
-------------------------------------------
Summary: Paxos mixed mode infinite loop with ttl'd state
Key: CASSANDRA-20514
URL: https://issues.apache.org/jira/browse/CASSANDRA-20514
Project: Apache Cassandra
Issue Type: Bug
Components: Feature/Lightweight Transactions
Reporter: Blake Eggleston
Assignee: Blake Eggleston
This is similar to the bug fixed in CASSANDRA-20493.
CEP-14 changed the ttl behavior of legacy paxos state so that it expires based
on the ballot time of the operation being persisted, not the time a commit is
persisted. This eliminated the race addressed by CASSANDRA-12043, and so the
check it added to the most recent commit prepare logic was removed.
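
As a rough illustration of the two ttl behaviors described above, the sketch below computes expiry both ways. It is a simplified model and not Cassandra code: the function names, the second-granularity timestamps, and the one-hour ttl are all assumptions made for the example.

```python
def expiry_from_persist_time(persist_time_s: int, ttl_s: int) -> int:
    """Older behavior (as described above): the ttl clock starts when the
    commit is persisted, so each replica can expire the state at a
    different moment."""
    return persist_time_s + ttl_s


def expiry_from_ballot_time(ballot_time_s: int, ttl_s: int) -> int:
    """CEP-14 behavior (as described above): the ttl clock starts at the
    ballot time, which is the same value on every replica."""
    return ballot_time_s + ttl_s


# Hypothetical numbers: one ballot, persisted by two replicas at different times.
TTL_S = 3600
BALLOT_TIME_S = 1000
PERSIST_TIMES_S = {"replica_a": 1005, "replica_b": 1030}

old_expiries = {name: expiry_from_persist_time(t, TTL_S)
                for name, t in PERSIST_TIMES_S.items()}
new_expiries = {name: expiry_from_ballot_time(BALLOT_TIME_S, TTL_S)
                for name in PERSIST_TIMES_S}
```

Under these assumptions the persist-time expiries differ between replicas, while the ballot-time expiries are identical, which is one way to read why the earlier race goes away when every node uses the new behavior.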
When operating in mixed mode, though, this race can still be a problem: if a 4.1
or higher node is coordinating a paxos operation with 2 or more replicas on 4.0
or lower, the race reappears. Three conditions together turn it into an
infinite loop:
1. a 4.1 node is coordinating a paxos operation with two 4.0 replicas
2. replica A, a 4.0 node, returns a most recent commit (mrc) for a ballot that
could have been ttl'd
3. replica B, a 4.0 node, has ttl'd that mrc AND converted the ttl'd cells into
tombstones
The 4.1 coordinator receives the mrc from replica A, but since it no longer
disregards missing most recent commits past the ttl window, it sends the
"missing" commit to replica B. Since replica B now has a tombstone for that
mrc, and tombstones win when reconciled with live cells, even ones with ttls,
the commit is a no-op and replica B continues to report nothing for its mrc
value when the coordinator restarts the prepare phase. This loops until the
query times out.
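
The loop described above can be sketched with a small toy model. This is not Cassandra's implementation: `Cell`, `LegacyReplica`, `coordinate`, and the `skip_expired_before` parameter are hypothetical names, ballots are plain integers that double as write timestamps, and `max_rounds` stands in for the query timeout. The `skip_expired_before` argument loosely models the removed CASSANDRA-12043 check.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    timestamp: int          # write timestamp, taken from the ballot in this model
    ballot: Optional[int]   # None marks a tombstone

    @property
    def is_tombstone(self) -> bool:
        return self.ballot is None


def reconcile(a: Optional[Cell], b: Optional[Cell]) -> Optional[Cell]:
    """Higher timestamp wins; on a tie, the tombstone wins."""
    if a is None:
        return b
    if b is None:
        return a
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    if a.is_tombstone:
        return a
    if b.is_tombstone:
        return b
    return a


class LegacyReplica:
    """Toy stand-in for a 4.0 replica holding one most recent commit (mrc)."""

    def __init__(self, mrc_cell: Optional[Cell] = None):
        self.mrc_cell = mrc_cell

    def prepare(self) -> Optional[int]:
        # A tombstoned or absent mrc is reported as "nothing".
        cell = self.mrc_cell
        return None if cell is None or cell.is_tombstone else cell.ballot

    def commit(self, ballot: int) -> None:
        # The commit is written with the ballot's timestamp, so it ties
        # with a tombstone produced from the same ttl'd cells.
        self.mrc_cell = reconcile(self.mrc_cell, Cell(ballot, ballot))


def coordinate(replicas: List[LegacyReplica], max_rounds: int,
               skip_expired_before: Optional[int] = None) -> Optional[int]:
    """Return the prepare round in which replicas agree, or None on 'timeout'."""
    for attempt in range(1, max_rounds + 1):
        responses = [r.prepare() for r in replicas]
        latest = max((b for b in responses if b is not None), default=None)
        behind = [r for r, b in zip(replicas, responses)
                  if latest is not None and b != latest]
        # Hypothetical version of the removed check: ignore a missing mrc
        # whose ballot is already past the ttl window.
        if (skip_expired_before is not None and latest is not None
                and latest < skip_expired_before):
            behind = []
        if not behind:
            return attempt
        for r in behind:
            r.commit(latest)  # a no-op when the replica holds a tombstone
    return None


def make_replicas() -> List[LegacyReplica]:
    # Replica A still has the live mrc; replica B has tombstoned it.
    return [LegacyReplica(Cell(100, 100)), LegacyReplica(Cell(100, None))]
```

In this model, `coordinate(make_replicas(), max_rounds=10)` never converges and returns `None`, while passing `skip_expired_before=200` lets the coordinator finish in the first round. A replica that simply has no state, rather than a tombstone, accepts the commit and the second prepare round succeeds.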