[jira] [Comment Edited] (CASSANDRA-20374) CEP-45: Index read support for mutation tracking

Aleksey Yeschenko (Jira) Tue, 29 Jul 2025 03:45:54 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-20374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18010627#comment-18010627
 ]


Aleksey Yeschenko edited comment on CASSANDRA-20374 at 7/29/25 10:38 AM:
-------------------------------------------------------------------------

Pushed a rebased and squashed branch here 
(https://github.com/iamaleksey/cassandra/commits/20374-rebased), with some 
suggestions and nits, including from previous review and Caleb's comment above, 
addressed.

While reviewing this branch and restructuring the logic to be more linear so I 
can understand it better, I've realized that there are still some issues there 
with short read protection implementation for range reads, some shared with the 
indexed implementation:

1. Per partition limit isn't handled properly. If an augmenting mutation 
shadows some of the rows in the initial read, we don't fix that partition.
2. We can materialize an unbounded number of regular and range tombstones even 
with very reasonable limits set. This does allow us to safely and correctly 
merge the initial read data with the augmenting partition updates, yes, but I 
think we can't afford to do it. An alternative would be avoid materializing row 
and ranged tombstones, but keep track, per partition, of whether we have 
encountered any while materializing. If we then have an augmenting partition 
update with live data, we mark that partition is potentially short, and will 
have to swap it for a result of a single-partition SRP read for just that 
partition.
3. I think even withut per partition limits, we can have incorrect handling of 
SRP as implemented now for the very last partition. If some of the augmenting 
partition updates have tombstones that shadow portions of the last partition, 
it won't be 'repaired' - the follow-up read witll start with the next key 
that's not included.
4. We are mixing metaphors in the implementation - using a state machine for 
the bulk of the read stages (initial, prepared, completed), then future 
combining for SRP/follow-up reads. It would be clearer if we followed just one 
approach, as it's a little tricky to follow the logic.
5. The state machine for {PartialTrackedRead} is a little involved, and has 
class hierarchies that make the logic very non-linear. Have to go up- and down- 
layers to follow the path of a request - for range reads specifically. I think 
it can be simplified quite a bit.

I have a WIP branch with some of this done (was flattening the logic for the 
purpose of understanding the existing code better), that I believe might be 
worth evolving further into a follow-up PR that will address these issues.

That said, since these issues are pre-existing, and the indexing part itself 
looks good to me, I think it would be fine to commit 20374 (this issue) as is 
in the rebased branch, once Caleb's feedback's been addressed, and close the 
issue, addressing the rest of it in a follow-up JIRA ticket.


was (Author: iamaleksey):
Pushed a rebased and squashed branch here 
(https://github.com/iamaleksey/cassandra/commits/20374-rebased), with some 
suggestions and nits, including from previous review and Caleb's comment above, 
addressed.

While reviewing this branch and restructuring the logic to be more linear so I 
can understand it better, I've realized that there are still some issues there 
with short read protection implementation for range reads, some shared with the 
indexed implementation:

1. Per partition limit isn't handled properly. If an augmenting mutation 
shadows some of the rows in the initial read, we don't fix that partition.
2. We can materialize an unbounded number of regular and range tombstones even 
with very reasonable limits set. This does allow us to safely and correctly 
merge the initial read data with the augmenting partition updates, yes, but I 
think we can't afford to do it. An alternative would be avoid materializing row 
and ranged tombstones, but keep track, per partition, of whether we have 
encountered any while materializing. If we then have an augmenting partition 
update with live data, we mark that partition is potentially short, and will 
have to swap it for a result of a single-partition SRP read for just that 
partition.
3. I think even withut per partition limits, we can have incorrect handling of 
SRP as implemented now for the very last partition. If some of the augmenting 
partition updates have tombstones that shadow portions of the last partition, 
it won't be 'repaired' - the follow-up read witll start with the next key 
that's not included.
4. We are mixing metaphors in the implementation - using a state machine for 
the bulk of the read stages (initial, prepared, completed), then future 
combining for SRP/follow-up reads. It would be clearer if we followed just one 
approach, as it's a little tricky to follow the logic.
5. The state machine for {PartialTrackedRead} is a little involved, and has 
class hierarchies that make the logic very non-linear. Have to go up- and down- 
layers to follow the path of a request - for range reads specifically. I think 
it can be simplified quite a bit.

I have a WIP branch with some of this done (was flattening the logic for the 
purpose of understanding the existing code better), that I believe might be 
worth evolving further into a follow-up PR that will address these issues.

That said, since these issues are pre-existing, and the indexing part itself 
looks good to me, I think it would be fine to commit 20374 (this issue) as is 
in the rebased branch and close the issue, addressing the rest of it in a 
follow-up JIRA ticket.

> CEP-45: Index read support for mutation tracking
> ------------------------------------------------
>
>                 Key: CASSANDRA-20374
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20374
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Consistency/Coordination
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Normal
>
> Index reads are disabled for mutation tracking and we need to add support for 
> them. A big part of this will be adapting the ReplicaFilteringProtection to 
> work with the logged read classes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Comment Edited] (CASSANDRA-20374) CEP-45: Index read support for mutation tracking

Reply via email to