[
https://issues.apache.org/jira/browse/CASSANDRA-20374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18010627#comment-18010627
]
Aleksey Yeschenko edited comment on CASSANDRA-20374 at 7/29/25 10:38 AM:
-------------------------------------------------------------------------
Pushed a rebased and squashed branch here
(https://github.com/iamaleksey/cassandra/commits/20374-rebased), with some
suggestions and nits, including from previous review and Caleb's comment above,
addressed.
While reviewing this branch and restructuring the logic to be more linear so I
can understand it better, I've realized that there are still some issues there
with short read protection implementation for range reads, some shared with the
indexed implementation:
1. Per partition limit isn't handled properly. If an augmenting mutation
shadows some of the rows in the initial read, we don't fix that partition.
2. We can materialize an unbounded number of regular and range tombstones even
with very reasonable limits set. This does allow us to safely and correctly
merge the initial read data with the augmenting partition updates, yes, but I
think we can't afford to do it. An alternative would be avoid materializing row
and ranged tombstones, but keep track, per partition, of whether we have
encountered any while materializing. If we then have an augmenting partition
update with live data, we mark that partition is potentially short, and will
have to swap it for a result of a single-partition SRP read for just that
partition.
3. I think even withut per partition limits, we can have incorrect handling of
SRP as implemented now for the very last partition. If some of the augmenting
partition updates have tombstones that shadow portions of the last partition,
it won't be 'repaired' - the follow-up read witll start with the next key
that's not included.
4. We are mixing metaphors in the implementation - using a state machine for
the bulk of the read stages (initial, prepared, completed), then future
combining for SRP/follow-up reads. It would be clearer if we followed just one
approach, as it's a little tricky to follow the logic.
5. The state machine for {PartialTrackedRead} is a little involved, and has
class hierarchies that make the logic very non-linear. Have to go up- and down-
layers to follow the path of a request - for range reads specifically. I think
it can be simplified quite a bit.
I have a WIP branch with some of this done (was flattening the logic for the
purpose of understanding the existing code better), that I believe might be
worth evolving further into a follow-up PR that will address these issues.
That said, since these issues are pre-existing, and the indexing part itself
looks good to me, I think it would be fine to commit 20374 (this issue) as is
in the rebased branch, once Caleb's feedback's been addressed, and close the
issue, addressing the rest of it in a follow-up JIRA ticket.
was (Author: iamaleksey):
Pushed a rebased and squashed branch here
(https://github.com/iamaleksey/cassandra/commits/20374-rebased), with some
suggestions and nits, including from previous review and Caleb's comment above,
addressed.
While reviewing this branch and restructuring the logic to be more linear so I
can understand it better, I've realized that there are still some issues there
with short read protection implementation for range reads, some shared with the
indexed implementation:
1. Per partition limit isn't handled properly. If an augmenting mutation
shadows some of the rows in the initial read, we don't fix that partition.
2. We can materialize an unbounded number of regular and range tombstones even
with very reasonable limits set. This does allow us to safely and correctly
merge the initial read data with the augmenting partition updates, yes, but I
think we can't afford to do it. An alternative would be avoid materializing row
and ranged tombstones, but keep track, per partition, of whether we have
encountered any while materializing. If we then have an augmenting partition
update with live data, we mark that partition is potentially short, and will
have to swap it for a result of a single-partition SRP read for just that
partition.
3. I think even withut per partition limits, we can have incorrect handling of
SRP as implemented now for the very last partition. If some of the augmenting
partition updates have tombstones that shadow portions of the last partition,
it won't be 'repaired' - the follow-up read witll start with the next key
that's not included.
4. We are mixing metaphors in the implementation - using a state machine for
the bulk of the read stages (initial, prepared, completed), then future
combining for SRP/follow-up reads. It would be clearer if we followed just one
approach, as it's a little tricky to follow the logic.
5. The state machine for {PartialTrackedRead} is a little involved, and has
class hierarchies that make the logic very non-linear. Have to go up- and down-
layers to follow the path of a request - for range reads specifically. I think
it can be simplified quite a bit.
I have a WIP branch with some of this done (was flattening the logic for the
purpose of understanding the existing code better), that I believe might be
worth evolving further into a follow-up PR that will address these issues.
That said, since these issues are pre-existing, and the indexing part itself
looks good to me, I think it would be fine to commit 20374 (this issue) as is
in the rebased branch and close the issue, addressing the rest of it in a
follow-up JIRA ticket.
> CEP-45: Index read support for mutation tracking
> ------------------------------------------------
>
> Key: CASSANDRA-20374
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20374
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: Consistency/Coordination
> Reporter: Blake Eggleston
> Assignee: Blake Eggleston
> Priority: Normal
>
> Index reads are disabled for mutation tracking and we need to add support for
> them. A big part of this will be adapting the ReplicaFilteringProtection to
> work with the logged read classes.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]