[ 
https://issues.apache.org/jira/browse/CASSANDRA-19018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17799154#comment-17799154
 ] 

Caleb Rackliffe edited comment on CASSANDRA-19018 at 12/20/23 7:55 PM:
-----------------------------------------------------------------------

Update before the holidays begin...

I've pushed a commit to [the PR 
here|https://github.com/apache/cassandra/pull/2935] that, though incomplete, is 
a good start. Here's an overview:

1.) {{RowFilter}} now has a boolean field that indicates whether or not we 
allow what I call "strict" filtering at replicas. Essentially, strict filtering 
is allowed when we have a query at ONE/LOCAL_ONE/NODE_LOCAL, where we simply 
won't have to worry about partial updates. This is sent to replicas via a flag 
during {{ReadCommand}} serialization. (This actually could be used by the 
{{RowFilter}} itself for non-indexed plain filtering queries, although I'm not 
sure if fixing those needs to be a goal for this Jira.)

2.) With strict filtering allowed, really nothing about SAI query execution 
changes. When strict filtering is unsafe and therefore disabled, we segregate 
indexes for repaired and un-repaired SSTables, then do strict filtering on the 
repaired set, but non-strict (intersections become unions) filtering on the 
un-repaired set (which includes the Memtable indexes).

3.) I've removed the optimization during index query "view" building around the 
"most selective" index expression. Any selectivity calculations are inaccurate, 
since we don't index deletes. Also, when strict filtering is not allowed, 
intersecting the combined range for the SSTable indexes of the "most selective" 
expression is simply not safe. I'm not sure we've ever gotten much out of this 
optimization anyway. (With partition restricted queries and LCS, we'll already 
be dealing with a relatively small set of SSTables.)

4.) Replica filtering protection hard-codes its {{RowFilter}} to do strict 
filtering, as it must, since it happens at the coordinator.
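
To make points 1 and 2 concrete, here is a minimal, self-contained sketch. It is not the code in the PR: the class, the method names, and the use of integer partition keys are all hypothetical, and each index is modeled as nothing more than the set of keys matching one expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model, not Cassandra code: one Set<Integer> of matching keys per index expression.
final class StrictFilteringSketch
{
    enum ConsistencyLevel { ONE, LOCAL_ONE, NODE_LOCAL, QUORUM, ALL }

    // Point 1: strict filtering is safe only when a single replica answers the query.
    static boolean allowsStrictFiltering(ConsistencyLevel cl)
    {
        return cl == ConsistencyLevel.ONE || cl == ConsistencyLevel.LOCAL_ONE || cl == ConsistencyLevel.NODE_LOCAL;
    }

    static Set<Integer> intersect(List<Set<Integer>> perExpression)
    {
        Set<Integer> result = new TreeSet<>();
        if (perExpression.isEmpty())
            return result;
        result.addAll(perExpression.get(0));
        for (Set<Integer> matches : perExpression.subList(1, perExpression.size()))
            result.retainAll(matches);
        return result;
    }

    static Set<Integer> union(List<Set<Integer>> perExpression)
    {
        Set<Integer> result = new TreeSet<>();
        for (Set<Integer> matches : perExpression)
            result.addAll(matches);
        return result;
    }

    // Point 2: both lists hold one entry per expression, in the same order (assumed by this sketch).
    static Set<Integer> search(boolean strict, List<Set<Integer>> repaired, List<Set<Integer>> unrepaired)
    {
        if (strict)
        {
            // Nothing special: merge each expression's matches, then intersect across expressions.
            List<Set<Integer>> combined = new ArrayList<>();
            for (int i = 0; i < repaired.size(); i++)
            {
                Set<Integer> matches = new TreeSet<>(repaired.get(i));
                matches.addAll(unrepaired.get(i));
                combined.add(matches);
            }
            return intersect(combined);
        }

        // Non-strict: intersect the repaired set, but let any partial match in the un-repaired set through.
        Set<Integer> result = intersect(repaired);
        result.addAll(union(unrepaired));
        return result;
    }

    public static void main(String[] args)
    {
        // Key 5 fully matches in repaired data; key 0 has only its "a" column written locally.
        List<Set<Integer>> repaired = List.of(Set.of(5), Set.of(5));
        List<Set<Integer>> unrepaired = List.of(Set.of(0), Set.<Integer>of());

        System.out.println(search(true, repaired, unrepaired));  // [5]
        System.out.println(search(false, repaired, unrepaired)); // [0, 5]
    }
}
```

In the non-strict case key 0 is returned even though it only partially matches locally, which leaves the final decision to the coordinator.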

If I left it in this state, things would be _correct_, just very inefficient. 
There are many cases where we can still do strict filtering for CL > ONE, and 
the surface area for non-strict filtering needs further shrinking. (e.g., 
retrieved rows have no timestamp divergence for queried columns, only a single 
mutable column participates in an AND query, etc.)

CC [~adelapena]


> An SAI-specific mechanism to ensure consistency isn't violated for 
> multi-column (i.e. AND) queries at CL > ONE
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19018
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19018
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination, Feature/SAI
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 5.0-rc, 5.x
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> CASSANDRA-19007 is going to be where we add a guardrail around 
> filtering/index queries that use intersection/AND over partially updated 
> non-key columns. (e.g., restricting one clustering column and one normal column 
> does not cause a consistency problem, as primary keys cannot be partially 
> updated.) This issue exists to attempt to fix this specifically for SAI in 
> 5.0.x, as Accord will (last I checked) not be available until the 5.1 release.
> The SAI-specific version of the originally reported issue is this:
> {noformat}
> try (Cluster cluster = init(Cluster.build(2).withConfig(config -> config.with(GOSSIP).with(NETWORK)).start()))
> {
>     cluster.schemaChange(withKeyspace("CREATE TABLE %s.t (k int PRIMARY KEY, a int, b int)"));
>     cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(a) USING 'sai'"));
>     cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(b) USING 'sai'"));
>
>     // insert a split row
>     cluster.get(1).executeInternal(withKeyspace("INSERT INTO %s.t(k, a) VALUES (0, 1)"));
>     cluster.get(2).executeInternal(withKeyspace("INSERT INTO %s.t(k, b) VALUES (0, 2)"));
>
>     // Uncomment this line and test succeeds w/ partial writes completed...
>     //cluster.get(1).nodetoolResult("repair", KEYSPACE).asserts().success();
>
>     String select = withKeyspace("SELECT * FROM %s.t WHERE a = 1 AND b = 2");
>     Object[][] initialRows = cluster.coordinator(1).execute(select, ConsistencyLevel.ALL);
>     assertRows(initialRows, row(0, 1, 2)); // not found!!
> }
> {noformat}
> To make a long story short, the local SAI indexes are hiding local partial 
> matches from the coordinator that would combine there to form full matches. 
> Simple non-index filtering queries also suffer from this problem, but they 
> hide the partial matches in a different way. I'll outline a possible solution 
> for this in the comments that takes advantage of replica filtering protection 
> and the repaired/unrepaired datasets...and attempts to minimize the amount of 
> extra row data sent to the coordinator.
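
The failure mode in the description can also be shown in miniature without a cluster. The sketch below is plain Java under loud assumptions: every name in it is hypothetical, each replica holds at most one row, and there are no timestamps, so the coordinator "merge" is a simple column overlay. It only illustrates why strictly filtering at each replica hides a split row, while returning partial matches lets the coordinator reassemble it and filter it there.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical model, not Cassandra code: a row is a map of column name to value.
final class SplitRowSketch
{
    // Strict: every queried column must match locally.
    static boolean matchesAll(Map<String, Integer> row, Map<String, Integer> query)
    {
        for (Map.Entry<String, Integer> e : query.entrySet())
            if (!e.getValue().equals(row.get(e.getKey())))
                return false;
        return true;
    }

    // Non-strict: a single matching column is enough for the replica to return the row.
    static boolean matchesAny(Map<String, Integer> row, Map<String, Integer> query)
    {
        for (Map.Entry<String, Integer> e : query.entrySet())
            if (e.getValue().equals(row.get(e.getKey())))
                return true;
        return false;
    }

    // Reads from all replicas, merges whatever they return, then filters strictly at the coordinator.
    static Optional<Map<String, Integer>> readAtAll(List<Map<String, Integer>> replicas,
                                                    Map<String, Integer> query,
                                                    boolean strictAtReplicas)
    {
        Map<String, Integer> merged = new HashMap<>();
        for (Map<String, Integer> localRow : replicas)
        {
            boolean returned = strictAtReplicas ? matchesAll(localRow, query) : matchesAny(localRow, query);
            if (returned)
                merged.putAll(localRow); // no timestamps in this sketch, columns simply overlay
        }
        return matchesAll(merged, query) ? Optional.of(merged) : Optional.empty();
    }

    public static void main(String[] args)
    {
        // The split row from the test: node 1 saw only "a", node 2 saw only "b".
        List<Map<String, Integer>> replicas = List.of(Map.of("k", 0, "a", 1), Map.of("k", 0, "b", 2));
        Map<String, Integer> query = Map.of("a", 1, "b", 2);

        System.out.println(readAtAll(replicas, query, true).isPresent());  // false: both replicas hide the row
        System.out.println(readAtAll(replicas, query, false).isPresent()); // true: the coordinator sees a full match
    }
}
```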


