[ 
https://issues.apache.org/jira/browse/CASSANDRA-19018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817515#comment-17817515
 ] 

Caleb Rackliffe commented on CASSANDRA-19018:
---------------------------------------------

I've finally narrowed in on a concrete repro for the range tombstone problems...

{noformat}
@Test
public void testPartialUpdatesWithDeleteBetween()
{
    CLUSTER.schemaChange(withKeyspace("CREATE TABLE %s.partial_updates (k int, c int, a int, b int, PRIMARY KEY (k, c)) WITH read_repair = 'NONE'"));
    CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.partial_updates(a) USING 'sai'"));
    CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.partial_updates(b) USING 'sai'"));
    SAIUtil.waitForIndexQueryable(CLUSTER, KEYSPACE);

    // insert a split row w/ a range tombstone sandwiched in the middle temporally
    CLUSTER.get(1).executeInternal(withKeyspace("INSERT INTO %s.partial_updates(k, c, a) VALUES (0, 1, 1) USING TIMESTAMP 1"));
    CLUSTER.get(2).executeInternal(withKeyspace("DELETE FROM %s.partial_updates USING TIMESTAMP 2 WHERE k = 0 AND c > 0"));
    CLUSTER.get(2).executeInternal(withKeyspace("INSERT INTO %s.partial_updates(k, c, b) VALUES (0, 1, 2) USING TIMESTAMP 3"));

    String select = withKeyspace("SELECT * FROM %s.partial_updates WHERE a = 1 AND b = 2");
    Object[][] initialRows = CLUSTER.coordinator(1).execute(select, ConsistencyLevel.ALL);
    assertRows(initialRows); // This returns a row when it shouldn't!
}
{noformat}

tl;dr Because we can degrade intersections to unions inside SAI on unrepaired 
data, replica filtering protection (RFP) as it stands, which identifies silent 
replicas purely at the row level and does not send range tombstones (RTs) to 
the coordinator, no longer implicitly covers all delete cases. In the case 
above, RFP could be made to work if it identified "silent" columns rather than 
entire rows (i.e., it would notice that "a" from node 1 has no corresponding 
value from node 2, so the response from node 2 needs to be protected). Assuming 
data isn't always horrifically out of date, this is likely better than trying 
to send mostly unnecessary RTs.
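
To illustrate the "silent columns" idea, here is a minimal, self-contained 
sketch. It is not RFP's actual implementation: the class SilentColumns, the 
method silentColumns, and the map-based model of replica responses are all 
hypothetical names invented for this example. It models the two responses from 
the repro above for row (k = 0, c = 1) and reports, per replica, the columns 
that some other replica returned but that replica did not.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names exist in Cassandra.
public class SilentColumns
{
    /**
     * For a single row, takes each replica's response (column name -> value)
     * and returns, for every replica that is missing something, the set of
     * columns another replica returned but this replica did not.
     */
    public static Map<String, Set<String>> silentColumns(Map<String, Map<String, Integer>> responses)
    {
        // Union of every column any replica returned for this row.
        Set<String> allColumns = new TreeSet<>();
        for (Map<String, Integer> response : responses.values())
            allColumns.addAll(response.keySet());

        Map<String, Set<String>> silent = new TreeMap<>();
        for (Map.Entry<String, Map<String, Integer>> entry : responses.entrySet())
        {
            Set<String> missing = new TreeSet<>(allColumns);
            missing.removeAll(entry.getValue().keySet());
            if (!missing.isEmpty())
                silent.put(entry.getKey(), missing);
        }
        return silent;
    }

    public static void main(String[] args)
    {
        // Both replicas return the row, so a row-level check sees no silent replica.
        Map<String, Map<String, Integer>> responses = new LinkedHashMap<>();
        responses.put("node1", Map.of("a", 1)); // matched a = 1 (written at timestamp 1)
        responses.put("node2", Map.of("b", 2)); // matched b = 2 (written at timestamp 3)

        // prints {node1=[b], node2=[a]}
        System.out.println(silentColumns(responses));
    }
}
```

In this model node 2 is silent for "a", which is the signal that its view of 
the row would need to be fetched before the coordinator trusts a = 1 from 
node 1. Presumably that fetch is what would surface the effect of the range 
tombstone at timestamp 2.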

> An SAI-specific mechanism to ensure consistency isn't violated for 
> multi-column (i.e. AND) queries at CL > ONE
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19018
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19018
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination, Feature/SAI
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 5.0-rc, 5.x
>
>         Attachments: ci_summary-1.html, ci_summary.html, 
> result_details.tar-1.gz, result_details.tar.gz
>
>          Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> CASSANDRA-19007 is going to be where we add a guardrail around 
> filtering/index queries that use intersection/AND over partially updated 
> non-key columns. (ex. Restricting one clustering column and one normal column 
> does not cause a consistency problem, as primary keys cannot be partially 
> updated.) This issue exists to attempt to fix this specifically for SAI in 
> 5.0.x, as Accord will (last I checked) not be available until the 5.1 release.
> The SAI-specific version of the originally reported issue is this:
> {noformat}
> try (Cluster cluster = init(Cluster.build(2).withConfig(config -> config.with(GOSSIP).with(NETWORK)).start()))
> {
>     cluster.schemaChange(withKeyspace("CREATE TABLE %s.t (k int PRIMARY KEY, a int, b int)"));
>     cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(a) USING 'sai'"));
>     cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(b) USING 'sai'"));
>     // insert a split row
>     cluster.get(1).executeInternal(withKeyspace("INSERT INTO %s.t(k, a) VALUES (0, 1)"));
>     cluster.get(2).executeInternal(withKeyspace("INSERT INTO %s.t(k, b) VALUES (0, 2)"));
>     // Uncomment this line and test succeeds w/ partial writes completed...
>     //cluster.get(1).nodetoolResult("repair", KEYSPACE).asserts().success();
>     String select = withKeyspace("SELECT * FROM %s.t WHERE a = 1 AND b = 2");
>     Object[][] initialRows = cluster.coordinator(1).execute(select, ConsistencyLevel.ALL);
>     assertRows(initialRows, row(0, 1, 2)); // not found!!
> }
> {noformat}
> To make a long story short, the local SAI indexes are hiding local partial 
> matches from the coordinator that would combine there to form full matches. 
> Simple non-index filtering queries also suffer from this problem, but they 
> hide the partial matches in a different way. I'll outline a possible solution 
> for this in the comments that takes advantage of replica filtering protection 
> and the repaired/unrepaired datasets...and attempts to minimize the amount of 
> extra row data sent to the coordinator.



