Michael Marshall created CASSANDRA-21118:
--------------------------------------------

             Summary: SAI query on indexed static column reads full partition
                 Key: CASSANDRA-21118
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21118
             Project: Apache Cassandra
          Issue Type: Bug
            Reporter: Michael Marshall


The `ResultRetriever` in SAI materializes `matches` eagerly instead of 
iteratively. As a result, when a static primary key is used to create the 
partition iterator, we iterate the full partition regardless of the `limit` 
value. Here is a test that demonstrates the problem. The test passes either 
way, so you'll need to add logging or attach a debugger to observe the 
full-partition read.


{code:java}
    @Test
    public void staticIndexOnlyMaterializesLimitRowsFromPartition() throws Throwable
    {
        createTable("CREATE TABLE %s (pk int, ck int, val1 int static, val2 int, PRIMARY KEY(pk, ck))");
        disableCompaction(KEYSPACE);
        createIndex("CREATE INDEX ON %s(val1) USING 'sai'");

        execute("INSERT INTO %s(pk, ck, val1, val2) VALUES(?, ?, ?, ?)", 1, 1, 2, 1);
        for (int i = 2; i < 10000; i++)
            execute("INSERT INTO %s(pk, ck,       val2) VALUES(?, ?,    ?)", 1, i,    i);

        beforeAndAfterFlush(() -> assertRows(execute("SELECT pk, ck, val1, val2 FROM %s WHERE val1 = 2 LIMIT 3"),
                                             row(1, 1, 2, 1), row(1, 2, 2, 2), row(1, 3, 2, 3)));
    }
{code}

The proper solution is to apply an iterator-based filter so that rows are 
filtered lazily and iteration can stop once `limit` rows have been returned. 
It might be worth reviewing the git history to see whether it was implemented 
that way initially.
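
To make the difference concrete, here is a minimal, standalone sketch of eager 
versus lazy filtering under a limit. It is not the SAI code: `LazyFilterSketch`, 
`CountingIterator`, `lazyFilter`, and both `rowsRead...` methods are 
hypothetical names invented for illustration, and plain integers stand in for 
rows. It only counts how many rows are pulled from the underlying source.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names exist in Cassandra.
class LazyFilterSketch
{
    /** Stand-in for a partition: yields 0..size-1 and counts rows pulled. */
    static final class CountingIterator implements Iterator<Integer>
    {
        private final int size;
        private int next = 0;
        int rowsRead = 0;

        CountingIterator(int size) { this.size = size; }

        @Override
        public boolean hasNext() { return next < size; }

        @Override
        public Integer next()
        {
            if (!hasNext()) throw new NoSuchElementException();
            rowsRead++;
            return next++;
        }
    }

    /** Wraps a source so that rows are only pulled when the caller asks. */
    static <T> Iterator<T> lazyFilter(Iterator<T> source, Predicate<T> filter)
    {
        return new Iterator<T>()
        {
            private T pending;
            private boolean hasPending;

            @Override
            public boolean hasNext()
            {
                // Advance only until the next match is found.
                while (!hasPending && source.hasNext())
                {
                    T candidate = source.next();
                    if (filter.test(candidate))
                    {
                        pending = candidate;
                        hasPending = true;
                    }
                }
                return hasPending;
            }

            @Override
            public T next()
            {
                if (!hasNext()) throw new NoSuchElementException();
                T result = pending;
                pending = null;
                hasPending = false;
                return result;
            }
        };
    }

    /** Eager: collect every match first, then apply the limit. */
    public static int rowsReadEagerly(int partitionSize, int limit, Predicate<Integer> filter)
    {
        CountingIterator source = new CountingIterator(partitionSize);
        List<Integer> matches = new ArrayList<>();
        while (source.hasNext())
        {
            Integer row = source.next();
            if (filter.test(row)) matches.add(row);
        }
        // The limit is applied only after the whole source was drained.
        matches.subList(0, Math.min(limit, matches.size()));
        return source.rowsRead;
    }

    /** Lazy: stop pulling from the source once the limit is reached. */
    public static int rowsReadLazily(int partitionSize, int limit, Predicate<Integer> filter)
    {
        CountingIterator source = new CountingIterator(partitionSize);
        Iterator<Integer> matches = lazyFilter(source, filter);
        int returned = 0;
        // Check the limit first so no extra row is read after the last match.
        while (returned < limit && matches.hasNext())
        {
            matches.next();
            returned++;
        }
        return source.rowsRead;
    }

    public static void main(String[] args)
    {
        System.out.println("eager rows read: " + rowsReadEagerly(9999, 3, row -> true));
        System.out.println("lazy rows read:  " + rowsReadLazily(9999, 3, row -> true));
    }
}
```

With a predicate that matches every row, as the static column does in the test 
above, the eager variant pulls all 9999 rows while the lazy variant pulls 
only 3. How this maps onto the real `ResultRetriever` is an assumption that 
would need checking against the code.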




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
