Benjamin Lerer created CASSANDRA-16737:
------------------------------------------

             Summary: ALTER ... ADD can increase the number of SSTables being read
                 Key: CASSANDRA-16737
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16737
             Project: Cassandra
          Issue Type: Bug
          Components: CQL/Semantics
            Reporter: Benjamin Lerer
            Assignee: Benjamin Lerer


With the following three SSTables, each produced by a flush:

{code}
CREATE TABLE my_table (pk int, ck int, v1 int, PRIMARY KEY(pk, ck));

INSERT INTO my_table (pk, ck, v1) VALUES (1, 1, 1) USING TIMESTAMP 1000;
--> flush()
INSERT INTO my_table (pk, ck, v1) VALUES (1, 1, 2) USING TIMESTAMP 2000;
--> flush()
INSERT INTO my_table (pk, ck, v1) VALUES (1, 1, 3) USING TIMESTAMP 3000;
--> flush()
{code}

the following query:
{code}SELECT pk, ck, v1 FROM my_table WHERE pk = 1 AND ck = 1;{code}
will read only the third SSTable.

If we add a column to the table (e.g. {{ALTER TABLE my_table ADD v2 int}}) and 
rerun the query, it will read all three SSTables.

This happens because C* tries to read all the {{fetched}} columns to ensure 
that it returns a row if at least one of its columns is non-null.

In practice, for CQL tables, C* does not need to fetch all columns if the row 
contains a primary key liveness, as that is enough to guarantee that the row 
exists. Consequently, even after the addition of the new column, C* should 
read only the third SSTable.
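
The behavior described above can be illustrated with a small Python model. 
This is only a sketch under stated assumptions, not Cassandra's actual read 
path: the {{SSTable}} class and {{count_sstables_read}} function are 
hypothetical names, each SSTable is reduced to its maximum timestamp, the 
row's primary key liveness timestamp and the per-column write timestamps, and 
the {{fetched}} set is assumed to contain every regular column of the table.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class SSTable:
    """Hypothetical stand-in for one flushed SSTable holding row (pk=1, ck=1)."""
    max_timestamp: int               # newest write timestamp in this SSTable
    liveness_timestamp: int          # primary key liveness written by INSERT
    cell_timestamps: Dict[str, int]  # column name -> write timestamp


def count_sstables_read(sstables: List[SSTable], required: Set[str]) -> int:
    """Read SSTables newest first and stop once the row's liveness and every
    required column are newer than anything the remaining SSTables can hold."""
    liveness = -1                # -1 means "not seen yet" (timestamps are >= 0)
    newest: Dict[str, int] = {}  # column name -> newest timestamp seen so far
    read = 0
    for sstable in sorted(sstables, key=lambda s: s.max_timestamp, reverse=True):
        bound = sstable.max_timestamp
        row_known = liveness > bound
        columns_known = all(newest.get(c, -1) > bound for c in required)
        if row_known and columns_known:
            break  # older SSTables cannot change the result
        read += 1
        liveness = max(liveness, sstable.liveness_timestamp)
        for column, ts in sstable.cell_timestamps.items():
            newest[column] = max(newest.get(column, -1), ts)
    return read


# The three SSTables from the example: one INSERT per flush.
sstables = [
    SSTable(1000, 1000, {"v1": 1000}),
    SSTable(2000, 2000, {"v1": 2000}),
    SSTable(3000, 3000, {"v1": 3000}),
]

print(count_sstables_read(sstables, {"v1"}))        # before ALTER: prints 1
print(count_sstables_read(sstables, {"v1", "v2"}))  # after ADD v2: prints 3
```

In this model, {{v2}} never appears in any SSTable, so requiring it forces the 
loop through every SSTable. Requiring only the queried column {{v1}} once the 
primary key liveness is known, which is the change suggested above, brings the 
count back to one.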



