[ https://issues.apache.org/jira/browse/CASSANDRA-16737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ekaterina Dimitrova updated CASSANDRA-16737:
--------------------------------------------
    Description:
With the following SSTables:
{code:java}
CREATE TABLE my_table (pk int, ck int, v1 int, PRIMARY KEY(pk, ck));
INSERT INTO my_table (pk, ck, v1) VALUES (1, 1, 1) USING TIMESTAMP 1000;
--> flush()
INSERT INTO my_table (pk, ck, v1) VALUES (1, 1, 2) USING TIMESTAMP 2000;
--> flush()
INSERT INTO my_table (pk, ck, v1) VALUES (1, 1, 3) USING TIMESTAMP 3000;
--> flush()
{code}
the following query:
{code:java}
SELECT pk, ck, v1 FROM my_table WHERE pk = 1 AND ck = 1;
{code}
will only read the third SSTable.

If we add a column to the table (e.g. {{ALTER TABLE my_table ADD v2 int}}) and rerun the query, the query will read all three SSTables.

This happens because C* tries to read all the {{fetched}} columns to ensure that it returns a row if at least one of its columns is non-null.

In practice, for CQL tables, C* does not need to fetch all columns if the row has primary key liveness info, as that is enough to guarantee that the row exists. Consequently, even after the addition of the new column, C* should read only the third SSTable.

  was:
With the following SSTables:
{code}
CREATE TABLE my_table (pk int, ck int, v1 int, PRIMARY KEY(pk, ck))
INSERT INTO my_table (pk, ck, v1) VALUES (1, 1, 1) USING TIMESTAMP 1000;
--> flush
INSERT INTO my_table (pk, ck, v1) VALUES (1, 1, 2) USING TIMESTAMP 2000;
--> flush()
INSERT INTO my_table (pk, ck, v1) VALUES (1, 1, 3) USING TIMESTAMP 3000
--> flush()
{code}
the following query:
{code}SELECT pk, ck, v1 FROM my_table WHERE pk = 1 AND ck = 1{code}
will only read the third SSTable.

If we add a column to the table (e.g. {{ALTER TABLE my_table ADD v2 int}}) and rerun the query, the query will read the 3 SSTables.

The reason for this behavior is due to the fact that C* is trying to read all the {{fetched}} columns to ensure that it will return a row if at least one of its column is non null.

In practice for CQL tables, C* does not need to fetch all columns if the row contains a primary key liveness as it is enough to guaranty that the row exists. By consequence, even after the addition of the new column C* should read only the third SSTable.


> ALTER ... ADD can increase the number of SSTables being read
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-16737
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16737
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL/Semantics
>            Reporter: Benjamin Lerer
>            Assignee: Benjamin Lerer
>            Priority: Normal
>             Fix For: 3.11.x, 4.0.x
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
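
The behavior described in this issue can be illustrated with a toy model. The sketch below is not Cassandra's read path: the names {{SSTableRow}} and {{count_sstables_read}} are hypothetical, and the model assumes SSTables are visited newest first and that the read stops once every column it still needs has been found. The {{use_liveness_shortcut}} flag models the proposed behavior, where a row that has primary key liveness info only needs the queried columns rather than all fetched ones.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class SSTableRow:
    """Hypothetical stand-in for the single row stored in one SSTable."""
    timestamp: int          # write timestamp of the INSERT
    has_pk_liveness: bool   # INSERT writes primary key liveness info
    cells: Dict[str, int] = field(default_factory=dict)  # column -> value


def count_sstables_read(sstables: List[SSTableRow],
                        queried: Set[str],
                        fetched: Set[str],
                        use_liveness_shortcut: bool) -> int:
    """Return how many SSTables this toy read consults before stopping."""
    missing = set(fetched)   # columns not yet found in any SSTable
    row_is_live = False      # set once primary key liveness info is seen
    read = 0
    # Visit SSTables from newest to oldest.
    for row in sorted(sstables, key=lambda r: r.timestamp, reverse=True):
        if not missing:
            break            # every fetched column has been found
        if use_liveness_shortcut and row_is_live and not (missing & queried):
            break            # row is known to exist; queried columns found
        read += 1
        missing.difference_update(row.cells)
        row_is_live = row_is_live or row.has_pk_liveness
    return read


# The three flushed INSERTs from the description.
SSTABLES = [
    SSTableRow(1000, True, {"v1": 1}),
    SSTableRow(2000, True, {"v1": 2}),
    SSTableRow(3000, True, {"v1": 3}),
]

before_alter = count_sstables_read(SSTABLES, {"v1"}, {"v1"}, False)
after_alter = count_sstables_read(SSTABLES, {"v1"}, {"v1", "v2"}, False)
with_shortcut = count_sstables_read(SSTABLES, {"v1"}, {"v1", "v2"}, True)
```

In this model the read consults one SSTable before the {{ALTER}}, all three after it (no SSTable ever supplies {{v2}}), and one again when the liveness shortcut is enabled.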