DOAN DuyHai created CASSANDRA-8180:
--------------------------------------

             Summary: Optimize disk seek using min/max colunm name meta data 
when the LIMIT clause is used
                 Key: CASSANDRA-8180
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8180
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
         Environment: Cassandra 2.0.10
            Reporter: DOAN DuyHai
            Priority: Minor


I was working on an example of sensor data table (timeseries) and face a use 
case where C* does not optimize read on disk.

{code}
cqlsh:test> CREATE TABLE test(id int, col int, val text, PRIMARY KEY(id,col)) 
WITH CLUSTERING ORDER BY (col DESC);
cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 10, '10');
...
>nodetool flush test test
...
cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 20, '20');
...
>nodetool flush test test
...
cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 30, '30');
...
>nodetool flush test test
{code}

After that, I activate request tracing:

{code}
cqlsh:test> SELECT * FROM test WHERE id=1 LIMIT 1;
 activity                                                                  | 
timestamp    | source    | source_elapsed
---------------------------------------------------------------------------+--------------+-----------+----------------
                                                        execute_cql3_query | 
23:48:46,498 | 127.0.0.1 |              0
                            Parsing SELECT * FROM test WHERE id=1 LIMIT 1; | 
23:48:46,498 | 127.0.0.1 |             74
                                                       Preparing statement | 
23:48:46,499 | 127.0.0.1 |            253
                                  Executing single-partition query on test | 
23:48:46,499 | 127.0.0.1 |            930
                                              Acquiring sstable references | 
23:48:46,499 | 127.0.0.1 |            943
                                               Merging memtable tombstones | 
23:48:46,499 | 127.0.0.1 |           1032
                                               Key cache hit for sstable 3 | 
23:48:46,500 | 127.0.0.1 |           1160
                               Seeking to partition beginning in data file | 
23:48:46,500 | 127.0.0.1 |           1173
                                               Key cache hit for sstable 2 | 
23:48:46,500 | 127.0.0.1 |           1889
                               Seeking to partition beginning in data file | 
23:48:46,500 | 127.0.0.1 |           1901
                                               Key cache hit for sstable 1 | 
23:48:46,501 | 127.0.0.1 |           2373
                               Seeking to partition beginning in data file | 
23:48:46,501 | 127.0.0.1 |           2384
 Skipped 0/3 non-slice-intersecting sstables, included 0 due to tombstones | 
23:48:46,501 | 127.0.0.1 |           2768
                                Merging data from memtables and 3 sstables | 
23:48:46,501 | 127.0.0.1 |           2784
                                        Read 2 live and 0 tombstoned cells | 
23:48:46,501 | 127.0.0.1 |           2976
                                                          Request complete | 
23:48:46,501 | 127.0.0.1 |           3551
{code}

We can clearly see that C* hits 3 SSTables on disk instead of just one, 
although it has the min/max column meta data to decide which SSTable contains 
the most recent data.

Funny enough, if we add a clause on the clustering column to the select, this 
time C* optimizes the read path:

{code}
cqlsh:test> SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1;
 activity                                                                  | 
timestamp    | source    | source_elapsed
---------------------------------------------------------------------------+--------------+-----------+----------------
                                                        execute_cql3_query | 
23:52:31,888 | 127.0.0.1 |              0
               Parsing SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1; | 
23:52:31,888 | 127.0.0.1 |             60
                                                       Preparing statement | 
23:52:31,888 | 127.0.0.1 |            277
                                  Executing single-partition query on test | 
23:52:31,889 | 127.0.0.1 |            961
                                              Acquiring sstable references | 
23:52:31,889 | 127.0.0.1 |            971
                                               Merging memtable tombstones | 
23:52:31,889 | 127.0.0.1 |           1020
                                               Key cache hit for sstable 3 | 
23:52:31,889 | 127.0.0.1 |           1108
                               Seeking to partition beginning in data file | 
23:52:31,889 | 127.0.0.1 |           1117
 Skipped 2/3 non-slice-intersecting sstables, included 0 due to tombstones | 
23:52:31,889 | 127.0.0.1 |           1611
                                Merging data from memtables and 1 sstables | 
23:52:31,890 | 127.0.0.1 |           1624
                                        Read 1 live and 0 tombstoned cells | 
23:52:31,890 | 127.0.0.1 |           1700
                                                          Request complete | 
23:52:31,890 | 127.0.0.1 |           2140
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to