[ 
https://issues.apache.org/jira/browse/CASSANDRA-8180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182733#comment-14182733
 ] 

DOAN DuyHai commented on CASSANDRA-8180:
----------------------------------------

The idea is:

If there is no restriction on clustering columns but only on LIMIT:

1) Order SSTables by min/max column depending on the first clustering order
2) Hit the first SSTable and start sequential read until reaching LIMIT
3) If LIMIT is large enough, switch to another SSTable and so on



> Optimize disk seek using min/max colunm name meta data when the LIMIT clause 
> is used
> ------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8180
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8180
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Cassandra 2.0.10
>            Reporter: DOAN DuyHai
>            Assignee: Tyler Hobbs
>            Priority: Minor
>
> I was working on an example of sensor data table (timeseries) and face a use 
> case where C* does not optimize read on disk.
> {code}
> cqlsh:test> CREATE TABLE test(id int, col int, val text, PRIMARY KEY(id,col)) 
> WITH CLUSTERING ORDER BY (col DESC);
> cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 10, '10');
> ...
> >nodetool flush test test
> ...
> cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 20, '20');
> ...
> >nodetool flush test test
> ...
> cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 30, '30');
> ...
> >nodetool flush test test
> {code}
> After that, I activate request tracing:
> {code}
> cqlsh:test> SELECT * FROM test WHERE id=1 LIMIT 1;
>  activity                                                                  | 
> timestamp    | source    | source_elapsed
> ---------------------------------------------------------------------------+--------------+-----------+----------------
>                                                         execute_cql3_query | 
> 23:48:46,498 | 127.0.0.1 |              0
>                             Parsing SELECT * FROM test WHERE id=1 LIMIT 1; | 
> 23:48:46,498 | 127.0.0.1 |             74
>                                                        Preparing statement | 
> 23:48:46,499 | 127.0.0.1 |            253
>                                   Executing single-partition query on test | 
> 23:48:46,499 | 127.0.0.1 |            930
>                                               Acquiring sstable references | 
> 23:48:46,499 | 127.0.0.1 |            943
>                                                Merging memtable tombstones | 
> 23:48:46,499 | 127.0.0.1 |           1032
>                                                Key cache hit for sstable 3 | 
> 23:48:46,500 | 127.0.0.1 |           1160
>                                Seeking to partition beginning in data file | 
> 23:48:46,500 | 127.0.0.1 |           1173
>                                                Key cache hit for sstable 2 | 
> 23:48:46,500 | 127.0.0.1 |           1889
>                                Seeking to partition beginning in data file | 
> 23:48:46,500 | 127.0.0.1 |           1901
>                                                Key cache hit for sstable 1 | 
> 23:48:46,501 | 127.0.0.1 |           2373
>                                Seeking to partition beginning in data file | 
> 23:48:46,501 | 127.0.0.1 |           2384
>  Skipped 0/3 non-slice-intersecting sstables, included 0 due to tombstones | 
> 23:48:46,501 | 127.0.0.1 |           2768
>                                 Merging data from memtables and 3 sstables | 
> 23:48:46,501 | 127.0.0.1 |           2784
>                                         Read 2 live and 0 tombstoned cells | 
> 23:48:46,501 | 127.0.0.1 |           2976
>                                                           Request complete | 
> 23:48:46,501 | 127.0.0.1 |           3551
> {code}
> We can clearly see that C* hits 3 SSTables on disk instead of just one, 
> although it has the min/max column meta data to decide which SSTable contains 
> the most recent data.
> Funny enough, if we add a clause on the clustering column to the select, this 
> time C* optimizes the read path:
> {code}
> cqlsh:test> SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1;
>  activity                                                                  | 
> timestamp    | source    | source_elapsed
> ---------------------------------------------------------------------------+--------------+-----------+----------------
>                                                         execute_cql3_query | 
> 23:52:31,888 | 127.0.0.1 |              0
>                Parsing SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1; | 
> 23:52:31,888 | 127.0.0.1 |             60
>                                                        Preparing statement | 
> 23:52:31,888 | 127.0.0.1 |            277
>                                   Executing single-partition query on test | 
> 23:52:31,889 | 127.0.0.1 |            961
>                                               Acquiring sstable references | 
> 23:52:31,889 | 127.0.0.1 |            971
>                                                Merging memtable tombstones | 
> 23:52:31,889 | 127.0.0.1 |           1020
>                                                Key cache hit for sstable 3 | 
> 23:52:31,889 | 127.0.0.1 |           1108
>                                Seeking to partition beginning in data file | 
> 23:52:31,889 | 127.0.0.1 |           1117
>  Skipped 2/3 non-slice-intersecting sstables, included 0 due to tombstones | 
> 23:52:31,889 | 127.0.0.1 |           1611
>                                 Merging data from memtables and 1 sstables | 
> 23:52:31,890 | 127.0.0.1 |           1624
>                                         Read 1 live and 0 tombstoned cells | 
> 23:52:31,890 | 127.0.0.1 |           1700
>                                                           Request complete | 
> 23:52:31,890 | 127.0.0.1 |           2140
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to