[ https://issues.apache.org/jira/browse/CASSANDRA-18134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692877#comment-17692877 ]
Jacek Lewandowski commented on CASSANDRA-18134:
-----------------------------------------------

[~mck] can we close it?

> Improve handling of min/max clustering in sstable
> -------------------------------------------------
>
>                 Key: CASSANDRA-18134
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18134
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/SSTable
>            Reporter: Jacek Lewandowski
>            Assignee: Jacek Lewandowski
>            Priority: Normal
>             Fix For: 4.2
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> This patch improves the following things:
> # SSTable metadata will store a covered slice in addition to the min/max
> clusterings. The difference is that a slice carries the type of each bound
> rather than just a clustering. In particular, it will tell whether the lower
> and upper bounds of an sstable are open or closed. The legacy min/max
> clustering will be stored until a new major format {{o}} to ensure backward
> compatibility
> # SSTable metadata will store a flag indicating whether the SSTable contains
> any partition level deletions
> # SSTable metadata will store the first and the last keys of the sstable.
> This is mostly for consistency - the key range is logically a part of stats
> metadata. So far it has been stored at the end of the index summary. After
> this change, the index summary will no longer be needed to read the key range
> of an sstable (although we will keep storing the key range as before for
> compatibility reasons)
> # The above two changes required introducing a new minor format for SSTables
> - {{nc}}
> # Single partition read command makes use of the above changes.
> In particular, an sstable can be skipped when it does not intersect with the
> column filter, does not have partition level deletions and does not have
> statics. If there are partition level deletions but the other conditions are
> satisfied, only the partition header needs to be accessed (tests attached)
> # Skipping sstables when those three conditions are satisfied has also been
> implemented for partition range queries (tests attached). Also added a
> separate minor statistic to record the number of accessed sstables in
> partition reads, because now not all of them need to be accessed. That
> statistic is also needed in tests to confirm skipping.
> # Artificial lower bound marker is now an object of its own and is not
> implemented as a special case of range tombstone bound. Instead it sorts
> right before the lowest available bound in the data
> # Extended the lower bound optimization usage due to 1 and 2
> # Do not initialize an iterator just to get a cached partition and associated
> columns index. The purpose of using lower bound optimization was to avoid
> opening an iterator of an sstable if possible.
> See also CASSANDRA-14861
> The changes in this patch include work of [~blambov], [~slebresne],
> [~jakubzytka] and [~jlewandowski]
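
The skip rule described for single partition reads can be illustrated with a small sketch. This is not the code from the patch: every name below (`SSTableSkipSketch`, `Bound`, `Slice`, `Access`, `decide`) is hypothetical, and clusterings are simplified to plain integers. It only shows why knowing whether a bound is open or closed can let a read skip an sstable that closed min/max clusterings alone would force it to open.

```java
// Hypothetical sketch: none of these types are Cassandra classes.
final class SSTableSkipSketch {

    enum Access { SKIP, PARTITION_HEADER_ONLY, FULL_READ }

    /** One end of a slice; a clustering is simplified to an int here. */
    static final class Bound {
        final int value;
        final boolean inclusive; // true = closed bound, false = open bound
        Bound(int value, boolean inclusive) { this.value = value; this.inclusive = inclusive; }
    }

    /** A range of clusterings, e.g. the slice covered by an sstable or a query filter. */
    static final class Slice {
        final Bound start;
        final Bound end;
        Slice(Bound start, Bound end) { this.start = start; this.end = end; }
    }

    /** True when 'start' lies past 'end', so no clustering can satisfy both. */
    static boolean startsAfter(Bound start, Bound end) {
        if (start.value != end.value) return start.value > end.value;
        // Equal values only meet when both bounds are closed.
        return !(start.inclusive && end.inclusive);
    }

    static boolean intersects(Slice a, Slice b) {
        return !startsAfter(a.start, b.end) && !startsAfter(b.start, a.end);
    }

    /** Applies the three conditions from the description to pick an access level. */
    static Access decide(Slice covered, Slice queried,
                         boolean hasPartitionLevelDeletions, boolean hasStatics) {
        if (intersects(covered, queried) || hasStatics) return Access.FULL_READ;
        // No matching rows and no statics: at most the partition deletion matters.
        return hasPartitionLevelDeletions ? Access.PARTITION_HEADER_ONLY : Access.SKIP;
    }

    public static void main(String[] args) {
        Slice covered = new Slice(new Bound(1, true), new Bound(5, false)); // [1, 5)
        Slice queried = new Slice(new Bound(5, true), new Bound(9, true));  // [5, 9]
        System.out.println(decide(covered, queried, false, false));
        System.out.println(decide(covered, queried, true, false));
    }
}
```

In this sketch, an sstable covering `[1, 5)` does not intersect a query for `[5, 9]`. With only closed min/max values (`[1, 5]`) the two ranges would touch at 5 and the sstable would have to be read.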