[ https://issues.apache.org/jira/browse/CASSANDRA-4011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563179#comment-13563179 ]
Jonathan Ellis commented on CASSANDRA-4011:
-------------------------------------------

DataTracker.intervalTree is used on the read path regardless of compactionstrategy.

> range-based log(n) elimination of sstables in read path
> -------------------------------------------------------
>
> Key: CASSANDRA-4011
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4011
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Peter Schuller
>
> If the read path was able to eliminate sstables based on token ranges, we
> would avoid {{O(n)}} bloom filter checks ({{n}} being number of sstables).
> Contributing motivation:
> * For maximally efficient bulk-import, you tend to want a lot of small
> sstables to avoid having to build up huge ones during the bulk creation
> process.
> * To avoid having to keep duplicate data when switching a data set (in a
> periodic bulk replace import process), keeping sstables partitioned on token
> range (similarly to leveled compaction) allows in-place replacement of
> sstables one sstable at a time.
> Those two in combination would mean that you can run a bulk-import based
> total-dataset-replacement cluster with zero compaction and with zero disk
> space overhead stemming from having to have overhead for compaction.
> In addition:
> * For e.g. leveled compaction where we have range based partitioning anyway,
> {{log(n)}} is preferable to {{o(n)}}; especially if it would allow us to have
> more than 10 "partitions" per level. I'm not sure yet whether there are other
> reasons to have "only" 10, but if we can make them smaller by eliminating the
> {{o(n)}} behavior in the read path, individual compactions can be even
> smaller with leveled and you would scale even more easily with large data
> sets while avoiding build-up in L0.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
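
For illustration only, the log(n) elimination described in the issue can be sketched as a binary search over sstables sorted by token range. This is not Cassandra code and not how DataTracker.intervalTree is implemented; SSTableRange, RangeIndex and candidate are hypothetical names, tokens are modelled as plain integers, and the sketch assumes the ranges do not overlap (as within a single level under leveled compaction). Overlapping ranges would need an interval tree instead.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTableRange:
    """Hypothetical stand-in for an sstable and the token range it covers."""
    name: str
    first_token: int  # inclusive lower bound
    last_token: int   # inclusive upper bound


class RangeIndex:
    """Sorted index over sstables with non-overlapping token ranges (assumed)."""

    def __init__(self, sstables):
        self._tables = sorted(sstables, key=lambda s: s.first_token)
        # Reject overlapping input, since the lookup below relies on it.
        for prev, cur in zip(self._tables, self._tables[1:]):
            if cur.first_token <= prev.last_token:
                raise ValueError(f"{prev.name} overlaps {cur.name}")
        self._last_tokens = [s.last_token for s in self._tables]

    def candidate(self, token):
        """Return the only sstable that may contain `token`, or None.

        One binary search, O(log n), instead of consulting every
        sstable's bloom filter, O(n).
        """
        # Index of the first range that ends at or after the token.
        i = bisect_left(self._last_tokens, token)
        if i < len(self._tables) and self._tables[i].first_token <= token:
            return self._tables[i]
        return None


if __name__ == "__main__":
    index = RangeIndex([
        SSTableRange("a", 0, 99),
        SSTableRange("b", 100, 199),
        SSTableRange("c", 300, 399),
    ])
    print(index.candidate(150))  # falls inside sstable "b"
    print(index.candidate(250))  # gap between "b" and "c": no candidate
```

Only the sstable returned by candidate would then need its bloom filter consulted; a token that falls in a gap between ranges needs no bloom filter check at all.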