[ https://issues.apache.org/jira/browse/CASSANDRA-8731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305410#comment-14305410 ]
Benedict commented on CASSANDRA-8731: ------------------------------------- Correct. I plan(ned) to replace the CF objects in the new storage format. 8099 does make a partial change to CF objects inviting, but may be a bit ambitious for 3.0. > Optimise merges involving multiple clustering columns > ----------------------------------------------------- > > Key: CASSANDRA-8731 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8731 > Project: Cassandra > Issue Type: Improvement > Components: Core > Reporter: Benedict > Labels: performance > Fix For: 3.0 > > > Since the new storage format is dead in the water for the moment, we should > do our best to optimise current behaviour. When merging data from multiple > sstables with multiple clustering columns, currently we must incur the full > costs of comparison for the entire matching prefix, and must heapify every > cell in our PriorityQueue, incurring lg(N) of these costlier comparisons for > every cell we merge, where N is the number of sources we're merging. > Essentially I'm proposing a trie-based merge approach as a replacement for > the ManyToOne MergeIterator, wherein we treat each clustering component as a > tree underwhich all Cells with a common prefix occur. We then perform a tree > merge, rather than a flat merge. For byte-order fields this trie can even be > a full binary-trie (although built on the fly). The advantage here is that we > rapidly prune merges involving disjoint ranges, so that instead of always > incurring lg(N) costs on each new record, we may often incur O(1) costs. For > timeseries data, for instance, we could merge dozens of files and so long as > they were non-overlapping our CPU burden would be little more than reading > from a single file. > On top of this, we no longer incur any of the shared prefix repetition costs, > since we compare each prefix piece-wise, and only once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)