[ https://issues.apache.org/jira/browse/CASSANDRA-8906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14348606#comment-14348606 ]
Sylvain Lebresne commented on CASSANDRA-8906: --------------------------------------------- The idea is to reduce the number of row comparison within a given partition key, typically for time series data. Say you do a read on a given partition, for which we have 2 sstables A and B, but the clustering value for the row of A are strictly before the ones of B. Then we'll still end up doing tons of comparisons when merging (for every row in A, which their can be a lot). But if we know that there is no row overlapping over those 2 sstables for that partition, merging can just be "return all the row of A, then all those of B" and no comparison is involved (and the price being whatever it takes us to decide that there is no overlapping for that partition in the first place, which will involve a couple of comparisons, but will potentially save tons of them). Of course, for that, we need to be able to decide cheaply (and as precisely as possible) that 2 sstable don't overlap for a given partition. As said above, a very simple heuristic is using the sstable min/max clustering values, but that has the downside of not being terribly precise. A more precise option would be to use the in-partition indexes but that's quite a bit more involved. Now, for completeness sake, let's add that: * if there is overlaps within the partition considered, the check for overlap will be wasted CPU, so we want that check to be cheap but it would also make sense to only do it when we believe there is a chance it could be useful (say, when date tiered compaction is in use?). * if we use the simple check based on min/max clustering values, then even with date tiered, it's likely that 2 consecutive sstable will still be considered overlapping even if they have very little overlap in practice (unless we something along the lines of CASSANDRA-8737). And while this could still be useful when we're merging more than 2 sstables today, CASSANDRA-8915 should make this moot. Long story short, provided we do CASSANDRA-8915 (which we should definitively do), I don't think the simple min/max clustering values based check will be much useful. Using the in-partition index to guide merging could however help quite a bit much, but it's possibly quite involved, so I'd like to check if this can be done in a reasonably simple way at some point, but this might not pan out. > Experiment with optimizing partition merging when we can prove that some > sources don't overlap > ---------------------------------------------------------------------------------------------- > > Key: CASSANDRA-8906 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8906 > Project: Cassandra > Issue Type: Improvement > Reporter: Sylvain Lebresne > > When we merge a partition from two sources and it turns out that those 2 > sources don't overlap for that partition, we still end up doing one > comparison by row in the first source. However, if we can prove that the 2 > sources don't overlap, for example by using the sstable min/max clustering > values that we store, we could speed this up. Note that it practice it's > little bit more hairy because we need to deal with N sources, but that's > probably not too hard either. > I'll note that using the sstable min/max clustering values is not terribly > precise. We could do better if we were to push the same reasoning inside the > merge iterator, by for instance using the sstable per-partition index, which > can in theory tell use things like "don't bother comparing rows until the end > of this row block". This is quite a bit more involved though so maybe note > worth the complexity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)