[ https://issues.apache.org/jira/browse/CASSANDRA-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710074#comment-13710074 ]
Jonathan Ellis commented on CASSANDRA-5722:
-------------------------------------------

bq. is the cost of decorating index keys so high that it outweighs the savings from exiting the loop earlier when a greater key is found?

That's exactly what we want to do, and that's what the {{indexDecoratedKey.compareTo(position) > 0}} check does for us.

The part I removed allows us to skip this check when we don't find a key greater than the position before we've finished the block where such a key would exist, i.e., it saves us exactly one iteration of the loop [since the first key out of the next block is guaranteed to be greater]. I thought that fell into the category of premature optimization and took it out so it was clearer what we're doing. Did I miss something?

> Cleanup should skip sstables that don't contain data outside a nodes ranges
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5722
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5722
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Nick Bailey
>            Assignee: Tyler Hobbs
>             Fix For: 2.0.1
>
>         Attachments: 0001-Skip-cleanup-when-unneeded.patch
>
>
> Right now cleanup is optimized to simply delete sstables that *only* contain
> data that doesn't belong on the node, for all other sstables though, it will
> read them, check each row, and write out new sstables.
> Cleanup could be optimized to look at an sstable and determine that all data
> within the sstable does belong on a node, and therefore skip re-writing that
> sstable. This would make cleanup essentially a noop in the case where all
> data on a node belongs on that node.
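
For readers following the comment, the early-exit check it describes can be illustrated with a minimal Java sketch. This is not the code from the attached patch: the class {{FirstKeyBeyondSketch}}, the method {{firstKeyBeyond}}, and the use of plain {{long}} values in place of decorated index keys are all assumptions made for illustration. It only shows that, when keys are scanned in sorted order, a single "greater than position" comparison is enough to stop the loop at the right place.

```java
import java.util.OptionalLong;

// Hypothetical stand-in for the index scan discussed in the comment; not Cassandra code.
final class FirstKeyBeyondSketch {
    private FirstKeyBeyondSketch() {}

    /**
     * Returns the first key strictly greater than position, or empty if no
     * such key exists. sortedKeys must be in ascending order; the long values
     * stand in for decorated index keys (an assumption of this sketch).
     */
    static OptionalLong firstKeyBeyond(long[] sortedKeys, long position) {
        for (long key : sortedKeys) {
            // Early exit: because keys are sorted, the first key that compares
            // greater than position is the answer and the scan can stop here.
            if (Long.compare(key, position) > 0) {
                return OptionalLong.of(key);
            }
        }
        // Ran off the end without finding a greater key.
        return OptionalLong.empty();
    }
}
```

In this sketch, an extra shortcut for stopping at a block boundary would avoid at most one more comparison, which appears to be the trade-off the comment describes.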