[ https://issues.apache.org/jira/browse/CASSANDRA-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710290#comment-13710290 ]
Tyler Hobbs commented on CASSANDRA-5722:
----------------------------------------

bq. If you think about it, you can see why that might be so – we only have to scan extra rows on a bloom filter false positive. The common case by the time we start looping through rows is that the row we're looking for exists.

Ahh, that makes perfect sense. Thanks!

bq. I can squash it down and commit if you're good with the changes.

+1, sounds good.

> Cleanup should skip sstables that don't contain data outside a nodes ranges
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5722
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5722
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Nick Bailey
>            Assignee: Tyler Hobbs
>             Fix For: 2.0.1
>
>         Attachments: 0001-Skip-cleanup-when-unneeded.patch
>
>
> Right now cleanup is optimized to simply delete sstables that *only* contain
> data that doesn't belong on the node, for all other sstables though, it will
> read them, check each row, and write out new sstables.
> Cleanup could be optimized to look at an sstable and determine that all data
> within the sstable does belong on a node, and therefore skip re-writing that
> sstable. This would make cleanup essentially a noop in the case where all
> data on a node belongs on that node.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
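
For readers following the issue description, the proposed check ("all data within the sstable does belong on a node") might be sketched roughly as below. This is an illustration only, not the code in the attached patch: `CleanupCheck`, `TokenRange`, `SSTableSpan`, and `needsCleanup` are hypothetical names, tokens are simplified to `long` values, and owned ranges are assumed to be non-wrapping intervals of the form (left, right]. The sketch is deliberately conservative: it skips an sstable only when its first and last tokens both fall inside a single owned range.

```java
import java.util.List;

/** Hypothetical sketch of the "skip cleanup when unneeded" idea; not Cassandra's real classes. */
final class CleanupCheck {

    /** An owned token range (left, right]: left exclusive, right inclusive, no wraparound (assumption). */
    static final class TokenRange {
        final long left;
        final long right;

        TokenRange(long left, long right) {
            this.left = left;
            this.right = right;
        }

        boolean contains(long token) {
            return token > left && token <= right;
        }
    }

    /** Minimal stand-in for sstable metadata: the smallest and largest token stored in it. */
    static final class SSTableSpan {
        final long firstToken;
        final long lastToken;

        SSTableSpan(long firstToken, long lastToken) {
            this.firstToken = firstToken;
            this.lastToken = lastToken;
        }
    }

    /**
     * Returns false only when the sstable's whole token span lies inside one owned range,
     * in which case every row belongs on this node and rewriting can be skipped.
     * Any other case falls back to the normal row-by-row cleanup.
     */
    static boolean needsCleanup(SSTableSpan sstable, List<TokenRange> ownedRanges) {
        for (TokenRange range : ownedRanges) {
            // Ranges are contiguous, so if both endpoints are inside, everything between them is too.
            if (range.contains(sstable.firstToken) && range.contains(sstable.lastToken)) {
                return false;
            }
        }
        return true;
    }
}
```

With owned ranges (0, 100] and (200, 300], an sstable spanning tokens 10 to 90 would be skipped, while one spanning 50 to 250 would still be rewritten. An sstable whose rows fall in several owned ranges but never in the gaps between them is also rewritten under this sketch, since first and last tokens alone cannot rule out rows in a gap.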