[ https://issues.apache.org/jira/browse/CASSANDRA-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086653#comment-13086653 ]
Daniel Doubleday commented on CASSANDRA-2498:
---------------------------------------------

bq. avoid the tombstone collection by avoiding full collateColumns until the end.

Very nice and clean and solves the tombstone problem. +1

Looks all good to me

> Improve read performance in update-intensive workload
> -----------------------------------------------------
>
>                 Key: CASSANDRA-2498
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2498
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Daniel Doubleday
>            Priority: Minor
>              Labels: ponies
>             Fix For: 1.0
>
>         Attachments: 2498-v2.txt, 2498-v3.txt,
> supersede-name-filter-collations.patch
>
>
> Read performance in an update-heavy environment relies heavily on compaction
> to maintain good throughput. (This is not the case for workloads where rows
> are only inserted once, because the bloom filter keeps us from having to
> check sstables unnecessarily.)
> Very early versions of Cassandra attempted to mitigate this by checking
> sstables in descending generation order (mostly equivalent to descending
> mtime): once all the requested columns were found, it would not check any
> older sstables.
> This was incorrect, because data timestamp will not correspond to sstable
> timestamp, both because compaction has the side effect of "refreshing" data
> to a newer sstable, and because hinted handoff may send us data older than
> what we already have.
> Instead, we could create a per-sstable piece of metadata containing the most
> recent (client-specified) timestamp for any column in the sstable. We could
> then sort sstables by this timestamp instead, and perform a similar
> optimization (if the remaining sstable client-timestamps are older than the
> oldest column found in the desired result set so far, we don't need to look
> further). Since under almost every workload, client timestamps of data in a
> given sstable will tend to be similar, we expect this to cut the number of
> sstables down proportionally to how frequently each column in the row is
> updated. (If each column is updated with each write, we only have to check a
> single sstable.)
> This may also be useful information when deciding which SSTables to compact.
> (Note that this optimization is only appropriate for named-column queries,
> not slice queries, since we don't know what non-overlapping columns may exist
> in older sstables.)
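
The timestamp-ordered read proposed in the issue description could be sketched roughly as below. This is only an illustration in Python, not the code from the attached patches: `SSTable`, `max_timestamp` and `read_named_columns` are hypothetical names, sstables are modelled as plain dictionaries, and tombstones and row-level deletes are ignored. It also assumes that the early exit is only taken once every requested column has been found, since a column that is still missing could live in any older sstable.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    """Hypothetical stand-in for an sstable: column name -> (client timestamp, value)."""
    name: str
    columns: dict

    @property
    def max_timestamp(self):
        # The proposed per-sstable metadata: newest client timestamp in the file.
        return max((ts for ts, _ in self.columns.values()), default=float("-inf"))


def read_named_columns(sstables, wanted):
    """Return ({column: (timestamp, value)}, [names of the sstables actually read])."""
    result = {}
    checked = []
    if not wanted:
        return result, checked
    # Order by newest client timestamp, not by sstable generation or mtime.
    for sstable in sorted(sstables, key=lambda s: s.max_timestamp, reverse=True):
        have_all = all(name in result for name in wanted)
        if have_all and sstable.max_timestamp < min(ts for ts, _ in result.values()):
            # This sstable and all later ones hold only older data,
            # so nothing in them can supersede what was already found.
            break
        checked.append(sstable.name)
        for name in wanted:
            if name in sstable.columns:
                candidate = sstable.columns[name]
                # Keep whichever version carries the newer client timestamp.
                if name not in result or candidate[0] > result[name][0]:
                    result[name] = candidate
    return result, checked


# Every write updates both columns, so a single sstable is enough.
sstables = [
    SSTable("old", {"a": (1, "a1"), "b": (2, "b1")}),
    SSTable("mid", {"a": (5, "a2")}),
    SSTable("new", {"a": (9, "a3"), "b": (8, "b2")}),
]
result, checked = read_named_columns(sstables, ["a", "b"])
print(result)   # {'a': (9, 'a3'), 'b': (8, 'b2')}
print(checked)  # ['new']
```

The strict `<` comparison is a deliberate choice in this sketch: an sstable whose newest timestamp merely equals the oldest result timestamp is still read, so that tie-breaking between equal timestamps is left to the normal merge logic.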