[ https://issues.apache.org/jira/browse/CASSANDRA-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-2498:
--------------------------------------

    Attachment: 2498-v3.txt

v3 attached.

bq. CollationController.collectSSTablesWith*Filter should move to the filter implementations, right?

Feels like one step forward, one step back to me, since that starts to de-encapsulate "iterables." Also, special-casing counters means that we use the "slice" method for name-based counter queries. (Renamed the methods to reflect this.)

bq. Could maintain the SSTables in DataTracker.View in sorted order according to SSTable.sortNewestDataFirst

Good idea. Done.

bq. The comment "Caller is responsible for final removeDeleted" isn't relevant to collectSSTablesWithNameFilter

Are you sure? It's not the _immediate_ caller anymore, but we're still going back up to CFS.getCF via getTLC, and we still need to do a final removeDeleted there.

bq. we'll need to special case Counters here

Done.

> Improve read performance in update-intensive workload
> -----------------------------------------------------
>
>                 Key: CASSANDRA-2498
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2498
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>              Labels: ponies
>             Fix For: 1.0
>
>         Attachments: 2498-v2.txt, 2498-v3.txt,
> supersede-name-filter-collations.patch
>
>
> Read performance in an update-heavy environment relies heavily on compaction
> to maintain good throughput. (This is not the case for workloads where rows
> are only inserted once, because the bloom filter keeps us from having to
> check sstables unnecessarily.)
>
> Very early versions of Cassandra attempted to mitigate this by checking
> sstables in descending generation order (mostly equivalent to descending
> mtime): once all the requested columns were found, it would not check any
> older sstables.
> This was incorrect, because data timestamp will not correspond to sstable
> timestamp, both because compaction has the side effect of "refreshing" data
> to a newer sstable, and because hinted handoff may send us data older than
> what we already have.
>
> Instead, we could create a per-sstable piece of metadata containing the most
> recent (client-specified) timestamp for any column in the sstable. We could
> then sort sstables by this timestamp instead, and perform a similar
> optimization (if the remaining sstable client-timestamps are older than the
> oldest column found in the desired result set so far, we don't need to look
> further). Since under almost every workload, client timestamps of data in a
> given sstable will tend to be similar, we expect this to cut the number of
> sstables down proportionally to how frequently each column in the row is
> updated. (If each column is updated with each write, we only have to check a
> single sstable.)
>
> This may also be useful information when deciding which SSTables to compact.
>
> (Note that this optimization is only appropriate for named-column queries,
> not slice queries, since we don't know what non-overlapping columns may exist
> in older sstables.)
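
The timestamp-ordered read proposed in the description can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Cassandra's implementation: `TimestampOrderedRead`, `SSTable`, `Result`, `maxTimestamp`, and `readNamed` are hypothetical names, and a column is reduced to a name mapped to its client timestamp. The sketch sorts sstables by their newest client timestamp and stops once every requested column has been found and the next sstable's newest timestamp is older than the oldest column found so far.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TimestampOrderedRead {

    /** Hypothetical stand-in for an sstable: column name -> client timestamp. */
    static final class SSTable {
        final Map<String, Long> columns;
        /** Per-sstable metadata: most recent client timestamp of any column. */
        final long maxTimestamp;

        SSTable(Map<String, Long> columns) {
            this.columns = columns;
            long max = Long.MIN_VALUE;
            for (long ts : columns.values()) {
                max = Math.max(max, ts);
            }
            this.maxTimestamp = max;
        }
    }

    /** Newest timestamp found per requested column, plus how many sstables were consulted. */
    static final class Result {
        final Map<String, Long> columns;
        final int sstablesRead;

        Result(Map<String, Long> columns, int sstablesRead) {
            this.columns = columns;
            this.sstablesRead = sstablesRead;
        }
    }

    /** Named-column read only; the early exit is not valid for slice queries. */
    static Result readNamed(List<SSTable> sstables, Set<String> names) {
        Map<String, Long> found = new HashMap<>();
        if (names.isEmpty()) {
            return new Result(found, 0);
        }

        // Sort by newest client timestamp, descending (not by generation or mtime).
        List<SSTable> newestFirst = new ArrayList<>(sstables);
        newestFirst.sort(Comparator.comparingLong((SSTable s) -> s.maxTimestamp).reversed());

        int read = 0;
        for (SSTable sstable : newestFirst) {
            // Stop when all requested columns are found and nothing in this
            // (or any later) sstable can be newer than the oldest one we hold.
            if (found.size() == names.size()
                    && sstable.maxTimestamp < Collections.min(found.values())) {
                break;
            }
            read++;
            for (String name : names) {
                Long ts = sstable.columns.get(name);
                if (ts != null) {
                    found.merge(name, ts, Long::max); // keep the newest version
                }
            }
        }
        return new Result(found, read);
    }
}
```

With three sstables whose newest timestamps are 30, 20, and 10, and both requested columns present in the newest one, this sketch consults a single sstable, which matches the "each column is updated with each write" case in the description. Tombstones and the final removeDeleted pass are deliberately left out.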