@Michal: look at this for the improvement of read performance: https://issues.apache.org/jira/browse/CASSANDRA-2498
Best regards.
Jean Armel

2013/7/18 Michał Michalski <mich...@opera.com>

> SSTables are immutable - once they're written to disk, they cannot be
> changed.
>
> On read C* checks *all* SSTables [1], but to make it faster, it uses Bloom
> Filters, that can tell you if a row is *not* in a specific SSTable, so you
> don't have to read it at all. However, *if* you read it in case you have
> to, you don't read a whole SSTable - there's an in-memory Index Sample,
> that is used for binary search and returning only a (relatively) small
> block of real (full, on-disk) index, which you have to scan to find a
> place to retrieve the data from SSTable. Additionally you have a KeyCache
> to make reads faster - it points location of data in SSTable, so you don't
> have to touch Index Sample and Index at all.
>
> Once C* retrieves all data "parts" (including the Memtable part),
> timestamps are used to find the most recent version of data.
>
> [1] I believe that it's not true for all cases, as I saw a piece of code
> somewhere in the source, that starts checking SSTables in order from the
> newest to the oldest one (in terms of data timestamps - AFAIR SSTable
> MetaData stores info about smallest and largest timestamp in SSTable), and
> once the newest data for all columns are retrieved (assuming that schema is
> defined), retrieving data stops and older SSTables are not checked. If
> someone could confirm that it works this way and it's not something that I
> saw in my dream and now believe it's real, I'd be glad ;-)
>
> On 17.07.2013 22:58, S Ahmed wrote:
>
>> Since SSTables are mutable, and they are ordered, does this mean that there
>> is a index of key ranges that each SS table holds, and the value could be 1
>> more sstables that have to be scanned and then the latest one is chosen?
>>
>> e.g. Say I write a value "abc" to CF1. This gets stored in a sstable.
>>
>> Then I write "def" to CF1, this gets stored in another sstable eventually.
>>
>> How when I go to fetch the value, it has to scan 2 sstables and then figure
>> out which is the latest entry correct?
>>
>> So is there an index of key's to sstables, and there can be 1 or more
>> sstables per key?
>>
>> (This is assuming compaction hasn't occurred yet).
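To make the read path described above concrete, here is a rough Python sketch of the "abc" / "def" example from the original question. It is only a toy model, not Cassandra code: the BloomFilter and SSTable classes, the read() function and the key name "CF1-key" are all made up for illustration, and the Index Sample, on-disk index and KeyCache are left out entirely. It only shows that a Bloom filter lets a read skip SSTables that cannot hold the key, and that timestamps decide which version wins.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = [False] * size

    def _positions(self, key):
        # Derive several bit positions per key by salting the hash input.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class SSTable:
    """Never modified after construction: key -> {column: (value, timestamp)}."""

    def __init__(self, rows):
        self.rows = rows
        self.bloom = BloomFilter()
        for key in rows:
            self.bloom.add(key)
        self.reads = 0  # how many times this table was actually consulted

    def get(self, key):
        self.reads += 1
        return self.rows.get(key, {})


def read(key, memtable, sstables):
    """Collect the row "parts" and keep the newest value of each column."""
    parts = [memtable.get(key, {})]
    for table in sstables:
        # A negative Bloom filter answer means the key is definitely absent.
        if table.bloom.might_contain(key):
            parts.append(table.get(key))

    merged = {}
    for part in parts:
        for column, (value, ts) in part.items():
            if column not in merged or ts > merged[column][1]:
                merged[column] = (value, ts)
    return {column: value for column, (value, _ts) in merged.items()}


# "abc" was flushed first, "def" later; both SSTables hold the same key.
older = SSTable({"CF1-key": {"col": ("abc", 1)}})
newer = SSTable({"CF1-key": {"col": ("def", 2)}})
unrelated = SSTable({"other-key": {"col": ("xyz", 3)}})

result = read("CF1-key", memtable={}, sstables=[older, newer, unrelated])
print(result)
```

Both SSTables holding the key are read and the value with the higher timestamp ("def") is returned. The unrelated table is skipped unless its Bloom filter happens to give a false positive. The newest-first early stop mentioned in [1] is not modelled here.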