> -If you perform a query for a specific row key and a column name, does
> it read the most recent SSTable first and if it finds a hit, does it
> stop there or does it need to read through all the SStables (to find
> most recent one) regardless of whether if found a hit on the most
> recent SSTable or not?

It reads all SSTables, because the only way to know which instance of the column has the highest timestamp is to read them all.

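To make the timestamp point concrete, here is a minimal Python sketch of the reconciliation idea. It is a toy model under stated assumptions, not Cassandra's actual read path: each SSTable is modelled as a plain dict, and the names `read_column` and `read_slice` are made up for illustration. It only shows why a hit in one SSTable does not let the read stop early, and why a slice has to merge every SSTable that holds data for the row.

```python
from typing import Dict, List, Optional, Tuple

# Toy model (assumption): a column is (value, write timestamp) and an
# SSTable maps row key -> column name -> column.
Column = Tuple[str, int]
SSTable = Dict[str, Dict[str, Column]]


def read_column(sstables: List[SSTable], row_key: str, name: str) -> Optional[Column]:
    """Return the instance of the column with the highest timestamp."""
    newest: Optional[Column] = None
    for table in sstables:
        col = table.get(row_key, {}).get(name)
        # A hit is not enough to stop: another SSTable may hold a newer write.
        if col is not None and (newest is None or col[1] > newest[1]):
            newest = col
    return newest


def read_slice(sstables: List[SSTable], row_key: str,
               start: str, finish: str) -> Dict[str, Column]:
    """Merge columns in [start, finish] from every SSTable holding the row."""
    merged: Dict[str, Column] = {}
    for table in sstables:
        row = table.get(row_key)
        if row is None:
            continue  # this SSTable has no data for the row
        for name, col in row.items():
            in_range = start <= name <= finish
            if in_range and (name not in merged or col[1] > merged[name][1]):
                merged[name] = col
    return dict(sorted(merged.items()))
```

In this model the order of the SSTables does not matter; only the per-column timestamps decide which value wins.
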
> - If I perform a slice query on a column range does cassandra iterate
> all the SS tables?

All SSTables that contain any data for the row.

(background http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/)

> So I am wondering which option would be most efficient from read point of
> view.

I would go with the first, 64MB columns will be a pain.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 7/10/2011, at 7:50 AM, Ramesh Natarajan wrote:

> Lets assume I perform frequent insert & update on a column family..
> Over a period of time multiple sstables will have this row/column
> data.
> I have 2 questions about how reads work in cassandra w.r.t. multiple SS
> tables.
>
> -If you perform a query for a specific row key and a column name, does
> it read the most recent SSTable first and if it finds a hit, does it
> stop there or does it need to read through all the SStables (to find
> most recent one) regardless of whether if found a hit on the most
> recent SSTable or not?
>
> - If I perform a slice query on a column range does cassandra iterate
> all the SS tables?
>
> We have an option to create
>
> 1st option:
>
> Key1 | COL1 | COL2 | COL3 ..... <multiple columns >
>
> We need to perform a slice query to get COL1-COL3 using key1.
>
> 2nd option:
>
> Key1 | <COL as one column and have application place values of
> COL1-COLN in this one column>
>
> This key would be updated several times where the app would manage
> adding multiple values to the one column key. Our max col value size
> will be less than 64mb. When you need to search for a value, we would
> read the one column and the application would manage looking up the
> appropriate value in the list of values.
>
> So I am wondering which option would be most efficient from read point of
> view.
>
> thanks
> Ramesh