Re: Splitting a single row into multiple

2011-02-23 Thread Aaron Morton
AFAIK performance in the single row case will be better. A multiget may require 
multiple seeks and reads in an SSTable, versus a single seek and read for a 
single row, and that cost is multiplied by the number of SSTables that contain 
row data.
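
As a rough illustration of the seek argument above, here is a minimal sketch of a worst-case cost model. It is not Cassandra code: the function name estimated_seeks and the figure of 4 SSTables are hypothetical, and it assumes one seek per row per SSTable holding data for that row, with no help from caches.

```python
def estimated_seeks(rows_read: int, sstables_per_row: int) -> int:
    """Worst-case seek count under an assumed model:
    one seek per row for every SSTable that holds data for that row."""
    if rows_read < 0 or sstables_per_row < 0:
        raise ValueError("counts must be non-negative")
    return rows_read * sstables_per_row


# Hypothetical numbers: the data lives in 4 SSTables either way.
single_row = estimated_seeks(rows_read=1, sstables_per_row=4)
split_rows = estimated_seeks(rows_read=3, sstables_per_row=4)
print(single_row, split_rows)  # 4 12
```

Under these assumptions, splitting one row into three triples the worst-case seeks for the same columns. Real numbers depend on caching and on how the data is spread across SSTables.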

Using the key cache would reduce the seeks.

If it makes sense in your app, do it. In general, though, try to model data so 
a single row read gets what you need.

Aaron

On 24/02/2011, at 5:59 AM, Aditya Narayan ady...@gmail.com wrote:

 Does it make any difference if I split a row that needs to be
 accessed together into two or three rows and then read those multiple
 rows?
 (Assume the keys of all three rows are known to me programmatically,
 since I split columns by certain categories.)
 Would the performance be any better if all three were just a single row?
 
 I guess the performance should be the same in both cases: the columns
 remain the same in quantity and they're spread across several SSTable files.


Re: Splitting a single row into multiple

2011-02-23 Thread Aditya Narayan
Thanks Aaron. I was looking into splitting the rows so that I could use
a standard CF instead of a super CF, but your argument also makes sense.



On Thu, Feb 24, 2011 at 1:19 AM, Aaron Morton aa...@thelastpickle.com wrote:
 AFAIK performance in the single row case will be better. A multiget may require 
 multiple seeks and reads in an SSTable, versus a single seek and read for a 
 single row, and that cost is multiplied by the number of SSTables that contain 
 row data.

 Using the key cache would reduce the seeks.

 If it makes sense in your app, do it. In general, though, try to model data so 
 a single row read gets what you need.

 Aaron

 On 24/02/2011, at 5:59 AM, Aditya Narayan ady...@gmail.com wrote:

 Does it make any difference if I split a row that needs to be
 accessed together into two or three rows and then read those multiple
 rows?
 (Assume the keys of all three rows are known to me programmatically,
 since I split columns by certain categories.)
 Would the performance be any better if all three were just a single row?

 I guess the performance should be the same in both cases: the columns
 remain the same in quantity and they're spread across several SSTable files.