Reading 1 column is faster than reading lots of columns. This shouldn't be surprising.
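To compare the two runs fairly, it may help to divide each total by the number of columns actually returned rather than by the number of calls. The sketch below is only arithmetic on the timings quoted in this thread. The helper name `per_column_seconds` is hypothetical, and the thread does not say how many column names each batch call carried, so the value 10 is an assumption picked purely for illustration.

```python
def per_column_seconds(total_seconds, calls, columns_per_call):
    """Average wall-clock time spent per column returned."""
    return total_seconds / (calls * columns_per_call)

# Timings quoted in the thread (seconds for 100 calls each, 3-node cluster).
SINGLE_TOTAL = 0.383   # multiget_slice() with 1 key and 1 column name
BATCH_TOTAL = 0.786    # multiget_slice() with 1 key and multiple column names

# ASSUMPTION: the number of column names per batch call is not given in
# the thread; 10 is a made-up value, substitute the real count.
ASSUMED_COLUMNS_PER_BATCH = 10

single = per_column_seconds(SINGLE_TOTAL, 100, 1)
batch = per_column_seconds(BATCH_TOTAL, 100, ASSUMED_COLUMNS_PER_BATCH)

print("per column, single-column calls: %.6f s" % single)
print("per column, batch calls:         %.6f s" % batch)
print("batch is cheaper per column:", batch < single)
```

Under that assumed count, each batch call takes longer in total but costs less per column, which is the comparison that matters when many columns are needed.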
On Fri, Jun 4, 2010 at 3:52 PM, Arya Goudarzi <agouda...@gaiaonline.com> wrote:
> Hi Fellows,
>
> I have the following design for a system which holds basically key->value
> pairs (aka Columns) for each user (SuperColumn Key) in different namespaces
> (SuperColumnFamily row key).
>
> Like this:
>
> Namesapce->user->column_name = column_value;
>
> keyspaces:
>     - name: NKVP
>       replica_placement_strategy:
>           org.apache.cassandra.locator.RackUnawareStrategy
>       replication_factor: 3
>       column_families:
>         - name: Namespaces
>           column_type: Super
>           compare_with: BytesType
>           compare_subcolumns_with: BytesType
>           rows_cached: 20000
>           keys_cached: 100
>
> Cluster using random partitioner.
>
> I use multiget_slice() for fetching 1 or many columns inside the child
> supercolumn at the same time. This is an awkward performance result I get:
>
> 100 sequential reads completed in : 0.383 this uses multiget_slice() with
> 1 key, and 1 column name inside the predicate->column_names
> 100 batch loaded completed in : 0.786 this uses multiget_slice() with 1
> key, and multiple column names inside the predicate->column_names
>
> read/write consistency are ONE.
>
> Questions:
>
> Why doing 100 sequential reads is faster than doing 100 in batch?
> Is this a good design for my problem?
> Does my issue relate to https://issues.apache.org/jira/browse/CASSANDRA-598?
>
> Now on a single node with replication factor 1 I get this:
>
> 100 sequential reads completed in : 0.438
> 100 batch loaded completed in : 0.800
>
> Please advice as to why is this happening?
>
> These nodes are VMs. 1 CPU and 1 Gb.
>
> Best Regards,
> =Arya

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com