Hi Fellows,

I have the following design for a system that basically holds key->value pairs (aka columns) for each user (SuperColumn key) in different namespaces (SuperColumnFamily row key).
Like this: Namespace->user->column_name = column_value;

keyspaces:
    - name: NKVP
      replica_placement_strategy: org.apache.cassandra.locator.RackUnawareStrategy
      replication_factor: 3
      column_families:
        - name: Namespaces
          column_type: Super
          compare_with: BytesType
          compare_subcolumns_with: BytesType
          rows_cached: 20000
          keys_cached: 100

The cluster uses the random partitioner.

I use multiget_slice() to fetch one or many columns inside the child supercolumn at the same time. This is the awkward performance result I get:

100 sequential reads completed in : 0.383
    (this uses multiget_slice() with 1 key and 1 column name inside predicate->column_names)
100 batch loaded completed in : 0.786
    (this uses multiget_slice() with 1 key and multiple column names inside predicate->column_names)

Read and write consistency levels are both ONE.

Questions:
Why is doing 100 sequential reads faster than doing 100 in batch?
Is this a good design for my problem?
Does my issue relate to https://issues.apache.org/jira/browse/CASSANDRA-598?

Now on a single node with replication factor 1 I get this:

100 sequential reads completed in : 0.438
100 batch loaded completed in : 0.800

Please advise as to why this is happening. These nodes are VMs with 1 CPU and 1 GB each.

Best Regards,
=Arya
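
P.S. To make the two access patterns concrete, here is a rough Python sketch against an in-memory stand-in. It is only an illustration: the nested dicts and the multiget_slice() function below are hypothetical and merely mimic the shape of the real call, not the Thrift client itself, and the sketch assumes the batch test is a single call carrying all 100 column names.

```python
# In-memory stand-in for the Namespaces super column family (hypothetical data):
# row key (namespace) -> super column (user) -> column name -> column value.
namespaces = {
    "ns1": {
        "user1": {f"col{i}": f"val{i}" for i in range(100)},
    },
}


def multiget_slice(keys, super_column, column_names):
    """Hypothetical stand-in that mimics the shape of a multiget_slice() call.

    For each row key, return the requested columns of one super column.
    Missing keys, super columns, or columns are simply left out.
    """
    result = {}
    for key in keys:
        columns = namespaces.get(key, {}).get(super_column, {})
        result[key] = {
            name: columns[name] for name in column_names if name in columns
        }
    return result


names = [f"col{i}" for i in range(100)]

# Pattern 1: 100 sequential calls, each with 1 key and 1 column name.
sequential = {}
for name in names:
    sequential.update(multiget_slice(["ns1"], "user1", [name])["ns1"])

# Pattern 2: one call with 1 key and all 100 column names in the predicate.
batch = multiget_slice(["ns1"], "user1", names)["ns1"]

# Both patterns should return the same set of columns.
assert sequential == batch
```

Both patterns ask for the same data; they differ only in the number of round trips and in how many column names each predicate carries.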