Hi all, I'm using ycsb to test Cassandra's performance on key range gets. I have installed ycsb on one node and the latest Cassandra server on another node. Using one thread, I insert 10GB of uniformly random keys into Cassandra using ycsb while performing range gets (get_range_slices): every 1000 puts, I perform 1 range get. Keys are 100 bytes, values 1KB. Each range get retrieves a random number of entries between 1 and 100. The Cassandra node has 3GB RAM and one 7500RPM SATA disk. I use the default configuration for Cassandra. I have also repeated the experiment above with 10 threads instead of one.
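To make the operation mix concrete, here is a rough Python sketch of the schedule described above. It is not the actual ycsb code: the function name `workload` and its parameters are made up for illustration, and it only generates the sequence of operations without talking to Cassandra.

```python
import random

KEY_BYTES = 100  # key size used in the experiment


def workload(num_puts, puts_per_scan=1000, max_scan_len=100, seed=0):
    """Yield ('put', key) and ('scan', start_key, count) operations.

    Hypothetical helper: one range get is issued after every
    `puts_per_scan` puts, asking for 1..max_scan_len entries.
    """
    rng = random.Random(seed)

    def random_key():
        # Uniformly random 100-byte key.
        return rng.getrandbits(KEY_BYTES * 8).to_bytes(KEY_BYTES, "big")

    for i in range(1, num_puts + 1):
        yield ("put", random_key())
        if i % puts_per_scan == 0:
            yield ("scan", random_key(), rng.randint(1, max_scan_len))


ops = list(workload(5000))
puts = [op for op in ops if op[0] == "put"]
scans = [op for op in ops if op[0] == "scan"]
print(len(puts), "puts,", len(scans), "range gets")
```

Each "scan" here stands for one get_range_slices() call starting at a random key; the 10-thread run would simply have 10 such generators going in parallel.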
I measure the time needed for each get_range_slices() call both in Cassandra and in ycsb. I was surprised by the extremely low latencies I got, and I'm not sure I understand why (or whether they are correct). E.g. I see 5ms latencies, even lower than disk seek latencies, although, since there are multiple files on disk, Cassandra should have to check all of them to satisfy a get_range_slices() call (bloom filters are of no use for range gets). Since the node has 3GB RAM and I insert 10GB of data, the data set cannot fit in memory, so the calls cannot all be satisfied from cache. So, since there are multiple files on disk (I don't know whether leveldb-style compaction is the default or not, but in either case more than one disk file has to be checked for each range get), and since, at least after inserting 3-4GB, each get_range_slices() must touch disk and must touch more than one file, is it possible for a get_range_slices() to be satisfied in 5ms? Am I missing something? Thanks!