Hello, I've done some tests and it seems that, somehow, having more rows with fewer columns is better than having fewer rows with more columns, at least as far as read performance is concerned. Using stress.py, on a quad-core 2.27GHz machine with 4GB of RAM and the out-of-the-box Cassandra configuration, I inserted:
1) 50,000,000 rows (that's 50 million) with 1 column each (stress.py -n 50000000 -c 1)
2) 500,000 rows (that's 500 thousand) with 100 columns each (stress.py -n 500000 -c 100)

That is, both cases end up with 50 million columns (I use such big numbers so that in case 2 the resulting data is big enough not to fit in the system caches, in which case the problem I mention below doesn't show). Those two 'tests' were done separately, with data flushed completely between them. Each time, I let Cassandra compact everything, shut the server down and started it again (so that no data is in a memtable). Then I tried reading columns, one at a time, using:

1) stress.py -t 10 -o read -n 50000000 -c 1 -r
2) stress.py -t 10 -o read -n 500000 -c 1 -r

In case 1), I get around 200 reads/second, and that's pretty stable. The disk is spinning like crazy (~25% io_wait), with very little CPU or memory used; performance is IO bound, which is expected.

In case 2), however, it starts with reasonable performance (400+ reads/second), but it very quickly drops to an average of 80 reads/second (after a minute and a half or so), and it doesn't go up significantly after that. It turns out this seems to be a GC problem. Indeed, the info log (I'm running trunk from today, but I first saw the problem on an older version of trunk) shows, every few seconds, lines like:

GC for ConcurrentMarkSweep: 4599 ms, 57247304 reclaimed leaving 1033481216 used; max is 1211498496

I'm not surprised that performance is bad with such GC pauses; I'm surprised to have such GC pauses at all. Note that in case 1) the resulting data 'weighs' ~14GB, while in case 2) it 'weighs' only ~2.4GB.

Let me add that I used stress.py to try to identify the problem, but I first ran into it in an application I'm writing where I had rows with around 1000 columns of 30K each. With about 1000 rows, I had awful performance, like 5 reads/second on average.
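To make the comparison explicit, here is a quick sanity check of the numbers above (the figures are the ones from the stress.py commands; the script is just my arithmetic, not anything stress.py produces):

```python
# The two layouts tested above, as (rows, columns-per-row):
case1_rows, case1_cols = 50_000_000, 1   # stress.py -n 50000000 -c 1
case2_rows, case2_cols = 500_000, 100    # stress.py -n 500000 -c 100

# Both layouts contain the same total number of columns, so any
# difference in read behavior comes from the row/column shape,
# not from the amount of column data inserted.
assert case1_rows * case1_cols == case2_rows * case2_cols == 50_000_000
```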
I tried switching to 1 million rows, each having 1 column of 30K, and ended up with more than 300 reads/second. Any ideas or insights? Am I doing something utterly wrong? Thanks in advance.

--
Sylvain