Thanks Peter!

In my test application, each record looks like this:

rowkey -> rand() * 4, about 64 bytes

20 columns -> rand() * 20, about 320 bytes

and I insert each record with batch_insert(rowkey, col*20) over Thrift.
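
For what it's worth, the insert loop is roughly the following Python sketch (the module paths, the exact batch_insert signature, and the Keyspace1/Standard1 names are illustrative here and depend on the Cassandra version and schema in use):

import binascii, os, time
from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from cassandra import Cassandra                        # Thrift-generated bindings
from cassandra.ttypes import Column, ColumnOrSuperColumn, ConsistencyLevel

socket = TSocket.TSocket('localhost', 9160)
transport = TTransport.TBufferedTransport(socket)
transport.open()
client = Cassandra.Client(TBinaryProtocol.TBinaryProtocol(transport))

rowkey = binascii.hexlify(os.urandom(32))              # random row key, ~64 bytes
now = int(time.time() * 1e6)                           # microsecond timestamp
columns = [ColumnOrSuperColumn(column=Column(
               name='col%d' % i,
               value=binascii.hexlify(os.urandom(8)),  # 20 columns, ~320 bytes total
               timestamp=now))
           for i in range(20)]

client.batch_insert('Keyspace1', rowkey,               # 0.6-style call: keyspace,
                    {'Standard1': columns},            # key, {column family: columns},
                    ConsistencyLevel.ONE)              # consistency level

transport.close()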

Kevin Yuan

-------- Original Message --------
From: Peter Schüller <sc...@spotify.com>
To: user@cassandra.apache.org
Subject: [***SPAM*** ] Re: writing speed test
Date: Wed, 2 Jun 2010 10:44:52 +0200

Since this thread has now gone on for a while...

As far as I can tell you never specify the characteristics of your
writes. Evaluating expected write throughput in terms of "MB/s to
disk" is pretty impossible if one does not know anything about the
nature of the writes. If you're expecting 50 MB/s, is that reasonable? I
don't know; if you're writing a gazillion one-byte values with
shortish keys, 50 MB/second translates to a *huge* number of writes
per second, and you're likely to be CPU bound even in the most
efficient implementation reasonably possible.

If, on the other hand, you're writing large values (say slabs of 128k),
you can much more reasonably expect high disk throughput.
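
Back-of-envelope, purely to illustrate the difference (the ~20-byte key is an
arbitrary assumption for the "shortish keys" case):

throughput = 50 * 1024 * 1024      # the 50 MB/s figure under discussion
tiny_record = 20 + 1               # shortish key (~20 B, assumed) + one-byte value
big_record = 128 * 1024            # a 128k slab
print(throughput // tiny_record)   # ~2.5 million writes/s -> almost certainly CPU bound
print(throughput // big_record)    # 400 writes/s -> disk throughput is what matters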

I don't have enough hands-on experience with Cassandra to have a feel
for when the bottleneck is CPU vs. disk, but I can say that it's
definitely going to matter quite a lot what *kind* of writes you're
doing. This tends to be the case regardless of the database system.



