Re: cassandra vs couchbase benchmark

2012-12-12 Thread Edward Capriolo
If your data fits into memory, you probably do not need NoSQL.

You may also notice that the company that produced the benchmark is a Cloudera
partner, so they "forgot" to show how much faster Couchbase is than HBase in
this scenario, but were more than happy to show you how much "faster" it is
than Mongo and Cassandra.

!!!DRAMA!!!

On Wed, Dec 12, 2012 at 12:34 PM, Radim Kolar wrote:

> If the dataset fits into memory and the data used in the test almost fits
> into memory, then Cassandra is slow compared to other leading NoSQL
> databases; the difference can be as much as 10:1. Check the Infinispan
> benchmarks. A common usage pattern is to put memcached on top of Cassandra.
>
> Cassandra is good if you have far more data than RAM. It beats an SQL
> database by a 10:1 ratio, at the cost of low query flexibility.
>


Re: cassandra vs couchbase benchmark

2012-12-12 Thread Radim Kolar
If the dataset fits into memory and the data used in the test almost fits
into memory, then Cassandra is slow compared to other leading NoSQL
databases; the difference can be as much as 10:1. Check the Infinispan
benchmarks. A common usage pattern is to put memcached on top of Cassandra.
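
The "memcached on top of Cassandra" pattern mentioned above is usually called cache-aside. A minimal sketch, assuming plain Python dicts as stand-ins for both memcached and Cassandra (the class name `CacheAsideStore` and its methods are hypothetical, not part of any driver):

```python
class CacheAsideStore:
    """Toy cache-aside wrapper; dicts stand in for memcached and Cassandra."""

    def __init__(self, backing_store):
        self.backing = backing_store  # assumption: stand-in for a Cassandra column family
        self.cache = {}               # assumption: stand-in for memcached
        self.backend_reads = 0        # counts reads that reached the backing store

    def get(self, key):
        # Serve from the cache when possible.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: read from the backing store and populate the cache.
        self.backend_reads += 1
        value = self.backing.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        # Write to the backing store, then invalidate the cached copy
        # so the next read fetches the fresh value.
        self.backing[key] = value
        self.cache.pop(key, None)


store = CacheAsideStore({"user:1": "alice"})
first = store.get("user:1")   # miss, goes to the backing store
second = store.get("user:1")  # hit, served from the cache
```

Only the first read reaches the backing store; repeated reads of a hot key are served from memory, which is the effect the pattern is after.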


Cassandra is good if you have far more data than RAM. It beats an SQL
database by a 10:1 ratio, at the cost of low query flexibility.


RE: cassandra vs couchbase benchmark

2012-12-12 Thread Viktor Jevdokimov
Pure marketing, comparing apples to oranges.

Was Cassandra usage optimized?
- What consistency level was used? (Reads are fastest with ONE.)
- Was the Cassandra client token-aware? (Such a client sends each request to
the appropriate node.)
- Was the dynamic snitch turned off? (This prevents a request from being
forwarded to another replica if it can be processed locally.)
- Was the Cassandra data model designed to mimic the Couchbase data model?
(Couchbase has only 1 value per row.)
- What caching was used on Cassandra? (Couchbase has memcached built in.)
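
To illustrate why a token-aware client matters, here is a toy sketch, not Cassandra's or any driver's actual code. It assumes a replication factor of 1, evenly spaced tokens, and an MD5-based token function; `ToyRing`, `token`, and `hops` are hypothetical names. A request sent to the owning node costs one hop, while a request sent elsewhere has to be forwarded by the coordinator.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 128  # size of the MD5 output space


def token(key: str) -> int:
    """Toy token function: the MD5 digest of the key as an integer (assumption)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ToyRing:
    """Nodes with evenly spaced tokens; each node owns the range ending at its token."""

    def __init__(self, node_names):
        step = RING_SIZE // len(node_names)
        self.entries = [((i + 1) * step - 1, name)
                        for i, name in enumerate(node_names)]
        self.tokens = [t for t, _ in self.entries]

    def owner(self, key: str) -> str:
        # First node whose token is >= the key's token, wrapping around the ring.
        i = bisect.bisect_left(self.tokens, token(key))
        return self.entries[i % len(self.entries)][1]


def hops(ring: ToyRing, key: str, contacted_node: str) -> int:
    """1 hop if the contacted node owns the key, otherwise 2 (coordinator forwards)."""
    return 1 if contacted_node == ring.owner(key) else 2


nodes = ["node-a", "node-b", "node-c", "node-d"]
ring = ToyRing(nodes)
keys = [f"user:{i}" for i in range(1000)]

# Token-aware client: always contacts the owner of the key.
token_aware_hops = sum(hops(ring, k, ring.owner(k)) for k in keys)
# Round-robin client: contacts nodes in turn, regardless of ownership.
round_robin_hops = sum(hops(ring, k, nodes[i % len(nodes)])
                       for i, k in enumerate(keys))
```

In this sketch the token-aware client needs exactly one hop per request, while the round-robin client pays an extra hop whenever it picks a node that does not own the key.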

For our use case, we've seen much better results in testing, starting from a
single node. Throughput grows almost linearly as nodes are added and the amount
of data is grown to the same per-node level as on the single node.
Single-node stats:
- A column family with 30 GiB of compressed data per node (100 GiB uncompressed)
- Each row has 10-30 columns and weighs 0.5-2 KiB uncompressed
- 1 node with 6 cores, 24 GiB RAM, 8 GiB heap, 1600 MiB new-generation heap
- Key cache only, 2M keys
- Random reads 70%, random writes 30%
- Read latencies 10 ms AVE100 at 95% CPU, 50k reads/s
- Read latencies <5 ms AVE100 at 60% CPU, 20k reads/s
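
A back-of-envelope reading of the stats above, as a sketch only. It assumes that 0.5-2 KiB is the size of a whole row, that the quoted latencies can be treated as mean latencies, and that Little's law (requests in flight = throughput x latency) applies; none of these derived numbers were reported in the original test.

```python
GIB = 1024 ** 3
KIB = 1024

compressed_bytes = 30 * GIB
uncompressed_bytes = 100 * GIB
compression_ratio = compressed_bytes / uncompressed_bytes  # on-disk / raw

# Row count bounds, assuming each row is 0.5-2 KiB uncompressed.
rows_min = uncompressed_bytes // (2 * KIB)   # all rows at 2 KiB
rows_max = uncompressed_bytes // (KIB // 2)  # all rows at 0.5 KiB

# Fraction of row keys that a 2M-entry key cache can hold.
key_cache_entries = 2_000_000
coverage_best = key_cache_entries / rows_min
coverage_worst = key_cache_entries / rows_max

# Little's law: concurrent reads in flight = reads/s * latency in seconds.
inflight_at_95_cpu = 50_000 * 10 / 1000  # 10 ms latency
inflight_at_60_cpu = 20_000 * 5 / 1000   # upper bound, since latency is <5 ms

print(f"compression ratio: {compression_ratio:.2f}")
print(f"rows: {rows_min:,} to {rows_max:,}")
print(f"key cache coverage: {coverage_worst:.1%} to {coverage_best:.1%}")
print(f"reads in flight: {inflight_at_95_cpu:.0f} and up to {inflight_at_60_cpu:.0f}")
```

Under these assumptions the key cache covers only a few percent of the keys, so with truly random reads most requests would miss it.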

In reality, with many column families holding different amounts of data and
seeing different read/write rates, performance results may vary significantly.
You just need to know what to optimize in Cassandra, and how, to get the best
results.


Couchbase does not fit our use case because of its data model (it requires
reads for updates/inserts), so we can't compare it to Cassandra.



Best regards,

Viktor Jevdokimov
Senior Developer

> -Original Message-
> From: Radim Kolar [mailto:h...@filez.com]
> Sent: Tuesday, December 11, 2012 17:42
> To: user@cassandra.apache.org
> Subject: cassandra vs couchbase benchmark
>
> http://www.slideshare.net/Couchbase/benchmarking-couchbase#btnNext