how can we get (a lot) more performance from cassandra

2012-05-16 Thread Yiming Sun
Hello, I asked the question as a follow-up under a different thread, so I figure I should ask here instead in case the other one gets buried, and besides, I have a little more information. "We find the lack of performance disturbing" as we are only able to get about 3-4MB/sec read performance out

Re: how can we get (a lot) more performance from cassandra

2012-05-16 Thread Mike Peters
Hi Yiming, Cassandra is optimized for write-heavy environments. If you have a read-heavy application, you shouldn't be running your reads through Cassandra. On the bright side - Cassandra read throughput will remain consistent, regardless of your volume. But you are going to have to "wrap"

Re: how can we get (a lot) more performance from cassandra

2012-05-16 Thread Yiming Sun
Ah, never thought I would be quoting Luke's "No, that's not true... that's impossible~~" here... sigh. But seriously, thanks Mike. Instead of using memcached, would it help to turn on row cache? An even more philosophical question: what would be a better choice for read-heavy loads? a major p

Re: how can we get (a lot) more performance from cassandra

2012-05-16 Thread Oleg Dulin
Indeed. This is how we are trying to solve this problem. Our application has a built-in cache that resembles a supercolumn or standardcolumn data structure and has an API that resembles a combination of Pelops selector and mutator. You can do something like that for Hector. The cache is constra
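
For readers following the caching suggestion in this thread, here is a minimal sketch of a size-bounded, read-through row cache placed in front of a slow store. It is not Oleg's implementation and it uses no Pelops, Hector, or Cassandra API: the class name `ReadThroughCache`, the `load_row` callback, and the `max_rows` limit are all hypothetical names chosen for illustration, and the eviction policy (least recently used) is an assumption.

```python
from collections import OrderedDict
from typing import Callable, Dict, Hashable

# A "row" here is just column name -> value; this shape is an assumption
# made for the sketch, not the data model of any particular client library.
Row = Dict[str, bytes]


class ReadThroughCache:
    """Size-bounded LRU cache that loads missing rows via a callback."""

    def __init__(self, load_row: Callable[[Hashable], Row], max_rows: int = 1000) -> None:
        # load_row is a hypothetical function that reads one row from the
        # backing store (for example, a wrapper around a Cassandra client).
        self._load_row = load_row
        self._max_rows = max_rows
        self._rows: "OrderedDict[Hashable, Row]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Row:
        """Return the cached row, loading it from the store on a miss."""
        if key in self._rows:
            self._rows.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._rows[key]
        self.misses += 1
        row = self._load_row(key)
        self._rows[key] = row
        if len(self._rows) > self._max_rows:
            self._rows.popitem(last=False)  # evict the least recently used row
        return row

    def invalidate(self, key: Hashable) -> None:
        """Drop a row after it has been written, so the next read reloads it."""
        self._rows.pop(key, None)


if __name__ == "__main__":
    def slow_load(key: Hashable) -> Row:
        # Stand-in for a real read from the database.
        return {"payload": str(key).encode("utf-8")}

    cache = ReadThroughCache(slow_load, max_rows=2)
    for k in ["a", "a", "b", "c", "a"]:
        cache.get(k)
    print("hits:", cache.hits, "misses:", cache.misses)
```

A cache like this only pays off when reads repeat keys. With a very large and sparsely accessed key space, most lookups would be misses, so the hit and miss counters are worth watching before relying on it.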

Re: how can we get (a lot) more performance from cassandra

2012-05-16 Thread Yiming Sun
Thanks Oleg. Another caveat from our side is, we have a very large data space (imagine picking 100 items out of 3 million, the chance of having 2 items from the same bin is pretty low). We will experiment with row cache, and hopefully it will help, not the opposite (the tuning guide says row cache

Re: how can we get (a lot) more performance from cassandra

2012-05-16 Thread Oleg Dulin
Please do keep us posted. We have a somewhat similar Cassandra utilization pattern, and I would like to know what your solution is... On 2012-05-16 20:38:37 +, Yiming Sun said: Thanks Oleg. Another caveat from our side is, we have a very large data space (imagine picking 100 items out of

Re: how can we get (a lot) more performance from cassandra

2012-05-16 Thread Yiming Sun
Will do, Oleg. Again, thanks for the information. -- Y. On Wed, May 16, 2012 at 4:44 PM, Oleg Dulin wrote: > > Please do keep us posted. We have a somewhat similar Cassandra utilization > pattern, and I would like to know what your solution is... > > > > On 2012-05-16 20:38:37 +, Yimi

Re: how can we get (a lot) more performance from cassandra

2012-05-16 Thread Aaron Turner
On Wed, May 16, 2012 at 12:59 PM, Yiming Sun wrote: > Hello, > > I asked the question as a follow-up under a different thread, so I figure I > should ask here instead in case the other one gets buried, and besides, I > have a little more information. > > "We find the lack of performance disturbing

Re: how can we get (a lot) more performance from cassandra

2012-05-16 Thread Yiming Sun
Hi Aaron T., No, actually we haven't, but this sounds like a good suggestion. I can definitely try THIS before jumping into other things such as enabling row cache etc. Thanks! -- Y. On Wed, May 16, 2012 at 9:38 PM, Aaron Turner wrote: > On Wed, May 16, 2012 at 12:59 PM, Yiming Sun wrote: >

Re: how can we get (a lot) more performance from cassandra

2012-05-20 Thread aaron morton
I would look into the problems you are having with GC... > The server log shows the GC ParNew frequently gets longer than 200ms, often > in the range of 4-5 seconds. But nowhere near 15 seconds (which is an > indication that JVM heap is being swapped out). Then check the throughput on the san a

Re: how can we get (a lot) more performance from cassandra

2012-05-21 Thread Yiming Sun
Hi Aaron, I don't know if you could elaborate a bit more on each of the points you suggested. Thanks. -- Y. On Sun, May 20, 2012 at 7:29 PM, aaron morton wrote: > I would look into the problems you are having with GC... > > The server log shows the GC ParNew frequently gets longer than 200ms,

Re: how can we get (a lot) more performance from cassandra

2012-05-22 Thread aaron morton
> I would look into the problems you are having with GC... When ParNew runs, the JVM pauses (see https://blogs.oracle.com/jonthecollector/entry/our_collectors ). If it's pausing for 4 seconds, it's not processing queries. > Then check the throughput on the san and the steal on the VMs. Check to se