Thanks all for the help. I ran the traffic over the weekend. Surprisingly, my heap was doing OK (around 5.7G of 8G), but GC activity went nuts and dropped the throughput. I will probably increase the number of nodes.
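For anyone else chasing GC-driven throughput drops: it can help to turn on the JVM's GC logging so you can see pause lengths and promotion behavior directly. A minimal sketch for conf/cassandra-env.sh, using standard HotSpot flags of that era (the log path is illustrative):

```shell
# Append to conf/cassandra-env.sh (log path is an example, adjust to taste)
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"                 # per-collection heap breakdown
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"              # wall-clock timestamps on each event
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"  # total stop-the-world time
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"   # write to a file instead of stdout
```

Long or frequent stop-the-world entries in the resulting log are the usual sign that the heap, not disk, is the bottleneck.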
The other interesting thing I noticed was that there were some objects with finalize() methods; this could potentially cause GC issues.

On Fri, May 31, 2013 at 1:47 AM, Aiman Parvaiz <ai...@grapheffect.com> wrote:

> I believe you should roll out more nodes as a temporary fix to your
> problem. 400GB on all nodes means (as correctly mentioned in other mails of
> this thread) you are spending more time on GC. Check out the second comment
> in this link by Aaron Morton; he says that more than 300GB can be
> problematic. Though this post is about an older version of Cassandra, I
> believe the concept still holds true:
>
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Is-it-safe-to-stop-a-read-repair-and-any-suggestion-on-speeding-up-repairs-td6607367.html
>
> Thanks
>
> On May 29, 2013, at 9:32 PM, srmore <comom...@gmail.com> wrote:
>
> Hello,
> I am observing that my performance decreases drastically as my data
> size grows. I have a 3-node cluster with 64 GB of RAM, and my data size is
> around 400GB on all the nodes. I also see that when I restart Cassandra
> the performance goes back to normal and then starts decreasing again after
> some time.
>
> Some hunting landed me on this page,
> http://wiki.apache.org/cassandra/LargeDataSetConsiderations, which talks
> about large data sets and explains that it might be because I am going
> through multiple layers of OS cache, but it does not tell me how to tune them.
>
> So, my question is: are there any optimizations I can do to handle
> these large datasets?
>
> And why does my performance go back to normal when I restart Cassandra?
>
> Thanks!
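On the finalize() point above: an overridden finalize() registers every instance with the JVM's finalizer queue, so the object needs at least two GC cycles to be reclaimed and a single finalizer thread has to drain the queue, which adds GC pressure. The usual fix is explicit cleanup via AutoCloseable and try-with-resources. A minimal sketch (the NativeBuffer class and its fields are hypothetical, not from the thread):

```java
// Hypothetical resource holder: deterministic cleanup via close() instead of
// finalize(), so instances are reclaimable after a single GC cycle and never
// touch the finalizer queue.
public class NativeBuffer implements AutoCloseable {
    private byte[] data;             // stand-in for an off-heap/native resource
    private boolean released = false;

    public NativeBuffer(int size) {
        this.data = new byte[size];
    }

    public boolean isReleased() {
        return released;
    }

    @Override
    public void close() {            // explicit, deterministic release
        if (!released) {
            data = null;
            released = true;
        }
    }

    public static void main(String[] args) {
        // try-with-resources guarantees close() runs when the block exits
        try (NativeBuffer buf = new NativeBuffer(1024)) {
            System.out.println("released inside try: " + buf.isReleased());
        } // buf.close() has been called here
    }
}
```

Tools like `jmap -histo <pid>` can show whether `java.lang.ref.Finalizer` instances are piling up on the heap, which is a quick way to confirm finalizable objects are the culprit.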