Re: High GC activity on node with 4TB on data

2015-02-19 Thread Arya Goudarzi
Sorry to jump on this late. GC is one of my favorite topics. A while ago I wrote a blog post about C* GC tuning and documented several issues that I had experienced. It seems it has helped some people in the past, so I am sharing it here: http://aryanet.com/blog/cassandra-garbage-collector-tuning

Re: High GC activity on node with 4TB on data

2015-02-12 Thread Jiri Horky
Number of cores: 2x6Cores x 2(HT). I do agree with you that the hardware is certainly overestimated for just one Cassandra, but we got a very good price since we ordered several 10s of the same nodes for a different project. That's why we use them for multiple Cassandra instances. Jirka H. On 02/

Re: High GC activity on node with 4TB on data

2015-02-12 Thread Eric Stevens
> each node has 256G of memory, 24x1T drives, 2x Xeon CPU I don't have first-hand experience running Cassandra on such massive hardware, but it strikes me that these machines are dramatically oversized to be good candidates for Cassandra (though I wonder how many cores are in those CPUs; I'm guess

Re: High GC activity on node with 4TB on data

2015-02-11 Thread Jiri Horky
Hi Chris, On 02/09/2015 04:22 PM, Chris Lohfink wrote: > - number of tombstones - how can I reliably find it out? > https://github.com/spotify/cassandra-opstools > https://github.com/cloudian/support-tools thanks. > > If not getting much compression it may be worth trying to disable it, > it may

Re: High GC activity on node with 4TB on data

2015-02-09 Thread Chris Lohfink
- number of tombstones - how can I reliably find it out? https://github.com/spotify/cassandra-opstools https://github.com/cloudian/support-tools If not getting much compression it may be worth trying to disable it, it may contribute but it's very unlikely that it's the cause of the gc pressure itse

Re: High GC activity on node with 4TB on data

2015-02-09 Thread Jiri Horky
Hi all, thank you all for the info. To answer the questions: - we have 2 DCs with 5 nodes in each, each node has 256G of memory, 24x1T drives, 2x Xeon CPU - there are multiple Cassandra instances running for different projects. The node itself is powerful enough. - there are 2 keyspaces, one with 3

Re: High GC activity on node with 4TB on data

2015-02-08 Thread Francois Richard
: Mark Reddy To: user@cassandra.apache.org Cc: cassandra-u...@apache.org; FF Systems Sent: Sunday, February 8, 2015 1:32 PM Subject: Re: High GC activity on node with 4TB on data Hey Jiri, While I don't have any experience running 4TB nodes (yet), I would recommend taking a look

Re: High GC activity on node with 4TB on data

2015-02-08 Thread Colin
The most data I put on a node with spinning disk is 1TB. What are the machine specs (CPU, memory, etc.), what is the read/write pattern (heavy ingest rate or heavy read rate), and how long do you keep data in the cluster? -- Colin Clark +1 612 859 6129 Skype colin.p.clark > On Feb 8, 2015, at 2:44

Re: High GC activity on node with 4TB on data

2015-02-08 Thread Mark Reddy
Hey Jiri, While I don't have any experience running 4TB nodes (yet), I would recommend taking a look at a presentation by Aaron Morton on large nodes: http://planetcassandra.org/blog/cassandra-community-webinar-videoslides-large-nodes-with-cassandra-by-aaron-morton/ to see if you can glean anythin

Re: High GC activity on node with 4TB on data

2015-02-08 Thread Kevin Burton
Do you have a lot of individual tables? Or lots of small compactions? I think the general consensus is that (at least for Cassandra), 8GB heaps are ideal. If you have lots of small tables it’s a known anti-pattern (I believe) because the Cassandra internals could do a better job on handling the
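
For reference, the 8GB heap suggested above could be pinned in conf/cassandra-env.sh roughly as follows. This is only a sketch: the young-generation size is an illustrative assumption, not a value given anywhere in this thread, and it should be tuned per node.

```shell
# Sketch of heap overrides for conf/cassandra-env.sh (CMS-era Cassandra).
# The script expects these two variables to be set (or left unset) together.
MAX_HEAP_SIZE="8G"     # total JVM heap, per the 8GB guidance in this thread
HEAP_NEWSIZE="800M"    # young generation size; assumed value, tune per node
```

Both variables are read when the node starts, so a restart is needed for the change to take effect.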