Re: High GC activity on node with 4TB on data

2015-02-12 Thread Jiri Horky
Number of cores: 2 x 6 cores x 2 (HT). I do agree with you that the hardware is certainly oversized for just one Cassandra instance, but we got a very good price since we ordered several tens of the same nodes for a different project. That's why we use them for multiple Cassandra instances. Jirka H. On

Re: High GC activity on node with 4TB on data

2015-02-12 Thread Eric Stevens
"each node has 256G of memory, 24x1T drives, 2x Xeon CPU" I don't have first-hand experience running Cassandra on such massive hardware, but it strikes me that these machines are dramatically oversized to be good candidates for Cassandra (though I wonder how many cores are in those CPUs; I'm

Re: High GC activity on node with 4TB on data

2015-02-11 Thread Jiri Horky
Hi Chris, On 02/09/2015 04:22 PM, Chris Lohfink wrote: - number of tombstones - how can I reliably find it out? https://github.com/spotify/cassandra-opstools https://github.com/cloudian/support-tools thanks. If not getting much compression it may be worth trying to disable it, it may

Re: High GC activity on node with 4TB on data

2015-02-09 Thread Chris Lohfink
- number of tombstones - how can I reliably find it out? https://github.com/spotify/cassandra-opstools https://github.com/cloudian/support-tools If you are not getting much compression, it may be worth trying to disable it; it may contribute, but it's very unlikely that it's the cause of the GC pressure
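To check whether a table is "not getting much compression", one option is to scan the text output of `nodetool cfstats` for its compression ratio. The following is a minimal sketch, not something posted in the thread. It assumes the output contains `Table:` (or `Column Family:`) lines followed by `SSTable Compression Ratio:` lines, and that the ratio is compressed size over uncompressed size, so values near 1.0 mean little gain. The function name and the 0.8 threshold are hypothetical.

```python
import re

# Assumed line shapes in `nodetool cfstats` output; adjust for your version.
TABLE_RE = re.compile(r"(?:Table|Column Family):\s*(\S+)")
RATIO_RE = re.compile(r"SSTable Compression Ratio:\s*(\d+(?:\.\d+)?)")


def low_compression_tables(cfstats_text, threshold=0.8):
    """Return (table, ratio) pairs whose ratio is at or above `threshold`.

    `threshold` is an arbitrary illustrative cut-off, not a recommendation.
    """
    current = None
    flagged = []
    for line in cfstats_text.splitlines():
        stripped = line.strip()
        table = TABLE_RE.match(stripped)
        if table:
            current = table.group(1)
            continue
        ratio = RATIO_RE.match(stripped)
        if ratio and current is not None:
            value = float(ratio.group(1))
            if value >= threshold:
                flagged.append((current, value))
    return flagged


if __name__ == "__main__":
    # Hypothetical sample text, not taken from the thread.
    sample = """
        Table: events
        SSTable Compression Ratio: 0.95
        Table: users
        SSTable Compression Ratio: 0.31
    """
    for name, value in low_compression_tables(sample):
        print(name, value)
```

A ratio of 0.0 is left unflagged here, since it may simply mean that compression is disabled or the table holds no data.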

Re: High GC activity on node with 4TB on data

2015-02-09 Thread Jiri Horky
Hi all, thank you all for the info. To answer the questions: - we have 2 DCs with 5 nodes in each; each node has 256G of memory, 24x1T drives, 2x Xeon CPU - there are multiple Cassandra instances running for different projects. The node itself is powerful enough. - there are 2 keyspaces, one with 3

Re: High GC activity on node with 4TB on data

2015-02-08 Thread Kevin Burton
Do you have a lot of individual tables? Or lots of small compactions? I think the general consensus is that (at least for Cassandra), 8GB heaps are ideal. If you have lots of small tables it’s a known anti-pattern (I believe) because the Cassandra internals could do a better job on handling the

Re: High GC activity on node with 4TB on data

2015-02-08 Thread Mark Reddy
Hey Jiri, While I don't have any experience running 4TB nodes (yet), I would recommend taking a look at a presentation by Aaron Morton on large nodes: http://planetcassandra.org/blog/cassandra-community-webinar-videoslides-large-nodes-with-cassandra-by-aaron-morton/ to see if you can glean

Re: High GC activity on node with 4TB on data

2015-02-08 Thread Colin
The most data I put on a node with spinning disk is 1TB. What are the machine specs (CPU, memory, etc.), what is the read/write pattern (heavy ingest rate or heavy read rate), and how long do you keep data in the cluster? -- Colin Clark On Feb 8, 2015, at 2:44

Re: High GC activity on node with 4TB on data

2015-02-08 Thread Francois Richard
Hi Jiri, We do run multiple nodes with 2TB to 4TB of data and we will usually see GC pressure when we create a lot of tombstones. With Cassandra 2.0.x you would be able to see a log with the following pattern: WARN [ReadStage:7] 2015-02-08 22:55:09,621 SliceQueryFilter.java (line 225) Read 939
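To see how often warnings of this shape appear, the log can be scanned for them. The following is a minimal sketch that matches only the prefix visible in the quoted line (level, thread, timestamp, source file, line number) and treats the rest of the message as opaque text, since it is cut off above. The function names are hypothetical, and the non-matching sample line is invented for illustration.

```python
import re
from collections import Counter

# Matches: LEVEL [Thread] YYYY-MM-DD HH:MM:SS,mmm Source.java (line N) message
LOG_LINE = re.compile(
    r"^\s*(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<source>\S+\.java) \(line (?P<line>\d+)\)\s+(?P<message>.*)$"
)


def slice_filter_warnings(lines):
    """Yield parsed WARN entries logged from SliceQueryFilter.java."""
    for raw in lines:
        match = LOG_LINE.match(raw)
        if (
            match
            and match.group("level") == "WARN"
            and match.group("source") == "SliceQueryFilter.java"
        ):
            yield match.groupdict()


def warnings_per_hour(lines):
    """Count matching warnings per hour, keyed by 'YYYY-MM-DD HH'."""
    return Counter(w["timestamp"][:13] for w in slice_filter_warnings(lines))


if __name__ == "__main__":
    sample = [
        # Quoted from the message above; the text after "Read 939" is cut off.
        "WARN [ReadStage:7] 2015-02-08 22:55:09,621 SliceQueryFilter.java (line 225) Read 939",
        # Hypothetical unrelated line, included to show that it is ignored.
        " INFO [main] 2015-02-08 22:00:00,000 Example.java (line 1) unrelated",
    ]
    for hour, count in sorted(warnings_per_hour(sample).items()):
        print(hour, count)
```

In practice the input would be the lines of the node's system log, for example `warnings_per_hour(open(path))` with `path` pointing at wherever that log lives on the node.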