Re: underutilized servers

2021-03-05 Thread Erick Ramirez
The tpstats you posted show that the node is dropping reads and writes, which means your disk can't keep up with the load, i.e. the disk is the bottleneck. If you haven't already, place data and commitlog on separate disks so they're not competing for the same IO bandwidth. Note that it's OK
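
A minimal sketch of those two checks, assuming default tool names and placeholder mount points (not taken from the thread):

    # Look at the "Message type / Dropped" section at the bottom of the
    # output; non-zero READ or MUTATION counts mean the node can't keep up.
    nodetool tpstats

    # In cassandra.yaml, give the commitlog its own physical disk so it
    # does not compete with the data files for IO (paths are placeholders):
    #
    #   data_file_directories:
    #       - /mnt/data/cassandra/data
    #   commitlog_directory: /mnt/commitlog/cassandra/commitlog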

Re: underutilized servers

2021-03-05 Thread daemeon reiydelle
You did not specify read and write consistency levels; the default would be to hit two nodes (one for data, one for a digest) with every query. A network load of 50% is not very helpful on its own. 1Gbit or 10Gbit? 50% of each direction, or the average of both? Iowait is not great for a system of this size: assuming that yo
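
A rough sketch of how one might answer those questions, assuming the sysstat tools are installed and eth0 is a placeholder interface name:

    # What consistency level is actually in use? In cqlsh the session
    # default is ONE and can be inspected with the CONSISTENCY command:
    cqlsh -e "CONSISTENCY"

    # Quantify iowait and per-disk utilisation (sysstat package):
    iostat -x 5 3

    # Per-interface rx/tx throughput and link speed:
    sar -n DEV 5 3
    ethtool eth0 | grep -i speed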

Re: underutilized servers

2021-03-05 Thread Attila Wind
Thanks for the answers, @Sean and @Bowen! First of all, this article describes something very similar to what we experience; let me share it: https://www.senticore.com/overcoming-cassandra-write-performance-problems/ We are studying it now. Furthermore: * yes, we have some level of unbalanced data, which

Re: underutilized servers

2021-03-05 Thread Bowen Song
Based on my personal experience, the combination of slow read queries and low CPU usage is often an indicator of bad table schema design (e.g. large partitions) or bad queries (e.g. queries without a partition key). Check the Cassandra logs first: is there any long stop-the-world GC? Any tombstone warning? an
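
A hedged sketch of those log checks, assuming the default log location /var/log/cassandra/system.log and a placeholder keyspace/table name:

    # Long stop-the-world pauses are reported by GCInspector:
    grep GCInspector /var/log/cassandra/system.log | tail

    # Reads that scanned an excessive number of tombstones:
    grep -i tombstone /var/log/cassandra/system.log | tail

    # Large partitions are flagged at compaction time, and tablestats shows
    # the maximum compacted partition size per table:
    grep -i "large partition" /var/log/cassandra/system.log | tail
    nodetool tablestats my_keyspace.my_table | grep -i partition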

RE: underutilized servers

2021-03-05 Thread Durity, Sean R
Are there specific queries that are slow? Partition-key queries should have read latencies in single-digit milliseconds (or faster). If that is not what you are seeing, I would first review the data model and queries to make sure that the data is modeled properly for Cassandra. Without metrics, I
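
For reference, one way to check those latencies against a specific table; the keyspace, table and column names below are hypothetical:

    # Read/write latency percentiles and partition size distribution:
    nodetool tablehistograms my_keyspace my_table

    # A query restricted on the full partition key is the access pattern
    # Cassandra is optimised for:
    cqlsh -e "SELECT * FROM my_keyspace.my_table WHERE user_id = 42 LIMIT 100;"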

underutilized servers

2021-03-05 Thread Attila Wind
Hi guys, I have a DevOps-related question and hope someone here could give some ideas/pointers... We are running a 3-node Cassandra cluster. Recently we realized we have performance issues, and based on the investigation we did, it seems our bottleneck is the Cassandra cluster. The application
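
Not from the thread, but a common first pass for confirming whether the cluster really is the bottleneck, using only stock nodetool commands:

    # Cluster health, per-node load and token ownership:
    nodetool status

    # Coordinator-level read/write latency percentiles as seen by clients:
    nodetool proxyhistograms

    # Pending/blocked thread pools and dropped message counts:
    nodetool tpstats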

Re: How to debug node load unbalance

2021-03-05 Thread Lapo Luchini
Thanks for the explanation, Kane! In case anyone is curious, I decommissioned node7 and things re-balanced themselves automatically: https://i.imgur.com/EOxzJu9.png (node8 received 422 GiB, while the others received 82-153 GiB, as reported by "nodetool netstats -H") Lapo On 2021-03-03 23:59
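
A minimal sketch of the two commands referenced above:

    # Run on the node being removed (node7 here); it streams its token
    # ranges to the remaining replicas before leaving the ring:
    nodetool decommission

    # Monitor streaming progress from any node, with human-readable sizes:
    nodetool netstats -H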