Re: Node removal causes spike in pending native-transport requests and clients suffer

2021-03-12 Thread Gil Ganz
Hey Bowen, I agree it's better to have smaller servers in general; this is the smaller-servers version :) In this case, I wouldn't say the data model is bad, and we certainly do our best to tune everything so less hardware is needed. It's just that the data and the number of requests/s are very big to

Re: Node removal causes spike in pending native-transport requests and clients suffer

2021-03-11 Thread Bowen Song
May I ask why you scale your Cassandra cluster vertically instead of horizontally, as recommended? I'm asking because I have dealt with a vertically scaled cluster before. It was scaled that way because they had query performance issues and blamed the hardware for not being strong enough. Scaling vertically had

Re: Node removal causes spike in pending native-transport requests and clients suffer

2021-03-11 Thread Gil Ganz
Yes. 192gb.
On Thu, Mar 11, 2021 at 10:29 AM Kane Wilson wrote:
> That is a very large heap. I presume you are using G1GC? How much memory
> do your servers have?
>
> raft.so - Cassandra consulting, support, managed services
>
> On Thu., 11 Mar. 2021, 18:29 Gil Ganz, wrote:
>
>> I always

Re: Node removal causes spike in pending native-transport requests and clients suffer

2021-03-11 Thread Kane Wilson
That is a very large heap. I presume you are using G1GC? How much memory do your servers have?
raft.so - Cassandra consulting, support, managed services
On Thu., 11 Mar. 2021, 18:29 Gil Ganz, wrote:
> I always prefer to do decommission, but the issue here is these servers
> are on-prem, and

Re: Node removal causes spike in pending native-transport requests and clients suffer

2021-03-10 Thread Gil Ganz
I always prefer to decommission, but the issue here is that these servers are on-prem, and disks die from time to time. It's a very large cluster, spread across multiple datacenters around the world, so it can take some time before we have a replacement, and we usually need to run removenode in such cases.

Re: Node removal causes spike in pending native-transport requests and clients suffer

2021-03-09 Thread Kane Wilson
It's unlikely to help in this case, but you should be using nodetool decommission on the node you want to remove rather than removenode from another node (and definitely don't force removal). native_transport_max_concurrent_requests_in_bytes defaults to 10% of the heap, which I suppose depending
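
To put the "10% of the heap" default in concrete terms, here is a minimal arithmetic sketch. The helper name and the heap sizes are hypothetical and chosen only for illustration; they are not taken from the thread, and the calculation assumes nothing beyond the 10% figure quoted above.

```python
# Hypothetical helper: estimates the default limit described in the message
# above (10% of the JVM heap). Heap sizes are illustrative, not from the thread.

GIB = 1024 ** 3


def default_max_concurrent_request_bytes(heap_bytes: int) -> int:
    """Return 10% of the given heap size in bytes (integer division)."""
    return heap_bytes // 10


if __name__ == "__main__":
    for heap_gib in (8, 16, 31):
        limit = default_max_concurrent_request_bytes(heap_gib * GIB)
        print(f"{heap_gib} GiB heap -> about {limit / GIB:.1f} GiB of in-flight requests")
```

The point of the sketch is only that the limit grows linearly with the heap, so a very large heap implies a correspondingly large allowance for in-flight request bytes unless the setting is overridden.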

Node removal causes spike in pending native-transport requests and clients suffer

2021-03-08 Thread Gil Ganz
Hey, we have a 3.11.9 cluster (recently upgraded from 2.1.14), and since the upgrade we have had an issue when we remove a node. The moment I run the removenode command, 3 servers in the same DC start to show a high number of pending native-transport-requests (reaching around 1M) and clients are