From the nodetool output you quoted, I seriously suspect your Cassandra nodes have at least one of the following issues:

   * Clock out of sync

   * Bad network connectivity between nodes

   * Long GC pauses

   * Broken disks

   * CPU bottleneck

It's not normal to see over 2% dropped small messages. It needs investigation.
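
As a rough check of that figure, the drop rate can be computed from the "Small messages" row of the netstats output quoted below. This is only a sketch: the function name is made up for illustration, and it assumes "Completed" does not already include the dropped messages, so both possible denominators are shown. Either way the result is above 2%.

```python
def drop_rates(completed: int, dropped: int) -> tuple[float, float]:
    """Return (dropped / completed, dropped / (completed + dropped))."""
    if completed <= 0:
        raise ValueError("completed must be positive")
    return dropped / completed, dropped / (completed + dropped)


# Figures copied from the "Small messages" row of the quoted netstats output.
small_completed = 569_755_545
small_dropped = 15_740_262

vs_completed, vs_total = drop_rates(small_completed, small_dropped)
print(f"dropped vs completed:           {vs_completed:.2%}")
print(f"dropped vs completed + dropped: {vs_total:.2%}")
```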

Adding nodes one by one is fine. If your data grows at a rate close to or faster than the rate at which you can add nodes, you have a much more serious problem.

We usually leave it running in the background with an automated process, and we don't really care whether it takes 1 day or 5 days to complete, as long as the nodes are added correctly and successfully.

On 08/07/2022 10:55, Marc Hoppins wrote:
Ifconfig shows RX of 1.1T. This doesn't seem to fit with the LOAD of 145GiB 
(nodetool status), unless I am reading that wrong...and the fact that this node 
still has a status of UJ.

Netstats on this node shows (other than :
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name                    Active   Pending      Completed   Dropped
Large messages                  n/a         0              0         0
Small messages                  n/a        53      569755545  15740262
Gossip messages                 n/a         0         288878         2
None of this addresses the issue of not being able to add more nodes.

-----Original Message-----
From: Bowen Song via user<user@cassandra.apache.org>
Sent: Friday, July 8, 2022 11:47 AM
To:user@cassandra.apache.org
Subject: Re: Adding nodes

EXTERNAL


I would assume that's 85 GB (i.e. gigabytes) then. Which is approximately 79 
GiB (i.e. gibibytes). This still sounds awfully slow - less than 1MB/s over a 
full day (24 hours).
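
The arithmetic behind those two numbers, as a small sketch (the helper names are illustrative only; 1 GB is taken as 10^9 bytes and 1 GiB as 2^30 bytes):

```python
GB = 10**9    # gigabyte, decimal
GIB = 2**30   # gibibyte, binary
MB = 10**6    # megabyte, decimal
SECONDS_PER_DAY = 24 * 60 * 60


def gb_to_gib(gigabytes: float) -> float:
    """Convert decimal gigabytes to binary gibibytes."""
    return gigabytes * GB / GIB


def average_rate_mb_per_s(gigabytes: float, seconds: float) -> float:
    """Average transfer rate in MB/s for the given volume and duration."""
    return gigabytes * GB / MB / seconds


streamed_gb = 85  # volume reported for the joining node after one day
print(f"{streamed_gb} GB = {gb_to_gib(streamed_gb):.1f} GiB")
print(f"average rate = {average_rate_mb_per_s(streamed_gb, SECONDS_PER_DAY):.2f} MB/s")
```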

You said CPU and network aren't the bottleneck. Have you checked the disk IO? 
Also, be mindful of CPU usage. It can still be a bottleneck if one thread 
uses 100% of a CPU core while all other cores are idle.
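
To illustrate why the average can hide this, here is a minimal sketch. The per-core percentages are made-up sample values, not measurements from this cluster, and the function name and 95% threshold are assumptions. In practice the per-core figures would come from a tool that reports utilisation per core (for example `mpstat -P ALL`).

```python
from statistics import mean


def find_saturated_cores(per_core_percent, threshold=95.0):
    """Return the indexes of cores at or above the utilisation threshold."""
    return [i for i, pct in enumerate(per_core_percent) if pct >= threshold]


# Hypothetical sample: 8 cores, one pinned at 100% while the rest are near idle.
sample = [100.0, 3.0, 2.0, 4.0, 1.0, 2.0, 3.0, 1.0]

average = mean(sample)
hot = find_saturated_cores(sample)
print(f"average utilisation: {average:.1f}%")
print(f"saturated cores: {hot}")
```

The average looks harmless, yet a single-threaded task bound to the saturated core cannot go any faster.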

On 08/07/2022 07:09, Marc Hoppins wrote:
Thank you for pointing that out.

85 gigabytes/gibibytes/GIGABYTES/GIBIBYTES/whatever name you care to
give it

CPU and bandwidth are not the problem.

Version 4.0.3 but, as I stated, all nodes use the same version so the version 
is not important either.

Existing nodes have 350-400+(choose whatever you want to call a
gigabyte)

The problem appears to be that adding new nodes is a serial process, which is 
fine when there is no data and each node is added within 2minutes.  It is 
hardly practical in production.

-----Original Message-----
From: Bowen Song via user<user@cassandra.apache.org>
Sent: Thursday, July 7, 2022 8:43 PM
To:user@cassandra.apache.org
Subject: Re: Adding nodes

86Gb (that's gigabits, which is 10.75GB, gigabytes) taking an entire day seems 
obviously too long. I would check the network bandwidth, disk IO and CPU usage 
to find out which one is the bottleneck.

On 07/07/2022 15:48, Marc Hoppins wrote:
Hi all,

Cluster of 2 DC and 24 nodes

DC1 (RF3) = 12 nodes, 16 tokens each
DC2 (RF3) = 12 nodes, 16 tokens each

Adding 12 more nodes to DC1: I installed Cassandra (version is the same across 
all nodes) but, after the first node added, I couldn't seem to add any further 
nodes.

I check nodetool status and the newly added node is UJ. It remains this way all 
day and only 86Gb of data is added to the node over the entire day (probably 
not yet complete).  This seems a little slow and, more than a little 
inconvenient to only be able to add one node at a time - or at least one node 
every 2 minutes.  When the cluster was created, I timed each node from service 
start to status UJ (having a UUID) and it was around 120 seconds.  Of course 
there was no data.

Is it possible I have some setting not correctly tuned?

Thanks

Marc
