It is not necessary, but it is recommended, to run repair before adding nodes. That's because deleted data may be resurrected if the time between two repair runs is longer than gc_grace_seconds, and adding nodes can take a lot of time.

Running nodetool cleanup is also not required, but it is recommended. Without it, disk space on the existing nodes will not be freed up. If you are adding multiple new nodes and aren't facing an immediate disk space crisis, it makes more sense to run cleanup once after *all* the new nodes are added than to run it after *each* new node is added.
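
For example, a rough sketch (run each command on one node at a time, and adjust the repair options to your own repair routine):

    # on every existing node, before adding the new node(s)
    nodetool repair -pr

    # on each pre-existing node, after all the new node(s) have finished joining
    nodetool cleanup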


On 05/04/2023 05:24, David Tinker wrote:
The Datastax doc says to run cleanup one node at a time after bootstrapping has completed. The myadventuresincoding post says to run a repair on each node first. Is it necessary to run the repairs first? Thanks.

On Tue, Apr 4, 2023 at 1:11 PM Bowen Song via user <user@cassandra.apache.org> wrote:

    Perhaps have a read here?
    https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsAddNodeToCluster.html


    On 04/04/2023 06:41, David Tinker wrote:
    Ok. Have to psych myself up to the add node task a bit. Didn't go
    well the first time round!

    Tasks
    - Make sure the new node is not in seeds list!
    - Check cluster name, listen address, rpc address
    - Give it its own rack in cassandra-rackdc.properties
    - Delete cassandra-topology.properties if it exists
    - Make sure no compactions are on the go
    - rm -rf /var/lib/cassandra/*
    - rm /data/cassandra/commitlog/* (this is on different disk)
    - systemctl start cassandra

    And it should start streaming data from the other nodes and join
    the cluster. Anything else I have to watch out for? Tx.
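
    A rough way to keep an eye on the join once streaming starts (not from the checklist above, just the usual commands, run from any node):

        nodetool status     # the new node should show UJ while joining, UN once it has finished
        nodetool netstats   # lists the streaming sessions in progress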


    On Tue, Apr 4, 2023 at 5:25 AM Jeff Jirsa <jji...@gmail.com> wrote:

        Because executing “removenode” streamed extra data from live
        nodes to the “gaining” replica

        Oversimplified (if you had one token per node)

        If you  start with A B C

        Then add D

        D should bootstrap a range from each of A B and C, but at the
        end, some of the data that was A B C becomes B C D

        When you removenode, you tell B and C to send data back to A.

        A B and C will eventually compact that data away. Eventually.

        If you get around to adding D again, running “cleanup” when
        you’re done (successfully) will remove a lot of it.



        On Apr 3, 2023, at 8:14 PM, David Tinker
        <david.tin...@gmail.com> wrote:

        
        Looks like the remove has sorted things out. Thanks.

        One thing I am wondering about is why the nodes are
        carrying a lot more data? The loads were about 2.7T
        before, now 3.4T.

        # nodetool status
        Datacenter: dc1
        ===============
        Status=Up/Down
        |/ State=Normal/Leaving/Joining/Moving
        --  Address          Load      Tokens  Owns (effective)  Host ID                               Rack
        UN  xxx.xxx.xxx.105  3.4 TiB   256     100.0%            afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
        UN  xxx.xxx.xxx.253  3.34 TiB  256     100.0%            e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
        UN  xxx.xxx.xxx.107  3.44 TiB  256     100.0%            ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1

        On Mon, Apr 3, 2023 at 5:42 PM Bowen Song via user
        <user@cassandra.apache.org> wrote:

            That's correct. nodetool removenode is strongly
            preferred when your node is already down. If the node is
            still functional, use nodetool decommission on the node
            instead.

            On 03/04/2023 16:32, Jeff Jirsa wrote:
            FWIW, `nodetool decommission` is strongly preferred.
            `nodetool removenode` is designed to be run when a host
            is offline. Only decommission is guaranteed to maintain
            consistency / correctness, and removenode probably
            streams a lot more data around than decommission.


            On Mon, Apr 3, 2023 at 6:47 AM Bowen Song via user
            <user@cassandra.apache.org> wrote:

                Using nodetool removenode is strongly preferred in
                most circumstances; only resort to assassinate if
                you do not care about data consistency, or you know
                there won't be any consistency issue (e.g. no new
                writes and nodetool cleanup has not been run).

                Since the size of data on the new node is small,
                nodetool removenode should finish fairly quickly
                and bring your cluster back.
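
                For reference, a rough sketch using the host ID of
                the dead node from your earlier status output:

                    nodetool removenode c4e8b4a0-f014-45e6-afb4-648aad4f8500
                    nodetool removenode status    # check the progress of the removal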

                Next time you are doing something like this, please
                test it out in a non-production environment and make
                sure everything works as expected before moving on
                to production.


                On 03/04/2023 06:28, David Tinker wrote:
                Should I use assassinate or removenode? Given that
                there is some data on the node. Or will that be
                found on the other nodes? Sorry for all the
                questions but I really don't want to mess up.

                On Mon, Apr 3, 2023 at 7:21 AM Carlos Diaz
                <crdiaz...@gmail.com> wrote:

                    That's what nodetool assassinate will do.

                    On Sun, Apr 2, 2023 at 10:19 PM David Tinker
                    <david.tin...@gmail.com> wrote:

                        Is it possible for me to remove the node
                        from the cluster i.e. to undo this mess
                        and get the cluster operating again?

                        On Mon, Apr 3, 2023 at 7:13 AM Carlos Diaz
                        <crdiaz...@gmail.com> wrote:

                            You can leave it in the seed list of the
                            other nodes, just make sure it's not
                            included in this node's seed list. However,
                            if you do decide to fix the issue with the
                            racks, first assassinate this node (nodetool
                            assassinate <ip>) and update the rack name
                            before you restart.
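
                            For example, using the new node's address
                            and an existing rack name from this thread
                            (adjust the rack to your own layout):

                                nodetool assassinate xxx.xxx.xxx.24

                            then, in cassandra-rackdc.properties on that
                            node before restarting it:

                                dc=dc1
                                rack=rack1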

                            On Sun, Apr 2, 2023 at 10:06 PM David
                            Tinker <david.tin...@gmail.com> wrote:

                                It is also in the seeds list for
                                the other nodes. Should I remove
                                it from those, restart them one at
                                a time, then restart it?

                                /etc/cassandra # grep -i bootstrap *
                                doesn't show anything so I don't
                                think I have auto_bootstrap false.

                                Thanks very much for the help.


                                On Mon, Apr 3, 2023 at 7:01 AM Carlos Diaz <crdiaz...@gmail.com> wrote:

                                    Just remove it from the seed list in the
                                    cassandra.yaml file and restart the node.
                                    Make sure that auto_bootstrap is set to
                                    true first though.
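
                                    A sketch of the relevant cassandra.yaml
                                    settings on the new node (addresses are
                                    the placeholders used in this thread;
                                    auto_bootstrap defaults to true when the
                                    option is absent from the file):

                                        auto_bootstrap: true
                                        seed_provider:
                                            - class_name: org.apache.cassandra.locator.SimpleSeedProvider
                                              parameters:
                                                  # existing nodes only, not this node's own address
                                                  - seeds: "xxx.xxx.xxx.105,xxx.xxx.xxx.253,xxx.xxx.xxx.107"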

                                    On Sun, Apr 2, 2023 at 9:59 PM David Tinker <david.tin...@gmail.com> wrote:

                                        So likely because I made it a seed node
                                        when I added it to the cluster it didn't
                                        do the bootstrap process. How can I
                                        recover this?

                                        On Mon, Apr 3, 2023 at 6:41 AM David Tinker <david.tin...@gmail.com> wrote:

                                            Yes replication factor is 3.

                                            I ran nodetool repair -pr on all the nodes
                                            (one at a time) and am still having issues
                                            getting data back from queries.

                                            I did make the new node a seed node.

                                            Re "rack4": I assumed that was just an
                                            indication as to the physical location of
                                            the server for redundancy. This one is
                                            separate from the others so I used rack4.

                                            On Mon, Apr 3, 2023 at 6:30 AM Carlos Diaz <crdiaz...@gmail.com> wrote:

                                                I'm assuming that your replication factor
                                                is 3. If that's the case, did you
                                                intentionally put this node in rack 4?
                                                Typically, you want to add nodes in
                                                multiples of your replication factor in
                                                order to keep the "racks" balanced. In
                                                other words, this node should have been
                                                added to rack 1, 2 or 3.

                                                Having said that, you should be able to
                                                easily fix your problem by running a
                                                nodetool repair -pr on the new node.

                                                On Sun, Apr 2, 2023 at 8:16 PM David Tinker <david.tin...@gmail.com> wrote:

                                                    Hi All

                                                    I recently added a node to my 3 node
                                                    Cassandra 4.0.5 cluster and now many
                                                    reads are not returning rows! What do I
                                                    need to do to fix this? There weren't
                                                    any errors in the logs or other problems
                                                    that I could see. I expected the cluster
                                                    to balance itself but this hasn't
                                                    happened (yet?). The nodes are similar
                                                    so I have num_tokens=256 for each. I am
                                                    using the Murmur3Partitioner.

                                                    # nodetool status
                                                    Datacenter: dc1
                                                    ===============
                                                    Status=Up/Down
                                                    |/ State=Normal/Leaving/Joining/Moving
                                                    --  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
                                                    UN  xxx.xxx.xxx.105  2.65 TiB   256     72.9%             afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
                                                    UN  xxx.xxx.xxx.253  2.6 TiB    256     73.9%             e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
                                                    UN  xxx.xxx.xxx.24   93.82 KiB  256     80.0%             c4e8b4a0-f014-45e6-afb4-648aad4f8500  rack4
                                                    UN  xxx.xxx.xxx.107  2.65 TiB   256     73.2%             ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1

                                                    # nodetool netstats
                                                    Mode: NORMAL
                                                    Not sending any streams.
                                                    Read Repair Statistics:
                                                    Attempted: 0
                                                    Mismatch (Blocking): 0
                                                    Mismatch (Background): 0
                                                    Pool Name        Active  Pending  Completed  Dropped
                                                    Large messages   n/a     0        71754      0
                                                    Small messages   n/a     0        8398184    14
                                                    Gossip messages  n/a     0        1303634    0

                                                    # nodetool ring
                                                    Datacenter: dc1
                                                    ==========
                                                    Address          Rack   Status  State   Load       Owns    Token
                                                                                                                9189523899826545641
                                                    xxx.xxx.xxx.24   rack4  Up      Normal  93.82 KiB  79.95%  -9194674091837769168
                                                    xxx.xxx.xxx.107  rack1  Up      Normal  2.65 TiB   73.25%  -9168781258594813088
                                                    xxx.xxx.xxx.253  rack2  Up      Normal  2.6 TiB    73.92%  -9163037340977721917
                                                    xxx.xxx.xxx.105  rack3  Up      Normal  2.65 TiB   72.88%  -9148860739730046229
                                                    xxx.xxx.xxx.107  rack1  Up      Normal  2.65 TiB   73.25%  -9125240034139323535
                                                    xxx.xxx.xxx.253  rack2  Up      Normal  2.6 TiB    73.92%  -9112518853051755414
                                                    xxx.xxx.xxx.105  rack3  Up      Normal  2.65 TiB   72.88%  -9100516173422432134
                                                    ...

                                                    This is causing a serious production
                                                    issue. Please help if you can.

                                                    Thanks
                                                    David

