Hi Fabrizio,

You are right, there is no pcs command for replacing a node. Instead, you can run pcs commands on the surviving node to remove the failed node and then add it back. You need to authenticate the node with pcs first. Removing and re-adding the node would sync most of the files (authkeys, corosync.conf, qdevice certificates) for you. pcs doesn't handle DRBD configuration, though.
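
A rough sketch of the remove-and-add approach, run on the surviving node. The node name `node2` and the `run`/`replace_node` helpers are placeholders for illustration, not part of pcs. The script only prints the commands unless DRY_RUN=0 is set, and it assumes a pcs version that provides `pcs host auth` and `pcs cluster node add/remove`:

```shell
#!/bin/sh
# Sketch only: re-add a rebuilt node from the surviving node.
# NODE is a hypothetical hostname. With DRY_RUN=1 (the default) the
# commands are printed, not executed.
NODE="${NODE:-node2}"
DRY_RUN="${DRY_RUN:-1}"

# Helper (not a pcs feature): print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

replace_node() {
    # Authenticate the rebuilt node first; pcs prompts for the password.
    run pcs host auth "$1" -u hacluster
    # Drop the stale membership, then add the node back and start it.
    run pcs cluster node remove "$1"
    run pcs cluster node add "$1" --start --enable
}

replace_node "$NODE"
```

DRBD resource files are not covered by this and still have to be copied by hand.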

Regards,
Tomas


On 13. 05. 25 at 9:19, Fabrizio Ermini wrote:
Thank you very much Alexey, I will certainly try that and update you on the result.

Best regards!


On Mon, 12 May 2025 at 22:36, <ale...@pavlyuts.ru> wrote:

    Hi,

    As it happens, I use Pacemaker as the base layer of a custom
    clustering solution, and I have a script that rebuilds the second
    node from the first one. I can't share the script itself because it
    has a lot of solution-dependent references, but I can share the
    sequence for rebuilding the failed node:

     1. Set up the new node with the same IP and hostname.
     2. (Optional) Set up passwordless mutual key-based SSH access. It is
        not necessary, but it makes a lot of things easier.
     3. Copy these files from the surviving host to the new one:
         1. /etc/corosync/authkey
         2. /etc/corosync/corosync.conf
         3. /etc/drbd.d/*.res
         4. /etc/pacemaker/authkey
     4. Set the *hacluster* user password to the same value it had on
        the surviving node.
     5. Re-authenticate the pcs nodes with the command:
        pcs host auth <host1_name> <host2_name> -u hacluster -p <ha_cluster_pass>
     6. Reboot the restored server.
     7. PROFIT!!!
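
The copy and re-auth steps above could be scripted roughly as follows. This is a sketch and not the script mentioned above: it assumes it runs as root on the surviving host, that SSH access to the new host already works (step 2), and that the host names `node1`/`node2` and the helper names are placeholders. By default it only prints the commands:

```shell
#!/bin/sh
# Sketch only: push cluster files from the surviving host to the rebuilt one.
# Host names are hypothetical. DRY_RUN=1 (default) prints instead of executing.
DRY_RUN="${DRY_RUN:-1}"

# Helper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

rebuild_from_survivor() {
    new_host=$1; host1=$2; host2=$3
    # Step 3: fixed-path files, copied to the same path on the new host.
    for f in /etc/corosync/authkey /etc/corosync/corosync.conf \
             /etc/pacemaker/authkey; do
        run scp -p "$f" "root@${new_host}:$f"
    done
    # Step 3.3: DRBD resource files; skipped if the glob matches nothing.
    for f in /etc/drbd.d/*.res; do
        [ -e "$f" ] || continue
        run scp -p "$f" "root@${new_host}:$f"
    done
    # Step 5: pcs prompts for the hacluster password here; the original
    # command passes it with -p instead.
    run pcs host auth "$host1" "$host2" -u hacluster
}

rebuild_from_survivor "${NEW_HOST:-node2}" node1 node2
```

Setting the hacluster password (step 4) and the reboot (step 6) happen on the new host and are left out on purpose.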

    If you use no arbiter (corosync-qnetd), this should be enough to get
    your new cluster node up and running. If you do use corosync-qnetd,
    you also need to restore the corosync-qdevice nssdb keys so that the
    second host can connect to the arbiter node:

     1. On the surviving host, extract your arbiter certificate from
        nssdb:
        certutil -L -d /etc/corosync/qdevice/net/nssdb -n 'QNet CA' -r > /root/qnetd-cert.crt
     2. Copy the certificate to the new host; the following steps assume
        the path on the new host is the same.
     3. On the new host, initialize a new nssdb with the certificate:
        corosync-qdevice-net-certutil -i -c /root/qnetd-cert.crt
     4. Copy the certificate and key file
        /etc/corosync/qdevice/net/nssdb/qdevice-net-node.p12
        from the old node to the new one.
     5. On the new node, import the certificate and key:
        corosync-qdevice-net-certutil -m -c /etc/corosync/qdevice/net/nssdb/qdevice-net-node.p12
     6. Enable or restart corosync-qdevice:
        systemctl enable --now corosync-qdevice.service
        or
        systemctl restart corosync-qdevice.service
     7. Enjoy!
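
The qdevice steps above, collected into one sketch that is driven from the surviving host. It assumes root SSH access to the new host. The host name `node2` and the `run`/`restore_qdevice` helpers are placeholders, and the commands are only printed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch only: restore corosync-qdevice certificates on a rebuilt node.
# Run on the surviving host. DRY_RUN=1 (default) prints instead of executing.
DRY_RUN="${DRY_RUN:-1}"
NSSDB=/etc/corosync/qdevice/net/nssdb
CA_CERT=/root/qnetd-cert.crt

# Helper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

restore_qdevice() {
    new="root@$1"
    # 1. Export the QNet CA certificate from the local nssdb.
    run sh -c "certutil -L -d $NSSDB -n 'QNet CA' -r > $CA_CERT"
    # 2-3. Copy it over and initialize a fresh nssdb on the new host.
    run scp "$CA_CERT" "$new:$CA_CERT"
    run ssh "$new" corosync-qdevice-net-certutil -i -c "$CA_CERT"
    # 4-5. Copy the node certificate/key bundle and import it.
    run scp "$NSSDB/qdevice-net-node.p12" "$new:$NSSDB/qdevice-net-node.p12"
    run ssh "$new" corosync-qdevice-net-certutil -m -c "$NSSDB/qdevice-net-node.p12"
    # 6. Enable and start the qdevice service on the new host.
    run ssh "$new" systemctl enable --now corosync-qdevice.service
}

restore_qdevice "${NEW_HOST:-node2}"
```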

    That's what works for me in practice and is included in the service
    scripts of our product, which is based on Pacemaker.

    Hope this helps!

    Sincerely,

    Alex

    *From:* Users <users-boun...@clusterlabs.org> *On Behalf Of* Fabrizio Ermini
    *Sent:* Friday, May 9, 2025 5:26 PM
    *To:* users@clusterlabs.org
    *Subject:* [ClusterLabs] Rebuild of failed node

    Hi all! Freshman here, just joined.

    I currently need to rebuild a failed node in a two-node
    Pacemaker 2.1 / Corosync 3.1 cluster with DRBD storage.

    I've searched the Pacemaker docs and the list archives, but I
    haven't found a clear guide on how to proceed with this task. So
    far, I've installed a new server, configured it with the same IP
    and hostname as the failed one, and installed all the software. I've
    also fixed the DRBD layer and started the resync of the volumes. But
    it's not clear to me how to proceed: I've found some hints online
    pointing to the need to manually copy the corosync config, but they
    were quite old and probably obsolete. I'm using pcs as a shell and I
    haven't found a command designed to replace a node, only commands to
    add or remove one.

    It seems really strange to me that there isn't a guide, since this
    should be a very basic operation and it's quite important to know
    how to do it. Hardware does break, after all :D

    So I'll be very grateful if anyone can point me in the right
    direction.

    Thanks in advance, and best regards

    Fabrizio

    _______________________________________________
    Manage your subscription:
    https://lists.clusterlabs.org/mailman/listinfo/users

    ClusterLabs home: https://www.clusterlabs.org/

