On 08.08.23 11:45, Nicholas Yang wrote:
Maybe a firewall is blocking the ports used by corosync. In a 2-node
cluster using UDPU, 4 udp streams should be seen on one of the LAN
interfaces, including:
1. from local port 5405 to remote port 5405
2. from remote port 5405 to local port 5405
3. from a local ephemeral port to remote port 5405
4. from a remote ephemeral port to local port 5405
Blocking any one of these streams will stop corosync from working.
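To see which of these streams actually appear on the wire, a packet
capture on the cluster interface is one option. A minimal sketch,
assuming tcpdump is installed and the interface is called eth0 (a
placeholder; the helper name is hypothetical as well). It only prints
the command, so it can be reviewed and then run as root on each node:

```shell
# Print a tcpdump invocation that shows corosync UDP traffic.
# "eth0" is a placeholder interface name (assumption); 5405 is the
# corosync port from the list above.
corosync_capture_cmd() {
    iface="${1:-eth0}"
    port="${2:-5405}"
    # -n: do not resolve addresses or ports, -i: interface to listen on
    printf 'tcpdump -n -i %s udp port %s\n' "$iface" "$port"
}

corosync_capture_cmd eth0 5405
```

If packets leave one node but never show up in the capture on the
other node, something in between, such as a host firewall, is likely
dropping them.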
I'm not sure why I didn't start checking this from the beginning...
After adding a firewall rule for UDP port 5405, the node connected
without any issues.
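For anyone hitting the same problem, the rule can be added roughly as
follows. This is a sketch that assumes firewalld is the active
firewall (the default on openSUSE Leap 15) and that the default zone
is the one serving the cluster network; the helper name is made up.
It only prints the commands, which then need to be run as root on
both nodes:

```shell
# Print the firewalld commands that open the corosync UDP port.
# Assumption: firewalld is in use and the default zone applies.
corosync_firewall_cmds() {
    port="${1:-5405}"
    # Add the rule to the permanent configuration
    printf 'firewall-cmd --permanent --add-port=%s/udp\n' "$port"
    # Load the permanent configuration into the running firewall
    printf 'firewall-cmd --reload\n'
    # List open ports in the default zone to confirm the rule
    printf 'firewall-cmd --list-ports\n'
}

corosync_firewall_cmds 5405
```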
Thank you for your help.
Regards,
bartk
On 8/8/23 16:50, Bartosz Kaczyński wrote:
On 08.08.23 10:06, Nicholas Yang wrote:
Hi Bartk,
Hi Nicholas,
Thank you for your message and advice. I've checked the logs on node02
during the cluster joining process. I've pasted them to the Pastebin
service:
- Logs from the systemd corosync service [1]
- Logs from the systemd pacemaker service [2]
- Log located at /var/log/pacemaker/pacemaker.log [3]
While reviewing the logs, I didn't particularly notice any specific
errors, certainly nothing directly related to the inability to resolve
the node01 name. However, I'm not an expert, so it's hard for me to
make a definite statement.
[1] https://paste.opensuse.org/pastes/009aaf4f60c8
[2] https://paste.opensuse.org/pastes/8a31fec97802
[3] https://paste.opensuse.org/pastes/4641f5adc940
Regards,
bartk
You may look into the status and logs of the `corosync` and
`pacemaker` services to check what's going wrong.
systemctl status corosync
systemctl status pacemaker
journalctl --unit corosync
journalctl --unit pacemaker
Regards,
Nicholas Yang
On 8/8/23 04:45, Bartosz Kaczyński wrote:
Hello ClusterLabs community!
I'm in the process of learning this technology, using the course
"Say Goodbye to Downtime with SUSE Linux Enterprise Server (Repeat)"
[1], in which the instructor walks through setting up a cluster on
one node and joining a second node to it.
The instructor is using SLES 15. The entire test environment is
quite complex in terms of networking and storage. I think I managed
to replicate it on openSUSE Leap 15.5.
I initiated the cluster from node01 with the following command:
node01:~ # crm cluster init -u \
-s /dev/disk/by-path/ip-10.10.11.111:3260-iscsi-iqn.(...)-lun-0 \
-s /dev/disk/by-path/ip-10.10.12.112:3260-iscsi-iqn.(...)-lun-0 \
-s /dev/disk/by-path/ip-10.10.13.113:3260-iscsi-iqn.(...)-lun-0
From node02, I then tried to join the cluster with the following
command [2].
I also believe that I met all the requirements regarding network
connectivity between the nodes and name resolution. ICMP is
working between all interfaces, the /etc/hosts file is also filled
in on both hosts, and time synchronization is ensured with chronyd.
The network diagram is available as a downloadable graphic [3], and
someone has created a similar tutorial using VMware as the
hypervisor (while I am using KVM) [4].
I am aware that providing assistance in such a situation can be
challenging, and the problem may occur at various levels. However,
is there anything that I might have overlooked, or is there anything
that comes to your mind?
[1] https://open.sap.com/courses/suse2-1-pc
[2] https://paste.opensuse.org/pastes/254c28adf1c5
[3] https://opensap-pinboard.s3.openhpicloud.de/courses/6IB8s3snE8SkA2AxXCJWOL/topics/1nYPUtHIu4f73RLSTHdkqw/44nDvDjlO5bs8eoCxu0Uqh/openSAP-HA-LabEnv-Diagram.png
[4] https://blogs.sap.com/2021/03/13/setting-up-suse-high-availability-cluster-quick-start-demo/
Regards,
bartk
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/