On 7/9/20 8:18 PM, Vladislav Bogdanov wrote:
> Hi.
>
> This thread is getting too long.
>
> First, you need to ensure that your switch (or all switches in the
> path) have IGMP snooping enabled on host ports (and probably on the
> interconnects along the path between your hosts).
>
> Second, you need an IGMP querier to be enabled somewhere nearby (it is
> better to have it enabled on the switch itself). Please verify that
> you see its queries on the hosts.
>
> Next, you probably need to make your hosts use IGMPv2 (not v3), as
> many switches still cannot understand v3. This is doable via sysctl;
> there are many articles on the internet.

Switch configuration might be in the way as well, but since the problem
exists in the communication between a host and the guest running on that
host, I would rather bet on firewall rules on the host(s).

> This advice also applies to running corosync itself in
> multicast mode.
>
> Best,
> Vladislav
>
> Thu, 02/07/2020 at 17:18 +0200, stefan.schm...@farmpartner-tec.com
> wrote:
>> Hello,
>>
>> I hope someone can help with this problem. We are (still) trying to
>> get Stonith working to achieve a running active/active HA cluster,
>> but sadly to no avail.
>>
>> There are 2 CentOS hosts. On each one there is a virtual Ubuntu VM.
>> The Ubuntu VMs are the ones which should form the HA cluster.
>>
>> The current status is this:
>>
>> # pcs status
>> Cluster name: pacemaker_cluster
>> WARNING: corosync and pacemaker node names do not match (IPs used in
>> setup?)
>> Stack: corosync
>> Current DC: server2ubuntu1 (version 1.1.18-2b07d5c5a9) - partition
>> with quorum
>> Last updated: Thu Jul  2 17:03:53 2020
>> Last change: Thu Jul  2 14:33:14 2020 by root via cibadmin on
>> server4ubuntu1
>>
>> 2 nodes configured
>> 13 resources configured
>>
>> Online: [ server2ubuntu1 server4ubuntu1 ]
>>
>> Full list of resources:
>>
>>  stonith_id_1   (stonith:external/libvirt):     Stopped
>>  Master/Slave Set: r0_pacemaker_Clone [r0_pacemaker]
>>      Masters: [ server4ubuntu1 ]
>>      Slaves: [ server2ubuntu1 ]
>>  Master/Slave Set: WebDataClone [WebData]
>>      Masters: [ server2ubuntu1 server4ubuntu1 ]
>>  Clone Set: dlm-clone [dlm]
>>      Started: [ server2ubuntu1 server4ubuntu1 ]
>>  Clone Set: ClusterIP-clone [ClusterIP] (unique)
>>      ClusterIP:0  (ocf::heartbeat:IPaddr2):  Started server2ubuntu1
>>      ClusterIP:1  (ocf::heartbeat:IPaddr2):  Started server4ubuntu1
>>  Clone Set: WebFS-clone [WebFS]
>>      Started: [ server4ubuntu1 ]
>>      Stopped: [ server2ubuntu1 ]
>>  Clone Set: WebSite-clone [WebSite]
>>      Started: [ server4ubuntu1 ]
>>      Stopped: [ server2ubuntu1 ]
>>
>> Failed Actions:
>> * stonith_id_1_start_0 on server2ubuntu1 'unknown error' (1):
>>   call=201, status=Error, exitreason='',
>>   last-rc-change='Thu Jul  2 14:37:35 2020', queued=0ms, exec=3403ms
>> * r0_pacemaker_monitor_60000 on server2ubuntu1 'master' (8):
>>   call=203, status=complete, exitreason='',
>>   last-rc-change='Thu Jul  2 14:38:39 2020', queued=0ms, exec=0ms
>> * stonith_id_1_start_0 on server4ubuntu1 'unknown error' (1):
>>   call=202, status=Error, exitreason='',
>>   last-rc-change='Thu Jul  2 14:37:39 2020', queued=0ms, exec=3411ms
>>
>>
>> The stonith resource is stopped and does not seem to work.
>> On both hosts the command
>>
>> # fence_xvm -o list
>> kvm102 bab3749c-15fc-40b7-8b6c-d4267b9f0eb9 on
>>
>> returns the local VM. Apparently it connects through the
>> virtualization interface, because it returns the VM name, not the
>> hostname of the client VM.
>> I do not know if this is how it is supposed to work?
>>
>> In the local network, all traffic is allowed. No firewall is active
>> locally; only connections leaving the local network are firewalled.
>> Hence there are no connection problems between the hosts and clients.
>> For example, we can successfully connect from the clients to the
>> hosts:
>>
>> # nc -z -v -u 192.168.1.21 1229
>> Ncat: Version 7.50 ( https://nmap.org/ncat )
>> Ncat: Connected to 192.168.1.21:1229.
>> Ncat: UDP packet sent successfully
>> Ncat: 1 bytes sent, 0 bytes received in 2.03 seconds.
>>
>> # nc -z -v -u 192.168.1.13 1229
>> Ncat: Version 7.50 ( https://nmap.org/ncat )
>> Ncat: Connected to 192.168.1.13:1229.
>> Ncat: UDP packet sent successfully
>> Ncat: 1 bytes sent, 0 bytes received in 2.03 seconds.
>>
>> On the Ubuntu VMs we created and configured the stonith resource
>> according to the howto provided here:
>> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/pdf/Clusters_from_Scratch/Pacemaker-1.1-Clusters_from_Scratch-en-US.pdf
>>
>> The actual line we used:
>>
>> # pcs -f stonith_cfg stonith create stonith_id_1 external/libvirt \
>>     hostlist="Host4,host2" \
>>     hypervisor_uri="qemu+ssh://192.168.1.21/system"
>>
>> But as you can see in the pcs status output, stonith is stopped and
>> exits with an unknown error.
>>
>> Can somebody please advise on how to proceed, or what additional
>> information is needed to solve this problem?
>> Any help would be greatly appreciated! Thank you in advance.
>>
>> Kind regards
>> Stefan Schmitz
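The IGMPv2 sysctl Vladislav refers to can be sketched roughly as below. This is a sketch under assumptions: the sysctl.d file name is arbitrary, and whether forcing v2 is needed at all depends on your switches.

```shell
# Force the kernel to speak IGMPv2 instead of IGMPv3
# (0 = kernel default, 2 = force IGMPv2, 3 = force IGMPv3).
# "all" and "default" together cover existing and future interfaces.
sysctl -w net.ipv4.conf.all.force_igmp_version=2
sysctl -w net.ipv4.conf.default.force_igmp_version=2

# Persist across reboots (file name is an assumption; any sysctl.d file works):
echo 'net.ipv4.conf.all.force_igmp_version = 2' > /etc/sysctl.d/90-igmpv2.conf

# Verify: the V column of /proc/net/igmp shows the version in use per interface.
cat /proc/net/igmp
```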
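Since fence_xvm talks to fence_virtd over multicast (225.0.0.12 port 1229/udp by default), the nc test above only proves unicast UDP works; the multicast path between guest and host can be checked separately. A rough sketch of such a check, assuming default fence_virtd settings and a bridge named br0 (the interface names are assumptions, adjust to your setup):

```shell
# On the Ubuntu guest: request the domain list with debug output, so you
# can see whether the multicast request goes out and whether a reply arrives.
fence_xvm -o list -d

# On the CentOS host: confirm the guest's multicast request actually arrives.
tcpdump -i any -n udp port 1229

# Watch IGMP traffic on the bridge to check that a querier exists and
# that the guest joined the fence_virtd multicast group:
tcpdump -i br0 -n igmp

# Rule out host firewall rules silently dropping the multicast traffic:
firewall-cmd --list-all
iptables -L -n -v
```

If the request from the guest never shows up in tcpdump on the host, the problem is in the multicast path (bridge, IGMP snooping, or firewall) rather than in the stonith resource configuration itself.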
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/