On 7/8/20 8:24 PM, Strahil Nikolov wrote:
> Erm... network/firewall is always "green". Run tcpdump on Host1 and VM2
> (not on the same host).
> Then run again 'fence_xvm -o list' and check what is captured.
>
> In summary, you need:
> - key deployed on the Hypervisors
> - key deployed on the VMs
> - fence_virtd running on both Hypervisors
> - Firewall opened (1229/udp for the hosts, 1229/tcp for the guests)
> - fence_xvm on both VMs
>
> In your case, the primary suspect is multicast traffic.

Or just a simple port-access issue ... firewalld is not the only way you can
set up some kind of firewall on your local machine. iptables.service might be
active, for instance.

I have no personal experience with multiple hosts & fence_xvm. So when you
have solved your primary issue you might still consider running 2 parallel
setups. I've read about a recommendation to do so and I have a vague memory
of an email thread describing some issues.
Can anybody here state that multiple hosts with a single multicast IP works
reliably?
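A concrete way to run that check (a sketch only; 225.0.0.12 and port 1229 are
the values used elsewhere in this thread, and "any" can be replaced by the
bridge interface the VMs are attached to):

# tcpdump -n -i any host 225.0.0.12 or port 1229     (on the hypervisor)
# fence_xvm -a 225.0.0.12 -o list                    (on the VM running on the *other* host)

If the multicast request never shows up on the hypervisor, it is a network
problem; if it shows up but nothing comes back to the guest on TCP 1229, look
at the fence_virtd side. To rule out a packet filter other than firewalld:

# iptables -L -n -v
# systemctl status iptables.service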
> Best Regards,
> Strahil Nikolov
>
> On 8 July 2020 16:33:45 GMT+03:00, "stefan.schm...@farmpartner-tec.com"
> <stefan.schm...@farmpartner-tec.com> wrote:
>> Hello,
>>
>>> I can't find fence_virtd for Ubuntu18, but it is available for
>>> Ubuntu20.
>>
>> We have now upgraded our server to Ubuntu 20.04 LTS and installed the
>> packages fence-virt and fence-virtd.
>>
>> The command "fence_xvm -a 225.0.0.12 -o list" on the hosts still just
>> returns the single local VM.
>>
>> The same command on both VMs results in:
>> # fence_xvm -a 225.0.0.12 -o list
>> Timed out waiting for response
>> Operation failed
>>
>> But just as before, trying to connect from the guest to the host via nc
>> works fine:
>> # nc -z -v -u 192.168.1.21 1229
>> Connection to 192.168.1.21 1229 port [udp/*] succeeded!
>>
>> So the hosts and the service are basically reachable.
>>
>> I have spoken to our firewall tech; he has assured me that no local
>> traffic is hindered by anything, be it multicast or not.
>> Software firewalls are not present/active on any of our servers.
>>
>> Ubuntu guests:
>> # ufw status
>> Status: inactive
>>
>> CentOS hosts:
>> # systemctl status firewalld
>> ● firewalld.service - firewalld - dynamic firewall daemon
>>    Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
>>    Active: inactive (dead)
>>      Docs: man:firewalld(1)
>>
>> Any hints or help on how to remedy this problem would be greatly
>> appreciated!
>>
>> Kind regards
>> Stefan Schmitz
>>
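For completeness, the "key deployed" and "firewall opened" items from the
checklist quoted above usually amount to something like the following sketch.
/etc/cluster/fence_xvm.key is the common default key path (an assumption
here, not confirmed in this thread), <other-hypervisor> and <each-vm> are
placeholders, and the firewall-cmd lines only apply where firewalld is
actually running, which it is not in this setup:

# mkdir -p /etc/cluster                                  (on one hypervisor)
# dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=512 count=1
# scp /etc/cluster/fence_xvm.key root@<other-hypervisor>:/etc/cluster/
# scp /etc/cluster/fence_xvm.key root@<each-vm>:/etc/cluster/
# systemctl restart fence_virtd                          (on both hypervisors)

# firewall-cmd --permanent --add-port=1229/udp           (hypervisors)
# firewall-cmd --permanent --add-port=1229/tcp           (guests)
# firewall-cmd --reload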
>> On 07.07.2020 at 10:54, Klaus Wenninger wrote:
>>> On 7/7/20 10:33 AM, Strahil Nikolov wrote:
>>>> I can't find fence_virtd for Ubuntu18, but it is available for Ubuntu20.
>>>> Your other option is to get an iSCSI from your quorum system and use that for SBD.
>>>> For watchdog, you can use the 'softdog' kernel module or you can use KVM to present one to the VMs.
>>>> You can also check the '-P' flag for SBD.
>>>
>>> With kvm please use the qemu-watchdog and try to avoid using softdog with SBD.
>>> Especially if you are aiming for a production cluster ...
>>>
>>> Adding something like that to the libvirt XML should do the trick:
>>>     <watchdog model='i6300esb' action='reset'>
>>>       <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
>>>     </watchdog>
>>>
>>>> Best Regards,
>>>> Strahil Nikolov
>>>>
>>>> On 7 July 2020 10:11:38 GMT+03:00, "stefan.schm...@farmpartner-tec.com" <stefan.schm...@farmpartner-tec.com> wrote:
>>>>>> What does 'virsh list' give you on the 2 hosts? Hopefully different
>>>>>> names for the VMs ...
>>>>>
>>>>> Yes, each host shows its own VM:
>>>>>
>>>>> # virsh list
>>>>>  Id   Name     Status
>>>>> ----------------------------------------------------
>>>>>  2    kvm101   running
>>>>>
>>>>> # virsh list
>>>>>  Id   Name     State
>>>>> ----------------------------------------------------
>>>>>  1    kvm102   running
>>>>>
>>>>>> Did you try 'fence_xvm -a {mcast-ip} -o list' on the guests as well?
>>>>>
>>>>> fence_xvm sadly does not work on the Ubuntu guests. The howto said to
>>>>> install "yum install fence-virt fence-virtd", which do not exist as such
>>>>> in Ubuntu 18.04. After we tried to find the appropriate packages we
>>>>> installed "libvirt-clients" and "multipath-tools". Is there maybe
>>>>> something missing or completely wrong?
>>>>> Though we can connect to both hosts using "nc -z -v -u 192.168.1.21 1229",
>>>>> that just works fine.
>>>>>
>>> Without fence-virt you can't expect the whole thing to work.
>>> Maybe you can build it for your Ubuntu version from the sources of
>>> a package for another Ubuntu version if it doesn't exist yet.
>>> Btw., which pacemaker version are you using?
>>> There was a convenience fix on the master branch for at least
>>> a couple of days (sometime during the 2.0.4 release cycle) that
>>> wasn't compatible with fence_xvm.
>>>>>> Usually, the biggest problem is the multicast traffic - as in many
>>>>>> environments it can be dropped by firewalls.
>>>>>
>>>>> To make sure, I have asked our datacenter techs to verify that
>>>>> multicast traffic can move unhindered in our local network. In the
>>>>> past they have confirmed on multiple occasions that local traffic is
>>>>> not filtered in any way, but until now I had never specifically asked
>>>>> about multicast traffic, which I have now done. I am waiting for an
>>>>> answer to that question.
>>>>>
>>>>> kind regards
>>>>> Stefan Schmitz
>>>>>
>>>>> On 06.07.2020 at 11:24, Klaus Wenninger wrote:
>>>>>> On 7/6/20 10:10 AM, stefan.schm...@farmpartner-tec.com wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>>>> # fence_xvm -o list
>>>>>>>>> kvm102     bab3749c-15fc-40b7-8b6c-d4267b9f0eb9     on
>>>>>>>>
>>>>>>>> This should show both VMs, so getting to that point will likely solve
>>>>>>>> your problem. fence_xvm relies on multicast, there could be some
>>>>>>>> obscure network configuration to get that working on the VMs.
>>>>>>
>>>>>> You said you tried on both hosts. What does 'virsh list'
>>>>>> give you on the 2 hosts? Hopefully different names for the VMs ...
>>>>>> Did you try 'fence_xvm -a {mcast-ip} -o list' on the guests as well?
>>>>>> Did you try pinging via the physical network that is connected to the
>>>>>> bridge configured to be used for fencing?
>>>>>> If I got it right fence_xvm should support collecting answers from
>>>>>> multiple hosts, but I found a suggestion to do a setup with 2
>>>>>> multicast-addresses & keys for each host.
>>>>>> Which route did you go?
>>>>>>
>>>>>> Klaus
>>>>>>> Thank you for pointing me in that direction. We have tried to solve
>>>>>>> that but with no success. We were using a howto provided here:
>>>>>>> https://wiki.clusterlabs.org/wiki/Guest_Fencing
>>>>>>>
>>>>>>> Problem is, it specifically states that the tutorial does not yet
>>>>>>> support the case where guests are running on multiple hosts. There are
>>>>>>> some short hints about what might be necessary to do, but working through
>>>>>>> those sadly just did not work, nor were there any clues which would
>>>>>>> help us find a solution ourselves. So now we are completely stuck
>>>>>>> here.
>>>>>>>
>>>>>>> Does someone have the same configuration with guest VMs on multiple hosts?
>>>>>>> And how did you manage to get that to work? What do we need to do to
>>>>>>> resolve this? Is there maybe even someone who would be willing to take
>>>>>>> a closer look at our server? Any help would be greatly appreciated!
>>>>>>>
>>>>>>> Kind regards
>>>>>>> Stefan Schmitz
>>>>>>>
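For reference, the hypervisor-side piece that the multi-host discussion above
revolves around is /etc/fence_virt.conf, normally generated interactively with
"fence_virtd -c". A rough sketch of what it typically looks like on each CentOS
host; the bridge name "br0" is purely a placeholder, and in the two-address
variant mentioned above each hypervisor would get its own "address" and
"key_file", with the guests querying each one via the matching
"fence_xvm -a <address> -k <keyfile> -o list":

# cat /etc/fence_virt.conf
fence_virtd {
        listener = "multicast";
        backend = "libvirt";
}

listeners {
        multicast {
                address = "225.0.0.12";
                port = "1229";
                family = "ipv4";
                interface = "br0";
                key_file = "/etc/cluster/fence_xvm.key";
        }
}

backends {
        libvirt {
                uri = "qemu:///system";
        }
}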
>>>>>>> On 03.07.2020 at 02:39, Ken Gaillot wrote:
>>>>>>>> On Thu, 2020-07-02 at 17:18 +0200, stefan.schm...@farmpartner-tec.com wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> I hope someone can help with this problem. We are (still) trying to get
>>>>>>>>> Stonith working in order to achieve a running active/active HA cluster,
>>>>>>>>> but sadly to no avail.
>>>>>>>>>
>>>>>>>>> There are 2 CentOS hosts. On each one there is a virtual Ubuntu VM. The
>>>>>>>>> Ubuntu VMs are the ones which should form the HA cluster.
>>>>>>>>>
>>>>>>>>> The current status is this:
>>>>>>>>>
>>>>>>>>> # pcs status
>>>>>>>>> Cluster name: pacemaker_cluster
>>>>>>>>> WARNING: corosync and pacemaker node names do not match (IPs used in setup?)
>>>>>>>>> Stack: corosync
>>>>>>>>> Current DC: server2ubuntu1 (version 1.1.18-2b07d5c5a9) - partition with quorum
>>>>>>>>> Last updated: Thu Jul  2 17:03:53 2020
>>>>>>>>> Last change: Thu Jul  2 14:33:14 2020 by root via cibadmin on server4ubuntu1
>>>>>>>>>
>>>>>>>>> 2 nodes configured
>>>>>>>>> 13 resources configured
>>>>>>>>>
>>>>>>>>> Online: [ server2ubuntu1 server4ubuntu1 ]
>>>>>>>>>
>>>>>>>>> Full list of resources:
>>>>>>>>>
>>>>>>>>>  stonith_id_1   (stonith:external/libvirt):   Stopped
>>>>>>>>>  Master/Slave Set: r0_pacemaker_Clone [r0_pacemaker]
>>>>>>>>>      Masters: [ server4ubuntu1 ]
>>>>>>>>>      Slaves: [ server2ubuntu1 ]
>>>>>>>>>  Master/Slave Set: WebDataClone [WebData]
>>>>>>>>>      Masters: [ server2ubuntu1 server4ubuntu1 ]
>>>>>>>>>  Clone Set: dlm-clone [dlm]
>>>>>>>>>      Started: [ server2ubuntu1 server4ubuntu1 ]
>>>>>>>>>  Clone Set: ClusterIP-clone [ClusterIP] (unique)
>>>>>>>>>      ClusterIP:0   (ocf::heartbeat:IPaddr2):   Started server2ubuntu1
>>>>>>>>>      ClusterIP:1   (ocf::heartbeat:IPaddr2):   Started server4ubuntu1
>>>>>>>>>  Clone Set: WebFS-clone [WebFS]
>>>>>>>>>      Started: [ server4ubuntu1 ]
>>>>>>>>>      Stopped: [ server2ubuntu1 ]
>>>>>>>>>  Clone Set: WebSite-clone [WebSite]
>>>>>>>>>      Started: [ server4ubuntu1 ]
>>>>>>>>>      Stopped: [ server2ubuntu1 ]
>>>>>>>>>
>>>>>>>>> Failed Actions:
>>>>>>>>> * stonith_id_1_start_0 on server2ubuntu1 'unknown error' (1): call=201,
>>>>>>>>>   status=Error, exitreason='',
>>>>>>>>>   last-rc-change='Thu Jul  2 14:37:35 2020', queued=0ms, exec=3403ms
>>>>>>>>> * r0_pacemaker_monitor_60000 on server2ubuntu1 'master' (8): call=203,
>>>>>>>>>   status=complete, exitreason='',
>>>>>>>>>   last-rc-change='Thu Jul  2 14:38:39 2020', queued=0ms, exec=0ms
>>>>>>>>> * stonith_id_1_start_0 on server4ubuntu1 'unknown error' (1): call=202,
>>>>>>>>>   status=Error, exitreason='',
>>>>>>>>>   last-rc-change='Thu Jul  2 14:37:39 2020', queued=0ms, exec=3411ms
>>>>>>>>>
>>>>>>>>> The stonith resource is stopped and does not seem to work.
>>>>>>>>> On both hosts the command
>>>>>>>>> # fence_xvm -o list
>>>>>>>>> kvm102     bab3749c-15fc-40b7-8b6c-d4267b9f0eb9     on
>>>>>>>>
>>>>>>>> This should show both VMs, so getting to that point will likely solve
>>>>>>>> your problem. fence_xvm relies on multicast, there could be some
>>>>>>>> obscure network configuration to get that working on the VMs.
>>>>>>>>
>>>>>>>>> returns the local VM. Apparently it connects through the virtualization
>>>>>>>>> interface, because it returns the VM name, not the hostname of the client
>>>>>>>>> VM. I do not know if this is how it is supposed to work?
>>>>>>>>
>>>>>>>> Yes, fence_xvm knows only about the VM names.
>>>>>>>>
>>>>>>>> To get pacemaker to be able to use it for fencing the cluster nodes,
>>>>>>>> you have to add a pcmk_host_map parameter to the fencing resource. It
>>>>>>>> looks like pcmk_host_map="nodename1:vmname1;nodename2:vmname2;..."
>>>>>>>>
>>>>>>>>> In the local network, all traffic is allowed. No firewall is locally
>>>>>>>>> active; only the connections leaving the local network are firewalled.
>>>>>>>>> Hence there are no connection problems between the hosts and clients.
>>>>>>>>> For example, we can successfully connect from the clients to the hosts:
>>>>>>>>> # nc -z -v -u 192.168.1.21 1229
>>>>>>>>> Ncat: Version 7.50 ( https://nmap.org/ncat )
>>>>>>>>> Ncat: Connected to 192.168.1.21:1229.
>>>>>>>>> Ncat: UDP packet sent successfully
>>>>>>>>> Ncat: 1 bytes sent, 0 bytes received in 2.03 seconds.
>>>>>>>>>
>>>>>>>>> # nc -z -v -u 192.168.1.13 1229
>>>>>>>>> Ncat: Version 7.50 ( https://nmap.org/ncat )
>>>>>>>>> Ncat: Connected to 192.168.1.13:1229.
>>>>>>>>> Ncat: UDP packet sent successfully
>>>>>>>>> Ncat: 1 bytes sent, 0 bytes received in 2.03 seconds.
>>>>>>>>>
>>>>>>>>> On the Ubuntu VMs we created and configured the stonith resource
>>>>>>>>> according to the howto provided here:
>>>>>>>>> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/pdf/Clusters_from_Scratch/Pacemaker-1.1-Clusters_from_Scratch-en-US.pdf
>>>>>>>>>
>>>>>>>>> The actual line we used:
>>>>>>>>> # pcs -f stonith_cfg stonith create stonith_id_1 external/libvirt
>>>>>>>>>   hostlist="Host4,host2" hypervisor_uri="qemu+ssh://192.168.1.21/system"
>>>>>>>>>
>>>>>>>>> But as you can see in the pcs status output, stonith is stopped and
>>>>>>>>> exits with an unknown error.
>>>>>>>>>
>>>>>>>>> Can somebody please advise on how to proceed or what additional
>>>>>>>>> information is needed to solve this problem?
>>>>>>>>> Any help would be greatly appreciated! Thank you in advance.
>>>>>>>>>
>>>>>>>>> Kind regards
>>>>>>>>> Stefan Schmitz
>>>>>>>
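Putting Ken's pcmk_host_map advice together with the fence_xvm agent that the
rest of the thread is built around (instead of external/libvirt), the stonith
resource would presumably look roughly like the sketch below. The resource
name "fence_vms", the key and multicast values, and in particular the
node-to-VM mapping are placeholders, not values confirmed in this thread;
check with 'virsh list' which VM actually carries which cluster node:

# pcs stonith create fence_vms fence_xvm \
      multicast_address=225.0.0.12 key_file=/etc/cluster/fence_xvm.key \
      pcmk_host_map="server2ubuntu1:kvm101;server4ubuntu1:kvm102" \
      op monitor interval=60s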
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/