Hello Ken,

Thank you. But if I have a two-node cluster and a working fencing
mechanism, wouldn't it be enough to disable the corosync and pacemaker
services on both nodes, so that when a node is fenced it won't come
back up into the cluster on its own?

Thank you!
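(Concretely, I mean leaving the cluster stack disabled at boot,
something like:

    systemctl disable corosync pacemaker

on both nodes, so that after being fenced a node boots up but stays out
of the cluster until someone starts it again by hand, e.g. with
'pcs cluster start'.)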
On Mon, 18.03.2019 at 16:19, Ken Gaillot <kgail...@redhat.com> wrote:
> On Sat, 2019-03-16 at 11:10 +0100, Adam Budziński wrote:
>> Hello Andrei,
>>
>> Ok, I see your point. So per my understanding, if the resource (in
>> this case fence_vmware) is started successfully, it will be
>> monitored indefinitely, but as you said, it will monitor the current
>> active node. So how does the fence agent become aware of problems
>> with the slave? I
>
> The fence agent doesn't monitor the active node, or any node -- it
> monitors the fence device.
>
> The cluster layer (i.e. corosync) monitors all nodes, and reports any
> issues to pacemaker, which will initiate fencing if necessary.
>
> Pacemaker also monitors each resource and fence device, via any
> recurring monitors that have been configured.
>
>> mean, if in a two-node cluster the cluster splits into two
>> partitions, will each of them fence the other? Or does that happen
>> because both will assume they are the only survivor and thus need to
>> fence the other node, which is in an unknown state, so to say?
>
> If both nodes are functional but can't see each other, they will each
> want to initiate fencing. If one of them is quicker than the other to
> determine this, the other one will get shot before it has a chance to
> do anything itself.
>
> However, there is the possibility that both nodes will shoot at about
> the same time, resulting in both nodes getting shot (a "stonith death
> match"). This is only a problem in 2-node clusters. There are a few
> ways around this:
>
> 1. Configure two separate fence devices, each targeting one of the
> nodes, and put a delay on one of them (or a random delay on both).
> This makes it highly unlikely that they will shoot at the same time.
>
> 2. Configure a fencing topology with a fence heuristics device plus
> your real device. A fence heuristics device runs some test, and
> refuses to shoot the other node if the test fails. For example,
> fence_heuristics_ping tries to ping an IP address you give it; the
> idea is that if a node can't ping that IP, you don't want it to
> survive. This ensures that only a node that passes the test can shoot
> (which means there still might be some cases where the nodes can both
> shoot each other, and cases where the cluster will freeze because
> neither node can see the IP).
>
> 3. Configure corosync with qdevice to provide true quorum via a third
> host (which doesn't participate in the cluster otherwise).
>
> 4. Use sbd with a hardware watchdog and a shared storage device as
> the fencing device. This is not a reliable option with VMware, but
> I'm listing it for the general case.
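For the archives, my reading is that options 1-3 would look roughly
like this with pcs -- a sketch only, where the device names, vCenter
address, credentials, VM names and ping target are placeholders, and
parameter names may differ between fence-agents and pcs versions:

    # Option 1: one fence device per node, with a delay on one of
    # them so the two nodes cannot shoot at the same moment
    pcs stonith create fence-srv1 fence_vmware_soap \
        ipaddr=vcenter.example.com login=user passwd=secret ssl=1 \
        pcmk_host_map="srv1cr1:VM_SRV1" delay=10
    pcs stonith create fence-srv2 fence_vmware_soap \
        ipaddr=vcenter.example.com login=user passwd=secret ssl=1 \
        pcmk_host_map="srv2cr1:VM_SRV2"

    # Option 2: fencing topology -- the ping heuristic must succeed
    # before the real device is allowed to shoot
    pcs stonith create ping-check fence_heuristics_ping \
        ping_targets=10.116.63.1
    pcs stonith level add 1 srv1cr1 ping-check,fence-srv1
    pcs stonith level add 1 srv2cr1 ping-check,fence-srv2

    # Option 3: true quorum via a third host running corosync-qnetd
    pcs quorum device add model net host=qhost.example.com \
        algorithm=ffsplit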
>> Thank you and Best Regards,
>> Adam
>>
>> On Sat, 16.03.2019 at 07:17, Andrei Borzenkov <arvidj...@gmail.com>
>> wrote:
>>> On 16.03.2019 at 9:01, Adam Budziński wrote:
>>>> Thank you Andrei. The problem is that with 'pcs status' I can see
>>>> that resources are running on srv2cr1, but at the same time it
>>>> tells me that fence_vmware_soap is running on srv1cr1. That's
>>>> somewhat confusing. Could you possibly explain this?
>>>
>>> Two points.
>>>
>>> It is actually logical to have the stonith agent running on a
>>> different node than the node with the active resources -- because
>>> it is the *other* node that will initiate fencing when the node
>>> with the active resources fails.
>>>
>>> But even considering the above, the active (running) state of a
>>> fence (or stonith) agent just determines on which node the
>>> recurring monitor operation will be started. The actual result of
>>> this monitor operation has no impact on a subsequent stonith
>>> attempt and serves just as a warning to the administrator. When a
>>> stonith request comes, the agent may be used by any node where the
>>> stonith agent is not prohibited from running by (co-)location
>>> rules. My understanding is that this node is selected by the DC in
>>> the partition.
>>>
>>>> Thank you!
>>>>
>>>> On Sat, 16.03.2019 at 05:37, Andrei Borzenkov
>>>> <arvidj...@gmail.com> wrote:
>>>>
>>>>> On 16.03.2019 at 1:16, Adam Budziński wrote:
>>>>>> Hi Tomas,
>>>>>>
>>>>>> Ok, but how then does pacemaker or the fence agent know which
>>>>>> route to take to reach the vCenter?
>>>>>
>>>>> They do not know or care at all. It is up to your underlying
>>>>> operating system and its routing tables.
>>>>>
>>>>>> Btw. do I have to add the stonith resource on each of the
>>>>>> nodes, or is it enough to add it on one, as for other
>>>>>> resources?
>>>>>
>>>>> If your fencing agent can (should) be able to run on any node,
>>>>> it should be enough to define it just once, as long as it can
>>>>> properly determine the "port" to use on the fencing "device" for
>>>>> a given node. There are cases when you may want to restrict the
>>>>> fencing agent to only a subset of nodes, or when you are forced
>>>>> to set a unique parameter for each node (consider an IPMI IP
>>>>> address); in those cases you would need a separate instance of
>>>>> the agent for each node.
>>>>>
>>>>>> Thank you!
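(Side note for the archives: the "separate instance per node" case
Andrei describes would look roughly like this with fence_ipmilan --
purely illustrative, with made-up BMC addresses, and not part of this
VMware setup:

    pcs stonith create ipmi-srv1 fence_ipmilan \
        ipaddr=192.168.100.11 login=admin passwd=secret lanplus=1 \
        pcmk_host_list="srv1cr1"
    pcs stonith create ipmi-srv2 fence_ipmilan \
        ipaddr=192.168.100.12 login=admin passwd=secret lanplus=1 \
        pcmk_host_list="srv2cr1"

Each device can fence exactly one node, because each node's IPMI
controller has its own IP address.)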
> > > >>>> > > > >>>> If you are talking about pcmk_host_map, that does not really > > > have > > > >>>> anything to do with network interfaces of cluster nodes. It > > > maps node > > > >>>> names (parts before :) to "ports" of a fence device (parts > > > after :). > > > >>>> Pcs-0.9.x does not support defining custom node names, > > > therefore node > > > >>>> names are the same as ring0 addresses. > > > >>>> > > > >>>> I am not an expert on fence agents / devices, but I'm sure > > > someone else > > > >>>> on this list will be able to help you with configuring fencing > > > for your > > > >>>> cluster. > > > >>>> > > > >>>> > > > >>>> Tomas > > > >>>> > > > >>>>> > > > >>>>> Thank you! > > > >>>>> > > > >>>>> pt., 15.03.2019, 13:14 użytkownik Tomas Jelinek < > > > tojel...@redhat.com > > > >>>>> <mailto:tojel...@redhat.com>> napisał: > > > >>>>> > > > >>>>> Dne 15. 03. 19 v 12:32 Adam Budziński napsal(a): > > > >>>>> > Hello Folks,____ > > > >>>>> > > > > >>>>> > __ __ > > > >>>>> > > > > >>>>> > Tow node active/passive VMware VM cluster.____ > > > >>>>> > > > > >>>>> > __ __ > > > >>>>> > > > > >>>>> > /etc/hosts____ > > > >>>>> > > > > >>>>> > __ __ > > > >>>>> > > > > >>>>> > 10.116.63.83 srv1____ > > > >>>>> > > > > >>>>> > 10.116.63.84 srv2____ > > > >>>>> > > > > >>>>> > 172.16.21.12 srv2cr1____ > > > >>>>> > > > > >>>>> > 172.16.22.12 srv2cr2____ > > > >>>>> > > > > >>>>> > 172.16.21.11 srv1cr1____ > > > >>>>> > > > > >>>>> > 172.16.22.11 srv1cr2____ > > > >>>>> > > > > >>>>> > __ __ > > > >>>>> > > > > >>>>> > __ __ > > > >>>>> > > > > >>>>> > I have 3 NIC’s on each VM:____ > > > >>>>> > > > > >>>>> > __ __ > > > >>>>> > > > > >>>>> > 10.116.63.83 srv1 and 10.116.63.84 srv2 are > > > networks used > > > >>>> to > > > >>>>> > access the VM’s via SSH or any resource directly if > > > not via a > > > >>>>> VIP.____ > > > >>>>> > > > > >>>>> > __ __ > > > >>>>> > > > > >>>>> > Everything with cr in its name is used for corosync > > > >>>>> communication, so > > > >>>>> > basically I have two rings (this are two no routable > > > networks > > > >>>>> just for > > > >>>>> > that).____ > > > >>>>> > > > > >>>>> > __ __ > > > >>>>> > > > > >>>>> > My questions are:____ > > > >>>>> > > > > >>>>> > __ __ > > > >>>>> > > > > >>>>> > __1.__With ‘pcs cluster auth’ which interface / > > > interfaces > > > >> should > > > >>>>> I use > > > >>>>> > ?____ > > > >>>>> > > > >>>>> Hi Adam, > > > >>>>> > > > >>>>> I can see you are using pcs-0.9.x. In that case you > > > should do: > > > >>>>> pcs cluster auth srv1cr1 srv2cr1 > > > >>>>> > > > >>>>> In other words, use the first address of each node. > > > >>>>> Authenticating all the other addresses should not cause > > > any issues. > > > >>>> It > > > >>>>> is pointless, though, as pcs only communicates via ring0 > > > addresses. > > > >>>>> > > > >>>>> > > > > >>>>> > __2.__With ‘pcs cluster setup –name’ I would use the > > > corosync > > > >>>>> interfaces > > > >>>>> > e.g. ‘pcs cluster setup –name MyCluster > > > srv1cr1,srv1cr2 > > > >>>>> srv2cr1,srv2cr2’ > > > >>>>> > right ?____ > > > >>>>> > > > >>>>> Yes, that is correct. > > > >>>>> > > > >>>>> > > > > >>>>> > __3.__With fence_vmware_soap > > > >> inpcmk_host_map="X:VM_C;X:VM:OTRS_D" > > > >>>>> which > > > >>>>> > interface should replace X ?____ > > > >>>>> > > > >>>>> X should be replaced by node names as seen by pacemaker. > > > Once you > > > >>>>> set up > > > >>>>> and start your cluster, run 'pcs status' to get (amongs > > > other info) > > > >>>> the > > > >>>>> node names. 
>>>>>>>>>>
>>>>>>>>>> Thank you!
>
> --
> Ken Gaillot <kgail...@redhat.com>
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/