>>> Andrei Borzenkov <arvidj...@gmail.com> wrote on 28.10.2021 at 09:58 in message
<CAA91j0Wptn=2v_vnn84cyilam9beb4yc3uqfcuy4tttuhwk...@mail.gmail.com>:
> On Thu, Oct 28, 2021 at 10:30 AM Ulrich Windl
> <ulrich.wi...@rz.uni-regensburg.de> wrote:
>>
>> Fencing _is_ a part of failover!
>>
>
> As any blanket answer this is mostly incorrect in this context.

If I read the logs correctly, a monitoring operation timed out, and as a consequence the corresponding node would be fenced. So the resource would fail over to another node.

>
> There are two separate objects here - remote host itself and pacemaker
> resource used to connect to and monitor state of remote host.
>
> Remote host itself does not failover. Resources on this host do, but
> OP does not ask about it.

Then I missed that detail.

>
> Pacemaker resource used to monitor remote host may failover as any
> other cluster resource. This failover does not require any fencing *of
> remote host itself*, and in this particular case connection between
> two cluster nodes was present all the time (at least, as long as we
> can believe logs) so there was no reason for fencing as well. Whether
> pacemaker should attempt to failover this resource to another node if
> connection to remote host fails, is subject to discussion.
>
> So fencing of the remote host itself is most certainly *not* part of
> the failover of the resource that monitors this remote host.

I just treated the resources as a black box, not looking at what they do.

Regards,
Ulrich

>
>>
>>> "Janghyuk Boo" <janghyuk....@ibm.com> wrote on 26.10.2021 at 22:09 in
>> message
>> <of6751af09.dd2c657c-on0025877a.006ea8cb-0025877a.006eb...@ibm.com>:
>> Dear Community,
>> Thank you Ken for your reply last time.
>> I attached the log messages as requested from the last thread.
>> I have a Pacemaker cluster with two cluster nodes with two network interfaces
>> each, and two remote nodes and a prototyped fencing agent (GPFS-Fence) to cut a
>> host's access from the clustered filesystem.
>> I noticed that remote node gets fenced when the quorum node it's connected to
>> gets fenced or experiences network failure.
>> For example, when I disconnected srv-2 from the rest of the cluster by using
>> iptables on srv-2
>> iptables -A INPUT -s [IP of srv-1] -j DROP ; iptables -A OUTPUT -s [IP of srv-1] -j DROP
>> iptables -A INPUT -s [IP of srv-3] -j DROP ; iptables -A OUTPUT -s [IP of srv-3] -j DROP
>> iptables -A INPUT -s [IP of srv-4] -j DROP ; iptables -A OUTPUT -s [IP of srv-4] -j DROP
>> I expected that remote node jangcluster-srv-4 would failover to srv-1 given my
>> location constraints,
>> but remote node’s monitor ‘jangcluster-srv-4_monitor’ failed and srv-4 was
>> getting fenced before attempting to failover.
>> What would be the proper way to simulate the network failover?
>> How can I configure the cluster so that remote node srv-4 fails over instead
>> of getting fenced?
>>
>> _______________________________________________
>> Manage your subscription:
>> https://lists.clusterlabs.org/mailman/listinfo/users
>>
>> ClusterLabs home: https://www.clusterlabs.org/
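
A side note on the iptables test quoted above: in the OUTPUT chain the packets originate on srv-2 itself, so a rule that matches on source (-s) with a peer's address would normally match nothing, and only the inbound direction ends up dropped. Below is a minimal sketch of a symmetric variant, assuming a full two-way cut was the intent. The function name isolation_rules and the addresses in the usage note are made up for illustration, and the function only prints the rules instead of applying them.

```shell
#!/bin/sh
# Print (do not apply) iptables rules that drop traffic to and from each peer.
# Peer addresses are passed as arguments; nothing here touches the firewall.
isolation_rules() {
    for peer in "$@"; do
        # Inbound packets carry the peer's address as their source.
        echo "iptables -A INPUT -s $peer -j DROP"
        # Outbound packets carry the peer's address as their destination.
        echo "iptables -A OUTPUT -d $peer -j DROP"
    done
}
```

To try it, review the printed rules first, e.g. `isolation_rules 192.0.2.1 192.0.2.3`, and only then pipe them to `sh` as root on the node to be isolated. The addresses are documentation placeholders, not the real cluster IPs.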