Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue
On Fri, 2021-11-05 at 11:22 +0300, Andrei Borzenkov wrote:
> On 05.11.2021 01:20, Ken Gaillot wrote:
> > > There are two issues discussed in this thread.
> > >
> > > 1. Remote node is fenced when connection with this node is lost.
> > > For all I can tell this is intended and expected behavior. That
> > > was the original question.
> >
> > It's expected only because the connection can't be recovered
> > elsewhere. If another node can run the connection, pacemaker will
> > try to reconnect from there and re-probe everything to make sure
> > what the current state is.
>
> That's not what I see in sources and documentation and not what I
> observe. Pacemaker will reprobe from another node only after
> attempting fencing of remote node.

Ah, you're right, I misremembered. Probe/start failures of a remote
connection don't require fencing, but recurring monitor failures do. I
guess that makes sense; otherwise recovery of resources on a failed
remote could be greatly delayed. I was confusing that with the case
where the connection host is lost and has to be fenced, in which case
the connection will be recovered elsewhere if possible, without
fencing the remote.

> The difference seems to be the reconnect_interval parameter. If it
> is present in the remote resource definition, pacemaker will not
> proceed after failed fencing.
>
> As there is no real documentation on how it is supposed to work, I
> do not know whether all of this is a bug or not. But one thing is
> certain: when the connection to a remote node is lost, the first
> thing pacemaker does is fence it, and only then initiate any
> recovery action.

reconnect_interval is implemented as a sort of special case of
failure-timeout. When the interval expires, the connection failure is
timed out, so the cluster no longer sees a need for fencing. It's not
a bug, but maybe a questionable design.
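[The failure-timeout analogy can be made concrete. A side-by-side sketch in crm shell syntax, with invented resource names, purely for illustration:

    # Remote connection: after 30s the recorded connection failure
    # expires, and with it the perceived need for fencing
    primitive remote1 ocf:pacemaker:remote \
            params server=remote1 reconnect_interval=30s \
            op monitor interval=30s

    # The general mechanism it resembles, on an ordinary resource
    primitive dummy1 ocf:pacemaker:Dummy \
            op monitor interval=10s \
            meta failure-timeout=30s

In both cases, once the interval has elapsed since the failure, the
cluster treats the failure as expired when computing what to do next.]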
That's a case of a broader problem: if the cause for fencing goes
away, the cluster will stop trying fencing and act as if nothing was
wrong. This can be a good thing; for example, a brief network
interruption can sometimes heal without fencing. However, it's been
suggested (e.g. CLBZ#5476) that we need the concept of fencing being
required independently of conditions -- i.e., for certain types of
failure, fencing should be considered required until it succeeds,
regardless of whether the original need for it goes away.
--
Ken Gaillot

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue
On 05.11.2021 01:20, Ken Gaillot wrote:
>>
>> There are two issues discussed in this thread.
>>
>> 1. Remote node is fenced when connection with this node is lost. For
>> all I can tell this is intended and expected behavior. That was the
>> original question.
>
> It's expected only because the connection can't be recovered elsewhere.
> If another node can run the connection, pacemaker will try to reconnect
> from there and re-probe everything to make sure what the current state
> is.
>

That's not what I see in sources and documentation and not what I
observe. Pacemaker will reprobe from another node only after attempting
fencing of remote node.

Documentation (pacemaker remote):

reconnect_interval: If this is a positive time interval, the cluster
will attempt to reconnect to a remote node after an active connection
has been lost at this interval. Otherwise, the cluster will attempt to
reconnect immediately (*after any fencing needed*).

Config:

node 1: ha1 \
        attributes pingd=1 \
        utilization cpu=20
node 2: ha2 \
        attributes pingd=1 \
        utilization cpu=20
primitive qnetd ocf:pacemaker:remote \
        params reconnect_interval=30s \
        op monitor timeout=5s interval=10s \
        meta target-role=Started
primitive stonith_sbd stonith:external/sbd \
        op monitor interval=3600
property cib-bootstrap-options: \
        cluster-infrastructure=corosync \
        cluster-name=ha \
        dc-version="2.1.0+20210816.c6a4f6e6c-1.1-2.1.0+20210816.c6a4f6e6c" \
        last-lrm-refresh=1635607816 \
        stonith-enabled=true \
        have-watchdog=true \
        stonith-watchdog-timeout=0 \
        placement-strategy=balanced

Logs (skipping CIB updates; remote node active on ha1, ha2 is DC):

Nov 05 10:31:26.742 ha1 pacemaker-controld [3246] (monitor_timeout_cb@controld_remote_ra.c:474) info: Timed out waiting for remote poke response from qnetd
Nov 05 10:31:26.742 ha1 pacemaker-controld [3246] (process_lrm_event@controld_execd.c:2826) error: Result of monitor operation for qnetd on ha1: Timed Out | call=3 key=qnetd_monitor_1 timeout=5000ms
Nov 05 10:31:26.742 ha1 pacemaker-controld [3246] (lrmd_api_disconnect@lrmd_client.c:1640) info: Disconnecting TLS qnetd executor connection
Nov 05 10:31:26.742 ha1 pacemaker-controld [3246] (lrmd_tls_connection_destroy@lrmd_client.c:562) info: TLS connection destroyed
Nov 05 10:31:26.742 ha1 pacemaker-controld [3246] (remote_lrm_op_callback@controld_remote_ra.c:578) error: Lost connection to Pacemaker Remote node qnetd
Nov 05 10:31:26.742 ha1 pacemaker-controld [3246] (lrmd_api_disconnect@lrmd_client.c:1640) info: Disconnecting TLS qnetd executor connection
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (determine_online_status_fencing@unpack.c:1434) info: Node ha2 is active
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (determine_online_status@unpack.c:1574) info: Node ha2 is online
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (determine_online_status_fencing@unpack.c:1434) info: Node ha1 is active
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (determine_online_status@unpack.c:1574) info: Node ha1 is online
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (unpack_rsc_op_failure@unpack.c:3022) warning: Unexpected result (error) was recorded for monitor of qnetd on ha1 at Nov 5 10:31:26 2021 | rc=1 id=qnetd_last_failure_0
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (unpack_rsc_op_failure@unpack.c:3117) notice: qnetd will not be started under current conditions
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (pe_fence_node@unpack.c:143) warning: Remote node qnetd will be fenced: remote connection is unrecoverable
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (pcmk__native_allocate@pcmk_sched_native.c:626) info: Resource qnetd cannot run anywhere
...
Nov 05 10:31:26.777 ha2 pacemaker-schedulerd[3313] (stage6@pcmk_sched_allocate.c:1634) warning: Scheduling Node qnetd for STONITH
Nov 05 10:31:26.777 ha2 pacemaker-schedulerd[3313] (log_list_item@output_log.c:198) notice: Actions: Fence (reboot) qnetd 'remote connection is unrecoverable'
Nov 05 10:31:26.777 ha2 pacemaker-schedulerd[3313] (rsc_action_default@pcmk_output.c:928) info: Leave stonith_sbd (Started ha2)
Nov 05 10:31:26.777 ha2 pacemaker-schedulerd[3313] (log_list_item@output_log.c:198) notice: Actions: Stop qnetd ( ha1 ) due to node availability
...
Nov 05 10:31:26.869 ha2 pacemaker-controld [3314] (te_rsc_command@controld_te_actions.c:320) notice: Initiating stop operation qnetd_stop_0 on ha1 | action 4
Nov 05 10:31:26.869 ha2 pacemaker-controld [3314] (te_fence_node@controld_fencing.c:869) notice: Requesting fencing (reboot) of node qnetd | action=3 timeout=6
Nov 05 10:31:26.869 ha2
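[As an aside: whether the fencing request at the end of this excerpt
actually completed can be checked from the fencer's history. Assuming
Pacemaker 2.x command-line tools (exact options and output vary by
version), something like:

    # Fencing actions recorded against the remote node
    stonith_admin --history qnetd --verbose

    # Or include fence history in a one-shot status display
    crm_mon --one-shot --fence-history=2

These are read-only queries and safe to run on a live cluster.]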
Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue
On Sat, 2021-10-30 at 21:17 +0300, Andrei Borzenkov wrote:
> On 29.10.2021 18:37, Ken Gaillot wrote:
> ...
> > > > > To address the original question, this is the log sequence I
> > > > > find most relevant:
> > > > >
> > > > > > Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
> > > > > > (unpack_rsc_op_failure) warning: Unexpected result (error) was
> > > > > > recorded for monitor of jangcluster-srv-4 on jangcluster-srv-2 at Oct
> > > > > > 22 12:21:09 2021 | rc=1 id=jangcluster-srv-4_last_failure_0
> > > > > > Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
> > > > > > (unpack_rsc_op_failure) notice: jangcluster-srv-4 will not be
> > > > > > started under current conditions
> > > > > > Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
> > > > > > (pe_fence_node) warning: Remote node jangcluster-srv-4
> > > > > > will be fenced: remote connection is unrecoverable
> > > > >
> > > > > The "will not be started" is why the node had to be fenced.
> > > > > There was
> > > >
> > > > OK so it implies that remote resource should fail over if
> > > > connection to remote node fails. Thank you, that was not
> > > > exactly clear from documentation.
> > > >
> > > > > nowhere to recover the connection. I'd need to see the CIB
> > > > > from that time to know why; it's possible you had an old
> > > > > constraint banning the connection from the other node (e.g.
> > > > > from a ban or move command), or something like that.
> > > >
> > > > Hmm ... looking in (current) sources it seems this message is
> > > > emitted only in case of on-fail=stop operation property ...
> > >
> > > Well ...
> > > /* For remote nodes, ensure that any failure that results in dropping an
> > >  * active connection to the node results in fencing of the node.
> > >  *
> > >  * There are only two action failures that don't result in fencing.
> > >  * 1. probes - probe failures are expected.
> > >  * 2. start - a start failure indicates that an active connection does not already
> > >  * exist. The user can set op on-fail=fence if they really want to fence start
> > >  * failures. */
> > >
> > > pacemaker will forcibly set on-fail=stop for remote resource.
> >
> > The default isn't any different, it's on-fail=restart.
> >
> > At that point in the code, on-fail is not what the user set (or
> > default), but how the result should be handled, taking into account
> > what the user set. E.g. if the result is success, then on-fail is
> > set to ignore because nothing needs to be done, regardless of what
> > the configured on-fail is.
>
> There are two issues discussed in this thread.
>
> 1. Remote node is fenced when connection with this node is lost. For
> all I can tell this is intended and expected behavior. That was the
> original question.

It's expected only because the connection can't be recovered
elsewhere. If another node can run the connection, pacemaker will try
to reconnect from there and re-probe everything to make sure what the
current state is.

> 2. Remote resource appears to not fail over. I cannot reproduce it,
> but then we also do not have the complete CIB, so something may
> affect it. OTOH logs shown stop before fencing has possibly
> succeeded, so may be remote resource *did* fail over.
>
> What I see is - connection to remote node is lost, pacemaker fences
> remote node and attempts to restart remote resource, if this is
> unsuccessful (meaning - connection still could not be established)
> remote resource fails over to another node.
> I do not know if it is possible to avoid fencing of remote node
> under described conditions.
>
> What is somewhat interesting (and looks like a bug) - in my testing
> pacemaker ignored failed fencing attempt and proceeded with
> restarting of remote resource. Is it expected behavior?

I don't see a failed fencing attempt (or any result of the fencing
attempt) in the logs in the original message, only failures of the
connection monitor.
--
Ken Gaillot
Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue
On 29.10.2021 18:37, Ken Gaillot wrote:
...
> To address the original question, this is the log sequence I find
> most relevant:
>
> > Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
> > (unpack_rsc_op_failure) warning: Unexpected result (error) was
> > recorded for monitor of jangcluster-srv-4 on jangcluster-srv-2 at Oct
> > 22 12:21:09 2021 | rc=1 id=jangcluster-srv-4_last_failure_0
> > Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
> > (unpack_rsc_op_failure) notice: jangcluster-srv-4 will not be
> > started under current conditions
> > Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
> > (pe_fence_node) warning: Remote node jangcluster-srv-4
> > will be fenced: remote connection is unrecoverable
>
> The "will not be started" is why the node had to be fenced. There was

>>> OK so it implies that remote resource should fail over if
>>> connection to remote node fails. Thank you, that was not exactly
>>> clear from documentation.

> nowhere to recover the connection. I'd need to see the CIB from that
> time to know why; it's possible you had an old constraint banning the
> connection from the other node (e.g. from a ban or move command), or
> something like that.

>>> Hmm ... looking in (current) sources it seems this message is
>>> emitted only in case of on-fail=stop operation property ...
>>
>> Well ...
>>
>> /* For remote nodes, ensure that any failure that results in dropping an
>>  * active connection to the node results in fencing of the node.
>>  *
>>  * There are only two action failures that don't result in fencing.
>>  * 1. probes - probe failures are expected.
>>  * 2. start - a start failure indicates that an active connection does not already
>>  * exist. The user can set op on-fail=fence if they really want to fence start
>>  * failures. */
>>
>> pacemaker will forcibly set on-fail=stop for remote resource.
> The default isn't any different, it's on-fail=restart.
>
> At that point in the code, on-fail is not what the user set (or
> default), but how the result should be handled, taking into account
> what the user set. E.g. if the result is success, then on-fail is set
> to ignore because nothing needs to be done, regardless of what the
> configured on-fail is.

There are two issues discussed in this thread.

1. Remote node is fenced when connection with this node is lost. For
all I can tell this is intended and expected behavior. That was the
original question.

2. Remote resource appears to not fail over. I cannot reproduce it,
but then we also do not have the complete CIB, so something may affect
it. OTOH the logs show a stop before fencing had possibly succeeded,
so maybe the remote resource *did* fail over.

What I see is: connection to remote node is lost, pacemaker fences the
remote node and attempts to restart the remote resource; if this is
unsuccessful (meaning the connection still could not be established),
the remote resource fails over to another node.

I do not know if it is possible to avoid fencing of the remote node
under the described conditions.

What is somewhat interesting (and looks like a bug): in my testing,
pacemaker ignored a failed fencing attempt and proceeded with
restarting the remote resource. Is that expected behavior?
Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue
On Fri, 2021-10-29 at 18:18 +0300, Andrei Borzenkov wrote:
> On 29.10.2021 18:16, Andrei Borzenkov wrote:
> > On 29.10.2021 17:53, Ken Gaillot wrote:
> > > On Fri, 2021-10-29 at 13:59, Gerry R Sommerville wrote:
> > > > Hey Andrei,
> > > >
> > > > Thanks for your response again. The cluster nodes and remote
> > > > hosts each share two networks, however there is no routing
> > > > between them. I don't suppose there is a configuration
> > > > parameter we can set to tell Pacemaker to try communicating
> > > > with the remotes using multiple IP addresses?
> > > >
> > > > Gerry Sommerville
> > > > E-mail: ge...@ca.ibm.com
> > >
> > > Hi,
> > >
> > > No, but you can use bonding if you want to have interface
> > > redundancy for a remote connection. To be clear, there is no
> > > requirement that remote nodes and cluster nodes have the same
> > > level of redundancy, it's just a design choice.
> > >
> > > To address the original question, this is the log sequence I find
> > > most relevant:
> > >
> > > > Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
> > > > (unpack_rsc_op_failure) warning: Unexpected result (error) was
> > > > recorded for monitor of jangcluster-srv-4 on jangcluster-srv-2 at Oct
> > > > 22 12:21:09 2021 | rc=1 id=jangcluster-srv-4_last_failure_0
> > > > Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
> > > > (unpack_rsc_op_failure) notice: jangcluster-srv-4 will not be
> > > > started under current conditions
> > > > Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
> > > > (pe_fence_node) warning: Remote node jangcluster-srv-4
> > > > will be fenced: remote connection is unrecoverable
> > >
> > > The "will not be started" is why the node had to be fenced. There
> > > was
> >
> > OK so it implies that remote resource should fail over if
> > connection to remote node fails. Thank you, that was not exactly
> > clear from documentation.
> >
> > > nowhere to recover the connection. I'd need to see the CIB from
> > > that time to know why; it's possible you had an old constraint
> > > banning the connection from the other node (e.g. from a ban or
> > > move command), or something like that.
> >
> > Hmm ... looking in (current) sources it seems this message is
> > emitted only in case of on-fail=stop operation property ...
>
> Well ...
>
> /* For remote nodes, ensure that any failure that results in dropping an
>  * active connection to the node results in fencing of the node.
>  *
>  * There are only two action failures that don't result in fencing.
>  * 1. probes - probe failures are expected.
>  * 2. start - a start failure indicates that an active connection does not already
>  * exist. The user can set op on-fail=fence if they really want to fence start
>  * failures. */
>
> pacemaker will forcibly set on-fail=stop for remote resource.

The default isn't any different, it's on-fail=restart.

At that point in the code, on-fail is not what the user set (or
default), but how the result should be handled, taking into account
what the user set. E.g. if the result is success, then on-fail is set
to ignore because nothing needs to be done, regardless of what the
configured on-fail is.
--
Ken Gaillot
Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue
On 29.10.2021 18:16, Andrei Borzenkov wrote:
> On 29.10.2021 17:53, Ken Gaillot wrote:
>> On Fri, 2021-10-29 at 13:59, Gerry R Sommerville wrote:
>>> Hey Andrei,
>>>
>>> Thanks for your response again. The cluster nodes and remote hosts
>>> each share two networks, however there is no routing between them. I
>>> don't suppose there is a configuration parameter we can set to tell
>>> Pacemaker to try communicating with the remotes using multiple IP
>>> addresses?
>>>
>>> Gerry Sommerville
>>> E-mail: ge...@ca.ibm.com
>>
>> Hi,
>>
>> No, but you can use bonding if you want to have interface redundancy
>> for a remote connection. To be clear, there is no requirement that
>> remote nodes and cluster nodes have the same level of redundancy, it's
>> just a design choice.
>>
>> To address the original question, this is the log sequence I find most
>> relevant:
>>
>>> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
>>> (unpack_rsc_op_failure) warning: Unexpected result (error) was
>>> recorded for monitor of jangcluster-srv-4 on jangcluster-srv-2 at Oct
>>> 22 12:21:09 2021 | rc=1 id=jangcluster-srv-4_last_failure_0
>>> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
>>> (unpack_rsc_op_failure) notice: jangcluster-srv-4 will not be
>>> started under current conditions
>>> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[
>>> 776553] (pe_fence_node) warning: Remote node jangcluster-srv-4
>>> will be fenced: remote connection is unrecoverable
>>
>> The "will not be started" is why the node had to be fenced. There was
>
> OK so it implies that remote resource should fail over if connection to
> remote node fails. Thank you, that was not exactly clear from
> documentation.
>
>> nowhere to recover the connection. I'd need to see the CIB from that
>> time to know why; it's possible you had an old constraint banning the
>> connection from the other node (e.g. from a ban or move command), or
>> something like that.
>
> Hmm ... looking in (current) sources it seems this message is emitted
> only in case of on-fail=stop operation property ...
>

Well ...

/* For remote nodes, ensure that any failure that results in dropping an
 * active connection to the node results in fencing of the node.
 *
 * There are only two action failures that don't result in fencing.
 * 1. probes - probe failures are expected.
 * 2. start - a start failure indicates that an active connection does not already
 * exist. The user can set op on-fail=fence if they really want to fence start
 * failures. */

pacemaker will forcibly set on-fail=stop for remote resource.
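[The escape hatch mentioned in that source comment (fencing on start
failures as well) would look roughly like this in crm shell, with a
hypothetical resource name; a sketch based only on the comment above:

    primitive remote1 ocf:pacemaker:remote \
            params server=remote1 \
            op start interval=0 timeout=60s on-fail=fence \
            op monitor interval=30s timeout=15s

With on-fail=fence on start, even a failed initial connection attempt
would trigger fencing of the remote node, rather than just leaving the
connection stopped.]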
Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue
On 29.10.2021 17:53, Ken Gaillot wrote:
> On Fri, 2021-10-29 at 13:59, Gerry R Sommerville wrote:
>> Hey Andrei,
>>
>> Thanks for your response again. The cluster nodes and remote hosts
>> each share two networks, however there is no routing between them. I
>> don't suppose there is a configuration parameter we can set to tell
>> Pacemaker to try communicating with the remotes using multiple IP
>> addresses?
>>
>> Gerry Sommerville
>> E-mail: ge...@ca.ibm.com
>
> Hi,
>
> No, but you can use bonding if you want to have interface redundancy
> for a remote connection. To be clear, there is no requirement that
> remote nodes and cluster nodes have the same level of redundancy, it's
> just a design choice.
>
> To address the original question, this is the log sequence I find most
> relevant:
>
>> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
>> (unpack_rsc_op_failure) warning: Unexpected result (error) was
>> recorded for monitor of jangcluster-srv-4 on jangcluster-srv-2 at Oct
>> 22 12:21:09 2021 | rc=1 id=jangcluster-srv-4_last_failure_0
>> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
>> (unpack_rsc_op_failure) notice: jangcluster-srv-4 will not be
>> started under current conditions
>> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[
>> 776553] (pe_fence_node) warning: Remote node jangcluster-srv-4
>> will be fenced: remote connection is unrecoverable
>
> The "will not be started" is why the node had to be fenced. There was

OK so it implies that remote resource should fail over if connection to
remote node fails. Thank you, that was not exactly clear from
documentation.

> nowhere to recover the connection. I'd need to see the CIB from that
> time to know why; it's possible you had an old constraint banning the
> connection from the other node (e.g. from a ban or move command), or
> something like that.
>

Hmm ...
looking in (current) sources it seems this message is emitted only in
case of on-fail=stop operation property ...
Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue
On Fri, 2021-10-29 at 13:59, Gerry R Sommerville wrote:
> Hey Andrei,
>
> Thanks for your response again. The cluster nodes and remote hosts
> each share two networks, however there is no routing between them. I
> don't suppose there is a configuration parameter we can set to tell
> Pacemaker to try communicating with the remotes using multiple IP
> addresses?
>
> Gerry Sommerville
> E-mail: ge...@ca.ibm.com

Hi,

No, but you can use bonding if you want to have interface redundancy
for a remote connection. To be clear, there is no requirement that
remote nodes and cluster nodes have the same level of redundancy, it's
just a design choice.

To address the original question, this is the log sequence I find most
relevant:

> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
> (unpack_rsc_op_failure) warning: Unexpected result (error) was
> recorded for monitor of jangcluster-srv-4 on jangcluster-srv-2 at Oct
> 22 12:21:09 2021 | rc=1 id=jangcluster-srv-4_last_failure_0
> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
> (unpack_rsc_op_failure) notice: jangcluster-srv-4 will not be
> started under current conditions
> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[
> 776553] (pe_fence_node) warning: Remote node jangcluster-srv-4
> will be fenced: remote connection is unrecoverable

The "will not be started" is why the node had to be fenced. There was
nowhere to recover the connection. I'd need to see the CIB from that
time to know why; it's possible you had an old constraint banning the
connection from the other node (e.g. from a ban or move command), or
something like that.
--
Ken Gaillot
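[For the bonding option mentioned above, an active-backup bond on the
remote host might be created along these lines. nmcli syntax; the
device and connection names are invented, and this is a sketch rather
than a tested recipe:

    # Create the bond and enslave two physical interfaces to it
    nmcli con add type bond con-name bond0 ifname bond0 \
            bond.options "mode=active-backup,miimon=100"
    nmcli con add type ethernet con-name bond0-port1 ifname eth0 master bond0
    nmcli con add type ethernet con-name bond0-port2 ifname eth1 master bond0
    nmcli con up bond0

The remote resource's server parameter would then point at an address
on bond0, so the connection survives the loss of either physical
link.]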
Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue
Hey Andrei,

Thanks for your response again. The cluster nodes and remote hosts
each share two networks, however there is no routing between them. I
don't suppose there is a configuration parameter we can set to tell
Pacemaker to try communicating with the remotes using multiple IP
addresses?

Gerry Sommerville
E-mail: ge...@ca.ibm.com

----- Original message -----
From: "Andrei Borzenkov"
Sent by: "Users"
To: users@clusterlabs.org
Cc:
Subject: [EXTERNAL] Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue
Date: Thu, Oct 28, 2021 2:59 PM

On 28.10.2021 20:13, Gerry R Sommerville wrote:
>
> What we also found to be interesting is that if the cluster is only using a
> single heartbeat ring, then srv-2 will get fenced instead, and the

So as already suspected you did not actually isolate the node at all.

> pacemaker-remote connection resources will successfully fail over without any
> additional fencing to the remote nodes themselves. It seems a little backwards
> to us since our reasoning for configuring multiple heartbeat rings was to
> increase the reliability/robustness of the cluster, but it seems to do
> the opposite when using pacemaker-remote. :(
>

Remote node is still node. It does not participate in quorum and it does
not perform fencing but otherwise it is part of cluster. If you have
redundant rings you are expected to provide redundant connection to
remote nodes as well to match it.

You do not complain that srv-2 is fenced when its single connection is
down; how does it differ from srv-4 being fenced when its single
connection is down?

> Any suggestions/comments on our configuration / test scenarios are
> appreciated!

Every HA configuration is as reliable as its weakest link. If you make
half of connections redundant and half of connections not, the
non-redundant connections will be a single point of failure.

Besides, you never actually described what is the problem you are trying
to solve. If you had no active resources on remote node, how does it
matter whether this node was fenced? If you had active resources, those
resources would have failed over to another node.
Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue
On 28.10.2021 20:13, Gerry R Sommerville wrote:
>
> What we also found to be interesting is that if the cluster is only using a
> single heartbeat ring, then srv-2 will get fenced instead, and the

So as already suspected you did not actually isolate the node at all.

> pacemaker-remote connection resources will successfully fail over without any
> additional fencing to the remote nodes themselves. It seems a little backwards
> to us since our reasoning for configuring multiple heartbeat rings was to
> increase the reliability/robustness of the cluster, but it seems to do
> the opposite when using pacemaker-remote. :(
>

Remote node is still node. It does not participate in quorum and it does
not perform fencing but otherwise it is part of cluster. If you have
redundant rings you are expected to provide redundant connection to
remote nodes as well to match it.

You do not complain that srv-2 is fenced when its single connection is
down; how does it differ from srv-4 being fenced when its single
connection is down?

> Any suggestions/comments on our configuration / test scenarios are
> appreciated!

Every HA configuration is as reliable as its weakest link. If you make
half of connections redundant and half of connections not, the
non-redundant connections will be a single point of failure.

Besides, you never actually described what is the problem you are trying
to solve. If you had no active resources on remote node, how does it
matter whether this node was fenced? If you had active resources, those
resources would have failed over to another node.
Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue
Hey Andrei, UlrichI am working with Janghyuk on his testing effort. Thank you for your responses, you have clarified some of the terminology we have been misusing.As Janghyuk mentions previously, we have two "full cluster" nodes using two-node quorum and multiple heart beat rings + two more servers as pacemaker-remotes. The pacemaker-remote connection resources each prefer a specific full cluster node to run on, however they are configured such that they can fail over to the other cluster node if needed. Here is the configuration again...corosync.conf - nodelist & quorum nodelist { node { ring0_addr: node-1-subnet-1 ring1_addr: node-1-subnet-2 name: jangcluster-srv-1 nodeid: 1 } node { ring0_addr: node-2-subnet-1 ring1_addr: node-2-subnet-2 name: jangcluster-srv-2 nodeid: 2 }}quorum { provider: corosync_votequorum two_node: 1} crm config show node 1: jangcluster-srv-1node 2: jangcluster-srv-2node jangcluster-srv-3:remotenode jangcluster-srv-4:remoteprimitive GPFS-Fence stonith:fence_gpfs \ params instance=regress1 shared_filesystem="" pcmk_host_list=" jangcluster-srv-1 jangcluster-srv-2 jangcluster-srv-3 jangcluster-srv-4" secure=true \ op monitor interval=30s timeout=500s \ op off interval=0 \ meta is-managed=trueprimitive jangcluster-srv-3 ocf:pacemaker:remote \ params server=jangcluster-srv-3 reconnect_interval=1m \ op monitor interval=30s \ op_params migration-threshold=1 \ op stop interval=0 \ meta is-managed=trueprimitive jangcluster-srv-4 ocf:pacemaker:remote \ params server=jangcluster-srv-4 reconnect_interval=1m \ op monitor interval=30s \ op_params migration-threshold=1 \ meta is-managed=truelocation prefer-CF-Hosts GPFS-Fence \ rule 100: #uname eq jangcluster-srv-1 or #uname eq jangcluster-srv-2location prefer-node-jangcluster-srv-3 jangcluster-srv-3 100: jangcluster-srv-1location prefer-node-jangcluster-srv-3-2 jangcluster-srv-3 50: jangcluster-srv-2location prefer-node-jangcluster-srv-4 jangcluster-srv-4 100: jangcluster-srv-2location 
prefer-node-jangcluster-srv-4-2 jangcluster-srv-4 50: jangcluster-srv-1 We are testing several failure scenarios... In most cases the pacemaker-remote connection resource will successfully fail over to the other cluster node. For example, if we reboot, shutdown, halt, or crash srv-2, the pacemaker-remote connection resource for srv-4 will fail over and start running on srv-1 without srv-3's physical host getting fenced. Manually fencing srv-2 via stonith_admin also works.However when we attempt to simulate a communication failure on srv-2's Ethernet adapter via iptables, we observe srv-3's host getting fenced before the connection resource fails over to srv-1.The concern here is that in the future we may have many remotes connecting to a single cluster host, and so far it seems like a Ethernet adapter issue on the cluster host could lead to many remote hosts getting unnecessarily fenced.Here are the updated iptables commands that we run on srv-2 to simulate srv-2 losing the ability to communicate to srv-4.iptables -A INPUT -s [IP of srv-1] -j DROP ; iptables -A OUTPUT -d [IP of srv-1] -j DROPiptables -A INPUT -s [IP of srv-3] -j DROP ; iptables -A OUTPUT -d [IP of srv-3] -j DROPiptables -A INPUT -s [IP of srv-4] -j DROP ; iptables -A OUTPUT -d [IP of srv-4] -j DROPAs Janghyuk has shown previously, it seems that the pacemaker-remote connection monitor timesout and causes the remote host to get fenced. 
Here are the logs that I think are most relevant:

    Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (pe_get_failcount) info: jangcluster-srv-4 has failed 1 times on jangcluster-srv-2
    Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (pe_get_failcount) info: jangcluster-srv-4 has failed 1 times on jangcluster-srv-2
    Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (pe_get_failcount) info: jangcluster-srv-4 has failed 1 times on jangcluster-srv-2
    Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (pe_get_failcount) info: jangcluster-srv-4 has failed 1 times on jangcluster-srv-2
    Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (unpack_rsc_op_failure) warning: Unexpected result (error) was recorded for monitor of jangcluster-srv-4 on jangcluster-srv-2 at Oct 22 12:21:09 2021 | rc=1 id=jangcluster-srv-4_last_failure_0
    Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (unpack_rsc_op_failure) notice: jangcluster-srv-4 will not be started under current conditions
    Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (pe_fence_node) warning: Remote node jangcluster-srv-4 will be fenced: remote connection is unrecoverable

What we also found interesting is that if the cluster is only using a single heartbeat ring, then srv-2 will get fenced instead, and the
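When digging through a full pacemaker.log for events like these, the scheduler's function tags are convenient anchors. A small sketch that filters for them (the sample log below is a trimmed copy of the messages quoted above; on a real node, point LOG at the actual log file, e.g. /var/log/pacemaker/pacemaker.log):

```shell
#!/bin/sh
# Sketch: pull the scheduler's monitor failures and fencing decisions
# out of a Pacemaker log. LOG defaults to a sample file built below.
LOG=${LOG:-/tmp/pacemaker-sample.log}

cat > "$LOG" <<'EOF'
Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (unpack_rsc_op_failure) warning: Unexpected result (error) was recorded for monitor of jangcluster-srv-4 on jangcluster-srv-2
Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (pe_fence_node) warning: Remote node jangcluster-srv-4 will be fenced: remote connection is unrecoverable
EOF

# Monitor failures that count against migration-threshold
grep 'unpack_rsc_op_failure' "$LOG"
# The decision point: why the scheduler chose to fence
grep 'pe_fence_node' "$LOG"
```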
Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue
On Thu, Oct 28, 2021 at 10:30 AM Ulrich Windl wrote:
>
> Fencing _is_ a part of failover!

Like any blanket answer, this is mostly incorrect in this context. There are two separate objects here: the remote host itself, and the Pacemaker resource used to connect to and monitor the state of the remote host.

The remote host itself does not fail over. Resources on this host do, but the OP is not asking about that.

The Pacemaker resource used to monitor the remote host may fail over like any other cluster resource. This failover does not require any fencing *of the remote host itself*, and in this particular case the connection between the two cluster nodes was present the whole time (at least, as far as we can believe the logs), so there was no reason for fencing either. Whether Pacemaker should attempt to fail this resource over to another node if the connection to the remote host fails is subject to discussion.

So fencing of the remote host itself is most certainly *not* part of the failover of the resource that monitors this remote host.

> >>> "Janghyuk Boo" wrote on 26.10.2021 at 22:09 in message:
> Dear Community,
> Thank you Ken for your reply last time.
> I attached the log messages as requested in the last thread.
> I have a Pacemaker cluster with two cluster nodes with two network interfaces
> each, two remote nodes, and a prototyped fencing agent (GPFS-Fence) to cut a
> host's access from the clustered filesystem.
> I noticed that the remote node gets fenced when the quorum node it is connected to
> gets fenced or experiences network failure.
> For example, when I disconnected srv-2 from the rest of the cluster by using
> iptables on srv-2:
>
> iptables -A INPUT -s [IP of srv-1] -j DROP ; iptables -A OUTPUT -s [IP of srv-1] -j DROP
> iptables -A INPUT -s [IP of srv-3] -j DROP ; iptables -A OUTPUT -s [IP of srv-3] -j DROP
> iptables -A INPUT -s [IP of srv-4] -j DROP ; iptables -A OUTPUT -s [IP of srv-4] -j DROP
>
> I expected that remote node jangcluster-srv-4 would fail over to srv-1 given my
> location constraints, but the remote node's monitor 'jangcluster-srv-4_monitor'
> failed and srv-4 was getting fenced before attempting to fail over.
>
> What would be the proper way to simulate the network failover?
> How can I configure the cluster so that remote node srv-4 fails over instead
> of getting fenced?

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/