Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue

2021-11-05 Thread Ken Gaillot
On Fri, 2021-11-05 at 11:22 +0300, Andrei Borzenkov wrote:
> On 05.11.2021 01:20, Ken Gaillot wrote:
> > > There are two issues discussed in this thread.
> > > 
> > > 1. Remote node is fenced when connection with this node is lost.
> > > For
> > > all
> > > I can tell this is intended and expected behavior. That was the
> > > original
> > > question.
> > 
> > It's expected only because the connection can't be recovered
> > elsewhere.
> > If another node can run the connection, pacemaker will try to
> > reconnect
> > from there and re-probe everything to make sure what the current
> > state
> > is.
> > 
> 
> That's not what I see in sources and documentation and not what I
> observe. Pacemaker will reprobe from another node only after
> attempting fencing of the remote node.

Ah, you're right, I misremembered. Probe/start failures of a remote
connection don't require fencing but recurring monitor failures do. I
guess that makes sense, otherwise recovery of resources on a failed
remote could be greatly delayed.

I was confusing that with when the connection host is lost and has to
be fenced, in which case the connection will be recovered elsewhere if
possible, without fencing the remote.



> The difference seems to be the reconnect_interval parameter. If it is
> present in the remote resource definition, pacemaker will not proceed
> after failed fencing.
> 
> As there is no real documentation on how it is supposed to work, I do not
> know whether all of this is a bug or not. But one thing is certain -
> when the connection to a remote node is lost, the first thing pacemaker
> does is to fence it, and only then initiate any recovery action.

reconnect_interval is implemented as a sort of special case of failure-
timeout. When the interval expires, the connection failure is timed
out, so the cluster no longer sees a need for fencing. It's not a bug
but maybe a questionable design.
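For illustration, here is a minimal crm-shell sketch of the two variants being contrasted (the resource names and timings are hypothetical, not taken from anyone's configuration in this thread):

    # Without reconnect_interval: a lost connection remains a failed monitor,
    # so the cluster keeps seeing a need to fence the remote node.
    primitive remote-a ocf:pacemaker:remote \
        params server=remote-a \
        op monitor interval=30s timeout=15s

    # With reconnect_interval: once the interval expires, the connection
    # failure is timed out (much like failure-timeout), and the perceived
    # need for fencing can go away with it.
    primitive remote-b ocf:pacemaker:remote \
        params server=remote-b reconnect_interval=30s \
        op monitor interval=30s timeout=15s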

That's a case of a broader problem: if the cause for fencing goes away,
the cluster will stop trying fencing and act as if nothing was wrong.
This can be a good thing, for example a brief network interruption can
sometimes heal without fencing. However it's been suggested (e.g.
CLBZ#5476) that we need the concept of fencing required independently
of conditions -- i.e., for certain types of failure, fencing should be
considered required until it succeeds, regardless of whether the
original need for it goes away.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue

2021-11-05 Thread Andrei Borzenkov
On 05.11.2021 01:20, Ken Gaillot wrote:
>>
>> There are two issues discussed in this thread.
>>
>> 1. Remote node is fenced when connection with this node is lost. For
>> all
>> I can tell this is intended and expected behavior. That was the
>> original
>> question.
> 
> It's expected only because the connection can't be recovered elsewhere.
> If another node can run the connection, pacemaker will try to reconnect
> from there and re-probe everything to make sure what the current state
> is.
> 

That's not what I see in sources and documentation and not what I
observe. Pacemaker will reprobe from another node only after attempting
fencing of the remote node.

Documentation (pacemaker remote)

  reconnect_interval: If this is a positive time interval, the cluster will
      attempt to reconnect to a remote node after an active connection has
      been lost at this interval. Otherwise, the cluster will attempt to
      reconnect immediately (*after any fencing needed*).

Config:

node 1: ha1 \
        attributes pingd=1 \
        utilization cpu=20
node 2: ha2 \
        attributes pingd=1 \
        utilization cpu=20
primitive qnetd ocf:pacemaker:remote \
        params reconnect_interval=30s \
        op monitor timeout=5s interval=10s \
        meta target-role=Started
primitive stonith_sbd stonith:external/sbd \
        op monitor interval=3600
property cib-bootstrap-options: \
        cluster-infrastructure=corosync \
        cluster-name=ha \
        dc-version="2.1.0+20210816.c6a4f6e6c-1.1-2.1.0+20210816.c6a4f6e6c" \
        last-lrm-refresh=1635607816 \
        stonith-enabled=true \
        have-watchdog=true \
        stonith-watchdog-timeout=0 \
        placement-strategy=balanced


Logs (skipping CIB updates, remote node active on ha1, ha2 is DC):

Nov 05 10:31:26.742 ha1 pacemaker-controld  [3246] (monitor_timeout_cb@controld_remote_ra.c:474)    info: Timed out waiting for remote poke response from qnetd

Nov 05 10:31:26.742 ha1 pacemaker-controld  [3246] (process_lrm_event@controld_execd.c:2826)    error: Result of monitor operation for qnetd on ha1: Timed Out | call=3 key=qnetd_monitor_1 timeout=5000ms

Nov 05 10:31:26.742 ha1 pacemaker-controld  [3246] (lrmd_api_disconnect@lrmd_client.c:1640)    info: Disconnecting TLS qnetd executor connection

Nov 05 10:31:26.742 ha1 pacemaker-controld  [3246] (lrmd_tls_connection_destroy@lrmd_client.c:562)    info: TLS connection destroyed

Nov 05 10:31:26.742 ha1 pacemaker-controld  [3246] (remote_lrm_op_callback@controld_remote_ra.c:578)    error: Lost connection to Pacemaker Remote node qnetd

Nov 05 10:31:26.742 ha1 pacemaker-controld  [3246] (lrmd_api_disconnect@lrmd_client.c:1640)    info: Disconnecting TLS qnetd executor connection


Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (determine_online_status_fencing@unpack.c:1434)    info: Node ha2 is active

Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (determine_online_status@unpack.c:1574)    info: Node ha2 is online

Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (determine_online_status_fencing@unpack.c:1434)    info: Node ha1 is active

Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (determine_online_status@unpack.c:1574)    info: Node ha1 is online

Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (unpack_rsc_op_failure@unpack.c:3022)    warning: Unexpected result (error) was recorded for monitor of qnetd on ha1 at Nov  5 10:31:26 2021 | rc=1 id=qnetd_last_failure_0

Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (unpack_rsc_op_failure@unpack.c:3117)    notice: qnetd will not be started under current conditions

Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (pe_fence_node@unpack.c:143)    warning: Remote node qnetd will be fenced: remote connection is unrecoverable

Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313] (pcmk__native_allocate@pcmk_sched_native.c:626)    info: Resource qnetd cannot run anywhere

...
Nov 05 10:31:26.777 ha2 pacemaker-schedulerd[3313] (stage6@pcmk_sched_allocate.c:1634)    warning: Scheduling Node qnetd for STONITH

Nov 05 10:31:26.777 ha2 pacemaker-schedulerd[3313] (log_list_item@output_log.c:198)    notice: Actions: Fence (reboot) qnetd 'remote connection is unrecoverable'

Nov 05 10:31:26.777 ha2 pacemaker-schedulerd[3313] (rsc_action_default@pcmk_output.c:928)    info: Leave   stonith_sbd    (Started ha2)

Nov 05 10:31:26.777 ha2 pacemaker-schedulerd[3313] (log_list_item@output_log.c:198)    notice: Actions: Stop   qnetd    (ha1)  due to node availability

...
Nov 05 10:31:26.869 ha2 pacemaker-controld  [3314] (te_rsc_command@controld_te_actions.c:320)    notice: Initiating stop operation qnetd_stop_0 on ha1 | action 4

Nov 05 10:31:26.869 ha2 pacemaker-controld  [3314] (te_fence_node@controld_fencing.c:869)    notice: Requesting fencing (reboot) of node qnetd | action=3 timeout=6

Nov 05 10:31:26.869 ha2 

Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue

2021-11-04 Thread Ken Gaillot
On Sat, 2021-10-30 at 21:17 +0300, Andrei Borzenkov wrote:
> On 29.10.2021 18:37, Ken Gaillot wrote:
> ...
> > > > > To address the original question, this is the log sequence I
> > > > > find
> > > > > most
> > > > > relevant:
> > > > > 
> > > > > > Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-
> > > > > > schedulerd[776553]
> > > > > > (unpack_rsc_op_failure)  warning: Unexpected result
> > > > > > (error)
> > > > > > was
> > > > > > recorded for monitor of jangcluster-srv-4 on jangcluster-
> > > > > > srv-2
> > > > > > at Oct
> > > > > > 22 12:21:09 2021 | rc=1 id=jangcluster-srv-4_last_failure_0
> > > > > > Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-
> > > > > > schedulerd[776553]
> > > > > > (unpack_rsc_op_failure)  notice: jangcluster-srv-4 will
> > > > > > not
> > > > > > be
> > > > > > started under current conditions
> > > > > > Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[
> > > > > > 776553] (pe_fence_node)  warning: Remote node
> > > > > > jangcluster-
> > > > > > srv-4
> > > > > > will be fenced: remote connection is unrecoverable
> > > > > 
> > > > > The "will not be started" is why the node had to be fenced.
> > > > > There
> > > > > was
> > > > 
> > > > OK so it implies that remote resource should fail over if
> > > > connection to
> > > > remote node fails. Thank you, that was not exactly clear from
> > > > documentation.
> > > > 
> > > > > nowhere to recover the connection. I'd need to see the CIB
> > > > > from
> > > > > that
> > > > > time to know why; it's possible you had an old constraint
> > > > > banning
> > > > > the
> > > > > connection from the other node (e.g. from a ban or move
> > > > > command),
> > > > > or
> > > > > something like that.
> > > > > 
> > > > 
> > > > Hmm ... looking in (current) sources it seems this message is
> > > > emitted
> > > > only in case of on-fail=stop operation property ...
> > > > 
> > > 
> > > Well ...
> > > 
> > > /* For remote nodes, ensure that any failure that results in dropping an
> > >  * active connection to the node results in fencing of the node.
> > >  *
> > >  * There are only two action failures that don't result in fencing.
> > >  * 1. probes - probe failures are expected.
> > >  * 2. start - a start failure indicates that an active connection does
> > >  *    not already exist. The user can set op on-fail=fence if they really
> > >  *    want to fence start failures. */
> > > 
> > > 
> > > pacemaker will forcibly set on-fail=stop for remote resource.
> > 
> > The default isn't any different, it's on-fail=restart.
> > 
> > At that point in the code, on-fail is not what the user set (or
> > default), but how the result should be handled, taking into account
> > what the user set. E.g. if the result is success, then on-fail is
> > set
> > to ignore because nothing needs to be done, regardless of what the
> > configured on-fail is.
> > 
> 
> There are two issues discussed in this thread.
> 
> 1. Remote node is fenced when connection with this node is lost. For
> all
> I can tell this is intended and expected behavior. That was the
> original
> question.

It's expected only because the connection can't be recovered elsewhere.
If another node can run the connection, pacemaker will try to reconnect
from there and re-probe everything to make sure what the current state
is.


> 2. Remote resource appears to not fail over. I cannot reproduce it,
> but
> then we also do not have the complete CIB, so something may affect
> it.
> OTOH the logs show a stop before fencing had possibly succeeded, so maybe
> the remote resource *did* fail over.
> 
> What I see is - connection to remote node is lost, pacemaker fences
> remote node and attempts to restart remote resource, if this is
> unsuccessful (meaning - connection still could not be established)
> remote resource fails over to another node.
> 
> I do not know if it is possible to avoid fencing of remote node under
> described conditions.
> 
> What is somewhat interesting (and looks like a bug) - in my testing
> pacemaker ignored failed fencing attempt and proceeded with
> restarting
> of remote resource. Is it expected behavior?

I don't see a failed fencing attempt (or any result of the fencing
attempt) in the logs in the original message, only failures of the
connection monitor.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue

2021-10-30 Thread Andrei Borzenkov
On 29.10.2021 18:37, Ken Gaillot wrote:
...

 To address the original question, this is the log sequence I find
 most
 relevant:

> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-
> schedulerd[776553]
> (unpack_rsc_op_failure)  warning: Unexpected result (error)
> was
> recorded for monitor of jangcluster-srv-4 on jangcluster-srv-2
> at Oct
> 22 12:21:09 2021 | rc=1 id=jangcluster-srv-4_last_failure_0
> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-
> schedulerd[776553]
> (unpack_rsc_op_failure)  notice: jangcluster-srv-4 will not
> be
> started under current conditions
> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[
> 776553] (pe_fence_node)  warning: Remote node jangcluster-
> srv-4
> will be fenced: remote connection is unrecoverable

 The "will not be started" is why the node had to be fenced. There
 was
>>>
>>> OK so it implies that remote resource should fail over if
>>> connection to
>>> remote node fails. Thank you, that was not exactly clear from
>>> documentation.
>>>
 nowhere to recover the connection. I'd need to see the CIB from
 that
 time to know why; it's possible you had an old constraint banning
 the
 connection from the other node (e.g. from a ban or move command),
 or
 something like that.

>>>
>>> Hmm ... looking in (current) sources it seems this message is
>>> emitted
>>> only in case of on-fail=stop operation property ...
>>>
>>
>> Well ...
>>
>> /* For remote nodes, ensure that any failure that results in dropping an
>>  * active connection to the node results in fencing of the node.
>>  *
>>  * There are only two action failures that don't result in fencing.
>>  * 1. probes - probe failures are expected.
>>  * 2. start - a start failure indicates that an active connection does
>>  *    not already exist. The user can set op on-fail=fence if they really
>>  *    want to fence start failures. */
>>
>>
>> pacemaker will forcibly set on-fail=stop for remote resource.
> 
> The default isn't any different, it's on-fail=restart.
> 
> At that point in the code, on-fail is not what the user set (or
> default), but how the result should be handled, taking into account
> what the user set. E.g. if the result is success, then on-fail is set
> to ignore because nothing needs to be done, regardless of what the
> configured on-fail is.
> 

There are two issues discussed in this thread.

1. Remote node is fenced when connection with this node is lost. For all
I can tell this is intended and expected behavior. That was the original
question.

2. Remote resource appears to not fail over. I cannot reproduce it, but
then we also do not have the complete CIB, so something may affect it.
OTOH the logs show a stop before fencing had possibly succeeded, so maybe the
remote resource *did* fail over.

What I see is: the connection to the remote node is lost, pacemaker fences the
remote node and attempts to restart the remote resource; if this is
unsuccessful (meaning the connection still could not be established), the
remote resource fails over to another node.

I do not know if it is possible to avoid fencing of remote node under
described conditions.

What is somewhat interesting (and looks like a bug) - in my testing
pacemaker ignored a failed fencing attempt and proceeded with restarting
the remote resource. Is this expected behavior?
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue

2021-10-29 Thread Ken Gaillot
On Fri, 2021-10-29 at 18:18 +0300, Andrei Borzenkov wrote:
> On 29.10.2021 18:16, Andrei Borzenkov wrote:
> > On 29.10.2021 17:53, Ken Gaillot wrote:
> > > On Fri, 2021-10-29 at 13:59 +, Gerry R Sommerville wrote:
> > > > Hey Andrei,
> > > >  
> > > > Thanks for your response again. The cluster nodes and remote
> > > > hosts
> > > > each share two networks, however there is no routing between
> > > > them. I
> > > > don't suppose there is a configuration parameter we can set to
> > > > tell
> > > > Pacemaker to try communicating with the remotes using multiple
> > > > IP
> > > > addresses?
> > > >  
> > > > Gerry Sommerville
> > > > E-mail: ge...@ca.ibm.com
> > > 
> > > Hi,
> > > 
> > > No, but you can use bonding if you want to have interface
> > > redundancy
> > > for a remote connection. To be clear, there is no requirement
> > > that
> > > remote nodes and cluster nodes have the same level of redundancy,
> > > it's
> > > just a design choice.
> > > 
> > > To address the original question, this is the log sequence I find
> > > most
> > > relevant:
> > > 
> > > > Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-
> > > > schedulerd[776553]
> > > > (unpack_rsc_op_failure)  warning: Unexpected result (error)
> > > > was
> > > > recorded for monitor of jangcluster-srv-4 on jangcluster-srv-2
> > > > at Oct
> > > > 22 12:21:09 2021 | rc=1 id=jangcluster-srv-4_last_failure_0
> > > > Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-
> > > > schedulerd[776553]
> > > > (unpack_rsc_op_failure)  notice: jangcluster-srv-4 will not
> > > > be
> > > > started under current conditions
> > > > Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[
> > > > 776553] (pe_fence_node)  warning: Remote node jangcluster-
> > > > srv-4
> > > > will be fenced: remote connection is unrecoverable
> > > 
> > > The "will not be started" is why the node had to be fenced. There
> > > was
> > 
> > OK so it implies that remote resource should fail over if
> > connection to
> > remote node fails. Thank you, that was not exactly clear from
> > documentation.
> > 
> > > nowhere to recover the connection. I'd need to see the CIB from
> > > that
> > > time to know why; it's possible you had an old constraint banning
> > > the
> > > connection from the other node (e.g. from a ban or move command),
> > > or
> > > something like that.
> > > 
> > 
> > Hmm ... looking in (current) sources it seems this message is
> > emitted
> > only in case of on-fail=stop operation property ...
> > 
> 
> Well ...
> 
> /* For remote nodes, ensure that any failure that results in dropping an
>  * active connection to the node results in fencing of the node.
>  *
>  * There are only two action failures that don't result in fencing.
>  * 1. probes - probe failures are expected.
>  * 2. start - a start failure indicates that an active connection does
>  *    not already exist. The user can set op on-fail=fence if they really
>  *    want to fence start failures. */
> 
> 
> pacemaker will forcibly set on-fail=stop for remote resource.

The default isn't any different, it's on-fail=restart.

At that point in the code, on-fail is not what the user set (or
default), but how the result should be handled, taking into account
what the user set. E.g. if the result is success, then on-fail is set
to ignore because nothing needs to be done, regardless of what the
configured on-fail is.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue

2021-10-29 Thread Andrei Borzenkov
On 29.10.2021 18:16, Andrei Borzenkov wrote:
> On 29.10.2021 17:53, Ken Gaillot wrote:
>> On Fri, 2021-10-29 at 13:59 +, Gerry R Sommerville wrote:
>>> Hey Andrei,
>>>  
>>> Thanks for your response again. The cluster nodes and remote hosts
>>> each share two networks, however there is no routing between them. I
>>> don't suppose there is a configuration parameter we can set to tell
>>> Pacemaker to try communicating with the remotes using multiple IP
>>> addresses?
>>>  
>>> Gerry Sommerville
>>> E-mail: ge...@ca.ibm.com
>>
>> Hi,
>>
>> No, but you can use bonding if you want to have interface redundancy
>> for a remote connection. To be clear, there is no requirement that
>> remote nodes and cluster nodes have the same level of redundancy, it's
>> just a design choice.
>>
>> To address the original question, this is the log sequence I find most
>> relevant:
>>
>>> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
>>> (unpack_rsc_op_failure)  warning: Unexpected result (error) was
>>> recorded for monitor of jangcluster-srv-4 on jangcluster-srv-2 at Oct
>>> 22 12:21:09 2021 | rc=1 id=jangcluster-srv-4_last_failure_0
>>
>>> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
>>> (unpack_rsc_op_failure)  notice: jangcluster-srv-4 will not be
>>> started under current conditions
>>
>>> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[
>>> 776553] (pe_fence_node)  warning: Remote node jangcluster-srv-4
>>> will be fenced: remote connection is unrecoverable
>>
>> The "will not be started" is why the node had to be fenced. There was
> 
> OK so it implies that remote resource should fail over if connection to
> remote node fails. Thank you, that was not exactly clear from documentation.
> 
>> nowhere to recover the connection. I'd need to see the CIB from that
>> time to know why; it's possible you had an old constraint banning the
>> connection from the other node (e.g. from a ban or move command), or
>> something like that.
>>
> 
> Hmm ... looking in (current) sources it seems this message is emitted
> only in case of on-fail=stop operation property ...
> 

Well ...

/* For remote nodes, ensure that any failure that results in dropping an
 * active connection to the node results in fencing of the node.
 *
 * There are only two action failures that don't result in fencing.
 * 1. probes - probe failures are expected.
 * 2. start - a start failure indicates that an active connection does
 *    not already exist. The user can set op on-fail=fence if they really
 *    want to fence start failures. */


pacemaker will forcibly set on-fail=stop for remote resource.
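
To illustrate the start-failure exception mentioned in that comment, a hedged crm-shell sketch (the resource name is hypothetical, not taken from any configuration in this thread) of explicitly opting into fencing on a failed start:

    primitive remote-c ocf:pacemaker:remote \
        params server=remote-c \
        op start interval=0 timeout=60s on-fail=fence \
        op monitor interval=30s timeout=15s

Without that on-fail setting, per the comment above, a start failure alone does not lead to fencing, since it only means an active connection did not yet exist.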
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue

2021-10-29 Thread Andrei Borzenkov
On 29.10.2021 17:53, Ken Gaillot wrote:
> On Fri, 2021-10-29 at 13:59 +, Gerry R Sommerville wrote:
>> Hey Andrei,
>>  
>> Thanks for your response again. The cluster nodes and remote hosts
>> each share two networks, however there is no routing between them. I
>> don't suppose there is a configuration parameter we can set to tell
>> Pacemaker to try communicating with the remotes using multiple IP
>> addresses?
>>  
>> Gerry Sommerville
>> E-mail: ge...@ca.ibm.com
> 
> Hi,
> 
> No, but you can use bonding if you want to have interface redundancy
> for a remote connection. To be clear, there is no requirement that
> remote nodes and cluster nodes have the same level of redundancy, it's
> just a design choice.
> 
> To address the original question, this is the log sequence I find most
> relevant:
> 
>> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
>> (unpack_rsc_op_failure)  warning: Unexpected result (error) was
>> recorded for monitor of jangcluster-srv-4 on jangcluster-srv-2 at Oct
>> 22 12:21:09 2021 | rc=1 id=jangcluster-srv-4_last_failure_0
> 
>> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
>> (unpack_rsc_op_failure)  notice: jangcluster-srv-4 will not be
>> started under current conditions
> 
>> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[
>> 776553] (pe_fence_node)  warning: Remote node jangcluster-srv-4
>> will be fenced: remote connection is unrecoverable
> 
> The "will not be started" is why the node had to be fenced. There was

OK so it implies that remote resource should fail over if connection to
remote node fails. Thank you, that was not exactly clear from documentation.

> nowhere to recover the connection. I'd need to see the CIB from that
> time to know why; it's possible you had an old constraint banning the
> connection from the other node (e.g. from a ban or move command), or
> something like that.
> 

Hmm ... looking in (current) sources it seems this message is emitted
only in case of on-fail=stop operation property ...
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue

2021-10-29 Thread Ken Gaillot
On Fri, 2021-10-29 at 13:59 +, Gerry R Sommerville wrote:
> Hey Andrei,
>  
> Thanks for your response again. The cluster nodes and remote hosts
> each share two networks, however there is no routing between them. I
> don't suppose there is a configuration parameter we can set to tell
> Pacemaker to try communicating with the remotes using multiple IP
> addresses?
>  
> Gerry Sommerville
> E-mail: ge...@ca.ibm.com

Hi,

No, but you can use bonding if you want to have interface redundancy
for a remote connection. To be clear, there is no requirement that
remote nodes and cluster nodes have the same level of redundancy, it's
just a design choice.
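A rough sketch of the bonding approach, assuming NetworkManager/nmcli on the host carrying the remote connection; interface names, connection names, and bond options here are illustrative, not a tested recipe:

    # Active-backup bond across two NICs, so the Pacemaker Remote TCP
    # connection can survive the loss of either physical interface.
    nmcli connection add type bond con-name bond0 ifname bond0 \
        bond.options "mode=active-backup,miimon=100"
    nmcli connection add type ethernet con-name bond0-port1 ifname eth0 master bond0
    nmcli connection add type ethernet con-name bond0-port2 ifname eth1 master bond0
    nmcli connection up bond0

The ocf:pacemaker:remote resource's server parameter would then point at an address reachable over the bond.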

To address the original question, this is the log sequence I find most
relevant:

> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
> (unpack_rsc_op_failure)  warning: Unexpected result (error) was
> recorded for monitor of jangcluster-srv-4 on jangcluster-srv-2 at Oct
> 22 12:21:09 2021 | rc=1 id=jangcluster-srv-4_last_failure_0

> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553]
> (unpack_rsc_op_failure)  notice: jangcluster-srv-4 will not be
> started under current conditions

> Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[
> 776553] (pe_fence_node)  warning: Remote node jangcluster-srv-4
> will be fenced: remote connection is unrecoverable

The "will not be started" is why the node had to be fenced. There was
nowhere to recover the connection. I'd need to see the CIB from that
time to know why; it's possible you had an old constraint banning the
connection from the other node (e.g. from a ban or move command), or
something like that.
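
For reference, a hedged illustration of what such a leftover ban typically looks like and how it is usually cleared (the constraint id follows the usual cli-ban naming generated by crm_resource --ban; the resource and node names are the ones from this thread):

    # A previous ban/move generally leaves a -INFINITY location constraint, e.g.:
    location cli-ban-jangcluster-srv-4-on-jangcluster-srv-1 jangcluster-srv-4 -inf: jangcluster-srv-1

    # Removing it so the connection resource can again run on either cluster node:
    crm_resource --clear --resource jangcluster-srv-4
    # or, with crmsh:
    crm resource clear jangcluster-srv-4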
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue

2021-10-29 Thread Gerry R Sommerville
Hey Andrei, 
Thanks for your response again. The cluster nodes and remote hosts each share two networks, however there is no routing between them. I don't suppose there is a configuration parameter we can set to tell Pacemaker to try communicating with the remotes using multiple IP addresses? 
Gerry Sommerville
E-mail: ge...@ca.ibm.com
 
 
- Original message -
From: "Andrei Borzenkov"
Sent by: "Users"
To: users@clusterlabs.org
Cc:
Subject: [EXTERNAL] Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue
Date: Thu, Oct 28, 2021 2:59 PM
On 28.10.2021 20:13, Gerry R Sommerville wrote:
>
> What we also found to be interesting is that if the cluster is only using a
> single heartbeat ring, then srv-2 will get fenced instead, and the

So as already suspected you did not actually isolate the node at all.

> pacemaker-remote connection resources will successfully fail over without any
> additional fencing to the remote nodes themselves. It seems a little backwards
> to us since our reasoning for configuring multiple heartbeat rings was to
> increase the clusters reliability/robustness of the cluster, but it seems to do
> the opposite when using pacemaker-remote. :(
>

Remote node is still node. It does not participate in quorum and it does
not perform fencing but otherwise it is part of cluster. If you have
redundant rings you are expected to provide redundant connection to
remote nodes as well to match it.

You do not complain that srv-2 is fenced when its single connection is
down; how does it differ from srv-4 being fenced when its single
connection is down?

> Any suggestions/comments on our configuration / test scenario's are appreciated!

Every HA configuration is as reliable as its weakest link. If you make
half of connections redundant and half of connections not, not redundant
connections will be single point of failure.

Besides, you never actually described what is the problem you are trying
to solve. If you had no active resources on remote node, how does it
matter whether this node was fenced? If you had active resources, those
resources would have failed over to another node.

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue

2021-10-28 Thread Andrei Borzenkov
On 28.10.2021 20:13, Gerry R Sommerville wrote:
> 
> What we also found to be interesting is that if the cluster is only using a 
> single heartbeat ring, then srv-2 will get fenced instead, and the 

So as already suspected you did not actually isolate the node at all.

> pacemaker-remote connection resources will successfully fail over without any 
> additional fencing to the remote nodes themselves. It seems a little 
> backwards 
> to us since our reasoning for configuring multiple heartbeat rings was to 
> increase the clusters reliability/robustness of the cluster, but it seems to 
> do 
> the opposite when using pacemaker-remote. :(
> 

Remote node is still node. It does not participate in quorum and it does
not perform fencing but otherwise it is part of cluster. If you have
redundant rings you are expected to provide redundant connection to
remote nodes as well to match it.

You do not complain that srv-2 is fenced when its single connection is
down; how does it differ from srv-4 being fenced when its single
connection is down?

> Any suggestions/comments on our configuration / test scenario's are 
> appreciated!

Every HA configuration is as reliable as its weakest link. If you make
half of connections redundant and half of connections not, not redundant
connections will be single point of failure.

Besides, you never actually described what is the problem you are trying
to solve. If you had no active resources on remote node, how does it
matter whether this node was fenced? If you had active resources, those
resources would have failed over to another node.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue

2021-10-28 Thread Gerry R Sommerville
Hey Andrei, Ulrich,

I am working with Janghyuk on his testing effort. Thank you for your responses, you have clarified some of the terminology we have been misusing.

As Janghyuk mentioned previously, we have two "full cluster" nodes using two-node quorum and multiple heartbeat rings, plus two more servers as pacemaker-remotes. The pacemaker-remote connection resources each prefer a specific full cluster node to run on, however they are configured such that they can fail over to the other cluster node if needed. Here is the configuration again...

corosync.conf - nodelist & quorum
nodelist {
    node {
        ring0_addr: node-1-subnet-1
        ring1_addr: node-1-subnet-2
        name: jangcluster-srv-1
        nodeid: 1
    }
    node {
        ring0_addr: node-2-subnet-1
        ring1_addr: node-2-subnet-2
        name: jangcluster-srv-2
        nodeid: 2
    }
}
quorum {
    provider: corosync_votequorum
    two_node: 1
}
 
crm config show 
node 1: jangcluster-srv-1
node 2: jangcluster-srv-2
node jangcluster-srv-3:remote
node jangcluster-srv-4:remote
primitive GPFS-Fence stonith:fence_gpfs \
    params instance=regress1 shared_filesystem="" pcmk_host_list="jangcluster-srv-1 jangcluster-srv-2 jangcluster-srv-3 jangcluster-srv-4" secure=true \
    op monitor interval=30s timeout=500s \
    op off interval=0 \
    meta is-managed=true
primitive jangcluster-srv-3 ocf:pacemaker:remote \
    params server=jangcluster-srv-3 reconnect_interval=1m \
    op monitor interval=30s \
    op_params migration-threshold=1 \
    op stop interval=0 \
    meta is-managed=true
primitive jangcluster-srv-4 ocf:pacemaker:remote \
    params server=jangcluster-srv-4 reconnect_interval=1m \
    op monitor interval=30s \
    op_params migration-threshold=1 \
    meta is-managed=true
location prefer-CF-Hosts GPFS-Fence \
    rule 100: #uname eq jangcluster-srv-1 or #uname eq jangcluster-srv-2
location prefer-node-jangcluster-srv-3 jangcluster-srv-3 100: jangcluster-srv-1
location prefer-node-jangcluster-srv-3-2 jangcluster-srv-3 50: jangcluster-srv-2
location prefer-node-jangcluster-srv-4 jangcluster-srv-4 100: jangcluster-srv-2
location prefer-node-jangcluster-srv-4-2 jangcluster-srv-4 50: jangcluster-srv-1
We are testing several failure scenarios... In most cases the pacemaker-remote connection resource will successfully fail over to the other cluster node. For example, if we reboot, shutdown, halt, or crash srv-2, the pacemaker-remote connection resource for srv-4 will fail over and start running on srv-1 without srv-3's physical host getting fenced. Manually fencing srv-2 via stonith_admin also works.

However, when we attempt to simulate a communication failure on srv-2's Ethernet adapter via iptables, we observe srv-3's host getting fenced before the connection resource fails over to srv-1.

The concern here is that in the future we may have many remotes connecting to a single cluster host, and so far it seems like an Ethernet adapter issue on the cluster host could lead to many remote hosts getting unnecessarily fenced.

Here are the updated iptables commands that we run on srv-2 to simulate srv-2 losing the ability to communicate with srv-4.

iptables -A INPUT -s [IP of srv-1] -j DROP ; iptables -A OUTPUT -d [IP of srv-1] -j DROP
iptables -A INPUT -s [IP of srv-3] -j DROP ; iptables -A OUTPUT -d [IP of srv-3] -j DROP
iptables -A INPUT -s [IP of srv-4] -j DROP ; iptables -A OUTPUT -d [IP of srv-4] -j DROP

As Janghyuk has shown previously, it seems that the pacemaker-remote connection monitor times out and causes the remote host to get fenced. Here are the logs that I think are most relevant.

Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (pe_get_failcount)   info: jangcluster-srv-4 has failed 1 times on jangcluster-srv-2
Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (pe_get_failcount)   info: jangcluster-srv-4 has failed 1 times on jangcluster-srv-2
Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (pe_get_failcount)   info: jangcluster-srv-4 has failed 1 times on jangcluster-srv-2
Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (pe_get_failcount)   info: jangcluster-srv-4 has failed 1 times on jangcluster-srv-2
Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (unpack_rsc_op_failure)      warning: Unexpected result (error) was recorded for monitor of jangcluster-srv-4 on jangcluster-srv-2 at Oct 22 12:21:09 2021 | rc=1 id=jangcluster-srv-4_last_failure_0
Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (unpack_rsc_op_failure)      notice: jangcluster-srv-4 will not be started under current conditions
Oct 22 12:21:09.389 jangcluster-srv-2 pacemaker-schedulerd[776553] (pe_fence_node)      warning: Remote node jangcluster-srv-4 will be fenced: remote connection is unrecoverable

What we also found to be interesting is that if the cluster is only using a single heartbeat ring, then srv-2 will get fenced instead, and the 

Re: [ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue

2021-10-28 Thread Andrei Borzenkov
On Thu, Oct 28, 2021 at 10:30 AM Ulrich Windl wrote:
>
> Fencing _is_ a part of failover!
>

As any blanket answer this is mostly incorrect in this context.

There are two separate objects here - remote host itself and pacemaker
resource used to connect to and monitor state of remote host.

Remote host itself does not failover. Resources on this host do, but
OP does not ask about it.

Pacemaker resource used to monitor remote host may failover as any
other cluster resource. This failover does not require any fencing *of
remote host itself*, and in this particular case connection between
two cluster nodes was present all the time (at least, as long as we
can believe logs) so there was no reason for fencing as well. Whether
pacemaker should attempt to failover this resource to another node if
connection to remote host fails, is subject to discussion.

So fencing of the remote host itself is most certainly *not* part of
the failover of the resource that monitors this remote host.
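
In configuration terms, the two objects look roughly like this (a minimal sketch using the names from this thread, not the poster's full CIB):

    # The remote node itself: the thing resources run on and the thing fencing targets.
    node jangcluster-srv-4:remote

    # The connection resource: an ordinary cluster resource, run on a cluster node,
    # and the thing that can fail over between cluster nodes.
    primitive jangcluster-srv-4 ocf:pacemaker:remote \
        params server=jangcluster-srv-4 reconnect_interval=1m \
        op monitor interval=30s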

> >>> "Janghyuk Boo"  schrieb am 26.10.2021 um 22:09 in
> Nachricht
> :
> Dear Community ,
> Thank you Ken for your reply last time.
> I attached the log messages as requested from the last thread.
> I have a Pacemaker cluster with two cluster nodes with two network interfaces
> each, and two remote nodes and a prototyped fencing agent (GPFS-Fence) to cut a
> host's access from the clustered filesystem.
> I noticed that the remote node gets fenced when the quorum node it's connected to
> gets fenced or experiences network failure.
> For example, when I disconnected srv-2 from the rest of the cluster by using
> iptables on srv-2
> iptables -A INPUT -s [IP of srv-1] -j DROP ; iptables -A OUTPUT -s [IP of
> srv-1] -j DROP
> iptables -A INPUT -s [IP of srv-3] -j DROP ; iptables -A OUTPUT -s [IP of
> srv-3] -j DROP
> iptables -A INPUT -s [IP of srv-4] -j DROP ; iptables -A OUTPUT -s [IP of
> srv-4] -j DROP
> I expected that remote node jangcluster-srv-4 would fail over to srv-1 given my
> location constraints,
> but the remote node's monitor ‘jangcluster-srv-4_monitor’ failed and srv-4 was
> getting fenced before attempting to fail over.
> What would be the proper way to simulate the network failover?
> How can I configure the cluster so that remote node srv-4 fails over instead
> of getting fenced?
>
>
>
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/