[ClusterLabs] Antw: Re: Antw: Re: why is node fenced ?

2019-08-13 Thread Ulrich Windl
>>> "Lentes, Bernd"  schrieb am 13.08.2019
um
16:03 in Nachricht
<854237493.2026097.1565705038122.javamail.zim...@helmholtz-muenchen.de>:

> 
> - On Aug 13, 2019, at 3:14 PM, Ulrich Windl 
> ulrich.wi...@rz.uni-regensburg.de wrote:
> 
>> You said you booted the hosts sequentially. From the logs they were starting
>> in parallel.
>> 
> 
> No. last says:

But why do the eth interfaces on both nodes come up the same second
(2019-08-09T17:42:19)?

> ha-idg-1: 
> reboot   system boot  4.12.14-95.29-de Fri Aug  9 17:42 - 15:56 (3+22:14)
> 
> ha-idg-2:
> reboot   system boot  4.12.14-95.29-de Fri Aug  9 18:08 - 15:58 (3+21:49)
> root pts/010.35.34.70  Fri Aug  9 17:24 - crash  (00:44)
> (unknown :0   :0   Fri Aug  9 17:24 - crash  (00:44)
> reboot   system boot  4.12.14-95.29-de Fri Aug  9 17:23 - 15:58 (3+22:34)
> 
>>> This is the initialization of the bond1 on ha‑idg‑1 during boot.
>>> 3 seconds later bond1 is fine:
>>> 
>>> 2019‑08‑09T17:42:19.299886+02:00 ha‑idg‑2 kernel: [ 1232.117470] tg3
>>> :03:04.0 eth2: Link is up at 1000 Mbps, full duplex
>>> 2019‑08‑09T17:42:19.299908+02:00 ha‑idg‑2 kernel: [ 1232.117482] tg3
>>> :03:04.0 eth2: Flow control is on for TX and on for RX
>>> 2019‑08‑09T17:42:19.315756+02:00 ha‑idg‑2 kernel: [ 1232.131565] tg3
>>> :03:04.1 eth3: Link is up at 1000 Mbps, full duplex
>>> 2019‑08‑09T17:42:19.315767+02:00 ha‑idg‑2 kernel: [ 1232.131568] tg3
>>> :03:04.1 eth3: Flow control is on for TX and on for RX
>>> 2019‑08‑09T17:42:19.351781+02:00 ha‑idg‑2 kernel: [ 1232.169386] bond1:
link
>> 
>>> status definitely up for interface eth2, 1000 Mbps full duplex
>>> 2019‑08‑09T17:42:19.351792+02:00 ha‑idg‑2 kernel: [ 1232.169390] bond1:
>> making
>>> interface eth2 the new active one
>>> 2019‑08‑09T17:42:19.352521+02:00 ha‑idg‑2 kernel: [ 1232.169473] bond1:
>> first
>>> active interface up!
>>> 2019‑08‑09T17:42:19.352532+02:00 ha‑idg‑2 kernel: [ 1232.169480] bond1:
link
>> 
>>> status definitely up for interface eth3, 1000 Mbps full duplex
>>> 
>>> also on ha‑idg‑1:
>>> 
>>> 2019‑08‑09T17:42:19.168035+02:00 ha‑idg‑1 kernel: [  110.164250] tg3
>>> :02:00.3 eth3: Link is up at 1000 Mbps, full duplex
>>> 2019‑08‑09T17:42:19.168050+02:00 ha‑idg‑1 kernel: [  110.164252] tg3
>>> :02:00.3 eth3: Flow control is on for TX and on for RX
>>> 2019‑08‑09T17:42:19.168052+02:00 ha‑idg‑1 kernel: [  110.164254] tg3
>>> :02:00.3 eth3: EEE is disabled
>>> 2019‑08‑09T17:42:19.172020+02:00 ha‑idg‑1 kernel: [  110.171378] tg3
>>> :02:00.2 eth2: Link is up at 1000 Mbps, full duplex
>>> 2019‑08‑09T17:42:19.172028+02:00 ha‑idg‑1 kernel: [  110.171380] tg3
>>> :02:00.2 eth2: Flow control is on for TX and on for RX
>>> 2019‑08‑09T17:42:19.172029+02:00 ha‑idg‑1 kernel: [  110.171382] tg3
>>> :02:00.2 eth2: EEE is disabled
>>>  ...
>>> 2019‑08‑09T17:42:19.244066+02:00 ha‑idg‑1 kernel: [  110.240310] bond1:
link
>> 
>>> status definitely up for interface eth2, 1000 Mbps full duplex
>>> 2019‑08‑09T17:42:19.244083+02:00 ha‑idg‑1 kernel: [  110.240311] bond1:
>> making
>>> interface eth2 the new active one
>>> 2019‑08‑09T17:42:19.244085+02:00 ha‑idg‑1 kernel: [  110.240353] bond1:
>> first
>>> active interface up!
>>> 2019‑08‑09T17:42:19.244087+02:00 ha‑idg‑1 kernel: [  110.240356] bond1:
link
>> 
>>> status definitely up for interface eth3, 1000 Mbps full duplex
>>> 
>>> And the cluster is started afterwards on ha‑idg‑1 at 17:43:04. I don't
find
>> 
>>> further entries for problems with bond1. So i think it's not related.
>>> Time is synchronized by ntp.
> 
> The two bonding devices (bond1) are connected directly (point-to-point).
> So when eth2 or eth3, the interfaces used for the bonding, come up on one
> host, the other host sees the link immediately.
> 
> 
> Bernd
>  
> 
> Helmholtz Zentrum Muenchen
> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
> Ingolstaedter Landstr. 1
> 85764 Neuherberg
> www.helmholtz-muenchen.de 
> Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
> Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich 
> Bassler, Kerstin Guenther
> Registergericht: Amtsgericht Muenchen HRB 6466
> USt-IdNr: DE 129521671
> 




Re: [ClusterLabs] Master/slave failover does not work as expected

2019-08-13 Thread Jan Pokorný
On 13/08/19 09:44 +0200, Ulrich Windl wrote:
 Harvey Shepherd  wrote on 12.08.2019 at 23:38
> in message :
>> I've been experiencing exactly the same issue. Pacemaker prioritises 
>> restarting the failed resource over maintaining a master instance. In my 
>> case 
>> I used crm_simulate to analyse the actions planned and taken by pacemaker 
>> during resource recovery. It showed that the system did plan to failover the 
>> master instance, but it was near the bottom of the action list. Higher 
>> priority was given to restarting the failed instance, consequently when that 
>> had occurred, it was easier just to promote the same instance rather than 
>> failing over.
> 
> That's interesting: Maybe usually it's actually faster to restart a
> failed (master) process rather than promoting a slave to master,
> possibly demoting the old master to slave, etc.
> 
> But most obviously while there is a (possible) resource utilization
> for resources, there is none for operations (AFAIK): If one could
> configure "operation costs" (maybe as rules), the cluster could
> prefer the transition with least costs. Unfortunately it will make
> things more complicated.
> 
> I could even imagine if you set the cost for "stop" to infinity, the
> cluster will not even try to stop the resource, but will fence the
> node instead...

Very courageous and highly nontrivial if you think about the
scalability impact (when at it, not that these wouldn't be mitigable
to some extent, switching single brain/DC into segmented multi-leader
approach met with hierarchical scheduling -- there are usually some
clusters [pun intended] of resources rather than each one coinciding
with all the others when the total count goes up).

Anyway, thanks for sharing the ideas, Ulrich, not just now :-)
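
For what it's worth, the closest existing per-operation knob to the "stop cost
= infinity" idea is probably on-fail: it does not skip the stop attempt, but it
does escalate a failed stop straight to fencing. A minimal crmsh sketch
(resource name and timeouts purely illustrative):

    primitive p_example ocf:pacemaker:Dummy \
        op monitor interval=30s \
        op stop interval=0 timeout=90s on-fail=fence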

-- 
Jan (Poki)



Re: [ClusterLabs] Increasing fence timeout

2019-08-13 Thread Casey & Gina
Thank you, I reached the same conclusion after reading through the script.

Another question - I am no longer seeing the error quoted below as I've 
increased shell_timeout to 30 seconds, but failovers are still happening.  From 
the logs, it appears that the cluster simply loses communication with one of 
the nodes.  Is there a way to increase a timeout such that it waits a while to 
see if it can re-establish the connection before performing a failover?

Thank you,
-- 
Casey
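
For the "wait a while before failing over" part, the knob that usually governs
this is the corosync totem token timeout rather than anything on the fence
device. A hedged sketch of /etc/corosync/corosync.conf -- the value is purely
illustrative, must match on every node, and needs a corosync restart (or
reload, if your version supports it) to take effect:

    totem {
        version: 2
        # how long corosync tolerates missing heartbeat/token traffic before
        # declaring a node dead and triggering recovery/fencing
        token: 10000
    }

Raising it delays failover for genuine failures by the same amount, so it is a
trade-off.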

> On Aug 12, 2019, at 1:28 AM, Oyvind Albrigtsen  wrote:
> 
> You should be able to increase this timeout by running:
> pcs stonith update  shell_timeout=10
> 
> Oyvind
> 
> On 08/08/19 12:13 -0600, Casey & Gina wrote:
>> Hi, I'm currently running into periodic premature killing of nodes due to 
>> the fence monitor timeout being set to 5 seconds.  Here is an example 
>> message from the logs:
>> 
>> fence_vmware_rest[22334] stderr: [ Exception: Operation timed out after 5001 
>> milliseconds with 0 bytes received ]
>> 
>> How can I increase this timeout using PCS?
>> 
>> Thank you,
>> -- 
>> Casey



Re: [ClusterLabs] Antw: Antw: Re: Q: "crmd[7281]: warning: new_event_notification (7281-97955-15): Broken pipe (32)" as response to resource cleanup

2019-08-13 Thread Ken Gaillot
On Tue, 2019-08-13 at 11:06 +0200, Ulrich Windl wrote:
> Hi,
> 
> an update:
> After setting a failure-timeout for the resource that stale monitor
> failure
> was removed automatically at next cluster recheck (it seems).
> Still I wonder why a resource cleanup didn't do that (bug?).

Possibly ... also possibly fixed already, as there have been a few
clean-up related fixes in the past few versions. I'm not sure what's
been backported in the build you have.

> 
> Regards,
> Ulrich
> 
> 
> > > > "Ulrich Windl"  schrieb am
> > > > 13.08.2019
> 
> um
> 10:07 in Nachricht <5d526fb002a100032...@gwsmtp.uni-regensburg.de
> >:
> > > > > Ken Gaillot  schrieb am 13.08.2019 um
> > > > > 01:03 in
> > 
> > Nachricht
> > :
> > > On Mon, 2019‑08‑12 at 17:46 +0200, Ulrich Windl wrote:
> > > > Hi!
> > > > 
> > > > I just noticed that a "crm resource cleanup " caused some
> > > > unexpected behavior and the syslog message:
> > > > crmd[7281]:  warning: new_event_notification (7281‑97955‑15):
> > > > Broken
> > > > pipe (32)
> > > > 
> > > > It's SLES14 SP4 last updated Sept. 2018 (up since then,
> > > > pacemaker‑
> > > > 1.1.19+20180928.0d2680780‑1.8.x86_64).
> > > > 
> > > > The cleanup was due to a failed monitor. As an unexpected
> > > > consequence
> > > > of this cleanup, CRM seemed to restart the complete resource
> > > > (and
> > > > dependencies), even though it was running.
> > > 
> > > I assume the monitor failure was old, and recovery had already
> > > completed? If not, recovery might have been initiated before the
> > > clean‑
> > > up was recorded.
> > > 
> > > > I noticed that a manual "crm_resource ‑C ‑r  ‑N "
> > > > command
> > > > has the same effect (multiple resources are "Cleaned up",
> > > > resources
> > > > are restarted seemingly before the "probe" is done.).
> > > 
> > > Can you verify whether the probes were done? The DC should log a
> > > message when each _monitor_0 result comes in.
> > 
> > So here's a rough sketch of events:
> > 17:10:23 crmd[7281]:   notice: State transition S_IDLE ->
> > S_POLICY_ENGINE
> > ...no probes yet...
> > 17:10:24 pengine[7280]:  warning: Processing failed monitor of 
> > prm_nfs_server
> > on rksaph11: not running
> > ...lots of starts/restarts...
> > 17:10:24 pengine[7280]:   notice:  * Restartprm_nfs_server  
> > ...
> > 17:10:24 crmd[7281]:   notice: Processing graph 6628
> > (ref=pe_calc-dc-1565622624-7313) derived from
> > /var/lib/pacemaker/pengine/pe-input-1810.bz2
> > ...monitors are being called...
> > 17:10:24 crmd[7281]:   notice: Result of probe operation for
> > prm_nfs_vg on
> > h11: 0 (ok)
> > ...the above was the first probe result...
> > 17:10:24 crmd[7281]:  warning: Action 33 (prm_nfs_vg_monitor_0) on
> > h11 
> > failed
> > (target: 7 vs. rc: 0): Error
> > ...not surprising to me: The resource was running; I don't know why
> > the
> > cluster want to start it...

That's normal, that's how pacemaker detects active resources after
clean-up. It schedules a probe and start in the assumption that the
probe will find the resource not running; if it is running, the probe
result will cause a new transition where the start isn't needed.

That message will be improved in the next version (already in master
branch), like:

notice: Transition 10 action 33 (prm_nfs_vg_monitor_0 on h11): expected
'not running' but got 'ok'
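
If you want to see exactly what was scheduled and why, the pe-input files
referenced in the log can be replayed offline; a sketch using the file named
in the excerpt above (path taken from the log):

    # show the transition pacemaker computed from that snapshot
    crm_simulate -S -x /var/lib/pacemaker/pengine/pe-input-1810.bz2
    # add -VV or --save-dotfile for more detail on individual actions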

> > 17:10:24 crmd[7281]:   notice: Transition 6629 (Complete=9,
> > Pending=0,
> > Fired=0, Skipped=0, Incomplete=0,
> > Source=/var/lib/pacemaker/pengine/pe-input-1811.bz2): Complete
> > 17:10:24 crmd[7281]:   notice: State transition S_TRANSITION_ENGINE
> > ->
> 
> S_IDLE
> > 
> > The really bad thing after this is that the "cleaned up" resource
> > still has
> > a
> > failed status (dated in the past (last-rc-change='Mon Aug 12
> > 04:52:23 
> > 2019')),
> > even though "running".
> > 
> > I tend to believe that the cluster is in a bad state, or the
> > software has a
> > problem cleaning the status of the monitor.

It does sound like a clean-up bug, but I'm not aware of any current
issues. I suspect it's already fixed.

> > The CIB status for the resource looks like this:
> >  > class="ocf"
> > provider="heartbeat">
> >> operation_key="prm_nfs_server_start_0" operation="start"
> > crm-debug-origin="do_update_resource" crm_feature_set="3.0.14"
> > transition-key="67:6583:0:d941efc1-de73-4ee4-b593-f65be9e90726"
> > transition-magic="0:0;67:6583:0:d941efc1-de73-4ee4-b593-
> > f65be9e90726"
> > exit-reason="" on_node="h11" call-id="799" rc-code="0" op-
> > status="0"
> > interval="0" last-run="1565582351" last-rc-change="1565582351" 
> > exec-time="708"
> > queue-time="0" op-digest="73311a0ef4ba8e9f1f97e05e989f6348"/>
> >> operation_key="prm_nfs_server_monitor_6" operation="monitor"
> > crm-debug-origin="do_update_resource" crm_feature_set="3.0.14"
> > transition-key="68:6583:0:d941efc1-de73-4ee4-b593-f65be9e90726"
> > transition-magic="0:0;68:6583:0:d941efc1-de73-4ee4-b593-
> > f65be9e90726"
> >

Re: [ClusterLabs] Strange lost quorum with qdevice

2019-08-13 Thread Jan Friesse

Олег Самойлов wrote:




On 13 Aug 2019, at 15:55, Jan Friesse  wrote:

There is going to be slightly different solution (set this timeouts based on 
corosync token timeout) which I'm working on, but it's kind of huge amount of 
work and not super high prio (workaround exists), so no ETA yet.


Will it be only in RedHat 8? Maybe it would be good to quickly fix the default 
sync_timeout to 40 in RedHat 7. I don't need this now, but there is some 
reputation risk with others. ;)



Yep. Even though it is not super high prio, it is still high enough prio 
to get into RHEL 7/8, all supported Fedoras, and upstream-wise into both 
corosync 2.4 (hopefully 2.4.6) and corosync-qdevice 3.0 (hopefully 3.0.1).


Honza

Re: [ClusterLabs] Antw: Re: why is node fenced ?

2019-08-13 Thread Lentes, Bernd


- On Aug 13, 2019, at 3:14 PM, Ulrich Windl 
ulrich.wi...@rz.uni-regensburg.de wrote:

> You said you booted the hosts sequentially. From the logs they were starting 
> in
> parallel.
> 

No. last says:
ha-idg-1: 
reboot   system boot  4.12.14-95.29-de Fri Aug  9 17:42 - 15:56 (3+22:14)

ha-idg-2:
reboot   system boot  4.12.14-95.29-de Fri Aug  9 18:08 - 15:58 (3+21:49)
root pts/010.35.34.70  Fri Aug  9 17:24 - crash  (00:44)
(unknown :0   :0   Fri Aug  9 17:24 - crash  (00:44)
reboot   system boot  4.12.14-95.29-de Fri Aug  9 17:23 - 15:58 (3+22:34)

>> This is the initialization of the bond1 on ha‑idg‑1 during boot.
>> 3 seconds later bond1 is fine:
>> 
>> 2019‑08‑09T17:42:19.299886+02:00 ha‑idg‑2 kernel: [ 1232.117470] tg3
>> :03:04.0 eth2: Link is up at 1000 Mbps, full duplex
>> 2019‑08‑09T17:42:19.299908+02:00 ha‑idg‑2 kernel: [ 1232.117482] tg3
>> :03:04.0 eth2: Flow control is on for TX and on for RX
>> 2019‑08‑09T17:42:19.315756+02:00 ha‑idg‑2 kernel: [ 1232.131565] tg3
>> :03:04.1 eth3: Link is up at 1000 Mbps, full duplex
>> 2019‑08‑09T17:42:19.315767+02:00 ha‑idg‑2 kernel: [ 1232.131568] tg3
>> :03:04.1 eth3: Flow control is on for TX and on for RX
>> 2019‑08‑09T17:42:19.351781+02:00 ha‑idg‑2 kernel: [ 1232.169386] bond1: link
> 
>> status definitely up for interface eth2, 1000 Mbps full duplex
>> 2019‑08‑09T17:42:19.351792+02:00 ha‑idg‑2 kernel: [ 1232.169390] bond1:
> making
>> interface eth2 the new active one
>> 2019‑08‑09T17:42:19.352521+02:00 ha‑idg‑2 kernel: [ 1232.169473] bond1:
> first
>> active interface up!
>> 2019‑08‑09T17:42:19.352532+02:00 ha‑idg‑2 kernel: [ 1232.169480] bond1: link
> 
>> status definitely up for interface eth3, 1000 Mbps full duplex
>> 
>> also on ha‑idg‑1:
>> 
>> 2019‑08‑09T17:42:19.168035+02:00 ha‑idg‑1 kernel: [  110.164250] tg3
>> :02:00.3 eth3: Link is up at 1000 Mbps, full duplex
>> 2019‑08‑09T17:42:19.168050+02:00 ha‑idg‑1 kernel: [  110.164252] tg3
>> :02:00.3 eth3: Flow control is on for TX and on for RX
>> 2019‑08‑09T17:42:19.168052+02:00 ha‑idg‑1 kernel: [  110.164254] tg3
>> :02:00.3 eth3: EEE is disabled
>> 2019‑08‑09T17:42:19.172020+02:00 ha‑idg‑1 kernel: [  110.171378] tg3
>> :02:00.2 eth2: Link is up at 1000 Mbps, full duplex
>> 2019‑08‑09T17:42:19.172028+02:00 ha‑idg‑1 kernel: [  110.171380] tg3
>> :02:00.2 eth2: Flow control is on for TX and on for RX
>> 2019‑08‑09T17:42:19.172029+02:00 ha‑idg‑1 kernel: [  110.171382] tg3
>> :02:00.2 eth2: EEE is disabled
>>  ...
>> 2019‑08‑09T17:42:19.244066+02:00 ha‑idg‑1 kernel: [  110.240310] bond1: link
> 
>> status definitely up for interface eth2, 1000 Mbps full duplex
>> 2019‑08‑09T17:42:19.244083+02:00 ha‑idg‑1 kernel: [  110.240311] bond1:
> making
>> interface eth2 the new active one
>> 2019‑08‑09T17:42:19.244085+02:00 ha‑idg‑1 kernel: [  110.240353] bond1:
> first
>> active interface up!
>> 2019‑08‑09T17:42:19.244087+02:00 ha‑idg‑1 kernel: [  110.240356] bond1: link
> 
>> status definitely up for interface eth3, 1000 Mbps full duplex
>> 
>> And the cluster is started afterwards on ha‑idg‑1 at 17:43:04. I don't find
> 
>> further entries for problems with bond1. So i think it's not related.
>> Time is synchronized by ntp.

The two bonding devices (bond1) are connected directly (point-to-point).
So when eth2 or eth3, the interfaces used for the bonding, come up on one host,
the other host sees the link immediately.


Bernd
 

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich Bassler, 
Kerstin Guenther
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671


Re: [ClusterLabs] why is node fenced ?

2019-08-13 Thread Lentes, Bernd



- On Aug 13, 2019, at 3:34 PM, Matthias Ferdinand m...@14v.de wrote:
>> 17:26:35  crm node standby ha-idg1-
> 
> if that is not a copy&paste error (ha-idg1- vs. ha-idg-1), then ha-idg-1
> was not set to standby, and installing updates may have done some
> meddling with corosync/pacemaker (like stopping corosync without
> stopping pacemaker) while having active resources.
> 

It's a typo. There were no updates for corosync or pacemaker.


Bernd
 

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich Bassler, 
Kerstin Guenther
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671



Re: [ClusterLabs] Strange lost quorum with qdevice

2019-08-13 Thread Олег Самойлов


> On 13 Aug 2019, at 15:55, Jan Friesse  wrote:
> 
> There is going to be slightly different solution (set this timeouts based on 
> corosync token timeout) which I'm working on, but it's kind of huge amount of 
> work and not super high prio (workaround exists), so no ETA yet.

Will it be only in RedHat 8? Maybe it would be good to quickly fix the default 
sync_timeout to 40 in RedHat 7. I don't need this now, but there is some 
reputation risk with others. ;)

Re: [ClusterLabs] Master/slave failover does not work as expected

2019-08-13 Thread Michael Powell
around this time
>  
> 2019-08-09T17:42:16.427947+02:00 ha-idg-2 kernel: [ 1229.245533] bond1: now 
> running without any active interface!
>  
> so perhaps that's related.
>  
> HTH,
> Chris
>  
>  
> On 8/12/19, 12:09 PM, "Users on behalf of Lentes, Bernd" 
>  bernd.len...@helmholtz-muenchen.de> wrote:
>  
> Hi,
>
> last Friday (9th of August) i had to install patches on my two-node 
> cluster.
> I put one of the nodes (ha-idg-2) into standby (crm node standby 
> ha-idg-2), patched it, rebooted,
> started the cluster (systemctl start pacemaker) again, put the node again 
> online, everything fine.
>
> Then i wanted to do the same procedure with the other node (ha-idg-1).
> I put it in standby, patched it, rebooted, started pacemaker again.
> But then ha-idg-1 fenced ha-idg-2, it said the node is unclean.
> I know that nodes which are unclean need to be shutdown, that's logical.
>
> But i don't know from where the conclusion comes that the node is unclean 
> respectively why it is unclean,
> i searched in the logs and didn't find any hint.
>
> I put the syslog and the pacemaker log on a seafile share, i'd be very 
> thankful if you'll have a look.
> https://hmgubox.helmholtz-muenchen.de/d/53a10960932445fb9cfe/
>
> Here the cli history of the commands:
>
> 17:03:04  crm node standby ha-idg-2
> 17:07:15  zypper up (install Updates on ha-idg-2)
> 17:17:30  systemctl reboot
> 17:25:21  systemctl start pacemaker.service
> 17:25:47  crm node online ha-idg-2
> 17:26:35  crm node standby ha-idg1-
> 17:30:21  zypper up (install Updates on ha-idg-1)
> 17:37:32  systemctl reboot
> 17:43:04  systemctl start pacemaker.service
> 17:44:00  ha-idg-1 is fenced
>
> Thanks.
>
> Bernd
>
> OS is SLES 12 SP4, pacemaker 1.1.19, corosync 2.3.6-9.13.1
>
> 
> --
> 
> Bernd Lentes
> Systemadministration
> Institut für Entwicklungsgenetik
> Gebäude 35.34 - Raum 208
> HelmholtzZentrum münchen
> bernd.len...@helmholtz-muenchen.de
> phone: +49 89 3187 1241
> phone: +49 89 3187 3827
> fax: +49 89 3187 2294
> http://www.helmholtz-muenchen.de/idg
> 
> Only those who make no mistakes are perfect
> So the dead are perfect
> 
> 
> Helmholtz Zentrum Muenchen
> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
> Ingolstaedter Landstr. 1
> 85764 Neuherberg
> www.helmholtz-muenchen.de
> Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
> Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich 
> Bassler, Kerstin Guenther
> Registergericht: Amtsgericht Muenchen HRB 6466
> USt-IdNr: DE 129521671
>
>  
>  
> --
>  
> Message: 4
> Date: Mon, 12 Aug 2019 23:09:31 +0300
> From: Andrei Borzenkov 
> To: Cluster Labs - All topics related to open-source clustering
> welcomed 
> Cc: Venkata Reddy Chappavarapu 
> Subject: Re: [ClusterLabs] Master/slave failover does not work as
> expected
> Message-ID:
> 
> 
> Content-Type: text/plain; charset="utf-8"
>  
> On Mon, Aug 12, 2019 at 4:12 PM Michael Powell < 
> michael.pow...@harmonicinc.com> wrote:
>  
> > At 07:44:49, the ss agent discovers that the master instance has 
> > failed on node *mgraid?-0* as a result of a failed *ssadm* request 
> > in response to an *ss_monitor()* operation.  It issues a *crm_master 
> > -Q -D* command with the intent of demoting the master and promoting 
> > the slave, on the other node, to master.  The *ss_demote()* function 
> > finds that the application is no longer running and returns 
> > *OCF_NOT_RUNNING* (7).  In the older product, this was sufficient to 
> > promote the other instance to master, but in the current product, 
> > that does not happen.  Currently, the failed application is 
> > restarted, as expected, and is promoted to master, but this takes 10's of 
> > seconds.
> > 
> > 
> > 
>  
> Did you try to disable resource stickiness for this ms?
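
For reference, a hedged sketch of zeroing out stickiness on just that ms
resource (the resource name ms_ss is hypothetical):

    crm_resource -r ms_ss -m -p resource-stickiness -v 0

Setting the value back, or deleting the meta attribute again, restores the
previous behaviour once testing is done.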

Re: [ClusterLabs] why is node fenced ?

2019-08-13 Thread Matthias Ferdinand
On Mon, Aug 12, 2019 at 04:09:48PM -0400, users-requ...@clusterlabs.org wrote:
> Date: Mon, 12 Aug 2019 18:09:24 +0200 (CEST)
> From: "Lentes, Bernd" 
> To: Pacemaker ML 
> Subject: [ClusterLabs] why is node fenced ?
> Message-ID:
>   <546330844.1686419.1565626164456.javamail.zim...@helmholtz-muenchen.de>
>   
...
> 17:26:35  crm node standby ha-idg1-

if that is not a copy&paste error (ha-idg1- vs. ha-idg-1), then ha-idg-1
was not set to standby, and installing updates may have done some
meddling with corosync/pacemaker (like stopping corosync without
stopping pacemaker) while having active resources.

Matthias


[ClusterLabs] Antw: Re: why is node fenced ?

2019-08-13 Thread Ulrich Windl
You said you booted the hosts sequentially. From the logs they were starting in
parallel.

>>> "Lentes, Bernd"  schrieb am 13.08.2019
um
13:53 in Nachricht
<767205671.1953556.1565697218136.javamail.zim...@helmholtz-muenchen.de>:
> ‑ On Aug 12, 2019, at 7:47 PM, Chris Walker cwal...@cray.com wrote:
> 
>> When ha‑idg‑1 started Pacemaker around 17:43, it did not see ha‑idg‑2, for
>> example,
>> 
>> Aug 09 17:43:05 [6318] ha‑idg‑1 pacemakerd: info: 
> pcmk_quorum_notification:
>> Quorum retained | membership=1320 members=1
>> 
>> after ~20s (dc‑deadtime parameter), ha‑idg‑2 is marked 'unclean' and
STONITHed
>> as part of startup fencing.
>> 
>> There is nothing in ha‑idg‑2's HA logs around 17:43 indicating that it saw
>> ha‑idg‑1 either, so it appears that there was no communication at all
between
>> the two nodes.
>> 
>> I'm not sure exactly why the nodes did not see one another, but there are
>> indications of network issues around this time
>> 
>> 2019‑08‑09T17:42:16.427947+02:00 ha‑idg‑2 kernel: [ 1229.245533] bond1:
now
>> running without any active interface!
>> 
>> so perhaps that's related.
> 
> This is the initialization of the bond1 on ha‑idg‑1 during boot.
> 3 seconds later bond1 is fine:
> 
> 2019‑08‑09T17:42:19.299886+02:00 ha‑idg‑2 kernel: [ 1232.117470] tg3 
> :03:04.0 eth2: Link is up at 1000 Mbps, full duplex
> 2019‑08‑09T17:42:19.299908+02:00 ha‑idg‑2 kernel: [ 1232.117482] tg3 
> :03:04.0 eth2: Flow control is on for TX and on for RX
> 2019‑08‑09T17:42:19.315756+02:00 ha‑idg‑2 kernel: [ 1232.131565] tg3 
> :03:04.1 eth3: Link is up at 1000 Mbps, full duplex
> 2019‑08‑09T17:42:19.315767+02:00 ha‑idg‑2 kernel: [ 1232.131568] tg3 
> :03:04.1 eth3: Flow control is on for TX and on for RX
> 2019‑08‑09T17:42:19.351781+02:00 ha‑idg‑2 kernel: [ 1232.169386] bond1: link

> status definitely up for interface eth2, 1000 Mbps full duplex
> 2019‑08‑09T17:42:19.351792+02:00 ha‑idg‑2 kernel: [ 1232.169390] bond1:
making 
> interface eth2 the new active one
> 2019‑08‑09T17:42:19.352521+02:00 ha‑idg‑2 kernel: [ 1232.169473] bond1:
first 
> active interface up!
> 2019‑08‑09T17:42:19.352532+02:00 ha‑idg‑2 kernel: [ 1232.169480] bond1: link

> status definitely up for interface eth3, 1000 Mbps full duplex
> 
> also on ha‑idg‑1:
> 
> 2019‑08‑09T17:42:19.168035+02:00 ha‑idg‑1 kernel: [  110.164250] tg3 
> :02:00.3 eth3: Link is up at 1000 Mbps, full duplex
> 2019‑08‑09T17:42:19.168050+02:00 ha‑idg‑1 kernel: [  110.164252] tg3 
> :02:00.3 eth3: Flow control is on for TX and on for RX
> 2019‑08‑09T17:42:19.168052+02:00 ha‑idg‑1 kernel: [  110.164254] tg3 
> :02:00.3 eth3: EEE is disabled
> 2019‑08‑09T17:42:19.172020+02:00 ha‑idg‑1 kernel: [  110.171378] tg3 
> :02:00.2 eth2: Link is up at 1000 Mbps, full duplex
> 2019‑08‑09T17:42:19.172028+02:00 ha‑idg‑1 kernel: [  110.171380] tg3 
> :02:00.2 eth2: Flow control is on for TX and on for RX
> 2019‑08‑09T17:42:19.172029+02:00 ha‑idg‑1 kernel: [  110.171382] tg3 
> :02:00.2 eth2: EEE is disabled
>  ...
> 2019‑08‑09T17:42:19.244066+02:00 ha‑idg‑1 kernel: [  110.240310] bond1: link

> status definitely up for interface eth2, 1000 Mbps full duplex
> 2019‑08‑09T17:42:19.244083+02:00 ha‑idg‑1 kernel: [  110.240311] bond1:
making 
> interface eth2 the new active one
> 2019‑08‑09T17:42:19.244085+02:00 ha‑idg‑1 kernel: [  110.240353] bond1:
first 
> active interface up!
> 2019‑08‑09T17:42:19.244087+02:00 ha‑idg‑1 kernel: [  110.240356] bond1: link

> status definitely up for interface eth3, 1000 Mbps full duplex
> 
> And the cluster is started afterwards on ha‑idg‑1 at 17:43:04. I don't find

> further entries for problems with bond1. So i think it's not related.
> Time is synchronized by ntp.
> 
> 
> Bernd
>  
> 
> Helmholtz Zentrum Muenchen
> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
> Ingolstaedter Landstr. 1
> 85764 Neuherberg
> www.helmholtz‑muenchen.de 
> Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
> Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich 
> Bassler, Kerstin Guenther
> Registergericht: Amtsgericht Muenchen HRB 6466
> USt‑IdNr: DE 129521671
> 




Re: [ClusterLabs] Strange lost quorum with qdevice

2019-08-13 Thread Jan Friesse

Олег Самойлов wrote:




On 12 Aug 2019, at 8:46, Jan Friesse  wrote:

Let me try to bring some light in there:

- dpd_interval is qnetd variable how often qnetd walks thru the list of all 
clients (qdevices) and checks timestamp of last sent message. If diff between 
current timestamp and last sent message timestamp is larger than 2 * timeout 
sent by client then client is considered as death.

- interval - affects how often qdevice sends heartbeat to corosync (this is 
half of the interval) about its liveness and also how often it sends heartbeat 
to qnetd (0.8 * interval). On corosync side this is used as a timeout after 
which qdevice daemon is considered death and its votes are no longer valid.

- sync_timeout - Not used by qdevice/qnetd. Used by corosync during sync phase. 
If corosync doesn't get reply by qdevice till this timeout it considers qdevice 
daemon death and continues sync process.

It was probably not evident from my reply, but what I meant was to change just 
dpd_interval. Could you please recheck with dpd_interval=1, timeout=20, 
sync_timeout=60?
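
To make the three knobs concrete, a minimal corosync.conf sketch of the
qdevice side (host and values are purely illustrative; corosync.conf uses
milliseconds, and dpd_interval is configured on the corosync-qnetd side, not
here):

    quorum {
        provider: corosync_votequorum
        device {
            model: net
            votes: 1
            timeout: 20000        # heartbeat timeout announced to corosync/qnetd
            sync_timeout: 60000   # timeout used by corosync during the sync phase
            net {
                host: qnetd.example.com   # hypothetical qnetd host
                algorithm: ffsplit
            }
        }
    }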


Did you, by a typo, call the 'timeout' parameter of qdevice 'interval'?


Yep



I can't understand how your configuration is meant to work: dpd_interval=1,
timeout=20, sync_timeout=60. Qnetd will check every second, and a timeout will
be detected 2 seconds after the last message from Qdevice. But Qdevice will
send a message only every 0.8*20=16 seconds. So, by your description, quorum
must be lost every time.


Nope. With dpd_interval=1 Qnetd would check every second whether the client
sent (any) message no longer than 32 (0.8*20*2) seconds ago.





The reality is also strange. Qnetd detects the lost client not after 2 seconds, as in your description, but 


I never wrote anything about 2 seconds.

after 33 seconds, which is only slightly less than the 40 s seen with
dpd_interval=20. With a 33-second timeout the real reaction time is 54 s, which
is enough for a 60 s timeout in this example. But it looks like a gap of about
6 seconds (+-, maybe slightly random) is enough; there are no problems with
lost quorum.


Aug 13 02:21:38 witness corosync-qnetd: Aug 13 02:21:38 debug   Client 
:::192.168.89.12:36150 (cluster krogan1, node_id 2) sent membership node 
list.
Aug 13 02:21:38 witness corosync-qnetd: Aug 13 02:21:38 debug msg seq num = 
11
Aug 13 02:21:38 witness corosync-qnetd: Aug 13 02:21:38 debug ring id = 
(2.30)
Aug 13 02:21:39 witness corosync-qnetd: Aug 13 02:21:38 debug heuristics = 
Undefined
Aug 13 02:21:39 witness corosync-qnetd: Aug 13 02:21:38 debug node list:
Aug 13 02:21:39 witness corosync-qnetd: Aug 13 02:21:38 debug   node_id = 
2, data_center_id = 0, node_state = not set
Aug 13 02:21:39 witness corosync-qnetd: Aug 13 02:21:38 debug   ffsplit: 
Membership for cluster krogan1 is not yet stable
Aug 13 02:21:39 witness corosync-qnetd: Aug 13 02:21:38 debug   Algorithm 
result vote is Wait for reply
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 warning Client 
:::192.168.89.11:48924 doesn't sent any message during 33000ms. 
Disconnecting
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   Client 
:::192.168.89.11:48924 (init_received 1, cluster krogan1, node_id 1) 
disconnect
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   ffsplit: 
Membership for cluster krogan1 is now stable
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   ffsplit: 
Quorate partition selected
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug node list:
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   node_id = 
2, data_center_id = 0, node_state = not set
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   ffsplit: No 
client gets NACK
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   Sending vote 
info to client :::192.168.89.12:36150 (cluster krogan1, node_id 2)
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug msg seq num = 
6
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug vote = ACK
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   Client 
:::192.168.89.12:36150 (cluster krogan1, node_id 2) replied back to vote 
info message
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug msg seq num = 
6
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   ffsplit: All 
ACK votes sent for cluster krogan1
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   Client 
:::192.168.89.12:36150 (cluster krogan1, node_id 2) sent quorum node list.
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug msg seq num = 
12
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug quorate = 1
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug node list:
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   node_id = 
1, data_center_id = 0, node_state = dead
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debu

Re: [ClusterLabs] Strange lost quorum with qdevice

2019-08-13 Thread Олег Самойлов


> On 12 Aug 2019, at 8:46, Jan Friesse  wrote:
> 
> Let me try to bring some light in there:
> 
> - dpd_interval is qnetd variable how often qnetd walks thru the list of all 
> clients (qdevices) and checks timestamp of last sent message. If diff between 
> current timestamp and last sent message timestamp is larger than 2 * timeout 
> sent by client then client is considered as death.
> 
> - interval - affects how often qdevice sends heartbeat to corosync (this is 
> half of the interval) about its liveness and also how often it sends 
> heartbeat to qnetd (0.8 * interval). On corosync side this is used as a 
> timeout after which qdevice daemon is considered death and its votes are no 
> longer valid.
> 
> - sync_timeout - Not used by qdevice/qnetd. Used by corosync during sync 
> phase. If corosync doesn't get reply by qdevice till this timeout it 
> considers qdevice daemon death and continues sync process.
> 
> It was probably not evident from my reply, but what I meant was to change 
> just dpd_interval. Could you please recheck with dpd_interval=1, timeout=20, 
> sync_timeout=60?

Did you, by a typo, call the 'timeout' parameter of qdevice 'interval'?

I can't understand how your configuration is meant to work: dpd_interval=1,
timeout=20, sync_timeout=60. Qnetd will check every second, and a timeout will
be detected 2 seconds after the last message from Qdevice. But Qdevice will
send a message only every 0.8*20=16 seconds. So, by your description, quorum
must be lost every time.

The reality is also strange. Qnetd detects the lost client not after 2 seconds,
as in your description, but after 33 seconds, which is only slightly less than
the 40 s seen with dpd_interval=20. With a 33-second timeout the real reaction
time is 54 s, which is enough for a 60 s timeout in this example. But it looks
like a gap of about 6 seconds (+-, maybe slightly random) is enough; there are
no problems with lost quorum. 

Aug 13 02:21:38 witness corosync-qnetd: Aug 13 02:21:38 debug   Client 
:::192.168.89.12:36150 (cluster krogan1, node_id 2) sent membership node 
list.
Aug 13 02:21:38 witness corosync-qnetd: Aug 13 02:21:38 debug msg seq num = 
11
Aug 13 02:21:38 witness corosync-qnetd: Aug 13 02:21:38 debug ring id = 
(2.30)
Aug 13 02:21:39 witness corosync-qnetd: Aug 13 02:21:38 debug heuristics = 
Undefined
Aug 13 02:21:39 witness corosync-qnetd: Aug 13 02:21:38 debug node list:
Aug 13 02:21:39 witness corosync-qnetd: Aug 13 02:21:38 debug   node_id = 
2, data_center_id = 0, node_state = not set
Aug 13 02:21:39 witness corosync-qnetd: Aug 13 02:21:38 debug   ffsplit: 
Membership for cluster krogan1 is not yet stable
Aug 13 02:21:39 witness corosync-qnetd: Aug 13 02:21:38 debug   Algorithm 
result vote is Wait for reply
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 warning Client 
:::192.168.89.11:48924 doesn't sent any message during 33000ms. 
Disconnecting
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   Client 
:::192.168.89.11:48924 (init_received 1, cluster krogan1, node_id 1) 
disconnect
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   ffsplit: 
Membership for cluster krogan1 is now stable
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   ffsplit: 
Quorate partition selected
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug node list:
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   node_id = 
2, data_center_id = 0, node_state = not set
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   ffsplit: No 
client gets NACK
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   Sending vote 
info to client :::192.168.89.12:36150 (cluster krogan1, node_id 2)
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug msg seq num = 
6
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug vote = ACK
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   Client 
:::192.168.89.12:36150 (cluster krogan1, node_id 2) replied back to vote 
info message
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug msg seq num = 
6
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   ffsplit: All 
ACK votes sent for cluster krogan1
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   Client 
:::192.168.89.12:36150 (cluster krogan1, node_id 2) sent quorum node list.
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug msg seq num = 
12
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug quorate = 1
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug node list:
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   node_id = 
1, data_center_id = 0, node_state = dead
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   node_id = 
2, data_center_id = 0, node_state = member
Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   Algorithm 
result vote is No change
Aug 13 02:22:3

Re: [ClusterLabs] why is node fenced ?

2019-08-13 Thread Lentes, Bernd
- On Aug 12, 2019, at 7:47 PM, Chris Walker cwal...@cray.com wrote:

> When ha-idg-1 started Pacemaker around 17:43, it did not see ha-idg-2, for
> example,
> 
> Aug 09 17:43:05 [6318] ha-idg-1 pacemakerd: info: 
> pcmk_quorum_notification:
> Quorum retained | membership=1320 members=1
> 
> after ~20s (dc-deadtime parameter), ha-idg-2 is marked 'unclean' and STONITHed
> as part of startup fencing.
> 
> There is nothing in ha-idg-2's HA logs around 17:43 indicating that it saw
> ha-idg-1 either, so it appears that there was no communication at all between
> the two nodes.
> 
> I'm not sure exactly why the nodes did not see one another, but there are
> indications of network issues around this time
> 
> 2019-08-09T17:42:16.427947+02:00 ha-idg-2 kernel: [ 1229.245533] bond1: now
> running without any active interface!
> 
> so perhaps that's related.

This is the initialization of the bond1 on ha-idg-1 during boot.
3 seconds later bond1 is fine:

2019-08-09T17:42:19.299886+02:00 ha-idg-2 kernel: [ 1232.117470] tg3 
:03:04.0 eth2: Link is up at 1000 Mbps, full duplex
2019-08-09T17:42:19.299908+02:00 ha-idg-2 kernel: [ 1232.117482] tg3 
:03:04.0 eth2: Flow control is on for TX and on for RX
2019-08-09T17:42:19.315756+02:00 ha-idg-2 kernel: [ 1232.131565] tg3 
:03:04.1 eth3: Link is up at 1000 Mbps, full duplex
2019-08-09T17:42:19.315767+02:00 ha-idg-2 kernel: [ 1232.131568] tg3 
:03:04.1 eth3: Flow control is on for TX and on for RX
2019-08-09T17:42:19.351781+02:00 ha-idg-2 kernel: [ 1232.169386] bond1: link 
status definitely up for interface eth2, 1000 Mbps full duplex
2019-08-09T17:42:19.351792+02:00 ha-idg-2 kernel: [ 1232.169390] bond1: making 
interface eth2 the new active one
2019-08-09T17:42:19.352521+02:00 ha-idg-2 kernel: [ 1232.169473] bond1: first 
active interface up!
2019-08-09T17:42:19.352532+02:00 ha-idg-2 kernel: [ 1232.169480] bond1: link 
status definitely up for interface eth3, 1000 Mbps full duplex

also on ha-idg-1:

2019-08-09T17:42:19.168035+02:00 ha-idg-1 kernel: [  110.164250] tg3 
:02:00.3 eth3: Link is up at 1000 Mbps, full duplex
2019-08-09T17:42:19.168050+02:00 ha-idg-1 kernel: [  110.164252] tg3 
:02:00.3 eth3: Flow control is on for TX and on for RX
2019-08-09T17:42:19.168052+02:00 ha-idg-1 kernel: [  110.164254] tg3 
:02:00.3 eth3: EEE is disabled
2019-08-09T17:42:19.172020+02:00 ha-idg-1 kernel: [  110.171378] tg3 
:02:00.2 eth2: Link is up at 1000 Mbps, full duplex
2019-08-09T17:42:19.172028+02:00 ha-idg-1 kernel: [  110.171380] tg3 
:02:00.2 eth2: Flow control is on for TX and on for RX
2019-08-09T17:42:19.172029+02:00 ha-idg-1 kernel: [  110.171382] tg3 
:02:00.2 eth2: EEE is disabled
 ...
2019-08-09T17:42:19.244066+02:00 ha-idg-1 kernel: [  110.240310] bond1: link 
status definitely up for interface eth2, 1000 Mbps full duplex
2019-08-09T17:42:19.244083+02:00 ha-idg-1 kernel: [  110.240311] bond1: making 
interface eth2 the new active one
2019-08-09T17:42:19.244085+02:00 ha-idg-1 kernel: [  110.240353] bond1: first 
active interface up!
2019-08-09T17:42:19.244087+02:00 ha-idg-1 kernel: [  110.240356] bond1: link 
status definitely up for interface eth3, 1000 Mbps full duplex

And the cluster is started afterwards on ha-idg-1 at 17:43:04. I don't find 
further entries for problems with bond1. So i think it's not related.
Time is synchronized by ntp.
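
As an aside, the ~20 s window Chris mentions above is the dc-deadtime cluster
property; if the second node regularly needs longer to show up at start-up, it
can be widened (crmsh syntax, the value is purely illustrative -- the default
is 20s):

    crm configure property dc-deadtime=2min

The trade-off is that a genuinely absent peer is also only declared dead that
much later at cluster start.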


Bernd
 

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich Bassler, 
Kerstin Guenther
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671



[ClusterLabs] pcs 0.9.168 released

2019-08-13 Thread Tomas Jelinek

I am happy to announce the latest release of pcs, version 0.9.168.

Source code is available at:
https://github.com/ClusterLabs/pcs/archive/0.9.168.tar.gz
or
https://github.com/ClusterLabs/pcs/archive/0.9.168.zip


Complete change log for this release:
## [0.9.168] - 2019-08-02

### Added
- It is now possible to disable pcsd SSL certificate being synced across
  the cluster during creating new cluster and adding a node to an
  existing cluster by setting `PCSD_SSL_CERT_SYNC_ENABLED` to `false` in
  pcsd config file ([rhbz#1665898])
- Length of DH key for SSL key exchange can be set in pcsd config file
- `pcs status` now shows failed and pending fencing actions and `pcs
  status --full` shows the whole fencing history. Pacemaker supporting
  fencing history is required. ([rhbz#1466088])
- `pcs stonith history` commands for displaying, synchronizing and
  cleaning up fencing history. Pacemaker supporting fencing history is
  required. ([rhbz#1595444])
- Support for clearing expired moves and bans of resources
  ([rhbz#1673829])
- HSTS is now enabled in pcsd ([rhbz#1558063])

### Fixed
- Pcs works even when PATH environment variable is not set
  ([rhbz#1671174])
- Fixed several "Unknown report" error messages
- Improved validation of qdevice heuristics exec options
  ([rhbz#1551663])
- Fixed crashes in the `pcs cluster auth` command ([rhbz#1676956])
- Pcs does not crash due to unhandled LibraryError exceptions
  ([rhbz#1710750])
- `pcs config restore` does not fail with "Invalid cross-device link"
  error any more ([rhbz#1712315])
- Fixed id conflict with current bundle configuration in `pcs resource
  bundle reset` ([rhbz#1725849])
- Standby nodes running resources are listed separately in `pcs status
  nodes` ([rhbz#1619253])
- Parsing arguments in the `pcs constraint order` and `pcs constraint
  colocation add` commands has been improved, errors which were
  previously silent are now reported ([rhbz#1500012])

### Changed
- Command `pcs resource bundle reset` no longer accepts the container type
  ([rhbz#1598197])


Thanks / congratulations to everyone who contributed to this release,
including Ivan Devat, Ondrej Mular and Tomas Jelinek

Cheers,
Tomas


[rhbz#1466088]: https://bugzilla.redhat.com/show_bug.cgi?id=1466088
[rhbz#1500012]: https://bugzilla.redhat.com/show_bug.cgi?id=1500012
[rhbz#1551663]: https://bugzilla.redhat.com/show_bug.cgi?id=1551663
[rhbz#1558063]: https://bugzilla.redhat.com/show_bug.cgi?id=1558063
[rhbz#1595444]: https://bugzilla.redhat.com/show_bug.cgi?id=1595444
[rhbz#1598197]: https://bugzilla.redhat.com/show_bug.cgi?id=1598197
[rhbz#1619253]: https://bugzilla.redhat.com/show_bug.cgi?id=1619253
[rhbz#1665898]: https://bugzilla.redhat.com/show_bug.cgi?id=1665898
[rhbz#1671174]: https://bugzilla.redhat.com/show_bug.cgi?id=1671174
[rhbz#1673829]: https://bugzilla.redhat.com/show_bug.cgi?id=1673829
[rhbz#1676956]: https://bugzilla.redhat.com/show_bug.cgi?id=1676956
[rhbz#1710750]: https://bugzilla.redhat.com/show_bug.cgi?id=1710750
[rhbz#1712315]: https://bugzilla.redhat.com/show_bug.cgi?id=1712315
[rhbz#1725849]: https://bugzilla.redhat.com/show_bug.cgi?id=1725849


[ClusterLabs] Antw: Re: Antw: why is node fenced ?

2019-08-13 Thread Ulrich Windl
>>> "Lentes, Bernd"  schrieb am 13.08.2019
um
10:54 in Nachricht
<848962511.1856599.1565686469666.javamail.zim...@helmholtz-muenchen.de>:

> 
> ‑ On Aug 13, 2019, at 9:00 AM, Ulrich Windl
ulrich.wi...@rz.uni‑regensburg.de 
> wrote:
> 
>> Personally I feel more save with updates when the whole cluster node is
>> offline, not standby. When you are going to boot anyway, it won't make much

> of
>> a difference. Also you don't have to remember to put the node back online
in
>> the configuration.
> 
> OK.
>  
>> For your case: After Rebooting the first node, the second one was DC. If
you
>> reboot that, the first node becomes DC, but when booting node 2 it still 
> have
>> the old config saying it it the DC. So both nodes have to agree on that. 
> Maybe
>> that's why the cluster is "unclean" for a while. Did it "go away" after
some
>> time?
> 
> For debugging, e.g. after fencing: is there something in the logs from which
> I can recognize which one was DC and which not? E.g. some differences in the 
> pacemaker.log?

In SLES12 it's in /var/log/pacemaker.log: Look for strings like "Set DC to "
and "Unset DC. Was ", or "I_ELECTION_DC" and "I_NOT_DC".

> 
> What about changes of the CIB and the following actions? If pengine does
> something after a diff in the CIB,
> this must be the DC, I guess.

Correct.

> 
> 
> Bernd
>  
> 
> Helmholtz Zentrum Muenchen
> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
> Ingolstaedter Landstr. 1
> 85764 Neuherberg
> www.helmholtz‑muenchen.de 
> Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
> Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich 
> Bassler, Kerstin Guenther
> Registergericht: Amtsgericht Muenchen HRB 6466
> USt‑IdNr: DE 129521671
> 




[ClusterLabs] Antw: Antw: Re: Q: "crmd[7281]: warning: new_event_notification (7281-97955-15): Broken pipe (32)" as response to resource cleanup

2019-08-13 Thread Ulrich Windl
Hi,

an update:
After setting a failure-timeout for the resource that stale monitor failure
was removed automatically at next cluster recheck (it seems).
Still I wonder why a resource cleanup didn't do that (bug?).
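
For anyone wanting to reproduce the workaround, a sketch of setting such a
failure-timeout (resource name taken from this thread, value illustrative);
expired failures are then purged at the next cluster-recheck-interval, which
matches what was observed above:

    crm_resource -r prm_nfs_server -m -p failure-timeout -v 10min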

Regards,
Ulrich


>>> "Ulrich Windl"  schrieb am 13.08.2019
um
10:07 in Nachricht <5d526fb002a100032...@gwsmtp.uni-regensburg.de>:
 Ken Gaillot  schrieb am 13.08.2019 um 01:03 in
> Nachricht
> :
>> On Mon, 2019‑08‑12 at 17:46 +0200, Ulrich Windl wrote:
>>> Hi!
>>> 
>>> I just noticed that a "crm resource cleanup " caused some
>>> unexpected behavior and the syslog message:
>>> crmd[7281]:  warning: new_event_notification (7281‑97955‑15): Broken
>>> pipe (32)
>>> 
>>> It's SLES14 SP4 last updated Sept. 2018 (up since then, pacemaker‑
>>> 1.1.19+20180928.0d2680780‑1.8.x86_64).
>>> 
>>> The cleanup was due to a failed monitor. As an unexpected consequence
>>> of this cleanup, CRM seemed to restart the complete resource (and
>>> dependencies), even though it was running.
>> 
>> I assume the monitor failure was old, and recovery had already
>> completed? If not, recovery might have been initiated before the clean‑
>> up was recorded.
>> 
>>> I noticed that a manual "crm_resource ‑C ‑r  ‑N " command
>>> has the same effect (multiple resources are "Cleaned up", resources
>>> are restarted seemingly before the "probe" is done.).
>> 
>> Can you verify whether the probes were done? The DC should log a
>> message when each _monitor_0 result comes in.
> 
> So here's a rough sketch of events:
> 17:10:23 crmd[7281]:   notice: State transition S_IDLE -> S_POLICY_ENGINE
> ...no probes yet...
> 17:10:24 pengine[7280]:  warning: Processing failed monitor of 
> prm_nfs_server
> on rksaph11: not running
> ...lots of starts/restarts...
> 17:10:24 pengine[7280]:   notice:  * Restartprm_nfs_server  
> ...
> 17:10:24 crmd[7281]:   notice: Processing graph 6628
> (ref=pe_calc-dc-1565622624-7313) derived from
> /var/lib/pacemaker/pengine/pe-input-1810.bz2
> ...monitors are being called...
> 17:10:24 crmd[7281]:   notice: Result of probe operation for prm_nfs_vg on
> h11: 0 (ok)
> ...the above was the first probe result...
> 17:10:24 crmd[7281]:  warning: Action 33 (prm_nfs_vg_monitor_0) on h11 
> failed
> (target: 7 vs. rc: 0): Error
> ...not surprising to me: The resource was running; I don't know why the
> cluster want to start it...
> 17:10:24 crmd[7281]:   notice: Transition 6629 (Complete=9, Pending=0,
> Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pacemaker/pengine/pe-input-1811.bz2): Complete
> 17:10:24 crmd[7281]:   notice: State transition S_TRANSITION_ENGINE ->
S_IDLE
> 
> The really bad thing after this is that the "cleaned up" resource still has

> a
> failed status (dated in the past (last-rc-change='Mon Aug 12 04:52:23 
> 2019')),
> even though "running".
> 
> I tend to believe that the cluster is in a bad state, or the software has a
> problem cleaning the status of the monitor.
> 
> The CIB status for the resource looks like this:
>  provider="heartbeat">
>operation_key="prm_nfs_server_start_0" operation="start"
> crm-debug-origin="do_update_resource" crm_feature_set="3.0.14"
> transition-key="67:6583:0:d941efc1-de73-4ee4-b593-f65be9e90726"
> transition-magic="0:0;67:6583:0:d941efc1-de73-4ee4-b593-f65be9e90726"
> exit-reason="" on_node="h11" call-id="799" rc-code="0" op-status="0"
> interval="0" last-run="1565582351" last-rc-change="1565582351" 
> exec-time="708"
> queue-time="0" op-digest="73311a0ef4ba8e9f1f97e05e989f6348"/>
>operation_key="prm_nfs_server_monitor_6" operation="monitor"
> crm-debug-origin="do_update_resource" crm_feature_set="3.0.14"
> transition-key="68:6583:0:d941efc1-de73-4ee4-b593-f65be9e90726"
> transition-magic="0:0;68:6583:0:d941efc1-de73-4ee4-b593-f65be9e90726"
> exit-reason="" on_node="h11" call-id="800" rc-code="0" op-status="0"
> interval="6" last-rc-change="1565582351" exec-time="499" queue-time="0"
> op-digest="9d8aa17b2a741c8328d7896459733e56"/>
>operation_key="prm_nfs_server_monitor_6" operation="monitor"
> crm-debug-origin="do_update_resource" crm_feature_set="3.0.14"
> transition-key="4:6568:0:d941efc1-de73-4ee4-b593-f65be9e90726"
> transition-magic="0:7;4:6568:0:d941efc1-de73-4ee4-b593-f65be9e90726"
> exit-reason="" on_node="h11" call-id="738" rc-code="7" op-status="0"
> interval="6" last-rc-change="1565578343" exec-time="0" queue-time="0"
> op-digest="9d8aa17b2a741c8328d7896459733e56"/>
> 
> 
> 
> Regards,
> Ulrich
> 
>> 
>>> Actually the manual says when cleaning up a single primitive, the
>>> whole group is cleaned up, unless using ‑‑force. Well ,I don't like
>>> this default, as I expect any status change from probe would
>>> propagate to the group anyway...
>> 
>> In 1.1, clean‑up always wipes the history of the affected resources,
>> regardless of whether the history is for success or failure. That means
>> all the cleaned resources will be reprobed. In 2.0, clean‑up by default

[ClusterLabs] Antw: Antw: why is node fenced ?

2019-08-13 Thread Ulrich Windl
>>> "Ulrich Windl"  schrieb am 13.08.2019 um
09:00 in Nachricht <5d52600102a100032...@gwsmtp.uni-regensburg.de>:

...
> Personally I feel more save with updates when the whole cluster node is

Of course I meant "more safe"... Time for coffein it seems ;-)

...




Re: [ClusterLabs] Ganesha, after a system reboot, showmounts nothing - why?

2019-08-13 Thread lejeczek
On 07/08/2019 16:48, lejeczek wrote:
> hi guys,
>
> after a reboot Ganesha exports are not there. It suffices to do: $ systemctl
> restart nfs-ganesha - and all is good again.
>
> Would you have any ideas why?
>
> I'm on Centos 7.6 with nfs-ganesha-gluster-2.7.6-1.el7.x86_64;
> glusterfs-server-6.4-1.el7.x86_64.
>
> many thanks, L.
>
has nobody seen the above problem?

Maybe somebody from @devel would comment - why do nfs-ganesha's exports
disappear after a system reboot, while simply restarting the service with
systemctl exports them again?

It seems like something is not up yet when nfs-ganesha starts?

many thanks, L.
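
One thing worth ruling out is plain start ordering: if the exports live on the
Gluster volume, nfs-ganesha may be starting before the volume (or the network)
is available. A hedged sketch of a systemd drop-in that delays it -- the unit
and file names are only an assumption; create it via "systemctl edit
nfs-ganesha" and run "systemctl daemon-reload" afterwards:

    # /etc/systemd/system/nfs-ganesha.service.d/order-after-gluster.conf
    [Unit]
    Wants=network-online.target
    After=network-online.target glusterd.service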




Re: [ClusterLabs] Antw: why is node fenced ?

2019-08-13 Thread Lentes, Bernd



- On Aug 13, 2019, at 9:00 AM, Ulrich Windl 
ulrich.wi...@rz.uni-regensburg.de wrote:

> Personally I feel more save with updates when the whole cluster node is
> offline, not standby. When you are going to boot anyway, it won't make much of
> a difference. Also you don't have to remember to put the node back online in
> the configuration.

OK.
 
> For your case: After Rebooting the first node, the second one was DC. If you
> reboot that, the first node becomes DC, but when booting node 2 it still have
> the old config saying it it the DC. So both nodes have to agree on that. Maybe
> that's why the cluster is "unclean" for a while. Did it "go away" after some
> time?

For debugging, e.g. after fencing: is there something in the logs from which
I can recognize which one was DC and which not? E.g. some differences in the 
pacemaker.log?

What about changes of the CIB and the following actions? If pengine does
something after a diff in the CIB,
this must be the DC, I guess.


Bernd
 

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich Bassler, 
Kerstin Guenther
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671



[ClusterLabs] Antw: Re: Q: "crmd[7281]: warning: new_event_notification (7281-97955-15): Broken pipe (32)" as response to resource cleanup

2019-08-13 Thread Ulrich Windl
>>> Ken Gaillot  schrieb am 13.08.2019 um 01:03 in
Nachricht
:
> On Mon, 2019‑08‑12 at 17:46 +0200, Ulrich Windl wrote:
>> Hi!
>> 
>> I just noticed that a "crm resource cleanup " caused some
>> unexpected behavior and the syslog message:
>> crmd[7281]:  warning: new_event_notification (7281‑97955‑15): Broken
>> pipe (32)
>> 
>> It's SLES14 SP4 last updated Sept. 2018 (up since then, pacemaker‑
>> 1.1.19+20180928.0d2680780‑1.8.x86_64).
>> 
>> The cleanup was due to a failed monitor. As an unexpected consequence
>> of this cleanup, CRM seemed to restart the complete resource (and
>> dependencies), even though it was running.
> 
> I assume the monitor failure was old, and recovery had already
> completed? If not, recovery might have been initiated before the clean‑
> up was recorded.
> 
>> I noticed that a manual "crm_resource ‑C ‑r  ‑N " command
>> has the same effect (multiple resources are "Cleaned up", resources
>> are restarted seemingly before the "probe" is done).
> 
> Can you verify whether the probes were done? The DC should log a
> message when each _monitor_0 result comes in.

So here's a rough sketch of events:
17:10:23 crmd[7281]:   notice: State transition S_IDLE -> S_POLICY_ENGINE
...no probes yet...
17:10:24 pengine[7280]:  warning: Processing failed monitor of prm_nfs_server
on rksaph11: not running
...lots of starts/restarts...
17:10:24 pengine[7280]:   notice:  * Restart    prm_nfs_server
...
17:10:24 crmd[7281]:   notice: Processing graph 6628
(ref=pe_calc-dc-1565622624-7313) derived from
/var/lib/pacemaker/pengine/pe-input-1810.bz2
...monitors are being called...
17:10:24 crmd[7281]:   notice: Result of probe operation for prm_nfs_vg on
h11: 0 (ok)
...the above was the first probe result...
17:10:24 crmd[7281]:  warning: Action 33 (prm_nfs_vg_monitor_0) on h11 failed
(target: 7 vs. rc: 0): Error
...not surprising to me: the resource was running; I don't know why the
cluster wanted to start it...
17:10:24 crmd[7281]:   notice: Transition 6629 (Complete=9, Pending=0,
Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-input-1811.bz2): Complete
17:10:24 crmd[7281]:   notice: State transition S_TRANSITION_ENGINE -> S_IDLE

The really bad thing after this is that the "cleaned up" resource still has a
failed status, dated in the past (last-rc-change='Mon Aug 12 04:52:23 2019'),
even though it is "running".

I tend to believe that the cluster is in a bad state, or the software has a
problem cleaning the status of the monitor.
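
For what it's worth, this is what I would run to inspect and then clear that
stale entry (resource and node names taken from the log above; --force limits
the clean-up to the single primitive, see Ken's explanation quoted below):

# show fail counts and the recorded operation history
crm_mon -1 -f -o
# clean up only this primitive instead of the whole group
crm_resource --cleanup --resource prm_nfs_server --node rksaph11 --force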

The CIB status for the resource looks like this:

[XML snippet stripped by the list archive]



Regards,
Ulrich

> 
>> Actually the manual says when cleaning up a single primitive, the
> whole group is cleaned up, unless using ‑‑force. Well, I don't like
>> this default, as I expect any status change from probe would
>> propagate to the group anyway...
> 
> In 1.1, clean‑up always wipes the history of the affected resources,
> regardless of whether the history is for success or failure. That means
> all the cleaned resources will be reprobed. In 2.0, clean‑up by default
> wipes the history only if there's a failed action (‑‑refresh/‑R is
> required to get the 1.1 behavior). That lessens the impact of the
> "default to whole group" behavior.
> 
> I think the original idea was that a group indicates that the resources
> are closely related, so changing the status of one member might affect
> what status the others report.
> 
>> Regards,
>> Ulrich
> ‑‑ 
> Ken Gaillot 
> 
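
Putting that into commands as I understand it (2.0 syntax per Ken's
description; "rsc" is just a placeholder resource name):

# 2.0 default: wipe history only where an action actually failed
crm_resource --cleanup --resource rsc
# 2.0: wipe all history, i.e. the old 1.1 clean-up behaviour
crm_resource --refresh --resource rsc
# either one plus --force to restrict it to a single primitive inside a group
crm_resource --cleanup --resource rsc --force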




[ClusterLabs] Antw: Re: Master/slave failover does not work as expected

2019-08-13 Thread Ulrich Windl
>>> Harvey Shepherd  schrieb am 12.08.2019 um 
>>> 23:38
in Nachricht :
> I've been experiencing exactly the same issue. Pacemaker prioritises 
> restarting the failed resource over maintaining a master instance. In my case 
> I used crm_simulate to analyse the actions planned and taken by pacemaker 
> during resource recovery. It showed that the system did plan to failover the 
> master instance, but it was near the bottom of the action list. Higher 
> priority was given to restarting the failed instance, consequently when that 
> had occurred, it was easier just to promote the same instance rather than 
> failing over.

That's interesting: maybe it's usually actually faster to restart a failed
(master) process than to promote a slave to master, possibly demote the
old master to slave, etc.

But most obviously, while there is (possible) resource utilization for
resources, there is none for operations (AFAIK): if one could configure
"operation costs" (maybe as rules), the cluster could prefer the transition
with the least cost. Unfortunately that would make things more complicated.

I could even imagine that if you set the cost for "stop" to infinity, the cluster
would not even try to stop the resource, but would fence the node instead...

> 
> This particular behaviour caused me a lot of headaches. In the end I had to 
> use a workaround by setting max failures for the resource to 1, and clearing 
> the failure after 10 seconds. This forces it to failover, but there is then a 
> window (longer than 10 seconds due to the cluster check timer which is used 
> to clear failures) where the resource can't fail back if there happened to be 
> a second failure. It also means that there is no slave running during this 
> time, which causes a performance hit in my case.
> 
[...]




[ClusterLabs] Antw: Re: Restoring network connection breaks cluster services

2019-08-13 Thread Ulrich Windl
>>> Jan Pokorný  schrieb am 12.08.2019 um 22:30 in
Nachricht
<20190812203037.gm25...@redhat.com>:

[...]
> Is it OK for lower level components to do autonomous decisions
> without at least informing the higher level wrt. what exactly is
> going on, as we could observe here?
[...]

Excuse me for throwing in another comparison with old HP-UX ServiceGuard: As
far as I understood it, the HP-UX kernel had a hardware-based watchdog that the
process corresponding to crmd enabled during start and periodically "fed" to
avoid a "TOC" (Transfer Of Control, resulting in a kernel panic, crash dump and
reboot).

That was all: when crmd died or failed to feed the watchdog, the node
rebooted and the other node took care of the resources (if possible). Fencing
was basically network based with a disk as tie-breaker: if there was a network
outage, both nodes (in the 2-node cluster case) tried to control the cluster,
racing for a SCSI lock on the "lock disk" (requiring a multi-initiator SCSI
setup for shared disks). The winner wrote its node name to the disk's slot so
that the other node(s) could read it and tell who had won the race. The losers
then committed suicide (an exit of the main cluster process would have been
enough to trigger the watchdog, but they did an explicit TOC).

Compared with that, the whole pacemaker/corosync/fencing machinery seems
unnecessarily complex to me; at least if you have some shared storage.

The other nice thing about ServiceGuard was the network traffic: the heartbeat
interval was configurable (like every 7 seconds), and when there was nothing
"interesting" happening in the cluster there was no traffic other than the
heartbeat (missing a configurable number of heartbeats declared a split brain,
and only then did the machinery really start). I think pacemaker is creating
way too much network traffic.

So I think sbd should not decide by itself whether to reboot a node or not.
Maybe even sbd should not use the watchdog, but the crmd should...

Regards,
Ulrich


[ClusterLabs] Antw: why is node fenced ?

2019-08-13 Thread Ulrich Windl
>>> "Lentes, Bernd"  schrieb am 12.08.2019
um
18:09 in Nachricht
<546330844.1686419.1565626164456.javamail.zim...@helmholtz-muenchen.de>:
> Hi,
> 
> Last Friday (9th of August) I had to install patches on my two-node cluster.
> I put one of the nodes (ha-idg-2) into standby (crm node standby ha-idg-2),
> patched it, rebooted,
> started the cluster (systemctl start pacemaker) again, put the node online
> again, everything fine.

Personally I feel more save with updates when the whole cluster node is
offline, not standby. When you are going to boot anyway, it won't make much of
a difference. Also you don't have to remember to put the node back online in
the configuration.
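
In commands (the same ones Bernd lists below) the difference is roughly:

# standby: the node stays a cluster member, resources just move away, and you
# must remember "crm node online ha-idg-2" afterwards
crm node standby ha-idg-2
# fully offline: stop the cluster stack on the node before patching ...
systemctl stop pacemaker.service
# ... and after the reboot simply start it again
systemctl start pacemaker.service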

For your case: after rebooting the first node, the second one was DC. If you
reboot that, the first node becomes DC, but when booting node 2 it still has
the old config saying it is the DC. So both nodes have to agree on that. Maybe
that's why the cluster is "unclean" for a while. Did it "go away" after some
time?

Regards,
Ulrich

> 
> Then I wanted to do the same procedure with the other node (ha-idg-1).
> I put it in standby, patched it, rebooted, started pacemaker again.
> But then ha-idg-1 fenced ha-idg-2, it said the node is unclean.
> I know that nodes which are unclean need to be shut down, that's logical.
> 
> But I don't know where the conclusion that the node is unclean comes from,
> or why it is unclean;
> I searched the logs and didn't find any hint.
> 
> I put the syslog and the pacemaker log on a Seafile share; I'd be very 
> thankful if you'd have a look.
> https://hmgubox.helmholtz-muenchen.de/d/53a10960932445fb9cfe/ 
> 
> Here the cli history of the commands:
> 
> 17:03:04  crm node standby ha-idg-2
> 17:07:15  zypper up (install Updates on ha-idg-2)
> 17:17:30  systemctl reboot
> 17:25:21  systemctl start pacemaker.service
> 17:25:47  crm node online ha-idg-2
> 17:26:35  crm node standby ha-idg-1
> 17:30:21  zypper up (install Updates on ha-idg-1)
> 17:37:32  systemctl reboot
> 17:43:04  systemctl start pacemaker.service
> 17:44:00  ha-idg-1 is fenced
> 
> Thanks.
> 
> Bernd
> 
> OS is SLES 12 SP4, pacemaker 1.1.19, corosync 2.3.6-9.13.1
> 
> 
> -- 
> 
> Bernd Lentes 
> Systemadministration 
> Institut für Entwicklungsgenetik 
> Gebäude 35.34 - Raum 208 
> HelmholtzZentrum münchen 
> bernd.len...@helmholtz-muenchen.de 
> phone: +49 89 3187 1241 
> phone: +49 89 3187 3827 
> fax: +49 89 3187 2294 
> http://www.helmholtz-muenchen.de/idg 
> 
> Perfect is he who makes no mistakes 
> so the dead are perfect
>  
> 
> Helmholtz Zentrum Muenchen
> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
> Ingolstaedter Landstr. 1
> 85764 Neuherberg
> www.helmholtz-muenchen.de 
> Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
> Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich 
> Bassler, Kerstin Guenther
> Registergericht: Amtsgericht Muenchen HRB 6466
> USt-IdNr: DE 129521671
> 


