Re: [ClusterLabs] SBD as watchdog daemon

2019-04-16 Thread Олег Самойлов
Well, I checked this PR
https://github.com/ClusterLabs/sbd/pull/27
from the author's repository
https://github.com/jjd27/sbd/tree/cluster-quorum

The problem still exists. When corosync is frozen on one node, both nodes are
rebooted. Don’t apply this PR.

> On 16 Apr 2019, at 19:13, Klaus Wenninger wrote:


Re: [ClusterLabs] SBD as watchdog daemon

2019-04-16 Thread Klaus Wenninger
On 4/16/19 5:27 PM, Олег Самойлов wrote:
>
>> On 16 Apr 2019, at 16:21, Klaus Wenninger wrote:
>>
>> On 4/16/19 3:12 PM, Олег Самойлов wrote:
>>> Okay, it looks like I found where it must be fixed.
>>>
>>> sbd-cluster.c
>>>
>>>     /* TODO - Make a CPG call and only call notify_parent() when we get a reply */
>>>     notify_parent();
>>>
>>> Can anyone explain to me how to make the mentioned CPG call?
>> There should be a PR already that does exactly that.
> Not only.
>
>> It just has to be rebased.
> Not true. This PR is in conflict with the master branch.

Which is what I wanted to express with 'has to be rebased' ;-)

>
>> But be aware that this isn't gonna solve your halted-pacemaker-daemons
>> issue.
> Also not true. I tried to merge this PR and resolved several conflicts
> intuitively. Now the watchdog fires when corosync is frozen (half of my
> problems are solved).
Exactly - which is why I was directing your attention to the
pacemaker-daemons.
>  But… It fires on both nodes. :) Maybe this is due to my lack of knowledge
> of the corosync infrastructure.
>
> This PR is from 2017; why haven't you fixed and merged such a very important
> PR yet?
Because there were other things to do that were even more important ;-)
And as you've just discovered yourself, things are not always that easy ...
Even if the issue with the non-blocked node restarting is solved, there
are still delicate issues with startup/shutdown, installation/deinstallation,
gradually growing a cluster from a single node over two nodes to
several nodes, ... to be considered.

Klaus

Re: [ClusterLabs] SBD as watchdog daemon

2019-04-16 Thread Олег Самойлов


> On 16 Apr 2019, at 16:21, Klaus Wenninger wrote:
> 
> On 4/16/19 3:12 PM, Олег Самойлов wrote:
>> Okay, it looks like I found where it must be fixed.
>> 
>> sbd-cluster.c
>> 
>>     /* TODO - Make a CPG call and only call notify_parent() when we get a reply */
>>     notify_parent();
>> 
>> Can anyone explain to me how to make the mentioned CPG call?
> There should be a PR already that does exactly that.

Not only.

> It just has to be rebased.

Not true. This PR is in conflict with the master branch.

> But be aware that this isn't gonna solve your halted-pacemaker-daemons
> issue.

Also not true. I tried to merge this PR and resolved several conflicts
intuitively. Now the watchdog fires when corosync is frozen (half of my problems
are solved). But… It fires on both nodes. :) Maybe this is due to my lack of
knowledge of the corosync infrastructure.

This PR is from 2017; why haven't you fixed and merged such a very important PR
yet?


Re: [ClusterLabs] SBD as watchdog daemon

2019-04-16 Thread Klaus Wenninger
On 4/16/19 3:12 PM, Олег Самойлов wrote:
> Okay, it looks like I found where it must be fixed.
>
> sbd-cluster.c
>
>     /* TODO - Make a CPG call and only call notify_parent() when we get a reply */
>     notify_parent();
>
> Can anyone explain to me how to make the mentioned CPG call?
There should be a PR already that does exactly that.
It just has to be rebased.
But be aware that this isn't gonna solve your halted-pacemaker-daemons
issue.

Klaus


-- 
Klaus Wenninger

Senior Software Engineer, EMEA ENG Base Operating Systems

Red Hat

kwenn...@redhat.com   

Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn, 
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael O'Neill, Tom Savage, Eric Shander
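
As an aside for readers wondering what the CPG round-trip asked about above
could look like: below is a minimal sketch (in the spirit of, but not taken
from, the mentioned PR) that multicasts a small ping into a CPG group and only
reports success once corosync delivers our own message back to us, so a frozen
corosync makes the heartbeat stop. The group name "sbd-health", the helper
name cpg_health_check() and the blocking dispatch loop are assumptions for
illustration only; real sbd would integrate this with its mainloop and only
then call notify_parent().

    #include <string.h>
    #include <sys/uio.h>
    #include <corosync/corotypes.h>
    #include <corosync/cpg.h>

    static int ping_answered;

    /* Invoked from cpg_dispatch() when our own ping is delivered back. */
    static void ping_deliver(cpg_handle_t handle, const struct cpg_name *group,
                             uint32_t nodeid, uint32_t pid,
                             void *msg, size_t msg_len)
    {
        ping_answered = 1;
    }

    static cpg_callbacks_t callbacks = {
        .cpg_deliver_fn = ping_deliver,
    };

    /* Returns 0 if corosync delivered our ping back, -1 otherwise; the
     * caller would only call notify_parent() on success. */
    int cpg_health_check(void)
    {
        cpg_handle_t handle;
        struct cpg_name group;
        char payload[] = "sbd-ping";
        struct iovec iov = { .iov_base = payload, .iov_len = sizeof(payload) };

        strcpy(group.value, "sbd-health");   /* hypothetical group name */
        group.length = strlen(group.value);
        ping_answered = 0;

        if (cpg_initialize(&handle, &callbacks) != CS_OK)
            return -1;

        if (cpg_join(handle, &group) != CS_OK ||
            cpg_mcast_joined(handle, CPG_TYPE_AGREED, &iov, 1) != CS_OK) {
            cpg_finalize(handle);
            return -1;
        }

        /* If corosync's totem/CPG path is frozen, the self-delivery never
         * happens; a real daemon would poll with a deadline in its mainloop
         * instead of blocking here. */
        while (!ping_answered) {
            if (cpg_dispatch(handle, CS_DISPATCH_ONE) != CS_OK) {
                cpg_finalize(handle);
                return -1;
            }
        }

        cpg_finalize(handle);
        return 0;
    }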


Re: [ClusterLabs] SBD as watchdog daemon

2019-04-15 Thread Олег Самойлов


> On 14 Apr 2019, at 10:12, Andrei Borzenkov wrote:

Thanks for the explanation; I think this would be a good addition to the SBD
manual (the SBD manual needs it). But my problem lies elsewhere.

I investigated SBD. A common watchdog daemon is much simpler: one infinite loop
that runs some checks and writes to the watchdog device. Any mistake, freeze or
segfault, and the watchdog will fire. But SBD has a different design. First of
all, there is not one infinite loop. There are three different processes: one
«inquisitor» and two «servants», one for corosync and one for pacemaker. And
there is complex logic inside SBD for them to check each other.

But the problem is not even there. Both servants send a health heartbeat to the
inquisitor every second. But… they send the health heartbeat not as the result
of actually checking corosync or pacemaker, as one would expect, but from the
internal buffer variable «servant_health». And if corosync or pacemaker is
frozen (which can be emulated by `kill -s STOP`), this variable never changes
and the servants always keep reporting good health to the inquisitor. And this
is a bug. I am looking for a way to fix it.
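
For contrast with the SBD design just described, a minimal sketch of that
«common watchdog» pattern might look like the following, assuming the Linux
/dev/watchdog interface; checks_pass() is a placeholder for whatever tests a
given deployment would run, not an sbd interface. If the process freezes,
segfaults, or the checks fail, the device stops being fed and the hardware
watchdog (or softdog) reboots the node.

    #include <fcntl.h>
    #include <unistd.h>

    /* Placeholder for the actual health checks a deployment would run. */
    static int checks_pass(void)
    {
        return 1;
    }

    int main(void)
    {
        int fd = open("/dev/watchdog", O_WRONLY);
        if (fd < 0)
            return 1;

        for (;;) {
            if (checks_pass())
                (void)write(fd, "\0", 1);  /* feed the dog; stop and it reboots */
            sleep(1);
        }
    }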

Re: [ClusterLabs] SBD as watchdog daemon

2019-04-14 Thread Andrei Borzenkov
On 12.04.2019 15:30, Олег Самойлов wrote:
> 
>> On 11 Apr 2019, at 20:00, Klaus Wenninger wrote:
>> 
>> On 4/11/19 5:27 PM, Олег Самойлов wrote:
>>> Hi all. I am developing HA PostgreSQL cluster for 2 or 3
>>> datacenters. In case of DataCenter failure (blackout) the fencing
>>> will not work and will prevent to switching to working DC. So I
>>> disable the fencing. The cluster working is based on a quorum and
>>> I added a quorum device on a third DC in case of 2 DC. But I need
>>> somehow solve
>> Why would you disable fencing? SBD with watchdog-fencing (no
>> shared disk) is made for exactly that use-case but you need fencing
>> to be enabled and stonith-watchdog-timeout to be set to roughly 2x
>> the watchdog-timeout.
> 
> Interesting. There is a lot in the documentation about using sbd
> with 1, 2 or 3 block devices, but about using it without block devices
> there is nothing, except a sentence that this is possible. :)
> 

Yes, stonith-watchdog-timeout does not really ring a bell and does not
make it obvious that it is related to SBD in any way.

The way it seems to work is:

If stonithd^Wpacemaker-fenced receives a request to kill a node, no
suitable stonith device for this node was found, SBD is active and
stonith-watchdog-timeout is non-zero, then fenced will a) forward the
request to the victim and b) wait for the specified timeout and expect
the node to self-fence. If the victim is alive, it will initiate a
reboot either via the local SBD or via SysRq if it could not contact
SBD. If the victim is not reachable, it is expected that SBD will
commit suicide.

You will see something like this in the logs:

Apr 14 09:13:45 ha1 pacemaker-fenced[1808] (call_remote_stonith)
notice: Waiting 10s for ha2 to self-fence (reboot) for
pacemaker-controld.1812.5a81fe48 ((nil))

Unfortunately:

1. It is far from obvious. You only see something during an actual
stonith attempt. You see

Apr 14 09:13:45 ha1 pacemaker-schedulerd[1811] (unpack_config)  notice:
Watchdog will be used via SBD if fencing is required

sprinkled over the log file, but it is misleading - it will *NOT* use SBD
fencing unless stonith-watchdog-timeout is actually set to non-zero. And
you see the following log entry exactly once during normal startup:

Apr 14 09:10:15 ha1 pacemaker-controld  [1812] (check_sbd_timeout)
debug: Using calculated value 1 for stonith-watchdog-timeout (-1)
Apr 14 09:10:15 ha1 pacemaker-controld  [1812] (check_sbd_timeout)
info: Watchdog configured with stonith-watchdog-timeout -1 and SBD
timeout 5000ms

And it may be too far in the past and already rotated away.

2. No high-level command showing the current pacemaker run-time state
tells you whether this mechanism is active. OK, if you know what you are
looking for you may use a direct CIB query to check the values of
have-watchdog and stonith-watchdog-timeout. But where, pray, is
have-watchdog documented?

Apparently the intention was to create a special internal stonith device
named "watchdog" which would be visible at least in stonith_admin -L
output (not sure whether crm_mon would show it). But as currently
implemented, fenced would (attempt to) create this device once, very
early during initial startup, *if* stonith-watchdog-timeout is not zero -
but it queries the value of stonith-watchdog-timeout (or at least
receives the reply) far later, which means this special device is never
created. This leads to a funny duplication of code which contains
identical handling of both the no-device and the special "watchdog"
device cases. If anything, this is confusing to anyone looking at the code.

Apr 14 09:09:54 ha1 pacemaker-fenced[1808] (main)   info: Starting
pacemaker-fenced mainloop
^^^ this line is already past the attempt to create the special
watchdog device

Apr 14 09:10:15 ha1 pacemaker-controld  [1812] (check_sbd_timeout)
info: Watchdog configured with stonith-watchdog-timeout -1 and SBD
timeout 5000ms

And of course stonith-watchdog-timeout can be changed to 0 at run time,
so this device would then have to be removed even if it had been created
successfully in the first place. So it really needs to be created in the
CIB update handler. That leaves a window between changing this value and
the creation of the special device, so the duplication of code is
probably inevitable.

If anyone thinks it is a bug, I will open a bug report.

3. If a node could not be contacted for whatever reason and has not
received the self-fence request, it will still be assumed to be fenced
and pacemaker will start relocating resources. I do not know whether
pacemaker cross-checks this against the actual node state (i.e. the node
is expected to be lost at the point the watchdog timeout expires).




Re: [ClusterLabs] SBD as watchdog daemon

2019-04-12 Thread Олег Самойлов

> On 11 Apr 2019, at 20:00, Klaus Wenninger wrote:
> 
> On 4/11/19 5:27 PM, Олег Самойлов wrote:
>> Hi all.
>> I am developing an HA PostgreSQL cluster for 2 or 3 datacenters. In case of
>> a datacenter failure (blackout) fencing will not work and would prevent
>> switching to the working DC. So I disabled fencing. The cluster operation is
>> based on a quorum and I added a quorum device in a third DC for the 2-DC
>> case. But I somehow need to solve
> Why would you disable fencing? SBD with watchdog-fencing (no shared
> disk) is made for exactly that use-case but you need fencing to
> be enabled and stonith-watchdog-timeout to be set to roughly 2x the
> watchdog-timeout.

Interesting. There are a lot in documentation about using the sbd with 1,2,3 
block devices, but about using without block devices is nothing, except a 
sentence that this is possible. :)

> Regarding a node restart being triggered, that shouldn't make much
> difference, but if you disable fencing you won't get the remaining
> cluster to wait for the missing node to be reset and proceed afterwards
> (regardless of whether the lost node shows up again or not).

Yep, in my case this will be good for floating IPs.

>> the cases when corosync or pacemaker freezes. For this I use a hw watchdog
>> or a softdog and SBD as the watchdog daemon (without shared devices). Well,
>> with this, if I kill corosync or pacemakerd, all is fine, the node is
>> restarted. And if I freeze sbd with `killall -s STOP sbd`, all is fine, it
>> reboots. But if I freeze corosync or pacemakerd with `killall -s STOP` or
>> with `ifdown eth0` (corosync is frozen in this case), nothing happens. The
>> question is: «Is this fixed in the master branch or in 1.4.0?» (I use the
>> CentOS rpms: sbd v1.3.1), or where do I need to look (in which file and
>> function) to fix this?
> Referring to the above I'm not sure how you did configure sbd.

Just:
pcs stonith sbd enable
pcs property set stonith-enabled=false
Now I have changed it to:
pcs stonith sbd enable
pcs property set stonith-enabled=true
pcs property set stonith-watchdog-timeout=12

> 
> ifdown of the corosync-interface definitely gives me a reboot on a
> cluster with corosync-3 and current sbd from master.

I tested with corosync 2.4.3 (default for CentOS 7). Or may be in your case 
reboot was happened by the fencing, But no matter, if a watchdog will work as 
expected.

> But iirc there was an improvement regarding this in corosync.
> Freezing corosync or pacemakerd on the other hand doesn't trigger anything.
> For doing a regular ping to corosync via cpg there is an outstanding PR
> that should help here - unfortunately needs to be rebased to current sbd
> (don't find it atm - strange)
> 
> Regarding pacemakerd, that should be a little bit more complicated as
> pacemakerd is just the main control daemon.
> So if you freeze that, it shouldn't be harmful at first, but of
> course as pacemakerd is doing the observation of the rest of the
> pacemaker-daemons it should be somehow watchdog-observed. iirc there
> were some tests by hideo using corosync-watchdog-device-integration. But
> these attempts unfortunately went dormant as well. You should find some
> discussion in the mailinglist-archives about it. Unfortunately having
> corosync open a watchdog-device makes it fight with sbd for that
> resource. But a generic solution isn't that simple as not every setup is
> using sbd.

Well, I see: freezing of the pacemaker daemons is not monitored by the watchdog
daemon (sbd). It’s strange, I see two «watcher» daemons from sbd, one for
corosync and another for pacemakerd, so they must do something useful. :) I want
the behaviour to be at least the same as with normal fencing: with fencing, if
corosync or pacemaker freezes, the failed node is fenced.

Re: [ClusterLabs] SBD as watchdog daemon

2019-04-11 Thread Klaus Wenninger
On 4/11/19 5:27 PM, Олег Самойлов wrote:
> Hi all.
> I am developing an HA PostgreSQL cluster for 2 or 3 datacenters. In case of
> a datacenter failure (blackout) fencing will not work and would prevent
> switching to the working DC. So I disabled fencing. The cluster operation is
> based on a quorum and I added a quorum device in a third DC for the 2-DC
> case. But I somehow need to solve
Why would you disable fencing? SBD with watchdog-fencing (no shared
disk) is made for exactly that use-case but you need fencing to
be enabled and stonith-watchdog-timeout to be set to roughly 2x the
watchdog-timeout.
Regarding a node restart being triggered, that shouldn't make much
difference, but if you disable fencing you won't get the remaining
cluster to wait for the missing node to be reset and proceed afterwards
(regardless of whether the lost node shows up again or not).
> the cases when corosync or pacemaker freezes. For this I use a hw watchdog
> or a softdog and SBD as the watchdog daemon (without shared devices). Well,
> with this, if I kill corosync or pacemakerd, all is fine, the node is
> restarted. And if I freeze sbd with `killall -s STOP sbd`, all is fine, it
> reboots. But if I freeze corosync or pacemakerd with `killall -s STOP` or
> with `ifdown eth0` (corosync is frozen in this case), nothing happens. The
> question is: «Is this fixed in the master branch or in 1.4.0?» (I use the
> CentOS rpms: sbd v1.3.1), or where do I need to look (in which file and
> function) to fix this?
Referring to the above I'm not sure how you did configure sbd.

I'm not aware of any fixes directly targeting issues like that since v1.3.1.
There are 2 post v1.4.0 fixes that might be helpful in some cases though.
(make handling of cib-connection loss more robust & finalize cmap
connection if disconnected from cluster)

ifdown of the corosync-interface definitely gives me a reboot on a
cluster with corosync-3 and current sbd from master.
But iirc there was an improvement regarding this in corosync.
Freezing corosync or pacemakerd on the other hand doesn't trigger anything.
For doing a regular ping to corosync via cpg there is an outstanding PR
that should help here - unfortunately needs to be rebased to current sbd
(don't find it atm - strange)

Regarding pacemakerd, that should be a little bit more complicated as
pacemakerd is just the main control daemon.
So if you freeze that, it shouldn't be harmful at first, but of
course as pacemakerd is doing the observation of the rest of the
pacemaker-daemons it should be somehow watchdog-observed. iirc there
were some tests by hideo using corosync-watchdog-device-integration. But
these attempts unfortunately went dormant as well. You should find some
discussion in the mailinglist-archives about it. Unfortunately having
corosync open a watchdog-device makes it fight with sbd for that
resource. But a generic solution isn't that simple as not every setup is
using sbd.

Klaus


[ClusterLabs] SBD as watchdog daemon

2019-04-11 Thread Олег Самойлов
Hi all.
I am developing an HA PostgreSQL cluster for 2 or 3 datacenters. In case of a
datacenter failure (blackout) fencing will not work and would prevent switching
to the working DC. So I disabled fencing. The cluster operation is based on a
quorum and I added a quorum device in a third DC for the 2-DC case. But I
somehow need to solve the cases when corosync or pacemaker freezes. For this I
use a hw watchdog or a softdog and SBD as the watchdog daemon (without shared
devices). Well, with this, if I kill corosync or pacemakerd, all is fine, the
node is restarted. And if I freeze sbd with `killall -s STOP sbd`, all is fine,
it reboots. But if I freeze corosync or pacemakerd with `killall -s STOP` or
with `ifdown eth0` (corosync is frozen in this case), nothing happens. The
question is: «Is this fixed in the master branch or in 1.4.0?» (I use the CentOS
rpms: sbd v1.3.1), or where do I need to look (in which file and function) to
fix this?
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/