> On 14 Apr 2019, at 10:12, Andrei Borzenkov <arvidj...@gmail.com> wrote:

Thanks for the explanation, I think it would be a good addition to the SBD 
manual. (The SBD manual needs it.) But my problem lies on a different plane.

I investigated SBD. A common watchdog daemon is much simpler: one infinite loop 
runs some checks and writes to the watchdog device. On any mistake, freeze or 
segfault the writes stop and the watchdog fires. But SBD has a different design. 
First of all, there is not one infinite loop. There are three different 
processes: one is the «inquisitor» and the other two are «servants», for 
corosync and pacemaker. And there is complex logic inside SBD for them to check 
each other. But even that is not the problem. Both servants send a health 
heartbeat to the inquisitor every second. But they send it not as the result of 
actually checking corosync or pacemaker, as one would expect, but from an 
internal buffer variable, «servant_health». And if corosync or pacemaker is 
frozen (this can be emulated with `kill -s STOP`), this variable never changes 
and the servants keep sending a good health status to the inquisitor forever. 
This is a bug. I am looking for a way to fix it.
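
To illustrate the pattern, here is a minimal sketch in Python. It is not SBD's 
actual C code: only the name servant_health comes from SBD, while the classes, 
the max_age parameter and the simulate() helper are hypothetical. It compares a 
servant that reports a cached value with one possible fix, where the cached 
value expires if the monitored daemon has not confirmed it recently.

```python
from dataclasses import dataclass

HEALTHY = "healthy"
UNHEALTHY = "unhealthy"


@dataclass
class CachedServant:
    """Reports the cached value unconditionally (the behaviour described above)."""
    servant_health: str = UNHEALTHY

    def on_daemon_reply(self, now: float) -> None:
        # Called only when the monitored daemon actually answers.
        self.servant_health = HEALTHY

    def heartbeat(self, now: float) -> str:
        # Sent every second, whether or not the daemon answered recently.
        return self.servant_health


@dataclass
class StalenessAwareServant:
    """Hypothetical fix: the cached value expires after max_age seconds."""
    max_age: float = 5.0
    servant_health: str = UNHEALTHY
    last_update: float = float("-inf")

    def on_daemon_reply(self, now: float) -> None:
        self.servant_health = HEALTHY
        self.last_update = now

    def heartbeat(self, now: float) -> str:
        # A value the daemon has not confirmed recently is not trusted.
        if now - self.last_update > self.max_age:
            return UNHEALTHY
        return self.servant_health


def simulate(servant, freeze_at: int, seconds: int) -> list:
    """One heartbeat per second; the daemon stops answering at freeze_at
    (a stand-in for `kill -s STOP`)."""
    reports = []
    for t in range(seconds):
        if t < freeze_at:
            servant.on_daemon_reply(float(t))
        reports.append(servant.heartbeat(float(t)))
    return reports


if __name__ == "__main__":
    print(simulate(CachedServant(), freeze_at=3, seconds=12))
    print(simulate(StalenessAwareServant(max_age=5.0), freeze_at=3, seconds=12))
```

In this sketch the first servant keeps reporting good health after the freeze, 
while the second one starts reporting bad health once the last confirmation is 
older than max_age. Whether a timestamp check like this fits SBD's real servant 
code is an open question.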
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
