On 22/04/17 04:39 AM, Andrei Borzenkov wrote:
> 22.04.2017 11:31, Klaus Wenninger wrote:
>>>>>
>>>> I wonder how SBD fits into this discussion. It is marketed as stonith
>>>> agent, but it is based on committing suicide so relies on well-behaving
>>>> nodes. Which we by definition cannot trust to behave well, otherwise
>>>> we'd not need stonith in the first place.
>>> The logic, when using a watchdog timer, is that if the node is alive
>>> enough to kick the watchdog, it's alive enough to not do something dumb
>>> to the cluster. If it's not able to kick the timer, the watchdog timer
>>> will reset the machine. This works *if* all resources hang when messages
>>> stop coming back from the peer (a side effect of corosync's virtual
>>> synchrony).
>>
>> In fact watchdog-implementations (meaning the software that
>> kicks the hardware-watchdog) are a little bit smarter - and
>> so is SBD.
>> By having the watchdog-kicking and observation-code in a
>> simple loop that is executed periodically you don't need the
>> 'if it is alive enough to do the kicking it will behave well'
>> paradigm.
>> This burns down to making the critical part of the code very
>> small and on top hard to control failures that result in any
>> kind of hanging don't bother us.
>>
>>>
>>> So as I understand it, for SBD to be safe, it requires a hardware
>>> watchdog timer and a properly configured cluster.
>>
>> Yes, yes and yes ... as important as fencing I would say ;-)
>>
>
> So I gather that for SBD to be reasonably safe, it needs real hardware
> watchdog. I often see SBD recommended as stonith agent inside a VM,
> where we do not have "hardware watchdog" by definition. I still wonder
> whether it can be trusted in this case.

I suppose it depends. The fact that it requires some measure of
predictable behaviour is concerning for me. That said, I have the same
reservation with IPMI itself. So to me, "proper" fencing requires a
backup, totally external, option like a pair of switched PDUs.

Of course, I'm more paranoid than most. Having SBD properly configured
is *massively* safer than no fencing at all. So for people where other
fence methods are not available for whatever reason, SBD is the way to go.

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
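
For reference, the "simple loop" idea from the quoted text can be
sketched roughly as below. This is only an illustration, not SBD's actual
code: SimulatedWatchdog, run_loop and the integer tick clock are all
hypothetical names made up for the example, and a real setup would use a
hardware watchdog device rather than a Python object. The point is that
observation and kicking live in the same small loop, so a failed check
(or a hang anywhere in the loop) simply means the kick never happens and
the timer expires.

```python
from typing import Callable, Iterable, Optional


class SimulatedWatchdog:
    """Hypothetical stand-in for a hardware watchdog timer."""

    def __init__(self, timeout: int) -> None:
        self.timeout = timeout  # ticks allowed between kicks
        self.last_kick = 0      # tick of the most recent kick

    def kick(self, now: int) -> None:
        self.last_kick = now

    def expired(self, now: int) -> bool:
        return now - self.last_kick >= self.timeout


def run_loop(
    watchdog: SimulatedWatchdog,
    checks: Iterable[Callable[[int], bool]],
    ticks: int,
) -> Optional[int]:
    """Run the observe-then-kick loop on a logical clock.

    Returns the tick at which the watchdog would have reset the node,
    or None if it was kicked in time for the whole run.
    """
    checks = list(checks)
    for now in range(1, ticks + 1):
        if watchdog.expired(now):
            return now  # real hardware would reset the machine here
        # Kick only when every observation passes. A failing (or hanging)
        # check means no kick, so the timer runs out by itself.
        if all(check(now) for check in checks):
            watchdog.kick(now)
    return None


# Healthy node: every check passes, so the watchdog never fires.
print(run_loop(SimulatedWatchdog(timeout=3), [lambda now: True], ticks=20))

# Checks start failing at tick 5: kicks stop and the timer then expires.
print(run_loop(SimulatedWatchdog(timeout=3), [lambda now: now < 5], ticks=20))
```

Note that the sketch says nothing about how trustworthy the timer itself
is, which is exactly the open question for a VM without a real hardware
watchdog.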