On 2021-03-05 12:26 p.m., Klaus Wenninger wrote:
> On 3/5/21 6:04 PM, Digimer wrote:
>> On 2021-03-05 2:14 a.m., Ulrich Windl wrote:
>>>>> How would the fencing be confirmed? I don't know.
>>>> It's part of the FenceAgentAPI. The cluster invokes the fence agent,
>>>> passes in variable=value pairs on STDIN, and waits for the agent to
>>>> exit. It reads the agent's exit code and uses that to determine success
>>>> or failure.
>>> But the agent "acting remote" cannot be sure the "remote end" was killed,
>>> specifically when the network connection seems dead.
>>> I see that in the IPMI case you have a separate connection allowing
>>> "out-of-band signaling", but in the general case that would not be possible.
>> To elaborate on Klaus's reply;
>>
>> The cluster has no control over how the fence agent works, it can only
>> dictate the API and expect the fence agent is implemented in a sane way.
>> If your agent returns success, but the node wasn't confirmed off
>> properly in the agent, you will get a split-brain and that will be no
>> fault of the cluster itself.
>>
>> Speaking to the "remote end" part;
>>
>> All good fence agents need to work regardless of the state of the target
>> node. If, somehow, a fence agent needs the target to be in some sort of
>> defined state, it is a critically flawed fence agent. A classic example
>> of this is the often-requested "ssh fence agent" (and it's why such an
>> agent doesn't exist).
>>
>> So your fence agent must be able to work out of band, by definition and
>> design. When you call an IPMI BMC, you are effectively talking to a
>> different mini computer on the target. Even then, if the mainboard
>> utterly dies and takes the BMC with it, it will fail to fence as well.
>> This is why at Alteeve we always have a backup fence method, switched
>> PDUs on different switches from the IPMI BMC connections.
>>
>> Fencing really is critical, and as such, it should be certain to work,
>> and ideally, have a backup fence method. So if you find that your
>> fence-azure agent isn't reliable, and you can use SBD as Klaus
>> mentioned, you can configure fence-sbd as a backup method to fence-azure.
>>
> Nothing to add - to the point as usual - but that the statement from
> Ulrich looked general - not necessarily azure specific - and thus my
> comment was as well.
> Just wanted to state that I didn't advertise SBD as fencing method
> for azure. SBD needs a reliable watchdog and afaik softdog is the only
> watchdog you have on azure (maybe different for certain BareMetal
> offerings).
> If you consider that reliable enough you have to negotiate with your own
> conscience or the provider of your distribution ;-)
>
> Klaus
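
As an aside on the FenceAgentAPI exchange quoted above, here is a rough
Python sketch of the caller's side of that contract: write variable=value
pairs to the agent's STDIN, wait for it to exit, and treat exit code 0 as
success. It also tries a backup method when the first one fails. This is
only an illustration, not how Pacemaker implements it. The function names
(run_fence_agent, fence_with_fallback), the stand-in agents, and the option
names used here are all assumptions made for the example.

```python
import subprocess
import sys

# Stand-in "agents" for the sketch (hypothetical, not real fence agents).
# Each reads its options from STDIN and reports the result via exit code.
FAILING_AGENT = [sys.executable, "-c",
                 "import sys; sys.stdin.read(); sys.exit(1)"]
WORKING_AGENT = [sys.executable, "-c",
                 "import sys; sys.exit(0 if 'action=off' in sys.stdin.read() else 1)"]


def run_fence_agent(argv, options):
    """Run one agent: options go in as key=value lines on STDIN.

    Returns True only if the agent exits with code 0. The caller cannot
    see how the agent confirmed the node is off; it trusts the exit code.
    """
    stdin_text = "".join(f"{key}={value}\n" for key, value in options.items())
    result = subprocess.run(argv, input=stdin_text, text=True,
                            capture_output=True)
    return result.returncode == 0


def fence_with_fallback(methods, target):
    """Try each (name, argv, options) in order; stop at the first success.

    Returns the name of the method that succeeded, or None if all failed.
    """
    for name, argv, options in methods:
        # "action" and "plug" are illustrative option names for this sketch.
        request = dict(options, action="off", plug=target)
        if run_fence_agent(argv, request):
            return name
    return None


if __name__ == "__main__":
    methods = [
        ("primary", FAILING_AGENT, {}),  # e.g. the BMC is unreachable
        ("backup", WORKING_AGENT, {}),   # e.g. a switched PDU
    ]
    print(fence_with_fallback(methods, "node2"))
```

If every method fails, the function returns None, and the only safe
reaction for a caller is to keep treating the target as unfenced.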

I know nothing of Azure. If you don't have a hardware watchdog, fence-sbd is
not reliable.

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould