On 6/16/21 3:03 PM, Andrei Borzenkov wrote:
We thought that access to storage was restored, but one step was
missing so devices appeared empty.
At this point I tried to restart pacemaker. But as soon as I
stopped pacemaker, SBD rebooted the nodes, which is logical, as quorum
was now lost.
How do I cleanly stop pacemaker in this case and keep the nodes up?
Unconfigure sbd devices I guess.
Do you have *practical* suggestions on how to do it online in a
running pacemaker cluster? Can you explain how it is going to help
given that the lack of an sbd device was not the problem in the first place?
I would rephrase this issue as "how to gracefully shut down sbd and deregister
it from pacemaker for the whole cluster". There seems to be no way to do that
except `systemctl stop corosync`.
With that in mind, to avoid the sbd suicide, I think the somewhat tricky steps
below might help. I am not sure they fit your situation as a whole, though.
crm cluster run "systemctl stop pacemaker"
crm cluster run "systemctl stop corosync"
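
If it helps, the two steps above could be wrapped in a small script with a
dry-run switch, so the order can be checked before anything is actually
stopped. This is only a sketch: `run_step`, `stop_cluster_stack` and `DRY_RUN`
are names made up for illustration and are not part of crmsh, and I have not
confirmed that this sequence avoids the sbd suicide in your situation.

```shell
#!/bin/sh
# Sketch only: run_step, stop_cluster_stack and DRY_RUN are hypothetical
# names, not part of crmsh. DRY_RUN defaults to 1, so nothing is stopped
# unless the script is started with DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop pacemaker on all nodes first, then corosync.
# Give up if the first step fails, so corosync is left untouched.
stop_cluster_stack() {
    run_step crm cluster run "systemctl stop pacemaker" || return 1
    run_step crm cluster run "systemctl stop corosync"
}

stop_cluster_stack
```

Run as is, it only prints the two commands. Setting DRY_RUN=0 executes them
through `crm cluster run`, which is the same as typing the two lines above by
hand.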
BR,
Roger