Hi! Wanted to pick up where we dropped the issue before Christmas.
Observations from a startup-timeout issue made us aware that the way we start SBD together with the rest of the cluster stack doesn't handle SBD startup issues properly.

SBD is integrated into the cluster-stack startup using dependencies and ordering via the SBD unit file:

    [Unit]
    Description=Shared-storage based fencing daemon
    Before=pacemaker.service
    After=systemd-modules-load.service iscsi.service
    PartOf=corosync.service
    RefuseManualStop=true
    RefuseManualStart=true

    [Install]
    RequiredBy=corosync.service

This is quite handy in the sense that if SBD is not present, nobody has to care about anything regarding SBD, as would be the case if e.g. pacemaker directly required SBD (which it actually does, or should do in some way, on clusters using SBD).

What works as desired with this setup is seamless startup/shutdown of the SBD service together with corosync once SBD is enabled. What does not work is that systemd doesn't wait for, or care whether, SBD has come up properly before it starts pacemaker.

Just as a side note: this is only a startup issue, because a successful startup of SBD means that the watchdog (and you are strongly advised to have a proper one with SBD) is engaged. Thus any further problem SBD might face should reliably lead to a reboot.

As a result of the discussion, and as there was no other solution in sight that wouldn't have dependencies across several packages, upstream SBD merged the following pull request:

https://github.com/ClusterLabs/sbd/pull/39

Discussion here on the list and in this pull request showed that this approach of simply adding

    [Unit]
    Before=dlm.service

    [Install]
    RequiredBy=pacemaker.service
    RequiredBy=dlm.service

to the SBD unit file solves the issue for now without having any impact on install/maintenance procedures. But it might not be the optimal longer-term solution. Thus I'm writing this here as a platform for further discussion.
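To make the [Install] part a bit more concrete: RequiredBy= lines only take effect when the unit is enabled, at which point systemd records each one as a symlink in a <target>.requires/ directory (normally under /etc/systemd/system). Below is a rough Python sketch, not anything shipped with SBD, that reads the RequiredBy= entries from a unit file and reports which of those links are missing. The unit name sbd.service, the helper names and the SBD_INSTALL text (the [Install] lines from the two excerpts combined) are my assumptions for illustration.

```python
from pathlib import Path

# Assumed [Install] section after the change: the original line plus the two added ones.
SBD_INSTALL = """\
[Install]
RequiredBy=corosync.service
RequiredBy=pacemaker.service
RequiredBy=dlm.service
"""


def required_by(unit_text):
    """Collect the unit names listed in RequiredBy= lines of the [Install] section."""
    section = None
    targets = []
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, sep, value = line.partition("=")
        if section == "Install" and sep and key.strip() == "RequiredBy":
            # One line may carry several space-separated unit names.
            targets.extend(value.split())
    return targets


def missing_links(unit_name, unit_text, systemd_dir="/etc/systemd/system"):
    """Return RequiredBy= targets lacking a <target>.requires/<unit_name> symlink."""
    root = Path(systemd_dir)
    return [
        target
        for target in required_by(unit_text)
        if not (root / f"{target}.requires" / unit_name).is_symlink()
    ]


if __name__ == "__main__":
    # "sbd.service" is an assumed unit name; adjust to the installed one.
    for target in missing_links("sbd.service", SBD_INSTALL):
        print(f"no link for sbd.service in {target}.requires/")
```

On a node where SBD was enabled before the unit file gained the extra RequiredBy= lines, I would expect this to list pacemaker.service and dlm.service until the service is re-enabled (e.g. with systemctl reenable), but I have not verified that on every distribution.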
One possible caveat has shown up since the discussion before Christmas: if you have SBD enabled and upgrade from a version without the fix from this pull request to a version with it, you have to manually re-enable the service for the RequiredBy links to be updated.

Regards,
Klaus