On Fri, Jun 3, 2022 at 10:19 AM Zoran Bošnjak <zoran.bosn...@via.si> wrote:
>
> Hi all,
> I would appreciate some advice about sbd fencing (without shared storage).
>
> I am using Ubuntu 20.04, with default packages from the repository 
> (pacemaker, corosync, fence-agents, ipmitool, pcs...).
>
> A HW watchdog is present on the servers. The first problem was to load/unload 
> the watchdog module. For some reason the module is blacklisted on Ubuntu, so 
> I've created a service for this purpose.
>
> --- file: /etc/systemd/system/watchdog.service
> [Unit]
> Description=Load watchdog timer module
> After=syslog.target
>
> [Service]
> Type=oneshot
> RemainAfterExit=yes
> ExecStart=/sbin/modprobe ipmi_watchdog
> ExecStop=/sbin/rmmod ipmi_watchdog
>
> [Install]
> WantedBy=multi-user.target
> ---
>
> Is this a proper way to load the watchdog module under Ubuntu?
>
> Anyway, once the module is loaded, the /dev/watchdog (which is required by 
> 'sbd') is present.
> Next, the 'sbd' is installed by
>
> sudo apt install sbd
> (followed by one reboot to get the sbd active)
>
> The configuration of the 'sbd' is default. The sbd reacts to network failure 
> as expected (reboots the server). However, when the 'sbd' is active, the 
> server won't reboot normally any more. For example from the command line 
> "sudo reboot", it gets stuck at the end of the reboot sequence. There is a 
> message on the console:
>
> ... reboot progress
> [ OK ] Finished Reboot.
> [ OK ] Reached target Reboot.
> [ ... ] IPMI Watchdog: Unexpected close, not stopping watchdog!
> [ ... ] IPMI Watchdog: Unexpected close, not stopping watchdog!
> ... it gets stuck at this point
>
> After some long timeout, it looks like the watchdog timer expires and server 
> boots, but the failure indication remains on the front panel of the server. 
> If I uninstall the 'sbd' package, the "sudo reboot" works normally again.
>
> My question is: How do I configure the system to keep the 'sbd' function 
> present, but still be able to reboot the system normally?

Loading modules, depending on distribution and version, should probably rather
be done by editing /etc/modules or putting some files under /etc/modprobe.d/.
I guess in your case stopping the unit won't work, as the watchdog device is
still held open by sbd. In general I don't see why the watchdog module should
be unloaded upon shutdown. So as a first try you might just remove that part,
i.e. the ExecStop line of your unit.
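
As a rough, untested sketch of that first try, assuming you keep the service
approach for now and that sbd is the only thing holding /dev/watchdog open,
the unit from your mail with only the rmmod step dropped would look like this:

--- file: /etc/systemd/system/watchdog.service
[Unit]
Description=Load watchdog timer module
After=syslog.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/modprobe ipmi_watchdog
# No ExecStop here: the module stays loaded through shutdown, so nothing
# tries to rmmod it while sbd still has the watchdog device open.

[Install]
WantedBy=multi-user.target
---

After editing the file, a "sudo systemctl daemon-reload" is needed so systemd
picks up the change. Whether this alone is enough to get a clean reboot on
your hardware is something you would have to test.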

Klaus

>
> regards,
> Zoran
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
>
