Public bug reported:
This is related to and sounds very similar to
https://bugs.launchpad.net/ubuntu/+source/resource-agents/+bug/2065848,
but differs enough to warrant another bug.
# lsb_release -rd
Description: Ubuntu 24.04.4 LTS
Release: 24.04
Package version:
# apt-cache policy resource-agents-extra
resource-agents-extra:
  Installed: 1:4.13.0-1ubuntu4
  Candidate: 1:4.13.0-1ubuntu4
  Version table:
 *** 1:4.13.0-1ubuntu4 500
        500 http://archive.ubuntu.com/ubuntu noble/universe amd64 Packages
        100 /var/lib/dpkg/status
Expected behavior:
Stopping the NFS server succeeds
Actual behavior:
The NFS server is stopped, but the nfsserver resource's stop operation
reports a failure: the /var/lib/nfs filesystem fails to unmount because
fsidd still holds files open on it and is never stopped before the
unmount is attempted.
Analysis:
At some point, the "fsidd" service was added to the nfs-kernel-server
package. The systemd unit supplied in this package starts fsidd before
nfs-server:
[Unit]
Description=NFS FSID Daemon
After=local-fs.target
Before=nfs-mountd.service nfs-server.service
[Service]
ExecStart=/usr/sbin/fsidd
[Install]
RequiredBy=nfs-mountd.service nfs-server.service
The fsidd process holds open a file in the nfs info dir:
# lsof -p 57130 | grep /var/lib/nfs
fsidd 57130 root 3u REG 147,0 16384 16777352 /var/lib/nfs/reexpdb.sqlite3
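A minimal sketch of finding such holders without knowing the PID in
advance, assuming a Linux /proc filesystem and enough privileges (root)
to read other processes' file descriptors. The function name
holders_under is hypothetical and not part of any package; it only walks
/proc/<pid>/fd and prints every process that has a file open under the
given directory.

```shell
# holders_under DIR
# Print "PID COMMAND PATH" for each open file located under DIR.
# Hypothetical helper; needs root to inspect processes of other users.
holders_under() {
    dir=${1%/}
    for fd in /proc/[0-9]*/fd/*; do
        # Each fd entry is a symlink to the open file; skip unreadable ones.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            "$dir"/*)
                pid=${fd#/proc/}
                pid=${pid%%/*}
                comm=$(cat "/proc/$pid/comm" 2>/dev/null) || comm='?'
                printf '%s %s %s\n' "$pid" "$comm" "$target"
                ;;
        esac
    done
}

# On the affected system this would be expected to list fsidd:
holders_under /var/lib/nfs
```

Running this right after nfs-server has been stopped shows whether
anything still pins the filesystem before an unmount is attempted.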
A combination of factors keeps this service from exiting, which prevents
the resource-agent script from reporting success when stopping the NFS
server.
If the NFS info dir is a separate mount, as it is when using
nfs_shared_infodir, this process needs to exit before the
nfs_shared_infodir is unmounted.
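The ordering constraint can be sketched as a manual stop sequence. This
is an illustration only, not what the nfsserver resource agent actually
does; the function name stop_nfs_and_unmount and the RUN variable are
hypothetical. Setting RUN=echo prints the commands instead of running
them.

```shell
# RUN is a hypothetical hook: leave it empty to run the commands,
# or set RUN=echo to only print them (dry run).
RUN=${RUN:-}

# stop_nfs_and_unmount [INFODIR]
# Stop nfs-server, then fsidd, and only then unmount the info dir.
# Each step runs only if the previous one succeeded.
stop_nfs_and_unmount() {
    infodir=${1:-/var/lib/nfs}
    $RUN systemctl stop nfs-server.service &&
    $RUN systemctl stop fsidd.service &&
    $RUN umount "$infodir"
}

# Example (dry run):
#   RUN=echo stop_nfs_and_unmount /var/lib/nfs
```

The point of the sketch is the middle step: without an explicit stop of
fsidd.service, the final umount is the step that fails.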
When stopping nfs-server, the resource-agent script uses
/etc/init.d/nfs-kernel-server if it is present and systemctl commands
otherwise, as mentioned in
https://bugs.launchpad.net/ubuntu/+source/resource-agents/+bug/2065848/comments/10.
Leaving /etc/init.d/nfs-kernel-server in place causes additional
problems, as mentioned in
https://bugs.launchpad.net/ubuntu/+source/resource-agents/+bug/2065848,
so I have removed it on my test system.
Regardless of how nfs-server is stopped, fsidd is not stopped. This
causes /var/lib/nfs to fail to unmount, and the error to bubble up.
For my use case, an NFS server managed by corosync/pacemaker, this
means failover to other hosts does not work, because the existing NFS
server cannot be stopped.
This issue can be worked around with a systemd unit drop-in that
declares fsidd PartOf nfs-server:
# mkdir -p /etc/systemd/system/fsidd.service.d
# cat > /etc/systemd/system/fsidd.service.d/nfs-server-dependency.conf << 'EOF'
[Unit]
PartOf=nfs-server.service
EOF
# systemctl daemon-reload
With this in place, fsidd is automatically stopped when nfs-server is
stopped, which means /var/lib/nfs can then be unmounted successfully.
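A small sketch for checking that the drop-in has taken effect, assuming
a systemd new enough to support "systemctl show --value". The function
name fsidd_is_part_of_nfs_server and the SYSTEMCTL override variable are
hypothetical; the override exists only so the check can be exercised
without a running systemd.

```shell
# SYSTEMCTL is a hypothetical override, defaulting to the real systemctl.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# Succeeds if fsidd.service lists nfs-server.service in its PartOf= property.
fsidd_is_part_of_nfs_server() {
    # PartOf= may name several units, separated by spaces.
    for unit in $($SYSTEMCTL show -p PartOf --value fsidd.service); do
        [ "$unit" = nfs-server.service ] && return 0
    done
    return 1
}

# Example:
#   fsidd_is_part_of_nfs_server && echo "drop-in active" || echo "drop-in missing"
```

If the check fails after creating the drop-in, a missing
"systemctl daemon-reload" is the first thing to rule out.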
I suspect nfs-kernel-server was updated to add this fsidd service, and
resource-agents was not updated to match. Or perhaps leaving fsidd
running even after nfs-server exits is sloppy. It's not my call.
** Affects: resource-agents (Ubuntu)
Importance: Undecided
Status: New
--
https://bugs.launchpad.net/bugs/2145790
Title:
ocf:heartbeat:nfsserver resource's stop operation fails due to
/var/lib/nfs filesystem failing to unmount