Public bug reported:

This is related to and sounds very similar to
https://bugs.launchpad.net/ubuntu/+source/resource-agents/+bug/2065848,
but differs enough to warrant another bug.

# lsb_release -rd
Description: Ubuntu 24.04.4 LTS
Release: 24.04

Package version:

# apt-cache policy resource-agents-extra
resource-agents-extra:
  Installed: 1:4.13.0-1ubuntu4
  Candidate: 1:4.13.0-1ubuntu4
  Version table:
 *** 1:4.13.0-1ubuntu4 500
        500 http://archive.ubuntu.com/ubuntu noble/universe amd64 Packages
        100 /var/lib/dpkg/status

Expected behavior:

Stopping the NFS server succeeds.

Actual behavior:

The NFS server is stopped, but the nfsserver resource's stop operation
reports a failure: the /var/lib/nfs filesystem fails to unmount because
fsidd holds files open there and is never stopped before the unmount is
attempted.

Analysis:

At some point, the "fsidd" service was added to the nfs-kernel-server
package. The systemd unit supplied in this package starts fsidd before
nfs-server:

    [Unit]
    Description=NFS FSID Daemon
    After=local-fs.target
    Before=nfs-mountd.service nfs-server.service
    
    [Service]
    ExecStart=/usr/sbin/fsidd
    
    [Install]
    RequiredBy=nfs-mountd.service nfs-server.service

The fsidd process holds open a file in the nfs info dir:

    # lsof -p 57130 | grep /var/lib/nfs
    fsidd   57130 root    3u      REG              147,0    16384 16777352 /var/lib/nfs/reexpdb.sqlite3

A combination of factors keeps this service from exiting, which prevents
the resource-agent script from reporting success when stopping the NFS
server.

If the nfs info dir is mounted from somewhere - when using
nfs_shared_infodir - this process needs to exit before the
nfs_shared_infodir is unmounted.

When stopping nfs-server, the resource-agent script uses either
/etc/init.d/nfs-kernel-server, if present, or systemctl commands, as
mentioned in
https://bugs.launchpad.net/ubuntu/+source/resource-agents/+bug/2065848/comments/10.
Leaving /etc/init.d/nfs-kernel-server in place causes additional
problems, as mentioned in
https://bugs.launchpad.net/ubuntu/+source/resource-agents/+bug/2065848,
so I have removed it on my test system.

Regardless of how nfs-server is stopped, fsidd is not stopped. This
causes /var/lib/nfs to fail to unmount, and the error to bubble up.
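
To confirm which processes still hold files under the info dir after
nfs-server has been stopped, something like the following may help. It
is only a sketch: holders_of is a hypothetical helper name, it assumes a
Linux /proc filesystem, it needs to run as root to see other users'
processes, and it only looks at open file descriptors (not working
directories or memory maps), which is enough for the fsidd case shown
by lsof above.

```shell
#!/bin/sh
# holders_of DIR: print "PID COMMAND" for every process that has a file
# open somewhere under DIR, by reading the fd symlinks in /proc.
# (Hypothetical helper, not part of resource-agents or nfs-utils.)
holders_of() {
    dir=${1%/}
    for fd in /proc/[0-9]*/fd/*; do
        # The process or fd may vanish while we scan; skip those.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            "$dir"/*)
                pid=${fd#/proc/}
                pid=${pid%%/*}
                printf '%s %s\n' "$pid" "$(cat "/proc/$pid/comm" 2>/dev/null)"
                ;;
        esac
    done | sort -u
}

# Usage on an affected system, after stopping nfs-server:
#   holders_of /var/lib/nfs
```

If fsidd appears in the output after nfs-server has stopped, the unmount
of /var/lib/nfs can be expected to fail as described.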

For my use case, an NFS server managed by corosync/pacemaker, this
means failover to other hosts does not work because the existing NFS
server cannot be stopped.

This issue can be worked around with a systemd drop-in that declares
fsidd PartOf nfs-server:

# mkdir -p /etc/systemd/system/fsidd.service.d
# cat > /etc/systemd/system/fsidd.service.d/nfs-server-dependency.conf << 'EOF'
[Unit]
PartOf=nfs-server.service
EOF

# systemctl daemon-reload

With this in place, fsidd is automatically stopped when nfs-server is
stopped, which means /var/lib/nfs can then be unmounted successfully.
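
One way to confirm that the drop-in took effect is to look at the PartOf
property that systemd reports for fsidd. A minimal sketch, assuming the
usual "PartOf=unit1 unit2" output format of systemctl show;
partof_includes is a hypothetical helper name, not something shipped by
systemd or resource-agents:

```shell
#!/bin/sh
# partof_includes UNIT: read `systemctl show -p PartOf ...` output on
# stdin and succeed if UNIT is listed in the PartOf= property.
# (Hypothetical helper for illustration.)
partof_includes() {
    unit=$1
    while IFS= read -r line; do
        # Only consider the PartOf= line; its value is space-separated.
        [ "${line%%=*}" = PartOf ] || continue
        case " ${line#PartOf=} " in
            *" $unit "*) return 0 ;;
        esac
    done
    return 1
}

# Usage on a live system, after `systemctl daemon-reload`:
#   systemctl show -p PartOf fsidd.service | partof_includes nfs-server.service \
#       && echo "fsidd will be stopped together with nfs-server"
```

Running systemctl cat fsidd.service should also list the drop-in file
below the packaged unit if it was picked up.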

I suspect nfs-kernel-server was updated to add the fsidd service and
resource-agents was not updated to match. Or perhaps leaving fsidd
running even when nfs-server exits is sloppy. It's not my call.

** Affects: resource-agents (Ubuntu)
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/2145790

Title:
  ocf:heartbeat:nfsserver resource's stop operation fails due to
  /var/lib/nfs filesystem failing to unmount
