Re: [PATCH 00/24] vfs: require filesystems to explicitly opt-in to lease support

Jeff Layton Tue, 13 Jan 2026 07:00:33 -0800

On Tue, 2026-01-13 at 09:31 -0500, Chuck Lever wrote:
> On 1/13/26 9:27 AM, Jeff Layton wrote:
> > On Tue, 2026-01-13 at 09:03 -0500, Chuck Lever wrote:
> > > On 1/13/26 6:45 AM, Jeff Layton wrote:
> > > > On Tue, 2026-01-13 at 09:54 +0100, Christian Brauner wrote:
> > > > > On Mon, Jan 12, 2026 at 09:50:20AM -0500, Jeff Layton wrote:
> > > > > > On Mon, 2026-01-12 at 09:31 -0500, Chuck Lever wrote:
> > > > > > > On 1/12/26 8:34 AM, Jeff Layton wrote:
> > > > > > > > On Fri, 2026-01-09 at 19:52 +0100, Amir Goldstein wrote:
> > > > > > > > > On Thu, Jan 8, 2026 at 7:57 PM Jeff Layton 
> > > > > > > > > <[email protected]> wrote:
> > > > > > > > > > 
> > > > > > > > > > On Thu, 2026-01-08 at 18:40 +0100, Jan Kara wrote:
> > > > > > > > > > > On Thu 08-01-26 12:12:55, Jeff Layton wrote:
> > > > > > > > > > > > Yesterday, I sent patches to fix how directory 
> > > > > > > > > > > > delegation support is
> > > > > > > > > > > > > handled on filesystems where it should be disabled
> > > > > > > > > > > > [1]. That set is
> > > > > > > > > > > > appropriate for v6.19. For v7.0, I want to make lease 
> > > > > > > > > > > > support be more
> > > > > > > > > > > > opt-in, rather than opt-out:
> > > > > > > > > > > > 
> > > > > > > > > > > > For historical reasons, when ->setlease() 
> > > > > > > > > > > > file_operation is set to NULL,
> > > > > > > > > > > > the default is to use the kernel-internal lease 
> > > > > > > > > > > > implementation. This
> > > > > > > > > > > > means that if you want to disable them, you need to 
> > > > > > > > > > > > explicitly set the
> > > > > > > > > > > > ->setlease() file_operation to simple_nosetlease() or 
> > > > > > > > > > > > the equivalent.
> > > > > > > > > > > > 
> > > > > > > > > > > > This has caused a number of problems over the years as 
> > > > > > > > > > > > some filesystems
> > > > > > > > > > > > > have inadvertently allowed leases to be acquired simply
> > > > > > > > > > > > by having left
> > > > > > > > > > > > it set to NULL. It would be better if filesystems had 
> > > > > > > > > > > > to opt-in to lease
> > > > > > > > > > > > support, particularly with the advent of directory 
> > > > > > > > > > > > delegations.
> > > > > > > > > > > > 
> > > > > > > > > > > > > This series sets the ->setlease() operation in a
> > > > > > > > > > > > pile of existing
> > > > > > > > > > > > local filesystems to generic_setlease() and then changes
> > > > > > > > > > > > kernel_setlease() to return -EINVAL when the setlease() 
> > > > > > > > > > > > operation is not
> > > > > > > > > > > > set.
> > > > > > > > > > > > 
> > > > > > > > > > > > With this change, new filesystems will need to 
> > > > > > > > > > > > explicitly set the
> > > > > > > > > > > > ->setlease() operations in order to provide lease and 
> > > > > > > > > > > > delegation
> > > > > > > > > > > > support.
> > > > > > > > > > > > 
> > > > > > > > > > > > I mainly focused on filesystems that are NFS 
> > > > > > > > > > > > exportable, since NFS and
> > > > > > > > > > > > SMB are the main users of file leases, and they tend to 
> > > > > > > > > > > > end up exporting
> > > > > > > > > > > > the same filesystem types. Let me know if I've missed 
> > > > > > > > > > > > any.
> > > > > > > > > > > 
> > > > > > > > > > > So, what about kernfs and fuse? They seem to be 
> > > > > > > > > > > exportable and don't have
> > > > > > > > > > > .setlease set...
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > Yes, FUSE needs this too. I'll add a patch for that.
> > > > > > > > > > 
> > > > > > > > > > As far as kernfs goes: AIUI, that's basically what sysfs 
> > > > > > > > > > and resctrl
> > > > > > > > > > are built on. Do we really expect people to set leases 
> > > > > > > > > > there?
> > > > > > > > > > 
> > > > > > > > > > I guess it's technically a regression since you could set 
> > > > > > > > > > them on those
> > > > > > > > > > sorts of files earlier, but people don't usually export 
> > > > > > > > > > kernfs based
> > > > > > > > > > filesystems via NFS or SMB, and that seems like something 
> > > > > > > > > > that could be
> > > > > > > > > > used to make mischief.
> > > > > > > > > > 
> > > > > > > > > > AFAICT, kernfs_export_ops is mostly to support 
> > > > > > > > > > open_by_handle_at(). See
> > > > > > > > > > commit aa8188253474 ("kernfs: add exportfs operations").
> > > > > > > > > > 
> > > > > > > > > > One idea: we could add a wrapper around generic_setlease() 
> > > > > > > > > > for
> > > > > > > > > > filesystems like this that will do a WARN_ONCE() and then 
> > > > > > > > > > call
> > > > > > > > > > generic_setlease(). That would keep leases working on them 
> > > > > > > > > > but we might
> > > > > > > > > > get some reports that would tell us who's setting leases on 
> > > > > > > > > > these files
> > > > > > > > > > and why.
> > > > > > > > > 
> > > > > > > > > IMO, you are being too cautious, but whatever.
> > > > > > > > > 
> > > > > > > > > It is not accurate that kernfs filesystems are NFS exportable 
> > > > > > > > > in general.
> > > > > > > > > Only cgroupfs has KERNFS_ROOT_SUPPORT_EXPORTOP.
> > > > > > > > > 
> > > > > > > > > If any application is using leases on cgroup files, it must 
> > > > > > > > > be some
> > > > > > > > > very advanced runtime (i.e. systemd), so we should know about 
> > > > > > > > > the
> > > > > > > > > regression sooner rather than later.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > I think so too. For now, I think I'll not bother with the 
> > > > > > > > WARN_ONCE().
> > > > > > > > Let's just leave kernfs out of the set until someone presents a 
> > > > > > > > real
> > > > > > > > use-case.
> > > > > > > > 
> > > > > > > > > There are also the recently added nsfs and pidfs 
> > > > > > > > > export_operations.
> > > > > > > > > 
> > > > > > > > > I have a recollection about wanting to be explicit about not 
> > > > > > > > > allowing
> > > > > > > > > those to be exportable to NFS (nsfs specifically), but I 
> > > > > > > > > can't see where
> > > > > > > > > and if that restriction was done.
> > > > > > > > > 
> > > > > > > > > Christian? Do you remember?
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > (cc'ing Chuck)
> > > > > > > > 
> > > > > > > > FWIW, you can currently export and mount /sys/fs/cgroup via 
> > > > > > > > NFS. The
> > > > > > > > directory doesn't show up when you try to get to it via NFSv4, 
> > > > > > > > but you
> > > > > > > > can mount it using v3 and READDIR works. The files are all 
> > > > > > > > empty when
> > > > > > > > you try to read them. I didn't try to do any writes.
> > > > > > > > 
> > > > > > > > Should we add a mechanism to prevent exporting these sorts of
> > > > > > > > filesystems?
> > > > > > > > 
> > > > > > > > Even better would be to make nfsd exporting explicitly opt-in. 
> > > > > > > > What if
> > > > > > > > we were to add an EXPORT_OP_NFSD flag that explicitly allows
> > > > > > > > filesystems
> > > > > > > > to opt-in to NFS exporting, and check for that in 
> > > > > > > > __fh_verify()? We'd
> > > > > > > > have to add it to a bunch of existing filesystems, but that's 
> > > > > > > > fairly
> > > > > > > > simple to do with an LLM.
> > > > > > > 
> > > > > > > What's the active harm in exporting /sys/fs/cgroup ? It has to be 
> > > > > > > done
> > > > > > > explicitly via /etc/exports, so this is under the NFS server 
> > > > > > > admin's
> > > > > > > control. Is it an attack surface?
> > > > > > > 
> > > > > > 
> > > > > > Potentially?
> > > > > > 
> > > > > > I don't see any active harm with exporting cgroupfs. It doesn't work
> > > > > > right via nfsd, but it's not crashing the box or anything.
> > > > > > 
> > > > > > At one time, export operations were only defined by filesystems that wanted to
> > > > > > allow NFS export. Now we've grown them on filesystems that just 
> > > > > > want to
> > > > > > provide filehandles for open_by_handle_at() and the like. nfsd 
> > > > > > doesn't
> > > > > > care though: if the fs has export operations, it'll happily use 
> > > > > > them.
> > > > > > 
> > > > > > Having an explicit "I want to allow nfsd" flag seems like it might
> > > > > > save us some headaches in the future when other filesystems add 
> > > > > > export
> > > > > > ops for this sort of filehandle use.
> > > > > 
> > > > > So we are re-hashing a discussion we had a few months ago (Amir was
> > > > > involved at least).
> > > > > 
> > > > 
> > > > Yep, I was lurking on it, but didn't have a lot of input at the time.
> > > > 
> > > > > I don't think we want to expose cgroupfs via NFS; that's super weird.
> > > > > It's like remote partial resource management and it would be very
> > > > > strange if a remote process suddenly would be able to move things 
> > > > > around
> > > > > in the cgroup tree. So I would prefer to not do this.
> > > > > 
> > > > > So my preference would be to really sever file handles from the export
> > > > > mechanism so that we can allow stuff like pidfs and nsfs and cgroupfs 
> > > > > to
> > > > > use file handles via name_to_handle_at() and open_by_handle_at() 
> > > > > without
> > > > > making them exportable.
> > > > 
> > > > Agreed. I think we want to make NFS export be a deliberate opt-in
> > > > decision that filesystem developers make.
> > > 
> > > No objection. What about ksmbd, AFS, or Ceph?
> > > 
> > 
> > ksmbd doesn't have anything akin to an export_operations. I think it
> > really has to rely on admins getting the share paths right when
> > exporting. This is a bit simpler there though since SMB2 doesn't deal
> > with filehandles.
> > 
> > AFS and Ceph in the kernel are clients. AFS isn't reexportable via NFS,
> > but Ceph is. We'll need to preserve that ability.
> 
> Well I think my point is that "is this file system type exportable"
> might be orthogonal to whether the FS offers a filehandle capability. If
> it doesn't make sense to export cgroupfs via NFS, it probably also does
> not make sense for ksmbd. Lather, rinse, repeat for other in-kernel file
> servers.
> 
> Perhaps the "is_exportable" predicate is better placed separately from
> export_ops.
>


That's a fair point.

An fstype flag would seem most natural then. For nfsd, I guess we'd
want to check for that in fh_compose() and fh_verify() ?
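
Very roughly, the nfsd side of that check could be modeled like the
userspace sketch below. This is only an illustration of the idea:
FS_NFSD_EXPORTABLE, struct fs_type_model and nfsd_may_export() are all
made up for the example, the error codes are placeholders, and none of
it is an existing kernel interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* HYPOTHETICAL flag, invented for this sketch; not an existing fs_flags bit. */
#define FS_NFSD_EXPORTABLE (1u << 0)

/* Userspace stand-in for the parts of a filesystem type that matter here. */
struct fs_type_model {
	const char *name;
	unsigned int fs_flags;   /* models file_system_type.fs_flags */
	bool has_export_ops;     /* models sb->s_export_op != NULL */
};

/*
 * Models the test that fh_compose()/fh_verify() could apply.
 * The return values are placeholders chosen for the sketch.
 */
int nfsd_may_export(const struct fs_type_model *fs)
{
	assert(fs != NULL);

	/* Without export operations there are no filehandles at all. */
	if (!fs->has_export_ops)
		return -EINVAL;

	/* Filehandle support alone is not enough: the fstype must opt in. */
	if (!(fs->fs_flags & FS_NFSD_EXPORTABLE))
		return -EOPNOTSUPP;

	return 0;
}
```

With something like this, a filesystem type that has export operations
but does not set the flag (the cgroupfs case discussed above) would be
refused by nfsd while still being usable with open_by_handle_at().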

I don't know ksmbd well enough to know how they would want to plumb in
a check for this though. Maybe at the point where they resolve
pathnames?
-- 
Jeff Layton <[email protected]>
