On 1/13/26 9:27 AM, Jeff Layton wrote:
> On Tue, 2026-01-13 at 09:03 -0500, Chuck Lever wrote:
>> On 1/13/26 6:45 AM, Jeff Layton wrote:
>>> On Tue, 2026-01-13 at 09:54 +0100, Christian Brauner wrote:
>>>> On Mon, Jan 12, 2026 at 09:50:20AM -0500, Jeff Layton wrote:
>>>>> On Mon, 2026-01-12 at 09:31 -0500, Chuck Lever wrote:
>>>>>> On 1/12/26 8:34 AM, Jeff Layton wrote:
>>>>>>> On Fri, 2026-01-09 at 19:52 +0100, Amir Goldstein wrote:
>>>>>>>> On Thu, Jan 8, 2026 at 7:57 PM Jeff Layton wrote:
>>>>>>>>>
>>>>>>>>> On Thu, 2026-01-08 at 18:40 +0100, Jan Kara wrote:
>>>>>>>>>> On Thu 08-01-26 12:12:55, Jeff Layton wrote:
>>>>>>>>>>> Yesterday, I sent patches to fix how directory delegation support is
>>>>>>>>>>> handled on filesystems where it should be disabled [1]. That set is
>>>>>>>>>>> appropriate for v6.19. For v7.0, I want to make lease support be more
>>>>>>>>>>> opt-in, rather than opt-out:
>>>>>>>>>>>
>>>>>>>>>>> For historical reasons, when ->setlease() file_operation is set to NULL,
>>>>>>>>>>> the default is to use the kernel-internal lease implementation. This
>>>>>>>>>>> means that if you want to disable them, you need to explicitly set the
>>>>>>>>>>> ->setlease() file_operation to simple_nosetlease() or the equivalent.
>>>>>>>>>>>
>>>>>>>>>>> This has caused a number of problems over the years as some filesystems
>>>>>>>>>>> have inadvertently allowed leases to be acquired simply by having left
>>>>>>>>>>> it set to NULL. It would be better if filesystems had to opt-in to lease
>>>>>>>>>>> support, particularly with the advent of directory delegations.
>>>>>>>>>>>
>>>>>>>>>>> This series sets the ->setlease() operation in a pile of existing
>>>>>>>>>>> local filesystems to generic_setlease() and then changes
>>>>>>>>>>> kernel_setlease() to return -EINVAL when the setlease() operation is not
>>>>>>>>>>> set.
>>>>>>>>>>>
>>>>>>>>>>> With this change, new filesystems will need to explicitly set the
>>>>>>>>>>> ->setlease() operations in order to provide lease and delegation
>>>>>>>>>>> support.
>>>>>>>>>>>
>>>>>>>>>>> I mainly focused on filesystems that are NFS exportable, since NFS and
>>>>>>>>>>> SMB are the main users of file leases, and they tend to end up exporting
>>>>>>>>>>> the same filesystem types. Let me know if I've missed any.
>>>>>>>>>>
>>>>>>>>>> So, what about kernfs and fuse? They seem to be exportable and don't have
>>>>>>>>>> .setlease set...
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes, FUSE needs this too. I'll add a patch for that.
>>>>>>>>>
>>>>>>>>> As far as kernfs goes: AIUI, that's basically what sysfs and resctrl
>>>>>>>>> are built on. Do we really expect people to set leases there?
>>>>>>>>>
>>>>>>>>> I guess it's technically a regression since you could set them on those
>>>>>>>>> sorts of files earlier, but people don't usually export kernfs based
>>>>>>>>> filesystems via NFS or SMB, and that seems like something that could be
>>>>>>>>> used to make mischief.
>>>>>>>>>
>>>>>>>>> AFAICT, kernfs_export_ops is mostly to support open_by_handle_at(). See
>>>>>>>>> commit aa8188253474 ("kernfs: add exportfs operations").
>>>>>>>>>
>>>>>>>>> One idea: we could add a wrapper around generic_setlease() for
>>>>>>>>> filesystems like this that will do a WARN_ONCE() and then call
>>>>>>>>> generic_setlease(). That would keep leases working on them but we might
>>>>>>>>> get some reports that would tell us who's setting leases on these files
>>>>>>>>> and why.
>>>>>>>>
>>>>>>>> IMO, you are being too cautious, but whatever.
>>>>>>>>
>>>>>>>> It is not accurate that kernfs filesystems are NFS exportable in general.
>>>>>>>> Only cgroupfs has KERNFS_ROOT_SUPPORT_EXPORTOP.
>>>>>>>>
>>>>>>>> If any application is using leases on cgroup files, it must be some
>>>>>>>> very advanced runtime (i.e. systemd), so we should know about the
>>>>>>>> regression sooner rather than later.
>>>>>>>>
>>>>>>>
>>>>>>> I think so too. For now, I think I'll not bother with the WARN_ONCE().
>>>>>>> Let's just leave kernfs out of the set until someone presents a real
>>>>>>> use-case.
>>>>>>>
>>>>>>>> There are also the recently added nsfs and pidfs export_operations.
>>>>>>>>
>>>>>>>> I have a recollection about wanting to be explicit about not allowing
>>>>>>>> those to be exportable to NFS (nsfs specifically), but I can't see where
>>>>>>>> and if that restriction was done.
>>>>>>>>
>>>>>>>> Christian? Do you remember?
>>>>>>>>
>>>>>>>
>>>>>>> (cc'ing Chuck)
>>>>>>>
>>>>>>> FWIW, you can currently export and mount /sys/fs/cgroup via NFS. The
>>>>>>> directory doesn't show up when you try to get to it via NFSv4, but you
>>>>>>> can mount it using v3 and READDIR works. The files are all empty when
>>>>>>> you try to read them. I didn't try to do any writes.
>>>>>>>
>>>>>>> Should we add a mechanism to prevent exporting these sorts of
>>>>>>> filesystems?
>>>>>>>
>>>>>>> Even better would be to make nfsd exporting explicitly opt-in. What if
>>>>>>> we were to add an EXPORT_OP_NFSD flag that explicitly allows filesystems
>>>>>>> to opt-in to NFS exporting, and check for that in __fh_verify()? We'd
>>>>>>> have to add it to a bunch of existing filesystems, but that's fairly
>>>>>>> simple to do with an LLM.
>>>>>>
>>>>>> What's the active harm in exporting /sys/fs/cgroup? It has to be done
>>>>>> explicitly via /etc/exports, so this is under the NFS server admin's
>>>>>> control. Is it an attack surface?
>>>>>>
>>>>>
>>>>> Potentially?
>>>>>
>>>>> I don't see any active harm with exporting cgroupfs. It doesn't work
>>>>> right via nfsd, but it's not crashing the box or anything.
>>>>>
>>>>> At one time, those were only defined by filesystems that wanted to
>>>>> allow NFS export. Now we've grown them on filesystems that just want to
>>>>> provide filehandles for open_by_handle_at() and the like. nfsd doesn't
>>>>> care though: if the fs has export operations, it'll happily use them.
>>>>>
>>>>> Having an explicit "I want to allow nfsd" flag seems like it might
>>>>> save us some headaches in the future when other filesystems add export
>>>>> ops for this sort of filehandle use.
>>>>
>>>> So we are re-hashing a discussion we had a few months ago (Amir was
>>>> involved at least).
>>>>
>>>
>>> Yep, I was lurking on it, but didn't have a lot of input at the time.
>>>
>>>> I don't think we want to expose cgroupfs via NFS. That's super weird.
>>>> It's like remote partial resource management and it would be very
>>>> strange if a remote process suddenly would be able to move things around
>>>> in the cgroup tree. So I would prefer to not do this.
>>>>
>>>> So my preference would be to really sever file handles from the export
>>>> mechanism so that we can allow stuff like pidfs and nsfs and cgroupfs to
>>>> use file handles via name_to_handle_at() and open_by_handle_at() without
>>>> making them exportable.
>>>
>>> Agreed. I think we want to make NFS export be a deliberate opt-in
>>> decision that filesystem developers make.
>>
>> No objection, what about ksmbd, AFS, or Ceph?
>>
>
> ksmbd doesn't have anything akin to an export_operations. I think it
> really has to rely on admins getting the share paths right when
> exporting. This is a bit simpler there though since SMB2 doesn't deal
> with filehandles.
>
> AFS and Ceph in the kernel are clients. AFS isn't reexportable via NFS,
> but Ceph is. We'll need to preserve that ability.

Well I think my point is that "is this file system type exportable"
might be orthogonal to whether the FS offers a filehandle capability.

If it doesn't make sense to export cgroupfs via NFS, it probably also
does not make sense for ksmbd. Lather, rinse, repeat for other in-kernel
file servers. Perhaps the "is_exportable" predicate is better placed
separately from export_ops.


--
Chuck Lever
