On 1/13/26 6:45 AM, Jeff Layton wrote:
> On Tue, 2026-01-13 at 09:54 +0100, Christian Brauner wrote:
>> On Mon, Jan 12, 2026 at 09:50:20AM -0500, Jeff Layton wrote:
>>> On Mon, 2026-01-12 at 09:31 -0500, Chuck Lever wrote:
>>>> On 1/12/26 8:34 AM, Jeff Layton wrote:
>>>>> On Fri, 2026-01-09 at 19:52 +0100, Amir Goldstein wrote:
>>>>>> On Thu, Jan 8, 2026 at 7:57 PM Jeff Layton wrote:
>>>>>>>
>>>>>>> On Thu, 2026-01-08 at 18:40 +0100, Jan Kara wrote:
>>>>>>>> On Thu 08-01-26 12:12:55, Jeff Layton wrote:
>>>>>>>>> Yesterday, I sent patches to fix how directory delegation support is
>>>>>>>>> handled on filesystems where it should be disabled [1]. That set is
>>>>>>>>> appropriate for v6.19. For v7.0, I want to make lease support be more
>>>>>>>>> opt-in, rather than opt-out:
>>>>>>>>>
>>>>>>>>> For historical reasons, when ->setlease() file_operation is set to
>>>>>>>>> NULL, the default is to use the kernel-internal lease implementation.
>>>>>>>>> This means that if you want to disable them, you need to explicitly
>>>>>>>>> set the ->setlease() file_operation to simple_nosetlease() or the
>>>>>>>>> equivalent.
>>>>>>>>>
>>>>>>>>> This has caused a number of problems over the years as some
>>>>>>>>> filesystems have inadvertently allowed leases to be acquired simply
>>>>>>>>> by having left it set to NULL. It would be better if filesystems had
>>>>>>>>> to opt-in to lease support, particularly with the advent of directory
>>>>>>>>> delegations.
>>>>>>>>>
>>>>>>>>> This series sets the ->setlease() operation in a pile of existing
>>>>>>>>> local filesystems to generic_setlease() and then changes
>>>>>>>>> kernel_setlease() to return -EINVAL when the setlease() operation is
>>>>>>>>> not set.
>>>>>>>>>
>>>>>>>>> With this change, new filesystems will need to explicitly set the
>>>>>>>>> ->setlease() operations in order to provide lease and delegation
>>>>>>>>> support.
>>>>>>>>>
>>>>>>>>> I mainly focused on filesystems that are NFS exportable, since NFS and
>>>>>>>>> SMB are the main users of file leases, and they tend to end up
>>>>>>>>> exporting the same filesystem types. Let me know if I've missed any.
>>>>>>>>
>>>>>>>> So, what about kernfs and fuse? They seem to be exportable and don't
>>>>>>>> have .setlease set...
>>>>>>>>
>>>>>>>
>>>>>>> Yes, FUSE needs this too. I'll add a patch for that.
>>>>>>>
>>>>>>> As far as kernfs goes: AIUI, that's basically what sysfs and resctrl
>>>>>>> are built on. Do we really expect people to set leases there?
>>>>>>>
>>>>>>> I guess it's technically a regression since you could set them on those
>>>>>>> sorts of files earlier, but people don't usually export kernfs based
>>>>>>> filesystems via NFS or SMB, and that seems like something that could be
>>>>>>> used to make mischief.
>>>>>>>
>>>>>>> AFAICT, kernfs_export_ops is mostly to support open_by_handle_at(). See
>>>>>>> commit aa8188253474 ("kernfs: add exportfs operations").
>>>>>>>
>>>>>>> One idea: we could add a wrapper around generic_setlease() for
>>>>>>> filesystems like this that will do a WARN_ONCE() and then call
>>>>>>> generic_setlease(). That would keep leases working on them but we might
>>>>>>> get some reports that would tell us who's setting leases on these files
>>>>>>> and why.
>>>>>>
>>>>>> IMO, you are being too cautious, but whatever.
>>>>>>
>>>>>> It is not accurate that kernfs filesystems are NFS exportable in general.
>>>>>> Only cgroupfs has KERNFS_ROOT_SUPPORT_EXPORTOP.
>>>>>>
>>>>>> If any application is using leases on cgroup files, it must be some
>>>>>> very advanced runtime (i.e. systemd), so we should know about the
>>>>>> regression sooner rather than later.
>>>>>>
>>>>>
>>>>> I think so too. For now, I think I'll not bother with the WARN_ONCE().
>>>>> Let's just leave kernfs out of the set until someone presents a real
>>>>> use-case.
>>>>>
>>>>>> There are also the recently added nsfs and pidfs export_operations.
>>>>>>
>>>>>> I have a recollection about wanting to be explicit about not allowing
>>>>>> those to be exportable to NFS (nsfs specifically), but I can't see where
>>>>>> and if that restriction was done.
>>>>>>
>>>>>> Christian? Do you remember?
>>>>>>
>>>>>
>>>>> (cc'ing Chuck)
>>>>>
>>>>> FWIW, you can currently export and mount /sys/fs/cgroup via NFS. The
>>>>> directory doesn't show up when you try to get to it via NFSv4, but you
>>>>> can mount it using v3 and READDIR works. The files are all empty when
>>>>> you try to read them. I didn't try to do any writes.
>>>>>
>>>>> Should we add a mechanism to prevent exporting these sorts of
>>>>> filesystems?
>>>>>
>>>>> Even better would be to make nfsd exporting explicitly opt-in. What if
>>>>> we were to add an EXPORT_OP_NFSD flag that explicitly allows filesystems
>>>>> to opt-in to NFS exporting, and check for that in __fh_verify()? We'd
>>>>> have to add it to a bunch of existing filesystems, but that's fairly
>>>>> simple to do with an LLM.
>>>>
>>>> What's the active harm in exporting /sys/fs/cgroup? It has to be done
>>>> explicitly via /etc/exports, so this is under the NFS server admin's
>>>> control. Is it an attack surface?
>>>>
>>>
>>> Potentially?
>>>
>>> I don't see any active harm with exporting cgroupfs. It doesn't work
>>> right via nfsd, but it's not crashing the box or anything.
>>>
>>> At one time, those were only defined by filesystems that wanted to
>>> allow NFS export. Now we've grown them on filesystems that just want to
>>> provide filehandles for open_by_handle_at() and the like. nfsd doesn't
>>> care though: if the fs has export operations, it'll happily use them.
>>>
>>> Having an explicit "I want to allow nfsd" flag seems like it might
>>> save us some headaches in the future when other filesystems add export
>>> ops for this sort of filehandle use.
>>
>> So we are re-hashing a discussion we had a few months ago (Amir was
>> involved at least).
>>
>
> Yep, I was lurking on it, but didn't have a lot of input at the time.
>
>> I don't think we want to expose cgroupfs via NFS that's super weird.
>> It's like remote partial resource management and it would be very
>> strange if a remote process suddenly would be able to move things around
>> in the cgroup tree. So I would prefer to not do this.
>>
>> So my preference would be to really sever file handles from the export
>> mechanism so that we can allow stuff like pidfs and nsfs and cgroupfs to
>> use file handles via name_to_handle_at() and open_by_handle_at() without
>> making them exportable.
>
> Agreed. I think we want to make NFS export be a deliberate opt-in
> decision that filesystem developers make.

No objection. What about ksmbd, AFS, or Ceph?

> How we do that is up for
> debate, of course.
>
> An export ops flag would be fairly simple to implement, but it sounds
> like you're thinking that we should split some export_operations into
> struct file_handle_operations and then add a pointer for that to
> super_block (and maybe to export_operations too)?
>
> This would be a good LSF/MM topic, but I'm hoping we can come to a
> consensus before then.
>

--
Chuck Lever
