On 2026-04-01, Mateusz Guzik <[email protected]> wrote:
> On Tue, Mar 31, 2026 at 07:19:58PM +0200, Jori Koolstra wrote:
> > @@ -5286,7 +5290,25 @@ int filename_mkdirat(int dfd, struct filename *name, umode_t mode)
> >  		lookup_flags |= LOOKUP_REVAL;
> >  		goto retry;
> >  	}
> > +
> > +	if (!error && (flags & MKDIRAT_FD_NEED_FD)) {
> > +		struct path new_path = { .mnt = path.mnt, .dentry = dentry };
> > +		error = FD_ADD(0, dentry_open(&new_path, O_DIRECTORY, current_cred()));
> > +	}
> > +	end_creating_path(&path, dentry);
> >  	return error;
>
> You can't do it like this. Should it turn out no fd can be allocated,
> the entire thing is going to error out while keeping the newly created
> directory behind. You need to allocate the fd first, then do the hard
> work, and only then fd_install and or free the fd. The FD_ADD machinery
> can probably still be used provided proper wrapping of the real new
> mkdir.
>
> It should be perfectly feasible to de facto wrap existing mkdir
> functionality by this syscall.
>
> On top of that similarly to what other people mentioned the new syscall
> will definitely want to support O_CLOEXEC and probably other flags down
> the line.
>
> Trying to handle this in open() is a no-go. openat2 is rather
> problematic.

I'm interested in what makes you say that. It would be very nice to be
able to do mkdir + RESOLVE_IN_ROOT and get an fd back all in one
syscall. :D

To be fair, build_open_how() will need some more magic to keep openat()
working, and that won't be particularly pretty. If we went with
O_CREAT|O_DIRECTORY we would need to be quite careful to make sure
O_TMPFILE continues to work for both openat() and openat2()...

> I tend to agree mkdirat_fd is not a good name for the syscall either,
> but I don't have a suggestion I'm happy with. I think least bad name
> would follow the existing stuff and be mkdirat2 or similar.
>
> The routine would have to start with validating the passed O_ flags, for
> now only allowing O_CLOEXEC and EINVAL-ing otherwise.

Please do not use O_* flags! O_CLOEXEC takes up 3 flag bits on different
architectures, which makes adding new flags a nightmare. I think this
should take AT_* flags and (like most newer syscalls) O_CLOEXEC should
be automatically set. Userspace can unset it with fcntl(F_SETFD) in the
relatively rare case where they don't want O_CLOEXEC. Alternatively, we
could just bite the bullet and make AT_NO_CLOEXEC a thing...

But yes, new syscalls *absolutely* need to take some kind of flag
argument. I'd hoped we finally learned our lesson on that one...

-- 
Aleksa Sarai
https://www.cyphar.com/
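
For reference, the fcntl(F_SETFD) route mentioned above would look
roughly like this from userspace. This is only a sketch: the helper
names (open_dir_cloexec, clear_cloexec, has_cloexec) are made up for
illustration, and a plain open() with O_CLOEXEC stands in for the fd a
new syscall would hand back, since no such syscall exists yet.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/*
 * Stand-in for "a syscall that returns a directory fd with close-on-exec
 * already set". Hypothetical helper, not part of any proposed API.
 */
static int open_dir_cloexec(const char *path)
{
	return open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
}

/*
 * Clear FD_CLOEXEC while leaving any other fd flags untouched.
 * Returns 0 on success, -1 on error (errno is set by fcntl).
 */
static int clear_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC);
}

/* Returns 1 if FD_CLOEXEC is set on fd, 0 if it is clear, -1 on error. */
static int has_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

A caller that really wants the fd to survive exec would call
clear_cloexec(fd) right after getting the fd back. Note that this leaves
a window in which the fd is still close-on-exec, which is the safe
direction for the race to go.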

