Re: [PATCH] vfs: add fchmodat2 syscall
On Thu, Sep 10, 2020 at 01:02:56PM -0400, Rich Felker wrote:
> Would you be happy with a pair of patches where the first blocks chmod
> of symlinks in chmod_common and the second adds the syscall with
> flags? I think this is a clearly understandable fix, but it does
> eliminate the ability to *fix* link access modes that have been set to
> ridiculous values (note: I don't think it really matters since the
> modes don't do anything anyway) in the past.

I'd be much happier with that, yes.
RE: [PATCH] vfs: add fchmodat2 syscall
From: Rich Felker
> Sent: 10 September 2020 15:24
...
> index 9af548fb841b..570a21f4d81e 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -610,15 +610,30 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
>  	return err;
>  }
>
> -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> +static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int flags)
>  {
>  	struct path path;
>  	int error;
>  	unsigned int lookup_flags = LOOKUP_FOLLOW;
> +
> +	if (flags & AT_SYMLINK_NOFOLLOW)
> +		lookup_flags &= ~LOOKUP_FOLLOW;
> +	if (flags & ~AT_SYMLINK_NOFOLLOW)
> +		return -EINVAL;

I think I'd swap over those two tests.
So unsupported flags are clearly errored.

>  retry:
>  	error = user_path_at(dfd, filename, lookup_flags, &path);
>  	if (!error) {
> -		error = chmod_common(&path, mode);
> +		/* Block chmod from getting to fs layer. Ideally the
> +		 * fs would either allow it or fail with EOPNOTSUPP,
> +		 * but some are buggy and return an error but change
> +		 * the mode, which is non-conforming and wrong.
> +		 * Userspace emulation of AT_SYMLINK_NOFOLLOW in
> +		 * glibc and musl blocked it too, for same reason. */
> +		if (S_ISLNK(path.dentry->d_inode->i_mode)
> +		    && (flags & AT_SYMLINK_NOFOLLOW))
> +			error = -EOPNOTSUPP;

Again swap the order of the tests.
I think it reads better as:
		if ((flags & AT_SYMLINK_NOFOLLOW)
		    && S_ISLNK(path.dentry->d_inode->i_mode))
			error = -EOPNOTSUPP;
As well as saving a few clock cycles.

> +		else
> +			error = chmod_common(&path, mode);
>  		path_put(&path);
>  		if (retry_estale(error, lookup_flags)) {
>  			lookup_flags |= LOOKUP_REVAL;
...

	David
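For readers following the review comment about test order, here is a minimal userspace model of the flag handling with the unsupported-flags check done first. It is only a sketch: resolve_lookup_flags() and MODEL_LOOKUP_FOLLOW are hypothetical names made up for illustration, not kernel symbols, and the function is not the code from the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>   /* AT_SYMLINK_NOFOLLOW */
#include <stddef.h>

/* Hypothetical stand-in for the kernel's LOOKUP_FOLLOW bit. */
#define MODEL_LOOKUP_FOLLOW 0x0001u

/*
 * Model of the flag handling at the top of do_fchmodat(), with the
 * tests in the order suggested in the review: reject unknown flags
 * before doing anything else, then derive the lookup flags.
 * Returns 0 on success or a negative errno value; *lookup_flags is
 * left untouched on error.
 */
static int resolve_lookup_flags(int flags, unsigned int *lookup_flags)
{
    assert(lookup_flags != NULL);

    if (flags & ~AT_SYMLINK_NOFOLLOW)
        return -EINVAL;

    *lookup_flags = MODEL_LOOKUP_FOLLOW;
    if (flags & AT_SYMLINK_NOFOLLOW)
        *lookup_flags &= ~MODEL_LOOKUP_FOLLOW;
    return 0;
}
```

Both orders reject the same inputs; checking for unsupported bits first just makes it obvious that nothing is computed from a flags word that is about to be refused.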
Re: [PATCH] vfs: add fchmodat2 syscall
On Thu, Sep 10, 2020 at 04:18:28PM +0100, Al Viro wrote:
> On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:
>
> > It was determined (see glibc issue #14578 and commit a492b1e5ef) that,
> > on some filesystems, performing chmod on the link itself produces a
> > change in the inode's access mode, but returns an EOPNOTSUPP error.
>
> Which filesystem types are those?

It's been a long time and I don't know if the details were recorded.
It was reported for xfs but I believe we later found it happening for
others. See:

https://sourceware.org/bugzilla/show_bug.cgi?id=14578#c17
https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00467.html

and especially:

https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00518.html

where Christoph seems to have endorsed the approach in my patch. I'm
fine with doing it differently if you'd prefer, though.

Rich
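Since the affected filesystem types were not recorded, one way to check a given filesystem is a small probe that attempts chmod on a symlink itself and compares the link's mode before and after. The following is a rough sketch under stated assumptions: probe_symlink_chmod() is a hypothetical helper written for this note, it reaches the link through an O_PATH descriptor and /proc/self/fd (so procfs must be mounted), and a failing chmod with ENOENT is treated as "procfs unavailable". What it reports depends entirely on the running kernel and filesystem.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Attempt chmod on the symlink `linkpath` itself (not its target).
 * Returns 1 if chmod reported an error but the link's permission
 * bits changed anyway (the behaviour described in the report),
 * 0 if that did not happen, and -1 if the probe could not be run
 * (errno is set; EINVAL means `linkpath` is not a symlink).
 */
static int probe_symlink_chmod(const char *linkpath, mode_t new_mode)
{
    struct stat before, after;
    char procpath[64];
    int fd, ret, saved;

    assert(linkpath != NULL);

    /* O_PATH | O_NOFOLLOW pins the link itself rather than its target. */
    fd = open(linkpath, O_PATH | O_NOFOLLOW | O_CLOEXEC);
    if (fd < 0)
        return -1;

    if (fstat(fd, &before) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    if (!S_ISLNK(before.st_mode)) {
        close(fd);
        errno = EINVAL;
        return -1;
    }

    /* chmod follows the magic link and lands on the pinned symlink inode. */
    snprintf(procpath, sizeof procpath, "/proc/self/fd/%d", fd);
    ret = chmod(procpath, new_mode);
    saved = errno;

    if (ret < 0 && saved == ENOENT) {
        /* Assumption of this sketch: ENOENT means procfs is not usable. */
        close(fd);
        errno = saved;
        return -1;
    }
    if (fstat(fd, &after) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    close(fd);
    errno = saved;

    return ret < 0 && ((before.st_mode ^ after.st_mode) & 07777) != 0;
}
```

A result of 1 for a link on some mount would mean that filesystem shows the "error returned, mode changed" behaviour on the kernel being tested; a result of 0 does not distinguish between "chmod succeeded" and "chmod was refused cleanly".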
Re: [PATCH] vfs: add fchmodat2 syscall
On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:
> POSIX defines fchmodat as having a 4th argument, flags, that can be
> AT_SYMLINK_NOFOLLOW. Support for changing the access mode of symbolic
> links is optional (EOPNOTSUPP allowed if not supported), but this flag
> is important even on systems where symlinks do not have access modes,
> since it's the only way to safely change the mode of a file which
> might be asynchronously replaced with a symbolic link, without a race
> condition whereby the link target is changed.
>
> It's possible to emulate AT_SYMLINK_NOFOLLOW in userspace, and both
> musl libc and glibc do this, by opening an O_PATH file descriptor and
> performing chmod on the corresponding magic symlink in /proc/self/fd.
> However, this requires procfs to be mounted and accessible.
>
> It was determined (see glibc issue #14578 and commit a492b1e5ef) that,
> on some filesystems, performing chmod on the link itself produces a
> change in the inode's access mode, but returns an EOPNOTSUPP error.
> This is non-conforming and wrong. Rather than try to fix all the
> broken filesystem backends, block attempts to change the symlink
> access mode via fchmodat2 at the frontend layer. This matches the
> userspace emulation done in libc implementations. No change is made to
> the underlying chmod_common(), so it's still possible to attempt
> changes via procfs, if desired. If at some point all filesystems have
> been fixed, this could be relaxed to allow filesystems to make their
> own decision whether changing access mode of links is supported.

A new syscall just because we have broken filesystems seems really odd,
why not just fix the filesystems instead?

thanks,

greg k-h
Re: [PATCH] vfs: add fchmodat2 syscall
On Thu, Sep 10, 2020 at 05:42:34PM +0100, Christoph Hellwig wrote:
> On Thu, Sep 10, 2020 at 12:39:50PM -0400, Rich Felker wrote:
> > On Thu, Sep 10, 2020 at 05:20:59PM +0100, Christoph Hellwig wrote:
> > > On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:
> > > > userspace emulation done in libc implementations. No change is made to
> > > > the underlying chmod_common(), so it's still possible to attempt
> > > > changes via procfs, if desired.
> > >
> > > And that is the goddamn problem. We need to fix that _first_.
> >
> > Can you clarify exactly what that is? Do you mean fixing the
> > underlying fs backends, or just ensuring that the chmod for symlinks
> > doesn't reach them by putting the check in chmod_common? I'm ok with
> > any of these.
>
> Either - we need to make sure the user can't change the permission
> bits.
>
> > > After that we can add sugarcoating using new syscalls if needed.
> >
> > The new syscall is _not_ about this problem. It's about the missing
> > flags argument and inability to implement fchmodat() without access to
> > procfs. The above problem is just something you encounter and have to
> > make a decision about in order to fix the missing flags problem and
> > make a working AT_SYMLINK_NOFOLLOW.
>
> And I'm generally supportive of that. But we need to fix the damn
> bug first and then do nice to haves.

Would you be happy with a pair of patches where the first blocks chmod
of symlinks in chmod_common and the second adds the syscall with
flags? I think this is a clearly understandable fix, but it does
eliminate the ability to *fix* link access modes that have been set to
ridiculous values (note: I don't think it really matters since the
modes don't do anything anyway) in the past. That's why I preferred to
*start* with the forced-EOPNOTSUPP just in the new interface, so that
links won't inadvertently get bogus modes set on them when libc starts
using it.

As long as some filesystems are representing access modes in links
(and returning them via stat), it seems like there should be a way to
"fix" any that were set in the past. The patch as I've submitted it
now is the least invasive change in this sense; it does not take away
any capability that already existed.

Rich
Re: [PATCH] vfs: add fchmodat2 syscall
On Thu, Sep 10, 2020 at 12:39:50PM -0400, Rich Felker wrote:
> On Thu, Sep 10, 2020 at 05:20:59PM +0100, Christoph Hellwig wrote:
> > On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:
> > > userspace emulation done in libc implementations. No change is made to
> > > the underlying chmod_common(), so it's still possible to attempt
> > > changes via procfs, if desired.
> >
> > And that is the goddamn problem. We need to fix that _first_.
>
> Can you clarify exactly what that is? Do you mean fixing the
> underlying fs backends, or just ensuring that the chmod for symlinks
> doesn't reach them by putting the check in chmod_common? I'm ok with
> any of these.

Either - we need to make sure the user can't change the permission
bits.

> > After that we can add sugarcoating using new syscalls if needed.
>
> The new syscall is _not_ about this problem. It's about the missing
> flags argument and inability to implement fchmodat() without access to
> procfs. The above problem is just something you encounter and have to
> make a decision about in order to fix the missing flags problem and
> make a working AT_SYMLINK_NOFOLLOW.

And I'm generally supportive of that. But we need to fix the damn
bug first and then do nice to haves.
Re: [PATCH] vfs: add fchmodat2 syscall
On Thu, Sep 10, 2020 at 05:20:59PM +0100, Christoph Hellwig wrote:
> On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:
> > userspace emulation done in libc implementations. No change is made to
> > the underlying chmod_common(), so it's still possible to attempt
> > changes via procfs, if desired.
>
> And that is the goddamn problem. We need to fix that _first_.

Can you clarify exactly what that is? Do you mean fixing the
underlying fs backends, or just ensuring that the chmod for symlinks
doesn't reach them by putting the check in chmod_common? I'm ok with
any of these.

> After that we can add sugarcoating using new syscalls if needed.

The new syscall is _not_ about this problem. It's about the missing
flags argument and inability to implement fchmodat() without access to
procfs. The above problem is just something you encounter and have to
make a decision about in order to fix the missing flags problem and
make a working AT_SYMLINK_NOFOLLOW.

Rich
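To make the "missing flags problem" concrete: without a no-follow variant, a mode change on a path that has been swapped for a symlink lands on the link's target. The sketch below demonstrates only that following behaviour, in a scratch directory; mode_after_chmod_via_link() and the file names "victim" and "alias" are hypothetical, chosen for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * In an existing, writable directory `dir`, create a regular file
 * "victim" and a symlink "alias" pointing at it, then change the mode
 * of the *symlink path* to 0640 with flags == 0. Returns the resulting
 * permission bits of "victim", or -1 on error. Both names are removed
 * again before returning.
 */
static int mode_after_chmod_via_link(const char *dir)
{
    char victim[4096], alias[4096];
    struct stat st;
    int fd, result = -1;

    assert(dir != NULL);

    if (snprintf(victim, sizeof victim, "%s/victim", dir) >= (int)sizeof victim ||
        snprintf(alias, sizeof alias, "%s/alias", dir) >= (int)sizeof alias)
        return -1;

    fd = open(victim, O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd < 0)
        return -1;
    close(fd);

    if (symlink("victim", alias) == 0) {
        /* flags == 0: the lookup follows the link, so the target changes. */
        if (fchmodat(AT_FDCWD, alias, 0640, 0) == 0 && stat(victim, &st) == 0)
            result = st.st_mode & 07777;
        unlink(alias);
    }
    unlink(victim);
    return result;
}
```

Called on a fresh scratch directory, this is expected to return 0640: the target's mode was changed through the link. In the racy scenario from the patch description the link would be planted by another party between a check and the chmod, which is why a working AT_SYMLINK_NOFOLLOW matters even if symlink modes themselves are never changed.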
Re: [PATCH] vfs: add fchmodat2 syscall
On Thu, Sep 10, 2020 at 06:16:15PM +0200, Greg KH wrote:
> On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:
> > POSIX defines fchmodat as having a 4th argument, flags, that can be
> > AT_SYMLINK_NOFOLLOW. Support for changing the access mode of symbolic
> > links is optional (EOPNOTSUPP allowed if not supported), but this flag
> > is important even on systems where symlinks do not have access modes,
> > since it's the only way to safely change the mode of a file which
> > might be asynchronously replaced with a symbolic link, without a race
> > condition whereby the link target is changed.
> >
> > It's possible to emulate AT_SYMLINK_NOFOLLOW in userspace, and both
> > musl libc and glibc do this, by opening an O_PATH file descriptor and
> > performing chmod on the corresponding magic symlink in /proc/self/fd.
> > However, this requires procfs to be mounted and accessible.
> >
> > It was determined (see glibc issue #14578 and commit a492b1e5ef) that,
> > on some filesystems, performing chmod on the link itself produces a
> > change in the inode's access mode, but returns an EOPNOTSUPP error.
> > This is non-conforming and wrong. Rather than try to fix all the
> > broken filesystem backends, block attempts to change the symlink
> > access mode via fchmodat2 at the frontend layer. This matches the
> > userspace emulation done in libc implementations. No change is made to
> > the underlying chmod_common(), so it's still possible to attempt
> > changes via procfs, if desired. If at some point all filesystems have
> > been fixed, this could be relaxed to allow filesystems to make their
> > own decision whether changing access mode of links is supported.
>
> A new syscall just because we have broken filesystems seems really odd,
> why not just fix the filesystems instead?

The part about broken filesystems is just the justification for doing
the EOPNOTSUPP check at this layer rather than relying on the
filesystem to do it, not the purpose of the syscall.

The purpose of the syscall is fixing the deficiency in the original
one, which lacked the flags argument, making it so you have to do a
complicated emulation dance involving O_PATH and procfs magic symlinks
to implement the standard functionality.

Rich
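For reference, the "emulation dance" mentioned above looks roughly like the following. This is a simplified sketch of the technique, not the actual glibc or musl source: emulated_fchmodat_nofollow() is a hypothetical name, and mapping ENOENT from the procfs path to EOPNOTSUPP is a choice made for this sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Emulate fchmodat(dirfd, path, mode, AT_SYMLINK_NOFOLLOW) in
 * userspace. Returns 0 on success, -1 with errno set on failure.
 * Symlinks are refused with EOPNOTSUPP before anything reaches the
 * filesystem's chmod path.
 */
static int emulated_fchmodat_nofollow(int dirfd, const char *path, mode_t mode)
{
    struct stat st;
    char procpath[64];
    int fd, ret, saved;

    assert(path != NULL);

    /* Pin the object itself; with O_NOFOLLOW a symlink is not traversed. */
    fd = openat(dirfd, path, O_PATH | O_NOFOLLOW | O_CLOEXEC);
    if (fd < 0)
        return -1;

    ret = fstat(fd, &st);
    if (ret == 0 && S_ISLNK(st.st_mode)) {
        /* Never let a symlink chmod reach the filesystem. */
        errno = EOPNOTSUPP;
        ret = -1;
    } else if (ret == 0) {
        /* chmod via the magic symlink acts on exactly the pinned inode. */
        snprintf(procpath, sizeof procpath, "/proc/self/fd/%d", fd);
        ret = chmod(procpath, mode);
        if (ret < 0 && errno == ENOENT)
            errno = EOPNOTSUPP;   /* sketch's choice: procfs unavailable */
    }

    saved = errno;
    close(fd);
    errno = saved;
    return ret;
}
```

The dependency on /proc/self/fd in the middle of this function is the limitation under discussion: in a chroot or container without procfs the regular-file case cannot be carried out at all, whereas a syscall that takes the flags argument needs neither the extra descriptor nor procfs.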
Re: [PATCH] vfs: add fchmodat2 syscall
On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:
> userspace emulation done in libc implementations. No change is made to
> the underlying chmod_common(), so it's still possible to attempt
> changes via procfs, if desired.

And that is the goddamn problem. We need to fix that _first_.
After that we can add sugarcoating using new syscalls if needed.
Re: [PATCH] vfs: add fchmodat2 syscall
On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:

> It was determined (see glibc issue #14578 and commit a492b1e5ef) that,
> on some filesystems, performing chmod on the link itself produces a
> change in the inode's access mode, but returns an EOPNOTSUPP error.

Which filesystem types are those?
[PATCH] vfs: add fchmodat2 syscall
POSIX defines fchmodat as having a 4th argument, flags, that can be
AT_SYMLINK_NOFOLLOW. Support for changing the access mode of symbolic
links is optional (EOPNOTSUPP allowed if not supported), but this flag
is important even on systems where symlinks do not have access modes,
since it's the only way to safely change the mode of a file which
might be asynchronously replaced with a symbolic link, without a race
condition whereby the link target is changed.

It's possible to emulate AT_SYMLINK_NOFOLLOW in userspace, and both
musl libc and glibc do this, by opening an O_PATH file descriptor and
performing chmod on the corresponding magic symlink in /proc/self/fd.
However, this requires procfs to be mounted and accessible.

It was determined (see glibc issue #14578 and commit a492b1e5ef) that,
on some filesystems, performing chmod on the link itself produces a
change in the inode's access mode, but returns an EOPNOTSUPP error.
This is non-conforming and wrong. Rather than try to fix all the
broken filesystem backends, block attempts to change the symlink
access mode via fchmodat2 at the frontend layer. This matches the
userspace emulation done in libc implementations. No change is made to
the underlying chmod_common(), so it's still possible to attempt
changes via procfs, if desired. If at some point all filesystems have
been fixed, this could be relaxed to allow filesystems to make their
own decision whether changing access mode of links is supported.

Signed-off-by: Rich Felker
---
 arch/alpha/kernel/syscalls/syscall.tbl      |  1 +
 arch/arm/tools/syscall.tbl                  |  1 +
 arch/arm64/include/asm/unistd.h             |  2 +-
 arch/arm64/include/asm/unistd32.h           |  2 ++
 arch/ia64/kernel/syscalls/syscall.tbl       |  1 +
 arch/m68k/kernel/syscalls/syscall.tbl       |  1 +
 arch/microblaze/kernel/syscalls/syscall.tbl |  1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |  1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |  1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |  1 +
 arch/parisc/kernel/syscalls/syscall.tbl     |  1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |  1 +
 arch/s390/kernel/syscalls/syscall.tbl       |  1 +
 arch/sh/kernel/syscalls/syscall.tbl         |  1 +
 arch/sparc/kernel/syscalls/syscall.tbl      |  1 +
 arch/x86/entry/syscalls/syscall_32.tbl      |  1 +
 arch/x86/entry/syscalls/syscall_64.tbl      |  1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |  1 +
 fs/open.c                                   | 29 ++---
 include/linux/syscalls.h                    |  2 ++
 include/uapi/asm-generic/unistd.h           |  4 ++-
 21 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index ec8bed9e7b75..5648fa8be7a1 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -479,3 +479,4 @@
 547	common	openat2	sys_openat2
 548	common	pidfd_getfd	sys_pidfd_getfd
 549	common	faccessat2	sys_faccessat2
+550	common	fchmodat2	sys_fchmodat2
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 171077cbf419..b6b715bb3315 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -453,3 +453,4 @@
 437	common	openat2	sys_openat2
 438	common	pidfd_getfd	sys_pidfd_getfd
 439	common	faccessat2	sys_faccessat2
+440	common	fchmodat2	sys_fchmodat2
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 3b859596840d..b3b2019f8d16 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -38,7 +38,7 @@
 #define __ARM_NR_compat_set_tls		(__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END		(__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls		440
+#define __NR_compat_syscalls		441
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 734860ac7cf9..cd0845f3c19f 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -887,6 +887,8 @@ __SYSCALL(__NR_openat2, sys_openat2)
 __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
 #define __NR_faccessat2 439
 __SYSCALL(__NR_faccessat2, sys_faccessat2)
+#define __NR_fchmodat2 440
+__SYSCALL(__NR_fchmodat2, sys_fchmodat2)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index f52a41f4c340..7c3f8564d0f3 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -360,3 +360,4 @@
 437	common	openat2	sys_openat2
 438	common	pidfd_getfd	sys_pidfd_getfd
 439co