Re: [PATCH] vfs: add fchmodat2 syscall

2020-09-10 Thread Christoph Hellwig
On Thu, Sep 10, 2020 at 01:02:56PM -0400, Rich Felker wrote:
> Would you be happy with a pair of patches where the first blocks chmod
> of symlinks in chmod_common and the second adds the syscall with
> flags? I think this is a clearly understandable fix, but it does
> eliminate the ability to *fix* link access modes that have been set to
> ridiculous values (note: I don't think it really matters since the
> modes don't do anything anyway) in the past.

I'd be much happier with that, yes.


RE: [PATCH] vfs: add fchmodat2 syscall

2020-09-10 Thread David Laight
From: Rich Felker
> Sent: 10 September 2020 15:24
...
> index 9af548fb841b..570a21f4d81e 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -610,15 +610,30 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
>  	return err;
>  }
> 
> -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> +static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int flags)
>  {
>  	struct path path;
>  	int error;
>  	unsigned int lookup_flags = LOOKUP_FOLLOW;
> +
> +	if (flags & AT_SYMLINK_NOFOLLOW)
> +		lookup_flags &= ~LOOKUP_FOLLOW;
> +	if (flags & ~AT_SYMLINK_NOFOLLOW)
> +		return -EINVAL;

I think I'd swap those two tests, so that unsupported flags are
rejected before anything else is done.

>  retry:
>  	error = user_path_at(dfd, filename, lookup_flags, &path);
>  	if (!error) {
> -		error = chmod_common(&path, mode);
> +		/* Block chmod from getting to fs layer. Ideally the
> +		 * fs would either allow it or fail with EOPNOTSUPP,
> +		 * but some are buggy and return an error but change
> +		 * the mode, which is non-conforming and wrong.
> +		 * Userspace emulation of AT_SYMLINK_NOFOLLOW in
> +		 * glibc and musl blocked it too, for same reason. */
> +		if (S_ISLNK(path.dentry->d_inode->i_mode)
> +		    && (flags & AT_SYMLINK_NOFOLLOW))
> +			error = -EOPNOTSUPP;

Again swap the order of the tests. I think it reads better as:
	if ((flags & AT_SYMLINK_NOFOLLOW)
	    && S_ISLNK(path.dentry->d_inode->i_mode))
		error = -EOPNOTSUPP;
As well as saving a few clock cycles.

> +		else
> +			error = chmod_common(&path, mode);
>  		path_put(&path);
>  		if (retry_estale(error, lookup_flags)) {
>  			lookup_flags |= LOOKUP_REVAL;
...
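
The two reorderings suggested above can be modelled outside the kernel.
What follows is a minimal userspace sketch, not the patch itself:
model_fchmodat2_checks() is a hypothetical helper, and its target_mode
parameter stands in for path.dentry->d_inode->i_mode after lookup. It
rejects unknown flags before anything else, then tests the cheap flag
bit before looking at the inode type.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* expose AT_* and S_IFLNK under strict -std= modes */
#endif
#include <errno.h>
#include <fcntl.h>	/* AT_SYMLINK_NOFOLLOW, AT_REMOVEDIR */
#include <sys/stat.h>	/* S_ISLNK, S_IFLNK, S_IFREG */
#include <sys/types.h>

/*
 * Hypothetical userspace model of the checks in do_fchmodat().
 * Returns 0 where the real code would go on to call chmod_common(),
 * or a negative errno value where it would bail out.
 *
 * Simplification: in the kernel the inode is only known after
 * user_path_at(); here the caller passes its mode in directly.
 */
static int model_fchmodat2_checks(int flags, mode_t target_mode)
{
	/* Unsupported flags are an error regardless of anything else. */
	if (flags & ~AT_SYMLINK_NOFOLLOW)
		return -EINVAL;

	/* Flag bit first: when it is clear, the inode is never inspected. */
	if ((flags & AT_SYMLINK_NOFOLLOW) && S_ISLNK(target_mode))
		return -EOPNOTSUPP;

	return 0;
}
```

With this ordering, a call that passes both an unknown flag and
AT_SYMLINK_NOFOLLOW on a symlink fails with -EINVAL, not -EOPNOTSUPP.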

David




Re: [PATCH] vfs: add fchmodat2 syscall

2020-09-10 Thread Rich Felker
On Thu, Sep 10, 2020 at 04:18:28PM +0100, Al Viro wrote:
> On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:
> 
> > It was determined (see glibc issue #14578 and commit a492b1e5ef) that,
> > on some filesystems, performing chmod on the link itself produces a
> > change in the inode's access mode, but returns an EOPNOTSUPP error.
> 
> Which filesystem types are those?

It's been a long time and I don't know if the details were recorded.
It was reported for xfs but I believe we later found it happening for
others. See:

https://sourceware.org/bugzilla/show_bug.cgi?id=14578#c17
https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00467.html

and especially:

https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00518.html

where Christoph seems to have endorsed the approach in my patch. I'm
fine with doing it differently if you'd prefer, though.

Rich


Re: [PATCH] vfs: add fchmodat2 syscall

2020-09-10 Thread Greg KH
On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:
> POSIX defines fchmodat as having a 4th argument, flags, that can be
> AT_SYMLINK_NOFOLLOW. Support for changing the access mode of symbolic
> links is optional (EOPNOTSUPP allowed if not supported), but this flag
> is important even on systems where symlinks do not have access modes,
> since it's the only way to safely change the mode of a file which
> might be asynchronously replaced with a symbolic link, without a race
> condition whereby the link target is changed.
> 
> It's possible to emulate AT_SYMLINK_NOFOLLOW in userspace, and both
> musl libc and glibc do this, by opening an O_PATH file descriptor and
> performing chmod on the corresponding magic symlink in /proc/self/fd.
> However, this requires procfs to be mounted and accessible.
> 
> It was determined (see glibc issue #14578 and commit a492b1e5ef) that,
> on some filesystems, performing chmod on the link itself produces a
> change in the inode's access mode, but returns an EOPNOTSUPP error.
> This is non-conforming and wrong. Rather than try to fix all the
> broken filesystem backends, block attempts to change the symlink
> access mode via fchmodat2 at the frontend layer. This matches the
> userspace emulation done in libc implementations. No change is made to
> the underlying chmod_common(), so it's still possible to attempt
> changes via procfs, if desired. If at some point all filesystems have
> been fixed, this could be relaxed to allow filesystems to make their
> own decision whether changing access mode of links is supported.

A new syscall just because we have broken filesystems seems really odd,
why not just fix the filesystems instead?

thanks,

greg k-h


Re: [PATCH] vfs: add fchmodat2 syscall

2020-09-10 Thread Rich Felker
On Thu, Sep 10, 2020 at 05:42:34PM +0100, Christoph Hellwig wrote:
> On Thu, Sep 10, 2020 at 12:39:50PM -0400, Rich Felker wrote:
> > On Thu, Sep 10, 2020 at 05:20:59PM +0100, Christoph Hellwig wrote:
> > > On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:
> > > > userspace emulation done in libc implementations. No change is made to
> > > > the underlying chmod_common(), so it's still possible to attempt
> > > > changes via procfs, if desired.
> > > 
> > > And that is the goddamn problem.  We need to fix that _first_.
> > 
> > Can you clarify exactly what that is? Do you mean fixing the
> > underlying fs backends, or just ensuring that the chmod for symlinks
> > doesn't reach them by putting the check in chmod_common? I'm ok with
> > any of these.
> 
> Either - we need to make sure the user can't change the permission
> bits.
> 
> > > After that we can add sugarcoating using new syscalls if needed.
> > 
> > The new syscall is _not_ about this problem. It's about the missing
> > flags argument and inability to implement fchmodat() without access to
> > procfs. The above problem is just something you encounter and have to
> > make a decision about in order to fix the missing flags problem and
> > make a working AT_SYMLINK_NOFOLLOW.
> 
> And I'm generally supportive of that.  But we need to fix the damn
> bug first and then do nice to haves.

Would you be happy with a pair of patches where the first blocks chmod
of symlinks in chmod_common and the second adds the syscall with
flags? I think this is a clearly understandable fix, but it does
eliminate the ability to *fix* link access modes that have been set to
ridiculous values (note: I don't think it really matters since the
modes don't do anything anyway) in the past.

That's why I preferred to *start* with the forced-EOPNOTSUPP just in
the new interface, so that links won't inadvertently get bogus modes
set on them when libc starts using it. As long as some filesystems are
representing access modes in links (and returning them via stat), it
seems like there should be a way to "fix" any that were set in the
past. The patch as I've submitted it now is the least invasive change
in this sense; it does not take away any capability that already
existed.

Rich


Re: [PATCH] vfs: add fchmodat2 syscall

2020-09-10 Thread Christoph Hellwig
On Thu, Sep 10, 2020 at 12:39:50PM -0400, Rich Felker wrote:
> On Thu, Sep 10, 2020 at 05:20:59PM +0100, Christoph Hellwig wrote:
> > On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:
> > > userspace emulation done in libc implementations. No change is made to
> > > the underlying chmod_common(), so it's still possible to attempt
> > > changes via procfs, if desired.
> > 
> > And that is the goddamn problem.  We need to fix that _first_.
> 
> Can you clarify exactly what that is? Do you mean fixing the
> underlying fs backends, or just ensuring that the chmod for symlinks
> doesn't reach them by putting the check in chmod_common? I'm ok with
> any of these.

Either - we need to make sure the user can't change the permission
bits.

> > After that we can add sugarcoating using new syscalls if needed.
> 
> The new syscall is _not_ about this problem. It's about the missing
> flags argument and inability to implement fchmodat() without access to
> procfs. The above problem is just something you encounter and have to
> make a decision about in order to fix the missing flags problem and
> make a working AT_SYMLINK_NOFOLLOW.

And I'm generally supportive of that.  But we need to fix the damn
bug first and then do nice to haves.


Re: [PATCH] vfs: add fchmodat2 syscall

2020-09-10 Thread Rich Felker
On Thu, Sep 10, 2020 at 05:20:59PM +0100, Christoph Hellwig wrote:
> On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:
> > userspace emulation done in libc implementations. No change is made to
> > the underlying chmod_common(), so it's still possible to attempt
> > changes via procfs, if desired.
> 
> And that is the goddamn problem.  We need to fix that _first_.

Can you clarify exactly what that is? Do you mean fixing the
underlying fs backends, or just ensuring that the chmod for symlinks
doesn't reach them by putting the check in chmod_common? I'm ok with
any of these.

> After that we can add sugarcoating using new syscalls if needed.

The new syscall is _not_ about this problem. It's about the missing
flags argument and inability to implement fchmodat() without access to
procfs. The above problem is just something you encounter and have to
make a decision about in order to fix the missing flags problem and
make a working AT_SYMLINK_NOFOLLOW.
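
From the caller's side, the behaviour being asked for looks like the
sketch below. It only uses the libc fchmodat() wrapper, which is the
interface that currently has to be emulated; chmod_no_follow() is a
hypothetical helper name. The point is that when the path turns out to
be a symlink, the link target must not be touched, whatever error (if
any) comes back.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* expose fchmodat() and AT_* under strict -std= modes */
#endif
#include <errno.h>
#include <fcntl.h>	/* AT_FDCWD, AT_SYMLINK_NOFOLLOW */
#include <sys/stat.h>	/* fchmodat */
#include <sys/types.h>

/*
 * Hypothetical helper: change the mode of 'path' without following a
 * trailing symlink. Returns 0 on success or a negative errno value.
 * On Linux, a symlink is expected to yield -EOPNOTSUPP, but the exact
 * error depends on the libc and kernel in use.
 */
static int chmod_no_follow(const char *path, mode_t mode)
{
	if (fchmodat(AT_FDCWD, path, mode, AT_SYMLINK_NOFOLLOW) < 0)
		return -errno;
	return 0;
}
```

A caller that gets an error back for a path it expected to be a
regular file can treat that as a sign that the file was swapped for a
link underneath it.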

Rich


Re: [PATCH] vfs: add fchmodat2 syscall

2020-09-10 Thread Rich Felker
On Thu, Sep 10, 2020 at 06:16:15PM +0200, Greg KH wrote:
> On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:
> > POSIX defines fchmodat as having a 4th argument, flags, that can be
> > AT_SYMLINK_NOFOLLOW. Support for changing the access mode of symbolic
> > links is optional (EOPNOTSUPP allowed if not supported), but this flag
> > is important even on systems where symlinks do not have access modes,
> > since it's the only way to safely change the mode of a file which
> > might be asynchronously replaced with a symbolic link, without a race
> > condition whereby the link target is changed.
> > 
> > It's possible to emulate AT_SYMLINK_NOFOLLOW in userspace, and both
> > musl libc and glibc do this, by opening an O_PATH file descriptor and
> > performing chmod on the corresponding magic symlink in /proc/self/fd.
> > However, this requires procfs to be mounted and accessible.
> > 
> > It was determined (see glibc issue #14578 and commit a492b1e5ef) that,
> > on some filesystems, performing chmod on the link itself produces a
> > change in the inode's access mode, but returns an EOPNOTSUPP error.
> > This is non-conforming and wrong. Rather than try to fix all the
> > broken filesystem backends, block attempts to change the symlink
> > access mode via fchmodat2 at the frontend layer. This matches the
> > userspace emulation done in libc implementations. No change is made to
> > the underlying chmod_common(), so it's still possible to attempt
> > changes via procfs, if desired. If at some point all filesystems have
> > been fixed, this could be relaxed to allow filesystems to make their
> > own decision whether changing access mode of links is supported.
> 
> A new syscall just because we have broken filesystems seems really odd,
> why not just fix the filesystems instead?

The part about broken filesystems is just the justification for doing
the EOPNOTSUPP check at this layer rather than relying on the
filesystem to do it, not the purpose for the syscall.

The purpose of the syscall is fixing the deficiency in the original
one, which lacked the flags argument, making it so you have to do a
complicated emulation dance involving O_PATH and procfs magic symlinks
to implement the standard functionality.
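
For reference, the emulation dance looks roughly like the sketch below.
It is an illustration under stated assumptions, not the actual glibc or
musl source: chmod_nofollow_emulated() is a hypothetical name, procfs
is assumed to be mounted at /proc, and the kernel is assumed to be
recent enough for fstat() to work on an O_PATH descriptor.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* O_PATH is a Linux extension */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef O_PATH
/* Assumption: fallback for header setups that hide O_PATH; this is the
 * asm-generic value and is not correct on every architecture. */
#define O_PATH 010000000
#endif

/*
 * Hypothetical emulation of fchmodat(dirfd, path, mode,
 * AT_SYMLINK_NOFOLLOW). Returns 0 on success, -1 with errno set.
 */
static int chmod_nofollow_emulated(int dirfd, const char *path, mode_t mode)
{
	struct stat st;
	char procpath[64];
	int fd, ret, saved;

	/* Pin the inode without following a trailing symlink. */
	fd = openat(dirfd, path, O_PATH | O_NOFOLLOW | O_CLOEXEC);
	if (fd < 0)
		return -1;

	ret = fstat(fd, &st);
	if (ret < 0)
		goto out;

	/* Refuse symlinks up front so the filesystem never sees the request. */
	if (S_ISLNK(st.st_mode)) {
		errno = EOPNOTSUPP;
		ret = -1;
		goto out;
	}

	/* chmod() follows the magic symlink to the inode pinned above. */
	snprintf(procpath, sizeof procpath, "/proc/self/fd/%d", fd);
	ret = chmod(procpath, mode);
out:
	saved = errno;
	close(fd);
	errno = saved;
	return ret;
}
```

The last step is the one that fails when procfs is not mounted or not
accessible, which is the gap a syscall with a flags argument closes.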

Rich


Re: [PATCH] vfs: add fchmodat2 syscall

2020-09-10 Thread Christoph Hellwig
On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:
> userspace emulation done in libc implementations. No change is made to
> the underlying chmod_common(), so it's still possible to attempt
> changes via procfs, if desired.

And that is the goddamn problem.  We need to fix that _first_.  After
that we can add sugarcoating using new syscalls if needed.


Re: [PATCH] vfs: add fchmodat2 syscall

2020-09-10 Thread Al Viro
On Thu, Sep 10, 2020 at 10:23:37AM -0400, Rich Felker wrote:

> It was determined (see glibc issue #14578 and commit a492b1e5ef) that,
> on some filesystems, performing chmod on the link itself produces a
> change in the inode's access mode, but returns an EOPNOTSUPP error.

Which filesystem types are those?


[PATCH] vfs: add fchmodat2 syscall

2020-09-10 Thread Rich Felker
POSIX defines fchmodat as having a 4th argument, flags, that can be
AT_SYMLINK_NOFOLLOW. Support for changing the access mode of symbolic
links is optional (EOPNOTSUPP allowed if not supported), but this flag
is important even on systems where symlinks do not have access modes,
since it's the only way to safely change the mode of a file which
might be asynchronously replaced with a symbolic link, without a race
condition whereby the link target is changed.

It's possible to emulate AT_SYMLINK_NOFOLLOW in userspace, and both
musl libc and glibc do this, by opening an O_PATH file descriptor and
performing chmod on the corresponding magic symlink in /proc/self/fd.
However, this requires procfs to be mounted and accessible.

It was determined (see glibc issue #14578 and commit a492b1e5ef) that,
on some filesystems, performing chmod on the link itself produces a
change in the inode's access mode, but returns an EOPNOTSUPP error.
This is non-conforming and wrong. Rather than try to fix all the
broken filesystem backends, block attempts to change the symlink
access mode via fchmodat2 at the frontend layer. This matches the
userspace emulation done in libc implementations. No change is made to
the underlying chmod_common(), so it's still possible to attempt
changes via procfs, if desired. If at some point all filesystems have
been fixed, this could be relaxed to allow filesystems to make their
own decision whether changing access mode of links is supported.

Signed-off-by: Rich Felker 
---
 arch/alpha/kernel/syscalls/syscall.tbl      |  1 +
 arch/arm/tools/syscall.tbl                  |  1 +
 arch/arm64/include/asm/unistd.h             |  2 +-
 arch/arm64/include/asm/unistd32.h           |  2 ++
 arch/ia64/kernel/syscalls/syscall.tbl       |  1 +
 arch/m68k/kernel/syscalls/syscall.tbl       |  1 +
 arch/microblaze/kernel/syscalls/syscall.tbl |  1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |  1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |  1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |  1 +
 arch/parisc/kernel/syscalls/syscall.tbl     |  1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |  1 +
 arch/s390/kernel/syscalls/syscall.tbl       |  1 +
 arch/sh/kernel/syscalls/syscall.tbl         |  1 +
 arch/sparc/kernel/syscalls/syscall.tbl      |  1 +
 arch/x86/entry/syscalls/syscall_32.tbl      |  1 +
 arch/x86/entry/syscalls/syscall_64.tbl      |  1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |  1 +
 fs/open.c                                   | 29 ++---
 include/linux/syscalls.h                    |  2 ++
 include/uapi/asm-generic/unistd.h           |  4 ++-
 21 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index ec8bed9e7b75..5648fa8be7a1 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -479,3 +479,4 @@
 547  common  openat2      sys_openat2
 548  common  pidfd_getfd  sys_pidfd_getfd
 549  common  faccessat2   sys_faccessat2
+550  common  fchmodat2    sys_fchmodat2
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 171077cbf419..b6b715bb3315 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -453,3 +453,4 @@
 437  common  openat2      sys_openat2
 438  common  pidfd_getfd  sys_pidfd_getfd
 439  common  faccessat2   sys_faccessat2
+440  common  fchmodat2    sys_fchmodat2
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 3b859596840d..b3b2019f8d16 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -38,7 +38,7 @@
 #define __ARM_NR_compat_set_tls		(__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END		(__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls   440
+#define __NR_compat_syscalls   441
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 734860ac7cf9..cd0845f3c19f 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -887,6 +887,8 @@ __SYSCALL(__NR_openat2, sys_openat2)
 __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
 #define __NR_faccessat2 439
 __SYSCALL(__NR_faccessat2, sys_faccessat2)
+#define __NR_fchmodat2 440
+__SYSCALL(__NR_fchmodat2, sys_fchmodat2)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index f52a41f4c340..7c3f8564d0f3 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -360,3 +360,4 @@
 437  common  openat2      sys_openat2
 438  common  pidfd_getfd  sys_pidfd_getfd
 439co