Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Mon, Jun 01, 2020 at 12:02:10PM -0700, Sargun Dhillon wrote: > On Sat, May 30, 2020 at 9:07 AM Kees Cook wrote: > > > > On Sat, May 30, 2020 at 03:08:37PM +0100, Al Viro wrote: > > > On Fri, May 29, 2020 at 07:43:10PM -0700, Kees Cook wrote: > > > > > > > Can anyone clarify the expected failure mode from SCM_RIGHTS? Can we > > > > move the put_user() after instead? I think cleanup would just be: > > > > replace_fd(fd, NULL, 0) > > > > > > Bollocks. > > > > > > Repeat after me: descriptor tables can be shared. There is no > > > "cleanup" after you've put something there. > > > > Right -- this is what I was trying to ask about, and why I didn't like > > the idea of just leaving the fd in the table on failure. But yeah, there > > is a race if the process is about to fork or something. > > > > So the choice here is how to handle the put_user() failure: > > > > - add the put_user() address to the new helper, as I suggest in [1]. > > (exactly duplicates current behavior) > > - just leave the fd in place (not current behavior: dumps a fd into > > the process without "agreed" notification). > > - do a double put_user (once before and once after), also in [1]. > > (sort of a best-effort combo of the above two. and SCM_RIGHTS is > > hardly fast-pth). > > > > -Kees > > > > [1] https://lore.kernel.org/linux-api/202005282345.573B917@keescook/ > > > > -- > > Kees Cook > > I'm going to suggest we stick to the approach of doing[1]: > 1. Allocate FD > 2. put_user > 3. "Receive" and install file into FD > > That is the only way to preserve the current behaviour in which userspace > is notified about *every* FD that is received via SCM_RIGHTS. The > scm_detach_fds code as it reads today does effectively what is above, > in that the fd is not installed until *after* the put user. Therefore > if put_user > gets an EFAULT or ENOMEM, it falls through to the MSG_CTRUNC bit. > > The approach suggested[2] has a "change" in behaviour, in that (all in > file_receive): > 1. Allocate FD > 2. Receive file > 3. put_user > > Based on what Al Viro said, I don't think we can simply add step #4, > being "just" uninstall the FD. > > [1]: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2179418.html > [2]: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2179453.html Agreed. Thanks! -- Kees Cook
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Sat, May 30, 2020 at 9:07 AM Kees Cook wrote: > > On Sat, May 30, 2020 at 03:08:37PM +0100, Al Viro wrote: > > On Fri, May 29, 2020 at 07:43:10PM -0700, Kees Cook wrote: > > > > > Can anyone clarify the expected failure mode from SCM_RIGHTS? Can we > > > move the put_user() after instead? I think cleanup would just be: > > > replace_fd(fd, NULL, 0) > > > > Bollocks. > > > > Repeat after me: descriptor tables can be shared. There is no > > "cleanup" after you've put something there. > > Right -- this is what I was trying to ask about, and why I didn't like > the idea of just leaving the fd in the table on failure. But yeah, there > is a race if the process is about to fork or something. > > So the choice here is how to handle the put_user() failure: > > - add the put_user() address to the new helper, as I suggest in [1]. > (exactly duplicates current behavior) > - just leave the fd in place (not current behavior: dumps a fd into > the process without "agreed" notification). > - do a double put_user (once before and once after), also in [1]. > (sort of a best-effort combo of the above two. and SCM_RIGHTS is > hardly fast-pth). > > -Kees > > [1] https://lore.kernel.org/linux-api/202005282345.573B917@keescook/ > > -- > Kees Cook I'm going to suggest we stick to the approach of doing[1]: 1. Allocate FD 2. put_user 3. "Receive" and install file into FD That is the only way to preserve the current behaviour in which userspace is notified about *every* FD that is received via SCM_RIGHTS. The scm_detach_fds code as it reads today does effectively what is above, in that the fd is not installed until *after* the put user. Therefore if put_user gets an EFAULT or ENOMEM, it falls through to the MSG_CTRUNC bit. The approach suggested[2] has a "change" in behaviour, in that (all in file_receive): 1. Allocate FD 2. Receive file 3. put_user Based on what Al Viro said, I don't think we can simply add step #4, being "just" uninstall the FD. [1]: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2179418.html [2]: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2179453.html
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Sat, May 30, 2020 at 09:14:50AM -0700, Kees Cook wrote: > On Sat, May 30, 2020 at 04:13:29PM +0200, Christian Brauner wrote: > > On Fri, May 29, 2020 at 10:47:12PM -0700, Kees Cook wrote: > > > Oh hey! Look at scm_detach_fds_compat(). It needs this too. (And it's > > > missing the cgroup tracking.) That would fix: > > > > > > 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set > > > correctly") > > > d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set > > > correctly") > > > > > > So, yes, let's get this fixed up. I'd say first fix the missing sock > > > update in the compat path (so it can be CCed stable). Then fix the missing > > > > send this patch to net. > > > > > sock update in pidfd_getfd() (so it can be CCed stable), then write the > > > > send this patch to me. > > > > > helper with a refactoring of scm_detach_fds(), scm_detach_fds_compat(), > > > > this would be net-next most likely. > > > > > and pidfd_getfd(). And then add the addfd seccomp user_notif ioctl cmd. > > > > If you do this first, I'd suggest you resend the series here after all > > this has been merged. We're not in a rush since this won't make it for > > the 5.8 merge window anyway. By the time the changes land Kees might've > > applied my changes to his tree so you can rebase yours on top of it > > relieving Kees from fixing up merge conflicts. > > > > About your potential net and net-next changes. Just in case you don't > > know - otherwise ignore this - please read and treat > > https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt > > as the gospel. Also note, that after this Sunday - assuming Linus > > releases - net-next will be closed until the merge window is closed, > > i.e. for _at least_ 2 weeks. After the merge window closes you can check > > http://vger.kernel.org/~davem/net-next.html > > which either has a picture saying "Come In We're Open" or a sign saying > > "Sorry, We're Closed". Only send when the first sign is up or the wrath > > of Dave might hit you. :) > > Yeah, timing is awkward here. I was originally thinking it could all > just land via seccomp (with appropriate Acks). Hmmm. I don't particularly care so sure. :) Christian
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Sat, May 30, 2020 at 04:13:29PM +0200, Christian Brauner wrote: > On Fri, May 29, 2020 at 10:47:12PM -0700, Kees Cook wrote: > > Oh hey! Look at scm_detach_fds_compat(). It needs this too. (And it's > > missing the cgroup tracking.) That would fix: > > > > 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set > > correctly") > > d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set > > correctly") > > > > So, yes, let's get this fixed up. I'd say first fix the missing sock > > update in the compat path (so it can be CCed stable). Then fix the missing > > send this patch to net. > > > sock update in pidfd_getfd() (so it can be CCed stable), then write the > > send this patch to me. > > > helper with a refactoring of scm_detach_fds(), scm_detach_fds_compat(), > > this would be net-next most likely. > > > and pidfd_getfd(). And then add the addfd seccomp user_notif ioctl cmd. > > If you do this first, I'd suggest you resend the series here after all > this has been merged. We're not in a rush since this won't make it for > the 5.8 merge window anyway. By the time the changes land Kees might've > applied my changes to his tree so you can rebase yours on top of it > relieving Kees from fixing up merge conflicts. > > About your potential net and net-next changes. Just in case you don't > know - otherwise ignore this - please read and treat > https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt > as the gospel. Also note, that after this Sunday - assuming Linus > releases - net-next will be closed until the merge window is closed, > i.e. for _at least_ 2 weeks. After the merge window closes you can check > http://vger.kernel.org/~davem/net-next.html > which either has a picture saying "Come In We're Open" or a sign saying > "Sorry, We're Closed". Only send when the first sign is up or the wrath > of Dave might hit you. :) Yeah, timing is awkward here. I was originally thinking it could all just land via seccomp (with appropriate Acks). Hmmm. -- Kees Cook
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Sat, May 30, 2020 at 03:58:27PM +0200, Christian Brauner wrote: > On Sat, May 30, 2020 at 05:17:24AM +0200, Jann Horn wrote: > > On Sat, May 30, 2020 at 4:43 AM Kees Cook wrote: > > > I mean, yes, that's certainly better, but it just seems a shame that > > > everyone has to do the get_unused/put_unused dance just because of how > > > SCM_RIGHTS does this weird put_user() in the middle. > > > > > > Can anyone clarify the expected failure mode from SCM_RIGHTS? Can we > > > move the put_user() after instead? > > > > Honestly, I think trying to remove file descriptors and such after > > -EFAULT is a waste of time. If userspace runs into -EFAULT, userspace > [...] > > There's really no point in trying to save a broken scm message imho. Right -- my concern is about stuffing a fd into a process without it knowing (this is likely an overly paranoid concern, given that if the process is getting EFAULT at the end of a list of fds, all the prior ones will be installed too..) -- Kees Cook
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Sat, May 30, 2020 at 03:08:37PM +0100, Al Viro wrote: > On Fri, May 29, 2020 at 07:43:10PM -0700, Kees Cook wrote: > > > Can anyone clarify the expected failure mode from SCM_RIGHTS? Can we > > move the put_user() after instead? I think cleanup would just be: > > replace_fd(fd, NULL, 0) > > Bollocks. > > Repeat after me: descriptor tables can be shared. There is no > "cleanup" after you've put something there. Right -- this is what I was trying to ask about, and why I didn't like the idea of just leaving the fd in the table on failure. But yeah, there is a race if the process is about to fork or something. So the choice here is how to handle the put_user() failure: - add the put_user() address to the new helper, as I suggest in [1]. (exactly duplicates current behavior) - just leave the fd in place (not current behavior: dumps a fd into the process without "agreed" notification). - do a double put_user (once before and once after), also in [1]. (sort of a best-effort combo of the above two. and SCM_RIGHTS is hardly fast-pth). -Kees [1] https://lore.kernel.org/linux-api/202005282345.573B917@keescook/ -- Kees Cook
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Fri, May 29, 2020 at 10:47:12PM -0700, Kees Cook wrote: > On Sat, May 30, 2020 at 03:58:18AM +, Sargun Dhillon wrote: > > Isn't the "right" way to do this to allocate a bunch of file descriptors, > > and fill up the user buffer with them, and then install the files? This > > seems to like half-install the file descriptors and then error out. > > > > I know that's the current behaviour, but that seems like a bad idea. Do > > we really want to perpetuate this half-broken state? I guess that some > > userspace programs could be depending on this -- and their recovery > > semantics could rely on this. I mean this is 10+ year old code. > > Right -- my instincts on this are to leave the behavior as-is. I've > been burned by so many "nothing could possible rely on THIS" cases. ;) > > It might be worth adding a comment (or commit log note) that describes > the alternative you suggest here. But I think building a common helper > that does all of the work (and will get used in three^Wfour places now) > is the correct way to refactor this. If you do this, then probably > > Oh hey! Look at scm_detach_fds_compat(). It needs this too. (And it's > missing the cgroup tracking.) That would fix: > > 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set > correctly") > d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set > correctly") > > So, yes, let's get this fixed up. I'd say first fix the missing sock > update in the compat path (so it can be CCed stable). Then fix the missing send this patch to net. > sock update in pidfd_getfd() (so it can be CCed stable), then write the send this patch to me. > helper with a refactoring of scm_detach_fds(), scm_detach_fds_compat(), this would be net-next most likely. > and pidfd_getfd(). And then add the addfd seccomp user_notif ioctl cmd. If you do this first, I'd suggest you resend the series here after all this has been merged. We're not in a rush since this won't make it for the 5.8 merge window anyway. By the time the changes land Kees might've applied my changes to his tree so you can rebase yours on top of it relieving Kees from fixing up merge conflicts. About your potential net and net-next changes. Just in case you don't know - otherwise ignore this - please read and treat https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt as the gospel. Also note, that after this Sunday - assuming Linus releases - net-next will be closed until the merge window is closed, i.e. for _at least_ 2 weeks. After the merge window closes you can check http://vger.kernel.org/~davem/net-next.html which either has a picture saying "Come In We're Open" or a sign saying "Sorry, We're Closed". Only send when the first sign is up or the wrath of Dave might hit you. :) Christian
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Fri, May 29, 2020 at 07:43:10PM -0700, Kees Cook wrote: > Can anyone clarify the expected failure mode from SCM_RIGHTS? Can we > move the put_user() after instead? I think cleanup would just be: > replace_fd(fd, NULL, 0) Bollocks. Repeat after me: descriptor tables can be shared. There is no "cleanup" after you've put something there. If you do not get it, you have no business messing with any of this stuff.
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Sat, May 30, 2020 at 05:17:24AM +0200, Jann Horn wrote: > On Sat, May 30, 2020 at 4:43 AM Kees Cook wrote: > > I mean, yes, that's certainly better, but it just seems a shame that > > everyone has to do the get_unused/put_unused dance just because of how > > SCM_RIGHTS does this weird put_user() in the middle. > > > > Can anyone clarify the expected failure mode from SCM_RIGHTS? Can we > > move the put_user() after instead? > > Honestly, I think trying to remove file descriptors and such after > -EFAULT is a waste of time. If userspace runs into -EFAULT, userspace Agreed, we've never bothered with trying to recover from EFAULT. Just look at kernel/fork.c:_do_fork(): if (clone_flags & CLONE_PARENT_SETTID) put_user(nr, args->parent_tid); we don't even bother even though we technically could. > is beyond saving and can't really do much other than exit immediately. > There are a bunch of places that will change state and then throw > -EFAULT at the end if userspace supplied an invalid address, because > trying to hold locks across userspace accesses just in case userspace > supplied a bogus address is kinda silly (and often borderline > impossible). > > You can actually see that even scm_detach_fds() currently just > silently swallows errors if writing some header fields fails at the > end. There's really no point in trying to save a broken scm message imho.
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Sat, May 30, 2020 at 03:58:18AM +, Sargun Dhillon wrote: > Isn't the "right" way to do this to allocate a bunch of file descriptors, > and fill up the user buffer with them, and then install the files? This > seems to like half-install the file descriptors and then error out. > > I know that's the current behaviour, but that seems like a bad idea. Do > we really want to perpetuate this half-broken state? I guess that some > userspace programs could be depending on this -- and their recovery > semantics could rely on this. I mean this is 10+ year old code. Right -- my instincts on this are to leave the behavior as-is. I've been burned by so many "nothing could possible rely on THIS" cases. ;) It might be worth adding a comment (or commit log note) that describes the alternative you suggest here. But I think building a common helper that does all of the work (and will get used in three^Wfour places now) is the correct way to refactor this. Oh hey! Look at scm_detach_fds_compat(). It needs this too. (And it's missing the cgroup tracking.) That would fix: 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly") d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly") So, yes, let's get this fixed up. I'd say first fix the missing sock update in the compat path (so it can be CCed stable). Then fix the missing sock update in pidfd_getfd() (so it can be CCed stable), then write the helper with a refactoring of scm_detach_fds(), scm_detach_fds_compat(), and pidfd_getfd(). And then add the addfd seccomp user_notif ioctl cmd. -- Kees Cook
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Sat, May 30, 2020 at 05:17:24AM +0200, Jann Horn wrote: > On Sat, May 30, 2020 at 4:43 AM Kees Cook wrote: > > I mean, yes, that's certainly better, but it just seems a shame that > > everyone has to do the get_unused/put_unused dance just because of how > > SCM_RIGHTS does this weird put_user() in the middle. > > > > Can anyone clarify the expected failure mode from SCM_RIGHTS? Can we > > move the put_user() after instead? > > Honestly, I think trying to remove file descriptors and such after > -EFAULT is a waste of time. If userspace runs into -EFAULT, userspace > is beyond saving and can't really do much other than exit immediately. > There are a bunch of places that will change state and then throw > -EFAULT at the end if userspace supplied an invalid address, because > trying to hold locks across userspace accesses just in case userspace > supplied a bogus address is kinda silly (and often borderline > impossible). Logically, I agree. I'm more worried about the behavioral change -- if we don't remove the fd on failure, the fd is installed with no indication to the process that it exists (it won't know the close it -- if it keeps running -- and it may survive across exec). Before, it never entered the file table. > You can actually see that even scm_detach_fds() currently just > silently swallows errors if writing some header fields fails at the > end. Yeah, and it's a corner case. But it should be possible (trivial, even) to clean up on failure to retain the original results. -- Kees Cook
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
> > I mean, yes, that's certainly better, but it just seems a shame that > everyone has to do the get_unused/put_unused dance just because of how > SCM_RIGHTS does this weird put_user() in the middle. > > Can anyone clarify the expected failure mode from SCM_RIGHTS? Can we > move the put_user() after instead? I think cleanup would just be: > replace_fd(fd, NULL, 0) > > So: > > (updated to skip sock updates on failure; thank you Christian!) > > int file_receive(int fd, unsigned long flags, struct file *file) > { > struct socket *sock; > int ret; > > ret = security_file_receive(file); > if (ret) > return ret; > > /* Install the file. */ > if (fd == -1) { > ret = get_unused_fd_flags(flags); > if (ret >= 0) > fd_install(ret, get_file(file)); > } else { > ret = replace_fd(fd, file, flags); > } > > /* Bump the sock usage counts. */ > if (ret >= 0) { > sock = sock_from_file(addfd->file, ); > if (sock) { > sock_update_netprioidx(>sk->sk_cgrp_data); > sock_update_classid(>sk->sk_cgrp_data); > } > } > > return ret; > } > > scm_detach_fds() > ... > for (i=0, cmfptr=(__force int __user *)CMSG_DATA(cm); i i++, cmfptr++) > { > int new_fd; > > err = file_receive(-1, MSG_CMSG_CLOEXEC & msg->msg_flags > ? O_CLOEXEC : 0, fp[i]); > if (err < 0) > break; > new_fd = err; > Isn't the "right" way to do this to allocate a bunch of file descriptors, and fill up the user buffer with them, and then install the files? This seems to like half-install the file descriptors and then error out. I know that's the current behaviour, but that seems like a bad idea. Do we really want to perpetuate this half-broken state? I guess that some userspace programs could be depending on this -- and their recovery semantics could rely on this. I mean this is 10+ year old code. > err = put_user(err, cmfptr); > if (err) { > /* >* If we can't notify userspace that it got the >* fd, we need to unwind and remove it again. >*/ > replace_fd(new_fd, NULL, 0); > break; > } > } > ... > > > > -- > Kees Cook
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Sat, May 30, 2020 at 4:43 AM Kees Cook wrote: > I mean, yes, that's certainly better, but it just seems a shame that > everyone has to do the get_unused/put_unused dance just because of how > SCM_RIGHTS does this weird put_user() in the middle. > > Can anyone clarify the expected failure mode from SCM_RIGHTS? Can we > move the put_user() after instead? Honestly, I think trying to remove file descriptors and such after -EFAULT is a waste of time. If userspace runs into -EFAULT, userspace is beyond saving and can't really do much other than exit immediately. There are a bunch of places that will change state and then throw -EFAULT at the end if userspace supplied an invalid address, because trying to hold locks across userspace accesses just in case userspace supplied a bogus address is kinda silly (and often borderline impossible). You can actually see that even scm_detach_fds() currently just silently swallows errors if writing some header fields fails at the end.
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Sat, May 30, 2020 at 01:10:55AM +, Sargun Dhillon wrote: > // And then SCM reads: > for (i=0, cmfptr=(__force int __user *)CMSG_DATA(cm); ii++, cmfptr++) > { > int new_fd; > err = get_unused_fd_flags(MSG_CMSG_CLOEXEC & msg->msg_flags > ? O_CLOEXEC : 0); > if (err < 0) > break; > new_fd = err; > err = put_user(new_fd, cmfptr); > if (err) { > put_unused_fd(new_fd); > break; > } > > err = file_receive(new_fd, fp[i]); > if (err) { > put_unused_fd(new_fd); > break; > } > } > > And our code reads: > > > static void seccomp_handle_addfd(struct seccomp_kaddfd *addfd) > { > int ret, err; > > /* >* Remove the notification, and reset the list pointers, indicating >* that it has been handled. >*/ > list_del_init(>list); > > if (addfd->fd == -1) { > ret = get_unused_fd_flags(addfd->flags); > if (ret < 0) > goto err; > > err = file_receive(ret, addfd->file); > if (err) { > put_unused_fd(ret); > ret = err; > } > } else { > ret = file_receive_replace(addfd->fd, addfd->flags, > addfd->file); > } > > err: > addfd->ret = ret; > complete(>completion); > } > > > And the pidfd getfd code reads: > > static int pidfd_getfd(struct pid *pid, int fd) > { > struct task_struct *task; > struct file *file; > int ret, err; > > task = get_pid_task(pid, PIDTYPE_PID); > if (!task) > return -ESRCH; > > file = __pidfd_fget(task, fd); > put_task_struct(task); > if (IS_ERR(file)) > return PTR_ERR(file); > > ret = get_unused_fd_flags(O_CLOEXEC); > if (ret >= 0) { > err = file_receive(ret, file); > if (err) { > put_unused_fd(ret); > ret = err; > } > } > > fput(file); > return ret; > } I mean, yes, that's certainly better, but it just seems a shame that everyone has to do the get_unused/put_unused dance just because of how SCM_RIGHTS does this weird put_user() in the middle. Can anyone clarify the expected failure mode from SCM_RIGHTS? Can we move the put_user() after instead? I think cleanup would just be: replace_fd(fd, NULL, 0) So: (updated to skip sock updates on failure; thank you Christian!) int file_receive(int fd, unsigned long flags, struct file *file) { struct socket *sock; int ret; ret = security_file_receive(file); if (ret) return ret; /* Install the file. */ if (fd == -1) { ret = get_unused_fd_flags(flags); if (ret >= 0) fd_install(ret, get_file(file)); } else { ret = replace_fd(fd, file, flags); } /* Bump the sock usage counts. */ if (ret >= 0) { sock = sock_from_file(addfd->file, ); if (sock) { sock_update_netprioidx(>sk->sk_cgrp_data); sock_update_classid(>sk->sk_cgrp_data); } } return ret; } scm_detach_fds() ... for (i=0, cmfptr=(__force int __user *)CMSG_DATA(cm); imsg_flags ? O_CLOEXEC : 0, fp[i]); if (err < 0) break; new_fd = err; err = put_user(err, cmfptr); if (err) { /* * If we can't notify userspace that it got the * fd, we need to unwind and remove it again. */ replace_fd(new_fd, NULL, 0); break; } } ... -- Kees Cook
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Fri, May 29, 2020 at 12:31:37AM -0700, Kees Cook wrote: > On Thu, May 28, 2020 at 04:08:57AM -0700, Sargun Dhillon wrote: > > This adds a seccomp notifier ioctl which allows for the listener to "add" > > file descriptors to a process which originated a seccomp user > > notification. This allows calls like mount, and mknod to be "implemented", > > as the return value, and the arguments are data in memory. On the other > > hand, calls like connect can be "implemented" using pidfd_getfd. > > > > Unfortunately, there are calls which return file descriptors, like > > open, which are vulnerable to TOC-TOU attacks, and require that the > > more privileged supervisor can inspect the argument, and perform the > > syscall on behalf of the process generating the notifiation. This > > allows the file descriptor generated from that open call to be > > returned to the calling process. > > > > In addition, there is funcitonality to allow for replacement of > > specific file descriptors, following dup2-like semantics. > > > > Signed-off-by: Sargun Dhillon > > Suggested-by: Matt Denton > > This looks mostly really clean. When I've got more brain tomorrow I want to > double-check the locking, but I think the use of notify_lock and being > in the ioctl fully protects everything from any use-after-free-like > issues. > > Notes below... > > > +/* valid flags for seccomp_notif_addfd */ > > +#define SECCOMP_ADDFD_FLAG_SETFD (1UL << 0) /* Specify remote fd */ > > Nit: please use BIT() > > > @@ -735,6 +770,41 @@ static u64 seccomp_next_notify_id(struct > > seccomp_filter *filter) > > return filter->notif->next_id++; > > } > > > > +static void seccomp_handle_addfd(struct seccomp_kaddfd *addfd) > > +{ > > + struct socket *sock; > > + int ret, err; > > + > > + /* > > +* Remove the notification, and reset the list pointers, indicating > > +* that it has been handled. > > +*/ > > + list_del_init(>list); > > + > > + ret = security_file_receive(addfd->file); > > + if (ret) > > + goto out; > > + > > + if (addfd->fd == -1) { > > + ret = get_unused_fd_flags(addfd->flags); > > + if (ret >= 0) > > + fd_install(ret, get_file(addfd->file)); > > + } else { > > + ret = replace_fd(addfd->fd, addfd->file, addfd->flags); > > + } > > + > > + /* These are the semantics from copying FDs via SCM_RIGHTS */ > > + sock = sock_from_file(addfd->file, ); > > + if (sock) { > > + sock_update_netprioidx(>sk->sk_cgrp_data); > > + sock_update_classid(>sk->sk_cgrp_data); > > + } > > This made my eye twitch. ;) I see this is borrowed from > scm_detach_fds()... this really feels like the kind of thing that will > quickly go out of sync. I think this "receive an fd" logic needs to be > lifted out of scm_detach_fds() so it and seccomp can share it. I'm not > sure how to parameterize it quite right, though. Perhaps: > > int file_receive(int fd, unsigned long flags, struct file *file) > { > struct socket *sock; > int ret; > > ret = security_file_receive(file); > if (ret) > return ret; > > /* Install the file. */ > if (fd == -1) { > ret = get_unused_fd_flags(flags); > if (ret >= 0) > fd_install(ret, get_file(file)); > } else { > ret = replace_fd(fd, file, flags); > } > > /* Bump the usage count. */ > sock = sock_from_file(addfd->file, ); > if (sock) { > sock_update_netprioidx(>sk->sk_cgrp_data); > sock_update_classid(>sk->sk_cgrp_data); > } > > return ret; > } > > > static void seccomp_handle_addfd(struct seccomp_kaddfd *addfd) > { > /* >* Remove the notification, and reset the list pointers, indicating >* that it has been handled. >*/ > list_del_init(>list); > addfd->ret = file_receive(addfd->fd, addfd->flags, addfd->file); > complete(>completion); > } > > scm_detach_fds() > ... > for (i=0, cmfptr=(__force int __user *)CMSG_DATA(cm); i i++, cmfptr++) > { > > err = file_receive(-1, MSG_CMSG_CLOEXEC & msg->msg_flags > ? O_CLOEXEC : 0, fp[i]); > if (err < 0) > break; > err = put_user(err, cmfptr); > if (err) > /* wat */ > } > ... > > I'm not sure on the put_user() failure, though. We could check early > for faults with a put_user(0, cmfptr) before the file_receive() call, or > we could just ignore it? I'm not sure what SCM does here. I guess > worst-case: > > int file_receive(int fd, unsigned long flags, struct file *file, >int __user *fdptr) > { > ... > ret = get_unused_fd_flags(flags); > if (ret >= 0) { > if (cmfptr) { >
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Fri, May 29, 2020 at 6:31 AM Christian Brauner wrote: > > > > + /* Check if we were woken up by a addfd message */ > > > + addfd = list_first_entry_or_null(, > > > +struct seccomp_kaddfd, list); > > > + if (addfd && n.state != SECCOMP_NOTIFY_REPLIED) { > > > + seccomp_handle_addfd(addfd); > > > + mutex_unlock(>notify_lock); > > > + goto wait; > > > + } > > > ret = n.val; > > > err = n.error; > > > flags = n.flags; > > > } > > > > > > + /* If there were any pending addfd calls, clear them out */ > > > + list_for_each_entry_safe(addfd, tmp, , list) { > > > + /* The process went away before we got a chance to handle it > > > */ > > > + addfd->ret = -ESRCH; > > > + list_del_init(>list); > > > + complete(>completion); > > > + } > > I forgot to ask this in my first review before, don't you need a > complete(>completion) call in seccomp_notify_release() before > freeing it? > When complete(>ready) is called in seccomp_notify_release, subsequently the notifier (seccomp_do_user_notification) will be woken up and it'll fail this check: if (addfd && n.state != SECCOMP_NOTIFY_REPLIED) Falling through to: /* If there were any pending addfd calls, clear them out */ list_for_each_entry_safe(addfd, tmp, , list) { /* The process went away before we got a chance to handle it */ addfd->ret = -ESRCH; list_del_init(>list); complete(>completion); } Although ESRCH isn't the "right" response, this fall through behaviour should work.
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Fri, May 29, 2020 at 12:32:55PM +0200, Christian Brauner wrote: > On Thu, May 28, 2020 at 04:08:57AM -0700, Sargun Dhillon wrote: > > This adds a seccomp notifier ioctl which allows for the listener to "add" > > file descriptors to a process which originated a seccomp user > > notification. This allows calls like mount, and mknod to be "implemented", > > as the return value, and the arguments are data in memory. On the other > > hand, calls like connect can be "implemented" using pidfd_getfd. > > > > Unfortunately, there are calls which return file descriptors, like > > open, which are vulnerable to TOC-TOU attacks, and require that the > > more privileged supervisor can inspect the argument, and perform the > > syscall on behalf of the process generating the notifiation. This > > allows the file descriptor generated from that open call to be > > returned to the calling process. > > > > In addition, there is funcitonality to allow for replacement of > > specific file descriptors, following dup2-like semantics. > > > > Signed-off-by: Sargun Dhillon > > Suggested-by: Matt Denton > > Cc: Kees Cook , > > Cc: Jann Horn , > > Cc: Robert Sesek , > > Cc: Chris Palmer > > Cc: Christian Brauner > > Cc: Tycho Andersen > > --- > > include/uapi/linux/seccomp.h | 25 + > > kernel/seccomp.c | 182 ++- > > 2 files changed, 206 insertions(+), 1 deletion(-) > > > > diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h > > index c1735455bc53..c7bfe898e7a0 100644 > > --- a/include/uapi/linux/seccomp.h > > +++ b/include/uapi/linux/seccomp.h > > @@ -113,6 +113,27 @@ struct seccomp_notif_resp { > > __u32 flags; > > }; > > > > +/* valid flags for seccomp_notif_addfd */ > > +#define SECCOMP_ADDFD_FLAG_SETFD (1UL << 0) /* Specify remote fd */ > > + > > +/** > > + * struct seccomp_notif_addfd > > + * @size: The size of the seccomp_notif_addfd datastructure > > + * @id: The ID of the seccomp notification > > + * @flags: SECCOMP_ADDFD_FLAG_* > > + * @srcfd: The local fd number > > + * @newfd: Optional remote FD number if SETFD option is set, otherwise 0. > > + * @newfd_flags: Flags the remote FD should be allocated under > > + */ > > +struct seccomp_notif_addfd { > > + __u64 size; > > + __u64 id; > > + __u64 flags; > > + __u32 srcfd; > > + __u32 newfd; > > + __u32 newfd_flags; > > +}; > > This doesn't correspond to how we usually pad structs, I think: > > struct seccomp_notif_addfd { > __u64 size; /* 0 8 */ > __u64 id; /* 8 8 */ > __u64 flags;/*16 8 */ > __u32 srcfd;/*24 4 */ > __u32 newfd;/*28 4 */ > __u32 newfd_flags; /*32 4 */ > > /* size: 40, cachelines: 1, members: 6 */ > /* padding: 4 */ > /* last cacheline: 40 bytes */ > }; > > You can either use the packed attribute or change the flags member from > u64 to u32: > > struct seccomp_notif_addfd { > __u64 size; > __u64 id; > __u32 flags; > __u32 srcfd; > __u32 newfd; > __u32 newfd_flags; > } > > ^^ This seems nicer to me and gets rid of the 4 byte padding. If we run > out of 32 flags we'll just add a second flag argument to the struct. > > > + > > #define SECCOMP_IOC_MAGIC '!' > > #define SECCOMP_IO(nr) _IO(SECCOMP_IOC_MAGIC, nr) > > #define SECCOMP_IOR(nr, type) _IOR(SECCOMP_IOC_MAGIC, nr, > > type) > > @@ -124,4 +145,8 @@ struct seccomp_notif_resp { > > #define SECCOMP_IOCTL_NOTIF_SEND SECCOMP_IOWR(1, \ > > struct seccomp_notif_resp) > > #define SECCOMP_IOCTL_NOTIF_ID_VALID SECCOMP_IOR(2, __u64) > > +/* On success, the return value is the remote process's added fd number */ > > +#define SECCOMP_IOCTL_NOTIF_ADDFD SECCOMP_IOR(3, \ > > + struct seccomp_notif_addfd) > > + > > #endif /* _UAPI_LINUX_SECCOMP_H */ > > diff --git a/kernel/seccomp.c b/kernel/seccomp.c > > index 94ae4c7502cc..02b9ba1fbee0 100644 > > --- a/kernel/seccomp.c > > +++ b/kernel/seccomp.c > > @@ -41,6 +41,9 @@ > > #include > > #include > > #include > > +#include > > +#include > > +#include > > > > enum notify_state { > > SECCOMP_NOTIFY_INIT, > > @@ -77,10 +80,42 @@ struct seccomp_knotif { > > long val; > > u32 flags; > > > > - /* Signals when this has entered SECCOMP_NOTIFY_REPLIED */ > > + /* > > +* Signals when this has changed states, such as the listener > > +* dying, a new seccomp addfd message, or changing to REPLIED > > +*/ > > struct completion ready; > > > > struct list_head list; > > + > > + /* outstanding addfd requests */ > > +
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Thu, May 28, 2020 at 04:08:57AM -0700, Sargun Dhillon wrote: > This adds a seccomp notifier ioctl which allows for the listener to "add" > file descriptors to a process which originated a seccomp user > notification. This allows calls like mount, and mknod to be "implemented", > as the return value, and the arguments are data in memory. On the other > hand, calls like connect can be "implemented" using pidfd_getfd. > > Unfortunately, there are calls which return file descriptors, like > open, which are vulnerable to TOC-TOU attacks, and require that the > more privileged supervisor can inspect the argument, and perform the > syscall on behalf of the process generating the notifiation. This > allows the file descriptor generated from that open call to be > returned to the calling process. > > In addition, there is funcitonality to allow for replacement of > specific file descriptors, following dup2-like semantics. > > Signed-off-by: Sargun Dhillon > Suggested-by: Matt Denton > Cc: Kees Cook , > Cc: Jann Horn , > Cc: Robert Sesek , > Cc: Chris Palmer > Cc: Christian Brauner > Cc: Tycho Andersen > --- > include/uapi/linux/seccomp.h | 25 + > kernel/seccomp.c | 182 ++- > 2 files changed, 206 insertions(+), 1 deletion(-) > > diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h > index c1735455bc53..c7bfe898e7a0 100644 > --- a/include/uapi/linux/seccomp.h > +++ b/include/uapi/linux/seccomp.h > @@ -113,6 +113,27 @@ struct seccomp_notif_resp { > __u32 flags; > }; > > +/* valid flags for seccomp_notif_addfd */ > +#define SECCOMP_ADDFD_FLAG_SETFD (1UL << 0) /* Specify remote fd */ > + > +/** > + * struct seccomp_notif_addfd > + * @size: The size of the seccomp_notif_addfd datastructure > + * @id: The ID of the seccomp notification > + * @flags: SECCOMP_ADDFD_FLAG_* > + * @srcfd: The local fd number > + * @newfd: Optional remote FD number if SETFD option is set, otherwise 0. > + * @newfd_flags: Flags the remote FD should be allocated under > + */ > +struct seccomp_notif_addfd { > + __u64 size; > + __u64 id; > + __u64 flags; > + __u32 srcfd; > + __u32 newfd; > + __u32 newfd_flags; > +}; This doesn't correspond to how we usually pad structs, I think: struct seccomp_notif_addfd { __u64 size; /* 0 8 */ __u64 id; /* 8 8 */ __u64 flags;/*16 8 */ __u32 srcfd;/*24 4 */ __u32 newfd;/*28 4 */ __u32 newfd_flags; /*32 4 */ /* size: 40, cachelines: 1, members: 6 */ /* padding: 4 */ /* last cacheline: 40 bytes */ }; You can either use the packed attribute or change the flags member from u64 to u32: struct seccomp_notif_addfd { __u64 size; __u64 id; __u32 flags; __u32 srcfd; __u32 newfd; __u32 newfd_flags; } ^^ This seems nicer to me and gets rid of the 4 byte padding. If we run out of 32 flags we'll just add a second flag argument to the struct. > + > #define SECCOMP_IOC_MAGIC'!' > #define SECCOMP_IO(nr) _IO(SECCOMP_IOC_MAGIC, nr) > #define SECCOMP_IOR(nr, type)_IOR(SECCOMP_IOC_MAGIC, nr, > type) > @@ -124,4 +145,8 @@ struct seccomp_notif_resp { > #define SECCOMP_IOCTL_NOTIF_SEND SECCOMP_IOWR(1, \ > struct seccomp_notif_resp) > #define SECCOMP_IOCTL_NOTIF_ID_VALID SECCOMP_IOR(2, __u64) > +/* On success, the return value is the remote process's added fd number */ > +#define SECCOMP_IOCTL_NOTIF_ADDFDSECCOMP_IOR(3, \ > + struct seccomp_notif_addfd) > + > #endif /* _UAPI_LINUX_SECCOMP_H */ > diff --git a/kernel/seccomp.c b/kernel/seccomp.c > index 94ae4c7502cc..02b9ba1fbee0 100644 > --- a/kernel/seccomp.c > +++ b/kernel/seccomp.c > @@ -41,6 +41,9 @@ > #include > #include > #include > +#include > +#include > +#include > > enum notify_state { > SECCOMP_NOTIFY_INIT, > @@ -77,10 +80,42 @@ struct seccomp_knotif { > long val; > u32 flags; > > - /* Signals when this has entered SECCOMP_NOTIFY_REPLIED */ > + /* > + * Signals when this has changed states, such as the listener > + * dying, a new seccomp addfd message, or changing to REPLIED > + */ > struct completion ready; > > struct list_head list; > + > + /* outstanding addfd requests */ > + struct list_head addfd; > +}; > + > +/** > + * struct seccomp_kaddfd - container for seccomp_addfd ioctl messages > + * > + * @file: A reference to the file to install in the other task > + * @fd: The fd number to install it at. If the fd number is -1, it means the > + *
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
Sargun Dhillon writes: > This adds a seccomp notifier ioctl which allows for the listener to "add" > file descriptors to a process which originated a seccomp user > notification. This allows calls like mount, and mknod to be "implemented", > as the return value, and the arguments are data in memory. On the other > hand, calls like connect can be "implemented" using pidfd_getfd. > > Unfortunately, there are calls which return file descriptors, like > open, which are vulnerable to TOC-TOU attacks, and require that the > more privileged supervisor can inspect the argument, and perform the > syscall on behalf of the process generating the notifiation. This > allows the file descriptor generated from that open call to be > returned to the calling process. > > In addition, there is funcitonality to allow for replacement of > specific file descriptors, following dup2-like semantics. > > Signed-off-by: Sargun Dhillon > Suggested-by: Matt Denton > Cc: Kees Cook , > Cc: Jann Horn , > Cc: Robert Sesek , > Cc: Chris Palmer > Cc: Christian Brauner > Cc: Tycho Andersen > --- Thanks, this is a really useful feature. Tested-by: Giuseppe Scrivano
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Fri, May 29, 2020 at 09:38:28AM +0200, Christian Brauner wrote: > On Fri, May 29, 2020 at 12:31:37AM -0700, Kees Cook wrote: > > Nit: please use BIT() > > Fwiw, I don't think we can use BIT() in uapi headers, see: Argh. How many times do I get to trip over this? :P Thank you; yes, please ignore my suggestion. :) -- Kees Cook
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Fri, May 29, 2020 at 12:31:37AM -0700, Kees Cook wrote: > On Thu, May 28, 2020 at 04:08:57AM -0700, Sargun Dhillon wrote: > > This adds a seccomp notifier ioctl which allows for the listener to "add" > > file descriptors to a process which originated a seccomp user > > notification. This allows calls like mount, and mknod to be "implemented", > > as the return value, and the arguments are data in memory. On the other > > hand, calls like connect can be "implemented" using pidfd_getfd. > > > > Unfortunately, there are calls which return file descriptors, like > > open, which are vulnerable to TOC-TOU attacks, and require that the > > more privileged supervisor can inspect the argument, and perform the > > syscall on behalf of the process generating the notifiation. This > > allows the file descriptor generated from that open call to be > > returned to the calling process. > > > > In addition, there is funcitonality to allow for replacement of > > specific file descriptors, following dup2-like semantics. > > > > Signed-off-by: Sargun Dhillon > > Suggested-by: Matt Denton > > This looks mostly really clean. When I've got more brain tomorrow I want to > double-check the locking, but I think the use of notify_lock and being > in the ioctl fully protects everything from any use-after-free-like > issues. > > Notes below... > > > +/* valid flags for seccomp_notif_addfd */ > > +#define SECCOMP_ADDFD_FLAG_SETFD (1UL << 0) /* Specify remote fd */ > > Nit: please use BIT() Fwiw, I don't think we can use BIT() in uapi headers, see: commit 23b2c96fad21886c53f5e1a4ffedd45ddd2e85ba Author: Christian Brauner Date: Thu Oct 24 23:25:39 2019 +0200 seccomp: rework define for SECCOMP_USER_NOTIF_FLAG_CONTINUE Switch from BIT(0) to (1UL << 0). > > > @@ -735,6 +770,41 @@ static u64 seccomp_next_notify_id(struct > > seccomp_filter *filter) > > return filter->notif->next_id++; > > } > > > > +static void seccomp_handle_addfd(struct seccomp_kaddfd *addfd) > > +{ > > + struct socket *sock; > > + int ret, err; > > + > > + /* > > +* Remove the notification, and reset the list pointers, indicating > > +* that it has been handled. > > +*/ > > + list_del_init(>list); > > + > > + ret = security_file_receive(addfd->file); > > + if (ret) > > + goto out; > > + > > + if (addfd->fd == -1) { > > + ret = get_unused_fd_flags(addfd->flags); > > + if (ret >= 0) > > + fd_install(ret, get_file(addfd->file)); > > + } else { > > + ret = replace_fd(addfd->fd, addfd->file, addfd->flags); > > + } > > + > > + /* These are the semantics from copying FDs via SCM_RIGHTS */ > > + sock = sock_from_file(addfd->file, ); > > + if (sock) { > > + sock_update_netprioidx(>sk->sk_cgrp_data); > > + sock_update_classid(>sk->sk_cgrp_data); > > + } > > This made my eye twitch. ;) I see this is borrowed from > scm_detach_fds()... this really feels like the kind of thing that will > quickly go out of sync. I think this "receive an fd" logic needs to be > lifted out of scm_detach_fds() so it and seccomp can share it. I'm not > sure how to parameterize it quite right, though. Perhaps: > > int file_receive(int fd, unsigned long flags, struct file *file) > { > struct socket *sock; > int ret; > > ret = security_file_receive(file); > if (ret) > return ret; > > /* Install the file. */ > if (fd == -1) { > ret = get_unused_fd_flags(flags); > if (ret >= 0) > fd_install(ret, get_file(file)); > } else { > ret = replace_fd(fd, file, flags); > } > > /* Bump the usage count. */ > sock = sock_from_file(addfd->file, ); > if (sock) { > sock_update_netprioidx(>sk->sk_cgrp_data); > sock_update_classid(>sk->sk_cgrp_data); > } > > return ret; > } > > > static void seccomp_handle_addfd(struct seccomp_kaddfd *addfd) > { > /* >* Remove the notification, and reset the list pointers, indicating >* that it has been handled. >*/ > list_del_init(>list); > addfd->ret = file_receive(addfd->fd, addfd->flags, addfd->file); > complete(>completion); > } > > scm_detach_fds() > ... > for (i=0, cmfptr=(__force int __user *)CMSG_DATA(cm); i i++, cmfptr++) > { > > err = file_receive(-1, MSG_CMSG_CLOEXEC & msg->msg_flags > ? O_CLOEXEC : 0, fp[i]); > if (err < 0) > break; > err = put_user(err, cmfptr); > if (err) > /* wat */ > } > ... > > I'm not sure on the put_user() failure, though. We could check early > for faults with a put_user(0, cmfptr) before the file_receive() call, or > we could just ignore it? I'm not sure what SCM does here. I guess
Re: [PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
On Thu, May 28, 2020 at 04:08:57AM -0700, Sargun Dhillon wrote: > This adds a seccomp notifier ioctl which allows for the listener to "add" > file descriptors to a process which originated a seccomp user > notification. This allows calls like mount, and mknod to be "implemented", > as the return value, and the arguments are data in memory. On the other > hand, calls like connect can be "implemented" using pidfd_getfd. > > Unfortunately, there are calls which return file descriptors, like > open, which are vulnerable to TOC-TOU attacks, and require that the > more privileged supervisor can inspect the argument, and perform the > syscall on behalf of the process generating the notifiation. This > allows the file descriptor generated from that open call to be > returned to the calling process. > > In addition, there is funcitonality to allow for replacement of > specific file descriptors, following dup2-like semantics. > > Signed-off-by: Sargun Dhillon > Suggested-by: Matt Denton This looks mostly really clean. When I've got more brain tomorrow I want to double-check the locking, but I think the use of notify_lock and being in the ioctl fully protects everything from any use-after-free-like issues. Notes below... > +/* valid flags for seccomp_notif_addfd */ > +#define SECCOMP_ADDFD_FLAG_SETFD (1UL << 0) /* Specify remote fd */ Nit: please use BIT() > @@ -735,6 +770,41 @@ static u64 seccomp_next_notify_id(struct seccomp_filter > *filter) > return filter->notif->next_id++; > } > > +static void seccomp_handle_addfd(struct seccomp_kaddfd *addfd) > +{ > + struct socket *sock; > + int ret, err; > + > + /* > + * Remove the notification, and reset the list pointers, indicating > + * that it has been handled. > + */ > + list_del_init(>list); > + > + ret = security_file_receive(addfd->file); > + if (ret) > + goto out; > + > + if (addfd->fd == -1) { > + ret = get_unused_fd_flags(addfd->flags); > + if (ret >= 0) > + fd_install(ret, get_file(addfd->file)); > + } else { > + ret = replace_fd(addfd->fd, addfd->file, addfd->flags); > + } > + > + /* These are the semantics from copying FDs via SCM_RIGHTS */ > + sock = sock_from_file(addfd->file, ); > + if (sock) { > + sock_update_netprioidx(>sk->sk_cgrp_data); > + sock_update_classid(>sk->sk_cgrp_data); > + } This made my eye twitch. ;) I see this is borrowed from scm_detach_fds()... this really feels like the kind of thing that will quickly go out of sync. I think this "receive an fd" logic needs to be lifted out of scm_detach_fds() so it and seccomp can share it. I'm not sure how to parameterize it quite right, though. Perhaps: int file_receive(int fd, unsigned long flags, struct file *file) { struct socket *sock; int ret; ret = security_file_receive(file); if (ret) return ret; /* Install the file. */ if (fd == -1) { ret = get_unused_fd_flags(flags); if (ret >= 0) fd_install(ret, get_file(file)); } else { ret = replace_fd(fd, file, flags); } /* Bump the usage count. */ sock = sock_from_file(addfd->file, ); if (sock) { sock_update_netprioidx(>sk->sk_cgrp_data); sock_update_classid(>sk->sk_cgrp_data); } return ret; } static void seccomp_handle_addfd(struct seccomp_kaddfd *addfd) { /* * Remove the notification, and reset the list pointers, indicating * that it has been handled. */ list_del_init(>list); addfd->ret = file_receive(addfd->fd, addfd->flags, addfd->file); complete(>completion); } scm_detach_fds() ... for (i=0, cmfptr=(__force int __user *)CMSG_DATA(cm); imsg_flags ? O_CLOEXEC : 0, fp[i]); if (err < 0) break; err = put_user(err, cmfptr); if (err) /* wat */ } ... I'm not sure on the put_user() failure, though. We could check early for faults with a put_user(0, cmfptr) before the file_receive() call, or we could just ignore it? I'm not sure what SCM does here. I guess worst-case: int file_receive(int fd, unsigned long flags, struct file *file, int __user *fdptr) { ... ret = get_unused_fd_flags(flags); if (ret >= 0) { if (cmfptr) { int err; err = put_user(ret, cmfptr); if (err) { put_unused_fd(ret); return err; } }
[PATCH v2 2/3] seccomp: Introduce addfd ioctl to seccomp user notifier
This adds a seccomp notifier ioctl which allows for the listener to "add" file descriptors to a process which originated a seccomp user notification. This allows calls like mount, and mknod to be "implemented", as the return value, and the arguments are data in memory. On the other hand, calls like connect can be "implemented" using pidfd_getfd. Unfortunately, there are calls which return file descriptors, like open, which are vulnerable to TOC-TOU attacks, and require that the more privileged supervisor can inspect the argument, and perform the syscall on behalf of the process generating the notifiation. This allows the file descriptor generated from that open call to be returned to the calling process. In addition, there is funcitonality to allow for replacement of specific file descriptors, following dup2-like semantics. Signed-off-by: Sargun Dhillon Suggested-by: Matt Denton Cc: Kees Cook , Cc: Jann Horn , Cc: Robert Sesek , Cc: Chris Palmer Cc: Christian Brauner Cc: Tycho Andersen --- include/uapi/linux/seccomp.h | 25 + kernel/seccomp.c | 182 ++- 2 files changed, 206 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h index c1735455bc53..c7bfe898e7a0 100644 --- a/include/uapi/linux/seccomp.h +++ b/include/uapi/linux/seccomp.h @@ -113,6 +113,27 @@ struct seccomp_notif_resp { __u32 flags; }; +/* valid flags for seccomp_notif_addfd */ +#define SECCOMP_ADDFD_FLAG_SETFD (1UL << 0) /* Specify remote fd */ + +/** + * struct seccomp_notif_addfd + * @size: The size of the seccomp_notif_addfd datastructure + * @id: The ID of the seccomp notification + * @flags: SECCOMP_ADDFD_FLAG_* + * @srcfd: The local fd number + * @newfd: Optional remote FD number if SETFD option is set, otherwise 0. + * @newfd_flags: Flags the remote FD should be allocated under + */ +struct seccomp_notif_addfd { + __u64 size; + __u64 id; + __u64 flags; + __u32 srcfd; + __u32 newfd; + __u32 newfd_flags; +}; + #define SECCOMP_IOC_MAGIC '!' #define SECCOMP_IO(nr) _IO(SECCOMP_IOC_MAGIC, nr) #define SECCOMP_IOR(nr, type) _IOR(SECCOMP_IOC_MAGIC, nr, type) @@ -124,4 +145,8 @@ struct seccomp_notif_resp { #define SECCOMP_IOCTL_NOTIF_SEND SECCOMP_IOWR(1, \ struct seccomp_notif_resp) #define SECCOMP_IOCTL_NOTIF_ID_VALID SECCOMP_IOR(2, __u64) +/* On success, the return value is the remote process's added fd number */ +#define SECCOMP_IOCTL_NOTIF_ADDFD SECCOMP_IOR(3, \ + struct seccomp_notif_addfd) + #endif /* _UAPI_LINUX_SECCOMP_H */ diff --git a/kernel/seccomp.c b/kernel/seccomp.c index 94ae4c7502cc..02b9ba1fbee0 100644 --- a/kernel/seccomp.c +++ b/kernel/seccomp.c @@ -41,6 +41,9 @@ #include #include #include +#include +#include +#include enum notify_state { SECCOMP_NOTIFY_INIT, @@ -77,10 +80,42 @@ struct seccomp_knotif { long val; u32 flags; - /* Signals when this has entered SECCOMP_NOTIFY_REPLIED */ + /* +* Signals when this has changed states, such as the listener +* dying, a new seccomp addfd message, or changing to REPLIED +*/ struct completion ready; struct list_head list; + + /* outstanding addfd requests */ + struct list_head addfd; +}; + +/** + * struct seccomp_kaddfd - container for seccomp_addfd ioctl messages + * + * @file: A reference to the file to install in the other task + * @fd: The fd number to install it at. If the fd number is -1, it means the + * installing process should allocate the fd as normal. + * @flags: The flags for the new file descriptor. At the moment, only O_CLOEXEC + * is allowed. + * @ret: The return value of the installing process. It is set to the fd num + * upon success (>= 0). + * @completion: Indicates that the installing process has completed fd + * installation, or gone away (either due to successful + * reply, or signal) + * + */ +struct seccomp_kaddfd { + struct file *file; + int fd; + unsigned int flags; + + /* To only be set on reply */ + int ret; + struct completion completion; + struct list_head list; }; /** @@ -735,6 +770,41 @@ static u64 seccomp_next_notify_id(struct seccomp_filter *filter) return filter->notif->next_id++; } +static void seccomp_handle_addfd(struct seccomp_kaddfd *addfd) +{ + struct socket *sock; + int ret, err; + + /* +* Remove the notification, and reset the list pointers, indicating +* that it has been handled. +*/ + list_del_init(>list); + + ret = security_file_receive(addfd->file); + if (ret) + goto out; + + if (addfd->fd == -1) { + ret =