Re: [apparmor] [PATCH 1/2] fs/exec: Explicitly unshare fs_struct on exec

2022-10-18 Thread Jann Horn
On Tue, Oct 18, 2022 at 9:09 AM Kees Cook  wrote:
> On Fri, Oct 14, 2022 at 05:35:26PM +0200, Jann Horn wrote:
> > On Fri, Oct 14, 2022 at 5:18 AM Andy Lutomirski  wrote:
> > > On Thu, Oct 6, 2022, at 7:13 AM, Jann Horn wrote:
> > > > On Thu, Oct 6, 2022 at 11:05 AM Christian Brauner  
> > > > wrote:
> > > >> On Thu, Oct 06, 2022 at 01:27:34AM -0700, Kees Cook wrote:
> > > >> > The check_unsafe_exec() counting of n_fs would not add up under a 
> > > >> > heavily
> > > >> > threaded process trying to perform a suid exec, causing the suid 
> > > >> > portion
> > > >> > to fail. This counting error appears to be unneeded, but to catch any
> > > >> > possible conditions, explicitly unshare fs_struct on exec, if it 
> > > >> > ends up
> > > >>
> > > >> Isn't this a potential uapi break? Afaict, before this change a call to
> > > >> clone{3}(CLONE_FS) followed by an exec in the child would have the
> > > >> parent and child share fs information. So if the child e.g., changes 
> > > >> the
> > > >> working directory post exec it would also affect the parent. But after
> > > >> this change here this would no longer be true. So a child changing a
> > > >> workding directoro would not affect the parent anymore. IOW, an exec is
> > > >> accompanied by an unshare(CLONE_FS). Might still be worth trying ofc 
> > > >> but
> > > >> it seems like a non-trivial uapi change but there might be few users
> > > >> that do clone{3}(CLONE_FS) followed by an exec.
> > > >
> > > > I believe the following code in Chromium explicitly relies on this
> > > > behavior, but I'm not sure whether this code is in active use anymore:
> > > >
> > > > https://source.chromium.org/chromium/chromium/src/+/main:sandbox/linux/suid/sandbox.c;l=101?q=CLONE_FS&sq=&ss=chromium
> > >
> > > Wait, this is absolutely nucking futs.  On a very quick inspection, the 
> > > sharable things like this are fs, files, sighand, and io.files and 
> > > sighand get unshared, which makes sense.  fs supposedly checks for extra 
> > > refs and prevents gaining privilege.  io is... ignored!  At least it's 
> > > not immediately obvious that io is a problem.
> > >
> > > But seriously, this makes no sense at all.  It should not be possible to 
> > > exec a program and then, without ptrace, change its cwd out from under 
> > > it.  Do we really need to preserve this behavior?
> >
> > I agree that this is pretty wild.
> >
> > The single user I'm aware of is Chrome, and as far as I know, they use
> > it for establishing their sandbox on systems where unprivileged user
> > namespaces are disabled - see
> > .
> > They also have seccomp-based sandboxing, but IIRC there are some small
> > holes that mean it's still useful for them to be able to set up
> > namespaces, like how sendmsg() on a unix domain socket can specify a
> > file path as the destination address.
> >
> > (By the way, I think maybe Chrome wouldn't need this wacky trick with
> > the shared fs_struct if the "NO_NEW_PRIVS permits chroot()" thing had
> > ever landed that you
> > (https://lore.kernel.org/lkml/0e2f0f54e19bff53a3739ecfddb4ffa9a6dbde4d.1327858005.git.l...@amacapital.net/)
> > and Mickaël Salaün proposed in the past... or alternatively, if there
> > was a way to properly filter all the syscalls that Chrome has to
> > permit for renderers.)
> >
> > (But also, to be clear, I don't speak for Chrome, this is just my
> > understanding of how their stuff works.)
>
> Chrome seems to just want a totally empty filesystem view, yes?
> Let's land the nnp+chroot change. :P Only 10 years late! Then we can
> have Chrome use this and we can unshare fs on exec...

Someone should check with Chrome first though to make sure what I said
accurately represents what they think...



Re: [apparmor] [PATCH 1/2] fs/exec: Explicitly unshare fs_struct on exec

2022-10-18 Thread Kees Cook
On Fri, Oct 14, 2022 at 05:35:26PM +0200, Jann Horn wrote:
> On Fri, Oct 14, 2022 at 5:18 AM Andy Lutomirski  wrote:
> > On Thu, Oct 6, 2022, at 7:13 AM, Jann Horn wrote:
> > > On Thu, Oct 6, 2022 at 11:05 AM Christian Brauner  
> > > wrote:
> > >> On Thu, Oct 06, 2022 at 01:27:34AM -0700, Kees Cook wrote:
> > >> > The check_unsafe_exec() counting of n_fs would not add up under a 
> > >> > heavily
> > >> > threaded process trying to perform a suid exec, causing the suid 
> > >> > portion
> > >> > to fail. This counting error appears to be unneeded, but to catch any
> > >> > possible conditions, explicitly unshare fs_struct on exec, if it ends 
> > >> > up
> > >>
> > >> Isn't this a potential uapi break? Afaict, before this change a call to
> > >> clone{3}(CLONE_FS) followed by an exec in the child would have the
> > >> parent and child share fs information. So if the child e.g., changes the
> > >> working directory post exec it would also affect the parent. But after
> > >> this change here this would no longer be true. So a child changing a
> > >> workding directoro would not affect the parent anymore. IOW, an exec is
> > >> accompanied by an unshare(CLONE_FS). Might still be worth trying ofc but
> > >> it seems like a non-trivial uapi change but there might be few users
> > >> that do clone{3}(CLONE_FS) followed by an exec.
> > >
> > > I believe the following code in Chromium explicitly relies on this
> > > behavior, but I'm not sure whether this code is in active use anymore:
> > >
> > > https://source.chromium.org/chromium/chromium/src/+/main:sandbox/linux/suid/sandbox.c;l=101?q=CLONE_FS&sq=&ss=chromium
> >
> > Wait, this is absolutely nucking futs.  On a very quick inspection, the 
> > sharable things like this are fs, files, sighand, and io.files and 
> > sighand get unshared, which makes sense.  fs supposedly checks for extra 
> > refs and prevents gaining privilege.  io is... ignored!  At least it's not 
> > immediately obvious that io is a problem.
> >
> > But seriously, this makes no sense at all.  It should not be possible to 
> > exec a program and then, without ptrace, change its cwd out from under it.  
> > Do we really need to preserve this behavior?
> 
> I agree that this is pretty wild.
> 
> The single user I'm aware of is Chrome, and as far as I know, they use
> it for establishing their sandbox on systems where unprivileged user
> namespaces are disabled - see
> .
> They also have seccomp-based sandboxing, but IIRC there are some small
> holes that mean it's still useful for them to be able to set up
> namespaces, like how sendmsg() on a unix domain socket can specify a
> file path as the destination address.
> 
> (By the way, I think maybe Chrome wouldn't need this wacky trick with
> the shared fs_struct if the "NO_NEW_PRIVS permits chroot()" thing had
> ever landed that you
> (https://lore.kernel.org/lkml/0e2f0f54e19bff53a3739ecfddb4ffa9a6dbde4d.1327858005.git.l...@amacapital.net/)
> and Mickaël Salaün proposed in the past... or alternatively, if there
> was a way to properly filter all the syscalls that Chrome has to
> permit for renderers.)
> 
> (But also, to be clear, I don't speak for Chrome, this is just my
> understanding of how their stuff works.)

Chrome seems to just want a totally empty filesystem view, yes?
Let's land the nnp+chroot change. :P Only 10 years late! Then we can
have Chrome use this and we can unshare fs on exec...

-- 
Kees Cook