On Tue, Apr 28, 2020 at 10:36 PM Linus Torvalds
<[email protected]> wrote:
> On Tue, Apr 28, 2020 at 12:08 PM Oleg Nesterov <[email protected]> wrote:
> >
> > Oops. I can update that old patch but somehow I thought there is a better
> > plan which I don't yet understand...
>
> I don't think any plan survived reality.
>
> Unless we want to do something *really* hacky.. The attached patch is
> not meant to be serious.
>
> > And, IIRC, Jann had some ideas how to rework the new creds calculation in
> > execve paths to avoid the cred_guard_mutex deadlock?
>
> I'm not sure how you'd do that.
>
> Execve() fundamentally needs to serialize with PTRACE_ATTACH somehow,
> since the whole point is that "tsk->ptrace" changes how the
> credentials are interpreted.
>
> So PTRACE_ATTACH doesn't really _change_ the credentials, but it very
> much changes what execve() will do with them.
>
> But I guess we could do a "if somebody attached to us while we did the
> execve(), just repeat the whole thing"
>
> Jann, what was your clever idea? Maybe it got lost in the long thread..

My clever/horrible/overly-complex idea was basically:

In execve:

 - After the point of no return, but before we start waiting for the
   other threads to go away, finish calculating our post-execve creds
   and stash them somewhere in the task_struct or so.
 - Drop the cred_guard_mutex.
 - Wait for the other threads to die.
 - Take the cred_guard_mutex again.
 - Clear out the pointer in the task_struct.
 - Finish execve and install the new creds.
 - Drop the cred_guard_mutex again.
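
A rough userspace model of that ordering, purely as illustration and not
kernel code. The struct layouts, the exec_new_cred field, model_execve()
and the wait callback are all made-up names; a pthread mutex stands in
for cred_guard_mutex, and the callback stands in for waiting on the
other threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct cred { int uid; };

struct task {
    pthread_mutex_t cred_guard_mutex;
    const struct cred *cred;          /* currently installed creds */
    const struct cred *exec_new_cred; /* assumed field: stashed post-execve creds */
};

/*
 * Models only the lock ordering from the list above. The function starts
 * at the "point of no return"; in this model the mutex is taken here
 * rather than earlier in execve.
 */
static void model_execve(struct task *t, const struct cred *new_cred,
                         void (*wait_for_threads)(struct task *))
{
    pthread_mutex_lock(&t->cred_guard_mutex);
    assert(t->exec_new_cred == NULL);   /* no nested execve in this model */
    t->exec_new_cred = new_cred;        /* stash the computed creds */
    pthread_mutex_unlock(&t->cred_guard_mutex);

    /* May block for a long time; the mutex is not held here. */
    wait_for_threads(t);

    pthread_mutex_lock(&t->cred_guard_mutex);
    t->exec_new_cred = NULL;            /* clear the stashed pointer */
    t->cred = new_cred;                 /* install the new creds */
    pthread_mutex_unlock(&t->cred_guard_mutex);
}

/* Example callback: shows that an observer can take the mutex mid-execve. */
static int seen_uid_during_wait = -1;

static void observe_during_wait(struct task *t)
{
    pthread_mutex_lock(&t->cred_guard_mutex);
    if (t->exec_new_cred)
        seen_uid_during_wait = t->exec_new_cred->uid;
    pthread_mutex_unlock(&t->cred_guard_mutex);
}
```

The point of the model is that anything taking the mutex during the wait
sees both the old creds and the stashed new ones, instead of blocking
until execve finishes.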

Then in ptrace_may_access, after taking the cred_guard_mutex, we'd
know that the target task is either outside execve or in the middle of
execve, with old and new credentials known; and then we could say "you
only get to access that task if you're capable relative to *both* its
old and new credentials, since the task currently has both state from
the old executable and from the new one". (Other users that expect to
use cred_guard_mutex to synchronize with execve would also have to be
changed appropriately; e.g. seccomp tsync would have to bail out if
the task turns out to be in execve after the mutex has been acquired.)
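
The "capable relative to both" rule could look roughly like the sketch
below. Again this is a toy userspace model: the exec_new_cred field,
capable_relative_to() and model_ptrace_may_access() are invented for
illustration, and the uid comparison is a placeholder for the real
ptrace access checks (capabilities, LSM hooks and so on).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct cred { int uid; };

struct task {
    const struct cred *cred;          /* currently installed creds */
    const struct cred *exec_new_cred; /* assumed field: non-NULL only mid-execve */
};

/* Toy policy: uid 0 may access anything, otherwise the uids must match. */
static bool capable_relative_to(const struct cred *tracer,
                                const struct cred *target)
{
    return tracer->uid == 0 || tracer->uid == target->uid;
}

/*
 * The caller is assumed to hold the target's cred_guard_mutex, so the
 * target is either outside execve (exec_new_cred == NULL) or in the
 * middle of it with both sets of creds known.
 */
static bool model_ptrace_may_access(const struct cred *tracer,
                                    const struct task *target)
{
    if (!capable_relative_to(tracer, target->cred))
        return false;
    /* Mid-execve: the new creds must pass the same check as well. */
    if (target->exec_new_cred &&
        !capable_relative_to(tracer, target->exec_new_cred))
        return false;
    return true;
}
```

With this rule, an unprivileged tracer that matches the old uid is still
refused while the target is in the middle of an execve that would install
creds the tracer is not capable relative to.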

So I think we can conceptually fix the deadlock, but it requires a bit
of refactoring. (I have an old branch somewhere in which I tried to
implement this, and where I did a bunch of refactoring around
ptrace_may_access() so that e.g. the LSM hooks for ptrace can be
invoked twice when the target task is in execve, and so that they take
the target's cred* as an argument.)
