[RFC] proposal for resolving the cred_guard_mutex deadlock

2018-10-12 Thread Jann Horn via Selinux
Hi!

There is that old deadlock of cred_guard_mutex that I originally heard
about from Oleg, and that has come up on LKML sometimes since then; to
recap, essentially, the problem is demonstrated by the following
testcase:


#include 
#include 
#include 
#include 
#include 

void *thread(void *arg) {
ptrace(PTRACE_TRACEME, 0, 0, 0);
return NULL;
}

int main(void) {
int pid = fork();

if (!pid) {
pthread_t pt;
pthread_create(, NULL, thread, NULL);
pthread_join(pt, NULL);
execlp("echo", "echo", "passed", NULL);
}

sleep(1);

ptrace(PTRACE_ATTACH, pid, 0, 0);
kill(pid, SIGCONT);

return 0;
}


The parent gets stuck in ptrace(), waiting for the cred_guard_mutex:


[<0>] ptrace_attach+0x97/0x250
[<0>] __x64_sys_ptrace+0xa2/0x100
[<0>] do_syscall_64+0x55/0x110
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9


The child gets stuck in execve(), waiting in de_thread() for its other thread:


? __schedule+0x2b7/0x880
schedule+0x28/0x80
flush_old_exec+0xc8/0x6c0
load_elf_binary+0x291/0x15d0
? get_user_pages_remote+0x137/0x1f0
? get_acl+0x1a/0x100
search_binary_handler+0x90/0x1c0
__do_execve_file.isra.37+0x6b8/0x880
__x64_sys_execve+0x34/0x40
do_syscall_64+0x55/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9


But the other thread can't exit until the ptrace parent handles it;
and the ptrace parent is stuck in another syscall, and therefore can't
wait() for that thread.

This can cause debuggers to lock up, and also gets in the way of
taking cred_guard_mutex in more places (since that'd make the deadlock
worse).


IMO at the core of the problem is that the cred_guard_mutex is held
across pretty much all of sys_execve(), and in particular across
do_thread(). I think that actually, we could drop the cred_guard_mutex
before waiting for sibling threads if we make things like the ptrace
check a bit more complicated.

At the moment, cred_guard_mutex is held across de_thread() and beyond
to protect against things like seccomp TSYNC from a sibling thread or
ptrace attach. For TSYNC, this is necessary because another thread
could try to TSYNC to us as long as we're not done with de_thread();
for ptrace attach, this is necessary because we only switch
credentials after de_thread(). But just doing the credential switch
earlier would also be dangerous.

So here's my idea:
When we're done calculating the post-execve creds (which happens
before de_thread()), we stash a pointer to the post-exec creds in
current->signal (or in current?). Then, in de_thread(), once we know
that all our siblings have been zapped, we drop the cred_guard_mutex
*before* the loop that waits for the siblings to exit. At that point,
sibling threads that are mid-syscall and other processes can try to
interact with us again. This means:
For accesses that are permitted based on whether the access is
same-thread (in particular TSYNC and probably also the
same_thread_group() check in ptrace), we'll have to check whether exec
creds are stashed in current->signal; if so, we have to bail out. Any
error code we return there shouldn't be visible to userspace, since
this can only happen to a thread that has already been zapped.
For accesses that go through ptrace_may_access(), we'll have to expand
the check such that, if an attempt is made to access a task that is
currently going through execve, the access check is performed against
*both* its old and its new credentials. This also means that the
target credentials will have to be plumbed into the LSMs in addition
to the target task pointer.

Opinions? If nobody thinks that this is an incredibly stupid idea,
I'll try to come up with some patches.
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH v7 3/6] seccomp: add a way to get a listener fd from ptrace

2018-10-10 Thread Jann Horn via Selinux
On Wed, Oct 10, 2018 at 5:32 PM Paul Moore  wrote:
> On Tue, Oct 9, 2018 at 9:36 AM Jann Horn  wrote:
> > +cc selinux people explicitly, since they probably have opinions on this
>
> I just spent about twenty minutes working my way through this thread,
> and digging through the containers archive trying to get a good
> understanding of what you guys are trying to do, and I'm not quite
> sure I understand it all.  However, from what I have seen, this
> approach looks very ptrace-y to me (I imagine to others as well based
> on the comments) and because of this I think ensuring the usual ptrace
> access controls are evaluated, including the ptrace LSM hooks, is the
> right thing to do.

Basically the problem is that this new ptrace() API does something
that doesn't just influence the target task, but also every other task
that has the same seccomp filter. So the classic ptrace check doesn't
work here.

> If I've missed something, or I'm thinking about this wrong, please
> educate me; just a heads-up that I'm largely offline for most of this
> week so responses on my end are going to be delayed much more than
> usual.
>
> > On Tue, Oct 9, 2018 at 3:29 PM Christian Brauner  
> > wrote:
> > > On Tue, Oct 09, 2018 at 02:39:53PM +0200, Jann Horn wrote:
> > > > On Mon, Oct 8, 2018 at 8:18 PM Christian Brauner  
> > > > wrote:
> > > > > On Mon, Oct 08, 2018 at 06:42:00PM +0200, Jann Horn wrote:
> > > > > > On Mon, Oct 8, 2018 at 6:21 PM Christian Brauner 
> > > > > >  wrote:
> > > > > > > On Mon, Oct 08, 2018 at 05:33:22PM +0200, Jann Horn wrote:
> > > > > > > > On Mon, Oct 8, 2018 at 5:16 PM Christian Brauner 
> > > > > > > >  wrote:
> > > > > > > > > On Thu, Sep 27, 2018 at 09:11:16AM -0600, Tycho Andersen 
> > > > > > > > > wrote:
> > > > > > > > > > diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> > > > > > > > > > index 44a31ac8373a..17685803a2af 100644
> > > > > > > > > > --- a/kernel/seccomp.c
> > > > > > > > > > +++ b/kernel/seccomp.c
> > > > > > > > > > @@ -1777,4 +1777,35 @@ static struct file 
> > > > > > > > > > *init_listener(struct task_struct *task,
> > > > > > > > > >
> > > > > > > > > >   return ret;
> > > > > > > > > >  }
> > > > > > > > > > +
> > > > > > > > > > +long seccomp_new_listener(struct task_struct *task,
> > > > > > > > > > +   unsigned long filter_off)
> > > > > > > > > > +{
> > > > > > > > > > + struct seccomp_filter *filter;
> > > > > > > > > > + struct file *listener;
> > > > > > > > > > + int fd;
> > > > > > > > > > +
> > > > > > > > > > + if (!capable(CAP_SYS_ADMIN))
> > > > > > > > > > + return -EACCES;
> > > > > > > > >
> > > > > > > > > I know this might have been discussed a while back but why 
> > > > > > > > > exactly do we
> > > > > > > > > require CAP_SYS_ADMIN in init_userns and not in the target 
> > > > > > > > > userns? What
> > > > > > > > > if I want to do a setns()fd, CLONE_NEWUSER) to the target 
> > > > > > > > > process and
> > > > > > > > > use ptrace from in there?
> > > > > > > >
> > > > > > > > See 
> > > > > > > > https://lore.kernel.org/lkml/CAG48ez3R+ZJ1vwGkDfGzKX2mz6f=jjjwso5pcvnh68p+rko...@mail.gmail.com/
> > > > > > > > . Basically, the problem is that this doesn't just give you 
> > > > > > > > capability
> > > > > > > > over the target task, but also over every other task that has 
> > > > > > > > the same
> > > > > > > > filter installed; you need some sort of "is the caller capable 
> > > > > > > > over
> > > > > > > > the filter and anyone who uses it" check.
> > > > > > >
> > > > > > > Thanks.
> > > > > > > But then this new ptrace feature as it stands is imho currently 
> > > > > > > broken.
> > > > > > > If you can install a seccomp filter with SECCOMP_RET_USER_NOTIF 
> > > > > > > if you
> > > > > > > are ns_cpabable(CAP_SYS_ADMIN) and also get an fd via seccomp() 
> > > > > > > itself
> > > > > > > if you are ns_cpabable(CAP_SYS_ADMIN)
> > > >
> > > > Actually, you don't need CAP_SYS_ADMIN for seccomp() at all as long as
> > > > you enable the NNP flag, I think?
> > >
> > > Yes, if you turn on NNP you don't even need sys_admin.
> > >
> > > >
> > > > > > > then either the new ptrace() api
> > > > > > > extension should be fixed to allow for this too or the seccomp() 
> > > > > > > way of
> > > > > > > retrieving the pid - which I really think we want - needs to be 
> > > > > > > fixed to
> > > > > > > require capable(CAP_SYS_ADMIN) too.
> > > > > > > The solution where both require ns_capable(CAP_SYS_ADMIN) is - 
> > > > > > > imho -
> > > > > > > the preferred way to solve this.
> > > > > > > Everything else will just be confusing.
> > > > > >
> > > > > > First you say "broken", then you say "confusing". Which one do you 
> > > > > > mean?
> > > > >
> > > > > Both. It's broken in so far as it places a seemingly unnecessary
> > > > > restriction that could be fixed. You outlined one possible fix 
> > > > > yourself
> > > > > in the link you provided.
> > > >
> > > > If by "possible fix" you 

Re: [PATCH v7 3/6] seccomp: add a way to get a listener fd from ptrace

2018-10-10 Thread Jann Horn via Selinux
On Wed, Oct 10, 2018 at 2:54 PM Christian Brauner  wrote:
> On Tue, Oct 09, 2018 at 06:26:47PM +0200, Jann Horn wrote:
> > On Tue, Oct 9, 2018 at 6:20 PM Christian Brauner  
> > wrote:
> > > On Tue, Oct 09, 2018 at 05:26:26PM +0200, Jann Horn wrote:
> > > > On Tue, Oct 9, 2018 at 4:09 PM Christian Brauner  
> > > > wrote:
> > > > > On Tue, Oct 09, 2018 at 03:50:53PM +0200, Jann Horn wrote:
> > > > > > On Tue, Oct 9, 2018 at 3:49 PM Christian Brauner 
> > > > > >  wrote:
> > > > > > > On Tue, Oct 09, 2018 at 03:36:04PM +0200, Jann Horn wrote:
> > > > > > > > On Tue, Oct 9, 2018 at 3:29 PM Christian Brauner 
> > > > > > > >  wrote:
> > > > > > > > > One more thing. Citing from [1]
> > > > > > > > >
> > > > > > > > > > I think there's a security problem here. Imagine the 
> > > > > > > > > > following scenario:
> > > > > > > > > >
> > > > > > > > > > 1. task A (uid==0) sets up a seccomp filter that uses 
> > > > > > > > > > SECCOMP_RET_USER_NOTIF
> > > > > > > > > > 2. task A forks off a child B
> > > > > > > > > > 3. task B uses setuid(1) to drop its privileges
> > > > > > > > > > 4. task B becomes dumpable again, either via 
> > > > > > > > > > prctl(PR_SET_DUMPABLE, 1)
> > > > > > > > > > or via execve()
> > > > > > > > > > 5. task C (the attacker, uid==1) attaches to task B via 
> > > > > > > > > > ptrace
> > > > > > > > > > 6. task C uses PTRACE_SECCOMP_NEW_LISTENER on task B
> > > > > > > > >
> > > > > > > > > Sorry, to be late to the party but would this really pass
> > > > > > > > > __ptrace_may_access() in ptrace_attach()? It doesn't seem 
> > > > > > > > > obvious to me
> > > > > > > > > that it would... Doesn't look like it would get past:
> > > > > > > > >
> > > > > > > > > tcred = __task_cred(task);
> > > > > > > > > if (uid_eq(caller_uid, tcred->euid) &&
> > > > > > > > > uid_eq(caller_uid, tcred->suid) &&
> > > > > > > > > uid_eq(caller_uid, tcred->uid)  &&
> > > > > > > > > gid_eq(caller_gid, tcred->egid) &&
> > > > > > > > > gid_eq(caller_gid, tcred->sgid) &&
> > > > > > > > > gid_eq(caller_gid, tcred->gid))
> > > > > > > > > goto ok;
> > > > > > > > > if (ptrace_has_cap(tcred->user_ns, mode))
> > > > > > > > > goto ok;
> > > > > > > > > rcu_read_unlock();
> > > > > > > > > return -EPERM;
> > > > > > > > > ok:
> > > > > > > > > rcu_read_unlock();
> > > > > > > > > mm = task->mm;
> > > > > > > > > if (mm &&
> > > > > > > > > ((get_dumpable(mm) != SUID_DUMP_USER) &&
> > > > > > > > >  !ptrace_has_cap(mm->user_ns, mode)))
> > > > > > > > > return -EPERM;
> > > > > > > >
> > > > > > > > Which specific check would prevent task C from attaching to 
> > > > > > > > task B? If
> > > > > > > > the UIDs match, the first "goto ok" executes; and you're 
> > > > > > > > dumpable, so
> > > > > > > > you don't trigger the second "return -EPERM".
> > > > > > >
> > > > > > > You'd also need CAP_SYS_PTRACE in the mm->user_ns which you 
> > > > > > > shouldn't
> > > > > > > have if you did a setuid to an unpriv user. (But I always find 
> > > > > > > that code
> > > > > > > confusing.)
> > > > > >
> > > > > > Only if the target hasn't gone through execve() since setuid().
> > > > >
> > > > > Sorry if I want to know this in excessive detail but I'd like to
> > > > > understand this properly so bear with me :)
> > > > > - If task B has setuid()ed and prctl(PR_SET_DUMPABLE, 1)ed but not
> > > > >   execve()ed then C won't pass ptrace_has_cap(mm->user_ns, mode).
> > > >
> > > > Yeah.
> > > >
> > > > > - If task B has setuid()ed, exeved()ed it will get its dumpable flag 
> > > > > set
> > > > >   to /proc/sys/fs/suid_dumpable
> > > >
> > > > Not if you changed all UIDs (e.g. by calling setuid() as root). In
> > > > that case, setup_new_exec() calls "set_dumpable(current->mm,
> > > > SUID_DUMP_USER)".
> > >
> > > Actually, looking at this when C is trying to PTRACE_ATTACH to B as an
> > > unprivileged user even if B execve()ed and it is dumpable C still
> > > wouldn't have CAP_SYS_PTRACE in the mm->user_ns unless it already is
> > > privileged over mm->user_ns which means it must be in an ancestor
> > > user_ns.
> >
> > Huh? Why would you need CAP_SYS_PTRACE for anything here? You can
> > ptrace another process running under your UID just fine, no matter
> > what the namespaces are. I'm not sure what you're saying.
>
> Sorry, I was out the door yesterday when answering this and was too
> brief. I forgot to mention: /proc/sys/kernel/yama/ptrace_scope. It
> should be enabled by default on nearly all distros

"nearly all distros"? AFAIK it's off on Debian, for starters. And Yama
still doesn't help you if one of the tasks enters a new user namespace
or whatever.

Yama is a little bit of extra, heuristic, **opt-in** hardening enabled
in some configurations. It is **not** a fundamental building block you
can rely on.

> and even if 

Re: [PATCH v7 3/6] seccomp: add a way to get a listener fd from ptrace

2018-10-09 Thread Jann Horn via Selinux
On Tue, Oct 9, 2018 at 6:20 PM Christian Brauner  wrote:
> On Tue, Oct 09, 2018 at 05:26:26PM +0200, Jann Horn wrote:
> > On Tue, Oct 9, 2018 at 4:09 PM Christian Brauner  
> > wrote:
> > > On Tue, Oct 09, 2018 at 03:50:53PM +0200, Jann Horn wrote:
> > > > On Tue, Oct 9, 2018 at 3:49 PM Christian Brauner  
> > > > wrote:
> > > > > On Tue, Oct 09, 2018 at 03:36:04PM +0200, Jann Horn wrote:
> > > > > > On Tue, Oct 9, 2018 at 3:29 PM Christian Brauner 
> > > > > >  wrote:
> > > > > > > One more thing. Citing from [1]
> > > > > > >
> > > > > > > > I think there's a security problem here. Imagine the following 
> > > > > > > > scenario:
> > > > > > > >
> > > > > > > > 1. task A (uid==0) sets up a seccomp filter that uses 
> > > > > > > > SECCOMP_RET_USER_NOTIF
> > > > > > > > 2. task A forks off a child B
> > > > > > > > 3. task B uses setuid(1) to drop its privileges
> > > > > > > > 4. task B becomes dumpable again, either via 
> > > > > > > > prctl(PR_SET_DUMPABLE, 1)
> > > > > > > > or via execve()
> > > > > > > > 5. task C (the attacker, uid==1) attaches to task B via ptrace
> > > > > > > > 6. task C uses PTRACE_SECCOMP_NEW_LISTENER on task B
> > > > > > >
> > > > > > > Sorry, to be late to the party but would this really pass
> > > > > > > __ptrace_may_access() in ptrace_attach()? It doesn't seem obvious 
> > > > > > > to me
> > > > > > > that it would... Doesn't look like it would get past:
> > > > > > >
> > > > > > > tcred = __task_cred(task);
> > > > > > > if (uid_eq(caller_uid, tcred->euid) &&
> > > > > > > uid_eq(caller_uid, tcred->suid) &&
> > > > > > > uid_eq(caller_uid, tcred->uid)  &&
> > > > > > > gid_eq(caller_gid, tcred->egid) &&
> > > > > > > gid_eq(caller_gid, tcred->sgid) &&
> > > > > > > gid_eq(caller_gid, tcred->gid))
> > > > > > > goto ok;
> > > > > > > if (ptrace_has_cap(tcred->user_ns, mode))
> > > > > > > goto ok;
> > > > > > > rcu_read_unlock();
> > > > > > > return -EPERM;
> > > > > > > ok:
> > > > > > > rcu_read_unlock();
> > > > > > > mm = task->mm;
> > > > > > > if (mm &&
> > > > > > > ((get_dumpable(mm) != SUID_DUMP_USER) &&
> > > > > > >  !ptrace_has_cap(mm->user_ns, mode)))
> > > > > > > return -EPERM;
> > > > > >
> > > > > > Which specific check would prevent task C from attaching to task B? 
> > > > > > If
> > > > > > the UIDs match, the first "goto ok" executes; and you're dumpable, 
> > > > > > so
> > > > > > you don't trigger the second "return -EPERM".
> > > > >
> > > > > You'd also need CAP_SYS_PTRACE in the mm->user_ns which you shouldn't
> > > > > have if you did a setuid to an unpriv user. (But I always find that 
> > > > > code
> > > > > confusing.)
> > > >
> > > > Only if the target hasn't gone through execve() since setuid().
> > >
> > > Sorry if I want to know this in excessive detail but I'd like to
> > > understand this properly so bear with me :)
> > > - If task B has setuid()ed and prctl(PR_SET_DUMPABLE, 1)ed but not
> > >   execve()ed then C won't pass ptrace_has_cap(mm->user_ns, mode).
> >
> > Yeah.
> >
> > > - If task B has setuid()ed, exeved()ed it will get its dumpable flag set
> > >   to /proc/sys/fs/suid_dumpable
> >
> > Not if you changed all UIDs (e.g. by calling setuid() as root). In
> > that case, setup_new_exec() calls "set_dumpable(current->mm,
> > SUID_DUMP_USER)".
>
> Actually, looking at this when C is trying to PTRACE_ATTACH to B as an
> unprivileged user even if B execve()ed and it is dumpable C still
> wouldn't have CAP_SYS_PTRACE in the mm->user_ns unless it already is
> privileged over mm->user_ns which means it must be in an ancestor
> user_ns.

Huh? Why would you need CAP_SYS_PTRACE for anything here? You can
ptrace another process running under your UID just fine, no matter
what the namespaces are. I'm not sure what you're saying.
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH v7 3/6] seccomp: add a way to get a listener fd from ptrace

2018-10-09 Thread Jann Horn via Selinux
On Tue, Oct 9, 2018 at 4:09 PM Christian Brauner  wrote:
> On Tue, Oct 09, 2018 at 03:50:53PM +0200, Jann Horn wrote:
> > On Tue, Oct 9, 2018 at 3:49 PM Christian Brauner  
> > wrote:
> > > On Tue, Oct 09, 2018 at 03:36:04PM +0200, Jann Horn wrote:
> > > > On Tue, Oct 9, 2018 at 3:29 PM Christian Brauner  
> > > > wrote:
> > > > > One more thing. Citing from [1]
> > > > >
> > > > > > I think there's a security problem here. Imagine the following 
> > > > > > scenario:
> > > > > >
> > > > > > 1. task A (uid==0) sets up a seccomp filter that uses 
> > > > > > SECCOMP_RET_USER_NOTIF
> > > > > > 2. task A forks off a child B
> > > > > > 3. task B uses setuid(1) to drop its privileges
> > > > > > 4. task B becomes dumpable again, either via prctl(PR_SET_DUMPABLE, 
> > > > > > 1)
> > > > > > or via execve()
> > > > > > 5. task C (the attacker, uid==1) attaches to task B via ptrace
> > > > > > 6. task C uses PTRACE_SECCOMP_NEW_LISTENER on task B
> > > > >
> > > > > Sorry, to be late to the party but would this really pass
> > > > > __ptrace_may_access() in ptrace_attach()? It doesn't seem obvious to 
> > > > > me
> > > > > that it would... Doesn't look like it would get past:
> > > > >
> > > > > tcred = __task_cred(task);
> > > > > if (uid_eq(caller_uid, tcred->euid) &&
> > > > > uid_eq(caller_uid, tcred->suid) &&
> > > > > uid_eq(caller_uid, tcred->uid)  &&
> > > > > gid_eq(caller_gid, tcred->egid) &&
> > > > > gid_eq(caller_gid, tcred->sgid) &&
> > > > > gid_eq(caller_gid, tcred->gid))
> > > > > goto ok;
> > > > > if (ptrace_has_cap(tcred->user_ns, mode))
> > > > > goto ok;
> > > > > rcu_read_unlock();
> > > > > return -EPERM;
> > > > > ok:
> > > > > rcu_read_unlock();
> > > > > mm = task->mm;
> > > > > if (mm &&
> > > > > ((get_dumpable(mm) != SUID_DUMP_USER) &&
> > > > >  !ptrace_has_cap(mm->user_ns, mode)))
> > > > > return -EPERM;
> > > >
> > > > Which specific check would prevent task C from attaching to task B? If
> > > > the UIDs match, the first "goto ok" executes; and you're dumpable, so
> > > > you don't trigger the second "return -EPERM".
> > >
> > > You'd also need CAP_SYS_PTRACE in the mm->user_ns which you shouldn't
> > > have if you did a setuid to an unpriv user. (But I always find that code
> > > confusing.)
> >
> > Only if the target hasn't gone through execve() since setuid().
>
> Sorry if I want to know this in excessive detail but I'd like to
> understand this properly so bear with me :)
> - If task B has setuid()ed and prctl(PR_SET_DUMPABLE, 1)ed but not
>   execve()ed then C won't pass ptrace_has_cap(mm->user_ns, mode).

Yeah.

> - If task B has setuid()ed, exeved()ed it will get its dumpable flag set
>   to /proc/sys/fs/suid_dumpable

Not if you changed all UIDs (e.g. by calling setuid() as root). In
that case, setup_new_exec() calls "set_dumpable(current->mm,
SUID_DUMP_USER)".

> which by default is 0. So C won't pass
>   (get_dumpable(mm) != SUID_DUMP_USER).
> In both cases PTRACE_ATTACH shouldn't work. Now, if
> /proc/sys/fs/suid_dumpable is 1 I'd find it acceptable for this to work.
> This is an administrator choice.
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH v7 3/6] seccomp: add a way to get a listener fd from ptrace

2018-10-09 Thread Jann Horn via Selinux
On Tue, Oct 9, 2018 at 3:49 PM Christian Brauner  wrote:
>
> On Tue, Oct 09, 2018 at 03:36:04PM +0200, Jann Horn wrote:
> > +cc selinux people explicitly, since they probably have opinions on this
> >
> > On Tue, Oct 9, 2018 at 3:29 PM Christian Brauner  
> > wrote:
> > > On Tue, Oct 09, 2018 at 02:39:53PM +0200, Jann Horn wrote:
> > > > On Mon, Oct 8, 2018 at 8:18 PM Christian Brauner  
> > > > wrote:
> > > > > On Mon, Oct 08, 2018 at 06:42:00PM +0200, Jann Horn wrote:
> > > > > > On Mon, Oct 8, 2018 at 6:21 PM Christian Brauner 
> > > > > >  wrote:
> > > > > > > On Mon, Oct 08, 2018 at 05:33:22PM +0200, Jann Horn wrote:
> > > > > > > > On Mon, Oct 8, 2018 at 5:16 PM Christian Brauner 
> > > > > > > >  wrote:
> > > > > > > > > On Thu, Sep 27, 2018 at 09:11:16AM -0600, Tycho Andersen 
> > > > > > > > > wrote:
> > > > > > > > > > diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> > > > > > > > > > index 44a31ac8373a..17685803a2af 100644
> > > > > > > > > > --- a/kernel/seccomp.c
> > > > > > > > > > +++ b/kernel/seccomp.c
> > > > > > > > > > @@ -1777,4 +1777,35 @@ static struct file 
> > > > > > > > > > *init_listener(struct task_struct *task,
> > > > > > > > > >
> > > > > > > > > >   return ret;
> > > > > > > > > >  }
> > > > > > > > > > +
> > > > > > > > > > +long seccomp_new_listener(struct task_struct *task,
> > > > > > > > > > +   unsigned long filter_off)
> > > > > > > > > > +{
> > > > > > > > > > + struct seccomp_filter *filter;
> > > > > > > > > > + struct file *listener;
> > > > > > > > > > + int fd;
> > > > > > > > > > +
> > > > > > > > > > + if (!capable(CAP_SYS_ADMIN))
> > > > > > > > > > + return -EACCES;
> > > > > > > > >
> > > > > > > > > I know this might have been discussed a while back but why 
> > > > > > > > > exactly do we
> > > > > > > > > require CAP_SYS_ADMIN in init_userns and not in the target 
> > > > > > > > > userns? What
> > > > > > > > > if I want to do a setns()fd, CLONE_NEWUSER) to the target 
> > > > > > > > > process and
> > > > > > > > > use ptrace from in there?
> > > > > > > >
> > > > > > > > See 
> > > > > > > > https://lore.kernel.org/lkml/CAG48ez3R+ZJ1vwGkDfGzKX2mz6f=jjjwso5pcvnh68p+rko...@mail.gmail.com/
> > > > > > > > . Basically, the problem is that this doesn't just give you 
> > > > > > > > capability
> > > > > > > > over the target task, but also over every other task that has 
> > > > > > > > the same
> > > > > > > > filter installed; you need some sort of "is the caller capable 
> > > > > > > > over
> > > > > > > > the filter and anyone who uses it" check.
> > > > > > >
> > > > > > > Thanks.
> > > > > > > But then this new ptrace feature as it stands is imho currently 
> > > > > > > broken.
> > > > > > > If you can install a seccomp filter with SECCOMP_RET_USER_NOTIF 
> > > > > > > if you
> > > > > > > are ns_cpabable(CAP_SYS_ADMIN) and also get an fd via seccomp() 
> > > > > > > itself
> > > > > > > if you are ns_cpabable(CAP_SYS_ADMIN)
> > > >
> > > > Actually, you don't need CAP_SYS_ADMIN for seccomp() at all as long as
> > > > you enable the NNP flag, I think?
> > >
> > > Yes, if you turn on NNP you don't even need sys_admin.
> > >
> > > >
> > > > > > > then either the new ptrace() api
> > > > > > > extension should be fixed to allow for this too or the seccomp() 
> > > > > > > way of
> > > > > > > retrieving the pid - which I really think we want - needs to be 
> > > > > > > fixed to
> > > > > > > require capable(CAP_SYS_ADMIN) too.
> > > > > > > The solution where both require ns_capable(CAP_SYS_ADMIN) is - 
> > > > > > > imho -
> > > > > > > the preferred way to solve this.
> > > > > > > Everything else will just be confusing.
> > > > > >
> > > > > > First you say "broken", then you say "confusing". Which one do you 
> > > > > > mean?
> > > > >
> > > > > Both. It's broken in so far as it places a seemingly unnecessary
> > > > > restriction that could be fixed. You outlined one possible fix 
> > > > > yourself
> > > > > in the link you provided.
> > > >
> > > > If by "possible fix" you mean "check whether the seccomp filter is
> > > > only attached to a single task": That wouldn't fundamentally change
> > > > the situation, it would only add an additional special case.
> > > >
> > > > > And it's confusing in so far as there is a way
> > > > > via seccomp() to get the fd without said requirement.
> > > >
> > > > I don't find it confusing at all. seccomp() and ptrace() are very
> > >
> > > Fine, then that's a matter of opinion. I find it counterintuitive that
> > > you can get an fd without privileges via one interface but not via
> > > another.
> > >
> > > > different situations: When you use seccomp(), infrastructure is
> > >
> > > Sure. Note, that this is _one_ of the reasons why I want to make sure we
> > > keep the native seccomp() only based way of getting an fd without
> > > forcing userspace to switching to a differnet kernel api.
> > >
> > > > already in place for ensuring that your filter is 

Re: [PATCH v7 3/6] seccomp: add a way to get a listener fd from ptrace

2018-10-09 Thread Jann Horn via Selinux
+cc selinux people explicitly, since they probably have opinions on this

On Tue, Oct 9, 2018 at 3:29 PM Christian Brauner  wrote:
> On Tue, Oct 09, 2018 at 02:39:53PM +0200, Jann Horn wrote:
> > On Mon, Oct 8, 2018 at 8:18 PM Christian Brauner  
> > wrote:
> > > On Mon, Oct 08, 2018 at 06:42:00PM +0200, Jann Horn wrote:
> > > > On Mon, Oct 8, 2018 at 6:21 PM Christian Brauner  
> > > > wrote:
> > > > > On Mon, Oct 08, 2018 at 05:33:22PM +0200, Jann Horn wrote:
> > > > > > On Mon, Oct 8, 2018 at 5:16 PM Christian Brauner 
> > > > > >  wrote:
> > > > > > > On Thu, Sep 27, 2018 at 09:11:16AM -0600, Tycho Andersen wrote:
> > > > > > > > diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> > > > > > > > index 44a31ac8373a..17685803a2af 100644
> > > > > > > > --- a/kernel/seccomp.c
> > > > > > > > +++ b/kernel/seccomp.c
> > > > > > > > @@ -1777,4 +1777,35 @@ static struct file *init_listener(struct 
> > > > > > > > task_struct *task,
> > > > > > > >
> > > > > > > >   return ret;
> > > > > > > >  }
> > > > > > > > +
> > > > > > > > +long seccomp_new_listener(struct task_struct *task,
> > > > > > > > +   unsigned long filter_off)
> > > > > > > > +{
> > > > > > > > + struct seccomp_filter *filter;
> > > > > > > > + struct file *listener;
> > > > > > > > + int fd;
> > > > > > > > +
> > > > > > > > + if (!capable(CAP_SYS_ADMIN))
> > > > > > > > + return -EACCES;
> > > > > > >
> > > > > > > I know this might have been discussed a while back but why 
> > > > > > > exactly do we
> > > > > > > require CAP_SYS_ADMIN in init_userns and not in the target 
> > > > > > > userns? What
> > > > > > > if I want to do a setns()fd, CLONE_NEWUSER) to the target process 
> > > > > > > and
> > > > > > > use ptrace from in there?
> > > > > >
> > > > > > See 
> > > > > > https://lore.kernel.org/lkml/CAG48ez3R+ZJ1vwGkDfGzKX2mz6f=jjjwso5pcvnh68p+rko...@mail.gmail.com/
> > > > > > . Basically, the problem is that this doesn't just give you 
> > > > > > capability
> > > > > > over the target task, but also over every other task that has the 
> > > > > > same
> > > > > > filter installed; you need some sort of "is the caller capable over
> > > > > > the filter and anyone who uses it" check.
> > > > >
> > > > > Thanks.
> > > > > But then this new ptrace feature as it stands is imho currently 
> > > > > broken.
> > > > > If you can install a seccomp filter with SECCOMP_RET_USER_NOTIF if you
> > > > > are ns_cpabable(CAP_SYS_ADMIN) and also get an fd via seccomp() itself
> > > > > if you are ns_cpabable(CAP_SYS_ADMIN)
> >
> > Actually, you don't need CAP_SYS_ADMIN for seccomp() at all as long as
> > you enable the NNP flag, I think?
>
> Yes, if you turn on NNP you don't even need sys_admin.
>
> >
> > > > > then either the new ptrace() api
> > > > > extension should be fixed to allow for this too or the seccomp() way 
> > > > > of
> > > > > retrieving the pid - which I really think we want - needs to be fixed 
> > > > > to
> > > > > require capable(CAP_SYS_ADMIN) too.
> > > > > The solution where both require ns_capable(CAP_SYS_ADMIN) is - imho -
> > > > > the preferred way to solve this.
> > > > > Everything else will just be confusing.
> > > >
> > > > First you say "broken", then you say "confusing". Which one do you mean?
> > >
> > > Both. It's broken in so far as it places a seemingly unnecessary
> > > restriction that could be fixed. You outlined one possible fix yourself
> > > in the link you provided.
> >
> > If by "possible fix" you mean "check whether the seccomp filter is
> > only attached to a single task": That wouldn't fundamentally change
> > the situation, it would only add an additional special case.
> >
> > > And it's confusing in so far as there is a way
> > > via seccomp() to get the fd without said requirement.
> >
> > I don't find it confusing at all. seccomp() and ptrace() are very
>
> Fine, then that's a matter of opinion. I find it counterintuitive that
> you can get an fd without privileges via one interface but not via
> another.
>
> > different situations: When you use seccomp(), infrastructure is
>
> Sure. Note, that this is _one_ of the reasons why I want to make sure we
> keep the native seccomp() only based way of getting an fd without
> forcing userspace to switching to a differnet kernel api.
>
> > already in place for ensuring that your filter is only applied to
> > processes over which you are capable, and propagation is limited by
> > inheritance from your task down. When you use ptrace(), you need a
> > pretty different sort of access check that checks whether you're
> > privileged over ancestors, siblings and so on of the target task.
>
> So, don't get me wrong I'm not arguing against the ptrace() interface in
> general. If this is something that people find useful, fine. But, I
> would like to have a simple single-syscall pure-seccomp() based way of
> getting an fd, i.e. what we have in patch 1 of this series.

Yeah, I also prefer the seccomp() one.

> > 

Re: [PATCH v5 2/5] Smack: Prepare for PTRACE_MODE_SCHED

2018-10-04 Thread Jann Horn via Selinux
On Thu, Oct 4, 2018 at 9:47 AM Jiri Kosina  wrote:
> On Thu, 27 Sep 2018, Jann Horn wrote:
> > > Yes. Since the PTRACE_MODE_NOAUDIT was in PTRACE_MODE_IBPB in Jiri's
> > > previous patch set and not in PTRACE_MODE_SCHED in this one I assumed
> > > that there was a good reason for it.
> >
> > Jiri, was there a good reason for it, and if so, what was it?
>
> [ FWIW PTRACE_MODE_NOAUDIT being in PTRACE_MODE_IBPB goes back to original
>   Tim's pre-CRD patchset ]
>
> Well, we can't really call out into audit from scheduler code, and the
> previous versions of the patchsets didn't have PTRACE_MODE_SCHED, so it
> had to be included in PTRACE_MODE_IBPB in order to make sure we're not
> calling into audit from context switch code.
>
> Or did I misunderstand the question?

If I understand Casey correctly, he is saying that your patch
(https://lore.kernel.org/lkml/nycvar.yfh.7.76.1809251437340.15...@cbobk.fhfr.pm/)
doesn't include PTRACE_MODE_NOAUDIT for IBPB, but the previous v6 of
your patch 
(https://lore.kernel.org/lkml/nycvar.yfh.7.76.1809121105330.15...@cbobk.fhfr.pm/)
did include it, and therefore Casey thinks that there is a specific
reason why you removed PTRACE_MODE_NOAUDIT, and therefore Casey is
adding special-case logic for PTRACE_MODE_SCHED to Smack when simply
using PTRACE_MODE_NOAUDIT would also work.

I think that Casey should change ptrace_may_access_sched() to use
"mode | PTRACE_MODE_SCHED | PTRACE_MODE_NOAUDIT".
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH v5 5/5] sidechannel: Linux Security Module for sidechannel

2018-09-28 Thread Jann Horn via Selinux
On Fri, Sep 28, 2018 at 1:43 AM James Morris  wrote:
> On Thu, 27 Sep 2018, Schaufler, Casey wrote:
> > > > On 9/27/2018 2:45 PM, James Morris wrote:
> > > > > On Wed, 26 Sep 2018, Casey Schaufler wrote:
> > > > >
> > > > >> +  /*
> > > > >> +   * Namespace checks. Considered safe if:
> > > > >> +   *  cgroup namespace is the same
> > > > >> +   *  User namespace is the same
> > > > >> +   *  PID namespace is the same
> > > > >> +   */
> > > > >> +  if (current->nsproxy)
> > > > >> +  ccgn = current->nsproxy->cgroup_ns;
> > > > >> +  if (p->nsproxy)
> > > > >> +  pcgn = p->nsproxy->cgroup_ns;
> > > > >> +  if (ccgn != pcgn)
> > > > >> +  return -EACCES;
> > > > >> +  if (current->cred->user_ns != p->cred->user_ns)
> > > > >> +  return -EACCES;
> > > > >> +  if (task_active_pid_ns(current) != task_active_pid_ns(p))
> > > > >> +  return -EACCES;
> > > > >> +  return 0;
> > > > > I really don't like the idea of hard-coding namespace security 
> > > > > semantics
> > > > > in an LSM.  Also, I'm not sure if these semantics make any sense.
> > > >
> > > > Checks on namespaces where explicitly requested.
> > >
> > > By whom and what is the rationale?
> >
> > The rationale is to protect containers. Since those closest thing
> > there is to a definition of containers is "uses namespaces" that
> > becomes the focus. Separating them out does not make too much
> > sense as I would expect someone concerned with one to be concerned
> > with all.
>
> A lot of people will not be using user namespaces due to security
> concerns,

Ugh.

> so with this hard-coded logic, you are saying this case is
> 'safe' in a sidechannel context.
>
> Which hints at the deeper issue that containers are a userland
> abstraction.  Protection of containers needs to be defined by userland
> policy.

Or just compare mount namespaces additionally/instead. I think that
containers will always use those, because AFAIK nobody uses chroot()
for containers, given that the kernel makes absolutely no security
guarantees about chroot().
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH 6/5] capability: Repair sidechannel test in ptrace

2018-09-28 Thread Jann Horn via Selinux
On Thu, Sep 27, 2018 at 9:17 PM Casey Schaufler
 wrote:
>
> From: Casey Schaufler 
>
> The PTRACE_MODE_SCHED check erroniously returns 0 in
> all cases. It should be returning -EPERM. This fixes
> the logic to correct that error.
>
> Signed-off-by: Casey Schaufler 

Reviewed-by: Jann Horn 

> ---
>  security/commoncap.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/security/commoncap.c b/security/commoncap.c
> index e77457110d05..70a7e3d19c16 100644
> --- a/security/commoncap.c
> +++ b/security/commoncap.c
> @@ -152,9 +152,8 @@ int cap_ptrace_access_check(struct task_struct *child, 
> unsigned int mode)
> if (cred->user_ns == child_cred->user_ns &&
> cap_issubset(child_cred->cap_permitted, *caller_caps))
> goto out;
> -   if (mode & PTRACE_MODE_SCHED)
> -   goto out;
> -   if (ns_capable(child_cred->user_ns, CAP_SYS_PTRACE))
> +   if (!(mode & PTRACE_MODE_SCHED) &&
> +   ns_capable(child_cred->user_ns, CAP_SYS_PTRACE))
> goto out;
> ret = -EPERM;
>  out:
> --
> 2.17.1
>
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH v5 2/5] Smack: Prepare for PTRACE_MODE_SCHED

2018-09-27 Thread Jann Horn via Selinux
On Wed, Sep 26, 2018 at 10:35 PM Casey Schaufler
 wrote:
> A ptrace access check with mode PTRACE_MODE_SCHED gets called
> from process switching code. This precludes the use of audit,
> as the locking is incompatible. Don't do audit in the PTRACE_MODE_SCHED
> case.
>
> Signed-off-by: Casey Schaufler 
> ---
>  security/smack/smack_lsm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
> index 340fc30ad85d..ffa95bcab599 100644
> --- a/security/smack/smack_lsm.c
> +++ b/security/smack/smack_lsm.c
> @@ -422,7 +422,8 @@ static int smk_ptrace_rule_check(struct task_struct 
> *tracer,
> struct task_smack *tsp;
> struct smack_known *tracer_known;
>
> -   if ((mode & PTRACE_MODE_NOAUDIT) == 0) {
> +   if ((mode & PTRACE_MODE_NOAUDIT) == 0 &&
> +   (mode & PTRACE_MODE_SCHED) == 0) {

If you ORed PTRACE_MODE_NOAUDIT into the flags when calling the
security hook, you could drop this patch, right?
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH v5 4/5] Capability: Complete PTRACE_MODE_SCHED

2018-09-27 Thread Jann Horn via Selinux
On Wed, Sep 26, 2018 at 10:35 PM Casey Schaufler
 wrote:
> Allow a complete ptrace access check with mode PTRACE_MODE_SCHED.
> Disable the inappropriate privilege check in the capability code
> that does incompatible locking.

What's that locking you're talking about?

> Signed-off-by: Casey Schaufler 
> ---
>  kernel/ptrace.c  | 2 --
>  security/commoncap.c | 2 ++
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index 99cfddde6a55..0b6a9df51c3b 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -331,8 +331,6 @@ static int __ptrace_may_access(struct task_struct *task, 
> unsigned int mode)
>  !ptrace_has_cap(mm->user_ns, mode)))
> return -EPERM;
>
> -   if (mode & PTRACE_MODE_SCHED)
> -   return 0;
> return security_ptrace_access_check(task, mode);
>  }
>
> diff --git a/security/commoncap.c b/security/commoncap.c
> index 2e489d6a3ac8..e77457110d05 100644
> --- a/security/commoncap.c
> +++ b/security/commoncap.c
> @@ -152,6 +152,8 @@ int cap_ptrace_access_check(struct task_struct *child, 
> unsigned int mode)
> if (cred->user_ns == child_cred->user_ns &&
> cap_issubset(child_cred->cap_permitted, *caller_caps))
> goto out;
> +   if (mode & PTRACE_MODE_SCHED)
> +   goto out;

So for PTRACE_MODE_SCHED, this function always returns 0, right? If
that's intentional, perhaps you should instead just put "if (mode &
PTRACE_MODE_SCHED) return 0;" at the start of the function, to avoid
taking the RCU read lock in this case.

> if (ns_capable(child_cred->user_ns, CAP_SYS_PTRACE))
> goto out;
> ret = -EPERM;
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH v5 1/5] AppArmor: Prepare for PTRACE_MODE_SCHED

2018-09-27 Thread Jann Horn via Selinux
On Wed, Sep 26, 2018 at 11:16 PM Jann Horn  wrote:
>
> On Wed, Sep 26, 2018 at 10:35 PM Casey Schaufler
>  wrote:
> > A ptrace access check with mode PTRACE_MODE_SCHED gets called
> > from process switching code. This precludes the use of audit,
> > as the locking is incompatible. Don't do audit in the PTRACE_MODE_SCHED
> > case.
>
> Why is this separate from PTRACE_MODE_NOAUDIT? It looks like
> apparmor_ptrace_access_check() currently ignores PTRACE_MODE_NOAUDIT.
> Could you, instead of adding a new flag, fix the handling of
> PTRACE_MODE_NOAUDIT?

Er, after looking at more of the series, I see that PTRACE_MODE_SCHED
is necessary; but could you handle the "don't audit" part for AppArmor
using PTRACE_MODE_NOAUDIT instead?

> > Signed-off-by: Casey Schaufler 
> > ---
> >  security/apparmor/domain.c  | 2 +-
> >  security/apparmor/include/ipc.h | 2 +-
> >  security/apparmor/ipc.c | 8 +---
> >  security/apparmor/lsm.c | 5 +++--
> >  4 files changed, 10 insertions(+), 7 deletions(-)
> >
> > diff --git a/security/apparmor/domain.c b/security/apparmor/domain.c
> > index 08c88de0ffda..28300f4c3ef9 100644
> > --- a/security/apparmor/domain.c
> > +++ b/security/apparmor/domain.c
> > @@ -77,7 +77,7 @@ static int may_change_ptraced_domain(struct aa_label 
> > *to_label,
> > if (!tracer || unconfined(tracerl))
> > goto out;
> >
> > -   error = aa_may_ptrace(tracerl, to_label, PTRACE_MODE_ATTACH);
> > +   error = aa_may_ptrace(tracerl, to_label, PTRACE_MODE_ATTACH, true);
> >
> >  out:
> > rcu_read_unlock();
> > diff --git a/security/apparmor/include/ipc.h 
> > b/security/apparmor/include/ipc.h
> > index 5ffc218d1e74..299d1c45fef0 100644
> > --- a/security/apparmor/include/ipc.h
> > +++ b/security/apparmor/include/ipc.h
> > @@ -34,7 +34,7 @@ struct aa_profile;
> > "xcpu xfsz vtalrm prof winch io pwr sys emt lost"
> >
> >  int aa_may_ptrace(struct aa_label *tracer, struct aa_label *tracee,
> > - u32 request);
> > + u32 request, bool audit);
> >  int aa_may_signal(struct aa_label *sender, struct aa_label *target, int 
> > sig);
> >
> >  #endif /* __AA_IPC_H */
> > diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
> > index 527ea1557120..9ed110afc822 100644
> > --- a/security/apparmor/ipc.c
> > +++ b/security/apparmor/ipc.c
> > @@ -121,15 +121,17 @@ static int profile_tracer_perm(struct aa_profile 
> > *tracer,
> >   * Returns: %0 else error code if permission denied or error
> >   */
> >  int aa_may_ptrace(struct aa_label *tracer, struct aa_label *tracee,
> > - u32 request)
> > + u32 request, bool audit)
> >  {
> > struct aa_profile *profile;
> > u32 xrequest = request << PTRACE_PERM_SHIFT;
> > DEFINE_AUDIT_DATA(sa, LSM_AUDIT_DATA_NONE, OP_PTRACE);
> >
> > return xcheck_labels(tracer, tracee, profile,
> > -   profile_tracer_perm(profile, tracee, request, ),
> > -   profile_tracee_perm(profile, tracer, xrequest, 
> > ));
> > +   profile_tracer_perm(profile, tracee, request,
> > +   audit ?  : NULL),
> > +   profile_tracee_perm(profile, tracer, xrequest,
> > +   audit ?  : NULL));
> >  }
> >
> >
> > diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
> > index 8b8b70620bbe..da9d0b228857 100644
> > --- a/security/apparmor/lsm.c
> > +++ b/security/apparmor/lsm.c
> > @@ -118,7 +118,8 @@ static int apparmor_ptrace_access_check(struct 
> > task_struct *child,
> > tracee = aa_get_task_label(child);
> > error = aa_may_ptrace(tracer, tracee,
> > (mode & PTRACE_MODE_READ) ? AA_PTRACE_READ
> > - : AA_PTRACE_TRACE);
> > + : AA_PTRACE_TRACE,
> > +   !(mode & PTRACE_MODE_SCHED));
> > aa_put_label(tracee);
> > end_current_label_crit_section(tracer);
> >
> > @@ -132,7 +133,7 @@ static int apparmor_ptrace_traceme(struct task_struct 
> > *parent)
> >
> > tracee = begin_current_label_crit_section();
> > tracer = aa_get_task_label(parent);
> > -   error = aa_may_ptrace(tracer, tracee, AA_PTRACE_TRACE);
> > +   error = aa_may_ptrace(tracer, tracee, AA_PTRACE_TRACE, true);
> > aa_put_label(tracer);
> > end_current_label_crit_section(tracee);
> >
> > --
> > 2.17.1
> >
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH v5 2/5] Smack: Prepare for PTRACE_MODE_SCHED

2018-09-27 Thread Jann Horn via Selinux
+Jiri

On Thu, Sep 27, 2018 at 12:54 AM Schaufler, Casey
 wrote:
> > -Original Message-
> > From: Jann Horn [mailto:ja...@google.com]
> > Sent: Wednesday, September 26, 2018 2:31 PM
> > To: Schaufler, Casey 
> > Cc: Kernel Hardening ; kernel list
> > ; linux-security-module  > mod...@vger.kernel.org>; selinux@tycho.nsa.gov; Hansen, Dave
> > ; Dock, Deneen T ;
> > kris...@linux.intel.com; Arjan van de Ven 
> > Subject: Re: [PATCH v5 2/5] Smack: Prepare for PTRACE_MODE_SCHED
> >
> > On Wed, Sep 26, 2018 at 10:35 PM Casey Schaufler
> >  wrote:
> > > A ptrace access check with mode PTRACE_MODE_SCHED gets called
> > > from process switching code. This precludes the use of audit,
> > > as the locking is incompatible. Don't do audit in the PTRACE_MODE_SCHED
> > > case.
> > >
> > > Signed-off-by: Casey Schaufler 
> > > ---
> > >  security/smack/smack_lsm.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
> > > index 340fc30ad85d..ffa95bcab599 100644
> > > --- a/security/smack/smack_lsm.c
> > > +++ b/security/smack/smack_lsm.c
> > > @@ -422,7 +422,8 @@ static int smk_ptrace_rule_check(struct task_struct
> > *tracer,
> > > struct task_smack *tsp;
> > > struct smack_known *tracer_known;
> > >
> > > -   if ((mode & PTRACE_MODE_NOAUDIT) == 0) {
> > > +   if ((mode & PTRACE_MODE_NOAUDIT) == 0 &&
> > > +   (mode & PTRACE_MODE_SCHED) == 0) {
> >
> > If you ORed PTRACE_MODE_NOAUDIT into the flags when calling the
> > security hook, you could drop this patch, right?
>
> Yes. Since the PTRACE_MODE_NOAUDIT was in PTRACE_MODE_IBPB
> in Jiri's previous patch set and not in PTRACE_MODE_SCHED in this one
> I assumed that there was a good reason for it.

Jiri, was there a good reason for it, and if so, what was it?
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH v5 1/5] AppArmor: Prepare for PTRACE_MODE_SCHED

2018-09-27 Thread Jann Horn via Selinux
On Wed, Sep 26, 2018 at 10:35 PM Casey Schaufler
 wrote:
> A ptrace access check with mode PTRACE_MODE_SCHED gets called
> from process switching code. This precludes the use of audit,
> as the locking is incompatible. Don't do audit in the PTRACE_MODE_SCHED
> case.

Why is this separate from PTRACE_MODE_NOAUDIT? It looks like
apparmor_ptrace_access_check() currently ignores PTRACE_MODE_NOAUDIT.
Could you, instead of adding a new flag, fix the handling of
PTRACE_MODE_NOAUDIT?

> Signed-off-by: Casey Schaufler 
> ---
>  security/apparmor/domain.c  | 2 +-
>  security/apparmor/include/ipc.h | 2 +-
>  security/apparmor/ipc.c | 8 +---
>  security/apparmor/lsm.c | 5 +++--
>  4 files changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/security/apparmor/domain.c b/security/apparmor/domain.c
> index 08c88de0ffda..28300f4c3ef9 100644
> --- a/security/apparmor/domain.c
> +++ b/security/apparmor/domain.c
> @@ -77,7 +77,7 @@ static int may_change_ptraced_domain(struct aa_label 
> *to_label,
> if (!tracer || unconfined(tracerl))
> goto out;
>
> -   error = aa_may_ptrace(tracerl, to_label, PTRACE_MODE_ATTACH);
> +   error = aa_may_ptrace(tracerl, to_label, PTRACE_MODE_ATTACH, true);
>
>  out:
> rcu_read_unlock();
> diff --git a/security/apparmor/include/ipc.h b/security/apparmor/include/ipc.h
> index 5ffc218d1e74..299d1c45fef0 100644
> --- a/security/apparmor/include/ipc.h
> +++ b/security/apparmor/include/ipc.h
> @@ -34,7 +34,7 @@ struct aa_profile;
> "xcpu xfsz vtalrm prof winch io pwr sys emt lost"
>
>  int aa_may_ptrace(struct aa_label *tracer, struct aa_label *tracee,
> - u32 request);
> + u32 request, bool audit);
>  int aa_may_signal(struct aa_label *sender, struct aa_label *target, int sig);
>
>  #endif /* __AA_IPC_H */
> diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
> index 527ea1557120..9ed110afc822 100644
> --- a/security/apparmor/ipc.c
> +++ b/security/apparmor/ipc.c
> @@ -121,15 +121,17 @@ static int profile_tracer_perm(struct aa_profile 
> *tracer,
>   * Returns: %0 else error code if permission denied or error
>   */
>  int aa_may_ptrace(struct aa_label *tracer, struct aa_label *tracee,
> - u32 request)
> + u32 request, bool audit)
>  {
> struct aa_profile *profile;
> u32 xrequest = request << PTRACE_PERM_SHIFT;
> DEFINE_AUDIT_DATA(sa, LSM_AUDIT_DATA_NONE, OP_PTRACE);
>
> return xcheck_labels(tracer, tracee, profile,
> -   profile_tracer_perm(profile, tracee, request, ),
> -   profile_tracee_perm(profile, tracer, xrequest, ));
> +   profile_tracer_perm(profile, tracee, request,
> +   audit ?  : NULL),
> +   profile_tracee_perm(profile, tracer, xrequest,
> +   audit ?  : NULL));
>  }
>
>
> diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
> index 8b8b70620bbe..da9d0b228857 100644
> --- a/security/apparmor/lsm.c
> +++ b/security/apparmor/lsm.c
> @@ -118,7 +118,8 @@ static int apparmor_ptrace_access_check(struct 
> task_struct *child,
> tracee = aa_get_task_label(child);
> error = aa_may_ptrace(tracer, tracee,
> (mode & PTRACE_MODE_READ) ? AA_PTRACE_READ
> - : AA_PTRACE_TRACE);
> + : AA_PTRACE_TRACE,
> +   !(mode & PTRACE_MODE_SCHED));
> aa_put_label(tracee);
> end_current_label_crit_section(tracer);
>
> @@ -132,7 +133,7 @@ static int apparmor_ptrace_traceme(struct task_struct 
> *parent)
>
> tracee = begin_current_label_crit_section();
> tracer = aa_get_task_label(parent);
> -   error = aa_may_ptrace(tracer, tracee, AA_PTRACE_TRACE);
> +   error = aa_may_ptrace(tracer, tracee, AA_PTRACE_TRACE, true);
> aa_put_label(tracer);
> end_current_label_crit_section(tracee);
>
> --
> 2.17.1
>
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH] selinux: refactor mls_context_to_sid() and make it stricter

2018-08-31 Thread Jann Horn via Selinux
On Thu, Aug 9, 2018 at 3:56 AM Paul Moore  wrote:
>
> On Mon, Aug 6, 2018 at 5:19 PM Jann Horn  wrote:
> >
> > The intended behavior change for this patch is to reject any MLS strings
> > that contain (trailing) garbage if p->mls_enabled is true.
> >
> > As suggested by Paul Moore, change mls_context_to_sid() so that the two
> > parts of the range are extracted before the rest of the parsing. Because
> > now we don't have to scan for two different separators simultaneously
> > everywhere, we can actually switch to strchr() everywhere instead of the
> > open-coded loops that scan for two separators at once.
> >
> > mls_context_to_sid() used to signal how much of the input string was parsed
> > by updating `*scontext`. However, there is actually no case in which
> > mls_context_to_sid() only parses a subset of the input and still returns
> > a success (other than the buggy case with a second '-' in which it
> > incorrectly claims to have consumed the entire string). Turn `scontext`
> > into a simple pointer argument and stop redundantly checking whether the
> > entire input was consumed in string_to_context_struct(). This also lets us
> > remove the `scontext_len` argument from `string_to_context_struct()`.
[...]
> > -   /* Extract low sensitivity. */
> > -   scontextp = p = *scontext;
> > -   while (*p && *p != ':' && *p != '-')
> > -   p++;
> > -
> > -   delim = *p;
> > -   if (delim != '\0')
> > -   *p++ = '\0';
> > +   /*
> > +* If we're dealing with a range, figure out where the two parts
> > +* of the range begin.
> > +*/
> > +   rangep[0] = scontext;
> > +   rangep[1] = strchr(scontext, '-');
> > +   if (rangep[1]) {
> > +   rangep[1][0] = '\0';
> > +   rangep[1]++;
> > +   }
> >
> > +   /* For each part of the range: */
> > for (l = 0; l < 2; l++) {
> > -   levdatum = hashtab_search(pol->p_levels.table, scontextp);
> > -   if (!levdatum) {
> > -   rc = -EINVAL;
> > -   goto out;
> > -   }
> > +   /* Split sensitivity and category set. */
> > +   sensitivity = rangep[l];
> > +   if (sensitivity == NULL)
> > +   break;
> > +   next_cat = strchr(sensitivity, ':');
> > +   if (next_cat)
> > +   *(next_cat++) = '\0';
> >
> > +   /* Parse sensitivity. */
> > +   levdatum = hashtab_search(pol->p_levels.table, sensitivity);
> > +   if (!levdatum)
> > +   return -EINVAL;
> > context->range.level[l].sens = levdatum->level->sens;
> >
> > -   if (delim == ':') {
> > -   /* Extract category set. */
> > -   while (1) {
> > -   scontextp = p;
> > -   while (*p && *p != ',' && *p != '-')
> > -   p++;
> > -   delim = *p;
> > -   if (delim != '\0')
> > -   *p++ = '\0';
> > -
> > -   /* Separate into range if exists */
> > -   rngptr = strchr(scontextp, '.');
> > -   if (rngptr != NULL) {
> > -   /* Remove '.' */
> > -   *rngptr++ = '\0';
> > -   }
> > +   /* Extract category set. */
> > +   while (next_cat != NULL) {
> > +   cur_cat = next_cat;
> > +   next_cat = strchr(next_cat, ',');
> > +   if (next_cat != NULL)
> > +   *(next_cat++) = '\0';
> > +
> > +   /* Separate into range if exists */
> > +   rngptr = strchr(cur_cat, '.');
> > +   if (rngptr != NULL) {
> > +   /* Remove '.' */
>
> On the chance you need to respin this patch, you can probably get rid
> of the above comment and the if-body braces; we don't have "Remove X"
> comments in other similar places in this function.

I'll amend that.

> > +   *rngptr++ = '\0';
> > +   }
> >
> > -   catdatum = hashtab_search(pol->p_cats.table,
> > - scontextp);
> > -   if (!catdatum) {
> > -   rc = -EINVAL;
> > -   goto out;
> > -   }
> > +   catdatum = hashtab_search(pol->p_cats.table, 
> > cur_cat);
> > +   if (!catdatum)
> > +   return -EINVAL;
> >
> > -

Re: [PATCH v4 3/5] LSM: Security module checking for side-channel dangers

2018-08-27 Thread Jann Horn via Selinux
On Sat, Aug 25, 2018 at 12:42 AM Casey Schaufler
 wrote:
> +config SECURITY_SIDECHANNEL_CAPABILITIES
> +   bool "Sidechannel check on capability sets"
> +   depends on SECURITY_SIDECHANNEL
> +   depends on !SECURITY_SIDECHANNEL_ALWAYS
> +   default n
> +   select SECURITY_SIDECHANNEL_NAMESPACES if USER_NS
> +   help
> + Assume that tasks with different sets of privilege may be
> + subject to side-channel attacks. Potential interactions
> + where the attacker lacks capabilities the attacked has
> + are blocked. Selecting this when user namespaces (USER_NS)
> + are enabled will enable SECURITY_SIDECHANNEL_NAMESPACES.

Thanks!
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH v3 3/5] LSM: Security module checking for side-channel dangers

2018-08-22 Thread Jann Horn via Selinux
On Wed, Aug 22, 2018 at 1:44 AM Schaufler, Casey
 wrote:
>
> > -Original Message-
> > From: Jann Horn [mailto:ja...@google.com]
> > Sent: Tuesday, August 21, 2018 10:24 AM
> > To: Schaufler, Casey 
> > Cc: Kernel Hardening ; kernel list
> > ; linux-security-module  > mod...@vger.kernel.org>; selinux@tycho.nsa.gov; Hansen, Dave
> > ; Dock, Deneen T ;
> > kris...@linux.intel.com; Arjan van de Ven 
> > Subject: Re: [PATCH v3 3/5] LSM: Security module checking for side-channel
> > dangers
> >
> > On Tue, Aug 21, 2018 at 2:05 AM Casey Schaufler
> >  wrote:
> > >
> > > The sidechannel LSM checks for cases where a side-channel
> > > attack may be dangerous based on security attributes of tasks.
> > > This includes:
> > > Effective UID of the tasks is different
> > > Capablity sets are different
> > > Tasks are in different namespaces
> > > An option is also provided to assert that task are never
> > > to be considered safe. This is high paranoia, and expensive
> > > as well.
> > >
> > > Signed-off-by: Casey Schaufler 
> > > ---
> > [...]
> > > diff --git a/security/sidechannel/Kconfig b/security/sidechannel/Kconfig
> > > new file mode 100644
> > > index ..af9396534128
> > > --- /dev/null
> > > +++ b/security/sidechannel/Kconfig
> > [...]
> > > +config SECURITY_SIDECHANNEL_CAPABILITIES
> > > +   bool "Sidechannel check on capability sets"
> > > +   depends on SECURITY_SIDECHANNEL
> > > +   default n
> > > +   help
> > > + Assume that tasks with different sets of privilege may be
> > > + subject to side-channel attacks. Potential interactions
> > > + where the attacker lacks capabilities the attacked has
> > > + are blocked.
> > > +
> > > +  If you are unsure how to answer this question, answer N.
> > > +
> > > +config SECURITY_SIDECHANNEL_NAMESPACES
> > > +   bool "Sidechannel check on namespaces"
> > > +   depends on SECURITY_SIDECHANNEL
> > > +   depends on NAMESPACES
> > > +   default n
> > > +   help
> > > + Assume that tasks in different namespaces may be
> > > + subject to side-channel attacks. User, PID and cgroup
> > > + namespaces are checked.
> > > +
> > > +  If you are unsure how to answer this question, answer N.
> > [...]
> > > diff --git a/security/sidechannel/sidechannel.c
> > b/security/sidechannel/sidechannel.c
> > > new file mode 100644
> > > index ..4da7d6dafdc5
> > > --- /dev/null
> > > +++ b/security/sidechannel/sidechannel.c
> > [...]
> > > +/*
> > > + * safe_by_capability - Are task and current sidechannel safe?
> > > + * @p: task to check on
> > > + *
> > > + * Returns 0 if the tasks are sidechannel safe, -EACCES otherwise.
> > > + */
> > > +#ifdef CONFIG_SECURITY_SIDECHANNEL_CAPABILITIES
> > > +static int safe_by_capability(struct task_struct *p)
> > > +{
> > > +   const struct cred *ccred = current_real_cred();
> > > +   const struct cred *pcred = 
> > > rcu_dereference_protected(p->real_cred, 1);
> > > +
> > > +   /*
> > > +* Capabilities checks. Considered safe if:
> > > +*  current has all the capabilities p does
> > > +*/
> > > +   if (ccred != pcred &&
> > > +   !cap_issubset(pcred->cap_effective, ccred->cap_effective))
> > > +   return -EACCES;
> > > +   return 0;
> > > +}
> >
> > On its own (without safe_by_namespace()), this check makes no sense, I
> > think. You're performing a test on the namespaced capability sets
> > without looking at which user namespaces they are relative to. Maybe
> > either introduce a configuration dependency or add an extra namespace
> > check here?
>
> If you don't have namespaces the check is correct. If you do, and use
> safe_by_namespace() you're also correct. If you use namespaces and
> care about side-channel attacks you should enable the namespace checks.

By "use namespaces", you mean "have CONFIG_USER_NS=y set in the kernel
config", right?
It doesn't matter much whether processes on your system are
intentionally using namespaces; what matters is whether some random
process can just use unshare(CLONE_NEWUSER) to increase its apparent
capabilities and bypass the checks performed by this LSM.
My expectation is that unshare(CLONE_NEWUSER) should not increase the
caller's abilities. Your patch seems to violate that expectation.

> I don't see real value in adding namespace checks in the capability checks
> for the event where someone has said they don't want namespace checks.

Capabilities are meaningless if you don't consider the namespaces
relative to which they are effective. Anyone can get CAP_SYS_ADMIN or
whatever other capabilities they want, by design - just not relative
to objects they don't own. Look:

$ grep ^Cap /proc/self/status
CapInh: 
CapPrm: 
CapEff: 
CapBnd: 003f
CapAmb: 
$ unshare -Ur grep ^Cap /proc/self/status
CapInh: 

Re: [PATCH RFC v2 2/5] X86: Support LSM determination of side-channel vulnerability

2018-08-22 Thread Jann Horn via Selinux
On Tue, Aug 21, 2018 at 6:37 PM Schaufler, Casey
 wrote:
>
> > -Original Message-
> > From: Jann Horn [mailto:ja...@google.com]
> > Sent: Tuesday, August 21, 2018 3:20 AM
> > To: Schaufler, Casey 
> > Cc: Kernel Hardening ; kernel list
> > ; linux-security-module  > mod...@vger.kernel.org>; selinux@tycho.nsa.gov; Hansen, Dave
> > ; Dock, Deneen T ;
> > kris...@linux.intel.com; Arjan van de Ven 
> > Subject: Re: [PATCH RFC v2 2/5] X86: Support LSM determination of side-
> > channel vulnerability
> >
> > On Mon, Aug 20, 2018 at 4:45 PM Schaufler, Casey
> >  wrote:
> > >
> > > > -Original Message-
> > > > From: Jann Horn [mailto:ja...@google.com]
> > > > Sent: Friday, August 17, 2018 4:55 PM
> > > > To: Schaufler, Casey 
> > > > Cc: Kernel Hardening ; kernel list
> > > > ; linux-security-module  > > > mod...@vger.kernel.org>; selinux@tycho.nsa.gov; Hansen, Dave
> > > > ; Dock, Deneen T ;
> > > > kris...@linux.intel.com; Arjan van de Ven 
> > > > Subject: Re: [PATCH RFC v2 2/5] X86: Support LSM determination of side-
> > > > channel vulnerability
> > > >
> > > > On Sat, Aug 18, 2018 at 12:17 AM Casey Schaufler
> > > >  wrote:
> > > > >
> > > > > From: Casey Schaufler 
> > > > >
> > > > > When switching between tasks it may be necessary
> > > > > to set an indirect branch prediction barrier if the
> > > > > tasks are potentially vulnerable to side-channel
> > > > > attacks. This adds a call to security_task_safe_sidechannel
> > > > > so that security modules can weigh in on the decision.
> > > > >
> > > > > Signed-off-by: Casey Schaufler 
> > > > > ---
> > > > >  arch/x86/mm/tlb.c | 12 
> > > > >  1 file changed, 8 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> > > > > index 6eb1f34c3c85..8714d4af06aa 100644
> > > > > --- a/arch/x86/mm/tlb.c
> > > > > +++ b/arch/x86/mm/tlb.c
> > > > > @@ -7,6 +7,7 @@
> > > > >  #include 
> > > > >  #include 
> > > > >  #include 
> > > > > +#include 
> > > > >
> > > > >  #include 
> > > > >  #include 
> > > > > @@ -270,11 +271,14 @@ void switch_mm_irqs_off(struct mm_struct
> > *prev,
> > > > struct mm_struct *next,
> > > > >  * threads. It will also not flush if we switch to 
> > > > > idle
> > > > >  * thread and back to the same process. It will flush 
> > > > > if we
> > > > >  * switch to a different non-dumpable process.
> > > > > +* If a security module thinks that the transition
> > > > > +* is unsafe do the flush.
> > > > >  */
> > > > > -   if (tsk && tsk->mm &&
> > > > > -   tsk->mm->context.ctx_id != last_ctx_id &&
> > > > > -   get_dumpable(tsk->mm) != SUID_DUMP_USER)
> > > > > -   indirect_branch_prediction_barrier();
> > > > > +   if (tsk && tsk->mm && tsk->mm->context.ctx_id != 
> > > > > last_ctx_id)
> > {
> > > > > +   if (get_dumpable(tsk->mm) != SUID_DUMP_USER ||
> > > > > +   security_task_safe_sidechannel(tsk) != 0)
> > > > > +   indirect_branch_prediction_barrier();
> > > > > +   }
> > > >
> > > > When you posted v1 of this series, I asked:
> > > >
> > > > | Does this enforce transitivity? What happens if we first switch from
> > > > | an attacker task to a task without ->mm, and immediately afterwards
> > > > | from the task without ->mm to a victim task? In that case, whether a
> > > > | flush happens between the attacker task and the victim task depends on
> > > > | whether the LSM thinks that the mm-less task should have access to the
> > > > | victim task, right?
> > > >
> > > > Have you addressed that? I don't see it...
> > >
> > > Nope. That's going to require maintaining state about all the
> > > tasks in the chain that might still have cache involvement.
> > >
> > > A -> B -> C -> D
> >
> > Really?
>
> I am willing to be educated otherwise. My understanding
> of Modern Processor Technology will never be so deep that
> I won't listen to reason.
>
> >
> > From what I can tell, it'd be enough to:
> >
> >  - ensure that the LSM-based access checks behave approximately transitively
> >(which I think they already do, mostly)
>
> Smack rules are explicitly and intentionally not transitive.
>
> A reads B, B reads C does *not* imply A reads C.

Ah. :(

Well, at least for UID-based checks, capability comparisons and
namespace comparisons, the relationship should be transitive, right?

> >  - keep a copy of the metadata of the last non-kernel task on the CPU
>
> Do you have a suggestion of how one might do that?
> I'm willing to believe the information could be available,
> but I have yet to come up with a mechanism for getting it.

The obvious solution would be to take a refcounted reference on the
old task's objective creds, but you probably want to avoid the
resulting cache line bouncing...

For safe_by_uid(), 

Re: [PATCH v3 3/5] LSM: Security module checking for side-channel dangers

2018-08-21 Thread Jann Horn via Selinux
On Tue, Aug 21, 2018 at 2:05 AM Casey Schaufler
 wrote:
>
> The sidechannel LSM checks for cases where a side-channel
> attack may be dangerous based on security attributes of tasks.
> This includes:
> Effective UID of the tasks is different
> Capablity sets are different
> Tasks are in different namespaces
> An option is also provided to assert that task are never
> to be considered safe. This is high paranoia, and expensive
> as well.
>
> Signed-off-by: Casey Schaufler 
> ---
[...]
> diff --git a/security/sidechannel/Kconfig b/security/sidechannel/Kconfig
> new file mode 100644
> index ..af9396534128
> --- /dev/null
> +++ b/security/sidechannel/Kconfig
[...]
> +config SECURITY_SIDECHANNEL_CAPABILITIES
> +   bool "Sidechannel check on capability sets"
> +   depends on SECURITY_SIDECHANNEL
> +   default n
> +   help
> + Assume that tasks with different sets of privilege may be
> + subject to side-channel attacks. Potential interactions
> + where the attacker lacks capabilities the attacked has
> + are blocked.
> +
> +  If you are unsure how to answer this question, answer N.
> +
> +config SECURITY_SIDECHANNEL_NAMESPACES
> +   bool "Sidechannel check on namespaces"
> +   depends on SECURITY_SIDECHANNEL
> +   depends on NAMESPACES
> +   default n
> +   help
> + Assume that tasks in different namespaces may be
> + subject to side-channel attacks. User, PID and cgroup
> + namespaces are checked.
> +
> +  If you are unsure how to answer this question, answer N.
[...]
> diff --git a/security/sidechannel/sidechannel.c 
> b/security/sidechannel/sidechannel.c
> new file mode 100644
> index ..4da7d6dafdc5
> --- /dev/null
> +++ b/security/sidechannel/sidechannel.c
[...]
> +/*
> + * safe_by_capability - Are task and current sidechannel safe?
> + * @p: task to check on
> + *
> + * Returns 0 if the tasks are sidechannel safe, -EACCES otherwise.
> + */
> +#ifdef CONFIG_SECURITY_SIDECHANNEL_CAPABILITIES
> +static int safe_by_capability(struct task_struct *p)
> +{
> +   const struct cred *ccred = current_real_cred();
> +   const struct cred *pcred = rcu_dereference_protected(p->real_cred, 1);
> +
> +   /*
> +* Capabilities checks. Considered safe if:
> +*  current has all the capabilities p does
> +*/
> +   if (ccred != pcred &&
> +   !cap_issubset(pcred->cap_effective, ccred->cap_effective))
> +   return -EACCES;
> +   return 0;
> +}

On its own (without safe_by_namespace()), this check makes no sense, I
think. You're performing a test on the namespaced capability sets
without looking at which user namespaces they are relative to. Maybe
either introduce a configuration dependency or add an extra namespace
check here?

> +static int safe_by_namespace(struct task_struct *p)
> +{
> +   struct cgroup_namespace *ccgn = NULL;
> +   struct cgroup_namespace *pcgn = NULL;
> +   const struct cred *ccred;
> +   const struct cred *pcred;
> +
> +   /*
> +* Namespace checks. Considered safe if:
> +*  cgroup namespace is the same
> +*  User namespace is the same
> +*  PID namespace is the same
> +*/
> +   if (current->nsproxy)
> +   ccgn = current->nsproxy->cgroup_ns;
> +   if (p->nsproxy)
> +   pcgn = p->nsproxy->cgroup_ns;
> +   if (ccgn != pcgn)
> +   return -EACCES;
> +
> +   ccred = current_real_cred();
> +   pcred = rcu_dereference_protected(p->real_cred, 1);
> +
> +   if (ccred->user_ns != pcred->user_ns)
> +   return -EACCES;
> +   if (task_active_pid_ns(current) != task_active_pid_ns(p))
> +   return -EACCES;
> +   return 0;
> +}
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH RFC v2 2/5] X86: Support LSM determination of side-channel vulnerability

2018-08-21 Thread Jann Horn via Selinux
On Mon, Aug 20, 2018 at 4:45 PM Schaufler, Casey
 wrote:
>
> > -Original Message-
> > From: Jann Horn [mailto:ja...@google.com]
> > Sent: Friday, August 17, 2018 4:55 PM
> > To: Schaufler, Casey 
> > Cc: Kernel Hardening ; kernel list
> > ; linux-security-module  > mod...@vger.kernel.org>; selinux@tycho.nsa.gov; Hansen, Dave
> > ; Dock, Deneen T ;
> > kris...@linux.intel.com; Arjan van de Ven 
> > Subject: Re: [PATCH RFC v2 2/5] X86: Support LSM determination of side-
> > channel vulnerability
> >
> > On Sat, Aug 18, 2018 at 12:17 AM Casey Schaufler
> >  wrote:
> > >
> > > From: Casey Schaufler 
> > >
> > > When switching between tasks it may be necessary
> > > to set an indirect branch prediction barrier if the
> > > tasks are potentially vulnerable to side-channel
> > > attacks. This adds a call to security_task_safe_sidechannel
> > > so that security modules can weigh in on the decision.
> > >
> > > Signed-off-by: Casey Schaufler 
> > > ---
> > >  arch/x86/mm/tlb.c | 12 
> > >  1 file changed, 8 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> > > index 6eb1f34c3c85..8714d4af06aa 100644
> > > --- a/arch/x86/mm/tlb.c
> > > +++ b/arch/x86/mm/tlb.c
> > > @@ -7,6 +7,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >
> > >  #include 
> > >  #include 
> > > @@ -270,11 +271,14 @@ void switch_mm_irqs_off(struct mm_struct *prev,
> > struct mm_struct *next,
> > >  * threads. It will also not flush if we switch to idle
> > >  * thread and back to the same process. It will flush if 
> > > we
> > >  * switch to a different non-dumpable process.
> > > +* If a security module thinks that the transition
> > > +* is unsafe do the flush.
> > >  */
> > > -   if (tsk && tsk->mm &&
> > > -   tsk->mm->context.ctx_id != last_ctx_id &&
> > > -   get_dumpable(tsk->mm) != SUID_DUMP_USER)
> > > -   indirect_branch_prediction_barrier();
> > > +   if (tsk && tsk->mm && tsk->mm->context.ctx_id != 
> > > last_ctx_id) {
> > > +   if (get_dumpable(tsk->mm) != SUID_DUMP_USER ||
> > > +   security_task_safe_sidechannel(tsk) != 0)
> > > +   indirect_branch_prediction_barrier();
> > > +   }
> >
> > When you posted v1 of this series, I asked:
> >
> > | Does this enforce transitivity? What happens if we first switch from
> > | an attacker task to a task without ->mm, and immediately afterwards
> > | from the task without ->mm to a victim task? In that case, whether a
> > | flush happens between the attacker task and the victim task depends on
> > | whether the LSM thinks that the mm-less task should have access to the
> > | victim task, right?
> >
> > Have you addressed that? I don't see it...
>
> Nope. That's going to require maintaining state about all the
> tasks in the chain that might still have cache involvement.
>
> A -> B -> C -> D

Really?

>From what I can tell, it'd be enough to:

 - ensure that the LSM-based access checks behave approximately transitively
   (which I think they already do, mostly)
 - keep a copy of the metadata of the last non-kernel task on the CPU

> If B and C don't do anything cacheworthy D could conceivably attack A.
> The amount of state required to detect this case would be prohibitive.
> I think that if you're sufficiently concerned about this case you should just
> go ahead and set the barrier. I'm willing to learn something that says I'm
> wrong.

That means that an attacker who can e.g. get a CPU to first switch
from an attacker task to a softirqd (e.g. for network packet
processing or whatever), then switch from the softirqd to a root-owned
victim task would be able to bypass the check, right? That doesn't
sound like a very complicated attack...

I very much dislike the idea of adding a mitigation with a known
bypass technique to the kernel.
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH RFC v2 2/5] X86: Support LSM determination of side-channel vulnerability

2018-08-20 Thread Jann Horn via Selinux
On Sat, Aug 18, 2018 at 12:17 AM Casey Schaufler
 wrote:
>
> From: Casey Schaufler 
>
> When switching between tasks it may be necessary
> to set an indirect branch prediction barrier if the
> tasks are potentially vulnerable to side-channel
> attacks. This adds a call to security_task_safe_sidechannel
> so that security modules can weigh in on the decision.
>
> Signed-off-by: Casey Schaufler 
> ---
>  arch/x86/mm/tlb.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 6eb1f34c3c85..8714d4af06aa 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -7,6 +7,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -270,11 +271,14 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct 
> mm_struct *next,
>  * threads. It will also not flush if we switch to idle
>  * thread and back to the same process. It will flush if we
>  * switch to a different non-dumpable process.
> +* If a security module thinks that the transition
> +* is unsafe do the flush.
>  */
> -   if (tsk && tsk->mm &&
> -   tsk->mm->context.ctx_id != last_ctx_id &&
> -   get_dumpable(tsk->mm) != SUID_DUMP_USER)
> -   indirect_branch_prediction_barrier();
> +   if (tsk && tsk->mm && tsk->mm->context.ctx_id != last_ctx_id) 
> {
> +   if (get_dumpable(tsk->mm) != SUID_DUMP_USER ||
> +   security_task_safe_sidechannel(tsk) != 0)
> +   indirect_branch_prediction_barrier();
> +   }

When you posted v1 of this series, I asked:

| Does this enforce transitivity? What happens if we first switch from
| an attacker task to a task without ->mm, and immediately afterwards
| from the task without ->mm to a victim task? In that case, whether a
| flush happens between the attacker task and the victim task depends on
| whether the LSM thinks that the mm-less task should have access to the
| victim task, right?

Have you addressed that? I don't see it...
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH RFC v2 3/5] LSM: Security module checking for side-channel dangers

2018-08-20 Thread Jann Horn via Selinux
On Sat, Aug 18, 2018 at 12:17 AM Casey Schaufler
 wrote:
>
> From: Casey Schaufler 
>
> The sidechannel LSM checks for cases where a side-channel
> attack may be dangerous based on security attributes of tasks.
> This includes:
> Effective UID of the tasks is different
> Capablity sets are different
> Tasks are in different namespaces
> An option is also provided to assert that task are never
> to be considered safe. This is high paranoia, and expensive
> as well.
>
> Signed-off-by: Casey Schaufler 
[...]
> +#ifdef CONFIG_SECURITY_SIDECHANNEL_UIDS
> +static int safe_by_uid(struct task_struct *p)
> +{
> +   const struct cred *ccred = current_real_cred();
> +   const struct cred *pcred = get_task_cred(p);
> +
> +   /*
> +* Credential checks. Considered safe if:
> +*  UIDs are the same
> +*/
> +   if (ccred != pcred && ccred->euid.val != pcred->euid.val)
> +   return -EACCES;
> +   return 0;
> +}

This function looks bogus. get_task_cred() bumps the refcount on the
returned cred struct pointer, but you don't drop it. You probably want
to use something that doesn't fiddle with the refcount at all here to
avoid cacheline bouncing - possibly a raw rcu_dereference_protected()
if there are no better helpers.

Same thing for the other get_task_cred() calls further down in the patch.
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH RFC 5/5] SELinux: Support SELinux determination of side-channel vulnerability

2018-08-16 Thread Jann Horn via Selinux
On Thu, Aug 16, 2018 at 11:52 AM Casey Schaufler
 wrote:
>
> SELinux considers tasks to be side-channel safe if they
> have PROCESS_SHARE access.
>
> Signed-off-by: Casey Schaufler 
> ---
>  security/selinux/hooks.c | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index a8bf324130f5..7fbd7d7ac1cb 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -4219,6 +4219,14 @@ static void selinux_task_to_inode(struct task_struct 
> *p,
> spin_unlock(>lock);
>  }
>
> +static int selinux_task_safe_sidechannel(struct task_struct *p)
> +{
> +   struct av_decision avd;
> +
> +   return avc_has_perm_noaudit(_state, current_sid(), 
> task_sid(p),
> +   SECCLASS_PROCESS, PROCESS__SHARE, 0, 
> );
> +}

current_sid() -> current_security() -> current_cred_xxx() ->
current_cred() accesses current->cred, the subjective credentials
associated with the current syscall context, affected by
override_creds(). You probably want to look at objective credentials
here, since what you're interested in is not the security context of
the current syscall, but the security context of the userspace code
running in the current address space.

task_sid() does the right thing and looks at the objective creds.
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH RFC 3/5] LSM: Security module checking for side-channel dangers

2018-08-16 Thread Jann Horn via Selinux
On Thu, Aug 16, 2018 at 11:51 AM Casey Schaufler
 wrote:
>
> From: Casey Schaufler 
>
> The sidechannel LSM checks for cases where a side-channel
> attack may be dangerous based on security attributes of tasks.
> This includes:
> Effective UID of the tasks is different
> Capablity sets are different
> Tasks are in different namespaces
> An option is also provided to assert that task are never
> to be considered safe. This is high paranoia, and expensive
> as well.
>
> Signed-off-by: Casey Schaufler 
[...]
> +static int safe_by_uid(struct task_struct *p)
> +{
> +   const struct cred *ccred = current->cred;
> +   const struct cred *pcred = p->cred;

See below.

[...]
> +static int safe_by_capability(struct task_struct *p)
> +{
> +   const struct cred *ccred = current->cred;
> +   const struct cred *pcred = p->cred;

See below.

[...]
> +   if (current->cred->user_ns != p->cred->user_ns)
> +   return -EACCES;

Shouldn't this access be using one of the rcu_dereference_* helpers?
Something like rcu_dereference_protected(..., 1).

Also, you're looking at ->cred, and you don't want to do that. ->cred
are the *subjective* credentials of the task, meaning the credentials
that should only be used when this task actively performs an access.
Depending on what userspace is doing, an unprivileged process' ->cred
will randomly flake to something with GLOBAL_ROOT_UID because of
override_creds() calls - for example, if you access a file through
overlayfs in certain ways, your ->cred will temporarily be overwritten
by ovl_override_creds(), which mostly switches to the credentials of
the filesystem's creator (iow, normally root). coredumps can also
override the EUID with GLOBAL_ROOT_UID:

if (__get_dumpable(cprm.mm_flags) == SUID_DUMP_ROOT) {
/* Setuid core dump mode */
cred->fsuid = GLOBAL_ROOT_UID;  /* Dump root private */
need_suid_safe = true;
}

retval = coredump_wait(siginfo->si_signo, _state);
if (retval < 0)
goto fail_creds;

old_cred = override_creds(cred);

Please use the objective credentials (->real_cred) instead, which are
not subject to random override_creds() effects, and are designed to be
used when looking at the privileges associated with an overall task
(as opposed to the privileges associated with the task's current
syscall context).

At least for the current-> access, you probably want
current_real_cred(), which has been defined for this kind of use.
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH RFC 2/5] X86: Support LSM determination of side-channel vulnerability

2018-08-16 Thread Jann Horn via Selinux
On Thu, Aug 16, 2018 at 11:51 AM Casey Schaufler
 wrote:
>
> From: Casey Schaufler 
>
> When switching between tasks it may be necessary
> to set an indirect branch prediction barrier if the
> tasks are potentially vulnerable to side-channel
> attacks. This adds a call to security_task_safe_sidechannel
> so that security modules can weigh in on the decision.
>
> Signed-off-by: Casey Schaufler 
> ---
>  arch/x86/mm/tlb.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 6eb1f34c3c85..8714d4af06aa 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -7,6 +7,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -270,11 +271,14 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct 
> mm_struct *next,
>  * threads. It will also not flush if we switch to idle
>  * thread and back to the same process. It will flush if we
>  * switch to a different non-dumpable process.
> +* If a security module thinks that the transition
> +* is unsafe do the flush.
>  */
> -   if (tsk && tsk->mm &&
> -   tsk->mm->context.ctx_id != last_ctx_id &&
> -   get_dumpable(tsk->mm) != SUID_DUMP_USER)
> -   indirect_branch_prediction_barrier();
> +   if (tsk && tsk->mm && tsk->mm->context.ctx_id != last_ctx_id) 
> {
> +   if (get_dumpable(tsk->mm) != SUID_DUMP_USER ||
> +   security_task_safe_sidechannel(tsk) != 0)
> +   indirect_branch_prediction_barrier();
> +   }

Does this enforce transitivity? What happens if we first switch from
an attacker task to a task without ->mm, and immediately afterwards
from the task without ->mm to a victim task? In that case, whether a
flush happens between the attacker task and the victim task depends on
whether the LSM thinks that the mm-less task should have access to the
victim task, right?
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH] selinux: refactor mls_context_to_sid() and make it stricter

2018-08-13 Thread Jann Horn via Selinux
On Thu, Aug 9, 2018 at 4:07 AM Paul Moore  wrote:
>
> On Wed, Aug 8, 2018 at 9:56 PM Paul Moore  wrote:
> >
> > On Mon, Aug 6, 2018 at 5:19 PM Jann Horn  wrote:
> > >
> > > The intended behavior change for this patch is to reject any MLS strings
> > > that contain (trailing) garbage if p->mls_enabled is true.
> > >
> > > As suggested by Paul Moore, change mls_context_to_sid() so that the two
> > > parts of the range are extracted before the rest of the parsing. Because
> > > now we don't have to scan for two different separators simultaneously
> > > everywhere, we can actually switch to strchr() everywhere instead of the
> > > open-coded loops that scan for two separators at once.
> > >
> > > mls_context_to_sid() used to signal how much of the input string was 
> > > parsed
> > > by updating `*scontext`. However, there is actually no case in which
> > > mls_context_to_sid() only parses a subset of the input and still returns
> > > a success (other than the buggy case with a second '-' in which it
> > > incorrectly claims to have consumed the entire string). Turn `scontext`
> > > into a simple pointer argument and stop redundantly checking whether the
> > > entire input was consumed in string_to_context_struct(). This also lets us
> > > remove the `scontext_len` argument from `string_to_context_struct()`.
> > >
> > > Signed-off-by: Jann Horn 
> > > ---
> > > Refactored version of
> > > "[PATCH] selinux: stricter parsing in mls_context_to_sid()" based on
> > > Paul's comments. WDYT?
> > > I've thrown some inputs at it, and it seems to work.
> > >
> > >  security/selinux/ss/mls.c  | 178 ++---
> > >  security/selinux/ss/mls.h  |   2 +-
> > >  security/selinux/ss/services.c |  12 +--
> > >  3 files changed, 82 insertions(+), 110 deletions(-)
> >
> > Thanks for the rework, this looks much better than what we currently
> > have.  I have some comments/questions below ...
> >
> > > diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c
> > > index 39475fb455bc..2fe459df3c85 100644
> > > --- a/security/selinux/ss/mls.c
> > > +++ b/security/selinux/ss/mls.c
> > > @@ -218,9 +218,7 @@ int mls_context_isvalid(struct policydb *p, struct 
> > > context *c)
> > >  /*
> > >   * Set the MLS fields in the security context structure
> > >   * `context' based on the string representation in
> > > - * the string `*scontext'.  Update `*scontext' to
> > > - * point to the end of the string representation of
> > > - * the MLS fields.
> > > + * the string `scontext'.
> > >   *
> > >   * This function modifies the string in place, inserting
> > >   * NULL characters to terminate the MLS fields.
> > > @@ -235,22 +233,21 @@ int mls_context_isvalid(struct policydb *p, struct 
> > > context *c)
> > >   */
> > >  int mls_context_to_sid(struct policydb *pol,
> > >char oldc,
> > > -  char **scontext,
> > > +  char *scontext,
> > >struct context *context,
> > >struct sidtab *s,
> > >u32 def_sid)
> > >  {
> > > -
> > > -   char delim;
> > > -   char *scontextp, *p, *rngptr;
> > > +   char *sensitivity, *cur_cat, *next_cat, *rngptr;
> > > struct level_datum *levdatum;
> > > struct cat_datum *catdatum, *rngdatum;
> > > -   int l, rc = -EINVAL;
> > > +   int l, rc, i;
> > > +   char *rangep[2];
> > >
> > > if (!pol->mls_enabled) {
> > > -   if (def_sid != SECSID_NULL && oldc)
> > > -   *scontext += strlen(*scontext) + 1;
> > > -   return 0;
> > > +   if ((def_sid != SECSID_NULL && oldc) || (*scontext) == 
> > > '\0')
> > > +   return 0;
> > > +   return -EINVAL;
> >
> > Why are we simply not always return 0 in the case where MLS is not
> > enabled in the policy?  The mls_context_to_sid() is pretty much a
> > no-op in this case (even more so with your pat regardless of input and
> > I worry that returning EINVAL here is a deviation from current
> > behavior which could cause problems.
>
> Sorry, I was rephrasing the text above when I accidentally hit send.
> While my emails are generally a good source of typos, the above is
> pretty bad, so let me try again ...
>
> Why are we simply not always returning 0 in the case where MLS is not
> enabled in the policy?  The mls_context_to_sid() function is pretty
> much a no-op in this case regardless of input and I worry that
> returning EINVAL here is a deviation from current behavior which could
> cause problems.

Reverse call graph of mls_context_to_sid():

mls_context_to_sid()
mls_from_string()
selinux_audit_rule_init()
[...]
string_to_context_struct()
security_context_to_sid_core()
[...]
convert_context()
[...]

string_to_context_struct() currently has the following code:

rc = mls_context_to_sid(pol, oldc, , 

[PATCH] selinux: refactor mls_context_to_sid() and make it stricter

2018-08-07 Thread Jann Horn via Selinux
The intended behavior change for this patch is to reject any MLS strings
that contain (trailing) garbage if p->mls_enabled is true.

As suggested by Paul Moore, change mls_context_to_sid() so that the two
parts of the range are extracted before the rest of the parsing. Because
now we don't have to scan for two different separators simultaneously
everywhere, we can actually switch to strchr() everywhere instead of the
open-coded loops that scan for two separators at once.

mls_context_to_sid() used to signal how much of the input string was parsed
by updating `*scontext`. However, there is actually no case in which
mls_context_to_sid() only parses a subset of the input and still returns
a success (other than the buggy case with a second '-' in which it
incorrectly claims to have consumed the entire string). Turn `scontext`
into a simple pointer argument and stop redundantly checking whether the
entire input was consumed in string_to_context_struct(). This also lets us
remove the `scontext_len` argument from `string_to_context_struct()`.

Signed-off-by: Jann Horn 
---
Refactored version of
"[PATCH] selinux: stricter parsing in mls_context_to_sid()" based on
Paul's comments. WDYT?
I've thrown some inputs at it, and it seems to work.

 security/selinux/ss/mls.c  | 178 ++---
 security/selinux/ss/mls.h  |   2 +-
 security/selinux/ss/services.c |  12 +--
 3 files changed, 82 insertions(+), 110 deletions(-)

diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c
index 39475fb455bc..2fe459df3c85 100644
--- a/security/selinux/ss/mls.c
+++ b/security/selinux/ss/mls.c
@@ -218,9 +218,7 @@ int mls_context_isvalid(struct policydb *p, struct context 
*c)
 /*
  * Set the MLS fields in the security context structure
  * `context' based on the string representation in
- * the string `*scontext'.  Update `*scontext' to
- * point to the end of the string representation of
- * the MLS fields.
+ * the string `scontext'.
  *
  * This function modifies the string in place, inserting
  * NULL characters to terminate the MLS fields.
@@ -235,22 +233,21 @@ int mls_context_isvalid(struct policydb *p, struct 
context *c)
  */
 int mls_context_to_sid(struct policydb *pol,
   char oldc,
-  char **scontext,
+  char *scontext,
   struct context *context,
   struct sidtab *s,
   u32 def_sid)
 {
-
-   char delim;
-   char *scontextp, *p, *rngptr;
+   char *sensitivity, *cur_cat, *next_cat, *rngptr;
struct level_datum *levdatum;
struct cat_datum *catdatum, *rngdatum;
-   int l, rc = -EINVAL;
+   int l, rc, i;
+   char *rangep[2];
 
if (!pol->mls_enabled) {
-   if (def_sid != SECSID_NULL && oldc)
-   *scontext += strlen(*scontext) + 1;
-   return 0;
+   if ((def_sid != SECSID_NULL && oldc) || (*scontext) == '\0')
+   return 0;
+   return -EINVAL;
}
 
/*
@@ -261,113 +258,94 @@ int mls_context_to_sid(struct policydb *pol,
struct context *defcon;
 
if (def_sid == SECSID_NULL)
-   goto out;
+   return -EINVAL;
 
defcon = sidtab_search(s, def_sid);
if (!defcon)
-   goto out;
+   return -EINVAL;
 
-   rc = mls_context_cpy(context, defcon);
-   goto out;
+   return mls_context_cpy(context, defcon);
}
 
-   /* Extract low sensitivity. */
-   scontextp = p = *scontext;
-   while (*p && *p != ':' && *p != '-')
-   p++;
-
-   delim = *p;
-   if (delim != '\0')
-   *p++ = '\0';
+   /*
+* If we're dealing with a range, figure out where the two parts
+* of the range begin.
+*/
+   rangep[0] = scontext;
+   rangep[1] = strchr(scontext, '-');
+   if (rangep[1]) {
+   rangep[1][0] = '\0';
+   rangep[1]++;
+   }
 
+   /* For each part of the range: */
for (l = 0; l < 2; l++) {
-   levdatum = hashtab_search(pol->p_levels.table, scontextp);
-   if (!levdatum) {
-   rc = -EINVAL;
-   goto out;
-   }
+   /* Split sensitivity and category set. */
+   sensitivity = rangep[l];
+   if (sensitivity == NULL)
+   break;
+   next_cat = strchr(sensitivity, ':');
+   if (next_cat)
+   *(next_cat++) = '\0';
 
+   /* Parse sensitivity. */
+   levdatum = hashtab_search(pol->p_levels.table, sensitivity);
+   if (!levdatum)
+   return -EINVAL;
context->range.level[l].sens = levdatum->level->sens;
 
-   if 

Re: [PATCH] selinux: stricter parsing in mls_context_to_sid()

2018-08-06 Thread Jann Horn via Selinux
On Sat, Aug 4, 2018 at 2:01 AM Paul Moore  wrote:
>
> On Fri, Aug 3, 2018 at 5:36 AM Jann Horn  wrote:
> >
> > mls_context_to_sid incorrectly accepted MLS context strings that are
> > followed by a dash and trailing garbage.
> >
> > Before this change, the following command works:
> >
> > # mount -t tmpfs -o 'context=system_u:object_r:tmp_t:s0-s0:c0-BLAH' \
> > none mount
> >
> > After this change, it fails with the following error message in dmesg:
> >
> > SELinux: security_context_str_to_sid(system_u:object_r:tmp_t:s0-s0:c0-BLAH)
> > failed for (dev tmpfs, type tmpfs) errno=-22
> >
> > This is not an important bug; but it is a small quirk that was useful for
> > exploiting a vulnerability in fusermount.
> >
> > This patch does not change the behavior when the policy does not have MLS
> > enabled.
> >
> > Signed-off-by: Jann Horn 
> > ---
> >  security/selinux/ss/mls.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Ooof.  mls_context_to_sid() isn't exactly the most elegant function is
> it?  I suppose some of that is due to the label format, but it still
> seems like we can do better.  However, before I jump into that let me
> say that speaking strictly about your patch, yes, it does look correct
> to me.
>
> What I'm wonder is if we rework/reorder some of the parser to remove
> some of the ordering specific (e.g. "l == 0") and reduce some
> redundancy ... be patient with me for a moment ...
>
> The while loops immediately following the "Extract low sensitivity"
> and "Extract high sensitivity" comments are basically the same,
> including NULL'ing the delimiter if necessary, the only difference is
> the additional '-' delimiter in the low sensitivity search.
>
> The only *legal* place for a '-' in the MLS portion of the label is to
> separate the low/high portions.
>
> What if we were to do a quick search for the low/high separator before
> extracting the low sensitivity and stored low/high pointers for later
> use?  e.g.
>
>   rangep[0] = *scontext;
>   rangep[1] = strchr(rangep[0], '-');
>   if (rangep[1])
> rangep[1]++ = '\0';
>
> ... we could then move the "Extract X sensitivity" into the main for
> loop as well remove all of the '-' special case parsing checks, e.g.
>
>   for (l = 0; l < 2; l++) {
>
> scontextp = rangep[l];
> if (!scontextp)
>   break;
>
> while (*p && *p != ':')
>   p++;
> delim = *p;
> if (delim != '\0')
>   *p++ = '\0';
>
> /* extract the level (use existing code) */
>
> /* extract the category set, if present (use existing code) */
>
> /* no need to worry about the '-' delimiter */
>
>   }
>
> I *believe* that should work.  I think.  Does that make sense?

I'm fiddling with the code a bit now - your suggestion sounds good to
me, and I think there are a couple other small tweaks that can make
the code more readable. I hope I'll have a patch ready soon.
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


[PATCH] selinux: stricter parsing in mls_context_to_sid()

2018-08-03 Thread Jann Horn via Selinux
mls_context_to_sid incorrectly accepted MLS context strings that are
followed by a dash and trailing garbage.

Before this change, the following command works:

# mount -t tmpfs -o 'context=system_u:object_r:tmp_t:s0-s0:c0-BLAH' \
none mount

After this change, it fails with the following error message in dmesg:

SELinux: security_context_str_to_sid(system_u:object_r:tmp_t:s0-s0:c0-BLAH)
failed for (dev tmpfs, type tmpfs) errno=-22

This is not an important bug; but it is a small quirk that was useful for
exploiting a vulnerability in fusermount.

This patch does not change the behavior when the policy does not have MLS
enabled.

Signed-off-by: Jann Horn 
---
 security/selinux/ss/mls.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c
index 39475fb455bc..2c73d612d2ee 100644
--- a/security/selinux/ss/mls.c
+++ b/security/selinux/ss/mls.c
@@ -344,7 +344,7 @@ int mls_context_to_sid(struct policydb *pol,
break;
}
}
-   if (delim == '-') {
+   if (delim == '-' && l == 0) {
/* Extract high sensitivity. */
scontextp = p;
while (*p && *p != ':')
-- 
2.18.0.597.ga71716f1ad-goog

___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH] selinux: move user accesses in selinuxfs out of locked regions

2018-07-02 Thread Jann Horn via Selinux
On Fri, Jun 29, 2018 at 2:38 AM Paul Moore  wrote:
>
> On Thu, Jun 28, 2018 at 8:23 PM Paul Moore  wrote:
> > On Tue, Jun 26, 2018 at 8:15 AM Stephen Smalley  wrote:
> > > On 06/25/2018 12:34 PM, Jann Horn wrote:
> > > > If a user is accessing a file in selinuxfs with a pointer to a userspace
> > > > buffer that is backed by e.g. a userfaultfd, the userspace access can
> > > > stall indefinitely, which can block fsi->mutex if it is held.
> > > >
> > > > For sel_read_policy(), remove the locking, since this method doesn't 
> > > > seem
> > > > to access anything that requires locking.
> > > >
> > > > For sel_read_bool(), move the user access below the locked region.
> > > >
> > > > For sel_write_bool() and sel_commit_bools_write(), move the user access
> > > > up above the locked region.
> > > >
> > > > Cc: sta...@vger.kernel.org
> > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > > Signed-off-by: Jann Horn 
> > >
> > > Only question I have is wrt the Fixes line, i.e. was this an issue until 
> > > userfaultfd was introduced, and if not,
> > > do we need it to be back-ported any further than the commit which 
> > > introduced it.
> >
> > Considering we are talking about v2.6.12 I have to wonder if anyone is
> > bothering with backports for kernels that old.  Even the RHEL-5.x
> > based systems are at least on v2.6.18.
> >
> > Regardless, I think this is fine to merge as-is; thanks everyone.
>
> FYI, I did have to remove the "fsi" variable from sel_read_policy() to
> keep the compiler happy.  Please double check to make sure your code
> compiles cleanly in the future.

Oof, don't know how I missed that. Sorry, I'll be more careful.
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


Re: [PATCH] selinux: move user accesses in selinuxfs out of locked regions

2018-06-26 Thread Jann Horn via Selinux
On Tue, Jun 26, 2018 at 2:15 PM Stephen Smalley  wrote:
>
> On 06/25/2018 12:34 PM, Jann Horn wrote:
> > If a user is accessing a file in selinuxfs with a pointer to a userspace
> > buffer that is backed by e.g. a userfaultfd, the userspace access can
> > stall indefinitely, which can block fsi->mutex if it is held.
> >
> > For sel_read_policy(), remove the locking, since this method doesn't seem
> > to access anything that requires locking.
> >
> > For sel_read_bool(), move the user access below the locked region.
> >
> > For sel_write_bool() and sel_commit_bools_write(), move the user access
> > up above the locked region.
> >
> > Cc: sta...@vger.kernel.org
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Signed-off-by: Jann Horn 
>
> Only question I have is wrt the Fixes line, i.e. was this an issue until 
> userfaultfd was introduced, and if not,
> do we need it to be back-ported any further than the commit which introduced 
> it.

You can also use FUSE, if the system is configured appropriately:
Mount a FUSE filesystem, mmap() a file from it, then pass a pointer to
the mmapped region to a syscall. AFAICS FUSE was added to the kernel
in commit d8a5ba45457e4a22aa39c939121efd7bb6c76672, first in
v2.6.16.28.

> Otherwise, you can add my
> Acked-by: Stephen Smalley 

This patch should go through Paul Moore's tree, right?

> > ---
> >  security/selinux/selinuxfs.c | 77 
> >  1 file changed, 33 insertions(+), 44 deletions(-)
> >
> > diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
> > index f3d374d2ca04..065f8cea84e3 100644
> > --- a/security/selinux/selinuxfs.c
> > +++ b/security/selinux/selinuxfs.c
> > @@ -445,18 +445,13 @@ static ssize_t sel_read_policy(struct file *filp, 
> > char __user *buf,
> >   struct policy_load_memory *plm = filp->private_data;
> >   int ret;
> >
> > - mutex_lock(>mutex);
> > -
> >   ret = avc_has_perm(_state,
> >  current_sid(), SECINITSID_SECURITY,
> > SECCLASS_SECURITY, SECURITY__READ_POLICY, NULL);
> >   if (ret)
> > - goto out;
> > + return ret;
> >
> > - ret = simple_read_from_buffer(buf, count, ppos, plm->data, plm->len);
> > -out:
> > - mutex_unlock(>mutex);
> > - return ret;
> > + return simple_read_from_buffer(buf, count, ppos, plm->data, plm->len);
> >  }
> >
> >  static vm_fault_t sel_mmap_policy_fault(struct vm_fault *vmf)
> > @@ -1188,25 +1183,29 @@ static ssize_t sel_read_bool(struct file *filep, 
> > char __user *buf,
> >   ret = -EINVAL;
> >   if (index >= fsi->bool_num || strcmp(name,
> >fsi->bool_pending_names[index]))
> > - goto out;
> > + goto out_unlock;
> >
> >   ret = -ENOMEM;
> >   page = (char *)get_zeroed_page(GFP_KERNEL);
> >   if (!page)
> > - goto out;
> > + goto out_unlock;
> >
> >   cur_enforcing = security_get_bool_value(fsi->state, index);
> >   if (cur_enforcing < 0) {
> >   ret = cur_enforcing;
> > - goto out;
> > + goto out_unlock;
> >   }
> >   length = scnprintf(page, PAGE_SIZE, "%d %d", cur_enforcing,
> > fsi->bool_pending_values[index]);
> > - ret = simple_read_from_buffer(buf, count, ppos, page, length);
> > -out:
> >   mutex_unlock(>mutex);
> > + ret = simple_read_from_buffer(buf, count, ppos, page, length);
> > +out_free:
> >   free_page((unsigned long)page);
> >   return ret;
> > +
> > +out_unlock:
> > + mutex_unlock(>mutex);
> > + goto out_free;
> >  }
> >
> >  static ssize_t sel_write_bool(struct file *filep, const char __user *buf,
> > @@ -1219,6 +1218,17 @@ static ssize_t sel_write_bool(struct file *filep, 
> > const char __user *buf,
> >   unsigned index = file_inode(filep)->i_ino & SEL_INO_MASK;
> >   const char *name = filep->f_path.dentry->d_name.name;
> >
> > + if (count >= PAGE_SIZE)
> > + return -ENOMEM;
> > +
> > + /* No partial writes. */
> > + if (*ppos != 0)
> > + return -EINVAL;
> > +
> > + page = memdup_user_nul(buf, count);
> > + if (IS_ERR(page))
> > + return PTR_ERR(page);
> > +
> >   mutex_lock(>mutex);
> >
> >   length = avc_has_perm(_state,
> > @@ -1233,22 +1243,6 @@ static ssize_t sel_write_bool(struct file *filep, 
> > const char __user *buf,
> >fsi->bool_pending_names[index]))
> >   goto out;
> >
> > - length = -ENOMEM;
> > - if (count >= PAGE_SIZE)
> > - goto out;
> > -
> > - /* No partial writes. */
> > - length = -EINVAL;
> > - if (*ppos != 0)
> > - goto out;
> > -
> > - page = memdup_user_nul(buf, count);
> > - if (IS_ERR(page)) {
> > - length = PTR_ERR(page);
> > - page = NULL;
> > - goto out;
> > - }
> > -
> >   

Re: [PATCH] selinux: move user accesses in selinuxfs out of locked regions

2018-06-26 Thread Jann Horn via Selinux
On Tue, Jun 26, 2018 at 12:36 AM Paul Moore  wrote:
>
> On Mon, Jun 25, 2018 at 12:34 PM Jann Horn  wrote:
> > If a user is accessing a file in selinuxfs with a pointer to a userspace
> > buffer that is backed by e.g. a userfaultfd, the userspace access can
> > stall indefinitely, which can block fsi->mutex if it is held.
> >
> > For sel_read_policy(), remove the locking, since this method doesn't seem
> > to access anything that requires locking.
>
> Forgive me, I'm thinking about this quickly so I could be very wrong
> here, but isn't the mutex needed to prevent problems in multi-threaded
> apps hitting the same fd at the same time?

sel_read_policy() operates on a read-only copy of the policy, accessed
via ->private_data, allocated using vmalloc in sel_open_policy() via
security_read_policy(). As far as I can tell, nothing can write to
that read-only copy of the policy. None of the handlers in
sel_policy_ops write - they just mmap as readonly (in which case
you're already reading without locks, by the way) or read.
___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.


[PATCH] selinux: move user accesses in selinuxfs out of locked regions

2018-06-25 Thread Jann Horn via Selinux
If a user is accessing a file in selinuxfs with a pointer to a userspace
buffer that is backed by e.g. a userfaultfd, the userspace access can
stall indefinitely, which can block fsi->mutex if it is held.

For sel_read_policy(), remove the locking, since this method doesn't seem
to access anything that requires locking.

For sel_read_bool(), move the user access below the locked region.

For sel_write_bool() and sel_commit_bools_write(), move the user access
up above the locked region.

Cc: sta...@vger.kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jann Horn 
---
 security/selinux/selinuxfs.c | 77 
 1 file changed, 33 insertions(+), 44 deletions(-)

diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index f3d374d2ca04..065f8cea84e3 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -445,18 +445,13 @@ static ssize_t sel_read_policy(struct file *filp, char 
__user *buf,
struct policy_load_memory *plm = filp->private_data;
int ret;
 
-   mutex_lock(>mutex);
-
ret = avc_has_perm(_state,
   current_sid(), SECINITSID_SECURITY,
  SECCLASS_SECURITY, SECURITY__READ_POLICY, NULL);
if (ret)
-   goto out;
+   return ret;
 
-   ret = simple_read_from_buffer(buf, count, ppos, plm->data, plm->len);
-out:
-   mutex_unlock(>mutex);
-   return ret;
+   return simple_read_from_buffer(buf, count, ppos, plm->data, plm->len);
 }
 
 static vm_fault_t sel_mmap_policy_fault(struct vm_fault *vmf)
@@ -1188,25 +1183,29 @@ static ssize_t sel_read_bool(struct file *filep, char 
__user *buf,
ret = -EINVAL;
if (index >= fsi->bool_num || strcmp(name,
 fsi->bool_pending_names[index]))
-   goto out;
+   goto out_unlock;
 
ret = -ENOMEM;
page = (char *)get_zeroed_page(GFP_KERNEL);
if (!page)
-   goto out;
+   goto out_unlock;
 
cur_enforcing = security_get_bool_value(fsi->state, index);
if (cur_enforcing < 0) {
ret = cur_enforcing;
-   goto out;
+   goto out_unlock;
}
length = scnprintf(page, PAGE_SIZE, "%d %d", cur_enforcing,
  fsi->bool_pending_values[index]);
-   ret = simple_read_from_buffer(buf, count, ppos, page, length);
-out:
mutex_unlock(>mutex);
+   ret = simple_read_from_buffer(buf, count, ppos, page, length);
+out_free:
free_page((unsigned long)page);
return ret;
+
+out_unlock:
+   mutex_unlock(>mutex);
+   goto out_free;
 }
 
 static ssize_t sel_write_bool(struct file *filep, const char __user *buf,
@@ -1219,6 +1218,17 @@ static ssize_t sel_write_bool(struct file *filep, const 
char __user *buf,
unsigned index = file_inode(filep)->i_ino & SEL_INO_MASK;
const char *name = filep->f_path.dentry->d_name.name;
 
+   if (count >= PAGE_SIZE)
+   return -ENOMEM;
+
+   /* No partial writes. */
+   if (*ppos != 0)
+   return -EINVAL;
+
+   page = memdup_user_nul(buf, count);
+   if (IS_ERR(page))
+   return PTR_ERR(page);
+
mutex_lock(>mutex);
 
length = avc_has_perm(_state,
@@ -1233,22 +1243,6 @@ static ssize_t sel_write_bool(struct file *filep, const 
char __user *buf,
 fsi->bool_pending_names[index]))
goto out;
 
-   length = -ENOMEM;
-   if (count >= PAGE_SIZE)
-   goto out;
-
-   /* No partial writes. */
-   length = -EINVAL;
-   if (*ppos != 0)
-   goto out;
-
-   page = memdup_user_nul(buf, count);
-   if (IS_ERR(page)) {
-   length = PTR_ERR(page);
-   page = NULL;
-   goto out;
-   }
-
length = -EINVAL;
if (sscanf(page, "%d", _value) != 1)
goto out;
@@ -1280,6 +1274,17 @@ static ssize_t sel_commit_bools_write(struct file *filep,
ssize_t length;
int new_value;
 
+   if (count >= PAGE_SIZE)
+   return -ENOMEM;
+
+   /* No partial writes. */
+   if (*ppos != 0)
+   return -EINVAL;
+
+   page = memdup_user_nul(buf, count);
+   if (IS_ERR(page))
+   return PTR_ERR(page);
+
mutex_lock(>mutex);
 
length = avc_has_perm(_state,
@@ -1289,22 +1294,6 @@ static ssize_t sel_commit_bools_write(struct file *filep,
if (length)
goto out;
 
-   length = -ENOMEM;
-   if (count >= PAGE_SIZE)
-   goto out;
-
-   /* No partial writes. */
-   length = -EINVAL;
-   if (*ppos != 0)
-   goto out;
-
-   page = memdup_user_nul(buf, count);
-   if (IS_ERR(page)) {
-   length = PTR_ERR(page);
-   page = NULL;
-