On Thu, Sep 10, 2015 at 08=54=28PM -0400, namn...@safe-mail.net wrote: > > The current intention of the seccomp filter in QEMU, is that /all/ existing > > QEMU features continue to work unchanged. So even if a flag is used in a > > seemingly uncommon code path, we still need to allow that in a seccomp > > filter. > It already doesn't work very well, e.g. with -chroot, it fails because > chroot() > is not whitelisted, same with -runas because setgid() etc isn't whitelisted. > Maybe there should be an extra option for -sandbox, like "-sandbox > experimental" > which does argument filtering, which of course may break something, and the > old > behavior would do plain syscall filtering without caring about arguments, > because > that's so much easier to guarantee to work, even if it provides little > security.
Can you point out which exact use case breaks if you don't whitelist the below mentioned system calls' flags? > > We could also change the default behavior from SCMP_ACT_KILL (which kills the > entire thing as soon as a single violation occurs) to SCMP_ACT_ERRNO(EPERM), > which > will just return EPERM for a syscall with a violation. The software will be > much > more capable of handling a permission denied error without crashing. Although > of > course that violates the principle of fast-fail. We thought about this in beggining of the development of seccomp on qemu. Some feature like allow all, which would print to stderr all illegal hits and a another argument like -sandbox_add="syscall1,syscall2", but this would be against the concept of the whole security schema. We don't want the user to take full control of it, and if you're a developer, you know what to do. > > > So we need to add DODUMP, DONTDUMP, UNMERGABLE and WILLNEED here. That > > is still stricter than the previous allow-everything rule, so a net > > win. > And MADV_INVALID too I assume? That was one of the others I got with grep. > > > > + > > > + /* shmget */ > > > + rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(shmget), 2, > > > + SCMP_A0(SCMP_CMP_EQ, IPC_PRIVATE), > > > + SCMP_A2(SCMP_CMP_EQ, IP_CREAT|0777)); > > > > I'm not familiar with semantics of these seccomp rules, but is this > > saying that the second arg must be exactly equal to IP_CREAT|0777 ? > > If the app passes IP_CREAT|0600, would that be permitted instead ? > > The latter is what I see gtk2 source code passing for mode. > Argument 2 must be exactly equal to IP_CREAT|0777, yes, otherwise Qemu dies > with > SIGSYS. I did check the Qemu source and saw 0777 was harcoded. Does the 0600 > mask > in GTK2 ever get called in Qemu? Anyway I added the MADV flags and the 0600 > mask > in the v2 patch. > > Signed-off-by: Namsun Ch'o <namn...@safe-mail.net> > --- > diff --git a/qemu-seccomp.c b/qemu-seccomp.c > index f9de0d3..a353ef9 100644 > --- a/qemu-seccomp.c > +++ b/qemu-seccomp.c > @@ -14,6 +14,8 @@ > */ > #include <stdio.h> > #include <seccomp.h> > +#include <linux/ipc.h> > +#include <asm-generic/mman-common.h> > #include "sysemu/seccomp.h" > > struct QemuSeccompSyscall { > @@ -105,7 +107,6 @@ static const struct QemuSeccompSyscall > seccomp_whitelist[] = { > { SCMP_SYS(rt_sigreturn), 245 }, > { SCMP_SYS(sync), 245 }, > { SCMP_SYS(pread64), 245 }, > - { SCMP_SYS(madvise), 245 }, > { SCMP_SYS(set_robust_list), 245 }, > { SCMP_SYS(lseek), 245 }, > { SCMP_SYS(pselect6), 245 }, > @@ -224,11 +225,9 @@ static const struct QemuSeccompSyscall > seccomp_whitelist[] = { > { SCMP_SYS(arch_prctl), 240 }, > { SCMP_SYS(mkdir), 240 }, > { SCMP_SYS(fchmod), 240 }, > - { SCMP_SYS(shmget), 240 }, > { SCMP_SYS(shmat), 240 }, > { SCMP_SYS(shmdt), 240 }, > { SCMP_SYS(timerfd_create), 240 }, > - { SCMP_SYS(shmctl), 240 }, > { SCMP_SYS(mlockall), 240 }, > { SCMP_SYS(mlock), 240 }, > { SCMP_SYS(munlock), 240 }, > @@ -264,6 +263,60 @@ int seccomp_start(void) > } > } > > + /* madvise */ > + static const int madvise_flags[] = { > + MADV_DODUMP, > + MADV_DONTDUMP, > + MADV_INVALID, > + MADV_UNMERGEABLE, > + MADV_WILLNEED, > + MADV_DONTFORK, > + MADV_DONTNEED, > + MADV_HUGEPAGE, > + MADV_MERGEABLE, > + }; > + for (i = 0; i < ARRAY_SIZE(madvise_flags); i++) { > + rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(madvise), 1, > + SCMP_A2(SCMP_CMP_EQ, madvise_flags[i])); > + if (rc < 0) { > + goto seccomp_return; > + } > + } > + rc = seccomp_syscall_priority(ctx, SCMP_SYS(madvise), 245); > + if (rc < 0) { > + goto seccomp_return; > + } > + > + /* shmget */ > + rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(shmget), 2, > + SCMP_A0(SCMP_CMP_EQ, IPC_PRIVATE), > + SCMP_A2(SCMP_CMP_EQ, IP_CREAT|0777)); > + if (rc < 0) { > + goto seccomp_return; > + } > + rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(shmget), 2, > + SCMP_A0(SCMP_CMP_EQ, IPC_PRIVATE), > + SCMP_A2(SCMP_CMP_EQ, IP_CREAT|0600)); Isn't it IPC_CREAT? Or am I missing something? Can you resend a v3 describing the changes you did from v1 to v2 and v3? This helps keep tracking of ideas and discussions. Thanks a lot for the contribution! > + if (rc < 0) { > + goto seccomp_return; > + } > + rc = seccomp_syscall_priority(ctx, SCMP_SYS(shmget), 240); > + if (rc < 0) { > + goto seccomp_return; > + } > + > + /* shmctl */ > + rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(shmctl), 2, > + SCMP_A1(SCMP_CMP_EQ, IPC_RMID), > + SCMP_A2(SCMP_CMP_EQ, 0)); > + if (rc < 0) { > + goto seccomp_return; > + } > + rc = seccomp_syscall_priority(ctx, SCMP_SYS(shmctl), 240); > + if (rc < 0) { > + goto seccomp_return; > + } > + > rc = seccomp_load(ctx); > > seccomp_return: -- Eduardo Otubo ProfitBricks GmbH
signature.asc
Description: Digital signature