Jürg Billeter <j...@bitron.ch> writes:

> On Thu, 2017-10-05 at 18:27 +0200, Oleg Nesterov wrote:
>> On 10/03, Jürg Billeter wrote:
>> >
>> > My use case is to provide a way for a process to spawn a child and
>> > ensure that no descendants survive when that child dies. Avoiding
>> > runaway processes is desirable in many situations. My motivation is
>> > very lightweight (nested) sandboxing (every process is potentially
>> > sandboxed).
>> >
>> > I.e., pid namespaces would be a pretty good fit (assuming they are
>> > sufficiently lightweight) but CLONE_NEWPID
>>
>> sorry if this was already discussed, I didn't read this thread yet...
>>
>> if CLONE_NEWPID is not suitable for any reason. We already have
>> PR_SET_CHILD_SUBREAPER. Perhaps we can simply add another
>> PR_SET_KILL_ALL_DESCEDANTS_ON_EXIT? we can use walk_process_tree()
>> to send SIGKILL.
>
> Yes, this is an option. However, after the discussion in this thread I
> believe it would be better to drop the CAP_SYS_ADMIN requirement for
> CLONE_NEWPID (when no_new_privs is set) as this would avoid adding
> another API and code path for a similar effect. I'm interested in
> possible security concerns about such a change. Adding Andy Lutomirski
> to cc.

Absolutely not. no_new_privs does not need the headache of increasing
the kernel attack surface. User namespaces are cheap, use one. Let the
people using no_new_privs sleep easy. We don't need to transform
no_new_privs into a user namespace.

Eric
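
For readers following along, the "use a user namespace" route can be sketched roughly as below. This is only an illustration, not code from this thread: run_contained() is a hypothetical helper name, the unshare(2) call goes through ctypes, and it assumes a Linux system where unprivileged user namespaces are enabled. An unprivileged process creates a user namespace and a PID namespace together, so no CAP_SYS_ADMIN is needed in the parent namespace, and the command runs as PID 1 of the new PID namespace. When that PID 1 exits, the kernel sends SIGKILL to everything still alive in the namespace.

```python
import ctypes
import os

# Flag values from <linux/sched.h>.
CLONE_NEWUSER = 0x10000000
CLONE_NEWPID = 0x20000000

_libc = ctypes.CDLL(None, use_errno=True)


def run_contained(argv):
    """Run argv as PID 1 of a new PID namespace and return its exit code.

    Hypothetical helper for illustration. Raises OSError if the
    namespaces cannot be created (e.g. user namespaces are disabled).
    """
    r, w = os.pipe()  # carries the errno back if unshare() fails
    helper = os.fork()
    if helper == 0:
        os.close(r)
        try:
            # New user namespace first, so the PID namespace needs no
            # privilege in the parent namespace.
            if _libc.unshare(CLONE_NEWUSER | CLONE_NEWPID) != 0:
                os.write(w, str(ctypes.get_errno()).encode())
                os._exit(1)
            os.close(w)
            # The first child forked after unshare(CLONE_NEWPID)
            # becomes PID 1 of the new namespace.
            init = os.fork()
            if init == 0:
                try:
                    os.execvp(argv[0], argv)
                finally:
                    os._exit(127)  # only reached if exec failed
            _, st = os.waitpid(init, 0)
            # Once PID 1 is gone, the kernel kills any descendants
            # that are still running inside the namespace.
            if os.WIFEXITED(st):
                os._exit(os.WEXITSTATUS(st))
            os._exit(128 + os.WTERMSIG(st))
        finally:
            os._exit(126)  # unexpected error in the helper
    os.close(w)
    err = os.read(r, 32)  # empty once the helper has closed its end
    os.close(r)
    _, status = os.waitpid(helper, 0)
    if err:
        code = int(err)
        raise OSError(code, os.strerror(code))
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

Calling run_contained(["sh", "-c", "sleep 1000 & exit 0"]) should return 0 with the background sleep killed rather than left running. No uid/gid mapping is written here, so the command sees itself as the overflow uid; a real sandbox would probably write /proc/<pid>/uid_map and gid_map as well.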