On 01/09/2014 03:55 PM, Victor Porton wrote:
> In Fedora there is bin/sandbox command which runs a specified command in so
> called 'sandbox'. Program running in sandbox cannot open new files (it is
> commonly used with preopen stdin and stdout) and possibly its access to
> network is limited. It is intended to run potentially malicious software
> safely.
>
> This Fedora sandbox is not perfect however.
>
> One problem is:
>
> Suppose the sandboxed program spawned some child processes and exited itself.
>
> Suppose we want to kill the sandboxed program after 30 second, if it has not
> exited voluntarily.
>
> The trouble is that the software cannot figure out which processes have
> appeared from the sandboxed binary. So we are unable to kill these processes
> automatically. This means that a hacker can in this way create thousands (or
> more) processes which would overload the system.
>
> Also note that the sandboxed program may run setsid() and thus its identity
> may be lost completely.
>
> I propose to add parameter sandbox_id to each process in the kernel. It would
> be 0 for normal processes and allocated like PID or GID for processes we
> create in sandbox. Children inherit sandbox_id. There should be an API call
> using which a process makes it sandboxed_id non-zero (which returns EPERM if
> it is already non-zero).
>
> Then there should be API to enumerate all processes with given sandbox_id, so
> that we would be able to kill them (-TERM or -KILL). Or maybe we should also
> have the function which sends the given signal to all processes with given
> sandbox_id (otherwise we would war with a hacker which could possibly create
> new children faster than we kill them).

I think you need to think bigger :)

I've occasionally pondered how to do real tracking of process trees
(sandbox could use it, but I was thinking of systemd and other service
managers). cgroups* suck for this purpose.

One approach would be to have another subreaper mode (subreaper mode
2) that does three things:

 - Subreaper mode 2 zombies do not send SIGCHLD and cannot be reaped
   until they have no descendents left.
 - Direct zombie children of subreaper mode 2 zombies are
   automatically reaped.
 - Descendents that need to be reparented are reparented to the
   subreaper, just like in subreaper mode 1.

Then you'd add an API that takes the PID of a mode 2 subreaper and
kills its entire process subtree. (Optionally, tgkill could do that
automatically.)

To use this for sandbox, sandbox would set subreaper mode 2 and then
fork. The initial sandbox process would exit and the child would exec
into the sandbox. The parent would stick around as a zombie until the
whole tree went away.

To use this for an init-like program, the service manager would
fork/clone a dummy PID, set subreaper mode 2, fork again, and exec the
service. That dummy PID would serve as a persistent reference to the
subtree.

For added fun, there should be a way to efficiently find the mode 2
subreaper that owns a given pid/tid. That way systemd / journald
could map PIDs to service names without mucking with cgroups.

An alternative formulation of more or less the same thing would be a
syscall manage_pid_subtree(pid_t pid) that does, roughly:

  if (pid->real_parent != current)
    return -EINVAL;
  set subreaper mode;
  exit current mm, signal set, etc to conserve resources;
  /* at this point, current is essentially a kernel thread. */
  wait for pid to exit;
  exit, copying pid's return code and other exit siginfo state;

To manage a subreaper, you double-fork, and then the middle process
would call manage_pid_subtree on its child.

Thoughts?

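For comparison, here is a rough sketch of what the existing subreaper mode ("mode 1" above) already provides, using only prctl(PR_SET_CHILD_SUBREAPER) and wait(). The function name reap_orphans_as_subreaper and the exit code 42 are made up for illustration. Nothing in it implements the proposed mode 2 or manage_pid_subtree(), neither of which exists. It only shows that an orphaned grandchild is reparented to the subreaper and can be reaped there.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not an existing API): mark the caller as a child
 * subreaper, double-fork so the grandchild is orphaned, then reap everything.
 * Returns the number of processes reaped, or -1 on error.  *orphan_status
 * receives the exit code of the reaped process that was not a direct child.
 */
int reap_orphans_as_subreaper(int *orphan_status)
{
    assert(orphan_status != NULL);

    /* Existing "mode 1": orphaned descendants are reparented to us. */
    if (prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0) == -1)
        return -1;

    pid_t middle = fork();
    if (middle == -1)
        return -1;

    if (middle == 0) {
        /* Middle process: spawn a grandchild, then exit right away. */
        pid_t grandchild = fork();
        if (grandchild == 0) {
            usleep(100 * 1000);   /* outlive the middle process */
            _exit(42);            /* arbitrary marker exit code */
        }
        _exit(grandchild == -1 ? 1 : 0);
    }

    int reaped = 0;
    for (;;) {
        int status;
        pid_t pid = wait(&status);
        if (pid == -1) {
            if (errno == EINTR)
                continue;
            break;                /* ECHILD: no descendants left */
        }
        reaped++;
        /* Anything other than the middle process reached us by reparenting. */
        if (pid != middle && WIFEXITED(status))
            *orphan_status = WEXITSTATUS(status);
    }

    /* Restore the default so the caller's state is left unchanged. */
    prctl(PR_SET_CHILD_SUBREAPER, 0, 0, 0, 0);
    return reaped;
}
```

In this sketch the subreaper has to stay alive and keep calling wait() itself. The point of mode 2 as described above is that a zombie could play that role instead, so the reference to the subtree would persist without a live supervisor.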
* Goddamnit, systemd, I want a way to turn *off* your control of the
One True Cgroup Hierarchy (TM). I consider the lack of such a
mechanism to be a serious upcoming regression. Maybe if the kernel
gives systemd a way to do this, systemd will use it.

--Andy