On Mon, Nov 19, 2018 at 1:37 PM Christian Brauner <christ...@brauner.io> wrote:
>
> On Mon, Nov 19, 2018 at 01:26:22PM -0800, Daniel Colascione wrote:
> > On Mon, Nov 19, 2018 at 1:21 PM, Christian Brauner <christ...@brauner.io>
> > wrote:
> > > That can be done without a loop by comparing the level counter for the
> > > two pid namespaces.
> > >
> > >>
> > >> And you can rewrite pidns_get_parent to use it. So you would instead be
> > >> doing:
> > >>
> > >> if (pidns_is_descendant(proc_pid_ns, task_active_pid_ns(current)))
> > >>         return -EPERM;
> > >>
> > >> (Or you can just copy the 5-line loop into procfd_signal -- though I
> > >> imagine we'll need this for all of the procfd_* APIs.)
> >
> > Why is any of this even necessary? Why does the child namespace we're
> > considering even have a file descriptor to its ancestor's procfs? If
>
> Because you can send file descriptors between processes and container
> runtimes tend to do that.

Right. But why *would* a container runtime send one of these procfs
FDs to a container?

> > it has one of these FDs, it can already *read* all sorts of
> > information it really shouldn't be able to acquire, so the additional
> > ability to send a signal (subject to the usual permission checks)
> > feels like sticking a finger in a dike that's already well-perforated.
> > IMHO, we shouldn't bother with this check. The patch would be simpler
> > without it.
>
> We will definitely not allow signaling processes in an ancestor pid
> namespace! That is a security issue! I can imagine container runtimes
> killing their monitoring process etc. pp. Not happening, unless someone
> with deep expertise in signals can convince me otherwise.

If parent namespace procfs FDs or mounts really can leak into child
namespaces as easily as Aleksa says, then I don't mind adding the
check. I was under the impression that if you find yourself in this
situation, you already have a big problem.
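
For reference, the check being debated could look roughly like the
userspace sketch below. It is only an illustration, not code from the
patch: struct pidns is a stand-in for the kernel's struct pid_namespace
that keeps just a level and a parent pointer, and the names
pidns_same_or_descendant() and may_signal_via_procfd() are hypothetical.
The level comparison on its own is enough to reject a target that is an
ancestor of the caller, but two unrelated namespaces can also sit at
different levels, so the sketch still walks up to the caller's level
before comparing.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for struct pid_namespace: only the two fields the
 * check needs. The initial namespace has level 0 and no parent. */
struct pidns {
	unsigned int level;
	const struct pidns *parent;
};

/* Hypothetical helper: true if target is viewer itself or nested below it. */
static bool pidns_same_or_descendant(const struct pidns *target,
				     const struct pidns *viewer)
{
	const struct pidns *ns = target;

	/* A shallower target cannot be nested below viewer. */
	if (target->level < viewer->level)
		return false;

	/* Climb to viewer's level; a sibling subtree ends on another node. */
	while (ns->level > viewer->level)
		ns = ns->parent;

	return ns == viewer;
}

/* Hypothetical permission gate: 0 if allowed, -EPERM otherwise. */
static int may_signal_via_procfd(const struct pidns *proc_ns,
				 const struct pidns *caller_ns)
{
	return pidns_same_or_descendant(proc_ns, caller_ns) ? 0 : -EPERM;
}

/* Self-check on a small tree: init -> a -> b, plus sibling c under init. */
int pidns_self_check(void)
{
	struct pidns init_ns = { 0, NULL };
	struct pidns a = { 1, &init_ns };
	struct pidns b = { 2, &a };
	struct pidns c = { 1, &init_ns };

	assert(may_signal_via_procfd(&b, &a) == 0);            /* descendant */
	assert(may_signal_via_procfd(&a, &a) == 0);            /* same ns */
	assert(may_signal_via_procfd(&init_ns, &b) == -EPERM); /* ancestor */
	assert(may_signal_via_procfd(&c, &a) == -EPERM);       /* sibling */
	return 0;
}
```

In this model a caller in namespace b holding a procfs FD from init_ns
gets -EPERM, which is the "no signaling into an ancestor pid namespace"
behaviour argued for above.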