On Wed, Nov 29, 2017 at 11:28:52AM -0600, Serge E. Hallyn wrote:
>
> Just to be clear, module loading requires - and must always continue to
> require - CAP_SYS_MODULE against the initial user namespace.  Containers
> in user namespaces do not have that.
>
> I don't believe anyone has ever claimed that containers which are not in
> a user namespace are in any way secure.

Unless the container performs some action which causes the kernel to
call request_module(), which then loads some kernel module, potentially
containing cr*p unmaintained code which was included when the distro
compiled the world, into the host kernel.

This is an attack vector that doesn't exist if you are using VMs.  And
in general, the attack surface of the entire Linux kernel<->userspace
API is far larger than that which is exposed by the guest<->host
interface.

For that reason, containers are *far* less secure than VMs: an attacker
who gets root on the guest VM still has to attack the hypervisor
interface before reaching the host.  And if you compare the attack
surface of the two, it's pretty clear which is larger, and it's not the
hypervisor interface.

- Ted