On Fri, 11.11.16 16:41, Michał Zegan (webczat_...@poczta.onet.pl) wrote:

> Thank you for your answers!
>
> What I meant by secure containers is mostly, containers that are or will
> be secure enough to use them for things like virtual private server
> hosting. Is nspawn intended to be usable for such things in the future,
> or maybe it already is, or whatever?

I run my own server this way already, as an exercise in dogfooding. So,
yes, running a VPS like this certainly works, but do note that nspawn
doesn't do orchestration or anything. It's good enough for me, but if
you need fancy orchestration tools then nspawn won't be sufficient.

> What kernel limitations do you mean when you say about security?

Well, a lot of subsystems cannot be locked down properly for use in
containers yet. You can lock down a lot, in particular if you use
userns, but there are still a lot of holes in there, and userns itself
in particular has been a major source of CVEs in the most recent
kernels.

Right now, "containers" in general are not about security. Some
companies claim they are secure, but they really aren't. And that's not
a bug in nspawn, or docker, or lxc for that matter, it's simply a
limitation of the kernel.

Or to say this differently: we'll do in nspawn everything we can to
lock things down properly, but there are limits based on what the
kernel provides... As the kernel gets improved in this area, we'll
update nspawn to make use of it. We are sitting in the same boat in
this regard as other container managers, and they have more or less
the same limits we have.

> For now I know that in full containers with userns file capabilities do
> not work (I think), you have no virtualized /proc/meminfo and friends
> (do cgroup namespaces give a chance to change that?), you cannot mknod
> devices (no whitelist possible at this level), no fuse support, no
> automatic uid shifting kernel level, no possibility to mount physical
> filesystems in userns, and no possibility to have selinux/etc per
> container. Do you mean such limitations or something else?

Well, devices are not virtualized at all (with the exception of network
devices), that means no udev, no hotplug events and so on.
Some container managers ignore this, and provide access to selected
device nodes anyway, but we don't do anything like that in nspawn,
since it's pretty broken (as /sys wouldn't match what you see in /dev).
In general, I think people should just accept that containers mean "you
don't get physical device access". And if you want physical device
access, then don't use containers...

> I am interested in this topic but it is quite hard for me to track
> progress in that area (kernel side) even though I subscribe in some
> kernel ml's and know at least about submitted patches, or some of
> them. What else is missing that I didn't say about that would be
> good to have?

Well, a lot of stuff is still not properly virtualized. What comes to
mind: audit, autofs, keyring, cgroups, …

> Also what about setting cgroup parameters per container? nspawn does not
> allow doing that, and you probably do not intent it to be done by
> overriding container's scope unit settings, for example?

You can actually do that just fine. Simply set it in the nspawn service
file. Or, if you run nspawn from the command line, use the
"--property=" switch. Or make your changes dynamically via "systemctl
set-property". It's all supported and works well.

Lennart

-- 
Lennart Poettering, Red Hat
_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
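
For illustration only, the three ways of setting per-container cgroup
parameters mentioned in the reply might look roughly like the sketch
below. Everything specific in it is an assumption and not from the
original mail: the machine name "mycontainer", the drop-in path, and
the chosen properties and values. Which resource-control properties are
available depends on the systemd version and the cgroup hierarchy in
use. The long form "--property=" is used for the command-line case
because "-p" is the short form of "--port=" in systemd-nspawn.

```ini
# Hypothetical drop-in file (placeholder machine name "mycontainer"):
#   /etc/systemd/system/systemd-nspawn@mycontainer.service.d/limits.conf
# The values are arbitrary examples, not recommendations.
[Service]
MemoryMax=2G
TasksMax=512

# Rough equivalents for the other two approaches (same placeholders):
#
#   Running nspawn directly from the command line:
#     systemd-nspawn -D /var/lib/machines/mycontainer -b --property=MemoryMax=2G
#
#   Changing a running container's service dynamically:
#     systemctl set-property systemd-nspawn@mycontainer.service MemoryMax=2G
```

A drop-in like this would normally take effect after "systemctl
daemon-reload" and a restart of the container's service, whereas
"systemctl set-property" applies to the running unit right away.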