On Thu, Jul 6, 2023 at 6:05 PM Paulo Coghi - Coghi IT <pauloco...@gmail.com> wrote:
> Hello Systemd Devel team,
>
> I've been using OpenVZ for 11 years in production without the security
> problems I faced with LXC. But as a non-official mainstream library of
> Linux kernel, there is always a gap. Virtuozzo is working on OpenVZ 9 with
> kernel 5.14 now, but it is still not released.
>
> Systemd-nspawn seems promising, and I would like to cordially ask a few
> questions.
>
> 1. Does systemd-nspawn officially support system containers?
> I would like to not conclude it myself, but it seems so, after reading the
> official documentation.
>

Yes, it's mostly what nspawn is designed for.

> 2. The "experience" inside a system container is similar to a VM, like on
> OpenVZ?
> On OpenVZ containers, except for kernel related activities (like adding
> kernel modules), everything is identical to a virtual machine, with the
> "root" user from the container being able to manage everything, like adding
> new users, changing firewall rules, installing multiple services (web
> servers, databases), managing cron jobs, etc.
>

All of that is pretty much implied by "system container".

>
> 3. Security - Can those OS containers be used in production, with multiple
> containers from multiple owners inside the same host?
> On LXC, for example, there are vulnerabilities that can be exploited,
> allowing a container user to escape to the host. On OpenVZ, it seems that
> this was already addressed more than a decade ago.
> Does systemd-nspawn provide such security, not allowing a "container user"
> to escape to the host?
>

Both nspawn and LXC these days use "user namespaces" for isolation (i.e.
container root is no longer the same UID as host root, and each container
is mapped to a unique set of host UIDs as well). LXC calls those
"unprivileged containers" and seems to consider them
<https://linuxcontainers.org/lxc/security/> as safe as the kernel's regular
user separation, so the same would apply to nspawn as well. The
corresponding nspawn option is PrivateUsers=.

>
> 4. Storage and Inodes
> On OpenVZ, we could create "virtualized" file systems, like ploop, which
> avoids consuming inodes on the host's file system, while lightweight enough
> to provide near-native performance.
> Is there any approach to have similar benefits through systemd-nspawn?
>

Nspawn supports running containers off a loop-mounted image, but it has
nothing built in with the same features as ploop. That said, ploop seems to
be a fully separate kernel module (i.e. not strictly part of OpenVZ), so in
theory you could still use it with nspawn. Alternatively, if you don't need
the snapshotting, you could use regular loop devices (which can be
space-efficient with all recent kernels, as they now support TRIM).

Though, "consuming inodes" is only a problem with Ext4, isn't it? Does the
same type of problem even exist on more modern filesystems like XFS or
Btrfs?

-- 
Mantas Mikulėnas
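
P.S. For reference, a minimal sketch of how the user-namespace isolation
mentioned under question 3 might be switched on in a per-container settings
file. The machine name "mycontainer" (and therefore the file path) is
hypothetical. Boot= and PrivateUsers= are settings described in
systemd.nspawn(5), and the value "pick" asks nspawn to choose an unused
host UID/GID range for the container.

```ini
# /etc/systemd/nspawn/mycontainer.nspawn  (hypothetical machine name)
[Exec]
# Start the container's own init, i.e. run it as a "system container"
Boot=yes
# Map container UIDs/GIDs onto a private, automatically picked host range
PrivateUsers=pick
```

On the command line the rough equivalent would be something like
"systemd-nspawn -b -D /var/lib/machines/mycontainer --private-users=pick",
again assuming that directory is where the container tree lives.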