I'm aware of the higher level of collaboration between podman and systemd
compared to docker, hence primarily raising this issue from a podman angle.

In privileged mode all mounts are read-write, so yes, the container has
write access to the cgroup filesystem. (Podman also ensures write access to
the systemd cgroup subsystem mount in non-privileged mode by default.)

On first boot PID 1 can be found in
/sys/fs/cgroup/systemd/machine.slice/libpod-<ctr-id>.scope/init.scope/cgroup.procs,
whereas when the container restarts the 'init.scope/' directory does not
exist and PID 1 is instead found in the parent (container root) cgroup
/sys/fs/cgroup/systemd/machine.slice/libpod-<ctr-id>.scope/cgroup.procs
(also reflected by /proc/1/cgroup). This is strange because systemd must be
the one that creates this cgroup dir on the initial boot, so I'm not sure
why it wouldn't do so on subsequent boots.
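
For anyone wanting to reproduce the check, here is a minimal Python sketch
of how the /proc/1/cgroup comparison could be scripted. It assumes the
cgroup v1 layout described above, where the systemd hierarchy appears as a
"name=systemd" entry; the function names are hypothetical helpers, not part
of podman or systemd.

```python
from typing import Optional


def systemd_cgroup_path(proc_cgroup_text: str) -> Optional[str]:
    """Return the cgroup path of the 'name=systemd' (cgroup v1) hierarchy.

    Takes the text of /proc/<pid>/cgroup and returns None if no such
    hierarchy is listed.
    """
    for line in proc_cgroup_text.splitlines():
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        _hierarchy_id, controllers, path = parts
        if "name=systemd" in controllers.split(","):
            return path
    return None


def in_init_scope(path: Optional[str]) -> bool:
    """True if the given cgroup path ends in an 'init.scope' component."""
    return path is not None and path.rstrip("/").endswith("/init.scope")
```

Inside the container this could be called as
systemd_cgroup_path(open("/proc/1/cgroup").read()). Going by the behaviour
described above, in_init_scope() would be expected to return True on first
boot and False after a restart.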

I can confirm that the container has permissions, since executing a 'mkdir'
in /sys/fs/cgroup/systemd/machine.slice/libpod-<ctr-id>.scope/ inside the
container succeeds after the restart, so I have no idea why systemd is not
creating the 'init.scope/' dir. I notice that inside the container's
systemd cgroup mount 'system.slice/' does exist after the restart, but
'user.slice/' is also missing (both exist on a normal boot). Is there any
way I can find systemd logs that might indicate why the cgroup dir creation
is failing?
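
As a rough sketch of the directory comparison above, the following Python
helper reports which of the cgroup dirs seen on a normal boot are missing
under a given root. The helper name and the EXPECTED_CHILDREN tuple are my
own illustration, based only on the three names mentioned in this mail.

```python
import os

# Child cgroups observed on a normal boot, per the description above.
EXPECTED_CHILDREN = ("init.scope", "system.slice", "user.slice")


def missing_cgroup_children(cgroup_root, expected=EXPECTED_CHILDREN):
    """Return the names in `expected` that are not directories under cgroup_root."""
    return [
        name
        for name in expected
        if not os.path.isdir(os.path.join(cgroup_root, name))
    ]
```

Pointed at the container's systemd cgroup root after a restart, this would
be expected to list 'init.scope' and 'user.slice' given what I describe
above.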

One final datapoint: the same is seen when using a private cgroup namespace
(via 'podman run --cgroupns=private'), although the error is then, as
expected, "Failed to attach 1 to compat systemd cgroup /init.scope: No such
file or directory".

I could raise this with the podman team, but it seems more in the systemd
area given it's a systemd warning and I would expect systemd to be creating
this cgroup dir?

Thanks,
Lewis

On Tue, 10 Jan 2023 at 14:48, Lennart Poettering <lenn...@poettering.net>
wrote:

> On Di, 10.01.23 13:18, Lewis Gaul (lewis.g...@gmail.com) wrote:
>
> > Following 'setenforce 0' I still see the same issue (I was also
> > suspecting SELinux!).
> >
> > A few additional data points:
> > - this was not seen when using systemd v230 inside the container
> > - this is also seen on CentOS 8.4
> > - this is seen under docker even if the container's cgroup driver is
> > changed from 'cgroupfs' to 'systemd'
>
> docker is garbage. They are hostile towards running systemd inside
> containers.
>
> podman upstream is a lot friendlier, and apparently what everyone in OCI
> is going towards these days.
>
> I have not much experience with podman though, and in particular not
> old versions. Next step would probably be to look at what precisely
> causes the permission issue, via strace.
>
> but did you make sure your container actually gets write access to the
> cgroup trees?
>
> anyway, i'd recommend asking the podman community for help about this.
>
> Lennart
>
> --
> Lennart Poettering, Berlin
>
