Hi!

I was very pleased to see the pull request "nspawn: add support for kernel 5.12
ID mapping mounts #19438" and tried it out right away.
The following was tested with the current git HEAD of systemd running on
Arch Linux.

What I am trying to achieve, at a high level, is to emulate bubblewrap and
run chromium under Wayland with GPU acceleration and working audio using
PipeWire.
For that I need to pass some sockets and devices to the container using
--bind-ro. I want to use --private-users=pick to get easier separation
between multiple containers.
That means I do not know the uid the process runs as before nspawn spawns my
container, which causes problems when accessing the sockets.
Until now I have used setfacl to work around this limitation and allow access
to the sockets.
I was hoping to be able to skip that with --private-users-ownership=map.
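
For reference, a minimal sketch of what that setfacl workaround amounts to. It
assumes that a container uid maps to the host uid "base + container uid", where
the base is the value nspawn reports as "Selected user namespace base" when it
starts. The helper names host_uid and grant_socket_access are made up for this
illustration and are not part of systemd.

```shell
#!/bin/sh
# Hypothetical helper: translate a container uid into the host uid it is
# shifted to, given the user namespace base that nspawn picked.
host_uid() {
    base="$1"           # the "Selected user namespace base" value
    container_uid="$2"  # uid inside the container, e.g. 1000 for "user"
    echo $((base + container_uid))
}

# Hypothetical helper: grant the shifted uid read/write access to each
# socket path given after the uid. Defined here but not called.
grant_socket_access() {
    uid="$1"
    shift
    for sock in "$@"; do
        setfacl -m "u:${uid}:rw" "$sock"
    done
}

# Base as reported by nspawn in my run, container uid 1000.
host_uid 552206336 1000
```

With the uid computed this way, something like
grant_socket_access "$(host_uid 552206336 1000)" /var/run/user/1000/wayland-1
would add the ACL entry on the host side.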

I'm passing three sockets belonging to uid 1000 on the host to a container with
--private-users=pick and trying to access them as uid 1000 (name "user") in the
container.
Everything is happening on an ext4 file system. I'd prefer btrfs, but that
lacks ID mapping support so far.
The full call looks like this:

statepath="/machines/state/chromium/${profilename}"
systemd-nspawn \
        -D /machines/images/archlinux-chromium/ \
        --private-users=pick \
        --private-users-ownership=map \
        --no-new-privileges=yes \
        --as-pid2 \
        --machine "chromium-${profilename}" \
        --user user \
        --bind-ro /var/run/user/1000/pulse/native:/sockets/pulse/native \
        --bind-ro /var/run/user/1000/wayland-1:/sockets/wayland-1 \
        --bind-ro /var/run/user/1000/pipewire-0:/sockets/pipewire-0 \
        --bind "${statepath}:/home/user" \
        --bind /dev/dri/renderD128 \
        -E WAYLAND_DISPLAY=wayland-1 \
        -E XDG_RUNTIME_DIR=/sockets \
        chromium --enable-features=UseOzonePlatform --ozone-platform=wayland

This results in the following output:

Spawning container chromium-default on /machines/images/archlinux-chromium.
Press ^] three times within 1s to kill container.
Selected user namespace base 552206336 and range 65536.
Failed to create mount point /machines/images/archlinux-chromium/sockets/pipewire-0: Value too large for defined data type

I ran strace on it; the relevant part of the output is:

[pid   524] mount("/machines/state/chromium/default", "/proc/self/fd/8", NULL, MS_BIND|MS_REC, NULL) = 0
[pid   524] close(8)                    = 0
[pid   524] newfstatat(AT_FDCWD, "/var/run/user/1000/pipewire-0", {st_mode=S_IFSOCK|0666, st_size=0, ...}, 0) = 0
[pid   524] openat(AT_FDCWD, "/machines/images/archlinux-chromium", O_RDONLY|O_CLOEXEC|O_PATH|O_DIRECTORY) = 8
[pid   524] openat(8, "sockets", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = 10
[pid   524] newfstatat(10, "", {st_mode=S_IFDIR|0700, st_size=4096, ...}, AT_EMPTY_PATH) = 0
[pid   524] close(8)                    = 0
[pid   524] openat(10, "pipewire-0", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = -1 ENOENT (No such file or directory)
[pid   524] close(10)                   = 0
[pid   524] newfstatat(AT_FDCWD, "/machines/images/archlinux-chromium/sockets", {st_mode=S_IFDIR|0700, st_size=4096, ...}, 0) = 0
[pid   524] openat(AT_FDCWD, "/machines/images/archlinux-chromium/sockets/pipewire-0", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = -1 ENOENT (No such file or directory)
[pid   524] openat(AT_FDCWD, "/machines/images/archlinux-chromium/sockets/pipewire-0", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0644) = -1 EOVERFLOW (Value too large for defined data type)
[pid   524] writev(2, [{iov_base="Failed to create mount point /ma"..., iov_len=122}, {iov_base="\n", iov_len=1}], 2Failed to create mount point /machines/images/archlinux-chromium/sockets/pipewire-0: Value too large for defined data type
) = 123
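
To cut a long trace down to the calls that failed, a small filter like the
following can help. This is only a sketch: failed_syscalls is a made-up helper
name, and it assumes each strace record sits on a single, unwrapped line with
the usual "= -1 ERRNAME (description)" ending.

```shell
#!/bin/sh
# Hypothetical helper: keep only the strace lines for system calls that
# returned an error, i.e. lines containing "= -1 ERRNAME (".
failed_syscalls() {
    grep -E '= -1 E[A-Z]+ \('
}

# Example with one succeeding and one failing line from my trace.
printf '%s\n' \
    '[pid   524] close(8)                    = 0' \
    '[pid   524] openat(10, "pipewire-0", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = -1 ENOENT (No such file or directory)' \
    | failed_syscalls
```

On a trace saved to a file (for example with strace -f -o trace.log), the
same filter can be applied as failed_syscalls < trace.log.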

This maps to the touch call in nspawn-mount.c at line 754.
If I drop the --bind/--bind-ro options, the container starts fine (although
chromium of course does not work). The same is true if I keep the binds and
remove --private-users-ownership=map.
I'm not sure how to proceed with this issue at this point.
Have I made a mistake or a wrong assumption about how this should work?
Should I open an issue on GitHub about it?

Thanks,
nd
_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
