Quoting Fajar A. Nugraha (l...@fajar.net):
> On Thu, May 29, 2014 at 10:58 AM, Serge Hallyn <serge.hal...@ubuntu.com> wrote:
>
> > Quoting Fajar A. Nugraha (l...@fajar.net):
> > > On Thu, May 29, 2014 at 5:08 AM, Serge Hallyn <serge.hal...@ubuntu.com> wrote:
> > > > would systemd be happy with it being mounted by lxc using an
> > > > lxc.mount.entry?  I think that would be preferable to relaxing the
> > > > apparmor policy.  i.e.
> > > >
> > > > lxc.mount.entry = /sys/fs/cgroup/systemd sys/fs/cgroup/systemd none bind,create=dir,optional 0 0
> > > >
> > >
> > > Wouldn't that be shadowed by the container mounting its own /sys?
> >
> > If lxc mounts /sys then systemd will leave it be.
> >
>
> Apparently that line alone doesn't work for me. I also had to add before
> that:
>
> lxc.mount.entry = sysfs sys sysfs default 0 0
> lxc.mount.entry = none sys/fs/cgroup tmpfs rw 0 0

or lxc.mount.auto = sys

That's what I meant by 'if lxc mounts /sys' :)

> > > Stephane also pointed out in my (closed) pull request that it would also
> > > allow the container to mess with the host's resource allocation.
> >
> > Yes, that's why lxc.mount.auto = cgroup:mixed is better.  But the above
> > mount entry is no worse than letting the container do it through
> > apparmor.
> >
>
> That does not work, apparently.
>
> ### in config
> lxc.mount.auto = cgroup:mixed
> ###
>
> ### lxc-start output
> <30>systemd[1]: Starting Root Slice.
> <27>systemd[1]: Caught <SEGV>, dumped core as pid 12.
> <30>systemd[1]: Freezing execution.
> ###

Hm, that's unfortunate.  I thought lxc.mount.auto = cgroup:mixed with
cgfs would mount named subsystems?  Christian?

> ###
> # lxc-attach -n f20 -- mount
> rpool/lxc on / type zfs (rw,noatime,xattr,noacl)
> udev on /dev type devtmpfs (rw,relatime,size=2473540k,nr_inodes=618385,mode=755)
> cgroup on /sys/fs/cgroup type tmpfs (rw,relatime,size=12k,mode=755)
> none on /sys/fs/cgroup/cgmanager type tmpfs (rw,relatime,size=4k,mode=755)
> devpts on /dev/lxc/console type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
> devpts on /dev/lxc/tty1 type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
> devpts on /dev/lxc/tty2 type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
> devpts on /dev/lxc/tty3 type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
> devpts on /dev/lxc/tty4 type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
> devpts on /dev/pts type devpts (rw,relatime,gid=5,mode=620,ptmxmode=666)
> proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
> sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
> tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
> tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755)
> tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,mode=755)
>
> # lxc-attach -n f20 -- ls /sys/fs/cgroup/
> blkio  cpu,cpuacct  cpuset  devices  freezer  hugetlb  memory  perf_event  systemd
>
> # lxc-attach -n f20 -- ls /sys/fs/cgroup/systemd
> (no output)
> ###
>
> It looks like there are two lines for /sys/fs/cgroup? I'm using trusty's
> lxc-1.0.3.
>
> >
> > >
> > > This works (at least, tested with console and ssh login), and should be
> > > secure enough (bind-mount the container subdir, instead of the whole
> > > systemd cgroup), but complicated.
> > >
> > > ### snippet of config
> > > lxc.hook.mount = "/var/lib/lxc/f20/bin/create_container_systemd_cgroup"
> > > lxc.hook.post-stop = "/var/lib/lxc/f20/bin/remove_container_systemd_cgroup"
> > > ###
> > >
> > > ### cat create_container_systemd_cgroup
> > > #!/bin/bash
> > > mkdir -p /sys/fs/cgroup/systemd/lxc/$LXC_NAME
> > > mount -t sysfs sysfs $LXC_ROOTFS_MOUNT/sys
> > > mount -t tmpfs none $LXC_ROOTFS_MOUNT/sys/fs/cgroup
> > > mkdir $LXC_ROOTFS_MOUNT/sys/fs/cgroup/systemd
> > > mount --bind /sys/fs/cgroup/systemd/lxc/$LXC_NAME $LXC_ROOTFS_MOUNT/sys/fs/cgroup/systemd
> > > ###
> > >
> > > ### cat remove_container_systemd_cgroup
> > > #!/bin/bash
> > > [ -n "$LXC_NAME" ] && find /sys/fs/cgroup/systemd/lxc/$LXC_NAME -type d | tac | xargs rmdir
> > > ###
> > >
> > > Is there a way to simplify this somehow for it to be more suitable in the
> > > template?
> >
> > I suppose we could add a new lxc.mount.auto = cgroup:systemd option which
> > only mounts name=systemd, read-only except for the container's own cgroup
> > which is rw?  But when I say we I don't really mean we :)
> >
>
> Will that work?
>
> The systemd cgroup mount is weird in the sense that there are no
> /lxc/CONTAINER_NAME subdirs under /sys/fs/cgroup/systemd, while there is one
> under /sys/fs/cgroup/{blkio,cpu,etc}. So for the systemd cgroup I don't see
> which ones should be mounted ro and which should be rw.
>
> The workaround hook I wrote earlier creates the directory
> /sys/fs/cgroup/systemd/lxc/CONTAINER_NAME on the host, and bind-mounts it as
> the container's /sys/fs/cgroup/systemd.
>
> --
> Fajar
_______________________________________________
lxc-users mailing list
lxc-users@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-users
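
The post-stop cleanup quoted above (find | tac | xargs rmdir) could be sketched a little more defensively as a shell function. This is only an illustration under stated assumptions, not something shipped with LXC: the function name remove_container_cgroup and the CGROUP_BASE override are hypothetical, added so the logic can be tried on a scratch directory. It relies on find's -depth option to visit child directories before their parent instead of reversing the list with tac, and it quotes the path so unusual container names do not split into several arguments.

```shell
#!/bin/bash
# Sketch of a post-stop cleanup helper; not part of LXC itself.
# CGROUP_BASE is a hypothetical override (not an LXC variable) so the
# function can be exercised on a scratch directory instead of the real
# name=systemd hierarchy.
remove_container_cgroup() {
    local name="$1"
    local base="${CGROUP_BASE:-/sys/fs/cgroup/systemd/lxc}"

    # Refuse an empty name: "$base/" would otherwise cover every container.
    [ -n "$name" ] || return 1

    # Nothing to clean up if the directory is already gone.
    [ -d "$base/$name" ] || return 0

    # -depth visits children before their parent, so each rmdir sees a
    # directory whose subdirectories have already been removed.
    find "$base/$name" -depth -type d -exec rmdir {} \;
}
```

In a hook script this would be called as remove_container_cgroup "$LXC_NAME", which keeps the empty-name guard from the original one-liner.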