Re: [systemd-devel] name=systemd cgroup mounts/hierarchy
Thank you for checking! Yes, it clearly seems that systemd and the kubelet share cgroups in such a setup, which is not supposed to happen. We will prioritize moving our cluster to the systemd cgroup driver to avoid this conflict; a quick sanity check of the driver settings is sketched just before the quoted reply below. I also think it would be good to have an extra check on the kubelet side to avoid running the cgroupfs driver on systemd systems, but that is a question for the k8s folks and has already been raised in Slack.

Just out of curiosity, how exactly may systemd be disrupted during service (de)activation by a record such as /kubepods/bla/bla in the root of its cgroup hierarchy? And how may it disrupt the kubelet or the workloads it runs? Will systemd delete such entries because of some internal logic? Or will there be a name conflict during cgroup creation? I would be happy to learn more details of the cgroup interference. I've read a few articles:

https://systemd.io/CGROUP_DELEGATION/
http://0pointer.de/blog/projects/cgroups-vs-cgroups.html
https://www.freedesktop.org/wiki/Software/systemd/ControlGroupInterface/
https://www.freedesktop.org/wiki/Software/systemd/writing-vm-managers/

and even an outdated one:

https://www.freedesktop.org/wiki/Software/systemd/PaxControlGroups/

It seems I have missed some technical details of how exactly they interfere.

> It may be a residual inside the kubelet context from when the environment was
> prepared for a container spawned from within this context

One last finding about this weird cgroup mount:

# find / -name '*8842def24*'
/sys/fs/cgroup/systemd/kubepods/burstable/pod7ffde41a-fa85-4b01-8023-69a4e4b50c55/8842def241fac72cb34fdce90297b632f098289270fa92ec04643837f5748c15
/sys/fs/cgroup/systemd/machine.slice/systemd-nspawn@centos75.service/payload/system.slice/host\x2drootfs-sys-fs-cgroup-systemd-kubepods-burstable-pod7ffde41a\x2dfa85\x2d4b01\x2d8023\x2d69a4e4b50c55-8842def241fac72cb34fdce90297b632f098289270fa92ec04643837f5748c15.mount

and

# machinectl list
MACHINE  CLASS     SERVICE        OS     VERSION ADDRESSES
centos75 container systemd-nspawn centos 7       -
frr      container systemd-nspawn ubuntu 18.04   -

2 machines listed.

Since the container with id 8842def241 is no longer running, it is hard to understand what exactly happened, who created such a mount, and how to reproduce the conflict. May I ask how systemd-nspawn may be involved here? Or any ideas as to what happened, such that I still have the name=systemd hierarchy mounted twice?
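For anyone hitting the same problem, here is a quick way to verify that both sides agree on the driver. Two assumptions to note: the kubelet reads its KubeletConfiguration from /var/lib/kubelet/config.yaml (the path varies by distribution), and Docker is the container runtime (other runtimes report the driver differently):

# Which cgroup driver is the kubelet configured with?
# (cgroupDriver is the KubeletConfiguration field; it defaults to 'cgroupfs')
grep -i cgroupdriver /var/lib/kubelet/config.yaml

# Which cgroup driver does the container runtime report? (Docker example)
docker info 2>/dev/null | grep -i 'cgroup driver'

On a systemd host both should report 'systemd', so that the kubelet manages pods in a subtree delegated by systemd instead of writing into the root hierarchy.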
>Thursday, November 19, 2020 3:25 AM +09:00 from Michal Koutný:
>
>Thanks for the details.
>
>On Mon, Nov 16, 2020 at 09:30:20PM +0300, Andrei Enshin < b...@bk.ru > wrote:
>> I see the kubelet crash with error: «Failed to start ContainerManager failed
>> to initialize top level QOS containers: root container [kubepods] doesn't
>> exist»
>> details: https://github.com/kubernetes/kubernetes/issues/95488
>
>I skimmed the issue and noticed that your setup uses the 'cgroupfs' cgroup
>driver. As explained in the other messages in this thread, it conflicts
>with systemd's operation over the root cgroup tree.
>
>> I can see same two mounts of named systemd hierarchy from shell on the same
>> node, simply by `$ cat /proc/self/mountinfo`
>> I think kubelet is running in the «main» mount namespace which has weird
>> named systemd mount.
>
>I assume so as well. It may be a residual inside the kubelet context from when
>the environment was prepared for a container spawned from within this context.
>
>> I would like to reproduce such weird mount to understand the full
>> situation and make sure I can avoid it in future.
>
>I'm afraid you may be seeing the results of various races between systemd
>service (de)activation and container spawning under the "shared" root
>(both of which comprise cgroup creation/removal and migration).
>There's a reason behind cgroup subtree delegation.
>
>So I'd say there's not much to do from the systemd side now.
>
>Michal

---
Best Regards,
Andrei Enshin
Re: [systemd-devel] name=systemd cgroup mounts/hierarchy
Thanks for the details.

On Mon, Nov 16, 2020 at 09:30:20PM +0300, Andrei Enshin wrote:
> I see the kubelet crash with error: «Failed to start ContainerManager failed
> to initialize top level QOS containers: root container [kubepods] doesn't
> exist»
> details: https://github.com/kubernetes/kubernetes/issues/95488

I skimmed the issue and noticed that your setup uses the 'cgroupfs' cgroup
driver. As explained in the other messages in this thread, it conflicts
with systemd's operation over the root cgroup tree.

> I can see same two mounts of named systemd hierarchy from shell on the same
> node, simply by `$ cat /proc/self/mountinfo`
> I think kubelet is running in the «main» mount namespace which has weird
> named systemd mount.

I assume so as well. It may be a residual inside the kubelet context from when
the environment was prepared for a container spawned from within this context.

> I would like to reproduce such weird mount to understand the full
> situation and make sure I can avoid it in future.

I'm afraid you may be seeing the results of various races between systemd
service (de)activation and container spawning under the "shared" root
(both of which comprise cgroup creation/removal and migration).
There's a reason behind cgroup subtree delegation.

So I'd say there's not much to do from the systemd side now.

Michal
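To make the delegation point concrete: the sanctioned way for a manager to own a cgroup subtree is to have systemd delegate one to it, as described in https://systemd.io/CGROUP_DELEGATION/. A minimal illustration (the Delegate= property and systemd-run are standard; the sleep payload is just a placeholder):

# Run a command in a transient scope with a delegated cgroup subtree;
# systemd promises not to touch anything below the delegation boundary.
systemd-run --scope -p Delegate=yes sleep 300

The systemd cgroup driver gives the kubelet the same guarantee for its kubepods subtree, while the cgroupfs driver creates groups in the root hierarchy behind systemd's back.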
[systemd-devel] Ubuntu CI will be unavailable for part of Nov 20 and/or Nov 21
Starting sometime on Nov 20, some of the hardware used for Ubuntu CI tests will be down for maintenance, and it's likely some or all Ubuntu CI test runs will fail until the hardware is back up. I don't know the specific length of time it will take, but the maintenance window has been scheduled from Nov 20 until Nov 21.

https://lists.ubuntu.com/archives/launchpad-announce/2020-November/000107.html
Re: [systemd-devel] Why choose the number of 16M to check /run space?
On Mi, 18.11.20 16:26, ChenQi (qi.c...@windriver.com) wrote:

> Hi All,
>
> I've checked the history and logic of the 16M space check for /run as
> presented in https://github.com/systemd/systemd/pull/5219.
> But I'm wondering: why choose this number (16M)? Is there some criterion, or
> is it based on experience?

We had to pick something, and 16M should be more than enough while still being
generally available. So far no one has complained; we got no bug reports that
it is too much or too little.

Lennart

--
Lennart Poettering, Berlin
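For reference, a rough shell analogue of the check being discussed (the actual check from PR 5219 is C code inside systemd; this sketch only illustrates the threshold, and df --output requires GNU coreutils):

# Warn if /run has less than 16M (16384 1K-blocks) available
avail_kb=$(df --output=avail /run | tail -n 1)
if [ "$avail_kb" -lt $((16 * 1024)) ]; then
    echo "warning: /run has less than 16M available" >&2
fi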
[systemd-devel] Why choose the number of 16M to check /run space?
Hi All,

I've checked the history and logic of the 16M space check for /run as
presented in https://github.com/systemd/systemd/pull/5219.
But I'm wondering: why choose this number (16M)? Is there some criterion, or
is it based on experience?

Best Regards,
Chen Qi
Re: [systemd-devel] How can I simply check that a service has been restarted ?
On Tue, Nov 10, 2020 at 6:35 PM Luca Boccassi wrote:
>
> On Tue, 2020-11-10 at 17:22 +0000, Luca Boccassi wrote:
> > On Tue, 2020-11-10 at 18:12 +0100, Francis Moreau wrote:
> > > On Tue, Nov 10, 2020 at 2:43 PM Luca Boccassi wrote:
> > > > On Tue, 2020-11-10 at 11:50 +0100, Francis Moreau wrote:
> > > > > On Tue, Nov 10, 2020 at 11:30 AM Lennart Poettering wrote:
> > > > > > On Di, 10.11.20 10:28, Francis Moreau (francis.m...@gmail.com) wrote:
> > > > > > >
> > > > > > > Hello,
> > > > > > >
> > > > > > > After restarting a service with "systemctl try-restart ..." I
> > > > > > > want to verify that the service has been restarted.
> > > > > > >
> > > > > > > How can I reliably do this without using the dbus API?
> > > > > >
> > > > > > D-Bus is how systemd exposes its state. If you don't want to use
> > > > > > that, you won't get the state information.
> > > > >
> > > > > dbus is overkill for my little bash script.
> > > >
> > > > It's pretty simple, and a one-liner, to get the value of a property
> > > > from a bash script with busctl. Eg:
> > > >
> > > > $ busctl get-property org.freedesktop.systemd1 /org/freedesktop/systemd1/unit/gdm_2eservice org.freedesktop.systemd1.Service Restart
> > > > s "always"
> > >
> > > Thank you, but I'm not interested in the Restart property of a service;
> > > I want to know if a service has been restarted.
> >
> > It's just an example of how to get D-Bus data on units easily from a
> > bash script.
>
> Eg:
>
> $ busctl get-property org.freedesktop.systemd1 /org/freedesktop/systemd1/unit/gdm_2eservice org.freedesktop.systemd1.Service NRestarts
> u 0

I didn't know about the NRestarts property, thanks. But it only counts
automatic restarts, not manual ones.

Also, there's no need to use busctl; to get service properties I can use
'systemctl show -p'.

--
Francis
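Since NRestarts only counts automatic restarts, one way to detect a manual restart from a script is to compare a property that changes on every start. Below is a small sketch along those lines; it assumes a systemd recent enough to expose InvocationID via systemctl show (ExecMainStartTimestampMonotonic could be used the same way):

#!/bin/bash
# Detect whether a unit was actually restarted by try-restart, by
# comparing its invocation ID before and after: systemd assigns a
# fresh InvocationID each time a unit starts.
unit="$1"

before=$(systemctl show -p InvocationID --value "$unit")
systemctl try-restart "$unit"
after=$(systemctl show -p InvocationID --value "$unit")

if [ -n "$after" ] && [ "$before" != "$after" ]; then
    echo "$unit was restarted"
else
    echo "$unit was not restarted"
fi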