Re: [systemd-devel] name=systemd cgroup mounts/hierarchy

2020-11-18 Thread Andrei Enshin

Thank you for checking!

Yes, it clearly seems that in such a setup systemd and kubelet share cgroups, 
which is not supposed to happen.
We will prioritize moving our cluster to the systemd cgroup driver to avoid 
this conflict.
I also think it would be good to have an extra check on the kubelet side to 
avoid running the cgroupfs driver on systemd systems. But that is a question 
for the k8s folks, and it has already been raised in Slack.
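
For reference, a minimal sketch of checking which cgroup driver a node 
currently uses, assuming a kubeadm-style kubelet config at 
/var/lib/kubelet/config.yaml and Docker as the container runtime (both are 
assumptions and differ per setup):

# What the kubelet is configured to use ("cgroupfs" or "systemd"):
grep -i cgroupdriver /var/lib/kubelet/config.yaml

# What the runtime actually uses; the two must agree:
docker info --format '{{.CgroupDriver}}'

# Switching to the systemd driver would mean setting
#   cgroupDriver: systemd                           in the kubelet config
#   "exec-opts": ["native.cgroupdriver=systemd"]    in /etc/docker/daemon.json
# and restarting both services:
systemctl restart docker kubelet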
 
-
Just out of curiosity, how exactly can systemd be disrupted by an entry such 
as /kubepods/bla/bla in the root of its cgroup hierarchy during service 
(de)activation?
And how might it disrupt the kubelet or the workloads it runs?

Will systemd delete such cgroups because of some logic? Or will there be a 
name conflict during cgroup creation?
I would be happy to learn more details about the cgroup interference.
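
For context, a rough way to see the mismatch on such a node, assuming cgroup 
v1 with the named systemd hierarchy mounted at /sys/fs/cgroup/systemd (as it 
is here):

# The part of the hierarchy systemd itself tracks as units:
systemd-cgls --no-pager

# Everything actually present in the named hierarchy, including directories
# such as kubepods/ that the cgroupfs driver created behind systemd's back:
find /sys/fs/cgroup/systemd -maxdepth 2 -type d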

I've read a few articles:
https://systemd.io/CGROUP_DELEGATION/
http://0pointer.de/blog/projects/cgroups-vs-cgroups.html
https://www.freedesktop.org/wiki/Software/systemd/ControlGroupInterface/
https://www.freedesktop.org/wiki/Software/systemd/writing-vm-managers/

even the outdated one:
https://www.freedesktop.org/wiki/Software/systemd/PaxControlGroups/

It seems I missed some technical details of how exactly the interference happens.
-

> It may be a residual inside kubelet context when environment was prepared for 
> a container spawned from within this context

One last finding about this weird cgroup mount:
# find / -name '*8842def24*'
/sys/fs/cgroup/systemd/kubepods/burstable/pod7ffde41a-fa85-4b01-8023-69a4e4b50c55/8842def241fac72cb34fdce90297b632f098289270fa92ec04643837f5748c15
/sys/fs/cgroup/systemd/machine.slice/systemd-nspawn@centos75.service/payload/system.slice/host\x2drootfs-sys-fs-cgroup-systemd-kubepods-burstable-pod7ffde41a\x2dfa85\x2d4b01\x2d8023\x2d69a4e4b50c55-8842def241fac72cb34fdce90297b632f098289270fa92ec04643837f5748c15.mount

and

# machinectl list
MACHINE  CLASS     SERVICE        OS     VERSION ADDRESSES
centos75 container systemd-nspawn centos 7       -
frr      container systemd-nspawn ubuntu 18.04   -

2 machines listed. Since the container with id 8842def241 is not running, it 
is hard to understand what exactly happened, who created such a mount, and 
how to reproduce the conflict.
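
As a side note, the second hit above is not a directory created on the host 
but the cgroup of a mount unit inside the centos75 container's payload; 
systemd escapes "/" as "-" and "-" as "\x2d" in unit names, so the name can 
be decoded with systemd-escape (dropping the ".mount" suffix first):

systemd-escape --unescape --path \
  'host\x2drootfs-sys-fs-cgroup-systemd-kubepods-burstable-pod7ffde41a\x2dfa85\x2d4b01\x2d8023\x2d69a4e4b50c55-8842def241fac72cb34fdce90297b632f098289270fa92ec04643837f5748c15'
# -> /host-rootfs/sys/fs/cgroup/systemd/kubepods/burstable/pod7ffde41a-.../8842def241...
# which would mean the host rootfs is bind mounted at /host-rootfs inside the
# container, and the host's cgroup directory is simply visible through it.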

May I ask how systemd-nspawn might be involved in this? Or do you have any 
ideas about what happened, such that I still have the named systemd hierarchy 
mounted twice?
  
>Thursday, November 19, 2020 3:25 AM +09:00 from Michal Koutný 
>:
> 
>Thanks for the details.
>
>On Mon, Nov 16, 2020 at 09:30:20PM +0300, Andrei Enshin < b...@bk.ru > wrote:
>> I see the kubelet crash with error: «Failed to start ContainerManager failed 
>> to initialize top level QOS containers: root container [kubepods] doesn't 
>> exist»
>> details:  https://github.com/kubernetes/kubernetes/issues/95488
>I skimmed the issue and noticed that your setup uses 'cgroupfs' cgroup
>driver. As explained in the other messages in this thread, it conflicts
>with systemd operation over the root cgroup tree.
>
>> I can see same two mounts of named systemd hierarchy from shell on the same 
>> node, simply by `$ cat /proc/self/mountinfo`
>> I think kubelet is running in the «main» mount namespace which has weird 
>> named systemd mount.
>I assume so as well. It may be a residual inside kubelet context when
>environment was prepared for a container spawned from within this
>context.
>
>> I would like to reproduce such weird mount to understand the full
>> situation and make sure I can avoid it in future.
>I'm afraid you may be seeing results of various races between systemd
>service (de)activation and container spawnings under the "shared" root
>(both of which comprise cgroup creation/removal and migrations).
>There's a reason behind the cgroup subtree delegation.
>
>So I'd say there's not much to do from systemd side now.
>
>
>Michal
>  
 
 
---
Best Regards,
Andrei Enshin
 ___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] name=systemd cgroup mounts/hierarchy

2020-11-18 Thread Michal Koutný
Thanks for the details.

On Mon, Nov 16, 2020 at 09:30:20PM +0300, Andrei Enshin  wrote:
> I see the kubelet crash with error: «Failed to start ContainerManager failed 
> to initialize top level QOS containers: root container [kubepods] doesn't 
> exist»
> details:  https://github.com/kubernetes/kubernetes/issues/95488
I skimmed the issue and noticed that your setup uses 'cgroupfs' cgroup
driver. As explained in the other messages in this thread, it conflicts
with systemd operation over the root cgroup tree.

> I can see same two mounts of named systemd hierarchy from shell on the same 
> node, simply by `$ cat /proc/self/mountinfo`
> I think kubelet is running in the «main» mount namespace which has weird 
> named systemd mount.
I assume so as well. It may be a residual inside kubelet context when
environment was prepared for a container spawned from within this
context.

> I would like to reproduce such weird mount to understand the full
> situation and make sure I can avoid it in future.
I'm afraid you may be seeing results of various races between systemd
service (de)activation and container spawnings under the "shared" root
(both of which comprise cgroup creation/removal and migrations).
There's a reason behind the cgroup subtree delegation.
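
A minimal sketch of that delegation contract, with a hypothetical unit name 
and binary (this is not how kubelet is launched in the setup above):

# Run a container manager in its own unit with a delegated cgroup subtree;
# systemd then leaves everything below that unit's cgroup alone instead of
# racing with the manager over the root of the hierarchy.
systemd-run --unit=my-container-manager.service \
            --property=Delegate=yes \
            /usr/local/bin/my-container-manager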

So I'd say there's not much to do from systemd side now.


Michal


___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


[systemd-devel] Ubuntu CI will be unavailable for part of Nov 20 and/or Nov 21

2020-11-18 Thread Dan Streetman
Starting sometime on Nov 20, some of the hardware used for Ubuntu CI
tests will be down for maintenance, and it's likely some or all Ubuntu
CI test runs will fail until the hardware is back up. I don't know the
specific length of time it will take, but the maintenance window has
been scheduled from Nov 20 until Nov 21.

https://lists.ubuntu.com/archives/launchpad-announce/2020-November/000107.html
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Why choose the number of 16M to check /run space?

2020-11-18 Thread Lennart Poettering
On Mi, 18.11.20 16:26, ChenQi (qi.c...@windriver.com) wrote:

> Hi All,
>
> I've checked the history and logic of the 16M space check for /run as
> presented in https://github.com/systemd/systemd/pull/5219.
> But I'm wondering why choose this number (16M)? Is there some criteria or is
> it based on experience?

We had to pick something, and 16M should be more than enough while still 
being generally available. So far no one has complained; we have received no 
bug reports saying that it is too much or too little.
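
For a rough sense of scale, free space on /run can be compared against that 
floor from a shell (this only mirrors the order of magnitude, not the exact 
check systemd performs):

df -h /run

# or, in bytes:
avail=$(df --output=avail -B1 /run | tail -n1)
if [ "$avail" -lt $((16 * 1024 * 1024)) ]; then
    echo "/run has less than 16M available"
fi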

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


[systemd-devel] Why choose the number of 16M to check /run space?

2020-11-18 Thread ChenQi

Hi All,

I've checked the history and logic of the 16M space check for /run as 
presented in https://github.com/systemd/systemd/pull/5219.
But I'm wondering why this number (16M) was chosen. Is there some criterion, 
or is it based on experience?


Best Regards,
Chen Qi
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] How can I simply check that a service has been restarted ?

2020-11-18 Thread Francis Moreau
On Tue, Nov 10, 2020 at 6:35 PM Luca Boccassi  wrote:
>
> On Tue, 2020-11-10 at 17:22 +, Luca Boccassi wrote:
> > On Tue, 2020-11-10 at 18:12 +0100, Francis Moreau wrote:
> > > On Tue, Nov 10, 2020 at 2:43 PM Luca Boccassi  wrote:
> > > > On Tue, 2020-11-10 at 11:50 +0100, Francis Moreau wrote:
> > > > > On Tue, Nov 10, 2020 at 11:30 AM Lennart Poettering
> > > > >  wrote:
> > > > > > On Di, 10.11.20 10:28, Francis Moreau (francis.m...@gmail.com) 
> > > > > > wrote:
> > > > > >
> > > > > > > Hello,
> > > > > > >
> > > > > > > After restarting a service with "systemctl try-restart ..." I 
> > > > > > > want to
> > > > > > > verify that the service has been restarted.
> > > > > > >
> > > > > > > How can I reliably do this without using the dbus API ?
> > > > > >
> > > > > > D-Bus is how systemd exposes its state. If you don't want to use 
> > > > > > that,
> > > > > > you won't get the state information.
> > > > > >
> > > > >
> > > > > dbus is overkill for my little bash script.
> > > >
> > > > It's pretty simple, and a one-liner, to get the value of a property
> > > > from a bash script with busctl. Eg:
> > > >
> > > > $ busctl get-property org.freedesktop.systemd1 
> > > > /org/freedesktop/systemd1/unit/gdm_2eservice 
> > > > org.freedesktop.systemd1.Service Restart
> > > > s "always"
> > > >
> > >
> > > Thank you but I'm not interested in the Restart property of a service,
> > > I want to know if a service as been restarted.
> >
> > It's just an example on how to get D-Bus data on units easily from a
> > bash script.
>
> Eg:
>
> $ busctl get-property org.freedesktop.systemd1 
> /org/freedesktop/systemd1/unit/gdm_2eservice org.freedesktop.systemd1.Service 
> NRestarts
> u 0
>

I didn't know about the NRestarts property, thanks. But it only counts 
automatic restarts, not manual ones.

Also, there's no need to use busctl: for getting service properties
I can use 'systemctl show -p'.
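
For example, a sketch that detects any restart (manual or automatic) by 
comparing the invocation ID before and after, assuming a systemd recent 
enough to expose InvocationID and support --value, and a hypothetical 
foo.service:

before=$(systemctl show -p InvocationID --value foo.service)
systemctl try-restart foo.service
after=$(systemctl show -p InvocationID --value foo.service)
# A new invocation ID means the service was started again; an empty value
# means the unit is not running at all.
if [ -n "$after" ] && [ "$before" != "$after" ]; then
    echo "foo.service was restarted"
fi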

-- 
Francis
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel