Re: [systemd-devel] Rationale for mirroring cpu and systemd cgroup subsystems

2014-11-05 Thread Lennart Poettering
On Wed, 05.11.14 16:00, Umut Tezduyar Lindskog (u...@tezduyar.com) wrote:

> On Wed, Nov 5, 2014 at 2:05 PM, Lennart Poettering wrote:
> > On Wed, 05.11.14 13:41, Umut Tezduyar Lindskog (u...@tezduyar.com) wrote:
> >
> >> Hi,
> >>
> >> What is the reasoning for not joining cpu subsystem with systemd subsystem?
> >>
> >> There are a couple of ways you can mirror [1] the cpu and systemd
> >> subsystems, and doing so can result in completely different cpu
> >> bandwidth for processes.
> >>
> >> I am wondering why we don't mirror them by default.
> >
> > Because simply enabling a "cpu" controller for a unit already has
> > effects on the processes running in it. For example, you don't get RT
> > anymore, and the general scheduling is altered to schedule your entire
> > group evenly against all the other groups on the same level.
> 
> Doesn't it make sense to turn it on by default and let users wanting
> RT disable it? Seems like this was the case at some point -
> http://www.freedesktop.org/wiki/Software/systemd/MyServiceCantGetRealtime/
> (Very much outdated article, we don't have ControlGroup= anymore)

Yeah, I really need to update that article.

Generally we should try hard to keep the tree minimal. Resource
control enforcement is not free, and hence it should be opt-in, not
opt-out. This is something Tejun pretty explicitly asked us for: he
wants the most shallow tree that does what is needed.

> > systemd will "mirror" a cgroup in the "cpu" hierarchy as soon as you
> > set a property on it that requires the "cpu" or "cpuacct" hierarchy,
> > for example CPUAccounting=, CPUShares= or CPUQuota=.
> 
> You can turn on mirroring at runtime, but as far as I know there is
> no way of going back without rebooting, right?

In current versions it should correctly turn mirroring off again when
you reset the props to their defaults.

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Rationale for mirroring cpu and systemd cgroup subsystems

2014-11-05 Thread Umut Tezduyar Lindskog
On Wed, Nov 5, 2014 at 2:05 PM, Lennart Poettering wrote:
> On Wed, 05.11.14 13:41, Umut Tezduyar Lindskog (u...@tezduyar.com) wrote:
>
>> Hi,
>>
>> What is the reasoning for not joining cpu subsystem with systemd subsystem?
>>
>> There are a couple of ways you can mirror [1] the cpu and systemd
>> subsystems, and doing so can result in completely different cpu
>> bandwidth for processes.
>>
>> I am wondering why we don't mirror them by default.
>
> Because simply enabling a "cpu" controller for a unit already has
> effects on the processes running in it. For example, you don't get RT
> anymore, and the general scheduling is altered to schedule your entire
> group evenly against all the other groups on the same level.

Doesn't it make sense to turn it on by default and let users wanting
RT disable it? Seems like this was the case at some point -
http://www.freedesktop.org/wiki/Software/systemd/MyServiceCantGetRealtime/
(Very much outdated article, we don't have ControlGroup= anymore)

>
> systemd will "mirror" a cgroup in the "cpu" hierarchy as soon as you
> set a property on it that requires the "cpu" or "cpuacct" hierarchy,
> for example CPUAccounting=, CPUShares= or CPUQuota=.

You can turn on mirroring at runtime, but as far as I know there is
no way of going back without rebooting, right?

>
> But the general rule is: don't enable a controller for a unit, unless
> we really need to. We must make sure the tree is always as minimal as
> possible.
>
>> Not mirroring them results in PID 1, each kernel thread and each user
>> space task having the same cpu bandwidth (/sys/fs/cgroup/cpu/tasks).
>> Even worse, the cpu bandwidth PID 1 gets goes down with the number
>> of processes spawned, possibly opening the way to DoS.
>
> There has been a plan to introduce CPUFairScheduling= that you can set
> on a slice, and that will turn on the cpu controller for all children
> of that slice. Setting that on system.slice should have the desired
> effect.
>
> Regarding PID1: with the unified cgroup hierarchy it will not be
> possible to have both populated subcgroups and processes in the same
> cgroup. This means we will have to move PID 1 out of the root cgroup
> anyway, probably into some unit in "system.slice". This should fix
> your problem, I figure? This would also allow applying cgroup resource
> limits to PID 1 itself, for example to control the way it is scheduled
> against other processes.

We discussed putting systemd into its own cgroup at the hackfest in
Germany. It would solve the problem I have mentioned.

Umut

>
> Lennart
>
> --
> Lennart Poettering, Red Hat


Re: [systemd-devel] Rationale for mirroring cpu and systemd cgroup subsystems

2014-11-05 Thread Lennart Poettering
On Wed, 05.11.14 13:41, Umut Tezduyar Lindskog (u...@tezduyar.com) wrote:

> Hi,
> 
> What is the reasoning for not joining cpu subsystem with systemd subsystem?
> 
> There are a couple of ways you can mirror [1] the cpu and systemd
> subsystems, and doing so can result in completely different cpu
> bandwidth for processes.
> 
> I am wondering why we don't mirror them by default.

Because simply enabling a "cpu" controller for a unit already has
effects on the processes running in it. For example, you don't get RT
anymore, and the general scheduling is altered to schedule your entire
group evenly against all the other groups on the same level.

systemd will "mirror" a cgroup in the "cpu" hierarchy as soon as you
set a property on it that requires the "cpu" or "cpuacct" hierarchy,
for example CPUAccounting=, CPUShares= or CPUQuota=.

But the general rule is: don't enable a controller for a unit, unless
we really need to. We must make sure the tree is always as minimal as
possible.
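
One way to see whether a given process has been mirrored into the "cpu"
hierarchy is to compare its paths in /proc/<pid>/cgroup, where each line
has the form "hierarchy-ID:controller-list:cgroup-path". The following is
only a rough sketch: the helper names and the sample file contents are
made up for illustration, and it assumes the legacy (split) hierarchy
layout discussed in this thread.

```python
def parse_proc_cgroup(text):
    """Map each controller name to the cgroup path listed for it.

    Each line of /proc/<pid>/cgroup has the form
    "hierarchy-ID:controller-list:cgroup-path".
    """
    paths = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at most twice: the cgroup path itself may contain ':'.
        _hier_id, controllers, path = line.split(":", 2)
        for name in controllers.split(","):
            paths[name] = path
    return paths


def is_mirrored(paths, controller="cpu"):
    """True if the path in `controller` equals the path in name=systemd."""
    return controller in paths and paths.get("name=systemd") == paths[controller]


# Hypothetical sample contents, not captured from a real machine.
SAMPLE_NOT_MIRRORED = """\
4:cpu,cpuacct:/
1:name=systemd:/system.slice/foo.service
"""
SAMPLE_MIRRORED = """\
4:cpu,cpuacct:/system.slice/foo.service
1:name=systemd:/system.slice/foo.service
"""

print(is_mirrored(parse_proc_cgroup(SAMPLE_NOT_MIRRORED)))
print(is_mirrored(parse_proc_cgroup(SAMPLE_MIRRORED)))
```

On a real system the input would come from reading /proc/<pid>/cgroup
for the process in question; the hierarchy IDs and unit names will
differ from the samples above.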

> Not mirroring them results in PID 1, each kernel thread and each user
> space task having the same cpu bandwidth (/sys/fs/cgroup/cpu/tasks).
> Even worse, the cpu bandwidth PID 1 gets goes down with the number
> of processes spawned, possibly opening the way to DoS.

There has been a plan to introduce CPUFairScheduling= that you can set
on a slice, and that will turn on the cpu controller for all children
of that slice. Setting that on system.slice should have the desired
effect.

Regarding PID1: with the unified cgroup hierarchy it will not be
possible to have both populated subcgroups and processes in the same
cgroup. This means we will have to move PID 1 out of the root cgroup
anyway, probably into some unit in "system.slice". This should fix
your problem, I figure? This would also allow applying cgroup resource
limits to PID 1 itself, for example to control the way it is scheduled
against other processes.
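
To make the constraint concrete, here is a toy model of the rule exactly
as stated above (no cgroup may hold processes and also have populated
subcgroups). It is a sketch only: the dict layout, the function names
and the unit name "pid1.service" are hypothetical, and it does not talk
to the kernel at all.

```python
def is_populated(cg):
    """A cgroup is populated if it or any descendant contains a process."""
    return bool(cg["procs"]) or any(
        is_populated(child) for child in cg["children"].values()
    )


def find_violations(cg, path="/"):
    """Return paths of cgroups holding processes AND populated subcgroups."""
    bad = []
    if cg["procs"] and any(is_populated(c) for c in cg["children"].values()):
        bad.append(path)
    for name, child in cg["children"].items():
        bad.extend(find_violations(child, path.rstrip("/") + "/" + name))
    return bad


def leaf(*pids):
    """Build a cgroup with the given PIDs and no subcgroups."""
    return {"procs": list(pids), "children": {}}


# PID 1 sits in the root cgroup next to a populated system.slice.
BEFORE = {
    "procs": [1],
    "children": {
        "system.slice": {"procs": [], "children": {"foo.service": leaf(1234)}},
    },
}

# PID 1 moved into a (hypothetically named) unit below system.slice.
AFTER = {
    "procs": [],
    "children": {
        "system.slice": {
            "procs": [],
            "children": {"pid1.service": leaf(1), "foo.service": leaf(1234)},
        },
    },
}

print(find_violations(BEFORE))
print(find_violations(AFTER))
```

In this model only the root cgroup of BEFORE is flagged, and moving
PID 1 into a leaf unit clears it.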

Lennart

-- 
Lennart Poettering, Red Hat


[systemd-devel] Rationale for mirroring cpu and systemd cgroup subsystems

2014-11-05 Thread Umut Tezduyar Lindskog
Hi,

What is the reasoning for not joining cpu subsystem with systemd subsystem?

There are a couple of ways you can mirror [1] the cpu and systemd
subsystems, and doing so can result in completely different cpu
bandwidth for processes.

I am wondering why we don't mirror them by default.

Not mirroring them results in PID 1, each kernel thread and each user
space task having the same cpu bandwidth (/sys/fs/cgroup/cpu/tasks).
Even worse, the cpu bandwidth PID 1 gets goes down with the number
of processes spawned, possibly opening the way to DoS.

[1] - Simple changes that alter the cpu bandwidth all processes get:

a) DefaultCPUAccounting=yes will change the entire cpu bandwidth
allocation due to JoinControllers=cpu,cpuacct
b) Dropping in a .slice and adding even a single service to it.
c) systemctl set-property system.slice CPUShares=1024 (even though
1024 is the default cpu weight)
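
To illustrate why any of these changes shifts the bandwidth, here is a
deliberately simplified model of weight-based sharing: siblings at one
level split their parent's share in proportion to their weights. It
assumes a single CPU, every task CPU-bound and runnable, and equal
weights of 1024 for tasks and groups. The data layout and names are
made up for this sketch; it is not how the kernel scheduler is
implemented.

```python
def cpu_fractions(children, share=1.0, prefix=""):
    """Split `share` among siblings in proportion to their weights.

    `children` maps a name to (weight, sub), where `sub` is None for a
    task or a nested dict of the same shape for a group.
    """
    total = sum(weight for weight, _ in children.values())
    result = {}
    for name, (weight, sub) in children.items():
        mine = share * weight / total
        if sub is None:
            # A task keeps its whole slice.
            result[prefix + name] = mine
        else:
            # A group hands its slice down to its own children.
            result.update(cpu_fractions(sub, mine, prefix + name + "/"))
    return result


TASKS = {f"task{i}": (1024, None) for i in range(9)}

# Not mirrored: PID 1 and nine busy tasks are siblings in the root group.
FLAT = {"pid1": (1024, None), **TASKS}

# Mirrored: the nine tasks live in one group that competes with PID 1.
GROUPED = {"pid1": (1024, None), "system.slice": (1024, TASKS)}

print(cpu_fractions(FLAT)["pid1"])
print(cpu_fractions(GROUPED)["pid1"])
```

Under these assumptions PID 1 gets 1/10 of the CPU in the flat layout,
shrinking further with every additional busy task, but 1/2 once the
tasks are grouped, however many tasks the group contains.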

Umut