Hello everyone, I have now published the Cgroup specification on the Upstart wiki: http://upstart.ubuntu.com/wiki/Cgroup
This is based on my original proposal with the changes suggested on the
mailing list.

On Wed, Nov 20, 2013 at 02:23:59PM -0500, Stéphane Graber wrote:
> This morning at vUDS we discussed adding support for cgroups in Upstart.
>
> Before I go into details about the proposed stanza and overall
> behaviour, I'd begin by saying that contrary to some other init systems,
> our intent is solely related to resource controls which is the main goal
> of cgroups. Process grouping and tracking will remain unaffected by the
> addition of cgroup support.
>
> Cgroup support will be implemented by adding a new "cgroup" stanza which
> will control the application of cgroup based restrictions to the job.
> The limits will be applied to any of the scripts
> (pre-start/post-start/job/pre-stop/post-stop) similar to what's done
> with setuid/setgid/apparmor stanzas.
>
> Now my recommended format for the stanza, which I believe should be
> flexible enough is:
>   cgroup <controller> <cgroup-name|$auto> [<key> <value>]
>
>
> Detail on the fields:
> == controller ==
> Name of one of the cgroup controllers
>
> Currently the valid values are (but won't be hardcoded into upstart):
>  - blkio
>  - cpu
>  - cpuacct
>  - cpuset
>  - devices
>  - freezer
>  - hugetlb
>  - memory
>  - perf_event
>
> == cgroup-name|$auto ==
> Name of the cgroup to use (and create if non-existing)
>
> The name may contain a / (e.g. "db/pgsql" or "db/$auto") indicating that
> it's requesting a sub-cgroup.
>
> "$auto" is the recommended name and will have upstart generate a name
> based on the job instance name.
>
> The main use of that field is for cases where a set of jobs should share
> limits, in such case the main job should declare the various values and
> the others just refer to the cgroup by name without defining values.
>
> The name may be different for the various controllers but may not differ
> within the same controller.
> Example:
>  valid   => cgroup memory group1 limit_in_bytes 52428800
>             cgroup cpuset group2 cpus 0-1
>
>  invalid => cgroup memory group1 limit_in_bytes 52428800
>             cgroup memory group2 soft_limit_in_bytes 1024
>
> == key ==
> The cgroup control file minus the controller name, so for example
> memory.soft_limit_in_bytes will become soft_limit_in_bytes.
>
> == value ==
> Any value valid for the given control file, upstart itself won't perform
> any validation.
>
> If the value contains spaces, it should be put between double-quotes (e.g.):
>   cgroup devices $auto allow "c 1:2 rwm"
>
>
> Upstart won't have any controller aware logic in its code, instead,
> it'll simply talk over dbus (using a private dbus socket) to the cgroup
> manager which will take care of applying the various limits.
> That cgroup manager will be started very early in the boot sequence. Any
> job containing a cgroup stanza will be held until the manager is
> started.
>
> The cgroup will be destroyed when a job is stopped and the cgroup isn't
> shared with another job (task count is 0 and it has no child cgroup).
>
> It'll be possible to disable cgroup support entirely by either building
> upstart without it (needed for non-Linux systems) or by passing
> --no-cgroup as a parameter to upstart. In that case, the cgroup stanza
> will simply be ignored and the jobs will start without limitations.
>
>
> All of the above is also meant to apply to user sessions. The cgroup
> manager will allow unprivileged cgroup configuration, so as long as the
> user has write access to a sub-section of a controller, it'll be allowed
> to write entries there. Similarly to other restriction stanzas, failure
> to apply a cgroup limit in a user session won't be fatal.
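
To make the stanza grammar above a bit more concrete, here is a minimal
Python sketch of how a line could be split into its fields. It is only an
illustration and not how Upstart implements it: the names CgroupStanza,
parse_cgroup_stanza, check_job and control_file are made up for this
example, and I'm assuming that a job may set several keys on the same
controller as long as it keeps the same cgroup name. Values are passed
through untouched, since upstart itself won't validate them.

```python
import shlex
from typing import List, NamedTuple, Optional


class CgroupStanza(NamedTuple):
    """Fields of one 'cgroup' stanza (hypothetical type for illustration)."""
    controller: str
    name: str
    key: Optional[str]
    value: Optional[str]


def parse_cgroup_stanza(line: str) -> CgroupStanza:
    """Parse 'cgroup <controller> <cgroup-name|$auto> [<key> <value>]'."""
    # shlex honours double quotes, so "c 1:2 rwm" stays a single value.
    tokens = shlex.split(line)
    if len(tokens) not in (3, 5) or tokens[0] != "cgroup":
        raise ValueError(f"malformed cgroup stanza: {line!r}")
    controller, name = tokens[1], tokens[2]
    key, value = (tokens[3], tokens[4]) if len(tokens) == 5 else (None, None)
    return CgroupStanza(controller, name, key, value)


def check_job(lines: List[str]) -> List[CgroupStanza]:
    """Parse a job's stanzas; reject two cgroup names for one controller."""
    names = {}
    stanzas = []
    for line in lines:
        stanza = parse_cgroup_stanza(line)
        # Remember the first name seen for each controller.
        previous = names.setdefault(stanza.controller, stanza.name)
        if previous != stanza.name:
            raise ValueError(
                f"controller {stanza.controller!r} uses both "
                f"{previous!r} and {stanza.name!r}"
            )
        stanzas.append(stanza)
    return stanzas


def control_file(stanza: CgroupStanza) -> Optional[str]:
    """Rebuild the control file name, e.g. memory.soft_limit_in_bytes."""
    if stanza.key is None:
        return None
    return f"{stanza.controller}.{stanza.key}"
```

With this sketch, check_job() accepts the "valid" pair from the example
above and raises ValueError for the "invalid" one. A stanza without a
key/value pair parses with key and value set to None, which matches the
"refer to the cgroup by name" case.
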
>
>
> Now a few examples to try and illustrate the thoughts behind that proposal:
>
> == Single job simple example ==
> === Job ===
> cgroup memory $auto limit_in_bytes 52428800
>
> === Result ===
> The job will only start once the manager is up and running and will have a
> 50MB memory limit. If the system has less than 50MB of RAM, the job will
> fail to start.
>
> == Single job complex example ==
> === Job ===
> cgroup memory $auto limit_in_bytes 52428800
> cgroup cpuset $auto cpus 0-1
> cgroup blkio slowio throttle.write_bps_device "8:16 1048576"
>
> === Result ===
> The job will only start once the manager is up and running and will have a
> 50MB memory limit, be restricted to CPU ids 0 and 1 and have a 1MB/s
> write limit to the block device 8:16.
> The job will fail to start if the system has less than 50MB of RAM or
> fewer than 2 CPUs.
>
>
> == Multiple jobs complex example ==
> === Job 1 ===
> cgroup cpuset db cpus 0-1
> cgroup memory db limit_in_bytes 104857600
> cgroup blkio db throttle.write_bps_device "8:16 1048576"
>
> === Job 2 ===
> cgroup cpuset db/$auto cpus 1
> cgroup memory db/$auto limit_in_bytes 52428800
> cgroup blkio db/$auto throttle.write_bps_device "8:17 1048576"
>
> === Job 3 ===
> cgroup cpuset db
> cgroup memory db
>
> === Job 4 ===
> cgroup cpuset db/$auto cpus 2
>
> === Result ===
> This is rather complex, so let's go job by job:
>  - Job 1 will start bound to CPUs 0 and 1 with a 100MB memory limit and
>    a 1MB/s write limit to the 8:16 block device. It'll fail to start if
>    the system has fewer than 2 CPUs or less than 100MB of RAM.
>
>  - Job 2 will start bound to CPU 1 and with a 50MB memory limit. It'll
>    inherit the 1MB/s write limit to 8:16 and on top of that rate limit
>    writes to 8:17, also at 1MB/s.
>    The job will fail to start if the system has less than 50MB of RAM or
>    fewer than 2 CPUs.
>
>  - Job 3 will start in the "db" cpuset and memory cgroups. If it starts
>    before Job 1, no limit will be applied at startup time.
>    As soon as Job 1 starts, however, Job 3 will be limited to 2 CPUs and
>    100MB of memory.
>    As it doesn't have a blkio statement, it won't have rate limited I/Os.
>
>  - Job 4, if started after Job 1, will fail to start as it's requesting a
>    CPU that the parent cgroup doesn't have access to. If started before
>    Job 1 however, it won't have a parent value set so will inherit the
>    default and so will start so long as the system has at least 3 CPUs.
>
>
>
> I think this pretty much covers all I've got in mind at this point, I
> think the above is flexible enough to work with all existing
> controllers.
>
> Questions, comments and suggestions are very welcome!
>
> --
> Stéphane Graber
> Ubuntu developer
> http://www.ubuntu.com
> --
> upstart-devel mailing list
> [email protected]
> Modify settings or unsubscribe at:
> https://lists.ubuntu.com/mailman/listinfo/upstart-devel

--
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com
