Re: [RFC] cgroup TODOs

2012-09-21 Thread Tejun Heo
On Thu, Sep 13, 2012 at 01:58:27PM -0700, Tejun Heo wrote: > 7. Misc issues > > * Sort & unique when listing tasks. Even the documentation says it > doesn't happen but we have a good hunk of code doing it in > cgroup.c. I'm gonna rip it out at some point. Again, if you > don't lik

Re: [RFC] cgroup TODOs

2012-09-20 Thread Andy Lutomirski
On Thu, Sep 20, 2012 at 11:26 AM, Tejun Heo wrote: > Hello, > > On Wed, Sep 19, 2012 at 06:33:15PM -0700, Andy Lutomirski wrote: >> [grr. why does gmane scramble addresses?] > > You can append /raw to the message url and see the raw message. > > http://article.gmane.org/gmane.linux.kernel.contai

Re: [RFC] cgroup TODOs

2012-09-20 Thread Tejun Heo
Hello, On Wed, Sep 19, 2012 at 06:33:15PM -0700, Andy Lutomirski wrote: > [grr. why does gmane scramble addresses?] You can append /raw to the message url and see the raw message. http://article.gmane.org/gmane.linux.kernel.containers/23802/raw > > I think this level of flexibility should b

Re: [RFC] cgroup TODOs

2012-09-19 Thread Andy Lutomirski
[grr. why does gmane scramble addresses?] On 09/13/2012 01:58 PM, Tejun Heo wrote: > > 6. Multiple hierarchies > > Apart from the apparent whness of it (I think I talked about > that enough the last time[1]), there's a basic problem when more > than one controllers interact - it's

Re: [RFC] cgroup TODOs

2012-09-19 Thread Michal Hocko
[CCing Dave, Ben] Just a short summary as you were not on the CC list. This is sort of follow up on https://lkml.org/lkml/2012/9/3/211. The end result is slightly different because Tejun did a more generic cgroup solution (see below). I cannot do the same for OpenSUSE so I will stick with the mem

Re: [RFC] cgroup TODOs

2012-09-18 Thread Vivek Goyal
On Fri, Sep 14, 2012 at 02:57:01PM -0700, Tejun Heo wrote: [..] > I think we need to stick to one model for all controllers; otherwise, > it gets confusing and unified hierarchy can't work. That said, I'm > not too happy about how cpu is handling it now. > > * As I wrote before, the configuratio

Re: [RFC] cgroup TODOs

2012-09-17 Thread Tejun Heo
On Mon, Sep 17, 2012 at 12:40:28PM +0400, Glauber Costa wrote: > That is exactly what I proposed in our previous discussions around > memcg, with files like "available_controllers" , "current_controllers". > Name chosen to match what other subsystems already do. > > if memcg is not in "available_c

Re: [RFC] cgroup TODOs

2012-09-17 Thread Tejun Heo
Hello, Glauber. On Mon, Sep 17, 2012 at 12:50:47PM +0400, Glauber Costa wrote: > > Can you be a bit more specific? > > What I mean is that if some operation needs to operate locked, they will > have to lock. Whether or not the locking is called from cgroup core or > not. If the lock is not availab

Re: [RFC] cgroup TODOs

2012-09-17 Thread Tejun Heo
Hello, On Mon, Sep 17, 2012 at 11:05:18AM -0400, Vivek Goyal wrote: > As a developer, I will be happy to support only one model and keep code > simple. I am only concerned that for blkcg we have still not charted > out a clear migration path. The warning message your patch is giving > out will wor

Re: [RFC] cgroup TODOs

2012-09-17 Thread Vivek Goyal
On Fri, Sep 14, 2012 at 02:57:01PM -0700, Tejun Heo wrote: [..] > > > cpu does the relative weight, so 'users' will have to deal with it > > > anyway regardless of blk, its effectively free of learning curve for all > > > subsequent controllers. > > > > I am inclined to keep it simple in kernel a

Re: [RFC] cgroup TODOs

2012-09-17 Thread Vivek Goyal
On Fri, Sep 14, 2012 at 02:39:38PM -0700, Tejun Heo wrote: [..] > > I am still little concerned about changing the blkio behavior > > unexpectedly. Can we have some kind of mount time flag which retains > > the old flat behavior and we warn user that this mode is deprecated > > and will soon be rem

Re: [RFC] cgroup TODOs

2012-09-17 Thread Vivek Goyal
On Fri, Sep 14, 2012 at 01:39:25PM -0700, Tejun Heo wrote: > Hello, again. > > On Fri, Sep 14, 2012 at 12:49:50PM -0700, Tejun Heo wrote: > > That said, if someone can think of a better solution, I'm all ears. > > One thing that *has* to be maintained is that it should be able to tag > > a resourc

Re: [RFC] cgroup TODOs

2012-09-17 Thread Aristeu Rozanski
On Sun, Sep 16, 2012 at 09:19:17AM +0100, James Bottomley wrote: > On Fri, 2012-09-14 at 14:36 -0400, Aristeu Rozanski wrote: > > also, heard about the desire of having a device namespace instead with > > support for translation ("sda" -> "sdf"). If anyone see immediate use for > > this please let

Re: [RFC] cgroup TODOs

2012-09-17 Thread Glauber Costa
On 09/14/2012 09:43 PM, Tejun Heo wrote: > Hello, Glauber. > > On Fri, Sep 14, 2012 at 12:16:31PM +0400, Glauber Costa wrote: >> Can we please keep some key userspace guys CCd? > > Yeap, thanks for adding the ccs. > >>> 1. cpu and cpuacct > ... >>> Me, working on it. >> I can work on it as wel

Re: [RFC] cgroup TODOs

2012-09-17 Thread Glauber Costa
On 09/15/2012 12:39 AM, Tejun Heo wrote: > Hello, again. > > On Fri, Sep 14, 2012 at 12:49:50PM -0700, Tejun Heo wrote: >> That said, if someone can think of a better solution, I'm all ears. >> One thing that *has* to be maintained is that it should be able to tag >> a resource in such way that it

Re: [RFC] cgroup TODOs

2012-09-16 Thread Eric W. Biederman
James Bottomley writes: > On Fri, 2012-09-14 at 14:36 -0400, Aristeu Rozanski wrote: >> also, heard about the desire of having a device namespace instead with >> support for translation ("sda" -> "sdf"). If anyone see immediate use for >> this please let me know. > > That sounds like a really bad

Re: [RFC] cgroup TODOs

2012-09-16 Thread James Bottomley
On Fri, 2012-09-14 at 14:36 -0400, Aristeu Rozanski wrote: > also, heard about the desire of having a device namespace instead with > support for translation ("sda" -> "sdf"). If anyone see immediate use for > this please let me know. That sounds like a really bad idea to me. We've spent ages tra

Re: [RFC] cgroup TODOs

2012-09-14 Thread Serge E. Hallyn
Quoting Aristeu Rozanski (a...@ruivo.org): > Tejun, > On Thu, Sep 13, 2012 at 01:58:27PM -0700, Tejun Heo wrote: > > memcg can be handled by memcg people and I can handle cgroup_freezer > > and others with help from the authors. The problematic one is > > blkio. If anyone is interested in w

Re: [RFC] cgroup TODOs

2012-09-14 Thread Tejun Heo
Hello, On Fri, Sep 14, 2012 at 06:03:16PM -0400, Dhaval Giani wrote: > > > > * Sort & unique when listing tasks. Even the documentation says it > > doesn't happen but we have a good hunk of code doing it in > > cgroup.c. I'm gonna rip it out at some point. Again, if you > > don't

Re: [RFC] cgroup TODOs

2012-09-14 Thread Dhaval Giani
> > * Sort & unique when listing tasks. Even the documentation says it > doesn't happen but we have a good hunk of code doing it in > cgroup.c. I'm gonna rip it out at some point. Again, if you > don't like it, scream. > I think some userspace tools do assume the uniq bit. So if w

Re: [RFC] cgroup TODOs

2012-09-14 Thread Tejun Heo
Hello, Vivek, Peter. On Fri, Sep 14, 2012 at 11:14:47AM -0400, Vivek Goyal wrote: > We don't have to start with 0%. We can keep a pool with dynamic % and > launch all the virtual machines from that single pool. So nobody starts > with 0%. If we require certain % for a machine, only then we look at

Re: [RFC] cgroup TODOs

2012-09-14 Thread Kay Sievers
On Fri, Sep 14, 2012 at 9:29 PM, Tejun Heo wrote: > On Fri, Sep 14, 2012 at 09:58:30AM -0400, Vivek Goyal wrote: >> I am little concerned about above and wondering how systemd and libvirt >> will interact and behave out of the box. >> >> Currently systemd does not create its own hierarchy under bl

Re: [RFC] cgroup TODOs

2012-09-14 Thread Tejun Heo
Hello, Vivek. On Fri, Sep 14, 2012 at 10:25:39AM -0400, Vivek Goyal wrote: > On Thu, Sep 13, 2012 at 01:58:27PM -0700, Tejun Heo wrote: > > [..] > > * blkio is the most problematic. It has two sub-controllers - cfq > > and blk-throttle. Both are utterly broken in terms of hierarchy > >

Re: [RFC] cgroup TODOs

2012-09-14 Thread Tejun Heo
Hello, again. On Fri, Sep 14, 2012 at 12:49:50PM -0700, Tejun Heo wrote: > That said, if someone can think of a better solution, I'm all ears. > One thing that *has* to be maintained is that it should be able to tag > a resource in such way that its associated controllers are > identifiable regard

Re: [RFC] cgroup TODOs

2012-09-14 Thread Tejun Heo
On Fri, Sep 14, 2012 at 12:44:39PM -0700, Tejun Heo wrote: > I think there currently is too much (broken) flexibility and intent to > remove it. That doesn't mean that removing all flexibility is the > right direction. It inherently is a balancing act and I think the > proposed solution is a rea

Re: [RFC] cgroup TODOs

2012-09-14 Thread Tejun Heo
Hello, Vivek. On Fri, Sep 14, 2012 at 03:28:40PM -0400, Vivek Goyal wrote: > Hmm.., In that case how libvirt will make use of blkio in the proposed > scheme. We can't disable blkio nesting at "system" level. So We will > have to disable it at each service level except "libvirtd" so that > libvirt

Re: [RFC] cgroup TODOs

2012-09-14 Thread Tejun Heo
Hello, (cc'ing Lennart and Kay) On Fri, Sep 14, 2012 at 09:58:30AM -0400, Vivek Goyal wrote: > I am little concerned about above and wondering how systemd and libvirt > will interact and behave out of the box. > > Currently systemd does not create its own hierarchy under blkio and > libvirt does

Re: [RFC] cgroup TODOs

2012-09-14 Thread Vivek Goyal
On Fri, Sep 14, 2012 at 11:53:24AM -0700, Tejun Heo wrote: [..] > In addition, for some resources, granularity beyond certain point > simply doesn't work. Per-service granularity might make sense for cpu > but applying it by default would be silly for blkio. Hmm.., In that case how libvirt will

Re: [RFC] cgroup TODOs

2012-09-14 Thread Tejun Heo
Hello, On Fri, Sep 14, 2012 at 02:36:41PM -0400, Aristeu Rozanski wrote: > if Serge is not planning to do it already, I can take a look in device_cgroup. Yes please. :) Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to major

Re: [RFC] cgroup TODOs

2012-09-14 Thread Tejun Heo
Hello, Vivek. On Fri, Sep 14, 2012 at 02:07:54PM -0400, Vivek Goyal wrote: > I am curious that why are you planning to provide capability of controller > specific view of hierarchy. To me it sounds pretty close to having > separate hierarchies per controller. Just that it is a little more > restri

Re: [RFC] cgroup TODOs

2012-09-14 Thread Aristeu Rozanski
Tejun, On Thu, Sep 13, 2012 at 01:58:27PM -0700, Tejun Heo wrote: > memcg can be handled by memcg people and I can handle cgroup_freezer > and others with help from the authors. The problematic one is > blkio. If anyone is interested in working on blkio, please be my > guest. Vivek? Gla

Re: [RFC] cgroup TODOs

2012-09-14 Thread Tejun Heo
Hello, On Fri, Sep 14, 2012 at 08:23:41PM +0200, Peter Zijlstra wrote: > Its hotplug, all hotplug stuff is synchronous, the last thing hotplug > needs is the added complexity of async callbacks. Also pushing stuff out > into worklets just to work around locking issues is vile. I was asking whethe

Re: [RFC] cgroup TODOs

2012-09-14 Thread Peter Zijlstra
On Fri, 2012-09-14 at 10:59 -0700, Tejun Heo wrote: > Hello, > > On Fri, Sep 14, 2012 at 05:12:31PM +0800, Li Zefan wrote: > > Agreed. The biggest issue in cpuset is if hotplug makes a cpuset's cpulist > > empty the tasks in it will be moved to an ancestor cgroup, which requires > > holding cgroup

Re: [RFC] cgroup TODOs

2012-09-14 Thread Vivek Goyal
On Thu, Sep 13, 2012 at 01:58:27PM -0700, Tejun Heo wrote: [..] > 6. Multiple hierarchies > > Apart from the apparent whness of it (I think I talked about > that enough the last time[1]), there's a basic problem when more > than one controllers interact - it's impossible to define a

Re: [RFC] cgroup TODOs

2012-09-14 Thread Tejun Heo
Hello, On Fri, Sep 14, 2012 at 05:12:31PM +0800, Li Zefan wrote: > Agreed. The biggest issue in cpuset is if hotplug makes a cpuset's cpulist > empty the tasks in it will be moved to an ancestor cgroup, which requires > holding cgroup lock. We have to either change cpuset's behavior or eliminate >

Re: [RFC] cgroup TODOs

2012-09-14 Thread Tejun Heo
Hello, Peter. On Fri, Sep 14, 2012 at 01:15:02PM +0200, Peter Zijlstra wrote: > On Thu, 2012-09-13 at 13:58 -0700, Tejun Heo wrote: > > The cpu ones handle nesting correctly - parent's accounting includes > > children's, parent's configuration affects children's unless > > explicitly overrid

Re: [RFC] cgroup TODOs

2012-09-14 Thread Tejun Heo
Hello, Glauber. On Fri, Sep 14, 2012 at 12:16:31PM +0400, Glauber Costa wrote: > Can we please keep some key userspace guys CCd? Yeap, thanks for adding the ccs. > > 1. cpu and cpuacct ... > > Me, working on it. > I can work on it as well if you want. I dealt with it many times in > the past,

Re: [RFC] cgroup TODOs

2012-09-14 Thread Tejun Heo
On Fri, Sep 14, 2012 at 11:04:44AM +0200, Mike Galbraith wrote: > On Thu, 2012-09-13 at 13:58 -0700, Tejun Heo wrote: > > > 7. Misc issues > > >* Extract synchronize_rcu() from user interface? Exporting grace > periods to userspace isn't wonderful for dynamic launchers. Aye aye. Also, * U

Re: [RFC] cgroup TODOs

2012-09-14 Thread Vivek Goyal
On Fri, Sep 14, 2012 at 04:53:29PM +0200, Peter Zijlstra wrote: > On Fri, 2012-09-14 at 10:25 -0400, Vivek Goyal wrote: > > So while % model is more intuitive to users, it is hard to implement. > > I don't agree with that. The fixed quota thing is counter-intuitive and > hard to use. It begets you

Re: [RFC] cgroup TODOs

2012-09-14 Thread Michal Hocko
On Thu 13-09-12 13:58:27, Tejun Heo wrote: [...] > 2. memcg's __DEPRECATED_clear_css_refs > > This is a remnant of another weird design decision of requiring > synchronous draining of refcnts on cgroup removal and allowing > subsystems to veto cgroup removal - what's the userspace supposed t

Re: [RFC] cgroup TODOs

2012-09-14 Thread Peter Zijlstra
On Fri, 2012-09-14 at 10:25 -0400, Vivek Goyal wrote: > So while % model is more intuitive to users, it is hard to implement. I don't agree with that. The fixed quota thing is counter-intuitive and hard to use. It begets you questions like: why, if everything is idle except my task, am I not gettin

Re: [RFC] cgroup TODOs

2012-09-14 Thread Vivek Goyal
On Thu, Sep 13, 2012 at 01:58:27PM -0700, Tejun Heo wrote: [..] > * blkio is the most problematic. It has two sub-controllers - cfq > and blk-throttle. Both are utterly broken in terms of hierarchy > support and the former is known to have pretty hairy code base. I > don't see any

Re: [RFC] cgroup TODOs

2012-09-14 Thread Vivek Goyal
On Fri, Sep 14, 2012 at 10:10:32AM +0100, Daniel P. Berrange wrote: [..] > > 6. Multiple hierarchies > > > > Apart from the apparent whness of it (I think I talked about > > that enough the last time[1]), there's a basic problem when more > > than one controllers interact - it's imp

Re: [RFC] cgroup TODOs

2012-09-14 Thread Daniel P. Berrange
On Fri, Sep 14, 2012 at 01:15:02PM +0200, Peter Zijlstra wrote: > On Thu, 2012-09-13 at 13:58 -0700, Tejun Heo wrote: > > The cpu ones handle nesting correctly - parent's accounting includes > > children's, parent's configuration affects children's unless > > explicitly overridden, and childr

Re: [RFC] cgroup TODOs

2012-09-14 Thread Peter Zijlstra
On Fri, 2012-09-14 at 17:12 +0800, Li Zefan wrote: > > I think this is a pressing problem, yes, but not the only problem with > > cgroup lock. Even if we restrict its usage to cgroup core, we still can > > call cgroup functions, which will lock. And then we gain nothing. > > > > Agreed. The bigge

Re: [RFC] cgroup TODOs

2012-09-14 Thread Peter Zijlstra
On Thu, 2012-09-13 at 13:58 -0700, Tejun Heo wrote: > The cpu ones handle nesting correctly - parent's accounting includes > children's, parent's configuration affects children's unless > explicitly overridden, and children's limits nest inside parent's. The implementation has some issues w

Re: [RFC] cgroup TODOs

2012-09-14 Thread Li Zefan
>> >> 2. memcg's __DEPRECATED_clear_css_refs >> >> This is a remnant of another weird design decision of requiring >> synchronous draining of refcnts on cgroup removal and allowing >> subsystems to veto cgroup removal - what's the userspace supposed to >> do afterwards? Note that this also

Re: [RFC] cgroup TODOs

2012-09-14 Thread Daniel P. Berrange
On Thu, Sep 13, 2012 at 01:58:27PM -0700, Tejun Heo wrote: > 5. I CAN HAZ HIERARCHIES? > > The cpu ones handle nesting correctly - parent's accounting includes > children's, parent's configuration affects children's unless > explicitly overridden, and children's limits nest inside parent's.

Re: [RFC] cgroup TODOs

2012-09-14 Thread Mike Galbraith
On Thu, 2012-09-13 at 13:58 -0700, Tejun Heo wrote: > 7. Misc issues > * Extract synchronize_rcu() from user interface? Exporting grace periods to userspace isn't wonderful for dynamic launchers. -Mike

[RFC] cgroup TODOs

2012-09-13 Thread Tejun Heo
Hello, guys. Here's the write-up I promised last week about what I think are the problems in cgroup and what the current plans are. First of all, it's a mess. Shame on me. Shame on you. Shame on all of us for allowing this mess. Let's all tremble in shame for solid ten seconds before proceedi