Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-12 Thread David Rientjes
On Thu, 11 Oct 2007, Paul Jackson wrote: > Hmmm ... I hadn't noticed that sched_hotcpu_mutex before. > > I wonder what it is guarding? As best as I can guess, it seems, at > least in part, to be keeping the following two items consistent: > 1) cpu_online_map Yes, it protects against cpu

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-12 Thread David Rientjes
On Thu, 11 Oct 2007, Paul Jackson wrote: Hmmm ... I hadn't noticed that sched_hotcpu_mutex before. I wonder what it is guarding? As best as I can guess, it seems, at least in part, to be keeping the following two items consistent: 1) cpu_online_map Yes, it protects against cpu hot-plug

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-11 Thread Paul Jackson
> Since we know that the tasks are not running, and that we have > exclusive access to the tasks in the control group we can take action > as if we were the actual tasks themselves. Which should simplify > locking. The Big Kernel Lock (BKL), born again, as a Medium Sized Cgroup Lock ? This only

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-11 Thread Eric W. Biederman
Stupid question. Would it help at all if we structured this as: - Take the control group off of the cpus and runqueues. - Modify the tasks in the control group. - Place the control group back on the runqueues. Essentially this is what ptrace does (except for one task at a time). Since we know

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-11 Thread Paul Jackson
David wrote: > you could protect the whole thing by sched_hotcpu_mutex, which is > expressly designed for migrations. > > Something like this: > > struct cgroup_iter it; > struct task_struct *p, **tasks; > int i = 0; > > cgroup_iter_start(cs->css.cgroup, ); >

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-11 Thread Paul Jackson
Several days ago, Paul M replied to Paul J: > > Hmmm ... guess I'd have to loop over the cgroup twice, once to count > > them (the 'count' field is not claimed to be accurate) and then again, > > after I've kmalloc'd the tasks[] array, filling in the tasks[] array. > > > > On a big cgroup on a big

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-11 Thread Paul Jackson
Several days ago, Paul M replied to Paul J: Hmmm ... guess I'd have to loop over the cgroup twice, once to count them (the 'count' field is not claimed to be accurate) and then again, after I've kmalloc'd the tasks[] array, filling in the tasks[] array. On a big cgroup on a big system,

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-11 Thread Paul Jackson
David wrote: you could protect the whole thing by sched_hotcpu_mutex, which is expressly designed for migrations. Something like this: struct cgroup_iter it; struct task_struct *p, **tasks; int i = 0; cgroup_iter_start(cs-css.cgroup, it); while ((p =

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-11 Thread Eric W. Biederman
Stupid question. Would it help at all if we structured this as: - Take the control group off of the cpus and runqueues. - Modify the tasks in the control group. - Place the control group back on the runqueues. Essentially this is what ptrace does (except for one task at a time). Since we know

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-11 Thread Paul Jackson
Since we know that the tasks are not running, and that we have exclusive access to the tasks in the control group we can take action as if we were the actual tasks themselves. Which should simplify locking. The Big Kernel Lock (BKL), born again, as a Medium Sized Cgroup Lock ? This only

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-10 Thread David Rientjes
On Wed, 10 Oct 2007, Paul Menage wrote: > On 10/6/07, David Rientjes <[EMAIL PROTECTED]> wrote: > > > > It can race with sched_setaffinity(). It has to give up tasklist_lock as > > well to call set_cpus_allowed() and can race > > > > cpus_allowed = cpuset_cpus_allowed(p); > >

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-10 Thread Paul Menage
On 10/6/07, David Rientjes <[EMAIL PROTECTED]> wrote: > > It can race with sched_setaffinity(). It has to give up tasklist_lock as > well to call set_cpus_allowed() and can race > > cpus_allowed = cpuset_cpus_allowed(p); > cpus_and(new_mask, new_mask, cpus_allowed); >

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-10 Thread Paul Menage
On 10/6/07, David Rientjes [EMAIL PROTECTED] wrote: It can race with sched_setaffinity(). It has to give up tasklist_lock as well to call set_cpus_allowed() and can race cpus_allowed = cpuset_cpus_allowed(p); cpus_and(new_mask, new_mask, cpus_allowed); retval =

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-10 Thread David Rientjes
On Wed, 10 Oct 2007, Paul Menage wrote: On 10/6/07, David Rientjes [EMAIL PROTECTED] wrote: It can race with sched_setaffinity(). It has to give up tasklist_lock as well to call set_cpus_allowed() and can race cpus_allowed = cpuset_cpus_allowed(p); cpus_and(new_mask,

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-07 Thread David Rientjes
On Sat, 6 Oct 2007, Paul Menage wrote: > > The getting and putting of the tasks will prevent them from exiting or > > being deallocated prematurely. But this is also a critical section that > > will need to be protected by some mutex so it doesn't race with other > > set_cpus_allowed(). > > Is

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-07 Thread David Rientjes
On Sat, 6 Oct 2007, Paul Jackson wrote: > > struct cgroup_iter it; > > struct task_struct *p, **tasks; > > int i = 0; > > > > cgroup_iter_start(cs->css.cgroup, ); > > while ((p = cgroup_iter_next(cs->css.cgroup, ))) { > > get_task_struct(p); > >

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-07 Thread David Rientjes
On Sat, 6 Oct 2007, Paul Jackson wrote: struct cgroup_iter it; struct task_struct *p, **tasks; int i = 0; cgroup_iter_start(cs-css.cgroup, it); while ((p = cgroup_iter_next(cs-css.cgroup, it))) { get_task_struct(p); tasks[i++] = p;

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-07 Thread David Rientjes
On Sat, 6 Oct 2007, Paul Menage wrote: The getting and putting of the tasks will prevent them from exiting or being deallocated prematurely. But this is also a critical section that will need to be protected by some mutex so it doesn't race with other set_cpus_allowed(). Is that

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-06 Thread Paul Jackson
Paul M wrote: > > What's wrong with: > > allocate a page of task_struct pointers > again: > need_repeat = false; > cgroup_iter_start(); > while (cgroup_iter_next()) { > if (p->cpus_allowed != new_cpumask) { > store p; > if (page is full) { > need_repeat = true; >

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-06 Thread Paul Menage
On 10/6/07, David Rientjes <[EMAIL PROTECTED]> wrote: > The getting and putting of the tasks will prevent them from exiting or > being deallocated prematurely. But this is also a critical section that > will need to be protected by some mutex so it doesn't race with other > set_cpus_allowed().

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-06 Thread Paul Menage
On 10/6/07, Paul Jackson <[EMAIL PROTECTED]> wrote: > David wrote: > > It would probably be better to just save references to the tasks. > > > > struct cgroup_iter it; > > struct task_struct *p, **tasks; > > int i = 0; > > > > cgroup_iter_start(cs->css.cgroup, ); > >

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-06 Thread Paul Menage
On 10/6/07, Paul Jackson <[EMAIL PROTECTED]> wrote: > > This isn't working for me. > > The key kernel routine for updating a tasks cpus_allowed > cannot be called while holding a spinlock. > > But the above loop holds a spinlock, css_set_lock, between > the cgroup_iter_start and the

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-06 Thread Paul Jackson
David wrote: > It would probably be better to just save references to the tasks. > > struct cgroup_iter it; > struct task_struct *p, **tasks; > int i = 0; > > cgroup_iter_start(cs->css.cgroup, ); > while ((p = cgroup_iter_next(cs->css.cgroup, ))) { >

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-06 Thread David Rientjes
On Sat, 6 Oct 2007, Paul Jackson wrote: > This isn't working for me. > > The key kernel routine for updating a tasks cpus_allowed > cannot be called while holding a spinlock. > > But the above loop holds a spinlock, css_set_lock, between > the cgroup_iter_start and the cgroup_iter_end. > > I

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-06 Thread Paul Jackson
Paul Menage wrote: > What was wrong with my suggestion from a couple of emails back? Adding > the following in cpuset_attach(): > > struct cgroup_iter it; > struct task_struct *p; > cgroup_iter_start(cs->css.cgroup, ); > while ((p = cgroup_iter_next(cs->css.cgroup, ))) >

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-06 Thread Paul Jackson
Paul Menage wrote: What was wrong with my suggestion from a couple of emails back? Adding the following in cpuset_attach(): struct cgroup_iter it; struct task_struct *p; cgroup_iter_start(cs-css.cgroup, it); while ((p = cgroup_iter_next(cs-css.cgroup, it))) set_cpus_allowed(p,

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-06 Thread David Rientjes
On Sat, 6 Oct 2007, Paul Jackson wrote: This isn't working for me. The key kernel routine for updating a tasks cpus_allowed cannot be called while holding a spinlock. But the above loop holds a spinlock, css_set_lock, between the cgroup_iter_start and the cgroup_iter_end. I end up

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-06 Thread Paul Jackson
David wrote: It would probably be better to just save references to the tasks. struct cgroup_iter it; struct task_struct *p, **tasks; int i = 0; cgroup_iter_start(cs-css.cgroup, it); while ((p = cgroup_iter_next(cs-css.cgroup, it))) {

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-06 Thread Paul Menage
On 10/6/07, Paul Jackson [EMAIL PROTECTED] wrote: This isn't working for me. The key kernel routine for updating a tasks cpus_allowed cannot be called while holding a spinlock. But the above loop holds a spinlock, css_set_lock, between the cgroup_iter_start and the cgroup_iter_end. I end

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-06 Thread Paul Menage
On 10/6/07, Paul Jackson [EMAIL PROTECTED] wrote: David wrote: It would probably be better to just save references to the tasks. struct cgroup_iter it; struct task_struct *p, **tasks; int i = 0; cgroup_iter_start(cs-css.cgroup, it); while ((p =

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-06 Thread Paul Menage
On 10/6/07, David Rientjes [EMAIL PROTECTED] wrote: The getting and putting of the tasks will prevent them from exiting or being deallocated prematurely. But this is also a critical section that will need to be protected by some mutex so it doesn't race with other set_cpus_allowed(). Is that

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-06 Thread Paul Jackson
Paul M wrote: What's wrong with: allocate a page of task_struct pointers again: need_repeat = false; cgroup_iter_start(); while (cgroup_iter_next()) { if (p-cpus_allowed != new_cpumask) { store p; if (page is full) { need_repeat = true; break;

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Menage
On 10/3/07, Paul Jackson <[EMAIL PROTECTED]> wrote: > > But now (correct me if I'm wrong here) cgroups has a per-cgroup task > list, and the above loop has cost linear in the number of tasks > actually in the cgroup, plus (unfortunate but necessary and tolerable) > the cost of taking a global

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Jackson
Andrew - please kill this patch. Looks like Paul Menage has a better solution that I will be trying out. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401 - To unsubscribe

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Jackson
> What was wrong with my suggestion from a couple of emails back? Adding > the following in cpuset_attach(): > > struct cgroup_iter it; > struct task_struct *p; > while ((p = cgroup_iter_next(cs->css.cgroup, ))) { >set_cpus_allowed(p, cs->cpus_allowed); > } >

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Menage
On 10/3/07, Paul Jackson <[EMAIL PROTECTED]> wrote: > > So far as I can tell, this patch just removes a minor optimization, It's not minor for any subsystem that has a non-trivial attach() operation. > > Any further suggestions, or embarrassing (;) questions? > What was wrong with my suggestion

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Jackson
Paul M wrote: > Given that later in cpusets.txt you say: > > >If hotplug functionality is used > >to remove all the CPUs that are currently assigned to a cpuset, > >then the kernel will automatically update the cpus_allowed of all > >tasks attached to CPUs in that cpuset to allow all CPUs > >

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Menage
On 10/3/07, Paul Menage <[EMAIL PROTECTED]> wrote: > > What's wrong with, in update_cpumask(), doing a loop across all > members of the cgroup and updating their cpus_allowed fields? i.e. something like: struct cgroup_iter it; struct task_struct *p; while ((p = cgroup_iter_next(cs->css.cgroup,

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Menage
On 10/3/07, Paul Jackson <[EMAIL PROTECTED]> wrote: > > Yes, something in user space has to do it. It's part of the > kernel-user cpuset API. If you change a cpuset's 'cpus', then > you have to rewrite each pid in its 'tasks' file back to that > 'tasks' file in order to get that 'cpus' change to

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Jackson
Paul M wrote: > > The code in kernel/cgroup.c attach_task() which skips the > > attachment of a task to the group it is already in has to be > > removed. Cpusets depends on reattaching a task to its current > > cpuset, in order to trigger updating the cpus_allowed mask in the > > task struct. >

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Menage
On 10/3/07, Paul Jackson <[EMAIL PROTECTED]> wrote: > From: Paul Jackson <[EMAIL PROTECTED]> > > The code in kernel/cgroup.c attach_task() which skips the > attachment of a task to the group it is already in has to be > removed. Cpusets depends on reattaching a task to its current > cpuset, in

[PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Jackson
From: Paul Jackson <[EMAIL PROTECTED]> The code in kernel/cgroup.c attach_task() which skips the attachment of a task to the group it is already in has to be removed. Cpusets depends on reattaching a task to its current cpuset, in order to trigger updating the cpus_allowed mask in the task

[PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Jackson
From: Paul Jackson [EMAIL PROTECTED] The code in kernel/cgroup.c attach_task() which skips the attachment of a task to the group it is already in has to be removed. Cpusets depends on reattaching a task to its current cpuset, in order to trigger updating the cpus_allowed mask in the task struct.

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Menage
On 10/3/07, Paul Jackson [EMAIL PROTECTED] wrote: From: Paul Jackson [EMAIL PROTECTED] The code in kernel/cgroup.c attach_task() which skips the attachment of a task to the group it is already in has to be removed. Cpusets depends on reattaching a task to its current cpuset, in order to

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Jackson
Paul M wrote: The code in kernel/cgroup.c attach_task() which skips the attachment of a task to the group it is already in has to be removed. Cpusets depends on reattaching a task to its current cpuset, in order to trigger updating the cpus_allowed mask in the task struct. Can you

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Menage
On 10/3/07, Paul Jackson [EMAIL PROTECTED] wrote: Yes, something in user space has to do it. It's part of the kernel-user cpuset API. If you change a cpuset's 'cpus', then you have to rewrite each pid in its 'tasks' file back to that 'tasks' file in order to get that 'cpus' change to be

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Menage
On 10/3/07, Paul Menage [EMAIL PROTECTED] wrote: What's wrong with, in update_cpumask(), doing a loop across all members of the cgroup and updating their cpus_allowed fields? i.e. something like: struct cgroup_iter it; struct task_struct *p; while ((p = cgroup_iter_next(cs-css.cgroup, it))) {

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Jackson
Paul M wrote: Given that later in cpusets.txt you say: If hotplug functionality is used to remove all the CPUs that are currently assigned to a cpuset, then the kernel will automatically update the cpus_allowed of all tasks attached to CPUs in that cpuset to allow all CPUs why can't the

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Menage
On 10/3/07, Paul Jackson [EMAIL PROTECTED] wrote: So far as I can tell, this patch just removes a minor optimization, It's not minor for any subsystem that has a non-trivial attach() operation. Any further suggestions, or embarrassing (;) questions? What was wrong with my suggestion from a

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Jackson
What was wrong with my suggestion from a couple of emails back? Adding the following in cpuset_attach(): struct cgroup_iter it; struct task_struct *p; while ((p = cgroup_iter_next(cs-css.cgroup, it))) { set_cpus_allowed(p, cs-cpus_allowed); } cgroup_iter_end(cs-css.cgroup, it);

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Jackson
Andrew - please kill this patch. Looks like Paul Menage has a better solution that I will be trying out. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson [EMAIL PROTECTED] 1.925.600.0401 - To unsubscribe from

Re: [PATCH] task containersv11 add tasks file interface fix for cpusets

2007-10-03 Thread Paul Menage
On 10/3/07, Paul Jackson [EMAIL PROTECTED] wrote: But now (correct me if I'm wrong here) cgroups has a per-cgroup task list, and the above loop has cost linear in the number of tasks actually in the cgroup, plus (unfortunate but necessary and tolerable) the cost of taking a global