Re: [PATCH 1/2] workqueue: Move wq_update_unbound_numa() to the beginning of CPU_ONLINE

2016-06-21 Thread Gautham R Shenoy
On Tue, Jun 21, 2016 at 09:47:19PM +0200, Peter Zijlstra wrote: > On Tue, Jun 21, 2016 at 03:43:56PM -0400, Tejun Heo wrote: > > On Tue, Jun 21, 2016 at 09:37:09PM +0200, Peter Zijlstra wrote: > > > Hurm.. So I've applied it, just to get this issue sorted, but I'm not > > > entirely sure I like

Re: [PATCH 1/2] workqueue: Move wq_update_unbound_numa() to the beginning of CPU_ONLINE

2016-06-21 Thread Peter Zijlstra
On Tue, Jun 21, 2016 at 11:36:51AM -0400, Tejun Heo wrote: > On Tue, Jun 21, 2016 at 07:42:31PM +0530, Gautham R Shenoy wrote: > > > Subject: [PATCH] sched: allow kthreads to fallback to online && !active > > > cpus > > > > > > During CPU hotplug, CPU_ONLINE callbacks are run while the CPU is >

Re: [PATCH 1/2] workqueue: Move wq_update_unbound_numa() to the beginning of CPU_ONLINE

2016-06-21 Thread Peter Zijlstra
On Tue, Jun 21, 2016 at 03:43:56PM -0400, Tejun Heo wrote: > On Tue, Jun 21, 2016 at 09:37:09PM +0200, Peter Zijlstra wrote: > > Hurm.. So I've applied it, just to get this issue sorted, but I'm not > > entirely sure I like it. > > > > I think I prefer ego's version because that makes it harder

Re: [PATCH 1/2] workqueue: Move wq_update_unbound_numa() to the beginning of CPU_ONLINE

2016-06-21 Thread Tejun Heo
On Tue, Jun 21, 2016 at 09:37:09PM +0200, Peter Zijlstra wrote: > Hurm.. So I've applied it, just to get this issue sorted, but I'm not > entirely sure I like it. > > I think I prefer ego's version because that makes it harder to get stuff > to run on !active,online cpus. I think we really want

Re: [PATCH 1/2] workqueue: Move wq_update_unbound_numa() to the beginning of CPU_ONLINE

2016-06-21 Thread Tejun Heo
On Tue, Jun 21, 2016 at 07:42:31PM +0530, Gautham R Shenoy wrote: > > Subject: [PATCH] sched: allow kthreads to fallback to online && !active cpus > > > > During CPU hotplug, CPU_ONLINE callbacks are run while the CPU is > > online but not active. A CPU_ONLINE callback may create or bind a > >

Re: [PATCH 1/2] workqueue: Move wq_update_unbound_numa() to the beginning of CPU_ONLINE

2016-06-21 Thread Gautham R Shenoy
Hi Tejun, On Thu, Jun 16, 2016 at 03:35:04PM -0400, Tejun Heo wrote: > Hello, > > So, the issue of the initial worker not having its affinity set > correctly wasn't caused by the order of the operations. Reordering > just made set_cpus_allowed tried one more time late enough so that it > hides

Re: [PATCH 1/2] workqueue: Move wq_update_unbound_numa() to the beginning of CPU_ONLINE

2016-06-16 Thread Tejun Heo
Hello, So, the issue of the initial worker not having its affinity set correctly wasn't caused by the order of the operations. Reordering just made set_cpus_allowed tried one more time late enough so that it hides the race condition most of the time. The problem is that CPU_ONLINE callbacks are

Re: [PATCH 1/2] workqueue: Move wq_update_unbound_numa() to the beginning of CPU_ONLINE

2016-06-15 Thread Gautham R Shenoy
Hello Tejun, On Wed, Jun 15, 2016 at 11:53:50AM -0400, Tejun Heo wrote: > Hello, > > On Tue, Jun 07, 2016 at 08:44:02PM +0530, Gautham R. Shenoy wrote: > > Currently in the CPU_ONLINE workqueue handler, the > > restore_unbound_workers_cpumask() will never call > > set_cpus_allowed_ptr() for a

Re: [PATCH 1/2] workqueue: Move wq_update_unbound_numa() to the beginning of CPU_ONLINE

2016-06-15 Thread Tejun Heo
Hello, On Tue, Jun 07, 2016 at 08:44:02PM +0530, Gautham R. Shenoy wrote: > Currently in the CPU_ONLINE workqueue handler, the > restore_unbound_workers_cpumask() will never call > set_cpus_allowed_ptr() for a newly created unbound worker thread. Hmmm... did you actually verify that this

[PATCH 1/2] workqueue: Move wq_update_unbound_numa() to the beginning of CPU_ONLINE

2016-06-07 Thread Gautham R. Shenoy
Currently in the CPU_ONLINE workqueue handler, the restore_unbound_workers_cpumask() will never call set_cpus_allowed_ptr() for a newly created unbound worker thread. This is because the function which creates a new unbound worker thread when the first CPU in the node comes online
