Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-02-20 Thread Christoph Lameter
On Tue, 20 Feb 2007, Max Krasnyansky wrote: > Ok. Sounds like disabling cache_reaper is a better option for now. Like you > said > it's unlikely that slabs will grow much if that cpu is not heavily used by the > kernel. Running for prolonged times without cache_reaper is no good. What we are tal

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-02-20 Thread Max Krasnyansky
Christoph Lameter wrote: On Tue, 20 Feb 2007, Max Krasnyansky wrote: I guess I kind of hijacked the thread. The second part of my first email was dropped. Basically I was saying that I'm working on CPU isolation extensions. Where an isolated CPU is not supposed to do much kernel work. In which

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-02-20 Thread Christoph Lameter
On Tue, 20 Feb 2007, Max Krasnyansky wrote: > I guess I kind of hijacked the thread. The second part of my first email was > dropped. Basically I was saying that I'm working on CPU isolation extensions. > Where an isolated CPU is not supposed to do much kernel work. In which case > you'd want to r

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-02-20 Thread Christoph Lameter
On Tue, 20 Feb 2007, Max Krasnyansky wrote: > I agree that running the reaper on the wrong CPU is not the best way to go > about it. > But it seems like disabling it is even worse, unless I missing something. ie > wasting > memory. Disabling during shutdown is no problem because the per cpu cache

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-02-20 Thread Max Krasnyansky
Oleg Nesterov wrote: On 02/20, Christoph Lameter wrote: On Tue, 20 Feb 2007, Max Krasnyansky wrote: Well seems that we have a set of unresolved issues with workqueues and cpu hotplug. How about storing 'cpu' explicitly in the work queue instead of relying on the smp_processor_id() and friends

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-02-20 Thread Max Krasnyansky
Christoph Lameter wrote: On Tue, 20 Feb 2007, Max Krasnyansky wrote: Well seems that we have a set of unresolved issues with workqueues and cpu hotplug. How about storing 'cpu' explicitly in the work queue instead of relying on the smp_processor_id() and friends ? That way there is no ambiguit

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-02-20 Thread Oleg Nesterov
On 02/20, Christoph Lameter wrote: > > On Tue, 20 Feb 2007, Max Krasnyansky wrote: > > > > > Well seems that we have a set of unresolved issues with workqueues and > > > > cpu > > > > hotplug. > > > > How about storing 'cpu' explicitly in the work queue instead of relying on > > the > > smp_pro

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-02-20 Thread Max Krasnyansky
Folks, Oleg Nesterov wrote: Even if smp_processor_id() was stable during the execution of cache_reap(), this work_struct can be moved to another CPU if CPU_DEAD happens. We can't avoid this, and this is correct. Uhh This may not be correct in terms of how the slab operates. But this is pr

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-02-20 Thread Christoph Lameter
On Tue, 20 Feb 2007, Max Krasnyansky wrote: > > > Well seems that we have a set of unresolved issues with workqueues and cpu > > > hotplug. > > How about storing 'cpu' explicitly in the work queue instead of relying on the > smp_processor_id() and friends ? That way there is no ambiguity when > t

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-01-29 Thread Oleg Nesterov
On 01/29, Christoph Lameter wrote: > > On Tue, 30 Jan 2007, Oleg Nesterov wrote: > > > Now we have 2 additional events, CPU_LOCK_ACQUIRE/CPU_LOCK_RELEASE, > > so cpuup_callback() can use them to lock/unlock cache_chain_mutex, > > but this is not related. > > Then we wont need to do the mutex_lock

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-01-29 Thread Christoph Lameter
On Tue, 30 Jan 2007, Oleg Nesterov wrote: > Now we have 2 additional events, CPU_LOCK_ACQUIRE/CPU_LOCK_RELEASE, > so cpuup_callback() can use them to lock/unlock cache_chain_mutex, > but this is not related. Then we wont need to do the mutex_lock/unlock in CPU_DOWN_XX anymore, right? Which bring

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-01-29 Thread Oleg Nesterov
On 01/29, Christoph Lameter wrote: > > Here is the patch against 2.6.20-rc6-mm2. CPU_DOWN_PREPARE and > CPU_DOWN_FAILED somehow vanished in mm? No, no, there are still in place, so I believe your patch is good. Now we have 2 additional events, CPU_LOCK_ACQUIRE/CPU_LOCK_RELEASE, so cpuup_callback(

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-01-29 Thread Christoph Lameter
On Mon, 29 Jan 2007, Oleg Nesterov wrote: > > But we could delay CPU_DOWN in the handler for the slab until we know that > > the cache_reaper is no longer running? > > Hmm... I don't undestand this. We can delay CPU_DOWN if we cancel cache_reaper > like you did in the previous patch. Did you mea

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-01-29 Thread Oleg Nesterov
On 01/29, Christoph Lameter wrote: > > On Mon, 29 Jan 2007, Oleg Nesterov wrote: > > > > > Even if smp_processor_id() was stable during the execution of > > > > cache_reap(), > > > > this work_struct can be moved to another CPU if CPU_DEAD happens. We > > > > can't > > > > avoid this, and this i

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-01-29 Thread Oleg Nesterov
On 01/29, Christoph Lameter wrote: > > On Mon, 29 Jan 2007, Oleg Nesterov wrote: > > > > The slab would need a notification > > > that the workqueue for a processor was shutdown in order to set work.func > > > = NULL. > > > > The slab has a notification: CPU

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-01-29 Thread Christoph Lameter
On Mon, 29 Jan 2007, Oleg Nesterov wrote: > > > Even if smp_processor_id() was stable during the execution of > > > cache_reap(), > > > this work_struct can be moved to another CPU if CPU_DEAD happens. We can't > > > avoid this, and this is correct. > > > > Uhh This may not be correct in ter

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-01-29 Thread Christoph Lameter
On Mon, 29 Jan 2007, Oleg Nesterov wrote: > > The slab would need a notification > > that the workqueue for a processor was shutdown in order to set work.func > > = NULL. > > The slab has a notification: CPU_XXX events. It should cancel a pending per > cpu "

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-01-29 Thread Oleg Nesterov
On 01/29, Christoph Lameter wrote: > > On Mon, 29 Jan 2007, Oleg Nesterov wrote: > > > > > This is wrong. Suppose we have a CPU_UP,CPU_DOWN,CPU_UP sequence. The > > > > last > > > > CPU_UP will not restart a per-cpu "cache_reap timer". > > > > > > Why? > > > > Because the last CPU_UP calls star

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-01-29 Thread Christoph Lameter
On Mon, 29 Jan 2007, Oleg Nesterov wrote: > > > This is wrong. Suppose we have a CPU_UP,CPU_DOWN,CPU_UP sequence. The last > > > CPU_UP will not restart a per-cpu "cache_reap timer". > > > > Why? > > Because the last CPU_UP calls start_cpu_timer(), but since ->work.func != NULL > we don't do sch

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-01-29 Thread Oleg Nesterov
On 01/29, Christoph Lameter wrote: > > On Mon, 29 Jan 2007, Oleg Nesterov wrote: > > > Now, > > static void __devinit start_cpu_timer(int cpu) > > { > > struct delayed_work *reap_work = &per_cpu(reap_work, cpu); > > > > if (keventd_up() && reap_work->work.func == N

Re: slab: start_cpu_timer/cache_reap CONFIG_HOTPLUG_CPU problems

2007-01-29 Thread Christoph Lameter
On Mon, 29 Jan 2007, Oleg Nesterov wrote: > Now, > static void __devinit start_cpu_timer(int cpu) > { > struct delayed_work *reap_work = &per_cpu(reap_work, cpu); > > if (keventd_up() && reap_work->work.func == NULL) { > init_reap_node