Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU

2005-04-05 Thread Nathan Lynch
On Tue, Apr 05, 2005 at 09:55:06AM +0800, Li Shaohua wrote:

> On Mon, 2005-04-04 at 23:33, Nathan Lynch wrote:
> > No.  It should make zero difference to the scheduler whether the "play
> > dead" cpu hotplug or "physical" hotplug is being used.  
> Keeping some fields like 'cpu_load' is meaningless for a hot-added CPU to
> me. Just ignore them?

Reinitializing such things during the CPU_UP_PREPARE case in
migration_call should be sufficient, if it's not done already.
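
A minimal user-space sketch of that suggestion, for illustration only. All
names here (toy_runqueue, toy_migration_call, the EV_* events) are
hypothetical stand-ins and not the kernel's real definitions; the only point
is that stale per-cpu fields such as cpu_load get reset in the "up prepare"
step rather than at removal time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's real types. */
enum toy_hotplug_event { EV_UP_PREPARE, EV_ONLINE, EV_DEAD };

struct toy_runqueue {
    unsigned long cpu_load;    /* load estimate, stale after a removal */
    unsigned long nr_running;  /* tasks queued on this cpu */
    int active_balance;
    int push_cpu;
};

#define TOY_NR_CPUS 4
static struct toy_runqueue toy_rq[TOY_NR_CPUS];

/*
 * Toy hotplug callback: reset stale per-cpu scheduler state when a cpu is
 * about to come up. Other events leave the runqueue untouched.
 * Returns 0 on success, -1 for an out-of-range cpu.
 */
int toy_migration_call(enum toy_hotplug_event ev, int cpu)
{
    if (cpu < 0 || cpu >= TOY_NR_CPUS)
        return -1;

    switch (ev) {
    case EV_UP_PREPARE:
        /* Nothing may be queued on a cpu that is not online yet. */
        assert(toy_rq[cpu].nr_running == 0);
        toy_rq[cpu].cpu_load = 0;
        toy_rq[cpu].active_balance = 0;
        toy_rq[cpu].push_cpu = 0;
        break;
    default:
        break;
    }
    return 0;
}
```

With this shape, a "play dead" removal and a physical removal look the same
to the scheduler; the reset happens only on the way back up.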


Nathan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU

2005-04-05 Thread Li Shaohua
On Mon, 2005-04-04 at 23:33, Nathan Lynch wrote:
> > > 
> > > I don't understand why this is needed at all.  It looks like a fair
> > > amount of code from do_exit is being duplicated here.  
> > Yes, exactly. Someone who understands do_exit, please help clean up the
> > code. I'd like to remove the idle thread, since the smpboot code will
> > create a new idle thread.
> 
> I'd say fix the smpboot code so that it doesn't create new idle tasks
> except during boot.
I tried what you suggested, but I had to use an ugly method to adjust
idle->thread.esp (the stack pointer on IA32). Otherwise, the stack soon
overflows after several rounds of hotplug. I'll take a closer look at
whether other fields in thread_info cause problems.
Did you reinitialize the idle task's thread_info on ppc? I have no problem
doing it on IA32, but is this a good approach? Creating a new idle thread
for an upcoming CPU looks more graceful to me.
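
To illustrate the stack-pointer issue described above, here is a small
user-space model, not IA32 kernel code. The struct and function names
(sketch_idle_task, sketch_prepare_idle, sketch_run_round) are hypothetical,
and saved_sp only plays the role of idle->thread.esp. The sketch assumes
that each online round consumes some stack below the saved pointer, so a
reused idle task needs the pointer put back at the top of its stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_STACK_SIZE 8192

/* Hypothetical model of an idle task with its own stack. */
struct sketch_idle_task {
    unsigned char stack[SKETCH_STACK_SIZE];
    uintptr_t saved_sp;  /* plays the role of idle->thread.esp */
};

/* Address one past the end of the stack (stacks grow downward). */
static uintptr_t sketch_stack_top(const struct sketch_idle_task *t)
{
    return (uintptr_t)(t->stack + SKETCH_STACK_SIZE);
}

/* Bytes currently in use below the top of the stack. */
size_t sketch_stack_used(const struct sketch_idle_task *t)
{
    return (size_t)(sketch_stack_top(t) - t->saved_sp);
}

/* Reset the saved stack pointer before the task is reused. */
void sketch_prepare_idle(struct sketch_idle_task *t)
{
    t->saved_sp = sketch_stack_top(t);
}

/* Simulate one online/offline round that consumes `used` bytes. */
void sketch_run_round(struct sketch_idle_task *t, size_t used)
{
    uintptr_t base = (uintptr_t)t->stack;

    /* Refuse to run past the bottom of the stack. */
    assert(t->saved_sp >= base && t->saved_sp - base >= used);
    t->saved_sp -= used;
}
```

Calling sketch_prepare_idle before every round keeps usage bounded by one
round; skipping it lets usage accumulate across rounds, which is the
overflow pattern reported above.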

Thanks,
Shaohua



Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU

2005-04-04 Thread Li Shaohua
Hi,
On Mon, 2005-04-04 at 23:33, Nathan Lynch wrote:
> 
> I'd say fix the smpboot code so that it doesn't create new idle tasks
> except during boot.
I'd like the CPU hotremove case to be just like the case where the CPU
hasn't booted. A CPU that hasn't booted has no idle thread. But you may
think it's not worth doing. Anyway, I will keep the idle thread in an
updated patch, as you suggested.

> > > We've been
> > > doing cpu removal on ppc64 logical partitions for a while and never
> > > needed to do anything like this. 
> > Did it remove the idle thread? Or is the dead cpu in an idle busy loop?
> 
> Neither.  The cpu is definitely offline, but there is no reason to
> free the idle thread.
> 
> > 
> > >  Maybe idle_task_exit would suffice?
> > idle_task_exit seems to just drop the mm. We need to destroy the idle
> > task for physical CPU hotplug, right?
> 
> No.
> 
> > > 
> > > I don't understand the need for this, either.  The existing cpu
> > > hotplug notifier in the scheduler takes care of initializing the sched
> > > domains and groups appropriately for online/offline events; why do you
> > > need to touch the runqueue structures?
> > If a CPU is physically hotremoved from the system, shouldn't we clean
> > its runqueue?
> 
> No.  It should make zero difference to the scheduler whether the "play
> dead" cpu hotplug or "physical" hotplug is being used.  
Keeping some fields like 'cpu_load' is meaningless for a hot-added CPU to
me. Just ignore them?

Thanks,
Shaohua



Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU

2005-04-04 Thread Nigel Cunningham
Hi.

On Tue, 2005-04-05 at 08:46, Nathan Lynch wrote:
> Hi Nigel!
> 
> On Tue, Apr 05, 2005 at 08:14:25AM +1000, Nigel Cunningham wrote:
> > 
> > On Tue, 2005-04-05 at 01:33, Nathan Lynch wrote:
> > > > Yes, exactly. Someone who understands do_exit, please help clean up the
> > > > code. I'd like to remove the idle thread, since the smpboot code will
> > > > create a new idle thread.
> > > 
> > > I'd say fix the smpboot code so that it doesn't create new idle tasks
> > > except during boot.
> > 
> > Would that mean that CPUs that were physically hotplugged wouldn't get
> > idle threads?
> 
> No, that wouldn't work.  I am saying that there's little to gain by
> adding all this complexity for destroying the idle tasks when it's
> fairly simple to create num_possible_cpus() - 1 idle tasks* to
> accommodate any additional cpus which may come along.  This is what
> ppc64 does now, and it should be feasible on any architecture which
> supports cpu hotplug.

Ah. Ta. I was a little confused :>

Nigel

> * num_possible_cpus() - 1 because the idle task for the boot cpu is
>   created in sched_init.
-- 
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028;  Mob: +61 (417) 100 574

Maintainer of Suspend2 Kernel Patches http://suspend2.net



Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU

2005-04-04 Thread Nathan Lynch
Hi Nigel!

On Tue, Apr 05, 2005 at 08:14:25AM +1000, Nigel Cunningham wrote:
> 
> On Tue, 2005-04-05 at 01:33, Nathan Lynch wrote:
> > > Yes, exactly. Someone who understands do_exit, please help clean up the
> > > code. I'd like to remove the idle thread, since the smpboot code will
> > > create a new idle thread.
> > 
> > I'd say fix the smpboot code so that it doesn't create new idle tasks
> > except during boot.
> 
> Would that mean that CPUs that were physically hotplugged wouldn't get
> idle threads?

No, that wouldn't work.  I am saying that there's little to gain by
adding all this complexity for destroying the idle tasks when it's
fairly simple to create num_possible_cpus() - 1 idle tasks* to
accommodate any additional cpus which may come along.  This is what
ppc64 does now, and it should be feasible on any architecture which
supports cpu hotplug.


Nathan

* num_possible_cpus() - 1 because the idle task for the boot cpu is
  created in sched_init.
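
A rough user-space sketch of the preallocation scheme described above. The
names (demo_task, demo_sched_init, demo_fork_idle_all, demo_cpu_up) are
hypothetical and this is not the ppc64 code; it only shows idle tasks being
set up once for every possible cpu other than the boot cpu, then reused
unchanged on each later bring-up.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_CPUS 8

/* Hypothetical stand-in for an idle task. */
struct demo_task {
    int cpu;
};

static struct demo_task demo_idle_storage[DEMO_NR_CPUS];
static struct demo_task *demo_idle[DEMO_NR_CPUS];

/* Boot cpu: its idle task is set up first, on its own. */
void demo_sched_init(int boot_cpu)
{
    assert(boot_cpu >= 0 && boot_cpu < DEMO_NR_CPUS);
    demo_idle_storage[boot_cpu].cpu = boot_cpu;
    demo_idle[boot_cpu] = &demo_idle_storage[boot_cpu];
}

/*
 * Create idle tasks for every other possible cpu, once, at boot.
 * possible[] holds DEMO_NR_CPUS flags. Returns the number created,
 * which is (number of possible cpus) - 1 when the boot cpu is possible.
 */
int demo_fork_idle_all(const int *possible, int boot_cpu)
{
    int created = 0;

    for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
        if (cpu == boot_cpu || !possible[cpu])
            continue;
        demo_idle_storage[cpu].cpu = cpu;
        demo_idle[cpu] = &demo_idle_storage[cpu];
        created++;
    }
    return created;
}

/* Hotplug bring-up: hand back the existing idle task, never a new one. */
struct demo_task *demo_cpu_up(int cpu)
{
    if (cpu < 0 || cpu >= DEMO_NR_CPUS)
        return NULL;
    return demo_idle[cpu];
}
```

Nothing is freed on removal in this model, so there is no teardown path to
get wrong; the cost is one idle task per possible cpu.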


Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU

2005-04-04 Thread Ashok Raj
On Mon, Apr 04, 2005 at 03:46:20PM -0700, Nathan Lynch wrote:
> 
>Hi Nigel!
> 
>On Tue, Apr 05, 2005 at 08:14:25AM +1000, Nigel Cunningham wrote:
>>
>> On Tue, 2005-04-05 at 01:33, Nathan Lynch wrote:
>>  >  > Yes, exactly. Someone who understand do_exit please help clean
> 
>No, that wouldn't work.  I am saying that there's little to gain by
>adding all this complexity for destroying the idle tasks when it's
>fairly simple to create num_possible_cpus() - 1 idle tasks* to
>accommodate any additional cpus which may come along.  This is what
>ppc64 does now, and it should be feasible on any architecture which
>supports cpu hotplug.
> 
>Nathan
> 
>* num_possible_cpus() - 1 because the idle task for the boot cpu is
>  created in sched_init.
> 

On ia64 we create idle threads on demand if one is not available for the
logical cpu number, and reuse it when the same logical cpu number is reused.

It's just a minor improvement. I also thought about idle exit, but it wasn't
worth anything in return.
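
The on-demand variant can be sketched in user space as follows. This is not
the ia64 code; lazy_task, lazy_get_idle and lazy_created are hypothetical
names, and malloc stands in for whatever the kernel uses to create the
task. An idle task is created the first time a logical cpu number comes up
and is handed back as-is on every later bring-up of that number.

```c
#include <assert.h>
#include <stdlib.h>

#define LAZY_NR_CPUS 8

/* Hypothetical stand-in for an idle task. */
struct lazy_task {
    int cpu;
};

static struct lazy_task *lazy_idle[LAZY_NR_CPUS];
static int lazy_created;  /* how many idle tasks were ever created */

/*
 * Return the idle task for a logical cpu number, creating it only if
 * none exists yet. Returns NULL on a bad cpu number or allocation failure.
 */
struct lazy_task *lazy_get_idle(int cpu)
{
    if (cpu < 0 || cpu >= LAZY_NR_CPUS)
        return NULL;

    if (lazy_idle[cpu] == NULL) {
        struct lazy_task *t = malloc(sizeof(*t));

        if (t == NULL)
            return NULL;
        t->cpu = cpu;
        lazy_idle[cpu] = t;
        lazy_created++;
    }

    /* A cached task always belongs to the slot it is stored in. */
    assert(lazy_idle[cpu]->cpu == cpu);
    return lazy_idle[cpu];
}
```

Compared with creating all idle tasks at boot, this avoids allocating for
cpu numbers that never appear, at the price of an allocation on first
bring-up.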

Cheers,
ashok


Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU

2005-04-04 Thread Nigel Cunningham
Hi.

On Tue, 2005-04-05 at 01:33, Nathan Lynch wrote:
> > Yes, exactly. Someone who understands do_exit, please help clean up the
> > code. I'd like to remove the idle thread, since the smpboot code will
> > create a new idle thread.
> 
> I'd say fix the smpboot code so that it doesn't create new idle tasks
> except during boot.

Would that mean that CPUs that were physically hotplugged wouldn't get
idle threads?

Regards,

Nigel
-- 
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028;  Mob: +61 (417) 100 574

Maintainer of Suspend2 Kernel Patches http://suspend2.net



Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU

2005-04-04 Thread Nathan Lynch
On Mon, Apr 04, 2005 at 01:42:18PM +0800, Li Shaohua wrote:
> Hi,
> On Mon, 2005-04-04 at 13:28, Nathan Lynch wrote:
> > On Mon, Apr 04, 2005 at 10:07:02AM +0800, Li Shaohua wrote:
> > > Clean up all CPU states including its runqueue and idle thread, 
> > > so we can use boot time code without any changes.
> > > Note this makes /sys/devices/system/cpu/cpux/online unworkable.
> > 
> > In what sense does it make the online attribute unworkable?
> I removed the idle thread and other CPU states, and put the dead CPU
> into a 'halt' busy loop. 
> 
> > 
> > > diff -puN kernel/exit.c~cpu_state_clean kernel/exit.c
> > > --- linux-2.6.11/kernel/exit.c~cpu_state_clean 2005-03-31 10:50:27.0 +0800
> > > +++ linux-2.6.11-root/kernel/exit.c 2005-03-31 10:50:27.0 +0800
> > > @@ -845,6 +845,65 @@ fastcall NORET_TYPE void do_exit(long co
> > >   for (;;) ;
> > >  }
> > >  
> > > +#ifdef CONFIG_STR_SMP
> > > +void do_exit_idle(void)
> > > +{
> > > + struct task_struct *tsk = current;
> > > + int group_dead;
> > > +
> > > + BUG_ON(tsk->pid);
> > > + BUG_ON(tsk->mm);
> > > +
> > > + if (tsk->io_context)
> > > + exit_io_context();
> > > + tsk->flags |= PF_EXITING;
> > > + tsk->it_virt_expires = cputime_zero;
> > > + tsk->it_prof_expires = cputime_zero;
> > > + tsk->it_sched_expires = 0;
> > > +
> > > + acct_update_integrals(tsk);
> > > + update_mem_hiwater(tsk);
> > > + group_dead = atomic_dec_and_test(&tsk->signal->live);
> > > + if (group_dead) {
> > > + del_timer_sync(&tsk->signal->real_timer);
> > > + acct_process(-1);
> > > + }
> > > + exit_mm(tsk);
> > > +
> > > + exit_sem(tsk);
> > > + __exit_files(tsk);
> > > + __exit_fs(tsk);
> > > + exit_namespace(tsk);
> > > + exit_thread();
> > > + exit_keys(tsk);
> > > +
> > > + if (group_dead && tsk->signal->leader)
> > > + disassociate_ctty(1);
> > > +
> > > + module_put(tsk->thread_info->exec_domain->module);
> > > + if (tsk->binfmt)
> > > + module_put(tsk->binfmt->module);
> > > +
> > > + tsk->exit_code = -1;
> > > + tsk->exit_state = EXIT_DEAD;
> > > +
> > > + /* in release_task */
> > > + atomic_dec(&tsk->user->processes);
> > > + write_lock_irq(&tasklist_lock);
> > > + __exit_signal(tsk);
> > > + __exit_sighand(tsk);
> > > + write_unlock_irq(&tasklist_lock);
> > > + release_thread(tsk);
> > > + put_task_struct(tsk);
> > > +
> > > + tsk->flags |= PF_DEAD;
> > > +#ifdef CONFIG_NUMA
> > > + mpol_free(tsk->mempolicy);
> > > + tsk->mempolicy = NULL;
> > > +#endif
> > > +}
> > > +#endif
> > 
> > I don't understand why this is needed at all.  It looks like a fair
> > amount of code from do_exit is being duplicated here.  
> Yes, exactly. Someone who understands do_exit, please help clean up the
> code. I'd like to remove the idle thread, since the smpboot code will
> create a new idle thread.

I'd say fix the smpboot code so that it doesn't create new idle tasks
except during boot.

> 
> > We've been
> > doing cpu removal on ppc64 logical partitions for a while and never
> > needed to do anything like this. 
> Did it remove the idle thread? Or is the dead cpu in an idle busy loop?

Neither.  The cpu is definitely offline, but there is no reason to
free the idle thread.

> 
> >  Maybe idle_task_exit would suffice?
> idle_task_exit seems to just drop the mm. We need to destroy the idle
> task for physical CPU hotplug, right?

No.

> > 
> > I don't understand the need for this, either.  The existing cpu
> > hotplug notifier in the scheduler takes care of initializing the sched
> > domains and groups appropriately for online/offline events; why do you
> > need to touch the runqueue structures?
> If a CPU is physically hotremoved from the system, shouldn't we clean
> its runqueue?

No.  It should make zero difference to the scheduler whether the "play
dead" cpu hotplug or "physical" hotplug is being used.  


Nathan


Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU

2005-04-03 Thread Li Shaohua
Hi,
On Mon, 2005-04-04 at 13:28, Nathan Lynch wrote:
> On Mon, Apr 04, 2005 at 10:07:02AM +0800, Li Shaohua wrote:
> > Clean up all CPU states including its runqueue and idle thread, 
> > so we can use boot time code without any changes.
> > Note this makes /sys/devices/system/cpu/cpux/online unworkable.
> 
> In what sense does it make the online attribute unworkable?
I removed the idle thread and other CPU states, and put the dead CPU
into a 'halt' busy loop. 

> 
> > diff -puN kernel/exit.c~cpu_state_clean kernel/exit.c
> > --- linux-2.6.11/kernel/exit.c~cpu_state_clean 2005-03-31 10:50:27.0 +0800
> > +++ linux-2.6.11-root/kernel/exit.c 2005-03-31 10:50:27.0 +0800
> > @@ -845,6 +845,65 @@ fastcall NORET_TYPE void do_exit(long co
> > for (;;) ;
> >  }
> >  
> > +#ifdef CONFIG_STR_SMP
> > +void do_exit_idle(void)
> > +{
> > +   struct task_struct *tsk = current;
> > +   int group_dead;
> > +
> > +   BUG_ON(tsk->pid);
> > +   BUG_ON(tsk->mm);
> > +
> > +   if (tsk->io_context)
> > +      exit_io_context();
> > +   tsk->flags |= PF_EXITING;
> > +   tsk->it_virt_expires = cputime_zero;
> > +   tsk->it_prof_expires = cputime_zero;
> > +   tsk->it_sched_expires = 0;
> > +
> > +   acct_update_integrals(tsk);
> > +   update_mem_hiwater(tsk);
> > +   group_dead = atomic_dec_and_test(&tsk->signal->live);
> > +   if (group_dead) {
> > +      del_timer_sync(&tsk->signal->real_timer);
> > +      acct_process(-1);
> > +   }
> > +   exit_mm(tsk);
> > +
> > +   exit_sem(tsk);
> > +   __exit_files(tsk);
> > +   __exit_fs(tsk);
> > +   exit_namespace(tsk);
> > +   exit_thread();
> > +   exit_keys(tsk);
> > +
> > +   if (group_dead && tsk->signal->leader)
> > +      disassociate_ctty(1);
> > +
> > +   module_put(tsk->thread_info->exec_domain->module);
> > +   if (tsk->binfmt)
> > +      module_put(tsk->binfmt->module);
> > +
> > +   tsk->exit_code = -1;
> > +   tsk->exit_state = EXIT_DEAD;
> > +
> > +   /* in release_task */
> > +   atomic_dec(&tsk->user->processes);
> > +   write_lock_irq(&tasklist_lock);
> > +   __exit_signal(tsk);
> > +   __exit_sighand(tsk);
> > +   write_unlock_irq(&tasklist_lock);
> > +   release_thread(tsk);
> > +   put_task_struct(tsk);
> > +
> > +   tsk->flags |= PF_DEAD;
> > +#ifdef CONFIG_NUMA
> > +   mpol_free(tsk->mempolicy);
> > +   tsk->mempolicy = NULL;
> > +#endif
> > +}
> > +#endif
> 
> I don't understand why this is needed at all.  It looks like a fair
> amount of code from do_exit is being duplicated here.  
Yes, exactly. Someone who understands do_exit, please help clean up the
code. I'd like to remove the idle thread, since the smpboot code will
create a new idle thread.

> We've been
> doing cpu removal on ppc64 logical partitions for a while and never
> needed to do anything like this. 
Did it remove the idle thread? Or is the dead cpu in an idle busy loop?

>  Maybe idle_task_exit would suffice?
idle_task_exit seems to just drop the mm. We need to destroy the idle task
for physical CPU hotplug, right?

> 
> 
> > diff -puN kernel/sched.c~cpu_state_clean kernel/sched.c
> > --- linux-2.6.11/kernel/sched.c~cpu_state_clean 2005-03-31 10:50:27.0 +0800
> > +++ linux-2.6.11-root/kernel/sched.c 2005-04-04 09:06:40.362357104 +0800
> > @@ -4028,6 +4028,58 @@ void __devinit init_idle(task_t *idle, i
> >  }
> >  
> >  /*
> > + * Initial dummy domain for early boot and for hotplug cpu. Being static,
> > + * it is initialized to zero, so all balancing flags are cleared which is
> > + * what we want.
> > + */
> > +static struct sched_domain sched_domain_dummy;
> > +
> > +#ifdef CONFIG_STR_SMP
> > +static void __devinit exit_idle(int cpu)
> > +{
> > +   runqueue_t *rq = cpu_rq(cpu);
> > +   struct task_struct *p = rq->idle;
> > +   int j, k;
> > +   prio_array_t *array;
> > +
> > +   /* init runqueue */
> > +   spin_lock_init(&rq->lock);
> > +   rq->active = rq->arrays;
> > +   rq->expired = rq->arrays + 1;
> > +   rq->best_expired_prio = MAX_PRIO;
> > +
> > +   rq->prev_mm = NULL;
> > +   rq->curr = rq->idle = NULL;
> > +   rq->expired_timestamp = 0;
> > +
> > +   rq->sd = &sched_domain_dummy;
> > +   rq->cpu_load = 0;
> > +   rq->active_balance = 0;
> > +   rq->push_cpu = 0;
> > +   rq->migration_thread = NULL;
> > +   INIT_LIST_HEAD(&rq->migration_queue);
> > +   atomic_set(&rq->nr_iowait, 0);
> > +
> > +   for (j = 0; j < 2; j++) {
> > +      array = rq->arrays + j;
> > +      for (k = 0; k < MAX_PRIO; k++) {
> > +         INIT_LIST_HEAD(array->queue + k);
> > +         __clear_bit(k, array->bitmap);
> > +      }
> > +      // delimiter for bitsearch
> > +      __set_bit(MAX_PRIO, array->bitmap);
> > +   }
> > +   /* Destroy IDLE thread.
> > +    * it's safe now, the CPU is in busy loop
> > +    */
> > +   if (p->active_mm)
> > +      mmdrop(p->active_mm);
> > +   p->active_mm = NULL;
> > +   put_task_struct(p);
> > +}
> > +#endif
> > +
> > +/*
> >   * In a system that switches o