Re: [PATCH v2 4/4] powerpc/pseries/cpuhp: remove obsolete comment from pseries_cpu_die

2021-10-11 Thread Michael Ellerman
Nathan Lynch  writes:
> Michael Ellerman  writes:
>> Daniel Henrique Barboza  writes:
>>> This is enough to say that we can't easily see the history behind this 
>>> comment.
>>> I also believe that we're better of without it since it doesn't make sense
>>> with the current codebase.
>>
>> It was added by the original CPU hotplug commit for ppc64::
>>
>> https://github.com/mpe/linux-fullhistory/commit/0e9fd9441cd2113b67b14e739267c9e69761489b
>>
>>
>> The code was fairly similar:
>>
>> void __cpu_die(unsigned int cpu)
>> {
>>  int tries;
>>  int cpu_status;
>>  unsigned int pcpu = get_hard_smp_processor_id(cpu);
>>
>>  for (tries = 0; tries < 5; tries++) {
>>  cpu_status = query_cpu_stopped(pcpu);
>>
>>  if (cpu_status == 0)
>>  break;
>>  set_current_state(TASK_UNINTERRUPTIBLE);
>>  schedule_timeout(HZ);
>>  }
>>  if (cpu_status != 0) {
>>  printk("Querying DEAD? cpu %i (%i) shows %i\n",
>> cpu, pcpu, cpu_status);
>>  }
>>
>>  /* Isolation and deallocation are definatly done by
>>   * drslot_chrp_cpu.  If they were not they would be
>>   * done here.  Change isolate state to Isolate and
>>   * change allocation-state to Unusable.
>>   */
>>  paca[cpu].xProcStart = 0;
>>
>>  /* So we can recognize if it fails to come up next time. */
>>  cpu_callin_map[cpu] = 0;
>> }
>>
>>
>> drslot_chrp_cpu() still exists in drmgr:
>>
>>   
>> https://github.com/ibm-power-utilities/powerpc-utils/blob/e798c4a09fbf0fa0f421e624cfa366a6c405c9fe/src/drmgr/drslot_chrp_cpu.c#L406
>>
>>
>> I agree the comment is no longer meaningful and can be removed.
>
> Thanks for providing this background.
>
>> It might be good to then add a comment explaining why we need to set
>> cpu_start = 0.
>
> Sure, I can take that as a follow-up. Or perhaps it should be moved to
> the online path.

Yeah possibly.

>> It's not immediately clear why we need to. When we bring a CPU back
>> online in smp_pSeries_kick_cpu() we ask RTAS to start it and then
>> immediately set cpu_start = 1, ie. there isn't a separate step that sets
>> cpu_start = 1 for hotplugged CPUs.
>
> Hmm I'm not following the distinction you seem to be drawing between
> bringing a CPU back online and a hotplugged CPU. kick_cpu is used in all
> cases AFAIK.

Yeah that wasn't very clear, reading it back I have half confused myself.

At boot we need the paca->cpu_start flag because some CPUs can be
spinning in generic_secondary_common_init, in head_64.S.

ie. they're not in RTAS, they're spinning in kernel code, and the only
thing that stops them coming "online" in the Linux sense is
paca->cpu_start.

You can see that in pseries/smp.c:

static inline int smp_startup_cpu(unsigned int lcpu)
{
...
if (cpumask_test_cpu(lcpu, of_spin_mask))
/* Already started by OF and sitting in spin loop */
return 1;


We also hit that case when kexec'ing, where all the secondaries come in
that way.


On the other hand when we offline a CPU, we set paca->cpu_start = 0, in
pseries_cpu_die(), and then we return the CPU to RTAS.

The only way it *should* come back online is via smp_pSeries_kick_cpu(),
which calls smp_startup_cpu() to bring the CPU out of RTAS, and then
smp_pSeries_kick_cpu() immediately sets cpu_start = 1.

So the sequence is:

CPU goes offline from Linux POV
paca->cpu_start = 0;
CPU offline in RTAS
...
CPU brought out of RTAS
paca->cpu_start = 1;
CPU comes back online from Linux POV


But I guess I kind of answered my own question above, where I said
*should*. Clearing paca->cpu_start when we offline the CPU gives us a
little bit of backup if the CPU comes out of RTAS unexpectedly. ie. it
would then spin on paca->cpu_start, rather than spontaneously coming
back into Linux when we weren't expecting it.

cheers


Re: [PATCH v2 4/4] powerpc/pseries/cpuhp: remove obsolete comment from pseries_cpu_die

2021-09-29 Thread Nathan Lynch
Michael Ellerman  writes:
> Daniel Henrique Barboza  writes:
>> This is enough to say that we can't easily see the history behind this 
>> comment.
>> I also believe that we're better of without it since it doesn't make sense
>> with the current codebase.
>
> It was added by the original CPU hotplug commit for ppc64::
>
> https://github.com/mpe/linux-fullhistory/commit/0e9fd9441cd2113b67b14e739267c9e69761489b
>
>
> The code was fairly similar:
>
> void __cpu_die(unsigned int cpu)
> {
>   int tries;
>   int cpu_status;
>   unsigned int pcpu = get_hard_smp_processor_id(cpu);
>
>   for (tries = 0; tries < 5; tries++) {
>   cpu_status = query_cpu_stopped(pcpu);
>
>   if (cpu_status == 0)
>   break;
>   set_current_state(TASK_UNINTERRUPTIBLE);
>   schedule_timeout(HZ);
>   }
>   if (cpu_status != 0) {
>   printk("Querying DEAD? cpu %i (%i) shows %i\n",
>  cpu, pcpu, cpu_status);
>   }
>
>   /* Isolation and deallocation are definatly done by
>* drslot_chrp_cpu.  If they were not they would be
>* done here.  Change isolate state to Isolate and
>* change allocation-state to Unusable.
>*/
>   paca[cpu].xProcStart = 0;
>
>   /* So we can recognize if it fails to come up next time. */
>   cpu_callin_map[cpu] = 0;
> }
>
>
> drslot_chrp_cpu() still exists in drmgr:
>
>   
> https://github.com/ibm-power-utilities/powerpc-utils/blob/e798c4a09fbf0fa0f421e624cfa366a6c405c9fe/src/drmgr/drslot_chrp_cpu.c#L406
>
>
> I agree the comment is no longer meaningful and can be removed.

Thanks for providing this background.

> It might be good to then add a comment explaining why we need to set
> cpu_start = 0.

Sure, I can take that as a follow-up. Or perhaps it should be moved to
the online path.

> It's not immediately clear why we need to. When we bring a CPU back
> online in smp_pSeries_kick_cpu() we ask RTAS to start it and then
> immediately set cpu_start = 1, ie. there isn't a separate step that sets
> cpu_start = 1 for hotplugged CPUs.

Hmm I'm not following the distinction you seem to be drawing between
bringing a CPU back online and a hotplugged CPU. kick_cpu is used in all
cases AFAIK.



Re: [PATCH v2 4/4] powerpc/pseries/cpuhp: remove obsolete comment from pseries_cpu_die

2021-09-28 Thread Michael Ellerman
Daniel Henrique Barboza  writes:
> On 9/27/21 17:19, Nathan Lynch wrote:
>> This comment likely refers to the obsolete DLPAR workflow where some
>> resource state transitions were driven more directly from user space
>> utilities, but it also seems to contradict itself: "Change isolate state to
>> Isolate [...]" is at odds with the preceding sentences, and it does not
>> relate at all to the code that follows.
>
> This comment was added by commit 413f7c405a34, a 2006 commit where Mike
> Ellerman moved code from platform/pseries/smp.c into hotplug-cpu.c.
>
> I checked the original code back in smp.c and this comment was added there
> by commit 1da177e4c3f41, which is Linus's initial git commit, where he 
> mentions
> that he didn't bothered with full history (although it is available somewhere,
> allegedly).
>
> This is enough to say that we can't easily see the history behind this 
> comment.
> I also believe that we're better of without it since it doesn't make sense
> with the current codebase.

It was added by the original CPU hotplug commit for ppc64::

https://github.com/mpe/linux-fullhistory/commit/0e9fd9441cd2113b67b14e739267c9e69761489b


The code was fairly similar:

void __cpu_die(unsigned int cpu)
{
int tries;
int cpu_status;
unsigned int pcpu = get_hard_smp_processor_id(cpu);

for (tries = 0; tries < 5; tries++) {
cpu_status = query_cpu_stopped(pcpu);

if (cpu_status == 0)
break;
set_current_state(TASK_UNINTERRUPTIBLE);
schedule_timeout(HZ);
}
if (cpu_status != 0) {
printk("Querying DEAD? cpu %i (%i) shows %i\n",
   cpu, pcpu, cpu_status);
}

/* Isolation and deallocation are definatly done by
 * drslot_chrp_cpu.  If they were not they would be
 * done here.  Change isolate state to Isolate and
 * change allocation-state to Unusable.
 */
paca[cpu].xProcStart = 0;

/* So we can recognize if it fails to come up next time. */
cpu_callin_map[cpu] = 0;
}


drslot_chrp_cpu() still exists in drmgr:

  
https://github.com/ibm-power-utilities/powerpc-utils/blob/e798c4a09fbf0fa0f421e624cfa366a6c405c9fe/src/drmgr/drslot_chrp_cpu.c#L406


I agree the comment is no longer meaningful and can be removed.

It might be good to then add a comment explaining why we need to set
cpu_start = 0.

It's not immediately clear why we need to. When we bring a CPU back
online in smp_pSeries_kick_cpu() we ask RTAS to start it and then
immediately set cpu_start = 1, ie. there isn't a separate step that sets
cpu_start = 1 for hotplugged CPUs.

cheers


Re: [PATCH v2 4/4] powerpc/pseries/cpuhp: remove obsolete comment from pseries_cpu_die

2021-09-28 Thread Daniel Henrique Barboza




On 9/27/21 17:19, Nathan Lynch wrote:

This comment likely refers to the obsolete DLPAR workflow where some
resource state transitions were driven more directly from user space
utilities, but it also seems to contradict itself: "Change isolate state to
Isolate [...]" is at odds with the preceding sentences, and it does not
relate at all to the code that follows.



This comment was added by commit 413f7c405a34, a 2006 commit where Mike
Ellerman moved code from platform/pseries/smp.c into hotplug-cpu.c.

I checked the original code back in smp.c and this comment was added there
by commit 1da177e4c3f41, which is Linus's initial git commit, where he mentions
that he didn't bothered with full history (although it is available somewhere,
allegedly).

This is enough to say that we can't easily see the history behind this comment.
I also believe that we're better of without it since it doesn't make sense
with the current codebase.


Reviewed-by: Daniel Henrique Barboza 



Remove it to prevent confusion.

Signed-off-by: Nathan Lynch 
---
  arch/powerpc/platforms/pseries/hotplug-cpu.c | 5 -
  1 file changed, 5 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c 
b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index b50f3e9aa259..5ab44600c8d3 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -137,11 +137,6 @@ static void pseries_cpu_die(unsigned int cpu)
cpu, pcpu);
}
  
-	/* Isolation and deallocation are definitely done by

-* drslot_chrp_cpu.  If they were not they would be
-* done here.  Change isolate state to Isolate and
-* change allocation-state to Unusable.
-*/
paca_ptrs[cpu]->cpu_start = 0;
  }
  



[PATCH v2 4/4] powerpc/pseries/cpuhp: remove obsolete comment from pseries_cpu_die

2021-09-27 Thread Nathan Lynch
This comment likely refers to the obsolete DLPAR workflow where some
resource state transitions were driven more directly from user space
utilities, but it also seems to contradict itself: "Change isolate state to
Isolate [...]" is at odds with the preceding sentences, and it does not
relate at all to the code that follows.

Remove it to prevent confusion.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/platforms/pseries/hotplug-cpu.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c 
b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index b50f3e9aa259..5ab44600c8d3 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -137,11 +137,6 @@ static void pseries_cpu_die(unsigned int cpu)
cpu, pcpu);
}
 
-   /* Isolation and deallocation are definitely done by
-* drslot_chrp_cpu.  If they were not they would be
-* done here.  Change isolate state to Isolate and
-* change allocation-state to Unusable.
-*/
paca_ptrs[cpu]->cpu_start = 0;
 }
 
-- 
2.31.1