Re: [PATCH 1/4] powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks

2021-05-02 Thread Michael Ellerman
Nicholas Piggin  writes:
> Excerpts from Naveen N. Rao's message of April 27, 2021 11:43 pm:
>> Nicholas Piggin wrote:
>>> The paravirt queued spinlock slow path adds itself to the queue then
>>> calls pv_wait to wait for the lock to become free. This is implemented
>>> by calling H_CONFER to donate cycles.
>>> 
>>> When hcall tracing is enabled, this H_CONFER call can lead to a spin
>>> lock being taken in the tracing code, which will result in the lock
>>> being taken again. That acquisition will also go to the slow path because
>>> it queues behind itself and so won't ever make progress.
>>> 
>>> An example trace of a deadlock:
>>> 
>>>   __pv_queued_spin_lock_slowpath
>>>   trace_clock_global
>>>   ring_buffer_lock_reserve
>>>   trace_event_buffer_lock_reserve
>>>   trace_event_buffer_reserve
>>>   trace_event_raw_event_hcall_exit
>>>   __trace_hcall_exit
>>>   plpar_hcall_norets_trace
>>>   __pv_queued_spin_lock_slowpath
>>>   trace_clock_global
>>>   ring_buffer_lock_reserve
>>>   trace_event_buffer_lock_reserve
>>>   trace_event_buffer_reserve
>>>   trace_event_raw_event_rcu_dyntick
>>>   rcu_irq_exit
>>>   irq_exit
>>>   __do_irq
>>>   call_do_irq
>>>   do_IRQ
>>>   hardware_interrupt_common_virt
>>> 
>>> Fix this by introducing plpar_hcall_norets_notrace(), and using it for
>>> the SPLPAR virtual processor dispatching hcalls made by the paravirt
>>> spinlock code.
>>> 
>>> Fixes: 20c0e8269e9d ("powerpc/pseries: Implement paravirt qspinlocks for SPLPAR")
>>> Signed-off-by: Nicholas Piggin 
>>> ---
>>>  arch/powerpc/include/asm/hvcall.h   |  3 +++
>>>  arch/powerpc/include/asm/paravirt.h | 22 +++---
>>>  arch/powerpc/platforms/pseries/hvCall.S | 10 ++
>>>  arch/powerpc/platforms/pseries/lpar.c   |  4 ++--
>>>  4 files changed, 34 insertions(+), 5 deletions(-)
>> 
>> Thanks for the fix! Some very minor nits below, but none the less:
>> Reviewed-by: Naveen N. Rao 
>> 
>>> 
>>> diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
>>> index ed6086d57b22..0c92b01a3c3c 100644
>>> --- a/arch/powerpc/include/asm/hvcall.h
>>> +++ b/arch/powerpc/include/asm/hvcall.h
>>> @@ -446,6 +446,9 @@
>>>   */
>>>  long plpar_hcall_norets(unsigned long opcode, ...);
>>> 
>>> +/* Variant which does not do hcall tracing */
>>> +long plpar_hcall_norets_notrace(unsigned long opcode, ...);
>>> +
>>>  /**
>>>   * plpar_hcall: - Make a pseries hypervisor call
>>>   * @opcode: The hypervisor call to make.
>>> diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
>>> index 5d1726bb28e7..3c13c2ec70a9 100644
>>> --- a/arch/powerpc/include/asm/paravirt.h
>>> +++ b/arch/powerpc/include/asm/paravirt.h
>>> @@ -30,17 +30,33 @@ static inline u32 yield_count_of(int cpu)
>>> 
>>>  static inline void yield_to_preempted(int cpu, u32 yield_count)
>>>  {
>> 
>> It looks like yield_to_preempted() is only used by simple spin locks 
>> today. I wonder if it makes more sense to put the below comment in 
>> yield_to_any() which is used by the qspinlock code.
>
> Yeah, I just put it above the functions entirely because it refers to 
> all of them.
>
>> 
>>> -   plpar_hcall_norets(H_CONFER, get_hard_smp_processor_id(cpu), yield_count);
>>> +   /*
>>> +* Spinlock code yields and prods, so don't trace the hcalls because
>>> +* tracing code takes spinlocks which could recurse.
>>> +*
>>> +* These calls are made while the lock is not held, the lock slowpath
>>> +* yields if it can not acquire the lock, and unlock slow path might
>>> +* prod if a waiter has yielded). So this did not seem to be a problem
>>> +* for simple spin locks because technically it didn't recuse on the
>> ^^
>> recurse
>> 
>>> +* lock. However the queued spin lock contended path is more strictly
>>> +* ordered: the H_CONFER hcall is made after the task has queued itself
>>> +* on the lock, so then recursing on the lock will queue up behind that
>>> +* (or worse: queued spinlocks uses tricks that assume a context never
>>> +* waits on more than one spinlock, so that may cause random
>>> +* corruption).
>>> +*/
>>> +   plpar_hcall_norets_notrace(H_CONFER,
>>> +  get_hard_smp_processor_id(cpu), yield_count);
>> 
>> This can all be on a single line.
>
> Should it though? Linux in general allegedly changed to 100-column
> lines for checkpatch, but it still seems to be frowned upon to go
> beyond 80 deliberately. What about arch/powerpc?

Splitting it provides zero benefit to code readability IMO. And it would
be only 89 by my count, which is not grossly long.
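
For reference, the joined form under discussion is just the existing call
unwrapped onto one line, which at roughly 89 characters stays under the
100-column checkpatch limit:

    plpar_hcall_norets_notrace(H_CONFER, get_hard_smp_processor_id(cpu), yield_count);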

cheers


Re: [PATCH 1/4] powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks

2021-04-30 Thread Nicholas Piggin
Excerpts from Naveen N. Rao's message of April 27, 2021 11:43 pm:
> Nicholas Piggin wrote:
>> The paravirt queued spinlock slow path adds itself to the queue then
>> calls pv_wait to wait for the lock to become free. This is implemented
>> by calling H_CONFER to donate cycles.
>> 
>> When hcall tracing is enabled, this H_CONFER call can lead to a spin
>> lock being taken in the tracing code, which will result in the lock
>> being taken again. That acquisition will also go to the slow path because
>> it queues behind itself and so won't ever make progress.
>> 
>> An example trace of a deadlock:
>> 
>>   __pv_queued_spin_lock_slowpath
>>   trace_clock_global
>>   ring_buffer_lock_reserve
>>   trace_event_buffer_lock_reserve
>>   trace_event_buffer_reserve
>>   trace_event_raw_event_hcall_exit
>>   __trace_hcall_exit
>>   plpar_hcall_norets_trace
>>   __pv_queued_spin_lock_slowpath
>>   trace_clock_global
>>   ring_buffer_lock_reserve
>>   trace_event_buffer_lock_reserve
>>   trace_event_buffer_reserve
>>   trace_event_raw_event_rcu_dyntick
>>   rcu_irq_exit
>>   irq_exit
>>   __do_irq
>>   call_do_irq
>>   do_IRQ
>>   hardware_interrupt_common_virt
>> 
>> Fix this by introducing plpar_hcall_norets_notrace(), and using it for
>> the SPLPAR virtual processor dispatching hcalls made by the paravirt
>> spinlock code.
>> 
>> Fixes: 20c0e8269e9d ("powerpc/pseries: Implement paravirt qspinlocks for SPLPAR")
>> Signed-off-by: Nicholas Piggin 
>> ---
>>  arch/powerpc/include/asm/hvcall.h   |  3 +++
>>  arch/powerpc/include/asm/paravirt.h | 22 +++---
>>  arch/powerpc/platforms/pseries/hvCall.S | 10 ++
>>  arch/powerpc/platforms/pseries/lpar.c   |  4 ++--
>>  4 files changed, 34 insertions(+), 5 deletions(-)
> 
> Thanks for the fix! Some very minor nits below, but none the less:
> Reviewed-by: Naveen N. Rao 
> 
>> 
>> diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
>> index ed6086d57b22..0c92b01a3c3c 100644
>> --- a/arch/powerpc/include/asm/hvcall.h
>> +++ b/arch/powerpc/include/asm/hvcall.h
>> @@ -446,6 +446,9 @@
>>   */
>>  long plpar_hcall_norets(unsigned long opcode, ...);
>> 
>> +/* Variant which does not do hcall tracing */
>> +long plpar_hcall_norets_notrace(unsigned long opcode, ...);
>> +
>>  /**
>>   * plpar_hcall: - Make a pseries hypervisor call
>>   * @opcode: The hypervisor call to make.
>> diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
>> index 5d1726bb28e7..3c13c2ec70a9 100644
>> --- a/arch/powerpc/include/asm/paravirt.h
>> +++ b/arch/powerpc/include/asm/paravirt.h
>> @@ -30,17 +30,33 @@ static inline u32 yield_count_of(int cpu)
>> 
>>  static inline void yield_to_preempted(int cpu, u32 yield_count)
>>  {
> 
> It looks like yield_to_preempted() is only used by simple spin locks 
> today. I wonder if it makes more sense to put the below comment in 
> yield_to_any() which is used by the qspinlock code.

Yeah, I just put it above the functions entirely because it refers to 
all of them.

> 
>> -plpar_hcall_norets(H_CONFER, get_hard_smp_processor_id(cpu), yield_count);
>> +/*
>> + * Spinlock code yields and prods, so don't trace the hcalls because
>> + * tracing code takes spinlocks which could recurse.
>> + *
>> + * These calls are made while the lock is not held, the lock slowpath
>> + * yields if it can not acquire the lock, and unlock slow path might
>> + * prod if a waiter has yielded). So this did not seem to be a problem
>> + * for simple spin locks because technically it didn't recuse on the
>  ^^
>  recurse
> 
>> + * lock. However the queued spin lock contended path is more strictly
>> + * ordered: the H_CONFER hcall is made after the task has queued itself
>> + * on the lock, so then recursing on the lock will queue up behind that
>> + * (or worse: queued spinlocks uses tricks that assume a context never
>> + * waits on more than one spinlock, so that may cause random
>> + * corruption).
>> + */
>> +plpar_hcall_norets_notrace(H_CONFER,
>> +   get_hard_smp_processor_id(cpu), yield_count);
> 
> This can all be on a single line.

Should it though? Linux in general allegedly changed to 100-column
lines for checkpatch, but it still seems to be frowned upon to go
beyond 80 deliberately. What about arch/powerpc?

Thanks,
Nick


Re: [PATCH 1/4] powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks

2021-04-27 Thread Naveen N. Rao

Nicholas Piggin wrote:

The paravirt queued spinlock slow path adds itself to the queue then
calls pv_wait to wait for the lock to become free. This is implemented
by calling H_CONFER to donate cycles.

When hcall tracing is enabled, this H_CONFER call can lead to a spin
lock being taken in the tracing code, which will result in the lock
being taken again. That acquisition will also go to the slow path because
it queues behind itself and so won't ever make progress.

An example trace of a deadlock:

  __pv_queued_spin_lock_slowpath
  trace_clock_global
  ring_buffer_lock_reserve
  trace_event_buffer_lock_reserve
  trace_event_buffer_reserve
  trace_event_raw_event_hcall_exit
  __trace_hcall_exit
  plpar_hcall_norets_trace
  __pv_queued_spin_lock_slowpath
  trace_clock_global
  ring_buffer_lock_reserve
  trace_event_buffer_lock_reserve
  trace_event_buffer_reserve
  trace_event_raw_event_rcu_dyntick
  rcu_irq_exit
  irq_exit
  __do_irq
  call_do_irq
  do_IRQ
  hardware_interrupt_common_virt

Fix this by introducing plpar_hcall_norets_notrace(), and using it for
the SPLPAR virtual processor dispatching hcalls made by the paravirt
spinlock code.

Fixes: 20c0e8269e9d ("powerpc/pseries: Implement paravirt qspinlocks for SPLPAR")
Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/hvcall.h   |  3 +++
 arch/powerpc/include/asm/paravirt.h | 22 +++---
 arch/powerpc/platforms/pseries/hvCall.S | 10 ++
 arch/powerpc/platforms/pseries/lpar.c   |  4 ++--
 4 files changed, 34 insertions(+), 5 deletions(-)


Thanks for the fix! Some very minor nits below, but none the less:
Reviewed-by: Naveen N. Rao 



diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index ed6086d57b22..0c92b01a3c3c 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -446,6 +446,9 @@
  */
 long plpar_hcall_norets(unsigned long opcode, ...);

+/* Variant which does not do hcall tracing */
+long plpar_hcall_norets_notrace(unsigned long opcode, ...);
+
 /**
  * plpar_hcall: - Make a pseries hypervisor call
  * @opcode: The hypervisor call to make.
diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
index 5d1726bb28e7..3c13c2ec70a9 100644
--- a/arch/powerpc/include/asm/paravirt.h
+++ b/arch/powerpc/include/asm/paravirt.h
@@ -30,17 +30,33 @@ static inline u32 yield_count_of(int cpu)

 static inline void yield_to_preempted(int cpu, u32 yield_count)
 {


It looks like yield_to_preempted() is only used by simple spin locks 
today. I wonder if it makes more sense to put the below comment in 
yield_to_any() which is used by the qspinlock code.



-   plpar_hcall_norets(H_CONFER, get_hard_smp_processor_id(cpu), yield_count);
+   /*
+* Spinlock code yields and prods, so don't trace the hcalls because
+* tracing code takes spinlocks which could recurse.
+*
+* These calls are made while the lock is not held, the lock slowpath
+* yields if it can not acquire the lock, and unlock slow path might
+* prod if a waiter has yielded). So this did not seem to be a problem
+* for simple spin locks because technically it didn't recuse on the

   ^^
   recurse


+* lock. However the queued spin lock contended path is more strictly
+* ordered: the H_CONFER hcall is made after the task has queued itself
+* on the lock, so then recursing on the lock will queue up behind that
+* (or worse: queued spinlocks uses tricks that assume a context never
+* waits on more than one spinlock, so that may cause random
+* corruption).
+*/
+   plpar_hcall_norets_notrace(H_CONFER,
+  get_hard_smp_processor_id(cpu), yield_count);


This can all be on a single line.


- Naveen
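
(To make the ordering argument in the comment above concrete, here is a
minimal userspace-style sketch; the names and structures below are invented
for illustration and are not the kernel implementation. The simple-lock
waiter confers without having registered itself anywhere, so re-entering the
lock from tracing merely spins again; the queued-lock waiter publishes itself
in the queue before conferring, so a re-entry queues a second waiter behind
the first and neither can ever make progress.)

    #include <stdatomic.h>
    #include <stddef.h>

    static void confer(void) { }    /* stand-in for the H_CONFER hcall */

    /* Simple spinlock style: nothing is recorded before conferring, so a
     * recursive acquisition just spins on the same word and can still
     * succeed once the holder releases it. */
    static void simple_lock(atomic_flag *lock)
    {
            while (atomic_flag_test_and_set(lock))
                    confer();
    }

    /* Queued spinlock style: the waiter queues itself first and only then
     * confers.  If tracing the confer re-enters this path, the second
     * waiter sits behind the first and neither is ever granted the lock. */
    struct qnode {
            struct qnode *_Atomic next;
            atomic_bool granted;
    };

    static void queued_lock_slowpath(struct qnode *_Atomic *tail, struct qnode *me)
    {
            struct qnode *prev;

            atomic_store(&me->next, NULL);
            atomic_store(&me->granted, false);
            prev = atomic_exchange(tail, me);       /* queue first... */
            if (prev)
                    atomic_store(&prev->next, me);
            confer();                               /* ...then yield cycles */
            while (prev && !atomic_load(&me->granted))
                    ;       /* spin until the predecessor hands the lock over */
    }

    int main(void)
    {
            static atomic_flag lock = ATOMIC_FLAG_INIT;
            static struct qnode *_Atomic tail;
            struct qnode me;

            simple_lock(&lock);                     /* uncontended: succeeds */
            queued_lock_slowpath(&tail, &me);       /* first waiter: prev == NULL */
            return 0;
    }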



[PATCH 1/4] powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks

2021-04-22 Thread Nicholas Piggin
The paravirt queued spinlock slow path adds itself to the queue then
calls pv_wait to wait for the lock to become free. This is implemented
by calling H_CONFER to donate cycles.

When hcall tracing is enabled, this H_CONFER call can lead to a spin
lock being taken in the tracing code, which will result in the lock
being taken again. That acquisition will also go to the slow path because
it queues behind itself and so won't ever make progress.

An example trace of a deadlock:

  __pv_queued_spin_lock_slowpath
  trace_clock_global
  ring_buffer_lock_reserve
  trace_event_buffer_lock_reserve
  trace_event_buffer_reserve
  trace_event_raw_event_hcall_exit
  __trace_hcall_exit
  plpar_hcall_norets_trace
  __pv_queued_spin_lock_slowpath
  trace_clock_global
  ring_buffer_lock_reserve
  trace_event_buffer_lock_reserve
  trace_event_buffer_reserve
  trace_event_raw_event_rcu_dyntick
  rcu_irq_exit
  irq_exit
  __do_irq
  call_do_irq
  do_IRQ
  hardware_interrupt_common_virt

Fix this by introducing plpar_hcall_norets_notrace(), and using it for
the SPLPAR virtual processor dispatching hcalls made by the paravirt
spinlock code.

Fixes: 20c0e8269e9d ("powerpc/pseries: Implement paravirt qspinlocks for SPLPAR")
Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/hvcall.h   |  3 +++
 arch/powerpc/include/asm/paravirt.h | 22 +++---
 arch/powerpc/platforms/pseries/hvCall.S | 10 ++
 arch/powerpc/platforms/pseries/lpar.c   |  4 ++--
 4 files changed, 34 insertions(+), 5 deletions(-)
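
(Not part of the patch: a rough userspace-style illustration of the approach,
with invented names -- the real implementation is in the diff below. The
traced entry point takes tracing locks of its own, while the _notrace variant
goes straight to the hypervisor, so lock internals can use it without ever
re-entering lock-taking code.)

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t trace_buffer_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Raw call, no tracing: safe to use from lock internals
     * (plays the role of plpar_hcall_norets_notrace()). */
    static long fake_hcall_notrace(unsigned long opcode)
    {
            return 0;       /* pretend the hypervisor accepted the call */
    }

    /* Tracing takes its own lock (standing in for the ring-buffer and
     * trace-clock locks seen in the stack trace above). */
    static void trace_event(const char *what, unsigned long opcode)
    {
            pthread_mutex_lock(&trace_buffer_lock);
            printf("%s opcode=%lu\n", what, opcode);
            pthread_mutex_unlock(&trace_buffer_lock);
    }

    /* Traced call for ordinary users (plays the role of plpar_hcall_norets()
     * with hcall tracing enabled): must not be used by code that the tracing
     * locks themselves may depend on. */
    static long fake_hcall(unsigned long opcode)
    {
            long ret;

            trace_event("hcall_entry", opcode);
            ret = fake_hcall_notrace(opcode);
            trace_event("hcall_exit", opcode);
            return ret;
    }

    /* Lock slow path donating cycles: uses the untraced variant so waiting
     * on a lock can never recurse into trace code that takes locks. */
    static void lock_slowpath_yield(void)
    {
            fake_hcall_notrace(42 /* stand-in for H_CONFER */);
    }

    int main(void)
    {
            fake_hcall(7);          /* ordinary caller: traced */
            lock_slowpath_yield();  /* lock internals: untraced */
            return 0;
    }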

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index ed6086d57b22..0c92b01a3c3c 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -446,6 +446,9 @@
  */
 long plpar_hcall_norets(unsigned long opcode, ...);
 
+/* Variant which does not do hcall tracing */
+long plpar_hcall_norets_notrace(unsigned long opcode, ...);
+
 /**
  * plpar_hcall: - Make a pseries hypervisor call
  * @opcode: The hypervisor call to make.
diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
index 5d1726bb28e7..3c13c2ec70a9 100644
--- a/arch/powerpc/include/asm/paravirt.h
+++ b/arch/powerpc/include/asm/paravirt.h
@@ -30,17 +30,33 @@ static inline u32 yield_count_of(int cpu)
 
 static inline void yield_to_preempted(int cpu, u32 yield_count)
 {
-   plpar_hcall_norets(H_CONFER, get_hard_smp_processor_id(cpu), yield_count);
+   /*
+* Spinlock code yields and prods, so don't trace the hcalls because
+* tracing code takes spinlocks which could recurse.
+*
+* These calls are made while the lock is not held, the lock slowpath
+* yields if it can not acquire the lock, and unlock slow path might
+* prod if a waiter has yielded). So this did not seem to be a problem
+* for simple spin locks because technically it didn't recuse on the
+* lock. However the queued spin lock contended path is more strictly
+* ordered: the H_CONFER hcall is made after the task has queued itself
+* on the lock, so then recursing on the lock will queue up behind that
+* (or worse: queued spinlocks uses tricks that assume a context never
+* waits on more than one spinlock, so that may cause random
+* corruption).
+*/
+   plpar_hcall_norets_notrace(H_CONFER,
+  get_hard_smp_processor_id(cpu), yield_count);
 }
 
 static inline void prod_cpu(int cpu)
 {
-   plpar_hcall_norets(H_PROD, get_hard_smp_processor_id(cpu));
+   plpar_hcall_norets_notrace(H_PROD, get_hard_smp_processor_id(cpu));
 }
 
 static inline void yield_to_any(void)
 {
-   plpar_hcall_norets(H_CONFER, -1, 0);
+   plpar_hcall_norets_notrace(H_CONFER, -1, 0);
 }
 #else
 static inline bool is_shared_processor(void)
diff --git a/arch/powerpc/platforms/pseries/hvCall.S b/arch/powerpc/platforms/pseries/hvCall.S
index 2136e42833af..8a2b8d64265b 100644
--- a/arch/powerpc/platforms/pseries/hvCall.S
+++ b/arch/powerpc/platforms/pseries/hvCall.S
@@ -102,6 +102,16 @@ END_FTR_SECTION(0, 1); \
 #define HCALL_BRANCH(LABEL)
 #endif
 
+_GLOBAL_TOC(plpar_hcall_norets_notrace)
+   HMT_MEDIUM
+
+   mfcrr0
+   stw r0,8(r1)
+   HVSC/* invoke the hypervisor */
+   lwz r0,8(r1)
+   mtcrf   0xff,r0
+   blr /* return r3 = status */
+
 _GLOBAL_TOC(plpar_hcall_norets)
HMT_MEDIUM
 
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 3805519a6469..6011b7b80946 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -1828,8 +1828,8 @@ void hcall_tracepoint_unregfunc(void)
 
 /*
  * Since the tracing code might execute hcalls we need to guard against
- * recursion. One example of this are spinlocks calling H_YIELD on
- * shared