Nicholas Piggin wrote:
The paravirt queued spinlock slow path adds itself to the queue then
calls pv_wait to wait for the lock to become free. This is implemented
by calling H_CONFER to donate cycles.

When hcall tracing is enabled, this H_CONFER call can lead to a spin
lock being taken in the tracing code, which results in the lock being
taken again. That nested acquisition also goes to the slow path because
it queues behind itself, and so it never makes progress.

An example trace of a deadlock:

  __pv_queued_spin_lock_slowpath
  trace_clock_global
  ring_buffer_lock_reserve
  trace_event_buffer_lock_reserve
  trace_event_buffer_reserve
  trace_event_raw_event_hcall_exit
  __trace_hcall_exit
  plpar_hcall_norets_trace
  __pv_queued_spin_lock_slowpath
  trace_clock_global
  ring_buffer_lock_reserve
  trace_event_buffer_lock_reserve
  trace_event_buffer_reserve
  trace_event_raw_event_rcu_dyntick
  rcu_irq_exit
  irq_exit
  __do_irq
  call_do_irq
  do_IRQ
  hardware_interrupt_common_virt

Fix this by introducing plpar_hcall_norets_notrace(), and using it for
the SPLPAR virtual processor dispatching hcalls made by the paravirt
spinlock code.

Fixes: 20c0e8269e9d ("powerpc/pseries: Implement paravirt qspinlocks for SPLPAR")
Signed-off-by: Nicholas Piggin <npig...@gmail.com>
---
 arch/powerpc/include/asm/hvcall.h       |  3 +++
 arch/powerpc/include/asm/paravirt.h     | 22 +++++++++++++++++++---
 arch/powerpc/platforms/pseries/hvCall.S | 10 ++++++++++
 arch/powerpc/platforms/pseries/lpar.c   |  4 ++--
 4 files changed, 34 insertions(+), 5 deletions(-)

Thanks for the fix! Some very minor nits below, but none the less:
Reviewed-by: Naveen N. Rao <naveen.n....@linux.vnet.ibm.com>


diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index ed6086d57b22..0c92b01a3c3c 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -446,6 +446,9 @@
  */
 long plpar_hcall_norets(unsigned long opcode, ...);

+/* Variant which does not do hcall tracing */
+long plpar_hcall_norets_notrace(unsigned long opcode, ...);
+
 /**
  * plpar_hcall: - Make a pseries hypervisor call
  * @opcode: The hypervisor call to make.
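
For readers following along, the distinction the new declaration draws is
roughly the following. This is a hand-written sketch, not the real code:
the actual entry points are assembly in hvCall.S plus the trace wrapper in
lpar.c, and __hvsc(), hcall_tracing_enabled() and the trace helpers below
are illustrative names with simplified signatures.

	/* Illustrative sketch only; names and signatures are made up. */
	extern long __hvsc(unsigned long opcode, unsigned long arg1, unsigned long arg2);
	extern int hcall_tracing_enabled(void);
	extern void __trace_hcall_entry(unsigned long opcode);
	extern void __trace_hcall_exit(unsigned long opcode, long ret);

	long plpar_hcall_norets_notrace(unsigned long opcode, unsigned long arg1,
					unsigned long arg2)
	{
		/* Plain hypervisor call, never enters the tracing code. */
		return __hvsc(opcode, arg1, arg2);
	}

	long plpar_hcall_norets(unsigned long opcode, unsigned long arg1,
				unsigned long arg2)
	{
		long ret;

		if (!hcall_tracing_enabled())
			return __hvsc(opcode, arg1, arg2);

		/*
		 * The tracepoints reserve ring buffer space and can take
		 * spinlocks (see the backtrace in the changelog), which is
		 * what recurses on the lock when called from the qspinlock
		 * slow path.
		 */
		__trace_hcall_entry(opcode);
		ret = __hvsc(opcode, arg1, arg2);
		__trace_hcall_exit(opcode, ret);
		return ret;
	}
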
diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
index 5d1726bb28e7..3c13c2ec70a9 100644
--- a/arch/powerpc/include/asm/paravirt.h
+++ b/arch/powerpc/include/asm/paravirt.h
@@ -30,17 +30,33 @@ static inline u32 yield_count_of(int cpu)

 static inline void yield_to_preempted(int cpu, u32 yield_count)
 {

It looks like yield_to_preempted() is only used by simple spin locks today. I wonder if it makes more sense to put the below comment in yield_to_any(), which is used by the qspinlock code.

-       plpar_hcall_norets(H_CONFER, get_hard_smp_processor_id(cpu), yield_count);
+       /*
+        * Spinlock code yields and prods, so don't trace the hcalls because
+        * tracing code takes spinlocks which could recurse.
+        *
+        * These calls are made while the lock is not held (the lock slowpath
+        * yields if it can not acquire the lock, and unlock slow path might
+        * prod if a waiter has yielded). So this did not seem to be a problem
+        * for simple spin locks because technically it didn't recuse on the
                                                               ^^^^^^
                                                               recurse

+        * lock. However the queued spin lock contended path is more strictly
+        * ordered: the H_CONFER hcall is made after the task has queued itself
+        * on the lock, so then recursing on the lock will queue up behind that
+        * (or worse: queued spinlocks use tricks that assume a context never
+        * waits on more than one spinlock, so that may cause random
+        * corruption).
+        */
+       plpar_hcall_norets_notrace(H_CONFER,
+                                  get_hard_smp_processor_id(cpu), yield_count);

This can all be on a single line.
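That is:

	plpar_hcall_norets_notrace(H_CONFER, get_hard_smp_processor_id(cpu), yield_count);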


- Naveen
