core_idle_state is maintained for each core. It uses 0-7 bits to track
whether a thread in the core has entered fastsleep or winkle. 8th bit is
used as a lock bit.
The lock bit is set in these 2 scenarios-
 - The thread is first in subcore to wakeup from sleep/winkle.
 - If its the last thread in the core about to enter sleep/winkle

While the lock bit is set, if any other thread in the core wakes up, it
loops until the lock bit is cleared before proceeding in the wakeup
path. This helps prevent race conditions w.r.t fastsleep workaround and
prevents threads from switching to process context before core/subcore
resources are restored.

But, in the path to sleep/winkle entry, we currently don't check for
lock-bit. This exposes us to following race when running with subcore
on-

First thread in the subcorea            Another thread in the same
waking up                               core entering sleep/winkle

lwarx   r15,0,r14
ori     r15,r15,PNV_CORE_IDLE_LOCK_BIT
stwcx.  r15,0,r14
[Code to restore subcore state]

                                                lwarx   r15,0,r14
                                                [clear thread bit]
                                                stwcx.  r15,0,r14

andi.   r15,r15,PNV_CORE_IDLE_THREAD_BITS
stw     r15,0(r14)

Here, after the thread entering sleep clears its thread bit in
core_idle_state, the value is overwritten by the thread waking up.
This patch fixes the above race by looping on the lock bit even while
entering the idle states.

Signed-off-by: Shreyas B. Prabhu <shre...@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/idle_power7.S | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/powerpc/kernel/idle_power7.S 
b/arch/powerpc/kernel/idle_power7.S
index ccde8f0..48c7d4f 100644
--- a/arch/powerpc/kernel/idle_power7.S
+++ b/arch/powerpc/kernel/idle_power7.S
@@ -144,12 +144,26 @@ power7_enter_nap_mode:
        bge     cr3,2f
        IDLE_STATE_ENTER_SEQ(PPC_NAP)
        /* No return */
+
+core_idle_lock_held_entry:
+       HMT_LOW
+core_idle_lock_loop_entry:
+       lwz     r15,0(r14)
+       andi.   r9,r15,PNV_CORE_IDLE_LOCK_BIT
+       bne     core_idle_lock_loop_entry
+       HMT_MEDIUM
+       b       lwarx_loop1
+
 2:
        /* Sleep or winkle */
        lbz     r7,PACA_THREAD_MASK(r13)
        ld      r14,PACA_CORE_IDLE_STATE_PTR(r13)
 lwarx_loop1:
        lwarx   r15,0,r14
+
+       andi.   r9,r15,PNV_CORE_IDLE_LOCK_BIT
+       bne     core_idle_lock_held_entry
+
        andc    r15,r15,r7                      /* Clear thread bit */
 
        andi.   r15,r15,PNV_CORE_IDLE_THREAD_BITS
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to