Re: [PATCH 7/8] powerpc/64s: idle do not hold reservation longer than required

2017-03-16 Thread Nicholas Piggin
On Thu, 16 Mar 2017 18:13:28 +0530
Gautham R Shenoy  wrote:

> Hi Nick,
> 
> On Tue, Mar 14, 2017 at 07:23:48PM +1000, Nicholas Piggin wrote:
> > When taking the core idle state lock, grab it immediately like a
> > regular lock, rather than adding more tests in there. Holding the lock
> > keeps it stable, so there is no need to do it while holding the
> > reservation.
> 
> I agree with this patch. Just a minor query:
> 
> > 
> > Signed-off-by: Nicholas Piggin 
> > ---
> >  arch/powerpc/kernel/idle_book3s.S | 20 +++++++++++---------
> >  1 file changed, 11 insertions(+), 9 deletions(-)
> > 
> > diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
> > index 1c91dc35c559..3cb75907c5c5 100644
> > --- a/arch/powerpc/kernel/idle_book3s.S
> > +++ b/arch/powerpc/kernel/idle_book3s.S
> > @@ -488,12 +488,12 @@ BEGIN_FTR_SECTION
> > CHECK_HMI_INTERRUPT
> >  END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
> > 
> > -   lbz r7,PACA_THREAD_MASK(r13)
> > ld  r14,PACA_CORE_IDLE_STATE_PTR(r13)
> > -lwarx_loop2:
> > -   lwarx   r15,0,r14
> > -   andis.  r9,r15,PNV_CORE_IDLE_LOCK_BIT@h
> > +   lbz r7,PACA_THREAD_MASK(r13)  
> 
> Is reversing the order of loads into r7 and r14 intentional?

Oh, yes, it is intentional: we use the r14 result first. I should have
mentioned it, but I forgot. The two loads probably decode together, but
you might get their results in different cycles.
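
In C terms, the dependency looks something like this (a sketch with
hypothetical stand-in types, not the real PACA definitions):

	struct paca_sketch {
		unsigned int  *core_idle_state_ptr; /* PACA_CORE_IDLE_STATE_PTR */
		unsigned char  thread_mask;         /* PACA_THREAD_MASK */
	};

	static unsigned int demo(struct paca_sketch *paca)
	{
		/* This load's result is consumed immediately below... */
		unsigned int *state = paca->core_idle_state_ptr;
		/* ...while this one is not needed until the last line. */
		unsigned char mask = paca->thread_mask;
		/* Stands in for the lwarx: its address depends on 'state'. */
		unsigned int old = *state;

		return old & mask;
	}

Scheduling the pointer load first gives it a head start on its
load-to-use latency before the lwarx consumes it.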

Thanks for the review!

Thanks,
Nick


Re: [PATCH 7/8] powerpc/64s: idle do not hold reservation longer than required

2017-03-16 Thread Gautham R Shenoy
Hi Nick,

On Tue, Mar 14, 2017 at 07:23:48PM +1000, Nicholas Piggin wrote:
> When taking the core idle state lock, grab it immediately like a
> regular lock, rather than adding more tests in there. Holding the lock
> keeps it stable, so there is no need to do it while holding the
> reservation.

I agree with this patch. Just a minor query:

> 
> Signed-off-by: Nicholas Piggin 
> ---
>  arch/powerpc/kernel/idle_book3s.S | 20 +++++++++++---------
>  1 file changed, 11 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
> index 1c91dc35c559..3cb75907c5c5 100644
> --- a/arch/powerpc/kernel/idle_book3s.S
> +++ b/arch/powerpc/kernel/idle_book3s.S
> @@ -488,12 +488,12 @@ BEGIN_FTR_SECTION
>   CHECK_HMI_INTERRUPT
>  END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
> 
> - lbz r7,PACA_THREAD_MASK(r13)
>   ld  r14,PACA_CORE_IDLE_STATE_PTR(r13)
> -lwarx_loop2:
> - lwarx   r15,0,r14
> - andis.  r9,r15,PNV_CORE_IDLE_LOCK_BIT@h
> + lbz r7,PACA_THREAD_MASK(r13)

Is reversing the order of loads into r7 and r14 intentional?

Other than that,
Reviewed-by: Gautham R. Shenoy 

> +
>   /*
> +  * Take the core lock to synchronize against other threads.
> +  *
>* Lock bit is set in one of the 2 cases-
>* a. In the sleep/winkle enter path, the last thread is executing
>* fastsleep workaround code.
> @@ -501,7 +501,14 @@ lwarx_loop2:
>* workaround undo code or resyncing timebase or restoring context
>* In either case loop until the lock bit is cleared.
>*/
> +1:
> + lwarx   r15,0,r14
> + andis.  r9,r15,PNV_CORE_IDLE_LOCK_BIT@h
>   bnel-   core_idle_lock_held
> + oris    r15,r15,PNV_CORE_IDLE_LOCK_BIT@h
> + stwcx.  r15,0,r14
> + bne-    1b
> + isync
> 
>   andi.   r9,r15,PNV_CORE_IDLE_THREAD_BITS
>   cmpwi   cr2,r9,0
> @@ -513,11 +520,6 @@ lwarx_loop2:
>* cr4 - gt or eq if waking up from complete hypervisor state loss.
>*/
> 
> - oris    r15,r15,PNV_CORE_IDLE_LOCK_BIT@h
> - stwcx.  r15,0,r14
> - bne-    lwarx_loop2
> - isync
> -
>  BEGIN_FTR_SECTION
>   lbz r4,PACA_SUBCORE_SIBLING_MASK(r13)
>   and r4,r4,r15
> -- 
> 2.11.0
> 



[PATCH 7/8] powerpc/64s: idle do not hold reservation longer than required

2017-03-14 Thread Nicholas Piggin
When taking the core idle state lock, grab it immediately like a
regular lock, rather than adding more tests in there. Holding the lock
keeps it stable, so there is no need to do it while holding the
reservation.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kernel/idle_book3s.S | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 1c91dc35c559..3cb75907c5c5 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -488,12 +488,12 @@ BEGIN_FTR_SECTION
CHECK_HMI_INTERRUPT
 END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
 
-   lbz r7,PACA_THREAD_MASK(r13)
ld  r14,PACA_CORE_IDLE_STATE_PTR(r13)
-lwarx_loop2:
-   lwarx   r15,0,r14
-   andis.  r9,r15,PNV_CORE_IDLE_LOCK_BIT@h
+   lbz r7,PACA_THREAD_MASK(r13)
+
/*
+* Take the core lock to synchronize against other threads.
+*
 * Lock bit is set in one of the 2 cases-
 * a. In the sleep/winkle enter path, the last thread is executing
 * fastsleep workaround code.
@@ -501,7 +501,14 @@ lwarx_loop2:
 * workaround undo code or resyncing timebase or restoring context
 * In either case loop until the lock bit is cleared.
 */
+1:
+   lwarx   r15,0,r14
+   andis.  r9,r15,PNV_CORE_IDLE_LOCK_BIT@h
bnel-   core_idle_lock_held
+   oris    r15,r15,PNV_CORE_IDLE_LOCK_BIT@h
+   stwcx.  r15,0,r14
+   bne-    1b
+   isync
 
andi.   r9,r15,PNV_CORE_IDLE_THREAD_BITS
cmpwi   cr2,r9,0
@@ -513,11 +520,6 @@ lwarx_loop2:
 * cr4 - gt or eq if waking up from complete hypervisor state loss.
 */
 
-   oris    r15,r15,PNV_CORE_IDLE_LOCK_BIT@h
-   stwcx.  r15,0,r14
-   bne-    lwarx_loop2
-   isync
-
 BEGIN_FTR_SECTION
lbz r4,PACA_SUBCORE_SIBLING_MASK(r13)
and r4,r4,r15
-- 
2.11.0
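
For readers following along, the patched sequence is a standard
test-and-set acquire loop on the lock bit of the shared per-core
idle-state word. A rough C equivalent using GCC atomic builtins, as a
sketch only (the constant's value and the function name are
illustrative; the real code is real-mode assembly using lwarx/stwcx.):

	#include <stdbool.h>
	#include <stdint.h>

	/* Illustrative value only; the real PNV_CORE_IDLE_LOCK_BIT is
	 * defined in kernel headers (the asm tests its upper half via @h). */
	#define CORE_IDLE_LOCK_BIT	0x10000000u

	/* Sketch of the patched sequence: acquire the lock bit immediately,
	 * spinning only while another thread holds it, rather than doing
	 * other tests while the lwarx reservation is held. */
	static void core_idle_lock(uint32_t *core_idle_state)
	{
		uint32_t old = __atomic_load_n(core_idle_state, __ATOMIC_RELAXED);

		for (;;) {
			if (old & CORE_IDLE_LOCK_BIT) {
				/* Lock held: wait, like the bnel to
				 * core_idle_lock_held in the asm. */
				old = __atomic_load_n(core_idle_state,
						      __ATOMIC_RELAXED);
				continue;
			}
			/* oris + stwcx.: set the lock bit, retrying on a
			 * lost race ('old' is refreshed on failure). */
			if (__atomic_compare_exchange_n(core_idle_state, &old,
							old | CORE_IDLE_LOCK_BIT,
							false, __ATOMIC_ACQUIRE,
							__ATOMIC_RELAXED))
				break;
		}
		/* Acquire ordering here plays the role of the isync. */
	}

The net change in the patch: the old code held the reservation from
lwarx across the PNV_CORE_IDLE_THREAD_BITS tests before attempting
stwcx., while the new code takes the lock as soon as the reservation
succeeds and performs those tests under the lock, keeping the
reservation window minimal.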