Re: [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case

2017-02-13 Thread Michael Neuling
On Thu, 2017-02-09 at 08:28 +0530, Aneesh Kumar K.V wrote:
> In the kernel we do follow the below sequence in different code paths.
> pte = ptep_get_clear(ptep)
> 
> set_pte_at(ptep, pte)
> 
> We do that for mremap, autonuma protection update and softdirty clearing. This
> implies our optimization to skip a tlb flush when clearing a pte update is
> not valid, because for DD1 system that followup set_pte_at will be done witout
> doing the required tlbflush. Fix that by always doing the dd1 style pte update
> irrespective of new_pte value. In a later patch we will optimize the
> application
> exit case.
> 
> Signed-off-by: Benjamin Herrenschmidt 
> Signed-off-by: Aneesh Kumar K.V 

Tested-by: Michael Neuling 

> ---
>  arch/powerpc/include/asm/book3s/64/radix.h | 12 +++-
>  1 file changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h
> b/arch/powerpc/include/asm/book3s/64/radix.h
> index b4d1302387a3..70a3cdcdbe47 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -144,16 +144,10 @@ static inline unsigned long radix__pte_update(struct
> mm_struct *mm,
>    * new value of pte
>    */
>   new_pte = (old_pte | set) & ~clr;
> - /*
> -  * If we are trying to clear the pte, we can skip
> -  * the below sequence and batch the tlb flush. The
> -  * tlb flush batching is done by mmu gather code
> -  */
> - if (new_pte) {
> - asm volatile("ptesync" : : : "memory");
> - radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
> + asm volatile("ptesync" : : : "memory");
> + radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
> + if (new_pte)
>   __radix_pte_update(ptep, 0, new_pte);
> - }
>   } else
>   old_pte = __radix_pte_update(ptep, clr, set);
>   asm volatile("ptesync" : : : "memory");


Re: [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case

2017-02-08 Thread Aneesh Kumar K.V
Benjamin Herrenschmidt  writes:

> On Thu, 2017-02-09 at 14:49 +1100, Benjamin Herrenschmidt wrote:
>> On Thu, 2017-02-09 at 08:28 +0530, Aneesh Kumar K.V wrote:
>> > In the kernel we do follow the below sequence in different code
>> > paths.
>> > pte = ptep_get_clear(ptep)
>> > 
>> > set_pte_at(ptep, pte)
>> > 
>> > We do that for mremap, autonuma protection update and softdirty
>> > clearing. This
>> > implies our optimization to skip a tlb flush when clearing a pte
>> > update is
>> > not valid, because for DD1 system that followup set_pte_at will be
>> > done witout
>> > doing the required tlbflush. Fix that by always doing the dd1 style
>> > pte update
>> > irrespective of new_pte value. In a later patch we will optimize
>> > the application
>> > exit case.
>> 
>> What about my change to set_pte_at() ? We seem to be overwriting
>> valid PTEs,
>> shouldn't we deal with that ?
>
> So the HW guys confirmed that the TLB will never cache a valid entry
> that has all permissions clear. That leaves the THP write problem
> though.


Which is fixed by the autonuma related changes posted at
https://lkml.kernel.org/r/1486609259-6796-1-git-send-email-aneesh.ku...@linux.vnet.ibm.com

Right now i am running a kernel compile in loop with parallel perf
bench numa mem run to make sure we got most of the details correct.
(this is on p8)

-aneesh



Re: [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case

2017-02-08 Thread Benjamin Herrenschmidt
On Thu, 2017-02-09 at 14:49 +1100, Benjamin Herrenschmidt wrote:
> On Thu, 2017-02-09 at 08:28 +0530, Aneesh Kumar K.V wrote:
> > In the kernel we do follow the below sequence in different code
> > paths.
> > pte = ptep_get_clear(ptep)
> > 
> > set_pte_at(ptep, pte)
> > 
> > We do that for mremap, autonuma protection update and softdirty
> > clearing. This
> > implies our optimization to skip a tlb flush when clearing a pte
> > update is
> > not valid, because for DD1 system that followup set_pte_at will be
> > done witout
> > doing the required tlbflush. Fix that by always doing the dd1 style
> > pte update
> > irrespective of new_pte value. In a later patch we will optimize
> > the application
> > exit case.
> 
> What about my change to set_pte_at() ? We seem to be overwriting
> valid PTEs,
> shouldn't we deal with that ?

So the HW guys confirmed that the TLB will never cache a valid entry
that has all permissions clear. That leaves the THP write problem
though.

Cheers,
Ben.

> Cheers,
> Ben.
> 
> > Signed-off-by: Benjamin Herrenschmidt 
> > > Signed-off-by: Aneesh Kumar K.V 
> > 
> > ---
> >  arch/powerpc/include/asm/book3s/64/radix.h | 12 +++-
> >  1 file changed, 3 insertions(+), 9 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/book3s/64/radix.h
> > b/arch/powerpc/include/asm/book3s/64/radix.h
> > index b4d1302387a3..70a3cdcdbe47 100644
> > --- a/arch/powerpc/include/asm/book3s/64/radix.h
> > +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> > @@ -144,16 +144,10 @@ static inline unsigned long
> > radix__pte_update(struct mm_struct *mm,
> > >    * new value of pte
> > >    */
> > >   new_pte = (old_pte | set) & ~clr;
> > > - /*
> > > -  * If we are trying to clear the pte, we can
> > > skip
> > > -  * the below sequence and batch the tlb flush.
> > > The
> > > -  * tlb flush batching is done by mmu gather code
> > > -  */
> > > - if (new_pte) {
> > > - asm volatile("ptesync" : : : "memory");
> > > - radix__flush_tlb_pte_p9_dd1(old_pte, mm,
> > > addr);
> > > + asm volatile("ptesync" : : : "memory");
> > > + radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
> > > + if (new_pte)
> > >   __radix_pte_update(ptep, 0, new_pte);
> > > - }
> > >   } else
> > >   old_pte = __radix_pte_update(ptep, clr, set);
> > >   asm volatile("ptesync" : : : "memory");


Re: [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case

2017-02-08 Thread Benjamin Herrenschmidt
On Thu, 2017-02-09 at 08:28 +0530, Aneesh Kumar K.V wrote:
> In the kernel we do follow the below sequence in different code paths.
> pte = ptep_get_clear(ptep)
> 
> set_pte_at(ptep, pte)
> 
> We do that for mremap, autonuma protection update and softdirty clearing. This
> implies our optimization to skip a tlb flush when clearing a pte update is
> not valid, because for DD1 system that followup set_pte_at will be done witout
> doing the required tlbflush. Fix that by always doing the dd1 style pte update
> irrespective of new_pte value. In a later patch we will optimize the 
> application
> exit case.

What about my change to set_pte_at() ? We seem to be overwriting valid PTEs,
shouldn't we deal with that ?

Cheers,
Ben.

> Signed-off-by: Benjamin Herrenschmidt 
> > Signed-off-by: Aneesh Kumar K.V 
> ---
>  arch/powerpc/include/asm/book3s/64/radix.h | 12 +++-
>  1 file changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h 
> b/arch/powerpc/include/asm/book3s/64/radix.h
> index b4d1302387a3..70a3cdcdbe47 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -144,16 +144,10 @@ static inline unsigned long radix__pte_update(struct 
> mm_struct *mm,
> >      * new value of pte
> >      */
> >     new_pte = (old_pte | set) & ~clr;
> > -   /*
> > -    * If we are trying to clear the pte, we can skip
> > -    * the below sequence and batch the tlb flush. The
> > -    * tlb flush batching is done by mmu gather code
> > -    */
> > -   if (new_pte) {
> > -   asm volatile("ptesync" : : : "memory");
> > -   radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
> > +   asm volatile("ptesync" : : : "memory");
> > +   radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
> > +   if (new_pte)
> >     __radix_pte_update(ptep, 0, new_pte);
> > -   }
> >     } else
> >     old_pte = __radix_pte_update(ptep, clr, set);
> >     asm volatile("ptesync" : : : "memory");


[PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case

2017-02-08 Thread Aneesh Kumar K.V
In the kernel we do follow the below sequence in different code paths.
pte = ptep_get_clear(ptep)

set_pte_at(ptep, pte)

We do that for mremap, autonuma protection update and softdirty clearing. This
implies our optimization to skip a tlb flush when clearing a pte update is
not valid, because for DD1 system that followup set_pte_at will be done witout
doing the required tlbflush. Fix that by always doing the dd1 style pte update
irrespective of new_pte value. In a later patch we will optimize the application
exit case.

Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/radix.h | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/radix.h 
b/arch/powerpc/include/asm/book3s/64/radix.h
index b4d1302387a3..70a3cdcdbe47 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -144,16 +144,10 @@ static inline unsigned long radix__pte_update(struct 
mm_struct *mm,
 * new value of pte
 */
new_pte = (old_pte | set) & ~clr;
-   /*
-* If we are trying to clear the pte, we can skip
-* the below sequence and batch the tlb flush. The
-* tlb flush batching is done by mmu gather code
-*/
-   if (new_pte) {
-   asm volatile("ptesync" : : : "memory");
-   radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
+   asm volatile("ptesync" : : : "memory");
+   radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
+   if (new_pte)
__radix_pte_update(ptep, 0, new_pte);
-   }
} else
old_pte = __radix_pte_update(ptep, clr, set);
asm volatile("ptesync" : : : "memory");
-- 
2.7.4