Re: [PATCH 2/2] powerpc/mm/hash: Only need the Nest MMU workaround for R -> RW transition

2018-08-23 Thread Michael Ellerman
Nicholas Piggin  writes:
> On Wed, 22 Aug 2018 22:46:05 +0530
> "Aneesh Kumar K.V"  wrote:
>
>> The Nest MMU workaround is only needed for RW upgrades. Avoid doing that
>> for other pte updates.
>> 
>> We also avoid clearing the pte while marking it invalid. This is because 
>> other
>> page table walk will find this pte none and can result in unexpected 
>> behaviour
>> due to that. Instead we clear _PAGE_PRESENT and set the software pte bit
>> _PAGE_INVALID. pte_present is already updated to check for bot the bits. This
>> make sure page table walkers will find the pte present and things like
>> pte_pfn(pte) returns the right value.
>> 
>> Based on the original patch from Benjamin Herrenschmidt 
>> 
>> 
>> Signed-off-by: Aneesh Kumar K.V 
>> ---
>>  arch/powerpc/mm/pgtable-radix.c | 8 +---
>
> This is powerpc/mm/radix, isn't it? Subject says hash.

I fixed it when applying.

> Could we make this fix POWER9 only and use a RSV bit for it
> rather than use up a SW bit? Other than that,
>
> Reviewed-by: Nicholas Piggin 

Thanks.

cheers


Re: [PATCH 2/2] powerpc/mm/hash: Only need the Nest MMU workaround for R -> RW transition

2018-08-23 Thread Benjamin Herrenschmidt
On Thu, 2018-08-23 at 19:23 +1000, Nicholas Piggin wrote:
> On Wed, 22 Aug 2018 22:46:05 +0530
> "Aneesh Kumar K.V"  wrote:
> 
> > The Nest MMU workaround is only needed for RW upgrades. Avoid doing that
> > for other pte updates.
> > 
> > We also avoid clearing the pte while marking it invalid. This is because 
> > other
> > page table walk will find this pte none and can result in unexpected 
> > behaviour
> > due to that. Instead we clear _PAGE_PRESENT and set the software pte bit
> > _PAGE_INVALID. pte_present is already updated to check for bot the bits. 
> > This
> > make sure page table walkers will find the pte present and things like
> > pte_pfn(pte) returns the right value.
> > 
> > Based on the original patch from Benjamin Herrenschmidt 
> > 
> > 
> > Signed-off-by: Aneesh Kumar K.V 
> > ---
> >  arch/powerpc/mm/pgtable-radix.c | 8 +---
> 
> This is powerpc/mm/radix, isn't it? Subject says hash.
> 
> Could we make this fix POWER9 only and use a RSV bit for it
> rather than use up a SW bit? Other than that,

Well, the SW bit is necessary for THP as well isn't it ?
> 
> Reviewed-by: Nicholas Piggin 
> 
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/powerpc/mm/pgtable-radix.c 
> > b/arch/powerpc/mm/pgtable-radix.c
> > index 7be99fd9af15..c879979faa73 100644
> > --- a/arch/powerpc/mm/pgtable-radix.c
> > +++ b/arch/powerpc/mm/pgtable-radix.c
> > @@ -1045,20 +1045,22 @@ void radix__ptep_set_access_flags(struct 
> > vm_area_struct *vma, pte_t *ptep,
> > struct mm_struct *mm = vma->vm_mm;
> > unsigned long set = pte_val(entry) & (_PAGE_DIRTY | _PAGE_ACCESSED |
> >   _PAGE_RW | _PAGE_EXEC);
> > +
> > +   unsigned long change = pte_val(entry) ^ pte_val(*ptep);
> > /*
> >  * To avoid NMMU hang while relaxing access, we need mark
> >  * the pte invalid in between.
> >  */
> > -   if (atomic_read(>context.copros) > 0) {
> > +   if ((change & _PAGE_RW) && atomic_read(>context.copros) > 0) {
> > unsigned long old_pte, new_pte;
> >  
> > -   old_pte = __radix_pte_update(ptep, ~0, 0);
> > +   old_pte = __radix_pte_update(ptep, _PAGE_PRESENT, 
> > _PAGE_INVALID);
> > /*
> >  * new value of pte
> >  */
> > new_pte = old_pte | set;
> > radix__flush_tlb_page_psize(mm, address, psize);
> > -   __radix_pte_update(ptep, 0, new_pte);
> > +   __radix_pte_update(ptep, _PAGE_INVALID, new_pte);
> > } else {
> > __radix_pte_update(ptep, 0, set);
> > /*



Re: [PATCH 2/2] powerpc/mm/hash: Only need the Nest MMU workaround for R -> RW transition

2018-08-23 Thread Nicholas Piggin
On Wed, 22 Aug 2018 22:46:05 +0530
"Aneesh Kumar K.V"  wrote:

> The Nest MMU workaround is only needed for RW upgrades. Avoid doing that
> for other pte updates.
> 
> We also avoid clearing the pte while marking it invalid. This is because other
> page table walk will find this pte none and can result in unexpected behaviour
> due to that. Instead we clear _PAGE_PRESENT and set the software pte bit
> _PAGE_INVALID. pte_present is already updated to check for bot the bits. This
> make sure page table walkers will find the pte present and things like
> pte_pfn(pte) returns the right value.
> 
> Based on the original patch from Benjamin Herrenschmidt 
> 
> 
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  arch/powerpc/mm/pgtable-radix.c | 8 +---

This is powerpc/mm/radix, isn't it? Subject says hash.

Could we make this fix POWER9 only and use a RSV bit for it
rather than use up a SW bit? Other than that,

Reviewed-by: Nicholas Piggin 

>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
> index 7be99fd9af15..c879979faa73 100644
> --- a/arch/powerpc/mm/pgtable-radix.c
> +++ b/arch/powerpc/mm/pgtable-radix.c
> @@ -1045,20 +1045,22 @@ void radix__ptep_set_access_flags(struct 
> vm_area_struct *vma, pte_t *ptep,
>   struct mm_struct *mm = vma->vm_mm;
>   unsigned long set = pte_val(entry) & (_PAGE_DIRTY | _PAGE_ACCESSED |
> _PAGE_RW | _PAGE_EXEC);
> +
> + unsigned long change = pte_val(entry) ^ pte_val(*ptep);
>   /*
>* To avoid NMMU hang while relaxing access, we need mark
>* the pte invalid in between.
>*/
> - if (atomic_read(>context.copros) > 0) {
> + if ((change & _PAGE_RW) && atomic_read(>context.copros) > 0) {
>   unsigned long old_pte, new_pte;
>  
> - old_pte = __radix_pte_update(ptep, ~0, 0);
> + old_pte = __radix_pte_update(ptep, _PAGE_PRESENT, 
> _PAGE_INVALID);
>   /*
>* new value of pte
>*/
>   new_pte = old_pte | set;
>   radix__flush_tlb_page_psize(mm, address, psize);
> - __radix_pte_update(ptep, 0, new_pte);
> + __radix_pte_update(ptep, _PAGE_INVALID, new_pte);
>   } else {
>   __radix_pte_update(ptep, 0, set);
>   /*



[PATCH 2/2] powerpc/mm/hash: Only need the Nest MMU workaround for R -> RW transition

2018-08-22 Thread Aneesh Kumar K.V
The Nest MMU workaround is only needed for RW upgrades. Avoid doing that
for other pte updates.

We also avoid clearing the pte while marking it invalid. This is because other
page table walk will find this pte none and can result in unexpected behaviour
due to that. Instead we clear _PAGE_PRESENT and set the software pte bit
_PAGE_INVALID. pte_present is already updated to check for bot the bits. This
make sure page table walkers will find the pte present and things like
pte_pfn(pte) returns the right value.

Based on the original patch from Benjamin Herrenschmidt 


Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/pgtable-radix.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 7be99fd9af15..c879979faa73 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -1045,20 +1045,22 @@ void radix__ptep_set_access_flags(struct vm_area_struct 
*vma, pte_t *ptep,
struct mm_struct *mm = vma->vm_mm;
unsigned long set = pte_val(entry) & (_PAGE_DIRTY | _PAGE_ACCESSED |
  _PAGE_RW | _PAGE_EXEC);
+
+   unsigned long change = pte_val(entry) ^ pte_val(*ptep);
/*
 * To avoid NMMU hang while relaxing access, we need mark
 * the pte invalid in between.
 */
-   if (atomic_read(>context.copros) > 0) {
+   if ((change & _PAGE_RW) && atomic_read(>context.copros) > 0) {
unsigned long old_pte, new_pte;
 
-   old_pte = __radix_pte_update(ptep, ~0, 0);
+   old_pte = __radix_pte_update(ptep, _PAGE_PRESENT, 
_PAGE_INVALID);
/*
 * new value of pte
 */
new_pte = old_pte | set;
radix__flush_tlb_page_psize(mm, address, psize);
-   __radix_pte_update(ptep, 0, new_pte);
+   __radix_pte_update(ptep, _PAGE_INVALID, new_pte);
} else {
__radix_pte_update(ptep, 0, set);
/*
-- 
2.17.1



Re: [PATCH 2/2] powerpc/mm/hash: Only need the Nest MMU workaround for R -> RW transition

2018-08-22 Thread Aneesh Kumar K.V

On 08/22/2018 10:46 PM, Aneesh Kumar K.V wrote:

The Nest MMU workaround is only needed for RW upgrades. Avoid doing that
for other pte updates.

We also avoid clearing the pte while marking it invalid. This is because other
page table walk will find this pte none and can result in unexpected behaviour
due to that. Instead we clear _PAGE_PRESENT and set the software pte bit
_PAGE_INVALID. pte_present is already updated to check for bot the bits. This
make sure page table walkers will find the pte present and things like
pte_pfn(pte) returns the right value.

Based on the original patch from Benjamin Herrenschmidt 




Fixes: bd5050e38aec3
("powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang")


Signed-off-by: Aneesh Kumar K.V 
---
  arch/powerpc/mm/pgtable-radix.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 7be99fd9af15..c879979faa73 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -1045,20 +1045,22 @@ void radix__ptep_set_access_flags(struct vm_area_struct 
*vma, pte_t *ptep,
struct mm_struct *mm = vma->vm_mm;
unsigned long set = pte_val(entry) & (_PAGE_DIRTY | _PAGE_ACCESSED |
  _PAGE_RW | _PAGE_EXEC);
+
+   unsigned long change = pte_val(entry) ^ pte_val(*ptep);
/*
 * To avoid NMMU hang while relaxing access, we need mark
 * the pte invalid in between.
 */
-   if (atomic_read(>context.copros) > 0) {
+   if ((change & _PAGE_RW) && atomic_read(>context.copros) > 0) {
unsigned long old_pte, new_pte;
  
-		old_pte = __radix_pte_update(ptep, ~0, 0);

+   old_pte = __radix_pte_update(ptep, _PAGE_PRESENT, 
_PAGE_INVALID);
/*
 * new value of pte
 */
new_pte = old_pte | set;
radix__flush_tlb_page_psize(mm, address, psize);
-   __radix_pte_update(ptep, 0, new_pte);
+   __radix_pte_update(ptep, _PAGE_INVALID, new_pte);
} else {
__radix_pte_update(ptep, 0, set);
/*