Re: [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case
On Thu, 2017-02-09 at 08:28 +0530, Aneesh Kumar K.V wrote: > In the kernel we do follow the below sequence in different code paths. > pte = ptep_get_clear(ptep) > > set_pte_at(ptep, pte) > > We do that for mremap, autonuma protection update and softdirty clearing. This > implies our optimization to skip a tlb flush when clearing a pte update is > not valid, because for DD1 system that followup set_pte_at will be done witout > doing the required tlbflush. Fix that by always doing the dd1 style pte update > irrespective of new_pte value. In a later patch we will optimize the > application > exit case. > > Signed-off-by: Benjamin Herrenschmidt> Signed-off-by: Aneesh Kumar K.V Tested-by: Michael Neuling > --- > arch/powerpc/include/asm/book3s/64/radix.h | 12 +++- > 1 file changed, 3 insertions(+), 9 deletions(-) > > diff --git a/arch/powerpc/include/asm/book3s/64/radix.h > b/arch/powerpc/include/asm/book3s/64/radix.h > index b4d1302387a3..70a3cdcdbe47 100644 > --- a/arch/powerpc/include/asm/book3s/64/radix.h > +++ b/arch/powerpc/include/asm/book3s/64/radix.h > @@ -144,16 +144,10 @@ static inline unsigned long radix__pte_update(struct > mm_struct *mm, > * new value of pte > */ > new_pte = (old_pte | set) & ~clr; > - /* > - * If we are trying to clear the pte, we can skip > - * the below sequence and batch the tlb flush. The > - * tlb flush batching is done by mmu gather code > - */ > - if (new_pte) { > - asm volatile("ptesync" : : : "memory"); > - radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr); > + asm volatile("ptesync" : : : "memory"); > + radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr); > + if (new_pte) > __radix_pte_update(ptep, 0, new_pte); > - } > } else > old_pte = __radix_pte_update(ptep, clr, set); > asm volatile("ptesync" : : : "memory");
Re: [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case
Benjamin Herrenschmidtwrites: > On Thu, 2017-02-09 at 14:49 +1100, Benjamin Herrenschmidt wrote: >> On Thu, 2017-02-09 at 08:28 +0530, Aneesh Kumar K.V wrote: >> > In the kernel we do follow the below sequence in different code >> > paths. >> > pte = ptep_get_clear(ptep) >> > >> > set_pte_at(ptep, pte) >> > >> > We do that for mremap, autonuma protection update and softdirty >> > clearing. This >> > implies our optimization to skip a tlb flush when clearing a pte >> > update is >> > not valid, because for DD1 system that followup set_pte_at will be >> > done witout >> > doing the required tlbflush. Fix that by always doing the dd1 style >> > pte update >> > irrespective of new_pte value. In a later patch we will optimize >> > the application >> > exit case. >> >> What about my change to set_pte_at() ? We seem to be overwriting >> valid PTEs, >> shouldn't we deal with that ? > > So the HW guys confirmed that the TLB will never cache a valid entry > that has all permissions clear. That leaves the THP write problem > though. Which is fixed by the autonuma related changes posted at https://lkml.kernel.org/r/1486609259-6796-1-git-send-email-aneesh.ku...@linux.vnet.ibm.com Right now i am running a kernel compile in loop with parallel perf bench numa mem run to make sure we got most of the details correct. (this is on p8) -aneesh
Re: [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case
On Thu, 2017-02-09 at 14:49 +1100, Benjamin Herrenschmidt wrote: > On Thu, 2017-02-09 at 08:28 +0530, Aneesh Kumar K.V wrote: > > In the kernel we do follow the below sequence in different code > > paths. > > pte = ptep_get_clear(ptep) > > > > set_pte_at(ptep, pte) > > > > We do that for mremap, autonuma protection update and softdirty > > clearing. This > > implies our optimization to skip a tlb flush when clearing a pte > > update is > > not valid, because for DD1 system that followup set_pte_at will be > > done witout > > doing the required tlbflush. Fix that by always doing the dd1 style > > pte update > > irrespective of new_pte value. In a later patch we will optimize > > the application > > exit case. > > What about my change to set_pte_at() ? We seem to be overwriting > valid PTEs, > shouldn't we deal with that ? So the HW guys confirmed that the TLB will never cache a valid entry that has all permissions clear. That leaves the THP write problem though. Cheers, Ben. > Cheers, > Ben. > > > Signed-off-by: Benjamin Herrenschmidt> > > Signed-off-by: Aneesh Kumar K.V > > > > --- > > arch/powerpc/include/asm/book3s/64/radix.h | 12 +++- > > 1 file changed, 3 insertions(+), 9 deletions(-) > > > > diff --git a/arch/powerpc/include/asm/book3s/64/radix.h > > b/arch/powerpc/include/asm/book3s/64/radix.h > > index b4d1302387a3..70a3cdcdbe47 100644 > > --- a/arch/powerpc/include/asm/book3s/64/radix.h > > +++ b/arch/powerpc/include/asm/book3s/64/radix.h > > @@ -144,16 +144,10 @@ static inline unsigned long > > radix__pte_update(struct mm_struct *mm, > > > * new value of pte > > > */ > > > new_pte = (old_pte | set) & ~clr; > > > - /* > > > - * If we are trying to clear the pte, we can > > > skip > > > - * the below sequence and batch the tlb flush. > > > The > > > - * tlb flush batching is done by mmu gather code > > > - */ > > > - if (new_pte) { > > > - asm volatile("ptesync" : : : "memory"); > > > - radix__flush_tlb_pte_p9_dd1(old_pte, mm, > > > addr); > > > + asm volatile("ptesync" : : : "memory"); > > > + radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr); > > > + if (new_pte) > > > __radix_pte_update(ptep, 0, new_pte); > > > - } > > > } else > > > old_pte = __radix_pte_update(ptep, clr, set); > > > asm volatile("ptesync" : : : "memory");
Re: [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case
On Thu, 2017-02-09 at 08:28 +0530, Aneesh Kumar K.V wrote: > In the kernel we do follow the below sequence in different code paths. > pte = ptep_get_clear(ptep) > > set_pte_at(ptep, pte) > > We do that for mremap, autonuma protection update and softdirty clearing. This > implies our optimization to skip a tlb flush when clearing a pte update is > not valid, because for DD1 system that followup set_pte_at will be done witout > doing the required tlbflush. Fix that by always doing the dd1 style pte update > irrespective of new_pte value. In a later patch we will optimize the > application > exit case. What about my change to set_pte_at() ? We seem to be overwriting valid PTEs, shouldn't we deal with that ? Cheers, Ben. > Signed-off-by: Benjamin Herrenschmidt> > Signed-off-by: Aneesh Kumar K.V > --- > arch/powerpc/include/asm/book3s/64/radix.h | 12 +++- > 1 file changed, 3 insertions(+), 9 deletions(-) > > diff --git a/arch/powerpc/include/asm/book3s/64/radix.h > b/arch/powerpc/include/asm/book3s/64/radix.h > index b4d1302387a3..70a3cdcdbe47 100644 > --- a/arch/powerpc/include/asm/book3s/64/radix.h > +++ b/arch/powerpc/include/asm/book3s/64/radix.h > @@ -144,16 +144,10 @@ static inline unsigned long radix__pte_update(struct > mm_struct *mm, > > * new value of pte > > */ > > new_pte = (old_pte | set) & ~clr; > > - /* > > - * If we are trying to clear the pte, we can skip > > - * the below sequence and batch the tlb flush. The > > - * tlb flush batching is done by mmu gather code > > - */ > > - if (new_pte) { > > - asm volatile("ptesync" : : : "memory"); > > - radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr); > > + asm volatile("ptesync" : : : "memory"); > > + radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr); > > + if (new_pte) > > __radix_pte_update(ptep, 0, new_pte); > > - } > > } else > > old_pte = __radix_pte_update(ptep, clr, set); > > asm volatile("ptesync" : : : "memory");
[PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case
In the kernel we do follow the below sequence in different code paths. pte = ptep_get_clear(ptep) set_pte_at(ptep, pte) We do that for mremap, autonuma protection update and softdirty clearing. This implies our optimization to skip a tlb flush when clearing a pte update is not valid, because for DD1 system that followup set_pte_at will be done witout doing the required tlbflush. Fix that by always doing the dd1 style pte update irrespective of new_pte value. In a later patch we will optimize the application exit case. Signed-off-by: Benjamin HerrenschmidtSigned-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/radix.h | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h index b4d1302387a3..70a3cdcdbe47 100644 --- a/arch/powerpc/include/asm/book3s/64/radix.h +++ b/arch/powerpc/include/asm/book3s/64/radix.h @@ -144,16 +144,10 @@ static inline unsigned long radix__pte_update(struct mm_struct *mm, * new value of pte */ new_pte = (old_pte | set) & ~clr; - /* -* If we are trying to clear the pte, we can skip -* the below sequence and batch the tlb flush. The -* tlb flush batching is done by mmu gather code -*/ - if (new_pte) { - asm volatile("ptesync" : : : "memory"); - radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr); + asm volatile("ptesync" : : : "memory"); + radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr); + if (new_pte) __radix_pte_update(ptep, 0, new_pte); - } } else old_pte = __radix_pte_update(ptep, clr, set); asm volatile("ptesync" : : : "memory"); -- 2.7.4