Re: [PATCH 03/12] powerpc/64s/hash: move POWER5 < DD2.1 slbie workaround where it is needed
On Mon, 17 Sep 2018 11:30:16 +0530 "Aneesh Kumar K.V" wrote:

> Nicholas Piggin writes:
>
> > The POWER5 < DD2.1 issue is that slbie needs to be issued more than
> > once. It came in with this change:
> >
> >   ChangeSet@1.1608, 2004-04-29 07:12:31-07:00, da...@gibson.dropbear.id.au
> >     [PATCH] POWER5 erratum workaround
> >
> >     Early POWER5 revisions (<DD2.1) have a problem requiring slbie
> >     instructions to be repeated under some circumstances. The patch
> >     below adds a workaround (patch made by Anton Blanchard).
>
> Thanks for extracting this. Can we add this to the code?

The comment? Sure.

> Also I am not sure what is repeated here? Is it that we just need one
> slbie extra (hence only applicable to offset == 1) or is it that we
> need to make sure there is always one slbie extra? The code does the
> former.

Yeah it has always done the former, so my assumption is that you just
need more than one slbie. I don't think we need to bother revisiting
that assumption unless someone can pull up something definitive.

What I did change is that slbia no longer has the additional slbie, but
I think there are strong reasons not to need that.

> Do you have a link for that email patch?

I tried looking through the archives around that date but could not
find it. That came from a bitkeeper log.

Thanks,
Nick
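For concreteness, a minimal sketch of the workaround shape under the
reading Nick describes (a hypothetical helper, not the kernel code):
the erratum is assumed to require more than one slbie in total, so the
last slbie is simply repeated when the loop issued exactly one.

/*
 * Hypothetical illustration only. slbie_vals[] holds pre-encoded
 * slbie operands (EA, segment size and C bit already set).
 */
static void flush_user_slb_cache(const unsigned long *slbie_vals, int n)
{
        unsigned long slbie_data = 0;
        int i;

        asm volatile("isync" : : : "memory");
        for (i = 0; i < n; i++) {
                slbie_data = slbie_vals[i];
                asm volatile("slbie %0" : : "r" (slbie_data));
        }
        /* POWER5 < DD2.1: make sure at least two slbies were issued */
        if (n == 1)
                asm volatile("slbie %0" : : "r" (slbie_data));
        asm volatile("isync" : : : "memory");
}

Note that the slbia path gets no extra slbie in this sketch, matching
the change discussed above.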
Re: [PATCH 03/12] powerpc/64s/hash: move POWER5 < DD2.1 slbie workaround where it is needed
Nicholas Piggin writes:

> The POWER5 < DD2.1 issue is that slbie needs to be issued more than
> once. It came in with this change:
>
>   ChangeSet@1.1608, 2004-04-29 07:12:31-07:00, da...@gibson.dropbear.id.au
>     [PATCH] POWER5 erratum workaround
>
>     Early POWER5 revisions (<DD2.1) have a problem requiring slbie
>     instructions to be repeated under some circumstances. The patch
>     below adds a workaround (patch made by Anton Blanchard).

Thanks for extracting this. Can we add this to the code? Also I am not
sure what is repeated here? Is it that we just need one slbie extra
(hence only applicable to offset == 1) or is it that we need to make
sure there is always one slbie extra? The code does the former. Do you
have a link for that email patch?

> The extra slbie in switch_slb is done even for the case where slbia is
> called (slb_flush_and_rebolt). I don't believe that is required
> because there are other slb_flush_and_rebolt callers which do not
> issue the workaround slbie, which would be broken if it was required.
>
> It also seems to be fine inside the isync with the first slbie, as it
> is in the kernel stack switch code.
>
> So move this workaround to where it is required. This is not much of
> an optimisation because this is the fast path, but it makes the code
> more understandable and neater.
>
> Signed-off-by: Nicholas Piggin
> ---
>  arch/powerpc/mm/slb.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
> index 1c7128c63a4b..d952ece3abf7 100644
> --- a/arch/powerpc/mm/slb.c
> +++ b/arch/powerpc/mm/slb.c
> @@ -226,7 +226,6 @@ static inline int esids_match(unsigned long addr1, unsigned long addr2)
>  void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
>  {
>  	unsigned long offset;
> -	unsigned long slbie_data = 0;
>  	unsigned long pc = KSTK_EIP(tsk);
>  	unsigned long stack = KSTK_ESP(tsk);
>  	unsigned long exec_base;
> @@ -241,7 +240,9 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
>  	offset = get_paca()->slb_cache_ptr;
>  	if (!mmu_has_feature(MMU_FTR_NO_SLBIE_B) &&
>  	    offset <= SLB_CACHE_ENTRIES) {
> +		unsigned long slbie_data;
>  		int i;
> +
>  		asm volatile("isync" : : : "memory");
>  		for (i = 0; i < offset; i++) {
>  			slbie_data = (unsigned long)get_paca()->slb_cache[i]
> @@ -251,15 +252,14 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
>  			slbie_data |= SLBIE_C; /* C set for user addresses */
>  			asm volatile("slbie %0" : : "r" (slbie_data));
>  		}
> -		asm volatile("isync" : : : "memory");
> -	} else {
> -		__slb_flush_and_rebolt();
> -	}
>
> -	if (!cpu_has_feature(CPU_FTR_ARCH_207S)) {
>  		/* Workaround POWER5 < DD2.1 issue */
> -		if (offset == 1 || offset > SLB_CACHE_ENTRIES)
> +		if (!cpu_has_feature(CPU_FTR_ARCH_207S) && offset == 1)
>  			asm volatile("slbie %0" : : "r" (slbie_data));
> +
> +		asm volatile("isync" : : : "memory");
> +	} else {
> +		__slb_flush_and_rebolt();
>  	}
>
>  	get_paca()->slb_cache_ptr = 0;
> --
> 2.18.0
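To make the two readings of the erratum concrete, compare these
hypothetical fragments (slbie_data holds the operand used by the last
loop iteration):

        /* (a) what the code does: more than one slbie is needed in
         * total, so only repeat when exactly one was issued */
        if (offset == 1)
                asm volatile("slbie %0" : : "r" (slbie_data));

        /* (b) the alternative reading: always issue one extra slbie,
         * no matter how many the loop already issued */
        if (offset >= 1)
                asm volatile("slbie %0" : : "r" (slbie_data));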
Re: [PATCH 03/12] powerpc/64s/hash: move POWER5 < DD2.1 slbie workaround where it is needed
Hi Nicholas,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.19-rc3 next-20180913]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:      https://github.com/0day-ci/linux/commits/Nicholas-Piggin/SLB-miss-conversion-to-C-and-SLB-optimisations/20180917-015458
base:     https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config:   powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.2.0 make.cross ARCH=powerpc

Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

Note: the linux-review/Nicholas-Piggin/SLB-miss-conversion-to-C-and-SLB-optimisations/20180917-015458
HEAD b26bd44c74488169a0fd19eef43ea3db189a207d builds fine.
It only hurts bisectibility.

All errors (new ones prefixed by >>):

   arch/powerpc/mm/slb.c: In function 'switch_slb':
>> arch/powerpc/mm/slb.c:258:4: error: 'slbie_data' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      asm volatile("slbie %0" : : "r" (slbie_data));
      ^~~
   cc1: all warnings being treated as errors

vim +/slbie_data +258 arch/powerpc/mm/slb.c

465ccab9 arch/powerpc/mm/slb.c will schmidt    2007-10-31  224  
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds  2005-04-16  225  /* Flush all user entries from the segment table of the current processor. */
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds  2005-04-16  226  void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds  2005-04-16  227  {
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras  2009-08-17  228  	unsigned long offset;
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds  2005-04-16  229  	unsigned long pc = KSTK_EIP(tsk);
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds  2005-04-16  230  	unsigned long stack = KSTK_ESP(tsk);
de4376c2 arch/powerpc/mm/slb.c Anton Blanchard 2009-07-13  231  	unsigned long exec_base;
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds  2005-04-16  232  
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras  2009-08-17  233  	/*
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras  2009-08-17  234  	 * We need interrupts hard-disabled here, not just soft-disabled,
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras  2009-08-17  235  	 * so that a PMU interrupt can't occur, which might try to access
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras  2009-08-17  236  	 * user memory (to get a stack trace) and possible cause an SLB miss
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras  2009-08-17  237  	 * which would update the slb_cache/slb_cache_ptr fields in the PACA.
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras  2009-08-17  238  	 */
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras  2009-08-17  239  	hard_irq_disable();
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras  2009-08-17  240  	offset = get_paca()->slb_cache_ptr;
44ae3ab3 arch/powerpc/mm/slb.c Matt Evans      2011-04-06  241  	if (!mmu_has_feature(MMU_FTR_NO_SLBIE_B) &&
f66bce5e arch/powerpc/mm/slb.c Olof Johansson  2007-10-16  242  	    offset <= SLB_CACHE_ENTRIES) {
6697d605 arch/powerpc/mm/slb.c Nicholas Piggin 2018-09-15  243  		unsigned long slbie_data;
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds  2005-04-16  244  		int i;
6697d605 arch/powerpc/mm/slb.c Nicholas Piggin 2018-09-15  245  
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds  2005-04-16  246  		asm volatile("isync" : : : "memory");
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds  2005-04-16  247  		for (i = 0; i < offset; i++) {
1189be65 arch/powerpc/mm/slb.c Paul Mackerras  2007-10-11  248  			slbie_data = (unsigned long)get_paca()->slb_cache[i]
1189be65 arch/powerpc/mm/slb.c Paul Mackerras  2007-10-11  249  				<< SID_SHIFT; /* EA */
1189be65 arch/powerpc/mm/slb.c Paul Mackerras  2007-10-11  250  			slbie_data |= user_segment_size(slbie_data)
1189be65 arch/powerpc/mm/slb.c Paul Mackerras  2007-10-11  251  				<< SLBIE_SSIZE_SHIFT;
1189be65 arch/powerpc/mm/slb.c Paul Mackerras  2007-10-11  252  			slbie_data |= SLBIE_C; /* C set for user addresses */
1189be65 arch/powerpc/mm/slb.c Paul Mackerras  2007-10-11  253  			asm volatile("slbie %0" : : "r" (slbie_data));
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds  2005-04-16  254  		}
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds  2005-04-16  255  
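The warning pattern can be reduced to a few lines (a hypothetical
example; compute() and use() stand in for the slb_cache encoding and
the workaround slbie). The variable is only read when n == 1, in which
case the loop body must have assigned it, but gcc's flow analysis
typically cannot prove that implication, hence the false positive:

extern unsigned long compute(int i);	/* hypothetical */
extern void use(unsigned long v);	/* hypothetical */

void demo(int n)
{
        unsigned long x;	/* no initializer, like slbie_data after the patch */
        int i;

        for (i = 0; i < n; i++)
                x = compute(i);

        if (n == 1)
                use(x);	/* gcc: 'x' may be used uninitialized */
}

Re-adding the initializer the patch dropped (unsigned long slbie_data = 0;)
would be one way to keep -Werror builds happy without changing behaviour.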
[PATCH 03/12] powerpc/64s/hash: move POWER5 < DD2.1 slbie workaround where it is needed
The POWER5 < DD2.1 issue is that slbie needs to be issued more than
once. It came in with this change:

  ChangeSet@1.1608, 2004-04-29 07:12:31-07:00, da...@gibson.dropbear.id.au
    [PATCH] POWER5 erratum workaround

    Early POWER5 revisions (<DD2.1) have a problem requiring slbie
    instructions to be repeated under some circumstances. The patch
    below adds a workaround (patch made by Anton Blanchard).

The extra slbie in switch_slb is done even for the case where slbia is
called (slb_flush_and_rebolt). I don't believe that is required
because there are other slb_flush_and_rebolt callers which do not
issue the workaround slbie, which would be broken if it was required.

It also seems to be fine inside the isync with the first slbie, as it
is in the kernel stack switch code.

So move this workaround to where it is required. This is not much of
an optimisation because this is the fast path, but it makes the code
more understandable and neater.

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/mm/slb.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index 1c7128c63a4b..d952ece3abf7 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -226,7 +226,6 @@ static inline int esids_match(unsigned long addr1, unsigned long addr2)
 void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
 {
 	unsigned long offset;
-	unsigned long slbie_data = 0;
 	unsigned long pc = KSTK_EIP(tsk);
 	unsigned long stack = KSTK_ESP(tsk);
 	unsigned long exec_base;
@@ -241,7 +240,9 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
 	offset = get_paca()->slb_cache_ptr;
 	if (!mmu_has_feature(MMU_FTR_NO_SLBIE_B) &&
 	    offset <= SLB_CACHE_ENTRIES) {
+		unsigned long slbie_data;
 		int i;
+
 		asm volatile("isync" : : : "memory");
 		for (i = 0; i < offset; i++) {
 			slbie_data = (unsigned long)get_paca()->slb_cache[i]
@@ -251,15 +252,14 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
 			slbie_data |= SLBIE_C; /* C set for user addresses */
 			asm volatile("slbie %0" : : "r" (slbie_data));
 		}
-		asm volatile("isync" : : : "memory");
-	} else {
-		__slb_flush_and_rebolt();
-	}

-	if (!cpu_has_feature(CPU_FTR_ARCH_207S)) {
 		/* Workaround POWER5 < DD2.1 issue */
-		if (offset == 1 || offset > SLB_CACHE_ENTRIES)
+		if (!cpu_has_feature(CPU_FTR_ARCH_207S) && offset == 1)
 			asm volatile("slbie %0" : : "r" (slbie_data));
+
+		asm volatile("isync" : : : "memory");
+	} else {
+		__slb_flush_and_rebolt();
 	}

 	get_paca()->slb_cache_ptr = 0;
--
2.18.0