Re: [PATCH 03/12] powerpc/64s/hash: move POWER5 < DD2.1 slbie workaround where it is needed

2018-09-17 Thread Nicholas Piggin
On Mon, 17 Sep 2018 11:30:16 +0530
"Aneesh Kumar K.V"  wrote:

> Nicholas Piggin  writes:
> 
> > The POWER5 < DD2.1 issue is that slbie needs to be issued more than
> > once. It came in with this change:
> >
> > ChangeSet@1.1608, 2004-04-29 07:12:31-07:00, da...@gibson.dropbear.id.au
> >   [PATCH] POWER5 erratum workaround
> >
> >   Early POWER5 revisions (<DD2.1) have a problem requiring slbie
> >   instructions to be repeated under some circumstances.  The patch below
> >   adds a workaround (patch made by Anton Blanchard).  
> 
> Thanks for extracting this. Can we add this to the code?

The comment? Sure.

> Also I am not
> sure what is repeated here: is it that we just need one extra slbie (hence
> only applicable to offset == 1), or that we need to make sure there
> is always one extra slbie? The code does the former.

Yeah it has always done the former, so my assumption is that you just
need more than one slbie. I don't think we need to bother revisiting
that assumption unless someone can pull up something definitive.

What I did change is that slbia no longer has the additional slbie, but
I think there are strong reasons not to need that.
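
For reference, after this patch the flush path in switch_slb() ends up
roughly as below (a sketch condensed from the diff quoted later in this
thread, not the verbatim source):

	if (!mmu_has_feature(MMU_FTR_NO_SLBIE_B) &&
	    offset <= SLB_CACHE_ENTRIES) {
		unsigned long slbie_data;
		int i;

		asm volatile("isync" : : : "memory");
		for (i = 0; i < offset; i++) {
			slbie_data = (unsigned long)get_paca()->slb_cache[i]
				<< SID_SHIFT; /* EA */
			slbie_data |= user_segment_size(slbie_data)
				<< SLBIE_SSIZE_SHIFT;
			slbie_data |= SLBIE_C; /* C set for user addresses */
			asm volatile("slbie %0" : : "r" (slbie_data));
		}

		/* POWER5 < DD2.1 workaround: repeat the slbie if only one was issued */
		if (!cpu_has_feature(CPU_FTR_ARCH_207S) && offset == 1)
			asm volatile("slbie %0" : : "r" (slbie_data));

		asm volatile("isync" : : : "memory");
	} else {
		__slb_flush_and_rebolt();	/* slbia path, no workaround slbie */
	}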

>  Do you have a link for
> that email patch?

I tried looking through the archives around that date but could not
find it. That came from a bitkeeper log.

Thanks,
Nick


Re: [PATCH 03/12] powerpc/64s/hash: move POWER5 < DD2.1 slbie workaround where it is needed

2018-09-17 Thread Aneesh Kumar K.V
Nicholas Piggin  writes:

> The POWER5 < DD2.1 issue is that slbie needs to be issued more than
> once. It came in with this change:
>
> ChangeSet@1.1608, 2004-04-29 07:12:31-07:00, da...@gibson.dropbear.id.au
>   [PATCH] POWER5 erratum workaround
>
>   Early POWER5 revisions (<DD2.1) have a problem requiring slbie
>   instructions to be repeated under some circumstances.  The patch below
>   adds a workaround (patch made by Anton Blanchard).

Thanks for extracting this. Can we add this to the code? Also I am not
sure what is repeated here: is it that we just need one extra slbie (hence
only applicable to offset == 1), or that we need to make sure there
is always one extra slbie? The code does the former. Do you have a link for
that email patch?
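
In sketch form, the two readings (illustrative pseudo-variants of the
workaround, not proposed code):

	/* reading 1 (what the code does): repeat only when a single slbie was issued */
	if (offset == 1)
		asm volatile("slbie %0" : : "r" (slbie_data));

	/* reading 2 (the alternative): always issue one extra slbie after the loop */
	if (offset >= 1)
		asm volatile("slbie %0" : : "r" (slbie_data));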


>
> The extra slbie in switch_slb is done even for the case where slbia is
> called (slb_flush_and_rebolt). I don't believe that is required
> because there are other slb_flush_and_rebolt callers which do not
> issue the workaround slbie, which would be broken if it was required.
>
> It also seems to be fine inside the isync with the first slbie, as it
> is in the kernel stack switch code.
>
> So move this workaround to where it is required. This is not much of
> an optimisation because this is the fast path, but it makes the code
> more understandable and neater.
>
> Signed-off-by: Nicholas Piggin 
> ---
>  arch/powerpc/mm/slb.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
> index 1c7128c63a4b..d952ece3abf7 100644
> --- a/arch/powerpc/mm/slb.c
> +++ b/arch/powerpc/mm/slb.c
> @@ -226,7 +226,6 @@ static inline int esids_match(unsigned long addr1, unsigned long addr2)
>  void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
>  {
>   unsigned long offset;
> - unsigned long slbie_data = 0;
>   unsigned long pc = KSTK_EIP(tsk);
>   unsigned long stack = KSTK_ESP(tsk);
>   unsigned long exec_base;
> @@ -241,7 +240,9 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
>   offset = get_paca()->slb_cache_ptr;
>   if (!mmu_has_feature(MMU_FTR_NO_SLBIE_B) &&
>   offset <= SLB_CACHE_ENTRIES) {
> + unsigned long slbie_data;
>   int i;
> +
>   asm volatile("isync" : : : "memory");
>   for (i = 0; i < offset; i++) {
>   slbie_data = (unsigned long)get_paca()->slb_cache[i]
> @@ -251,15 +252,14 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
>   slbie_data |= SLBIE_C; /* C set for user addresses */
>   asm volatile("slbie %0" : : "r" (slbie_data));
>   }
> - asm volatile("isync" : : : "memory");
> - } else {
> - __slb_flush_and_rebolt();
> - }
>  
> - if (!cpu_has_feature(CPU_FTR_ARCH_207S)) {
>   /* Workaround POWER5 < DD2.1 issue */
> - if (offset == 1 || offset > SLB_CACHE_ENTRIES)
> + if (!cpu_has_feature(CPU_FTR_ARCH_207S) && offset == 1)
>   asm volatile("slbie %0" : : "r" (slbie_data));
> +
> + asm volatile("isync" : : : "memory");
> + } else {
> + __slb_flush_and_rebolt();
>   }
>  
>   get_paca()->slb_cache_ptr = 0;
> -- 
> 2.18.0



Re: [PATCH 03/12] powerpc/64s/hash: move POWER5 < DD2.1 slbie workaround where it is needed

2018-09-16 Thread kbuild test robot
Hi Nicholas,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.19-rc3 next-20180913]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Nicholas-Piggin/SLB-miss-conversion-to-C-and-SLB-optimisations/20180917-015458
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=powerpc 

Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

Note: the linux-review/Nicholas-Piggin/SLB-miss-conversion-to-C-and-SLB-optimisations/20180917-015458 HEAD b26bd44c74488169a0fd19eef43ea3db189a207d builds fine.
      It only hurts bisectibility.

All errors (new ones prefixed by >>):

   arch/powerpc/mm/slb.c: In function 'switch_slb':
>> arch/powerpc/mm/slb.c:258:4: error: 'slbie_data' may be used uninitialized in this function [-Werror=maybe-uninitialized]
   asm volatile("slbie %0" : : "r" (slbie_data));
   ^~~
   cc1: all warnings being treated as errors
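
The warning is a classic maybe-uninitialized false positive: slbie_data
is assigned only inside the flush loop, and GCC cannot prove that the
offset == 1 guard implies the loop body ran at least once. A minimal
standalone sketch of the pattern (hypothetical code, not from the kernel):

	/* build with: gcc -O2 -Wmaybe-uninitialized -c repro.c */
	unsigned long source(int i);

	unsigned long f(int offset)
	{
		unsigned long v;	/* assigned only when offset > 0 */
		int i;

		for (i = 0; i < offset; i++)
			v = source(i);

		/* offset == 1 implies the loop ran once, but GCC cannot see that */
		if (offset == 1)
			return v;	/* may trigger: 'v' may be used uninitialized */
		return 0;
	}

Initializing the variable, or restructuring so the compiler can see the
dependency, are the usual ways to silence it.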

vim +/slbie_data +258 arch/powerpc/mm/slb.c

465ccab9 arch/powerpc/mm/slb.c will schmidt 2007-10-31  224  
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds   2005-04-16  225  /* Flush all user entries from the segment table of the current processor. */
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds   2005-04-16  226  void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds   2005-04-16  227  {
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras   2009-08-17  228  	unsigned long offset;
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds   2005-04-16  229  	unsigned long pc = KSTK_EIP(tsk);
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds   2005-04-16  230  	unsigned long stack = KSTK_ESP(tsk);
de4376c2 arch/powerpc/mm/slb.c Anton Blanchard  2009-07-13  231  	unsigned long exec_base;
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds   2005-04-16  232  
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras   2009-08-17  233  	/*
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras   2009-08-17  234  	 * We need interrupts hard-disabled here, not just soft-disabled,
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras   2009-08-17  235  	 * so that a PMU interrupt can't occur, which might try to access
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras   2009-08-17  236  	 * user memory (to get a stack trace) and possible cause an SLB miss
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras   2009-08-17  237  	 * which would update the slb_cache/slb_cache_ptr fields in the PACA.
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras   2009-08-17  238  	 */
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras   2009-08-17  239  	hard_irq_disable();
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras   2009-08-17  240  	offset = get_paca()->slb_cache_ptr;
44ae3ab3 arch/powerpc/mm/slb.c Matt Evans       2011-04-06  241  	if (!mmu_has_feature(MMU_FTR_NO_SLBIE_B) &&
f66bce5e arch/powerpc/mm/slb.c Olof Johansson   2007-10-16  242  	    offset <= SLB_CACHE_ENTRIES) {
6697d605 arch/powerpc/mm/slb.c Nicholas Piggin  2018-09-15  243  		unsigned long slbie_data;
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds   2005-04-16  244  		int i;
6697d605 arch/powerpc/mm/slb.c Nicholas Piggin  2018-09-15  245  
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds   2005-04-16  246  		asm volatile("isync" : : : "memory");
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds   2005-04-16  247  		for (i = 0; i < offset; i++) {
1189be65 arch/powerpc/mm/slb.c Paul Mackerras   2007-10-11  248  			slbie_data = (unsigned long)get_paca()->slb_cache[i]
1189be65 arch/powerpc/mm/slb.c Paul Mackerras   2007-10-11  249  				<< SID_SHIFT; /* EA */
1189be65 arch/powerpc/mm/slb.c Paul Mackerras   2007-10-11  250  			slbie_data |= user_segment_size(slbie_data)
1189be65 arch/powerpc/mm/slb.c Paul Mackerras   2007-10-11  251  				<< SLBIE_SSIZE_SHIFT;
1189be65 arch/powerpc/mm/slb.c Paul Mackerras   2007-10-11  252  			slbie_data |= SLBIE_C; /* C set for user addresses */
1189be65 arch/powerpc/mm/slb.c Paul Mackerras   2007-10-11  253  			asm volatile("slbie %0" : : "r" (slbie_data));
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds   2005-04-16  254  		}
^1da177e arch/ppc64/mm/slb.c   Linus Torvalds   2005-04-16  255  

[PATCH 03/12] powerpc/64s/hash: move POWER5 < DD2.1 slbie workaround where it is needed

2018-09-14 Thread Nicholas Piggin
The POWER5 < DD2.1 issue is that slbie needs to be issued more than
once. It came in with this change:

ChangeSet@1.1608, 2004-04-29 07:12:31-07:00, da...@gibson.dropbear.id.au
  [PATCH] POWER5 erratum workaround

  Early POWER5 revisions (<DD2.1) have a problem requiring slbie
  instructions to be repeated under some circumstances.  The patch below
  adds a workaround (patch made by Anton Blanchard).

The extra slbie in switch_slb is done even for the case where slbia is
called (slb_flush_and_rebolt). I don't believe that is required
because there are other slb_flush_and_rebolt callers which do not
issue the workaround slbie, which would be broken if it was required.

It also seems to be fine inside the isync with the first slbie, as it
is in the kernel stack switch code.

So move this workaround to where it is required. This is not much of
an optimisation because this is the fast path, but it makes the code
more understandable and neater.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/mm/slb.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index 1c7128c63a4b..d952ece3abf7 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -226,7 +226,6 @@ static inline int esids_match(unsigned long addr1, unsigned long addr2)
 void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
 {
unsigned long offset;
-   unsigned long slbie_data = 0;
unsigned long pc = KSTK_EIP(tsk);
unsigned long stack = KSTK_ESP(tsk);
unsigned long exec_base;
@@ -241,7 +240,9 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
offset = get_paca()->slb_cache_ptr;
if (!mmu_has_feature(MMU_FTR_NO_SLBIE_B) &&
offset <= SLB_CACHE_ENTRIES) {
+   unsigned long slbie_data;
int i;
+
asm volatile("isync" : : : "memory");
for (i = 0; i < offset; i++) {
slbie_data = (unsigned long)get_paca()->slb_cache[i]
@@ -251,15 +252,14 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
slbie_data |= SLBIE_C; /* C set for user addresses */
asm volatile("slbie %0" : : "r" (slbie_data));
}
-   asm volatile("isync" : : : "memory");
-   } else {
-   __slb_flush_and_rebolt();
-   }
 
-   if (!cpu_has_feature(CPU_FTR_ARCH_207S)) {
/* Workaround POWER5 < DD2.1 issue */
-   if (offset == 1 || offset > SLB_CACHE_ENTRIES)
+   if (!cpu_has_feature(CPU_FTR_ARCH_207S) && offset == 1)
asm volatile("slbie %0" : : "r" (slbie_data));
+
+   asm volatile("isync" : : : "memory");
+   } else {
+   __slb_flush_and_rebolt();
}
 
get_paca()->slb_cache_ptr = 0;
-- 
2.18.0