Re: [PATCH] powerpc/64s/hash: convert SLB miss handlers to C

2018-08-21 Thread Nicholas Piggin
On Tue, 21 Aug 2018 16:12:44 +1000
Benjamin Herrenschmidt  wrote:

> On Tue, 2018-08-21 at 15:13 +1000, Nicholas Piggin wrote:
> > This patch moves SLB miss handlers completely to C, using the standard
> > exception handler macros to set up the stack and branch to C.
> > 
> > This can be done because the segment containing the kernel stack is
> > always bolted, so accessing it with relocation on will not cause an
> > SLB exception.
> > 
> > Arbitrary kernel memory may not be accessed when handling kernel space
> > SLB misses, so care should be taken there. However user SLB misses can
> > access any kernel memory, which can be used to move some fields out of
> > the paca (in later patches).
> > 
> > User SLB misses could quite easily reconcile IRQs and set up a first
> > class kernel environment and exit via ret_from_except, however that
> > doesn't seem to be necessary at the moment, so we only do that if a
> > bad fault is encountered.
> > 
> > [ Credit to Aneesh for bug fixes, error checks, and improvements to bad
> >   address handling, etc ]
> > 
> > Signed-off-by: Nicholas Piggin 
> > 
> > Since RFC:
> > - Send patch 1 by itself to focus on the big change.
> > - Added MSR[RI] handling
> > - Fixed up a register loss bug exposed by irq tracing (Aneesh)
> > - Reject misses outside the defined kernel regions (Aneesh)
> > - Added several more sanity checks and error handlig (Aneesh), we may
> >   look at consolidating these tests and tightenig up the code but for
> >   a first pass we decided it's better to check carefully.
> > ---
> >  arch/powerpc/include/asm/asm-prototypes.h |   2 +
> >  arch/powerpc/kernel/exceptions-64s.S  | 202 +++--
> >  arch/powerpc/mm/Makefile  |   2 +-
> >  arch/powerpc/mm/slb.c | 257 +
> >  arch/powerpc/mm/slb_low.S | 335 --
> >  5 files changed, 185 insertions(+), 613 deletions(-)  
> ^^^^^^
> 
> Nice ! :-)

Okay with the subsequent improvements, the context switching benchmark
is about 30% faster. Radix is still faster, but hash now has 90% the
throughput rather than 70% with upstream. Although that's no longer
testing SLB miss performance because they are all eliminated.

I don't know of any other SLB miss intensive tests. I think these
numbers indicate it may be a reasonable approach wrt performance.
Though we should still look at HANA or something on a big memory
system.

Thanks,
Nick


Re: [PATCH] powerpc/64s/hash: convert SLB miss handlers to C

2018-08-21 Thread Nicholas Piggin
On Tue, 21 Aug 2018 16:12:44 +1000
Benjamin Herrenschmidt  wrote:

> On Tue, 2018-08-21 at 15:13 +1000, Nicholas Piggin wrote:
> > This patch moves SLB miss handlers completely to C, using the standard
> > exception handler macros to set up the stack and branch to C.
> > 
> > This can be done because the segment containing the kernel stack is
> > always bolted, so accessing it with relocation on will not cause an
> > SLB exception.
> > 
> > Arbitrary kernel memory may not be accessed when handling kernel space
> > SLB misses, so care should be taken there. However user SLB misses can
> > access any kernel memory, which can be used to move some fields out of
> > the paca (in later patches).
> > 
> > User SLB misses could quite easily reconcile IRQs and set up a first
> > class kernel environment and exit via ret_from_except, however that
> > doesn't seem to be necessary at the moment, so we only do that if a
> > bad fault is encountered.
> > 
> > [ Credit to Aneesh for bug fixes, error checks, and improvements to bad
> >   address handling, etc ]
> > 
> > Signed-off-by: Nicholas Piggin 
> > 
> > Since RFC:
> > - Send patch 1 by itself to focus on the big change.
> > - Added MSR[RI] handling
> > - Fixed up a register loss bug exposed by irq tracing (Aneesh)
> > - Reject misses outside the defined kernel regions (Aneesh)
> > - Added several more sanity checks and error handlig (Aneesh), we may
> >   look at consolidating these tests and tightenig up the code but for
> >   a first pass we decided it's better to check carefully.
> > ---
> >  arch/powerpc/include/asm/asm-prototypes.h |   2 +
> >  arch/powerpc/kernel/exceptions-64s.S  | 202 +++--
> >  arch/powerpc/mm/Makefile  |   2 +-
> >  arch/powerpc/mm/slb.c | 257 +
> >  arch/powerpc/mm/slb_low.S | 335 --
> >  5 files changed, 185 insertions(+), 613 deletions(-)  
> ^^^^^^
> 
> Nice ! :-)

So I did some measurements with context switching (that takes quite a
few SLB faults), we lose about 3-5% performance on that benchmark.

Top SLB involved profile entries before:

   7.44%  [k] switch_slb
   3.43%  [k] slb_compare_rr_to_size
   1.64%  [k] slb_miss_common
   1.24%  [k] slb_miss_kernel_load_io
   0.58%  [k] exc_virt_0x4480_instruction_access_slb

After:

   7.15%  [k] switch_slb
   3.90%  [k] slb_insert_entry
   3.65%  [k] fast_exception_return
   1.00%  [k] slb_allocate_user
   0.59%  [k] exc_virt_0x4480_instruction_access_slb

With later patches we can reduce SLB misses to zero on this workload
(and generally an order of magnitude lower on small workloads). But
each miss will be more expensive and very large memory workloads are
going to have mandatory misses. Will be good to try verifying that
we can do smarter SLB allocation and reclaim to make up for that on
workloads like HANA. I think we probably could because round robin
isn't great.

Thanks,
Nick


Re: [PATCH] powerpc/64s/hash: convert SLB miss handlers to C

2018-08-21 Thread Benjamin Herrenschmidt
On Tue, 2018-08-21 at 15:13 +1000, Nicholas Piggin wrote:
> This patch moves SLB miss handlers completely to C, using the standard
> exception handler macros to set up the stack and branch to C.
> 
> This can be done because the segment containing the kernel stack is
> always bolted, so accessing it with relocation on will not cause an
> SLB exception.
> 
> Arbitrary kernel memory may not be accessed when handling kernel space
> SLB misses, so care should be taken there. However user SLB misses can
> access any kernel memory, which can be used to move some fields out of
> the paca (in later patches).
> 
> User SLB misses could quite easily reconcile IRQs and set up a first
> class kernel environment and exit via ret_from_except, however that
> doesn't seem to be necessary at the moment, so we only do that if a
> bad fault is encountered.
> 
> [ Credit to Aneesh for bug fixes, error checks, and improvements to bad
>   address handling, etc ]
> 
> Signed-off-by: Nicholas Piggin 
> 
> Since RFC:
> - Send patch 1 by itself to focus on the big change.
> - Added MSR[RI] handling
> - Fixed up a register loss bug exposed by irq tracing (Aneesh)
> - Reject misses outside the defined kernel regions (Aneesh)
> - Added several more sanity checks and error handlig (Aneesh), we may
>   look at consolidating these tests and tightenig up the code but for
>   a first pass we decided it's better to check carefully.
> ---
>  arch/powerpc/include/asm/asm-prototypes.h |   2 +
>  arch/powerpc/kernel/exceptions-64s.S  | 202 +++--
>  arch/powerpc/mm/Makefile  |   2 +-
>  arch/powerpc/mm/slb.c | 257 +
>  arch/powerpc/mm/slb_low.S | 335 --
>  5 files changed, 185 insertions(+), 613 deletions(-)
^^^^^^

Nice ! :-)

Cheers,
Ben.



[PATCH] powerpc/64s/hash: convert SLB miss handlers to C

2018-08-20 Thread Nicholas Piggin
This patch moves SLB miss handlers completely to C, using the standard
exception handler macros to set up the stack and branch to C.

This can be done because the segment containing the kernel stack is
always bolted, so accessing it with relocation on will not cause an
SLB exception.

Arbitrary kernel memory may not be accessed when handling kernel space
SLB misses, so care should be taken there. However user SLB misses can
access any kernel memory, which can be used to move some fields out of
the paca (in later patches).

User SLB misses could quite easily reconcile IRQs and set up a first
class kernel environment and exit via ret_from_except, however that
doesn't seem to be necessary at the moment, so we only do that if a
bad fault is encountered.

[ Credit to Aneesh for bug fixes, error checks, and improvements to bad
  address handling, etc ]

Signed-off-by: Nicholas Piggin 

Since RFC:
- Send patch 1 by itself to focus on the big change.
- Added MSR[RI] handling
- Fixed up a register loss bug exposed by irq tracing (Aneesh)
- Reject misses outside the defined kernel regions (Aneesh)
- Added several more sanity checks and error handlig (Aneesh), we may
  look at consolidating these tests and tightenig up the code but for
  a first pass we decided it's better to check carefully.
---
 arch/powerpc/include/asm/asm-prototypes.h |   2 +
 arch/powerpc/kernel/exceptions-64s.S  | 202 +++--
 arch/powerpc/mm/Makefile  |   2 +-
 arch/powerpc/mm/slb.c | 257 +
 arch/powerpc/mm/slb_low.S | 335 --
 5 files changed, 185 insertions(+), 613 deletions(-)
 delete mode 100644 arch/powerpc/mm/slb_low.S

diff --git a/arch/powerpc/include/asm/asm-prototypes.h 
b/arch/powerpc/include/asm/asm-prototypes.h
index 1f4691ce4126..78ed3c3f879a 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -78,6 +78,8 @@ void kernel_bad_stack(struct pt_regs *regs);
 void system_reset_exception(struct pt_regs *regs);
 void machine_check_exception(struct pt_regs *regs);
 void emulation_assist_interrupt(struct pt_regs *regs);
+long do_slb_fault(struct pt_regs *regs, unsigned long ea);
+void do_bad_slb_fault(struct pt_regs *regs, unsigned long ea, long err);
 
 /* signals, syscalls and interrupts */
 long sys_swapcontext(struct ucontext __user *old_ctx,
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index ea04dfb8c092..c3832243819b 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -566,28 +566,36 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 
 
 EXC_REAL_BEGIN(data_access_slb, 0x380, 0x80)
-   SET_SCRATCH0(r13)
-   EXCEPTION_PROLOG_0(PACA_EXSLB)
-   EXCEPTION_PROLOG_1(PACA_EXSLB, KVMTEST_PR, 0x380)
-   mr  r12,r3  /* save r3 */
-   mfspr   r3,SPRN_DAR
-   mfspr   r11,SPRN_SRR1
-   crset   4*cr6+eq
-   BRANCH_TO_COMMON(r10, slb_miss_common)
+EXCEPTION_PROLOG(PACA_EXSLB, data_access_slb_common, EXC_STD, KVMTEST_PR, 
0x380);
 EXC_REAL_END(data_access_slb, 0x380, 0x80)
 
 EXC_VIRT_BEGIN(data_access_slb, 0x4380, 0x80)
-   SET_SCRATCH0(r13)
-   EXCEPTION_PROLOG_0(PACA_EXSLB)
-   EXCEPTION_PROLOG_1(PACA_EXSLB, NOTEST, 0x380)
-   mr  r12,r3  /* save r3 */
-   mfspr   r3,SPRN_DAR
-   mfspr   r11,SPRN_SRR1
-   crset   4*cr6+eq
-   BRANCH_TO_COMMON(r10, slb_miss_common)
+EXCEPTION_RELON_PROLOG(PACA_EXSLB, data_access_slb_common, EXC_STD, NOTEST, 
0x380);
 EXC_VIRT_END(data_access_slb, 0x4380, 0x80)
+
 TRAMP_KVM_SKIP(PACA_EXSLB, 0x380)
 
+EXC_COMMON_BEGIN(data_access_slb_common)
+   mfspr   r10,SPRN_DAR
+   std r10,PACA_EXSLB+EX_DAR(r13)
+   EXCEPTION_PROLOG_COMMON(0x380, PACA_EXSLB)
+   ld  r4,PACA_EXSLB+EX_DAR(r13)
+   std r4,_DAR(r1)
+   addir3,r1,STACK_FRAME_OVERHEAD
+   bl  do_slb_fault
+   cmpdi   r3,0
+   bne-1f
+   b   fast_exception_return
+1: /* Error case */
+   std r3,RESULT(r1)
+   bl  save_nvgprs
+   RECONCILE_IRQ_STATE(r10, r11)
+   ld  r4,_DAR(r1)
+   ld  r5,RESULT(r1)
+   addir3,r1,STACK_FRAME_OVERHEAD
+   bl  do_bad_slb_fault
+   b   ret_from_except
+
 
 EXC_REAL(instruction_access, 0x400, 0x80)
 EXC_VIRT(instruction_access, 0x4400, 0x80, 0x400)
@@ -610,160 +618,34 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 
 
 EXC_REAL_BEGIN(instruction_access_slb, 0x480, 0x80)
-   SET_SCRATCH0(r13)
-   EXCEPTION_PROLOG_0(PACA_EXSLB)
-   EXCEPTION_PROLOG_1(PACA_EXSLB, KVMTEST_PR, 0x480)
-   mr  r12,r3  /* save r3 */
-   mfspr   r3,SPRN_SRR0/* SRR0 is faulting address */
-   mfspr   r11,SPRN_SRR1
-   crclr   4*cr6+eq
-   BRANCH_TO_COMMON(r10, slb_miss_common)
+EXCEPTION_PROLOG(PACA_EXSLB, instruction_access_slb_common, EXC_STD, 
KVMTEST_PR, 0x480);