Re: [PATCH] perf: powerpc: Disable pagefaults during callchain stack read

2011-08-01 Thread Peter Zijlstra
On Sat, 2011-07-30 at 14:53 -0600, David Ahern wrote:
 A page fault occurred walking the callchain while creating a perf
 sample for the context-switch event. To handle the page fault the
 mmap_sem is needed, but it is currently held by setup_arg_pages.
 (setup_arg_pages calls shift_arg_pages with the mmap_sem held.
 shift_arg_pages then calls move_page_tables which has a cond_resched
 at the top of its for loop - hitting that cond_resched is what caused
 the context switch.)
 
 This is an extension of Anton's proposed patch:
 https://lkml.org/lkml/2011/7/24/151
 adding the case for 32-bit ppc.
 
 Tested on the system that first generated the panic and then again
 with the latest kernel using a PPC VM. I am not able to test the 64-bit
 path - I do not have H/W for it, and 64-bit PPC VMs (qemu on Intel)
 are horribly slow.
 
 Signed-off-by: David Ahern dsah...@gmail.com
 CC: Benjamin Herrenschmidt b...@kernel.crashing.org
 CC: Anton Blanchard an...@samba.org 

Hmm, Paul, didn't you fix something like this early on? Anyway, I've no
objections since I'm really not familiar enough with the PPC side of
things.


Re: [PATCH] perf: powerpc: Disable pagefaults during callchain stack read

2011-08-01 Thread Benjamin Herrenschmidt
On Mon, 2011-08-01 at 11:59 +0200, Peter Zijlstra wrote:
  Signed-off-by: David Ahern dsah...@gmail.com
  CC: Benjamin Herrenschmidt b...@kernel.crashing.org
  CC: Anton Blanchard an...@samba.org 
 
 Hmm, Paul, didn't you fix something like this early on? Anyway, I've no
 objections since I'm really not familiar enough with the PPC side of
 things.

I'm travelling, so I haven't had a chance to review properly or even test,
but it looks like an ad-hoc fix for the immediate problem.

Ultimately, I want to rework that stuff to do a __gup_fast like x86 does
(maybe as a fallback from an attempt at access first) so we can work around
access permissions being blocked by the lack of dirty/accessed bits, but in
the meantime this should fix the immediate issue.
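
Roughly, the shape I have in mind is something like the untested sketch
below - the helper name, the single-argument kmap_atomic() and whether an
irq-safe __get_user_pages_fast() is actually usable from this path on
powerpc are all assumptions here, modelled on what x86 does in
copy_from_user_nmi():

static int read_user_stack_gup(const void __user *ptr, void *buf, int nb)
{
	unsigned long addr = (unsigned long)ptr;
	unsigned long offset = addr & ~PAGE_MASK;
	struct page *page;
	void *map;

	/*
	 * Pin the page by walking the page tables directly: no mmap_sem,
	 * and nothing gets faulted in. nb is 4 or 8 and ptr is aligned,
	 * so the read never crosses a page boundary.
	 */
	if (__get_user_pages_fast(addr, 1, 0, &page) != 1)
		return -EFAULT;

	map = kmap_atomic(page);
	memcpy(buf, map + offset, nb);
	kunmap_atomic(map);
	put_page(page);

	return 0;
}

The point being that a resident page whose dirty/accessed bits aren't set
yet can still be read this way, where the plain __get_user_inatomic()
access currently faults and bails out.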

Cheers,
Ben.


Re: [PATCH] perf: powerpc: Disable pagefaults during callchain stack read

2011-08-01 Thread David Ahern


On 08/01/2011 04:39 AM, Benjamin Herrenschmidt wrote:
 On Mon, 2011-08-01 at 11:59 +0200, Peter Zijlstra wrote:
 Signed-off-by: David Ahern dsah...@gmail.com
 CC: Benjamin Herrenschmidt b...@kernel.crashing.org
 CC: Anton Blanchard an...@samba.org 

 Hmm, Paul, didn't you fix something like this early on? Anyway, I've no
 objections since I'm really not familiar enough with the PPC side of
 things.
 
 I'm travelling, so I haven't had a chance to review properly or even test,
 but it looks like an ad-hoc fix for the immediate problem.
 
 Ultimately, I want to rework that stuff to do a __gup_fast like x86 does
 (maybe as a fallback from an attempt at access first) so we can work around
 access permissions being blocked by the lack of dirty/accessed bits, but in
 the meantime this should fix the immediate issue.

The problem goes back to all kernel releases with perf, so this patch
should get applied to the stable trains too.

David

 
 Cheers,
 Ben.
 
 
[PATCH] perf: powerpc: Disable pagefaults during callchain stack read

2011-07-30 Thread David Ahern
Panic observed on an older kernel when collecting call chains for
the context-switch software event:

 [b0180e00]rb_erase+0x1b4/0x3e8
 [b00430f4]__dequeue_entity+0x50/0xe8
 [b0043304]set_next_entity+0x178/0x1bc
 [b0043440]pick_next_task_fair+0xb0/0x118
 [b02ada80]schedule+0x500/0x614
 [b02afaa8]rwsem_down_failed_common+0xf0/0x264
 [b02afca0]rwsem_down_read_failed+0x34/0x54
 [b02aed4c]down_read+0x3c/0x54
 [b0023b58]do_page_fault+0x114/0x5e8
 [b001e350]handle_page_fault+0xc/0x80
 [b0022dec]perf_callchain+0x224/0x31c
 [b009ba70]perf_prepare_sample+0x240/0x2fc
 [b009d760]__perf_event_overflow+0x280/0x398
 [b009d914]perf_swevent_overflow+0x9c/0x10c
 [b009db54]perf_swevent_ctx_event+0x1d0/0x230
 [b009dc38]do_perf_sw_event+0x84/0xe4
 [b009dde8]perf_sw_event_context_switch+0x150/0x1b4
 [b009de90]perf_event_task_sched_out+0x44/0x2d4
 [b02ad840]schedule+0x2c0/0x614
 [b0047dc0]__cond_resched+0x34/0x90
 [b02adcc8]_cond_resched+0x4c/0x68
 [b00bccf8]move_page_tables+0xb0/0x418
 [b00d7ee0]setup_arg_pages+0x184/0x2a0
 [b0110914]load_elf_binary+0x394/0x1208
 [b00d6e28]search_binary_handler+0xe0/0x2c4
 [b00d834c]do_execve+0x1bc/0x268
 [b0015394]sys_execve+0x84/0xc8
 [b001df10]ret_from_syscall+0x0/0x3c

A page fault occurred walking the callchain while creating a perf
sample for the context-switch event. To handle the page fault the
mmap_sem is needed, but it is currently held by setup_arg_pages.
(setup_arg_pages calls shift_arg_pages with the mmap_sem held.
shift_arg_pages then calls move_page_tables which has a cond_resched
at the top of its for loop - hitting that cond_resched is what caused
the context switch.)
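
The fix below wraps the user accesses in pagefault_disable()/
pagefault_enable(). As a minimal illustration (the variable names are only
for exposition; the real change is in the diff), a faulting read then
returns -EFAULT through the exception fixup instead of trying to fault the
page in, which is what needs the mmap_sem:

	pagefault_disable();
	err = __get_user_inatomic(val, sp);	/* can no longer block on mmap_sem */
	pagefault_enable();

	if (err)
		return -EFAULT;		/* or fall back to read_user_stack_slow() */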

This is an extension of Anton's proposed patch:
https://lkml.org/lkml/2011/7/24/151
adding the case for 32-bit ppc.

Tested on the system that first generated the panic and then again
with the latest kernel using a PPC VM. I am not able to test the 64-bit
path - I do not have H/W for it, and 64-bit PPC VMs (qemu on Intel)
are horribly slow.

Signed-off-by: David Ahern dsah...@gmail.com
CC: Benjamin Herrenschmidt b...@kernel.crashing.org
CC: Anton Blanchard an...@samba.org
CC: Peter Zijlstra a.p.zijls...@chello.nl
CC: Paul Mackerras pau...@samba.org
CC: Ingo Molnar mi...@elte.hu
CC: Arnaldo Carvalho de Melo a...@ghostprotocols.net
CC: linuxppc-dev@lists.ozlabs.org
CC: linux-ker...@vger.kernel.org

---
 arch/powerpc/kernel/perf_callchain.c |   20 +++++++++++++++++---
 1 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/perf_callchain.c b/arch/powerpc/kernel/perf_callchain.c
index d05ae42..564c1d8 100644
--- a/arch/powerpc/kernel/perf_callchain.c
+++ b/arch/powerpc/kernel/perf_callchain.c
@@ -154,8 +154,12 @@ static int read_user_stack_64(unsigned long __user *ptr, unsigned long *ret)
 	    ((unsigned long)ptr & 7))
return -EFAULT;
 
-   if (!__get_user_inatomic(*ret, ptr))
+   pagefault_disable();
+   if (!__get_user_inatomic(*ret, ptr)) {
+   pagefault_enable();
return 0;
+   }
+   pagefault_enable();
 
return read_user_stack_slow(ptr, ret, 8);
 }
@@ -166,8 +170,12 @@ static int read_user_stack_32(unsigned int __user *ptr, unsigned int *ret)
 	    ((unsigned long)ptr & 3))
return -EFAULT;
 
-   if (!__get_user_inatomic(*ret, ptr))
+   pagefault_disable();
+   if (!__get_user_inatomic(*ret, ptr)) {
+   pagefault_enable();
return 0;
+   }
+   pagefault_enable();
 
return read_user_stack_slow(ptr, ret, 4);
 }
@@ -294,11 +302,17 @@ static inline int current_is_64bit(void)
  */
 static int read_user_stack_32(unsigned int __user *ptr, unsigned int *ret)
 {
+   int rc;
+
 	if ((unsigned long)ptr > TASK_SIZE - sizeof(unsigned int) ||
 	    ((unsigned long)ptr & 3))
return -EFAULT;
 
-   return __get_user_inatomic(*ret, ptr);
+   pagefault_disable();
+   rc = __get_user_inatomic(*ret, ptr);
+   pagefault_enable();
+
+   return rc;
 }
 
 static inline void perf_callchain_user_64(struct perf_callchain_entry *entry,
-- 
1.7.6


[PATCH] perf: powerpc: Disable pagefaults during callchain stack read

2011-07-24 Thread Anton Blanchard
Hi David,

  I am hoping someone familiar with PPC can help understand a panic
  that is generated when capturing callchains with context switch
  events.
  
  Call trace is below. The short of it is that walking the callchain
  generates a page fault. To handle the page fault the mmap_sem is
  needed, but it is currently held by setup_arg_pages.
  setup_arg_pages calls shift_arg_pages with the mmap_sem held.
  shift_arg_pages then calls move_page_tables which has a
  cond_resched at the top of its for loop. If the cond_resched() is
  removed from move_page_tables everything works beautifully - no
  panics.
  
  So, the question: is it normal for walking the stack to trigger a
  page fault on PPC? The panic is not seen on x86 based systems.
 
 Can anyone confirm whether page faults while walking the stack are
 normal for PPC? We really want to use the context switch event with
 callchains and need to understand whether this behavior is normal. Of
 course if it is normal, a way to address the problem without a panic
 will be needed.

I talked to Ben about this last week and he pointed me at
pagefault_disable/enable. Untested patch below.

Anton

--

We need to disable pagefaults when reading the stack, otherwise
we can lock up trying to take the mmap_sem when the code we are
profiling already has a write lock taken.

This will not happen for hardware events, but could for software
events.

Reported-by: David Ahern dsah...@gmail.com
Signed-off-by: Anton Blanchard an...@samba.org
Cc: sta...@kernel.org
---

Index: linux-powerpc/arch/powerpc/kernel/perf_callchain.c
===================================================================
--- linux-powerpc.orig/arch/powerpc/kernel/perf_callchain.c	2011-07-25 09:54:27.296757427 +1000
+++ linux-powerpc/arch/powerpc/kernel/perf_callchain.c	2011-07-25 09:56:08.828367882 +1000
@@ -154,8 +154,12 @@ static int read_user_stack_64(unsigned l
 	    ((unsigned long)ptr & 7))
return -EFAULT;
 
-   if (!__get_user_inatomic(*ret, ptr))
+   pagefault_disable();
+   if (!__get_user_inatomic(*ret, ptr)) {
+   pagefault_enable();
return 0;
+   }
+   pagefault_enable();
 
return read_user_stack_slow(ptr, ret, 8);
 }
@@ -166,8 +170,12 @@ static int read_user_stack_32(unsigned i
 	    ((unsigned long)ptr & 3))
return -EFAULT;
 
-   if (!__get_user_inatomic(*ret, ptr))
+   pagefault_disable();
+   if (!__get_user_inatomic(*ret, ptr)) {
+   pagefault_enable();
return 0;
+   }
+   pagefault_enable();
 
return read_user_stack_slow(ptr, ret, 4);
 }