On Wed, Jul 20, 2011 at 03:57:51PM -0600, David Ahern wrote:

> I am hoping someone familiar with PPC can help understand a panic that
> is generated when capturing callchains with context switch events.
>
> Call trace is below. The short of it is that walking the callchain
> generates a page fault. To handle the page fault the mmap_sem is needed,
> but it is currently held by setup_arg_pages. setup_arg_pages calls
> shift_arg_pages with the mmap_sem held. shift_arg_pages then calls
> move_page_tables which has a cond_resched at the top of its for loop. If
> the cond_resched() is removed from move_page_tables everything works
> beautifully - no panics.
>
> So, the question: is it normal for walking the stack to trigger a page
> fault on PPC? The panic is not seen on x86 based systems.

Walking the user stack can certainly generate a page fault; walking
the kernel stack should never generate a page fault. If any page
fault is generated reading the user stack frame, we're supposed to
detect that and fall back to walking the page tables manually (see
read_user_stack_64() in arch/powerpc/kernel/perf_callchain.c). I
think I need to check our __get_user_inatomic() implementation.

I don't think removing the cond_resched() from move_page_tables is the
right answer.

Paul.
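
For illustration only, the "try a non-faulting read, then fall back to a manual lookup" pattern described above can be modelled in plain userspace C. This is a sketch under stated assumptions, not the kernel code: struct sim_mm, the sim_* functions, and the resident/mapped flags are all hypothetical stand-ins. The fast path plays the role of a read done with page faults disabled, and the slow path plays the role of looking the address up in the page tables by hand.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy address space: four small "pages" of backing store. */
#define SIM_PAGE_SIZE 64u
#define SIM_NPAGES    4u

struct sim_mm {
    uint8_t backing[SIM_NPAGES * SIM_PAGE_SIZE];
    int resident[SIM_NPAGES]; /* 1: fast path can read without faulting */
    int mapped[SIM_NPAGES];   /* 1: page exists in the (simulated) page tables */
};

/* Fast path: models a read with page faults disabled. It never blocks;
 * if the access would have faulted it reports -EFAULT instead. */
static int sim_read_fast(const struct sim_mm *mm, size_t addr, uint64_t *ret)
{
    size_t page = addr / SIM_PAGE_SIZE;

    if (!mm->resident[page])
        return -EFAULT;
    memcpy(ret, &mm->backing[addr], sizeof(*ret));
    return 0;
}

/* Slow path: models a manual page-table lookup. It takes no locks and
 * cannot fault; it fails only if the page is not mapped at all. */
static int sim_read_slow(const struct sim_mm *mm, size_t addr, uint64_t *ret)
{
    size_t page = addr / SIM_PAGE_SIZE;

    if (!mm->mapped[page])
        return -EFAULT;
    memcpy(ret, &mm->backing[addr], sizeof(*ret));
    return 0;
}

/* Read one 64-bit stack slot: validate the address, try the fast path,
 * and fall back to the manual lookup if the fast path reports a fault. */
static int sim_read_user_stack_64(const struct sim_mm *mm, size_t addr,
                                  uint64_t *ret)
{
    assert(mm != NULL && ret != NULL);

    /* Reject out-of-range or misaligned addresses up front. */
    if (addr > sizeof(mm->backing) - sizeof(*ret) || (addr & 7))
        return -EFAULT;

    if (sim_read_fast(mm, addr, ret) == 0)
        return 0;

    return sim_read_slow(mm, addr, ret);
}
```

In this model, a page that is mapped but not resident is still readable through sim_read_user_stack_64(), because the fast path's failure is caught and the slow path takes over. Neither path ever needs to take a lock or sleep, which is the property the callchain walker depends on.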
