Re: [PATCH 2/2] ftrace: support for PowerPC

2008-05-20 Thread Michael Ellerman
On Wed, 2008-05-14 at 23:49 -0400, Steven Rostedt wrote:
 plain text document attachment (ftrace-powerpc-port.patch)
 This patch adds full support for ftrace for PowerPC (both 64 and 32 bit).
 This includes dynamic tracing and function filtering.

Hi Steven,

Just a few comments inline ..

 Index: linux-sched-devel.git/arch/powerpc/kernel/Makefile
 ===================================================================
 --- linux-sched-devel.git.orig/arch/powerpc/kernel/Makefile	2008-05-14 19:30:53.000000000 -0700
 +++ linux-sched-devel.git/arch/powerpc/kernel/Makefile	2008-05-14 19:31:56.000000000 -0700
 @@ -12,6 +12,18 @@ CFLAGS_prom_init.o  += -fPIC
  CFLAGS_btext.o   += -fPIC
  endif
  
 +ifdef CONFIG_FTRACE
 +# Do not trace early boot code
 +CFLAGS_REMOVE_cputable.o = -pg
 +CFLAGS_REMOVE_prom_init.o = -pg

Why do we not want to trace early boot? Just because it's not useful? 

 Index: linux-sched-devel.git/arch/powerpc/kernel/entry_32.S
 ===================================================================
 --- linux-sched-devel.git.orig/arch/powerpc/kernel/entry_32.S	2008-05-14 19:30:50.000000000 -0700
 +++ linux-sched-devel.git/arch/powerpc/kernel/entry_32.S	2008-05-14 19:31:56.000000000 -0700
 @@ -1035,3 +1035,133 @@ machine_check_in_rtas:
   /* XXX load up BATs and panic */
  
.. snip

 +_GLOBAL(mcount)
 +_GLOBAL(_mcount)
 +	stwu	r1,-48(r1)
 + stw r3, 12(r1)
 + stw r4, 16(r1)
 + stw r5, 20(r1)
 + stw r6, 24(r1)
 +	mflr	r3
 + lwz r4, 52(r1)
 +	mfcr	r5
 + stw r7, 28(r1)
 + stw r8, 32(r1)
 + stw r9, 36(r1)
 + stw r10,40(r1)
 + stw r3, 44(r1)
 + stw r5, 8(r1)
 +
 + LOAD_REG_ADDR(r5, ftrace_trace_function)
 +#if 0
 + mtctr   r3
 + mr  r1, r5
 + bctrl
 +#endif
 + lwz r5,0(r5)
 +#if 1
 + mtctr   r5
 + bctrl
 +#else
 + bl  ftrace_stub
 +#endif

#if 0, #if 1 ?

 Index: linux-sched-devel.git/arch/powerpc/kernel/ftrace.c
 ===================================================================
 --- /dev/null	1970-01-01 00:00:00.000000000 +0000
 +++ linux-sched-devel.git/arch/powerpc/kernel/ftrace.c	2008-05-14 19:31:56.000000000 -0700
 @@ -0,0 +1,165 @@
 +/*
 + * Code for replacing ftrace calls with jumps.
 + *
 + * Copyright (C) 2007-2008 Steven Rostedt [EMAIL PROTECTED]
 + *
 + * Thanks goes out to P.A. Semi, Inc for supplying me with a PPC64 box.
 + *
 + */
 +
 +#include <linux/spinlock.h>
 +#include <linux/hardirq.h>
 +#include <linux/ftrace.h>
 +#include <linux/percpu.h>
 +#include <linux/init.h>
 +#include <linux/list.h>
 +
 +#include <asm/cacheflush.h>
 +
 +#define CALL_BACK	4

I don't grok what you're doing with CALL_BACK, you add it in places and
subtract in others - and it looks like you could do neither, but I haven't
gone over it in detail.

 +static unsigned int ftrace_nop = 0x60000000;

I should really add a #define for that.
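For reference, that value is the architected PowerPC nop, "ori r0,r0,0"; a
define along the lines Michael suggests might look like this (PPC_NOP_INSTR
is a made-up name, not something in the tree):

	/* 0x60000000 encodes "ori r0,r0,0", the canonical PowerPC nop.
	 * PPC_NOP_INSTR is a hypothetical name for the suggested define. */
	#define PPC_NOP_INSTR	0x60000000

	static unsigned int ftrace_nop = PPC_NOP_INSTR;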

 +#ifdef CONFIG_PPC32
 +# define GET_ADDR(addr) addr
 +#else
 +/* PowerPC64's functions are data that points to the functions */
 +# define GET_ADDR(addr) *(unsigned long *)addr
 +#endif

And that.
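For background, GET_ADDR exists because 64-bit PowerPC (the ELFv1 ABI of the
day) makes a function symbol point at a descriptor rather than at code; a
sketch of what gets dereferenced (the struct and helper names here are
illustrative, not from the patch):

	/* A PPC64 function symbol resolves to a descriptor in the .opd
	 * section; the first doubleword holds the real text address,
	 * which is why GET_ADDR dereferences on PPC64 and is a no-op
	 * on PPC32. */
	struct ppc64_func_desc {
		unsigned long entry;	/* address of the function's code */
		unsigned long toc;	/* TOC base the function expects in r2 */
		unsigned long env;	/* environment pointer, unused by C */
	};

	static inline unsigned long ppc64_function_entry(void *func)
	{
		return ((struct ppc64_func_desc *)func)->entry;
	}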

.. snip

 +notrace unsigned char *ftrace_call_replace(unsigned long ip, unsigned long 
 addr)
 +{
 + static unsigned int op;
 +
 + addr = GET_ADDR(addr);
 +
 +	/* Set to "bl addr" */
 +	op = 0x48000001 | (ftrace_calc_offset(ip, addr) & 0x03fffffe);

0x03fffffe should be 0x03fffffc: if you set bit 1 you'll end up with a bla
instruction, ie. branch absolute and link. That shouldn't happen as long as
ip and addr are properly aligned, but still.

In fact I think you should just use create_function_call() or create_branch()
from include/asm-powerpc/system.h.
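For illustration, assuming the create_branch() that include/asm-powerpc/system.h
provided at the time (it takes the address the instruction will live at, the
branch target, and flags; BRANCH_SET_LINK sets the link bit, and a zero return
means the target is out of range), the replace function might be sketched as:

	#include <asm/system.h>	/* create_branch(), BRANCH_SET_LINK */

	/* Sketch only: build the "bl addr" instruction with the existing
	 * helper instead of open-coding the opcode and offset mask.
	 * GET_ADDR and the signature come from the patch under review. */
	notrace unsigned char *
	ftrace_call_replace(unsigned long ip, unsigned long addr)
	{
		static unsigned int op;

		addr = GET_ADDR(addr);

		/* Relative branch-and-link from ip to addr; yields 0 if
		 * addr is unreachable with a 26-bit displacement. */
		op = create_branch(ip, addr, BRANCH_SET_LINK);

		return (unsigned char *)&op;
	}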

 +#ifdef CONFIG_PPC64
 +# define _ASM_ALIGN	" .align 3 "
 +# define _ASM_PTR	" .llong "
 +#else
 +# define _ASM_ALIGN	" .align 2 "
 +# define _ASM_PTR	" .long "
 +#endif

We already have a #define for .long; it's called PPC_LONG (asm/asm-compat.h).

Perhaps we should add one for .align, PPC_LONG_ALIGN or something?
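Roughly what that could look like, mirroring how asm/asm-compat.h already
builds PPC_LONG with stringify_in_c() (PPC_LONG_ALIGN being the proposed,
not-yet-existing name):

	#include <asm/asm-compat.h>	/* stringify_in_c(), PPC_LONG */

	/* Proposed companion to PPC_LONG: pointer-size alignment usable
	 * in inline-asm strings, so ftrace.c needn't define _ASM_ALIGN. */
	#ifdef __powerpc64__
	#define PPC_LONG_ALIGN	stringify_in_c(.align 3)
	#else
	#define PPC_LONG_ALIGN	stringify_in_c(.align 2)
	#endif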

 +notrace int
 +ftrace_modify_code(unsigned long ip, unsigned char *old_code,
 +		   unsigned char *new_code)
 +{
 + unsigned replaced;
 + unsigned old = *(unsigned *)old_code;
 + unsigned new = *(unsigned *)new_code;
 + int faulted = 0;
 +
 + /* move the IP back to the start of the call */
 + ip -= CALL_BACK;
 +
 + /*
 +  * Note: Due to modules and __init, code can
 +  *  disappear and change, we need to protect against faulting
 +  *  as well as code changing.
 +  *
 +  * No real locking needed, this code is run through
 +  * kstop_machine.
 +  */
 +	asm volatile (
 +		"1: lwz		%1, 0(%2)\n"
 +		"   cmpw	%1, %5\n"
 +		"   bne		2f\n"
 +		"   stwu	%3, 0(%2)\n"
 +		"2:\n"
 +		".section .fixup, \"ax\"\n"
 +		"3:	li %0, 1\n"
 +		"	b 2b\n"

Re: [PATCH 2/2] ftrace: support for PowerPC

2008-05-20 Thread Benjamin Herrenschmidt

 
 Why do we not want to trace early boot? Just because it's not useful? 

Not running at the linked address... might be causing trouble.

Ben.




Re: [PATCH 2/2] ftrace: support for PowerPC

2008-05-20 Thread Steven Rostedt

On Wed, 21 May 2008, Michael Ellerman wrote:

 On Wed, 2008-05-14 at 23:49 -0400, Steven Rostedt wrote:
  plain text document attachment (ftrace-powerpc-port.patch)
  This patch adds full support for ftrace for PowerPC (both 64 and 32 bit).
  This includes dynamic tracing and function filtering.

 Hi Steven,

 Just a few comments inline ..

Hi Michael,

I really appreciate this. It's been a few years since I did any real PPC
programming, so any comments are most definitely welcome.



  Index: linux-sched-devel.git/arch/powerpc/kernel/Makefile
  ===================================================================
  --- linux-sched-devel.git.orig/arch/powerpc/kernel/Makefile	2008-05-14 19:30:53.000000000 -0700
  +++ linux-sched-devel.git/arch/powerpc/kernel/Makefile	2008-05-14 19:31:56.000000000 -0700
  @@ -12,6 +12,18 @@ CFLAGS_prom_init.o  += -fPIC
   CFLAGS_btext.o += -fPIC
   endif
 
  +ifdef CONFIG_FTRACE
  +# Do not trace early boot code
  +CFLAGS_REMOVE_cputable.o = -pg
  +CFLAGS_REMOVE_prom_init.o = -pg

 Why do we not want to trace early boot? Just because it's not useful?

The -pg flag adds calls to the mcount code. I didn't look too deeply, but
at least in my first prototypes the early boot-up code would crash when
calling mcount. I found that simply keeping those files from calling mcount
made things OK. Perhaps I'm just hiding the problem, but tracing won't
happen that early anyway; we need to set up memory before tracing starts.


  Index: linux-sched-devel.git/arch/powerpc/kernel/entry_32.S
  ===================================================================
  --- linux-sched-devel.git.orig/arch/powerpc/kernel/entry_32.S	2008-05-14 19:30:50.000000000 -0700
  +++ linux-sched-devel.git/arch/powerpc/kernel/entry_32.S	2008-05-14 19:31:56.000000000 -0700
  @@ -1035,3 +1035,133 @@ machine_check_in_rtas:
  /* XXX load up BATs and panic */
 
 ... snip

  +_GLOBAL(mcount)
  +_GLOBAL(_mcount)
  +	stwu	r1,-48(r1)
  +   stw r3, 12(r1)
  +   stw r4, 16(r1)
  +   stw r5, 20(r1)
  +   stw r6, 24(r1)
  +	mflr	r3
  +   lwz r4, 52(r1)
  +	mfcr	r5
  +   stw r7, 28(r1)
  +   stw r8, 32(r1)
  +   stw r9, 36(r1)
  +   stw r10,40(r1)
  +   stw r3, 44(r1)
  +   stw r5, 8(r1)
  +
  +   LOAD_REG_ADDR(r5, ftrace_trace_function)
  +#if 0
  +   mtctr   r3
  +   mr  r1, r5
  +   bctrl
  +#endif
  +   lwz r5,0(r5)
  +#if 1
  +   mtctr   r5
  +   bctrl
  +#else
  +   bl  ftrace_stub
  +#endif

 #if 0, #if 1 ?

Ouch! Thanks, that's leftover from debugging.


  Index: linux-sched-devel.git/arch/powerpc/kernel/ftrace.c
  ===================================================================
  --- /dev/null	1970-01-01 00:00:00.000000000 +0000
  +++ linux-sched-devel.git/arch/powerpc/kernel/ftrace.c	2008-05-14 19:31:56.000000000 -0700
  @@ -0,0 +1,165 @@
  +/*
  + * Code for replacing ftrace calls with jumps.
  + *
  + * Copyright (C) 2007-2008 Steven Rostedt [EMAIL PROTECTED]
  + *
  + * Thanks goes out to P.A. Semi, Inc for supplying me with a PPC64 box.
  + *
  + */
  +
  +#include <linux/spinlock.h>
  +#include <linux/hardirq.h>
  +#include <linux/ftrace.h>
  +#include <linux/percpu.h>
  +#include <linux/init.h>
  +#include <linux/list.h>
  +
  +#include <asm/cacheflush.h>
  +
  +#define CALL_BACK  4

 I don't grok what you're doing with CALL_BACK, you add it in places and
 subtract in others - and it looks like you could do neither, but I haven't
 gone over it in detail.

I tried hard to make most of the complex logic stay in generic code.

What dynamic ftrace does is this: at start-up the code is simply a nop. Then,
after ftrace initializes, it calls kstop_machine, which calls into the arch
code to convert the nop to a call to a record_ip function. That record_ip
function starts recording the return address of the mcount call
(__builtin_return_address(0)).

Later, once a second, the ftraced daemon wakes up and checks whether any new
functions have been recorded. If they have, it calls kstop_machine again and,
for each recorded function, passes in the address that was recorded.

The arch is responsible for knowing how to translate the
__builtin_return_address(0) into the address of the call site, so that it
can modify the code there.

On boot-up, all functions call mcount. The ftraced daemon will convert those
calls to nops, and when tracing is enabled they will be converted to point
directly to the tracing function.

This helps tremendously in making ftrace efficient.
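Concretely, the only arch-specific arithmetic in that translation is stepping
from the recorded link-register value back to the branch itself; a minimal
sketch (the helper name is invented here):

	/* A "bl _mcount" sets the link register to the instruction
	 * *after* the branch, so the call site that must be patched is
	 * one 4-byte instruction earlier. That fixed step is all that
	 * CALL_BACK encodes. */
	#define CALL_BACK	4

	static unsigned long mcount_call_site(unsigned long return_addr)
	{
		return return_addr - CALL_BACK;
	}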


  +static unsigned int ftrace_nop = 0x60000000;

 I should really add a #define for that.

  +#ifdef CONFIG_PPC32
  +# define GET_ADDR(addr) addr
  +#else
  +/* PowerPC64's functions are data that points to the functions */
  +# define GET_ADDR(addr) *(unsigned long *)addr
  +#endif

 And that.

 ... snip

  +notrace unsigned char *ftrace_call_replace(unsigned long ip, unsigned long 
  addr)
  +{
  +   static 

Re: [PATCH 2/2] ftrace: support for PowerPC

2008-05-20 Thread Steven Rostedt

On Tue, 20 May 2008, Benjamin Herrenschmidt wrote:

  Why do we not want to trace early boot? Just because it's not useful?

 Not running at the linked address... might be causing trouble.

I figured it was something like that, so I didn't look too deeply into it,
and decided that it was best just not to trace it.

-- Steve



Re: [PATCH 2/2] ftrace: support for PowerPC

2008-05-16 Thread Ingo Molnar

* Steven Rostedt [EMAIL PROTECTED] wrote:

 This patch adds full support for ftrace for PowerPC (both 64 and 32 
 bit). This includes dynamic tracing and function filtering.

applied, thanks. Nice stuff! :)

Ingo


Re: [PATCH 2/2] ftrace: support for PowerPC

2008-05-15 Thread Steven Rostedt

On Wed, 14 May 2008, David Miller wrote:

 From: Steven Rostedt [EMAIL PROTECTED]
 Date: Wed, 14 May 2008 23:49:44 -0400

  +#ifdef CONFIG_FTRACE
  +#ifdef CONFIG_DYNAMIC_FTRACE
  +_GLOBAL(mcount)
  +_GLOBAL(_mcount)
  +	stwu	r1,-48(r1)
  +   stw r3, 12(r1)
  +   stw r4, 16(r1)
  +   stw r5, 20(r1)
  +   stw r6, 24(r1)
  +	mflr	r3
  +   stw r7, 28(r1)
  +	mfcr	r5
  +   stw r8, 32(r1)
  +   stw r9, 36(r1)
  +   stw r10,40(r1)
  +   stw r3, 44(r1)
  +   stw r5, 8(r1)

 Yikes, that's really expensive.

Well, at least with dynamic ftrace, it's only expensive when tracing is
enabled.


 Can't you do a tail call and let the function you end
 up calling do all the callee-saved register pops onto
 the stack?

Not sure PPC has such a thing. I'm only a hobby PPC hacker (did it full
time in another life). If there is such a way, I'll be happy to Ack any
patches.


 That's what I did on sparc.


So that was your secret! ;-)

-- Steve



Re: [PATCH 2/2] ftrace: support for PowerPC

2008-05-15 Thread Scott Wood
On Wed, May 14, 2008 at 10:28:57PM -0700, David Miller wrote:
 From: Steven Rostedt [EMAIL PROTECTED]
 Date: Wed, 14 May 2008 23:49:44 -0400
 
  +#ifdef CONFIG_FTRACE
  +#ifdef CONFIG_DYNAMIC_FTRACE
  +_GLOBAL(mcount)
  +_GLOBAL(_mcount)
  +	stwu	r1,-48(r1)
  +   stw r3, 12(r1)
  +   stw r4, 16(r1)
  +   stw r5, 20(r1)
  +   stw r6, 24(r1)
  +	mflr	r3
  +   stw r7, 28(r1)
  +	mfcr	r5
  +   stw r8, 32(r1)
  +   stw r9, 36(r1)
  +   stw r10,40(r1)
  +   stw r3, 44(r1)
  +   stw r5, 8(r1)
 
 Yikes, that's really expensive.
 
 Can't you do a tail call and let the function you end
 up calling do all the callee-saved register pops onto
 the stack?

The PPC32 ABI seems to (unfortunately) specify that, when calling mcount, all
registers are callee-saved (except for the registers that may be modified
during function linkage, like r0, r11, and r12) -- so mcount has to save even
the registers that a normal callee wouldn't, because they're ordinarily
volatile.

-Scott


Re: [PATCH 2/2] ftrace: support for PowerPC

2008-05-14 Thread David Miller
From: Steven Rostedt [EMAIL PROTECTED]
Date: Wed, 14 May 2008 23:49:44 -0400

 +#ifdef CONFIG_FTRACE
 +#ifdef CONFIG_DYNAMIC_FTRACE
 +_GLOBAL(mcount)
 +_GLOBAL(_mcount)
 +	stwu	r1,-48(r1)
 + stw r3, 12(r1)
 + stw r4, 16(r1)
 + stw r5, 20(r1)
 + stw r6, 24(r1)
 +	mflr	r3
 + stw r7, 28(r1)
 +	mfcr	r5
 + stw r8, 32(r1)
 + stw r9, 36(r1)
 + stw r10,40(r1)
 + stw r3, 44(r1)
 + stw r5, 8(r1)

Yikes, that's really expensive.

Can't you do a tail call and let the function you end
up calling do all the callee-saved register pops onto
the stack?

That's what I did on sparc.