Re: [PATCHv4] arm: ftrace: Adds support for CONFIG_DYNAMIC_FTRACE_WITH_REGS
On Thu, Mar 02, 2017 at 09:01:24PM +, Abel Vesa wrote: > On Tue, Feb 28, 2017 at 11:54:29AM +, Abel Vesa wrote: > > On Tue, Feb 28, 2017 at 11:46:38AM +, Russell King - ARM Linux wrote: > > > On Tue, Feb 28, 2017 at 11:22:27AM +, Abel Vesa wrote: > > > > On Tue, Feb 28, 2017 at 11:58:49AM +0100, Nicolai Stange wrote: > > > > > Hi Abel, > > > > > > > > > > On Tue, Feb 28 2017, Abel Vesa wrote: > > > > > > > > > > > On Mon, Feb 27, 2017 at 04:52:06PM +0100, Nicolai Stange wrote: > > > > > >> On Fri, Feb 24 2017, Abel Vesa wrote: > > > > > >> Wouldn't it be better (and more consistent with other archs) to > > > > > >> have > > > > > >> > > > > > >> pt_regs->ARM_lr = original lr > > > > > >> pt_refs->ARM_pc = current lr > > > > > >> > > > > > >> instead? > > > > > > > > > > The stack would look like this then > > > > > > > > > > @ ... | ARM_ip | ARM_sp | ARM_lr | ARM_pc | ... > > > > >| > > > > > @ 0 4 48 52 566064 > > > > > 68 72 > > > > > @ R0 | R1 | ... | LR | SP + 4 | original LR | original PC | PSR | > > > > > OLD_R0 | original LR | > Just to make sure we're on the same page. If we are replacing the LR > with the original_LR is it worth keeping around the one pushed before > the ftrace_regs_caller is called? > > Another thing, PC needs to be new_LR and then we can restore all > regs r0 through r15 like this: > > ldmia sp, {r0-r15} That's the intention - the point is to save the state as it was at the point that the function was entered, not at the point when the ftrace code was entered. What we don't want is the implementation details of GCC's mcount or ftrace's adaption of mcount being exposed via ftrace. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
Re: [PATCHv4] arm: ftrace: Adds support for CONFIG_DYNAMIC_FTRACE_WITH_REGS
On Tue, Feb 28, 2017 at 11:54:29AM +, Abel Vesa wrote: > On Tue, Feb 28, 2017 at 11:46:38AM +, Russell King - ARM Linux wrote: > > On Tue, Feb 28, 2017 at 11:22:27AM +, Abel Vesa wrote: > > > On Tue, Feb 28, 2017 at 11:58:49AM +0100, Nicolai Stange wrote: > > > > Hi Abel, > > > > > > > > On Tue, Feb 28 2017, Abel Vesa wrote: > > > > > > > > > On Mon, Feb 27, 2017 at 04:52:06PM +0100, Nicolai Stange wrote: > > > > >> On Fri, Feb 24 2017, Abel Vesa wrote: > > > > >> Wouldn't it be better (and more consistent with other archs) to have > > > > >> > > > > >> pt_regs->ARM_lr = original lr > > > > >> pt_refs->ARM_pc = current lr > > > > >> > > > > >> instead? > > > > > > > > The stack would look like this then > > > > > > > > @ ... | ARM_ip | ARM_sp | ARM_lr | ARM_pc | ... > > > > | > > > > @ 0 4 48 52 56606468 > > > > 72 > > > > @ R0 | R1 | ... | LR | SP + 4 | original LR | original PC | PSR | > > > > OLD_R0 | original LR | Just to make sure we're on the same page. If we are replacing the LR with the original_LR is it worth keeping around the one pushed before the ftrace_regs_caller is called? Another thing, PC needs to be new_LR and then we can restore all regs r0 through r15 like this: ldmia sp, {r0-r15} > > > > > > > > I.e. the pt_regs would capture almost the full context of the > > > > instrumented function (except for ip). > > > > > > > So basicly what you are saying is: > > > - instead of current LR save original LR (previous one saved in > > > instrumented function epilog) > > > - instead of current PC save original PC (previous one saved in > > > instrumented function epilog) > > > > > > I still don't see the point of saving the actual value of PC since nobody > > > will ever > > > restore it. In case of livepatch it will get overwritten anyway. As for > > > LR, I agree, > > > it could be the original one in pt_regs. > > > > > > I'll look into this sometime today or tomorrow and get back with updates. > > > > Which is exactly what I proposed, with code, on one of the previous > > iterations of this patch... > Fair enough. I probably missunderstood your comments then. > > Thanks. > > > > -- > > RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ > > FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up > > according to speedtest.net.
Re: [PATCHv4] arm: ftrace: Adds support for CONFIG_DYNAMIC_FTRACE_WITH_REGS
On Tue, Feb 28, 2017 at 11:22:27AM +, Abel Vesa wrote: > On Tue, Feb 28, 2017 at 11:58:49AM +0100, Nicolai Stange wrote: > > Hi Abel, > > > > On Tue, Feb 28 2017, Abel Vesa wrote: > > > > > On Mon, Feb 27, 2017 at 04:52:06PM +0100, Nicolai Stange wrote: > > >> On Fri, Feb 24 2017, Abel Vesa wrote: > > >> Wouldn't it be better (and more consistent with other archs) to have > > >> > > >> pt_regs->ARM_lr = original lr > > >> pt_refs->ARM_pc = current lr > > >> > > >> instead? > > > > The stack would look like this then > > > > @ ... | ARM_ip | ARM_sp | ARM_lr | ARM_pc | ... > > | > > @ 0 4 48 52 56606468 > > 72 > > @ R0 | R1 | ... | LR | SP + 4 | original LR | original PC | PSR | > > OLD_R0 | original LR | > > > > I.e. the pt_regs would capture almost the full context of the > > instrumented function (except for ip). > > > So basicly what you are saying is: > - instead of current LR save original LR (previous one saved in instrumented > function epilog) > - instead of current PC save original PC (previous one saved in instrumented > function epilog) > > I still don't see the point of saving the actual value of PC since nobody > will ever > restore it. In case of livepatch it will get overwritten anyway. As for LR, I > agree, > it could be the original one in pt_regs. > > I'll look into this sometime today or tomorrow and get back with updates. Which is exactly what I proposed, with code, on one of the previous iterations of this patch... -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
Re: [PATCHv4] arm: ftrace: Adds support for CONFIG_DYNAMIC_FTRACE_WITH_REGS
On Tue, Feb 28, 2017 at 11:46:38AM +, Russell King - ARM Linux wrote: > On Tue, Feb 28, 2017 at 11:22:27AM +, Abel Vesa wrote: > > On Tue, Feb 28, 2017 at 11:58:49AM +0100, Nicolai Stange wrote: > > > Hi Abel, > > > > > > On Tue, Feb 28 2017, Abel Vesa wrote: > > > > > > > On Mon, Feb 27, 2017 at 04:52:06PM +0100, Nicolai Stange wrote: > > > >> On Fri, Feb 24 2017, Abel Vesa wrote: > > > >> Wouldn't it be better (and more consistent with other archs) to have > > > >> > > > >> pt_regs->ARM_lr = original lr > > > >> pt_refs->ARM_pc = current lr > > > >> > > > >> instead? > > > > > > The stack would look like this then > > > > > > @ ... | ARM_ip | ARM_sp | ARM_lr | ARM_pc | ... > > >| > > > @ 0 4 48 52 56606468 > > >72 > > > @ R0 | R1 | ... | LR | SP + 4 | original LR | original PC | PSR | > > > OLD_R0 | original LR | > > > > > > I.e. the pt_regs would capture almost the full context of the > > > instrumented function (except for ip). > > > > > So basicly what you are saying is: > > - instead of current LR save original LR (previous one saved in > > instrumented function epilog) > > - instead of current PC save original PC (previous one saved in > > instrumented function epilog) > > > > I still don't see the point of saving the actual value of PC since nobody > > will ever > > restore it. In case of livepatch it will get overwritten anyway. As for LR, > > I agree, > > it could be the original one in pt_regs. > > > > I'll look into this sometime today or tomorrow and get back with updates. > > Which is exactly what I proposed, with code, on one of the previous > iterations of this patch... Fair enough. I probably missunderstood your comments then. Thanks. > > -- > RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ > FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up > according to speedtest.net.
Re: [PATCHv4] arm: ftrace: Adds support for CONFIG_DYNAMIC_FTRACE_WITH_REGS
On Tue, Feb 28, 2017 at 11:58:49AM +0100, Nicolai Stange wrote: > Hi Abel, > > On Tue, Feb 28 2017, Abel Vesa wrote: > > > On Mon, Feb 27, 2017 at 04:52:06PM +0100, Nicolai Stange wrote: > >> On Fri, Feb 24 2017, Abel Vesa wrote: > >> > >> > >> > >> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > >> > index fda6a46..877df5b 100644 > >> > --- a/arch/arm/Kconfig > >> > +++ b/arch/arm/Kconfig > >> > @@ -55,6 +55,7 @@ config ARM > >> > select HAVE_DMA_API_DEBUG > >> > select HAVE_DMA_CONTIGUOUS if MMU > >> > select HAVE_DYNAMIC_FTRACE if (!XIP_KERNEL) && !CPU_ENDIAN_BE32 > >> > && MMU > >> > +select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE && > >> > OLD_MCOUNT > >> > >> > >> AFAICS, your code depends on the __gnu_mcount_nc calling conventions, > >> i.e. on gcc pushing the original lr on the stack. In particular, there's > >> no implementation of a ftrace_regs_caller_old or so. > >> > >> So shouldn't this read as !OLD_MCOUNT instead? > > The implementation of __ftrace_modify_code which sets the kernel text to rw > > needs OLD_MCOUNT (that is, the arch specific one, not the generic one). > > You're right that ARM's implementation of __ftrace_modify_code() is hidden > within an #ifdef CONFIG_OLD_MCOUNT. > > But, > - its implementation doesn't "need" or depend on OLD_MCOUNT > - and it's true in general that the kernel text must be made writable >before ftrace_modify_all_code() attempts to write to it. > > So, I bet that the set_kernel_text_rw()-calling ARM implementations of > arch_ftrace_update_code() and __ftrace_modify_code() resp. have been > inserted under that CONFIG_OLD_MCOUNT #ifdef by mistake with commit > 80d6b0c2eed2 ("ARM: mm: allow text and rodata sections to be > read-only"). > > In conclusion, I claim that DYNAMIC_FTRACE w/o OLD_MCOUNT had been > broken before your patch already. I didn't explicitly test that though. > That is correct. The DYNAMIC_FTRACE w/o OLD_MCOUNT doesn't work. > I think that should be fixed rather than your DYNAMIC_FTRACE_WITH_REGS > pulling in OLD_MCOUNT in order to repair DYNAMIC_FTRACE. > > Especially since your implementation seems to require !OLD_MCOUNT... > So making arch specific __ftrace_modify_code to be OLD_MCOUNT independent might fix DYNAMIC_FTRACE w/o OLD_MCOUNT and implicitly make DYNAMIC_FTRACE_WITH_REGS not dependent on OLD_MCOUNT. I will investigate this further today. Probably the whole dependancy between FUNCTION_TRACER and OLD_MCOUNT will need to be changed/updated. > >> Also, at least the ldmia ..., {..., sp, ...} insn needs a !THUMB2_KERNEL. > >> > >> > >> > select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || > >> > CPU_V7) && MMU > >> > select HAVE_EXIT_THREAD > >> > select HAVE_FTRACE_MCOUNT_RECORD if (!XIP_KERNEL) > >> > >> > >> > >> > diff --git a/arch/arm/kernel/entry-ftrace.S > >> > b/arch/arm/kernel/entry-ftrace.S > >> > index c73c403..3916dd6 100644 > >> > --- a/arch/arm/kernel/entry-ftrace.S > >> > +++ b/arch/arm/kernel/entry-ftrace.S > >> > @@ -92,12 +92,78 @@ > >> > 2: mcount_exit > >> > .endm > >> > > >> > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS > >> > + > >> > +.macro __ftrace_regs_caller > >> > + > >> > +sub sp, sp, #8 @ space for CPSR and OLD_R0 (not used) > >> > + > >> > +add ip, sp, #12 @ move in IP the value of SP as it was > >> > +@ before the push {lr} of the mcount > >> > mechanism > >> > +stmdb sp!, {ip,lr,pc} > >> > +stmdb sp!, {r0-r11,lr} > >> > + > >> > +@ stack content at this point: > >> > +@ 0 4 48 52 56 60 6468 72 > >> > +@ R0 | R1 | ... | LR | SP + 4 | LR | PC | PSR | OLD_R0 | > >> > previous LR | > >> > >> Being a constant, the saved pc is not very useful, I think. > > So you're saying skip it ? But you still need to leave space for it. So why > > not > > just save it even if the value is not useful? > > No, no, I don't want to skip it. I'd just prefer to have the pt_regs' > ->ARM_lr and ->ARM_pc slots filled with different, perhaps more useful > values: > > >> > >> Wouldn't it be better (and more consistent with other archs) to have > >> > >> pt_regs->ARM_lr = original lr > >> pt_refs->ARM_pc = current lr > >> > >> instead? > > The stack would look like this then > > @ ... | ARM_ip | ARM_sp | ARM_lr | ARM_pc | ... | > @ 0 4 48 52 56606468 > 72 > @ R0 | R1 | ... | LR | SP + 4 | original LR | original PC | PSR | OLD_R0 > | original LR | > > I.e. the pt_regs would capture almost the full context of the > instrumented function (except for ip). > So basicly what you are saying is: - instead of current LR save original LR (previous one saved in instrumented function epilog) - instead of current PC save original PC (previous one saved in instrumented f
Re: [PATCHv4] arm: ftrace: Adds support for CONFIG_DYNAMIC_FTRACE_WITH_REGS
Hi Abel, On Tue, Feb 28 2017, Abel Vesa wrote: > On Mon, Feb 27, 2017 at 04:52:06PM +0100, Nicolai Stange wrote: >> On Fri, Feb 24 2017, Abel Vesa wrote: >> >> >> >> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig >> > index fda6a46..877df5b 100644 >> > --- a/arch/arm/Kconfig >> > +++ b/arch/arm/Kconfig >> > @@ -55,6 +55,7 @@ config ARM >> >select HAVE_DMA_API_DEBUG >> >select HAVE_DMA_CONTIGUOUS if MMU >> >select HAVE_DYNAMIC_FTRACE if (!XIP_KERNEL) && !CPU_ENDIAN_BE32 && MMU >> > + select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE && >> > OLD_MCOUNT >> >> >> AFAICS, your code depends on the __gnu_mcount_nc calling conventions, >> i.e. on gcc pushing the original lr on the stack. In particular, there's >> no implementation of a ftrace_regs_caller_old or so. >> >> So shouldn't this read as !OLD_MCOUNT instead? > The implementation of __ftrace_modify_code which sets the kernel text to rw > needs OLD_MCOUNT (that is, the arch specific one, not the generic one). You're right that ARM's implementation of __ftrace_modify_code() is hidden within an #ifdef CONFIG_OLD_MCOUNT. But, - its implementation doesn't "need" or depend on OLD_MCOUNT - and it's true in general that the kernel text must be made writable before ftrace_modify_all_code() attempts to write to it. So, I bet that the set_kernel_text_rw()-calling ARM implementations of arch_ftrace_update_code() and __ftrace_modify_code() resp. have been inserted under that CONFIG_OLD_MCOUNT #ifdef by mistake with commit 80d6b0c2eed2 ("ARM: mm: allow text and rodata sections to be read-only"). In conclusion, I claim that DYNAMIC_FTRACE w/o OLD_MCOUNT had been broken before your patch already. I didn't explicitly test that though. I think that should be fixed rather than your DYNAMIC_FTRACE_WITH_REGS pulling in OLD_MCOUNT in order to repair DYNAMIC_FTRACE. Especially since your implementation seems to require !OLD_MCOUNT... >> Also, at least the ldmia ..., {..., sp, ...} insn needs a !THUMB2_KERNEL. >> >> >> >select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) >> > && MMU >> >select HAVE_EXIT_THREAD >> >select HAVE_FTRACE_MCOUNT_RECORD if (!XIP_KERNEL) >> >> >> >> > diff --git a/arch/arm/kernel/entry-ftrace.S >> > b/arch/arm/kernel/entry-ftrace.S >> > index c73c403..3916dd6 100644 >> > --- a/arch/arm/kernel/entry-ftrace.S >> > +++ b/arch/arm/kernel/entry-ftrace.S >> > @@ -92,12 +92,78 @@ >> > 2:mcount_exit >> > .endm >> > >> > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS >> > + >> > +.macro __ftrace_regs_caller >> > + >> > + sub sp, sp, #8 @ space for CPSR and OLD_R0 (not used) >> > + >> > + add ip, sp, #12 @ move in IP the value of SP as it was >> > + @ before the push {lr} of the mcount mechanism >> > + stmdb sp!, {ip,lr,pc} >> > + stmdb sp!, {r0-r11,lr} >> > + >> > + @ stack content at this point: >> > + @ 0 4 48 52 56 60 6468 72 >> > + @ R0 | R1 | ... | LR | SP + 4 | LR | PC | PSR | OLD_R0 | previous LR | >> >> Being a constant, the saved pc is not very useful, I think. > So you're saying skip it ? But you still need to leave space for it. So why > not > just save it even if the value is not useful? No, no, I don't want to skip it. I'd just prefer to have the pt_regs' ->ARM_lr and ->ARM_pc slots filled with different, perhaps more useful values: >> >> Wouldn't it be better (and more consistent with other archs) to have >> >> pt_regs->ARM_lr = original lr >> pt_refs->ARM_pc = current lr >> >> instead? The stack would look like this then @ ... | ARM_ip | ARM_sp | ARM_lr | ARM_pc | ... | @ 0 4 48 52 56606468 72 @ R0 | R1 | ... | LR | SP + 4 | original LR | original PC | PSR | OLD_R0 | original LR | I.e. the pt_regs would capture almost the full context of the instrumented function (except for ip). Thanks, Nicolai
Re: [PATCHv4] arm: ftrace: Adds support for CONFIG_DYNAMIC_FTRACE_WITH_REGS
On Mon, Feb 27, 2017 at 04:52:06PM +0100, Nicolai Stange wrote: > Hi Abel, > > On Fri, Feb 24 2017, Abel Vesa wrote: > > > > > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > > index fda6a46..877df5b 100644 > > --- a/arch/arm/Kconfig > > +++ b/arch/arm/Kconfig > > @@ -55,6 +55,7 @@ config ARM > > select HAVE_DMA_API_DEBUG > > select HAVE_DMA_CONTIGUOUS if MMU > > select HAVE_DYNAMIC_FTRACE if (!XIP_KERNEL) && !CPU_ENDIAN_BE32 && MMU > > + select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE && > > OLD_MCOUNT > > > AFAICS, your code depends on the __gnu_mcount_nc calling conventions, > i.e. on gcc pushing the original lr on the stack. In particular, there's > no implementation of a ftrace_regs_caller_old or so. > > So shouldn't this read as !OLD_MCOUNT instead? The implementation of __ftrace_modify_code which sets the kernel text to rw needs OLD_MCOUNT (that is, the arch specific one, not the generic one). > > Also, at least the ldmia ..., {..., sp, ...} insn needs a !THUMB2_KERNEL. > > > > select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) > > && MMU > > select HAVE_EXIT_THREAD > > select HAVE_FTRACE_MCOUNT_RECORD if (!XIP_KERNEL) > > > > > diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S > > index c73c403..3916dd6 100644 > > --- a/arch/arm/kernel/entry-ftrace.S > > +++ b/arch/arm/kernel/entry-ftrace.S > > @@ -92,12 +92,78 @@ > > 2: mcount_exit > > .endm > > > > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS > > + > > +.macro __ftrace_regs_caller > > + > > + sub sp, sp, #8 @ space for CPSR and OLD_R0 (not used) > > + > > + add ip, sp, #12 @ move in IP the value of SP as it was > > + @ before the push {lr} of the mcount mechanism > > + stmdb sp!, {ip,lr,pc} > > + stmdb sp!, {r0-r11,lr} > > + > > + @ stack content at this point: > > + @ 0 4 48 52 56 60 6468 72 > > + @ R0 | R1 | ... | LR | SP + 4 | LR | PC | PSR | OLD_R0 | previous LR | > > Being a constant, the saved pc is not very useful, I think. So you're saying skip it ? But you still need to leave space for it. So why not just save it even if the value is not useful? > > Wouldn't it be better (and more consistent with other archs) to have > > pt_regs->ARM_lr = original lr > pt_refs->ARM_pc = current lr > > instead? > > A (hypothetical) klp_arch_set_pc(struct pt_regs *regs, unsigned long ip) > would do the more intuitive > > regs->ARM_pc = ip; > > rather than a > > regs->ARM_lr = ip > > then. You are right. There is a subsequent patch I'm currently working on that will enable livepatch and will bring an implementation for klp_arch_set_pc as you described it. I still don't get what is wrong with the code though? > > In addition, the original lr register would be made available to ftrace > handlers this way. > > > > + mov r3, sp @ struct pt_regs* > > + ldr r2, =function_trace_op > > + ldr r2, [r2]@ pointer to the current > > + @ function tracing op > > + ldr r1, [sp, #PT_REGS_SIZE] @ lr of instrumented func > > + mcount_adjust_addr r0, lr @ instrumented function > > + > > + .globl ftrace_regs_call > > +ftrace_regs_call: > > + bl ftrace_stub > > + > > +#ifdef CONFIG_FUNCTION_GRAPH_TRACER > > + .globl ftrace_graph_regs_call > > +ftrace_graph_regs_call: > > + mov r0, r0 > > +#endif > > + @ pop saved regs > > + @ first, get the previous LR value from stack > > + ldr lr, [sp, #PT_REGS_SIZE] > > + @ now pop the rest of the saved registers INCLUDING stack pointer > > + ldmia sp, {r0-r11, ip, sp, pc} > > +.endm > > + > > > > +#ifdef CONFIG_FUNCTION_GRAPH_TRACER > > +.macro __ftrace_graph_regs_caller > > + > > + sub r0, fp, #4 @ lr of instrumented routine (parent) > > + > > + @ called from __ftrace_regs_caller > > + ldr r1, [sp, #S_LR] @ instrumented routine (func) > > + mcount_adjust_addr r1, r1 > > + > > + sub r2, r0, #4 @ frame pointer > > Given that r2 is prepare_ftrace_return()'s frame_pointer argument, is > > r2 = fp - 4 - 4 = fp - 8 > > really correct / what is wanted here? > You are right. - sub r2, r0, #4 @ frame pointer + mov r2, fp @ frame pointer > > + bl prepare_ftrace_return > > + > > + @ pop registers saved in ftrace_regs_caller > > + @ first, get the previous LR value from stack > > + ldr lr, [sp, #PT_REGS_SIZE] > > + @ now pop the rest of the saved registers INCLUDING stack pointer > > + ldmia sp, {r0-r11, ip, sp, pc} > > +.endm > > +#endif > > +#endif > > + > > > > > Thanks, > > Nicolai Thanks
Re: [PATCHv4] arm: ftrace: Adds support for CONFIG_DYNAMIC_FTRACE_WITH_REGS
Hi Abel, On Fri, Feb 24 2017, Abel Vesa wrote: > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > index fda6a46..877df5b 100644 > --- a/arch/arm/Kconfig > +++ b/arch/arm/Kconfig > @@ -55,6 +55,7 @@ config ARM > select HAVE_DMA_API_DEBUG > select HAVE_DMA_CONTIGUOUS if MMU > select HAVE_DYNAMIC_FTRACE if (!XIP_KERNEL) && !CPU_ENDIAN_BE32 && MMU > + select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE && > OLD_MCOUNT AFAICS, your code depends on the __gnu_mcount_nc calling conventions, i.e. on gcc pushing the original lr on the stack. In particular, there's no implementation of a ftrace_regs_caller_old or so. So shouldn't this read as !OLD_MCOUNT instead? Also, at least the ldmia ..., {..., sp, ...} insn needs a !THUMB2_KERNEL. > select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) > && MMU > select HAVE_EXIT_THREAD > select HAVE_FTRACE_MCOUNT_RECORD if (!XIP_KERNEL) > diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S > index c73c403..3916dd6 100644 > --- a/arch/arm/kernel/entry-ftrace.S > +++ b/arch/arm/kernel/entry-ftrace.S > @@ -92,12 +92,78 @@ > 2: mcount_exit > .endm > > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS > + > +.macro __ftrace_regs_caller > + > + sub sp, sp, #8 @ space for CPSR and OLD_R0 (not used) > + > + add ip, sp, #12 @ move in IP the value of SP as it was > + @ before the push {lr} of the mcount mechanism > + stmdb sp!, {ip,lr,pc} > + stmdb sp!, {r0-r11,lr} > + > + @ stack content at this point: > + @ 0 4 48 52 56 60 6468 72 > + @ R0 | R1 | ... | LR | SP + 4 | LR | PC | PSR | OLD_R0 | previous LR | Being a constant, the saved pc is not very useful, I think. Wouldn't it be better (and more consistent with other archs) to have pt_regs->ARM_lr = original lr pt_refs->ARM_pc = current lr instead? A (hypothetical) klp_arch_set_pc(struct pt_regs *regs, unsigned long ip) would do the more intuitive regs->ARM_pc = ip; rather than a regs->ARM_lr = ip then. In addition, the original lr register would be made available to ftrace handlers this way. > + mov r3, sp @ struct pt_regs* > + ldr r2, =function_trace_op > + ldr r2, [r2]@ pointer to the current > + @ function tracing op > + ldr r1, [sp, #PT_REGS_SIZE] @ lr of instrumented func > + mcount_adjust_addr r0, lr @ instrumented function > + > + .globl ftrace_regs_call > +ftrace_regs_call: > + bl ftrace_stub > + > +#ifdef CONFIG_FUNCTION_GRAPH_TRACER > + .globl ftrace_graph_regs_call > +ftrace_graph_regs_call: > + mov r0, r0 > +#endif > + @ pop saved regs > + @ first, get the previous LR value from stack > + ldr lr, [sp, #PT_REGS_SIZE] > + @ now pop the rest of the saved registers INCLUDING stack pointer > + ldmia sp, {r0-r11, ip, sp, pc} > +.endm > + > +#ifdef CONFIG_FUNCTION_GRAPH_TRACER > +.macro __ftrace_graph_regs_caller > + > + sub r0, fp, #4 @ lr of instrumented routine (parent) > + > + @ called from __ftrace_regs_caller > + ldr r1, [sp, #S_LR] @ instrumented routine (func) > + mcount_adjust_addr r1, r1 > + > + sub r2, r0, #4 @ frame pointer Given that r2 is prepare_ftrace_return()'s frame_pointer argument, is r2 = fp - 4 - 4 = fp - 8 really correct / what is wanted here? > + bl prepare_ftrace_return > + > + @ pop registers saved in ftrace_regs_caller > + @ first, get the previous LR value from stack > + ldr lr, [sp, #PT_REGS_SIZE] > + @ now pop the rest of the saved registers INCLUDING stack pointer > + ldmia sp, {r0-r11, ip, sp, pc} > +.endm > +#endif > +#endif > + Thanks, Nicolai