On Wed, 30 Jan 2008, Steven Rostedt wrote:

> well, actually, I disagree. I only set mcount_enabled=1 when I'm about to
> test something. You're right that we want the impact of the test least
> affected, but when we have mcount_enabled=1 we usually also have a
> function that's attached and in that case this change is negligible. But
> on the normal case where mcount_enabled=0, this change may have a bigger
> impact.
>
> Remember CONFIG_MCOUNT=y && mcount_enabled=0 is (15% overhead)
>          CONFIG_MCOUNT=y && mcount_enabled=1 dummy func (49% overhead)
>          CONFIG_MCOUNT=y && mcount_enabled=1 trace func (500% overhead)
>
> The trace func is the one that will be most likely used when analyzing. It
> gives hackbench a 500% overhead, so I'm expecting this change to be
> negligible in that case. But after I find what's wrong, I like to rebuild
> the kernel without rebooting so I like to have mcount_enabled=0 have the
> smallest impact ;-)
>
> I'll put back the original code and run some new numbers.
I just ran with the original version of that test (on x86_64, on the same
box the previous tests were done on, with the same kernel and config except
for this change).

Here are the numbers with the new design (the one that was used in this
patch):

  mcount disabled:  Avg: 4.8638  (15.934498% overhead)
  mcount enabled:   Avg: 6.2819  (49.736610% overhead)
  function tracing: Avg: 25.2035 (500.755607% overhead)

Now changing the code to:

ENTRY(mcount)
	/* likely(mcount_enabled) */
	cmpl $0, mcount_enabled
	jz out

	/* taken from glibc */
	subq $0x38, %rsp
	movq %rax, (%rsp)
	movq %rcx, 8(%rsp)
	movq %rdx, 16(%rsp)
	movq %rsi, 24(%rsp)
	movq %rdi, 32(%rsp)
	movq %r8, 40(%rsp)
	movq %r9, 48(%rsp)

	movq 0x38(%rsp), %rsi
	movq 8(%rbp), %rdi

	call *mcount_trace_function

	movq 48(%rsp), %r9
	movq 40(%rsp), %r8
	movq 32(%rsp), %rdi
	movq 24(%rsp), %rsi
	movq 16(%rsp), %rdx
	movq 8(%rsp), %rcx
	movq (%rsp), %rax
	addq $0x38, %rsp

out:
	retq

  mcount disabled:  Avg: 4.908   (16.988058% overhead)
  mcount enabled:   Avg: 6.244   (48.840369% overhead)
  function tracing: Avg: 25.1963 (500.583987% overhead)

The change seems to cause about a 1% overhead difference. With mcount
disabled, the newer code has a 1% performance benefit. With mcount enabled
as well as with tracing on, the old code has up to a 1% benefit.

But 1% has a bigger impact on something that is 15% than it does on
something that is 48% or 500%, so I'm keeping the newer version.

-- Steve
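
P.S. For anyone who wants to re-check the comparison, here is a minimal
sketch that computes the relative change between the two sets of averages
reported above. The variable and function names (new_design, old_code,
relative_change) are made up for illustration, and the script only
re-does arithmetic on the posted numbers; it is not part of the patch or
of the benchmark harness.

```python
# Averages copied from the two runs reported above (lower is better).
new_design = {
    "mcount disabled": 4.8638,
    "mcount enabled": 6.2819,
    "function tracing": 25.2035,
}
old_code = {
    "mcount disabled": 4.908,
    "mcount enabled": 6.244,
    "function tracing": 25.1963,
}


def relative_change(old, new):
    """Percent change of `new` relative to `old`; negative means `new` is faster."""
    return (new - old) / old * 100.0


# Relative change of the new design versus the old code, per configuration.
changes = {
    name: relative_change(old_code[name], new_design[name])
    for name in new_design
}

for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
```

Note that this compares the raw averages with each other, which is a
slightly different measure from the difference between the two "overhead"
percentages quoted above (those are each relative to an unpatched
baseline that is not listed in this mail).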