* David Woodhouse <[email protected]> wrote:
> On Tue, 2018-01-23 at 08:53 +0100, Ingo Molnar wrote:
> >
> > The patch below demonstrates the principle, it forcibly enables dynamic
> > ftrace patching (CONFIG_DYNAMIC_FTRACE=y et al) and turns mcount/__fentry__
> > into a RET:
> >
> > ffffffff81a01a40 <__fentry__>:
> > ffffffff81a01a40: c3 retq
> >
> > This would have to be extended with (very simple) call stack depth tracking
> > (just 3 more instructions would do in the fast path I believe) and a
> > suitable SkyLake workaround (and also has to play nice with the ftrace
> > callbacks).
> >
> > On non-SkyLake the overhead would be 0 cycles.
>
> The overhead of forcing CONFIG_DYNAMIC_FTRACE=y is precisely zero
> cycles? That seems a little optimistic. ;)
The overhead of the quick hack patch I sent to show what exact code I mean is
obviously not zero.

The overhead of using my proposed solution, i.e. utilizing the function call
callback that CONFIG_DYNAMIC_FTRACE=y provides, is exactly zero on non-SkyLake
systems, where the callback is patched out - which is the case on typical
Linux distros.
The callback is widely enabled on distro kernels:

  Fedora:                      CONFIG_DYNAMIC_FTRACE=y
  Ubuntu:                      CONFIG_DYNAMIC_FTRACE=y
  OpenSuse (default flavor):   CONFIG_DYNAMIC_FTRACE=y
BTW., the reason this is enabled on all distro kernels is that the overhead
is a single patched-in NOP instruction in the function prologue, when tracing
is disabled. So it's not even a CALL+RET - it's a patched-in NOP.
Thanks,
Ingo