On 9 November 2018 at 08:28, Ingo Molnar <mi...@kernel.org> wrote:
>
> * Josh Poimboeuf <jpoim...@redhat.com> wrote:
>
>> These patches are related to two similar patch sets from Ard and Steve:
>>
>> - https://lkml.kernel.org/r/20181005081333.15018-1-ard.biesheu...@linaro.org
>> - https://lkml.kernel.org/r/20181006015110.653946...@goodmis.org
>>
>> The code is also heavily inspired by the jump label code, as some of the
>> concepts are very similar.
>>
>> There are three separate implementations, depending on what the arch
>> supports:
>>
>> 1) CONFIG_HAVE_STATIC_CALL_OPTIMIZED: patched call sites - requires
>>    objtool and a small amount of arch code
>>
>> 2) CONFIG_HAVE_STATIC_CALL_UNOPTIMIZED: patched trampolines - requires
>>    a small amount of arch code
>>
>> 3) If no arch support, fall back to regular function pointers
>>
>>
>> TODO:
>>
>> - I'm not sure about the objtool approach. Objtool is (currently)
>>   x86-64 only, which means we have to use the "unoptimized" version
>>   everywhere else. I may experiment with a GCC plugin instead.
>
> I'd prefer the objtool approach. It's a pretty reliable first-principles
> approach while GCC plugin would have to be replicated for Clang and any
> other compilers, etc.
>
I implemented the GCC plugin approach here for arm64:

https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=static-calls

That implements both the unoptimized and the optimized versions. I do
take your point about GCC and other compilers, but on arm64 we don't
have a lot of choice.

As far as I can tell, the GCC plugin is generic (i.e., it does not
rely on any ARM specific passes), but obviously, this requires a *lot*
of testing and validation to be taken seriously.

>> - Does this feature have much value without retpolines? If not, should
>>   we make it depend on retpolines somehow?
>
> Paravirt patching, as you mention in your later reply?
>
>> - Find some actual users of the interfaces (tracepoints? crypto?)
>
> I'd be very happy with a demonstrated paravirt optimization already -
> i.e. seeing the before/after effect on the vmlinux with an x86 distro
> config.
>
> All major Linux distributions enable CONFIG_PARAVIRT=y and
> CONFIG_PARAVIRT_XXL=y on x86 at the moment, so optimizing it away as much
> as possible in the 99.999% cases where it's not used is a primary
> concern.
>
> All other usecases are bonus, but it would certainly be interesting to
> investigate the impact of using these APIs for tracing: that too is a
> feature enabled everywhere but utilized only by a small fraction of Linux
> users - so literally every single cycle or instruction saved or hot-path
> shortened is a major win.
>
> Thanks,
>
>         Ingo
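
For reference, option 3 in the quoted cover letter (falling back to
regular function pointers when there is no arch support) can be
illustrated with a minimal user-space sketch in C. All names here
(op_fn, op_key, sketch_static_call, sketch_static_call_update) are
hypothetical and are not taken from the patch set. The optimized and
unoptimized variants would instead patch the call site or a trampoline
at update time, which this sketch does not attempt.

```c
#include <assert.h>

/* Hypothetical signature shared by all targets of this "static call". */
typedef int (*op_fn)(int);

/* Two interchangeable implementations (illustrative only). */
static int op_generic(int x) { return x + 1; }
static int op_fast(int x)    { return x + 2; }

/*
 * In the fallback case the "key" is just a function pointer, so every
 * call through it is an ordinary indirect call.
 */
static op_fn op_key = op_generic;

/* Hypothetical helpers, not the names used in the patches. */
#define sketch_static_call(key, ...)         ((key)(__VA_ARGS__))
#define sketch_static_call_update(key, func) ((key) = (func))

/* Calls through the key, retargets it, then calls again. */
static int sketch_demo(int x)
{
	int before = sketch_static_call(op_key, x);   /* uses op_generic */

	sketch_static_call_update(op_key, op_fast);   /* retarget the key */

	return before + sketch_static_call(op_key, x); /* now uses op_fast */
}
```

The indirect call through op_key is exactly what becomes expensive with
retpolines enabled, which is why the patched variants replace it with a
direct call.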