On Fri, Nov 09, 2018 at 11:31:06AM -0600, Josh Poimboeuf wrote:
> On Fri, Nov 09, 2018 at 06:25:24PM +0100, Ard Biesheuvel wrote:
> > On 9 November 2018 at 16:14, Ard Biesheuvel <ard.biesheu...@linaro.org> wrote:
> > > On 9 November 2018 at 16:10, Josh Poimboeuf <jpoim...@redhat.com> wrote:
> > >> On Fri, Nov 09, 2018 at 02:39:17PM +0100, Ard Biesheuvel wrote:
> > >>> > +	for (site = start; site < stop; site++) {
> > >>> > +		struct static_call_key *key = static_call_key(site);
> > >>> > +		unsigned long addr = static_call_addr(site);
> > >>> > +
> > >>> > +		if (list_empty(&key->site_mods)) {
> > >>> > +			struct static_call_mod *mod;
> > >>> > +
> > >>> > +			mod = kzalloc(sizeof(*mod), GFP_KERNEL);
> > >>> > +			if (!mod) {
> > >>> > +				WARN(1, "Failed to allocate memory for static calls");
> > >>> > +				return;
> > >>> > +			}
> > >>> > +
> > >>> > +			mod->sites = site;
> > >>> > +			list_add_tail(&mod->list, &key->site_mods);
> > >>> > +
> > >>> > +			/*
> > >>> > +			 * The trampoline should no longer be used.  Poison
> > >>> > +			 * it with a BUG() to catch any stray callers.
> > >>> > +			 */
> > >>> > +			arch_static_call_poison_tramp(addr);
> > >>>
> > >>> This patches the wrong thing: the trampoline is at key->func not addr.
> > >>
> > >> If you look at the x86 implementation, it actually does poison the
> > >> trampoline.
> > >>
> > >> The address of the trampoline isn't actually known here.  key->func
> > >> isn't the trampoline address; it's the destination func address.
> > >>
> > >> So instead I passed the address of the call instruction.  The arch code
> > >> then reads the instruction to find the callee (the trampoline).
> > >>
> > >> The code is a bit confusing.  To make it more obvious, maybe we should
> > >> add another arch function to read the call destination.  Then this code
> > >> can pass that into arch_static_call_poison_tramp().
> > >>
> > >
> > > Ah right, so I am basically missing a dereference in my
> > > arch_static_call_poison_tramp() code if this breaks.
> > >
> >
> > Could we call it 'defuse' rather than 'poison'? On arm64, we will
> > need to keep it around to bounce function calls that are out of range,
> > and replace it with a PLT sequence.
>
> Ok, but doesn't that defeat the purpose of the inline approach?

Or are you only going to use the trampoline for out-of-range calls,
otherwise just do direct calls?

-- 
Josh
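
For anyone following the thread, here is a rough user-space sketch of the two arch-side ideas discussed above: reading a call instruction to find its callee (the trampoline), and deciding whether a direct call can reach its target. It is only an illustration, not the code from the patch set. The helper names x86_call_dest() and arm64_bl_in_range() are made up for this example. It assumes the x86 call site is a 5-byte "call rel32" (opcode 0xE8) read on a little-endian host, and that the arm64 call is a BL, whose 26-bit word offset reaches roughly +/-128 MiB.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define X86_CALL_OPCODE		0xE8
#define X86_CALL_INSN_SIZE	5
#define ARM64_BL_RANGE		(128LL * 1024 * 1024)	/* 2^27 bytes */

/*
 * Hypothetical helper: decode the "call rel32" whose bytes are in insn
 * and which lives at address insn_addr.  On success, store the callee
 * address in *dest.  The displacement is relative to the address of
 * the instruction that follows the call.
 */
static bool x86_call_dest(const uint8_t *insn, uint64_t insn_addr,
			  uint64_t *dest)
{
	int32_t rel;

	if (insn[0] != X86_CALL_OPCODE)
		return false;

	/* Assumes a little-endian host, matching the x86 encoding. */
	memcpy(&rel, insn + 1, sizeof(rel));
	*dest = insn_addr + X86_CALL_INSN_SIZE + (int64_t)rel;
	return true;
}

/*
 * Hypothetical helper: can a BL at insn_addr branch directly to target?
 * The offset must be 4-byte aligned and lie in [-2^27, 2^27).
 */
static bool arm64_bl_in_range(uint64_t insn_addr, uint64_t target)
{
	int64_t off = (int64_t)(target - insn_addr);

	if (off & 3)
		return false;

	return off >= -ARM64_BL_RANGE && off < ARM64_BL_RANGE;
}
```

With helpers along these lines, the generic code could pass the decoded destination to the arch hook instead of the call site address, and an arm64 port could pick a direct branch when arm64_bl_in_range() holds and fall back to the trampoline otherwise.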