> On Jan 11, 2019, at 1:22 PM, Josh Poimboeuf <jpoim...@redhat.com> wrote:
>
> On Fri, Jan 11, 2019 at 12:46:39PM -0800, Linus Torvalds wrote:
>> On Fri, Jan 11, 2019 at 12:31 PM Josh Poimboeuf <jpoim...@redhat.com> wrote:
>>> I was referring to the fact that a single static call key update will
>>> usually result in patching multiple call sites. But you're right, it's
>>> only 1-2 trampolines per text_poke_bp() invocation. Though eventually
>>> we may want to batch all the writes like what Daniel has proposed for
>>> jump labels, to reduce IPIs.
>>
>> Yeah, my suggestion doesn't allow for batching, since it would
>> basically generate one trampoline for every rewritten instruction.
>
> As Andy said, I think batching would still be possible, it's just that
> we'd have to create multiple trampolines at a time.
>
> Or... we could do a hybrid approach: create a single custom trampoline
> which has the call destination patched in, but put the return address in
> %rax -- which is always clobbered, even for callee-saved PV ops. Like:
>
> trampoline:
>   push %rax
>   call patched-dest
>
> That way the batching could be done with a single trampoline
> (particularly if using rcu-sched to avoid the sti hack).

I don’t see how RCU-sched solves the problem if you don’t disable preemption. On a fully preemptible kernel, a task can be preempted before the push, or between the push and the call (jmp). The RCU-sched grace period can then end while that task is still preempted, and once the trampoline has been repatched, the task may later resume and jump to the wrong patched-dest.