When livepatching is enabled, this function is used all the time. Really do check the fastpath first, and annotate it likely() as this is the right answer 100% of the time (to many significant figures).
This cuts out 3 pointer dereferences in the "nothing to do" path, and it seems the optimiser has an easier time too.  Bloat-o-meter reports:

  add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-57 (-57)
  Function                                 old     new   delta
  check_for_livepatch_work.cold           1201    1183     -18
  check_for_livepatch_work                1021     982     -39

which isn't too shabby for no logical change.

Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
---
CC: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
CC: Ross Lagerwall <ross.lagerw...@citrix.com>
CC: Jan Beulich <jbeul...@suse.com>
CC: Roger Pau Monné <roger....@citrix.com>
CC: Wei Liu <w...@xen.org>

I'm still a little disappointed with the code generation.  GCC still
chooses to set up the full stack frame (6 regs, +3 more slots)
intermixed with the per-cpu calculations.  In isolation, GCC can check
the boolean without creating a stack frame:

  <work_to_do>:
     48 89 e2                mov    %rsp,%rdx
     48 8d 05 de e1 37 00    lea    0x37e1de(%rip),%rax        # ffff82d0405b6068 <per_cpu__work_to_do>
     48 81 ca ff 7f 00 00    or     $0x7fff,%rdx
     8b 4a c1                mov    -0x3f(%rdx),%ecx
     48 8d 15 45 aa 39 00    lea    0x39aa45(%rip),%rdx        # ffff82d0405d28e0 <__per_cpu_offset>
     48 8b 14 ca             mov    (%rdx,%rcx,8),%rdx
     0f b6 04 02             movzbl (%rdx,%rax,1),%eax
     c3                      retq

but I can't find a way to convince GCC that it would be worth not
setting up a stack frame in the common case, and having a few extra
mov reg/reg's later in the uncommon case.

I haven't tried manually splitting the function into a check() and a
do() function.  Views on whether that might be acceptable?  At a guess,
do() would need to be a static noinline to avoid it turning back into
what it currently is.
---
 xen/common/livepatch.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c
index 1209fea2566c..b6275339f663 100644
--- a/xen/common/livepatch.c
+++ b/xen/common/livepatch.c
@@ -1706,15 +1706,15 @@ void check_for_livepatch_work(void)
     s_time_t timeout;
     unsigned long flags;
 
+    /* Fast path: no work to do. */
+    if ( likely(!per_cpu(work_to_do, cpu)) )
+        return;
+
     /* Only do any work when invoked in truly idle state. */
     if ( system_state != SYS_STATE_active ||
          !is_idle_domain(current->sched_unit->domain) )
         return;
 
-    /* Fast path: no work to do. */
-    if ( !per_cpu(work_to_do, cpu ) )
-        return;
-
     smp_rmb();
     /* In case we aborted, other CPUs can skip right away. */
     if ( !livepatch_work.do_work )

base-commit: 49818cde637b5ec20383e46b71f93b2e7d867686
-- 
2.30.2