On Tue, Jul 16, 2019 at 11:15:54AM -0700, Nick Desaulniers wrote:
> On Sun, Jul 14, 2019 at 5:37 PM Josh Poimboeuf <jpoim...@redhat.com> wrote:
> >
> > On x86-64, with CONFIG_RETPOLINE=n, GCC's "global common subexpression
> > elimination" optimization results in ___bpf_prog_run()'s jumptable code
> > changing from this:
> >
> >         select_insn:
> >                 jmp *jumptable(, %rax, 8)
> >                 ...
> >         ALU64_ADD_X:
> >                 ...
> >                 jmp *jumptable(, %rax, 8)
> >         ALU_ADD_X:
> >                 ...
> >                 jmp *jumptable(, %rax, 8)
> >
> > to this:
> >
> >         select_insn:
> >                 mov jumptable, %r12
> >                 jmp *(%r12, %rax, 8)
> >                 ...
> >         ALU64_ADD_X:
> >                 ...
> >                 jmp *(%r12, %rax, 8)
> >         ALU_ADD_X:
> >                 ...
> >                 jmp *(%r12, %rax, 8)
> >
> > The jumptable address is placed in a register once, at the beginning of
> > the function.  The function execution can then go through multiple
> > indirect jumps which rely on that same register value.  This has a few
> > issues:
> >
> > 1) Objtool isn't smart enough to be able to track such a register value
> >    across multiple recursive indirect jumps through the jump table.
> >
> > 2) With CONFIG_RETPOLINE enabled, this optimization actually results in
> >    a small slowdown.  I measured a ~4.7% slowdown in the test_bpf
> >    "tcpdump port 22" selftest.
> >
> >    This slowdown is actually predicted by the GCC manual:
> >
> >      Note: When compiling a program using computed gotos, a GCC
> >      extension, you may get better run-time performance if you
> >      disable the global common subexpression elimination pass by
> >      adding -fno-gcse to the command line.
> >
> > So just disable the optimization for this function.
> >
> > Fixes: e55a73251da3 ("bpf: Fix ORC unwinding in non-JIT BPF code")
> > Reported-by: Randy Dunlap <rdun...@infradead.org>
> > Signed-off-by: Josh Poimboeuf <jpoim...@redhat.com>
> > Acked-by: Alexei Starovoitov <a...@kernel.org>
> > ---
> > Cc: Alexei Starovoitov <a...@kernel.org>
> > Cc: Daniel Borkmann <dan...@iogearbox.net>
> > ---
> >  include/linux/compiler-gcc.h   | 2 ++
> >  include/linux/compiler_types.h | 4 ++++
> >  kernel/bpf/core.c              | 2 +-
> >  3 files changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> > index e8579412ad21..d7ee4c6bad48 100644
> > --- a/include/linux/compiler-gcc.h
> > +++ b/include/linux/compiler-gcc.h
> > @@ -170,3 +170,5 @@
> >  #else
> >  #define __diag_GCC_8(s)
> >  #endif
> > +
> > +#define __no_fgcse __attribute__((optimize("-fno-gcse")))
> 
> + Miguel, maintainer of compiler_attributes.h
> I wonder if the optimize attributes can be feature detected?
> Is -fno-gcse supported all the way back to GCC 4.6?

Yeah, from snooping in the GCC tree it looks like it's been around
for 18+ years.

-- 
Josh

Reply via email to