On Wed, Oct 14, 2020 at 2:26 PM Arvind Sankar <nived...@alum.mit.edu> wrote: > > Commit > 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually > exclusive") > > neglected to copy barrier_data() from compiler-gcc.h into > compiler-clang.h. The definition in compiler-gcc.h was really to work > around clang's more aggressive optimization, so this broke > barrier_data() on clang, and consequently memzero_explicit() as well. > > For example, this results in at least the memzero_explicit() call in > lib/crypto/sha256.c:sha256_transform() being optimized away by clang. > > Fix this by moving the definition of barrier_data() into compiler.h. > > Also move the gcc/clang definition of barrier() into compiler.h, > __memory_barrier() is icc-specific (and barrier() is already defined > using it in compiler-intel.h) and doesn't belong in compiler.h. > > Signed-off-by: Arvind Sankar <nived...@alum.mit.edu> > Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually > exclusive")
Thanks for the patch! Curious how you spotted this? My mistake for missing it. Definite difference in the disassembly before/after. Cc: sta...@vger.kernel.org Reviewed-by: Nick Desaulniers <ndesaulni...@google.com> Tested-by: Nick Desaulniers <ndesaulni...@google.com> akpm@ would you mind picking this up when you have a chance? See also: commit 7829fb09a2b4 ("lib: make memzero_explicit more robust against dead store elimination") I'm pretty sure `man 3 explicit_bzero` was created in libc for this exact problem, though the manual page is an interesting read. > --- > include/linux/compiler-clang.h | 6 ------ > include/linux/compiler-gcc.h | 19 ------------------- > include/linux/compiler.h | 18 ++++++++++++++++-- > 3 files changed, 16 insertions(+), 27 deletions(-) > > diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h > index cee0c728d39a..04c0a5a717f7 100644 > --- a/include/linux/compiler-clang.h > +++ b/include/linux/compiler-clang.h > @@ -52,12 +52,6 @@ > #define COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW 1 > #endif > > -/* The following are for compatibility with GCC, from compiler-gcc.h, > - * and may be redefined here because they should not be shared with other > - * compilers, like ICC. > - */ > -#define barrier() __asm__ __volatile__("" : : : "memory") > - > #if __has_feature(shadow_call_stack) > # define __noscs __attribute__((__no_sanitize__("shadow-call-stack"))) > #endif > diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h > index 7a3769040d7d..fda30ffb037b 100644 > --- a/include/linux/compiler-gcc.h > +++ b/include/linux/compiler-gcc.h > @@ -15,25 +15,6 @@ > # error Sorry, your compiler is too old - please upgrade it. > #endif > > -/* Optimization barrier */ > - > -/* The "volatile" is due to gcc bugs */ > -#define barrier() __asm__ __volatile__("": : :"memory") > -/* > - * This version is i.e. to prevent dead stores elimination on @ptr > - * where gcc and llvm may behave differently when otherwise using > - * normal barrier(): while gcc behavior gets along with a normal > - * barrier(), llvm needs an explicit input variable to be assumed > - * clobbered. The issue is as follows: while the inline asm might > - * access any memory it wants, the compiler could have fit all of > - * @ptr into memory registers instead, and since @ptr never escaped > - * from that, it proved that the inline asm wasn't touching any of > - * it. This version works well with both compilers, i.e. we're telling > - * the compiler that the inline asm absolutely may see the contents > - * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495 > - */ > -#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory") > - > /* > * This macro obfuscates arithmetic on a variable address so that gcc > * shouldn't recognize the original var, and make assumptions about it. > diff --git a/include/linux/compiler.h b/include/linux/compiler.h > index 92ef163a7479..dfba70b2644f 100644 > --- a/include/linux/compiler.h > +++ b/include/linux/compiler.h > @@ -80,11 +80,25 @@ void ftrace_likely_update(struct ftrace_likely_data *f, > int val, > > /* Optimization barrier */ > #ifndef barrier > -# define barrier() __memory_barrier() > +/* The "volatile" is due to gcc bugs */ > +# define barrier() __asm__ __volatile__("": : :"memory") > #endif > > #ifndef barrier_data > -# define barrier_data(ptr) barrier() > +/* > + * This version is i.e. to prevent dead stores elimination on @ptr > + * where gcc and llvm may behave differently when otherwise using > + * normal barrier(): while gcc behavior gets along with a normal > + * barrier(), llvm needs an explicit input variable to be assumed > + * clobbered. The issue is as follows: while the inline asm might > + * access any memory it wants, the compiler could have fit all of > + * @ptr into memory registers instead, and since @ptr never escaped > + * from that, it proved that the inline asm wasn't touching any of > + * it. This version works well with both compilers, i.e. we're telling > + * the compiler that the inline asm absolutely may see the contents > + * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495 > + */ > +# define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory") > #endif > > /* workaround for GCC PR82365 if needed */ > -- > 2.26.2 > -- Thanks, ~Nick Desaulniers