On Wed, Sep 09, 2015 at 12:36:01PM -0400, Zack Weinberg wrote:
> The first, simpler problem is strictly optimization.  explicit_bzero
> can be optimized to memset followed by a vacuous use of the memory
> region (generating no machine instructions, but preventing the stores
> from being deleted as dead); this is valuable because the sensitive
> data is often small and fixed-size, so the memset can in turn be
> replaced by inline code.  (This also facilitates implementation of
> -D_FORTIFY_SOURCE checks for explicit_bzero.)  Again looking at
> libressl, 92 of those 152 uses are improved by a crude version of this
> optimization:
>
>     void explicit_bzero(void *, size_t);
>     extern inline __attribute__((gnu_inline, always_inline))
>     void explicit_bzero_constn(void *ptr, size_t len)
>     {
>       typedef struct {char x[len];} memblk;
>       memset(ptr, 0, len);
>       asm("" : : "m" (*(memblk __attribute__((may_alias)) *)ptr));
>     }
>     #define explicit_bzero(s, n)                          \
>       (__extension__(__builtin_constant_p(n) && (n) > 0   \
>                      ? explicit_bzero_constn(s, n)        \
>                      : explicit_bzero(s, n)))
>
> I call this "crude" because it only works in GCC, when compiling C,
> and when the length parameter is compile-time constant.  GCC issues no
> error for this code when 'len' is not compile-time constant, but it is
> not documented to work reliably.  When compiling C++, GCC does not
> accept a structure containing an array whose size is not *lexically*
> constant; even if the body of explicit_bzero_constn is moved into the
> macro so that the whole thing is guarded by __builtin_constant_p,
> using explicit_bzero with a non-constant size will cause a compile
> error.  The same is true for Clang whether compiling C or C++.
>
> This problem could be solved with a very simple feature addition:
>
>     extern inline __attribute__((gnu_inline, always_inline))
>     void explicit_bzero(void *ptr, size_t len)
>     {
>       memset(ptr, 0, len);
>       __builtin_use_memory(ptr, len);
>     }

You're making this harder than it needs to be. The "m" constraint is
the wrong thing to use here. Simply use:

    __asm__(""::"r"(ptr):"memory");

The "memory" clobber implies that the asm can read or write any memory
that's reachable by it. The lack of output operands implies
__volatile__, which is also needed.

Rich
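
For illustration, here is a minimal sketch of how the one-line barrier above could be wrapped into a complete function. It assumes a GCC-compatible compiler (GCC or Clang); the names sketch_explicit_bzero and sketch_demo are hypothetical, chosen only to avoid clashing with any explicit_bzero the C library may already provide. __volatile__ is spelled out here even though the missing output operands already imply it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: zero a buffer and keep the stores from being
 * removed as dead. Assumes GCC-style extended asm is available. */
static inline void sketch_explicit_bzero(void *ptr, size_t len)
{
    memset(ptr, 0, len);
    /* Empty asm: emits no instructions. Taking ptr as an input and
     * clobbering "memory" tells the compiler the asm may read the
     * buffer, so the preceding stores must actually be performed. */
    __asm__ __volatile__("" : : "r"(ptr) : "memory");
}

/* Usage sketch: wipe a small fixed-size secret.
 * Returns 1 if every byte is zero afterwards, 0 otherwise. */
static int sketch_demo(void)
{
    unsigned char key[32];
    size_t i;

    memset(key, 0xA5, sizeof key);          /* stand-in for secret data */
    sketch_explicit_bzero(key, sizeof key);

    for (i = 0; i < sizeof key; i++)
        if (key[i] != 0)
            return 0;
    return 1;
}
```

Because the function is a plain inline wrapper around memset, the compiler remains free to expand a small constant-size memset into inline stores, and no variable-length struct type is involved, so the same source should be acceptable whether or not len is a compile-time constant.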