On Wed, Sep 09, 2015 at 12:36:01PM -0400, Zack Weinberg wrote:
> The first, simpler problem is strictly optimization.  explicit_bzero
> can be optimized to memset followed by a vacuous use of the memory
> region (generating no machine instructions, but preventing the stores
> from being deleted as dead); this is valuable because the sensitive
> data is often small and fixed-size, so the memset can in turn be
> replaced by inline code.  (This also facilitates implementation of
> -D_FORTIFY_SOURCE checks for explicit_bzero.)  Again looking at
> libressl, 92 of those 152 uses are improved by a crude version of this
> optimization:
> 
>     void explicit_bzero(void *, size_t);
>     extern inline __attribute__((gnu_inline, always_inline))
>     void explicit_bzero_constn(void *ptr, size_t len)
>     {
>       typedef struct {char x[len];} memblk;
>       memset(ptr, 0, len);
>       asm("" : : "m" (*(memblk __attribute__((may_alias)) *)ptr));
>     }
>     #define explicit_bzero(s, n)                          \
>       (__extension__(__builtin_constant_p(n) && (n) > 0   \
>                      ? explicit_bzero_constn(s, n)        \
>                      : explicit_bzero(s, n)))
> 
> I call this "crude" because it only works in GCC, when compiling C,
> and when the length parameter is compile-time constant.  GCC issues no
> error for this code when 'len' is not compile-time constant, but it is
> not documented to work reliably.  When compiling C++, GCC does not
> accept a structure containing an array whose size is not *lexically*
> constant; even if the body of explicit_bzero_constn is moved into the
> macro so that the whole thing is guarded by __builtin_constant_p,
> using explicit_bzero with a non-constant size will cause a compile
> error.  The same is true for Clang whether compiling C or C++.
> 
> This problem could be solved with a very simple feature addition:
> 
>     extern inline __attribute__((gnu_inline, always_inline))
>     void explicit_bzero(void *ptr, size_t len)
>     {
>       memset(ptr, 0, len);
>       __builtin_use_memory(ptr, len);
>     }

You're making this harder than it needs to be. The "m" constraint is
the wrong thing to use here. Simply use:

        __asm__(""::"r"(ptr):"memory");

The "memory" clobber tells the compiler that the asm may read or
write any memory reachable by it, so the preceding stores cannot be
deleted as dead. An asm statement with no output operands is
implicitly __volatile__, which is also needed.
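
For illustration, here is a minimal sketch of how the whole function
might look with this approach. The name scrub_memory is hypothetical
(chosen so it does not clash with a libc-provided explicit_bzero), and
the asm syntax assumes GCC or a compatible compiler such as Clang.
__volatile__ is spelled out only for clarity; as noted above, it is
already implied when there are no output operands.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper name; not part of any libc. */
static inline void scrub_memory(void *ptr, size_t len)
{
    /* Ordinary memset, so the compiler may still inline it for
       small constant sizes. */
    memset(ptr, 0, len);

    /* Empty asm: emits no instructions. The "r" input makes the
       pointer visible to the asm, and the "memory" clobber tells the
       compiler the asm may read the pointed-to memory, so the stores
       above cannot be removed as dead. */
    __asm__ __volatile__("" : : "r"(ptr) : "memory");
}
```

Unlike the "m"-constraint version, this needs no variable-length
struct type, so it does not depend on len being a compile-time
constant.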

Rich
