Re: [PATCH] libstdc++: Clear padding bits in atomic compare_exchange

Jonathan Wakely via Gcc-patches Wed, 29 Sep 2021 05:14:43 -0700

On Mon, 27 Sept 2021 at 15:11, Thomas Rodgers <rodg...@appliantology.com> wrote:
>
> From: Thomas Rodgers <rodg...@twrodgers.com>
>
> Now with checks for __has_builtin(__builtin_clear_padding)
>
> This change implements P0528 which requires that padding bits not
> participate in atomic compare exchange operations. All arguments to the
> generic template are 'sanitized' by the __builtin_clear_padding intrinsic
> before they are used in comparisons. This also requires that any stores
> sanitize the incoming value.
>
> Signed-off-by: Thomas Rodgers <trodg...@redhat.com>
>
> libstdc++-v3/ChangeLog:
>
>         * include/std/atomic (atomic<T>::atomic(_Tp)): Clear padding for
>         __cplusplus > 201703L.
>         (atomic<T>::store()): Clear padding.
>         (atomic<T>::exchange()): Likewise.
>         (atomic<T>::compare_exchange_weak()): Likewise.
>         (atomic<T>::compare_exchange_strong()): Likewise.


Don't we also need this for std::atomic_ref, i.e. for the
__atomic_impl free functions in <bits/atomic_base.h>?

There we don't have any distinction between atomic_ref<integral type>
and atomic_ref<struct with possible padding>, they both use the same
implementations. But I think that's OK, as I think the built-in is
smart enough to be a no-op for types with no padding.

>         * testsuite/29_atomics/atomic/compare_exchange_padding.cc: New
>         test.
> ---
>  libstdc++-v3/include/std/atomic               | 41 +++++++++++++++++-
>  .../atomic/compare_exchange_padding.cc        | 42 +++++++++++++++++++
>  2 files changed, 81 insertions(+), 2 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/29_atomics/atomic/compare_exchange_padding.cc
>
> diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic
> index 936dd50ba1c..4ac9ccdc1ab 100644
> --- a/libstdc++-v3/include/std/atomic
> +++ b/libstdc++-v3/include/std/atomic
> @@ -228,7 +228,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        atomic& operator=(const atomic&) = delete;
>        atomic& operator=(const atomic&) volatile = delete;
>
> -      constexpr atomic(_Tp __i) noexcept : _M_i(__i) { }
> +#if __cplusplus > 201703L && __has_builtin(__builtin_clear_padding)
> +      constexpr atomic(_Tp __i) noexcept : _M_i(__i)
> +      { __builtin_clear_padding(std::__addressof(_M_i)); }
> +#else
> +      constexpr atomic(_Tp __i) noexcept : _M_i(__i)
> +      { }
> +#endif

Please write this as a single function with the preprocessor
conditions in the body:

      constexpr atomic(_Tp __i) noexcept : _M_i(__i)
      {
#if __cplusplus > 201703L && __has_builtin(__builtin_clear_padding)
        __builtin_clear_padding(std::__addressof(_M_i));
#endif
      }

This not only avoids duplication of the identical parts, but it avoids
warnings from ld.gold if you use --detect-odr-violations. Otherwise,
the linker can see a definition of that constructor on two different
lines (233 and 236), and so warns about possible ODR violations,
something like "warning: while linking foo: symbol
'std::atomic<int>::atomic(int)' defined in multiple places (possible
ODR violation): ...atomic:233 ... atomic:236"

Can't we clear the padding for >= 201402L instead of only C++20? Only
C++11 has a problem with the built-in in a constexpr function, right?
So we can DTRT for C++14 upwards.


>
>        operator _Tp() const noexcept
>        { return load(); }
> @@ -268,12 +274,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        void
>        store(_Tp __i, memory_order __m = memory_order_seq_cst) noexcept
>        {
> +#if __has_builtin(__builtin_clear_padding)
> +       __builtin_clear_padding(std::__addressof(__i));
> +#endif

We repeat this *a lot*. When I started work on this I defined a
non-member function in the __atomic_impl namespace:

    template<typename _Tp>
      _GLIBCXX_ALWAYS_INLINE void
      __clear_padding(_Tp& __val) noexcept
      {
#if __has_builtin(__builtin_clear_padding)
       __builtin_clear_padding(std::__addressof(__val));
#endif
      }

Then you can just use that everywhere (except the constexpr
constructor), without all the #if checks.
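
As a rough illustration of what such a helper buys you, here is a
user-space sketch (not the libstdc++ code). The names clear_padding,
have_clear_padding, fill_s and same_bytes are made up for this example,
and it assumes a compiler that provides __has_builtin. Where
__builtin_clear_padding is missing the helper silently does nothing.

```cpp
#include <cassert>
#include <cstring>
#include <memory>

#ifndef __has_builtin
# define __has_builtin(x) 0  // assumption: no __has_builtin means no builtin
#endif

// Hypothetical flag so callers can tell whether clearing really happens.
#if __has_builtin(__builtin_clear_padding)
constexpr bool have_clear_padding = true;
#else
constexpr bool have_clear_padding = false;
#endif

// Hypothetical analogue of the helper above: one place for the #if check.
template<typename T>
inline void clear_padding(T& val) noexcept
{
#if __has_builtin(__builtin_clear_padding)
  __builtin_clear_padding(std::addressof(val));
#else
  (void)val;  // nothing we can do without the builtin
#endif
}

struct S { char c; short s; };  // typically one padding byte after c

// Fill every byte of v (padding included) with `fill`, then set the members.
inline void fill_s(S& v, int fill, char c, short s) noexcept
{
  std::memset(&v, fill, sizeof(S));
  v.c = c;
  v.s = s;
}

// Bytewise comparison, which also looks at padding bytes.
inline bool same_bytes(const S& a, const S& b) noexcept
{ return std::memcmp(&a, &b, sizeof(S)) == 0; }
```

Two S objects with equal members but different garbage in the padding
should compare bytewise equal once both have gone through
clear_padding, provided the builtin is available. For a type without
padding such as int the call is expected to leave the value alone.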



>         __atomic_store(std::__addressof(_M_i), std::__addressof(__i), int(__m));
>        }
>
>        void
>        store(_Tp __i, memory_order __m = memory_order_seq_cst) volatile noexcept
>        {
> +#if __has_builtin(__builtin_clear_padding)
> +       __builtin_clear_padding(std::__addressof(__i));
> +#endif
>         __atomic_store(std::__addressof(_M_i), std::__addressof(__i), int(__m));
>        }
>
> @@ -300,6 +312,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        {
>          alignas(_Tp) unsigned char __buf[sizeof(_Tp)];
>         _Tp* __ptr = reinterpret_cast<_Tp*>(__buf);
> +#if __has_builtin(__builtin_clear_padding)
> +       __builtin_clear_padding(std::__addressof(__i));
> +#endif
>         __atomic_exchange(std::__addressof(_M_i), std::__addressof(__i),
>                           __ptr, int(__m));
>         return *__ptr;
> @@ -311,6 +326,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        {
>          alignas(_Tp) unsigned char __buf[sizeof(_Tp)];
>         _Tp* __ptr = reinterpret_cast<_Tp*>(__buf);
> +#if __has_builtin(__builtin_clear_padding)
> +       __builtin_clear_padding(std::__addressof(__i));
> +#endif
>         __atomic_exchange(std::__addressof(_M_i), std::__addressof(__i),
>                           __ptr, int(__m));
>         return *__ptr;
> @@ -322,6 +340,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        {
>         __glibcxx_assert(__is_valid_cmpexch_failure_order(__f));
>
> +#if __has_builtin(__builtin_clear_padding)
> +       __builtin_clear_padding(std::__addressof(__e));

This unconditionally clears the padding of __e, which I don't think is
allowed. It potentially introduces a data race if another thread is
doing the CAS at the same time, and the program assumes that only the
CAS that fails will update expected.

See the thread I started at https://lists.isocpp.org/parallel/2020/12/3443.php
("atomic compare_exchange and padding bits", 2020-12-03)

The conclusion was that writing to __e is not allowed in the failure
case, so you need to make a copy of it (into a buffer, using memcpy),
then clear the padding in the copy, then try the
__atomic_compare_exchange and if it fails, copy back from the buffer
to __e. If all that extra work doesn't get inlined then we want to
only do it for types which might have padding bits, so I had
__atomic_impl::__maybe_has_padding in my unfinished patch:

   template<typename _Tp>
     constexpr bool
     __maybe_has_padding()
     {
#if __has_builtin(__has_unique_object_representations)
      return !__has_unique_object_representations(_Tp);
#else
      return true;
#endif
     }

The MSVC implementation uses !__has_unique_object_representations(_Tp)
&& !is_floating_point<_Tp>::value here, which is better than mine
above (FP types don't have unique object reps, but also don't have
padding bits).
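
A portable C++17 sketch of the MSVC-style check, using the standard
traits rather than the compiler built-in. The name maybe_has_padding
and the two example structs are made up for illustration, and the
comments on them assume a typical ABI where short is two bytes with
two-byte alignment.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for __maybe_has_padding: true when T might
// contain padding bits. Floating-point types are excluded explicitly
// because they report non-unique object representations for reasons
// other than padding.
template<typename T>
constexpr bool maybe_has_padding() noexcept
{
  return !std::has_unique_object_representations<T>::value
         && !std::is_floating_point<T>::value;
}

struct Padded { char c; short s; };   // usually one padding byte after c
struct Dense  { short a; short b; };  // usually no padding at all
```

With that, maybe_has_padding<Padded>() should be true on such an ABI,
while int, Dense and float all come out false, so the copy/clear/copy
back work can be skipped for them at compile time.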

And then do something like this in compare_exchange_weak:


+      {
+#if __has_builtin(__builtin_clear_padding)
+       if _GLIBCXX17_CONSTEXPR (__maybe_has_padding<_Tp>())
+         {
+           _Val<_Tp> __expected0 = __expected; // XXX should use memcpy
+           auto* __exp = __atomic_impl::__clear_padding(__expected0);
+           auto* __des = __atomic_impl::__clear_padding(__desired);
+           if (__atomic_compare_exchange(__ptr, __exp, __des, true,
+                                         int(__success), int(__failure)))
+             return true;
+           __builtin_memcpy(std::__addressof(__expected), __exp, sizeof(_Tp));
+           return false;
+         }
+#endif
       return __atomic_compare_exchange(__ptr, std::__addressof(__expected),

And similarly for compare_exchange_strong (or refactor them into one
function that takes a bool for weak/strong).

If you do all that in __atomic_impl::compare_exchange_weak (making it
take a bool for weak/strong) then you can reuse it from
__atomic_impl::compare_exchange_strong, and then change the generic
atomic<T>::compare_exchange_{weak,strong} to use that as well.
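
To make the shape of that refactoring concrete, here is a user-space
sketch layered on top of std::atomic, not the __atomic_impl code
itself. cas_ignoring_padding and clear_padding are hypothetical names,
and T is assumed to be trivially copyable and default constructible.
One caveat: desired is passed on by value here, so an intervening copy
could in principle disturb its padding again. The in-library version
avoids that by handing pointers straight to __atomic_compare_exchange.

```cpp
#include <atomic>
#include <cassert>
#include <cstring>
#include <memory>
#include <type_traits>

#ifndef __has_builtin
# define __has_builtin(x) 0  // assumption: no __has_builtin means no builtin
#endif

// Hypothetical helper: clears padding where the builtin exists.
template<typename T>
inline void clear_padding(T& val) noexcept
{
#if __has_builtin(__builtin_clear_padding)
  __builtin_clear_padding(std::addressof(val));
#else
  (void)val;
#endif
}

// One function for both flavours, selected by `weak`.
// `expected` is written only when the exchange fails.
template<typename T>
inline bool
cas_ignoring_padding(std::atomic<T>& obj, T& expected, T desired, bool weak,
                     std::memory_order success = std::memory_order_seq_cst,
                     std::memory_order failure = std::memory_order_seq_cst)
  noexcept
{
  static_assert(std::is_trivially_copyable<T>::value,
                "bytewise copies below need a trivially copyable type");

  T exp;  // private copy, so the caller's object is untouched on success
  std::memcpy(std::addressof(exp), std::addressof(expected), sizeof(T));
  clear_padding(exp);
  clear_padding(desired);

  const bool ok = weak
    ? obj.compare_exchange_weak(exp, desired, success, failure)
    : obj.compare_exchange_strong(exp, desired, success, failure);

  if (!ok)  // failure: report the value that was actually observed
    std::memcpy(std::addressof(expected), std::addressof(exp), sizeof(T));
  return ok;
}
```

The strong and weak entry points then become one-line wrappers that
pass false or true for the last required argument.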




> diff --git a/libstdc++-v3/testsuite/29_atomics/atomic/compare_exchange_padding.cc b/libstdc++-v3/testsuite/29_atomics/atomic/compare_exchange_padding.cc
> new file mode 100644
> index 00000000000..0875f168097
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/29_atomics/atomic/compare_exchange_padding.cc
> @@ -0,0 +1,42 @@
> +// { dg-options "-std=gnu++2a" }
> +// { dg-do run { target c++2a } }

We can (and should) use "20" not "2a".

Does it need to be C++20, though? Aren't all the clearings that are
being tested going to happen unconditionally (well ... as long as the
builtin exists, which is true for GCC)?

> +// { dg-add-options libatomic }
> +
> +#include <atomic>
> +
> +#include <testsuite_hooks.h>
> +
> +struct S { char c; short s; };
> +
> +void __attribute__((noinline,noipa))
> +fill_struct(S& s)
> +{ __builtin_memset(&s, 0xff, sizeof(S)); }
> +
> +bool
> +compare_struct(const S& a, const S& b)
> +{ return __builtin_memcmp(&a, &b, sizeof(S)) == 0; }
> +
> +int
> +main ()
> +{
> +  S s;
> +  fill_struct(s);
> +  s.c = 'a';
> +  s.s = 42;
> +
> +  std::atomic<S> as{ s };
> +  auto ts = as.load();
> +  VERIFY( !compare_struct(s, ts) ); // padding cleared on construction
> +  as.exchange(s);
> +  auto es = as.load();
> +  VERIFY( compare_struct(ts, es) ); // padding cleared on exchange
> +
> +  S n;
> +  fill_struct(n);
> +  n.c = 'b';
> +  n.s = 71;
> +  // padding cleared on compexchg
> +  VERIFY( as.compare_exchange_weak(s, n) );

Is it safe to assume this won't fail spuriously? There is only one thread
doing the RMW operation, is that enough to avoid spurious failures?
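
The standard allows the weak form to fail spuriously, so one way to
make a test independent of the answer is to retry while the reported
value still equals what we expected. A minimal sketch, assuming an
equality-comparable T (the test's S would need a comparison of its
members instead); cas_weak_retry is a made-up name:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: gives strong-CAS behaviour using only the weak form.
template<typename T>
inline bool
cas_weak_retry(std::atomic<T>& obj, T& expected, T desired) noexcept
{
  const T original = expected;
  while (!obj.compare_exchange_weak(expected, desired))
    {
      // On failure `expected` holds the value that was observed.
      if (!(expected == original))
        return false;  // genuine mismatch, give up
      // Same value as before: the failure was spurious, so try again.
    }
  return true;
}
```

A VERIFY on the result of such a loop cannot be tripped by a spurious
failure, whereas a VERIFY directly on compare_exchange_weak can.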

> +  VERIFY( as.compare_exchange_strong(n, s) );
> +  return 0;
> +}
> --
> 2.31.1
>
