Re: [PING][PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor

2024-03-25 Thread xndcn
Wow, thank you all, you guys!

在 2024年3月14日星期四,Jonathan Wakely  写道:

> On Fri, 16 Feb 2024 at 15:15, Jonathan Wakely wrote:
> >
> > On Fri, 16 Feb 2024 at 14:10, Jakub Jelinek wrote:
> > >
> > > On Fri, Feb 16, 2024 at 01:51:54PM +, Jonathan Wakely wrote:
> > > > Ah, although __atomic_compare_exchange only takes pointers, the
> > > > compiler replaces that with a call to __atomic_compare_exchange_n
> > > > which takes the newval by value, which presumably uses an 80-bit FP
> > > > register and so the padding bits become indeterminate again.
> > >
> > > __atomic_compare_exchange_n only works with integers, so I guess
> > > it is doing VIEW_CONVERT_EXPR (aka union-style type punning) on the
> > > argument.
> > >
> > > Do you have preprocessed source for the testcase?
> >
> > Sent offlist.
>
> Jakub fixed the compiler, so I've pushed the attached patch now.
>
> Tested x86_64-linux.
>


Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor

2024-02-02 Thread xndcn
Thank you for your careful review!

> But we don't need a new one if it's going to be used in exactly one test and 
> if the new option does the same thing for all targets that run the test.
Got it, thanks. Now add option "-latomic" directly, but it still rely
on the trick "[atomic_link_flags [get_multilibs]]"

> No, because the patch is supposed to prevent the infinite loop, and so 
> there's no need to stop it looping after 10s. It won't loop at all.
Thanks, deleted.

> We only need to clear padding for long double, not float and double, right?
Yes, actually there is a check "if constexpr
(__atomic_impl::__maybe_has_padding<_Fp>())".
But "__atomic_impl::__clear_padding(_M_fp); " is indeed simply, so fixed here.

> Why can't we run this on all targets?
Got it, now target option deleted.

> There's no reason to use __builtin_memset here, just include  and 
> use std::memcpy.
Thanks, fixed.

> It definitely does have padding, just say "long double has padding bits on 
> x86"
Thanks, fixed.

So here comes the latest patch:
---
libstdc++-v3/ChangeLog:

 * include/bits/atomic_base.h: add __builtin_clear_padding in
__atomic_float constructor.
 * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.

Signed-off-by: xndcn 
---
 libstdc++-v3/include/bits/atomic_base.h   |  2 +-
 .../atomic_float/compare_exchange_padding.cc  | 53 +++
 2 files changed, 54 insertions(+), 1 deletion(-)
 create mode 100644
libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc

diff --git a/libstdc++-v3/include/bits/atomic_base.h
b/libstdc++-v3/include/bits/atomic_base.h
index f4ce0fa53..cdd6f07da 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -1283,7 +1283,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

   constexpr
   __atomic_float(_Fp __t) : _M_fp(__t)
-  { }
+  { __atomic_impl::__clear_padding(_M_fp); }

   __atomic_float(const __atomic_float&) = delete;
   __atomic_float& operator=(const __atomic_float&) = delete;
diff --git 
a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
new file mode 100644
index 0..eeace251c
--- /dev/null
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
@@ -0,0 +1,53 @@
+// { dg-do run { target c++20 } }
+// { dg-options "-O0" }
+// { dg-additional-options "[atomic_link_flags [get_multilibs]] -latomic" }
+
+#include 
+#include 
+#include 
+#include 
+
+template
+void __attribute__((noinline,noipa))
+fill_padding(T& f)
+{
+  T mask;
+  std::memset(, 0xff, sizeof(T));
+  __builtin_clear_padding();
+  unsigned char* ptr_f = (unsigned char*)
+  unsigned char* ptr_mask = (unsigned char*)
+  for (int i = 0; i < sizeof(T); i++)
+  {
+if (ptr_mask[i] == 0x00)
+{
+  ptr_f[i] = 0xff;
+}
+  }
+}
+
+void
+test01()
+{
+  // test for long double with padding (float80)
+  if constexpr (std::numeric_limits::digits == 64)
+  {
+long double f = 0.5f; // long double has padding bits on x86
+fill_padding(f);
+std::atomic as{ f }; // padding cleared on constructor
+long double t = 1.5;
+
+as.fetch_add(t);
+long double s = f + t;
+t = as.load();
+VERIFY(s == t); // padding ignored on float comparing
+fill_padding(s);
+VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
+fill_padding(f);
+VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
+  }
+}
+
+int main()
+{
+  test01();
+}
-- 
2.25.1


Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor

2024-01-31 Thread xndcn
Thanks!

> Why not just use -latomic unconditionally here?
testsuites of libstdc++ seems to have some tricks to find and link
libatomic.a by using "dg-add-options libatomic", which do nothing for
x86 target. In previous email, we decide not to pollute the current
option, so we add a new libatomic_16b option here.

> Why dg-timeout?
without the patch, "as.fetch_add(t);" will result in an infinite loop,
so I add timeout to help it escape. Should I keep it or not?

> Also, the target selectors above look wrong.
Thanks, fixed with checking "std::numeric_limits::digits == 64".

> This code is not compiled for C++11 and C++14, so this should just use
> constexpr not _GLIBCXX17_CONSTEXPR.
Thanks, fixed with "constexpr".

So here is the fixed patch, please review it again, thanks:
---
libstdc++-v3/ChangeLog:

 * include/bits/atomic_base.h: add __builtin_clear_padding in
__atomic_float constructor.
 * testsuite/lib/dg-options.exp: add new add-options for libatomic_16b.
 * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.

Signed-off-by: xndcn 
---
 libstdc++-v3/include/bits/atomic_base.h   |  7 ++-
 .../atomic_float/compare_exchange_padding.cc  | 54 +++
 libstdc++-v3/testsuite/lib/dg-options.exp | 22 
 3 files changed, 82 insertions(+), 1 deletion(-)
 create mode 100644
libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc

diff --git a/libstdc++-v3/include/bits/atomic_base.h
b/libstdc++-v3/include/bits/atomic_base.h
index f4ce0fa53..90e3f0e4b 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

   constexpr
   __atomic_float(_Fp __t) : _M_fp(__t)
-  { }
+  {
+#if __has_builtin(__builtin_clear_padding)
+ if constexpr (__atomic_impl::__maybe_has_padding<_Fp>())
+   __builtin_clear_padding(std::__addressof(_M_fp));
+#endif
+  }

   __atomic_float(const __atomic_float&) = delete;
   __atomic_float& operator=(const __atomic_float&) = delete;
diff --git 
a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
new file mode 100644
index 0..e6e17e4b5
--- /dev/null
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
@@ -0,0 +1,54 @@
+// { dg-do run { target c++20 } }
+// { dg-options "-O0" }
+// { dg-timeout 10 }
+// { dg-do run { target { i?86-*-* x86_64-*-* } } }
+// { dg-add-options libatomic_16b }
+
+#include 
+#include 
+#include 
+
+template
+void __attribute__((noinline,noipa))
+fill_padding(T& f)
+{
+  T mask;
+  __builtin_memset(, 0xff, sizeof(T));
+  __builtin_clear_padding();
+  unsigned char* ptr_f = (unsigned char*)
+  unsigned char* ptr_mask = (unsigned char*)
+  for (int i = 0; i < sizeof(T); i++)
+  {
+if (ptr_mask[i] == 0x00)
+{
+  ptr_f[i] = 0xff;
+}
+  }
+}
+
+void
+test01()
+{
+  // test for long double with padding (float80)
+  if constexpr (std::numeric_limits::digits == 64)
+  {
+long double f = 0.5f; // long double may contains padding on X86
+fill_padding(f);
+std::atomic as{ f }; // padding cleared on constructor
+long double t = 1.5;
+
+as.fetch_add(t);
+long double s = f + t;
+t = as.load();
+VERIFY(s == t); // padding ignored on float comparing
+fill_padding(s);
+VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
+fill_padding(f);
+VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
+  }
+}
+
+int main()
+{
+  test01();
+}
diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp
b/libstdc++-v3/testsuite/lib/dg-options.exp
index 850442b6b..75b27a136 100644
--- a/libstdc++-v3/testsuite/lib/dg-options.exp
+++ b/libstdc++-v3/testsuite/lib/dg-options.exp
@@ -356,6 +356,28 @@ proc add_options_for_libatomic { flags } {
 return $flags
 }

+# Add option to link to libatomic, if required for atomics on 16-byte (128-bit)
+proc add_options_for_libatomic_16b { flags } {
+if { ([istarget i?86-*-*] || [istarget x86_64-*-*])
+   } {
+ global TOOL_OPTIONS
+
+ set link_flags ""
+ if ![is_remote host] {
+ if [info exists TOOL_OPTIONS] {
+ set link_flags "[atomic_link_flags [get_multilibs ${TOOL_OPTIONS}]]"
+ } else {
+ set link_flags "[atomic_link_flags [get_multilibs]]"
+ }
+ }
+
+ append link_flags " -latomic "
+
+ return "$flags $link_flags"
+}
+return $flags
+}
+
 # Add options to enable use of deprecated features.
 proc add_options_for_using-deprecated { flags } {
 return "$flags -U_GLIBCXX_USE_DEPRECATED -D_GLIBCXX_USE_DEPRECATED=1"
-- 
2.25.1


[PING][PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor

2024-01-30 Thread xndcn
Ping, thanks.
I do not have access to the repo, anyone can please help me submit the
patch? Thanks.

xndcn  于2024年1月17日周三 00:16写道:
>
> Sorry about the mangled content...
> So I add a new add-options for libatomic_16b:
>
> ---
> libstdc++-v3/ChangeLog:
>
> * include/bits/atomic_base.h: add __builtin_clear_padding in
> __atomic_float constructor.
> * testsuite/lib/dg-options.exp: add new add-options for libatomic_16b.
> * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
> ---
>  libstdc++-v3/include/bits/atomic_base.h   |  7 ++-
>  .../atomic_float/compare_exchange_padding.cc  | 50 +++
>  libstdc++-v3/testsuite/lib/dg-options.exp | 22 
>  3 files changed, 78 insertions(+), 1 deletion(-)
>  create mode 100644
> libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
>
> diff --git a/libstdc++-v3/include/bits/atomic_base.h
> b/libstdc++-v3/include/bits/atomic_base.h
> index f4ce0fa53..104ddfdbe 100644
> --- a/libstdc++-v3/include/bits/atomic_base.h
> +++ b/libstdc++-v3/include/bits/atomic_base.h
> @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>constexpr
>__atomic_float(_Fp __t) : _M_fp(__t)
> -  { }
> +  {
> +#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
> + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
> +   __builtin_clear_padding(std::__addressof(_M_fp));
> +#endif
> +  }
>
>__atomic_float(const __atomic_float&) = delete;
>__atomic_float& operator=(const __atomic_float&) = delete;
> diff --git 
> a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> new file mode 100644
> index 0..d538b3d55
> --- /dev/null
> +++ 
> b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> @@ -0,0 +1,50 @@
> +// { dg-do run { target c++20 } }
> +// { dg-options "-O0" }
> +// { dg-timeout 10 }
> +// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
> +// { dg-do run { target { ia32 || x86_64-*-* } } }
> +// { dg-add-options libatomic_16b }
> +
> +#include 
> +#include 
> +
> +template
> +void __attribute__((noinline,noipa))
> +fill_padding(T& f)
> +{
> +  T mask;
> +  __builtin_memset(, 0xff, sizeof(T));
> +  __builtin_clear_padding();
> +  unsigned char* ptr_f = (unsigned char*)
> +  unsigned char* ptr_mask = (unsigned char*)
> +  for (int i = 0; i < sizeof(T); i++)
> +  {
> +if (ptr_mask[i] == 0x00)
> +{
> +  ptr_f[i] = 0xff;
> +}
> +  }
> +}
> +
> +void
> +test01()
> +{
> +  long double f = 0.5f; // long double may contains padding on X86
> +  fill_padding(f);
> +  std::atomic as{ f }; // padding cleared on constructor
> +  long double t = 1.5;
> +
> +  as.fetch_add(t);
> +  long double s = f + t;
> +  t = as.load();
> +  VERIFY(s == t); // padding ignored on float comparing
> +  fill_padding(s);
> +  VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
> +  fill_padding(f);
> +  VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
> +}
> +
> +int main()
> +{
> +  test01();
> +}
> diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp
> b/libstdc++-v3/testsuite/lib/dg-options.exp
> index 850442b6b..25da20d58 100644
> --- a/libstdc++-v3/testsuite/lib/dg-options.exp
> +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> @@ -356,6 +356,28 @@ proc add_options_for_libatomic { flags } {
>  return $flags
>  }
>
> +# Add option to link to libatomic, if required for atomics on 16-byte 
> (128-bit)
> +proc add_options_for_libatomic_16b { flags } {
> +if { ([istarget i?86-*-*] || [istarget x86_64-*-*])
> +   } {
> + global TOOL_OPTIONS
> +
> + set link_flags ""
> + if ![is_remote host] {
> + if [info exists TOOL_OPTIONS] {
> + set link_flags "[atomic_link_flags [get_multilibs ${TOOL_OPTIONS}]]"
> + } else {
> + set link_flags "[atomic_link_flags [get_multilibs]]"
> + }
> + }
> +
> + append link_flags " -latomic "
> +
> + return "$flags $link_flags"
> +}
> +return $flags
> +}
> +
>  # Add options to enable use of deprecated features.
>  proc add_options_for_using-deprecated { flags } {
>  return "$flags -U_GLIBCXX_USE_DEPRECATED -D_GLIBCXX_USE_DEPRECATED=1"
> --
> 2.25.1
>
>
> Xi Ruoyao  于2024年1月16日周二 18:12写道:
> >
> > On Tue, 2024-01-16 at 17:53 +0800, xndcn wrote:
> > > Thanks, so

Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor

2024-01-24 Thread xndcn
Hi, is it OK for trunk? I do not have access to the repo, can you
please help me submit the patch? Thanks.

xndcn  于2024年1月17日周三 00:16写道:
>
> Sorry about the mangled content...
> So I add a new add-options for libatomic_16b:
>
> ---
> libstdc++-v3/ChangeLog:
>
> * include/bits/atomic_base.h: add __builtin_clear_padding in
> __atomic_float constructor.
> * testsuite/lib/dg-options.exp: add new add-options for libatomic_16b.
> * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
> ---
>  libstdc++-v3/include/bits/atomic_base.h   |  7 ++-
>  .../atomic_float/compare_exchange_padding.cc  | 50 +++
>  libstdc++-v3/testsuite/lib/dg-options.exp | 22 
>  3 files changed, 78 insertions(+), 1 deletion(-)
>  create mode 100644
> libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
>
> diff --git a/libstdc++-v3/include/bits/atomic_base.h
> b/libstdc++-v3/include/bits/atomic_base.h
> index f4ce0fa53..104ddfdbe 100644
> --- a/libstdc++-v3/include/bits/atomic_base.h
> +++ b/libstdc++-v3/include/bits/atomic_base.h
> @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>constexpr
>__atomic_float(_Fp __t) : _M_fp(__t)
> -  { }
> +  {
> +#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
> + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
> +   __builtin_clear_padding(std::__addressof(_M_fp));
> +#endif
> +  }
>
>__atomic_float(const __atomic_float&) = delete;
>__atomic_float& operator=(const __atomic_float&) = delete;
> diff --git 
> a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> new file mode 100644
> index 0..d538b3d55
> --- /dev/null
> +++ 
> b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> @@ -0,0 +1,50 @@
> +// { dg-do run { target c++20 } }
> +// { dg-options "-O0" }
> +// { dg-timeout 10 }
> +// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
> +// { dg-do run { target { ia32 || x86_64-*-* } } }
> +// { dg-add-options libatomic_16b }
> +
> +#include 
> +#include 
> +
> +template
> +void __attribute__((noinline,noipa))
> +fill_padding(T& f)
> +{
> +  T mask;
> +  __builtin_memset(, 0xff, sizeof(T));
> +  __builtin_clear_padding();
> +  unsigned char* ptr_f = (unsigned char*)
> +  unsigned char* ptr_mask = (unsigned char*)
> +  for (int i = 0; i < sizeof(T); i++)
> +  {
> +if (ptr_mask[i] == 0x00)
> +{
> +  ptr_f[i] = 0xff;
> +}
> +  }
> +}
> +
> +void
> +test01()
> +{
> +  long double f = 0.5f; // long double may contains padding on X86
> +  fill_padding(f);
> +  std::atomic as{ f }; // padding cleared on constructor
> +  long double t = 1.5;
> +
> +  as.fetch_add(t);
> +  long double s = f + t;
> +  t = as.load();
> +  VERIFY(s == t); // padding ignored on float comparing
> +  fill_padding(s);
> +  VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
> +  fill_padding(f);
> +  VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
> +}
> +
> +int main()
> +{
> +  test01();
> +}
> diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp
> b/libstdc++-v3/testsuite/lib/dg-options.exp
> index 850442b6b..25da20d58 100644
> --- a/libstdc++-v3/testsuite/lib/dg-options.exp
> +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> @@ -356,6 +356,28 @@ proc add_options_for_libatomic { flags } {
>  return $flags
>  }
>
> +# Add option to link to libatomic, if required for atomics on 16-byte 
> (128-bit)
> +proc add_options_for_libatomic_16b { flags } {
> +if { ([istarget i?86-*-*] || [istarget x86_64-*-*])
> +   } {
> + global TOOL_OPTIONS
> +
> + set link_flags ""
> + if ![is_remote host] {
> + if [info exists TOOL_OPTIONS] {
> + set link_flags "[atomic_link_flags [get_multilibs ${TOOL_OPTIONS}]]"
> + } else {
> + set link_flags "[atomic_link_flags [get_multilibs]]"
> + }
> + }
> +
> + append link_flags " -latomic "
> +
> + return "$flags $link_flags"
> +}
> +return $flags
> +}
> +
>  # Add options to enable use of deprecated features.
>  proc add_options_for_using-deprecated { flags } {
>  return "$flags -U_GLIBCXX_USE_DEPRECATED -D_GLIBCXX_USE_DEPRECATED=1"
> --
> 2.25.1
>
>
> Xi Ruoyao  于2024年1月16日周二 18:12写道:
> >
> > On Tue, 2024-01-16 at 17:53 +0800, xndcn wrote:
> > > Thanks, so

Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor

2024-01-16 Thread xndcn
Sorry about the mangled content...
So I add a new add-options for libatomic_16b:

---
libstdc++-v3/ChangeLog:

* include/bits/atomic_base.h: add __builtin_clear_padding in
__atomic_float constructor.
* testsuite/lib/dg-options.exp: add new add-options for libatomic_16b.
* testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
---
 libstdc++-v3/include/bits/atomic_base.h   |  7 ++-
 .../atomic_float/compare_exchange_padding.cc  | 50 +++
 libstdc++-v3/testsuite/lib/dg-options.exp | 22 
 3 files changed, 78 insertions(+), 1 deletion(-)
 create mode 100644
libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc

diff --git a/libstdc++-v3/include/bits/atomic_base.h
b/libstdc++-v3/include/bits/atomic_base.h
index f4ce0fa53..104ddfdbe 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

   constexpr
   __atomic_float(_Fp __t) : _M_fp(__t)
-  { }
+  {
+#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
+ if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
+   __builtin_clear_padding(std::__addressof(_M_fp));
+#endif
+  }

   __atomic_float(const __atomic_float&) = delete;
   __atomic_float& operator=(const __atomic_float&) = delete;
diff --git 
a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
new file mode 100644
index 0..d538b3d55
--- /dev/null
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
@@ -0,0 +1,50 @@
+// { dg-do run { target c++20 } }
+// { dg-options "-O0" }
+// { dg-timeout 10 }
+// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
+// { dg-do run { target { ia32 || x86_64-*-* } } }
+// { dg-add-options libatomic_16b }
+
+#include 
+#include 
+
+template
+void __attribute__((noinline,noipa))
+fill_padding(T& f)
+{
+  T mask;
+  __builtin_memset(, 0xff, sizeof(T));
+  __builtin_clear_padding();
+  unsigned char* ptr_f = (unsigned char*)
+  unsigned char* ptr_mask = (unsigned char*)
+  for (int i = 0; i < sizeof(T); i++)
+  {
+if (ptr_mask[i] == 0x00)
+{
+  ptr_f[i] = 0xff;
+}
+  }
+}
+
+void
+test01()
+{
+  long double f = 0.5f; // long double may contains padding on X86
+  fill_padding(f);
+  std::atomic as{ f }; // padding cleared on constructor
+  long double t = 1.5;
+
+  as.fetch_add(t);
+  long double s = f + t;
+  t = as.load();
+  VERIFY(s == t); // padding ignored on float comparing
+  fill_padding(s);
+  VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
+  fill_padding(f);
+  VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
+}
+
+int main()
+{
+  test01();
+}
diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp
b/libstdc++-v3/testsuite/lib/dg-options.exp
index 850442b6b..25da20d58 100644
--- a/libstdc++-v3/testsuite/lib/dg-options.exp
+++ b/libstdc++-v3/testsuite/lib/dg-options.exp
@@ -356,6 +356,28 @@ proc add_options_for_libatomic { flags } {
 return $flags
 }

+# Add option to link to libatomic, if required for atomics on 16-byte (128-bit)
+proc add_options_for_libatomic_16b { flags } {
+if { ([istarget i?86-*-*] || [istarget x86_64-*-*])
+   } {
+ global TOOL_OPTIONS
+
+ set link_flags ""
+ if ![is_remote host] {
+ if [info exists TOOL_OPTIONS] {
+ set link_flags "[atomic_link_flags [get_multilibs ${TOOL_OPTIONS}]]"
+ } else {
+ set link_flags "[atomic_link_flags [get_multilibs]]"
+ }
+ }
+
+ append link_flags " -latomic "
+
+ return "$flags $link_flags"
+}
+return $flags
+}
+
 # Add options to enable use of deprecated features.
 proc add_options_for_using-deprecated { flags } {
 return "$flags -U_GLIBCXX_USE_DEPRECATED -D_GLIBCXX_USE_DEPRECATED=1"
-- 
2.25.1


Xi Ruoyao  于2024年1月16日周二 18:12写道:
>
> On Tue, 2024-01-16 at 17:53 +0800, xndcn wrote:
> > Thanks, so I add a test: atomic_float/compare_exchange_padding.cc,
> > which will fail due to timeout without the patch.
>
> Please resend in plain text instead of HTML.  Sending in HTML causes the
> patch mangled.
>
> And libstdc++ patches should CC libstd...@gcc.gnu.org, in addition to
> gcc-patches@gcc.gnu.org.
>
> > ---
> > libstdc++-v3/ChangeLog:
> >
> >  * include/bits/atomic_base.h: add __builtin_clear_padding in 
> > __atomic_float constructor.
> >  * testsuite/lib/dg-options.exp: enable libatomic for IA32 and X86-64.
> >  * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
> > ---
> >  libstdc++-v3/include/bits/atomic_base.h   |  7 ++-
> >  .../atomic_float/compare_exchange_padding.cc  | 50 +++

Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor

2024-01-16 Thread xndcn
Thanks, so I add a test: atomic_float/compare_exchange_padding.cc, which
will fail due to timeout without the patch.

---
libstdc++-v3/ChangeLog:

 * include/bits/atomic_base.h: add __builtin_clear_padding in
__atomic_float constructor.
 * testsuite/lib/dg-options.exp: enable libatomic for IA32 and X86-64.
 * testsuite/29_atomics/atomic_float/compare_exchange_padding.cc: New test.
---
 libstdc++-v3/include/bits/atomic_base.h | 7 ++-
 .../atomic_float/compare_exchange_padding.cc | 50 +++
 libstdc++-v3/testsuite/lib/dg-options.exp | 1 +
 3 files changed, 57 insertions(+), 1 deletion(-)
 create mode 100644
libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc

diff --git a/libstdc++-v3/include/bits/atomic_base.h
b/libstdc++-v3/include/bits/atomic_base.h
index d3a2c4f3805..cbe3749e17f 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

   constexpr
   __atomic_float(_Fp __t) : _M_fp(__t)
- { }
+ {
+#if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
+ if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
+ __builtin_clear_padding(std::__addressof(_M_fp));
+#endif
+ }

   __atomic_float(const __atomic_float&) = delete;
   __atomic_float& operator=(const __atomic_float&) = delete;
diff --git
a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
new file mode 100644
index 000..9376ab22850
--- /dev/null
+++
b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
@@ -0,0 +1,50 @@
+// { dg-do run { target c++20 } }
+// { dg-options "-O0" }
+// { dg-timeout 10 }
+// { dg-additional-options "-mlong-double-80" { target x86_64-*-* } }
+// { dg-do run { target { ia32 || x86_64-*-* } } }
+// { dg-add-options libatomic }
+
+#include 
+#include 
+
+template
+void __attribute__((noinline,noipa))
+fill_padding(T& f)
+{
+ T mask;
+ __builtin_memset(, 0xff, sizeof(T));
+ __builtin_clear_padding();
+ unsigned char* ptr_f = (unsigned char*)
+ unsigned char* ptr_mask = (unsigned char*)
+ for (int i = 0; i < sizeof(T); i++)
+ {
+ if (ptr_mask[i] == 0x00)
+ {
+ ptr_f[i] = 0xff;
+ }
+ }
+}
+
+void
+test01()
+{
+ long double f = 0.5f; // long double may contains padding on X86
+ fill_padding(f);
+ std::atomic as{ f }; // padding cleared on constructor
+ long double t = 1.5;
+
+ as.fetch_add(t);
+ long double s = f + t;
+ t = as.load();
+ VERIFY(s == t); // padding ignored on float comparing
+ fill_padding(s);
+ VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg
+ fill_padding(f);
+ VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg
+}
+
+int main()
+{
+ test01();
+}
diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp
b/libstdc++-v3/testsuite/lib/dg-options.exp
index bc387d17ed7..d9a19dadd7f 100644
--- a/libstdc++-v3/testsuite/lib/dg-options.exp
+++ b/libstdc++-v3/testsuite/lib/dg-options.exp
@@ -337,6 +337,7 @@ proc add_options_for_libatomic { flags } {
   || ([istarget powerpc*-*-*] && [check_effective_target_ilp32])
   || [istarget riscv*-*-*]
   || ([istarget sparc*-*-linux-gnu] && [check_effective_target_ilp32])
+ || ([istarget i?86-*-*] || [istarget x86_64-*-*])
} {
  global TOOL_OPTIONS

-- 
2.25.1

H.J. Lu  于2024年1月15日周一 11:46写道:

> On Sun, Jan 7, 2024, 5:02 PM xndcn  wrote:
>
>> Hi, I found __atomic_float constructor does not clear padding,
>> while __compare_exchange assumes it as zeroed padding. So it is easy to
>> reproducing a infinite loop in X86-64 with long double type like:
>> ---
>> -O0 -std=c++23 -mlong-double-80
>> #include 
>> #include 
>>
>> #define T long double
>> int main() {
>> std::atomic t(0.5);
>> t.fetch_add(0.5);
>> float x = t;
>> printf("%f\n", x);
>> }
>> ---
>>
>> So we should add __builtin_clear_padding in __atomic_float constructor,
>> just like the generic atomic struct.
>>
>> regtested on x86_64-linux. Is it OK for trunk?
>>
>> ---
>> libstdc++: atomic: Add missing clear_padding in __atomic_float
>> constructor.
>>
>> libstdc++-v3/ChangeLog:
>>
>> * include/bits/atomic_base.h: add __builtin_clear_padding in
>> __atomic_float constructor.
>> ---
>>  libstdc++-v3/include/bits/atomic_base.h | 7 ++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/libstdc++-v3/include/bits/atomic_base.h
>> b/libstdc++-v3/include/bits/atomic_base.h
>> index f4ce0fa53..d59c2209e 100644
>> --- a/libstdc++-v3/include/bits/atomic_base.h
>> +++ b/libstdc++-v3/include/bits/atomic_base.h
>&

[PING][PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor

2024-01-14 Thread xndcn
Ping. Thanks.

xndcn  于2024年1月8日周一 09:01写道:

> Hi, I found __atomic_float constructor does not clear padding,
> while __compare_exchange assumes it as zeroed padding. So it is easy to
> reproducing a infinite loop in X86-64 with long double type like:
> ---
> -O0 -std=c++23 -mlong-double-80
> #include 
> #include 
>
> #define T long double
> int main() {
> std::atomic t(0.5);
> t.fetch_add(0.5);
> float x = t;
> printf("%f\n", x);
> }
> ---
>
> So we should add __builtin_clear_padding in __atomic_float constructor,
> just like the generic atomic struct.
>
> regtested on x86_64-linux. Is it OK for trunk?
>
> ---
> libstdc++: atomic: Add missing clear_padding in __atomic_float constructor.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/atomic_base.h: add __builtin_clear_padding in
> __atomic_float constructor.
> ---
>  libstdc++-v3/include/bits/atomic_base.h | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/include/bits/atomic_base.h
> b/libstdc++-v3/include/bits/atomic_base.h
> index f4ce0fa53..d59c2209e 100644
> --- a/libstdc++-v3/include/bits/atomic_base.h
> +++ b/libstdc++-v3/include/bits/atomic_base.h
> @@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>constexpr
>__atomic_float(_Fp __t) : _M_fp(__t)
> -  { }
> +  {
> +#if __has_builtin(__builtin_clear_padding)
> + if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
> +  __builtin_clear_padding(std::__addressof(_M_fp));
> +#endif
> +  }
>
>__atomic_float(const __atomic_float&) = delete;
>__atomic_float& operator=(const __atomic_float&) = delete;
> --
> 2.25.1
>


[PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor

2024-01-07 Thread xndcn
Hi, I found __atomic_float constructor does not clear padding,
while __compare_exchange assumes it as zeroed padding. So it is easy to
reproducing a infinite loop in X86-64 with long double type like:
---
-O0 -std=c++23 -mlong-double-80
#include 
#include 

#define T long double
int main() {
std::atomic t(0.5);
t.fetch_add(0.5);
float x = t;
printf("%f\n", x);
}
---

So we should add __builtin_clear_padding in __atomic_float constructor,
just like the generic atomic struct.

regtested on x86_64-linux. Is it OK for trunk?

---
libstdc++: atomic: Add missing clear_padding in __atomic_float constructor.

libstdc++-v3/ChangeLog:

* include/bits/atomic_base.h: add __builtin_clear_padding in __atomic_float
constructor.
---
 libstdc++-v3/include/bits/atomic_base.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/bits/atomic_base.h
b/libstdc++-v3/include/bits/atomic_base.h
index f4ce0fa53..d59c2209e 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -1283,7 +1283,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

   constexpr
   __atomic_float(_Fp __t) : _M_fp(__t)
-  { }
+  {
+#if __has_builtin(__builtin_clear_padding)
+ if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Fp>())
+  __builtin_clear_padding(std::__addressof(_M_fp));
+#endif
+  }

   __atomic_float(const __atomic_float&) = delete;
   __atomic_float& operator=(const __atomic_float&) = delete;
-- 
2.25.1


Re: Ping: [PATCH] enable ATOMIC_COMPARE_EXCHANGE opt for floating type or types contains padding

2024-01-04 Thread xndcn
Thanks! I am trying to re-write by calling __builtin_clear_padding. But I
found gimple_fold_builtin_clear_padding seems only work before SSA pass.
Should I remove the assertion?

On the other hand, since ATOMIC_COMPARE_EXCHANGE will only work for simple
reg type. excluding vector or complex types, is it enough to generate the
mask by simple bit operations?

Thanks!,
xndcn

---
static bool
gimple_fold_builtin_clear_padding (gimple_stmt_iterator *gsi)
{ ...
  /* This should be folded during the lower pass.  */
  gcc_assert (!gimple_in_ssa_p (cfun) && cfun->cfg == NULL);


Jakub Jelinek  于2024年1月3日周三 23:52写道:

> On Wed, Jan 03, 2024 at 11:42:58PM +0800, xndcn wrote:
> > Hi, I am new to this, and I really need your advice, thanks.
> >
> > I noticed PR71716 and I want to enable ATOMIC_COMPARE_EXCHANGE
> > internal-fn optimization
> >
> > for floating type or types contains padding (e.g., long double).
> > Please correct me if I happen to
> > make any mistakes, Thanks!
> >
> > Firstly, about the concerns of sNaNs float/doduble value, it seems
> > work well and shall have been
> > covered by testsuite/gcc.dg/atomic/c11-atomic-exec-5.c
> >
> > Secondly, since ATOMIC_COMPARE_EXCHANGE is only enabled when expected
> > var is only addressable
> > because of the call, the padding bits can not be modified by any other
> > stmts. So we can save all
> > bits after ATOMIC_COMPARE_EXCHANGE call and extract the padding bits.
> > After first iteration, the
> > extracted padding bits can be mixed with the expected var.
> >
> > Bootstrapped/regtested on x86_64-linux.
> >
> > I did some benchmarks, and there is some significant time optimization
> > for float/double types,
> >
> > while there is no regression for long double type.
>
> If anything, this should be using clear_padding_type_may_have_padding_p
> and call to __builtin_clear_padding.
> Code in that file uses /* ... */ style comments, so please use them instead
> of // for consistency, and furthermore comments should be terminated with
> a dot and two spaces before */
>
> Also, I don't think this is late stage3 material, so needs to wait for GCC
> 15.
>
> Jakub
>
>


Ping: [PATCH] enable ATOMIC_COMPARE_EXCHANGE opt for floating type or types contains padding

2024-01-03 Thread xndcn
Hi, I am new to this, and I really need your advice, thanks.

I noticed PR71716 and I want to enable ATOMIC_COMPARE_EXCHANGE
internal-fn optimization

for floating type or types contains padding (e.g., long double).
Please correct me if I happen to
make any mistakes, Thanks!

Firstly, about the concerns of sNaNs float/doduble value, it seems
work well and shall have been
covered by testsuite/gcc.dg/atomic/c11-atomic-exec-5.c

Secondly, since ATOMIC_COMPARE_EXCHANGE is only enabled when expected
var is only addressable
because of the call, the padding bits can not be modified by any other
stmts. So we can save all
bits after ATOMIC_COMPARE_EXCHANGE call and extract the padding bits.
After first iteration, the
extracted padding bits can be mixed with the expected var.

Bootstrapped/regtested on x86_64-linux.

I did some benchmarks, and there is some significant time optimization
for float/double types,

while there is no regression for long double type.

Thanks,

xndcn


gcc/ChangeLog:

* gimple-fold.cc (optimize_atomic_compare_exchange_p): enable
for SCALAR_FLOAT_TYPE_P type of expected var, or if TYPE_PRECISION
is different from mode's bitsize
(fold_builtin_atomic_compare_exchange): if TYPE_PRECISION is
different from mode's bitsize, try to keep track all the bits and
mix it with VIEW_CONVERT_EXPR(expected).

Signed-off-by: xndcn 
---
 gcc/gimple-fold.cc | 77 ++
 1 file changed, 71 insertions(+), 6 deletions(-)

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index cb4b57250..321ff4f41 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -5306,12 +5306,7 @@ optimize_atomic_compare_exchange_p (gimple *stmt)
   || !auto_var_in_fn_p (TREE_OPERAND (expected, 0), current_function_decl)
   || TREE_THIS_VOLATILE (etype)
   || VECTOR_TYPE_P (etype)
-  || TREE_CODE (etype) == COMPLEX_TYPE
-  /* Don't optimize floating point expected vars, VIEW_CONVERT_EXPRs
-might not preserve all the bits.  See PR71716.  */
-  || SCALAR_FLOAT_TYPE_P (etype)
-  || maybe_ne (TYPE_PRECISION (etype),
-  GET_MODE_BITSIZE (TYPE_MODE (etype
+  || TREE_CODE (etype) == COMPLEX_TYPE)
 return false;

   tree weak = gimple_call_arg (stmt, 3);
@@ -5350,8 +5345,10 @@ fold_builtin_atomic_compare_exchange
(gimple_stmt_iterator *gsi)
   tree itype = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (parmt)));
   tree ctype = build_complex_type (itype);
   tree expected = TREE_OPERAND (gimple_call_arg (stmt, 1), 0);
+  tree etype = TREE_TYPE (expected);
   bool throws = false;
   edge e = NULL;
+  tree allbits = NULL_TREE;
   gimple *g = gimple_build_assign (make_ssa_name (TREE_TYPE (expected)),
   expected);
   gsi_insert_before (gsi, g, GSI_SAME_STMT);
@@ -5362,6 +5359,67 @@ fold_builtin_atomic_compare_exchange
(gimple_stmt_iterator *gsi)
   build1 (VIEW_CONVERT_EXPR, itype,
   gimple_assign_lhs (g)));
   gsi_insert_before (gsi, g, GSI_SAME_STMT);
+
+  // VIEW_CONVERT_EXPRs might not preserve all the bits.  See PR71716.
+  // so we have to keep track all bits here.
+  if (maybe_ne (TYPE_PRECISION (etype),
+   GET_MODE_BITSIZE (TYPE_MODE (etype
+   {
+ gimple_stmt_iterator cgsi
+   = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)));
+ allbits = create_tmp_var (itype);
+ // allbits is initialized to 0, which can be ignored first time
+ gimple *init_stmt
+   = gimple_build_assign (allbits, build_int_cst (itype, 0));
+ gsi_insert_before (, init_stmt, GSI_SAME_STMT);
+ tree maskbits = create_tmp_var (itype);
+ // maskbits is initialized to full 1 (0xFFF...)
+ init_stmt = gimple_build_assign (maskbits, build1 (BIT_NOT_EXPR,
+itype, allbits));
+ gsi_insert_before (, init_stmt, GSI_SAME_STMT);
+
+ // g = g & maskbits
+ g = gimple_build_assign (make_ssa_name (itype),
+  build2 (BIT_AND_EXPR, itype,
+  gimple_assign_lhs (g), maskbits));
+ gsi_insert_before (gsi, g, GSI_SAME_STMT);
+
+ gimple *def_mask = gimple_build_assign (
+   make_ssa_name (itype),
+   build2 (LSHIFT_EXPR, itype, build_int_cst (itype, 1),
+   build_int_cst (itype, TYPE_PRECISION (etype;
+ gsi_insert_before (gsi, def_mask, GSI_SAME_STMT);
+ def_mask = gimple_build_assign (make_ssa_name (itype),
+ build2 (MINUS_EXPR, itype,
+ gimple_assign_lhs (def_mask),
+ build_int_cst (itype, 1)));
+ gsi_insert_before (gsi, def_mask, GSI_SAME_

[PING][PATCH] enable ATOMIC_COMPARE_EXCHANGE opt for floating type or types contains padding

2023-12-26 Thread xndcn
Ping, thanks.

I did some benchmarks, and there is some significant time optimization for 
float/double types,
while there is no regression for long double type.



[PATCH] gimple-fold.cc: enable ATOMIC_COMPARE_EXCHANGE opt for floating type or types contain padding

2023-12-18 Thread xndcn
gcc/ChangeLog:

* gimple-fold.cc (optimize_atomic_compare_exchange_p): enable
for SCALAR_FLOAT_TYPE_P type of expected var, or if TYPE_PRECISION
is different from mode's bitsize
(fold_builtin_atomic_compare_exchange): if TYPE_PRECISION is
different from mode's bitsize, try to keep track all the bits and
mix it with VIEW_CONVERT_EXPR(expected).

Signed-off-by: xndcn 
---
 gcc/gimple-fold.cc | 77 ++
 1 file changed, 71 insertions(+), 6 deletions(-)

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index cb4b57250..321ff4f41 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -5306,12 +5306,7 @@ optimize_atomic_compare_exchange_p (gimple *stmt)
   || !auto_var_in_fn_p (TREE_OPERAND (expected, 0), current_function_decl)
   || TREE_THIS_VOLATILE (etype)
   || VECTOR_TYPE_P (etype)
-  || TREE_CODE (etype) == COMPLEX_TYPE
-  /* Don't optimize floating point expected vars, VIEW_CONVERT_EXPRs
-might not preserve all the bits.  See PR71716.  */
-  || SCALAR_FLOAT_TYPE_P (etype)
-  || maybe_ne (TYPE_PRECISION (etype),
-  GET_MODE_BITSIZE (TYPE_MODE (etype
+  || TREE_CODE (etype) == COMPLEX_TYPE)
 return false;
 
   tree weak = gimple_call_arg (stmt, 3);
@@ -5350,8 +5345,10 @@ fold_builtin_atomic_compare_exchange 
(gimple_stmt_iterator *gsi)
   tree itype = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (parmt)));
   tree ctype = build_complex_type (itype);
   tree expected = TREE_OPERAND (gimple_call_arg (stmt, 1), 0);
+  tree etype = TREE_TYPE (expected);
   bool throws = false;
   edge e = NULL;
+  tree allbits = NULL_TREE;
   gimple *g = gimple_build_assign (make_ssa_name (TREE_TYPE (expected)),
   expected);
   gsi_insert_before (gsi, g, GSI_SAME_STMT);
@@ -5362,6 +5359,67 @@ fold_builtin_atomic_compare_exchange 
(gimple_stmt_iterator *gsi)
   build1 (VIEW_CONVERT_EXPR, itype,
   gimple_assign_lhs (g)));
   gsi_insert_before (gsi, g, GSI_SAME_STMT);
+
+  // VIEW_CONVERT_EXPRs might not preserve all the bits.  See PR71716.
+  // so we have to keep track all bits here.
+  if (maybe_ne (TYPE_PRECISION (etype),
+   GET_MODE_BITSIZE (TYPE_MODE (etype
+   {
+ gimple_stmt_iterator cgsi
+   = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)));
+ allbits = create_tmp_var (itype);
+ // allbits is initialized to 0, which can be ignored first time
+ gimple *init_stmt
+   = gimple_build_assign (allbits, build_int_cst (itype, 0));
+ gsi_insert_before (, init_stmt, GSI_SAME_STMT);
+ tree maskbits = create_tmp_var (itype);
+ // maskbits is initialized to full 1 (0xFFF...)
+ init_stmt = gimple_build_assign (maskbits, build1 (BIT_NOT_EXPR,
+itype, allbits));
+ gsi_insert_before (, init_stmt, GSI_SAME_STMT);
+
+ // g = g & maskbits
+ g = gimple_build_assign (make_ssa_name (itype),
+  build2 (BIT_AND_EXPR, itype,
+  gimple_assign_lhs (g), maskbits));
+ gsi_insert_before (gsi, g, GSI_SAME_STMT);
+
+ gimple *def_mask = gimple_build_assign (
+   make_ssa_name (itype),
+   build2 (LSHIFT_EXPR, itype, build_int_cst (itype, 1),
+   build_int_cst (itype, TYPE_PRECISION (etype;
+ gsi_insert_before (gsi, def_mask, GSI_SAME_STMT);
+ def_mask = gimple_build_assign (make_ssa_name (itype),
+ build2 (MINUS_EXPR, itype,
+ gimple_assign_lhs (def_mask),
+ build_int_cst (itype, 1)));
+ gsi_insert_before (gsi, def_mask, GSI_SAME_STMT);
+ // maskbits = (1 << TYPE_PRECISION (etype)) - 1
+ def_mask = gimple_build_assign (maskbits, SSA_NAME,
+ gimple_assign_lhs (def_mask));
+ gsi_insert_before (gsi, def_mask, GSI_SAME_STMT);
+
+ // paddingbits = (~maskbits) & allbits
+ def_mask
+   = gimple_build_assign (make_ssa_name (itype),
+  build1 (BIT_NOT_EXPR, itype,
+  gimple_assign_lhs (def_mask)));
+ gsi_insert_before (gsi, def_mask, GSI_SAME_STMT);
+ def_mask
+   = gimple_build_assign (make_ssa_name (itype),
+  build2 (BIT_AND_EXPR, itype, allbits,
+  gimple_assign_lhs (def_mask)));
+ gsi_insert_before (gsi, def_mask, GSI_SAME_STMT);
+
+ // g = g | paddingbits, i.e.,
+ // g = (VIEW_CONVERT_EXPR(expected) & maskbits)
+ /

[PATCH] enable ATOMIC_COMPARE_EXCHANGE opt for floating type or types contains padding

2023-12-18 Thread xndcn


Hi, I noticed PR71716 and I want to enable ATOMIC_COMPARE_EXCHANGE internal-fn 
optimization
for floating type or types contains padding (e.g., long double). Please tell me 
if I happen to
make any mistakes, Thanks!

Firstly, about the concerns of sNaNs float/double value, it seems work well and 
shall have been
covered by testsuite/gcc.dg/atomic/c11-atomic-exec-5.c

Secondly, since ATOMIC_COMPARE_EXCHANGE is only enabled when expected var is 
only addressable
because of the call, the padding bits can not be modified by any other stmts. 
So we can save all
bits after ATOMIC_COMPARE_EXCHANGE call and extract the padding bits. After 
first iteration, the
extracted padding bits can be mixed with the expected var.

Bootstrapped/regtested on x86_64-linux.



Re: [PATCH] tree-cfg: Fix misleading error message in verify_gimple_assign_single

2023-12-11 Thread xndcn
Thanks, now I have totally understand! I think it deserves a clearer
prompt, but I do not have a better idea currently. So forget it, thanks!

在 2023年12月11日星期一,Richard Biener  写道:

> On Mon, Dec 11, 2023 at 12:39 PM xndcn  wrote:
> >
> > Got it, thanks! It is really confusing >,<
> > What about the first one? For case MEM_REF.
>
> The same - the LHS determines this is a store, if it is the
> RHS is invalid as diagnosed (it needs to go through a
> temporary).
>
> Richard.
>
>
>
> > 在 2023年12月11日星期一,Richard Biener  写道:
> >>
> >> On Sun, Dec 10, 2023 at 4:00 PM xndcn  wrote:
> >> >
> >> > Hi, I am a newbie in GCC, and I do not have access to git repo.
> >> >
> >> > I found some misleading error messages in verify_gimple_assign_single
> function of tree-cfg.cc. It prompt error "invalid RHS for gimple memory
> store: ", but it checks lhs in fact.
> >>
> >> it might be a bit confusing but it's correct.  There is a store
> >> because !is_gimple_reg (lhs)
> >> and the only case !is_gimple_reg (rhs1) is correct is when this is an
> aggregate
> >> copy (!is_gimple_reg_type (TREE_TYPE (lhs))).  Otherwise the _RHS_
> needs to be
> >> a register.
> >>
> >> Richard.
>


Re: [PATCH] tree-cfg: Fix misleading error message in verify_gimple_assign_single

2023-12-11 Thread xndcn
Got it, thanks! It is really confusing >,<
What about the first one? For case MEM_REF.

在 2023年12月11日星期一,Richard Biener  写道:

> On Sun, Dec 10, 2023 at 4:00 PM xndcn  wrote:
> >
> > Hi, I am a newbie in GCC, and I do not have access to git repo.
> >
> > I found some misleading error messages in verify_gimple_assign_single
> function of tree-cfg.cc. It prompt error "invalid RHS for gimple memory
> store: ", but it checks lhs in fact.
>
> it might be a bit confusing but it's correct.  There is a store
> because !is_gimple_reg (lhs)
> and the only case !is_gimple_reg (rhs1) is correct is when this is an
> aggregate
> copy (!is_gimple_reg_type (TREE_TYPE (lhs))).  Otherwise the _RHS_ needs
> to be
> a register.
>
> Richard.
>


[PATCH] tree-cfg: Fix misleading error message in verify_gimple_assign_single.

2023-12-10 Thread xndcn
gcc/ChangeLog:

* tree-cfg.cc (verify_gimple_assign_single): Fix misleading error, from 
"invalid LHS ..." to "invalid RHS ..."

Signed-off-by: xndcn 
---
 gcc/tree-cfg.cc | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index d784b9115..f041786b3 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -4695,7 +4695,7 @@ verify_gimple_assign_single (gassign *stmt)
   if (!is_gimple_reg (lhs)
  && is_gimple_reg_type (TREE_TYPE (lhs)))
{
- error ("invalid RHS for gimple memory store: %qs", code_name);
+ error ("invalid LHS for gimple memory store: %qs", code_name);
  debug_generic_stmt (lhs);
  debug_generic_stmt (rhs1);
  return true;
@@ -4721,7 +4721,10 @@ verify_gimple_assign_single (gassign *stmt)
  && !is_gimple_reg (rhs1)
  && is_gimple_reg_type (TREE_TYPE (lhs)))
{
- error ("invalid RHS for gimple memory store: %qs", code_name);
+ if (!is_gimple_reg (rhs1))
+   error ("invalid RHS for gimple memory store: %qs", code_name);
+ else
+   error ("invalid LHS for gimple memory store: %qs", code_name);
  debug_generic_stmt (lhs);
  debug_generic_stmt (rhs1);
  return true;
-- 
2.25.1



[PATCH] tree-cfg: Fix misleading error message in verify_gimple_assign_single

2023-12-10 Thread xndcn
Sorry about the attachment, re-paste here




[PATCH] tree-cfg: Fix misleading error message in verify_gimple_assign_single

2023-12-10 Thread xndcn
Hi, I am a newbie in GCC, and I do not have access to git repo.

I found some misleading error messages in verify_gimple_assign_single
function of tree-cfg.cc. It prompt error "invalid RHS for gimple memory
store: ", but it checks lhs in fact.
From 72fc4abe635c3e35ac31457aeba92b528f0574fe Mon Sep 17 00:00:00 2001
From: xndcn 
Date: Sun, 10 Dec 2023 22:38:16 +0800
Subject: [PATCH] tree-cfg: Fix misleading error message in
 verify_gimple_assign_single.

gcc/ChangeLog:

	* tree-cfg.cc (verify_gimple_assign_single): Fix misleading error, from "invalid LHS ..." to "invalid RHS ..."

Signed-off-by: xndcn 
---
 gcc/tree-cfg.cc | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index d784b9115..f041786b3 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -4695,7 +4695,7 @@ verify_gimple_assign_single (gassign *stmt)
   if (!is_gimple_reg (lhs)
 	  && is_gimple_reg_type (TREE_TYPE (lhs)))
 	{
-	  error ("invalid RHS for gimple memory store: %qs", code_name);
+	  error ("invalid LHS for gimple memory store: %qs", code_name);
 	  debug_generic_stmt (lhs);
 	  debug_generic_stmt (rhs1);
 	  return true;
@@ -4721,7 +4721,10 @@ verify_gimple_assign_single (gassign *stmt)
 	  && !is_gimple_reg (rhs1)
 	  && is_gimple_reg_type (TREE_TYPE (lhs)))
 	{
-	  error ("invalid RHS for gimple memory store: %qs", code_name);
+	  if (!is_gimple_reg (rhs1))
+	error ("invalid RHS for gimple memory store: %qs", code_name);
+	  else
+	error ("invalid LHS for gimple memory store: %qs", code_name);
 	  debug_generic_stmt (lhs);
 	  debug_generic_stmt (rhs1);
 	  return true;
-- 
2.25.1