date:20220621

[PATCH] libstdc++-v3: check for openat

2022-06-21 Thread Alexandre Oliva via Gcc-patches



rtems6.0 has fdopendir, and fcntl.h defines AT_FDCWD and declares
openat, but there's no openat in libc.  Adjust dir-common.h to not
assume ::openat just because of AT_FDCWD.

Regstrapped on x86_64-linux-gnu (detects and still uses openat), also
tested with a cross to aarch64-rtems6 (detects openat's absence and
refrains from using it).  Ok to install?
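
For readers following along, the guard can be sketched in isolation as below.  This is a minimal illustration under stated assumptions, not the dir-common.h code: SKETCH_HAVE_OPENAT is a hypothetical stand-in for the configure result (_GLIBCXX_HAVE_OPENAT in the patch), and open_relative is an invented helper name.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical stand-in for the configure-time result.  In the patch this
// role is played by _GLIBCXX_HAVE_OPENAT; here it is simply assumed on Linux.
#if defined(__linux__)
# define SKETCH_HAVE_OPENAT 1
#endif

// Open PATHNAME relative to DIRFD when openat is known to link; otherwise
// fall back to plain open.  The fallback ignores DIRFD, so it is only
// equivalent when DIRFD is AT_FDCWD or PATHNAME is absolute.
int open_relative(int dirfd, const char* pathname, int flags)
{
#if SKETCH_HAVE_OPENAT && defined AT_FDCWD
  return ::openat(dirfd, pathname, flags);
#else
  (void) dirfd;
  return ::open(pathname, flags);
#endif
}
```

The point is only that the presence of AT_FDCWD in a header says nothing about whether openat can be linked, so both conditions are tested.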

PS: This is the last patch in my rtems6.0 patchset, and the only patch
for the filesystem-related patchset that was written specifically for a
mainline gcc.  gcc-11 did not attempt to use openat.  This patch enabled
filesystem tests to link when testing mainline on aarch64-rtems6.0.
Alas, several filesystem tests still failed with it, in ways that AFAICT
are not related to the use of openat, or to the other patches I've
posted.  However, I'm not able to look into the remaining failures right
now.


for  libstdc++-v3/ChangeLog

* acinclude.m4 (GLIBCXX_CHECK_FILESYSTEM_DEPS): Check for
openat.
* aclocal.m4, configure, config.h.in: Rebuilt.
* src/filesystem/dir-common.h (openat): Use ::openat if
_GLIBCXX_HAVE_OPENAT.
---
 libstdc++-v3/acinclude.m4                |   12 +++
 libstdc++-v3/config.h.in                 |    3 ++
 libstdc++-v3/configure                   |   55 ++
 libstdc++-v3/src/filesystem/dir-common.h |    2 +
 4 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 138bd58d86cb9..e3cc3a8e867d3 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -4772,6 +4772,18 @@ dnl
   if test $glibcxx_cv_dirfd = yes; then
 AC_DEFINE(HAVE_DIRFD, 1, [Define if dirfd is available in <dirent.h>.])
   fi
+dnl
+  AC_CACHE_CHECK([for openat],
+glibcxx_cv_openat, [dnl
+GCC_TRY_COMPILE_OR_LINK(
+  [#include <fcntl.h>],
+  [int fd = ::openat(AT_FDCWD, "", 0);],
+  [glibcxx_cv_openat=yes],
+  [glibcxx_cv_openat=no])
+  ])
+  if test $glibcxx_cv_openat = yes; then
+AC_DEFINE(HAVE_OPENAT, 1, [Define if openat is available in <fcntl.h>.])
+  fi
 dnl
   AC_CACHE_CHECK([for unlinkat],
 glibcxx_cv_unlinkat, [dnl
diff --git a/libstdc++-v3/config.h.in b/libstdc++-v3/config.h.in
index f30a8c51c458c..2a3972eef5412 100644
--- a/libstdc++-v3/config.h.in
+++ b/libstdc++-v3/config.h.in
@@ -292,6 +292,9 @@
 /* Define if <math.h> defines obsolete isnan function. */
 #undef HAVE_OBSOLETE_ISNAN
 
+/* Define if openat is available in <fcntl.h>. */
+#undef HAVE_OPENAT
+
 /* Define if poll is available in <poll.h>. */
 #undef HAVE_POLL
 
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index 9b94fd71e4248..eac6039212168 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -77177,6 +77177,61 @@ $as_echo "$glibcxx_cv_dirfd" >&6; }
 
 $as_echo "#define HAVE_DIRFD 1" >>confdefs.h
 
+  fi
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for openat" >&5
+$as_echo_n "checking for openat... " >&6; }
+if ${glibcxx_cv_openat+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  if test x$gcc_no_link = xyes; then
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <fcntl.h>
+int
+main ()
+{
+int fd = ::openat(AT_FDCWD, "", 0);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_cxx_try_compile "$LINENO"; then :
+  glibcxx_cv_openat=yes
+else
+  glibcxx_cv_openat=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+else
+  if test x$gcc_no_link = xyes; then
+  as_fn_error $? "Link tests are not allowed after GCC_NO_EXECUTABLES." "$LINENO" 5
+fi
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <fcntl.h>
+int
+main ()
+{
+int fd = ::openat(AT_FDCWD, "", 0);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_cxx_try_link "$LINENO"; then :
+  glibcxx_cv_openat=yes
+else
+  glibcxx_cv_openat=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext conftest.$ac_ext
+fi
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $glibcxx_cv_openat" >&5
+$as_echo "$glibcxx_cv_openat" >&6; }
+  if test $glibcxx_cv_openat = yes; then
+
+$as_echo "#define HAVE_OPENAT 1" >>confdefs.h
+
   fi
   { $as_echo "$as_me:${as_lineno-$LINENO}: checking for unlinkat" >&5
 $as_echo_n "checking for unlinkat... " >&6; }
diff --git a/libstdc++-v3/src/filesystem/dir-common.h b/libstdc++-v3/src/filesystem/dir-common.h
index 365fd527f4d68..669780ea23fe5 100644
--- a/libstdc++-v3/src/filesystem/dir-common.h
+++ b/libstdc++-v3/src/filesystem/dir-common.h
@@ -199,7 +199,7 @@ struct _Dir_base
 #endif
 
 
-#ifdef AT_FDCWD
+#if _GLIBCXX_HAVE_OPENAT && defined AT_FDCWD
 fd = ::openat(fd, pathname, flags);
 #else
 // If we cannot use openat, there's no benefit to using posix::open unless

-- 
Alexandre Oliva, happy hacker   https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about

[PATCH] libstdc++: fs: rtems subdir renaming

2022-06-21 Thread Alexandre Oliva via Gcc-patches



RTEMS's implementation of rename(), at least on a temporary RAM
filesystem, allows a subdir to be moved into itself, but prevents a
dir from being renamed (in?)to an existing directory.  Adjust the
expectations of filesystem rename tests.
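
As a point of reference, the behavior the test normally expects can be probed with a few lines of std::filesystem.  This is a rough sketch only; rename_into_self_is_rejected is an invented helper, and on a system behaving as described above it would presumably return false.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Returns true if the system refuses to rename DIR into a sub-directory
// of itself and leaves DIR untouched.  DIR must already exist and be a
// directory.  The helper name is hypothetical.
bool rename_into_self_is_rejected(const fs::path& dir)
{
  std::error_code ec;
  fs::rename(dir, dir / "subsubdir", ec);
  return bool(ec)
    && fs::is_directory(dir)
    && !fs::exists(dir / "subsubdir");
}
```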

Regstrapped on x86_64-linux-gnu, also tested with a cross to
aarch64-rtems6.  Ok to install?


for  libstdc++-v3/ChangeLog

* testsuite/27_io/filesystem/operations/rename.cc [__rtems__]
(test_directories): Skip attempt to rename into itself, or to
an existing directory.
* testsuite/experimental/filesystem/operations/rename.cc [__rtems__]
(test_directories): Likewise.
---
 .../27_io/filesystem/operations/rename.cc  |6 +-
 .../experimental/filesystem/operations/rename.cc   |   11 ++-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/testsuite/27_io/filesystem/operations/rename.cc b/libstdc++-v3/testsuite/27_io/filesystem/operations/rename.cc
index 2fb2068dfd3c5..cf5d543b816c2 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/operations/rename.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/operations/rename.cc
@@ -125,11 +125,15 @@ test_directories()
   VERIFY( is_directory(dir/"subdir2") );
   VERIFY( !exists(dir/"subdir") );
 
+#ifdef __rtems__
+  // Can rename a directory to a sub-directory of itself?!?
+#else
   // Cannot rename a directory to a sub-directory of itself.
   fs::rename(dir/"subdir2", dir/"subdir2/subsubdir", ec);
   VERIFY( ec );
   VERIFY( is_directory(dir/"subdir2") );
   VERIFY( !exists(dir/"subdir2"/"subsubdir") );
+#endif
 
   // Cannot rename a file to the name of an existing directory.
   ec.clear();
@@ -155,7 +159,7 @@ test_directories()
   VERIFY( is_directory(dir/"subdir2") );
   VERIFY( is_regular_file(dir/"subdir2/file") );
 
-#if defined(__MINGW32__) || defined(__MINGW64__)
+#if defined(__MINGW32__) || defined(__MINGW64__) || defined(__rtems__)
   // Cannot rename a directory to an existing directory
 #else
   // Can rename a non-empty directory to the name of an empty directory.
diff --git a/libstdc++-v3/testsuite/experimental/filesystem/operations/rename.cc b/libstdc++-v3/testsuite/experimental/filesystem/operations/rename.cc
index d2175652a79a8..1ef386fe73ade 100644
--- a/libstdc++-v3/testsuite/experimental/filesystem/operations/rename.cc
+++ b/libstdc++-v3/testsuite/experimental/filesystem/operations/rename.cc
@@ -125,11 +125,15 @@ test_directories()
   VERIFY( is_directory(dir/"subdir2") );
   VERIFY( !exists(dir/"subdir") );
 
+#ifdef __rtems__
+  // Can rename a directory to a sub-directory of itself?!?
+#else
   // Cannot rename a directory to a sub-directory of itself.
   fs::rename(dir/"subdir2", dir/"subdir2/subsubdir", ec);
   VERIFY( ec );
   VERIFY( is_directory(dir/"subdir2") );
   VERIFY( !exists(dir/"subdir2"/"subsubdir") );
+#endif
 
   // Cannot rename a file to the name of an existing directory.
   ec.clear();
@@ -158,6 +162,9 @@ test_directories()
   VERIFY( is_directory(dir/"subdir2") );
   VERIFY( is_regular_file(dir/"subdir2/file") );
 
+#ifdef __rtems__
+  // Cannot rename a directory to an existing directory
+#else
   // Can rename a non-empty directory to the name of an empty directory.
   ec = bad_ec;
   fs::rename(dir/"subdir2", dir/"subdir", ec);
@@ -165,10 +172,12 @@ test_directories()
   VERIFY( is_directory(dir/"subdir") );
   VERIFY( !exists(dir/"subdir2") );
   VERIFY( is_regular_file(dir/"subdir/file") );
+#endif
+
   f2.path.clear();
+#endif
 
   f.path.clear();
-#endif
 
   fs::remove_all(dir, ec);
 }


Re: [PATCH] libstdc++: testsuite: fs rename to self may fail

2022-06-21 Thread Sebastian Huber


On 22/06/2022 08:24, Alexandre Oliva via Libstdc++ wrote:

> rtems6's rename() implementation errors with EEXIST when the rename-to
> filename exists, even when renaming a file to itself or when renaming
> a nonexisting file.  Adjust expectations.
>
> Regstrapped on x86_64-linux-gnu, also tested with a cross to
> aarch64-rtems6.  Ok to install?
>
> PS: https://devel.rtems.org/ticket/2169 doesn't seem to suggest plans to
> change behavior so as to comply with POSIX.


I would not adjust the test case to cope with systems which are not in 
line with POSIX. In the past RTEMS used the GCC tests to check that the 
implementation is in line with other systems. The RTEMS ticket is still 
open. There just needs to be someone who thinks this bug is important 
enough to fix.


--
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registergericht: Amtsgericht München
Registernummer: HRB 157899
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/

[PATCH] libstdc++: testsuite: skip fs last_write_time tests if not available

2022-06-21 Thread Alexandre Oliva via Gcc-patches



The last_write_time functions are defined in ways that are useful, or
that fail immediately, depending on various macros.  When they fail
immediately, the filesystem last_write_time.cc tests fail noisily, but
the failure is entirely expected.

Define HAVE_LWT in the last_write_time.cc tests, according to the
macros that select implementations of last_write_time, and use it to
skip tests that are expected to fail.
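
For comparison, a run-time probe is sketched below.  It is not what the patch does (the patch decides at preprocessing time), and it rests on an assumption worth checking against your library: that an unavailable last_write_time reports function_not_supported or not_supported.  The names lwt_probe, is_unsupported and probe_last_write_time are invented for the illustration.

```cpp
#include <chrono>
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

enum class lwt_probe { ok, unsupported, failed };

// Assumption: an unavailable implementation reports one of these two
// error conditions.
static bool is_unsupported(const std::error_code& ec)
{
  return ec == std::errc::function_not_supported
      || ec == std::errc::not_supported;
}

static lwt_probe classify(const std::error_code& ec)
{
  return is_unsupported(ec) ? lwt_probe::unsupported : lwt_probe::failed;
}

// Create P, move its timestamp back by one hour, and read it back.
lwt_probe probe_last_write_time(const fs::path& p)
{
  { std::ofstream out(p); }   // create the file
  std::error_code ec;
  const fs::file_time_type before = fs::last_write_time(p, ec);
  if (ec)
    return classify(ec);
  const fs::file_time_type target = before - std::chrono::hours(1);
  fs::last_write_time(p, target, ec);
  if (ec)
    return classify(ec);
  const fs::file_time_type after = fs::last_write_time(p, ec);
  if (ec)
    return classify(ec);
  // Allow for coarse timestamp granularity on the underlying filesystem.
  const auto drift = after > target ? after - target : target - after;
  return drift <= std::chrono::seconds(2) ? lwt_probe::ok : lwt_probe::failed;
}
```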

Regstrapped on x86_64-linux-gnu, also tested with a cross to
aarch64-rtems6.  Ok to install?

PS: I realize _GLIBCXX_HAVE_SYS_STAT_H is tested for in two different
ways in the #if expressions added to the tests.  This mirrors the
different uses in the do_stat template body, and in
fs::last_write_time(const path&, file_time_type, error_code&).  Perhaps
they should all be using either value or definedness, but I didn't want
to go there, at least not at first, so I retained the apparent
inconsistency.


for  libstdc++-v3/ChangeLog

* testsuite/27_io/filesystem/operations/last_write_time.cc:
Skip the test if the features are unavailable.
* testsuite/experimental/filesystem/operations/last_write_time.cc:
Likewise.
---
 .../27_io/filesystem/operations/last_write_time.cc |   11 +++
 .../filesystem/operations/last_write_time.cc   |   11 +++
 2 files changed, 22 insertions(+)

diff --git a/libstdc++-v3/testsuite/27_io/filesystem/operations/last_write_time.cc b/libstdc++-v3/testsuite/27_io/filesystem/operations/last_write_time.cc
index 7d6468a512424..ecdd45d6ac99e 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/operations/last_write_time.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/operations/last_write_time.cc
@@ -33,6 +33,14 @@
 #endif
 #include 
 
+#if (_GLIBCXX_USE_UTIMENSAT \
+ || (_GLIBCXX_USE_UTIME && _GLIBCXX_HAVE_SYS_STAT_H)) \
+  && defined (_GLIBCXX_HAVE_SYS_STAT_H)
+# define HAVE_LWT 1
+#else
+# define HAVE_LWT 0
+#endif
+
 using time_type = std::filesystem::file_time_type;
 
 namespace chrono = std::chrono;
@@ -209,6 +217,9 @@ test02()
 int
 main()
 {
+  if (!HAVE_LWT)
+return 0;
+
   test01();
   test02();
 }
diff --git a/libstdc++-v3/testsuite/experimental/filesystem/operations/last_write_time.cc b/libstdc++-v3/testsuite/experimental/filesystem/operations/last_write_time.cc
index 38fafc392ca9e..562c1114a7fb3 100644
--- a/libstdc++-v3/testsuite/experimental/filesystem/operations/last_write_time.cc
+++ b/libstdc++-v3/testsuite/experimental/filesystem/operations/last_write_time.cc
@@ -34,6 +34,14 @@
 #endif
 #include 
 
+#if (_GLIBCXX_USE_UTIMENSAT \
+ || (_GLIBCXX_USE_UTIME && _GLIBCXX_HAVE_SYS_STAT_H)) \
+  && defined (_GLIBCXX_HAVE_SYS_STAT_H)
+# define HAVE_LWT 1
+#else
+# define HAVE_LWT 0
+#endif
+
 using time_type = std::experimental::filesystem::file_time_type;
 
 namespace chrono = std::chrono;
@@ -175,6 +183,9 @@ test02()
 int
 main()
 {
+  if (!HAVE_LWT)
+return 0;
+
   test01();
   test02();
 }


[PATCH] libstdc++: testsuite: skip fs space tests if not available

2022-06-21 Thread Alexandre Oliva via Gcc-patches



The do_space function is defined in ways that are useful, or that fail
immediately, depending on various macros.  When it fails immediately,
the filesystem space.cc tests fail noisily, but the failure is entirely
expected.

Define HAVE_SPACE in the space.cc tests, according to the macros that
select implementations of do_space, and use it to skip tests that are
expected to fail.
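
The skip pattern can be sketched on its own as follows.  This is an illustration, not the test code: have_space mirrors the HAVE_SPACE macro from the patch as a constexpr flag, and space_check_passes_or_skipped is an invented helper.  With a library that defines neither macro, the sketch simply reports "skipped" as success.

```cpp
#include <cstdint>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Fold the implementation-selection macros into a single flag.
#if defined (_GLIBCXX_HAVE_SYS_STATVFS_H) \
  || defined (_GLIBCXX_FILESYSTEM_IS_WINDOWS)
constexpr bool have_space = true;
#else
constexpr bool have_space = false;
#endif

// Returns true if the check passed or was skipped because the feature
// is known to be unavailable.
bool space_check_passes_or_skipped(const fs::path& p)
{
  if constexpr (!have_space)
    return true;   // nothing useful to test
  else
    {
      std::error_code ec;
      const fs::space_info s = fs::space(p, ec);
      const std::uintmax_t err = -1;
      return !ec && s.capacity != err && s.free != err && s.available != err;
    }
}
```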

Regstrapped on x86_64-linux-gnu, also tested with a cross to
aarch64-rtems6.  Ok to install?


for  libstdc++-v3/ChangeLog

* testsuite/27_io/filesystem/operations/space.cc: Skip the
test if the feature is unavailable.
* testsuite/experimental/filesystem/operations/space.cc:
Likewise.
---
 .../testsuite/27_io/filesystem/operations/space.cc |   10 ++
 .../experimental/filesystem/operations/space.cc|   10 ++
 2 files changed, 20 insertions(+)

diff --git a/libstdc++-v3/testsuite/27_io/filesystem/operations/space.cc b/libstdc++-v3/testsuite/27_io/filesystem/operations/space.cc
index 05997cac1dfa4..029d65655b1a7 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/operations/space.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/operations/space.cc
@@ -24,6 +24,13 @@
 #include 
 #include 
 
+#if defined (_GLIBCXX_HAVE_SYS_STATVFS_H) \
+  || defined (_GLIBCXX_FILESYSTEM_IS_WINDOWS)
+# define HAVE_SPACE 1
+#else
+# define HAVE_SPACE 0
+#endif
+
 bool check(std::filesystem::space_info const& s)
 {
   const std::uintmax_t err = -1;
@@ -59,6 +66,9 @@ test02()
 int
 main()
 {
+  if (!HAVE_SPACE)
+return 0;
+
   test01();
   test02();
 }
diff --git a/libstdc++-v3/testsuite/experimental/filesystem/operations/space.cc b/libstdc++-v3/testsuite/experimental/filesystem/operations/space.cc
index 10ee0f06871df..83868dea9b5e3 100644
--- a/libstdc++-v3/testsuite/experimental/filesystem/operations/space.cc
+++ b/libstdc++-v3/testsuite/experimental/filesystem/operations/space.cc
@@ -25,6 +25,13 @@
 #include 
 #include 
 
+#if defined (_GLIBCXX_HAVE_SYS_STATVFS_H) \
+  || defined (_GLIBCXX_FILESYSTEM_IS_WINDOWS)
+# define HAVE_SPACE 1
+#else
+# define HAVE_SPACE 0
+#endif
+
 namespace fs = std::experimental::filesystem;
 
 bool check(fs::space_info const& s)
@@ -60,6 +67,9 @@ test02()
 int
 main()
 {
+  if (!HAVE_SPACE)
+return 0;
+
   test01();
   test02();
 }


[PATCH] libstdc++: testsuite: fs rename to self may fail

2022-06-21 Thread Alexandre Oliva via Gcc-patches



rtems6's rename() implementation errors with EEXIST when the rename-to
filename exists, even when renaming a file to itself or when renaming
a nonexisting file.  Adjust expectations.
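
The relaxed expectation for the self-rename case amounts to the check sketched below.  It is a stand-alone illustration with an invented helper name, not an excerpt from the test.

```cpp
#include <cerrno>
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

// Create P, rename it to itself, and accept either success (the POSIX
// no-op) or an EEXIST failure, as long as P is still a regular file.
bool self_rename_outcome_acceptable(const fs::path& p)
{
  { std::ofstream out(p); }   // create the file
  std::error_code ec;
  fs::rename(p, p, ec);
  return (!ec || ec.value() == EEXIST) && fs::is_regular_file(p);
}
```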

Regstrapped on x86_64-linux-gnu, also tested with a cross to
aarch64-rtems6.  Ok to install?

PS: https://devel.rtems.org/ticket/2169 doesn't seem to suggest plans to
change behavior so as to comply with POSIX.


for  libstdc++-v3/ChangeLog

* testsuite/27_io/filesystem/operations/rename.cc (test01):
Accept EEXIST fail on self-rename, on rename of a
nonexisting file, and on rename to existing file.  Clean up p1
in case it remains.
* testsuite/experimental/filesystem/operations/rename.cc
(test01): Accept EEXIST fail on self-rename, and on rename to
existing file.  Clean up p1 in case it remains.
---
 .../27_io/filesystem/operations/rename.cc  |   11 ++-
 .../experimental/filesystem/operations/rename.cc   |9 +
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/libstdc++-v3/testsuite/27_io/filesystem/operations/rename.cc b/libstdc++-v3/testsuite/27_io/filesystem/operations/rename.cc
index 936e306041290..2fb2068dfd3c5 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/operations/rename.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/operations/rename.cc
@@ -46,14 +46,14 @@ test01()
 
   ec = bad_ec;
   std::ofstream{p1}; // create file
-  fs::rename(p1, p1, ec); // no-op
-  VERIFY( !ec );
+  fs::rename(p1, p1, ec); // no-op, but may fail
+  VERIFY( !ec || ec.value() == EEXIST );
   VERIFY( is_regular_file(p1) );
 
   ec.clear();
   rename(p2, p1, ec);
   VERIFY( ec );
-  VERIFY( ec.value() == ENOENT );
+  VERIFY( ec.value() == ENOENT || ec.value() == EEXIST );
   VERIFY( is_regular_file(p1) );
 
   ec = bad_ec;
@@ -65,10 +65,11 @@ test01()
   ec = bad_ec;
   std::ofstream{p1}; // create file
   fs::rename(p1, p2, ec);
-  VERIFY( !ec );
-  VERIFY( !exists(p1) );
+  VERIFY( !ec || ec.value() == EEXIST );
+  VERIFY( !exists(p1) || ec );
   VERIFY( is_regular_file(p2) );
 
+  fs::remove(p1, ec);
   fs::remove(p2, ec);
 }
 
diff --git a/libstdc++-v3/testsuite/experimental/filesystem/operations/rename.cc b/libstdc++-v3/testsuite/experimental/filesystem/operations/rename.cc
index 520d48ef8d844..d2175652a79a8 100644
--- a/libstdc++-v3/testsuite/experimental/filesystem/operations/rename.cc
+++ b/libstdc++-v3/testsuite/experimental/filesystem/operations/rename.cc
@@ -47,8 +47,8 @@ test01()
 
   ec = bad_ec;
   std::ofstream{p1}; // create file
-  fs::rename(p1, p1, ec); // no-op
-  VERIFY( !ec );
+  fs::rename(p1, p1, ec); // no-op, but may fail
+  VERIFY( !ec || ec.value() == EEXIST );
   VERIFY( is_regular_file(p1) );
 
   ec.clear();
@@ -65,10 +65,11 @@ test01()
   ec = bad_ec;
   std::ofstream{p1}; // create file
   fs::rename(p1, p2, ec);
-  VERIFY( !ec );
-  VERIFY( !exists(p1) );
+  VERIFY( !ec || ec.value() == EEXIST );
+  VERIFY( !exists(p1) || ec );
   VERIFY( is_regular_file(p2) );
 
+  fs::remove(p1, ec);
   fs::remove(p2, ec);
 }
 


Re: [PATCH] libstdc++: 60421.cc: tolerate slightly shorter aggregate sleep

2022-06-21 Thread Sebastian Huber


On 22/06/2022 08:01, Alexandre Oliva via Gcc-patches wrote:


> On rtems under qemu, the frequently-interrupted nanosleep ends up
> sleeping shorter than expected, by a margin of less than 0.3%.
>
> I figured failing the library test over a system (emulator?) bug is
> undesirable, so I put in some tolerance for the drift.
>
> Regstrapped on x86_64-linux-gnu, also tested with a cross to
> aarch64-rtems6.  Ok to install?
>
> PS: I see nothing wrong with the implementation of clock_nanosleep (used
> by nanosleep) on rtems6 that could cause it to wake up too early.  I
> suspect some artifact of the emulation environment.
>
>
> for  libstdc++-v3/ChangeLog
>
> * testsuite/30_threads/this_thread/60421.cc: Tolerate a
> slightly early wakeup.
> ---
>   .../testsuite/30_threads/this_thread/60421.cc  |3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/testsuite/30_threads/this_thread/60421.cc b/libstdc++-v3/testsuite/30_threads/this_thread/60421.cc
> index 12dbeba1cc492..f3a5af453c4ad 100644
> --- a/libstdc++-v3/testsuite/30_threads/this_thread/60421.cc
> +++ b/libstdc++-v3/testsuite/30_threads/this_thread/60421.cc
> @@ -51,9 +51,10 @@ test02()
>    std::thread t([&result, &sleeping] {
>   auto start = std::chrono::system_clock::now();
>   auto time = std::chrono::seconds(3);
> +auto tolerance = std::chrono::milliseconds(10);
>   sleeping = true;
>   std::this_thread::sleep_for(time);
> -result = std::chrono::system_clock::now() >= (start + time);
> +result = std::chrono::system_clock::now() + tolerance >= (start + time);
>   sleeping = false;
> });
> while (!sleeping)


This looks like a bug in RTEMS or the BSP for the test platform. I would 
first investigate this and then change the test which looks all right to me.
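
One possible way to start narrowing this down is sketched below: the same measurement as in the test, parameterized over the clock, so that system_clock and steady_clock results can be compared on the target.  This is a rough illustration only; slept_at_least is an invented name, and the sketch says nothing about where the early wakeup actually comes from.

```cpp
#include <chrono>
#include <thread>

// Sleep for REQUESTED and report whether, as measured by Clock, at least
// REQUESTED minus TOLERANCE has elapsed.  A tolerance of zero gives the
// original, strict form of the check.
template<typename Clock>
bool slept_at_least(std::chrono::milliseconds requested,
                    std::chrono::milliseconds tolerance)
{
  const auto start = Clock::now();
  std::this_thread::sleep_for(requested);
  return Clock::now() + tolerance >= start + requested;
}
```

If the strict check fails with one clock but not with the other, that would point at the clock rather than at the sleep itself.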



[PATCH] libstdc++: retry removal of dir entries if dir removal fails

2022-06-21 Thread Alexandre Oliva via Gcc-patches



On some target systems (e.g. rtems6.0), removing directory components
while iterating over directory entries may cause some of the directory
entries to be skipped, which prevents the removal of the parent
directory from succeeding.

Advancing the iterator before removing a member proved not to be
enough, so I've instead arranged for remove_all to retry the removal
of components if the removal of the parent dir fails after removing at
least one entry.  The failure is permanent only if no components got
removed in the current try.

Regstrapped on x86_64-linux-gnu, also tested with a cross to
aarch64-rtems6.  Ok to install?

PS: The implementation of remove_all has changed completely, compared
with the gcc-11 environment in which the need for this patch came up.  I
have reimplemented it for mainline, and I have attempted to test it in
this environment, but new filesystem tests and subtests that fail on
rtems6.0 have impaired testing and prevented the full pass rate I got
for them with a similar patchset on gcc-11.


for  libstdc++-v3/ChangeLog

* src/c++17/fs_ops.cc (remove_all): Retry removal of
directory entries.
---
 libstdc++-v3/src/c++17/fs_ops.cc |   22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/src/c++17/fs_ops.cc b/libstdc++-v3/src/c++17/fs_ops.cc
index 435368fa5c5ff..b3390310132b4 100644
--- a/libstdc++-v3/src/c++17/fs_ops.cc
+++ b/libstdc++-v3/src/c++17/fs_ops.cc
@@ -1286,6 +1286,8 @@ fs::remove_all(const path& p)
 {
   error_code ec;
   uintmax_t count = 0;
+ retry:
+  uintmax_t init_count = count;
   recursive_directory_iterator dir(p, directory_options{64|128}, ec);
   switch (ec.value()) // N.B. assumes ec.category() == std::generic_category()
   {
@@ -1303,7 +1305,7 @@ fs::remove_all(const path& p)
 break;
   case ENOENT:
 // Our work here is done.
-return 0;
+return count;
   case ENOTDIR:
   case ELOOP:
 // Not a directory, will remove below.
@@ -1313,6 +1315,18 @@ fs::remove_all(const path& p)
 _GLIBCXX_THROW_OR_ABORT(filesystem_error("cannot remove all", p, ec));
   }
 
+  if (count > init_count)
+{
+  if (int last = fs::remove(p, ec); !ec)
+   return count + last;
+  else
+   // Some systems seem to skip entries in the dir iteration if
+   // you remove dir entries while iterating, so if we removed
+   // anything in the dir in this round, and failed to remove
+   // the dir (presumably because it wasn't empty), retry.
+   goto retry;
+}
+
   // Remove p itself, which is either a non-directory or is now empty.
   return count + fs::remove(p);
 }
@@ -1321,6 +1335,8 @@ std::uintmax_t
 fs::remove_all(const path& p, error_code& ec)
 {
   uintmax_t count = 0;
+ retry:
+  uintmax_t init_count = count;
   recursive_directory_iterator dir(p, directory_options{64|128}, ec);
   switch (ec.value()) // N.B. assumes ec.category() == std::generic_category()
   {
@@ -1341,7 +1357,7 @@ fs::remove_all(const path& p, error_code& ec)
   case ENOENT:
 // Our work here is done.
 ec.clear();
-return 0;
+return count;
   case ENOTDIR:
   case ELOOP:
 // Not a directory, will remove below.
@@ -1354,6 +1370,8 @@ fs::remove_all(const path& p, error_code& ec)
   // Remove p itself, which is either a non-directory or is now empty.
   if (int last = fs::remove(p, ec); !ec)
 return count + last;
+  if (count > init_count)
+goto retry;
   return -1;
 }
 


Re: [PATCH v5, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

2022-06-21 Thread HAO CHEN GUI via Gcc-patches

Hi,

On 21/6/2022 7:08 AM, Segher Boessenkool wrote:
> && !flag_trapping_math
> 
> and/or whatever else is needed as well here.
> 
I have a question here. fmin/fmax are folded to MIN_EXPR/MAX_EXPR when
flag_finite_math_only is set, so it seems !flag_trapping_math is not needed
for fmin/fmax? Also, xs[min|max]dp can themselves trap.

/* Convert fmin/fmax to MIN_EXPR/MAX_EXPR.  C99 requires these
   functions to return the numeric arg if the other one is NaN.
   MIN and MAX don't honor that, so only transform if -ffinite-math-only
   is set.  C99 doesn't require -0.0 to be handled, so we don't have to
   worry about it either.  */
(if (flag_finite_math_only)
 (simplify
  (FMIN_ALL @0 @1)
  (min @0 @1))
 (simplify
  (FMAX_ALL @0 @1)
  (max @0 @1)))
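
To make the NaN point in that comment concrete, here is a small C++ sketch.  Note the assumption: naive_min is a plain comparison-based minimum used only as a stand-in, it does not reproduce GCC's actual MIN_EXPR semantics, and same_result is an invented helper.

```cpp
#include <cmath>

// Comparison-based minimum: returns b whenever the comparison is false,
// which includes every case where either operand is NaN.
double naive_min(double a, double b)
{
  return a < b ? a : b;
}

// True if std::fmin and naive_min agree for (a, b), treating two NaN
// results as equal.
bool same_result(double a, double b)
{
  const double x = std::fmin(a, b);
  const double y = naive_min(a, b);
  return (std::isnan(x) && std::isnan(y)) || x == y;
}
```

With finite operands the two agree, e.g. for (4.0, 2.0).  For (2.0, NaN), fmin returns the numeric argument while the comparison-based form returns NaN, which is why the fold is restricted to -ffinite-math-only.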

> Are things like
>   fmin(4.0, 2.0);
> (still) optimised correctly?
I have tested it. fmin(4.0, 2.0) is converted to "2.0" in the front end,
so my patch doesn't touch it.

Thanks a lot.
Gui Haochen

[PATCH] libstdc++: testsuite: test symlinks ifdef _GLIBCXX_HAVE_SYMLINK

2022-06-21 Thread Alexandre Oliva via Gcc-patches



Several filesystem tests expect to be able to create symlinks even
when !defined (_GLIBCXX_HAVE_SYMLINK), and fail predictably, reducing
the amount of testing of other filesystem features.

They are already skipped for mingw targets.  I've extended the
skipping to other targets in which _GLIBCXX_HAVE_SYMLINK is
undefined.
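
For illustration only, a run-time probe for symlink support could look like the sketch below.  The patch itself keys off the _GLIBCXX_HAVE_SYMLINK macro at preprocessing time; symlinks_usable is an invented helper and treats any error from create_symlink as "not usable".

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Try to create (and then remove) a dangling symlink inside DIR.
// Returns false if creation fails for any reason, including lack of
// symlink support on the target.
bool symlinks_usable(const fs::path& dir)
{
  const fs::path link = dir / "probe_link";
  std::error_code ec;
  fs::create_symlink("target", link, ec);
  if (ec)
    return false;
  fs::remove(link, ec);   // removes the link itself, not its target
  return true;
}
```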

Regstrapped on x86_64-linux-gnu, also tested with a cross to
aarch64-rtems6.  Ok to install?

PS: Testing with trunk was somewhat impaired by various changes in the
filesystem implementation and tests that cause new failures on rtems6
that I have not (yet?) been able to investigate.  A slight variant of
this patch, along with a number of patches I'm yet to post, has enabled
gcc-11 to pass all filesystem tests.  Meaning this might turn out to be
an incomplete fix for the problem on mainline, in case remaining
failures hide similar but new (compared with gcc-11) occurrences of the
problem this attempts to fix.  However, it brings strict improvement, so
I expect it to be useful to integrate it nevertheless.


for  libstdc++-v3/ChangeLog

* testsuite/27_io/filesystem/operations/canonical.cc (test03):
Only create symlinks when _GLIBCXX_HAVE_SYMLINK is defined.
* testsuite/27_io/filesystem/operations/copy.cc (test02):
Likewise.
* testsuite/27_io/filesystem/operations/create_directories.cc
(test04): Likewise.
* testsuite/27_io/filesystem/operations/create_directory.cc
(test01): Likewise.
* testsuite/27_io/filesystem/operations/permissions.cc
(test03, test04): Likewise.
* testsuite/27_io/filesystem/operations/read_symlink.cc
(test01): Likewise.
* testsuite/27_io/filesystem/operations/remove.cc (test01):
Likewise.
* testsuite/27_io/filesystem/operations/remove_all.cc (test01):
Likewise.
* testsuite/27_io/filesystem/operations/rename.cc
(test_symlinks): Likewise.
* testsuite/27_io/filesystem/operations/symlink_status.cc
(test01, test02): Likewise.
* testsuite/27_io/filesystem/operations/weakly_canonical.cc
(test01): Likewise.
* testsuite/experimental/filesystem/iterators/recursive_directory_iterator.cc
(test06): Likewise.
* testsuite/experimental/filesystem/operations/copy.cc
(test01): Likewise.
* testsuite/experimental/filesystem/operations/create_directories.cc
(test04): Likewise.
* testsuite/experimental/filesystem/operations/create_directory.cc
(test01): Likewise.
* testsuite/experimental/filesystem/operations/permissions.cc
(test03, test04): Likewise.
* testsuite/experimental/filesystem/operations/read_symlink.cc
(test01): Likewise.
* testsuite/experimental/filesystem/operations/remove.cc
(test01): Likewise.
* testsuite/experimental/filesystem/operations/remove_all.cc
(test01): Likewise.
* testsuite/experimental/filesystem/operations/rename.cc
(test01): Likewise.
---
 .../27_io/filesystem/operations/canonical.cc   |2 ++
 .../testsuite/27_io/filesystem/operations/copy.cc  |3 ++-
 .../filesystem/operations/create_directories.cc|3 ++-
 .../filesystem/operations/create_directory.cc  |3 ++-
 .../27_io/filesystem/operations/permissions.cc |4 
 .../27_io/filesystem/operations/read_symlink.cc|2 ++
 .../27_io/filesystem/operations/remove.cc  |3 ++-
 .../27_io/filesystem/operations/remove_all.cc  |3 ++-
 .../27_io/filesystem/operations/rename.cc  |3 ++-
 .../27_io/filesystem/operations/symlink_status.cc  |4 
 .../filesystem/operations/weakly_canonical.cc  |3 ++-
 .../iterators/recursive_directory_iterator.cc  |3 ++-
 .../experimental/filesystem/operations/copy.cc |3 ++-
 .../filesystem/operations/create_directories.cc|3 ++-
 .../filesystem/operations/create_directory.cc  |3 ++-
 .../filesystem/operations/permissions.cc   |4 
 .../filesystem/operations/read_symlink.cc  |2 ++
 .../experimental/filesystem/operations/remove.cc   |3 ++-
 .../filesystem/operations/remove_all.cc|3 ++-
 .../experimental/filesystem/operations/rename.cc   |3 ++-
 20 files changed, 46 insertions(+), 14 deletions(-)

diff --git a/libstdc++-v3/testsuite/27_io/filesystem/operations/canonical.cc b/libstdc++-v3/testsuite/27_io/filesystem/operations/canonical.cc
index bc7ef0de2b716..1ae23dadac517 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/operations/canonical.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/operations/canonical.cc
@@ -114,6 +114,8 @@ test03()
 #if defined(__MINGW32__) || defined(__MINGW64__)
   // No symlink support
   const fs::path baz = dir/"foo..\\bar///";
+#elif !defined (_GLIBCXX_HAVE_SYMLINK)
+  const fs::path baz = dir/"foo//../bar///";
 #else
   fs::create_symlink("../bar", foo/"baz");
   const fs::pat

[PATCH] libstdc++: testsuite: avoid predictable mkstemp

2022-06-21 Thread Alexandre Oliva via Gcc-patches



This patch was originally meant to reduce the likelihood that
nonexistent_path() returns the same pathname for from and to.

It was prompted by a target system with a non-random implementation of
mkstemp, that returns a predictable sequence of filenames and selects
the first one that isn't already taken.

That turned out not to be enough: nonexistent_path adds a suffix to
the filename chosen by mkstemp and removes the file it created, so
mkstemp may very well insist on the same basename, and the case that
doesn't use mkstemp doesn't even check whether the file already
exists.
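
The ordering issue can be reproduced with a deliberately predictable name picker.  The sketch below is a simplification under stated assumptions: first_unused_name is an invented stand-in for a non-random mkstemp, not __gnu_test::nonexistent_path, and it leaves out the suffix-and-remove step described above.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Predictable picker: returns the first name in a fixed sequence that is
// not already taken in DIR.  Calling it twice without creating anything
// in between yields the same name both times.
fs::path first_unused_name(const fs::path& dir, const std::string& stem)
{
  for (unsigned i = 0; ; ++i)
    {
      fs::path candidate = dir / (stem + std::to_string(i));
      if (!fs::exists(fs::symlink_status(candidate)))
        return candidate;
    }
}

// Choose TO only after FROM exists, so that the picker has to move on.
bool names_distinct_when_from_created_first(const fs::path& dir)
{
  const fs::path from = first_unused_name(dir, "copy-test-");
  { std::ofstream out(from); }   // create FROM before choosing TO
  const fs::path to = first_unused_name(dir, "copy-test-");
  return from != to;
}
```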

Anyway, by the time I realized this wasn't enough, I'd already
implemented some of the changes, and I figured I might as well
contribute them, even though they don't really solve any problem, and
even if they did, they'd be just a partial solution.

Regstrapped on x86_64-linux-gnu, also tested with a cross to
aarch64-rtems6.  Ok to install?


for  libstdc++-v3/ChangeLog

* testsuite/27_io/filesystem/operations/copy.cc (test02):
Select TO after creating FROM.
(test03, test04): Likewise.
* testsuite/experimental/filesystem/operations/copy.cc
(test02, test03, test04): Likewise.
---
 .../testsuite/27_io/filesystem/operations/copy.cc  |7 ---
 .../experimental/filesystem/operations/copy.cc |7 ---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/libstdc++-v3/testsuite/27_io/filesystem/operations/copy.cc 
b/libstdc++-v3/testsuite/27_io/filesystem/operations/copy.cc
index b936e04493b5c..f3081f4b64ebc 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/operations/copy.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/operations/copy.cc
@@ -73,7 +73,6 @@ test02()
 
   const std::error_code bad_ec = make_error_code(std::errc::invalid_argument);
   auto from = __gnu_test::nonexistent_path();
-  auto to = __gnu_test::nonexistent_path();
   std::error_code ec;
 
   ec = bad_ec;
@@ -81,6 +80,7 @@ test02()
   VERIFY( !ec );
   VERIFY( fs::exists(from) );
 
+  auto to = __gnu_test::nonexistent_path();
   ec = bad_ec;
   fs::copy(from, to, fs::copy_options::skip_symlinks, ec);
   VERIFY( !ec );
@@ -117,12 +117,13 @@ void
 test03()
 {
   auto from = __gnu_test::nonexistent_path();
-  auto to = __gnu_test::nonexistent_path();
 
   // test empty file
   std::ofstream{from};
   VERIFY( fs::exists(from) );
   VERIFY( fs::file_size(from) == 0 );
+
+  auto to = __gnu_test::nonexistent_path();
   fs::copy(from, to);
   VERIFY( fs::exists(to) );
   VERIFY( fs::file_size(to) == 0 );
@@ -145,11 +146,11 @@ test04()
 {
   const std::error_code bad_ec = make_error_code(std::errc::invalid_argument);
   auto from = __gnu_test::nonexistent_path();
-  auto to = __gnu_test::nonexistent_path();
   std::error_code ec;
 
   create_directories(from/"a/b/c");
 
+  auto to = __gnu_test::nonexistent_path();
   {
 __gnu_test::scoped_file f(to);
 copy(from, to, ec);
diff --git a/libstdc++-v3/testsuite/experimental/filesystem/operations/copy.cc 
b/libstdc++-v3/testsuite/experimental/filesystem/operations/copy.cc
index 5cd6b483c269b..ca38328c5da15 100644
--- a/libstdc++-v3/testsuite/experimental/filesystem/operations/copy.cc
+++ b/libstdc++-v3/testsuite/experimental/filesystem/operations/copy.cc
@@ -73,7 +73,6 @@ test02()
 #endif
 
   auto from = __gnu_test::nonexistent_path();
-  auto to = __gnu_test::nonexistent_path();
   std::error_code ec, bad = std::make_error_code(std::errc::invalid_argument);
 
   ec = bad;
@@ -81,6 +80,7 @@ test02()
   VERIFY( !ec );
   VERIFY( fs::exists(from) );
 
+  auto to = __gnu_test::nonexistent_path();
   ec = bad;
   fs::copy(from, to, fs::copy_options::skip_symlinks, ec);
   VERIFY( !ec );
@@ -116,12 +116,13 @@ void
 test03()
 {
   auto from = __gnu_test::nonexistent_path();
-  auto to = __gnu_test::nonexistent_path();
 
   // test empty file
   std::ofstream{from.c_str()};
   VERIFY( fs::exists(from) );
   VERIFY( fs::file_size(from) == 0 );
+
+  auto to = __gnu_test::nonexistent_path();
   fs::copy(from, to);
   VERIFY( fs::exists(to) );
   VERIFY( fs::file_size(to) == 0 );
@@ -143,11 +144,11 @@ void
 test04()
 {
   auto from = __gnu_test::nonexistent_path();
-  auto to = __gnu_test::nonexistent_path();
   std::error_code ec;
 
   create_directories(from/"a/b/c");
 
+  auto to = __gnu_test::nonexistent_path();
   {
 __gnu_test::scoped_file f(to);
 copy(from, to, ec);

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about

[PATCH] libstdc++: async: tolerate slightly shorter sleep

2022-06-21 Thread Alexandre Oliva via Gcc-patches



Even without frequent signals interrupting nanosleep, sleep_for on
rtems on qemu wakes up too early by a predictable margin of less than
0.3%, and some async tests complain about the too-short wait times.
Allow the tests to tolerate a little sleep deprivation.

Regstrapped on x86_64-linux-gnu, also tested with a cross to
aarch64-rtems6.  Ok to install?
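
For reference, a margin of 0.3% works out to 6 ms on a 2 s wait and 3 ms on a 1 s wait, which are the tolerances used in the diff.  The following is a minimal sketch of that arithmetic and of the relaxed comparison; the helper names `slept_long_enough` and `three_tenths_percent` are invented for this example and do not appear in the testsuite.

```cpp
#include <chrono>

// Returns true if ELAPSED is at least EXPECTED, allowing a wakeup that
// is early by no more than TOLERANCE.
constexpr bool
slept_long_enough(std::chrono::nanoseconds elapsed,
                  std::chrono::nanoseconds expected,
                  std::chrono::nanoseconds tolerance)
{
  return elapsed + tolerance >= expected;
}

// 0.3% of the expected duration, the margin mentioned above.
constexpr std::chrono::nanoseconds
three_tenths_percent(std::chrono::nanoseconds expected)
{
  return expected * 3 / 1000;
}
```

With these helpers, a 2 s wait that returns after 1995 ms passes with a 6 ms tolerance, while one that returns after 1990 ms still fails.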


for  libstdc++-v3/ChangeLog

* testsuite/30_threads/async/async.cc: Tolerate early wakeup.
---
 libstdc++-v3/testsuite/30_threads/async/async.cc |9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/testsuite/30_threads/async/async.cc 
b/libstdc++-v3/testsuite/30_threads/async/async.cc
index 38943ff1a9a5e..b151677af6a0e 100644
--- a/libstdc++-v3/testsuite/30_threads/async/async.cc
+++ b/libstdc++-v3/testsuite/30_threads/async/async.cc
@@ -104,7 +104,8 @@ void test03()
   VERIFY( status == std::future_status::ready );
 
   auto const elapsed = CLOCK::now() - start;
-  VERIFY( elapsed >= std::chrono::seconds(2) );
+  auto const tolerance = std::chrono::milliseconds(6);
+  VERIFY( elapsed + tolerance >= std::chrono::seconds(2) );
   VERIFY( elapsed < std::chrono::seconds(5) );
 }
 
@@ -169,7 +170,8 @@ void test_pr91486_wait_for()
   auto status = f1.wait_for(wait_time);
   auto const elapsed_steady = chrono::steady_clock::now() - start_steady;
 
-  VERIFY( elapsed_steady >= std::chrono::seconds(1) );
+  auto const tolerance = std::chrono::milliseconds(3);
+  VERIFY( elapsed_steady + tolerance >= std::chrono::seconds(1) );
 }
 
 // This is a clock with a very recent epoch which ensures that the difference
@@ -222,7 +224,8 @@ void test_pr91486_wait_until()
   auto const elapsed_steady = chrono::steady_clock::now() - start_steady;
 
   // This checks that we didn't come back too soon
-  VERIFY( elapsed_steady >= std::chrono::seconds(1) );
+  auto const tolerance = std::chrono::milliseconds(3);
+  VERIFY( elapsed_steady + tolerance >= std::chrono::seconds(1) );
 
   // This checks that wait_until didn't busy wait checking the clock more times
   // than necessary.



[PATCH] libstdc++: 60421.cc: tolerate slightly shorter aggregate sleep

2022-06-21 Thread Alexandre Oliva via Gcc-patches



On rtems under qemu, the frequently-interrupted nanosleep ends up
sleeping shorter than expected, by a margin of less than 0.3%.

I figured failing the library test over a system (emulator?) bug is
undesirable, so I put in some tolerance for the drift.

Regstrapped on x86_64-linux-gnu, also tested with a cross to
aarch64-rtems6.  Ok to install?

PS: I see nothing wrong with the implementation of clock_nanosleep (used
by nanosleep) on rtems6 that could cause it to wake up too early.  I
suspect some artifact of the emulation environment.


for  libstdc++-v3/ChangeLog

* testsuite/30_threads/this_thread/60421.cc: Tolerate a
slightly early wakeup.
---
 .../testsuite/30_threads/this_thread/60421.cc  |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/30_threads/this_thread/60421.cc 
b/libstdc++-v3/testsuite/30_threads/this_thread/60421.cc
index 12dbeba1cc492..f3a5af453c4ad 100644
--- a/libstdc++-v3/testsuite/30_threads/this_thread/60421.cc
+++ b/libstdc++-v3/testsuite/30_threads/this_thread/60421.cc
@@ -51,9 +51,10 @@ test02()
   std::thread t([&result, &sleeping] {
 auto start = std::chrono::system_clock::now();
 auto time = std::chrono::seconds(3);
+auto tolerance = std::chrono::milliseconds(10);
 sleeping = true;
 std::this_thread::sleep_for(time);
-result = std::chrono::system_clock::now() >= (start + time);
+result = std::chrono::system_clock::now() + tolerance >= (start + time);
 sleeping = false;
   });
   while (!sleeping)



[PATCH] libstdc++: testsuite: tolerate non-cancelling sleep

2022-06-21 Thread Alexandre Oliva via Gcc-patches



Though sleep, nanosleep and clock_nanosleep are all POSIX cancellation
points, not all target systems follow this POSIX requirement.
30_threads/thread/native_handle/cancel.cc will run until it times out
on such systems.

Rather than failing a C++ library test because of a limitation of the
target system, this patch gives the test a chance to successfully
exercise the features it intends to exercise, by introducing a
cancellation point in a loop that would otherwise run indefinitely on
systems exhibiting this limitation.

Regstrapped on x86_64-linux-gnu, also tested with a cross to
aarch64-rtems6.  Ok to install?
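
A minimal sketch of the idea, assuming a POSIX threads implementation: the loop below keeps sleeping, and the explicit `pthread_testcancel()` gives it a cancellation point even where the sleep is not one.  The names `sleep_loop` and `cancel_sleep_loop` are made up for this example, and it uses `pthread_create` directly rather than the `std::thread` native handle that the real test exercises.

```cpp
#include <atomic>
#include <chrono>
#include <thread>
#include <pthread.h>

static std::atomic<bool> started{false};

// Thread body: loops forever.  The explicit pthread_testcancel() lets a
// pending cancellation request take effect even if the sleep itself is
// not a cancellation point on the target system.
static void* sleep_loop(void*)
{
  started = true;
  while (true)
    {
      std::this_thread::sleep_for(std::chrono::milliseconds(10));
      pthread_testcancel();
    }
  return nullptr;
}

// Starts the loop, cancels it, and reports whether the join observed
// PTHREAD_CANCELED.
bool cancel_sleep_loop()
{
  pthread_t tid;
  if (pthread_create(&tid, nullptr, sleep_loop, nullptr) != 0)
    return false;
  while (!started)
    std::this_thread::yield();
  if (pthread_cancel(tid) != 0)
    return false;
  void* result = nullptr;
  if (pthread_join(tid, &result) != 0)
    return false;
  return result == PTHREAD_CANCELED;
}
```

On a system where the sleep is a cancellation point, the extra call is harmless; the thread is simply cancelled inside the sleep instead.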


for  libstdc++-v3/ChangeLog

* testsuite/30_threads/thread/native_handle/cancel.cc: Add an
explicit cancellation point in case sleep_for lacks one.
---
 .../30_threads/thread/native_handle/cancel.cc  |6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/30_threads/thread/native_handle/cancel.cc 
b/libstdc++-v3/testsuite/30_threads/thread/native_handle/cancel.cc
index dca162b3ace1d..3cef97e8c53de 100644
--- a/libstdc++-v3/testsuite/30_threads/thread/native_handle/cancel.cc
+++ b/libstdc++-v3/testsuite/30_threads/thread/native_handle/cancel.cc
@@ -30,7 +30,11 @@ void f(std::atomic<bool>& started)
 {
   started = true;
   while (true)
-std::this_thread::sleep_for(std::chrono::milliseconds(100));
+{
+  std::this_thread::sleep_for(std::chrono::milliseconds(100));
+  // In case the target system doesn't make sleep a cancellation point...
+  pthread_testcancel();
+}
 }
 
 int main()


[PATCH] libstdc++: testsuite: use -lbsd for net_ts on RTEMS

2022-06-21 Thread Alexandre Oliva via Gcc-patches



Networking functions that net_ts tests rely on are defined in libbsd
on RTEMS, so link with it.

Regstrapped on x86_64-linux-gnu, also tested with a cross to
aarch64-rtems6.  Ok to install?


for  libstdc++-v3/ChangeLog

* testsuite/lib/dg-options.exp (add_options_for_net_ts): Add
-lbsd for RTEMS targets.
---
 libstdc++-v3/testsuite/lib/dg-options.exp |6 ++
 1 file changed, 6 insertions(+)

diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp 
b/libstdc++-v3/testsuite/lib/dg-options.exp
index 203bb0dfed505..15f37da468a5b 100644
--- a/libstdc++-v3/testsuite/lib/dg-options.exp
+++ b/libstdc++-v3/testsuite/lib/dg-options.exp
@@ -253,6 +253,12 @@ proc add_options_for_net_ts { flags } {
 # libsocket and libnsl for networking applications.
 if { [istarget *-*-solaris2*] } {
return "$flags -lsocket -lnsl"
+} elseif { [istarget *-*-rtems*] } {
+   # Adding -Wl,--gc-sections would enable a few more tests to
+   # link, but all of them fail at runtime anyway, because the
+   # io_context ctor calls pipe(), which always fails, and thus
+   # the ctor throws a system error.
+   return "$flags -lbsd"
 }
 return $flags
 }


Re: [PATCH] aarch64: testsuite: symbol-range compile only

2022-06-21 Thread Alexandre Oliva via Gcc-patches

On Jun 21, 2022, Richard Sandiford  wrote:

> Could we instead have a new target selector for whether the memory
> map includes xGB of RAM?

How about this?  Testing on aarch64-rtems6.0.  Ok to install?


aarch64: testsuite: symbol-range fallback to compile

From: Alexandre Oliva 

On some of our embedded aarch64 targets, RAM size is too small for
this test to fit.  It doesn't look like this test requires linking,
and if it does, the -tiny version may presumably get most of the
coverage without going overboard in target system requirements.

Still, linking may be useful, so introduce a TwoPlusGigs effective
target, that checks for the ability to link a program with 2GB of
sbss, and use that to select whether to link or just compile
symbol-range.c.


for  gcc/testsuite/ChangeLog

* lib/target-supports.exp
(check_effective_target_TwoPlusGigs): New.
* gcc.target/aarch64/symbol-range.c: Link only on
TwoPlusGigs targets, compile otherwise.
---
 gcc/testsuite/gcc.target/aarch64/symbol-range.c |3 ++-
 gcc/testsuite/lib/target-supports.exp   |9 +
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/symbol-range.c 
b/gcc/testsuite/gcc.target/aarch64/symbol-range.c
index d8e82fa1b2829..f9a916c7ae2f0 100644
--- a/gcc/testsuite/gcc.target/aarch64/symbol-range.c
+++ b/gcc/testsuite/gcc.target/aarch64/symbol-range.c
@@ -1,4 +1,5 @@
-/* { dg-do link } */
+/* { dg-do link { target TwoPlusGigs } } */
+/* { dg-do compile { target { ! TwoPlusGigs } } } */
 /* { dg-options "-O3 -save-temps -mcmodel=small" } */
 
 char fixed_regs[0x80000000];
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index d1f4eb7641fa7..0507d6e617fef 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2906,6 +2906,15 @@ proc check_effective_target_le { } {
 }]
 }
 
+# Return 1 if we can link a program with 2+GB of data.
+
+proc check_effective_target_TwoPlusGigs { } {
+return [check_no_compiler_messages TwoPlusGigs executable {
+   int dummy[0x80000000];
+   int main () { return 0; }
+}]
+}
+
 # Return 1 if we're generating 32-bit code using default options, 0
 # otherwise.
 



Re: [PATCH RFA] ubsan: do return check with -fsanitize=unreachable

2022-06-21 Thread Jason Merrill via Gcc-patches


On 6/20/22 16:16, Jason Merrill wrote:

On 6/20/22 07:05, Jakub Jelinek wrote:

On Fri, Jun 17, 2022 at 05:20:02PM -0400, Jason Merrill wrote:
Related to PR104642, the current situation where we get less return checking
with just -fsanitize=unreachable than no sanitize flags seems undesirable; I
propose that we do return checking when -fsanitize=unreachable.


__builtin_unreachable itself (unless turned into trap or
__ubsan_handle_builtin_unreachable) is not any kind of return checking, it
is just an optimization.


Yes, but I'm talking about "when -fsanitize=unreachable".

Looks like clang just traps on missing return if not -fsanitize=return, but
the approach in this patch seems more helpful to me if we're already
sanitizing other should-be-unreachable code.

I'm assuming that the difference in treatment of SANITIZE_UNREACHABLE and
SANITIZE_RETURN with regard to loop optimization is deliberate.


return and unreachable are separate sanitizers and such silent one way
implication can have quite unexpected consequences, especially with
-fsanitize-trap=.
Say with -fsanitize=unreachable -fsanitize-trap=unreachable, both current
trunk and clang will link without -lubsan, because the only enabled UBSan
sanitizers use __builtin_trap () which doesn't need library.
With -fsanitize=unreachable silently meaning -fsanitize=unreachable,return
the above would link in -lubsan, because while SANITIZE_UNREACHABLE uses
__builtin_trap, SANITIZE_RETURN doesn't.
Similarly, one has no_sanitize attribute, one could in certain function
__attribute__((no_sanitize ("unreachable"))) and because on the command
line using -fsanitize=unreachable assume other sanitizers aren't enabled,
but the silent addition of return sanitizer would break that.


Ah, true.  How about this approach instead?


Or, this approach relies on the PR104642 patch, and just fixes the line
number issue.  This is less clear about the problem than using the return
ubsan library function, but avoids using one entry point to implement the
other sanitizer, if that's important.

From da268c4c1f9ac0a7eaa4d428791c3ed51cf0994d Mon Sep 17 00:00:00 2001
From: Jason Merrill 
Date: Wed, 15 Jun 2022 15:45:48 -0400
Subject: [PATCH] ubsan: do return check with -fsanitize=unreachable
To: gcc-patches@gcc.gnu.org

The current situation where we get less return checking with just
-fsanitize=unreachable than no sanitize flags seems undesirable; we would
get checking there except that we explicitly avoid emitting a
__builtin_unreachable.  The documented reason is that the use of
BUILTIN_LOCATION makes the message confusing, so let's fix that.

The !optimize check seems unneeded since the PR104642 patch.
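
To make the subject concrete, here is a minimal sketch of the kind of code point under discussion.  The function name `sign_of_nonzero` is invented for this example, and the marker is written out by hand; per the description above, the C++ front end appends a similar `__builtin_unreachable` call where a value-returning function can fall off its end.

```cpp
// Every path the caller is expected to take returns explicitly; the
// marker covers the remaining path, which must never be reached.
int sign_of_nonzero(int x)
{
  if (x > 0)
    return 1;
  if (x < 0)
    return -1;
  // Reaching this point (x == 0) is undefined behaviour.  This is the
  // kind of location that -fsanitize=unreachable is meant to report.
  __builtin_unreachable();
}
```

Calls with a nonzero argument are well defined.  What happens for zero depends on the options in effect, which is what the thread is about: a diagnostic from the sanitizer runtime, a trap, or plain undefined behaviour.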

gcc/cp/ChangeLog:

	* cp-gimplify.cc (cp_maybe_instrument_return): Pass the real
	location to the ubsan unreachable function.

gcc/ChangeLog:

	* tree-cfg.cc (pass_warn_function_return::execute): Check
	for ubsan unreachable.

gcc/testsuite/ChangeLog:

	* g++.dg/ubsan/return-8c.C: New test.
---
 gcc/cp/cp-gimplify.cc  | 19 ++-
 gcc/testsuite/g++.dg/ubsan/return-8c.C | 16 
 gcc/tree-cfg.cc|  3 +++
 3 files changed, 25 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ubsan/return-8c.C

diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index c05be833357..fcea9f8d0e0 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -1806,18 +1806,6 @@ cp_maybe_instrument_return (tree fndecl)
   || !targetm.warn_func_return (fndecl))
 return;
 
-  if (!sanitize_flags_p (SANITIZE_RETURN, fndecl)
-  /* Don't add __builtin_unreachable () if not optimizing, it will not
-	 improve any optimizations in that case, just break UB code.
-	 Don't add it if -fsanitize=unreachable -fno-sanitize=return either,
-	 UBSan covers this with ubsan_instrument_return above where sufficient
-	 information is provided, while the __builtin_unreachable () below
-	 if return sanitization is disabled will just result in hard to
-	 understand runtime error without location.  */
-  && ((!optimize && !flag_unreachable_traps)
-	  || sanitize_flags_p (SANITIZE_UNREACHABLE, fndecl)))
-return;
-
   tree t = DECL_SAVED_TREE (fndecl);
   while (t)
 {
@@ -1864,7 +1852,12 @@ cp_maybe_instrument_return (tree fndecl)
   if (sanitize_flags_p (SANITIZE_RETURN, fndecl))
 t = ubsan_instrument_return (loc);
   else
-t = build_builtin_unreachable (BUILTINS_LOCATION);
+{
+  /* Pass the real location to the ubsan function.  */
+  t = build_builtin_unreachable (loc);
+  /* But set BUILTINS_LOCATION for pass_warn_function_return.  */
+  SET_EXPR_LOCATION (t, BUILTINS_LOCATION);
+}
 
   append_to_statement_list (t, p);
 }
diff --git a/gcc/testsuite/g++.dg/ubsan/return-8c.C b/gcc/testsuite/g++.dg/ubsan/return-8c.C
new file mode 100644
index 00000000000..828b24efa31
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ubsan/return-8c.C
@@ -0,0 +1,16 @@
+// PR c++/104642
+
+// -fsanit

Re: [PATCH RFA] ubsan: default to trap on unreachable at -O0 and -Og [PR104642]

2022-06-21 Thread Jason Merrill via Gcc-patches


On 6/21/22 07:17, Jakub Jelinek wrote:

On Mon, Jun 20, 2022 at 04:30:51PM -0400, Jason Merrill wrote:
I'd still prefer to see a separate -funreachable-traps.
The thing is that -fsanitize{,-recover,-trap}= are global options, not per
function (and only tweaked by no_sanitize attribute), while something
that needs to depend on the per-function -O0/-Og setting is necessarily per
function.  The *.awk changes I understand make -fsanitize= kind of per
function but -fsanitize-{recover,trap}= remain global, that is going to be a
nightmare especially with LTO which saves/restores the per function flags
and for the global ones merges them across TUs.
By separating sanitizers (which would remain global with no_sanitize
overrides) from -funreachable-traps which would be Optimization option
(with default set if unset in default_options_optimization or so)
saved/restored upon function changes that issue is gone.


Done.


--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5858,6 +5858,11 @@ builtin_decl_implicit (enum built_in_function fncode)
return builtin_info[uns_fncode].decl;
  }
  
+/* For BUILTIN_UNREACHABLE, use one of these instead of one of the above.  */

+extern tree builtin_decl_unreachable ();
+extern gcall *gimple_build_builtin_unreachable (location_t);
+extern tree build_builtin_unreachable (location_t);


I think we generally try to declare functions in the header with same
basename as the source file in which they are defined.
So, the question is if builtin_decl_unreachable and build_builtin_unreachable
shouldn't be defined in tree.cc and declared in tree.h and
gimple_build_builtin_unreachable in gimple.cc and declared in gimple.h,
using a helper defined in ubsan.cc and declared in ubsan.h (your current
unreachable_1).


Done.


+
  /* Set explicit builtin function nodes and whether it is an implicit
 function.  */
  
--- a/gcc/builtins.cc

+++ b/gcc/builtins.cc
--- a/gcc/cgraphunit.cc
+++ b/gcc/cgraphunit.cc
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
--- a/gcc/ipa-fnsummary.cc
+++ b/gcc/ipa-fnsummary.cc
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
--- a/gcc/ipa.cc
+++ b/gcc/ipa.cc


The above changes LGTM.

  if (dump_enabled_p ())
{
diff --git a/gcc/opts.cc b/gcc/opts.cc
index 959d48d173f..d92699a1bc9 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -1122,6 +1122,17 @@ finish_options (struct gcc_options *opts, struct 
gcc_options *opts_set,
opts->x_flag_no_inline = 1;
  }
  
+  /* At -O0 or -Og, turn __builtin_unreachable into a trap.  */

+  if (!opts_set->x_flag_sanitize)
+{
+  if (!opts->x_optimize || opts->x_optimize_debug)
+   opts->x_flag_sanitize = SANITIZE_UNREACHABLE|SANITIZE_RETURN;
+
+  /* Change this without regard to optimization level so we don't need to
+deal with it in optc-save-gen.awk.  */
+  opts->x_flag_sanitize_trap = SANITIZE_UNREACHABLE|SANITIZE_RETURN;
+}
+
/* Pipelining of outer loops is only possible when general pipelining
   capabilities are requested.  */
if (!opts->x_flag_sel_sched_pipelining)


See above.


--- a/gcc/sanopt.cc
+++ b/gcc/sanopt.cc
@@ -942,7 +942,15 @@ public:
{}
  
/* opt_pass methods: */

-  virtual bool gate (function *) { return flag_sanitize; }
+  virtual bool gate (function *)
+  {
+/* SANITIZE_RETURN is handled in the front-end.  When trapping,
+   SANITIZE_UNREACHABLE is handled by builtin_decl_unreachable.  */
+unsigned int mask = SANITIZE_RETURN;


There are other sanitizers purely handled in the FEs, guess as a follow-up
we should look at which of them don't really need any sanopt handling.


+if (flag_sanitize_trap & SANITIZE_UNREACHABLE)
+  mask |= SANITIZE_UNREACHABLE;
+return flag_sanitize & ~mask;
+  }
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
--- a/gcc/tree-ssa-loop-ivcanon.cc
+++ b/gcc/tree-ssa-loop-ivcanon.cc
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
--- a/gcc/tree.cc
+++ b/gcc/tree.cc


LGTM.


--- a/gcc/ubsan.cc
+++ b/gcc/ubsan.cc
@@ -638,27 +638,84 @@ ubsan_create_data (const char *name, int loccnt, const 
location_t *ploc, ...)
return var;
  }
  
-/* Instrument the __builtin_unreachable call.  We just call the libubsan

-   routine instead.  */
+/* The built-in decl to use to mark code points believed to be unreachable.
+   Typically __builtin_unreachable, but __builtin_trap if
+   -fsanitize=unreachable -fsanitize-trap=unreachable.  If only
+   -fsanitize=unreachable, we rely on sanopt to replace any calls with the
+   appropriate ubsan function.  When building a call directly, use
+   {gimple_},build_builtin_unreachable instead.  */
+
+tree
+builtin_decl_unreachable ()
+{
+  enum built_in_function fncode = BUILT_IN_UNREACHABLE;
+
+  if (sanitize_flags_p (SANITIZE_UNREACHABLE))
+{
+  if (flag_sanitize_trap & SANITIZE_UNREACHABLE)
+   fncode = BUILT_IN_TRAP;
+  /* Otherwise we want __b

Re: [PATCH] RISC-V: Add -mtune=thead-c906 to the invoke docs

2022-06-21 Thread Palmer Dabbelt


On Wed, 25 May 2022 23:11:00 PDT (-0700), Kito Cheng wrote:

LGTM, thanks :)


Committed.


On Thu, May 26, 2022 at 10:31 AM Palmer Dabbelt  wrote:


gcc/ChangeLog

* doc/invoke.texi (RISC-V): Document -mtune=thead-c906.

Signed-off-by: Palmer Dabbelt 
---
 gcc/doc/invoke.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 71098d86313..a584dc6a7f9 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -28088,7 +28088,7 @@ Permissible values for this option are: 
@samp{sifive-e20}, @samp{sifive-e21},
 Optimize the output for the given processor, specified by microarchitecture or
 particular CPU name.  Permissible values for this option are: @samp{rocket},
 @samp{sifive-3-series}, @samp{sifive-5-series}, @samp{sifive-7-series},
-@samp{size}, and all valid options for @option{-mcpu=}.
+@samp{thead-c906}, @samp{size}, and all valid options for @option{-mcpu=}.

 When @option{-mtune=} is not specified, use the setting from @option{-mcpu},
 the default is @samp{rocket} if both are not specified.
--
2.34.1

Re: [PATCH] libgompd: Fix sizes in OMPD support and add local ICVs functions.

2022-06-21 Thread Mohamed Atef via Gcc-patches

I forgot the DCO line.  I also edited the commit message, but I can't push;
even a forced push doesn't work.
Now I get a non-fast-forward error.  Is there any way to solve this?

On Mon, Jun 20, 2022 at 9:31 AM Jakub Jelinek  wrote:

> On Fri, Jun 17, 2022 at 01:20:28AM +0200, Mohamed Atef wrote:
> > libgomp/ChangeLog
> >
> > 2022-06-17  Mohamed Atef  
> >
> > * ompd-helper.h (DEREFERENCE, ACCESS_VALUE): New macros.
> > (gompd_get_proc_bind): Change the returned value from ompd_word_t
> > to const char *.
> > (gompd_get_max_task_priority): Fix format.
> > (gompd_stringize_gompd_enabled): Removed.
> > (gompd_get_gompd_enabled): New function prototype.
> > * ompd-helper.c (gompd_get_affinity_format): Call CHECK_RET.
> > Fix format in gompd_enabled GET_VALUE.
> > (gompd_stringize_gompd_enabled): Removed.
> > (gompd_get_nthread, gompd_get_thread_limit, gompd_get_run_sched,
> > gompd_get_run_sched_chunk_size, gompd_get_default_device,
> > gompd_get_dynamic, gompd_get_max_active_levels, gompd_get_proc_bind,
> > gompd_is_final,
> > gompd_is_implicit, gompd_get_team_size): New functions.
> > (gompd_get_gompd_enabled): Change the returned value from
> > ompd_word_t to const char *.
> > * ompd-init.c (ompd_process_initialize): Use sizeof_short instead of
> > sizeof_long_long in GET_VALUE argument.
> > * ompd-support.h: Change type from __UINT64_TYPE__ to unsigned short.
> > (GOMPD_FOREACH_ACCESS): Add entries for gomp_task kind
> > and final_task and gomp_team nthreads.
> > * ompd-support.c (gompd_get_offset, gompd_get_sizeof_member,
> > gompd_get_size, OMPD_SECTION): Define.
> > (gompd_access_gomp_thread_handle,
> > gompd_sizeof_gomp_thread_handle): New variables.
> > (gompd_state): Change type from __UINT64_TYPE__ to
> > unsigned short.
> > (gompd_load): Remove gompd_init_access, gompd_init_sizeof_members,
> > gompd_init_sizes, gompd_access_gomp_thread_handle,
> > gompd_sizeof_gomp_thread_handle.
> > * ompd-icv.c (ompd_get_icv_from_scope): Add thread_handle,
> > task_handle and parallel_handle. Fix format in ashandle definition.
>
> Just a nit.  After . there should be 2 spaces instead of one
> unless it is at the end of line.
>
> > Call gompd_get_nthread, gompd_get_thread_limit, gompd_get_run_sched,
> > gompd_get_run_sched_chunk_size, gompd_get_default_device,
> > gompd_get_dynamic, gompd_get_max_active_levels, gompd_get_proc_bind,
> > gompd_is_final,
> > gompd_is_implicit,
> > and gompd_get_team_size.
> > (ompd_get_icv_string_from_scope): Fix format in ashandle definition.
> > Add task_handle. Call gompd_get_gompd_enabled, and
>
> Here too.
>
> > gompd_get_proc_bind. Remove the call to
> > gompd_stringize_gompd_enabled.
>
> > +
> > +unsigned short gompd_access_gomp_thread_handle;
> > +unsigned short gompd_sizeof_gomp_thread_handle;
>
> This is undesirable, both because you are then mixing
> const and non-const objects in OMPD_SECTION if GOMP_NEEDS_THREAD_HANDLE
> is defined and because you need to duplicate the stuff in the macros.
> I'd suggest
> #ifndef GOMP_NEEDS_THREAD_HANDLE
> const unsigned short gompd_access_gomp_thread_handle
>   __attribute__ ((used)) OMPD_SECTION = 0;
> const unsigned short gompd_sizeof_gomp_thread_handle
>   __attribute__ ((used)) OMPD_SECTION = 0;
> #endif
>
> > +/* Get offset of the member m in struct t.  */
> > +#define gompd_get_offset(t, m) \
> > +  const unsigned short gompd_access_##t##_##m __attribute__ ((used)) \
> > +OMPD_SECTION \
> > +  = (unsigned short) offsetof (struct t, m);
> > +  GOMPD_FOREACH_ACCESS (gompd_get_offset)
> > +#ifdef GOMP_NEEDS_THREAD_HANDLE
> > +  gompd_access_gomp_thread_handle __attribute__ ((used)) OMPD_SECTION
> > += (unsigned short) offsetof (gomp_thread, handle);
> > +#endif
>
> Remove the above 4 lines.
>
> > +#undef gompd_get_offset
> > +/* Get size of member m in struct t.  */
> > +#define gompd_get_sizeof_member(t, m) \
> > +  const unsigned short gompd_sizeof_##t##_##m __attribute__ ((used)) \
> > +OMPD_SECTION \
> > +  = sizeof (((struct t *) NULL)->m);
> > +  GOMPD_FOREACH_ACCESS (gompd_get_sizeof_member)
> > +#ifdef GOMP_NEEDS_THREAD_HANDLE
> > +  gompd_sizeof_gomp_thread_handle __attribute__ ((used)) OMPD_SECTION
> > += sizeof (((struct gomp_thread *) NULL)->handle);
> > +#endif
>
> And these.
>
> > +#undef gompd_get_sizeof_member
> > +/* Get size of struct t.  */
> > +#define gompd_get_size(t) \
> > +  const unsigned short gompd_sizeof_##t##_ __attribute__ ((used)) \
> > +OMPD_SECTION \
> > +  = sizeof (struct t);
> > +  GOMPD_SIZES (gompd_get_size)
> > +#undef gompd_get_size
> >
> > --- a/libgomp/ompd-support.h
> > +++ b/libgomp/ompd-support.h
> > @@ -67,7 +67,7 @@
> >  #endif
> >
> >  void gompd_load (void);
> > -extern __UINT64_TYPE__ gompd_state;
> > +extern unsigned short gompd_state;
> >
> >  #define OMPD_ENABLED 0x1
>
> #ifdef GOMP_NEEDS_THREAD_HANDLE
> #define gompd_thread_handle_access gompd_access (gomp_thread, handle)
> #else
> #define gompd_thread_handle_access
> #endif
>
> above the following macr

Re: [PATCH] xtensa: Fix buffer overflow

2022-06-21 Thread Max Filippov via Gcc-patches

On Tue, Jun 21, 2022 at 12:52 PM Takayuki 'January June' Suwa
 wrote:
>
> Fortify buffer overflow message reported.
> (see https://github.com/earlephilhower/esp-quick-toolchain/issues/36)
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md (bswapsi2_internal):
> Enlarge the buffer that is obviously smaller than the template
> string given to sprintf().
> ---
>  gcc/config/xtensa/xtensa.md | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Committed to master.

-- 
Thanks.
-- Max

Re: [PATCH] Introduce -nolibstdc++ option

2022-06-21 Thread Alexandre Oliva via Gcc-patches

On Jun 21, 2022, Fangrui Song  wrote:

> Is this similar to clang -nostdlib++ ?
> When libstdc++ is selected, clang -nostdlib++ removes -lstdc++.

Sounds like they're the same indeed, but the clang option you mention
makes little sense to me, so I'd rather introduce the one that does.
If someone feels like offering this option with the same spelling as clang,
it's easy enough to add a synonym.  Now, if others feel we'd be better
off following clang's practices, I don't mind adjusting the patch to use
the same spelling.  It's not like this option is going to have much use
one way or another, aside from this testcase.


RE: [PATCH 1/2]middle-end Support optimized division by pow2 bitmask

2022-06-21 Thread Tamar Christina via Gcc-patches

> -Original Message-
> From: Tamar Christina
> Sent: Tuesday, June 14, 2022 4:58 PM
> To: Richard Sandiford ; Richard Biener
> 
> Cc: gcc-patches@gcc.gnu.org; nd 
> Subject: RE: [PATCH 1/2]middle-end Support optimized division by pow2
> bitmask
> 
> 
> 
> > -Original Message-
> > From: Richard Sandiford 
> > Sent: Tuesday, June 14, 2022 2:43 PM
> > To: Richard Biener 
> > Cc: Tamar Christina ;
> > gcc-patches@gcc.gnu.org; nd 
> > Subject: Re: [PATCH 1/2]middle-end Support optimized division by pow2
> > bitmask
> >
> > Richard Biener  writes:
> > > On Mon, 13 Jun 2022, Tamar Christina wrote:
> > >
> > >> > -Original Message-
> > >> > From: Richard Biener 
> > >> > Sent: Monday, June 13, 2022 12:48 PM
> > >> > To: Tamar Christina 
> > >> > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Sandiford
> > >> > 
> > >> > Subject: RE: [PATCH 1/2]middle-end Support optimized division by
> > >> > pow2 bitmask
> > >> >
> > >> > On Mon, 13 Jun 2022, Tamar Christina wrote:
> > >> >
> > >> > > > -Original Message-
> > >> > > > From: Richard Biener 
> > >> > > > Sent: Monday, June 13, 2022 10:39 AM
> > >> > > > To: Tamar Christina 
> > >> > > > Cc: gcc-patches@gcc.gnu.org; nd ; Richard
> > >> > > > Sandiford 
> > >> > > > Subject: Re: [PATCH 1/2]middle-end Support optimized division
> > >> > > > by
> > >> > > > pow2 bitmask
> > >> > > >
> > >> > > > On Mon, 13 Jun 2022, Richard Biener wrote:
> > >> > > >
> > >> > > > > On Thu, 9 Jun 2022, Tamar Christina wrote:
> > >> > > > >
> > >> > > > > > Hi All,
> > >> > > > > >
> > >> > > > > > In plenty of image and video processing code it's common
> > >> > > > > > to modify pixel values by a widening operation and then
> > >> > > > > > scale them back into range by dividing by 255.
> > >> > > > > >
> > >> > > > > > This patch adds an optab to allow us to emit an optimized
> > >> > > > > > sequence when doing an unsigned division that is equivalent to:
> > >> > > > > >
> > >> > > > > >    x = y / (2 ^ (bitsize (y)/2) - 1)
> > >> > > > > >
> > >> > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > >> > > > > > x86_64-pc-linux-gnu and no issues.
> > >> > > > > >
> > >> > > > > > Ok for master?
> > >> > > > >
> > >> > > > > Looking at 2/2 it seems that this is the wrong way to
> > >> > > > > attack the problem.  The ISA doesn't have such instruction
> > >> > > > > so adding an optab looks premature.  I suppose that there's
> > >> > > > > no unsigned vector integer division and thus we open-code
> > >> > > > > that in a different way?
> > >> > > > > Isn't the correct thing then to fixup that open-coding if
> > >> > > > > it is more efficient?
> > >> > > >
> > >> > >
> > >> > > The problem is that even if you fixup the open-coding it would
> > >> > > need to be something target specific? The sequence of
> > >> > > instructions we generate don't have a GIMPLE representation.
> > >> > > So whatever is generated I'd have to fixup in RTL then.
> > >> >
> > >> > What's the operation that doesn't have a GIMPLE representation?
> > >>
> > >> For NEON use two operations:
> > >> 1. Add High narrowing lowpart, essentially doing (a +w b) >>.n bitsize(a)/2
> > >>    Where the + widens and the >> narrows.  So you give it two
> > >>    shorts, get a byte
> > >> 2. Add widening add of lowpart so basically lowpart (a +w b)
> > >>
> > >> For SVE2 we use a different sequence, we use two back-to-back
> > >> sequences of:
> > >> 1. Add narrow high part (bottom).  In SVE the Top and Bottom
> > >>    instructions select Even and odd elements of the vector rather
> > >>    than "top half" and "bottom half".
> > >>
> > >>    So this instruction does : Add each vector element of the first
> > >>    source vector to the corresponding vector element of the second
> > >>    source vector, and place the most significant half of the result
> > >>    in the even-numbered half-width destination elements, while
> > >>    setting the odd-numbered elements to zero.
> > >>
> > >> So there's an explicit permute in there. The instructions are
> > >> sufficiently different that there wouldn't be a single GIMPLE
> > >> representation.
> > >
> > > I see.  Are these also useful to express scalar integer division?
> > >
> > > I'll defer to others to ack the special udiv_pow2_bitmask optab or
> > > suggest some piecemeal things other targets might be able to do as
> > > well.  It does look very special.  I'd also bikeshed it to
> > > udiv_pow2m1 since 'bitmask' is less obvious than 2^n-1 (assuming I
> > > interpreted 'bitmask' correctly ;)).  It seems to be even less
> > > general since it is a unary op and the actual divisor is
> > > constrained by the mode itself?
> >
> > Yeah, those were my concerns as well.  For n-bit numbers, the same
> > kind of arithmetic transformation can be used for any 2^m-1 for m in
> > [n/2, n), so from a target-independent point of view, m==n/2 isn't
> > particularly special.
> > Hard-coding one value of m would ma
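
For readers following the arithmetic: the division by 255 that motivates
this thread can be written with two widening adds and two shifts.  The
sketch below is only an illustration in plain C, not the code from either
patch; the helper name div255 is made up here, and the 32-bit
intermediates stand in for the widening adds described above.

```c
#include <stdint.h>

/* Hypothetical helper (not from the patch): divide a 16-bit value by
   255 using only adds and shifts.  The sums are done in 32 bits so
   that x + 257 and x + t cannot wrap.  */
static uint16_t
div255 (uint16_t x)
{
  /* t is either x / 255 or x / 255 + 1.  */
  uint32_t t = ((uint32_t) x + 257) >> 8;
  /* Adding t back and shifting again yields the exact quotient.  */
  return (uint16_t) (((uint32_t) x + t) >> 8);
}
```

Writing x = 255*q + r with 0 <= r < 255, the second sum is
256*q + r + e with e in {0, 1}, so the final shift yields q for every
16-bit input.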

Re: [PATCH] libstdc++: testsuite: call sched_yield for nonpreemptive targets

2022-06-21 Thread Alexandre Oliva via Gcc-patches

On Jun 21, 2022, Jonathan Wakely  wrote:

> On Tue, 21 Jun 2022 at 06:54, Alexandre Oliva via Libstdc++
>  wrote:
>> 
>> 
>> As in the gcc testsuite, systems without preemptive multi-threading
>> require sched_yield calls to be placed at points in which a context
>> switch might be needed to enable the test to complete.

> I'll try to remember that, but will probably forget. Is this really
> the only affected test?

Yeah, the only one in libstdc++-v3/testsuite, as of gcc-11, which is
what I've focused on for this project; I haven't gone through all fails
on master to make sure they're unrelated.  (This is holding up some
filesystem-related patches, whose tests are failing for reasons not
present in gcc-11, and some of the forward-ports needed significant
rewriting.)

libstdc++ has plenty of threading tests, but they involve sleeping or
otherwise blocking, which offers a context switch opportunity even on
non-preemptive multithreading systems.  It's busy waits that require a
sched_yield.  The gcc testsuite has a handful of those; covered by a
separate patch also posted last night.
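
As an illustration of the kind of busy wait meant here, the following is
a minimal sketch and not code from the testsuite: wait_for_flag is a
made-up name, and it assumes a POSIX system providing sched_yield in
<sched.h> plus C11 atomics.

```c
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical helper: spin until *FLAG becomes nonzero, yielding the
   CPU on every iteration so that a non-preemptive scheduler gets a
   chance to run the thread that will set the flag.  Returns the
   number of times it yielded.  */
static unsigned
wait_for_flag (atomic_int *flag)
{
  unsigned yields = 0;
  while (!atomic_load (flag))
    {
      sched_yield ();
      yields++;
    }
  return yields;
}
```

Without the sched_yield call, the loop could spin forever on a target
where the other thread only runs when the current one blocks or yields.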

-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about

[PATCH] Fortran: fix simplification of INDEX(str1,str2) [PR105691]

2022-06-21 Thread Harald Anlauf via Gcc-patches

Dear all,

compile time simplification of INDEX(str1,str2,back=.true.) gave wrong
results.  Looking at gfc_simplify_index, this appeared to be close to
a complete mess, while the runtime library code - which was developed
later - was a relief.

The solution is to use the runtime library code as a template to fix this.
I took the opportunity to change string index and length variables
in gfc_simplify_index to HOST_WIDE_INT.
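
For reference, the search strategy the new code follows can be sketched
as a standalone C function.  This is a simplified illustration and not
the gfortran code itself: index_of is a hypothetical name, plain long
stands in for HOST_WIDE_INT, and the early return for a substring longer
than the string is an assumption about the part of the function not
shown in the diff.

```c
/* Hypothetical helper: return the 1-based position of SUB in STR, or
   0 if it does not occur.  When BACK is nonzero, return the position
   of the last occurrence instead of the first.  */
static long
index_of (const char *str, long len, const char *sub, long lensub,
          int back)
{
  long start, last, i;
  int delta;

  if (lensub > len)
    return 0;
  /* An empty substring matches at the start, or just past the end
     when searching backwards.  */
  if (lensub == 0)
    return back ? len + 1 : 1;

  if (!back)
    {
      last = len + 1 - lensub;
      start = 0;
      delta = 1;
    }
  else
    {
      last = -1;
      start = len - lensub;
      delta = -1;
    }

  /* Walk the candidate start positions in the chosen direction.  */
  for (; start != last; start += delta)
    {
      for (i = 0; i < lensub; i++)
        if (str[start + i] != sub[i])
          break;
      if (i == lensub)
        return start + 1;
    }
  return 0;
}
```

A single loop with a direction step covers both the forward and the
back=.true. case, which is what removes the duplicated branches of the
old code.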

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

As this is a wrong-code issue, would this qualify for backports to
open branches?

Thanks,
Harald

From 2cfe8034340424ffa15784c61584634ccac4c4fc Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Tue, 21 Jun 2022 23:20:18 +0200
Subject: [PATCH] Fortran: fix simplification of INDEX(str1,str2) [PR105691]

gcc/fortran/ChangeLog:

	PR fortran/105691
	* simplify.cc (gfc_simplify_index): Replace old simplification
	code by the equivalent of the runtime library implementation.  Use
	HOST_WIDE_INT instead of int for string index, length variables.

gcc/testsuite/ChangeLog:

	PR fortran/105691
	* gfortran.dg/index_6.f90: New test.
---
 gcc/fortran/simplify.cc   | 131 ++
 gcc/testsuite/gfortran.dg/index_6.f90 |  31 ++
 2 files changed, 60 insertions(+), 102 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/index_6.f90

diff --git a/gcc/fortran/simplify.cc b/gcc/fortran/simplify.cc
index c8f2ef9fbf4..e8e3ec63669 100644
--- a/gcc/fortran/simplify.cc
+++ b/gcc/fortran/simplify.cc
@@ -3515,17 +3515,15 @@ gfc_expr *
 gfc_simplify_index (gfc_expr *x, gfc_expr *y, gfc_expr *b, gfc_expr *kind)
 {
   gfc_expr *result;
-  int back, len, lensub;
-  int i, j, k, count, index = 0, start;
+  bool back;
+  HOST_WIDE_INT len, lensub, start, last, i, index = 0;
+  int k, delta;

   if (x->expr_type != EXPR_CONSTANT || y->expr_type != EXPR_CONSTANT
   || ( b != NULL && b->expr_type !=  EXPR_CONSTANT))
 return NULL;

-  if (b != NULL && b->value.logical != 0)
-back = 1;
-  else
-back = 0;
+  back = (b != NULL && b->value.logical != 0);

   k = get_kind (BT_INTEGER, kind, "INDEX", gfc_default_integer_kind);
   if (k == -1)
@@ -3542,111 +3540,40 @@ gfc_simplify_index (gfc_expr *x, gfc_expr *y, gfc_expr *b, gfc_expr *kind)
   return result;
 }

-  if (back == 0)
+  if (lensub == 0)
 {
-  if (lensub == 0)
-	{
-	  mpz_set_si (result->value.integer, 1);
-	  return result;
-	}
-  else if (lensub == 1)
-	{
-	  for (i = 0; i < len; i++)
-	{
-	  for (j = 0; j < lensub; j++)
-		{
-		  if (y->value.character.string[j]
-		  == x->value.character.string[i])
-		{
-		  index = i + 1;
-		  goto done;
-		}
-		}
-	}
-	}
+  if (back)
+	index = len + 1;
   else
-	{
-	  for (i = 0; i < len; i++)
-	{
-	  for (j = 0; j < lensub; j++)
-		{
-		  if (y->value.character.string[j]
-		  == x->value.character.string[i])
-		{
-		  start = i;
-		  count = 0;
-
-		  for (k = 0; k < lensub; k++)
-			{
-			  if (y->value.character.string[k]
-			  == x->value.character.string[k + start])
-			count++;
-			}
-
-		  if (count == lensub)
-			{
-			  index = start + 1;
-			  goto done;
-			}
-		}
-		}
-	}
-	}
+	index = 1;
+  goto done;
+}

+  if (!back)
+{
+  last = len + 1 - lensub;
+  start = 0;
+  delta = 1;
 }
   else
 {
-  if (lensub == 0)
-	{
-	  mpz_set_si (result->value.integer, len + 1);
-	  return result;
-	}
-  else if (lensub == 1)
+  last = -1;
+  start = len - lensub;
+  delta = -1;
+}
+
+  for (; start != last; start += delta)
+{
+  for (i = 0; i < lensub; i++)
 	{
-	  for (i = 0; i < len; i++)
-	{
-	  for (j = 0; j < lensub; j++)
-		{
-		  if (y->value.character.string[j]
-		  == x->value.character.string[len - i])
-		{
-		  index = len - i + 1;
-		  goto done;
-		}
-		}
-	}
+	  if (x->value.character.string[start + i]
+	  != y->value.character.string[i])
+	break;
 	}
-  else
+  if (i == lensub)
 	{
-	  for (i = 0; i < len; i++)
-	{
-	  for (j = 0; j < lensub; j++)
-		{
-		  if (y->value.character.string[j]
-		  == x->value.character.string[len - i])
-		{
-		  start = len - i;
-		  if (start <= len - lensub)
-			{
-			  count = 0;
-			  for (k = 0; k < lensub; k++)
-			if (y->value.character.string[k]
-			== x->value.character.string[k + start])
-			  count++;
-
-			  if (count == lensub)
-			{
-			  index = start + 1;
-			  goto done;
-			}
-			}
-		  else
-			{
-			  continue;
-			}
-		}
-		}
-	}
+	  index = start + 1;
+	  goto done;
 	}
 }

diff --git a/gcc/testsuite/gfortran.dg/index_6.f90 b/gcc/testsuite/gfortran.dg/index_6.f90
new file mode 100644
index 000..61d492985ad
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/index_6.f90
@@ -0,0 +1,31 @@
+! { dg-do compile }
+! { dg-options "-fdump-tree-original" }
+! PR fortran/105691

[PATCH] Inline memchr with a small constant string

2022-06-21 Thread H.J. Lu via Gcc-patches

When memchr is applied to a constant string that is no longer than a word
(in bytes), inline memchr by comparing against each byte of the constant
string.

int f (int a)
{
   return  __builtin_memchr ("eE", a, 2) != 0;
}

is simplified to

int f (int a)
{
  return (char) a == 'e' || (char) a == 'E';
}
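
The equivalence of the two forms can be checked with a small standalone
sketch.  The names f_memchr and f_inlined are made up for illustration,
and the library memchr is used in place of __builtin_memchr; the result
for values outside the range of char relies on the usual modulo
conversion to char, as GCC implements it.

```c
#include <string.h>

/* Original form: search the two bytes of the constant string.  */
static int
f_memchr (int a)
{
  return memchr ("eE", a, 2) != 0;
}

/* Inlined form: compare against each byte of the constant string.  */
static int
f_inlined (int a)
{
  return (char) a == 'e' || (char) a == 'E';
}
```

Both functions look only at the low byte of a, so they agree for
arguments such as 'e' + 256 as well.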

gcc/

PR tree-optimization/103798
* match.pd (__builtin_memchr (const_str, a, N)): Inline memchr
with constant strings of no more than the bytes of a word.

gcc/testsuite/

PR tree-optimization/103798
* c-c++-common/pr103798-1.c: New test.
* c-c++-common/pr103798-2.c: Likewise.
* c-c++-common/pr103798-3.c: Likewise.
* c-c++-common/pr103798-4.c: Likewise.
* c-c++-common/pr103798-5.c: Likewise.
* c-c++-common/pr103798-6.c: Likewise.
* c-c++-common/pr103798-7.c: Likewise.
* c-c++-common/pr103798-8.c: Likewise.
---
 gcc/match.pd| 136 
 gcc/testsuite/c-c++-common/pr103798-1.c |  28 +
 gcc/testsuite/c-c++-common/pr103798-2.c |  30 ++
 gcc/testsuite/c-c++-common/pr103798-3.c |  28 +
 gcc/testsuite/c-c++-common/pr103798-4.c |  28 +
 gcc/testsuite/c-c++-common/pr103798-5.c |  26 +
 gcc/testsuite/c-c++-common/pr103798-6.c |  27 +
 gcc/testsuite/c-c++-common/pr103798-7.c |  27 +
 gcc/testsuite/c-c++-common/pr103798-8.c |  27 +
 9 files changed, 357 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/pr103798-1.c
 create mode 100644 gcc/testsuite/c-c++-common/pr103798-2.c
 create mode 100644 gcc/testsuite/c-c++-common/pr103798-3.c
 create mode 100644 gcc/testsuite/c-c++-common/pr103798-4.c
 create mode 100644 gcc/testsuite/c-c++-common/pr103798-5.c
 create mode 100644 gcc/testsuite/c-c++-common/pr103798-6.c
 create mode 100644 gcc/testsuite/c-c++-common/pr103798-7.c
 create mode 100644 gcc/testsuite/c-c++-common/pr103798-8.c

diff --git a/gcc/match.pd b/gcc/match.pd
index a63b649841b..aa4766749af 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -7976,3 +7976,139 @@ and,
 (match (bitwise_induction_p @0 @2 @3)
  (bit_not
   (nop_convert1? (bit_xor@0 (convert2? (lshift integer_onep@1 @2)) @3
+
+#if GIMPLE
+/* __builtin_memchr (const_str, a, N) != 0 ->
+   a == const_str[0] .. || a == const_str[N-1]
+   __builtin_memchr (const_str, a, N) == 0 ->
+   a != const_str[0] .. && a != const_str[N-1]
+   where N is less than the string size.  */
+(for cmp (eq ne)
+ icmp (ne eq)
+ bit_op (bit_and bit_ior)
+ (simplify (cmp:c @0 (BUILT_IN_MEMCHR ADDR_EXPR@1 @2 INTEGER_CST@3))
+  (if (UNITS_PER_WORD <= 8
+   && CHAR_TYPE_SIZE == 8
+   && BITS_PER_UNIT == 8
+   && CHAR_BIT == 8
+   && integer_zerop (@0)
+   && !integer_zerop (@3)
+   && TREE_CODE (TREE_OPERAND (@1, 0)) == STRING_CST
+   && TREE_STRING_LENGTH (TREE_OPERAND (@1, 0)) >= 2
+   && wi::leu_p (wi::to_wide (@3), UNITS_PER_WORD)
+   && wi::ltu_p (wi::to_wide (@3),
+TREE_STRING_LENGTH (TREE_OPERAND (@1, 0
+   (with
+{
+  const char *p = TREE_STRING_POINTER (TREE_OPERAND (@1, 0));
+  unsigned HOST_WIDE_INT size = TREE_INT_CST_LOW (@3);
+}
+(switch
+ (if (size == 1)
+  (icmp (convert:char_type_node @2)
+   { build_int_cst (char_type_node, p[0]); }))
+ (if (size == 2)
+  (bit_op
+   (icmp (convert:char_type_node @2)
+{ build_int_cst (char_type_node, p[0]); })
+   (icmp (convert:char_type_node @2)
+{ build_int_cst (char_type_node, p[1]); })))
+ (if (size == 3)
+  (bit_op
+   (icmp (convert:char_type_node @2)
+{ build_int_cst (char_type_node, p[0]); })
+   (bit_op
+(icmp (convert:char_type_node @2)
+ { build_int_cst (char_type_node, p[1]); })
+(icmp (convert:char_type_node @2)
+ { build_int_cst (char_type_node, p[2]); }
+ (if (size == 4)
+  (bit_op
+   (icmp (convert:char_type_node @2)
+{ build_int_cst (char_type_node, p[0]); })
+   (bit_op
+   (icmp (convert:char_type_node @2)
+{ build_int_cst (char_type_node, p[1]); })
+   (bit_op
+(icmp (convert:char_type_node @2)
+  { build_int_cst (char_type_node, p[2]); })
+(icmp (convert:char_type_node @2)
+  { build_int_cst (char_type_node, p[3]); })
+ (if (size == 5)
+  (bit_op
+   (icmp (convert:char_type_node @2)
+{ build_int_cst (char_type_node, p[0]); })
+   (bit_op
+   (icmp (convert:char_type_node @2)
+ { build_int_cst (char_type_node, p[1]); })
+   (bit_op
+(icmp (convert:char_type_node @2)
+  { build_int_cst (char_type_node, p[2]); })
+(bit_op
+ (icmp (convert:char_type_node @2)
+   { build_int_cst (char_type_node, p[3]); })
+ (icmp (convert:char_type_node @2)
+   { build_int_cst (char_type_node, p[4]); }))
+ (if (size == 6)
+  (bit

[PATCH] xtensa: Fix buffer overflow

2022-06-21 Thread Takayuki 'January June' Suwa via Gcc-patches

Fortify buffer overflow message reported.
(see https://github.com/earlephilhower/esp-quick-toolchain/issues/36)
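
As a general illustration of this class of bug, and not what the patch
does (the patch simply enlarges the buffer), a bounded formatting call
makes the overflow detectable instead of undefined.  The helper name
format_checked and its two-string template are assumptions made for this
sketch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: concatenate A and B into BUF of SIZE bytes.
   Returns 1 if the result, including the terminating NUL, fit into
   the buffer and 0 if it had to be truncated.  */
static int
format_checked (char *buf, size_t size, const char *a, const char *b)
{
  int n = snprintf (buf, size, "%s%s", a, b);
  return n >= 0 && (size_t) n < size;
}
```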

gcc/ChangeLog:

* config/xtensa/xtensa.md (bswapsi2_internal):
Enlarge the buffer that is obviously smaller than the template
string given to sprintf().
---
 gcc/config/xtensa/xtensa.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 84b975cf00e..f31ec33b362 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -536,7 +536,7 @@
 {
   rtx_insn *prev_insn = prev_nonnote_nondebug_insn (insn);
   const char *init = "ssai\t8\;";
-  static char result[64];
+  static char result[128];
   if (prev_insn && NONJUMP_INSN_P (prev_insn))
 {
   rtx x = PATTERN (prev_insn);
-- 
2.20.1

Re: [PATCH v2] tree-optimization/95821 - Convert strlen + strchr to memchr

2022-06-21 Thread Noah Goldstein via Gcc-patches

On Tue, Jun 21, 2022 at 5:01 AM Jakub Jelinek  wrote:
>
> On Mon, Jun 20, 2022 at 02:42:20PM -0700, Noah Goldstein wrote:
> > This patch allows strchr(x, c) to be replaced with memchr(x, c,
> > strlen(x) + 1) if strlen(x) has already been computed earlier in the
> > tree.
> >
> > Handles PR95821: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95821
> >
> > Since memchr doesn't need to re-find the null terminator it is faster
> > than strchr.
> >
> > bootstrapped and tested on x86_64-linux.
> >
> >   PR tree-optimization/95821
>
> This should be indented by a single tab, not two.

Fixed in V3
> >
> > gcc/
> >
> >   * tree-ssa-strlen.cc (strlen_pass::handle_builtin_strchr): Emit
> >   memchr instead of strchr if strlen already computed.
> >
> > gcc/testsuite/
> >
> >   * c-c++-common/pr95821-1.c: New test.
> >   * c-c++-common/pr95821-2.c: New test.
> >   * c-c++-common/pr95821-3.c: New test.
> >   * c-c++-common/pr95821-4.c: New test.
> >   * c-c++-common/pr95821-5.c: New test.
> >   * c-c++-common/pr95821-6.c: New test.
> >   * c-c++-common/pr95821-7.c: New test.
> >   * c-c++-common/pr95821-8.c: New test.
> > --- a/gcc/tree-ssa-strlen.cc
> > +++ b/gcc/tree-ssa-strlen.cc
> > @@ -2405,9 +2405,12 @@ strlen_pass::handle_builtin_strlen ()
> >  }
> >  }
> >
> > -/* Handle a strchr call.  If strlen of the first argument is known, replace
> > -   the strchr (x, 0) call with the endptr or x + strlen, otherwise remember
> > -   that lhs of the call is endptr and strlen of the argument is endptr - x.  */
> > +/* Handle a strchr call.  If strlen of the first argument is known,
> > +   replace the strchr (x, 0) call with the endptr or x + strlen,
> > +   otherwise remember that lhs of the call is endptr and strlen of the
> > +   argument is endptr - x.  If strlen of x is not know but has been
> > +   computed earlier in the tree then replace strchr(x, c) to
>
> Still missing space before ( above.

Sorry, fixed that in V3.
>
> > +   memchr (x, c, strlen + 1).  */
> >
> >  void
> >  strlen_pass::handle_builtin_strchr ()
> > @@ -2418,8 +2421,12 @@ strlen_pass::handle_builtin_strchr ()
> >if (lhs == NULL_TREE)
> >  return;
> >
> > -  if (!integer_zerop (gimple_call_arg (stmt, 1)))
> > -return;
> > +  tree chr = gimple_call_arg (stmt, 1);
> > +  /* strchr only uses the lower char of input so to check if its
> > + strchr (s, zerop) only take into account the lower char.  */
> > +  bool is_strchr_zerop
> > +  = (TREE_CODE (chr) == INTEGER_CST
> > +  && integer_zerop (fold_convert (char_type_node, chr)));
>
> The indentation rule is that = should be 2 columns to the right from bool,
> so
>

Fixed in V3.
>   bool is_strchr_zerop
> = (TREE_CODE (chr) == INTEGER_CST
>&& integer_zerop (fold_convert (char_type_node, chr)));
>
> > +   /* If its not strchr (s, zerop) then try and convert to
> > +  memchr since strlen has already been computed.  */
>
> This comment still has the second line weirdly indented.

Sorry, have emacs with 4-space tabs so things that look right aren't
as they seem :/

Fixed in V3 I believe.
>
> > +   tree fn = builtin_decl_explicit (BUILT_IN_MEMCHR);
> > +
> > +   /* Only need to check length strlen (s) + 1 if chr may be zero.
> > + Otherwise the last chr (which is known to be zero) can never
> > + be a match.  NB: We don't need to test if chr is a non-zero
> > + integer const with zero char bits because that is taken into
> > + account with is_strchr_zerop.  */
> > +   if (!tree_expr_nonzero_p (chr))
>
> The above is unsafe though.  tree_expr_nonzero_p (chr) will return true
> if say VRP can prove it is not zero, but because of the implicit
> (char) chr cast done by the function we need something different.
> Say if VRP determines that chr is in [1, INT_MAX] or even just [255, 257]
> it doesn't mean (char) chr won't be 0.
> So, as I've tried to explain in the previous mail, it can be done e.g. with

Added your code in V3. Thanks for the help.
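
To make the concern concrete: an int argument can be provably nonzero
and still select the terminating NUL once it is converted to char.  The
helper below is only an illustration with a made-up name; it assumes an
8-bit char and the modulo conversion that GCC documents for
out-of-range values.

```c
/* Hypothetical helper: nonzero if strchr (s, c) would actually search
   for the NUL terminator, i.e. if (char) c is zero.  */
static int
searches_for_nul (int c)
{
  return (char) c == 0;
}
```

For the range [255, 257] mentioned above, every value is nonzero as an
int, yet 256 converts to a zero char.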
>   bool chr_nonzero = false;
>   if (TREE_CODE (chr) == INTEGER_CST
>   && integer_nonzerop (fold_convert (char_type_node, chr)))
> chr_nonzero = true;
>   else if (TREE_CODE (chr) == SSA_NAME
>&& CHAR_TYPE_SIZE < INT_TYPE_SIZE)
> {
>   value_range r;
>   /* Try to determine using ranges if (char) chr must
>  be always 0.  That is true e.g. if all the subranges
>  have the INT_TYPE_SIZE - CHAR_TYPE_SIZE bits
>  the same on lower and upper bounds.  */
>   if (get_range_query (cfun)->range_of_expr (r, chr, stmt)
>   && r.kind () == VR_RANGE)
> {
>   chr_nonzero = true;
>   wide_int mask =

[PATCH v3] tree-optimization/95821 - Convert strlen + strchr to memchr

2022-06-21 Thread Noah Goldstein via Gcc-patches

This patch allows strchr(x, c) to be replaced with memchr(x, c,
strlen(x) + 1) if strlen(x) has already been computed earlier in the
tree.

Handles PR95821: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95821

Since memchr doesn't need to re-find the null terminator it is faster
than strchr.
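
At the source level the transformation corresponds roughly to the
following sketch.  The wrapper name strchr_known_len is made up for
illustration; the pass itself works on GIMPLE, not on C source.

```c
#include <string.h>

/* Hypothetical helper: behaves like strchr (s, c) when LEN is known
   to equal strlen (s).  The + 1 keeps the terminating NUL inside the
   searched range so that c == 0 still finds it.  */
static char *
strchr_known_len (char *s, int c, size_t len)
{
  return (char *) memchr (s, c, len + 1);
}
```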

bootstrapped and tested on x86_64-linux.

PR tree-optimization/95821

gcc/

* tree-ssa-strlen.cc (strlen_pass::handle_builtin_strchr): Emit
memchr instead of strchr if strlen already computed.

gcc/testsuite/

* c-c++-common/pr95821-1.c: New test.
* c-c++-common/pr95821-2.c: New test.
* c-c++-common/pr95821-3.c: New test.
* c-c++-common/pr95821-4.c: New test.
* c-c++-common/pr95821-5.c: New test.
* c-c++-common/pr95821-6.c: New test.
* c-c++-common/pr95821-7.c: New test.
* c-c++-common/pr95821-8.c: New test.
---
 gcc/testsuite/c-c++-common/pr95821-1.c |  15 
 gcc/testsuite/c-c++-common/pr95821-2.c |  17 
 gcc/testsuite/c-c++-common/pr95821-3.c |  17 
 gcc/testsuite/c-c++-common/pr95821-4.c |  16 
 gcc/testsuite/c-c++-common/pr95821-5.c |  19 +
 gcc/testsuite/c-c++-common/pr95821-6.c |  18 
 gcc/testsuite/c-c++-common/pr95821-7.c |  18 
 gcc/testsuite/c-c++-common/pr95821-8.c |  19 +
 gcc/tree-ssa-strlen.cc | 113 -
 9 files changed, 233 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/pr95821-1.c
 create mode 100644 gcc/testsuite/c-c++-common/pr95821-2.c
 create mode 100644 gcc/testsuite/c-c++-common/pr95821-3.c
 create mode 100644 gcc/testsuite/c-c++-common/pr95821-4.c
 create mode 100644 gcc/testsuite/c-c++-common/pr95821-5.c
 create mode 100644 gcc/testsuite/c-c++-common/pr95821-6.c
 create mode 100644 gcc/testsuite/c-c++-common/pr95821-7.c
 create mode 100644 gcc/testsuite/c-c++-common/pr95821-8.c

diff --git a/gcc/testsuite/c-c++-common/pr95821-1.c b/gcc/testsuite/c-c++-common/pr95821-1.c
new file mode 100644
index 000..e0beb609ea2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/pr95821-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "memchr" } } */
+
+#include 
+
+char *
+foo (char *s, char c)
+{
+   size_t slen = __builtin_strlen(s);
+   if(slen < 1000)
+   return NULL;
+
+   return __builtin_strchr(s, c);
+}
diff --git a/gcc/testsuite/c-c++-common/pr95821-2.c b/gcc/testsuite/c-c++-common/pr95821-2.c
new file mode 100644
index 000..5429f0586be
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/pr95821-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "memchr" } } */
+
+#include 
+
+char *
+foo (char *s, char c, char * other)
+{
+   size_t slen = __builtin_strlen(s);
+   if(slen < 1000)
+   return NULL;
+
+   *other = 0;
+
+   return __builtin_strchr(s, c);
+}
diff --git a/gcc/testsuite/c-c++-common/pr95821-3.c b/gcc/testsuite/c-c++-common/pr95821-3.c
new file mode 100644
index 000..bc929c6044b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/pr95821-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "memchr" } } */
+
+#include 
+
+char *
+foo (char * __restrict s, char c, char * __restrict other)
+{
+   size_t slen = __builtin_strlen(s);
+   if(slen < 1000)
+   return NULL;
+
+   *other = 0;
+
+   return __builtin_strchr(s, c);
+}
diff --git a/gcc/testsuite/c-c++-common/pr95821-4.c b/gcc/testsuite/c-c++-common/pr95821-4.c
new file mode 100644
index 000..684b41d5b70
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/pr95821-4.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "memchr" } } */
+
+#include 
+#include 
+
+char *
+foo (char *s, char c)
+{
+   size_t slen = strlen(s);
+   if(slen < 1000)
+   return NULL;
+
+   return strchr(s, c);
+}
diff --git a/gcc/testsuite/c-c++-common/pr95821-5.c b/gcc/testsuite/c-c++-common/pr95821-5.c
new file mode 100644
index 000..00c1d93b614
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/pr95821-5.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "memchr" } } */
+
+#include 
+#include 
+
+char *
+foo (char *s, char c, char * other)
+{
+   size_t slen = strlen(s);
+   if(slen < 1000)
+   return NULL;
+
+   *other = 0;
+
+   return strchr(s, c);
+}
+int main() {}
diff --git a/gcc/testsuite/c-c++-common/pr95821-6.c b/gcc/testsuite/c-c++-common/pr95821-6.c
new file mode 100644
index 000..dec839de5ea
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/pr95821-6.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "memchr" } } */
+
+#include 
+#include 
+
+char *
+foo (char * __restrict s, char c,

Re: [PATCH V2]rs6000: Store complicated constant into pool

2022-06-21 Thread Segher Boessenkool

Hi!

On Thu, Jun 16, 2022 at 03:47:49PM +0800, Jiufu Guo wrote:
> Segher Boessenkool  writes:
> >> >> --- a/gcc/testsuite/gcc.target/powerpc/medium_offset.c
> >> >> +++ b/gcc/testsuite/gcc.target/powerpc/medium_offset.c
> >> >> @@ -1,7 +1,7 @@
> >> >>  /* { dg-do compile { target { powerpc*-*-* } } } */
> >> >>  /* { dg-require-effective-target lp64 } */
> >> >>  /* { dg-options "-O" } */
> >> >> -/* { dg-final { scan-assembler-not "\\+4611686018427387904" } } */
> >> >> +/* { dg-final { scan-assembler-times {\msldi|pld\M} 1 } } */
> >> >
> >> > Why?  This is still better generated in code, no?  It should never be
> >> > loaded from a constant pool (it is hex 4000_0000_0000_0000, easy to
> >> > construct with just one or two insns).
> >> 
> >> For p8/9, two insns "lis 0x4000+sldi 32" are used:
> >> addis %r3,%r2,.LANCHOR0@toc@ha
> >> addi %r3,%r3,.LANCHOR0@toc@l
> >> lis %r9,0x4000
> >> sldi %r9,%r9,32
> >> add %r3,%r3,%r9
> >>blr
> >
> > That does not mean putting this constant in the constant pool is a good
> > idea at all, of course.
> >
> >> On p10, as expected, 'pld' would be better than 'lis+sldi'.
> >
> > Is it?
> 
> With simple cases, it shows 'pld' seems better. For perlbench, it may
> also indicate this. But I did not test this part separately.
> As you suggested, I will collect more data to check this change.

Look at p10 for example.  There can be only two pld's concurrently, and
they might miss in the cache as well (not likely hopefully, but it is
costly).  pld is between 4 and 6 cycles latency, so that is never better
than 1+1 to 3+3 what the addi+rldicr (li+sldi) are, and easily worse.

If you really see loads being better than two simple integer insns, we
need to rethink more :-/
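
For reference, the two-instruction construction of this constant can be
modelled in C.  The functions lis and sldi below are simplified stand-ins
for the instructions of the same name (lis places a sign-extended 16-bit
immediate in bits 16..31, sldi shifts left), written only to show the
arithmetic.

```c
#include <stdint.h>

/* Simplified model of "lis rD,imm": the sign-extended 16-bit
   immediate shifted left by 16.  */
static uint64_t
lis (uint16_t imm)
{
  return (uint64_t) (int64_t) (int16_t) imm << 16;
}

/* Simplified model of "sldi rD,rS,n": shift left by n bits, n < 64.  */
static uint64_t
sldi (uint64_t x, unsigned n)
{
  return x << n;
}
```

With these, sldi (lis (0x4000), 32) gives 0x4000000000000000, which is
the decimal 4611686018427387904 that the old scan-assembler-not pattern
in the test looked for.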


Segher

Re: [PATCH] gcc/configure.ac: fix --enable-fixed-point enablement [PR34422]

2022-06-21 Thread Eric Gallager via Gcc-patches

Hi, I'd like to ping this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596654.html
(cc-ing the build machinery maintainers listed in MAINTAINERS this time)

On Tue, Jun 14, 2022 at 3:51 PM Eric Gallager  wrote:
>
> So, in investigating PR target/34422, I discovered that the gcc
> subdirectory's configure script had an instance of AC_ARG_ENABLE with
> its 3rd and 4th arguments reversed: the one where it warns that the
> --enable-fixed-point flag is being ignored is the one where that flag
> hasn't even been passed in the first place. The attached patch puts
> the warning in the correct argument to the macro in question. (I'm not
> including the regeneration of gcc/configure in the patch this time
> since that confused people last time.) OK to commit, with an
> appropriate ChangeLog?

Re: [PATCH] data-ref: Improve non-loop disambiguation [PR106019]

2022-06-21 Thread Richard Biener via Gcc-patches




> On 21.06.2022 at 17:16, Richard Sandiford via Gcc-patches wrote:
> 
> When dr_may_alias_p is called without a loop context, it tries
> to use the tree-affine interface to calculate the difference
> between the two addresses and use that difference to check whether
> the gap between the accesses is known at compile time.  However, as the
> example in the PR shows, this doesn't expand SSA_NAMEs and so can easily
> be defeated by things like reassociation.
> 
> One fix would have been to use aff_combination_expand to expand the
> SSA_NAMEs, but we'd then need some way of maintaining the associated
> cache.  This patch instead reuses the innermost_loop_behavior fields
> (which exist even when no loop context is provided).
> 
> It might still be useful to do the aff_combination_expand thing too,
> if an example turns out to need it.
> 
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> 

Ok.

Thanks,
Richard 
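
The new check in the quoted patch comes down to this: once two
references share a base address and a variable offset, only their
constant parts and access sizes matter.  A minimal sketch of that
overlap test follows; ranges_overlap is a made-up name working on plain
integers, whereas the patch uses ranges_maybe_overlap_p on
poly-int values.

```c
#include <stdbool.h>

/* Hypothetical helper: do the half-open byte ranges
   [pos1, pos1 + size1) and [pos2, pos2 + size2) share a byte?  */
static bool
ranges_overlap (long pos1, long size1, long pos2, long size2)
{
  return pos1 < pos2 + size2 && pos2 < pos1 + size1;
}
```

For p[i+0] and p[i+1] with 8-byte doubles the constant parts are 0 and
8, so ranges_overlap (0, 8, 8, 8) is false and the two accesses cannot
alias.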

> Richard
> 
> 
> gcc/
>PR tree-optimization/106019
>* tree-data-ref.cc (dr_may_alias_p): Try using the
>innermost_loop_behavior to disambiguate non-loop queries.
> 
> gcc/testsuite/
>PR tree-optimization/106019
>* gcc.dg/vect/bb-slp-pr106019.c: New test.
> ---
> gcc/testsuite/gcc.dg/vect/bb-slp-pr106019.c | 15 +++
> gcc/tree-data-ref.cc| 19 +++
> 2 files changed, 34 insertions(+)
> create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-pr106019.c
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr106019.c b/gcc/testsuite/gcc.dg/vect/bb-slp-pr106019.c
> new file mode 100644
> index 000..218d7cca33d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr106019.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +
> +void f(double *p, long i)
> +{
> +p[i+0] += 1;
> +p[i+1] += 1;
> +}
> +void g(double *p, long i)
> +{
> +double *q = p + i;
> +q[0] += 1;
> +q[1] += 1;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "can't determine dependence" slp2 } } */
> diff --git a/gcc/tree-data-ref.cc b/gcc/tree-data-ref.cc
> index 8b7edf2124a..90242948c27 100644
> --- a/gcc/tree-data-ref.cc
> +++ b/gcc/tree-data-ref.cc
> @@ -2968,6 +2968,25 @@ dr_may_alias_p (const struct data_reference *a, const struct data_reference *b,
>  disambiguation.  */
>   if (!loop_nest)
> {
> +  tree tree_size_a = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (a)));
> +  tree tree_size_b = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (b)));
> +
> +  if (DR_BASE_ADDRESS (a)
> +  && DR_BASE_ADDRESS (b)
> +  && operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b))
> +  && operand_equal_p (DR_OFFSET (a), DR_OFFSET (b))
> +  && poly_int_tree_p (tree_size_a)
> +  && poly_int_tree_p (tree_size_b)
> +  && !ranges_maybe_overlap_p (wi::to_widest (DR_INIT (a)),
> +  wi::to_widest (tree_size_a),
> +  wi::to_widest (DR_INIT (b)),
> +  wi::to_widest (tree_size_b)))
> +{
> +  gcc_assert (integer_zerop (DR_STEP (a))
> +  && integer_zerop (DR_STEP (b)));
> +  return false;
> +}
> +
>   aff_tree off1, off2;
>   poly_widest_int size1, size2;
>   get_inner_reference_aff (DR_REF (a), &off1, &size1);
> -- 
> 2.25.1
>

Re: [PATCH] Enhance _Hashtable for range insertion 0/5

2022-06-21 Thread Jonathan Wakely via Gcc-patches

On Mon, 20 Jun 2022 at 17:58, François Dumont via Libstdc++
 wrote:
>
> Hi
>
> Here is a series of patches to enhance _Hashtable behavior mostly in the
> context of range insertion. I also started considering the problem of
> memory fragmentation in this container with 2 objectives:
>
> - It is easier to find out when you're done with the elements of a
> bucket if the last node of the bucket N is the before-begin node of
> bucket N + 1.
>
> - It is faster to loop through nodes of a bucket if those nodes are close
> in memory, ultimately we should have addressof(Node + 1) ==
> addressof(Node) + 1

Have these changes been profiled or benchmarked? Is it measurably
faster? By how much?


> [1/5] Make more use of user hints as both insertion and allocation hints.
>
> [2/5] Introduce a new method to check if we are still looping through
> the same bucket's nodes
>
> [3/5] Consider that all initializer_list elements are going to be inserted
>
> [4/5] Introduce a before-begin cache policy to remember which bucket is
> currently pointing on it
>
> [5/5] Prealloc nodes on _Hashtable copy and introduce a new assignment
> method which replicate buckets data structure
>
> François
>

Re: [PATCH] libgo: Recognize off64_t / loff_t type definition of musl libc

2022-06-21 Thread Ian Lance Taylor via Gcc-patches

On Tue, Jun 21, 2022 at 8:16 AM Andreas Schwab  wrote:
>
> On Jun 21 2022, Ian Lance Taylor via Gcc-patches wrote:
>
> > which seems to be Linux 3.13 and glibc 4.8.4.  On that system I see
>
> That's stone age.

I don't know who maintains these systems on the GCC compile farm.

Ian

Re: kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} WAS: Re: [PATCH 0/9] Add debug_annotate attributes

2022-06-21 Thread Jose E. Marchesi via Gcc-patches



> On 6/17/22 10:18 AM, Jose E. Marchesi wrote:
>> Hi Yonghong.
>> 
>>> On 6/15/22 1:57 PM, David Faust wrote:

 On 6/14/22 22:53, Yonghong Song wrote:
>
>
> On 6/7/22 2:43 PM, David Faust wrote:
>> Hello,
>>
>> This patch series adds support for:
>>
>> - Two new C-language-level attributes that allow to associate (to 
>> "annotate" or
>>  to "tag") particular declarations and types with arbitrary strings. 
>> As
>>  explained below, this is intended to be used to, for example, 
>> characterize
>>  certain pointer types.
>>
>> - The conveyance of that information in the DWARF output in the form of 
>> a new
>>  DIE: DW_TAG_GNU_annotation.
>>
>> - The conveyance of that information in the BTF output in the form of 
>> two new
>>  kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>
>> All of these facilities are being added to the eBPF ecosystem, and 
>> support for
>> them exists in some form in LLVM.
>>
>> Purpose
>> ===
>>
>> 1)  Addition of C-family language constructs (attributes) to specify 
>> free-text
>>tags on certain language elements, such as struct fields.
>>
>>The purpose of these annotations is to provide additional 
>> information about
>>types, variables, and function parameters of interest to the 
>> kernel. A
>>driving use case is to tag pointer types within the linux kernel 
>> and eBPF
>>programs with additional semantic information, such as '__user' 
>> or '__rcu'.
>>
>>For example, consider the linux kernel function do_execve with the
>>following declaration:
>>
>>  static int do_execve(struct filename *filename,
>> const char __user *const __user *__argv,
>> const char __user *const __user *__envp);
>>
>>Here, __user could be defined with these annotations to record 
>> semantic
>>information about the pointer parameters (e.g., they are 
>> user-provided) in
>>DWARF and BTF information. Other kernel facilities such as the 
>> eBPF verifier
>>can read the tags and make use of the information.
>>
>> 2)  Conveying the tags in the generated DWARF debug info.
>>
>>The main motivation for emitting the tags in DWARF is that the 
>> Linux kernel
>>generates its BTF information via pahole, using DWARF as a source:
>>
>>        +--------+  BTF                  BTF   +----------+
>>        | pahole |---> vmlinux.btf ----------->| verifier |
>>        +--------+                             +----------+
>>            ^                                        ^
>>            |                                        |
>>      DWARF |                                    BTF |
>>            |                                        |
>>         vmlinux                              +-------------+
>>         module1.ko                           | BPF program |
>>         module2.ko                           +-------------+
>>           ...
>>
>>This is because:
>>
>>a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>
>>b)  GCC can generate BTF for whatever target with -gbtf, but 
>> there is no
>>support for linking/deduplicating BTF in the linker.
>>
>>In the scenario above, the verifier needs access to the pointer 
>> tags of
>>both the kernel types/declarations (conveyed in the DWARF and 
>> translated
>>to BTF by pahole) and those of the BPF program (available 
>> directly in BTF).
>>
>>Another motivation for having the tag information in DWARF, 
>> unrelated to
>>BPF and BTF, is that the drgn project (another DWARF consumer) 
>> also wants
>>to benefit from these tags in order to differentiate between 
>> different
>>kinds of pointers in the kernel.
>>
>> 3)  Conveying the tags in the generated BTF debug info.
>>
>>This is easy: the main purpose of having this info in BTF is for 
>> the
>>compiled eBPF programs. The kernel verifier can then access the 
>> tags
>>of pointers used by the eBPF programs.
>>
>>
>> For more information about these tags and the motivation behind them, 
>> please
>> refer to the following linux kernel discussions:
>>
>>  https://lore.kernel.org/bpf/20210914223004.244411-1-...@fb.com/
>>  https://lore.kernel.org/bpf/20211012164838.3345699-1-...@fb.com/
>>  https://lore.kernel.org/bpf/2022012604.1504583-1-...@fb.com/
>>
>>
>> Implementation Overview

[committed] libgomp: Fix up target-31.c test [PR106045]

2022-06-21 Thread Jakub Jelinek via Gcc-patches

Hi!

The i variable is used inside of the parallel in:
  #pragma omp simd safelen(32) private (v)
  for (i = 0; i < 64; i++)
    {
      v = 3 * i;
      ll[i] = u1 + v * u2[0] + u2[1] + x + y[0] + y[1] + v + h[0] + u3[i];
    }
where i is predetermined linear (so inside the body it is safe, being a
private per-SIMD-lane variable), but the final value is written to the
shared variable, and in:
  for (i = 0; i < 64; i++)
    if (ll[i] != u1 + 3 * i * u2[0] + u2[1] + x + y[0] + y[1] + 3 * i + 13 + 14 + i)
      #pragma omp atomic write
        err = 1;
which is a normal loop and so it isn't in any way privatized there.
So we have a data race, fixed by adding private (i) clause to the
parallel.
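
To illustrate the fix, here is a much reduced sketch of the same pattern; it is
not the actual target-31.c test.  The function name check_private_index and the
simplified loop bodies are made up for this example.  The index i is declared
outside the parallel region, so without the private (i) clause every thread
would write to the same shared variable.

```c
#define N 64

/* Hypothetical reduction of the pattern discussed above.
   Returns 0 if every thread computed the expected values.  */
int check_private_index (void)
{
  int i;        /* declared outside the region: shared unless privatized */
  int err = 0;  /* shared, only ever written atomically */

  #pragma omp parallel private (i)
  {
    int ll[N];  /* block-scope variable, so private to each thread */

    /* Inside the simd body i is handled per SIMD lane, but its final
       value is stored back to the enclosing i.  */
    #pragma omp simd safelen(32)
    for (i = 0; i < N; i++)
      ll[i] = 3 * i;

    /* A normal loop: nothing privatizes i here except the clause
       on the enclosing parallel.  */
    for (i = 0; i < N; i++)
      if (ll[i] != 3 * i)
        {
          #pragma omp atomic write
          err = 1;
        }
  }
  return err;
}
```

Built without -fopenmp the pragmas are ignored and the function runs serially;
with -fopenmp each thread gets its own i, which is what the committed change
achieves for the real test.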

Tested on x86_64-linux, committed to trunk.

2022-06-21  Jakub Jelinek  
Paul Iannetta  

PR libgomp/106045
* testsuite/libgomp.c/target-31.c: Add private (i) clause.

--- libgomp/testsuite/libgomp.c/target-31.c.jj
+++ libgomp/testsuite/libgomp.c/target-31.c
@@ -76,7 +76,7 @@ main ()
   m[1] += 3 * b;
 }
 use (&a, &b, &c, &d, e, f, g, h);
-#pragma omp parallel firstprivate (u1, u2)
+#pragma omp parallel firstprivate (u1, u2) private (i)
 {
   int w = omp_get_thread_num ();
   int x = 19;


Jakub

Re: [PATCH] libgo: Recognize off64_t / loff_t type definition of musl libc

2022-06-21 Thread Andreas Schwab

On Jun 21 2022, Ian Lance Taylor via Gcc-patches wrote:

> which seems to be Linux 3.13 and glibc 4.8.4.  On that system I see

That's stone age.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

[PATCH] data-ref: Improve non-loop disambiguation [PR106019]

2022-06-21 Thread Richard Sandiford via Gcc-patches

When dr_may_alias_p is called without a loop context, it tries
to use the tree-affine interface to calculate the difference
between the two addresses and use that difference to check whether
the gap between the accesses is known at compile time.  However, as the
example in the PR shows, this doesn't expand SSA_NAMEs and so can easily
be defeated by things like reassociation.

One fix would have been to use aff_combination_expand to expand the
SSA_NAMEs, but we'd then need some way of maintaining the associated
cache.  This patch instead reuses the innermost_loop_behavior fields
(which exist even when no loop context is provided).

It might still be useful to do the aff_combination_expand thing too,
if an example turns out to need it.
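
As a rough illustration of the idea (not the GCC code itself), the check can be
modelled with plain integers: if two accesses share the same base and the same
variable offset, only their constant byte offsets and sizes decide whether they
can overlap.  The struct access type and the names below are hypothetical; in
GCC the base and offset are trees compared with operand_equal_p, and the sizes
may be poly_ints.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the innermost_loop_behavior
   fields used by the patch.  */
struct access
{
  const void *base;  /* stands in for DR_BASE_ADDRESS */
  long offset;       /* stands in for DR_OFFSET (symbolic in GCC) */
  long init;         /* stands in for DR_INIT, a constant byte offset */
  long size;         /* access size in bytes */
};

/* True if [pos1, pos1 + size1) and [pos2, pos2 + size2) intersect.  */
static bool ranges_overlap (long pos1, long size1, long pos2, long size2)
{
  return pos1 < pos2 + size2 && pos2 < pos1 + size1;
}

/* Conservative answer: false only when the accesses provably differ.  */
bool may_alias (const struct access *a, const struct access *b)
{
  if (a->base == b->base
      && a->offset == b->offset
      && !ranges_overlap (a->init, a->size, b->init, b->size))
    return false;
  return true;
}
```

For the testcase in the patch, p[i+0] and p[i+1] would both have base p and the
same variable offset derived from i, with constant offsets 0 and 8 and size 8,
so the ranges [0, 8) and [8, 16) do not overlap.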

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
PR tree-optimization/106019
* tree-data-ref.cc (dr_may_alias_p): Try using the
innermost_loop_behavior to disambiguate non-loop queries.

gcc/testsuite/
PR tree-optimization/106019
* gcc.dg/vect/bb-slp-pr106019.c: New test.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr106019.c | 15 +++
 gcc/tree-data-ref.cc| 19 +++
 2 files changed, 34 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-pr106019.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr106019.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-pr106019.c
new file mode 100644
index 000..218d7cca33d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr106019.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+
+void f(double *p, long i)
+{
+p[i+0] += 1;
+p[i+1] += 1;
+}
+void g(double *p, long i)
+{
+double *q = p + i;
+q[0] += 1;
+q[1] += 1;
+}
+
+/* { dg-final { scan-tree-dump-not "can't determine dependence" slp2 } } */
diff --git a/gcc/tree-data-ref.cc b/gcc/tree-data-ref.cc
index 8b7edf2124a..90242948c27 100644
--- a/gcc/tree-data-ref.cc
+++ b/gcc/tree-data-ref.cc
@@ -2968,6 +2968,25 @@ dr_may_alias_p (const struct data_reference *a, const 
struct data_reference *b,
  disambiguation.  */
   if (!loop_nest)
 {
+  tree tree_size_a = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (a)));
+  tree tree_size_b = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (b)));
+
+  if (DR_BASE_ADDRESS (a)
+ && DR_BASE_ADDRESS (b)
+ && operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b))
+ && operand_equal_p (DR_OFFSET (a), DR_OFFSET (b))
+ && poly_int_tree_p (tree_size_a)
+ && poly_int_tree_p (tree_size_b)
+ && !ranges_maybe_overlap_p (wi::to_widest (DR_INIT (a)),
+ wi::to_widest (tree_size_a),
+ wi::to_widest (DR_INIT (b)),
+ wi::to_widest (tree_size_b)))
+   {
+ gcc_assert (integer_zerop (DR_STEP (a))
+ && integer_zerop (DR_STEP (b)));
+ return false;
+   }
+
   aff_tree off1, off2;
   poly_widest_int size1, size2;
   get_inner_reference_aff (DR_REF (a), &off1, &size1);
-- 
2.25.1

Re: [PATCH] libgo: Recognize off64_t / loff_t type definition of musl libc

2022-06-21 Thread Ian Lance Taylor via Gcc-patches

On Sat, Jun 18, 2022 at 8:59 AM Andreas Schwab  wrote:
>
> On Jun 18 2022, Ian Lance Taylor wrote:
>
> > What target?
>
> aarch64-suse-linux, of course.

Thanks.  Sorry for missing that.


> > What is the output of
> >
> > grep loff_t TARGET/libgo/gen-sysinfo.go
>
> type ___loff_t int64
> type _loff_t int64
> type ___kernel_loff_t int64

Hmmm, it does work as expected on gcc114 in the GCC compile farm,
which seems to be Linux 3.13 and glibc 4.8.4.  On that system I see

> grep loff_t aarch64-unknown-linux-gnu/libgo/gen-sysinfo.go
type ___loff_t int64
type _loff_t int64
type ___kernel_loff_t int64
type _libgo_loff_t_type int64

Still, I may see the problem.  I've committed this patch that should fix it.

Ian
7afe467c665bba27574a183dad5c00f2c0f676e1
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 4b75dd37355..737bc483274 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-a409e049737ec9a358a19233e017d957db3d6d2a
+77821de1a149c2e6ef9c154ae384c16292173039
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/configure b/libgo/configure
index b7ff9b32867..61a49947eb9 100755
--- a/libgo/configure
+++ b/libgo/configure
@@ -15549,7 +15549,10 @@ fi
 
 CFLAGS_hold="$CFLAGS"
 CFLAGS="$OSCFLAGS $CFLAGS"
-ac_fn_c_check_type "$LINENO" "loff_t" "ac_cv_type_loff_t" "#include 
+ac_fn_c_check_type "$LINENO" "loff_t" "ac_cv_type_loff_t" "
+#include 
+#include 
+
 "
 if test "x$ac_cv_type_loff_t" = xyes; then :
 
diff --git a/libgo/configure.ac b/libgo/configure.ac
index bac58b07b41..274fcfc35c7 100644
--- a/libgo/configure.ac
+++ b/libgo/configure.ac
@@ -604,7 +604,10 @@ AC_TYPE_OFF_T
 
 CFLAGS_hold="$CFLAGS"
 CFLAGS="$OSCFLAGS $CFLAGS"
-AC_CHECK_TYPES([loff_t], [], [], [[#include ]])
+AC_CHECK_TYPES([loff_t], [], [], [[
+#include 
+#include 
+]])
 CFLAGS="$CFLAGS_hold"
 
 LIBS_hold="$LIBS"

Re: PING^1 [PATCH] i386: Disallow sibcall when calling ifunc functions with PIC register

2022-06-21 Thread Uros Bizjak via Gcc-patches

On Tue, Jun 21, 2022 at 4:46 PM H.J. Lu  wrote:
>
> On Mon, Jun 20, 2022 at 7:51 AM Uros Bizjak  wrote:
> >
> > On Mon, Jun 20, 2022 at 4:03 PM H.J. Lu  wrote:
> > >
> > > On Tue, Jun 14, 2022 at 12:25 PM H.J. Lu  wrote:
> > > >
> > > > Disallow sibcall when calling ifunc functions with PIC register so that
> > > > PIC register can be restored.
> > > >
> > > > gcc/
> > > >
> > > > PR target/105960
> > > > * config/i386/i386.cc (ix86_function_ok_for_sibcall): Return
> > > > false if PIC register is used when calling ifunc functions.
> > > >
> > > > gcc/testsuite/
> > > >
> > > > PR target/105960
> > > > * gcc.target/i386/pr105960.c: New test.
> >
> > LGTM.
>
> OK to backport to GCC 12 branch?

OK.

Thanks,
Uros.

> Thanks.
>
> > Thanks,
> > Uros.
> >
> > > > ---
> > > >  gcc/config/i386/i386.cc  |  9 +
> > > >  gcc/testsuite/gcc.target/i386/pr105960.c | 19 +++
> > > >  2 files changed, 28 insertions(+)
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr105960.c
> > > >
> > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > > index 3d189e124e4..1ca7836e11e 100644
> > > > --- a/gcc/config/i386/i386.cc
> > > > +++ b/gcc/config/i386/i386.cc
> > > > @@ -1015,6 +1015,15 @@ ix86_function_ok_for_sibcall (tree decl, tree 
> > > > exp)
> > > > }
> > > >  }
> > > >
> > > > +  if (decl && ix86_use_pseudo_pic_reg ())
> > > > +{
> > > > +  /* When PIC register is used, it must be restored after ifunc
> > > > +function returns.  */
> > > > +   cgraph_node *node = cgraph_node::get (decl);
> > > > +   if (node && node->ifunc_resolver)
> > > > +return false;
> > > > +}
> > > > +
> > > >/* Otherwise okay.  That also includes certain types of indirect 
> > > > calls.  */
> > > >return true;
> > > >  }
> > > > diff --git a/gcc/testsuite/gcc.target/i386/pr105960.c 
> > > > b/gcc/testsuite/gcc.target/i386/pr105960.c
> > > > new file mode 100644
> > > > index 000..db137a1642d
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/i386/pr105960.c
> > > > @@ -0,0 +1,19 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-require-ifunc "" } */
> > > > +/* { dg-options "-O2 -fpic" } */
> > > > +
> > > > +__attribute__((target_clones("default","fma")))
> > > > +static inline double
> > > > +expfull_ref(double x)
> > > > +{
> > > > +  return __builtin_pow(x, 0.1234);
> > > > +}
> > > > +
> > > > +double
> > > > +exp_ref(double x)
> > > > +{
> > > > +  return expfull_ref(x);
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-assembler "jmp\[ \t\]*expfull_ref@PLT" { target { 
> > > > ! ia32 } } } } */
> > > > +/* { dg-final { scan-assembler "call\[ \t\]*expfull_ref@PLT" { target 
> > > > ia32 } } } */
> > > > --
> > > > 2.36.1
> > > >
> > >
> > > PING.
> > >
> > > --
> > > H.J.
>
>
>
> --
> H.J.

Re: [PATCH] i386: Add syscall to enable AMX for latest kernels

2022-06-21 Thread Uros Bizjak via Gcc-patches

On Tue, Jun 21, 2022 at 9:41 AM Jiang, Haochen  wrote:
>
> > -Original Message-
> > From: Uros Bizjak 
> > Sent: Tuesday, June 21, 2022 3:06 PM
> > To: Jiang, Haochen 
> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao 
> > Subject: Re: [PATCH] i386: Add syscall to enable AMX for latest kernels
> >
> > On Tue, Jun 21, 2022 at 4:23 AM Jiang, Haochen 
> > wrote:
> > >
> > > > -Original Message-
> > > > From: Uros Bizjak 
> > > > Sent: Monday, June 20, 2022 10:54 PM
> > > > To: Jiang, Haochen 
> > > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao 
> > > > Subject: Re: [PATCH] i386: Add syscall to enable AMX for latest
> > > > kernels
> > > >
> > > > On Mon, Jun 20, 2022 at 10:04 AM Haochen Jiang
> > > > 
> > > > wrote:
> > > > >
> > > > > From: "Jiang, Haochen" 
> > > > >
> > > > > Hi all,
> > > > >
> > > > > We need syscall to enable AMX for kernels>=5.4. It is missing in
> > > > > current amx tests, which will cause test fail.
> > > >
> > > > So this new code is only valid for linux & co?
> > >
> > > Thanks for reminding me for that, I only test on linux since the header 
> > > file is
> > only in linux.
> > >
> > > Just updated a patch wrapping with a macro not to change the behavior on
> > windows.
> >
> > I think you want __linux__ there, not __unix__.
>
> Fixed with __linux__.

OK.

Thanks,
Uros.
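
For readers following the __linux__ discussion, a guarded version of the helper
might look roughly like the sketch below.  The constants and the body are taken
from the quoted patch; the extra __x86_64__ guard, the zero-initialized bitmask
and the fallback that returns 1 on other systems are assumptions of this sketch
(one reading of "not to change the behavior"), not necessarily what was
committed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif

#if defined(__linux__) && defined(__x86_64__)
#include <sys/syscall.h>
#include <unistd.h>
#endif

#define XFEATURE_XTILECFG       17
#define XFEATURE_XTILEDATA      18
#define XFEATURE_MASK_XTILECFG  (1 << XFEATURE_XTILECFG)
#define XFEATURE_MASK_XTILEDATA (1 << XFEATURE_XTILEDATA)
#define XFEATURE_MASK_XTILE     (XFEATURE_MASK_XTILECFG | XFEATURE_MASK_XTILEDATA)

#define ARCH_GET_XCOMP_PERM     0x1022
#define ARCH_REQ_XCOMP_PERM     0x1023

/* Returns nonzero if the AMX tile data state may be used.  */
int request_perm_xtile_data (void)
{
#if defined(__linux__) && defined(__x86_64__)
  unsigned long bitmask = 0;

  /* Ask the kernel for permission, then read back what was granted.
     Either call failing means AMX cannot be used.  */
  if (syscall (SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA)
      || syscall (SYS_arch_prctl, ARCH_GET_XCOMP_PERM, &bitmask))
    return 0;

  return (bitmask & XFEATURE_MASK_XTILE) != 0;
#else
  /* Assumption: elsewhere no permission request exists, so keep the
     previous behavior and let the CPUID checks decide.  */
  return 1;
#endif
}
```

On a Linux machine without AMX support the first syscall is expected to fail,
so the function returns 0 and the test body is skipped.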

>
> Thx,
> Haochen
>
> >
> > Uros.
> >
> > >
> > > Regtested on x86_64-pc-linux-gnu.
> > >
> > > Thx,
> > > Haochen
> > > >
> > > > Uros.
> > > >
> > > > >
> > > > > This patch aims to add them to fix this bug.
> > > > >
> > > > > BRs,
> > > > > Haochen
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > > * gcc.target/i386/amx-check.h (request_perm_xtile_data):
> > > > > New function to check if AMX is usable and enable AMX.
> > > > > (main): Run test if AMX is usable.
> > > > > ---
> > > > >  gcc/testsuite/gcc.target/i386/amx-check.h | 24
> > > > > +++
> > > > >  1 file changed, 24 insertions(+)
> > > > >
> > > > > diff --git a/gcc/testsuite/gcc.target/i386/amx-check.h
> > > > > b/gcc/testsuite/gcc.target/i386/amx-check.h
> > > > > index 434b0e59703..92ed8669304 100644
> > > > > --- a/gcc/testsuite/gcc.target/i386/amx-check.h
> > > > > +++ b/gcc/testsuite/gcc.target/i386/amx-check.h
> > > > > @@ -4,11 +4,22 @@
> > > > >  #include 
> > > > >  #include 
> > > > >  #include 
> > > > > +#include 
> > > > > +#include 
> > > > >  #ifdef DEBUG
> > > > >  #include 
> > > > >  #endif
> > > > >  #include "cpuid.h"
> > > > >
> > > > > +#define XFEATURE_XTILECFG  17
> > > > > +#define XFEATURE_XTILEDATA 18
> > > > > +#define XFEATURE_MASK_XTILECFG (1 << XFEATURE_XTILECFG)
> > > > > +#define XFEATURE_MASK_XTILEDATA(1 << XFEATURE_XTILEDATA)
> > > > > +#define XFEATURE_MASK_XTILE(XFEATURE_MASK_XTILECFG |
> > > > XFEATURE_MASK_XTILEDATA)
> > > > > +
> > > > > +#define ARCH_GET_XCOMP_PERM0x1022
> > > > > +#define ARCH_REQ_XCOMP_PERM0x1023
> > > > > +
> > > > >  /* TODO: The tmm emulation is temporary for current
> > > > > AMX implementation with no tmm regclass, should
> > > > > be changed in the future. */
> > > > > @@ -44,6 +55,18 @@ typedef struct __tile
> > > > >  /* Stride (colum width in byte) used for tileload/store */
> > > > > #define _STRIDE 64
> > > > >
> > > > > +/* We need syscall to use amx functions */ int
> > > > > +request_perm_xtile_data() {
> > > > > +  unsigned long bitmask;
> > > > > +
> > > > > +  if (syscall (SYS_arch_prctl, ARCH_REQ_XCOMP_PERM,
> > > > XFEATURE_XTILEDATA) ||
> > > > > +  syscall (SYS_arch_prctl, ARCH_GET_XCOMP_PERM, &bitmask))
> > > > > +return 0;
> > > > > +
> > > > > +  return (bitmask & XFEATURE_MASK_XTILE) != 0; }
> > > > > +
> > > > >  /* Initialize tile config by setting all tmm size to 16x64 */
> > > > > void init_tile_config (__tilecfg_u *dst)  { @@ -186,6 +209,7 @@
> > > > > main () #ifdef AMX_BF16
> > > > >&& __builtin_cpu_supports ("amx-bf16")  #endif
> > > > > +  && request_perm_xtile_data ()
> > > > >)
> > > > >  {
> > > > >DO_TEST ();
> > > > > --
> > > > > 2.18.2
> > > > >

Re: PING^1 [PATCH] i386: Disallow sibcall when calling ifunc functions with PIC register

2022-06-21 Thread H.J. Lu via Gcc-patches

On Mon, Jun 20, 2022 at 7:51 AM Uros Bizjak  wrote:
>
> On Mon, Jun 20, 2022 at 4:03 PM H.J. Lu  wrote:
> >
> > On Tue, Jun 14, 2022 at 12:25 PM H.J. Lu  wrote:
> > >
> > > Disallow sibcall when calling ifunc functions with PIC register so that
> > > PIC register can be restored.
> > >
> > > gcc/
> > >
> > > PR target/105960
> > > * config/i386/i386.cc (ix86_function_ok_for_sibcall): Return
> > > false if PIC register is used when calling ifunc functions.
> > >
> > > gcc/testsuite/
> > >
> > > PR target/105960
> > > * gcc.target/i386/pr105960.c: New test.
>
> LGTM.

OK to backport to GCC 12 branch?

Thanks.

> Thanks,
> Uros.
>
> > > ---
> > >  gcc/config/i386/i386.cc  |  9 +
> > >  gcc/testsuite/gcc.target/i386/pr105960.c | 19 +++
> > >  2 files changed, 28 insertions(+)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr105960.c
> > >
> > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > index 3d189e124e4..1ca7836e11e 100644
> > > --- a/gcc/config/i386/i386.cc
> > > +++ b/gcc/config/i386/i386.cc
> > > @@ -1015,6 +1015,15 @@ ix86_function_ok_for_sibcall (tree decl, tree exp)
> > > }
> > >  }
> > >
> > > +  if (decl && ix86_use_pseudo_pic_reg ())
> > > +{
> > > +  /* When PIC register is used, it must be restored after ifunc
> > > +function returns.  */
> > > +   cgraph_node *node = cgraph_node::get (decl);
> > > +   if (node && node->ifunc_resolver)
> > > +return false;
> > > +}
> > > +
> > >/* Otherwise okay.  That also includes certain types of indirect 
> > > calls.  */
> > >return true;
> > >  }
> > > diff --git a/gcc/testsuite/gcc.target/i386/pr105960.c 
> > > b/gcc/testsuite/gcc.target/i386/pr105960.c
> > > new file mode 100644
> > > index 000..db137a1642d
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/pr105960.c
> > > @@ -0,0 +1,19 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-require-ifunc "" } */
> > > +/* { dg-options "-O2 -fpic" } */
> > > +
> > > +__attribute__((target_clones("default","fma")))
> > > +static inline double
> > > +expfull_ref(double x)
> > > +{
> > > +  return __builtin_pow(x, 0.1234);
> > > +}
> > > +
> > > +double
> > > +exp_ref(double x)
> > > +{
> > > +  return expfull_ref(x);
> > > +}
> > > +
> > > +/* { dg-final { scan-assembler "jmp\[ \t\]*expfull_ref@PLT" { target { ! 
> > > ia32 } } } } */
> > > +/* { dg-final { scan-assembler "call\[ \t\]*expfull_ref@PLT" { target 
> > > ia32 } } } */
> > > --
> > > 2.36.1
> > >
> >
> > PING.
> >
> > --
> > H.J.



-- 
H.J.

RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.

2022-06-21 Thread Tamar Christina via Gcc-patches

> -Original Message-
> From: Richard Biener 
> Sent: Tuesday, June 21, 2022 2:15 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; ja...@redhat.com
> Subject: RE: [PATCH 2/2]middle-end: Support recognition of three-way
> max/min.
> 
> On Mon, 20 Jun 2022, Tamar Christina wrote:
> 
> > > -Original Message-
> > > From: Richard Biener 
> > > Sent: Monday, June 20, 2022 9:36 AM
> > > To: Tamar Christina 
> > > Cc: gcc-patches@gcc.gnu.org; nd ; ja...@redhat.com
> > > Subject: Re: [PATCH 2/2]middle-end: Support recognition of three-way
> > > max/min.
> > >
> > > On Thu, 16 Jun 2022, Tamar Christina wrote:
> > >
> > > > Hi All,
> > > >
> > > > This patch adds support for three-way min/max recognition in phi-opts.
> > > >
> > > > Concretely for e.g.
> > > >
> > > > #include 
> > > >
> > > > uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > uint8_t  xk;
> > > > if (xc < xm) {
> > > > xk = (uint8_t) (xc < xy ? xc : xy);
> > > > } else {
> > > > xk = (uint8_t) (xm < xy ? xm : xy);
> > > > }
> > > > return xk;
> > > > }
> > > >
> > > > we generate:
> > > >
> > > >[local count: 1073741824]:
> > > >   _5 = MIN_EXPR ;
> > > >   _7 = MIN_EXPR ;
> > > >   return _7;
> > > >
> > > > instead of
> > > >
> > > >   :
> > > >   if (xc_2(D) < xm_3(D))
> > > > goto ;
> > > >   else
> > > > goto ;
> > > >
> > > >   :
> > > >   xk_5 = MIN_EXPR ;
> > > >   goto ;
> > > >
> > > >   :
> > > >   xk_6 = MIN_EXPR ;
> > > >
> > > >   :
> > > >   # xk_1 = PHI 
> > > >   return xk_1;
> > > >
> > > > The same function also immediately deals with turning a
> > > > minimization problem into a maximization one if the results are
> > > > inverted.  We do this here since doing it in match.pd would end up
> > > > changing the shape of the BBs and adding additional instructions
> > > > which would prevent various
> > > optimizations from working.
> > >
> > > Can you explain a bit more?
> >
> > I'll respond to this one first In case it changes how you want me to 
> > proceed.
> >
> > I initially had used a match.pd rule to do the min to max conversion,
> > but a number of testcases started to fail.  The reason was that a lot
> > of the foldings checked that the BB contains only a single SSA and that that
> SSA is a phi node.
> >
> > By changing the min into max, the negation of the result ends up In
> > the same BB and so the optimizations are skipped leading to less optimal
> code.
> >
> > I did look into relaxing those phi opts but it felt like I'd make a
> > rather arbitrary exception for minus and seemed better to handle it in the
> minmax folding.
> 
> That's a possibility but we try to maintain a single place for a transform 
> which
> might be in match.pd which would then also handle this when there's a RHS
> COND_EXPR connecting the stmts rather than a PHI node.

Sorry, I am probably missing something here.  Just to be clear, at the moment I
do it all in minmax_replacement, so everything is already in one place.  It's a
simple extension of the code already there.

Are you suggesting I have to move it all to match.pd?  That's non-trivial.

Thanks,
Tamar
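
For reference, the source-level equivalence behind the three-way min example in
the quoted description can be checked exhaustively.  three_min is copied from
the description; three_min_flat, min_u8 and count_three_min_mismatches are
hypothetical names for this sketch, and the operand order of the two chained
minimums is an assumption, since the operands of the MIN_EXPRs did not survive
in the quoted dump.

```c
#include <stdint.h>

/* The branchy form from the patch description.  */
uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t xk;
  if (xc < xm)
    xk = (uint8_t) (xc < xy ? xc : xy);
  else
    xk = (uint8_t) (xm < xy ? xm : xy);
  return xk;
}

static uint8_t min_u8 (uint8_t a, uint8_t b)
{
  return a < b ? a : b;
}

/* The branch-free form: two chained minimums.  */
uint8_t three_min_flat (uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t t = min_u8 (xc, xy);
  return min_u8 (xm, t);
}

/* Compare both forms over all 2^24 inputs; returns the mismatch count.  */
int count_three_min_mismatches (void)
{
  int mismatches = 0;
  for (int c = 0; c < 256; c++)
    for (int m = 0; m < 256; m++)
      for (int y = 0; y < 256; y++)
        if (three_min ((uint8_t) c, (uint8_t) m, (uint8_t) y)
            != three_min_flat ((uint8_t) c, (uint8_t) m, (uint8_t) y))
          mismatches++;
  return mismatches;
}
```

Both functions return the smallest of the three arguments, so the count is zero.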

> 
> Richard.
> 
> > Thanks,
> > Tamar
> >
> > >
> > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > >
> > > > Ok for master?
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * tree-ssa-phiopt.cc (minmax_replacement): Optionally search for
> > > the phi
> > > > sequence of a three-way conditional.
> > > > (replace_phi_edge_with_variable): Support deferring of BB 
> > > > removal.
> > > > (tree_ssa_phiopt_worker): Detect diamond phi structure for 
> > > > three-
> > > way
> > > > min/max.
> > > > (strip_bit_not, invert_minmax_code): New.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > * gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't
> > > optimize
> > > > code away.
> > > > * gcc.dg/tree-ssa/minmax-3.c: New test.
> > > > * gcc.dg/tree-ssa/minmax-4.c: New test.
> > > > * gcc.dg/tree-ssa/minmax-5.c: New test.
> > > > * gcc.dg/tree-ssa/minmax-6.c: New test.
> > > > * gcc.dg/tree-ssa/minmax-7.c: New test.
> > > > * gcc.dg/tree-ssa/minmax-8.c: New test.
> > > >
> > > > --- inline copy of patch --
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > > new file mode 100644
> > > > index
> > > >
> > >
> ..de3b2e946e81701e3b75f580e
> > > 6a8
> > > > 43695a05786e
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > > @@ -0,0 +1,17 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > > +
> > > > +#include 
> > > > +
> > > > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > > +   uint8_t  xk;
> > > > +if (xc < xm) {

doc: Document module language-linkage supported

2022-06-21 Thread Nathan Sidwell via Gcc-patches



I missed that we had documented this as unimplemented when I implemented it.


--
Nathan Sidwell

From f1fcd6e3ad911945bc3c24a3a5c7ea99b910121e Mon Sep 17 00:00:00 2001
From: Nathan Sidwell 
Date: Tue, 21 Jun 2022 06:23:11 -0700
Subject: [PATCH] doc: Document module language-linkage supported

I missed we documented this as unimplemented, when I implemented it.

	gcc/
	* doc/invoke.texi (C++ Modules): Remove language-linkage
	as missing feature.
---
 gcc/doc/invoke.texi | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 50f57877477..81d13f4e78e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -34639,13 +34639,6 @@ Papers p1815 (@uref{https://wg21.link/p1815}) and p2003
 exported region may reference (for instance, the entities an exported
 template definition may reference).  These are not fully implemented.
 
-@item Language-linkage module attachment
-Declarations with explicit language linkage (@code{extern "C"} or
-@code{extern "C++"}) are attached to the global module, even when in
-the purview of a named module.  This is not implemented.  Such
-declarations will be attached to the module, if any, in which they are
-declared.
-
 @item Standard Library Header Units
 The Standard Library is not provided as importable header units.  If
 you want to import such units, you must explicitly build them first.
-- 
2.30.2

RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.

2022-06-21 Thread Richard Biener via Gcc-patches

On Mon, 20 Jun 2022, Tamar Christina wrote:

> > -Original Message-
> > From: Richard Biener 
> > Sent: Monday, June 20, 2022 9:36 AM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd ; ja...@redhat.com
> > Subject: Re: [PATCH 2/2]middle-end: Support recognition of three-way
> > max/min.
> > 
> > On Thu, 16 Jun 2022, Tamar Christina wrote:
> > 
> > > Hi All,
> > >
> > > This patch adds support for three-way min/max recognition in phi-opts.
> > >
> > > Concretely for e.g.
> > >
> > > #include 
> > >
> > > uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > >   uint8_t  xk;
> > > if (xc < xm) {
> > > xk = (uint8_t) (xc < xy ? xc : xy);
> > > } else {
> > > xk = (uint8_t) (xm < xy ? xm : xy);
> > > }
> > > return xk;
> > > }
> > >
> > > we generate:
> > >
> > >[local count: 1073741824]:
> > >   _5 = MIN_EXPR ;
> > >   _7 = MIN_EXPR ;
> > >   return _7;
> > >
> > > instead of
> > >
> > >   :
> > >   if (xc_2(D) < xm_3(D))
> > > goto ;
> > >   else
> > > goto ;
> > >
> > >   :
> > >   xk_5 = MIN_EXPR ;
> > >   goto ;
> > >
> > >   :
> > >   xk_6 = MIN_EXPR ;
> > >
> > >   :
> > >   # xk_1 = PHI 
> > >   return xk_1;
> > >
> > > The same function also immediately deals with turning a minimization
> > > problem into a maximization one if the results are inverted.  We do
> > > this here since doing it in match.pd would end up changing the shape
> > > of the BBs and adding additional instructions which would prevent various
> > optimizations from working.
> > 
> > Can you explain a bit more?
> 
> I'll respond to this one first In case it changes how you want me to proceed.
> 
> I initially had used a match.pd rule to do the min to max conversion, but a
> number of testcases started to fail.  The reason was that a lot of the 
> foldings
> checked that the BB contains only a single SSA and that that SSA is a phi 
> node.
> 
> By changing the min into max, the negation of the result ends up In the same 
> BB
> and so the optimizations are skipped leading to less optimal code.
> 
> I did look into relaxing those phi opts but it felt like I'd make a rather 
> arbitrary
> exception for minus and seemed better to handle it in the minmax folding. 

That's a possibility but we try to maintain a single place for a transform
which might be in match.pd which would then also handle this when
there's a RHS COND_EXPR connecting the stmts rather than a PHI node.

Richard.
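
As background for the min-to-max conversion mentioned in the quoted description,
the arithmetic identity involved is presumably ~MIN (~a, ~b) == MAX (a, b) for
unsigned values; the helper names strip_bit_not and invert_minmax_code in the
ChangeLog point that way, but the exact patterns the patch handles are not
shown here, so treat this as an assumption.  The function names in the sketch
are hypothetical.

```c
#include <stdint.h>

static uint8_t min_u8 (uint8_t a, uint8_t b)
{
  return a < b ? a : b;
}

static uint8_t max_u8 (uint8_t a, uint8_t b)
{
  return a > b ? a : b;
}

/* A minimum over inverted inputs whose result is inverted again.  */
uint8_t max_via_inverted_min (uint8_t a, uint8_t b)
{
  return (uint8_t) ~min_u8 ((uint8_t) ~a, (uint8_t) ~b);
}

/* Check the identity for every pair of uint8_t values.  */
int count_inversion_mismatches (void)
{
  int mismatches = 0;
  for (int a = 0; a < 256; a++)
    for (int b = 0; b < 256; b++)
      if (max_via_inverted_min ((uint8_t) a, (uint8_t) b)
          != max_u8 ((uint8_t) a, (uint8_t) b))
        mismatches++;
  return mismatches;
}
```

Because ~x on a uint8_t is 255 - x, the minimum of the inverted values is
255 - MAX (a, b), and inverting that result gives MAX (a, b) back.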

> Thanks,
> Tamar
> 
> > 
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > >   * tree-ssa-phiopt.cc (minmax_replacement): Optionally search for
> > the phi
> > >   sequence of a three-way conditional.
> > >   (replace_phi_edge_with_variable): Support deferring of BB removal.
> > >   (tree_ssa_phiopt_worker): Detect diamond phi structure for three-
> > way
> > >   min/max.
> > >   (strip_bit_not, invert_minmax_code): New.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   * gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't
> > optimize
> > >   code away.
> > >   * gcc.dg/tree-ssa/minmax-3.c: New test.
> > >   * gcc.dg/tree-ssa/minmax-4.c: New test.
> > >   * gcc.dg/tree-ssa/minmax-5.c: New test.
> > >   * gcc.dg/tree-ssa/minmax-6.c: New test.
> > >   * gcc.dg/tree-ssa/minmax-7.c: New test.
> > >   * gcc.dg/tree-ssa/minmax-8.c: New test.
> > >
> > > --- inline copy of patch --
> > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > new file mode 100644
> > > index
> > >
> > ..de3b2e946e81701e3b75f580e
> > 6a8
> > > 43695a05786e
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > > @@ -0,0 +1,17 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include 
> > > +
> > > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > + uint8_t  xk;
> > > +if (xc < xm) {
> > > +xk = (uint8_t) (xc < xy ? xc : xy);
> > > +} else {
> > > +xk = (uint8_t) (xm < xy ? xm : xy);
> > > +}
> > > +return xk;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> > > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > > new file mode 100644
> > > index
> > >
> > ..0b6d667be868c2405eaefd17c
> > b52
> > > 2da44bafa0e2
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > > @@ -0,0 +1,17 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > > +
> > > +#include <stdint.h>
> > > +
> > > +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> > > +uint8_t   xk;
> > > +if (xc > xm) {
> > > +xk = (uint8_t) (xc > xy

Re: [PATCH][wwwdocs] gcc-13: m2link branch

2022-06-21 Thread Gaius Mulley via Gcc-patches

Gerald Pfeifer  writes:

> Hi Gaius,
>
> On Tue, 21 Jun 2022, Gaius Mulley wrote:
>> here is a proposed entry describing a new branch m2link containing the
>> new scaffold development for modula-2.  As the description says it is
>> expected that this branch be short lived - terminating once significant
>> regression tests pass,
>
> this looks fine, thank you!

great thanks!

>> +branch once a significant number of regression tests pass.  It is
>
> 1/2 nitpick, 1/2 language question (hence including Sandra):
>
> Should this be "passes" since it relates to "number" (singular), or does 
> the implied plural allow for the plural in the verb form in English? Or 
> even require it?

interesting - the singular pass I think is correct.  One of the many
oddities of English and rather counterintuitive, as:

 branch once a significant number of regression tests pass.

or

 branch once the singular foobar regression test passes.

:-) are correct.  I'm sure there is a formal grammar explanation - but I
confess to not knowing it right now.

> (The patch is fine either way; more of a curious question.)

thanks,
Gaius

Re: [PATCH] libgo: Recognize off64_t / loff_t type definition of musl libc

2022-06-21 Thread Franz Sirl


On 2022-06-21 at 09:34, Sören Tempel via Gcc-patches wrote:

Hi,

The problem is: glibc defines loff_t in sys/types.h, not fcntl.h (where musl
defines it). I falsely assumed that the newly committed AC_CHECK_TYPES
invocation would include fcntl.h *in addition to* AC_INCLUDES_DEFAULT.
However, as it turns out specifying includes for AC_CHECK_TYPES overwrites the
default instead of appending to it.

The patch below should fix this by appending to AC_INCLUDES_DEFAULT explicitly.
Alternatively, we could try to add fcntl.h to AC_INCLUDES_DEFAULT, though my
autotools knowledge is severely limited and hence I am not sure how this would
be achieved.

diff --git a/libgo/configure b/libgo/configure
index b7ff9b3..273af1d 100755
--- a/libgo/configure
+++ b/libgo/configure
@@ -15549,8 +15549,10 @@ fi

  CFLAGS_hold="$CFLAGS"
  CFLAGS="$OSCFLAGS $CFLAGS"
-ac_fn_c_check_type "$LINENO" "loff_t" "ac_cv_type_loff_t" "#include <fcntl.h>
-"
+ac_fn_c_check_type "$LINENO" "loff_t" "ac_cv_type_loff_t" "
+$ac_includes_default
+#include <fcntl.h>
+ "
  if test "x$ac_cv_type_loff_t" = xyes; then :

  cat >>confdefs.h <<_ACEOF
diff --git a/libgo/configure.ac b/libgo/configure.ac
index bac58b0..b237392 100644
--- a/libgo/configure.ac
+++ b/libgo/configure.ac
@@ -604,7 +604,9 @@ AC_TYPE_OFF_T

  CFLAGS_hold="$CFLAGS"
  CFLAGS="$OSCFLAGS $CFLAGS"
-AC_CHECK_TYPES([loff_t], [], [], [[#include <fcntl.h>]])
+AC_CHECK_TYPES([loff_t], [], [], [
+AC_INCLUDES_DEFAULT
+#include <fcntl.h>])
  CFLAGS="$CFLAGS_hold"

  LIBS_hold="$LIBS"



Hi,

the patch restores bootstrap for me on x86_64-suse-linux.

Franz.

Re: [PATCH] if-to-switch: Don't skip the first condition bb when find_conditions in if-to-switch [PR105740]

2022-06-21 Thread Richard Biener via Gcc-patches

On Tue, Jun 21, 2022 at 12:05 PM Xionghu Luo  wrote:
>
>
>
> On 2022/6/21 15:33, Richard Biener via Gcc-patches wrote:
> > On Tue, Jun 21, 2022 at 5:06 AM xionghuluo(罗雄虎) via Gcc-patches
> >  wrote:
> >>
> >>
> >> Bootstrap and regression tested pass on x86_64-linux-gnu, OK for master?
> >
> > OK if you add a comment that an empty conditions_in_bbs indicates we are
> > processing the first basic-block (that's not obvious to me).
> >
>
>
> Thanks.  Committed in  r13-1184, I assume this doesn't need backport?

No, it's not a regression.

Richard.

> Thanks,
> Xionghu

Re: [PATCH v2] tree-optimization/95821 - Convert strlen + strchr to memchr

2022-06-21 Thread Jakub Jelinek via Gcc-patches

On Mon, Jun 20, 2022 at 02:42:20PM -0700, Noah Goldstein wrote:
> This patch allows strchr(x, c) to be replaced with memchr(x, c,
> strlen(x) + 1) if strlen(x) has already been computed earlier in the
> tree.
> 
> Handles PR95821: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95821
> 
> Since memchr doesn't need to re-find the null terminator it is faster
> than strchr.
> 
> bootstrapped and tested on x86_64-linux.
> 
>   PR tree-optimization/95821

This should be indented by a single tab, not two.
> 
> gcc/
> 
>   * tree-ssa-strlen.cc (strlen_pass::handle_builtin_strchr): Emit
>   memchr instead of strchr if strlen already computed.
> 
> gcc/testsuite/
> 
>   * c-c++-common/pr95821-1.c: New test.
>   * c-c++-common/pr95821-2.c: New test.
>   * c-c++-common/pr95821-3.c: New test.
>   * c-c++-common/pr95821-4.c: New test.
>   * c-c++-common/pr95821-5.c: New test.
>   * c-c++-common/pr95821-6.c: New test.
>   * c-c++-common/pr95821-7.c: New test.
>   * c-c++-common/pr95821-8.c: New test.
> --- a/gcc/tree-ssa-strlen.cc
> +++ b/gcc/tree-ssa-strlen.cc
> @@ -2405,9 +2405,12 @@ strlen_pass::handle_builtin_strlen ()
>  }
>  }
>  
> -/* Handle a strchr call.  If strlen of the first argument is known, replace
> -   the strchr (x, 0) call with the endptr or x + strlen, otherwise remember
> -   that lhs of the call is endptr and strlen of the argument is endptr - x.  
> */
> +/* Handle a strchr call.  If strlen of the first argument is known,
> +   replace the strchr (x, 0) call with the endptr or x + strlen,
> +   otherwise remember that lhs of the call is endptr and strlen of the
> +   argument is endptr - x.  If strlen of x is not know but has been
> +   computed earlier in the tree then replace strchr(x, c) to

Still missing space before ( above.

> +   memchr (x, c, strlen + 1).  */
>  
>  void
>  strlen_pass::handle_builtin_strchr ()
> @@ -2418,8 +2421,12 @@ strlen_pass::handle_builtin_strchr ()
>if (lhs == NULL_TREE)
>  return;
>  
> -  if (!integer_zerop (gimple_call_arg (stmt, 1)))
> -return;
> +  tree chr = gimple_call_arg (stmt, 1);
> +  /* strchr only uses the lower char of input so to check if its
> + strchr (s, zerop) only take into account the lower char.  */
> +  bool is_strchr_zerop
> +  = (TREE_CODE (chr) == INTEGER_CST
> +  && integer_zerop (fold_convert (char_type_node, chr)));

The indentation rule is that = should be 2 columns to the right from bool,
so

  bool is_strchr_zerop
    = (TREE_CODE (chr) == INTEGER_CST
       && integer_zerop (fold_convert (char_type_node, chr)));

> +   /* If its not strchr (s, zerop) then try and convert to
> +  memchr since strlen has already been computed.  */

This comment still has the second line weirdly indented.

> +   tree fn = builtin_decl_explicit (BUILT_IN_MEMCHR);
> +
> +   /* Only need to check length strlen (s) + 1 if chr may be zero.
> + Otherwise the last chr (which is known to be zero) can never
> + be a match.  NB: We don't need to test if chr is a non-zero
> + integer const with zero char bits because that is taken into
> + account with is_strchr_zerop.  */
> +   if (!tree_expr_nonzero_p (chr))

The above is unsafe though.  tree_expr_nonzero_p (chr) will return true
if say VRP can prove it is not zero, but because of the implicit
(char) chr cast done by the function we need something different.
Say if VRP determines that chr is in [1, INT_MAX] or even just [255, 257]
it doesn't mean (char) chr won't be 0.
So, as I've tried to explain in the previous mail, it can be done e.g. with
  bool chr_nonzero = false;
  if (TREE_CODE (chr) == INTEGER_CST
      && integer_nonzerop (fold_convert (char_type_node, chr)))
    chr_nonzero = true;
  else if (TREE_CODE (chr) == SSA_NAME
           && CHAR_TYPE_SIZE < INT_TYPE_SIZE)
    {
      value_range r;
      /* Try to determine using ranges if (char) chr must
         be always 0.  That is true e.g. if all the subranges
         have the INT_TYPE_SIZE - CHAR_TYPE_SIZE bits
         the same on lower and upper bounds.  */
      if (get_range_query (cfun)->range_of_expr (r, chr, stmt)
          && r.kind () == VR_RANGE)
        {
          chr_nonzero = true;
          wide_int mask = wi::mask (CHAR_TYPE_SIZE, true,
                                    INT_TYPE_SIZE);
          for (int i = 0; i < r.num_pairs (); ++i)
            if ((r.lower_bound (i) & mask)
                != (r.upper_bound (i) & mask))
              {
                chr_nonzero = false;
                break;
              }
        }
    }
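
The point about the implicit (char) conversion can be illustrated outside of GCC with a small standalone sketch.  It is not the tree-ssa-strlen.cc code: low_byte_never_zero is a hypothetical helper, it works on plain unsigned bounds rather than value_range/wide_int, and it assumes 0 <= lo <= hi.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: return true only if the low byte of every value
   in [lo, hi] is nonzero.  Assumes lo <= hi.  This mirrors the idea of
   comparing the bits above the char on both bounds; it is a sketch, not
   the GCC implementation.  */
static bool
low_byte_never_zero (unsigned int lo, unsigned int hi)
{
  unsigned int upper_mask = ~(unsigned int) UCHAR_MAX;

  /* If the bits above the low byte differ, the range crosses a multiple
     of UCHAR_MAX + 1, so some value in it truncates to 0.  */
  if ((lo & upper_mask) != (hi & upper_mask))
    return false;

  /* Within one such block the low bytes run from lo's low byte upwards,
     so 0 is included exactly when lo's low byte is 0.  */
  return (lo & UCHAR_MAX) != 0;
}
```

With this helper, the range [255, 257] is rejected because 256 truncates to 0, while [1, 255] is accepted.  A range such as [1, INT_MAX] is rejected too, even though every value in it is nonzero as an int.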

Re: [PATCH][wwwdocs] gcc-13: m2link branch

2022-06-21 Thread Gerald Pfeifer

Hi Gaius,

On Tue, 21 Jun 2022, Gaius Mulley wrote:
> here is a proposed entry describing a new branch m2link containing the
> new scaffold development for modula-2.  As the description says it is
> expected that this branch be short lived - terminating once significant
> regression tests pass,

this looks fine, thank you!

> +branch once a significant number of regression tests pass.  It is

1/2 nitpick, 1/2 language question (hence including Sandra):

Should this be "passes" since it relates to "number" (singular), or does 
the implied plural allow for the plural in the verb form in English? Or 
even require it?

(The patch is fine either way; more of a curious question.)

Gerald

Re: [PATCH RFA] ubsan: default to trap on unreachable at -O0 and -Og [PR104642]

2022-06-21 Thread Jakub Jelinek via Gcc-patches

On Mon, Jun 20, 2022 at 04:30:51PM -0400, Jason Merrill wrote:
I'd still prefer to see a separate -funreachable-traps.
The thing is that -fsanitize{,-recover,-trap}= are global options, not per
function (and only tweaked by the no_sanitize attribute), while something
that needs to depend on the per-function -O0/-Og setting is necessarily per
function.  As I understand it, the *.awk changes make -fsanitize= kind of per
function, but -fsanitize-{recover,trap}= remain global; that is going to be a
nightmare, especially with LTO, which saves/restores the per-function flags
and merges the global ones across TUs.
By separating the sanitizers (which would remain global with no_sanitize
overrides) from -funreachable-traps, which would be an Optimization option
(with the default set, if unset, in default_options_optimization or so)
saved/restored upon function changes, that issue is gone.

> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -5858,6 +5858,11 @@ builtin_decl_implicit (enum built_in_function fncode)
>return builtin_info[uns_fncode].decl;
>  }
>  
> +/* For BUILTIN_UNREACHABLE, use one of these instead of one of the above.  */
> +extern tree builtin_decl_unreachable ();
> +extern gcall *gimple_build_builtin_unreachable (location_t);
> +extern tree build_builtin_unreachable (location_t);

I think we generally try to declare functions in the header with same
basename as the source file in which they are defined.
So, the question is if builtin_decl_unreachable and build_builtin_unreachable
shouldn't be defined in tree.cc and declared in tree.h and
gimple_build_builtin_unreachable in gimple.cc and declared in gimple.h,
using a helper defined in ubsan.cc and declared in ubsan.h (your current
unreachable_1).

> +
>  /* Set explicit builtin function nodes and whether it is an implicit
> function.  */
>  
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> --- a/gcc/cgraphunit.cc
> +++ b/gcc/cgraphunit.cc
> --- a/gcc/cp/constexpr.cc
> +++ b/gcc/cp/constexpr.cc
> --- a/gcc/cp/cp-gimplify.cc
> +++ b/gcc/cp/cp-gimplify.cc
> --- a/gcc/gimple-fold.cc
> +++ b/gcc/gimple-fold.cc
> --- a/gcc/ipa-fnsummary.cc
> +++ b/gcc/ipa-fnsummary.cc
> --- a/gcc/ipa-prop.cc
> +++ b/gcc/ipa-prop.cc
> --- a/gcc/ipa.cc
> +++ b/gcc/ipa.cc

The above changes LGTM.
> if (dump_enabled_p ())
>   {
> diff --git a/gcc/opts.cc b/gcc/opts.cc
> index 959d48d173f..d92699a1bc9 100644
> --- a/gcc/opts.cc
> +++ b/gcc/opts.cc
> @@ -1122,6 +1122,17 @@ finish_options (struct gcc_options *opts, struct 
> gcc_options *opts_set,
>opts->x_flag_no_inline = 1;
>  }
>  
> +  /* At -O0 or -Og, turn __builtin_unreachable into a trap.  */
> +  if (!opts_set->x_flag_sanitize)
> +{
> +  if (!opts->x_optimize || opts->x_optimize_debug)
> + opts->x_flag_sanitize = SANITIZE_UNREACHABLE|SANITIZE_RETURN;
> +
> +  /* Change this without regard to optimization level so we don't need to
> +  deal with it in optc-save-gen.awk.  */
> +  opts->x_flag_sanitize_trap = SANITIZE_UNREACHABLE|SANITIZE_RETURN;
> +}
> +
>/* Pipelining of outer loops is only possible when general pipelining
>   capabilities are requested.  */
>if (!opts->x_flag_sel_sched_pipelining)

See above.

> --- a/gcc/sanopt.cc
> +++ b/gcc/sanopt.cc
> @@ -942,7 +942,15 @@ public:
>{}
>  
>/* opt_pass methods: */
> -  virtual bool gate (function *) { return flag_sanitize; }
> +  virtual bool gate (function *)
> +  {
> +/* SANITIZE_RETURN is handled in the front-end.  When trapping,
> +   SANITIZE_UNREACHABLE is handled by builtin_decl_unreachable.  */
> +unsigned int mask = SANITIZE_RETURN;

There are other sanitizers purely handled in the FEs, guess as a follow-up
we should look at which of them don't really need any sanopt handling.

> +if (flag_sanitize_trap & SANITIZE_UNREACHABLE)
> +  mask |= SANITIZE_UNREACHABLE;
> +return flag_sanitize & ~mask;
> +  }
> --- a/gcc/tree-cfg.cc
> +++ b/gcc/tree-cfg.cc
> --- a/gcc/tree-ssa-loop-ivcanon.cc
> +++ b/gcc/tree-ssa-loop-ivcanon.cc
> --- a/gcc/tree-ssa-sccvn.cc
> +++ b/gcc/tree-ssa-sccvn.cc
> --- a/gcc/tree.cc
> +++ b/gcc/tree.cc

LGTM.

> --- a/gcc/ubsan.cc
> +++ b/gcc/ubsan.cc
> @@ -638,27 +638,84 @@ ubsan_create_data (const char *name, int loccnt, const 
> location_t *ploc, ...)
>return var;
>  }
>  
> -/* Instrument the __builtin_unreachable call.  We just call the libubsan
> -   routine instead.  */
> +/* The built-in decl to use to mark code points believed to be unreachable.
> +   Typically __builtin_unreachable, but __builtin_trap if
> +   -fsanitize=unreachable -fsanitize-trap=unreachable.  If only
> +   -fsanitize=unreachable, we rely on sanopt to replace any calls with the
> +   appropriate ubsan function.  When building a call directly, use
> +   {gimple_},build_builtin_unreachable instead.  */
> +
> +tree
> +builtin_decl_unreachable ()
> +{
> +  enum built_in_function fncode = BUILT_IN_UNREACHABLE;
> +
> +  if (sanitize_flags_p (SANITIZE_UNREACHABLE))
> +{
> +  if

[PATCH][wwwdocs] gcc-13: m2link branch

2022-06-21 Thread Gaius Mulley via Gcc-patches



Hi,

here is a proposed entry describing a new branch m2link containing the
new scaffold development for modula-2.  As the description says, it is
expected that this branch will be short-lived, terminating once a significant
number of regression tests pass.

regards,
Gaius


diff --git a/htdocs/git.html b/htdocs/git.html
index f9acea54..5202363c 100644
--- a/htdocs/git.html
+++ b/htdocs/git.html
@@ -344,6 +344,18 @@ in Git.
 Patches should be
 prefixed with [modula-2] in the subject line.
 
+  m2link
+  This is a short term branch for the
+<a href="http://www.nongnu.org/gm2/homepage.html">GNU Modula-2</a>
+front end to GCC prior to its integration with the mainline.
+It contains the new scaffold and driver development.  The contents
+of this branch will be folded back onto the modula-2
+branch once a significant number of regression tests pass.  It is
+maintained by
+<a href="mailto:gaius.mul...@southwales.ac.uk">Gaius Mulley</a>.
+Patches should be
+prefixed with [m2link] in the subject line.
+
   coarray_native
   This branch is for implementation of a shared memory
 implementation of Fortran coarrays.  It is maintained by

Re: [PATCH] if-to-switch: Don't skip the first condition bb when find_conditions in if-to-switch [PR105740]

2022-06-21 Thread Xionghu Luo via Gcc-patches





On 2022/6/21 15:42, Martin Liška wrote:

On 6/21/22 09:33, Xi Ruoyao wrote:

On Tue, 2022-06-21 at 09:28 +0200, Martin Liška wrote:


Sorry, but I don't see to which email this replies to?
Can't find a patch.


https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596913.html


Hm, interesting. It means Thunderbird can't deal with the email format
and I can't see any attachment in the email.


It was sent by Outlook, which seems not to work well with Thunderbird...



Martin



The patch is an attachment:

https://gcc.gnu.org/pipermail/gcc-patches/attachments/20220621/dbb112d2/attachment-0001.obj

Re: [PATCH] aarch64: testsuite: symbol-range compile only

2022-06-21 Thread Richard Sandiford via Gcc-patches

Alexandre Oliva  writes:
> On some of our embedded aarch64 targets, RAM size is too small for
> this test to fit.  It doesn't look like this test requires linking,
> and if it does, the -tiny version may presumably get most of the
> coverage without going overboard in target system requirements.

Linking is valuable here because one of the likely failure modes
is an out-of-range relocation.

Could we instead have a new target selector for whether the memory
map includes xGB of RAM?  E.g. maybe it could be along similar lines
to check_effective_target_simulator, reading an optional board
property that gives the RAM size.

Thanks,
Richard

>
> Regstrapped on x86_64-linux-gnu, also tested with a cross to
> aarch64-rtems6.  Ok to install?
>
>
> for  gcc/testsuite/ChangeLog
>
>   * gcc.target/aarch64/symbol-range.c: Compile only.
> ---
>  gcc/testsuite/gcc.target/aarch64/symbol-range.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/symbol-range.c 
> b/gcc/testsuite/gcc.target/aarch64/symbol-range.c
> index d8e82fa1b2829..cc68c19ca85d9 100644
> --- a/gcc/testsuite/gcc.target/aarch64/symbol-range.c
> +++ b/gcc/testsuite/gcc.target/aarch64/symbol-range.c
> @@ -1,4 +1,4 @@
> -/* { dg-do link } */
> +/* { dg-do compile } */
>  /* { dg-options "-O3 -save-temps -mcmodel=small" } */
>  
>  char fixed_regs[0x8000];

Re: [PATCH] if-to-switch: Don't skip the first condition bb when find_conditions in if-to-switch [PR105740]

2022-06-21 Thread Xionghu Luo via Gcc-patches





On 2022/6/21 15:33, Richard Biener via Gcc-patches wrote:
> On Tue, Jun 21, 2022 at 5:06 AM xionghuluo(罗雄虎) via Gcc-patches
> wrote:
>>
>> Bootstrap and regression tested pass on x86_64-linux-gnu, OK for master?
>
> OK if you add a comment that an empty conditions_in_bbs indicates we are
> processing the first basic-block (that's not obvious to me).

Thanks.  Committed in r13-1184; I assume this doesn't need a backport?

Thanks,
Xionghu

Re: [PATCH] ifcvt: Don't introduce trapping or faulting reads in noce_try_sign_mask [PR106032]

2022-06-21 Thread Richard Biener via Gcc-patches

On Tue, 21 Jun 2022, Jakub Jelinek wrote:

> Hi!
> 
> noce_try_sign_mask as documented will optimize
>   if (c < 0)
> x = t;
>   else
> x = 0;
> into x = (c >> bitsm1) & t;
> The optimization is done if either t is unconditional
> (e.g. for
>   x = t;
>   if (c >= 0)
> x = 0;
> ) or if it is cheap.  We already check that t doesn't have side-effects,
> but if t is conditional, we need to punt also if it may trap or fault,
> as we make it unconditional.
> 
> I've briefly skimmed other noce_try* optimizations and didn't find one that
> would suffer from the same problem.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2022-06-21  Jakub Jelinek  
> 
>   PR rtl-optimization/106032
>   * ifcvt.cc (noce_try_sign_mask): Punt if !t_unconditional, and
>   t may_trap_or_fault_p, even if it is cheap.
> 
>   * gcc.c-torture/execute/pr106032.c: New test.
> 
> --- gcc/ifcvt.cc.jj   2022-04-26 10:11:51.951558338 +0200
> +++ gcc/ifcvt.cc  2022-06-20 17:44:18.638394338 +0200
> @@ -2833,18 +2833,19 @@ noce_try_sign_mask (struct noce_if_info
>  return FALSE;
>  
>/* This is only profitable if T is unconditionally executed/evaluated in 
> the
> - original insn sequence or T is cheap.  The former happens if B is the
> - non-zero (T) value and if INSN_B was taken from TEST_BB, or there was no
> - INSN_B which can happen for e.g. conditional stores to memory.  For the
> - cost computation use the block TEST_BB where the evaluation will end up
> - after the transformation.  */
> + original insn sequence or T is cheap and can't trap or fault.  The 
> former
> + happens if B is the non-zero (T) value and if INSN_B was taken from
> + TEST_BB, or there was no INSN_B which can happen for e.g. conditional
> + stores to memory.  For the cost computation use the block TEST_BB where
> + the evaluation will end up after the transformation.  */
>t_unconditional
>  = (t == if_info->b
> && (if_info->insn_b == NULL_RTX
>  || BLOCK_FOR_INSN (if_info->insn_b) == if_info->test_bb));
>if (!(t_unconditional
> - || (set_src_cost (t, mode, if_info->speed_p)
> - < COSTS_N_INSNS (2
> + || ((set_src_cost (t, mode, if_info->speed_p)
> +  < COSTS_N_INSNS (2))
> + && !may_trap_or_fault_p (t
>  return FALSE;
>  
>if (!noce_can_force_operand (t))
> --- gcc/testsuite/gcc.c-torture/execute/pr106032.c.jj 2022-06-20 
> 18:00:01.064352904 +0200
> +++ gcc/testsuite/gcc.c-torture/execute/pr106032.c2022-06-20 
> 17:59:41.714600349 +0200
> @@ -0,0 +1,21 @@
> +/* PR rtl-optimization/106032 */
> +
> +__attribute__((noipa)) int
> +foo (int x, int *y)
> +{
> +  int a = 0;
> +  if (x < 0)
> +a = *y;
> +  return a;  
> +}
> +
> +int
> +main ()
> +{
> +  int a = 42;
> +  if (foo (0, 0) != 0 || foo (1, 0) != 0)
> +__builtin_abort ();
> +  if (foo (-1, &a) != 42 || foo (-42, &a) != 42)
> +__builtin_abort ();
> +  return 0;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

Re: [PATCH] expand: Fix up expand_cond_expr_using_cmove [PR106030]

2022-06-21 Thread Richard Biener via Gcc-patches

On Tue, 21 Jun 2022, Jakub Jelinek wrote:

> Hi!
> 
> If expand_cond_expr_using_cmove can't find a cmove optab for a particular
> mode, it tries to promote the mode and perform the cmove in the promoted
> mode.
> 
> The testcase in the patch ICEs on arm because in that case we pass temp which
> has the promoted mode (SImode) as target to expand_operands where the
> operands have the non-promoted mode (QImode).
> Later on the function uses paradoxical subregs:
>   if (GET_MODE (op1) != mode)
> op1 = gen_lowpart (mode, op1);
> 
>   if (GET_MODE (op2) != mode)
> op2 = gen_lowpart (mode, op2);
> to change the operand modes.
> 
> The following patch fixes it by passing NULL_RTX as target if it has
> promoted mode.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux and tested with cross
> to arm on the testcase, ok for trunk?

OK.

Thanks,
Richard.

> 2022-06-21  Jakub Jelinek  
> 
>   PR middle-end/106030
>   * expr.cc (expand_cond_expr_using_cmove): Pass NULL_RTX instead of
>   temp to expand_operands if mode has been promoted.
> 
>   * gcc.c-torture/compile/pr106030.c: New test.
> 
> --- gcc/expr.cc.jj2022-06-10 21:19:12.087803729 +0200
> +++ gcc/expr.cc   2022-06-20 14:10:07.941681072 +0200
> @@ -8832,7 +8832,8 @@ expand_cond_expr_using_cmove (tree treeo
>expanding_cond_expr_using_cmove = true;
>start_sequence ();
>expand_operands (treeop1, treeop2,
> -temp, &op1, &op2, EXPAND_NORMAL);
> +mode == orig_mode ? temp : NULL_RTX, &op1, &op2,
> +EXPAND_NORMAL);
>  
>if (TREE_CODE (treeop0) == SSA_NAME
>&& (srcstmt = get_def_for_expr_class (treeop0, tcc_comparison)))
> --- gcc/testsuite/gcc.c-torture/compile/pr106030.c.jj 2022-06-20 
> 14:49:37.618142233 +0200
> +++ gcc/testsuite/gcc.c-torture/compile/pr106030.c2022-06-20 
> 14:49:19.016382117 +0200
> @@ -0,0 +1,16 @@
> +/* PR middle-end/106030 */
> +
> +int a, b, c;
> +
> +char
> +foo (int x, int y)
> +{
> +  return x * y;
> +}
> +
> +void
> +bar (void)
> +{
> +  char d = (foo <= b) * a;
> +  c = foo (2 != bar, d);
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 

Re: [PATCH 2/3] lto-plugin: make claim_file_handler thread-safe

2022-06-21 Thread Martin Liška

On 6/21/22 09:56, Richard Biener wrote:
> On Mon, Jun 20, 2022 at 12:20 PM Martin Liška  wrote:
>>
>> On 6/20/22 11:32, Richard Biener wrote:
>>> On Thu, Jun 16, 2022 at 9:01 AM Martin Liška  wrote:

 lto-plugin/ChangeLog:

 * lto-plugin.c (plugin_lock): New lock.
 (claim_file_handler): Use mutex for critical section.
 (onload): Initialize mutex.
 ---
  lto-plugin/lto-plugin.c | 16 +++-
  1 file changed, 15 insertions(+), 1 deletion(-)

 diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c
 index 00b760636dc..13118c4983c 100644
 --- a/lto-plugin/lto-plugin.c
 +++ b/lto-plugin/lto-plugin.c
 @@ -55,6 +55,7 @@ along with this program; see the file COPYING3.  If not 
 see
  #include 
  #include 
  #include 
 +#include <pthread.h>
>>>
>>> Not sure if we support any non-pthread target for building the LTO
>>> plugin, but it
>>> seems we have
>>>
>>>   # Among non-ELF, only Windows platforms support the lto-plugin so far.
>>>   # Build it unless LTO was explicitly disabled.
>>>   case $target in
>>> *-cygwin* | *-mingw*) build_lto_plugin=$enable_lto ;;
>>>
>>> which suggests that at least build validating the above with --enable-lto
>>
>> Verified that it's fine.
>>
>>>
>>> IIRC we have gthr-*.h in libgcc/, not sure if that's usable in a
>>> host linker plugin.
>>>
  #ifdef HAVE_SYS_WAIT_H
  #include <sys/wait.h>
  #endif
 @@ -157,6 +158,9 @@ enum symbol_style
ss_uscore,   /* Underscore prefix all symbols.  */
  };

 +/* Plug-in mutex.  */
 +static pthread_mutex_t plugin_lock;
 +
  static char *arguments_file_name;
  static ld_plugin_register_claim_file register_claim_file;
  static ld_plugin_register_all_symbols_read register_all_symbols_read;
 @@ -1262,15 +1266,18 @@ claim_file_handler (const struct 
 ld_plugin_input_file *file, int *claimed)
   lto_file.symtab.syms);
check (status == LDPS_OK, LDPL_FATAL, "could not add symbols");

 +  pthread_mutex_lock (&plugin_lock);
num_claimed_files++;
claimed_files =
 xrealloc (claimed_files,
   num_claimed_files * sizeof (struct plugin_file_info));
claimed_files[num_claimed_files - 1] = lto_file;
 +  pthread_mutex_unlock (&plugin_lock);

*claimed = 1;
  }

 +  pthread_mutex_lock (&plugin_lock);
if (offload_files == NULL)
  {
/* Add dummy item to the start of the list.  */
 @@ -1333,11 +1340,12 @@ claim_file_handler (const struct 
 ld_plugin_input_file *file, int *claimed)
 offload_files_last_lto = ofld;
num_offload_files++;
  }
 +  pthread_mutex_unlock (&plugin_lock);

goto cleanup;

   err:
 -  non_claimed_files++;
 +  __atomic_fetch_add (&non_claimed_files, 1, __ATOMIC_RELAXED);
>>>
>>> is it worth "optimizing" this with yet another need for target specific 
>>> support
>>> (just use pthread_mutex here as well?)
>>
>> Sure.
>>
>> May I install the patch with the change?
> 
> Can you at least add a configure check for pthread.h and maybe disable
> locking when not found or erroring out?  I figure we have GCC_AC_THREAD_HEADER

All right, let's error out then.

> for the gthr.h stuff using $target_thread_file (aka --enable-threads=XYZ),
> but as said that's for the target and I don't see any host uses.  We might 
> also
> add an explicit list of hosts (*-linux*?) where we enable thread support for
> lto-plugin, providing opt-in (so you'd have to wrap the mutex taking or
> if-def it out).
> 
> I think you also need to link lto-plugin with -pthread, no?

Yep.

Please see the updated patch.

> On linux
> it might work omitting that but I'm not sure other libc have serial pthread
> stubs in their libc.  BFD ld definitely doesn't link against pthread so
> dlopening lto-plugin will fail (also not all libc might like
> initializing threads
> from a dlopen _init).

What initializing threads do you mean?

Martin

> 
> Richard.
> 
>> Cheers,
>> Martin
>>
>>>
free (lto_file.name);

   cleanup:
 @@ -1415,6 +1423,12 @@ onload (struct ld_plugin_tv *tv)
struct ld_plugin_tv *p;
enum ld_plugin_status status;

 +  if (pthread_mutex_init (&plugin_lock, NULL) != 0)
 +{
 +  fprintf (stderr, "mutex init failed\n");
 +  abort ();
 +}
 +
p = tv;
while (p->tv_tag)
  {
 --
 2.36.1


From f1e2f84dbfdac5c7aee7036e78841cb33c3bad50 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Mon, 16 May 2022 14:18:41 +0200
Subject: [PATCH] lto-plugin: make claim_file_handler thread-safe

lto-plugin/ChangeLog:

	* lto-plugin.c (plugin_lock): New lock.
	(claim_file_handler): Use mutex for critical section.
	(onload): Initialize mutex.
---
 lto-plugin/config.h.in  |  3
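
The locking scheme discussed in this thread can be sketched as a minimal standalone example: a growable array whose append is wrapped in a pthread mutex.  The names file_info, list_lock and record_file are hypothetical stand-ins, not the lto-plugin.c identifiers.  The static initializer is used here for brevity, whereas the quoted patch calls pthread_mutex_init in onload.  Depending on the libc, linking may need -pthread.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the plugin's per-file record.  */
struct file_info
{
  int id;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct file_info *files;
static size_t num_files;

/* Append INFO to the shared array.  The realloc and the counter update
   form one critical section, so concurrent callers cannot interleave
   them.  Returns 0 on success, -1 if the allocation fails.  */
static int
record_file (struct file_info info)
{
  int ret = -1;

  pthread_mutex_lock (&list_lock);
  struct file_info *grown
    = realloc (files, (num_files + 1) * sizeof *files);
  if (grown != NULL)
    {
      files = grown;
      files[num_files++] = info;
      ret = 0;
    }
  pthread_mutex_unlock (&list_lock);

  return ret;
}
```

Keeping the counter increment inside the same critical section as the realloc is the design point: a lock around only one of them would still let two threads write to the same slot.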

[PATCH] ifcvt: Don't introduce trapping or faulting reads in noce_try_sign_mask [PR106032]

2022-06-21 Thread Jakub Jelinek via Gcc-patches

Hi!

noce_try_sign_mask as documented will optimize
  if (c < 0)
x = t;
  else
x = 0;
into x = (c >> bitsm1) & t;
The optimization is done if either t is unconditional
(e.g. for
  x = t;
  if (c >= 0)
x = 0;
) or if it is cheap.  We already check that t doesn't have side-effects,
but if t is conditional, we need to punt also if it may trap or fault,
as we make it unconditional.

I've briefly skimmed other noce_try* optimizations and didn't find one that
would suffer from the same problem.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-06-21  Jakub Jelinek  

PR rtl-optimization/106032
* ifcvt.cc (noce_try_sign_mask): Punt if !t_unconditional, and
t may_trap_or_fault_p, even if it is cheap.

* gcc.c-torture/execute/pr106032.c: New test.

--- gcc/ifcvt.cc.jj 2022-04-26 10:11:51.951558338 +0200
+++ gcc/ifcvt.cc2022-06-20 17:44:18.638394338 +0200
@@ -2833,18 +2833,19 @@ noce_try_sign_mask (struct noce_if_info
 return FALSE;
 
   /* This is only profitable if T is unconditionally executed/evaluated in the
- original insn sequence or T is cheap.  The former happens if B is the
- non-zero (T) value and if INSN_B was taken from TEST_BB, or there was no
- INSN_B which can happen for e.g. conditional stores to memory.  For the
- cost computation use the block TEST_BB where the evaluation will end up
- after the transformation.  */
+ original insn sequence or T is cheap and can't trap or fault.  The former
+ happens if B is the non-zero (T) value and if INSN_B was taken from
+ TEST_BB, or there was no INSN_B which can happen for e.g. conditional
+ stores to memory.  For the cost computation use the block TEST_BB where
+ the evaluation will end up after the transformation.  */
   t_unconditional
 = (t == if_info->b
&& (if_info->insn_b == NULL_RTX
   || BLOCK_FOR_INSN (if_info->insn_b) == if_info->test_bb));
   if (!(t_unconditional
-   || (set_src_cost (t, mode, if_info->speed_p)
-   < COSTS_N_INSNS (2
+   || ((set_src_cost (t, mode, if_info->speed_p)
+< COSTS_N_INSNS (2))
+   && !may_trap_or_fault_p (t
 return FALSE;
 
   if (!noce_can_force_operand (t))
--- gcc/testsuite/gcc.c-torture/execute/pr106032.c.jj   2022-06-20 
18:00:01.064352904 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr106032.c  2022-06-20 
17:59:41.714600349 +0200
@@ -0,0 +1,21 @@
+/* PR rtl-optimization/106032 */
+
+__attribute__((noipa)) int
+foo (int x, int *y)
+{
+  int a = 0;
+  if (x < 0)
+a = *y;
+  return a;  
+}
+
+int
+main ()
+{
+  int a = 42;
+  if (foo (0, 0) != 0 || foo (1, 0) != 0)
+__builtin_abort ();
+  if (foo (-1, &a) != 42 || foo (-42, &a) != 42)
+__builtin_abort ();
+  return 0;
+}

Jakub
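
The transformation described above can be written at source level in C as a rough sketch.  The function names are hypothetical.  The mask is built with an unsigned shift so the example does not rely on the implementation-defined behaviour of right-shifting a negative int, whereas the RTL form uses an arithmetic shift, c >> bitsm1.

```c
#include <limits.h>

/* Branch-free form of "c < 0 ? t : 0".  The mask is all ones when c is
   negative and zero otherwise.  Note that T is always evaluated.  */
static int
sign_mask_select (int c, int t)
{
  unsigned int sign = (unsigned int) c >> (sizeof (int) * CHAR_BIT - 1);
  int mask = -(int) sign;	/* 0 or -1.  */
  return mask & t;
}

/* Same shape as foo in the new test: the load from Y is guarded by the
   condition, so Y may be a null pointer whenever X >= 0.  Rewriting
   this as "mask & *y" would make the load unconditional and could
   fault.  */
static int
guarded_load (int x, const int *y)
{
  int a = 0;
  if (x < 0)
    a = *y;
  return a;
}
```

The first function is safe because its operand is a plain value.  The second shows the case the patch punts on: the operand is a memory read that is only valid on one arm of the branch.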

Re: [PATCH] Introduce -nolibstdc++ option

2022-06-21 Thread Fangrui Song via Gcc-patches


On 2022-06-21, Richard Biener wrote:

On Tue, Jun 21, 2022 at 9:53 AM Fangrui Song  wrote:


On Tue, Jun 21, 2022 at 1:43 AM Richard Biener via Gcc-patches
 wrote:
>
> On Tue, Jun 21, 2022 at 7:56 AM Alexandre Oliva via Gcc-patches
>  wrote:
> >
> >
> > Using g++ to link without libstdc++, as in g++.dg/abi/pure-virtual1.C,
> > is error prone, because there's no way to tell g++ to drop libstdc++
> > without also dropping libc and any other libraries that the target
> > implicitly links in.
> >
> > This has often led to the need for manual adjustments to this
> > testcase.
> >
> > I figured adding support for -nolibstdc++, even though redundant,
> > makes some sense.  One could presumably use gcc rather than g++ for
> > linking, for the same effect, but sometimes changing the link command
> > is harder than adding an option, as in our testsuite.
> >
> > Regstrapped on x86_64-linux-gnu, also tested with a cross to
> > aarch64-rtems6.  Ok to install?
>
> OK in case nobody objects in 24h.
>
> Richard.

Is this similar to clang -nostdlib++ ?
When libstdc++ is selected, clang -nostdlib++ removes -lstdc++.


Probably.  Note that we have -static-libstdc++ already so
-nolibstdc++ matches that.  We also have -nolibc, not -noclib.

Richard.


I think the relation between -static-foo and -nofoo is not that large.
-nostdlib does not have a corresponding -static-stdlib.

Note that gcc has supported -stdlib=libc++ since 2020-12, though the
usage is a bit tricky. Having a C++ standard library agnostic name
helps libc++:)

For -lc, clang has -nolibc.


> >
> > for  gcc/ChangeLog
> >
> > * common.opt (nolibstdc++): New.
> > * doc/invoke.texi (-nolibstdc++): Document it.
> >
> > for  gcc/cp/ChangeLog
> >
> > * g++spec.cc (lang_specific_driver): Implement -nolibstdc++.
> >
> > for  gcc/testsuite/ChangeLog
> >
> > * g++.dg/abi/pure-virtual1.C: Use -nolibstdc++.
> > ---
> >  gcc/common.opt   |3 +++
> >  gcc/cp/g++spec.cc|1 +
> >  gcc/doc/invoke.texi  |6 +-
> >  gcc/testsuite/g++.dg/abi/pure-virtual1.C |2 +-
> >  4 files changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/gcc/common.opt b/gcc/common.opt
> > index 32917aafcaec1..e00c6fc2fb098 100644
> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -3456,6 +3456,9 @@ Driver
> >  nolibc
> >  Driver
> >
> > +nolibstdc++
> > +Driver
> > +
> >  nostdlib
> >  Driver
> >
> > diff --git a/gcc/cp/g++spec.cc b/gcc/cp/g++spec.cc
> > index 8174d652776b1..539e6ca089d85 100644
> > --- a/gcc/cp/g++spec.cc
> > +++ b/gcc/cp/g++spec.cc
> > @@ -160,6 +160,7 @@ lang_specific_driver (struct cl_decoded_option **in_decoded_options,
> > {
> > case OPT_nostdlib:
> > case OPT_nodefaultlibs:
> > +   case OPT_nolibstdc__:
> >   library = -1;
> >   break;
> >
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index 50f57877477bc..469b6d97e0dfa 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -652,7 +652,7 @@ Objective-C and Objective-C++ Dialects}.
> >  @item Linker Options
> >  @xref{Link Options,,Options for Linking}.
> >  @gccoptlist{@var{object-file-name}  -fuse-ld=@var{linker}  -l@var{library} @gol
> > --nostartfiles  -nodefaultlibs  -nolibc  -nostdlib @gol
> > +-nostartfiles  -nodefaultlibs  -nolibc  -nolibstdc++  -nostdlib @gol
> >  -e @var{entry}  --entry=@var{entry} @gol
> >  -pie  -pthread  -r  -rdynamic @gol
> >  -s  -static  -static-pie  -static-libgcc  -static-libstdc++ @gol
> > @@ -16787,6 +16787,10 @@ absence of a C library is assumed, for example @option{-lpthread} or
> >  @option{-lm} in some configurations.  This is intended for bare-board
> >  targets when there is indeed no C library available.
> >
> > +@item -nolibstdc++
> > +@opindex nolibstdc++
> > +Do not link with standard C++ libraries implicitly.
> > +
> >  @item -nostdlib
> >  @opindex nostdlib
> >  Do not use the standard system startup files or libraries when linking.
> > diff --git a/gcc/testsuite/g++.dg/abi/pure-virtual1.C b/gcc/testsuite/g++.dg/abi/pure-virtual1.C
> > index 538e2cb097a0d..889c33e4952f4 100644
> > --- a/gcc/testsuite/g++.dg/abi/pure-virtual1.C
> > +++ b/gcc/testsuite/g++.dg/abi/pure-virtual1.C
> > @@ -1,7 +1,7 @@
> >  // Test that we don't need libsupc++ just for __cxa_pure_virtual.
> >  // { dg-do link }
> >  // { dg-require-weak }
> > -// { dg-additional-options "-fno-rtti -nodefaultlibs -lc" }
> > +// { dg-additional-options "-fno-rtti -nolibstdc++" }
> >  // { dg-additional-options "-Wl,-undefined,dynamic_lookup" { target *-*-darwin* } }
> >  // { dg-xfail-if "AIX weak" { powerpc-ibm-aix* } }
> >
> >
> > --
> > Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
> >    Free Software Activist                   GNU Toolchain Engineer
> > Disinformation flourishes because many people care deeply about injustice
> > but very few check the facts.  Ask me about

[PATCH] expand: Fix up expand_cond_expr_using_cmove [PR106030]

2022-06-21 Thread Jakub Jelinek via Gcc-patches

Hi!

If expand_cond_expr_using_cmove can't find a cmove optab for a particular
mode, it tries to promote the mode and perform the cmove in the promoted
mode.

The testcase in the patch ICEs on arm because in that case we pass temp which
has the promoted mode (SImode) as target to expand_operands where the
operands have the non-promoted mode (QImode).
Later on the function uses paradoxical subregs:
  if (GET_MODE (op1) != mode)
    op1 = gen_lowpart (mode, op1);

  if (GET_MODE (op2) != mode)
    op2 = gen_lowpart (mode, op2);
to change the operand modes.

The following patch fixes it by passing NULL_RTX as target if it has
promoted mode.

Bootstrapped/regtested on x86_64-linux and i686-linux and tested with cross
to arm on the testcase, ok for trunk?

2022-06-21  Jakub Jelinek  

PR middle-end/106030
* expr.cc (expand_cond_expr_using_cmove): Pass NULL_RTX instead of
temp to expand_operands if mode has been promoted.

* gcc.c-torture/compile/pr106030.c: New test.

--- gcc/expr.cc.jj  2022-06-10 21:19:12.087803729 +0200
+++ gcc/expr.cc 2022-06-20 14:10:07.941681072 +0200
@@ -8832,7 +8832,8 @@ expand_cond_expr_using_cmove (tree treeo
   expanding_cond_expr_using_cmove = true;
   start_sequence ();
   expand_operands (treeop1, treeop2,
-  temp, &op1, &op2, EXPAND_NORMAL);
+  mode == orig_mode ? temp : NULL_RTX, &op1, &op2,
+  EXPAND_NORMAL);
 
   if (TREE_CODE (treeop0) == SSA_NAME
   && (srcstmt = get_def_for_expr_class (treeop0, tcc_comparison)))
--- gcc/testsuite/gcc.c-torture/compile/pr106030.c.jj   2022-06-20 14:49:37.618142233 +0200
+++ gcc/testsuite/gcc.c-torture/compile/pr106030.c  2022-06-20 14:49:19.016382117 +0200
@@ -0,0 +1,16 @@
+/* PR middle-end/106030 */
+
+int a, b, c;
+
+char
+foo (int x, int y)
+{
+  return x * y;
+}
+
+void
+bar (void)
+{
+  char d = (foo <= b) * a;
+  c = foo (2 != bar, d);
+}

Jakub

Re: [PATCH] Introduce -nolibstdc++ option

2022-06-21 Thread Richard Biener via Gcc-patches

On Tue, Jun 21, 2022 at 9:53 AM Fangrui Song  wrote:
>
> On Tue, Jun 21, 2022 at 1:43 AM Richard Biener via Gcc-patches
>  wrote:
> >
> > On Tue, Jun 21, 2022 at 7:56 AM Alexandre Oliva via Gcc-patches
> >  wrote:
> > >
> > >
> > > Using g++ to link without libstdc++, as in g++.dg/abi/pure-virtual1.C,
> > > is error prone, because there's no way to tell g++ to drop libstdc++
> > > without also dropping libc and any other libraries that the target
> > > implicitly links in.
> > >
> > > This has often led to the need for manual adjustments to this
> > > testcase.
> > >
> > > I figured adding support for -nolibstdc++, even though redundant,
> > > makes some sense.  One could presumably use gcc rather than g++ for
> > > linking, for the same effect, but sometimes changing the link command
> > > is harder than adding an option, as in our testsuite.
> > >
> > > Regstrapped on x86_64-linux-gnu, also tested with a cross to
> > > aarch64-rtems6.  Ok to install?
> >
> > OK in case nobody objects in 24h.
> >
> > Richard.
>
> Is this similar to clang -nostdlib++ ?
> When libstdc++ is selected, clang -nostdlib++ removes -lstdc++.

Probably.  Note that we have -static-libstdc++ already so
-nolibstdc++ matches that.  We also have -nolibc, not -noclib.

Richard.


Re: [PATCH v4] tree-optimization/94899: Remove "+ 0x80000000" in int comparisons

2022-06-21 Thread Richard Biener via Gcc-patches

On Mon, Jun 20, 2022 at 4:23 PM Arjun Shankar  wrote:
>
> Expressions of the form "X + CST < Y + CST" where:
>
> * CST is an unsigned integer constant with only the MSB set, and
> * X and Y's types have integer conversion ranks <= CST's
>
> can be simplified to "(signed) X < (signed) Y".
>
> This is because, assuming 32-bit signed numbers,
> (unsigned) INT_MIN + 0x80000000 is 0, and
> (unsigned) INT_MAX + 0x80000000 is UINT_MAX.
>
> i.e. the result increases monotonically with signed input.
>
> This means:
> ((signed) X < (signed) Y) iff (X + 0x80000000 < Y + 0x80000000)
>
> gcc/
> * match.pd (X + C < Y + C -> (signed) X < (signed) Y, if C is
> 0x80000000): New simplification.
> gcc/testsuite/
> * gcc.dg/pr94899.c: New test.
> ---
>  gcc/match.pd   | 13 +
>  gcc/testsuite/gcc.dg/pr94899.c | 49 ++
>  2 files changed, 62 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/pr94899.c
> ---
> v3: https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596785.html
>
> Notes on v4, based on Richard and Jakub's review comments:
>
> Richard wrote:
>
> > It might be possible to test for zero + or - operations instead?
>
> OK. That seems more fool-proof. I've made the change.
>
> Jakub wrote:
>
> > Can't one just omit the INTEGER_CST part on the second @0?
>
> I hadn't thought of that. Done!
>
> > As a follow-up, it might be useful to make it work for vector integral types
> > too,
> > typedef unsigned V __attribute__((vector_size (4 * sizeof (int))));
> > #define M __INT_MAX__ + 1U
> > V foo (V x, V y)
> > {
> >   return x + (V) { M, M, M, M } < y + (V) { M, M, M, M };
> > }
> > using uniform_integer_cst_p.
>
> OK. This syntax is unfamiliar to me. I'll read a bit and then try to work on
> a follow-up. Thanks!

This variant is OK.  Let's do the vector case as followup.

Richard.
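
For readers following along, the equivalence the patch relies on can be
checked with a small stand-alone C sketch.  This is only an illustration
of the arithmetic, not code from the patch: the function names are made
up here, and it assumes 32-bit int and two's complement int32_t.

```c
#include <assert.h>
#include <stdint.h>

/* The comparison as written in the source: bias both sides by
   0x80000000 in (wrapping) unsigned arithmetic, then compare.  */
int
biased_less (int32_t x, int32_t y)
{
  const uint32_t msb = UINT32_C (0x80000000);
  uint32_t bx = (uint32_t) x + msb;
  uint32_t by = (uint32_t) y + msb;
  return bx < by;
}

/* The form the new match.pd rule is meant to produce.  */
int
signed_less (int32_t x, int32_t y)
{
  return x < y;
}

/* Return 1 if both forms agree for every pair of boundary samples.  */
int
forms_agree (void)
{
  static const int32_t samples[] = {
    INT32_MIN, INT32_MIN + 1, -1, 0, 1, 42, INT32_MAX - 1, INT32_MAX
  };
  const int n = sizeof samples / sizeof samples[0];

  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      if (biased_less (samples[i], samples[j])
          != signed_less (samples[i], samples[j]))
        return 0;
  return 1;
}
```

Adding the sign bit maps INT_MIN to 0 and INT_MAX to UINT_MAX, so the
unsigned order of the biased values matches the signed order of the
originals; forms_agree () only spot-checks that on boundary values.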

> diff --git a/gcc/match.pd b/gcc/match.pd
> index a63b649841b..4a570894b2e 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -2089,6 +2089,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
> && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
> (op @0 @1))))
> +
> +/* As a special case, X + C < Y + C is the same as (signed) X < (signed) Y
> +   when C is an unsigned integer constant with only the MSB set, and X and
> +   Y have types of equal or lower integer conversion rank than C's.  */
> +(for op (lt le ge gt)
> + (simplify
> +  (op (plus @1 INTEGER_CST@0) (plus @2 @0))
> +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +   && TYPE_UNSIGNED (TREE_TYPE (@0))
> +   && wi::only_sign_bit_p (wi::to_wide (@0)))
> +   (with { tree stype = signed_type_for (TREE_TYPE (@0)); }
> +(op (convert:stype @1) (convert:stype @2))))))
> +
>  /* For equality and subtraction, this is also true with wrapping overflow.  */
>  (for op (eq ne minus)
>   (simplify
> diff --git a/gcc/testsuite/gcc.dg/pr94899.c b/gcc/testsuite/gcc.dg/pr94899.c
> new file mode 100644
> index 00000000000..2fc7009a2e7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr94899.c
> @@ -0,0 +1,49 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +
> +typedef __INT16_TYPE__ int16_t;
> +typedef __INT32_TYPE__ int32_t;
> +typedef __UINT16_TYPE__ uint16_t;
> +typedef __UINT32_TYPE__ uint32_t;
> +
> +#define MAGIC (~ (uint32_t) 0 / 2 + 1)
> +
> +int
> +f_i16_i16 (int16_t x, int16_t y)
> +{
> +  return x + MAGIC < y + MAGIC;
> +}
> +
> +int
> +f_i16_i32 (int16_t x, int32_t y)
> +{
> +  return x + MAGIC < y + MAGIC;
> +}
> +
> +int
> +f_i32_i32 (int32_t x, int32_t y)
> +{
> +  return x + MAGIC < y + MAGIC;
> +}
> +
> +int
> +f_u32_i32 (uint32_t x, int32_t y)
> +{
> +  return x + MAGIC < y + MAGIC;
> +}
> +
> +int
> +f_u32_u32 (uint32_t x, uint32_t y)
> +{
> +  return x + MAGIC < y + MAGIC;
> +}
> +
> +int
> +f_i32_i32_sub (int32_t x, int32_t y)
> +{
> +  return x - MAGIC < y - MAGIC;
> +}
> +
> +/* The addition/subtraction of constants should be optimized away.  */
> +/* { dg-final { scan-tree-dump-not "\\+" "optimized"} } */
> +/* { dg-final { scan-tree-dump-not "\\-" "optimized"} } */
> --
> 2.35.3
>

Re: [PATCH 2/3] lto-plugin: make claim_file_handler thread-safe

2022-06-21 Thread Richard Biener via Gcc-patches

On Mon, Jun 20, 2022 at 12:20 PM Martin Liška  wrote:
>
> On 6/20/22 11:32, Richard Biener wrote:
> > On Thu, Jun 16, 2022 at 9:01 AM Martin Liška  wrote:
> >>
> >> lto-plugin/ChangeLog:
> >>
> >> * lto-plugin.c (plugin_lock): New lock.
> >> (claim_file_handler): Use mutex for critical section.
> >> (onload): Initialize mutex.
> >> ---
> >>  lto-plugin/lto-plugin.c | 16 +++-
> >>  1 file changed, 15 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c
> >> index 00b760636dc..13118c4983c 100644
> >> --- a/lto-plugin/lto-plugin.c
> >> +++ b/lto-plugin/lto-plugin.c
> >> @@ -55,6 +55,7 @@ along with this program; see the file COPYING3.  If not see
> >>  #include 
> >>  #include 
> >>  #include 
> >> +#include <pthread.h>
> >
> > Not sure if we support any non-pthread target for building the LTO
> > plugin, but it seems we have
> >
> >   # Among non-ELF, only Windows platforms support the lto-plugin so far.
> >   # Build it unless LTO was explicitly disabled.
> >   case $target in
> > *-cygwin* | *-mingw*) build_lto_plugin=$enable_lto ;;
> >
> > which suggests that at least build validating the above with --enable-lto
>
> Verified that it's fine.
>
> >
> > IIRC we have gthr-*.h in libgcc/, not sure if that's usable in a
> > host linker plugin.
> >
> >>  #ifdef HAVE_SYS_WAIT_H
> >>  #include <sys/wait.h>
> >>  #endif
> >> @@ -157,6 +158,9 @@ enum symbol_style
> >>ss_uscore,   /* Underscore prefix all symbols.  */
> >>  };
> >>
> >> +/* Plug-in mutex.  */
> >> +static pthread_mutex_t plugin_lock;
> >> +
> >>  static char *arguments_file_name;
> >>  static ld_plugin_register_claim_file register_claim_file;
> >>  static ld_plugin_register_all_symbols_read register_all_symbols_read;
> >> @@ -1262,15 +1266,18 @@ claim_file_handler (const struct ld_plugin_input_file *file, int *claimed)
> >>   lto_file.symtab.syms);
> >>check (status == LDPS_OK, LDPL_FATAL, "could not add symbols");
> >>
> >> +  pthread_mutex_lock (&plugin_lock);
> >>num_claimed_files++;
> >>claimed_files =
> >> xrealloc (claimed_files,
> >>   num_claimed_files * sizeof (struct plugin_file_info));
> >>claimed_files[num_claimed_files - 1] = lto_file;
> >> +  pthread_mutex_unlock (&plugin_lock);
> >>
> >>*claimed = 1;
> >>  }
> >>
> >> +  pthread_mutex_lock (&plugin_lock);
> >>if (offload_files == NULL)
> >>  {
> >>/* Add dummy item to the start of the list.  */
> >> @@ -1333,11 +1340,12 @@ claim_file_handler (const struct ld_plugin_input_file *file, int *claimed)
> >> offload_files_last_lto = ofld;
> >>num_offload_files++;
> >>  }
> >> +  pthread_mutex_unlock (&plugin_lock);
> >>
> >>goto cleanup;
> >>
> >>   err:
> >> -  non_claimed_files++;
> >> +  __atomic_fetch_add (&non_claimed_files, 1, __ATOMIC_RELAXED);
> >
> > is it worth "optimizing" this with yet another need for target specific
> > support (just use pthread_mutex here as well?)
>
> Sure.
>
> May I install the patch with the change?

Can you at least add a configure check for pthread.h and maybe disable
locking when not found or erroring out?  I figure we have GCC_AC_THREAD_HEADER
for the gthr.h stuff using $target_thread_file (aka --enable-threads=XYZ),
but as said that's for the target and I don't see any host uses.  We might also
add an explicit list of hosts (*-linux*?) where we enable thread support for
lto-plugin, providing opt-in (so you'd have to wrap the mutex taking or
if-def it out).

I think you also need to link lto-plugin with -pthread, no?  On linux
it might work omitting that but I'm not sure other libc have serial pthread
stubs in their libc.  BFD ld definitely doesn't link against pthread so
dlopening lto-plugin will fail (also not all libc might like
initializing threads from a dlopen _init).

Richard.
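
A rough sketch of the opt-in locking suggested above, for illustration
only.  It assumes a configure check that defines HAVE_PTHREAD_H (the
macro AC_CHECK_HEADERS would produce for pthread.h); the LOCK_SECTION
and UNLOCK_SECTION macros and the record_claimed_file helper are
hypothetical names, not part of lto-plugin.c.

```c
#include <stdlib.h>

#if defined (HAVE_PTHREAD_H)
# include <pthread.h>
/* Static initializer, so no explicit init call is needed in onload.  */
static pthread_mutex_t plugin_lock = PTHREAD_MUTEX_INITIALIZER;
# define LOCK_SECTION   pthread_mutex_lock (&plugin_lock)
# define UNLOCK_SECTION pthread_mutex_unlock (&plugin_lock)
#else
/* No pthread.h on this host: fall back to the serial behaviour.  */
# define LOCK_SECTION   ((void) 0)
# define UNLOCK_SECTION ((void) 0)
#endif

struct file_info
{
  const char *name;
};

static struct file_info *claimed_files;
static unsigned int num_claimed_files;

/* Append NAME to the claimed-files array inside the critical section.
   Returns 1 on success, 0 if the array could not be grown.  */
int
record_claimed_file (const char *name)
{
  int ok = 0;

  LOCK_SECTION;
  struct file_info *grown
    = realloc (claimed_files, (num_claimed_files + 1) * sizeof *grown);
  if (grown != NULL)
    {
      claimed_files = grown;
      claimed_files[num_claimed_files++].name = name;
      ok = 1;
    }
  UNLOCK_SECTION;
  return ok;
}

/* Number of files recorded so far.  */
unsigned int
claimed_count (void)
{
  unsigned int n;

  LOCK_SECTION;
  n = num_claimed_files;
  UNLOCK_SECTION;
  return n;
}
```

With HAVE_PTHREAD_H undefined the macros expand to nothing, so hosts
without pthreads keep the old single-threaded code path.  Whether the
plugin then also needs -pthread at link time is the open question raised
above and is not addressed by this sketch.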

> Cheers,
> Martin
>
> >
> >>free (lto_file.name);
> >>
> >>   cleanup:
> >> @@ -1415,6 +1423,12 @@ onload (struct ld_plugin_tv *tv)
> >>struct ld_plugin_tv *p;
> >>enum ld_plugin_status status;
> >>
> >> +  if (pthread_mutex_init (&plugin_lock, NULL) != 0)
> >> +{
> >> +  fprintf (stderr, "mutex init failed\n");
> >> +  abort ();
> >> +}
> >> +
> >>p = tv;
> >>while (p->tv_tag)
> >>  {
> >> --
> >> 2.36.1
> >>
> >>

Re: [PATCH] Introduce -nolibstdc++ option

2022-06-21 Thread Fangrui Song via Gcc-patches

On Tue, Jun 21, 2022 at 1:43 AM Richard Biener via Gcc-patches
 wrote:
>
> On Tue, Jun 21, 2022 at 7:56 AM Alexandre Oliva via Gcc-patches
>  wrote:
> >
> >
> > Using g++ to link without libstdc++, as in g++.dg/abi/pure-virtual1.C,
> > is error prone, because there's no way to tell g++ to drop libstdc++
> > without also dropping libc and any other libraries that the target
> > implicitly links in.
> >
> > This has often led to the need for manual adjustments to this
> > testcase.
> >
> > I figured adding support for -nolibstdc++, even though redundant,
> > makes some sense.  One could presumably use gcc rather than g++ for
> > linking, for the same effect, but sometimes changing the link command
> > is harder than adding an option, as in our testsuite.
> >
> > Regstrapped on x86_64-linux-gnu, also tested with a cross to
> > aarch64-rtems6.  Ok to install?
>
> OK in case nobody objects in 24h.
>
> Richard.

Is this similar to clang -nostdlib++ ?
When libstdc++ is selected, clang -nostdlib++ removes -lstdc++.


Re: [PATCH 1/2]middle-end: Simplify subtract where both arguments are being bitwise inverted.

2022-06-21 Thread Richard Biener via Gcc-patches

On Mon, Jun 20, 2022 at 10:49 AM Tamar Christina
 wrote:
>
> > -Original Message-
> > From: Richard Sandiford 
> > Sent: Monday, June 20, 2022 9:19 AM
> > To: Richard Biener via Gcc-patches 
> > Cc: Tamar Christina ; Richard Biener
> > ; Richard Guenther ;
> > nd 
> > Subject: Re: [PATCH 1/2]middle-end: Simplify subtract where both
> > arguments are being bitwise inverted.
> >
> > Richard Biener via Gcc-patches  writes:
> > > On Thu, Jun 16, 2022 at 1:10 PM Tamar Christina via Gcc-patches
> > >  wrote:
> > >>
> > >> Hi All,
> > >>
> > >> This adds a match.pd rule that drops the bitwise nots when both
> > >> arguments to a subtract are inverted. i.e. for:
> > >>
> > >> float g(float a, float b)
> > >> {
> > >>   return ~(int)a - ~(int)b;
> > >> }
> > >>
> > >> we instead generate
> > >>
> > >> float g(float a, float b)
> > >> {
> > >>   return (int)a - (int)b;
> > >> }
> > >>
> > >> We already do a limited version of this from the fold_binary fold
> > >> functions but this makes a more general version in match.pd that applies
> > >> more often.
> > >>
> > >> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >>
> > >> Ok for master?
> > >>
> > >> Thanks,
> > >> Tamar
> > >>
> > >> gcc/ChangeLog:
> > >>
> > >> * match.pd: New bit_not rule.
> > >>
> > >> gcc/testsuite/ChangeLog:
> > >>
> > >> * gcc.dg/subnot.c: New test.
> > >>
> > >> --- inline copy of patch --
> > >> diff --git a/gcc/match.pd b/gcc/match.pd
> > >> index a59b6778f661cf9121dd3503f43472871e4da445..51b0a1b562409af535e53828a10c30b8a3e1ae2e 100644
> > >> --- a/gcc/match.pd
> > >> +++ b/gcc/match.pd
> > >> @@ -1258,6 +1258,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >> (simplify
> > >>   (bit_not (plus:c (bit_not @0) @1))
> > >>   (minus @0 @1))
> > >> +/* (~X - ~Y) -> X - Y.  */
> > >> +(simplify
> > >> + (minus (bit_not @0) (bit_not @1))
> > >> + (minus @0 @1))
> > >
> > > It doesn't seem correct.
> > >
> > > (gdb) p/x ~-1 - ~0x80000000
> > > $3 = 0x80000001
> > > (gdb) p/x -1 - 0x80000000
> > > $4 = 0x7fffffff
> > >
> > > where I was looking for a case exposing undefined integer overflow.
> >
> > Yeah, shouldn't it be folding to (minus @1 @0) instead?
> >
> >   ~X = (-X - 1)
> >   ~Y = (-Y - 1)
> >
> > so:
> >
> >   ~X - ~Y = (-X - 1) - (-Y - 1)
> >   = -X - 1 + Y + 1
> >   = Y - X
> >
>
> You're right, sorry, I should have paid more attention when I wrote the patch.

You still need to watch out for undefined overflow cases in the result
that were well-defined in the original expression I think.

Richard.
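
As an illustration of the corrected identity ~X - ~Y == Y - X, here is a
small C sketch.  It deliberately uses uint32_t, where wrap-around is
well defined, so it sidesteps the signed-overflow concern raised above
rather than answering it.  The function names are invented for this
example and the sketch assumes 32-bit int.

```c
#include <stdint.h>

/* The original expression, in wrapping unsigned arithmetic.  */
uint32_t
sub_of_nots (uint32_t x, uint32_t y)
{
  return ~x - ~y;
}

/* The corrected fold: operands swapped.  */
uint32_t
swapped_sub (uint32_t x, uint32_t y)
{
  return y - x;
}

/* The fold as first posted: operands in the original order.  */
uint32_t
unswapped_sub (uint32_t x, uint32_t y)
{
  return x - y;
}

/* Return 1 if ~x - ~y == y - x for every pair of sample values.  */
int
identity_holds (void)
{
  static const uint32_t samples[] = {
    0u, 1u, 2u, 0x7fffffffu, 0x80000000u, 0x80000001u, 0xffffffffu
  };
  const int n = sizeof samples / sizeof samples[0];

  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      if (sub_of_nots (samples[i], samples[j])
          != swapped_sub (samples[i], samples[j]))
        return 0;
  return 1;
}
```

With X being all ones and Y having only the sign bit set, sub_of_nots
and swapped_sub agree while unswapped_sub gives a different value, which
is the mismatch the gdb session in the quoted review points at.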

> Tamar
> > Richard
> >
> >
> > > Richard.
> > >
> > >>
> > >>  /* ~(X - Y) -> ~X + Y.  */
> > >>  (simplify
> > >> diff --git a/gcc/testsuite/gcc.dg/subnot.c b/gcc/testsuite/gcc.dg/subnot.c
> > >> new file mode 100644
> > >> index 0000000000000000000000000000000000000000..d621bacd27bd3d19a010e4c9f831aa77d28bd02d
> > >> --- /dev/null
> > >> +++ b/gcc/testsuite/gcc.dg/subnot.c
> > >> @@ -0,0 +1,9 @@
> > >> +/* { dg-do compile } */
> > >> +/* { dg-options "-O -fdump-tree-optimized" } */
> > >> +
> > >> +float g(float a, float b)
> > >> +{
> > >> +  return ~(int)a - ~(int)b;
> > >> +}
> > >> +
> > >> +/* { dg-final { scan-tree-dump-not "~" "optimized" } } */
> > >>
> > >>
> > >>
> > >>
> > >> --

Re: [PATCH] if-to-switch: Don't skip the first condition bb when find_conditions in if-to-switch [PR105740]

2022-06-21 Thread Martin Liška

On 6/21/22 09:33, Xi Ruoyao wrote:
> On Tue, 2022-06-21 at 09:28 +0200, Martin Liška wrote:
> 
>> Sorry, but I don't see to which email this replies to?
>> Can't find a patch.
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596913.html

Hm, interesting. It means Thunderbird can't deal with the email format
and I can't see any attachment in the email.

Martin

> 
> The patch is an attachment:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/attachments/20220621/dbb112d2/attachment-0001.obj
>

Re: [PATCH v3] tree-optimization/94899: Remove "+ 0x80000000" in int comparisons

2022-06-21 Thread Richard Biener via Gcc-patches

On Mon, Jun 20, 2022 at 10:38 AM Jakub Jelinek  wrote:
>
> On Mon, Jun 20, 2022 at 09:36:28AM +0200, Richard Biener wrote:
> > > --- a/gcc/match.pd
> > > +++ b/gcc/match.pd
> > > @@ -2080,6 +2080,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >(if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
> > > && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
> > > (op @0 @1))))
> > > +
> > > +/* As a special case, X + C < Y + C is the same as (signed) X < (signed) Y
> > > +   when C is an unsigned integer constant with only the MSB set, and X and
> > > +   Y have types of equal or lower integer conversion rank than C's.  */
> > > +(for op (lt le ge gt)
> > > + (simplify
> > > +  (op (plus @1 INTEGER_CST@0) (plus @2 INTEGER_CST@0))
>
> Can't one just omit the INTEGER_CST part on the second @0?

Ah, yes.

> > > +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> > > +   && TYPE_UNSIGNED (TREE_TYPE (@0))
> > > +   && wi::only_sign_bit_p (wi::to_wide (@0)))
> > > +   (with { tree stype = signed_type_for (TREE_TYPE (@0)); }
> > > +(op (convert:stype @1) (convert:stype @2))))))
>
> As a follow-up, it might be useful to make it work for vector integral types
> too,
> typedef unsigned V __attribute__((vector_size (4 * sizeof (int))));
> #define M __INT_MAX__ + 1U
> V foo (V x, V y)
> {
>   return x + (V) { M, M, M, M } < y + (V) { M, M, M, M };
> }
> using uniform_integer_cst_p.
>
> Jakub
>

RE: [PATCH] i386: Add syscall to enable AMX for latest kernels

2022-06-21 Thread Jiang, Haochen via Gcc-patches

> -Original Message-
> From: Uros Bizjak 
> Sent: Tuesday, June 21, 2022 3:06 PM
> To: Jiang, Haochen 
> Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao 
> Subject: Re: [PATCH] i386: Add syscall to enable AMX for latest kernels
> 
> On Tue, Jun 21, 2022 at 4:23 AM Jiang, Haochen 
> wrote:
> >
> > > -Original Message-
> > > From: Uros Bizjak 
> > > Sent: Monday, June 20, 2022 10:54 PM
> > > To: Jiang, Haochen 
> > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao 
> > > Subject: Re: [PATCH] i386: Add syscall to enable AMX for latest
> > > kernels
> > >
> > > On Mon, Jun 20, 2022 at 10:04 AM Haochen Jiang
> > > 
> > > wrote:
> > > >
> > > > From: "Jiang, Haochen" 
> > > >
> > > > Hi all,
> > > >
> > > > We need syscall to enable AMX for kernels>=5.4. It is missing in
> > > > current amx tests, which will cause test fail.
> > >
> > > So this new code is only valid for linux & co?
> >
> > Thanks for reminding me of that, I only tested on linux since the header
> > file is only in linux.
> >
> > Just updated a patch wrapping with a macro not to change the behavior on
> > windows.
> 
> I think you want __linux__ there, not __unix__.

Fixed with __linux__.

Thx,
Haochen
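
For context, a guarded version of the permission request might look
roughly like the sketch below.  It is an illustration based on the
quoted patch, not the committed test header.  The extra check on
SYS_arch_prctl and the fallback that returns 1 on other systems (so that
their behaviour is unchanged) are assumptions made here.

```c
#define _GNU_SOURCE		/* For the syscall () declaration.  */

#ifdef __linux__
# include <sys/syscall.h>
# include <unistd.h>
#endif

#define XFEATURE_XTILECFG	17
#define XFEATURE_XTILEDATA	18
#define XFEATURE_MASK_XTILECFG	(1 << XFEATURE_XTILECFG)
#define XFEATURE_MASK_XTILEDATA	(1 << XFEATURE_XTILEDATA)
#define XFEATURE_MASK_XTILE \
  (XFEATURE_MASK_XTILECFG | XFEATURE_MASK_XTILEDATA)

#define ARCH_GET_XCOMP_PERM	0x1022
#define ARCH_REQ_XCOMP_PERM	0x1023

#if defined (__linux__) && defined (SYS_arch_prctl)
/* Ask the kernel for permission to use AMX tile data, then read the
   permission bitmask back.  Returns 1 if tile state is usable.  */
int
request_perm_xtile_data (void)
{
  unsigned long bitmask = 0;

  if (syscall (SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA)
      || syscall (SYS_arch_prctl, ARCH_GET_XCOMP_PERM, &bitmask))
    return 0;

  return (bitmask & XFEATURE_MASK_XTILE) != 0;
}
#else
/* Assumption: elsewhere no permission request exists, so report
   success and let the existing CPUID checks decide.  */
int
request_perm_xtile_data (void)
{
  return 1;
}
#endif
```

On a Linux machine without AMX the first arch_prctl call is expected to
fail, so the function returns 0 and the test body is skipped.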

> 
> Uros.
> 
> >
> > Regtested on x86_64-pc-linux-gnu.
> >
> > Thx,
> > Haochen
> > >
> > > Uros.
> > >
> > > >
> > > > This patch aims to add them to fix this bug.
> > > >
> > > > BRs,
> > > > Haochen
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > * gcc.target/i386/amx-check.h (request_perm_xtile_data):
> > > > New function to check if AMX is usable and enable AMX.
> > > > (main): Run test if AMX is usable.
> > > > ---
> > > >  gcc/testsuite/gcc.target/i386/amx-check.h | 24
> > > > +++
> > > >  1 file changed, 24 insertions(+)
> > > >
> > > > diff --git a/gcc/testsuite/gcc.target/i386/amx-check.h
> > > > b/gcc/testsuite/gcc.target/i386/amx-check.h
> > > > index 434b0e59703..92ed8669304 100644
> > > > --- a/gcc/testsuite/gcc.target/i386/amx-check.h
> > > > +++ b/gcc/testsuite/gcc.target/i386/amx-check.h
> > > > @@ -4,11 +4,22 @@
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > +#include 
> > > > +#include 
> > > >  #ifdef DEBUG
> > > >  #include 
> > > >  #endif
> > > >  #include "cpuid.h"
> > > >
> > > > +#define XFEATURE_XTILECFG  17
> > > > +#define XFEATURE_XTILEDATA 18
> > > > > +#define XFEATURE_MASK_XTILECFG (1 << XFEATURE_XTILECFG)
> > > > > +#define XFEATURE_MASK_XTILEDATA (1 << XFEATURE_XTILEDATA)
> > > > > +#define XFEATURE_MASK_XTILE (XFEATURE_MASK_XTILECFG | XFEATURE_MASK_XTILEDATA)
> > > > > +
> > > > > +#define ARCH_GET_XCOMP_PERM 0x1022
> > > > > +#define ARCH_REQ_XCOMP_PERM 0x1023
> > > > +
> > > >  /* TODO: The tmm emulation is temporary for current
> > > > AMX implementation with no tmm regclass, should
> > > > be changed in the future. */
> > > > @@ -44,6 +55,18 @@ typedef struct __tile
> > > > >  /* Stride (colum width in byte) used for tileload/store */
> > > > >  #define _STRIDE 64
> > > > >
> > > > > +/* We need syscall to use amx functions */
> > > > > +int request_perm_xtile_data()
> > > > > +{
> > > > > +  unsigned long bitmask;
> > > > > +
> > > > > +  if (syscall (SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA) ||
> > > > > +      syscall (SYS_arch_prctl, ARCH_GET_XCOMP_PERM, &bitmask))
> > > > > +    return 0;
> > > > > +
> > > > > +  return (bitmask & XFEATURE_MASK_XTILE) != 0;
> > > > > +}
> > > > > +
> > > > >  /* Initialize tile config by setting all tmm size to 16x64 */
> > > > >  void init_tile_config (__tilecfg_u *dst)
> > > > >  {
> > > > > @@ -186,6 +209,7 @@ main ()
> > > > >  #ifdef AMX_BF16
> > > > >    && __builtin_cpu_supports ("amx-bf16")
> > > > >  #endif
> > > > > +  && request_perm_xtile_data ()
> > > >)
> > > >  {
> > > >DO_TEST ();
> > > > --
> > > > 2.18.2
> > > >


0001-i386-Add-syscall-to-enable-AMX-for-latest-kernels.patch
Description: 0001-i386-Add-syscall-to-enable-AMX-for-latest-kernels.patch
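
The permission request discussed in this thread can be sketched in isolation as follows. This is a minimal sketch, not the committed test header. The constants are copied from the quoted patch and the __linux__ guard follows Uros's suggestion. Returning 1 on other systems (that is, adding no extra condition there) is an assumption about what "not to change the behavior" means, and the helper run_if_amx_usable is hypothetical. The bitmask is declared unsigned long long here, where the patch uses unsigned long, so that it is 64 bits wide on 32-bit builds as well.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>

#if defined(__linux__)
#include <sys/syscall.h>
#include <unistd.h>
#endif

/* Values copied from the patch under review. */
#define XFEATURE_XTILECFG 17
#define XFEATURE_XTILEDATA 18
#define XFEATURE_MASK_XTILECFG (1ULL << XFEATURE_XTILECFG)
#define XFEATURE_MASK_XTILEDATA (1ULL << XFEATURE_XTILEDATA)
#define XFEATURE_MASK_XTILE (XFEATURE_MASK_XTILECFG | XFEATURE_MASK_XTILEDATA)

#define ARCH_GET_XCOMP_PERM 0x1022
#define ARCH_REQ_XCOMP_PERM 0x1023

/* Ask the kernel for permission to use AMX tile data.
   Returns 1 if AMX may be used, 0 otherwise.  */
int request_perm_xtile_data(void)
{
#if defined(__linux__) && defined(SYS_arch_prctl)
  unsigned long long bitmask = 0;

  /* Both calls return 0 on success; any failure means "do not run".  */
  if (syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA)
      || syscall(SYS_arch_prctl, ARCH_GET_XCOMP_PERM, &bitmask))
    return 0;

  return (bitmask & XFEATURE_MASK_XTILE) != 0;
#else
  /* Assumption: without arch_prctl there is no permission step to take,
     so the check adds no extra condition.  */
  return 1;
#endif
}

/* Hypothetical helper mirroring the guard in the test's main:
   run the test body only when permission was granted.
   Returns 1 if the test ran, 0 if it was skipped.  */
int run_if_amx_usable(void (*test)(void))
{
  assert(test != NULL);
  if (!request_perm_xtile_data())
    return 0;
  test();
  return 1;
}
```

On a machine or kernel without AMX support the request fails and the function returns 0, so the guarded test body is skipped rather than faulting.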

Re: [PATCH] libstdc++: eh_globals: gthreads: reset _S_init before deleting key

2022-06-21 Thread Jonathan Wakely via Gcc-patches

On Tue, 21 Jun 2022 at 07:04, Alexandre Oliva via Libstdc++
 wrote:
>
>
> Clear __eh_globals_init's _S_init in the dtor before deleting the
> gthread key.
>
> This ensures that, in case any code involved in deleting the key
> interacts with eh_globals, the key that is being deleted won't be
> used, and the non-thread-specific eh_globals fallback will.
>
> Regstrapped on x86_64-linux-gnu, also tested with a cross to
> aarch64-rtems6.  Ok to install?

OK, thanks.

> PS: This is a fix for a theoretical issue, that came to mind while I
> independently investigated the problem that I later found to be
> PR105880.

It looks like a real problem though, if rare in practice.

Re: [PATCH] libstdc++: testsuite: call sched_yield for nonpreemptive targets

2022-06-21 Thread Jonathan Wakely via Gcc-patches

On Tue, 21 Jun 2022 at 06:54, Alexandre Oliva via Libstdc++
 wrote:
>
>
> As in the gcc testsuite, systems without preemptive multi-threading
> require sched_yield calls to be placed at points in which a context
> switch might be needed to enable the test to complete.

I'll try to remember that, but will probably forget. Is this really
the only affected test?

>
> Regstrapped on x86_64-linux-gnu, also tested with a cross to
> aarch64-rtems6.  Ok to install?

OK, thanks.

Re: [PATCH] Introduce -nolibstdc++ option

2022-06-21 Thread Richard Biener via Gcc-patches

On Tue, Jun 21, 2022 at 7:56 AM Alexandre Oliva via Gcc-patches
 wrote:
>
>
> Using g++ to link without libstdc++, as in g++.dg/abi/pure-virtual1.C,
> is error prone, because there's no way to tell g++ to drop libstdc++
> without also dropping libc and any other libraries that the target
> implicitly links in.
>
> This has often led to the need for manual adjustments to this
> testcase.
>
> I figured adding support for -nolibstdc++, even though redundant,
> makes some sense.  One could presumably use gcc rather than g++ for
> linking, for the same effect, but sometimes changing the link command
> is harder than adding an option, as in our testsuite.
>
> Regstrapped on x86_64-linux-gnu, also tested with a cross to
> aarch64-rtems6.  Ok to install?

OK in case nobody objects in 24h.

Richard.

>
> for  gcc/ChangeLog
>
> * common.opt (nolibstdc++): New.
> * doc/invoke.texi (-nolibstdc++): Document it.
>
> for  gcc/cp/ChangeLog
>
> * g++spec.c (lang_specific_driver): Implement -nolibstdc++.
>
> for  gcc/testsuite/ChangeLog
>
> * g++.dg/abi/pure-virtual1.C: Use -nolibstdc++.
> ---
>  gcc/common.opt   |3 +++
>  gcc/cp/g++spec.cc|1 +
>  gcc/doc/invoke.texi  |6 +-
>  gcc/testsuite/g++.dg/abi/pure-virtual1.C |2 +-
>  4 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 32917aafcaec1..e00c6fc2fb098 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -3456,6 +3456,9 @@ Driver
>  nolibc
>  Driver
>
> +nolibstdc++
> +Driver
> +
>  nostdlib
>  Driver
>
> diff --git a/gcc/cp/g++spec.cc b/gcc/cp/g++spec.cc
> index 8174d652776b1..539e6ca089d85 100644
> --- a/gcc/cp/g++spec.cc
> +++ b/gcc/cp/g++spec.cc
> @@ -160,6 +160,7 @@ lang_specific_driver (struct cl_decoded_option 
> **in_decoded_options,
> {
> case OPT_nostdlib:
> case OPT_nodefaultlibs:
> +   case OPT_nolibstdc__:
>   library = -1;
>   break;
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 50f57877477bc..469b6d97e0dfa 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -652,7 +652,7 @@ Objective-C and Objective-C++ Dialects}.
>  @item Linker Options
>  @xref{Link Options,,Options for Linking}.
>  @gccoptlist{@var{object-file-name}  -fuse-ld=@var{linker}  -l@var{library} 
> @gol
> --nostartfiles  -nodefaultlibs  -nolibc  -nostdlib @gol
> +-nostartfiles  -nodefaultlibs  -nolibc  -nolibstdc++  -nostdlib @gol
>  -e @var{entry}  --entry=@var{entry} @gol
>  -pie  -pthread  -r  -rdynamic @gol
>  -s  -static  -static-pie  -static-libgcc  -static-libstdc++ @gol
> @@ -16787,6 +16787,10 @@ absence of a C library is assumed, for example 
> @option{-lpthread} or
>  @option{-lm} in some configurations.  This is intended for bare-board
>  targets when there is indeed no C library available.
>
> +@item -nolibstdc++
> +@opindex nolibstdc++
> +Do not link with standard C++ libraries implicitly.
> +
>  @item -nostdlib
>  @opindex nostdlib
>  Do not use the standard system startup files or libraries when linking.
> diff --git a/gcc/testsuite/g++.dg/abi/pure-virtual1.C 
> b/gcc/testsuite/g++.dg/abi/pure-virtual1.C
> index 538e2cb097a0d..889c33e4952f4 100644
> --- a/gcc/testsuite/g++.dg/abi/pure-virtual1.C
> +++ b/gcc/testsuite/g++.dg/abi/pure-virtual1.C
> @@ -1,7 +1,7 @@
>  // Test that we don't need libsupc++ just for __cxa_pure_virtual.
>  // { dg-do link }
>  // { dg-require-weak }
> -// { dg-additional-options "-fno-rtti -nodefaultlibs -lc" }
> +// { dg-additional-options "-fno-rtti -nolibstdc++" }
>  // { dg-additional-options "-Wl,-undefined,dynamic_lookup" { target 
> *-*-darwin* } }
>  // { dg-xfail-if "AIX weak" { powerpc-ibm-aix* } }
>
>
> --
> Alexandre Oliva, happy hacker    https://FSFLA.org/blogs/lxo/
>Free Software Activist   GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about

Re: [PATCH] libstdc++: testsuite: require cmath for nexttowardl

2022-06-21 Thread Jonathan Wakely via Gcc-patches

On Tue, 21 Jun 2022 at 06:36, Alexandre Oliva via Libstdc++
 wrote:
>
>
> nexttowardl is only expected to be available with C99 math, but
> 20_util/to_chars/long_double.cc uses it unconditionally.
>
> State the cmath requirement in the test.
>
> Regstrapped on x86_64-linux-gnu, also tested with a cross to
> aarch64-rtems6.  Ok to install?

OK, thanks.

Re: [PATCH v1.1] tree-optimization/105736: Don't let error_mark_node escape for ADDR_EXPR

2022-06-21 Thread Siddhesh Poyarekar


On 20/06/2022 15:20, Jakub Jelinek wrote:

On Tue, Jun 14, 2022 at 09:01:54PM +0530, Siddhesh Poyarekar wrote:

The addr_expr computation does not check for error_mark_node before
returning the size expression.  This used to work in the constant case
because the conversion to uhwi would end up causing it to return
size_unknown, but that won't work for the dynamic case.


Regarding subject/first line of commit, it should be something like:

tree-object-size: Don't let error_mark_node escape for ADDR_EXPR [PR105736]
instead of what you have.


Modify the control flow to explicitly return size_unknown if the offset
computation returns an error_mark_node.

gcc/ChangeLog:

PR tree-optimization/105736
* tree-object-size.cc (addr_object_size): Return size_unknown
when object offset computation returns an error.

gcc/testsuite/ChangeLog:

PR tree-optimization/105736
* gcc.dg/builtin-dynamic-object-size-0.c (TV4, val3,
test_pr105736): New struct declaration, variable and function to
test PR.


If you want to spell the exact changes in the test, it would be better
to do it separately when it is different changes.


Thanks, fixed up before pushing to master.


* gcc.dg/builtin-dynamic-object-size-0.c (struct TV4): New type.
(val3): New variable.
(test_pr105736): New function.

(main): Use them.


Otherwise LGTM, but for GCC 13, it would be nice to add support
for BIT_FIELD_REF if both second and third arguments are multiples of
BITS_PER_UNIT.


Ack, I'll test and post this as a separate change.


Signed-off-by: Siddhesh Poyarekar 
---
Changes from v1:
- Used FAIL() instead of __builtin_abort() in the test.

Tested:

- x86_64 bootstrap and test
- --with-build-config=bootstrap-ubsan build

May I also backport this to gcc12?


Ok.


Thanks, I'll give it a couple of days in master and then cherry-pick.

Sid

Re: [PATCH] libstdc++: testsuite: work around bitset namespace pollution

2022-06-21 Thread Jonathan Wakely via Gcc-patches

On Tue, 21 Jun 2022 at 06:32, Alexandre Oliva via Libstdc++
 wrote:
>
>
> rtems6 declares a global struct bitset in a header file included
> indirectly by sys/types.h, that ambiguates the unqualified references
> to bitset after "using namespace std" in the testsuite.
>
> Work around the namespace pollution with using declarations of
> std::bitset.
>
> Regstrapped on x86_64-linux-gnu, also tested with a cross to
> aarch64-rtems6.0.  Ok to install?

OK, thanks.

Re: [PATCH] testsuite: outputs.exp: cleanup before running tests (was: Re: [PATCH] testsuite: outputs.exp: test for skip_atsave more thoroughly)

2022-06-21 Thread Richard Biener via Gcc-patches

On Tue, Jun 21, 2022 at 7:47 AM Alexandre Oliva via Gcc-patches
 wrote:
>
> On Jun 21, 2022, Alexandre Oliva  wrote:
>
> >   * gcc.misc-tests/outputs.exp (outest): Introduce quiet mode,
>
> Use the just-added dry-run infrastructure to clean up files that may
> have been left over by interrupted runs of outputs.exp, which used to
> lead to spurious non-repeatable (self-fixing) failures.
>
> Regstrapped on x86_64-linux-gnu, also tested with a cross to
> aarch64-rtems6.  Ok to install?

OK

>
> for  gcc/testsuite/ChangeLog
>
> * gcc.misc-tests/outputs.exp: Clean up left-overs first.
> ---
>  gcc/testsuite/gcc.misc-tests/outputs.exp |3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/gcc/testsuite/gcc.misc-tests/outputs.exp 
> b/gcc/testsuite/gcc.misc-tests/outputs.exp
> index a63ce66693b97..ab919db1ccb2d 100644
> --- a/gcc/testsuite/gcc.misc-tests/outputs.exp
> +++ b/gcc/testsuite/gcc.misc-tests/outputs.exp
> @@ -304,6 +304,9 @@ if { "$aout" != "" } then {
>  set oaout "-o $aout"
>  }
>
> +# Clean up any left-overs from an earlier interrupted run.
> +outest "$b-cleanup?" $sing "$oaout" {alt/ dir/ o/ od/ obj/} {{} {} {} {} {} 
> {$aout}}
> +
>  # Sometimes the -I or -L flags that cause the compiler driver to save
>  # .args.[01], instead of leaving it for the linker to save .ld1_args,
>  # is hiding in driver self specs.
>
>
> --
> Alexandre Oliva, happy hacker    https://FSFLA.org/blogs/lxo/
>Free Software Activist   GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about

Re: [rs6000 PATCH] PR target/105991: Recognize PLUS and XOR forms of rldimi.

2022-06-21 Thread Segher Boessenkool

On Tue, Jun 21, 2022 at 10:03:18AM +0800, Kewen.Lin wrote:
> This case also needs effective-target keyword lp64,
> that is /* { dg-require-effective-target lp64 } */

Good point.  Yes.

It would be nice to have just has_arch_ppc64 really.

> since with -m32, it gets:
>   mr 3,4
> 
> with -m32 -mpowerpc64, it gets:
>   rldicl 3,4,0,32

Yes, and that is not lp64 -- both longs and pointers are 32 bits when
you have -m32.

You get different code because parameter passing is different.  The
usual way to sidestep is to have the data in memory instead:

unsigned long long x;
void 
goo (void)
{
  unsigned long long value = x;
  value &= 0xffffffff;
  value |= value << 32;
  x = value;
}

but then the compiler tries to be smart and do code like
addis 10,2,.LANCHOR0+4@toc@ha
lwz 10,.LANCHOR0+4@toc@l(10)
sldi 9,10,32
add 9,9,10
addis 10,2,.LANCHOR0@toc@ha
std 9,.LANCHOR0@toc@l(10)
blr
for -m64, and
lis 9,x@ha
la 10,x@l(9)
lwz 10,4(10)
stw 10,x@l(9)
blr
for just -m32, but
lis 10,x@ha
la 9,x@l(10)
la 10,x@l(10)
ld 9,0(9)
rldicl 8,9,0,32
sldi 9,9,32
add 9,9,8
std 9,0(10)
blr
for -m32 -mpowerpc64 (note it has not managed to do the splitter here;
it gets
Failed to match this instruction:
(set (reg:DI 128)
    (plus:DI (ashift:DI (reg/v:DI 117 [ value ])
            (const_int 32 [0x20]))
        (zero_extend:DI (subreg:SI (reg/v:DI 117 [ value ]) 4))))
and then
Failed to match this instruction:
(set (reg:DI 128)
    (plus:DI (and:DI (reg/v:DI 117 [ value ])
            (const_int 4294967295 [0xffffffff]))
        (ashift:DI (reg/v:DI 117 [ value ])
            (const_int 32 [0x20]))))
but that is not enough).

So let's just do lp64, at least for now :-)

Segher
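
As a side note on why the PLUS and XOR forms in the subject line are interchangeable with the usual IOR form: when the two operands have no set bits in common, OR, addition and XOR all give the same result. The sketch below checks that for the low-word duplication used in the example above. The function names are hypothetical and this is only a source-level illustration, not the rs6000 pattern itself.

```c
#include <assert.h>
#include <stdint.h>

/* IOR form: low 32 bits duplicated into the high 32 bits.  */
uint64_t dup_low_ior(uint64_t v)
{
  v &= 0xffffffffu;
  return v | (v << 32);
}

/* PLUS form: same result, because the masked value and its
   shifted copy occupy disjoint bit ranges (no carries occur).  */
uint64_t dup_low_plus(uint64_t v)
{
  v &= 0xffffffffu;
  return v + (v << 32);
}

/* XOR form: same result for the same reason (no bit is set twice).  */
uint64_t dup_low_xor(uint64_t v)
{
  v &= 0xffffffffu;
  return v ^ (v << 32);
}

/* Assert that all three forms agree for one input.  */
void check_forms_agree(uint64_t v)
{
  assert(dup_low_ior(v) == dup_low_plus(v));
  assert(dup_low_ior(v) == dup_low_xor(v));
}
```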

Re: [PATCH] testsuite: outputs.exp: test for skip_atsave more thoroughly

2022-06-21 Thread Richard Biener via Gcc-patches

On Tue, Jun 21, 2022 at 7:45 AM Alexandre Oliva via Gcc-patches
 wrote:
>
>
> The presence of -I or -L flags in link command lines changes the
> driver's, and thus the linker's behavior, WRT naming files with
> command-line options.  With such flags, the driver creates .args.0 and
> .args.1 files, whereas without them it's the linker (collect2, really)
> that creates .ld1_args.
>
> I've hit some fails on a target system that doesn't have -I or -L
> flags in the board config file, but it does add some of them
> implicitly with configured-in driver self specs.  Alas, the test in
> outputs.exp doesn't catch that, so we proceed to run rather than
> skip_atsave tests.
>
> I've reworked the outest procedure to allow dry runs and to return
> would-have-been pass/fail results as lists, so we can now test whether
> certain files are created and use that to configure the actual test
> runs.
>
> Regstrapped on x86_64-linux-gnu, also tested with a cross to
> aarch64-rtems6.  Ok to install?

OK.

Thanks

>
> for  gcc/testsuite/ChangeLog
>
> * gcc.misc-tests/outputs.exp (outest): Introduce quiet mode,
> create and return lists of passes and fails.  Use it to catch
> skip_atsave cases where -L flags are implicitly added by
> driver self specs.
> ---
>  gcc/testsuite/gcc.misc-tests/outputs.exp |   49 
> ++
>  1 file changed, 42 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.misc-tests/outputs.exp 
> b/gcc/testsuite/gcc.misc-tests/outputs.exp
> index afae735e92d76..a63ce66693b97 100644
> --- a/gcc/testsuite/gcc.misc-tests/outputs.exp
> +++ b/gcc/testsuite/gcc.misc-tests/outputs.exp
> @@ -116,8 +116,23 @@ if [info exists env(MAKEFLAGS)] {
>  # it weren't for
>  # https://core.tcl-lang.org/tcl/tktview?name=5bbd044812), but .{i,s,o}
>  # and .[iso] will pass even if only the .o is present.
> +
> +# Return a list containing two lists, the first naming the passes, the
> +# second naming the fails.  If test ends with a question mark, the
> +# test is taken as a preparatory test or cleanup, and no pass or fail
> +# results will be logged, though the lists will still be built and
> +# returned.
>  array unset outests *
>  proc outest { test sources opts dirs outputs } {
> +if { [string index $test end] == "?" } {
> +   set quiet 1
> +} else {
> +   set quiet 0
> +}
> +
> +set passes {}
> +set fails {}
> +
>  global b srcdir subdir
>  global outests
>
> @@ -182,15 +197,15 @@ proc outest { test sources opts dirs outputs } {
> set o "$og"
> }
> if { [file exists $d$o] } then {
> -   pass "$test: $d$o"
> +   lappend passes "$d$o"
> file delete $d$o
> } else {
> set ogl [glob -nocomplain -path $d -- $o]
> if { $ogl != {} } {
> -   pass "$test: $d$o"
> +   lappend passes "$d$o"
> file delete $ogl
> } else {
> -   fail "$test: $d$o"
> +   lappend fails "$d$o"
> }
> }
> }
> @@ -219,17 +234,27 @@ proc outest { test sources opts dirs outputs } {
>  }
>
>  if { [llength $outb] == 0 } then {
> -   pass "$test: extra"
> +   lappend passes "extra"
>  } else {
> -   fail "$test: extra\n$outb"
> +   lappend fails "extra\n$outb"
>  }
>
>  if { [string equal "$gcc_output" ""] } then {
> -   pass "$test: std out"
> +   lappend passes "std out"
>  } else {
> -   fail "$test: std out\n$gcc_output"
> +   lappend fails "std out\n$gcc_output"
>  }
>
> +if !$quiet {
> +   foreach p $passes {
> +   pass "$test: $p"
> +   }
> +   foreach f $fails {
> +   fail "$test: $f"
> +   }
> +}
> +
> +return [list $passes $fails]
>  }
>
>  set sing {-0.c}
> @@ -279,6 +304,16 @@ if { "$aout" != "" } then {
>  set oaout "-o $aout"
>  }
>
> +# Sometimes the -I or -L flags that cause the compiler driver to save
> +# .args.[01], instead of leaving it for the linker to save .ld1_args,
> +# is hiding in driver self specs.
> +if !$skip_atsave {
> +set atsave_test_out [outest "$b-skip-atsave?" $sing "@/dev/null -o 
> $b.exe -save-temps" {} {{.args.1}}]
> +if { [lindex [lindex $atsave_test_out 0] 0] == "$b.args.1" } {
> +   set skip_atsave 1
> +}
> +}
> +
>  # Driver-chosen outputs.
>  outest "$b-1 asm default 1" $sing "-S" {} {{-0.s}}
>  outest "$b-2 asm default 2" $mult "-S" {} {{-1.s -2.s}}
>
> --
> Alexandre Oliva, happy hacker    https://FSFLA.org/blogs/lxo/
>Free Software Activist   GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about

Re: [PATCH] libgo: Recognize off64_t / loff_t type definition of musl libc

2022-06-21 Thread Sören Tempel via Gcc-patches

Hi,

The problem is: glibc defines loff_t in sys/types.h, not fcntl.h (where musl
defines it). I falsely assumed that the newly committed AC_CHECK_TYPES
invocation would include fcntl.h *in addition to* AC_INCLUDES_DEFAULT.
However, as it turns out specifying includes for AC_CHECK_TYPES overwrites the
default instead of appending to it.

The patch below should fix this by appending to AC_INCLUDES_DEFAULT explicitly.
Alternatively, we could try to add fcntl.h to AC_INCLUDES_DEFAULT, though my
autotools knowledge is severely limited and hence I am not sure how this would
be achieved.

diff --git a/libgo/configure b/libgo/configure
index b7ff9b3..273af1d 100755
--- a/libgo/configure
+++ b/libgo/configure
@@ -15549,8 +15549,10 @@ fi

 CFLAGS_hold="$CFLAGS"
 CFLAGS="$OSCFLAGS $CFLAGS"
-ac_fn_c_check_type "$LINENO" "loff_t" "ac_cv_type_loff_t" "#include 
-"
+ac_fn_c_check_type "$LINENO" "loff_t" "ac_cv_type_loff_t" "
+$ac_includes_default
+#include 
+ "
 if test "x$ac_cv_type_loff_t" = xyes; then :

 cat >>confdefs.h <<_ACEOF
diff --git a/libgo/configure.ac b/libgo/configure.ac
index bac58b0..b237392 100644
--- a/libgo/configure.ac
+++ b/libgo/configure.ac
@@ -604,7 +604,9 @@ AC_TYPE_OFF_T

 CFLAGS_hold="$CFLAGS"
 CFLAGS="$OSCFLAGS $CFLAGS"
-AC_CHECK_TYPES([loff_t], [], [], [[#include ]])
+AC_CHECK_TYPES([loff_t], [], [], [
+AC_INCLUDES_DEFAULT
+#include ])
 CFLAGS="$CFLAGS_hold"

 LIBS_hold="$LIBS"

Eric Botcazou  wrote:
> > aarch64-suse-linux, of course.
> 
> Likewise on x86_64-suse-linux.
> 
> > > What is the output of
> > > 
> > > grep loff_t TARGET/libgo/gen-sysinfo.go
> > 
> > type ___loff_t int64
> > type _loff_t int64
> > type ___kernel_loff_t int64
> 
> Ditto.

Re: [PATCH] if-to-switch: Don't skip the first condition bb when find_conditions in if-to-switch [PR105740]

2022-06-21 Thread Xi Ruoyao via Gcc-patches

On Tue, 2022-06-21 at 09:28 +0200, Martin Liška wrote:

> Sorry, but I don't see to which email this replies to?
> Can't find a patch.

https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596913.html

The patch is an attachment:

https://gcc.gnu.org/pipermail/gcc-patches/attachments/20220621/dbb112d2/attachment-0001.obj

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University

Re: [PATCH] if-to-switch: Don't skip the first condition bb when find_conditions in if-to-switch [PR105740]

2022-06-21 Thread Richard Biener via Gcc-patches

On Tue, Jun 21, 2022 at 5:06 AM xionghuluo(罗雄虎) via Gcc-patches
 wrote:
>
> Current GCC generates:
>
>
>  test2:
> .LFB0:
>         .cfi_startproc
>         xorl    %edx, %edx
>         cmpl    $3, (%rdi)
>         jle     .L1
>         movl    16(%rdi), %eax
>         cmpl    $1, %eax
>         je      .L4
>         subl    $2, %eax
>         cmpl    $4, %eax
>         ja      .L1
>         movl    CSWTCH.1(,%rax,4), %edx
> .L1:
>         movl    %edx, %eax
>         ret
>         .p2align 4,,10
>         .p2align 3
> .L4:
>         movl    $12, %edx
>         jmp     .L1
>         .cfi_endproc
> .LFE0:
>         .size   test2, .-test2
>         .section        .rodata
>         .align 16
>         .type   CSWTCH.1, @object
>         .size   CSWTCH.1, 20
> CSWTCH.1:
>         .long   27
>         .long   38
>         .long   18
>         .long   58
>         .long   68
>
>
>
>
> With the patch attached:
>
>
>  test2:
> .LFB0:
>         .cfi_startproc
>         xorl    %edx, %edx
>         cmpl    $3, (%rdi)
>         jle     .L1
>         movl    16(%rdi), %eax
>         subl    $1, %eax
>         cmpl    $5, %eax
>         jbe     .L6
> .L1:
>         movl    %edx, %eax
>         ret
>         .p2align 4,,10
>         .p2align 3
> .L6:
>         movl    CSWTCH.1(,%rax,4), %edx
>         movl    %edx, %eax
>         ret
>         .cfi_endproc
> .LFE0:
>         .size   test2, .-test2
>         .section        .rodata
>         .align 16
>         .type   CSWTCH.1, @object
>         .size   CSWTCH.1, 24
> CSWTCH.1:
>         .long   12
>         .long   27
>         .long   38
>         .long   18
>         .long   58
>         .long   68
>
>
> Bootstrap and regression tested pass on x86_64-linux-gnu, OK for master?

OK if you add a comment that an empty conditions_in_bbs indicates we are
processing the first basic-block (that's not obvious to me).

Thanks,
Richard.
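
For readers without the attachment, the effect visible in the two assembly listings can be modelled in C. This is a hypothetical reconstruction from the quoted assembly only: the original testcase is not shown in this thread, the function names are made up, and the outer guard (cmpl $3, (%rdi)) is left out. Before the patch the first comparison (value 1, result 12) stays outside the table; after the patch it becomes the first entry of a six-element table.

```c
#include <assert.h>

/* Hypothetical if-chain of the kind if-to-switch converts.
   Values are taken from CSWTCH.1 and the "movl $12" in the listing.  */
int chain_lookup(int x)
{
  if (x == 1)
    return 12;
  else if (x == 2)
    return 27;
  else if (x == 3)
    return 38;
  else if (x == 4)
    return 18;
  else if (x == 5)
    return 58;
  else if (x == 6)
    return 68;
  return 0;
}

/* Model of the patched code: subtract 1, one unsigned range check
   (the "cmpl $5 / jbe"), then a single table load.  */
int table_lookup(int x)
{
  static const int cswtch[6] = { 12, 27, 38, 18, 58, 68 };
  unsigned int idx = (unsigned int) x - 1u;

  return idx <= 5u ? cswtch[idx] : 0;
}

/* Assert that both versions agree on a range around the table.  */
void check_lookups_agree(void)
{
  for (int x = -8; x <= 16; x++)
    assert(chain_lookup(x) == table_lookup(x));
}
```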

Re: [PATCH] if-to-switch: Don't skip the first condition bb when find_conditions in if-to-switch [PR105740]

2022-06-21 Thread Martin Liška

On 6/21/22 05:44, Xionghu Luo via Gcc-patches wrote:
> Correct the format...
> 
>  test2:
> .LFB0:
>         .cfi_startproc
>         xorl    %edx, %edx
>         cmpl    $3, (%rdi)
>         jle     .L1
>         movl    16(%rdi), %eax
>         cmpl    $1, %eax
>         je      .L4
>         subl    $2, %eax
>         cmpl    $4, %eax
>         ja      .L1
>         movl    CSWTCH.1(,%rax,4), %edx
> .L1:
>         movl    %edx, %eax
>         ret
>         .p2align 4,,10
>         .p2align 3
> .L4:
>         movl    $12, %edx
>         jmp     .L1
>         .cfi_endproc
> .LFE0:
>         .size   test2, .-test2
>         .section        .rodata
>         .align 16
>         .type   CSWTCH.1, @object
>         .size   CSWTCH.1, 20
> CSWTCH.1:
>         .long   27
>         .long   38
>         .long   18
>         .long   58
>         .long   68
> 
> 
> 
> With the patch attached:
> 
> 
>  test2:
> .LFB0:
>         .cfi_startproc
>         xorl    %edx, %edx
>         cmpl    $3, (%rdi)
>         jle     .L1
>         movl    16(%rdi), %eax
>         subl    $1, %eax
>         cmpl    $5, %eax
>         jbe     .L6
> .L1:
>         movl    %edx, %eax
>         ret
>         .p2align 4,,10
>         .p2align 3
> .L6:
>         movl    CSWTCH.1(,%rax,4), %edx
>         movl    %edx, %eax
>         ret
>         .cfi_endproc
> .LFE0:
>         .size   test2, .-test2
>         .section        .rodata
>         .align 16
>         .type   CSWTCH.1, @object
>         .size   CSWTCH.1, 24
> CSWTCH.1:
>         .long   12
>         .long   27
>         .long   38
>         .long   18
>         .long   58
>         .long   68
> 
> 
> 
> 
> On 2022/6/21 11:05, xionghuluo(罗雄虎) via Gcc-patches wrote:
>> Current GCC generates:
>>
>>
>>  test2:
>> .LFB0:
>>         .cfi_startproc
>>         xorl    %edx, %edx
>>         cmpl    $3, (%rdi)
>>         jle     .L1
>>         movl    16(%rdi), %eax
>>         cmpl    $1, %eax
>>         je      .L4
>>         subl    $2, %eax
>>         cmpl    $4, %eax
>>         ja      .L1
>>         movl    CSWTCH.1(,%rax,4), %edx
>> .L1:
>>         movl    %edx, %eax
>>         ret
>>         .p2align 4,,10
>>         .p2align 3
>> .L4:
>>         movl    $12, %edx
>>         jmp     .L1
>>         .cfi_endproc
>> .LFE0:
>>         .size   test2, .-test2
>>         .section        .rodata
>>         .align 16
>>         .type   CSWTCH.1, @object
>>         .size   CSWTCH.1, 20
>> CSWTCH.1:
>>         .long   27
>>         .long   38
>>         .long   18
>>         .long   58
>>         .long   68
>>
>>
>>
>>
>> With the patch attached:
>>
>>
>>  test2:
>> .LFB0:
>>         .cfi_startproc
>>         xorl    %edx, %edx
>>         cmpl    $3, (%rdi)
>>         jle     .L1
>>         movl    16(%rdi), %eax
>>         subl    $1, %eax
>>         cmpl    $5, %eax
>>         jbe     .L6
>> .L1:
>>         movl    %edx, %eax
>>         ret
>>         .p2align 4,,10
>>         .p2align 3
>> .L6:
>>         movl    CSWTCH.1(,%rax,4), %edx
>>         movl    %edx, %eax
>>         ret
>>         .cfi_endproc
>> .LFE0:
>>         .size   test2, .-test2
>>         .section        .rodata
>>         .align 16
>>         .type   CSWTCH.1, @object
>>         .size   CSWTCH.1, 24
>> CSWTCH.1:
>>         .long   12
>>         .long   27
>>         .long   38
>>         .long   18
>>         .long   58
>>         .long   68
>>
>>
>> Bootstrap and regression tested pass on x86_64-linux-gnu, OK for master?
> 

Sorry, but I don't see to which email this replies to?
Can't find a patch.

Martin

RE: [PATCH 2/2]middle-end: Support recognition of three-way max/min.

2022-06-21 Thread Tamar Christina via Gcc-patches

> -Original Message-
> From: Andrew Pinski 
> Sent: Tuesday, June 21, 2022 12:16 AM
> To: Tamar Christina 
> Cc: GCC Patches ; Jakub Jelinek
> ; nd ; Richard Guenther
> 
> Subject: Re: [PATCH 2/2]middle-end: Support recognition of three-way
> max/min.
> 
> On Thu, Jun 16, 2022 at 4:11 AM Tamar Christina via Gcc-patches  patc...@gcc.gnu.org> wrote:
> >
> > Hi All,
> >
> > This patch adds support for three-way min/max recognition in phi-opts.
> >
> > Concretely for e.g.
> >
> > #include <stdint.h>
> >
> > uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > uint8_t  xk;
> > if (xc < xm) {
> > xk = (uint8_t) (xc < xy ? xc : xy);
> > } else {
> > xk = (uint8_t) (xm < xy ? xm : xy);
> > }
> > return xk;
> > }
> >
> > we generate:
> >
> >[local count: 1073741824]:
> >   _5 = MIN_EXPR ;
> >   _7 = MIN_EXPR ;
> >   return _7;
> >
> > instead of
> >
> >   :
> >   if (xc_2(D) < xm_3(D))
> > goto ;
> >   else
> > goto ;
> >
> >   :
> >   xk_5 = MIN_EXPR ;
> >   goto ;
> >
> >   :
> >   xk_6 = MIN_EXPR ;
> >
> >   :
> >   # xk_1 = PHI 
> >   return xk_1;
> >
> > The same function also immediately deals with turning a minimization
> > problem into a maximization one if the results are inverted.  We do
> > this here since doing it in match.pd would end up changing the shape
> > of the BBs and adding additional instructions which would prevent various
> optimizations from working.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * tree-ssa-phiopt.cc (minmax_replacement): Optionally search for the
> phi
> > sequence of a three-way conditional.
> > (replace_phi_edge_with_variable): Support deferring of BB removal.
> > (tree_ssa_phiopt_worker): Detect diamond phi structure for three-
> way
> > min/max.
> > (strip_bit_not, invert_minmax_code): New.
> 
> I have been working on getting rid of minmax_replacement and a few others
> and only having match_simplify_replacement and having the simplification
> logic all in match.pd instead.
> Is there a reason why you can't expand match_simplify_replacement and
> match.pd?

Because this is just a simple extension of minmax_replacement which just adds
a third case but re-uses all the validation and normalization code already
present in the pass.

> 
> >The reason was that a lot of the foldings checked that the BB contains
> >only  a single SSA and that that SSA is a phi node.
> 
> Could you expand on that?

Passes that call last_and_only_stmt break because you push an extra statement 
into the BB. The phi node is then no longer the last and only statement.

Off the top of my head, ssa-split-path is one that started giving some failures
in the testsuite because of this.

Tamar
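
For reference, the source-level identity behind the three-way min recognition quoted below can be checked exhaustively for uint8_t. The sketch compares the branching form from the patch description with two nested minimum operations. The names min_u8, three_min_flat and check_three_min are hypothetical, and this says nothing about how phiopt performs the transformation internally.

```c
#include <assert.h>
#include <stdint.h>

/* Branching form, as in the patch description.  */
uint8_t three_min(uint8_t xc, uint8_t xm, uint8_t xy)
{
  uint8_t xk;
  if (xc < xm)
    xk = (uint8_t) (xc < xy ? xc : xy);
  else
    xk = (uint8_t) (xm < xy ? xm : xy);
  return xk;
}

/* Two-operand minimum helper.  */
static uint8_t min_u8(uint8_t a, uint8_t b)
{
  return a < b ? a : b;
}

/* Flat form: two chained minimum operations, no control flow.  */
uint8_t three_min_flat(uint8_t xc, uint8_t xm, uint8_t xy)
{
  return min_u8(min_u8(xc, xm), xy);
}

/* Compare both forms over all 256^3 input combinations.  */
void check_three_min(void)
{
  for (int c = 0; c < 256; c++)
    for (int m = 0; m < 256; m++)
      for (int y = 0; y < 256; y++)
        assert(three_min((uint8_t) c, (uint8_t) m, (uint8_t) y)
               == three_min_flat((uint8_t) c, (uint8_t) m, (uint8_t) y));
}
```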

> 
> Thanks,
> Andrew
> 
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't 
> > optimize
> > code away.
> > * gcc.dg/tree-ssa/minmax-3.c: New test.
> > * gcc.dg/tree-ssa/minmax-4.c: New test.
> > * gcc.dg/tree-ssa/minmax-5.c: New test.
> > * gcc.dg/tree-ssa/minmax-6.c: New test.
> > * gcc.dg/tree-ssa/minmax-7.c: New test.
> > * gcc.dg/tree-ssa/minmax-8.c: New test.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > new file mode 100644
> > index
> >
> ..de3b2e946e81701e3b75f580e
> 6a8
> > 43695a05786e
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-3.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
> > +   uint8_t  xk;
> > +if (xc < xm) {
> > +xk = (uint8_t) (xc < xy ? xc : xy);
> > +} else {
> > +xk = (uint8_t) (xm < xy ? xm : xy);
> > +}
> > +return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > new file mode 100644
> > index
> >
> ..0b6d667be868c2405eaefd17c
> b52
> > 2da44bafa0e2
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-4.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t three_max (uint8_t xc, uint8_t xm, uint8_t xy) {
> > +uint8_t xk;
> > +if (xc > xm) {
> > +xk = (uint8_t) (xc > xy ? xc : xy);
> > +} else {
> > +xk = (uint8_t) (xm > xy ? xm : xy);
> > +}
> > +return xk;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "phiopt1" } }

Re: [PATCH] i386: Add syscall to enable AMX for latest kernels

2022-06-21 Thread Uros Bizjak via Gcc-patches

On Tue, Jun 21, 2022 at 4:23 AM Jiang, Haochen  wrote:
>
> > -Original Message-
> > From: Uros Bizjak 
> > Sent: Monday, June 20, 2022 10:54 PM
> > To: Jiang, Haochen 
> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao 
> > Subject: Re: [PATCH] i386: Add syscall to enable AMX for latest kernels
> >
> > On Mon, Jun 20, 2022 at 10:04 AM Haochen Jiang 
> > wrote:
> > >
> > > From: "Jiang, Haochen" 
> > >
> > > Hi all,
> > >
> > > > We need a syscall to enable AMX for kernels >= 5.4. It is missing from
> > > > the current AMX tests, which causes the tests to fail.
> >
> > So this new code is only valid for linux & co?
>
> Thanks for reminding me of that. I only tested on Linux, since the header
> file is only available on Linux.
>
> I just updated the patch, wrapping the new code in a macro so that behavior
> on Windows does not change.

I think you want __linux__ there, not __unix__.

Uros.

>
> Regtested on x86_64-pc-linux-gnu.
>
> Thx,
> Haochen
> >
> > Uros.
> >
> > >
> > > This patch aims to add them to fix this bug.
> > >
> > > BRs,
> > > Haochen
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.target/i386/amx-check.h (request_perm_xtile_data):
> > > New function to check if AMX is usable and enable AMX.
> > > (main): Run test if AMX is usable.
> > > ---
> > >  gcc/testsuite/gcc.target/i386/amx-check.h | 24
> > > +++
> > >  1 file changed, 24 insertions(+)
> > >
> > > diff --git a/gcc/testsuite/gcc.target/i386/amx-check.h
> > > b/gcc/testsuite/gcc.target/i386/amx-check.h
> > > index 434b0e59703..92ed8669304 100644
> > > --- a/gcc/testsuite/gcc.target/i386/amx-check.h
> > > +++ b/gcc/testsuite/gcc.target/i386/amx-check.h
> > > @@ -4,11 +4,22 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > > +#include 
> > >  #ifdef DEBUG
> > >  #include 
> > >  #endif
> > >  #include "cpuid.h"
> > >
> > > > +#define XFEATURE_XTILECFG 17
> > > > +#define XFEATURE_XTILEDATA 18
> > > > +#define XFEATURE_MASK_XTILECFG (1 << XFEATURE_XTILECFG)
> > > > +#define XFEATURE_MASK_XTILEDATA (1 << XFEATURE_XTILEDATA)
> > > > +#define XFEATURE_MASK_XTILE (XFEATURE_MASK_XTILECFG | XFEATURE_MASK_XTILEDATA)
> > > > +
> > > > +#define ARCH_GET_XCOMP_PERM 0x1022
> > > > +#define ARCH_REQ_XCOMP_PERM 0x1023
> > > +
> > >  /* TODO: The tmm emulation is temporary for current
> > > AMX implementation with no tmm regclass, should
> > > be changed in the future. */
> > > @@ -44,6 +55,18 @@ typedef struct __tile
> > > >  /* Stride (colum width in byte) used for tileload/store */
> > > >  #define _STRIDE 64
> > > >
> > > > +/* We need syscall to use amx functions */
> > > > +int request_perm_xtile_data()
> > > > +{
> > > > +  unsigned long bitmask;
> > > > +
> > > > +  if (syscall (SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA) ||
> > > > +      syscall (SYS_arch_prctl, ARCH_GET_XCOMP_PERM, &bitmask))
> > > > +    return 0;
> > > > +
> > > > +  return (bitmask & XFEATURE_MASK_XTILE) != 0;
> > > > +}
> > > > +
> > > >  /* Initialize tile config by setting all tmm size to 16x64 */
> > > >  void init_tile_config (__tilecfg_u *dst)
> > > >  {
> > > > @@ -186,6 +209,7 @@ main ()
> > > >  #ifdef AMX_BF16
> > > >    && __builtin_cpu_supports ("amx-bf16")
> > > >  #endif
> > > > +  && request_perm_xtile_data ()
> > >)
> > >  {
> > >DO_TEST ();
> > > --
> > > 2.18.2
> > >
