[PATCH v2] Simplify year::is_leap().

2023-11-10 Thread Cassio Neri
The current implementation returns
(_M_y & (__is_multiple_of_100 ? 15 : 3)) == 0;
where __is_multiple_of_100 is calculated using an obfuscated algorithm which
saves one ror instruction when compared to _M_y % 100 == 0 [1].

In leap-year calculations, it is mathematically correct to replace the
divisibility check by 100 with a check by 25: for multiples of 4 the two
checks agree, and for all other years the result is false either way.
It turns out that _M_y % 25 == 0 also saves the ror instruction [2].
Therefore, the obfuscation is not required.

[1] https://godbolt.org/z/5PaEv6a6b
[2] https://godbolt.org/z/55G8rn77e
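
The equivalence claimed above can be checked mechanically. The following
is only a sketch: the function names are hypothetical (they are not part
of libstdc++), and a plain int stands in for _M_y.

```cpp
#include <cassert>

// Textbook Gregorian rule, used here as the reference.
constexpr bool is_leap_reference(int y) noexcept
{ return y % 4 == 0 && (y % 100 != 0 || y % 400 == 0); }

// The simplified form from the patch: test divisibility by 25, then
// check the low 4 bits (multiple of 16) or the low 2 bits (multiple of 4).
constexpr bool is_leap_mod25(int y) noexcept
{ return (y & (y % 25 == 0 ? 15 : 3)) == 0; }

// Compare both forms for every year in [first, last].
constexpr bool forms_agree(int first, int last) noexcept
{
  for (int y = first; y <= last; ++y)
    if (is_leap_reference(y) != is_leap_mod25(y))
      return false;
  return true;
}

// A small range is enough for a compile-time smoke test.
static_assert(forms_agree(-1600, 2400), "both forms must agree");

// Runtime spot checks on well-known years.
inline void check_known_years()
{
  assert(is_leap_mod25(2000));   // multiple of 400
  assert(!is_leap_mod25(1900));  // multiple of 100 but not of 400
  assert(is_leap_mod25(2024));   // multiple of 4 but not of 100
}
```

Negative years are covered as well, assuming two's complement
representation for the bitwise test.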

libstdc++-v3/ChangeLog:

* include/std/chrono:

---
 libstdc++-v3/include/std/chrono | 38 ++
 1 file changed, 18 insertions(+), 20 deletions(-)

diff --git a/libstdc++-v3/include/std/chrono
b/libstdc++-v3/include/std/chrono
index 10e868e5a03..5707ed002a2 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -835,29 +835,27 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr bool
   is_leap() const noexcept
   {
- // Testing divisibility by 100 first gives better performance, that is,
- // return (_M_y % 100 != 0 || _M_y % 400 == 0) && _M_y % 4 == 0;
-
- // It gets even faster if _M_y is in [-536870800, 536870999]
- // (which is the case here) and _M_y % 100 is replaced by
- // __is_multiple_of_100 below.
+ // Testing divisibility by 100 first gives better performance [1], i.e.,
+ // return y % 100 == 0 ? y % 400 == 0 : y % 4 == 0;
+ // Furthermore, if y % 100 == 0, then y % 400 == 0 is equivalent to
+ // y % 16 == 0, so we can simplify it to
+ // return y % 100 == 0 ? y % 16 == 0 : y % 4 == 0.  // #1
+ // Similarly, we can replace 100 with 25 (which is good since
+ // y % 25 == 0 requires one fewer instruction than y % 100 == 0 [2]):
+ // return y % 25 == 0 ? y % 16 == 0 : y % 4 == 0.  // #2
+ // Indeed, first assume y % 4 != 0.  Then y % 16 != 0 and hence,
+ // y % 4 == 0 and y % 16 == 0 are both false.  Therefore, #2 returns
+ // false as it should (regardless of y % 25.) Now assume y % 4 == 0.  In
+ // this case, y % 25 == 0 if, and only if, y % 100 == 0, that is, #1 and
+ // #2 are equivalent.  Finally, #2 is equivalent to
+ // return (y & (y % 25 == 0 ? 15 : 3)) == 0.

  // References:
  // [1] https://github.com/cassioneri/calendar
- // [2] https://accu.org/journals/overload/28/155/overload155.pdf#page=16
-
- // Furthermore, if y%100 == 0, then y%400==0 is equivalent to y%16==0,
- // so we can simplify it to (!mult_100 && y % 4 == 0) || y % 16 == 0,
- // which is equivalent to (y & (mult_100 ? 15 : 3)) == 0.
- // See https://gcc.gnu.org/pipermail/libstdc++/2021-June/052815.html
-
- constexpr uint32_t __multiplier   = 42949673;
- constexpr uint32_t __bound= 42949669;
- constexpr uint32_t __max_dividend = 1073741799;
- constexpr uint32_t __offset   = __max_dividend / 2 / 100 * 100;
- const bool __is_multiple_of_100
-  = __multiplier * (_M_y + __offset) < __bound;
- return (_M_y & (__is_multiple_of_100 ? 15 : 3)) == 0;
+ // [2] https://godbolt.org/z/55G8rn77e
+ // [3] https://gcc.gnu.org/pipermail/libstdc++/2021-June/052815.html
+
+ return (_M_y & (_M_y % 25 == 0 ? 15 : 3)) == 0;
   }

   explicit constexpr


[PATCH v2] Remove unnecessary "& 1" in year_month_day_last::day()

2023-11-10 Thread Cassio Neri
When year_month_day_last::day() was implemented, Dr. Matthias Kretz realised
that the operation "& 1" wasn't necessary but we did not patch it at that
time. This patch removes the unnecessary operation.
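
As a sanity check of that claim, the bitwise expression without "& 1" can
be compared against a plain look-up table for every month other than
February. This is a sketch only; the helper names are hypothetical and do
not appear in libstdc++.

```cpp
#include <cassert>

// Expression used by the patch for __m != 2 (no "& 1").
constexpr unsigned last_day_bitwise(unsigned m) noexcept
{ return (m ^ (m >> 3)) | 30; }

// Reference: month lengths, with February left as 0 because it is
// handled separately (28 or 29).
constexpr unsigned last_day_table(unsigned m) noexcept
{
  constexpr unsigned days[] = {31, 0, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
  return days[m - 1];
}

// True if both agree for all months 1..12 except February.
constexpr bool months_agree() noexcept
{
  for (unsigned m = 1; m <= 12; ++m)
    if (m != 2 && last_day_bitwise(m) != last_day_table(m))
      return false;
  return true;
}

static_assert(months_agree(), "bitwise form must match the table");

inline void check_some_months()
{
  assert(last_day_bitwise(1) == 31);   // January
  assert(last_day_bitwise(9) == 30);   // September
}
```

The higher bits of m ^ (m >> 3) are all already set in 30, so only bit 0
can change the result of the "| 30".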

libstdc++-v3/ChangeLog:

* include/std/chrono:

---
 libstdc++-v3/include/std/chrono | 24 ++--
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/libstdc++-v3/include/std/chrono
b/libstdc++-v3/include/std/chrono
index 10e868e5a03..a826982803b 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -1800,22 +1800,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
  const auto __m = static_cast<unsigned>(month());

- // Excluding February, the last day of month __m is either 30 or 31 or,
- // in another words, it is 30 + b = 30 | b, where b is in {0, 1}.
+ // The result is unspecified if __m < 1 or __m > 12.  Hence, assume
+ // 1 <= __m <= 12.  For __m != 2, day() == 30 or day() == 31 or, in
+ // other words, day () == 30 | b, where b is in {0, 1}.

- // If __m in {1, 3, 4, 5, 6, 7}, then b is 1 if, and only if __m is odd.
- // Hence, b = __m & 1 = (__m ^ 0) & 1.
+ // If __m in {1, 3, 4, 5, 6, 7}, then b is 1 if, and only if, __m is
+ // odd.  Hence, b = __m & 1 = (__m ^ 0) & 1.

- // If __m in {8, 9, 10, 11, 12}, then b is 1 if, and only if __m is even.
- // Hence, b = (__m ^ 1) & 1.
+ // If __m in {8, 9, 10, 11, 12}, then b is 1 if, and only if, __m is
+ // even.  Hence, b = (__m ^ 1) & 1.

  // Therefore, b = (__m ^ c) & 1, where c = 0, if __m < 8, or c = 1 if
  // __m >= 8, that is, c = __m >> 3.

- // The above mathematically justifies this implementation whose
- // performance does not depend on look-up tables being on the L1 cache.
- return chrono::day{__m != 2 ? ((__m ^ (__m >> 3)) & 1) | 30
-: _M_y.is_leap() ? 29 : 28};
+ // Since 30 = (11110)_2 and __m <= 31 = (11111)_2, the "& 1" in b's
+ // calculation is unnecessary.
+
+ // The performance of this implementation does not depend on look-up
+ // tables being on the L1 cache.
+ return chrono::day{__m != 2 ? (__m ^ (__m >> 3)) | 30
+  : _M_y.is_leap() ? 29 : 28};
   }

   constexpr


[PATCH v2] c++: fix parsing with auto(x) [PR112410]

2023-11-10 Thread Marek Polacek
On Thu, Nov 09, 2023 at 07:07:03PM -0500, Jason Merrill wrote:
> On 11/9/23 14:58, Marek Polacek wrote:
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > -- >8 --
> > Here we are wrongly parsing
> > 
> >int y(auto(42));
> > 
> > which uses the C++23 cast-to-prvalue feature, and initializes y to 42.
> > However, we were treating the auto as an implicit template parameter.
> > 
> > Fixing the auto{42} case is easy, but when auto is followed by a (,
> > I found the fix to be much more involved.  For instance, we cannot
> > use cp_parser_expression, because that can give hard errors.  It's
> > also necessary to disambiguate 'auto(i)' as 'auto i', not a cast.
> > auto(), auto(int), auto(f)(int), auto(*), auto(i[]), auto(...), etc.
> > are all function declarations.  We have to look at more than one
> > token to decide.
> 
> Yeah, this is a most vexing parse problem.  The code is synthesizing
> template parameters before we've resolved whether the auto is a
> decl-specifier or not.
> 
> > In this fix, I'm (ab)using cp_parser_declarator, with member_p=false
> > so that it doesn't commit.  But it handles even more complicated
> > cases as
> > 
> >int fn (auto (*const **)(int) -> char);
> 
> But it doesn't seem to handle the extremely vexing
> 
> struct A {
>   A(int,int);
> };
> 
> int main()
> {
>   int a;
>   A b(auto(a), 42);
> }

Argh.  This test should indeed be accepted and is currently rejected,
but it's a different problem: 'b' is at block scope and you can't
have a template there.  But when I put it into a namespace scope,
it shows that my patch doesn't work correctly.  I've added auto-fncast14.C
for the latter and opened c++/112482 for the block-scope problem.
 
> I think we need to stop synthesizing immediately when we see RID_AUTO, and
> instead go back after we successfully parse a declaration and synthesize for
> any autos we saw along the way.  :/

That seems very complicated :(.  I had a different idea though; how
about the following patch?  The idea is that if we see that parsing
the parameter-declaration-list didn't work, we undo what synthesize_
did, and let cp_parser_initializer parse "(auto(42))", which should
succeed.  I checked that after cp_finish_decl y is initialized to 42.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Here we are wrongly parsing

  int y(auto(42));

which uses the C++23 cast-to-prvalue feature, and initializes y to 42.
However, we were treating the auto as an implicit template parameter.

Fixing the auto{42} case is easy, but when auto is followed by a (,
I found the fix to be much more involved.  For instance, we cannot
use cp_parser_expression, because that can give hard errors.  It's
also necessary to disambiguate 'auto(i)' as 'auto i', not a cast.
auto(), auto(int), auto(f)(int), auto(*), auto(i[]), auto(...), etc.
are all function declarations.

This patch rectifies that by undoing the implicit function template
modification.  In the test above, we should notice that the parameter
list is ill-formed, and since we've synthesized an implicit template
parameter, we undo it by calling abort_fully_implicit_template.  Then,
we'll parse the "(auto(42))" as an initializer.

PR c++/112410

gcc/cp/ChangeLog:

* parser.cc (cp_parser_simple_type_specifier): Disambiguate
between a variable and function declaration with auto.
(cp_parser_parameter_declaration_clause): Maybe call
abort_fully_implicit_template if it turned out the parameter list was
ill-formed.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/auto-fncast13.C: New test.
* g++.dg/cpp23/auto-fncast14.C: New test.
---
 gcc/cp/parser.cc   | 27 +-
 gcc/testsuite/g++.dg/cpp23/auto-fncast13.C | 61 ++
 gcc/testsuite/g++.dg/cpp23/auto-fncast14.C |  9 
 3 files changed, 96 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast13.C
 create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast14.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 5116bcb78f6..947351b09b8 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -19991,6 +19991,8 @@ cp_parser_simple_type_specifier (cp_parser* parser,
  /* The 'auto' might be the placeholder return type for a function decl
 with trailing return type.  */
  bool have_trailing_return_fn_decl = false;
+ /* Or it might be auto(x) or auto {x}.  */
+ bool decay_copy = false;
 
  cp_parser_parse_tentatively (parser);
  cp_lexer_consume_token (parser->lexer);
@@ -20008,6 +20010,11 @@ cp_parser_simple_type_specifier (cp_parser* parser,
 /*consume_paren*/true);
  continue;
}
+ else if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE))
+   {
+ decay_copy = true;
+ break;
+

[PATCH] c-family: Let libcpp know when the compilation is for a PCH [PR9471]

2023-11-10 Thread Lewis Hyatt
Hello-

The PR may be 20 years old, but by now it only needs a one-line fix :). Is
it OK please? Bootstrapped + regtested all languages on x86-64 Linux.
Thanks!

-Lewis

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9471
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47857

-- >8 --

libcpp will generate diagnostics when it encounters things in the main file
that only belong in a header file, such as `#pragma once' or `#pragma GCC
system_header'. But sometimes the main file is a header file that is just
being compiled separately, e.g. to produce a C++ module or a PCH, in which
case such diagnostics should be suppressed. libcpp already has an interface
to request that, so make use of it in the C frontends to prevent libcpp from
issuing unwanted diagnostics when compiling a PCH.

gcc/c-family/ChangeLog:

PR pch/9471
PR pch/47857
* c-opts.cc (c_common_post_options): Set cpp_opts->main_search
so libcpp knows it is compiling a header file separately.

gcc/testsuite/ChangeLog:

PR pch/9471
PR pch/47857
* g++.dg/pch/main-file-warnings.C: New test.
* g++.dg/pch/main-file-warnings.Hs: New test.
* gcc.dg/pch/main-file-warnings.c: New test.
* gcc.dg/pch/main-file-warnings.hs: New test.
---
 gcc/c-family/c-opts.cc | 3 +++
 gcc/testsuite/g++.dg/pch/main-file-warnings.C  | 7 +++
 gcc/testsuite/g++.dg/pch/main-file-warnings.Hs | 3 +++
 gcc/testsuite/gcc.dg/pch/main-file-warnings.c  | 7 +++
 gcc/testsuite/gcc.dg/pch/main-file-warnings.hs | 3 +++
 5 files changed, 23 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/pch/main-file-warnings.C
 create mode 100644 gcc/testsuite/g++.dg/pch/main-file-warnings.Hs
 create mode 100644 gcc/testsuite/gcc.dg/pch/main-file-warnings.c
 create mode 100644 gcc/testsuite/gcc.dg/pch/main-file-warnings.hs

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index fbabd1816c1..10403c03bd6 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1174,6 +1174,9 @@ c_common_post_options (const char **pfilename)
   "the %qs debug info cannot be used with "
   "pre-compiled headers",
   debug_set_names (write_symbols & ~DWARF2_DEBUG));
+ /* Let libcpp know that the main file is a header so it won't
+complain about things like #include_next and #pragma once.  */
+ cpp_opts->main_search = CMS_header;
}
   else if (write_symbols != NO_DEBUG && write_symbols != DWARF2_DEBUG)
c_common_no_more_pch ();
diff --git a/gcc/testsuite/g++.dg/pch/main-file-warnings.C 
b/gcc/testsuite/g++.dg/pch/main-file-warnings.C
new file mode 100644
index 000..a9e8b0ba9f2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/main-file-warnings.C
@@ -0,0 +1,7 @@
+/* PR pch/9471 */
+/* PR pch/47857 */
+/* Test will fail if any warnings get issued while compiling the header into a PCH.  */
+#include "main-file-warnings.H"
+#pragma once /* { dg-warning "in main file" } */
+#pragma GCC system_header /* { dg-warning "outside include file" } */
+#include_next  /* { dg-warning "in primary source file" } */
diff --git a/gcc/testsuite/g++.dg/pch/main-file-warnings.Hs 
b/gcc/testsuite/g++.dg/pch/main-file-warnings.Hs
new file mode 100644
index 000..d1582bb8290
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/main-file-warnings.Hs
@@ -0,0 +1,3 @@
+#pragma once /* { dg-bogus "in main file" } */
+#pragma GCC system_header /* { dg-bogus "outside include file" } */
+#include_next  /* { dg-bogus "in primary source file" } */
diff --git a/gcc/testsuite/gcc.dg/pch/main-file-warnings.c 
b/gcc/testsuite/gcc.dg/pch/main-file-warnings.c
new file mode 100644
index 000..aedbc15f7ba
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pch/main-file-warnings.c
@@ -0,0 +1,7 @@
+/* PR pch/9471 */
+/* PR pch/47857 */
+/* Test will fail if any warnings get issued while compiling the header into a PCH.  */
+#include "main-file-warnings.h"
+#pragma once /* { dg-warning "in main file" } */
+#pragma GCC system_header /* { dg-warning "outside include file" } */
+#include_next  /* { dg-warning "in primary source file" } */
diff --git a/gcc/testsuite/gcc.dg/pch/main-file-warnings.hs 
b/gcc/testsuite/gcc.dg/pch/main-file-warnings.hs
new file mode 100644
index 000..d1582bb8290
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pch/main-file-warnings.hs
@@ -0,0 +1,3 @@
+#pragma once /* { dg-bogus "in main file" } */
+#pragma GCC system_header /* { dg-bogus "outside include file" } */
+#include_next  /* { dg-bogus "in primary source file" } */


Re: [PATCH] libgcc/m68k: Fixes for soft float

2023-11-10 Thread Keith Packard

> I pushed this to the trunk after fixing a few minor whitespace nits. 
> You didn't mention the divdf change, but I'll assume that was just an 
> oversight.

Yeah, a couple of minor fixes there that I forgot to mention in the log.

> I'm largely trusting your reputation on the fpgnulib changes.  I won't 
> claim to know that code at all.  The assembly bits were simple enough 
> that I could make out what you were doing relatively easily.

Thanks for that review -- m68k assembly isn't my strongest language. The
kludge to return pointers in both d1 and a1 was a bit ugly, but seemed
like a much more robust plan than attempting to use different registers
depending on the target ABI...

The real check for these fixes was to run a fairly comprehensive C
library test suite (part of picolibc) and just iterate until I stopped
getting failures. Those tests have found so many corner cases in both
the C library, FPU emulation and compilers ...

-- 
-keith




[committed] libstdc++: Do not use assume attribute for Clang [PR112467]

2023-11-10 Thread Jonathan Wakely
Tested x86_64-linux (-m32 and -m64). Pushed to trunk.

-- >8 --

Clang has an 'assume' attribute, but it's a function attribute not a
statement attribute. The recently-added use of the statement form causes
an error with Clang.
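
For code that wants to keep an optimization hint on both compilers, one
possible shape is sketched below. Note the assumptions: the patch itself
simply omits the assumption for Clang, whereas this sketch falls back to
Clang's __builtin_assume; SKETCH_ASSUME and bit_mask are hypothetical
names, not libstdc++ ones.

```cpp
#include <cassert>

// Pick a spelling of "assume" that the current compiler accepts.
#if defined(__clang__)
  // Clang: use the builtin, not the statement attribute.
# define SKETCH_ASSUME(expr) __builtin_assume(expr)
#elif defined(__has_attribute)
# if __has_attribute(__assume__)
  // GCC with the statement form of the attribute.
#  define SKETCH_ASSUME(expr) __attribute__((__assume__(expr)))
# endif
#endif
#ifndef SKETCH_ASSUME
  // Anything else: no hint at all.
# define SKETCH_ASSUME(expr) ((void)0)
#endif

// The caller promises offset < 32; the hint lets the optimizer rely on it.
inline unsigned long bit_mask(unsigned offset) noexcept
{
  SKETCH_ASSUME(offset < 32u);
  return 1ul << offset;
}

inline void check_bit_mask()
{
  assert(bit_mask(0) == 1ul);
  assert(bit_mask(3) == 8ul);
}
```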

libstdc++-v3/ChangeLog:

PR libstdc++/112467
* include/bits/stl_bvector.h (_M_assume_normalized): Do not use
statement form of assume attribute for Clang.
---
 libstdc++-v3/include/bits/stl_bvector.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libstdc++-v3/include/bits/stl_bvector.h 
b/libstdc++-v3/include/bits/stl_bvector.h
index 2b91af2005f..64f04c1f4f5 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -185,8 +185,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 void
 _M_assume_normalized() const
 {
+#if __has_attribute(__assume__) && !defined(__clang__)
   unsigned int __ofst = _M_offset;
   __attribute__ ((__assume__ (__ofst < unsigned(_S_word_bit))));
+#endif
 }
 
 _GLIBCXX20_CONSTEXPR
-- 
2.41.0



[committed] libstdc++: Simplify std::string_view comparisons (LWG 3950)

2023-11-10 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

LWG 3950 points out that the comparisons of std::basic_string_view can
be simplified to just a single overload of operator== and a single
overload of operator<=>. Those overloads work fine for homogeneous
comparisons of two string view objects.
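
From the user's side nothing should change. As a rough illustration
(the helper name is hypothetical), comparisons of a string view with
another view, a std::string, and a string literal are all expected to
keep compiling and to give the same answers:

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Exercise homogeneous and mixed comparisons in both argument orders.
inline bool compares_equal_everywhere(std::string_view sv,
                                      const std::string& s,
                                      const char* p)
{
  std::string_view other = sv;   // view == view
  return sv == other
      && sv == s && s == sv      // view vs. std::string
      && sv == p && p == sv;     // view vs. C string
}

inline void check_string_view_comparisons()
{
  assert(compares_equal_everywhere("abc", "abc", "abc"));
  assert(!compares_equal_everywhere("abc", "abd", "abc"));
}
```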

libstdc++-v3/ChangeLog:

* include/std/string_view (operator==, operator<=>): Remove
redundant overloads (LWG 3950).
---
 libstdc++-v3/include/std/string_view | 22 +++---
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/libstdc++-v3/include/std/string_view 
b/libstdc++-v3/include/std/string_view
index 9deae25f712..cf288ed3a36 100644
--- a/libstdc++-v3/include/std/string_view
+++ b/libstdc++-v3/include/std/string_view
@@ -602,13 +602,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // deduction and the other argument gets implicitly converted to the deduced
   // type (see N3766).
 
-  template<typename _CharT, typename _Traits>
-[[nodiscard]]
-constexpr bool
-operator==(basic_string_view<_CharT, _Traits> __x,
-   basic_string_view<_CharT, _Traits> __y) noexcept
-{ return __x.size() == __y.size() && __x.compare(__y) == 0; }
-
   template<typename _CharT, typename _Traits>
 [[nodiscard]]
 constexpr bool
@@ -618,14 +611,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { return __x.size() == __y.size() && __x.compare(__y) == 0; }
 
 #if __cpp_lib_three_way_comparison
-  template<typename _CharT, typename _Traits>
-[[nodiscard]]
-constexpr auto
-operator<=>(basic_string_view<_CharT, _Traits> __x,
-   basic_string_view<_CharT, _Traits> __y) noexcept
--> decltype(__detail::__char_traits_cmp_cat<_Traits>(0))
-{ return __detail::__char_traits_cmp_cat<_Traits>(__x.compare(__y)); }
-
   template<typename _CharT, typename _Traits>
 [[nodiscard]]
 constexpr auto
@@ -635,6 +620,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 -> decltype(__detail::__char_traits_cmp_cat<_Traits>(0))
 { return __detail::__char_traits_cmp_cat<_Traits>(__x.compare(__y)); }
 #else
+  template<typename _CharT, typename _Traits>
+[[nodiscard]]
+constexpr bool
+operator==(basic_string_view<_CharT, _Traits> __x,
+  basic_string_view<_CharT, _Traits> __y) noexcept
+{ return __x.size() == __y.size() && __x.compare(__y) == 0; }
+
   template<typename _CharT, typename _Traits>
 [[nodiscard]]
 constexpr bool
-- 
2.41.0



[committed] libstdc++: Add static_assert to std::integer_sequence [PR112473]

2023-11-10 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

C++20 allows class types as non-type template parameters, but
std::integer_sequence explicitly disallows them. Enforce that.
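
The idea can be shown on a stand-alone type. This is a sketch under
stated assumptions: checked_sequence is a hypothetical name, and the
assertion is applied unconditionally here, whereas the patch enables it
only for C++20 and later.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// A sequence type that rejects non-integral element types at compile time.
template<typename T, T... Idx>
struct checked_sequence
{
  static_assert(std::is_integral_v<T>, "T must be an integral type");
  using value_type = T;
  static constexpr std::size_t size() noexcept { return sizeof...(Idx); }
};

// Integral element types are accepted as before.
static_assert(checked_sequence<int, 1, 2, 3>::size() == 3, "three indices");
static_assert(checked_sequence<char>::size() == 0, "empty sequence");

inline void check_checked_sequence()
{
  assert((checked_sequence<unsigned, 7u, 8u>::size() == 2));
}
```

Instantiating the template with a class type as T (possible since C++20)
would then fail with a static assertion rather than being silently
accepted.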

libstdc++-v3/ChangeLog:

PR libstdc++/112473
* include/bits/utility.h (integer_sequence): Add static_assert.
* testsuite/20_util/integer_sequence/112473.cc: New test.
---
 libstdc++-v3/include/bits/utility.h   | 3 +++
 libstdc++-v3/testsuite/20_util/integer_sequence/112473.cc | 8 
 2 files changed, 11 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/20_util/integer_sequence/112473.cc

diff --git a/libstdc++-v3/include/bits/utility.h 
b/libstdc++-v3/include/bits/utility.h
index 8766dfbc15f..ebcf5ba36b2 100644
--- a/libstdc++-v3/include/bits/utility.h
+++ b/libstdc++-v3/include/bits/utility.h
@@ -166,6 +166,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp, _Tp... _Idx>
 struct integer_sequence
 {
+#if __cplusplus >= 202002L
+  static_assert(is_integral_v<_Tp>);
+#endif
   typedef _Tp value_type;
   static constexpr size_t size() noexcept { return sizeof...(_Idx); }
 };
diff --git a/libstdc++-v3/testsuite/20_util/integer_sequence/112473.cc 
b/libstdc++-v3/testsuite/20_util/integer_sequence/112473.cc
new file mode 100644
index 000..14abfbc8149
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/integer_sequence/112473.cc
@@ -0,0 +1,8 @@
+// { dg-do compile { target c++20 } }
+
+// PR libstdc++/112473 - integer_sequence accepts non-integer types
+
+#include <utility>
+
+std::integer_sequence<std::pair<int, int>, std::pair<int, int>{0, 0}> ic;
+// { dg-error "static assertion failed" "" { target *-*-* } 0 }
-- 
2.41.0



[committed] libstdc++: Fix broken tests for <complex.h>

2023-11-10 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

When I added these tests I gave them .h file extensions, so they've
never been run.

They need to use the no_pch option, so that they only test the
<complex.h> header and don't get <complex> via <bits/stdc++.h>.

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/headers/complex.h/std_c++11.h: Moved to...
* testsuite/26_numerics/headers/complex.h/std_c++11.cc: ...here.
* testsuite/26_numerics/headers/complex.h/std_c++98.h: Moved to...
* testsuite/26_numerics/headers/complex.h/std_c++98.cc: ...here.
Check macro first and then #undef.
* testsuite/26_numerics/headers/complex.h/std_gnu++11.h: Moved to...
* testsuite/26_numerics/headers/complex.h/std_gnu++11.cc: ...here.
---
 .../complex.h/{std_c++11.h => std_c++11.cc}|  4 +++-
 .../complex.h/{std_c++98.h => std_c++98.cc}| 14 --
 .../complex.h/{std_gnu++11.h => std_gnu++11.cc}|  3 ++-
 3 files changed, 13 insertions(+), 8 deletions(-)
 rename libstdc++-v3/testsuite/26_numerics/headers/complex.h/{std_c++11.h => 
std_c++11.cc} (91%)
 rename libstdc++-v3/testsuite/26_numerics/headers/complex.h/{std_c++98.h => 
std_c++98.cc} (87%)
 rename libstdc++-v3/testsuite/26_numerics/headers/complex.h/{std_gnu++11.h => 
std_gnu++11.cc} (95%)

diff --git a/libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_c++11.h 
b/libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_c++11.cc
similarity index 91%
rename from libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_c++11.h
rename to libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_c++11.cc
index f74b13498d7..5cac1218163 100644
--- a/libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_c++11.h
+++ b/libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_c++11.cc
@@ -15,7 +15,9 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-// { dg-options "-std=c++11" }
+// { dg-do compile { target c++11 } }
+// { dg-add-options strict_std }
+// { dg-add-options no_pch }
 
 #include <complex.h>
 
diff --git a/libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_c++98.h 
b/libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_c++98.cc
similarity index 87%
rename from libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_c++98.h
rename to libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_c++98.cc
index 79facef8d5b..4c9bd6e6a08 100644
--- a/libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_c++98.h
+++ b/libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_c++98.cc
@@ -15,13 +15,19 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-// { dg-options "-std=c++98" }
+// { dg-do compile { target c++98_only } }
+// { dg-add-options strict_std }
+// { dg-add-options no_pch }
 
 #include <complex.h>
 
-// Should be equivalent to C99 , not C++ 
+// Should be equivalent to C99 , not C++ 
+#ifndef complex
+# error "'complex' is not defined as a macro by <complex.h> for -std=c++98"
+#endif
 namespace std
 {
+#undef complex
   struct complex;
 }
 
@@ -49,7 +55,3 @@ namespace test
   using ::creal;
 }
 #endif
-
-#ifndef complex
-# error "'complex' is not defined as a macro by <complex.h> for -std=c++98"
-#endif
diff --git a/libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_gnu++11.h 
b/libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_gnu++11.cc
similarity index 95%
rename from libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_gnu++11.h
rename to libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_gnu++11.cc
index 20c55a5944e..4a6fc00d390 100644
--- a/libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_gnu++11.h
+++ b/libstdc++-v3/testsuite/26_numerics/headers/complex.h/std_gnu++11.cc
@@ -15,7 +15,8 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-// { dg-options "-std=gnu++11" }
+// { dg-do compile { target c++11 } }
+// { dg-add-options no_pch }
 
 #include <complex.h>
 
-- 
2.41.0



[committed] libstdc++: Deprecate std::atomic_xxx overloads for std::shared_ptr

2023-11-10 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

These overloads are deprecated in C++20 (and likely to be removed for
C++26). The std::atomic<std::shared_ptr<T>> specialization should be
preferred in new code.
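
A migration might look roughly like the sketch below. The class
shared_config is hypothetical. The code uses std::atomic<std::shared_ptr<T>>
when the library advertises it through the __cpp_lib_atomic_shared_ptr
feature-test macro, and otherwise falls back to the free functions that
this patch deprecates.

```cpp
#include <atomic>
#include <cassert>
#include <memory>

// Holds a value that can be replaced and read atomically.
class shared_config
{
#ifdef __cpp_lib_atomic_shared_ptr
  // Preferred form: the atomic specialization for shared_ptr.
  std::atomic<std::shared_ptr<const int>> value_;

public:
  explicit shared_config(int v) : value_(std::make_shared<const int>(v)) { }

  void set(int v) { value_.store(std::make_shared<const int>(v)); }
  int get() const { return *value_.load(); }
#else
  // Fallback: the older free-function overloads for shared_ptr.
  std::shared_ptr<const int> value_;

public:
  explicit shared_config(int v) : value_(std::make_shared<const int>(v)) { }

  void set(int v) { std::atomic_store(&value_, std::make_shared<const int>(v)); }
  int get() const { return *std::atomic_load(&value_); }
#endif
};

inline void check_shared_config()
{
  shared_config c(1);
  assert(c.get() == 1);
  c.set(2);
  assert(c.get() == 2);
}
```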

libstdc++-v3/ChangeLog:

* include/bits/shared_ptr_atomic.h (atomic_is_lock_free)
(atomic_load_explicit, atomic_load, atomic_store_explicit)
(atomic_store, atomic_exchange_explicit, atomic_exchange)
(atomic_compare_exchange_strong, atomic_compare_exchange_weak)
(atomic_compare_exchange_strong_explicit)
(atomic_compare_exchange_weak_explicit): Add deprecated
attribute for C++20 and later.
* testsuite/20_util/shared_ptr/atomic/1.cc: Suppress deprecated
warnings.
* testsuite/20_util/shared_ptr/atomic/2.cc: Likewise.
* testsuite/20_util/shared_ptr/atomic/3.cc: Likewise.
* testsuite/29_atomics/atomic/lwg3220.cc: Likewise.
---
 libstdc++-v3/include/bits/shared_ptr_atomic.h | 22 +++
 .../testsuite/20_util/shared_ptr/atomic/1.cc  |  1 +
 .../testsuite/20_util/shared_ptr/atomic/2.cc  |  1 +
 .../testsuite/20_util/shared_ptr/atomic/3.cc  |  1 +
 .../testsuite/29_atomics/atomic/lwg3220.cc|  1 +
 5 files changed, 26 insertions(+)

diff --git a/libstdc++-v3/include/bits/shared_ptr_atomic.h 
b/libstdc++-v3/include/bits/shared_ptr_atomic.h
index ae2d1b7a094..5b818fe4456 100644
--- a/libstdc++-v3/include/bits/shared_ptr_atomic.h
+++ b/libstdc++-v3/include/bits/shared_ptr_atomic.h
@@ -101,6 +101,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @{
   */
   template<typename _Tp, _Lock_policy _Lp>
+_GLIBCXX20_DEPRECATED_SUGGEST("std::atomic<std::shared_ptr<T>>")
 inline bool
 atomic_is_lock_free(const __shared_ptr<_Tp, _Lp>*)
 {
@@ -112,6 +113,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   template<typename _Tp>
+_GLIBCXX20_DEPRECATED_SUGGEST("std::atomic<std::shared_ptr<T>>")
 inline bool
 atomic_is_lock_free(const shared_ptr<_Tp>* __p)
 { return std::atomic_is_lock_free<_Tp, __default_lock_policy>(__p); }
@@ -128,6 +130,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @{
   */
   template<typename _Tp>
+_GLIBCXX20_DEPRECATED_SUGGEST("std::atomic<std::shared_ptr<T>>")
 inline shared_ptr<_Tp>
 atomic_load_explicit(const shared_ptr<_Tp>* __p, memory_order)
 {
@@ -136,11 +139,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   template<typename _Tp>
+_GLIBCXX20_DEPRECATED_SUGGEST("std::atomic<std::shared_ptr<T>>")
 inline shared_ptr<_Tp>
 atomic_load(const shared_ptr<_Tp>* __p)
 { return std::atomic_load_explicit(__p, memory_order_seq_cst); }
 
   template<typename _Tp, _Lock_policy _Lp>
+_GLIBCXX20_DEPRECATED_SUGGEST("std::atomic<std::shared_ptr<T>>")
 inline __shared_ptr<_Tp, _Lp>
 atomic_load_explicit(const __shared_ptr<_Tp, _Lp>* __p, memory_order)
 {
@@ -149,6 +154,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   template<typename _Tp, _Lock_policy _Lp>
+_GLIBCXX20_DEPRECATED_SUGGEST("std::atomic<std::shared_ptr<T>>")
 inline __shared_ptr<_Tp, _Lp>
 atomic_load(const __shared_ptr<_Tp, _Lp>* __p)
 { return std::atomic_load_explicit(__p, memory_order_seq_cst); }
@@ -164,6 +170,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @{
   */
   template<typename _Tp>
+_GLIBCXX20_DEPRECATED_SUGGEST("std::atomic<std::shared_ptr<T>>")
 inline void
 atomic_store_explicit(shared_ptr<_Tp>* __p, shared_ptr<_Tp> __r,
  memory_order)
@@ -173,11 +180,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   template<typename _Tp>
+_GLIBCXX20_DEPRECATED_SUGGEST("std::atomic<std::shared_ptr<T>>")
 inline void
 atomic_store(shared_ptr<_Tp>* __p, shared_ptr<_Tp> __r)
 { std::atomic_store_explicit(__p, std::move(__r), memory_order_seq_cst); }
 
   template<typename _Tp, _Lock_policy _Lp>
+_GLIBCXX20_DEPRECATED_SUGGEST("std::atomic<std::shared_ptr<T>>")
 inline void
 atomic_store_explicit(__shared_ptr<_Tp, _Lp>* __p,
  __shared_ptr<_Tp, _Lp> __r,
@@ -188,6 +197,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   template<typename _Tp, _Lock_policy _Lp>
+_GLIBCXX20_DEPRECATED_SUGGEST("std::atomic<std::shared_ptr<T>>")
 inline void
 atomic_store(__shared_ptr<_Tp, _Lp>* __p, __shared_ptr<_Tp, _Lp> __r)
 { std::atomic_store_explicit(__p, std::move(__r), memory_order_seq_cst); }
@@ -201,6 +211,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @{
   */
   template<typename _Tp>
+_GLIBCXX20_DEPRECATED_SUGGEST("std::atomic<std::shared_ptr<T>>")
 inline shared_ptr<_Tp>
 atomic_exchange_explicit(shared_ptr<_Tp>* __p, shared_ptr<_Tp> __r,
 memory_order)
@@ -211,6 +222,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   template<typename _Tp>
+_GLIBCXX20_DEPRECATED_SUGGEST("std::atomic<std::shared_ptr<T>>")
 inline shared_ptr<_Tp>
 atomic_exchange(shared_ptr<_Tp>* __p, shared_ptr<_Tp> __r)
 {
@@ -219,6 +231,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   template<typename _Tp, _Lock_policy _Lp>
+_GLIBCXX20_DEPRECATED_SUGGEST("std::atomic<std::shared_ptr<T>>")
 inline __shared_ptr<_Tp, _Lp>
 atomic_exchange_explicit(__shared_ptr<_Tp, _Lp>* __p,
 __shared_ptr<_Tp, _Lp> __r,
@@ -230,6 +243,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   template<typename _Tp, _Lock_policy _Lp>
+_GLIBCXX20_DEPRECATED_SUGGEST("std::atomic<std::shared_ptr<T>>")
 inline __shared_ptr<_Tp, _Lp>
 atomic_exchange(__shared_ptr<_Tp, _Lp>* __p, __shared_ptr<_Tp, _Lp> __r)
 {

[committed] libstdc++: Fix test that fails with -ffreestanding

2023-11-10 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

The -ffreestanding option disables Debug Mode, forcibly #undef'ing
_GLIBCXX_DEBUG. This means that the dangling checks in std::pair are
disabled for -ffreestanding in C++17 and earlier, because they depend on
_GLIBCXX_DEBUG. Adjust the target specifiers for the errors currently
matching c++17_down so they also require the hosted effective target.

libstdc++-v3/ChangeLog:

* testsuite/20_util/pair/dangling_ref.cc: Add hosted effective
target for specifiers using c++17_down.
---
 .../testsuite/20_util/pair/dangling_ref.cc| 20 +--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/libstdc++-v3/testsuite/20_util/pair/dangling_ref.cc 
b/libstdc++-v3/testsuite/20_util/pair/dangling_ref.cc
index 8e0c34816dd..ca238f9cbd1 100644
--- a/libstdc++-v3/testsuite/20_util/pair/dangling_ref.cc
+++ b/libstdc++-v3/testsuite/20_util/pair/dangling_ref.cc
@@ -22,15 +22,15 @@ void
 test_binary_ctors()
 {
   std::pair p1(1L, 2);
-  // { dg-error "here" "" { target c++17_down } 24 }
+  // { dg-error "here" "" { target { c++17_down && hosted } } 24 }
   // { dg-error "use of deleted function" "" { target c++20 } 24 }
 
   std::pair p2(1, 2L);
-  // { dg-error "here" "" { target c++17_down } 28 }
+  // { dg-error "here" "" { target { c++17_down && hosted } } 28 }
   // { dg-error "use of deleted function" "" { target c++20 } 28 }
 
   std::pair p3(1L, 2L);
-  // { dg-error "here" "" { target c++17_down } 32 }
+  // { dg-error "here" "" { target { c++17_down && hosted } } 32 }
   // { dg-error "use of deleted function" "" { target c++20 } 32 }
 }
 
@@ -40,28 +40,28 @@ test_converting_ctors()
   std::pair p0;
 
   std::pair p1(p0);
-  // { dg-error "here" "" { target c++17_down } 42 }
+  // { dg-error "here" "" { target { c++17_down && hosted } } 42 }
   // { dg-error "use of deleted function" "" { target c++20 } 42 }
 
   std::pair p2(p0);
-  // { dg-error "here" "" { target c++17_down } 46 }
+  // { dg-error "here" "" { target { c++17_down && hosted } } 46 }
   // { dg-error "use of deleted function" "" { target c++20 } 46 }
 
   std::pair p3(p0);
-  // { dg-error "here" "" { target c++17_down } 50 }
+  // { dg-error "here" "" { target { c++17_down && hosted } } 50 }
   // { dg-error "use of deleted function" "" { target c++20 } 50 }
 
   std::pair p4(std::move(p0));
-  // { dg-error "here" "" { target c++17_down } 54 }
+  // { dg-error "here" "" { target { c++17_down && hosted } } 54 }
   // { dg-error "use of deleted function" "" { target c++20 } 54 }
 
   std::pair p5(std::move(p0));
-  // { dg-error "here" "" { target c++17_down } 58 }
+  // { dg-error "here" "" { target { c++17_down && hosted } } 58 }
   // { dg-error "use of deleted function" "" { target c++20 } 58 }
 
   std::pair p6(std::move(p0));
-  // { dg-error "here" "" { target c++17_down } 62 }
+  // { dg-error "here" "" { target { c++17_down && hosted } } 62 }
   // { dg-error "use of deleted function" "" { target c++20 } 62 }
 }
 
-// { dg-error "static assert.* dangling reference" "" { target { c++17_down } } 0 }
+// { dg-error "static assert.* dangling reference" "" { target { c++17_down && hosted } } 0 }
-- 
2.41.0



[committed] libstdc++: Add [[nodiscard]] to std::span members

2023-11-10 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

All std::span member functions are pure functions that have no side
effects. They are only useful for their return value, so they should all
warn if that value is not used.
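
As a rough illustration of the effect (a sketch, not part of the patch): once an observer is marked [[nodiscard]], a statement that calls it and ignores the result is expected to draw a warning, while code that uses the value is unaffected. The type `int_view` below is a hypothetical, C++17-compatible stand-in for std::span, and `sum_first_and_last` is a made-up helper.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for std::span<const int>; only the attribute
// placement mirrors what the patch describes.
struct int_view
{
  const int* ptr;
  std::size_t len;

  [[nodiscard]] constexpr std::size_t size() const noexcept { return len; }
  [[nodiscard]] constexpr bool empty() const noexcept { return len == 0; }
  [[nodiscard]] constexpr const int& front() const noexcept { return *ptr; }
  [[nodiscard]] constexpr const int& back() const noexcept
  { return *(ptr + (len - 1)); }
};

// Every call below uses its return value, so no warning is expected.
constexpr int sum_first_and_last(int_view v)
{
  // v.size();   // result ignored: this is the kind of call that should warn
  return v.empty() ? 0 : v.front() + v.back();
}
```

A caller that really does want to ignore a result can still cast the call to void, which is presumably how the existing *_neg.cc tests can suppress the new warning.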

libstdc++-v3/ChangeLog:

* include/std/span (span, as_bytes, as_writable_bytes): Add
[[nodiscard]] attribute on all non-void functions.
* testsuite/23_containers/span/back_assert_neg.cc: Suppress
nodiscard warning.
* testsuite/23_containers/span/back_neg.cc: Likewise.
* testsuite/23_containers/span/first_2_assert_neg.cc: Likewise.
* testsuite/23_containers/span/first_assert_neg.cc: Likewise.
* testsuite/23_containers/span/first_neg.cc: Likewise.
* testsuite/23_containers/span/front_assert_neg.cc: Likewise.
* testsuite/23_containers/span/front_neg.cc: Likewise.
* testsuite/23_containers/span/index_op_assert_neg.cc: Likewise.
* testsuite/23_containers/span/index_op_neg.cc: Likewise.
* testsuite/23_containers/span/last_2_assert_neg.cc: Likewise.
* testsuite/23_containers/span/last_assert_neg.cc: Likewise.
* testsuite/23_containers/span/last_neg.cc: Likewise.
* testsuite/23_containers/span/subspan_2_assert_neg.cc:
Likewise.
* testsuite/23_containers/span/subspan_3_assert_neg.cc:
Likewise.
* testsuite/23_containers/span/subspan_4_assert_neg.cc:
Likewise.
* testsuite/23_containers/span/subspan_5_assert_neg.cc:
Likewise.
* testsuite/23_containers/span/subspan_6_assert_neg.cc:
Likewise.
* testsuite/23_containers/span/subspan_assert_neg.cc: Likewise.
* testsuite/23_containers/span/subspan_neg.cc: Likewise.
* testsuite/23_containers/span/nodiscard.cc: New test.
---
 libstdc++-v3/include/std/span | 26 -
 .../23_containers/span/back_assert_neg.cc |  2 +-
 .../testsuite/23_containers/span/back_neg.cc  |  2 +-
 .../23_containers/span/first_2_assert_neg.cc  |  2 +-
 .../23_containers/span/first_assert_neg.cc|  2 +-
 .../testsuite/23_containers/span/first_neg.cc |  2 +-
 .../23_containers/span/front_assert_neg.cc|  2 +-
 .../testsuite/23_containers/span/front_neg.cc |  2 +-
 .../23_containers/span/index_op_assert_neg.cc |  2 +-
 .../23_containers/span/index_op_neg.cc|  2 +-
 .../23_containers/span/last_2_assert_neg.cc   |  2 +-
 .../23_containers/span/last_assert_neg.cc |  2 +-
 .../testsuite/23_containers/span/last_neg.cc  |  2 +-
 .../testsuite/23_containers/span/nodiscard.cc | 58 +++
 .../span/subspan_2_assert_neg.cc  |  2 +-
 .../span/subspan_3_assert_neg.cc  |  2 +-
 .../span/subspan_4_assert_neg.cc  |  2 +-
 .../span/subspan_5_assert_neg.cc  |  2 +-
 .../span/subspan_6_assert_neg.cc  |  2 +-
 .../23_containers/span/subspan_assert_neg.cc  |  2 +-
 .../23_containers/span/subspan_neg.cc |  6 +-
 21 files changed, 103 insertions(+), 23 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/23_containers/span/nodiscard.cc

diff --git a/libstdc++-v3/include/std/span b/libstdc++-v3/include/std/span
index d5644a196a2..90d08f18d2c 100644
--- a/libstdc++-v3/include/std/span
+++ b/libstdc++-v3/include/std/span
@@ -246,20 +246,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // observers
 
+  [[nodiscard]]
   constexpr size_type
   size() const noexcept
   { return this->_M_extent._M_extent(); }
 
+  [[nodiscard]]
   constexpr size_type
   size_bytes() const noexcept
   { return this->_M_extent._M_extent() * sizeof(element_type); }
 
-  [[nodiscard]] constexpr bool
+  [[nodiscard]]
+  constexpr bool
   empty() const noexcept
   { return size() == 0; }
 
   // element access
 
+  [[nodiscard]]
   constexpr reference
   front() const noexcept
   {
@@ -267,6 +271,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return *this->_M_ptr;
   }
 
+  [[nodiscard]]
   constexpr reference
   back() const noexcept
   {
@@ -274,6 +279,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return *(this->_M_ptr + (size() - 1));
   }
 
+  [[nodiscard]]
   constexpr reference
   operator[](size_type __idx) const noexcept
   {
@@ -281,41 +287,50 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return *(this->_M_ptr + __idx);
   }
 
+  [[nodiscard]]
   constexpr pointer
   data() const noexcept
   { return this->_M_ptr; }
 
   // iterator support
 
+  [[nodiscard]]
   constexpr iterator
   begin() const noexcept
   { return iterator(this->_M_ptr); }
 
+  [[nodiscard]]
   constexpr iterator
   end() const noexcept
   { return iterator(this->_M_ptr + this->size()); }
 
+  [[nodiscard]]
   constexpr reverse_iterator
   rbegin() const noexcept
   { return reverse_iterator(this->end()); }
 
+  [[nodiscard]]
   

[committed] libstdc++: Add [[nodiscard]] to lock types

2023-11-10 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

Adding this attribute means users get a warning when they accidentally
create a temporary lock instead of creating an automatic variable with
block scope.

For std::lock_guard both constructors have side effects (they both take
a mutex and so both cause it to be unlocked at the end of the full
expression when a temporary is constructed). Ideally we would just put
the attribute on the class instead of the constructors, but that doesn't
work with GCC (PR c++/85973).

For std::unique_lock the default constructor and std::defer_lock_t
constructor do not cause any locking or unlocking, so do not need to
give a warning. It might still be a mistake to create a temporary using
those constructors, but it's harmless and seems unlikely anyway. For a
lock object created with one of those constructors you would expect the
lock object to be referred to later in the function, and that would not
even compile if it was constructed as an unnamed temporary.

std::scoped_lock gets the same treatment as std::lock_guard, except that
the explicit specialization for zero lockables has no side effects so
doesn't need to warn.
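
The mistake being targeted can be sketched as follows. This is an illustration only: `counter_mutex`, `counter`, `increment_locked` and `try_increment` are hypothetical names, and the commented-out line shows the temporary form that the attribute is expected to diagnose.

```cpp
#include <cassert>
#include <mutex>

std::mutex counter_mutex;  // hypothetical shared state for the example
int counter = 0;

int increment_locked()
{
  // Accidental temporary: the mutex would be locked and then unlocked
  // again at the end of this full expression, leaving the update below
  // unprotected.  With [[nodiscard]] constructors this should warn.
  //   std::lock_guard<std::mutex>{counter_mutex};

  // Intended form: a named automatic variable holds the mutex until return.
  std::lock_guard<std::mutex> guard(counter_mutex);
  return ++counter;
}

bool try_increment()
{
  // The std::defer_lock_t constructor neither locks nor unlocks, which is
  // why the description above leaves it without the attribute.
  std::unique_lock<std::mutex> lock(counter_mutex, std::defer_lock);
  if (!lock.try_lock())
    return false;
  ++counter;
  return true;
}
```

Note that the deferred lock has to be named in order to call try_lock() on it later, which matches the observation that an unnamed temporary would not be usable in that situation anyway.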

libstdc++-v3/ChangeLog:

* include/bits/std_mutex.h (lock_guard): Add [[nodiscard]]
attribute to constructors.
* include/bits/unique_lock.h (unique_lock): Likewise.
* include/std/mutex (scoped_lock, scoped_lock<Mutex>): Likewise.
* testsuite/30_threads/lock_guard/cons/nodiscard.cc: New test.
* testsuite/30_threads/scoped_lock/cons/nodiscard.cc: New test.
* testsuite/30_threads/unique_lock/cons/nodiscard.cc: New test.
---
 libstdc++-v3/include/bits/std_mutex.h |  2 +
 libstdc++-v3/include/bits/unique_lock.h   |  5 +++
 libstdc++-v3/include/std/mutex|  5 +++
 .../30_threads/lock_guard/cons/nodiscard.cc   | 20 ++
 .../30_threads/scoped_lock/cons/nodiscard.cc  | 29 ++
 .../30_threads/unique_lock/cons/nodiscard.cc  | 40 +++
 6 files changed, 101 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/30_threads/lock_guard/cons/nodiscard.cc
 create mode 100644 libstdc++-v3/testsuite/30_threads/scoped_lock/cons/nodiscard.cc
 create mode 100644 libstdc++-v3/testsuite/30_threads/unique_lock/cons/nodiscard.cc

diff --git a/libstdc++-v3/include/bits/std_mutex.h b/libstdc++-v3/include/bits/std_mutex.h
index 4693055269d..9ac8c76c9fb 100644
--- a/libstdc++-v3/include/bits/std_mutex.h
+++ b/libstdc++-v3/include/bits/std_mutex.h
@@ -245,9 +245,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 public:
   typedef _Mutex mutex_type;
 
+  [[__nodiscard__]]
   explicit lock_guard(mutex_type& __m) : _M_device(__m)
   { _M_device.lock(); }
 
+  [[__nodiscard__]]
   lock_guard(mutex_type& __m, adopt_lock_t) noexcept : _M_device(__m)
   { } // calling thread owns mutex
 
diff --git a/libstdc++-v3/include/bits/unique_lock.h b/libstdc++-v3/include/bits/unique_lock.h
index c28e6456ad5..07474d26db5 100644
--- a/libstdc++-v3/include/bits/unique_lock.h
+++ b/libstdc++-v3/include/bits/unique_lock.h
@@ -66,6 +66,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   : _M_device(0), _M_owns(false)
   { }
 
+  [[__nodiscard__]]
   explicit unique_lock(mutex_type& __m)
   : _M_device(std::__addressof(__m)), _M_owns(false)
   {
@@ -77,10 +78,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   : _M_device(std::__addressof(__m)), _M_owns(false)
   { }
 
+  [[__nodiscard__]]
   unique_lock(mutex_type& __m, try_to_lock_t)
   : _M_device(std::__addressof(__m)), _M_owns(_M_device->try_lock())
   { }
 
+  [[__nodiscard__]]
   unique_lock(mutex_type& __m, adopt_lock_t) noexcept
   : _M_device(std::__addressof(__m)), _M_owns(true)
   {
@@ -88,6 +91,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
 
   template<typename _Clock, typename _Duration>
+   [[__nodiscard__]]
unique_lock(mutex_type& __m,
const chrono::time_point<_Clock, _Duration>& __atime)
: _M_device(std::__addressof(__m)),
@@ -95,6 +99,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{ }
 
   template<typename _Rep, typename _Period>
+   [[__nodiscard__]]
unique_lock(mutex_type& __m,
const chrono::duration<_Rep, _Period>& __rtime)
: _M_device(std::__addressof(__m)),
diff --git a/libstdc++-v3/include/std/mutex b/libstdc++-v3/include/std/mutex
index bd3a1cbd94d..9d22ce80045 100644
--- a/libstdc++-v3/include/std/mutex
+++ b/libstdc++-v3/include/std/mutex
@@ -744,9 +744,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 class scoped_lock
 {
 public:
+
+  [[nodiscard]]
   explicit scoped_lock(_MutexTypes&... __m) : _M_devices(std::tie(__m...))
   { std::lock(__m...); }
 
+  [[nodiscard]]
   explicit scoped_lock(adopt_lock_t, _MutexTypes&... __m) noexcept
   : _M_devices(std::tie(__m...))
   { } // calling thread owns mutex
@@ -779,9 +782,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 public:
   using mutex_type = _Mutex;
 
+

[committed] libstdc++: Remove handling for underscore-prefixed libm functions [PR111638]

2023-11-10 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

The checks in linkage.m4 try to support math functions prefixed with
underscores, like _acosf and _isinf. However, that doesn't work because
they're renamed to the standard names using a macro, but then <cmath>
undefines that macro again.

This simply removes everything related to those underscored functions.
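
The failure mode can be sketched in a few lines. This is only an illustration under the assumption stated above (a standard header undefining macro versions of the names it declares); `my_func` and `underscored_my_func` are hypothetical names, not real libm functions.

```cpp
#include <cassert>

// Step 1: what a generated config header would do when only the
// underscored variant of a function was detected.
#define my_func underscored_my_func

// Step 2: stand-in for a header that removes any macro definition of
// the name before declaring the real function.
#undef my_func

// Step 3: by the time user code runs, the rename is gone.
#ifdef my_func
constexpr bool rename_survived = true;
#else
constexpr bool rename_survived = false;
#endif
```

Since the rename never survives to the point of use, the extra configure checks bought nothing, which is consistent with simply deleting them.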

libstdc++-v3/ChangeLog:

PR libstdc++/111638
* config.h.in: Regenerate.
* configure: Regenerate.
* linkage.m4 (GLIBCXX_MAYBE_UNDERSCORED_FUNCS): Remove.
(GLIBCXX_CHECK_MATH_DECL_AND_LINKAGE_1): Do not check for _foo.
(GLIBCXX_CHECK_MATH_DECLS_AND_LINKAGES_1): Likewise.
(GLIBCXX_CHECK_MATH_DECL_AND_LINKAGE_2): Likewise.
(GLIBCXX_CHECK_MATH_DECL_AND_LINKAGE_3): Likewise.
(GLIBCXX_CHECK_STDLIB_DECL_AND_LINKAGE_2): Do not use
GLIBCXX_MAYBE_UNDERSCORED_FUNCS.
---
 libstdc++-v3/config.h.in |   506 -
 libstdc++-v3/configure   | 19292 -
 libstdc++-v3/linkage.m4  |51 -
 3 files changed, 19849 deletions(-)

diff --git a/libstdc++-v3/linkage.m4 b/libstdc++-v3/linkage.m4
index 45a09cdf445..29b31447c98 100644
--- a/libstdc++-v3/linkage.m4
+++ b/libstdc++-v3/linkage.m4
@@ -28,27 +28,10 @@ AC_DEFUN([GLIBCXX_CHECK_MATH_DECL_1], [
 ])
 
 
-dnl 
-dnl Define autoheader template for using the underscore functions
-dnl For each parameter, create a macro where if func doesn't exist,
-dnl but _func does, then it will "#define func _func".
-dnl
-dnl GLIBCXX_MAYBE_UNDERSCORED_FUNCS
-AC_DEFUN([GLIBCXX_MAYBE_UNDERSCORED_FUNCS], 
-[AC_FOREACH([glibcxx_ufunc], [$1],
-  [AH_VERBATIM(_[]glibcxx_ufunc,
-[#if defined (]AS_TR_CPP(HAVE__[]glibcxx_ufunc)[) && ! defined (]AS_TR_CPP(HAVE_[]glibcxx_ufunc)[)
-# define ]AS_TR_CPP(HAVE_[]glibcxx_ufunc)[ 1
-# define ]glibcxx_ufunc[ _]glibcxx_ufunc[
-#endif])])
-])
-
-
 dnl
 dnl Check to see if the (math function) argument passed is
 dnl 1) declared when using the c++ compiler
 dnl 2) has "C" linkage
-dnl 3) if not, see if 1) and 2) for argument prepended with '_'
 dnl
 dnl Define HAVE_CARGF etc if "cargf" is declared and links
 dnl
@@ -61,13 +44,7 @@ AC_DEFUN([GLIBCXX_CHECK_MATH_DECL_AND_LINKAGE_1], [
   GLIBCXX_CHECK_MATH_DECL_1($1)
   if test x$glibcxx_cv_func_$1_use = x"yes"; then
 AC_CHECK_FUNCS($1)
-  else
-GLIBCXX_CHECK_MATH_DECL_1(_$1)
-if test x$glibcxx_cv_func__$1_use = x"yes"; then
-  AC_CHECK_FUNCS(_$1)
-fi
   fi
-  GLIBCXX_MAYBE_UNDERSCORED_FUNCS($1)
 ])
 
 
@@ -90,22 +67,7 @@ AC_DEFUN([GLIBCXX_CHECK_MATH_DECLS_AND_LINKAGES_1], [
   AC_MSG_RESULT($glibcxx_cv_func_$2_use)
   if test x$glibcxx_cv_func_$2_use = x"yes"; then
 AC_CHECK_FUNCS(funclist)
-  else
-AC_MSG_CHECKING([for _$1 functions])
-AC_CACHE_VAL(glibcxx_cv_func__$2_use, [
-  AC_LANG_SAVE
-  AC_LANG_CPLUSPLUS
-  AC_TRY_COMPILE([#include <math.h>],
- patsubst(funclist,[\w+],[_\& (0);]),
- [glibcxx_cv_func__$2_use=yes],
- [glibcxx_cv_func__$2_use=no])
-  AC_LANG_RESTORE])
-AC_MSG_RESULT($glibcxx_cv_func__$2_use)
-if test x$glibcxx_cv_func__$2_use = x"yes"; then
-  AC_CHECK_FUNCS(patsubst(funclist,[\w+],[_\&]))
-fi
   fi
-  GLIBCXX_MAYBE_UNDERSCORED_FUNCS(funclist)
   undefine([funclist])
 ])
 
@@ -146,13 +108,7 @@ AC_DEFUN([GLIBCXX_CHECK_MATH_DECL_AND_LINKAGE_2], [
   GLIBCXX_CHECK_MATH_DECL_2($1)
   if test x$glibcxx_cv_func_$1_use = x"yes"; then
 AC_CHECK_FUNCS($1)
-  else
-GLIBCXX_CHECK_MATH_DECL_2(_$1)
-if test x$glibcxx_cv_func__$1_use = x"yes"; then
-  AC_CHECK_FUNCS(_$1)
-fi
   fi
-  GLIBCXX_MAYBE_UNDERSCORED_FUNCS($1)
 ])
 
 
@@ -193,13 +149,7 @@ AC_DEFUN([GLIBCXX_CHECK_MATH_DECL_AND_LINKAGE_3], [
   GLIBCXX_CHECK_MATH_DECL_3($1)
   if test x$glibcxx_cv_func_$1_use = x"yes"; then
 AC_CHECK_FUNCS($1)
-  else
-GLIBCXX_CHECK_MATH_DECL_3(_$1)
-if test x$glibcxx_cv_func__$1_use = x"yes"; then
-  AC_CHECK_FUNCS(_$1)
-fi
   fi
-  GLIBCXX_MAYBE_UNDERSCORED_FUNCS($1)
 ])
 
 
@@ -287,7 +237,6 @@ AC_DEFUN([GLIBCXX_CHECK_STDLIB_DECL_AND_LINKAGE_2], [
   if test x$glibcxx_cv_func_$1_use = x"yes"; then
 AC_CHECK_FUNCS($1)
   fi
-  GLIBCXX_MAYBE_UNDERSCORED_FUNCS($1)
 ])
 
 
-- 
2.41.0



Re: [Ping][PATCH] libstdc++: Add missing functions to <cmath> [PR79700]

2023-11-10 Thread Jonathan Wakely
I've finally convinced myself that this patch is OK, because we
provide stub versions of all the functions being declared here. So if
a target is missing them, we provide them anyway. That happens to be
broken for the avr target, but that defaults to --disable-libstdcxx
anyway.

I've pushed the patch to trunk - thanks for the work on it!



On Wed, 17 May 2023 at 11:07, Jonathan Wakely wrote:
>
>
>
> On Wed, 17 May 2023 at 10:38, Nathaniel Shead wrote:
>>
>> On Wed, May 17, 2023 at 10:05:59AM +0100, Jonathan Wakely wrote:
>> > On Wed, 17 May 2023 at 09:37, Nathaniel Shead wrote:
>> >
>> > > Now that GCC13.1 is released is it ok to merge? Thanks!
>> > >
>> >
>> > Yes, I've been testing this locally, but I think it needs more work 
>> > (sorry!)
>> >
>> > Looking at it again, I'm not sure why I asked for the additional tests
>> > because if they fail, it's a problem in libc, and there's nothing we can
>> > actually do about it in libstdc++. We certainly do want std::expl(0.0L) to
>> > return the same thing as std::exp(0.0L), but if it doesn't, we'll just have
>> > a libstdc++ test failure caused by a bug in libc. But you wrote the test
>> > now, so let's keep it. If we get failures for the test it will allow us to
>> > inform the relevant libc maintainers that they have a bug.
>>
>> Sounds good.
>>
>> > Also, since you're contributing this under the DCO terms the new test
>> > should not have the FSF copyright header, unless it's a derived work of an
>> > existing test with that header (and in that case it should retain the dates
>> > from the copied test). I don't actually bother putting the copyright and
>> > license header on new tests these days. There's nothing in that test that
>> > is novel or interesting, and I think it's arguably not useful or meaningful
>> > to consider it copyrighted.
>>
>> Makes sense, I was just copying from other tests in the directory. I'll
>> keep this in mind for the future, thanks!
>
>
> Yeah, we have a mix of tests using the old conventions (with copyright and 
> GPL headers) and new conventions (don't bother, they're not really meaningful 
> on tests).
>
> We're unlikely to *remove* the copyright notices from the old tests, because 
> that would require all sorts of legal wrangling, and it's not clear that the 
> copyright holder (the FSF) would agree to it anyway.
>
>
>
>
>>
>> > Finally, and most importantly, the new using-declarations in <cmath> are
>> > not guarded by any autoconf macro. That will break targets without full C99
>> > <math.h> support, e.g. djgpp declares acosf but not acosl, so the new
>> > "using acosl;" would be a hard error as soon as <cmath> is included (and
>> > might even prevent GCC building on that target). So I think we need a new
>> > autoconf check for the existence of those functions. I'm in the process of
>> > reworking the autoconf macros for <cmath> (due to PR 109818), which is why
>> > I didn't address it for this patch yet.
>>
>> Ah, I see; yes, that would be a problem. I'm not very familiar with
>> autoconf, so thanks for working this out. Let me know when you've done
>> that if there's anything else I should do for this patch.
>
>
> I hope to have an updated patch by next week, so I'll let you know once 
> that's ready. Thanks for your patience and for pinging the patch.
>
>
>>
>> > >
>> > > On Tue, Apr 18, 2023 at 6:48 PM Jonathan Wakely wrote:
>> > > >
>> > > > On Mon, 17 Apr 2023 at 09:11, Nathaniel Shead wrote:
>> > > > >
>> > > > > Hi, just checking whether there were any issues with this patch?
>> > > > > https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612386.html
>> > > > >
>> > > > > Otherwise I assume it won't be in GCC13.
>> > > >
>> > > > That's right, it's too large and invasive a change to get into GCC 13
>> > > > when only submitted in February, sorry. I'll merge it to trunk once
>> > > > GCC 13.1 is released though.
>> > > >
>> > >
>> > >
>>



[pushed] testsuite: fix lambda-decltype3.C in C++11

2023-11-10 Thread Marek Polacek
Tested x86_64-pc-linux-gnu, applying to trunk.

-- >8 --
This fixes
FAIL: g++.dg/cpp0x/lambda/lambda-decltype3.C  -std=c++11 (test for excess errors)
due to
lambda-decltype3.C:25:6: error: lambda capture initializers only available with '-std=c++14' or '-std=gnu++14' [-Wc++14-extensions]
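
The same guarding technique, shown outside the testsuite as a small sketch: code that needs an init-capture is only compiled when the feature-test macro says the feature exists. The function name `call_with_init_capture` and the fallback value are made up for this example.

```cpp
#include <cassert>

// Returns 42 when init-captures are available, or -1 when the code is
// compiled in a mode (such as -std=c++11) that lacks them.
int call_with_init_capture()
{
#if __cpp_init_captures
  auto add = [x = 41](int y) { return x + y; };  // C++14 init-capture
  return add(1);
#else
  return -1;  // feature unavailable: skip the construct entirely
#endif
}
```

An undefined macro evaluates to 0 inside #if, so the guarded block simply disappears in C++11 mode instead of causing an error.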

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-decltype3.C: Check __cpp_init_captures.
---
 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-decltype3.C | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-decltype3.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-decltype3.C
index 7fc157aefb5..2e06e496140 100644
--- a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-decltype3.C
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-decltype3.C
@@ -21,8 +21,10 @@ void f() {
   [=] {
 [](decltype((x)) y) {}; // OK, lambda takes a parameter of type float const&
 
+#if __cpp_init_captures
 [x=1](decltype((x)) y) {
   decltype((x)) z = x;  // OK, y has type int&, z has type int const&
 };
+#endif
   };
 }

base-commit: e0c1476d5d7c450b1b16a40364cea4e91237ea93
-- 
2.41.0



Re: [PATCH v3] libiberty: Use posix_spawn in pex-unix when available.

2023-11-10 Thread Patrick O'Neill

On 11/10/23 03:00, Prathamesh Kulkarni wrote:


On Thu, 5 Oct 2023 at 00:00, Brendan Shanks wrote:

Hi,

This patch implements pex_unix_exec_child using posix_spawn when
available.

This should especially benefit recent macOS (where vfork just calls
fork), but should have equivalent or faster performance on all
platforms.
In addition, the implementation is substantially simpler than the
vfork+exec code path.

Tested on x86_64-linux.
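
For readers unfamiliar with the API, the general shape of a posix_spawn-based launch looks roughly like the sketch below. This is not the libiberty code from the patch; `spawn_and_wait` is a hypothetical helper that starts a program found via PATH, waits for it, and returns its exit status (or -1 on failure).

```cpp
#include <cassert>
#include <cerrno>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;  // the current process environment

// Hypothetical helper: run `program` with no extra arguments.
int spawn_and_wait(const char *program)
{
  char *const argv[] = { const_cast<char *>(program), nullptr };
  pid_t pid;

  // One call replaces the vfork + exec pair; no file actions or spawn
  // attributes are used in this minimal version.
  int err = posix_spawnp(&pid, program, nullptr, nullptr, argv, environ);
  if (err != 0)
    return -1;  // posix_spawnp reports errors via its return value

  int status = 0;
  while (waitpid(pid, &status, 0) == -1)
    {
      if (errno != EINTR)
        return -1;
    }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

On a typical Linux system, spawn_and_wait("true") should return 0 and spawn_and_wait("false") should return 1. A real implementation would additionally set up file actions for redirecting stdin, stdout and stderr.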

Hi Brendan,
It seems this patch caused the following regressions on aarch64:


I'm also seeing the same failures on risc-v targets bisected to this commit.

Thanks,
Patrick



FAIL: g++.dg/modules/bad-mapper-1.C -std=c++17  at line 3 (test for errors, line )
FAIL: g++.dg/modules/bad-mapper-1.C -std=c++17 (test for excess errors)
FAIL: g++.dg/modules/bad-mapper-1.C -std=c++2a  at line 3 (test for errors, line )
FAIL: g++.dg/modules/bad-mapper-1.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/bad-mapper-1.C -std=c++2b  at line 3 (test for errors, line )
FAIL: g++.dg/modules/bad-mapper-1.C -std=c++2b (test for excess errors)

Looking at g++.log:
/home/tcwg-buildslave/workspace/tcwg_gnu_2/abe/snapshots/gcc.git~master/gcc/testsuite/g++.dg/modules/bad-mapper-1.C:
error: failed posix_spawnp mapper 'this-will-not-work'
In module imported at
/home/tcwg-buildslave/workspace/tcwg_gnu_2/abe/snapshots/gcc.git~master/gcc/testsuite/g++.dg/modules/bad-mapper-1.C:2:1:
unique1.bob: error: failed to read compiled module: No such file or directory
unique1.bob: note: compiled module file is 'gcm.cache/unique1.bob.gcm'
unique1.bob: note: imports must be built before being imported
unique1.bob: fatal error: returning to the gate for a mechanical issue
compilation terminated.

Link to log files:
https://ci.linaro.org/job/tcwg_gcc_check--master-aarch64-build/1159/artifact/artifacts/00-sumfiles/
Could you please investigate?

Thanks,
Prathamesh

v2: Fix error handling (previously the function would be run twice in
case of error), and don't use a macro that changes control flow.

v3: Match file style for error-handling blocks, don't close
in/out/errdes on error, and check close() for errors.

libiberty/
 * configure.ac (AC_CHECK_HEADERS): Add spawn.h.
 (checkfuncs): Add posix_spawn, posix_spawnp.
 (AC_CHECK_FUNCS): Add posix_spawn, posix_spawnp.
 * configure, config.in: Rebuild.
 * pex-unix.c [HAVE_POSIX_SPAWN] (pex_unix_exec_child): New function.

Signed-off-by: Brendan Shanks
---
  libiberty/configure.ac |   8 +-
  libiberty/pex-unix.c   | 168 +
  2 files changed, 173 insertions(+), 3 deletions(-)

diff --git a/libiberty/configure.ac b/libiberty/configure.ac
index 0748c592704..2488b031bc8 100644
--- a/libiberty/configure.ac
+++ b/libiberty/configure.ac
@@ -289,7 +289,7 @@ AC_SUBST_FILE(host_makefile_frag)
  # It's OK to check for header files.  Although the compiler may not be
  # able to link anything, it had better be able to at least compile
  # something.
-AC_CHECK_HEADERS(sys/file.h sys/param.h limits.h stdlib.h malloc.h string.h unistd.h strings.h sys/time.h time.h sys/resource.h sys/stat.h sys/mman.h fcntl.h alloca.h sys/pstat.h sys/sysmp.h sys/sysinfo.h machine/hal_sysinfo.h sys/table.h sys/sysctl.h sys/systemcfg.h stdint.h stdio_ext.h process.h sys/prctl.h)
+AC_CHECK_HEADERS(sys/file.h sys/param.h limits.h stdlib.h malloc.h string.h unistd.h strings.h sys/time.h time.h sys/resource.h sys/stat.h sys/mman.h fcntl.h alloca.h sys/pstat.h sys/sysmp.h sys/sysinfo.h machine/hal_sysinfo.h sys/table.h sys/sysctl.h sys/systemcfg.h stdint.h stdio_ext.h process.h sys/prctl.h spawn.h)
  AC_HEADER_SYS_WAIT
  AC_HEADER_TIME

@@ -412,7 +412,8 @@ funcs="$funcs setproctitle"
  vars="sys_errlist sys_nerr sys_siglist"

  checkfuncs="__fsetlocking canonicalize_file_name dup3 getrlimit getrusage \
- getsysinfo gettimeofday on_exit pipe2 psignal pstat_getdynamic pstat_getstatic \
+ getsysinfo gettimeofday on_exit pipe2 posix_spawn posix_spawnp psignal \
+ pstat_getdynamic pstat_getstatic \
   realpath setrlimit spawnve spawnvpe strerror strsignal sysconf sysctl \
   sysmp table times wait3 wait4"

@@ -435,7 +436,8 @@ if test "x" = "y"; then
  index insque \
  memchr memcmp memcpy memmem memmove memset mkstemps \
  on_exit \
-pipe2 psignal pstat_getdynamic pstat_getstatic putenv \
+pipe2 posix_spawn posix_spawnp psignal \
+pstat_getdynamic pstat_getstatic putenv \
  random realpath rename rindex \
  sbrk setenv setproctitle setrlimit sigsetmask snprintf spawnve spawnvpe \
   stpcpy stpncpy strcasecmp strchr strdup \
diff --git a/libiberty/pex-unix.c b/libiberty/pex-unix.c
index 33b5bce31c2..336799d1125 100644
--- a/libiberty/pex-unix.c
+++ b/libiberty/pex-unix.c
@@ -58,6 +58,9 @@ extern int errno;
  #ifdef HAVE_PROCESS_H
  #include <process.h>
  #endif
+#ifdef HAVE_SPAWN_H
+#include <spawn.h>
+#endif

  #ifdef vfork /* Autoconf may define this to fork for us. */
  # define VFORK_STRING "fork"
@@ 

Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-11-10 Thread Peter Bergner
On 8/25/23 6:20 AM, Kewen.Lin wrote:
> btw, I was also expecting that we don't implicitly set
> OPTION_MASK_PCREL any more for Power10, that is to remove
> OPTION_MASK_PCREL from OTHER_POWER10_MASKS.

So my patch removes the flag from the default power10 flags, like
you want.  However, it doesn't remove it from OTHER_POWER10_MASKS,
since that is used to set ISA_3_1_MASKS_SERVER and I didn't want
to change how rs6000_machine_from_flags() behaves, so instead, I
just explicitly mask it off when defining the power10 default flags.
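
In bitmask terms the approach amounts to something like the following sketch. The mask names and bit values here are hypothetical placeholders, not the real rs6000 definitions; the point is only that the shared mask keeps the bit while the default flag set clears it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit assignments for illustration only.
constexpr std::uint64_t mask_pcrel    = 1u << 0;
constexpr std::uint64_t mask_prefixed = 1u << 1;
constexpr std::uint64_t mask_mma      = 1u << 2;

// Shared mask: still contains the PCREL bit, so code that derives other
// masks from it behaves as before.
constexpr std::uint64_t other_p10_masks =
  mask_pcrel | mask_prefixed | mask_mma;

// Default CPU flags: PCREL is explicitly masked off at the point of
// definition rather than removed from the shared mask.
constexpr std::uint64_t p10_default_flags = other_p10_masks & ~mask_pcrel;

constexpr bool has_flag(std::uint64_t flags, std::uint64_t mask)
{ return (flags & mask) != 0; }
```

With this arrangement has_flag(other_p10_masks, mask_pcrel) stays true while has_flag(p10_default_flags, mask_pcrel) is false.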

Peter



Re: [PATCH] C99 testsuite readiness: More unverified testcase un-reductions

2023-11-10 Thread Jeff Law




On 11/10/23 15:06, Florian Weimer wrote:

gcc/testsuite/

* gcc.c-torture/compile/BUG17.c (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/BUG18.c (main): Likewise.  Call
__builtin_printf instead of printf.
* gcc.c-torture/compile/BUG21.c (Nase): Add missing void
types.
* gcc.c-torture/compile/BUG23.c (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/BUG5.c (bar): Call
__builtin_printf instead of printf.
* gcc.c-torture/compile/BUG6.c (main): Likewise.  Add missing
int return type and missing void type.
* gcc.c-torture/compile/b.c (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/b1.c (main): Likewise.  Call
__builtin_printf instead of printf.
* gcc.c-torture/compile/b88.c (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/bbb.c (flset): Add missing void
return type and switch to prototype style.
* gcc.c-torture/compile/bf.c (clr, atoi): Declare.
(main): Add missing int return type.  Call
__builtin_printf instead of printf.
* gcc.c-torture/compile/bt.c (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/charmtst.c (foo, bar): Declare.
(c_move_tst): Add missing int return type.
* gcc.c-torture/compile/cmpdi-1.c (f, g): Add missing int
return type.
* gcc.c-torture/compile/cmphi.c (foo): Likewise.
* gcc.c-torture/compile/conv.c (main): Likewise.  Add missing
void type.  Call __builtin_printf instead of printf.
* gcc.c-torture/compile/ddd.c (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/dilayout.c (str, ll): Add missing
void return type.
* gcc.c-torture/compile/dimove.c (foo): Likewise.
* gcc.c-torture/compile/f2.c (foo): Likewise.
* gcc.c-torture/compile/flatten.c  (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/fnul.c (main): Likewise.
Call __builtin_printf instead of printf.
* gcc.c-torture/compile/fq.c (expand_to_ascii): Add missing
void return type.
* gcc.c-torture/compile/funcptr-1.c (g): Call __builtin_printf
instead of printf.
(f): Likewise.  Add missing void types.
* gcc.c-torture/compile/glob.c (foo): Add missing void types.
* gcc.c-torture/compile/goto-1.c (f): Likewise.
* gcc.c-torture/compile/i++.c (main): Call __builtin_printf
instead of printf.
* gcc.c-torture/compile/ic.c (foo): Add missing int return
type.
* gcc.c-torture/compile/iftrap-1.c (bar, baz): Declare.
(f4, f6): Call __builtin_abort instead of abort.
* gcc.c-torture/compile/iftrap-2.c (bar): Declare.
* gcc.c-torture/compile/jmp.c (foo): Add missing int types.
* gcc.c-torture/compile/labels-1.c (f): Add missing int
return type and missing void type.  Call __builtin_abort
instead of abort.
* gcc.c-torture/compile/labels-2.c (f): Likewise.
* gcc.c-torture/compile/lbug.c (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/memtst.c (memtst): Add missing void
return type.
(main): Add missing int return type and missing void type.
Call __builtin_bzero instead of bzero.
* gcc.c-torture/compile/miscomp.c (main): Add missing int
return type and missing void type.  Call __builtin_printf
instead of printf.
* gcc.c-torture/compile/msp.c (bar): Declare.
(foo): Add missing void types.
* gcc.c-torture/compile/mtst.c (foo): Add missing int return
type.
* gcc.c-torture/compile/packed-1.c (f): Add missing void
types.
* gcc.c-torture/compile/pr17119.c (func1, func2): Declare.
* gcc.c-torture/compile/pr18712.c (foo, foo1): Declare.
* gcc.c-torture/compile/pr20412.c (bar1, bar2, bar3): Declare.
* gcc.c-torture/compile/pr21532.c (foo): Declare.
* gcc.c-torture/compile/pr22398.c (main): Call __builtin_exit
instead of exit.
* gcc.c-torture/compile/pr24883.c (orec_str_list): Add missing
void return type.
* gcc.c-torture/compile/pr25311.c (use): Declare.
* gcc.c-torture/compile/pr25514.c (foo): Declare.
* gcc.c-torture/compile/pr26425.c (key_put): Declare.
* gcc.c-torture/compile/pr27087.c (g): Declare.
* gcc.c-torture/compile/pr27282.c (colrow_equal): Add missing
int return type.
* gcc.c-torture/compile/pr27907.c (fann_run): Add missing
void return type.
* gcc.c-torture/compile/pr28489.c (c_compile): Likewise.
* gcc.c-torture/compile/pr28776-1.c

Re: [PATCH] C99 testsuite readiness: Compile more tests with -std=gnu89

2023-11-10 Thread Jeff Law




On 11/10/23 15:07, Florian Weimer wrote:

gcc/testsuite/

* gcc.c-torture/compile/386.c: Compile with -std=gnu89.
* gcc.c-torture/compile/BUG1.c: Likewise.
* gcc.c-torture/compile/BUG11.c: Likewise.
* gcc.c-torture/compile/BUG16.c: Likewise.
* gcc.c-torture/compile/BUG2.c: Likewise.
* gcc.c-torture/compile/BUG24.c: Likewise.
* gcc.c-torture/compile/BUG25.c: Likewise.
* gcc.c-torture/compile/BUG3.c: Likewise.
* gcc.c-torture/compile/DFcmp.c: Likewise.
* gcc.c-torture/compile/HIcmp.c: Likewise.
* gcc.c-torture/compile/HIset.c: Likewise.
* gcc.c-torture/compile/QIcmp.c: Likewise.
* gcc.c-torture/compile/QIset.c: Likewise.
* gcc.c-torture/compile/SFset.c: Likewise.
* gcc.c-torture/compile/SIcmp.c: Likewise.
* gcc.c-torture/compile/SIset.c: Likewise.
* gcc.c-torture/compile/UHIcmp.c: Likewise.
* gcc.c-torture/compile/UQIcmp.c: Likewise.
* gcc.c-torture/compile/USIcmp.c: Likewise.
* gcc.c-torture/compile/a.c: Likewise.
* gcc.c-torture/compile/a1.c: Likewise.
* gcc.c-torture/compile/a3.c: Likewise.
* gcc.c-torture/compile/aa.c: Likewise.
* gcc.c-torture/compile/aaa.c: Likewise.
* gcc.c-torture/compile/abs.c: Likewise.
* gcc.c-torture/compile/ac.c: Likewise.
* gcc.c-torture/compile/acc.c: Likewise.
* gcc.c-torture/compile/add.c: Likewise.
* gcc.c-torture/compile/add386.c: Likewise.
* gcc.c-torture/compile/addcc.c: Likewise.
* gcc.c-torture/compile/andm.c: Likewise.
* gcc.c-torture/compile/andmem.c: Likewise.
* gcc.c-torture/compile/andn.c: Likewise.
* gcc.c-torture/compile/andok.c: Likewise.
* gcc.c-torture/compile/andsi.c: Likewise.
* gcc.c-torture/compile/andsparc.c: Likewise.
* gcc.c-torture/compile/aos.c: Likewise.
* gcc.c-torture/compile/arr.c: Likewise.
* gcc.c-torture/compile/as.c: Likewise.
* gcc.c-torture/compile/ase.c: Likewise.
* gcc.c-torture/compile/band.c: Likewise.
* gcc.c-torture/compile/bb0.c: Likewise.
* gcc.c-torture/compile/bb1.c: Likewise.
* gcc.c-torture/compile/bc.c: Likewise.
* gcc.c-torture/compile/bcopy.c: Likewise.
* gcc.c-torture/compile/bfx.c: Likewise.
* gcc.c-torture/compile/bge.c: Likewise.
* gcc.c-torture/compile/bit.c: Likewise.
* gcc.c-torture/compile/bitf.c: Likewise.
* gcc.c-torture/compile/bitw.c: Likewise.
* gcc.c-torture/compile/blk.c: Likewise.
* gcc.c-torture/compile/bt386.c: Likewise.
* gcc.c-torture/compile/bug.c: Likewise.
* gcc.c-torture/compile/buns.c: Likewise.
* gcc.c-torture/compile/c.c: Likewise.
* gcc.c-torture/compile/c2.c: Likewise.
* gcc.c-torture/compile/call.c: Likewise.
* gcc.c-torture/compile/callind.c: Likewise.
* gcc.c-torture/compile/calls-void.c: Likewise.
* gcc.c-torture/compile/calls.c: Likewise.
* gcc.c-torture/compile/cc.c: Likewise.
* gcc.c-torture/compile/cmb.c: Likewise.
* gcc.c-torture/compile/cmpsi386.c: Likewise.
* gcc.c-torture/compile/cmul.c: Likewise.
* gcc.c-torture/compile/comb.c: Likewise.
* gcc.c-torture/compile/consec.c: Likewise.
* gcc.c-torture/compile/const.c: Likewise.
* gcc.c-torture/compile/conv_tst.c: Likewise.
* gcc.c-torture/compile/cvt.c: Likewise.
* gcc.c-torture/compile/dbl_parm.c: Likewise.
* gcc.c-torture/compile/dblbug.c: Likewise.
* gcc.c-torture/compile/dead.c: Likewise.
* gcc.c-torture/compile/delay.c: Likewise.
* gcc.c-torture/compile/di.c: Likewise.
* gcc.c-torture/compile/div.c: Likewise.
* gcc.c-torture/compile/dm.c: Likewise.
* gcc.c-torture/compile/dshift.c: Likewise.
* gcc.c-torture/compile/e.c: Likewise.
* gcc.c-torture/compile/ex.c: Likewise.
* gcc.c-torture/compile/ext.c: Likewise.
* gcc.c-torture/compile/flo.c: Likewise.
* gcc.c-torture/compile/forgetcc.c: Likewise.
* gcc.c-torture/compile/g.c: Likewise.
* gcc.c-torture/compile/gen_tst.c: Likewise.
* gcc.c-torture/compile/gronk.c: Likewise.
* gcc.c-torture/compile/hi.c: Likewise.
* gcc.c-torture/compile/i.c: Likewise.
* gcc.c-torture/compile/icmp.c: Likewise.
* gcc.c-torture/compile/ifreg.c: Likewise.
* gcc.c-torture/compile/jumptab.c: Likewise.
* gcc.c-torture/compile/l.c: Likewise.
* gcc.c-torture/compile/layout.c: Likewise.
* gcc.c-torture/compile/lll.c: Likewise.
* gcc.c-torture/compile/load8.c: Likewise.
* gcc.c-torture/compile/loadhicc.c: Likewise.
* gcc.c-torture/compile/log2.c: Likewise.
* gcc.c-torture/compile/logic.c: Likewise.
* gcc.c-torture/compile/loop-1.c: Likewise.
  

Re: [PATCH] C99 testsuite readiness: Add missing abort, exit declarations

2023-11-10 Thread Jeff Law




On 11/10/23 15:07, Florian Weimer wrote:

The execute tests use abort/exit to report failure/success, but
they generally do not declare these functions (or include <stdlib.h>).
This change adds declarations as appropriate.

It would have been possible to switch to __builtin_abort and
__builtin_exit instead.  Existing practice varies.  Adding the
declarations makes it easier to write the GNU-style commit message
because it is not necessary to mention the function with the call
site.

Instead of this change, it would be possible to create a special
header file with the declarations that is included during the
test file compilation using -include, but that would mean that
many tests would no longer build standalone.
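
As a minimal sketch of the kind of change described above, an execute-style
test can declare abort and exit itself instead of relying on implicit
function declarations.  The names add and check_add are hypothetical and do
not come from any test touched by the patch.

```c
/* Explicit prototypes, so the file builds standalone without <stdlib.h>
   and without implicit function declarations.  In a real execute test,
   main would typically finish with exit (0).  */
extern void abort (void);
extern void exit (int);

/* Hypothetical function under test.  */
int
add (int a, int b)
{
  return a + b;
}

/* Hypothetical check: returns normally on success, aborts on failure.  */
void
check_add (void)
{
  if (add (2, 3) != 5)
    abort ();
}
```

The __builtin_abort and __builtin_exit alternative mentioned above would
need no declarations at all, at the cost of tying the test to GNU C.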

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/2112-1.c: Declare exit and abort
as appropriate.
* gcc.c-torture/execute/2113-1.c: Likewise.
* gcc.c-torture/execute/2205-1.c: Likewise.
* gcc.c-torture/execute/2217-1.c: Likewise.
* gcc.c-torture/execute/2223-1.c: Likewise.
* gcc.c-torture/execute/2224-1.c: Likewise.
* gcc.c-torture/execute/2225-1.c: Likewise.
* gcc.c-torture/execute/2227-1.c: Likewise.
* gcc.c-torture/execute/2313-1.c: Likewise.
* gcc.c-torture/execute/2314-1.c: Likewise.
* gcc.c-torture/execute/2314-2.c: Likewise.
* gcc.c-torture/execute/2314-3.c: Likewise.
* gcc.c-torture/execute/2402-1.c: Likewise.
* gcc.c-torture/execute/2403-1.c: Likewise.
* gcc.c-torture/execute/2412-1.c: Likewise.
* gcc.c-torture/execute/2412-2.c: Likewise.
* gcc.c-torture/execute/2412-4.c: Likewise.
* gcc.c-torture/execute/2412-5.c: Likewise.
* gcc.c-torture/execute/2412-6.c: Likewise.
* gcc.c-torture/execute/2419-1.c: Likewise.
* gcc.c-torture/execute/2422-1.c: Likewise.
* gcc.c-torture/execute/2503-1.c: Likewise.
* gcc.c-torture/execute/2511-1.c: Likewise.
* gcc.c-torture/execute/2519-1.c: Likewise.
* gcc.c-torture/execute/2519-2.c: Likewise.
* gcc.c-torture/execute/2523-1.c: Likewise.
* gcc.c-torture/execute/2528-1.c: Likewise.
* gcc.c-torture/execute/2603-1.c: Likewise.
* gcc.c-torture/execute/2605-1.c: Likewise.
* gcc.c-torture/execute/2605-2.c: Likewise.
* gcc.c-torture/execute/2605-3.c: Likewise.
* gcc.c-torture/execute/2622-1.c: Likewise.
* gcc.c-torture/execute/2717-1.c: Likewise.
* gcc.c-torture/execute/2717-2.c: Likewise.
* gcc.c-torture/execute/2717-5.c: Likewise.
* gcc.c-torture/execute/2722-1.c: Likewise.
* gcc.c-torture/execute/2726-1.c: Likewise.
* gcc.c-torture/execute/2731-1.c: Likewise.
* gcc.c-torture/execute/2731-2.c: Likewise.
* gcc.c-torture/execute/2801-3.c: Likewise.
* gcc.c-torture/execute/2801-4.c: Likewise.
* gcc.c-torture/execute/2808-1.c: Likewise.
* gcc.c-torture/execute/2815-1.c: Likewise.
* gcc.c-torture/execute/2818-1.c: Likewise.
* gcc.c-torture/execute/2819-1.c: Likewise.
* gcc.c-torture/execute/2822-1.c: Likewise.
* gcc.c-torture/execute/2914-1.c: Likewise.
* gcc.c-torture/execute/2917-1.c: Likewise.
* gcc.c-torture/execute/20001009-1.c: Likewise.
* gcc.c-torture/execute/20001009-2.c: Likewise.
* gcc.c-torture/execute/20001013-1.c: Likewise.
* gcc.c-torture/execute/20001017-1.c: Likewise.
* gcc.c-torture/execute/20001017-2.c: Likewise.
* gcc.c-torture/execute/20001024-1.c: Likewise.
* gcc.c-torture/execute/20001027-1.c: Likewise.
* gcc.c-torture/execute/20001101.c: Likewise.
* gcc.c-torture/execute/20001108-1.c: Likewise.
* gcc.c-torture/execute/20001112-1.c: Likewise.
* gcc.c-torture/execute/20001124-1.c: Likewise.
* gcc.c-torture/execute/20001130-1.c: Likewise.
* gcc.c-torture/execute/20001130-2.c: Likewise.
* gcc.c-torture/execute/20001203-2.c: Likewise.
* gcc.c-torture/execute/20001221-1.c: Likewise.
* gcc.c-torture/execute/20001228-1.c: Likewise.
* gcc.c-torture/execute/20001229-1.c: Likewise.
* gcc.c-torture/execute/20010106-1.c: Likewise.
* gcc.c-torture/execute/20010118-1.c: Likewise.
* gcc.c-torture/execute/20010119-1.c: Likewise.
* gcc.c-torture/execute/20010206-1.c: Likewise.
* gcc.c-torture/execute/20010209-1.c: Likewise.
* gcc.c-torture/execute/20010221-1.c: Likewise.
* gcc.c-torture/execute/20010222-1.c: Likewise.
* gcc.c-torture/execute/20010329-1.c: Likewise.
* gcc.c-torture/execute/20010403-1.c: Likewise.
* gcc.c-torture/execute/20010409-1.c: Likewise.
* 

Re: [PATCH] C99 testsuite readiness: Cleanup of execute tests

2023-11-10 Thread Jeff Law




On 11/10/23 15:07, Florian Weimer wrote:

This change updates the gcc.c-torture/execute/ to avoid obsolete
language constructs.  In the changed tests, use of the features
appears to be accidental, and updating allows the tests to run with
the default compiler flags.
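
For illustration only, the sketch below shows the two kinds of cleanup listed
in the ChangeLog that follows: adding the missing int and void types, and
calling a __builtin_ function instead of an undeclared library function.  The
names counter, foobar and is_ok are assumptions made for this example, not
code from the testsuite.

```c
/* Obsolete style (implicit int, no prototypes, undeclared strcmp):

     foobar () { counter++; }
     is_ok (p) char *p; { return strcmp (p, "ok") == 0; }

   Updated style, as in the ChangeLog entries below.  */

int counter;

/* Explicit void return type and void parameter list.  */
void
foobar (void)
{
  counter++;
}

/* Explicit int return type; __builtin_strcmp needs no declaration,
   so <string.h> does not have to be included.  */
int
is_ok (const char *p)
{
  return __builtin_strcmp (p, "ok") == 0;
}
```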

gcc/testsuite/

* gcc.c-torture/execute/2112-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/2113-1.c (foobar): Add missing
void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/2314-2.c (main): Likewise.
* gcc.c-torture/execute/2402-1.c (main): Likewise.
* gcc.c-torture/execute/2403-1.c (main): Likewise.
* gcc.c-torture/execute/2503-1.c (main): Likewise.
* gcc.c-torture/execute/2605-2.c (main): Likewise.
* gcc.c-torture/execute/2717-1.c (main): Likewise.
* gcc.c-torture/execute/2717-5.c (main): Likewise.
* gcc.c-torture/execute/2726-1.c (main): Likewise.
* gcc.c-torture/execute/2914-1.c (blah): Add missing
void types.
(main): Add missing int and void types.
* gcc.c-torture/execute/20001009-1.c (main): Likewise.
* gcc.c-torture/execute/20001013-1.c (main): Likewise.
* gcc.c-torture/execute/20001031-1.c (main): Likewise.
* gcc.c-torture/execute/20010221-1.c (main): Likewise.
* gcc.c-torture/execute/20010723-1.c (main): Likewise.
* gcc.c-torture/execute/20010915-1.c (s): Call
__builtin_strcmp instead of strcmp.
* gcc.c-torture/execute/20010924-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/20011128-1.c (main): Likewise.
* gcc.c-torture/execute/20020226-1.c (main): Likewise.
* gcc.c-torture/execute/20020328-1.c (foo): Add missing
void types.
* gcc.c-torture/execute/20020406-1.c (DUPFFexgcd): Call
__builtin_printf instead of printf.
(main): Likewise.
* gcc.c-torture/execute/20020508-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/20020508-2.c (main): Likewise.
* gcc.c-torture/execute/20020508-3.c (main): Likewise.
* gcc.c-torture/execute/20020611-1.c (main): Likewise.
* gcc.c-torture/execute/20021010-2.c (main): Likewise.
* gcc.c-torture/execute/20021113-1.c (foo): Add missing
void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/20021120-3.c (foo): Call
__builtin_sprintf instead of sprintf.
* gcc.c-torture/execute/20030125-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/20030216-1.c (main): Likewise.
* gcc.c-torture/execute/20030404-1.c (main): Likewise.
* gcc.c-torture/execute/20030606-1.c (main): Likewise.
Call __builtin_memset instead of memset.
* gcc.c-torture/execute/20030828-1.c (main): Add missing int
and void types.
* gcc.c-torture/execute/20030828-2.c (main): Likewise.
* gcc.c-torture/execute/20031012-1.c: Call __builtin_strlen
instead of strlen.
* gcc.c-torture/execute/20031211-1.c (main): Add missing int
and void types.
* gcc.c-torture/execute/20040319-1.c (main): Likewise.
* gcc.c-torture/execute/20040411-1.c (sub1): Call
__builtin_memcpy instead of memcpy.
* gcc.c-torture/execute/20040423-1.c (sub1): Likewise.
* gcc.c-torture/execute/20040917-1.c (main): Add missing int
and void types.
* gcc.c-torture/execute/20050131-1.c (main): Likewise.
* gcc.c-torture/execute/20051113-1.c (main): Likewise.
* gcc.c-torture/execute/20121108-1.c (main): Call
__builtin_printf instead of printf.
* gcc.c-torture/execute/20170401-2.c (main): Add missing int
and void types.
* gcc.c-torture/execute/900409-1.c (main): Likewise.
* gcc.c-torture/execute/920202-1.c (f): Add int return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/920302-1.c (execute): Add void return
type.
(main): Add missing int and void types.
* gcc.c-torture/execute/920410-1.c (main): Likewise.
* gcc.c-torture/execute/920501-2.c (main): Likewise.
* gcc.c-torture/execute/920501-3.c (execute): Add void return
type.
(main): Add missing int and void types.
* gcc.c-torture/execute/920501-5.c (x): Add int return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/920501-6.c (main): Add int return
type.
* gcc.c-torture/execute/920501-8.c (main): Add missing
int and void types.  Call __builtin_strcmp instead of strcmp.
* gcc.c-torture/execute/920506-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/920612-2.c (main): Likewise.
* gcc.c-torture/execute/920618-1.c 

Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-11-10 Thread Peter Bergner
On 8/27/23 9:06 PM, Kewen.Lin wrote:
> Assuming we only have ELFv2_ABI_CHECK in PCREL_SUPPORTED_BY_OS, we
> can have either TARGET_PCREL or !TARGET_PCREL after the checking.
> For the latter, it's fine and doesn't need any checks. For the former,
> if it's implicit, for !TARGET_PREFIXED we will clean it silently;
> while if it's explicit, for !TARGET_PREFIXED we will emit an error.
> TARGET_PREFIXED checking has considered Power10, so it's also
> concerned accordingly.
[snip]
> Yeah, looking forward to their opinions.  IMHO, with the current proposed
> change, pcrel doesn't look like a pure Power10 hardware feature, it also
> quite relies on ABIs, that's why I thought it seems good not to turn it
> on by default for Power10.

Ok, how about the patch below?  This removes OPTION_MASK_PCREL from the
power10 flags, so instead of our options override code needing to disable
PCREL on the systems that don't support it, we now enable it only on those
systems that do support it.

Jeevitha, can you test this patch to see whether it fixes the testsuite
issue caused by your earlier patch that was approved, but not yet pushed?
That was the patch to use GPR2 for register allocation, correct?  Note, you'll need
to update the patch to replace the rs6000_pcrel_p() usage with just
TARGET_PCREL, since this patch removes rs6000_pcrel_p().

If testing is clean and everyone is OK with the patch, I'll officially
submit it for review with git log entry, etc.

Peter


gcc/
* config/rs6000/linux64.h (PCREL_SUPPORTED_BY_OS): Only test the ABI.
* config/rs6000/rs6000-cpus.def (RS6000_CPU): Remove OPTION_MASK_PCREL
from power10.
* config/rs6000/predicates.md: Use TARGET_PCREL.
* config/rs6000/rs6000-logue.cc (rs6000_decl_ok_for_sibcall): Likewise.
(rs6000_global_entry_point_prologue_needed_p): Likewise.
(rs6000_output_function_prologue): Likewise.
* config/rs6000/rs6000.md: Likewise.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Rework
the logic for enabling PCREL by default.
(rs6000_legitimize_tls_address): Use TARGET_PCREL.
(rs6000_call_template_1): Likewise.
(rs6000_indirect_call_template_1): Likewise.
(rs6000_longcall_ref): Likewise.
(rs6000_call_aix): Likewise.
(rs6000_sibcall_aix): Likewise.
(rs6000_pcrel_p): Remove.
* config/rs6000/rs6000-protos.h (rs6000_pcrel_p): Likewise.

diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index 98b7255c95f..5b77bd7fd51 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -563,8 +563,5 @@ extern int dot_symbols;
 #define TARGET_FLOAT128_ENABLE_TYPE 1
 
 /* Enable using prefixed PC-relative addressing on POWER10 if the ABI
-   supports it.  The ELF v2 ABI only supports PC-relative relocations for
-   the medium code model.  */
-#define PCREL_SUPPORTED_BY_OS  (TARGET_POWER10 && TARGET_PREFIXED  \
-&& ELFv2_ABI_CHECK \
-&& TARGET_CMODEL == CMODEL_MEDIUM)
+   supports it.  */
+#define PCREL_SUPPORTED_BY_OS  (ELFv2_ABI_CHECK)
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index 4f350da378c..fe01a2312ae 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -256,7 +256,8 @@ RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER
| OPTION_MASK_HTM)
 RS6000_CPU ("power9", PROCESSOR_POWER9, MASK_POWERPC64 | ISA_3_0_MASKS_SERVER
| OPTION_MASK_HTM)
-RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64 | ISA_3_1_MASKS_SERVER)
+RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64
+   | (ISA_3_1_MASKS_SERVER & ~OPTION_MASK_PCREL))
 RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
 RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, OPTION_MASK_PPC_GFXOPT
| MASK_POWERPC64)
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index ef7d3f214c4..0b76541fc0a 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1216,7 +1216,7 @@
 && SYMBOL_REF_DECL (op) != NULL
 && TREE_CODE (SYMBOL_REF_DECL (op)) == FUNCTION_DECL
 && (rs6000_fndecl_pcrel_p (SYMBOL_REF_DECL (op))
-!= rs6000_pcrel_p ()))")))
+!= TARGET_PCREL))")))
 
 ;; Return 1 if this operand is a valid input for a move insn.
 (define_predicate "input_operand"
diff --git a/gcc/config/rs6000/rs6000-logue.cc b/gcc/config/rs6000/rs6000-logue.cc
index 98846f781ec..9e08d9bb4d2 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -1106,7 +1106,7 @@ rs6000_decl_ok_for_sibcall (tree decl)
 r2 for its caller's TOC.  Such a function may make sibcalls to any
 function, whether local or external, without restriction based 

[PATCH] MAINTAINERS: Fix formatting

2023-11-10 Thread Sam James
ChangeLog:
* MAINTAINERS (Write After Approval): Fix indentation and missing email 
bracket.

Signed-off-by: Sam James 
---
 MAINTAINERS | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index c43167d9a752..9ad68687f769 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -325,7 +325,7 @@ Mark G. Adams   

 Ajit Kumar Agarwal 
 Pedro Alves
 Paul-Antoine Arras 
-Arsen Arsenović
+Arsen Arsenović
 Raksit Ashok   
 Matt Austern   
 David Ayers
@@ -411,7 +411,7 @@ Chris Fairles   

 Alessandro Fanfarillo  
 Changpeng Fang 
 Sam Feifer 
-Eric Feng  
+Eric Feng  
 Li Feng
 Thomas Fitzsimmons 
 Alexander Fomin

@@ -534,16 +534,16 @@ Renlin Li 

 Xinliang David Li  
 Chen Liqin 
 Martin Liska   
-Hao Liu
+Hao Liu

 Jiangning Liu  
 Sa Liu 
 Ralph Loader   
-Sheldon Lobo   
 Gabor Loki 
 Manuel López-Ibáñez
 Carl Love  
 Martin v. Löwis

-Edwin Lu   
+Edwin Lu   
 H.J. Lu
 Xiong Hu Luo   
 Bin Bin Lv 
@@ -622,7 +622,7 @@ Hafiz Abid Qadeer   

 Yao Qi 
 Jerry Quinn
 Navid Rahimi   
-Rishi Raj  

+Rishi Raj  
 Easwaran Raman 
 Joe Ramsay 
 Rolf Rasmussen 
@@ -762,7 +762,7 @@ Immad Mir   

 Gaius Mulley   
 Siddhesh Poyarekar 
 Navid Rahimi   
-Rishi Raj  

+Rishi Raj  
 Trevor Saunders

 Bill Schmidt   
 Nathan Sidwell 
-- 
2.42.1



Re: [committed] Enable LRA on several ports

2023-11-10 Thread Jeff Law




On 8/13/23 20:11, Hans-Peter Nilsson wrote:

On Mon, 1 May 2023, Jeff Law wrote:



Spurred by Segher's RFC, I went ahead and tested several ports with LRA
enabled.  Not surprisingly, many failed, but a few built their full set of
libraries successfully and of those a few even ran their testsuites with no
regressions.  In fact, enabling LRA fixes a small number of failures on the
iq2000 port.

This patch converts the ports which built their libraries and have test
results that are as good as or better than without LRA.  There may be minor
code quality regressions or there may be minor code quality improvements --
I'm leaving that for the port maintainers to own going forward.


How do you configure your builds?  Perhaps your cross-builds
exclude C++?  I found that this (r14-383) broke MMIX building
libstdc++-v3 from that commit up to and including r14-3180.
See commit r14-3187.
My builds are configured without C++ for the embedded targets.  That would
explain why my testing was clean and yours failed at build time.




Thankfully there was just one single gotcha.  I temporarily
reverted the LRA change for MMIX so that I can get honest
repeatable baseline results.  There seems to have been one
test-case regressing from the LRA switch (PR53948), thus I
re-enabled LRA for MMIX again.  Sorry for the late reaction.
Thanks.  And I'm sorry for 1. breaking things and 2. responding so
damn slowly.


jeff


Re: [PATCH] libgcc/m68k: Fixes for soft float

2023-11-10 Thread Jeff Law




On 8/22/23 20:15, Keith Packard via Gcc-patches wrote:

Check for non-zero denorm in __adddf3. Need to check both the upper and
lower 32-bit chunks of a 64-bit float for a non-zero value when
checking to see if the value is -0.
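
A minimal C sketch of the check described above, assuming the double is
handled as two 32-bit words with the sign bit in the upper word.  The function
name is hypothetical; the actual fix is in the m68k assembly for __adddf3.

```c
#include <stdint.h>

/* Return nonzero only for -0.0: sign bit set and every other bit clear.
   Testing the upper word alone would wrongly classify a negative denormal
   whose only set fraction bits live in the lower word.  */
int
is_negative_zero_df (uint32_t hi, uint32_t lo)
{
  return hi == 0x80000000u && lo == 0;
}
```

For example, hi = 0x80000000 with lo = 1 is the smallest negative denormal,
not -0, and must take the normal addition path.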

Fix __addsf3 when the sum exponent is exactly 0xff to ensure that it
produces infinity and not NaN.
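
The reason this matters is that, in IEEE 754 single precision, an exponent
field of 0xff with a zero fraction encodes infinity, while the same exponent
with any nonzero fraction encodes NaN.  The following is a sketch of the
intended packing behaviour under that encoding, not the libgcc code; pack_sf
and its parameters are hypothetical.

```c
#include <stdint.h>

/* Pack sign (0 or 1), biased exponent and fraction into a single-precision
   bit pattern.  When the exponent reaches 0xff the fraction is cleared, so
   an overflowing sum becomes infinity rather than a NaN.  */
uint32_t
pack_sf (uint32_t sign, uint32_t exp, uint32_t frac)
{
  if (exp >= 0xff)
    return (sign << 31) | 0x7f800000u;
  return (sign << 31) | (exp << 23) | (frac & 0x007fffffu);
}
```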

Handle converting NaN/inf values between formats.

Handle underflow and overflow when truncating.

Write a replacement for __fixxfsi so that it does not raise extra
exceptions during an extra conversion from long double to double.

Signed-off-by: Keith Packard 
I pushed this to the trunk after fixing a few minor whitespace nits. 
You didn't mention the divdf change, but I'll assume that was just an 
oversight.


I'm largely trusting your reputation on the fpgnulib changes.  I won't 
claim to know that code at all.  The assembly bits were simple enough 
that I could make out what you were doing relatively easily.


Thanks again,
Jeff


Re: [PATCH 4/4] maintainer-scripts/gcc_release: cleanup whitespace

2023-11-10 Thread Sam James


Joseph Myers  writes:

> On Thu, 2 Nov 2023, Sam James wrote:
>
>> maintainer-scripts/
>>  * gcc_release: Cleanup whitespace.
>
> OK.

Thanks. Would you mind pushing the two you approved?


[Committed] RISC-V: Add test for PR112469

2023-11-10 Thread Juzhe-Zhong
As PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112469

which has been fixed by Richard's patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635994.html

Add tests to avoid regression. Committed.

PR target/112469

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr112469.c: New test.

---
 .../gcc.target/riscv/rvv/autovec/pr112469.c | 13 +
 1 file changed, 13 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112469.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112469.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112469.c
new file mode 100644
index 000..e647028b558
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112469.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+
+int a, b, c;
+static int *d = 
+int e(int f) { return f == 0 ?: f; }
+int g() {
+  a = 1;
+  for (; a <= 8; a++) {
+b = e(*d);
+c = -b;
+  }
+}
-- 
2.36.3



Re: [PATCH] doc: Add fpatchable-function-entry to Option-Summary page[PR110983]

2023-11-10 Thread Jeff Law




On 8/28/23 23:25, Mao via Gcc-patches wrote:

The -fpatchable-function-entry is missing in both the web doc [1]
and the man page's "Option Summary" section.

This patch is to add it.

[1]: https://gcc.gnu.org/onlinedocs/gcc/Option-Summary.html
Thanks.  I created a ChangeLog and pushed this to the trunk.  Sorry this 
fell through the cracks for so long.


jeff


Re: [PATCH v3 3/4] ifcvt: Handle multiple rewired regs and refactor noce_convert_multiple_sets

2023-11-10 Thread Jeff Law




On 8/30/23 04:13, Manolis Tsamis wrote:

The existing implementation of need_cmov_or_rewire and
noce_convert_multiple_sets_1 assumes that sets are either REG or SUBREG.
This commit enhances them so they can handle/rewire arbitrary set statements.

To do that a new helper struct noce_multiple_sets_info is introduced which is
used by noce_convert_multiple_sets and its helper functions. This results in
cleaner function signatures, improved efficiency (a number of vecs and hash
set/map are replaced with a single vec of struct) and simplicity.

gcc/ChangeLog:

* ifcvt.cc (need_cmov_or_rewire): Renamed init_noce_multiple_sets_info.
(init_noce_multiple_sets_info): Initialize noce_multiple_sets_info.
(noce_convert_multiple_sets_1): Use noce_multiple_sets_info and handle
rewiring of multiple registers.
(noce_convert_multiple_sets): Updated to use noce_multiple_sets_info.
* ifcvt.h (struct noce_multiple_sets_info): Introduce new struct
noce_multiple_sets_info to store info for noce_convert_multiple_sets.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/ifcvt_multiple_sets_rewire.c: New test.
So this seems like (in theory) it could move forward independently.  The 
handling of arbitrary statements code wouldn't be exercised yet, but 
that's OK IMHO as I don't think anyone is fundamentally against trying 
to handle additional kinds of statements.


So my suggestion would be to bootstrap & regression test this 
independently.  AFAICT this should have no functional change if it were 
to go in on its own.  Note the testsuite entry might not be applicable 
if this were to go in on its own and would need to roll into another 
patch in the series.



Jeff


[PATCH 2/4] Add support for integer vector pair built-ins

2023-11-10 Thread Michael Meissner
This patch adds a series of built-in functions to allow users to write code to
do a number of simple operations where the loop is done using the __vector_pair
type.  The __vector_pair type is an opaque type.  These built-in functions keep
the two 128-bit vectors within the __vector_pair together, and split the
operation after register allocation.

This patch provides vector pair operations for 8, 16, 32, and 64-bit integers.

I have built and tested these patches on:

*   A little endian power10 server using --with-cpu=power10
*   A little endian power9 server using --with-cpu=power9
*   A big endian power9 server using --with-cpu=power9.

Can I check this patch into the master branch after the preceding patch is
checked in?

2023-11-09  Michael Meissner  

gcc/

* config/rs6000/rs6000-builtins.def (__builtin_vpair_i8*): Add built-in
functions for integer vector pairs.
(__builtin_vpair_i16*): Likewise.
(__builtin_vpair_i32*): Likewise.
(__builtin_vpair_i64*): Likewise.
* config/rs6000/vector-pair.md (UNSPEC_VPAIR_V32QI): New unspec.
(UNSPEC_VPAIR_V16HI): Likewise.
(UNSPEC_VPAIR_V8SI): Likewise.
(UNSPEC_VPAIR_V4DI): Likewise.
(VP_INT_BINARY): New iterator for integer vector pair.
(vp_insn): Add support for integer vector pairs.
(vp_ireg): New code attribute for integer vector pairs.
(vp_ipredicate): Likewise.
(VP_INT): New int iterator for integer vector pairs.
(VP_VEC_MODE): Likewise.
(vp_pmode): Likewise.
(vp_vmode): Likewise.
(vp_neg_reg): New int iterator for integer vector pairs.
(vpair_neg_): Add integer vector pair support insns.
(vpair_not_2): Likewise.
(vpair__3): Likewise.
(vpair_andc_): Likewise.
(vpair_nand__1): Likewise.
(vpair_nand__2): Likewise.
(vpair_nor__1): Likewise.
(vpair_nor__2): Likewise.
* doc/extend.texi (PowerPC Vector Pair Built-in Functions): Document the
integer vector pair built-in functions.

gcc/testsuite/

* gcc.target/powerpc/vector-pair-5.c: New test.
* gcc.target/powerpc/vector-pair-6.c: New test.
* gcc.target/powerpc/vector-pair-7.c: New test.
* gcc.target/powerpc/vector-pair-8.c: New test.
---
 gcc/config/rs6000/rs6000-builtins.def | 144 +
 gcc/config/rs6000/vector-pair.md  | 280 +-
 gcc/doc/extend.texi   |  72 +
 .../gcc.target/powerpc/vector-pair-5.c| 193 
 .../gcc.target/powerpc/vector-pair-6.c| 193 
 .../gcc.target/powerpc/vector-pair-7.c| 193 
 .../gcc.target/powerpc/vector-pair-8.c| 194 
 7 files changed, 1266 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-5.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-6.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-7.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-8.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 89b248b50ef..3b2db39c1ab 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -4183,3 +4183,147 @@
 
   v256 __builtin_vpair_f64_sub (v256, v256);
 VPAIR_F64_SUB vpair_sub_v4df3 {mma,pair}
+
+;; vector pair built-in functions for 32 8-bit unsigned char or
+;; signed char values
+
+  v256 __builtin_vpair_i8_add (v256, v256);
+VPAIR_I8_ADD vpair_add_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8_and (v256, v256);
+VPAIR_I8_AND vpair_and_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8_ior (v256, v256);
+VPAIR_I8_IOR vpair_ior_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8_max (v256, v256);
+VPAIR_I8_MAX vpair_smax_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8_min (v256, v256);
+VPAIR_I8_MIN vpair_smin_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8_neg (v256);
+VPAIR_I8_NEG vpair_neg_v32qi2 {mma,pair}
+
+  v256 __builtin_vpair_i8_not (v256);
+VPAIR_I8_NOT vpair_not_v32qi2 {mma,pair}
+
+  v256 __builtin_vpair_i8_sub (v256, v256);
+VPAIR_I8_SUB vpair_sub_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8_xor (v256, v256);
+VPAIR_I8_XOR vpair_xor_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8u_max (v256, v256);
+VPAIR_I8U_MAX vpair_umax_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8u_min (v256, v256);
+VPAIR_I8U_MIN vpair_umin_v32qi3 {mma,pair}
+
+;; vector pair built-in functions for 16 16-bit unsigned short or
+;; signed short values
+
+  v256 __builtin_vpair_i16_add (v256, v256);
+VPAIR_I16_ADD vpair_add_v16hi3 {mma,pair}
+
+  v256 __builtin_vpair_i16_and (v256, v256);
+VPAIR_I16_AND vpair_and_v16hi3 {mma,pair}
+
+  v256 __builtin_vpair_i16_ior (v256, v256);
+VPAIR_I16_IOR vpair_ior_v16hi3 {mma,pair}
+
+  v256 __builtin_vpair_i16_max (v256, v256);
+

Re: [PATCH] libgccjit: Fix GGC segfault when using -flto

2023-11-10 Thread David Malcolm
On Fri, 2023-11-10 at 11:02 -0500, Antoni Boucher wrote:
> Hi.
> This patch fixes the segfault when using -flto with libgccjit (bug
> 111396).
> 
> You mentioned in bugzilla that this didn't fix the reproducer for
> you,

Rereading https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111396 it looks
like all I tested back in August was your reproducer; I didn't yet test
your patch.

> but it does for me.
> At first, the test case would not pass, but running "make install"
> made
> it pass.
> Not sure if this is normal.
> 
> Could you please check if this fixes the issue on your side as well?
> Since this patch changes files outside of gcc/jit, what tests should
> I
> run to make sure it didn't break anything?

I'm trying your patch in my tester now.

BTW, we shouldn't add test-ggc-bugfix to since it adds options to the
context: this would affect all the other tests.


Dave



[PATCH 4/4] Add support for doing a horizontal add on vector pair elements.

2023-11-10 Thread Michael Meissner
This patch adds a series of built-in functions to allow users to write code to
do a number of simple operations where the loop is done using the __vector_pair
type.  The __vector_pair type is an opaque type.  These built-in functions keep
the two 128-bit vectors within the __vector_pair together, and split the
operation after register allocation.

This patch provides vector pair built-in functions to do a horizontal add on
vector pair elements.  Only floating point and 64-bit horizontal adds are
provided in this patch.
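
To make the semantics concrete, here is a portable scalar model of a
horizontal add over a vector pair of eight floats.  It is only a sketch: the
struct and function names are hypothetical, the real __vector_pair type is
opaque, and the order in which the built-in sums the elements is not stated
in this message, so results for inexact inputs may differ.

```c
/* Hypothetical model of a vector pair: two 128-bit vectors of 4 floats.  */
struct vpair_f32_model
{
  float hi[4];
  float lo[4];
};

/* Sum all eight elements into one scalar, which is the effect that
   __builtin_vpair_f32_add_elements is described as providing.  */
float
vpair_f32_add_elements_model (const struct vpair_f32_model *p)
{
  float sum = 0.0f;
  for (int i = 0; i < 4; i++)
    sum += p->hi[i] + p->lo[i];
  return sum;
}
```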

I have built and tested these patches on:

*   A little endian power10 server using --with-cpu=power10
*   A little endian power9 server using --with-cpu=power9
*   A big endian power9 server using --with-cpu=power9.

Can I check this patch into the master branch after the preceding patches have
been checked in?

2023-11-08  Michael Meissner  

gcc/

* config/rs6000/rs6000-builtins.def (__builtin_vpair_f32_add_elements):
New built-in function.
(__builtin_vpair_f64_add_elements): Likewise.
(__builtin_vpair_i64_add_elements): Likewise.
(__builtin_vpair_i64u_add_elements): Likewise.
* config/rs6000/vector-pair.md (UNSPEC_VPAIR_REDUCE_PLUS_F32): New
unspec.
(UNSPEC_VPAIR_REDUCE_PLUS_F64): Likewise.
(UNSPEC_VPAIR_REDUCE_PLUS_I64): Likewise.
(vpair_reduc_plus_scale_v8sf): New insn.
(vpair_reduc_plus_scale_v4df): Likewise.
(vpair_reduc_plus_scale_v4di): Likewise.
* doc/extend.texi (__builtin_vpair_f32_add_elements): Document.
(__builtin_vpair_f64_add_elements): Likewise.
(__builtin_vpair_i64_add_elements): Likewise.

gcc/testsuite/

* gcc.target/powerpc/vector-pair-16.c: New test.
---
 gcc/config/rs6000/rs6000-builtins.def | 12 +++
 gcc/config/rs6000/vector-pair.md  | 93 +++
 gcc/doc/extend.texi   |  3 +
 .../gcc.target/powerpc/vector-pair-16.c   | 45 +
 4 files changed, 153 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-16.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index fbd416ceb87..b9a16c01420 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -4145,6 +4145,9 @@
   v256 __builtin_vpair_f32_add (v256, v256);
 VPAIR_F32_ADD vpair_add_v8sf3 {mma,pair}
 
+  float __builtin_vpair_f32_add_elements (v256);
+VPAIR_F32_ADD_ELEMENTS vpair_reduc_plus_scale_v8sf {mma,pair}
+
   v256 __builtin_vpair_f32_assemble (vf, vf);
 VPAIR_F32_ASSEMBLE vpair_assemble_v8sf {mma,pair}
 
@@ -4180,6 +4183,9 @@
   v256 __builtin_vpair_f64_add (v256, v256);
 VPAIR_F64_ADD vpair_add_v4df3 {mma,pair}
 
+  double __builtin_vpair_f64_add_elements (v256);
+VPAIR_F64_ADD_ELEMENTS vpair_reduc_plus_scale_v4df {mma,pair}
+
 v256 __builtin_vpair_f64_assemble (vd, vd);
 VPAIR_F64_ASSEMBLE vpair_assemble_v4df {mma,pair}
 
@@ -4375,6 +4381,9 @@ v256 __builtin_vpair_f64_assemble (vd, vd);
   v256 __builtin_vpair_i64_add (v256, v256);
 VPAIR_I64_ADD vpair_add_v4di3 {mma,pair}
 
+  long long __builtin_vpair_i64_add_elements (v256);
+VPAIR_I64_ADD_ELEMENTS vpair_reduc_plus_scale_v4di {mma,pair,no32bit}
+
   v256 __builtin_vpair_i64_and (v256, v256);
 VPAIR_I64_AND vpair_and_v4di3 {mma,pair}
 
@@ -4408,6 +4417,9 @@ v256 __builtin_vpair_f64_assemble (vd, vd);
   v256 __builtin_vpair_i64_xor (v256, v256);
 VPAIR_I64_XOR vpair_xor_v4di3 {mma,pair}
 
+  unsigned long long __builtin_vpair_i64u_add_elements (v256);
+VPAIR_I64U_ADD_ELEMENTS vpair_reduc_plus_scale_v4di {mma,pair,no32bit}
+
   v256 __builtin_vpair_i64u_assemble (vull, vull);
 VPAIR_I64U_ASSEMBLE vpair_assemble_v4di {mma,pair}
 
diff --git a/gcc/config/rs6000/vector-pair.md b/gcc/config/rs6000/vector-pair.md
index f6d0b2a39fc..b5e9330e71f 100644
--- a/gcc/config/rs6000/vector-pair.md
+++ b/gcc/config/rs6000/vector-pair.md
@@ -35,6 +35,9 @@ (define_c_enum "unspec"
UNSPEC_VPAIR_V4DI
UNSPEC_VPAIR_ZERO
UNSPEC_VPAIR_SPLAT
+   UNSPEC_VPAIR_REDUCE_PLUS_F32
+   UNSPEC_VPAIR_REDUCE_PLUS_F64
+   UNSPEC_VPAIR_REDUCE_PLUS_I64
])
 
 ;; Iterator doing unary/binary arithmetic on vector pairs
@@ -577,6 +580,66 @@ (define_insn_and_split "*vpair_nfms_fpcontract_4"
 }
   [(set_attr "length" "8")])
 
+
+;; Add all elements in a pair of V4SF vectors.
+(define_insn_and_split "vpair_reduc_plus_scale_v8sf"
+  [(set (match_operand:SF 0 "vsx_register_operand" "=wa")
+   (unspec:SF [(match_operand:OO 1 "vsx_register_operand" "v")]
+  UNSPEC_VPAIR_REDUCE_PLUS_F32))
+   (clobber (match_scratch:V4SF 2 "="))
+   (clobber (match_scratch:V4SF 3 "="))]
+  "TARGET_MMA"
+  "#"
+  "&& reload_completed"
+  [(pc)]
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  rtx tmp1 = operands[2];
+  rtx tmp2 = operands[3];
+  unsigned r = reg_or_subregno (op1);
+  rtx op1_hi = gen_rtx_REG (V4SFmode, r);

[PATCH 3/4] Add support for initializing and extracting from vector pairs

2023-11-10 Thread Michael Meissner
This patch adds a series of built-in functions to allow users to write code to
do a number of simple operations where the loop is done using the __vector_pair
type.  The __vector_pair type is an opaque type.  These built-in functions keep
the two 128-bit vectors within the __vector_pair together, and split the
operation after register allocation.

This patch provides vector pair operations for initializing a vector pair to
all 0's, for duplicating (splatting) a scalar value into every element, and for
combining two vectors into a vector pair.  This patch also provides vector pair
built-ins to extract one of the two vectors from a vector pair.
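
As a rough illustration of what these initialization and extraction operations
compute, here is a portable C model.  It is only a sketch: the model_* names
and the struct layout are hypothetical stand-ins, not the real __vector_pair
type or the __builtin_vpair_* functions, and it assumes vector 0 is simply the
first vector given to assemble (the real ordering may depend on endianness).

```c
#include <string.h>

/* Hypothetical stand-ins: a 128-bit vector of two doubles, and a 256-bit
   pair made of two such vectors.  These are NOT the real opaque types.  */
typedef struct { double e[2]; } model_v2df;
typedef struct { model_v2df half[2]; } model_vpair;

/* Model of a "zero" initializer: every element is 0.0.  */
static model_vpair
model_vpair_zero (void)
{
  model_vpair p;
  memset (&p, 0, sizeof p);
  return p;
}

/* Model of a "splat": duplicate one scalar into all four elements.  */
static model_vpair
model_vpair_f64_splat (double x)
{
  model_vpair p;
  for (int h = 0; h < 2; h++)
    for (int i = 0; i < 2; i++)
      p.half[h].e[i] = x;
  return p;
}

/* Model of "assemble": combine two 128-bit vectors into one pair.
   Assumption: the first argument becomes vector 0 of the pair.  */
static model_vpair
model_vpair_f64_assemble (model_v2df first, model_v2df second)
{
  model_vpair p;
  p.half[0] = first;
  p.half[1] = second;
  return p;
}

/* Model of "extract_vector": return vector 0 or vector 1 of the pair.  */
static model_v2df
model_vpair_f64_extract_vector (model_vpair p, int which)
{
  return p.half[which & 1];
}
```

The point of the real built-ins is that the pair stays in one register pair
through register allocation; the model above only captures the values, not
that code-generation property.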

I have built and tested these patches on:

*   A little endian power10 server using --with-cpu=power10
*   A little endian power9 server using --with-cpu=power9
*   A big endian power9 server using --with-cpu=power9.

Can I check this patch into the master branch after the preceding patches have
been checked in?

2023-11-09  Michael Meissner  

gcc/

* config/rs6000/predicates.md (mma_assemble_input_operand): Allow any
16-byte vector, not just V16QImode.
* config/rs6000/rs6000-builtins.def (__builtin_vpair_zero): New vector
pair initialization built-in functions.
(__builtin_vpair_*_assemble): Likewise.
(__builtin_vpair_*_splat): Likewise.
(__builtin_vpair_*_extract_vector): New vector pair extraction built-in
functions.
* config/rs6000/vector-pair.md (UNSPEC_VPAIR_V32QI): New unspec.
(UNSPEC_VPAIR_V16HI): Likewise.
(UNSPEC_VPAIR_V8SI): Likewise.
(UNSPEC_VPAIR_V4DI): Likewise.
(VP_INT_BINARY): New iterator for integer vector pair.
(vp_insn): Add support for integer vector pairs.
(vp_ireg): New code attribute for integer vector pairs.
(vp_ipredicate): Likewise.
(VP_INT): New int iterator for integer vector pairs.
(VP_VEC_MODE): Likewise.
(vp_pmode): Likewise.
(vp_vmode): Likewise.
(vp_neg_reg): New int iterator for integer vector pairs.
(vpair_neg_): Add integer vector pair support insns.
(vpair_not_2): Likewise.
(vpair__3): Likewise.
(vpair_andc_): Likewise.
(vpair_nand__1): Likewise.
(vpair_nand__2): Likewise.
(vpair_nor__1): Likewise.
(vpair_nor__2): Likewise.
* doc/extend.texi (PowerPC Vector Pair Built-in Functions): Document the
integer vector pair built-in functions.

gcc/testsuite/

* gcc.target/powerpc/vector-pair-5.c: New test.
* gcc.target/powerpc/vector-pair-6.c: New test.
* gcc.target/powerpc/vector-pair-7.c: New test.
* gcc.target/powerpc/vector-pair-8.c: New test.
---
 gcc/config/rs6000/predicates.md   |   2 +-
 gcc/config/rs6000/rs6000-builtins.def |  95 +
 gcc/config/rs6000/vector-pair.md  | 185 ++
 gcc/doc/extend.texi   |  44 +
 .../gcc.target/powerpc/vector-pair-10.c   |  86 
 .../gcc.target/powerpc/vector-pair-11.c   |  84 
 .../gcc.target/powerpc/vector-pair-12.c   | 156 +++
 .../gcc.target/powerpc/vector-pair-13.c   | 139 +
 .../gcc.target/powerpc/vector-pair-14.c   | 141 +
 .../gcc.target/powerpc/vector-pair-15.c   | 139 +
 .../gcc.target/powerpc/vector-pair-9.c|  13 ++
 11 files changed, 1083 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-10.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-11.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-12.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-13.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-14.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-15.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-9.c

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index ef7d3f214c4..922a77716c4 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1301,7 +1301,7 @@ (define_predicate "splat_input_operand"
 
 ;; Return 1 if this operand is valid for a MMA assemble accumulator insn.
 (define_special_predicate "mma_assemble_input_operand"
-  (match_test "(mode == V16QImode
+  (match_test "(VECTOR_MODE_P (mode) && GET_MODE_SIZE (mode) == 16
&& (vsx_register_operand (op, mode)
|| (MEM_P (op)
&& (indexed_or_indirect_address (XEXP (op, 0), mode)
diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 3b2db39c1ab..fbd416ceb87 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -4132,6 +4132,11 @@
   void __builtin_vsx_stxvp (v256, unsigned long, const v256 *);
 STXVP nothing {mma,pair}
 
+;; General vector pair built-in functions

Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-11-10 Thread Jason Merrill

[combined reply to all three threads]

On 11/9/23 23:24, waffl3x wrote:



I'm unfortunately going down a rabbit hole again.

--function.h:608
/* If pointers to member functions use the least significant bit to indicate
   whether a function is virtual, ensure a pointer to this function will have
   that bit clear. */
#define MINIMUM_METHOD_BOUNDARY \
  ((TARGET_PTRMEMFUNC_VBIT_LOCATION == ptrmemfunc_vbit_in_pfn) \
   ? MAX (FUNCTION_BOUNDARY, 2 * BITS_PER_UNIT) : FUNCTION_BOUNDARY)



So yes, it was for PMFs using the low bit of the pointer to indicate a
virtual member function. Since an xob memfn can't be virtual, it's
correct for them to have the same alignment as a static memfn.


Is it worth considering whether we want to support virtual xobj member
functions in the future? If that were the case would it be better if we
aligned things a little differently here? Or might it be better if we
wanted to support it as an extension to just effectively translate the
declaration back to one that is a METHOD_TYPE? I imagine this would be
the best solution for non-standard support of the syntax. We would
simply have to forbid by-value and conversion semantics and on the
user's side they would get consistent syntax.

However, this flies in the face of the defective/contradictory spec for
virtual function overrides. So I'm not really sure whether we would
want to do this. I just want to raise the question before we lock in
the alignment, if pushing the patch locks it in that is, I'm not really
sure if it needs to be stable or not.


It doesn't need to be stable; we can increase the alignment of decls as 
needed in new code without breaking older code.



All tests seemed to pass when applied to GCC14, but the results did
something funny where it said tests disappeared and new tests appeared
and passed. The ones that disappeared and the new ones that appeared
looked like they were identical so I'm not worrying about it. Just
mentioning it in case this is something I do need to look into.


That doesn't sound like a problem, but I'm curious about the specific
output you're seeing.


I've attached a few test result comparisons so you can take a look.


Looks like you're comparing results from different build directories and 
the libitm test wrongly includes the build directory in the test "name". 
 So yeah, just noise.



Side note, would you prefer I compile the lambda and by-value fixes
into a new version of this patch? Or as a separate patch? Originally I
had planned to put it in another patch, but I identified that the code
I wrote in build_over_call was kind of fundamentally broken and it was
almost merely coincidence that it worked at all. In light of this and
your comments (which I've skimmed, I will respond directly below) I
think I should just revise this patch with everything else.


Agreed.


There are a few known issues still present in this patch. Most importantly,
the implicit object argument fails to convert when passed to by-value xobj
parameters. This occurs both for xobj parameters that match the argument type
and xobj parameters that are unrelated to the object type, but have valid
conversions available. This behavior can be observed in the
explicit-obj-by-value[1-3].C tests. The implicit object argument appears to be
simply reinterpreted instead of any conversion applied. This is elaborated on
in the test cases.


Yes, that's because of:


@@ -9949,7 +9951,8 @@ build_over_call (struct z_candidate *cand, int flags, 
tsubst_flags_t complain)
}
}
/* Bypass access control for 'this' parameter. */
- else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
+ else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE
+ || DECL_XOBJ_MEMBER_FUNC_P (fn))



We don't want to take this path for xob fns. Instead I think we need to
change the existing:


gcc_assert (first_arg == NULL_TREE);



to assert that if first_arg is non-null, we're dealing with an xob fn,
and then go ahead and do the same conversion as the loop body on first_arg.


Despite this, calls where there is no valid conversion
available are correctly rejected, which I find surprising. The
explicit-obj-by-value4.C testcase demonstrates this odd but correct behavior.



Yes, because checking for conversions is handled elsewhere.


Yeah, as I noted above I realized that just handling it the same way as
iobj member functions is fundamentally broken. I was staring at it last
night and eventually realized that I could just copy the loop body. I
ended up asserting in the body handling the implicit object argument
for xobj member functions that first_arg != NULL_TREE, which I wasn't
sure of, but it seems to work.


That sounds like it might cause trouble with

struct A {
   void f(this A);
};

int main()
{
  (&A::f) (A());
}


I tried asking in IRC if there are any circumstances where first_arg
would be null for a non-static member function and I didn't get an
answer. The code above seemed to indicate that it could be. It just
looks like old code that is no longer valid 

[PATCH 1/4] Add support for floating point vector pair built-in functions

2023-11-10 Thread Michael Meissner
This patch adds a series of built-in functions to allow users to write code to
do a number of simple operations where the loop is done using the __vector_pair
type.  The __vector_pair type is an opaque type.  These built-in functions keep
the two 128-bit vectors within the __vector_pair together, and split the
operation after register allocation.

This patch provides vector pair operations for 32-bit floating point and 64-bit
floating point.
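
To make the "split the operation" idea concrete, here is a portable C model of
one such operation on 64-bit floating point.  It is only a sketch under stated
assumptions: the sketch_* names and struct layout are hypothetical, and the
multiply-add below is written as a plain a * b + c, whereas a real fused
multiply-add rounds only once.

```c
/* Hypothetical stand-ins for a 128-bit vector of two doubles and for a
   256-bit vector pair; not the real opaque __vector_pair type.  */
typedef struct { double e[2]; } sketch_v2df;
typedef struct { sketch_v2df half[2]; } sketch_pair;

/* One 128-bit operation: d = a * b + c on each of the two doubles.  */
static sketch_v2df
sketch_v2df_fma (sketch_v2df a, sketch_v2df b, sketch_v2df c)
{
  sketch_v2df d;
  for (int i = 0; i < 2; i++)
    d.e[i] = a.e[i] * b.e[i] + c.e[i];
  return d;
}

/* The pair operation is the same 128-bit operation applied independently
   to each half, which is what the post-register-allocation split does.  */
static sketch_pair
sketch_pair_f64_fma (sketch_pair a, sketch_pair b, sketch_pair c)
{
  sketch_pair d;
  for (int h = 0; h < 2; h++)
    d.half[h] = sketch_v2df_fma (a.half[h], b.half[h], c.half[h]);
  return d;
}
```

Because the two halves are independent, the split needs no data movement
between them; the benefit of the pair form comes from the paired loads and
stores around the operation.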

I have built and tested these patches on:

*   A little endian power10 server using --with-cpu=power10
*   A little endian power9 server using --with-cpu=power9
*   A big endian power9 server using --with-cpu=power9.

Can I check this patch into the master branch?

2023-11-09  Michael Meissner  

gcc/

* config/rs6000/rs6000-builtins.def (__builtin_vpair_f32_*): Add vector
pair built-in functions for float.
(__builtin_vpair_f64_*): Add vector pair built-in functions for double.
* config/rs6000/rs6000-protos.h (split_unary_vector_pair): Add
declaration.
(split_binary_vector_pair): Likewise.
(split_fma_vector_pair): Likewise.
* config/rs6000/rs6000.cc (split_unary_vector_pair): New helper function
for vector pair built-in functions.
(split_binary_vector_pair): Likewise.
(split_fma_vector_pair): Likewise.
* config/rs6000/rs6000.md (toplevel): Include vector-pair.md.
* config/rs6000/t-rs6000 (MD_INCLUDES): Add vector-pair.md.
* config/rs6000/vector-pair.md: New file.
* doc/extend.texi (PowerPC Vector Pair Built-in Functions): Document the
floating point and general vector pair built-in functions.

gcc/testsuite/

* gcc.target/powerpc/vector-pair-1.c: New test.
* gcc.target/powerpc/vector-pair-2.c: New test.
* gcc.target/powerpc/vector-pair-3.c: New test.
* gcc.target/powerpc/vector-pair-4.c: New test.
---
 gcc/config/rs6000/rs6000-builtins.def |  52 +++
 gcc/config/rs6000/rs6000-protos.h |   5 +
 gcc/config/rs6000/rs6000.cc   |  74 
 gcc/config/rs6000/rs6000.md   |   1 +
 gcc/config/rs6000/t-rs6000|   1 +
 gcc/config/rs6000/vector-pair.md  | 329 ++
 gcc/doc/extend.texi   |  46 +++
 .../gcc.target/powerpc/vector-pair-1.c| 135 +++
 .../gcc.target/powerpc/vector-pair-2.c| 134 +++
 .../gcc.target/powerpc/vector-pair-3.c|  60 
 .../gcc.target/powerpc/vector-pair-4.c|  60 
 11 files changed, 897 insertions(+)
 create mode 100644 gcc/config/rs6000/vector-pair.md
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-4.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index ce40600e803..89b248b50ef 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -4131,3 +4131,55 @@
 
   void __builtin_vsx_stxvp (v256, unsigned long, const v256 *);
 STXVP nothing {mma,pair}
+
+;; vector pair built-in functions for 8 32-bit float values
+
+  v256 __builtin_vpair_f32_abs (v256);
+VPAIR_F32_ABS vpair_abs_v8sf2 {mma,pair}
+
+  v256 __builtin_vpair_f32_add (v256, v256);
+VPAIR_F32_ADD vpair_add_v8sf3 {mma,pair}
+
+  v256 __builtin_vpair_f32_fma (v256, v256, v256);
+VPAIR_F32_FMA vpair_fma_v8sf4 {mma,pair}
+
+  v256 __builtin_vpair_f32_max (v256, v256);
+VPAIR_F32_MAX vpair_smax_v8sf3 {mma,pair}
+
+  v256 __builtin_vpair_f32_min (v256, v256);
+VPAIR_F32_MIN vpair_smin_v8sf3 {mma,pair}
+
+  v256 __builtin_vpair_f32_mul (v256, v256);
+VPAIR_F32_MUL vpair_mul_v8sf3 {mma,pair}
+
+  v256 __builtin_vpair_f32_neg (v256);
+VPAIR_F32_NEG vpair_neg_v8sf2 {mma,pair}
+
+  v256 __builtin_vpair_f32_sub (v256, v256);
+VPAIR_F32_SUB vpair_sub_v8sf3 {mma,pair}
+
+;; vector pair built-in functions for 4 64-bit double values
+
+  v256 __builtin_vpair_f64_abs (v256);
+VPAIR_F64_ABS vpair_abs_v4df2 {mma,pair}
+
+  v256 __builtin_vpair_f64_add (v256, v256);
+VPAIR_F64_ADD vpair_add_v4df3 {mma,pair}
+
+  v256 __builtin_vpair_f64_fma (v256, v256, v256);
+VPAIR_F64_FMA vpair_fma_v4df4 {mma,pair}
+
+  v256 __builtin_vpair_f64_max (v256, v256);
+VPAIR_F64_MAX vpair_smax_v4df3 {mma,pair}
+
+  v256 __builtin_vpair_f64_min (v256, v256);
+VPAIR_F64_MIN vpair_smin_v4df3 {mma,pair}
+
+  v256 __builtin_vpair_f64_mul (v256, v256);
+VPAIR_F64_MUL vpair_mul_v4df3 {mma,pair}
+
+  v256 __builtin_vpair_f64_neg (v256);
+VPAIR_F64_NEG vpair_neg_v4df2 {mma,pair}
+
+  v256 __builtin_vpair_f64_sub (v256, v256);
+VPAIR_F64_SUB vpair_sub_v4df3 {mma,pair}
diff --git a/gcc/config/rs6000/rs6000-protos.h 

[PATCH 0/4] Add vector pair builtins to PowerPC

2023-11-10 Thread Michael Meissner
This set of patches adds support for using the vector pair load (lxvp, plxvp,
and lxvpx) instructions and the vector pair store (stxvp, pstxvp, and stxvpx)
instructions that were introduced with ISA 3.1 on Power10 systems.

With GCC 13, the only place vector pairs (and vector quads) were used was to
feed into the MMA subsystem.  These patches do not use the MMA subsystem, but
they give users a way to write code that is extremely memory bandwidth
intensive.

There are two main ways to add vector pair support to the GCC compiler:
built-in functions vs. __attribute__((__vector_size__(32))).

The first method is to add a set of built-in functions that use the vector pair
type and it allows the user to write loops and such using the vector pair type
(__vector_pair).  Loads are normally done using the load vector pair
instructions.  Then the operation is done as a post reload split to do the two
independent vector operations on the two 128-bit vectors located in the vector
pair.  When the type is stored, normally a store vector pair instruction is
used.  By keeping the value within a vector pair through register allocation,
the compiler does not generate extra move instructions which can slow down the
loop.

The second method is to add support for the V4DF, V8SF, etc. types.  By doing
so, you can use __attribute__((__vector_size__(32))) to declare variables that
are vector pairs, and the GCC compiler will generate the appropriate code.  I
implemented a limited prototype of this support, but it has some problems that
I haven't addressed.  One potential problem with using the 32-byte vector size
is that it can generate worse code for operations that aren't covered, as the
compiler unpacks things and re-packs them.  The compiler would also generate
these unpacks and packs if you are generating code for a power9 system.  There
are a bunch of test cases that fail with my prototype implementation that I
haven't addressed yet.

After discussions within our group, it was decided that using built-in
functions is the way to go at this time, and these patches implement those
functions.

In terms of benchmarks, I wrote two benchmarks:

   1)   One benchmark is a saxpy type loop: value[i] += (a[i] * b[i]).  That is
a loop with 3 loads and a store per loop.

   2)   Another benchmark produces a scalar sum of an entire vector.  This is a
loop that just has a single load and no store.
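
For reference, the two benchmark kernels have roughly the following shape in
plain scalar C.  This is a sketch of the loop structure only: the function
names are hypothetical, and the actual benchmarks presumably use vector or
vector pair types rather than scalar code.

```c
#include <stddef.h>

/* Benchmark 1 (hypothetical name): saxpy-type loop.  Each iteration does
   three loads (value[i], a[i], b[i]) and one store (value[i]).  */
void
saxpy_like (double *value, const double *a, const double *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    value[i] += a[i] * b[i];
}

/* Benchmark 2 (hypothetical name): scalar sum of an entire vector.  Each
   iteration does a single load and no store.  */
double
sum_all (const double *a, size_t n)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; i++)
    sum += a[i];
  return sum;
}
```

The first loop is dominated by memory traffic, which is where paired loads and
stores can help; the second carries a dependency through the accumulator, so
its speed depends more on how the reduction is unrolled.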

For the saxpy type loop, I get the following general numbers for both float and
double:

   1)   The vector pair built-in functions are roughly 10% faster than using
normal vector processing.

   2)   The vector pair built-in functions are roughly 19-20% faster than if I
write the loop using the vector pair loads using the existing built-ins,
and then manually split the values and do the arithmetic and single
vector stores.

   3)   The vector pair built-in functions are roughly 35-40% faster than if I
write the loop using the existing built-ins for both vector pair load
and vector pair store.  If I apply the patches that Peter Bergner has
been writing for PR target/109116, then it improves the speed of the
existing built-ins for assembling and disassembling vector pairs.  In
this case, the vector pair built-in functions are 20-25% faster,
instead of 35-40% faster.  This is due to the patch eliminating extra
vector moves.

Unfortunately, for floating point, doing the sum of the whole vector is slower
using the new vector pair built-in functions using a simple loop (compared to
using the existing built-ins for disassembling vector pairs).  If I write more
complex loops that manually unroll the loop, then the floating point vector
pair built-in functions become like the integer vector pair built-in
functions.  So there is some amount of tuning that will need to be done.

There are 4 patches within this group of patches.

1)  The first patch adds vector pair support for 32-bit and 64-bit floating
point operations.  The operations provided are absolute value,
addition, fused multiply-add, minimum, maximum, multiplication,
negation, and subtraction.  I did not add divide or square root because
these instructions take long enough to compute that you don't get any
advantage from using the vector pair load/store instructions.

2)  The second patch adds vector pair support for 8-bit, 16-bit, 32-bit, and
64-bit integer operations.  The operations provided include addition,
bitwise and, bitwise inclusive or, bitwise exclusive or, bitwise not,
both signed and unsigned minimum/maximum, negation, and subtraction.  I
did not add multiply because the PowerPC architecture does not provide
single instructions to do integer vector multiply on the whole vector.
I could add shifts and rotates, but I didn't think memory intensive
code used these operations.

3)  The third 

[PATCH, V2] Power10: Add options to disable load and store vector pair.

2023-11-10 Thread Michael Meissner
This is version 2 of the patch to add -mno-load-vector-pair and
-mno-store-vector-pair undocumented tuning switches.

The difference between the first version of the patch and this version is that
I added explicit RTL isa attributes for when the compiler can generate the load
vector pair and store vector pair instructions.  By having this attribute, the
movoo insn has separate alternatives for when we generate the instruction and
when we want to split the instruction into 2 separate vector loads or stores.

In the first version of the patch, I had previously provided built-in functions
that would always generate load vector pair and store vector pair instructions
even if these instructions are normally disabled.  I found these built-ins
weren't specified like the other vector pair built-ins, and I didn't include
documentation for the built-in functions.  If we want such built-in functions,
we can add them as a separate patch later.

In addition, since both versions of the patch add #pragma target and attribute
support to change the results for individual functions, we can select on a
function by function basis what the defaults for load/store vector pair are.

The original text for the patch is:

In working on some future patches that involve utilizing vector pair
instructions, I wanted to be able to tune my program to enable or disable using
the vector pair load or store operations while still keeping the other
operations on the vector pair.

This patch adds two undocumented tuning options.  The -mno-load-vector-pair
option would tell GCC to generate two load vector instructions instead of a
single load vector pair.  The -mno-store-vector-pair option would tell GCC to
generate two store vector instructions instead of a single store vector pair.

If -mno-load-vector-pair is used, GCC will not generate the indexed
stxvpx instruction.  Similarly, if -mno-store-vector-pair is used, GCC will not
generate the indexed lxvpx instruction.  The reason for this is to enable
splitting the {,p}lxvp or {,p}stxvp instructions after reload without needing a
scratch GPR register.

The default for -mcpu=power10 is that both load vector pair and store vector
pair are enabled.

I added code so that the user code can modify these settings using either a
'#pragma GCC target' directive or __attribute__((__target__(...))) in the
function declaration.

I added tests for the switches, #pragma, and attribute options.

I have built this on both little endian power10 systems and big endian power9
systems doing the normal bootstrap and test.  There were no regressions in any
of the tests, and the new tests passed.  Can I check this patch into the master
branch?

2023-11-09  Michael Meissner  

gcc/

* config/rs6000/mma.md (movoo): Add support for -mno-load-vector-pair 
and
-mno-store-vector-pair.
* config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add support for
-mload-vector-pair and -mstore-vector-pair.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000.cc (rs6000_setup_reg_addr_masks): Only allow
indexed mode for OOmode if we are generating both load vector pair and
store vector pair instructions.
(rs6000_option_override_internal): Add support for -mno-load-vector-pair
and -mno-store-vector-pair.
(rs6000_opt_masks): Likewise.
* config/rs6000/rs6000.md (isa attribute): Add lxvp and stxvp
attributes.
(enabled attribute): Likewise.
* config/rs6000/rs6000.opt (-mload-vector-pair): New option.
(-mstore-vector-pair): Likewise.

gcc/testsuite/

* gcc.target/powerpc/vector-pair-attribute.c: New test.
* gcc.target/powerpc/vector-pair-pragma.c: New test.
* gcc.target/powerpc/vector-pair-switch1.c: New test.
* gcc.target/powerpc/vector-pair-switch2.c: New test.
* gcc.target/powerpc/vector-pair-switch3.c: New test.
* gcc.target/powerpc/vector-pair-switch4.c: New test.
---
 gcc/config/rs6000/mma.md  | 19 +--
 gcc/config/rs6000/rs6000-cpus.def |  8 ++-
 gcc/config/rs6000/rs6000.cc   | 30 +-
 gcc/config/rs6000/rs6000.md   | 10 +++-
 gcc/config/rs6000/rs6000.opt  |  8 +++
 .../powerpc/vector-pair-attribute.c   | 39 +
 .../gcc.target/powerpc/vector-pair-pragma.c   | 55 +++
 .../gcc.target/powerpc/vector-pair-switch1.c  | 16 ++
 .../gcc.target/powerpc/vector-pair-switch2.c  | 17 ++
 .../gcc.target/powerpc/vector-pair-switch3.c  | 17 ++
 .../gcc.target/powerpc/vector-pair-switch4.c  | 17 ++
 11 files changed, 225 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-attribute.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-pragma.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-switch1.c
 create mode 100644 

[PATCH] c++/modules: more checks for exporting names with using-declarations

2023-11-10 Thread Nathaniel Shead
I noticed while fixing PR106849 that we don't currently check validity
of exporting non-functions via using-declarations either; this patch
adds those checks factored out into a new function. I also tried to make
the error messages a bit more descriptive.

This patch is based on [1] (with the adjustment to use STRIP_TEMPLATE
Nathan mentioned), but could probably be a replacement for that patch if
preferred - if so I'm happy to re-send rebased off master instead.

The ICEs mentioned in the commit message are caused by code such as

  export module M;

  namespace {
    enum e { x };
  }
  export using ::e;

in depset::hash::finalize_dependencies when attempting to get a
DECL_SOURCE_LOCATION of an OVERLOAD. I haven't fixed that in this patch
though because after this patch I was no longer able to construct an
example of that error, but it's maybe something to fix up later as well.

[1]: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635869.html

Bootstrapped and regtested on x86_64-pc-linux-gnu. I don't have write
access.

-- >8 --

Currently only functions are directly checked for validity when
exporting via a using-declaration.  This patch also checks exporting
non-external names of variables, types, and enumerators.  This also
prevents ICEs with `export using enum` for internal-linkage enums.

While we're at it we also improve the error messages for these cases to
provide more context about what went wrong.

gcc/cp/ChangeLog:

* name-lookup.cc (check_can_export_using_decl): New.
(do_nonmember_using_decl): Use above to check if names can be
exported.

gcc/testsuite/ChangeLog:

* g++.dg/modules/using-10.C: New test.
* g++.dg/modules/using-enum-2.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/name-lookup.cc   | 74 +++--
 gcc/testsuite/g++.dg/modules/using-10.C | 71 
 gcc/testsuite/g++.dg/modules/using-enum-2.C | 23 +++
 3 files changed, 148 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/using-10.C
 create mode 100644 gcc/testsuite/g++.dg/modules/using-enum-2.C

diff --git a/gcc/cp/name-lookup.cc b/gcc/cp/name-lookup.cc
index e74084948b6..d19ea5d121c 100644
--- a/gcc/cp/name-lookup.cc
+++ b/gcc/cp/name-lookup.cc
@@ -4802,6 +4802,49 @@ pushdecl_outermost_localscope (tree x)
   return b ? do_pushdecl_with_scope (x, b) : error_mark_node;
 }
 
+/* Checks if BINDING is a binding that we can export.  */
+
+static bool
+check_can_export_using_decl (tree binding)
+{
+  tree decl = STRIP_TEMPLATE (binding);
+
+  /* Linkage is determined by the owner of an enumerator.  */
+  if (TREE_CODE (decl) == CONST_DECL)
+decl = TYPE_NAME (DECL_CONTEXT (decl));
+
+  /* If the using decl is exported, the things it refers
+ to must also be exported (or not have module attachment).  */
+  if (!DECL_MODULE_EXPORT_P (decl)
+  && (DECL_LANG_SPECIFIC (decl)
+ && DECL_MODULE_ATTACH_P (decl)))
+{
+  bool internal_p = !TREE_PUBLIC (decl);
+
+  /* A template in an anonymous namespace doesn't constrain TREE_PUBLIC
+until it's instantiated, so double-check its context.  */
+  if (!internal_p && TREE_CODE (binding) == TEMPLATE_DECL)
+   internal_p = decl_internal_context_p (decl);
+
+  auto_diagnostic_group d;
+  error ("exporting %q#D that does not have external linkage",
+binding);
+  if (TREE_CODE (decl) == TYPE_DECL && !DECL_IMPLICIT_TYPEDEF_P (decl))
+   /* An un-exported explicit type alias has no linkage.  */
+   inform (DECL_SOURCE_LOCATION (binding),
+   "%q#D declared here with no linkage", binding);
+  else if (internal_p)
+   inform (DECL_SOURCE_LOCATION (binding),
+   "%q#D declared here with internal linkage", binding);
+  else
+   inform (DECL_SOURCE_LOCATION (binding),
+   "%q#D declared here with module linkage", binding);
+  return false;
+}
+
+  return true;
+}
+
 /* Process a local-scope or namespace-scope using declaration.  LOOKUP
is the result of qualified lookup (both value & type are
significant).  FN_SCOPE_P indicates if we're at function-scope (as
@@ -4845,22 +4888,7 @@ do_nonmember_using_decl (name_lookup &lookup, bool 
fn_scope_p,
  tree new_fn = *usings;
  bool exporting = revealing_p && module_exporting_p ();
  if (exporting)
-   {
- tree decl = STRIP_TEMPLATE (new_fn);
-
- /* If the using decl is exported, the things it refers
-to must also be exported (or not have module attachment).  */
- if (!DECL_MODULE_EXPORT_P (decl)
- && (DECL_LANG_SPECIFIC (decl)
- && DECL_MODULE_ATTACH_P (decl)))
-   {
- auto_diagnostic_group d;
- error ("%q#D does not have external linkage", new_fn);
- inform (DECL_SOURCE_LOCATION (new_fn),
- 

[PATCH] C99 testsuite readiness: Compile more tests with -std=gnu89

2023-11-10 Thread Florian Weimer
gcc/testsuite/

* gcc.c-torture/compile/386.c: Compile with -std=gnu89.
* gcc.c-torture/compile/BUG1.c: Likewise.
* gcc.c-torture/compile/BUG11.c: Likewise.
* gcc.c-torture/compile/BUG16.c: Likewise.
* gcc.c-torture/compile/BUG2.c: Likewise.
* gcc.c-torture/compile/BUG24.c: Likewise.
* gcc.c-torture/compile/BUG25.c: Likewise.
* gcc.c-torture/compile/BUG3.c: Likewise.
* gcc.c-torture/compile/DFcmp.c: Likewise.
* gcc.c-torture/compile/HIcmp.c: Likewise.
* gcc.c-torture/compile/HIset.c: Likewise.
* gcc.c-torture/compile/QIcmp.c: Likewise.
* gcc.c-torture/compile/QIset.c: Likewise.
* gcc.c-torture/compile/SFset.c: Likewise.
* gcc.c-torture/compile/SIcmp.c: Likewise.
* gcc.c-torture/compile/SIset.c: Likewise.
* gcc.c-torture/compile/UHIcmp.c: Likewise.
* gcc.c-torture/compile/UQIcmp.c: Likewise.
* gcc.c-torture/compile/USIcmp.c: Likewise.
* gcc.c-torture/compile/a.c: Likewise.
* gcc.c-torture/compile/a1.c: Likewise.
* gcc.c-torture/compile/a3.c: Likewise.
* gcc.c-torture/compile/aa.c: Likewise.
* gcc.c-torture/compile/aaa.c: Likewise.
* gcc.c-torture/compile/abs.c: Likewise.
* gcc.c-torture/compile/ac.c: Likewise.
* gcc.c-torture/compile/acc.c: Likewise.
* gcc.c-torture/compile/add.c: Likewise.
* gcc.c-torture/compile/add386.c: Likewise.
* gcc.c-torture/compile/addcc.c: Likewise.
* gcc.c-torture/compile/andm.c: Likewise.
* gcc.c-torture/compile/andmem.c: Likewise.
* gcc.c-torture/compile/andn.c: Likewise.
* gcc.c-torture/compile/andok.c: Likewise.
* gcc.c-torture/compile/andsi.c: Likewise.
* gcc.c-torture/compile/andsparc.c: Likewise.
* gcc.c-torture/compile/aos.c: Likewise.
* gcc.c-torture/compile/arr.c: Likewise.
* gcc.c-torture/compile/as.c: Likewise.
* gcc.c-torture/compile/ase.c: Likewise.
* gcc.c-torture/compile/band.c: Likewise.
* gcc.c-torture/compile/bb0.c: Likewise.
* gcc.c-torture/compile/bb1.c: Likewise.
* gcc.c-torture/compile/bc.c: Likewise.
* gcc.c-torture/compile/bcopy.c: Likewise.
* gcc.c-torture/compile/bfx.c: Likewise.
* gcc.c-torture/compile/bge.c: Likewise.
* gcc.c-torture/compile/bit.c: Likewise.
* gcc.c-torture/compile/bitf.c: Likewise.
* gcc.c-torture/compile/bitw.c: Likewise.
* gcc.c-torture/compile/blk.c: Likewise.
* gcc.c-torture/compile/bt386.c: Likewise.
* gcc.c-torture/compile/bug.c: Likewise.
* gcc.c-torture/compile/buns.c: Likewise.
* gcc.c-torture/compile/c.c: Likewise.
* gcc.c-torture/compile/c2.c: Likewise.
* gcc.c-torture/compile/call.c: Likewise.
* gcc.c-torture/compile/callind.c: Likewise.
* gcc.c-torture/compile/calls-void.c: Likewise.
* gcc.c-torture/compile/calls.c: Likewise.
* gcc.c-torture/compile/cc.c: Likewise.
* gcc.c-torture/compile/cmb.c: Likewise.
* gcc.c-torture/compile/cmpsi386.c: Likewise.
* gcc.c-torture/compile/cmul.c: Likewise.
* gcc.c-torture/compile/comb.c: Likewise.
* gcc.c-torture/compile/consec.c: Likewise.
* gcc.c-torture/compile/const.c: Likewise.
* gcc.c-torture/compile/conv_tst.c: Likewise.
* gcc.c-torture/compile/cvt.c: Likewise.
* gcc.c-torture/compile/dbl_parm.c: Likewise.
* gcc.c-torture/compile/dblbug.c: Likewise.
* gcc.c-torture/compile/dead.c: Likewise.
* gcc.c-torture/compile/delay.c: Likewise.
* gcc.c-torture/compile/di.c: Likewise.
* gcc.c-torture/compile/div.c: Likewise.
* gcc.c-torture/compile/dm.c: Likewise.
* gcc.c-torture/compile/dshift.c: Likewise.
* gcc.c-torture/compile/e.c: Likewise.
* gcc.c-torture/compile/ex.c: Likewise.
* gcc.c-torture/compile/ext.c: Likewise.
* gcc.c-torture/compile/flo.c: Likewise.
* gcc.c-torture/compile/forgetcc.c: Likewise.
* gcc.c-torture/compile/g.c: Likewise.
* gcc.c-torture/compile/gen_tst.c: Likewise.
* gcc.c-torture/compile/gronk.c: Likewise.
* gcc.c-torture/compile/hi.c: Likewise.
* gcc.c-torture/compile/i.c: Likewise.
* gcc.c-torture/compile/icmp.c: Likewise.
* gcc.c-torture/compile/ifreg.c: Likewise.
* gcc.c-torture/compile/jumptab.c: Likewise.
* gcc.c-torture/compile/l.c: Likewise.
* gcc.c-torture/compile/layout.c: Likewise.
* gcc.c-torture/compile/lll.c: Likewise.
* gcc.c-torture/compile/load8.c: Likewise.
* gcc.c-torture/compile/loadhicc.c: Likewise.
* gcc.c-torture/compile/log2.c: Likewise.
* gcc.c-torture/compile/logic.c: Likewise.
* gcc.c-torture/compile/loop-1.c: Likewise.
* gcc.c-torture/compile/loop386.c: 

[PATCH] C99 testsuite readiness: More unverified testcase un-reductions

2023-11-10 Thread Florian Weimer
gcc/testsuite/

* gcc.c-torture/compile/BUG17.c (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/BUG18.c (main): Likewise.  Call
__builtin_printf instead of printf.
* gcc.c-torture/compile/BUG21.c (Nase): Add missing void
types.
* gcc.c-torture/compile/BUG23.c (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/BUG5.c (bar): Call
__builtin_printf instead of printf.
* gcc.c-torture/compile/BUG6.c (main): Likewise.  Add missing
int return type and missing void type.
* gcc.c-torture/compile/b.c (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/b1.c (main): Likewise.  Call
__builtin_printf instead of printf.
* gcc.c-torture/compile/b88.c (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/bbb.c (flset): Add missing void
return type and switch to prototype style.
* gcc.c-torture/compile/bf.c (clr, atoi): Declare.
(main): Add missing int return type.  Call
__builtin_printf instead of printf.
* gcc.c-torture/compile/bt.c (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/charmtst.c (foo, bar): Declare.
(c_move_tst): Add missing int return type.
* gcc.c-torture/compile/cmpdi-1.c (f, g): Add missing int
return type.
* gcc.c-torture/compile/cmphi.c (foo): Likewise.
* gcc.c-torture/compile/conv.c (main): Likewise.  Add missing
void type.  Call __builtin_printf instead of printf.
* gcc.c-torture/compile/ddd.c (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/dilayout.c (str, ll): Add missing
void return type.
* gcc.c-torture/compile/dimove.c (foo): Likewise.
* gcc.c-torture/compile/f2.c (foo): Likewise.
* gcc.c-torture/compile/flatten.c (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/fnul.c (main): Likewise.
Call __builtin_printf instead of printf.
* gcc.c-torture/compile/fq.c (expand_to_ascii): Add missing
void return type.
* gcc.c-torture/compile/funcptr-1.c (g): Call __builtin_printf
instead of printf.
(f): Likewise.  Add missing void types.
* gcc.c-torture/compile/glob.c (foo): Add missing void types.
* gcc.c-torture/compile/goto-1.c (f): Likewise.
* gcc.c-torture/compile/i++.c (main): Call __builtin_printf
instead of printf.
* gcc.c-torture/compile/ic.c (foo): Add missing int return
type.
* gcc.c-torture/compile/iftrap-1.c (bar, baz): Declare.
(f4, f6): Call __builtin_abort instead of abort.
* gcc.c-torture/compile/iftrap-2.c (bar): Declare.
* gcc.c-torture/compile/jmp.c (foo): Add missing int types.
* gcc.c-torture/compile/labels-1.c (f): Add missing int
return type and missing void type.  Call __builtin_abort
instead of abort.
* gcc.c-torture/compile/labels-2.c (f): Likewise.
* gcc.c-torture/compile/lbug.c (main): Add missing int
return type and missing void type.
* gcc.c-torture/compile/memtst.c (memtst): Add missing void
return type.
(main): Add missing int return type and missing void type.
Call __builtin_bzero instead of bzero.
* gcc.c-torture/compile/miscomp.c (main): Add missing int
return type and missing void type.  Call __builtin_printf
instead of printf.
* gcc.c-torture/compile/msp.c (bar): Declare.
(foo): Add missing void types.
* gcc.c-torture/compile/mtst.c (foo): Add missing int return
type.
* gcc.c-torture/compile/packed-1.c (f): Add missing void
types.
* gcc.c-torture/compile/pr17119.c (func1, func2): Declare.
* gcc.c-torture/compile/pr18712.c (foo, foo1): Declare.
* gcc.c-torture/compile/pr20412.c (bar1, bar2, bar3): Declare.
* gcc.c-torture/compile/pr21532.c (foo): Declare.
* gcc.c-torture/compile/pr22398.c (main): Call __builtin_exit
instead of exit.
* gcc.c-torture/compile/pr24883.c (orec_str_list): Add missing
void return type.
* gcc.c-torture/compile/pr25311.c (use): Declare.
* gcc.c-torture/compile/pr25514.c (foo): Declare.
* gcc.c-torture/compile/pr26425.c (key_put): Declare.
* gcc.c-torture/compile/pr27087.c (g): Declare.
* gcc.c-torture/compile/pr27282.c (colrow_equal): Add missing
int return type.
* gcc.c-torture/compile/pr27907.c (fann_run): Add missing
void return type.
* gcc.c-torture/compile/pr28489.c (c_compile): Likewise.
* gcc.c-torture/compile/pr28776-1.c
(tree_contains_struct_check_failed): Declare.
 

[PATCH] C99 testsuite readiness: Verified un-reductions

2023-11-10 Thread Florian Weimer
gcc/testsuite/

* gcc.c-torture/compile/20080613-1.c (hop_sendmsg): Call
__builtin_memcpy instead of memcpy.
* gcc.c-torture/compile/complex-6.c (bar): Declare.
* gcc.c-torture/compile/pr23445.c (__brelse): Declare.
* gcc.c-torture/compile/pr23946.c (long2str): Declare.
(avi_parse_comments): Call __builtin_memset instead of
memset.  Call __builtin_malloc instead of malloc.  Call
__builtin_memcpy instead of memcpy.  Call
__builtin_free instead of free.
* gcc.c-torture/compile/pr31953.c (toggle_hexedit_mode):
Add missing void return type.
* gcc.c-torture/compile/pr32372.c (MPV_encode_init): Add
missing void return type.
* gcc.c-torture/compile/pr32355.c (sigemptyset): Declare.
(doSignalsSetup): Add missing void return type.
* gcc.c-torture/compile/pr32453.c (__assert_fail): Declare.
* gcc.c-torture/compile/pr32571.c (mthca_is_memfree)
(mthca_arbel_fmr_unmap, mthca_tavor_fmr_unmap)
(mthca_unmap_fmr): Declare.
* gcc.c-torture/compile/pr32584.c (sortpin): Add missing
void types.
* gcc.c-torture/compile/pr32919.c (read_int, _itoa_word)
(__strnlen): Declare.
* gcc.c-torture/compile/pr33173.c (test_dir_format): Add
missing void return type.  Add missing int types.
* gcc.c-torture/compile/pr33855.c (cabsl): Declare.
* gcc.c-torture/compile/pr34334.c (__strsep_1c)
(__strsep_2c): Add missing void return type.
* gcc.c-torture/compile/pr35006.c (grub_putchar)
(cl_set_pos, cl_print, grub_memmove, cl_delete): Declare.
(grub_cmdline_get): Add missing void return type.
* gcc.c-torture/compile/pr35595.c (__kernel_sinf):
Declare.
* gcc.c-torture/compile/pr35869.c (f): Add missing void
return type.
* gcc.c-torture/compile/pr36172.c (FcCharSetFreeze): Add
missing return value.
* gcc.c-torture/compile/pr36238.c (lshift_s_s): Declare.
* gcc.c-torture/compile/pr37207.c (func_81, func_98):
Declare.
* gcc.c-torture/compile/pr37258.c (mod_rhs, lshift_s_s)
(func_112, func_23): Declare.
* gcc.c-torture/compile/pr37305.c (safe_mod_u_u): Declare.
* gcc.c-torture/compile/pr37327.c (func_93, func_59)
(func_124, func_117, safe_add_uint64_t_u_u)
(safe_mul_int32_t_s_s): Declare.
* gcc.c-torture/compile/pr37387.c (FuncMakeConsequencesPres):
Call __builtin_abort instead of ErrorQuit.
* gcc.c-torture/compile/pr37432.c (print_wkb_bytes): Declare.
* gcc.c-torture/compile/pr37713.c (sdp_seq_alloc): Declare.
* gcc.c-torture/compile/pr39886.c (func): Declare.
* gcc.c-torture/compile/pr39941.c (stop): Declare.
* gcc.c-torture/compile/pr41016.c (CompareRNAStructures):
Call __builtin_abort instead of Die.
* gcc.c-torture/compile/pr42632.c (___pskb_trim): Add
forward declaration.
* gcc.c-torture/compile/pr49710.c (baz): Add forward
declaration and missing void types.
(bar): Add missing void type.
* gcc.c-torture/compile/pr52437.c (fn2): Declare.
* gcc.c-torture/compile/pr57441.c (func_1): Add missing void
return type.
* gcc.c-torture/compile/pr87110.c (struct d): Add missing
semicolon.
(g, h): Define as int.
(i): Add missing void types.
* gcc.c-torture/compile/pr87468.c (a): Define as int.
(e, f): Declare.
(b): Add missing void types.
* gcc.c-torture/execute/pr79043.c (ptr2): Use cast in
initializer.
(typepun): Add missing void return type.
(main): Add missing int return type and missing void type.
* gcc.dg/pr100349.c (b): Add missing void return type.
* gcc.dg/pr106189.c (T): Declare.
* gcc.dg/pr110777.c (_setjmp): Declare.
* gcc.dg/pr45506.c (dynvec, relative_relocp, atexit): Declare.
(fini_array): Cast result of relative_relocp from int to int *.
* gcc.dg/pr97359.c: Compile with -Wno-pointer-to-int-cast.
(b): Cast pointer to int to avoid int-conversion warning.
* gcc.dg/uninit-pr78548.c: Call __builtin_printf instead
of printf.
* gcc.dg/torture/pr39829.c (f): Declare.
* gcc.dg/torture/pr44977.c (int329): Cast bar from pointer
to short.
* gcc.dg/torture/pr53703.c (ifa_sa_len): Declare.
(usagi_getifaddrs): Call __builtin_memset instead of memset
and __builtin_memcmp instead of memcmp.
* gcc.dg/torture/pr68625.c (z9): Explicitly cast
pointers to int.
* gcc.dg/torture/pr55964.c (f): Cast q to the expected type.
* gcc.dg/torture/pr70623.c (h9): Fix pointer cast in assignment
of av.
* gcc.dg/torture/pr81118.c (c): Return zero instead of
nothing.
* gcc.dg/torture/pr81510.c (e): Add cast from int to pointer.
 

[PATCH] C99 testsuite readiness: -fpermissive tests

2023-11-10 Thread Florian Weimer
These tests use obsolete language constructs, but they are not
clearly targeting C89, either.  So use -fpermissive to keep
future errors as warnings.

The reasons why obsolete constructs are used vary from
test to test.  Some tests deliberately exercise later stages
of the compiler that only occur with those constructs.  Some
tests have precise expectations about warnings that will become
errors with a future change, but do not specifically test a
particular warning/error (if that is the case, the later changes
tend to duplicate them into warning/error variants).  In a few
cases, use of obsolete constructs is clearly due to test case
reduction, but it was not possible to un-reduce the test due
to its size.

gcc/testsuite/

* c-c++-common/Wduplicated-branches-11.c: Compile with
-fpermissive.
* c-c++-common/Wduplicated-branches-12.c: Likewise.
* c-c++-common/builtins.c: Likewise.
* c-c++-common/pointer-to-fn1.c: Likewise.
* gcc.c-torture/compile/20010320-1.c: Likewise.
* gcc.c-torture/compile/20050105-1.c: Likewise.
* gcc.c-torture/compile/20080704-1.c: Likewise.
* gcc.c-torture/compile/20080910-1.c: Likewise.
* gcc.c-torture/compile/20090917-1.c: Likewise.
* gcc.c-torture/compile/20100915-1.c: Likewise.
* gcc.c-torture/compile/20101216-1.c: Likewise.
* gcc.c-torture/compile/20121027-1.c: Likewise.
* gcc.c-torture/compile/20180605-1.c: Likewise.
* gcc.c-torture/compile/950816-2.c: Likewise.
* gcc.c-torture/compile/dse.c: Likewise.
* gcc.c-torture/compile/pr100576.c: Likewise.
* gcc.c-torture/compile/pr17408.c: Likewise.
* gcc.c-torture/compile/pr19121.c: Likewise.
* gcc.c-torture/compile/pr26213.c: Likewise.
* gcc.c-torture/compile/pr27341-2.c: Likewise.
* gcc.c-torture/compile/pr28776-2.c: Likewise.
* gcc.c-torture/compile/pr33133.c: Likewise.
* gcc.c-torture/compile/pr34091.c: Likewise.
* gcc.c-torture/compile/pr36154.c: Likewise.
* gcc.c-torture/compile/pr37381.c: Likewise.
* gcc.c-torture/compile/pr38360.c: Likewise.
* gcc.c-torture/compile/pr40291.c: Likewise.
* gcc.c-torture/compile/pr41182-1.c: Likewise.
* gcc.c-torture/compile/pr43635.c: Likewise.
* gcc.c-torture/compile/pr44043.c: Likewise.
* gcc.c-torture/compile/pr44063.c: Likewise.
* gcc.c-torture/compile/pr44246.c: Likewise.
* gcc.c-torture/compile/pr45535.c: Likewise.
* gcc.c-torture/compile/pr46934.c: Likewise.
* gcc.c-torture/compile/pr47428.c: Likewise.
* gcc.c-torture/compile/pr49145.c: Likewise.
* gcc.c-torture/compile/pr49206.c: Likewise.
* gcc.c-torture/compile/pr51694.c: Likewise.
* gcc.c-torture/compile/pr53886.c: Likewise.
* gcc.c-torture/compile/pr65241.c: Likewise.
* gcc.c-torture/compile/pr72802.c: Likewise.
* gcc.c-torture/compile/pr81360.c: Likewise.
* gcc.c-torture/compile/pr82052.c: Likewise.
* gcc.c-torture/compile/pr90275-2.c: Likewise.
* gcc.c-torture/compile/pr90275.c: Likewise.
* gcc.c-torture/compile/pr96796.c: Likewise.
* gcc.c-torture/compile/regs-arg-size.c: Likewise.
* gcc.c-torture/compile/udivmod4.c: Likewise.
* gcc.c-torture/compile/widechar-1.c: Likewise.
* gcc.c-torture/execute/2412-3.c: Likewise.
* gcc.c-torture/execute/20010605-2.c: Likewise.
* gcc.c-torture/execute/20020314-1.c: Likewise.
* gcc.c-torture/execute/20020819-1.c: Likewise.
* gcc.c-torture/execute/20031211-2.c: Likewise.
* gcc.c-torture/execute/20040223-1.c: Likewise.
* gcc.c-torture/execute/20041019-1.c: Likewise.
* gcc.c-torture/execute/20120427-1.c: Likewise.
* gcc.c-torture/execute/20120427-2.c: Likewise.
* gcc.c-torture/execute/920908-2.c: Likewise.
* gcc.c-torture/execute/921110-1.c: Likewise.
* gcc.c-torture/execute/930111-1.c: Likewise.
* gcc.c-torture/execute/930208-1.c: Likewise.
* gcc.c-torture/execute/930702-1.c: Likewise.
* gcc.c-torture/execute/930818-1.c: Likewise.
* gcc.c-torture/execute/931017-1.c: Likewise.
* gcc.c-torture/execute/931018-1.c: Likewise.
* gcc.c-torture/execute/931208-1.c: Likewise.
* gcc.c-torture/execute/941101-1.c: Likewise.
* gcc.c-torture/execute/941110-1.c: Likewise.
* gcc.c-torture/execute/950322-1.c: Likewise.
* gcc.c-torture/execute/950426-1.c: Likewise.
* gcc.c-torture/execute/950512-1.c: Likewise.
* gcc.c-torture/execute/950621-1.c: Likewise.
* gcc.c-torture/execute/960218-1.c: Likewise.
* gcc.c-torture/execute/960402-1.c: Likewise.
* gcc.c-torture/execute/cmpsf-1.c: Likewise.
* gcc.c-torture/execute/cmpsi-1.c: Likewise.
* gcc.c-torture/execute/cmpsi-2.c: Likewise.
* 

Re: [PATCH v3 4/4] ifcvt: Remove obsolete code for subreg handling in noce_convert_multiple_sets

2023-11-10 Thread Jeff Law




On 8/30/23 04:14, Manolis Tsamis wrote:

This code used to handle register replacement issues with SUBREG before
simplify_replace_rtx was introduced. This should not be needed anymore as
new_val has the correct mode and that should be preserved by
simplify_replace_rtx.

gcc/ChangeLog:

* ifcvt.cc (noce_convert_multiple_sets_1): Remove old code.
So is it the case that this code is supposed to no longer be needed as a 
result of your kit or it is unnecessary independent of patches 1..3?  If 
the latter then it's OK for the trunk now.


Jeff


[committed] RISC-V: Fix indentation of "length" attribute for branches and jumps

2023-11-10 Thread Maciej W. Rozycki
The "length" attribute calculation expressions for branches and jumps 
are incorrectly and misleadingly indented, and they overrun the 80 
column limit as well, all of this causing troubles in following them.
Correct all these issues.

gcc/
* config/riscv/riscv.md (length): Fix indentation for branch and
jump length calculation expressions.
---
Hi,

 Applied as obvious.

  Maciej
---
 gcc/config/riscv/riscv.md |   28 +---
 1 file changed, 17 insertions(+), 11 deletions(-)

gcc-riscv-attr-length-branch-indent.diff
Index: gcc/gcc/config/riscv/riscv.md
===
--- gcc.orig/gcc/config/riscv/riscv.md
+++ gcc/gcc/config/riscv/riscv.md
@@ -518,20 +518,26 @@
  ;; Branches further than +/- 1 MiB require three instructions.
  ;; Branches further than +/- 4 KiB require two instructions.
  (eq_attr "type" "branch")
- (if_then_else (and (le (minus (match_dup 0) (pc)) (const_int 4088))
- (le (minus (pc) (match_dup 0)) (const_int 4092)))
- (const_int 4)
- (if_then_else (and (le (minus (match_dup 0) (pc)) (const_int 1048568))
- (le (minus (pc) (match_dup 0)) (const_int 1048572)))
- (const_int 8)
- (const_int 12)))
+ (if_then_else (and (le (minus (match_dup 0) (pc))
+(const_int 4088))
+(le (minus (pc) (match_dup 0))
+(const_int 4092)))
+   (const_int 4)
+   (if_then_else (and (le (minus (match_dup 0) (pc))
+  (const_int 1048568))
+  (le (minus (pc) (match_dup 0))
+  (const_int 1048572)))
+ (const_int 8)
+ (const_int 12)))
 
  ;; Jumps further than +/- 1 MiB require two instructions.
  (eq_attr "type" "jump")
- (if_then_else (and (le (minus (match_dup 0) (pc)) (const_int 1048568))
- (le (minus (pc) (match_dup 0)) (const_int 1048572)))
- (const_int 4)
- (const_int 8))
+ (if_then_else (and (le (minus (match_dup 0) (pc))
+(const_int 1048568))
+(le (minus (pc) (match_dup 0))
+(const_int 1048572)))
+   (const_int 4)
+   (const_int 8))
 
  ;; Conservatively assume calls take two instructions (AUIPC + JALR).
  ;; The linker will opportunistically relax the sequence to JAL.
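
For readers who want to check the arithmetic, the reindented "length" expressions in the hunk above can be restated as a small Python sketch.  The function names are hypothetical and the operands are plain integers rather than RTL; only the thresholds are taken from the patch.

```python
# Illustrative restatement of the riscv.md "length" logic for branches
# and jumps.  Offsets are byte addresses; names are not from GCC.

def branch_length(target: int, pc: int) -> int:
    """Length in bytes of a conditional branch from pc to target."""
    forward = target - pc
    backward = pc - target
    if forward <= 4088 and backward <= 4092:
        return 4    # within +/- 4 KiB: one instruction
    if forward <= 1048568 and backward <= 1048572:
        return 8    # within +/- 1 MiB: two instructions
    return 12       # further away: three instructions


def jump_length(target: int, pc: int) -> int:
    """Length in bytes of an unconditional jump from pc to target."""
    forward = target - pc
    backward = pc - target
    if forward <= 1048568 and backward <= 1048572:
        return 4    # within +/- 1 MiB: one instruction
    return 8        # further away: two instructions
```

The forward and backward limits differ by 4 in each case, exactly as in the machine description; the sketch does not try to explain why, it only mirrors the constants.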


Re: [PATCH v3 2/4] ifcvt: Allow more operations in multiple set if conversion

2023-11-10 Thread Jeff Law




On 10/19/23 13:46, Richard Sandiford wrote:

+  /* Allow a wide range of operations and let the costing function decide
+if the conversion is worth it later.  */
+  enum rtx_code code = GET_CODE (src);
+  if (!(CONSTANT_P (src)
+   || code == REG
+   || code == SUBREG
+   || code == ZERO_EXTEND
+   || code == SIGN_EXTEND
+   || code == NOT
+   || code == NEG
+   || code == PLUS
+   || code == MINUS
+   || code == AND
+   || code == IOR
+   || code == MULT
+   || code == ASHIFT
+   || code == ASHIFTRT
+   || code == NE
+   || code == EQ
+   || code == GE
+   || code == GT
+   || code == LE
+   || code == LT
+   || code == GEU
+   || code == GTU
+   || code == LEU
+   || code == LTU
+   || code == COMPARE))
return false;


I'm nervous about lists of operations like these, for two reasons:

(1) It isn't obvious what criteria are used to select the codes.

(2) It requires the top-level code to belong to a given set, but it
 allows subrtxes of src to be arbitrarily complex.  E.g. (to pick
 a silly example) a toplevel (popcount ...) would be rejected, but
 (plus (popcount ...) (const_int 1)) would be OK.
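
Point (2) can be illustrated with a toy model in Python.  This is only a sketch: rtxes are modelled as nested tuples, is_constant is a stand-in for CONSTANT_P that knows about const_int only, and none of the names below exist in GCC.

```python
# Toy model: an rtx is a tuple (code, operand, ...).
ALLOWED_CODES = {
    "reg", "subreg", "zero_extend", "sign_extend", "not", "neg",
    "plus", "minus", "and", "ior", "mult", "ashift", "ashiftrt",
    "ne", "eq", "ge", "gt", "le", "lt", "geu", "gtu", "leu", "ltu",
    "compare",
}


def is_constant(rtx):
    # Stand-in for CONSTANT_P; only const_int is modelled here.
    return rtx[0] == "const_int"


def top_level_check(src):
    """Mirror of the quoted condition: only the outermost code is inspected."""
    return is_constant(src) or src[0] in ALLOWED_CODES


popcount = ("popcount", ("reg", 1))
wrapped = ("plus", popcount, ("const_int", 1))

print(top_level_check(popcount))  # False: popcount is not on the list
print(top_level_check(wrapped))   # True: the nested popcount is never examined
```

A check that wanted to reject the second case would have to walk the operands recursively, which is part of why dropping the list and relying on costing looks attractive.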

Could we just remove this condition instead?
I'd be all for that.  We've actually got a similar problem in Joern's 
ext-dce code that I'm working on.  At least in that case I think we'll 
be able to enumerate how/why things are on the list if we still need it 
after the cleanup phase.


So I think the guidance on patch #2 would be to remove the list entirely 
if we can.


jeff


Re: [PATCH v3 1/4] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

2023-11-10 Thread Jeff Law




On 10/20/23 03:16, Richard Sandiford wrote:

Thanks for the context.

Robin Dapp writes:

Sorry for the slow review.  TBH I was hoping someone else would pick
it up, since (a) I'm not very familiar with this code, and (b) I don't
really agree with the way that the current code works.  I'm not sure the
current dependency checking is safe, so I'm nervous about adding even
more cases to it.  And it feels like the different ifcvt techniques are
not now well distinguished, so that they're beginning to overlap and
compete with one another.  None of that is your fault, of course. :)


I might be to blame, at least partially :)  The idea back then was to
do it here because (1) it can handle cases the other part cannot and
(2) its costing is based on sequence cost rather than just counting
instructions.


Ah, OK.  (2) seems like a good reason.
Agreed.  It's been a problem area (costing ifcvt), but it's still the 
right thing to do.  No doubt if we change something from counting insns 
to sequence costing it'll cause another set of problems, but that 
shouldn't stop us from doing the right thing here.





Yeah, makes sense.  Using your example, there seem to be two things
that we're checking:

(1) Does the sequence change cc?  This is checked via:

   if (cc_cmp)
     {
       /* Check if SEQ can clobber registers mentioned in
          cc_cmp and/or rev_cc_cmp.  If yes, we need to use
          only seq1 from that point on.  */
       rtx cc_cmp_pair[2] = { cc_cmp, rev_cc_cmp };
       for (walk = seq; walk; walk = NEXT_INSN (walk))
         {
           note_stores (walk, check_for_cc_cmp_clobbers, cc_cmp_pair);
           if (cc_cmp_pair[0] == NULL_RTX)
             {
               cc_cmp = NULL_RTX;
               rev_cc_cmp = NULL_RTX;
               break;
             }
         }
     }

 and is the case that Manolis's patch is dealing with.

(2) Does the sequence use a and b?  If so, we need to use temporary
 destinations for any earlier writes to a and b.
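
As a rough illustration of check (1), the walk over the sequence can be modelled in Python as below.  This is a sketch under simplifying assumptions: each insn is reduced to the set of registers it writes, a comparison is a tuple whose string operands are registers, and the function names are hypothetical rather than taken from ifcvt.cc.

```python
# Toy model of "does SEQ clobber anything the comparison depends on?".

def registers_in(cmp):
    """Registers mentioned by a comparison modelled as (code, op0, op1)."""
    return {op for op in cmp[1:] if isinstance(op, str)}


def seq_clobbers_cc(seq, cc_cmp, rev_cc_cmp):
    """True if any insn in seq writes a register used by either comparison."""
    watched = registers_in(cc_cmp) | registers_in(rev_cc_cmp)
    return any(insn["writes"] & watched for insn in seq)


cc_cmp = ("lt", "cc", 0)
rev_cc_cmp = ("ge", "cc", 0)
seq_ok = [{"writes": {"r1"}}, {"writes": {"r2"}}]
seq_bad = [{"writes": {"r1"}}, {"writes": {"cc"}}]

print(seq_clobbers_cc(seq_ok, cc_cmp, rev_cc_cmp))   # False
print(seq_clobbers_cc(seq_bad, cc_cmp, rev_cc_cmp))  # True
```

In the quoted code a positive result clears both cc_cmp and rev_cc_cmp, so that only seq1 is used from that point on.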

Is that right?

(1) looks OK, although Manolis's modified_in_p would work too.

Agreed.



(2) is the code I quoted yesterday and is the part I'm not sure
about.  First of all:

   seq1 = try_emit_cmove_seq (if_info, temp, cond,
                              new_val, old_val, need_cmov,
                              &cost1, &temp_dest1);

must have a consistent view of what a and b are.  So old_val and new_val
cannot at this point reference "newer" values of a and b (as set by previous
instructions in the sequence).  AIUI:
Sigh.  ifcvt seems to pervasively adjust arguments, then you have to 
figure out which one is the right one for any given context.  It was 
driving me nuts a couple weeks ago when I was looking at the condzero 
work.  It's part of why I set everything down at the time.  I ran into 
it in the VRULL code, Ventana's hacks on top of the VRULL code and in 
the ESWIN code, got frustrated and decided to look at something else for 
a bit (which has led to its own little rathole).








The same cond, new_val and old_val are used in:

seq2 = try_emit_cmove_seq (if_info, temp, cond,
                           new_val, old_val, need_cmov,
                           &cost2, &temp_dest2, cc_cmp, rev_cc_cmp);

So won't any use of a and b in seq2 be from cond, rather than old_val
and new_val?  If so, shouldn't we set read_comparison for any use of a
and b, rather than skipping IF_THEN_ELSE?

Seems like it to me, yes.





Using seq_cost seems like the best way of costing things.  And I agree
that it's worth trying to avoid costing (and generating) redundant
instructions.
I think there's general agreement on seq_cost.  I wonder if we should 
look to split that out on its own, then figure out what to do with the 
bigger issues in this space.


Jeff


[PATCH 1/3] options: add gcc/regenerate-opt-urls.py

2023-11-10 Thread David Malcolm
gcc/ChangeLog:
* doc/options.texi (Option properties): Add UrlSuffix and
description of regenerate-opt-urls.py.
* regenerate-opt-urls.py: New file.
---
 gcc/doc/options.texi   |  17 ++
 gcc/regenerate-opt-urls.py | 366 +
 2 files changed, 383 insertions(+)
 create mode 100755 gcc/regenerate-opt-urls.py

diff --git a/gcc/doc/options.texi b/gcc/doc/options.texi
index 715f0a1479c7..1ea4b33bc765 100644
--- a/gcc/doc/options.texi
+++ b/gcc/doc/options.texi
@@ -597,4 +597,21 @@ This warning option corresponds to @code{cpplib.h} warning reason code
 @var{CPP_W_Enum}.  This should only be used for warning options of the
 C-family front-ends.
 
+@item UrlSuffix(@var{url_suffix})
+Adjacent to each human-written @code{.opt} file in the source tree is
+a corresponding file with a @code{.opt.urls} extension.  These files
+contain @code{UrlSuffix} directives giving the ending part of the URL
+for the documentation of the option, such as:
+
+@smallexample
+Wabi-tag
+UrlSuffix(gcc/C_002b_002b-Dialect-Options.html#index-Wabi-tag)
+@end smallexample
+
+These URL suffixes are relative to @code{DOCUMENTATION_ROOT_URL}.
+
+These files are generated from the @code{.opt} files and the generated
+HTML documentation by @code{regenerate-opt-urls.py}, and should be
+regenerated when adding new options.
+
 @end table
diff --git a/gcc/regenerate-opt-urls.py b/gcc/regenerate-opt-urls.py
new file mode 100755
index ..e2c63c27cbad
--- /dev/null
+++ b/gcc/regenerate-opt-urls.py
@@ -0,0 +1,366 @@
+#!/usr/bin/env python3
+
+# Copyright (C) 2023 Free Software Foundation, Inc.
+#
+# Script to regenerate FOO.opt.urls files for each FOO.opt in the
+# source tree.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it under
+# the terms of the GNU General Public License as published by the Free
+# Software Foundation; either version 3, or (at your option) any later
+# version.
+#
+# GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+# WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+# for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# .  */
+
+DESCRIPTION = """
+Parses the generated HTML (from "make html") to locate anchors
+for options, then parses the .opt files within the source tree,
+and generates a .opt.urls in the source tree for each .opt file,
+giving URLs for each option, where it can.
+
+Usage (from build/gcc subdirectory):
+  ../../src/gcc/regenerate-opt-urls.py HTML/gcc-14.0.0/ ../../src
+
+To run unit tests:
+  ../../src/gcc/regenerate-opt-urls.py HTML/gcc-14.0.0/ ../../src --unit-test
+"""
+
+import argparse
+import json
+import os
+from pathlib import Path
+from pprint import pprint
+import sys
+import re
+import unittest
+
+def canonicalize_option_name(option_name):
+    if option_name.endswith('='):
+        option_name = option_name[0:-1]
+    return option_name
+
+
+def canonicalize_url_suffix(url_suffix):
+    """
+    Various options have anchors for both the positive and
+    negative form.  For example -Wcpp has both:
+      'gcc/Warning-Options.html#index-Wno-cpp'
+      'gcc/Warning-Options.html#index-Wcpp'
+
+    Return a canonicalized version of the url_suffix that
+    strips out any "no-" prefixes, for use in deduplication.
+    Note that the resulting url suffix might not correspond to
+    an actual anchor in the HTML.
+    """
+    url_suffix = re.sub('index-Wno-', 'index-W', url_suffix)
+    url_suffix = re.sub('index-fno-', 'index-f', url_suffix)
+    url_suffix = re.sub('_003d$', '', url_suffix)
+    url_suffix = re.sub('-([0-9]+)$', '', url_suffix)
+    return url_suffix
+
+
+class Index:
+    def __init__(self):
+        # Map from option name to set of URL suffixes
+        self.entries = {}
+
+    def add_entry(self, matched_text, url_suffix, verbose=False):
+        if 'Attributes.html' in url_suffix:
+            return
+        matched_text = canonicalize_option_name(matched_text)
+        if matched_text in self.entries:
+            # Partition by canonicalized url_suffixes; add the
+            # first url_suffix in each such partition.
+            c_new = canonicalize_url_suffix(url_suffix)
+            for entry in self.entries[matched_text]:
+                c_entry = canonicalize_url_suffix(entry)
+                if c_new == c_entry:
+                    return
+            self.entries[matched_text].add(url_suffix)
+        else:
+            self.entries[matched_text] = set([url_suffix])
+
+    def get_url_suffixes(self, text):
+        text = canonicalize_option_name(text)
+        return self.entries.get(text)
+
+    def parse_option_index(self, input_filename, verbose=False):
+        with open(input_filename) as f:
+            for 

[PATCH 3/3] diagnostics: use the .opt.urls files to urlify quoted text

2023-11-10 Thread David Malcolm
This patch adds machinery for using the .opt.urls files linking
to the documentation of our options in gcc_urlifier.

For every enabled .opt file, the corresponding .opt.urls file
will also be used when constructing the "optionslist" file.
The patch adds a new awk script to process the optionslist file,
options-urls-cc-gen.awk, which generates an options-urls.cc file,
containing a big array of const char * of the form:

const char * const opt_url_suffixes[] =
{
 [...snip...]

 /* [563] (OPT_Wclass_memaccess) = */
"gcc/C_002b_002b-Dialect-Options.html#index-Wclass-memaccess",
 /* [564] (OPT_Wclobbered) = */
"gcc/Warning-Options.html#index-Wclobbered",

[...snip...]
};

The patch wires up gcc_urlifier so that for quoted strings beginning
with '-' it will look up the option, and, if found, build a URL
using one of the above suffixes.

For example, given:

  ./xgcc -B. -S t.c -Wctad-maybe-unsupported
  cc1: warning: command-line option ‘-Wctad-maybe-unsupported’ is valid for C++/ObjC++ but not for C

the quoted string -Wctad-maybe-unsupported is automatically URLified in
my terminal to:

https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html#index-Wctad-maybe-unsupported
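
The lookup described above can be sketched in Python as follows.  This is not the C++ implementation: the dictionary is a hypothetical stand-in for the generated opt_url_suffixes array, the function name is illustrative, and the root URL is an assumption inferred from the example URL.

```python
# Assumed value of DOCUMENTATION_ROOT_URL, inferred from the example URL.
DOCUMENTATION_ROOT_URL = "https://gcc.gnu.org/onlinedocs/"

# Hypothetical stand-in for the generated opt_url_suffixes table,
# keyed by option text instead of by OPT_* index.
OPT_URL_SUFFIXES = {
    "-Wclass-memaccess":
        "gcc/C_002b_002b-Dialect-Options.html#index-Wclass-memaccess",
    "-Wclobbered":
        "gcc/Warning-Options.html#index-Wclobbered",
}


def url_for_quoted_text(text):
    """Return a documentation URL for a quoted option, or None."""
    if not text.startswith("-"):
        return None     # only strings beginning with '-' are treated as options
    suffix = OPT_URL_SUFFIXES.get(text)
    if suffix is None:
        return None     # unknown option: no URL
    return DOCUMENTATION_ROOT_URL + suffix


print(url_for_quoted_text("-Wclobbered"))
```

The sketch leaves out the language mask and the fallback to the doc_urls table that the ChangeLog below mentions.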

This approach seems to allow us to get URLs automatically from the
documentation, whilst:
- integrating with the existing .opt mechanisms but keeping
autogenerated material (.opt.urls) separate from human-maintained
files (.opt)
- not adding any build-time requirements (by using awk at build time)
- only requiring Python 3 when regenerating the in-tree opt.urls files,
when the .texi or .opt files change enough to warrant it

gcc/ChangeLog:
* Makefile.in (ALL_OPT_URL_FILES): New.
(GCC_OBJS): Add options-urls.o.
(OBJS): Likewise.
(OBJS-libcommon): Likewise.
(s-options): Depend on $(ALL_OPT_URL_FILES), and add this to
inputs to opt-gather.awk.
(options-urls.cc): New Makefile target.
* gcc-urlifier.cc: Include "opts.h" and "options.h".
(gcc_urlifier::gcc_urlifier): Add lang_mask param.
(gcc_urlifier::m_lang_mask): New field.
(doc_urls): Make static.
(gcc_urlifier::get_url_for_quoted_text): Use label_text.
(gcc_urlifier::get_url_suffix_for_quoted_text): Use label_text.
Look for an option by name before trying a binary search in
doc_urls.
(gcc_urlifier::get_url_suffix_for_quoted_text): Use label_text.
(gcc_urlifier::get_url_suffix_for_option): New.
(make_gcc_urlifier): Add lang_mask param.
(selftest::gcc_urlifier_cc_tests): Update for above changes.
Verify that a URL is found for "-fpack-struct".
* gcc-urlifier.def: Drop options "--version" and "-fpack-struct".
* gcc-urlifier.h (make_gcc_urlifier): Add lang_mask param.
* gcc.cc (driver::global_initializations): Pass 0 for lang_mask
to make_gcc_urlifier.
* opt-functions.awk (url_suffix): New function.
* options-urls-cc-gen.awk: New file.
* opts.cc (get_option_html_page): Remove special-casing for
analyzer and LTO.
(get_option_url_suffix): New.
(get_option_url): Reimplement.
(selftest::test_get_option_html_page): Rename to...
(selftest::test_get_option_url_suffix): ...this and update for
above changes.
(selftest::opts_cc_tests): Update for renaming.
* opts.h (opt_url_suffixes): New decl.
(get_option_url_suffix): New decl.

gcc/testsuite/ChangeLog:
* lib/gcc-dg.exp: Set TERM to xterm.

gcc/ChangeLog:
* toplev.cc (general_init): Pass global_dc->m_lang_mask to
make_gcc_urlifier.
---
 gcc/Makefile.in  |  18 --
 gcc/gcc-urlifier.cc  | 106 ---
 gcc/gcc-urlifier.def |   2 -
 gcc/gcc-urlifier.h   |   2 +-
 gcc/gcc.cc   |   2 +-
 gcc/opt-functions.awk|   7 +++
 gcc/options-urls-cc-gen.awk  |  79 ++
 gcc/opts.cc  |  75 ++---
 gcc/opts.h   |   4 ++
 gcc/testsuite/lib/gcc-dg.exp |   6 ++
 gcc/toplev.cc|   2 +-
 11 files changed, 242 insertions(+), 61 deletions(-)
 create mode 100644 gcc/options-urls-cc-gen.awk

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 29cec21c8258..ebb59680d69b 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1270,6 +1270,8 @@ FLAGS_TO_PASS = \
 # All option source files
 ALL_OPT_FILES=$(lang_opt_files) $(extra_opt_files)
 
+ALL_OPT_URL_FILES=$(patsubst %, %.urls, $(ALL_OPT_FILES))
+
 # Target specific, C specific object file
 C_TARGET_OBJS=@c_target_objs@
 
@@ -1286,7 +1288,7 @@ FORTRAN_TARGET_OBJS=@fortran_target_objs@
 RUST_TARGET_OBJS=@rust_target_objs@
 
 # Object files for gcc many-languages driver.
-GCC_OBJS = gcc.o gcc-main.o ggc-none.o gcc-urlifier.o
+GCC_OBJS = gcc.o gcc-main.o ggc-none.o gcc-urlifier.o options-urls.o
 
 c-family-warn = 

[PATCH 0/3] Option handling: add documentation URLs

2023-11-10 Thread David Malcolm
In r14-5118-gc5db4d8ba5f3de I added a mechanism to automatically add
URLs to quoted strings in diagnostics.  This was based on a data table
mapping strings to URLs, with placeholder data covering various pragmas
and a couple of options.

The following patches add automatic URLification in our diagnostic
messages to mentions of *all* of our options in quoted strings, linking
to our HTML documentation.

For example, with these patches, given:

  ./xgcc -B. -S t.c -Wctad-maybe-unsupported
  cc1: warning: command-line option ‘-Wctad-maybe-unsupported’ is valid for C++/ObjC++ but not for C

the quoted string '-Wctad-maybe-unsupported' gets automatically URLified
in a sufficiently modern terminal to:
  
https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html#index-Wctad-maybe-unsupported

Objectives:
- integrate with DOCUMENTATION_ROOT_URL
- integrate with the existing .opt mechanisms
- automate keeping the URLs up-to-date
- work with target-specific options based on current configuration
- work with lang-specific options based on current configuration
- keep autogenerated material separate from the human-maintained .opt
  files
- no new build-time requirements (by using awk at build time)
- be maintainable

The approach is a new regenerate-opt-urls.py which:
- scrapes the generated HTML documentation finding anchors
  for options,
- reads all the .opt files in the source tree
- for each .opt file, generates a .opt.urls file; for each
  option in the .opt file it has either a UrlSuffix directive giving
  the final part of the URL of that option's documentation (relative
  to DOCUMENTATION_ROOT_URL), or a comment describing the problem.

regenerate-opt-urls.py is written in Python 3, and has unit tests.
I tested it with Python 3.8, and it probably works with earlier
releases of Python 3.
The .opt.urls files it generates become part of the source tree, and
would be regenerated by maintainers whenever new options are added.
Forgetting to update the files (or not having Python 3 handy) merely
means that URLs might be missing or out of date until someone else
regenerates them.

At build time, the .opt.urls are added to .opt files when regenerating
the optionslist file.  A new "options-urls-cc-gen.awk" is run at build
time on the optionslist to generate a "options-urls.cc" file containing
a big array of strings for all of the options present in the
configuration:

const char * const opt_url_suffixes[] =
{
  [...snip...]

  /* [563] (OPT_Wclass_memaccess) = */
  "gcc/C_002b_002b-Dialect-Options.html#index-Wclass-memaccess",

  /* [564] (OPT_Wclobbered) = */
  "gcc/Warning-Options.html#index-Wclobbered",

  [...snip...]
};

and this is then used by the gcc_urlifier class when emitting
diagnostics.
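
As a rough illustration of how a table like this might be consumed (a
sketch only, not the actual gcc_urlifier code): the names
example_url_suffixes, EXAMPLE_DOC_ROOT and example_option_url below are
hypothetical, the root string stands in for DOCUMENTATION_ROOT_URL, and
the NULL entry is an assumed way of representing an option that has no
documented anchor.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for DOCUMENTATION_ROOT_URL.  */
#define EXAMPLE_DOC_ROOT "https://gcc.gnu.org/onlinedocs/"

/* Hypothetical miniature of the generated table, indexed by option.  */
static const char *const example_url_suffixes[] = {
  "gcc/C_002b_002b-Dialect-Options.html#index-Wclass-memaccess",
  "gcc/Warning-Options.html#index-Wclobbered",
  NULL, /* assumption: an option with no documented anchor */
};

/* Write root + suffix into BUF.  Return 1 on success, 0 if the option
   has no URL or BUF is too small.  */
static int
example_option_url (size_t idx, char *buf, size_t buflen)
{
  size_t n = sizeof example_url_suffixes / sizeof example_url_suffixes[0];
  if (idx >= n || example_url_suffixes[idx] == NULL)
    return 0;
  int len = snprintf (buf, buflen, "%s%s", EXAMPLE_DOC_ROOT,
                      example_url_suffixes[idx]);
  return len >= 0 && (size_t) len < buflen;
}
```

Keeping only the suffix in the table means the root can be changed at
configure time without regenerating the per-option data.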

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
OK for trunk?

David Malcolm (3):
  options: add gcc/regenerate-opt-urls.py
  Add generated .opt.urls files
  diagnostics: use the .opt.urls files to urlify quoted text

 gcc/Makefile.in  |   18 +-
 gcc/ada/gcc-interface/lang.opt.urls  |   28 +
 gcc/analyzer/analyzer.opt.urls   |  206 ++
 gcc/c-family/c.opt.urls  | 1404 ++
 gcc/common.opt.urls  | 1823 ++
 gcc/config/aarch64/aarch64.opt.urls  |   84 +
 gcc/config/alpha/alpha.opt.urls  |   76 +
 gcc/config/alpha/elf.opt.urls|2 +
 gcc/config/arc/arc-tables.opt.urls   |2 +
 gcc/config/arc/arc.opt.urls  |  260 +++
 gcc/config/arm/arm-tables.opt.urls   |2 +
 gcc/config/arm/arm.opt.urls  |  149 ++
 gcc/config/arm/vxworks.opt.urls  |2 +
 gcc/config/avr/avr.opt.urls  |   71 +
 gcc/config/bfin/bfin.opt.urls|   61 +
 gcc/config/bpf/bpf.opt.urls  |   35 +
 gcc/config/c6x/c6x-tables.opt.urls   |2 +
 gcc/config/c6x/c6x.opt.urls  |   18 +
 gcc/config/cris/cris.opt.urls|   65 +
 gcc/config/cris/elf.opt.urls |8 +
 gcc/config/csky/csky.opt.urls|  104 +
 gcc/config/csky/csky_tables.opt.urls |2 +
 gcc/config/darwin.opt.urls   |  221 +++
 gcc/config/dragonfly.opt.urls|9 +
 gcc/config/epiphany/epiphany.opt.urls|   52 +
 gcc/config/fr30/fr30.opt.urls|8 +
 gcc/config/freebsd.opt.urls  |9 +
 gcc/config/frv/frv.opt.urls  |  111 ++
 gcc/config/ft32/ft32.opt.urls|   20 +
 gcc/config/fused-madd.opt.urls   |4 +
 gcc/config/g.opt.urls|5 +
 gcc/config/gcn/gcn.opt.urls  |   23 +
 gcc/config/gnu-user.opt.urls |9 +
 gcc/config/h8300/h8300.opt.urls  |   29 +
 gcc/config/hpux11.opt.urls   |6 +
 gcc/config/i386/cygming.opt.urls |   30 +

Re: [PATCH] c++: non-dependent .* folding [PR112427]

2023-11-10 Thread Jason Merrill

On 11/10/23 10:28, Patrick Palka wrote:

On Fri, 10 Nov 2023, Patrick Palka wrote:


On Thu, 9 Nov 2023, Jason Merrill wrote:


On 11/8/23 16:59, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

-- >8 --

Here when building up the non-dependent .* expression, we crash from
fold_convert on 'b.a' due to this (templated) COMPONENT_REF having an
IDENTIFIER_NODE instead of FIELD_DECL operand that middle-end routines
expect.  Like in r14-4899-gd80a26cca02587, this patch fixes this by
replacing the problematic piecemeal folding with a single call to
cp_fully_fold.

PR c++/112427

gcc/cp/ChangeLog:

* typeck2.cc (build_m_component_ref): Use cp_convert, build2 and
cp_fully_fold instead of fold_build_pointer_plus and fold_convert.



gcc/testsuite/ChangeLog:

* g++.dg/template/non-dependent29.C: New test.
---
   gcc/cp/typeck2.cc   |  5 -
   gcc/testsuite/g++.dg/template/non-dependent29.C | 13 +
   2 files changed, 17 insertions(+), 1 deletion(-)
   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent29.C

diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
index 309903afed8..208004221da 100644
--- a/gcc/cp/typeck2.cc
+++ b/gcc/cp/typeck2.cc
@@ -2378,7 +2378,10 @@ build_m_component_ref (tree datum, tree component,
tsubst_flags_t complain)
 /* Build an expression for "object + offset" where offset is the
 value stored in the pointer-to-data-member.  */
 ptype = build_pointer_type (type);
-  datum = fold_build_pointer_plus (fold_convert (ptype, datum),
component);
+  datum = cp_convert (ptype, datum, complain);
+  datum = build2 (POINTER_PLUS_EXPR, ptype,
+ datum, convert_to_ptrofftype (component));


We shouldn't need to build the POINTER_PLUS_EXPR at all in template context.
OK with that change.


Hmm, that seems harmless at first glance, but I noticed
build_min_non_dep (called from build_x_binary_op in this case) is
careful to propagate TREE_SIDE_EFFECTS of the given tree, and so eliding
POINTER_PLUS_EXPR here could potentially mean that the tree we
ultimately return from build_x_binary_op when in a template context has
TREE_SIDE_EFFECTS not set when it used to.  Shall we still elide the
POINTER_PLUS_EXPR in a template context despite this?


True, we would need build_min_non_dep to also get TREE_SIDE_EFFECTS from 
the operands.  That might be useful in general for similar situations?


I also note that convert_to_ptrofftype uses fold_convert, so the new 
code could have the same problem if the pointer to member operand is 
also a COMPONENT_REF.


Jason



Re: [PATCH v3 1/4] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

2023-11-10 Thread Jeff Law




On 10/20/23 01:04, Robin Dapp wrote:



But I'm not sure which cases this code is trying to catch.  Is it trying
to catch cases where seq2 "spontaneously" uses registers that happen to
overlap with cond?  If so, then when does that happen?  And if it does
happen, wouldn't the sequence also have to set the registers first?


In order for sequence costing to be superior to just counting "conditional"
instructions we need to make sure that as few redundant instructions as
possible are present in the costed sequences. (redundant as in "will be
removed in a subsequent pass").

[ ... ]

Sounds a lot like a scenario we had with threading.  Threading will 
often generate code in duplicated blocks that will trivially be eliminated.


IIRC I had Alex O. tackle that in threading several years back with good 
success.  I don't think he tried to be exhaustive, just sensible in what 
was likely to be dead after threading.  It helped numerous cases where 
we clearly should have threaded, but weren't because of artificially 
high costing.


It's not a trivial balancing act.

Jeff


Re: [PATCH v3 1/4] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

2023-11-10 Thread Jeff Law




On 10/19/23 13:41, Richard Sandiford wrote:

Manolis Tsamis  writes:

This is an extension of what was done in PR106590.

Currently if a sequence generated in noce_convert_multiple_sets clobbers the
condition rtx (cc_cmp or rev_cc_cmp) then only seq1 is used afterwards
(sequences that emit the comparison itself). Since this applies only from the
next iteration it assumes that the sequences generated (in particular seq2)
doesn't clobber the condition rtx itself before using it in the if_then_else,
which is only true in specific cases (currently only register/subregister moves
are allowed).

This patch changes this so it also tests if seq2 clobbers cc_cmp/rev_cc_cmp in
the current iteration. This makes it possible to include arithmetic operations
in noce_convert_multiple_sets.

gcc/ChangeLog:

* ifcvt.cc (check_for_cc_cmp_clobbers): Use modified_in_p instead.
(noce_convert_multiple_sets_1): Don't use seq2 if it clobbers cc_cmp.

Signed-off-by: Manolis Tsamis 
---

(no changes since v1)

  gcc/ifcvt.cc | 49 +++--
  1 file changed, 19 insertions(+), 30 deletions(-)


Sorry for the slow review.  TBH I was hoping someone else would pick
it up, since (a) I'm not very familiar with this code, and (b) I don't
really agree with the way that the current code works.  I'm not sure the
current dependency checking is safe, so I'm nervous about adding even
more cases to it.  And it feels like the different ifcvt techniques are
not now well distinguished, so that they're beginning to overlap and
compete with one another.  None of that is your fault, of course. :)

I'd been hoping to get to it as well, particularly since I've got a TODO
to sort out the conditional zero support in ifcvt.cc from various
contributors.  While there isn't any overlap I can see between that work
and this submission from Manolis, it's close enough that if I'm going to
get re-familiar with ifcvt.cc I figured I'd try to handle both.



jeff


Re: [PATCH v3 2/2]middle-end match.pd: optimize fneg (fabs (x)) to copysign (x, -1) [PR109154]

2023-11-10 Thread Andrew Pinski
On Fri, Nov 10, 2023 at 5:12 AM Richard Biener  wrote:
>
> On Fri, 10 Nov 2023, Tamar Christina wrote:
>
> >
> > Hi Prathamesh,
> >
> > Yes Arm requires SIMD for copysign. The testcases fail because they don't 
> > turn on Neon.
> >
> > I'll update them.
>
> On x86_64 with -m32 I see
>
> FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized ".COPYSIGN" 1
> FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized "ABS_EXPR" 1
> FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= ABS_EXPR"
> 1
> FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= -" 1
> FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= .COPYSIGN"
> 2
> FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop
> "Deleting[^n]* = -" 4
> FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop
> "Deleting[^n]* = .COPYSIGN" 2
> FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop
> "Deleting[^n]* = ABS_EXPR <" 1
> FAIL: gcc.dg/tree-ssa/phi-opt-24.c scan-tree-dump-not phiopt2 "if"
>
> maybe add a copysign effective target?

I get the feeling that the internal function for copysign should not
be a direct internal function for scalar modes and call
expand_copysign instead when expanding.
This will fix some if not all of the issues where COPYSIGN is now
trying to show up.

By the way, this is most likely PR 88786 (and PR 112468 and a few
others), and PR 58797.

Thanks,
Andrew



>
> > Regards,
> > Tamar
> > 
> > From: Prathamesh Kulkarni 
> > Sent: Friday, November 10, 2023 12:24 PM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org ; nd ; 
> > rguent...@suse.de ; j...@ventanamicro.com 
> > 
> > Subject: Re: [PATCH v3 2/2]middle-end match.pd: optimize fneg (fabs (x)) to 
> > copysign (x, -1) [PR109154]
> >
> > On Mon, 6 Nov 2023 at 15:50, Tamar Christina  
> > wrote:
> > >
> > > Hi All,
> > >
> > > This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more
> > > canonical and allows a target to expand this sequence efficiently.  Such
> > > sequences are common in scientific code working with gradients.
> > >
> > > There is an existing canonicalization of copysign (x, -1) to fneg (fabs 
> > > (x))
> > > which I remove since this is a less efficient form.  The testsuite is also
> > > updated in light of this.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > Hi Tamar,
> > It seems the patch caused following regressions on arm:
> >
> > Running gcc:gcc.dg/dg.exp ...
> > FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized ".COPYSIGN" 1
> > FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized "ABS_EXPR" 1
> >
> > Running gcc:gcc.dg/tree-ssa/tree-ssa.exp ...
> > FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= -" 1
> > FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= .COPYSIGN" 2
> > FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= ABS_EXPR" 1
> > FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop
> > "Deleting[^\\n]* = -" 4
> > FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop
> > "Deleting[^\\n]* = ABS_EXPR <" 1
> > FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop
> > "Deleting[^\\n]* = \\.COPYSIGN" 2
> > FAIL: gcc.dg/tree-ssa/copy-sign-2.c scan-tree-dump-times optimized 
> > ".COPYSIGN" 1
> > FAIL: gcc.dg/tree-ssa/copy-sign-2.c scan-tree-dump-times optimized "ABS" 1
> > FAIL: gcc.dg/tree-ssa/mult-abs-2.c scan-tree-dump-times gimple ".COPYSIGN" 4
> > FAIL: gcc.dg/tree-ssa/mult-abs-2.c scan-tree-dump-times gimple "ABS" 4
> > FAIL: gcc.dg/tree-ssa/phi-opt-24.c scan-tree-dump-not phiopt2 "if"
> > Link to log files:
> > https://ci.linaro.org/job/tcwg_gcc_check--master-arm-build/1240/artifact/artifacts/00-sumfiles/
> >
> > Even for following test-case:
> > double g (double a)
> > {
> >   double t1 = fabs (a);
> >   double t2 = -t1;
> >   return t2;
> > }
> >
> > It seems, the pattern gets applied but doesn't get eventually
> > simplified to copysign(a, -1).
> > forwprop dump shows:
> > Applying pattern match.pd:1131, gimple-match-4.cc:4134
> > double g (double a)
> > {
> >   double t2;
> >   double t1;
> >
> >:
> >   t1_2 = ABS_EXPR ;
> >   t2_3 = -t1_2;
> >   return t2_3;
> >
> > }
> >
> > while on x86_64:
> > Applying pattern match.pd:1131, gimple-match-4.cc:4134
> > gimple_simplified to t2_3 = .COPYSIGN (a_1(D), -1.0e+0);
> > Removing dead stmt:t1_2 = ABS_EXPR ;
> > double g (double a)
> > {
> >   double t2;
> >   double t1;
> >
> >:
> >   t2_3 = .COPYSIGN (a_1(D), -1.0e+0);
> >   return t2_3;
> >
> > }
> >
> > Thanks,
> > Prathamesh
> >
> >
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > > PR tree-optimization/109154
> > > * match.pd: Add new neg+abs rule, remove inverse copysign rule.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > PR tree-optimization/109154
> > > * gcc.dg/fold-copysign-1.c: Updated.
> > >   

Re: [RFC 1/2] RISC-V: Add support for _Bfloat16.

2023-11-10 Thread Jeff Law




On 10/25/23 04:15, Jin Ma wrote:

+;; The conversion of DF to BF needs to be done with SF if there is a
+;; chance to generate at least one instruction, otherwise just using
+;; libfunc __truncdfbf2.
+(define_expand "truncdfbf2"
+  [(set (match_operand:BF 0 "register_operand" "=f")
+   (float_truncate:BF
+   (match_operand:DF 1 "register_operand" " f")))]
+  "TARGET_DOUBLE_FLOAT || TARGET_ZDINX"
+  {
+convert_move (operands[0],
+ convert_modes (SFmode, DFmode, operands[1], 0), 0);
+DONE;
+  })

So for conversions to/from BFmode, doesn't generic code take care of
this for us?  Search for convert_mode_scalar in expr.cc. That code will
utilize SFmode as an intermediate step just like your expander.   Is
there some reason that generic code is insufficient?

Similarly for the other conversions.


As far as I can see, the function 'convert_mode_scalar' doesn't seem to be
perfect for dealing with the conversions to/from BFmode. It can only handle
BF to HF, SF, DF and SF to BF well; the rest of the conversions get no
special processing and directly use the libcall.

Maybe I should choose to enhance its functionality? This seems to be a
good choice, I'm not sure.

My recollection was that BF could be converted to/from SF trivially and
if we wanted BF->DF we'd first convert to SF, then to DF.

Direct BF<->DF conversions aren't actually important from a performance
standpoint.  So it's OK if they have an extra step IMHO.


Thank you very much for your review and detailed reply. Maybe there are some
problems with my expression and I am a little confused about your guidance.
My understanding is that you also think that it is reasonable to convert
through SF, right? In fact, this is what I did.

My point was that I would expect the generic code to handle the 
conversion and that we didn't need to handle it explicitly in the RISC-V 
backend.


Meaning that I don't think we need a define_expand for truncdfbf2, 
fix_truncbf2, fixuns_truncbf2, floatbf2, or 
floatunsbf2.





In this patch, my thoughts are as follows:

The general principle is to use the real instructions instead of libcalls as
much as possible for conversions, while minimizing the definition of libcalls
(only reusing those that have already been defined by other architectures such
as aarch64). If SF can be used as an intermediate step, it is preferred to
convert to SF, otherwise the libcall is directly used.

1. For the conversions between floating points

For BF->DF, as you said, the function 'convert_mode_scalar' in the general code
has been well implemented, which will be expressed as BF->SF->DF. And the
generated instruction list may be as follows:
   'call __extendbfsf2' + 'call __extendsfdf2' (when only soft floating point is supported);
   'call __extendbfsf2' + 'fcvt.d.s'   (when (TARGET_DOUBLE_FLOAT || TARGET_ZDINX) is true);
   'fcvt.s.bf16'+ 'fcvt.d.s'   (when ((TARGET_DOUBLE_FLOAT || TARGET_ZDINX) && TARGET_ZFBFMIN) is true)

For DF->BF, if either fcvt.s.d or fcvt.bf16.s cannot be generated, the
'call __truncdfbf2' is directly generated by the function
'convert_mode_scalar'. Otherwise the new pattern (define_expand "truncdfbf2")
is used. This makes it possible to implement DF->BF by 'fcvt.s.d' +
'fcvt.bf16.s', which cannot be generated by the function
'convert_mode_scalar'.

But I would have expected convert_mode_scalar to generate DF->BF by 
first truncating to SF, then to BF.   If that is missing for truncation, 
then we should add it to convert_mode_scalar rather than expressing it 
as a backend expander.
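
To make the "go through SF" idea concrete, here is a software sketch of
the two-step conversions.  It assumes the usual bfloat16 layout (the upper
16 bits of an IEEE binary32 value) and round-to-nearest-even on
truncation; all function names are hypothetical and this is not what
convert_mode_scalar or the libgcc routines actually do.  Note that
DF->SF->BF rounds twice, which this sketch makes no attempt to correct.

```c
#include <stdint.h>
#include <string.h>

typedef uint16_t bf16_bits; /* raw bfloat16 bit pattern (assumed layout) */

/* BF->SF: widening is just a 16-bit left shift of the bit pattern.  */
static float
bf16_to_float (bf16_bits b)
{
  uint32_t bits = (uint32_t) b << 16;
  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}

/* SF->BF: drop the low 16 bits, rounding to nearest with ties to even.  */
static bf16_bits
float_to_bf16 (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  if ((bits & 0x7fffffffu) > 0x7f800000u) /* NaN: keep it a (quiet) NaN */
    return (bf16_bits) ((bits >> 16) | 0x0040u);
  bits += 0x7fffu + ((bits >> 16) & 1u);
  return (bf16_bits) (bits >> 16);
}

/* DF->BF expressed as DF->SF followed by SF->BF.  */
static bf16_bits
double_to_bf16_via_float (double d)
{
  return float_to_bf16 ((float) d);
}

/* BF->DF expressed as BF->SF followed by SF->DF.  */
static double
bf16_to_double_via_float (bf16_bits b)
{
  return (double) bf16_to_float (b);
}
```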








2. For the conversions between integer and BF, it seems that gcc only uses
libcall to implement it, but this is obviously wrong. For example, the
conversion BF->SI directly calls the unimplemented libcall __fixunsbfsi.
So I added some new patterns to handle these transformations with SF.

I would suggest these move into target independent code as well. 
There's no reason I'm aware of that they should be implemented entirely 
in a target machine description.  We're not really doing anything target 
specific in here.


jeff


Re: [PING 2] [C PATCH] Synthesize nonnull attribute for parameters declared with static

2023-11-10 Thread Jeff Law




On 10/21/23 05:09, Martin Uecker wrote:


C programmers increasingly use static to indicate that
pointer parameters are non-null.  Clang can exploit this
for warnings and optimizations.  GCC has some warnings for this,
but not all of the warnings it has for nonnull.  Below is a
patch to add a nonnull attribute automatically for such
arguments and to remove the special and more limited
nonnull warnings for static. This patch found some
misplaced annotations in one of my projects via
-Wnonnull-compare which clang does not seem to have,
so I think this could be useful.
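
For readers unfamiliar with the idiom, a minimal C sketch of a parameter
declared with static follows.  The function name sum3 is made up for
illustration; the comment describes what the declaration promises, and
which diagnostics a given compiler issues for a violating call is not
something this sketch asserts.

```c
/* `static 3` in the array declarator tells the compiler that callers
   must pass a non-null pointer to at least 3 ints.  */
static int
sum3 (const int v[static 3])
{
  return v[0] + v[1] + v[2];
}
```

A call such as sum3 (NULL) violates that contract, which is the kind of
misuse a synthesized nonnull attribute lets the existing nonnull
warnings catch.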


 c: Synthesize nonnull attribute for parameters declared with static [PR110815]
 
 Parameters declared with `static` are nonnull. We synthesize
 an artificial nonnull attribute for such parameters to get the
 same warnings and optimizations.
 
 Bootstrapped and regression tested on x86.
 
	PR c/102558

PR 102556
 PR c/110815
 
 gcc/c-family:

 * c-attribs.cc (build_attr_access_from_parms): Synthesize
 nonnull attribute for parameters declared with `static`.
 
 gcc:

 * gimple-ssa-warn-access.cc (pass_waccess::maybe_check_access_sizes):
 remove warning for parameters declared with `static`.
 
 gcc/testsuite:

 * gcc.dg/Wnonnull-8.c: Adapt test.
 * gcc.dg/Wnonnull-9.c: New test.

This is OK -- assuming you did the usual bootstrap & regression test 
cycle.


Jeff


Re: [PATCH] attribs: Use existing traits for excl_hash_traits

2023-11-10 Thread Jeff Law




On 11/6/23 05:32, Richard Sandiford wrote:

Ping.

Richard Sandiford via Gcc-patches  writes:

excl_hash_traits can be defined more simply by reusing existing traits.

Tested on aarch64-linux-gnu.  OK to install?

Richard


gcc/
* attribs.cc (excl_hash_traits): Delete.
(test_attribute_exclusions): Use pair_hash and nofree_string_hash
instead.

OK.  Sorry this fell through the cracks.

jeff


Re: [PATCH 3/3] attribs: Namespace-aware lookup_attribute_spec

2023-11-10 Thread Jeff Law




On 11/6/23 05:24, Richard Sandiford wrote:

attribute_ignored_p already used a namespace-aware query
to find the attribute_spec for an existing attribute:

   const attribute_spec *as = lookup_attribute_spec (TREE_PURPOSE (attr));

This patch does the same for other callers in the file.

Tested on aarch64-linux-gnu & x86_64-linux-gnu.  OK to install?

Richard


gcc/
* attribs.cc (comp_type_attributes): Pass the full TREE_PURPOSE
to lookup_attribute_spec, rather than just the name.
(remove_attributes_matching): Likewise.

OK
jeff


Re: [PATCH 3/3] RISC-V: Add support for XCVbi extension in CV32E40P

2023-11-10 Thread Jeff Law




On 11/8/23 04:09, Mary Bennett wrote:

Spec: 
github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributors:
   Mary Bennett 
   Nandni Jamnadas 
   Pietra Ferreira 
   Charlie Keaney
   Jessica Mills
   Craig Blackmore 
   Simon Cook 
   Jeremy Bennett 
   Helene Chelin 


gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Create XCVbi extension
  support.
* config/riscv/riscv.opt: Likewise.
* config/riscv/corev.md: Implement cv_branch pattern
  for cv.beqimm and cv.bneimm.
* config/riscv/riscv.md: Change pattern priority so corev.md
  patterns run before riscv.md patterns.
* config/riscv/constraints.md: Implement constraints
  cv_bi_s5 - signed 5-bit immediate.
* config/riscv/predicates.md: Implement predicate
  const_int5s_operand - signed 5 bit immediate.
* doc/sourcebuild.texi: Add XCVbi documentation.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/cv-bi-beqimm-compile-1.c: New test.
* gcc.target/riscv/cv-bi-beqimm-compile-2.c: New test.
* gcc.target/riscv/cv-bi-bneimm-compile-1.c: New test.
* gcc.target/riscv/cv-bi-bneimm-compile-2.c: New test.
* lib/target-supports.exp: Add proc for XCVbi.
---




diff --git a/gcc/config/riscv/corev.md b/gcc/config/riscv/corev.md
index 0109e1836cf..7d7b952d817 100644
--- a/gcc/config/riscv/corev.md
+++ b/gcc/config/riscv/corev.md
@@ -706,3 +706,17 @@
  
[(set_attr "type" "load")

(set_attr "mode" "SI")])
+
+;; XCVBI Builtins
+(define_insn "cv_branch"
+  [(set (pc)
+   (if_then_else
+(match_operator 1 "equality_operator"
+[(match_operand:X 2 "register_operand" "r")
+ (match_operand:X 3 "const_int5s_operand" "CV_bi_sign5")])
+(label_ref (match_operand 0 "" ""))
+(pc)))]
+  "TARGET_XCVBI"
+  "cv.b%C1imm\t%2,%3,%0"
+  [(set_attr "type" "branch")
+   (set_attr "mode" "none")])

Note that technically you could use "i" or "n" for the constraint of 
operand 3.  This works because the predicate has priority and it only 
allows -16..15.
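
The range check that such a predicate amounts to can be written out in C
as below.  This is a sketch of the arithmetic only; fits_signed_5bit is a
hypothetical name, not the generated const_int5s_operand code.

```c
#include <stdbool.h>

/* A signed 5-bit immediate covers -2^4 .. 2^4 - 1, i.e. -16..15.  */
static bool
fits_signed_5bit (long value)
{
  return value >= -16 && value <= 15;
}
```

Since the predicate already rejects anything outside this range, a looser
constraint on the same operand never sees an out-of-range constant.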




diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index ae2217d0907..168c8665a7a 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -579,6 +579,14 @@
  (define_asm_attributes
[(set_attr "type" "multi")])
  
+;; ..

+;;
+;; Machine Description Patterns
+;;
+;; ..
+
+(include "corev.md")

I would put a comment here indicating why a subtarget might want to 
include its patterns before the standard patterns in riscv.md.



OK with the comment added.  Your decision on whether or not to drop the 
CV_bi_sign5 constraint and replace it with "n".


Jeff


Re: [PATCH] c++: fix tf_decltype manipulation for COMPOUND_EXPR

2023-11-10 Thread Patrick Palka
On Fri, 10 Nov 2023, Jason Merrill wrote:

> On 11/10/23 12:25, Patrick Palka wrote:
> > On Thu, 9 Nov 2023, Jason Merrill wrote:
> > 
> > > On 11/7/23 10:08, Patrick Palka wrote:
> > > > bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > > > trunk?
> > > > 
> > > > -- >8 --
> > > > 
> > > > In the COMPOUND_EXPR case of tsubst_expr, we were redundantly clearing
> > > > the tf_decltype flag when substituting the LHS and also neglecting to
> > > > propagate it when substituting the RHS.  This patch corrects this flag
> > > > manipulation, which allows us to accept the below testcase.
> > > > 
> > > > gcc/cp/ChangeLog:
> > > > 
> > > > * pt.cc (tsubst_expr) <case COMPOUND_EXPR>: Don't redundantly
> > > > clear tf_decltype when substituting the LHS.  Propagate
> > > > tf_decltype when substituting the RHS.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > * g++.dg/cpp0x/decltype-call7.C: New test.
> > > > ---
> > > >gcc/cp/pt.cc| 9 -
> > > >gcc/testsuite/g++.dg/cpp0x/decltype-call7.C | 9 +
> > > >2 files changed, 13 insertions(+), 5 deletions(-)
> > > >create mode 100644 gcc/testsuite/g++.dg/cpp0x/decltype-call7.C
> > > > 
> > > > diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> > > > index 521749df525..5f879287a58 100644
> > > > --- a/gcc/cp/pt.cc
> > > > +++ b/gcc/cp/pt.cc
> > > > @@ -20382,11 +20382,10 @@ tsubst_expr (tree t, tree args, tsubst_flags_t
> > > > complain, tree in_decl)
> > > >  case COMPOUND_EXPR:
> > > >  {
> > > > -   tree op0 = tsubst_expr (TREE_OPERAND (t, 0), args,
> > > > -   complain & ~tf_decltype, in_decl);
> > > > -   RETURN (build_x_compound_expr (EXPR_LOCATION (t),
> > > > -  op0,
> > > > -  RECUR (TREE_OPERAND (t, 1)),
> > > > +   tree op0 = RECUR (TREE_OPERAND (t, 0));
> > > > +   tree op1 = tsubst_expr (TREE_OPERAND (t, 1), args,
> > > > +   complain|decltype_flag, in_decl);
> > > > +   RETURN (build_x_compound_expr (EXPR_LOCATION (t), op0, op1,
> > > >templated_operator_saved_lookups
> > > > (t),
> > > >complain|decltype_flag));
> > > 
> > > Hmm, passing decltype_flag to both op1 and the , is concerning.  Can you
> > > add a
> > > test with overloaded operator, where the RHS is a class with a destructor?
> > 
> > I'm not sure if this is what you had in mind, but indeed with this patch
> > we reject the following with an error outside the immediate context:
> > 
> >  struct B { ~B() = delete; };
> >  template B f();
> > 
> >  void operator,(int, const B&);
> > 
> >  template decltype(42, f()) g(int) = delete; // #1
> >  template void g(...); // #2
> > 
> >  int main() {
> >g(0); // should select #2
> >  }
> > 
> > gcc/testsuite/g++.dg/cpp0x/decltype-call8.C: In substitution of
> > ‘template decltype ((42, f())) g(int) [with T = B]’:
> > gcc/testsuite/g++.dg/cpp0x/decltype-call8.C:12:7:   required from here
> > 12 |   g(0);
> >|   ^~~
> > gcc/testsuite/g++.dg/cpp0x/decltype-call8.C:8:30: error: use of deleted
> > function ‘B::~B()’
> >  8 | template decltype(42, f()) g(int) = delete; // #1
> >|~~^~~~
> > gcc/testsuite/g++.dg/cpp0x/decltype-call8.C:3:12: note: declared here
> >  3 | struct B { ~B() = delete; };
> >|^
> > 
> > Ultimately because unary_complex_lvalue isn't SFINAE-enabled.
> 
> Please elaborate; my understanding is that unary_complex_lvalue is supposed to
> be a semantically neutral transformation.

Since tf_decltype is now also set when substituting op1 i.e. f(),
substitution yields a bare CALL_EXPR with no temporary materialization.
The problematic unary_complex_lvalue call happens when binding
the reference parameter 'const B&' to this bare CALL_EXPR.  We
take its address via cp_build_addr_expr, which tries
unary_complex_lvalue.  The CALL_EXPR handling in unary_complex_lvalue
in turn materializes a temporary for the call, which fails as expected
due to the destructor but also issues the unexpected error since
unary_complex_lvalue unconditionally uses tf_warning_or_error.

So in short the CALL_EXPR handling in unary_complex_lvalue seems to
assume the requirements of temporary materialization have already
been checked.

> 
> > diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> > index 4f2cb2cd402..277c81412b9 100644
> > --- a/gcc/cp/pt.cc
> > +++ b/gcc/cp/pt.cc
> > @@ -20386,7 +20386,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t
> > complain, tree in_decl)
> > complain|decltype_flag, in_decl);
> > RETURN (build_x_compound_expr (EXPR_LOCATION (t), op0, op1,
> >templated_operator_saved_lookups (t),
> > -  

Re: [PATCH] Avoid generate vblendps with ymm16+

2023-11-10 Thread Jakub Jelinek
On Thu, Nov 09, 2023 at 03:27:11PM +0800, Hongtao Liu wrote:
> On Thu, Nov 9, 2023 at 3:15 PM Hu, Lin1  wrote:
> >
> > This patch aims to avoid generating vblendps with ymm16+.  It has been
> > bootstrapped and tested on x86_64-pc-linux-gnu{-m32,-m64}. Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > PR target/112435
> > * config/i386/sse.md: Adding constraints to restrict the generation of
> > vblendps.
> It should be "Don't output vblendps when evex sse reg or gpr32 is involved."
> Others LGTM.

I've missed this patch, so wrote my own today, and am wondering

1) if it isn't better to use separate alternative instead of
   x86_evex_reg_mentioned_p, like in the patch below
2) why do you need the last two hunks in sse.md, both avx2_permv2ti and
   *avx_vperm2f128_nozero insns only use x in constraints, never v,
   so x86_evex_reg_mentioned_p ought to be always false there

Here is the untested patch, of course you have more testcases (though, I
think it is better to test dg-do assemble with avx512vl target rather than
dg-do compile and scan the assembler, after all, the problem was that it
didn't assemble).
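
As background for why the shuffle with mask 2 can be emitted as
vblendps $240 at all (when the registers permit it), here is a scalar C
model of the two operations on eight 32-bit elements.  It is only a
sketch of the lane semantics as I understand them from the intrinsic
descriptions; the function names are made up and nothing here is taken
from sse.md.

```c
#include <string.h>

/* Model of a 256-bit 32x4 lane shuffle: bit 0 of IMM picks the low or
   high 128-bit half of A for the low result half, bit 1 picks the half
   of B for the high result half.  */
static void
model_shuf_32x4 (int dst[8], const int a[8], const int b[8], int imm)
{
  memcpy (dst, a + ((imm & 1) ? 4 : 0), 4 * sizeof (int));
  memcpy (dst + 4, b + ((imm & 2) ? 4 : 0), 4 * sizeof (int));
}

/* Model of an 8-element blend: bit i of IMM picks B[i] over A[i].  */
static void
model_blend (int dst[8], const int a[8], const int b[8], int imm)
{
  for (int i = 0; i < 8; i++)
    dst[i] = ((imm >> i) & 1) ? b[i] : a[i];
}

/* Return 1 if shuffle mask 2 and blend mask 240 agree on A and B.  */
static int
shuf2_equals_blend240 (const int a[8], const int b[8])
{
  int s[8], t[8];
  model_shuf_32x4 (s, a, b, 2);
  model_blend (t, a, b, 240);
  return memcmp (s, t, sizeof s) == 0;
}
```

Both models yield the low half of A followed by the high half of B, so
the substitution is purely an encoding choice, which is why it has to be
skipped when an operand lives in ymm16 or above.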

2023-11-10  Jakub Jelinek  

PR target/112435
* config/i386/sse.md (avx512vl_shuf_32x4_1,
avx512dq_shuf_64x2_1): Add
alternative with just x instead of v constraints and use vblendps
as optimization only with that alternative.

* gcc.target/i386/avx512vl-pr112435.c: New test.

--- gcc/config/i386/sse.md.jj   2023-11-09 09:04:18.616543403 +0100
+++ gcc/config/i386/sse.md  2023-11-10 15:56:44.138499931 +0100
@@ -19235,11 +19235,11 @@ (define_expand "avx512dq_shuf_avx512dq_shuf_64x2_1"
-  [(set (match_operand:VI8F_256 0 "register_operand" "=v")
+  [(set (match_operand:VI8F_256 0 "register_operand" "=x,v")
(vec_select:VI8F_256
  (vec_concat:
-   (match_operand:VI8F_256 1 "register_operand" "v")
-   (match_operand:VI8F_256 2 "nonimmediate_operand" "vm"))
+   (match_operand:VI8F_256 1 "register_operand" "x,v")
+   (match_operand:VI8F_256 2 "nonimmediate_operand" "xm,vm"))
  (parallel [(match_operand 3 "const_0_to_3_operand")
 (match_operand 4 "const_0_to_3_operand")
 (match_operand 5 "const_4_to_7_operand")
@@ -19254,7 +19254,7 @@ (define_insn "avx512dq_shu
   mask = INTVAL (operands[3]) / 2;
   mask |= (INTVAL (operands[5]) - 4) / 2 << 1;
   operands[3] = GEN_INT (mask);
-  if (INTVAL (operands[3]) == 2 && !)
+  if (INTVAL (operands[3]) == 2 && ! && which_alternative == 0)
 return "vblendps\t{$240, %2, %1, %0|%0, %1, %2, 240}";
   return "vshuf64x2\t{%3, %2, %1, 
%0|%0, %1, %2, %3}";
 }
@@ -19386,11 +19386,11 @@ (define_expand "avx512vl_shuf_32x4_1"
-  [(set (match_operand:VI4F_256 0 "register_operand" "=v")
+  [(set (match_operand:VI4F_256 0 "register_operand" "=x,v")
(vec_select:VI4F_256
  (vec_concat:
-   (match_operand:VI4F_256 1 "register_operand" "v")
-   (match_operand:VI4F_256 2 "nonimmediate_operand" "vm"))
+   (match_operand:VI4F_256 1 "register_operand" "x,v")
+   (match_operand:VI4F_256 2 "nonimmediate_operand" "xm,vm"))
  (parallel [(match_operand 3 "const_0_to_7_operand")
 (match_operand 4 "const_0_to_7_operand")
 (match_operand 5 "const_0_to_7_operand")
@@ -19414,7 +19414,7 @@ (define_insn "avx512vl_shuf_)
+  if (INTVAL (operands[3]) == 2 && ! && which_alternative == 0)
 return "vblendps\t{$240, %2, %1, %0|%0, %1, %2, 240}";
 
   return "vshuf32x4\t{%3, %2, %1, %0|%0, %1, %2, %3}";
--- gcc/testsuite/gcc.target/i386/avx512vl-pr112435.c.jj   2023-11-10 16:04:21.708046771 +0100
+++ gcc/testsuite/gcc.target/i386/avx512vl-pr112435.c   2023-11-10 16:03:51.053479094 +0100
@@ -0,0 +1,13 @@
+/* PR target/112435 */
+/* { dg-do assemble { target { avx512vl && { ! ia32 } } } } */
+/* { dg-options "-mavx512vl -O2" } */
+
+#include 
+
+__m256i
+foo (__m256i a, __m256i b)
+{
+  register __m256i c __asm__("ymm16") = a;
+  asm ("" : "+v" (c));
+  return _mm256_shuffle_i32x4 (c, b, 2);
+}

Jakub



Re: [PATCH] c++: fix tf_decltype manipulation for COMPOUND_EXPR

2023-11-10 Thread Jason Merrill

On 11/10/23 12:25, Patrick Palka wrote:

On Thu, 9 Nov 2023, Jason Merrill wrote:


On 11/7/23 10:08, Patrick Palka wrote:

bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

-- >8 --

In the COMPOUND_EXPR case of tsubst_expr, we were redundantly clearing
the tf_decltype flag when substituting the LHS and also neglecting to
propagate it when substituting the RHS.  This patch corrects this flag
manipulation, which allows us to accept the below testcase.

gcc/cp/ChangeLog:

* pt.cc (tsubst_expr) <case COMPOUND_EXPR>: Don't redundantly
clear tf_decltype when substituting the LHS.  Propagate
tf_decltype when substituting the RHS.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/decltype-call7.C: New test.
---
   gcc/cp/pt.cc| 9 -
   gcc/testsuite/g++.dg/cpp0x/decltype-call7.C | 9 +
   2 files changed, 13 insertions(+), 5 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp0x/decltype-call7.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 521749df525..5f879287a58 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -20382,11 +20382,10 @@ tsubst_expr (tree t, tree args, tsubst_flags_t
complain, tree in_decl)
 case COMPOUND_EXPR:
 {
-   tree op0 = tsubst_expr (TREE_OPERAND (t, 0), args,
-   complain & ~tf_decltype, in_decl);
-   RETURN (build_x_compound_expr (EXPR_LOCATION (t),
-  op0,
-  RECUR (TREE_OPERAND (t, 1)),
+   tree op0 = RECUR (TREE_OPERAND (t, 0));
+   tree op1 = tsubst_expr (TREE_OPERAND (t, 1), args,
+   complain|decltype_flag, in_decl);
+   RETURN (build_x_compound_expr (EXPR_LOCATION (t), op0, op1,
   templated_operator_saved_lookups (t),
   complain|decltype_flag));


Hmm, passing decltype_flag to both op1 and the , is concerning.  Can you add a
test with overloaded operator, where the RHS is a class with a destructor?


I'm not sure if this is what you had in mind, but indeed with this patch
we reject the following with an error outside the immediate context:

 struct B { ~B() = delete; };
 template<class T> B f();

 void operator,(int, const B&);

 template<class T> decltype(42, f<T>()) g(int) = delete; // #1
 template<class T> void g(...); // #2

 int main() {
   g<B>(0); // should select #2
 }

gcc/testsuite/g++.dg/cpp0x/decltype-call8.C: In substitution of ‘template<class T> decltype ((42, f<T>())) g(int) [with T = B]’:
gcc/testsuite/g++.dg/cpp0x/decltype-call8.C:12:7:   required from here
   12 |   g<B>(0);
      |       ^~~
gcc/testsuite/g++.dg/cpp0x/decltype-call8.C:8:30: error: use of deleted function ‘B::~B()’
    8 | template<class T> decltype(42, f<T>()) g(int) = delete; // #1
      |                            ~~^~~~
gcc/testsuite/g++.dg/cpp0x/decltype-call8.C:3:12: note: declared here
    3 | struct B { ~B() = delete; };
      |            ^

Ultimately because unary_complex_lvalue isn't SFINAE-enabled.


Please elaborate; my understanding is that unary_complex_lvalue is 
supposed to be a semantically neutral transformation.



diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 4f2cb2cd402..277c81412b9 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -20386,7 +20386,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl)
complain|decltype_flag, in_decl);
RETURN (build_x_compound_expr (EXPR_LOCATION (t), op0, op1,
   templated_operator_saved_lookups (t),
-  complain|decltype_flag));
+  complain));


This looks like it will break if the operator, returns a class with a 
deleted destructor.


Jason



[Committed] g++: Rely on dg-do-what-default to avoid running pr102788.cc on non-vector targets

2023-11-10 Thread Patrick O'Neill



On 11/9/23 17:20, Jeff Law wrote:



On 11/2/23 17:45, Patrick O'Neill wrote:

Testcases in g++.dg/vect rely on check_vect_support_and_set_flags
to set dg-do-what-default and avoid running vector tests on non-vector
targets. The three testcases in this patch overwrite the default with
dg-do run.

Removing the dg-do run directive resolves this issue for non-vector
targets (while still running the tests on vector targets).

gcc/testsuite/ChangeLog:

* g++.dg/vect/pr102788.cc: Remove dg-do run directive.
OK.  I'll note your patch has just one file patched, but your comment 
indicates three testcases have this problem.  Did you forget to 
include a couple changes?


If so, those are pre-approved as well.  Just post them for the 
archiver and commit.


Thanks,
jeff

Committed

The comment was mistakenly copy/pasted from 
https://inbox.sourceware.org/gcc-patches/20231102190911.66763-1-patr...@rivosinc.com/T/#u

Revised commit message to only mention the one testcase.

Thanks,
Patrick


[PATCH] aarch64: Avoid -Wincompatible-pointer-types warning in Linux unwinder

2023-11-10 Thread Florian Weimer
* config/aarch64/linux-unwind.h
(aarch64_fallback_frame_state): Add cast to the expected type
in sc assignment.

(Almost a v2, but the other issue was already fixed in r14-4183.)

---
 libgcc/config/aarch64/linux-unwind.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/libgcc/config/aarch64/linux-unwind.h b/libgcc/config/aarch64/linux-unwind.h
index 00eba866049..18b3df71e7b 100644
--- a/libgcc/config/aarch64/linux-unwind.h
+++ b/libgcc/config/aarch64/linux-unwind.h
@@ -77,7 +77,10 @@ aarch64_fallback_frame_state (struct _Unwind_Context *context,
 }
 
   rt_ = context->cfa;
-  sc = &rt_->uc.uc_mcontext;
+  /* Historically, the uc_mcontext member was of type struct sigcontext, but
+ glibc uses a different type now with member names in the implementation
+ namespace.  */
+  sc = (struct sigcontext *) &rt_->uc.uc_mcontext;
 
 /* This define duplicates the definition in aarch64.md */
 #define SP_REGNUM 31

base-commit: 3a6df3281a525ae6113f50d7b38b09fcd803801e



Re: [PATCH] g++: Add require-effective-target to multi-input file testcase pr95401.cc

2023-11-10 Thread Patrick O'Neill


On 11/9/23 17:34, Jeff Law wrote:



On 11/3/23 00:18, Patrick O'Neill wrote:

On non-vector targets dejagnu attempts dg-do compile for pr95401.cc.
This produces a command like this:
g++ pr95401.cc pr95401a.cc -S -o pr95401.s

which isn't valid (gcc does not accept multiple input files when using
-S with -o).

This patch adds require-effective-target vect_int to avoid the case
where the testcase is invoked with dg-do compile.

gcc/testsuite/ChangeLog:

* g++.dg/vect/pr95401.cc: Add require-effective-target vect_int.
Sorry, I must be missing something here.  I fail to see how adding an 
effective target check would/should impact the problem you've 
described above with the dg-additional-sources interaction with -S.


It's not intuitive (& probably not the cleanest way of solving it).

pr95401.cc is an invalid testcase when run with dg-do compile (for the 
reasons above).


pr95401.cc does not define a dg-do, which means the testcase uses
dg-do-what-default to determine what to do.
dg-do-what-default is set by target-supports.exp.


The two options here are to set dg-do-what-default to run or to compile.
On non-vector targets pr95401 is set to compile (which is invalid).

Ideally we would say if dg-do-what-default == compile don't run, but 
AFAIK that isn't possible.
I didn't want to duplicate the check_vect_support_and_set_flags logic to 
return true/false since that'll probably get out of sync.


I used require-effective-target vect_int as a proxy for 
check_vect_support_and_set_flags (also since the testcase only contains 
integer arrays).


That way we do this now:
dg-do-what-default run -> run
dg-do-what-default compile -> skip test

If there's a cleaner/better approach I'm happy to revise.

Patrick
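
The run/skip mapping described above can be written out as a small decision table. This is only an illustrative C++ sketch; the real logic lives in the DejaGnu/Tcl harness. The names `Action`, `action_before` and `action_after` are hypothetical, and the sketch assumes that vect_int holds exactly on the targets where dg-do-what-default ends up as run, which is the proxy relationship the message relies on.

```cpp
#include <cassert>

// Hypothetical model of the harness decision; this is not DejaGnu code.
enum class Action { Run, Compile, Skip };

// Before the patch: the test has no dg-do, so it follows dg-do-what-default,
// assumed here to be "run" on vector targets and "compile" otherwise.
Action action_before(bool vector_target) {
  return vector_target ? Action::Run : Action::Compile;
}

// After the patch: require-effective-target vect_int is checked first.
// Assumption: vect_int is true exactly on vector targets (it is only a proxy).
Action action_after(bool vector_target) {
  const bool vect_int = vector_target;
  if (!vect_int)
    return Action::Skip;  // skipped instead of being compiled with -S
  return action_before(vector_target);
}
```

Under these assumptions the invalid multi-input `-S` compile is never attempted, at the cost of tying the test to vect_int rather than to the exact vector-support check.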



Jeff


Re: [PATCH] c++: fix tf_decltype manipulation for COMPOUND_EXPR

2023-11-10 Thread Patrick Palka
On Thu, 9 Nov 2023, Jason Merrill wrote:

> On 11/7/23 10:08, Patrick Palka wrote:
> > bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?
> > 
> > -- >8 --
> > 
> > In the COMPOUND_EXPR case of tsubst_expr, we were redundantly clearing
> > the tf_decltype flag when substituting the LHS and also neglecting to
> > propagate it when substituting the RHS.  This patch corrects this flag
> > manipulation, which allows us to accept the below testcase.
> > 
> > gcc/cp/ChangeLog:
> > 
> > * pt.cc (tsubst_expr) <case COMPOUND_EXPR>: Don't redundantly
> > clear tf_decltype when substituting the LHS.  Propagate
> > tf_decltype when substituting the RHS.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp0x/decltype-call7.C: New test.
> > ---
> >   gcc/cp/pt.cc| 9 -
> >   gcc/testsuite/g++.dg/cpp0x/decltype-call7.C | 9 +
> >   2 files changed, 13 insertions(+), 5 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp0x/decltype-call7.C
> > 
> > diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> > index 521749df525..5f879287a58 100644
> > --- a/gcc/cp/pt.cc
> > +++ b/gcc/cp/pt.cc
> > @@ -20382,11 +20382,10 @@ tsubst_expr (tree t, tree args, tsubst_flags_t
> > complain, tree in_decl)
> > case COMPOUND_EXPR:
> > {
> > -   tree op0 = tsubst_expr (TREE_OPERAND (t, 0), args,
> > -   complain & ~tf_decltype, in_decl);
> > -   RETURN (build_x_compound_expr (EXPR_LOCATION (t),
> > -  op0,
> > -  RECUR (TREE_OPERAND (t, 1)),
> > +   tree op0 = RECUR (TREE_OPERAND (t, 0));
> > +   tree op1 = tsubst_expr (TREE_OPERAND (t, 1), args,
> > +   complain|decltype_flag, in_decl);
> > +   RETURN (build_x_compound_expr (EXPR_LOCATION (t), op0, op1,
> >templated_operator_saved_lookups (t),
> >complain|decltype_flag));
> 
> Hmm, passing decltype_flag to both op1 and the , is concerning.  Can you add a
> test with overloaded operator, where the RHS is a class with a destructor?

I'm not sure if this is what you had in mind, but indeed with this patch
we reject the following with an error outside the immediate context:

struct B { ~B() = delete; };
template<class T> B f();

void operator,(int, const B&);

template<class T> decltype(42, f<T>()) g(int) = delete; // #1
template<class T> void g(...); // #2

int main() {
  g<B>(0); // should select #2
}

gcc/testsuite/g++.dg/cpp0x/decltype-call8.C: In substitution of ‘template<class T> decltype ((42, f<T>())) g(int) [with T = B]’:
gcc/testsuite/g++.dg/cpp0x/decltype-call8.C:12:7:   required from here
   12 |   g<B>(0);
      |       ^~~
gcc/testsuite/g++.dg/cpp0x/decltype-call8.C:8:30: error: use of deleted function ‘B::~B()’
    8 | template<class T> decltype(42, f<T>()) g(int) = delete; // #1
      |                            ~~^~~~
gcc/testsuite/g++.dg/cpp0x/decltype-call8.C:3:12: note: declared here
    3 | struct B { ~B() = delete; };
      |            ^

Ultimately because unary_complex_lvalue isn't SFINAE-enabled.  If we
fix that with the following then we accept the testcase as before.

diff --git a/gcc/cp/class.cc b/gcc/cp/class.cc
index 9d4d95f85bf..58c45542793 100644
--- a/gcc/cp/class.cc
+++ b/gcc/cp/class.cc
@@ -556,7 +556,7 @@ build_simple_base_path (tree expr, tree binfo)
 into `(*(a ?   : )).x', and so on.  A COND_EXPR is only
 an lvalue in the front end; only _DECLs and _REFs are lvalues
 in the back end.  */
-  temp = unary_complex_lvalue (ADDR_EXPR, expr);
+  temp = unary_complex_lvalue (ADDR_EXPR, expr, tf_warning_or_error);
   if (temp)
expr = cp_build_fold_indirect_ref (temp);
 
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 1fa710d7154..d826afcdb5c 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -8130,7 +8130,7 @@ extern tree cp_build_addr_expr(tree, tsubst_flags_t);
 extern tree cp_build_unary_op   (enum tree_code, tree, bool,
  tsubst_flags_t);
 extern tree genericize_compound_lvalue (tree);
-extern tree unary_complex_lvalue   (enum tree_code, tree);
+extern tree unary_complex_lvalue   (enum tree_code, tree, tsubst_flags_t);
 extern tree build_x_conditional_expr   (location_t, tree, tree, tree,
  tsubst_flags_t);
 extern tree build_x_compound_expr_from_list(tree, expr_list_kind,
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 4f2cb2cd402..277c81412b9 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -20386,7 +20386,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl)
complain|decltype_flag, in_decl);
RETURN (build_x_compound_expr (EXPR_LOCATION (t), op0, op1,
   

[pushed][PR112337][IRA]: Check autoinc and memory address after temporary equivalence substitution

2023-11-10 Thread Vladimir Makarov

The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112337

The patch was successfully bootstrapped and tested on x86-64, ppc64le,
and aarch64.
commit b3d1d30eeed67c78e223c146a464d2fdd1dde894
Author: Vladimir N. Makarov 
Date:   Fri Nov 10 11:14:46 2023 -0500

[IRA]: Check autoinc and memory address after temporary equivalence substitution

My previous RA patches that take register equivalences into account
perform a temporary equivalence substitution to find out whether the
equivalence can be consumed by insns.  The insn with the substitution is
checked for validity using target-dependent code.  This code expects
autoinc operations to work on a register, but the register can be
substituted by equivalent memory.  The patch fixes this problem.  It also
adds a check that the substitution can be consumed in a memory address.

gcc/ChangeLog:

PR target/112337
* ira-costs.cc (validate_autoinc_and_mem_addr_p): New function.
(equiv_can_be_consumed_p): Use it.

gcc/testsuite/ChangeLog:

PR target/112337
* gcc.target/arm/pr112337.c: New.

diff --git a/gcc/ira-costs.cc b/gcc/ira-costs.cc
index 50f80779025..e0528e76a64 100644
--- a/gcc/ira-costs.cc
+++ b/gcc/ira-costs.cc
@@ -1758,13 +1758,46 @@ process_bb_node_for_costs (ira_loop_tree_node_t loop_tree_node)
 process_bb_for_costs (bb);
 }
 
+/* Return true if all autoinc rtx in X change only a register and memory is
+   valid.  */
+static bool
+validate_autoinc_and_mem_addr_p (rtx x)
+{
+  enum rtx_code code = GET_CODE (x);
+  if (GET_RTX_CLASS (code) == RTX_AUTOINC)
+return REG_P (XEXP (x, 0));
+  const char *fmt = GET_RTX_FORMAT (code);
+  for (int i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
+if (fmt[i] == 'e')
+  {
+	if (!validate_autoinc_and_mem_addr_p (XEXP (x, i)))
+	  return false;
+  }
+else if (fmt[i] == 'E')
+  {
+	for (int j = 0; j < XVECLEN (x, i); j++)
+	  if (!validate_autoinc_and_mem_addr_p (XVECEXP (x, i, j)))
+	return false;
+  }
+  /* Check memory after checking autoinc to guarantee that autoinc is already
+ valid for machine-dependent code checking memory address.  */
+  return (!MEM_P (x)
+	  || memory_address_addr_space_p (GET_MODE (x), XEXP (x, 0),
+	  MEM_ADDR_SPACE (x)));
+}
+
 /* Check that reg REGNO can be changed by TO in INSN.  Return true in case the
result insn would be valid one.  */
 static bool
 equiv_can_be_consumed_p (int regno, rtx to, rtx_insn *insn)
 {
   validate_replace_src_group (regno_reg_rtx[regno], to, insn);
-  bool res = verify_changes (0);
+  /* We can change register to equivalent memory in autoinc rtl.  Some code
+ including verify_changes assumes that autoinc contains only a register.
+ So check this first.  */
+  bool res = validate_autoinc_and_mem_addr_p (PATTERN (insn));
+  if (res)
+res = verify_changes (0);
   cancel_changes (0);
   return res;
 }
diff --git a/gcc/testsuite/gcc.target/arm/pr112337.c b/gcc/testsuite/gcc.target/arm/pr112337.c
new file mode 100644
index 000..5dacf0aa4f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr112337.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv8.1-m.main+fp.dp+mve.fp -mfloat-abi=hard" } */
+
+#pragma GCC arm "arm_mve_types.h"
+int32x4_t h(void *p) { return __builtin_mve_vldrwq_sv4si(p); }
+void g(int32x4_t);
+void f(int, int, int, short, int *p) {
+  int *bias = p;
+  for (;;) {
+int32x4_t d = h(bias);
+bias += 4;
+g(d);
+  }
+}
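
The recursive check added by the patch above can be illustrated on a toy expression tree. The following is a minimal sketch, not GCC code: `Expr`, `Kind`, `make` and `address_is_legitimate` are hypothetical stand-ins for rtx, rtx codes and the target's address check, and the toy target is assumed to accept only `reg`, `reg + const` and autoinc addresses.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy expression tree; a hypothetical stand-in for GCC's rtx.
enum class Kind { Reg, Const, Plus, Mem, AutoInc };

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

struct Expr {
  Kind kind;
  std::vector<ExprPtr> ops;
};

ExprPtr make(Kind kind, std::vector<ExprPtr> ops = {}) {
  return std::make_shared<Expr>(Expr{kind, std::move(ops)});
}

// Hypothetical target hook: this toy target accepts a plain register,
// register + constant, or an autoinc expression as a memory address.
bool address_is_legitimate(const Expr &addr) {
  switch (addr.kind) {
    case Kind::Reg:
    case Kind::AutoInc:
      return true;
    case Kind::Plus:
      return addr.ops.size() == 2 && addr.ops[0]->kind == Kind::Reg &&
             addr.ops[1]->kind == Kind::Const;
    default:
      return false;
  }
}

// Same shape as the patch's check: an autoinc must modify a register,
// all operands are validated first, and only then is a memory address
// handed to the target-specific check.
bool validate_autoinc_and_mem_addr(const Expr &x) {
  if (x.kind == Kind::AutoInc)
    return !x.ops.empty() && x.ops[0]->kind == Kind::Reg;
  for (const ExprPtr &op : x.ops)
    if (!validate_autoinc_and_mem_addr(*op))
      return false;
  return x.kind != Kind::Mem ||
         (!x.ops.empty() && address_is_legitimate(*x.ops[0]));
}
```

In this model a memory reference whose autoinc operand has been replaced by memory is rejected before the address check ever sees it, which mirrors the ordering comment in the patch.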


Re: [PATCH][Ada] Fix syntax errors in expect.c

2023-11-10 Thread Marc Poulhiès


Andris Pavēnis  writes:

> Fixing these errors (attached patch for master branch) was not sufficient for
> building Ada cross-compiler, but it fixed compiler errors.
>
> This would perhaps qualify as a trivial change, but it seems that I no longer
> have write access (I got it in 2015, but have not used it for a long time.
> Perhaps I do not really need it)

Hello,

I've merged your patch as r14-5332.

Thanks!
Marc


[PATCH] libgccjit: Fix GGC segfault when using -flto

2023-11-10 Thread Antoni Boucher
Hi.
This patch fixes the segfault when using -flto with libgccjit (bug
111396).

You mentioned in bugzilla that this didn't fix the reproducer for you,
but it does for me.
At first, the test case would not pass, but running "make install" made
it pass.
Not sure if this is normal.

Could you please check if this fixes the issue on your side as well?
Since this patch changes files outside of gcc/jit, what tests should I
run to make sure it didn't break anything?

Thanks for the review.
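
For readers unfamiliar with the *_cc_finalize convention used in the patch below, here is a minimal C++ sketch of the general idea: file-scope state is freed and reset so that a second compilation in the same process starts clean. The names `toy_summaries`, `toy_compile` and `toy_cc_finalize` are hypothetical, and the sketch does not model GGC or any real GCC pass.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical per-pass state, standing in for a file-scope summary pointer.
static std::map<std::string, int> *toy_summaries = nullptr;

// One "compilation": lazily allocates the state and records into it.
// Returns how many times FN has been seen since the last finalize.
int toy_compile(const std::string &fn) {
  if (!toy_summaries)
    toy_summaries = new std::map<std::string, int>();
  return ++(*toy_summaries)[fn];
}

// Analogue of a *_cc_finalize hook: free the state and clear the pointer
// so the next in-process compilation does not see stale data.
void toy_cc_finalize() {
  delete toy_summaries;
  toy_summaries = nullptr;
}
```

Without the reset, the second run would either reuse stale entries or, in a garbage-collected setting, touch memory that has already been reclaimed.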
From f26d0f37e8d83bce1f5aa53c393961a8bd518d16 Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Fri, 10 Nov 2023 09:52:32 -0500
Subject: [PATCH] libgccjit: Fix GGC segfault when using -flto

gcc/ChangeLog:
	PR jit/111396
	* ipa-fnsummary.cc (ipa_fnsummary_cc_finalize): Call
	ipa_free_size_summary.
	* ipa-icf.cc (ipa_icf_cc_finalize): New function.
	* ipa-profile.cc (ipa_profile_cc_finalize): New function.
	* ipa-prop.cc (ipa_prop_cc_finalize): New function.
	* ipa-prop.h (ipa_prop_cc_finalize): New function.
	* ipa-sra.cc (ipa_sra_cc_finalize): New function.
	* ipa-utils.h (ipa_profile_cc_finalize, ipa_icf_cc_finalize,
	ipa_sra_cc_finalize): New functions.
	* toplev.cc (toplev::finalize): Call ipa_icf_cc_finalize,
	ipa_prop_cc_finalize, ipa_profile_cc_finalize and
	ipa_sra_cc_finalize.
	Include ipa-utils.h.

gcc/testsuite/ChangeLog:
	PR jit/111396
	* jit.dg/all-non-failing-tests.h: Add new test-ggc-bugfix.
	* jit.dg/test-ggc-bugfix.c: New test.
---
 gcc/ipa-fnsummary.cc |  1 +
 gcc/ipa-icf.cc   |  9 ++
 gcc/ipa-profile.cc   | 10 ++
 gcc/ipa-prop.cc  | 18 +++
 gcc/ipa-prop.h   |  2 ++
 gcc/ipa-sra.cc   | 12 +++
 gcc/ipa-utils.h  |  7 
 gcc/testsuite/jit.dg/all-non-failing-tests.h | 12 ++-
 gcc/testsuite/jit.dg/test-ggc-bugfix.c   | 34 
 gcc/toplev.cc|  5 +++
 10 files changed, 109 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/jit.dg/test-ggc-bugfix.c

diff --git a/gcc/ipa-fnsummary.cc b/gcc/ipa-fnsummary.cc
index a2495ffe63e..34e011c4b50 100644
--- a/gcc/ipa-fnsummary.cc
+++ b/gcc/ipa-fnsummary.cc
@@ -5090,4 +5090,5 @@ void
 ipa_fnsummary_cc_finalize (void)
 {
   ipa_free_fn_summary ();
+  ipa_free_size_summary ();
 }
diff --git a/gcc/ipa-icf.cc b/gcc/ipa-icf.cc
index bbdfd445397..ba6c6899ce6 100644
--- a/gcc/ipa-icf.cc
+++ b/gcc/ipa-icf.cc
@@ -3657,3 +3657,12 @@ make_pass_ipa_icf (gcc::context *ctxt)
 {
   return new ipa_icf::pass_ipa_icf (ctxt);
 }
+
+/* Reset all state within ipa-icf.cc so that we can rerun the compiler
+   within the same process.  For use by toplev::finalize.  */
+
+void
+ipa_icf_cc_finalize (void)
+{
+  ipa_icf::optimizer = NULL;
+}
diff --git a/gcc/ipa-profile.cc b/gcc/ipa-profile.cc
index 78a40a118bc..8083b8195a8 100644
--- a/gcc/ipa-profile.cc
+++ b/gcc/ipa-profile.cc
@@ -1065,3 +1065,13 @@ make_pass_ipa_profile (gcc::context *ctxt)
 {
   return new pass_ipa_profile (ctxt);
 }
+
+/* Reset all state within ipa-profile.cc so that we can rerun the compiler
+   within the same process.  For use by toplev::finalize.  */
+
+void
+ipa_profile_cc_finalize (void)
+{
+  delete call_sums;
+  call_sums = NULL;
+}
diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index 827bdb691ba..32cfb7754be 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -5904,5 +5904,23 @@ ipcp_transform_function (struct cgraph_node *node)
   return modified_mem_access ? TODO_update_ssa_only_virtuals : 0;
 }
 
+/* Reset all state within ipa-prop.cc so that we can rerun the compiler
+   within the same process.  For use by toplev::finalize.  */
+
+void
+ipa_prop_cc_finalize (void)
+{
+  if (function_insertion_hook_holder)
+symtab->remove_cgraph_insertion_hook (function_insertion_hook_holder);
+  function_insertion_hook_holder = NULL;
+
+  if (ipa_edge_args_sum)
+ggc_delete (ipa_edge_args_sum);
+  ipa_edge_args_sum = NULL;
+
+  if (ipa_node_params_sum)
+ggc_delete (ipa_node_params_sum);
+  ipa_node_params_sum = NULL;
+}
 
 #include "gt-ipa-prop.h"
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index fcd0e5c638f..4409c4afee9 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -1255,6 +1255,8 @@ tree ipcp_get_aggregate_const (struct function *func, tree parm, bool by_ref,
 bool unadjusted_ptr_and_unit_offset (tree op, tree *ret,
  poly_int64 *offset_ret);
 
+void ipa_prop_cc_finalize (void);
+
 /* From tree-sra.cc:  */
 tree build_ref_for_offset (location_t, tree, poly_int64, bool, tree,
 			   gimple_stmt_iterator *, bool);
diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
index 6ffad335db4..2ac6fee14c4 100644
--- a/gcc/ipa-sra.cc
+++ b/gcc/ipa-sra.cc
@@ -4707,5 +4707,17 @@ make_pass_ipa_sra (gcc::context *ctxt)
   return new pass_ipa_sra (ctxt);
 }
 
+/* Reset all state within ipa-sra.cc so that we can rerun the compiler
+   within the same 

Re: [PATCH 0/7] ira/lra: Support subreg coalesce

2023-11-10 Thread Dimitar Dimitrov
On Fri, Nov 10, 2023 at 04:53:57PM +0800, Lehua Ding wrote:
> > > The divide by zero error above is interesting. I'm not sure why
> > > ira_reg_class_max_nregs[] yields 0 for the pseudo register 168 in
> > > the following rtx:
> > > (debug_insn 168 167 169 19 (var_location:SI encoding (reg/v:SI 168 [
> > > encoding ])) -1
> > >   (nil))
> > 
> > I just cross compiled an arm-none-eabi compiler and didn't encounter
> > this error, can you give me a little more config info about build? For
> > example, flags_for_target, etc. Thanks again.
> > 
> 
> Forgot, please also provide the version information of newlib code.
> 

These are the GIT commit hashes which I tested:
  gcc 39d81b667373b0033f44702a4b532a4618dde9ff
  binutils c96ceed9dce7617f270aa4742645706e535f74b7
  newlib 39f734a857e2692224715b03b99fc7bd83e94a0f

This is the script I'm using to build arm-none-eabi:
   https://github.com/dinuxbg/gnupru/blob/master/testing/manual-build-arm.sh
The build steps and config parameters are easily seen there.

Note that the Linaro CI is also detecting issues. It hits ICEs when
building libgcc:
  https://patchwork.sourceware.org/project/gcc/patch/20231108034740.834590-8-lehua.d...@rivai.ai/

Regards,
Dimitar



[pushed] Allow md iterators to include other iterators

2023-11-10 Thread Richard Sandiford
This patch allows an .md iterator to include the contents of
previous iterators, possibly with an extra condition attached.

Too much indirection might become hard to follow, so for the
AArch64 changes I tried to stick to things that seemed likely
to be uncontroversial:

(a) structure iterators that combine modes for different sizes
and vector counts

(b) iterators that explicitly duplicate another iterator
(for iterating over the cross product)

Tested on aarch64-linux-gnu & pushed.

Richard
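
As a rough illustration of what "including" an iterator means, the sketch below expands iterator definitions in which an entry may name a previously defined iterator. It is a simplified C++ model, not the read-rtl.cc implementation: `Iterators` and `define_iterator` are hypothetical names, and the optional per-entry conditions mentioned above are left out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Maps an iterator name to the list of values it iterates over.
using Iterators = std::map<std::string, std::vector<std::string>>;

// Define NAME from ENTRIES.  An entry that names a previously defined
// iterator is replaced by that iterator's values; any other entry is
// taken as a plain value (a mode name in the .md files).
void define_iterator(Iterators &its, const std::string &name,
                     const std::vector<std::string> &entries) {
  std::vector<std::string> values;
  for (const std::string &entry : entries) {
    auto found = its.find(entry);
    if (found != its.end())
      values.insert(values.end(), found->second.begin(), found->second.end());
    else
      values.push_back(entry);
  }
  its[name] = values;
}
```

With this model, defining `DREG2` as `[DREG]` yields the same six modes as `DREG` without repeating the list, which is the duplication the AArch64 changes remove.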


gcc/
* read-rtl.cc (md_reader::read_mapping): Allow iterators to
include other iterators.
* doc/md.texi: Document the change.
* config/aarch64/iterators.md (DREG2, VQ2, TX2, DX2, SX2): Include
the iterator that is being duplicated, rather than reproducing it.
(VSTRUCT_D): Redefine using VSTRUCT_[234]D.
(VSTRUCT_Q): Likewise VSTRUCT_[234]Q.
(VSTRUCT_2QD, VSTRUCT_3QD, VSTRUCT_4QD, VSTRUCT_QD): Redefine using
the individual D and Q iterators.
---
 gcc/config/aarch64/iterators.md | 60 +
 gcc/doc/md.texi | 13 +++
 gcc/read-rtl.cc | 21 ++--
 3 files changed, 47 insertions(+), 47 deletions(-)

diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 1593a8fd04f..a920de99ffc 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -106,7 +106,7 @@ (define_mode_iterator VDZ [V8QI V4HI V4HF V4BF V2SI V2SF DI DF])
 (define_mode_iterator DREG [V8QI V4HI V4HF V2SI V2SF DF])
 
 ;; Copy of the above.
-(define_mode_iterator DREG2 [V8QI V4HI V4HF V2SI V2SF DF])
+(define_mode_iterator DREG2 [DREG])
 
 ;; Advanced SIMD modes for integer divides.
 (define_mode_iterator VQDIV [V4SI V2DI])
@@ -124,7 +124,7 @@ (define_mode_iterator VDQ_BHSI [V8QI V16QI V4HI V8HI V2SI V4SI])
 (define_mode_iterator VQ [V16QI V8HI V4SI V2DI V8HF V4SF V2DF V8BF])
 
 ;; Copy of the above.
-(define_mode_iterator VQ2 [V16QI V8HI V4SI V2DI V8HF V8BF V4SF V2DF])
+(define_mode_iterator VQ2 [VQ])
 
 ;; Quad vector modes suitable for moving.  Includes BFmode.
 (define_mode_iterator VQMOV [V16QI V8HI V4SI V2DI V8HF V8BF V4SF V2DF])
@@ -320,21 +320,13 @@ (define_mode_iterator VS [V2SI V4SI])
 (define_mode_iterator TX [TI TF TD])
 
 ;; Duplicate of the above
-(define_mode_iterator TX2 [TI TF TD])
+(define_mode_iterator TX2 [TX])
 
 (define_mode_iterator VTX [TI TF TD V16QI V8HI V4SI V2DI V8HF V4SF V2DF V8BF])
 
 ;; Advanced SIMD opaque structure modes.
 (define_mode_iterator VSTRUCT [OI CI XI])
 
-;; Advanced SIMD 64-bit vector structure modes.
-(define_mode_iterator VSTRUCT_D [V2x8QI V2x4HI V2x2SI V2x1DI
-V2x4HF V2x2SF V2x1DF V2x4BF
-V3x8QI V3x4HI V3x2SI V3x1DI
-V3x4HF V3x2SF V3x1DF V3x4BF
-V4x8QI V4x4HI V4x2SI V4x1DI
-V4x4HF V4x2SF V4x1DF V4x4BF])
-
 ;; Advanced SIMD 64-bit 2-vector structure modes.
 (define_mode_iterator VSTRUCT_2D [V2x8QI V2x4HI V2x2SI V2x1DI
  V2x4HF V2x2SF V2x1DF V2x4BF])
@@ -347,6 +339,9 @@ (define_mode_iterator VSTRUCT_3D [V3x8QI V3x4HI V3x2SI V3x1DI
 (define_mode_iterator VSTRUCT_4D [V4x8QI V4x4HI V4x2SI V4x1DI
  V4x4HF V4x2SF V4x1DF V4x4BF])
 
+;; Advanced SIMD 64-bit vector structure modes.
+(define_mode_iterator VSTRUCT_D [VSTRUCT_2D VSTRUCT_3D VSTRUCT_4D])
+
 ;; Advanced SIMD 64-bit 2-vector structure modes minus V2x1DI and V2x1DF.
 (define_mode_iterator VSTRUCT_2DNX [V2x8QI V2x4HI V2x2SI V2x4HF
V2x2SF V2x4BF])
@@ -371,14 +366,6 @@ (define_mode_iterator VSTRUCT_3DX [V3x1DI V3x1DF])
 ;; Advanced SIMD 64-bit 4-vector structure modes with 64-bit elements.
 (define_mode_iterator VSTRUCT_4DX [V4x1DI V4x1DF])
 
-;; Advanced SIMD 128-bit vector structure modes.
-(define_mode_iterator VSTRUCT_Q [V2x16QI V2x8HI V2x4SI V2x2DI
-V2x8HF V2x4SF V2x2DF V2x8BF
-V3x16QI V3x8HI V3x4SI V3x2DI
-V3x8HF V3x4SF V3x2DF V3x8BF
-V4x16QI V4x8HI V4x4SI V4x2DI
-V4x8HF V4x4SF V4x2DF V4x8BF])
-
 ;; Advanced SIMD 128-bit 2-vector structure modes.
 (define_mode_iterator VSTRUCT_2Q [V2x16QI V2x8HI V2x4SI V2x2DI
  V2x8HF V2x4SF V2x2DF V2x8BF])
@@ -391,49 +378,32 @@ (define_mode_iterator VSTRUCT_3Q [V3x16QI V3x8HI V3x4SI V3x2DI
 (define_mode_iterator VSTRUCT_4Q [V4x16QI V4x8HI V4x4SI V4x2DI
  V4x8HF V4x4SF V4x2DF V4x8BF])
 
+;; Advanced SIMD 128-bit vector structure modes.
+(define_mode_iterator VSTRUCT_Q [VSTRUCT_2Q VSTRUCT_3Q VSTRUCT_4Q])
+
 ;; Advanced SIMD 2-vector structure modes.
-(define_mode_iterator VSTRUCT_2QD [V2x8QI V2x4HI V2x2SI V2x1DI
-  

[committed] i386: Clear stack protector scratch with zero/sign-extend instruction

2023-11-10 Thread Uros Bizjak
Use unrelated register initializations using zero/sign-extend instructions
to clear stack protector scratch register.

Handle only SI -> DImode extensions for 64-bit targets, as this is the
only extension that triggers the peephole in a non-negligible number of cases.

Also use explicit check for word_mode instead of mode iterator in peephole2
patterns to avoid pattern explosion.

gcc/ChangeLog:

* config/i386/i386.md (stack_protect_set_1 peephole2):
Explicitly check operand 2 for word_mode.
(stack_protect_set_1 peephole2 #2): Ditto.
(stack_protect_set_2 peephole2): Ditto.
(stack_protect_set_3 peephole2): Ditto.
(*stack_protect_set_4z__di): New insn pattern.
(*stack_protect_set_4s__di): Ditto.
(stack_protect_set_4 peephole2): New peephole2 pattern to
substitute stack protector scratch register clear with unrelated
register initialization involving zero/sign-extend instruction.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 046b6b7919e..01fc6ecc351 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -24335,11 +24335,12 @@ (define_peephole2
   [(parallel [(set (match_operand:PTR 0 "memory_operand")
   (unspec:PTR [(match_operand:PTR 1 "memory_operand")]
   UNSPEC_SP_SET))
- (set (match_operand:W 2 "general_reg_operand") (const_int 0))
+ (set (match_operand 2 "general_reg_operand") (const_int 0))
  (clobber (reg:CC FLAGS_REG))])
(set (match_operand 3 "general_reg_operand")
(match_operand 4 "const0_operand"))]
-  "GET_MODE_SIZE (GET_MODE (operands[3])) <= UNITS_PER_WORD
+  "GET_MODE (operands[2]) == word_mode
+   && GET_MODE_SIZE (GET_MODE (operands[3])) <= UNITS_PER_WORD
&& peep2_reg_dead_p (0, operands[3])
&& peep2_reg_dead_p (1, operands[2])"
   [(parallel [(set (match_dup 0)
@@ -24395,11 +24396,12 @@ (define_peephole2
   [(parallel [(set (match_operand:PTR 0 "memory_operand")
   (unspec:PTR [(match_operand:PTR 1 "memory_operand")]
   UNSPEC_SP_SET))
- (set (match_operand:W 2 "general_reg_operand") (const_int 0))
+ (set (match_operand 2 "general_reg_operand") (const_int 0))
  (clobber (reg:CC FLAGS_REG))])
(set (match_operand:SWI48 3 "general_reg_operand")
(match_operand:SWI48 4 "general_gr_operand"))]
-  "peep2_reg_dead_p (0, operands[3])
+  "GET_MODE (operands[2]) == word_mode
+   && peep2_reg_dead_p (0, operands[3])
&& peep2_reg_dead_p (1, operands[2])"
   [(parallel [(set (match_dup 0)
   (unspec:PTR [(match_dup 1)] UNSPEC_SP_SET))
@@ -24411,9 +24413,10 @@ (define_peephole2
(parallel [(set (match_operand:PTR 0 "memory_operand")
   (unspec:PTR [(match_operand:PTR 1 "memory_operand")]
   UNSPEC_SP_SET))
- (set (match_operand:W 2 "general_reg_operand") (const_int 0))
+ (set (match_operand 2 "general_reg_operand") (const_int 0))
  (clobber (reg:CC FLAGS_REG))])]
-  "peep2_reg_dead_p (0, operands[3])
+  "GET_MODE (operands[2]) == word_mode
+   && peep2_reg_dead_p (0, operands[3])
&& peep2_reg_dead_p (2, operands[2])
&& !reg_mentioned_p (operands[3], operands[0])
&& !reg_mentioned_p (operands[3], operands[1])"
@@ -24448,16 +24451,71 @@ (define_peephole2
   [(parallel [(set (match_operand:PTR 0 "memory_operand")
   (unspec:PTR [(match_operand:PTR 1 "memory_operand")]
   UNSPEC_SP_SET))
- (set (match_operand:W 2 "general_reg_operand") (const_int 0))
+ (set (match_operand 2 "general_reg_operand") (const_int 0))
  (clobber (reg:CC FLAGS_REG))])
(set (match_operand:SWI48 3 "general_reg_operand")
(match_operand:SWI48 4 "address_no_seg_operand"))]
-  "peep2_reg_dead_p (0, operands[3])
+  "GET_MODE (operands[2]) == word_mode
+   && peep2_reg_dead_p (0, operands[3])
&& peep2_reg_dead_p (1, operands[2])"
   [(parallel [(set (match_dup 0)
   (unspec:PTR [(match_dup 1)] UNSPEC_SP_SET))
  (set (match_dup 3) (match_dup 4))])])
 
+(define_insn "*stack_protect_set_4z__di"
+  [(set (match_operand:PTR 0 "memory_operand" "=m")
+   (unspec:PTR [(match_operand:PTR 3 "memory_operand" "m")]
+   UNSPEC_SP_SET))
+   (set (match_operand:DI 1 "register_operand" "=")
+   (zero_extend:DI (match_operand:SI 2 "nonimmediate_operand" "rm")))]
+  "TARGET_64BIT && reload_completed"
+{
+  output_asm_insn ("mov{}\t{%3, %1|%1, %3}", operands);
+  output_asm_insn ("mov{}\t{%1, %0|%0, %1}", operands);
+  if (ix86_use_lea_for_mov (insn, operands + 1))
+return "lea{l}\t{%E2, %k1|%k1, %E2}";
+  else
+return "mov{l}\t{%2, %k1|%k1, %2}";
+}
+  [(set_attr "type" "multi")
+   (set_attr "length" "24")])
+
+(define_insn "*stack_protect_set_4s__di"
+  [(set 

Re: [PATCH] c++: non-dependent .* folding [PR112427]

2023-11-10 Thread Patrick Palka
On Fri, 10 Nov 2023, Patrick Palka wrote:

> On Thu, 9 Nov 2023, Jason Merrill wrote:
> 
> > On 11/8/23 16:59, Patrick Palka wrote:
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > > trunk?
> > > 
> > > -- >8 --
> > > 
> > > Here when building up the non-dependent .* expression, we crash from
> > > fold_convert on 'b.a' due to this (templated) COMPONENT_REF having an
> > > IDENTIFIER_NODE instead of FIELD_DECL operand that middle-end routines
> > > expect.  Like in r14-4899-gd80a26cca02587, this patch fixes this by
> > > replacing the problematic piecemeal folding with a single call to
> > > cp_fully_fold.
> > > 
> > >   PR c++/112427
> > > 
> > > gcc/cp/ChangeLog:
> > > 
> > >   * typeck2.cc (build_m_component_ref): Use cp_convert, build2 and
> > >   cp_fully_fold instead of fold_build_pointer_plus and fold_convert.
> > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > >   * g++.dg/template/non-dependent29.C: New test.
> > > ---
> > >   gcc/cp/typeck2.cc   |  5 -
> > >   gcc/testsuite/g++.dg/template/non-dependent29.C | 13 +
> > >   2 files changed, 17 insertions(+), 1 deletion(-)
> > >   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent29.C
> > > 
> > > diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
> > > index 309903afed8..208004221da 100644
> > > --- a/gcc/cp/typeck2.cc
> > > +++ b/gcc/cp/typeck2.cc
> > > @@ -2378,7 +2378,10 @@ build_m_component_ref (tree datum, tree component,
> > > tsubst_flags_t complain)
> > > /* Build an expression for "object + offset" where offset is the
> > >value stored in the pointer-to-data-member.  */
> > > ptype = build_pointer_type (type);
> > > -  datum = fold_build_pointer_plus (fold_convert (ptype, datum),
> > > component);
> > > +  datum = cp_convert (ptype, datum, complain);
> > > +  datum = build2 (POINTER_PLUS_EXPR, ptype,
> > > +   datum, convert_to_ptrofftype (component));
> > 
> > We shouldn't need to build the POINTER_PLUS_EXPR at all in template context.
> > OK with that change.
> 
> Hmm, that seems harmless at first glance, but I noticed
> build_min_non_dep (called from build_x_binary_op in this case) is
> careful to propagate TREE_SIDE_EFFECTS of the given tree, and so eliding
> POINTER_PLUS_EXPR here could potentially mean that the tree we
> ultimately return from build_x_binary_op when in a template context has
> TREE_SIDE_EFFECTS not set when it used to.  Shall we still elide the
> POINTER_PLUS_EXPR in a template context despite this?
> 
> (The TREE_SIDE_EFFECTS propagation in build_min_non_dep was added in
> r71108 to avoid bogus ahead of time -Wunused-value warnings.  But then
> r105273 later made us stop issuing -Wunused-value warnings ahead of time
> altogether.  So perhaps we don't need to maintain the TREE_SIDE_EFFECTS
> flag on templated trees at all anymore?)

IMO it'd be nice to restore ahead of time -Wunused-value warnings;
it seems the original motivation for r105273 / PR8057 was to avoid
redundantly issuing a warning twice, once ahead of time and once at
instantiation time, which we now could do in a better way with
warning_suppressed_p etc.  If so, then IIUC eliding the POINTER_PLUS_EXPR
could mean we'd incorrectly issue a -Wunused-value warning for e.g.
'a.*f()' in a template context?
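
For concreteness, a minimal sketch of the kind of expression in question. The names A, f, calls and use are invented for illustration and are not taken from the PR testcase. The right operand of .* is a call with a side effect, so the discarded expression still does observable work:

```cpp
#include <cassert>

struct A { int m = 7; };

static int calls = 0;   // counts how often f () runs

// Hypothetical helper: returns a pointer to data member and has a side effect.
int A::* f () { ++calls; return &A::m; }

template <class T>
void use (A a)
{
  // Non-dependent expression inside a template: nothing here depends on T.
  // The value is discarded, but evaluating it still calls f ().
  a.*f ();
}

// Instantiates the template once and reports how many times f () ran.
int run_example ()
{
  use<int> (A{});
  assert (calls == 1);
  return calls;
}
```

Whether a given GCC revision warns on the discarded expression is not something this sketch asserts; it only shows that the side effect lives in the right-hand operand.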

> 
> > 
> > Jason
> > 
> > 
> 



Re: [PATCH] AArch64: Cleanup memset expansion

2023-11-10 Thread Richard Earnshaw




On 10/11/2023 14:46, Kyrylo Tkachov wrote:




-Original Message-
From: Richard Earnshaw 
Sent: Friday, November 10, 2023 11:31 AM
To: Wilco Dijkstra ; Kyrylo Tkachov
; GCC Patches 
Cc: Richard Sandiford ; Richard Earnshaw

Subject: Re: [PATCH] AArch64: Cleanup memset expansion



On 10/11/2023 10:17, Wilco Dijkstra wrote:

Hi Kyrill,


+  /* Reduce the maximum size with -Os.  */
+  if (optimize_function_for_size_p (cfun))
+    max_set_size = 96;
+



 This is a new "magic" number in this code. It looks sensible, but how did you arrive at it?


We need 1 instruction to create the value to store (DUP or MOVI) and 1 STP
for every 32 bytes, so the 96 means 4 instructions for typical sizes
(sizes not
a multiple of 16 can add one extra instruction).


It would be useful to have that reasoning in the comment.



I checked codesize on SPECINT2017, and 96 had practically identical size.
Using 128 would also be a reasonable Os value with a very slight size
increase,
and 384 looks good for O2 - however I didn't want to tune these values
as this
is a cleanup patch.

Cheers,
Wilco


Shouldn't this be a param then?  Also, manifest constants in the middle
of code are a potential nightmare, please move it to a #define (even if
that's then used as the default value for the param).


I agree on making this a #define but I wouldn't insist on a param.
Code size IMO has a much more consistent right or wrong answer as it's 
statically determinable.
If this were a speed-related param then I'd expect the flexibility for the power 
user to override such heuristics would be more widely useful.
But for code size the compiler should always be able to get it right.

If Richard would still like the param then I'm fine with having the param, but 
I'd be okay with the comment above and making this a #define.


I don't immediately have a feel for how sensitive code would be to the 
precise value here.  Is this value something that might affect 
individual benchmarks in different ways?  Or something where a future 
architecture might want a different value?  For either of those reasons 
a param might be useful, but if this is primarily a code size trade off 
and the variation in performance is small, then it's probably not 
worthwhile having an additional hook.


R.
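
The arithmetic behind the value 96 can be written down as a small sketch. This is only a rough model of Wilco's explanation, not the AArch64 backend's logic; the names kMaxSetSizeForSize and estimated_memset_insns are hypothetical, and the possible extra instruction for sizes that are not a multiple of 16 is deliberately left out.

```cpp
#include <cassert>

// Hypothetical name; the thread only asks that 96 become a #define.
constexpr unsigned kMaxSetSizeForSize = 96;

// Rough estimate following the arithmetic in the thread: one DUP/MOVI to
// create the value, plus one STP of two Q registers per 32 bytes (rounded up).
// The extra instruction that sizes not a multiple of 16 may need is not
// modelled here.
constexpr unsigned estimated_memset_insns (unsigned bytes)
{
  return 1 + (bytes + 31) / 32;
}

static_assert (estimated_memset_insns (kMaxSetSizeForSize) == 4,
               "96 bytes: 1 DUP/MOVI + 3 STP");

// Runtime self-check for the alternative -Os value mentioned in the thread.
inline void check_estimate ()
{
  assert (estimated_memset_insns (128) == 5);
}
```

Under this model the alternative value of 128 costs one more instruction than 96, which is in line with the "very slight size increase" reported above.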


Re: [PATCH] c++: non-dependent .* folding [PR112427]

2023-11-10 Thread Patrick Palka
On Thu, 9 Nov 2023, Jason Merrill wrote:

> On 11/8/23 16:59, Patrick Palka wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?
> > 
> > -- >8 --
> > 
> > Here when building up the non-dependent .* expression, we crash from
> > fold_convert on 'b.a' due to this (templated) COMPONENT_REF having an
> > IDENTIFIER_NODE instead of FIELD_DECL operand that middle-end routines
> > expect.  Like in r14-4899-gd80a26cca02587, this patch fixes this by
> > replacing the problematic piecemeal folding with a single call to
> > cp_fully_fold.
> > 
> > PR c++/112427
> > 
> > gcc/cp/ChangeLog:
> > 
> > * typeck2.cc (build_m_component_ref): Use cp_convert, build2 and
> > cp_fully_fold instead of fold_build_pointer_plus and fold_convert.
> 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/template/non-dependent29.C: New test.
> > ---
> >   gcc/cp/typeck2.cc   |  5 -
> >   gcc/testsuite/g++.dg/template/non-dependent29.C | 13 +
> >   2 files changed, 17 insertions(+), 1 deletion(-)
> >   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent29.C
> > 
> > diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
> > index 309903afed8..208004221da 100644
> > --- a/gcc/cp/typeck2.cc
> > +++ b/gcc/cp/typeck2.cc
> > @@ -2378,7 +2378,10 @@ build_m_component_ref (tree datum, tree component,
> > tsubst_flags_t complain)
> > /* Build an expression for "object + offset" where offset is the
> >  value stored in the pointer-to-data-member.  */
> > ptype = build_pointer_type (type);
> > -  datum = fold_build_pointer_plus (fold_convert (ptype, datum),
> > component);
> > +  datum = cp_convert (ptype, datum, complain);
> > +  datum = build2 (POINTER_PLUS_EXPR, ptype,
> > + datum, convert_to_ptrofftype (component));
> 
> We shouldn't need to build the POINTER_PLUS_EXPR at all in template context.
> OK with that change.

Hmm, that seems harmless at first glance, but I noticed
build_min_non_dep (called from build_x_binary_op in this case) is
careful to propagate TREE_SIDE_EFFECTS of the given tree, and so eliding
POINTER_PLUS_EXPR here could potentially mean that the tree we
ultimately return from build_x_binary_op when in a template context has
TREE_SIDE_EFFECTS not set when it used to.  Shall we still elide the
POINTER_PLUS_EXPR in a template context despite this?

(The TREE_SIDE_EFFECTS propagation in build_min_non_dep was added in
r71108 to avoid bogus ahead of time -Wunused-value warnings.  But then
r105273 later made us stop issuing -Wunused-value warnings ahead of time
altogether.  So perhaps we don't need to maintain the TREE_SIDE_EFFECTS
flag on templated trees at all anymore?)

> 
> Jason
> 
> 



Re: [PATCH] riscv: thead: Add support for the XTheadInt ISA extension

2023-11-10 Thread Christoph Müllner
On Tue, Nov 7, 2023 at 4:04 AM Jin Ma  wrote:
>
> The XTheadInt ISA extension provides acceleration interruption
> instructions as defined in T-Head-specific:
>
> * th.ipush
> * th.ipop

Overall, it looks ok to me.
There are just a few small issues to clean up (see below).


>
> gcc/ChangeLog:
>
> * config/riscv/riscv-protos.h (th_int_get_mask): New prototype.
> (th_int_get_save_adjustment): Likewise.
> (th_int_adjust_cfi_prologue): Likewise.
> * config/riscv/riscv.cc (TH_INT_INTERRUPT): New macro.
> (riscv_expand_prologue): Add the processing of XTheadInt.
> (riscv_expand_epilogue): Likewise.
> * config/riscv/riscv.md: New unspec.
> * config/riscv/thead.cc (BITSET_P): New macro.
> * config/riscv/thead.md (th_int_push): New pattern.
> (th_int_pop): New pattern.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/xtheadint-push-pop.c: New test.
> ---
>  gcc/config/riscv/riscv-protos.h   |  3 +
>  gcc/config/riscv/riscv.cc | 58 +-
>  gcc/config/riscv/riscv.md |  4 +
>  gcc/config/riscv/thead.cc | 78 +++
>  gcc/config/riscv/thead.md | 67 
>  .../gcc.target/riscv/xtheadint-push-pop.c | 36 +
>  6 files changed, 245 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadint-push-pop.c
>
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 85d4f6ed9ea..05d1fc2b3a0 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -627,6 +627,9 @@ extern void th_mempair_prepare_save_restore_operands 
> (rtx[4], bool,
>   int, HOST_WIDE_INT,
>   int, HOST_WIDE_INT);
>  extern void th_mempair_save_restore_regs (rtx[4], bool, machine_mode);
> +extern unsigned int th_int_get_mask(unsigned int);

Space between function name and parenthesis.

> +extern unsigned int th_int_get_save_adjustment();

Space between function name and parenthesis.
An empty parameter list should be written as "(void)".

> +extern rtx th_int_adjust_cfi_prologue (unsigned int);
>  #ifdef RTX_CODE
>  extern const char*
>  th_mempair_output_move (rtx[4], bool, machine_mode, RTX_CODE);
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 08ff05dcc3f..c623101b05e 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -101,6 +101,16 @@ along with GCC; see the file COPYING3.  If not see
>  /* True the mode switching has static frm, or false.  */
>  #define STATIC_FRM_P(c) ((c)->machine->mode_sw_info.static_frm_p)
>
> +/* True if we can use the instructions in the XTheadInt extension
> +   to handle interrupts, or false.  */
> +#define TH_INT_INTERRUPT(c)\
> +  (TARGET_XTHEADINT\
> +   /* The XTheadInt extension only supports rv32.  */  \
> +   && !TARGET_64BIT\
> +   && (c)->machine->interrupt_handler_p\
> +   /* This instruction can be executed in M-mode only.*/   \

Dot, space, space, end of comment.

Maybe better:
/* The XTheadInt instructions can only be executed in M-mode.  */

> +   && (c)->machine->interrupt_mode == MACHINE_MODE)
> +
>  /* Information about a function's frame layout.  */
>  struct GTY(())  riscv_frame_info {
>/* The size of the frame in bytes.  */
> @@ -6703,6 +6713,7 @@ riscv_expand_prologue (void)
>unsigned fmask = frame->fmask;
>int spimm, multi_push_additional, stack_adj;
>rtx insn, dwarf = NULL_RTX;
> +  unsigned th_int_mask = 0;
>
>if (flag_stack_usage_info)
>  current_function_static_stack_size = constant_lower_bound 
> (remaining_size);
> @@ -6771,6 +6782,28 @@ riscv_expand_prologue (void)
>REG_NOTES (insn) = dwarf;
>  }
>
> +  th_int_mask = th_int_get_mask(frame->mask);

There should be exactly one space between function name and parenthesis.

> +  if (th_int_mask && TH_INT_INTERRUPT (cfun))
> +{
> +  frame->mask &= ~th_int_mask;
> +
> +  /* RISCV_PROLOGUE_TEMP may be used to handle some CSR for
> +interrupts, such as fcsr. */

Dot, space, space, end of comment.

> +  if ((TARGET_HARD_FLOAT  && frame->fmask)
> + || (TARGET_ZFINX && frame->mask))
> +   frame->mask |= (1 << RISCV_PROLOGUE_TEMP_REGNUM);
> +
> +  unsigned save_adjustment = th_int_get_save_adjustment ();
> +  frame->gp_sp_offset -= save_adjustment;
> +  remaining_size -= save_adjustment;
> +
> +  insn = emit_insn (gen_th_int_push ());
> +
> +  rtx dwarf = th_int_adjust_cfi_prologue (th_int_mask);
> +  RTX_FRAME_RELATED_P (insn) = 1;
> +  REG_NOTES (insn) = dwarf;
> +}
> +
>/* Save the GP, FP registers.  */
>if 

Re: [PATCH] c++: constantness of local var in constexpr fn [PR111703, PR112269]

2023-11-10 Thread Patrick Palka
On Wed, 1 Nov 2023, Patrick Palka wrote:

> On Tue, 31 Oct 2023, Patrick Palka wrote:
> 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?  Does it look OK for release branches as well for sake of PR111703?

Ping.

> > 
> > -- >8 --
> > 
> > potential_constant_expression was incorrectly treating most local
> > variables from a constexpr function as (potentially) constant because it
> > wasn't considering the 'now' parameter.  This patch fixes this by
> > relaxing some var_in_maybe_constexpr_fn checks accordingly, which turns
> > out to partially fix two recently reported regressions:
> > 
> > PR111703 is a regression caused by r11-550-gf65a3299a521a4 for
> > restricting constexpr evaluation during warning-dependent folding.
> > The mechanism is intended to restrict only constant evaluation of the
> > instantiated non-dependent expression, but it also ends up restricting
> > constant evaluation (as part of satisfaction) during instantiation of
> > the expression, in particular when resolving the ck_rvalue conversion of
> > the 'x' argument into a copy constructor call.
> 
> Oops, this analysis is inaccurate for this specific testcase (although
> the general idea is the same)...  We don't call fold_for_warn on 'f(x)'
> but rather on its 'x' argument that has been processed by
> convert_arguments into an IMPLICIT_CONV_EXPR.  And it's the
> instantiation of this IMPLICIT_CONV_EXPR that turns it into a copy
> constructor call.  There is no ck_rvalue conversion at all here since
> 'f' is a function pointer, not an actual function, and so ICSes don't
> get computed (IIUC).  If 'f' is changed to be an actual function then
> there's no issue since build_over_call doesn't perform argument
> conversions when in a template context and therefore doesn't call
> check_function_arguments on the converted arguments (from which the
> problematic fold_for_warn call occurs).
> 
> > This seems like a bug in
> > the mechanism[1], though I don't know if we want to refine the mechanism
> > or get rid of it completely since the original testcases which motivated
> > the mechanism are fixed more simply by r13-1225-gb00b95198e6720.  In any
> > case, this patch partially fixes this by making us correctly treat 'x'
> > and therefore 'f(x)' in the below testcase as non-constant, which
> > prevents the problematic warning-dependent folding from occurring at
> > all.  If this bug crops up again then I figure we could decide what to
> > do with the mechanism then.
> > 
> > PR112269 is caused by r14-4796-g3e3d73ed5e85e7 for merging tsubst_copy
> > into tsubst_copy_and_build.  tsubst_copy used to exit early when 'args'
> > was empty, behavior which that commit deliberately didn't preserve.
> > This early exit masked the fact that COMPLEX_EXPR wasn't handled by
> > tsubst at all, and is a tree code that apparently we could see during
> > warning-dependent folding on some targets.  A complete fix is to add
> > handling for this tree code in tsubst_expr, but this patch should fix
> > the reported testsuite failures since the situations where COMPLEX_EXPR
> > crops up in  turn out to not be constant expressions in the
> > first place after this patch.

N.B. adding COMPLEX_EXPR handling to tsubst_expr is complicated by the
fact that these COMPLEX_EXPRs are created by convert_to_complex (a
middle-end routine) which occasionally creates SAVE_EXPR sub trees which
we don't expect to see inside templated trees...
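
As a language-level reminder of the distinction the 'now' parameter is about, here is a minimal sketch (the function name twice is made up, and this is not the PR testcase). A local variable of a constexpr function is not usable in a constant expression inside the function body, even though a call to the function can be constant-evaluated as a whole:

```cpp
#include <cassert>

constexpr int twice (int n)
{
  int local = n;   // local variable of a constexpr function
  // Inside the body, 'local' is not usable in a constant expression:
  //   static_assert (local == n, "");   // would be rejected
  return local + local;
}

// The call as a whole can still be evaluated at compile time.
static_assert (twice (21) == 42, "evaluated during constant evaluation");

// The same function also works with a runtime argument.
inline void check_twice (int runtime_value)
{
  assert (twice (runtime_value) == 2 * runtime_value);
}
```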

> > 
> > [1]: The mechanism incorrectly assumes that instantiation of the
> > non-dependent expression shouldn't induce any template instantiation
> > since ahead of time checking of the expression should've already induced
> > whatever template instantiation was needed, but in this case although
> > overload resolution was performed ahead of time, a ck_rvalue conversion
> > gets resolved to a copy constructor call only at instantiation time.
> > 
> > PR c++/111703
> > 
> > gcc/cp/ChangeLog:
> > 
> > * constexpr.cc (potential_constant_expression_1) :
> > Only consider var_in_maybe_constexpr_fn if 'now' is false.
> > : Likewise.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp2a/concepts-fn8.C: New test.
> > ---
> >  gcc/cp/constexpr.cc   |  4 ++--
> >  gcc/testsuite/g++.dg/cpp2a/concepts-fn8.C | 24 +++
> >  2 files changed, 26 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-fn8.C
> > 
> > diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
> > index c05760e6789..8a6b210144a 100644
> > --- a/gcc/cp/constexpr.cc
> > +++ b/gcc/cp/constexpr.cc
> > @@ -9623,7 +9623,7 @@ potential_constant_expression_1 (tree t, bool 
> > want_rval, bool strict, bool now,
> >   return RECUR (DECL_VALUE_EXPR (t), rval);
> > }
> >if (want_rval
> > - && !var_in_maybe_constexpr_fn (t)
> > + && (now || !var_in_maybe_constexpr_fn (t))
> >   && !type_dependent_expression_p (t)
> >   && 

RE: [PATCH] AArch64: Cleanup memset expansion

2023-11-10 Thread Kyrylo Tkachov


> -Original Message-
> From: Richard Earnshaw 
> Sent: Friday, November 10, 2023 11:31 AM
> To: Wilco Dijkstra ; Kyrylo Tkachov
> ; GCC Patches 
> Cc: Richard Sandiford ; Richard Earnshaw
> 
> Subject: Re: [PATCH] AArch64: Cleanup memset expansion
> 
> 
> 
> On 10/11/2023 10:17, Wilco Dijkstra wrote:
> > Hi Kyrill,
> >
> >> +  /* Reduce the maximum size with -Os.  */
> >> +  if (optimize_function_for_size_p (cfun))
> >> +    max_set_size = 96;
> >> +
> >
> >>  This is a new "magic" number in this code. It looks sensible, but how
> did you arrive at it?
> >
> > We need 1 instruction to create the value to store (DUP or MOVI) and 1 STP
> > for every 32 bytes, so the 96 means 4 instructions for typical sizes
> > (sizes not
> > a multiple of 16 can add one extra instruction).

It would be useful to have that reasoning in the comment.

> >
> > I checked codesize on SPECINT2017, and 96 had practically identical size.
> > Using 128 would also be a reasonable Os value with a very slight size
> > increase,
> > and 384 looks good for O2 - however I didn't want to tune these values
> > as this
> > is a cleanup patch.
> >
> > Cheers,
> > Wilco
> 
> Shouldn't this be a param then?  Also, manifest constants in the middle
> of code are a potential nightmare, please move it to a #define (even if
> that's then used as the default value for the param).

I agree on making this a #define but I wouldn't insist on a param.
Code size IMO has a much more consistent right or wrong answer as it's 
statically determinable.
If this were a speed-related param then I'd expect the flexibility for the power 
user to override such heuristics would be more widely useful.
But for code size the compiler should always be able to get it right.

If Richard would still like the param then I'm fine with having the param, but 
I'd be okay with the comment above and making this a #define.
Thanks,
Kyrill


Re: PING^1 [PATCH v3] sched: Change no_real_insns_p to no_real_nondebug_insns_p [PR108273]

2023-11-10 Thread Alexander Monakov

On Fri, 10 Nov 2023, Richard Biener wrote:

> On Fri, Nov 10, 2023 at 3:18 PM Alexander Monakov  wrote:
> >
> >
> > On Fri, 10 Nov 2023, Richard Biener wrote:
> >
> > > > I'm afraid ignoring debug-only BBs goes contrary to overall 
> > > > var-tracking design:
> > > > DEBUG_INSNs participate in dependency graph so that schedulers can 
> > > > remove or
> > > > mutate them as needed when moving real insns across them.
> > >
> > > Note that debug-only BBs do not exist - the BB would be there even 
> > > without debug
> > > insns!
> >
> > Yep, sorry, I misspoke when I earlier said
> >
> > >> and cause divergence when passing through a debug-only BB which would 
> > >> not be
> > >> present at all without -g.
> >
> > They are present in the region, but skipped via no_real_insns_p.
> >
> > > So instead you have to handle BBs with just debug insns the same you
> > > handle a completely empty BB.
> >
> > Yeah. There would be no problem if the scheduler never used no_real_insns_p
> > and handled empty and non-empty BBs the same way.
> 
> And I suppose it would be OK to do that.  Empty BBs are usually removed by
> CFG cleanup so the situation should only happen in rare corner cases where
> the fix would be to actually run CFG cleanup ...

Yeah, sel-sched invokes 'cfg_cleanup (0)' up front, and I suppose that
may be a preferable compromise for sched-rgn as well.

I'm afraid one does not simply remove all uses of no_real_insns_p from
sched-rgn, but would be happy to be wrong about that.

Alexander
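
The -g divergence discussed in this thread can be shown with a toy model. This is a sketch only: the Kind enum and both predicates are invented stand-ins and do not use GCC's RTL API. A block that holds only a label plus a debug insn looks non-empty to a predicate that ignores just notes and labels, but looks empty once debug insns are ignored as well, matching the same block built without -g.

```cpp
#include <cassert>
#include <vector>

// Toy instruction kinds; stand-ins for notes, labels, debug insns and real insns.
enum class Kind { Note, Label, Debug, Real };
using Block = std::vector<Kind>;

// Toy analogue of no_real_insns_p: only notes and labels are ignored.
bool no_real_insns (const Block &bb)
{
  for (Kind k : bb)
    if (k != Kind::Note && k != Kind::Label)
      return false;
  return true;
}

// Toy analogue of the proposed no_real_nondebug_insns_p: debug insns are
// ignored as well, so the answer does not depend on -g.
bool no_real_nondebug_insns (const Block &bb)
{
  for (Kind k : bb)
    if (k == Kind::Real)
      return false;
  return true;
}

// Returns true when the two builds of the same block are classified alike.
bool same_with_and_without_g ()
{
  Block without_g{Kind::Label};
  Block with_g{Kind::Label, Kind::Debug};
  assert (no_real_insns (without_g) != no_real_insns (with_g));
  return no_real_nondebug_insns (without_g) == no_real_nondebug_insns (with_g);
}
```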

Re: [PATCH] tree-ssa-loop-ivopts : Add live analysis in regs used in decision making

2023-11-10 Thread Ajit Agarwal
Hello Richard:

On 10/11/23 7:29 pm, Richard Biener wrote:
> On Fri, Nov 10, 2023 at 7:42 AM Ajit Agarwal  wrote:
>>
>> Hello Richard:
>>
>>
>> On 09/11/23 6:21 pm, Richard Biener wrote:
>>> On Wed, Nov 8, 2023 at 4:00 PM Ajit Agarwal  wrote:

 tree-ssa-loop-ivopts : Add live analysis in regs used in decision making.

 Add live analysis in regs used calculation in decision making of
 selecting ivopts candidates.

 2023-11-08  Ajit Kumar Agarwal  

 gcc/ChangeLog:

 * tree-ssa-loop-ivopts.cc (get_regs_used): New function.
 (determine_set_costs): Call to get_regs_used to use live
 analysis.
 ---
  gcc/tree-ssa-loop-ivopts.cc | 73 +++--
  1 file changed, 70 insertions(+), 3 deletions(-)

 diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
 index c3336603778..e02fe7d434b 100644
 --- a/gcc/tree-ssa-loop-ivopts.cc
 +++ b/gcc/tree-ssa-loop-ivopts.cc
 @@ -6160,6 +6160,68 @@ ivopts_estimate_reg_pressure (struct ivopts_data 
 *data, unsigned n_invs,
return cost + n_cands;
  }

 +/* Return regs used based on live-in and liveout of given ssa variables.  
 */
>>>
>>> Please explain how the following code relates to anything like "live
>>> analysis" and
>>> where it uses live-in and live-out.  And what "live-in/out of a given
>>> SSA variable"
>>> should be.
>>>
>>> Also explain why you are doing this at all.  The patch doesn't come
>>> with a testcase
>>> or with any other hint that motivated you.
>>>
>>> Richard.
>>>
>>
>> The function get_regs_used increments regs_used based on live-in
>> and live-out analysis of the given SSA name. Instead of setting live-in and
>> live-out bitmaps, I increment regs_used.
>>
>> Below is how I identify live-in and live-out and increment the regs_used
>> variable:
>>
>> a) In the def_bb of the SSA name's defining gimple statement the name is
>> live-out, so regs_used is incremented.
>>
>> b) Visit each use of the SSA_NAME; if it isn't in the same block as the def,
>>  we identify the live-on-entry blocks and increment regs_used.
>>
>> The function below is a modification of set_var_live_on_entry in
>> tree-ssa-live.cc, which sets the live-out and live-in bitmaps of basic
>> blocks. Instead of setting a bitmap, regs_used is incremented.
> 
> It clearly doesn't work that way, and the number doesn't in any way relate to
> the number of registers used or register pressure.
> 
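
Read literally, steps (a) and (b) above amount to the following count. This is a toy sketch with an invented ToySsaName type, not the patch's code, and it assumes one increment per use outside the defining block since the truncated patch does not settle that. As Richard points out, the resulting number is a count of blocks touched rather than a number of registers.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for an SSA name: the block holding its definition (or -1 if
// there is none) and the block of each of its uses.
struct ToySsaName
{
  int def_bb;
  std::vector<int> use_bbs;
};

// Counts as described in (a) and (b): once for the defining block, and once
// for every use that sits in a different block.
unsigned toy_regs_used (const ToySsaName &name)
{
  unsigned count = 0;
  if (name.def_bb >= 0)
    ++count;                      // (a) treated as live-out of the def block
  for (int bb : name.use_bbs)
    if (bb != name.def_bb)
      ++count;                    // (b) treated as live-on-entry elsewhere
  return count;
}

// Example: defined in block 1, used in blocks 1, 2 and 3.
unsigned toy_example ()
{
  ToySsaName name{1, {1, 2, 3}};
  unsigned n = toy_regs_used (name);
  assert (n == 3);
  return n;
}
```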

I agree with you that regs_used is not actually the number of registers used;
it is calculated from live-in and live-out.

The decision making above uses the variable regs_used, which is not actually
related to registers used or register pressure.

My decision making is based on live-in and live-out instead of the actual
registers used.

I tried to keep the variable names in sync with those used in
ivopts_estimate_reg_pressure.

My logic changes the implementation of ivopts_estimate_reg_pressure to
consider live-in and live-out instead of the actual registers used. The idea
is to determine, from live-in and live-out across the regions, whether doing
ivopts increases or decreases register pressure. My calculation of register
pressure is based on live-in and live-out across the region affected by ivopts,
instead of computing the registers used from the number of iv candidates.

This is my notion of register pressure.

I can change the code to give the variables meaningful names that reflect the
decision making stated above.

>> I identify regs_used as the number of live-in and live-out of the given SSA
>> name.
>>
>> For each iv candidate SSA variable I identify regs_used and take the maximum
>> of regs_used over all the iv candidates; that value is used in the
>> ivopts_estimate_reg_pressure cost analysis.
>>
>> Motivation behind doing this opttks for FP and INT around 2% to 7%.
> 
> An interesting GIGO effect.

Why do you think it's a GIGO effect? The gains come from the decision making
on register pressure stated above.

Please elaborate if you think otherwise.

Thanks & Regards
Ajit
> 
>> Also, setting regs_used to the number of iv candidates is not an optimized
>> and robust way of decision making for ivopts, so I decide based on live-in
>> and live-out analysis, which is a more correct and appropriate way of
>> identifying regs_used.
>>
>> There are also no regressions when bootstrapped/regtested on
>> powerpc64-linux-gnu.
>>
>> Thanks & Regards
>> Ajit
>>
 +static unsigned
 +get_regs_used (tree ssa_name)
 +{
 +  unsigned regs_used = 0;
 +  gimple *stmt;
 +  use_operand_p use;
 +  basic_block def_bb = NULL;
 +  imm_use_iterator imm_iter;
 +
 +  stmt = SSA_NAME_DEF_STMT (ssa_name);
 +  if (stmt)
 +{
 +  def_bb = gimple_bb (stmt);
 +  /* Mark defs in liveout bitmap temporarily.  */
 +  if (def_bb)
 +   

Re: [PATCH 0/7] ira/lra: Support subreg coalesce

2023-11-10 Thread Jeff Law




On 11/10/23 03:39, Richard Sandiford wrote:

Lehua Ding  writes:

On 2023/11/10 18:16, Richard Sandiford wrote:

Lehua Ding  writes:

Hi Richard,

On 2023/11/8 17:40, Richard Sandiford wrote:

Tracking subreg liveness will sometimes expose dead code that
wasn't obvious without it.  PR89606 has an example of this.
There the dead code was introduced by init-regs, and there's a
debate about (a) whether init-regs should still be run and (b) if it
should still be run, whether it should use subreg liveness tracking too.

But I think such dead code is possible even without init-regs.
So for the purpose of this series, I think the init-regs behaviour
in that PR creates a helpful example.


Yes, I think the init-regs should be enhanced to reduce unnecessary
initialization. My previous internal patches did this in a separate
patch. Maybe I should split the live_subreg problem out of the second
patch and not couple it with these patches. That way it can be reviewed
separately.


But my point was that this kind of dead code is possible even without
init-regs.  So I think we should have something that removes the dead
code.  And we can try it on that PR (without changing init-regs).


Got it, so we should add a fast dead-code removal step after the init-regs pass.


I'm just not sure how fast it would be, given that it needs the subreg
liveness info.  Could it be done during RA itself, during one of the existing
instruction walks?  E.g. if IRA sees a dead instruction, it could remove it
rather than recording conflict information for it.

Yea, it's a real concern.  I haven't done the analysis yet, but I have a 
 sense that Joern's ext-dce work which Jivan and I are working on 
(which does sub-object liveness tracking) is having a compile-time 
impact as well.


Jeff


Re: PING^1 [PATCH v3] sched: Change no_real_insns_p to no_real_nondebug_insns_p [PR108273]

2023-11-10 Thread Richard Biener
On Fri, Nov 10, 2023 at 3:18 PM Alexander Monakov  wrote:
>
>
> On Fri, 10 Nov 2023, Richard Biener wrote:
>
> > > I'm afraid ignoring debug-only BBs goes contrary to overall var-tracking 
> > > design:
> > > DEBUG_INSNs participate in dependency graph so that schedulers can remove 
> > > or
> > > mutate them as needed when moving real insns across them.
> >
> > Note that debug-only BBs do not exist - the BB would be there even without 
> > debug
> > insns!
>
> Yep, sorry, I misspoke when I earlier said
>
> >> and cause divergence when passing through a debug-only BB which would not 
> >> be
> >> present at all without -g.
>
> They are present in the region, but skipped via no_real_insns_p.
>
> > So instead you have to handle BBs with just debug insns the same you
> > handle a completely empty BB.
>
> Yeah. There would be no problem if the scheduler never used no_real_insns_p
> and handled empty and non-empty BBs the same way.

And I suppose it would be OK to do that.  Empty BBs are usually removed by
CFG cleanup so the situation should only happen in rare corner cases where
the fix would be to actually run CFG cleanup ...

Richard.

> Alexander


Reply: Re: [PATCH v2] RISC-V: Fixbug for that XTheadMemPair causes interrupt to fail.

2023-11-10 Thread 马进(方耀)
I'm very sorry, I misunderstood. There's no difference between them, please 
ignore it.






 马进
Alibaba and Ant Group
 Phone: 057128223456-89384085
 Email: yaofang...@alibaba-inc.com
 Address: Zhejiang-Hangzhou-Xixi Zone B, B2-7-E6-090

Information Security Notice: The information contained in this mail is solely
the property of the sender's organization.
This mail communication is confidential. Recipients named above are obligated
to maintain secrecy and are not permitted to disclose the contents of this
communication to others. This notice applies only to work mail.
--
From: Kito Cheng
Date: 2023-11-10 22:04:26
To: Jin Ma
Cc: ; ;

Subject: Re: [PATCH v2] RISC-V: Fixbug for that XTheadMemPair causes interrupt to
fail.
I thought Christoph had already committed this? Would you mind describing the
difference between v1 and v2?
On Fri, Nov 10, 2023 at 9:55 PM Jin Ma  wrote:
The t0 register is used as a temporary register for interrupts, so it needs
 special treatment. It is necessary to avoid using "th.ldd" in the interrupt
 program to stop the subsequent operation of the t0 register, so they need to
 exchange positions in the function "riscv_for_each_saved_reg".

 gcc/ChangeLog:

 * config/riscv/riscv.cc (riscv_for_each_saved_reg): Place the interrupt
 operation before the XTheadMemPair.
 ---
  gcc/config/riscv/riscv.cc | 56 +--
  .../riscv/xtheadmempair-interrupt-fcsr.c  | 18 ++
  2 files changed, 46 insertions(+), 28 deletions(-)
  create mode 100644 
gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c

 diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
 index e25692b86fc..fa2d4d4b779 100644
 --- a/gcc/config/riscv/riscv.cc
 +++ b/gcc/config/riscv/riscv.cc
 @@ -6346,6 +6346,34 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, 
riscv_save_restore_fn fn,
   && riscv_is_eh_return_data_register (regno))
 continue;

 +  /* In an interrupt function, save and restore some necessary CSRs in 
the stack
 +to avoid changes in CSRs.  */
 +  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
 + && cfun->machine->interrupt_handler_p
 + && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
 + || (TARGET_ZFINX
 + && (cfun->machine->frame.mask & ~(1 << 
RISCV_PROLOGUE_TEMP_REGNUM)
 +   {
 + unsigned int fcsr_size = GET_MODE_SIZE (SImode);
 + if (!epilogue)
 +   {
 + riscv_save_restore_reg (word_mode, regno, offset, fn);
 + offset -= fcsr_size;
 + emit_insn (gen_riscv_frcsr (RISCV_PROLOGUE_TEMP (SImode)));
 + riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
 + offset, riscv_save_reg);
 +   }
 + else
 +   {
 + riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
 + offset - fcsr_size, riscv_restore_reg);
 + emit_insn (gen_riscv_fscsr (RISCV_PROLOGUE_TEMP (SImode)));
 + riscv_save_restore_reg (word_mode, regno, offset, fn);
 + offset -= fcsr_size;
 +   }
 + continue;
 +   }
 +
if (TARGET_XTHEADMEMPAIR)
 {
   /* Get the next reg/offset pair.  */
 @@ -6376,34 +6404,6 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, 
riscv_save_restore_fn fn,
 }
 }

 -  /* In an interrupt function, save and restore some necessary CSRs in 
the stack
 -to avoid changes in CSRs.  */
 -  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
 - && cfun->machine->interrupt_handler_p
 - && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
 - || (TARGET_ZFINX
 - && (cfun->machine->frame.mask & ~(1 << 
RISCV_PROLOGUE_TEMP_REGNUM)
 -   {
 - unsigned int fcsr_size = GET_MODE_SIZE (SImode);
 - if (!epilogue)
 -   {
 - riscv_save_restore_reg (word_mode, regno, offset, fn);
 - offset -= fcsr_size;
 - emit_insn (gen_riscv_frcsr (RISCV_PROLOGUE_TEMP (SImode)));
 - riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
 - offset, riscv_save_reg);
 -   }
 - else
 -   {
 - riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
 - offset - fcsr_size, riscv_restore_reg);
 - emit_insn (gen_riscv_fscsr (RISCV_PROLOGUE_TEMP (SImode)));
 - riscv_save_restore_reg (word_mode, regno, offset, fn);
 - offset -= fcsr_size;
 -   }
 - continue;
 -   }
 -
riscv_save_restore_reg (word_mode, regno, offset, fn);
  }

 diff --git 

Re: PING^1 [PATCH v3] sched: Change no_real_insns_p to no_real_nondebug_insns_p [PR108273]

2023-11-10 Thread Alexander Monakov


On Fri, 10 Nov 2023, Richard Biener wrote:

> > I'm afraid ignoring debug-only BBs goes contrary to overall var-tracking 
> > design:
> > DEBUG_INSNs participate in dependency graph so that schedulers can remove or
> > mutate them as needed when moving real insns across them.
> 
> Note that debug-only BBs do not exist - the BB would be there even without 
> debug
> insns!

Yep, sorry, I misspoke when I earlier said

>> and cause divergence when passing through a debug-only BB which would not be
>> present at all without -g.

They are present in the region, but skipped via no_real_insns_p.

> So instead you have to handle BBs with just debug insns the same you
> handle a completely empty BB.

Yeah. There would be no problem if the scheduler never used no_real_insns_p
and handled empty and non-empty BBs the same way.

Alexander


Re: [PATCH v2] RISC-V: Fixbug for that XTheadMemPair causes interrupt to fail.

2023-11-10 Thread Kito Cheng
I thought Christoph had already committed this? Do you mind describing the
difference between v1 and v2?

On Fri, Nov 10, 2023 at 9:55 PM Jin Ma  wrote:

> The t0 register is used as a temporary register for interrupts, so it needs
> special treatment. It is necessary to avoid using "th.ldd" in the interrupt
> program to stop the subsequent operation of the t0 register, so they need
> to
> exchange positions in the function "riscv_for_each_saved_reg".
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_for_each_saved_reg): Place the
> interrupt
> operation before the XTheadMemPair.
> ---
>  gcc/config/riscv/riscv.cc | 56 +--
>  .../riscv/xtheadmempair-interrupt-fcsr.c  | 18 ++
>  2 files changed, 46 insertions(+), 28 deletions(-)
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index e25692b86fc..fa2d4d4b779 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -6346,6 +6346,34 @@ riscv_for_each_saved_reg (poly_int64 sp_offset,
> riscv_save_restore_fn fn,
>   && riscv_is_eh_return_data_register (regno))
> continue;
>
> +  /* In an interrupt function, save and restore some necessary CSRs
> in the stack
> +to avoid changes in CSRs.  */
> +  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
> + && cfun->machine->interrupt_handler_p
> + && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
> + || (TARGET_ZFINX
> + && (cfun->machine->frame.mask & ~(1 <<
> RISCV_PROLOGUE_TEMP_REGNUM)
> +   {
> + unsigned int fcsr_size = GET_MODE_SIZE (SImode);
> + if (!epilogue)
> +   {
> + riscv_save_restore_reg (word_mode, regno, offset, fn);
> + offset -= fcsr_size;
> + emit_insn (gen_riscv_frcsr (RISCV_PROLOGUE_TEMP (SImode)));
> + riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> + offset, riscv_save_reg);
> +   }
> + else
> +   {
> + riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> + offset - fcsr_size,
> riscv_restore_reg);
> + emit_insn (gen_riscv_fscsr (RISCV_PROLOGUE_TEMP (SImode)));
> + riscv_save_restore_reg (word_mode, regno, offset, fn);
> + offset -= fcsr_size;
> +   }
> + continue;
> +   }
> +
>if (TARGET_XTHEADMEMPAIR)
> {
>   /* Get the next reg/offset pair.  */
> @@ -6376,34 +6404,6 @@ riscv_for_each_saved_reg (poly_int64 sp_offset,
> riscv_save_restore_fn fn,
> }
> }
>
> -  /* In an interrupt function, save and restore some necessary CSRs
> in the stack
> -to avoid changes in CSRs.  */
> -  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
> - && cfun->machine->interrupt_handler_p
> - && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
> - || (TARGET_ZFINX
> - && (cfun->machine->frame.mask & ~(1 <<
> RISCV_PROLOGUE_TEMP_REGNUM)
> -   {
> - unsigned int fcsr_size = GET_MODE_SIZE (SImode);
> - if (!epilogue)
> -   {
> - riscv_save_restore_reg (word_mode, regno, offset, fn);
> - offset -= fcsr_size;
> - emit_insn (gen_riscv_frcsr (RISCV_PROLOGUE_TEMP (SImode)));
> - riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> - offset, riscv_save_reg);
> -   }
> - else
> -   {
> - riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> - offset - fcsr_size,
> riscv_restore_reg);
> - emit_insn (gen_riscv_fscsr (RISCV_PROLOGUE_TEMP (SImode)));
> - riscv_save_restore_reg (word_mode, regno, offset, fn);
> - offset -= fcsr_size;
> -   }
> - continue;
> -   }
> -
>riscv_save_restore_reg (word_mode, regno, offset, fn);
>  }
>
> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
> b/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
> new file mode 100644
> index 000..d06f05f5c7c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
> @@ -0,0 +1,18 @@
> +/* Verify that fcsr instructions emitted.  */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target hard_float } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-g" "-Oz" "-Os" "-flto" } } */
> +/* { dg-options "-march=rv64gc_xtheadmempair -mtune=thead-c906
> -funwind-tables" { target { rv64 } } } */
> +/* { dg-options "-march=rv32gc_xtheadmempair -mtune=thead-c906
> -funwind-tables" { target { rv32 } } } */
> +
> +
> +extern int foo (void);
> +
> +void __attribute__ ((interrupt))
> +sub (void)
> +{
> +  foo ();

Re: [PATCH] tree-ssa-loop-ivopts : Add live analysis in regs used in decision making

2023-11-10 Thread Richard Biener
On Fri, Nov 10, 2023 at 7:42 AM Ajit Agarwal  wrote:
>
> Hello Richard:
>
>
> On 09/11/23 6:21 pm, Richard Biener wrote:
> > On Wed, Nov 8, 2023 at 4:00 PM Ajit Agarwal  wrote:
> >>
> >> tree-ssa-loop-ivopts : Add live analysis in regs used in decision making.
> >>
> >> Add live analysis in regs used calculation in decision making of
> >> selecting ivopts candidates.
> >>
> >> 2023-11-08  Ajit Kumar Agarwal  
> >>
> >> gcc/ChangeLog:
> >>
> >> * tree-ssa-loop-ivopts.cc (get_regs_used): New function.
> >> (determine_set_costs): Call to get_regs_used to use live
> >> analysis.
> >> ---
> >>  gcc/tree-ssa-loop-ivopts.cc | 73 +++--
> >>  1 file changed, 70 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
> >> index c3336603778..e02fe7d434b 100644
> >> --- a/gcc/tree-ssa-loop-ivopts.cc
> >> +++ b/gcc/tree-ssa-loop-ivopts.cc
> >> @@ -6160,6 +6160,68 @@ ivopts_estimate_reg_pressure (struct ivopts_data 
> >> *data, unsigned n_invs,
> >>return cost + n_cands;
> >>  }
> >>
> >> +/* Return regs used based on live-in and liveout of given ssa variables.  
> >> */
> >
> > Please explain how the following code relates to anything like "live
> > analysis" and
> > where it uses live-in and live-out.  And what "live-in/out of a given
> > SSA variable"
> > should be.
> >
> > Also explain why you are doing this at all.  The patch doesn't come
> > with a testcase
> > or with any other hint that motivated you.
> >
> > Richard.
> >
>
> The function get_regs_used increments regs_used based on live-in
> and live-out analysis of the given SSA name. Instead of setting the live-in
> and live-out bitmaps, I increment regs_used.
>
> Below is how I identify live-in and live-out and increment the regs_used
> variable:
>
> a) If the defining gimple statement of the SSA name has a def_bb, the name
> is treated as live-out there and regs_used is incremented.
>
> b) Visit each use of SSA_NAME and, if it isn't in the same block as the def,
>  identify the live-on-entry blocks and increment regs_used.
>
> The function below is a modification of set_var_live_on_entry from
> tree-ssa-live.cc, which sets the live-out and live-in bitmaps of basic
> blocks. Instead of setting bitmaps, regs_used is incremented.

It clearly doesn't work that way, and the number doesn't in any way relate to
the number of registers used or register pressure.

> I identify regs_used as the number of live-in and live-out occurrences of a
> given SSA name.
>
> For each IV candidate's SSA variables I compute regs_used and take the
> maximum over all IV candidates; that value is then used in the
> ivopts_estimate_reg_pressure cost analysis.
>
> The motivation for this optimization is that I see performance improvements
> of around 2% to 7% on several SPEC CPU 2017 FP and INT benchmarks.

An interesting GIGO effect.

> Also, setting regs_used to the number of IV candidates is not an optimized
> or robust basis for ivopts decisions, so I base it on live-in and live-out
> analysis, which is a more correct and appropriate way of identifying
> regs_used.
>
> There are also no regressions; bootstrapped/regtested on
> powerpc64-linux-gnu.
>
> Thanks & Regards
> Ajit
>
> >> +static unsigned
> >> +get_regs_used (tree ssa_name)
> >> +{
> >> +  unsigned regs_used = 0;
> >> +  gimple *stmt;
> >> +  use_operand_p use;
> >> +  basic_block def_bb = NULL;
> >> +  imm_use_iterator imm_iter;
> >> +
> >> +  stmt = SSA_NAME_DEF_STMT (ssa_name);
> >> +  if (stmt)
> >> +{
> >> +  def_bb = gimple_bb (stmt);
> >> +  /* Mark defs in liveout bitmap temporarily.  */
> >> +  if (def_bb)
> >> +   regs_used++;
> >> +}
> >> +  else
> >> +def_bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
> >> +
> >> +  /* An undefined local variable does not need to be very alive.  */
> >> +  if (virtual_operand_p (ssa_name)
> >> +  || ssa_undefined_value_p (ssa_name, false))
> >> +return 0;
> >> +
> >> +  /* Visit each use of SSA_NAME and if it isn't in the same block as the 
> >> def,
> >> + add it to the list of live on entry blocks.  */
> >> +  FOR_EACH_IMM_USE_FAST (use, imm_iter, ssa_name)
> >> +{
> >> +  gimple *use_stmt = USE_STMT (use);
> >> +  basic_block add_block = NULL;
> >> +
> >> +  if (gimple_code (use_stmt) == GIMPLE_PHI)
> >> +   {
> >> + /* Uses in PHI's are considered to be live at exit of the SRC 
> >> block
> >> +as this is where a copy would be inserted.  Check to see if 
> >> it is
> >> +defined in that block, or whether its live on entry.  */
> >> + int index = PHI_ARG_INDEX_FROM_USE (use);
> >> + edge e = gimple_phi_arg_edge (as_a  (use_stmt), index);
> >> + if (e->src != def_bb)
> >> +   add_block = e->src;
> >> +   }
> >> +  else if (is_gimple_debug (use_stmt))
> >> +   continue;
> >> +  else
> >> +   {
> >> + /* If its not 

[PATCH v2] RISC-V: Fixbug for that XTheadMemPair causes interrupt to fail.

2023-11-10 Thread Jin Ma
The t0 register is used as a temporary register for interrupts, so it needs
special treatment. It is necessary to avoid using "th.ldd" in the interrupt
program to stop the subsequent operation of the t0 register, so they need to
exchange positions in the function "riscv_for_each_saved_reg".

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_for_each_saved_reg): Place the interrupt
operation before the XTheadMemPair.
---
 gcc/config/riscv/riscv.cc | 56 +--
 .../riscv/xtheadmempair-interrupt-fcsr.c  | 18 ++
 2 files changed, 46 insertions(+), 28 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index e25692b86fc..fa2d4d4b779 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -6346,6 +6346,34 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, 
riscv_save_restore_fn fn,
  && riscv_is_eh_return_data_register (regno))
continue;
 
+  /* In an interrupt function, save and restore some necessary CSRs in the 
stack
+to avoid changes in CSRs.  */
+  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
+ && cfun->machine->interrupt_handler_p
+ && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
+ || (TARGET_ZFINX
+ && (cfun->machine->frame.mask & ~(1 << 
RISCV_PROLOGUE_TEMP_REGNUM)
+   {
+ unsigned int fcsr_size = GET_MODE_SIZE (SImode);
+ if (!epilogue)
+   {
+ riscv_save_restore_reg (word_mode, regno, offset, fn);
+ offset -= fcsr_size;
+ emit_insn (gen_riscv_frcsr (RISCV_PROLOGUE_TEMP (SImode)));
+ riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
+ offset, riscv_save_reg);
+   }
+ else
+   {
+ riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
+ offset - fcsr_size, riscv_restore_reg);
+ emit_insn (gen_riscv_fscsr (RISCV_PROLOGUE_TEMP (SImode)));
+ riscv_save_restore_reg (word_mode, regno, offset, fn);
+ offset -= fcsr_size;
+   }
+ continue;
+   }
+
   if (TARGET_XTHEADMEMPAIR)
{
  /* Get the next reg/offset pair.  */
@@ -6376,34 +6404,6 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, 
riscv_save_restore_fn fn,
}
}
 
-  /* In an interrupt function, save and restore some necessary CSRs in the 
stack
-to avoid changes in CSRs.  */
-  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
- && cfun->machine->interrupt_handler_p
- && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
- || (TARGET_ZFINX
- && (cfun->machine->frame.mask & ~(1 << 
RISCV_PROLOGUE_TEMP_REGNUM)
-   {
- unsigned int fcsr_size = GET_MODE_SIZE (SImode);
- if (!epilogue)
-   {
- riscv_save_restore_reg (word_mode, regno, offset, fn);
- offset -= fcsr_size;
- emit_insn (gen_riscv_frcsr (RISCV_PROLOGUE_TEMP (SImode)));
- riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
- offset, riscv_save_reg);
-   }
- else
-   {
- riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
- offset - fcsr_size, riscv_restore_reg);
- emit_insn (gen_riscv_fscsr (RISCV_PROLOGUE_TEMP (SImode)));
- riscv_save_restore_reg (word_mode, regno, offset, fn);
- offset -= fcsr_size;
-   }
- continue;
-   }
-
   riscv_save_restore_reg (word_mode, regno, offset, fn);
 }
 
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c 
b/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
new file mode 100644
index 000..d06f05f5c7c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
@@ -0,0 +1,18 @@
+/* Verify that fcsr instructions emitted.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target hard_float } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-g" "-Oz" "-Os" "-flto" } } */
+/* { dg-options "-march=rv64gc_xtheadmempair -mtune=thead-c906 
-funwind-tables" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_xtheadmempair -mtune=thead-c906 
-funwind-tables" { target { rv32 } } } */
+
+
+extern int foo (void);
+
+void __attribute__ ((interrupt))
+sub (void)
+{
+  foo ();
+}
+
+/* { dg-final { scan-assembler-times "frcsr\t" 1 } } */
+/* { dg-final { scan-assembler-times "fscsr\t" 1 } } */

base-commit: e7f4040d9d6ec40c48ada940168885d7dde03af9
-- 
2.17.1
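
The effect of the block order in riscv_for_each_saved_reg can be sketched with a small stand-alone model. This is only an illustration under stated assumptions, not the GCC code: the register list, count_special_hits and the pairing rule (any two consecutive saved registers get paired, and the loop then moves on) are hypothetical simplifications. It assumes, as the v1 subject line suggests, that once the pairing branch has consumed the temporary register, the fcsr handling for that register is never reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: t0 is x5; the saved-register list is made up.  */
enum { TEMP_REG = 5 };
static const int saved_regs[] = { 5, 6, 7, 28 };
#define N_SAVED (sizeof saved_regs / sizeof saved_regs[0])

/* Count how often the special (fcsr) handling for TEMP_REG is reached.
   special_first selects the block order; pairing models XTheadMemPair.  */
static int
count_special_hits (bool special_first, bool pairing)
{
  int hits = 0;
  for (size_t i = 0; i < N_SAVED; i++)
    {
      int regno = saved_regs[i];

      /* New order: the interrupt-specific block runs first.  */
      if (special_first && regno == TEMP_REG)
        {
          hits++;
          continue;
        }

      /* Pairing block: consume this register and the next one.  */
      if (pairing && i + 1 < N_SAVED)
        {
          i++;
          continue;
        }

      /* Old order: the interrupt-specific block runs last.  */
      if (!special_first && regno == TEMP_REG)
        {
          hits++;
          continue;
        }

      /* A plain single-register save/restore would go here.  */
    }
  return hits;
}
```

In this model, count_special_hits (false, true) never reaches the special handling, while count_special_hits (true, true) reaches it once, which mirrors why the two blocks are swapped.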



Re: [PATCH] Simplify vector ((VCE?(a cmp b ? -1 : 0)) < 0) ? c : d to just (VCE:a cmp VCE:b) ? c : d.

2023-11-10 Thread Richard Biener
On Fri, Nov 10, 2023 at 2:52 AM liuhongt  wrote:
>
> While working on PR112443, I noticed some missed optimizations: after we
> fold _mm{,256}_blendv_epi8/pd/ps into gimple, the backend fails to combine it
> back to v{,p}blendv{b,ps,pd} since the pattern is too complicated, so I think
> maybe we should handle it at the gimple level.
>
> The dump is like
>
>   _1 = c_3(D) >= { 0, 0, 0, 0 };
>   _2 = VEC_COND_EXPR <_1, { -1, -1, -1, -1 }, { 0, 0, 0, 0 }>;
>   _7 = VIEW_CONVERT_EXPR(_2);
>   _8 = VIEW_CONVERT_EXPR(b_6(D));
>   _9 = VIEW_CONVERT_EXPR(a_5(D));
>   _10 = _7 < { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
>   _11 = VEC_COND_EXPR <_10, _8, _9>;
>
>
> It can be optimized to
>
>   _6 = VIEW_CONVERT_EXPR(b_4(D));
>   _7 = VIEW_CONVERT_EXPR(a_3(D));
>   _10 = VIEW_CONVERT_EXPR(c_1(D));
>   _5 = _10 >= { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
>   _8 = VEC_COND_EXPR <_5, _6, _7>;
>   _9 = VIEW_CONVERT_EXPR<__m256i>(_8);
>
> since _7 is either -1 or 0, _7 < 0 is equal to _1 = c_3(D) >= { 0, 0, 0, 0 };
> The patch adds a gimple pattern to handle that.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> Ok for trunk?
>
> gcc/ChangeLog:
>
> * match.pd (VCE:(a cmp b ? -1 : 0) < 0) ? c : d ---> (VCE:a cmp
> VCE:b) ? c : d): New gimple simplification.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/avx512vl-blendv-3.c: New test.
> * gcc.target/i386/blendv-3.c: New test.
> ---
>  gcc/match.pd  | 17 +++
>  .../gcc.target/i386/avx512vl-blendv-3.c   |  6 +++
>  gcc/testsuite/gcc.target/i386/blendv-3.c  | 46 +++
>  3 files changed, 69 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512vl-blendv-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/blendv-3.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index dbc811b2b38..e6f9c4fa1fd 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -5170,6 +5170,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (if (optimize_vectors_before_lowering_p () && types_match (@0, @3))
>(vec_cond (bit_and @0 (bit_not @3)) @2 @1)))

Would be nice to have a comment here.

> +(for cmp (simple_comparison)
> + (simplify
> +  (vec_cond
> +(lt@4 (view_convert?@5 (vec_cond (cmp @0 @1)
> +integer_all_onesp
> +integer_zerop))
> + integer_zerop) @2 @3)
> +  (if (VECTOR_INTEGER_TYPE_P (TREE_TYPE (@0))
> +   && VECTOR_INTEGER_TYPE_P (TREE_TYPE (@5))
> +   && TYPE_SIGN (TREE_TYPE (@0)) == TYPE_SIGN (TREE_TYPE (@5))
> +   && VECTOR_TYPE_P (type))
> +   (with {
> +  tree itype = TREE_TYPE (@5);
> +  tree vbtype = TREE_TYPE (@4);}
> + (vec_cond (cmp:vbtype (view_convert:itype @0)
> +  (view_convert:itype @1)) @2 @3)

It looks like the outer vec_cond isn't actually relevant to the simplification?

 (lt (view_convert? (vec_cond (cmp @0 @1) integer_all_onesp
integer_zerop)) integer_zerop)

is the relevant part?  I wonder what canonicalizes the inner vec_cond?
 Did you ever see
the (view_convert ... missing?
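
The identity behind the (lt (view_convert? (vec_cond ...)) integer_zerop) part can be checked with a small scalar sketch. It is only a model under assumptions: plain arrays stand in for vectors, memcpy stands in for VIEW_CONVERT_EXPR, and byte_mask_matches is a hypothetical helper. It checks only that a -1/0 mask viewed at byte width is negative exactly where the original lane condition held; it says nothing about comparing the view-converted operands themselves.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { LANES = 4, BYTES = LANES * 4 };

/* Return 1 if, for every byte of the mask (a[i] >= 0 ? -1 : 0) viewed as
   signed bytes, "byte < 0" agrees with "a[i] >= 0" of the enclosing lane.  */
static int
byte_mask_matches (const int32_t a[LANES])
{
  int32_t mask[LANES];
  int8_t bytes[BYTES];

  /* Model of VEC_COND_EXPR <a >= 0, { -1, ... }, { 0, ... }>.  */
  for (int i = 0; i < LANES; i++)
    mask[i] = a[i] >= 0 ? -1 : 0;

  /* Model of VIEW_CONVERT_EXPR from 32-bit lanes to 8-bit lanes.  */
  memcpy (bytes, mask, sizeof bytes);

  for (int b = 0; b < BYTES; b++)
    if ((bytes[b] < 0) != (a[b / 4] >= 0))
      return 0;
  return 1;
}
```

Because every byte of an all-ones lane is itself all-ones, the byte-wise "< 0" test reproduces the lane condition for any input, including 0 and INT32_MIN.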

> +
>  /* c1 ? c2 ? a : b : b  -->  (c1 & c2) ? a : b  */
>  (simplify
>   (vec_cond @0 (vec_cond:s @1 @2 @3) @3)
> diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-blendv-3.c 
> b/gcc/testsuite/gcc.target/i386/avx512vl-blendv-3.c
> new file mode 100644
> index 000..2777e72ab5f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512vl-blendv-3.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx512vl -mavx512bw -O2" } */
> +/* { dg-final { scan-assembler-times {vp?blendv(?:b|p[sd])[ \t]*} 6 } } */
> +/* { dg-final { scan-assembler-not {vpcmp} } } */
> +
> +#include "blendv-3.c"
> diff --git a/gcc/testsuite/gcc.target/i386/blendv-3.c 
> b/gcc/testsuite/gcc.target/i386/blendv-3.c
> new file mode 100644
> index 000..fa0fb067a73
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/blendv-3.c
> @@ -0,0 +1,46 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx2 -O2" } */
> +/* { dg-final { scan-assembler-times {vp?blendv(?:b|p[sd])[ \t]*} 6 } } */
> +/* { dg-final { scan-assembler-not {vpcmp} } } */
> +
> +#include 
> +
> +__m256i
> +foo (__m256i a, __m256i b, __m256i c)
> +{
> +  return _mm256_blendv_epi8 (a, b, ~c < 0);
> +}
> +
> +__m256d
> +foo1 (__m256d a, __m256d b, __m256i c)
> +{
> +  __m256i d = ~c < 0;
> +  return _mm256_blendv_pd (a, b, (__m256d)d);
> +}
> +
> +__m256
> +foo2 (__m256 a, __m256 b, __m256i c)
> +{
> +  __m256i d = ~c < 0;
> +  return _mm256_blendv_ps (a, b, (__m256)d);
> +}
> +
> +__m128i
> +foo4 (__m128i a, __m128i b, __m128i c)
> +{
> +  return _mm_blendv_epi8 (a, b, ~c < 0);
> +}
> +
> +__m128d
> +foo5 (__m128d a, __m128d b, __m128i c)
> +{
> +  __m128i d = ~c < 0;
> +  return _mm_blendv_pd (a, b, (__m128d)d);
> +}
> +
> +__m128
> +foo6 

Re: [PATCH, expand] Call misaligned memory reference in expand_builtin_return [PR112417]

2023-11-10 Thread Richard Biener
On Fri, Nov 10, 2023 at 11:10 AM HAO CHEN GUI  wrote:
>
> Hi Richard,
>
> On 2023/11/10 17:06, Richard Biener wrote:
> > On Fri, Nov 10, 2023 at 8:52 AM HAO CHEN GUI  wrote:
> >>
> >> Hi Richard,
> >>   Thanks so much for your comments.
> >>
> >> On 2023/11/9 19:41, Richard Biener wrote:
> >>> I'm not sure if the testcase is valid though?
> >>>
> >>> @defbuiltin{{void} __builtin_return (void *@var{result})}
> >>> This built-in function returns the value described by @var{result} from
> >>> the containing function.  You should specify, for @var{result}, a value
> >>> returned by @code{__builtin_apply}.
> >>> @enddefbuiltin
> >>>
> >>> I don't see __builtin_apply being used here?
> >>
> >> The prototype of the test case is from "__objc_block_forward" in
> >> libobjc/sendmsg.c.
> >>
> >>   void *args, *res;
> >>
> >>   args = __builtin_apply_args ();
> >>   res = __objc_forward (rcv, op, args);
> >>   if (res)
> >> __builtin_return (res);
> >>   else
> >> ...
> >>
> >> The __builtin_apply_args puts the return values on stack by the alignment.
> >> But the forward function can do anything and return a void* pointer.
> >> IMHO the alignment might be broken. So I just simplified it to use a
> >> void* pointer as the input argument of  "__builtin_return" and skip
> >> "__builtin_apply_args".
> >
> > But doesn't __objc_forward then break the contract between
> > __builtin_apply_args and __builtin_return?
> >
> > That said, __builtin_return is a very special function, it's not supposed
> > to deal with what you are fixing.  At least I think so.
> >
> > IMHO the bug is in __objc_block_forward.
>
> If so, can we document that the memory objects pointed by input argument of
> __builtin_return have to be aligned? Then we can force the alignment in
> __builtin_return. The customer function can do anything if gcc doesn't state
> that.

I don't think they have to be aligned - they have to adhere to the ABI
which __builtin_apply_args ensures.  But others might know more details
here.

> Thanks
> Gui Haochen
>
> >
> > Richard.
> >
> >>
> >> Thanks
> >> Gui Haochen


Re: [PATCH V2] Middle-end: Fix bug of induction variable vectorization for RVV

2023-11-10 Thread Richard Biener
On Fri, 10 Nov 2023, Juzhe-Zhong wrote:

> PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112438
> 
> 1. The SELECT_VL result is not necessarily VF in a non-final iteration.
> 
> The current GIMPLE IR is wrong:
> 
> # vect_vec_iv_.8_22 = PHI <_21(4), { 0, 1, 2, ... }(3)>
> ...
> _35 = .SELECT_VL (ivtmp_33, VF);
> _21 = vect_vec_iv_.8_22 + { VF, ... };
> 
> E.g. consider total iterations N = 6 and VF = 4.
> The SELECT_VL output is not guaranteed to be VF in a non-final iteration;
> it depends on the hardware implementation.
> 
> Suppose we have an RVV CPU core whose vsetvl distributes the workload evenly.
> It may process 3 elements in the 1st iteration and 3 elements in the last 
> iteration.
> Then the induction variable update here: _21 = vect_vec_iv_.8_22 + { POLY_INT_CST 
> [4, 4], ... }; 
> is wrong: it adds VF, which is 4, although we didn't actually process 4 
> elements.
> 
> It should add 3, which is the result of SELECT_VL.
> So, here the correct IR should be:
> 
>   _36 = .SELECT_VL (ivtmp_34, VF);
>   _22 = (int) _36;
>   vect_cst__21 = [vec_duplicate_expr] _22;
> 
> 2. This issue only happens for non-SLP vectorization with a single rgroup, since:
>
>  if (LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
> {
>   tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
>   if (direct_internal_fn_supported_p (IFN_SELECT_VL, iv_type,
> OPTIMIZE_FOR_SPEED)
> && LOOP_VINFO_LENS (loop_vinfo).length () == 1
> && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1 && !slp
> && (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> || !LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant ()))
>   LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo) = true;
> }
> 
> 3. This issue doesn't appear in nested loops, regardless of whether 
> LOOP_VINFO_USING_SELECT_VL_P is true or false.
> 
> Since:
> 
>   # vect_vec_iv_.6_5 = PHI <_19(3), { 0, ... }(5)>
>   # vect_diff_15.7_20 = PHI 
>   _19 = vect_vec_iv_.6_5 + { 1, ... };
>   vect_diff_9.8_22 = .COND_LEN_ADD ({ -1, ... }, vect_vec_iv_.6_5, 
> vect_diff_15.7_20, vect_diff_15.7_20, _28, 0);
>   ivtmp_1 = ivtmp_4 + 4294967295;
>   
>[local count: 6549826]:
>   # vect_diff_18.5_11 = PHI 
>   # ivtmp_26 = PHI 
>   _28 = .SELECT_VL (ivtmp_26, POLY_INT_CST [4, 4]);
>   goto ; [100.00%]
> 
> Note the induction variable IR: _21 = vect_vec_iv_.8_22 + { POLY_INT_CST [4, 
> 4], ... }; it updates the induction variable independently of VF (i.e. it 
> doesn't care how many elements are processed in the iteration).
> 
> The update is loop invariant, so it won't be a problem even if 
> LOOP_VINFO_USING_SELECT_VL_P is true.
>
> Testing passed. OK for trunk?

OK.

Richard.
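
The N = 6, VF = 4 case from point 1 can be replayed with a scalar sketch. It rests on assumptions: select_vl is a hypothetical stand-in for .SELECT_VL that splits a remainder between VF and 2*VF evenly (one behaviour the description above allows), and run_iv is an illustrative helper, not vectorizer code.

```c
#include <assert.h>

enum { VF = 4 };

/* Hypothetical .SELECT_VL: a remainder in (VF, 2*VF) is split evenly.  */
static int
select_vl (int remaining)
{
  if (remaining >= 2 * VF)
    return VF;
  if (remaining > VF)
    return (remaining + 1) / 2;
  return remaining;
}

/* Record in out[i] the induction value seen by scalar iteration i.
   step_by_vf != 0 models the wrong update (add VF); step_by_vf == 0
   models the fixed update (add the SELECT_VL result).  */
static void
run_iv (int n, int step_by_vf, int *out)
{
  int base = 0;   /* first lane of the vector IV { base, base + 1, ... } */
  int done = 0;   /* scalar iterations processed so far */

  while (done < n)
    {
      int vl = select_vl (n - done);
      for (int lane = 0; lane < vl; lane++)
        out[done + lane] = base + lane;
      base += step_by_vf ? VF : vl;
      done += vl;
    }
}
```

With n = 6 both variants process 3 + 3 elements. Stepping by VF makes the second vector iteration see 4, 5, 6 instead of 3, 4, 5, whereas stepping by the SELECT_VL result keeps out[i] == i.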

>   PR tree-optimization/112438
> 
> gcc/ChangeLog:
> 
>   * tree-vect-loop.cc (vectorizable_induction):
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/rvv/autovec/pr112438.c: New test.
> 
> ---
>  .../gcc.target/riscv/rvv/autovec/pr112438.c   | 33 +++
>  gcc/tree-vect-loop.cc | 30 -
>  2 files changed, 62 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c
> 
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c
> new file mode 100644
> index 000..51f90df38a0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model 
> -ffast-math -fdump-tree-optimized-details" } */
> +
> +void
> +foo (int n, int *__restrict in, int *__restrict out)
> +{
> +  for (int i = 0; i < n; i += 1)
> +{
> +  out[i] = in[i] + i;
> +}
> +}
> +
> +void
> +foo2 (int n, float * __restrict in, 
> +float * __restrict out)
> +{
> +  for (int i = 0; i < n; i += 1)
> +{
> +  out[i] = in[i] + i;
> +}
> +}
> +
> +void
> +foo3 (int n, float * __restrict in, 
> +float * __restrict out, float x)
> +{
> +  for (int i = 0; i < n; i += 1)
> +{
> +  out[i] = in[i] + i* i;
> +}
> +}
> +
> +/* We don't want to see vect_vec_iv_.21_25 + { POLY_INT_CST [4, 4], ... }.  
> */
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 8abc1937d74..b152072c969 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -10306,10 +10306,36 @@ vectorizable_induction (loop_vec_info loop_vinfo,
>  
>  
>/* Create the vector that holds the step of the induction.  */
> +  gimple_stmt_iterator *step_iv_si = NULL;
>if (nested_in_vect_loop)
>  /* iv_loop is nested in the loop to be vectorized. Generate:
> vec_step = [S, S, S, S]  */
>  new_name = step_expr;
> +  else if (LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo))
> +{
> +  /* When we're using loop_len produced by SELEC_VL, the non-final
> +  iterations are not always processing VF 

Re: PING^1 [PATCH v3] sched: Change no_real_insns_p to no_real_nondebug_insns_p [PR108273]

2023-11-10 Thread Richard Biener
On Fri, Nov 10, 2023 at 12:25 PM Alexander Monakov  wrote:
>
>
> On Thu, 9 Nov 2023, Jeff Law wrote:
>
> > > Yeah, I noticed that the scheduler takes care of DEBUG_INSNs as normal
> > > operations.  When I started to work on this issue, initially I wanted to 
> > > try
> > > something similar to your idea #2, but when checking the APIs, I realized
> > > why not just skip the basic block with NOTEs and LABELs, DEBUG_INSNs as
> > > well.  IMHO there is no value to try to schedule this kind of BB (to be
> > > scheduled range), skipping it can save some resource allocation (like 
> > > block
> > > dependencies) and make it more efficient (not enter function 
> > > schedule_block
> > > etc.), from this perspective it seems an enhancement.  Does it sound
> > > reasonable to you?
> > It sounds reasonable, but only if doing so doesn't add significant
> > implementation complexity.  ie, the gains from doing less work here are 
> > likely
> > to be very marginal, so I'm more interested in clean, easy to maintain code.
>
> I'm afraid ignoring debug-only BBs goes contrary to overall var-tracking 
> design:
> DEBUG_INSNs participate in dependency graph so that schedulers can remove or
> mutate them as needed when moving real insns across them.

Note that debug-only BBs do not exist - the BB would be there even without debug
insns!  So instead you have to handle BBs with just debug insns the same you
handle a completely empty BB.

> Cc'ing Alexandre Oliva who can correct me on that if necessary.
>
> Alexander


Re: [PATCH] Handle constant CONSTRUCTORs in operand_compare

2023-11-10 Thread Richard Biener
On Fri, Nov 10, 2023 at 12:17 PM Eric Botcazou  wrote:
>
> Hi,
>
> this teaches operand_compare to compare constant CONSTRUCTORs, which is quite
> helpful for so-called fat pointers in Ada, i.e. objects that are semantically
> pointers but are represented by structures made up of two pointers.  This is
> modeled on the implementation present in the ICF pass.
>
> Bootstrapped/regtested on x86-64/Linux, OK for the mainline?

OK.

>
> 2023-11-10  Eric Botcazou  
>
> * fold-const.cc (operand_compare::operand_equal_p) :
> Deal with nonempty constant CONSTRUCTORs.
> (operand_compare::hash_operand) : Hash DECL_FIELD_OFFSET
> and DECL_FIELD_BIT_OFFSET for FIELD_DECLs.
>
>
> 2023-11-10  Eric Botcazou  
>
> * gnat.dg/opt103.ads, gnat.dg/opt103.adb: New test.
>
> --
> Eric Botcazou


Re: [PATCH] RISC-V: Fix bug that XTheadMemPair extension caused fcsr not to be saved and restored before and after interrupt.

2023-11-10 Thread Christoph Müllner
On Fri, Nov 10, 2023 at 2:20 PM Kito Cheng  wrote:
>
> LGTM

Committed after shortening the commit message's heading.

>
> On Fri, Nov 10, 2023 at 20:55, Christoph Müllner wrote:
>>
>> On Fri, Nov 10, 2023 at 8:14 AM Jin Ma  wrote:
>> >
>> > The t0 register is used as a temporary register for interrupts, so it needs
>> > special treatment. It is necessary to avoid using "th.ldd" in the interrupt
>> > program to stop the subsequent operation of the t0 register, so they need 
>> > to
>> > exchange positions in the function "riscv_for_each_saved_reg".
>>
>> RISCV_PROLOGUE_TEMP_REGNUM needs indeed to be treated special
>> in case of ISRs and fcsr. This patch just moves the TARGET_XTHEADMEMPAIR
>> block after the ISR/fcsr block.
>>
>> Reviewed-by: Christoph Müllner 
>>
>> >
>> > gcc/ChangeLog:
>> >
>> > * config/riscv/riscv.cc (riscv_for_each_saved_reg): Place the 
>> > interrupt
>> > operation before the XTheadMemPair.
>> > ---
>> >  gcc/config/riscv/riscv.cc | 56 +--
>> >  .../riscv/xtheadmempair-interrupt-fcsr.c  | 18 ++
>> >  2 files changed, 46 insertions(+), 28 deletions(-)
>> >  create mode 100644 
>> > gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
>> >
>> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>> > index e25692b86fc..fa2d4d4b779 100644
>> > --- a/gcc/config/riscv/riscv.cc
>> > +++ b/gcc/config/riscv/riscv.cc
>> > @@ -6346,6 +6346,34 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, 
>> > riscv_save_restore_fn fn,
>> >   && riscv_is_eh_return_data_register (regno))
>> > continue;
>> >
>> > +  /* In an interrupt function, save and restore some necessary CSRs 
>> > in the stack
>> > +to avoid changes in CSRs.  */
>> > +  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
>> > + && cfun->machine->interrupt_handler_p
>> > + && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
>> > + || (TARGET_ZFINX
>> > + && (cfun->machine->frame.mask & ~(1 << 
>> > RISCV_PROLOGUE_TEMP_REGNUM)
>> > +   {
>> > + unsigned int fcsr_size = GET_MODE_SIZE (SImode);
>> > + if (!epilogue)
>> > +   {
>> > + riscv_save_restore_reg (word_mode, regno, offset, fn);
>> > + offset -= fcsr_size;
>> > + emit_insn (gen_riscv_frcsr (RISCV_PROLOGUE_TEMP (SImode)));
>> > + riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
>> > + offset, riscv_save_reg);
>> > +   }
>> > + else
>> > +   {
>> > + riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
>> > + offset - fcsr_size, 
>> > riscv_restore_reg);
>> > + emit_insn (gen_riscv_fscsr (RISCV_PROLOGUE_TEMP (SImode)));
>> > + riscv_save_restore_reg (word_mode, regno, offset, fn);
>> > + offset -= fcsr_size;
>> > +   }
>> > + continue;
>> > +   }
>> > +
>> >if (TARGET_XTHEADMEMPAIR)
>> > {
>> >   /* Get the next reg/offset pair.  */
>> > @@ -6376,34 +6404,6 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, 
>> > riscv_save_restore_fn fn,
>> > }
>> > }
>> >
>> > -  /* In an interrupt function, save and restore some necessary CSRs 
>> > in the stack
>> > -to avoid changes in CSRs.  */
>> > -  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
>> > - && cfun->machine->interrupt_handler_p
>> > - && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
>> > - || (TARGET_ZFINX
>> > - && (cfun->machine->frame.mask & ~(1 << 
>> > RISCV_PROLOGUE_TEMP_REGNUM)
>> > -   {
>> > - unsigned int fcsr_size = GET_MODE_SIZE (SImode);
>> > - if (!epilogue)
>> > -   {
>> > - riscv_save_restore_reg (word_mode, regno, offset, fn);
>> > - offset -= fcsr_size;
>> > - emit_insn (gen_riscv_frcsr (RISCV_PROLOGUE_TEMP (SImode)));
>> > - riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
>> > - offset, riscv_save_reg);
>> > -   }
>> > - else
>> > -   {
>> > - riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
>> > - offset - fcsr_size, 
>> > riscv_restore_reg);
>> > - emit_insn (gen_riscv_fscsr (RISCV_PROLOGUE_TEMP (SImode)));
>> > - riscv_save_restore_reg (word_mode, regno, offset, fn);
>> > - offset -= fcsr_size;
>> > -   }
>> > - continue;
>> > -   }
>> > -
>> >riscv_save_restore_reg (word_mode, regno, offset, fn);
>> >  }
>> >
>> > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c 
>> > b/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
>> > new file mode 100644
>> > index 000..d06f05f5c7c
>> > --- 

Re: [PATCH v3] libiberty: Use posix_spawn in pex-unix when available.

2023-11-10 Thread Richard Biener
On Fri, Nov 10, 2023 at 12:01 PM Prathamesh Kulkarni
 wrote:
>
> On Thu, 5 Oct 2023 at 00:00, Brendan Shanks  wrote:
> >
> > Hi,
> >
> > This patch implements pex_unix_exec_child using posix_spawn when
> > available.
> >
> > This should especially benefit recent macOS (where vfork just calls
> > fork), but should have equivalent or faster performance on all
> > platforms.
> > In addition, the implementation is substantially simpler than the
> > vfork+exec code path.
> >
> > Tested on x86_64-linux.
> Hi Brendan,
> It seems this patch caused the following regressions on aarch64:
>
> FAIL: g++.dg/modules/bad-mapper-1.C -std=c++17  at line 3 (test for
> errors, line )
> FAIL: g++.dg/modules/bad-mapper-1.C -std=c++17 (test for excess errors)
> FAIL: g++.dg/modules/bad-mapper-1.C -std=c++2a  at line 3 (test for
> errors, line )
> FAIL: g++.dg/modules/bad-mapper-1.C -std=c++2a (test for excess errors)
> FAIL: g++.dg/modules/bad-mapper-1.C -std=c++2b  at line 3 (test for
> errors, line )
> FAIL: g++.dg/modules/bad-mapper-1.C -std=c++2b (test for excess errors)
>
> Looking at g++.log:
> /home/tcwg-buildslave/workspace/tcwg_gnu_2/abe/snapshots/gcc.git~master/gcc/testsuite/g++.dg/modules/bad-mapper-1.C:
> error: failed posix_spawnp mapper 'this-will-not-work'
> In module imported at
> /home/tcwg-buildslave/workspace/tcwg_gnu_2/abe/snapshots/gcc.git~master/gcc/testsuite/g++.dg/modules/bad-mapper-1.C:2:1:
> unique1.bob: error: failed to read compiled module: No such file or directory
> unique1.bob: note: compiled module file is 'gcm.cache/unique1.bob.gcm'
> unique1.bob: note: imports must be built before being imported
> unique1.bob: fatal error: returning to the gate for a mechanical issue
> compilation terminated.
>
> Link to log files:
> https://ci.linaro.org/job/tcwg_gcc_check--master-aarch64-build/1159/artifact/artifacts/00-sumfiles/
> Could you please investigate ?

The testcase needs adjustment; it looks for

// { dg-error "-:failed (exec|CreateProcess).*mapper.* .*this-will-not-work" "" { target { ! { *-*-darwin[89]* *-*-darwin10* } } } 0 }

Adding |posix_spawnp to the alternation probably works.
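
As a quick sanity check of that suggestion, here is a minimal sketch in C using POSIX extended regular expressions. DejaGnu matches dg-error patterns with Tcl regexps, so regcomp/regexec are only a stand-in here, and the helper name pattern_matches is made up for this illustration. The message text is taken from the g++.log excerpt quoted above.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if TEXT contains a match for the
   POSIX extended regular expression PATTERN.  A pattern that fails to
   compile is treated as "no match".  */
bool pattern_matches(const char *pattern, const char *text)
{
  regex_t re;
  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool matched = regexec(&re, text, 0, NULL, 0) == 0;
  regfree(&re);
  return matched;
}
```

With this helper, the current alternation (exec|CreateProcess) does not match "failed posix_spawnp mapper 'this-will-not-work'", while (exec|CreateProcess|posix_spawnp) does.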

>
> Thanks,
> Prathamesh
> >
> > v2: Fix error handling (previously the function would be run twice in
> > case of error), and don't use a macro that changes control flow.
> >
> > v3: Match file style for error-handling blocks, don't close
> > in/out/errdes on error, and check close() for errors.
> >
> > libiberty/
> > * configure.ac (AC_CHECK_HEADERS): Add spawn.h.
> > (checkfuncs): Add posix_spawn, posix_spawnp.
> > (AC_CHECK_FUNCS): Add posix_spawn, posix_spawnp.
> > * configure, config.in: Rebuild.
> > * pex-unix.c [HAVE_POSIX_SPAWN] (pex_unix_exec_child): New function.
> >
> > Signed-off-by: Brendan Shanks 
> > ---
> >  libiberty/configure.ac |   8 +-
> >  libiberty/pex-unix.c   | 168 +
> >  2 files changed, 173 insertions(+), 3 deletions(-)
> >
> > diff --git a/libiberty/configure.ac b/libiberty/configure.ac
> > index 0748c592704..2488b031bc8 100644
> > --- a/libiberty/configure.ac
> > +++ b/libiberty/configure.ac
> > @@ -289,7 +289,7 @@ AC_SUBST_FILE(host_makefile_frag)
> >  # It's OK to check for header files.  Although the compiler may not be
> >  # able to link anything, it had better be able to at least compile
> >  # something.
> > -AC_CHECK_HEADERS(sys/file.h sys/param.h limits.h stdlib.h malloc.h 
> > string.h unistd.h strings.h sys/time.h time.h sys/resource.h sys/stat.h 
> > sys/mman.h fcntl.h alloca.h sys/pstat.h sys/sysmp.h sys/sysinfo.h 
> > machine/hal_sysinfo.h sys/table.h sys/sysctl.h sys/systemcfg.h stdint.h 
> > stdio_ext.h process.h sys/prctl.h)
> > +AC_CHECK_HEADERS(sys/file.h sys/param.h limits.h stdlib.h malloc.h 
> > string.h unistd.h strings.h sys/time.h time.h sys/resource.h sys/stat.h 
> > sys/mman.h fcntl.h alloca.h sys/pstat.h sys/sysmp.h sys/sysinfo.h 
> > machine/hal_sysinfo.h sys/table.h sys/sysctl.h sys/systemcfg.h stdint.h 
> > stdio_ext.h process.h sys/prctl.h spawn.h)
> >  AC_HEADER_SYS_WAIT
> >  AC_HEADER_TIME
> >
> > @@ -412,7 +412,8 @@ funcs="$funcs setproctitle"
> >  vars="sys_errlist sys_nerr sys_siglist"
> >
> >  checkfuncs="__fsetlocking canonicalize_file_name dup3 getrlimit getrusage \
> > - getsysinfo gettimeofday on_exit pipe2 psignal pstat_getdynamic 
> > pstat_getstatic \
> > + getsysinfo gettimeofday on_exit pipe2 posix_spawn posix_spawnp psignal \
> > + pstat_getdynamic pstat_getstatic \
> >   realpath setrlimit spawnve spawnvpe strerror strsignal sysconf sysctl \
> >   sysmp table times wait3 wait4"
> >
> > @@ -435,7 +436,8 @@ if test "x" = "y"; then
> >  index insque \
> >  memchr memcmp memcpy memmem memmove memset mkstemps \
> >  on_exit \
> > -pipe2 psignal pstat_getdynamic pstat_getstatic putenv \
> > +pipe2 posix_spawn posix_spawnp psignal \
> > +pstat_getdynamic pstat_getstatic putenv \
> >  random realpath rename rindex \
> >  sbrk setenv setproctitle setrlimit 

Re: Re: [PATCH] RISC-V: Add combine optimization by slideup for vec_init vectorization

2023-11-10 Thread 钟居哲
Thanks, Robin. Committed.

>> The test patterns are a bit unwieldy but not a blocker
>>IMHO.  Could probably be done shorter using macro magic?
I have no idea, but I think we can revisit it and refine the tests when we have time.



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-11-10 20:47
To: Juzhe-Zhong; gcc-patches
CC: rdapp.gcc; kito.cheng; kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Add combine optimization by slideup for vec_init vectorization
Hi Juzhe,
 
LGTM.  The test patterns are a bit unwieldy but not a blocker
IMHO.  Could probably be done shorter using macro magic?
 
Regards
Robin
 


Re: [PATCH] RISC-V: Fix bug that XTheadMemPair extension caused fcsr not to be saved and restored before and after interrupt.

2023-11-10 Thread Kito Cheng
LGTM

On Fri, Nov 10, 2023 at 20:55, Christoph Müllner wrote:

> On Fri, Nov 10, 2023 at 8:14 AM Jin Ma  wrote:
> >
> > The t0 register is used as a temporary register for interrupts, so it
> needs
> > special treatment. It is necessary to avoid using "th.ldd" in the
> interrupt
> > program to stop the subsequent operation of the t0 register, so they
> need to
> > exchange positions in the function "riscv_for_each_saved_reg".
>
> RISCV_PROLOGUE_TEMP_REGNUM needs indeed to be treated special
> in case of ISRs and fcsr. This patch just moves the TARGET_XTHEADMEMPAIR
> block after the ISR/fcsr block.
>
> Reviewed-by: Christoph Müllner 
>
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/riscv.cc (riscv_for_each_saved_reg): Place the
> interrupt
> > operation before the XTheadMemPair.
> > ---
> >  gcc/config/riscv/riscv.cc | 56 +--
> >  .../riscv/xtheadmempair-interrupt-fcsr.c  | 18 ++
> >  2 files changed, 46 insertions(+), 28 deletions(-)
> >  create mode 100644
> gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
> >
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index e25692b86fc..fa2d4d4b779 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> > @@ -6346,6 +6346,34 @@ riscv_for_each_saved_reg (poly_int64 sp_offset,
> riscv_save_restore_fn fn,
> >   && riscv_is_eh_return_data_register (regno))
> > continue;
> >
> > +  /* In an interrupt function, save and restore some necessary CSRs
> in the stack
> > +to avoid changes in CSRs.  */
> > +  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
> > + && cfun->machine->interrupt_handler_p
> > + && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
> > + || (TARGET_ZFINX
> > + && (cfun->machine->frame.mask & ~(1 <<
> RISCV_PROLOGUE_TEMP_REGNUM)
> > +   {
> > + unsigned int fcsr_size = GET_MODE_SIZE (SImode);
> > + if (!epilogue)
> > +   {
> > + riscv_save_restore_reg (word_mode, regno, offset, fn);
> > + offset -= fcsr_size;
> > + emit_insn (gen_riscv_frcsr (RISCV_PROLOGUE_TEMP (SImode)));
> > + riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> > + offset, riscv_save_reg);
> > +   }
> > + else
> > +   {
> > + riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> > + offset - fcsr_size,
> riscv_restore_reg);
> > + emit_insn (gen_riscv_fscsr (RISCV_PROLOGUE_TEMP (SImode)));
> > + riscv_save_restore_reg (word_mode, regno, offset, fn);
> > + offset -= fcsr_size;
> > +   }
> > + continue;
> > +   }
> > +
> >if (TARGET_XTHEADMEMPAIR)
> > {
> >   /* Get the next reg/offset pair.  */
> > @@ -6376,34 +6404,6 @@ riscv_for_each_saved_reg (poly_int64 sp_offset,
> riscv_save_restore_fn fn,
> > }
> > }
> >
> > -  /* In an interrupt function, save and restore some necessary CSRs
> in the stack
> > -to avoid changes in CSRs.  */
> > -  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
> > - && cfun->machine->interrupt_handler_p
> > - && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
> > - || (TARGET_ZFINX
> > - && (cfun->machine->frame.mask & ~(1 <<
> RISCV_PROLOGUE_TEMP_REGNUM)
> > -   {
> > - unsigned int fcsr_size = GET_MODE_SIZE (SImode);
> > - if (!epilogue)
> > -   {
> > - riscv_save_restore_reg (word_mode, regno, offset, fn);
> > - offset -= fcsr_size;
> > - emit_insn (gen_riscv_frcsr (RISCV_PROLOGUE_TEMP (SImode)));
> > - riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> > - offset, riscv_save_reg);
> > -   }
> > - else
> > -   {
> > - riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> > - offset - fcsr_size,
> riscv_restore_reg);
> > - emit_insn (gen_riscv_fscsr (RISCV_PROLOGUE_TEMP (SImode)));
> > - riscv_save_restore_reg (word_mode, regno, offset, fn);
> > - offset -= fcsr_size;
> > -   }
> > - continue;
> > -   }
> > -
> >riscv_save_restore_reg (word_mode, regno, offset, fn);
> >  }
> >
> > diff --git
> a/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
> b/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
> > new file mode 100644
> > index 000..d06f05f5c7c
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
> > @@ -0,0 +1,18 @@
> > +/* Verify that fcsr instructions emitted.  */
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target hard_float } */
> > +/* { dg-skip-if "" { *-*-* } 

[PATCH] tree-optimization/110221 - SLP and loop mask/len

2023-11-10 Thread Richard Biener
The following fixes the issue that when SLP stmts are internal defs
but appear invariant because they end up only using invariant defs
then they get scheduled outside of the loop.  This nice optimization
breaks down when loop masks or lens are applied since those are not
explicitly tracked as dependences.  The following makes sure to never
schedule internal defs outside of the vectorized loop when the
loop uses masks/lens.
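
One way to picture why the loop length matters is a scalar model of a length-controlled vector loop. This is only an illustrative sketch in C, not what the vectorizer emits: VL, fill_with_len and the lane loop are assumptions made up for the example. The point is that the active length is computed inside the loop on every iteration, so a statement that implicitly consumes it cannot be placed before the loop even when its visible operands are all invariant.

```c
#include <stddef.h>

enum { VL = 4 };  /* assumed number of lanes per "vector" iteration */

/* Store VALUE into dst[0..n-1], VL elements at a time.  The stored value
   is loop invariant, but each "vector store" still depends on LEN, which
   only exists inside the loop.  */
void fill_with_len(double *dst, size_t n, double value)
{
  for (size_t i = 0; i < n; i += VL)
    {
      /* Active length for this iteration; shorter on the final one.  */
      size_t len = (n - i < VL) ? n - i : VL;
      for (size_t lane = 0; lane < len; lane++)
        dst[i + lane] = value;
    }
}
```

For n = 6 the second iteration runs with len = 2, so elements past index 5 are left untouched.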

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/110221
* tree-vect-slp.cc (vect_schedule_slp_node): When loop
masking / len is applied make sure to not schedule
internal defs outside of the loop.

* gfortran.dg/pr110221.f: New testcase.
---
 gcc/testsuite/gfortran.dg/pr110221.f | 17 +
 gcc/tree-vect-slp.cc | 10 ++
 2 files changed, 27 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/pr110221.f

diff --git a/gcc/testsuite/gfortran.dg/pr110221.f b/gcc/testsuite/gfortran.dg/pr110221.f
new file mode 100644
index 000..8b57384313a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr110221.f
@@ -0,0 +1,17 @@
+C PR middle-end/68146
+C { dg-do compile }
+C { dg-options "-O2 -w" }
+C { dg-additional-options "-mavx512f --param vect-partial-vector-usage=2" { target avx512f } }
+  SUBROUTINE CJYVB(V,Z,V0,CBJ,CDJ,CBY,CYY)
+  IMPLICIT DOUBLE PRECISION (A,B,G,O-Y)
+  IMPLICIT COMPLEX*16 (C,Z)
+  DIMENSION CBJ(0:*),CDJ(0:*),CBY(0:*)
+  N=INT(V)
+  CALL GAMMA2(VG,GA)
+  DO 65 K=1,N
+CBY(K)=CYY
+65CONTINUE
+  CDJ(0)=V0/Z*CBJ(0)-CBJ(1)
+  DO 70 K=1,N
+70  CDJ(K)=-(K+V0)/Z*CBJ(K)+CBJ(K-1)
+  END
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 3e5814c3a31..80e279d8f50 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -9081,6 +9081,16 @@ vect_schedule_slp_node (vec_info *vinfo,
   /* Emit other stmts after the children vectorized defs which is
 earliest possible.  */
   gimple *last_stmt = NULL;
+  if (auto loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
+   if (LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
+   || LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
+ {
+   /* But avoid scheduling internal defs outside of the loop when
+  we might have only implicitly tracked loop mask/len defs.  */
+   gimple_stmt_iterator si
+ = gsi_after_labels (LOOP_VINFO_LOOP (loop_vinfo)->header);
+   last_stmt = *si;
+ }
   bool seen_vector_def = false;
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
if (SLP_TREE_DEF_TYPE (child) == vect_internal_def)
-- 
2.35.3


Re: [PATCH v3 2/2]middle-end match.pd: optimize fneg (fabs (x)) to copysign (x, -1) [PR109154]

2023-11-10 Thread Richard Biener
On Fri, 10 Nov 2023, Tamar Christina wrote:

> 
> Hi Prathamesh,
> 
> Yes, Arm requires SIMD for copysign. The testcases fail because they don't turn on Neon.
> 
> I'll update them.

On x86_64 with -m32 I see

FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized ".COPYSIGN" 1
FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized "ABS_EXPR" 1
FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= ABS_EXPR" 1
FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= -" 1
FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= .COPYSIGN" 2
FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop "Deleting[^n]* = -" 4
FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop "Deleting[^n]* = .COPYSIGN" 2
FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop "Deleting[^n]* = ABS_EXPR <" 1
FAIL: gcc.dg/tree-ssa/phi-opt-24.c scan-tree-dump-not phiopt2 "if"

maybe add a copysign effective target?
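
For reference, the identity behind the transformation under discussion can be checked in plain C. The following is a small sketch assuming an IEEE 754 target; the function names are invented for this example and do not come from the patch or the testsuite. It compares -fabs (x) with copysign (x, -1.0), including the sign bit, which is what distinguishes -0.0 from 0.0 and what the NaN case has to preserve.

```c
#include <math.h>
#include <stdbool.h>

/* The form as typically written in source: negate the absolute value.  */
double neg_abs(double x)
{
  return -fabs(x);
}

/* The form the patch canonicalizes to.  */
double copysign_neg(double x)
{
  return copysign(x, -1.0);
}

/* Returns true if both forms give the same result for X.  NaNs never
   compare equal, so for them only NaN-ness and the sign bit are checked.  */
bool forms_agree(double x)
{
  double a = neg_abs(x);
  double b = copysign_neg(x);
  if (isnan(x))
    return isnan(a) && isnan(b) && signbit(a) && signbit(b);
  return a == b && signbit(a) && signbit(b);
}
```

Depending on the toolchain, linking may require -lm.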

> Regards,
> Tamar
> 
> From: Prathamesh Kulkarni 
> Sent: Friday, November 10, 2023 12:24 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org ; nd ; 
> rguent...@suse.de ; j...@ventanamicro.com 
> 
> Subject: Re: [PATCH v3 2/2]middle-end match.pd: optimize fneg (fabs (x)) to 
> copysign (x, -1) [PR109154]
> 
> On Mon, 6 Nov 2023 at 15:50, Tamar Christina  wrote:
> >
> > Hi All,
> >
> > This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more
> > canonical and allows a target to expand this sequence efficiently.  Such
> > sequences are common in scientific code working with gradients.
> >
> > There is an existing canonicalization of copysign (x, -1) to fneg (fabs (x))
> > which I remove since this is a less efficient form.  The testsuite is also
> > updated in light of this.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> Hi Tamar,
> It seems the patch caused following regressions on arm:
> 
> Running gcc:gcc.dg/dg.exp ...
> FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized ".COPYSIGN" 1
> FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized "ABS_EXPR" 1
> 
> Running gcc:gcc.dg/tree-ssa/tree-ssa.exp ...
> FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= -" 1
> FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= .COPYSIGN" 2
> FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= ABS_EXPR" 1
> FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop
> "Deleting[^\\n]* = -" 4
> FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop
> "Deleting[^\\n]* = ABS_EXPR <" 1
> FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop
> "Deleting[^\\n]* = \\.COPYSIGN" 2
> FAIL: gcc.dg/tree-ssa/copy-sign-2.c scan-tree-dump-times optimized 
> ".COPYSIGN" 1
> FAIL: gcc.dg/tree-ssa/copy-sign-2.c scan-tree-dump-times optimized "ABS" 1
> FAIL: gcc.dg/tree-ssa/mult-abs-2.c scan-tree-dump-times gimple ".COPYSIGN" 4
> FAIL: gcc.dg/tree-ssa/mult-abs-2.c scan-tree-dump-times gimple "ABS" 4
> FAIL: gcc.dg/tree-ssa/phi-opt-24.c scan-tree-dump-not phiopt2 "if"
> Link to log files:
> https://ci.linaro.org/job/tcwg_gcc_check--master-arm-build/1240/artifact/artifacts/00-sumfiles/
> 
> Even for following test-case:
> double g (double a)
> {
>   double t1 = fabs (a);
>   double t2 = -t1;
>   return t2;
> }
> 
> It seems, the pattern gets applied but doesn't get eventually
> simplified to copysign(a, -1).
> forwprop dump shows:
> Applying pattern match.pd:1131, gimple-match-4.cc:4134
> double g (double a)
> {
>   double t2;
>   double t1;
> 
>:
>   t1_2 = ABS_EXPR <a_1(D)>;
>   t2_3 = -t1_2;
>   return t2_3;
> 
> }
> 
> while on x86_64:
> Applying pattern match.pd:1131, gimple-match-4.cc:4134
> gimple_simplified to t2_3 = .COPYSIGN (a_1(D), -1.0e+0);
> Removing dead stmt:t1_2 = ABS_EXPR <a_1(D)>;
> double g (double a)
> {
>   double t2;
>   double t1;
> 
>:
>   t2_3 = .COPYSIGN (a_1(D), -1.0e+0);
>   return t2_3;
> 
> }
> 
> Thanks,
> Prathamesh
> 
> 
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > PR tree-optimization/109154
> > * match.pd: Add new neg+abs rule, remove inverse copysign rule.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR tree-optimization/109154
> > * gcc.dg/fold-copysign-1.c: Updated.
> > * gcc.dg/pr55152-2.c: Updated.
> > * gcc.dg/tree-ssa/abs-4.c: Updated.
> > * gcc.dg/tree-ssa/backprop-6.c: Updated.
> > * gcc.dg/tree-ssa/copy-sign-2.c: Updated.
> > * gcc.dg/tree-ssa/mult-abs-2.c: Updated.
> > * gcc.target/aarch64/fneg-abs_1.c: New test.
> > * gcc.target/aarch64/fneg-abs_2.c: New test.
> > * gcc.target/aarch64/fneg-abs_3.c: New test.
> > * gcc.target/aarch64/fneg-abs_4.c: New test.
> > * gcc.target/aarch64/sve/fneg-abs_1.c: New test.
> > * gcc.target/aarch64/sve/fneg-abs_2.c: New test.
> > * gcc.target/aarch64/sve/fneg-abs_3.c: New test.
> > * 

Re: [PATCH] RISC-V: Fix bug that XTheadMemPair extension caused fcsr not to be saved and restored before and after interrupt.

2023-11-10 Thread Christoph Müllner
On Fri, Nov 10, 2023 at 8:14 AM Jin Ma  wrote:
>
> The t0 register is used as a temporary register for interrupts, so it needs
> special treatment. It is necessary to avoid using "th.ldd" in the interrupt
> program to stop the subsequent operation of the t0 register, so they need to
> exchange positions in the function "riscv_for_each_saved_reg".

RISCV_PROLOGUE_TEMP_REGNUM does indeed need to be treated specially
in case of ISRs and fcsr. This patch just moves the TARGET_XTHEADMEMPAIR
block after the ISR/fcsr block.
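
The effect of the block order can be illustrated with a toy model in C. This is a simplified sketch, not the code in riscv.cc: the function name, the bitmask representation and the "pair with the immediately following saved register" rule are assumptions made for the example, and it only looks at the case where the temporary register is reached as the first register of a candidate pair.

```c
#include <stdbool.h>

enum { TEMP_REGNUM = 5, NUM_REGS = 32 };  /* t0 is x5 */

/* Walk the registers set in MASK.  Returns true if the walk reaches the
   special handling for TEMP_REGNUM.  SPECIAL_FIRST selects whether that
   check comes before or after the pairing step.  */
bool temp_gets_special_handling(unsigned int mask, bool special_first)
{
  for (int regno = 0; regno < NUM_REGS; regno++)
    {
      if (!(mask & (1u << regno)))
        continue;

      /* New order: the special case is tested before any pairing.  */
      if (special_first && regno == TEMP_REGNUM)
        return true;

      /* Pairing step: if the next register is also saved, handle both
         at once and skip the rest of this iteration.  */
      int regno2 = regno + 1;
      if (regno2 < NUM_REGS && (mask & (1u << regno2)))
        {
          regno = regno2;
          continue;
        }

      /* Old order: only reached when no pair was formed.  */
      if (regno == TEMP_REGNUM)
        return true;
    }
  return false;
}
```

With registers 5 and 6 both saved, the old order pairs them and never reaches the special case, whereas the new order handles register 5 before pairing is considered.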

Reviewed-by: Christoph Müllner 

>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_for_each_saved_reg): Place the 
> interrupt
> operation before the XTheadMemPair.
> ---
>  gcc/config/riscv/riscv.cc | 56 +--
>  .../riscv/xtheadmempair-interrupt-fcsr.c  | 18 ++
>  2 files changed, 46 insertions(+), 28 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index e25692b86fc..fa2d4d4b779 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -6346,6 +6346,34 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, 
> riscv_save_restore_fn fn,
>   && riscv_is_eh_return_data_register (regno))
> continue;
>
> +  /* In an interrupt function, save and restore some necessary CSRs in 
> the stack
> +to avoid changes in CSRs.  */
> +  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
> + && cfun->machine->interrupt_handler_p
> + && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
> + || (TARGET_ZFINX
> + && (cfun->machine->frame.mask & ~(1 << 
> RISCV_PROLOGUE_TEMP_REGNUM)
> +   {
> + unsigned int fcsr_size = GET_MODE_SIZE (SImode);
> + if (!epilogue)
> +   {
> + riscv_save_restore_reg (word_mode, regno, offset, fn);
> + offset -= fcsr_size;
> + emit_insn (gen_riscv_frcsr (RISCV_PROLOGUE_TEMP (SImode)));
> + riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> + offset, riscv_save_reg);
> +   }
> + else
> +   {
> + riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> + offset - fcsr_size, riscv_restore_reg);
> + emit_insn (gen_riscv_fscsr (RISCV_PROLOGUE_TEMP (SImode)));
> + riscv_save_restore_reg (word_mode, regno, offset, fn);
> + offset -= fcsr_size;
> +   }
> + continue;
> +   }
> +
>if (TARGET_XTHEADMEMPAIR)
> {
>   /* Get the next reg/offset pair.  */
> @@ -6376,34 +6404,6 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, 
> riscv_save_restore_fn fn,
> }
> }
>
> -  /* In an interrupt function, save and restore some necessary CSRs in 
> the stack
> -to avoid changes in CSRs.  */
> -  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
> - && cfun->machine->interrupt_handler_p
> - && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
> - || (TARGET_ZFINX
> - && (cfun->machine->frame.mask & ~(1 << 
> RISCV_PROLOGUE_TEMP_REGNUM)
> -   {
> - unsigned int fcsr_size = GET_MODE_SIZE (SImode);
> - if (!epilogue)
> -   {
> - riscv_save_restore_reg (word_mode, regno, offset, fn);
> - offset -= fcsr_size;
> - emit_insn (gen_riscv_frcsr (RISCV_PROLOGUE_TEMP (SImode)));
> - riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> - offset, riscv_save_reg);
> -   }
> - else
> -   {
> - riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> - offset - fcsr_size, riscv_restore_reg);
> - emit_insn (gen_riscv_fscsr (RISCV_PROLOGUE_TEMP (SImode)));
> - riscv_save_restore_reg (word_mode, regno, offset, fn);
> - offset -= fcsr_size;
> -   }
> - continue;
> -   }
> -
>riscv_save_restore_reg (word_mode, regno, offset, fn);
>  }
>
> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c 
> b/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
> new file mode 100644
> index 000..d06f05f5c7c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
> @@ -0,0 +1,18 @@
> +/* Verify that fcsr instructions emitted.  */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target hard_float } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-g" "-Oz" "-Os" "-flto" } } */
> +/* { dg-options "-march=rv64gc_xtheadmempair -mtune=thead-c906 
> -funwind-tables" { target { rv64 } } } */
> +/* { dg-options "-march=rv32gc_xtheadmempair -mtune=thead-c906 
> -funwind-tables" { target { rv32 } } } */

[PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

2023-11-10 Thread Robin Dapp
Hi,

this patch fixes several more FAILs that would only show up in 32-bit runs.

Regards
 Robin

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vmul-zvfh-run.c: Adjust.
* gcc.target/riscv/rvv/autovec/binop/vsub-zvfh-run.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_narrow_shift_run-3.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/pr111401.c: Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfcvt-itof-zvfh-run.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfcvt_rtz-zvfh-run.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-zvfh-run.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-template.h:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-zvfh-run.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfncvt-zvfh-run.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfwcvt-ftoi-zvfh-run.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-zvfh-run.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfwcvt-zvfh-run.c:
Ditto.
* gcc.target/riscv/rvv/autovec/slp-mask-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-10.c:
Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-11.c:
Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-12.c:
Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-3.c:
Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-4.c:
Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-5.c:
Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-6.c:
Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-7.c:
Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-8.c:
Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-9.c:
Ditto.
* gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-zvfh-run.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-zvfh-run.c:
Ditto.
---
 .../riscv/rvv/autovec/binop/vmul-zvfh-run.c   | 34 -
 .../riscv/rvv/autovec/binop/vsub-zvfh-run.c   | 72 +--
 .../autovec/cond/cond_narrow_shift_run-3.c|  2 +-
 .../riscv/rvv/autovec/cond/pr111401.c |  2 +-
 .../autovec/conversions/vfcvt-itof-zvfh-run.c |  4 +-
 .../autovec/conversions/vfcvt_rtz-zvfh-run.c  |  4 +-
 .../conversions/vfncvt-ftoi-zvfh-run.c| 18 ++---
 .../conversions/vfncvt-itof-template.h| 36 ++
 .../conversions/vfncvt-itof-zvfh-run.c| 31 
 .../rvv/autovec/conversions/vfncvt-zvfh-run.c |  4 +-
 .../conversions/vfwcvt-ftoi-zvfh-run.c| 10 +--
 .../conversions/vfwcvt-itof-zvfh-run.c|  4 +-
 .../rvv/autovec/conversions/vfwcvt-zvfh-run.c | 40 +--
 .../riscv/rvv/autovec/slp-mask-run-1.c|  2 +-
 .../rvv/autovec/ternop/ternop_run_zvfh-1.c|  4 +-
 .../rvv/autovec/ternop/ternop_run_zvfh-10.c   |  4 +-
 .../rvv/autovec/ternop/ternop_run_zvfh-11.c   | 50 ++---
 .../rvv/autovec/ternop/ternop_run_zvfh-12.c   | 49 ++---
 .../rvv/autovec/ternop/ternop_run_zvfh-2.c| 24 ---
 .../rvv/autovec/ternop/ternop_run_zvfh-3.c| 21 +++---
 .../rvv/autovec/ternop/ternop_run_zvfh-4.c|  4 +-
 .../rvv/autovec/ternop/ternop_run_zvfh-5.c| 50 ++---
 .../rvv/autovec/ternop/ternop_run_zvfh-6.c| 50 ++---
 .../rvv/autovec/ternop/ternop_run_zvfh-7.c|  4 +-
 .../rvv/autovec/ternop/ternop_run_zvfh-8.c| 21 +++---
 .../rvv/autovec/ternop/ternop_run_zvfh-9.c| 22 +++---
 .../riscv/rvv/autovec/unop/vfsqrt-run.c   | 30 
 .../riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c   |  2 +-
 .../riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c   |  2 +-
 .../riscv/rvv/autovec/unop/vfsqrt-template.h  | 24 ++-
 .../riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c  | 34 -
 .../autovec/vls-vlmax/vec_extract-zvfh-run.c  |  4 +-
 .../rvv/autovec/vls-vlmax/vec_set-zvfh-run.c  |  4 +-
 33 files changed, 335 insertions(+), 331 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vmul-zvfh-run.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vmul-zvfh-run.c
index a4271810e58..1082695c5de 100644
--- 
