Re: [PATCH] gcc: m68k: fix PR target/107645

2022-11-12 Thread Max Filippov via Gcc-patches
On Sat, Nov 12, 2022 at 11:42 AM Jeff Law  wrote:
> ISTM that we'd need to strip the unspec and process its argument
> instead.

I tried that first, the result was more ICEs because that pattern
wasn't recognized at later stages. Then I read the change to the
comment over the symbolic_operand predicate:
https://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=gcc/config/m68k/predicates.md;h=6ca261fb92a2b7ecd53a0356d06410e2c0d70965;hp=417989f6d6c408fa82af9f9649a204b9a754d1dc;hb=75df395f15f2;hpb=676fd528c9990a4f1046b51d40059893c3a71490
and that made me think that the intention was to not recognize
the unspecs in that predicate.

-- 
Thanks.
-- Max


[r13-3923 Regression] FAIL: gcc.dg/fold-overflow-1.c scan-assembler-times 2139095040 2 on Linux/x86_64

2022-11-12 Thread haochen.jiang via Gcc-patches
On Linux/x86_64,

2f7f9edd28d75a85a33599978f23811e679e443d is the first bad commit
commit 2f7f9edd28d75a85a33599978f23811e679e443d
Author: Jakub Jelinek 
Date:   Sat Nov 12 09:33:01 2022 +0100

range-op: Implement floating point multiplication fold_range [PR107569]

caused

FAIL: gcc.dg/fold-overflow-1.c scan-assembler-times 2139095040 2

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-3923/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=gcc.dg/fold-overflow-1.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=gcc.dg/fold-overflow-1.c --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=gcc.dg/fold-overflow-1.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=gcc.dg/fold-overflow-1.c --target_board='unix{-m64\ 
-march=cascadelake}'"

(Please do not reply to this email. For questions about this report, contact me 
at haochen dot jiang at intel.com)


[r13-3924 Regression] FAIL: gcc.dg/pr95115.c execution test on Linux/x86_64

2022-11-12 Thread haochen.jiang via Gcc-patches
On Linux/x86_64,

2d5c4a16dd833aa083f13dd3e78e3ef38afe6ebb is the first bad commit
commit 2d5c4a16dd833aa083f13dd3e78e3ef38afe6ebb
Author: Jakub Jelinek 
Date:   Sat Nov 12 09:35:16 2022 +0100

range-op: Implement floating point division fold_range [PR107569]

caused

FAIL: gcc.dg/pr95115.c execution test

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-3924/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr95115.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr95115.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr95115.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr95115.c 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email. For questions about this report, contact me 
at haochen dot jiang at intel.com)


Re: [PATCH] Enable shrink wrapping for the RISC-V target.

2022-11-12 Thread Jeff Law via Gcc-patches



On 11/7/22 15:07, Palmer Dabbelt wrote:

On Thu, 03 Nov 2022 15:23:28 PDT (-0700), j...@ventanamicro.com wrote:


On 11/2/22 18:26, Palmer Dabbelt wrote:



I also tried to remove that restriction but it looks like it can't
work because we can't create pseudo-registers during shrink wrapping
and shrink wrapping can't work either.

I believe this means that shrink wrapping cannot interfere with a
long stack frame, so there is nothing to test against in this case?


It'd be marginally better to have such a test case to ensure we don't
shrink wrap it -- that would ensure that someone doesn't accidentally
introduce shrink wrapping with large offsets.   Just a bit of future
proofing.


If there's passing test cases that fail with that check removed then
it's probably good enough, though I think in this case just having a
comment there saying why the short-stack check is necessary should be
fine.


I can live with this.


Which one (or either)?  I'm fine with either option, just trying to 
avoid another re-spin as this one is a bit vague.


Sorry I wasn't clear.  Either is fine with me.


Jeff



Re: [PATCH v2] doc: Remove outdated reference to "core" and front-end downloads

2022-11-12 Thread Jeff Law via Gcc-patches



On 11/9/22 15:12, Jonathan Wakely via Gcc-patches wrote:

Patch rebased on the new doc format. I haven't tested building the docs
this way, but this is just plain text here.

OK for trunk?

-- >8 --

gcc/ChangeLog:

* doc/install/testing.rst: Remove anachronism about separate
source tarballs.


OK.

jeff




Re: [PATCH v2] RISC-V: costs: support shift-and-add in strength-reduction

2022-11-12 Thread Jeff Law via Gcc-patches



On 11/10/22 14:34, Philipp Tomsich wrote:

The strength-reduction implementation in expmed.cc will assess the
profitability of using shift-and-add using a RTL expression that wraps
a MULT (with a power-of-2) in a PLUS.  Unless the RISC-V rtx_costs
function recognizes this as expressing a sh[123]add instruction, we
will return an inflated cost---thus defeating the optimization.

This change adds the necessary idiom recognition to provide an
accurate cost for this form of expressing sh[123]add.

Instead of expanding to
li      a5,200
mulw    a0,a5,a0
with this change, the expression 'a * 200' is synthesized as:
sh2add  a0,a0,a0   // *5 = a + 4 * a
sh2add  a0,a0,a0   // *5 = a + 4 * a
slli    a0,a0,3    // *8
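
As a rough illustration of the decomposition above (a sketch, not code from
the patch), the three-instruction sequence can be modelled in C++. The names
sh2add and mul200 are hypothetical helpers; sh2add is assumed to follow the
Zba semantics rd = rs2 + (rs1 << 2).

```cpp
#include <cstdint>

// Models the Zba sh2add instruction: rd = rs2 + (rs1 << 2).
constexpr std::uint64_t sh2add(std::uint64_t rs1, std::uint64_t rs2)
{
  return rs2 + (rs1 << 2);
}

// Computes a * 200 as ((a * 5) * 5) * 8, mirroring the listing above.
constexpr std::uint64_t mul200(std::uint64_t a)
{
  const std::uint64_t a5 = sh2add(a, a);     // a + 4*a     = 5*a
  const std::uint64_t a25 = sh2add(a5, a5);  // 5*a + 4*5*a = 25*a
  return a25 << 3;                           // 25*a * 8    = 200*a
}

static_assert(mul200(7) == 1400, "shift-and-add must match a * 200");
```

Two dependent shift-adds and one shift replace the li/mulw pair, which is why
the cost function has to report the PLUS-of-MULT form as cheap for the
synthesis to be chosen.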

gcc/ChangeLog:

* config/riscv/riscv.c (riscv_rtx_costs): Recognize shNadd,
if expressed as a plus and multiplication with a power-of-2.
Split costing for MINUS from PLUS.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zba-shNadd-07.c: New test.


OK.  Note that getting this right can impact one of the spec2017 integer 
benchmarks notably.  I don't recall which one, but it has a div and a 
mod by the same constant which is fairly reasonably implemented with 
shifts and adds.  You won't see it in instruction count data, but would 
see it if you had cycle count data or instrumented for div/mod instructions.



Jeff




[committed] libstdc++: Add C++20 clocks

2022-11-12 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux and powerpc64le-linux. Pushed to trunk.

-- >8 --

Also add the basic types for timezones, without the non-inline
definitions needed to actually use them.

The get_leap_second_info function currently uses a hardcoded list of
leap seconds, correct as of the end of 2022. That needs to be replaced
with a dynamically generated list read from the system tzdata. That will
be done in a later patch.
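
To make the idea of a hardcoded leap second list concrete, here is a minimal
sketch of a sorted-table lookup. It is not the libstdc++ implementation: the
names leap_table and leap_entries_elapsed are hypothetical, the table is
deliberately abbreviated to three entries, and the lookup works on plain Unix
timestamps rather than utc_time values.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Abbreviated, illustrative table of Unix timestamps for the dates that
// immediately follow three inserted leap seconds. A real table would list
// every insertion.
constexpr std::array<std::int64_t, 3> leap_table = {
  78796800,    // 1972-07-01
  94694400,    // 1973-01-01
  1483228800,  // 2017-01-01
};

// Returns how many table entries lie at or before sys_secs.
inline std::int64_t leap_entries_elapsed(std::int64_t sys_secs)
{
  // upper_bound finds the first entry strictly greater than sys_secs,
  // so its index is the number of entries that have already passed.
  auto pos = std::upper_bound(leap_table.begin(), leap_table.end(), sys_secs);
  return pos - leap_table.begin();
}
```

Because the table is sorted, the lookup is a binary search, and replacing the
fixed array with data read from the system tzdata would not change the lookup
itself.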

libstdc++-v3/ChangeLog:

* include/std/chrono (utc_clock, tai_clock, gps_clock): Define.
(clock_time_conversion, clock_cast): Define.
(sys_info, local_info): Define structs for timezone information.
(nonexistent_local_time, ambiguous_local_time): Define
exceptions for invalid times.
(time_zone, time_zone_link, leap_second, zoned_traits, tzdb)
(tzdb_list): Define classes representing time zones.
(get_leap_second_info): Define new function returning leap
second offset for a given time point.
* testsuite/std/time/clock/gps/1.cc: New test.
* testsuite/std/time/clock/tai/1.cc: New test.
* testsuite/std/time/clock/utc/1.cc: New test.
---
 libstdc++-v3/include/std/chrono   | 744 +-
 .../testsuite/std/time/clock/gps/1.cc |  38 +
 .../testsuite/std/time/clock/tai/1.cc |  41 +
 .../testsuite/std/time/clock/utc/1.cc |  24 +
 4 files changed, 844 insertions(+), 3 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/std/time/clock/gps/1.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/clock/tai/1.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/clock/utc/1.cc

diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
index c0c3a679609..90b73f8198e 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -39,9 +39,15 @@
 #else
 
 #include 
-#if __cplusplus > 201703L
-# include  // ostringstream
-# include 
+
+#if __cplusplus >= 202002L
+# include 
+# include 
+# include 
+# include  // __to_chars_len, __to_chars_10_impl
+# include  // upper_bound TODO: move leap_second_info to .so
+# include 
+# include 
 #endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
@@ -102,6 +108,357 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   seconds elapsed;
 };
 
+template
+  leap_second_info
+  get_leap_second_info(const utc_time<_Duration>& __ut);
+
+/** A clock that measures Universal Coordinated Time (UTC).
+ *
+ * The epoch is 1970-01-01 00:00:00.
+ *
+ * @since C++20
+ */
+class utc_clock
+{
+public:
+  using rep   = system_clock::rep;
+  using period= system_clock::period;
+  using duration  = chrono::duration;
+  using time_point= chrono::time_point;
+  static constexpr bool is_steady = false;
+
+  static time_point
+  now()
+  { return from_sys(system_clock::now()); }
+
+  template
+   static sys_time>
+   to_sys(const utc_time<_Duration>& __t)
+   {
+ using _CDur = common_type_t<_Duration, seconds>;
+ const auto __li = chrono::get_leap_second_info(__t);
+ sys_time<_CDur> __s{__t.time_since_epoch() - seconds{__li.elapsed}};
+ if (__li.is_leap_second)
+   __s = chrono::floor(__s) + seconds{1} - _CDur{1};
+ return __s;
+   }
+
+  template
+   static utc_time>
+   from_sys(const sys_time<_Duration>& __t)
+   {
+ using _CDur = common_type_t<_Duration, seconds>;
+ utc_time<_Duration> __u(__t.time_since_epoch());
+ const auto __li = chrono::get_leap_second_info(__u);
+ return utc_time<_CDur>{__u} + seconds{__li.elapsed};
+   }
+};
+
+/** A clock that measures International Atomic Time.
+ *
+ * The epoch is 1958-01-01 00:00:00.
+ *
+ * @since C++20
+ */
+class tai_clock
+{
+public:
+  using rep   = system_clock::rep;
+  using period= system_clock::period;
+  using duration  = chrono::duration;
+  using time_point= chrono::time_point;
+  static constexpr bool is_steady = false; // XXX true for CLOCK_TAI?
+
+  // TODO move into lib, use CLOCK_TAI on linux, add extension point.
+  static time_point
+  now()
+  { return from_utc(utc_clock::now()); }
+
+  template
+   static utc_time>
+   to_utc(const tai_time<_Duration>& __t)
+   {
+ using _CDur = common_type_t<_Duration, seconds>;
+ return utc_time<_CDur>{__t.time_since_epoch()} - 378691210s;
+   }
+
+  template
+   static tai_time>
+   from_utc(const utc_time<_Duration>& __t)
+   {
+ using _CDur = common_type_t<_Duration, seconds>;
+ return tai_time<_CDur>{__t.time_since_epoch()} + 378691210s;
+   }
+};
+
+/** A clock that measures GPS time.
+ *
+ * The epoch is 1980-01-06 

[committed] libstdc++: Allow std::to_chars for 128-bit integers in strict mode

2022-11-12 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux and powerpc64le-linux. Pushed to trunk.

-- >8 --

This allows std::format to support __int128 when __STRICT_ANSI__ is
defined, which previously failed because __int128 is not an integral
type in strict mode.

With these changes, std::to_chars still rejects 128-bit integers in
strict mode, but std::format will be able to use __detail::__to_chars_i
for unsigned __int128.
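
The underlying issue can be sketched outside the library. The patch uses
__gnu_cxx::__int_traits; the variable template below is a different,
hypothetical stand-in (looks_unsigned) that decides signedness purely from
arithmetic, so it does not depend on whether std::is_integral recognizes the
type. The 128-bit checks assume a compiler that defines __SIZEOF_INT128__.

```cpp
// Hypothetical stand-in: a type counts as unsigned if converting -1 to it
// yields a value greater than zero.
template<typename T>
constexpr bool looks_unsigned = T(-1) > T(0);

static_assert(looks_unsigned<unsigned long>, "unsigned long is unsigned");
static_assert(!looks_unsigned<long>, "long is signed");

#ifdef __SIZEOF_INT128__
// __extension__ keeps pedantic builds quiet about the non-ISO type.
__extension__ typedef __int128 i128;
__extension__ typedef unsigned __int128 u128;
static_assert(looks_unsigned<u128>, "works without is_integral support");
static_assert(!looks_unsigned<i128>, "works without is_integral support");
#endif
```

A check of this kind gives the same answer in strict and non-strict modes,
which is the property the internal assertions need.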

libstdc++-v3/ChangeLog:

* include/bits/charconv.h (__integer_to_chars_is_unsigned):
New variable template.
(__to_chars_len, __to_chars_10_impl): Use variable template in
assertions to allow unsigned __int128 in strict mode.
* include/std/charconv (__to_chars, __to_chars_16)
(__to_chars_10, __to_chars_8, __to_chars_2): Likewise.
---
 libstdc++-v3/include/bits/charconv.h | 18 ++
 libstdc++-v3/include/std/charconv| 19 +--
 2 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/libstdc++-v3/include/bits/charconv.h 
b/libstdc++-v3/include/bits/charconv.h
index d04aab77624..103cfcb8177 100644
--- a/libstdc++-v3/include/bits/charconv.h
+++ b/libstdc++-v3/include/bits/charconv.h
@@ -35,19 +35,28 @@
 #if __cplusplus >= 201103L
 
 #include 
+#include 
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 namespace __detail
 {
+#if __cpp_variable_templates
+  // This accepts 128-bit integers even in strict mode.
+  template
+constexpr bool __integer_to_chars_is_unsigned
+  = ! __gnu_cxx::__int_traits<_Tp>::__is_signed;
+#endif
+
   // Generic implementation for arbitrary bases.
   template
 _GLIBCXX14_CONSTEXPR unsigned
 __to_chars_len(_Tp __value, int __base = 10) noexcept
 {
-  static_assert(is_integral<_Tp>::value, "implementation bug");
-  static_assert(is_unsigned<_Tp>::value, "implementation bug");
+#if __cpp_variable_templates
+  static_assert(__integer_to_chars_is_unsigned<_Tp>, "implementation bug");
+#endif
 
   unsigned __n = 1;
   const unsigned __b2 = __base  * __base;
@@ -71,8 +80,9 @@ namespace __detail
 _GLIBCXX23_CONSTEXPR void
 __to_chars_10_impl(char* __first, unsigned __len, _Tp __val) noexcept
 {
-  static_assert(is_integral<_Tp>::value, "implementation bug");
-  static_assert(is_unsigned<_Tp>::value, "implementation bug");
+#if __cpp_variable_templates
+  static_assert(__integer_to_chars_is_unsigned<_Tp>, "implementation bug");
+#endif
 
   constexpr char __digits[201] =
"0001020304050607080910111213141516171819"
diff --git a/libstdc++-v3/include/std/charconv 
b/libstdc++-v3/include/std/charconv
index c5ed6fac73b..8f02395172f 100644
--- a/libstdc++-v3/include/std/charconv
+++ b/libstdc++-v3/include/std/charconv
@@ -88,6 +88,10 @@ namespace __detail
 using __integer_to_chars_result_type
   = enable_if_t<__or_<__is_signed_integer<_Tp>,
  __is_unsigned_integer<_Tp>,
+#if defined __SIZEOF_INT128__ && defined __STRICT_ANSI__
+ is_same<_Tp, signed __int128>,
+ is_same<_Tp, unsigned __int128>,
+#endif
  is_same>>::value,
to_chars_result>;
 
@@ -126,8 +130,7 @@ namespace __detail
 constexpr to_chars_result
 __to_chars(char* __first, char* __last, _Tp __val, int __base) noexcept
 {
-  static_assert(is_integral<_Tp>::value, "implementation bug");
-  static_assert(is_unsigned<_Tp>::value, "implementation bug");
+  static_assert(__integer_to_chars_is_unsigned<_Tp>, "implementation bug");
 
   to_chars_result __res;
 
@@ -167,8 +170,7 @@ namespace __detail
 constexpr __integer_to_chars_result_type<_Tp>
 __to_chars_16(char* __first, char* __last, _Tp __val) noexcept
 {
-  static_assert(is_integral<_Tp>::value, "implementation bug");
-  static_assert(is_unsigned<_Tp>::value, "implementation bug");
+  static_assert(__integer_to_chars_is_unsigned<_Tp>, "implementation bug");
 
   to_chars_result __res;
 
@@ -214,8 +216,7 @@ namespace __detail
 constexpr __integer_to_chars_result_type<_Tp>
 __to_chars_10(char* __first, char* __last, _Tp __val) noexcept
 {
-  static_assert(is_integral<_Tp>::value, "implementation bug");
-  static_assert(is_unsigned<_Tp>::value, "implementation bug");
+  static_assert(__integer_to_chars_is_unsigned<_Tp>, "implementation bug");
 
   to_chars_result __res;
 
@@ -238,8 +239,7 @@ namespace __detail
 constexpr __integer_to_chars_result_type<_Tp>
 __to_chars_8(char* __first, char* __last, _Tp __val) noexcept
 {
-  static_assert(is_integral<_Tp>::value, "implementation bug");
-  static_assert(is_unsigned<_Tp>::value, "implementation bug");
+  static_assert(__integer_to_chars_is_unsigned<_Tp>, "implementation bug");
 
   to_chars_result __res;
   unsigned __len;
@@ -292,8 +292,7 @@ namespace __detail
 constexpr __integer_to_chars_result_type<_Tp>
 

Re: [PATCH] Handle epilogues that contain jumps

2022-11-12 Thread Jeff Law via Gcc-patches



On 11/11/22 09:19, Richard Sandiford via Gcc-patches wrote:

The prologue/epilogue pass allows the prologue sequence
to contain jumps.  The sequence is then partitioned into
basic blocks using find_many_sub_basic_blocks.

This patch treats epilogues in the same way.  It's needed for
a follow-on aarch64 patch that adds conditional code to both
the prologue and the epilogue.

Tested on aarch64-linux-gnu (including with a follow-on patch)
and x86_64-linux-gnu.  OK to install?

Richard


gcc/
* function.cc (thread_prologue_and_epilogue_insns): Handle
epilogues that contain jumps.


OK

jeff




Re: [PATCH] Allow targets to add USEs to asms

2022-11-12 Thread Jeff Law via Gcc-patches



On 11/11/22 10:30, Richard Sandiford via Gcc-patches wrote:

Arm's SME has an array called ZA that for inline asm purposes
is effectively a form of special-purpose memory.  It doesn't
have an associated storage type and so can't be passed and
returned in normal C/C++ objects.

We'd therefore like "za" in a clobber list to mean that an inline
asm can read from and write to ZA.  (Just reading or writing
individually is unlikely to be useful, but we could add syntax
for that too if necessary.)

There is currently a TARGET_MD_ASM_ADJUST target hook that allows
targets to add clobbers to an asm instruction.  This patch
extends that to allow targets to add USEs as well.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  Also tested by
building cc1 for one target per affected CPU.  OK to install?

Richard


gcc/
* target.def (md_asm_adjust): Add a uses parameter.
* doc/gccint/target-macros/tm.rst.in: Regenerate.
* cfgexpand.cc (expand_asm_loc): Update call to md_asm_adjust.
Handle any USEs created by the target.
(expand_asm_stmt): Likewise.
* recog.cc (asm_noperands): Handle asms with USEs.
(decode_asm_operands): Likewise.
* config/arm/aarch-common-protos.h (arm_md_asm_adjust): Add uses
parameter.
* config/arm/aarch-common.cc (arm_md_asm_adjust): Likewise.
* config/arm/arm.cc (thumb1_md_asm_adjust): Likewise.
* config/avr/avr.cc (avr_md_asm_adjust): Likewise.
* config/cris/cris.cc (cris_md_asm_adjust): Likewise.
* config/i386/i386.cc (ix86_md_asm_adjust): Likewise.
* config/mn10300/mn10300.cc (mn10300_md_asm_adjust): Likewise.
* config/nds32/nds32.cc (nds32_md_asm_adjust): Likewise.
* config/pdp11/pdp11.cc (pdp11_md_asm_adjust): Likewise.
* config/rs6000/rs6000.cc (rs6000_md_asm_adjust): Likewise.
* config/s390/s390.cc (s390_md_asm_adjust): Likewise.
* config/vax/vax.cc (vax_md_asm_adjust): Likewise.
* config/visium/visium.cc (visium_md_asm_adjust): Likewise.


OK

jeff




Re: [PATCH] RISC-V: optimize '(a >= 0) ? b : 0' to srai + andn, if compiling for Zbb

2022-11-12 Thread Jeff Law via Gcc-patches



On 11/8/22 12:54, Philipp Tomsich wrote:

If-conversion is turning '(a >= 0) ? b : 0' into a branchless sequence
not a5,a0
srai a5,a5,63
and a0,a1,a5
missing the opportunity to combine the NOT and AND into an ANDN.

This adds a define_split to help the combiner reassociate the NOT with
the AND.
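
For readers who want to check the equivalence, the following sketch models
the srai + andn form in C++. The function names are hypothetical, and the
code assumes that right-shifting a negative signed value is an arithmetic
shift (guaranteed from C++20, implementation-defined before that, and the
behaviour of GCC in practice).

```cpp
#include <cstdint>

// Reference form: (a >= 0) ? b : 0.
inline std::int64_t select_ref(std::int64_t a, std::int64_t b)
{
  return a >= 0 ? b : 0;
}

// Branchless form corresponding to srai + andn.
inline std::int64_t select_branchless(std::int64_t a, std::int64_t b)
{
  // srai a5,a0,63: all ones if a is negative, zero otherwise
  // (assumes an arithmetic right shift).
  const std::int64_t mask = a >> 63;
  // andn a0,a1,a5: b & ~mask keeps b only when the mask is zero.
  return b & ~mask;
}
```

Folding the NOT into the AND is what removes one instruction from the
three-instruction sequence quoted above.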


gcc/ChangeLog:

* config/riscv/bitmanip.md: New define_split.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbb-srai-andn.c: New test.


OK.


FWIW, combine can be pretty sneaky in manipulating the result of a scc 
style insn.    I've seen a port with pages and pages of special patterns 
to match what simplify_if_then_else would do.



Jeff



Re: [PATCH] testsuite: Fix up pr107541.c test

2022-11-12 Thread Jeff Law via Gcc-patches



On 11/8/22 04:50, Jakub Jelinek via Gcc-patches wrote:

On Mon, Nov 07, 2022 at 12:42:38PM +0100, Aldy Hernandez via Gcc-patches wrote:

* gcc.dg/tree-ssa/pr107541.c: New test.

The test fails when long is 32-bit rather than 64-bit (say x86_64 with
RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} tree-ssa.exp=pr107541.c'
).
I've tweaked it to use long long so it passes even on the 32-bit
targets, and added an early out for weirdo targets because I think
the test assumes the usual 1/2/4/8 bytes sizes for char/short/int/long long.

Tested on x86_64-linux, ok for trunk?

2022-11-08  Jakub Jelinek  

PR tree-optimization/107541
* gcc.dg/tree-ssa/pr107541.c (c): Use long long type rather than long.
(main): Punt if sizeof short isn't 2, or int 4, or long long 8.


OK

jeff




Re: [PATCH 4/5] value-range: Add as_string diagnostics helper

2022-11-12 Thread Andrew Pinski via Gcc-patches
On Sat, Nov 12, 2022 at 3:47 PM Bernhard Reutner-Fischer via
Gcc-patches  wrote:
>
> gcc/ChangeLog:
>
> * value-range.cc (get_bound_with_infinite_markers): New static helper.
> (irange::as_string): New definition.
> * value-range.h: New declaration.
>
> ---
> Provide means to print a value range to a newly allocated buffer.
> The caller is responsible to free() the allocated memory.
>
> Bootstrapped and regtested on x86_64-unknown-linux with no regressions.
> Ok for trunk?
>
> Cc: Andrew MacLeod 
> Cc: Aldy Hernandez 
> ---
>  gcc/value-range.cc | 56 ++
>  gcc/value-range.h  |  3 +++
>  2 files changed, 59 insertions(+)
>
> diff --git a/gcc/value-range.cc b/gcc/value-range.cc
> index a855aaf626c..51cd9a38d90 100644
> --- a/gcc/value-range.cc
> +++ b/gcc/value-range.cc
> @@ -3099,6 +3099,62 @@ debug (const value_range )
>fprintf (stderr, "\n");
>  }
>
> +/* Helper for irange::as_string().  Print a bound to an allocated buffer.  */
> +static char *

Can we start using std::string instead of char* here?


> +get_bound_with_infinite_markers (tree bound)
> +{
> +  tree type = TREE_TYPE (bound);
> +  wide_int type_min = wi::min_value (TYPE_PRECISION (type), TYPE_SIGN 
> (type));
> +  wide_int type_max = wi::max_value (TYPE_PRECISION (type), TYPE_SIGN 
> (type));
> +
> +  if (INTEGRAL_TYPE_P (type)
> +  && !TYPE_UNSIGNED (type)
> +  && TREE_CODE (bound) == INTEGER_CST
> +  && wi::to_wide (bound) == type_min
> +  && TYPE_PRECISION (type) != 1)
> +return xstrdup ("-INF");
> +  else if (TREE_CODE (bound) == INTEGER_CST
> +  && wi::to_wide (bound) == type_max
> +  && TYPE_PRECISION (type) != 1)
> +return xstrdup ("+INF");
> +  else
> +return print_generic_expr_to_str (bound);
No reason to do xstrdup any more either.

> +}
> +
> +
> +/* Return an irange as string. Return NULL on failure, an allocated
> +   string on success.  */
> +char *

Likewise.

Thanks,
Andrew Pinski

> +irange::as_string ()
> +{
> +  char *ret = NULL;
This becomes std::string ret;
> +  if (undefined_p() || varying_p () || m_num_ranges == 0)
> +return ret;
> +
> +  for (unsigned i = 0; i < m_num_ranges; ++i)
> +{
> +  tree lb = m_base[i * 2];
> +  tree ub = m_base[i * 2 + 1];
> +  /* Construct [lower_bound,upper_bound].  */
> +  char *lbs = get_bound_with_infinite_markers (lb);
> +  char *ubs = get_bound_with_infinite_markers (ub);
> +  /* Paranoia mode */
> +  if (!lbs)
> +   lbs = xstrdup ("");
> +  if (!ubs)
> +   ubs = xstrdup ("");
> +
> +  if (ret)
> +   ret = reconcat (ret, ret, "[", lbs, ",", ubs, "]", NULL);
> +  else
> +   ret = concat ("[", lbs, ",", ubs, "]", NULL);
> +
> +  free (lbs);
> +  free (ubs);
> +}
> +  return ret;
> +}
> +
>  /* Create two value-ranges in *VR0 and *VR1 from the anti-range *AR
> so that *VR0 U *VR1 == *AR.  Returns true if that is possible,
> false otherwise.  If *AR can be represented with a single range
> diff --git a/gcc/value-range.h b/gcc/value-range.h
> index c87734dd8cd..76242e4bf45 100644
> --- a/gcc/value-range.h
> +++ b/gcc/value-range.h
> @@ -160,6 +160,9 @@ public:
>wide_int get_nonzero_bits () const;
>void set_nonzero_bits (const wide_int_ref );
>
> +  // For diagnostics.
> +  char *as_string ();
> +
>// Deprecated legacy public methods.
>tree min () const;   // DEPRECATED
>tree max () const;   // DEPRECATED
> --
> 2.38.1
>
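
The std::string suggestion from the review can be sketched in isolation. This
is not GCC code: format_ranges is a hypothetical stand-in for the loop in
irange::as_string, and it assumes each bound has already been printed to a
string (for example "-INF", "+INF" or a decimal value).

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the loop in irange::as_string: each pair holds
// the already-printed lower and upper bound of one sub-range.
inline std::string
format_ranges(const std::vector<std::pair<std::string, std::string>>& bounds)
{
  std::string ret;  // an empty string takes the place of the NULL return
  for (const auto& b : bounds)
    {
      // Construct [lower_bound,upper_bound] and append it.
      ret += '[';
      ret += b.first;
      ret += ',';
      ret += b.second;
      ret += ']';
    }
  return ret;
}
```

With value semantics there is nothing for the caller to free(), and the
xstrdup, concat and reconcat calls have no counterpart in this version.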


[PATCH 5/5] gimple: Add pass to note possible type demotions; IPA pro/demotion; DO NOT MERGE

2022-11-12 Thread Bernhard Reutner-Fischer via Gcc-patches
gcc/ChangeLog:

* Makefile.in (OBJS): Add gimple-warn-types.o.
* passes.def: Add pass_warn_type_demotion.
* tree-pass.h (make_pass_warn_type_demotion): New declaration.
* gimple-warn-types.cc: New file.

gcc/c-family/ChangeLog:

* c.opt (Wtype-demotion): New.

---
DO NOT MERGE.
This is the script^Wpass to emit a warning if a function's return type
could potentially be narrowed.
What would probably be useful is to equip an IPA pass with that
knowledge and demote return types of functions that do not contribute to
an external interface to the smallest possible type. The idea is that if
a target determines late via targetm.calls.promote_prototypes to promote
return values, the target will ultimately have the final say about
return types while for non-exported functions we can narrow types for
size- or speed reasons as we see fit.

This hunk does not implement an IPA pass that would do anything really
useful but merely queries the ranger to see if there are possibilities
to demote a return type, any (!) return type.
For the IPA real thing, we'd want to notice that if a function returns a
singleton, the caller would just use that singleton and demote the
callee to void. And in the caller, we'd appropriately shift the callee's
return value to the required range/value.
The usual trouble makers come to mind: qsort helpers that insist on int
return codes (that we could extend to int).

As said, that's just for your amusement and is not meant to be merged.
---
 gcc/Makefile.in  |   1 +
 gcc/c-family/c.opt   |   6 +
 gcc/gimple-warn-types.cc | 441 +++
 gcc/passes.def   |   1 +
 gcc/tree-pass.h  |   1 +
 5 files changed, 450 insertions(+)
 create mode 100644 gcc/gimple-warn-types.cc

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index f672e6ea549..c6901ececd4 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1433,6 +1433,7 @@ OBJS = \
gimple-streamer-out.o \
gimple-walk.o \
gimple-warn-recursion.o \
+   gimple-warn-types.o \
gimplify.o \
gimplify-me.o \
godump.o \
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 63a300ecd7c..0b46669e2b7 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1005,6 +1005,12 @@ Wtemplates
 C++ ObjC++ Var(warn_templates) Warning
 Warn on primary template declaration.
 
+Wtype-demotion
+C ObjC C++ ObjC++ Var(warn_type_demotion) Warning LangEnabledBy(C ObjC C++ 
ObjC++, Wall)
+Warn if function return type could be narrowed or demoted.
+; function return type, parameter type, variable type.
+; if values used for a type indicate that the type could use a narrower mode.
+
 Wmissing-attributes
 C ObjC C++ ObjC++ Var(warn_missing_attributes) Warning LangEnabledBy(C ObjC 
C++ ObjC++,Wall)
 Warn about declarations of entities that may be missing attributes
diff --git a/gcc/gimple-warn-types.cc b/gcc/gimple-warn-types.cc
new file mode 100644
index 000..e0b7212a1bb
--- /dev/null
+++ b/gcc/gimple-warn-types.cc
@@ -0,0 +1,441 @@
+/* Pass to detect and issue warnings about possibly using narrower types.
+
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Bernhard Reutner-Fischer 
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it under
+   the terms of the GNU General Public License as published by the Free
+   Software Foundation; either version 3, or (at your option) any later
+   version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or
+   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+   for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "tree.h"
+#include "gimple.h"
+#include "tree-pass.h"
+#include "pointer-query.h"
+#include "ssa.h"
+#include "gimple-pretty-print.h"
+#include "diagnostic-core.h"
+#include "fold-const.h"
+#include "gimple-iterator.h"
+#include "tree-dfa.h" //
+#include "tree-ssa.h"
+#include "tree-cfg.h"
+#include "tree-object-size.h" //
+#include "calls.h" //
+#include "cfgloop.h" //
+#include "intl.h"
+#include "gimple-range.h"
+
+#include "value-range.h"
+#include "gimple-range-path.h"
+#include "gcc-rich-location.h"
+#include "langhooks.h"
+#pragma GCC optimize("O0")
+namespace {
+
+const pass_data pass_data_wtype_demotion = {
+  GIMPLE_PASS,
+  "wtype_demotion",
+  OPTGROUP_NONE,
+  TV_NONE,
+  PROP_cfg, /* properties_required.  */
+  0,   /* properties_provided.  */
+  0,   /* properties_destroyed.  */
+  0,   /* properties_start */
+  0,   /* properties_finish */
+};
+
+class pass_wtype_demotion : public gimple_opt_pass
+{
+ public:
+  

[PATCH 3/5] Fortran: Narrow return types [PR78798]

2022-11-12 Thread Bernhard Reutner-Fischer via Gcc-patches
gcc/fortran/ChangeLog:

* arith.cc (compare_complex): Use narrower return type.
(gfc_compare_string): Likewise.
* arith.h (gfc_compare_string): Same.
(gfc_compare_with_Cstring): Ditto.
* array.cc (compare_bounds): Ditto.
(gfc_compare_array_spec): Likewise.
(is_constant_element): Likewise.
(gfc_constant_ac): Likewise.
* check.cc (dim_rank_check): Likewise.
* cpp.cc (gfc_cpp_init_options): Likewise.
(dump_macro): Likewise.
* cpp.h (gfc_cpp_handle_option): Likewise.
* dependency.cc (gfc_ref_needs_temporary_p): Likewise.
(gfc_check_argument_dependency): Likewise.
(gfc_check_fncall_dependency): Likewise.
(ref_same_as_full_array): Likewise.
* dependency.h (gfc_check_fncall_dependency): Likewise.
(gfc_dep_resolver): Likewise.
(gfc_are_equivalenced_arrays): Likewise.
* expr.cc (gfc_copy_ref): Likewise.
(gfc_kind_max): Likewise.
(numeric_type): Likewise.
* gfortran.h (gfc_at_end): Likewise.
(gfc_at_eof): Likewise.
(gfc_at_bol): Likewise.
(gfc_at_eol): Likewise.
(gfc_check_include): Likewise.
(gfc_define_undef_line): Likewise.
(gfc_wide_is_printable): Likewise.
(gfc_wide_is_digit): Likewise.
(gfc_wide_fits_in_byte): Likewise.
(get_c_kind): Likewise.
(gfc_find_sym_tree): Likewise.
(gfc_generic_intrinsic): Likewise.
(gfc_specific_intrinsic): Likewise.
(gfc_intrinsic_actual_ok): Likewise.
(gfc_has_vector_index): Likewise.
(gfc_numeric_ts): Likewise.
(gfc_impure_variable): Likewise.
(gfc_pure): Likewise.
(gfc_implicit_pure): Likewise.
(gfc_elemental): Likewise.
(gfc_pure_function): Likewise.
(gfc_implicit_pure_function): Likewise.
(gfc_compare_array_spec): Likewise.
(gfc_constant_ac): Likewise.
(gfc_expanded_ac): Likewise.
(gfc_check_digit): Likewise.
* intrinsic.cc (gfc_find_subroutine): Likewise.
(gfc_generic_intrinsic): Likewise.
(gfc_specific_intrinsic): Likewise.
* io.cc (compare_to_allowed_values): Likewise.
* misc.cc (gfc_done_2): Likewise.
* parse.cc: Likewise.
* parse.h (gfc_check_do_variable): Likewise.
* primary.cc (gfc_check_digit): Likewise.
* resolve.cc (resolve_structure_cons): Likewise.
(pure_stmt_function): Likewise.
(gfc_pure_function): Likewise.
(impure_stmt_fcn): Likewise.
(resolve_forall_iterators): Likewise.
(resolve_data): Likewise.
(gfc_impure_variable): Likewise.
(gfc_pure): Likewise.
(gfc_unset_implicit_pure): Likewise.
* scanner.cc (wide_is_ascii): Likewise.
(gfc_wide_toupper): Likewise.
(gfc_open_included_file): Likewise.
(gfc_at_end): Likewise.
(gfc_at_eof): Likewise.
(gfc_at_bol): Likewise.
(skip_comment_line): Likewise.
(gfc_gobble_whitespace): Likewise.
* symbol.cc (gfc_find_symtree_in_proc): Likewise.
* target-memory.cc (size_integer): Likewise.
(size_complex): Likewise.
* trans-array.cc: Likewise.
* trans-decl.cc (gfc_set_decl_assembler_name): Likewise.
* trans-types.cc (gfc_get_element_type): Likewise.
(gfc_add_field_to_struct): Likewise.
* trans-types.h (gfc_copy_dt_decls_ifequal): Likewise.
(gfc_return_by_reference): Likewise.
(gfc_is_nodesc_array): Likewise.
* trans.h (gfc_can_put_var_on_stack): Likewise.
---
Bootstrapped and regtested on x86_64-unknown-linux with no regressions.
Ok for trunk?

Cc: fort...@gcc.gnu.org
---
 gcc/fortran/arith.cc |  4 +--
 gcc/fortran/arith.h  |  4 +--
 gcc/fortran/array.cc |  8 +++---
 gcc/fortran/check.cc |  2 +-
 gcc/fortran/cpp.cc   |  3 +--
 gcc/fortran/cpp.h|  2 +-
 gcc/fortran/dependency.cc|  8 +++---
 gcc/fortran/dependency.h |  6 ++---
 gcc/fortran/expr.cc  |  6 ++---
 gcc/fortran/gfortran.h   | 51 ++--
 gcc/fortran/intrinsic.cc |  6 ++---
 gcc/fortran/io.cc| 13 ++---
 gcc/fortran/misc.cc  |  2 +-
 gcc/fortran/parse.cc |  2 +-
 gcc/fortran/parse.h  |  2 +-
 gcc/fortran/primary.cc   |  4 +--
 gcc/fortran/resolve.cc   | 22 
 gcc/fortran/scanner.cc   | 20 +++---
 gcc/fortran/symbol.cc|  2 +-
 gcc/fortran/target-memory.cc |  6 ++---
 gcc/fortran/trans-array.cc   |  2 +-
 gcc/fortran/trans-decl.cc|  2 +-
 gcc/fortran/trans-types.cc   |  6 ++---
 gcc/fortran/trans-types.h|  6 ++---
 gcc/fortran/trans.h  |  2 +-
 25 files changed, 90 insertions(+), 101 deletions(-)

diff --git a/gcc/fortran/arith.cc b/gcc/fortran/arith.cc
index fc9224ebc5c..55f35ea66be 100644
--- a/gcc/fortran/arith.cc

[PATCH 0/5] function result decl location; type demotion

2022-11-12 Thread Bernhard Reutner-Fischer via Gcc-patches
Hi!

The location of function result declarations was not set.
The first two patches set the location of normal functions in C and C++.

Jason, Nathan, I failed to support C++ template functions, see below.

TL;DR.
Why all this?
PR78798 noted that we should use narrower function return types if feasible.

One way to take that idea forward is to determine the range a
function can return and match that range against the range provided by
the actual return type. If we waste bits, warn and, ideally, suggest a
better alternative. Ideally with a fix-it hint for the narrower type and
with a patch.
David's tremendously useful work on diagnostics makes both user-facing
aspects rather easy to achieve (thanks, once again, David!).
Ideally, one would be able to accumulate such suggested fix-it hints
driven patches by stating something like:
  ... -Wtype-demotion -fdiagnostics-generate-patch=>>/tmp/hmz.patch
i.e. have ways to direct the ...-generate-patch to create/append to some
given path. But that doesn't seem to work for me or i did not read the
documentation carefully enough. awk to the rescue for the full buildlog
output to extract the patch(es) but not all that user-friendly i fear.
How would i write a combined patch out to some given path, David?
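
As a rough illustration of the range-matching idea described above, here
is a minimal standalone sketch. It is not the implementation from this
series: narrowest_type_for is a hypothetical helper, and the list of
candidate types is an assumption chosen only for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical helper (not part of GCC): given the inclusive range
// [lo, hi] of values a function can return, name the narrowest
// candidate type that can still hold every value in that range.
std::string narrowest_type_for (std::int64_t lo, std::int64_t hi)
{
  assert (lo <= hi);  // An empty range makes no sense here.

  if (lo >= 0 && hi <= 1)
    return "bool";
  if (lo >= std::numeric_limits<std::int8_t>::min ()
      && hi <= std::numeric_limits<std::int8_t>::max ())
    return "signed char";
  if (lo >= 0 && hi <= std::numeric_limits<std::uint8_t>::max ())
    return "unsigned char";
  if (lo >= std::numeric_limits<std::int16_t>::min ()
      && hi <= std::numeric_limits<std::int16_t>::max ())
    return "short";
  if (lo >= 0 && hi <= std::numeric_limits<std::uint16_t>::max ())
    return "unsigned short";
  if (lo >= std::numeric_limits<std::int32_t>::min ()
      && hi <= std::numeric_limits<std::int32_t>::max ())
    return "int";
  if (lo >= 0 && hi <= std::numeric_limits<std::uint32_t>::max ())
    return "unsigned int";
  return "long long";
}
```

A function declared to return unsigned long whose result range is [0,1]
would map to "bool" under this sketch, which is the kind of suggestion
the warning is meant to make.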

Patch 1 handles locations for the C FE, this works perfectly fine.

Patch 2 handles locations for normal, non-template functions in C++
and these work fine, too.

Patch 3, the actual Fortran bits work fine and are sound (good job, Aldy
and Andrew!)

Patch 4 is a way to print the actual range to some diagnostics.
I wrote this about a year ago when only irange was available.
And from what i've heard, it's doubtful if such an as_string() is
considered useful. So i post it just for reference and don't ask for
inclusion of such a facility. Nevertheless i think that would be useful
if not just for debugging and dumping (but please to a buffer, too, so
one can hijack it ;)

Patch 5 is not to be merged for obvious reasons. It is way too chatty,
doesn't run in IPA so probably ruins any "comparer" like function.
I've compiled one or two userspace, integer programs and it is not
completely off, from the looks.


C++ template functions.
===
I couldn't make this work, the mechanics in start_preparsed_function are
beyond what i could grok in a couple of evenings TBH.

Can you maybe help, Jason or Nathan?

I tried several spots.
Directly in start_preparsed_function, in grokmethod, grokfndecl after
type = build_cp_fntype_variant, in grokfndecl if (funcdef_flag), in
finish_function but to no avail.
To me it seems that most locations are unset/ broken in the C++ FE for
the template path, or, more likely, i'm unable to operate them properly
to be fair.
I even tried the enterprise-level idea to get something vaguely around
the result decl in a template by (please don't cite me):

@@ -18214,6 +18260,17 @@ grokmethod (cp_decl_specifier_seq *declspecs,
   DECL_NO_INLINE_WARNING_P (fndecl) = 1;
 }
 
+  /* Set the location of the result decl, approximately.  */
+  tree result;
+  if ((result = DECL_RESULT (fndecl))
+  && DECL_SOURCE_LOCATION (result) == UNKNOWN_LOCATION)
+for (int i = (int)ds_first; i != (int)ds_last; ++i)
+  if (location_t loc = declspecs->locations[i])
+   {
+ DECL_SOURCE_LOCATION (result) = loc;
+ break;
+   }
+

but there's nothing much there in my POV?

My C++ template based testcase was this:

$ cat return-narrow-2.cc ; echo EOF
namespace std { template < typename, typename > struct pair; }
template < typename > struct __mini_vector
{
  int _M_finish;
  unsigned long
  _M_space_left()
  { return _M_finish != 0; }
};
 template class __mini_vector< std::pair< long, long > >;
 template class __mini_vector< int >;
EOF

Where the locations are all confused (maybe a tad different on trunk):
$ ../gcc/xg++ -B../gcc -c -o /tmp/foo.o return-narrow-2.cc -O -Wtype-demotion
return-narrow-2.cc: In member function ‘long unsigned int __mini_vector< 
 >::_M_space_left() [with  = 
std::pair]’:
return-narrow-2.cc:6:3: warning: Function ‘_M_space_left’ could return ‘bool’ 
[-Wtype-demotion]
6 |   _M_space_left()
  |   ^
  |   bool
return-narrow-2.cc:6:3: note: with a range of [0,1]
return-narrow-2.cc: In member function ‘long unsigned int __mini_vector< 
 >::_M_space_left() [with  = 
int]’:
return-narrow-2.cc:6:3: warning: Function ‘_M_space_left’ could return ‘bool’ 
[-Wtype-demotion]
6 |   _M_space_left()
  |   ^
  |   bool
return-narrow-2.cc:6:3: note: with a range of [0,1]


The normal C function and C++ function (non-template) dummy tests are all
fine and work as i had expected:
$ cat return-narrow.cc ; echo EOF
int xyz (int param1, int param2, int param3)
{
if (param1 == 42)
return 1;
if (param2 == 17)
return 1;
if (param3 == 99)
return 1;
return 0;
}
int abc (int param1, int param2, int 

[PATCH 1/5] c: Set the locus of the function result decl

2022-11-12 Thread Bernhard Reutner-Fischer via Gcc-patches
Bootstrapped and regtested on x86_64-unknown-linux with no regressions.
Ok for trunk?

Cc: Joseph Myers 
---
gcc/c/ChangeLog:

* c-decl.cc (start_function): Set the result decl source
location to the location of the typespec.
---
 gcc/c/c-decl.cc | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index a99b7456055..5250cb96c41 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9980,6 +9980,7 @@ start_function (struct c_declspecs *declspecs, struct 
c_declarator *declarator,
   tree decl1, old_decl;
   tree restype, resdecl;
   location_t loc;
+  location_t result_loc;
 
   current_function_returns_value = 0;  /* Assume, until we see it does.  */
   current_function_returns_null = 0;
@@ -10206,8 +10207,11 @@ start_function (struct c_declspecs *declspecs, struct 
c_declarator *declarator,
   push_scope ();
   declare_parm_level ();
 
+  /* Set the result decl source location to the location of the typespec.  */
+  result_loc = (declspecs->locations[cdw_typespec] == UNKNOWN_LOCATION
+   ? loc : declspecs->locations[cdw_typespec]);
   restype = TREE_TYPE (TREE_TYPE (current_function_decl));
-  resdecl = build_decl (loc, RESULT_DECL, NULL_TREE, restype);
+  resdecl = build_decl (result_loc, RESULT_DECL, NULL_TREE, restype);
   DECL_ARTIFICIAL (resdecl) = 1;
   DECL_IGNORED_P (resdecl) = 1;
   DECL_RESULT (current_function_decl) = resdecl;
-- 
2.38.1



[PATCH 4/5] value-range: Add as_string diagnostics helper

2022-11-12 Thread Bernhard Reutner-Fischer via Gcc-patches
gcc/ChangeLog:

* value-range.cc (get_bound_with_infinite_markers): New static helper.
(irange::as_string): New definition.
* value-range.h: New declaration.

---
Provide means to print a value range to a newly allocated buffer.
The caller is responsible to free() the allocated memory.
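
For readers without a GCC tree at hand, the output format can be sketched
in standalone C++. This is only an approximation under stated
assumptions: it models sub-ranges as pairs of 32-bit signed integers,
uses std::string instead of a malloc'ed buffer (so nothing needs to be
freed), and the names subrange, bound_to_string and ranges_to_string are
hypothetical, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one [lower, upper] pair of an irange.
using subrange = std::pair<std::int32_t, std::int32_t>;

// Print a bound, replacing the type's extreme values by -INF / +INF,
// in the spirit of get_bound_with_infinite_markers in the patch.
std::string bound_to_string (std::int32_t bound)
{
  if (bound == std::numeric_limits<std::int32_t>::min ())
    return "-INF";
  if (bound == std::numeric_limits<std::int32_t>::max ())
    return "+INF";
  return std::to_string (bound);
}

// Concatenate "[lb,ub]" for every sub-range.  An empty input yields an
// empty string here, whereas the patch returns NULL in that case.
std::string ranges_to_string (const std::vector<subrange> &ranges)
{
  std::string ret;
  for (const subrange &r : ranges)
    {
      assert (r.first <= r.second);
      ret += "[" + bound_to_string (r.first) + ","
	     + bound_to_string (r.second) + "]";
    }
  return ret;
}
```

With this sketch a single sub-range from 0 to 1 prints as [0,1], the same
shape as the "with a range of [0,1]" note in the cover letter.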

Bootstrapped and regtested on x86_64-unknown-linux with no regressions.
Ok for trunk?

Cc: Andrew MacLeod 
Cc: Aldy Hernandez 
---
 gcc/value-range.cc | 56 ++
 gcc/value-range.h  |  3 +++
 2 files changed, 59 insertions(+)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index a855aaf626c..51cd9a38d90 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -3099,6 +3099,62 @@ debug (const value_range )
   fprintf (stderr, "\n");
 }
 
+/* Helper for irange::as_string().  Print a bound to an allocated buffer.  */
+static char *
+get_bound_with_infinite_markers (tree bound)
+{
+  tree type = TREE_TYPE (bound);
+  wide_int type_min = wi::min_value (TYPE_PRECISION (type), TYPE_SIGN (type));
+  wide_int type_max = wi::max_value (TYPE_PRECISION (type), TYPE_SIGN (type));
+
+  if (INTEGRAL_TYPE_P (type)
+  && !TYPE_UNSIGNED (type)
+  && TREE_CODE (bound) == INTEGER_CST
+  && wi::to_wide (bound) == type_min
+  && TYPE_PRECISION (type) != 1)
+return xstrdup ("-INF");
+  else if (TREE_CODE (bound) == INTEGER_CST
+  && wi::to_wide (bound) == type_max
+  && TYPE_PRECISION (type) != 1)
+return xstrdup ("+INF");
+  else
+return print_generic_expr_to_str (bound);
+}
+
+
+/* Return an irange as string. Return NULL on failure, an allocated
+   string on success.  */
+char *
+irange::as_string ()
+{
+  char *ret = NULL;
+  if (undefined_p() || varying_p () || m_num_ranges == 0)
+return ret;
+
+  for (unsigned i = 0; i < m_num_ranges; ++i)
+{
+  tree lb = m_base[i * 2];
+  tree ub = m_base[i * 2 + 1];
+  /* Construct [lower_bound,upper_bound].  */
+  char *lbs = get_bound_with_infinite_markers (lb);
+  char *ubs = get_bound_with_infinite_markers (ub);
+  /* Paranoia mode */
+  if (!lbs)
+   lbs = xstrdup ("");
+  if (!ubs)
+   ubs = xstrdup ("");
+
+  if (ret)
+   ret = reconcat (ret, ret, "[", lbs, ",", ubs, "]", NULL);
+  else
+   ret = concat ("[", lbs, ",", ubs, "]", NULL);
+
+  free (lbs);
+  free (ubs);
+}
+  return ret;
+}
+
 /* Create two value-ranges in *VR0 and *VR1 from the anti-range *AR
so that *VR0 U *VR1 == *AR.  Returns true if that is possible,
false otherwise.  If *AR can be represented with a single range
diff --git a/gcc/value-range.h b/gcc/value-range.h
index c87734dd8cd..76242e4bf45 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -160,6 +160,9 @@ public:
   wide_int get_nonzero_bits () const;
   void set_nonzero_bits (const wide_int_ref );
 
+  // For diagnostics.
+  char *as_string ();
+
   // Deprecated legacy public methods.
   tree min () const;   // DEPRECATED
   tree max () const;   // DEPRECATED
-- 
2.38.1



[PATCH 2/5] c++: Set the locus of the function result decl

2022-11-12 Thread Bernhard Reutner-Fischer via Gcc-patches
gcc/cp/ChangeLog:

* decl.cc (start_function): Set the result decl source location to
the location of the typespec.

---
Bootstrapped and regtested on x86_64-unknown-linux with no regressions.
Ok for trunk?

Cc: Nathan Sidwell 
Cc: Jason Merrill 
---
 gcc/cp/decl.cc | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 6e98ea35a39..ed40815e645 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -17449,6 +17449,8 @@ start_function (cp_decl_specifier_seq *declspecs,
tree attrs)
 {
   tree decl1;
+  tree result;
+  bool ret;
 
   decl1 = grokdeclarator (declarator, declspecs, FUNCDEF, 1, );
   invoke_plugin_callbacks (PLUGIN_START_PARSE_FUNCTION, decl1);
@@ -17461,7 +17463,18 @@ start_function (cp_decl_specifier_seq *declspecs,
 gcc_assert (same_type_p (TREE_TYPE (TREE_TYPE (decl1)),
 integer_type_node));
 
-  return start_preparsed_function (decl1, attrs, /*flags=*/SF_DEFAULT);
+  ret = start_preparsed_function (decl1, attrs, /*flags=*/SF_DEFAULT);
+
+  /* decl1 might be ggc_freed here.  */
+  decl1 = current_function_decl;
+
+  /* Set the result decl source location to the location of the typespec.  */
+  if (TREE_CODE (decl1) == FUNCTION_DECL
+  && declspecs->locations[ds_type_spec] != UNKNOWN_LOCATION
+  && (result = DECL_RESULT (decl1)) != NULL_TREE
+  && DECL_SOURCE_LOCATION (result) == input_location)
+DECL_SOURCE_LOCATION (result) = declspecs->locations[ds_type_spec];
+  return ret;
 }
 
 /* Returns true iff an EH_SPEC_BLOCK should be created in the body of
-- 
2.38.1



Re: [PATCH 7/7] ifcvt: add if-conversion to conditional-zero instructions

2022-11-12 Thread Philipp Tomsich
On Sat, 12 Nov 2022 at 22:47, Andrew Pinski  wrote:
>
> On Sat, Nov 12, 2022 at 1:34 PM Philipp Tomsich
>  wrote:
> >
> > Some architectures, as is the case on RISC-V with the proposed
> > ZiCondOps and the vendor-defined XVentanaCondOps, define a
> > conditional-zero instruction that is equivalent to:
> >  - the positive form:  rd = (rc != 0) ? rs : 0
> >  - the negated form:   rd = (rc == 0) ? rs : 0
> >
> > While noce_try_store_flag_mask will somewhat work for this case, it
> > will generate a number of atomic RTX that will misdirect the cost
> > calculation and may be too long (i.e., 4 RTX and more) to successfully
> > merge at combine-time.
> >
> > Instead, we add two new transforms that attempt to build up what we
> > define as the canonical form of a conditional-zero expression:
> >
> >   (set (match_operand 0 "register_operand" "=r")
> >(and (neg (eq_or_ne (match_operand 1 "register_operand" "r")
> >(const_int 0)))
> > (match_operand 2 "register_operand" "r")))
>
>
> Why is it not:
> (set x (if_then_else (eq_or_ne y (0)) z (0))
> (set x (if_then_else (ne y (0)) (0) z)
>
> That seems simpler to express and is the normal a==0?0:z expression.

Having an if_then_else come out of if-conversion would be a bit unusual, as
transformation to branchless is the intent of the entire exercise.

Existing if-conversion via noce_try_store_flag_mask and noce_try_store_flag
already catch these sequences—if that happens, the above representation
will be present during combine: i.e., we need to implement this match anyway
(and it also matches the typical idiom, if a programmer tries to express the
idiom in a branchless way).

Consequently, we decided to use the same pattern as the canonical
representation in case that if-conversion had occurred prior.

> Also all canonical forms of RTL should be documented too.
> They are documented here:
> https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gccint/Insn-Canonicalizations.html
> https://gcc.gnu.org/onlinedocs/gccint/machine-descriptions/canonicalization-of-instructions.html
> gcc/doc/gccint/machine-descriptions/canonicalization-of-instructions.rst
>
>
> Thanks,
> Andrew Pinski
>
> >
> > Architectures that provide a conditional-zero are thus expected to
> > define an instruction matching this pattern in their backend.
> >
> > Based on this, we support the following cases:
> >  - noce_try_condzero:
> >   a ? a : b
> >   a ? b : 0  (and then/else swapped)
> >  !a ? b : 0  (and then/else swapped)
> >  - noce_try_condzero_arith:
> >  conditional-plus, conditional-minus, conditional-and,
> >  conditional-or, conditional-xor, conditional-shift,
> >  conditional-and
> >
> > Given that this is hooked into the CE passes, it is less powerful than
> > a tree-pass (e.g., it can not transform cases where an extension, such
> > as for uint16_t operations is in either the then or else-branch
> > together with the arithmetic) but already covers a good array of cases
> > and triggers across SPEC CPU 2017.
> > Adding transformations in a tree pass will be considered as a future
> > improvement.
> >
> > gcc/ChangeLog:
> >
> > * ifcvt.cc (noce_emit_insn): Add prototype.
> > (noce_emit_condzero): Helper for noce_try_condzero and
> > noce_try_condzero_arith transforms.
> > (noce_try_condzero): New transform.
> > (noce_try_condzero_arith): New transform for conditional
> > arithmetic that can be built up by exploiting that the
> > conditional-zero instruction will inject 0, which acts
> > as the neutral element for operations.
> > (noce_process_if_block): Call noce_try_condzero and
> > noce_try_condzero_arith.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/xventanacondops-and-01.c: New test.
> > * gcc.target/riscv/xventanacondops-and-02.c: New test.
> > * gcc.target/riscv/xventanacondops-eq-01.c: New test.
> > * gcc.target/riscv/xventanacondops-eq-02.c: New test.
> > * gcc.target/riscv/xventanacondops-lt-01.c: New test.
> > * gcc.target/riscv/xventanacondops-ne-01.c: New test.
> > * gcc.target/riscv/xventanacondops-xor-01.c: New test.
> >
> > Signed-off-by: Philipp Tomsich 
> > ---
> >
> >  gcc/ifcvt.cc  | 214 ++
> >  .../gcc.target/riscv/xventanacondops-and-01.c |  16 ++
> >  .../gcc.target/riscv/xventanacondops-and-02.c |  15 ++
> >  .../gcc.target/riscv/xventanacondops-eq-01.c  |  11 +
> >  .../gcc.target/riscv/xventanacondops-eq-02.c  |  14 ++
> >  .../gcc.target/riscv/xventanacondops-lt-01.c  |  16 ++
> >  .../gcc.target/riscv/xventanacondops-ne-01.c  |  11 +
> >  .../gcc.target/riscv/xventanacondops-xor-01.c |  14 ++
> >  8 files changed, 311 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-and-01.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-and-02.c
> >  create 

Re: [PATCH 7/7] ifcvt: add if-conversion to conditional-zero instructions

2022-11-12 Thread Andrew Pinski via Gcc-patches
On Sat, Nov 12, 2022 at 1:34 PM Philipp Tomsich
 wrote:
>
> Some architectures, as is the case on RISC-V with the proposed
> ZiCondOps and the vendor-defined XVentanaCondOps, define a
> conditional-zero instruction that is equivalent to:
>  - the positive form:  rd = (rc != 0) ? rs : 0
>  - the negated form:   rd = (rc == 0) ? rs : 0
>
> While noce_try_store_flag_mask will somewhat work for this case, it
> will generate a number of atomic RTX that will misdirect the cost
> calculation and may be too long (i.e., 4 RTX and more) to successfully
> merge at combine-time.
>
> Instead, we add two new transforms that attempt to build up what we
> define as the canonical form of a conditional-zero expression:
>
>   (set (match_operand 0 "register_operand" "=r")
>(and (neg (eq_or_ne (match_operand 1 "register_operand" "r")
>(const_int 0)))
> (match_operand 2 "register_operand" "r")))


Why is it not:
(set x (if_then_else (eq_or_ne y (0)) z (0))
(set x (if_then_else (ne y (0)) (0) z)

That seems simpler to express and is the normal a==0?0:z expression.

Also all canonical forms of RTL should be documented too.
They are documented here:
https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gccint/Insn-Canonicalizations.html
https://gcc.gnu.org/onlinedocs/gccint/machine-descriptions/canonicalization-of-instructions.html
gcc/doc/gccint/machine-descriptions/canonicalization-of-instructions.rst


Thanks,
Andrew Pinski

>
> Architectures that provide a conditional-zero are thus expected to
> define an instruction matching this pattern in their backend.
>
> Based on this, we support the following cases:
>  - noce_try_condzero:
>   a ? a : b
>   a ? b : 0  (and then/else swapped)
>  !a ? b : 0  (and then/else swapped)
>  - noce_try_condzero_arith:
>  conditional-plus, conditional-minus, conditional-and,
>  conditional-or, conditional-xor, conditional-shift,
>  conditional-and
>
> Given that this is hooked into the CE passes, it is less powerful than
> a tree-pass (e.g., it can not transform cases where an extension, such
> as for uint16_t operations is in either the then or else-branch
> together with the arithmetic) but already covers a good array of cases
> and triggers across SPEC CPU 2017.
> Adding transformations in a tree pass will be considered as a future
> improvement.
>
> gcc/ChangeLog:
>
> * ifcvt.cc (noce_emit_insn): Add prototype.
> (noce_emit_condzero): Helper for noce_try_condzero and
> noce_try_condzero_arith transforms.
> (noce_try_condzero): New transform.
> (noce_try_condzero_arith): New transform for conditional
> arithmetic that can be built up by exploiting that the
> conditional-zero instruction will inject 0, which acts
> as the neutral element for operations.
> (noce_process_if_block): Call noce_try_condzero and
> noce_try_condzero_arith.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/xventanacondops-and-01.c: New test.
> * gcc.target/riscv/xventanacondops-and-02.c: New test.
> * gcc.target/riscv/xventanacondops-eq-01.c: New test.
> * gcc.target/riscv/xventanacondops-eq-02.c: New test.
> * gcc.target/riscv/xventanacondops-lt-01.c: New test.
> * gcc.target/riscv/xventanacondops-ne-01.c: New test.
> * gcc.target/riscv/xventanacondops-xor-01.c: New test.
>
> Signed-off-by: Philipp Tomsich 
> ---
>
>  gcc/ifcvt.cc  | 214 ++
>  .../gcc.target/riscv/xventanacondops-and-01.c |  16 ++
>  .../gcc.target/riscv/xventanacondops-and-02.c |  15 ++
>  .../gcc.target/riscv/xventanacondops-eq-01.c  |  11 +
>  .../gcc.target/riscv/xventanacondops-eq-02.c  |  14 ++
>  .../gcc.target/riscv/xventanacondops-lt-01.c  |  16 ++
>  .../gcc.target/riscv/xventanacondops-ne-01.c  |  11 +
>  .../gcc.target/riscv/xventanacondops-xor-01.c |  14 ++
>  8 files changed, 311 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-and-01.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-and-02.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-eq-01.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-eq-02.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-lt-01.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-ne-01.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-xor-01.c
>
> diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
> index eb8efb89a89..41c58876d05 100644
> --- a/gcc/ifcvt.cc
> +++ b/gcc/ifcvt.cc
> @@ -97,6 +97,7 @@ static int find_if_case_2 (basic_block, edge, edge);
>  static int dead_or_predicable (basic_block, basic_block, basic_block,
>edge, int);
>  static void noce_emit_move_insn (rtx, rtx);
> +static rtx_insn *noce_emit_insn (rtx);
>  static rtx_insn *block_has_only_trap (basic_block);
>  static 

[PATCH 7/7] ifcvt: add if-conversion to conditional-zero instructions

2022-11-12 Thread Philipp Tomsich
Some architectures, as is the case on RISC-V with the proposed
ZiCondOps and the vendor-defined XVentanaCondOps, define a
conditional-zero instruction that is equivalent to:
 - the positive form:  rd = (rc != 0) ? rs : 0
 - the negated form:   rd = (rc == 0) ? rs : 0

While noce_try_store_flag_mask will somewhat work for this case, it
will generate a number of atomic RTX that will misdirect the cost
calculation and may be too long (i.e., 4 RTX and more) to successfully
merge at combine-time.

Instead, we add two new transforms that attempt to build up what we
define as the canonical form of a conditional-zero expression:

  (set (match_operand 0 "register_operand" "=r")
   (and (neg (eq_or_ne (match_operand 1 "register_operand" "r")
   (const_int 0)))
(match_operand 2 "register_operand" "r")))

Architectures that provide a conditional-zero are thus expected to
define an instruction matching this pattern in their backend.
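
The equivalence between the conditional-zero semantics and the
(and (neg (eq_or_ne ...)) ...) shape above can be checked with a small
standalone sketch. The function names are hypothetical and 64-bit
unsigned values stand in for registers; this only illustrates the
arithmetic and is not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Positive form, written with a branch: rd = (rc != 0) ? rs : 0.
std::uint64_t condzero_ne_branchy (std::uint64_t rc, std::uint64_t rs)
{
  return rc != 0 ? rs : 0;
}

// Same value in the shape of the canonical pattern:
// (and (neg (ne rc 0)) rs).
std::uint64_t condzero_ne_masked (std::uint64_t rc, std::uint64_t rs)
{
  std::uint64_t flag = (rc != 0);  // 0 or 1
  std::uint64_t mask = 0 - flag;   // 0 or all-ones
  return mask & rs;
}

// Negated form: rd = (rc == 0) ? rs : 0, i.e. (and (neg (eq rc 0)) rs).
std::uint64_t condzero_eq_masked (std::uint64_t rc, std::uint64_t rs)
{
  std::uint64_t flag = (rc == 0);
  std::uint64_t mask = 0 - flag;
  return mask & rs;
}

// Hypothetical self-check: the masked and branchy spellings agree.
void check_condzero (std::uint64_t rc, std::uint64_t rs)
{
  assert (condzero_ne_masked (rc, rs) == condzero_ne_branchy (rc, rs));
  assert (condzero_eq_masked (rc, rs) == (rc == 0 ? rs : 0));
}
```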

Based on this, we support the following cases:
 - noce_try_condzero:
  a ? a : b
  a ? b : 0  (and then/else swapped)
 !a ? b : 0  (and then/else swapped)
 - noce_try_condzero_arith:
 conditional-plus, conditional-minus, conditional-and,
 conditional-or, conditional-xor, conditional-shift,
 conditional-and
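
The conditional-arithmetic cases rely on 0 being the neutral element of
the operation, as the ChangeLog entry below puts it. A minimal standalone
sketch of that idea follows; the names are hypothetical, and it covers
only operations for which 0 really is neutral (plus, minus, or, xor,
shift). How the patch treats conditional-and is not visible in this
excerpt, so it is left out here.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the conditional-zero instruction: (c != 0) ? v : 0.
std::uint64_t condzero (std::uint64_t c, std::uint64_t v)
{
  return c != 0 ? v : 0;
}

// "x = c ? a OP b : a" becomes "x = a OP condzero (c, b)" whenever
// "a OP 0 == a" holds.
std::uint64_t cond_add (std::uint64_t c, std::uint64_t a, std::uint64_t b)
{
  return a + condzero (c, b);
}

std::uint64_t cond_sub (std::uint64_t c, std::uint64_t a, std::uint64_t b)
{
  return a - condzero (c, b);
}

std::uint64_t cond_or (std::uint64_t c, std::uint64_t a, std::uint64_t b)
{
  return a | condzero (c, b);
}

std::uint64_t cond_xor (std::uint64_t c, std::uint64_t a, std::uint64_t b)
{
  return a ^ condzero (c, b);
}

std::uint64_t cond_shl (std::uint64_t c, std::uint64_t a, unsigned amount)
{
  assert (amount < 64);  // Larger shift counts are undefined in C++.
  return a << condzero (c, amount);
}
```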

Given that this is hooked into the CE passes, it is less powerful than
a tree-pass (e.g., it can not transform cases where an extension, such
as for uint16_t operations is in either the then or else-branch
together with the arithmetic) but already covers a good array of cases
and triggers across SPEC CPU 2017.
Adding transformations in a tree pass will be considered as a future
improvement.

gcc/ChangeLog:

* ifcvt.cc (noce_emit_insn): Add prototype.
(noce_emit_condzero): Helper for noce_try_condzero and
noce_try_condzero_arith transforms.
(noce_try_condzero): New transform.
(noce_try_condzero_arith): New transform for conditional
arithmetic that can be built up by exploiting that the
conditional-zero instruction will inject 0, which acts
as the neutral element for operations.
(noce_process_if_block): Call noce_try_condzero and
noce_try_condzero_arith.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xventanacondops-and-01.c: New test.
* gcc.target/riscv/xventanacondops-and-02.c: New test.
* gcc.target/riscv/xventanacondops-eq-01.c: New test.
* gcc.target/riscv/xventanacondops-eq-02.c: New test.
* gcc.target/riscv/xventanacondops-lt-01.c: New test.
* gcc.target/riscv/xventanacondops-ne-01.c: New test.
* gcc.target/riscv/xventanacondops-xor-01.c: New test.

Signed-off-by: Philipp Tomsich 
---

 gcc/ifcvt.cc  | 214 ++
 .../gcc.target/riscv/xventanacondops-and-01.c |  16 ++
 .../gcc.target/riscv/xventanacondops-and-02.c |  15 ++
 .../gcc.target/riscv/xventanacondops-eq-01.c  |  11 +
 .../gcc.target/riscv/xventanacondops-eq-02.c  |  14 ++
 .../gcc.target/riscv/xventanacondops-lt-01.c  |  16 ++
 .../gcc.target/riscv/xventanacondops-ne-01.c  |  11 +
 .../gcc.target/riscv/xventanacondops-xor-01.c |  14 ++
 8 files changed, 311 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-and-01.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-and-02.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-eq-01.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-eq-02.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-lt-01.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-ne-01.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-xor-01.c

diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index eb8efb89a89..41c58876d05 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -97,6 +97,7 @@ static int find_if_case_2 (basic_block, edge, edge);
 static int dead_or_predicable (basic_block, basic_block, basic_block,
   edge, int);
 static void noce_emit_move_insn (rtx, rtx);
+static rtx_insn *noce_emit_insn (rtx);
 static rtx_insn *block_has_only_trap (basic_block);
 static void need_cmov_or_rewire (basic_block, hash_set *,
 hash_map *);
@@ -787,6 +788,9 @@ static rtx noce_get_alt_condition (struct noce_if_info *, 
rtx, rtx_insn **);
 static int noce_try_minmax (struct noce_if_info *);
 static int noce_try_abs (struct noce_if_info *);
 static int noce_try_sign_mask (struct noce_if_info *);
+static rtx noce_emit_condzero (struct noce_if_info *, rtx, bool = false);
+static int noce_try_condzero (struct noce_if_info *);
+static int noce_try_condzero_arith (struct noce_if_info *);
 
 /* Return the comparison code for reversed condition for IF_INFO,
or UNKNOWN if reversing the condition is not possible.  */
@@ -1664,6 +1668,212 @@ noce_try_addcc (struct noce_if_info *if_info)
   return FALSE;
 }
 

[PATCH 5/7] RISC-V: Recognize bexti in negated if-conversion

2022-11-12 Thread Philipp Tomsich
While the positive case "if ((bits >> SHAMT) & 1)" for SHAMT 0..10 can
trigger conversion into efficient branchless sequences
  - with Zbs (bexti + neg + and)
  - with XVentanaCondOps (andi + vt.maskc)
the inverted/negated case results in
  andi a5,a0,1024
  seqz a5,a5
  neg a5,a5
  and a5,a5,a1
due to how the sequence presents to the combine pass.

This adds an additional splitter to reassociate the polarity reversed
case into bexti + addi, if Zbs is present.
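
The reassociation works because, for a single extracted bit b, negating
"b == 0" gives the same mask as "b - 1". A standalone sketch of both
computations follows (hypothetical names, 64-bit values standing in for
registers). It only models the mask computation, not the final "and".

```cpp
#include <cassert>
#include <cstdint>

// Long sequence: andi + seqz + neg.
std::uint64_t mask_if_bit_clear_long (std::uint64_t bits, unsigned shamt)
{
  assert (shamt < 64);
  std::uint64_t t = bits & (std::uint64_t{1} << shamt);  // andi
  t = (t == 0);                                          // seqz
  return 0 - t;                                          // neg
}

// Short sequence: bexti + addi -1.
std::uint64_t mask_if_bit_clear_short (std::uint64_t bits, unsigned shamt)
{
  assert (shamt < 64);
  std::uint64_t bit = (bits >> shamt) & 1;  // bexti
  return bit - 1;                           // addi -1: 0 -> all-ones, 1 -> 0
}
```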

Signed-off-by: Philipp Tomsich 

gcc/ChangeLog:

* config/riscv/xventanacondops.md: Add split to reassociate
  "andi + seqz + neg" into "bexti + addi".

---

 gcc/config/riscv/xventanacondops.md | 9 +
 1 file changed, 9 insertions(+)

diff --git a/gcc/config/riscv/xventanacondops.md 
b/gcc/config/riscv/xventanacondops.md
index 3e9d5833a4b..22b4b7d103a 100644
--- a/gcc/config/riscv/xventanacondops.md
+++ b/gcc/config/riscv/xventanacondops.md
@@ -119,3 +119,12 @@
   operands[2] = GEN_INT(1 << UINTVAL(operands[2]));
 })
 
+(define_split
+  [(set (match_operand:X 0 "register_operand")
+   (neg:X (eq:X (zero_extract:X (match_operand:X 1 "register_operand")
+(const_int 1)
+(match_operand 2 "immediate_operand"))
+(const_int 0))))]
+  "!TARGET_XVENTANACONDOPS && TARGET_ZBS"
+  [(set (match_dup 0) (zero_extract:X (match_dup 1) (const_int 1) (match_dup 
2)))
+   (set (match_dup 0) (plus:X (match_dup 0) (const_int -1)))])
-- 
2.34.1



[PATCH 3/7] RISC-V: Support noce_try_store_flag_mask as vt.maskc

2022-11-12 Thread Philipp Tomsich
When if-conversion in noce_try_store_flag_mask starts the sequence off
with an order-operator, our patterns for vt.maskc will receive the
result of the order-operator as a register argument; consequently,
they can't know that the result will be either 1 or 0.

To convey this information (and make vt.maskc applicable), we wrap
the result of the order-operator in a eq/ne against (const_int 0).
This commit adds the split pattern to handle these cases.

gcc/ChangeLog:

* config/riscv/xventanacondops.md: Add split to wrap an
  order-operator suitably for generating vt.maskc.

Signed-off-by: Philipp Tomsich 

Ref vrull/gcc#157

RISC-V: Recognize 'ge'/'le' operators as 'slt'/'sgt'

During if-conversion, if noce_try_store_flag_mask succeeds, we may see
if (cur < next) {
next = 0;
}
transformed into
   27: r82:SI=ltu(r76:DI,r75:DI)
  REG_DEAD r76:DI
   28: r81:SI=r82:SI^0x1
  REG_DEAD r82:SI
   29: r80:DI=zero_extend(r81:SI)
  REG_DEAD r81:SI

This currently escapes the combiner, as RISC-V does not have a pattern
to apply the 'slt' instruction to 'geu' verbs.  By adding a pattern in
this commit, we match such cases.
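
The mapping rests on two identities: "a >= b" is "!(a < b)" and "a <= b"
is "!(b < a)". A small standalone sketch, with hypothetical names and a
template parameter selecting the signed or unsigned comparison:

```cpp
#include <cassert>
#include <cstdint>

// Models slt/sltu: 1 if a < b, else 0.
template <typename T>
std::uint64_t set_less_than (T a, T b)
{
  return a < b ? 1 : 0;
}

// ge/geu: slt with the operands in order, then compare against 0.
template <typename T>
std::uint64_t set_ge (T a, T b)
{
  return set_less_than (a, b) == 0;
}

// le/leu: slt with the operands reversed, then compare against 0.
template <typename T>
std::uint64_t set_le (T a, T b)
{
  return set_less_than (b, a) == 0;
}

// Hypothetical self-check against the native operators.
void check_order (std::int64_t a, std::int64_t b)
{
  assert (set_ge (a, b) == (a >= b ? 1u : 0u));
  assert (set_le (a, b) == (a <= b ? 1u : 0u));
}
```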

gcc/ChangeLog:

* config/riscv/predicates.md (anyge_operator): Define.
(anygt_operator): Define.
(anyle_operator): Define.
(anylt_operator): Define.
* config/riscv/riscv.md (*sge_): Add a
  pattern to map 'geu' onto slt w/ reversed operands.
* config/riscv/riscv.md: Helpers for ge & le.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xventanacondops-le-01.c: New test.
* gcc.target/riscv/xventanacondops-lt-03.c: New test.

---

 gcc/config/riscv/predicates.md| 12 +
 gcc/config/riscv/riscv.md | 26 +++
 gcc/config/riscv/xventanacondops.md   | 45 +++
 .../gcc.target/riscv/xventanacondops-le-01.c  | 17 +++
 .../gcc.target/riscv/xventanacondops-lt-03.c  | 17 +++
 5 files changed, 117 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-le-01.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-lt-03.c

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index b368c11c930..490bff688a7 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -204,6 +204,18 @@
 (define_predicate "equality_operator"
   (match_code "eq,ne"))
 
+(define_predicate "anyge_operator"
+  (match_code "ge,geu"))
+
+(define_predicate "anygt_operator"
+  (match_code "gt,gtu"))
+
+(define_predicate "anyle_operator"
+  (match_code "le,leu"))
+
+(define_predicate "anylt_operator"
+  (match_code "lt,ltu"))
+
 (define_predicate "order_operator"
   (match_code "eq,ne,lt,ltu,le,leu,ge,geu,gt,gtu"))
 
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 4331842b7b2..d1f3270a3c8 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2636,6 +2636,19 @@
   [(set_attr "type" "slt")
(set_attr "mode" "")])
 
+(define_split
+  [(set (match_operand:GPR 0 "register_operand")
+   (match_operator:GPR 1 "anyle_operator"
+  [(match_operand:X 2 "register_operand")
+   (match_operand:X 3 "register_operand")]))]
+  "TARGET_XVENTANACONDOPS"
+  [(set (match_dup 0) (match_dup 4))
+   (set (match_dup 0) (eq:GPR (match_dup 0) (const_int 0)))]
+ {
+  operands[4] = gen_rtx_fmt_ee (GET_CODE (operands[1]) == LE ? LT : LTU,
+   mode, operands[3], operands[2]);
+ })
+
 (define_insn "*slt_"
   [(set (match_operand:GPR   0 "register_operand" "= r")
(any_lt:GPR (match_operand:X 1 "register_operand" "  r")
@@ -2657,6 +2670,19 @@
   [(set_attr "type" "slt")
(set_attr "mode" "")])
 
+(define_split
+  [(set (match_operand:GPR 0 "register_operand")
+   (match_operator:GPR 1 "anyge_operator"
+  [(match_operand:X 2 "register_operand")
+   (match_operand:X 3 "register_operand")]))]
+  "TARGET_XVENTANACONDOPS"
+  [(set (match_dup 0) (match_dup 4))
+   (set (match_dup 0) (eq:GPR (match_dup 0) (const_int 0)))]
+{
+  operands[4] = gen_rtx_fmt_ee (GET_CODE (operands[1]) == GE ? LT : LTU,
+   mode, operands[2], operands[3]);
+})
+
 ;;
 ;;  
 ;;
diff --git a/gcc/config/riscv/xventanacondops.md 
b/gcc/config/riscv/xventanacondops.md
index 641cef0e44e..7930ef1d837 100644
--- a/gcc/config/riscv/xventanacondops.md
+++ b/gcc/config/riscv/xventanacondops.md
@@ -28,3 +28,48 @@
(match_operand:DI 2 "register_operand" "r")))]
   "TARGET_XVENTANACONDOPS"
   "vt.maskc\t%0,%2,%1")
+
+;; Make order operators digestible to the vt.maskc logic by
+;; wrapping their result in a comparison against (const_int 0).
+
+;; "a >= b" is "!(a < b)"
+(define_split
+  [(set (match_operand:X 0 "register_operand")
+   (and:X (neg:X (match_operator:X 1 "anyge_operator"
+[(match_operand:X 2 

[PATCH 4/7] RISC-V: Recognize sign-extract + and cases for XVentanaCondOps

2022-11-12 Thread Philipp Tomsich
Users might use explicit arithmetic operations to create a mask and
then AND it, in a sequence like
    cond = (bits >> SHIFT) & 1;
    mask = ~(cond - 1);
    val &= mask;
which will present as a single-bit sign-extract.

Depending on what combination of XVentanaCondOps and Zbs is
available, this will map to the following sequences:
 - bexti + vt.maskc, if both Zbs and XVentanaCondOps are present
 - andi + vt.maskc, if only XVentanaCondOps is available and the
   sign-extract is operating on bits 10:0 (bit 11 can't be reached,
   as the immediate is sign-extended)
 - slli + srli + and, otherwise.
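
For illustration only, the mask idiom described above can be written out in plain C. This is a minimal sketch and not part of the patch: the helper names (mask_by_bit, single_bit_fits_simm12, check_mask_by_bit) are hypothetical, and the 12-bit immediate range of -2048..2047 is assumed from the base RISC-V andi encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the patch): keep VAL if bit SHIFT of
   BITS is set, otherwise return 0, using the mask idiom quoted above.
   SHIFT must be below 64.  */
static uint64_t
mask_by_bit (uint64_t bits, unsigned shift, uint64_t val)
{
  uint64_t cond = (bits >> shift) & 1;  /* 0 or 1 */
  uint64_t mask = ~(cond - 1);          /* 0 -> 0, 1 -> all ones */
  return val & mask;
}

/* A 12-bit sign-extended immediate covers -2048..2047, so the
   single-bit mask (1 << SHIFT) only fits for SHIFT <= 10.  */
static int
single_bit_fits_simm12 (unsigned shift)
{
  return shift < 63 && (INT64_C (1) << shift) <= 2047;
}

/* Check every bit position against the "bit set ? val : 0" reading.  */
static void
check_mask_by_bit (void)
{
  for (unsigned shift = 0; shift < 64; shift++)
    {
      uint64_t bits = UINT64_C (1) << shift;
      assert (mask_by_bit (bits, shift, 42) == 42);
      assert (mask_by_bit (~bits, shift, 42) == 0);
    }
}
```

The second helper mirrors the reasoning behind the andi case: bit 11 would need the immediate 2048, which the sign-extended encoding cannot express.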

gcc/ChangeLog:

* config/riscv/xventanacondops.md: Recognize SIGN_EXTRACT
  of a single-bit followed by AND for XVentanaCondOps.

Signed-off-by: Philipp Tomsich 
---

 gcc/config/riscv/xventanacondops.md | 46 +
 1 file changed, 46 insertions(+)

diff --git a/gcc/config/riscv/xventanacondops.md 
b/gcc/config/riscv/xventanacondops.md
index 7930ef1d837..3e9d5833a4b 100644
--- a/gcc/config/riscv/xventanacondops.md
+++ b/gcc/config/riscv/xventanacondops.md
@@ -73,3 +73,49 @@
   "TARGET_XVENTANACONDOPS"
   [(set (match_dup 5) (match_dup 1))
(set (match_dup 0) (and:X (neg:X (ne:X (match_dup 5) (const_int 0)))
+
+;; Users might use explicit arithmetic operations to create a mask and
+;; then and it, in a sequence like
+;;cond = (bits >> SHIFT) & 1;
+;;mask = ~(cond - 1);
+;;val &= mask;
+;; which will present as a single-bit sign-extract in the combiner.
+;;
+;; This will give rise to any of the following cases:
+;; - with Zbs and XVentanaCondOps: bexti + vt.maskc
+;; - with XVentanaCondOps (but w/o Zbs):
+;;   - andi + vt.maskc, if the mask is representable in the immediate
+;;  (which requires extra care due to the immediate
+;;   being sign-extended)
+;;   - slli + srli + and
+;; - otherwise: slli + srli + and
+
+;; With Zbs, we have bexti for all possible bits...
+(define_split
+  [(set (match_operand:X 0 "register_operand")
+   (and:X (sign_extract:X (match_operand:X 1 "register_operand")
+  (const_int 1)
+  (match_operand 2 "immediate_operand"))
+  (match_operand:X 3 "register_operand")))
+   (clobber (match_operand:X 4 "register_operand"))]
+  "TARGET_XVENTANACONDOPS && TARGET_ZBS"
+  [(set (match_dup 4) (zero_extract:X (match_dup 1) (const_int 1) (match_dup 2)))
+   (set (match_dup 0) (and:X (neg:X (ne:X (match_dup 4) (const_int 0)))
+(match_dup 3)))])
+
+;; ...whereas RV64I only allows us access to bits 0..10 in a single andi.
+(define_split
+  [(set (match_operand:X 0 "register_operand")
+   (and:X (sign_extract:X (match_operand:X 1 "register_operand")
+  (const_int 1)
+  (match_operand 2 "immediate_operand"))
+  (match_operand:X 3 "register_operand")))
+   (clobber (match_operand:X 4 "register_operand"))]
+  "TARGET_XVENTANACONDOPS && !TARGET_ZBS && (UINTVAL (operands[2]) < 11)"
+  [(set (match_dup 4) (and:X (match_dup 1) (match_dup 2)))
+   (set (match_dup 0) (and:X (neg:X (ne:X (match_dup 4) (const_int 0)))
+(match_dup 3)))]
+{
+  operands[2] = GEN_INT(1 << UINTVAL(operands[2]));
+})
+
-- 
2.34.1



[PATCH 6/7] RISC-V: Support immediates in XVentanaCondOps

2022-11-12 Thread Philipp Tomsich
When if-conversion encounters sequences using immediates, the
sequences can't trivially map back onto vt.maskc/vt.maskcn (even if
beneficial) due to vt.maskc and vt.maskcn not having immediate forms.

This adds a splitter to rewrite opportunities for XVentanaCondOps that
operate on an immediate by first putting the immediate into a register
to enable the non-immediate vt.maskc/vt.maskcn instructions to operate
on the value.

Consider code, such as

  long func2 (long a, long c)
  {
    if (c)
      a = 2;
    else
      a = 5;
    return a;
  }

which will be converted to

  func2:
        seqz    a0,a2
        neg     a0,a0
        andi    a0,a0,3
        addi    a0,a0,2
        ret

Following this change, we generate

        li      a0,3
        vt.maskcn       a0,a0,a2
        addi    a0,a0,2
        ret
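
As a sanity check on the arithmetic behind both sequences, here is a minimal C sketch (not part of the patch; the function names are hypothetical). It expresses "c ? 2 : 5" as a base value 2 plus a masked difference of 3, which is what both the seqz/neg/andi/addi and the li/vt.maskcn/addi sequences compute.

```c
#include <assert.h>

/* Branchy source form, as in func2 above.  */
static long
select_branchy (long c)
{
  return c ? 2 : 5;
}

/* Branchless form: a mask that is all ones when c == 0 selects the
   difference (5 - 2 = 3), which is then added to the base value 2.  */
static long
select_branchless (long c)
{
  long mask = -(long) (c == 0);  /* 0 or -1 */
  return (mask & 3) + 2;
}

/* Compare both forms for a few condition values, including zero.  */
static void
check_select (void)
{
  for (long c = -2; c <= 2; c++)
    assert (select_branchless (c) == select_branchy (c));
}
```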

This commit also introduces a simple unit test for if-conversion with
immediate (literal) values as the sources for simple sets in the THEN
and ELSE blocks. The test checks that Ventana's conditional mask
instruction (vt.maskc) is emitted as part of the resultant branchless
instruction sequence.

gcc/ChangeLog:

* config/riscv/xventanacondops.md: Support immediates for
  vt.maskc/vt.maskcn through a splitter.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xventanacondops-ifconv-imm.c: New test.

Signed-off-by: Philipp Tomsich 
Reviewed-by: Henry Brausen 

---
Ref #204

 gcc/config/riscv/xventanacondops.md   | 24 +--
 .../riscv/xventanacondops-ifconv-imm.c| 19 +++
 2 files changed, 41 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-ifconv-imm.c

diff --git a/gcc/config/riscv/xventanacondops.md 
b/gcc/config/riscv/xventanacondops.md
index 22b4b7d103a..0e09ee91a69 100644
--- a/gcc/config/riscv/xventanacondops.md
+++ b/gcc/config/riscv/xventanacondops.md
@@ -29,6 +29,26 @@
   "TARGET_XVENTANACONDOPS"
   "vt.maskc\t%0,%2,%1")
 
+;; XVentanaCondOps does not have immediate forms, so we need to do extra
+;; work to support these: if we encounter a vt.maskc/n with an immediate,
+;; we split this into a load-immediate followed by a vt.maskc/n.
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+   (and:DI (neg:DI (match_operator:DI 1 "equality_operator"
+  [(match_operand:DI 2 "register_operand")
+   (const_int 0)]))
+   (match_operand:DI 3 "immediate_operand")))
+   (clobber (match_operand:DI 4 "register_operand"))]
+  "TARGET_XVENTANACONDOPS"
+  [(set (match_dup 4) (match_dup 3))
+   (set (match_dup 0) (and:DI (neg:DI (match_dup 1))
+ (match_dup 4)))]
+{
+  /* Eliminate the clobber/temporary, if it is not needed. */
+  if (!rtx_equal_p (operands[0], operands[2]))
+ operands[4] = operands[0];
+})
+
 ;; Make order operators digestible to the vt.maskc logic by
 ;; wrapping their result in a comparison against (const_int 0).
 
@@ -37,7 +57,7 @@
   [(set (match_operand:X 0 "register_operand")
(and:X (neg:X (match_operator:X 1 "anyge_operator"
 [(match_operand:X 2 "register_operand")
- (match_operand:X 3 "register_operand")]))
+ (match_operand:X 3 "arith_operand")]))
   (match_operand:X 4 "register_operand")))
(clobber (match_operand:X 5 "register_operand"))]
   "TARGET_XVENTANACONDOPS"
@@ -54,7 +74,7 @@
   [(set (match_operand:X 0 "register_operand")
(and:X (neg:X (match_operator:X 1 "anygt_operator"
 [(match_operand:X 2 "register_operand")
- (match_operand:X 3 "register_operand")]))
+ (match_operand:X 3 "arith_operand")]))
   (match_operand:X 4 "register_operand")))
(clobber (match_operand:X 5 "register_operand"))]
   "TARGET_XVENTANACONDOPS"
diff --git a/gcc/testsuite/gcc.target/riscv/xventanacondops-ifconv-imm.c 
b/gcc/testsuite/gcc.target/riscv/xventanacondops-ifconv-imm.c
new file mode 100644
index 000..0012e7b669c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xventanacondops-ifconv-imm.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_xventanacondops -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */
+
+/* Each function below should emit a vt.maskcn instruction */
+
+long
+foo0 (long a, long b, long c)
+{
+  if (c)
+a = 0;
+  else
+a = 5;
+  return a;
+}
+
+/* { dg-final { scan-assembler-times "vt.maskcn\t" 1 } } */
+/* { dg-final { scan-assembler-not "beqz\t" } } */
+/* { dg-final { scan-assembler-not "bnez\t" } } */
-- 
2.34.1



[PATCH 2/7] RISC-V: Generate vt.maskc on noce_try_store_flag_mask if-conversion

2022-11-12 Thread Philipp Tomsich
Adds a pattern to map the output of noce_try_store_flag_mask
if-conversion in the combiner onto vt.maskc; the input patterns
supported are similar to the following:
  (set (reg/v/f:DI 75 [  ])
   (and:DI (neg:DI (ne:DI (reg:DI 82)
   (const_int 0 [0])))
   (reg/v/f:DI 75 [  ])))

This reduces dynamic instruction counts for the perlbench workload in
SPEC CPU2017 by 0.8230%, 0.4689%, and 0.2332% (respectively, for
each of the 3 workloads in the 'ref'-workload).

To ensure that the combine-pass doesn't get confused about
profitability, we recognize the idiom as requiring a single
instruction when the XVentanaCondOps extension is present.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_rtx_costs): Recognize idiom for
  vt.maskc as a single insn with TARGET_XVENTANACONDOPS.
* config/riscv/riscv.md: Include xventanacondops.md.
* config/riscv/xventanacondops.md: New file.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xventanacondops-ne-03.c: New test.
* gcc.target/riscv/xventanacondops-ne-04.c: New test.

Signed-off-by: Philipp Tomsich 
---

 gcc/config/riscv/riscv.cc | 14 +
 gcc/config/riscv/riscv.md |  1 +
 gcc/config/riscv/xventanacondops.md   | 30 +++
 .../gcc.target/riscv/xventanacondops-ne-03.c  | 15 ++
 .../gcc.target/riscv/xventanacondops-ne-04.c  | 15 ++
 5 files changed, 75 insertions(+)
 create mode 100644 gcc/config/riscv/xventanacondops.md
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-ne-03.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-ne-04.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 2a94482b8ed..1883b5b13a7 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2269,6 +2269,20 @@ riscv_rtx_costs (rtx x, machine_mode mode, int 
outer_code, int opno ATTRIBUTE_UN
   return false;
 
 case AND:
+  /* vt.maskc/vt.maskcn for XVentanaCondOps */
+  if (TARGET_XVENTANACONDOPS && mode == word_mode
+ && GET_CODE (XEXP (x, 0)) == NEG)
+   {
+ rtx inner = XEXP (XEXP (x, 0), 0);
+
+ if ((GET_CODE (inner) == EQ || GET_CODE (inner) == NE)
+ && CONST_INT_P (XEXP (inner, 1))
+ && INTVAL (XEXP (inner, 1)) == 0)
+   {
+ *total = COSTS_N_INSNS (1);
+ return true;
+   }
+   }
   /* slli.uw pattern for zba.  */
   if (TARGET_ZBA && TARGET_64BIT && mode == DImode
  && GET_CODE (XEXP (x, 0)) == ASHIFT)
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 1514e10dbd1..4331842b7b2 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -3196,3 +3196,4 @@
 (include "generic.md")
 (include "sifive-7.md")
 (include "vector.md")
+(include "xventanacondops.md")
diff --git a/gcc/config/riscv/xventanacondops.md 
b/gcc/config/riscv/xventanacondops.md
new file mode 100644
index 000..641cef0e44e
--- /dev/null
+++ b/gcc/config/riscv/xventanacondops.md
@@ -0,0 +1,30 @@
+;; Machine description for X-Ventana-CondOps
+;; Copyright (C) 2022 Free Software Foundation, Inc.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_code_iterator eq_or_ne [eq ne])
+(define_code_attr n [(eq "n") (ne "")])
+
+(define_insn "*vt.maskc<n>"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(and:DI (neg:DI (eq_or_ne:DI
+			(match_operand:DI 1 "register_operand" "r")
+			(const_int 0)))
+		(match_operand:DI 2 "register_operand" "r")))]
+  "TARGET_XVENTANACONDOPS"
+  "vt.maskc<n>\t%0,%2,%1")
diff --git a/gcc/testsuite/gcc.target/riscv/xventanacondops-ne-03.c 
b/gcc/testsuite/gcc.target/riscv/xventanacondops-ne-03.c
new file mode 100644
index 000..87cc69480ac
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xventanacondops-ne-03.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_xventanacondops -mabi=lp64 -mtune=thead-c906" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-O1" "-Os" "-Oz" } } */
+
+long long ne3(long long a, long long b)
+{
+  if (a != 0)
+return b;
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "vt.maskc" 1 } } */
+
+
diff --git 

[PATCH 0/7] RISC-V: Backend support for XVentanaCondOps/ZiCondops

2022-11-12 Thread Philipp Tomsich


Both the XVentanaCondOps (a vendor-defined extension from Ventana
Microsystems) and the proposed ZiCondOps extensions define a
conditional-zero(-or-value) instruction, which is similar to the
following C construct:
  rd = rc ? rs : 0

This functionality can be tied back into if-conversion and also matches
some typical programming idioms.  This series includes backend support
for XVentanaCondOps and infrastructure to handle conditional-zero
constructions in if-conversion.
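
A minimal C model of the two instructions may help when reading the patterns in this series. It is a sketch only: the semantics of vt.maskcn (value kept when the condition is zero) are inferred from the eq/ne patterns in the patches, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of vt.maskc: rd = (rc != 0) ? rs : 0.  */
static uint64_t
model_maskc (uint64_t rs, uint64_t rc)
{
  return rc != 0 ? rs : 0;
}

/* Model of vt.maskcn (assumed from the eq pattern): rd = (rc == 0) ? rs : 0.  */
static uint64_t
model_maskcn (uint64_t rs, uint64_t rc)
{
  return rc == 0 ? rs : 0;
}

/* A full select "c ? a : b" composed from both halves; exactly one
   half is non-zeroed, so a plain OR merges them.  */
static uint64_t
model_select (uint64_t c, uint64_t a, uint64_t b)
{
  return model_maskc (a, c) | model_maskcn (b, c);
}

/* Check the models against the C conditional operator.  */
static void
check_models (void)
{
  for (uint64_t c = 0; c < 3; c++)
    {
      assert (model_maskc (7, c) == (c ? 7u : 0u));
      assert (model_maskcn (7, c) == (c ? 0u : 7u));
      assert (model_select (c, 11, 13) == (c ? 11u : 13u));
    }
}
```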

Tested against SPEC CPU 2017.



Philipp Tomsich (7):
  RISC-V: Recognize xventanacondops extension
  RISC-V: Generate vt.maskc on noce_try_store_flag_mask if-conversion
  RISC-V: Support noce_try_store_flag_mask as vt.maskc
  RISC-V: Recognize sign-extract + and cases for XVentanaCondOps
  RISC-V: Recognize bexti in negated if-conversion
  RISC-V: Support immediates in XVentanaCondOps
  ifcvt: add if-conversion to conditional-zero instructions

 gcc/common/config/riscv/riscv-common.cc   |   2 +
 gcc/config/riscv/predicates.md|  12 +
 gcc/config/riscv/riscv-opts.h |   3 +
 gcc/config/riscv/riscv.cc |  14 ++
 gcc/config/riscv/riscv.md |  27 +++
 gcc/config/riscv/riscv.opt|   3 +
 gcc/config/riscv/xventanacondops.md   | 150 
 gcc/ifcvt.cc  | 214 ++
 .../gcc.target/riscv/xventanacondops-and-01.c |  16 ++
 .../gcc.target/riscv/xventanacondops-and-02.c |  15 ++
 .../gcc.target/riscv/xventanacondops-eq-01.c  |  11 +
 .../gcc.target/riscv/xventanacondops-eq-02.c  |  14 ++
 .../riscv/xventanacondops-ifconv-imm.c|  19 ++
 .../gcc.target/riscv/xventanacondops-le-01.c  |  17 ++
 .../gcc.target/riscv/xventanacondops-lt-01.c  |  16 ++
 .../gcc.target/riscv/xventanacondops-lt-03.c  |  17 ++
 .../gcc.target/riscv/xventanacondops-ne-01.c  |  11 +
 .../gcc.target/riscv/xventanacondops-ne-03.c  |  15 ++
 .../gcc.target/riscv/xventanacondops-ne-04.c  |  15 ++
 .../gcc.target/riscv/xventanacondops-xor-01.c |  14 ++
 20 files changed, 605 insertions(+)
 create mode 100644 gcc/config/riscv/xventanacondops.md
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-and-01.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-and-02.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-eq-01.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-eq-02.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-ifconv-imm.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-le-01.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-lt-01.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-lt-03.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-ne-01.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-ne-03.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-ne-04.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xventanacondops-xor-01.c

-- 
2.34.1



[PATCH 1/7] RISC-V: Recognize xventanacondops extension

2022-11-12 Thread Philipp Tomsich
This adds the xventanacondops extension to the option parsing and as a
default for the ventana-vt1 core:

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Recognize
  "xventanacondops" as part of an architecture string.
* config/riscv/riscv-cores.def (RISCV_CORE): Enable
  "xventanacondops" by default for "ventana-vt1".
* config/riscv/riscv-opts.h (MASK_XVENTANACONDOPS): Define.
(TARGET_XVENTANACONDOPS): Define.
* config/riscv/riscv.opt: Add "riscv_xventanacondops".

Signed-off-by: Philipp Tomsich 
---

 gcc/common/config/riscv/riscv-common.cc | 2 ++
 gcc/config/riscv/riscv-opts.h   | 3 +++
 gcc/config/riscv/riscv.opt  | 3 +++
 3 files changed, 8 insertions(+)

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 4b7f777c103..6b2bdda5feb 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -1247,6 +1247,8 @@ static const riscv_ext_flag_table_t 
riscv_ext_flag_table[] =
   {"svinval", &gcc_options::x_riscv_sv_subext, MASK_SVINVAL},
   {"svnapot", &gcc_options::x_riscv_sv_subext, MASK_SVNAPOT},
 
+  {"xventanacondops", &gcc_options::x_riscv_xventanacondops, MASK_XVENTANACONDOPS},
+
   {NULL, NULL, 0}
 };
 
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 25fd85b09b1..84c987626bc 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -189,4 +189,7 @@ enum stack_protector_guard {
? 0 \
: 32 << (__builtin_popcount (riscv_zvl_flags) - 1))
 
+#define MASK_XVENTANACONDOPS (1 << 0)
+#define TARGET_XVENTANACONDOPS ((riscv_xventanacondops & MASK_XVENTANACONDOPS) != 0)
+
 #endif /* ! GCC_RISCV_OPTS_H */
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 7c3ca48d1cc..9595078bdd4 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -233,6 +233,9 @@ int riscv_zm_subext
 TargetVariable
 int riscv_sv_subext
 
+TargetVariable
+int riscv_xventanacondops = 0
+
 Enum
 Name(isa_spec_class) Type(enum riscv_isa_spec_class)
 Supported ISA specs (for use with the -misa-spec= option):
-- 
2.34.1



Re: [PATCH] Fortran: Remove unused declaration

2022-11-12 Thread Harald Anlauf via Gcc-patches

Am 12.11.22 um 22:05 schrieb Bernhard Reutner-Fischer via Gcc-patches:

This function definition was removed years ago; remove its prototype.

gcc/fortran/ChangeLog:

* gfortran.h (gfc_check_include): Remove declaration.
---
  gcc/fortran/gfortran.h | 1 -
  1 file changed, 1 deletion(-)
---
Regtests cleanly, ok for trunk?

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index c4deec0d5b8..ce3ad61bb52 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -3208,7 +3208,6 @@ int gfc_at_eof (void);
  int gfc_at_bol (void);
  int gfc_at_eol (void);
  void gfc_advance_line (void);
-int gfc_check_include (void);
  int gfc_define_undef_line (void);

  int gfc_wide_is_printable (gfc_char_t);


OK, thanks.



[PATCH] Fortran: Remove unused declaration

2022-11-12 Thread Bernhard Reutner-Fischer via Gcc-patches
This function definition was removed years ago; remove its prototype.

gcc/fortran/ChangeLog:

* gfortran.h (gfc_check_include): Remove declaration.
---
 gcc/fortran/gfortran.h | 1 -
 1 file changed, 1 deletion(-)
---
Regtests cleanly, ok for trunk?

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index c4deec0d5b8..ce3ad61bb52 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -3208,7 +3208,6 @@ int gfc_at_eof (void);
 int gfc_at_bol (void);
 int gfc_at_eol (void);
 void gfc_advance_line (void);
-int gfc_check_include (void);
 int gfc_define_undef_line (void);
 
 int gfc_wide_is_printable (gfc_char_t);
-- 
2.38.1



Re: [PATCH] gcc: m68k: fix PR target/107645

2022-11-12 Thread Jeff Law via Gcc-patches



On 11/11/22 12:43, Max Filippov wrote:

gcc/
PR target/107645
* config/m68k/predicates.md (symbolic_operand): Return false
when UNSPEC is under the CONST node.


Isn't the underlying object still symbolic, though?  If so, returning
false seems wrong.


insn 342 341 343 35 (set (reg:SI 97)
    (mem/u:SI (plus:SI (reg:SI 13 %a5)
    (const:SI (unspec:SI [
    (symbol_ref:SI 
("__gcov_kvp_dynamic_pool_index") [flags 0x40]  __gcov_kvp_dynamic_pool_index>)

    (const_int 0 [0])
    ] 6))) [0  S4 A8])) 
"gcc/libgcc/libgcov.h":472:44 55 {*movsi_m68k2}
 (expr_list:REG_EQUAL (symbol_ref:SI 
("__gcov_kvp_dynamic_pool_index") [flags 0x40]  __gcov_kvp_dynamic_pool_index>)

    (nil)))


ISTM that we'd need to strip the unspec and process its argument 
instead.  But maybe I'm missing something.



jeff




Re: [DOCS] sphinx: use new Sphinx links

2022-11-12 Thread Gerald Pfeifer
On Sat, 12 Nov 2022, Gerald Pfeifer wrote:
> I am not aware of who added this, and why, nor actually even why, yet it 
> seems if we can get the same in place for /install we'll be good again, so
> I'll ask overseers@.

https://gcc.gnu.org/install/ is up and running fine now/again.

> Next step: redirects from the old /install docs to the new ones.

Gerald


[PATCH] [PR68097] Try to avoid recursing for floats in tree_*_nonnegative_warnv_p.

2022-11-12 Thread Aldy Hernandez via Gcc-patches
It irks me that a PR named "we should track ranges for floating-point"
hasn't been closed in this release.  This is an attempt to do just
that.

As mentioned in the PR, even though we track ranges for floats, it has
been suggested that avoiding recursing through SSA defs in
gimple_assign_nonnegative_warnv_p is also a goal.  We can do this with
various ranger components without the need for a heavy handed approach
(i.e. a full ranger).

I have implemented two versions of known_float_sign_p() that answer
the question whether we definitely know the sign for an operation or a
tree expression.

Both versions use get_global_range_query, which is a wrapper to query
global ranges.  This means that no caching or propagation is done.
In the case of an SSA, we just return the global range for it (think
SSA_NAME_RANGE_INFO).  In the case of a tree code with operands, we
also use get_global_range_query to resolve the operands, and then call
into range-ops, which is our lowest level component.  There is no
ranger or gori involved.  All we're doing is resolving the operation
with the ranges passed.

This is enough to avoid recursing in the case where we definitely know
the sign of a range.  Otherwise, we still recurse.
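
The idea can be sketched with a toy interval type in C. This is not GCC's frange or range-ops API (those are C++); the struct and function names below are hypothetical, and NaNs are assumed to be excluded from the range.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical stand-in for a floating-point range [lo, hi]; this is
   not GCC's frange class.  NaNs are assumed to be excluded.  */
struct toy_range { double lo, hi; };

/* Return true and set *SIGN (true = negative) if both bounds of R
   carry the same sign bit; otherwise return false.  */
static bool
toy_signbit_p (struct toy_range r, bool *sign)
{
  bool lo_neg = signbit (r.lo) != 0;
  bool hi_neg = signbit (r.hi) != 0;
  if (lo_neg != hi_neg)
    return false;
  *sign = lo_neg;
  return true;
}

/* Answer "non-negative?" from the range when the sign is known;
   otherwise use FALLBACK, which stands in for the recursive walk.  */
static bool
toy_nonnegative_p (struct toy_range r, bool fallback)
{
  bool sign;
  if (toy_signbit_p (r, &sign))
    return !sign;
  return fallback;
}

/* Exercise the known-positive, known-negative and unknown cases.  */
static void
check_toy_ranges (void)
{
  struct toy_range pos = { 1.0, 2.0 };
  struct toy_range neg = { -3.0, -1.0 };
  struct toy_range mixed = { -1.0, 1.0 };

  assert (toy_nonnegative_p (pos, false));   /* sign known: positive */
  assert (!toy_nonnegative_p (neg, true));   /* sign known: negative */
  assert (toy_nonnegative_p (mixed, true));  /* unknown: fallback used */
  assert (!toy_nonnegative_p (mixed, false));
}
```

Only the mixed-sign range reaches the fallback, which corresponds to the "otherwise, we still recurse" case.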

Note that instead of get_global_range_query(), we could use
get_range_query(), which uses a ranger if one is active in the pass, or
get_global_range_query() if not.  This would allow passes that have an
active ranger (with enable_ranger) to use a full ranger.  Currently
these passes are VRP, loop unswitching, DOM, loop versioning, etc.  If
no ranger is active, get_range_query defaults to global ranges, so
there's no additional penalty.

Would this be acceptable, at least enough to close (or rename the PR ;-))?

PR tree-optimization/68097

gcc/ChangeLog:

* fold-const.cc (known_float_sign_p): New.
(tree_unary_nonnegative_warnv_p): Call known_float_sign_p.
(tree_binary_nonnegative_warnv_p): Same.
(tree_single_nonnegative_warnv_p): Same.
---
 gcc/fold-const.cc | 51 +++
 1 file changed, 51 insertions(+)

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index b89cac91cae..bd74cfca996 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -14577,6 +14577,44 @@ tree_simple_nonnegative_warnv_p (enum tree_code code, 
tree type)
   return false;
 }
 
+/* Return true if T is of type floating point and has a known sign.
+   If so, set the sign in SIGN.  */
+
+static bool
+known_float_sign_p (bool &sign, tree t)
+{
+  if (!frange::supports_p (TREE_TYPE (t)))
+return false;
+
+  frange r;
+  return (get_global_range_query ()->range_of_expr (r, t)
+ && r.signbit_p (sign));
+}
+
+/* Return true if TYPE is a floating-point type and (CODE OP0 OP1) has
+   a known sign.  If so, set the sign in SIGN.  */
+
+static bool
+known_float_sign_p (bool &sign, enum tree_code code, tree type, tree op0,
+   tree op1 = NULL_TREE)
+{
+  if (!frange::supports_p (type))
+return false;
+
+  range_op_handler handler (code, type);
+  if (handler)
+{
+  frange res, r0, r1;
+  get_global_range_query ()->range_of_expr (r0, op0);
+  if (op1)
+   get_global_range_query ()->range_of_expr (r1, op1);
+  else
+   r1.set_varying (type);
+  return handler.fold_range (res, type, r0, r1) && res.signbit_p (sign);
+}
+  return false;
+}
+
 /* Return true if (CODE OP0) is known to be non-negative.  If the return
value is based on the assumption that signed overflow is undefined,
set *STRICT_OVERFLOW_P to true; otherwise, don't change
@@ -14589,6 +14627,10 @@ tree_unary_nonnegative_warnv_p (enum tree_code code, 
tree type, tree op0,
   if (TYPE_UNSIGNED (type))
 return true;
 
+  bool sign;
+  if (known_float_sign_p (sign, code, type, op0))
+return !sign;
+
   switch (code)
 {
 case ABS_EXPR:
@@ -14656,6 +14698,10 @@ tree_binary_nonnegative_warnv_p (enum tree_code code, 
tree type, tree op0,
   if (TYPE_UNSIGNED (type))
 return true;
 
+  bool sign;
+  if (known_float_sign_p (sign, code, type, op0, op1))
+return !sign;
+
   switch (code)
 {
 case POINTER_PLUS_EXPR:
@@ -14778,6 +14824,8 @@ tree_binary_nonnegative_warnv_p (enum tree_code code, 
tree type, tree op0,
 bool
 tree_single_nonnegative_warnv_p (tree t, bool *strict_overflow_p, int depth)
 {
+  bool sign;
+
   if (TYPE_UNSIGNED (TREE_TYPE (t)))
 return true;
 
@@ -14796,6 +14844,9 @@ tree_single_nonnegative_warnv_p (tree t, bool 
*strict_overflow_p, int depth)
   return RECURSE (TREE_OPERAND (t, 1)) && RECURSE (TREE_OPERAND (t, 2));
 
 case SSA_NAME:
+  if (known_float_sign_p (sign, t))
+   return !sign;
+
   /* Limit the depth of recursion to avoid quadratic behavior.
 This is expected to catch almost all occurrences in practice.
 If this code misses important cases that unbounded recursion
-- 
2.38.1



ginclude: C2x header version macros

2022-11-12 Thread Joseph Myers
C2x adds __STDC_VERSION_*_H__ macros to individual headers with
interface changes compared to C17.  All the new header features in
headers provided by GCC have now been implemented, so define those
macros to the value given in the current working draft.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.  OK to
commit?
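
As a usage sketch (not part of the patch), user code could test one of these macros as follows. The function names are hypothetical; the only assumption is that a header which defines the macro gives it a value newer than C17's 201710L.

```c
#include <assert.h>
#include <stddef.h>

/* Return the <stddef.h> interface version, or 0 if the header does
   not advertise one (pre-C2x mode or an older compiler).  */
static long
stddef_interface_version (void)
{
#ifdef __STDC_VERSION_STDDEF_H__
  return __STDC_VERSION_STDDEF_H__;
#else
  return 0L;
#endif
}

/* The macro is either absent or newer than the C17 version value.  */
static void
check_stddef_interface_version (void)
{
  long v = stddef_interface_version ();
  assert (v == 0L || v > 201710L);
}
```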

gcc/
* ginclude/float.h [__STDC_VERSION__ > 201710L]
(__STDC_VERSION_FLOAT_H__): New macro.
* ginclude/stdarg.h [__STDC_VERSION__ > 201710L]
(__STDC_VERSION_STDARG_H__): New macro.
* ginclude/stdatomic.h [__STDC_VERSION__ > 201710L]
(__STDC_VERSION_STDATOMIC_H__): New macro.
* ginclude/stddef.h [__STDC_VERSION__ > 201710L]
(__STDC_VERSION_STDDEF_H__): New macro.
* ginclude/stdint-gcc.h [__STDC_VERSION__ > 201710L]
(__STDC_VERSION_STDINT_H__): New macro.
* glimits.h [__STDC_VERSION__ > 201710L]
(__STDC_VERSION_LIMITS_H__): New macro.

gcc/testsuite/
* gcc.dg/c11-float-8.c, gcc.dg/c11-limits-1.c,
gcc.dg/c11-stdarg-4.c, gcc.dg/c11-stdatomic-3.c,
gcc.dg/c11-stddef-1.c, gcc.dg/c11-stdint-1.c,
gcc.dg/c2x-float-13.c, gcc.dg/c2x-limits-1.c,
gcc.dg/c2x-stdarg-5.c, gcc.dg/c2x-stdatomic-1.c,
gcc.dg/c2x-stddef-1.c, gcc.dg/c2x-stdint-1.c: New tests.

diff --git a/gcc/ginclude/float.h b/gcc/ginclude/float.h
index bc5439d664f..172b9de477f 100644
--- a/gcc/ginclude/float.h
+++ b/gcc/ginclude/float.h
@@ -624,4 +624,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 
 #endif /* __DEC32_MANT_DIG__ */
 
+#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
+#define __STDC_VERSION_FLOAT_H__   202311L
+#endif
+
 #endif /* _FLOAT_H___ */
diff --git a/gcc/ginclude/stdarg.h b/gcc/ginclude/stdarg.h
index c704c9ffcf2..5149f7b3f4f 100644
--- a/gcc/ginclude/stdarg.h
+++ b/gcc/ginclude/stdarg.h
@@ -125,6 +125,10 @@ typedef __gnuc_va_list va_list;
 
 #endif /* not __svr4__ */
 
+#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
+#define __STDC_VERSION_STDARG_H__  202311L
+#endif
+
 #endif /* _STDARG_H */
 
 #endif /* not _ANSI_STDARG_H_ */
diff --git a/gcc/ginclude/stdatomic.h b/gcc/ginclude/stdatomic.h
index a56ba5d9639..e16b072ccde 100644
--- a/gcc/ginclude/stdatomic.h
+++ b/gcc/ginclude/stdatomic.h
@@ -248,4 +248,8 @@ extern void atomic_flag_clear (volatile atomic_flag *);
 extern void atomic_flag_clear_explicit (volatile atomic_flag *, memory_order);
 #define atomic_flag_clear_explicit(PTR, MO)   __atomic_clear ((PTR), (MO))
 
+#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
+#define __STDC_VERSION_STDATOMIC_H__   202311L
+#endif
+
 #endif  /* _STDATOMIC_H */
diff --git a/gcc/ginclude/stddef.h b/gcc/ginclude/stddef.h
index 2767edf51de..7980045e712 100644
--- a/gcc/ginclude/stddef.h
+++ b/gcc/ginclude/stddef.h
@@ -454,6 +454,7 @@ typedef struct {
 
 #if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
 #define unreachable() (__builtin_unreachable ())
+#define __STDC_VERSION_STDDEF_H__  202311L
 #endif
 
 #endif /* _STDDEF_H was defined this time */
diff --git a/gcc/ginclude/stdint-gcc.h b/gcc/ginclude/stdint-gcc.h
index 6be01ae28b8..eab651d968a 100644
--- a/gcc/ginclude/stdint-gcc.h
+++ b/gcc/ginclude/stdint-gcc.h
@@ -362,4 +362,8 @@ typedef __UINTMAX_TYPE__ uintmax_t;
 
 #endif
 
+#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
+#define __STDC_VERSION_STDINT_H__  202311L
+#endif
+
 #endif /* _GCC_STDINT_H */
diff --git a/gcc/glimits.h b/gcc/glimits.h
index 8d74c8b88d6..994f7e33bbe 100644
--- a/gcc/glimits.h
+++ b/gcc/glimits.h
@@ -156,6 +156,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 # define BOOL_MAX 1
 # undef BOOL_WIDTH
 # define BOOL_WIDTH 1
+
+# define __STDC_VERSION_LIMITS_H__ 202311L
 #endif
 
 #endif /* _LIMITS_H___ */
diff --git a/gcc/testsuite/gcc.dg/c11-float-8.c 
b/gcc/testsuite/gcc.dg/c11-float-8.c
new file mode 100644
index 000..7fb1e0a5683
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c11-float-8.c
@@ -0,0 +1,9 @@
+/* Test __STDC_VERSION_FLOAT_H__ not in C11.  */
+/* { dg-do preprocess } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
+
+#include <float.h>
+
+#ifdef __STDC_VERSION_FLOAT_H__
+#error "__STDC_VERSION_FLOAT_H__ defined"
+#endif
diff --git a/gcc/testsuite/gcc.dg/c11-limits-1.c 
b/gcc/testsuite/gcc.dg/c11-limits-1.c
new file mode 100644
index 000..6dc5737024d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c11-limits-1.c
@@ -0,0 +1,9 @@
+/* Test __STDC_VERSION_LIMITS_H__ not in C11.  */
+/* { dg-do preprocess } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
+
+#include <limits.h>
+
+#ifdef __STDC_VERSION_LIMITS_H__
+#error "__STDC_VERSION_LIMITS_H__ defined"
+#endif
diff --git a/gcc/testsuite/gcc.dg/c11-stdarg-4.c 
b/gcc/testsuite/gcc.dg/c11-stdarg-4.c
new file mode 100644
index 000..06bff1f0445
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c11-stdarg-4.c
@@ -0,0 +1,9 @@
+/* Test __STDC_VERSION_STDARG_H__ not in C11.  */

Re: [PATCH] Fortran: fix treatment of character, value, optional dummy arguments [PR107444]

2022-11-12 Thread Mikael Morin

Hello,

Le 10/11/2022 à 22:56, Harald Anlauf via Fortran a écrit :

Dear Fortranners,

the attached patch is a follow-up to the fix for PR107441,
as it finally fixes the treatment of character dummy arguments
that have the value,optional attribute, and allows for checking
of the presence of such arguments.

This entails a small ABI clarification, as the previous text
was not really clear on the argument passing conventions,
and the previously generated code was inconsistent at best,
or rather wrong, for this kind of procedure arguments.
(E.g. the number of passed arguments was varying...)

Testcase cross-checked with NAG 7.1.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?


Looks good.  Thanks.


[PATCH] c++: Reject UDLs in certain contexts [PR105300]

2022-11-12 Thread Marek Polacek via Gcc-patches
In this PR, we are crashing because we've encountered a UDL where a
string-literal is expected.  This patch makes the parser reject string
and character UDLs in all places where the grammar requires a
string-literal and not a user-defined-string-literal.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/105300

gcc/c-family/ChangeLog:

* c-pragma.cc (handle_pragma_message): Warn for CPP_STRING_USERDEF.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_string_literal): Add a bool parameter.
Give an error when UDLs are not permitted.
(cp_parser_primary_expression): Adjust the call to
cp_parser_string_literal.
(cp_parser_linkage_specification): Likewise.
(cp_parser_static_assert): Likewise.
(cp_parser_operator): Likewise.
(cp_parser_asm_definition): Likewise.
(cp_parser_asm_specification_opt): Likewise.
(cp_parser_asm_operand_list): Likewise.
(cp_parser_asm_clobber_list): Likewise.
(cp_parser_omp_context_selector): Likewise.
(pragma_lex): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/udlit-error1.C: New test.
---
 gcc/c-family/c-pragma.cc  |  3 +
 gcc/cp/parser.cc  | 69 ++-
 gcc/testsuite/g++.dg/cpp0x/udlit-error1.C | 21 +++
 3 files changed, 65 insertions(+), 28 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/udlit-error1.C

diff --git a/gcc/c-family/c-pragma.cc b/gcc/c-family/c-pragma.cc
index 142a46441ac..49f405b605b 100644
--- a/gcc/c-family/c-pragma.cc
+++ b/gcc/c-family/c-pragma.cc
@@ -1390,6 +1390,9 @@ handle_pragma_message (cpp_reader *)
 }
   else if (token == CPP_STRING)
 message = x;
+  else if (token == CPP_STRING_USERDEF)
+GCC_BAD ("string literal with user-defined suffix is invalid in this "
+"context");
   else
 GCC_BAD ("expected a string after %<#pragma message%>");
 
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index e4021835ed5..ae2798e2a33 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -2226,7 +2226,7 @@ pop_unparsed_function_queues (cp_parser *parser)
 static cp_expr cp_parser_identifier
   (cp_parser *);
 static cp_expr cp_parser_string_literal
-  (cp_parser *, bool, bool, bool);
+  (cp_parser *, bool, bool, bool, bool);
 static cp_expr cp_parser_userdef_char_literal
   (cp_parser *);
 static tree cp_parser_userdef_string_literal
@@ -4402,7 +4402,8 @@ cp_parser_identifier (cp_parser* parser)
TREE_STRING representing the combined, nul-terminated string
constant.  If TRANSLATE is true, translate the string to the
execution character set.  If WIDE_OK is true, a wide string is
-   invalid here.
+   valid here.  If UDL_OK is true, a string literal with user-defined
+   suffix can be used in this context.
 
C++98 [lex.string] says that if a narrow string literal token is
adjacent to a wide string literal token, the behavior is undefined.
@@ -4414,7 +4415,7 @@ cp_parser_identifier (cp_parser* parser)
FUTURE: ObjC++ will need to handle @-strings here.  */
 static cp_expr
 cp_parser_string_literal (cp_parser *parser, bool translate, bool wide_ok,
- bool lookup_udlit = true)
+ bool udl_ok, bool lookup_udlit = true)
 {
   tree value;
   size_t count;
@@ -4439,6 +4440,12 @@ cp_parser_string_literal (cp_parser *parser, bool 
translate, bool wide_ok,
 
   if (cpp_userdef_string_p (tok->type))
 {
+  if (!udl_ok)
+   {
+ error_at (loc, "string literal with user-defined suffix "
+   "is invalid in this context");
+ return error_mark_node;
+   }
   string_tree = USERDEF_LITERAL_VALUE (tok->u.value);
   curr_type = cpp_userdef_string_remove_type (tok->type);
   curr_tok_is_userdef_p = true;
@@ -5655,7 +5662,7 @@ cp_parser_primary_expression (cp_parser *parser,
 argument to cp_parser_string_literal.  */
   return (cp_parser_string_literal (parser,
parser->translate_strings_p,
-   true)
+   /*wide_ok=*/true, /*udl_ok=*/true)
  .maybe_add_location_wrapper ());
 
 case CPP_OPEN_PAREN:
@@ -16161,15 +16168,14 @@ cp_parser_function_specifier_opt (cp_parser* parser,
 static void
 cp_parser_linkage_specification (cp_parser* parser, tree prefix_attr)
 {
-  tree linkage;
-
   /* Look for the `extern' keyword.  */
   cp_token *extern_token
 = cp_parser_require_keyword (parser, RID_EXTERN, RT_EXTERN);
 
   /* Look for the string-literal.  */
   cp_token *string_token = cp_lexer_peek_token (parser->lexer);
-  linkage = cp_parser_string_literal (parser, false, false);
+  tree linkage = cp_parser_string_literal (parser, /*translate=*/false,
+  /*wide_ok=*/false, /*udl_ok=*/false);
 
   /* Transform the literal into an identifier.  If the literal is a
  

Re: c: C2x constexpr

2022-11-12 Thread Richard Biener via Gcc-patches



> On 12.11.2022 at 05:56, Joseph Myers wrote:
> 
> [Global / middle-end reviewers, note there is a dfp.cc change here
> that needs review.]
> 
> Implement C2x constexpr (a feature based on the C++ one but much more
> minimal, with only constexpr variables, not functions).
> 
> I believe this implementation is fully functional for use of this
> feature.  However, there are several things that seem unclear about
> the specification that I'll need to raise in NB comments.  There are
> also areas where there may be followup bug fixes because the
> implementation doesn't reject some more obscure cases that ought to be
> rejected: cases where a constexpr initializer for floating type meets
> the constraints for a constant expression in initializers but not
> those for an arithmetic constant expression (previously we haven't had
> to track whether something is an arithmetic constant expression in
> detail, unlike with integer constant expressions), and some cases
> where a tag or struct or union member gets declared indirectly in the
> declaration specifiers or declarator of a constexpr declaration, which
> is not permitted (modulo lack of clarity in the specification) for
> underspecified declarations in general (the cases of a declaration in
> the initializer, or a tagged type being directly declared as a type
> specifier, are already detected).
> 
> Cases of ambiguity in the specification include:
> 
> * Many questions (previously raised in WG14 discussions) over the rule
>  about what conversions do or do not involve a change of value that's
>  not allowed in a constexpr initializer, that aren't properly
>  addressed by the normative text (and where the footnote on the
>  subject isn't very clear either, and the examples don't necessarily
>  follow from the normative text).  I've made a series of choices
>  there, that include disallowing all conversions between real and
>  complex types or between binary and decimal floating types in
>  constexpr initializers, that might not necessarily agree with how
>  things end up getting clarified.
> 
>  The dfp.cc change also arises here, to allow quiet NaN initializers
>  of one DFP type to be used in a constexpr initializer for another
>  DFP type (as is possible for signaling NaNs) by ensuring the result
>  of such a conversion is properly marked as canonical (note that most
>  of the DFP code doesn't actually do anything with NaN payloads at
>  all).
> 
> * Various issues with what exactly counts as part of a declaration for
>  the purposes of the rule on underspecified declarations not
>  declaring any identifiers other than ordinary identifiers (and not
>  declaring more than one ordinary identifier, though the latter is
>  undefined behavior).  These include cases where the declaration of a
>  struct / union / enum type appears inside typeof or alignas in the
>  declaration specifiers (the latter also applies with auto), or in
>  the declarator (e.g. an array size or in a parameter declaration).
>  The issues are similar to those involved in C90 DR#115 and C99 DRs
>  #277 and #341; the intent may not be the same in all the different
>  cases involved, but it's not clear that the normative wording in the
>  various places is sufficient to deduce the differences in intent.
> 
> * The wording about producing a compound literal constant using member
>  access is present in one place but another place only applies that
>  to named constants.
> 
> * It's not clear when a structure or union constant (a constexpr
>  variable or compound literal with structure or union type, or a
>  member with such type extracted by a series of member access
>  operations) can itself be used in an initializer (constexpr or
>  otherwise).  Based on general wording for initializers not having
>  been changed, the working draft might only strictly allow it at
>  automatic storage duration (but elsewhere it would be undefined
>  behavior, not a constraint violation, so no diagnostic required) -
>  since that's the only case mentioned where a single expression of
>  structure or union type can be used to initialize an object of such
>  a type.  But it definitely seems to be allowed in even constexpr
>  initializers at automatic storage duration - and since generally
>  constexpr initializers (any storage duration) are *more* constrained
>  than ordinary static storage duration initializers, it would seem
>  odd for it not to be allowed at static storage duration.
> 
> * When you do allow such initializers, it's then not entirely clear
>  how the constraint that constexpr pointer initializers must be null
>  pointer constants should be applied (given that a constexpr object
>  of pointer type is a null pointer but *not* a null pointer
>  constant).  My guess would be that a constexpr struct or union
>  containing such a field should still be allowed as an initializer,
>  but the wording could be read otherwise.
> 
> * It also becomes important with constexpr exactly what kind of
>  

Re: [PATCH 4/4]AArch64 sve2: rewrite pack + NARROWB + NARROWB to NARROWB + NARROWT

2022-11-12 Thread Richard Sandiford via Gcc-patches
Richard Sandiford  writes:
> Tamar Christina  writes:
>> Hi All,
>>
>> This adds an RTL pattern for when two NARROWB instructions are being combined
>> with a PACK.  The second NARROWB is then transformed into a NARROWT.
>>
>> For the example:
>>
>> void draw_bitmap1(uint8_t* restrict pixel, uint8_t level, int n)
>> {
>>   for (int i = 0; i < (n & -16); i+=1)
>> pixel[i] += (pixel[i] * level) / 0xff;
>> }
>>
>> we generate:
>>
>> addhnb  z6.b, z0.h, z4.h
>> addhnb  z5.b, z1.h, z4.h
>> addhnb  z0.b, z0.h, z6.h
>> addhnt  z0.b, z1.h, z5.h
>> add z0.b, z0.b, z2.b
>>
>> instead of:
>>
>> addhnb  z6.b, z1.h, z4.h
>> addhnb  z5.b, z0.h, z4.h
>> addhnb  z1.b, z1.h, z6.h
>> addhnb  z0.b, z0.h, z5.h
>> uzp1    z0.b, z0.b, z1.b
>> add z0.b, z0.b, z2.b
>>
>> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>>
>> Ok for master?
>>
>> Thanks,
>> Tamar
>>
>> gcc/ChangeLog:
>>
>>  * config/aarch64/aarch64-sve2.md (*aarch64_sve_pack_):
>>  New.
>>  * config/aarch64/iterators.md (binary_top): New.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.dg/vect/vect-div-bitmask-4.c: New test.
>>  * gcc.target/aarch64/sve2/div-by-bitmask_2.c: New test.
>>
>> --- inline copy of patch -- 
>> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
>> b/gcc/config/aarch64/aarch64-sve2.md
>> index 
>> ab5dcc369481311e5bd68a1581265e1ce99b4b0f..0ee46c8b0d43467da4a6b98ad3c41e5d05d8cf38
>>  100644
>> --- a/gcc/config/aarch64/aarch64-sve2.md
>> +++ b/gcc/config/aarch64/aarch64-sve2.md
>> @@ -1600,6 +1600,25 @@ (define_insn "@aarch64_sve_"
>>"\t%0., %2., %3."
>>  )
>>  
>> +(define_insn_and_split "*aarch64_sve_pack_"
>> +  [(set (match_operand: 0 "register_operand" "=w")
>> +(unspec:
>> +  [(match_operand:SVE_FULL_HSDI 1 "register_operand" "w")
>
> "0" would be safer, in case the instruction is only split after RA.
>
>> +   (subreg:SVE_FULL_HSDI (unspec:
>> + [(match_operand:SVE_FULL_HSDI 2 "register_operand" "w")
>> +  (match_operand:SVE_FULL_HSDI 3 "register_operand" "w")]
>> + SVE2_INT_BINARY_NARROWB) 0)]
>> +  UNSPEC_PACK))]
>
> I think ideally this would be the canonical pattern, so that we can
> drop the separate top unspecs.  That's more work though, and would
> probably make sense to do once we have a generic way of representing
> the pack.
>
> So OK with the "0" change above.

Hmm, actually, I take that back.  Is this transform really correct?
I think the blend corresponds to a TRN1 rather than a UZP1.
The bottom operations populate the lower half of each wider element
and the top operations populate the upper half.

Thanks,
Richard


Re: [PATCH 4/4]AArch64 sve2: rewrite pack + NARROWB + NARROWB to NARROWB + NARROWT

2022-11-12 Thread Richard Sandiford via Gcc-patches
Tamar Christina  writes:
> Hi All,
>
> This adds an RTL pattern for when two NARROWB instructions are being combined
> with a PACK.  The second NARROWB is then transformed into a NARROWT.
>
> For the example:
>
> void draw_bitmap1(uint8_t* restrict pixel, uint8_t level, int n)
> {
>   for (int i = 0; i < (n & -16); i+=1)
> pixel[i] += (pixel[i] * level) / 0xff;
> }
>
> we generate:
>
> addhnb  z6.b, z0.h, z4.h
> addhnb  z5.b, z1.h, z4.h
> addhnb  z0.b, z0.h, z6.h
> addhnt  z0.b, z1.h, z5.h
> add z0.b, z0.b, z2.b
>
> instead of:
>
> addhnb  z6.b, z1.h, z4.h
> addhnb  z5.b, z0.h, z4.h
> addhnb  z1.b, z1.h, z6.h
> addhnb  z0.b, z0.h, z5.h
> uzp1    z0.b, z0.b, z1.b
> add z0.b, z0.b, z2.b
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-sve2.md (*aarch64_sve_pack_):
>   New.
>   * config/aarch64/iterators.md (binary_top): New.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.dg/vect/vect-div-bitmask-4.c: New test.
>   * gcc.target/aarch64/sve2/div-by-bitmask_2.c: New test.
>
> --- inline copy of patch -- 
> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
> b/gcc/config/aarch64/aarch64-sve2.md
> index 
> ab5dcc369481311e5bd68a1581265e1ce99b4b0f..0ee46c8b0d43467da4a6b98ad3c41e5d05d8cf38
>  100644
> --- a/gcc/config/aarch64/aarch64-sve2.md
> +++ b/gcc/config/aarch64/aarch64-sve2.md
> @@ -1600,6 +1600,25 @@ (define_insn "@aarch64_sve_"
>"\t%0., %2., %3."
>  )
>  
> +(define_insn_and_split "*aarch64_sve_pack_"
> +  [(set (match_operand: 0 "register_operand" "=w")
> + (unspec:
> +   [(match_operand:SVE_FULL_HSDI 1 "register_operand" "w")

"0" would be safer, in case the instruction is only split after RA.

> +(subreg:SVE_FULL_HSDI (unspec:
> +  [(match_operand:SVE_FULL_HSDI 2 "register_operand" "w")
> +   (match_operand:SVE_FULL_HSDI 3 "register_operand" "w")]
> +  SVE2_INT_BINARY_NARROWB) 0)]
> +   UNSPEC_PACK))]

I think ideally this would be the canonical pattern, so that we can
drop the separate top unspecs.  That's more work though, and would
probably make sense to do once we have a generic way of representing
the pack.

So OK with the "0" change above.

Thanks,
Richard

> +  "TARGET_SVE2"
> +  "#"
> +  "&& true"
> +  [(const_int 0)]
> +{
> +  rtx tmp = lowpart_subreg (mode, operands[1], mode);
> +  emit_insn (gen_aarch64_sve (, 
> mode,
> +   operands[0], tmp, operands[2], operands[3]));
> +})
> +
>  ;; -
>  ;;  [INT] Narrowing right shifts
>  ;; -
> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index 
> 0dd9dc66f7ccd78acacb759662d0cd561cd5b4ef..37d8161a33b1c399d80be82afa67613a087389d4
>  100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -3589,6 +3589,11 @@ (define_int_attr brk_op [(UNSPEC_BRKA "a") 
> (UNSPEC_BRKB "b")
>  
>  (define_int_attr sve_pred_op [(UNSPEC_PFIRST "pfirst") (UNSPEC_PNEXT 
> "pnext")])
>  
> +(define_int_attr binary_top [(UNSPEC_ADDHNB "UNSPEC_ADDHNT")
> +  (UNSPEC_RADDHNB "UNSPEC_RADDHNT")
> +  (UNSPEC_RSUBHNB "UNSPEC_RSUBHNT")
> +  (UNSPEC_SUBHNB "UNSPEC_SUBHNT")])
> +
>  (define_int_attr sve_int_op [(UNSPEC_ADCLB "adclb")
>(UNSPEC_ADCLT "adclt")
>(UNSPEC_ADDHNB "addhnb")
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-4.c 
> b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-4.c
> new file mode 100644
> index 
> ..0df08bda6fd3e33280307ea15c82dd9726897cfd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-4.c
> @@ -0,0 +1,26 @@
> +/* { dg-require-effective-target vect_int } */
> +/* { dg-additional-options "-fno-vect-cost-model" { target aarch64*-*-* } } 
> */
> +
> +#include 
> +#include "tree-vect.h"
> +
> +#define N 50
> +#define TYPE uint32_t
> +
> +__attribute__((noipa, noinline, optimize("O1")))
> +void fun1(TYPE* restrict pixel, TYPE level, int n)
> +{
> +  for (int i = 0; i < n; i+=1)
> +pixel[i] += (pixel[i] * (uint64_t)level) / 0xUL;
> +}
> +
> +__attribute__((noipa, noinline, optimize("O3")))
> +void fun2(TYPE* restrict pixel, TYPE level, int n)
> +{
> +  for (int i = 0; i < n; i+=1)
> +pixel[i] += (pixel[i] * (uint64_t)level) / 0xUL;
> +}
> +
> +#include "vect-div-bitmask.h"
> +
> +/* { dg-final { scan-tree-dump-not "vect_recog_divmod_pattern: detected" 
> "vect" { target aarch64*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/div-by-bitmask_2.c 
> b/gcc/testsuite/gcc.target/aarch64/sve2/div-by-bitmask_2.c
> new file mode 100644
> index 

Re: [PATCH 3/4]AArch64 Add SVE2 implementation for pow2 bitmask division

2022-11-12 Thread Richard Sandiford via Gcc-patches
Sorry for the slow review, been snowed under with stage1 stuff.

Tamar Christina  writes:
> Hi All,
>
> In plenty of image and video processing code it's common to modify pixel
> values by a widening operation and then scale them back into range by
> dividing by 255.
>
> This patch adds a named function to allow us to emit an optimized sequence
> when doing an unsigned division that is equivalent to:
>
>x = y / (2 ^ (bitsize (y)/2)-1)
>
> For SVE2 this means we generate for:
>
> void draw_bitmap1(uint8_t* restrict pixel, uint8_t level, int n)
> {
>   for (int i = 0; i < (n & -16); i+=1)
> pixel[i] = (pixel[i] * level) / 0xff;
> }
>
> the following:
>
> mov     z3.b, #1
> .L3:
> ld1b    z0.h, p0/z, [x0, x3]
> mul     z0.h, p1/m, z0.h, z2.h
> addhnb  z1.b, z0.h, z3.h
> addhnb  z0.b, z0.h, z1.h
> st1b    z0.h, p0, [x0, x3]
> inch    x3
> whilelo p0.h, w3, w2
> b.any   .L3
>
> instead of:
>
> .L3:
> ld1b    z0.h, p1/z, [x0, x3]
> mul     z0.h, p0/m, z0.h, z1.h
> umulh   z0.h, p0/m, z0.h, z2.h
> lsr     z0.h, z0.h, #7
> st1b    z0.h, p1, [x0, x3]
> inch    x3
> whilelo p1.h, w3, w2
> b.any   .L3
>
> Which results in significantly faster code.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-sve2.md (@aarch64_bitmask_udiv3): New.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/sve2/div-by-bitmask_1.c: New test.
>
> --- inline copy of patch -- 
> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
> b/gcc/config/aarch64/aarch64-sve2.md
> index 
> f138f4be4bcf74c1a4a6d5847ed831435246737f..4d097f7c405cc68a1d6cda5c234a1023a6eba0d1
>  100644
> --- a/gcc/config/aarch64/aarch64-sve2.md
> +++ b/gcc/config/aarch64/aarch64-sve2.md
> @@ -71,6 +71,7 @@
>  ;;  [INT] Reciprocal approximation
>  ;;  [INT<-FP] Base-2 logarithm
>  ;;  [INT] Polynomial multiplication
> +;;  [INT] Misc optab implementations
>  ;;
>  ;; == Permutation
>  ;;  [INT,FP] General permutes
> @@ -2312,6 +2313,47 @@ (define_insn "@aarch64_sve_"
>"\t%0., %1., %2."
>  )
>  
> +;; -
> +;;  [INT] Misc optab implementations
> +;; -
> +;; Includes:
> +;; - aarch64_bitmask_udiv
> +;; -
> +
> +;; div optimizations using narrowings
> +;; we can do the division e.g. shorts by 255 faster by calculating it as
> +;; (x + ((x + 257) >> 8)) >> 8 assuming the operation is done in
> +;; double the precision of x.
> +;;
> +;; See aarch64-simd.md for bigger explanation.
> +(define_expand "@aarch64_bitmask_udiv3"
> +  [(match_operand:SVE_FULL_HSDI 0 "register_operand")
> +   (match_operand:SVE_FULL_HSDI 1 "register_operand")
> +   (match_operand:SVE_FULL_HSDI 2 "immediate_operand")]
> +  "TARGET_SVE2"
> +{
> +  unsigned HOST_WIDE_INT size
> += (1ULL << GET_MODE_UNIT_BITSIZE (mode)) - 1;
> +  if (!CONST_VECTOR_P (operands[2])
> +  || const_vector_encoded_nelts (operands[2]) != 1
> +  || size != UINTVAL (CONST_VECTOR_ELT (operands[2], 0)))
> +FAIL;

A slightly simpler way to write this, without the direct use of the
encoding, is:

  rtx elt = unwrap_const_vec_duplicate (operands[2]);
  if (!CONST_INT_P (elt) || UINTVAL (elt) != size)
FAIL;

OK with that change, thanks.

Richard

> +
> +  rtx addend = gen_reg_rtx (mode);
> +  rtx tmp1 = gen_reg_rtx (mode);
> +  rtx tmp2 = gen_reg_rtx (mode);
> +  rtx val = aarch64_simd_gen_const_vector_dup (mode, 1);
> +  emit_move_insn (addend, lowpart_subreg (mode, val, mode));
> +  emit_insn (gen_aarch64_sve (UNSPEC_ADDHNB, mode, tmp1, operands[1],
> +   addend));
> +  emit_insn (gen_aarch64_sve (UNSPEC_ADDHNB, mode, tmp2, operands[1],
> +   lowpart_subreg (mode, tmp1,
> +   mode)));
> +  emit_move_insn (operands[0],
> +   lowpart_subreg (mode, tmp2, mode));
> +  DONE;
> +})
> +
>  ;; =
>  ;; == Permutation
>  ;; =
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/div-by-bitmask_1.c 
> b/gcc/testsuite/gcc.target/aarch64/sve2/div-by-bitmask_1.c
> new file mode 100644
> index 
> ..e6f5098c30f4e2eb8ed1af153c0bb0d204cda6d9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/div-by-bitmask_1.c
> @@ -0,0 +1,53 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2 -std=c99" } */
> +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
> +
> +#include 
> +
> +/*
> +** draw_bitmap1:
> +** ...
> +**   mul  

Re: [PATCH] libstdc++: Fix up to_chars ppc64le _Float128 overloads [PR107636]

2022-11-12 Thread Jonathan Wakely via Gcc-patches
On Sat, 12 Nov 2022, 08:47 Jakub Jelinek via Libstdc++, <
libstd...@gcc.gnu.org> wrote:

> Hi!
>
> As reported, I've misplaced __extension__ keywords in these cases
> (I wanted not to have them on the whole inlines because _Float128 is
> completely standard now while __float128 is not), but before return
> it is a syntax error.
> I've verified on a short testcase that both g++ and clang++ accept
> __extension__ after return keyword.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux (admittedly
> not powerpc64le-linux with new glibc), ok for trunk?
>

OK, thanks



> 2022-11-12  Jakub Jelinek  
>
> PR libstdc++/107636
> * include/std/charconv (to_chars): Fix up powerpc64le _Float128
> overload __extension__ placement.
>
> --- libstdc++-v3/include/std/charconv.jj2022-11-11
> 08:15:45.696183293 +0100
> +++ libstdc++-v3/include/std/charconv   2022-11-11 16:32:52.992693605 +0100
> @@ -892,23 +892,25 @@ namespace __detail
>inline to_chars_result
>to_chars(char* __first, char* __last, _Float128 __value) noexcept
>{
> -__extension__
> -return to_chars(__first, __last, static_cast<__float128>(__value));
> +return __extension__ to_chars(__first, __last,
> + static_cast<__float128>(__value));
>}
>inline to_chars_result
>to_chars(char* __first, char* __last, _Float128 __value,
>chars_format __fmt) noexcept
>{
> -__extension__
> -return to_chars(__first, __last, static_cast<__float128>(__value),
> __fmt);
> +
> +return __extension__ to_chars(__first, __last,
> + static_cast<__float128>(__value), __fmt);
>}
>inline to_chars_result
>to_chars(char* __first, char* __last, _Float128 __value,
>chars_format __fmt, int __precision) noexcept
>{
> -__extension__
> -return to_chars(__first, __last, static_cast<__float128>(__value),
> __fmt,
> -   __precision);
> +
> +return __extension__ to_chars(__first, __last,
> + static_cast<__float128>(__value), __fmt,
> + __precision);
>}
>  #else
>to_chars_result to_chars(char* __first, char* __last, _Float128 __value)
>
> Jakub
>
>


[PATCH] c++: Implement CWG2635 - Constrained structured bindings

2022-11-12 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch implements CWG2635.

So far tested on
GXX_TESTSUITE_STDS=98,11,14,17,20,2b make check-g++ 
RUNTESTFLAGS="dg.exp=decomp*"
ok for trunk if it passes full bootstrap/regtest and it is voted in?

2022-11-12  Jakub Jelinek  

* decl.cc (grokdeclarator): Implement
CWG2635 - Constrained structured bindings.  Diagnose constrained
auto type.

* g++.dg/cpp2a/decomp5.C: New test.

--- gcc/cp/decl.cc.jj   2022-11-11 17:14:33.103869977 +0100
+++ gcc/cp/decl.cc  2022-11-12 12:13:52.217239729 +0100
@@ -12660,7 +12660,8 @@ grokdeclarator (const cp_declarator *dec
  gcc_unreachable ();
}
   if (TREE_CODE (type) != TEMPLATE_TYPE_PARM
- || TYPE_IDENTIFIER (type) != auto_identifier)
+ || TYPE_IDENTIFIER (type) != auto_identifier
+ || PLACEHOLDER_TYPE_CONSTRAINTS_INFO (type))
{
  if (type != error_mark_node)
{
--- gcc/testsuite/g++.dg/cpp2a/decomp5.C.jj 2022-11-12 12:17:21.024392082 
+0100
+++ gcc/testsuite/g++.dg/cpp2a/decomp5.C2022-11-12 12:20:00.700214521 
+0100
@@ -0,0 +1,20 @@
+// CWG2635 - Constrained structured bindings 
+// { dg-do compile { target c++20 } }
+
+namespace std {
+  template struct tuple_size;
+  template struct tuple_element;
+}
+
+struct A {
+  int i;
+  A(int x) : i(x) {}
+  template  int& get() { return i; }
+};
+
+template<> struct std::tuple_size { static const int value = 2; };
+template struct std::tuple_element { using type = int; };
+
+template concept C = true;
+C auto [x, y] = A{1}; // { dg-error "structured binding declaration cannot 
have type 'auto \\\[requires ::C<, >\\\]'" }
+ // { dg-message "type must be cv-qualified 'auto' or 
reference to cv-qualified 'auto'" "" { target *-*-* } .-1 }

Jakub



[COMMITTED] [frange] Avoid testing signed zero test for -fno-signed-zeros.

2022-11-12 Thread Aldy Hernandez via Gcc-patches
This patch moves a test that is meant to only work for signed zeros
into range_tests_signed_zeros.

I am not aware of any architectures where this is failing, but it is
annoying to see selftests failing when -fno-signed-zeros is used.

gcc/ChangeLog:

* value-range.cc (range_tests_signbit): Move test from here...
(range_tests_signed_zeros): ...to here.
---
 gcc/value-range.cc | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index d55d85846c1..34fac636cad 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -3928,6 +3928,11 @@ range_tests_signed_zeros ()
   r0.set_nonnegative (float_type_node);
   if (HONOR_NANS (float_type_node))
 ASSERT_TRUE (r0.maybe_isnan ());
+
+  // Numbers containing zero should have an unknown SIGNBIT.
+  r0 = frange_float ("0", "10");
+  r0.clear_nan ();
+  ASSERT_TRUE (r0.signbit_p (signbit) && !signbit);
 }
 
 static void
@@ -3944,10 +3949,6 @@ range_tests_signbit ()
   r0 = frange_float ("1", "10");
   r0.clear_nan ();
   ASSERT_TRUE (r0.signbit_p (signbit) && !signbit);
-  // Numbers containing zero should have an unknown SIGNBIT.
-  r0 = frange_float ("0", "10");
-  r0.clear_nan ();
-  ASSERT_TRUE (r0.signbit_p (signbit) && !signbit);
   // Numbers spanning both positive and negative should have an
   // unknown SIGNBIT.
   r0 = frange_float ("-10", "10");
-- 
2.38.1



Re: [PATCH v2] LoongArch: Add prefetch instructions.

2022-11-12 Thread Xi Ruoyao via Gcc-patches
On Sat, 2022-11-12 at 15:37 +0800, Lulu Cheng wrote:
> Co-Authored-By: xujiahao 
> 
> gcc/ChangeLog:
> 
> * config/loongarch/loongarch-def.c: Initial number of parallel
> prefetch.
> * config/loongarch/loongarch-tune.h (struct loongarch_cache):
> Define number of parallel prefetch.
> * config/loongarch/loongarch.cc
> (loongarch_option_override_internal):
> Set up parameters to be used in prefetching algorithm.
> (loongarch_prefetch_cookie): Select load or store based on the
> value of write.
> * config/loongarch/loongarch.md (prefetch): New template.
> (*prefetch_indexed_): New template.

Missing config/loongarch/constraints.md.

/* snip */

>  rtx
>  loongarch_prefetch_cookie (rtx write, rtx locality)
>  {
> -  /* store_streamed / load_streamed.  */
> -  if (INTVAL (locality) <= 0)
> -    return GEN_INT (INTVAL (write) + 4);
> +  if (INTVAL (locality) == 1 && INTVAL (write) == 0)
> +    return GEN_INT (INTVAL (write) + 2);

So __builtin_prefetch(ptr, 0, 1) will produce
"preld 2,$r4,0", while the document says

   hint has 32 optional values (0 to 31), 0 represents load to level 1
   Cache, and 8 represents store to level 1 Cache. The remaining hint
   values are not defined and are processed for nop instructions when the
   processor executes.
   
OTOH hint 2 is documented in preldx.  So does preld also support hint 2?

/* snip */


> +(define_insn "prefetch"
> +  [(prefetch (match_operand 0 "address_operand" "ZD,ZE")
> +    (match_operand 1 "const_int_operand" "n,n")
> +    (match_operand 2 "const_int_operand" "n,n"))]
> +  ""
> +{
> +  operands[1] = loongarch_prefetch_cookie (operands[1], operands[2]);
> +
> +  switch (which_alternative)
> +    {
> +    case 0:
> +  return "preld\t%1,%a0";
> +    case 1:
> +  return "preldx\t%1,%a0";

void prefetch(char *ptr, int off)
{
return __builtin_prefetch(ptr + off);
}

It's compiled to "preldx 0,$r4,$r5".  I don't think it's correct because
according to the doc, rk should contain several bit-fields instead of
an offset.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [DOCS] sphinx: use new Sphinx links

2022-11-12 Thread Gerald Pfeifer
On Thu, 10 Nov 2022, Martin Liška wrote:
> See that the problematic for some reason uses "content-security-policy:
> default-src 'self' http: https:".

Yep.

On Thu, 10 Nov 2022, Tobias Burnus wrote:
> content-security-policy: default-src 'self' http: https:
> 
> There must be some server configuration that adds this - but it does not
> seem to be in the .ht* files in the wwwdocs git repo.

And yep. I dug into this yesterday and found the following:

In /etc/httpd/conf.d/sourceware-vhost-gcc.conf we have

  
Header unset Content-Security-Policy
  

I am not aware of who added this or why, yet it seems that if we can get
the same in place for /install we'll be good again, so I'll ask overseers@.


Next step: redirects from the old /install docs to the new ones.

Gerald


Re: [Patch] OpenMP/Fortran: Use firstprivat not alloc for ptr attach for arrays

2022-11-12 Thread Thomas Schwinge
Hi Tobias!

On 2022-05-13T19:44:51+0200, Jakub Jelinek via Fortran wrote:
> On Fri, May 13, 2022 at 07:21:02PM +0200, Tobias Burnus wrote:
>> gcc/fortran/ChangeLog:
>>
>>  * trans-openmp.cc (gfc_trans_omp_clauses): When mapping nondescriptor
>>  array sections, use GOMP_MAP_FIRSTPRIVATE_POINTER instead of
>>  GOMP_MAP_POINTER for the pointer attachment.
>>
>> libgomp/ChangeLog:
>>
>>  * testsuite/libgomp.fortran/target-nowait-array-section.f90: New test.
>
> Not 100% sure if we want to add such a testcase into the testsuite given
> that it is not valid OpenMP, but perhaps it is ok as we are testing a QoI.

For non-offloading x86_64-pc-linux-gnu '-m32', I'm occasionally (but very
rarely!) seeing this test case FAIL its execution test.  Similar can also
be seen on occasional reports via ,
.


Regards
 Thomas


'libgomp.fortran/target-nowait-array-section.f90':

| ! Runs the target region asynchronously and checks for it
| !
| ! Note that  map(alloc: work(:, i)) + nowait  should be safe
| ! given that a nondescriptor array is used. However, it still
| ! violates a map clause restriction, added in OpenMP 5.1 [354:10-13].
|
| PROGRAM test_target_teams_distribute_nowait
|   USE ISO_Fortran_env, only: INT64
|   implicit none
| INTEGER, parameter :: N = 1024, N_TASKS = 16
| INTEGER :: i, j, k, my_ticket
| INTEGER :: order(n_tasks)
| INTEGER(INT64) :: work(n, n_tasks)
| INTEGER :: ticket
| logical :: async
|
| ticket = 0
|
| !$omp target enter data map(to: ticket, order)
|
| !$omp parallel do num_threads(n_tasks)
| DO i = 1, n_tasks
|!$omp target map(alloc: work(:, i), ticket) private(my_ticket) nowait
|!!$omp target teams distribute map(alloc: work(:, i), ticket) private(my_ticket) nowait
|DO j = 1, n
|   ! Waste cycles
| !  work(j, i) = 0
| !  DO k = 1, n*(n_tasks - i)
| ! work(j, i) = work(j, i) + i*j*k
| !  END DO
|   my_ticket = 0
|   !$omp atomic capture
|   ticket = ticket + 1
|   my_ticket = ticket
|   !$omp end atomic
|   !$omp atomic write
|   order(i) = my_ticket
|END DO
|!$omp end target !teams distribute
| END DO
| !$omp end parallel do
|
| !$omp target exit data map(from:ticket, order)
|
| IF (ticket .ne. n_tasks*n) stop 1
| if (maxval(order) /= n_tasks*n) stop 2
| ! order(i) == n*i if synchronous and between n and n*n_tasks if run concurrently
| do i = 1, n_tasks
|   if (order(i) < n .or. order(i) > n*n_tasks) stop 3
| end do
| async = .false.
| do i = 1, n_tasks
|   if (order(i) /= n*i) async = .true.
| end do
| if (.not. async) stop 4 ! Did not run asynchronously
| end


[PATCH v2] LoongArch: Optimize the implementation of stack check.

2022-11-12 Thread Lulu Cheng
The old stack check was performed before the stack was dropped,
which would cause the detection tool to report a memory leak.

The current stack check scheme is as follows:

'-fstack-clash-protection':
1. When the frame->total_size is smaller than the guard page size,
   the stack is dropped according to the original scheme, and there
   is no need to perform stack detection in the prologue.
2. When frame->total_size is greater than or equal to the guard page size,
   the first step is to drop only the space required for the
   callee-saved registers. Saving those registers writes to this space,
   so it acts as an implicit stack check, and only the rest of the
   stack space needs an explicit check.
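
The two cases above can be sketched in C as follows. This is only a model
of the decision, not code from loongarch.cc: the struct, the function
name plan_stack_clash and its parameters are hypothetical, and the
"original scheme" of case 1 is simplified to a single stack drop.

```c
#include <assert.h>

/* Hypothetical model of the prologue decision; none of these names
   come from the patch itself.  */
struct probe_plan
{
  long first_step;      /* bytes dropped by the first stack adjustment */
  long explicit_probe;  /* bytes that still need explicit probing */
};

static struct probe_plan
plan_stack_clash (long total_size, long save_area_size, long guard_size)
{
  struct probe_plan plan;

  /* The register save area is part of the frame.  */
  assert (save_area_size >= 0 && save_area_size <= total_size);

  if (total_size < guard_size)
    {
      /* Case 1: the frame is smaller than the guard page, so it is
         dropped without any probe (assumed here to be one step).  */
      plan.first_step = total_size;
      plan.explicit_probe = 0;
    }
  else
    {
      /* Case 2: drop only the register save area first.  The register
         stores touch that area and act as an implicit probe, so only
         the remainder is left for explicit checking.  */
      plan.first_step = save_area_size;
      plan.explicit_probe = total_size - save_area_size;
    }
  return plan;
}
```

For example, with an assumed 16 KiB guard size, a 64 KiB frame with a
96-byte save area gives a first step of 96 bytes and leaves 65440 bytes
to probe explicitly.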

'-fstack-check':
Unlike what the documentation describes, the stack is not dropped all at
once and then probed page by page. As with '-fstack-clash-protection',
each page is probed immediately after it is dropped.

When frame->total_size is not 0, the first stack drop covers only the
size required to save the s registers.

The test cases are referenced from aarch64.

gcc/ChangeLog:

* config/loongarch/linux.h (STACK_CHECK_MOVING_SP):
Define this macro to 1.
* config/loongarch/loongarch.cc (loongarch_first_stack_step):
Return the size of the first drop stack according to whether stack
checking is performed.
(loongarch_emit_probe_stack_range): Adjust the method of stack checking
in prologue.
(loongarch_output_probe_stack_range): Delete useless code.
(loongarch_expand_prologue): Adjust the method of stack checking in
prologue.
(loongarch_option_override_internal): Enforce that interval is the same
size as size so the mid-end does the right thing.
* config/loongarch/loongarch.h (STACK_CLASH_MAX_UNROLL_PAGES):
New macro to decide whether to loop stack detection.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add loongarch support for
stack_clash_protection.
* gcc.target/loongarch/stack-check-alloca-1.c: New test.
* gcc.target/loongarch/stack-check-alloca-2.c: New test.
* gcc.target/loongarch/stack-check-alloca-3.c: New test.
* gcc.target/loongarch/stack-check-alloca-4.c: New test.
* gcc.target/loongarch/stack-check-alloca-5.c: New test.
* gcc.target/loongarch/stack-check-alloca-6.c: New test.
* gcc.target/loongarch/stack-check-alloca.h: New test.
* gcc.target/loongarch/stack-check-cfa-1.c: New test.
* gcc.target/loongarch/stack-check-cfa-2.c: New test.
* gcc.target/loongarch/stack-check-prologue-1.c: New test.
* gcc.target/loongarch/stack-check-prologue-2.c: New test.
* gcc.target/loongarch/stack-check-prologue-3.c: New test.
* gcc.target/loongarch/stack-check-prologue-4.c: New test.
* gcc.target/loongarch/stack-check-prologue-5.c: New test.
* gcc.target/loongarch/stack-check-prologue-6.c: New test.
* gcc.target/loongarch/stack-check-prologue-7.c: New test.
* gcc.target/loongarch/stack-check-prologue.h: New test.
---
 gcc/config/loongarch/linux.h  |   3 +
 gcc/config/loongarch/loongarch.cc | 249 +++---
 gcc/config/loongarch/loongarch.h  |   4 +
 .../loongarch/stack-check-alloca-1.c  |  15 ++
 .../loongarch/stack-check-alloca-2.c  |  12 +
 .../loongarch/stack-check-alloca-3.c  |  12 +
 .../loongarch/stack-check-alloca-4.c  |  12 +
 .../loongarch/stack-check-alloca-5.c  |  13 +
 .../loongarch/stack-check-alloca-6.c  |  13 +
 .../gcc.target/loongarch/stack-check-alloca.h |  15 ++
 .../gcc.target/loongarch/stack-check-cfa-1.c  |  12 +
 .../gcc.target/loongarch/stack-check-cfa-2.c  |  12 +
 .../loongarch/stack-check-prologue-1.c|  11 +
 .../loongarch/stack-check-prologue-2.c|  11 +
 .../loongarch/stack-check-prologue-3.c|  11 +
 .../loongarch/stack-check-prologue-4.c|  11 +
 .../loongarch/stack-check-prologue-5.c|  12 +
 .../loongarch/stack-check-prologue-6.c|  11 +
 .../loongarch/stack-check-prologue-7.c|  12 +
 .../loongarch/stack-check-prologue.h  |   5 +
 gcc/testsuite/lib/target-supports.exp |   7 +-
 21 files changed, 362 insertions(+), 101 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-alloca-1.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-alloca-2.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-alloca-3.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-alloca-4.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-alloca-5.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-alloca-6.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-alloca.h
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-cfa-1.c
 create mode 

Re: [PATCH] [range-ops] Add ability to represent open intervals in frange.

2022-11-12 Thread Aldy Hernandez via Gcc-patches
On Sat, Nov 12, 2022 at 9:54 AM Jakub Jelinek  wrote:
>
> On Fri, Nov 11, 2022 at 08:25:15PM +0100, Aldy Hernandez wrote:
> > Passes tests for all languages. Passes lapack tests.
> >
> > So ready to be installed unless you have any issues. Oh... I should
> > write some tests..
>
> LGTM.
>
> Yeah, for tests we still need to decide whether we make tests in the
> style like I've posted working or whether we add a plugin based tests.

FWIW, I don't have any objections to the plugin other than I may not
have enough cycles to help out for a while.

Aldy



Re: [PATCH] [range-ops] Add ability to represent open intervals in frange.

2022-11-12 Thread Jakub Jelinek via Gcc-patches
On Fri, Nov 11, 2022 at 08:25:15PM +0100, Aldy Hernandez wrote:
> Passes tests for all languages. Passes lapack tests.
> 
> So ready to be installed unless you have any issues. Oh... I should
> write some tests..

LGTM.

Yeah, for tests we still need to decide whether we make tests in the
style like I've posted working or whether we add a plugin based tests.

Jakub



[committed] libgomp: Fix up build on mingw [PR107641]

2022-11-12 Thread Jakub Jelinek via Gcc-patches
Hi!

Pointers should first be cast to intptr_t/uintptr_t before casting
them to another integral type, to avoid warnings.
Furthermore, the function has code like
  else if (upper <= UINT_MAX)
    something;
  else
    something_else;
so using an unsigned type for upper, for which upper <= UINT_MAX is
always true, does not seem to be intended.
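
A minimal C sketch of both points, assuming an integer limit that was
stored in a void * slot the way params[2] is. The helper names
limit_from_param, fits_in_uint and narrow_to_uint are made up for
illustration and are not part of libgomp.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: recover an integer stored in a pointer slot.
   Going through uintptr_t keeps the cast between a pointer and an
   integer of the same width, which avoids the size-mismatch warning
   on targets where long is narrower than a pointer.  */
static unsigned long
limit_from_param (void *param)
{
  return (uintptr_t) param;
}

/* The comparison from the message.  With an `unsigned long' operand it
   can be false on targets where long is wider than int.  */
static bool
fits_in_uint (unsigned long upper)
{
  return upper <= UINT_MAX;
}

/* What the old declaration `unsigned upper = ...' effectively did:
   the value is reduced modulo UINT_MAX + 1, so the comparison above
   could never fail afterwards.  */
static unsigned
narrow_to_uint (unsigned long upper)
{
  return (unsigned) upper;
}
```

Whatever value comes out of narrow_to_uint always satisfies
fits_in_uint, which is why the else branch was unreachable with the old
type.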

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2022-11-12  Jakub Jelinek  

PR libgomp/107641
* env.c (parse_unsigned_long): Cast params[2] to uintptr_t rather than
unsigned long.  Change type of upper from unsigned to unsigned long.

--- libgomp/env.c.jj    2022-09-13 18:59:52.331054977 +0200
+++ libgomp/env.c   2022-11-11 18:10:21.552415182 +0100
@@ -283,7 +283,7 @@ parse_unsigned_long_1 (const char *env,
 static bool
 parse_unsigned_long (const char *env, const char *val, void *const params[])
 {
-  unsigned upper = (unsigned long) params[2];
+  unsigned long upper = (uintptr_t) params[2];
   unsigned long pvalue = 0;
   bool ret = parse_unsigned_long_1 (env, val, &pvalue, (bool) params[1]);
   if (!ret)

Jakub



[PATCH] libstdc++: Fix up to_chars ppc64le _Float128 overloads [PR107636]

2022-11-12 Thread Jakub Jelinek via Gcc-patches
Hi!

As reported, I've misplaced __extension__ keywords in these cases
(wanted not to have them on the whole inlines because _Float128 is
completely standard now while __float128 is not), but before return
it is a syntax error.
I've verified on a short testcase that both g++ and clang++ accept
__extension__ after return keyword.
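
As a rough C analogue of the placement issue (the patch itself is C++
and concerns __float128; the function twice and the GNU statement
expression below are stand-ins chosen only for illustration):
__extension__ prefixes an expression or a declaration, not a statement,
so it has to follow the return keyword.

```c
/* Hypothetical example, not taken from <charconv>.  A statement
   expression is a GNU extension, so __extension__ is placed directly
   in front of it to mark the use as intentional.  */
static long long
twice (int x)
{
  /* Writing `__extension__ return ...;' here would not parse, because
     the keyword must be followed by an expression or declaration.  */
  return __extension__ ({ long long t = x; t * 2; });
}
```

This compiles with compilers that implement the GNU extensions used
here, such as gcc and clang.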

Bootstrapped/regtested on x86_64-linux and i686-linux (admittedly
not powerpc64le-linux with new glibc), ok for trunk?

2022-11-12  Jakub Jelinek  

PR libstdc++/107636
* include/std/charconv (to_chars): Fix up powerpc64le _Float128
overload __extension__ placement.

--- libstdc++-v3/include/std/charconv.jj   2022-11-11 08:15:45.696183293 +0100
+++ libstdc++-v3/include/std/charconv   2022-11-11 16:32:52.992693605 +0100
@@ -892,23 +892,25 @@ namespace __detail
   inline to_chars_result
   to_chars(char* __first, char* __last, _Float128 __value) noexcept
   {
-__extension__
-return to_chars(__first, __last, static_cast<__float128>(__value));
+return __extension__ to_chars(__first, __last,
+ static_cast<__float128>(__value));
   }
   inline to_chars_result
   to_chars(char* __first, char* __last, _Float128 __value,
   chars_format __fmt) noexcept
   {
-__extension__
-return to_chars(__first, __last, static_cast<__float128>(__value), __fmt);
+
+return __extension__ to_chars(__first, __last,
+ static_cast<__float128>(__value), __fmt);
   }
   inline to_chars_result
   to_chars(char* __first, char* __last, _Float128 __value,
   chars_format __fmt, int __precision) noexcept
   {
-__extension__
-return to_chars(__first, __last, static_cast<__float128>(__value), __fmt,
-   __precision);
+
+return __extension__ to_chars(__first, __last,
+ static_cast<__float128>(__value), __fmt,
+ __precision);
   }
 #else
   to_chars_result to_chars(char* __first, char* __last, _Float128 __value)

Jakub