Re: [PATCH] arc: Add --with-fpu support for ARCv2 cpus

2021-06-04 Thread Jeff Law via Gcc-patches




On 6/4/2021 1:29 AM, Claudiu Zissulescu via Gcc-patches wrote:

Hi Jeff,

I would like to add support for selecting the ARCv2 FPU extension at
configuration-time.

The --with-fpu configuration option is ignored when -mfpu compiler
option is specified.

My concern is using `grep -P` when configuring. Is that ok?

Thanks,
Claudiu

gcc/
-mm-dd  Claudiu Zissulescu  

* config.gcc (arc): Add support for with_cpu option.
* config/arc/arc.h (OPTION_DEFAULT_SPECS): Add fpu.
I strongly suspect -P is a GNU-ism and probably won't work on other 
hosts where GCC is still used.  I'd avoid it and look for an alternate 
solution.


jeff



[committed] Fix H8 split conditions

2021-06-04 Thread Jeff Law via Gcc-patches


The irony here is I had this in-flight when the discussion about 
tightening the split conditions in define_insn_and_split started.  What 
spurred it was an unexpected split after reworking some patterns to 
allow them to be used for redundant test/compare elimination.  That was 
ultimately tracked down to a missed condition.  The pattern's condition 
included a condition that only enabled it on the H8/S variant, but the 
splitter just had "reload_completed", so the splitter ran on all the H8 
variants generating highly unexpected results.
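
For readers unfamiliar with the convention, here is a minimal C++ sketch of how the split condition of a define_insn_and_split is formed. A split condition that starts with "&&" is ANDed with the insn condition, while any other string is used on its own. The helper name effective_split_condition and the example condition strings are illustrative assumptions, not GCC source.

```cpp
#include <cassert>
#include <string>

// Simplified model (hypothetical helper, not GCC's implementation) of how the
// split condition of a define_insn_and_split relates to the insn condition.
std::string effective_split_condition(const std::string& insn_cond,
                                      const std::string& split_cond)
{
  // Without a leading "&&" the split condition stands alone, so the
  // insn condition (e.g. a target check) is NOT applied to the splitter.
  if (split_cond.compare(0, 2, "&&") != 0)
    return split_cond;

  // With a leading "&&" the remainder is ANDed with the insn condition.
  std::string rest = split_cond.substr(2);
  rest.erase(0, rest.find_first_not_of(' '));
  if (insn_cond.empty())
    return rest;
  if (rest.empty())
    return insn_cond;
  return "(" + insn_cond + ") && (" + rest + ")";
}
```

With an insn condition such as "TARGET_H8300S" (an assumed example), the old "reload_completed" yields a splitter guarded only by reload_completed, whereas "&& reload_completed" keeps the target check as well.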


Committed to the trunk.

Jeff


commit 549d7f4310f6f8c2c64efcb6f3efcee99c9d9f4f
Author: Jeff Law 
Date:   Sat Jun 5 01:27:02 2021 -0400

Fix split conditions in H8/300 port

gcc/

* config/h8300/addsub.md: Fix split condition in 
define_insn_and_split
patterns.
* config/h8300/bitfield.md: Likewise.
* config/h8300/combiner.md: Likewise.
* config/h8300/divmod.md: Likewise.
* config/h8300/extensions.md: Likewise.
* config/h8300/jumpcall.md: Likewise.
* config/h8300/movepush.md: Likewise.
* config/h8300/multiply.md: Likewise.
* config/h8300/other.md: Likewise.
* config/h8300/shiftrotate.md: Likewise.
* config/h8300/logical.md: Likewise.  Fix split pattern to use
code iterator that somehow slipped through.

diff --git a/gcc/config/h8300/addsub.md b/gcc/config/h8300/addsub.md
index 3585bffa9fc..b1eb0d20188 100644
--- a/gcc/config/h8300/addsub.md
+++ b/gcc/config/h8300/addsub.md
@@ -15,7 +15,7 @@
 (match_operand:QI 2 "h8300_src_operand" "rQi")))]
   "h8300_operands_match_p (operands)"
   "#"
-  "reload_completed"
+  "&& reload_completed"
   [(parallel [(set (match_dup 0) (plus:QI (match_dup 1) (match_dup 2)))
  (clobber (reg:CC CC_REG))])])
 
@@ -34,7 +34,7 @@
 (match_operand:HI 2 "h8300_src_operand" "L,N,J,n,r")))]
   "!TARGET_H8300SX"
   "#"
-  "reload_completed"
+  "&& reload_completed"
   [(parallel [(set (match_dup 0) (plus:HI (match_dup 1) (match_dup 2)))
  (clobber (reg:CC CC_REG))])])
 
@@ -81,7 +81,7 @@
 (match_operand:HI 2 "h8300_src_operand" "P3>X,P3"]
   ""
   "#"
-  "reload_completed"
+  "&& reload_completed"
   [(parallel [(set (match_dup 0)
   (ior:SI (and:SI (match_dup 1) (const_int -256))
   (zero_extend:SI (match_dup 2
@@ -758,7 +758,7 @@
(match_operand:SI 2 "register_operand" "0")))]
   ""
   "#"
-  "reload_completed"
+  "&& reload_completed"
   [(parallel [(set (match_dup 0)
   (ior:SI (ashift:SI (match_dup 1) (const_int 31))
   (match_dup 2)))
@@ -782,7 +782,7 @@
(match_operand:SI 4 "register_operand" "0")))]
   "(INTVAL (operands[3]) & ~0x) == 0"
   "#"
-  "reload_completed"
+  "&& reload_completed"
   [(parallel [(set (match_dup 0)
   (ior:SI (and:SI (ashift:SI (match_dup 1) (match_dup 2))
   (match_dup 3))
@@ -815,7 +815,7 @@
(match_operand:SI 4 "register_operand" "0")))]
   "((INTVAL (operands[3]) << INTVAL (operands[2])) & ~0x) == 0"
   "#"
-  "reload_completed"
+  "&& reload_completed"
   [(parallel [(set (match_dup 0)
   (ior:SI (and:SI (lshiftrt:SI (match_dup 1) (match_dup 2))
   (match_dup 3))
@@ -848,7 +848,7 @@
(match_operand:SI 3 "register_operand" "0")))]
   "INTVAL (operands[2]) < 16"
   "#"
-  "reload_completed"
+  "&& reload_completed"
   [(parallel [(set (match_dup 0)
   (ior:SI (zero_extract:SI (match_dup 1)
(const_int 1)
@@ -875,7 +875,7 @@
(match_operand:SI 2 "register_operand" "0")))]
   ""
   "#"
-  "reload_completed"
+  "&& reload_completed"
   [(parallel [(set (match_dup 0)
   (ior:SI (and:SI (lshiftrt:SI (match_dup 1) (const_int 30))
   (const_int 2))
@@ -902,7 +902,7 @@
(clobber (match_scratch:HI 3 "=&r"))]
   ""
   "#"
-  "reload_completed"
+  "&& reload_completed"
   [(parallel [(set (match_dup 0)
   (ior:SI (and:SI (lshiftrt:SI (match_dup 1) (const_int 9))
   (const_int 4194304))
@@ -993,7 +993,7 @@
 (const_int 1]
   ""
   "#"
-  "reload_completed"
+  "&& reload_completed"
   [(parallel [(set (match_dup 0)
   (ior:SI (and:SI (match_dup 1) (const_int 1))
   (lshiftrt:SI (match_dup 1) (const_int 1
@@ -1147,7 +1147,7 @@
(const_int 8)) 1))]
   ""
   "#"
-  "reload_completed"
+  "&& reload_completed"
   [(parallel [(set (match_dup 0) (subreg:QI (lshiftrt:HI (match_dup 1)
 (const_int 8)) 1))
  

[PATCH] PR 99293: Optimize splat of vec_extract for V2DI/V2DF.

2021-06-04 Thread Michael Meissner via Gcc-patches
PR 99293: Optimize splat of vec_extract for V2DI/V2DF.

We already had optimizations for splat of a vector extract for the other
vector types (for example V4SI and V4SF), but we missed V2DI and V2DF.
This patch adds a combiner insn to do this optimization.

Without the patch for the code:

vector long long splat_dup_l_0 (vector long long v)
{
  return __builtin_vec_splats (__builtin_vec_extract (v, 0));
}

the compiler generates (on a little endian power9):

splat_dup_l_0:
mfvsrld 9,34
mtvsrdd 34,9,9
blr

Now it generates:

splat_dup_l_0:
xxpermdi 34,34,34,3
blr
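
The immediate in the xxpermdi above can be checked with a small C++ model. This is a sketch under stated assumptions: splat_selector mirrors the which_word computation in the new insn, and the xxpermdi function models the instruction with doublewords indexed in big-endian order (the high bit of the immediate selects the doubleword of the first source, the low bit that of the second). The type and function names are illustrative only.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Doublewords indexed in big-endian order: index 0 is the most significant.
using v2di = std::array<std::uint64_t, 2>;

// Mirrors the which_word logic of the new insn: map the vec_extract element
// number to the xxpermdi immediate that splats that element.
inline int splat_selector(int element, bool bytes_big_endian)
{
  int which_word = element;
  if (!bytes_big_endian)
    which_word = 1 - which_word;
  return which_word ? 3 : 0;
}

// Simplified model of xxpermdi XT,XA,XB,DM.
inline v2di xxpermdi(const v2di& a, const v2di& b, int dm)
{
  v2di result = {{ a[(dm >> 1) & 1], b[dm & 1] }};
  return result;
}
```

On little endian, element 0 is big-endian doubleword 1, so the selector is 3, which matches the "xxpermdi 34,34,34,3" shown above.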

I have tested this on:

*   Little endian power9 running Linux using --with-cpu=power9
*   Big endian power8 running Linux using --with-cpu=power8
*   Little endian power10 running Linux using --with-cpu=power10

There were no regressions in the test suites (including 32-bit on big endian).
Can I check this into the master branch?

Can I check this into the open branches (GCC-11, GCC-10) after a soak-in period
if there were no errors during the soak-in period?

gcc/
2021-06-04  Michael Meissner  

PR target/99293
* config/rs6000/vsx.md (vsx_splat_extract_

PR target/99293
* gcc.target/powerpc/pr99293.c: New test.
---
 gcc/config/rs6000/vsx.md   | 18 ++
 gcc/testsuite/gcc.target/powerpc/pr99293.c | 22 ++
 2 files changed, 40 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99293.c

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index b49d5b44573..ecad45a43d1 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5020,6 +5020,24 @@ (define_insn "vsx_splat__mem"
   "lxvdsx %x0,%y1"
   [(set_attr "type" "vecload")])
 
+;; Optimize SPLAT of an extract from a V2DF/V2DI vector with a constant element
+(define_insn "*vsx_splat_extract_"
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa")
+   (vec_duplicate:VSX_D
+(vec_select:
+ (match_operand:VSX_D 1 "vsx_register_operand" "wa")
+ (parallel [(match_operand 2 "const_0_to_1_operand" "n")]]
+  "VECTOR_MEM_VSX_P (mode)"
+{
+  int which_word = INTVAL (operands[2]);
+  if (!BYTES_BIG_ENDIAN)
+which_word = 1 - which_word;
+
+  operands[3] = GEN_INT (which_word ? 3 : 0);
+  return "xxpermdi %x0,%x1,%x1,%3";
+}
+  [(set_attr "type" "vecperm")])
+
 ;; V4SI splat support
 (define_insn "vsx_splat_v4si"
   [(set (match_operand:V4SI 0 "vsx_register_operand" "=we,we")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr99293.c b/gcc/testsuite/gcc.target/powerpc/pr99293.c
new file mode 100644
index 000..20adc1f27f6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr99293.c
@@ -0,0 +1,22 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx" } */
+
+/* Test for PR 99293, which wants to do:
+   __builtin_vec_splats (__builtin_vec_extract (v, n))
+
+   where v is a V2DF or V2DI vector and n is either 0 or 1.  Previously the
+   compiler would do a direct move to the GPR registers to select the item and a
+   direct move from the GPR registers to do the splat.  */
+
+vector long long splat_dup_l_0 (vector long long v)
+{
+  return __builtin_vec_splats (__builtin_vec_extract (v, 0));
+}
+
+vector long long splat_dup_l_1 (vector long long v)
+{
+  return __builtin_vec_splats (__builtin_vec_extract (v, 1));
+}
+
+/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */
-- 
2.31.1


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH] [libstdc++] Fix Wrong param type in :atomic_ref<_Tp*>::wait [PR100889]

2021-06-04 Thread Thomas Rodgers
Fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100889

libstdc++-v3/ChangeLog:

* include/bits/atomic_base.h (atomic_ref<_Tp*>::wait):
Change parameter type from _Tp to _Tp*.
* testsuite/29_atomics/atomic_ref/100889.cc: New test.
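
To illustrate why the parameter type matters, here is a toy C++ sketch of a pointer partial specialization. It is not libstdc++ code: toy_atomic_ref and would_block are hypothetical names, and the class performs no atomic operations. The point is only that the "old" value must have the stored type T*. With a parameter of plain T, the declaration becomes a parameter of type void when T is void, which is presumably why merely instantiating the class for void* is enough to expose the problem.

```cpp
#include <cassert>
#include <type_traits>

// Toy stand-in for a pointer partial specialization (hypothetical, non-atomic).
template<typename T> struct toy_atomic_ref;

template<typename T>
struct toy_atomic_ref<T*>
{
  using value_type = T*;

  explicit toy_atomic_ref(T*& obj) : ptr_(&obj) { }

  T* load() const { return *ptr_; }

  // The expected old value has the same type as the referenced object: T*.
  // A waiter would block only while the stored value still equals old.
  bool would_block(T* old) const { return load() == old; }

private:
  T** ptr_;
};
```

Because the parameter is T*, the specialization can be instantiated for void* as well as for int*.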
---
 libstdc++-v3/include/bits/atomic_base.h   |  2 +-
 .../testsuite/29_atomics/atomic_ref/100889.cc | 29 +++
 2 files changed, 30 insertions(+), 1 deletion(-)
 create mode 100644 libstdc++-v3/testsuite/29_atomics/atomic_ref/100889.cc

diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
index 029b8ad65a9..20cf1343c58 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -1870,7 +1870,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #if __cpp_lib_atomic_wait
   _GLIBCXX_ALWAYS_INLINE void
-  wait(_Tp __old, memory_order __m = memory_order_seq_cst) const noexcept
+  wait(_Tp* __old, memory_order __m = memory_order_seq_cst) const noexcept
   { __atomic_impl::wait(_M_ptr, __old, __m); }
 
   // TODO add const volatile overload
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_ref/100889.cc b/libstdc++-v3/testsuite/29_atomics/atomic_ref/100889.cc
new file mode 100644
index 000..1ea58cb6947
--- /dev/null
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_ref/100889.cc
@@ -0,0 +1,29 @@
+// Copyright (C) 2019-2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++2a" }
+// { dg-do compile { target c++2a } }
+
+#include <atomic>
+
+void
+test01()
+{
+  void* p;
+  std::atomic_ref a(p);
+  a.store(nullptr);
+}
-- 
2.26.2



Re: [PATCH 4/5 ver4] RS6000, Add test 128-bit shifts for just the int128 type.

2021-06-04 Thread Segher Boessenkool
Hi!

On Mon, Apr 26, 2021 at 09:36:26AM -0700, Carl Love wrote:
> The previous patch added the vector 128-bit integer shift instruction
> support for the V1TI type.  This patch renames and moves the VSX_TI
> iterator from vsx.md to VEC_TI in vector.md.  The uses of VEC_TI are
> also updated.

Okay for trunk.  Thanks!


Segher


Re: [PATCH 3/5 ver4] RS6000: Add TI to TD (128-bit DFP) and TD to TI support

2021-06-04 Thread Segher Boessenkool
Hi!

Maybe use "Add floattitd2 and fixtdti2" or similar as title?

On Mon, Apr 26, 2021 at 09:36:19AM -0700, Carl Love wrote:
> gcc/ChangeLog
> dje@gmail.com, gcc-patches@gcc.gnu.org, Bill Schmidt 
> , Peter Bergner ,  

What Will said here.

> 2021-04-26  Carl Love  
> * config/rs6000/dfp.md (floattitd2, fixtdti2): New define_insns.
> * config/rs6000/rs6000-call.c (P10V_BUILTIN_VCMPNET_P,
>   P10V_BUILTIN_VCMPAET_P): New overloaded definitions.

That last line is just spurious?

Okay for trunk.  Thanks!


Segher


Re: [PATCH] PR libstdc++/100889: Fix wrong param type in atomic_ref<_Tp*>::wait

2021-06-04 Thread Jonathan Wakely via Gcc-patches
On Sat, 5 Jun 2021, 00:05 Thomas Rodgers,  wrote:

> Fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100889
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/atomic_base.h (atomic_ref<_Tp*>::wait):
> Change parameter type from _Tp to _Tp*.
> * testsuite/29_atomics/atomic_ref/deduction.cc: Add
> reproducer case from PR.
>

That file is testing CTAD, there should be a better place to add this.



---
>  libstdc++-v3/include/bits/atomic_base.h   | 2 +-
>  libstdc++-v3/testsuite/29_atomics/atomic_ref/deduction.cc | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/include/bits/atomic_base.h
> b/libstdc++-v3/include/bits/atomic_base.h
> index 029b8ad65a9..20cf1343c58 100644
> --- a/libstdc++-v3/include/bits/atomic_base.h
> +++ b/libstdc++-v3/include/bits/atomic_base.h
> @@ -1870,7 +1870,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>  #if __cpp_lib_atomic_wait
>_GLIBCXX_ALWAYS_INLINE void
> -  wait(_Tp __old, memory_order __m = memory_order_seq_cst) const
> noexcept
> +  wait(_Tp* __old, memory_order __m = memory_order_seq_cst) const
> noexcept
>{ __atomic_impl::wait(_M_ptr, __old, __m); }
>
>// TODO add const volatile overload
> diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_ref/deduction.cc
> b/libstdc++-v3/testsuite/29_atomics/atomic_ref/deduction.cc
> index 86395b0c2b0..ed46b430f7c 100644
> --- a/libstdc++-v3/testsuite/29_atomics/atomic_ref/deduction.cc
> +++ b/libstdc++-v3/testsuite/29_atomics/atomic_ref/deduction.cc
> @@ -34,6 +34,7 @@ test01()
>int* p = &i;
>std::atomic_ref a2(p);
>static_assert(std::is_same_v>);
> +  a2.store(nullptr);
>
>struct X { } x;
>std::atomic_ref a3(x);
> --
> 2.26.2
>
>


Re: [PATCH 2/5 ver4] RS6000: Add 128-bit Integer Operations

2021-06-04 Thread Segher Boessenkool
Hi!

On Mon, Apr 26, 2021 at 09:36:12AM -0700, Carl Love wrote:
> This patch adds the 128-bit integer support for divide, modulo, shift,
> compare of 128-bit integers instructions and builtin support.

>   (rs6000_gimple_fold_builtin) [P10V_BUILTIN_VCMPEQUT,
>   P10_BUILTIN_CMPNET, P10_BUILTIN_CMPGE_1TI,
>   P10_BUILTIN_CMPGE_U1TI, P10_BUILTIN_VCMPGTUT,
>   P10_BUILTIN_VCMPGTST, P10_BUILTIN_CMPLE_1TI,
>   P10_BUILTIN_CMPLE_U1TI]: New case statements.

>   (builtin_function_type)[P10_BUILTIN_128BIT_VMULEUD,
>   P10_BUILTIN_128BIT_VMULOUD, P10_BUILTIN_128BIT_DIVEU_V1TI,
>   P10_BUILTIN_128BIT_MODU_V1TI, P10_BUILTIN_CMPGE_U1TI,
>   P10_BUILTIN_VCMPGTUT, P10_BUILTIN_VCMPEQUT]: New case statements.

All P10_ here should be P10V_.

>   * config/rs6000/r6000.c (rs6000_handle_altivec_attribute)[E_TImode,
>   E_V1TImode]: New case statements.

Space between ) and [.

> +(define_insn "altivec_eqv1ti"
> +  [(set (match_operand:V1TI 0 "altivec_register_operand" "=v")
> + (eq:V1TI (match_operand:V1TI 1 "altivec_register_operand" "v")
> +  (match_operand:V1TI 2 "altivec_register_operand" "v")))]
> +  "TARGET_POWER10"
> +  "vcmpequq %0,%1,%2"
> +  [(set_attr "type" "veccmpfx")])

There already is

(define_insn "altivec_eq"
  [(set (match_operand:VI2 0 "altivec_register_operand" "=v")
(eq:VI2 (match_operand:VI2 1 "altivec_register_operand" "v")
(match_operand:VI2 2 "altivec_register_operand" "v")))]
  ""
  "vcmpequ %0,%1,%2"
  [(set_attr "type" "veccmpfx")])

so changing the iterator VI2 to also include V1TI (or making a new
iterator VI3 or something that includes it) will make this much easier.
(You also need to add something to VI_char; VI_unit already has it).

There are multiple iterators that can be used already.  Especially since
we have the "isa" and "enabled" attributes now :-)  So maybe we can come
up with a good logical name for it.

This can be done later.  But in the future, have an eye out if you can
just use existing patterns, it saves work :-)

The shift type things have the amount to shift by in the wrong place, so
that is a bit of a kink.


All the builtin stuff...  I didn't see mistakes, but I am so glad it
will all be rewritten soon, that is soo hard to look at :-)

> +(define_expand "vector_gtv1ti"
> +  [(set (match_operand:V1TI 0 "vlogical_operand")
> + (gt:V1TI (match_operand:V1TI 1 "vlogical_operand")
> +  (match_operand:V1TI 2 "vlogical_operand")))]
> +  "TARGET_POWER10"
> +  "")

So this will require extending VEC_C a bit if we merge patterns.

> +;; Swap upper/lower 64-bit values in a 128-bit vector
> +(define_insn "xxswapd_v1ti"
> +  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (subreg:V1TI
> +   (vec_select:V2DI
> + (subreg:V2DI
> +(match_operand:V1TI 1 "vsx_register_operand" "v") 0 )
> +  (parallel [(const_int 1)(const_int 0)]))
> +   0))]

There are spaces instead of tabs on most of these lines.


Okay for trunk with the trivialities fixed.  Also okay for GCC 11 once
it has been tested on all CPUs and OSes (all we care about, that is :-) )
Thanks!


Segher


[PATCH] PR libstdc++/100889: Fix wrong param type in atomic_ref<_Tp*>::wait

2021-06-04 Thread Thomas Rodgers
Fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100889

libstdc++-v3/ChangeLog:

* include/bits/atomic_base.h (atomic_ref<_Tp*>::wait):
Change parameter type from _Tp to _Tp*.
* testsuite/29_atomics/atomic_ref/deduction.cc: Add
reproducer case from PR.
---
 libstdc++-v3/include/bits/atomic_base.h   | 2 +-
 libstdc++-v3/testsuite/29_atomics/atomic_ref/deduction.cc | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
index 029b8ad65a9..20cf1343c58 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -1870,7 +1870,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #if __cpp_lib_atomic_wait
   _GLIBCXX_ALWAYS_INLINE void
-  wait(_Tp __old, memory_order __m = memory_order_seq_cst) const noexcept
+  wait(_Tp* __old, memory_order __m = memory_order_seq_cst) const noexcept
   { __atomic_impl::wait(_M_ptr, __old, __m); }
 
   // TODO add const volatile overload
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_ref/deduction.cc b/libstdc++-v3/testsuite/29_atomics/atomic_ref/deduction.cc
index 86395b0c2b0..ed46b430f7c 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_ref/deduction.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_ref/deduction.cc
@@ -34,6 +34,7 @@ test01()
   int* p = &i;
   std::atomic_ref a2(p);
   static_assert(std::is_same_v>);
+  a2.store(nullptr);
 
   struct X { } x;
   std::atomic_ref a3(x);
-- 
2.26.2



[PATCH] [libstdc++] Cleanup atomic timed wait implementation

2021-06-04 Thread Thomas Rodgers
This cleans up the implementation of atomic_timed_wait.h and fixes the
accidental pessimization of spinning after waiting in
__timed_waiter_pool::_M_do_wait_until.

libstdc++-v3/ChangeLog:

* include/bits/atomic_timed_wait.h (__wait_clock_t): Define
conditionally.
(__cond_wait_until_impl): Define conditionally.
(__cond_wait_until): Define conditionally. Simplify clock
type detection/conversion.
(__timed_waiter_pool::_M_do_wait_until): Move the spin above
the wait.
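
The reordering can be pictured with the following C++ sketch. It is a simplified stand-in for the loop in __timed_waiter_pool::_M_do_wait_until, not the libstdc++ code: wait_until_sketch, spin and block are hypothetical names, and the two callbacks stand for the spin phase and the blocking wait plus predicate check.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

using sketch_clock = std::chrono::steady_clock;

// Hypothetical model of the timed-wait loop after the change: on every
// iteration the cheap spin is tried first, and the blocking wait is entered
// only if spinning did not observe the predicate becoming true.
bool wait_until_sketch(const std::function<bool()>& spin,
                       const std::function<bool()>& block,
                       sketch_clock::time_point deadline)
{
  for (auto now = sketch_clock::now(); now < deadline;
       now = sketch_clock::now())
    {
      if (spin())    // returns true if the predicate held while spinning
        return true;
      if (block())   // blocking wait followed by a predicate check
        return true;
    }
  return false;      // deadline reached without the predicate holding
}
```

With the old order, a thread woken from the blocking wait whose predicate check failed would spin again before re-checking the deadline, which is the pessimization the description refers to.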

---
 libstdc++-v3/include/bits/atomic_timed_wait.h | 26 ++-
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_timed_wait.h b/libstdc++-v3/include/bits/atomic_timed_wait.h
index ec7ff51cdbc..19386e5806a 100644
--- a/libstdc++-v3/include/bits/atomic_timed_wait.h
+++ b/libstdc++-v3/include/bits/atomic_timed_wait.h
@@ -51,7 +51,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   namespace __detail
   {
+#if defined _GLIBCXX_HAVE_LINUX_FUTEX || defined _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
 using __wait_clock_t = chrono::steady_clock;
+#else
+using __wait_clock_t = chrono::system_clock;
+#endif
 
 template
   __wait_clock_t::time_point
@@ -133,11 +137,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return false;
  }
   }
-#else
+// #elsif 
 // define _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT and implement __platform_wait_until()
 // if there is a more efficient primitive supported by the platform
 // (e.g. __ulock_wait())which is better than pthread_cond_clockwait
-#endif // ! PLATFORM_TIMED_WAIT
+#else
+// Use wait on condition variable
 
 // Returns true if wait ended before timeout.
 // _Clock must be either steady_clock or system_clock.
@@ -173,12 +178,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __cond_wait_until(__condvar& __cv, mutex& __mx,
  const chrono::time_point<_Clock, _Dur>& __atime)
   {
-#ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
-   if constexpr (is_same_v<_Clock, chrono::steady_clock>)
- return __detail::__cond_wait_until_impl(__cv, __mx, __atime);
-   else
-#endif
-   if constexpr (is_same_v<_Clock, chrono::system_clock>)
+   if constexpr (is_same_v<__wait_clock_t, _Clock>)
  return __detail::__cond_wait_until_impl(__cv, __mx, __atime);
else
  {
@@ -194,6 +194,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return false;
  }
   }
+#endif // ! PLATFORM_TIMED_WAIT
 
 struct __timed_waiter_pool : __waiter_pool_base
 {
@@ -300,17 +301,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  const chrono::time_point<_Clock, _Dur>&
  __atime) noexcept
  {
+
for (auto __now = _Clock::now(); __now < __atime;
  __now = _Clock::now())
  {
+   if (__base_type::_M_do_spin(__pred, __val,
+  __timed_backoff_spin_policy(__atime, __now)))
+ return true;
+
if (__base_type::_M_w._M_do_wait_until(
  __base_type::_M_addr, __val, __atime)
&& __pred())
  return true;
-
-   if (__base_type::_M_do_spin(__pred, __val,
-  __timed_backoff_spin_policy(__atime, __now)))
- return true;
  }
return false;
  }
-- 
2.26.2



Re: [PATCH] x86-64: Remove HAVE_LD_PIE_COPYRELOC

2021-06-04 Thread Fāng-ruì Sòng via Gcc-patches
PING^2 https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html

On Mon, May 24, 2021 at 9:43 AM Fāng-ruì Sòng  wrote:
>
> Ping https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
>
> On Tue, May 11, 2021 at 8:29 PM Fangrui Song  wrote:
> >
> > This was introduced in 2014-12 to use local binding for external symbols
> > for -fPIE. Now that we have H.J. Lu's GOTPCRELX for years which mostly
> > nullify the benefit of HAVE_LD_PIE_COPYRELOC, HAVE_LD_PIE_COPYRELOC
> > should retire now.
> >
> > One design goal of -fPIE was to avoid copy relocations.
> > HAVE_LD_PIE_COPYRELOC has deviated from the goal.  With this change, the
> > -fPIE behavior of x86-64 will be closer to x86-32 and other targets.
> >
> > ---
> >
> > See https://gcc.gnu.org/legacy-ml/gcc/2019-05/msg00215.html for a list
> > of fixed and unfixed (e.g. gold incompatibility with protected
> > https://sourceware.org/bugzilla/show_bug.cgi?id=19823) issues.
> >
> > If you prefer a longer write-up, see
> > https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected
> > ---
> >  gcc/config.in |  6 ---
> >  gcc/config/i386/i386.c| 11 +---
> >  gcc/configure | 52 ---
> >  gcc/configure.ac  | 48 -
> >  gcc/doc/sourcebuild.texi  |  3 --
> >  .../gcc.target/i386/pie-copyrelocs-1.c| 14 -
> >  .../gcc.target/i386/pie-copyrelocs-2.c| 14 -
> >  .../gcc.target/i386/pie-copyrelocs-3.c| 14 -
> >  .../gcc.target/i386/pie-copyrelocs-4.c| 17 --
> >  gcc/testsuite/lib/target-supports.exp | 47 -
> >  10 files changed, 2 insertions(+), 224 deletions(-)
> >  delete mode 100644 gcc/testsuite/gcc.target/i386/pie-copyrelocs-1.c
> >  delete mode 100644 gcc/testsuite/gcc.target/i386/pie-copyrelocs-2.c
> >  delete mode 100644 gcc/testsuite/gcc.target/i386/pie-copyrelocs-3.c
> >  delete mode 100644 gcc/testsuite/gcc.target/i386/pie-copyrelocs-4.c
> >
> > diff --git a/gcc/config.in b/gcc/config.in
> > index e54f59ce0c3..a65bf5d4176 100644
> > --- a/gcc/config.in
> > +++ b/gcc/config.in
> > @@ -1659,12 +1659,6 @@
> >  #endif
> >
> >
> > -/* Define 0/1 if your linker supports -pie option with copy reloc. */
> > -#ifndef USED_FOR_TARGET
> > -#undef HAVE_LD_PIE_COPYRELOC
> > -#endif
> > -
> > -
> >  /* Define if your PowerPC linker has .gnu.attributes long double support. 
> > */
> >  #ifndef USED_FOR_TARGET
> >  #undef HAVE_LD_PPC_GNU_ATTR_LONG_DOUBLE
> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > index 915f89f571a..5ec3c6fd0c9 100644
> > --- a/gcc/config/i386/i386.c
> > +++ b/gcc/config/i386/i386.c
> > @@ -10579,11 +10579,7 @@ legitimate_pic_address_disp_p (rtx disp)
> > return true;
> > }
> >   else if (!SYMBOL_REF_FAR_ADDR_P (op0)
> > -  && (SYMBOL_REF_LOCAL_P (op0)
> > -  || (HAVE_LD_PIE_COPYRELOC
> > -  && flag_pie
> > -  && !SYMBOL_REF_WEAK (op0)
> > -  && !SYMBOL_REF_FUNCTION_P (op0)))
> > +  && SYMBOL_REF_LOCAL_P (op0)
> >&& ix86_cmodel != CM_LARGE_PIC)
> > return true;
> >   break;
> > @@ -22892,10 +22888,7 @@ ix86_atomic_assign_expand_fenv (tree *hold, tree 
> > *clear, tree *update)
> >  static bool
> >  ix86_binds_local_p (const_tree exp)
> >  {
> > -  return default_binds_local_p_3 (exp, flag_shlib != 0, true, true,
> > - (!flag_pic
> > -  || (TARGET_64BIT
> > -  && HAVE_LD_PIE_COPYRELOC != 0)));
> > +  return default_binds_local_p_3 (exp, flag_shlib != 0, true, true, 
> > !flag_pic);
> >  }
> >  #endif
> >
> > diff --git a/gcc/configure b/gcc/configure
> > index f03fe888384..c500f5ca11e 100755
> > --- a/gcc/configure
> > +++ b/gcc/configure
> > @@ -29968,58 +29968,6 @@ fi
> >  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_ld_pie" >&5
> >  $as_echo "$gcc_cv_ld_pie" >&6; }
> >
> > -{ $as_echo "$as_me:${as_lineno-$LINENO}: checking linker PIE support with 
> > copy reloc" >&5
> > -$as_echo_n "checking linker PIE support with copy reloc... " >&6; }
> > -gcc_cv_ld_pie_copyreloc=no
> > -if test $gcc_cv_ld_pie = yes ; then
> > -  if test $in_tree_ld = yes ; then
> > -if test "$gcc_cv_gld_major_version" -eq 2 -a 
> > "$gcc_cv_gld_minor_version" -ge 25 -o "$gcc_cv_gld_major_version" -gt 2; 
> > then
> > -  gcc_cv_ld_pie_copyreloc=yes
> > -fi
> > -  elif test x$gcc_cv_as != x -a x$gcc_cv_ld != x ; then
> > -# Check if linker supports -pie option with copy reloc
> > -case "$target" in
> > -i?86-*-linux* | x86_64-*-linux*)
> > -  cat > conftest1.s < > -   .globl  a_glob
> > -   .data
> > -   .type   a_glob, @object
> > -   .size  

[PATCH 13/13] v2 Add regression tests for PR 74765 and 74762

2021-06-04 Thread Martin Sebor via Gcc-patches

The attached patch adds regression tests for two closely related bugs
resolved by the patch series.
Regression tests for TREE_NO_WARNING enhancement to warning groups.

PR middle-end/74765 - missing uninitialized warning (parenthesis, TREE_NO_WARNING abuse)
PR middle-end/74762 - [9/10/11/12 Regression] missing uninitialized warning (C++, parenthesized expr, TREE_NO_WARNING)

gcc/testsuite/ChangeLog:

	* g++.dg/uninit-pr74762.C: New test.
	* g++.dg/warn/uninit-pr74765.C: Same.

diff --git a/gcc/testsuite/g++.dg/uninit-pr74762.C b/gcc/testsuite/g++.dg/uninit-pr74762.C
new file mode 100644
index 000..ce1bc59773e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/uninit-pr74762.C
@@ -0,0 +1,24 @@
+/* PR c++/74762 - missing uninitialized warning (C++, parenthesized expr)
+   { dg-do compile }
+   { dg-options "-Wall" } */
+
+struct tree2;
+struct tree_vector2
+{
+  tree2 *elts[1];
+};
+
+struct tree2
+{
+  struct
+  {
+tree_vector2 vector;
+  } u;
+};
+
+tree2 *
+const_with_all_bytes_same (tree2 *val)
+{
+  int i;
+  return ((val->u.vector.elts[i]));   // { dg-warning "\\\[-Wuninitialized" }
+}
diff --git a/gcc/testsuite/g++.dg/warn/uninit-pr74765.C b/gcc/testsuite/g++.dg/warn/uninit-pr74765.C
new file mode 100644
index 000..1b8c124b18b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/uninit-pr74765.C
@@ -0,0 +1,24 @@
+/* PR c++/74765 - missing uninitialized warning (parenthesis,
+   TREE_NO_WARNING abuse)
+   { dg-do compile }
+   { dg-options "-Wall" } */
+
+int warn_equal_parens (int x, int y)
+{
+  int i;
+
+  if ((i == 0)) // { dg-warning "\\\[-Wuninitialized" }
+return x;
+
+  return y;
+}
+
+int warn_equal (int x, int y)
+{
+  int i;
+
+  if (i == 0)   // { dg-warning "\\\[-Wuninitialized" }
+return x;
+
+  return y;
+}


[PATCH 12/13] v2 Remove TREE_NO_WARNING and gimple*no_warning* APIs

2021-06-04 Thread Martin Sebor via Gcc-patches

The attached patch removes the definitions of the TREE_NO_WARNING macro
and the gimple_no_warning_p() and gimple_set_no_warning() functions.
Remove legacy TREE_NO_WARNING and gimple_*no_warning* APIs.

gcc/ChangeLog:

	* tree.h (TREE_NO_WARNING): Remove.
	* gimple.h (gimple_no_warning_p): Remove.
	(gimple_set_no_warning): Same.

diff --git a/gcc/tree.h b/gcc/tree.h
index 62b2de46479..3c92c58980e 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -699,13 +700,6 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
 /* Determines whether an ENUMERAL_TYPE has defined the list of constants. */
 #define ENUM_IS_OPAQUE(NODE) (ENUMERAL_TYPE_CHECK (NODE)->base.private_flag)
 
-/* In an expr node (usually a conversion) this means the node was made
-   implicitly and should not lead to any sort of warning.  In a decl node,
-   warnings concerning the decl should be suppressed.  This is used at
-   least for used-before-set warnings, and it set after one warning is
-   emitted.  */
-#define TREE_NO_WARNING(NODE) ((NODE)->base.nowarning_flag)
-
 /* Nonzero if we should warn about the change in empty class parameter
passing ABI in this TU.  */
 #define TRANSLATION_UNIT_WARN_EMPTY_P(NODE) \
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 91b92b4a4d1..ca5d4acfc71 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1948,22 +1969,6 @@ gimple_seq_singleton_p (gimple_seq seq)
 	  && (gimple_seq_first (seq) == gimple_seq_last (seq)));
 }
 
-/* Return true if no warnings should be emitted for statement STMT.  */
-
-static inline bool
-gimple_no_warning_p (const gimple *stmt)
-{
-  return stmt->no_warning;
-}
-
-/* Set the no_warning flag of STMT to NO_WARNING.  */
-
-static inline void
-gimple_set_no_warning (gimple *stmt, bool no_warning)
-{
-  stmt->no_warning = (unsigned) no_warning;
-}
-
 /* Set the visited status on statement STMT to VISITED_P.
 
Please note that this 'visited' property of the gimple statement is


[PATCH 11/13] v2 Use new per-location warning APIs in the Objective-C front end

2021-06-04 Thread Martin Sebor via Gcc-patches

The attached patch replaces the uses of TREE_NO_WARNING in
the Objective-C front end with the new suppress_warning(),
warning_suppressed_p(), and copy_warning() APIs.
Add support for per-location warning groups.
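
As a rough illustration of the difference from a single TREE_NO_WARNING bit, here is a toy C++ model of per-location suppression. It is a sketch only: the toy_ names, the integer location key and the use of a set per location are assumptions made for the example and do not reflect GCC's data structures. The member names echo suppress_warning(), warning_suppressed_p() and copy_warning() from the description above, and the rule that a query without a specific option reports any suppression is a design assumption of the toy.

```cpp
#include <cassert>
#include <map>
#include <set>

// Hypothetical warning identifiers standing in for real option codes.
enum toy_opt { TOY_ALL_WARNINGS, TOY_WUNUSED, TOY_WABI, TOY_WUNINITIALIZED };

using toy_location = unsigned;

class toy_suppressions
{
public:
  // Suppress one warning (or all of them) at a location.
  void suppress_warning(toy_location loc, toy_opt opt = TOY_ALL_WARNINGS)
  { table_[loc].insert(opt); }

  // True if OPT is suppressed at LOC.  Asking about TOY_ALL_WARNINGS
  // reports whether anything at all is suppressed there.
  bool warning_suppressed_p(toy_location loc,
                            toy_opt opt = TOY_ALL_WARNINGS) const
  {
    auto it = table_.find(loc);
    if (it == table_.end() || it->second.empty())
      return false;
    if (opt == TOY_ALL_WARNINGS || it->second.count(TOY_ALL_WARNINGS))
      return true;
    return it->second.count(opt) != 0;
  }

  // Make TO carry exactly the suppressions recorded for FROM.
  void copy_warning(toy_location to, toy_location from)
  {
    auto it = table_.find(from);
    if (it == table_.end())
      table_.erase(to);
    else
      {
        std::set<toy_opt> copy = it->second;
        table_[to] = copy;
      }
  }

private:
  std::map<toy_location, std::set<toy_opt>> table_;
};
```

In this model, suppressing an unused-value warning on an expression leaves an uninitialized-use warning at the same location enabled, which a single per-node flag cannot express.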

gcc/objc/ChangeLog:

	* objc-act.c (objc_maybe_build_modify_expr): Replace direct uses
	of TREE_NO_WARNING with warning_suppressed_p, and suppress_warning.
	(objc_build_incr_expr_for_property_ref): Same.
	(objc_build_struct): Same.
	(synth_module_prologue): Same.
	* objc-gnu-runtime-abi-01.c (gnu_runtime_01_initialize): Same.
	* objc-next-runtime-abi-01.c (next_runtime_01_initialize): Same.
	* objc-next-runtime-abi-02.c (next_runtime_02_initialize): Same.

diff --git a/gcc/objc/objc-act.c b/gcc/objc/objc-act.c
index 8d106a4de26..ec20891152b 100644
--- a/gcc/objc/objc-act.c
+++ b/gcc/objc/objc-act.c
@@ -2007,7 +2007,7 @@ objc_maybe_build_modify_expr (tree lhs, tree rhs)
 	 correct (maybe a more sophisticated implementation could
 	 avoid generating the compound expression if not needed), but
 	 we need to turn it off.  */
-  TREE_NO_WARNING (compound_expr) = 1;
+  suppress_warning (compound_expr, OPT_Wunused);
   return compound_expr;
 }
   else
@@ -2129,7 +2129,7 @@ objc_build_incr_expr_for_property_ref (location_t location,
 
   /* Prevent C++ from warning with -Wall that "right operand of comma
  operator has no effect".  */
-  TREE_NO_WARNING (compound_expr) = 1;
+  suppress_warning (compound_expr, OPT_Wunused);
   return compound_expr;
 }
 
@@ -2262,8 +2262,9 @@ objc_build_struct (tree klass, tree fields, tree super_name)
   DECL_FIELD_IS_BASE (base) = 1;
 
   if (fields)
-	TREE_NO_WARNING (fields) = 1;	/* Suppress C++ ABI warnings -- we   */
-#endif	/* are following the ObjC ABI here.  */
+	/* Suppress C++ ABI warnings: we are following the ObjC ABI here.  */
+	suppress_warning (fields, OPT_Wabi);
+#endif
   DECL_CHAIN (base) = fields;
   fields = base;
 }
@@ -3112,19 +3113,19 @@ synth_module_prologue (void)
 		TYPE_DECL,
 		objc_object_name,
 		objc_object_type));
-  TREE_NO_WARNING (type) = 1;
+  suppress_warning (type);
 
   type = lang_hooks.decls.pushdecl (build_decl (input_location,
 		TYPE_DECL,
 		objc_instancetype_name,
 		objc_instancetype_type));
-  TREE_NO_WARNING (type) = 1;
+  suppress_warning (type);
 
   type = lang_hooks.decls.pushdecl (build_decl (input_location,
 		TYPE_DECL,
 		objc_class_name,
 		objc_class_type));
-  TREE_NO_WARNING (type) = 1;
+  suppress_warning (type);
 
   /* Forward-declare '@interface Protocol'.  */
   type = get_identifier (PROTOCOL_OBJECT_CLASS_NAME);
diff --git a/gcc/objc/objc-gnu-runtime-abi-01.c b/gcc/objc/objc-gnu-runtime-abi-01.c
index 4add71edf41..976fa1e36cb 100644
--- a/gcc/objc/objc-gnu-runtime-abi-01.c
+++ b/gcc/objc/objc-gnu-runtime-abi-01.c
@@ -213,7 +213,7 @@ static void gnu_runtime_01_initialize (void)
 		TYPE_DECL,
 		objc_selector_name,
 		objc_selector_type));
-  TREE_NO_WARNING (type) = 1;
+  suppress_warning (type);
 
   /* typedef id (*IMP)(id, SEL, ...); */
   ftype = build_varargs_function_type_list (objc_object_type,
diff --git a/gcc/objc/objc-next-runtime-abi-01.c b/gcc/objc/objc-next-runtime-abi-01.c
index 3ec6e1703c1..183fc01abb2 100644
--- a/gcc/objc/objc-next-runtime-abi-01.c
+++ b/gcc/objc/objc-next-runtime-abi-01.c
@@ -282,7 +282,7 @@ static void next_runtime_01_initialize (void)
 		TYPE_DECL,
 		objc_selector_name,
 		objc_selector_type));
-  TREE_NO_WARNING (type) = 1;
+  suppress_warning (type);
 
   build_v1_class_template ();
   build_super_template ();
diff --git a/gcc/objc/objc-next-runtime-abi-02.c b/gcc/objc/objc-next-runtime-abi-02.c
index 3cfcd0b1a57..963d1bf1ad8 100644
--- a/gcc/objc/objc-next-runtime-abi-02.c
+++ b/gcc/objc/objc-next-runtime-abi-02.c
@@ -379,7 +379,7 @@ static void next_runtime_02_initialize (void)
 		TYPE_DECL,
 		objc_selector_name,
 		objc_selector_type));
-  TREE_NO_WARNING (type) = 1;
+  suppress_warning (type);
 
   /* IMP : id (*) (id, _message_ref_t*, ...)
  SUPER_IMP : id (*) ( super_t*, _super_message_ref_t*, ...)


[PATCH 10/13] v2 Use new per-location warning APIs in the middle end

2021-06-04 Thread Martin Sebor via Gcc-patches

The attached patch introduces declarations of the new
suppress_warning(), warning_suppressed_p(), and copy_warning() APIs,
and replaces the uses of TREE_NO_WARNING in the middle end with them.
Add support for per-location warning groups.

gcc/ChangeLog:

	* builtins.c (warn_string_no_nul): Replace uses of TREE_NO_WARNING,
	gimple_no_warning_p and gimple_set_no_warning with
	warning_suppressed_p and suppress_warning.
	(c_strlen): Same.
	(maybe_warn_for_bound): Same.
	(warn_for_access): Same.
	(check_access): Same.
	(expand_builtin_strncmp): Same.
	(fold_builtin_varargs): Same.
	* calls.c (maybe_warn_nonstring_arg): Same.
	(maybe_warn_rdwr_sizes): Same.
	* cfgexpand.c (expand_call_stmt): Same.
	* cgraphunit.c (check_global_declaration): Same.
	* fold-const.c (fold_undefer_overflow_warnings): Same.
	(fold_truth_not_expr): Same.
	(fold_unary_loc): Same.
	(fold_checksum_tree): Same.
	* gengtype.c (open_base_files): Same.
	* gimple-array-bounds.cc (array_bounds_checker::check_array_ref): Same.
	(array_bounds_checker::check_mem_ref): Same.
	(array_bounds_checker::check_addr_expr): Same.
	(array_bounds_checker::check_array_bounds): Same.
	* gimple-expr.c (copy_var_decl): Same.
	* gimple-fold.c (gimple_fold_builtin_strcpy): Same.
	(gimple_fold_builtin_strncat): Same.
	(gimple_fold_builtin_stxcpy_chk): Same.
	(gimple_fold_builtin_stpcpy): Same.
	(gimple_fold_builtin_sprintf): Same.
	(fold_stmt_1): Same.
	* gimple-ssa-isolate-paths.c (diag_returned_locals): Same.
	* gimple-ssa-nonnull-compare.c (do_warn_nonnull_compare): Same.
	* gimple-ssa-sprintf.c (handle_printf_call): Same.
	* gimple-ssa-store-merging.c (imm_store_chain_info::output_merged_store): Same.
	* gimple-ssa-warn-restrict.c (maybe_diag_overlap): Same.
	(maybe_diag_access_bounds): Same.
	(check_call): Same.
	(check_bounds_or_overlap): Same.
	* gimple.c (gimple_build_call_from_tree): Same.
	* gimplify.c (gimplify_return_expr): Same.
	(gimplify_cond_expr): Same.
	(gimplify_modify_expr_complex_part): Same.
	(gimplify_modify_expr): Same.
	(gimple_push_cleanup): Same.
	(gimplify_expr): Same.
	* omp-expand.c (expand_omp_for_generic): Same.
	(expand_omp_taskloop_for_outer): Same.
	* omp-low.c (lower_rec_input_clauses): Same.
	(lower_lastprivate_clauses): Same.
	(lower_send_clauses): Same.
	(lower_omp_target): Same.
	* tree-cfg.c (pass_warn_function_return::execute): Same.
	* tree-complex.c (create_one_component_var): Same.
	* tree-inline.c (remap_gimple_op_r): Same.
	(copy_tree_body_r): Same.
	(declare_return_variable): Same.
	(expand_call_inline): Same.
	* tree-nested.c (lookup_field_for_decl): Same.
	* tree-sra.c (create_access_replacement): Same.
	(generate_subtree_copies): Same.
	* tree-ssa-ccp.c (pass_post_ipa_warn::execute): Same.
	* tree-ssa-forwprop.c (combine_cond_expr_cond): Same.
	* tree-ssa-loop-ch.c (ch_base::copy_headers): Same.
	* tree-ssa-loop-im.c (execute_sm): Same.
	* tree-ssa-phiopt.c (cond_store_replacement): Same.
	* tree-ssa-strlen.c (maybe_warn_overflow): Same.
	(handle_builtin_strcpy): Same.
	(maybe_diag_stxncpy_trunc): Same.
	(handle_builtin_stxncpy_strncat): Same.
	(handle_builtin_strcat): Same.
	* tree-ssa-uninit.c (get_no_uninit_warning): Same.
	(set_no_uninit_warning): Same.
	(uninit_undefined_value_p): Same.
	(warn_uninit): Same.
	(maybe_warn_operand): Same.
	* tree-vrp.c (compare_values_warnv): Same.
	* vr-values.c (vr_values::extract_range_for_var_from_comparison_expr): Same.
	(test_for_singularity): Same.

	* gimple.h (warning_suppressed_p): New function.
	(suppress_warning): Same.
	(copy_no_warning): Same.
	(gimple_set_block): Call gimple_set_location.
	(gimple_set_location): Call copy_warning.
	* tree.h (no_warning, all_warnings): New constants.
	(warning_suppressed_p): New function.
	(suppress_warning): Same.
	(copy_no_warning): Same.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index af1fe49bb48..740fed69873 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -1095,7 +1095,9 @@ warn_string_no_nul (location_t loc, tree expr, const char *fname,
 		bool exact /* = false */,
 		const wide_int bndrng[2] /* = NULL */)
 {
-  if ((expr && TREE_NO_WARNING (expr)) || TREE_NO_WARNING (arg))
+  const opt_code opt = OPT_Wstringop_overread;
+  if ((expr && warning_suppressed_p (expr, opt))
+  || warning_suppressed_p (arg, opt))
 return;
 
   loc = expansion_point_location_if_in_system_header (loc);
@@ -1123,14 +1125,14 @@ warn_string_no_nul (location_t loc, tree expr, const char *fname,
   if (bndrng)
 	{
 	  if (wi::ltu_p (maxsiz, bndrng[0]))
-	warned = warning_at (loc, OPT_Wstringop_overread,
+	warned = warning_at (loc, opt,
  "%K%qD specified bound %s exceeds "
  "maximum object size %E",
  expr, func, bndstr, maxobjsize);
 	  else
 	{
 	  bool maybe = wi::to_wide (size) == bndrng[0];
-	  warned = warning_at (loc, OPT_Wstringop_overread,
+	  warned = warning_at (loc, opt,
    exact
    ? G_("%K%qD specified bound %s exceeds "
 	"the size %E of unterminated array")
@

[PATCH 9/13] v2 Use new per-location warning APIs in LTO

2021-06-04 Thread Martin Sebor via Gcc-patches

The attached patch replaces the uses of TREE_NO_WARNING in the LTO
front end with the new suppress_warning() API.  It adds a couple of
FIXMEs that I plan to take care of in a follow-up.
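
To illustrate what the FIXMEs appear to be about (this is a reading of the tree streamer hunks, not part of the patch): only one bit per tree is streamed, so whatever per-group state existed on the writer side can only come back as "nothing suppressed" or "everything suppressed". A toy model, with every toy_* name hypothetical and std::bitset standing in for whatever representation GCC actually uses:

```cpp
// Toy model, not GCC code: all toy_* names are hypothetical.
#include <bitset>

enum toy_opt { TOY_WUNUSED, TOY_WPARENTHESES, TOY_WOVERFLOW, TOY_N_OPTS };
typedef std::bitset<TOY_N_OPTS> toy_mask;

// Writer side: collapse the per-group state to a single bit.
bool toy_pack (const toy_mask &groups)
{
  return groups.any ();
}

// Reader side: one bit can only be expanded to "none" or "all".
toy_mask toy_unpack (bool bit)
{
  toy_mask groups;
  if (bit)
    groups.set ();
  return groups;
}
```

In this model a tree that had only TOY_WUNUSED suppressed comes back from toy_unpack (toy_pack (m)) with every group suppressed, which is the loss of precision a follow-up would presumably address by streaming all the bits.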
Add support for per-location warning groups.

gcc/lto/ChangeLog:

	* gimple-streamer-out.c (output_gimple_stmt): Replace a use of
	gimple_no_warning_p with warning_suppressed_p.
	* lto-common.c (compare_tree_sccs_1): Expand use of TREE_NO_WARNING.
	* lto-streamer-out.c (hash_tree): Same.
	* tree-streamer-in.c (unpack_ts_base_value_fields): Same.
	* tree-streamer-out.c (pack_ts_base_value_fields): Same.

diff --git a/gcc/gimple-streamer-out.c b/gcc/gimple-streamer-out.c
index fcbf92300d4..7f7e06a79b8 100644
--- a/gcc/gimple-streamer-out.c
+++ b/gcc/gimple-streamer-out.c
@@ -73,7 +73,7 @@ output_gimple_stmt (struct output_block *ob, struct function *fn, gimple *stmt)
   /* Emit the tuple header.  */
   bp = bitpack_create (ob->main_stream);
   bp_pack_var_len_unsigned (&bp, gimple_num_ops (stmt));
-  bp_pack_value (&bp, gimple_no_warning_p (stmt), 1);
+  bp_pack_value (&bp, warning_suppressed_p (stmt), 1);
   if (is_gimple_assign (stmt))
 bp_pack_value (&bp,
 		   gimple_assign_nontemporal_move_p (
diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index a26d4885800..f1809e60c1e 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -1207,7 +1207,7 @@ hash_tree (struct streamer_tree_cache_d *cache, hash_map<tree, hashval_t> *map,
   if (TYPE_P (t))
 hstate.add_flag (TYPE_ARTIFICIAL (t));
   else
-hstate.add_flag (TREE_NO_WARNING (t));
+hstate.add_flag (warning_suppressed_p (t));
   hstate.add_flag (TREE_NOTHROW (t));
   hstate.add_flag (TREE_STATIC (t));
   hstate.add_flag (TREE_PROTECTED (t));
diff --git a/gcc/lto/lto-common.c b/gcc/lto/lto-common.c
index bfe52a2e942..9e7ea877e66 100644
--- a/gcc/lto/lto-common.c
+++ b/gcc/lto/lto-common.c
@@ -1110,8 +1110,8 @@ compare_tree_sccs_1 (tree t1, tree t2, tree **map)
 compare_values (TYPE_UNSIGNED);
   if (TYPE_P (t1))
 compare_values (TYPE_ARTIFICIAL);
-  else
-compare_values (TREE_NO_WARNING);
+  else if (t1->base.nowarning_flag != t2->base.nowarning_flag)
+return false;
   compare_values (TREE_NOTHROW);
   compare_values (TREE_STATIC);
   if (code != TREE_BINFO)
diff --git a/gcc/tree-streamer-in.c b/gcc/tree-streamer-in.c
index e0522bf2ac1..31dbf2fb992 100644
--- a/gcc/tree-streamer-in.c
+++ b/gcc/tree-streamer-in.c
@@ -131,7 +131,8 @@ unpack_ts_base_value_fields (struct bitpack_d *bp, tree expr)
   if (TYPE_P (expr))
 TYPE_ARTIFICIAL (expr) = (unsigned) bp_unpack_value (bp, 1);
   else
-TREE_NO_WARNING (expr) = (unsigned) bp_unpack_value (bp, 1);
+/* FIXME: set all warning bits. */
+suppress_warning (expr, N_OPTS, (unsigned) bp_unpack_value (bp, 1));
   TREE_NOTHROW (expr) = (unsigned) bp_unpack_value (bp, 1);
   TREE_STATIC (expr) = (unsigned) bp_unpack_value (bp, 1);
   if (TREE_CODE (expr) != TREE_BINFO)
diff --git a/gcc/tree-streamer-out.c b/gcc/tree-streamer-out.c
index 855d1cd59b9..b76e0c59c6f 100644
--- a/gcc/tree-streamer-out.c
+++ b/gcc/tree-streamer-out.c
@@ -104,7 +104,8 @@ pack_ts_base_value_fields (struct bitpack_d *bp, tree expr)
   if (TYPE_P (expr))
 bp_pack_value (bp, TYPE_ARTIFICIAL (expr), 1);
   else
-bp_pack_value (bp, TREE_NO_WARNING (expr), 1);
+/* FIXME: pack all warning bits.  */
+bp_pack_value (bp, warning_suppressed_p (expr), 1);
   bp_pack_value (bp, TREE_NOTHROW (expr), 1);
   bp_pack_value (bp, TREE_STATIC (expr), 1);
   if (TREE_CODE (expr) != TREE_BINFO)


[PATCH 8/13] v2 Use new per-location warning APIs in libcc1

2021-06-04 Thread Martin Sebor via Gcc-patches

The attached patch replaces the uses of TREE_NO_WARNING in libcc1 with
the new suppress_warning() API.
Add support for per-location warning groups.

libcc1/ChangeLog:

	* libcp1plugin.cc (record_decl_address): Replace a direct use
	of TREE_NO_WARNING with suppress_warning.

diff --git a/libcc1/libcp1plugin.cc b/libcc1/libcp1plugin.cc
index 79694b91964..ea6ee553401 100644
--- a/libcc1/libcp1plugin.cc
+++ b/libcc1/libcp1plugin.cc
@@ -541,7 +541,7 @@ record_decl_address (plugin_context *ctx, decl_addr_value value)
   **slot = value;
   /* We don't want GCC to warn about e.g. static functions
  without a code definition.  */
-  TREE_NO_WARNING (value.decl) = 1;
+  suppress_warning (value.decl);
   return *slot;
 }
 


[PATCH 7/13] v2 Use new per-location warning APIs in the FORTRAN front end

2021-06-04 Thread Martin Sebor via Gcc-patches

The attached patch replaces the uses of TREE_NO_WARNING in the FORTRAN
front end with the new suppress_warning() API.
Add support for per-location warning groups.

gcc/fortran/ChangeLog:

	* trans-array.c (trans_array_constructor): Replace direct uses
	of TREE_NO_WARNING with warning_suppressed_p and suppress_warning.
	* trans-decl.c (gfc_build_qualified_array): Same.
	(gfc_build_dummy_array_decl): Same.
	(generate_local_decl): Same.
	(gfc_generate_function_code): Same.
	* trans-openmp.c (gfc_omp_clause_default_ctor): Same.
	(gfc_omp_clause_copy_ctor): Same.
	* trans-types.c (get_dtype_type_node): Same.
	(gfc_get_desc_dim_type): Same.
	(gfc_get_array_descriptor_base): Same.
	(gfc_get_caf_vector_type): Same.
	(gfc_get_caf_reference_type): Same.
	* trans.c (gfc_create_var_np): Same.

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 7eeef554c0f..64a050ff196 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -2755,7 +2755,7 @@ trans_array_constructor (gfc_ss * ss, locus * where)
   desc = ss_info->data.array.descriptor;
   offset = gfc_index_zero_node;
   offsetvar = gfc_create_var_np (gfc_array_index_type, "offset");
-  TREE_NO_WARNING (offsetvar) = 1;
+  suppress_warning (offsetvar);
   TREE_USED (offsetvar) = 0;
   gfc_trans_array_constructor_value (&outer_loop->pre, type, desc, c,
  &offset, &offsetvar, dynamic);
diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index c32bd05bb1b..3f7953c8400 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -1040,7 +1040,7 @@ gfc_build_qualified_array (tree decl, gfc_symbol * sym)
   if (GFC_TYPE_ARRAY_LBOUND (type, dim) == NULL_TREE)
 	{
 	  GFC_TYPE_ARRAY_LBOUND (type, dim) = create_index_var ("lbound", nest);
-	  TREE_NO_WARNING (GFC_TYPE_ARRAY_LBOUND (type, dim)) = 1;
+	  suppress_warning (GFC_TYPE_ARRAY_LBOUND (type, dim));
 	}
   /* Don't try to use the unknown bound for assumed shape arrays.  */
   if (GFC_TYPE_ARRAY_UBOUND (type, dim) == NULL_TREE
@@ -1048,13 +1048,13 @@ gfc_build_qualified_array (tree decl, gfc_symbol * sym)
 	  || dim < GFC_TYPE_ARRAY_RANK (type) - 1))
 	{
 	  GFC_TYPE_ARRAY_UBOUND (type, dim) = create_index_var ("ubound", nest);
-	  TREE_NO_WARNING (GFC_TYPE_ARRAY_UBOUND (type, dim)) = 1;
+	  suppress_warning (GFC_TYPE_ARRAY_UBOUND (type, dim));
 	}
 
   if (GFC_TYPE_ARRAY_STRIDE (type, dim) == NULL_TREE)
 	{
 	  GFC_TYPE_ARRAY_STRIDE (type, dim) = create_index_var ("stride", nest);
-	  TREE_NO_WARNING (GFC_TYPE_ARRAY_STRIDE (type, dim)) = 1;
+	  suppress_warning (GFC_TYPE_ARRAY_STRIDE (type, dim));
 	}
 }
   for (dim = GFC_TYPE_ARRAY_RANK (type);
@@ -1063,21 +1063,21 @@ gfc_build_qualified_array (tree decl, gfc_symbol * sym)
   if (GFC_TYPE_ARRAY_LBOUND (type, dim) == NULL_TREE)
 	{
 	  GFC_TYPE_ARRAY_LBOUND (type, dim) = create_index_var ("lbound", nest);
-	  TREE_NO_WARNING (GFC_TYPE_ARRAY_LBOUND (type, dim)) = 1;
+	  suppress_warning (GFC_TYPE_ARRAY_LBOUND (type, dim));
 	}
   /* Don't try to use the unknown ubound for the last coarray dimension.  */
   if (GFC_TYPE_ARRAY_UBOUND (type, dim) == NULL_TREE
   && dim < GFC_TYPE_ARRAY_RANK (type) + GFC_TYPE_ARRAY_CORANK (type) - 1)
 	{
 	  GFC_TYPE_ARRAY_UBOUND (type, dim) = create_index_var ("ubound", nest);
-	  TREE_NO_WARNING (GFC_TYPE_ARRAY_UBOUND (type, dim)) = 1;
+	  suppress_warning (GFC_TYPE_ARRAY_UBOUND (type, dim));
 	}
 }
   if (GFC_TYPE_ARRAY_OFFSET (type) == NULL_TREE)
 {
   GFC_TYPE_ARRAY_OFFSET (type) = gfc_create_var_np (gfc_array_index_type,
 			"offset");
-  TREE_NO_WARNING (GFC_TYPE_ARRAY_OFFSET (type)) = 1;
+  suppress_warning (GFC_TYPE_ARRAY_OFFSET (type));
 
   if (nest)
 	gfc_add_decl_to_parent_function (GFC_TYPE_ARRAY_OFFSET (type));
@@ -1089,7 +1089,7 @@ gfc_build_qualified_array (tree decl, gfc_symbol * sym)
   && as->type != AS_ASSUMED_SIZE)
 {
   GFC_TYPE_ARRAY_SIZE (type) = create_index_var ("size", nest);
-  TREE_NO_WARNING (GFC_TYPE_ARRAY_SIZE (type)) = 1;
+  suppress_warning (GFC_TYPE_ARRAY_SIZE (type));
 }
 
   if (POINTER_TYPE_P (type))
@@ -1294,7 +1294,7 @@ gfc_build_dummy_array_decl (gfc_symbol * sym, tree dummy)
 
   /* Avoid uninitialized warnings for optional dummy arguments.  */
   if (sym->attr.optional)
-TREE_NO_WARNING (decl) = 1;
+suppress_warning (decl);
 
   /* We should never get deferred shape arrays here.  We used to because of
  frontend bugs.  */
@@ -5981,7 +5981,7 @@ generate_local_decl (gfc_symbol * sym)
 			 "does not have a default initializer",
 			 sym->name, &sym->declared_at);
 	  if (sym->backend_decl != NULL_TREE)
-		TREE_NO_WARNING(sym->backend_decl) = 1;
+		suppress_warning (sym->backend_decl);
 	}
 	  else if (warn_unused_dummy_argument)
 	{
@@ -5991,7 +5991,7 @@ generate_local_decl (gfc_symbol * sym)
 			 &sym->declared_at);
 
 	  if (sym->backend_decl != NULL_TREE)
-		TREE_NO_WARNING(sym->backend_decl) = 

[PATCH 6/13] v2 Use new per-location warning APIs in the C++ front end

2021-06-04 Thread Martin Sebor via Gcc-patches

The attached patch replaces the uses of TREE_NO_WARNING in the C++
front end with the new suppress_warning(), warning_suppressed_p(),
and copy_warning() APIs.
Add support for per-location warning groups.

gcc/cp/ChangeLog:

	* call.c (build_over_call): Replace direct uses of TREE_NO_WARNING
	with warning_suppressed_p, suppress_warning, and copy_no_warning, or
	nothing if not necessary.
	(set_up_extended_ref_temp): Same.
	* class.c (layout_class_type): Same.
	* constraint.cc (constraint_satisfaction_value): Same.
	* coroutines.cc (finish_co_await_expr): Same.
	(finish_co_yield_expr): Same.
	(finish_co_return_stmt): Same.
	(build_actor_fn): Same.
	(coro_rewrite_function_body): Same.
	(morph_fn_to_coro): Same.
	* cp-gimplify.c (genericize_eh_spec_block): Same.
	(gimplify_expr_stmt): Same.
	(cp_genericize_r): Same.
	(cp_fold): Same.
	* cp-ubsan.c (cp_ubsan_instrument_vptr): Same.
	* cvt.c (cp_fold_convert): Same.
	(convert_to_void): Same.
	* decl.c (wrapup_namespace_globals): Same.
	(grokdeclarator): Same.
	(finish_function): Same.
	(require_deduced_type): Same.
	* decl2.c (no_linkage_error): Same.
	(c_parse_final_cleanups): Same.
	* except.c (expand_end_catch_block): Same.
	* init.c (build_new_1): Same.
	(build_new): Same.
	(build_vec_delete_1): Same.
	(build_vec_init): Same.
	(build_delete): Same.
	* method.c (defaultable_fn_check): Same.
	* parser.c (cp_parser_fold_expression): Same.
	(cp_parser_primary_expression): Same.
	* pt.c (push_tinst_level_loc): Same.
	(tsubst_copy): Same.
	(tsubst_omp_udr): Same.
	(tsubst_copy_and_build): Same.
	* rtti.c (build_if_nonnull): Same.
	* semantics.c (maybe_convert_cond): Same.
	(finish_return_stmt): Same.
	(finish_parenthesized_expr): Same.
	(cp_check_omp_declare_reduction): Same.
	* tree.c (build_cplus_array_type): Same.
	* typeck.c (build_ptrmemfunc_access_expr): Same.
	(cp_build_indirect_ref_1): Same.
	(cp_build_function_call_vec): Same.
	(warn_for_null_address): Same.
	(cp_build_binary_op): Same.
	(unary_complex_lvalue): Same.
	(cp_build_modify_expr): Same.
	(build_x_modify_expr): Same.
	(convert_for_assignment): Same.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 17fc60cd4af..0afddd56496 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -9461,7 +9461,7 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
 	{
 	  /* Avoid copying empty classes.  */
 	  val = build2 (COMPOUND_EXPR, type, arg, to);
-	  TREE_NO_WARNING (val) = 1;
+	  suppress_warning (val, OPT_Wunused);
 	}
   else if (tree_int_cst_equal (TYPE_SIZE (type), TYPE_SIZE (as_base)))
 	{
@@ -9492,7 +9492,7 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
 		  build2 (MEM_REF, array_type, arg0, alias_set),
 		  build2 (MEM_REF, array_type, arg, alias_set));
 	  val = build2 (COMPOUND_EXPR, TREE_TYPE (to), t, to);
-  TREE_NO_WARNING (val) = 1;
+  suppress_warning (val, OPT_Wunused);
 	}
 
   cp_warn_deprecated_use (fn, complain);
@@ -9566,7 +9566,7 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
 {
   tree c = extract_call_expr (call);
   if (TREE_CODE (c) == CALL_EXPR)
-	TREE_NO_WARNING (c) = 1;
+	suppress_warning (c /* Suppress all warnings.  */);
 }
   if (TREE_CODE (fn) == ADDR_EXPR)
 {
@@ -12516,11 +12516,11 @@ set_up_extended_ref_temp (tree decl, tree expr, vec<tree, va_gc> **cleanups,
 TREE_ADDRESSABLE (var) = 1;
 
   if (TREE_CODE (decl) == FIELD_DECL
-  && extra_warnings && !TREE_NO_WARNING (decl))
+  && extra_warnings && !warning_suppressed_p (decl))
 {
   warning (OPT_Wextra, "a temporary bound to %qD only persists "
 	   "until the constructor exits", decl);
-  TREE_NO_WARNING (decl) = true;
+  suppress_warning (decl);
 }
 
   /* Recursively extend temps in this initializer.  */
diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index b53a4dbdd4e..c89ffadcef8 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -6704,7 +6704,7 @@ layout_class_type (tree t, tree *virtuals_p)
 	 laying out an Objective-C class.  The ObjC ABI differs
 	 from the C++ ABI, and so we do not want a warning
 	 here.  */
-	  && !TREE_NO_WARNING (field)
+	  && !warning_suppressed_p (field, OPT_Wabi)
 	  && !last_field_was_bitfield
 	  && !integer_zerop (size_binop (TRUNC_MOD_EXPR,
 	 DECL_FIELD_BIT_OFFSET (field),
diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 03ce8eb9ff2..ae88666e4a2 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3281,14 +3281,14 @@ constraint_satisfaction_value (tree t, tree args, sat_info info)
   else
 r = satisfy_nondeclaration_constraints (t, args, info);
   if (r == error_mark_node && info.quiet ()
-  && !(DECL_P (t) && TREE_NO_WARNING (t)))
+  && !(DECL_P (t) && warning_suppressed_p (t)))
 {
   /* Replay the error noisily.  */
   sat_info noisy (tf_warning_or_error, info.in_decl);
   constraint_satisfaction_value (t, args, noisy);
   if (DECL_P (t) && !args)
 	/* Avoid giv

[PATCH 5/13] v2 Use new per-location warning APIs in the RL78 back end

2021-06-04 Thread Martin Sebor via Gcc-patches

The attached patch replaces the uses of TREE_NO_WARNING in the RL78
back end with the new suppress_warning() and warning_suppressed_p()
APIs.
Add support for per-location warning groups.

gcc/ChangeLog:

	* config/rl78/rl78.c (rl78_handle_naked_attribute): Replace a direct
	use of TREE_NO_WARNING with suppress_warning.

diff --git a/gcc/config/rl78/rl78.c b/gcc/config/rl78/rl78.c
index 4c34949a97f..22d1690a035 100644
--- a/gcc/config/rl78/rl78.c
+++ b/gcc/config/rl78/rl78.c
@@ -847,7 +847,7 @@ rl78_handle_naked_attribute (tree * node,
   /* Disable warnings about this function - eg reaching the end without
  seeing a return statement - because the programmer is doing things
  that gcc does not know about.  */
-  TREE_NO_WARNING (* node) = 1;
+  suppress_warning (* node);
 
   return NULL_TREE;
 }


[PATCH 4/13] v2 Use new per-location warning APIs in C family code

2021-06-04 Thread Martin Sebor via Gcc-patches

The attached patch replaces the uses of TREE_NO_WARNING in the shared
C family front end with the new suppress_warning(),
warning_suppressed_p(), and copy_warning() APIs.
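
A pattern that recurs in the hunks of this patch (for example in c_common_truthvalue_conversion and warn_logical_operator) is: check whether a specific warning is already suppressed for the expression, issue it, then suppress just that warning so the same expression is not diagnosed twice. A minimal self-contained sketch of that idiom follows; it is a toy, every toy_* name is hypothetical, and it does not use GCC's tree or diagnostic machinery:

```cpp
// Toy sketch, not GCC code: every toy_* name is hypothetical.
#include <set>
#include <string>
#include <vector>

enum toy_opt { TOY_WPARENTHESES, TOY_WOVERFLOW };

struct toy_expr
{
  std::set<toy_opt> suppressed;   // warning groups already silenced
};

// Diagnostics "emitted" so far, kept so the behavior can be inspected.
static std::vector<std::string> toy_log;

// Emit MSG for EXPR under OPT unless OPT is already suppressed for EXPR,
// then suppress OPT so the same expression is not diagnosed twice.
// Return true if the message was emitted.
bool toy_warn_once (toy_expr &expr, toy_opt opt, const std::string &msg)
{
  if (expr.suppressed.count (opt))
    return false;
  toy_log.push_back (msg);
  expr.suppressed.insert (opt);
  return true;
}
```

The point of keying the state on the option is that a second toy_warn_once call with TOY_WPARENTHESES is silenced while a later TOY_WOVERFLOW diagnostic for the same expression still gets through; with a single TREE_NO_WARNING-style bit the second warning would be lost as well.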
Add support for per-location warning groups.

gcc/c-family/ChangeLog:

	* c-common.c (c_wrap_maybe_const): Remove TREE_NO_WARNING.
	(c_common_truthvalue_conversion): Replace direct uses of
	TREE_NO_WARNING with warning_suppressed_p, suppress_warning, and
	copy_no_warning.
	(check_function_arguments_recurse): Same.
	* c-gimplify.c (c_gimplify_expr): Same.
	* c-warn.c (overflow_warning): Same.
	(warn_logical_operator): Same.
	(warn_if_unused_value): Same.
	(do_warn_unused_parameter): Same.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index c4eb2b1c920..681fcc972f4 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -3375,7 +3375,6 @@ pointer_int_sum (location_t loc, enum tree_code resultcode,
 tree
 c_wrap_maybe_const (tree expr, bool non_const)
 {
-  bool nowarning = TREE_NO_WARNING (expr);
   location_t loc = EXPR_LOCATION (expr);
 
   /* This should never be called for C++.  */
@@ -3386,8 +3385,6 @@ c_wrap_maybe_const (tree expr, bool non_const)
   STRIP_TYPE_NOPS (expr);
   expr = build2 (C_MAYBE_CONST_EXPR, TREE_TYPE (expr), NULL, expr);
   C_MAYBE_CONST_EXPR_NON_CONST (expr) = non_const;
-  if (nowarning)
-TREE_NO_WARNING (expr) = 1;
   protected_set_expr_location (expr, loc);
 
   return expr;
@@ -3633,12 +3630,12 @@ c_common_truthvalue_conversion (location_t location, tree expr)
   break;
 
 case MODIFY_EXPR:
-  if (!TREE_NO_WARNING (expr)
+  if (!warning_suppressed_p (expr, OPT_Wparentheses)
 	  && warn_parentheses
 	  && warning_at (location, OPT_Wparentheses,
 			 "suggest parentheses around assignment used as "
 			 "truth value"))
-	TREE_NO_WARNING (expr) = 1;
+	suppress_warning (expr, OPT_Wparentheses);
   break;
 
 case CONST_DECL:
@@ -6019,7 +6016,7 @@ check_function_arguments_recurse (void (*callback)
   void *ctx, tree param,
   unsigned HOST_WIDE_INT param_num)
 {
-  if (TREE_NO_WARNING (param))
+  if (warning_suppressed_p (param))
 return;
 
   if (CONVERT_EXPR_P (param)
diff --git a/gcc/c-family/c-gimplify.c b/gcc/c-family/c-gimplify.c
index 39c969d8f40..0d38b706f4c 100644
--- a/gcc/c-family/c-gimplify.c
+++ b/gcc/c-family/c-gimplify.c
@@ -713,7 +713,7 @@ c_gimplify_expr (tree *expr_p, gimple_seq *pre_p ATTRIBUTE_UNUSED,
 	  && !TREE_STATIC (DECL_EXPR_DECL (*expr_p))
 	  && (DECL_INITIAL (DECL_EXPR_DECL (*expr_p)) == DECL_EXPR_DECL (*expr_p))
 	  && !warn_init_self)
-	TREE_NO_WARNING (DECL_EXPR_DECL (*expr_p)) = 1;
+	suppress_warning (DECL_EXPR_DECL (*expr_p), OPT_Winit_self);
   break;
 
 case PREINCREMENT_EXPR:
diff --git a/gcc/c-family/c-warn.c b/gcc/c-family/c-warn.c
index a587b993fde..cfa2373585f 100644
--- a/gcc/c-family/c-warn.c
+++ b/gcc/c-family/c-warn.c
@@ -155,7 +155,7 @@ overflow_warning (location_t loc, tree value, tree expr)
 			 value);
 
   if (warned)
-TREE_NO_WARNING (value) = 1;
+suppress_warning (value, OPT_Woverflow);
 }
 
 /* Helper function for walk_tree.  Unwrap C_MAYBE_CONST_EXPRs in an expression
@@ -219,7 +219,7 @@ warn_logical_operator (location_t location, enum tree_code code, tree type,
   && INTEGRAL_TYPE_P (TREE_TYPE (op_left))
   && !CONSTANT_CLASS_P (stripped_op_left)
   && TREE_CODE (stripped_op_left) != CONST_DECL
-  && !TREE_NO_WARNING (op_left)
+  && !warning_suppressed_p (op_left, OPT_Wlogical_op)
   && TREE_CODE (op_right) == INTEGER_CST
   && !integer_zerop (op_right)
   && !integer_onep (op_right))
@@ -234,7 +234,7 @@ warn_logical_operator (location_t location, enum tree_code code, tree type,
 	  = warning_at (location, OPT_Wlogical_op,
 			"logical %<and%> applied to non-boolean constant");
   if (warned)
-	TREE_NO_WARNING (op_left) = true;
+	suppress_warning (op_left, OPT_Wlogical_op);
   return;
 }
 
@@ -588,7 +588,7 @@ bool
 warn_if_unused_value (const_tree exp, location_t locus, bool quiet)
 {
  restart:
-  if (TREE_USED (exp) || TREE_NO_WARNING (exp))
+  if (TREE_USED (exp) || warning_suppressed_p (exp, OPT_Wunused_value))
 return false;
 
   /* Don't warn about void constructs.  This includes casting to void,
@@ -2422,7 +2422,7 @@ do_warn_unused_parameter (tree fn)
decl; decl = DECL_CHAIN (decl))
 if (!TREE_USED (decl) && TREE_CODE (decl) == PARM_DECL
 	&& DECL_NAME (decl) && !DECL_ARTIFICIAL (decl)
-	&& !TREE_NO_WARNING (decl))
+	&& !warning_suppressed_p (decl, OPT_Wunused_parameter))
   warning_at (DECL_SOURCE_LOCATION (decl), OPT_Wunused_parameter,
 		  "unused parameter %qD", decl);
 }


[PATCH 3/13] v2 Use new per-location warning APIs in C front end

2021-06-04 Thread Martin Sebor via Gcc-patches

The attached patch replaces the uses of TREE_NO_WARNING in the C front
end with the new suppress_warning(), warning_suppressed_p(), and
copy_warning() APIs.
Add support for per-location warning groups.

gcc/c/ChangeLog:

	* c-decl.c (pop_scope): Replace direct uses of TREE_NO_WARNING with
	warning_suppressed_p, suppress_warning, and copy_no_warning.
	(diagnose_mismatched_decls): Same.
	(duplicate_decls): Same.
	(grokdeclarator): Same.
	(finish_function): Same.
	(c_write_global_declarations_1): Same.
	* c-fold.c (c_fully_fold_internal): Same.
	* c-parser.c (c_parser_expr_no_commas): Same.
	(c_parser_postfix_expression): Same.
	* c-typeck.c (array_to_pointer_conversion): Same.
	(function_to_pointer_conversion): Same.
	(default_function_array_conversion): Same.
	(convert_lvalue_to_rvalue): Same.
	(default_conversion): Same.
	(build_indirect_ref): Same.
	(build_function_call_vec): Same.
	(build_atomic_assign): Same.
	(build_unary_op): Same.
	(c_finish_return): Same.
	(emit_side_effect_warnings): Same.
	(c_finish_stmt_expr): Same.
	(c_omp_clause_copy_ctor): Same.

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 28f851b9d0b..adfdd56d49d 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -1295,7 +1295,7 @@ pop_scope (void)
 	case VAR_DECL:
 	  /* Warnings for unused variables.  */
 	  if ((!TREE_USED (p) || !DECL_READ_P (p))
-	  && !TREE_NO_WARNING (p)
+	  && !warning_suppressed_p (p, OPT_Wunused_but_set_variable)
 	  && !DECL_IN_SYSTEM_HEADER (p)
 	  && DECL_NAME (p)
 	  && !DECL_ARTIFICIAL (p)
@@ -2159,8 +2159,8 @@ diagnose_mismatched_decls (tree newdecl, tree olddecl,
 
   if (DECL_IN_SYSTEM_HEADER (newdecl)
 	  || DECL_IN_SYSTEM_HEADER (olddecl)
-	  || TREE_NO_WARNING (newdecl)
-	  || TREE_NO_WARNING (olddecl))
+	  || warning_suppressed_p (newdecl, OPT_Wpedantic)
+	  || warning_suppressed_p (olddecl, OPT_Wpedantic))
 	return true;  /* Allow OLDDECL to continue in use.  */
 
   if (variably_modified_type_p (newtype, NULL))
@@ -2953,7 +2953,7 @@ duplicate_decls (tree newdecl, tree olddecl)
   if (!diagnose_mismatched_decls (newdecl, olddecl, &newtype, &oldtype))
 {
   /* Avoid `unused variable' and other warnings for OLDDECL.  */
-  TREE_NO_WARNING (olddecl) = 1;
+  suppress_warning (olddecl, OPT_Wunused);
   return false;
 }
 
@@ -7540,10 +7540,7 @@ grokdeclarator (const struct c_declarator *declarator,
 			   FIELD_DECL, declarator->u.id.id, type);
 	DECL_NONADDRESSABLE_P (decl) = bitfield;
 	if (bitfield && !declarator->u.id.id)
-	  {
-	TREE_NO_WARNING (decl) = 1;
-	DECL_PADDING_P (decl) = 1;
-	  }
+	  DECL_PADDING_P (decl) = 1;
 
 	if (size_varies)
 	  C_DECL_VARIABLE_SIZE (decl) = 1;
@@ -10232,7 +10229,7 @@ finish_function (location_t end_loc)
   && targetm.warn_func_return (fndecl)
   && warning (OPT_Wreturn_type,
 		  "no return statement in function returning non-void"))
-TREE_NO_WARNING (fndecl) = 1;
+suppress_warning (fndecl, OPT_Wreturn_type);
 
   /* Complain about parameters that are only set, but never otherwise used.  */
   if (warn_unused_but_set_parameter)
@@ -10247,7 +10244,7 @@ finish_function (location_t end_loc)
 	&& !DECL_READ_P (decl)
 	&& DECL_NAME (decl)
 	&& !DECL_ARTIFICIAL (decl)
-	&& !TREE_NO_WARNING (decl))
+	&& !warning_suppressed_p (decl, OPT_Wunused_but_set_parameter))
 	  warning_at (DECL_SOURCE_LOCATION (decl),
 		  OPT_Wunused_but_set_parameter,
 		  "parameter %qD set but not used", decl);
@@ -12114,19 +12111,20 @@ c_write_global_declarations_1 (tree globals)
 	{
 	  if (C_DECL_USED (decl))
 	{
+	  /* TODO: Add OPT_Wundefined-inline.  */
 	  if (pedwarn (input_location, 0, "%q+F used but never defined",
 			   decl))
-		TREE_NO_WARNING (decl) = 1;
+		suppress_warning (decl /* OPT_Wundefined-inline.  */);
 	}
 	  /* For -Wunused-function warn about unused static prototypes.  */
 	  else if (warn_unused_function
 		   && ! DECL_ARTIFICIAL (decl)
-		   && ! TREE_NO_WARNING (decl))
+		   && ! warning_suppressed_p (decl, OPT_Wunused_function))
 	{
 	  if (warning (OPT_Wunused_function,
 			   "%q+F declared %<static%> but never defined",
 			   decl))
-		TREE_NO_WARNING (decl) = 1;
+		suppress_warning (decl, OPT_Wunused_function);
 	}
 	}
 
diff --git a/gcc/c/c-fold.c b/gcc/c/c-fold.c
index 68c74cc1eb2..0ebcb469d28 100644
--- a/gcc/c/c-fold.c
+++ b/gcc/c/c-fold.c
@@ -154,7 +154,7 @@ c_fully_fold_internal (tree expr, bool in_init, bool *maybe_const_operands,
   tree orig_op0, orig_op1, orig_op2;
   bool op0_const = true, op1_const = true, op2_const = true;
   bool op0_const_self = true, op1_const_self = true, op2_const_self = true;
-  bool nowarning = TREE_NO_WARNING (expr);
+  bool nowarning = warning_suppressed_p (expr, OPT_Woverflow);
   bool unused_p;
   bool op0_lval = false;
   source_range old_range;
@@ -670,13 +670,13 @@ c_fully_fold_internal (tree expr, bool in_init, bool *maybe_const_operands,
  out:
   /* Some folding may introduce NON_LVALUE_E

[PATCH 1/13] v2 Add support for per-location warning groups (PR 74765)

2021-06-04 Thread Martin Sebor via Gcc-patches

The attached patch introduces the suppress_warning(),
warning_suppressed_p(), and copy_warning() APIs without making
use of them in the rest of GCC.  They are in three files:

  diagnostic-spec.{h,c}: Location-centric overloads.
  warning-control.cc: Tree- and gimple*-centric overloads.

The location-centric overloads are suitable to use from the diagnostic
subsystem.  The rest can be used from the front ends and the middle end.
Add support for per-location warning groups.
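
For readers following the series, here is a rough, self-contained sketch
of the grouping idea.  It is not the code in the patch: it assumes a
plain unsigned integer in place of location_t, a handful of made-up
option names, assumed bit values for the NW_* groups, and a
std::unordered_map where GCC would use its own hash map.  Only the
suppress_warning()/warning_suppressed_p() names and the notion of
mapping options to groups come from the series.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Illustrative only; every name ending in _sketch is made up.
using location_sketch = unsigned;   // stands in for location_t

enum opt_sketch {
  no_warning_sketch, all_warnings_sketch,
  Wnonnull_sketch, Woverflow_sketch, Wunused_function_sketch,
  Warray_bounds_sketch, Wuninitialized_sketch,
  Wmaybe_uninitialized_sketch, Wsomething_else_sketch
};

// Group bits.  The NW_* names follow the patch; the values are assumed.
constexpr std::uint32_t NW_ACCESS  = 1u << 0;
constexpr std::uint32_t NW_LEXICAL = 1u << 1;
constexpr std::uint32_t NW_NONNULL = 1u << 2;
constexpr std::uint32_t NW_UNINIT  = 1u << 3;
constexpr std::uint32_t NW_VFLOW   = 1u << 4;
constexpr std::uint32_t NW_OTHER   = 1u << 5;

// Map an option to the group of warnings it belongs to.
constexpr std::uint32_t group_of (opt_sketch opt)
{
  switch (opt)
    {
    case no_warning_sketch:           return 0;
    case all_warnings_sketch:         return ~0u;
    case Wnonnull_sketch:             return NW_NONNULL;
    case Woverflow_sketch:            return NW_VFLOW;
    case Wunused_function_sketch:     return NW_LEXICAL;
    case Warray_bounds_sketch:        return NW_ACCESS;
    case Wuninitialized_sketch:
    case Wmaybe_uninitialized_sketch: return NW_UNINIT;
    default:                          return NW_OTHER;
    }
}

// Per-location record of which warning groups are suppressed.
class suppression_map_sketch
{
  std::unordered_map<location_sketch, std::uint32_t> bits_;

public:
  void suppress_warning (location_sketch loc,
			 opt_sketch opt = all_warnings_sketch,
			 bool suppress = true)
  {
    const std::uint32_t group = group_of (opt);
    if (suppress)
      {
	if (group != 0)
	  bits_[loc] |= group;
	return;
      }
    auto it = bits_.find (loc);
    if (it == bits_.end ())
      return;
    it->second &= ~group;
    if (it->second == 0)
      bits_.erase (it);   // nothing left to suppress at this location
  }

  bool warning_suppressed_p (location_sketch loc,
			     opt_sketch opt = all_warnings_sketch) const
  {
    auto it = bits_.find (loc);
    return it != bits_.end () && (it->second & group_of (opt)) != 0;
  }
};

inline void suppression_demo ()
{
  suppression_map_sketch map;
  map.suppress_warning (42, Wuninitialized_sketch);
  // Same group, same location: suppressed.
  assert (map.warning_suppressed_p (42, Wmaybe_uninitialized_sketch));
  // Different group or different location: still enabled.
  assert (!map.warning_suppressed_p (42, Wunused_function_sketch));
  assert (!map.warning_suppressed_p (7, Wuninitialized_sketch));
}
```

The point of the grouping is visible in suppression_demo(): suppressing
one warning silences its relatives at that location without hiding
unrelated diagnostics, which a single TREE_NO_WARNING bit cannot express.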

gcc/ChangeLog:

	* Makefile.in (OBJS-libcommon): Add diagnostic-spec.o.
	* gengtype.c (open_base_files): Add diagnostic-spec.h.
	* diagnostic-spec.c: New file.
	* diagnostic-spec.h: New file.
	* warning-control.cc: New file.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 4cb2966157e..35eef812ac8 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1696,6 +1696,7 @@ OBJS = \
 	vmsdbgout.o \
 	vr-values.o \
 	vtable-verify.o \
+	warning-control.o \
 	web.o \
 	wide-int.o \
 	wide-int-print.o \
@@ -1707,8 +1708,8 @@ OBJS = \
 
 # Objects in libcommon.a, potentially used by all host binaries and with
 # no target dependencies.
-OBJS-libcommon = diagnostic.o diagnostic-color.o diagnostic-show-locus.o \
-	diagnostic-format-json.o json.o \
+OBJS-libcommon = diagnostic-spec.o diagnostic.o diagnostic-color.o \
+	diagnostic-show-locus.o diagnostic-format-json.o json.o \
 	edit-context.o \
 	pretty-print.o intl.o \
 	sbitmap.o \
@@ -2648,6 +2649,7 @@ GTFILES = $(CPPLIB_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/ipa-modref.h $(srcdir)/ipa-modref.c \
   $(srcdir)/ipa-modref-tree.h \
   $(srcdir)/signop.h \
+  $(srcdir)/diagnostic-spec.h $(srcdir)/diagnostic-spec.c \
   $(srcdir)/dwarf2out.h \
   $(srcdir)/dwarf2asm.c \
   $(srcdir)/dwarf2cfi.c \
diff --git a/gcc/diagnostic-spec.c b/gcc/diagnostic-spec.c
new file mode 100644
index 000..c5668831a9b
--- /dev/null
+++ b/gcc/diagnostic-spec.c
@@ -0,0 +1,177 @@
+/* Functions to enable and disable individual warnings on an expression
+   and statement basis.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   Contributed by Martin Sebor 
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it under
+   the terms of the GNU General Public License as published by the Free
+   Software Foundation; either version 3, or (at your option) any later
+   version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or
+   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+   for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "bitmap.h"
+#include "tree.h"
+#include "cgraph.h"
+#include "hash-map.h"
+#include "diagnostic-spec.h"
+#include "pretty-print.h"
+#include "options.h"
+
+/* Initialize *THIS from warning option OPT.  */
+
+nowarn_spec_t::nowarn_spec_t (opt_code opt)
+{
+  /* Create a very simple mapping based on testing and experience.
+ It should become more refined with time. */
+  switch (opt)
+{
+case no_warning:
+  bits = 0;
+  break;
+
+case all_warnings:
+  bits = -1;
+  break;
+
+  /* Flow-sensitive warnings about pointer problems issued by both
+	 front ends and the middle end.  */
+case OPT_Waddress:
+case OPT_Wnonnull:
+  bits = NW_NONNULL;
+  break;
+
+  /* Flow-sensitive warnings about arithmetic overflow issued by both
+	 front ends and the middle end.  */
+case OPT_Woverflow:
+case OPT_Wshift_count_negative:
+case OPT_Wshift_count_overflow:
+case OPT_Wstrict_overflow:
+  bits = NW_VFLOW;
+  break;
+
+  /* Lexical warnings issued by front ends.  */
+case OPT_Wabi:
+case OPT_Wlogical_op:
+case OPT_Wparentheses:
+case OPT_Wreturn_type:
+case OPT_Wsizeof_array_div:
+case OPT_Wstrict_aliasing:
+case OPT_Wunused:
+case OPT_Wunused_function:
+case OPT_Wunused_but_set_variable:
+case OPT_Wunused_variable:
+case OPT_Wunused_but_set_parameter:
+  bits = NW_LEXICAL;
+  break;
+
+  /* Access warning group.  */
+case OPT_Warray_bounds:
+case OPT_Warray_bounds_:
+case OPT_Wformat_overflow_:
+case OPT_Wformat_truncation_:
+case OPT_Wrestrict:
+case OPT_Wstrict_aliasing_:
+case OPT_Wstringop_overflow_:
+case OPT_Wstringop_overread:
+case OPT_Wstringop_truncation:
+  bits = NW_ACCESS;
+  break;
+
+  /* Initialization warning group.  */
+case OPT_Winit_self:
+case OPT_Wuninitialized:
+case OPT_Wmaybe_uninitialized:
+	bits = NW_UNINIT;
+  break;
+
+default:
+  /* A catchall group for everything else.  */
+  bits = NW_OTHER;
+}
+}
+
+/* Map from location to its no-warning disp

[PATCH 2/13] v2 Use new per-location warning APIs in Ada.

2021-06-04 Thread Martin Sebor via Gcc-patches

The attached patch replaces the uses of TREE_NO_WARNING in the Ada front
end with the new suppress_warning(), warning_suppressed_p(), and
copy_warning() APIs.
Add support for per-location warning groups.

gcc/ada/ChangeLog:

	* gcc-interface/trans.c (Handled_Sequence_Of_Statements_to_gnu):
	Replace TREE_NO_WARNING with suppress_warning.
	(gnat_gimplify_expr): Same.
	* gcc-interface/utils.c (gnat_pushdecl): Same.

diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
index ee014a35cc2..949b7733766 100644
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -5363,7 +5363,7 @@ Handled_Sequence_Of_Statements_to_gnu (Node_Id gnat_node)
 	 because of the unstructured form of EH used by fe_sjlj_eh, there
 	 might be forward edges going to __builtin_setjmp receivers on which
 	 it is uninitialized, although they will never be actually taken.  */
-  TREE_NO_WARNING (gnu_jmpsave_decl) = 1;
+  suppress_warning (gnu_jmpsave_decl, OPT_Wuninitialized);
   gnu_jmpbuf_decl
 	= create_var_decl (get_identifier ("JMP_BUF"), NULL_TREE,
 			   jmpbuf_type,
@@ -8805,7 +8805,7 @@ gnat_gimplify_expr (tree *expr_p, gimple_seq *pre_p,
   else
 	{
 	  *expr_p = create_tmp_var (type, NULL);
-	  TREE_NO_WARNING (*expr_p) = 1;
+	  suppress_warning (*expr_p);
 	}
 
   gimplify_and_add (TREE_OPERAND (expr, 0), pre_p);
diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c
index 1786fbf8186..982274c6d77 100644
--- a/gcc/ada/gcc-interface/utils.c
+++ b/gcc/ada/gcc-interface/utils.c
@@ -836,7 +836,8 @@ gnat_pushdecl (tree decl, Node_Id gnat_node)
   if (!deferred_decl_context)
 DECL_CONTEXT (decl) = context;
 
-  TREE_NO_WARNING (decl) = (No (gnat_node) || Warnings_Off (gnat_node));
+  suppress_warning (decl, all_warnings,
+		No (gnat_node) || Warnings_Off (gnat_node));
 
   /* Set the location of DECL and emit a declaration for it.  */
   if (Present (gnat_node) && !renaming_from_instantiation_p (gnat_node))


[PATCH 0/13] v2 warning control by group and location (PR 74765)

2021-06-04 Thread Martin Sebor via Gcc-patches

This is a revised patch series to add warning control by group and
location, updated based on feedback on the initial series.

v2 changes include:

1) Use opt_code rather than int for the option argument to the new
   APIs.  This let me find and fix a bug in the original Ada change.
2) Use suppress_warning() and warning_suppressed_p() instead of
   get/set_no_warning, and also instead of warning_enabled/disabled
   for the names of the new functions (as requested/clarified offline
   by David).
3) Make the removal of the TREE_NO_WARNING macro and
   the gimple_get_no_warning_p() and gimple_set_no_warning()
   functions a standalone patch.
4) Include tests for PR 74765 and 74762 fixed by these changes.

I have retested the whole patch series on x86_64-linux.


[PATCH] PR fortran/95502 - ICE in gfc_check_do_variable, at fortran/parse.c:4446

2021-06-04 Thread Harald Anlauf via Gcc-patches
ICE-on-invalid issues during error recovery.  Testcase by Gerhard,
initial patch by Steve.  I found another variant which needed an
additional fix for a NULL pointer dereference.

Regtested on x86_64-pc-linux-gnu.

OK for mainline / 11-branch?

Thanks,
Harald


Fortran - ICE in gfc_check_do_variable, at fortran/parse.c:4446

Avoid NULL pointer dereferences during error recovery.

gcc/fortran/ChangeLog:

PR fortran/95502
* expr.c (gfc_check_pointer_assign): Avoid NULL pointer
dereference.
* match.c (gfc_match_pointer_assignment): Likewise.
* parse.c (gfc_check_do_variable): Avoid comparison with NULL
symtree.

gcc/testsuite/ChangeLog:

PR fortran/95502
* gfortran.dg/pr95502.f90: New test.

diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
index 956003ec605..b11ae7ce5c5 100644
--- a/gcc/fortran/expr.c
+++ b/gcc/fortran/expr.c
@@ -3815,6 +3815,9 @@ gfc_check_pointer_assign (gfc_expr *lvalue, gfc_expr *rvalue,
   int proc_pointer;
   bool same_rank;

+  if (!lvalue->symtree)
+return false;
+
   lhs_attr = gfc_expr_attr (lvalue);
   if (lvalue->ts.type == BT_UNKNOWN && !lhs_attr.proc_pointer)
 {
diff --git a/gcc/fortran/match.c b/gcc/fortran/match.c
index 29462013038..d148de3e3b5 100644
--- a/gcc/fortran/match.c
+++ b/gcc/fortran/match.c
@@ -1409,7 +1409,7 @@ gfc_match_pointer_assignment (void)
   gfc_matching_procptr_assignment = 0;

   m = gfc_match (" %v =>", &lvalue);
-  if (m != MATCH_YES)
+  if (m != MATCH_YES || !lvalue->symtree)
 {
   m = MATCH_NO;
   goto cleanup;
diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index 0522b391393..6d7845e8517 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -4588,6 +4588,9 @@ gfc_check_do_variable (gfc_symtree *st)
 {
   gfc_state_data *s;

+  if (!st)
+return 0;
+
   for (s=gfc_state_stack; s; s = s->previous)
 if (s->do_variable == st)
   {
diff --git a/gcc/testsuite/gfortran.dg/pr95502.f90 b/gcc/testsuite/gfortran.dg/pr95502.f90
new file mode 100644
index 000..d40fd9a5508
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95502.f90
@@ -0,0 +1,8 @@
+! { dg-do compile }
+! PR fortran/95502 - ICE in gfc_check_do_variable, at fortran/parse.c:4446
+
+program p
+  integer, pointer :: z
+  nullify (z%kind)  ! { dg-error "in variable definition context" }
+  z%kind => NULL()  ! { dg-error "constant expression" }
+end


[PATCH] c++: access of dtor named by qualified template-id [PR100918]

2021-06-04 Thread Patrick Palka via Gcc-patches
Here, when resolving the destructor named by Inner::~Inner
(which is valid only before C++20) we end up in cp_parser_lookup_name to
look up the name Inner relative to the scope Inner.  The lookup
naturally finds the injected-class-name Inner, and because
is_template is true, we adjust this lookup result to the TEMPLATE_DECL
Inner, and then check access of this adjusted lookup result.  But this
access check fails because the scope is Inner and not Outer, and
the context_for_name_lookup of the TEMPLATE_DECL is Outer.

The simplest fix seems to be to perform the access check on the original
lookup result (the injected-class-name) instead of the TEMPLATE_DECL.
So this patch moves the access check in cp_parser_lookup_name to before
the injected-class-name adjustment.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/100918

gcc/cp/ChangeLog:

* parser.c (cp_parser_lookup_name): Check access of the lookup
result before we potentially adjust an injected-class-name to
its TEMPLATE_DECL.

gcc/testsuite/ChangeLog:

* g++.dg/template/access38.C: New test.
---
 gcc/cp/parser.c  | 24 +---
 gcc/testsuite/g++.dg/template/access38.C | 15 +++
 2 files changed, 28 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/template/access38.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 4a46828e162..829a94b2928 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -29505,6 +29505,19 @@ cp_parser_lookup_name (cp_parser *parser, tree name,
   if (!decl || decl == error_mark_node)
 return error_mark_node;
 
+  /* If we have resolved the name of a member declaration, check to
+ see if the declaration is accessible.  When the name resolves to
+ set of overloaded functions, accessibility is checked when
+ overload resolution is done.  If we have a TREE_LIST, then the lookup
+ is either ambiguous or it found multiple injected-class-names, the
+ accessibility of which is trivially satisfied.
+
+ During an explicit instantiation, access is not checked at all,
+ as per [temp.explicit].  */
+  if (DECL_P (decl))
+check_accessibility_of_qualified_id (decl, object_type, parser->scope,
+tf_warning_or_error);
+
   /* Pull out the template from an injected-class-name (or multiple).  */
   if (is_template)
 decl = maybe_get_template_decl_from_type_decl (decl);
@@ -29531,17 +29544,6 @@ cp_parser_lookup_name (cp_parser *parser, tree name,
  || TREE_CODE (decl) == UNBOUND_CLASS_TEMPLATE
  || BASELINK_P (decl));
 
-  /* If we have resolved the name of a member declaration, check to
- see if the declaration is accessible.  When the name resolves to
- set of overloaded functions, accessibility is checked when
- overload resolution is done.
-
- During an explicit instantiation, access is not checked at all,
- as per [temp.explicit].  */
-  if (DECL_P (decl))
-check_accessibility_of_qualified_id (decl, object_type, parser->scope,
-tf_warning_or_error);
-
   maybe_record_typedef_use (decl);
 
   return cp_expr (decl, name_location);
diff --git a/gcc/testsuite/g++.dg/template/access38.C b/gcc/testsuite/g++.dg/template/access38.C
new file mode 100644
index 000..488f8650c97
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/access38.C
@@ -0,0 +1,15 @@
+// PR c++/100918
+
+struct Outer {
+  template
+  struct Inner { ~Inner(); };
+};
+
+template<>
+Outer::Inner::~Inner() { } // { dg-error "template-id" "" { target c++20 } }
+
+template
+Outer::Inner::~Inner() { } // { dg-error "template-id" "" { target c++20 } }
+
+Outer::Inner x;
+Outer::Inner y;
-- 
2.32.0.rc2



Re: [committed] libstdc++: Fix value categories used by ranges access CPOs [PR 100824]

2021-06-04 Thread Jonathan Wakely via Gcc-patches

On 04/06/21 21:44 +0100, Jonathan Wakely wrote:

On 04/06/21 18:03 +0100, Jonathan Wakely wrote:

The implementation of P2091R0 was incomplete, so that some range access
CPOs used perfect forwarding where they should not. This fixes it by
consistently operating on lvalues.

Some additional changes that are not necessary to fix the bug:

Modify the __as_const helper to simplify its usage. Instead of deducing
the value category from its argument, and requiring callers to forward
the argument as the correct category, add a non-deduced template
parameter which is used for the value category and accept the argument
as an lvalue. This means callers say __as_const<T>(t) instead of
__as_const(std::forward<T>(t)).
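
As an illustration of that helper shape (a sketch under assumptions, not
the libstdc++ source): the name as_const_sketch and its static assertion
are made up here.  The first template parameter is supplied explicitly
and carries the caller's value category, while the argument itself is
always taken as an lvalue.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the helper described above.
// To is never deduced: the caller names it, and it encodes whether the
// original expression was an lvalue (To is T&) or an rvalue (To is T).
template<typename To, typename T>
  constexpr decltype(auto)
  as_const_sketch(T& t) noexcept
  {
    static_assert(std::is_same_v<std::remove_reference_t<To>, T>,
		  "To must name the type of the argument");

    if constexpr (std::is_lvalue_reference_v<To>)
      return static_cast<const T&>(t);   // lvalue in, const lvalue out
    else
      return static_cast<const T&&>(t);  // rvalue in, const rvalue out
  }

// The argument is an lvalue in both calls; only To differs.
static_assert(std::is_same_v<
		decltype(as_const_sketch<int&>(std::declval<int&>())),
		const int&>);
static_assert(std::is_same_v<
		decltype(as_const_sketch<int>(std::declval<int&>())),
		const int&&>);
```

Inside a function taking a forwarding reference T&& t, the call site is
then simply as_const_sketch<T>(t), with no std::forward needed.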

Always use an lvalue reference type as the template argument for the
_S_noexcept helpers, so that we only instantiate one specialization for
lvalues and rvalues of the same type.

Move some helper concepts and functions from namespace std::__detail
to ranges::__cust_access, to be consistent with the ranges::begin CPO.
This ensures that the __adl_begin concept and the _Begin::operator()
function are in the same namespace, so unqualified lookup is consistent
and the poison pills for begin are visible to both.

Simplified static assertions for arrays, because the expression a+0 is
already ill-formed for an array of incomplete type.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

PR libstdc++/100824
* include/bits/iterator_concepts.h (__detail::__decay_copy)
(__detail::__member_begin, __detail::__adl_begin): Move to
namespace ranges::__cust_access.
(__detail::__ranges_begin): Likewise, and rename to __begin.
Remove redundant static assertion.
* include/bits/ranges_base.h (_Begin, _End, _RBegin, _REnd):
Use lvalue in noexcept specifier.
(__as_const): Add non-deduced parameter for value category.
(_CBegin, _CEnd, _CRBegin, _CREnd, _CData): Adjust uses of
__as_const.
(__member_size, __adl_size, __member_empty, __size0_empty)
(__eq_iter_empty, __adl_data): Use lvalue objects in
requirements.
(__sentinel_size): Likewise. Add check for conversion to
unsigned-like.
(__member_data): Allow non-lvalue types to satisfy the concept,
but use lvalue object in requirements.
(_Size, _SSize): Remove forwarding to always use an lvalue.
(_Data): Likewise. Add static assertion for arrays.
* testsuite/std/ranges/access/cdata.cc: Adjust expected
behaviour for rvalues. Add negative tests for ill-formed
expressions.
* testsuite/std/ranges/access/data.cc: Likewise.
* testsuite/std/ranges/access/empty.cc: Adjust expected
behaviour for rvalues.
* testsuite/std/ranges/access/size.cc: Likewise.


An additional problem with ranges::data was pointed out in the PR,
fixed with this patch.


And this implements the rest of LWG 3403. The change to the
ranges::ssize constraints was already done by the first patch in this
thread, this fixes the return type.
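
A minimal sketch of the return-type rule, for illustration only.  It
assumes the size is a built-in integer type other than bool (the real
code also has to handle integer-class types), and the helper name
to_signed_size_sketch is made up.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: convert a size to the wider of ptrdiff_t and the
// signed counterpart of the size's own type.
template<typename Size>
  constexpr auto
  to_signed_size_sketch(Size n) noexcept
  {
    static_assert(std::is_integral_v<Size>,
		  "sketch handles built-in integer types only");
    using signed_type = std::make_signed_t<Size>;

    if constexpr (sizeof(std::ptrdiff_t) > sizeof(signed_type))
      return static_cast<std::ptrdiff_t>(n);  // ptrdiff_t is wider
    else
      return static_cast<signed_type>(n);     // signed_type is at least as wide
  }

// A narrow size type is widened to ptrdiff_t.
static_assert(std::is_same_v<
		decltype(to_signed_size_sketch(static_cast<unsigned char>(3))),
		std::ptrdiff_t>);
```

The result type therefore depends on what ranges::size returns, not on
the difference type of the range's iterator.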

Tested powerpc64le-linux. Committed to trunk.

This should also be backported to gcc-11 and gcc-10.


commit 621ea10ca060ba19ec693aa73b5e29d553cca849
Author: Jonathan Wakely 
Date:   Fri Jun 4 20:28:04 2021

libstdc++: Implement LWG 3403 for std::ranges::ssize

I already changed the constraints for ranges::ssize to use ranges::size,
this implements the rest of LWG 3403, so that the returned type is the
signed type corresponding to the result of ranges::size.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/ranges_base.h (_SSize): Return the result of
ranges::size converted to the wider of make-signed-like-t and
ptrdiff_t, rather than the range's difference type.
* testsuite/std/ranges/access/ssize.cc: Adjust expected result
for an iota_view that uses an integer class type for its
difference_type.

diff --git a/libstdc++-v3/include/bits/ranges_base.h b/libstdc++-v3/include/bits/ranges_base.h
index 61d91eb8389..e3c3962bcd9 100644
--- a/libstdc++-v3/include/bits/ranges_base.h
+++ b/libstdc++-v3/include/bits/ranges_base.h
@@ -425,22 +425,32 @@ namespace ranges
 
 struct _SSize
 {
+  // _GLIBCXX_RESOLVE_LIB_DEFECTS
+  // 3403. Domain of ranges::ssize(E) doesn't match ranges::size(E)
   template
 	requires requires (_Tp& __t) { _Size{}(__t); }
 	constexpr auto
 	operator()(_Tp&& __t) const noexcept(noexcept(_Size{}(__t)))
 	{
-	  using __iter_type = decltype(_Begin{}(__t));
-	  using __diff_type = iter_difference_t<__iter_type>;
-	  using __gnu_cxx::__int_traits;
 	  auto __size = _Size{}(__t);
-	  if constexpr (integral<__diff_type>)
+	  using __size_type = decltype(__size);
+	  // Return the wider of ptrdiff_t and make-signed-like-t<__size_type>.
+	  if constexpr (integral<__size_type>)
 	{
-	  if constexpr (__int_t

Re: [committed] libstdc++: Fix value categories used by ranges access CPOs [PR 100824]

2021-06-04 Thread Jonathan Wakely via Gcc-patches

On 04/06/21 18:03 +0100, Jonathan Wakely wrote:

The implementation of P2091R0 was incomplete, so that some range access
CPOs used perfect forwarding where they should not. This fixes it by
consistently operating on lvalues.

Some additional changes that are not necessary to fix the bug:

Modify the __as_const helper to simplify its usage. Instead of deducing
the value category from its argument, and requiring callers to forward
the argument as the correct category, add a non-deduced template
parameter which is used for the value category and accept the argument
as an lvalue. This means callers say __as_const<T>(t) instead of
__as_const(std::forward<T>(t)).

Always use an lvalue reference type as the template argument for the
_S_noexcept helpers, so that we only instantiate one specialization for
lvalues and rvalues of the same type.

Move some helper concepts and functions from namespace std::__detail
to ranges::__cust_access, to be consistent with the ranges::begin CPO.
This ensures that the __adl_begin concept and the _Begin::operator()
function are in the same namespace, so unqualified lookup is consistent
and the poison pills for begin are visible to both.

Simplified static assertions for arrays, because the expression a+0 is
already ill-formed for an array of incomplete type.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

PR libstdc++/100824
* include/bits/iterator_concepts.h (__detail::__decay_copy)
(__detail::__member_begin, __detail::__adl_begin): Move to
namespace ranges::__cust_access.
(__detail::__ranges_begin): Likewise, and rename to __begin.
Remove redundant static assertion.
* include/bits/ranges_base.h (_Begin, _End, _RBegin, _REnd):
Use lvalue in noexcept specifier.
(__as_const): Add non-deduced parameter for value category.
(_CBegin, _CEnd, _CRBegin, _CREnd, _CData): Adjust uses of
__as_const.
(__member_size, __adl_size, __member_empty, __size0_empty)
(__eq_iter_empty, __adl_data): Use lvalue objects in
requirements.
(__sentinel_size): Likewise. Add check for conversion to
unsigned-like.
(__member_data): Allow non-lvalue types to satisfy the concept,
but use lvalue object in requirements.
(_Size, _SSize): Remove forwarding to always use an lvalue.
(_Data): Likewise. Add static assertion for arrays.
* testsuite/std/ranges/access/cdata.cc: Adjust expected
behaviour for rvalues. Add negative tests for ill-formed
expressions.
* testsuite/std/ranges/access/data.cc: Likewise.
* testsuite/std/ranges/access/empty.cc: Adjust expected
behaviour for rvalues.
* testsuite/std/ranges/access/size.cc: Likewise.


An additional problem with ranges::data was pointed out in the PR,
fixed with this patch.

Tested powerpc64le-linux. Committed to trunk.

This should also be backported to gcc-11 and gcc-10.


commit 3e5f2425f80aedd00f28235022a2755eb46f310d
Author: Jonathan Wakely 
Date:   Fri Jun 4 20:25:39 2021

libstdc++: Fix helper concept for ranges::data [PR 100824]

We need to decay the result of t.data() before checking if it's a
pointer.
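
To see why, here is a small standalone check.  It uses std::decay_t as a
stand-in for the library's internal decay-copy helper, which is an
assumption of this sketch; struct A is the type from the new testcase.

```cpp
#include <type_traits>
#include <utility>

struct A { int*&& data(); };

// What a.data() yields for an lvalue a: an rvalue reference to a pointer.
using raw_result = decltype(std::declval<A&>().data());
// What is left once the result is decayed, as a decay-copy would do.
using decayed_result = std::decay_t<raw_result>;

// A reference to a pointer is not itself a pointer type...
static_assert(!std::is_pointer_v<raw_result>);
// ...but the decayed result is, so the check must look at that instead.
static_assert(std::is_pointer_v<decayed_result>);
```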

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

PR libstdc++/100824
* include/bits/ranges_base.h (__member_data): Use __decay_copy.
* testsuite/std/ranges/access/data.cc: Add testcase from PR.

diff --git a/libstdc++-v3/include/bits/ranges_base.h b/libstdc++-v3/include/bits/ranges_base.h
index 17a421a4927..61d91eb8389 100644
--- a/libstdc++-v3/include/bits/ranges_base.h
+++ b/libstdc++-v3/include/bits/ranges_base.h
@@ -495,8 +495,10 @@ namespace ranges
 && is_object_v>;
 
 template
-  concept __member_data
-	= requires(_Tp& __t) { { __t.data() } -> __pointer_to_object; };
+  concept __member_data = requires(_Tp& __t)
+	{
+	  { __cust_access::__decay_copy(__t.data()) } -> __pointer_to_object;
+	};
 
 template
   concept __begin_data = requires(_Tp& __t)
diff --git a/libstdc++-v3/testsuite/std/ranges/access/data.cc b/libstdc++-v3/testsuite/std/ranges/access/data.cc
index 237bbcc76c5..4f16f447f9f 100644
--- a/libstdc++-v3/testsuite/std/ranges/access/data.cc
+++ b/libstdc++-v3/testsuite/std/ranges/access/data.cc
@@ -92,8 +92,12 @@ test03()
   // ranges::data should treat the subexpression as an lvalue
   VERIFY( std::ranges::data(std::move(r)) == &R3::i );
   VERIFY( std::ranges::data(std::move(c)) == &R3::l );
-}
 
+  // PR libstdc++/100824 comment 3
+  // Check for member data() should use decay-copy
+  struct A { int*&& data(); };
+  static_assert( has_data );
+}
 
 int
 main()


Re: [PATCH] PR libstdc++/98842: Fixed Constraints on operator<=>(optional, U)

2021-06-04 Thread Jonathan Wakely via Gcc-patches
On Thu, 3 Jun 2021 at 17:27, Seija K. via Libstdc++ 
wrote:

> The original operator was underconstrained. _Up needs to fulfill
> compare_three_way_result,
> as mentioned in this bug report
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98842
>

Thanks, I'll get the patch applied next week.



> diff --git a/libstdc++-v3/include/std/optional
> b/libstdc++-v3/include/std/optional
> index 8b9e038e6e510..9e61c1b2cbfbd 100644
> --- a/libstdc++-v3/include/std/optional
> +++ b/libstdc++-v3/include/std/optional
> @@ -1234,7 +1234,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  { return !__rhs || __lhs >= *__rhs; }
>
>  #ifdef __cpp_lib_three_way_comparison
> -  template
> +  template _Up>
>  constexpr compare_three_way_result_t<_Tp, _Up>
>  operator<=>(const optional<_Tp>& __x, const _Up& __v)
>  { return bool(__x) ? *__x <=> __v : strong_ordering::less; }
>
>


Re: [PATCH] [libstdc++] Remove unused hasher instance.

2021-06-04 Thread Jonathan Wakely via Gcc-patches
On Fri, 4 Jun 2021 at 20:54, Thomas Rodgers wrote:

> This is a remnant of poorly executed refactoring.
>

OK for trunk and gcc-11, thanks.



> libstdc++-v3/ChangeLog:
>
> * include/std/barrier (__tree_barrier::_M_arrive): Remove
> unnecessary hasher instantiation.
> ---
>  libstdc++-v3/include/std/barrier | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/libstdc++-v3/include/std/barrier
> b/libstdc++-v3/include/std/barrier
> index fd61fb4f9da..4210e30d1ce 100644
> --- a/libstdc++-v3/include/std/barrier
> +++ b/libstdc++-v3/include/std/barrier
> @@ -103,7 +103,6 @@ It looks different from literature pseudocode for two main reasons:
>static_cast<__barrier_phase_t>(__old_phase_val + 2);
>
> size_t __current_expected = _M_expected;
> -   std::hash __hasher;
> __current %= ((_M_expected + 1) >> 1);
>
> for (int __round = 0; ; ++__round)
> --
> 2.26.2
>
>


Re: [PATCH 2/5 ver4] RS6000: Add 128-bit Integer Operations

2021-06-04 Thread Segher Boessenkool
On Tue, Apr 27, 2021 at 06:46:16PM -0500, will schmidt wrote:
> On Mon, 2021-04-26 at 09:36 -0700, Carl Love wrote:
> > (rs6000_gimple_fold_builtin) [P10V_BUILTIN_VCMPEQUT,
> > P10_BUILTIN_CMPNET, P10_BUILTIN_CMPGE_1TI,
> > P10_BUILTIN_CMPGE_U1TI, P10_BUILTIN_VCMPGTUT,
> > P10_BUILTIN_VCMPGTST, P10_BUILTIN_CMPLE_1TI,
> > P10_BUILTIN_CMPLE_U1TI]: New case statements.
> 
> No signs of P10_BUILTIN_CMPNET below.  possibly P10V_BUILTIN_CMPNET?
> Same through at least P10_BUILTIN_CMPLE_U1TI.

It is P10V_BUILTIN_CMPNET (note the V).


Segher


[PATCH] [libstdc++] Remove unused hasher instance.

2021-06-04 Thread Thomas Rodgers
This is a remnant of poorly executed refactoring.

libstdc++-v3/ChangeLog:

* include/std/barrier (__tree_barrier::_M_arrive): Remove
unnecessary hasher instantiation.
---
 libstdc++-v3/include/std/barrier | 1 -
 1 file changed, 1 deletion(-)

diff --git a/libstdc++-v3/include/std/barrier b/libstdc++-v3/include/std/barrier
index fd61fb4f9da..4210e30d1ce 100644
--- a/libstdc++-v3/include/std/barrier
+++ b/libstdc++-v3/include/std/barrier
@@ -103,7 +103,6 @@ It looks different from literature pseudocode for two main reasons:
   static_cast<__barrier_phase_t>(__old_phase_val + 2);
 
size_t __current_expected = _M_expected;
-   std::hash __hasher;
__current %= ((_M_expected + 1) >> 1);
 
for (int __round = 0; ; ++__round)
-- 
2.26.2



Re: [PATCH 01/11] gen: Emit error msg for empty split condition

2021-06-04 Thread Segher Boessenkool
On Fri, Jun 04, 2021 at 01:03:34PM -0600, Martin Sebor wrote:
> Also, "insn" is not a word, and even though it's a common abbreviation
> in GCC speak it's not necessarily something all users are familiar
> with, and doesn't lend itself to translation.  Please spell out
> the word instead.

This is a message for GCC developers, and it is not translated.


Segher


Re: [PATCH 01/11] gen: Emit error msg for empty split condition

2021-06-04 Thread Martin Sebor via Gcc-patches

On 6/1/21 11:04 PM, Kewen Lin via Gcc-patches wrote:

As Segher suggested, this patch is to emit the error message
if the split condition of define_insn_and_split is empty while
the insn condition isn't.
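
For context, a simplified model of how the two conditions interact, with
the proposed check added.  This is a sketch, not the gensupport.c code:
the function name, the use of std::string, the exception, and the
TARGET_FOO condition are all made up for illustration.  The behaviour it
models is that a split condition starting with "&&" is joined onto the
insn condition, while any other split condition stands alone.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical model: compute the condition the splitter ends up with.
inline std::string
effective_split_condition (const std::string &insn_cond,
			   const std::string &split_cond)
{
  if (split_cond.rfind ("&&", 0) == 0)
    {
      // Leading "&&": the insn condition AND the rest.
      std::string rest = split_cond.substr (2);
      rest.erase (0, rest.find_first_not_of (' '));
      if (insn_cond.empty ())
	return rest;
      return "(" + insn_cond + ") && (" + rest + ")";
    }

  // The proposed diagnostic: an empty split condition would make the
  // splitter unconditional even though the insn itself is conditional.
  if (split_cond.empty () && !insn_cond.empty ())
    throw std::invalid_argument ("the split condition must not be empty "
				 "if the insn condition is not empty");

  // Otherwise the split condition is used as written, which silently
  // drops the insn condition.
  return split_cond;
}

inline void split_condition_demo ()
{
  assert (effective_split_condition ("TARGET_FOO", "&& reload_completed")
	  == "(TARGET_FOO) && (reload_completed)");
  assert (effective_split_condition ("TARGET_FOO", "reload_completed")
	  == "reload_completed");
}
```

The second assertion in split_condition_demo() shows the related hazard:
a non-empty split condition without "&&" also ignores the insn
condition, though the check proposed here only covers the empty case.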

gcc/ChangeLog:

* gensupport.c (process_rtx): Emit error message for empty
split condition in define_insn_and_split while the insn
condition isn't.
---
  gcc/gensupport.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/gcc/gensupport.c b/gcc/gensupport.c
index 0f19bd70664..52cee120215 100644
--- a/gcc/gensupport.c
+++ b/gcc/gensupport.c
@@ -620,6 +620,9 @@ process_rtx (rtx desc, file_location loc)
  }
else if (GET_CODE (desc) == DEFINE_INSN_AND_REWRITE)
  error_at (loc, "the rewrite condition must start with `&&'");
+   else if (split_cond[0] == '\0' && strlen (XSTR (desc, 2)) != 0)
+ error_at (loc, "the split condition mustn't be empty if the "
+"insn condition isn't empty");


The "mustn't" (and other similar contractions) should trigger
-Wformat-diag that GCC should be free of, or was not too long ago.
Can you please spell them out (the suggested alternative spelling
should be mentioned in the warning)?

Also, "insn" is not a word, and even though it's a common abbreviation
in GCC speak it's not necessarily something all users are familiar
with, and doesn't lend itself to translation.  Please spell out
the word instead.

Thanks
Martin


XSTR (split, 1) = split_cond;
if (GET_CODE (desc) == DEFINE_INSN_AND_REWRITE)
  XVEC (split, 2) = gen_rewrite_sequence (XVEC (desc, 1));





Re: [PATCH 02/57] Support scanning of build-time GC roots in gengtype

2021-06-04 Thread Bill Schmidt via Gcc-patches

On 5/20/21 5:24 PM, Segher Boessenkool wrote:

On Tue, May 11, 2021 at 11:01:22AM -0500, Bill Schmidt wrote:

Hi!  I'd like to ping this specific patch from the series, which is the
only one remaining that affects common code.  I confess that I don't
know whom to ask for a review for gengtype; I didn't get any good ideas
from MAINTAINERS.  If you know of a good reviewer candidate, please CC
them.

Richard is listed as the "gen* on machine desc" maintainer, that might
be the closest to this.  cc:ed.


Hi, Richard -- any thoughts on this patch?

https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568841.html

Thanks!
Bill




Segher


[committed] d: Fix ICE in gimplify_var_or_parm_decl, at gimplify.c:2755 (PR100882)

2021-06-04 Thread Iain Buclaw via Gcc-patches
Hi,

Constructor calls for temporaries were reusing the TARGET_EXPR_SLOT of a
TARGET_EXPR for an assignment, which later got passed to `build_assign',
which stripped away the outer TARGET_EXPR, leaving a reference to a lone
temporary with no declaration.

This stripping away of the TARGET_EXPR also discarded any cleanups that
may have been assigned to the expression as well.

So now the reuse of TARGET_EXPR_SLOT has been removed, and
`build_assign' now constructs assignments inside the TARGET_EXPR_INITIAL
slot.  This has also been extended to `return_expr', to deal with the
possibility of a TARGET_EXPR being returned.

Bootstrapped and regression tested on x86_64-linux-gnu{-m32,-mx32},
committed to mainline, and backported to the releases/gcc-9,
releases/gcc-10, and releases/gcc-11 branches.

Regards,
Iain.

---
gcc/d/ChangeLog:

PR d/100882
* d-codegen.cc (build_assign): Construct initializations inside
TARGET_EXPR_INITIAL.
(compound_expr): Remove intermediate expressions that have no
side-effects.
(return_expr): Construct returns inside TARGET_EXPR_INITIAL.
* expr.cc (ExprVisitor::visit (CallExp *)): Remove useless assignment
to TARGET_EXPR_SLOT.

gcc/testsuite/ChangeLog:

PR d/100882
* gdc.dg/pr100882a.d: New test.
* gdc.dg/pr100882b.d: New test.
* gdc.dg/pr100882c.d: New test.
* gdc.dg/torture/pr100882.d: New test.
---
 gcc/d/d-codegen.cc  | 36 -
 gcc/d/expr.cc   |  7 +
 gcc/testsuite/gdc.dg/pr100882a.d| 35 
 gcc/testsuite/gdc.dg/pr100882b.d| 19 +
 gcc/testsuite/gdc.dg/pr100882c.d| 25 +
 gcc/testsuite/gdc.dg/torture/pr100882.d | 21 +++
 6 files changed, 131 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gdc.dg/pr100882a.d
 create mode 100644 gcc/testsuite/gdc.dg/pr100882b.d
 create mode 100644 gcc/testsuite/gdc.dg/pr100882c.d
 create mode 100644 gcc/testsuite/gdc.dg/torture/pr100882.d

diff --git a/gcc/d/d-codegen.cc b/gcc/d/d-codegen.cc
index 5fa1acd9240..9a9447371aa 100644
--- a/gcc/d/d-codegen.cc
+++ b/gcc/d/d-codegen.cc
@@ -1330,6 +1330,7 @@ component_ref (tree object, tree field)
 tree
 build_assign (tree_code code, tree lhs, tree rhs)
 {
+  tree result;
   tree init = stabilize_expr (&lhs);
   init = compound_expr (init, stabilize_expr (&rhs));
 
@@ -1348,22 +1349,27 @@ build_assign (tree_code code, tree lhs, tree rhs)
   if (TREE_CODE (rhs) == TARGET_EXPR)
 {
   /* If CODE is not INIT_EXPR, can't initialize LHS directly,
-since that would cause the LHS to be constructed twice.
-So we force the TARGET_EXPR to be expanded without a target.  */
+since that would cause the LHS to be constructed twice.  */
   if (code != INIT_EXPR)
{
  init = compound_expr (init, rhs);
- rhs = TARGET_EXPR_SLOT (rhs);
+ result = build_assign (code, lhs, TARGET_EXPR_SLOT (rhs));
}
   else
{
  d_mark_addressable (lhs);
- rhs = TARGET_EXPR_INITIAL (rhs);
+ TARGET_EXPR_INITIAL (rhs) = build_assign (code, lhs,
+   TARGET_EXPR_INITIAL (rhs));
+ result = rhs;
}
 }
+  else
+{
+  /* Simple assignment.  */
+  result = fold_build2_loc (input_location, code,
+   TREE_TYPE (lhs), lhs, rhs);
+}
 
-  tree result = fold_build2_loc (input_location, code,
-TREE_TYPE (lhs), lhs, rhs);
   return compound_expr (init, result);
 }
 
@@ -1485,6 +1491,11 @@ compound_expr (tree arg0, tree arg1)
   if (arg0 == NULL_TREE || !TREE_SIDE_EFFECTS (arg0))
 return arg1;
 
+  /* Remove intermediate expressions that have no side-effects.  */
+  while (TREE_CODE (arg0) == COMPOUND_EXPR
+&& !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1)))
+arg0 = TREE_OPERAND (arg0, 0);
+
   if (TREE_CODE (arg1) == TARGET_EXPR)
 {
   /* If the rhs is a TARGET_EXPR, then build the compound expression
@@ -1505,6 +1516,19 @@ compound_expr (tree arg0, tree arg1)
 tree
 return_expr (tree ret)
 {
+  /* Same as build_assign, the DECL_RESULT assignment replaces the temporary
+ in TARGET_EXPR_SLOT.  */
+  if (ret != NULL_TREE && TREE_CODE (ret) == TARGET_EXPR)
+{
+  tree exp = TARGET_EXPR_INITIAL (ret);
+  tree init = stabilize_expr (&exp);
+
+  exp = fold_build1_loc (input_location, RETURN_EXPR, void_type_node, exp);
+  TARGET_EXPR_INITIAL (ret) = compound_expr (init, exp);
+
+  return ret;
+}
+
   return fold_build1_loc (input_location, RETURN_EXPR,
  void_type_node, ret);
 }
diff --git a/gcc/d/expr.cc b/gcc/d/expr.cc
index aad7cbbf947..e76cae98f7e 100644
--- a/gcc/d/expr.cc
+++ b/gcc/d/expr.cc
@@ -1894,15 +1894,10 @@ public:
   exp = d_convert (build_ctype (e->type), exp);
 

Re: [Patch] OpenMP: Handle bind clause in tree-nested.c [PR100905]

2021-06-04 Thread Jakub Jelinek via Gcc-patches
On Fri, Jun 04, 2021 at 07:47:50PM +0200, Tobias Burnus wrote:
> Fails due to the (explicit or implicitly added) 'bind' clause as
> tree-nested.c did not handle them.
> 
> In convert_nonlocal_omp_clauses, the following clauses are
> missing: OMP_CLAUSE_AFFINITY, OMP_CLAUSE_DEVICE_TYPE,
> OMP_CLAUSE_EXCLUSIVE, OMP_CLAUSE_INCLUSIVE.
> 
> I am not sure which of them should or must be added – but the
> 'bind' clause for sure; I did add 'affinity' but it is currently
> removed during gimplification – hence, I think leaving it out
> would also be an option.
> 
> Lightly tested. OK once/when testing has succeeded?

Because OMP_CLAUSE_AFFINITY is dropped during gimplification,
I'd leave OMP_CLAUSE_AFFINITY out.  Ok for trunk with that change.

And OMP_CLAUSE_{EXCLUSIVE,INCLUSIVE} isn't needed, because we don't
walk the clauses at all for GIMPLE_OMP_SCAN.  It would be a bug
if we used the exclusive/inclusive operands after gimplification,
but we apparently don't do that, all we check is whether the
OMP_CLAUSE_KIND of the first clause (all should be the same) is
OMP_CLAUSE_EXCLUSIVE or OMP_CLAUSE_INCLUSIVE, nothing else.

That said, I think we should have a testcase, so I'll commit the following
after testing:

2021-06-04  Jakub Jelinek  

* gcc.dg/gomp/scan-1.c: New test.

--- gcc/testsuite/gcc.dg/gomp/scan-1.c.jj   2021-06-04 20:03:44.250674711 +0200
+++ gcc/testsuite/gcc.dg/gomp/scan-1.c  2021-06-04 20:03:31.164851821 +0200
@@ -0,0 +1,51 @@
+int baz (void);
+void qux (int);
+int r;
+
+int
+foo (void)
+{
+  int r = 0, i;
+  void bar (void) { r++; }
+  #pragma omp parallel for reduction(inscan, +:r)
+  for (i = 0; i < 64; i++)
+{
+  r += baz ();
+  #pragma omp scan inclusive(r)
+  qux (r);
+}
+  #pragma omp parallel for reduction(inscan, +:r)
+  for (i = 0; i < 64; i++)
+{
+  qux (r);
+  #pragma omp scan exclusive(r)
+  r += baz ();
+}
+  bar ();
+  return r;
+}
+
+int
+corge (void)
+{
+  int r = 0, i;
+  void bar (void)
+  {
+#pragma omp parallel for reduction(inscan, +:r)
+for (i = 0; i < 64; i++)
+  {
+   r += baz ();
+   #pragma omp scan inclusive(r)
+   qux (r);
+  }
+#pragma omp parallel for reduction(inscan, +:r)
+for (i = 0; i < 64; i++)
+  {
+   qux (r);
+   #pragma omp scan exclusive(r)
+   r += baz ();
+  }
+  }
+  bar ();
+  return r;
+}


Jakub



arm_arch8_5 and arm_arch8_6 target baselines in arm.c

2021-06-04 Thread John Paul Adrian Glaubitz
Please keep me CC'ed as I'm currently not subscribed!

Hi!

I'm currently fixing some minor portability issues in gccrs and stumbled
over the following error [1], which indicates a reference to an ARMv8 target
extension in the ARM backend which doesn't seem to exist:

../../gcc/config/arm/arm-rust.c:204:9: error: 'arm_arch8_5' was not declared in this scope; did you mean 'arm_arch8_4'?
  204 | if (arm_arch8_5)
  | ^~~
  | arm_arch8_4
../../gcc/config/arm/arm-rust.c:206:9: error: 'arm_arch8_6' was not declared in this scope; did you mean 'arm_arch8_4'?
  206 | if (arm_arch8_6)
  | ^~~
  | arm_arch8_4

Looking at gcc/config/arm/arm.c, the highest level of ARMv8 that is
supported in the ARM backend is 8.4, not 8.5 or 8.6 [2]:

/* Nonzero if this chip supports the ARM Architecture 8.2 extensions.  */
int arm_arch8_2 = 0;

/* Nonzero if this chip supports the ARM Architecture 8.3 extensions.  */
int arm_arch8_3 = 0;

/* Nonzero if this chip supports the ARM Architecture 8.4 extensions.  */
int arm_arch8_4 = 0;

/* Nonzero if this chip supports the ARM Architecture 8.1-M Mainline
   extensions.  */
int arm_arch8_1m_main = 0;

/* Nonzero if this chip supports the FP16 instructions extension of ARM
   Architecture 8.2.  */
int arm_fp16_inst = 0;

Does anyone know whether the ARM backend is supposed to support the ARMv8.5
and ARMv8.6 baselines, or are these only present in the AArch64 backend? If
the latter applies, I can just delete the two if statements in arm-rust.c.

Thanks,
Adrian

> [1] https://github.com/Rust-GCC/gccrs/issues/483
> [2] https://github.com/gcc-mirror/gcc/blob/master/gcc/config/arm/arm.c#L919

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913


[Patch] OpenMP: Handle bind clause in tree-nested.c [PR100905]

2021-06-04 Thread Tobias Burnus

Fails due to the (explicit or implicitly added) 'bind' clause as
tree-nested.c did not handle them.

In convert_nonlocal_omp_clauses, the following clauses are
missing: OMP_CLAUSE_AFFINITY, OMP_CLAUSE_DEVICE_TYPE,
OMP_CLAUSE_EXCLUSIVE, OMP_CLAUSE_INCLUSIVE.

I am not sure which of them should or must be added – but the
'bind' clause for sure; I did add 'affinity' but it is currently
removed during gimplification – hence, I think leaving it out
would also be an option.

Lightly tested. OK once/when testing has succeeded?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
OpenMP: Handle bind clause in tree-nested.c [PR100905]

	PR middle-end/100905

gcc/ChangeLog:

	* tree-nested.c (convert_nonlocal_omp_clauses,
	convert_local_omp_clauses): Handle OMP_CLAUSE_AFFINITY
	and OMP_CLAUSE_BIND.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/loop-3.f90: New test.

diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-3.f90
new file mode 100644
index 000..6d25b19735d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/loop-3.f90
@@ -0,0 +1,55 @@
+! PR middle-end/100905
+!
+PROGRAM test_loop_order_concurrent
+  implicit none
+  integer :: a, cc(64), dd(64)
+
+  dd = 54
+  cc = 99
+
+  call test_loop()
+  call test_affinity(a)
+  if (a /= 5) stop 3
+  call test_scan(cc, dd)
+  if (any (cc /= 99)) stop 4
+  if (dd(1) /= 5  .or. dd(2) /= 104) stop 5
+
+CONTAINS
+
+  SUBROUTINE test_loop()
+INTEGER,DIMENSION(1024):: a, b, c
+INTEGER:: i
+
+DO i = 1, 1024
+   a(i) = 1
+   b(i) = i + 1
+   c(i) = 2*(i + 1)
+END DO
+
+   !$omp loop order(concurrent) bind(thread)
+DO i = 1, 1024
+   a(i) = a(i) + b(i)*c(i)
+END DO
+
+DO i = 1, 1024
+   if (a(i) /= 1 + (b(i)*c(i))) stop 1
+END DO
+  END SUBROUTINE test_loop
+
+  SUBROUTINE test_affinity(aa)
+integer :: aa
+!$omp task affinity(aa)
+  a = 5
+!$omp end task
+  end 
+
+  subroutine test_scan(c, d)
+integer i, c(*), d(*)
+!$omp simd reduction (inscan, +: a)
+do i = 1, 64
+  d(i) = a
+  !$omp scan exclusive (a)
+  a = a + c(i)
+end do
+  end
+END PROGRAM test_loop_order_concurrent
diff --git a/gcc/tree-nested.c b/gcc/tree-nested.c
index cea917a4d58..6ab3bfd5184 100644
--- a/gcc/tree-nested.c
+++ b/gcc/tree-nested.c
@@ -1365,6 +1365,7 @@ convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi)
 	case OMP_CLAUSE_FINAL:
 	case OMP_CLAUSE_IF:
 	case OMP_CLAUSE_NUM_THREADS:
+	case OMP_CLAUSE_AFFINITY:
 	case OMP_CLAUSE_DEPEND:
 	case OMP_CLAUSE_DEVICE:
 	case OMP_CLAUSE_NUM_TEAMS:
@@ -1484,6 +1485,7 @@ convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi)
 	case OMP_CLAUSE_AUTO:
 	case OMP_CLAUSE_IF_PRESENT:
 	case OMP_CLAUSE_FINALIZE:
+	case OMP_CLAUSE_BIND:
 	case OMP_CLAUSE__CONDTEMP_:
 	case OMP_CLAUSE__SCANTEMP_:
 	  break;
@@ -2140,6 +2142,7 @@ convert_local_omp_clauses (tree *pclauses, struct walk_stmt_info *wi)
 	case OMP_CLAUSE_IF:
 	case OMP_CLAUSE_NUM_THREADS:
 	case OMP_CLAUSE_DEPEND:
+	case OMP_CLAUSE_AFFINITY:
 	case OMP_CLAUSE_DEVICE:
 	case OMP_CLAUSE_NUM_TEAMS:
 	case OMP_CLAUSE_THREAD_LIMIT:
@@ -2264,6 +2267,7 @@ convert_local_omp_clauses (tree *pclauses, struct walk_stmt_info *wi)
 	case OMP_CLAUSE_AUTO:
 	case OMP_CLAUSE_IF_PRESENT:
 	case OMP_CLAUSE_FINALIZE:
+	case OMP_CLAUSE_BIND:
 	case OMP_CLAUSE__CONDTEMP_:
 	case OMP_CLAUSE__SCANTEMP_:
 	  break;


Re: [committed] libstdc++: Add feature test macro for heterogeneous lookup in unordered containers

2021-06-04 Thread Jonathan Wakely via Gcc-patches

On 04/06/21 16:01 +0100, Jonathan Wakely wrote:

Also update the C++20 status docs.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2020.xml:
* doc/html/*: Regenerate.
* include/bits/hashtable.h (__cpp_lib_generic_unordered_lookup):
Define.
* include/std/version (__cpp_lib_generic_unordered_lookup):
Define.
* testsuite/23_containers/unordered_map/operations/1.cc: Check
feature test macro.
* testsuite/23_containers/unordered_set/operations/1.cc:
Likewise.

Tested powerpc64le-linux. Committed to trunk.

I'll also add a note to the GCC 11 release notes, and I've updated the
compiler support page at cppreference.com


Oh, and this should be backported to gcc-11 too.




Re: [committed] libstdc++: Optimize std::any_cast by replacing indirect call

2021-06-04 Thread Jonathan Wakely via Gcc-patches

Apparently my mailer decided to send this email as From: Tim, rather
than me. Sorry for any confusion.  The patch is from Tim, but the
email to the lists was sent by me (jwakely). Hopefully this one will
have the right From: header on it!


On 04/06/21 18:02 +0100, Tim Adye wrote:

This significantly improves the performance of std::any_cast, by
avoiding an indirect call to the _S_manage function through a function
pointer. Before we make that indirect call we've already established
that the contained value has the expected type, which means we also know
the manager type, and so can call one of its members directly.

We also know the precise type in the any::emplace functions, because
we've just constructed that type, so we can use the new member there
too. That doesn't seem to affect performance, but we might as well use
the new _S_access function anyway.

Signed-off-by: Tim Adye 
Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/std/any (any::_Manager::_S_access): New static
function to access the contained value.
(any::emplace, __any_caster): Use _S_access member of the
manager type.

Tested powerpc64le-linux. Committed to trunk.

This patch was contributed by Tim Adye and accepted in line with the
new policy announced in https://gcc.gnu.org/pipermail/gcc/2021-June/236182.html

Thanks, Tim!




commit f6bb145c0bff19767931d37733be11c8acc6fa00
Author: Tim Adye 
Date:   Fri Jun 4 15:59:38 2021

   libstdc++: Optimize std::any_cast by replacing indirect call

   This significantly improves the performance of std::any_cast, by
   avoiding an indirect call to the _S_manage function through a function
   pointer. Before we make that indirect call we've already established
   that the contained value has the expected type, which means we also know
   the manager type, and so can call one of its members directly.

   We also know the precise type in the any::emplace functions, because
   we've just constructed that type, so we can use the new member there
   too. That doesn't seem to affect performance, but we might as well use
   the new _S_access function anyway.

   Signed-off-by: Tim Adye 
   Signed-off-by: Jonathan Wakely 

   libstdc++-v3/ChangeLog:

   * include/std/any (any::_Manager::_S_access): New static
   function to access the contained value.
   (any::emplace, __any_caster): Use _S_access member of the
   manager type.

diff --git a/libstdc++-v3/include/std/any b/libstdc++-v3/include/std/any
index 391e43339a0..21120a9146f 100644
--- a/libstdc++-v3/include/std/any
+++ b/libstdc++-v3/include/std/any
@@ -263,9 +263,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  {
using _VTp = decay_t<_Tp>;
__do_emplace<_VTp>(std::forward<_Args>(__args)...);
-   any::_Arg __arg;
-   this->_M_manager(any::_Op_access, this, &__arg);
-   return *static_cast<_VTp*>(__arg._M_obj);
+   return *any::_Manager<_VTp>::_S_access(_M_storage);
  }

/// Emplace with an object created from @p __il and @p __args as
@@ -276,9 +274,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  {
using _VTp = decay_t<_Tp>;
__do_emplace<_VTp, _Up>(__il, std::forward<_Args>(__args)...);
-   any::_Arg __arg;
-   this->_M_manager(any::_Op_access, this, &__arg);
-   return *static_cast<_VTp*>(__arg._M_obj);
+   return *any::_Manager<_VTp>::_S_access(_M_storage);
  }

// modifiers
@@ -384,6 +380,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
void* __addr = &__storage._M_buffer;
::new (__addr) _Tp(std::forward<_Args>(__args)...);
  }
+
+   static _Tp*
+   _S_access(const _Storage& __storage)
+   {
+ // The contained object is in __storage._M_buffer
+ const void* __addr = &__storage._M_buffer;
+ return static_cast<_Tp*>(const_cast<void*>(__addr));
+   }
  };

// Manage external contained object.
@@ -405,6 +409,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  {
__storage._M_ptr = new _Tp(std::forward<_Args>(__args)...);
  }
+   static _Tp*
+   _S_access(const _Storage& __storage)
+   {
+ // The contained object is in *__storage._M_ptr
+ return static_cast<_Tp*>(__storage._M_ptr);
+   }
  };
  };

@@ -511,9 +521,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
#endif
  )
{
- any::_Arg __arg;
- __any->_M_manager(any::_Op_access, __any, &__arg);
- return __arg._M_obj;
+ return any::_Manager<_Up>::_S_access(__any->_M_storage);
}
  return nullptr;
}




[committed] libstdc++: Fix value categories used by ranges access CPOs [PR 100824]

2021-06-04 Thread Jonathan Wakely via Gcc-patches
The implementation of P2091R0 was incomplete, so that some range access
CPOs used perfect forwarding where they should not. This fixes it by
consistently operating on lvalues.

Some additional changes that are not necessary to fix the bug:

Modify the __as_const helper to simplify its usage. Instead of deducing
the value category from its argument, and requiring callers to forward
the argument as the correct category, add a non-deduced template
parameter which is used for the value category and accept the argument
as an lvalue. This means callers say __as_const<T>(t) instead of
__as_const(std::forward<T>(t)).
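
The helper change described above can be sketched roughly as follows. This
is only an illustration under stated assumptions, not the code from the
patch: the name as_const_sketch and its exact shape are hypothetical. The
first template parameter is given explicitly and carries the value
category, while the function argument is always accepted as an lvalue.

```cpp
#include <type_traits>

// Hypothetical analogue of the reworked helper (not the libstdc++ source).
// To is supplied explicitly by the caller and encodes the value category:
// an lvalue reference type means "treat as lvalue", anything else means
// "treat as rvalue".  T is deduced from the argument, which is always
// passed as an lvalue.
template<typename To, typename T>
constexpr decltype(auto)
as_const_sketch(T& t) noexcept
{
  // To must name the same object type as T, possibly with & or && added.
  static_assert(std::is_same_v<To&, T&>, "To must be T, T& or T&&");

  if constexpr (std::is_lvalue_reference_v<To>)
    return const_cast<const T&>(t);    // result type: const T&
  else
    return static_cast<const T&&>(t);  // result type: const T&&
}
```

With this shape a caller holding a forwarding reference `T&& t` would write
`as_const_sketch<T>(t)`; no std::forward is needed at the call site because
the category comes from the explicit template argument alone.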

Always use an lvalue reference type as the template argument for the
_S_noexcept helpers, so that we only instantiate one specialization for
lvalues and rvalues of the same type.

Move some helper concepts and functions from namespace std::__detail
to ranges::__cust_access, to be consistent with the ranges::begin CPO.
This ensures that the __adl_begin concept and the _Begin::operator()
function are in the same namespace, so unqualified lookup is consistent
and the poison pills for begin are visible to both.

Simplified static assertions for arrays, because the expression a+0 is
already ill-formed for an array of incomplete type.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

PR libstdc++/100824
* include/bits/iterator_concepts.h (__detail::__decay_copy)
(__detail::__member_begin, __detail::__adl_begin): Move to
namespace ranges::__cust_access.
(__detail::__ranges_begin): Likewise, and rename to __begin.
Remove redundant static assertion.
* include/bits/ranges_base.h (_Begin, _End, _RBegin, _REnd):
Use lvalue in noexcept specifier.
(__as_const): Add non-deduced parameter for value category.
(_CBegin, _CEnd, _CRBegin, _CREnd, _CData): Adjust uses of
__as_const.
(__member_size, __adl_size, __member_empty, __size0_empty):
(__eq_iter_empty, __adl_data): Use lvalue objects in
requirements.
(__sentinel_size): Likewise. Add check for conversion to
unsigned-like.
(__member_data): Allow non-lvalue types to satisfy the concept,
but use lvalue object in requirements.
(_Size, _SSize): Remove forwarding to always use an lvalue.
(_Data): Likewise. Add static assertion for arrays.
* testsuite/std/ranges/access/cdata.cc: Adjust expected
behaviour for rvalues. Add negative tests for ill-formed
expressions.
* testsuite/std/ranges/access/data.cc: Likewise.
* testsuite/std/ranges/access/empty.cc: Adjust expected
behaviour for rvalues.
* testsuite/std/ranges/access/size.cc: Likewise.

Tested powerpc64le-linux. Committed to trunk.

I think this should be backported to gcc-11 too, and maybe gcc-10 too.


commit ee9548b36a7f17e8a63585b58f340c93dcba95d8
Author: Jonathan Wakely 
Date:   Fri Jun 4 15:59:38 2021

libstdc++: Fix value categories used by ranges access CPOs [PR 100824]

The implementation of P2091R0 was incomplete, so that some range access
CPOs used perfect forwarding where they should not. This fixes it by
consistently operating on lvalues.

Some additional changes that are not necessary to fix the bug:

Modify the __as_const helper to simplify its usage. Instead of deducing
the value category from its argument, and requiring callers to forward
the argument as the correct category, add a non-deduced template
parameter which is used for the value category and accept the argument
as an lvalue. This means callers say __as_const<T>(t) instead of
__as_const(std::forward<T>(t)).

Always use an lvalue reference type as the template argument for the
_S_noexcept helpers, so that we only instantiate one specialization for
lvalues and rvalues of the same type.

Move some helper concepts and functions from namespace std::__detail
to ranges::__cust_access, to be consistent with the ranges::begin CPO.
This ensures that the __adl_begin concept and the _Begin::operator()
function are in the same namespace, so unqualified lookup is consistent
and the poison pills for begin are visible to both.

Simplified static assertions for arrays, because the expression a+0 is
already ill-formed for an array of incomplete type.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

PR libstdc++/100824
* include/bits/iterator_concepts.h (__detail::__decay_copy)
(__detail::__member_begin, __detail::__adl_begin): Move to
namespace ranges::__cust_access.
(__detail::__ranges_begin): Likewise, and rename to __begin.
Remove redundant static assertion.
* include/bits/ranges_base.h (_Begin, _End, _RBegin, _REnd):
Use lvalue in noexcept specifier.
(__as_const): Add non-deduced parameter for value category

[committed] libstdc++: Optimize std::any_cast by replacing indirect call

2021-06-04 Thread Tim Adye
This significantly improves the performance of std::any_cast, by
avoiding an indirect call to the _S_manage function through a function
pointer. Before we make that indirect call we've already established
that the contained value has the expected type, which means we also know
the manager type, and so can call one of its members directly.

We also know the precise type in the any::emplace functions, because
we've just constructed that type, so we can use the new member there
too. That doesn't seem to affect performance, but we might as well use
the new _S_access function anyway.
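
To make the idea concrete, here is a minimal sketch of the two access
paths. It assumes a much-simplified, hypothetical container (tiny_any, with
in-place storage only); none of these names come from libstdc++, and the
real std::any differs in many details. The point is only that once the
manager pointer has been compared against the manager for T, the static
accessor for T can be called directly instead of going back through the
function pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical, much-simplified stand-in for std::any (small objects only).
class tiny_any
{
  enum class op { access, destroy };

  struct storage
  {
    alignas(std::max_align_t) unsigned char buffer[32];
  };

  template<typename T>
  struct manager
  {
    // Type-erased entry point; callers reach it through a function pointer.
    static void* manage(op which, const tiny_any* self)
    {
      T* obj = access(self->store_);
      if (which == op::destroy)
        {
          obj->~T();
          return nullptr;
        }
      return obj;
    }

    // Direct accessor for callers that already know T.
    static T* access(const storage& s)
    {
      const void* addr = &s.buffer;
      return static_cast<T*>(const_cast<void*>(addr));
    }
  };

  void* (*manager_)(op, const tiny_any*) = nullptr;
  storage store_;

public:
  tiny_any() = default;
  tiny_any(const tiny_any&) = delete;
  tiny_any& operator=(const tiny_any&) = delete;
  ~tiny_any() { reset(); }

  void reset()
  {
    if (manager_)
      {
        manager_(op::destroy, this);
        manager_ = nullptr;
      }
  }

  template<typename T, typename... Args>
  T& emplace(Args&&... args)
  {
    static_assert(sizeof(T) <= sizeof(storage)
                  && alignof(T) <= alignof(storage),
                  "this sketch only stores small objects in place");
    reset();
    ::new (static_cast<void*>(store_.buffer)) T(std::forward<Args>(args)...);
    manager_ = &manager<T>::manage;
    return *manager<T>::access(store_);   // type is known: no indirect call
  }

  // Old style: type check, then an indirect call through manager_.
  template<typename T>
  T* cast_indirect()
  {
    if (manager_ != &manager<T>::manage)
      return nullptr;
    return static_cast<T*>(manager_(op::access, this));
  }

  // New style: the same type check, then a direct call.
  template<typename T>
  T* cast_direct()
  {
    if (manager_ != &manager<T>::manage)
      return nullptr;
    return manager<T>::access(store_);
  }
};

// Both paths must agree on the result.
inline void tiny_any_demo()
{
  tiny_any a;
  a.emplace<int>(42);
  assert(*a.cast_direct<int>() == 42);
  assert(a.cast_indirect<int>() == a.cast_direct<int>());
  assert(a.cast_direct<long>() == nullptr);
}
```

The direct path is easier for the compiler to inline, which is presumably
where the measured improvement comes from; the sketch itself makes no
performance claim.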

Signed-off-by: Tim Adye 
Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/std/any (any::_Manager::_S_access): New static
function to access the contained value.
(any::emplace, __any_caster): Use _S_access member of the
manager type.

Tested powerpc64le-linux. Committed to trunk.

This patch was contributed by Tim Adye and accepted in line with the
new policy announced in https://gcc.gnu.org/pipermail/gcc/2021-June/236182.html

Thanks, Tim!

commit f6bb145c0bff19767931d37733be11c8acc6fa00
Author: Tim Adye 
Date:   Fri Jun 4 15:59:38 2021

libstdc++: Optimize std::any_cast by replacing indirect call

This significantly improves the performance of std::any_cast, by
avoiding an indirect call to the _S_manage function through a function
pointer. Before we make that indirect call we've already established
that the contained value has the expected type, which means we also know
the manager type, and so can call one of its members directly.

We also know the precise type in the any::emplace functions, because
we've just constructed that type, so we can use the new member there
too. That doesn't seem to affect performance, but we might as well use
the new _S_access function anyway.

Signed-off-by: Tim Adye 
Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/std/any (any::_Manager::_S_access): New static
function to access the contained value.
(any::emplace, __any_caster): Use _S_access member of the
manager type.

diff --git a/libstdc++-v3/include/std/any b/libstdc++-v3/include/std/any
index 391e43339a0..21120a9146f 100644
--- a/libstdc++-v3/include/std/any
+++ b/libstdc++-v3/include/std/any
@@ -263,9 +263,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
using _VTp = decay_t<_Tp>;
__do_emplace<_VTp>(std::forward<_Args>(__args)...);
-   any::_Arg __arg;
-   this->_M_manager(any::_Op_access, this, &__arg);
-   return *static_cast<_VTp*>(__arg._M_obj);
+   return *any::_Manager<_VTp>::_S_access(_M_storage);
   }
 
 /// Emplace with an object created from @p __il and @p __args as
@@ -276,9 +274,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
using _VTp = decay_t<_Tp>;
__do_emplace<_VTp, _Up>(__il, std::forward<_Args>(__args)...);
-   any::_Arg __arg;
-   this->_M_manager(any::_Op_access, this, &__arg);
-   return *static_cast<_VTp*>(__arg._M_obj);
+   return *any::_Manager<_VTp>::_S_access(_M_storage);
   }
 
 // modifiers
@@ -384,6 +380,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
void* __addr = &__storage._M_buffer;
::new (__addr) _Tp(std::forward<_Args>(__args)...);
  }
+
+   static _Tp*
+   _S_access(const _Storage& __storage)
+   {
+ // The contained object is in __storage._M_buffer
+ const void* __addr = &__storage._M_buffer;
+ return static_cast<_Tp*>(const_cast<void*>(__addr));
+   }
   };
 
 // Manage external contained object.
@@ -405,6 +409,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  {
__storage._M_ptr = new _Tp(std::forward<_Args>(__args)...);
  }
+   static _Tp*
+   _S_access(const _Storage& __storage)
+   {
+ // The contained object is in *__storage._M_ptr
+ return static_cast<_Tp*>(__storage._M_ptr);
+   }
   };
   };
 
@@ -511,9 +521,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
  )
{
- any::_Arg __arg;
- __any->_M_manager(any::_Op_access, __any, &__arg);
- return __arg._M_obj;
+ return any::_Manager<_Up>::_S_access(__any->_M_storage);
}
   return nullptr;
 }


[committed] Fortran/OpenMP: Fix -fdump-parse-tree for 'omp loop'

2021-06-04 Thread Tobias Burnus

Committed as r12-1220-gcb6e6d5faa3f817435b6f203226fa5969d7a7264

commit cb6e6d5faa3f817435b6f203226fa5969d7a7264
Author: Tobias Burnus 
Date:   Fri Jun 4 18:51:35 2021 +0200

Fortran/OpenMP: Fix -fdump-parse-tree for 'omp loop'

gcc/fortran/ChangeLog
* dump-parse-tree.c (show_code_node): Handle
EXEC_OMP_(TARGET_)(,PARALLEL_,TEAMS_)LOOP.

diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
index 8e2df736d8c..141101e699d 100644
--- a/gcc/fortran/dump-parse-tree.c
+++ b/gcc/fortran/dump-parse-tree.c
@@ -3214,6 +3214,7 @@ show_code_node (int level, gfc_code *c)
 case EXEC_OMP_DO:
 case EXEC_OMP_DO_SIMD:
 case EXEC_OMP_FLUSH:
+case EXEC_OMP_LOOP:
 case EXEC_OMP_MASTER:
 case EXEC_OMP_MASTER_TASKLOOP:
 case EXEC_OMP_MASTER_TASKLOOP_SIMD:
@@ -3221,6 +3222,7 @@ show_code_node (int level, gfc_code *c)
 case EXEC_OMP_PARALLEL:
 case EXEC_OMP_PARALLEL_DO:
 case EXEC_OMP_PARALLEL_DO_SIMD:
+case EXEC_OMP_PARALLEL_LOOP:
 case EXEC_OMP_PARALLEL_MASTER:
 case EXEC_OMP_PARALLEL_MASTER_TASKLOOP:
 case EXEC_OMP_PARALLEL_MASTER_TASKLOOP_SIMD:
@@ -3237,12 +3239,14 @@ show_code_node (int level, gfc_code *c)
 case EXEC_OMP_TARGET_PARALLEL:
 case EXEC_OMP_TARGET_PARALLEL_DO:
 case EXEC_OMP_TARGET_PARALLEL_DO_SIMD:
+case EXEC_OMP_TARGET_PARALLEL_LOOP:
 case EXEC_OMP_TARGET_SIMD:
 case EXEC_OMP_TARGET_TEAMS:
 case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE:
 case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO:
 case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD:
 case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_SIMD:
+case EXEC_OMP_TARGET_TEAMS_LOOP:
 case EXEC_OMP_TARGET_UPDATE:
 case EXEC_OMP_TASK:
 case EXEC_OMP_TASKGROUP:
@@ -3255,6 +3259,7 @@ show_code_node (int level, gfc_code *c)
 case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO:
 case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD:
 case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD:
+case EXEC_OMP_TEAMS_LOOP:
 case EXEC_OMP_WORKSHARE:
   show_omp_node (level, c);
   break;


[Ping, Patch, Fortran] PR100337 Should be able to pass non-present optional arguments to CO_BROADCAST

2021-06-04 Thread Andre Vehreschild via Gcc-patches
Ping!

On Fri, 21 May 2021 15:33:11 +0200
Andre Vehreschild  wrote:

> Hi,
>
> the attached patch fixes an issue when calling CO_BROADCAST in
> -fcoarray=single mode, where the optional but non-present (in the calling
> scope) stat variable was assigned to before checking for it being not present.
>
> Regtests fine on x86-64-linux/f33. Ok for trunk?
>
> Regards,
>   Andre
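
For readers less familiar with how absent optional arguments are lowered,
the bug class described in the quoted message can be modelled in a few
lines of C++. This is only an analogy under stated assumptions: the
function co_broadcast_single_sketch and its parameters are hypothetical,
with a null pointer standing in for an optional argument that is not
present in the caller.

```cpp
#include <cassert>

// Hypothetical model of the single-image path.  `stat` stands in for an
// optional status argument; a null pointer means "not present".
inline void co_broadcast_single_sketch(int& data, int* stat)
{
  (void) data;          // with a single image there is nothing to transfer

  // Check for presence first; writing through `stat` unconditionally is
  // the kind of store the fix guards against.
  if (stat != nullptr)
    *stat = 0;
}

inline void co_broadcast_single_demo()
{
  int value = 42;
  co_broadcast_single_sketch(value, nullptr);   // absent stat: no store

  int status = 7;
  co_broadcast_single_sketch(value, &status);   // present stat: zeroed
  assert(status == 0);
  assert(value == 42);
}
```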


--
Andre Vehreschild * Email: vehre ad gmx dot de
gcc/fortran/ChangeLog:

	PR fortran/100337
	* trans-intrinsic.c (conv_co_collective): Check stat for null ptr
	before dereferencing.

gcc/testsuite/ChangeLog:

	PR fortran/100337
	* gfortran.dg/coarray_collectives_17.f90: New test.

diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index 4d7451479d3..03a38090051 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -11232,8 +11232,28 @@ conv_co_collective (gfc_code *code)
   if (flag_coarray == GFC_FCOARRAY_SINGLE)
 {
   if (stat != NULL_TREE)
-	gfc_add_modify (&block, stat,
-			fold_convert (TREE_TYPE (stat), integer_zero_node));
+	{
+	  /* For optional stats, check the pointer is valid before zero'ing.  */
+	  if (gfc_expr_attr (stat_expr).optional)
+	{
+	  tree tmp;
+	  stmtblock_t ass_block;
+	  gfc_start_block (&ass_block);
+	  gfc_add_modify (&ass_block, stat,
+			  fold_convert (TREE_TYPE (stat),
+	integer_zero_node));
+	  tmp = fold_build2 (NE_EXPR, logical_type_node,
+ gfc_build_addr_expr (NULL_TREE, stat),
+ null_pointer_node);
+	  tmp = fold_build3 (COND_EXPR, void_type_node, tmp,
+ gfc_finish_block (&ass_block),
+ build_empty_stmt (input_location));
+	  gfc_add_expr_to_block (&block, tmp);
+	}
+	  else
+	gfc_add_modify (&block, stat,
+			fold_convert (TREE_TYPE (stat), integer_zero_node));
+	}
   return gfc_finish_block (&block);
 }

diff --git a/gcc/testsuite/gfortran.dg/coarray_collectives_17.f90 b/gcc/testsuite/gfortran.dg/coarray_collectives_17.f90
new file mode 100644
index 000..84a6645865e
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/coarray_collectives_17.f90
@@ -0,0 +1,42 @@
+! { dg-do run }
+! { dg-options "-fcoarray=single" }
+!
+! PR 100337
+! Test case inspired by code submitted by Brad Richardson
+
+program main
+implicit none
+
+integer, parameter :: MESSAGE = 42
+integer :: result
+
+call myco_broadcast(MESSAGE, result, 1)
+
+if (result /= MESSAGE) error stop 1
+contains
+subroutine myco_broadcast(m, r, source_image, stat, errmsg)
+integer, intent(in) :: m
+integer, intent(out) :: r
+integer, intent(in) :: source_image
+integer, intent(out), optional :: stat
+character(len=*), intent(inout), optional :: errmsg
+
+integer :: data_length
+
+data_length = 1
+
+call co_broadcast(data_length, source_image, stat, errmsg)
+
+if (present(stat)) then
+if (stat /= 0) return
+end if
+
+if (this_image() == source_image) then
+r = m
+end if
+
+call co_broadcast(r, source_image, stat, errmsg)
+end subroutine
+
+end program
+


Re: [PATCH] c++: tsubst_function_decl and excess arg levels [PR100102]

2021-06-04 Thread Jason Merrill via Gcc-patches

On 6/3/21 11:46 PM, Patrick Palka wrote:

Here, when instantiating the dependent alias template
duration::__is_harmonic with args={{T,U},{int}}, we find ourselves
substituting the function decl _S_gcd.  Since we have more arg levels
than _S_gcd has parm levels, an old special case in tsubst_function_decl
causes us to unwantedly reduce args to its innermost level, yielding
args={int}, which leads to a nonsensical substitution into the decl's
context and an eventual crash.

The comment for this special case refers to three examples for which we
ought to see more arg levels than parm levels here, but none of the
examples actually demonstrate this.  In the first example, when
defining S::f(U) parms_depth is 2 and args_depth is 1, and
later when instantiating say S::f both depths are 2.  In the
second example, when substituting the template friend declaration
parms_depth is 2 and args_depth is 1, and later when instantiating f
both depths are 1.  Finally, the third example is invalid since we can't
specialize a member template of an unspecialized class template like
that.

Given that this reduction code seems no longer relevant for its
documented purpose and that it causes problems as in the PR, this patch
just removes it.  Note that as far as bootstrap/regtest is concerned,
this code is dead; the below two tests would be the first to trigger the
removed code.


Interesting!


Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps backports?  Also tested on various other libraries,
e.g. range-v3 and cmcstl2.


OK I think for 10/11/12; 9 doesn't have the  change that 
revealed this issue.



PR c++/100102

gcc/cp/ChangeLog:

* pt.c (tsubst_function_decl): Remove old code for reducing
args when it has excess levels.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-72.C: New test.
* g++.dg/cpp0x/alias-decl-72a.C: New test.
---
  gcc/cp/pt.c | 39 -
  gcc/testsuite/g++.dg/cpp0x/alias-decl-72.C  |  9 +
  gcc/testsuite/g++.dg/cpp0x/alias-decl-72a.C |  9 +
  3 files changed, 18 insertions(+), 39 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-72.C
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-72a.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 3cac073ed50..a6acdf864d1 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13909,45 +13909,6 @@ tsubst_function_decl (tree t, tree args, tsubst_flags_t complain,
  if (tree spec = retrieve_specialization (gen_tmpl, argvec, hash))
return spec;
}
-
-  /* We can see more levels of arguments than parameters if
-there was a specialization of a member template, like
-this:
-
-template  struct S { template  void f(); }
-template <> template  void S::f(U);
-
-Here, we'll be substituting into the specialization,
-because that's where we can find the code we actually
-want to generate, but we'll have enough arguments for
-the most general template.
-
-We also deal with the peculiar case:
-
-template  struct S {
-  template  friend void f();
-};
-template  void f() {}
-template S;
-template void f();
-
-Here, the ARGS for the instantiation of will be {int,
-double}.  But, we only need as many ARGS as there are
-levels of template parameters in CODE_PATTERN.  We are
-careful not to get fooled into reducing the ARGS in
-situations like:
-
-template  struct S { template  void f(U); }
-template  template <> void S::f(int) {}
-
-which we can spot because the pattern will be a
-specialization in this case.  */
-  int args_depth = TMPL_ARGS_DEPTH (args);
-  int parms_depth =
-   TMPL_PARMS_DEPTH (DECL_TEMPLATE_PARMS (DECL_TI_TEMPLATE (t)));
-
-  if (args_depth > parms_depth && !DECL_TEMPLATE_SPECIALIZATION (t))
-   args = get_innermost_template_args (args, parms_depth);
  }
else
  {
diff --git a/gcc/testsuite/g++.dg/cpp0x/alias-decl-72.C 
b/gcc/testsuite/g++.dg/cpp0x/alias-decl-72.C
new file mode 100644
index 000..8009756dcba
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alias-decl-72.C
@@ -0,0 +1,9 @@
+// PR c++/100102
+// { dg-do compile { target c++11 } }
+
+template struct ratio;
+template struct duration {
+  static constexpr int _S_gcd();
+  template using __is_harmonic = ratio<_S_gcd>;
+  using type = __is_harmonic;
+};
diff --git a/gcc/testsuite/g++.dg/cpp0x/alias-decl-72a.C 
b/gcc/testsuite/g++.dg/cpp0x/alias-decl-72a.C
new file mode 100644
index 000..a4443e18f9d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alias-decl-72a.C
@@ -0,0 +1,9 @@
+// PR c++/100102
+// { dg-do compile { target c++11 } }
+
+template struct ratio;
+template struct duration {
+  static constexpr int _S_gcd();
+  template using __is_harmonic = ratio<(duration::_S_gcd)()>;
+ 

Re: PING [PATCH] PR fortran/99839 - [9/10/11/12 Regression] ICE in inline_matmul_assign, at fortran/frontend-passes.c:4234

2021-06-04 Thread Paul Richard Thomas via Gcc-patches
Hi Harald,

Looks good to me - OK for as many branches as you have sufficient fortitude
for.

Regards

Paul


On Thu, 3 Jun 2021 at 21:22, Harald Anlauf via Fortran 
wrote:

> *PING*
>
> > Sent: Thursday, 27 May 2021 at 22:20
> > From: "Harald Anlauf" 
> > To: "fortran" , "gcc-patches" <
> gcc-patches@gcc.gnu.org>
> > Subject: [PATCH] PR fortran/99839 - [9/10/11/12 Regression] ICE in
> inline_matmul_assign, at fortran/frontend-passes.c:4234
> >
> > Dear Fortranners,
> >
> > frontend optimization tries to inline matmul, but then it also needs
> > to take care of the assignment to the result array.  If that one is
> > not of canonical type, we currently get an ICE.  The straightforward
> > solution is to simply punt in those cases and avoid inlining.
> >
> > Regtested on x86_64-pc-linux-gnu.
> >
> > OK for mainline?  Backport to affected branches?
> >
> > Thanks,
> > Harald
> >
> >
> > Fortran - ICE in inline_matmul_assign
> >
> > Restrict inlining of matmul to those cases where assignment to the
> > result array does not need special treatment.
> >
> > gcc/fortran/ChangeLog:
> >
> >   PR fortran/99839
> >   * frontend-passes.c (inline_matmul_assign): Do not inline matmul
> >   if the assignment to the resulting array is not of canonical
> >   type (real/integer/complex/logical).
> >
> > gcc/testsuite/ChangeLog:
> >
> >   PR fortran/99839
> >   * gfortran.dg/inline_matmul_25.f90: New test.
> >
> >
>


-- 
"If you can't explain it simply, you don't understand it well enough" -
Albert Einstein


Re: Generate 128-bit divide/modulus

2021-06-04 Thread will schmidt via Gcc-patches
On Fri, 2021-06-04 at 11:10 -0400, Michael Meissner wrote:


Hi,


> Generate 128-bit divide/modulus.
> 
> This patch adds support for the VDIVSQ, VDIVUQ, VMODSQ, and VMODUQ
> instructions to do 128-bit arithmetic.

Should vdivsq, vdivuq, vmodsq, and vmoduq be lowercase?

> 
> I have tested this on 3 compilers:
> * Power9 little endian, --with-cpu=power9
> * Power8 big endian, --with-cpu=power8, both 32/64-bit tested
> * Power10 little endian, --with-cpu=power10
> 
> There were no issues found in the runs.  Can I check this into the
> master
> branch and later into the GCC 11 branch after a soak-in period?





> 
> gcc/
> 2021-06-03  Michael Meissner  
> 
>   PR target/100809

Add some reference to [PR/100809] in the subject?

From the GCC bugzilla:

> Comment 3 Michael Meissner 2021-06-01 22:55:20 UTC
> 
> Carl Love submitted a patch for this on April 26th.
> 
> Comment 4 Michael Meissner 2021-06-01 22:58:31 UTC
> 
> Note, in looking at Carl's patch, it is only for adding the
> built-ins.  I don't believe it adds direct support for {,u}divti3 and
> {,u}modti3 to implement these for normal __int128 variables.
> 

A few words to clarify the situation in the description would be good.
Since that patch did not directly address the PR, I imagine it was a
happy accident that it partially implemented/resolved the situation
here.


>   * config/rs6000/rs6000.md (udivti3): New insn.
>   (divti3): New insn.
>   (umodti3): New insn.
>   (modti3): New insn.

ok

> 
> gcc/testsuite/
> 2021-06-03  Michael Meissner  
> 
>   PR target/100809
>   * gcc.target/powerpc/p10-vdiv-vmod.c: New test.


ok



> ---
>  gcc/config/rs6000/rs6000.md   | 34
> +++
>  .../gcc.target/powerpc/p10-vdivq-vmodq.c  | 27 +++
>  2 files changed, 61 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/p10-vdivq-
> vmodq.c
> 
> diff --git a/gcc/config/rs6000/rs6000.md
> b/gcc/config/rs6000/rs6000.md
> index 2517901f239..e70dbe409df 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -3234,6 +3234,14 @@ (define_insn "udiv3"
>[(set_attr "type" "div")
> (set_attr "size" "")])
> 
> +(define_insn "udivti3"
> +  [(set (match_operand:TI 0 "altivec_register_operand" "=v")
> +(udiv:TI (match_operand:TI 1 "altivec_register_operand" "v")
> +  (match_operand:TI 2 "altivec_register_operand" "v")))]
> +  "TARGET_POWER10 && TARGET_POWERPC64"
> +  "vdivuq %0,%1,%2"
> +  [(set_attr "type" "vecdiv")
> +   (set_attr "size" "128")])
> 
>  ;; For powers of two we can do sra[wd]i/addze for divide and then
> adjust for
>  ;; modulus.  If it isn't a power of two, force operands into
> register and do
> @@ -3324,6 +3332,15 @@ (define_insn_and_split "*div3_sra_dot2"
> (set_attr "length" "8,12")
> (set_attr "cell_micro" "not")])
> 
> +(define_insn "divti3"
> +  [(set (match_operand:TI 0 "altivec_register_operand" "=v")
> +(div:TI (match_operand:TI 1 "altivec_register_operand" "v")
> + (match_operand:TI 2 "altivec_register_operand" "v")))]
> +  "TARGET_POWER10 && TARGET_POWERPC64"
> +  "vdivsq %0,%1,%2"
> +  [(set_attr "type" "vecdiv")
> +   (set_attr "size" "128")])
> +
>  (define_expand "mod3"
>[(set (match_operand:GPR 0 "gpc_reg_operand")
>   (mod:GPR (match_operand:GPR 1 "gpc_reg_operand")
> @@ -3424,6 +3441,23 @@ (define_peephole2
>   (minus:GPR (match_dup 1)
>  (match_dup 3)))])
> 
> +(define_insn "umodti3"
> +  [(set (match_operand:TI 0 "altivec_register_operand" "=v")
> +(umod:TI (match_operand:TI 1 "altivec_register_operand" "v")
> +  (match_operand:TI 2 "altivec_register_operand" "v")))]
> +  "TARGET_POWER10 && TARGET_POWERPC64"
> +  "vmoduq %0,%1,%2"
> +  [(set_attr "type" "vecdiv")
> +   (set_attr "size" "128")])
> +
> +(define_insn "modti3"
> +  [(set (match_operand:TI 0 "altivec_register_operand" "=v")
> +(mod:TI (match_operand:TI 1 "altivec_register_operand" "v")
> + (match_operand:TI 2 "altivec_register_operand" "v")))]
> +  "TARGET_POWER10 && TARGET_POWERPC64"
> +  "vmodsq %0,%1,%2"
> +  [(set_attr "type" "vecdiv")
> +   (set_attr "size" "128")])

ok

>  
>  ;; Logical instructions
>  ;; The logical instructions are mostly combined by using
> match_operator,
> diff --git a/gcc/testsuite/gcc.target/powerpc/p10-vdivq-vmodq.c
> b/gcc/testsuite/gcc.target/powerpc/p10-vdivq-vmodq.c
> new file mode 100644
> index 000..cd29b0a4b6b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/p10-vdivq-vmodq.c
> @@ -0,0 +1,27 @@
> +/* { dg-require-effective-target lp64 } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */

ok

> +
> +unsigned __int128 u_div(unsigned __int128 a, unsigned __int128 b)
> +{
> +   return a/b;
> +}
> +
> +unsigned __int128 u_mod(unsigned __int128 a, unsigned __int128 b)
> +{
> +   return a

[PATCH] i386: Add init pattern for V2HI vectors [PR100637]

2021-06-04 Thread Uros Bizjak via Gcc-patches
2021-06-03  Uroš Bizjak  

gcc/
PR target/100637
* config/i386/i386-expand.c (ix86_expand_vector_init_duplicate):
Handle V2HI mode.
(ix86_expand_vector_init_general): Ditto.
Use SImode instead of word_mode for logic operations
when GET_MODE_SIZE (mode) < UNITS_PER_WORD.
(expand_vec_perm_even_odd_1): Assert that V2HI mode should be
implemented by expand_vec_perm_1.
(expand_vec_perm_broadcast_1): Assert that V2HI and V4HI modes
should be implemented using standard shuffle patterns.
(ix86_vectorize_vec_perm_const): Handle V2HImode.  Add V4HI and
V2HI modes to modes, implementable with shuffle for one operand.
* config/i386/mmx.md (*punpckwd): New insn_and_split pattern.
(*pshufw_1): New insn pattern.
(*vec_dupv2hi): Ditto.
(vec_initv2hihi): New expander.

gcc/testsuite/

PR target/100637
* gcc.dg/vect/slp-perm-9.c (dg-final): Adjust dumps for vect32 targets.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.
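
For readers unfamiliar with the general vector init path, the shift/IOR loop that now runs in SImode for sub-word vectors amounts to the packing below.  This is a plain C++ sketch of the idea, not GCC code: pack_v2hi and dup_v2hi are hypothetical names, and the layout assumes element 0 ends up in the low half of the word, as the loop order in ix86_expand_vector_init_general suggests.

```cpp
#include <cstdint>

// Hypothetical model of packing a two-element HImode vector into one
// 32-bit word.  Start with the highest-numbered element, then shift it
// up and OR in the lower one.
std::uint32_t pack_v2hi(std::uint16_t elt0, std::uint16_t elt1) {
  std::uint32_t word = elt1;  // highest-numbered element first
  word <<= 16;                // make room for the next element
  word |= elt0;               // element 0 lands in the low 16 bits
  return word;
}

// Hypothetical model of the duplicate case: the same value in both lanes.
std::uint32_t dup_v2hi(std::uint16_t val) {
  return pack_v2hi(val, val);
}
```

Doing this in a 32-bit mode rather than word_mode matters on 64-bit targets, where a 4-byte vector is smaller than a word.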

Uros.
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index eb7cdb0c14f..661d91abe4e 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -13723,6 +13723,19 @@ ix86_expand_vector_init_duplicate (bool mmx_ok, 
machine_mode mode,
}
   goto widen;
 
+case E_V2HImode:
+  if (TARGET_SSE2)
+   {
+ rtx x;
+
+ val = gen_lowpart (SImode, val);
+ x = gen_rtx_TRUNCATE (HImode, val);
+ x = gen_rtx_VEC_DUPLICATE (mode, x);
+ emit_insn (gen_rtx_SET (target, x));
+ return true;
+   }
+  return false;
+
 case E_V8QImode:
   if (!mmx_ok)
return false;
@@ -14524,6 +14537,8 @@ quarter:
 
 case E_V4HImode:
 case E_V8QImode:
+
+case E_V2HImode:
   break;
 
 default:
@@ -14532,12 +14547,14 @@ quarter:
 
 {
   int i, j, n_elts, n_words, n_elt_per_word;
-  machine_mode inner_mode;
+  machine_mode tmp_mode, inner_mode;
   rtx words[4], shift;
 
+  tmp_mode = (GET_MODE_SIZE (mode) < UNITS_PER_WORD) ? SImode : word_mode;
+
   inner_mode = GET_MODE_INNER (mode);
   n_elts = GET_MODE_NUNITS (mode);
-  n_words = GET_MODE_SIZE (mode) / UNITS_PER_WORD;
+  n_words = GET_MODE_SIZE (mode) / GET_MODE_SIZE (tmp_mode);
   n_elt_per_word = n_elts / n_words;
   shift = GEN_INT (GET_MODE_BITSIZE (inner_mode));
 
@@ -14548,15 +14565,15 @@ quarter:
  for (j = 0; j < n_elt_per_word; ++j)
{
  rtx elt = XVECEXP (vals, 0, (i+1)*n_elt_per_word - j - 1);
- elt = convert_modes (word_mode, inner_mode, elt, true);
+ elt = convert_modes (tmp_mode, inner_mode, elt, true);
 
  if (j == 0)
word = elt;
  else
{
- word = expand_simple_binop (word_mode, ASHIFT, word, shift,
+ word = expand_simple_binop (tmp_mode, ASHIFT, word, shift,
  word, 1, OPTAB_LIB_WIDEN);
- word = expand_simple_binop (word_mode, IOR, word, elt,
+ word = expand_simple_binop (tmp_mode, IOR, word, elt,
  word, 1, OPTAB_LIB_WIDEN);
}
}
@@ -14570,14 +14587,14 @@ quarter:
{
  rtx tmp = gen_reg_rtx (mode);
  emit_clobber (tmp);
- emit_move_insn (gen_lowpart (word_mode, tmp), words[0]);
- emit_move_insn (gen_highpart (word_mode, tmp), words[1]);
+ emit_move_insn (gen_lowpart (tmp_mode, tmp), words[0]);
+ emit_move_insn (gen_highpart (tmp_mode, tmp), words[1]);
  emit_move_insn (target, tmp);
}
   else if (n_words == 4)
{
  rtx tmp = gen_reg_rtx (V4SImode);
- gcc_assert (word_mode == SImode);
+ gcc_assert (tmp_mode == SImode);
  vals = gen_rtx_PARALLEL (V4SImode, gen_rtvec_v (4, words));
  ix86_expand_vector_init_general (false, V4SImode, tmp, vals);
  emit_move_insn (target, gen_lowpart (mode, tmp));
@@ -19544,6 +19561,7 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d 
*d, unsigned odd)
 case E_V2DImode:
 case E_V2SImode:
 case E_V4SImode:
+case E_V2HImode:
   /* These are always directly implementable by expand_vec_perm_1.  */
   gcc_unreachable ();
 
@@ -19754,6 +19772,8 @@ expand_vec_perm_broadcast_1 (struct expand_vec_perm_d 
*d)
 case E_V2DImode:
 case E_V2SImode:
 case E_V4SImode:
+case E_V2HImode:
+case E_V4HImode:
   /* These are always implementable using standard shuffle patterns.  */
   gcc_unreachable ();
 
@@ -20263,6 +20283,10 @@ ix86_vectorize_vec_perm_const (machine_mode vmode, rtx 
target, rtx op0,
   if (!TARGET_MMX_WITH_SSE)
return false;
   break;
+case E_V2HImode:
+   if (!TARGET_SSE2)
+ return false;
+   break;
 case E_V2DImode:
 case

Re: [Patch] Fortran: Fix OpenMP/OpenACC continue-line parsing

2021-06-04 Thread Jakub Jelinek via Gcc-patches
On Fri, Jun 04, 2021 at 05:28:37PM +0200, Tobias Burnus wrote:
> Fortran: Fix OpenMP/OpenACC continue-line parsing
> 
> gcc/fortran/ChangeLog:
> 
>   * scanner.c (skip_fixed_omp_sentinel): Set openacc_flag if
>   this is not an (OpenMP) continuation line.
>   (skip_fixed_oacc_sentinel): Likewise for openmp_flag and OpenACC.
>   (gfc_next_char_literal): gfc_error_now to force error for mixed OMP/ACC
>   continuation once per location and return '\n'.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gfortran.dg/goacc/omp-fixed.f: Re-add test item changed in previous
>   commit in addition - add more dg-errors and '... end ...' due to changed
>   parsing.
>   * gfortran.dg/goacc/omp.f95: Likewise.
>   * gfortran.dg/goacc-gomp/mixed-1.f: New test.

LGTM, thanks.

Jakub



[Patch] Fortran: Fix OpenMP/OpenACC continue-line parsing

2021-06-04 Thread Tobias Burnus

Hi all, hi Jakub & Thomas,

I ran into this issue with the previous patch, where

!$omp  parallel &
!$acc& loop

no longer reported an error; hence, I changed 'loop' to 'kernels loop',
as a buffered 'gfc_error' might not be output.

Having no error is very unfortunate.
There is no ideal solution for the problem, but I think the
attached patch makes sense.

I now include the original version besides the patch version of
the 'parallel' / '(kernels )loop' test.


This patch now does:

* For Fortran's free source form:
  - use 'gfc_error_now' to ensure an error is printed
  - cache the error location such that the same error is not
shown multiple times
  - Return '\n' to avoid parsing the code again.

* For Fortran's fixed source form:
  - Likewise
Except:
!$OMP  ...
!$ACC  ...
Here, '!$ACC ' is not a continuation line! Not even when
placing a '&' in column > 72 (as those columns are ignored).
Thus, handle this !$ACC as separate statement.


SIDE EFFECT: Due to returning '\n', the parsing of
 !$OMP parallel &
 !$ACC LOOP
now succeeds: '!$OMP parallel' + '!$ACC LOOP' - such that
the rest of the syntax/semantic rules apply.

Thus, (after printing the gfc_error_now), gfortran continues and
- there is now the resolution-time error about invalid nesting of OMP/ACC
- the '!$OMP parallel' now requires an '!$OMP end parallel' to avoid
  parsing errors.
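
The "report once per location" part can be sketched independently of the scanner.  The following is a simplified model, not the gfortran code: Locus, OnceReporter and report are made-up names, and a line/column pair stands in for the lb/nextc fields that the patch compares.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a source location.
struct Locus {
  int line;
  int column;
};

// Emits a diagnostic only if it is at a different location than the
// previous one, then remembers the location either way.
class OnceReporter {
 public:
  // Returns true if the message was actually emitted.
  bool report(const Locus& where, const std::string& message) {
    const bool is_new =
        where.line != last_.line || where.column != last_.column;
    if (is_new)
      emitted_.push_back(message);
    last_ = where;  // cache the location so a re-scan stays silent
    return is_new;
  }

  const std::vector<std::string>& emitted() const { return emitted_; }

 private:
  Locus last_{-1, -1};  // no error reported yet
  std::vector<std::string> emitted_;
};
```

Only the most recent location is cached, which is enough when the same line is re-scanned immediately; it would not suppress a repeat after an unrelated error in between.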


OK? More comments?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München
Registration court: Munich, HRB 106955; Managing directors: Thomas Heurung, Frank Thürauf

Fortran: Fix OpenMP/OpenACC continue-line parsing

gcc/fortran/ChangeLog:

	* scanner.c (skip_fixed_omp_sentinel): Set openacc_flag if
	this is not an (OpenMP) continuation line.
	(skip_fixed_oacc_sentinel): Likewise for openmp_flag and OpenACC.
	(gfc_next_char_literal): gfc_error_now to force error for mixed OMP/ACC
	continuation once per location and return '\n'.

gcc/testsuite/ChangeLog:

	* gfortran.dg/goacc/omp-fixed.f: Re-add test item changed in previous
	commit in addition - add more dg-errors and '... end ...' due to changed
	parsing.
	* gfortran.dg/goacc/omp.f95: Likewise.
	* gfortran.dg/goacc-gomp/mixed-1.f: New test.

 gcc/fortran/scanner.c  | 35 +-
 gcc/testsuite/gfortran.dg/goacc-gomp/mixed-1.f | 23 +
 gcc/testsuite/gfortran.dg/goacc/omp-fixed.f| 10 +++-
 gcc/testsuite/gfortran.dg/goacc/omp.f95| 12 +
 4 files changed, 67 insertions(+), 13 deletions(-)

diff --git a/gcc/fortran/scanner.c b/gcc/fortran/scanner.c
index 74c5461ed6f..39db0994b62 100644
--- a/gcc/fortran/scanner.c
+++ b/gcc/fortran/scanner.c
@@ -942,6 +942,8 @@ skip_fixed_omp_sentinel (locus *start)
 	  && (continue_flag
 	  || c == ' ' || c == '\t' || c == '0'))
 	{
+	  if (c == ' ' || c == '\t' || c == '0')
+	openacc_flag = 0;
 	  do
 	c = next_char ();
 	  while (gfc_is_whitespace (c));
@@ -971,6 +973,8 @@ skip_fixed_oacc_sentinel (locus *start)
 	  && (continue_flag
 	  || c == ' ' || c == '\t' || c == '0'))
 	{
+	  if (c == ' ' || c == '\t' || c == '0')
+	openmp_flag = 0;
 	  do
 	c = next_char ();
 	  while (gfc_is_whitespace (c));
@@ -1205,6 +1209,7 @@ gfc_skip_comments (void)
 gfc_char_t
 gfc_next_char_literal (gfc_instring in_string)
 {
+  static locus omp_acc_err_loc = {};
   locus old_loc;
   int i, prev_openmp_flag, prev_openacc_flag;
   gfc_char_t c;
@@ -1403,14 +1408,16 @@ restart:
 	{
 	  if (gfc_wide_tolower (c) != (unsigned char) "!$acc"[i])
 		is_openmp = 1;
-	  if (i == 4)
-		old_loc = gfc_current_locus;
 	}
-	  gfc_error (is_openmp
-		 ? G_("Wrong OpenACC continuation at %C: "
-			  "expected !$ACC, got !$OMP")
-		 : G_("Wrong OpenMP continuation at %C: "
-			  "expected !$OMP, got !$ACC"));
+	  if (omp_acc_err_loc.nextc != gfc_current_locus.nextc
+	  || omp_acc_err_loc.lb != gfc_current_locus.lb)
+	gfc_error_now (is_openmp
+			   ? G_("Wrong OpenACC continuation at %C: "
+"expected !$ACC, got !$OMP")
+			   : G_("Wrong OpenMP continuation at %C: "
+"expected !$OMP, got !$ACC"));
+	  omp_acc_err_loc = gfc_current_locus;
+	  goto not_continuation;
 	}
 
   if (c != '&')
@@ -1511,11 +1518,15 @@ restart:
 	  if (gfc_wide_tolower (c) != (unsigned char) "*$acc"[i])
 		is_openmp = 1;
 	}
-	  gfc_error (is_openmp
-		 ? G_("Wrong OpenACC continuation at %C: "
-			  "expected !$ACC, got !$OMP")
-		 : G_("Wrong OpenMP continuation at %C: "
-			  "expected !$OMP, got !$ACC"));
+	  if (omp_acc_err_loc.nextc != gfc_current_locus.nextc
+	  || omp_acc_err_loc.lb != gfc_current_locus.lb)
+	gfc_error_now (is_openmp
+			   ? G_("Wrong OpenACC continuation at %C: "
+"expected !$ACC, got !$OMP")
+			   : G_("Wrong OpenMP continuation at %C: "
+"expected !$OMP, got !$ACC"));
+	  omp_acc_err_loc = gfc_current_locus;
+	  goto not_continuation;
 	}
   else if (!openmp_flag && !openacc_flag)
 	for (i = 0; 

Re: [Patch, fortran] PR fortran/100120/100816/100818/100819/100821 problems raised by aggregate data types

2021-06-04 Thread Paul Richard Thomas via Gcc-patches
Hi José,

I can second Dominique's thanks. I applied it to my tree when you first
posted, set the regtest in motion and have not been able to return to
gfortran matters since.

OK for master.

I am especially happy that you have tackled this area and have rationalised
it to a substantial degree. The wheel keeps being re-invented by different
people, largely for a lack of documentation or coherent self-documentation.
I know, as one of the guilty ones.

Regards

Paul


On Thu, 3 Jun 2021 at 16:05, dhumieres.dominique--- via Fortran <
fort...@gcc.gnu.org> wrote:

> Hi José,
>
> > Patch tested only on x86_64-pc-linux-gnu.
>
> Also tested on darwin20. The patch is OK for me.
>
> Thanks for the work,
>
> Dominique
>


-- 
"If you can't explain it simply, you don't understand it well enough" -
Albert Einstein


Generate 128-bit divide/modulus

2021-06-04 Thread Michael Meissner via Gcc-patches
Generate 128-bit divide/modulus.

This patch adds support for the VDIVSQ, VDIVUQ, VMODSQ, and VMODUQ
instructions to do 128-bit arithmetic.
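
As a rough picture of the source-level operations involved, the sketch below divides and takes remainders of __int128 values.  It is illustrative only: make_u128 is a made-up helper, __int128 is a GNU extension, and whether the new instructions are selected depends on the target and options (on other targets the compiler may call a library routine instead).

```cpp
#include <cstdint>

// Hypothetical helper: build a 128-bit value from two 64-bit halves.
unsigned __int128 make_u128(std::uint64_t hi, std::uint64_t lo) {
  return (static_cast<unsigned __int128>(hi) << 64) | lo;
}

// Unsigned 128-bit divide and modulus.
unsigned __int128 u_div(unsigned __int128 a, unsigned __int128 b) {
  return a / b;
}

unsigned __int128 u_mod(unsigned __int128 a, unsigned __int128 b) {
  return a % b;
}

// Signed 128-bit divide and modulus; division truncates toward zero.
__int128 s_div(__int128 a, __int128 b) {
  return a / b;
}

__int128 s_mod(__int128 a, __int128 b) {
  return a % b;
}
```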

I have tested this on 3 compilers:
* Power9 little endian, --with-cpu=power9
* Power8 big endian, --with-cpu=power8, both 32/64-bit tested
* Power10 little endian, --with-cpu=power10

There were no issues found in the runs.  Can I check this into the master
branch and later into the GCC 11 branch after a soak-in period?

gcc/
2021-06-03  Michael Meissner  

PR target/100809
* config/rs6000/rs6000.md (udivti3): New insn.
(divti3): New insn.
(umodti3): New insn.
(modti3): New insn.

gcc/testsuite/
2021-06-03  Michael Meissner  

PR target/100809
* gcc.target/powerpc/p10-vdiv-vmod.c: New test.
---
 gcc/config/rs6000/rs6000.md   | 34 +++
 .../gcc.target/powerpc/p10-vdivq-vmodq.c  | 27 +++
 2 files changed, 61 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/p10-vdivq-vmodq.c

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 2517901f239..e70dbe409df 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -3234,6 +3234,14 @@ (define_insn "udiv3"
   [(set_attr "type" "div")
(set_attr "size" "")])
 
+(define_insn "udivti3"
+  [(set (match_operand:TI 0 "altivec_register_operand" "=v")
+(udiv:TI (match_operand:TI 1 "altivec_register_operand" "v")
+(match_operand:TI 2 "altivec_register_operand" "v")))]
+  "TARGET_POWER10 && TARGET_POWERPC64"
+  "vdivuq %0,%1,%2"
+  [(set_attr "type" "vecdiv")
+   (set_attr "size" "128")])
 
 ;; For powers of two we can do sra[wd]i/addze for divide and then adjust for
 ;; modulus.  If it isn't a power of two, force operands into register and do
@@ -3324,6 +3332,15 @@ (define_insn_and_split "*div3_sra_dot2"
(set_attr "length" "8,12")
(set_attr "cell_micro" "not")])
 
+(define_insn "divti3"
+  [(set (match_operand:TI 0 "altivec_register_operand" "=v")
+(div:TI (match_operand:TI 1 "altivec_register_operand" "v")
+   (match_operand:TI 2 "altivec_register_operand" "v")))]
+  "TARGET_POWER10 && TARGET_POWERPC64"
+  "vdivsq %0,%1,%2"
+  [(set_attr "type" "vecdiv")
+   (set_attr "size" "128")])
+
 (define_expand "mod3"
   [(set (match_operand:GPR 0 "gpc_reg_operand")
(mod:GPR (match_operand:GPR 1 "gpc_reg_operand")
@@ -3424,6 +3441,23 @@ (define_peephole2
(minus:GPR (match_dup 1)
   (match_dup 3)))])
 
+(define_insn "umodti3"
+  [(set (match_operand:TI 0 "altivec_register_operand" "=v")
+(umod:TI (match_operand:TI 1 "altivec_register_operand" "v")
+(match_operand:TI 2 "altivec_register_operand" "v")))]
+  "TARGET_POWER10 && TARGET_POWERPC64"
+  "vmoduq %0,%1,%2"
+  [(set_attr "type" "vecdiv")
+   (set_attr "size" "128")])
+
+(define_insn "modti3"
+  [(set (match_operand:TI 0 "altivec_register_operand" "=v")
+(mod:TI (match_operand:TI 1 "altivec_register_operand" "v")
+   (match_operand:TI 2 "altivec_register_operand" "v")))]
+  "TARGET_POWER10 && TARGET_POWERPC64"
+  "vmodsq %0,%1,%2"
+  [(set_attr "type" "vecdiv")
+   (set_attr "size" "128")])
 
 ;; Logical instructions
 ;; The logical instructions are mostly combined by using match_operator,
diff --git a/gcc/testsuite/gcc.target/powerpc/p10-vdivq-vmodq.c 
b/gcc/testsuite/gcc.target/powerpc/p10-vdivq-vmodq.c
new file mode 100644
index 000..cd29b0a4b6b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/p10-vdivq-vmodq.c
@@ -0,0 +1,27 @@
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+unsigned __int128 u_div(unsigned __int128 a, unsigned __int128 b)
+{
+   return a/b;
+}
+
+unsigned __int128 u_mod(unsigned __int128 a, unsigned __int128 b)
+{
+   return a%b;
+}
+__int128 s_div(__int128 a, __int128 b)
+{
+   return a/b;
+}
+
+__int128 s_mod(__int128 a, __int128 b)
+{
+   return a%b;
+}
+
+/* { dg-final { scan-assembler {\mvdivsq\M} } } */
+/* { dg-final { scan-assembler {\mvdivuq\M} } } */
+/* { dg-final { scan-assembler {\mvmodsq\M} } } */
+/* { dg-final { scan-assembler {\mvmoduq\M} } } */
-- 
2.31.1


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: GCC documentation: porting to Sphinx

2021-06-04 Thread Martin Sebor via Gcc-patches

On 6/3/21 4:56 AM, Martin Liška wrote:

On 6/2/21 10:41 PM, Martin Sebor wrote:

On 5/31/21 7:25 AM, Martin Liška wrote:

Hello.

I've made quite some progress with the porting of the documentation and
I would like to present it to the community now:
https://splichal.eu/scripts/sphinx/




Hello.

Thank you for the review.


Just a few issues I noticed in the warnings section:

The headings of some warnings mention the same option twice (e.g.,
-Wabi, -Wabi, -Wno-abi;  -Wdouble-promotion, -Wdouble-promotion,
-Wno-double-promotion;  -Winit-self, -Winit-self, -Wno-init-self).
This looks like a pretty pervasive problem.


You are right, I fixed that.


Looks good.





Mentioning the -Wno-xxx option is redundant in a heading for -Wxxx.


Yes. A good reason for that is that Sphinx can then properly generate links
to the currently non-documented form of the option. I hope it's an improvement
over the current situation?


I think the linking is helpful.  But for warnings, the documented
convention is to only mention the one that's not the default:

  This manual lists only one of the two forms, whichever is not
  the default.

so including both blurs this (IMO rather subtle) distinction.
In addition, in options whose description says something like
"This warning is enabled by -Wall." it's now less clear which
one is the one the "this" refers to (see for example
-Wchar-subscripts).

If the heading can't be changed, at a minimum we'll need to update
the convention above, e.g., by saying that the first option mentioned
is the default. But again, I think this is too subtle for the casual
reader to pick up on.  The fact that the sentence quoted above appears
under -Wfatal-errors doesn't help.  We should also work on updating
the "This option is in -Wall." either to name the specific option
it refers to, or consider moving that into a Note box like the one
listing the languages the option applies to.





The headings of some other warnings also mention options that are
only remotely related to them.  E.g., -Wformat has all these:

   -Wformat, -Wno-format, -ffreestanding, -fno-builtin, -Wformat=

(I see the same problem in the attributes section where the headings
for some attributes include option names).

That seems quite puzzling.  I assume it's a consequence of having
index entries for the related options, but I don't think making
them visible in the headings is helpful.


Oh, you are right. It was a consequence of incorrect parsing of index entries.
It should be fixed now.


Looks good.





Headings that in the manual today include a level like

   -Wformat-overflow
   -Wformat-overflow=level

don't mention the level in the Sphinx manual:

   -Wformat-overflow, -Wno-format-overflow

When the /level/ is then discussed in the rest of the text it's
not clear what it refers to.


Should be also fixed now.


Also looks good.



Can you please take a look at the current output and give me feedback?


I noticed another minor issue that may already have been pointed
out by someone else.  Under -Wall (and -Wextra), some option names
are prefixed by :option: (e.g., (only with :option:-O2``).  Looks
like some sort of a transcription bug?

And a couple of questions:

References to options with an argument like -Warray-bounds=1 are
rendered in a way that makes it look like there's a space before
the equals: -Warray-bounds =1, with  the =1 being in a different
color and not part of the hyperlink. Is there a way to make it look
like there is no space?

I like how options are automatically linked, and I'd like to see
the same for other references like to attributes.  Can that be
automated as part of the migration or should we/I try to tackle
it in a followup?

In any event, thanks for working so hard on making this turn out
great!

Martin


Thanks,
Martin



Martin



Note the documentation is automatically ([1]) generated from texinfo
with a GitHub workflow ([2]).
It's built on the devel/sphinx GCC branch, which I periodically merge with
the master branch. One can see the current source .rst files here: [3].

Changes made since the last time:
- a shared content is factored out ([4])
- conditional build is fully supported (even for shared parts)
- manual pages look reasonably well
- folders are created for files which have >= 5 TOC tree entries
- various formatting issues were resolved
- baseconf.py reads BASE-VER, DEV-PHASE, .. files

I've got a couple of questions:

1) Do we have to use the following cover text?
    Copyright (c) 1988-2020 Free Software Foundation, Inc.

    Permission is granted to copy, distribute and/or modify this 
document under the terms of the GNU Free Documentation License, 
Version 1.3 or any later version published by the Free Software 
Foundation; with the Invariant Sections being "GNU General Public
    License" and "Funding Free Software", the Front-Cover texts 
being (a) (see below), and with the Back-Cover Texts being (b) (see 
below).  A copy of the license is included in the gfdl(7) man page.



[committed] libstdc++: Add feature test macro for heterogeneous lookup in unordered containers

2021-06-04 Thread Jonathan Wakely via Gcc-patches
Also update the C++20 status docs.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2020.xml:
* doc/html/*: Regenerate.
* include/bits/hashtable.h (__cpp_lib_generic_unordered_lookup):
Define.
* include/std/version (__cpp_lib_generic_unordered_lookup):
Define.
* testsuite/23_containers/unordered_map/operations/1.cc: Check
feature test macro.
* testsuite/23_containers/unordered_set/operations/1.cc:
Likewise.

Tested powerpc64le-linux. Committed to trunk.

I'll also add a note to the GCC 11 release notes, and I've updated the
compiler support page at cppreference.com
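
For anyone wanting to rely on the macro, here is a small sketch of how user code might test it.  This is an assumption-laden example rather than anything from the patch: StringHash and contains_name are made-up names, and the fallback branch simply constructs a temporary std::string when heterogeneous lookup is unavailable.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical transparent hash: is_transparent opts in to
// heterogeneous lookup when the library supports it.
struct StringHash {
  using is_transparent = void;
  std::size_t operator()(std::string_view s) const noexcept {
    return std::hash<std::string_view>{}(s);
  }
};

using NameSet = std::unordered_set<std::string, StringHash, std::equal_to<>>;

// Hypothetical helper: look up a string_view without allocating when
// the library advertises heterogeneous lookup for unordered containers.
bool contains_name(const NameSet& names, std::string_view key) {
#if defined(__cpp_lib_generic_unordered_lookup) \
    && __cpp_lib_generic_unordered_lookup >= 201811L
  return names.find(key) != names.end();
#else
  return names.find(std::string(key)) != names.end();
#endif
}
```

Either branch gives the same answer; the macro only decides whether the temporary string can be avoided.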

commit f78f25f43864f38ae5a6a9fcce8f26c94fe45bcd
Author: Jonathan Wakely 
Date:   Fri Jun 4 15:59:37 2021

libstdc++: Add feature test macro for heterogeneous lookup in unordered 
containers

Also update the C++20 status docs.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2020.xml:
* doc/html/*: Regenerate.
* include/bits/hashtable.h (__cpp_lib_generic_unordered_lookup):
Define.
* include/std/version (__cpp_lib_generic_unordered_lookup):
Define.
* testsuite/23_containers/unordered_map/operations/1.cc: Check
feature test macro.
* testsuite/23_containers/unordered_set/operations/1.cc:
Likewise.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2020.xml 
b/libstdc++-v3/doc/xml/manual/status_cxx2020.xml
index b62a432eed1..ca12d8023f1 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2020.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2020.xml
@@ -429,13 +429,12 @@ or any notes about the implementation.
 
 
 
-  
Atomic waiting and notifying, std::semaphore, std::latch and 
std::barrier 
   
 http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1135r6.html";>
 P1135R6 
   
-   
+   11.1 
   
 
  __cpp_lib_atomic_lock_free_type_aliases >= 
201907L 
@@ -803,16 +802,25 @@ or any notes about the implementation.
 
 
 
-  
 Heterogeneous lookup for unordered containers 
   
 http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0919r3.html";>
 P0919R3 
   
-   
+   11.1 
__cpp_lib_generic_unordered_lookup >= 201811 

 
 
+
+   Refinement Proposal for P0919 
+  
+http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1690r1.html";>
+P1690R1 
+  
+   11.1 
+  
+
+
 
 Adopt Consistent Container Erasure from Library Fundamentals 2 
for C++20 
   
diff --git a/libstdc++-v3/include/bits/hashtable.h 
b/libstdc++-v3/include/bits/hashtable.h
index 4bdbe7dd9cc..dfc2a2a7800 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -735,7 +735,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   std::pair
   equal_range(const key_type& __k) const;
 
-#if __cplusplus > 201702L
+#if __cplusplus >= 202002L
+#define __cpp_lib_generic_unordered_lookup 201811L
+
   template,
   typename = __has_is_transparent_t<_Equal, _Kt>>
@@ -765,7 +767,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typename = __has_is_transparent_t<_Equal, _Kt>>
pair
_M_equal_range_tr(const _Kt& __k) const;
-#endif
+#endif // C++20
 
 private:
   // Bucket index computation helpers.
diff --git a/libstdc++-v3/include/std/version b/libstdc++-v3/include/std/version
index ea0e18a3f9d..8d0b2b95f34 100644
--- a/libstdc++-v3/include/std/version
+++ b/libstdc++-v3/include/std/version
@@ -169,7 +169,7 @@
 #define __cpp_lib_variant 201606L
 #endif
 
-#if __cplusplus > 201703L
+#if __cplusplus >= 202002L
 // c++20
 #define __cpp_lib_atomic_flag_test 201907L
 #define __cpp_lib_atomic_float 201711L
@@ -225,6 +225,7 @@
 #define __cpp_lib_constexpr_tuple 201811L
 #define __cpp_lib_constexpr_utility 201811L
 #define __cpp_lib_erase_if 202002L
+#define __cpp_lib_generic_unordered_lookup 201811L
 #define __cpp_lib_interpolate 201902L
 #ifdef _GLIBCXX_HAS_GTHREADS
 # define __cpp_lib_jthread 201911L
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_map/operations/1.cc b/libstdc++-v3/testsuite/23_containers/unordered_map/operations/1.cc
index 4f2df728ebb..f310a8a55ed 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_map/operations/1.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_map/operations/1.cc
@@ -18,6 +18,13 @@
 // { dg-do run { target c++20 } }
 
 #include 
+
+#ifndef __cpp_lib_generic_unordered_lookup
+# error "Feature-test macro for generic lookup missing in "
+#elif __cpp_lib_generic_unordered_lookup < 201811L
+# error "Feature-test macro for generic lookup has wron

Re: [PATCH] fold-const: Fix up fold_read_from_vector [PR100887]

2021-06-04 Thread Jakub Jelinek via Gcc-patches
On Fri, Jun 04, 2021 at 04:21:41PM +0200, Jakub Jelinek wrote:
> but if the permutation was e.g.
> { 0, 13, 2, 3, 4, 5, 6, 7 }
> then it would be called with 5 as index and it could see that
> it is in the second half (aka. the { 0, 0, 0, 0 } constructor) and
> read the 5-4 element from there.

Note, if it is just
typedef unsigned long long __attribute__((__vector_size__ (2 * sizeof (long long)))) U;
typedef unsigned long long __attribute__((__vector_size__ (4 * sizeof (long long)))) V;
typedef unsigned long long __attribute__((__vector_size__ (8 * sizeof (long long)))) W;

U
foo (V v)
{
  return __builtin_shufflevector ((W){}, v, 0, 13);
}
then that doesn't work and is diagnosed as an error, so we'd probably need
help of compound literals in there
typedef unsigned long long __attribute__((__vector_size__ (2 * sizeof (long long)))) U;
typedef unsigned long long __attribute__((__vector_size__ (4 * sizeof (long long)))) V;
typedef unsigned long long __attribute__((__vector_size__ (8 * sizeof (long long)))) W;

U
foo (V v)
{
  return __builtin_shufflevector ((W){}, (V){1,2,3,4}, 0, 9);
}
where it is
  _1 = {{ 1, 2, 3, 4 }, { 0, 0, 0, 0 }};
  _2 = VEC_PERM_EXPR <{ 0, 0, 0, 0, 0, 0, 0, 0 }, _1, { 0, 9, 2, 3, 4, 5, 6, 7 }>;

Jakub



Re: RFC: Sphinx for GCC documentation

2021-06-04 Thread Koning, Paul via Gcc-patches


> On Jun 4, 2021, at 3:55 AM, Tobias Burnus  wrote:
> 
> Hello,
> 
> On 13.05.21 13:45, Martin Liška wrote:
>> On 4/1/21 3:30 PM, Martin Liška wrote:
>>> That said, I'm asking the GCC community for a green light before I
>>> invest
>>> more time on it?
>> So far, I've received just a small feedback about the transition. In
>> most cases positive.
>> 
>> [1] https://splichal.eu/scripts/sphinx/
> 
> The HTML output looks quite nice.
> 
> What I observed:
> 
> * Looking at
>  
> https://splichal.eu/scripts/sphinx/gfortran/_build/html/intrinsic-procedures/access-checks-file-access-modes.html
> why is the first argument description in bold?
> It is also not very readable to have a scrollbar there – linebreaks would be 
> better.
> → I think that's because the assumption is that the first line contains a 
> header
>  and the rest the data

Explicit line breaks are likely to be wrong depending on the reader's window 
size.  I would suggest setting the table to have cells with line-wrapped 
contents.  That would typically be the default in HTML; I'm curious why that is 
not happening here.

paul




Re: [PATCH] fold-const: Fix up fold_read_from_vector [PR100887]

2021-06-04 Thread Jakub Jelinek via Gcc-patches
On Fri, Jun 04, 2021 at 04:06:43PM +0200, Richard Biener wrote:
> On June 4, 2021 10:44:42 AM GMT+02:00, Jakub Jelinek  wrote:
> >Hi!
> >
> >The callers of fold_read_from_vector expect that the index they pass is
> >an index of an element in the vector and the function does that most of
> >the
> >time.  But we allow CONSTRUCTORs with VECTOR_TYPE to have VECTOR_TYPE
> >elements and in that case every CONSTRUCTOR element represents not just
> >one
> >index (with the exception of V1 vectors), but multiple.
> >So returning zero vector if i >= CONSTRUCTOR_NELTS or returning some
> >CONSTRUCTOR_ELT's value might not be what the callers expect.
> >
> >Fixed by punting if the first element has vector type.
> >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> >In theory we could instead recurse (and assert that for CONSTRUCTORs of
> >vector elements we have always all elements specified like tree-cfg.c
> >verifies?) after adjusting the index appropriately.
> 
> I think we do this in the corresponding BIT_FIELD_REF match.pd rule. 

The fold_read_from_vector is indeed only called from the BIT_FIELD_REF
match.pd rule (3 times there), but I don't see any check there on the number
of elements or on vector-typed elements.

Do you want me to do this check in match.pd instead of
fold_read_from_vector?

> Note I don't think we allow CONSTRUCTOR (as in GENERIC) elements, so you'd 
> only see SSA names with vector type here? 

The match.pd rule is both GIMPLE and GENERIC, so I don't see why there
couldn't be a VECTOR_TYPE CONSTRUCTOR with CONSTRUCTOR elements
(or e.g. one VAR_DECL element and one CONSTRUCTOR element).
In particular, on the testcase I see in *.original
  W D.2844 = { 0, 0, 0, 0, 0, 0, 0, 0 };

  return VIEW_CONVERT_EXPR(BIT_FIELD_REF < VEC_PERM_EXPR < <<< Unknown tree: 
compound_literal_expr
W D.2844 = { 0, 0, 0, 0, 0, 0, 0, 0 }; >>> , {v, { 0, 0, 0, 0 }} , { 0, 8, 
2, 3, 4, 5, 6, 7 } > , 128, 0>);
In this case fold_read_from_vector is called on the {v, { 0, 0, 0, 0 }}
CONSTRUCTOR with 0 and so the recursion would be on VAR_DECL and would punt,
but if the permutation was e.g.
{ 0, 13, 2, 3, 4, 5, 6, 7 }
then it would be called with 5 as index and it could see that
it is in the second half (aka. the { 0, 0, 0, 0 } constructor) and
read the 5-4 element from there.
Even in GIMPLE, up until just before veclower21, I see
  _1 = {v_3(D), { 0, 0, 0, 0 }};
  _2 = VEC_PERM_EXPR <{ 0, 0, 0, 0, 0, 0, 0, 0 }, _1, { 0, 8, 2, 3, 4, 5, 6, 7 }>;
which looks like a CONSTRUCTOR inside of a CONSTRUCTOR.
But sure, it could be an SSA_NAME too and if we wanted to recurse it
would need to handle that case too and look at SSA_NAME_DEF_STMT.
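
To make the index adjustment concrete, here is a toy model in plain C of what
such a recursion would have to do.  This is only a sketch and not GCC code:
struct toy_ctor and toy_read_elt are made-up names, and a NULL entry stands in
for an element that is not a constant (such as the VAR_DECL v above).

```c
#include <stddef.h>

/* Toy stand-in for a vector CONSTRUCTOR whose elements are themselves
   vectors of SUB_LEN elements each.  A NULL entry models an element that
   is not a constant, where the only safe answer is to punt.  */
struct toy_ctor
{
  const unsigned long long *const *subvecs;
  size_t n_subvecs;
  size_t sub_len;
};

/* Read element I of the flattened vector.  Return 1 and store the value
   in *OUT on success, or return 0 to punt.  */
static int
toy_read_elt (const struct toy_ctor *c, size_t i, unsigned long long *out)
{
  if (c->sub_len == 0)
    return 0;
  size_t outer = i / c->sub_len;   /* which sub-vector holds element I */
  size_t inner = i % c->sub_len;   /* index inside it, e.g. 5 - 4 == 1 */
  /* Punt on missing or non-constant sub-vectors instead of guessing.  */
  if (outer >= c->n_subvecs || c->subvecs[outer] == NULL)
    return 0;
  *out = c->subvecs[outer][inner];
  return 1;
}
```

With two sub-vectors of four elements, where the first is NULL and the second
holds some hypothetical constants, index 5 reads element 1 of the second
sub-vector, while index 0 punts.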

Jakub



Re: [PATCH] fold-const: Fix up fold_read_from_vector [PR100887]

2021-06-04 Thread Richard Biener
On June 4, 2021 10:44:42 AM GMT+02:00, Jakub Jelinek  wrote:
>Hi!
>
>The callers of fold_read_from_vector expect that the index they pass is
>an index of an element in the vector and the function does that most of
>the
>time.  But we allow CONSTRUCTORs with VECTOR_TYPE to have VECTOR_TYPE
>elements and in that case every CONSTRUCTOR element represents not just
>one
>index (with the exception of V1 vectors), but multiple.
>So returning zero vector if i >= CONSTRUCTOR_NELTS or returning some
>CONSTRUCTOR_ELT's value might not be what the callers expect.
>
>Fixed by punting if the first element has vector type.
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
>In theory we could instead recurse (and assert that for CONSTRUCTORs of
>vector elements we have always all elements specified like tree-cfg.c
>verifies?) after adjusting the index appropriately.

I think we do this in the corresponding BIT_FIELD_REF match.pd rule. 

Note I don't think we allow CONSTRUCTOR (as in GENERIC) elements, so you'd only 
see SSA names with vector type here? 

Richard. 

>2021-06-04  Jakub Jelinek  
>
>   PR target/100887
>   * fold-const.c (fold_read_from_vector): Return NULL if trying to
>   read from a CONSTRUCTOR with vector type elements.
>
>   * gcc.dg/pr100887.c: New test.
>
>--- gcc/fold-const.c.jj2021-05-28 11:03:19.507884088 +0200
>+++ gcc/fold-const.c   2021-06-03 14:52:52.616393656 +0200
>@@ -15471,6 +15471,9 @@ fold_read_from_vector (tree arg, poly_ui
>   return VECTOR_CST_ELT (arg, i);
>   else if (TREE_CODE (arg) == CONSTRUCTOR)
>   {
>+if (CONSTRUCTOR_NELTS (arg)
>+&& VECTOR_TYPE_P (TREE_TYPE (CONSTRUCTOR_ELT (arg, 0)->value)))
>+  return NULL_TREE;
> if (i >= CONSTRUCTOR_NELTS (arg))
>   return build_zero_cst (TREE_TYPE (TREE_TYPE (arg)));
> return CONSTRUCTOR_ELT (arg, i)->value;
>--- gcc/testsuite/gcc.dg/pr100887.c.jj 2021-06-03 15:09:07.629898248
>+0200
>+++ gcc/testsuite/gcc.dg/pr100887.c2021-06-03 15:09:48.265335283 +0200
>@@ -0,0 +1,14 @@
>+/* PR target/100887 */
>+/* { dg-do compile } */
>+/* { dg-options "" } */
>+/* { dg-additional-options "-mavx512f" { target { i?86-*-* x86_64-*-*
>} } } */
>+
>+typedef unsigned long long __attribute__((__vector_size__ (2 * sizeof (long long)))) U;
>+typedef unsigned long long __attribute__((__vector_size__ (4 * sizeof (long long)))) V;
>+typedef unsigned long long __attribute__((__vector_size__ (8 * sizeof (long long)))) W;
>+
>+U
>+foo (V v)
>+{
>+  return __builtin_shufflevector ((W){}, v, 0, 8);
>+}
>
>   Jakub



[PATCH v3] AArch64: Improve GOT addressing

2021-06-04 Thread Wilco Dijkstra via Gcc-patches
Hi Richard,

This merges the v1 and v2 patches and removes the spurious MEM from
ldr_got_small_si/di. This has been rebased after [1], and the performance
gain has now doubled.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571708.html

Improve GOT addressing by treating the instructions as a pair.  This reduces
register pressure and improves code quality significantly.  SPECINT2017 improves
by 0.6% with -fPIC and codesize is 0.73% smaller.  Perlbench has 0.9% smaller
codesize, 1.5% fewer executed instructions and is 1.8% faster on Neoverse N1.

Passes bootstrap and regress. OK for commit?

ChangeLog:
2021-06-04  Wilco Dijkstra  

* config/aarch64/aarch64.md (movsi): Split GOT accesses after reload.
(movdi): Likewise.
(ldr_got_small_): Remove MEM and LO_SUM, emit ADRP+LDR GOT sequence.
(ldr_got_small_sidi): Likewise.
* config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Delay
splitting of GOT accesses until after reload. Remove tmp_reg and MEM.
(aarch64_print_operand): Correctly print got_lo12 in L specifier.
(aarch64_rtx_costs): Set rematerialization cost for GOT accesses.
(aarch64_mov_operand_p): Make GOT accesses valid move operands.

---

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 08245827daa3f8199b29031e754244c078f0f500..11ea33c70fb06194fadfe94322fdfa098e5320fc 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3615,6 +3615,14 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
 
 case SYMBOL_SMALL_GOT_4G:
   {
+   /* Use movdi for GOT accesses until after reload - this improves
+  CSE and rematerialization.  */
+   if (!reload_completed)
+ {
+   emit_insn (gen_rtx_SET (dest, imm));
+   return;
+ }
+
/* In ILP32, the mode of dest can be either SImode or DImode,
   while the got entry is always of SImode size.  The mode of
   dest depends on how dest is used: if dest is assigned to a
@@ -3624,34 +3632,21 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
   patterns here (two patterns for ILP32).  */
 
rtx insn;
-   rtx mem;
-   rtx tmp_reg = dest;
machine_mode mode = GET_MODE (dest);
 
-   if (can_create_pseudo_p ())
- tmp_reg = gen_reg_rtx (mode);
-
-   emit_move_insn (tmp_reg, gen_rtx_HIGH (mode, imm));
if (mode == ptr_mode)
  {
if (mode == DImode)
- insn = gen_ldr_got_small_di (dest, tmp_reg, imm);
+ insn = gen_ldr_got_small_di (dest, imm);
else
- insn = gen_ldr_got_small_si (dest, tmp_reg, imm);
-
-   mem = XVECEXP (SET_SRC (insn), 0, 0);
+ insn = gen_ldr_got_small_si (dest, imm);
  }
else
  {
gcc_assert (mode == Pmode);
-
-   insn = gen_ldr_got_small_sidi (dest, tmp_reg, imm);
-   mem = XVECEXP (XEXP (SET_SRC (insn), 0), 0, 0);
+   insn = gen_ldr_got_small_sidi (dest, imm);
  }
 
-   gcc_assert (MEM_P (mem));
-   MEM_READONLY_P (mem) = 1;
-   MEM_NOTRAP_P (mem) = 1;
emit_insn (insn);
return;
   }
@@ -11019,7 +11014,7 @@ aarch64_print_operand (FILE *f, rtx x, int code)
   switch (aarch64_classify_symbolic_expression (x))
{
case SYMBOL_SMALL_GOT_4G:
- asm_fprintf (asm_out_file, ":lo12:");
+ asm_fprintf (asm_out_file, ":got_lo12:");
  break;
 
case SYMBOL_SMALL_TLSGD:
@@ -13452,6 +13447,12 @@ cost_plus:
 
 case SYMBOL_REF:
   *cost = 0;
+
+  /* Use a separate rematerialization cost for GOT accesses.  */
+  if (aarch64_cmodel == AARCH64_CMODEL_SMALL_PIC
+ && aarch64_classify_symbol (x, 0) == SYMBOL_SMALL_GOT_4G)
+   *cost = COSTS_N_INSNS (1) / 2;
+
   return true;
 
 case HIGH:
@@ -19907,6 +19908,11 @@ aarch64_mov_operand_p (rtx x, machine_mode mode)
   return aarch64_simd_valid_immediate (x, NULL);
 }
 
+  /* GOT accesses are valid moves until after regalloc.  */
+  if (SYMBOL_REF_P (x)
+  && aarch64_classify_symbolic_expression (x) == SYMBOL_SMALL_GOT_4G)
+return true;
+
   x = strip_salt (x);
   if (SYMBOL_REF_P (x) && mode == DImode && CONSTANT_ADDRESS_P (x))
 return true;
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index abfd84526745d029ad4953eabad6dd17b159a218..30effca6f3562f6870a6cc8097750e63bb0d424d 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1283,8 +1283,11 @@ (define_insn_and_split "*movsi_aarch64"
fmov\\t%w0, %s1
fmov\\t%s0, %s1
* return aarch64_output_scalar_simd_mov_immediate (operands[1], SImode);"
-  "CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), 
SImode)
-&& REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))"
+  "(CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands

Re: [PATCH] arm: Auto-vectorization for MVE and Neon: vhadd/vrhadd

2021-06-04 Thread Christophe Lyon via Gcc-patches
On Wed, 2 Jun 2021 at 20:19, Richard Sandiford
 wrote:
>
> Christophe Lyon  writes:
> > This patch adds support for auto-vectorization of average value
> > computation using vhadd or vrhadd, for both MVE and Neon.
> >
> > The patch adds the needed [u]avg3_[floor|ceil] patterns to
> > vec-common.md, I'm not sure how to factorize them without introducing
> > an unspec iterator?
>
> Yeah, an int iterator would be one way, but I'm not sure it would
> make things better given the differences in how Neon and MVE handle
> their unspecs.
>
> > It also adds tests for 'floor' and for 'ceil', each for MVE and Neon.
> >
> > Vectorization works with 8-bit and 16 bit input/output vectors, but
> > not with 32-bit ones because the vectorizer expects wider types
> > availability for the intermediate values, but int32_t + int32_t does
> > not involve wider types in the IR.
>
> Right.  Like you say, it's only valid to use V(R)HADD if, in the source
> code, the addition and shift have a wider precision than the operands.
> That happens naturally for 8-bit and 16-bit operands, since C arithmetic
> promotes them to "int" first.  But for 32-bit operands, the C code needs
> to do the addition and shift in 64 bits.  Doing them in 64 bits should
> be fine for narrower operands too.
>
> So:
>
> > diff --git a/gcc/testsuite/gcc.target/arm/simd/mve-vhadd-1.c 
> > b/gcc/testsuite/gcc.target/arm/simd/mve-vhadd-1.c
> > new file mode 100644
> > index 000..40489ecc67d
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/arm/simd/mve-vhadd-1.c
> > @@ -0,0 +1,31 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> > +/* { dg-add-options arm_v8_1m_mve } */
> > +/* { dg-additional-options "-O3" } */
> > +
> > +#include 
> > +
> > +#define FUNC(SIGN, TYPE, BITS, OP, NAME) \
> > +  void test_ ## NAME ##_ ## SIGN ## BITS (TYPE##BITS##_t * __restrict__ 
> > dest, \
> > +   TYPE##BITS##_t *a, TYPE##BITS##_t 
> > *b) { \
> > +int i;   \
> > +for (i=0; i < (128 / BITS); i++) { 
> >   \
> > +  dest[i] = (a[i] OP b[i]) >> 1; \
> > +}  
> >   \
> > +}
> > +
>
> …it should work if you make this "((int64_t) a[i] OP b[i]) >> 1".

Indeed. However, this may not be obvious for end-users :-(

I've updated my patch as attached: added the (int64_t) cast and
removed the xfail clauses.
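
As a minimal sketch of what the cast changes for 32-bit operands (the function
names are made up for illustration and are not taken from the testcases), the
scalar code has to do the addition and shift in 64 bits:

```c
#include <stdint.h>

/* Floor average, i.e. what a halving add computes per lane.  The sum is
   done in 64 bits, so it cannot overflow for 32-bit inputs.  Right shift
   of a negative value is implementation-defined in ISO C; GCC implements
   it as an arithmetic shift.  */
static int32_t
avg_floor_s32 (int32_t a, int32_t b)
{
  return (int32_t) (((int64_t) a + b) >> 1);
}

/* Ceil average, i.e. what a rounding halving add computes per lane.  */
static int32_t
avg_ceil_s32 (int32_t a, int32_t b)
{
  return (int32_t) (((int64_t) a + b + 1) >> 1);
}
```

Without the cast, a + b is evaluated as int32_t, so no wider type appears in
the IR and the sum can overflow for large inputs.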

OK for trunk?

Thanks,

Christophe

>
> > As noted in neon-vhadd-1.c, I couldn't write a test able to use Neon
> > vectorization with 64-bit vectors: we default to
> > -mvectorize-with-neon-quad, and attempts to use
> > -mvectorize-with-neon-double resulted in much worse code, which this
> > patch does not aim at improving.
>
> I guess this is because the MVE_2 mode iterators only include 128-bit types.
> Leaving Neon double as future work sounds good though.
Note that I am focusing on MVE enablement at the moment.

> And yeah, the code for V(R)HADD-equivalent operations is much worse when
> V(R)HADD isn't available, since the compiler really does need to double
> the precision of the operands, do double-precision addition,
> do double-precision shifts, and then truncate back.  So this looks
> like the expected behaviour.
>
> Thanks,
> Richard
From 493693b5c2f4e5fee7408062785930f723f2bd85 Mon Sep 17 00:00:00 2001
From: Christophe Lyon 
Date: Thu, 27 May 2021 20:11:28 +
Subject: [PATCH v2] arm: Auto-vectorization for MVE and Neon: vhadd/vrhadd

This patch adds support for auto-vectorization of average value
computation using vhadd or vrhadd, for both MVE and Neon.

The patch adds the needed [u]avg3_[floor|ceil] patterns to
vec-common.md; I'm not sure how to factorize them without introducing
an unspec iterator.

It also adds tests for 'floor' and for 'ceil', each for MVE and Neon.

Vectorization works with 8-bit and 16-bit input/output vectors, but
not with 32-bit ones, because the vectorizer expects wider types to be
available for the intermediate values and int32_t + int32_t does
not involve wider types in the IR.

As noted in neon-vhadd-1.c, I couldn't write a test able to use Neon
vectorization with 64-bit vectors: we default to
-mvectorize-with-neon-quad, and attempts to use
-mvectorize-with-neon-double resulted in much worse code, which this
patch does not aim at improving.

2021-05-31  Christophe Lyon  

	gcc/
	* config/arm/mve.md (mve_vhaddq_): Prefix with '@'.
	(@mve_vrhaddq_hadd): Likewise.
	* config/arm/vec-common.md (avg3_floor, uavg3_floor)
	(avg3_ceil, uavg3_ceil): New patterns.

	gcc/testsuite/
	* gcc.target/arm/simd/mve-vhadd-1.c: New test.
	* gcc.target/arm/simd/mve-vhadd-2.c: New test.
	* gcc.target/arm/simd/neon-vhadd-1.c: New test.
	* gcc.target/arm/simd/neon-vhadd-2.c: New test.
---
 gcc/config/arm/mve.md |  4 +-
 gcc

Re: [PATCH] x86: Convert CONST_WIDE_INT to broadcast in move expanders

2021-06-04 Thread H.J. Lu via Gcc-patches
On Thu, Jun 3, 2021 at 11:21 PM Uros Bizjak  wrote:
>
> On Fri, Jun 4, 2021 at 12:39 AM H.J. Lu  wrote:
> >
> > On Thu, Jun 3, 2021 at 12:39 AM Uros Bizjak  wrote:
> > >
> > > On Thu, Jun 3, 2021 at 5:49 AM H.J. Lu  wrote:
> > > >
> > > > Update move expanders to convert the CONST_WIDE_INT operand to vector
> > > > broadcast from a byte with AVX2.  Add ix86_gen_scratch_sse_rtx to
> > > > return a scratch SSE register which won't increase stack alignment
> > > > requirement and blocks transformation by the combine pass.
> > >
> > > Using fixed scratch reg is just too hackish for my taste. The
> >
> > It was recommended to use hard register for things like this:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569945.html
>
> I was worried that the temporary (even hard reg) will be spilled under
> some rare cases. If this is not the case, and even recommended for
> your use case, then it should be OK. Maybe use a hard register instead
> of match_scratch in the clobber of the proposed insn pattern below.

With a hard register, we can add

;; Modes handled by byte broadcast patterns.
(define_mode_iterator BYTE_BROADCAST_MODE
  [(V64QI "TARGET_AVX512F") (V32QI "TARGET_AVX") V16QI])

;; Broadcast from a byte.
(define_expand "vec_duplicate"
  [(set (match_operand:BYTE_BROADCAST_MODE 0 "register_operand")
        (vec_duplicate:BYTE_BROADCAST_MODE
          (match_operand:QI 1 "general_operand")))]
  "TARGET_SSE2"
{
  /* Enable VEC_DUPLICATE from a constant byte only if vector broadcast
     is available.  Otherwise use a compile-time constant to expand
     memset.  */
  if (!TARGET_AVX2 && CONST_INT_P (operands[1]))
    FAIL;
  if (!ix86_expand_vector_init_duplicate (false, mode,
                                          operands[0], operands[1]))
    gcc_unreachable ();
  DONE;
})

> > > expansion is OK to emit some optimized sequence, but the approach to
> > > use fixed reg to bypass stack alignment functionality and combine is
> > > not.
> > >
> > > Perhaps a new insn pattern should be introduced, e.g.
> > >
> > > (define_insn_and_split ""
> > >    [(set (match_operand:V 0 "memory_operand" "=m,m")
> > > (vec_duplicate:V
> > >   (match_operand:S 2 "reg_or_0_operand" "r,C"))
> > > (clobber (match_scratch:V 1 "=x"))]
> > >
> > > and split it at some appropriate point.
>
> Please note that zero (C) can be assigned directly to the XMM
> register, so there is no need for the intermediate integer reg,
> assuming that zero was propagated into the insn (-1 can be handled
> this way, too). Due to these propagations, it looks that the correct
> point to split the insn is after the reload.

On the other hand, we can do

  unsigned int nunits = GET_MODE_SIZE (mode) / GET_MODE_SIZE (QImode);
  machine_mode vector_mode;
  if (!mode_for_vector (QImode, nunits).exists (&vector_mode))
    gcc_unreachable ();
  target = ix86_gen_scratch_sse_rtx (vector_mode, true);
  rtx byte = GEN_INT ((char) byte_broadcast);
  if (!ix86_expand_vector_init_duplicate (false, vector_mode, target,
                                          byte))
    gcc_unreachable ();

This generates the same instruction and is much simpler.
ix86_expand_vector_init_duplicate can handle 0 and -1 properly.
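
For context, deciding whether a wide constant can be replaced by a byte
broadcast amounts to asking whether all of its bytes are equal.  A minimal
sketch in plain C, assuming the constant is given as an array of 64-bit words
(the function name is made up, and this is not the code from the patch):

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 and store the repeated byte in *BYTE if every byte of the N
   64-bit words in WORDS has the same value; return 0 otherwise.  */
static int
all_bytes_equal (const uint64_t *words, size_t n, uint8_t *byte)
{
  if (n == 0)
    return 0;
  uint8_t b = (uint8_t) (words[0] & 0xff);
  /* Replicate B into all eight byte positions of a word.  */
  uint64_t pattern = 0x0101010101010101ULL * b;
  for (size_t i = 0; i < n; i++)
    if (words[i] != pattern)
      return 0;
  *byte = b;
  return 1;
}
```

Both 0 and -1 pass this check (all bytes 0x00 or 0xff), which matches the
cases mentioned above.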

> On a related note, you are using only vpbroadcastb, assuming byte
> (QImode) granularity. Is it possible to also use HImode and larger
> modes to handle e.g. initializations of int arrays?

I will work on it.

> Uros.
>
> > I will give it a try.
> >
> > Thanks.
> >
> > > Uros.
> > >
> > > >
> > > > A small benchmark:
> > > >
> > > > https://gitlab.com/x86-benchmarks/microbenchmark/-/tree/memset/broadcast
> > > >
> > > > shows that broadcast is a little bit faster on Intel Core i7-8559U:
> > > >
> > > > $ make
> > > > gcc -g -I. -O2   -c -o test.o test.c
> > > > gcc -g   -c -o memory.o memory.S
> > > > gcc -g   -c -o broadcast.o broadcast.S
> > > > gcc -o test test.o memory.o broadcast.o
> > > > ./test
> > > > memory   : 99333
> > > > broadcast: 97208
> > > > $
> > > >
> > > > broadcast is also smaller:
> > > >
> > > > $ size memory.o broadcast.o
> > > >    text    data     bss     dec     hex filename
> > > >     132       0       0     132      84 memory.o
> > > >     122       0       0     122      7a broadcast.o
> > > > $
> > > >
> > > > gcc/
> > > >
> > > > PR target/100865
> > > > * config/i386/i386-expand.c (ix86_expand_vector_init_duplicate):
> > > > New prototype.
> > > > (ix86_byte_broadcast): New function.
> > > > (ix86_convert_const_wide_int_to_broadcast): Likewise.
> > > > (ix86_expand_move): Try ix86_convert_const_wide_int_to_broadcast
> > > > if mode size is 16 bytes or bigger.
> > > > (ix86_expand_vector_move): Try
> > > > ix86_convert_const_wide_int_to_broadcast.
> > > > * config/i386/i386-protos.h (ix86_gen_scratch_sse_rtx): New
> > > > prototype.
> > > > * config/i386/i386.c (ix86_minimum_incoming_stack_boundary): Add
> > >

[PATCH] openmp: Call c_omp_adjust_map_clauses even for combined target [PR100902]

2021-06-04 Thread Jakub Jelinek via Gcc-patches
Hi!

When looking at in_reduction support for target, I've noticed that
c_omp_adjust_map_clauses is not called for the combined target case.

The following patch fixes it; I will commit it once tested.

Unfortunately, there are other issues.

One (also mentioned in the PR) is that the pointer attachment handling
currently seems to depend on clause ordering, although the standard says that
clause ordering on the same construct does not matter: the baz and qux cases
in the PR are rejected, while they are accepted when the clauses are swapped.
Note that GCC initially treats the order of clauses as insignificant, and
only later on can the compiler adjust the ordering (e.g. when we sort map
clauses based on what they refer to etc.).  In particular, the clause list
produced by parsing is in the reverse of the order in the user code, while
c_omp_split_clauses, performed for combined/composite constructs, typically
reverses that ordering, i.e. makes it follow the user code ordering.

Another is that I'm slightly afraid c_omp_adjust_map_clauses might
misbehave in templates, though I haven't tried to verify it with testcases.
When processing_template_decl, the non-dependent clauses will usually be
handled the same as when not in a template, but dependent clauses aren't
processed, or only limited processing is done there, and the rest is deferred
till later.  From a quick skim of c_omp_adjust_map_clauses, it seems
it might not be very happy about non-processed map clauses that might
still have the TREE_LIST representation of array sections, or might
not have finalized decls or base decls etc.
So, for this I wonder if cp_parser_omp_target (and other cp/parser.c
callers of c_omp_adjust_map_clauses) shouldn't call it only 
if (!processing_template_decl) - perhaps you could add
cp_omp_adjust_map_clauses wrapper that would be
if (!processing_template_decl)
  c_omp_adjust_map_clauses (...);
- and call c_omp_adjust_map_clauses from within pt.c after the clauses
are tsubsted and finish_omp_clauses is called again.

2021-06-04  Jakub Jelinek  

PR c/100902
* c-parser.c (c_parser_omp_target): Call c_omp_adjust_map_clauses
even when target is combined with other constructs.

* parser.c (cp_parser_omp_target): Call c_omp_adjust_map_clauses
even when target is combined with other constructs.

* c-c++-common/gomp/pr100902-1.c: New test.

--- gcc/c/c-parser.c.jj 2021-06-04 11:15:11.616690819 +0200
+++ gcc/c/c-parser.c2021-06-04 13:15:42.788471162 +0200
@@ -20133,6 +20133,7 @@ c_parser_omp_target (c_parser *parser, e
  tree stmt = make_node (OMP_TARGET);
  TREE_TYPE (stmt) = void_type_node;
  OMP_TARGET_CLAUSES (stmt) = cclauses[C_OMP_CLAUSE_SPLIT_TARGET];
+ c_omp_adjust_map_clauses (OMP_TARGET_CLAUSES (stmt), true);
  OMP_TARGET_BODY (stmt) = block;
  OMP_TARGET_COMBINED (stmt) = 1;
  SET_EXPR_LOCATION (stmt, loc);
--- gcc/cp/parser.c.jj  2021-05-31 10:11:15.145978965 +0200
+++ gcc/cp/parser.c 2021-06-04 13:17:00.952392489 +0200
@@ -42233,6 +42233,7 @@ cp_parser_omp_target (cp_parser *parser,
  tree stmt = make_node (OMP_TARGET);
  TREE_TYPE (stmt) = void_type_node;
  OMP_TARGET_CLAUSES (stmt) = cclauses[C_OMP_CLAUSE_SPLIT_TARGET];
+ c_omp_adjust_map_clauses (OMP_TARGET_CLAUSES (stmt), true);
  OMP_TARGET_BODY (stmt) = body;
  OMP_TARGET_COMBINED (stmt) = 1;
  SET_EXPR_LOCATION (stmt, pragma_tok->location);
--- gcc/testsuite/c-c++-common/gomp/pr100902-1.c.jj 2021-06-04 13:13:48.471048762 +0200
+++ gcc/testsuite/c-c++-common/gomp/pr100902-1.c 2021-06-04 13:02:07.655723606 +0200
@@ -0,0 +1,17 @@
+/* PR c/100902 */
+
+void
+foo (int *ptr)
+{
+  #pragma omp target map (ptr, ptr[:4])
+  #pragma omp parallel master
+  ptr[0] = 1;
+}
+
+void
+bar (int *ptr)
+{
+  #pragma omp target parallel map (ptr[:4], ptr)
+  #pragma omp master
+  ptr[0] = 1;
+}

Jakub



Re: [committed] gfortran.dg/gomp/pr99928-*.f90: Use implicit none, remove one xfail

2021-06-04 Thread Tobias Burnus

On 04.06.21 13:20, Tobias Burnus wrote:


This adds a bunch of 'implicit none' to the testcases,
a missing 'integer i' and fixes a typo in a loop variable,
which allowed an xfail to be removed.


Or actually it didn't, as I seemingly did a last-minute change in the
wrong way after running the testsuite or missed a 'git add' or ...  :-(

The xfail is fixed – but only with this follow-up commit
r12-1211-gad3f0ad4bafe377072a53ded468fd9948e659f46.

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
commit ad3f0ad4bafe377072a53ded468fd9948e659f46
Author: Tobias Burnus 
Date:   Fri Jun 4 13:26:40 2021 +0200

gfortran.dg/gomp/pr99928-5.f90: Use proper iteration var

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/pr99928-5.f90: Really use the
proper iteration variable.

diff --git a/gcc/testsuite/gfortran.dg/gomp/pr99928-5.f90 b/gcc/testsuite/gfortran.dg/gomp/pr99928-5.f90
index c612aaf9556..49cbf1e8cc2 100644
--- a/gcc/testsuite/gfortran.dg/gomp/pr99928-5.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/pr99928-5.f90
@@ -90,3 +90,3 @@ subroutine bar ()
   !$omp taskloop simd linear (j10) default(none)
-  do j01 = 1, 64
+  do j10 = 1, 64
   end do


[committed] gfortran.dg/gomp/pr99928-*.f90: Use implicit none, remove one xfail

2021-06-04 Thread Tobias Burnus

This adds a bunch of 'implicit none' to the testcases,
a missing 'integer i' and fixes a typo in a loop variable,
which allowed an xfail to be removed.

Currently, there are two classes of xfail:
* !$omp parallel master taskloop (simd)
  → parallel misses shared(...)
for firstprivate/lastprivate/reduction variable
* !$omp target ...
  → Missing 'map(tofrom:'
for firstprivate/lastprivate/reduction
  → Wrong firstprivate in that case

Committed as r12-1210-g78b622e37381e1c0e9992f6634972dfbe0338d0b

Tobias

commit 78b622e37381e1c0e9992f6634972dfbe0338d0b
Author: Tobias Burnus 
Date:   Fri Jun 4 13:10:57 2021 +0200

gfortran.dg/gomp/pr99928-*.f90: Use implicit none, remove one xfail

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/pr99928-1.f90: Add 'implicit none'.
* gfortran.dg/gomp/pr99928-11.f90: Likewise.
* gfortran.dg/gomp/pr99928-4.f90: Likewise.
* gfortran.dg/gomp/pr99928-6.f90: Likewise.
* gfortran.dg/gomp/pr99928-8.f90: Likewise.
* gfortran.dg/gomp/pr99928-2.f90: Likewise. Add missing decl.
* gfortran.dg/gomp/pr99928-5.f90: Add implicit none;
fix loop-variable and remove xfail.

diff --git a/gcc/testsuite/gfortran.dg/gomp/pr99928-1.f90 b/gcc/testsuite/gfortran.dg/gomp/pr99928-1.f90
index 5cbffb09b3f..e5be42fba53 100644
--- a/gcc/testsuite/gfortran.dg/gomp/pr99928-1.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/pr99928-1.f90
@@ -3,6 +3,7 @@
 ! { dg-options "-fopenmp -fdump-tree-gimple" }
 
 module m
+  implicit none
   integer :: f00, f01, f02, f03, f04, f05, f06, f07, f08, f09
   integer :: f12, f13, f14, f15, f16, f17, f18, f19
   integer :: f20, f21, f22, f23, f24, f25, f26, f27, f28, f29
diff --git a/gcc/testsuite/gfortran.dg/gomp/pr99928-11.f90 b/gcc/testsuite/gfortran.dg/gomp/pr99928-11.f90
index 864ae4b6c99..22a40e2b49c 100644
--- a/gcc/testsuite/gfortran.dg/gomp/pr99928-11.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/pr99928-11.f90
@@ -3,6 +3,7 @@
 ! { dg-options "-fopenmp -fdump-tree-gimple" }
 
 module m
+  implicit none
   integer :: r00, r01, r02
 
 contains
diff --git a/gcc/testsuite/gfortran.dg/gomp/pr99928-2.f90 b/gcc/testsuite/gfortran.dg/gomp/pr99928-2.f90
index 5dbf78ba291..fe8a715279a 100644
--- a/gcc/testsuite/gfortran.dg/gomp/pr99928-2.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/pr99928-2.f90
@@ -3,12 +3,14 @@
 ! { dg-options "-fopenmp -fdump-tree-gimple" }
 
 module m
+  implicit none
   integer :: l00, l01, l02, l03, l04, l05, l06, l07
   integer :: l10, l11, l12, l13, l14, l15, l16, l17, l18
 
 contains
 
 subroutine foo ()
+  integer :: i
   ! { dg-final { scan-tree-dump "omp distribute\[^\n\r]*lastprivate\\(l00\\)" "gimple" } }
   ! { dg-final { scan-tree-dump "omp parallel\[^\n\r]*lastprivate\\(l00\\)" "gimple" } } ! FIXME: This should be on for instead. 
   ! { dg-final { scan-tree-dump-not "omp for\[^\n\r]*lastprivate\\(l00\\)" "gimple" } } ! FIXME. 
diff --git a/gcc/testsuite/gfortran.dg/gomp/pr99928-4.f90 b/gcc/testsuite/gfortran.dg/gomp/pr99928-4.f90
index 5b82dd6581c..ead8f030e63 100644
--- a/gcc/testsuite/gfortran.dg/gomp/pr99928-4.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/pr99928-4.f90
@@ -3,6 +3,7 @@
 ! { dg-options "-fopenmp -fdump-tree-gimple" }
 
 module m
+  implicit none
   integer :: l00, l01, l05, l06, l07, l08
 
 contains
diff --git a/gcc/testsuite/gfortran.dg/gomp/pr99928-5.f90 b/gcc/testsuite/gfortran.dg/gomp/pr99928-5.f90
index 9f45e48feb4..c612aaf9556 100644
--- a/gcc/testsuite/gfortran.dg/gomp/pr99928-5.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/pr99928-5.f90
@@ -3,6 +3,7 @@
 ! { dg-options "-fopenmp -fdump-tree-gimple" }
 
 module m
+  implicit none
   integer :: j00, j01, j02, j03, j04, j06, j07, j08, j09
   integer :: j10
 
@@ -85,9 +86,9 @@ subroutine bar ()
   end do
   ! { dg-final { scan-tree-dump "omp taskloop\[^\n\r]*shared\\(j10\\)" "gimple" } } ! NOTE: This is implementation detail. 
   ! { dg-final { scan-tree-dump "omp taskloop\[^\n\r]*lastprivate\\(j10\\)" "gimple" } }
-  ! { dg-final { scan-tree-dump "omp simd\[^\n\r]*linear\\(j10:1\\)" "gimple" { xfail *-*-* } } }
+  ! { dg-final { scan-tree-dump "omp simd\[^\n\r]*linear\\(j10:1\\)" "gimple" } }
   !$omp taskloop simd linear (j10) default(none)
-  do j010 = 1, 64
+  do j01 = 1, 64
   end do
   ! { dg-final { scan-tree-dump "omp teams\[^\n\r]*shared\\(j11\\)" "gimple" } }
   ! { dg-final { scan-tree-dump "omp distribute\[^\n\r]*lastprivate\\(j11\\)" "gimple" } }
diff --git a/gcc/testsuite/gfortran.dg/gomp/pr99928-6.f90 b/gcc/testsuite/gfortran.dg/gomp/pr99928-6.f90
index 37a93e6b1ac..0e60199476b 100644
--- a/gcc/testsuite/gfortran.dg/gomp/pr99928-6.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/pr99928-6.f90
@@ -3,6 +3,7 @@
 ! { dg-options "-fopenmp -fdump-tree-gimple" }
 
 module m
+  implicit none
   integer :: j00, j01, j02

Re: [PATCH] libgcc: Fix _Unwind_Backtrace() for SEH

2021-06-04 Thread Eric Botcazou
> Forgot to assign to gcc_context.cfa and gcc_context.ra. Note this fix can
> be backported to earlier editions of gcc as well

It's already done by:

2020-11-03  Martin Storsjö  

* unwind-seh.c (_Unwind_Backtrace): Set the ra and cfa pointers
before calling the callback.

https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553418.html

-- 
Eric Botcazou




RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-06-04 Thread Tamar Christina via Gcc-patches
Hi Richi,

Attached is the re-spun patch.  tree_nop_conversion_p was very handy in cleaning up
the patch, thanks!

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master if Richard S has no comments?

Thanks,
Tamar

gcc/ChangeLog:

* optabs.def (usdot_prod_optab): New.
* doc/md.texi: Document it and clarify other dot prod optabs.
* optabs-tree.h (enum optab_subtype): Add optab_vector_mixed_sign.
* optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
* optabs.c (expand_widen_pattern_expr): Likewise.
* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
* tree-vect-loop.c (vectorizable_reduction): Query dot-product kind.
* tree-vect-patterns.c (vect_supportable_direct_optab_p): Take optional
optab subtype.
(vect_joust_widened_type, vect_widened_op_tree): Optionally ignore
mismatch types.
(vect_recog_dot_prod_pattern): Support usdot_prod_optab.
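
As a rough scalar illustration of the mixed-sign dot product that the new
usdot_prod optab is meant to describe, the sketch below follows the md.texi
wording in this patch ("Operand 1 must be unsigned and operand 2 signed").
The 8-bit element width, the 32-bit accumulator and the function names are
assumptions made for the example; they are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a mixed-sign dot product: A is
   zero-extended, B is sign-extended, and the products are added to ACC.  */
static int32_t
usdot_model (const uint8_t *a, const int8_t *b, size_t n, int32_t acc)
{
  for (size_t i = 0; i < n; i++)
    acc += (int32_t) a[i] * (int32_t) b[i];
  return acc;
}

/* For contrast, a signed-by-signed model, where both inputs are
   sign-extended before the multiplication.  */
static int32_t
sdot_model (const int8_t *a, const int8_t *b, size_t n, int32_t acc)
{
  for (size_t i = 0; i < n; i++)
    acc += (int32_t) a[i] * (int32_t) b[i];
  return acc;
}
```

The same bit pattern 0xff contributes 255 * b in the mixed-sign model and
-1 * b in the signed model, which is why the vectorizer needs a separate
optab rather than reusing sdot_prod or udot_prod.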


--- inline copy of patch ---

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index d166a0debedf4d8edf55c842bcf4ff4690b3e9ce..9fad3322b3f1eb2a836833bb390df78f0cd9734b 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5438,13 +5438,55 @@ Like @samp{fold_left_plus_@var{m}}, but takes an additional mask operand
 
 @cindex @code{sdot_prod@var{m}} instruction pattern
 @item @samp{sdot_prod@var{m}}
+
+Compute the sum of the products of two signed elements.
+Operand 1 and operand 2 are of the same mode. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+sdot ==
+   res = sign-ext (a) * sign-ext (b) + c
+@dots{}
+@end smallexample
+
 @cindex @code{udot_prod@var{m}} instruction pattern
-@itemx @samp{udot_prod@var{m}}
-Compute the sum of the products of two signed/unsigned elements.
-Operand 1 and operand 2 are of the same mode. Their product, which is of a
-wider mode, is computed and added to operand 3. Operand 3 is of a mode equal or
-wider than the mode of the product. The result is placed in operand 0, which
-is of the same mode as operand 3.
+@item @samp{udot_prod@var{m}}
+
+Compute the sum of the products of two unsigned elements.
+Operand 1 and operand 2 are of the same mode. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+udot ==
+   res = zero-ext (a) * zero-ext (b) + c
+@dots{}
+@end smallexample
+
+
+
+@cindex @code{usdot_prod@var{m}} instruction pattern
+@item @samp{usdot_prod@var{m}}
+Compute the sum of the products of elements of different signs.
+Operand 1 must be unsigned and operand 2 signed. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+usdot ==
+   res = ((unsigned-conv) sign-ext (a)) * zero-ext (b) + c
+@dots{}
+@end smallexample
 
 @cindex @code{ssad@var{m}} instruction pattern
 @item @samp{ssad@var{m}}
diff --git a/gcc/optabs-tree.h b/gcc/optabs-tree.h
index c3aaa1a416991e856d3e24da45968a92ebada82c..fbd2b06b8dbfd560dfb66b314830e6b564b37abb 100644
--- a/gcc/optabs-tree.h
+++ b/gcc/optabs-tree.h
@@ -29,7 +29,8 @@ enum optab_subtype
 {
   optab_default,
   optab_scalar,
-  optab_vector
+  optab_vector,
+  optab_vector_mixed_sign
 };
 
 /* Return the optab used for computing the given operation on the type given by
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index 95ffe397c23e80c105afea52e9d47216bf52f55a..eeb5aeed3202cc6971b6447994bc5311e9c010bb 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -127,7 +127,12 @@ optab_for_tree_code (enum tree_code code, const_tree type,
   return TYPE_UNSIGNED (type) ? usum_widen_optab : ssum_widen_optab;
 
 case DOT_PROD_EXPR:
-  return TYPE_UNSIGNED (type) ? udot_prod_optab : sdot_prod_optab;
+  {
+   if (subtype == optab_vector_mixed_sign)
+ return usdot_prod_optab;
+
+   return (TYPE_UNSIGNED (type) ? udot_prod_optab : sdot_prod_optab);
+  }
 
 case SAD_EXPR:
   return TYPE_UNSIGNED (type) ? usad_optab : ssad_optab;
diff --git a/gcc/optabs.c b/gcc/optabs.c
index f4614a394587787293dc8b680a38901f7906f61c..d9b64441d0e0726afee89dc9c937350451e7670d 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -262,6 +262,11 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
   bool sbool = false;
 
   oprnd0 = op

Re: [PATCH] x86: Fix ix86_expand_vector_init for V*TImode [PR100887]

2021-06-04 Thread Uros Bizjak via Gcc-patches
On Fri, Jun 4, 2021 at 11:00 AM Jakub Jelinek  wrote:
>
> Hi!
>
> We have vec_initv4tiv2ti and vec_initv2titi patterns which call
> ix86_expand_vector_init and assume it works for those modes.  For the
> case of construction from two half-sized vectors, the code assumes it
> will always succeed, but we have only insn patterns with SImode and DImode
> element types.  QImode and HImode element types are already handled
> by performing it with same sized vectors with SImode elements and the
> following patch extends that to V*TImode vectors.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2021-06-04  Jakub Jelinek  
>
> PR target/100887
> * config/i386/i386-expand.c (ix86_expand_vector_init): Handle
> concatenation from half-sized modes with TImode elements.
>
> * gcc.target/i386/pr100887.c: New test.

OK.

Thanks,
Uros.

>
> --- gcc/config/i386/i386-expand.c.jj2021-05-28 11:03:19.424885281 +0200
> +++ gcc/config/i386/i386-expand.c   2021-06-03 12:30:44.263286549 +0200
> @@ -14610,11 +14610,15 @@ ix86_expand_vector_init (bool mmx_ok, rt
>if (GET_MODE_NUNITS (GET_MODE (x)) * 2 == n_elts)
> {
>   rtx ops[2] = { XVECEXP (vals, 0, 0), XVECEXP (vals, 0, 1) };
> - if (inner_mode == QImode || inner_mode == HImode)
> + if (inner_mode == QImode
> + || inner_mode == HImode
> + || inner_mode == TImode)
> {
>   unsigned int n_bits = n_elts * GET_MODE_SIZE (inner_mode);
> - mode = mode_for_vector (SImode, n_bits / 4).require ();
> - inner_mode = mode_for_vector (SImode, n_bits / 8).require ();
> + scalar_mode elt_mode = inner_mode == TImode ? DImode : SImode;
> + n_bits /= GET_MODE_SIZE (elt_mode);
> + mode = mode_for_vector (elt_mode, n_bits).require ();
> + inner_mode = mode_for_vector (elt_mode, n_bits / 2).require ();
>   ops[0] = gen_lowpart (inner_mode, ops[0]);
>   ops[1] = gen_lowpart (inner_mode, ops[1]);
>   subtarget = gen_reg_rtx (mode);
> --- gcc/testsuite/gcc.target/i386/pr100887.c.jj 2021-06-03 12:44:09.653939987 +0200
> +++ gcc/testsuite/gcc.target/i386/pr100887.c 2021-06-03 12:43:36.580404322 +0200
> @@ -0,0 +1,13 @@
> +/* PR target/100887 */
> +/* { dg-do compile { target int128 } } */
> +/* { dg-options "-mavx512f" } */
> +
> +typedef unsigned __int128 U __attribute__((__vector_size__ (64)));
> +typedef unsigned __int128 V __attribute__((__vector_size__ (32)));
> +typedef unsigned __int128 W __attribute__((__vector_size__ (16)));
> +
> +W
> +foo (U u, V v)
> +{
> +  return __builtin_shufflevector (u, v, 0);
> +}
>
> Jakub
>


[committed] openmp: Assorted depend/affinity/iterator related fixes [PR100859]

2021-06-04 Thread Jakub Jelinek via Gcc-patches
Hi!

The depend-iterator-3.C testcase shows various bugs.

1) tsubst_omp_clauses didn't handle OMP_CLAUSE_AFFINITY (should be
   handled like OMP_CLAUSE_DEPEND)
2) because locators can be arbitrary lvalue expressions, we need
   to allow for C++ array section base (especially when array section
   is just an array reference) FIELD_DECLs, handle them as this->member,
   but don't need to privatize in any way
3) similarly for this as base
4) depend(inout: this) is invalid, but for different reason than the reported
   one, again this is an expression, but not lvalue expression, so that
   should be reported
5) the ctor/dtor cloning in the C++ FE (which is using walk_tree with
   copy_tree_body_r) didn't handle iterators correctly, walk_tree normally
   doesn't walk TREE_PURPOSE of TREE_LIST, and in the iterator case
   that TREE_VEC contains also a BLOCK that needs special handling during
   copy_tree_body_r

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2021-06-03  Jakub Jelinek  

PR c++/100859
gcc/
* tree-inline.c (copy_tree_body_r): Handle iterators on
OMP_CLAUSE_AFFINITY or OMP_CLAUSE_DEPEND.
gcc/c/
* c-typeck.c (c_finish_omp_clauses): Move OMP_CLAUSE_AFFINITY
after depend only cases.
gcc/cp/
* semantics.c (handle_omp_array_sections_1): For
OMP_CLAUSE_{AFFINITY,DEPEND} handle FIELD_DECL base using
finish_non_static_data_member and allow this as base.
(finish_omp_clauses): Move OMP_CLAUSE_AFFINITY
after depend only cases.  Let this be diagnosed by !lvalue_p
case for OMP_CLAUSE_{AFFINITY,DEPEND} and remove useless
assert.
* pt.c (tsubst_omp_clauses): Handle OMP_CLAUSE_AFFINITY.
gcc/testsuite/
* g++.dg/gomp/depend-iterator-3.C: New test.
* g++.dg/gomp/this-1.C: Don't expect any diagnostics for
this as base expression of depend array section, expect a different
error wording for this as depend locator and add testcases
for affinity clauses.

--- gcc/tree-inline.c.jj2021-05-29 10:04:31.052446552 +0200
+++ gcc/tree-inline.c   2021-06-02 12:58:51.337707890 +0200
@@ -1453,6 +1453,27 @@ copy_tree_body_r (tree *tp, int *walk_su
 
  *walk_subtrees = 0;
}
+  else if (TREE_CODE (*tp) == OMP_CLAUSE
+  && (OMP_CLAUSE_CODE (*tp) == OMP_CLAUSE_AFFINITY
+  || OMP_CLAUSE_CODE (*tp) == OMP_CLAUSE_DEPEND))
+   {
+ tree t = OMP_CLAUSE_DECL (*tp);
+ if (TREE_CODE (t) == TREE_LIST
+ && TREE_PURPOSE (t)
+ && TREE_CODE (TREE_PURPOSE (t)) == TREE_VEC)
+   {
+ *walk_subtrees = 0;
+ OMP_CLAUSE_DECL (*tp) = copy_node (t);
+ t = OMP_CLAUSE_DECL (*tp);
+ TREE_PURPOSE (t) = copy_node (TREE_PURPOSE (t));
+ for (int i = 0; i <= 4; i++)
+   walk_tree (&TREE_VEC_ELT (TREE_PURPOSE (t), i),
+  copy_tree_body_r, id, NULL);
+ if (TREE_VEC_ELT (TREE_PURPOSE (t), 5))
+   remap_block (&TREE_VEC_ELT (TREE_PURPOSE (t), 5), id);
+ walk_tree (&TREE_VALUE (t), copy_tree_body_r, id, NULL);
+   }
+   }
 }
 
   /* Keep iterating.  */
--- gcc/c/c-typeck.c.jj 2021-06-02 10:07:47.630826586 +0200
+++ gcc/c/c-typeck.c2021-06-02 12:09:34.639451958 +0200
@@ -14557,7 +14557,6 @@ c_finish_omp_clauses (tree clauses, enum
}
  break;
 
-   case OMP_CLAUSE_AFFINITY:
case OMP_CLAUSE_DEPEND:
  t = OMP_CLAUSE_DECL (c);
  if (t == NULL_TREE)
@@ -14566,8 +14565,7 @@ c_finish_omp_clauses (tree clauses, enum
  == OMP_CLAUSE_DEPEND_SOURCE);
  break;
}
- if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEPEND
- && OMP_CLAUSE_DEPEND_KIND (c) == OMP_CLAUSE_DEPEND_SINK)
+ if (OMP_CLAUSE_DEPEND_KIND (c) == OMP_CLAUSE_DEPEND_SINK)
{
  gcc_assert (TREE_CODE (t) == TREE_LIST);
  for (; t; t = TREE_CHAIN (t))
@@ -14595,6 +14593,9 @@ c_finish_omp_clauses (tree clauses, enum
}
  break;
}
+ /* FALLTHRU */
+   case OMP_CLAUSE_AFFINITY:
+ t = OMP_CLAUSE_DECL (c);
  if (TREE_CODE (t) == TREE_LIST
  && TREE_PURPOSE (t)
  && TREE_CODE (TREE_PURPOSE (t)) == TREE_VEC)
--- gcc/cp/semantics.c.jj   2021-06-02 10:07:47.633826543 +0200
+++ gcc/cp/semantics.c  2021-06-02 13:42:25.311863041 +0200
@@ -4968,7 +4968,11 @@ handle_omp_array_sections_1 (tree c, tre
  if (REFERENCE_REF_P (t))
t = TREE_OPERAND (t, 0);
}
-  if (!VAR_P (t) && TREE_CODE (t) != PARM_DECL)
+  if (TREE_CODE (t) == FIELD_DECL
+ && (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_AFFINITY
+ || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEPEND))
+   ret = finish_non_static_data_member (t, NULL_TREE, NULL_TREE);
+  else if (!VAR

Re: [PATCH] arc: Add --with-fpu support for ARCv2 cpus

2021-06-04 Thread Bernhard Reutner-Fischer via Gcc-patches
On Fri,  4 Jun 2021 10:29:09 +0300
Claudiu Zissulescu via Gcc-patches  wrote:

> Hi Jeff,
> 
> I would like to add spport for selecting the ARCv2 FPU extension at
> configuration-time.
> 
> The --with-fpu configuration option is ignored when -mfpu compiler
> option is specified.
> 
> My concern is using `grep -P` when configuring. Is that ok?

Please don't.
Not every grep(1) has PCRE support, so it would be great if you could
rephrase it to use just a basic regexp or an ERE.

like e.g.:
grep -q -E "^ARC_CPU \(hs38,[   ]*[emhs]+," config/arc/arc-cpus.def

where the bracket expression contains a literal space and a literal tab.
Or use an awk script that exits appropriately if the arg matches ARCH.
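
To show that the suggested pattern needs nothing beyond POSIX extended
regular expressions, here is a small C sketch that feeds the same ERE to
regcomp(3) with REG_EXTENDED.  The helper name and the sample input lines in
the note below are hypothetical; the real contents of
config/arc/arc-cpus.def are not reproduced here.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE matches the ERE from the mail, 0 if it does not,
   and -1 if the pattern fails to compile.  */
static int
matches_hs38 (const char *line)
{
  /* "\t" puts a literal tab next to the space in the bracket expression.  */
  static const char pattern[] = "^ARC_CPU \\(hs38,[ \t]*[emhs]+,";
  regex_t re;
  int rc;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

A made-up line such as "ARC_CPU (hs38, hs, X)" matches, while
"ARC_CPU (hs38_linux, hs, X)" does not, because the pattern requires a comma
directly after "hs38".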

TIA,


[PATCH] x86: Fix ix86_expand_vector_init for V*TImode [PR100887]

2021-06-04 Thread Jakub Jelinek via Gcc-patches
Hi!

We have vec_initv4tiv2ti and vec_initv2titi patterns which call
ix86_expand_vector_init and assume it works for those modes.  For the
case of construction from two half-sized vectors, the code assumes it
will always succeed, but we have only insn patterns with SImode and DImode
element types.  QImode and HImode element types are already handled
by performing it with same sized vectors with SImode elements and the
following patch extends that to V*TImode vectors.
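
The mode arithmetic in the patch below can be modelled with plain integers.
This is only a sketch: the struct and function names are made up, sizes are
in bytes as returned by GET_MODE_SIZE, and TImode is assumed to be 16 bytes
wide with DImode 8 and SImode 4.

```c
/* Hypothetical summary of the modes the expander switches to.  */
struct split_modes
{
  unsigned elt_bytes;   /* size of the replacement element mode */
  unsigned full_units;  /* elements in the full-width replacement vector */
  unsigned half_units;  /* elements in each half-sized input vector */
};

/* N_ELTS is the element count of the vector being built and INNER_BYTES
   the size of its original element mode (1, 2 or 16 in this code path).  */
static struct split_modes
pick_modes (unsigned n_elts, unsigned inner_bytes)
{
  struct split_modes r;
  unsigned total_bytes = n_elts * inner_bytes;

  /* TImode elements are viewed as DImode, QImode/HImode as SImode.  */
  r.elt_bytes = inner_bytes == 16 ? 8 : 4;
  r.full_units = total_bytes / r.elt_bytes;
  r.half_units = r.full_units / 2;
  return r;
}
```

For a V4TImode vector built from two V2TImode halves this gives 8-byte
elements, 8 units for the full vector (V8DImode) and 4 units per half
(V4DImode), for which concatenation patterns exist.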

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-06-04  Jakub Jelinek  

PR target/100887
* config/i386/i386-expand.c (ix86_expand_vector_init): Handle
concatenation from half-sized modes with TImode elements.

* gcc.target/i386/pr100887.c: New test.

--- gcc/config/i386/i386-expand.c.jj2021-05-28 11:03:19.424885281 +0200
+++ gcc/config/i386/i386-expand.c   2021-06-03 12:30:44.263286549 +0200
@@ -14610,11 +14610,15 @@ ix86_expand_vector_init (bool mmx_ok, rt
   if (GET_MODE_NUNITS (GET_MODE (x)) * 2 == n_elts)
{
  rtx ops[2] = { XVECEXP (vals, 0, 0), XVECEXP (vals, 0, 1) };
- if (inner_mode == QImode || inner_mode == HImode)
+ if (inner_mode == QImode
+ || inner_mode == HImode
+ || inner_mode == TImode)
{
  unsigned int n_bits = n_elts * GET_MODE_SIZE (inner_mode);
- mode = mode_for_vector (SImode, n_bits / 4).require ();
- inner_mode = mode_for_vector (SImode, n_bits / 8).require ();
+ scalar_mode elt_mode = inner_mode == TImode ? DImode : SImode;
+ n_bits /= GET_MODE_SIZE (elt_mode);
+ mode = mode_for_vector (elt_mode, n_bits).require ();
+ inner_mode = mode_for_vector (elt_mode, n_bits / 2).require ();
  ops[0] = gen_lowpart (inner_mode, ops[0]);
  ops[1] = gen_lowpart (inner_mode, ops[1]);
  subtarget = gen_reg_rtx (mode);
--- gcc/testsuite/gcc.target/i386/pr100887.c.jj 2021-06-03 12:44:09.653939987 +0200
+++ gcc/testsuite/gcc.target/i386/pr100887.c 2021-06-03 12:43:36.580404322 +0200
@@ -0,0 +1,13 @@
+/* PR target/100887 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-mavx512f" } */
+
+typedef unsigned __int128 U __attribute__((__vector_size__ (64)));
+typedef unsigned __int128 V __attribute__((__vector_size__ (32)));
+typedef unsigned __int128 W __attribute__((__vector_size__ (16)));
+
+W
+foo (U u, V v)
+{
+  return __builtin_shufflevector (u, v, 0);
+}

Jakub



[PATCH] fold-const: Fix up fold_read_from_vector [PR100887]

2021-06-04 Thread Jakub Jelinek via Gcc-patches
Hi!

The callers of fold_read_from_vector expect that the index they pass is
an index of an element in the vector and the function does that most of the
time.  But we allow CONSTRUCTORs with VECTOR_TYPE to have VECTOR_TYPE
elements and in that case every CONSTRUCTOR element represents not just one
index (with the exception of V1 vectors), but multiple.
So returning zero vector if i >= CONSTRUCTOR_NELTS or returning some
CONSTRUCTOR_ELT's value might not be what the callers expect.

Fixed by punting if the first element has vector type.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

In theory we could instead recurse (and assert that for CONSTRUCTORs of
vector elements we have always all elements specified like tree-cfg.c
verifies?) after adjusting the index appropriately.
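
The control flow of the fix can be sketched with a toy model.  The struct
below is a hypothetical stand-in for a vector CONSTRUCTOR, not GCC's tree
representation; it only records whether the constructor elements are
themselves vectors, which is the condition the patch checks on the first
element.

```c
#include <stddef.h>

/* Hypothetical stand-in for a vector CONSTRUCTOR.  */
struct ctor_model
{
  size_t nelts;          /* constructor elements actually present */
  int elts_are_vectors;  /* nonzero if each element is itself a vector */
  const int *values;     /* one value per element when elements are scalars */
};

/* Mirror the patched control flow: return 1 and store the lane in *OUT
   when the read can be folded, 0 when the caller has to give up.  */
static int
read_lane (const struct ctor_model *c, size_t i, int *out)
{
  /* An element of vector type covers several lanes, so index I does not
     name a single constructor element; punt.  */
  if (c->nelts != 0 && c->elts_are_vectors)
    return 0;
  /* Missing trailing elements are implicitly zero.  */
  if (i >= c->nelts)
    {
      *out = 0;
      return 1;
    }
  *out = c->values[i];
  return 1;
}
```

The recursive alternative mentioned above would instead divide the index by
the number of lanes per element and descend into the selected element; the
sketch only covers the punting variant that the patch implements.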

2021-06-04  Jakub Jelinek  

PR target/100887
* fold-const.c (fold_read_from_vector): Return NULL if trying to
read from a CONSTRUCTOR with vector type elements.

* gcc.dg/pr100887.c: New test.

--- gcc/fold-const.c.jj 2021-05-28 11:03:19.507884088 +0200
+++ gcc/fold-const.c2021-06-03 14:52:52.616393656 +0200
@@ -15471,6 +15471,9 @@ fold_read_from_vector (tree arg, poly_ui
return VECTOR_CST_ELT (arg, i);
   else if (TREE_CODE (arg) == CONSTRUCTOR)
{
+ if (CONSTRUCTOR_NELTS (arg)
+ && VECTOR_TYPE_P (TREE_TYPE (CONSTRUCTOR_ELT (arg, 0)->value)))
+   return NULL_TREE;
  if (i >= CONSTRUCTOR_NELTS (arg))
return build_zero_cst (TREE_TYPE (TREE_TYPE (arg)));
  return CONSTRUCTOR_ELT (arg, i)->value;
--- gcc/testsuite/gcc.dg/pr100887.c.jj  2021-06-03 15:09:07.629898248 +0200
+++ gcc/testsuite/gcc.dg/pr100887.c 2021-06-03 15:09:48.265335283 +0200
@@ -0,0 +1,14 @@
+/* PR target/100887 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+/* { dg-additional-options "-mavx512f" { target { i?86-*-* x86_64-*-* } } } */
+
+typedef unsigned long long __attribute__((__vector_size__ (2 * sizeof (long long)))) U;
+typedef unsigned long long __attribute__((__vector_size__ (4 * sizeof (long long)))) V;
+typedef unsigned long long __attribute__((__vector_size__ (8 * sizeof (long long)))) W;
+
+U
+foo (V v)
+{
+  return __builtin_shufflevector ((W){}, v, 0, 8);
+}

Jakub



Re: [PATCH 2/2, rs6000] Remove mode promotion for pseudos

2021-06-04 Thread HAO CHEN GUI via Gcc-patches

Segher,

   I committed two patches (r12-1201 and r12-1202) into trunk. Thanks 
for your review and advice.



On 4/6/2021 1:36 AM, Segher Boessenkool wrote:

Hi!

On Thu, May 20, 2021 at 05:49:49PM +0800, HAO CHEN GUI wrote:

rs6000 has instructions that can do almost everything 32 bit
at least as efficiently as the corresponding 64 bit things. The
mode promotion can be deferred to when a wide mode is necessary.
So it helps a lot not to promote the mode for pseudos. The SPECint test
shows that the overall performance improvement (by geomean) is
more than 2% with this patch.
testsuite/gcc.target/powerpc/not-promote-mode.c illustrates how
the patch eliminates the redundant extensions and does further
optimization by disabling mode promotion for pseudos.

I'd still like to see if (and why) this works better than explicitly
promoting QImode and HImode here.  But that can be done later.


--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/not-promote-mode.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */

Just

/* { dg-do compile { target lp64 } } */

because the rest is already implied by this being in gcc.target/powerpc .

The patch is okay for trunk.  Thank you very much for finding this huge
performance gain!


Segher


Re: [PATCH] Simplify (view_convert ~a) < 0 to (view_convert a) >= 0 [PR middle-end/100738]

2021-06-04 Thread Marc Glisse

On Fri, 4 Jun 2021, Hongtao Liu via Gcc-patches wrote:


On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse  wrote:


On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote:


Hi:
 This patch simplifies (view_convert:type ~a) < 0 to
(view_convert:type a) >= 0 when type is a signed integer type. Similarly for
(view_convert:type ~a) >= 0.
 Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
 Ok for the trunk?

gcc/ChangeLog:

   PR middle-end/100738
   * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
   (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
   simplification.


We already have

/* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
(for cmp (simple_comparison)
  scmp (swapped_simple_comparison)
  (simplify
   (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1)
   (if (single_use (@2)
        && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST))
    (scmp @0 (bit_not @1)))))
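
The identity behind this fold can be checked exhaustively on small scalars.
The following is a minimal sketch in plain C rather than match.pd syntax; it
assumes two's complement integers, uses "<" as the sample comparison, and the
function names are made up for the example.

```c
/* Count (x, c) pairs over all signed 8-bit values for which
   "~x < c" and "x > ~c" disagree.  */
static int
count_fold_mismatches (void)
{
  int bad = 0;

  for (int x = -128; x <= 127; x++)
    for (int c = -128; c <= 127; c++)
      {
        int lhs = (~x) < c;   /* ~X op C */
        int rhs = x > (~c);   /* X op' ~C, with the comparison swapped */
        if (lhs != rhs)
          bad++;
      }
  return bad;
}

/* The case from the PR: with C == 0, "~x < 0" becomes "x > -1",
   which is "x >= 0".  */
static int
sign_case_holds (int x)
{
  return ((~x) < 0) == (x >= 0);
}
```

Since ~x equals -x - 1, bitwise NOT reverses the ordering, which is why the
comparison has to be swapped when the NOT moves to the constant.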

Would it make sense to try and generalize it a bit, say with

(cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P)

(scmp (view_convert:XXX @0) (bit_not @1))


Thanks for your advice, it looks great.
And can I use *view_convert1?* instead of *nop_convert1?* here,
because the original case is view_convert, and nop_convert would fail
to simplify the case.


Near the top of match.pd, you can see

/* With nop_convert? combine convert? and view_convert? in one pattern
   plus conditionalize on tree_nop_conversion_p conversions.  */
(match (nop_convert @0)
 (convert @0)
 (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))))
(match (nop_convert @0)
 (view_convert @0)
 (if (VECTOR_TYPE_P (type) && VECTOR_TYPE_P (TREE_TYPE (@0))
      && known_eq (TYPE_VECTOR_SUBPARTS (type),
                   TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)))
      && tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE (@0))))))

So at least the intention is that it can handle both NOP_EXPR for scalars 
and VIEW_CONVERT_EXPR for vectors, and I think we already use it that way 
in some places in match.pd, like


(simplify
 (negate (nop_convert? (bit_not @0)))
 (plus (view_convert @0) { build_each_one_cst (type); }))

(simplify
 (bit_xor:c (nop_convert?:s (bit_not:s @0)) @1)
 (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
  (bit_not (bit_xor (view_convert @0) @1))))

(the 'if' seems redundant for this one)

 (simplify
  (negate (nop_convert? (negate @1)))
  (if (!TYPE_OVERFLOW_SANITIZED (type)
   && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@1)))
   (view_convert @1)))

etc.


At some point this got some genmatch help, to handle '?' and numbers, so I 
don't remember all the details, but following these examples should work.


--
Marc Glisse


Re: [PATCH] Simplify (view_convert ~a) < 0 to (view_convert a) >= 0 [PR middle-end/100738]

2021-06-04 Thread Hongtao Liu via Gcc-patches
On Fri, Jun 4, 2021 at 1:01 PM Hongtao Liu  wrote:
>
> On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse  wrote:
> >
> > On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote:
> >
> > > Hi:
> > >  This patch simplifies (view_convert:type ~a) < 0 to
> > > (view_convert:type a) >= 0 when type is a signed integer type. Similarly for
> > > (view_convert:type ~a) >= 0.
> > >  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> > >  Ok for the trunk?
> > >
> > > gcc/ChangeLog:
> > >
> > >PR middle-end/100738
> > >* match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
> > >(view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
> > >simplification.
> >
> > We already have
> >
> > /* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
> > (for cmp (simple_comparison)
> >   scmp (swapped_simple_comparison)
> >   (simplify
> >    (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1)
> >    (if (single_use (@2)
> >         && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST))
> >     (scmp @0 (bit_not @1)))))
> >
> > Would it make sense to try and generalize it a bit, say with
> >
> > (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P)
> >
> > (scmp (view_convert:XXX @0) (bit_not @1))
> >
> Thanks for your advice, it looks great.
> And can I use *view_convert1?* instead of *nop_convert1?* here,
> because the original case is view_convert, and nop_convert would fail
> to simplify the case.
Here is the updated patch.

gcc/ChangeLog:

PR middle-end/100738
* match.pd (Fold ~X op C as X op' ~C): Extend GIMPLE
simplification to handle view_convert ~X.

gcc/testsuite/ChangeLog:

PR middle-end/100738
* g++.target/i386/avx2-pr100738-1.C: New test.
* g++.target/i386/sse4_1-pr100738-1.C: New test.

> > (I still believe that it is a bad idea that SSA_NAMEs are strongly typed,
> > encoding the type in operations would be more convenient, but I think the
> > time for that choice has long gone)
> >
> > --
> > Marc Glisse
>
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao
From 60308636a36fa7a5b96d115452a42be914ef19e7 Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Thu, 27 May 2021 15:21:06 +0800
Subject: [PATCH] Extend gimple simplification of ((~X) op C --> (X) op' ~C) to
 handle view_convert of ~X

gcc/ChangeLog:

	PR middle-end/100738
	* match.pd (Fold ~X op C as X op' ~C): Extend GIMPLE
	simplification to handle view_convert ~X.

gcc/testsuite/ChangeLog:

	PR middle-end/100738
	* g++.target/i386/avx2-pr100738-1.C: New test.
	* g++.target/i386/sse4_1-pr100738-1.C: New test.
---
 gcc/match.pd  |   5 +-
 .../g++.target/i386/avx2-pr100738-1.C | 120 ++
 .../g++.target/i386/sse4_1-pr100738-1.C   | 120 ++
 3 files changed, 243 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/avx2-pr100738-1.C
 create mode 100644 gcc/testsuite/g++.target/i386/sse4_1-pr100738-1.C

diff --git a/gcc/match.pd b/gcc/match.pd
index cdb87636951..cbb76d67dc5 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4144,10 +4144,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (for cmp (simple_comparison)
  scmp (swapped_simple_comparison)
  (simplify
-  (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1)
+  (cmp (view_convert1? (bit_not@2 @0)) CONSTANT_CLASS_P@1)
+  (with {tree ttype = TREE_TYPE (@1);}
   (if (single_use (@2)
&& (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST))
-   (scmp @0 (bit_not @1)))))
+   (scmp (view_convert:ttype @0) (bit_not @1))))))
 
 (for cmp (simple_comparison)
  /* Fold (double)float1 CMP (double)float2 into float1 CMP float2.  */
diff --git a/gcc/testsuite/g++.target/i386/avx2-pr100738-1.C b/gcc/testsuite/g++.target/i386/avx2-pr100738-1.C
new file mode 100644
index 000..80fdad3e5f0
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx2-pr100738-1.C
@@ -0,0 +1,120 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx2 -std=c++14 -O2 -mno-avx512f -mno-xop" } */
+/* { dg-final { scan-assembler-not "pxor" } } */
+/* { dg-final { scan-assembler-not "pcmpgt\[bdq]" } } */
+/* { dg-final { scan-assembler-times "pblendvb" 6 } } */
+/* { dg-final { scan-assembler-times "blendvps" 6 } } */
+/* { dg-final { scan-assembler-times "blendvpd" 6 } } */
+
+typedef char v32qi __attribute__ ((vector_size (32)));
+typedef short v16hi __attribute__ ((vector_size (32)));
+typedef int v8si __attribute__ ((vector_size (32)));
+typedef long long v4di __attribute__ ((vector_size (32)));
+
+v8si
+f1 (v32qi a, v8si b, v8si c)
+{
+  return ((v8si)~a) < 0 ? b : c;
+}
+
+v4di
+f2 (v32qi a, v4di b, v4di c)
+{
+  return ((v4di)~a) < 0 ? b : c;
+}
+
+v32qi
+f3 (v16hi a, v32qi b, v32qi c)
+{
+  return ((v32qi)~a) < 0 ? b : c;
+}
+
+v8si
+f4 (v16hi a, v8si b, v8si c)
+{
+  return ((v8si)~a) < 0 ? b : c;
+}
+
+v4di
+f5 (v16hi a, v4di b, v4di c)
+{
+  return ((v4di)~a) < 0 ? b : c;
+}
+
+v32qi
+f6 (v8si a, v32qi b, v32qi c)
+{
+  return ((v32qi)~a) < 0 ? b : c;

[PATCH V3] Split loop for NE condition.

2021-06-04 Thread Jiufu Guo via Gcc-patches
Update the patch since v2:
. Check index and bound from gcond before checking if wrap.
. Update test case, and add an executable case.
. Refine code comments.
. Enhance the checking for i++/++i in the loop header.
. Enhance code to handle equal condition on exit

Bootstrap and regtest pass on powerpc64le, and also pass regtest
on bootstrap-O3. Is this ok for trunk?

BR.
Jiufu Guo.


When the loop index may wrap, a few optimizations do not happen.
For example:

foo (int *a, int *b, unsigned k, unsigned n)
{
  while (++k != n)
a[k] = b[k]  + 1;
}

For this code, if "k > n", k would wrap.  If "k < n" at the beginning,
it could be optimized (e.g. vectorized).

We can split the loop into two loops:

  while (++k > n)
a[k] = b[k]  + 1;
  while (k++ < n)
a[k] = b[k]  + 1;

This patch splits this kind of loop to achieve better performance.
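
The equivalence can be checked by hand on a narrow index type.  The sketch
below is a hand-written illustration, not the output of the pass: the second
loop is spelled as a plain "for" so that both versions visit exactly the same
indices, the index is an unsigned char so that wrapping happens at 256, and
all function names are made up.

```c
#include <string.h>

/* Original form: the exit test is "!=", so the index may wrap.  */
static void
fill_ne (int *a, const int *b, unsigned char k, unsigned char n)
{
  while (++k != n)
    a[k] = b[k] + 1;
}

/* Hand-split form.  */
static void
fill_split (int *a, const int *b, unsigned char k, unsigned char n)
{
  /* Runs only while the incremented index is above N, i.e. the part
     that ends with the index wrapping to 0.  */
  while (++k > n)
    a[k] = b[k] + 1;
  /* Here K <= N, so this part cannot wrap and uses a "<" exit test.  */
  for (; k < n; ++k)
    a[k] = b[k] + 1;
}

/* Return 1 if both forms write the same elements for start K, bound N.  */
static int
split_matches (unsigned char k, unsigned char n)
{
  int b[256], a1[256], a2[256];

  for (int i = 0; i < 256; i++)
    {
      b[i] = i * 3;
      a1[i] = a2[i] = -1;
    }
  fill_ne (a1, b, k, n);
  fill_split (a2, b, k, n);
  return memcmp (a1, a2, sizeof a1) == 0;
}
```

Calls such as split_matches (3, 8) and split_matches (8, 3) cover the
non-wrapping and wrapping cases; only the second loop of fill_split has an
exit test of the kind that later passes can analyse without worrying about
wrap.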

gcc/ChangeLog:

2021-06-04  Jiufu Guo  

* tree-ssa-loop-split.c (connect_loop_phis): Add new param.
(get_ne_cond_branch): New function.
(split_ne_loop): New function.
(split_loop_on_ne_cond): New function.
(tree_ssa_split_loops): Use split_loop_on_ne_cond.

gcc/testsuite/ChangeLog:

2021-06-04  Jiufu Guo  

* gcc.dg/loop-split1.c: New test.
* gcc.dg/loop-split2.c: New test.
* g++.dg/vect/pr98064.cc: Suppress warning.

---
 gcc/testsuite/g++.dg/vect/pr98064.cc |   4 +-
 gcc/testsuite/gcc.dg/loop-split1.c   | 101 +++
 gcc/testsuite/gcc.dg/loop-split2.c   |  54 ++
 gcc/tree-ssa-loop-split.c| 251 ++-
 4 files changed, 404 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/loop-split1.c
 create mode 100644 gcc/testsuite/gcc.dg/loop-split2.c

diff --git a/gcc/testsuite/g++.dg/vect/pr98064.cc b/gcc/testsuite/g++.dg/vect/pr98064.cc
index 74043ce7725..dcb2985d05a 100644
--- a/gcc/testsuite/g++.dg/vect/pr98064.cc
+++ b/gcc/testsuite/g++.dg/vect/pr98064.cc
@@ -1,5 +1,7 @@
 // { dg-do compile }
-// { dg-additional-options "-O3" }
+// { dg-additional-options "-O3 -Wno-stringop-overflow" }
+/* There is warning message when "short g = var_8; g; g++"
+   is optimized/analyzed as string operation,e.g. memset.  */
 
 const long long &min(const long long &__a, long long &__b) {
   if (__b < __a)
diff --git a/gcc/testsuite/gcc.dg/loop-split1.c b/gcc/testsuite/gcc.dg/loop-split1.c
new file mode 100644
index 000..dd2d03a7b96
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/loop-split1.c
@@ -0,0 +1,101 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fsplit-loops -fdump-tree-lsplit-details" } */
+
+void
+foo (int *a, int *b, unsigned l, unsigned n)
+{
+  while (++l != n)
+    a[l] = b[l] + 1;
+}
+void
+foo_1 (int *a, int *b, unsigned n)
+{
+  unsigned l = 0;
+  while (++l != n)
+    a[l] = b[l] + 1;
+}
+
+void
+foo1 (int *a, int *b, unsigned l, unsigned n)
+{
+  while (l++ != n)
+    a[l] = b[l] + 1;
+}
+
+/* No wrap.  */
+void
+foo1_1 (int *a, int *b, unsigned n)
+{
+  unsigned l = 0;
+  while (l++ != n)
+    a[l] = b[l] + 1;
+}
+
+unsigned
+foo2 (char *a, char *b, unsigned l, unsigned n)
+{
+  while (++l != n)
+    if (a[l] != b[l])
+      break;
+
+  return l;
+}
+
+unsigned
+foo2_1 (char *a, char *b, unsigned l, unsigned n)
+{
+  l = 0;
+  while (++l != n)
+    if (a[l] != b[l])
+      break;
+
+  return l;
+}
+
+unsigned
+foo3 (char *a, char *b, unsigned l, unsigned n)
+{
+  while (l++ != n)
+    if (a[l] != b[l])
+      break;
+
+  return l;
+}
+
+/* No wrap.  */
+unsigned
+foo3_1 (char *a, char *b, unsigned l, unsigned n)
+{
+  l = 0;
+  while (l++ != n)
+    if (a[l] != b[l])
+      break;
+
+  return l;
+}
+
+void
+bar ();
+void
+foo4 (unsigned n, unsigned i)
+{
+  do
+    {
+      if (i == n)
+        return;
+      bar ();
+      ++i;
+    }
+  while (1);
+}
+
+unsigned
+find_skip_diff (char *p, char *q, unsigned n, unsigned i)
+{
+  while (p[i] == q[i] && ++i != n)
+    p++, q++;
+
+  return i;
+}
+
+/* { dg-final { scan-tree-dump-times "Loop split" 8 "lsplit" } } */
diff --git a/gcc/testsuite/gcc.dg/loop-split2.c b/gcc/testsuite/gcc.dg/loop-split2.c
new file mode 100644
index 000..0d3fded3f61
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/loop-split2.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+extern void abort (void);
+extern void exit (int);
+
+#define NI __attribute__ ((noinline))
+
+void NI
+foo (int *a, int *b, unsigned char l, unsigned char n)
+{
+  while (++l != n)
+    a[l] = b[l] + 1;
+}
+
+unsigned NI
+bar (int *a, int *b, unsigned char l, unsigned char n)
+{
+  while (l++ != n)
+    if (a[l] != b[l])
+      break;
+
+  return l;
+}
+
+int a[258];
+int b[258];
+
+int main()
+{
+  __builtin_memcpy (b, a, sizeof (a));
+
+  if (bar (a, b, 3, 8) != 9)
+    abort ();
+
+  if (bar (a, b, 8, 3) != 4)
+    abort ();
+
+  b[100] += 1;
+  if (bar (a, b, 90, 110) != 100)
+    abort ();
+
+  if (bar (a, b, 110, 105) != 100)
+    abort ();
+
+  foo (a, b, 99, 99);
+  a[9

Re: RFC: Sphinx for GCC documentation

2021-06-04 Thread Tobias Burnus

Hello,

On 13.05.21 13:45, Martin Liška wrote:

On 4/1/21 3:30 PM, Martin Liška wrote:

That said, I'm asking the GCC community for a green light before I invest
more time on it?

So far, I've received only a little feedback about the transition, most of
it positive.

[1] https://splichal.eu/scripts/sphinx/


The HTML output looks quite nice.

What I observed:

* Looking at
  https://splichal.eu/scripts/sphinx/gfortran/_build/html/intrinsic-procedures/access-checks-file-access-modes.html
  why is the first argument description in bold?
  It is also not very readable to have a scrollbar there; line breaks would be
  better.
  → I think that's because the assumption is that the first line contains a header
    and the rest the data.

* https://splichal.eu/scripts/sphinx/gfortran/_build/latex/gfortran.pdf
  If I look at page 92 (alias 96), 8.2.13 _gfortran_caf_sendget, the first column
  is too narrow to fit the argument names. Admittedly, the current gfortran.pdf
  is not much better: it is very tight but just fits. I don't know how to fix this.

* I note that we write, before the argument index, that those are listed without
  the -/-- prefix, but that's not true. Something to fix after the conversion.

* The syntax highlighting for gfortran is odd. Looking at @smallexample:
- intrinsic.texi: All Fortran examples (F90/free-form)
- gfc-internals.texi: 4x Fortran, 4x C, 3x plain text
- gfortran.texi: Shell, Fortran, C, plain text.
- invoke.texi: 4x Shell, 2x C, 4x Fortran
Does not seem to be that simple, but it would be nice if at least all in
intrinsic.texi would be marked as Fortran.

Actually, I do not quite understand when the output is formatted as C (wrongly
or rightly), as Fortran (rarely but correctly), as plain text, or in some odd
formatting which randomly highlights some examples.
Possibly also an item for after the conversion.

Tobias



Re: [PATCH 1/2] CALL_INSN may not be a real function call.

2021-06-04 Thread Jakub Jelinek via Gcc-patches
On Thu, Jun 03, 2021 at 02:54:07PM +0800, liuhongt wrote:
> Use "used" flag for CALL_INSN to indicate it's a fake call. If it's a
> fake call, it won't have its own function stack.
> 
> gcc/ChangeLog
> 
>   PR target/82735
>   * df-scan.c (df_get_call_refs): When call_insn is a fake call,
>   it won't use stack pointer reg.
>   * final.c (leaf_function_p): When call_insn is a fake call, it
>   won't affect caller as a leaf function.
>   * reg-stack.c (callee_clobbers_any_stack_reg): New.
>   (subst_stack_regs): When call_insn doesn't clobber any stack
>   reg, don't clear the arguments.
>   * rtl.c (shallow_copy_rtx): Don't clear flag used when orig is
>   a insn.
>   * shrink-wrap.c (requires_stack_frame_p): No need for stack
>   frame for a fake call.
>   * rtl.h (FAKE_CALL_P): New macro.

Ok, thanks.

Jakub



Re: [PATCH] [i386] Fix ICE of insn does not satisfy its constraints.

2021-06-04 Thread Jakub Jelinek via Gcc-patches
On Fri, Jun 04, 2021 at 01:03:58AM +, Liu, Hongtao wrote:
> Thanks for the review.
> Yes, you're right, AVX512VL parts are already guaranteed by 
> ix86_hard_regno_mode_ok.
> 
> Here is updated patch.

One remaining thing, could you try to modify the testcase back to
#include  and using intrinsics instead of the target builtins,
so that next time we replace some builtins we don't have to adjust the
testcase (and of course verify that without your patch it still ICEs and
with your patch it doesn't)?

Ok for trunk with that change.

Jakub



Re: [ARM] PR98435: Missed optimization in expanding vector constructor

2021-06-04 Thread Christophe Lyon via Gcc-patches
On Fri, 4 Jun 2021 at 09:27, Prathamesh Kulkarni via Gcc-patches
 wrote:
>
> Hi,
> As mentioned in PR, for the following test-case:
>
> #include 
>
> bfloat16x4_t f1 (bfloat16_t a)
> {
>   return vdup_n_bf16 (a);
> }
>
> bfloat16x4_t f2 (bfloat16_t a)
> {
>   return (bfloat16x4_t) {a, a, a, a};
> }
>
> Compiling with arm-linux-gnueabi -O3 -mfpu=neon -mfloat-abi=softfp
> -march=armv8.2-a+bf16+fp16 results in f2 not being vectorized:
>
> f1:
>         vdup.16 d16, r0
>         vmov    r0, r1, d16  @ v4bf
>         bx      lr
>
> f2:
>         mov     r3, r0  @ __bf16
>         adr     r1, .L4
>         ldrd    r0, [r1]
>         mov     r2, r3  @ __bf16
>         mov     ip, r3  @ __bf16
>         bfi     r1, r2, #0, #16
>         bfi     r0, ip, #0, #16
>         bfi     r1, r3, #16, #16
>         bfi     r0, r2, #16, #16
>         bx      lr
>
> This seems to happen because vec_init pattern in neon.md has VDQ mode
> iterator, which doesn't include V4BF. In attached patch, I changed mode
> to VDQX which seems to work for the test-case, and the compiler now generates:
>
> f2:
>         vdup.16 d16, r0
>         vmov    r0, r1, d16  @ v4bf
>         bx      lr
>
> However, the pattern is also gated on TARGET_HAVE_MVE and I am not
> sure if either VDQ or VDQX are correct modes for MVE since MVE has
> only 128-bit vectors ?
>

I think patterns common to both Neon and MVE should be moved to
vec-common.md; I don't know why such patterns were left in neon.md.

That being said, I suggest you look at other similar patterns in
vec-common.md, most of which are gated on
ARM_HAVE__ARITH
and possibly beware of issues with iwmmxt :-)

Christophe

> Thanks,
> Prathamesh


[PATCH] arc: Add --with-fpu support for ARCv2 cpus

2021-06-04 Thread Claudiu Zissulescu via Gcc-patches
Hi Jeff,

I would like to add support for selecting the ARCv2 FPU extension at
configuration time.

The --with-fpu configuration option is ignored when -mfpu compiler
option is specified.
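
The precedence rule can be modelled with a small C sketch. This is only an illustration of the intended behaviour, not code from the driver: effective_fpu is a hypothetical helper, and the model assumes that when several -mfpu= options are given the last one takes effect.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GCC.  Return the FPU name in effect:
   an explicit -mfpu=NAME on the command line wins, otherwise the value
   configured with --with-fpu is used.  CONFIGURED_DEFAULT may be NULL
   when --with-fpu was not given.  */
const char *
effective_fpu (int argc, const char *const *argv,
               const char *configured_default)
{
  const char *result = configured_default;

  assert (argc == 0 || argv != NULL);

  for (int i = 0; i < argc; i++)
    if (strncmp (argv[i], "-mfpu=", 6) == 0)
      result = argv[i] + 6; /* modelling assumption: the last -mfpu= wins */

  return result;
}
```

With a default of "fpus", a command line without -mfpu yields "fpus", while one containing -mfpu=fpud_all yields "fpud_all".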

My concern is using `grep -P` when configuring. Is that ok?

Thanks,
Claudiu

gcc/
-mm-dd  Claudiu Zissulescu  

* config.gcc (arc): Add support for with_cpu option.
* config/arc/arc.h (OPTION_DEFAULT_SPECS): Add fpu.

Signed-off-by: Claudiu Zissulescu 
---
 gcc/config.gcc   | 56 ++--
 gcc/config/arc/arc.h |  4 
 2 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 610422fb29ee..f46b5e79af69 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4258,18 +4258,70 @@ case "${target}" in
;;
 
arc*-*-*)
-   supported_defaults="cpu"
+   supported_defaults="cpu fpu"
 
+   new_cpu=hs38_linux
if [ x"$with_cpu" = x ] \
|| grep "^ARC_CPU ($with_cpu," \
   ${srcdir}/config/arc/arc-cpus.def \
   > /dev/null; then
 # Ok
-true
+new_cpu=$with_cpu
else
 echo "Unknown cpu used in --with-cpu=$with_cpu" 1>&2
 exit 1
fi
+
+   # see if --with-fpu matches any of the supported FPUs
+   case "$with_fpu" in
+   "")
+   # OK
+   ;;
+   fpus | fpus_div | fpus_fma | fpus_all)
+   # OK if em or hs
+   if grep -P "^ARC_CPU \($new_cpu,\s+[emhs]+," \
+  ${srcdir}/config/arc/arc-cpus.def \
+  > /dev/null; then
+  # OK
+  true
+   else
+echo "Unknown floating point type used in "\
+"--with-fpu=$with_fpu for cpu $new_cpu" 1>&2
+exit 1
+   fi
+   ;;
+   fpuda | fpuda_div | fpuda_fma | fpuda_all)
+   # OK only em
+   if grep -P "^ARC_CPU \($new_cpu,\s+em," \
+  ${srcdir}/config/arc/arc-cpus.def \
+  > /dev/null; then
+  # OK
+  true
+   else
+echo "Unknown floating point type used in "\
+ "--with-fpu=$with_fpu for cpu $new_cpu" 1>&2
+exit 1
+   fi
+   ;;
+   fpud | fpud_div | fpud_fma | fpud_all)
+   # OK only hs
+   if grep -P "^ARC_CPU \($new_cpu,\s+hs," \
+  ${srcdir}/config/arc/arc-cpus.def \
+  > /dev/null; then
+  # OK
+  true
+   else
+echo "Unknown floating point type used in"\
+ "--with-fpu=$with_fpu for cpu $new_cpu" 1>&2
+exit 1
+   fi
+   ;;
+   *)
+   echo "Unknown floating point type used in "\
+"--with-fpu=$with_fpu" 1>&2
+   exit 1
+   ;;
+   esac
;;
 
 csky-*-*)
diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
index 722bb10b8813..b9c4ba0398e5 100644
--- a/gcc/config/arc/arc.h
+++ b/gcc/config/arc/arc.h
@@ -100,7 +100,11 @@ extern const char *arc_cpu_to_as (int argc, const char **argv);
   "%:cpu_to_as(%{mcpu=*:%*}) %{mspfp*} %{mdpfp*} "  \
   "%{mfpu=fpuda*:-mfpuda} %{mcode-density}"
 
+/* Support for a compile-time default CPU and FPU.  The rules are:
+   --with-cpu is ignored if -mcpu, mARC*, marc*, mA7, mA6 are specified.
+   --with-fpu is ignored if -mfpu is specified.  */
 #define OPTION_DEFAULT_SPECS   \
+  {"fpu", "%{!mfpu=*:-mfpu=%(VALUE)}"}, \
   {"cpu", "%{!mcpu=*:%{!mARC*:%{!marc*:%{!mA7:%{!mA6:-mcpu=%(VALUE)}" }
 
 #ifndef DRIVER_ENDIAN_SELF_SPECS
-- 
2.31.1



Re: GCC documentation: porting to Sphinx

2021-06-04 Thread Martin Liška

On 6/3/21 7:16 PM, Joseph Myers wrote:

On Thu, 3 Jun 2021, Martin Liška wrote:


On 6/2/21 6:44 PM, Joseph Myers wrote:

On Wed, 2 Jun 2021, Joel Sherrill wrote:


For RTEMS, we switched from texinfo to Sphinx and the dependency
on Python3 for Sphinx has caused a bit of hassle. Is this going to be
an issue for GCC?


What Sphinx (and, thus, Python) versions does the GCC manual build work
with?


I've just tried version 1.7.6 which we use for libgccjit and it's fine:
https://gcc.gnu.org/onlinedocs/jit/

About the Python version: I'm not planning to support Python 2; it has been
dead for 10 years already.


There should be appropriate configure checks to avoid building manuals
with too-old versions (i.e. disable the info/man manual build/install when
Sphinx, or the Python version it's using, is too old or missing, not fail
configure).


Sure, that makes sense.



Actually this code is depending on Python 3.6 or later because of the use
of an f-string in baseconf.py (without that f-string, it works with older
versions, even 2.7).


Yeah, I used the f-string syntax in only one place and it does not pay off.


Formally 3.5 and older are no longer supported
upstream, but certainly still present in some maintained long-term-support
distribution versions.


Makes sense.




I would recommend testing the build. You can simply clone:
https://github.com/marxin/texi2rst-generated

and simply run 'make html' or 'make latexpdf'. Basic dependencies are
mentioned here:
https://github.com/marxin/texi2rst-generated#requirements


It appears "make html" works (with lots of WARNINGs) with Sphinx 1.6.1 but
fails with 1.4 ("Theme error: unsupported theme option
'prev_next_buttons_location' given").



I checked that and the template needs at least version 1.6:
https://sphinx-rtd-theme.readthedocs.io/en/latest/installing.html#compatibility

so I added needs_sphinx to baseconf.py:
https://www.sphinx-doc.org/en/master/usage/configuration.html?highlight=conf.py#confval-needs_sphinx

The following message is displayed when one builds a manual:

$ make html
sphinx-build -b "html" -d _build/doctrees . "_build/html"
Running Sphinx v4.0.2

Sphinx version error:
This project needs at least Sphinx v66.6 and therefore cannot be built with this version.
make: *** [Makefile:96: html] Error 2


Martin


[ARM] PR98435: Missed optimization in expanding vector constructor

2021-06-04 Thread Prathamesh Kulkarni via Gcc-patches
Hi,
As mentioned in PR, for the following test-case:

#include 

bfloat16x4_t f1 (bfloat16_t a)
{
  return vdup_n_bf16 (a);
}

bfloat16x4_t f2 (bfloat16_t a)
{
  return (bfloat16x4_t) {a, a, a, a};
}

Compiling with arm-linux-gnueabi -O3 -mfpu=neon -mfloat-abi=softfp
-march=armv8.2-a+bf16+fp16 results in f2 not being vectorized:

f1:
        vdup.16 d16, r0
        vmov    r0, r1, d16  @ v4bf
        bx      lr

f2:
        mov     r3, r0  @ __bf16
        adr     r1, .L4
        ldrd    r0, [r1]
        mov     r2, r3  @ __bf16
        mov     ip, r3  @ __bf16
        bfi     r1, r2, #0, #16
        bfi     r0, ip, #0, #16
        bfi     r1, r3, #16, #16
        bfi     r0, r2, #16, #16
        bx      lr

This seems to happen because vec_init pattern in neon.md has VDQ mode
iterator, which doesn't include V4BF. In attached patch, I changed mode
to VDQX which seems to work for the test-case, and the compiler now generates:

f2:
        vdup.16 d16, r0
        vmov    r0, r1, d16  @ v4bf
        bx      lr

However, the pattern is also gated on TARGET_HAVE_MVE and I am not
sure if either VDQ or VDQX are correct modes for MVE since MVE has
only 128-bit vectors ?
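
For readers without an Arm toolchain, the equivalence that the vec_init expansion should exploit can be sketched with the GCC generic vector extension. This is a target-independent stand-in, not the test-case from the PR: v4si (four int lanes) replaces bfloat16x4_t, and the function names are made up for illustration.

```c
#include <assert.h>

/* Four int lanes in a 128-bit vector; a stand-in for bfloat16x4_t.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Constructor form, analogous to f2: every element is the same value.  */
v4si
splat_ctor (int a)
{
  return (v4si) {a, a, a, a};
}

/* Lane-by-lane form: fill the vector one element at a time.  */
v4si
splat_lanes (int a)
{
  v4si v = {0, 0, 0, 0};

  for (int i = 0; i < 4; i++)
    v[i] = a;
  return v;
}

/* Return 1 if every lane of X equals the matching lane of Y.  */
int
v4si_equal (v4si x, v4si y)
{
  for (int i = 0; i < 4; i++)
    if (x[i] != y[i])
      return 0;
  return 1;
}

/* Usage: both forms must describe the same vector value.  */
void
check_splat (int a)
{
  assert (v4si_equal (splat_ctor (a), splat_lanes (a)));
}
```

Since a constructor with identical elements denotes the same value as a duplicate of that element, a target with a dup instruction can expand it as a single broadcast.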

Thanks,
Prathamesh


[committed] arc: Don't allow millicode thunks with reduced register set CPUs.

2021-06-04 Thread Claudiu Zissulescu via Gcc-patches
The millicode thunks are not safe to use with the reduced register set
(RF16).  Disable them for CPUs that have this option enabled.

gcc/
2021-06-04  Claudiu Zissulescu  

* config/arc/arc.c (arc_override_options): Disable millicode
thunks when RF16 is on.

Signed-off-by: Claudiu Zissulescu 
---
 gcc/config/arc/arc.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index b77d0566386..0d34c964963 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -1451,8 +1451,10 @@ arc_override_options (void)
   if (TARGET_ARC700 && (arc_tune != ARC_TUNE_ARC7XX))
 flag_delayed_branch = 0;
 
-  /* Millicode thunks doesn't work with long calls.  */
-  if (TARGET_LONG_CALLS_SET)
+  /* Millicode thunks doesn't work for long calls.  */
+  if (TARGET_LONG_CALLS_SET
+  /* neither for RF16.  */
+  || TARGET_RF16)
 target_flags &= ~MASK_MILLICODE_THUNK_SET;
 
   /* Set unaligned to all HS cpus.  */
-- 
2.31.1