date:20210519

Re: [PATCH v4 12/12] constructor: Check if it is faster to load constant from memory

2021-05-19 Thread Richard Biener via Gcc-patches

On Wed, May 19, 2021 at 9:05 PM H.J. Lu  wrote:
>
> On Wed, May 19, 2021 at 6:27 AM Bernd Edlinger
>  wrote:
> >
> > On 5/19/21 3:22 PM, H.J. Lu wrote:
> > > On Wed, May 19, 2021 at 2:33 AM Richard Biener
> > >  wrote:
> > >>
> > >> On Tue, May 18, 2021 at 9:16 PM H.J. Lu  wrote:
> > >>>
> > >>> When expanding a constant constructor, don't call expand_constructor if
> > >>> it is more efficient to load the data from the memory via move by 
> > >>> pieces.
> > >>>
> > >>> gcc/
> > >>>
> > >>> PR middle-end/90773
> > >>> * expr.c (expand_expr_real_1): Don't call expand_constructor if
> > >>> it is more efficient to load the data from the memory.
> > >>>
> > >>> gcc/testsuite/
> > >>>
> > >>> PR middle-end/90773
> > >>> * gcc.target/i386/pr90773-24.c: New test.
> > >>> * gcc.target/i386/pr90773-25.c: Likewise.
> > >>> ---
> > >>>  gcc/expr.c | 10 ++
> > >>>  gcc/testsuite/gcc.target/i386/pr90773-24.c | 22 ++
> > >>>  gcc/testsuite/gcc.target/i386/pr90773-25.c | 20 
> > >>>  3 files changed, 52 insertions(+)
> > >>>  create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-24.c
> > >>>  create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-25.c
> > >>>
> > >>> diff --git a/gcc/expr.c b/gcc/expr.c
> > >>> index d09ee42e262..80e01ea1cbe 100644
> > >>> --- a/gcc/expr.c
> > >>> +++ b/gcc/expr.c
> > >>> @@ -10886,6 +10886,16 @@ expand_expr_real_1 (tree exp, rtx target, 
> > >>> machine_mode tmode,
> > >>> unsigned HOST_WIDE_INT ix;
> > >>> tree field, value;
> > >>>
> > >>> +   /* Check if it is more efficient to load the data from
> > >>> +  the memory directly.  FIXME: How many stores do we
> > >>> +  need here if not moved by pieces?  */
> > >>> +   unsigned HOST_WIDE_INT bytes
> > >>> + = tree_to_uhwi (TYPE_SIZE_UNIT (type));
> > >>
> > >> that's prone to fail - it could be a VLA.
> > >
> > > What do you mean by fail?  Is it ICE or missed optimization?
> > > Do you have a testcase?
> > >
> >
> > I think for a VLA the TYPE_SIZE_UNIT may be unknown (NULL), or something 
> > like "x".
> >
> > for instance something like
> >
> > int test (int x)
> > {
> >   int vla[x];
> >
> >   vla[x-1] = 0;
> >   return vla[x-1];
> > }
>
> My patch changes the CONSTRUCTOR code path.   I couldn't find a CONSTRUCTOR
> testcase with VLA.

nevertheless it doesn't hurt to check tree_fits_uhwi_p (TYPE_SIZE_UNIT (type)),
there's also int_size_in_bytes () returning a signed HOST_WIDE_INT and -1
on "failure" that would work well in your case.
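
The guard suggested above can be sketched as a standalone C model; this is
not GCC code. MODEL_UNITS_PER_WORD, MODEL_MOVE_MAX_PIECES and
prefer_load_from_memory are hypothetical stand-ins with made-up values. The
size argument follows the convention described for int_size_in_bytes (a
signed value, -1 when the size is not a known constant). The
can_move_by_pieces check from the patch is left out.

```c
#include <stdbool.h>

/* Hypothetical target parameters; real values come from the GCC backend. */
enum { MODEL_UNITS_PER_WORD = 8, MODEL_MOVE_MAX_PIECES = 16 };

/* Return true when the model would prefer loading the constant from memory.
   A negative size means "not a compile-time constant" (for example a VLA),
   so the heuristic is skipped instead of converting a bogus value.  */
bool prefer_load_from_memory(long long size_in_bytes)
{
    if (size_in_bytes < 0)
        return false;

    unsigned long long bytes = (unsigned long long)size_in_bytes;

    /* More than two words to copy, and the target can move more than
       one word per instruction.  */
    return bytes / MODEL_UNITS_PER_WORD > 2
           && MODEL_MOVE_MAX_PIECES > MODEL_UNITS_PER_WORD;
}
```

With these assumed values a 24-byte object takes the memory path, while a
16-byte object or an object of unknown size does not.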

> >
> > Bernd.
> >
> > >>
> > >>> +   if ((bytes / UNITS_PER_WORD) > 2
> > >>> +   && MOVE_MAX_PIECES > UNITS_PER_WORD
> > >>> +   && can_move_by_pieces (bytes, TYPE_ALIGN (type)))
> > >>> + goto normal_inner_ref;
> > >>> +
> > >>
> > >> It looks like you're concerned about aggregate copies but this also 
> > >> handles
> > >> non-aggregates (which on GIMPLE might already be optimized of course).
> > >
> > > Here I check if we copy more than 2 words and we can move more than
> > > a word in a single instruction.
> > >
> > >> Also you say "if it's cheaper" but I see no cost considerations.  How do
> > >> we generally handle immed const vs. load from constant pool costs?
> > >
> > > This trades 2 (up to 8) stores with one load plus one store.  Is there
> > > a way to check which one is faster?
> > >
> > >>> FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (init), ix,
> > >>>   field, value)
> > >>>   if (tree_int_cst_equal (field, index))
> > >>> diff --git a/gcc/testsuite/gcc.target/i386/pr90773-24.c 
> > >>> b/gcc/testsuite/gcc.target/i386/pr90773-24.c
> > >>> new file mode 100644
> > >>> index 000..4a4b62533dc
> > >>> --- /dev/null
> > >>> +++ b/gcc/testsuite/gcc.target/i386/pr90773-24.c
> > >>> @@ -0,0 +1,22 @@
> > >>> +/* { dg-do compile } */
> > >>> +/* { dg-options "-O2 -march=x86-64" } */
> > >>> +
> > >>> +struct S
> > >>> +{
> > >>> +  long long s1 __attribute__ ((aligned (8)));
> > >>> +  unsigned s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12, s13, s14;
> > >>> +};
> > >>> +
> > >>> +const struct S array[] = {
> > >>> +  { 0, 60, 640, 2112543726, 39682, 48, 16, 33, 10, 96, 2, 0, 0, 4 }
> > >>> +};
> > >>> +
> > >>> +void
> > >>> +foo (struct S *x)
> > >>> +{
> > >>> +  x[0] = array[0];
> > >>> +}
> > >>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
> > >>> \\(%\[\^,\]+\\)" 1 } } */
> > >>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
> > >>> 16\\(%\[\^,\]+\\)" 1 } } */
> > >>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
> > >>> 32\\(%\[\^,\]+\\)" 1 } } */
> > >>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
> > >>> 48\\(%\[\^,\]+\\)" 1

Aw: [Patch] Testsuite/Fortran: gfortran.dg/pr96711.f90 - fix expected value for PowerPC [PR96983]

2021-05-19 Thread Harald Anlauf via Gcc-patches

Hi Tobias,

> @All, Harald: Does the attached patch make sense?

first of all: sorry for the really badly designed testcase.
The best solution has already been discussed in this thread,
which is to replace

  integer(16), parameter :: m1 = 9007199254740992_16!2**53
  integer(16), parameter :: m2 = 10384593717069655257060992658440192_16 !2**113

by

  integer(16), parameter :: m1 = 2_16 ** digits (x) ! IEEE: 2**53
  integer(16), parameter :: m2 = 2_16 ** digits (y) ! IEEE: 2**113

The motivation was to test that compile-time and run-time produce the
same correct result, as well as verifying that the user gets what he/she
would naively expect from the intrinsic.  There should be no hidden
double conversion that might e.g. truncate.

I decided on the largest real values which are exactly representable
also as integer, and for which the rounding operation should always
reproduce the expected result.

E.g.  nint (x) - nint (x-1) should give 1, while nint (x) - nint (x+1)
might give 0, which happens for the chosen values on x86.

I had this in my mind, but decided to drop this because I had no idea
if there are exotic, non-IEEE, or other implementations which would
fail on this.

Thanks for fixing this!

Harald
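
The rounding behaviour described in this message has a close analogue in C,
sketched here under the assumption of IEEE binary64 doubles with the default
round-to-nearest-even mode. The function names are hypothetical, and this is
a C illustration of the idea, not the Fortran testcase itself.

```c
#include <float.h>

/* Build 2**DBL_MANT_DIG by repeated doubling; every step is exact.  */
double two_to_mant_dig(void)
{
    double v = 1.0;
    for (int i = 0; i < DBL_MANT_DIG; i++)
        v *= 2.0;
    return v;
}

/* Integer difference between x and x-1 at x = 2**DBL_MANT_DIG.
   x-1 is still exactly representable, so the difference is 1.
   volatile forces each value to be stored as a plain double.  */
long long step_below(void)
{
    volatile double x = two_to_mant_dig();
    volatile double below = x - 1.0;
    return (long long)x - (long long)below;
}

/* Integer difference between x and x+1 at the same x.
   x+1 is not representable and rounds back to x, so the difference is 0.  */
long long step_above(void)
{
    volatile double x = two_to_mant_dig();
    volatile double above = x + 1.0;
    return (long long)x - (long long)above;
}
```

Under those assumptions step_below() yields 1 and step_above() yields 0,
which is why a value of the form 2**digits sits at the edge of the exactly
representable integers.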

[i386] [PATCH] Fix ICE when lhs is NULL [PR target/100660]

2021-05-19 Thread Hongtao Liu via Gcc-patches

Hi:
  In folding target-specific builtin, when lhs is NULL, create a
temporary variable for it.
  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}

gcc/ChangeLog:
PR target/100660
* config/i386/i386.c (ix86_gimple_fold_builtin): Create a tmp
variable for lhs when it doesn't exist.

gcc/testsuite/ChangeLog:
PR target/100660
* gcc.target/i386/pr100660.c: New test.



-- 
BR,
Hongtao
From b32791645429e3e25c46f56e2b81ffab7863afde Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Thu, 20 May 2021 09:59:36 +0800
Subject: [PATCH] Fix ICE when lhs is NULL.

gcc/ChangeLog:
	PR target/100660
	* config/i386/i386.c (ix86_gimple_fold_builtin): Create a tmp
	variable for lhs when it doesn't exist.

gcc/testsuite/ChangeLog:
	PR target/100660
	* gcc.target/i386/pr100660.c: New test.
---
 gcc/config/i386/i386.c   |  5 -
 gcc/testsuite/gcc.target/i386/pr100660.c | 10 ++
 2 files changed, 14 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100660.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 743d8a25fe3..705257c414f 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -18000,7 +18000,10 @@ ix86_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 	gimple_seq stmts = NULL;
 	tree cmp = gimple_build (&stmts, tcode, cmp_type, arg0, arg1);
 	gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
-	gimple* g = gimple_build_assign (gimple_call_lhs (stmt),
+	tree lhs = gimple_call_lhs (stmt);
+	if (!lhs)
+	  lhs = create_tmp_reg_or_ssa_name (type);
+	gimple* g = gimple_build_assign (lhs,
 	 VEC_COND_EXPR, cmp,
 	 minus_one_vec, zero_vec);
 	gimple_set_location (g, loc);
diff --git a/gcc/testsuite/gcc.target/i386/pr100660.c b/gcc/testsuite/gcc.target/i386/pr100660.c
new file mode 100644
index 000..1112b742779
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100660.c
@@ -0,0 +1,10 @@
+/* PR target/pr100660.  */
+/* { dg-do compile } */
+/* { dg-options "-mavx2 -O" } */
+
+typedef char v16qi __attribute__ ((vector_size (16)));
+v16qi
+f5 (v16qi a, v16qi b)
+{
+  __builtin_ia32_pcmpgtb128 (a, b);
+}
-- 
2.18.1
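
The shape of the fix can be shown with a toy C model, assuming nothing about
GCC internals: when a call has no destination, a fresh temporary is made up
so that the replacement assignment always has a left-hand side. The struct
model_call and the function model_result_name are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for a call statement; lhs is NULL when the result is unused. */
struct model_call {
    const char *lhs;
};

/* Return the name the folded assignment should write to.  If the call has
   no lhs, write a fresh temporary name into buf and return buf.  */
const char *model_result_name(const struct model_call *call,
                              char *buf, size_t len)
{
    static unsigned next_tmp; /* counter for temporary names */

    if (call->lhs)
        return call->lhs;

    snprintf(buf, len, "tmp_%u", next_tmp++);
    return buf;
}
```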

Re: [PATCH] avoid -Wnonnull with lambda (PR 100684)

2021-05-19 Thread Jeff Law via Gcc-patches




On 5/19/2021 3:01 PM, Martin Sebor via Gcc-patches wrote:

The front end -Wnonnull handler has code to suppress warning for
lambdas with null this pointers but the middle end handler has
no corresponding logic.  This leads to spurious -Wnonnull in
lambda calls that aren't inlined.  That might happen at low
optimization levels such as -O1 or -Og and with sanitization.

The attached patch enhances the middle end -Wnonnull to deal
with this case and avoid the false positives.

Tested on x86_64-linux.

Martin

gcc-100684.diff

PR middle-end/100684 - spurious -Wnonnull with -O1 on a C++ lambda

gcc/ChangeLog:

PR middle-end/100684
* tree-ssa-ccp.c (pass_post_ipa_warn::execute):

gcc/testsuite/ChangeLog:

PR middle-end/100684
* g++.dg/warn/Wnonnull13.C: New test.
* g++.dg/warn/Wnonnull14.C: New test.
* g++.dg/warn/Wnonnull15.C: New test.


OK with a ChangeLog entry for the tree-ssa-ccp change.


Jeff

[PATCH] PCH large file bug

2021-05-19 Thread Evgeniy via Gcc-patches

Hello,

can I ask somebody to push the patch to fix the PCH large file problem
(BUG 14940)? The bug fix was sent in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14940#c49
and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14940#c50

I would copy and paste it, but I'm not the author of this patch.

Evgeniy

[pushed] c++: _Complex template parameter [PR100634]

2021-05-19 Thread Jason Merrill via Gcc-patches

We were crashing because invalid_nontype_parm_type_p allowed _Complex
template parms, but convert_nontype_argument didn't know what to do for
them.  Let's just disallow it, people can and should use std::complex
instead.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/100634

gcc/cp/ChangeLog:

* pt.c (invalid_nontype_parm_type_p): Return true for COMPLEX_TYPE.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-complex1.C: New test.
---
 gcc/cp/pt.c   | 2 ++
 gcc/testsuite/g++.dg/cpp2a/nontype-complex1.C | 8 
 2 files changed, 10 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-complex1.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 32cd0b7a6ed..cbd2f3dc338 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -26563,6 +26563,8 @@ invalid_nontype_parm_type_p (tree type, tsubst_flags_t 
complain)
   else if (cxx_dialect >= cxx11
   && TREE_CODE (type) == BOUND_TEMPLATE_TEMPLATE_PARM)
 return false;
+  else if (TREE_CODE (type) == COMPLEX_TYPE)
+/* Fall through.  */;
   else if (VOID_TYPE_P (type))
 /* Fall through.  */;
   else if (cxx_dialect >= cxx20)
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-complex1.C 
b/gcc/testsuite/g++.dg/cpp2a/nontype-complex1.C
new file mode 100644
index 000..4de2168ef60
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-complex1.C
@@ -0,0 +1,8 @@
+// PR c++/100634
+// { dg-do compile { target c++20 } }
+// { dg-options "" }
+
+// We could support _Complex template arguments, but better I think to make
+// people use a standard type instead.
+template<_Complex int> struct ComplexInt {}; // { dg-error "not a valid type" }
+using CI = ComplexInt<1 + 3i>;

base-commit: 65f32e5d6bbeb93a7d8d121fd56af6555e16d747
prerequisite-patch-id: ae2039c7381d197d44ea669393e7107448d8e156
-- 
2.27.0

[pushed] c++: ICE with using and enum [PR100659]

2021-05-19 Thread Jason Merrill via Gcc-patches

Here the code for 'using enum' is confused by the combination of a
using-decl and an enum that are not from 'using enum'; this CONST_DECL is
from the normal unscoped enum scoping.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/100659

gcc/cp/ChangeLog:

* cp-tree.h (CONST_DECL_USING_P): Check for null TREE_TYPE.

gcc/testsuite/ChangeLog:

* g++.dg/parse/access13.C: New test.
---
 gcc/cp/cp-tree.h  | 1 +
 gcc/testsuite/g++.dg/parse/access13.C | 7 +++
 2 files changed, 8 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/parse/access13.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 860ed795299..aa202715873 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -3553,6 +3553,7 @@ struct GTY(()) lang_decl {
created by handle_using_decl.  */
 #define CONST_DECL_USING_P(NODE)   \
   (TREE_CODE (NODE) == CONST_DECL  \
+   && TREE_TYPE (NODE) \
&& TREE_CODE (TREE_TYPE (NODE)) == ENUMERAL_TYPE\
&& DECL_CONTEXT (NODE) != TREE_TYPE (NODE))
 
diff --git a/gcc/testsuite/g++.dg/parse/access13.C 
b/gcc/testsuite/g++.dg/parse/access13.C
new file mode 100644
index 000..41463c5dde5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/access13.C
@@ -0,0 +1,7 @@
+// PR c++/100659
+
+template  struct A
+{
+  A::E::V;// { dg-warning "access decl" }
+  enum { V }; // { dg-error "conflicts with a previous decl" }
+};

base-commit: 65f32e5d6bbeb93a7d8d121fd56af6555e16d747
-- 
2.27.0
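
Why the added null check has to come before the TREE_CODE (TREE_TYPE (NODE))
test can be seen in a small C model. The names below (model_node,
MODEL_CONST_DECL_USING_P and so on) are hypothetical and only mimic the shape
of the macro; they are not the cp-tree.h definitions.

```c
#include <stddef.h>

enum model_code { MODEL_CONST_DECL, MODEL_ENUMERAL_TYPE, MODEL_OTHER };

struct model_node {
    enum model_code code;
    const struct model_node *type;    /* may be NULL */
    const struct model_node *context;
};

/* && short-circuits left to right, so the null test must precede the
   dereference of (NODE)->type.  */
#define MODEL_CONST_DECL_USING_P(NODE)                \
    ((NODE)->code == MODEL_CONST_DECL                 \
     && (NODE)->type != NULL                          \
     && (NODE)->type->code == MODEL_ENUMERAL_TYPE     \
     && (NODE)->context != (NODE)->type)

int model_is_using_const(const struct model_node *node)
{
    return MODEL_CONST_DECL_USING_P(node);
}
```

A node with a null type now simply evaluates to false instead of
dereferencing a null pointer.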

[PATCH] libgccjit: Add support for setting the link section of global variables [PR100688]

2021-05-19 Thread Antoni Boucher via Gcc-patches

Hello.
This patch adds support to set the link section of global variables.
I used the ABI 18 because I submitted other patches up to 17.
Thanks for the review.
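
For context, the C-level feature that the new entrypoint mirrors can be
sketched as follows, assuming GCC or Clang on an ELF target. The macro
MODEL_IN_SECTION, the section name ".model_data" and the variable names are
hypothetical; on other targets the macro expands to nothing so the code still
builds.

```c
/* The section attribute is a GNU extension; only use it where it is known
   to accept ELF-style section names.  */
#if defined(__ELF__) && (defined(__GNUC__) || defined(__clang__))
#define MODEL_IN_SECTION(name) __attribute__((section(name)))
#else
#define MODEL_IN_SECTION(name)
#endif

/* A global placed in a custom link section instead of the default .data.  */
int model_counter MODEL_IN_SECTION(".model_data") = 42;

int read_model_counter(void)
{
    return model_counter;
}
```

The variable behaves like any other global; only its placement in the object
file changes.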
From c867732ee36003759d479497c85135ecfc4a0cf3 Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Wed, 12 May 2021 07:57:54 -0400
Subject: [PATCH] Add support for setting the link section of global variables
 [PR100688]

2021-05-19  Antoni Boucher  

gcc/jit/
PR target/100688
* docs/topics/compatibility.rst (LIBGCCJIT_ABI_18): New ABI
tag.
* docs/topics/expressions.rst: Add documentation for the
function gcc_jit_lvalue_set_link_section.
* jit-playback.h: New function (set_link_section) and
rvalue::m_inner protected.
* jit-recording.c: New function (set_link_section) and
support for setting the link section.
* jit-recording.h: New function (set_link_section) and new
field m_link_section.
* libgccjit.c: New function (gcc_jit_lvalue_set_link_section).
* libgccjit.h: New function (gcc_jit_lvalue_set_link_section).
* libgccjit.map (LIBGCCJIT_ABI_18): New ABI tag.

gcc/testsuite/
PR target/100688
* jit.dg/all-non-failing-tests.h: Add test-link-section.c.
* jit.dg/test-link-section.c: New test.
---
 gcc/jit/docs/topics/compatibility.rst|  9 +++
 gcc/jit/docs/topics/expressions.rst  | 12 ++
 gcc/jit/jit-playback.h   |  8 +++
 gcc/jit/jit-recording.c  | 23 +++---
 gcc/jit/jit-recording.h  |  7 +-
 gcc/jit/libgccjit.c  | 12 ++
 gcc/jit/libgccjit.h  | 13 ++
 gcc/jit/libgccjit.map| 11 +
 gcc/testsuite/jit.dg/all-non-failing-tests.h |  7 ++
 gcc/testsuite/jit.dg/test-link-section.c | 25 
 10 files changed, 123 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/jit.dg/test-link-section.c

diff --git a/gcc/jit/docs/topics/compatibility.rst b/gcc/jit/docs/topics/compatibility.rst
index 239b6aa1a92..94e3127f049 100644
--- a/gcc/jit/docs/topics/compatibility.rst
+++ b/gcc/jit/docs/topics/compatibility.rst
@@ -243,3 +243,12 @@ embedding assembler instructions:
   * :func:`gcc_jit_extended_asm_add_input_operand`
   * :func:`gcc_jit_extended_asm_add_clobber`
   * :func:`gcc_jit_context_add_top_level_asm`
+
+.. _LIBGCCJIT_ABI_18:
+
+``LIBGCCJIT_ABI_18``
+--------------------
+``LIBGCCJIT_ABI_18`` covers the addition of an API entrypoint to set the link
+section of a variable:
+
+  * :func:`gcc_jit_lvalue_set_link_section`
diff --git a/gcc/jit/docs/topics/expressions.rst b/gcc/jit/docs/topics/expressions.rst
index 396259ef07e..b39f6c02527 100644
--- a/gcc/jit/docs/topics/expressions.rst
+++ b/gcc/jit/docs/topics/expressions.rst
@@ -539,6 +539,18 @@ where the rvalue is computed by reading from the storage area.
 
in C.
 
+.. function:: void
+  gcc_jit_lvalue_set_link_section (gcc_jit_lvalue *lvalue,
+   const char *name)
+
+   Set the link section of a variable; analogous to:
+
+   .. code-block:: c
+
+ int variable __attribute__((section(".section")));
+
+   in C.
+
 Global variables
 
 
diff --git a/gcc/jit/jit-playback.h b/gcc/jit/jit-playback.h
index 825a3e172e9..8b0f65e87e8 100644
--- a/gcc/jit/jit-playback.h
+++ b/gcc/jit/jit-playback.h
@@ -650,6 +650,8 @@ public:
 
 private:
   context *m_ctxt;
+
+protected:
   tree m_inner;
 };
 
@@ -670,6 +672,12 @@ public:
   rvalue *
   get_address (location *loc);
 
+  void
+  set_link_section (const char* name)
+  {
+set_decl_section_name (m_inner, name);
+  }
+
 private:
   bool mark_addressable (location *loc);
 };
diff --git a/gcc/jit/jit-recording.c b/gcc/jit/jit-recording.c
index 117ff70114c..214e6f487fe 100644
--- a/gcc/jit/jit-recording.c
+++ b/gcc/jit/jit-recording.c
@@ -3713,6 +3713,11 @@ recording::lvalue::get_address (recording::location *loc)
   return result;
 }
 
+void recording::lvalue::set_link_section (const char *name)
+{
+  m_link_section = new_string (name);
+}
+
 /* The implementation of class gcc::jit::recording::param.  */
 
 /* Implementation of pure virtual hook recording::memento::replay_into
@@ -4547,8 +4552,7 @@ recording::block::dump_edges_to_dot (pretty_printer *pp)
 void
 recording::global::replay_into (replayer *r)
 {
-  set_playback_obj (
-m_initializer
+  playback::lvalue *global = m_initializer
 ? r->new_global_initialized (playback_location (r, m_loc),
  m_kind,
  m_type->playback_type (),
@@ -4560,7 +4564,12 @@ recording::global::replay_into (replayer *r)
 : r->new_global (playback_location (r, m_loc),
 		 m_kind,
 		 m_type->playback_type (),
-		 playback_string (m_name)));
+		 playback_string (m_name));
+  if (m_link_

[r12-928 Regression] FAIL: g++.dg/opt/pr94589-2.C -std=gnu++2a scan-tree-dump-times optimized "[ij]_[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) [ij]_[0-9]+\\(D\\)" 12 on Linux/x86_64

2021-05-19 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

cd67343703ef4fa61de837f4690eba70d2760825 is the first bad commit
commit cd67343703ef4fa61de837f4690eba70d2760825
Author: Jason Merrill 
Date:   Tue May 18 12:29:33 2021 -0400

c++: ICE with <=> fallback [PR100367]

caused

FAIL: g++.dg/opt/pr94589-2.C  -std=gnu++2a  scan-tree-dump-times optimized 
"i_[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) 5\\.0" 12
FAIL: g++.dg/opt/pr94589-2.C  -std=gnu++2a  scan-tree-dump-times optimized 
"[ij]_[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) [ij]_[0-9]+\\(D\\)" 12

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-928/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/opt/pr94589-2.C 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/opt/pr94589-2.C 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/opt/pr94589-2.C 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/opt/pr94589-2.C 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)

Re: [Patch] Testsuite/Fortran: gfortran.dg/pr96711.f90 - fix expected value for PowerPC [PR96983]

2021-05-19 Thread Segher Boessenkool

Hi!

On Wed, May 19, 2021 at 08:35:26PM +0200, Tobias Burnus wrote:
> On 19.05.21 17:15, Segher Boessenkool wrote:
> >>real(16)   :: y   ! 128bit REAL
> >>integer(16), parameter :: k2 = nint (2 / epsilon (y), kind(k2))
> >>integer(16), parameter :: m2 = 10384593717069655257060992658440192_16
> >>!2**113
> >>if (k2 /= m2) stop 3
> >>
> >>On x86_64-linux-gnu, k2 == m2 — but on powerpc64le-linux-gnu,
> >>k2 == 2**106 instead of 2**113.
> >>
> >>My solution is to permit also 2**106 besides 2**113.
> >I do not understand Fortran well enough, could you explain what the code
> >is supposed to do?
> 
> First, 2_16 means the integer '2' of the integer kind '16', i.e. int128_t 
> type.

Ah ok.

> And I think this testcase tries to ensure that the result of 'nint'
> both at compile time and at runtime matches what should be the result.

That is the main thing I missed :-)

> ('a**b' is the Fortran syntax for: 'a' raised to the power of 'b'.)

I know, I use it all the time :-)  Many other languages have this
nowadays btw.  It is such a succinct notation :-)

> nint(2/epsilon(y)). Here, 'epsilon' is the
> "Model number that is small compared to 1."
> Namely: b**(1-p) = '(radix)**(1-digits)'
> alias 'real_format *fmt = REAL_MODE_FORMAT (mode)'
> with radix = fmt->b  and digits = fmt->p;
> 
> [b**(1-p) is from the Fortran standard but 'b' and 'p' also match the
> ME/target names, while radix/digits matches the FE names and also the
> Fortran intrinsic inquiry function names.]
> 
> This is for radix = 2 equivalent to:
> 
> 2/2**(1-digits) = 2*2**(digits-1) = 2**(digits)
> 
> On x86-64, digits == fmt->p == 113.

And the same on powerpc64le with -mabi=ieeelongdouble, presumably.

> Our powerpc64le gives digits == 106.

It still defaults to -mabi=ibmlongdouble.

> Having written all this, I wonder why we don't just
> rely on the assumption that '2**digit(x)' works – and use this
> to generate the valid.

If that works (and I have no reason to believe it won't), that is as
simple a solution as you will find :-)

> As I like that patch and believe it is obvious, I intend to
> commit it as such – unless there are further comments.

> It passes on both x86-64-gnu-linux and powerpc64le-none-linux-gnu.
> I think the radix == 2 is a good bet, but if we ever run into issues,
> it can also be changed to use radix(...) as well ...

I don't think any GCC target does 10 or 16 by default, and less chance
in Fortran even :-)

Thanks!


Segher
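
The identity 2/epsilon = 2**digits quoted above can be checked in C for the
double type, assuming a radix-2 floating-point format. DBL_EPSILON plays the
role of Fortran's epsilon and DBL_MANT_DIG the role of digits; the function
names are hypothetical, and nothing is claimed here about long double on any
particular target.

```c
#include <float.h>

/* 2 / epsilon, where epsilon = radix**(1 - digits).  */
double model_two_over_epsilon(void)
{
    return 2.0 / DBL_EPSILON;
}

/* radix**digits, built by repeated multiplication (exact at every step).  */
double model_radix_to_digits(void)
{
    double v = 1.0;
    for (int i = 0; i < DBL_MANT_DIG; i++)
        v *= FLT_RADIX;
    return v;
}
```

When FLT_RADIX is 2 the two functions return the same value; for any other
radix 2/epsilon would be 2 * radix**(digits-1), which is the reason for the
remark about radix(...) in the quoted text.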

Re: [PATCH,rs6000] Test cases for p10 fusion patterns

2021-05-19 Thread Segher Boessenkool

Hi!

On Mon, Apr 26, 2021 at 02:00:36PM -0500, acsaw...@linux.ibm.com wrote:
> This adds some test cases to make sure that the combine patterns for p10
> fusion are working.

> +++ b/gcc/testsuite/gcc.target/powerpc/fusion-p10-2logical.c
> @@ -0,0 +1,205 @@
> +/* { dg-do compile { target { powerpc*-*-* } } } */
> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */

Same issues again.

> +/* Recreate with:
> +   grep ' \*fuse_' fusion-p10-2logical.s|sed -e 's,^.*\*,,' |sort -k 7,7 
> |uniq -c|awk '{l=30-length($2); printf("/%s* { %s { scan-assembler-times 
> \"%s\"%-*s%4d } } *%s/\n","","dg-final",$2,l,"",$1,"");}'
> + */
> +  
> +/* { dg-final { scan-assembler-times "fuse_and_and/1"
>   16 } } */

You could make the lines not wrap easy enough ;-)  (Not that it matters of
course, this is a testcase, heh.)

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c
> @@ -0,0 +1,66 @@
> +/* { dg-do compile { target { powerpc*-*-* } } } */
> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */

And one more.

Okay for trunk and backport to 11 with those things looked at.  Thanks!


Segher

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Jonathan Wakely via Gcc-patches


On 19/05/21 11:51 -0600, Martin Sebor wrote:

On 5/19/21 10:39 AM, Jonathan Wakely via Gcc-patches wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Looks good to me, it just needs an update to the manual describing
the new options.

It's too bad that the conditionals checking for the dialect have
to be repeated throughout the front end.  They're implied by
the new option enumerator passed to pedwarn().  If the diagnostic
subsystem had access to cxx_dialect the check could be done there
and all the other conditionals could be avoided.  An alternative
to that would be to add a new wrapper to the C++ front end, like
cxxdialect_pedwarn, to do the checking before calling pedwarn
(or, more likely, emit_diagnostic_valist).


btw I did try adding that new wrapper, but it got complicated when I
needed to overload it for both location_t and richloc* (and also
overload all the forwarding wrappers for each cxx dialect that used
it). I gave up and reverted everything.

Maybe I'll try that again another day, but for now I just need to be
able to disable some of these warnings with -Wno-xxx, because they're
not currently controllable by any option.

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Jonathan Wakely via Gcc-patches


On 19/05/21 11:51 -0600, Martin Sebor wrote:

On 5/19/21 10:39 AM, Jonathan Wakely via Gcc-patches wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Looks good to me, it just needs an update to the manual describing
the new options.

It's too bad that the conditionals checking for the dialect have
to be repeated throughout the front end.  They're implied by
the new option enumerator passed to pedwarn().  If the diagnostic
subsystem had access to cxx_dialect the check could be done there
and all the other conditionals could be avoided.  An alternative
to that would be to add a new wrapper to the C++ front end, like
cxxdialect_pedwarn, to do the checking before calling pedwarn
(or, more likely, emit_diagnostic_valist).


Actually, you've made me realise the conditionals aren't really
needed. The new warn_about_dialect_p predicate made sense for an
earlier version of my patch, but then I simplified it and now it's
mostly useless.

So in most places I can just leave if (cxx_dialect < cxx17) as the
controlling condition, and then just use OPT_Wc__17_extensions for the
pedwarn opt_code. And even the cxx_dialect check could be eliminated
if the value of warn_cxx11_extensions was set according to the
cxx_dialect (instead of Init(1) for all of them). i.e. Use Init(-1)
and then during option parsing set each according to the -std option:

  if (warn_cxx11_extensions == -1)
warn_cxx11_extensions = cxx_dialect < cxx11;
  if (warn_cxx14_extensions == -1)
warn_cxx14_extensions = cxx_dialect < cxx14;
  if (warn_cxx17_extensions == -1)
warn_cxx17_extensions = cxx_dialect < cxx17;
  etc.

IIUC that would mean that a pedwarn using OPT_Wc__14_extensions will
not warn when cxx_dialect >= cxx14, i.e. only warn about C++14
extensions if the active -std dialect is older than C++14. Which would
mean checking cxx_dialect before the pedwarn isn't needed.
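
The Init(-1) idea can be modelled in plain C. This is a sketch under the
assumption that -1 means "not set on the command line"; the enum, struct and
function names are hypothetical and do not correspond to GCC's option
machinery.

```c
enum model_dialect { MODEL_CXX98, MODEL_CXX11, MODEL_CXX14, MODEL_CXX17 };

/* One flag per -Wc++NN-extensions option: -1 unset, 0 off, 1 on.  */
struct model_warn_flags {
    int cxx11_extensions;
    int cxx14_extensions;
    int cxx17_extensions;
};

/* Give every unset flag a default derived from the active dialect:
   warn about C++NN extensions only when the dialect is older than C++NN.
   Flags the user set explicitly are left alone.  */
struct model_warn_flags model_resolve(struct model_warn_flags flags,
                                      enum model_dialect dialect)
{
    if (flags.cxx11_extensions == -1)
        flags.cxx11_extensions = dialect < MODEL_CXX11;
    if (flags.cxx14_extensions == -1)
        flags.cxx14_extensions = dialect < MODEL_CXX14;
    if (flags.cxx17_extensions == -1)
        flags.cxx17_extensions = dialect < MODEL_CXX17;
    return flags;
}
```

After this step the flag alone decides whether a warning fires, so a caller
in this model no longer needs a separate dialect comparison.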

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Jonathan Wakely via Gcc-patches


On 19/05/21 16:08 -0400, Jason Merrill wrote:

On 5/19/21 4:05 PM, Jonathan Wakely wrote:

On 19/05/21 20:55 +0100, Jonathan Wakely wrote:

On 19/05/21 13:26 -0400, Jason Merrill wrote:

On 5/19/21 12:46 PM, Jonathan Wakely wrote:

On 19/05/21 17:39 +0100, Jonathan Wakely wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Jakub also pointed out we already have some similar diagnostics for
C++23, but I missed them as they say "only optional with" not "only
available with".

I'm testing the incremental change in the attached patch which also
adds -Wc++23-extensions, and I'll resend the full patch after that
finishes.



 if (omitted_parms_loc && lambda_specs.any_specifiers_p)
   {
-  pedwarn (omitted_parms_loc, 0,
+  pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,


You probably want to change


else if (cxx_dialect < cxx23)
  omitted_parms_loc = cp_lexer_peek_token (parser->lexer)->location;


To use warn_about_dialect_p.


Ah yes.

And just above that there's another pedwarn about a C++14 feature
being used:


 /* Default arguments shall not be specified in the
 parameter-declaration-clause of a lambda-declarator.  */
 if (cxx_dialect < cxx14)
for (tree t = param_list; t; t = TREE_CHAIN (t))
  if (TREE_PURPOSE (t) && DECL_P (TREE_VALUE (t)))
    pedwarn (DECL_SOURCE_LOCATION (TREE_VALUE (t)), OPT_Wpedantic,
 "default argument specified for lambda parameter");


I didn't notice that one initially. That should also use
warn_about_dialect_p and OPT_Wc__14_extensions.

Should I change the message to say "init capture" rather than
"default argument"?

I'll add some docs to invoke.texi and get a new patch out.


Oh, also we have https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93769
which points out a problem with the current wording. Not a very
important one, but still ...

While I'm touching all 38(?) places that say "only available with
-std=c++NN or -std=gnu++NN" I could change them to say something like
"only available since C++NN". Should I bother?

Clang's equivalent warnings say "are a C++11 feature" e.g.

ext.C:1:17: warning: inline namespaces are a C++11 feature 
[-Wc++11-inline-namespace]


(They have a specific warning for each feature, with
-Wc++11-extensions to control them all at once.)


The clang wording seems more accurate, as that PR points out.


OK, that requires touching a number of error_at and inform calls as
well as the pedwarns, so I'll address that separately in a later
patch.

[PATCH v2 4/4] [og10] Rework indirect struct handling for OpenACC in gimplify.c

2021-05-19 Thread Julian Brown

This patch reworks indirect struct handling in gimplify.c (i.e. for
struct components mapped with "mystruct->a[0:n]", "mystruct->b", etc.),
for OpenACC.  The key observation leading to these changes was that
component mappings of references-to-structures is already implemented
and working, and indirect struct component handling via a pointer can
work quite similarly.  That lets us remove some earlier, special-case
handling for mapping indirect struct component accesses for OpenACC,
which required the pointed-to struct to be manually mapped before the
indirect component mapping.

With this patch, you can map struct components directly (e.g. an array
slice "mystruct->a[0:n]") just like you can map a non-indirect struct
component slice ("mystruct.a[0:n]"). Both references-to-pointers (with
the former syntax) and references to structs (with the latter syntax)
work now.

For Fortran class pointers, we no longer re-use GOMP_MAP_TO_PSET for the
class metadata (the structure that points to the class data and vptr)
-- it is instead treated as any other struct.

For C++, the struct handling also works for class members ("this->foo"),
without having to explicitly map "this[:1]" first.

For OpenACC, we permit chained indirect component references
("mystruct->a->b[0:n]"), though only the last part of such mappings will
trigger an attach/detach operation.  To properly use such a construct
on the target, you must still manually map "mystruct->a[:1]" first --
but there's no need to map "mystruct[:1]" explicitly before that.

This version of the patch avoids altering code paths for OpenMP,
where possible.

I will apply shortly to the og10 branch.

Thanks,

Julian
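
A minimal C sketch of the kind of mapping described above: an array slice
reached through a struct pointer is named directly in the data clause,
without mapping the struct itself first. The type model_vec and the function
model_scale are hypothetical. When built without OpenACC support the pragma
is ignored and the loop runs on the host; whether a given compiler accepts
this form with OpenACC enabled depends on it carrying this rework.

```c
#include <stddef.h>

struct model_vec {
    size_t n;
    float *a;
};

/* Scale every element of v->a by k.  The slice v->a[0:n] is mapped
   directly through the pointer v.  */
void model_scale(struct model_vec *v, float k)
{
    size_t n = v->n;

#pragma acc parallel loop copy(v->a[0:n])
    for (size_t i = 0; i < n; i++)
        v->a[i] *= k;
}
```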

2021-05-19  Julian Brown  

gcc/fortran/
* trans-openmp.c (gfc_trans_omp_clauses): Don't create GOMP_MAP_TO_PSET
mappings for class metadata, nor GOMP_MAP_POINTER mappings for
POINTER_TYPE_P decls.

gcc/
* gimplify.c (extract_base_bit_offset): Add BASE_IND and OPENMP
parameters.  Handle pointer-typed indirect references for OpenACC
alongside reference-typed ones.
(strip_components_and_deref, aggregate_base_p): New functions.
(build_struct_group): Add pointer type indirect ref handling,
including chained references, for OpenACC.  Also handle references to
structs for OpenACC.  Conditionalise bits for OpenMP only where
appropriate.
(gimplify_scan_omp_clauses): Rework pointer-type indirect structure
access handling to work more like the reference-typed handling for
OpenACC only.
* omp-low.c (scan_sharing_clauses): Handle pointer-type indirect struct
references, and references to pointers to structs also.

gcc/testsuite/
* g++.dg/goacc/member-array-acc.C: New test.
* g++.dg/gomp/member-array-omp.C: New test.

libgomp/
* testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c: New test.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c: New test.
* testsuite/libgomp.oacc-c++/deep-copy-17.C: New test.
---
 gcc/fortran/trans-openmp.c                    |  20 +-
 gcc/gimplify.c                                | 215 +---
 gcc/omp-low.c                                 |  16 +-
 gcc/testsuite/g++.dg/goacc/member-array-acc.C |  13 +
 gcc/testsuite/g++.dg/gomp/member-array-omp.C  |  13 +
 .../testsuite/libgomp.oacc-c++/deep-copy-17.C | 101 
 .../libgomp.oacc-c-c++-common/deep-copy-15.c  |  68 ++
 .../libgomp.oacc-c-c++-common/deep-copy-16.c  | 231 ++
 8 files changed, 619 insertions(+), 58 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/goacc/member-array-acc.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/member-array-omp.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/deep-copy-17.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c

diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index d3667031ca9..85b09514b1c 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -2696,30 +2696,16 @@ gfc_trans_omp_clauses (stmtblock_t *block, 
gfc_omp_clauses *clauses,
  tree present = gfc_omp_check_optional_argument (decl, true);
  if (openacc && n->sym->ts.type == BT_CLASS)
{
- tree type = TREE_TYPE (decl);
  if (n->sym->attr.optional)
sorry ("optional class parameter");
- if (POINTER_TYPE_P (type))
-   {
- node4 = build_omp_clause (input_location,
-   OMP_CLAUSE_MAP);
- OMP_CLAUSE_SET_MAP_KIND (node4, GOMP_MAP_POINTER);
- OMP_CLAUSE_DECL (node4) = decl;
- OMP_CLAUSE_SIZE (node4) = size_int (0);
- decl = bu

[PATCH v2 3/4] [og10] Refactor struct lowering for OpenACC/OpenMP in gimplify.c

2021-05-19 Thread Julian Brown

This patch is a second attempt at refactoring struct component mapping
handling for OpenACC/OpenMP during gimplification, after the patch I
posted here:

  https://gcc.gnu.org/pipermail/gcc-patches/2018-November/510503.html

And improved here, post-review:

  https://gcc.gnu.org/pipermail/gcc-patches/2019-November/533394.html

This patch goes further, in that the struct-handling code is outlined
into its own function (to create the "GOMP_MAP_STRUCT" node and the
sorted list of nodes immediately following it, from a set of mappings
of components of a given struct or derived type). I've also gone through
the list-handling code and attempted to add comments documenting how it
works to the best of my understanding, and broken out a couple of helper
functions in order to (hopefully) have the code self-document better also.

I will apply shortly to the og10 branch.

Thanks,

Julian

2021-05-19  Julian Brown  

gcc/
* gimplify.c (insert_struct_comp_map): Refactor function into...
(build_struct_comp_nodes): This new function.  Remove list handling
and improve self-documentation.
(insert_node_after, move_node_after, move_nodes_after,
move_concat_nodes_after): New helper functions.
(build_struct_group): New function to build up GOMP_MAP_STRUCT node
groups to map struct components. Outlined from...
(gimplify_scan_omp_clauses): Here.  Call above function.
---
 gcc/gimplify.c | 934 +++--
 1 file changed, 591 insertions(+), 343 deletions(-)

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 3921841f2fc..bb87128bcdf 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -8253,73 +8253,66 @@ gimplify_omp_depend (tree *list_p, gimple_seq *pre_p)
   return 1;
 }
 
-/* Insert a GOMP_MAP_ALLOC or GOMP_MAP_RELEASE node following a
-   GOMP_MAP_STRUCT mapping.  C is an always_pointer mapping.  STRUCT_NODE is
-   the struct node to insert the new mapping after (when the struct node is
-   initially created).  PREV_NODE is the first of two or three mappings for a
-   pointer, and is either:
- - the node before C, when a pair of mappings is used, e.g. for a C/C++
-   array section.
- - not the node before C.  This is true when we have a reference-to-pointer
-   type (with a mapping for the reference and for the pointer), or for
-   Fortran derived-type mappings with a GOMP_MAP_TO_PSET.
-   If SCP is non-null, the new node is inserted before *SCP.
-   if SCP is null, the new node is inserted before PREV_NODE.
-   The return type is:
- - PREV_NODE, if SCP is non-null.
- - The newly-created ALLOC or RELEASE node, if SCP is null.
- - The second newly-created ALLOC or RELEASE node, if we are mapping a
-   reference to a pointer.  */
+/* For a set of mappings describing an array section pointed to by a struct
+   (or derived type, etc.) component, create an "alloc" or "release" node to
+   insert into a list following a GOMP_MAP_STRUCT node.  For some types of
+   mapping (e.g. Fortran arrays with descriptors), an additional mapping may
+   be created that is inserted into the list of mapping nodes attached to the
+   directive being processed -- not part of the sorted list of nodes after
+   GOMP_MAP_STRUCT.
+
+   CODE is the code of the directive being processed.  GRP_START and GRP_END
+   are the first and last of two or three nodes representing this array section
+   mapping (e.g. a data movement node like GOMP_MAP_{TO,FROM}, optionally a
+   GOMP_MAP_TO_PSET, and finally a GOMP_MAP_ALWAYS_POINTER).  EXTRA_NODE is
+   filled with the additional node described above, if needed.
+
+   This function does not add the new nodes to any lists itself.  It is the
+   responsibility of the caller to do that.  */
 
 static tree
-insert_struct_comp_map (enum tree_code code, tree c, tree struct_node,
-   tree prev_node, tree *scp)
+build_struct_comp_nodes (enum tree_code code, tree grp_start, tree grp_end,
+tree *extra_node)
 {
   enum gomp_map_kind mkind
 = (code == OMP_TARGET_EXIT_DATA || code == OACC_EXIT_DATA)
   ? GOMP_MAP_RELEASE : GOMP_MAP_ALLOC;
 
-  tree c2 = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP);
-  tree cl = scp ? prev_node : c2;
+  gcc_assert (grp_start != grp_end);
+
+  tree c2 = build_omp_clause (OMP_CLAUSE_LOCATION (grp_end), OMP_CLAUSE_MAP);
   OMP_CLAUSE_SET_MAP_KIND (c2, mkind);
-  OMP_CLAUSE_DECL (c2) = unshare_expr (OMP_CLAUSE_DECL (c));
-  OMP_CLAUSE_CHAIN (c2) = scp ? *scp : prev_node;
-  if (OMP_CLAUSE_CHAIN (prev_node) != c
-  && OMP_CLAUSE_CODE (OMP_CLAUSE_CHAIN (prev_node)) == OMP_CLAUSE_MAP
-  && (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (prev_node))
- == GOMP_MAP_TO_PSET))
-OMP_CLAUSE_SIZE (c2) = OMP_CLAUSE_SIZE (OMP_CLAUSE_CHAIN (prev_node));
+  OMP_CLAUSE_DECL (c2) = unshare_expr (OMP_CLAUSE_DECL (grp_end));
+  OMP_CLAUSE_CHAIN (c2) = NULL_TREE;
+  tree grp_mid = NULL_TREE;
+  if (OMP_CLA

[PATCH v2 2/4] [og10] Unify ARRAY_REF/INDIRECT_REF stripping code in extract_base_bit_offset

2021-05-19 Thread Julian Brown

For historical reasons, it seems that extract_base_bit_offset
unnecessarily used two different ways to strip ARRAY_REF/INDIRECT_REF
nodes from component accesses. I verified that the two ways of performing
the operation gave the same results across the whole testsuite (and
several additional benchmarks).

The code was like this since an earlier "mechanical" refactoring by me,
first posted here:

  https://gcc.gnu.org/pipermail/gcc-patches/2018-November/510503.html

It was never clear to me if there was an important semantic
difference between the two ways of stripping the base before calling
get_inner_reference, but it appears that there is not, so one can go away.

I will apply shortly to the og10 branch.

Thanks,

Julian

2021-05-11  Julian Brown  

gcc/
* gimplify.c (extract_base_bit_offset): Unify ARRAY_REF/INDIRECT_REF
stripping code in first call/subsequent call cases.
---
 gcc/gimplify.c | 32 +++-
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index b94004dc7b1..3921841f2fc 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -8340,31 +8340,21 @@ extract_base_bit_offset (tree base, tree *base_ref, 
poly_int64 *bitposp,
   poly_offset_int poffset;
 
   if (base_ref)
-{
-  *base_ref = NULL_TREE;
-
-  while (TREE_CODE (base) == ARRAY_REF)
-   base = TREE_OPERAND (base, 0);
+*base_ref = NULL_TREE;
 
-  if (TREE_CODE (base) == INDIRECT_REF)
-   base = TREE_OPERAND (base, 0);
-}
-  else
+  if (TREE_CODE (base) == ARRAY_REF)
 {
-  if (TREE_CODE (base) == ARRAY_REF)
-   {
- while (TREE_CODE (base) == ARRAY_REF)
-   base = TREE_OPERAND (base, 0);
- if (TREE_CODE (base) != COMPONENT_REF
- || TREE_CODE (TREE_TYPE (base)) != ARRAY_TYPE)
-   return NULL_TREE;
-   }
-  else if (TREE_CODE (base) == INDIRECT_REF
-  && TREE_CODE (TREE_OPERAND (base, 0)) == COMPONENT_REF
-  && (TREE_CODE (TREE_TYPE (TREE_OPERAND (base, 0)))
-  == REFERENCE_TYPE))
+  while (TREE_CODE (base) == ARRAY_REF)
base = TREE_OPERAND (base, 0);
+  if (TREE_CODE (base) != COMPONENT_REF
+ || TREE_CODE (TREE_TYPE (base)) != ARRAY_TYPE)
+   return NULL_TREE;
 }
+  else if (TREE_CODE (base) == INDIRECT_REF
+  && TREE_CODE (TREE_OPERAND (base, 0)) == COMPONENT_REF
+  && (TREE_CODE (TREE_TYPE (TREE_OPERAND (base, 0)))
+  == REFERENCE_TYPE))
+base = TREE_OPERAND (base, 0);
 
   base = get_inner_reference (base, &bitsize, &bitpos, &offset, &mode,
  &unsignedp, &reversep, &volatilep);
-- 
2.29.2

[PATCH v2 1/4] [og10] Rewrite GOMP_MAP_ATTACH_DETACH mappings unconditionally

2021-05-19 Thread Julian Brown

It never makes sense for a GOMP_MAP_ATTACH_DETACH mapping to survive
beyond gimplify.c, so this patch rewrites such mappings to GOMP_MAP_ATTACH
or GOMP_MAP_DETACH unconditionally (rather than checking for a list
of types of OpenACC or OpenMP constructs), in cases where it hasn't
otherwise been done already in the preceding code.
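
The simplified rule can be modelled in a few lines. This is only a sketch: the enumerators below are hypothetical stand-ins for GCC's tree codes and GOMP_MAP_* kinds, and the function name `rewrite_attach_detach` does not exist in gimplify.c.

```c
#include <assert.h>

/* Hypothetical stand-ins for the construct codes seen by the gimplifier.  */
enum construct_code
{
  ACC_ENTER_DATA,
  ACC_EXIT_DATA,
  ACC_PARALLEL,
  OMP_ENTER_DATA,
  OMP_EXIT_DATA
};

/* Hypothetical stand-ins for the map kinds involved.  */
enum map_kind
{
  MAP_ATTACH,
  MAP_DETACH,
  MAP_ATTACH_DETACH
};

/* Resolve a combined attach/detach mapping: exit-data constructs detach,
   every other construct attaches.  No list of permitted constructs is
   consulted any more.  */
enum map_kind
rewrite_attach_detach (enum construct_code code, enum map_kind kind)
{
  /* Only the combined kind is ever rewritten.  */
  assert (kind == MAP_ATTACH_DETACH);
  return (code == ACC_EXIT_DATA || code == OMP_EXIT_DATA)
	 ? MAP_DETACH : MAP_ATTACH;
}
```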

I will apply shortly to the og10 branch.

Thanks,

Julian

2021-05-19  Julian Brown  

gcc/
* gimplify.c (gimplify_scan_omp_clauses): Simplify condition
for changing GOMP_MAP_ATTACH_DETACH to GOMP_MAP_ATTACH or
GOMP_MAP_DETACH.
---
 gcc/gimplify.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index e51f0dd7787..b94004dc7b1 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -9647,15 +9647,7 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
skip_map_struct:
  ;
}
- else if ((code == OACC_ENTER_DATA
-   || code == OACC_EXIT_DATA
-   || code == OACC_DATA
-   || code == OACC_PARALLEL
-   || code == OACC_KERNELS
-   || code == OACC_SERIAL
-   || code == OMP_TARGET_ENTER_DATA
-   || code == OMP_TARGET_EXIT_DATA)
-  && OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH_DETACH)
+ else if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH_DETACH)
{
  gomp_map_kind k = ((code == OACC_EXIT_DATA
  || code == OMP_TARGET_EXIT_DATA)
-- 
2.29.2

[PATCH v2 0/4] [og10] OpenACC: Rework struct component handling

2021-05-19 Thread Julian Brown

This is a new series of patches for the og10 branch to rework how indirect
struct components are handled for offloaded OpenACC code regions. Compared
to the version posted previously here:

  https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570058.html

...the patches have been rebased to the current og10 tip, and adjusted so
that the rework largely only affects OpenACC rather than both OpenACC and
OpenMP. Recent changes on the og10 branch (and apparent divergence between
the OpenACC and OpenMP APIs themselves) make it harder to unify code
paths for the two APIs, though further work could probably remove some
of the duplication that remains after these patches are committed.

Tested with offloading to nvptx. I will apply to the og10 branch shortly.

Julian

Julian Brown (4):
  [og10] Rewrite GOMP_MAP_ATTACH_DETACH mappings unconditionally
  [og10] Unify ARRAY_REF/INDIRECT_REF stripping code in
extract_base_bit_offset
  [og10] Refactor struct lowering for OpenACC/OpenMP in gimplify.c
  [og10] Rework indirect struct handling for OpenACC in gimplify.c

 gcc/fortran/trans-openmp.c                    |   20 +-
 gcc/gimplify.c                                | 1151 +++--
 gcc/omp-low.c                                 |   16 +-
 gcc/testsuite/g++.dg/goacc/member-array-acc.C |   13 +
 gcc/testsuite/g++.dg/gomp/member-array-omp.C  |   13 +
 .../testsuite/libgomp.oacc-c++/deep-copy-17.C |  101 ++
 .../libgomp.oacc-c-c++-common/deep-copy-15.c  |   68 +
 .../libgomp.oacc-c-c++-common/deep-copy-16.c  |  231 
 8 files changed, 1202 insertions(+), 411 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/goacc/member-array-acc.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/member-array-omp.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/deep-copy-17.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c

-- 
2.29.2

Re: [PATCH,rs6000 2/2] Fusion patterns for add-logical/logical-add

2021-05-19 Thread Segher Boessenkool

On Mon, Apr 26, 2021 at 03:21:30PM -0500, acsaw...@linux.ibm.com wrote:
> This patch modifies the function in genfusion.pl for generating
> the logical-logical patterns so that it can also generate the
> add-logical and logical-add patterns which are very similar.

> +   $outer_32 = "%2,%3";
> +   $outer_42 = "%2,%4";

I think you had trouble thinking of good names here :-)

> +mpower10-fusion-logical-add
> +Target Undocumented Mask(P10_FUSION_LOGADD) Var(rs6000_isa_flags)
> +Fuse certain integer operations together for better performance on power10.
> +
> +mpower10-fusion-add-logical
> +Target Undocumented Mask(P10_FUSION_ADDLOG) Var(rs6000_isa_flags)
> +Fuse certain integer operations together for better performance on power10.

Do you not want to say something a little more precise here?

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fusion-p10-logadd.c
> @@ -0,0 +1,98 @@
> +/* { dg-do compile { target { powerpc*-*-* } } } */
> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */

Same issues here as in the previous patch.

Other than those things, okay for trunk and backport to 11.  Thanks!


Segher

Re: [PATCH 1/2] c-family: Copy DECL_USER_ALIGN even if DECL_ALIGN is similar.

2021-05-19 Thread Martin Sebor via Gcc-patches


On 5/3/21 8:53 AM, Robin Dapp via Gcc-patches wrote:

Hi,

on s390 a warning test fails:

inline int ATTR ((cold, aligned (8)))
finline_hot_noret_align (int);

inline int ATTR ((warn_unused_result))
finline_hot_noret_align (int);

inline int ATTR ((aligned (4)))
   finline_hot_noret_align (int);  /* { dg-warning "ignoring attribute 
.aligned \\(4\\). because it conflicts with attribute .aligned \\(8\\)."


This test actually uncovered two problems.  First, on s390 the default 
function alignment is 8 bytes.  When the second decl above is merged 
with the first one, DECL_USER_ALIGN is only copied if DECL_ALIGN (old) > 
DECL_ALIGN (new).  Subsequently, when merging the third decl, no warning 
is emitted since DECL_USER_ALIGN is unset.


This patch also copies DECL_USER_ALIGN if DECL_ALIGN (old) == DECL_ALIGN 
(new) && DECL_USER_ALIGN (olddecl) != DECL_USER_ALIGN (newdecl).



Then, while going through the related files I also noticed that we emit 
a wrong warning for:


inline int ATTR ((aligned (32)))
finline_align (int);

inline int ATTR ((aligned (4)))
   finline_align (int);  /* { dg-warning "ignoring attribute .aligned 
\\(4\\). because it conflicts with attribute .aligned \\(32\\)." "" } */


What we emit is

warning: ignoring attribute ‘aligned (4)’ because it conflicts with 
attribute ‘aligned (8)’ [-Wattributes].


This is due to the short circuit evaluation in c-attribs.c:

    && ((curalign = DECL_ALIGN (decl)) > bitalign
    || ((lastalign = DECL_ALIGN (last_decl)) > bitalign)))

where lastalign is only assigned when curalign > bitalign is false.  On s390 
curalign > bitalign is true, so the second operand is skipped and lastalign is 
used zero-initialized in the following code.
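
The short-circuit effect can be reproduced outside the compiler. The sketch below uses a condition of the same shape as the one quoted from c-attribs.c; the function name `lastalign_after_check` and its plain unsigned parameters are made up for illustration.

```c
#include <assert.h>

/* Evaluate a condition shaped like the one in c-attribs.c and return
   whatever value lastalign holds afterwards.  */
unsigned
lastalign_after_check (unsigned decl_align, unsigned last_decl_align,
		       unsigned bitalign)
{
  assert (bitalign > 0);
  unsigned curalign = 0, lastalign = 0;

  if ((curalign = decl_align) > bitalign
      || (lastalign = last_decl_align) > bitalign)
    {
      /* When the left operand was true, the right operand never ran,
	 so lastalign still holds its initial zero here.  */
    }

  (void) curalign;
  return lastalign;
}
```

With `decl_align` greater than `bitalign` the function returns 0 regardless of `last_decl_align`; only when the first comparison fails does `lastalign` pick up the alignment of the previous declaration.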


On top, the following condition
   else if (!warn_if_not_aligned_p
   && TREE_CODE (decl) == FUNCTION_DECL
   && DECL_ALIGN (decl) > bitalign)

seems to be fully covered by
     else if (TREE_CODE (decl) == FUNCTION_DECL
    && ((curalign = DECL_ALIGN (decl)) > bitalign
    || ((lastalign = DECL_ALIGN (last_decl)) > bitalign)))

and so is essentially dead. I therefore removed it as there does not 
seem to be a test anywhere for the error message ( "alignment for %q+D 
was previously specified as %d and may not be decreased") either.


The removal of the dead code looks good to me.  The change to
"re-init lastalign" doesn't seem right.  When it's zero it means
the conflict is between two attributes on the same declaration,
in which case the note shouldn't be printed (it would just point
to the same location as the warning).

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index c1f652d1dc9..d132b6fd3b6 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -2317,6 +2317,10 @@ common_handle_aligned_attribute (tree *node, tree 
name, tree args, int flags,

   && ((curalign = DECL_ALIGN (decl)) > bitalign
   || ((lastalign = DECL_ALIGN (last_decl)) > bitalign)))
 {
+  /* Re-init lastalign in case we short-circuit the condition,
+i.e.  curalign > bitalign.  */
+  lastalign = DECL_ALIGN (last_decl);
+
   /* Either a prior attribute on the same declaration or one
 on a prior declaration of the same function specifies
 stricter alignment than this attribute.  */

For the C/C++ FE changes:

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 3ea4708c507..2ea5051e9cd 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -2620,6 +2620,9 @@ merge_decls (tree newdecl, tree olddecl, tree 
newtype, tree oldtype)

  SET_DECL_ALIGN (newdecl, DECL_ALIGN (olddecl));
  DECL_USER_ALIGN (newdecl) |= DECL_USER_ALIGN (olddecl);
}
+  else if (DECL_ALIGN (olddecl) == DECL_ALIGN (newdecl)
+  && DECL_USER_ALIGN (olddecl) != DECL_USER_ALIGN (newdecl))
+   DECL_USER_ALIGN (newdecl) |= DECL_USER_ALIGN (olddecl);

This should be the same as

  DECL_USER_ALIGN (newdecl) = 1;

so I would suggest to use that for clarity.
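
A quick way to see the equivalence: DECL_USER_ALIGN is a single-bit flag, so when the old and new flags differ exactly one of them is set and their OR is always 1. A minimal sketch, using plain `bool` values in place of the real tree flags (the function name is hypothetical):

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the new branch in merge_decls: it is only reached when the
   two user-align flags differ, and it ORs the old flag into the new.  */
bool
merged_user_align (bool old_flag, bool new_flag)
{
  assert (old_flag != new_flag);
  new_flag |= old_flag;
  return new_flag;
}
```

Both inputs that satisfy the precondition, (true, false) and (false, true), yield true, which is why a plain assignment of 1 expresses the same thing.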

Other than that, a maintainer needs to review and approve the work.

Martin

Re: [PATCH] handle VLAs with arbitrary many bounds (PR 100619)

2021-05-19 Thread Joseph Myers

On Wed, 19 May 2021, Martin Sebor via Gcc-patches wrote:

> The GCC 11 -Warray-parameter and -Wvla-parameter enhancement uses
> a small local buffer to format the description of the VLA bounds
> for the internal attribute access.  When the number of bounds is
> in excess of the size of the buffer the code asserts as the test
> case in pr100619 shows.  The attached change does away with
> the small buffer and instead formats the description into
> the destination string itself, thus letting it handle VLAs with
> arbitrarily many bounds.
> 
> Tested on x86_64-linux.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] c: Add support for __FILE_NAME__ macro (PR c/42579)

2021-05-19 Thread Joseph Myers

On Wed, 19 May 2021, Christophe Lyon via Gcc-patches wrote:

> On Wed, 19 May 2021 at 16:50, Joseph Myers  wrote:
> >
> > This patch is missing documentation (in cpp.texi) and tests for the value
> > of the macro.
> >
> 
> Indeed. How about this new version?

This version is OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

RFA: save/restore target options in handle_optimize_attribute

2021-05-19 Thread Joern Wolfgang Rennecke

We set defaults for some target options in TARGET_OPTION_OPTIMIZATION_TABLE,
but these can be overridden by specifying the corresponding explicit
-mXXX / -mno-XXX options.
When a function bears the attribute
__attribute__ ((optimize("O2")))
the target options are set to the default for that optimization level,
which can be different from what was selected for the file as a whole.
As handle_optimize_attribute is right now, it will thus clobber the
target options, and with enable_checking it will then abort.

The attached patch makes it save and restore the target options.
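
The save/restore pattern can be sketched with a toy model. Everything below is hypothetical: `struct options` is a much-reduced stand-in for the global option state, and `apply_level_defaults` only imitates the way an optimization level can reset a target flag to its per-level default.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-reduced model of the global option state.  */
struct options
{
  int opt_level;
  int target_flag;
};

/* Imitate applying an optimization level: the target flag is reset to
   the default for that level (on from level 2 upwards in this model).  */
static void
apply_level_defaults (struct options *opts, int level)
{
  opts->opt_level = level;
  opts->target_flag = level >= 2;
}

/* Process an optimize attribute and return the target flag that was in
   effect while doing so.  Both parts of the global state are saved and
   restored, so the caller's explicit setting survives.  */
int
handle_optimize_attr (struct options *global, int level)
{
  assert (global != NULL);
  int saved_opt_level = global->opt_level;
  int saved_target_flag = global->target_flag;	/* The newly saved part.  */

  apply_level_defaults (global, level);
  int flag_during_attr = global->target_flag;

  global->opt_level = saved_opt_level;
  global->target_flag = saved_target_flag;
  return flag_during_attr;
}
```

If the global state starts as level 2 with the target flag explicitly off, the flag is on while the attribute is processed and off again afterwards; without the second save it would stay clobbered.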

Bootstrapped and regression tested on x86_64-pc-linux-gnu.
c-family/
* handle_optimize_attribute: Save & restore target options too.

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index f54388e9939..82d85ef8e01 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -5333,10 +5333,13 @@ handle_optimize_attribute (tree *node, tree name, tree 
args,
   else
 {
   struct cl_optimization cur_opts;
+  struct cl_target_option cur_target_opts;
   tree old_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (*node);
 
   /* Save current options.  */
   cl_optimization_save (&cur_opts, &global_options, &global_options_set);
+  cl_target_option_save (&cur_target_opts, &global_options,
+&global_options_set);
 
   /* If we previously had some optimization options, use them as the
 default.  */
@@ -5359,6 +5362,8 @@ handle_optimize_attribute (tree *node, tree name, tree 
args,
   /* Restore current options.  */
   cl_optimization_restore (&global_options, &global_options_set,
   &cur_opts);
+  cl_target_option_restore (&global_options, &global_options_set,
+   &cur_target_opts);
   if (saved_global_options != NULL)
{
  cl_optimization_compare (saved_global_options, &global_options);

Re: [Patch] Testsuite/Fortran: gfortran.dg/pr96711.f90 - fix expected value for PowerPC [PR96983]

2021-05-19 Thread Tobias Burnus


On 19.05.21 22:39, Bernhard Reutner-Fischer wrote:

On Wed, 19 May 2021 20:35:26 +0200
Tobias Burnus  wrote:

As I like that patch and believe it is obvious, I intent to

/intent/s/tent/tend/

?

No real comment except that it sounds odd to arrive at 53 instead of
the quad bits precision on an arch that allegedly does hardware FP?

  real(8):: x
  integer(16), parameter :: m1 = 2_16**digits(x) ! Usually 2**53

i.e. 'x' is double precision not quad precision.
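
The same number can be checked from C, assuming real(8) corresponds to the C `double` type on the target (true for the usual IEEE binary64 configurations, but an assumption here). The helper name below is made up.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

/* C counterpart of 2**digits(x) for a double: DBL_MANT_DIG is the
   number of radix digits in the significand.  */
uint64_t
two_to_double_digits (void)
{
  /* The shift below is only meaningful for a binary format whose
     significand fits in 64 bits.  */
  assert (FLT_RADIX == 2 && DBL_MANT_DIG < 64);
  return UINT64_C (1) << DBL_MANT_DIG;
}
```

On a target with IEEE binary64 doubles, `DBL_MANT_DIG` is 53 and the function returns 9007199254740992.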

Tobias
-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf

Re: [PATCH,rs6000 1/2] combine patterns for add-add fusion

2021-05-19 Thread Segher Boessenkool

On Mon, Apr 26, 2021 at 03:21:29PM -0500, acsaw...@linux.ibm.com wrote:
> This patch adds a function to genfusion.pl to add a couple
> more patterns so combine can do fusion of pairs of add and
> vaddudm instructions.

> +sub gen_addadd
> +{
> +my ($kind, $vchr, $op, $ty, $mode, $pred, $constraint);

Does spelling out $type conflict with anything? :-)

> +  KIND: foreach $kind ('scalar','vector') {
Why put a label on this?  It isn't used.

> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -4467,6 +4467,9 @@ rs6000_option_override_internal (bool global_init_p)
>if (TARGET_POWER10 && (rs6000_isa_flags_explicit & 
> OPTION_MASK_P10_FUSION_2LOGICAL) == 0)
>  rs6000_isa_flags |= OPTION_MASK_P10_FUSION_2LOGICAL;
>  
> +  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & 
> OPTION_MASK_P10_FUSION_2ADD) == 0)

Line too long.  Just break before the &&?

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fusion-p10-addadd.c
> @@ -0,0 +1,41 @@
> +/* { dg-do compile { target { powerpc*-*-* } } } */

Just omit the target, anything in gcc.target/powerpc/ is run only for
powerpc already.  You can also omit the dg-do completely, "compile" is
the default, but it can be nice documentation.

> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */

Do we know it does not work on darwin?  If not, please don't disable it
there -- if we do that unnecessarily all the time, much too little is
tested on darwin.

Okay for trunk, and a backport to 11 later.  Thanks!  And the previous
patch is fine for backport as well of course.

Segher

Re: [Patch] Testsuite/Fortran: gfortran.dg/pr96711.f90 - fix expected value for PowerPC [PR96983]

2021-05-19 Thread Bernhard Reutner-Fischer via Gcc-patches

On Wed, 19 May 2021 22:39:13 +0200
Bernhard Reutner-Fischer  wrote:

> On Wed, 19 May 2021 20:35:26 +0200
> Tobias Burnus  wrote:

> > commit it as such – unless there are further comments.  
> 
> No real comment except ..

why don't we end up with IEEE binary128 quadruple precision here by
default?  Suspicious, no?
thanks,

[PATCH] avoid -Wnonnull with lambda (PR 100684)

2021-05-19 Thread Martin Sebor via Gcc-patches


The front end -Wnonnull handler has code to suppress warning for
lambdas with null this pointers but the middle end handler has
no corresponding logic.  This leads to spurious -Wnonnull in
lambda calls that aren't inlined.  That might happen at low
optimization levels such as -O1 or -Og and with sanitization.

The attached patch enhances the middle end -Wnonnull to deal
with this case and avoid the false positives.

Tested on x86_64-linux.

Martin
PR middle-end/100684 - spurious -Wnonnull with -O1 on a C++ lambda

gcc/ChangeLog:

	PR middle-end/100684
	* tree-ssa-ccp.c (pass_post_ipa_warn::execute): Avoid warning for
	the first argument to lambda functions.

gcc/testsuite/ChangeLog:

	PR middle-end/100684
	* g++.dg/warn/Wnonnull13.C: New test.
	* g++.dg/warn/Wnonnull14.C: New test.
	* g++.dg/warn/Wnonnull15.C: New test.

diff --git a/gcc/testsuite/g++.dg/warn/Wnonnull13.C b/gcc/testsuite/g++.dg/warn/Wnonnull13.C
new file mode 100644
index 000..e3279764ac0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wnonnull13.C
@@ -0,0 +1,28 @@
+/* PR middle-end/100684 - spurious -Wnonnull with -O1 on a C++ lambda
+   { dg-do compile { target c++11 } }
+   { dg-options "-O0 -Wall -fsanitize=undefined" } */
+
+#define NONNULL  __attribute__ ((nonnull))
+
+typedef int F (const char *);
+
+NONNULL int f (const char *);
+
+int nowarn_O0 ()
+{
+  return static_cast<F*>([](const char *s){ return f (s); })("O0");
+  // { dg-bogus "\\\[-Wnonnull" "" { target *-*-* } .-1 }
+}
+
+int warn_O0 ()
+{
+  return static_cast<F*>([] NONNULL (const char *){ return 0; })(0);
+  // { dg-warning "\\\[-Wnonnull" "" { target *-*-* } .-1 }
+}
+
+int warn_O0_inline ()
+{
+  return static_cast<F*>([](const char *s){ return f (s); })(0);
+  // { dg-warning "\\\[-Wnonnull" "lambda not inlined" { xfail *-*-* } .-1 }
+}
+
diff --git a/gcc/testsuite/g++.dg/warn/Wnonnull14.C b/gcc/testsuite/g++.dg/warn/Wnonnull14.C
new file mode 100644
index 000..16d7ec3f573
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wnonnull14.C
@@ -0,0 +1,28 @@
+/* PR middle-end/100684 - spurious -Wnonnull with -O1 on a C++ lambda
+   { dg-do compile { target c++11 } }
+   { dg-options "-Og -Wall -fsanitize=undefined" } */
+
+#define NONNULL  __attribute__ ((nonnull))
+
+typedef int F (const char *);
+
+__attribute__ ((nonnull)) int f (const char *);
+
+int nowarn_Og ()
+{
+  return static_cast<F*>([](const char *s){ return f (s); })("Og");
+  // { dg-bogus "'this' pointer is null" "" { target *-*-* } .-1 }
+}
+
+int warn_Og ()
+{
+  return static_cast<F*>([] NONNULL (const char *){ return 0; })(0);
+  // { dg-warning "\\\[-Wnonnull" "" { target *-*-* } .-1 }
+}
+
+int warn_Og_inline ()
+{
+  const char *p = 0;
+  return static_cast<F*>([](const char *s){ return f (s); })(p);
+  // { dg-warning "\\\[-Wnonnull" "lambda not inlined" { xfail *-*-* } .-1 }
+}
diff --git a/gcc/testsuite/g++.dg/warn/Wnonnull15.C b/gcc/testsuite/g++.dg/warn/Wnonnull15.C
new file mode 100644
index 000..36a2ab48789
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wnonnull15.C
@@ -0,0 +1,28 @@
+/* PR middle-end/100684 - spurious -Wnonnull with -O1 on a C++ lambda
+   { dg-do compile { target c++11 } }
+   { dg-options "-O1 -Wall -fsanitize=undefined" } */
+
+#define NONNULL  __attribute__ ((nonnull))
+
+typedef int F (const char *);
+
+NONNULL int f (const char *);
+
+int nowarn_O1 ()
+{
+  return static_cast<F*>([](const char *s){ return f (s); })("O1");
+  // { dg-bogus "\\\[-Wnonnull" "" { target *-*-* } .-1 }
+}
+
+int warn_O1 ()
+{
+  return static_cast<F*>([] NONNULL (const char *){ return 0; })(0);
+  // { dg-warning "\\\[-Wnonnull" "" { target *-*-* } .-1 }
+}
+
+int warn_O1_inline ()
+{
+  const char *p = 0;
+  return static_cast<F*>([](const char *s){ return f (s); })(p);
+  // { dg-warning "\\\[-Wnonnull" "lambda not inlined" { xfail *-*-* } .-1 }
+}
diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index bf31f035153..3834212b867 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -3536,6 +3536,7 @@ pass_post_ipa_warn::execute (function *fun)
 	continue;
 
 	  tree fndecl = gimple_call_fndecl (stmt);
+	  const bool closure = fndecl && DECL_LAMBDA_FUNCTION_P (fndecl);
 
 	  for (unsigned i = 0; i < gimple_call_num_args (stmt); i++)
 	{
@@ -3544,6 +3545,9 @@ pass_post_ipa_warn::execute (function *fun)
 		continue;
 	  if (!integer_zerop (arg))
 		continue;
+	  if (i == 0 && closure)
+		/* Avoid warning for the first argument to lambda functions.  */
+		continue;
 	  if (!bitmap_empty_p (nonnullargs)
 		  && !bitmap_bit_p (nonnullargs, i))
 		continue;

[PATCH] handle VLAs with arbitrary many bounds (PR 100619)

2021-05-19 Thread Martin Sebor via Gcc-patches


The GCC 11 -Warray-parameter and -Wvla-parameter enhancement uses
a small local buffer to format the description of the VLA bounds
for the internal attribute access.  When the number of bounds is
in excess of the size of the buffer the code asserts as the test
case in pr100619 shows.  The attached change does away with
the small buffer and instead formats the description into
the destination string itself, thus letting it handle VLAs with
arbitrarily many bounds.
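
The "measure, grow, then format in place" idea can be sketched in plain C. This is not the code from the patch, which works on a C++ string; `append_bound_pos` is a hypothetical helper that uses `realloc` on a heap buffer instead.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Append "$<pos>" to the heap string *buf, whose current length
   (excluding the terminating NUL) is *len.  *buf may be NULL when
   *len is 0.  Returns 0 on success and -1 on failure.  */
int
append_bound_pos (char **buf, size_t *len, unsigned pos)
{
  assert (buf != NULL && len != NULL);

  /* First pass: ask snprintf how many characters are needed.  */
  int need = snprintf (NULL, 0, "$%u", pos);
  if (need < 0)
    return -1;

  /* Grow the destination to hold the old text, the new text and a NUL.  */
  char *grown = realloc (*buf, *len + (size_t) need + 1);
  if (grown == NULL)
    return -1;

  /* Second pass: format directly at the end of the existing text.  */
  snprintf (grown + *len, (size_t) need + 1, "$%u", pos);
  *buf = grown;
  *len += (size_t) need;
  return 0;
}
```

Because the destination grows on every call, no fixed-size scratch buffer limits how many positions can be appended.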

Tested on x86_64-linux.

Martin
PR c/100619 - ICE on a VLA parameter with too many dimensions

gcc/c-family/ChangeLog:

	PR c/100619
	* c-attribs.c (build_attr_access_from_parms): Handle arbitrarily many
	bounds.

gcc/testsuite/ChangeLog:

	PR c/100619
	* gcc.dg/pr100619.c: New test.

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index ecb32c70172..ccf9e4ccf0b 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -5043,16 +5043,25 @@ build_attr_access_from_parms (tree parms, bool skip_voidptr)
   /* Create the attribute access string from the arg spec string,
 	 optionally followed by position of the VLA bound argument if
 	 it is one.  */
-  char specbuf[80];
-  int len = snprintf (specbuf, sizeof specbuf, "%c%u%s",
-			  attr_access::mode_chars[access_deferred],
-			  argpos, s);
-  gcc_assert ((size_t) len < sizeof specbuf);
-
-  if (!spec.length ())
-	spec += '+';
+  {
+	size_t specend = spec.length ();
+	if (!specend)
+	  {
+	spec = '+';
+	specend = 1;
+	  }
 
-  spec += specbuf;
+	/* Format the access string in place.  */
+	int len = snprintf (NULL, 0, "%c%u%s",
+			attr_access::mode_chars[access_deferred],
+			argpos, s);
+	spec.resize (specend + len + 1);
+	sprintf (&spec[specend], "%c%u%s",
+		 attr_access::mode_chars[access_deferred],
+		 argpos, s);
+	/* Trim the trailing NUL.  */
+	spec.resize (specend + len);
+  }
 
   /* The (optional) list of expressions denoting the VLA bounds
 	 N in ARGTYPE [Ni]...[Nj]...[Nk].  */
@@ -5077,8 +5086,13 @@ build_attr_access_from_parms (tree parms, bool skip_voidptr)
 		{
 		  /* BOUND previously seen in the parameter list.  */
 		  TREE_PURPOSE (vb) = size_int (*psizpos);
-		  sprintf (specbuf, "$%u", *psizpos);
-		  spec += specbuf;
+		  /* Format the position string in place.  */
+		  int len = snprintf (NULL, 0, "$%u", *psizpos);
+		  size_t specend = spec.length ();
+		  spec.resize (specend + len + 1);
+		  sprintf (&spec[specend], "$%u", *psizpos);
+		  /* Trim the trailing NUL.  */
+		  spec.resize (specend + len);
 		}
 	  else
 		{
diff --git a/gcc/testsuite/gcc.dg/pr100619.c b/gcc/testsuite/gcc.dg/pr100619.c
new file mode 100644
index 000..89b62938797
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr100619.c
@@ -0,0 +1,24 @@
+/* PR c/100619 - ICE on a VLA parameter with too many dimensions
+   { dg-do compile }
+   { dg-options "-Wall" } */
+
+extern int n;
+
+#define A10 [n][n][n][n][n][n][n][n][n][n]
+#define A100    A10 A10 A10 A10 A10 A10 A10 A10 A10 A10
+#define A1000   A100 A100 A100 A100 A100 A100 A100 A100 A100 A100 A100
+
+void f10 (int A10);
+void f10 (int A10);
+
+void f100 (int A100);
+void f100 (int A100);
+
+void f1000 (int A1000);
+void f1000 (int A1000);
+
+void fx_1000 (int [ ]A1000);
+void fx_1000 (int [1]A1000);// { dg-warning "-Warray-parameter=" }
+
+void fn_1000 (int [n]A1000);
+void fn_1000 (int [n + 1]A1000);// { dg-warning "-Wvla-parameter=" }

Re: [Patch] Testsuite/Fortran: gfortran.dg/pr96711.f90 - fix expected value for PowerPC [PR96983]

2021-05-19 Thread Bernhard Reutner-Fischer via Gcc-patches

On Wed, 19 May 2021 20:35:26 +0200
Tobias Burnus  wrote:

> Hi Segher,
> 
> Quick version: Jump to the new patch, which I like much more.

> Namely, as the attached updated patch does.
> 
> As I like that patch and believe it is obvious, I intent to

/intent/s/tent/tend/

> commit it as such – unless there are further comments.

No real comment except that it sounds odd to arrive at 53 instead of
the quad bits precision on an arch that allegedly does hardware FP?
Did not look close though so i claim to haven't said a word, let alone
2*2 nor four.

thanks,
> 
> It passes on both x86-64-gnu-linux and powerpc64le-none-linux-gnu.
> I think the radix == 2 is a good bet, but if we ever run into issues,
> it can also be changed to use radix(...) as well ...
> 
> Tobias
> 
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
> Thürauf

Re: [PATCH] c: Add support for __FILE_NAME__ macro (PR c/42579)

2021-05-19 Thread Christophe Lyon via Gcc-patches

On Wed, 19 May 2021 at 16:50, Joseph Myers  wrote:
>
> This patch is missing documentation (in cpp.texi) and tests for the value
> of the macro.
>

Indeed. How about this new version?

Thanks

Christophe

> --
> Joseph S. Myers
> jos...@codesourcery.com
commit d0e79f75dc3723231609f24e2840ac5858a652e1
Author: Christophe Lyon 
Date:   Wed May 19 20:28:08 2021 +

c: Add support for __FILE_NAME__ macro (PR c/42579)

The toolchain provided by ST for stm32 has had support for
__FILENAME__ for a while, but clang/llvm has recently implemented
support for __FILE_NAME__, so it seems better to use the same macro
name in GCC.

It happens that the ST patch is similar to the one proposed in PR
c/42579.

Given these input files:
::
mydir/myinc.h
::
char* mystringh_file = __FILE__;
char* mystringh_filename = __FILE_NAME__;
char* mystringh_base_file = __BASE_FILE__;
::
mydir/mysrc.c
::

char* mystring_file = __FILE__;
char* mystring_filename = __FILE_NAME__;
char* mystring_base_file = __BASE_FILE__;

we produce:
$ gcc mydir/mysrc.c -I . -E
char* mystringh_file = "./mydir/myinc.h";
char* mystringh_filename = "myinc.h";
char* mystringh_base_file = "mydir/mysrc.c";

char* mystring_file = "mydir/mysrc.c";
char* mystring_filename = "mysrc.c";
char* mystring_base_file = "mydir/mysrc.c";
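
As an illustration only, the mapping from __FILE__ to __FILE_NAME__ in the output above can be modelled as "take everything after the last directory separator". The sketch below is not the libcpp code; the function name is hypothetical and it assumes '/' is the only separator.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the __FILE_NAME__ value: the last component of
// PATH, assuming '/' is the only directory separator.
std::string file_name_component (const std::string &path)
{
  const std::string::size_type slash = path.rfind ('/');
  // No separator: the whole path is already the file name.
  if (slash == std::string::npos)
    return path;
  return path.substr (slash + 1);
}

// The paths from the example output above.
void check_file_name_component ()
{
  assert (file_name_component ("./mydir/myinc.h") == "myinc.h");
  assert (file_name_component ("mydir/mysrc.c") == "mysrc.c");
}
```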

2021-05-19  Christophe Lyon  
Torbjörn Svensson  

PR c/42579
libcpp/
* include/cpplib.h (cpp_builtin_type): Add BT_FILE_NAME entry.
* init.c (builtin_array): Likewise.
* macro.c (_cpp_builtin_macro_text): Add support for BT_FILE_NAME.

gcc/
* doc/cpp.texi (Common Predefined Macros): Document __FILE_NAME__.

gcc/testsuite/
* c-c++-common/spellcheck-reserved.c: Add tests for __FILE_NAME__.
* c-c++-common/cpp/file-name-1.c: New test.

diff --git a/gcc/doc/cpp.texi b/gcc/doc/cpp.texi
index 2c109bbc5bd..0b5e6cdd6bd 100644
--- a/gcc/doc/cpp.texi
+++ b/gcc/doc/cpp.texi
@@ -2005,6 +2005,13 @@ This macro expands to the name of the main input file, in the form
 of a C string constant.  This is the source file that was specified
 on the command line of the preprocessor or C compiler.
 
+@item __FILE_NAME__
+This macro expands to the basename of the current input file, in the
+form of a C string constant.  This is the last path component by which
+the preprocessor opened the file.  For example, processing
+@code{"/usr/local/include/myheader.h"} would set this
+macro to @code{"myheader.h"}.
+
 @item __INCLUDE_LEVEL__
 This macro expands to a decimal integer constant that represents the
 depth of nesting in include files.  The value of this macro is
diff --git a/gcc/testsuite/c-c++-common/cpp/file-name-1.c b/gcc/testsuite/c-c++-common/cpp/file-name-1.c
new file mode 100644
index 000..2b476e34cce
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/file-name-1.c
@@ -0,0 +1,22 @@
+/* { dg-do preprocess } */
+/* { dg-additional-options -Wno-pedantic } */
+
+main-1 __FILE_NAME__
+
+# 7 "inner.h" 1
+inner-1 __FILE_NAME__
+# 9 "subdir/inside.h" 1
+inside-1 __FILE_NAME__
+inside-2 __FILE__
+# 11 "" 2
+inner-2 __FILE_NAME__
+#13 "" 2
+main-2 __FILE_NAME__
+
+
+/* { dg-final { scan-file file-name-1.i "main-1 \"\[^\n]*file-name-1.c\"\n" } } */
+/* { dg-final { scan-file file-name-1.i "main-2 \"\[^\n]*file-name-1.c\"\n" } } */
+/* { dg-final { scan-file file-name-1.i "inner-1 \"inner.h\"\n" } } */
+/* { dg-final { scan-file file-name-1.i "inner-2 \"inner.h\"\n" } } */
+/* { dg-final { scan-file file-name-1.i "inside-1 \"inside.h\"\n" } } */
+/* { dg-final { scan-file file-name-1.i "inside-2 \"subdir/inside.h\"\n" } } */
diff --git a/gcc/testsuite/c-c++-common/spellcheck-reserved.c b/gcc/testsuite/c-c++-common/spellcheck-reserved.c
index ed292f2bae0..56e59dcc00a 100644
--- a/gcc/testsuite/c-c++-common/spellcheck-reserved.c
+++ b/gcc/testsuite/c-c++-common/spellcheck-reserved.c
@@ -50,3 +50,21 @@ const char * test_3 (void)
   /* { dg-error "did you mean '__FILE__'" "" { target c } misspelled__FILE_ } */
   /* { dg-error "'__FILE_' was not declared in this scope; did you mean '__FILE__'\\?"  "" { target c++ } misspelled__FILE_ } */
 }
+
+/* Verify that we can correct "__FILE_NAME_" to "__FILE_NAME__".  */
+
+const char * test_4 (void)
+{
+  return __FILE_NAME_; /* { dg-line misspelled__FILE_NAME_ } */
+  /* { dg-error "did you mean '__FILE_NAME__'" "" { target c } misspelled__FILE_NAME_ } */
+  /* { dg-error "'__FILE_NAME_' was not declared in this scope; did you mean '__FILE_NAME__'\\?"  "" { target c++ } misspelled__FILE_NAME_ } */
+}
+
+/* Verify that we can correct "__FILENAME__" to "__FILE_NAME__".  */
+
+const char * test_5 (void)
+{
+  return __FILENAME__; /* { dg-line misspelled__FILENAME__ } */
+  /* { dg-error "did you mea

[PATCH] Split gimple statement range folding into a stand alone class.

2021-05-19 Thread Andrew MacLeod via Gcc-patches

When ranger was first written, it processed all the range-ops 
statements; support for the remaining statement kinds was added slowly, 
sharing as much code with vr_values as we could.


We are now at a point where it makes sense to split this out into its 
own class. There are a number of places where range-ops is treated 
"special" because it follows a specific formula, and the other kinds of 
statements have become second class citizens.  By pulling all the 
statement processing out of gimple-ranger and into a class which handles 
the statements and uses general range queries, we can treat all the 
statements the same way. There are numerous benefits to this.


  1) with the upcoming relational work, I want to be able to query 
relations in builtin function processing (and other places) as well as  
range-ops.
  2) GORI is going to be enhanced so we can look back thru things other 
than range-ops statements... this will help with overflow analysis in 
builtins especially.
  3) simplified interface for anyone wanting to calculate a range.  
a new routine:

    bool fold_stmt (irange&r, gimple *s, range_query *q=NULL)
will do the look-ups and provide a range for the stmt.  This is now being 
used internally in ranger and can work with global ranges, or any kind 
of range_query.


All the various stmt kind processing is pulled out of gimple_ranger and 
moved into class fold_using_range, which has a single external API.  A 
helper class fur_source (FoldUsingRange_source) is used to provide a 
mechanism for the statement folder to resolve any ssa_names.  It can be 
set up to provide range queries from the original stmt location, on an 
edge, or from any arbitrary location.


This allows us to not only fold a stmt where it sits in the IL, but also 
gives us the power to recalculate a stmt as if it occurs somewhere 
else.  A second function is provided which specifies an edge, and the 
statement can be recalculated as if it was sitting on that edge.  It is 
also possible (and ranger uses this) to calculate the statement as it 
occurs elsewhere in the program.


This is useful for things like

   b_2 = a_1 + 10
   if (a_1 > 20 && a_1 < 40)
 c_5 = b_2 + 4

When ranger calculates a range for c_5, it recognizes that b_2 is 
dependent on the value of a_1, and a_1 has been "refined" since b_2 was 
defined.  So rather than using the original value of b_2, SSA allows us 
to recalculate b_2 at the location of the use, i.e., we know a_1 is 
[21, 39], so b_2 is recalculated at that use as [21, 39] + 10 = [31, 49]. 
The value of c_5 is then calculated as [31, 49] + 4 = [35, 53].
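
The arithmetic in this example can be checked with a toy interval type. This is only a sketch of the calculation, not ranger code: the type and function names are hypothetical, ranges are plain closed intervals of long, and overflow is ignored.

```cpp
#include <cassert>

// Toy closed interval [lo, hi]; a stand-in for a value range.
struct interval
{
  long lo;
  long hi;
};

// Range of X + C for a constant C (overflow ignored in this sketch).
interval add_const (interval x, long c)
{
  return { x.lo + c, x.hi + c };
}

// Range of a value known to satisfy LOW < v && v < HIGH.
interval between_exclusive (long low, long high)
{
  assert (low + 1 <= high - 1);
  return { low + 1, high - 1 };
}

// Recompute b_2 and c_5 under the refined range of a_1.
void check_recalculation ()
{
  interval a_1 = between_exclusive (20, 40);  // [21, 39]
  interval b_2 = add_const (a_1, 10);         // [31, 49]
  interval c_5 = add_const (b_2, 4);          // [35, 53]
  assert (b_2.lo == 31 && b_2.hi == 49);
  assert (c_5.lo == 35 && c_5.hi == 53);
}
```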


This works with an arbitrary amount of IL between the statements since 
it's driven by uses and defs.


This is the first of a set of cleanup/generalizations. Bootstraps on 
x86_64-pc-linux-gnu with no regressions.


Pushed.

Andrew

commit dc6758f03effbf7d6946d8c314576c7a6c0003af
Author: Andrew MacLeod 
Date:   Tue May 18 20:33:09 2021 -0400

Split gimple range folding with ranges into a stand alone class.

Introduces fold_using_range which folds any kind of gimple statement by
querying argument ranges thru a generic range_query.
This pulls all the statement processing into a client neutral location.

* gimple-range.cc (fur_source::get_operand): New.
(gimple_range_fold): Delete.
(fold_using_range::fold_stmt): Move from gimple_ranger::calc_stmt.
(fold_using_range::range_of_range_op): Move from gimple_ranger.
(fold_using_range::range_of_address): Ditto.
(fold_using_range::range_of_phi): Ditto.
(fold_using_range::range_of_call): Ditto.
(fold_using_range::range_of_builtin_ubsan_call): Move from
range_of_builtin_ubsan_call.
(fold_using_range::range_of_builtin_call): Move from
range_of_builtin_call.
(gimple_ranger::range_of_builtin_call): Delete.
(fold_using_range::range_of_cond_expr): Move from gimple_ranger.
(gimple_ranger::fold_range_internal): New.
(gimple_ranger::range_of_stmt): Use new fold_using_range API.
(fold_using_range::range_of_ssa_name_with_loop_info): Move from
gimple_ranger.  Improve ranges of SSA_NAMES when possible.
* gimple-range.h (gimple_ranger): Remove various range_of routines.
(class fur_source): New.
(class fold_using_range): New.
(fur_source::fur_source): New.
(fold_range): New.
* vr-values.c (vr_values::extract_range_basic): Use fold_using_range
instead of range_of_builtin_call.

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index 710bc7f9632..06e9804494b 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -47,6 +47,31 @@ along with GCC; see the file COPYING3.  If not see
 #include "vr-values.h"
 #include "gimple-range.h"
 
+// Evaluate expression EXPR using the source information the cl

Re: [PATCH,rs6000] Add insn types for fusion pairs

2021-05-19 Thread Segher Boessenkool

Hi!

On Mon, Apr 26, 2021 at 01:04:56PM -0500, acsaw...@linux.ibm.com wrote:
> This adds new values for insn attr type for p10 fusion. The genfusion.pl
> script is modified to use them, and fusion.md regenerated to capture
> the new patterns. There are also some formatting only changes to
> fusion.md that apparently weren't captured after a previous commit
> of genfusion.pl.

That is fine, it is a generated file after all.

> @@ -227,7 +229,7 @@ sub gen_2logical
> ${inner_op} %3,%1,%0\\;${outer_op} %3,%3,%2
> ${inner_op} %3,%1,%0\\;${outer_op} %3,%3,%2
> ${inner_op} %4,%1,%0\\;${outer_op} %3,%4,%2"
> -  [(set_attr "type" "logical")
> +  [(set_attr "type" "$fuse_type")
> (set_attr "cost" "6")
> (set_attr "length" "8")])
>  EOF

You can make the rest use heredocs as well, when you have some time?  It
is so much easier to read :-)

> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -204,7 +204,9 @@ (define_attr "type"
> vecsimple,veccomplex,vecdiv,veccmp,veccmpsimple,vecperm,
> vecfloat,vecfdiv,vecdouble,mtvsr,mfvsr,crypto,
> veclogical,veccmpfx,vecexts,vecmove,
> -   htm,htmsimple,dfp,mma"
> +   htm,htmsimple,dfp,mma,
> +   fused_arith_logical,fused_cmp_isel,fused_carry,fused_load_cmpi,
> +   
> fused_load_load,fused_store_store,fused_addis_load,fused_mtbc,fused_vector"

Maybe you can document what the fused types are for?  (Maybe we should
for some of the others as well, but most are trivial thankfully.)  You
cannot use inline comments here (we are inside a quoted string), but
maybe you can put something after it?  For example, fused_arith_logical
is not for an arith insn fused with a logical insn, which is how most of
the other names are patterned.

Oh, and it doesn't hurt to use more lines here, if you can group things
better that would help.  And of course you can use whitespace before
names here (you cannot tell from existing entries, but :-) )

Okay for trunk, with or without such improvements.  Thanks!

Segher

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Jason Merrill via Gcc-patches


On 5/19/21 4:05 PM, Jonathan Wakely wrote:

On 19/05/21 20:55 +0100, Jonathan Wakely wrote:

On 19/05/21 13:26 -0400, Jason Merrill wrote:

On 5/19/21 12:46 PM, Jonathan Wakely wrote:

On 19/05/21 17:39 +0100, Jonathan Wakely wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Jakub also pointed out we already have some similar diagnostics for
C++23, but I missed them as they say "only optional with" not "only
available with".

I'm testing the incremental change in the attached patch which also
adds -Wc++23-extensions, and I'll resend the full patch after that
finishes.



 if (omitted_parms_loc && lambda_specs.any_specifiers_p)
   {
-  pedwarn (omitted_parms_loc, 0,
+  pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,


You probably want to change


else if (cxx_dialect < cxx23)
  omitted_parms_loc = cp_lexer_peek_token (parser->lexer)->location;


To use warn_about_dialect_p.


Ah yes.

And just above that there's another pedwarn about a C++14 feature
being used:


 /* Default arguments shall not be specified in the
 parameter-declaration-clause of a lambda-declarator.  */
 if (cxx_dialect < cxx14)
for (tree t = param_list; t; t = TREE_CHAIN (t))
  if (TREE_PURPOSE (t) && DECL_P (TREE_VALUE (t)))
    pedwarn (DECL_SOURCE_LOCATION (TREE_VALUE (t)), OPT_Wpedantic,
 "default argument specified for lambda parameter");


I didn't notice that one initially. That should also use
warn_about_dialect_p and OPT_Wc__14_extensions.

Should I change the message to say "init capture" rather than
"default argument"?

I'll add some docs to invoke.texi and get a new patch out.


Oh, also we have https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93769
which points out a problem with the current wording. Not a very
important one, but still ...

While I'm touching all 38(?) places that say "only available with
-std=c++NN or -std=gnu++NN" I could change them to say something like
"only available since C++NN". Should I bother?

Clang's equivalent warnings say "are a C++11 feature" e.g.

ext.C:1:17: warning: inline namespaces are a C++11 feature 
[-Wc++11-inline-namespace]


(They have a specific warning for each feature, with
-Wc++11-extensions to control them all at once.)


The clang wording seems more accurate, as that PR points out.

Jason

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Jason Merrill via Gcc-patches


On 5/19/21 3:55 PM, Jonathan Wakely wrote:

On 19/05/21 13:26 -0400, Jason Merrill wrote:

On 5/19/21 12:46 PM, Jonathan Wakely wrote:

On 19/05/21 17:39 +0100, Jonathan Wakely wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Jakub also pointed out we already have some similar diagnostics for
C++23, but I missed them as they say "only optional with" not "only
available with".

I'm testing the incremental change in the attached patch which also
adds -Wc++23-extensions, and I'll resend the full patch after that
finishes.



  if (omitted_parms_loc && lambda_specs.any_specifiers_p)
    {
-  pedwarn (omitted_parms_loc, 0,
+  pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,


You probably want to change


 else if (cxx_dialect < cxx23)
   omitted_parms_loc = cp_lexer_peek_token (parser->lexer)->location;


To use warn_about_dialect_p.


Ah yes.

And just above that there's another pedwarn about a C++14 feature
being used:


  /* Default arguments shall not be specified in the
  parameter-declaration-clause of a lambda-declarator.  */
  if (cxx_dialect < cxx14)
 for (tree t = param_list; t; t = TREE_CHAIN (t))
   if (TREE_PURPOSE (t) && DECL_P (TREE_VALUE (t)))
     pedwarn (DECL_SOURCE_LOCATION (TREE_VALUE (t)), OPT_Wpedantic,
  "default argument specified for lambda parameter");


I didn't notice that one initially. That should also use
warn_about_dialect_p and OPT_Wc__14_extensions.


Indeed.


Should I change the message to say "init capture" rather than
"default argument"?


No, this is about e.g. [](int = 42){}

Jason

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Jonathan Wakely via Gcc-patches


On 19/05/21 20:55 +0100, Jonathan Wakely wrote:

On 19/05/21 13:26 -0400, Jason Merrill wrote:

On 5/19/21 12:46 PM, Jonathan Wakely wrote:

On 19/05/21 17:39 +0100, Jonathan Wakely wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Jakub also pointed out we already have some similar diagnostics for
C++23, but I missed them as they say "only optional with" not "only
available with".

I'm testing the incremental change in the attached patch which also
adds -Wc++23-extensions, and I'll resend the full patch after that
finishes.



 if (omitted_parms_loc && lambda_specs.any_specifiers_p)
   {
-  pedwarn (omitted_parms_loc, 0,
+  pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,


You probably want to change


else if (cxx_dialect < cxx23)
  omitted_parms_loc = cp_lexer_peek_token (parser->lexer)->location;


To use warn_about_dialect_p.


Ah yes.

And just above that there's another pedwarn about a C++14 feature
being used:


 /* Default arguments shall not be specified in the
 parameter-declaration-clause of a lambda-declarator.  */
 if (cxx_dialect < cxx14)
for (tree t = param_list; t; t = TREE_CHAIN (t))
  if (TREE_PURPOSE (t) && DECL_P (TREE_VALUE (t)))
pedwarn (DECL_SOURCE_LOCATION (TREE_VALUE (t)), OPT_Wpedantic,
 "default argument specified for lambda parameter");


I didn't notice that one initially. That should also use
warn_about_dialect_p and OPT_Wc__14_extensions.

Should I change the message to say "init capture" rather than
"default argument"?

I'll add some docs to invoke.texi and get a new patch out.


Oh, also we have https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93769
which points out a problem with the current wording. Not a very
important one, but still ...

While I'm touching all 38(?) places that say "only available with
-std=c++NN or -std=gnu++NN" I could change them to say something like
"only available since C++NN". Should I bother?

Clang's equivalent warnings say "are a C++11 feature" e.g.

ext.C:1:17: warning: inline namespaces are a C++11 feature 
[-Wc++11-inline-namespace]

(They have a specific warning for each feature, with
-Wc++11-extensions to control them all at once.)

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Jonathan Wakely via Gcc-patches


On 19/05/21 13:26 -0400, Jason Merrill wrote:

On 5/19/21 12:46 PM, Jonathan Wakely wrote:

On 19/05/21 17:39 +0100, Jonathan Wakely wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Jakub also pointed out we already have some similar diagnostics for
C++23, but I missed them as they say "only optional with" not "only
available with".

I'm testing the incremental change in the attached patch which also
adds -Wc++23-extensions, and I'll resend the full patch after that
finishes.



  if (omitted_parms_loc && lambda_specs.any_specifiers_p)
{
-  pedwarn (omitted_parms_loc, 0,
+  pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,


You probably want to change


 else if (cxx_dialect < cxx23)
   omitted_parms_loc = cp_lexer_peek_token (parser->lexer)->location;


To use warn_about_dialect_p.


Ah yes.

And just above that there's another pedwarn about a C++14 feature
being used:


  /* Default arguments shall not be specified in the
 parameter-declaration-clause of a lambda-declarator.  */
  if (cxx_dialect < cxx14)
for (tree t = param_list; t; t = TREE_CHAIN (t))
  if (TREE_PURPOSE (t) && DECL_P (TREE_VALUE (t)))
pedwarn (DECL_SOURCE_LOCATION (TREE_VALUE (t)), OPT_Wpedantic,
 "default argument specified for lambda parameter");


I didn't notice that one initially. That should also use
warn_about_dialect_p and OPT_Wc__14_extensions.

Should I change the message to say "init capture" rather than
"default argument"?

I'll add some docs to invoke.texi and get a new patch out.

Re: [PATCH v4 12/12] constructor: Check if it is faster to load constant from memory

2021-05-19 Thread H.J. Lu via Gcc-patches

On Wed, May 19, 2021 at 6:27 AM Bernd Edlinger
 wrote:
>
> On 5/19/21 3:22 PM, H.J. Lu wrote:
> > On Wed, May 19, 2021 at 2:33 AM Richard Biener
> >  wrote:
> >>
> >> On Tue, May 18, 2021 at 9:16 PM H.J. Lu  wrote:
> >>>
> >>> When expanding a constant constructor, don't call expand_constructor if
> >>> it is more efficient to load the data from the memory via move by pieces.
> >>>
> >>> gcc/
> >>>
> >>> PR middle-end/90773
> >>> * expr.c (expand_expr_real_1): Don't call expand_constructor if
> >>> it is more efficient to load the data from the memory.
> >>>
> >>> gcc/testsuite/
> >>>
> >>> PR middle-end/90773
> >>> * gcc.target/i386/pr90773-24.c: New test.
> >>> * gcc.target/i386/pr90773-25.c: Likewise.
> >>> ---
> >>>  gcc/expr.c | 10 ++
> >>>  gcc/testsuite/gcc.target/i386/pr90773-24.c | 22 ++
> >>>  gcc/testsuite/gcc.target/i386/pr90773-25.c | 20 
> >>>  3 files changed, 52 insertions(+)
> >>>  create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-24.c
> >>>  create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-25.c
> >>>
> >>> diff --git a/gcc/expr.c b/gcc/expr.c
> >>> index d09ee42e262..80e01ea1cbe 100644
> >>> --- a/gcc/expr.c
> >>> +++ b/gcc/expr.c
> >>> @@ -10886,6 +10886,16 @@ expand_expr_real_1 (tree exp, rtx target, 
> >>> machine_mode tmode,
> >>> unsigned HOST_WIDE_INT ix;
> >>> tree field, value;
> >>>
> >>> +   /* Check if it is more efficient to load the data from
> >>> +  the memory directly.  FIXME: How many stores do we
> >>> +  need here if not moved by pieces?  */
> >>> +   unsigned HOST_WIDE_INT bytes
> >>> + = tree_to_uhwi (TYPE_SIZE_UNIT (type));
> >>
> >> that's prone to fail - it could be a VLA.
> >
> > What do you mean by fail?  Is it ICE or missed optimization?
> > Do you have a testcase?
> >
>
> I think for a VLA the TYPE_SIZE_UNIT may be unknown (NULL), or something like 
> "x".
>
> for instance something like
>
> int test (int x)
> {
>   int vla[x];
>
>   vla[x-1] = 0;
>   return vla[x-1];
> }

My patch changes the CONSTRUCTOR code path.   I couldn't find a CONSTRUCTOR
testcase with VLA.

>
> Bernd.
>
> >>
> >>> +   if ((bytes / UNITS_PER_WORD) > 2
> >>> +   && MOVE_MAX_PIECES > UNITS_PER_WORD
> >>> +   && can_move_by_pieces (bytes, TYPE_ALIGN (type)))
> >>> + goto normal_inner_ref;
> >>> +
> >>
> >> It looks like you're concerned about aggregate copies but this also handles
> >> non-aggregates (which on GIMPLE might already be optimized of course).
> >
> > Here I check if we copy more than 2 words and we can move more than
> > a word in a single instruction.
> >
> >> Also you say "if it's cheaper" but I see no cost considerations.  How do
> >> we generally handle immed const vs. load from constant pool costs?
> >
> > > This trades 2 (up to 8) stores with one load plus one store.  Is there
> > a way to check which one is faster?
> >
> >>> FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (init), ix,
> >>>   field, value)
> >>>   if (tree_int_cst_equal (field, index))
> >>> diff --git a/gcc/testsuite/gcc.target/i386/pr90773-24.c 
> >>> b/gcc/testsuite/gcc.target/i386/pr90773-24.c
> >>> new file mode 100644
> >>> index 000..4a4b62533dc
> >>> --- /dev/null
> >>> +++ b/gcc/testsuite/gcc.target/i386/pr90773-24.c
> >>> @@ -0,0 +1,22 @@
> >>> +/* { dg-do compile } */
> >>> +/* { dg-options "-O2 -march=x86-64" } */
> >>> +
> >>> +struct S
> >>> +{
> >>> +  long long s1 __attribute__ ((aligned (8)));
> >>> +  unsigned s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12, s13, s14;
> >>> +};
> >>> +
> >>> +const struct S array[] = {
> >>> +  { 0, 60, 640, 2112543726, 39682, 48, 16, 33, 10, 96, 2, 0, 0, 4 }
> >>> +};
> >>> +
> >>> +void
> >>> +foo (struct S *x)
> >>> +{
> >>> +  x[0] = array[0];
> >>> +}
> >>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
> >>> \\(%\[\^,\]+\\)" 1 } } */
> >>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
> >>> 16\\(%\[\^,\]+\\)" 1 } } */
> >>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
> >>> 32\\(%\[\^,\]+\\)" 1 } } */
> >>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
> >>> 48\\(%\[\^,\]+\\)" 1 } } */
> >>> diff --git a/gcc/testsuite/gcc.target/i386/pr90773-25.c 
> >>> b/gcc/testsuite/gcc.target/i386/pr90773-25.c
> >>> new file mode 100644
> >>> index 000..2520b670989
> >>> --- /dev/null
> >>> +++ b/gcc/testsuite/gcc.target/i386/pr90773-25.c
> >>> @@ -0,0 +1,20 @@
> >>> +/* { dg-do compile } */
> >>> +/* { dg-options "-O2 -march=skylake" } */
> >>> +
> >>> +struct S
> >>> +{
> >>> +  long long s1 __attribute__ ((aligned (8)));
> >>> +  unsigned s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12, s1

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Marek Polacek via Gcc-patches

On Wed, May 19, 2021 at 07:35:20PM +0100, Jonathan Wakely wrote:
> On 19/05/21 14:03 -0400, Marek Polacek wrote:
> > On Wed, May 19, 2021 at 11:51:54AM -0600, Martin Sebor via Gcc-patches 
> > wrote:
> > > On 5/19/21 10:39 AM, Jonathan Wakely via Gcc-patches wrote:
> > > > Jakub pointed out I'd forgotten the spaces before the opening parens
> > > > for function calls. The attached patch should fix all those, with no
> > > > other changes.
> > > >
> > > > Tested x86_64-linux. OK for trunk?
> > > 
> > > Looks good to me, it just needs an update to the manual describing
> > > the new options.
> > > 
> > > It's too bad that the conditionals checking for the dialect have
> > > to be repeated throughout the front end.  They're implied by
> > > the new option enumerator passed to pedwarn().  If the diagnostic
> > > subsystem had access to cxx_dialect the check could be done there
> > > and all the other conditionals could be avoided.  An alternative
> > > to that would be to add a new wrapper to the C++ front end, like
> > > cxxdialect_pedwarn, to do the checking before calling pedwarn
> > > (or, more likely, emit_diagnostic_valist).
> > 
> > In the C FE I introduced pedwarn_c11 and similar to deal with a similar
> > problem, and also handle the rule that a more specific warning overrides
> > more general warnings.
> 
> The C++ FE does already have pedwarn_cxx98, but generalizing that
> would have taken more effort than I wanted to spend on this, sorry :-(

No worries, that's understandable.  I think the patch is a good improvement
as-is.  (Though some doc text into the doc/invoke.texi should be added.)

Marek

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Jonathan Wakely via Gcc-patches

On 19/05/21 14:03 -0400, Marek Polacek wrote:

On Wed, May 19, 2021 at 11:51:54AM -0600, Martin Sebor via Gcc-patches wrote:

On 5/19/21 10:39 AM, Jonathan Wakely via Gcc-patches wrote:
> Jakub pointed out I'd forgotten the spaces before the opening parens
> for function calls. The attached patch should fix all those, with no
> other changes.
>
> Tested x86_64-linux. OK for trunk?

Looks good to me, it just needs an update to the manual describing
the new options.

It's too bad that the conditionals checking for the dialect have
to be repeated throughout the front end.  They're implied by
the new option enumerator passed to pedwarn().  If the diagnostic
subsystem had access to cxx_dialect the check could be done there
and all the other conditionals could be avoided.  An alternative
to that would be to add a new wrapper to the C++ front end, like
cxxdialect_pedwarn, to do the checking before calling pedwarn
(or, more likely, emit_diagnostic_valist).

In the C FE I introduced pedwarn_c11 and similar to deal with a similar
problem, and also handle the rule that a more specific warning overrides
more general warnings.

The C++ FE does already have pedwarn_cxx98, but generalizing that
would have taken more effort than I wanted to spend on this, sorry :-(

Re: [Patch] Testsuite/Fortran: gfortran.dg/pr96711.f90 - fix expected value for PowerPC [PR96983]

2021-05-19 Thread Tobias Burnus


Hi Segher,

Quick version: Jump to the new patch, which I like much more.
Longer version:
On 19.05.21 17:15, Segher Boessenkool wrote:

real(16)   :: y   ! 128bit REAL
integer(16), parameter :: k2 = nint (2 / epsilon (y), kind(k2))
integer(16), parameter :: m2 = 10384593717069655257060992658440192_16
!2**113
if (k2 /= m2) stop 3

On x86_64-linux-gnu, k2 == m2 — but on powerpc64le-linux-gnu,
k2 == 2**106 instead of 2**113.

My solution is to permit also 2**106 besides 2**113.

I do not understand Fortran well enough, could you explain what the code
is supposed to do?


First, 2_16 means the integer '2' of the integer kind '16', i.e. int128_t type.

The original bug report (PR96983) was that 'nint'
with the 16byte floating point/quad-precision was giving an ICE
and the complaint was that there was no testcase for the value.

And I think this testcase tries to ensure that the result of 'nint'
both at compile time and at runtime matches what should be the result.

(Quotes from the Fortran standard, augmented by what the source code does.)

'nint' uses "__builtin_round" if available: "The result is the integer
nearest A, or if there are two integers equally near A, the result
is whichever such integer has the greater magnitude" – and in this
testcase, the argument is a quad-precision/16byte/128bit floating-point
number and the result is a 128bit integer.

('a**b' is the Fortran syntax for: 'a' raised to the power of 'b'.)

This testcase does:

nint(2/epsilon(y)). Here, 'epsilon' is the
"Model number that is small compared to 1."
Namely: b**(1-p) = '(radix)**(1-digits)'
alias 'real_format *fmt = REAL_MODE_FORMAT (mode)'
with radix = fmt->b  and digits = fmt->p;

[b**(1-p) is from the Fortran standard but 'b' and 'p' also match the
ME/target names, while radix/digits matches the FE names and also the
Fortran intrinsic inquiry function names.]

This is for radix = 2 equivalent to:

2/2**(1-digits) = 2*2**(digits-1) = 2**(digits)

On x86-64, digits == fmt->p == 113.

Our powerpc64le gives digits == 106.
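
The identity 2/epsilon == 2**digits can be checked outside Fortran as well. The following C++ sketch is only an analogy: it assumes std::numeric_limits<T>::epsilon() equals radix**(1-digits), as the Fortran EPSILON does, and that radix is 2. The function names are hypothetical, and long double is left out because its format is target-dependent.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// 2 / epsilon for a binary floating-point type T.
template <typename T>
T two_over_epsilon ()
{
  static_assert (std::numeric_limits<T>::radix == 2,
                 "this sketch assumes radix 2");
  return T (2) / std::numeric_limits<T>::epsilon ();
}

// 2**digits, computed exactly by scaling 1 by a power of two.
template <typename T>
T two_to_digits ()
{
  return std::ldexp (T (1), std::numeric_limits<T>::digits);
}

// Both forms agree, so no hard-coded constant is needed.
void check_epsilon_identity ()
{
  assert (two_over_epsilon<float> () == two_to_digits<float> ());
  assert (two_over_epsilon<double> () == two_to_digits<double> ());
}
```

Deriving the expected value from digits avoids baking a per-target constant into the test.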

 * * *

Having written all this, I wonder why we don't just
rely on the assumption that '2**digits(x)' works, and use this
to generate the expected value.

Namely, as the attached updated patch does.

As I like that patch and believe it is obvious, I intent to
commit it as such – unless there are further comments.

It passes on both x86-64-gnu-linux and powerpc64le-none-linux-gnu.
I think the radix == 2 is a good bet, but if we ever run into issues,
it can also be changed to use radix(...) as well ...

Tobias

Testsuite/Fortran: gfortran.dg/pr96711.f90 - fix expected value for PowerPC [PR96983]

gcc/testsuite/ChangeLog:

	PR fortran/96983
	* gfortran.dg/pr96711.f90: Use 2**digit(x) instead of a hard-coded value;
	add comments regarding what the code does.

diff --git a/gcc/testsuite/gfortran.dg/pr96711.f90 b/gcc/testsuite/gfortran.dg/pr96711.f90
index 3761a8ea416..ab7707d0120 100644
--- a/gcc/testsuite/gfortran.dg/pr96711.f90
+++ b/gcc/testsuite/gfortran.dg/pr96711.f90
@@ -1,28 +1,30 @@
 ! { dg-do run }
 ! { dg-require-effective-target fortran_integer_16 }
 ! { dg-require-effective-target fortran_real_16 }
 ! { dg-additional-options "-fdump-tree-original" }
 ! { dg-final { scan-tree-dump-times "_gfortran_stop_numeric" 2 "original" } }
 !
 ! PR fortran/96711 - ICE on NINT() Function
 
 program p
   implicit none
   real(8):: x
   real(16)   :: y
+  ! Assume radix(x) == 2
+  ! 2/epsilon(x) = 2/(radix(x)**(1-digits(x))) = 2**digits(x) with that assumption
   integer(16), parameter :: k1 = nint (2 / epsilon (x), kind(k1))
   integer(16), parameter :: k2 = nint (2 / epsilon (y), kind(k2))
-  integer(16), parameter :: m1 = 9007199254740992_16!2**53
-  integer(16), parameter :: m2 = 10384593717069655257060992658440192_16 !2**113
+  integer(16), parameter :: m1 = 2_16**digits(x) ! Usually 2**53
+  integer(16), parameter :: m2 = 2_16**digits(y) ! Might be 2**113 or 2**106 or ... depending on the system
   integer(16), volatile  :: m
   x = 2 / epsilon (x)
   y = 2 / epsilon (y)
   m = nint (x, kind(m))
 ! print *, m
   if (k1 /= m1) stop 1
   if (m  /= m1) stop 2
   m = nint (y, kind(m))
 ! print *, m
   if (k2 /= m2) stop 3
   if (m  /= m2) stop 4
 end program

Re: [Patch, fortran] PR fortran/100683 - Array initialization refuses valid

2021-05-19 Thread José Rui Faustino de Sousa via Gcc-patches


Hi All!

And yes I forgot the patch...

Sorry...

Best regards,
José Rui

On 19/05/21 17:09, José Rui Faustino de Sousa wrote:

Hi all!

Proposed patch to:

PR100683 - Array initialization refuses valid

Patch tested only on x86_64-pc-linux-gnu.

Add call to simplify expression before parsing.

Thank you very much.

Best regards,
José Rui

Fortran: Fix bogus error

gcc/fortran/ChangeLog:

 PR fortran/100683
 * resolve.c (gfc_resolve_expr): Add call to gfc_simplify_expr.

gcc/testsuite/ChangeLog:

 PR fortran/100683
 * gfortran.dg/PR100683.f90: New test.



diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 747516f..e68391a 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -7138,6 +7138,7 @@ gfc_resolve_expr (gfc_expr *e)
   /* Also try to expand a constructor.  */
   if (t)
 	{
+	  gfc_simplify_expr(e, 1);
 	  gfc_expression_rank (e);
 	  if (gfc_is_constant_expr (e) || gfc_is_expandable_expr (e))
 	gfc_expand_constructor (e, false);
diff --git a/gcc/testsuite/gfortran.dg/PR100683.f90 b/gcc/testsuite/gfortran.dg/PR100683.f90
new file mode 100644
index 000..6929bb5
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/PR100683.f90
@@ -0,0 +1,36 @@
+! { dg-do run }
+!
+! Test the fix for PR100683
+! 
+
+program main_p
+
+  implicit none
+
+  integer:: i
+  integer, parameter :: n = 11
+  integer, parameter :: u(*) = [(i, i=1,n)]
+
+  type :: foo_t
+integer :: i
+  end type foo_t
+
+  type, extends(foo_t) :: bar_t
+integer :: a(n)
+  end type bar_t
+  
+  type(bar_t), parameter :: a(*) = [(bar_t(i, u), i=1,n)]
+  type(bar_t):: b(n) = [(bar_t(i, u), i=1,n)]
+
+  if(any(a(:)%i/=u))   stop 1
+  do i = 1, n
+if(any(a(i)%a/=u)) stop 2
+  end do
+  if(any(b(:)%i/=u))   stop 3
+  do i = 1, n
+if(any(b(i)%a/=u)) stop 4
+  end do
+  stop
+
+end program main_p
+

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Marek Polacek via Gcc-patches

On Wed, May 19, 2021 at 11:51:54AM -0600, Martin Sebor via Gcc-patches wrote:
> On 5/19/21 10:39 AM, Jonathan Wakely via Gcc-patches wrote:
> > Jakub pointed out I'd forgotten the spaces before the opening parens
> > for function calls. The attached patch should fix all those, with no
> > other changes.
> > 
> > Tested x86_64-linux. OK for trunk?
> 
> Looks good to me, it just needs an update to the manual describing
> the new options.
> 
> It's too bad that the conditionals checking for the dialect have
> to be repeated throughout the front end.  They're implied by
> the new option enumerator passed to pedwarn().  If the diagnostic
> subsystem had access to cxx_dialect the check could be done there
> and all the other conditionals could be avoided.  An alternative
> to that would be to add a new wrapper to the C++ front end, like
> cxxdialect_pedwarn, to do the checking before calling pedwarn
> (or, more likely, emit_diagnostic_valist).

In the C FE I introduced pedwarn_c11 and similar to deal with a similar
problem, and also handle the rule that a more specific warning overrides
more general warnings.

Marek
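The wrapper idea discussed above (check the dialect once, inside the helper, instead of at every call site) can be sketched as follows. Everything here is a hypothetical stand-in: the Dialect enum, the Diagnostics struct and dialect_pedwarn are invented for illustration and are not the GCC diagnostic API, pedwarn_c11, or the proposed cxxdialect_pedwarn.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the language dialects, ordered oldest to newest.
enum Dialect { cxx98, cxx11, cxx14, cxx17, cxx20, cxx23 };

// Hypothetical stand-in for the diagnostic machinery; it only records
// the messages that would have been emitted.
struct Diagnostics {
  Dialect current;                   // dialect selected by -std=...
  std::vector<std::string> emitted;  // messages that were "printed"

  // Emit MSG only when the construct needs a newer dialect than the
  // active one.  Callers no longer repeat the dialect comparison.
  bool dialect_pedwarn(Dialect needed, const std::string &msg)
  {
    if (current >= needed)
      return false;
    emitted.push_back(msg);
    return true;
  }
};
```

With current set to cxx14, a call for a cxx17 construct records a message and returns true, while a call for a cxx11 construct does nothing.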

[Patch, fortran] PR fortran/93308/93963/94327/94331/97046 problems raised by descriptor handling

2021-05-19 Thread José Rui Faustino de Sousa via Gcc-patches


Hi all!

Proposed patch to:

Bug 93308 - bind(c) subroutine changes lower bound of array argument in caller
Bug 93963 - Select rank mishandling allocatable and pointer arguments with bind(c)
Bug 94327 - Bind(c) argument attributes are incorrectly set
Bug 94331 - Bind(C) corrupts array descriptors
Bug 97046 - Bad interaction between lbound/ubound, allocatable arrays and bind(C) subroutine with dimension(..) parameter


Patch tested only on x86_64-pc-linux-gnu.

Fix attribute handling, which reflected a prior intermediate version of
the Fortran standard.


CFI descriptors, in most cases, should not be copied out as they can
corrupt the Fortran descriptor. Bounds will vary and the original
Fortran bounds are definitively lost on conversion.
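The two decisions the patch makes (which CFI attribute to report, and whether the descriptor is copied back on exit) can be summarized in a small sketch. It mirrors the conditions in the trans-expr.c and trans-decl.c hunks of this mail, but DummyAttrs, Intent, cfi_attribute_for and needs_copy_out are hypothetical names for illustration, not gfortran internals.

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant dummy-argument attributes.
enum class Intent { unknown, in, out, inout };

struct DummyAttrs {
  bool pointer;
  bool allocatable;
  bool value;
  Intent intent;
};

// Attribute code as chosen in the trans-expr.c hunk:
// 0 for pointer, 1 for allocatable, 2 otherwise (CFI_attribute_other).
int cfi_attribute_for(const DummyAttrs &a)
{
  if (a.pointer)
    return 0;
  if (a.allocatable)
    return 1;
  return 2;
}

// Copy-out condition as in the trans-decl.c hunk: only pointer or
// allocatable dummies that are neither VALUE nor INTENT(IN) are
// converted back to a CFI descriptor when leaving the procedure.
bool needs_copy_out(const DummyAttrs &a)
{
  return (a.pointer || a.allocatable)
         && !a.value
         && a.intent != Intent::in;
}
```

An assumed-shape dummy without pointer or allocatable therefore reports attribute 2 and is never copied out, which is what keeps the caller's original bounds intact.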


Thank you very much.

Best regards,
José Rui

Fortran: Fix attributes and bounds in ISO_Fortran_binding.

gcc/fortran/ChangeLog:

PR fortran/93308
PR fortran/93963
PR fortran/94327
PR fortran/94331
PR fortran/97046
* trans-decl.c (convert_CFI_desc): Only copy out the descriptor
if necessary.
* trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): Update attribute
handling, which reflected a previous intermediate version of the
standard. Only copy out the descriptor if necessary.

libgfortran/ChangeLog:

PR fortran/93308
PR fortran/93963
PR fortran/94327
PR fortran/94331
PR fortran/97046
* runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc): Add code
to verify the descriptor. Correct bounds calculation.
(gfc_desc_to_cfi_desc): Add code to verify the descriptor.

gcc/testsuite/ChangeLog:

PR fortran/93308
PR fortran/93963
PR fortran/94327
PR fortran/94331
PR fortran/97046
* gfortran.dg/ISO_Fortran_binding_1.f90: Add pointer attribute,
this test is still erroneous but now it compiles.
* gfortran.dg/bind_c_array_params_2.f90: Update regex to match
code changes.
* gfortran.dg/PR93308.f90: New test.
* gfortran.dg/PR93963.f90: New test.
* gfortran.dg/PR94327.c: New test.
* gfortran.dg/PR94327.f90: New test.
* gfortran.dg/PR94331.c: New test.
* gfortran.dg/PR94331.f90: New test.
* gfortran.dg/PR97046.f90: New test.
diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 406b4ae..9fb4ef9 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -4519,22 +4519,28 @@ convert_CFI_desc (gfc_wrapped_block * block, gfc_symbol *sym)
   gfc_add_expr_to_block (&outer_block, incoming);
   incoming = gfc_finish_block (&outer_block);
 
-
   /* Convert the gfc descriptor back to the CFI type before going
 	 out of scope, if the CFI type was present at entry.  */
-  gfc_init_block (&outer_block);
-  gfc_init_block (&tmpblock);
-
-  tmp = gfc_build_addr_expr (ppvoid_type_node, CFI_desc_ptr);
-  outgoing = build_call_expr_loc (input_location,
-			gfor_fndecl_gfc_to_cfi, 2, tmp, gfc_desc_ptr);
-  gfc_add_expr_to_block (&tmpblock, outgoing);
+  outgoing = NULL_TREE;
+  if ((sym->attr.pointer || sym->attr.allocatable)
+	  && !sym->attr.value
+	  && sym->attr.intent != INTENT_IN)
+	{
+	  gfc_init_block (&outer_block);
+	  gfc_init_block (&tmpblock);
 
-  outgoing = build3_v (COND_EXPR, present,
-			   gfc_finish_block (&tmpblock),
-			   build_empty_stmt (input_location));
-  gfc_add_expr_to_block (&outer_block, outgoing);
-  outgoing = gfc_finish_block (&outer_block);
+	  tmp = gfc_build_addr_expr (ppvoid_type_node, CFI_desc_ptr);
+	  outgoing = build_call_expr_loc (input_location,
+	  gfor_fndecl_gfc_to_cfi, 2,
+	  tmp, gfc_desc_ptr);
+	  gfc_add_expr_to_block (&tmpblock, outgoing);
+
+	  outgoing = build3_v (COND_EXPR, present,
+			   gfc_finish_block (&tmpblock),
+			   build_empty_stmt (input_location));
+	  gfc_add_expr_to_block (&outer_block, outgoing);
+	  outgoing = gfc_finish_block (&outer_block);
+	}
 
   /* Add the lot to the procedure init and finally blocks.  */
   gfc_add_init_cleanup (block, incoming, outgoing);
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index cce18d0..1f84d57 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -5460,13 +5460,12 @@ gfc_conv_gfc_desc_to_cfi_desc (gfc_se *parmse, gfc_expr *e, gfc_symbol *fsym)
 	attribute = 1;
 }
 
-  /* If the formal argument is assumed shape and neither a pointer nor
- allocatable, it is unconditionally CFI_attribute_other.  */
-  if (fsym->as->type == AS_ASSUMED_SHAPE
-  && !fsym->attr.pointer && !fsym->attr.allocatable)
-   cfi_attribute = 2;
+  if (fsym->attr.pointer)
+cfi_attribute = 0;
+  else if (fsym->attr.allocatable)
+cfi_attribute = 1;
   else
-   cfi_attribute = attribute;
+cfi_attribute = 2;
 
   if (e->rank != 0)
 {
@@ -5574,10 +5573,15 @@ gfc_conv_gfc_desc_to_cfi_desc (gfc_se *parms

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Martin Sebor via Gcc-patches


On 5/19/21 10:39 AM, Jonathan Wakely via Gcc-patches wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Looks good to me, it just needs an update to the manual describing
the new options.

It's too bad that the conditionals checking for the dialect have
to be repeated throughout the front end.  They're implied by
the new option enumerator passed to pedwarn().  If the diagnostic
subsystem had access to cxx_dialect the check could be done there
and all the other conditionals could be avoided.  An alternative
to that would be to add a new wrapper to the C++ front end, like
cxxdialect_pedwarn, to do the checking before calling pedwarn
(or, more likely, emit_diagnostic_valist).

Martin

[pushed] c++: ICE with <=> fallback [PR100367]

2021-05-19 Thread Jason Merrill via Gcc-patches

Here, when genericizing lexicographical_compare_three_way, we haven't yet
walked the operands, so (a == a) still sees ADDR_EXPR <a>, but this is after
we've changed the type of a to REFERENCE_TYPE.  When we try to fold (a == a)
by constexpr evaluation, the constexpr code doesn't understand trying to
take the address of a reference, and we end up crashing.

Fixed by avoiding constexpr evaluation in genericize_spaceship, by using
fold_build2 instead of build_new_op on scalar operands.  Class operands
should have been expanded during parsing.
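For readers unfamiliar with the fallback, the conditional chain that genericize_spaceship builds ("op0 == op1 ? equivalent : op0 < op1 ? less : op1 < op0 ? greater : unordered") corresponds to the following source-level logic. This is only a sketch of the semantics; the Ordering enum and fallback_three_way are hypothetical names, and the real code constructs trees rather than calling a function like this.

```cpp
#include <cassert>
#include <cmath>  // only needed for std::nan in the usage example

// Hypothetical result type standing in for the comparison category values.
enum class Ordering { less, equivalent, greater, unordered };

// Synthesize a three-way result from == and <, in the same order as the
// comment in genericize_spaceship for the partial_ordering case:
//   op0 == op1 ? equivalent : op0 < op1 ? less
//              : op1 < op0 ? greater : unordered
template <typename T>
Ordering fallback_three_way(const T &op0, const T &op1)
{
  if (op0 == op1)
    return Ordering::equivalent;
  if (op0 < op1)
    return Ordering::less;
  if (op1 < op0)
    return Ordering::greater;
  return Ordering::unordered;  // e.g. one operand is NaN
}
```

For strong or weak orderings the last test is unnecessary and the chain ends with "greater", which matches the shorter comment in the non-partial branch of the patch.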

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/100367
PR c++/96299

gcc/cp/ChangeLog:

* method.c (genericize_spaceship): Use fold_build2 for scalar
operands.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/spaceship-fallback1.C: New test.
---
 gcc/cp/method.c   | 27 ++-
 .../g++.dg/cpp2a/spaceship-fallback1.C| 17 
 2 files changed, 37 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/spaceship-fallback1.C

diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index f8c9456d720..75effb0d698 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -1087,7 +1087,8 @@ genericize_spaceship (location_t loc, tree type, tree op0, tree op1)
   gcc_checking_assert (tag < cc_last);
 
   tree r;
-  if (SCALAR_TYPE_P (TREE_TYPE (op0)))
+  bool scalar = SCALAR_TYPE_P (TREE_TYPE (op0));
+  if (scalar)
 {
   op0 = save_expr (op0);
   op1 = save_expr (op1);
@@ -1097,26 +1098,38 @@ genericize_spaceship (location_t loc, tree type, tree op0, tree op1)
 
   int flags = LOOKUP_NORMAL;
   tsubst_flags_t complain = tf_none;
+  tree comp;
 
   if (tag == cc_partial_ordering)
 {
   /* op0 == op1 ? equivalent : op0 < op1 ? less :
 op1 < op0 ? greater : unordered */
   tree uo = lookup_comparison_result (tag, type, 3);
-  tree comp = build_new_op (loc, LT_EXPR, flags, op1, op0, complain);
-  r = build_conditional_expr (loc, comp, gt, uo, complain);
+  if (scalar)
+   /* For scalars use the low level operations; using build_new_op causes
+  trouble with constexpr eval in the middle of genericize (100367).  */
+   comp = fold_build2 (LT_EXPR, boolean_type_node, op1, op0);
+  else
+   comp = build_new_op (loc, LT_EXPR, flags, op1, op0, complain);
+  r = fold_build3 (COND_EXPR, type, comp, gt, uo);
 }
   else
 /* op0 == op1 ? equal : op0 < op1 ? less : greater */
 r = gt;
 
   tree lt = lookup_comparison_result (tag, type, 2);
-  tree comp = build_new_op (loc, LT_EXPR, flags, op0, op1, complain);
-  r = build_conditional_expr (loc, comp, lt, r, complain);
+  if (scalar)
+comp = fold_build2 (LT_EXPR, boolean_type_node, op0, op1);
+  else
+comp = build_new_op (loc, LT_EXPR, flags, op0, op1, complain);
+  r = fold_build3 (COND_EXPR, type, comp, lt, r);
 
   tree eq = lookup_comparison_result (tag, type, 0);
-  comp = build_new_op (loc, EQ_EXPR, flags, op0, op1, complain);
-  r = build_conditional_expr (loc, comp, eq, r, complain);
+  if (scalar)
+comp = fold_build2 (EQ_EXPR, boolean_type_node, op0, op1);
+  else
+comp = build_new_op (loc, EQ_EXPR, flags, op0, op1, complain);
+  r = fold_build3 (COND_EXPR, type, comp, eq, r);
 
   return r;
 }
diff --git a/gcc/testsuite/g++.dg/cpp2a/spaceship-fallback1.C b/gcc/testsuite/g++.dg/cpp2a/spaceship-fallback1.C
new file mode 100644
index 000..5ce49490fe5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/spaceship-fallback1.C
@@ -0,0 +1,17 @@
+// PR c++/100367
+// { dg-do compile { target c++20 } }
+
+#include <compare>
+
+struct iter {
+  bool current;
+  iter(iter &);
+};
+
+constexpr bool operator==(const iter &, const iter &y) {
+  return y.current;
+}
+
+void lexicographical_compare_three_way(iter a) {
+  (a == a) <=> true;
+}

base-commit: 873c5188fd5d2e17430cab1522aa36fa62582ea7
-- 
2.27.0

[pushed] c++: implicit deduction guides, protected access

2021-05-19 Thread Jason Merrill via Gcc-patches

Jonathan raised this issue with CWG, and there seems to be general agreement
that a deduction guide generated from a constructor should have access to
the same names that the constructor has access to.  That seems to be as easy
as setting DECL_CONTEXT.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* pt.c (build_deduction_guide): Treat the implicit deduction guide
as a member of the class.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction-access1.C: New test.
* g++.dg/cpp1z/class-deduction-access2.C: New test.
---
 gcc/cp/pt.c|  3 +++
 .../g++.dg/cpp1z/class-deduction-access1.C | 18 ++
 .../g++.dg/cpp1z/class-deduction-access2.C | 10 ++
 3 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction-access1.C
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction-access2.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 23d26231849..32cd0b7a6ed 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -28803,6 +28803,9 @@ build_deduction_guide (tree type, tree ctor, tree outer_args, tsubst_flags_t com
 DECL_ABSTRACT_ORIGIN (ded_tmpl) = fn_tmpl;
   if (ci)
 set_constraints (ded_tmpl, ci);
+  /* The artificial deduction guide should have same access as the
+ constructor.  */
+  DECL_CONTEXT (ded_fn) = type;
 
   return ded_tmpl;
 }
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction-access1.C b/gcc/testsuite/g++.dg/cpp1z/class-deduction-access1.C
new file mode 100644
index 000..2424abb52ef
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction-access1.C
@@ -0,0 +1,18 @@
+// { dg-do compile { target c++17 } }
+
+template<typename T>
+struct Base
+{
+protected:
+  using type = T;
+};
+
+template<typename T>
+struct Cont : Base<T>
+{
+  using argument_type = typename Base<T>::type;
+
+  Cont(T, argument_type) { }
+};
+
+Cont c(1, 1);
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction-access2.C b/gcc/testsuite/g++.dg/cpp1z/class-deduction-access2.C
new file mode 100644
index 000..87f20311e09
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction-access2.C
@@ -0,0 +1,10 @@
+// { dg-do compile { target c++17 } }
+
+struct B {
+protected:
+struct type {};
+};
+template<typename T> struct D : B {
+D(T, typename T::type);
+};
+D c = {B(), {}};

base-commit: adcb497bdba499d161d2e5e8de782bdd6f75d62c
-- 
2.27.0

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Jason Merrill via Gcc-patches


On 5/19/21 12:46 PM, Jonathan Wakely wrote:

On 19/05/21 17:39 +0100, Jonathan Wakely wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Jakub also pointed out we already have some similar diagnostics for
C++23, but I missed them as they say "only optional with" not "only
available with".

I'm testing the incremental change in the attached patch which also
adds -Wc++23-extensions, and I'll resend the full patch after that
finishes.



   if (omitted_parms_loc && lambda_specs.any_specifiers_p)
 {
-  pedwarn (omitted_parms_loc, 0,
+  pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,


You probably want to change


  else if (cxx_dialect < cxx23)
omitted_parms_loc = cp_lexer_peek_token (parser->lexer)->location;


To use warn_about_dialect_p.

Jason

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Marek Polacek via Gcc-patches

On Wed, May 19, 2021 at 05:59:34PM +0100, Jonathan Wakely wrote:
> On 19/05/21 12:53 -0400, Marek Polacek wrote:
> > On Wed, May 19, 2021 at 05:39:24PM +0100, Jonathan Wakely via Gcc-patches wrote:
> > > Jakub pointed out I'd forgotten the spaces before the opening parens
> > > for function calls. The attached patch should fix all those, with no
> > > other changes.
> > > 
> > > Tested x86_64-linux. OK for trunk?
> > 
> > Nice, this is cool.
> > 
> > > --- a/gcc/c-family/c.opt
> > > +++ b/gcc/c-family/c.opt
> > > @@ -431,6 +431,22 @@ Wc++20-compat
> > >  C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall)
> > >  Warn about C++ constructs whose meaning differs between ISO C++ 2017 and ISO C++ 2020.
> > > 
> > > +Wc++11-extensions
> > > +C++ ObjC++ Var(warn_cxx11_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
> > > +Warn about C++11 constructs in code compiled with an older standard.
> > > +
> > > +Wc++14-extensions
> > > +C++ ObjC++ Var(warn_cxx14_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
> > > +Warn about C++14 constructs in code compiled with an older standard.
> > > +
> > > +Wc++17-extensions
> > > +C++ ObjC++ Var(warn_cxx17_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
> > > +Warn about C++17 constructs in code compiled with an older standard.
> > > +
> > > +Wc++20-extensions
> > > +C++ ObjC++ Var(warn_cxx20_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
> > > +Warn about C++20 constructs in code compiled with an older standard.
> > > +
> > 
> > So these are enabled by -Wall but also turned on by default?  Let's choose one
> > and then drop either the Init(1) or the LangEnabledBy(C++ ObjC++,Wall) part?
> 
> Ah, good point. I mostly just cargo-cult what I see in that file (is
> the format documented somewhere?)

doc/options.texi I think.
 
> I think to preserve the current behaviour (using these constructs in
> an unsupported dialect warns by default) we want them to be Init(1)
> but not in -Wall. Or we could change the behaviour, and include them
> in -Wall and not Init(1), but then people who don't use -Wall (aka
> idiots and beginners) would not get the warnings.
> 
> Any preference?

Um.  Probably Init(1) aka status quo.  Frankly it sounds like a -pedantic thing
to me, but I guess let's not change that now.

Marek
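The difference between the two choices discussed above can be made concrete with a toy model. This is a simplified sketch, not how GCC's option machinery is implemented: WarningOption and is_active are hypothetical, and the model ignores details such as -Wno-all.

```cpp
#include <cassert>

// Hypothetical model of one warning option from c.opt.
struct WarningOption {
  bool init_on;          // models Init(1): on unless told otherwise
  bool enabled_by_wall;  // models LangEnabledBy(C++ ObjC++,Wall)
};

// explicit_setting: -1 = option not given, 0 = -Wno-foo, 1 = -Wfoo.
// An explicit setting wins; otherwise -Wall may enable the option;
// otherwise the initial value applies.
bool is_active(const WarningOption &opt, bool wall_given, int explicit_setting)
{
  if (explicit_setting >= 0)
    return explicit_setting == 1;
  if (opt.enabled_by_wall && wall_given)
    return true;
  return opt.init_on;
}
```

In this model the status quo corresponds to {true, false}, which warns even without -Wall, while {false, true} stays silent unless -Wall is given. Specifying both, as in the posted patch, behaves like Init(1) alone here, which is why one of the two can be dropped.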

[Patch, fortran] PR fortran/100683 - Array initialization refuses valid

2021-05-19 Thread José Rui Faustino de Sousa via Gcc-patches


Hi all!

Proposed patch to:

PR100683 - Array initialization refuses valid

Patch tested only on x86_64-pc-linux-gnu.

Add call to simplify expression before parsing.

Thank you very much.

Best regards,
José Rui

Fortran: Fix bogus error

gcc/fortran/ChangeLog:

PR fortran/100683
* resolve.c (gfc_resolve_expr): Add call to gfc_simplify_expr.

gcc/testsuite/ChangeLog:

PR fortran/100683
* gfortran.dg/PR100683.f90: New test.

Re: [PATCH] doc: Update description of __GXX_EXPERIMENTAL_CXX0X__

2021-05-19 Thread Jason Merrill via Gcc-patches


On 5/19/21 6:15 AM, Jonathan Wakely wrote:

This macro has been obsolete for years, and C++0x features are no longer
experimental or liable to be removed.

gcc/ChangeLog:

* doc/cpp.texi (Common Predefined Macros): Update documentation
for the __GXX_EXPERIMENTAL_CXX0X__ macro.

OK for trunk and release branches?


OK.

Re: [PATCH] c++: Relax attribute on friend declaration checking [PR100596]

2021-05-19 Thread Jason Merrill via Gcc-patches


On 5/18/21 5:00 PM, Marek Polacek wrote:

It turned out that there are codebases that profusely use GNU attributes
on friend declarations, so we have to dial back our checking and allow
them.  And for C++11 attributes let's just warn instead of giving
errors.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


PR c++/100596

gcc/cp/ChangeLog:

* cp-tree.h (any_non_type_attribute_p): Remove.
* decl.c (grokdeclarator): Turn an error into a warning and only
warn for standard attributes.
* decl2.c (any_non_type_attribute_p): Remove.
* parser.c (cp_parser_elaborated_type_specifier): Turn an error
into a warning and only warn for standard attributes.
(cp_parser_member_declaration): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/friend7.C: Turn a few dg-warnings into dg-errors.
Remove dg-errors for GNU attributes.
* g++.dg/ext/attrib63.C: Remove dg-error.
* g++.dg/cpp0x/friend8.C: New test.
---
  gcc/cp/cp-tree.h |  1 -
  gcc/cp/decl.c| 14 +-
  gcc/cp/decl2.c   | 14 --
  gcc/cp/parser.c  | 29 ++--
  gcc/testsuite/g++.dg/cpp0x/friend7.C | 28 +--
  gcc/testsuite/g++.dg/cpp0x/friend8.C | 15 ++
  gcc/testsuite/g++.dg/ext/attrib63.C  | 23 +++---
  7 files changed, 77 insertions(+), 47 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/friend8.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 580db914d40..122dadf976f 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6763,7 +6763,6 @@ extern tree grokbitfield (const cp_declarator *, cp_decl_specifier_seq *,
  tree, tree, tree);
  extern tree splice_template_attributes(tree *, tree);
  extern bool any_dependent_type_attributes_p   (tree);
-extern bool any_non_type_attribute_p   (tree);
  extern tree cp_reconstruct_complex_type   (tree, tree);
  extern bool attributes_naming_typedef_ok  (tree);
  extern void cplus_decl_attributes (tree *, tree, int);
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 17511f09e79..92fb4a2daea 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -13741,11 +13741,15 @@ grokdeclarator (const cp_declarator *declarator,
  
  	if (friendp)

  {
-   if (attrlist && !funcdef_flag
-   /* Hack to allow attributes like vector_size on a friend.  */
-   && any_non_type_attribute_p (*attrlist))
- error_at (id_loc, "attribute appertains to a friend "
-   "declaration that is not a definition");
+   /* Packages tend to use GNU attributes on friends, so we only
+  warn for standard attributes.  */
+   if (attrlist && !funcdef_flag && cxx11_attribute_p (*attrlist))
+ {
+   *attrlist = NULL_TREE;
+   if (warning_at (id_loc, OPT_Wattributes, "attribute ignored"))
+ inform (id_loc, "an attribute that appertains to a friend "
+ "declaration that is not a definition is ignored");
+ }
/* Friends are treated specially.  */
if (ctype == current_class_type)
  ;  /* We already issued a permerror.  */
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 8e4dd6b544a..89f874a32cc 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -1331,20 +1331,6 @@ any_dependent_type_attributes_p (tree attrs)
return false;
  }
  
-/* True if ATTRS contains any attribute that does not require a type.  */

-
-bool
-any_non_type_attribute_p (tree attrs)
-{
-  for (tree a = attrs; a; a = TREE_CHAIN (a))
-{
-  const attribute_spec *as = lookup_attribute_spec (get_attribute_name (a));
-  if (as && !as->type_required)
-   return true;
-}
-  return false;
-}
-
  /* Return true iff ATTRS are acceptable attributes to be applied in-place
 to a typedef which gives a previously unnamed class or enum a name for
 linkage purposes.  */
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index c0b57955954..ac1cefc5c41 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -19774,9 +19774,12 @@ cp_parser_elaborated_type_specifier (cp_parser* parser,
   && ! processing_explicit_instantiation)
warning (OPT_Wattributes,
 "attributes ignored on template instantiation");
-  else if (is_friend && attributes)
-   error ("attribute appertains to a friend declaration that is not "
-  "a definition");
+  else if (is_friend && cxx11_attribute_p (attributes))
+   {
+ if (warning (OPT_Wattributes, "attribute ignored"))
+   inform (input_location, "an attribute that appertains to a friend "
+   "declaration that is not a definition is ignored");
+   }
else if (is_declarati

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Jonathan Wakely via Gcc-patches


On 19/05/21 12:53 -0400, Marek Polacek wrote:

On Wed, May 19, 2021 at 05:39:24PM +0100, Jonathan Wakely via Gcc-patches wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Nice, this is cool.


--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -431,6 +431,22 @@ Wc++20-compat
 C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall)
 Warn about C++ constructs whose meaning differs between ISO C++ 2017 and ISO C++ 2020.

+Wc++11-extensions
+C++ ObjC++ Var(warn_cxx11_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
+Warn about C++11 constructs in code compiled with an older standard.
+
+Wc++14-extensions
+C++ ObjC++ Var(warn_cxx14_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
+Warn about C++14 constructs in code compiled with an older standard.
+
+Wc++17-extensions
+C++ ObjC++ Var(warn_cxx17_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
+Warn about C++17 constructs in code compiled with an older standard.
+
+Wc++20-extensions
+C++ ObjC++ Var(warn_cxx20_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
+Warn about C++20 constructs in code compiled with an older standard.
+


So these are enabled by -Wall but also turned on by default?  Let's choose one
and then drop either the Init(1) or the LangEnabledBy(C++ ObjC++,Wall) part?


Ah, good point. I mostly just cargo-cult what I see in that file (is
the format documented somewhere?)

I think to preserve the current behaviour (using these constructs in
an unsupported dialect warns by default) we want them to be Init(1)
but not in -Wall. Or we could change the behaviour, and include them
in -Wall and not Init(1), but then people who don't use -Wall (aka
idiots and beginners) would not get the warnings.

Any preference?

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Jonathan Wakely via Gcc-patches


On 19/05/21 17:50 +0100, Jonathan Wakely wrote:

On 19/05/21 12:40 -0400, Eric Gallager wrote:

Thank you for doing this! One thing I'm wondering about, is that I'm
pretty sure clang also allows at least some of these to be used with
plain C as well, for example for things like the old use of "auto" in
C conflicting with the newer C++11 meaning of "auto". Would it be
possible to do likewise for GCC as well? Just an idea.


I think that would belong in -Wc++-compat and would need changes to
the C front end, which I'm almost entirely unfamiliar with. My patch
doesn't add any new diagnostics, it just makes slight adjustments to
existing ones. If you want new diagnostics in the C front end you'll
need to convince a C FE maintainer. That would be too far outside my
comfort zone :-)


FWIW, Clang does accept -Wc++11-extensions as an option for the C
compiler, but I think that's true for all its warning options (I don't
think it ever rejects a warning option as "valid for C++ but not for
C" the way that GCC does). But I can't persuade it to warn about using
'auto' in C, even with -Weverything.

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Marek Polacek via Gcc-patches

On Wed, May 19, 2021 at 05:39:24PM +0100, Jonathan Wakely via Gcc-patches wrote:
> Jakub pointed out I'd forgotten the spaces before the opening parens
> for function calls. The attached patch should fix all those, with no
> other changes.
> 
> Tested x86_64-linux. OK for trunk?

Nice, this is cool.

> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -431,6 +431,22 @@ Wc++20-compat
>  C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall)
>  Warn about C++ constructs whose meaning differs between ISO C++ 2017 and ISO C++ 2020.
>  
> +Wc++11-extensions
> +C++ ObjC++ Var(warn_cxx11_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
> +Warn about C++11 constructs in code compiled with an older standard.
> +
> +Wc++14-extensions
> +C++ ObjC++ Var(warn_cxx14_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
> +Warn about C++14 constructs in code compiled with an older standard.
> +
> +Wc++17-extensions
> +C++ ObjC++ Var(warn_cxx17_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
> +Warn about C++17 constructs in code compiled with an older standard.
> +
> +Wc++20-extensions
> +C++ ObjC++ Var(warn_cxx20_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
> +Warn about C++20 constructs in code compiled with an older standard.
> +

So these are enabled by -Wall but also turned on by default?  Let's choose one
and then drop either the Init(1) or the LangEnabledBy(C++ ObjC++,Wall) part?

Marek

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Jonathan Wakely via Gcc-patches


On 19/05/21 12:40 -0400, Eric Gallager wrote:

Thank you for doing this! One thing I'm wondering about, is that I'm
pretty sure clang also allows at least some of these to be used with
plain C as well, for example for things like the old use of "auto" in
C conflicting with the newer C++11 meaning of "auto". Would it be
possible to do likewise for GCC as well? Just an idea.


I think that would belong in -Wc++-compat and would need changes to
the C front end, which I'm almost entirely unfamiliar with. My patch
doesn't add any new diagnostics, it just makes slight adjustments to
existing ones. If you want new diagnostics in the C front end you'll
need to convince a C FE maintainer. That would be too far outside my
comfort zone :-)

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Jonathan Wakely via Gcc-patches


On 19/05/21 17:39 +0100, Jonathan Wakely wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Jakub also pointed out we already have some similar diagnostics for
C++23, but I missed them as they say "only optional with" not "only
available with".

I'm testing the incremental change in the attached patch which also
adds -Wc++23-extensions, and I'll resend the full patch after that
finishes.

(Thanks, Jakub!)


diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index a7605cf3a38..9fc41d9f2b3 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -447,6 +447,10 @@ Wc++20-extensions
 C++ ObjC++ Var(warn_cxx20_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
 Warn about C++20 constructs in code compiled with an older standard.
 
+Wc++23-extensions
+C++ ObjC++ Var(warn_cxx23_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
+Warn about C++23 constructs in code compiled with an older standard.
+
 Wcast-function-type
 C ObjC C++ ObjC++ Var(warn_cast_function_type) Warning EnabledBy(Wextra)
 Warn about casts between incompatible function types.
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 3652649c6c6..ba11a436499 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -11390,7 +11390,7 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, tree lambda_expr)
 
   if (omitted_parms_loc && lambda_specs.any_specifiers_p)
 {
-  pedwarn (omitted_parms_loc, 0,
+  pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,
 	   "parameter declaration before lambda declaration "
 	   "specifiers only optional with %<-std=c++2b%> or "
 	   "%<-std=gnu++2b%>");
@@ -11409,7 +11409,7 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, tree lambda_expr)
   tx_qual = cp_parser_tx_qualifier_opt (parser);
   if (omitted_parms_loc && tx_qual)
 {
-  pedwarn (omitted_parms_loc, 0,
+  pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,
 	   "parameter declaration before lambda transaction "
 	   "qualifier only optional with %<-std=c++2b%> or "
 	   "%<-std=gnu++2b%>");
@@ -11422,7 +11422,7 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, tree lambda_expr)
 
   if (omitted_parms_loc && exception_spec)
 {
-  pedwarn (omitted_parms_loc, 0,
+  pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,
 	   "parameter declaration before lambda exception "
 	   "specification only optional with %<-std=c++2b%> or "
 	   "%<-std=gnu++2b%>");
@@ -11440,7 +11440,7 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, tree lambda_expr)
   if (cp_lexer_next_token_is (parser->lexer, CPP_DEREF))
 {
   if (omitted_parms_loc)
-	pedwarn (omitted_parms_loc, 0,
+	pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,
 		 "parameter declaration before lambda trailing "
 		 "return type only optional with %<-std=c++2b%> or "
 		 "%<-std=gnu++2b%>");

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Eric Gallager via Gcc-patches

On Wed, May 19, 2021 at 12:33 PM Jonathan Wakely via Gcc-patches
 wrote:
>
> This adds new warning flags, enabled by default: -Wc++11-extensions,
> -Wc++14-extensions, -Wc++17-extensions, and -Wc++20-extensions. The
> names of the flags are copied from Clang, which already has similar
> options.
>
> No new diagnostics are added, but the new OPT_Wxxx variables are used to
> control existing pedwarns about occurrences of new C++ constructs in code
> using an old C++ standard dialect. This allows several existing warnings
> that cannot currently be disabled to be controlled by the appropriate
> -Wno-xxx flag. For example, it will now be possible to disable warnings
> about using variadic templates in C++98 code, by using the new
> -Wno-c++11-extensions option. This will allow libstdc++ headers to
> disable those warnings unconditionally by using diagnostic pragmas, so
> that they are not emitted even if -Wsystem-headers is used.
>
> Some of the affected diagnostics are currently only given when
> -Wpedantic is used. Now that we have a more specific warning flag, we
> could consider making them not depend on -Wpedantic, and only on the new
> flag. This patch does not do that, as it intends to make no changes to
> what is accepted/rejected by default. The only effect should be that
> the new option is shown when -fdiagnostics-show-option is active, and
> that some warnings can be disabled by using the new flags (and for the
> warnings that previously only depended on -Wpedantic, it will now be
> possible to disable just those warnings while still using -Wpedantic for
> its other benefits).
>
> A new helper function, warn_about_dialect_p, is introduced to avoid the
> repetition of `if (cxx_dialect < cxxNN && warn_cxxNN_extensions)`
> everywhere.
>
> gcc/c-family/ChangeLog:
>
> * c.opt (Wc++11-extensions, Wc++14-extensions)
> (Wc++17-extensions, Wc++20-extensions): New options.
>
> gcc/cp/ChangeLog:
>
> * call.c (maybe_warn_array_conv): Use new function and option.
> * cp-tree.h (warn_about_dialect_p): Declare new function.
> * error.c (maybe_warn_cpp0x): Use new function and options.
> (warn_about_dialect_p): Define new function.
> * parser.c (cp_parser_unqualified_id): Use new function and
> option.
> (cp_parser_pseudo_destructor_name): Likewise.
> (cp_parser_lambda_introducer): Likewise.
> (cp_parser_lambda_declarator_opt): Likewise.
> (cp_parser_init_statement): Likewise.
> (cp_parser_decomposition_declaration): Likewise.
> (cp_parser_function_specifier_opt): Likewise.
> (cp_parser_static_assert): Likewise.
> (cp_parser_namespace_definition): Likewise.
> (cp_parser_initializer_list): Likewise.
> (cp_parser_member_declaration): Likewise.
> * pt.c (check_template_variable): Likewise.
>
> Tested x86_64-linux. OK for trunk?
>
>

Thank you for doing this! One thing I'm wondering about, is that I'm
pretty sure clang also allows at least some of these to be used with
plain C as well, for example for things like the old use of "auto" in
C conflicting with the newer C++11 meaning of "auto". Would it be
possible to do likewise for GCC as well? Just an idea.
Thanks,
Eric Gallager

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Jonathan Wakely via Gcc-patches


Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?



commit a7dc19cdc0c5d3762bb90d12ebd82a05d0013246
Author: Jonathan Wakely 
Date:   Wed May 19 17:37:00 2021

c++: Add new warning options for C++ language mismatches

This adds new warning flags, enabled by default: -Wc++11-extensions,
-Wc++14-extensions, -Wc++17-extensions, and -Wc++20-extensions. The
names of the flags are copied from Clang, which already has similar
options.

No new diagnostics are added, but the new OPT_Wxxx variables are used to
control existing pedwarns about occurrences of new C++ constructs in code
using an old C++ standard dialect. This allows several existing warnings
that cannot currently be disabled to be controlled by the appropriate
-Wno-xxx flag. For example, it will now be possible to disable warnings
about using variadic templates in C++98 code, by using the new
-Wno-c++11-extensions option. This will allow libstdc++ headers to
disable those warnings unconditionally by using diagnostic pragmas, so
that they are not emitted even if -Wsystem-headers is used.

Some of the affected diagnostics are currently only given when
-Wpedantic is used. Now that we have a more specific warning flag, we
could consider making them not depend on -Wpedantic, and only on the new
flag. This patch does not do that, as it intends to make no changes to
what is accepted/rejected by default. The only effect should be that
the new option is shown when -fdiagnostics-show-option is active, and
that some warnings can be disabled by using the new flags (and for the
warnings that previously only depended on -Wpedantic, it will now be
possible to disable just those warnings while still using -Wpedantic for
its other benefits).

A new helper function, warn_about_dialect_p, is introduced to avoid the
repetition of `if (cxx_dialect < cxxNN && warn_cxxNN_extensions)`
everywhere.

gcc/c-family/ChangeLog:

* c.opt (Wc++11-extensions, Wc++14-extensions)
(Wc++17-extensions, Wc++20-extensions): New options.

gcc/cp/ChangeLog:

* call.c (maybe_warn_array_conv): Use new function and option.
* cp-tree.h (warn_about_dialect_p): Declare new function.
* error.c (maybe_warn_cpp0x): Use new function and options.
(warn_about_dialect_p): Define new function.
* parser.c (cp_parser_unqualified_id): Use new function and
option.
(cp_parser_pseudo_destructor_name): Likewise.
(cp_parser_lambda_introducer): Likewise.
(cp_parser_lambda_declarator_opt): Likewise.
(cp_parser_init_statement): Likewise.
(cp_parser_decomposition_declaration): Likewise.
(cp_parser_function_specifier_opt): Likewise.
(cp_parser_static_assert): Likewise.
(cp_parser_namespace_definition): Likewise.
(cp_parser_initializer_list): Likewise.
(cp_parser_member_declaration): Likewise.
* pt.c (check_template_variable): Likewise.

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 5fcf961fd96..a7605cf3a38 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -431,6 +431,22 @@ Wc++20-compat
 C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall)
 Warn about C++ constructs whose meaning differs between ISO C++ 2017 and ISO C++ 2020.
 
+Wc++11-extensions
+C++ ObjC++ Var(warn_cxx11_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
+Warn about C++11 constructs in code compiled with an older standard.
+
+Wc++14-extensions
+C++ ObjC++ Var(warn_cxx14_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
+Warn about C++14 constructs in code compiled with an older standard.
+
+Wc++17-extensions
+C++ ObjC++ Var(warn_cxx17_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
+Warn about C++17 constructs in code compiled with an older standard.
+
+Wc++20-extensions
+C++ ObjC++ Var(warn_cxx20_extensions) Warning LangEnabledBy(C++ ObjC++,Wall) Init(1)
+Warn about C++20 constructs in code compiled with an older standard.
+
 Wcast-function-type
 C ObjC C++ ObjC++ Var(warn_cast_function_type) Warning EnabledBy(Wextra)
 Warn about casts between incompatible function types.
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 1e2d1d43184..5134d10bb24 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -7464,8 +7464,10 @@ maybe_warn_array_conv (location_t loc, conversion *c, tree expr)
   || TYPE_DOMAIN (type) == NULL_TREE)
 return;
 
-  if (conv_binds_to_array_of_unknown_bound (c))
-pedwarn (loc, OPT_Wpedantic, "conversions to arrays of unknown bound "
+  if (conv_binds_to_array_of_unknown_bound (c)
+  && pedantic && warn_about_dialect_p (cxx20))
+pedwarn (loc, OPT

[PATCH] c++: Add new warning options for C++ language mismatches

2021-05-19 Thread Jonathan Wakely via Gcc-patches

This adds new warning flags, enabled by default: -Wc++11-extensions,
-Wc++14-extensions, -Wc++17-extensions, and -Wc++20-extensions. The
names of the flags are copied from Clang, which already has similar
options.

No new diagnostics are added, but the new OPT_Wxxx variables are used to
control existing pedwarns about occurrences of new C++ constructs in code
using an old C++ standard dialect. This allows several existing warnings
that cannot currently be disabled to be controlled by the appropriate
-Wno-xxx flag. For example, it will now be possible to disable warnings
about using variadic templates in C++98 code, by using the new
-Wno-c++11-extensions option. This will allow libstdc++ headers to
disable those warnings unconditionally by using diagnostic pragmas, so
that they are not emitted even if -Wsystem-headers is used.
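
As an illustration of the diagnostic-pragma use described above, here is a minimal sketch of how a header might silence the warning around a single variadic template. The function count_args is a hypothetical example, not taken from libstdc++, and the sketch assumes a compiler that recognizes the -Wc++11-extensions option name (older compilers may instead warn about an unknown option in the pragma).

```cpp
#include <cstddef>

// Suppress the extension warning only around this declaration, roughly the
// way a library header might.  The push/pop pair restores the user's own
// warning settings afterwards.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wc++11-extensions"

// count_args is a hypothetical helper used only for illustration.
// Variadic templates are a C++11 feature; in C++98 mode they are accepted
// as an extension and normally trigger the pedwarn controlled above.
template <typename... Args>
std::size_t count_args (const Args &...)
{
  return sizeof... (Args);
}

#pragma GCC diagnostic pop
```

Code that includes such a header with -std=c++98 would then be expected to compile without the variadic-template warning, while the same construct written outside the push/pop region would still be diagnosed.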

Some of the affected diagnostics are currently only given when
-Wpedantic is used. Now that we have a more specific warning flag, we
could consider making them not depend on -Wpedantic, and only on the new
flag. This patch does not do that, as it intends to make no changes to
what is accepted/rejected by default. The only effect should be that
the new option is shown when -fdiagnostics-show-option is active, and
that some warnings can be disabled by using the new flags (and for the
warnings that previously only depended on -Wpedantic, it will now be
possible to disable just those warnings while still using -Wpedantic for
its other benefits).

A new helper function, warn_about_dialect_p, is introduced to avoid the
repetition of `if (cxx_dialect < cxxNN && warn_cxxNN_extensions)`
everywhere.
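
To make the shape of that check concrete, here is a small standalone sketch of what such a helper might look like. It is not the code from the patch: the enum, the dialect_options struct, and the two-argument signature are hypothetical stand-ins so the example compiles on its own, whereas the real helper consults the front end's global state.

```cpp
// Hypothetical stand-ins for the front end's dialect enum and the variables
// generated from c.opt; the real definitions differ in detail.
enum dialect { cxx98, cxx11, cxx14, cxx17, cxx20 };

struct dialect_options
{
  dialect current;               // dialect selected with -std=
  bool warn_cxx11_extensions;    // -Wc++11-extensions
  bool warn_cxx14_extensions;    // -Wc++14-extensions
  bool warn_cxx17_extensions;    // -Wc++17-extensions
  bool warn_cxx20_extensions;    // -Wc++20-extensions
};

// Return true if a construct first available in REQUIRED should be
// diagnosed: the active dialect is older than REQUIRED and the matching
// -Wc++NN-extensions flag is enabled.
bool warn_about_dialect_p (const dialect_options &opts, dialect required)
{
  if (opts.current >= required)
    return false;

  switch (required)
    {
    case cxx11: return opts.warn_cxx11_extensions;
    case cxx14: return opts.warn_cxx14_extensions;
    case cxx17: return opts.warn_cxx17_extensions;
    case cxx20: return opts.warn_cxx20_extensions;
    default:    return false;
    }
}
```

With a helper of this shape, each call site reduces to a single condition instead of repeating the dialect comparison and the flag test.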

gcc/c-family/ChangeLog:

* c.opt (Wc++11-extensions, Wc++14-extensions)
(Wc++17-extensions, Wc++20-extensions): New options.

gcc/cp/ChangeLog:

* call.c (maybe_warn_array_conv): Use new function and option.
* cp-tree.h (warn_about_dialect_p): Declare new function.
* error.c (maybe_warn_cpp0x): Use new function and options.
(warn_about_dialect_p): Define new function.
* parser.c (cp_parser_unqualified_id): Use new function and
option.
(cp_parser_pseudo_destructor_name): Likewise.
(cp_parser_lambda_introducer): Likewise.
(cp_parser_lambda_declarator_opt): Likewise.
(cp_parser_init_statement): Likewise.
(cp_parser_decomposition_declaration): Likewise.
(cp_parser_function_specifier_opt): Likewise.
(cp_parser_static_assert): Likewise.
(cp_parser_namespace_definition): Likewise.
(cp_parser_initializer_list): Likewise.
(cp_parser_member_declaration): Likewise.
* pt.c (check_template_variable): Likewise.

Tested x86_64-linux. OK for trunk?


commit ed749dcee73bff55fc27231c15d2bae265ec86b3
Author: Jonathan Wakely 
Date:   Wed May 19 17:07:32 2021

c++: Add new warning options for C++ language mismatches

This adds new warning flags, enabled by default: -Wc++11-extensions,
-Wc++14-extensions, -Wc++17-extensions, and -Wc++20-extensions. The
names of the flags are copied from Clang, which already has similar
options.

No new diagnostics are added, but the new OPT_Wxxx variables are used to
control existing pedwarns about occurrences of new C++ constructs in code
using an old C++ standard dialect. This allows several existing warnings
that cannot currently be disabled to be controlled by the appropriate
-Wno-xxx flag. For example, it will now be possible to disable warnings
about using variadic templates in C++98 code, by using the new
-Wno-c++11-extensions option. This will allow libstdc++ headers to
disable those warnings unconditionally by using diagnostic pragmas, so
that they are not emitted even if -Wsystem-headers is used.

Some of the affected diagnostics are currently only given when
-Wpedantic is used. Now that we have a more specific warning flag, we
could consider making them not depend on -Wpedantic, and only on the new
flag. This patch does not do that, as it intends to make no changes to
what is accepted/rejected by default. The only effect should be that
the new option is shown when -fdiagnostics-show-option is active, and
that some warnings can be disabled by using the new flags (and for the
warnings that previously only depended on -Wpedantic, it will now be
possible to disable just those warnings while still using -Wpedantic for
its other benefits).

A new helper function, warn_about_dialect_p, is introduced to avoid the
repetition of `if (cxx_dialect < cxxNN && warn_cxxNN_extensions)`
everywhere.

gcc/c-family/ChangeLog:

* c.opt (Wc++11-extensions, Wc++14-extensions)
(Wc++17-extensions, Wc++20-extensions): New options.

gcc/cp/ChangeLog:

* call.c (maybe_warn_array_conv): Use new function and option.
* cp-tree.h (warn_about_dialect_p): Declare new

[PATCH] wwwdocs: Add D front-end section for GCC 11 changes

2021-05-19 Thread Iain Buclaw via Gcc-patches

Hi,

This is a belated patch which covers some of the more notable changes
that have gone into the GCC 11 release of the D front-end.

Ran this through the W3 validator and no new warnings are generated.

I will go through it a few more times to see if there's anything more
that can be made more succinct, otherwise are there any wording
suggestions before I commit this?

Iain.

---
 htdocs/gcc-11/changes.html | 190 +
 htdocs/gcc-12/changes.html |   2 +
 2 files changed, 192 insertions(+)

diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
index 4bdae272..0072164f 100644
--- a/htdocs/gcc-11/changes.html
+++ b/htdocs/gcc-11/changes.html
@@ -544,6 +544,196 @@ You may also want to check out our
   
 
 
+D
+
+  New features:
+
+  A new bottom type typeof(*null) has been added to
+   represent run-time errors and non-terminating functions.  This also
+   introduces a new standard alias for the type named
+   noreturn, and is implicitly imported into every module.
+  
+  Printf-like and scanf-like functions are now detected by prefixing
+   them with pragma(printf) for printf-like functions or
+   pragma(scanf) for scanf-like functions.
+  
+  The __traits() expression now supports the extensions
+   isDeprecated, isDisabled,
+   isFuture, isModule, isPackage,
+   child, isReturnOnStack,
+   isZeroInit, getTargetInfo,
+   getLocation, hasPostblit,
+   isCopyable, getVisibility, and
+   totype.
+  
+  An expression-based contract syntax has been added to the
+   language.
+  
+  Function literals can now return a value by reference with the
+   ref keyword.
+  
+  A new syntax is available to declare aliases to function types using
+   the alias syntax based on the assignment operator.
+  
+  New types __c_complex_float,
+   __c_complex_double,  __c_complex_real, and
+   __c_wchar_t have been added for interfacing with C
+   and C++ code, and are available from the core.stdc.config
+   module.
+  
+  User-defined attributes can now be used to annotate enum members,
+   alias declarations, and function parameters.
+  
+  Template alias parameters can now be instantiated with basic types
+   such as int or void function().
+  
+  The mixin construct can now be used as types in the form
+   mixin(string) var.
+  
+  The mixin construct can now take an argument list, same
+   as pragma(msg).
+  
+
+  
+  New intrinsics:
+
+  Bitwise rotate intrinsics core.bitop.rol and
+   core.bitop.ror have been added.
+  
+  Byte swap intrinsic core.bitop.byteswap for swapping
+   bytes in a 2-byte ushort has been added.
+  
+  Math intrinsics available from core.math now have
+   overloads for float and double types.
+   Volatile intrinsics core.volatile.volatileLoad and
+ core.volatile.volatileStore have been moved from the
+ core.bitop module.
+   
+
+  
+  New attributes:
+
+  The following GCC attributes are now recognized and available from
+   the gcc.attributes module with short-hand aliases for
+   convenience:
+   
+ @attribute("alloc_size", arguments) or
+   @alloc_size(arguments).
+ 
+ @attribute("always_inline") or
+   @always_inline.
+ 
+ @attribute("used") or @used.
+ @attribute("optimize", arguments) or
+   @optimize(arguments).
+ 
+ @attribute("cold") or @cold.
+ @attribute("noplt") or @noplt.
+ @attribute("target_clones", arguments) or
+   @target_clones(arguments).
+ 
+ @attribute("no_icf") or @no_icf.
+ @attribute("noipa") or @noipa.
+ @attribute("symver", arguments) or
+   @symver(arguments).
+ 
+   
+  
+  New aliases have been added to gcc.attributes for
+   compatibility with ldc.attributes.
+   
+ The @allocSize(arguments) attribute is the same as
+   @alloc_size(arguments), but uses a 0-based index for
+   function arguments.
+ 
+ The @assumeUsed attribute is an alias for
+   @attribute("used").
+ 
+ The @fastmath attribute is an alias for
+   @optimize("Ofast").
+ 
+ The @naked attribute is an alias for
+   @attribute("naked").  This attribute may not be
+   available on all targets.
+ 
+ The @restrict attribute has been added to specify
+   that a function parameter is to be restrict-qualified in the C99
+   sense of the term.
+ 
+ The @optStrategy(strategy) attribute is an alias for
+   @optimize("O0") when the strategy is
+   "none", otherwise @optimize("Os") for the
+   "optsize" and "minsize" strategies.
+ 
+ The @

Re: [Patch] Testsuite/Fortran: gfortran.dg/pr96711.f90 - fix expected value for PowerPC [PR96983]

2021-05-19 Thread Segher Boessenkool

On Wed, May 19, 2021 at 02:32:02PM +0200, Tobias Burnus wrote:
> Regarding gfortran.dg/pr96711.f90:
> 
> On my x86-64-gnu-linux, it PASSes.
> On our powerpc64le-linux-gnu it FAILS with
> 'STOP 3' (→ also scan-dump count) and 'STOP 4'.
> 
> Contrary to PR96983's bug summary, I don't get an ICE.
> 
> 
> On powerpc64le-linux-gnu, the following condition evaluates true (→ 'STOP 3'):
> 
>real(16)   :: y   ! 128bit REAL
>integer(16), parameter :: k2 = nint (2 / epsilon (y), kind(k2))
>integer(16), parameter :: m2 = 10384593717069655257060992658440192_16 
>!2**113
>if (k2 /= m2) stop 3
> 
> On x86_64-linux-gnu, k2 == m2 — but on powerpc64le-linux-gnu,
> k2 == 2**106 instead of 2**113.
> 
> My solution is to permit also 2**106 besides 2**113.
> 
> @PowerPC maintainers: Does this make sense? – It seems to work on our PowerPC
> but with all the new 'long double' changes, does it also work for you?

I do not understand Fortran well enough, could you explain what the code
is supposed to do?

>   PR fortran/96983
>   * gfortran.dg/pr96711.f90:

You're missing the actual entry here, fwiw.

> -  integer(16), parameter :: m2 = 10384593717069655257060992658440192_16 !2**113
> +  integer(16), parameter :: m2 = 10384593717069655257060992658440192_16 !2**113  ! Some systems like x86-64
> +  integer(16), parameter :: m2a = 81129638414606681695789005144064_16   !2**106  ! Some systems like PowerPC

If you use double-double ("ibm long double") a number is represented as
the sum of two double precision numbers, while if you use IEEE quad
precision floating point you get a 112-bit fraction (and a leading one).
The most significant of the two DP numbers is the whole rounded to DP.
The actual precision varies, it depends on various factors :-/


Segher


>integer(16), volatile  :: m
>x = 2 / epsilon (x)
>y = 2 / epsilon (y)
>m = nint (x, kind(m))
>  ! print *, m
>if (k1 /= m1) stop 1
>if (m  /= m1) stop 2
>m = nint (y, kind(m))
>  ! print *, m
> -  if (k2 /= m2) stop 3
> -  if (m  /= m2) stop 4
> +  if (k2 /= m2 .and. k2 /= m2a) stop 3
> +  if (m  /= m2 .and. m /= m2a) stop 4
>  end program
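
The two constants in the test can be checked with a little arithmetic. For a binary floating-point type with p significand bits, Fortran's epsilon is 2**(1-p), so 2 / epsilon is 2**p: 2**113 for IEEE quad precision, and 2**106 when the type is treated as having 106 bits of precision (the value GCC reports for IBM double-double, as far as I know). The sketch below is only an illustration; the function names are made up for this example, and it uses digit-string doubling so that no 128-bit integer type is required.

```cpp
#include <string>

// Decimal representation of 2**exponent, computed by repeatedly doubling
// a digit string (stored least significant digit first).
std::string pow2_decimal (unsigned exponent)
{
  std::string digits = "1";
  for (unsigned i = 0; i < exponent; ++i)
    {
      int carry = 0;
      for (std::string::size_type j = 0; j < digits.size (); ++j)
        {
          int d = (digits[j] - '0') * 2 + carry;
          digits[j] = static_cast<char> ('0' + d % 10);
          carry = d / 10;
        }
      if (carry)
        digits.push_back (static_cast<char> ('0' + carry));
    }
  // Reverse into the usual most-significant-first order.
  return std::string (digits.rbegin (), digits.rend ());
}

// With epsilon = 2**(1 - p), the quantity 2 / epsilon equals 2**p.
std::string two_over_epsilon (unsigned precision_bits)
{
  return pow2_decimal (precision_bits);
}
```

Calling two_over_epsilon (113) and two_over_epsilon (106) reproduces the m2 and m2a literals from the patch, which is consistent with accepting either value in the test.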

[committed] testsuite: Check pthread for omp module testing

2021-05-19 Thread Kito Cheng

gcc/testsuite/ChangeLog:

* g++.dg/modules/omp-1_a.C: Check pthread is available.
* g++.dg/modules/omp-1_b.C: Ditto.
* g++.dg/modules/omp-1_c.C: Ditto.
* g++.dg/modules/omp-2_a.C: Ditto.
* g++.dg/modules/omp-2_b.C: Ditto.
---
 gcc/testsuite/g++.dg/modules/omp-1_a.C | 1 +
 gcc/testsuite/g++.dg/modules/omp-1_b.C | 1 +
 gcc/testsuite/g++.dg/modules/omp-1_c.C | 1 +
 gcc/testsuite/g++.dg/modules/omp-2_a.C | 1 +
 gcc/testsuite/g++.dg/modules/omp-2_b.C | 1 +
 5 files changed, 5 insertions(+)

diff --git a/gcc/testsuite/g++.dg/modules/omp-1_a.C b/gcc/testsuite/g++.dg/modules/omp-1_a.C
index 722720a0e93..94e1171f03c 100644
--- a/gcc/testsuite/g++.dg/modules/omp-1_a.C
+++ b/gcc/testsuite/g++.dg/modules/omp-1_a.C
@@ -1,4 +1,5 @@
 // { dg-additional-options "-fmodules-ts -fopenmp" }
+// { dg-require-effective-target pthread }
 
 export module foo;
 // { dg-module-cmi foo }
diff --git a/gcc/testsuite/g++.dg/modules/omp-1_b.C b/gcc/testsuite/g++.dg/modules/omp-1_b.C
index f3f5d92e517..09d97e4ac4e 100644
--- a/gcc/testsuite/g++.dg/modules/omp-1_b.C
+++ b/gcc/testsuite/g++.dg/modules/omp-1_b.C
@@ -1,4 +1,5 @@
 // { dg-additional-options "-fmodules-ts -fopenmp" }
+// { dg-require-effective-target pthread }
 
 import foo;
 
diff --git a/gcc/testsuite/g++.dg/modules/omp-1_c.C b/gcc/testsuite/g++.dg/modules/omp-1_c.C
index f30f6115277..599a5a5d34f 100644
--- a/gcc/testsuite/g++.dg/modules/omp-1_c.C
+++ b/gcc/testsuite/g++.dg/modules/omp-1_c.C
@@ -1,4 +1,5 @@
 // { dg-additional-options "-fmodules-ts" }
+// { dg-require-effective-target pthread }
 
 import foo;
 
diff --git a/gcc/testsuite/g++.dg/modules/omp-2_a.C b/gcc/testsuite/g++.dg/modules/omp-2_a.C
index d2291b6bbe0..b0d4bbc6e8a 100644
--- a/gcc/testsuite/g++.dg/modules/omp-2_a.C
+++ b/gcc/testsuite/g++.dg/modules/omp-2_a.C
@@ -1,4 +1,5 @@
 // { dg-additional-options "-fmodules-ts -fopenmp" }
+// { dg-require-effective-target pthread }
 
 export module foo;
 // { dg-module-cmi foo }
diff --git a/gcc/testsuite/g++.dg/modules/omp-2_b.C b/gcc/testsuite/g++.dg/modules/omp-2_b.C
index 39f34c70275..aeee4d1561a 100644
--- a/gcc/testsuite/g++.dg/modules/omp-2_b.C
+++ b/gcc/testsuite/g++.dg/modules/omp-2_b.C
@@ -1,4 +1,5 @@
 // { dg-additional-options "-fmodules-ts" }
+// { dg-require-effective-target pthread }
 
 import foo;
 
-- 
2.31.1

[PING][PATCH][libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end

2021-05-19 Thread Tom de Vries

On 4/23/21 6:48 PM, Tom de Vries wrote:
> On 4/23/21 5:45 PM, Alexander Monakov wrote:
>> On Thu, 22 Apr 2021, Tom de Vries wrote:
>>
>>> Ah, I see, agreed, that makes sense.  I was afraid there was some
>>> fundamental problem that I overlooked.
>>>
>>> Here's an updated version.  I've tried to make it clear that the
>>> futex_wait/wake are locally used versions, not generic functionality.
>> Could you please regenerate the patch passing appropriate flags to
>> 'git format-patch' so it presents a rewrite properly (see documentation
>> for --patience and --break-rewrites options). The attached patch was mostly
>> unreadable, I'm afraid.
> Sure.  I did notice that the patch was not readable, but I didn't know
> there were options to improve that, so thanks for pointing that out.
> 

Ping.  Any comments?

Thanks,
- Tom

> 0001-libgomp-nvptx-Fix-hang-in-gomp_team_barrier_wait_end.patch
> 
> From d3053a7ec7444b371ee29097a673e637b0d369d9 Mon Sep 17 00:00:00 2001
> From: Tom de Vries 
> Date: Tue, 20 Apr 2021 08:47:03 +0200
> Subject: [PATCH 1/4] [libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end
> 
> Consider the following omp fragment.
> ...
>   #pragma omp target
>   #pragma omp parallel num_threads (2)
>   #pragma omp task
> ;
> ...
> 
> This hangs at -O0 for nvptx.
> 
> Investigating the behaviour gives us the following trace of events:
> - both threads execute GOMP_task, where they:
>   - deposit a task, and
>   - execute gomp_team_barrier_wake
> - thread 1 executes gomp_team_barrier_wait_end and, not being the last thread,
>   proceeds to wait at the team barrier
> - thread 0 executes gomp_team_barrier_wait_end and, being the last thread, it
>   calls gomp_barrier_handle_tasks, where it:
>   - executes both tasks and marks the team barrier done
>   - executes a gomp_team_barrier_wake which wakes up thread 1
> - thread 1 exits the team barrier
> - thread 0 returns from gomp_barrier_handle_tasks and goes to wait at
>   the team barrier.
> - thread 0 hangs.
> 
> To understand why there is a hang here, it's good to understand how things
> are set up for nvptx.  The libgomp/config/nvptx/bar.c implementation is
> a copy of the libgomp/config/linux/bar.c implementation, with uses of both
> futex_wake and do_wait replaced with uses of ptx insn bar.sync:
> ...
>   if (bar->total > 1)
> asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
> ...
> 
> The point where thread 0 goes to wait at the team barrier, corresponds in
> the linux implementation with a do_wait.  In the linux case, the call to
> do_wait doesn't hang, because it's waiting for bar->generation to become
> a certain value, and if bar->generation already has that value, it just
> proceeds, without any need for coordination with other threads.
> 
> In the nvptx case, the bar.sync waits until thread 1 joins it in the same
> logical barrier, which never happens: thread 1 is lingering in the
> thread pool at the thread pool barrier (using a different logical barrier),
> waiting to join a new team.
> 
> The easiest way to fix this is to revert to the posix implementation for
> bar.{c,h}.  That however falls back on a busy-waiting approach, and
> does not take advantage of the ptx bar.sync insn.
> 
> Instead, we revert to the linux implementation for bar.c,
> and implement bar.c local functions futex_wait and futex_wake using the
> bar.sync insn.
> 
> This is a WIP version that does not yet take performance into consideration,
> but instead focuses on copying a working version as completely as possible,
> and isolating the machine-specific changes to as few functions as
> possible.
> 
> The bar.sync insn takes an argument specifying how many threads are
> participating, and that doesn't play well with the futex syntax where it's
> not clear in advance how many threads will be woken up.
> 
> This is solved by waking up all waiting threads each time a futex_wait or
> futex_wake happens, and possibly going back to sleep with an updated thread
> count.
> 
> Tested libgomp on x86_64 with nvptx accelerator, both as-is and with
> do_spin hardcoded to 1.
> 
> libgomp/ChangeLog:
> 
> 2021-04-20  Tom de Vries  
> 
>   PR target/99555
>   * config/nvptx/bar.c (generation_to_barrier): New function, copied
>   from config/rtems/bar.c.
>   (futex_wait, futex_wake): New function.
>   (do_spin, do_wait): New function, copied from config/linux/wait.h.
>   (gomp_barrier_wait_end, gomp_barrier_wait_last)
>   (gomp_team_barrier_wake, gomp_team_barrier_wait_end):
>   (gomp_team_barrier_wait_cancel_end, gomp_team_barrier_cancel): Remove
>   and replace with include of config/linux/bar.c.
>   * config/nvptx/bar.h (gomp_barrier_t): Add fields waiters and lock.
>   (gomp_barrier_init): Init new fields.
>   * testsuite/libgomp.c-c++-common/task-detach-6.c: Remove nvptx-specific
>   workarounds.
>   * testsuite/libgomp.c/pr99555-1.c: Same.
>   * testsuite/libgomp.fortran/task-detach-6.f90: Same.
> ---
>  libgomp/c
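
The contrast drawn in the commit message, between a wait that checks a generation value and a barrier that counts arrivals, can be sketched in isolation. This is not libgomp code: generation_barrier, advance_generation, and wait_for_generation_change are hypothetical names, and the polling loop is bounded only so that the sketch can never hang. It shows why a late waiter in the Linux-style scheme proceeds immediately once the generation has already advanced, whereas a counting barrier such as bar.sync would wait for a partner that never arrives.

```cpp
#include <atomic>

// Hypothetical, simplified stand-in for a barrier that tracks a generation
// counter, loosely modelled on the description above.
struct generation_barrier
{
  std::atomic<unsigned> generation;
  generation_barrier () : generation (0) {}
};

// Mark the barrier as done for the current generation.
void advance_generation (generation_barrier &bar)
{
  bar.generation.fetch_add (1, std::memory_order_release);
}

// Futex-style wait: return as soon as the generation differs from SEEN.
// If it already differs on entry, no coordination with other threads is
// needed.  Polling is bounded by MAX_POLLS purely for illustration; the
// function returns false if no change was observed.
bool wait_for_generation_change (const generation_barrier &bar,
                                 unsigned seen, unsigned max_polls)
{
  for (unsigned i = 0; i <= max_polls; ++i)
    if (bar.generation.load (std::memory_order_acquire) != seen)
      return true;
  return false;
}
```

In the trace above, thread 0 arrives after the barrier has already been marked done, which corresponds to calling wait_for_generation_change after advance_generation: the wait returns at once.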

Re: [PATCH][DOCS] Remove install-old.texi

2021-05-19 Thread Joseph Myers

On Wed, 19 May 2021, Martin Liška wrote:

> > I'm not sure where this list comes from
> 
> I split parts in contrib/config-list.mk and printed them.
> 
> > but I'd expect "linux" to be the
> > canonical "linux-gnu", along with "linux-uclibc", "linux-android",
> > "linux-musl" ("uclibc" etc. aren't system names on their own) and variants
> > with "eabi" or "eabihf" on the end (see what config.guess produces for
> > Arm).
> 
> One needs an Arm machine for that. Do you know about a better way how to get
> list of all systems?

Looking at the config.sub / config.guess testsuites (in config.git) will 
provide lists of many systems (not all supported by GCC) (and the code of 
config.guess may also help show what the Arm variants are).  Looking at 
config.gcc to see what it matches against may also be helpful (but then 
use config.sub to get the canonical form of a target triplet).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] c: Add support for __FILE_NAME__ macro (PR c/42579)

2021-05-19 Thread Joseph Myers

This patch is missing documentation (in cpp.texi) and tests for the value 
of the macro.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] arm: Fix ICE with CMSE nonsecure call on Armv8.1-M [PR100333]

2021-05-19 Thread Richard Earnshaw via Gcc-patches





On 19/05/2021 15:44, Alex Coplan via Gcc-patches wrote:

This time with attachment.

On 19/05/2021 15:42, Alex Coplan via Gcc-patches wrote:

Hi Richard,

On 17/05/2021 17:31, Richard Earnshaw wrote:



On 30/04/2021 09:30, Alex Coplan via Gcc-patches wrote:

Hi,

As the PR shows, we ICE shortly after expanding nonsecure calls for
Armv8.1-M.  For Armv8.1-M, we have TARGET_HAVE_FPCXT_CMSE. As it stands,
the expander (arm.md:nonsecure_call_internal) moves the callee's address
to a register (with copy_to_suggested_reg) only if
!TARGET_HAVE_FPCXT_CMSE.

However, looking at the pattern which the insn appears to be intended to
match (thumb2.md:*nonsecure_call_reg_thumb2_fpcxt), it requires the
callee's address to be in a register.

This patch therefore just forces the callee's address into a register in
the expander.

Testing:
   * Regtested an arm-eabi cross configured with
   --with-arch=armv8.1-m.main+mve.fp+fp.dp --with-float=hard. No regressions.
   * Bootstrap and regtest on arm-linux-gnueabihf in progress.

OK for trunk and backports as appropriate if bootstrap looks good?

Thanks,
Alex

gcc/ChangeLog:

PR target/100333
* config/arm/arm.md (nonsecure_call_internal): Always ensure
callee's address is in a register.

gcc/testsuite/ChangeLog:

PR target/100333
* gcc.target/arm/cmse/pr100333.c: New test.




-  "
{
-if (!TARGET_HAVE_FPCXT_CMSE)
-  {
-   rtx tmp =
- copy_to_suggested_reg (XEXP (operands[0], 0),
-gen_rtx_REG (SImode, R4_REGNUM),
-SImode);
+rtx tmp = NULL_RTX;
+rtx addr = XEXP (operands[0], 0);

-   operands[0] = replace_equiv_address (operands[0], tmp);
-  }
-  }")
+if (TARGET_HAVE_FPCXT_CMSE && !REG_P (addr))
+  tmp = force_reg (SImode, addr);
+else if (!TARGET_HAVE_FPCXT_CMSE)
+  tmp = copy_to_suggested_reg (XEXP (operands[0], 0),
+  gen_rtx_REG (SImode, R4_REGNUM),
+  SImode);


I think it might be better to handle the !TARGET_HAVE_FPCXT_CMSE case via a
pseudo as well, then we don't end up generating a potentially non-trivial
insn that directly writes a fixed hard reg - it's better to let later passes
clean that up if they can.


Ah, I wasn't aware that was an issue.



Also, you've extracted XEXP (operands[0], 0) into 'addr', but then continue
to use the XEXP form in the existing path.  Please be consistent: use XEXP
directly everywhere, or use 'addr' everywhere.


Fixed, thanks.



So you want something like

   addr = XEXP (operands[0], 0);
   if (!REG_P (addr))
 addr = force_reg (SImode, addr);

   if (!T_H_F_C)
 addr = copy...(addr, gen(r4), SImode);

   operands[0] = replace_equiv_addr (operands[0], addr);

R.


How about the attached? Regtested an armv8.1-m.main cross, 
bootstrapped/regtested
on arm-linux-gnueabihf: no issues.

OK for trunk and eventual backports?

Thanks,
Alex





OK.

R.

Re: [PATCH] arm/testsuite: Fix testcase for PR99977

2021-05-19 Thread Christophe Lyon via Gcc-patches

On Wed, 19 May 2021 at 16:40, Richard Earnshaw
 wrote:
>
>
>
> On 19/05/2021 09:10, Christophe Lyon via Gcc-patches wrote:
> > Some targets (eg arm-none-uclinuxfdpiceabi) do not support Thumb-1,
> > and since the testcase forces -march=armv8-m.base, we need to check
> > whether this option is actually supported.
> >
> > Using dg-add-options arm_arch_v8m_base ensures that we pass -mthumb as
> > needed too.
> >
> > 2021-05-19  Christophe Lyon  
> >
> >   PR 99977
> >   gcc/testsuite/
> >   * gcc.target/arm/pr99977.c: Require arm_arch_v8m_base.
> > ---
> >   gcc/testsuite/gcc.target/arm/pr99977.c | 4 +++-
> >   1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/testsuite/gcc.target/arm/pr99977.c 
> > b/gcc/testsuite/gcc.target/arm/pr99977.c
> > index 7911899d928..db330e4a4a3 100644
> > --- a/gcc/testsuite/gcc.target/arm/pr99977.c
> > +++ b/gcc/testsuite/gcc.target/arm/pr99977.c
> > @@ -1,5 +1,7 @@
> >   /* { dg-do compile } */
> > -/* { dg-options "-march=armv8-m.base -mfloat-abi=soft -O2" } */
> > +/* { dg-require-effective-target arm_arch_v8m_base_ok } */
> > +/* { dg-options "-O2" } */
> > +/* { dg-add-options arm_arch_v8m_base } */
> >   _Bool f1(int *p) { return __sync_bool_compare_and_swap (p, -1, 2); }
> >   _Bool f2(int *p) { return __sync_bool_compare_and_swap (p, -8, 2); }
> >   int g1(int *p) { return __sync_val_compare_and_swap (p, -1, 2); }
> >
>
> OK.
>

Thanks, I also pushed it to gcc-11, where the patch was recently backported.

> R.

Re: [PATCH] RISC-V: Properly parse the letter 'p' in '-march'.

2021-05-19 Thread Kito Cheng via Gcc-patches

Hi Geng:

Thanks for your patch, committed with minor tweaks for gcc_assert.

On Tue, May 18, 2021 at 2:31 PM Geng Qi via Gcc-patches
 wrote:
>
> gcc/ChangeLog:
> * common/config/riscv/riscv-common.c
> (riscv_subset_list::parsing_subset_version): Properly parse the letter
> 'p' in '-march'.
> (riscv_subset_list::parse_std_ext,
>  riscv_subset_list::parse_multiletter_ext): To handle errors generated
> in riscv_subset_list::parsing_subset_version.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/riscv/arch-12.c: New.
> * gcc.target/riscv/attribute-19.c: New.
> ---
>  gcc/common/config/riscv/riscv-common.c| 67 
> ++-
>  gcc/testsuite/gcc.target/riscv/arch-12.c  |  4 ++
>  gcc/testsuite/gcc.target/riscv/attribute-19.c |  4 ++
>  3 files changed, 42 insertions(+), 33 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/arch-12.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-19.c
>
> diff --git a/gcc/common/config/riscv/riscv-common.c 
> b/gcc/common/config/riscv/riscv-common.c
> index 34b74e5..65e5641 100644
> --- a/gcc/common/config/riscv/riscv-common.c
> +++ b/gcc/common/config/riscv/riscv-common.c
> @@ -518,40 +518,38 @@ riscv_subset_list::parsing_subset_version (const char 
> *ext,
>unsigned version = 0;
>unsigned major = 0;
>unsigned minor = 0;
> -  char np;
>*explicit_version_p = false;
>
> -  for (; *p; ++p)
> -{
> -  if (*p == 'p')
> -   {
> - np = *(p + 1);
> -
> - if (!ISDIGIT (np))
> -   {
> - /* Might be beginning of `p` extension.  */
> - if (std_ext_p)
> -   {
> - get_default_version (ext, major_version, minor_version);
> - return p;
> -   }
> - else
> -   {
> - error_at (m_loc, "%<-march=%s%>: Expect number "
> -   "after %<%dp%>.", m_arch, version);
> - return NULL;
> -   }
> -   }
> -
> - major = version;
> - major_p = false;
> - version = 0;
> -   }
> -  else if (ISDIGIT (*p))
> -   version = (version * 10) + (*p - '0');
> -  else
> -   break;
> -}
> +  if (*p == 'p')
> +gcc_assert (std_ext_p);
> +  else {
> +for (; *p; ++p)
> +  {
> +   if (*p == 'p')
> + {
> +   if (!ISDIGIT (*(p+1)))
> + {
> +   error_at (m_loc, "%<-march=%s%>: Expect number "
> + "after %<%dp%>.", m_arch, version);
> +   return NULL;
> + }
> +   if (!major_p)
> + {
> +   error_at (m_loc, "%<-march=%s%>: For %<%s%dp%dp?%>, version "
> + "number with more than 2 level is not supported.",
> + m_arch, ext, major, version);
> +   return NULL;
> + }
> +   major = version;
> +   major_p = false;
> +   version = 0;
> + }
> +   else if (ISDIGIT (*p))
> + version = (version * 10) + (*p - '0');
> +   else
> + break;
> +  }
> +  }
>
>if (major_p)
>  major = version;
> @@ -643,7 +641,7 @@ riscv_subset_list::parse_std_ext (const char *p)
>return NULL;
>  }
>
> -  while (*p)
> +  while (p != NULL && *p)
>  {
>char subset[2] = {0, 0};
>
> @@ -771,6 +769,9 @@ riscv_subset_list::parse_multiletter_ext (const char *p,
>   /* std_ext_p= */ false, 
> &explicit_version_p);
>free (ext);
>
> +  if (end_of_version == NULL)
> +   return NULL;
> +
>*q = '\0';
>
>if (strlen (subset) == 1)
> diff --git a/gcc/testsuite/gcc.target/riscv/arch-12.c 
> b/gcc/testsuite/gcc.target/riscv/arch-12.c
> new file mode 100644
> index 000..29e16c3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/arch-12.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -march=rv64im1p2p3 -mabi=lp64" } */
> +int foo() {}
> +/* { dg-error "'-march=rv64im1p2p3': For 'm1p2p\\?', version number with more than 2 level is not supported." "" { target *-*-* } 0 } */
> diff --git a/gcc/testsuite/gcc.target/riscv/attribute-19.c 
> b/gcc/testsuite/gcc.target/riscv/attribute-19.c
> new file mode 100644
> index 000..18f68d9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/attribute-19.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mriscv-attribute -march=rv64imp0p9 -mabi=lp64" } */
> +int foo() {}
> +/* { dg-final { scan-assembler ".attribute arch, \"rv64i2p0_m2p0_p0p9\"" } } */
> --
> 2.7.4
>
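
For readers following the thread, here is a minimal standalone sketch of the version-suffix parsing the quoted patch describes. It is not the riscv_subset_list code: the names parse_version and ParsedVersion are hypothetical, the default-version lookup and the error_at calls are left out, and an error is modelled simply as a false return value.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type; the real code writes through out-parameters.
struct ParsedVersion {
  unsigned major_version = 0;
  unsigned minor_version = 0;
  bool explicit_version = false;  // true if any version text was consumed
  std::size_t consumed = 0;       // number of characters consumed
};

// Parse "<major>[p<minor>]" at the start of S.  Returns false where the
// patch would report an error: a 'p' not followed by a digit, or a third
// version level such as "1p2p3".
bool parse_version(const std::string &s, ParsedVersion &out) {
  out = ParsedVersion();

  // A leading 'p' is not a version separator; it is the start of the
  // 'p' extension, so nothing is consumed here.
  if (!s.empty() && s[0] == 'p')
    return true;

  unsigned version = 0, major_v = 0, minor_v = 0;
  bool major_p = true;  // still parsing the major number
  std::size_t i = 0;
  for (; i < s.size(); ++i) {
    unsigned char c = static_cast<unsigned char>(s[i]);
    if (c == 'p') {
      bool digit_follows =
          i + 1 < s.size() &&
          std::isdigit(static_cast<unsigned char>(s[i + 1]));
      if (!digit_follows)
        return false;  // "Expect number after <N>p"
      if (!major_p)
        return false;  // more than two version levels
      major_v = version;
      major_p = false;
      version = 0;
    } else if (std::isdigit(c)) {
      version = version * 10 + (c - '0');
    } else {
      break;  // end of the version text
    }
  }

  if (major_p)
    major_v = version;
  else
    minor_v = version;

  out.major_version = major_v;
  out.minor_version = minor_v;
  out.explicit_version = i > 0;
  out.consumed = i;
  return true;
}
```

With this sketch, the suffix "1p2p3" from arch-12.c is rejected, while "p0p9" from attribute-19.c consumes nothing, so the caller can go on to treat 'p' as an extension letter with version 0p9.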

Re: [PATCH] arm: Fix ICE with CMSE nonsecure call on Armv8.1-M [PR100333]

2021-05-19 Thread Richard Earnshaw via Gcc-patches


ENOATTACHMENT.

On 19/05/2021 15:42, Alex Coplan via Gcc-patches wrote:

Hi Richard,

On 17/05/2021 17:31, Richard Earnshaw wrote:



On 30/04/2021 09:30, Alex Coplan via Gcc-patches wrote:

Hi,

As the PR shows, we ICE shortly after expanding nonsecure calls for
Armv8.1-M.  For Armv8.1-M, we have TARGET_HAVE_FPCXT_CMSE. As it stands,
the expander (arm.md:nonsecure_call_internal) moves the callee's address
to a register (with copy_to_suggested_reg) only if
!TARGET_HAVE_FPCXT_CMSE.

However, looking at the pattern which the insn appears to be intended to
match (thumb2.md:*nonsecure_call_reg_thumb2_fpcxt), it requires the
callee's address to be in a register.

This patch therefore just forces the callee's address into a register in
the expander.

Testing:
   * Regtested an arm-eabi cross configured with
   --with-arch=armv8.1-m.main+mve.fp+fp.dp --with-float=hard. No regressions.
   * Bootstrap and regtest on arm-linux-gnueabihf in progress.

OK for trunk and backports as appropriate if bootstrap looks good?

Thanks,
Alex

gcc/ChangeLog:

PR target/100333
* config/arm/arm.md (nonsecure_call_internal): Always ensure
callee's address is in a register.

gcc/testsuite/ChangeLog:

PR target/100333
* gcc.target/arm/cmse/pr100333.c: New test.




-  "
{
-if (!TARGET_HAVE_FPCXT_CMSE)
-  {
-   rtx tmp =
- copy_to_suggested_reg (XEXP (operands[0], 0),
-gen_rtx_REG (SImode, R4_REGNUM),
-SImode);
+rtx tmp = NULL_RTX;
+rtx addr = XEXP (operands[0], 0);

-   operands[0] = replace_equiv_address (operands[0], tmp);
-  }
-  }")
+if (TARGET_HAVE_FPCXT_CMSE && !REG_P (addr))
+  tmp = force_reg (SImode, addr);
+else if (!TARGET_HAVE_FPCXT_CMSE)
+  tmp = copy_to_suggested_reg (XEXP (operands[0], 0),
+  gen_rtx_REG (SImode, R4_REGNUM),
+  SImode);


I think it might be better to handle the !TARGET_HAVE_FPCXT_CMSE case via a
pseudo as well, then we don't end up generating a potentially non-trivial
insn that directly writes a fixed hard reg - it's better to let later passes
clean that up if they can.


Ah, I wasn't aware that was an issue.



Also, you've extracted XEXP (operands[0], 0) into 'addr', but then continue
to use the XEXP form in the existing path.  Please be consistent: use XEXP
directly everywhere, or use 'addr' everywhere.


Fixed, thanks.



So you want something like

   addr = XEXP (operands[0], 0);
   if (!REG_P (addr))
 addr = force_reg (SImode, addr);

   if (!T_H_F_C)
 addr = copy...(addr, gen(r4), SImode);

   operands[0] = replace_equiv_addr (operands[0], addr);

R.


How about the attached? Regtested an armv8.1-m.main cross, 
bootstrapped/regtested
on arm-linux-gnueabihf: no issues.

OK for trunk and eventual backports?

Thanks,
Alex

Re: [GCC-10 backport][PATCH] arm: _Generic feature failing with ICE for -O0 (pr97205).

2021-05-19 Thread Richard Earnshaw via Gcc-patches


Looking through the bugzilla logs shows:

  Since it is a gcc_checking_assert that triggers the ICE,
  I assumed that does not affect a release build,
  is that correct?

So it would appear that the decision was taken that a backport was not 
needed.


Have I missed something?

R.

On 19/05/2021 13:22, Srinath Parvathaneni via Gcc-patches wrote:

Ping!!


-----Original Message-----
From: Srinath Parvathaneni 
Sent: 30 April 2021 16:24
To: gcc-patches@gcc.gnu.org
Cc: Kyrylo Tkachov ; Richard Earnshaw

Subject: [GCC-10 backport][PATCH] arm: _Generic feature failing with ICE for
-O0 (pr97205).

Hi,

This is a backport to GCC-10 to fix PR97205, patch applies cleanly on the
branch.

Regression tested and found no issues.

Ok for GCC-10 backport?

Regards,
Srinath.

 This makes sure that stack allocated SSA_NAMEs are
 at least MODE_ALIGNED.  Also increase the MEM_ALIGN
 for the corresponding rtl objects.

 gcc:
 2020-11-03  Bernd Edlinger  

 PR target/97205
 * cfgexpand.c (align_local_variable): Make SSA_NAMEs
 at least MODE_ALIGNED.
 (expand_one_stack_var_at): Increase MEM_ALIGN for SSA_NAMEs.

 gcc/testsuite:
 2020-11-03  Bernd Edlinger  

 PR target/97205
 * gcc.c-torture/compile/pr97205.c: New test.

 (cherry picked from commit
23ac7a009ecfeec3eab79136abed8aac9768b458)


### Attachment also inlined for ease of reply
###


diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index bf4f194ed993134109cc21be9cb0ed8a5c170824..4fef5d6ebf420ce4d6f59606ecd064f45ae59065 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -366,7 +366,15 @@ align_local_variable (tree decl, bool really_expand)
unsigned int align;

if (TREE_CODE (decl) == SSA_NAME)
-align = TYPE_ALIGN (TREE_TYPE (decl));
+{
+  tree type = TREE_TYPE (decl);
+  machine_mode mode = TYPE_MODE (type);
+
+  align = TYPE_ALIGN (type);
+  if (mode != BLKmode
+ && align < GET_MODE_ALIGNMENT (mode))
+   align = GET_MODE_ALIGNMENT (mode);
+}
else
  {
align = LOCAL_DECL_ALIGNMENT (decl);
@@ -999,20 +1007,21 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
x = plus_constant (Pmode, base, offset);
x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
   ? TYPE_MODE (TREE_TYPE (decl))
-  : DECL_MODE (SSAVAR (decl)), x);
+  : DECL_MODE (decl), x);
+
+  /* Set alignment we actually gave this decl if it isn't an SSA name.
+ If it is we generate stack slots only accidentally so it isn't as
+ important, we'll simply set the alignment directly on the MEM.  */
+
+  if (base == virtual_stack_vars_rtx)
+offset -= frame_phase;
+  align = known_alignment (offset);
+  align *= BITS_PER_UNIT;
+  if (align == 0 || align > base_align)
+align = base_align;

if (TREE_CODE (decl) != SSA_NAME)
  {
-  /* Set alignment we actually gave this decl if it isn't an SSA name.
- If it is we generate stack slots only accidentally so it isn't as
-important, we'll simply use the alignment that is already set.  */
-  if (base == virtual_stack_vars_rtx)
-   offset -= frame_phase;
-  align = known_alignment (offset);
-  align *= BITS_PER_UNIT;
-  if (align == 0 || align > base_align)
-   align = base_align;
-
/* One would think that we could assert that we're not decreasing
 alignment here, but (at least) the i386 port does exactly this
 via the MINIMUM_ALIGNMENT hook.  */
@@ -1022,6 +1031,8 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
  }

set_rtl (decl, x);
+
+  set_mem_align (x, align);
  }

  class stack_vars_data
@@ -1327,13 +1338,11 @@ expand_one_stack_var_1 (tree var)
  {
tree type = TREE_TYPE (var);
size = tree_to_poly_uint64 (TYPE_SIZE_UNIT (type));
-  byte_align = TYPE_ALIGN_UNIT (type);
  }
else
-{
-  size = tree_to_poly_uint64 (DECL_SIZE_UNIT (var));
-  byte_align = align_local_variable (var, true);
-}
+size = tree_to_poly_uint64 (DECL_SIZE_UNIT (var));
+
+  byte_align = align_local_variable (var, true);

/* We handle highly aligned variables in expand_stack_vars.  */
gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr97205.c b/gcc/testsuite/gcc.c-torture/compile/pr97205.c
new file mode 100644
index ..6600011fcf84660edcba8d968c78ee6aaa0aa923
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr97205.c
@@ -0,0 +1,7 @@
+int a;
+typedef __attribute__((aligned(2))) int x;
+int f ()
+{
+  x b = a;
+  return b;
+}
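
As an illustration of the align_local_variable change in the patch above, here is a minimal sketch of the rule it adds for SSA_NAMEs: take the type's alignment, but raise it to the mode's alignment unless the mode is BLKmode. The function name and its plain-integer parameters are hypothetical stand-ins for TYPE_ALIGN, TYPE_MODE and GET_MODE_ALIGNMENT; the 32-bit mode alignment in the example is an assumption for illustration, not taken from any particular target.

```cpp
#include <cassert>

// Alignments are in bits, as in GCC.  IS_BLKMODE stands in for
// "TYPE_MODE (type) == BLKmode".
unsigned ssa_name_alignment(unsigned type_align_bits, bool is_blkmode,
                            unsigned mode_align_bits) {
  unsigned align = type_align_bits;
  // Only raise the alignment; never lower what the type already asks for.
  if (!is_blkmode && align < mode_align_bits)
    align = mode_align_bits;
  return align;
}
```

For the pr97205.c test case, where an int is declared with aligned(2), a call such as ssa_name_alignment(16, false, 32) would give 32 under the assumption above, so the stack slot is at least mode-aligned.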

Re: [PATCH] arm: Fix ICE with CMSE nonsecure call on Armv8.1-M [PR100333]

2021-05-19 Thread Alex Coplan via Gcc-patches

This time with attachment.

On 19/05/2021 15:42, Alex Coplan via Gcc-patches wrote:
> Hi Richard,
> 
> On 17/05/2021 17:31, Richard Earnshaw wrote:
> > 
> > 
> > On 30/04/2021 09:30, Alex Coplan via Gcc-patches wrote:
> > > Hi,
> > > 
> > > As the PR shows, we ICE shortly after expanding nonsecure calls for
> > > Armv8.1-M.  For Armv8.1-M, we have TARGET_HAVE_FPCXT_CMSE. As it stands,
> > > the expander (arm.md:nonsecure_call_internal) moves the callee's address
> > > to a register (with copy_to_suggested_reg) only if
> > > !TARGET_HAVE_FPCXT_CMSE.
> > > 
> > > However, looking at the pattern which the insn appears to be intended to
> > > match (thumb2.md:*nonsecure_call_reg_thumb2_fpcxt), it requires the
> > > callee's address to be in a register.
> > > 
> > > This patch therefore just forces the callee's address into a register in
> > > the expander.
> > > 
> > > Testing:
> > >   * Regtested an arm-eabi cross configured with
> > >   --with-arch=armv8.1-m.main+mve.fp+fp.dp --with-float=hard. No 
> > > regressions.
> > >   * Bootstrap and regtest on arm-linux-gnueabihf in progress.
> > > 
> > > OK for trunk and backports as appropriate if bootstrap looks good?
> > > 
> > > Thanks,
> > > Alex
> > > 
> > > gcc/ChangeLog:
> > > 
> > >   PR target/100333
> > >   * config/arm/arm.md (nonsecure_call_internal): Always ensure
> > >   callee's address is in a register.
> > > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > >   PR target/100333
> > >   * gcc.target/arm/cmse/pr100333.c: New test.
> > > 
> > 
> > 
> > -  "
> >{
> > -if (!TARGET_HAVE_FPCXT_CMSE)
> > -  {
> > -   rtx tmp =
> > - copy_to_suggested_reg (XEXP (operands[0], 0),
> > -gen_rtx_REG (SImode, R4_REGNUM),
> > -SImode);
> > +rtx tmp = NULL_RTX;
> > +rtx addr = XEXP (operands[0], 0);
> > 
> > -   operands[0] = replace_equiv_address (operands[0], tmp);
> > -  }
> > -  }")
> > +if (TARGET_HAVE_FPCXT_CMSE && !REG_P (addr))
> > +  tmp = force_reg (SImode, addr);
> > +else if (!TARGET_HAVE_FPCXT_CMSE)
> > +  tmp = copy_to_suggested_reg (XEXP (operands[0], 0),
> > +  gen_rtx_REG (SImode, R4_REGNUM),
> > +  SImode);
> > 
> > 
> > I think it might be better to handle the !TARGET_HAVE_FPCXT_CMSE case via a
> > pseudo as well, then we don't end up generating a potentially non-trivial
> > insn that directly writes a fixed hard reg - it's better to let later passes
> > clean that up if they can.
> 
> Ah, I wasn't aware that was an issue.
> 
> > 
> > Also, you've extracted XEXP (operands[0], 0) into 'addr', but then continue
> > > to use the XEXP form in the existing path.  Please be consistent: use XEXP
> > directly everywhere, or use 'addr' everywhere.
> 
> Fixed, thanks.
> 
> > 
> > So you want something like
> > 
> >   addr = XEXP (operands[0], 0);
> >   if (!REG_P (addr))
> > addr = force_reg (SImode, addr);
> > 
> >   if (!T_H_F_C)
> > addr = copy...(addr, gen(r4), SImode);
> > 
> >   operands[0] = replace_equiv_addr (operands[0], addr);
> > 
> > R.
> 
> How about the attached? Regtested an armv8.1-m.main cross, 
> bootstrapped/regtested
> on arm-linux-gnueabihf: no issues.
> 
> OK for trunk and eventual backports?
> 
> Thanks,
> Alex

-- 
Alex
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 45a471a887a..064604808cc 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -8580,18 +8580,21 @@ (define_expand "nonsecure_call_internal"
  (use (match_operand 2 "" ""))
  (clobber (reg:SI LR_REGNUM))])]
   "use_cmse"
-  "
   {
+rtx addr = XEXP (operands[0], 0);
+rtx tmp = REG_P (addr) ? addr : force_reg (SImode, addr);
+
 if (!TARGET_HAVE_FPCXT_CMSE)
   {
-   rtx tmp =
- copy_to_suggested_reg (XEXP (operands[0], 0),
-gen_rtx_REG (SImode, R4_REGNUM),
-SImode);
-
-   operands[0] = replace_equiv_address (operands[0], tmp);
+   rtx r4 = gen_rtx_REG (SImode, R4_REGNUM);
+   emit_move_insn (r4, tmp);
+   tmp = r4;
   }
-  }")
+
+if (tmp != addr)
+  operands[0] = replace_equiv_address (operands[0], tmp);
+  }
+)
 
 (define_insn "*call_reg_armv5"
   [(call (mem:SI (match_operand:SI 0 "s_register_operand" "r"))
diff --git a/gcc/testsuite/gcc.target/arm/cmse/pr100333.c 
b/gcc/testsuite/gcc.target/arm/cmse/pr100333.c
new file mode 100644
index 000..d8e3d809f73
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/cmse/pr100333.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-mcmse" } */
+typedef void __attribute__((cmse_nonsecure_call)) t(void);
+t g;
+void f() {
+  g();
+}
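
The control flow of the revised expander can be modelled outside GCC as follows. This is only a toy sketch: Operand, Emitter and legitimize_callee are hypothetical names, instructions are recorded as strings, and nothing here touches real RTL. It shows the two steps discussed in the thread: force a non-register address into a fresh pseudo first, then (only without FPCXT support) copy that pseudo into r4 with a plain move.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an rtx: just a name and a "is a register" flag.
struct Operand {
  bool is_reg;
  std::string name;
};

// Toy stand-in for the insn stream; records moves as text.
struct Emitter {
  std::vector<std::string> insns;
  int next_pseudo = 0;

  // Models force_reg: copy X into a new pseudo register.
  Operand force_reg(const Operand &x) {
    Operand r{true, "pseudo" + std::to_string(next_pseudo++)};
    insns.push_back(r.name + " = " + x.name);
    return r;
  }

  // Models emit_move_insn.
  void emit_move(const Operand &dst, const Operand &src) {
    insns.push_back(dst.name + " = " + src.name);
  }
};

// Returns the operand the call should use as the callee address.
Operand legitimize_callee(Emitter &e, const Operand &addr,
                          bool have_fpcxt_cmse) {
  // Step 1: the address always ends up in some register.
  Operand tmp = addr.is_reg ? addr : e.force_reg(addr);

  // Step 2: without FPCXT, the callee address must be in r4; the write
  // to the hard register is a trivial register-to-register move.
  if (!have_fpcxt_cmse) {
    Operand r4{true, "r4"};
    e.emit_move(r4, tmp);
    tmp = r4;
  }
  return tmp;
}
```

In the FPCXT case a symbolic address yields a single move into a pseudo; in the other case it yields that move followed by a copy into r4, which matches the shape Richard asked for.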

Re: [PATCH] arm: Fix ICE with CMSE nonsecure call on Armv8.1-M [PR100333]

2021-05-19 Thread Alex Coplan via Gcc-patches

Hi Richard,

On 17/05/2021 17:31, Richard Earnshaw wrote:
> 
> 
> On 30/04/2021 09:30, Alex Coplan via Gcc-patches wrote:
> > Hi,
> > 
> > As the PR shows, we ICE shortly after expanding nonsecure calls for
> > Armv8.1-M.  For Armv8.1-M, we have TARGET_HAVE_FPCXT_CMSE. As it stands,
> > the expander (arm.md:nonsecure_call_internal) moves the callee's address
> > to a register (with copy_to_suggested_reg) only if
> > !TARGET_HAVE_FPCXT_CMSE.
> > 
> > However, looking at the pattern which the insn appears to be intended to
> > match (thumb2.md:*nonsecure_call_reg_thumb2_fpcxt), it requires the
> > callee's address to be in a register.
> > 
> > This patch therefore just forces the callee's address into a register in
> > the expander.
> > 
> > Testing:
> >   * Regtested an arm-eabi cross configured with
> >   --with-arch=armv8.1-m.main+mve.fp+fp.dp --with-float=hard. No regressions.
> >   * Bootstrap and regtest on arm-linux-gnueabihf in progress.
> > 
> > OK for trunk and backports as appropriate if bootstrap looks good?
> > 
> > Thanks,
> > Alex
> > 
> > gcc/ChangeLog:
> > 
> > PR target/100333
> > * config/arm/arm.md (nonsecure_call_internal): Always ensure
> > callee's address is in a register.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR target/100333
> > * gcc.target/arm/cmse/pr100333.c: New test.
> > 
> 
> 
> -  "
>{
> -if (!TARGET_HAVE_FPCXT_CMSE)
> -  {
> - rtx tmp =
> -   copy_to_suggested_reg (XEXP (operands[0], 0),
> -  gen_rtx_REG (SImode, R4_REGNUM),
> -  SImode);
> +rtx tmp = NULL_RTX;
> +rtx addr = XEXP (operands[0], 0);
> 
> - operands[0] = replace_equiv_address (operands[0], tmp);
> -  }
> -  }")
> +if (TARGET_HAVE_FPCXT_CMSE && !REG_P (addr))
> +  tmp = force_reg (SImode, addr);
> +else if (!TARGET_HAVE_FPCXT_CMSE)
> +  tmp = copy_to_suggested_reg (XEXP (operands[0], 0),
> +gen_rtx_REG (SImode, R4_REGNUM),
> +SImode);
> 
> 
> I think it might be better to handle the !TARGET_HAVE_FPCXT_CMSE case via a
> pseudo as well, then we don't end up generating a potentially non-trivial
> insn that directly writes a fixed hard reg - it's better to let later passes
> clean that up if they can.

Ah, I wasn't aware that was an issue.

> 
> Also, you've extracted XEXP (operands[0], 0) into 'addr', but then continue
> to use the XEXP form in the existing path.  Please be consistent: use XEXP
> directly everywhere, or use 'addr' everywhere.

Fixed, thanks.

> 
> So you want something like
> 
>   addr = XEXP (operands[0], 0);
>   if (!REG_P (addr))
> addr = force_reg (SImode, addr);
> 
>   if (!T_H_F_C)
> addr = copy...(addr, gen(r4), SImode);
> 
>   operands[0] = replace_equiv_addr (operands[0], addr);
> 
> R.

How about the attached? Regtested an armv8.1-m.main cross,
bootstrapped/regtested on arm-linux-gnueabihf: no issues.

OK for trunk and eventual backports?

Thanks,
Alex

Re: [PATCH] arm/testsuite: Fix testcase for PR99977

2021-05-19 Thread Richard Earnshaw via Gcc-patches





On 19/05/2021 09:10, Christophe Lyon via Gcc-patches wrote:

Some targets (eg arm-none-uclinuxfdpiceabi) do not support Thumb-1,
and since the testcase forces -march=armv8-m.base, we need to check
whether this option is actually supported.

Using dg-add-options arm_arch_v8m_base ensures that we pass -mthumb as
needed too.

2021-05-19  Christophe Lyon  

PR 99977
gcc/testsuite/
* gcc.target/arm/pr99977.c: Require arm_arch_v8m_base.
---
  gcc/testsuite/gcc.target/arm/pr99977.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr99977.c 
b/gcc/testsuite/gcc.target/arm/pr99977.c
index 7911899d928..db330e4a4a3 100644
--- a/gcc/testsuite/gcc.target/arm/pr99977.c
+++ b/gcc/testsuite/gcc.target/arm/pr99977.c
@@ -1,5 +1,7 @@
  /* { dg-do compile } */
-/* { dg-options "-march=armv8-m.base -mfloat-abi=soft -O2" } */
+/* { dg-require-effective-target arm_arch_v8m_base_ok } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_arch_v8m_base } */
  _Bool f1(int *p) { return __sync_bool_compare_and_swap (p, -1, 2); }
  _Bool f2(int *p) { return __sync_bool_compare_and_swap (p, -8, 2); }
  int g1(int *p) { return __sync_val_compare_and_swap (p, -1, 2); }



OK.

R.

Re: [PATCH] libcpp: Fix up -fdirectives-only handling of // comments on last line not terminated with newline [PR100646]

2021-05-19 Thread Marek Polacek via Gcc-patches

On Wed, May 19, 2021 at 09:54:07AM +0200, Jakub Jelinek wrote:
> Hi!
> 
> As can be seen on the testcases, before the -fdirectives-only preprocessing
> rewrite the preprocessor would assume // comments are terminated by the
> end of file even when newline wasn't there, but now we error out.
> The following patch restores the previous behavior.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2021-05-19  Jakub Jelinek  
> 
>   PR preprocessor/100646
>   * lex.c (cpp_directive_only_process): Treat end of file as termination
>   for !is_block comments.
> 
>   * gcc.dg/cpp/pr100646-1.c: New test.
>   * gcc.dg/cpp/pr100646-2.c: New test.
> 
> --- libcpp/lex.c.jj   2021-05-12 15:13:57.0 +0200
> +++ libcpp/lex.c  2021-05-18 19:48:04.211383565 +0200
> @@ -4480,6 +4480,8 @@ cpp_directive_only_process (cpp_reader *
>   break;
> }
> }
> + if (pos >= limit && !is_block)
> +   goto done_comment;

The goto seems unnecessary since we're only skipping...

>   cpp_error_with_line (pfile, CPP_DL_ERROR, sloc, 0,
>"unterminated comment");

...this line, so just adjust the condition?  Were you just trying to be
consistent with the gotos above?

The patch is OK either way.

> done_comment:

Marek
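
The behaviour under discussion can be sketched in isolation: a // comment is considered terminated by either a newline or the end of the buffer, whereas a /* comment that reaches the end of the buffer is still an error. This is a simplified model, not the libcpp code; skip_comment and CommentStatus are hypothetical names, and backslash-newline continuations are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class CommentStatus { Terminated, Unterminated };

// POS must point just past the opening "//" or "/*".  On return it
// points past the comment (or at the end of the buffer).
CommentStatus skip_comment(const std::string &buf, std::size_t &pos,
                           bool is_block) {
  if (is_block) {
    while (pos + 1 < buf.size()) {
      if (buf[pos] == '*' && buf[pos + 1] == '/') {
        pos += 2;
        return CommentStatus::Terminated;
      }
      ++pos;
    }
    pos = buf.size();
    return CommentStatus::Unterminated;  // would report "unterminated comment"
  }

  // Line comment: stop at the newline, or at end of file if there is none.
  while (pos < buf.size() && buf[pos] != '\n')
    ++pos;
  return CommentStatus::Terminated;
}
```

Folding the end-of-file case into the condition that guards the error, as suggested above, would be equivalent to the early return for line comments in this sketch.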

Re: [PATCH] PR tree-optimization/100512: Once a range becomes constant, make it invariant.

2021-05-19 Thread Andrew MacLeod via Gcc-patches


On 5/19/21 5:13 AM, Richard Biener wrote:

On Tue, May 18, 2021 at 6:35 PM Andrew MacLeod  wrote:

On 5/18/21 3:22 AM, Richard Biener wrote:

On Tue, May 18, 2021 at 1:23 AM Andrew MacLeod via Gcc-patches
 wrote:

The code in PR 100512 triggers an interaction between ranger and the
propagation engine related to undefined values.

I put the detailed analysis in the PR, but it boils down to the early
VRP pass has concluded that something is a constant and can be replaced,
and removes the definition expecting the constant to be propagated
everywhere.


If the code is in an undefined region that the CFG is going to remove,
we can find impossible situations, and ranger then changes that value to
UNDEFINED, because, well, it is.  But then the propagation engine
panics because it doesn't have a constant any more, so it doesn't replace it,
and now we have a used but not defined value.

Once we get to a globally constant range where further refinements can
only end up in an UNDEFINED state, stop further evaluating the range.
This is typically in places which are about to be removed by CFG cleanup
anyway, and it will make the propagation engine happy with no surprises.

Yeah, the propagation engine and EVRP as I know it relies on not visiting
"unexecutable" (as figured by analysis) paths in the CFG and thus considering
edges coming from such regions not contributing conditions/values/etc. that
would cause such "undefinedness" to appear.  Not sure how it works with
ranger, maybe that can as well get a mode where it does only traverse
EDGE_EXECUTABLE edges.  Might be a bit difficult since IIRC it works
with SSA edges and not CFG edges.


Well it does do CFG based work as well, and I do not currently check
EDGE_EXECUTABLE...   I just tried checking the EDGE_EXECUTABLE flag and
not processing it, but it doesn't resolve the problem.  I think it's
because the edge has not been determined unexecutable until after the
pass is done, which is too late.


Bootstraps on x86_64-pc-linux-gnu with no regressions, and fixes the PR.

So that means the lattice isn't an optimistic lattice, right?  EVRPs wasn't
optimistic either, but VRPs is/was.  Whatever this means in this context ;)



It is optimistic, this just tells it to stop being optimistic if we get
to a constant so we don't mess up propagation.

Guess I'm confused - in classical terms an optimistic lattice
propagator only allows downward transitions UNDEF -> CONSTANT -> VARYING
while a non-optimistic one doesn't need to iterate and thus by definition
has no lattice "transition" other than from the initial VARYING to the
possibly non-VARYING but final state.


Well, it's not really a lattice, but to use those terms, we have aspects
of both.


We start with the global range which is calculated as best we can. That 
forms the initial value (which may well be varying).  We may later find 
a better range for one or more inputs and recalculate that global range 
and refine it.  It only ever moves towards the UNDEFINED state from that 
point..


When we initially propagate a value through the CFG via the on-entry 
cache, we start with all affected blocks at UNDEFINED, mark the block(s) 
which may generate/affect a range. We then calculate the range at each 
of those points, and push those values thru the CFG.  The range on entry
at each block is the union of all ranges on predecessor edges.  This 
means the propagator iteratively moves things from UNDEFINED towards the 
initial global value...  which provides the optimistic aspect of the 
algorithm when setting the values in blocks.


If we later update the range of the ssa-name because we've discovered a 
more refined range for an input, we "push" that new range into the 
cache.  That change is propagated thru the affected blocks in the CFG. 
We look at each successor and see if this new value would affect its 
value, and push it there if needed.  That is the iterative aspect. At 
this point, we are back to taking an existing known range, and further 
refining it.


The on-demand part is that we only visit blocks that are needed between 
the request and the Definition. We may later ask for a range in a 
different part of the CFG, and we do the initial propagation from 
UNDEFINED on just those blocks which haven't been previously calculated
and do the optimistic propagation.


That is what is happening in that PR.  We have started with a range,
evolved it to a constant, and then further refined it to undefined. In
general terms, I would say that is our model, but we have an optimistic
propagator which catches all the stuff VRP does with its optimistic lattice.


I will do a full on writeup of this eventually...  complete with 
drawings :-)


Andrew
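
Until that writeup exists, the two directions of movement described above can be sketched with a toy interval type. Everything here is hypothetical and much simpler than ranger: Range is a single closed interval, the union is approximated by the enclosing interval, and refine_global is only an illustration of the "once constant, stay constant" policy from the patch subject.

```cpp
#include <algorithm>
#include <cassert>

// A single closed interval [lo, hi], or UNDEFINED (the empty set).
struct Range {
  bool undefined = true;
  long lo = 0;
  long hi = 0;

  static Range make(long l, long h) {
    Range r;
    r.undefined = false;
    r.lo = l;
    r.hi = h;
    return r;
  }
  bool singleton() const { return !undefined && lo == hi; }
};

// On-entry propagation: start at UNDEFINED and union in each
// predecessor edge, moving towards the global value.
Range range_union(const Range &a, const Range &b) {
  if (a.undefined) return b;
  if (b.undefined) return a;
  return Range::make(std::min(a.lo, b.lo), std::max(a.hi, b.hi));
}

// Refinement: intersecting can only shrink a range, possibly all the
// way down to UNDEFINED.
Range range_intersect(const Range &a, const Range &b) {
  if (a.undefined || b.undefined) return Range();
  long lo = std::max(a.lo, b.lo);
  long hi = std::min(a.hi, b.hi);
  return lo > hi ? Range() : Range::make(lo, hi);
}

// Policy sketch: once the global range is a constant, treat it as
// invariant, since any further refinement could only yield UNDEFINED.
Range refine_global(const Range &current, const Range &incoming) {
  if (current.singleton())
    return current;
  return range_intersect(current, incoming);
}
```

In this model a global range of [0, 10] refined with [5, 5] becomes the constant 5; a later refinement with [7, 9] would intersect to UNDEFINED, but refine_global leaves the constant in place so a propagator that already replaced uses with 5 sees no surprise.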

Re: [PATCH] improve warning suppression for inlined functions (PR 98465, 98512)

2021-05-19 Thread David Malcolm via Gcc-patches

On Thu, 2021-01-21 at 16:46 -0700, Martin Sebor via Gcc-patches wrote:

Martin and I had a chat about this patch, but it's best to discuss code
on the mailing list rather than in a silo, so here goes...

> The initial patch I posted is missing initialization for a couple
> of locals.  I'd noticed it in testing but forgot to add the fix to
> the patch before posting it.  I have corrected that in the updated
> revision and also added the test case from pr98512, and retested
> the whole thing on x86_64-linux.
> 
> On 1/19/21 11:58 AM, Martin Sebor wrote:
> > std::string tends to trigger a class of false positive out of bounds
> > access warnings for code GCC cannot prove is unreachable because of
> > > missing aliasing constraints, and that ends up expanded inline into
> > user code.  Simply inserting the contents of a constant char array
> > does that.  In GCC 10 these false positives are suppressed due to
> > -Wno-system-headers, but in GCC 11, to help detect calls rendered
> > invalid by user code passing in either incorrect or insufficiently
> > > constrained arguments, -Wno-system-headers no longer has this effect
> > on invalid access warnings.
> > 
> > To solve the problem without at least partially reverting the change
> > and going back to the GCC 10 way of things for the affected subset
> > of calls (just memcpy and memmove), the attached patch enhances
> > the #pragma GCC diagnostic machinery to consider not just a single
> > location for inlined code but all locations at which an expression
> > and its callers are inlined all the way up the stack.  This gives
> > each author of a function involved in inlining the ability to
> > control a warning issued for the code, not just the user into whose
> > code all the calls end up inlined.  To resolve PR 98465, it lets us
> > suppress the false positives selectively in std::string rather
> > than across the board in GCC.

I like the idea of checking the whole of the inlining stack for
pragmas, but I don't like the way the patch implements it.

The patch provides a hook for getting a vec of locations for a
diagnostic for use when checking for pragmas, and uses the hook on a
few specific diagnostics.

Why wouldn’t we always do this?  It seems to me like something we
should always do when there's inlining information associated with a
location, rather than being a per-diagnostic thing - but maybe there's
something I'm missing here.  The patch adds diag_inlining_context
instances on the stack in various places, and doing that feels to me
like a special-case hack, when it should be fixed more generally in
diagnostic.c.

I don't like attaching the "override the location" hook to the
diagnostic_metadata; I intended the latter to be about diagnostic
taxonomies (CWE, coding standards, etc), rather than a place to stash
location overrides.

One issue is that the core of the diagnostics subsystem doesn't have
knowledge of "tree", but the inlining information requires tree-ish
knowledge.

It's still possible to get at the inlining information from
diagnostic.c, but only as a void *:

input.h has:

#define LOCATION_BLOCK(LOC) \
  ((tree) ((IS_ADHOC_LOC (LOC)) ? get_data_from_adhoc_loc (line_table, (LOC)) \
   : NULL))

Without knowing about "tree", diagnostic.c could still query a
location_t to get at the data as a void *:

  if (IS_ADHOC_LOC (loc))
    return get_data_from_adhoc_loc (line_table, loc);
  else
    return NULL;

If we make the "get all the pertinent locations" hook a part of the
diagnostic_context, we could have diagnostic_report_diagnostic check to
see if there's ad-hoc data associated with the location and a non-NULL
hook on the context, and if so, call it.  This avoids adding an
indirect call for the common case where there isn't any inlining
information, and lets us stash the implementation of the hook in
tree-diagnostic.c, keeping trees separate from diagnostic.c.
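
To make the shape of that concrete, here is a minimal standalone sketch in C. It is a toy model, not GCC code: loc_t, ADHOC_BIT, adhoc_data, struct diag_context, struct inline_block and every function name below are made up for illustration, standing in for location_t, IS_ADHOC_LOC, get_data_from_adhoc_loc and diagnostic_context. The point is only that the generic side handles the ad-hoc data as a void * and makes the indirect call solely when both the data and the hook are present.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these are GCC's real declarations.  */
typedef unsigned int loc_t;

#define ADHOC_BIT 0x80000000u
#define MAX_LOCS 8

struct loc_vec
{
  loc_t locs[MAX_LOCS];
  size_t n;
};

/* Generic side: knows nothing about the type behind DATA.  */
struct diag_context
{
  void (*get_all_locations) (loc_t loc, void *data, struct loc_vec *out);
};

/* Toy ad-hoc table: a location with ADHOC_BIT set carries a pointer.  */
static void *adhoc_data[4];

static void *
get_adhoc_data (loc_t loc)
{
  if (loc & ADHOC_BIT)
    return adhoc_data[loc & 3u];
  return NULL;
}

/* Collect LOC itself, then let the hook append the inlining stack, but
   only if there is ad-hoc data and a hook has been installed.  */
static void
collect_locations (const struct diag_context *ctx, loc_t loc,
                   struct loc_vec *out)
{
  out->n = 0;
  out->locs[out->n++] = loc;

  void *data = get_adhoc_data (loc);
  if (data != NULL && ctx->get_all_locations != NULL)
    ctx->get_all_locations (loc, data, out);
}

/* "Tree" side: the only code that knows what DATA points to.  */
struct inline_block
{
  loc_t call_site;
  struct inline_block *outer;
};

static void
tree_side_hook (loc_t loc, void *data, struct loc_vec *out)
{
  (void) loc;
  for (struct inline_block *b = data;
       b != NULL && out->n < MAX_LOCS; b = b->outer)
    out->locs[out->n++] = b->call_site;
}
```

With tree_side_hook installed and a two-deep chain of inline_block entries behind an ad-hoc location, collect_locations returns three locations; for a plain location, or with a NULL hook, it returns just the one and never makes the indirect call.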

One design question here is: what if there are multiple pragmas on the
inlining stack, e.g. explicitly enabling a warning at one place, and
explicitly ignoring that warning in another?  I don't think it would
happen in the cases you're interested in, but it seems worth
considering.  Perhaps the closest place to the user's code "wins".
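
As a rough sketch of that "closest to the user's code wins" policy (only one possible answer to the question, and not something the patch implements), assume the locations are ordered innermost inlined callee first and the user's code last, and that each one carries a per-warning pragma state. The enum and function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical per-location pragma state for one warning option.  */
enum pragma_state
{
  PRAGMA_UNSET,    /* no #pragma GCC diagnostic applies here */
  PRAGMA_IGNORED,  /* "ignored" is in effect */
  PRAGMA_WARNING   /* "warning" is in effect */
};

/* STACK[0] is the innermost inlined location, STACK[N - 1] the
   outermost one, i.e. the user's code.  Scan from the user's end and
   return the first explicit setting found.  */
static enum pragma_state
resolve_pragma_state (const enum pragma_state *stack, size_t n)
{
  for (size_t i = n; i-- > 0; )
    if (stack[i] != PRAGMA_UNSET)
      return stack[i];
  return PRAGMA_UNSET;
}
```

Under this policy a library header that ignores the warning still yields to a user who explicitly enables it at the outermost call site, while a user who says nothing inherits the library's suppression.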

> > 
> > The solution is to provide a new pair of overloads for warning
> > functions that, instead of taking a single location argument, take
> > a tree node from which the location(s) are determined.  The tree
> > argument is indirect because the diagnostic machinery doesn't (and
> > cannot without more intrusive changes) at the moment depend on
> > the various tree definitions.  A nice feature of these overloads
> > is that they do away with the need for the %K directive (and in
> > the future also %G, with another enhancement to accept a gimple*
> > argument).

We were chatting about this, and I believe we don't need the new
overloads or the %K and %G directives: all of the information about
inlining can be got from the location_t via

Re: [committed] libstdc++: Fix std::jthread assertion and re-enable skipped test

2021-05-19 Thread Bernd Edlinger

On 5/19/21 3:27 PM, Jonathan Wakely wrote:
> On 18/05/21 13:58 +0200, Bernd Edlinger wrote:
>> On 5/18/21 1:55 PM, Bernd Edlinger wrote:
>>> On 5/17/21 7:13 PM, Jonathan Wakely via Gcc-patches wrote:
>>>> libstdc++-v3/ChangeLog:
>>>>
>>>> * include/std/thread (jthread::_S_create): Fix static assert
>>>> message.
>>>> * testsuite/30_threads/jthread/95989.cc: Re-enable test.
>>>> * testsuite/30_threads/jthread/jthread.cc: Do not require
>>>> pthread effective target.
>>>> * testsuite/30_threads/jthread/2.cc: Moved to...
>>>> * testsuite/30_threads/jthread/version.cc: ...here.
>>>>
>>>> Tested powerpc64le-linux. Committed to trunk.
>>>>
>>>> Let's see if this test is actually fixed, or if it still causes
>>>> failures on some targets.
>>>>
>>>
>>> Yes, indeed it is failing on x86_64-pc-linux-gnu.
>>>
>>
>> that means only this one:
>>
>> FAIL: 30_threads/jthread/95989.cc execution test
> 
> What's your glibc version?
> 
> 

that is ubuntu 20.04 with latest patches:

$ ldd --version
ldd (Ubuntu GLIBC 2.31-0ubuntu9.3) 2.31
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

Bernd.

Re: [PATCH] aarch64: Use correct type attributes for RTL generating XTN(2)

2021-05-19 Thread Richard Sandiford via Gcc-patches

Jonathan Wright  writes:
> Hi,
>
> As subject, this patch corrects the type attribute in RTL patterns that
> generate XTN/XTN2 instructions to be "neon_move_narrow_q".
>
> This makes a material difference because these instructions can be
> executed on both SIMD pipes in the Cortex-A57 core model, whereas the
> "neon_shift_imm_narrow_q" attribute (in use until now) would suggest
> to the scheduler that they could only execute on one of the two
> pipes.
>
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.
>
> Ok for master?

OK, thanks.

Richard

> Thanks,
> Jonathan
>
> ---
>
> gcc/ChangeLog:
>
> 2021-05-18  Jonathan Wright  
>
> * config/aarch64/aarch64-simd.md: Use "neon_move_narrow_q"
> type attribute in patterns generating XTN(2).
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 
> 447b5575f2f5adbad4957e90787a4954af644b20..e750faed1dbd940cdfa216d858b98f3bc25bba42
>  100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -1697,7 +1697,7 @@
>   (truncate: (match_operand:VQN 1 "register_operand" "w")))]
>"TARGET_SIMD"
>"xtn\\t%0., %1."
> -  [(set_attr "type" "neon_shift_imm_narrow_q")]
> +  [(set_attr "type" "neon_move_narrow_q")]
>  )
>  
>  (define_insn "aarch64_xtn2_le"
> @@ -1707,7 +1707,7 @@
> (truncate: (match_operand:VQN 2 "register_operand" "w"]
>"TARGET_SIMD && !BYTES_BIG_ENDIAN"
>"xtn2\t%0., %2."
> -  [(set_attr "type" "neon_shift_imm_narrow_q")]
> +  [(set_attr "type" "neon_move_narrow_q")]
>  )
>  
>  (define_insn "aarch64_xtn2_be"
> @@ -1717,7 +1717,7 @@
> (match_operand: 1 "register_operand" "0")))]
>"TARGET_SIMD && BYTES_BIG_ENDIAN"
>"xtn2\t%0., %2."
> -  [(set_attr "type" "neon_shift_imm_narrow_q")]
> +  [(set_attr "type" "neon_move_narrow_q")]
>  )
>  
>  (define_expand "aarch64_xtn2"
> @@ -8618,7 +8618,7 @@
>   (truncate: (match_operand:VQN 1 "register_operand" "w")))]
>"TARGET_SIMD"
>"xtn\t%0., %1."
> -  [(set_attr "type" "neon_shift_imm_narrow_q")]
> +  [(set_attr "type" "neon_move_narrow_q")]
>  )
>  
>  (define_insn "aarch64_bfdot"

Re: [PATCH] aarch64: Use an expander for quad-word vec_pack_trunc pattern

2021-05-19 Thread Richard Sandiford via Gcc-patches

Jonathan Wright  writes:
> Hi,
>
> The existing vec_pack_trunc RTL pattern emits an opaque two-
> instruction assembly code sequence that prevents proper instruction
> scheduling. This commit changes the pattern to an expander that emits
> individual xtn and xtn2 instructions.
>
> This commit also consolidates the duplicate truncation patterns.
>
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.
>
> Ok for master?

OK.  Nice clean-up, thanks.
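
For readers following along, a scalar model of what the two-instruction sequence computes may help. This is only an illustrative sketch in plain C, assuming little-endian lane numbering and the 8H to 16B case; the model_* function names are made up and this is not how the pattern is written in aarch64-simd.md.

```c
#include <stdint.h>

/* xtn: keep the low 8 bits of each 16-bit lane, write the results to
   the low half of the destination and zero the high half.  */
static void
model_xtn (uint8_t dst[16], const uint16_t src[8])
{
  for (int i = 0; i < 8; i++)
    dst[i] = (uint8_t) src[i];
  for (int i = 8; i < 16; i++)
    dst[i] = 0;
}

/* xtn2: same narrowing, but write to the high half and leave the low
   half of the destination untouched.  */
static void
model_xtn2 (uint8_t dst[16], const uint16_t src[8])
{
  for (int i = 0; i < 8; i++)
    dst[8 + i] = (uint8_t) src[i];
}

/* Quad-word pack-and-truncate as two separate steps, which is what
   the expander form lets the scheduler see.  */
static void
model_vec_pack_trunc (uint8_t dst[16], const uint16_t a[8],
                      const uint16_t b[8])
{
  model_xtn (dst, a);
  model_xtn2 (dst, b);
}
```

The second step reads and writes the same destination register, which matches the "0" tied operand in the xtn2 patterns.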

Richard

> Thanks,
> Jonathan
>
> ---
>
> gcc/ChangeLog:
>
> 2021-05-17  Jonathan Wright  
>
> * config/aarch64/aarch64-simd.md (aarch64_simd_vec_pack_trunc_):
> Remove as duplicate of...
> (aarch64_xtn): This.
> (aarch64_xtn2_le): Move position in file.
> (aarch64_xtn2_be): Move position in file.
> (aarch64_xtn2): Move position in file.
> (vec_pack_trunc_): Define as an expander.
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 
> c67fa3fb6f0ca0a181a09a42677526d12e955c06..447b5575f2f5adbad4957e90787a4954af644b20
>  100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -1691,14 +1691,51 @@
>  ;; Narrowing operations.
>  
>  ;; For doubles.
> -(define_insn "aarch64_simd_vec_pack_trunc_"
> - [(set (match_operand: 0 "register_operand" "=w")
> -   (truncate: (match_operand:VQN 1 "register_operand" "w")))]
> - "TARGET_SIMD"
> - "xtn\\t%0., %1."
> +
> +(define_insn "aarch64_xtn"
> +  [(set (match_operand: 0 "register_operand" "=w")
> + (truncate: (match_operand:VQN 1 "register_operand" "w")))]
> +  "TARGET_SIMD"
> +  "xtn\\t%0., %1."
>[(set_attr "type" "neon_shift_imm_narrow_q")]
>  )
>  
> +(define_insn "aarch64_xtn2_le"
> +  [(set (match_operand: 0 "register_operand" "=w")
> + (vec_concat:
> +   (match_operand: 1 "register_operand" "0")
> +   (truncate: (match_operand:VQN 2 "register_operand" "w"]
> +  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
> +  "xtn2\t%0., %2."
> +  [(set_attr "type" "neon_shift_imm_narrow_q")]
> +)
> +
> +(define_insn "aarch64_xtn2_be"
> +  [(set (match_operand: 0 "register_operand" "=w")
> + (vec_concat:
> +   (truncate: (match_operand:VQN 2 "register_operand" "w"))
> +   (match_operand: 1 "register_operand" "0")))]
> +  "TARGET_SIMD && BYTES_BIG_ENDIAN"
> +  "xtn2\t%0., %2."
> +  [(set_attr "type" "neon_shift_imm_narrow_q")]
> +)
> +
> +(define_expand "aarch64_xtn2"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand: 1 "register_operand")
> +   (truncate: (match_operand:VQN 2 "register_operand"))]
> +  "TARGET_SIMD"
> +  {
> +if (BYTES_BIG_ENDIAN)
> +  emit_insn (gen_aarch64_xtn2_be (operands[0], operands[1],
> +  operands[2]));
> +else
> +  emit_insn (gen_aarch64_xtn2_le (operands[0], operands[1],
> +  operands[2]));
> +DONE;
> +  }
> +)
> +
>  (define_expand "vec_pack_trunc_"
>   [(match_operand: 0 "register_operand")
>(match_operand:VDN 1 "register_operand")
> @@ -1711,7 +1748,7 @@
>  
>emit_insn (gen_move_lo_quad_ (tempreg, operands[lo]));
>emit_insn (gen_move_hi_quad_ (tempreg, operands[hi]));
> -  emit_insn (gen_aarch64_simd_vec_pack_trunc_ (operands[0], tempreg));
> +  emit_insn (gen_aarch64_xtn (operands[0], tempreg));
>DONE;
>  })
>  
> @@ -1901,20 +1938,25 @@
>  
>  ;; For quads.
>  
> -(define_insn "vec_pack_trunc_"
> - [(set (match_operand: 0 "register_operand" "=&w")
> +(define_expand "vec_pack_trunc_"
> + [(set (match_operand: 0 "register_operand")
> (vec_concat:
> -  (truncate: (match_operand:VQN 1 "register_operand" "w"))
> -  (truncate: (match_operand:VQN 2 "register_operand" "w"]
> +  (truncate: (match_operand:VQN 1 "register_operand"))
> +  (truncate: (match_operand:VQN 2 "register_operand"]
>   "TARGET_SIMD"
>   {
> +   rtx tmpreg = gen_reg_rtx (mode);
> +   int lo = BYTES_BIG_ENDIAN ? 2 : 1;
> +   int hi = BYTES_BIG_ENDIAN ? 1 : 2;
> +
> +   emit_insn (gen_aarch64_xtn (tmpreg, operands[lo]));
> +
> if (BYTES_BIG_ENDIAN)
> - return "xtn\\t%0., %2.\;xtn2\\t%0., %1.";
> + emit_insn (gen_aarch64_xtn2_be (operands[0], tmpreg, 
> operands[hi]));
> else
> - return "xtn\\t%0., %1.\;xtn2\\t%0., %2.";
> + emit_insn (gen_aarch64_xtn2_le (operands[0], tmpreg, 
> operands[hi]));
> +   DONE;
>   }
> -  [(set_attr "type" "multiple")
> -   (set_attr "length" "8")]
>  )
>  
>  ;; Widening operations.
> @@ -8570,13 +8612,6 @@
>""
>  )
>  
> -(define_expand "aarch64_xtn"
> -  [(set (match_operand: 0 "register_operand" "=w")
> - (truncate: (match_operand:VQN 1 "register_operand" "w")))]
> -  "TARGET_SIMD"
> -  ""
> -)
> -
>  ;; Truncate a 128-bit integer vector to a 64-bit vector.
>  (define_insn "trunc2"
>[(set (match_operand: 0 "register_operand" "=w")
> @@ -8586,42 +8621,6 @@
>[(set_attr "type" "neon_shift_imm_narrow_q")]
>  )
>  
> -(de

Re: [PATCH] vect: Replace hardcoded weight factor with param

2021-05-19 Thread Segher Boessenkool

On Wed, May 19, 2021 at 10:15:49AM +0200, Richard Biener wrote:
> On Wed, May 19, 2021 at 8:20 AM Kewen.Lin  wrote:
> "weight_factor" is kind-of double-speak

"Weighting factor" (with -ing) is a standard term actually.  (But
cost_factor of course is better and avoids all that :-) )


Segher

Re: [PATCH v4 12/12] constructor: Check if it is faster to load constant from memory

2021-05-19 Thread Bernd Edlinger

On 5/19/21 3:22 PM, H.J. Lu wrote:
> On Wed, May 19, 2021 at 2:33 AM Richard Biener
>  wrote:
>>
>> On Tue, May 18, 2021 at 9:16 PM H.J. Lu  wrote:
>>>
>>> When expanding a constant constructor, don't call expand_constructor if
>>> it is more efficient to load the data from the memory via move by pieces.
>>>
>>> gcc/
>>>
>>> PR middle-end/90773
>>> * expr.c (expand_expr_real_1): Don't call expand_constructor if
>>> it is more efficient to load the data from the memory.
>>>
>>> gcc/testsuite/
>>>
>>> PR middle-end/90773
>>> * gcc.target/i386/pr90773-24.c: New test.
>>> * gcc.target/i386/pr90773-25.c: Likewise.
>>> ---
>>>  gcc/expr.c | 10 ++
>>>  gcc/testsuite/gcc.target/i386/pr90773-24.c | 22 ++
>>>  gcc/testsuite/gcc.target/i386/pr90773-25.c | 20 
>>>  3 files changed, 52 insertions(+)
>>>  create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-24.c
>>>  create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-25.c
>>>
>>> diff --git a/gcc/expr.c b/gcc/expr.c
>>> index d09ee42e262..80e01ea1cbe 100644
>>> --- a/gcc/expr.c
>>> +++ b/gcc/expr.c
>>> @@ -10886,6 +10886,16 @@ expand_expr_real_1 (tree exp, rtx target, 
>>> machine_mode tmode,
>>> unsigned HOST_WIDE_INT ix;
>>> tree field, value;
>>>
>>> +   /* Check if it is more efficient to load the data from
>>> +  the memory directly.  FIXME: How many stores do we
>>> +  need here if not moved by pieces?  */
>>> +   unsigned HOST_WIDE_INT bytes
>>> + = tree_to_uhwi (TYPE_SIZE_UNIT (type));
>>
>> that's prone to fail - it could be a VLA.
> 
> What do you mean by fail?  Is it ICE or missed optimization?
> Do you have a testcase?
> 

I think for a VLA the TYPE_SIZE_UNIT may be unknown (NULL), or
something like "x".

For instance, something like:

int test (int x)
{
  int vla[x];

  vla[x-1] = 0;
  return vla[x-1];
}
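
A standalone way to see that the size is only known at run time (a sketch at the C level, not a GCC testcase; the function name is made up, and x must be positive):

```c
#include <stddef.h>

/* sizeof applied to a VLA is evaluated at run time, so the result
   depends on X rather than being a compile-time constant.  */
static size_t
vla_size_in_bytes (int x)
{
  int vla[x];

  vla[x - 1] = 0;
  return sizeof vla;
}
```

Inside the compiler that presumably means the size is not a constant that tree_to_uhwi can convert, so a check such as tree_fits_uhwi_p on TYPE_SIZE_UNIT before the conversion would be the cautious thing to do. I have not tested whether this path is actually reachable with a VLA type.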


Bernd.

>>
>>> +   if ((bytes / UNITS_PER_WORD) > 2
>>> +   && MOVE_MAX_PIECES > UNITS_PER_WORD
>>> +   && can_move_by_pieces (bytes, TYPE_ALIGN (type)))
>>> + goto normal_inner_ref;
>>> +
>>
>> It looks like you're concerned about aggregate copies but this also handles
>> non-aggregates (which on GIMPLE might already be optimized of course).
> 
> Here I check if we copy more than 2 words and we can move more than
> a word in a single instruction.
> 
>> Also you say "if it's cheaper" but I see no cost considerations.  How do
>> we generally handle immed const vs. load from constant pool costs?
> 
> This trades 2 (update to 8) stores with one load plus one store.  Is there
> a way to check which one is faster?
> 
>>> FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (init), ix,
>>>   field, value)
>>>   if (tree_int_cst_equal (field, index))
>>> diff --git a/gcc/testsuite/gcc.target/i386/pr90773-24.c 
>>> b/gcc/testsuite/gcc.target/i386/pr90773-24.c
>>> new file mode 100644
>>> index 000..4a4b62533dc
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/i386/pr90773-24.c
>>> @@ -0,0 +1,22 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -march=x86-64" } */
>>> +
>>> +struct S
>>> +{
>>> +  long long s1 __attribute__ ((aligned (8)));
>>> +  unsigned s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12, s13, s14;
>>> +};
>>> +
>>> +const struct S array[] = {
>>> +  { 0, 60, 640, 2112543726, 39682, 48, 16, 33, 10, 96, 2, 0, 0, 4 }
>>> +};
>>> +
>>> +void
>>> +foo (struct S *x)
>>> +{
>>> +  x[0] = array[0];
>>> +}
>>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
>>> \\(%\[\^,\]+\\)" 1 } } */
>>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
>>> 16\\(%\[\^,\]+\\)" 1 } } */
>>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
>>> 32\\(%\[\^,\]+\\)" 1 } } */
>>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
>>> 48\\(%\[\^,\]+\\)" 1 } } */
>>> diff --git a/gcc/testsuite/gcc.target/i386/pr90773-25.c 
>>> b/gcc/testsuite/gcc.target/i386/pr90773-25.c
>>> new file mode 100644
>>> index 000..2520b670989
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/i386/pr90773-25.c
>>> @@ -0,0 +1,20 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -march=skylake" } */
>>> +
>>> +struct S
>>> +{
>>> +  long long s1 __attribute__ ((aligned (8)));
>>> +  unsigned s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12, s13, s14;
>>> +};
>>> +
>>> +const struct S array[] = {
>>> +  { 0, 60, 640, 2112543726, 39682, 48, 16, 33, 10, 96, 2, 0, 0, 4 }
>>> +};
>>> +
>>> +void
>>> +foo (struct S *x)
>>> +{
>>> +  x[0] = array[0];
>>> +}
>>> +/* { dg-final { scan-assembler-times "vmovdqu\[\\t \]%ymm\[0-9\]+, 
>>> \\(%\[\^,\]+\\)" 1 } } */
>>> +/* { dg-final { scan-assembler-times "vmovdqu\[\\t \]%ymm\[0-9\]+, 
>>> 32\\(%\[\^,\]+\

Re: [committed] libstdc++: Fix std::jthread assertion and re-enable skipped test

2021-05-19 Thread Jonathan Wakely via Gcc-patches


On 18/05/21 13:58 +0200, Bernd Edlinger wrote:
> On 5/18/21 1:55 PM, Bernd Edlinger wrote:
>> On 5/17/21 7:13 PM, Jonathan Wakely via Gcc-patches wrote:
>>> libstdc++-v3/ChangeLog:
>>>
>>> * include/std/thread (jthread::_S_create): Fix static assert
>>> message.
>>> * testsuite/30_threads/jthread/95989.cc: Re-enable test.
>>> * testsuite/30_threads/jthread/jthread.cc: Do not require
>>> pthread effective target.
>>> * testsuite/30_threads/jthread/2.cc: Moved to...
>>> * testsuite/30_threads/jthread/version.cc: ...here.
>>>
>>> Tested powerpc64le-linux. Committed to trunk.
>>>
>>> Let's see if this test is actually fixed, or if it still causes
>>> failures on some targets.
>>
>> Yes, indeed it is failing on x86_64-pc-linux-gnu.
>
> that means only this one:
>
> FAIL: 30_threads/jthread/95989.cc execution test

What's your glibc version?

[PATCH][pushed] Fix typos.

2021-05-19 Thread Martin Liška


PR testsuite/100658

gcc/cp/ChangeLog:

* mangle.c (write_encoding): Fix typos.

gcc/jit/ChangeLog:

* libgccjit.c (gcc_jit_context_new_function): Fix typos.

gcc/testsuite/ChangeLog:

* gcc.dg/local1.c: Fix typos.
* gcc.dg/ucnid-5-utf8.c: Likewise.
* gcc.dg/ucnid-5.c: Likewise.
---
 gcc/cp/mangle.c | 2 +-
 gcc/jit/libgccjit.c | 2 +-
 gcc/testsuite/gcc.dg/local1.c   | 2 +-
 gcc/testsuite/gcc.dg/ucnid-5-utf8.c | 2 +-
 gcc/testsuite/gcc.dg/ucnid-5.c  | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index f0e1f416804..ee14c2d5a25 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -835,7 +835,7 @@ write_encoding (const tree decl)
 }
 }
 
-/* Interface to substitution and identifer mangling, used by the
+/* Interface to substitution and identifier mangling, used by the
module name mangler.  */
 
 void

diff --git a/gcc/jit/libgccjit.c b/gcc/jit/libgccjit.c
index 0cc650f9810..7fa948007ad 100644
--- a/gcc/jit/libgccjit.c
+++ b/gcc/jit/libgccjit.c
@@ -909,7 +909,7 @@ gcc_jit_context_new_function (gcc_jit_context *ctxt,
   RETURN_NULL_IF_FAIL (return_type, ctxt, loc, "NULL return_type");
   RETURN_NULL_IF_FAIL (name, ctxt, loc, "NULL name");
   /* The assembler can only handle certain names, so for now, enforce
- C's rules for identiers upon the name, using ISALPHA and ISALNUM
+ C's rules for identifiers upon the name, using ISALPHA and ISALNUM
  from safe-ctype.h to ignore the current locale.
  Eventually we'll need some way to interact with e.g. C++ name
  mangling.  */
diff --git a/gcc/testsuite/gcc.dg/local1.c b/gcc/testsuite/gcc.dg/local1.c
index e9f653bcc56..448c71b056d 100644
--- a/gcc/testsuite/gcc.dg/local1.c
+++ b/gcc/testsuite/gcc.dg/local1.c
@@ -10,7 +10,7 @@
  the later daclaration is the same as the linkage specified at
  the prior declaration.  If no prior declaration is visible,
  or if the prior declaration specifies no linkage, then the
- identifer has external linkage.
+ identifier has external linkage.
 
This is PR 14366.  */
 
diff --git a/gcc/testsuite/gcc.dg/ucnid-5-utf8.c b/gcc/testsuite/gcc.dg/ucnid-5-utf8.c
index 8e104672d13..43310b6ada9 100644
--- a/gcc/testsuite/gcc.dg/ucnid-5-utf8.c
+++ b/gcc/testsuite/gcc.dg/ucnid-5-utf8.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-skip-if "No dollar in identfiers" { avr-*-* powerpc-ibm-aix* } } */
+/* { dg-skip-if "No dollar in identifiers" { avr-*-* powerpc-ibm-aix* } } */
 /* { dg-skip-if "" { ! ucn } } */
 /* { dg-options "-std=c99 -fdollars-in-identifiers -g" } */
 void abort (void);
diff --git a/gcc/testsuite/gcc.dg/ucnid-5.c b/gcc/testsuite/gcc.dg/ucnid-5.c
index dc282a780fc..6f0f4753032 100644
--- a/gcc/testsuite/gcc.dg/ucnid-5.c
+++ b/gcc/testsuite/gcc.dg/ucnid-5.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-skip-if "No dollar in identfiers" { avr-*-* powerpc-ibm-aix* } } */
+/* { dg-skip-if "No dollar in identifiers" { avr-*-* powerpc-ibm-aix* } } */
 /* { dg-options "-std=c99 -fdollars-in-identifiers -g" } */
 void abort (void);
 
--
2.31.1

Re: [PATCH v4 12/12] constructor: Check if it is faster to load constant from memory

2021-05-19 Thread H.J. Lu via Gcc-patches

On Wed, May 19, 2021 at 2:33 AM Richard Biener
 wrote:
>
> On Tue, May 18, 2021 at 9:16 PM H.J. Lu  wrote:
> >
> > When expanding a constant constructor, don't call expand_constructor if
> > it is more efficient to load the data from the memory via move by pieces.
> >
> > gcc/
> >
> > PR middle-end/90773
> > * expr.c (expand_expr_real_1): Don't call expand_constructor if
> > it is more efficient to load the data from the memory.
> >
> > gcc/testsuite/
> >
> > PR middle-end/90773
> > * gcc.target/i386/pr90773-24.c: New test.
> > * gcc.target/i386/pr90773-25.c: Likewise.
> > ---
> >  gcc/expr.c | 10 ++
> >  gcc/testsuite/gcc.target/i386/pr90773-24.c | 22 ++
> >  gcc/testsuite/gcc.target/i386/pr90773-25.c | 20 
> >  3 files changed, 52 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-24.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-25.c
> >
> > diff --git a/gcc/expr.c b/gcc/expr.c
> > index d09ee42e262..80e01ea1cbe 100644
> > --- a/gcc/expr.c
> > +++ b/gcc/expr.c
> > @@ -10886,6 +10886,16 @@ expand_expr_real_1 (tree exp, rtx target, 
> > machine_mode tmode,
> > unsigned HOST_WIDE_INT ix;
> > tree field, value;
> >
> > +   /* Check if it is more efficient to load the data from
> > +  the memory directly.  FIXME: How many stores do we
> > +  need here if not moved by pieces?  */
> > +   unsigned HOST_WIDE_INT bytes
> > + = tree_to_uhwi (TYPE_SIZE_UNIT (type));
>
> that's prone to fail - it could be a VLA.

What do you mean by fail?  Is it ICE or missed optimization?
Do you have a testcase?

>
> > +   if ((bytes / UNITS_PER_WORD) > 2
> > +   && MOVE_MAX_PIECES > UNITS_PER_WORD
> > +   && can_move_by_pieces (bytes, TYPE_ALIGN (type)))
> > + goto normal_inner_ref;
> > +
>
> It looks like you're concerned about aggregate copies but this also handles
> non-aggregates (which on GIMPLE might already be optimized of course).

Here I check if we copy more than 2 words and we can move more than
a word in a single instruction.
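
To put numbers on that, here is a small standalone model of the check in plain C. It is only a sketch: the parameters stand in for UNITS_PER_WORD and MOVE_MAX_PIECES, the can_move_by_pieces part is left out entirely, and the function names are made up.

```c
#include <stdbool.h>

/* Model of the size part of the condition: more than two words to
   copy, and a single move can cover more than one word.  */
static bool
prefer_load_from_memory (unsigned long bytes, unsigned units_per_word,
                         unsigned move_max_pieces)
{
  return (bytes / units_per_word) > 2
         && move_max_pieces > units_per_word;
}

/* Number of widest moves needed to cover BYTES, rounding up.  */
static unsigned long
wide_moves_needed (unsigned long bytes, unsigned move_max_pieces)
{
  return (bytes + move_max_pieces - 1) / move_max_pieces;
}
```

struct S in the new tests is 64 bytes (8 for s1 plus 13 * 4 for s2..s14, padded to a multiple of 8, assuming 4-byte unsigned). With 8-byte words and 16-byte moves the condition holds and four moves cover the struct, which lines up with the four movups stores pr90773-24.c scans for; with 32-byte moves it is two, matching the two vmovdqu stores in pr90773-25.c.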

> Also you say "if it's cheaper" but I see no cost considerations.  How do
> we generally handle immed const vs. load from constant pool costs?

This trades 2 (update to 8) stores with one load plus one store.  Is there
a way to check which one is faster?

> > FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (init), ix,
> >   field, value)
> >   if (tree_int_cst_equal (field, index))
> > diff --git a/gcc/testsuite/gcc.target/i386/pr90773-24.c 
> > b/gcc/testsuite/gcc.target/i386/pr90773-24.c
> > new file mode 100644
> > index 000..4a4b62533dc
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr90773-24.c
> > @@ -0,0 +1,22 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -march=x86-64" } */
> > +
> > +struct S
> > +{
> > +  long long s1 __attribute__ ((aligned (8)));
> > +  unsigned s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12, s13, s14;
> > +};
> > +
> > +const struct S array[] = {
> > +  { 0, 60, 640, 2112543726, 39682, 48, 16, 33, 10, 96, 2, 0, 0, 4 }
> > +};
> > +
> > +void
> > +foo (struct S *x)
> > +{
> > +  x[0] = array[0];
> > +}
> > +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
> > \\(%\[\^,\]+\\)" 1 } } */
> > +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
> > 16\\(%\[\^,\]+\\)" 1 } } */
> > +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
> > 32\\(%\[\^,\]+\\)" 1 } } */
> > +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 
> > 48\\(%\[\^,\]+\\)" 1 } } */
> > diff --git a/gcc/testsuite/gcc.target/i386/pr90773-25.c 
> > b/gcc/testsuite/gcc.target/i386/pr90773-25.c
> > new file mode 100644
> > index 000..2520b670989
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr90773-25.c
> > @@ -0,0 +1,20 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -march=skylake" } */
> > +
> > +struct S
> > +{
> > +  long long s1 __attribute__ ((aligned (8)));
> > +  unsigned s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12, s13, s14;
> > +};
> > +
> > +const struct S array[] = {
> > +  { 0, 60, 640, 2112543726, 39682, 48, 16, 33, 10, 96, 2, 0, 0, 4 }
> > +};
> > +
> > +void
> > +foo (struct S *x)
> > +{
> > +  x[0] = array[0];
> > +}
> > +/* { dg-final { scan-assembler-times "vmovdqu\[\\t \]%ymm\[0-9\]+, 
> > \\(%\[\^,\]+\\)" 1 } } */
> > +/* { dg-final { scan-assembler-times "vmovdqu\[\\t \]%ymm\[0-9\]+, 
> > 32\\(%\[\^,\]+\\)" 1 } } */
> > --
> > 2.31.1
> >



-- 
H.J.

Re: [PATCH, rs6000] Remove mode promotion of SSA variables

2021-05-19 Thread Segher Boessenkool

Hi!

On Wed, May 19, 2021 at 04:36:00PM +0800, HAO CHEN GUI wrote:
> On 19/5/2021 下午 4:33, HAO CHEN GUI wrote:
> >    This patch removes mode promotion of SSA variables on rs6000 platform.

It isn't "promotion of SSA variables".  At the point where this code
applies we are generating RTL, which doesn't do SSA.  It has what is
called "pseudo-registers" (or "pseudos" for short), which will be assigned
hard registers later.

> >    Bootstrapped and tested on powerpc64le and powerpc64be (with m32)
> >    with no regressions. Is this okay for trunk? Any recommendations?
> >    Thanks a lot.

powerpc64-linux and powerpc64le-linux I guess?

>   * config/rs6000/rs6000-call.c (rs6000_promote_function_mode):
>   Replace PROMOTE_MODE macro with its content.
>   * config/rs6000/rs6000.h (PROMOTE_MODE): Remove.

Please split this into two?  The first is obvious, the second much less
so; we'll need to see justification for it.  I know it helps greatly,
but please record that in the commit message :-)

> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index f5676255387..dca139b2ecf 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -6646,7 +6646,9 @@ rs6000_promote_function_mode (const_tree type 
> ATTRIBUTE_UNUSED,
> int *punsignedp ATTRIBUTE_UNUSED,
> const_tree, int for_return ATTRIBUTE_UNUSED)
>  {
> -  PROMOTE_MODE (mode, *punsignedp, type);
> +  if (GET_MODE_CLASS (mode) == MODE_INT
> +  && GET_MODE_SIZE (mode) < (TARGET_32BIT ? 4 : 8))
> +mode = TARGET_32BIT ? SImode : DImode;
>  
>return mode;
>  }

So this part is pre-approved as a separate patch.

> -/* Define this macro if it is advisable to hold scalars in registers
> -   in a wider mode than that declared by the program.  In such cases,
> -   the value is constrained to be within the bounds of the declared
> -   type, but kept valid in the wider mode.  The signedness of the
> -   extension may differ from that of the type.  */
> -
> -#define PROMOTE_MODE(MODE,UNSIGNEDP,TYPE)\
> -  if (GET_MODE_CLASS (MODE) == MODE_INT  \
> -  && GET_MODE_SIZE (MODE) < (TARGET_32BIT ? 4 : 8)) \
> -(MODE) = TARGET_32BIT ? SImode : DImode;
> -

And this part needs some more words in the commit message :-)

Since we do have instructions that can do (almost) everything 32-bit at
least as efficiently as the corresponding 64-bit things, it helps a lot
to not promote so many things to 64 bits.  We didn't realise that before
because that TARGET_PROMOTE_FUNCTION_MODE thing was in the way, since
the very early days of the rs6000 port even, so everyone (well, at least
me :-) ) was tricked into thinking this is an ABI requirement and we
cannot touch it.  But of course it is not, and we can :-)

Some examples of how this improves generated code, or even some
benchmark results, would be good to have.

Also, how about something like

#define PROMOTE_MODE(MODE,UNSIGNEDP,TYPE)   \
  if (GET_MODE_CLASS (MODE) == MODE_INT \
  && GET_MODE_SIZE (MODE) < 4)  \
(MODE) = SImode;

(that is, promoting modes smaller than SImode to SImode).  How does that
compare?
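
To spell out the difference between the two variants, here is a tiny model in plain C. It is a sketch only: byte sizes stand in for integer machine modes (1 = QImode, 2 = HImode, 4 = SImode, 8 = DImode), the target_32bit flag stands in for TARGET_32BIT, and the function names are made up.

```c
#include <stdbool.h>

/* What the existing macro body does: widen any integer mode narrower
   than a word to the full word.  */
static unsigned
promote_to_word (unsigned size, bool target_32bit)
{
  unsigned word = target_32bit ? 4 : 8;
  return size < word ? word : size;
}

/* The alternative above: widen only modes narrower than SImode, and
   only as far as SImode.  */
static unsigned
promote_to_simode (unsigned size)
{
  return size < 4 ? 4 : size;
}
```

The two agree on a 32-bit target. On a 64-bit target they differ for QImode, HImode and SImode values: the first pushes all three to DImode, the second leaves SImode alone and stops the narrower ones at SImode.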

Thanks!

Segher

Re: [PATCH 5/5] testsuite: aarch64: Add tests for high-half narrowing instructions

2021-05-19 Thread Richard Sandiford via Gcc-patches

Jonathan Wright  writes:
> Hi,
>
> As subject, this patch adds tests to confirm that a *2 (write to high-half)
> Neon instruction is generated from vcombine* of a narrowing intrinsic
> sequence.
>
> Ok for master?

OK, thanks.

Richard

> Thanks,
> Jonathan
>
> ---
>
> gcc/testsuite/ChangeLog:
>
> 2021-05-14  Jonathan Wright  
>
> * gcc.target/aarch64/narrow_high_combine.c: New test.
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/narrow_high_combine.c 
> b/gcc/testsuite/gcc.target/aarch64/narrow_high_combine.c
> new file mode 100644
> index 
> ..cf649bda28d4d648c9392d202fcc5660107a11d7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/narrow_high_combine.c
> @@ -0,0 +1,125 @@
> +/* { dg-skip-if "" { arm*-*-* } } */
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +#include <arm_neon.h>
> +
> +#define TEST_ARITH(name, rettype, rmwtype, intype, fs, rs) \
> +  rettype test_ ## name ## _ ## fs ## _high_combine \
> + (rmwtype a, intype b, intype c) \
> + { \
> + return vcombine_ ## rs (a, name ## _ ## fs (b, c)); \
> + }
> +
> +TEST_ARITH (vaddhn, int8x16_t, int8x8_t, int16x8_t, s16, s8)
> +TEST_ARITH (vaddhn, int16x8_t, int16x4_t, int32x4_t, s32, s16)
> +TEST_ARITH (vaddhn, int32x4_t, int32x2_t, int64x2_t, s64, s32)
> +TEST_ARITH (vaddhn, uint8x16_t, uint8x8_t, uint16x8_t, u16, u8)
> +TEST_ARITH (vaddhn, uint16x8_t, uint16x4_t, uint32x4_t, u32, u16)
> +TEST_ARITH (vaddhn, uint32x4_t, uint32x2_t, uint64x2_t, u64, u32)
> +
> +TEST_ARITH (vraddhn, int8x16_t, int8x8_t, int16x8_t, s16, s8)
> +TEST_ARITH (vraddhn, int16x8_t, int16x4_t, int32x4_t, s32, s16)
> +TEST_ARITH (vraddhn, int32x4_t, int32x2_t, int64x2_t, s64, s32)
> +TEST_ARITH (vraddhn, uint8x16_t, uint8x8_t, uint16x8_t, u16, u8)
> +TEST_ARITH (vraddhn, uint16x8_t, uint16x4_t, uint32x4_t, u32, u16)
> +TEST_ARITH (vraddhn, uint32x4_t, uint32x2_t, uint64x2_t, u64, u32)
> +
> +TEST_ARITH (vsubhn, int8x16_t, int8x8_t, int16x8_t, s16, s8)
> +TEST_ARITH (vsubhn, int16x8_t, int16x4_t, int32x4_t, s32, s16)
> +TEST_ARITH (vsubhn, int32x4_t, int32x2_t, int64x2_t, s64, s32)
> +TEST_ARITH (vsubhn, uint8x16_t, uint8x8_t, uint16x8_t, u16, u8)
> +TEST_ARITH (vsubhn, uint16x8_t, uint16x4_t, uint32x4_t, u32, u16)
> +TEST_ARITH (vsubhn, uint32x4_t, uint32x2_t, uint64x2_t, u64, u32)
> +
> +TEST_ARITH (vrsubhn, int8x16_t, int8x8_t, int16x8_t, s16, s8)
> +TEST_ARITH (vrsubhn, int16x8_t, int16x4_t, int32x4_t, s32, s16)
> +TEST_ARITH (vrsubhn, int32x4_t, int32x2_t, int64x2_t, s64, s32)
> +TEST_ARITH (vrsubhn, uint8x16_t, uint8x8_t, uint16x8_t, u16, u8)
> +TEST_ARITH (vrsubhn, uint16x8_t, uint16x4_t, uint32x4_t, u32, u16)
> +TEST_ARITH (vrsubhn, uint32x4_t, uint32x2_t, uint64x2_t, u64, u32)
> +
> +#define TEST_SHIFT(name, rettype, rmwtype, intype, fs, rs) \
> +  rettype test_ ## name ## _ ## fs ## _high_combine \
> + (rmwtype a, intype b) \
> + { \
> + return vcombine_ ## rs (a, name ## _ ## fs (b, 4)); \
> + }
> +
> +TEST_SHIFT (vshrn_n, int8x16_t, int8x8_t, int16x8_t, s16, s8)
> +TEST_SHIFT (vshrn_n, int16x8_t, int16x4_t, int32x4_t, s32, s16)
> +TEST_SHIFT (vshrn_n, int32x4_t, int32x2_t, int64x2_t, s64, s32)
> +TEST_SHIFT (vshrn_n, uint8x16_t, uint8x8_t, uint16x8_t, u16, u8)
> +TEST_SHIFT (vshrn_n, uint16x8_t, uint16x4_t, uint32x4_t, u32, u16)
> +TEST_SHIFT (vshrn_n, uint32x4_t, uint32x2_t, uint64x2_t, u64, u32)
> +
> +TEST_SHIFT (vrshrn_n, int8x16_t, int8x8_t, int16x8_t, s16, s8)
> +TEST_SHIFT (vrshrn_n, int16x8_t, int16x4_t, int32x4_t, s32, s16)
> +TEST_SHIFT (vrshrn_n, int32x4_t, int32x2_t, int64x2_t, s64, s32)
> +TEST_SHIFT (vrshrn_n, uint8x16_t, uint8x8_t, uint16x8_t, u16, u8)
> +TEST_SHIFT (vrshrn_n, uint16x8_t, uint16x4_t, uint32x4_t, u32, u16)
> +TEST_SHIFT (vrshrn_n, uint32x4_t, uint32x2_t, uint64x2_t, u64, u32)
> +
> +TEST_SHIFT (vqshrn_n, int8x16_t, int8x8_t, int16x8_t, s16, s8)
> +TEST_SHIFT (vqshrn_n, int16x8_t, int16x4_t, int32x4_t, s32, s16)
> +TEST_SHIFT (vqshrn_n, int32x4_t, int32x2_t, int64x2_t, s64, s32)
> +TEST_SHIFT (vqshrn_n, uint8x16_t, uint8x8_t, uint16x8_t, u16, u8)
> +TEST_SHIFT (vqshrn_n, uint16x8_t, uint16x4_t, uint32x4_t, u32, u16)
> +TEST_SHIFT (vqshrn_n, uint32x4_t, uint32x2_t, uint64x2_t, u64, u32)
> +
> +TEST_SHIFT (vqrshrn_n, int8x16_t, int8x8_t, int16x8_t, s16, s8)
> +TEST_SHIFT (vqrshrn_n, int16x8_t, int16x4_t, int32x4_t, s32, s16)
> +TEST_SHIFT (vqrshrn_n, int32x4_t, int32x2_t, int64x2_t, s64, s32)
> +TEST_SHIFT (vqrshrn_n, uint8x16_t, uint8x8_t, uint16x8_t, u16, u8)
> +TEST_SHIFT (vqrshrn_n, uint16x8_t, uint16x4_t, uint32x4_t, u32, u16)
> +TEST_SHIFT (vqrshrn_n, uint32x4_t, uint32x2_t, uint64x2_t, u64, u32)
> +
> +TEST_SHIFT (vqshrun_n, uint8x16_t, uint8x8_t, int16x8_t, s16, u8)
> +TEST_SHIFT (vqshrun_n, uint16x8_t, uint16x4_t, int32x4_t, s32, u16)
> +TEST_SHIFT (vqshrun_n, uint32x4_t, uint32x2_t, int64x2_t, s64, u32)
> +
> +TEST_SHIFT (vqrshrun_n, uint8x16_t, uint8x8_t, int16x8_t, s16, u8)
> +TEST_SHIFT (vqrshrun_n, ui

Re: [PATCH 4/5] aarch64: Refactor aarch64_qshrn_n RTL pattern

2021-05-19 Thread Richard Sandiford via Gcc-patches

Jonathan Wright  writes:
> Hi,
>
> As subject, this patch splits the aarch64_qshrn_n
> pattern into separate scalar and vector variants. It further splits the vector
> pattern into big/little endian variants that model the zero-high-half
> semantics of the underlying instruction - allowing for more combinations
> with the write-to-high-half variant
> (aarch64_qshrn2_n.) This improvement will be
> confirmed by a new test in gcc.target/aarch64/narrow_high_combine.c
> (patch 5/5 in this series.)
>
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.
>
> Ok for master?

OK, thanks.

Richard

> Thanks,
> Jonathan
>
> ---
>
> gcc/ChangeLog:
>
> 2021-05-14  Jonathan Wright  
>
> * config/aarch64/aarch64-simd-builtins.def: Split builtin
> generation for aarch64_qshrn_n pattern into
> separate scalar and vector generators.
> * config/aarch64/aarch64-simd.md
> (aarch64_qshrn_n): Define as an expander and
> split into...
> (aarch64_qshrn_n_insn_le): This and...
> (aarch64_qshrn_n_insn_be): This.
> * config/aarch64/iterators.md: Define SD_HSDI iterator.
>
> diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def 
> b/gcc/config/aarch64/aarch64-simd-builtins.def
> index 
> 1e81bb53287e9797f3539c2c64ed11c6c26d6e4e..18baa6720b09b2ebda8577b809f8a8683f8b44f0
>  100644
> --- a/gcc/config/aarch64/aarch64-simd-builtins.def
> +++ b/gcc/config/aarch64/aarch64-simd-builtins.def
> @@ -421,12 +421,18 @@
>BUILTIN_VQW (SHIFTIMM, sshll2_n, 0, NONE)
>BUILTIN_VQW (SHIFTIMM, ushll2_n, 0, NONE)
>/* Implemented by aarch64_qshrn_n.  */
> -  BUILTIN_VSQN_HSDI (SHIFTIMM, sqshrun_n, 0, NONE)
> -  BUILTIN_VSQN_HSDI (SHIFTIMM, sqrshrun_n, 0, NONE)
> -  BUILTIN_VSQN_HSDI (SHIFTIMM, sqshrn_n, 0, NONE)
> -  BUILTIN_VSQN_HSDI (USHIFTIMM, uqshrn_n, 0, NONE)
> -  BUILTIN_VSQN_HSDI (SHIFTIMM, sqrshrn_n, 0, NONE)
> -  BUILTIN_VSQN_HSDI (USHIFTIMM, uqrshrn_n, 0, NONE)
> +  BUILTIN_VQN (SHIFTIMM, sqshrun_n, 0, NONE)
> +  BUILTIN_VQN (SHIFTIMM, sqrshrun_n, 0, NONE)
> +  BUILTIN_VQN (SHIFTIMM, sqshrn_n, 0, NONE)
> +  BUILTIN_VQN (USHIFTIMM, uqshrn_n, 0, NONE)
> +  BUILTIN_VQN (SHIFTIMM, sqrshrn_n, 0, NONE)
> +  BUILTIN_VQN (USHIFTIMM, uqrshrn_n, 0, NONE)
> +  BUILTIN_SD_HSDI (SHIFTIMM, sqshrun_n, 0, NONE)
> +  BUILTIN_SD_HSDI (SHIFTIMM, sqrshrun_n, 0, NONE)
> +  BUILTIN_SD_HSDI (SHIFTIMM, sqshrn_n, 0, NONE)
> +  BUILTIN_SD_HSDI (USHIFTIMM, uqshrn_n, 0, NONE)
> +  BUILTIN_SD_HSDI (SHIFTIMM, sqrshrn_n, 0, NONE)
> +  BUILTIN_SD_HSDI (USHIFTIMM, uqrshrn_n, 0, NONE)
>/* Implemented by aarch64_qshrn2_n.  */
>BUILTIN_VQN (SHIFT2IMM_UUSS, sqshrun2_n, 0, NONE)
>BUILTIN_VQN (SHIFT2IMM_UUSS, sqrshrun2_n, 0, NONE)
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 
> 79523093ec327b826c0a6741bf315c6c2f67fe64..c67fa3fb6f0ca0a181a09a42677526d12e955c06
>  100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -6045,7 +6045,7 @@
>  
>  (define_insn "aarch64_qshrn_n"
>[(set (match_operand: 0 "register_operand" "=w")
> -(unspec: [(match_operand:VSQN_HSDI 1 "register_operand" 
> "w")
> +(unspec: [(match_operand:SD_HSDI 1 "register_operand" "w")
>   (match_operand:SI 2
> "aarch64_simd_shift_imm_offset_" "i")]
>  VQSHRN_N))]
> @@ -6054,6 +6054,58 @@
>[(set_attr "type" "neon_sat_shift_imm_narrow_q")]
>  )
>  
> +(define_insn "aarch64_qshrn_n_insn_le"
> +  [(set (match_operand: 0 "register_operand" "=w")
> + (vec_concat:
> +   (unspec:
> + [(match_operand:VQN 1 "register_operand" "w")
> +  (match_operand:VQN 2 "aarch64_simd_shift_imm_vec_")]
> + VQSHRN_N)
> +   (match_operand: 3 "aarch64_simd_or_scalar_imm_zero")))]
> +  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
> +  "qshrn\\t%0, %1, %2"
> +  [(set_attr "type" "neon_shift_imm_narrow_q")]
> +)
> +
> +(define_insn "aarch64_qshrn_n_insn_be"
> +  [(set (match_operand: 0 "register_operand" "=w")
> + (vec_concat:
> +   (match_operand: 3 "aarch64_simd_or_scalar_imm_zero")
> +   (unspec:
> + [(match_operand:VQN 1 "register_operand" "w")
> +  (match_operand:VQN 2 "aarch64_simd_shift_imm_vec_")]
> + VQSHRN_N)))]
> +  "TARGET_SIMD && BYTES_BIG_ENDIAN"
> +  "qshrn\\t%0, %1, %2"
> +  [(set_attr "type" "neon_shift_imm_narrow_q")]
> +)
> +
> +(define_expand "aarch64_qshrn_n"
> +  [(set (match_operand: 0 "register_operand" "=w")
> +(unspec: [(match_operand:VQN 1 "register_operand")
> + (match_operand:SI 2
> +   "aarch64_simd_shift_imm_offset_")]
> +VQSHRN_N))]
> +  "TARGET_SIMD"
> +  {
> +operands[2] = aarch64_simd_gen_const_vector_dup (mode,
> +  INTVAL (operands[2]));
> +rtx tmp = gen_reg_rtx (mode);
> +if (BYTES_BI

Re: [PATCH 3/5] aarch64: Relax aarch64_sqxtun2 RTL pattern

2021-05-19 Thread Richard Sandiford via Gcc-patches

Jonathan Wright  writes:
> Hi,
>
> As subject, this patch uses UNSPEC_SQXTUN instead of UNSPEC_SQXTUN2
> in the aarch64_sqxtun2 patterns. This allows for more
> aggressive combinations and ultimately better code generation - which will
> be confirmed by a new set of tests in
> gcc.target/aarch64/narrow_high_combine.c (patch 5/5 in this series.)
>
> The now redundant UNSPEC_SQXTUN2 is removed.
>
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.

OK, thanks.

Richard

> Ok for master?
>
> Thanks,
> Jonathan
>
> ---
>
> gcc/ChangeLog:
>
> 2021-05-14  Jonathan Wright  
>
> * config/aarch64/aarch64-simd.md: Use UNSPEC_SQXTUN instead
> of UNSPEC_SQXTUN2.
> * config/aarch64/iterators.md: Remove UNSPEC_SQXTUN2.
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 
> 2a836e8f9a4dfe11d645d439b19ac4487d9fb1a8..fa6c2d81cf3b6939a8eb4a4f471ac2398a60e115
>  100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -4859,7 +4859,7 @@
>   (vec_concat:
> (match_operand: 1 "register_operand" "0")
> (unspec:
> - [(match_operand:VQN 2 "register_operand" "w")] UNSPEC_SQXTUN2)))]
> + [(match_operand:VQN 2 "register_operand" "w")] UNSPEC_SQXTUN)))]
>"TARGET_SIMD && !BYTES_BIG_ENDIAN"
>"sqxtun2\\t%0., %2."
> [(set_attr "type" "neon_sat_shift_imm_narrow_q")]
> @@ -4869,7 +4869,7 @@
>[(set (match_operand: 0 "register_operand" "=w")
>   (vec_concat:
> (unspec:
> - [(match_operand:VQN 2 "register_operand" "w")] UNSPEC_SQXTUN2)
> + [(match_operand:VQN 2 "register_operand" "w")] UNSPEC_SQXTUN)
> (match_operand: 1 "register_operand" "0")))]
>"TARGET_SIMD && BYTES_BIG_ENDIAN"
>"sqxtun2\\t%0., %2."
> @@ -4880,7 +4880,7 @@
>[(match_operand: 0 "register_operand")
> (match_operand: 1 "register_operand")
> (unspec:
> - [(match_operand:VQN 2 "register_operand")] UNSPEC_SQXTUN2)]
> + [(match_operand:VQN 2 "register_operand")] UNSPEC_SQXTUN)]
>"TARGET_SIMD"
>{
>  if (BYTES_BIG_ENDIAN)
> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index 
> d13f54a32465619110d7d014fdfe8aaf22384189..96eaef9c749927394465bfe445f509807bfdc57c
>  100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -521,7 +521,6 @@
>  UNSPEC_USQADD; Used in aarch64-simd.md.
>  UNSPEC_SUQADD; Used in aarch64-simd.md.
>  UNSPEC_SQXTUN; Used in aarch64-simd.md.
> -UNSPEC_SQXTUN2   ; Used in aarch64-simd.md.
>  UNSPEC_SSRA  ; Used in aarch64-simd.md.
>  UNSPEC_USRA  ; Used in aarch64-simd.md.
>  UNSPEC_SRSRA ; Used in aarch64-simd.md.

Re: [PATCH 2/5] aarch64: Relax aarch64_qshrn2_n RTL pattern

2021-05-19 Thread Richard Sandiford via Gcc-patches

Jonathan Wright  writes:
> Hi,
>
> As subject, this patch implements saturating right-shift and narrow high
> Neon intrinsic RTL patterns using a vec_concat of a register_operand
> and a VQSHRN_N unspec - instead of just a VQSHRN_N unspec. This
> more relaxed pattern allows for more aggressive combinations and
> ultimately better code generation - which will be confirmed by a new
> set of tests in gcc.target/aarch64/narrow_high_combine.c (patch 5/5 in
> this series.)
>
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.
>
> Ok for master?

OK, thanks.

Richard
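
For readers following along, the vec_concat form described in the patch summary can be modelled in scalar C. This is only a sketch and not part of the patch: the names sat_s8 and qshrn_high_model are made up, it covers just the signed 16-to-8-bit case, it follows the operand order of the little-endian pattern, and it assumes that right-shifting a negative value is an arithmetic shift (as GCC implements it).

```c
#include <assert.h>
#include <stdint.h>

enum { HALF_LANES = 8 };

/* Saturate a 16-bit intermediate value to the int8_t range.  */
static int8_t
sat_s8 (int16_t v)
{
  if (v > INT8_MAX)
    return INT8_MAX;
  if (v < INT8_MIN)
    return INT8_MIN;
  return (int8_t) v;
}

/* Scalar model of "saturating shift right narrow, write high half":
   out = vec_concat (low, sat_narrow (wide >> shift)).
   The low half is passed through unchanged; only the high half is
   produced by the narrowing shift.  */
static void
qshrn_high_model (int8_t out[2 * HALF_LANES],
                  const int8_t low[HALF_LANES],
                  const int16_t wide[HALF_LANES], int shift)
{
  /* Narrowing from 16 to 8 bits allows shift counts 1..8.  */
  assert (shift >= 1 && shift <= 8);
  for (int i = 0; i < HALF_LANES; i++)
    {
      out[i] = low[i];
      out[HALF_LANES + i] = sat_s8 ((int16_t) (wide[i] >> shift));
    }
}
```

Writing the result as a concat of the untouched low half and the narrowed high half is what lets a separate combine of a low half with a narrowing shift match the same shape.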

> Thanks,
> Jonathan
>
> ---
>
> gcc/ChangeLog:
>
> 2021-03-04  Jonathan Wright  
>
> * config/aarch64/aarch64-simd.md (aarch64_qshrn2_n):
> Implement as an expand emitting a big/little endian
> instruction pattern.
> (aarch64_qshrn2_n_insn_le): Define.
> (aarch64_qshrn2_n_insn_be): Define.
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 
> 69d48fea16b732c20db0ee400782ef9b73982c47..2a836e8f9a4dfe11d645d439b19ac4487d9fb1a8
>  100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -6031,17 +6031,54 @@
>[(set_attr "type" "neon_sat_shift_imm_narrow_q")]
>  )
>  
> -(define_insn "aarch64_qshrn2_n"
> +(define_insn "aarch64_qshrn2_n_insn_le"
>[(set (match_operand: 0 "register_operand" "=w")
> -(unspec: [(match_operand: 1 "register_operand" 
> "0")
> -  (match_operand:VQN 2 "register_operand" "w")
> -  (match_operand:SI 3 
> "aarch64_simd_shift_imm_offset_" "i")]
> -VQSHRN_N))]
> -  "TARGET_SIMD"
> + (vec_concat:
> +   (match_operand: 1 "register_operand" "0")
> +   (unspec: [(match_operand:VQN 2 "register_operand" "w")
> +   (match_operand:VQN 3
> + "aarch64_simd_shift_imm_vec_")]
> +  VQSHRN_N)))]
> +  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
>"qshrn2\\t%0., %2., %3"
>[(set_attr "type" "neon_sat_shift_imm_narrow_q")]
>  )
>  
> +(define_insn "aarch64_qshrn2_n_insn_be"
> +  [(set (match_operand: 0 "register_operand" "=w")
> + (vec_concat:
> +  (unspec: [(match_operand:VQN 2 "register_operand" "w")
> +   (match_operand:VQN 3
> + "aarch64_simd_shift_imm_vec_")]
> +  VQSHRN_N)
> +   (match_operand: 1 "register_operand" "0")))]
> +  "TARGET_SIMD && BYTES_BIG_ENDIAN"
> +  "qshrn2\\t%0., %2., %3"
> +  [(set_attr "type" "neon_sat_shift_imm_narrow_q")]
> +)
> +
> +(define_expand "aarch64_qshrn2_n"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand: 1 "register_operand")
> +   (unspec:
> + [(match_operand:VQN 2 "register_operand")
> +  (match_operand:SI 3 "aarch64_simd_shift_imm_offset_")]
> +VQSHRN_N)]
> +  "TARGET_SIMD"
> +  {
> +operands[3] = aarch64_simd_gen_const_vector_dup (mode,
> +  INTVAL (operands[3]));
> +
> +if (BYTES_BIG_ENDIAN)
> +  emit_insn (gen_aarch64_qshrn2_n_insn_be (operands[0],
> + operands[1], operands[2], operands[3]));
> +else
> +  emit_insn (gen_aarch64_qshrn2_n_insn_le (operands[0],
> + operands[1], operands[2], operands[3]));
> +DONE;
> +  }
> +)
> +
>  
>  ;; cm(eq|ge|gt|lt|le)
>  ;; Note, we have constraints for Dz and Z as different expanders

Re: [PATCH 1/5] aarch64: Relax aarch64_hn2 RTL pattern

2021-05-19 Thread Richard Sandiford via Gcc-patches

Jonathan Wright  writes:
> Hi,
>
> As subject, this patch implements v[r]addhn2 and v[r]subhn2 Neon intrinsic
> RTL patterns using a vec_concat of a register_operand and an ADDSUBHN
> unspec - instead of just an ADDSUBHN2 unspec. This more relaxed pattern
> allows for more aggressive combinations and ultimately better code
> generation - which will be confirmed by a new set of tests in
> gcc.target/aarch64/narrow_high_combine.c (patch 5/5 in this series).
>
> This patch also removes the now redundant [R]ADDHN2 and [R]SUBHN2
> unspecs and their iterator.
>
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.
>
> Ok for master?

OK, thanks.

Richard

> Thanks,
> Jonathan
>
> ---
>
> gcc/ChangeLog:
>
> 2021-03-03  Jonathan Wright  
>
> * config/aarch64/aarch64-simd.md (aarch64_hn2):
> Implement as an expand emitting a big/little endian
> instruction pattern.
> (aarch64_hn2_insn_le): Define.
> (aarch64_hn2_insn_be): Define.
> * config/aarch64/iterators.md: Remove UNSPEC_[R]ADDHN2 and
> UNSPEC_[R]SUBHN2 unspecs and ADDSUBHN2 iterator.
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 
> 99620895e7874cdfe346eb8994fa7b519c650f88..69d48fea16b732c20db0ee400782ef9b73982c47
>  100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -4594,17 +4594,48 @@
>[(set_attr "type" "neon__halve_narrow_q")]
>  )
>  
> -(define_insn "aarch64_hn2"
> +(define_insn "aarch64_hn2_insn_le"
>[(set (match_operand: 0 "register_operand" "=w")
> -(unspec: [(match_operand: 1 "register_operand" 
> "0")
> -  (match_operand:VQN 2 "register_operand" "w")
> -  (match_operand:VQN 3 "register_operand" "w")]
> -ADDSUBHN2))]
> -  "TARGET_SIMD"
> + (vec_concat:
> +   (match_operand: 1 "register_operand" "0")
> +   (unspec: [(match_operand:VQN 2 "register_operand" "w")
> +   (match_operand:VQN 3 "register_operand" "w")]
> +  ADDSUBHN)))]
> +  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
> +  "hn2\\t%0., %2., %3."
> +  [(set_attr "type" "neon__halve_narrow_q")]
> +)
> +
> +(define_insn "aarch64_hn2_insn_be"
> +  [(set (match_operand: 0 "register_operand" "=w")
> + (vec_concat:
> +   (unspec: [(match_operand:VQN 2 "register_operand" "w")
> +   (match_operand:VQN 3 "register_operand" "w")]
> +  ADDSUBHN)
> +   (match_operand: 1 "register_operand" "0")))]
> +  "TARGET_SIMD && BYTES_BIG_ENDIAN"
>"hn2\\t%0., %2., %3."
>[(set_attr "type" "neon__halve_narrow_q")]
>  )
>  
> +(define_expand "aarch64_hn2"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand: 1 "register_operand")
> +   (unspec [(match_operand:VQN 2 "register_operand")
> + (match_operand:VQN 3 "register_operand")]
> +ADDSUBHN)]
> +  "TARGET_SIMD"
> +  {
> +if (BYTES_BIG_ENDIAN)
> +  emit_insn (gen_aarch64_hn2_insn_be (operands[0],
> + operands[1], operands[2], operands[3]));
> +else
> +  emit_insn (gen_aarch64_hn2_insn_le (operands[0],
> + operands[1], operands[2], operands[3]));
> +DONE;
> +  }
> +)
> +
>  ;; pmul.
>  
>  (define_insn "aarch64_pmul"
> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index 
> 69d9dbebe8f1fcee39e93586b97da1a887fd94e3..d13f54a32465619110d7d014fdfe8aaf22384189
>  100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -514,10 +514,6 @@
>  UNSPEC_RADDHN; Used in aarch64-simd.md.
>  UNSPEC_SUBHN ; Used in aarch64-simd.md.
>  UNSPEC_RSUBHN; Used in aarch64-simd.md.
> -UNSPEC_ADDHN2; Used in aarch64-simd.md.
> -UNSPEC_RADDHN2   ; Used in aarch64-simd.md.
> -UNSPEC_SUBHN2; Used in aarch64-simd.md.
> -UNSPEC_RSUBHN2   ; Used in aarch64-simd.md.
>  UNSPEC_SQDMULH   ; Used in aarch64-simd.md.
>  UNSPEC_SQRDMULH  ; Used in aarch64-simd.md.
>  UNSPEC_PMUL  ; Used in aarch64-simd.md.
> @@ -2241,9 +2237,6 @@
>  (define_int_iterator ADDSUBHN [UNSPEC_ADDHN UNSPEC_RADDHN
>  UNSPEC_SUBHN UNSPEC_RSUBHN])
>  
> -(define_int_iterator ADDSUBHN2 [UNSPEC_ADDHN2 UNSPEC_RADDHN2
> - UNSPEC_SUBHN2 UNSPEC_RSUBHN2])
> -
>  (define_int_iterator FMAXMIN_UNS [UNSPEC_FMAX UNSPEC_FMIN
> UNSPEC_FMAXNM UNSPEC_FMINNM])
>  
> @@ -2996,8 +2989,6 @@
> (UNSPEC_SABDL2 "s") (UNSPEC_UABDL2 "u")
> (UNSPEC_SADALP "s") (UNSPEC_UADALP "u")
> (UNSPEC_SUBHN "") (UNSPEC_RSUBHN "r")
> -   (UNSPEC_ADDHN2 "") (UNSPEC_RADDHN2 "r")
> -   (UNSPEC_SUBHN2 "") (UNSPEC_RSUBHN2 "r")
> (UNSPEC_USQADD "us") (UNSPEC_S

[PATCH] middle-end/100672 - fix bogus right shift folding

2021-05-19 Thread Richard Biener

This fixes the bogus use of TYPE_PRECISION on vector types
in the folding that optimizes -((int)x >> 31) into (unsigned)x >> 31.
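
For scalar 32-bit integers the identity behind this folding can be checked directly. The sketch below is not part of the patch; the function names are made up for illustration, and it assumes that right-shifting a negative signed value is an arithmetic shift, which is what GCC implements.

```c
#include <assert.h>
#include <stdint.h>

/* -(x >> 31): the arithmetic shift yields -1 for negative x and 0
   otherwise, so the negation is 1 or 0.  */
static uint32_t
neg_of_sign_shift (int32_t x)
{
  return (uint32_t) -(x >> 31);
}

/* (unsigned) x >> 31: a logical shift that extracts the sign bit.  */
static uint32_t
sign_bit (int32_t x)
{
  return (uint32_t) x >> 31;
}

/* Check that both forms agree on a few boundary values.  */
static void
check_forms_agree (void)
{
  static const int32_t samples[] = { INT32_MIN, -2, -1, 0, 1, INT32_MAX };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (neg_of_sign_shift (samples[i]) == sign_bit (samples[i]));
}
```

The equivalence only holds when the shift count is the element width minus one. In the new testcase the elements are 64-bit long long and the shift count is 1, so the folding must not apply there.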

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2021-05-19  Richard Biener  

PR middle-end/100672
* fold-const.c (fold_negate_expr_1): Use element_precision.
(negate_expr_p): Likewise.

* gcc.dg/torture/pr100672.c: New testcase.
---
 gcc/fold-const.c|  4 ++--
 gcc/testsuite/gcc.dg/torture/pr100672.c | 19 +++
 2 files changed, 21 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr100672.c

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 5a41524702b..3be9c15e6b2 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -512,7 +512,7 @@ negate_expr_p (tree t)
   if (TREE_CODE (TREE_OPERAND (t, 1)) == INTEGER_CST)
{
  tree op1 = TREE_OPERAND (t, 1);
- if (wi::to_wide (op1) == TYPE_PRECISION (type) - 1)
+ if (wi::to_wide (op1) == element_precision (type) - 1)
return true;
}
   break;
@@ -705,7 +705,7 @@ fold_negate_expr_1 (location_t loc, tree t)
   if (TREE_CODE (TREE_OPERAND (t, 1)) == INTEGER_CST)
{
  tree op1 = TREE_OPERAND (t, 1);
- if (wi::to_wide (op1) == TYPE_PRECISION (type) - 1)
+ if (wi::to_wide (op1) == element_precision (type) - 1)
{
  tree ntype = TYPE_UNSIGNED (type)
   ? signed_type_for (type)
diff --git a/gcc/testsuite/gcc.dg/torture/pr100672.c 
b/gcc/testsuite/gcc.dg/torture/pr100672.c
new file mode 100644
index 000..cc62e71f9a3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr100672.c
@@ -0,0 +1,19 @@
+/* { dg-do run } */
+/* { dg-additional-options "-w -Wno-psabi" } */
+
+typedef long long __attribute__((__vector_size__ (4 * sizeof (long long)))) V;
+
+V
+foo (V v)
+{
+  return -(v >> 1);
+}
+
+int
+main (void)
+{
+  V v = foo ((V) { -2, -4, -6, -8 });
+  if (v[0] != 1 || v[1] != 2 || v[2] != 3 || v[3] != 4)
+__builtin_abort ();
+  return 0;
+}
-- 
2.26.2

Add 'libgomp.oacc-c-c++-common/private-atomic-1.c' [PR83812] (was: [PATCH][testsuite, nvptx] Add effective target sync_int_long_stack)

2021-05-19 Thread Thomas Schwinge

Hi!

On 2020-08-12T15:57:23+0200, Tom de Vries  wrote:
> When enabling sync_int_long for nvptx, we run into a failure in
> gcc.dg/pr86314.c:
> ...
>  nvptx-run: error getting kernel result: operation not supported on \
>global/shared address space
> ...
> due to a ptx restriction:  accesses to local memory are illegal, and the
> test-case does an atomic operation on a stack address, which is mapped to
> local memory.
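
To make the failing pattern concrete: the sketch below is not the gcc.dg/pr86314.c test itself, just a minimal illustration (with a made-up function name) of an atomic read-modify-write on an object with automatic storage. On a host this runs fine; it is this kind of access to a stack address that the quoted text says is mapped to local memory on nvptx.

```c
#include <assert.h>
#include <stdatomic.h>

/* Perform ROUNDS atomic increments on a counter that lives on the
   stack and return the final value.  */
static int
bump_stack_counter (int rounds)
{
  assert (rounds >= 0);
  atomic_int counter = 0;       /* automatic storage, i.e. on the stack */
  for (int i = 0; i < rounds; i++)
    atomic_fetch_add (&counter, 1);
  return atomic_load (&counter);
}
```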

Now, that problem also is easy to reproduce in an OpenACC/nvptx
offloading setting.  (..., as Julian had reported and analyzed in an
internal "Date: Fri, 31 May 2019 00:29:17 +0100" email, similar to Tom's
comments in PR96494 "[nvptx] Enable effective target sync_int_long" and
PR97444 "[nvptx] stack atomics".)

> Fix this by adding a target sync_int_long_stack, which returns false for nvptx,
> which can be used to mark test-cases that require sync_int_long support for
> stack address.

Similar to PR97444 "[nvptx] stack atomics", such a conditional is of
course not applicable for the OpenACC implementation, so to at least
document/test/XFAIL nvptx offloading: PR83812 "operation not supported
on global/shared address space", I've now pushed to master branch
"Add 'libgomp.oacc-c-c++-common/private-atomic-1.c' [PR83812]" in
commit 1467100fc72562a59f70cdd4e05f6c810d1fadcc, see attached.

And then, I got back testresults from one more system, and I've filed
 "[OpenACC/nvptx]
'libgomp.oacc-c-c++-common/private-atomic-1.c' FAILs (differently) in
certain configurations"...  :-\

As it's not clear yet what the conditions are, I cannot come up with a
selector to (differently) XFAIL that one.

Grüße
 Thomas

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf

Re: [PATCH v4 01/12] Add TARGET_READ_MEMSET_VALUE/TARGET_GEN_MEMSET_VALUE

2021-05-19 Thread H.J. Lu via Gcc-patches

On Wed, May 19, 2021 at 2:25 AM Richard Biener
 wrote:
>
> On Tue, May 18, 2021 at 9:16 PM H.J. Lu  wrote:
> >
> > Add TARGET_READ_MEMSET_VALUE and TARGET_GEN_MEMSET_VALUE to support
> > target instructions to duplicate QImode value to TImode/OImode/XImode
> > > value for memset.
> >
> > PR middle-end/90773
> > * builtins.c (builtin_memset_read_str): Call
> > targetm.read_memset_value.
> > (builtin_memset_gen_str): Call targetm.gen_memset_value.
> > * target.def (read_memset_value): New hook.
> > (gen_memset_value): Likewise.
> > > * targhooks.c: Include "builtins.h".
> > (default_read_memset_value): New function.
> > (default_gen_memset_value): Likewise.
> > * targhooks.h (default_read_memset_value): New prototype.
> > (default_gen_memset_value): Likewise.
> > * doc/tm.texi.in: Add TARGET_READ_MEMSET_VALUE and
> > TARGET_GEN_MEMSET_VALUE hooks.
> > * doc/tm.texi: Regenerated.
> > ---
> >  gcc/builtins.c | 47 --
> >  gcc/doc/tm.texi| 16 +
> >  gcc/doc/tm.texi.in |  4 
> >  gcc/target.def | 20 +
> >  gcc/targhooks.c| 56 ++
> >  gcc/targhooks.h|  4 
> >  6 files changed, 104 insertions(+), 43 deletions(-)
> >
> > diff --git a/gcc/builtins.c b/gcc/builtins.c
> > index e1b284846b1..f78a36478ef 100644
> > --- a/gcc/builtins.c
> > +++ b/gcc/builtins.c
> > @@ -6584,24 +6584,11 @@ expand_builtin_strncpy (tree exp, rtx target)
> > previous iteration.  */
> >
> >  rtx
> > -builtin_memset_read_str (void *data, void *prevp,
> > +builtin_memset_read_str (void *data, void *prev,
> >  HOST_WIDE_INT offset ATTRIBUTE_UNUSED,
> >  scalar_int_mode mode)
> >  {
> > -  by_pieces_prev *prev = (by_pieces_prev *) prevp;
> > -  if (prev != nullptr && prev->data != nullptr)
> > -{
> > -  /* Use the previous data in the same mode.  */
> > -  if (prev->mode == mode)
> > -   return prev->data;
> > -}
> > -
> > -  const char *c = (const char *) data;
> > -  char *p = XALLOCAVEC (char, GET_MODE_SIZE (mode));
> > -
> > -  memset (p, *c, GET_MODE_SIZE (mode));
> > -
> > -  return c_readstr (p, mode);
> > +  return targetm.read_memset_value ((const char *) data, prev, mode);
> >  }
> >
> >  /* Callback routine for store_by_pieces.  Return the RTL of a register
> > @@ -6611,37 +6598,11 @@ builtin_memset_read_str (void *data, void *prevp,
> > nullptr, it has the RTL info from the previous iteration.  */
> >
> >  static rtx
> > -builtin_memset_gen_str (void *data, void *prevp,
> > +builtin_memset_gen_str (void *data, void *prev,
> > HOST_WIDE_INT offset ATTRIBUTE_UNUSED,
> > scalar_int_mode mode)
> >  {
> > -  rtx target, coeff;
> > -  size_t size;
> > -  char *p;
> > -
> > -  by_pieces_prev *prev = (by_pieces_prev *) prevp;
> > -  if (prev != nullptr && prev->data != nullptr)
> > -{
> > -  /* Use the previous data in the same mode.  */
> > -  if (prev->mode == mode)
> > -   return prev->data;
> > -
> > -  target = simplify_gen_subreg (mode, prev->data, prev->mode, 0);
> > -  if (target != nullptr)
> > -   return target;
> > -}
> > -
> > -  size = GET_MODE_SIZE (mode);
> > -  if (size == 1)
> > -return (rtx) data;
> > -
> > -  p = XALLOCAVEC (char, size);
> > -  memset (p, 1, size);
> > -  coeff = c_readstr (p, mode);
> > -
> > -  target = convert_to_mode (mode, (rtx) data, 1);
> > -  target = expand_mult (mode, target, coeff, NULL_RTX, 1);
> > -  return force_reg (mode, target);
> > +  return targetm.gen_memset_value ((rtx) data, prev, mode);
> >  }
> >
> >  /* Expand expression EXP, which is a call to the memset builtin.  Return
> > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> > index 85ea9395560..51385044e76 100644
> > --- a/gcc/doc/tm.texi
> > +++ b/gcc/doc/tm.texi
> > @@ -11868,6 +11868,22 @@ This function prepares to emit a conditional 
> > comparison within a sequence
> >   @var{bit_code} is @code{AND} or @code{IOR}, which is the op on the 
> > compares.
> >  @end deftypefn
> >
> > +@deftypefn {Target Hook} rtx TARGET_READ_MEMSET_VALUE (const char 
> > *@var{c}, void *@var{prev}, scalar_int_mode @var{mode})
> > +This function returns the RTL of a constant integer corresponding to
> > > +target reading @code{GET_MODE_SIZE (@var{mode})} bytes from the string
> > > +constant @var{str}.  If @var{prev} is not @samp{nullptr}, it contains
> > > +the RTL information from the previous iteration.
> > +@end deftypefn
> > +
> > +@deftypefn {Target Hook} rtx TARGET_GEN_MEMSET_VALUE (rtx @var{data}, void 
> > *@var{prev}, scalar_int_mode @var{mode})
> > +This function returns the RTL of a register containing
> > +@code{GET_MODE_SIZE (@var{mode})} consecutive copies of the unsigned
> > +char value given in the RTL register @var{data}.  For example, if
>
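
As an aside on the "consecutive copies" wording: the code removed from builtin_memset_gen_str above builds such a value by filling a buffer with 1 bytes, reading it back as a coefficient and multiplying the byte by it. A minimal host-side sketch of that idea for a 64-bit value follows; the function names are made up and nothing here is GCC API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Replicate byte C into all eight bytes of a 64-bit value by
   multiplying with the coefficient 0x0101010101010101.  */
static uint64_t
splat_byte_mul (uint8_t c)
{
  uint64_t coeff;
  memset (&coeff, 1, sizeof coeff);     /* every byte is 0x01 */
  return (uint64_t) c * coeff;          /* cannot overflow: c <= 0xff */
}

/* Reference: let memset do the replication directly.  */
static uint64_t
splat_byte_memset (uint8_t c)
{
  uint64_t v;
  memset (&v, c, sizeof v);
  return v;
}

/* Check that both approaches agree for every byte value.  */
static void
check_splat (void)
{
  for (unsigned c = 0; c <= 0xff; c++)
    assert (splat_byte_mul ((uint8_t) c) == splat_byte_memset ((uint8_t) c));
}
```

Since every byte of the result is identical, the outcome does not depend on host byte order.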

[PATCH] Avoid marking TARGET_MEM_REF bases addressable

2021-05-19 Thread Richard Biener

The following no longer marks TARGET_MEM_REF bases addressable,
mimicking MEM_REF's behavior here.  In contrast to the latter,
TARGET_MEM_REF RTL expansion expects to always operate on memory
though, so make sure we expand them so.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-05-19  Richard Biener  

* cfgexpand.c (discover_nonconstant_array_refs_r): Make
sure TARGET_MEM_REF bases are expanded as memory.
* tree-ssa-operands.c (operands_scanner::get_tmr_operands):
Do not mark TARGET_MEM_REF bases addressable.
* tree-ssa.c (non_rewritable_mem_ref_base): Handle
TARGET_MEM_REF bases as never rewritable.
* gimple-walk.c (walk_stmt_load_store_addr_ops): Do not
walk TARGET_MEM_REF bases as address-takens.
* tree-ssa-dce.c (ref_may_be_aliased): Handle TARGET_MEM_REF.
---
 gcc/cfgexpand.c | 10 ++
 gcc/gimple-walk.c   |  8 
 gcc/tree-ssa-dce.c  |  2 +-
 gcc/tree-ssa-operands.c |  4 +++-
 gcc/tree-ssa.c  | 10 ++
 5 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 3e6f7cafc4c..39e5b040427 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -6280,10 +6280,12 @@ discover_nonconstant_array_refs_r (tree * tp, int 
*walk_subtrees,
 }
   /* References of size POLY_INT_CST to a fixed-size object must go
  through memory.  It's more efficient to force that here than
- to create temporary slots on the fly.  */
-  else if ((TREE_CODE (t) == MEM_REF || TREE_CODE (t) == TARGET_MEM_REF)
-  && TYPE_SIZE (TREE_TYPE (t))
-  && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t))))
+ to create temporary slots on the fly.
+ RTL expansion expects TARGET_MEM_REF to always address actual memory.  */
+  else if (TREE_CODE (t) == TARGET_MEM_REF
+  || (TREE_CODE (t) == MEM_REF
+  && TYPE_SIZE (TREE_TYPE (t))
+  && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t)))))
 {
   tree base = get_base_address (t);
   if (base
diff --git a/gcc/gimple-walk.c b/gcc/gimple-walk.c
index f8b0482564b..e4a55f1eeb6 100644
--- a/gcc/gimple-walk.c
+++ b/gcc/gimple-walk.c
@@ -748,10 +748,6 @@ walk_stmt_load_store_addr_ops (gimple *stmt, void *data,
{
  if (TREE_CODE (rhs) == ADDR_EXPR)
ret |= visit_addr (stmt, TREE_OPERAND (rhs, 0), arg, data);
- else if (TREE_CODE (rhs) == TARGET_MEM_REF
-  && TREE_CODE (TMR_BASE (rhs)) == ADDR_EXPR)
-   ret |= visit_addr (stmt, TREE_OPERAND (TMR_BASE (rhs), 0), arg,
-  data);
  else if (TREE_CODE (rhs) == OBJ_TYPE_REF
   && TREE_CODE (OBJ_TYPE_REF_OBJECT (rhs)) == ADDR_EXPR)
ret |= visit_addr (stmt, TREE_OPERAND (OBJ_TYPE_REF_OBJECT (rhs),
@@ -770,10 +766,6 @@ walk_stmt_load_store_addr_ops (gimple *stmt, void *data,
 TREE_OPERAND (OBJ_TYPE_REF_OBJECT (val),
   0), arg, data);
}
-  lhs = gimple_assign_lhs (stmt);
- if (TREE_CODE (lhs) == TARGET_MEM_REF
-  && TREE_CODE (TMR_BASE (lhs)) == ADDR_EXPR)
-   ret |= visit_addr (stmt, TREE_OPERAND (TMR_BASE (lhs), 0), lhs, 
data);
}
   if (visit_load)
{
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index c091868a313..def6ae69e24 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -452,7 +452,7 @@ ref_may_be_aliased (tree ref)
   gcc_assert (TREE_CODE (ref) != WITH_SIZE_EXPR);
   while (handled_component_p (ref))
 ref = TREE_OPERAND (ref, 0);
-  if (TREE_CODE (ref) == MEM_REF
+  if ((TREE_CODE (ref) == MEM_REF || TREE_CODE (ref) == TARGET_MEM_REF)
   && TREE_CODE (TREE_OPERAND (ref, 0)) == ADDR_EXPR)
 ref = TREE_OPERAND (TREE_OPERAND (ref, 0), 0);
   return !(DECL_P (ref)
diff --git a/gcc/tree-ssa-operands.c b/gcc/tree-ssa-operands.c
index 2fc74d51917..c15575416dd 100644
--- a/gcc/tree-ssa-operands.c
+++ b/gcc/tree-ssa-operands.c
@@ -669,7 +669,9 @@ operands_scanner::get_tmr_operands(tree expr, int flags)
 gimple_set_has_volatile_ops (stmt, true);
 
   /* First record the real operands.  */
-  get_expr_operands (&TMR_BASE (expr), opf_use | (flags & opf_no_vops));
+  get_expr_operands (&TMR_BASE (expr),
+opf_non_addressable | opf_use
+| (flags & (opf_no_vops|opf_not_non_addressable)));
   get_expr_operands (&TMR_INDEX (expr), opf_use | (flags & opf_no_vops));
   get_expr_operands (&TMR_INDEX2 (expr), opf_use | (flags & opf_no_vops));
 
diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c
index cf54c891426..4cc400d3c2e 100644
--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -1494,6 +1494,16 @@ non_rewritable_mem_ref_base (tree ref)
   return decl;
 }
 
+  /* We cannot rewrite TARGET_MEM_REFs.  */
+  if (TREE_CODE (base) == TARGET_MEM_REF
+  && TREE_CODE (TREE_OPERAND (base, 0)) == ADDR_EXPR

Re: [PATCH][DOCS] Remove install-old.texi

2021-05-19 Thread Martin Liška


On 5/18/21 10:18 PM, Joseph Myers wrote:

On Tue, 18 May 2021, Martin Liška wrote:


+@quotation
+aix7.1, aix7.2, amdhsa, androideabi, aout, cygwin, darwin, darwin10, darwin7,
+darwin8, darwin9, eabi, eabialtivec, eabisim, eabisimaltivec, elf, elf32,
+elfbare, elfoabi, freebsd4, freebsd6, gnu, hpux, hpux10.1, hpux11.0, hpux11.3,
+hpux11.9, linux, linux_altivec, lynxos, mingw32, mingw32crt, mmixware, 
msdosdjgpp,
+musl, netbsd, netbsdelf, netbsdelf9, none, openbsd, qnx, rtems, solaris2.11,
+symbianelf, tpf, uclibc, uclinux, uclinux_eabi, vms, vxworks, vxworksae,
+vxworksmils
+@end quotation


I'm not sure where this list comes from


I split parts in contrib/config-list.mk and printed them.


but I'd expect "linux" to be the
canonical "linux-gnu", along with "linux-uclibc", "linux-android",
"linux-musl" ("uclibc" etc. aren't system names on their own) and variants
with "eabi" or "eabihf" on the end (see what config.guess produces for
Arm).


One needs an Arm machine for that. Do you know of a better way to get
a list of all systems?

Thanks,
Martin

[Patch] Testsuite/Fortran: gfortran.dg/pr96711.f90 - fix expected value for PowerPC [PR96983]

2021-05-19 Thread Tobias Burnus


Regarding gfortran.dg/pr96711.f90:

On my x86-64-gnu-linux, it PASSes.
On our powerpc64le-linux-gnu it FAILS with
'STOP 3' (→ also scan-dump count) and 'STOP 4'.

Contrary to PR96983's bug summary, I don't get an ICE.


On powerpc64le-linux-gnu, the following condition evaluates true (→ 'STOP 3'):

   real(16)   :: y   ! 128bit REAL
   integer(16), parameter :: k2 = nint (2 / epsilon (y), kind(k2))
   integer(16), parameter :: m2 = 10384593717069655257060992658440192_16 !2**113
   if (k2 /= m2) stop 3

On x86_64-linux-gnu, k2 == m2 — but on powerpc64le-linux-gnu,
k2 == 2**106 instead of 2**113.

My solution is to permit also 2**106 besides 2**113.
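
The two candidate values follow from the number of significand bits p of the real kind: epsilon is 2**(1-p), so 2/epsilon is 2**p. A minimal C sketch of that relation for double precision (the m1 case in the test), assuming a radix-2 double with 53 significand bits; the function name is made up for illustration:

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

/* Compute 2 / epsilon for double and check that it equals
   2**DBL_MANT_DIG, i.e. 2 raised to the number of significand bits.  */
static uint64_t
two_over_epsilon (void)
{
  assert (FLT_RADIX == 2);
  assert (DBL_MANT_DIG < 64);
  uint64_t from_epsilon = (uint64_t) (2.0 / DBL_EPSILON);
  uint64_t from_digits = UINT64_C (1) << DBL_MANT_DIG;
  assert (from_epsilon == from_digits);
  return from_epsilon;
}
```

By the same relation a 113-bit significand gives 2**113 and a 106-bit one gives 2**106. The latter would match a real(16) implemented as IBM double-double on PowerPC, if that is indeed the format in use here.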

@PowerPC maintainers: Does this make sense? – It seems to work on our PowerPC
but with all the new 'long double' changes, does it also work for you?

@All, Harald: Does the attached patch make sense?

Tobias

PS: I find the PR a bit confusing – there are some GCC 11 commits
and a GCC 12 commit but no real explanation what now works and what
still fails.

Testsuite/Fortran: gfortran.dg/pr96711.f90 - fix expected value for PowerPC [PR96983]

gcc/testsuite/ChangeLog:

	PR fortran/96983
	* gfortran.dg/pr96711.f90:

diff --git a/gcc/testsuite/gfortran.dg/pr96711.f90 b/gcc/testsuite/gfortran.dg/pr96711.f90
index 3761a8ea416..0597ff2e37e 100644
--- a/gcc/testsuite/gfortran.dg/pr96711.f90
+++ b/gcc/testsuite/gfortran.dg/pr96711.f90
@@ -1,28 +1,29 @@
 ! { dg-do run }
 ! { dg-require-effective-target fortran_integer_16 }
 ! { dg-require-effective-target fortran_real_16 }
 ! { dg-additional-options "-fdump-tree-original" }
 ! { dg-final { scan-tree-dump-times "_gfortran_stop_numeric" 2 "original" } }
 !
 ! PR fortran/96711 - ICE on NINT() Function
 
 program p
   implicit none
   real(8):: x
   real(16)   :: y
   integer(16), parameter :: k1 = nint (2 / epsilon (x), kind(k1))
   integer(16), parameter :: k2 = nint (2 / epsilon (y), kind(k2))
   integer(16), parameter :: m1 = 9007199254740992_16!2**53
-  integer(16), parameter :: m2 = 10384593717069655257060992658440192_16 !2**113
+  integer(16), parameter :: m2 = 10384593717069655257060992658440192_16 !2**113  ! Some systems like x86-64
+  integer(16), parameter :: m2a = 81129638414606681695789005144064_16   !2**106  ! Some systems like PowerPC
   integer(16), volatile  :: m
   x = 2 / epsilon (x)
   y = 2 / epsilon (y)
   m = nint (x, kind(m))
 ! print *, m
   if (k1 /= m1) stop 1
   if (m  /= m1) stop 2
   m = nint (y, kind(m))
 ! print *, m
-  if (k2 /= m2) stop 3
-  if (m  /= m2) stop 4
+  if (k2 /= m2 .and. k2 /= m2a) stop 3
+  if (m  /= m2 .and. m /= m2a) stop 4
 end program

RE: [GCC-10 backport][PATCH] arm: _Generic feature failing with ICE for -O0 (pr97205).

2021-05-19 Thread Srinath Parvathaneni via Gcc-patches

Ping!!

> -----Original Message-----
> From: Srinath Parvathaneni 
> Sent: 30 April 2021 16:24
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> 
> Subject: [GCC-10 backport][PATCH] arm: _Generic feature failing with ICE for
> -O0 (pr97205).
> 
> Hi,
> 
> This is a backport to GCC-10 to fix PR97205, patch applies cleanly on the
> branch.
> 
> Regression tested and found no issues.
> 
> Ok for GCC-10 backport?
> 
> Regards,
> Srinath.
> 
> This makes sure that stack allocated SSA_NAMEs are
> at least MODE_ALIGNED.  Also increase the MEM_ALIGN
> for the corresponding rtl objects.
> 
> gcc:
> 2020-11-03  Bernd Edlinger  
> 
> PR target/97205
> * cfgexpand.c (align_local_variable): Make SSA_NAMEs
> at least MODE_ALIGNED.
> (expand_one_stack_var_at): Increase MEM_ALIGN for SSA_NAMEs.
> 
> gcc/testsuite:
> 2020-11-03  Bernd Edlinger  
> 
> PR target/97205
> * gcc.c-torture/compile/pr97205.c: New test.
> 
> (cherry picked from commit
> 23ac7a009ecfeec3eab79136abed8aac9768b458)
> 
> 
> ### Attachment also inlined for ease of reply
> ###
> 
> 
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index bf4f194ed993134109cc21be9cb0ed8a5c170824..4fef5d6ebf420ce4d6f59606ecd064f45ae59065 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -366,7 +366,15 @@ align_local_variable (tree decl, bool really_expand)
>unsigned int align;
> 
>if (TREE_CODE (decl) == SSA_NAME)
> -align = TYPE_ALIGN (TREE_TYPE (decl));
> +{
> +  tree type = TREE_TYPE (decl);
> +  machine_mode mode = TYPE_MODE (type);
> +
> +  align = TYPE_ALIGN (type);
> +  if (mode != BLKmode
> +   && align < GET_MODE_ALIGNMENT (mode))
> + align = GET_MODE_ALIGNMENT (mode);
> +}
>else
>  {
>align = LOCAL_DECL_ALIGNMENT (decl);
> @@ -999,20 +1007,21 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>x = plus_constant (Pmode, base, offset);
>x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
>  ? TYPE_MODE (TREE_TYPE (decl))
> -: DECL_MODE (SSAVAR (decl)), x);
> +: DECL_MODE (decl), x);
> +
> +  /* Set alignment we actually gave this decl if it isn't an SSA name.
> + If it is we generate stack slots only accidentally so it isn't as
> + important, we'll simply set the alignment directly on the MEM.  */
> +
> +  if (base == virtual_stack_vars_rtx)
> +offset -= frame_phase;
> +  align = known_alignment (offset);
> +  align *= BITS_PER_UNIT;
> +  if (align == 0 || align > base_align)
> +align = base_align;
> 
>if (TREE_CODE (decl) != SSA_NAME)
>  {
> -  /* Set alignment we actually gave this decl if it isn't an SSA name.
> - If it is we generate stack slots only accidentally so it isn't as
> -  important, we'll simply use the alignment that is already set.  */
> -  if (base == virtual_stack_vars_rtx)
> - offset -= frame_phase;
> -  align = known_alignment (offset);
> -  align *= BITS_PER_UNIT;
> -  if (align == 0 || align > base_align)
> - align = base_align;
> -
>/* One would think that we could assert that we're not decreasing
>alignment here, but (at least) the i386 port does exactly this
>via the MINIMUM_ALIGNMENT hook.  */
> @@ -1022,6 +1031,8 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>  }
> 
>set_rtl (decl, x);
> +
> +  set_mem_align (x, align);
>  }
> 
>  class stack_vars_data
> @@ -1327,13 +1338,11 @@ expand_one_stack_var_1 (tree var)
>  {
>tree type = TREE_TYPE (var);
>size = tree_to_poly_uint64 (TYPE_SIZE_UNIT (type));
> -  byte_align = TYPE_ALIGN_UNIT (type);
>  }
>else
> -{
> -  size = tree_to_poly_uint64 (DECL_SIZE_UNIT (var));
> -  byte_align = align_local_variable (var, true);
> -}
> +size = tree_to_poly_uint64 (DECL_SIZE_UNIT (var));
> +
> +  byte_align = align_local_variable (var, true);
> 
>/* We handle highly aligned variables in expand_stack_vars.  */
>gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr97205.c b/gcc/testsuite/gcc.c-torture/compile/pr97205.c
> new file mode 100644
> index ..6600011fcf84660edcba8d968c78ee6aaa0aa923
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/compile/pr97205.c
> @@ -0,0 +1,7 @@
> +int a;
> +typedef __attribute__((aligned(2))) int x;
> +int f ()
> +{
> +  x b = a;
> +  return b;
> +}

Re: [PATCH] phiopt: Simplify (X & Y) == X -> (X & ~Y) == 0 even in presence of integral conversions [PR94589]

2021-05-19 Thread Christophe Lyon via Gcc-patches

On Wed, 19 May 2021 at 13:29, Christophe Lyon
 wrote:
>
> On Wed, 19 May 2021 at 13:13, Richard Biener  wrote:
> >
> > On Wed, 19 May 2021, Jakub Jelinek wrote:
> >
> > > > On Wed, May 19, 2021 at 11:09:19AM +0200, Jakub Jelinek via Gcc-patches wrote:
> > > > > On Wed, May 19, 2021 at 10:15:53AM +0200, Christophe Lyon via Gcc-patches wrote:
> > > > > After this update, the test fails on arm and aarch64: according to the
> > > > > logs, the optimization is still performed 14 times.
> > > >
> > > > Seems this is because
> > > >   if (change
> > > >   && !flag_syntax_only
> > > >   && (load_extend_op (TYPE_MODE (TREE_TYPE (and0)))
> > > >   == ZERO_EXTEND))
> > > > {
> > > >   tree uns = unsigned_type_for (TREE_TYPE (and0));
> > > >   and0 = fold_convert_loc (loc, uns, and0);
> > > >   and1 = fold_convert_loc (loc, uns, and1);
> > > > }
> > > > in fold-const.c adds on these targets extra casts that prevent the
> > > > optimizations.
> > >
> > > > This patch seems to fix it (but I don't have an easy way to test on
> > > > aarch64 or arm on the trunk and 11 branch would need numerous backports).
> >
> > OK if somebody manages to test on arm/aarch64.
> >
> I confirm it fixes the problem on arm. (aarch64 in progress)
>
And aarch64 is OK too.

> Thanks!
>
> > Richard.
> >
> > > 2021-05-19  Jakub Jelinek  
> > >
> > >   PR tree-optimization/94589
> > > >   * match.pd ((X & Y) == X -> (X & ~Y) == 0): Simplify even in
> > > >   presence of integral conversions.
> > >
> > > --- gcc/match.pd.jj   2021-05-15 10:10:28.0 +0200
> > > +++ gcc/match.pd  2021-05-19 11:34:42.130624557 +0200
> > > @@ -4769,6 +4769,16 @@ (define_operator_list COND_TERNARY
> > >   (simplify
> > >(cmp:c (bit_and:c @0 @1) @0)
> > >(cmp (bit_and @0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); }))
> > > + (simplify
> > > +  (cmp:c (convert@3 (bit_and (convert@2 @0) INTEGER_CST@1)) (convert @0))
> > > +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> > > +   && INTEGRAL_TYPE_P (TREE_TYPE (@2))
> > > +   && INTEGRAL_TYPE_P (TREE_TYPE (@3))
> > > > +   && TYPE_PRECISION (TREE_TYPE (@2)) == TYPE_PRECISION (TREE_TYPE (@0))
> > > > +   && TYPE_PRECISION (TREE_TYPE (@3)) > TYPE_PRECISION (TREE_TYPE (@2))
> > > +   && !wi::neg_p (wi::to_wide (@1)))
> > > +   (cmp (bit_and @0 (convert (bit_not @1)))
> > > + { build_zero_cst (TREE_TYPE (@0)); })))
> > >
> > >   /* (X | Y) == Y becomes (X & ~Y) == 0.  */
> > >   (simplify
> > >
> > >
> > >   Jakub

Add 'libgomp.oacc-c-c++-common/loop-gwv-2.c' (was: [PATCH, OpenACC] Add support for gang local storage allocation in shared memory)

2021-05-19 Thread Thomas Schwinge

Hi!

On 2018-08-13T21:41:50+0100, Julian Brown  wrote:
> On Mon, 13 Aug 2018 11:42:26 -0700 Cesar Philippidis wrote:
>> On 08/13/2018 09:21 AM, Julian Brown wrote:
>> > diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
>> > new file mode 100644
>> > index 000..2fa708a
>> > --- /dev/null
>> > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
>> > @@ -0,0 +1,106 @@
>> > +/* { dg-xfail-run-if "gangprivate failure" { openacc_nvidia_accel_selected } { "-O0" } { "" } } */

>> is the above xfail still necessary? It seems to xpass
>> for me on nvptx. However, I see this regression on the host:
>>
>> FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/loop-gwv-2.c
>> -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1  -O2  execution test

> Oops, this was the version of the patch I meant to post (and the one I
> tested). The XFAIL on loop-gwv-2.c isn't necessary, plus that test
> needed some other fixes to make it pass for NVPTX (it was written for
> GCN to start with).

As I found out later, this testcase actually works without the
code changes (OpenACC privatization levels) that it accompanies -- and
I don't see anything in the testcase that would exercise those code
changes.  Maybe it was written for some earlier revision of these code
changes?  Anyway, as it's all-PASS for all systems that I've tested on,
I've now pushed "Add 'libgomp.oacc-c-c++-common/loop-gwv-2.c'" to master
branch in commit 5a16fb19e7c4274f8dd9bbdd30d7d06fe2eff8af, see attached.


Grüße
 Thomas


From 5a16fb19e7c4274f8dd9bbdd30d7d06fe2eff8af Mon Sep 17 00:00:00 2001
From: Julian Brown 
Date: Mon, 13 Aug 2018 21:41:50 +0100
Subject: [PATCH] Add 'libgomp.oacc-c-c++-common/loop-gwv-2.c'

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: New.
---
 .../libgomp.oacc-c-c++-common/loop-gwv-2.c| 95 +++
 1 file changed, 95 insertions(+)
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
new file mode 100644
index 000..a4f81a39e24
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
@@ -0,0 +1,95 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#if 0
+#define DEBUG(DIM, IDX, VAL) \
+  fprintf (stderr, "%sdist[%d] = %d\n", (DIM), (IDX), (VAL))
+#else
+#define DEBUG(DIM, IDX, VAL)
+#endif
+
+#define N (32*32*32)
+
+int
+check (const char *dim, int *dist, int dimsize)
+{
+  int ix;
+  int exit = 0;
+
+  for (ix = 0; ix < dimsize; ix++)
+{
+  DEBUG(dim, ix, dist[ix]);
+  if (dist[ix] < (N) / (dimsize + 0.5)
+	  || dist[ix] > (N) / (dimsize - 0.5))
+	{
+	  fprintf (stderr, "did not distribute to %ss (%d not between %d "
+		   "and %d)\n", dim, dist[ix], (int) ((N) / (dimsize + 0.5)),
+		   (int) ((N) / (dimsize - 0.5)));
+	  exit |= 1;
+	}
+}
+
+  return exit;
+}
+
+int main ()
+{
+  int ary[N];
+  int ix;
+  int exit = 0;
+  int gangsize = 0, workersize = 0, vectorsize = 0;
+  int *gangdist, *workerdist, *vectordist;
+
+  for (ix = 0; ix < N;ix++)
+ary[ix] = -1;
+
+#pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) \
+	copy(ary) copyout(gangsize, workersize, vectorsize)
+  {
+#pragma acc loop gang worker vector
+for (unsigned ix = 0; ix < N; ix++)
+  {
+	int g, w, v;
+
+	g = __builtin_goacc_parlevel_id (GOMP_DIM_GANG);
+	w = __builtin_goacc_parlevel_id (GOMP_DIM_WORKER);
+	v = __builtin_goacc_parlevel_id (GOMP_DIM_VECTOR);
+
+	ary[ix] = (g << 16) | (w << 8) | v;
+  }
+
+gangsize = __builtin_goacc_parlevel_size (GOMP_DIM_GANG);
+workersize = __builtin_goacc_parlevel_size (GOMP_DIM_WORKER);
+vectorsize = __builtin_goacc_parlevel_size (GOMP_DIM_VECTOR);
+  }
+
+  gangdist = (int *) alloca (gangsize * sizeof (int));
+  workerdist = (int *) alloca (workersize * sizeof (int));
+  vectordist = (int *) alloca (vectorsize * sizeof (int));
+  memset (gangdist, 0, gangsize * sizeof (int));
+  memset (workerdist, 0, workersize * sizeof (int));
+  memset (vectordist, 0, vectorsize * sizeof (int));
+
+  /* Test that work is shared approximately equally amongst each active
+ gang/worker/vector.  */
+  for (ix = 0; ix < N; ix++)
+{
+  int g = (ary[ix] >> 16) & 255;
+  int w = (ary[ix] >> 8) & 255;
+  int v = ary[ix] & 255;
+
+  gangdist[g]++;
+  workerdist[w]++;
+  vectordist[v]++;
+}
+
+  exit = check ("gang", gangdist, gangsize);
+  exit |= check ("worker", workerdist, workersize);
+  exit |= check ("vector", vectordist, vectorsize);
+
+  return exit;
+}
-- 
2.30.2
