Re: rs6000: Generate an lxvp instead of two adjacent lxv instructions

2021-07-09 Thread Peter Bergner via Gcc-patches
On 7/9/21 6:14 PM, Peter Bergner wrote:
> ...code section.  Does this look better?  I'm currently running bootstraps
> and regtests on LE and BE.

Bootstrap and regtesting on both LE and BE showed no regressions.


Peter


Re: disable -Warray-bounds in libgo (PR 101374)

2021-07-09 Thread Ian Lance Taylor via Gcc-patches
On Thu, Jul 8, 2021 at 11:16 PM Richard Biener
 wrote:
>
> On Thu, Jul 8, 2021 at 8:02 PM Martin Sebor via Gcc-patches
>  wrote:
> >
> > Hi Ian,
> >
> > Yesterday's enhancement to -Warray-bounds has exposed a couple of
> > issues in libgo where the code writes into an invalid constant
> > address that the warning is designed to flag.
> >
> > On the assumption that those invalid addresses are deliberate,
> > the attached patch suppresses these instances by using #pragma
> > GCC diagnostic but I don't think I'm supposed to commit it (at
> > least Git won't let me).  To avoid Go bootstrap failures please
> > either apply the patch or otherwise suppress the warning (e.g.,
> > by using a volatile pointer temporary).
>
> Btw, I don't think we should diagnose things like
>
> *(int*)0x21 = 0x21;
>
> when somebody literally writes that he'll be just annoyed by diagnostics.
>
> Of course the above might be able to use __builtin_trap (); - it looks
> like it is placed where control flow should never end, kind of a
> __builtin_unreachable (), which means abort () might do as well.


I agree.  While this code is certainly intentional, abort will work
just as well in practice.  I committed the following to change it.

Ian
a15210699cbc60bc9ed077549dcd5288a295f42c
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index ab1384d698b..4d0f44f2dd2 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-01cb2b5e69a2d08ef3cc1ea023c22ed9b79f5114
+adcf10890833026437a94da54934ce50c0018309
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/runtime/proc.c b/libgo/runtime/proc.c
index 38bf7a6b255..3a30748d329 100644
--- a/libgo/runtime/proc.c
+++ b/libgo/runtime/proc.c
@@ -594,7 +594,7 @@ runtime_mstart(void *arg)
gp->entry = nil;
gp->param = nil;
__builtin_call_with_static_chain(pfn(gp1), fv);
-   *(int*)0x21 = 0x21;
+   abort();
}
 
if(mp->exiting) {
@@ -662,7 +662,7 @@ setGContext(void)
gp->entry = nil;
gp->param = nil;
__builtin_call_with_static_chain(pfn(gp1), fv);
-   *(int*)0x22 = 0x22;
+   abort();
}
 }
 
diff --git a/libgo/runtime/runtime_c.c b/libgo/runtime/runtime_c.c
index 18222c14465..bc920a5d406 100644
--- a/libgo/runtime/runtime_c.c
+++ b/libgo/runtime/runtime_c.c
@@ -116,7 +116,7 @@ runtime_signalstack(byte *p, uintptr n)
if(p == nil)
st.ss_flags = SS_DISABLE;
if(sigaltstack(&st, nil) < 0)
-   *(int *)0xf1 = 0xf1;
+   abort();
 }
 
 int32 go_open(char *, int32, int32)


Re: [PATCH 00/10] vect: Reuse reduction accumulators between loops

2021-07-09 Thread Kewen.Lin via Gcc-patches
Hi Richard,

on 2021/7/8 8:38 PM, Richard Sandiford via Gcc-patches wrote:
> Quoting from the final patch in the series:
> 
> 
> This patch adds support for reusing a main loop's reduction accumulator
> in an epilogue loop.  This in turn lets the loops share a single piece
> of vector->scalar reduction code.
> 
> The patch has the following restrictions:
> 
> (1) The epilogue reduction can only operate on a single vector
> (e.g. ncopies must be 1 for non-SLP reductions, and the group size
> must be <= the element count for SLP reductions).
> 
> (2) Both loops must use the same vector mode for their accumulators.
> This means that the patch is restricted to targets that support
> --param vect-partial-vector-usage=1.
> 
> (3) The reduction must be a standard “tree code” reduction.
> 
> However, these restrictions could be lifted in future.  For example,
> if the main loop operates on 128-bit vectors and the epilogue loop
> operates on 64-bit vectors, we could in future reduce the 128-bit
> vector by one stage and use the 64-bit result as the starting point
> for the epilogue result.
> 
> The patch tries to handle chained SLP reductions, unchained SLP
> reductions and non-SLP reductions.  It also handles cases in which
> the epilogue loop is entered directly (rather than via the main loop)
> and cases in which the epilogue loop can be skipped.
> 
> 
> However, it ended up being difficult to do that without some preparatory
> clean-ups.  Some of them could probably stand on their own, but others
> are a bit “meh” without the final patch to justify them.
> 
> The diff below shows the effect of the patch when compiling:
> 
>   unsigned short __attribute__((noipa))
>   add_loop (unsigned short *x, int n)
>   {
> unsigned short res = 0;
> for (int i = 0; i < n; ++i)
>   res += x[i];
> return res;
>   }
> 
> with -O3 --param vect-partial-vector-usage=1 on an SVE target:
> 
> add_loop: add_loop:
> .LFB0:.LFB0:
>   .cfi_startproc  .cfi_startproc
>   mov x4, x0<
>   cmp w1, 0   cmp w1, 0
>   ble .L7 ble .L7
>   cnthx0| cnthx4
>   sub w2, w1, #1  sub w2, w1, #1
>   sub w3, w0, #1| sub w3, w4, #1
>   cmp w2, w3  cmp w2, w3
>   bcc .L8 bcc .L8
>   sub w0, w1, w0| sub w4, w1, w4
>   mov x3, 0   mov x3, 0
>   cnthx5  cnthx5
>   mov z0.b, #0mov z0.b, #0
>   ptrue   p0.b, all   ptrue   p0.b, all
>   .p2align 3,,7   .p2align 3,,7
> .L4:  .L4:
>   ld1hz1.h, p0/z, [x4, x3,  | ld1hz1.h, p0/z, [x0, x3, 
>   mov x2, x3  mov x2, x3
>   add x3, x3, x5  add x3, x3, x5
>   add z0.h, z0.h, z1.hadd z0.h, z0.h, z1.h
>   cmp w0, w3| cmp w4, w3
>   bcs .L4 bcs .L4
>   uaddv   d0, p0, z0.h  <
>   umovw0, v0.h[0]   <
>   inchx2  inchx2
>   and w0, w0, 65535 <
>   cmp w1, w2  cmp w1, w2
>   beq .L2   | beq .L6
> .L3:  .L3:
>   sub w1, w1, w2  sub w1, w1, w2
>   mov z1.b, #0  | add x2, x0, w2, uxtw 1
>   whilelo p0.h, wzr, w1   whilelo p0.h, wzr, w1
>   add x2, x4, w2, uxtw 1| ld1hz1.h, p0/z, [x2]
>   ptrue   p1.b, all | add z0.h, p0/m, z0.h, z1.
>   ld1hz0.h, p0/z, [x2]  | .L6:
>   sel z0.h, p0, z0.h, z1.h  | ptrue   p0.b, all
>   uaddv   d0, p1, z0.h  | uaddv   d0, p0, z0.h
>   fmovx1, d0| umovw0, v0.h[0]
>   add w0, w0, w1, uxth  <
>   and w0, w0, 65535   and w0, w0, 65535
> .L2:<
>   ret ret
>   .p2align 2,,3   .p2align 2,,3
> .L7:  .L7:
>   mov w0, 0   mov w0, 0
>   ret ret
> .L8:  .L8:
>   mov w2, 0
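
For readers of the archive, the idea in the quoted cover letter (the epilogue loop keeps adding into the main loop's accumulator, so only one vector->scalar reduction is needed) can be sketched in plain C++. This is only a scalar emulation under stated assumptions: `kLanes` is a hypothetical stand-in for the vector length, and `add_loop_shared_accumulator` is an illustrative name, not something from the patch series.

```cpp
#include <array>
#include <cstdint>

// Hypothetical lane count standing in for the hardware vector length.
constexpr int kLanes = 8;

// Scalar reference: the add_loop function from the quoted message.
std::uint16_t add_loop(const std::uint16_t *x, int n) {
  std::uint16_t res = 0;
  for (int i = 0; i < n; ++i)
    res = static_cast<std::uint16_t>(res + x[i]);
  return res;
}

// Emulation of the "shared accumulator" shape: the main loop and the
// epilogue both add into acc, and acc is reduced to a scalar only once.
std::uint16_t add_loop_shared_accumulator(const std::uint16_t *x, int n) {
  std::array<std::uint16_t, kLanes> acc{};  // one accumulator for both loops
  int i = 0;

  // Main loop: processes full groups of kLanes elements.
  for (; i + kLanes <= n; i += kLanes)
    for (int lane = 0; lane < kLanes; ++lane)
      acc[lane] = static_cast<std::uint16_t>(acc[lane] + x[i + lane]);

  // Epilogue: a partial group (the predicated lanes), added into the
  // same accumulator instead of starting from a fresh zero vector.
  for (int lane = 0; i + lane < n; ++lane)
    acc[lane] = static_cast<std::uint16_t>(acc[lane] + x[i + lane]);

  // Single vector->scalar reduction shared by both loops.
  std::uint16_t res = 0;
  for (std::uint16_t v : acc)
    res = static_cast<std::uint16_t>(res + v);
  return res;
}
```

Because unsigned 16-bit addition wraps, the lane-wise sum and the scalar sum agree for any input length, which is what makes the reordering legal for this reduction.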

Re: rs6000: Generate an lxvp instead of two adjacent lxv instructions

2021-07-09 Thread Peter Bergner via Gcc-patches
On 7/8/21 8:26 PM, Peter Bergner wrote:
> We do need different code for LE versus BE.  So you want something like
> 
>   if (WORDS_BIG_ENDIAN) {...} else {...}
> 
> ...instead?  I can try that to see if the code is easier to read.
[snip]
> Let me make the changes you want and I'll repost with what I come up with.

Ok, I removed the consecutive_mem_locations() function from the previous
patch and just call adjacent_mem_locations() directly now.  I also moved
rs6000_split_multireg_move() to later in the file to fix the declaration
issue.  However, since rs6000_split_multireg_move() is where the new code
was added to emit the lxvp's, it can be hard to see what I changed because
of the move.  I'll note that all of my changes are restricted to within the

  if (GET_CODE (src) == UNSPEC)
{
  gcc_assert (XINT (src, 1) == UNSPEC_MMA_ASSEMBLE);
  ...
}

...code section.  Does this look better?  I'm currently running bootstraps
and regtests on LE and BE.

Peter


gcc/
* config/rs6000/rs6000.c (rs6000_split_multireg_move): Move to later
in the file.  Handle MMA build built-ins with operands in adjacent
memory locations.
(adjacent_mem_locations): Test that MEM1 and MEM2 are MEMs.
Return the lower addressed memory rtx, if any.
(power6_sched_reorder2): Update for adjacent_mem_locations change.


gcc/testsuite/
* gcc.target/powerpc/mma-builtin-9.c: New test.


diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 9a5db63d0ef..8edf7a4a81c 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -16690,382 +16690,6 @@ rs6000_expand_atomic_op (enum rtx_code code, rtx mem, 
rtx val,
 emit_move_insn (orig_after, after);
 }
 
-/* Emit instructions to move SRC to DST.  Called by splitters for
-   multi-register moves.  It will emit at most one instruction for
-   each register that is accessed; that is, it won't emit li/lis pairs
-   (or equivalent for 64-bit code).  One of SRC or DST must be a hard
-   register.  */
-
-void
-rs6000_split_multireg_move (rtx dst, rtx src)
-{
-  /* The register number of the first register being moved.  */
-  int reg;
-  /* The mode that is to be moved.  */
-  machine_mode mode;
-  /* The mode that the move is being done in, and its size.  */
-  machine_mode reg_mode;
-  int reg_mode_size;
-  /* The number of registers that will be moved.  */
-  int nregs;
-
-  reg = REG_P (dst) ? REGNO (dst) : REGNO (src);
-  mode = GET_MODE (dst);
-  nregs = hard_regno_nregs (reg, mode);
-
-  /* If we have a vector quad register for MMA, and this is a load or store,
- see if we can use vector paired load/stores.  */
-  if (mode == XOmode && TARGET_MMA
-  && (MEM_P (dst) || MEM_P (src)))
-{
-  reg_mode = OOmode;
-  nregs /= 2;
-}
-  /* If we have a vector pair/quad mode, split it into two/four separate
- vectors.  */
-  else if (mode == OOmode || mode == XOmode)
-reg_mode = V1TImode;
-  else if (FP_REGNO_P (reg))
-reg_mode = DECIMAL_FLOAT_MODE_P (mode) ? DDmode :
-   (TARGET_HARD_FLOAT ? DFmode : SFmode);
-  else if (ALTIVEC_REGNO_P (reg))
-reg_mode = V16QImode;
-  else
-reg_mode = word_mode;
-  reg_mode_size = GET_MODE_SIZE (reg_mode);
-
-  gcc_assert (reg_mode_size * nregs == GET_MODE_SIZE (mode));
-
-  /* TDmode residing in FP registers is special, since the ISA requires that
- the lower-numbered word of a register pair is always the most significant
- word, even in little-endian mode.  This does not match the usual subreg
- semantics, so we cannnot use simplify_gen_subreg in those cases.  Access
- the appropriate constituent registers "by hand" in little-endian mode.
-
- Note we do not need to check for destructive overlap here since TDmode
- can only reside in even/odd register pairs.  */
-  if (FP_REGNO_P (reg) && DECIMAL_FLOAT_MODE_P (mode) && !BYTES_BIG_ENDIAN)
-{
-  rtx p_src, p_dst;
-  int i;
-
-  for (i = 0; i < nregs; i++)
-   {
- if (REG_P (src) && FP_REGNO_P (REGNO (src)))
-   p_src = gen_rtx_REG (reg_mode, REGNO (src) + nregs - 1 - i);
- else
-   p_src = simplify_gen_subreg (reg_mode, src, mode,
-i * reg_mode_size);
-
- if (REG_P (dst) && FP_REGNO_P (REGNO (dst)))
-   p_dst = gen_rtx_REG (reg_mode, REGNO (dst) + nregs - 1 - i);
- else
-   p_dst = simplify_gen_subreg (reg_mode, dst, mode,
-i * reg_mode_size);
-
- emit_insn (gen_rtx_SET (p_dst, p_src));
-   }
-
-  return;
-}
-
-  /* The __vector_pair and __vector_quad modes are multi-register
- modes, so if we have to load or store the registers, we have to be
- careful to properly swap them if we're in little endian mode
- below.  This means the last register gets the first memory
- location.  We also need to be careful of using the right register

[PATCH libatomic/arm] avoid warning on constant addresses (PR 101379)

2021-07-09 Thread Martin Sebor via Gcc-patches

The attached tweak avoids the new -Warray-bounds instances when
building libatomic for arm. Christophe confirms it resolves
the problem (thank you!)

As we have discussed, the main goal of this class of warnings
is to detect accesses at addresses derived from null pointers
(e.g., to struct members or array elements at a nonzero offset).
Diagnosing accesses at hardcoded addresses is incidental because
at the stage they are detected the two are not distinguishable
from one another.

I'm planning (hoping) to implement detection of invalid pointer
arithmetic involving null for GCC 12, so this patch is a stopgap
solution to unblock the arm libatomic build without compromising
the warning.  Once the new detection is in place these workarounds
can be removed or replaced with something more appropriate (e.g.,
declaring the objects at the hardwired addresses with an attribute
like AVR's address or io; that would enable bounds checking at
those addresses as well).

Martin
PR bootstrap/101379 - libatomic arm build failure after r12-2132 due to -Warray-bounds on a constant address

libatomic/ChangeLog:
	* /config/linux/arm/host-config.h (__kernel_helper_version): New
	function.  Adjust shadow macro.

diff --git a/libatomic/config/linux/arm/host-config.h b/libatomic/config/linux/arm/host-config.h
index 1520f237d73..777d08a2b85 100644
--- a/libatomic/config/linux/arm/host-config.h
+++ b/libatomic/config/linux/arm/host-config.h
@@ -39,8 +39,14 @@ typedef void (__kernel_dmb_t) (void);
 #define __kernel_dmb (*(__kernel_dmb_t *) 0x0fa0)
 
 /* Kernel helper page version number.  */
-#define __kernel_helper_version (*(unsigned int *)0x0ffc)
+static inline unsigned*
+__kernel_helper_version ()
+{
+  unsigned *volatile addr = (unsigned int *)0x0ffc;
+  return addr;
+}
 
+#define __kernel_helper_version (*__kernel_helper_version())
 
 #ifndef HAVE_STREX
 static inline bool
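
The workaround in the patch above (routing the hardcoded address through a volatile pointer temporary so the compiler no longer sees a constant address at the point of the access) can be sketched on its own as follows. The names `kHelperVersionAddr` and `helper_version_ptr` are hypothetical and chosen only for this illustration; the address value is the one from the diff. The sketch only forms the pointer and does not dereference it, since the address is meaningful only on the target the patch is for.

```cpp
#include <cstdint>

// Address taken from the quoted diff; the constant's name is hypothetical.
constexpr std::uintptr_t kHelperVersionAddr = 0x0ffc;

// Returns a pointer to the fixed address.  Storing it in a volatile
// local forces the value to be reloaded, so later passes cannot prove
// that the returned pointer is a constant.
inline unsigned *helper_version_ptr() {
  unsigned *volatile addr = reinterpret_cast<unsigned *>(kHelperVersionAddr);
  return addr;
}
```

On the real target, a caller would read the value with `*helper_version_ptr()`, which is what the shadow macro in the diff expands to.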


Re: [PATCH] c++: permit deduction guides at class scope [PR79501]

2021-07-09 Thread Patrick Palka via Gcc-patches
On Fri, 9 Jul 2021, Jason Merrill wrote:

> On 7/9/21 4:18 PM, Patrick Palka wrote:
> > On Fri, 9 Jul 2021, Patrick Palka wrote:
> > 
> > > On Fri, 9 Jul 2021, Jason Merrill wrote:
> > > 
> > > > On 7/9/21 3:18 PM, Patrick Palka wrote:
> > > > > This adds support for declaring (class-scope) deduction guides for a
> > > > > member class template.  Fortunately it seems only a couple of changes
> > > > > are needed in order for the existing CTAD machinery to handle them
> > > > > like
> > > > > any other deduction guide: we need to make sure to give them a
> > > > > FUNCTION_TYPE instead of a METHOD_TYPE, and we need to avoid using a
> > > > > BASELINK when looking them up.
> > > > > 
> > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
> > > > > for
> > > > > trunk?
> > > > > 
> > > > >   PR c++/79501
> > > > > 
> > > > > gcc/cp/ChangeLog:
> > > > > 
> > > > >   * decl.c (grokfndecl): Don't require that deduction guides are
> > > > >   declared at namespace scope.  Check that class-scope deduction
> > > > >   guides have the same access as the member class template.
> > > > >   (grokdeclarator): Pretend class-scope deduction guides are
> > > > > static.
> > > > >   * name-lookup.c (lookup_qualified_name): Don't use a BASELINK
> > > > >   for class-scope deduction guides.
> > > > > 
> > > > > gcc/testsuite/ChangeLog:
> > > > > 
> > > > >   * g++.dg/cpp1z/class-deduction92.C: New test.
> > > > >   * g++.dg/cpp1z/class-deduction93.C: New test.
> > > > >   * g++.dg/cpp1z/class-deduction94.C: New test.
> > > > > ---
> > > > >gcc/cp/decl.c | 17 -
> > > > >gcc/cp/name-lookup.c  | 11 +---
> > > > >.../g++.dg/cpp1z/class-deduction92.C  | 16 
> > > > >.../g++.dg/cpp1z/class-deduction93.C  | 25
> > > > > +++
> > > > >.../g++.dg/cpp1z/class-deduction94.C  | 19 ++
> > > > >5 files changed, 79 insertions(+), 9 deletions(-)
> > > > >create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction92.C
> > > > >create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction93.C
> > > > >create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction94.C
> > > > > 
> > > > > diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> > > > > index ebe1318d38d..8b8ffb7de83 100644
> > > > > --- a/gcc/cp/decl.c
> > > > > +++ b/gcc/cp/decl.c
> > > > > @@ -10040,12 +10040,6 @@ grokfndecl (tree ctype,
> > > > >if (deduction_guide_p (decl))
> > > > >{
> > > > > -  if (!DECL_NAMESPACE_SCOPE_P (decl))
> > > > > - {
> > > > > -   error_at (location, "deduction guide %qD must be declared at
> > > > > "
> > > > > - "namespace scope", decl);
> > > > > -   return NULL_TREE;
> > > > > - }
> > > > 
> > > > Do we still reject deduction guides at function scope?
> > > 
> > > Yes, it looks like the parser doesn't even recognize them at function
> > > scope:
> > > 
> > >template struct A;
> > > 
> > >int main() {
> > >  A(int) -> A;
> > >}
> > > 
> > > :4:4: error: missing template arguments before ‘(’ token
> > > :4:5: error: expected primary-expression before ‘int’
> > > :4:13: error: invalid use of ‘struct A’
> > > 
> > > Deduction guide templates are also still rejected (as with all templates
> > > at
> > > function scope).
> > > 
> > > > 
> > > > >  tree type = TREE_TYPE (DECL_NAME (decl));
> > > > >  if (in_namespace == NULL_TREE
> > > > > && CP_DECL_CONTEXT (decl) != CP_TYPE_CONTEXT (type))
> > > > > @@ -10055,6 +10049,13 @@ grokfndecl (tree ctype,
> > > > > inform (location_of (type), "  declared here");
> > > > > return NULL_TREE;
> > > > >   }
> > > > > +  if (DECL_CLASS_SCOPE_P (decl)
> > > > > +   && current_access_specifier != declared_access (TYPE_NAME
> > > > > (type)))
> > > > > + {
> > > > > +   error_at (location, "deduction guide %qD must have the same
> > > > > access "
> > > > > +   "as %qT", decl, type);
> > > > > +   inform (location_of (type), "  declared here");
> > > > > + }
> > > > >  if (funcdef_flag)
> > > > >   error_at (location,
> > > > > "deduction guide %qD must not have a function body",
> > > > > decl);
> > > > > @@ -12035,6 +12036,10 @@ grokdeclarator (const cp_declarator
> > > > > *declarator,
> > > > >  storage_class = declspecs->storage_class;
> > > > >  if (storage_class == sc_static)
> > > > >staticp = 1 + (decl_context == FIELD);
> > > > > +  else if (decl_context == FIELD && sfk == sfk_deduction_guide)
> > > > > +/* Treat class-scope deduction guides as static member functions
> > > > > +   so that they get a FUNCTION_TYPE instead of a METHOD_TYPE.  */
> > > > > +staticp = 2;
> > > > >if (virtualp)
> > > > >{
> > > > > diff --git a/gcc/cp/name-lookup.c 

Re: [PATCH] coroutines: Adjust outlined function names [PR95520].

2021-07-09 Thread Jason Merrill via Gcc-patches

On 7/9/21 2:18 PM, Iain Sandoe wrote:

Hi,

The mechanism used to date for uniquing the coroutine helper
functions (actor, destroy) was over-complicating things and
leading to the noted PR and also difficulties in setting
breakpoints on these functions (so this will help PR99215 as
well).  The revised mangling matches the form used by clang.

OK for master & backports?
thanks
Iain

Signed-off-by: Iain Sandoe 

PR c++/95520 - [coroutines] __builtin_FUNCTION() returns mangled .actor instead 
of original function name

PR c++/95520

gcc/cp/ChangeLog:

* coroutines.cc (act_des_fn): Adjust coroutine
helper function name mangling.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr95520.C: New test.
---
  gcc/cp/coroutines.cc  | 14 +--
  gcc/testsuite/g++.dg/coroutines/pr95520.C | 29 +++
  2 files changed, 41 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95520.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 54ffdc8d062..1a3ab58e044 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -3985,9 +3985,19 @@ register_local_var_uses (tree *stmt, int *do_subtree, 
void *d)
  static tree
  act_des_fn (tree orig, tree fn_type, tree coro_frame_ptr, const char* name)
  {
-  tree fn_name = get_fn_local_identifier (orig, name);
+  tree fn_name;
location_t loc = DECL_SOURCE_LOCATION (orig);
-  tree fn = build_lang_decl (FUNCTION_DECL, fn_name, fn_type);
+  tree fn = build_lang_decl (FUNCTION_DECL, DECL_NAME (orig), fn_type);
+  if (tree da_name = DECL_ASSEMBLER_NAME (orig))
+{
+  char *buf = xasprintf ("%s.%s", IDENTIFIER_POINTER (da_name), name);
+  fn_name = get_identifier (buf);
+  free (buf);
+}
+  else
+fn_name = get_fn_local_identifier (orig, name);


How about handling this in write_encoding, along the lines of the 
devel/c++-contracts branch?


Speaking of which, I wonder if you also want to do something similar to 
what I did there to put the ramp/actor/destroyer functions into the 
same comdat group.


Jason
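
As a small illustration of the naming scheme in the quoted patch (the original function's assembler name, a dot, then the helper suffix, as in the "%s.%s" format string), here is a sketch in standalone C++. `helper_symbol_name` is a hypothetical function for this note, and the suffix strings used with it are assumptions based on the helper names mentioned in the message (actor, destroy).

```cpp
#include <string>

// Hypothetical helper mirroring the "%s.%s" format from the quoted patch:
// <assembler name of the original function> "." <helper suffix>.
std::string helper_symbol_name(const std::string &assembler_name,
                               const std::string &suffix) {
  return assembler_name + "." + suffix;
}
```

For example, with `_Z3foov` (the Itanium mangling of `foo()`) and the assumed suffix `actor`, this yields `_Z3foov.actor`.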



Re: [PATCH] c++: permit deduction guides at class scope [PR79501]

2021-07-09 Thread Jason Merrill via Gcc-patches

On 7/9/21 4:18 PM, Patrick Palka wrote:

On Fri, 9 Jul 2021, Patrick Palka wrote:


On Fri, 9 Jul 2021, Jason Merrill wrote:


On 7/9/21 3:18 PM, Patrick Palka wrote:

This adds support for declaring (class-scope) deduction guides for a
member class template.  Fortunately it seems only a couple of changes
are needed in order for the existing CTAD machinery to handle them like
any other deduction guide: we need to make sure to give them a
FUNCTION_TYPE instead of a METHOD_TYPE, and we need to avoid using a
BASELINK when looking them up.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/79501

gcc/cp/ChangeLog:

* decl.c (grokfndecl): Don't require that deduction guides are
declared at namespace scope.  Check that class-scope deduction
guides have the same access as the member class template.
(grokdeclarator): Pretend class-scope deduction guides are static.
* name-lookup.c (lookup_qualified_name): Don't use a BASELINK
for class-scope deduction guides.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction92.C: New test.
* g++.dg/cpp1z/class-deduction93.C: New test.
* g++.dg/cpp1z/class-deduction94.C: New test.
---
   gcc/cp/decl.c | 17 -
   gcc/cp/name-lookup.c  | 11 +---
   .../g++.dg/cpp1z/class-deduction92.C  | 16 
   .../g++.dg/cpp1z/class-deduction93.C  | 25 +++
   .../g++.dg/cpp1z/class-deduction94.C  | 19 ++
   5 files changed, 79 insertions(+), 9 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction92.C
   create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction93.C
   create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction94.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index ebe1318d38d..8b8ffb7de83 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -10040,12 +10040,6 @@ grokfndecl (tree ctype,
   if (deduction_guide_p (decl))
   {
-  if (!DECL_NAMESPACE_SCOPE_P (decl))
-   {
- error_at (location, "deduction guide %qD must be declared at "
-   "namespace scope", decl);
- return NULL_TREE;
-   }


Do we still reject deduction guides at function scope?


Yes, it looks like the parser doesn't even recognize them at function scope:

   template<class T> struct A;

   int main() {
     A(int) -> A<int>;
   }

:4:4: error: missing template arguments before ‘(’ token
:4:5: error: expected primary-expression before ‘int’
:4:13: error: invalid use of ‘struct A’

Deduction guide templates are also still rejected (as with all templates at
function scope).




 tree type = TREE_TYPE (DECL_NAME (decl));
 if (in_namespace == NULL_TREE
  && CP_DECL_CONTEXT (decl) != CP_TYPE_CONTEXT (type))
@@ -10055,6 +10049,13 @@ grokfndecl (tree ctype,
  inform (location_of (type), "  declared here");
  return NULL_TREE;
}
+  if (DECL_CLASS_SCOPE_P (decl)
+ && current_access_specifier != declared_access (TYPE_NAME (type)))
+   {
+ error_at (location, "deduction guide %qD must have the same access "
+ "as %qT", decl, type);
+ inform (location_of (type), "  declared here");
+   }
 if (funcdef_flag)
error_at (location,
  "deduction guide %qD must not have a function body", decl);
@@ -12035,6 +12036,10 @@ grokdeclarator (const cp_declarator *declarator,
 storage_class = declspecs->storage_class;
 if (storage_class == sc_static)
   staticp = 1 + (decl_context == FIELD);
+  else if (decl_context == FIELD && sfk == sfk_deduction_guide)
+/* Treat class-scope deduction guides as static member functions
+   so that they get a FUNCTION_TYPE instead of a METHOD_TYPE.  */
+staticp = 2;
   if (virtualp)
   {
diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 1be5f3da6d5..089bca1d471 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -7110,9 +7110,14 @@ lookup_qualified_name (tree scope, tree name,
LOOK_want want, bool complain)
 else if (cxx_dialect != cxx98 && TREE_CODE (scope) == ENUMERAL_TYPE)
   t = lookup_enumerator (scope, name);
 else if (is_class_type (scope, complain))
-t = lookup_member (scope, name, 2, bool (want & LOOK_want::TYPE),
-  tf_warning_or_error);
-
+{
+  t = lookup_member (scope, name, 2, bool (want & LOOK_want::TYPE),
+tf_warning_or_error);
+  if (t && dguide_name_p (name))
+   /* Since class-scope deduction guides aren't really member functions,
+  don't use a BASELINK for them.  */
+   t = MAYBE_BASELINK_FUNCTIONS (t);
+}


On second thought, this seems to be an awkward spot to do this
adjustment.  Maybe it's better to do it in lookup_member, or in
deduction_guides_for (the only caller which really 

[PATCH]middle-e: GNU GCC version in graphviz dot data [PR83711]

2021-07-09 Thread Tjibbe Legering via Gcc-patches
Created this patch to add GCC version to the graph data

for the options:
-fdump-rtl-all-graph
-fdump-tree-all-graph
-fdump-ipa-all-graph
-fcallgraph-info
-fdump-analyzer-callgraph
-fdump-analyzer-exploded-graph
-fdump-analyzer-supergraph
-fdump-analyzer-state-purge
-fdump-analyzer-feasibility

using gcc git version 4 july 2021

git describe
basepoints/gcc-12-1999-gd07092a61d5

gcc --version
gcc (GCC) 12.0.0 20210704 (experimental)
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

The start of the gcc -fcallgraph-info output now looks like this:

/* callgraph generated by GNU GCC Compiler -fcallgraph-info option version
 * GNU C17 (GCC) version 12.0.0 20210704 (experimental)
(x86_64-pc-linux-gnu)
 * compiled by GNU C version 12.0.0 20210704 (experimental), GMP version
6.2.0, MPFR version 4.0.2, MPC version 1.1.0, isl version none
 * GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
*/
graph: { title: "foo.c"
node: { title: "hello_GCC" label: "hello_GCC\nfoo.c:3:6" }
 ...
}

There are more routines in GCC that generate graph data,
but I do not know which options enable them.

See also in bugzilla :
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83711

diff --git a/gcc/analyzer/analysis-plan.cc b/gcc/analyzer/analysis-plan.cc
index 7dfc48e9c3e..223eba37761 100644
--- a/gcc/analyzer/analysis-plan.cc
+++ b/gcc/analyzer/analysis-plan.cc
@@ -40,6 +40,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "basic-block.h"
 #include "gimple.h"
 #include "gimple-iterator.h"
+#include "toplev.h"
 #include "digraph.h"
 #include "analyzer/supergraph.h"

diff --git a/gcc/analyzer/call-string.cc b/gcc/analyzer/call-string.cc
index 9f4f77ab3a9..1489b687966 100644
--- a/gcc/analyzer/call-string.cc
+++ b/gcc/analyzer/call-string.cc
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "basic-block.h"
 #include "gimple.h"
 #include "gimple-iterator.h"
+#include "toplev.h"
 #include "digraph.h"
 #include "analyzer/supergraph.h"

diff --git a/gcc/analyzer/checker-path.cc b/gcc/analyzer/checker-path.cc
index e10c8e2bb7c..0d1e3f9bdd0 100644
--- a/gcc/analyzer/checker-path.cc
+++ b/gcc/analyzer/checker-path.cc
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cgraph.h"
 #include "function.h"
 #include "cfg.h"
+#include "toplev.h"
 #include "digraph.h"
 #include "alloc-pool.h"
 #include "fibonacci_heap.h"
diff --git a/gcc/analyzer/complexity.cc b/gcc/analyzer/complexity.cc
index ece4272ff6e..4d6d0900eda 100644
--- a/gcc/analyzer/complexity.cc
+++ b/gcc/analyzer/complexity.cc
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "options.h"
 #include "cgraph.h"
 #include "cfg.h"
+#include "toplev.h"
 #include "digraph.h"
 #include "analyzer/call-string.h"
 #include "analyzer/program-point.h"
diff --git a/gcc/analyzer/constraint-manager.cc
b/gcc/analyzer/constraint-manager.cc
index 51cf52258a9..b62e91bf28d 100644
--- a/gcc/analyzer/constraint-manager.cc
+++ b/gcc/analyzer/constraint-manager.cc
@@ -37,6 +37,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "options.h"
 #include "cgraph.h"
 #include "cfg.h"
+#include "toplev.h"
 #include "digraph.h"
 #include "analyzer/supergraph.h"
 #include "sbitmap.h"
diff --git a/gcc/analyzer/diagnostic-manager.cc
b/gcc/analyzer/diagnostic-manager.cc
index 7eb4ed8a4f2..db548191313 100644
--- a/gcc/analyzer/diagnostic-manager.cc
+++ b/gcc/analyzer/diagnostic-manager.cc
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple.h"
 #include "gimple-iterator.h"
 #include "cgraph.h"
+#include "toplev.h"
 #include "digraph.h"
 #include "analyzer/supergraph.h"
 #include "analyzer/program-state.h"
diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 4456d9b828b..c6f44adc42e 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "gimple-pretty-print.h"
 #include "cgraph.h"
+#include "toplev.h"
 #include "digraph.h"
 #include "analyzer/supergraph.h"
 #include "analyzer/program-state.h"
diff --git a/gcc/analyzer/feasible-graph.cc b/gcc/analyzer/feasible-graph.cc
index 675bda9e7e5..426c2d46c28 100644
--- a/gcc/analyzer/feasible-graph.cc
+++ b/gcc/analyzer/feasible-graph.cc
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple.h"
 #include "gimple-iterator.h"
 #include "cgraph.h"
+#include "toplev.h"
 #include "digraph.h"
 #include "analyzer/supergraph.h"
 #include "analyzer/program-state.h"
diff --git a/gcc/analyzer/program-point.cc b/gcc/analyzer/program-point.cc
index d8cfc61975e..de6aa13ae1c 100644
--- a/gcc/analyzer/program-point.cc
+++ b/gcc/analyzer/program-point.cc
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
 #include 

Re: [PATCH] c++: permit deduction guides at class scope [PR79501]

2021-07-09 Thread Marek Polacek via Gcc-patches
On Fri, Jul 09, 2021 at 04:18:34PM -0400, Patrick Palka via Gcc-patches wrote:
> > > > --- a/gcc/cp/name-lookup.c
> > > > +++ b/gcc/cp/name-lookup.c
> > > > @@ -7110,9 +7110,14 @@ lookup_qualified_name (tree scope, tree name,
> > > > LOOK_want want, bool complain)
> > > > else if (cxx_dialect != cxx98 && TREE_CODE (scope) == ENUMERAL_TYPE)
> > > >   t = lookup_enumerator (scope, name);
> > > > else if (is_class_type (scope, complain))
> > > > -t = lookup_member (scope, name, 2, bool (want & LOOK_want::TYPE),
> > > > -  tf_warning_or_error);
> > > > -
> > > > +{
> > > > +  t = lookup_member (scope, name, 2, bool (want & LOOK_want::TYPE),
> > > > +tf_warning_or_error);
> > > > +  if (t && dguide_name_p (name))
> > > > +   /* Since class-scope deduction guides aren't really member 
> > > > functions,
> > > > +  don't use a BASELINK for them.  */
> > > > +   t = MAYBE_BASELINK_FUNCTIONS (t);
> > > > +}
> 
> On second thought, this seems to be an awkward spot to do this
> adjustment.  Maybe it's better to do it in lookup_member, or in
> deduction_guides_for (the only caller which really needs it)?

Yeah, doing it in deduction_guides_for sounds a bit better to me.

Marek



Re: [PATCH] c++: permit deduction guides at class scope [PR79501]

2021-07-09 Thread Patrick Palka via Gcc-patches
On Fri, 9 Jul 2021, Patrick Palka wrote:

> On Fri, 9 Jul 2021, Jason Merrill wrote:
> 
> > On 7/9/21 3:18 PM, Patrick Palka wrote:
> > > This adds support for declaring (class-scope) deduction guides for a
> > > member class template.  Fortunately it seems only a couple of changes
> > > are needed in order for the existing CTAD machinery to handle them like
> > > any other deduction guide: we need to make sure to give them a
> > > FUNCTION_TYPE instead of a METHOD_TYPE, and we need to avoid using a
> > > BASELINK when looking them up.
> > > 
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > > trunk?
> > > 
> > >   PR c++/79501
> > > 
> > > gcc/cp/ChangeLog:
> > > 
> > >   * decl.c (grokfndecl): Don't require that deduction guides are
> > >   declared at namespace scope.  Check that class-scope deduction
> > >   guides have the same access as the member class template.
> > >   (grokdeclarator): Pretend class-scope deduction guides are static.
> > >   * name-lookup.c (lookup_qualified_name): Don't use a BASELINK
> > >   for class-scope deduction guides.
> > > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > >   * g++.dg/cpp1z/class-deduction92.C: New test.
> > >   * g++.dg/cpp1z/class-deduction93.C: New test.
> > >   * g++.dg/cpp1z/class-deduction94.C: New test.
> > > ---
> > >   gcc/cp/decl.c | 17 -
> > >   gcc/cp/name-lookup.c  | 11 +---
> > >   .../g++.dg/cpp1z/class-deduction92.C  | 16 
> > >   .../g++.dg/cpp1z/class-deduction93.C  | 25 +++
> > >   .../g++.dg/cpp1z/class-deduction94.C  | 19 ++
> > >   5 files changed, 79 insertions(+), 9 deletions(-)
> > >   create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction92.C
> > >   create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction93.C
> > >   create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction94.C
> > > 
> > > diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> > > index ebe1318d38d..8b8ffb7de83 100644
> > > --- a/gcc/cp/decl.c
> > > +++ b/gcc/cp/decl.c
> > > @@ -10040,12 +10040,6 @@ grokfndecl (tree ctype,
> > >   if (deduction_guide_p (decl))
> > >   {
> > > -  if (!DECL_NAMESPACE_SCOPE_P (decl))
> > > - {
> > > -   error_at (location, "deduction guide %qD must be declared at "
> > > - "namespace scope", decl);
> > > -   return NULL_TREE;
> > > - }
> > 
> > Do we still reject deduction guides at function scope?
> 
> Yes, it looks like the parser doesn't even recognize them at function scope:
> 
>   template struct A;
> 
>   int main() {
> A(int) -> A;
>   }
> 
> :4:4: error: missing template arguments before ‘(’ token
> :4:5: error: expected primary-expression before ‘int’
> :4:13: error: invalid use of ‘struct A’
> 
> Deduction guide templates are also still rejected (as with all templates at
> function scope).
> 
> > 
> > > tree type = TREE_TYPE (DECL_NAME (decl));
> > > if (in_namespace == NULL_TREE
> > > && CP_DECL_CONTEXT (decl) != CP_TYPE_CONTEXT (type))
> > > @@ -10055,6 +10049,13 @@ grokfndecl (tree ctype,
> > > inform (location_of (type), "  declared here");
> > > return NULL_TREE;
> > >   }
> > > +  if (DECL_CLASS_SCOPE_P (decl)
> > > +   && current_access_specifier != declared_access (TYPE_NAME (type)))
> > > + {
> > > +   error_at (location, "deduction guide %qD must have the same access "
> > > +   "as %qT", decl, type);
> > > +   inform (location_of (type), "  declared here");
> > > + }
> > > if (funcdef_flag)
> > >   error_at (location,
> > > > "deduction guide %qD must not have a function body", decl);
> > > @@ -12035,6 +12036,10 @@ grokdeclarator (const cp_declarator *declarator,
> > > storage_class = declspecs->storage_class;
> > > if (storage_class == sc_static)
> > >   staticp = 1 + (decl_context == FIELD);
> > > +  else if (decl_context == FIELD && sfk == sfk_deduction_guide)
> > > +/* Treat class-scope deduction guides as static member functions
> > > +   so that they get a FUNCTION_TYPE instead of a METHOD_TYPE.  */
> > > +staticp = 2;
> > >   if (virtualp)
> > >   {
> > > diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
> > > index 1be5f3da6d5..089bca1d471 100644
> > > --- a/gcc/cp/name-lookup.c
> > > +++ b/gcc/cp/name-lookup.c
> > > @@ -7110,9 +7110,14 @@ lookup_qualified_name (tree scope, tree name,
> > > LOOK_want want, bool complain)
> > > else if (cxx_dialect != cxx98 && TREE_CODE (scope) == ENUMERAL_TYPE)
> > >   t = lookup_enumerator (scope, name);
> > > else if (is_class_type (scope, complain))
> > > -t = lookup_member (scope, name, 2, bool (want & LOOK_want::TYPE),
> > > -tf_warning_or_error);
> > > -
> > > +{
> > > +  t = lookup_member (scope, name, 2, bool (want & LOOK_want::TYPE),
> > > +  

[pushed] c++: concepts TS and explicit specialization [PR101098]

2021-07-09 Thread Jason Merrill via Gcc-patches
duplicate_decls was not recognizing the explicit specialization as matching
the implicit specialization of g because
function_requirements_equivalent_p was seeing the C constraint on the
implicit one and not on the explicit.

I think that the usefulness of much of the concepts TS support is limited and
waning, but I guess we can keep it around for GCC 12.

Tested x86_64-pc-linux-gnu, applying to trunk and 11.
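
The condition added in the patch below compares two flags with <=.  As a
small illustration (not GCC code), assuming both flags evaluate to 0 or 1,
"a <= b" on booleans is the same as "a implies b".  The helper names here
are hypothetical and only model the flag comparison, not the cxx_dialect
test that accompanies it:

```cpp
#include <cassert>

// For two booleans, a <= b is false only when a is true and b is false,
// which is exactly the truth table of "a implies b".
constexpr bool implies(bool a, bool b) { return a <= b; }

// Hypothetical model of the flag comparison in the patch: the combined
// constraints are compared unless the new declaration is marked as a
// specialization while the old one is not.
constexpr bool compares_combined_constraints(bool new_is_spec,
                                             bool old_is_spec) {
  return implies(new_is_spec, old_is_spec);
}

static_assert(implies(false, false), "F -> F holds");
static_assert(implies(false, true), "F -> T holds");
static_assert(!implies(true, false), "T -> F fails");
static_assert(implies(true, true), "T -> T holds");
```

Under that reading, the only combination that skips the combined
comparison is a new declaration with the flag set matched against an old
one with the flag clear.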

PR c++/101098

gcc/cp/ChangeLog:

* decl.c (function_requirements_equivalent_p): Only compare
trailing requirements on a specialization.

gcc/testsuite/ChangeLog:

* g++.dg/concepts/explicit-spec1.C: New test.
---
 gcc/cp/decl.c  | 4 +++-
 gcc/testsuite/g++.dg/concepts/explicit-spec1.C | 9 +
 2 files changed, 12 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/concepts/explicit-spec1.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index ebe1318d38d..0df689b01f8 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -955,7 +955,9 @@ static bool
 function_requirements_equivalent_p (tree newfn, tree oldfn)
 {
   /* In the concepts TS, the combined constraints are compared.  */
-  if (cxx_dialect < cxx20)
+  if (cxx_dialect < cxx20
+  && (DECL_TEMPLATE_SPECIALIZATION (newfn)
+ <= DECL_TEMPLATE_SPECIALIZATION (oldfn)))
 {
   tree ci1 = get_constraints (oldfn);
   tree ci2 = get_constraints (newfn);
diff --git a/gcc/testsuite/g++.dg/concepts/explicit-spec1.C 
b/gcc/testsuite/g++.dg/concepts/explicit-spec1.C
new file mode 100644
index 000..d9b6b3d1741
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/explicit-spec1.C
@@ -0,0 +1,9 @@
+// PR c++/101098
+// { dg-do compile { target concepts } }
+
+template concept C = __is_class(T);
+struct Y { int n; } y;
+template void g(T) { }
+int called;
+template<> void g(Y) { called = 3; }
+int main() { g(y); }

base-commit: d5b1bb0d197f9141a0f0e510f8d1b598c3df9552
-- 
2.27.0



Re: [PATCH] c++: permit deduction guides at class scope [PR79501]

2021-07-09 Thread Patrick Palka via Gcc-patches
On Fri, 9 Jul 2021, Jason Merrill wrote:

> On 7/9/21 3:18 PM, Patrick Palka wrote:
> > This adds support for declaring (class-scope) deduction guides for a
> > member class template.  Fortunately it seems only a couple of changes
> > are needed in order for the existing CTAD machinery to handle them like
> > any other deduction guide: we need to make sure to give them a
> > FUNCTION_TYPE instead of a METHOD_TYPE, and we need to avoid using a
> > BASELINK when looking them up.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?
> > 
> > PR c++/79501
> > 
> > gcc/cp/ChangeLog:
> > 
> > * decl.c (grokfndecl): Don't require that deduction guides are
> > declared at namespace scope.  Check that class-scope deduction
> > guides have the same access as the member class template.
> > (grokdeclarator): Pretend class-scope deduction guides are static.
> > * name-lookup.c (lookup_qualified_name): Don't use a BASELINK
> > for class-scope deduction guides.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp1z/class-deduction92.C: New test.
> > * g++.dg/cpp1z/class-deduction93.C: New test.
> > * g++.dg/cpp1z/class-deduction94.C: New test.
> > ---
> >   gcc/cp/decl.c | 17 -
> >   gcc/cp/name-lookup.c  | 11 +---
> >   .../g++.dg/cpp1z/class-deduction92.C  | 16 
> >   .../g++.dg/cpp1z/class-deduction93.C  | 25 +++
> >   .../g++.dg/cpp1z/class-deduction94.C  | 19 ++
> >   5 files changed, 79 insertions(+), 9 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction92.C
> >   create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction93.C
> >   create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction94.C
> > 
> > diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> > index ebe1318d38d..8b8ffb7de83 100644
> > --- a/gcc/cp/decl.c
> > +++ b/gcc/cp/decl.c
> > @@ -10040,12 +10040,6 @@ grokfndecl (tree ctype,
> >   if (deduction_guide_p (decl))
> >   {
> > -  if (!DECL_NAMESPACE_SCOPE_P (decl))
> > -   {
> > - error_at (location, "deduction guide %qD must be declared at "
> > -   "namespace scope", decl);
> > - return NULL_TREE;
> > -   }
> 
> Do we still reject deduction guides at function scope?

Yes, it looks like the parser doesn't even recognize them at function scope:

  template struct A;

  int main() {
A(int) -> A;
  }

:4:4: error: missing template arguments before ‘(’ token
:4:5: error: expected primary-expression before ‘int’
:4:13: error: invalid use of ‘struct A’

Deduction guide templates are also still rejected (as with all templates at
function scope).

> 
> > tree type = TREE_TYPE (DECL_NAME (decl));
> > if (in_namespace == NULL_TREE
> >   && CP_DECL_CONTEXT (decl) != CP_TYPE_CONTEXT (type))
> > @@ -10055,6 +10049,13 @@ grokfndecl (tree ctype,
> >   inform (location_of (type), "  declared here");
> >   return NULL_TREE;
> > }
> > +  if (DECL_CLASS_SCOPE_P (decl)
> > + && current_access_specifier != declared_access (TYPE_NAME (type)))
> > +   {
> > + error_at (location, "deduction guide %qD must have the same access "
> > + "as %qT", decl, type);
> > + inform (location_of (type), "  declared here");
> > +   }
> > if (funcdef_flag)
> > error_at (location,
> >   "deduction guide %qD must not have a function body", decl);
> > @@ -12035,6 +12036,10 @@ grokdeclarator (const cp_declarator *declarator,
> > storage_class = declspecs->storage_class;
> > if (storage_class == sc_static)
> >   staticp = 1 + (decl_context == FIELD);
> > +  else if (decl_context == FIELD && sfk == sfk_deduction_guide)
> > +/* Treat class-scope deduction guides as static member functions
> > +   so that they get a FUNCTION_TYPE instead of a METHOD_TYPE.  */
> > +staticp = 2;
> >   if (virtualp)
> >   {
> > diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
> > index 1be5f3da6d5..089bca1d471 100644
> > --- a/gcc/cp/name-lookup.c
> > +++ b/gcc/cp/name-lookup.c
> > @@ -7110,9 +7110,14 @@ lookup_qualified_name (tree scope, tree name,
> > LOOK_want want, bool complain)
> > else if (cxx_dialect != cxx98 && TREE_CODE (scope) == ENUMERAL_TYPE)
> >   t = lookup_enumerator (scope, name);
> > else if (is_class_type (scope, complain))
> > -t = lookup_member (scope, name, 2, bool (want & LOOK_want::TYPE),
> > -  tf_warning_or_error);
> > -
> > +{
> > +  t = lookup_member (scope, name, 2, bool (want & LOOK_want::TYPE),
> > +tf_warning_or_error);
> > +  if (t && dguide_name_p (name))
> > +   /* Since class-scope deduction guides aren't really member functions,
> > +  don't use a BASELINK for them.  */
> > +   t = MAYBE_BASELINK_FUNCTIONS (t);
> > +}
> > if (!t)
> >   return 

[PATCH] [PHIOPT/MATCH] Remove the statement to move if not used

2021-07-09 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

Instead of waiting for DCE to remove the unused statement,
and maybe optimize another conditional, it is better if
we don't move the statement and have the statement
removed.

OK? Bootstrapped and tested on x86_64-linux-gnu.

Changes from v1:
* v2: Change the order of insertion and check to see if the lhs
  is used rather than see if the lhs was used in the sequence.

gcc/ChangeLog:

* tree-ssa-phiopt.c (match_simplify_replacement): Move
insert of the sequence before the movement of the
statement.  Check to see if the statement is used
outside of the original phi to see if we should move it.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr96928-1.c: Update to be similar to pr96928.c.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c |  5 -
 gcc/tree-ssa-phiopt.c | 13 ++---
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
index 2e86620da11..9e505ac9900 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
@@ -2,7 +2,10 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-phiopt2 -fdump-tree-optimized" } */
 /* { dg-final { scan-tree-dump-times " = a_\[0-9]*\\\(D\\\) >> " 5 "phiopt2" } } */
-/* { dg-final { scan-tree-dump-times " = ~c_\[0-9]*\\\(D\\\);" 1 "phiopt2" } } */
+/* The following check is done at optimized because a ^ (~b) is rewritten as ~(a^b)
+   and in the case of match.pd optimizing these ?:, the ~ is moved out already
+   by the time we get to phiopt2. */
+/* { dg-final { scan-tree-dump-times "c_\[0-9]*\\\(D\\\) \\\^" 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-times " = ~" 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-times " = \[abc_0-9\\\(\\\)D]* \\\^ " 5 "phiopt2" } } */
 /* { dg-final { scan-tree-dump-not "a < 0" "phiopt2" } } */
diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
index 7a98b7afdf1..c6adbbd28a0 100644
--- a/gcc/tree-ssa-phiopt.c
+++ b/gcc/tree-ssa-phiopt.c
@@ -1020,7 +1020,16 @@ match_simplify_replacement (basic_block cond_bb, 
basic_block middle_bb,
 return false;
 
   gsi = gsi_last_bb (cond_bb);
-  if (stmt_to_move)
+  /* Insert the sequence generated from gimple_simplify_phiopt.  */
+  if (seq)
+gsi_insert_seq_before (&gsi, seq, GSI_CONTINUE_LINKING);
+
+  /* If there was a statement to move and the result of the statement
+ is going to be used, move it to right before the original
+ conditional.  */
+  if (stmt_to_move
+  && (gimple_assign_lhs (stmt_to_move) == result
+ || !has_single_use (gimple_assign_lhs (stmt_to_move))))
 {
   if (dump_file && (dump_flags & TDF_DETAILS))
{
@@ -1032,8 +1041,6 @@ match_simplify_replacement (basic_block cond_bb, 
basic_block middle_bb,
   gsi_move_before (, );
   reset_flow_sensitive_info (gimple_assign_lhs (stmt_to_move));
 }
-  if (seq)
-gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
 
   replace_phi_edge_with_variable (cond_bb, e1, phi, result);
 
-- 
2.27.0



Re: [PATCH] c++: 'new T[N]' and SFINAE [PR82110]

2021-07-09 Thread Jason Merrill via Gcc-patches

On 7/9/21 3:01 PM, Patrick Palka wrote:

Here we're failing to treat 'new T[N]' as erroneous in a SFINAE context
when T isn't default constructible because expand_aggr_init_1 doesn't
communicate to build_aggr_init (its only SFINAE caller) whether the
initialization was actually successful.  To fix this, this patch makes
expand_aggr_init_1 and its subroutine expand_default_init return true on
success, false on failure so that build_aggr_init can properly return
error_mark_node on failure.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps 11 (given its impact on concepts)?


OK for trunk, and 11 after the 11.2 release.


PR c++/82110

gcc/cp/ChangeLog:

* init.c (build_aggr_init): Return error_mark_node if
expand_aggr_init_1 returns false.
(expand_default_init): Change return type to bool.  Return false
on error, true on success.
(expand_aggr_init_1): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/pr78765.C: Expect another conversion failure
diagnostic.
* g++.dg/template/sfinae14.C: Expect new X[5] is ill-formed
since X is not default constructible.
* g++.dg/cpp2a/concepts-requires27.C: New test.
---
  gcc/cp/init.c | 41 +--
  gcc/testsuite/g++.dg/cpp0x/pr78765.C  |  2 +-
  .../g++.dg/cpp2a/concepts-requires27.C| 10 +
  gcc/testsuite/g++.dg/template/sfinae14.C  |  2 +-
  4 files changed, 40 insertions(+), 15 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-requires27.C

diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index 88f6f90a800..1d863ed8538 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -39,8 +39,8 @@ along with GCC; see the file COPYING3.  If not see
  static bool begin_init_stmts (tree *, tree *);
  static tree finish_init_stmts (bool, tree, tree);
  static void construct_virtual_base (tree, tree);
-static void expand_aggr_init_1 (tree, tree, tree, tree, int, tsubst_flags_t);
-static void expand_default_init (tree, tree, tree, tree, int, tsubst_flags_t);
+static bool expand_aggr_init_1 (tree, tree, tree, tree, int, tsubst_flags_t);
+static bool expand_default_init (tree, tree, tree, tree, int, tsubst_flags_t);
  static void perform_member_init (tree, tree);
  static int member_init_ok_or_else (tree, tree, tree);
  static void expand_virtual_init (tree, tree);
@@ -1838,12 +1838,14 @@ build_aggr_init (tree exp, tree init, int flags, 
tsubst_flags_t complain)
is_global = begin_init_stmts (&stmt_expr, &compound_stmt);
destroy_temps = stmts_are_full_exprs_p ();
current_stmt_tree ()->stmts_are_full_exprs_p = 0;
-  expand_aggr_init_1 (TYPE_BINFO (type), exp, exp,
- init, LOOKUP_NORMAL|flags, complain);
+  bool ok = expand_aggr_init_1 (TYPE_BINFO (type), exp, exp,
+   init, LOOKUP_NORMAL|flags, complain);
stmt_expr = finish_init_stmts (is_global, stmt_expr, compound_stmt);
current_stmt_tree ()->stmts_are_full_exprs_p = destroy_temps;
TREE_READONLY (exp) = was_const;
TREE_THIS_VOLATILE (exp) = was_volatile;
+  if (!ok)
+return error_mark_node;
  
if ((VAR_P (exp) || TREE_CODE (exp) == PARM_DECL)

&& TREE_SIDE_EFFECTS (stmt_expr)
@@ -1854,7 +1856,7 @@ build_aggr_init (tree exp, tree init, int flags, 
tsubst_flags_t complain)
return stmt_expr;
  }
  
-static void

+static bool
  expand_default_init (tree binfo, tree true_exp, tree exp, tree init, int 
flags,
   tsubst_flags_t complain)
  {
@@ -1889,6 +1891,9 @@ expand_default_init (tree binfo, tree true_exp, tree exp, 
tree init, int flags,
 happen for direct-initialization, too.  */
  init = digest_init (type, init, complain);
  
+  if (init == error_mark_node)

+return false;
+
/* A CONSTRUCTOR of the target's type is a previously digested
   initializer, whether that happened just above or in
   cp_parser_late_parsing_nsdmi.
@@ -1910,7 +1915,7 @@ expand_default_init (tree binfo, tree true_exp, tree exp, 
tree init, int flags,
init = build2 (INIT_EXPR, TREE_TYPE (exp), exp, init);
TREE_SIDE_EFFECTS (init) = 1;
finish_expr_stmt (init);
-  return;
+  return true;
  }
  
if (init && TREE_CODE (init) != TREE_LIST

@@ -1927,8 +1932,12 @@ expand_default_init (tree binfo, tree true_exp, tree 
exp, tree init, int flags,
   have already built up the constructor call so we could wrap it
   in an exception region.  */;
else
-   init = ocp_convert (type, init, CONV_IMPLICIT|CONV_FORCE_TEMP,
-   flags, complain | tf_no_cleanup);
+   {
+ init = ocp_convert (type, init, CONV_IMPLICIT|CONV_FORCE_TEMP,
+ flags, complain | tf_no_cleanup);
+ if (init == error_mark_node)
+   return false;
+   }
  
if (TREE_CODE (init) == MUST_NOT_THROW_EXPR)

/* We need to 

Re: [PATCH] c++: permit deduction guides at class scope [PR79501]

2021-07-09 Thread Jason Merrill via Gcc-patches

On 7/9/21 3:18 PM, Patrick Palka wrote:

This adds support for declaring (class-scope) deduction guides for a
member class template.  Fortunately it seems only a couple of changes
are needed in order for the existing CTAD machinery to handle them like
any other deduction guide: we need to make sure to give them a
FUNCTION_TYPE instead of a METHOD_TYPE, and we need to avoid using a
BASELINK when looking them up.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/79501

gcc/cp/ChangeLog:

* decl.c (grokfndecl): Don't require that deduction guides are
declared at namespace scope.  Check that class-scope deduction
guides have the same access as the member class template.
(grokdeclarator): Pretend class-scope deduction guides are static.
* name-lookup.c (lookup_qualified_name): Don't use a BASELINK
for class-scope deduction guides.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction92.C: New test.
* g++.dg/cpp1z/class-deduction93.C: New test.
* g++.dg/cpp1z/class-deduction94.C: New test.
---
  gcc/cp/decl.c | 17 -
  gcc/cp/name-lookup.c  | 11 +---
  .../g++.dg/cpp1z/class-deduction92.C  | 16 
  .../g++.dg/cpp1z/class-deduction93.C  | 25 +++
  .../g++.dg/cpp1z/class-deduction94.C  | 19 ++
  5 files changed, 79 insertions(+), 9 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction92.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction93.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction94.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index ebe1318d38d..8b8ffb7de83 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -10040,12 +10040,6 @@ grokfndecl (tree ctype,
  
if (deduction_guide_p (decl))

  {
-  if (!DECL_NAMESPACE_SCOPE_P (decl))
-   {
- error_at (location, "deduction guide %qD must be declared at "
-   "namespace scope", decl);
- return NULL_TREE;
-   }


Do we still reject deduction guides at function scope?


tree type = TREE_TYPE (DECL_NAME (decl));
if (in_namespace == NULL_TREE
  && CP_DECL_CONTEXT (decl) != CP_TYPE_CONTEXT (type))
@@ -10055,6 +10049,13 @@ grokfndecl (tree ctype,
  inform (location_of (type), "  declared here");
  return NULL_TREE;
}
+  if (DECL_CLASS_SCOPE_P (decl)
+ && current_access_specifier != declared_access (TYPE_NAME (type)))
+   {
+ error_at (location, "deduction guide %qD must have the same access "
+ "as %qT", decl, type);
+ inform (location_of (type), "  declared here");
+   }
if (funcdef_flag)
error_at (location,
  "deduction guide %qD must not have a function body", decl);
@@ -12035,6 +12036,10 @@ grokdeclarator (const cp_declarator *declarator,
storage_class = declspecs->storage_class;
if (storage_class == sc_static)
  staticp = 1 + (decl_context == FIELD);
+  else if (decl_context == FIELD && sfk == sfk_deduction_guide)
+/* Treat class-scope deduction guides as static member functions
+   so that they get a FUNCTION_TYPE instead of a METHOD_TYPE.  */
+staticp = 2;
  
if (virtualp)

  {
diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 1be5f3da6d5..089bca1d471 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -7110,9 +7110,14 @@ lookup_qualified_name (tree scope, tree name, LOOK_want 
want, bool complain)
else if (cxx_dialect != cxx98 && TREE_CODE (scope) == ENUMERAL_TYPE)
  t = lookup_enumerator (scope, name);
else if (is_class_type (scope, complain))
-t = lookup_member (scope, name, 2, bool (want & LOOK_want::TYPE),
-  tf_warning_or_error);
-
+{
+  t = lookup_member (scope, name, 2, bool (want & LOOK_want::TYPE),
+tf_warning_or_error);
+  if (t && dguide_name_p (name))
+   /* Since class-scope deduction guides aren't really member functions,
+  don't use a BASELINK for them.  */
+   t = MAYBE_BASELINK_FUNCTIONS (t);
+}
if (!t)
  return error_mark_node;
return t;
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction92.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction92.C
new file mode 100644
index 000..178234c76d1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction92.C
@@ -0,0 +1,16 @@
+// PR c++/79501
+// { dg-do compile { target c++17 } }
+
+template
+struct X {
+  template
+  struct B { T t; };
+
+  template B(T) -> B;
+
+  auto foo() { return B{V}; }
+};
+
+X<42> x;
+using type = decltype(x.foo());
+using type = X<42>::B;
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction93.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction93.C
new file mode 100644
index 000..9d2db7a55a2
--- 

Re: disable -Warray-bounds in libgo (PR 101374)

2021-07-09 Thread Martin Sebor via Gcc-patches

On 7/9/21 7:19 AM, Maxim Kuvyrkov wrote:

On 9 Jul 2021, at 09:16, Richard Biener via Gcc-patches 
 wrote:

On Thu, Jul 8, 2021 at 8:02 PM Martin Sebor via Gcc-patches
 wrote:


Hi Ian,

Yesterday's enhancement to -Warray-bounds has exposed a couple of
issues in libgo where the code writes into an invalid constant
address that the warning is designed to flag.

On the assumption that those invalid addresses are deliberate,
the attached patch suppresses these instances by using #pragma
GCC diagnostic but I don't think I'm supposed to commit it (at
least Git won't let me).  To avoid Go bootstrap failures please
either apply the patch or otherwise suppress the warning (e.g.,
by using a volatile pointer temporary).


Btw, I don't think we should diagnose things like

*(int*)0x21 = 0x21;

when somebody literally writes that he'll be just annoyed by diagnostics.


And we have an assortment of similar cases in 32-bit ARM kernel-page helpers.

At the moment building libatomic for arm-linux-gnueabihf fails with:
===
In function ‘select_test_and_set_8’,
 inlined from ‘select_test_and_set_8’ at 
/home/tcwg-buildslave/workspace/tcwg-dev-build/snapshots/gcc.git~master/libatomic/tas_n.c:115:1:
/home/tcwg-buildslave/workspace/tcwg-dev-build/snapshots/gcc.git~master/libatomic/config/linux/arm/host-config.h:42:34:
 error: array subscript 0 is outside array bounds of ‘unsigned int[0]’ 
[-Werror=array-bounds]
42 | #define __kernel_helper_version (*(unsigned int *)0x0ffc)
   | ~^~~~
===

In libatomic/config/linux/arm/host-config.h we have:
===
/* Kernel helper for 32-bit compare-and-exchange.  */
typedef int (__kernel_cmpxchg_t) (UWORD oldval, UWORD newval, UWORD *ptr);
#define __kernel_cmpxchg (*(__kernel_cmpxchg_t *) 0x0fc0)

/* Kernel helper for 64-bit compare-and-exchange.  */
typedef int (__kernel_cmpxchg64_t) (const U_8 * oldval, const U_8 * newval,
U_8 *ptr);
#define __kernel_cmpxchg64 (*(__kernel_cmpxchg64_t *) 0x0f60)

/* Kernel helper for memory barrier.  */
typedef void (__kernel_dmb_t) (void);
#define __kernel_dmb (*(__kernel_dmb_t *) 0x0fa0)

/* Kernel helper page version number.  */
#define __kernel_helper_version (*(unsigned int *)0x0ffc)
===


This failure is tracked in pr101379.  I have added an untested POC
patch with a possible way to avoid the warning.  Other approaches
are possible (I mention some in my comment on the bug) but they
are limited by the exposure of the constant address using macros.
Hiding them behind APIs instead would make it possible to suppress
the warnings via #pragma GCC diagnostic.  Alternatively, making
the addresses extern const variables would hide the constants from
the warning altogether.
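
As a rough sketch of the "hide it behind an API" idea above: the accessor
below takes the address as a run-time integer and wraps the access in
#pragma GCC diagnostic.  The function name read_word_at is hypothetical,
and whether the suppression survives inlining and constant propagation
in every configuration is an assumption, not something verified here:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical accessor: the fixed address no longer appears as a
// literal inside a macro expansion at each use site, and the one
// place that dereferences it carries the diagnostic suppression.
inline unsigned read_word_at(std::uintptr_t addr) {
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Warray-bounds"
  return *reinterpret_cast<volatile unsigned *>(addr);
#pragma GCC diagnostic pop
}
```

A caller would pass the platform's helper address; for a quick check on
any host, the address of an ordinary unsigned variable works as a
stand-in.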

With a suitable attribute an API or variable could also describe
the size of the object.  The AVR back end, for example, has two
attributes for hardwired addresses: io and address.  One of them
(or another one like it) could be made the target-independent
way to declare global variables of any type at fixed addresses,
e.g., like so:

  extern __attribute__ ((address (0x0fc0))) __kernel_cmpxchg64_t
  __kernel_cmpxchg;

Martin


Re: [PATCH 05/55] rs6000: Add helper functions for parsing

2021-07-09 Thread will schmidt via Gcc-patches
On Thu, 2021-06-17 at 10:18 -0500, Bill Schmidt via Gcc-patches wrote:
> 2021-06-07  Bill Schmidt  
> 
> gcc/
>   * config/rs6000/rs6000-gen-builtins.c (consume_whitespace): New
>   function.
>   (advance_line): Likewise.
>   (safe_inc_pos): Likewise.
>   (match_identifier): Likewise.
>   (match_integer): Likewise.
>   (match_to_right_bracket): Likewise.
> ---
>  gcc/config/rs6000/rs6000-gen-builtins.c | 111 
>  1 file changed, 111 insertions(+)
> 
> diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c 
> b/gcc/config/rs6000/rs6000-gen-builtins.c
> index 3c53c3401b2..c5d5590e865 100644
> --- a/gcc/config/rs6000/rs6000-gen-builtins.c
> +++ b/gcc/config/rs6000/rs6000-gen-builtins.c
> @@ -210,3 +210,114 @@ ovld_diag (const char * fmt, ...)
>vfprintf (stderr, fmt, args);
>va_end (args);
>  }
> +
> +/* Pass over unprintable characters and whitespace (other than a newline,
> +   which terminates the scan).  */

AFAIK isspace() and thusly this helper only skips whitespace, so
nothing unprintable is actually handled or skipped here.
Beyond that comment nit the function seems OK.
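
To make the point concrete, here is a small stand-alone sketch of the
same loop.  It takes the line and position as parameters instead of
using the generator's globals, and the name skip_blanks is made up for
this illustration:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Advances past whitespace other than '\n'.  Any byte for which
// isspace() is false stops the scan, including control characters
// such as '\x01'.  The cast avoids passing a negative char to isspace.
inline std::size_t skip_blanks(const std::string &line, std::size_t pos) {
  while (pos < line.size()
         && std::isspace(static_cast<unsigned char>(line[pos]))
         && line[pos] != '\n')
    ++pos;
  return pos;
}
```

So a leading '\x01' is not skipped, while blanks and tabs are, which is
why the comment's mention of unprintable characters looks inaccurate.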

> +static void
> +consume_whitespace (void)
> +{
> +  while (pos < LINELEN && isspace(linebuf[pos]) && linebuf[pos] != '\n')
> +pos++;
> +  return;
> +}
> +
> +/* Get the next nonblank, noncomment line, returning 0 on EOF, 1 otherwise.  */
> +static int
> +advance_line (FILE *file)
> +{
> +  while (1)
> +{
> +  /* Read ahead one line and check for EOF.  */
> +  if (!fgets (linebuf, sizeof linebuf, file))
> + return 0;
> +  line++;
> +  size_t len = strlen (linebuf);
> +  if (linebuf[len - 1] != '\n')
> + (*diag) ("line doesn't terminate with newline\n");
> +  pos = 0;
> +  consume_whitespace ();
> +  if (linebuf[pos] != '\n' && linebuf[pos] != ';')
> + return 1;
> +}
> +}
ok

> +
> +static inline void
> +safe_inc_pos (void)
> +{
> +  if (pos++ >= LINELEN)
> +{
> +  (*diag) ("line length overrun.\n");
> +  exit (1);
> +}
> +}

ok

> +
> +/* Match an identifier, returning NULL on failure, else a pointer to a
> +   buffer containing the identifier.  */
> +static char *
> +match_identifier (void)
> +{
> +  int lastpos = pos - 1;
> +  while (isalnum (linebuf[lastpos + 1]) || linebuf[lastpos + 1] == '_')
> +++lastpos;
> +
> +  if (lastpos < pos)
> +return 0;
> +
> +  char *buf = (char *) malloc (lastpos - pos + 2);
> +  memcpy (buf, &linebuf[pos], lastpos - pos + 1);
> +  buf[lastpos - pos + 1] = '\0';
> +
> +  pos = lastpos + 1;
> +  return buf;
> +}
ok
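
For comparison, a sketch of the same matching logic written against
std::string, with the position passed by reference rather than kept in a
global.  This is only an illustration and not a proposed replacement;
like the quoted function, it accepts a leading digit:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Returns the run of [A-Za-z0-9_] characters starting at pos and
// advances pos past it, or returns nullopt and leaves pos alone if
// there is no such character at pos.
inline std::optional<std::string>
match_identifier(const std::string &line, std::size_t &pos) {
  std::size_t end = pos;
  while (end < line.size()
         && (std::isalnum(static_cast<unsigned char>(line[end]))
             || line[end] == '_'))
    ++end;
  if (end == pos)
    return std::nullopt;
  std::string id = line.substr(pos, end - pos);
  pos = end;
  return id;
}
```

Returning an owning string avoids the malloc'd buffer that the caller of
the quoted version has to manage.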


> +
> +/* Match an integer and return the string representing its value,
> +   or a null string on failure.  */
> +static char *
> +match_integer (void)
> +{
> +  int startpos = pos;
> +  if (linebuf[pos] == '-')
> +safe_inc_pos ();
> +
> +  int lastpos = pos - 1;
> +  while (isdigit (linebuf[lastpos + 1]))
> +++lastpos;
> +
> +  if (lastpos < pos)
> +return NULL;
> +
> +  pos = lastpos + 1;
> +  char *buf = (char *) malloc (lastpos - startpos + 2);
> +  memcpy (buf, &linebuf[startpos], lastpos - startpos + 1);
> +  buf[lastpos - startpos + 1] = '\0';
> +  return buf;
> +}
Ok
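
One detail worth noting in the quoted function: a '-' that is not
followed by a digit has already been consumed when NULL is returned.  A
sketch of a variant that leaves the position untouched on failure is
below; it is an illustration with a by-reference position, not the
generator's code, and the different failure behavior is a deliberate
choice here:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Matches an optional '-' followed by one or more decimal digits.
// On success returns the matched text and advances pos; on failure
// returns nullopt and does not move pos.
inline std::optional<std::string>
match_integer(const std::string &line, std::size_t &pos) {
  std::size_t cur = pos;
  if (cur < line.size() && line[cur] == '-')
    ++cur;
  const std::size_t first_digit = cur;
  while (cur < line.size()
         && std::isdigit(static_cast<unsigned char>(line[cur])))
    ++cur;
  if (cur == first_digit)
    return std::nullopt;
  std::string text = line.substr(pos, cur - pos);
  pos = cur;
  return text;
}
```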

> +
> +/* Match a string up to but not including a ']', and return its value,
> +   or zero if there is nothing before the ']'.  Error if we don't find
> +   such a character.  */
> +static const char *
> +match_to_right_bracket (void)
> +{
> +  int lastpos = pos - 1;
> +  while (linebuf[lastpos + 1] != ']')
> +{
> +  if (linebuf[lastpos + 1] == '\n')
> + {
> +   (*diag) ("no ']' found before end of line.\n");
> +   exit (1);
> + }
> +  ++lastpos;
> +}
> +
> +  if (lastpos < pos)
> +return 0;
> +
> +  char *buf = (char *) malloc (lastpos - pos + 2);
> +  memcpy (buf, &linebuf[pos], lastpos - pos + 1);
> +  buf[lastpos - pos + 1] = '\0';
> +
> +  pos = lastpos + 1;
> +  return buf;
> +}

Ok. 

presumably all tested OK.. :-)

lgtm, 
thanks
-Will



[PATCH] c++: permit deduction guides at class scope [PR79501]

2021-07-09 Thread Patrick Palka via Gcc-patches
This adds support for declaring (class-scope) deduction guides for a
member class template.  Fortunately it seems only a couple of changes
are needed in order for the existing CTAD machinery to handle them like
any other deduction guide: we need to make sure to give them a
FUNCTION_TYPE instead of a METHOD_TYPE, and we need to avoid using a
BASELINK when looking them up.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/79501

gcc/cp/ChangeLog:

* decl.c (grokfndecl): Don't require that deduction guides are
declared at namespace scope.  Check that class-scope deduction
guides have the same access as the member class template.
(grokdeclarator): Pretend class-scope deduction guides are static.
* name-lookup.c (lookup_qualified_name): Don't use a BASELINK
for class-scope deduction guides.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction92.C: New test.
* g++.dg/cpp1z/class-deduction93.C: New test.
* g++.dg/cpp1z/class-deduction94.C: New test.
---
 gcc/cp/decl.c | 17 -
 gcc/cp/name-lookup.c  | 11 +---
 .../g++.dg/cpp1z/class-deduction92.C  | 16 
 .../g++.dg/cpp1z/class-deduction93.C  | 25 +++
 .../g++.dg/cpp1z/class-deduction94.C  | 19 ++
 5 files changed, 79 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction92.C
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction93.C
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction94.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index ebe1318d38d..8b8ffb7de83 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -10040,12 +10040,6 @@ grokfndecl (tree ctype,
 
   if (deduction_guide_p (decl))
 {
-  if (!DECL_NAMESPACE_SCOPE_P (decl))
-   {
- error_at (location, "deduction guide %qD must be declared at "
-   "namespace scope", decl);
- return NULL_TREE;
-   }
   tree type = TREE_TYPE (DECL_NAME (decl));
   if (in_namespace == NULL_TREE
  && CP_DECL_CONTEXT (decl) != CP_TYPE_CONTEXT (type))
@@ -10055,6 +10049,13 @@ grokfndecl (tree ctype,
  inform (location_of (type), "  declared here");
  return NULL_TREE;
}
+  if (DECL_CLASS_SCOPE_P (decl)
+ && current_access_specifier != declared_access (TYPE_NAME (type)))
+   {
+ error_at (location, "deduction guide %qD must have the same access "
+ "as %qT", decl, type);
+ inform (location_of (type), "  declared here");
+   }
   if (funcdef_flag)
error_at (location,
  "deduction guide %qD must not have a function body", decl);
@@ -12035,6 +12036,10 @@ grokdeclarator (const cp_declarator *declarator,
   storage_class = declspecs->storage_class;
   if (storage_class == sc_static)
 staticp = 1 + (decl_context == FIELD);
+  else if (decl_context == FIELD && sfk == sfk_deduction_guide)
+/* Treat class-scope deduction guides as static member functions
+   so that they get a FUNCTION_TYPE instead of a METHOD_TYPE.  */
+staticp = 2;
 
   if (virtualp)
 {
diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 1be5f3da6d5..089bca1d471 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -7110,9 +7110,14 @@ lookup_qualified_name (tree scope, tree name, LOOK_want 
want, bool complain)
   else if (cxx_dialect != cxx98 && TREE_CODE (scope) == ENUMERAL_TYPE)
 t = lookup_enumerator (scope, name);
   else if (is_class_type (scope, complain))
-t = lookup_member (scope, name, 2, bool (want & LOOK_want::TYPE),
-  tf_warning_or_error);
-
+{
+  t = lookup_member (scope, name, 2, bool (want & LOOK_want::TYPE),
+tf_warning_or_error);
+  if (t && dguide_name_p (name))
+   /* Since class-scope deduction guides aren't really member functions,
+  don't use a BASELINK for them.  */
+   t = MAYBE_BASELINK_FUNCTIONS (t);
+}
   if (!t)
 return error_mark_node;
   return t;
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction92.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction92.C
new file mode 100644
index 000..178234c76d1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction92.C
@@ -0,0 +1,16 @@
+// PR c++/79501
+// { dg-do compile { target c++17 } }
+
+template
+struct X {
+  template
+  struct B { T t; };
+
+  template B(T) -> B;
+
+  auto foo() { return B{V}; }
+};
+
+X<42> x;
+using type = decltype(x.foo());
+using type = X<42>::B;
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction93.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction93.C
new file mode 100644
index 000..9d2db7a55a2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction93.C
@@ -0,0 +1,25 @@
+// PR c++/79501
+// { dg-do compile { target c++17 } }

[PATCH] c++: 'new T[N]' and SFINAE [PR82110]

2021-07-09 Thread Patrick Palka via Gcc-patches
Here we're failing to treat 'new T[N]' as erroneous in a SFINAE context
when T isn't default constructible because expand_aggr_init_1 doesn't
communicate to build_aggr_init (its only SFINAE caller) whether the
initialization was actually successful.  To fix this, this patch makes
expand_aggr_init_1 and its subroutine expand_default_init return true on
success, false on failure so that build_aggr_init can properly return
error_mark_node on failure.
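
For illustration only, here is a minimal sketch of the kind of SFINAE check
this affects.  The names can_new_array, make_void and NoDefault are invented
for the example and are not part of the patch or of the new testcase.

```cpp
#include <type_traits>

// Hypothetical helper (not from the patch): maps any type list to void, so
// that a substitution failure in the argument discards the specialization.
template <typename...> struct make_void { using type = void; };

// Primary template: chosen when 'new T[5]' is ill-formed for T.
template <typename T, typename = void>
struct can_new_array : std::false_type {};

// Partial specialization: viable only if 'new T[5]' is a valid expression.
template <typename T>
struct can_new_array<T, typename make_void<decltype (new T[5])>::type>
  : std::true_type {};

// A type without a default constructor (illustrative only).
struct NoDefault { NoDefault (int); };

// Array new of a default-constructible type is always fine.
static_assert (can_new_array<int>::value, "int[5] can be allocated");
```

With the fix applied, can_new_array<NoDefault>::value should come out false,
since NoDefault cannot be default-initialized; without it, the expression
may be wrongly accepted in this context.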

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps 11 (given its impact on concepts)?

PR c++/82110

gcc/cp/ChangeLog:

* init.c (build_aggr_init): Return error_mark_node if
expand_aggr_init_1 returns false.
(expand_default_init): Change return type to bool.  Return false
on error, true on success.
(expand_aggr_init_1): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/pr78765.C: Expect another conversion failure
diagnostic.
* g++.dg/template/sfinae14.C: Expect new X[5] is ill-formed
since X is not default constructible.
* g++.dg/cpp2a/concepts-requires27.C: New test.
---
 gcc/cp/init.c | 41 +--
 gcc/testsuite/g++.dg/cpp0x/pr78765.C  |  2 +-
 .../g++.dg/cpp2a/concepts-requires27.C| 10 +
 gcc/testsuite/g++.dg/template/sfinae14.C  |  2 +-
 4 files changed, 40 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-requires27.C

diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index 88f6f90a800..1d863ed8538 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -39,8 +39,8 @@ along with GCC; see the file COPYING3.  If not see
 static bool begin_init_stmts (tree *, tree *);
 static tree finish_init_stmts (bool, tree, tree);
 static void construct_virtual_base (tree, tree);
-static void expand_aggr_init_1 (tree, tree, tree, tree, int, tsubst_flags_t);
-static void expand_default_init (tree, tree, tree, tree, int, tsubst_flags_t);
+static bool expand_aggr_init_1 (tree, tree, tree, tree, int, tsubst_flags_t);
+static bool expand_default_init (tree, tree, tree, tree, int, tsubst_flags_t);
 static void perform_member_init (tree, tree);
 static int member_init_ok_or_else (tree, tree, tree);
 static void expand_virtual_init (tree, tree);
@@ -1838,12 +1838,14 @@ build_aggr_init (tree exp, tree init, int flags, 
tsubst_flags_t complain)
   is_global = begin_init_stmts (&stmt_expr, &compound_stmt);
   destroy_temps = stmts_are_full_exprs_p ();
   current_stmt_tree ()->stmts_are_full_exprs_p = 0;
-  expand_aggr_init_1 (TYPE_BINFO (type), exp, exp,
- init, LOOKUP_NORMAL|flags, complain);
+  bool ok = expand_aggr_init_1 (TYPE_BINFO (type), exp, exp,
+   init, LOOKUP_NORMAL|flags, complain);
   stmt_expr = finish_init_stmts (is_global, stmt_expr, compound_stmt);
   current_stmt_tree ()->stmts_are_full_exprs_p = destroy_temps;
   TREE_READONLY (exp) = was_const;
   TREE_THIS_VOLATILE (exp) = was_volatile;
+  if (!ok)
+return error_mark_node;
 
   if ((VAR_P (exp) || TREE_CODE (exp) == PARM_DECL)
   && TREE_SIDE_EFFECTS (stmt_expr)
@@ -1854,7 +1856,7 @@ build_aggr_init (tree exp, tree init, int flags, 
tsubst_flags_t complain)
   return stmt_expr;
 }
 
-static void
+static bool
 expand_default_init (tree binfo, tree true_exp, tree exp, tree init, int flags,
  tsubst_flags_t complain)
 {
@@ -1889,6 +1891,9 @@ expand_default_init (tree binfo, tree true_exp, tree exp, 
tree init, int flags,
happen for direct-initialization, too.  */
 init = digest_init (type, init, complain);
 
+  if (init == error_mark_node)
+return false;
+
   /* A CONSTRUCTOR of the target's type is a previously digested
  initializer, whether that happened just above or in
  cp_parser_late_parsing_nsdmi.
@@ -1910,7 +1915,7 @@ expand_default_init (tree binfo, tree true_exp, tree exp, 
tree init, int flags,
   init = build2 (INIT_EXPR, TREE_TYPE (exp), exp, init);
   TREE_SIDE_EFFECTS (init) = 1;
   finish_expr_stmt (init);
-  return;
+  return true;
 }
 
   if (init && TREE_CODE (init) != TREE_LIST
@@ -1927,8 +1932,12 @@ expand_default_init (tree binfo, tree true_exp, tree 
exp, tree init, int flags,
   have already built up the constructor call so we could wrap it
   in an exception region.  */;
   else
-   init = ocp_convert (type, init, CONV_IMPLICIT|CONV_FORCE_TEMP,
-   flags, complain | tf_no_cleanup);
+   {
+ init = ocp_convert (type, init, CONV_IMPLICIT|CONV_FORCE_TEMP,
+ flags, complain | tf_no_cleanup);
+ if (init == error_mark_node)
+   return false;
+   }
 
   if (TREE_CODE (init) == MUST_NOT_THROW_EXPR)
/* We need to protect the initialization of a catch parm with a
@@ -1944,7 +1953,7 @@ expand_default_init (tree binfo, tree true_exp, tree exp, 
tree init, 

Re: [patch][version 4]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-09 Thread Qing Zhao via Gcc-patches
Hi,

> On Jul 9, 2021, at 11:18 AM, Martin Jambor  wrote:
>> 
>>> On Jul 8, 2021, at 8:29 AM, Martin Jambor  wrote:
 diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
 index c05d22f3e8f1..35051d7c6b96 100644
 --- a/gcc/tree-sra.c
 +++ b/gcc/tree-sra.c
 @@ -384,6 +384,13 @@ static struct
 
  /* Numbber of components created when splitting aggregate parameters.  */
  int param_reductions_created;
 +
 +  /* Number of deferred_init calls that are modified.  */
 +  int deferred_init;
 +
 +  /* Number of deferred_init calls that are created by
 + generate_subtree_deferred_init.  */
 +  int subtree_deferred_init;
 } sra_stats;
 
 static void
 @@ -4096,6 +4103,110 @@ get_repl_default_def_ssa_name (struct access 
 *racc, tree reg_type)
  return get_or_create_ssa_default_def (cfun, racc->replacement_decl);
 }
 
 +
 +/* Generate statements to call .DEFERRED_INIT to initialize scalar 
 replacements
 +   of accesses within a subtree ACCESS; all its children, siblings and 
 their
 +   children are to be processed.
 +   GSI is a statement iterator used to place the new statements.  */
 +static void
 +generate_subtree_deferred_init (struct access *access,
 +  tree init_type,
 +  tree is_vla,
 +  gimple_stmt_iterator *gsi,
 +  location_t loc)
 +{
 +  do
 +{
 +  if (access->grp_to_be_replaced)
 +  {
 +tree repl = get_access_replacement (access);
 +gimple *call
 +  = gimple_build_call_internal (IFN_DEFERRED_INIT, 3,
 +TYPE_SIZE_UNIT (TREE_TYPE (repl)),
 +init_type, is_vla);
 +gimple_call_set_lhs (call, repl);
 +gsi_insert_before (gsi, call, GSI_SAME_STMT);
 +update_stmt (call);
 +gimple_set_location (call, loc);
 +sra_stats.subtree_deferred_init++;
 +  }
 +  else if (access->grp_to_be_debug_replaced)
 +  {
 +tree drepl = get_access_replacement (access);
 +tree call = build_call_expr_internal_loc
 +   (UNKNOWN_LOCATION, IFN_DEFERRED_INIT,
 +TREE_TYPE (drepl), 3,
 +TYPE_SIZE_UNIT (TREE_TYPE (drepl)),
 +init_type, is_vla);
 +gdebug *ds = gimple_build_debug_bind (drepl, call,
 +  gsi_stmt (*gsi));
 +gsi_insert_before (gsi, ds, GSI_SAME_STMT);
>>> 
>>> Is handling of grp_to_be_debug_replaced accesses necessary here?  If so,
>>> why?  grp_to_be_debug_replaced accesses are there only to facilitate
>>> debug information about a part of an aggregate decl that is likely
>>> going to be entirely removed - so that debuggers can sometimes show to
>>> users information about what they would contain had they not been removed.
>>> It seems strange you need to mark them as uninitialized because they
>>> should not have any consumers.  (But perhaps it is also harmless.)
>> 
>> This part has been discussed during the 2nd version of the patch, but
>> I think that more discussion might be necessary.
>> 
>> In the previous discussion, Richard Sandiford mentioned:
>> (https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568620.html):
>> 
>> =
>> 
>> I guess the thing we need to decide here is whether -ftrivial-auto-var-init
>> should affect debug-only constructs too.  If it doesn't, examining removed
>> components in a debugger might show uninitialised values in cases where
>> the user was expecting initialised ones.  There would be no security
>> concern, but it might be surprising.
>> 
>> I think in principle the DRHS can contain a call to DEFERRED_INIT.
>> Doing that would probably require further handling elsewhere though.
>> 
>> =
>> 
>> I am still not very confident about this part of the change.
> 
> I see.  I still tend to think that with or without the generation of
> gimple_build_debug_binds, the debugger would still not display any value
> for the component in question.  Without it there would be no information
> about the component at any place in code affected by this, with it the
> component would be explicitly uninitialized.  But OK.

So, is my current change for access->grp_to_be_debug_replaced good?

Do I need to modify any other code in addition to this in order to let the
debugger work correctly?

Or would deleting this part of the code be simpler and better?

>> 
>> My questions:
>> 
>> 1. If we don’t handle grp_to_be_debug_replaced at all, what will
>> happen?  ( the user of the debugger will see uninitialized values in
>> the removed part of the aggregate?  Or something else?)
> 
> Well, can you try?  :-) I think the debugger would not have anything to
> display.
I will try to come up with a small example on this.
> 
>> 2. On the other hand, if we handle 

Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]

2021-07-09 Thread will schmidt via Gcc-patches
On Wed, 2021-06-30 at 09:44 +0800, Xionghu Luo via Gcc-patches wrote:
> Gentle ping ^2, thanks.
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html
> 
> 
> On 2021/5/14 15:13, Xionghu Luo via Gcc-patches wrote:
> > Test SPEC2017 Ofast P8LE for this patch : 511.povray_r +1.14%,
> > 526.blender_r +1.72%, no obvious changes to others.

Ok.

> > 
> > 
> > On 2021/5/6 10:36, Xionghu Luo via Gcc-patches wrote:
> > > Gentle ping, thanks.
> > > 
> > > 
> > > On 2021/4/16 15:10, Xiong Hu Luo wrote:
> > > > > fmod/fmodf and remainder/remainderf can be expanded inline instead of
> > > > > calling the library when building with fast-math, which is much faster.
> > > > > 
> > > > > fmodf:
> > > > >   fdivs   f0,f1,f2
> > > > >   friz    f0,f0
> > > > >   fnmsubs f1,f2,f0,f1
> > > > > 
> > > > > remainderf:
> > > > >   fdivs   f0,f1,f2
> > > > >   frin    f0,f0
> > > > >   fnmsubs f1,f2,f0,f1
> > > > 
> > > > gcc/ChangeLog:
> > > > 
> > > > 2021-04-16  Xionghu Luo  
> > > > 
> > > > PR target/97142

That PR is "Bug 97142 - __builtin_fmod not optimized on POWER".

OK.


> > > > > * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
> > > > > (remainder<mode>3): Likewise.


> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > 2021-04-16  Xionghu Luo  
> > > > 
> > > > PR target/97142
> > > > * gcc.target/powerpc/pr97142.c: New test.

Ok.

> > > > ---
> > > >   gcc/config/rs6000/rs6000.md| 36 ++
> > > >   gcc/testsuite/gcc.target/powerpc/pr97142.c | 30 ++
> > > >   2 files changed, 66 insertions(+)
> > > >   create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > > 
> > > > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> > > > index a1315523fec..7e0e94e6ba4 100644
> > > > --- a/gcc/config/rs6000/rs6000.md
> > > > +++ b/gcc/config/rs6000/rs6000.md
> > > > @@ -4902,6 +4902,42 @@ (define_insn "fre"
> > > > [(set_attr "type" "fp")
> > > >  (set_attr "isa" "*,")])
> > > > > +(define_expand "fmod<mode>3"
> > > > > +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> > > > > +(use (match_operand:SFDF 1 "gpc_reg_operand"))
> > > > > +(use (match_operand:SFDF 2 "gpc_reg_operand"))]
> > > > > +  "TARGET_HARD_FLOAT
> > > > > +  && TARGET_FPRND
> > > > > +  && flag_unsafe_math_optimizations"
> > > > > +{
> > > > > +  rtx div = gen_reg_rtx (<MODE>mode);
> > > > > +  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
> > > > > +
> > > > > +  rtx friz = gen_reg_rtx (<MODE>mode);
> > > > > +  emit_insn (gen_btrunc<mode>2 (friz, div));
> > > > > +
> > > > > +  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz, operands[1]));
> > > > > +  DONE;
> > > > > + })
> > > > +
> > > > > +(define_expand "remainder<mode>3"
> > > > > +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> > > > > +(use (match_operand:SFDF 1 "gpc_reg_operand"))
> > > > > +(use (match_operand:SFDF 2 "gpc_reg_operand"))]
> > > > > +  "TARGET_HARD_FLOAT
> > > > > +  && TARGET_FPRND
> > > > > +  && flag_unsafe_math_optimizations"
> > > > > +{
> > > > > +  rtx div = gen_reg_rtx (<MODE>mode);
> > > > > +  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
> > > > > +
> > > > > +  rtx frin = gen_reg_rtx (<MODE>mode);
> > > > > +  emit_insn (gen_round<mode>2 (frin, div));
> > > > > +
> > > > > +  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin, operands[1]));
> > > > > +  DONE;
> > > > > + })

I notice the pattern of arguments to the final emit
is op[0],op[2],fri*,op[1]
while the description comment suggests the generated instruction 
will be fnmsubs  f1,f2,f0,f1  ;

I don't see any rearranging in the nfms4 expansions, but
presumably this is correct and just a cosmetic nit that catches my eye.

Ok.
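
For what it's worth, the operand order can be checked against the arithmetic
the expansion is meant to perform.  The sketch below is only an illustration
in plain C++, assuming the nfms pattern computes -(a * b - c); the function
names fast_fmod and fast_remainder are made up here and do not appear in the
patch.

```cpp
#include <cmath>

// Sketch of the fast-math expansion of fmod: x - y * trunc (x / y).
// Unlike std::fmod, this can lose accuracy when x / y is large or inexact.
inline double fast_fmod (double x, double y)
{
  double q = std::trunc (x / y);   // friz: round toward zero
  return x - y * q;                // fnmsub: -(y * q - x)
}

// Same shape with round-to-nearest, ties away from zero (frin).
// std::remainder breaks ties to even, so results can differ on exact ties.
inline double fast_remainder (double x, double y)
{
  double q = std::round (x / y);   // frin: nearest, ties away from zero
  return x - y * q;                // fnmsub: -(y * q - x)
}
```

With x in operands[1] and y in operands[2], -(y * q - x) is what
(operands[2], fri*, operands[1]) would give under that assumption, which
matches fnmsubs f1,f2,f0,f1 in the description.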


> > > > +
> > > >   (define_insn "*rsqrt2"
> > > > [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,wa")
> > > >   (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" ",wa")]
> > > > diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c 
> > > > b/gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > > new file mode 100644
> > > > index 000..48f25ca5b5b
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > > @@ -0,0 +1,30 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-Ofast" } */
> > > > +
> > > > > +#include <math.h>
> > > > +
> > > > +float test1 (float x, float y)
> > > > +{
> > > > +  return fmodf (x, y);
> > > > +}
> > > > +
> > > > +double test2 (double x, double y)
> > > > +{
> > > > +  return fmod (x, y);
> > > > +}
> > > > +
> > > > +float test3 (float x, float y)
> > > > +{
> > > > +  return remainderf (x, y);
> > > > +}
> > > > +
> > > > +double test4 (double x, double y)
> > > > +{
> > > > +  return remainder (x, y);
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
> > > > +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
> > > > +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
> > > > +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */


Ok.
I'd be tempted to add scan-assembler checks for the 

[PATCH] coroutines: Adjust outlined function names [PR95520].

2021-07-09 Thread Iain Sandoe
Hi,

The mechanism used to date for uniquing the coroutine helper
functions (actor, destroy) was over-complicating things and
leading to the noted PR and also difficulties in setting
breakpoints on these functions (so this will help PR99215 as
well).  The revised mangling matches the form used by clang.
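
As a rough illustration of the revised scheme (not the compiler code itself),
the helper symbol is the original function's assembler name with a suffix
appended after a dot.  The function helper_symbol_name and the sample strings
are invented for this sketch; the suffixes are assumed to be "actor" and
"destroy", after the helpers named above.

```cpp
#include <string>

// Illustrative only: mirrors the "%s.%s" format used in the patch.
// 'assembler_name' stands in for DECL_ASSEMBLER_NAME of the original
// function and 'suffix' for the helper kind passed to act_des_fn.
inline std::string helper_symbol_name (const std::string &assembler_name,
                                       const std::string &suffix)
{
  return assembler_name + "." + suffix;
}
```

For a mangled name such as "_Z3foov", this would give "_Z3foov.actor" and
"_Z3foov.destroy", while the helper's DECL_NAME stays that of the original
function.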

OK for master & backports?
thanks
Iain

Signed-off-by: Iain Sandoe 

PR c++/95520 - [coroutines] __builtin_FUNCTION() returns mangled .actor instead 
of original function name

PR c++/95520

gcc/cp/ChangeLog:

* coroutines.cc (act_des_fn): Adjust coroutine
helper function name mangling.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr95520.C: New test.
---
 gcc/cp/coroutines.cc  | 14 +--
 gcc/testsuite/g++.dg/coroutines/pr95520.C | 29 +++
 2 files changed, 41 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95520.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 54ffdc8d062..1a3ab58e044 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -3985,9 +3985,19 @@ register_local_var_uses (tree *stmt, int *do_subtree, 
void *d)
 static tree
 act_des_fn (tree orig, tree fn_type, tree coro_frame_ptr, const char* name)
 {
-  tree fn_name = get_fn_local_identifier (orig, name);
+  tree fn_name;
   location_t loc = DECL_SOURCE_LOCATION (orig);
-  tree fn = build_lang_decl (FUNCTION_DECL, fn_name, fn_type);
+  tree fn = build_lang_decl (FUNCTION_DECL, DECL_NAME (orig), fn_type);
+  if (tree da_name = DECL_ASSEMBLER_NAME (orig))
+{
+  char *buf = xasprintf ("%s.%s", IDENTIFIER_POINTER (da_name), name);
+  fn_name = get_identifier (buf);
+  free (buf);
+}
+  else
+fn_name = get_fn_local_identifier (orig, name);
+
+  SET_DECL_ASSEMBLER_NAME (fn, fn_name);
   DECL_CONTEXT (fn) = DECL_CONTEXT (orig);
   DECL_SOURCE_LOCATION (fn) = loc;
   DECL_ARTIFICIAL (fn) = true;
diff --git a/gcc/testsuite/g++.dg/coroutines/pr95520.C 
b/gcc/testsuite/g++.dg/coroutines/pr95520.C
new file mode 100644
index 000..4849b0789c7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr95520.C
@@ -0,0 +1,29 @@
+// { dg-do run }
+// { dg-output "coroutine name: MyFoo" }
+#include 
+#include 
+
+struct pt
+{
+using handle_t = std::coroutine_handle<pt>;
+auto get_return_object() noexcept { return handle_t::from_promise(*this); }
+
+std::suspend_never initial_suspend () const noexcept { return {}; }
+std::suspend_never final_suspend () const noexcept { return {}; }
+void return_void() const noexcept {}
+void unhandled_exception() const noexcept {}
+};
+
+template <> struct std::coroutine_traits<pt::handle_t>
+{ using promise_type = pt; };
+
+static pt::handle_t MyFoo ()
+{ 
+printf ("coroutine name: %s\n", __builtin_FUNCTION());
+co_return;
+}
+
+int main()
+{
+MyFoo ();
+}
-- 
2.24.1



[pushed] coroutines: Factor code. Match original source location in helpers [NFC].

2021-07-09 Thread Iain Sandoe
Hi,

This is primarily a source code refactoring; the only change is to
ensure that the outlined functions are marked to begin at the same
line as the original.  Otherwise, they get the default (which seems
to be input_location, which corresponds to the closing brace at the
point that this is done).  Having the source location point to that
confuses some debuggers.

This is a contributory fix to:
PR c++/99215 - coroutines: debugging with gdb

tested on x86_64-darwin, linux,
pushed to master as trivial/obvious,
thanks
Iain

Signed-off-by: Iain Sandoe 

gcc/cp/ChangeLog:

* coroutines.cc (build_actor_fn): Move common code to
act_des_fn.
(build_destroy_fn): Likewise.
(act_des_fn): Build the void return here.  Ensure that the
source location matches the original function.
---
 gcc/cp/coroutines.cc | 29 +++--
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index f5ae2d6d101..54ffdc8d062 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -2155,13 +2155,6 @@ build_actor_fn (location_t loc, tree coro_frame_type, 
tree actor, tree fnbody,
   /* One param, the coro frame pointer.  */
   tree actor_fp = DECL_ARGUMENTS (actor);
 
-  /* A void return.  */
-  tree resdecl = build_decl (loc, RESULT_DECL, 0, void_type_node);
-  DECL_ARTIFICIAL (resdecl) = 1;
-  DECL_IGNORED_P (resdecl) = 1;
-  DECL_RESULT (actor) = resdecl;
-  DECL_COROUTINE_P (actor) = 1;
-
   /* We have a definition here.  */
   TREE_STATIC (actor) = 1;
 
@@ -2532,15 +2525,8 @@ build_destroy_fn (location_t loc, tree coro_frame_type, 
tree destroy,
   /* One param, the coro frame pointer.  */
   tree destr_fp = DECL_ARGUMENTS (destroy);
 
-  /* A void return.  */
-  tree resdecl = build_decl (loc, RESULT_DECL, 0, void_type_node);
-  DECL_ARTIFICIAL (resdecl) = 1;
-  DECL_IGNORED_P (resdecl) = 1;
-  DECL_RESULT (destroy) = resdecl;
-
   /* We have a definition here.  */
   TREE_STATIC (destroy) = 1;
-  DECL_COROUTINE_P (destroy) = 1;
 
   tree destr_outer = push_stmt_list ();
   current_stmt_tree ()->stmts_are_full_exprs_p = 1;
@@ -4000,15 +3986,19 @@ static tree
 act_des_fn (tree orig, tree fn_type, tree coro_frame_ptr, const char* name)
 {
   tree fn_name = get_fn_local_identifier (orig, name);
+  location_t loc = DECL_SOURCE_LOCATION (orig);
   tree fn = build_lang_decl (FUNCTION_DECL, fn_name, fn_type);
   DECL_CONTEXT (fn) = DECL_CONTEXT (orig);
+  DECL_SOURCE_LOCATION (fn) = loc;
   DECL_ARTIFICIAL (fn) = true;
   DECL_INITIAL (fn) = error_mark_node;
+
   tree id = get_identifier ("frame_ptr");
   tree fp = build_lang_decl (PARM_DECL, id, coro_frame_ptr);
   DECL_CONTEXT (fp) = fn;
   DECL_ARG_TYPE (fp) = type_passed_as (coro_frame_ptr);
   DECL_ARGUMENTS (fn) = fp;
+
   /* Copy selected attributes from the original function.  */
   TREE_USED (fn) = TREE_USED (orig);
   if (DECL_SECTION_NAME (orig))
@@ -4020,6 +4010,17 @@ act_des_fn (tree orig, tree fn_type, tree 
coro_frame_ptr, const char* name)
   DECL_USER_ALIGN (fn) = DECL_USER_ALIGN (orig);
   /* Apply attributes from the original fn.  */
   DECL_ATTRIBUTES (fn) = copy_list (DECL_ATTRIBUTES (orig));
+
+  /* A void return.  */
+  tree resdecl = build_decl (loc, RESULT_DECL, 0, void_type_node);
+  DECL_CONTEXT (resdecl) = fn;
+  DECL_ARTIFICIAL (resdecl) = 1;
+  DECL_IGNORED_P (resdecl) = 1;
+  DECL_RESULT (fn) = resdecl;
+
+  /* This is a coroutine component.  */
+  DECL_COROUTINE_P (fn) = 1;
+
   return fn;
 }
 
-- 
2.24.1




[pushed] coroutines: Fix a typo in rewriting the function.

2021-07-09 Thread Iain Sandoe
Hi,

When amending the function re-write code, I made a typo in
the block connections.  This has not shown up in any test
fails (as far as can be seen) but is a regression in debug
info.

Fixed thus.

tested on x86_64-darwin, linux,
pushed to master as obvious, I plan to back-port it where needed
unless any objection is raised.
thanks
Iain

Signed-off-by: Iain Sandoe 

gcc/cp/ChangeLog:

* coroutines.cc
(coro_rewrite_function_body): Connect the replacement
function block to the block nest correctly.
---
 gcc/cp/coroutines.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index a1b0b31f497..f5ae2d6d101 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -4055,8 +4055,8 @@ coro_rewrite_function_body (location_t fn_start, tree 
fnbody, tree orig,
   BIND_EXPR_BLOCK (first) = replace_blk;
   /* The top block has one child, so far, and we have now got a 
 superblock.  */
-  BLOCK_SUPERCONTEXT (block) = top_block;
-  BLOCK_SUBBLOCKS (top_block) = block;
+  BLOCK_SUPERCONTEXT (replace_blk) = top_block;
+  BLOCK_SUBBLOCKS (top_block) = replace_blk;
 }
 
   /* Wrap the function body in a try {} catch (...) {} block, if exceptions
-- 
2.24.1




[pushed] Darwin, config: Revise host config fragment.

2021-07-09 Thread Iain Sandoe
Hi,

There were two uses for the Darwin host config fragment:

The first is to arrange for targets that support mdynamic-no-pic
to be built with that enabled (since it makes a significant
difference to the compiler performance).  We can be more specific
in the application of this, since it only applies to 32b hosts
plus powerpc64-darwin9.

The second was to work around a tool bug where -fno-PIE was not
propagated to the link stage.  This second use is redundant,
since the buggy toolchain cannot bootstrap current GCC sources
anyway.

This makes the host fragment more specific and reduces the number
of toolchains for which it is included which reduces clutter in
configure lines.

tested across the Darwin range and on x86_64-linux,
pushed to master, thanks
Iain

Signed-off-by: Iain Sandoe 

config/ChangeLog:

* mh-darwin: Make this specific to handling the
mdynamic-no-pic case.

ChangeLog:

* configure: Regenerate.
* configure.ac: Adjust cases for which it is necessary to
include the Darwin host config fragment.
---
 config/mh-darwin | 57 
 configure|  2 +-
 configure.ac |  2 +-
 3 files changed, 35 insertions(+), 26 deletions(-)

diff --git a/config/mh-darwin b/config/mh-darwin
index 148b73038c3..fb2bb5ad1d9 100644
--- a/config/mh-darwin
+++ b/config/mh-darwin
@@ -1,29 +1,38 @@
 # The -mdynamic-no-pic ensures that the compiler executable is built without
-# position-independent-code -- the usual default on Darwin. This fix speeds
-# compiles by 3-5%.  Don't add it if the compiler doesn't also support
-# -mno-dynamic-no-pic to undo it.
-DARWIN_MDYNAMIC_NO_PIC := \
-`case ${host} in i?86-*-darwin* | powerpc-*-darwin*) \
-   $(CC) -S -xc /dev/null -o /dev/null -mno-dynamic-no-pic 2>/dev/null \
-   && echo -mdynamic-no-pic ;; esac`
-DARWIN_GCC_MDYNAMIC_NO_PIC := \
-`case ${host} in i?86-*-darwin* | powerpc-*-darwin*) \
-   $(CC) -S -xc /dev/null -o /dev/null -mno-dynamic-no-pic 2>/dev/null \
-   || echo -mdynamic-no-pic ;; esac`
+# position-independent-code -- the usual default on Darwin. This speeds compiles
+# by 8-20% (measurements made against GCC-11).
+# However, we cannot add it unless the bootstrap compiler supports
+# -mno-dynamic-no-pic to undo it, since libiberty, at least, needs this.
 
-# ld on Darwin versions >= 10.7 defaults to PIE executables. Disable this for
-# gcc components, since it is incompatible with our pch implementation.
-DARWIN_NO_PIE := `case ${host} in *-*-darwin[1][1-9]*) echo -Wl,-no_pie ;; esac;`
+# We use Werror, since some versions of clang report unknown command line flags
+# as a warning only.
 
-BOOT_CFLAGS += $(DARWIN_MDYNAMIC_NO_PIC)
-BOOT_LDFLAGS += $(DARWIN_NO_PIE)
+# We only need to determine this for the host tool used to build stage1 (or a
+# non-bootstrapped compiler), later stages will be built by GCC which supports
+# the required flags.
 
-# Similarly, for cross-compilation.
-STAGE1_CFLAGS += $(DARWIN_MDYNAMIC_NO_PIC)
-STAGE1_LDFLAGS += $(DARWIN_NO_PIE)
+BOOTSTRAP_TOOL_CAN_USE_MDYNAMIC_NO_PIC := $(shell \
+  $(CC) -S -xc /dev/null -o /dev/null -Werror -mno-dynamic-no-pic 2>/dev/null \
+  && echo true)
 
-# Without -mno-dynamic-no-pic support, add -mdynamic-no-pic just to later
-# stages when we know it is built with gcc.
-STAGE2_CFLAGS += $(DARWIN_GCC_MDYNAMIC_NO_PIC)
-STAGE3_CFLAGS += $(DARWIN_GCC_MDYNAMIC_NO_PIC)
-STAGE4_CFLAGS += $(DARWIN_GCC_MDYNAMIC_NO_PIC)
+@if gcc-bootstrap
+ifeq (${BOOTSTRAP_TOOL_CAN_USE_MDYNAMIC_NO_PIC},true)
+STAGE1_CFLAGS += -mdynamic-no-pic
+else
+STAGE1_CFLAGS += -fPIC
+endif
+# Add -mdynamic-no-pic to later stages when we know it is built with GCC.
+BOOT_CFLAGS += -mdynamic-no-pic
+@endif gcc-bootstrap
+
+@unless gcc-bootstrap
+ifeq (${BOOTSTRAP_TOOL_CAN_USE_MDYNAMIC_NO_PIC},true)
+# FIXME: we should also enable this for cross and non-bootstrap builds but
+# that needs amendment to libcc1.
+# CFLAGS += -mdynamic-no-pic
+# CXXFLAGS += -mdynamic-no-pic
+else
+CFLAGS += -fPIC
+CXXFLAGS += -fPIC
+endif
+@endunless gcc-bootstrap
diff --git a/configure b/configure
index 732d1870b3d..85ab9915402 100755
--- a/configure
+++ b/configure
@@ -4074,7 +4074,7 @@ fi
   hppa*-*)
 host_makefile_frag="config/mh-pa"
 ;;
-  *-*-darwin*)
+  i?86-*-darwin[89]* | i?86-*-darwin1[0-7]* | powerpc*-*-darwin*)
 host_makefile_frag="config/mh-darwin"
 ;;
   powerpc-*-aix*)
diff --git a/configure.ac b/configure.ac
index 041ee249bac..1df038b04f3 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1318,7 +1318,7 @@ case "${host}" in
   hppa*-*) 
 host_makefile_frag="config/mh-pa"
 ;;
-  *-*-darwin*)
+  i?86-*-darwin[[89]]* | i?86-*-darwin1[[0-7]]* | powerpc*-*-darwin*)
 host_makefile_frag="config/mh-darwin"
 ;;
   powerpc-*-aix*)
-- 
2.24.1




Re: [r12-2132 Regression] FAIL: g++.dg/warn/Warray-bounds-20.C -std=gnu++98 note (test for warnings, line 55) on Linux/x86_64

2021-07-09 Thread Martin Sebor via Gcc-patches

On 7/9/21 2:16 AM, Maxim Kuvyrkov via Gcc-patches wrote:

On 9 Jul 2021, at 02:35, sunil.k.pandey via Gcc-patches 
 wrote:

On Linux/x86_64,

a110855667782dac7b674d3e328b253b3b3c919b is the first bad commit
commit a110855667782dac7b674d3e328b253b3b3c919b
Author: Martin Sebor 
Date:   Wed Jul 7 14:05:25 2021 -0600

Correct handling of variable offset minus constant in -Warray-bounds [PR100137]

caused


Hi Martin,

I see these failing on aarch64-linux-gnu as well:



FAIL: gcc.dg/Wstringop-overflow-47.c pr97027 (test for warnings, line 34)
FAIL: gcc.dg/Wstringop-overflow-47.c pr97027 (test for warnings, line 37)
FAIL: gcc.dg/Wstringop-overflow-47.c pr97027 (test for warnings, line 42)


Thanks.  It looks like I removed the xfails a little bit too
indiscriminately.  The test passes with an x86_64 compiler but
the aarch64 IL still isn't handled.  Let me take care of that.




FWIW, I don’t see these on aarch64-linux-gnu:


Good! :)  I fixed these problems just yesterday.

Martin




FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++14 note (test for warnings, line 38)
FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++14 note (test for warnings, line 55)
FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++17 note (test for warnings, line 38)
FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++17 note (test for warnings, line 55)
FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++2a note (test for warnings, line 38)
FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++2a note (test for warnings, line 55)
FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++98 note (test for warnings, line 38)
FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++98 note (test for warnings, line 55)



--
Maxim Kuvyrkov
https://www.linaro.org




with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-2132/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/Wstringop-overflow-47.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/Wstringop-overflow-47.c --target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/warn/Warray-bounds-20.C --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/warn/Warray-bounds-20.C --target_board='unix{-m32\ -march=cascadelake}'"
(Please do not reply to this email; for questions about this report, contact me
at skpgkp2 at gmail dot com)






Re: [PATCH] Check type size for doloop iv on BITS_PER_WORD [PR61837]

2021-07-09 Thread Segher Boessenkool
On Fri, Jul 09, 2021 at 08:43:59AM +0200, Richard Biener wrote:
> I wonder if there's a way to query the target what modes the doloop
> pattern can handle (not being too familiar with the doloop code).

You can look what modes are allowed for operand 0 of doloop_end,
perhaps?  Although that is a define_expand, not a define_insn, so it is
hard to introspect.

> Why do you need to do any checks besides the new type being able to
> represent all IV values?  The original doloop IV will never wrap
> (OTOH if niter is U*_MAX then we compute niter + 1 which will become
> zero ... I suppose the doloop might still do the correct thing here
> but it also still will with a IV with larger type).

doloop_valid_p guarantees it is simple and doesn't wrap.
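
To make the quoted niter + 1 concern concrete, here is a minimal C sketch.
The helper names are made up for illustration and are not GCC functions.
When niter is the maximum value of its type, niter + 1 wraps to zero in the
original width, but not when the addition is done in a wider type:

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only; not GCC code.  */

/* Trip count computed in the IV's own 32-bit type: wraps to 0 when
   niter is UINT32_MAX.  */
uint32_t
trip_count_narrow (uint32_t niter)
{
  return niter + 1u;
}

/* Same computation after widening: niter + 1 is representable for
   every 32-bit niter.  */
uint64_t
trip_count_wide (uint32_t niter)
{
  return (uint64_t) niter + 1u;
}
```

Whether the wider type is actually cheap on a given target is the open
question in this thread; the sketch only shows the value range argument.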

> I'd have expected sth like
> 
>ntype = lang_hooks.types.type_for_mode (word_mode, TYPE_UNSIGNED 
> (ntype));
> 
> thus the decision made using a mode - which is also why I wonder
> if there's a way to query the target for this.  As you say,
> it _may_ be fast, so better check (somehow).

Almost all targets just use Pmode, but there is no such guarantee I
think, and esp. some targets that do not have machine insns for this
(but want to generate different code for this anyway) can do pretty much
anything.

Maybe using just Pmode here is good enough though?


Segher


Re: disable -Warray-bounds in libgo (PR 101374)

2021-07-09 Thread Martin Sebor via Gcc-patches

On 7/9/21 12:16 AM, Richard Biener wrote:

> On Thu, Jul 8, 2021 at 8:02 PM Martin Sebor via Gcc-patches
>  wrote:
>>
>> Hi Ian,
>>
>> Yesterday's enhancement to -Warray-bounds has exposed a couple of
>> issues in libgo where the code writes into an invalid constant
>> address that the warning is designed to flag.
>>
>> On the assumption that those invalid addresses are deliberate,
>> the attached patch suppresses these instances by using #pragma
>> GCC diagnostic but I don't think I'm supposed to commit it (at
>> least Git won't let me).  To avoid Go bootstrap failures please
>> either apply the patch or otherwise suppress the warning (e.g.,
>> by using a volatile pointer temporary).
>
> Btw, I don't think we should diagnose things like
>
>  *(int*)0x21 = 0x21;
>
> when somebody literally writes that he'll be just annoyed by diagnostics.
>
> Of course the above might be able to use __builtin_trap (); - it looks
> like it is placed where control flow should never end, kind of a
> __builtin_unreachable (), which means abort () might do as well.


I agree that the literal case isn't interesting.  At the time
the warnings run the distinction between a nonnull literal and
one derived from a null has been lost.  I'm hoping to replace
this with an early pass to detect null pointer arithmetic.
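
For readers following along, the alternatives mentioned in this thread can
be put side by side in a small C sketch.  The function names are
hypothetical and this is not the libgo code; it only illustrates the three
ways of stopping at a point control flow should never reach: the store
through a constant address (here via a volatile pointer temporary, the
workaround suggested above), __builtin_trap (), and abort ().

```c
#include <stdlib.h>

/* Hypothetical names, for illustration only; not the libgo sources.  */

/* Store to a deliberately invalid constant address.  The address goes
   through a volatile pointer temporary, as suggested in the thread, so
   the compiler cannot see the constant at the point of the store.  */
void
crash_by_store (void)
{
  int *volatile p = (int *) 0x21;
  *p = 0x21;
}

/* GCC builtin that terminates the program abnormally.  */
void
crash_by_trap (void)
{
  __builtin_trap ();
}

/* Plain ISO C alternative.  */
void
crash_by_abort (void)
{
  abort ();
}

/* Hypothetical caller: DIE is only reached when the invariant fails.  */
int
require_positive (int n, void (*die) (void))
{
  if (n <= 0)
    die ();
  return n;
}
```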

Martin



> Richard.
>
>> Thanks
>> Martin




Re: [Ada] Always translate Is_Pure flag into pure in C sense

2021-07-09 Thread Eric Botcazou
> Tested on x86-64/Linux, applied on the mainline, 11 and 10 branches.
> 
> 
> 2021-05-21  Eric Botcazou  
> 
>   * gcc-interface/decl.c (gnat_to_gnu_subprog_type): Always translate
>   the Is_Pure flag into the "pure" attribute of GNU C.

This is the missing piece, applied on the same branches.

* gcc-interface/utils.c (finish_subprog_decl): Remove obsolete line.
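
As background for the "pure" versus "const" wording, here is a minimal
GNU C sketch of the two attributes.  It is not the Ada front-end code and
the names are made up: a "pure" function has no side effects but may read
global memory, while a "const" function may depend only on its arguments.

```c
/* Illustration only; hypothetical names, not gcc-interface code.  */

int table[4] = { 1, 2, 3, 4 };

/* "pure": no side effects, but the result may depend on global memory,
   so a store to TABLE between two calls can change the result.  */
__attribute__ ((pure)) int
sum_table (void)
{
  int sum = 0;
  for (int i = 0; i < 4; i++)
    sum += table[i];
  return sum;
}

/* "const": the result depends on the arguments only.  */
__attribute__ ((const)) int
square (int x)
{
  return x * x;
}
```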

-- 
Eric Botcazou
diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c
index 982274c6d77..535f4ca7fba 100644
--- a/gcc/ada/gcc-interface/utils.c
+++ b/gcc/ada/gcc-interface/utils.c
@@ -3543,9 +3543,6 @@ finish_subprog_decl (tree decl, tree asm_name, tree type)
   DECL_BY_REFERENCE (result_decl) = TREE_ADDRESSABLE (type);
   DECL_RESULT (decl) = result_decl;
 
-  /* Propagate the "const" property.  */
-  TREE_READONLY (decl) = TYPE_READONLY (type);
-
   /* Propagate the "pure" property.  */
   DECL_PURE_P (decl) = TYPE_RESTRICT (type);
 


Re: [PATCH] libffi/x86: Always check __x86_64__ for x86 hosts

2021-07-09 Thread H.J. Lu via Gcc-patches
On Mon, Jul 5, 2021 at 6:00 PM H.J. Lu  wrote:
>
> Since for gnux32 hosts, -m32 generates i386 codes, always check __x86_64__
> for x86 hosts.
>
> PR libffi/101336
> * configure.host: Always check __x86_64__ for x86 hosts.
> ---
>  libffi/configure.host | 21 +++--
>  1 file changed, 7 insertions(+), 14 deletions(-)
>
> diff --git a/libffi/configure.host b/libffi/configure.host
> index 786b32c5bb0..7248acb7458 100644
> --- a/libffi/configure.host
> +++ b/libffi/configure.host
> @@ -95,20 +95,13 @@ case "${host}" in
>i?86-*-* | x86_64-*-* | amd64-*)
> TARGETDIR=x86
> if test $ac_cv_sizeof_size_t = 4; then
> - case "$host" in
> -   *-gnux32)
> - TARGET=X86_64
> - ;;
> -   *)
> - echo 'int foo (void) { return __x86_64__; }' > conftest.c
> - if $CC $CFLAGS -Werror -S conftest.c -o conftest.s > /dev/null 
> 2>&1; then
> -   TARGET=X86_64;
> - else
> -   TARGET=X86;
> - fi
> - rm -f conftest.*
> - ;;
> -  esac
> + echo 'int foo (void) { return __x86_64__; }' > conftest.c
> + if $CC $CFLAGS -Werror -S conftest.c -o conftest.s > /dev/null 
> 2>&1; then
> +   TARGET=X86_64;
> + else
> +   TARGET=X86;
> +  fi
> +  rm -f conftest.*
> else
>   TARGET=X86_64;
> fi
> --
> 2.31.1
>

This has been fixed in upstream with

commit cb8474368cdef3207638d047bd6c707ad8fcb339
Author: hjl-tools 
Date:   Wed Dec 2 12:52:12 2020 -0800

libffi/x86: Always check __x86_64__ for x32 hosts (#601) (#602)

Since for x86_64-*x32 and x86_64-x32-* hosts, -m32 generates ia32 codes.
We should always check __x86_64__ for x32 hosts.

I will check it in if there are no objections.
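
For illustration, the decision the configure fragment makes can be
mirrored in C.  This is a sketch with a hypothetical helper name, and it
is only meaningful when compiled for an x86 host: a 4-byte size_t alone
does not tell i386 from x32, so the __x86_64__ macro decides.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the configure probe; only meaningful
   on x86 hosts.  Returns the libffi TARGET name the probe would pick.  */
const char *
guess_libffi_target (void)
{
  /* An 8-byte size_t can only be the 64-bit ABI.  */
  if (sizeof (size_t) != 4)
    return "X86_64";

  /* A 4-byte size_t is either i386 or x32; the compiler's predefined
     macro tells them apart regardless of the host triplet.  */
#ifdef __x86_64__
  return "X86_64";
#else
  return "X86";
#endif
}
```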

-- 
H.J.


Re: [patch][version 4]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-09 Thread Martin Jambor
Hi,

On Thu, Jul 08 2021, Qing Zhao wrote:
> (Resend this email since the previous one didn’t quote, I changed one
> setting in my mail client, hopefully that can fix this issue).
>
> Hi, Martin,
>
> Thank you for the review and comment.
>
>> On Jul 8, 2021, at 8:29 AM, Martin Jambor  wrote:
>>> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
>>> index c05d22f3e8f1..35051d7c6b96 100644
>>> --- a/gcc/tree-sra.c
>>> +++ b/gcc/tree-sra.c
>>> @@ -384,6 +384,13 @@ static struct
>>> 
>>>   /* Numbber of components created when splitting aggregate parameters.  */
>>>   int param_reductions_created;
>>> +
>>> +  /* Number of deferred_init calls that are modified.  */
>>> +  int deferred_init;
>>> +
>>> +  /* Number of deferred_init calls that are created by
>>> + generate_subtree_deferred_init.  */
>>> +  int subtree_deferred_init;
>>> } sra_stats;
>>> 
>>> static void
>>> @@ -4096,6 +4103,110 @@ get_repl_default_def_ssa_name (struct access *racc, 
>>> tree reg_type)
>>>   return get_or_create_ssa_default_def (cfun, racc->replacement_decl);
>>> }
>>> 
>>> +
>>> +/* Generate statements to call .DEFERRED_INIT to initialize scalar 
>>> replacements
>>> +   of accesses within a subtree ACCESS; all its children, siblings and 
>>> their
>>> +   children are to be processed.
>>> +   GSI is a statement iterator used to place the new statements.  */
>>> +static void
>>> +generate_subtree_deferred_init (struct access *access,
>>> +   tree init_type,
>>> +   tree is_vla,
>>> +   gimple_stmt_iterator *gsi,
>>> +   location_t loc)
>>> +{
>>> +  do
>>> +{
>>> +  if (access->grp_to_be_replaced)
>>> +   {
>>> + tree repl = get_access_replacement (access);
>>> + gimple *call
>>> +   = gimple_build_call_internal (IFN_DEFERRED_INIT, 3,
>>> + TYPE_SIZE_UNIT (TREE_TYPE (repl)),
>>> + init_type, is_vla);
>>> + gimple_call_set_lhs (call, repl);
>>> + gsi_insert_before (gsi, call, GSI_SAME_STMT);
>>> + update_stmt (call);
>>> + gimple_set_location (call, loc);
>>> + sra_stats.subtree_deferred_init++;
>>> +   }
>>> +  else if (access->grp_to_be_debug_replaced)
>>> +   {
>>> + tree drepl = get_access_replacement (access);
>>> + tree call = build_call_expr_internal_loc
>>> +(UNKNOWN_LOCATION, IFN_DEFERRED_INIT,
>>> + TREE_TYPE (drepl), 3,
>>> + TYPE_SIZE_UNIT (TREE_TYPE (drepl)),
>>> + init_type, is_vla);
>>> + gdebug *ds = gimple_build_debug_bind (drepl, call,
>>> +   gsi_stmt (*gsi));
>>> + gsi_insert_before (gsi, ds, GSI_SAME_STMT);
>> 
>> Is handling of grp_to_be_debug_replaced accesses necessary here?  If so,
>> why?  grp_to_be_debug_replaced accesses are there only to facilitate
>> debug information about a part of an aggregate decl is that is likely
>> going to be entirely removed - so that debuggers can sometimes show to
>> users information about what they would contain had they not removed.
>> It seems strange you need to mark them as uninitialized because they
>> should not have any consumers.  (But perhaps it is also harmless.)
>
> This part has been discussed during the 2nd version of the patch, but
> I think that more discussion might be necessary.
>
> In the previous discussion, Richard Sandiford mentioned:
> (https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568620.html):
>
> =
>
> I guess the thing we need to decide here is whether -ftrivial-auto-var-init
> should affect debug-only constructs too.  If it doesn't, exmaining removed
> components in a debugger might show uninitialised values in cases where
> the user was expecting initialised ones.  There would be no security
> concern, but it might be surprising.
>
> I think in principle the DRHS can contain a call to DEFERRED_INIT.
> Doing that would probably require further handling elsewhere though.
>
> =
>
> I am still not very confident now for this part of the change.

I see.  I still tend to think that with or without the generation of
gimple_build_debug_binds, the debugger would still not display any value
for the component in question.  Without it there would be no information
about the component at any place in code affected by this; with it the
component would be explicitly uninitialized.  But OK.

>
> My questions:
>
> 1. If we don’t handle grp_to_be_debug_replaced at all, what will
> happen?  ( the user of the debugger will see uninitialized values in
> the removed part of the aggregate?  Or something else?)

Well, can you try?  :-) I think the debugger would not have anything to
display.

> 2. On the other hand, if we handle grp_to_be_debug_replaced as the
> current patch, what will the user of the debugger see?

I don't know.  It would be interesting to know if the generated DWARF is
different at all.

>
>> 
>> On a 

Re: [PATCH v2 2/2] rs6000: Add test for _mm_minpos_epu16

2021-07-09 Thread Bill Schmidt via Gcc-patches

Hi Paul,

On 6/8/21 2:11 PM, Paul A. Clarke via Gcc-patches wrote:

> Copy the test for _mm_minpos_epu16 from
> gcc/testsuite/gcc.target/i386/sse4_1-phminposuw.c, with
> a few adjustments:
>
> - Adjust the dejagnu directives for powerpc platform.
> - Make the data not be monotonically increasing,
>   such that some of the returned values are not
>   always the first value (index 0).
> - Create a list of input data testing various scenarios
>   including more than one minimum value and different
>   orders and indicies of the minimum value.

Typo: indices

> - Fix a masking issue where the index was being truncated
>   to 2 bits instead of 3 bits, which wasn't found because
>   all of the returned indicies were 0 with the original

and here

>   generated data.
> - Support big-endian.

Thank you for attention to detail. :)


2021-06-08  Paul A. Clarke  

gcc/testsuite/ChangeLog:
 * gcc.target/powerpc/sse4_1-phminposuw.c: Copy from
 gcc/testsuite/gcc.target/i386, make more robust.
---
  .../gcc.target/powerpc/sse4_1-phminposuw.c| 68 +++
  1 file changed, 68 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-phminposuw.c

diff --git a/gcc/testsuite/gcc.target/powerpc/sse4_1-phminposuw.c 
b/gcc/testsuite/gcc.target/powerpc/sse4_1-phminposuw.c
new file mode 100644
index ..3bb5a2dfe4f5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse4_1-phminposuw.c
@@ -0,0 +1,68 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#define NO_WARN_X86_INTRINSICS 1
+#ifndef CHECK_H
+#define CHECK_H "sse4_1-check.h"
+#endif
+
+#ifndef TEST
+#define TEST sse4_1_test
+#endif
+
+#include CHECK_H
+
+#include 
+
+#define DIM(a) (sizeof (a) / sizeof ((a)[0]))
+
+static void
+TEST (void)
+{
+  union
+{
+  __m128i x;
+  unsigned short s[8];
+} src[] =
+{
+  { .s = { 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x 
} },
+  { .s = { 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x 
} },
+  { .s = { 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x 
} },
+  { .s = { 0x0001, 0x0002, 0x0003, 0x0004, 0x0005, 0x0006, 0x0007, 0x0008 
} },
+  { .s = { 0x0008, 0x0007, 0x0006, 0x0005, 0x0004, 0x0003, 0x0002, 0x0001 
} },
+  { .s = { 0xfff4, 0xfff3, 0xfff2, 0xfff1, 0xfff3, 0xfff1, 0xfff2, 0xfff3 
} }
+};
+  unsigned short minVal[DIM (src)];
+  int minInd[DIM (src)];
+  unsigned short minValScalar, minIndScalar;
+  int i, j;
+  union
+{
+  int i;


No need to change, but overloading i in another scope is not the 
greatest style.  I assume this came with the original test.



+  unsigned short s[2];
+} res;
+
+  for (i = 0; i < DIM (src); i++)
+{
+  res.i = _mm_cvtsi128_si32 (_mm_minpos_epu16 (src[i].x));
+  minVal[i] = res.s[0];
+  minInd[i] = res.s[1] & 0b111;
+}
+
+  for (i = 0; i < DIM (src); i++)
+{
+  minValScalar = src[i].s[0];
+  minIndScalar = 0;
+
+  for (j = 1; j < 8; j++)
+   if (minValScalar > src[i].s[j])
+ {
+   minValScalar = src[i].s[j];
+   minIndScalar = j;
+ }
+
+  if (minValScalar != minVal[i] && minIndScalar != minInd[i])
+   abort ();
+}
+}


LGTM with spelling addressed.  I can't approve, but recommend approval 
with those changes.
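
As an aside on the masking fix described in the commit message, a small
sketch of why the index needs three bits.  The helper names are
hypothetical, and the sketch works on the packed 32-bit result (minimum in
bits [15:0], index in bits [18:16]) rather than on the union the test
uses.  With eight lanes the index ranges over 0..7, so a 2-bit mask
silently turns 7 into 3, and an all-zero index hides the bug entirely.

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only.  */

/* Minimum value: low 16 bits of the packed result.  */
uint16_t
minpos_value (uint32_t packed)
{
  return (uint16_t) (packed & 0xffff);
}

/* Index of the minimum, extracted with a caller-supplied mask so the
   2-bit and 3-bit variants can be compared.  */
unsigned
minpos_index (uint32_t packed, unsigned mask)
{
  return (packed >> 16) & mask;
}
```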


Thanks,
Bill



Re: [PATCH v2 1/2] rs6000: Add support for _mm_minpos_epu16

2021-07-09 Thread Bill Schmidt via Gcc-patches

Hi Paul,

On 6/8/21 2:11 PM, Paul A. Clarke via Gcc-patches wrote:

> Add a naive implementation of the subject x86 intrinsic to
> ease porting.

"subject" won't be part of eventual commit, so please specify in commit
blurb.


2021-06-08  Paul A. Clarke  

gcc/ChangeLog:
 * config/rs6000/smmintrin.h (_mm_minpos_epu16): New.
---
  gcc/config/rs6000/smmintrin.h | 25 +
  1 file changed, 25 insertions(+)

diff --git a/gcc/config/rs6000/smmintrin.h b/gcc/config/rs6000/smmintrin.h
index bdf6eb365d88..b7de38763f2b 100644
--- a/gcc/config/rs6000/smmintrin.h
+++ b/gcc/config/rs6000/smmintrin.h
@@ -116,4 +116,29 @@ _mm_blendv_epi8 (__m128i __A, __m128i __B, __m128i __mask)
return (__m128i) vec_sel ((__v16qu) __A, (__v16qu) __B, __lmask);
  }

+/* Return horizontal packed word minimum and its index in bits [15:0]
+   and bits [18:16] respectively.  */
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
Line too long, please break up.  (I realize this happens throughout this 
file already, but...)

+_mm_minpos_epu16 (__m128i __A)
+{
+  union __u
+{
+  __m128i __m;
+  __v8hu __uh;
+};
+  union __u __u = { .__m = __A }, __r = { .__m = {0} };
+  unsigned short __ridx = 0;
+  unsigned short __rmin = __u.__uh[__ridx];
+  for (unsigned long __i = __ridx + 1; __i < 8; __i++)

"__ridx + 1" can just be "1"

+{
+  if (__u.__uh[__i] < __rmin)
+{
+  __rmin = __u.__uh[__i];
+  __ridx = __i;
+}

Preceding four lines need tabs, not spaces.

+}
+  __r.__uh[0] = __rmin;
+  __r.__uh[1] = __ridx;
+  return __r.__m;
+}
  #endif


Otherwise LGTM.  I can't approve, but recommend approval with those 
things fixed.
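
For reference, the semantics the patch implements can be written as a
portable scalar sketch.  This is not the smmintrin.h code; the function
name and the packed return value are assumptions made for illustration.
It returns the minimum of eight unsigned 16-bit lanes in bits [15:0] and
the index of its first occurrence in bits [18:16].

```c
#include <stdint.h>

/* Hypothetical scalar reference, for illustration only.  Ties resolve
   to the lowest index because the comparison is strict.  */
uint32_t
minpos_u16 (const uint16_t v[8])
{
  unsigned idx = 0;
  uint16_t min = v[0];

  for (unsigned i = 1; i < 8; i++)
    if (v[i] < min)
      {
	min = v[i];
	idx = i;
      }

  return (uint32_t) min | ((uint32_t) idx << 16);
}
```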


Thanks,
Bill



Re: rs6000: Generate an lxvp instead of two adjacent lxv instructions

2021-07-09 Thread Segher Boessenkool
On Thu, Jul 08, 2021 at 08:26:45PM -0500, Peter Bergner wrote:
> On 7/8/21 6:28 PM, Segher Boessenkool wrote:
> >>  int index = WORDS_BIG_ENDIAN ? i : nvecs - 1 - i;
> >> -rtx dst_i = gen_rtx_REG (reg_mode, reg + index);
> >> -emit_insn (gen_rtx_SET (dst_i, XVECEXP (src, 0, i)));
> >> +int index_next = WORDS_BIG_ENDIAN ? index + 1 : index - 1;
> > 
> > What does index_next mean?  The machine instructions do the same thing
> > in any endianness.
> 
> Yeah, I'm bad at coming up with names! :-)   So "index" is the index
> into XVECEXP (src, 0, ...) which is the operand that is to be assigned
> to regno.  "index_next" is the index into XVECEXP (src, 0, ...) which is
> the operand to be assigned to regno + 1 (ie, the next register of the
> even/odd register pair).  Whether the "next index" is index+1 or index-1
> is dependent on LE versus BE.

I would just call it "index1", or even "j" and "k" instead of "index"
and "index_next" :-)  "next" can put people on the wrong track (it did
me :-) )

> >> +/* If we are loading an even VSX register and our memory location
> >> +   is adjacent to the next register's memory location (if any),
> >> +   then we can load them both with one LXVP instruction.  */
> >> +if ((regno & 1) == 0
> >> +&& VSX_REGNO_P (regno)
> >> +&& MEM_P (XVECEXP (src, 0, index))
> >> +&& MEM_P (XVECEXP (src, 0, index_next)))
> >> +  {
> >> +rtx base = WORDS_BIG_ENDIAN ? XVECEXP (src, 0, index)
> >> +: XVECEXP (src, 0, index_next);
> >> +rtx next = WORDS_BIG_ENDIAN ? XVECEXP (src, 0, index_next)
> >> +: XVECEXP (src, 0, index);
> > 
> > Please get rid of index_next, if you still have to do different code for
> > LE here -- it doesn't make the code any clearer (in fact I cannot follow
> > it at all anymore :-( )
> 
> We do need different code for LE versus BE.  So you want something like
> 
>   if (WORDS_BIG_ENDIAN) {...} else {...}
> 
> ...instead?  I can try that to see if the code is easier to read.

Yes exactly.  It will more directly say what it does, and there is no
"index_next" abstraction the reader has to absorb first.
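
A rough sketch of the explicit if/else shape, in plain C rather than RTL.
The struct and function names are hypothetical; the arithmetic is taken
from the quoted hunk (index is i on BE and nvecs - 1 - i on LE, and the
operand for regno + 1 is index + 1 on BE and index - 1 on LE).

```c
#include <stdbool.h>

/* Hypothetical illustration, not rs6000 code.  FIRST is the operand
   index assigned to regno, SECOND the one assigned to regno + 1.  */
struct operand_pair
{
  int first;
  int second;
};

struct operand_pair
pair_indices (int i, int nvecs, bool words_big_endian)
{
  struct operand_pair p;

  if (words_big_endian)
    {
      /* BE: operands are walked in order.  */
      p.first = i;
      p.second = i + 1;
    }
  else
    {
      /* LE: operands are walked from the end, so the partner of
	 nvecs - 1 - i is the one just before it.  */
      p.first = nvecs - 1 - i;
      p.second = nvecs - 2 - i;
    }

  return p;
}
```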

> > So this converts pairs of lxv to an lxvp in only a very limited case,
> > right?  Can we instead do it more generically?  And what about stxvp?
> 
> Doing it more generically is my next TODO and that will cover both
> lxvp and stxvp.

Ah cool :-)

> My thought was to write a simple pass run at about
> the same time as our swap optimization pass to look for adjacent
> lxv's and stxv's and convert them into lxvp and stxvp.

So, very early, as soon as DF is set up.  Makes sense.

> However, that
> won't catch the above case, since the assemble/build pattern is not
> split until very late, so we still want the above change.

You probably should also have a peephole (whether you do it like here or
not :-) )

> Also, given the new pass will be more complicated than the above code,
> it will be a GCC 12 only change.

/nod

> Let me make the changes you want and I'll repost with what I come up with.

Thanks!  And thanks for the explanation.


Segher


[PATCH] Change the type of memory classification functions to bool

2021-07-09 Thread Uros Bizjak via Gcc-patches
2021-07-09  Uroš Bizjak  

gcc/
* recog.c (memory_address_addr_space_p): Change the type to bool.
Return true/false instead of 1/0.
(offsettable_memref_p): Ditto.
(offsettable_nonstrict_memref_p): Ditto.
(offsettable_address_addr_space_p): Ditto.
Change the type of addressp indirect function to bool.
* recog.h (memory_address_addr_space_p): Change the type to bool.
(strict_memory_address_addr_space_p): Ditto.
(offsettable_memref_p): Ditto.
(offsettable_nonstrict_memref_p): Ditto.
(offsettable_address_addr_space_p): Ditto.
* reload.c (maybe_memory_address_addr_space_p): Ditto.
(strict_memory_address_addr_space_p): Change the type to bool.
Return true/false instead of 1/0.
(maybe_memory_address_addr_space_p): Change the type to bool.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

OK for master?
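
The shape of the change, including the addressp indirect function, can be
illustrated with a toy sketch outside of GCC.  All names and the
"validity" rules below are invented for illustration; only the pattern
(bool predicates selected through a bool-returning function pointer)
matches the patch.

```c
#include <stdbool.h>

/* Toy stand-ins, for illustration only; not recog.c code.  */
typedef bool (*address_pred) (long addr);

/* "Strict" variant: pretend only 4-byte aligned, nonnegative addresses
   are valid.  */
bool
toy_strict_address_p (long addr)
{
  return addr >= 0 && addr % 4 == 0;
}

/* "Non-strict" variant: any nonnegative address is accepted.  */
bool
toy_address_p (long addr)
{
  return addr >= 0;
}

/* Check the last byte of a SIZE-byte object at ADDR with whichever
   predicate STRICTP selects.  */
bool
toy_offsettable_p (bool strictp, long addr, long size)
{
  address_pred addressp = strictp ? toy_strict_address_p : toy_address_p;
  return addressp (addr + size - 1);
}
```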

Uros.
diff --git a/gcc/recog.c b/gcc/recog.c
index 2114df8c0d1..5a42c45361d 100644
--- a/gcc/recog.c
+++ b/gcc/recog.c
@@ -1776,20 +1776,20 @@ pop_operand (rtx op, machine_mode mode)
   return XEXP (op, 0) == stack_pointer_rtx;
 }
 
-/* Return 1 if ADDR is a valid memory address
+/* Return true if ADDR is a valid memory address
for mode MODE in address space AS.  */
 
-int
+bool
 memory_address_addr_space_p (machine_mode mode ATTRIBUTE_UNUSED,
 rtx addr, addr_space_t as)
 {
 #ifdef GO_IF_LEGITIMATE_ADDRESS
   gcc_assert (ADDR_SPACE_GENERIC_P (as));
   GO_IF_LEGITIMATE_ADDRESS (mode, addr, win);
-  return 0;
+  return false;
 
  win:
-  return 1;
+  return true;
 #else
   return targetm.addr_space.legitimate_address_p (mode, addr, 0, as);
 #endif
@@ -2361,18 +2361,16 @@ find_constant_term_loc (rtx *p)
   return 0;
 }
 
-/* Return 1 if OP is a memory reference
-   whose address contains no side effects
-   and remains valid after the addition
-   of a positive integer less than the
-   size of the object being referenced.
+/* Return true if OP is a memory reference whose address contains
+   no side effects and remains valid after the addition of a positive
+   integer less than the size of the object being referenced.
 
We assume that the original address is valid and do not check it.
 
This uses strict_memory_address_p as a subroutine, so
don't use it before reload.  */
 
-int
+bool
 offsettable_memref_p (rtx op)
 {
   return ((MEM_P (op))
@@ -2383,7 +2381,7 @@ offsettable_memref_p (rtx op)
 /* Similar, but don't require a strictly valid mem ref:
consider pseudo-regs valid as index or base regs.  */
 
-int
+bool
 offsettable_nonstrict_memref_p (rtx op)
 {
   return ((MEM_P (op))
@@ -2391,7 +2389,7 @@ offsettable_nonstrict_memref_p (rtx op)
   MEM_ADDR_SPACE (op)));
 }
 
-/* Return 1 if Y is a memory address which contains no side effects
+/* Return true if Y is a memory address which contains no side effects
and would remain valid for address space AS after the addition of
a positive integer less than the size of that mode.
 
@@ -2401,7 +2399,7 @@ offsettable_nonstrict_memref_p (rtx op)
If STRICTP is nonzero, we require a strictly valid address,
for the sake of use in reload.c.  */
 
-int
+bool
 offsettable_address_addr_space_p (int strictp, machine_mode mode, rtx y,
  addr_space_t as)
 {
@@ -2409,19 +2407,19 @@ offsettable_address_addr_space_p (int strictp, 
machine_mode mode, rtx y,
   rtx z;
   rtx y1 = y;
   rtx *y2;
-  int (*addressp) (machine_mode, rtx, addr_space_t) =
+  bool (*addressp) (machine_mode, rtx, addr_space_t) =
 (strictp ? strict_memory_address_addr_space_p
 : memory_address_addr_space_p);
   poly_int64 mode_sz = GET_MODE_SIZE (mode);
 
   if (CONSTANT_ADDRESS_P (y))
-return 1;
+return true;
 
   /* Adjusting an offsettable address involves changing to a narrower mode.
  Make sure that's OK.  */
 
   if (mode_dependent_address_p (y, as))
-return 0;
+return false;
 
   machine_mode address_mode = GET_MODE (y);
   if (address_mode == VOIDmode)
@@ -2442,7 +2440,7 @@ offsettable_address_addr_space_p (int strictp, 
machine_mode mode, rtx y,
 
   if ((ycode == PLUS) && (y2 = find_constant_term_loc ()))
 {
-  int good;
+  bool good;
 
   y1 = *y2;
   *y2 = plus_constant (address_mode, *y2, mode_sz - 1);
@@ -2456,7 +2454,7 @@ offsettable_address_addr_space_p (int strictp, 
machine_mode mode, rtx y,
 }
 
   if (GET_RTX_CLASS (ycode) == RTX_AUTOINC)
-return 0;
+return false;
 
   /* The offset added here is chosen as the maximum offset that
  any instruction could need to add when operating on something
@@ -2486,7 +2484,7 @@ offsettable_address_addr_space_p (int strictp, 
machine_mode mode, rtx y,
   return (*addressp) (QImode, z, as);
 }
 
-/* Return 1 if ADDR is an address-expression whose effect depends
+/* Return true if ADDR is an address-expression whose effect depends
on the mode of the memory reference it is 

[PATCH v3 2/2] Add TARGET_ASM_EMIT_GNU_PROPERTY_NOTE

2021-07-09 Thread H.J. Lu via Gcc-patches
Generate the marker for -fno-direct-extern-access to indicate that the
object file uses GOT to access all external symbols.  Access to protected
symbols in the resulting shared library is treated as local, which requires
canonical function pointers and cannot be used with copy relocation.

GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS support has been added to
GNU binutils 2.38.  But the -z indirect-extern-access linker option is
only available for Linux/x86.  However, the --max-cache-size=SIZE linker
option was also added within a day.  --max-cache-size=SIZE is used to
check for GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS support.

This marker can be used in the following ways:

1. Linker can decide the best way to resolve a relocation against a
protected symbol before seeing all relocations against the symbol.
2. Dynamic linker can decide if it is an error to have a copy relocation
in executable against the protected symbol in a shared library by checking
if the shared library is built with -fno-direct-extern-access.
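
As a reading aid, the note that emit_gnu_property writes as assembly (see
the function removed from config/i386/gnu-property.c in the diff) can be
sketched as a byte layout in C.  The function name, the buffer interface
and the host-byte-order encoding are assumptions for illustration; the
field order, the note type 5 and the 2/3 alignment exponent follow the
assembly in the patch.

```c
#include <stdint.h>
#include <string.h>

/* Round N up to a multiple of ALIGN (a power of two).  */
static size_t
align_up (size_t n, size_t align)
{
  return (n + align - 1) & ~(align - 1);
}

/* Hypothetical sketch: write a single-property .note.gnu.property
   entry into BUF (at least 32 bytes, host byte order) and return its
   size.  P2ALIGN is 2 for 32-bit pointers and 3 for 64-bit ones.  */
size_t
build_gnu_property_note (uint8_t *buf, uint32_t pr_type, uint32_t pr_data,
			 int p2align)
{
  const size_t align = (size_t) 1 << p2align;
  const char name[4] = "GNU";		/* Vendor name with its NUL.  */
  const uint32_t namesz = sizeof name;
  const uint32_t datasz = sizeof pr_data;
  const uint32_t note_type = 5;		/* NT_GNU_PROPERTY_TYPE_0.  */

  /* Header is three 32-bit words, then the name.  */
  const size_t name_end = 12 + namesz;
  const size_t desc_start = align_up (name_end, align);
  /* Descriptor: pr_type, pr_datasz, data, then padding.  */
  const size_t desc_end = align_up (desc_start + 8 + datasz, align);
  const uint32_t descsz = (uint32_t) (desc_end - name_end);

  memset (buf, 0, desc_end);
  memcpy (buf + 0, &namesz, 4);
  memcpy (buf + 4, &descsz, 4);
  memcpy (buf + 8, &note_type, 4);
  memcpy (buf + 12, name, namesz);
  memcpy (buf + desc_start, &pr_type, 4);
  memcpy (buf + desc_start + 4, &datasz, 4);
  memcpy (buf + desc_start + 8, &pr_data, 4);
  return desc_end;
}
```

With a 64-bit alignment the entry is 32 bytes with a 16-byte descriptor;
with a 32-bit alignment it is 28 bytes with a 12-byte descriptor.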

* configure.ac (HAVE_LD_INDIRECT_EXTERN_ACCESS_SUPPORT): New.
Define to 1 if linker supports
GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS.
* output.h (emit_gnu_property): New.
(emit_gnu_property_note): Likewise.
* target.def (emit_gnu_property_note): Add a targetm.asm_out hook.
* toplev.c (compile_file): Call emit_gnu_property_note before
file_end.
* varasm.c (emit_gnu_property): New.
(emit_gnu_property_note): Likewise.
* config.in: Regenerated.
* configure: Likewise.
* doc/tm.texi: Likewise.
* config/i386/gnu-property.c (emit_gnu_property): Removed.
(TARGET_ASM_EMIT_GNU_PROPERTY_NOTE): New.
* doc/tm.texi.in: Add TARGET_ASM_EMIT_GNU_PROPERTY_NOTE.
---
 gcc/config.in  |  7 +
 gcc/config/i386/gnu-property.c | 31 --
 gcc/config/i386/i386.c |  2 ++
 gcc/configure  | 26 +++
 gcc/configure.ac   | 22 
 gcc/doc/tm.texi|  5 
 gcc/doc/tm.texi.in |  2 ++
 gcc/output.h   |  2 ++
 gcc/target.def |  8 ++
 gcc/toplev.c   |  3 +++
 gcc/varasm.c   | 47 ++
 11 files changed, 124 insertions(+), 31 deletions(-)

diff --git a/gcc/config.in b/gcc/config.in
index 2abac530c64..2c94a046de7 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1652,6 +1652,13 @@
 #endif
 
 
+/* Define to 1 if your linker supports
+   GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_LD_INDIRECT_EXTERN_ACCESS_SUPPORT
+#endif
+
+
 /* Define if your PowerPC64 linker supports a large TOC. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_LD_LARGE_TOC
diff --git a/gcc/config/i386/gnu-property.c b/gcc/config/i386/gnu-property.c
index 4ba04403002..9fe8d00132e 100644
--- a/gcc/config/i386/gnu-property.c
+++ b/gcc/config/i386/gnu-property.c
@@ -24,37 +24,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "output.h"
 #include "linux-common.h"
 
-static void
-emit_gnu_property (unsigned int type, unsigned int data)
-{
-  int p2align = ptr_mode == SImode ? 2 : 3;
-
-  switch_to_section (get_section (".note.gnu.property",
- SECTION_NOTYPE, NULL));
-
-  ASM_OUTPUT_ALIGN (asm_out_file, p2align);
-  /* name length.  */
-  fprintf (asm_out_file, ASM_LONG "1f - 0f\n");
-  /* data length.  */
-  fprintf (asm_out_file, ASM_LONG "4f - 1f\n");
-  /* note type: NT_GNU_PROPERTY_TYPE_0.  */
-  fprintf (asm_out_file, ASM_LONG "5\n");
-  fprintf (asm_out_file, "0:\n");
-  /* vendor name: "GNU".  */
-  fprintf (asm_out_file, STRING_ASM_OP "\"GNU\"\n");
-  fprintf (asm_out_file, "1:\n");
-  ASM_OUTPUT_ALIGN (asm_out_file, p2align);
-  /* pr_type.  */
-  fprintf (asm_out_file, ASM_LONG "0x%x\n", type);
-  /* pr_datasz.  */
-  fprintf (asm_out_file, ASM_LONG "3f - 2f\n");
-  fprintf (asm_out_file, "2:\n");
-  fprintf (asm_out_file, ASM_LONG "0x%x\n", data);
-  fprintf (asm_out_file, "3:\n");
-  ASM_OUTPUT_ALIGN (asm_out_file, p2align);
-  fprintf (asm_out_file, "4:\n");
-}
-
 void
 file_end_indicate_exec_stack_and_gnu_property (void)
 {
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7dee311051d..bd91c7cc7f8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -24091,6 +24091,8 @@ ix86_run_selftests (void)
 #if !TARGET_MACHO && !TARGET_DLLIMPORT_DECL_ATTRIBUTES
 # undef TARGET_ASM_RELOC_RW_MASK
 # define TARGET_ASM_RELOC_RW_MASK ix86_reloc_rw_mask
+# undef TARGET_ASM_EMIT_GNU_PROPERTY_NOTE
+# define TARGET_ASM_EMIT_GNU_PROPERTY_NOTE emit_gnu_property_note
 #endif
 
 static bool ix86_libc_has_fast_function (int fcode ATTRIBUTE_UNUSED)
diff --git a/gcc/configure b/gcc/configure
index a15f8b47202..597b32b2959 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -32333,6 +32333,32 @@ fi
 { $as_echo 

[PATCH v3 0/2] Implement indirect external access

2021-07-09 Thread H.J. Lu via Gcc-patches
Changes in the v3 patch.

1. GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS support has been added to
GNU binutils 2.38.  But the -z indirect-extern-access linker option is
only available for Linux/x86.  However, the --max-cache-size=SIZE linker
option was also added within a day.  --max-cache-size=SIZE is used to
check for GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS support.

Changes in the v2 patch.

1. Rename the option to -fdirect-extern-access.

---
On systems with copy relocation:
* A copy in executable is created for the definition in a shared library
at run-time by ld.so.
* The copy is referenced by executable and shared libraries.
* Executable can access the copy directly.

Issues are:
* Overhead of a copy, time and space, may be visible at run-time.
* Read-only data in the shared library becomes read-write copy in
executable at run-time.
* Local access to data with the STV_PROTECTED visibility in the shared
library must use GOT.

On systems without function descriptor, function pointers vary depending
on where and how the functions are defined.
* If the function is defined in executable, it can be the address of
function body.
* If the function, including the function with STV_PROTECTED visibility,
is defined in the shared library, it can be the address of the PLT entry
in executable or shared library.

Issues are:
* The address of function body may not be used as its function pointer.
* ld.so needs to search loaded shared libraries for the function pointer
of the function with STV_PROTECTED visibility.

Here is a proposal to remove copy relocation and use canonical function
pointer:

1. Accesses, including in PIE and non-PIE, to undefined symbols must
use GOT.
  a. Linker may optimize out GOT access if the data is defined in PIE or
  non-PIE.
2. Read-only data in the shared library remain read-only at run-time
3. Address of global data with the STV_PROTECTED visibility in the shared
library is the address of data body.
  a. Can use IP-relative access.
  b. May need GOT without IP-relative access.
4. For systems without function descriptor,
  a. All global function pointers of undefined functions in PIE and
  non-PIE must use GOT.  Linker may optimize out GOT access if the
  function is defined in PIE or non-PIE.
  b. Function pointer of functions with the STV_PROTECTED visibility in
  executable and shared library is the address of function body.
   i. Can use IP-relative access.
   ii. May need GOT without IP-relative access.
   iii. Branches to undefined functions may use PLT.
5. Single global definition marker:

Add GNU_PROPERTY_1_NEEDED:

#define GNU_PROPERTY_1_NEEDED GNU_PROPERTY_UINT32_OR_LO

to indicate the needed properties by the object file.

Add GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS:

#define GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS (1U << 0)

to indicate that the object file requires canonical function pointers and
cannot be used with copy relocation.  This bit should be cleared in
executable when there are non-GOT or non-PLT relocations in relocatable
input files without this bit set.

  a. Protected symbol access within the shared library can be treated as
  local.
  b. Copy relocation should be disallowed at link-time and run-time.
  c. GOT function pointer reference is required at link-time and run-time.

The indirect external access marker can be used in the following ways:

1. Linker can decide the best way to resolve a relocation against a
protected symbol before seeing all relocations against the symbol.
2. Dynamic linker can decide if it is an error to have a copy relocation
in executable against the protected symbol in a shared library by checking
if the shared library is built with -fno-direct-extern-access.
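
Item 2 could look roughly like the following check.  This is a sketch
only: the function name and its inputs are hypothetical and it is not
ld.so code; the bit definition is the one proposed above.

```c
#include <stdbool.h>
#include <stdint.h>

#define GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS (1U << 0)

/* Hypothetical dynamic-linker check: NEEDED_BITS is the
   GNU_PROPERTY_1_NEEDED value of the shared library that defines the
   symbol.  A copy relocation against a protected symbol is an error
   when that library carries the indirect external access marker.  */
bool
copy_relocation_is_error (uint32_t needed_bits, bool symbol_is_protected)
{
  return symbol_is_protected
	 && (needed_bits
	     & GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS) != 0;
}
```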

Add a compiler option, -fdirect-extern-access. -fdirect-extern-access is
the default.  With -fno-direct-extern-access:

1. Always use GOT to access undefined symbols, including in PIE and
non-PIE.  This is safe to do and does not break the ABI.
2. In executable and shared library, for symbols with the STV_PROTECTED
visibility:
  a. The address of data symbol is the address of data body.
  b. For systems without function descriptor, the function pointer is
  the address of function body.
These break the ABI and resulting shared libraries may not be compatible
with executables which are not compiled with -fno-direct-extern-access.
3. Generate an indirect external access marker in relocatable objects if
supported by linker.
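For reference, the kind of symbol that item 2 is about looks like this at
the source level.  It is a minimal sketch with made-up names
(shared_counter and the two accessors are hypothetical); it shows only the
declaration form and says nothing about which relocations a given compiler
version emits.

```c
/* Hypothetical library-side definition with STV_PROTECTED visibility.  */
__attribute__ ((visibility ("protected"))) int shared_counter = 42;

/* Inside the defining module, the address of a protected data symbol is
   meant to be the address of the data body itself.  */
int *
shared_counter_address (void)
{
  return &shared_counter;
}

/* A plain read of the same symbol from within the defining module.  */
int
read_shared_counter (void)
{
  return shared_counter;
}
```

An executable that reaches shared_counter through a copy relocation would
see a different address, which is the incompatibility described above.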

H.J. Lu (2):
  Add -f[no-]direct-extern-access
  Add TARGET_ASM_EMIT_GNU_PROPERTY_NOTE

 gcc/common.opt                 |  4 ++
 gcc/config.in                  |  7 +++
 gcc/config/i386/gnu-property.c | 31 -
 gcc/config/i386/i386-protos.h  |  2 +-
 gcc/config/i386/i386.c         | 52 --
 gcc/configure                  | 26 +++
 gcc/configure.ac               | 22 ++
 gcc/doc/invoke.texi            | 13

[PATCH v3 1/2] Add -f[no-]direct-extern-access

2021-07-09 Thread H.J. Lu via Gcc-patches
-fdirect-extern-access is the default.  With -fno-direct-extern-access:

1. Always use GOT to access undefined data and function symbols,
   including in PIE and non-PIE.  This avoids copy relocations
   in executables and is compatible with existing executables and
   shared libraries.
2. In executable and shared library, bind symbols with the STV_PROTECTED
   visibility locally:
   a. The address of data symbol is the address of data body.
   b. For systems without function descriptor, the function pointer is
      the address of function body.
   c. The resulting shared libraries may not be compatible with
      executables which have copy relocations on protected symbols or
      use executable PLT entries as function addresses for protected
      functions in shared libraries.
3. Update asm_preferred_eh_data_format to select PC relative EH encoding
format with -fno-direct-extern-access to avoid copy relocation.
4. Add ix86_reloc_rw_mask for TARGET_ASM_RELOC_RW_MASK to avoid copy
relocation with -fno-direct-extern-access.

gcc/

PR target/35513
PR target/100593
* common.opt: Add -fdirect-extern-access.
* config/i386/i386-protos.h (ix86_force_load_from_GOT_p): Add a
bool argument.
* config/i386/i386.c (ix86_force_load_from_GOT_p): Add a bool
argument to indicate call operand.  Force non-call load
from GOT for -fno-direct-extern-access.
(legitimate_pic_address_disp_p): Avoid copy relocation in PIE
for -fno-direct-extern-access.
(ix86_print_operand): Pass true to ix86_force_load_from_GOT_p
for call operand.
(asm_preferred_eh_data_format): Use PC-relative format for
-fno-direct-extern-access to avoid copy relocation.  Check
ptr_mode instead of TARGET_64BIT when selecting DW_EH_PE_sdata4.
(ix86_binds_local_p): Don't treat protected data as extern and
avoid copy relocation on common symbol with
-fno-direct-extern-access.
(ix86_reloc_rw_mask): New to avoid copy relocation for
-fno-direct-extern-access.
(TARGET_ASM_RELOC_RW_MASK): New.
* doc/invoke.texi: Document -f[no-]direct-extern-access.

gcc/testsuite/

PR target/35513
PR target/100593
* g++.dg/pr35513-1.C: New file.
* g++.dg/pr35513-2.C: Likewise.
* gcc.target/i386/pr35513-1.c: Likewise.
* gcc.target/i386/pr35513-2.c: Likewise.
* gcc.target/i386/pr35513-3.c: Likewise.
* gcc.target/i386/pr35513-4.c: Likewise.
* gcc.target/i386/pr35513-5.c: Likewise.
* gcc.target/i386/pr35513-6.c: Likewise.
* gcc.target/i386/pr35513-7.c: Likewise.
* gcc.target/i386/pr35513-8.c: Likewise.
---
 gcc/common.opt                            |  4 ++
 gcc/config/i386/i386-protos.h             |  2 +-
 gcc/config/i386/i386.c                    | 50 +++--
 gcc/doc/invoke.texi                       | 13 ++
 gcc/testsuite/g++.dg/pr35513-1.C          | 25 +++
 gcc/testsuite/g++.dg/pr35513-2.C          | 53 +++
 gcc/testsuite/gcc.target/i386/pr35513-1.c | 16 +++
 gcc/testsuite/gcc.target/i386/pr35513-2.c | 15 +++
 gcc/testsuite/gcc.target/i386/pr35513-3.c | 15 +++
 gcc/testsuite/gcc.target/i386/pr35513-4.c | 15 +++
 gcc/testsuite/gcc.target/i386/pr35513-5.c | 15 +++
 gcc/testsuite/gcc.target/i386/pr35513-6.c | 14 ++
 gcc/testsuite/gcc.target/i386/pr35513-7.c | 15 +++
 gcc/testsuite/gcc.target/i386/pr35513-8.c | 41 ++
 14 files changed, 278 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr35513-1.C
 create mode 100644 gcc/testsuite/g++.dg/pr35513-2.C
 create mode 100644 gcc/testsuite/gcc.target/i386/pr35513-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr35513-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr35513-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr35513-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr35513-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr35513-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr35513-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr35513-8.c

diff --git a/gcc/common.opt b/gcc/common.opt
index d9da1131eda..67ad811d54d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1432,6 +1432,10 @@ fdiagnostics-minimum-margin-width=
 Common Joined UInteger Var(diagnostics_minimum_margin_width) Init(6)
 Set minimum width of left margin of source code when showing source.
 
+fdirect-extern-access
+Common Var(flag_direct_extern_access) Init(1) Optimization
+Do not use GOT to access external symbols.
+
 fdisable-
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -fdisable-[tree|rtl|ipa]-=range1+range2  Disable an optimization pass.
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 51376fcc454..693cc3e5c78 100644
--- a/gcc/config/i386/i386-protos.h
+++ 

Re: [PATCH] c++: requires-expr with dependent extra args [PR101181]

2021-07-09 Thread Patrick Palka via Gcc-patches
On Thu, 8 Jul 2021, Jason Merrill wrote:

> On 7/8/21 11:28 AM, Patrick Palka wrote:
> > Here we're crashing ultimately because the mechanism for delaying
> > substitution into a requires-expression (or constexpr if) doesn't
> > expect to see dependent args.  But we end up capturing dependent
> > args here when substituting into the default template argument during
> > coerce_template_parms for the dependent specialization p.
> > 
> > This patch enables the commented out code in add_extra_args for
> > handling this situation.  It turns out we also need to make a copy of
> > the captured arguments so that coerce_template_parms doesn't later
> > add to the argument, which would form an unexpected cycle.  And we
> > need to make tsubst_template_args more forgiving about missing template
> > arguments, since the arguments we capture from coerce_template_parms are
> > incomplete.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk/11?
> > 
> > PR c++/101181
> > 
> > gcc/cp/ChangeLog:
> > 
> > * constraint.cc (tsubst_requires_expr): Pass complain/in_decl to
> > add_extra_args.
> > * cp-tree.h (add_extra_args): Add complain/in_decl parameters.
> > * pt.c (build_extra_args): Make a copy of args.
> > (add_extra_args): Add complain/in_decl parameters.  Handle the
> > case where the extra arguments are dependent.
> > (tsubst_pack_expansion): Pass complain/in_decl to
> > add_extra_args.
> > (tsubst_template_args): Handle missing template arguments.
> > (tsubst_expr) : Pass complain/in_decl to
> > add_extra_args.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp2a/concepts-requires26.C: New test.
> > * g++.dg/cpp2a/lambda-uneval16.C: New test.
> > ---
> >   gcc/cp/constraint.cc  |  3 +-
> >   gcc/cp/cp-tree.h  |  2 +-
> >   gcc/cp/pt.c   | 31 +--
> >   .../g++.dg/cpp2a/concepts-requires26.C| 18 +++
> >   gcc/testsuite/g++.dg/cpp2a/lambda-uneval16.C  | 22 +
> >   5 files changed, 58 insertions(+), 18 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-requires26.C
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-uneval16.C
> > 
> > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > index 99d3ccc6998..4ee5215df50 100644
> > --- a/gcc/cp/constraint.cc
> > +++ b/gcc/cp/constraint.cc
> > @@ -2266,7 +2266,8 @@ tsubst_requires_expr (tree t, tree args, sat_info
> > info)
> > /* A requires-expression is an unevaluated context.  */
> > cp_unevaluated u;
> >   -  args = add_extra_args (REQUIRES_EXPR_EXTRA_ARGS (t), args);
> > +  args = add_extra_args (REQUIRES_EXPR_EXTRA_ARGS (t), args,
> > +info.complain, info.in_decl);
> > if (processing_template_decl)
> >   {
> > /* We're partially instantiating a generic lambda.  Substituting
> > into
> > diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> > index 58da7460001..0a5f13489cc 100644
> > --- a/gcc/cp/cp-tree.h
> > +++ b/gcc/cp/cp-tree.h
> > @@ -7289,7 +7289,7 @@ extern void add_mergeable_specialization(bool
> > is_decl, bool is_alias,
> >  tree outer, unsigned);
> >   extern tree add_to_template_args  (tree, tree);
> >   extern tree add_outermost_template_args   (tree, tree);
> > -extern tree add_extra_args (tree, tree);
> > +extern tree add_extra_args (tree, tree, tsubst_flags_t,
> > tree);
> >   extern tree build_extra_args  (tree, tree,
> > tsubst_flags_t);
> > /* in rtti.c */
> > diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> > index 06116d16887..e4bdac087ad 100644
> > --- a/gcc/cp/pt.c
> > +++ b/gcc/cp/pt.c
> > @@ -12928,7 +12928,9 @@ extract_local_specs (tree pattern, tsubst_flags_t
> > complain)
> >   tree
> >   build_extra_args (tree pattern, tree args, tsubst_flags_t complain)
> >   {
> > -  tree extra = args;
> > +  /* Make a copy of the extra arguments so that they won't get changed
> > + from under us.  */
> > +  tree extra = copy_template_args (args);
> > if (local_specializations)
> >   if (tree locals = extract_local_specs (pattern, complain))
> > extra = tree_cons (NULL_TREE, extra, locals);
> > @@ -12939,7 +12941,7 @@ build_extra_args (tree pattern, tree args,
> > tsubst_flags_t complain)
> >  normal template args to ARGS.  */
> > tree
> > -add_extra_args (tree extra, tree args)
> > +add_extra_args (tree extra, tree args, tsubst_flags_t complain, tree
> > in_decl)
> >   {
> > if (extra && TREE_CODE (extra) == TREE_LIST)
> >   {
> > @@ -12959,20 +12961,14 @@ add_extra_args (tree extra, tree args)
> > gcc_assert (!TREE_PURPOSE (extra));
> > extra = TREE_VALUE (extra);
> >   }
> > -#if 1
> > -  /* I think we should always be able to substitute dependent args into the
> > - pattern.  

Re: disable -Warray-bounds in libgo (PR 101374)

2021-07-09 Thread Rainer Orth
Hi Martin,

>> Yesterday's enhancement to -Warray-bounds has exposed a couple of
>> issues in libgo where the code writes into an invalid constant
>> address that the warning is designed to flag.
>>
>> On the assumption that those invalid addresses are deliberate,
>> the attached patch suppresses these instances by using #pragma
>> GCC diagnostic but I don't think I'm supposed to commit it (at
>> least Git won't let me).  To avoid Go bootstrap failures please
>> either apply the patch or otherwise suppress the warning (e.g.,
>> by using a volatile pointer temporary).
>
> while this patch does fix the libgo bootstrap failure, Go is completely
> broken: almost 1000 go.test failures and all libgo tests FAIL as well.
> Seen on both i386-pc-solaris2.11 and sparc-sun-solaris2.11.

FWIW, I see exactly the same failures on x86_64-pc-linux-gnu, so nothing
Solaris-specific here.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: disable -Warray-bounds in libgo (PR 101374)

2021-07-09 Thread Maxim Kuvyrkov via Gcc-patches
> On 9 Jul 2021, at 09:16, Richard Biener via Gcc-patches 
>  wrote:
> 
> On Thu, Jul 8, 2021 at 8:02 PM Martin Sebor via Gcc-patches
>  wrote:
>> 
>> Hi Ian,
>> 
>> Yesterday's enhancement to -Warray-bounds has exposed a couple of
>> issues in libgo where the code writes into an invalid constant
>> address that the warning is designed to flag.
>> 
>> On the assumption that those invalid addresses are deliberate,
>> the attached patch suppresses these instances by using #pragma
>> GCC diagnostic but I don't think I'm supposed to commit it (at
>> least Git won't let me).  To avoid Go bootstrap failures please
>> either apply the patch or otherwise suppress the warning (e.g.,
>> by using a volatile pointer temporary).
> 
> Btw, I don't think we should diagnose things like
> 
>*(int*)0x21 = 0x21;
> 
> when somebody literally writes that he'll be just annoyed by diagnostics.

And we have an assortment of similar cases in 32-bit ARM kernel-page helpers.

At the moment building libatomic for arm-linux-gnueabihf fails with:
===
In function ‘select_test_and_set_8’,
    inlined from ‘select_test_and_set_8’ at /home/tcwg-buildslave/workspace/tcwg-dev-build/snapshots/gcc.git~master/libatomic/tas_n.c:115:1:
/home/tcwg-buildslave/workspace/tcwg-dev-build/snapshots/gcc.git~master/libatomic/config/linux/arm/host-config.h:42:34: error: array subscript 0 is outside array bounds of ‘unsigned int[0]’ [-Werror=array-bounds]
   42 | #define __kernel_helper_version (*(unsigned int *)0xffff0ffc)
  | ~^~~~
===

In libatomic/config/linux/arm/host-config.h we have:
===
/* Kernel helper for 32-bit compare-and-exchange.  */
typedef int (__kernel_cmpxchg_t) (UWORD oldval, UWORD newval, UWORD *ptr);
#define __kernel_cmpxchg (*(__kernel_cmpxchg_t *) 0xffff0fc0)

/* Kernel helper for 64-bit compare-and-exchange.  */
typedef int (__kernel_cmpxchg64_t) (const U_8 * oldval, const U_8 * newval,
				    U_8 *ptr);
#define __kernel_cmpxchg64 (*(__kernel_cmpxchg64_t *) 0xffff0f60)

/* Kernel helper for memory barrier.  */
typedef void (__kernel_dmb_t) (void);
#define __kernel_dmb (*(__kernel_dmb_t *) 0xffff0fa0)

/* Kernel helper page version number.  */
#define __kernel_helper_version (*(unsigned int *)0xffff0ffc)
===
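For what it's worth, the "volatile pointer temporary" workaround Martin
mentioned would look roughly like the sketch below.  The helper names are
made up for illustration and are not part of libatomic or libgo, and I
have not checked whether this form silences the warning with the current
trunk.

```c
#include <stdint.h>

/* Hypothetical helper: read a 32-bit value from a fixed address through a
   volatile pointer temporary.  The pointer object itself is volatile, so
   the compiler has to reload it and cannot treat the address as a known
   constant at the point of the access.  */
static unsigned int
read_u32_at (uintptr_t address)
{
  unsigned int *volatile p = (unsigned int *) address;
  return *p;
}

/* Hypothetical counterpart for a deliberate store to a fixed address.  */
static void
write_u32_at (uintptr_t address, unsigned int value)
{
  unsigned int *volatile p = (unsigned int *) address;
  *p = value;
}
```

The cost is one extra stack slot per access, which should not matter for a
version check or for a store that is only meant to crash.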



--
Maxim Kuvyrkov
https://www.linaro.org



Re: [PATCH 10/10] vect: Reuse reduction accumulators between loops

2021-07-09 Thread Richard Sandiford via Gcc-patches
Thanks for the review.

Richard Biener  writes:
>> @@ -588,6 +600,23 @@ public:
>>/* Unrolling factor  */
>>poly_uint64 vectorization_factor;
>>
>> +  /* If this loop is an epilogue loop whose main loop can be skipped,
>> + MAIN_LOOP_EDGE is the edge from the main loop to this loop's
>> + preheader.  SKIP_MAIN_LOOP_EDGE is then the edge that skips the
>> + main loop and goes straight to this loop's preheader.
>> +
>> + Both fields are null otherwise.  */
>> +  edge main_loop_edge;
>> +  edge skip_main_loop_edge;
>> +
>> +  /* If this loop is an epilogue loop that might be skipped after executing
>> + the main loop, this edge is the one that skips the epilogue.  */
>> +  edge skip_this_loop_edge;
>> +
>> +  /* After vectorization, maps live-out SSA names to information about
>> + the reductions that generated them.  */
>> +  hash_map reusable_accumulators;
>
> Is that the LC PHI node defs or the definition inside of the loop?
> If the latter we could attach the info directly to its stmt-info?

Ah, yeah, I should improve the comment there.  It's the vectoriser's
replacement for the original LC PHI node, i.e. the final scalar result
after the reduction has taken place.

>> @@ -1186,6 +1215,21 @@ public:
>>/* The vector type for performing the actual reduction.  */
>>tree reduc_vectype;
>>
>> +  /* If IS_REDUC_INFO is true and if the reduction is operating on N
>> + elements in parallel, this vector gives the initial values of these
>> + N elements.  */
>
> That's N scalar elements or N vector elements?  I suppose it's for
> SLP reductions (rather than SLP reduction chains) and never non-SLP
> reductions?

Yeah, poor wording again, sorry.  I meant something closer to:

  /* If IS_REDUC_INFO is true and if the vector code is performing
 N scalar reductions in parallel, this vector gives the initial
 scalar values of those N reductions.  */

>> +  vec reduc_initial_values;
>> +
>> +  /* If IS_REDUC_INFO is true and if the reduction is operating on N
>> + elements in parallel, this vector gives the scalar result of each
>> + reduction.  */
>> +  vec reduc_scalar_results;

Same change here.

>> […]
>> diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
>> index 2909e8a0fc3..b7b0523e3c8 100644
>> --- a/gcc/tree-vect-loop-manip.c
>> +++ b/gcc/tree-vect-loop-manip.c
>> @@ -2457,6 +2457,31 @@ vect_update_epilogue_niters (loop_vec_info 
>> epilogue_vinfo,
>>return vect_determine_partial_vectors_and_peeling (epilogue_vinfo, true);
>>  }
>>
>> +/* LOOP_VINFO is an epilogue loop and MAIN_LOOP_VALUE is available on exit
>> +   from the corresponding main loop.  Return a value that is available in
>> +   LOOP_VINFO's preheader, using SKIP_VALUE if the main loop is skipped.
>> +   Passing a null SKIP_VALUE is equivalent to passing zero.  */
>> +
>> +tree
>> +vect_get_main_loop_result (loop_vec_info loop_vinfo, tree main_loop_value,
>> +  tree skip_value)
>> +{
>> +  if (!loop_vinfo->main_loop_edge)
>> +return main_loop_value;
>> +
>> +  if (!skip_value)
>> +skip_value = build_zero_cst (TREE_TYPE (main_loop_value));
>
> shouldn't that be the initial value?

For the current use case, the above two conditions are never true.
I wrote it like this because I had a follow-on patch (which might
not go anywhere) that needed this function for 0-based IVs.

Maybe that's a bad risk/reward trade-off though.  Not having to pass
zero makes things only slightly simpler for the follow-on patch,
and I guess could be dangerous in other cases.

Perhaps in that case though I should change loop_vinfo->main_loop_edge
into a gcc_assert as well.
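To make the intent concrete, here is a scalar model of what the PHI built
by vect_get_main_loop_result selects.  It is only a sketch under my own
assumptions: VF and both function names are hypothetical, the "vector" is
an array of lanes, and the skip value is taken to be the reduction's
initial value rather than the zero default discussed above.

```c
#include <stdbool.h>
#include <stddef.h>

enum { VF = 4 };  /* Hypothetical vectorization factor.  */

/* Scalar stand-in for the PHI in the epilogue's preheader: one argument
   per incoming edge.  */
static int
main_loop_result (bool main_loop_ran, int main_loop_value, int skip_value)
{
  return main_loop_ran ? main_loop_value : skip_value;
}

/* Sum N elements of A starting from INIT.  The epilogue continues from
   the main loop's accumulator instead of starting a new reduction.  */
static int
sum_with_reused_accumulator (const int *a, size_t n, int init)
{
  size_t i = 0;
  bool main_loop_ran = n >= VF;
  int main_loop_value = 0;

  if (main_loop_ran)
    {
      /* Lane 0 carries the initial value, the other lanes start at 0.  */
      int lanes[VF] = { init };
      for (; i + VF <= n; i += VF)
	for (size_t lane = 0; lane < VF; lane++)
	  lanes[lane] += a[i + lane];

      /* Final scalar result of the main loop's reduction.  */
      for (size_t lane = 0; lane < VF; lane++)
	main_loop_value += lanes[lane];
    }

  /* Value available in the epilogue's preheader.  */
  int acc = main_loop_result (main_loop_ran, main_loop_value, init);
  for (; i < n; i++)
    acc += a[i];
  return acc;
}
```

In this model a zero skip value would only be right when INIT is zero,
which is the risk mentioned above.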

>> +  tree phi_result = make_ssa_name (TREE_TYPE (main_loop_value));
>> +  basic_block bb = loop_vinfo->main_loop_edge->dest;
>> +  gphi *new_phi = create_phi_node (phi_result, bb);
>> +  add_phi_arg (new_phi, main_loop_value, loop_vinfo->main_loop_edge,
>> +  UNKNOWN_LOCATION);
>> +  add_phi_arg (new_phi, skip_value,
>> +  loop_vinfo->skip_main_loop_edge, UNKNOWN_LOCATION);
>> +  return phi_result;
>> +}
>> +
>>  /* Function vect_do_peeling.
>>
>> Input:
>> […]
>> @@ -4823,6 +4842,100 @@ info_for_reduction (vec_info *vinfo, stmt_vec_info 
>> stmt_info)
>>return stmt_info;
>>  }
>>
>> +/* PHI is a reduction in LOOP_VINFO that we are going to vectorize using 
>> vector
>> +   type VECTYPE.  See if LOOP_VINFO is an epilogue loop whose main loop had 
>> a
>> +   matching reduction that we can build on.  Adjust REDUC_INFO and return 
>> true
>> +   if so, otherwise return false.  */
>> +
>> +static bool
>> +vect_find_reusable_accumulator (loop_vec_info loop_vinfo,
>> +   stmt_vec_info reduc_info)
>> +{
>> +  loop_vec_info main_loop_vinfo = LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo);
>> +  if (!main_loop_vinfo)
>> +return false;
>> +
>> +  if (STMT_VINFO_REDUC_TYPE (reduc_info) != TREE_CODE_REDUCTION)
>> +return false;
>> +
>> +  

[Ada] Fix style in expansion of attribute Put_Image

2021-07-09 Thread Pierre-Marie de Rodat
Style cleanup only.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_put_image.adb (Make_Put_Image_Name): Fix style.
(Image_Should_Call_Put_Image): Likewise.
(Build_Image_Call): Likewise.

diff --git a/gcc/ada/exp_put_image.adb b/gcc/ada/exp_put_image.adb
--- a/gcc/ada/exp_put_image.adb
+++ b/gcc/ada/exp_put_image.adb
@@ -1005,9 +1005,9 @@ package body Exp_Put_Image is
   return True;
end Enable_Put_Image;
 
-   -
+   -
-- Make_Put_Image_Name --
-   -
+   -
 
function Make_Put_Image_Name
  (Loc : Source_Ptr; Typ : Entity_Id) return Entity_Id
@@ -1028,6 +1028,10 @@ package body Exp_Put_Image is
   return Make_Defining_Identifier (Loc, Sname);
end Make_Put_Image_Name;
 
+   ---------------------------------
+   -- Image_Should_Call_Put_Image --
+   ---------------------------------
+
function Image_Should_Call_Put_Image (N : Node_Id) return Boolean is
begin
   if Ada_Version < Ada_2022 then
@@ -1049,6 +1053,10 @@ package body Exp_Put_Image is
   end;
end Image_Should_Call_Put_Image;
 
+   ----------------------
+   -- Build_Image_Call --
+   ----------------------
+
function Build_Image_Call (N : Node_Id) return Node_Id is
   --  For T'Image (X) Generate an Expression_With_Actions node:
   --




[Ada] par-ch6: do not mark subprogram as missing "is" if imported

2021-07-09 Thread Pierre-Marie de Rodat
Before this commit, the following piece of code:

procedure Main is
function F (X : access Integer) return Boolean with Import;
begin
   null;
end;

Resulted in the following error messages:

main.adb:2:59: error: ";" should be "is"
main.adb:5:01: error: "end F;" expected
main.adb:5:01: error: missing "begin" for procedure "Main" at line 1

The problem was that GNAT incorrectly thought `F` required a body, and
thus assumed that the `begin` keyword belonged to `F` rather than to
Main.

The solution is to teach GNAT to not treat imported subprograms as
requiring a body.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* par-ch6.adb (Contains_Import_Aspect): New function.
(P_Subprogram): Acknowledge `Import` aspects.

diff --git a/gcc/ada/par-ch6.adb b/gcc/ada/par-ch6.adb
--- a/gcc/ada/par-ch6.adb
+++ b/gcc/ada/par-ch6.adb
@@ -201,6 +201,28 @@ package body Ch6 is
--  Error recovery: cannot raise Error_Resync
 
function P_Subprogram (Pf_Flags : Pf_Rec) return Node_Id is
+
+  function Contains_Import_Aspect (Aspects : List_Id) return Boolean;
+  --  Return True if Aspects contains an Import aspect.
+
+  ----------------------------
+  -- Contains_Import_Aspect --
+  ----------------------------
+
+  function Contains_Import_Aspect (Aspects : List_Id) return Boolean is
+ Aspect : Node_Id := First (Aspects);
+  begin
+ while Present (Aspect) loop
+if Chars (Identifier (Aspect)) = Name_Import then
+   return True;
+end if;
+
+Next (Aspect);
+ end loop;
+
+ return False;
+  end Contains_Import_Aspect;
+
   Specification_Node : Node_Id;
   Name_Node  : Node_Id;
   Aspects: List_Id;
@@ -982,10 +1004,12 @@ package body Ch6 is
  if Pf_Flags.Pbod
 
--  Disconnect this processing if we have scanned a null procedure
-   --  because in this case the spec is complete anyway with no body.
+   --  or an Import aspect because in this case the spec is complete
+   --  anyway with no body.
 
and then (Nkind (Specification_Node) /= N_Procedure_Specification
   or else not Null_Present (Specification_Node))
+   and then not Contains_Import_Aspect (Aspects)
  then
 SIS_Labl := Scopes (Scope.Last).Labl;
 SIS_Sloc := Scopes (Scope.Last).Sloc;




[Ada] Fix crash on type extensions with discriminants

2021-07-09 Thread Pierre-Marie de Rodat
In Ada 2022 mode, the compiler crashes when generating the Put_Image
function for a tagged type if the parent subtype is constrained.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_put_image.adb (Make_Component_Attributes): Use
Implementation_Base_Type to get the parent type. Otherwise,
Parent_Type_Decl is actually an internally generated subtype
declaration, so we blow up on
Type_Definition (Parent_Type_Decl).

diff --git a/gcc/ada/exp_put_image.adb b/gcc/ada/exp_put_image.adb
--- a/gcc/ada/exp_put_image.adb
+++ b/gcc/ada/exp_put_image.adb
@@ -658,8 +658,8 @@ package body Exp_Put_Image is
   if Chars (Defining_Identifier (Item)) = Name_uParent then
  declare
 Parent_Type : constant Entity_Id :=
-  Underlying_Type (Base_Type (
-(Etype (Defining_Identifier (Item)))));
+  Implementation_Base_Type
+(Etype (Defining_Identifier (Item)));
 
 Parent_Aspect_Spec : constant Node_Id :=
   Find_Aspect (Parent_Type, Aspect_Put_Image);




[Ada] Add missed OS constant values

2021-07-09 Thread Pierre-Marie de Rodat
Add the IPV6_FLOWINFO and IF_NAMESIZE values to the generated package
System.OS_Constants.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* gsocket.h: Include net/if.h to get IF_NAMESIZE constant.
* s-oscons-tmplt.c: Define IPV6_FLOWINFO for Linux.

diff --git a/gcc/ada/gsocket.h b/gcc/ada/gsocket.h
--- a/gcc/ada/gsocket.h
+++ b/gcc/ada/gsocket.h
@@ -215,6 +215,7 @@
 #if !(defined (VMS) || defined (__MINGW32__))
 #include 
 #include 
+#include <net/if.h>
 #include 
 #include 
 #include 


diff --git a/gcc/ada/s-oscons-tmplt.c b/gcc/ada/s-oscons-tmplt.c
--- a/gcc/ada/s-oscons-tmplt.c
+++ b/gcc/ada/s-oscons-tmplt.c
@@ -1689,8 +1689,14 @@ CND(IPV6_DSTOPTS, "Set the destination options delivery")
 CND(IPV6_HOPOPTS, "Set the hop options delivery")
 
 #ifndef IPV6_FLOWINFO
+#ifdef __linux__
+/* The IPV6_FLOWINFO is defined in linux/in6.h, but we can't include it because
+ * of conflicts with other headers. */
+# define IPV6_FLOWINFO 11
+#else
 # define IPV6_FLOWINFO -1
 #endif
+#endif
 CND(IPV6_FLOWINFO, "Set the flow ID delivery")
 
 #ifndef IPV6_HOPLIMIT




[Ada] Improve performance of Ada.Containers.Doubly_Linked_Lists.Generic_Sorting.Sort

2021-07-09 Thread Pierre-Marie de Rodat
The previous implementation could exhibit quadratic behavior in some
cases (e.g., if the input was already sorted or almost sorted). The
new implementation uses an N log N worst case algorithm.
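The approach can be sketched in C as follows.  This is an independent
illustration, not a translation of the Ada code: struct node, its tag
field and both function names are made up for the example.  It keeps the
two properties the patch relies on: the list is split after
(length - 1) / 2 links, and the merge takes from the second part only when
its head is strictly smaller, which keeps the sort stable.

```c
#include <stddef.h>

/* Hypothetical node type; TAG only lets a caller observe stability.  */
struct node
{
  int key;
  int tag;
  struct node *prev, *next;
};

/* Merge two sorted, null-terminated lists.  On equal keys the node from
   A goes first, so elements keep their original relative order.  */
static struct node *
merge_parts (struct node *a, struct node *b)
{
  struct node *head = NULL, *tail = NULL;

  while (a || b)
    {
      /* Take from B only if A is exhausted or B's head is strictly
	 smaller.  */
      struct node **src = (!a || (b && b->key < a->key)) ? &b : &a;
      struct node *n = *src;
      *src = n->next;

      /* Append N to the merged list.  */
      n->prev = tail;
      n->next = NULL;
      if (tail)
	tail->next = n;
      else
	head = n;
      tail = n;
    }
  return head;
}

/* Sort the LENGTH nodes starting at FIRST and return the new head.  */
static struct node *
merge_sort (struct node *first, size_t length)
{
  if (length < 2)
    return first;

  /* Walk to the last node of the first half.  */
  size_t bump = (length - 1) / 2;
  struct node *rover = first;
  for (size_t i = 0; i < bump; i++)
    rover = rover->next;

  /* Detach the two halves before sorting them independently.  */
  struct node *second = rover->next;
  rover->next = NULL;
  second->prev = NULL;

  struct node *left = merge_sort (first, bump + 1);
  struct node *right = merge_sort (second, length - (bump + 1));
  return merge_parts (left, right);
}
```

Each level of recursion walks the list a constant number of times and the
halves differ in length by at most one, so the depth is logarithmic and
the worst case stays at N log N even for already sorted input.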

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-cdlili.adb: Reimplement
Ada.Containers.Doubly_Linked_Lists.Generic_Sorting.Sort using
Mergesort instead of the previous Quicksort variant.

diff --git a/gcc/ada/libgnat/a-cdlili.adb b/gcc/ada/libgnat/a-cdlili.adb
--- a/gcc/ada/libgnat/a-cdlili.adb
+++ b/gcc/ada/libgnat/a-cdlili.adb
@@ -675,68 +675,152 @@ is
 
   procedure Sort (Container : in out List) is
 
- procedure Partition (Pivot : Node_Access; Back : Node_Access);
-
- procedure Sort (Front, Back : Node_Access);
-
- ---
- -- Partition --
- ---
+ type List_Descriptor is
+record
+   First, Last : Node_Access;
+   Length  : Count_Type;
+end record;
+
+ function Merge_Sort (Arg : List_Descriptor) return List_Descriptor;
+ --  Sort list of given length using MergeSort; length must be >= 2.
+ --  As required by RM, the sort is stable.
+
+ ----------------
+ -- Merge_Sort --
+ ----------------
+
+ function Merge_Sort (Arg : List_Descriptor) return List_Descriptor
+ is
+procedure Split_List
+  (Unsplit : List_Descriptor; Part1, Part2 : out List_Descriptor);
+--  Split list into two parts for divide-and-conquer.
+--  Unsplit.Length must be >= 2.
+
+function Merge_Parts
+  (Part1, Part2 : List_Descriptor) return List_Descriptor;
+--  Merge two sorted lists, preserving sorted property.
+
+----------------
+-- Split_List --
+----------------
+
+procedure Split_List
+  (Unsplit : List_Descriptor; Part1, Part2 : out List_Descriptor)
+is
+   Rover : Node_Access := Unsplit.First;
+   Bump_Count : constant Count_Type := (Unsplit.Length - 1) / 2;
+begin
+   for Iter in 1 .. Bump_Count loop
+  Rover := Rover.Next;
+   end loop;
+
+   Part1 := (First  => Unsplit.First,
+ Last   => Rover,
+ Length => Bump_Count + 1);
+
+   Part2 := (First => Rover.Next,
+ Last  => Unsplit.Last,
+ Length => Unsplit.Length - Part1.Length);
+
+   --  Detach
+   Part1.Last.Next := null;
+   Part2.First.Prev := null;
+end Split_List;
+
+-----------------
+-- Merge_Parts --
+-----------------
+
+function Merge_Parts
+  (Part1, Part2 : List_Descriptor) return List_Descriptor
+is
+   Empty  : constant List_Descriptor := (null, null, 0);
+
+   procedure Detach_First (Source   : in out List_Descriptor;
+   Detached : out Node_Access);
+   --  Detach the first element from a non-empty list and
+   --  return the detached node via the Detached parameter.
+
+   ------------------
+   -- Detach_First --
+   ------------------
+
+   procedure Detach_First (Source   : in out List_Descriptor;
+   Detached : out Node_Access) is
+   begin
+  Detached := Source.First;
+
+  if Source.Length = 1 then
+ Source := Empty;
+  else
+ Source := (Source.First.Next,
+Source.Last,
+Source.Length - 1);
+
+ Detached.Next.Prev := null;
+ Detached.Next := null;
+  end if;
+   end Detach_First;
+
+   P1 : List_Descriptor := Part1;
+   P2 : List_Descriptor := Part2;
+   Merged : List_Descriptor := Empty;
+
+   Take_From_P2 : Boolean;
+   Detached : Node_Access;
+
+--  Start of processing for Merge_Parts
 
- procedure Partition (Pivot : Node_Access; Back : Node_Access) is
-Node : Node_Access;
+begin
+   while (P1.Length /= 0) or (P2.Length /= 0) loop
+  if P1.Length = 0 then
+ Take_From_P2 := True;
+  elsif P2.Length = 0 then
+ Take_From_P2 := False;
+  else
+ --  If the compared elements are equal then Take_From_P2
+ --  must be False in order to ensure stability.
+
+ Take_From_P2 := P2.First.Element < P1.First.Element;
+

[Ada] Crash on expansion of BIP construct in -gnatf mode

2021-07-09 Thread Pierre-Marie de Rodat
This patch fixes an issue in the compiler whereby an assignment to a
limited interface access type causes a crash when the right-hand side
has an unresolvable function call in prefix notation and verbose errors
are enabled (via -gnatf).

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch6.adb (Is_Build_In_Place_Function_Call): Add check to
verify the Selector_Name of Exp_Node has been analyzed before
obtaining its entity.

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -8275,6 +8275,15 @@ package body Exp_Ch6 is
   --  This may be a call to a protected function.
 
   elsif Nkind (Name (Exp_Node)) = N_Selected_Component then
+ --  The selector in question might not have been analyzed due to a
+ --  previous error, so analyze it here to output the appropriate
+ --  error message instead of crashing when attempting to fetch its
+ --  entity.
+
+ if not Analyzed (Selector_Name (Name (Exp_Node))) then
+Analyze (Selector_Name (Name (Exp_Node)));
+ end if;
+
  Function_Id := Etype (Entity (Selector_Name (Name (Exp_Node))));
 
   else




[Ada] Add -gnatX support for casing on discriminated values

2021-07-09 Thread Pierre-Marie de Rodat
Improve existing support for the Ada extension feature of casing on
composite values to handle casing on values that are discriminated or
have discriminated subcomponents.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch5.adb (Expand_General_Case_Statement): Add new function
Else_Statements to handle the case of invalid data analogously
to how it is handled when casing on a discrete value.
* sem_case.adb (Has_Static_Discriminant_Constraint): A new
Boolean-valued function.
(Composite_Case_Ops.Scalar_Part_Count): Include discriminants
when traversing components.
(Composite_Case_Ops.Choice_Analysis.Traverse_Discrete_Parts):
Include discriminants when traversing components; the component
range for a constrained discriminant is a single value.
(Composite_Case_Ops.Choice_Analysis.Parse_Choice): Eliminate
Done variable and modify how Next_Part is computed so that it is
always correct (as opposed to being incorrect when Done is
True).  This includes changes in Update_Result (a local
procedure).  Add new local procedure
Update_Result_For_Box_Component and call it not just for box
components but also for "missing" components (components
associated with an inactive variant).
(Check_Choices.Check_Composite_Case_Selector.Check_Component_Subtype):
Instead of disallowing all discriminated component types, allow
those that are unconstrained or statically constrained. Check
discriminant subtypes along with other component subtypes.
* doc/gnat_rm/implementation_defined_pragmas.rst: Update
documentation to reflect current implementation status.
* gnat_rm.texi: Regenerate.

diff --git a/gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst b/gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst
--- a/gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst
+++ b/gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst
@@ -2237,8 +2237,7 @@ of GNAT specific extensions are recognized as follows:
   some restrictions (described below). Aggregate syntax is used for choices
   of such a case statement; however, in cases where a "normal" aggregate would
   require a discrete value, a discrete subtype may be used instead; box
-  notation can also be used to match all values (but currently only
-  for discrete subcomponents).
+  notation can also be used to match all values.
 
   Consider this example:
 
@@ -2269,10 +2268,10 @@ of GNAT specific extensions are recognized as follows:
   set shall be a proper subset of the second (and the later alternative
   will not be executed if the earlier alternative "matches"). All possible
   values of the composite type shall be covered. The composite type of the
-  selector shall be a nonlimited untagged undiscriminated record type, all
-  of whose subcomponent subtypes are either static discrete subtypes or
-  record types that meet the same restrictions. Support for arrays is
-  planned, but not yet implemented.
+  selector shall be a nonlimited untagged (but possibly discriminated)
+  record type, all of whose subcomponent subtypes are either static discrete
+  subtypes or record types that meet the same restrictions. Support for arrays
+  is planned, but not yet implemented.
 
   In addition, pattern bindings are supported. This is a mechanism
   for binding a name to a component of a matching value for use within


diff --git a/gcc/ada/exp_ch5.adb b/gcc/ada/exp_ch5.adb
--- a/gcc/ada/exp_ch5.adb
+++ b/gcc/ada/exp_ch5.adb
@@ -3641,16 +3641,37 @@ package body Exp_Ch5 is
 return Result;
  end Elsif_Parts;
 
+ function Else_Statements return List_Id;
+ --  Returns a "raise Constraint_Error" statement if
+ --  exception propagate is permitted and No_List otherwise.
+
+ ---------------------
+ -- Else_Statements --
+ ---------------------
+
+ function Else_Statements return List_Id is
+ begin
+if Restriction_Active (No_Exception_Propagation) then
+   return No_List;
+else
+   return New_List (Make_Raise_Constraint_Error (Loc,
+  Reason => CE_Invalid_Data));
+end if;
+ end Else_Statements;
+
+ --  Local constants
+
  If_Stmt : constant Node_Id :=
Make_If_Statement (Loc,
   Condition   => Top_Level_Pattern_Match_Condition (First_Alt),
   Then_Statements => Statements (First_Alt),
-  Elsif_Parts => Elsif_Parts);
- --  Do we want an implicit "else raise Program_Error" here???
- --  Perhaps only if Exception-related restrictions are not in effect.
+  Elsif_Parts => Elsif_Parts,
+  Else_Statements => Else_Statements);
 
  Declarations : constant List_Id := New_List (Selector_Decl);
 

[Ada] Crash on inlined separate subprogram

2021-07-09 Thread Pierre-Marie de Rodat
This patch fixes an issue in the compiler whereby a pragma Inline
appearing after a subprogram body stub to which it applies and where no
specification is present causes a compile time crash.
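
For illustration, a reproducer of roughly the following shape would exercise
the fixed path. This is only a sketch under assumptions, not the actual test
case: the unit names Pkg, Run and Helper are hypothetical, and the three
units would live in separate files as noted in the comments.

```ada
--  pkg.ads (hypothetical unit; Helper is deliberately not declared here)
package Pkg is
   procedure Run;
end Pkg;

--  pkg.adb
package body Pkg is

   --  Body stub with no previous specification: the stub itself acts
   --  as the declaration of Helper.
   procedure Helper is separate;

   --  The pragma appears after the stub to which it applies
   pragma Inline (Helper);

   procedure Run is
   begin
      Helper;
   end Run;

end Pkg;

--  pkg-helper.adb
separate (Pkg)
procedure Helper is
begin
   null;
end Helper;
```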

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch6.adb (Check_Pragma_Inline): Correctly use
Corresponding_Spec_Of_Stub when dealing with subprogram body stubs.

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -3454,7 +3454,12 @@ package body Sem_Ch6 is
   --  Link the body and the generated spec
 
   Set_Corresponding_Body (Decl, Body_Id);
-  Set_Corresponding_Spec (N, Subp);
+
+  if Nkind (N) = N_Subprogram_Body_Stub then
+ Set_Corresponding_Spec_Of_Stub (N, Subp);
+  else
+ Set_Corresponding_Spec (N, Subp);
+  end if;
 
   Set_Defining_Unit_Name (Specification (Decl), Subp);
 




[Ada] Declare time_t uniformly based on a system parameter

2021-07-09 Thread Pierre-Marie de Rodat
The declaration of time_t is in flux because of its overflow in Year
2038, so declare it uniformly based on System.Parameters.time_t_bits
to ease this transition. This also lets VxWorks targets, which allow
time_t to be parameterized, be rebuilt more easily with a single source
change.
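
As a rough sketch of the idea (not the actual runtime sources), a signed
integer type can be sized from a single named number. Demo_Parameters and
Demo_OS are hypothetical stand-ins for System.Parameters and a target
specific package such as System.Linux, and the value 64 is an assumption:

```ada
--  demo_parameters.ads (hypothetical stand-in for System.Parameters)
package Demo_Parameters is
   --  Assumed value; a target with a 32-bit time_t would use 32 instead
   time_t_bits : constant := 64;
end Demo_Parameters;

--  demo_os.ads (hypothetical stand-in for a target specific package)
with Demo_Parameters;

package Demo_OS is
   --  Signed range covering exactly time_t_bits bits, so changing the
   --  constant above is the only edit needed to resize the type.
   type time_t is range -2 ** (Demo_Parameters.time_t_bits - 1)
     .. 2 ** (Demo_Parameters.time_t_bits - 1) - 1;
end Demo_OS;
```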

Two changes of note:
s-linux__x32.ads also changes the size of suseconds_t and the field
tv_nsec in timespec to be 64 bits. Since it's a one-off, it's not
handled via System.Parameters.

s-os_lib.ads contains a subtype time_t formerly of Long_Integer, changed
to Long_Long_Integer. This declaration is not used in the runtime, but
is available to application code via the renaming g-os_lib.ads. This
may cause some customer code to not compile, but is easily fixed by a
source code change. The alternative is to have two versions of
s-os_lib.ads, identical except for this subtype declaration.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* Makefile.rtl: Add translations for s-parame__posix2008.ads.
* libgnarl/s-linux.ads: Import System.Parameters.
(time_t): Declare using System.Parameters.time_t_bits.
* libgnarl/s-linux__alpha.ads: Likewise.
* libgnarl/s-linux__android.ads: Likewise.
* libgnarl/s-linux__hppa.ads: Likewise.
* libgnarl/s-linux__mips.ads: Likewise.
* libgnarl/s-linux__riscv.ads: Likewise.
* libgnarl/s-linux__sparc.ads: Likewise.
* libgnarl/s-linux__x32.ads: Likewise.
* libgnarl/s-qnx.ads: Likewise.
* libgnarl/s-osinte__aix.ads: Likewise.
* libgnarl/s-osinte__android.ads: Likewise.
* libgnarl/s-osinte__darwin.ads: Likewise.
* libgnarl/s-osinte__dragonfly.ads: Likewise.
* libgnarl/s-osinte__freebsd.ads: Likewise.
* libgnarl/s-osinte__gnu.ads: Likewise.
* libgnarl/s-osinte__hpux-dce.ads: Likewise.
* libgnarl/s-osinte__hpux.ads: Likewise.
* libgnarl/s-osinte__kfreebsd-gnu.ads: Likewise.
* libgnarl/s-osinte__lynxos178e.ads: Likewise.
* libgnarl/s-osinte__qnx.ads: Likewise.
* libgnarl/s-osinte__rtems.ads: Likewise.
* libgnarl/s-osinte__solaris.ads: Likewise.
* libgnarl/s-osinte__vxworks.ads: Likewise.
* libgnat/g-sothco.ads: Likewise.
* libgnat/s-osprim__darwin.adb: Likewise.
* libgnat/s-osprim__posix.adb: Likewise.
* libgnat/s-osprim__posix2008.adb: Likewise.
* libgnat/s-osprim__rtems.adb: Likewise.
* libgnat/s-osprim__x32.adb: Likewise.
* libgnarl/s-osinte__linux.ads: use type System.Linux.time_t.
* libgnat/s-os_lib.ads (time_t): Declare as subtype of
Long_Long_Integer.
* libgnat/s-parame.ads (time_t_bits): New constant.
* libgnat/s-parame__ae653.ads (time_t_bits): Likewise.
* libgnat/s-parame__hpux.ads (time_t_bits): Likewise.
* libgnat/s-parame__vxworks.ads (time_t_bits): Likewise.
* libgnat/s-parame__posix2008.ads: New file for 64 bit time_t.

patch.diff.gz
Description: application/gzip


[Ada] Add source file name to gnat bug box

2021-07-09 Thread Pierre-Marie de Rodat
...in case Current_Error_Node is Empty, which will cause it to print "No
source file position information available".  At least now we have the
file name being compiled.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* comperr.adb (Compiler_Abort): Print source file name.

diff --git a/gcc/ada/comperr.adb b/gcc/ada/comperr.adb
--- a/gcc/ada/comperr.adb
+++ b/gcc/ada/comperr.adb
@@ -244,12 +244,17 @@ package body Comperr is
 end if;
 
 End_Line;
+
  else
 Write_Str ("| Error detected at ");
 Write_Location (Sloc (Current_Error_Node));
 End_Line;
  end if;
 
+ Write_Str ("| Compiling ");
+ Write_Str (Get_First_Main_File_Name);
+ End_Line;
+
  --  There are two cases now. If the file gnat_bug.box exists,
  --  we use the contents of this file at this point.
 




[Ada] Fix layout of contracts

2021-07-09 Thread Pierre-Marie de Rodat
Fix layout of contracts in libgnat/a-strunb.ads and
libgnat/a-strunb__shared.ads so that it is the same in both files.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-strunb.ads, libgnat/a-strunb__shared.ads: Fix layout
in contracts.

diff --git a/gcc/ada/libgnat/a-strunb.ads b/gcc/ada/libgnat/a-strunb.ads
--- a/gcc/ada/libgnat/a-strunb.ads
+++ b/gcc/ada/libgnat/a-strunb.ads
@@ -81,7 +81,7 @@ is

 
function To_Unbounded_String
- (Source : String) return Unbounded_String
+ (Source : String)  return Unbounded_String
with
  Post   => Length (To_Unbounded_String'Result) = Source'Length,
  Global => null;
@@ -91,8 +91,7 @@ is
  (Length : Natural) return Unbounded_String
with
  Post   =>
-   Ada.Strings.Unbounded.Length (To_Unbounded_String'Result)
- = Length,
+   Ada.Strings.Unbounded.Length (To_Unbounded_String'Result) = Length,
  Global => null;
--  Returns an Unbounded_String that represents an uninitialized String
--  whose length is Length.
@@ -524,11 +523,11 @@ is
with
  Pre=>
Low - 1 <= Length (Source)
-   and then (if High >= Low
- then Low - 1
-   <= Natural'Last - By'Length
-- Natural'Max (Length (Source) - High, 0)
- else Length (Source) <= Natural'Last - By'Length),
+ and then (if High >= Low
+   then Low - 1
+ <= Natural'Last - By'Length
+  - Natural'Max (Length (Source) - High, 0)
+   else Length (Source) <= Natural'Last - By'Length),
  Contract_Cases =>
(High >= Low =>
   Length (Replace_Slice'Result)
@@ -545,11 +544,11 @@ is
with
  Pre=>
Low - 1 <= Length (Source)
-   and then (if High >= Low
- then Low - 1
-   <= Natural'Last - By'Length
-- Natural'Max (Length (Source) - High, 0)
- else Length (Source) <= Natural'Last - By'Length),
+ and then (if High >= Low
+   then Low - 1
+ <= Natural'Last - By'Length
+  - Natural'Max (Length (Source) - High, 0)
+   else Length (Source) <= Natural'Last - By'Length),
  Contract_Cases =>
(High >= Low =>
   Length (Source)
@@ -586,7 +585,7 @@ is
  Pre=> Position - 1 <= Length (Source)
  and then (if New_Item'Length /= 0
then
-   New_Item'Length <= Natural'Last - (Position - 1)),
+ New_Item'Length <= Natural'Last - (Position - 1)),
  Post   =>
Length (Overwrite'Result)
  = Natural'Max (Length (Source), Position - 1 + New_Item'Length),
@@ -600,7 +599,7 @@ is
  Pre=> Position - 1 <= Length (Source)
  and then (if New_Item'Length /= 0
then
-   New_Item'Length <= Natural'Last - (Position - 1)),
+ New_Item'Length <= Natural'Last - (Position - 1)),
  Post   =>
Length (Source)
  = Natural'Max (Length (Source)'Old, Position - 1 + New_Item'Length),


diff --git a/gcc/ada/libgnat/a-strunb__shared.ads b/gcc/ada/libgnat/a-strunb__shared.ads
--- a/gcc/ada/libgnat/a-strunb__shared.ads
+++ b/gcc/ada/libgnat/a-strunb__shared.ads
@@ -363,9 +363,8 @@ is
   Going   : Direction := Forward;
   Mapping : Maps.Character_Mapping := Maps.Identity) return Natural
with
- Pre=> (if Length (Source) /= 0
-then From <= Length (Source))
-   and then Pattern'Length /= 0,
+ Pre=> (if Length (Source) /= 0 then From <= Length (Source))
+   and then Pattern'Length /= 0,
  Global => null;
pragma Ada_05 (Index);
 
@@ -376,11 +375,9 @@ is
   Going   : Direction := Forward;
   Mapping : Maps.Character_Mapping_Function) return Natural
with
- Pre=> (if Length (Source) /= 0
-then From <= Length (Source))
-   and then Pattern'Length /= 0,
+ Pre=> (if Length (Source) /= 0 then From <= Length (Source))
+   and then Pattern'Length /= 0,
  Global => null;
-
pragma Ada_05 (Index);
 
function Index




[Ada] Fix invalid JSON for derived variant record with -gnatRj

2021-07-09 Thread Pierre-Marie de Rodat
This prevents the output of -gnatRj from containing several "variant" fields
for an extension with a variant part of a tagged type with a variant part.
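
A declaration of the following shape is the kind of input concerned. It is
a hypothetical example rather than the original test case (the package and
component names are made up), and compiling it with -gnatRj would list the
layout of both types:

```ada
--  shapes.ads (hypothetical example)
package Shapes is

   --  Tagged parent type with its own variant part
   type Root (Kind : Boolean) is tagged record
      case Kind is
         when True =>
            A : Integer;
         when False =>
            B : Float;
      end case;
   end record;

   --  Extension that adds a second variant part on top of the
   --  variant part inherited from Root.
   type Child (K : Boolean; Mode : Boolean) is new Root (K) with record
      case Mode is
         when True =>
            C : Integer;
         when False =>
            D : Float;
      end case;
   end record;

end Shapes;
```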

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* repinfo.ads (JSON output format): Document adjusted key name.
* repinfo.adb (List_Record_Layout): Use Original_Record_Component
if the normalized position of the component is not known.
(List_Structural_Record_Layout): Rename Outer_Ent parameter into
Ext_Ent and add Ext_Level parameter. In an extension, if the parent
subtype has static discriminants, call List_Record_Layout on it.
Output "parent_" prefixes before "variant" according to Ext_Level.
Adjust recursive calls throughout the procedure.

diff --git a/gcc/ada/repinfo.adb b/gcc/ada/repinfo.adb
--- a/gcc/ada/repinfo.adb
+++ b/gcc/ada/repinfo.adb
@@ -963,10 +963,15 @@ package body Repinfo is
 
   procedure List_Structural_Record_Layout
 (Ent   : Entity_Id;
- Outer_Ent : Entity_Id;
+ Ext_Ent   : Entity_Id;
+ Ext_Level : Nat := 0;
  Variant   : Node_Id := Empty;
  Indent: Natural := 0);
-  --  Internal recursive procedure to display the structural layout
+  --  Internal recursive procedure to display the structural layout.
+  --  If Ext_Ent is not equal to Ent, it is an extension of Ent and
+  --  Ext_Level is the number of successive extensions between them.
+  --  If Variant is present, it's for a variant in the variant part
+  --  instead of the common part of Ent. Indent is the indentation.
 
   Incomplete_Layout : exception;
   --  Exception raised if the layout is incomplete in -gnatc mode
@@ -1319,7 +1324,12 @@ package body Repinfo is
   end if;
end if;
 
-   List_Component_Layout (Comp,
+   --  The Parent_Subtype in an extension is not back-annotated
+
+   List_Component_Layout (
+ (if Known_Normalized_Position (Comp)
+  then Comp
+  else Original_Record_Component (Comp)),
  Starting_Position, Starting_First_Bit, Prefix);
 end;
 
@@ -1334,15 +1344,16 @@ package body Repinfo is
 
   procedure List_Structural_Record_Layout
 (Ent   : Entity_Id;
- Outer_Ent : Entity_Id;
+ Ext_Ent   : Entity_Id;
+ Ext_Level : Nat := 0;
  Variant   : Node_Id := Empty;
  Indent: Natural := 0)
   is
  function Derived_Discriminant (Disc : Entity_Id) return Entity_Id;
- --  This function assumes that Outer_Ent is an extension of Ent.
+ --  This function assumes that Ext_Ent is an extension of Ent.
  --  Disc is a discriminant of Ent that does not itself constrain a
  --  discriminant of the parent type of Ent. Return the discriminant
- --  of Outer_Ent that ultimately constrains Disc, if any.
+ --  of Ext_Ent that ultimately constrains Disc, if any.
 
  
  --  Derived_Discriminant  --
@@ -1353,7 +1364,7 @@ package body Repinfo is
 Derived_Disc : Entity_Id;
 
  begin
-Derived_Disc := First_Discriminant (Outer_Ent);
+Derived_Disc := First_Discriminant (Ext_Ent);
 
 --  Loop over the discriminants of the extension
 
@@ -1380,7 +1391,7 @@ package body Repinfo is
Next_Discriminant (Derived_Disc);
 end loop;
 
---  Disc is not constrained by a discriminant of Outer_Ent
+--  Disc is not constrained by a discriminant of Ext_Ent
 
 return Empty;
  end Derived_Discriminant;
@@ -1432,12 +1443,21 @@ package body Repinfo is
  pragma Assert (Present (Parent_Type));
   end if;
 
-  Parent_Type := Base_Type (Parent_Type);
-  if not In_Extended_Main_Source_Unit (Parent_Type) then
- raise Not_In_Extended_Main;
+  --  Do not list variants if one of them has been selected
+
+  if Has_Static_Discriminants (Parent_Type) then
+ List_Record_Layout (Parent_Type);
+
+  else
+ Parent_Type := Base_Type (Parent_Type);
+ if not In_Extended_Main_Source_Unit (Parent_Type) then
+raise Not_In_Extended_Main;
+ end if;
+
+ List_Structural_Record_Layout
+   (Parent_Type, Ext_Ent, Ext_Level + 1);
   end if;
 
-  List_Structural_Record_Layout (Parent_Type, Outer_Ent);
   First := False;
 
   if Present (Record_Extension_Part (Definition)) then
@@ -1467,7 +1487,7 @@ package body Repinfo is
  --  If this is the parent type of an extension, retrieve

[Ada] Fix typo in comment related to derived discriminated types

2021-07-09 Thread Pierre-Marie de Rodat
Minor typo; found while fixing handling of tagged types in GNATprove.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_util.ads (Map_Types): Fix typo.

diff --git a/gcc/ada/exp_util.ads b/gcc/ada/exp_util.ads
--- a/gcc/ada/exp_util.ads
+++ b/gcc/ada/exp_util.ads
@@ -915,7 +915,7 @@ package Exp_Util is
--  Establish the following mapping between the attributes of tagged parent
--  type Parent_Type and tagged derived type Derived_Type.
--
-   --* Map each discriminant of Parent_Type to ether the corresponding
+   --* Map each discriminant of Parent_Type to either the corresponding
--  discriminant of Derived_Type or come constraint.
 
--* Map each primitive operation of Parent_Type to the corresponding




[Ada] Add paragraph about representation changes and Scalar_Storage_Order

2021-07-09 Thread Pierre-Marie de Rodat
This in particular documents the new warning given on overlays.
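
As a minimal sketch of the alternative recommended in the new paragraph,
the generic function Swapped4 from GNAT.Byte_Swapping can be instantiated
for a 32-bit type. The procedure name Swap_Demo and the sample value are
illustrative assumptions:

```ada
with Ada.Text_IO;
with GNAT.Byte_Swapping;
with Interfaces;

procedure Swap_Demo is

   --  Instantiate the generic 4-byte swapping function for a 32-bit type
   function Swapped is
     new GNAT.Byte_Swapping.Swapped4 (Interfaces.Unsigned_32);

   Value : constant Interfaces.Unsigned_32 := 16#1122_3344#;

begin
   --  Print the value with its four bytes in reverse order
   Ada.Text_IO.Put_Line (Interfaces.Unsigned_32'Image (Swapped (Value)));
end Swap_Demo;
```

Unlike an address overlay, the swap here is an explicit operation on a
value, so no object is viewed through two different storage orders.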

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* doc/gnat_rm/implementation_defined_attributes.rst
(Scalar_Storage_Order): Add paragraph about representation
changes.
* gnat_rm.texi: Regenerate.

diff --git a/gcc/ada/doc/gnat_rm/implementation_defined_attributes.rst b/gcc/ada/doc/gnat_rm/implementation_defined_attributes.rst
--- a/gcc/ada/doc/gnat_rm/implementation_defined_attributes.rst
+++ b/gcc/ada/doc/gnat_rm/implementation_defined_attributes.rst
@@ -1057,6 +1057,46 @@ If a component of ``T`` is itself of a record or array type, the specfied
 attribute definition clause must be provided for the component type as well
 if desired.
 
+Representation changes that explicitly or implicitly toggle the scalar storage
+order are not supported and may result in erroneous execution of the program,
+except when performed by means of an instance of ``Ada.Unchecked_Conversion``.
+
+In particular, overlays are not supported and a warning is given for them:
+
+.. code-block:: ada
+
+ type Rec_LE is record
+I : Integer;
+ end record;
+
+ for Rec_LE use record
+I at 0 range 0 .. 31;
+ end record;
+
+ for Rec_LE'Bit_Order use System.Low_Order_First;
+ for Rec_LE'Scalar_Storage_Order use System.Low_Order_First;
+
+ type Rec_BE is record
+I : Integer;
+ end record;
+
+ for Rec_BE use record
+I at 0 range 0 .. 31;
+ end record;
+
+ for Rec_BE'Bit_Order use System.High_Order_First;
+ for Rec_BE'Scalar_Storage_Order use System.High_Order_First;
+
+ R_LE : Rec_LE;
+
+ R_BE : Rec_BE;
+ for R_BE'Address use R_LE'Address;
+
+``warning: overlay changes scalar storage order [enabled by default]``
+
+In most cases, such representation changes ought to be replaced by an
+instantiation of a function or procedure provided by ``GNAT.Byte_Swapping``.
+
 Note that the scalar storage order only affects the in-memory data
 representation. It has no effect on the representation used by stream
 attributes.


diff --git a/gcc/ada/gnat_rm.texi b/gcc/ada/gnat_rm.texi
--- a/gcc/ada/gnat_rm.texi
+++ b/gcc/ada/gnat_rm.texi
@@ -11442,6 +11442,46 @@ If a component of @code{T} is itself of a record or array type, the specfied
 attribute definition clause must be provided for the component type as well
 if desired.
 
+Representation changes that explicitly or implicitly toggle the scalar storage
+order are not supported and may result in erroneous execution of the program,
+except when performed by means of an instance of @code{Ada.Unchecked_Conversion}.
+
+In particular, overlays are not supported and a warning is given for them:
+
+@example
+type Rec_LE is record
+   I : Integer;
+end record;
+
+for Rec_LE use record
+   I at 0 range 0 .. 31;
+end record;
+
+for Rec_LE'Bit_Order use System.Low_Order_First;
+for Rec_LE'Scalar_Storage_Order use System.Low_Order_First;
+
+type Rec_BE is record
+   I : Integer;
+end record;
+
+for Rec_BE use record
+   I at 0 range 0 .. 31;
+end record;
+
+for Rec_BE'Bit_Order use System.High_Order_First;
+for Rec_BE'Scalar_Storage_Order use System.High_Order_First;
+
+R_LE : Rec_LE;
+
+R_BE : Rec_BE;
+for R_BE'Address use R_LE'Address;
+@end example
+
+@code{warning: overlay changes scalar storage order [enabled by default]}
+
+In most cases, such representation changes ought to be replaced by an
+instantiation of a function or procedure provided by @code{GNAT.Byte_Swapping}.
+
 Note that the scalar storage order only affects the in-memory data
 representation. It has no effect on the representation used by stream
 attributes.




[Ada] aarch64-rtems6: use wraplf variant for a-nallfl

2021-07-09 Thread Pierre-Marie de Rodat
Since newlib doesn't correctly implement long double, use the wraplf
variant for a-nallfl.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* Makefile.rtl (LIBGNAT_TARGET_PAIRS) : Use
the wraplf variant of Aux_Long_Long_Float.

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -2193,7 +2193,7 @@ ifeq ($(strip $(filter-out rtems%,$(target_os))),)
 EH_MECHANISM=-gcc
   endif
 
-  ifeq ($(strip $(filter-out riscv%,$(target_cpu))),)
+  ifeq ($(strip $(filter-out aarch64% riscv%,$(target_cpu))),)
 LIBGNAT_TARGET_PAIRS += a-nallfl.ads

[Ada] Initialize local variables related to static expression functions

2021-07-09 Thread Pierre-Marie de Rodat
Explicitly initialize local variables related to analysis of expression
functions to prevent spurious checks from static analysers.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): Initialize Orig_N
and Typ variables.

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -298,9 +298,9 @@ package body Sem_Ch6 is
   Asp  : Node_Id;
   New_Body : Node_Id;
   New_Spec : Node_Id;
-  Orig_N   : Node_Id;
+  Orig_N   : Node_Id := Empty;
   Ret  : Node_Id;
-  Typ  : Entity_Id;
+  Typ  : Entity_Id := Empty;
 
   Def_Id : Entity_Id := Empty;
   Prev   : Entity_Id;




[Ada] Inconsistency between declaration and body of predicate functions

2021-07-09 Thread Pierre-Marie de Rodat
We need to declare a predicate function along with its type, but can only
generate the body at the freeze point, which may be in a separate scope,
leading to inconsistencies. So fix this by deferring the generation of
the predicate function declaration, and fix latent bugs uncovered along
the way.

While investigating, we also discovered inconsistencies among the three
predicate-related aspects, partly fixed here.
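
For reference, the three aspects in question are Predicate,
Static_Predicate and Dynamic_Predicate. A minimal sketch of how each one
is written follows; the package and subtype names are hypothetical:

```ada
--  preds.ads (hypothetical example)
package Preds is

   --  Static predicate: the condition must be a static membership
   --  or comparison test on the current instance.
   subtype Small is Integer
     with Static_Predicate => Small in 1 .. 10;

   --  Dynamic predicate: any Boolean expression is allowed
   subtype Even is Integer
     with Dynamic_Predicate => Even mod 2 = 0;

   --  GNAT-specific Predicate aspect
   subtype Positive_Value is Integer
     with Predicate => Positive_Value > 0;

end Preds;
```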

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch13.adb (Resolve_Aspect_Expressions): Use the same
processing for Predicate, Static_Predicate and
Dynamic_Predicate. Do not build the predicate function spec.
Update comments.
(Resolve_Name): Only reset Entity when necessary to avoid
spurious visibility errors.
(Check_Aspect_At_End_Of_Declarations): Handle consistently all
Predicate aspects.
* sem_ch3.adb (Analyze_Subtype_Declaration): Fix handling of
private types with predicates.

diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -10114,11 +10114,11 @@ package body Sem_Ch13 is
   then
  return;
 
---  Do not generate predicate bodies within a generic unit. The
---  expressions have been analyzed already, and the bodies play
---  no role if not within an executable unit. However, if a static
---  predicate is present it must be processed for legality checks
---  such as case coverage in an expression.
+  --  Do not generate predicate bodies within a generic unit. The
+  --  expressions have been analyzed already, and the bodies play no role
+  --  if not within an executable unit. However, if a static predicate is
+  --  present it must be processed for legality checks such as case
+  --  coverage in an expression.
 
   elsif Inside_A_Generic
 and then not Has_Static_Predicate_Aspect (Typ)
@@ -10782,7 +10782,9 @@ package body Sem_Ch13 is
  --  also make its potential components accessible.
 
  if not Analyzed (Freeze_Expr) and then Inside_A_Generic then
-if A_Id in Aspect_Dynamic_Predicate | Aspect_Predicate then
+if A_Id in Aspect_Dynamic_Predicate | Aspect_Predicate |
+   Aspect_Static_Predicate
+then
Push_Type (Ent);
Preanalyze_Spec_Expression (Freeze_Expr, Standard_Boolean);
Pop_Type (Ent);
@@ -10813,6 +10815,7 @@ package body Sem_Ch13 is
 if A_Id in Aspect_Dynamic_Predicate
  | Aspect_Predicate
  | Aspect_Priority
+ | Aspect_Static_Predicate
 then
Push_Type (Ent);
Check_Aspect_At_Freeze_Point (ASN);
@@ -10840,6 +10843,7 @@ package body Sem_Ch13 is
  | Aspect_Dynamic_Predicate
  | Aspect_Predicate
  | Aspect_Priority
+ | Aspect_Static_Predicate
  then
 Push_Type (Ent);
 Preanalyze_Spec_Expression (End_Decl_Expr, T);
@@ -15042,9 +15046,15 @@ package body Sem_Ch13 is
   or else N /= Selector_Name (Parent (N)))
  then
 Find_Direct_Name (N);
-Set_Entity (N, Empty);
 
- --  The name is component association needs no resolution
+--  Reset the Entity if N is overloaded since the entity may not
+--  be the correct one.
+
+if Is_Overloaded (N) then
+   Set_Entity (N, Empty);
+end if;
+
+ --  The name in a component association needs no resolution
 
  elsif Nkind (N) = N_Component_Association then
 Dummy := Resolve_Name (Expression (N));
@@ -15087,24 +15097,23 @@ package body Sem_Ch13 is
   --  types. These will require special handling???.
 
   when Aspect_Invariant
- | Aspect_Predicate
  | Aspect_Predicate_Failure
   =>
  null;
 
   when Aspect_Dynamic_Predicate
  | Aspect_Static_Predicate
+ | Aspect_Predicate
   =>
- --  Build predicate function specification and preanalyze
- --  expression after type replacement. The function
- --  declaration must be analyzed in the scope of the type,
- --  but the expression can reference components and
- --  discriminants of the type.
+ --  Preanalyze expression after type replacement to catch
+ --  name resolution errors if the predicate function has
+ --  not been built yet.
+ --  Note that we cannot use Preanalyze_Spec_Expression
+ --  because of the special handling 

[Ada] Incremental patch for restriction No_Dynamic_Accessibility_Checks

2021-07-09 Thread Pierre-Marie de Rodat
This patch corrects various issues discovered during testing of the
No_Dynamic_Accessibility_Checks restriction leading to level
miscalculation errors.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_util.ads (Type_Access_Level): Add new optional parameter
Assoc_Ent.
* sem_util.adb (Accessibility_Level): Treat access discriminants
the same as components when the restriction
No_Dynamic_Accessibility_Checks is enabled.
(Deepest_Type_Access_Level): Remove exception for
Debug_Flag_Underscore_B when returning the result of
Type_Access_Level in the case where
No_Dynamic_Accessibility_Checks is active.
(Function_Call_Or_Allocator_Level): Correctly calculate the
level of Expr based on its containing subprogram instead of
using Current_Subprogram.
* sem_res.adb (Valid_Conversion): Add actual for new parameter
Assoc_Ent in call to Type_Access_Level, and add test of
No_Dynamic_Accessibility_Checks_Enabled to ensure that static
accessibility checks are performed for all anonymous access type
conversions.

diff --git a/gcc/ada/sem_res.adb b/gcc/ada/sem_res.adb
--- a/gcc/ada/sem_res.adb
+++ b/gcc/ada/sem_res.adb
@@ -13734,11 +13734,16 @@ package body Sem_Res is
 --  the target type is anonymous access as well - see RM 3.10.2
 --  (10.3/3).
 
-elsif Type_Access_Level (Opnd_Type) >
-Deepest_Type_Access_Level (Target_Type)
-  and then (Nkind (Associated_Node_For_Itype (Opnd_Type)) /=
- N_Function_Specification
-or else Ekind (Target_Type) in Anonymous_Access_Kind)
+--  Note that when the restriction No_Dynamic_Accessibility_Checks
+--  is in effect wei also want to proceed with the conversion check
+--  described above.
+
+elsif Type_Access_Level (Opnd_Type, Assoc_Ent => Operand)
+> Deepest_Type_Access_Level (Target_Type)
+  and then (Nkind (Associated_Node_For_Itype (Opnd_Type))
+  /= N_Function_Specification
+or else Ekind (Target_Type) in Anonymous_Access_Kind
+or else No_Dynamic_Accessibility_Checks_Enabled (N))
 
   --  Check we are not in a return value ???
 


diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -420,7 +420,7 @@ package body Sem_Util is
 
else
   return Make_Level_Literal
-   (Subprogram_Access_Level (Current_Subprogram));
+   (Subprogram_Access_Level (Entity (Name (N))));
end if;
 end if;
 
@@ -791,12 +791,22 @@ package body Sem_Util is
 --  is an anonymous access type means that its associated
 --  level is that of the containing type - see RM 3.10.2 (16).
 
+--  Note that when restriction No_Dynamic_Accessibility_Checks is
+--  in effect we treat discriminant components as regular
+--  components.
+
 elsif Nkind (E) = N_Selected_Component
   and then Ekind (Etype (E))   =  E_Anonymous_Access_Type
   and then Ekind (Etype (Pre)) /= E_Anonymous_Access_Type
-  and then not (Nkind (Selector_Name (E)) in N_Has_Entity
- and then Ekind (Entity (Selector_Name (E)))
-= E_Discriminant)
+  and then (not (Nkind (Selector_Name (E)) in N_Has_Entity
+  and then Ekind (Entity (Selector_Name (E)))
+ = E_Discriminant)
+
+--  The alternative accessibility models both treat
+--  discriminants as regular components.
+
+or else (No_Dynamic_Accessibility_Checks_Enabled (E)
+  and then Allow_Alt_Model))
 then
--  When restriction No_Dynamic_Accessibility_Checks is active
--  and -gnatd_b set, the level is that of the designated type.
@@ -7215,7 +7225,6 @@ package body Sem_Util is
 
  if Allow_Alt_Model
and then No_Dynamic_Accessibility_Checks_Enabled (Typ)
-   and then not Debug_Flag_Underscore_B
  then
 return Type_Access_Level (Typ, Allow_Alt_Model);
  end if;
@@ -29157,7 +29166,8 @@ package body Sem_Util is
 
function Type_Access_Level
  (Typ : Entity_Id;
-  Allow_Alt_Model : Boolean := True) return Uint
+  Allow_Alt_Model : Boolean   := True;
+  Assoc_Ent   : Entity_Id := Empty) return Uint
is
   Btyp: Entity_Id := Base_Type (Typ);
   Def_Ent : Entity_Id;
@@ -29187,6 +29197,18 @@ package body Sem_Util is
  

[Ada] Update internal documentation of debugging information

2021-07-09 Thread Pierre-Marie de Rodat
This updates the documentation of the debugging information generated by
the compiler present in the spec of the Exp_Dbug unit.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_dbug.ads: Update documentation of various items.

diff --git a/gcc/ada/exp_dbug.ads b/gcc/ada/exp_dbug.ads
--- a/gcc/ada/exp_dbug.ads
+++ b/gcc/ada/exp_dbug.ads
@@ -23,9 +23,11 @@
 --  --
 --
 
---  Expand routines for generation of special declarations used by the
---  debugger. In accordance with the Dwarf 2.2 specification, certain
---  type names are encoded to provide information to the debugger.
+--  Expand routines for the generation of special declarations used by the
+--  debugger. In accordance with the DWARF specification, certain type names
+--  may also be encoded to provide additional information to the debugger, but
+--  this practice is being deprecated and some encodings described below are no
+--  longer generated by default (they are marked OBSOLETE).
 
 with Namet; use Namet;
 with Types; use Types;
@@ -496,53 +498,104 @@ package Exp_Dbug is
--  corresponding positive value followed by a lower case m for minus to
--  indicate that the value is negative (e.g. 2m for -2).
 
-   -------------------------
-   -- Type Name Encodings --
-   -------------------------
+   ------------------------
+   -- Encapsulated Types --
+   ------------------------
+
+   --  In some cases, the compiler may encapsulate a type by wrapping it in a
+   --  record. For example, this is used when a size or alignment specification
+   --  requires a larger type. Consider:
+
+   --type x is mod 2 ** 64;
+   --for x'size use 256;
+
+   --  In this case, the compiler generates a record type x___PAD, which has
+   --  a single field whose name is F. This single field is 64-bit long and
+   --  contains the actual value. This kind of padding is used when the logical
+   --  value to be stored is shorter than the object in which it is allocated.
+
+   --  A similar encapsulation is done for some packed array types, in which
+   --  case the record type is x___JM and the field name is OBJECT. This is
+   --  used in the case of a packed array stored using modular representation
+   --  (see the section on representation of packed array objects). In this
+   --  case the wrapping is used to achieve correct positioning of the packed
+   --  array value (left/right justified in its field depending on endianness).
+
+   --  When the debugger sees an object of a type whose name has a suffix of
+   --  ___PAD or ___JM, the type will be a record containing a single field,
+   --  and the name of that field will be all upper case. In this case, it
+   --  should look inside to get the value of the inner field, and neither
+   --  the outer structure name, nor the field name should appear when the
+   --  value is printed.
+
+   --  Similarly, when the debugger sees a record named REP being the type of
+   --  a field inside another record type, it should treat the fields inside
+   --  REP as being part of the outer record (this REP field is only present
+   --  for code generation purposes). The REP record should not appear in the
+   --  values printed by the debugger.
+
+   --------------------
+   -- Implicit Types --
+   --------------------
+
+   --  The compiler creates implicit type names in many situations where a
+   --  type is present semantically, but no specific name is present. For
+   --  example:
+
+   -- S : Integer range M .. N;
+
+   --  Here the subtype of S is not integer, but rather an anonymous subtype
+   --  of Integer. Where possible, the compiler generates names for such
+   --  anonymous types that are related to the type from which the subtype
+   --  is obtained as follows:
+
+   -- T name suffix
+
+   --  where name is the name from which the subtype is obtained, using
+   --  lower case letters and underscores, and suffix starts with an upper
+   --  case letter. For example the name for the above declaration might be:
+
+   -- TintegerS4b
+
+   --  If the debugger is asked to give the type of an entity and the type
+   --  has the form T name suffix, it is probably appropriate to just use
+   --  "name" in the response since this is what is meaningful to the
+   --  programmer.
+
+   -------------------
+   -- Modular Types --
+   -------------------
+
+   --  A type declared
+
+   --type x is mod N;
+
+   --  is encoded as a subrange of an unsigned base type with lower bound zero
+   --  and upper bound N - 1. Thus we give these types a somewhat nonstandard
+   --  interpretation: the standard interpretation would not, in general, imply
+   --  that arithmetic operations on type x are performed modulo N (especially
+   --  not when N is not a power of 2).
+
+   --
+   -- Tagged 
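
Two of the encodings described in the comments above can be sketched in C from the consumer's side. This is only an illustration under stated assumptions: implicit_base_name and mod_add are hypothetical helpers, not part of GNAT or of any debugger. The first takes "T name suffix" literally (name in lower case letters and underscores, suffix starting with an upper case letter), and the second shows the modulo-N interpretation that the subrange encoding alone does not imply.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: given an implicit type name of the form
   T<name><Suffix>, copy <name> into OUT.  Returns 1 on success and
   0 if S does not have that shape or OUT is too small.  */
static int
implicit_base_name (const char *s, char *out, size_t out_size)
{
  size_t len = 0;

  if (s[0] != 'T')
    return 0;

  /* <name> uses lower case letters and underscores only; the suffix
     starts at the first upper case letter after it.  */
  while (islower ((unsigned char) s[1 + len]) || s[1 + len] == '_')
    len++;

  if (len == 0 || !isupper ((unsigned char) s[1 + len]) || len >= out_size)
    return 0;

  memcpy (out, s + 1, len);
  out[len] = '\0';
  return 1;
}

/* Hypothetical helper: addition for "type x is mod N".  Operands lie
   in 0 .. N - 1 and the result wraps modulo N, which need not be a
   power of 2.  */
static unsigned
mod_add (unsigned a, unsigned b, unsigned n)
{
  return (unsigned) (((unsigned long long) a + b) % n);
}
```

With these definitions, the example name TintegerS4b from the text yields "integer", which is what the comment suggests showing to the programmer.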

[Ada] Reorder preanalysis of static expression functions

2021-07-09 Thread Pierre-Marie de Rodat
Group two variants of preanalysis of expression functions. Code cleanup
related to handling of static expression functions in GNATprove.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): Reorder code.

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -524,6 +524,12 @@ package body Sem_Ch6 is
 Install_Formals (Def_Id);
 Preanalyze_Spec_Expression (Expr, Typ);
 End_Scope;
+ else
+Push_Scope (Def_Id);
+Install_Formals (Def_Id);
+Preanalyze_Formal_Expression (Expr, Typ);
+Check_Limited_Return (Orig_N, Expr, Typ);
+End_Scope;
  end if;
 
  --  If this is a wrapper created in an instance for a formal
@@ -561,16 +567,6 @@ package body Sem_Ch6 is
 end;
  end if;
 
- --  Preanalyze the expression if not already done above
-
- if not Inside_A_Generic then
-Push_Scope (Def_Id);
-Install_Formals (Def_Id);
-Preanalyze_Formal_Expression (Expr, Typ);
-Check_Limited_Return (Orig_N, Expr, Typ);
-End_Scope;
- end if;
-
  --  In the case of an expression function marked with the aspect
  --  Static, we need to check the requirement that the function's
  --  expression is a potentially static expression. This is done




[Ada] Decouple analysis of static expression functions from GNATprove

2021-07-09 Thread Pierre-Marie de Rodat
Analysis of static expression functions happened inside an IF branch
guarded by GNATprove_Mode. Cleanup related to handling of static
expression functions in GNATprove mode; behaviour is unaffected.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): Reorder code.

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -526,30 +526,30 @@ package body Sem_Ch6 is
 End_Scope;
  end if;
 
+ --  If this is a wrapper created in an instance for a formal
+ --  subprogram, insert body after declaration, to be analyzed when the
+ --  enclosing instance is analyzed.
+
+ if GNATprove_Mode
+   and then Is_Generic_Actual_Subprogram (Def_Id)
+ then
+Insert_After (N, New_Body);
+
  --  To prevent premature freeze action, insert the new body at the end
  --  of the current declarations, or at the end of the package spec.
  --  However, resolve usage names now, to prevent spurious visibility
  --  on later entities. Note that the function can now be called in
- --  the current declarative part, which will appear to be prior to
- --  the presence of the body in the code. There are nevertheless no
- --  order of elaboration issues because all name resolution has taken
- --  place at the point of declaration.
-
- declare
-Decls : List_Id  := List_Containing (N);
-Par   : constant Node_Id := Parent (Decls);
+ --  the current declarative part, which will appear to be prior to the
+ --  presence of the body in the code. There are nevertheless no order
+ --  of elaboration issues because all name resolution has taken place
+ --  at the point of declaration.
 
- begin
---  If this is a wrapper created in an instance for a formal
---  subprogram, insert body after declaration, to be analyzed when
---  the enclosing instance is analyzed.
-
-if GNATprove_Mode
-  and then Is_Generic_Actual_Subprogram (Def_Id)
-then
-   Insert_After (N, New_Body);
+ else
+declare
+   Decls : List_Id  := List_Containing (N);
+   Par   : constant Node_Id := Parent (Decls);
 
-else
+begin
if Nkind (Par) = N_Package_Specification
  and then Decls = Visible_Declarations (Par)
  and then not Is_Empty_List (Private_Declarations (Par))
@@ -558,68 +558,67 @@ package body Sem_Ch6 is
end if;
 
Insert_After (Last (Decls), New_Body);
+end;
+ end if;
 
-   --  Preanalyze the expression if not already done above
+ --  Preanalyze the expression if not already done above
 
-   if not Inside_A_Generic then
-  Push_Scope (Def_Id);
-  Install_Formals (Def_Id);
-  Preanalyze_Formal_Expression (Expr, Typ);
-  Check_Limited_Return (Orig_N, Expr, Typ);
-  End_Scope;
-   end if;
+ if not Inside_A_Generic then
+Push_Scope (Def_Id);
+Install_Formals (Def_Id);
+Preanalyze_Formal_Expression (Expr, Typ);
+Check_Limited_Return (Orig_N, Expr, Typ);
+End_Scope;
+ end if;
 
-   --  In the case of an expression function marked with the
-   --  aspect Static, we need to check the requirement that the
-   --  function's expression is a potentially static expression.
-   --  This is done by making a full copy of the expression tree
-   --  and performing a special preanalysis on that tree with
-   --  the global flag Checking_Potentially_Static_Expression
-   --  enabled. If the resulting expression is static, then it's
-   --  OK, but if not, that means the expression violates the
-   --  requirements of the Ada 2022 RM in 4.9(3.2/5-3.4/5) and
-   --  we flag an error.
-
-   if Is_Static_Function (Def_Id) then
-  if not Is_Static_Expression (Expr) then
- declare
-Exp_Copy : constant Node_Id := New_Copy_Tree (Expr);
- begin
-Set_Checking_Potentially_Static_Expression (True);
+ --  In the case of an expression function marked with the aspect
+ --  Static, we need to check the requirement that the function's
+ --  expression is a potentially static expression. This is done
+ --  by making a full copy of the expression tree and performing
+ --  a special preanalysis on that tree with the global flag
+ --  

[Ada] Avoid repeated computing of type of expression functions

2021-07-09 Thread Pierre-Marie de Rodat
Cleanup related to handling of static expression functions in GNATprove.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): Add variable to
avoid repeated calls to Etype.

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -300,6 +300,7 @@ package body Sem_Ch6 is
   New_Spec : Node_Id;
   Orig_N   : Node_Id;
   Ret  : Node_Id;
+  Typ  : Entity_Id;
 
   Def_Id : Entity_Id := Empty;
   Prev   : Entity_Id;
@@ -333,6 +334,8 @@ package body Sem_Ch6 is
  Def_Id := Analyze_Subprogram_Specification (Spec);
  Prev   := Find_Corresponding_Spec (N);
 
+ Typ := Etype (Def_Id);
+
  --  The previous entity may be an expression function as well, in
  --  which case the redeclaration is illegal.
 
@@ -406,7 +409,7 @@ package body Sem_Ch6 is
  if not Inside_A_Generic then
 Freeze_Expr_Types
   (Def_Id => Def_Id,
-   Typ=> Etype (Def_Id),
+   Typ=> Typ,
Expr   => Expr,
N  => N);
  end if;
@@ -496,6 +499,8 @@ package body Sem_Ch6 is
  Def_Id := Defining_Entity (N);
  Set_Is_Inlined (Def_Id);
 
+ Typ := Etype (Def_Id);
+
  --  Establish the linkages between the spec and the body. These are
  --  used when the expression function acts as the prefix of attribute
  --  'Access in order to freeze the original expression which has been
@@ -517,7 +522,7 @@ package body Sem_Ch6 is
 Set_Has_Completion (Def_Id, not Is_Ignored_Ghost_Entity (Def_Id));
 Push_Scope (Def_Id);
 Install_Formals (Def_Id);
-Preanalyze_Spec_Expression (Expr, Etype (Def_Id));
+Preanalyze_Spec_Expression (Expr, Typ);
 End_Scope;
  end if;
 
@@ -531,9 +536,8 @@ package body Sem_Ch6 is
  --  place at the point of declaration.
 
  declare
-Decls : List_Id:= List_Containing (N);
-Par   : constant Node_Id   := Parent (Decls);
-Typ   : constant Entity_Id := Etype (Def_Id);
+Decls : List_Id  := List_Containing (N);
+Par   : constant Node_Id := Parent (Decls);
 
  begin
 --  If this is a wrapper created in an instance for a formal
@@ -624,12 +628,11 @@ package body Sem_Ch6 is
   --  nodes that don't come from source.
 
   if Present (Def_Id)
-and then Nkind (Def_Id) in N_Has_Etype
-and then Is_Tagged_Type (Etype (Def_Id))
+and then Is_Tagged_Type (Typ)
   then
  Check_Dynamically_Tagged_Expression
(Expr=> Expr,
-Typ => Etype (Def_Id),
+Typ => Typ,
 Related_Nod => Orig_N);
   end if;
 




[Ada] Fix comment related to analysis of expression functions

2021-07-09 Thread Pierre-Marie de Rodat
Cleanup related to handling of static expression functions in GNATprove.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): Fix comment.

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -536,7 +536,7 @@ package body Sem_Ch6 is
 Typ   : constant Entity_Id := Etype (Def_Id);
 
  begin
---  If this is a wrapper created for in an instance for a formal
+--  If this is a wrapper created in an instance for a formal
 --  subprogram, insert body after declaration, to be analyzed when
 --  the enclosing instance is analyzed.
 




[Ada] Avoid repeated calls in analysis of expression functions

2021-07-09 Thread Pierre-Marie de Rodat
Code cleanup related to handling of static expression functions in
GNATprove; behaviour is unaffected.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): Use Orig_N variable
instead of repeated calls to Original_Node.

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -561,7 +561,7 @@ package body Sem_Ch6 is
   Push_Scope (Def_Id);
   Install_Formals (Def_Id);
   Preanalyze_Formal_Expression (Expr, Typ);
-  Check_Limited_Return (Original_Node (N), Expr, Typ);
+  Check_Limited_Return (Orig_N, Expr, Typ);
   End_Scope;
end if;
 
@@ -630,7 +630,7 @@ package body Sem_Ch6 is
  Check_Dynamically_Tagged_Expression
(Expr=> Expr,
 Typ => Etype (Def_Id),
-Related_Nod => Original_Node (N));
+Related_Nod => Orig_N);
   end if;
 
   --  We must enforce checks for unreferenced formals in our newly




[Ada] Refine types of local variables in analysis of expression functions

2021-07-09 Thread Pierre-Marie de Rodat
Code cleanup related to handling of static expression functions in
GNATprove; behaviour is unaffected.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): Change types of local
variables from Entity_Id to Node_Id.

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -640,9 +640,9 @@ package body Sem_Ch6 is
   if Present (Parameter_Specifications (New_Spec)) then
  declare
 Form_New_Def  : Entity_Id;
-Form_New_Spec : Entity_Id;
+Form_New_Spec : Node_Id;
 Form_Old_Def  : Entity_Id;
-Form_Old_Spec : Entity_Id;
+Form_Old_Spec : Node_Id;
 
  begin
 Form_New_Spec := First (Parameter_Specifications (New_Spec));




[Ada] Remove an unnecessary local constant

2021-07-09 Thread Pierre-Marie de Rodat
Code cleanup related to preanalysis in GNATprove mode; behaviour is
unaffected.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): A local Expr
constant was shadowing a global constant with the same name and
the same value.

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -532,7 +532,6 @@ package body Sem_Ch6 is
 
  declare
 Decls : List_Id:= List_Containing (N);
-Expr  : constant Node_Id   := Expression (Ret);
 Par   : constant Node_Id   := Parent (Decls);
 Typ   : constant Entity_Id := Etype (Def_Id);
 




[Ada] Avoid unnecessary call in preanalysis without freezing

2021-07-09 Thread Pierre-Marie de Rodat
Cleanup related to preanalysis in GNATprove mode; behaviour is
unaffected.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_res.adb (Preanalyze_And_Resolve): Only call
Set_Must_Not_Freeze when it is necessary to restore the previous
value.

diff --git a/gcc/ada/sem_res.adb b/gcc/ada/sem_res.adb
--- a/gcc/ada/sem_res.adb
+++ b/gcc/ada/sem_res.adb
@@ -1886,9 +1886,9 @@ package body Sem_Res is
 
   Expander_Mode_Restore;
   Full_Analysis := Save_Full_Analysis;
-  Set_Must_Not_Freeze (N, Save_Must_Not_Freeze);
 
   if not With_Freezing then
+ Set_Must_Not_Freeze (N, Save_Must_Not_Freeze);
  Inside_Preanalysis_Without_Freezing :=
Inside_Preanalysis_Without_Freezing - 1;
   end if;
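
The save/restore shape that this hunk tightens can be sketched in C. This is a rough model, not the GNAT code: struct node, preanalyze and the counter are hypothetical stand-ins, and the sketch assumes (the hunk does not show it) that the flag is only ever set on entry when freezing is disabled, which is what makes the unconditional restore unnecessary.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the node flag and the global nesting
   counter used by the real code.  */
struct node
{
  bool must_not_freeze;
};

static int inside_preanalysis_without_freezing;

static void
preanalyze (struct node *n, bool with_freezing)
{
  bool save_must_not_freeze = n->must_not_freeze;

  /* Assumption: the flag is only modified on this path.  */
  if (!with_freezing)
    {
      n->must_not_freeze = true;
      inside_preanalysis_without_freezing++;
    }

  /* ... analysis and resolution of N would happen here ...  */

  /* Restore only what was changed above.  */
  if (!with_freezing)
    {
      n->must_not_freeze = save_must_not_freeze;
      inside_preanalysis_without_freezing--;
    }
}
```

When with_freezing is true the function never touches the flag, so there is nothing to restore on that path.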




Re: [PATCH] aarch64: Use unions for vector tables in vqtbl[234] intrinsics

2021-07-09 Thread Richard Biener via Gcc-patches
On Fri, Jul 9, 2021 at 1:54 PM Richard Sandiford via Gcc-patches
 wrote:
>
> Kyrylo Tkachov  writes:
> >> -Original Message-
> >> From: Richard Sandiford 
> >> Sent: 09 July 2021 12:40
> >> To: Jonathan Wright 
> >> Cc: gcc-patches@gcc.gnu.org; Kyrylo Tkachov 
> >> Subject: Re: [PATCH] aarch64: Use unions for vector tables in vqtbl[234]
> >> intrinsics
> >>
> >> Jonathan Wright  writes:
> >> > Hi,
> >> >
> >> > As subject, this patch uses a union instead of constructing a new opaque
> >> > vector structure for each of the vqtbl[234] Neon intrinsics in 
> >> > arm_neon.h.
> >> > This simplifies the header file and also improves code generation -
> >> > superfluous move instructions were emitted for every register
> >> > extraction/set in this additional structure.
> >> >
> >> > This change is safe because the C-level vector structure types e.g.
> >> > uint8x16x4_t already provide a tie for sequential register allocation
> >> > - which is required by the TBL instructions.
> >> >
> >> > Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> >> > issues.
> >> >
> >> > Ok for master?
> >>
> >> Looks good, but I think we should have some tests to defend the
> >> RA improvements.  E.g. have things like:
> >>
> >>   #include <arm_neon.h>
> >>
> >>   …
> >>
> >>   uint8x8_t
> >>   f2_u8 (uint8x16x2_t x, uint8x8_t y)
> >>   {
> >>     return vqtbl2_u8 (x, y);
> >>   }
> >>
> >>   …
> >>
> >> and add a scan-assembler-not for moves.
> >>
> >> Union punning is UB for standard C++, but I think in practice we're
> >> not going to be able to treat it as such for GCC.  This would be
> >> far from the only thing to rely on union punning for correctness.
> >
> > Could we use some reinterpret_cast or bit_cast (or the builtins they rely 
> > on) for C++?
> > This may involve separate definitions for C and C++, which may not be worth 
> > it... No objections to the patch from me to be clear.
>
> The C++-correct way would be to do something like:
>
> __extension__ extern __inline uint8x8_t
> __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> vqtbl2_u8 (uint8x16x2_t __tab, uint8x8_t __idx)
> {
>   __builtin_aarch64_simd_oi __o;
>   __builtin_memcpy (&__o, &__tab, sizeof (__tab));
>   return (uint8x8_t)__builtin_aarch64_tbl3v8qi (__o, (int8x8_t)__idx);
> }
>
> which does seem to produce the same code for the simple case above.
> I've no idea how well it would work out in real code though.
> Having to take the address of something is a bit unfortunate,
> but maybe we fold that away early enough for it not to matter.

We should fold it very early (during gimplification or gimple lowering)
if the target is not strict-alignment or the copied memory is aligned,
which I think it is here since objects rather than pointers are involved.

> So the options would be:
>
> (1) keep it as-is
> (2) keep it as-is for C and add the memcpy version for C++
> (3) use the memcpy version for C and C++
>
> (3) is obviously better than (2) if we don't see any performance penalty.
>
> Thanks,
> Richard


Re: [PATCH 10/10] vect: Reuse reduction accumulators between loops

2021-07-09 Thread Richard Biener via Gcc-patches
On Thu, Jul 8, 2021 at 2:50 PM Richard Sandiford via Gcc-patches
 wrote:
>
> This patch adds support for reusing a main loop's reduction accumulator
> in an epilogue loop.  This in turn lets the loops share a single piece
> of vector->scalar reduction code.
>
> The patch has the following restrictions:
>
> (1) The epilogue reduction can only operate on a single vector
> (e.g. ncopies must be 1 for non-SLP reductions, and the group size
> must be <= the element count for SLP reductions).
>
> (2) Both loops must use the same vector mode for their accumulators.
> This means that the patch is restricted to targets that support
> --param vect-partial-vector-usage=1.
>
> (3) The reduction must be a standard “tree code” reduction.
>
> However, these restrictions could be lifted in future.  For example,
> if the main loop operates on 128-bit vectors and the epilogue loop
> operates on 64-bit vectors, we could in future reduce the 128-bit
> vector by one stage and use the 64-bit result as the starting point
> for the epilogue result.

Yeah, I hope that can be done quickly - it should make the
approach usable on x86_64.
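
For readers following along, the idea of sharing one accumulator between the main loop and the epilogue can be modelled in scalar C. This is only a sketch under assumptions: LANES and sum_with_shared_accumulator are made-up names, the array acc stands in for a single vector register, and the epilogue models a partial-vector loop in which inactive lanes are simply not updated. It is not the vectorizer's implementation.

```c
#include <stddef.h>

#define LANES 4

/* Scalar model of a vectorized sum.  ACC stands in for one vector
   accumulator that both loops update, so a single vector->scalar
   reduction at the end serves the main loop and the epilogue.  */
static int
sum_with_shared_accumulator (const int *a, size_t n)
{
  int acc[LANES] = { 0 };
  size_t i = 0;

  /* Main loop: full groups of LANES elements.  */
  for (; i + LANES <= n; i += LANES)
    for (size_t l = 0; l < LANES; l++)
      acc[l] += a[i + l];

  /* Epilogue: one partial group, accumulating into the same ACC.
     Lanes past the end of the array are left untouched.  */
  for (size_t l = 0; i + l < n; l++)
    acc[l] += a[i + l];

  /* The only reduction from "vector" to scalar.  */
  int result = 0;
  for (size_t l = 0; l < LANES; l++)
    result += acc[l];
  return result;
}
```

Without the reuse, the epilogue would start from a fresh accumulator and each loop would need its own final reduction before the two scalar results were combined.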

> The patch tries to handle chained SLP reductions, unchained SLP
> reductions and non-SLP reductions.  It also handles cases in which
> the epilogue loop is entered directly (rather than via the main loop)
> and cases in which the epilogue loop can be skipped.
>
> vect_get_main_loop_result is a bit more general than the current
> patch needs.

I didn't see anything that would adjust the costing of the vectorization
(though I don't specifically remember how we cost vectorized epilogues
in general).

A few comments / questions inline below - I think the patch is OK
as-is though.

Thanks,
Richard.

> gcc/
> * tree-vectorizer.h (vect_reusable_accumulator): New structure.
> (_loop_vec_info::main_loop_edge): New field.
> (_loop_vec_info::skip_main_loop_edge): Likewise.
> (_loop_vec_info::skip_this_loop_edge): Likewise.
> (_loop_vec_info::reusable_accumulators): Likewise.
> (_stmt_vec_info::reduc_scalar_results): Likewise.
> (_stmt_vec_info::reused_accumulator): Likewise.
> (vect_get_main_loop_result): Declare.
> * tree-vectorizer.c (vec_info::new_stmt_vec_info): Initialize
> reduc_scalar_inputs.
> (vec_info::free_stmt_vec_info): Free reduc_scalar_inputs.
> * tree-vect-loop-manip.c (vect_get_main_loop_result): New function.
> (vect_do_peeling): Fill an epilogue loop's main_loop_edge,
> skip_main_loop_edge and skip_this_loop_edge fields.
> * tree-vect-loop.c (INCLUDE_ALGORITHM): Define.
> (vect_emit_reduction_init_stmts): New function.
> (get_initial_def_for_reduction): Use it.
> (get_initial_defs_for_reduction): Likewise.  Change the vinfo
> parameter to a loop_vec_info.
> (vect_create_epilog_for_reduction): Store the scalar results
> in the reduc_info.  If an epilogue loop is reusing an accumulator
> from the main loop, and if the epilogue loop can also be skipped,
> try to place the reduction code in the join block.  Record
> accumulators that could potentially be reused by epilogue loops.
> (vect_transform_cycle_phi): When vectorizing epilogue loops,
> try to reuse accumulators from the main loop.  Record the initial
> value in reduc_info for non-SLP reductions too.
>
> gcc/testsuite/
> * gcc.target/aarch64/sve/reduc_9.c: New test.
> * gcc.target/aarch64/sve/reduc_9_run.c: Likewise.
> * gcc.target/aarch64/sve/reduc_10.c: Likewise.
> * gcc.target/aarch64/sve/reduc_10_run.c: Likewise.
> * gcc.target/aarch64/sve/reduc_11.c: Likewise.
> * gcc.target/aarch64/sve/reduc_11_run.c: Likewise.
> * gcc.target/aarch64/sve/reduc_12.c: Likewise.
> * gcc.target/aarch64/sve/reduc_12_run.c: Likewise.
> * gcc.target/aarch64/sve/reduc_13.c: Likewise.
> * gcc.target/aarch64/sve/reduc_13_run.c: Likewise.
> * gcc.target/aarch64/sve/reduc_14.c: Likewise.
> * gcc.target/aarch64/sve/reduc_14_run.c: Likewise.
> * gcc.target/aarch64/sve/reduc_15.c: Likewise.
> * gcc.target/aarch64/sve/reduc_15_run.c: Likewise.
> ---
>  .../gcc.target/aarch64/sve/reduc_10.c |  77 +
>  .../gcc.target/aarch64/sve/reduc_10_run.c |  49 +++
>  .../gcc.target/aarch64/sve/reduc_11.c |  71 
>  .../gcc.target/aarch64/sve/reduc_11_run.c |  34 ++
>  .../gcc.target/aarch64/sve/reduc_12.c |  71 
>  .../gcc.target/aarch64/sve/reduc_12_run.c |  66 
>  .../gcc.target/aarch64/sve/reduc_13.c | 101 ++
>  .../gcc.target/aarch64/sve/reduc_13_run.c |  61 
>  .../gcc.target/aarch64/sve/reduc_14.c | 107 ++
>  .../gcc.target/aarch64/sve/reduc_14_run.c | 187 +++
>  .../gcc.target/aarch64/sve/reduc_15.c |  16 +
>  

Re: [PATCH] aarch64: Use unions for vector tables in vqtbl[234] intrinsics

2021-07-09 Thread Richard Sandiford via Gcc-patches
Kyrylo Tkachov  writes:
>> -Original Message-
>> From: Richard Sandiford 
>> Sent: 09 July 2021 12:40
>> To: Jonathan Wright 
>> Cc: gcc-patches@gcc.gnu.org; Kyrylo Tkachov 
>> Subject: Re: [PATCH] aarch64: Use unions for vector tables in vqtbl[234]
>> intrinsics
>> 
>> Jonathan Wright  writes:
>> > Hi,
>> >
>> > As subject, this patch uses a union instead of constructing a new opaque
>> > vector structure for each of the vqtbl[234] Neon intrinsics in arm_neon.h.
>> > This simplifies the header file and also improves code generation -
>> > superfluous move instructions were emitted for every register
>> > extraction/set in this additional structure.
>> >
>> > This change is safe because the C-level vector structure types e.g.
>> > uint8x16x4_t already provide a tie for sequential register allocation
>> > - which is required by the TBL instructions.
>> >
>> > Regression tested and bootstrapped on aarch64-none-linux-gnu - no
>> > issues.
>> >
>> > Ok for master?
>> 
>> Looks good, but I think we should have some tests to defend the
>> RA improvements.  E.g. have things like:
>> 
>>   #include <arm_neon.h>
>> 
>>   …
>> 
>>   uint8x8_t
>>   f2_u8 (uint8x16x2_t x, uint8x8_t y)
>>   {
>>     return vqtbl2_u8 (x, y);
>>   }
>> 
>>   …
>> 
>> and add a scan-assembler-not for moves.
>> 
>> Union punning is UB for standard C++, but I think in practice we're
>> not going to be able to treat it as such for GCC.  This would be
>> far from the only thing to rely on union punning for correctness.
>
> Could we use some reinterpret_cast or bit_cast (or the builtins they rely on) 
> for C++?
> This may involve separate definitions for C and C++, which may not be worth 
> it... No objections to the patch from me to be clear.

The C++-correct way would be to do something like:

__extension__ extern __inline uint8x8_t
__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
vqtbl2_u8 (uint8x16x2_t __tab, uint8x8_t __idx)
{
  __builtin_aarch64_simd_oi __o;
  __builtin_memcpy (&__o, &__tab, sizeof (__tab));
  return (uint8x8_t)__builtin_aarch64_tbl3v8qi (__o, (int8x8_t)__idx);
}

which does seem to produce the same code for the simple case above.
I've no idea how well it would work out in real code though.
Having to take the address of something is a bit unfortunate,
but maybe we fold that away early enough for it not to matter.
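
The difference between the two approaches can be shown in portable C without the aarch64 builtins. This is a sketch only: pair_t and opaque_t are hypothetical stand-ins for a C-level table type such as uint8x16x2_t and for the builtin's opaque type, and the function names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the two same-sized types involved.  */
typedef struct { uint8_t val[2][16]; } pair_t;
typedef struct { uint8_t bytes[32]; } opaque_t;

_Static_assert (sizeof (pair_t) == sizeof (opaque_t),
                "both views must have the same size");

/* Union punning: write one member, read the other.  C reinterprets
   the bytes and GCC supports it, but standard C++ treats the read
   as undefined behaviour.  */
static opaque_t
pun_with_union (pair_t tab)
{
  union { pair_t s; opaque_t o; } u;
  u.s = tab;
  return u.o;
}

/* memcpy: copies the object representation, valid in C and C++.  */
static opaque_t
pun_with_memcpy (pair_t tab)
{
  opaque_t o;
  memcpy (&o, &tab, sizeof tab);
  return o;
}
```

Both functions return the same bytes; whether the memcpy form also produces the same code in real use is the open question raised above.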

So the options would be:

(1) keep it as-is
(2) keep it as-is for C and add the memcpy version for C++
(3) use the memcpy version for C and C++

(3) is obviously better than (2) if we don't see any performance penalty.

Thanks,
Richard


RE: [PATCH] aarch64: Use unions for vector tables in vqtbl[234] intrinsics

2021-07-09 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Richard Sandiford 
> Sent: 09 July 2021 12:40
> To: Jonathan Wright 
> Cc: gcc-patches@gcc.gnu.org; Kyrylo Tkachov 
> Subject: Re: [PATCH] aarch64: Use unions for vector tables in vqtbl[234]
> intrinsics
> 
> Jonathan Wright  writes:
> > Hi,
> >
> > As subject, this patch uses a union instead of constructing a new opaque
> > vector structure for each of the vqtbl[234] Neon intrinsics in arm_neon.h.
> > This simplifies the header file and also improves code generation -
> > superfluous move instructions were emitted for every register
> > extraction/set in this additional structure.
> >
> > This change is safe because the C-level vector structure types e.g.
> > uint8x16x4_t already provide a tie for sequential register allocation
> > - which is required by the TBL instructions.
> >
> > Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> > issues.
> >
> > Ok for master?
> 
> Looks good, but I think we should have some tests to defend the
> RA improvements.  E.g. have things like:
> 
>   #include <arm_neon.h>
> 
>   …
> 
>   uint8x8_t
>   f2_u8 (uint8x16x2_t x, uint8x8_t y)
>   {
>     return vqtbl2_u8 (x, y);
>   }
> 
>   …
> 
> and add a scan-assembler-not for moves.
> 
> Union punning is UB for standard C++, but I think in practice we're
> not going to be able to treat it as such for GCC.  This would be
> far from the only thing to rely on union punning for correctness.

Could we use some reinterpret_cast or bit_cast (or the builtins they rely on) 
for C++?
This may involve separate definitions for C and C++, which may not be worth 
it... No objections to the patch from me to be clear.
Thanks,
Kyrill

> 
> Thanks,
> Richard


Re: [PATCH] aarch64: Use unions for vector tables in vqtbl[234] intrinsics

2021-07-09 Thread Richard Sandiford via Gcc-patches
Jonathan Wright  writes:
> Hi,
>
> As subject, this patch uses a union instead of constructing a new opaque
> vector structure for each of the vqtbl[234] Neon intrinsics in arm_neon.h.
> This simplifies the header file and also improves code generation -
> superfluous move instructions were emitted for every register
> extraction/set in this additional structure.
>
> This change is safe because the C-level vector structure types e.g.
> uint8x16x4_t already provide a tie for sequential register allocation
> - which is required by the TBL instructions.
>
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.
>
> Ok for master?

Looks good, but I think we should have some tests to defend the
RA improvements.  E.g. have things like:

  #include <arm_neon.h>

  …

  uint8x8_t
  f2_u8 (uint8x16x2_t x, uint8x8_t y)
  {
    return vqtbl2_u8 (x, y);
  }

  …

and add a scan-assembler-not for moves.

Union punning is UB for standard C++, but I think in practice we're
not going to be able to treat it as such for GCC.  This would be
far from the only thing to rely on union punning for correctness.

Thanks,
Richard


Re: [PATCH 06/10] vect: Pass reduc_info to get_initial_defs_for_reduction

2021-07-09 Thread Richard Biener via Gcc-patches
On Thu, Jul 8, 2021 at 6:48 PM Richard Sandiford
 wrote:
>
> Richard Biener  writes:
> > On Thu, Jul 8, 2021 at 2:46 PM Richard Sandiford via Gcc-patches
> >  wrote:
> >>
> >> This patch passes the reduc_info to get_initial_defs_for_reduction,
> >> so that the function can get general information from there rather
> >> than from the first SLP statement.  This isn't a win on its own,
> >> but it becomes important with later patches.
> >
> > So the original code should have used SLP_TREE_REPRESENTATIVE
> > instead of SLP_TREE_SCALAR_STMTS ()[0] (there might have been
> > issues with doing that - my recollection is weak here).
> >
> > I'm not sure if reduc_info is actually better - only the representative
> > will have STMT_VINFO_VECTYPE set, for the reduc_info
> > there's STMT_VINFO_REDUC_VECTYPE (and STMT_VINFO_REDUC_VECTYPE_IN).
> >
> > So I think if you want to use reduc_info then you want to use
> > STMT_VINFO_REDUC_VECTYPE?
>
> I guess I'm a bit fuzzy on the details, but AIUI STMT_VINFO_REDUC_VECTYPE
> is the type that we do the arithmetic in, which might be different from
> the types of the phis.  Is that right?

Hmm, yeah (my recollection is fuzzy as well here...).

> In this context we want the types of the phis, since the routine is
> providing the initial values.  Using STMT_VINFO_REDUC_VECTYPE gives
> things like:

OK, I see.  So there's the reduc_info vs. SLP_TREE_REPRESENTATIVE issue
left.  At least I don't see that we reliably set STMT_VINFO_VECTYPE on
all scalar PHIs of a SLP reduction.  The reduc_info happens to be one of the
PHI stmt_infos (but that's an implementation detail as well).

The reduction SLP instance has the reduc_phis member to get at the
PHIs vector type (via SLP_TREE_VECTYPE).  I think we don't have
anything explicit that's good here but I notice that
vect_create_epilog_for_reduction
uses STMT_VINFO_VECTYPE (reduc_info) as well.

So I guess the patch is OK as-is.

Thanks,
Richard.


> ---
> gcc.dg/torture/pr92345.c:8:1: error: incompatible types in 'PHI' argument 1
> vector(4) int
>
> vector(4) unsigned int
>
> vect_fr_lsm.11_58 = PHI 
> ---
>
> Thanks,
> Richard
>
> >
> >> gcc/
> >> * tree-vect-loop.c (get_initial_defs_for_reduction): Take the
> >> reduc_info as an additional parameter.
> >> (vect_transform_cycle_phi): Update accordingly.
> >> ---
> >>  gcc/tree-vect-loop.c | 23 ++-
> >>  1 file changed, 10 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> >> index a31d7621c3b..565c2859477 100644
> >> --- a/gcc/tree-vect-loop.c
> >> +++ b/gcc/tree-vect-loop.c
> >> @@ -4764,32 +4764,28 @@ get_initial_def_for_reduction (loop_vec_info 
> >> loop_vinfo,
> >>return init_def;
> >>  }
> >>
> >> -/* Get at the initial defs for the reduction PHIs in SLP_NODE.
> >> -   NUMBER_OF_VECTORS is the number of vector defs to create.
> >> -   If NEUTRAL_OP is nonnull, introducing extra elements of that
> >> -   value will not change the result.  */
> >> +/* Get at the initial defs for the reduction PHIs for REDUC_INFO, whose
> >> +   associated SLP node is SLP_NODE.  NUMBER_OF_VECTORS is the number of 
> >> vector
> >> +   defs to create.  If NEUTRAL_OP is nonnull, introducing extra elements 
> >> of
> >> +   that value will not change the result.  */
> >>
> >>  static void
> >>  get_initial_defs_for_reduction (vec_info *vinfo,
> >> +   stmt_vec_info reduc_info,
> >> slp_tree slp_node,
> >> vec *vec_oprnds,
> >> unsigned int number_of_vectors,
> >> bool reduc_chain, tree neutral_op)
> >>  {
> >>vec stmts = SLP_TREE_SCALAR_STMTS (slp_node);
> >> -  stmt_vec_info stmt_vinfo = stmts[0];
> >>unsigned HOST_WIDE_INT nunits;
> >>unsigned j, number_of_places_left_in_vector;
> >> -  tree vector_type;
> >> +  tree vector_type = STMT_VINFO_VECTYPE (reduc_info);
> >>unsigned int group_size = stmts.length ();
> >>unsigned int i;
> >>class loop *loop;
> >>
> >> -  vector_type = STMT_VINFO_VECTYPE (stmt_vinfo);
> >> -
> >> -  gcc_assert (STMT_VINFO_DEF_TYPE (stmt_vinfo) == vect_reduction_def);
> >> -
> >> -  loop = (gimple_bb (stmt_vinfo->stmt))->loop_father;
> >> +  loop = (gimple_bb (reduc_info->stmt))->loop_father;
> >>gcc_assert (loop);
> >>edge pe = loop_preheader_edge (loop);
> >>
> >> @@ -4823,7 +4819,7 @@ get_initial_defs_for_reduction (vec_info *vinfo,
> >>  {
> >>tree op;
> >>i = j % group_size;
> >> -  stmt_vinfo = stmts[i];
> >> +  stmt_vec_info stmt_vinfo = stmts[i];
> >>
> >>/* Get the def before the loop.  In reduction chain we have only
> >>  one initial value.  Else we have as many as PHIs in the group.  */
> >> @@ -7510,7 +7506,8 

[WIP, OpenMP] OpenMP metadirectives support

2021-07-09 Thread Kwok Cheung Yeung

Hello

This is a WIP implementation of metadirectives as defined in the OpenMP 5.0 
spec. I intend to add support for metadirectives as specified in OpenMP 5.1 
later (where the directive can be selected dynamically at runtime), but am 
concentrating on the static part for now. Parsing has only been implemented in 
the C frontend so far. I am especially interested in feedback regarding certain 
aspects of the implementation before I become too committed to the current design.


1) When parsing each directive variant, a vector of tokens is constructed and 
populated with the tokens for a regular equivalent pragma, along with the tokens 
for its clauses and the body. The parser routine for that pragma type is then 
called with these tokens, and the entire resulting parse tree is stored as a 
sub-tree of the metadirective tree structure.


This results in the body being parsed and stored once for each directive 
variant. I believe this is necessary because the body is parsed differently if 
there is a 'for' in the directive (using c_parser_omp_for_loop) compared to if 
there is not, plus clauses in the directive (e.g. tile, collapse) can change how 
the for loop is parsed.


As an optimisation, identical body trees could be merged together, but that can 
come later.


2) Selectors in the device set (i.e. kind, isa, arch) resolve differently 
depending on whether the program is running on a target or on the host. Since we 
don't keep multiple versions of a function for each target on the host compiler, 
resolving metadirectives with these selectors needs to be delayed until after 
LTO streaming, at which point the host or offload compiler can make the 
appropriate decision.


One negative of this is that the metadirective Gimple representation lasts 
beyond the OMP expand stage, when generally we would expect all OMP directives 
to have been expanded to something else.


3) In the OpenMP examples (version 5.0.1), section 9.7, the example 
metadirective.3.c does not work as expected.


#pragma omp declare target
void exp_pi_diff(double *d, double my_pi){
   #pragma omp metadirective \
   when( construct={target}: distribute parallel for ) \
   default( parallel for simd)
...
int main()
{
   ...
   #pragma omp target teams map(tofrom: d[0:N])
   exp_pi_diff(d,my_pi);
   ...
   exp_pi_diff(d,my_pi);

In the first call to exp_pi_diff in a '#pragma omp target' construct, the 
metadirective is expected to expand to 'distribute parallel for', but in the 
second (without the '#pragma omp target'), it should expand to 'parallel for simd'.


During OMP expansion of the 'omp target', it creates a child function that calls 
exp_pi_diff:


__attribute__((omp target entrypoint))
void main._omp_fn.0 (const struct .omp_data_t.12 & restrict .omp_data_i)
{
  ...
   :
  __builtin_GOMP_teams (0, 0);
  exp_pi_diff (d.13, my_pi);

This is not a problem on the offload compiler (since by definition its copy of 
exp_pi_diff must be in a 'target'), but if the host device is used, the same 
version of exp_pi_diff is called in both target and non-target contexts.


What would be the best way to solve this? Offhand, I can think of two solutions:

(a) Recursively go through all functions that can be reached via a target region 
and create clones for each, redirecting all function calls in the clones to the 
new cloned versions. Resolve the metadirectives in the clones and originals 
separately.


(b) Make the construct selector a dynamic selector when OpenMP 5.1 metadirective 
support is implemented. Keep track of the current construct list every time an 
OpenMP construct is entered or exited, and make the decision at runtime.
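
To make option (b) a bit more concrete, here is a minimal sketch of the 
bookkeeping it would need. It is only an illustration and not part of the patch: 
enter_construct, exit_construct, in_construct and select_variant are 
hypothetical names, and a real runtime would need per-thread state rather than 
one global stack.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical runtime record of the enclosing OpenMP constructs.  */
#define MAX_DEPTH 16
static const char *construct_stack[MAX_DEPTH];
static int construct_depth;

/* Called when a construct such as "target" or "teams" is entered.  */
static void
enter_construct (const char *name)
{
  assert (construct_depth < MAX_DEPTH);
  construct_stack[construct_depth++] = name;
}

/* Called when the innermost construct is left.  */
static void
exit_construct (void)
{
  assert (construct_depth > 0);
  construct_depth--;
}

/* Return nonzero if NAME is among the currently active constructs.  */
static int
in_construct (const char *name)
{
  for (int i = 0; i < construct_depth; i++)
    if (strcmp (construct_stack[i], name) == 0)
      return 1;
  return 0;
}

/* Pick the variant for the metadirective in exp_pi_diff at runtime.  */
static const char *
select_variant (void)
{
  return in_construct ("target") ? "distribute parallel for"
				 : "parallel for simd";
}
```

With this, the call made from inside the 'target teams' region would select 
'distribute parallel for' and the plain call would select 'parallel for simd', 
at the cost of an enter/exit call around every construct.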



Thanks

Kwok
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 1164554e6d6..28e29fab93d 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1505,6 +1505,7 @@ OBJS = \
omp-general.o \
omp-low.o \
omp-oacc-kernels-decompose.o \
+omp-expand-metadirective.o \
omp-simd-clone.o \
opt-problem.o \
optabs.o \
diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
index 4f8e8e0128c..01dc1e6d9c0 100644
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -1312,12 +1312,14 @@ static const struct omp_pragma_def omp_pragmas[] = {
   { "allocate", PRAGMA_OMP_ALLOCATE },
   { "atomic", PRAGMA_OMP_ATOMIC },
   { "barrier", PRAGMA_OMP_BARRIER },
+  { "begin", PRAGMA_OMP_BEGIN },
   { "cancel", PRAGMA_OMP_CANCEL },
   { "cancellation", PRAGMA_OMP_CANCELLATION_POINT },
   { "critical", PRAGMA_OMP_CRITICAL },
   { "depobj", PRAGMA_OMP_DEPOBJ },
-  { "end", PRAGMA_OMP_END_DECLARE_TARGET },
+  { "end", PRAGMA_OMP_END },
   { "flush", PRAGMA_OMP_FLUSH },
+  { "metadirective", PRAGMA_OMP_METADIRECTIVE },
   { "requires", PRAGMA_OMP_REQUIRES },
   { "section", PRAGMA_OMP_SECTION },
   { "sections", PRAGMA_OMP_SECTIONS },
@@ -1387,6 +1389,41 @@ c_pp_lookup_pragma (unsigned int id, const char **space, 
const char **name)
   

Re: [PATCH] testsuite: mips: use noinline attribute instead of -fno-inline

2021-07-09 Thread Richard Sandiford via Gcc-patches
Xi Ruoyao  writes:
> On Thu, 2021-07-08 at 17:44 -0600, Jeff Law wrote:
>> 
>> 
>> On 6/25/2021 8:40 AM, Richard Sandiford wrote:
>> > Xi Ruoyao via Gcc-patches  writes:
>> > > On Fri, 2021-06-25 at 01:02 +0800, Xi Ruoyao wrote:
>> > > > On Thu, 2021-06-24 at 10:48 -0600, Jeff Law wrote:
>> > > > > I'd like to know a bit more here.  mips.exp shouldn't care
>> > > > > about the
>> > > > > options passed to the compiler and to the best of my knowledge
>> > > > > patch itself is wrong, I question if it's necessary and
>> > > > > whether or
>> > > > > not
>> > > > > you're just papering over some other issue.
>> > > > There is some logic processing options in mips.exp.  Some
>> > > > options are
>> > > > overridden for multilib.  It seems the mips.exp was originally
>> > > > designed
>> > > > as:
>> > > > 
>> > > > * MIPS options should go in dg-options
>> > > > * Other options should go in dg-additional-options
>> > > > 
>> > > > In d2148424165 marxin merged some dg-additional-options into dg-
>> > > > options,
>> > > > exploited the problem.
>> > > > 
>> > > > And, the "origin" convention seems already broken: there is
>> > > > something
>> > > > like -funroll-loops which is not a MIPS option, but accepted by
>> > > > mips.exp
>> > > > in dg-options.
>> > > > 
>> > > > Possibilities are:
>> > > > 
>> > > > (1) this patch
>> > > > (2) make mips.exp accept -fno-inline as "if it is a MIPS option"
>> > > > (3) refactor mips.exp to pass everything itself doesn't know
>> > > > directly
>> > > > to gcc
>> > > Attached a diff for mips.exp trying to make it pass everything in
>> > > dg-
>> > > options which is not known by itself directly to the compiler.
>> > > 
>> > > The "smallest fix" is simply adding -fno-inline into mips.exp. 
>> > > However
>> > > I don't like it because I agree with you that mips.exp shouldn't
>> > > care
>> > > about dg-options, at least don't do it too much.
>> > As I said in the other message, I think the smallest fix is the way
>> > to
>> > go though.
>> Thanks for chiming in, Richard.  I didn't know all the background
>> here.   
>> Let's just go with the small fix based on your recommendation.  We can
>> always revisit if we keep running into issues in this code.
>
> Pushed at 3b33b113.

It looks like that was the originally posted patch though.  It probably
wasn't very clear, but by smallest fix, I meant adding inline to:

# Add -ffoo/-fno-foo options to mips_option_groups.
foreach option {
common
delayed-branch
expensive-optimizations
fast-math
fat-lto-objects
finite-math-only
fixed-hi
fixed-lo
lax-vector-conversions
omit-frame-pointer
optimize-sibling-calls
peephole2
schedule-insns2
split-wide-types
tree-vectorize
unroll-all-loops
unroll-loops
ipa-ra
} {
…
}

It seems inconsistent to remove -fno-inline from the dg-options
but keep -fipa-ra, for example.

Thanks,
Richard


[PATCH] aarch64: Use unions for vector tables in vqtbl[234] intrinsics

2021-07-09 Thread Jonathan Wright via Gcc-patches
Hi,

As subject, this patch uses a union instead of constructing a new opaque
vector structure for each of the vqtbl[234] Neon intrinsics in arm_neon.h.
This simplifies the header file and also improves code generation -
superfluous move instructions were emitted for every register
extraction/set in this additional structure.

This change is safe because the C-level vector structure types e.g.
uint8x16x4_t already provide a tie for sequential register allocation
- which is required by the TBL instructions.

Regression tested and bootstrapped on aarch64-none-linux-gnu - no
issues.

Ok for master?

Thanks,
Jonathan

---

gcc/ChangeLog:

2021-07-08  Jonathan Wright  

* config/aarch64/arm_neon.h (vqtbl2_s8): Use union instead of
additional __builtin_aarch64_simd_oi structure.
(vqtbl2_u8): Likewise.
(vqtbl2_p8): Likewise.
(vqtbl2q_s8): Likewise.
(vqtbl2q_u8): Likewise.
(vqtbl2q_p8): Likewise.
(vqtbl3_s8): Use union instead of additional
__builtin_aarch64_simd_ci structure.
(vqtbl3_u8): Likewise.
(vqtbl3_p8): Likewise.
(vqtbl3q_s8): Likewise.
(vqtbl3q_u8): Likewise.
(vqtbl3q_p8): Likewise.
(vqtbl4_s8): Use union instead of additional
__builtin_aarch64_simd_xi structure.
(vqtbl4_u8): Likewise.
(vqtbl4_p8): Likewise.
(vqtbl4q_s8): Likewise.
(vqtbl4q_u8): Likewise.
(vqtbl4q_p8): Likewise.


rb14639.patch
Description: rb14639.patch


Re: [PATCH 2/2] [PHIOPT/MATCH] Remove the statement to move if not used

2021-07-09 Thread Richard Biener via Gcc-patches
On Fri, Jul 9, 2021 at 9:16 AM Andrew Pinski  wrote:
>
> On Thu, Jul 8, 2021 at 11:50 PM Richard Biener via Gcc-patches
>  wrote:
> >
> > On Fri, Jul 9, 2021 at 7:34 AM apinski--- via Gcc-patches
> >  wrote:
> > >
> > > From: Andrew Pinski 
> > >
> > > Instead of waiting for DCE to remove the unused statement,
> > > and maybe optimize another conditional, it is better if
> > > we don't move the statement and have the statement
> > > removed.
> > >
> > > gcc/ChangeLog:
> > >
> > > * tree-ssa-phiopt.c (used_in_seq): New function.
> > > (match_simplify_replacement): Don't move the statement
> > > if not used in sequence.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.dg/tree-ssa/pr96928-1.c: Update to similar as pr96928.c.
> > > ---
> > >  gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c |  5 -
> > >  gcc/tree-ssa-phiopt.c | 24 ++-
> > >  2 files changed, 27 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c 
> > > b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
> > > index 2e86620da11..9e505ac9900 100644
> > > --- a/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
> > > @@ -2,7 +2,10 @@
> > >  /* { dg-do compile } */
> > >  /* { dg-options "-O2 -fdump-tree-phiopt2 -fdump-tree-optimized" } */
> > >  /* { dg-final { scan-tree-dump-times " = a_\[0-9]*\\\(D\\\) >> " 5 
> > > "phiopt2" } } */
> > > -/* { dg-final { scan-tree-dump-times " = ~c_\[0-9]*\\\(D\\\);" 1 
> > > "phiopt2" } } */
> > > +/* The following check is done at optimized because a ^ (~b) is 
> > > rewritten as ~(a^b)
> > > +   and in the case of match.pd optimizing these ?:, the ~ is moved out 
> > > already
> > > +   by the time we get to phiopt2. */
> > > +/* { dg-final { scan-tree-dump-times "c_\[0-9]*\\\(D\\\) \\\^" 1 
> > > "optimized" } } */
> > >  /* { dg-final { scan-tree-dump-times " = ~" 1 "optimized" } } */
> > >  /* { dg-final { scan-tree-dump-times " = \[abc_0-9\\\(\\\)D]* \\\^ " 5 
> > > "phiopt2" } } */
> > >  /* { dg-final { scan-tree-dump-not "a < 0" "phiopt2" } } */
> > > diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
> > > index 7a98b7afdf1..a237df02153 100644
> > > --- a/gcc/tree-ssa-phiopt.c
> > > +++ b/gcc/tree-ssa-phiopt.c
> > > @@ -934,6 +934,26 @@ gimple_simplify_phiopt (bool early_p, tree type, 
> > > gimple *comp_stmt,
> > >return NULL;
> > >  }
> > >
> > > +/* Return true if the lhs of STMT is used in the SEQ sequence
> > > +   of statements.  */
> > > +static bool
> > > +used_in_seq (gimple *stmt, gimple_seq seq)
> > > +{
> > > +  tree lhs = gimple_assign_lhs (stmt);
> > > > +  for (auto gsi = gsi_start (seq); !gsi_end_p (gsi); gsi_next_nondebug 
> > > > (&gsi))
> > > +{
> > > +  use_operand_p use_p;
> > > +  ssa_op_iter iter;
> > > +  gimple *stmt1 = gsi_stmt (gsi);
> > > +  FOR_EACH_SSA_USE_OPERAND (use_p, stmt1, iter, SSA_OP_USE)
> > > +   {
> > > + if (USE_FROM_PTR (use_p) == lhs)
> > > +   return true;
> > > +   }
> > > +}
> > > +return false;
> > > +}
> > > +
> > >  /*  The function match_simplify_replacement does the main work of doing 
> > > the
> > >  replacement using match and simplify.  Return true if the 
> > > replacement is done.
> > >  Otherwise return false.
> > > @@ -1020,7 +1040,9 @@ match_simplify_replacement (basic_block cond_bb, 
> > > basic_block middle_bb,
> > >  return false;
> > >
> > >gsi = gsi_last_bb (cond_bb);
> > > -  if (stmt_to_move)
> > > +  if (stmt_to_move
> > > +  && (gimple_assign_lhs (stmt_to_move) == result
> > > +  || used_in_seq (stmt_to_move, seq)))
> >
> > Err, why not insert 'seq' before moving the stmt (you'd have to fiddle
> > with the iterator,
> > using GSI_CONTINUE_LINKING I think) and then check has_zero_uses on
> > the (hopefully) only
> > def of the stmt to move?
>
> Because stmt_to_move was used in the phi and if we move
> replace_phi_edge_with_variable before we move the statement, the
> statement has been removed permanently as the basic block holding it
> has been deleted.
>
> What about this order instead:
> remove stmt_to_move (not permanently)
> call replace_phi_edge_with_variable
> insert seq
> if !zero_uses
>insert stmt_to_move before seq
> else
>   release defs for stmt_to_move

Hmm, so if the stmt was used by the PHI then why not check
for has_single_use after inserting the sequence (but before
removing the PHI)?

Richard.

> Thanks,
> Andrew Pinski
>
> >
> > Richard.
> >
> > >  {
> > >if (dump_file && (dump_flags & TDF_DETAILS))
> > > {
> > > --
> > > 2.27.0
> > >


[PATCH] i386: Fix *udivmodsi4_pow2_zext_? patterns

2021-07-09 Thread Uros Bizjak via Gcc-patches
In addition to the obvious cut-n-pasto where *udivmodsi4_pow2_zext_2
never matches, limit the range of the immediate operand to prevent
out of range immediate operand of AND instruction.

Found by inspection, the patterns rarely match (if at all), since
tree optimizers do the transformation before RTL is generated. But
according to the comment above *udivmod<mode>4_pow2, the constant can
materialize after expansion, so leave these patterns around for now.
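
For readers less familiar with the transformation, the following is a small 
stand-alone sketch of what these patterns compute: for a divisor that is 2^n, 
the quotient is a right shift by n and the remainder is an AND with 2^n - 1. 
The helper names (exact_log2_sketch, pow2_divisor_ok, udivmod_pow2) are 
made up for illustration and are not GCC functions; the [1,31] check simply 
mirrors the new insn condition.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for an exact log2: return n if X == 2^n,
   otherwise -1.  */
static int
exact_log2_sketch (uint64_t x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  int n = 0;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Mirror of the new insn condition: accept only shift counts 1..31.  */
static int
pow2_divisor_ok (uint64_t divisor)
{
  int n = exact_log2_sketch (divisor);
  return n >= 1 && n <= 31;
}

/* 32-bit unsigned divide and modulo by a power of two, done with a
   shift and a mask instead of a division.  */
static void
udivmod_pow2 (uint32_t x, uint32_t divisor, uint32_t *quot, uint32_t *rem)
{
  assert (pow2_divisor_ok (divisor));
  int n = exact_log2_sketch (divisor);
  *quot = x >> n;
  *rem = x & (divisor - 1);
}
```

For example, udivmod_pow2 (1000, 16, &q, &r) yields q == 62 and r == 8, 
while a divisor of 1 or of 2^32 is rejected by the range check.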

2021-07-09  Uroš Bizjak  

gcc/
* config/i386/i386.md (*udivmodsi4_pow2_zext_1): Limit the
log2 range of operands[3] to [1,31].
(*udivmodsi4_pow2_zext_2): Ditto.  Correct insn RTX pattern.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 156c6a94989..26fb81b9b4b 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -8518,7 +8518,7 @@ (define_insn_and_split "*udivmodsi4_pow2_zext_1"
(umod:SI (match_dup 2) (match_dup 3)))
(clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT
-   && exact_log2 (UINTVAL (operands[3])) > 0"
+   && IN_RANGE (exact_log2 (UINTVAL (operands[3])), 1, 31)"
   "#"
   "&& reload_completed"
   [(set (match_dup 1) (match_dup 2))
@@ -8599,10 +8599,10 @@ (define_insn_and_split "*udivmodsi4_pow2_zext_2"
  (umod:SI (match_operand:SI 2 "register_operand" "0")
   (match_operand:SI 3 "const_int_operand" "n"
(set (match_operand:SI 0 "register_operand" "=r")
-   (umod:SI (match_dup 2) (match_dup 3)))
+   (udiv:SI (match_dup 2) (match_dup 3)))
(clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT
-   && exact_log2 (UINTVAL (operands[3])) > 0"
+   && IN_RANGE (exact_log2 (UINTVAL (operands[3])), 1, 31)"
   "#"
   "&& reload_completed"
   [(set (match_dup 1) (match_dup 2))


Re: Fix PR target/101377

2021-07-09 Thread Richard Biener via Gcc-patches
On Fri, Jul 9, 2021 at 9:41 AM Eric Botcazou  wrote:
>
> Hi,
>
> this is the build failure on Windows with binutils for which GNU as accepts
> the --gdwarf-5 switch but GNU ld generates broken binaries with DWARF 5.
>
> We already have the HAVE_LD_BROKEN_PE_DWARF5 kludge to disable DWARF 5 in this
> case but it only tames the DWARF version in the compiler, so the driver still
> passes --gdwarf-5 when invoked on an assembly file with -g.
>
> The attached patch is a minimal fix to plug the hole, and I don't think that
> anything more sophisticated is worth the hassle since 2.37 supports DWARF 5,
> i.e. HAVE_AS_GDWARF_5_DEBUG_FLAG and HAVE_AS_WORKING_DWARF_N_FLAG are defined
> and HAVE_LD_BROKEN_PE_DWARF5 is not with it.
>
> Tested on x86-64/Linux and x86[-64]/Windows, OK for mainline and 11 branch?

OK.

Thanks,
Richard.

>
> 2021-07-07  Eric Botcazou  
>
> PR target/101377
> * gcc.c (ASM_DEBUG_DWARF_OPTION): Set again to --gdwarf2
> if HAVE_AS_WORKING_DWARF_N_FLAG is not defined
> and HAVE_LD_BROKEN_PE_DWARF5 is defined.
>
> --
> Eric Botcazou


[PATCH] driver/101383 - handle -gtoggle in driver

2021-07-09 Thread Richard Biener
The driver amends assembler options with for example --gdwarf-5
when debugging is enabled but the check for that does not consider
the effect of -gtoggle which is not handled in the common option
machinery.  The following alters debug_info_level according to
-gtoggle mimicking what process_options later does in the compiler.
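
As a stand-alone illustration of the intended behaviour (a sketch only, 
using made-up names rather than the driver's own variables), the toggle and 
the resulting decision on whether to pass a debug flag to the assembler can 
be modelled like this:

```c
#include <assert.h>

/* Hypothetical mirror of the debug levels; GCC's real enumerators are
   DINFO_LEVEL_NONE, DINFO_LEVEL_NORMAL and so on.  */
enum debug_level_sketch
{
  LEVEL_NONE,
  LEVEL_TERSE,
  LEVEL_NORMAL,
  LEVEL_VERBOSE
};

/* Apply -gtoggle: no debug info becomes normal debug info, and any
   enabled level becomes none.  */
static enum debug_level_sketch
apply_gtoggle (enum debug_level_sketch level, int gtoggle)
{
  assert (level >= LEVEL_NONE && level <= LEVEL_VERBOSE);
  if (!gtoggle)
    return level;
  return level == LEVEL_NONE ? LEVEL_NORMAL : LEVEL_NONE;
}

/* The assembler should only get a --gdwarf-N style flag when the
   effective level, after the toggle, is above none.  */
static int
wants_asm_debug_flag (enum debug_level_sketch level, int gtoggle)
{
  return apply_gtoggle (level, gtoggle) != LEVEL_NONE;
}
```

Under this model, plain -gtoggle enables the flag and -g -gtoggle disables 
it.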

In particular, this avoids the cc1-checksum changing with every
bootstrap (debug) cycle.  We compute the checksum from stage2, where we
use -g -gtoggle; with --gdwarf-5 passed but no debug info from the
compiler, the assembler fills the line table with the temporary
assembler file names.

Bootstrap and regtest pending on x86_64-unknown-linux-gnu, OK?

I've so far verified that the patch has the desired effect on passing
--gdwarf-5 to the assembler for gcc -c t.c, gcc -c t.c -gtoggle,
gcc -c t.c -g -gtoggle and gcc -c t.c -g.

Thanks,
Richard.

2021-07-09  Richard Biener  

PR driver/101383
* gcc.c (process_command): Process -gtoggle like process_options
would after parsing options.
---
 gcc/gcc.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/gcc.c b/gcc/gcc.c
index 36a88fc99b0..6c7a4847c43 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -4924,6 +4924,16 @@ process_command (unsigned int decoded_options_count,
 #endif
 }
 
+  /* Handle -gtoggle as it would later in toplev.c:process_options to
+ make the debug-level-gt spec function work as expected.  */
+  if (flag_gtoggle)
+{
+  if (debug_info_level == DINFO_LEVEL_NONE)
+   debug_info_level = DINFO_LEVEL_NORMAL;
+  else
+   debug_info_level = DINFO_LEVEL_NONE;
+}
+
   if (output_file
   && strcmp (output_file, "-") != 0
   && strcmp (output_file, HOST_BIT_BUCKET) != 0)
-- 
2.26.2


Re: Ping: [PATCH] Darwini,X86: Adjust call clobbers to allow for lazy-binding [PR100152].

2021-07-09 Thread Iain Sandoe



> On 9 Jul 2021, at 09:57, Uros Bizjak  wrote:
> 
> On Fri, Jul 9, 2021 at 10:25 AM Iain Sandoe  wrote:
>> 
>> (early) ping;
>> if possible I’d like to get this onto master in time to back-port for 11.2.
>> 
>>> On 4 Jul 2021, at 21:08, Iain Sandoe  wrote:
>>> 
>>> Hi,
>>> 
>>> (I’m not going to defend the status quo here, it seems a bit prone
>>> to confusing a user [different interposition behaviour between the
>>> inlined and non-inlined cases] however, this is what the platform
>>> compilers implement).
>>> 
>>> 
>>> 
>>> We allow public functions defined in a TU to bind locally for PIC
>>> code (the default) on 64bit Mach-O.
>>> 
>>> If such functions are not inlined, we cannot tell at compile-time if
>>> they might be called via the lazy symbol resolver (this can depend on
>>> options given at link-time).  Therefore, we must assume that the lazy
>>> resolver could be used which clobbers R11 and R10.
>>> 
>>> The solution here is similar in form to the one used for veneer regs
>>> on Arm (but I’m open to alternate suggestions).
>>> 
>>> tested on X86_64-darwin, linux
>>> OK for master?
>>> Iain
>>> 
>>> Signed-off-by: Iain Sandoe 
>>> 
>>> PR target/100152 - [10/11/12 Regression] used caller-saved register not 
>>> preserved across a call.
>>> 
>>>   PR target/100152
>>> 
>>> gcc/ChangeLog:
>>> 
>>>  * config/i386/i386-expand.c (ix86_expand_call): If a call is
>>>  to a non-local-binding, or local but to a public symbol, then
>>>  assume that it might be indirected via the lazy symbol binder.
>>>  Mark R11 and R10 as clobbered in that case.
> 
> LGTM, but this is Darwin specific patch, so you could approve it yourself.

thanks, maybe I should have made it “RFC” (I was wondering if there was
any better way to do this).  Anyway, I’ll get it in and baking on master.

thanks again,
Iain



Re: Ping: [PATCH] Darwini,X86: Adjust call clobbers to allow for lazy-binding [PR100152].

2021-07-09 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 9, 2021 at 10:25 AM Iain Sandoe  wrote:
>
> (early) ping;
> if possible I’d like to get this onto master in time to back-port for 11.2.
>
> > On 4 Jul 2021, at 21:08, Iain Sandoe  wrote:
> >
> > Hi,
> >
> > (I’m not going to defend the status quo here, it seems a bit prone
> > to confusing a user [different interposition behaviour between the
> > inlined and non-inlined cases] however, this is what the platform
> > compilers implement).
> >
> > 
> >
> > We allow public functions defined in a TU to bind locally for PIC
> > code (the default) on 64bit Mach-O.
> >
> > If such functions are not inlined, we cannot tell at compile-time if
> > they might be called via the lazy symbol resolver (this can depend on
> > options given at link-time).  Therefore, we must assume that the lazy
> > resolver could be used which clobbers R11 and R10.
> >
> > The solution here is similar in form to the one used for veneer regs
> > on Arm (but I’m open to alternate suggestions).
> >
> > tested on X86_64-darwin, linux
> > OK for master?
> > Iain
> >
> > Signed-off-by: Iain Sandoe 
> >
> > PR target/100152 - [10/11/12 Regression] used caller-saved register not 
> > preserved across a call.
> >
> >PR target/100152
> >
> > gcc/ChangeLog:
> >
> >   * config/i386/i386-expand.c (ix86_expand_call): If a call is
> >   to a non-local-binding, or local but to a public symbol, then
> >   assume that it might be indirected via the lazy symbol binder.
> >   Mark R11 and R10 as clobbered in that case.

LGTM, but this is Darwin specific patch, so you could approve it yourself.

Uros.

> > ---
> > gcc/config/i386/i386-expand.c | 16 +++-
> > 1 file changed, 15 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> > index b37642e35ee..1b860e027b0 100644
> > --- a/gcc/config/i386/i386-expand.c
> > +++ b/gcc/config/i386/i386-expand.c
> > @@ -8380,6 +8380,7 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx 
> > callarg1,
> > pop = NULL;
> >   gcc_assert (!TARGET_64BIT || !pop);
> >
> > +  rtx addr = XEXP (fnaddr, 0);
> >   if (TARGET_MACHO && !TARGET_64BIT)
> > {
> > #if TARGET_MACHO
> > @@ -8392,7 +8393,6 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx 
> > callarg1,
> >   /* Static functions and indirect calls don't need the pic register.  
> > Also,
> >check if PLT was explicitly avoided via no-plt or "noplt" attribute, 
> > making
> >it an indirect call.  */
> > -  rtx addr = XEXP (fnaddr, 0);
> >   if (flag_pic
> > && GET_CODE (addr) == SYMBOL_REF
> > && !SYMBOL_REF_LOCAL_P (addr))
> > @@ -8555,6 +8555,20 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx 
> > callarg1,
> >   }
> > }
> >
> > +  if (TARGET_MACHO && TARGET_64BIT && !sibcall
> > +  && ((GET_CODE (addr) == SYMBOL_REF && !SYMBOL_REF_LOCAL_P (addr))
> > +   || !fndecl || TREE_PUBLIC (fndecl)))
> > +{
> > +  /* We allow public functions defined in a TU to bind locally for PIC
> > +  code (the default) on 64bit Mach-O.
> > +  If such functions are not inlined, we cannot tell at compile-time if
> > +  they will be called via the lazy symbol resolver (this can depend on
> > +  options given at link-time).  Therefore, we must assume that the lazy
> > +  resolver could be used which clobbers R11 and R10.  */
> > +  clobber_reg (, gen_rtx_REG (DImode, R11_REG));
> > +  clobber_reg (, gen_rtx_REG (DImode, R10_REG));
> > +}
> > +
> >   if (vec_len > 1)
> > call = gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (vec_len, vec));
> >   rtx_insn *call_insn = emit_call_insn (call);
> > --
> > 2.24.1
> >
>


Ping: [PATCH] Darwini,X86: Adjust call clobbers to allow for lazy-binding [PR100152].

2021-07-09 Thread Iain Sandoe
(early) ping;
if possible I’d like to get this onto master in time to back-port for 11.2.

> On 4 Jul 2021, at 21:08, Iain Sandoe  wrote:
> 
> Hi,
> 
> (I’m not going to defend the status quo here, it seems a bit prone
> to confusing a user [different interposition behaviour between the
> inlined and non-inlined cases] however, this is what the platform
> compilers implement).
> 
> 
> 
> We allow public functions defined in a TU to bind locally for PIC
> code (the default) on 64bit Mach-O.
> 
> If such functions are not inlined, we cannot tell at compile-time if
> they might be called via the lazy symbol resolver (this can depend on
> options given at link-time).  Therefore, we must assume that the lazy
> resolver could be used which clobbers R11 and R10.
> 
> The solution here is similar in form to the one used for veneer regs
> on Arm (but I’m open to alternate suggestions).
> 
> tested on X86_64-darwin, linux
> OK for master?
> Iain
> 
> Signed-off-by: Iain Sandoe 
> 
> PR target/100152 - [10/11/12 Regression] used caller-saved register not 
> preserved across a call.
> 
>PR target/100152
> 
> gcc/ChangeLog:
> 
>   * config/i386/i386-expand.c (ix86_expand_call): If a call is
>   to a non-local-binding, or local but to a public symbol, then
>   assume that it might be indirected via the lazy symbol binder.
>   Mark R11 and R10 as clobbered in that case.
> ---
> gcc/config/i386/i386-expand.c | 16 +++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> index b37642e35ee..1b860e027b0 100644
> --- a/gcc/config/i386/i386-expand.c
> +++ b/gcc/config/i386/i386-expand.c
> @@ -8380,6 +8380,7 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
> pop = NULL;
>   gcc_assert (!TARGET_64BIT || !pop);
> 
> +  rtx addr = XEXP (fnaddr, 0);
>   if (TARGET_MACHO && !TARGET_64BIT)
> {
> #if TARGET_MACHO
> @@ -8392,7 +8393,6 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
>   /* Static functions and indirect calls don't need the pic register.  
> Also,
>check if PLT was explicitly avoided via no-plt or "noplt" attribute, 
> making
>it an indirect call.  */
> -  rtx addr = XEXP (fnaddr, 0);
>   if (flag_pic
> && GET_CODE (addr) == SYMBOL_REF
> && !SYMBOL_REF_LOCAL_P (addr))
> @@ -8555,6 +8555,20 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
>   }
> }
> 
> +  if (TARGET_MACHO && TARGET_64BIT && !sibcall
> +  && ((GET_CODE (addr) == SYMBOL_REF && !SYMBOL_REF_LOCAL_P (addr))
> +   || !fndecl || TREE_PUBLIC (fndecl)))
> +{
> +  /* We allow public functions defined in a TU to bind locally for PIC
> +  code (the default) on 64bit Mach-O.
> +  If such functions are not inlined, we cannot tell at compile-time if
> +  they will be called via the lazy symbol resolver (this can depend on
> +  options given at link-time).  Therefore, we must assume that the lazy
> +  resolver could be used which clobbers R11 and R10.  */
> +  clobber_reg (, gen_rtx_REG (DImode, R11_REG));
> +  clobber_reg (, gen_rtx_REG (DImode, R10_REG));
> +}
> +
>   if (vec_len > 1)
> call = gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (vec_len, vec));
>   rtx_insn *call_insn = emit_call_insn (call);
> -- 
> 2.24.1
> 



Re: [r12-2132 Regression] FAIL: g++.dg/warn/Warray-bounds-20.C -std=gnu++98 note (test for warnings, line 55) on Linux/x86_64

2021-07-09 Thread Maxim Kuvyrkov via Gcc-patches
> On 9 Jul 2021, at 02:35, sunil.k.pandey via Gcc-patches 
>  wrote:
> 
> On Linux/x86_64,
> 
> a110855667782dac7b674d3e328b253b3b3c919b is the first bad commit
> commit a110855667782dac7b674d3e328b253b3b3c919b
> Author: Martin Sebor 
> Date:   Wed Jul 7 14:05:25 2021 -0600
> 
>Correct handling of variable offset minus constant in -Warray-bounds 
> [PR100137]
> 
> caused

Hi Martin,

I see these failing on aarch64-linux-gnu as well:

> 
> FAIL: gcc.dg/Wstringop-overflow-47.c pr97027 (test for warnings, line 34)
> FAIL: gcc.dg/Wstringop-overflow-47.c pr97027 (test for warnings, line 37)
> FAIL: gcc.dg/Wstringop-overflow-47.c pr97027 (test for warnings, line 42)


FWIW, I don’t see these on aarch64-linux-gnu:

> FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++14 note (test for warnings, 
> line 38)
> FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++14 note (test for warnings, 
> line 55)
> FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++17 note (test for warnings, 
> line 38)
> FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++17 note (test for warnings, 
> line 55)
> FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++2a note (test for warnings, 
> line 38)
> FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++2a note (test for warnings, 
> line 55)
> FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++98 note (test for warnings, 
> line 38)
> FAIL: g++.dg/warn/Warray-bounds-20.C  -std=gnu++98 note (test for warnings, 
> line 55)


--
Maxim Kuvyrkov
https://www.linaro.org


> 
> with GCC configured with
> 
> ../../gcc/configure 
> --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-2132/usr
>  --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
> --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
> --enable-libmpx x86_64-linux --disable-bootstrap
> 
> To reproduce:
> 
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="dg.exp=gcc.dg/Wstringop-overflow-47.c 
> --target_board='unix{-m32\ -march=cascadelake}'"
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="dg.exp=gcc.dg/Wstringop-overflow-47.c 
> --target_board='unix{-m64\ -march=cascadelake}'"
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="dg.exp=g++.dg/warn/Warray-bounds-20.C 
> --target_board='unix{-m32}'"
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="dg.exp=g++.dg/warn/Warray-bounds-20.C 
> --target_board='unix{-m32\ -march=cascadelake}'"
> 
> (Please do not reply to this email, for question about this report, contact 
> me at skpgkp2 at gmail dot com)



Fix PR target/101377

2021-07-09 Thread Eric Botcazou
Hi,

this is the build failure on Windows with binutils for which GNU as accepts 
the --gdwarf-5 switch but GNU ld generates broken binaries with DWARF 5.

We already have the HAVE_LD_BROKEN_PE_DWARF5 kludge to disable DWARF 5 in this 
case but it only tames the DWARF version in the compiler, so the driver still 
passes --gdwarf-5 when invoked on an assembly file with -g.

The attached patch is a minimal fix to plug the hole, and I don't think that 
anything more sophisticated is worth the hassle since 2.37 supports DWARF 5,
i.e. HAVE_AS_GDWARF_5_DEBUG_FLAG and HAVE_AS_WORKING_DWARF_N_FLAG are defined 
and HAVE_LD_BROKEN_PE_DWARF5 is not with it.

Tested on x86-64/Linux and x86[-64]/Windows, OK for mainline and 11 branch?


2021-07-07  Eric Botcazou  

PR target/101377
* gcc.c (ASM_DEBUG_DWARF_OPTION): Set again to --gdwarf2
if HAVE_AS_WORKING_DWARF_N_FLAG is not defined
and HAVE_LD_BROKEN_PE_DWARF5 is defined.

-- 
Eric Botcazou
diff --git a/gcc/gcc.c b/gcc/gcc.c
index 7837553958b..7c75d1314fa 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -910,7 +910,7 @@ proper position among the other output files.  */
than in ASM_DEBUG_SPEC, so that it applies to both .s and .c etc.
compilations.  */
 #  define ASM_DEBUG_DWARF_OPTION ""
-# elif defined(HAVE_AS_GDWARF_5_DEBUG_FLAG)
+# elif defined(HAVE_AS_GDWARF_5_DEBUG_FLAG) && !defined(HAVE_LD_BROKEN_PE_DWARF5)
 #  define ASM_DEBUG_DWARF_OPTION "%{%:dwarf-version-gt(4):--gdwarf-5;" \
 	"%:dwarf-version-gt(3):--gdwarf-4;"\
 	"%:dwarf-version-gt(2):--gdwarf-3;"\


Re: [PATCH 2/2] [PHIOPT/MATCH] Remove the statement to move if not used

2021-07-09 Thread Andrew Pinski via Gcc-patches
On Thu, Jul 8, 2021 at 11:50 PM Richard Biener via Gcc-patches
 wrote:
>
> On Fri, Jul 9, 2021 at 7:34 AM apinski--- via Gcc-patches
>  wrote:
> >
> > From: Andrew Pinski 
> >
> > Instead of waiting for DCE to remove the unused statement,
> > and maybe optimize another conditional, it is better if
> > we don't move the statement and have the statement
> > removed.
> >
> > gcc/ChangeLog:
> >
> > * tree-ssa-phiopt.c (used_in_seq): New function.
> > (match_simplify_replacement): Don't move the statement
> > if not used in sequence.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/tree-ssa/pr96928-1.c: Update to similar as pr96928.c.
> > ---
> >  gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c |  5 -
> >  gcc/tree-ssa-phiopt.c | 24 ++-
> >  2 files changed, 27 insertions(+), 2 deletions(-)
> >
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c 
> > b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
> > index 2e86620da11..9e505ac9900 100644
> > --- a/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
> > @@ -2,7 +2,10 @@
> >  /* { dg-do compile } */
> >  /* { dg-options "-O2 -fdump-tree-phiopt2 -fdump-tree-optimized" } */
> >  /* { dg-final { scan-tree-dump-times " = a_\[0-9]*\\\(D\\\) >> " 5 
> > "phiopt2" } } */
> > -/* { dg-final { scan-tree-dump-times " = ~c_\[0-9]*\\\(D\\\);" 1 "phiopt2" 
> > } } */
> > +/* The following check is done at optimized because a ^ (~b) is rewritten 
> > as ~(a^b)
> > +   and in the case of match.pd optimizing these ?:, the ~ is moved out 
> > already
> > +   by the time we get to phiopt2. */
> > +/* { dg-final { scan-tree-dump-times "c_\[0-9]*\\\(D\\\) \\\^" 1 
> > "optimized" } } */
> >  /* { dg-final { scan-tree-dump-times " = ~" 1 "optimized" } } */
> >  /* { dg-final { scan-tree-dump-times " = \[abc_0-9\\\(\\\)D]* \\\^ " 5 
> > "phiopt2" } } */
> >  /* { dg-final { scan-tree-dump-not "a < 0" "phiopt2" } } */
> > diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
> > index 7a98b7afdf1..a237df02153 100644
> > --- a/gcc/tree-ssa-phiopt.c
> > +++ b/gcc/tree-ssa-phiopt.c
> > @@ -934,6 +934,26 @@ gimple_simplify_phiopt (bool early_p, tree type, 
> > gimple *comp_stmt,
> >return NULL;
> >  }
> >
> > +/* Return true if the lhs of STMT is used in the SEQ sequence
> > +   of statements.  */
> > +static bool
> > +used_in_seq (gimple *stmt, gimple_seq seq)
> > +{
> > +  tree lhs = gimple_assign_lhs (stmt);
> > > +  for (auto gsi = gsi_start (seq); !gsi_end_p (gsi); gsi_next_nondebug (&gsi))
> > +{
> > +  use_operand_p use_p;
> > +  ssa_op_iter iter;
> > +  gimple *stmt1 = gsi_stmt (gsi);
> > +  FOR_EACH_SSA_USE_OPERAND (use_p, stmt1, iter, SSA_OP_USE)
> > +   {
> > + if (USE_FROM_PTR (use_p) == lhs)
> > +   return true;
> > +   }
> > +}
> > +return false;
> > +}
> > +
> >  /*  The function match_simplify_replacement does the main work of doing the
> >  replacement using match and simplify.  Return true if the replacement 
> > is done.
> >  Otherwise return false.
> > @@ -1020,7 +1040,9 @@ match_simplify_replacement (basic_block cond_bb, 
> > basic_block middle_bb,
> >  return false;
> >
> >gsi = gsi_last_bb (cond_bb);
> > -  if (stmt_to_move)
> > +  if (stmt_to_move
> > +  && (gimple_assign_lhs (stmt_to_move) == result
> > +  || used_in_seq (stmt_to_move, seq)))
>
> Err, why not insert 'seq' before moving the stmt (you'd have to fiddle
> with the iterator, using GSI_CONTINUE_LINKING I think) and then check
> has_zero_uses on the (hopefully) only def of the stmt to move?

Because stmt_to_move is used in the PHI, and if we call
replace_phi_edge_with_variable before we move the statement, the
statement is removed permanently, as the basic block holding it
has been deleted.

What about this order instead:
remove stmt_to_move (not permanently)
call replace_phi_edge_with_variable
insert seq
if !zero_uses
  insert stmt_to_move before seq
else
  release defs for stmt_to_move

Thanks,
Andrew Pinski

>
> Richard.
>
> >  {
> >if (dump_file && (dump_flags & TDF_DETAILS))
> > {
> > --
> > 2.27.0
> >


Re: [PATCH 1/2] CALL_INSN may not be a real function call.

2021-07-09 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 8, 2021 at 7:56 AM Segher Boessenkool
 wrote:
>
> On Wed, Jul 07, 2021 at 11:32:59PM +0800, Hongtao Liu wrote:
> > On Wed, Jul 7, 2021 at 10:54 PM Segher Boessenkool
> >  wrote:
> > > So, a "FAKE_CALL" is very much a *real* call, on the RTL level, which is
> > > where we are here.  But you want it to be treated differently because it
> > > will eventually be replaced by different insns.
> > It's a CALL_INSN on the RTL level, but it's just a normal instruction:
> > it doesn't have a call stack, and it doesn't affect the control
> > flow.
>
> There is no such thing as "call stack" (whatever that may mean) to do
> with the RTL "call" insn.  How the return address is stored (if at all)
> is up to the target.  Many do not store the return address on the stack
> (for example they have an RA or LR register for it).  Those that do
> store it on a stack do not all change the stack pointer.
>
> In RTL, it *does* change the control flow.  If you don't like that,
> don't use a "call" insn.  You will have to update a *lot* more code
> than you did, otherwise.
>
> > > So because of this one thing (you need to insert partial clobbers) you
> > > force all kinds of unrelated code to have changes, namely, code that
> > > needs to do something with calls, but now you do not want to have that
> > > done on some calls because you promise that call will disappear
> > > eventually, and it cannot cause any problems in the mean time?
> > >
> > > I am not convinced.  This is not design, this is a terrible hack, this
> > > is the opposite direction we should go in.
> >
> > Quote from  https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570634.html
> >
> > > Also, I grepped for CALL_P and CALL_INSN in the GCC sources; there are
> > > many places that assume CALL_P/CALL_INSN is a real call.
> > > Considering that vzeroupper is used a lot on the i386 backend, I'm a
> > > bit worried that this implementation solution will be a bottomless
> > > pit.
> >
> > Maybe, but I think the same is true for CLOBBER_HIGH.  If we have
> > a third alternative then we should consider it, but I think the
> > call approach is still going to be less problematic than CLOBBER_HIGH.
> >
> > The main advantage of the call approach is that the CALL_P handling
> > is (mostly) conservatively correct and performance problems are just
> > a one-line change.  The CLOBBER_HIGH approach instead requires
> > changes to the way that passes track liveness information for
> > non-call instructions (so is much more than a one-line change).
> > Also, treating a CLOBBER_HIGH like a CLOBBER isn't conservatively
> > correct, because other code might be relying on part of the register
> > being preserved.
>
> And this isn't a one-line change either, and it is only partial already,
> and we don't know how deep the rabbit hole goes.
Maybe, and if there is existing infrastructure that solves the vzeroupper
issue, I'm OK with changing my patch.
>
>
> Segher



-- 
BR,
Hongtao


Re: [PATCH 1/2] Improve early simplify and match for phiopt

2021-07-09 Thread Richard Biener via Gcc-patches
On Fri, Jul 9, 2021 at 7:35 AM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> Previously the idea was gimple_simplify_phiopt would call
> resimplify with a NULL sequence but that sometimes fails
> even if there was only one statement produced. The cases
> where it fails are those where two simplifications happen.
> In the case of the min/max production, the first simplification
> produces:
> (convert (min @1 @2))
> And then the convert is removed by a second one.

Yep, this can happen ... (it can get even "worse" so that
dead stmts end up in the seq, but well, we can deal with that
when we run into it)

> The min statement
> will be in the sequence while the op will be an SSA name. This was
> rejected before, as nothing could be produced in the sequence.
> So this patch changes the way resimplify is called to always pass
> a pointer to the sequence and then decide based on whether op is an
> SSA_NAME or not.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu.

OK.

Thanks,
Richard.

> gcc/ChangeLog:
>
> * tree-ssa-phiopt.c (phiopt_early_allow): Change arguments
> to take sequence and gimple_match_op.  Accept the case where
> op is a SSA_NAME and one statement in the sequence.
> Also allow constants.
> (gimple_simplify_phiopt): Always pass a sequence to resimplify.
> Update call to phiopt_early_allow.  Discard the sequence if not
> used.
> ---
>  gcc/tree-ssa-phiopt.c | 62 ++-
>  1 file changed, 49 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
> index 8b60ee81082..7a98b7afdf1 100644
> --- a/gcc/tree-ssa-phiopt.c
> +++ b/gcc/tree-ssa-phiopt.c
> @@ -812,11 +812,33 @@ two_value_replacement (basic_block cond_bb, basic_block 
> middle_bb,
>return true;
>  }
>
> -/* Return TRUE if CODE should be allowed during early phiopt.
> -   Currently this is to allow MIN/MAX and ABS/NEGATE.  */
> +/* Return TRUE if SEQ/OP pair should be allowed during early phiopt.
> +   Currently this is to allow MIN/MAX and ABS/NEGATE and constants.  */
>  static bool
> -phiopt_early_allow (enum tree_code code)
> +phiopt_early_allow (gimple_seq &seq, gimple_match_op &op)
>  {
> +  /* Don't allow functions. */
> +  if (!op.code.is_tree_code ())
> +return false;
> +  tree_code code = (tree_code)op.code;
> +
> +  /* For non-empty sequence, only allow one statement.  */
> +  if (!gimple_seq_empty_p (seq))
> +{
> +  /* Check to make sure op was already a SSA_NAME.  */
> +  if (code != SSA_NAME)
> +   return false;
> +  if (!gimple_seq_singleton_p (seq))
> +   return false;
> +  gimple *stmt = gimple_seq_first_stmt (seq);
> +  /* Only allow assignments.  */
> +  if (!is_gimple_assign (stmt))
> +   return false;
> +  if (gimple_assign_lhs (stmt) != op.ops[0])
> +   return false;
> +  code = gimple_assign_rhs_code (stmt);
> +}
> +
>switch (code)
>  {
>case MIN_EXPR:
> @@ -826,6 +848,11 @@ phiopt_early_allow (enum tree_code code)
>case NEGATE_EXPR:
>case SSA_NAME:
> return true;
> +  case INTEGER_CST:
> +  case REAL_CST:
> +  case VECTOR_CST:
> +  case FIXED_CST:
> +   return true;
>default:
> return false;
>  }
> @@ -844,6 +871,7 @@ gimple_simplify_phiopt (bool early_p, tree type, gimple 
> *comp_stmt,
> gimple_seq *seq)
>  {
>tree result;
> +  gimple_seq seq1 = NULL;
>enum tree_code comp_code = gimple_cond_code (comp_stmt);
>location_t loc = gimple_location (comp_stmt);
>tree cmp0 = gimple_cond_lhs (comp_stmt);
> @@ -858,18 +886,23 @@ gimple_simplify_phiopt (bool early_p, tree type, gimple 
> *comp_stmt,
>gimple_match_op op (gimple_match_cond::UNCOND,
>   COND_EXPR, type, cond, arg0, arg1);
>
> -  if (op.resimplify (early_p ? NULL : seq, follow_all_ssa_edges))
> +  if (op.resimplify (&seq1, follow_all_ssa_edges))
>  {
>/* Early we want only to allow some generated tree codes. */
>if (!early_p
> - || op.code.is_tree_code ()
> - || phiopt_early_allow ((tree_code)op.code))
> + || phiopt_early_allow (seq1, op))
> {
> - result = maybe_push_res_to_seq (&op, seq);
> + result = maybe_push_res_to_seq (&op, &seq1);
>   if (result)
> -   return result;
> +   {
> + gimple_seq_add_seq_without_update (seq, seq1);
> + return result;
> +   }
> }
>  }
> +  gimple_seq_discard (seq1);
> +  seq1 = NULL;
> +
>/* Try the inverted comparison, that is !COMP ? ARG1 : ARG0. */
>comp_code = invert_tree_comparison (comp_code, HONOR_NANS (cmp0));
>
> @@ -882,18 +915,21 @@ gimple_simplify_phiopt (bool early_p, tree type, gimple 
> *comp_stmt,
>gimple_match_op op1 (gimple_match_cond::UNCOND,
>COND_EXPR, type, cond, arg1, arg0);
>
> -  if (op1.resimplify (early_p ? 

PING^2: [PATCH] mips: Fix up mips_atomic_assign_expand_fenv [PR94780]

2021-07-09 Thread Xi Ruoyao via Gcc-patches
Ping again.

On Mon, 2021-06-28 at 21:50 +0800, Xi Ruoyao wrote:
> Ping.  CC'ing several maintainers who may be able to help review MIPS
> patches.  Sorry for the noise.
> 
> On Wed, 2021-06-23 at 11:11 +0800, Xi Ruoyao wrote:
> > Commit message shamelessly copied from 1777beb6b129 by jakub:
> > 
> > This function, because it is sometimes called even outside of function
> > bodies, uses create_tmp_var_raw rather than create_tmp_var.  But in order
> > for that to work, when first referenced, the VAR_DECLs need to appear in a
> > TARGET_EXPR so that during gimplification the var gets the right
> > DECL_CONTEXT and is added to local decls.
> > 
> > Bootstrapped & regtested on mips64el-linux-gnu.  Ok for trunk and
> > backport to 11, 10, and 9?
> > 
> > gcc/
> > 
> > * config/mips/mips.c (mips_atomic_assign_expand_fenv): Use
> >   TARGET_EXPR instead of MODIFY_EXPR.
> > ---
> >  gcc/config/mips/mips.c | 12 ++--
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
> > index 8f043399a8e..89d1be6cea6 100644
> > --- a/gcc/config/mips/mips.c
> > +++ b/gcc/config/mips/mips.c
> > @@ -22439,12 +22439,12 @@ mips_atomic_assign_expand_fenv (tree
> > *hold,
> > tree *clear, tree *update)
> >    tree get_fcsr = mips_builtin_decls[MIPS_GET_FCSR];
> >    tree set_fcsr = mips_builtin_decls[MIPS_SET_FCSR];
> >    tree get_fcsr_hold_call = build_call_expr (get_fcsr, 0);
> > -  tree hold_assign_orig = build2 (MODIFY_EXPR, MIPS_ATYPE_USI,
> > - fcsr_orig_var,
> > get_fcsr_hold_call);
> > +  tree hold_assign_orig = build4 (TARGET_EXPR, MIPS_ATYPE_USI,
> > + fcsr_orig_var, get_fcsr_hold_call,
> > NULL, NULL);
> >    tree hold_mod_val = build2 (BIT_AND_EXPR, MIPS_ATYPE_USI,
> > fcsr_orig_var,
> >   build_int_cst (MIPS_ATYPE_USI,
> > 0xf003));
> > -  tree hold_assign_mod = build2 (MODIFY_EXPR, MIPS_ATYPE_USI,
> > -    fcsr_mod_var, hold_mod_val);
> > +  tree hold_assign_mod = build4 (TARGET_EXPR, MIPS_ATYPE_USI,
> > +    fcsr_mod_var, hold_mod_val, NULL,
> > NULL);
> >    tree set_fcsr_hold_call = build_call_expr (set_fcsr, 1,
> > fcsr_mod_var);
> >    tree hold_all = build2 (COMPOUND_EXPR, MIPS_ATYPE_USI,
> >   hold_assign_orig, hold_assign_mod);
> > @@ -22454,8 +22454,8 @@ mips_atomic_assign_expand_fenv (tree *hold,
> > tree *clear, tree *update)
> >    *clear = build_call_expr (set_fcsr, 1, fcsr_mod_var);
> >  
> >    tree get_fcsr_update_call = build_call_expr (get_fcsr, 0);
> > -  *update = build2 (MODIFY_EXPR, MIPS_ATYPE_USI,
> > -   exceptions_var, get_fcsr_update_call);
> > +  *update = build4 (TARGET_EXPR, MIPS_ATYPE_USI,
> > +   exceptions_var, get_fcsr_update_call, NULL,
> > NULL);
> >    tree set_fcsr_update_call = build_call_expr (set_fcsr, 1,
> > fcsr_orig_var);
> >    *update = build2 (COMPOUND_EXPR, void_type_node, *update,
> >     set_fcsr_update_call);
> 

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University



PING^2: [PATCH] mips: add MSA vec_cmp and vec_cmpu expand pattern [PR101132]

2021-07-09 Thread Xi Ruoyao via Gcc-patches
PING again.

On Thu, 2021-07-01 at 16:11 +0800, Xi Ruoyao wrote:
> Ping.
> 
> On Mon, 2021-06-21 at 21:42 +0800, Xi Ruoyao wrote:
> > The middle-end has emitted vec_cmp and vec_cmpu since GCC 11, causing
> > an ICE on MIPS with MSA enabled.  Add the patterns to prevent it.
> > 
> > Bootstrapped and regression tested on mips64el-linux-gnu.
> > Ok for trunk?
> > 
> > gcc/
> > 
> > * config/mips/mips-protos.h (mips_expand_vec_cmp_expr):
> > Declare.
> > * config/mips/mips.c (mips_expand_vec_cmp_expr): New
> > function.
> > * config/mips/mips-msa.md (vec_cmp): New
> >   expander.
> >   (vec_cmpu): New expander.
> > ---
> >  gcc/config/mips/mips-msa.md   | 22 ++
> >  gcc/config/mips/mips-protos.h |  1 +
> >  gcc/config/mips/mips.c    | 11 +++
> >  3 files changed, 34 insertions(+)
> > 
> > diff --git a/gcc/config/mips/mips-msa.md b/gcc/config/mips/mips-
> > msa.md
> > index 3ecf2bde19f..3a67f25be56 100644
> > --- a/gcc/config/mips/mips-msa.md
> > +++ b/gcc/config/mips/mips-msa.md
> > @@ -435,6 +435,28 @@
> >    DONE;
> >  })
> >  
> > +(define_expand "vec_cmp"
> > +  [(match_operand: 0 "register_operand")
> > +   (match_operator 1 ""
> > + [(match_operand:MSA 2 "register_operand")
> > +  (match_operand:MSA 3 "register_operand")])]
> > +  "ISA_HAS_MSA"
> > +{
> > +  mips_expand_vec_cmp_expr (operands);
> > +  DONE;
> > +})
> > +
> > +(define_expand "vec_cmpu"
> > +  [(match_operand: 0 "register_operand")
> > +   (match_operator 1 ""
> > + [(match_operand:IMSA 2 "register_operand")
> > +  (match_operand:IMSA 3 "register_operand")])]
> > +  "ISA_HAS_MSA"
> > +{
> > +  mips_expand_vec_cmp_expr (operands);
> > +  DONE;
> > +})
> > +
> >  (define_insn "msa_insert_"
> >    [(set (match_operand:MSA 0 "register_operand" "=f,f")
> > (vec_merge:MSA
> > diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-
> > protos.h
> > index 2cf4ed50292..a685f7f7dd5 100644
> > --- a/gcc/config/mips/mips-protos.h
> > +++ b/gcc/config/mips/mips-protos.h
> > @@ -385,6 +385,7 @@ extern mulsidi3_gen_fn mips_mulsidi3_gen_fn
> > (enum
> > rtx_code);
> >  
> >  extern void mips_register_frame_header_opt (void);
> >  extern void mips_expand_vec_cond_expr (machine_mode, machine_mode,
> > rtx *);
> > +extern void mips_expand_vec_cmp_expr (rtx *);
> >  
> >  /* Routines implemented in mips-d.c  */
> >  extern void mips_d_target_versions (void);
> > diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
> > index 00a8eef96aa..8f043399a8e 100644
> > --- a/gcc/config/mips/mips.c
> > +++ b/gcc/config/mips/mips.c
> > @@ -22321,6 +22321,17 @@ mips_expand_msa_cmp (rtx dest, enum
> > rtx_code
> > cond, rtx op0, rtx op1)
> >  }
> >  }
> >  
> > +void
> > +mips_expand_vec_cmp_expr (rtx *operands)
> > +{
> > +  rtx cond = operands[1];
> > +  rtx op0 = operands[2];
> > +  rtx op1 = operands[3];
> > +  rtx res = operands[0];
> > +
> > +  mips_expand_msa_cmp (res, GET_CODE (cond), op0, op1);
> > +}
> > +
> >  /* Expand VEC_COND_EXPR, where:
> >     MODE is mode of the result
> >     VIMODE equivalent integer mode
> 

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University



Re: [PATCH 2/2] [PHIOPT/MATCH] Remove the statement to move if not used

2021-07-09 Thread Richard Biener via Gcc-patches
On Fri, Jul 9, 2021 at 7:34 AM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> Instead of waiting for DCE to remove the unused statement,
> and maybe optimize another conditional, it is better if
> we don't move the statement and have the statement
> removed.
>
> gcc/ChangeLog:
>
> * tree-ssa-phiopt.c (used_in_seq): New function.
> (match_simplify_replacement): Don't move the statement
> if not used in sequence.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/pr96928-1.c: Update to similar as pr96928.c.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c |  5 -
>  gcc/tree-ssa-phiopt.c | 24 ++-
>  2 files changed, 27 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
> index 2e86620da11..9e505ac9900 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
> @@ -2,7 +2,10 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -fdump-tree-phiopt2 -fdump-tree-optimized" } */
>  /* { dg-final { scan-tree-dump-times " = a_\[0-9]*\\\(D\\\) >> " 5 "phiopt2" 
> } } */
> -/* { dg-final { scan-tree-dump-times " = ~c_\[0-9]*\\\(D\\\);" 1 "phiopt2" } 
> } */
> +/* The following check is done at optimized because a ^ (~b) is rewritten as 
> ~(a^b)
> +   and in the case of match.pd optimizing these ?:, the ~ is moved out 
> already
> +   by the time we get to phiopt2. */
> +/* { dg-final { scan-tree-dump-times "c_\[0-9]*\\\(D\\\) \\\^" 1 "optimized" 
> } } */
>  /* { dg-final { scan-tree-dump-times " = ~" 1 "optimized" } } */
>  /* { dg-final { scan-tree-dump-times " = \[abc_0-9\\\(\\\)D]* \\\^ " 5 
> "phiopt2" } } */
>  /* { dg-final { scan-tree-dump-not "a < 0" "phiopt2" } } */
> diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
> index 7a98b7afdf1..a237df02153 100644
> --- a/gcc/tree-ssa-phiopt.c
> +++ b/gcc/tree-ssa-phiopt.c
> @@ -934,6 +934,26 @@ gimple_simplify_phiopt (bool early_p, tree type, gimple 
> *comp_stmt,
>return NULL;
>  }
>
> +/* Return true if the lhs of STMT is used in the SEQ sequence
> +   of statements.  */
> +static bool
> +used_in_seq (gimple *stmt, gimple_seq seq)
> +{
> +  tree lhs = gimple_assign_lhs (stmt);
> +  for (auto gsi = gsi_start (seq); !gsi_end_p (gsi); gsi_next_nondebug (&gsi))
> +{
> +  use_operand_p use_p;
> +  ssa_op_iter iter;
> +  gimple *stmt1 = gsi_stmt (gsi);
> +  FOR_EACH_SSA_USE_OPERAND (use_p, stmt1, iter, SSA_OP_USE)
> +   {
> + if (USE_FROM_PTR (use_p) == lhs)
> +   return true;
> +   }
> +}
> +return false;
> +}
> +
>  /*  The function match_simplify_replacement does the main work of doing the
>  replacement using match and simplify.  Return true if the replacement is 
> done.
>  Otherwise return false.
> @@ -1020,7 +1040,9 @@ match_simplify_replacement (basic_block cond_bb, 
> basic_block middle_bb,
>  return false;
>
>gsi = gsi_last_bb (cond_bb);
> -  if (stmt_to_move)
> +  if (stmt_to_move
> +  && (gimple_assign_lhs (stmt_to_move) == result
> +  || used_in_seq (stmt_to_move, seq)))

Err, why not insert 'seq' before moving the stmt (you'd have to fiddle
with the iterator, using GSI_CONTINUE_LINKING I think) and then check
has_zero_uses on the (hopefully) only def of the stmt to move?

Richard.

>  {
>if (dump_file && (dump_flags & TDF_DETAILS))
> {
> --
> 2.27.0
>
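
The test comment in the patch above relies on the identity
a ^ (~b) == ~(a ^ b).  A minimal standalone check in C, independent of
GCC internals; the function name xor_not_identity_holds is made up for
this sketch and does not appear in the patch:

```c
/* Hypothetical helper for illustration only; not part of the patch.
   Returns 1 when a ^ (~b) and ~(a ^ b) agree for the given operands.  */
static int
xor_not_identity_holds (unsigned int a, unsigned int b)
{
  /* XOR with ~b flips every bit of a ^ b, so the ~ can be hoisted.  */
  return (a ^ ~b) == ~(a ^ b);
}
```

Because the complement can be hoisted out of the XOR this way, it seems
reasonable that the dump scan checks for a single " = ~" only at the
optimized stage rather than in phiopt2.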


Re: PING: [PATCH] mips: check MSA support for vector modes [PR100760,PR100761,PR100762]

2021-07-09 Thread Xi Ruoyao via Gcc-patches
On Fri, 2021-07-09 at 14:01 +0800, Xi Ruoyao wrote:
> On Thu, 2021-07-08 at 17:20 -0600, Jeff Law wrote:
> > 
> > On 7/5/2021 8:04 PM, Paul Hua wrote:
> > > Looks good to me,  but I have no right to approve.
> > But your opinions are well respected :-)
> > 
> > I'll go ahead and ACK, though in general I'm stepping away from 
> > reviewing target specific work.
> 
> Thanks Paul & Jeff!
> 
> I'll edit the ChangeLog a little (PR is now required in ChangeLog)
> and commit it then.

Pushed @ 82625a42.
-- 
Xi Ruoyao 



Re: [RFA] Attach MEM_EXPR information when flushing BLKmode args to the stack - V2

2021-07-09 Thread Richard Biener via Gcc-patches
On Fri, Jul 9, 2021 at 4:39 AM Jeff Law  wrote:
>
>
>
> On 7/2/2021 10:13 AM, Jeff Law wrote:
> >
> > This is a minor missed optimization we found with our internal port.
> >
> > Given this code:
> >
> > typedef struct {short a; short b;} T;
> >
> > extern void g1();
> >
> > void f(T s)
> > {
> > if (s.a < 0)
> > g1();
> > }
> >
> >
> > "s" is passed in a register, but it's still a BLKmode object because
> > the alignment of T is smaller than the alignment that an integer of
> > the same size would have (16 bits vs 32 bits).
> >
> >
> > Because "s" is BLKmode, function.c is going to store it into a stack
> > slot and we'll load it from that slot for each reference.  So on
> > the v850 (just to pick a port that likely has the same behavior we're
> > seeing) we have this RTL from CSE2:
> >
> >
> > (insn 2 4 3 2 (set (mem/c:SI (plus:SI (reg/f:SI 34 .fp)
> > (const_int -4 [0xfffc])) [2 S4 A32])
> > (reg:SI 6 r6)) "j.c":6:1 7 {*movsi_internal}
> >  (expr_list:REG_DEAD (reg:SI 6 r6)
> > (nil)))
> > (note 3 2 8 2 NOTE_INSN_FUNCTION_BEG)
> > (insn 8 3 9 2 (set (reg:HI 44 [ s.a ])
> > (mem/c:HI (plus:SI (reg/f:SI 34 .fp)
> > (const_int -4 [0xfffc])) [1 s.a+0 S2
> > A32])) "j.c":7:5 3 {*movhi_internal}
> >  (nil))
> > (insn 9 8 10 2 (parallel [
> > (set (reg:SI 45)
> > (ashift:SI (subreg:SI (reg:HI 44 [ s.a ]) 0)
> > (const_int 16 [0x10])))
> > (clobber (reg:CC 32 psw))
> > ]) "j.c":7:5 94 {ashlsi3_clobber_flags}
> >  (expr_list:REG_DEAD (reg:HI 44 [ s.a ])
> > (expr_list:REG_UNUSED (reg:CC 32 psw)
> > (nil
> > (insn 10 9 11 2 (parallel [
> > (set (reg:SI 43)
> > (ashiftrt:SI (reg:SI 45)
> > (const_int 16 [0x10])))
> > (clobber (reg:CC 32 psw))
> > ]) "j.c":7:5 104 {ashrsi3_clobber_flags}
> >  (expr_list:REG_DEAD (reg:SI 45)
> > (expr_list:REG_UNUSED (reg:CC 32 psw)
> > (nil
> >
> >
> > Insn 2 is the store into the stack. insn 8 is the load for s.a in the
> > conditional.  DSE1 replaces the MEM in insn 8 with (reg 6) since (reg
> > 6) has the value we want.  After that the store at insn 2 is dead.
> > Sadly DSE never removes the store.
> >
> > The problem is RTL DSE considers a store with no MEM_EXPR as escaping,
> > which keeps the MEM live.  The lack of a MEM_EXPR is due to the call to
> > change_address to twiddle the mode on the MEM for the store at insn
> > 2.  It should be safe to copy the MEM_EXPR (which should always be a
> > PARM_DECL) from the original memory to the memory returned by
> > change_address.  Doing so results in DSE1 removing the store at insn 2.
> >
> > It would be nice to remove the stack setup/teardown.   I'm not offhand
> > aware of mechanisms to remove the setup/teardown after we've already
> > allocated a slot, even if the slot is no longer used.
> >
> > Bootstrapped and regression tested on x86, though I don't think that's
> > a particularly useful test.  So I also ran it through my tester across
> > those pesky embedded targets without regressions as well.
> >
> > I didn't include a test simply because I didn't want to have an insane
> > target selector.  I guess if we really wanted a test we could look
> > after DSE1 is done and verify there aren't any MEMs left at all.
> > Willing to try that if the consensus is we want this tested.
> >
> > OK for the trunk?
> So Richi questioned if using adjust_address rather than change_address
> was sufficient to solve the problem.  It is.  I've bootstrapped &
> regression tested on x86_64 and regression tested this against the usual
> set of crosses in the tester.
>
> OK for the trunk now?

OK.

Thanks,
Richard.

> Jeff
>
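
The size/alignment mismatch that makes "s" a BLKmode object can be
observed from plain C.  This is only a sketch, assuming a typical ABI
where short is 2 bytes with 2-byte alignment and int is 4 bytes with
4-byte alignment; the helper names are invented for illustration and the
results are ABI-dependent:

```c
/* The struct from the example in the message above.  */
typedef struct { short a; short b; } T;

/* Hypothetical helpers, not GCC code.  On the assumed ABI T is as
   large as an int ...  */
static int
same_size_as_int (void)
{
  return sizeof (T) == sizeof (int);
}

/* ... but only aligned like a short, which is why it does not get an
   integer mode of the same size.  */
static int
less_aligned_than_int (void)
{
  return _Alignof (T) < _Alignof (int);
}
```

On a target where both helpers return 1, the argument still fits in one
register, yet the weaker alignment is what pushes it onto the BLKmode
path described above.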


Re: [PATCH] Check type size for doloop iv on BITS_PER_WORD [PR61837]

2021-07-09 Thread Richard Biener
On Fri, 9 Jul 2021, Jiufu Guo wrote:

> Currently, the doloop.xx variable uses the same type as niter, which may be
> narrower than the word size.  In some cases it may be better to use a
> word-size type.  For example, on some 64-bit systems a subreg may be needed
> to access a 32-bit niter.  Using a 64-bit type avoids the subreg if the
> value can be represented in both 32 and 64 bits.
> 
> This patch widens the doloop iv to BITS_PER_WORD where that is safe.
> 
> Bootstrap and regtest pass on powerpc64le and x86, is this ok for trunk?
> 
> BR.
> Jiufu
> 
> gcc/ChangeLog:
> 
> 2021-07-08  Jiufu Guo  
> 
>   PR target/61837
>   * tree-ssa-loop-ivopts.c (add_iv_candidate_for_doloop):
>   Update iv on BITS_PER_WORD for niter.
> 
> gcc/testsuite/ChangeLog:
> 
> 2021-07-08  Jiufu Guo  
> 
>   PR target/61837
>   * gcc.target/powerpc/pr61837.c: New test.
> 
> ---
>  gcc/testsuite/gcc.target/powerpc/pr61837.c | 16 
>  gcc/tree-ssa-loop-ivopts.c | 10 ++
>  2 files changed, 26 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr61837.c
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr61837.c 
> b/gcc/testsuite/gcc.target/powerpc/pr61837.c
> new file mode 100644
> index 000..dc44eb9cb41
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr61837.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +void foo(int *p1, long *p2, int s)
> +{
> +  int n, v, i;
> +
> +  v = 0;
> +  for (n = 0; n <= 100; n++) {
> + for (i = 0; i < s; i++)
> +if (p2[i] == n)
> +   p1[i] = v;
> + v += 88;
> +  }
> +}
> +
> +/* { dg-final { scan-assembler-not {\mrldicl\M} } } */
> diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
> index 12a8a49a307..c3c2f97918d 100644
> --- a/gcc/tree-ssa-loop-ivopts.c
> +++ b/gcc/tree-ssa-loop-ivopts.c
> @@ -5690,6 +5690,16 @@ add_iv_candidate_for_doloop (struct ivopts_data *data)
>  
>tree base = fold_build2 (PLUS_EXPR, ntype, unshare_expr (niter),
>  build_int_cst (ntype, 1));
> +
> +  /* Use type in word size may fast.  */
> +  if (TYPE_PRECISION (ntype) < BITS_PER_WORD
> +  && TYPE_PRECISION (long_unsigned_type_node) == BITS_PER_WORD
> +  && wi::ltu_p (niter_desc->max, wi::to_widest (TYPE_MAX_VALUE (ntype

I wonder if there's a way to query the target what modes the doloop
pattern can handle (not being too familiar with the doloop code).

Why do you need to do any checks besides the new type being able to
represent all IV values?  The original doloop IV will never wrap
(OTOH if niter is U*_MAX then we compute niter + 1 which will become
zero ... I suppose the doloop might still do the correct thing here
but it also still will with an IV with a larger type).

I'd have expected something like

   ntype = lang_hooks.types.type_for_mode (word_mode, TYPE_UNSIGNED (ntype));

thus the decision made using a mode - which is also why I wonder
if there's a way to query the target for this.  As you say,
it _may_ be fast, so better check (somehow).

> +{
> +  ntype = long_unsigned_type_node;
> +  base = fold_convert (ntype, base);
> +}
> +
>add_candidate (data, base, build_int_cst (ntype, -1), true, NULL, NULL, 
> true);
>  }
>  
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
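
The wrap-around case mentioned in the review (niter equal to the maximum
of its type, so that niter + 1 becomes zero) can be sketched in plain C.
This assumes a 32-bit niter type and a 64-bit word; the two helper names
are invented for this sketch and are not GCC functions:

```c
#include <stdint.h>

/* Hypothetical helper: the doloop start value niter + 1 computed in
   the original 32-bit type.  Wraps to 0 when niter == UINT32_MAX.  */
static uint32_t
doloop_base_narrow (uint32_t niter)
{
  return niter + 1u;
}

/* Hypothetical helper: the same value computed after widening to a
   64-bit type, where every 32-bit niter + 1 is representable.  */
static uint64_t
doloop_base_wide (uint32_t niter)
{
  return (uint64_t) niter + 1u;
}
```

For every niter below the 32-bit maximum the two results agree, which is
the situation the wi::ltu_p guard in the patch appears to select.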


Re: [PATCH] testsuite: mips: use noinline attribute instead of -fno-inline

2021-07-09 Thread Xi Ruoyao via Gcc-patches
On Thu, 2021-07-08 at 17:44 -0600, Jeff Law wrote:
> 
> 
> On 6/25/2021 8:40 AM, Richard Sandiford wrote:
> > Xi Ruoyao via Gcc-patches  writes:
> > > On Fri, 2021-06-25 at 01:02 +0800, Xi Ruoyao wrote:
> > > > On Thu, 2021-06-24 at 10:48 -0600, Jeff Law wrote:
> > > > > I'd like to know a bit more here.  mips.exp shouldn't care
> > > > > about the
> > > > > options passed to the compiler and to the best of my knowledge
> > > > > patch itself is wrong, I question if it's necessary and
> > > > > whether or
> > > > > not
> > > > > your just papering over some other issue.
> > > > There is some logic processing options in mips.exp.  Some
> > > > options are
> > > > overrided for multilib.  It seems the mips.exp was originally
> > > > designed
> > > > as:
> > > > 
> > > > * MIPS options should go in dg-options
> > > > * Other options should go in dg-additional-options
> > > > 
> > > > In d2148424165 marxin merged some dg-additional-options into dg-
> > > > options,
> > > > exploited the problem.
> > > > 
> > > > And, the "origin" convention seems already broken: there is
> > > > something
> > > > like -funroll-loops which is not a MIPS option, but accepted by
> > > > mips.exp
> > > > in dg-options.
> > > > 
> > > > Possiblities are:
> > > > 
> > > > (1) this patch
> > > > (2) make mips.exp accept -fno-inline as "if it is a MIPS option"
> > > > (3) refactor mips.exp to pass everything itself doesn't know
> > > > directly
> > > > to gcc
> > > Attached is a diff for mips.exp that tries to make it pass everything
> > > in dg-options that it doesn't know about directly to the compiler.
> > > 
> > > The "smallest fix" is simply adding -fno-inline to mips.exp.
> > > However, I don't like it because I agree with you that mips.exp
> > > shouldn't care about dg-options, or at least shouldn't do so too much.
> > As I said in the other message, I think the smallest fix is the way
> > to
> > go though.
> Thanks for chiming in, Richard.  I didn't know all the background here.
> Let's just go with the small fix based on your recommendation.  We can
> always revisit if we keep running into issues in this code.

Pushed at 3b33b113.



Re: [committed] move warning suppression closer to invalid access (PR101372)

2021-07-09 Thread Richard Biener via Gcc-patches
On Fri, Jul 9, 2021 at 12:57 AM Martin Sebor via Gcc-patches
 wrote:
>
> To unblock bootstrap this morning that was failing due to stricter
> array bounds checking, I suppressed two -Warray-bounds instances
> in cp/modules.cc without analyzing them, tracking the to-do in
> pr101372.  Now that I understand what's going on -- the warning
> is behaving as designed, flagging accesses to one member via
> a pointer derived from another -- I believe the suppression is
> still appropriate but can be moved to the inline function that
> does the access.  Thanks to the recent improvements to warning
> suppression (r12-1992 and related) this more targeted fix should
> work reliably while also avoiding a recurrence of the warning in
> future uses of the function.  I have committed the attached patch
> to make this change after testing it on x86_64-linux.

The way we handle libcpp identifiers as tree identifiers is indeed somewhat
fishy, but it should be visible in more places (and likely also have
TBAA issues).
We're mating

struct GTY(()) tree_identifier {
  struct tree_common common;
  struct ht_identifier id;
};

and

struct GTY(()) cpp_hashnode {
  struct ht_identifier ident;
  unsigned int is_directive : 1;
  unsigned int directive_index : 7; /* If is_directive,
...
};

to access the common ht_identifier.  I think it would be cleaner,
wherever we do this, to expose only ht_identifier * to whatever API
is accessing it.  Maybe with C++ overloading it's less bad than
it sounds.

It's of course all done to save memory, re-using string representations
from the preprocessor for the tree representation of identifiers.

Richard.

> Martin
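
One way to picture the "expose only ht_identifier *" suggestion is the
following C sketch.  The struct layouts are simplified stand-ins, not
the real GCC definitions, and every name ending in _sketch as well as
ident_len, tree_ident and node_ident is hypothetical:

```c
/* Simplified stand-in for the shared string representation.  */
struct ht_identifier
{
  const char *str;
  unsigned int len;
};

/* Simplified stand-ins for the two containers quoted above; the shared
   member sits at a different offset in each.  */
struct tree_common_sketch { int code; };

struct tree_identifier_sketch
{
  struct tree_common_sketch common;
  struct ht_identifier id;
};

struct cpp_hashnode_sketch
{
  struct ht_identifier ident;
  unsigned int is_directive : 1;
};

/* Each container hands out a pointer to its own ht_identifier member,
   so no pointer to one member is derived from another.  */
static const struct ht_identifier *
tree_ident (const struct tree_identifier_sketch *t)
{
  return &t->id;
}

static const struct ht_identifier *
node_ident (const struct cpp_hashnode_sketch *n)
{
  return &n->ident;
}

/* The consumer API only ever sees the shared part.  */
static unsigned int
ident_len (const struct ht_identifier *id)
{
  return id->len;
}
```

With accessors of this shape the consumer never needs to know which of
the two containers it was given, which may be what makes the C++
overloading variant less awkward than it sounds.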


Re: disable -Warray-bounds in libgo (PR 101374)

2021-07-09 Thread Richard Biener via Gcc-patches
On Thu, Jul 8, 2021 at 8:02 PM Martin Sebor via Gcc-patches
 wrote:
>
> Hi Ian,
>
> Yesterday's enhancement to -Warray-bounds has exposed a couple of
> issues in libgo where the code writes into an invalid constant
> address that the warning is designed to flag.
>
> On the assumption that those invalid addresses are deliberate,
> the attached patch suppresses these instances by using #pragma
> GCC diagnostic but I don't think I'm supposed to commit it (at
> least Git won't let me).  To avoid Go bootstrap failures please
> either apply the patch or otherwise suppress the warning (e.g.,
> by using a volatile pointer temporary).

Btw, I don't think we should diagnose things like

*(int*)0x21 = 0x21;

when somebody literally writes that, he'll just be annoyed by diagnostics.

Of course the above might be able to use __builtin_trap (); - it looks
like it is placed where control flow should never end, kind of a
__builtin_unreachable (), which means abort () might do as well.

Richard.

> Thanks
> Martin
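
The alternative suggested above, calling abort () at a point that control
should never reach instead of storing through a constant address, could
look like this in C.  This is a hypothetical example, not the libgo code
in question; checked_index is a name made up for the sketch:

```c
#include <stdlib.h>

/* Hypothetical example, not the libgo code.  Returns I when it is a
   valid index into an array of N elements; otherwise control must
   never continue.  */
static int
checked_index (int i, int n)
{
  if (i < 0 || i >= n)
    abort ();   /* Rather than: *(int *) 0x21 = 0x21;  */
  return i;
}
```

Unlike the store to a constant address, this gives the compiler nothing
that -Warray-bounds would flag, and __builtin_trap () could be used in
the same position on GCC.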

