Re: [PATCH] openmp, fortran: Check that event handles passed to detach clauses are not arrays [PR104131]

2022-02-28 Thread Tobias Burnus

On 28.02.22 22:37, Jakub Jelinek via Fortran wrote:

On Mon, Feb 28, 2022 at 09:45:10PM +0100, Mikael Morin wrote:

Lesson learned today: expressions don’t have a corank.
Does pr104131-2.f90 really need to be rejected?


In my reading of the spec, pr104131-2.f90 is _valid_ (in newer OMP). At
least that's implied by the spec as quoted by Jakub:


OpenMP 5.2 says that detach clause should be treated as if it appears on a
firstprivate clause and for the privatization clauses says:
"A private variable must not be coindexed or appear as an actual argument to a
procedure where the corresponding dummy argument is a coarray."
5.1 had the same restriction.


+++ b/gcc/testsuite/gfortran.dg/gomp/pr104131-2.f90
...
+  integer (kind=omp_event_handle_kind) :: x[*]
+  !$omp task detach (x)

Here, 'x' is a coarray – but refers to the local variable on this image.

But the following is invalid as it is coindexed:
  !$omp task detach (x[3])

where x[3] means that the value from 'x' on image 3 should be used.

The wording actually also permits array sections. But contrary to coarrays,
(which are odd but otherwise fine), I think it does not really make sense
to have arrays and array sections here.

(Do we need/want to get this clarified/changed in spec?)

But from the wording of the spec, also the first testcase seems to be valid.

 * * *



A variable that is part of another variable (as an array element or a
structure element) cannot appear in a detach clause.

which tells that the check should be on expr->ref instead of
expr->sym->as or expr->rank.


I think looking at the "sym" is fine when matching the expression
via gfc_match_omp_variable_list with allow_derived=false (the default),
as then there cannot be derived-type components.

Additionally, expr->rank > 0 rules out arrays/array sections
but permits array elements, while sym->attr.dimension also rules
out array elements.

BTW: after resolving a variable, expr->ref always exists
for arrays – either to select an element or array section
or otherwise, there is an AR_FULL for a whole array.
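
To make the difference between the two checks concrete, here is a toy C++ model. It is not gfortran code: ToyExpr, ArrayRef and the helper functions are invented for illustration and only mirror the roles of expr->rank, sym->attr.dimension and the array reference on expr->ref as described above.

```cpp
#include <cassert>

// Hypothetical stand-ins for the array-reference kinds mentioned above
// (none, whole array a.k.a. AR_FULL, array section, single element).
enum class ArrayRef { None, Full, Section, Element };

struct ToyExpr {
  bool sym_is_array;  // plays the role of sym->attr.dimension
  ArrayRef ref;       // plays the role of the array reference in expr->ref
  int rank;           // plays the role of expr->rank
};

// Build the toy expression for a symbol and the way it is referenced.
// After resolution an array variable always carries a reference, so
// ArrayRef::None is only accepted for scalars.
ToyExpr make_expr(bool sym_is_array, int sym_rank, ArrayRef ref) {
  assert(sym_is_array == (ref != ArrayRef::None));
  int rank = 0;
  if (ref == ArrayRef::Full || ref == ArrayRef::Section)
    rank = sym_rank;  // simplification: a section keeps the full rank
  return ToyExpr{sym_is_array, ref, rank};
}

// Check based on the expression's rank: array elements slip through.
bool rejected_by_rank(const ToyExpr &e) { return e.rank > 0; }

// Check based on the symbol: anything derived from an array is rejected.
bool rejected_by_sym(const ToyExpr &e) { return e.sym_is_array; }
```

In this model an element reference such as a(1) has rank 0, so only the symbol-based check rejects it, which is the distinction drawn above.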

Tobias



Re: [PATCH] Check if loading const from mem is faster

2022-02-28 Thread Richard Biener via Gcc-patches
On Tue, 1 Mar 2022, Jiufu Guo wrote:

> Segher Boessenkool  writes:
> 
> > On Thu, Feb 24, 2022 at 09:50:28AM +0100, Richard Biener wrote:
> >> On Thu, 24 Feb 2022, Jiufu Guo wrote:
> >> > And another thing as Segher pointed out, CSE is doing too
> >> > much work.  It may be ok to separate the constant handling
> >> > logic from CSE.
> >> 
> >> Not sure - CSE just is value numbering, I don't see that it does
> >> more than that.  Yes, it might have developed "heuristics" over
> >> the years what to CSE and to what and where to substitute and
> >> where not.  But in the end it does just value numbering.
> >
> > It also does various micro-optimisations, like all the CC things it
> > does.
> >
> > It is not very good at doing the CSE job, but it cannot easily be
> > replaced by a better implementation because it does many other small
> > optimisations (that are not done elsewhere).
> >
> 
> Thanks a lot for these comments! I'm also wondering if we would
> rewrite this cse.cc or refactor it in some aspects.

I think time is better spent elsewhere ... I don't think CSE is as
bad as Segher depicts it - it might do "CC things" and other bits
but in the end that's going to be instruction/expression combination
things that "fit" likely because a value lattice (or just nonzero bits
in the cselib variant) existed.

So what might be interesting would be to work towards cleansing
CSE of those, producing testcases and making sure a better fit
pass (combine? fwprop? compare-elim?) performs the desired
optimization.

But I'm not really sure what Segher is talking about - I suppose
it must be magic done inside cselib (which only does analysis),
not in cse.cc itself.
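
For readers less familiar with the term, a minimal sketch of what "just value numbering" means on a single basic block follows. It is a textbook-style toy, not cse.cc or cselib: the Insn struct and local_value_numbering are made-up names, and it only reports which instructions recompute an already available value.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// One three-address instruction: dest = lhs op rhs.
struct Insn {
  std::string dest;
  char op;
  std::string lhs, rhs;
};

// For each instruction, return the index of the earliest instruction that
// computed the same value, or -1 if the value is new.  This only detects
// the redundancy; deciding whether and where to substitute is exactly the
// "heuristics" part that the sketch leaves out.
std::vector<int> local_value_numbering(const std::vector<Insn> &block) {
  std::map<std::string, int> var_vn;                  // variable -> value number
  std::map<std::tuple<char, int, int>, int> expr_vn;  // (op, vn, vn) -> value number
  std::map<int, int> vn_def;                          // value number -> defining insn
  int next_vn = 0;

  auto vn_of = [&](const std::string &v) {
    auto it = var_vn.find(v);
    if (it == var_vn.end())
      it = var_vn.emplace(v, next_vn++).first;  // first use: fresh value
    return it->second;
  };

  std::vector<int> redundant_with(block.size(), -1);
  for (std::size_t i = 0; i < block.size(); ++i) {
    const Insn &in = block[i];
    int l = vn_of(in.lhs);
    int r = vn_of(in.rhs);
    auto key = std::make_tuple(in.op, l, r);
    auto it = expr_vn.find(key);
    if (it != expr_vn.end()) {
      // Same operator applied to the same values: nothing new is computed.
      redundant_with[i] = vn_def[it->second];
      var_vn[in.dest] = it->second;
    } else {
      int vn = next_vn++;
      expr_vn.emplace(key, vn);
      vn_def[vn] = static_cast<int>(i);
      var_vn[in.dest] = vn;
    }
  }
  return redundant_with;
}
```

Note that redefining a variable gives it a new value number, so a later a + b after an assignment to a is correctly treated as a different value.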

Richard.

> BR,
> Jiufu
> 
> >
> > Segher
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


Re: [PATCH] Clear currently_expanding_gimple_stmt properly

2022-02-28 Thread Richard Biener via Gcc-patches



> On 01.03.2022 at 04:27, H.J. Lu via Gcc-patches wrote:
> 
> commit a5883ba0de68efad36db145e75c86394d8bd44ea
> Author: Michael Matz 
> Date:   Tue Nov 24 15:37:32 2009 +
> 
> introduced currently_expanding_gimple_stmt, which was set and cleared in
> expand_gimple_basic_block when expanding gimple statement to RTL.  But it
> isn't cleared when expand_gimple_basic_block returns inside the loop.

Ok 

Richard 


> 
>PR middle-end/104721
>* cfgexpand.cc (expand_gimple_basic_block): Clear
>currently_expanding_gimple_stmt when returning inside the loop.
> ---
> gcc/cfgexpand.cc | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
> index d51af2e3084..87536ec7ccd 100644
> --- a/gcc/cfgexpand.cc
> +++ b/gcc/cfgexpand.cc
> @@ -5927,7 +5927,10 @@ expand_gimple_basic_block (basic_block bb, bool disable_tail_calls)
>{
>  new_bb = expand_gimple_cond (bb, as_a <gcond *> (stmt));
>  if (new_bb)
> -return new_bb;
> +{
> +  currently_expanding_gimple_stmt = NULL;
> +  return new_bb;
> +}
>}
>   else if (is_gimple_debug (stmt))
>{
> @@ -6049,7 +6052,10 @@ expand_gimple_basic_block (basic_block bb, bool disable_tail_calls)
>  if (can_fallthru)
>bb = new_bb;
>  else
> -return new_bb;
> +{
> +  currently_expanding_gimple_stmt = NULL;
> +  return new_bb;
> +}
>}
>}
>  else
> -- 
> 2.35.1
> 


[PATCH v3, rs6000] Enable absolute jump table for PPC AIX and Linux

2022-02-28 Thread HAO CHEN GUI via Gcc-patches
Hi,
   This patch enables absolute jump tables on PPC AIX and Linux. For AIX, the
jump table is placed in the data section. For Linux, it is placed in the RELRO
section when relocation is needed.

   Bootstrapped and tested on AIX and on Linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.
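
As background on why section placement matters here, the following toy C++ simulation contrasts the two table flavours. It rests on an assumption not spelled out in this mail: an absolute table stores final addresses, so with position-independent code each entry needs a load-time relocation, whereas a relative table stores offsets that do not depend on the load address. ToyImage and all function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCases = 4;

// A simulated object image: where it was loaded, where its jump table
// lives, and where the case labels live (offsets from the image start).
struct ToyImage {
  std::uint64_t load_base;
  std::uint64_t table_offset;
  std::array<std::uint64_t, kCases> label_offset;
};

// Absolute table: every entry is a final address.  The contents change
// with load_base, i.e. each entry would need a relocation.
std::array<std::uint64_t, kCases> absolute_table(const ToyImage &img) {
  std::array<std::uint64_t, kCases> t{};
  for (std::size_t i = 0; i < kCases; ++i)
    t[i] = img.load_base + img.label_offset[i];
  return t;
}

// Relative table: every entry is label minus table, independent of
// load_base, so the table can stay in read-only memory untouched.
std::array<std::int64_t, kCases> relative_table(const ToyImage &img) {
  std::array<std::int64_t, kCases> t{};
  for (std::size_t i = 0; i < kCases; ++i)
    t[i] = static_cast<std::int64_t>(img.label_offset[i]) -
           static_cast<std::int64_t>(img.table_offset);
  return t;
}

// Absolute dispatch: load the entry and jump to it.
std::uint64_t dispatch_absolute(const std::array<std::uint64_t, kCases> &t,
                                std::size_t i) {
  assert(i < kCases);
  return t[i];
}

// Relative dispatch: load the entry, add the table address, then jump.
std::uint64_t dispatch_relative(const ToyImage &img,
                                const std::array<std::int64_t, kCases> &t,
                                std::size_t i) {
  assert(i < kCases);
  return img.load_base + img.table_offset + static_cast<std::uint64_t>(t[i]);
}
```

The absolute form saves the add in the dispatch sequence at the cost of a table that has to be written once at load time, which is why it cannot simply stay in the text section under -fPIC.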

ChangeLog
2022-03-01 Haochen Gui 

gcc/
* config/rs6000/aix.h (JUMP_TABLES_IN_TEXT_SECTION): Define.
* config/rs6000/linux64.h (JUMP_TABLES_IN_TEXT_SECTION): Likewise.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Enable
absolute jump tables for AIX and Linux.
(rs6000_xcoff_function_rodata_section): Implement.
* config/rs6000/xcoff.h (TARGET_ASM_FUNCTION_RODATA_SECTION): Define.

gcc/testsuite
* gcc.target/powerpc/absolute-jump-table-section.c: New.


patch.diff
diff --git a/gcc/config/rs6000/aix.h b/gcc/config/rs6000/aix.h
index ad3238bf09a..cf0708aa08b 100644
--- a/gcc/config/rs6000/aix.h
+++ b/gcc/config/rs6000/aix.h
@@ -251,9 +251,9 @@
 #define BLOCK_REG_PADDING(MODE, TYPE, FIRST) \
   (!(FIRST) ? PAD_UPWARD : targetm.calls.function_arg_padding (MODE, TYPE))

-/* Indicate that jump tables go in the text section.  */
+/* Indicate that jump tables go in the data section.  */

-#define JUMP_TABLES_IN_TEXT_SECTION 1
+#define JUMP_TABLES_IN_TEXT_SECTION 0

 /* Define any extra SPECS that the compiler needs to generate.  */
 #undef  SUBTARGET_EXTRA_SPECS
diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index b2a7afabc73..440e0fde52b 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -237,9 +237,9 @@ extern int dot_symbols;
 #define TARGET_ALIGN_NATURAL 1
 #endif

-/* Indicate that jump tables go in the text section.  */
+/* Indicate that jump tables go in the rodata or RELRO section.  */
 #undef  JUMP_TABLES_IN_TEXT_SECTION
-#define JUMP_TABLES_IN_TEXT_SECTION TARGET_64BIT
+#define JUMP_TABLES_IN_TEXT_SECTION 0

 /* The linux ppc64 ABI isn't explicit on whether aggregates smaller
than a doubleword should be padded upward or downward.  You could
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index bc3ef0721a4..07f78d3a05b 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -4954,6 +4954,10 @@ rs6000_option_override_internal (bool global_init_p)
 warning (0, "%qs is deprecated and not recommended in any circumstances",
 "-mno-speculate-indirect-jumps");

+  /* Enable absolute jump tables for AIX and Linux.  */
+  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
+rs6000_relative_jumptables = 0;
+
   return ret;
 }

@@ -21419,6 +21423,16 @@ rs6000_xcoff_visibility (tree decl)
   enum symbol_visibility vis = DECL_VISIBILITY (decl);
   return visibility_types[vis];
 }
+
+static section *
+rs6000_xcoff_function_rodata_section (tree decl ATTRIBUTE_UNUSED,
+ bool relocatable)
+{
+  if (relocatable)
+return data_section;
+  else
+return readonly_data_section;
+}
 #endif


diff --git a/gcc/config/rs6000/xcoff.h b/gcc/config/rs6000/xcoff.h
index cd0f99cb9c6..0dacd86eed9 100644
--- a/gcc/config/rs6000/xcoff.h
+++ b/gcc/config/rs6000/xcoff.h
@@ -98,7 +98,7 @@
 #define TARGET_ASM_SELECT_SECTION  rs6000_xcoff_select_section
 #define TARGET_ASM_SELECT_RTX_SECTION  rs6000_xcoff_select_rtx_section
 #define TARGET_ASM_UNIQUE_SECTION  rs6000_xcoff_unique_section
-#define TARGET_ASM_FUNCTION_RODATA_SECTION default_no_function_rodata_section
+#define TARGET_ASM_FUNCTION_RODATA_SECTION rs6000_xcoff_function_rodata_section
 #define TARGET_STRIP_NAME_ENCODING  rs6000_xcoff_strip_name_encoding
 #define TARGET_SECTION_TYPE_FLAGS  rs6000_xcoff_section_type_flags
 #ifdef HAVE_AS_TLS
diff --git a/gcc/testsuite/gcc.target/powerpc/absolute-jump-table-section.c 
b/gcc/testsuite/gcc.target/powerpc/absolute-jump-table-section.c
new file mode 100644
index 000..688a6f42836
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/absolute-jump-table-section.c
@@ -0,0 +1,29 @@
+/* { dg-do compile { target { *-*-aix* || *-*-linux* } } } */
+/* { dg-options "-O2 -fPIC" } */
+
+/* For Linux, the absolute jump tables are placed in .data.rel.ro.local.
+   For AIX, they're placed in data section.  */
+
+int a;
+
+int foo (char c)
+{
+  switch (c) {
+  case 'C':
+return a;
+  case 'D':
+return 3;
+  case 'A':
+return 1;
+  case '%':
+return -2;
+  case '#':
+return a+4;
+  default:
+return 100;
+  }
+}
+
+/* { dg-final { scan-assembler "\\.section\[ \t\]\\.data\\.rel\\.ro\\.local" { target *-*-linux* } } } */
+/* { dg-final { scan-assembler-times "\\.csect \\.data" 2 { target *-*-aix* } } } */
+/* { dg-final { scan-assembler-times "\\.csect \\.text" 5 { target *-*-aix* } } } */


Re: [PATCH] [i386] Replace ix86_gen_scratch_sse_rtx with gen_reg_rtx.

2022-02-28 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 1, 2022 at 10:39 AM H.J. Lu via Gcc-patches
 wrote:
>
> On Mon, Feb 28, 2022 at 6:26 PM H.J. Lu  wrote:
> >
> > On Mon, Feb 28, 2022 at 6:03 PM liuhongt  wrote:
> > >
> > > .. in ix86_expand_vector_move and
> > > ix86_convert_const_wide_int_to_broadcast(called by the former).
> > >
> > > ix86_expand_vector_move is called by emit_move_insn which is used by
> > > many pre_reload passes, ix86_gen_scratch_sse_rtx will break data flow
> > > > when there's explicit usage of xmm7/xmm15/xmm31.
> > >
> > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}
> > > for both w/and w/o --with-cpu=native --with-arch=native.
> > >
> > > Ok for trunk?
> > >
> > > gcc/ChangeLog:
> > >
> > > PR target/104704
> > > * config/i386/i386-expand.cc
> > > (ix86_convert_const_wide_int_to_broadcast): Replace
> > > ix86_gen_scratch_sse_rtx with gen_reg_rtx.
> > > (ix86_expand_vector_move): Ditto.
> > > * config/i386/sse.md (*vec_dupv4si): Add alternative $r and
> > > corresponding splitter after it.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.target/i386/incoming-11.c: Revert r12-2665-g7f4c3943f795fd.
> > > > * gcc.target/i386/pr100865-11b.c: Expect vmovdqa or vmovdqa64.
> > > * gcc.target/i386/pr100865-12b.c: Ditto.
> > > * gcc.target/i386/pr100865-8b.c: Ditto.
> > > * gcc.target/i386/pr100865-9b.c: Ditto.
> > > * gcc.target/i386/pr82941-1.c: Expect vzeroupper for ! ia32.
> > > * gcc.target/i386/pr82942-1.c: Ditto.
> > > * gcc.target/i386/pr82990-1.c: Ditto.
> > > * gcc.target/i386/pr82990-3.c: Ditto.
> > > * gcc.target/i386/pr82990-5.c: Ditto.
> > > ---
> > >  gcc/config/i386/i386-expand.cc   |  6 +--
> > >  gcc/config/i386/sse.md   | 41 +++-
> > >  gcc/testsuite/gcc.target/i386/incoming-11.c  |  2 +-
> > >  gcc/testsuite/gcc.target/i386/pr100865-11b.c |  2 +-
> > >  gcc/testsuite/gcc.target/i386/pr100865-12b.c |  2 +-
> > >  gcc/testsuite/gcc.target/i386/pr100865-8b.c  |  2 +-
> > >  gcc/testsuite/gcc.target/i386/pr100865-9b.c  |  2 +-
> > >  gcc/testsuite/gcc.target/i386/pr82941-1.c|  3 +-
> > >  gcc/testsuite/gcc.target/i386/pr82942-1.c|  3 +-
> > >  gcc/testsuite/gcc.target/i386/pr82990-1.c|  3 +-
> > >  gcc/testsuite/gcc.target/i386/pr82990-3.c|  3 +-
> > >  gcc/testsuite/gcc.target/i386/pr82990-5.c|  3 +-
> > >  12 files changed, 45 insertions(+), 27 deletions(-)
> > >
> > > diff --git a/gcc/config/i386/i386-expand.cc 
> > > b/gcc/config/i386/i386-expand.cc
> > > index faa0191c6dd..75a28cdd89d 100644
> > > --- a/gcc/config/i386/i386-expand.cc
> > > +++ b/gcc/config/i386/i386-expand.cc
> > > @@ -257,7 +257,7 @@ ix86_convert_const_wide_int_to_broadcast 
> > > (machine_mode mode, rtx op)
> > >machine_mode vector_mode;
> > > >if (!mode_for_vector (broadcast_mode, nunits).exists (&vector_mode))
> > >  gcc_unreachable ();
> > > -  rtx target = ix86_gen_scratch_sse_rtx (vector_mode);
> > > +  rtx target = gen_reg_rtx (vector_mode);
> >
> > I think ix86_gen_scratch_sse_rtx should check
> > currently_expanding_gimple_stmt == NULL
> > to return gen_reg_rtx (vector_mode) instead.
>
> Like this:
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index b2bf90576d5..6c0e4929914 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -23786,7 +23786,7 @@ ix86_optab_supported_p (int op, machine_mode
> mode1, machine_mode,
>  rtx
>  ix86_gen_scratch_sse_rtx (machine_mode mode)
>  {
> -  if (TARGET_SSE && !lra_in_progress)
> +  if (TARGET_SSE && currently_expanding_gimple_stmt)
>  {
>unsigned int regno;
>if (TARGET_64BIT)
> (END)
Looks like it relies on PR104721.
>
> > >bool ok = ix86_expand_vector_init_duplicate (false, vector_mode,
> > >target,
> > >GEN_INT (val_broadcast));
> > > @@ -605,7 +605,7 @@ ix86_expand_vector_move (machine_mode mode, rtx 
> > > operands[])
> > >if (!register_operand (op0, mode)
> > >   && !register_operand (op1, mode))
> > > {
> > > - rtx scratch = ix86_gen_scratch_sse_rtx (mode);
> > > + rtx scratch = gen_reg_rtx (mode);
> > >   emit_move_insn (scratch, op1);
> > >   op1 = scratch;
> > > }
> > > @@ -647,7 +647,7 @@ ix86_expand_vector_move (machine_mode mode, rtx 
> > > operands[])
> > >&& !register_operand (op0, mode)
> > >&& !register_operand (op1, mode))
> > >  {
> > > -  rtx tmp = ix86_gen_scratch_sse_rtx (GET_MODE (op0));
> > > +  rtx tmp = gen_reg_rtx (GET_MODE (op0));
> > >emit_move_insn (tmp, op1);
> > >emit_move_insn (op0, tmp);
> > >return;
> > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > > index 3066ea3734a..d124545aa5d 100644
> > > --- a/gcc/config/i386/sse.md
> > > +++ 

[PATCH] c++: improve location of fold expressions

2022-02-28 Thread Patrick Palka via Gcc-patches
This improves diagnostic quality for unsatisfied atomic constraints
that consist of a fold expression, e.g. in concepts/diagnostic3.C:

  .../diagnostic3.C:10:22: note: the expression ‘(foo && ...) [with Ts = {int, char}]’ evaluated to ‘false’
 10 | requires (foo && ...)
|  ^~~~

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

* semantics.cc (finish_unary_fold_expr): Use input_location
instead of UNKNOWN_LOCATION.
(finish_binary_fold_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/concepts/diagnostic3.C: Adjusted expected location of
"evaluated to false" diagnostics.
---
 gcc/cp/semantics.cc | 4 ++--
 gcc/testsuite/g++.dg/concepts/diagnostic3.C | 8 
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index a2c0eb050e6..07cae993efe 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12185,7 +12185,7 @@ finish_unary_fold_expr (tree expr, int op, tree_code dir)
 
   /* Build the fold expression.  */
   tree code = build_int_cstu (integer_type_node, abs (op));
-  tree fold = build_min_nt_loc (UNKNOWN_LOCATION, dir, code, pack);
+  tree fold = build_min_nt_loc (input_location, dir, code, pack);
   FOLD_EXPR_MODIFY_P (fold) = (op < 0);
   TREE_TYPE (fold) = build_dependent_operator_type (NULL_TREE,
FOLD_EXPR_OP (fold),
@@ -12214,7 +12214,7 @@ finish_binary_fold_expr (tree pack, tree init, int op, tree_code dir)
 {
   pack = make_pack_expansion (pack);
   tree code = build_int_cstu (integer_type_node, abs (op));
-  tree fold = build_min_nt_loc (UNKNOWN_LOCATION, dir, code, pack, init);
+  tree fold = build_min_nt_loc (input_location, dir, code, pack, init);
   FOLD_EXPR_MODIFY_P (fold) = (op < 0);
   TREE_TYPE (fold) = build_dependent_operator_type (NULL_TREE,
FOLD_EXPR_OP (fold),
diff --git a/gcc/testsuite/g++.dg/concepts/diagnostic3.C 
b/gcc/testsuite/g++.dg/concepts/diagnostic3.C
index 7796e264251..410651a9c1a 100644
--- a/gcc/testsuite/g++.dg/concepts/diagnostic3.C
+++ b/gcc/testsuite/g++.dg/concepts/diagnostic3.C
@@ -7,18 +7,18 @@ template
   concept foo = (bool)(foo_v | foo_v);
 
 template
-requires (foo && ...)
+requires (foo && ...) // { dg-message "with Ts = .int, char... evaluated to .false." }
 void
-bar() // { dg-message "with Ts = .int, char... evaluated to .false." }
+bar()
 { }
 
 template
 struct S { };
 
 template
-requires (foo> && ...)
+requires (foo> && ...) // { dg-message "with Is = .2, 3, 4... evaluated to .false." }
 void
-baz() // { dg-message "with Is = .2, 3, 4... evaluated to .false." }
+baz()
 { }
 
 void
-- 
2.35.1.354.g715d08a9e5



Re: [PATCH] c++: Fix ICE with non-constant satisfaction [PR98644]

2022-02-28 Thread Patrick Palka via Gcc-patches
On Tue, 19 Jan 2021, Jason Merrill wrote:

> On 1/13/21 12:05 PM, Patrick Palka wrote:
> > In the below testcase, the expression of the atomic constraint after
> > substitution is (int *) NON_LVALUE_EXPR <1> != 0B which is not a C++
> > constant expression, but its TREE_CONSTANT flag is set (from build2),
> > so satisfy_atom fails to notice that it's non-constant (and we end
> > up tripping over the assert in satisfaction_value).
> > 
> > Since TREE_CONSTANT doesn't necessarily correspond to C++ constantness,
> > this patch makes satisfy_atom instead check is_rvalue_constant_expression.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk/10?
> > 
> > gcc/cp/ChangeLog:
> > 
> > PR c++/98644
> > * constraint.cc (satisfy_atom): Check is_rvalue_constant_expression
> > instead of TREE_CONSTANT.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR c++/98644
> > * g++.dg/cpp2a/concepts-pr98644.C: New test.
> > ---
> >   gcc/cp/constraint.cc  | 2 +-
> >   gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C | 7 +++
> >   2 files changed, 8 insertions(+), 1 deletion(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C
> > 
> > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > index 9049d087859..f99a25dc8a4 100644
> > --- a/gcc/cp/constraint.cc
> > +++ b/gcc/cp/constraint.cc
> > @@ -2969,7 +2969,7 @@ satisfy_atom (tree t, tree args, sat_info info)
> >   {
> > result = maybe_constant_value (result, NULL_TREE,
> >  /*manifestly_const_eval=*/true);
> > -  if (!TREE_CONSTANT (result))
> 
> This should be sufficient.  If the result isn't constant, maybe_constant_value
> shouldn't return it with TREE_CONSTANT set.  See
> 
> >   /* This isn't actually constant, so unset TREE_CONSTANT.  */
> 
> in cxx_eval_outermost_constant_expr.

I see, so the problem seems to be that the fail-fast path of
maybe_constant_value isn't clearing TREE_CONSTANT sufficiently.  Would
it make sense to fix this like so?

-- >8 --

Subject: [PATCH] c++: ICE with non-constant satisfaction value [PR98644]

Here during satisfaction the expression of the atomic constraint after
substitution is (int *) NON_LVALUE_EXPR <1> != 0B, which is not a C++
constant expression due to the reinterpret_cast, but TREE_CONSTANT is
set since its value is otherwise effectively constant.  We then call
maybe_constant_value on it, which proceeds via its fail-fast path to
exit early without clearing TREE_CONSTANT.  But satisfy_atom relies
on checking TREE_CONSTANT of the result of maybe_constant_value in order
to detect non-constant satisfaction.

This patch fixes this by making the fail-fast path of maybe_constant_value
clear TREE_CONSTANT in this case, like cxx_eval_outermost_constant_expr
in the normal path would have done.
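
The control flow being changed can be paraphrased with a small C++ model. This is only a sketch of the fail-fast branch as shown in the hunk below: ToyTree and toy_maybe_constant_value are invented names, the flags stand in for the real predicates, and the normal evaluation path is deliberately not modelled.

```cpp
// Toy stand-in for a tree node with the three properties that matter here.
struct ToyTree {
  bool is_cxx_constant_expr;  // models is_nondependent_constant_expression (t)
  bool overflow;              // models TREE_OVERFLOW_P (t)
  bool tree_constant;         // models TREE_CONSTANT (t)
};

// Models only the early exit of maybe_constant_value.  `patched` selects
// between the old condition and the one proposed in this patch.
ToyTree toy_maybe_constant_value(ToyTree t, bool processing_template_decl,
                                 bool patched) {
  if (!t.is_cxx_constant_expr) {
    bool clear = t.overflow;
    if (patched && !processing_template_decl && t.tree_constant)
      clear = true;
    if (clear)
      t.tree_constant = false;  // models build_nop + TREE_CONSTANT (t) = false
    return t;                   // fail-fast: no evaluation attempted
  }
  // Normal path (full constant evaluation) is outside this sketch.
  return t;
}
```

With the old condition a non-constant expression that happens to have the flag set comes back unchanged, which is what a caller testing the flag afterwards would trip over.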

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/98644

gcc/cp/ChangeLog:

* constexpr.cc (maybe_constant_value): In the fail-fast path,
clear TREE_CONSTANT on the result if it's set on the input.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-pr98644.C: New test.
* g++.dg/parse/array-size2.C: Remove expected diagnostic about a
narrowing conversion.
---
 gcc/cp/constexpr.cc   | 4 +++-
 gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C | 7 +++
 gcc/testsuite/g++.dg/parse/array-size2.C  | 2 --
 3 files changed, 10 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 4716694cb71..234cf0acc26 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -7965,8 +7965,10 @@ maybe_constant_value (tree t, tree decl, bool manifestly_const_eval)
 
   if (!is_nondependent_constant_expression (t))
 {
-  if (TREE_OVERFLOW_P (t))
+  if (TREE_OVERFLOW_P (t)
+ || (!processing_template_decl && TREE_CONSTANT (t)))
{
+ /* This isn't actually constant, so unset TREE_CONSTANT.  */
  t = build_nop (TREE_TYPE (t), t);
  TREE_CONSTANT (t) = false;
}
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C
new file mode 100644
index 000..6772f72a3ce
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-pr98644.C
@@ -0,0 +1,7 @@
+// PR c++/98644
+// { dg-do compile { target c++20 } }
+
+template concept Signed = bool(T(1)); // { dg-error "reinterpret_cast" }
+static_assert(Signed); // { dg-error "non-constant" }
+
+constexpr bool B = requires { requires bool((char *)1); }; // { dg-error "reinterpret_cast" }
diff --git a/gcc/testsuite/g++.dg/parse/array-size2.C 
b/gcc/testsuite/g++.dg/parse/array-size2.C
index c4a69df3b01..e58fe266e77 100644
--- a/gcc/testsuite/g++.dg/parse/array-size2.C
+++ 

Re: [PATCH v4][GCC13] RISC-V: Provide `fmin'/`fmax' RTL patterns

2022-02-28 Thread Jim Wilson via Gcc-patches
On Tue, Feb 8, 2022 at 4:35 AM Maciej W. Rozycki  wrote:

> gcc/
> * config/riscv/riscv.md (UNSPEC_FMIN, UNSPEC_FMAX): New
> constants.
> (fmin3, fmax3): New insns.
> ...


I tried testing on some of the hardware I have.  Both the HiFive Unleashed
(2018) and HiFive Unmatched (2021) implement the current definition of
fmin/fmax.  But the Allwinner Nezha (2021) implements the previous
definition of fmin/fmax.  SiFive was involved with the fmin/fmax change, so
it isn't surprising that they implemented the new semantics before other
companies.  The Nezha board with the T-Head C906 is a popular one, so we do
need to continue to support the 2017 spec, which your patch does with the
HONORS_SNAN checks.  I agree that we don't need to worry about spec
versions older than that.
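
For reference, here is how I understand the two definitions to differ, written as a C++ model on raw binary32 bit patterns. This is my reading of the two spec revisions, not something taken from the patch: under the previous definition a signaling NaN input yields the canonical NaN, while under the current one a single NaN input of either kind yields the other operand. The function names are made up, and exception flags are not modelled.

```cpp
#include <cstdint>
#include <cstring>

constexpr std::uint32_t kCanonicalNaN = 0x7fc00000u;

// NaN: exponent all ones and a non-zero fraction.
constexpr bool is_nan(std::uint32_t b) {
  return (b & 0x7f800000u) == 0x7f800000u && (b & 0x007fffffu) != 0;
}

// Signaling NaN: a NaN whose top fraction bit (the quiet bit) is clear.
constexpr bool is_snan(std::uint32_t b) {
  return is_nan(b) && (b & 0x00400000u) == 0;
}

// Reinterpret a bit pattern as float; only used on non-NaN inputs.
inline float to_float(std::uint32_t b) {
  float f;
  std::memcpy(&f, &b, sizeof f);
  return f;
}

// Minimum of two non-NaN values, treating -0.0 as smaller than +0.0.
inline std::uint32_t min_of_numbers(std::uint32_t a, std::uint32_t b) {
  float fa = to_float(a), fb = to_float(b);
  if (fa < fb) return a;
  if (fb < fa) return b;
  return (a & 0x80000000u) ? a : b;  // equal: prefer the negative zero
}

// Previous definition (assumed): any signaling NaN input, or two NaN
// inputs, gives the canonical NaN.
inline std::uint32_t fmin_previous(std::uint32_t a, std::uint32_t b) {
  if (is_snan(a) || is_snan(b)) return kCanonicalNaN;
  if (is_nan(a) && is_nan(b)) return kCanonicalNaN;
  if (is_nan(a)) return b;
  if (is_nan(b)) return a;
  return min_of_numbers(a, b);
}

// Current definition (assumed): only two NaN inputs give the canonical
// NaN; a single NaN input, signaling or not, gives the other operand.
inline std::uint32_t fmin_current(std::uint32_t a, std::uint32_t b) {
  if (is_nan(a) && is_nan(b)) return kCanonicalNaN;
  if (is_nan(a)) return b;
  if (is_nan(b)) return a;
  return min_of_numbers(a, b);
}
```

The two functions agree on every input that contains no signaling NaN, which is why gating on whether signaling NaNs are honoured is enough to cover both kinds of hardware.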

This looks OK to me.

I've got builds running in parallel on the Unleashed and Unmatched to test
but that will take a couple of days and I don't expect any problems since
you already tested it.  I could do a build on the Nezha if I had to, but
that would take at least a week as it is a much slower board and I'd rather
not do that unless I have to.  This is hardware implementing the older spec
that you probably haven't tested though.

Jim


[PATCH] Clear currently_expanding_gimple_stmt properly

2022-02-28 Thread H.J. Lu via Gcc-patches
commit a5883ba0de68efad36db145e75c86394d8bd44ea
Author: Michael Matz 
Date:   Tue Nov 24 15:37:32 2009 +

introduced currently_expanding_gimple_stmt, which was set and cleared in
expand_gimple_basic_block when expanding gimple statement to RTL.  But it
isn't cleared when expand_gimple_basic_block returns inside the loop.

PR middle-end/104721
* cfgexpand.cc (expand_gimple_basic_block): Clear
currently_expanding_gimple_stmt when returning inside the loop.
---
 gcc/cfgexpand.cc | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index d51af2e3084..87536ec7ccd 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -5927,7 +5927,10 @@ expand_gimple_basic_block (basic_block bb, bool disable_tail_calls)
{
  new_bb = expand_gimple_cond (bb, as_a <gcond *> (stmt));
  if (new_bb)
-   return new_bb;
+   {
+ currently_expanding_gimple_stmt = NULL;
+ return new_bb;
+   }
}
   else if (is_gimple_debug (stmt))
{
@@ -6049,7 +6052,10 @@ expand_gimple_basic_block (basic_block bb, bool disable_tail_calls)
  if (can_fallthru)
bb = new_bb;
  else
-   return new_bb;
+   {
+ currently_expanding_gimple_stmt = NULL;
+ return new_bb;
+   }
}
}
  else
-- 
2.35.1



[PATCH] Optimize signed DImode -> TImode on power10, PR target/104698

2022-02-28 Thread Michael Meissner via Gcc-patches
Optimize signed DImode -> TImode on power10, PR target/104698.

On power10, GCC tries to optimize the signed conversion from DImode to
TImode by using the vextsd2q instruction.  However to generate this
instruction, it would have to generate 3 direct moves (1 from the GPR
registers to the altivec registers, and 2 from the altivec registers to
the GPR register).

This patch adds code back in to use the shift right immediate instruction
to do the conversion if the target/source is GPR registers.
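
In plain C++ terms the GPR sequence amounts to the following. This is only an illustration of the arithmetic; Int128Parts and sign_extend_di_to_ti are made-up names, and the two halves stand for the low and high GPR of the TImode destination.

```cpp
#include <cstdint>

// The 128-bit result as two 64-bit halves.
struct Int128Parts {
  std::uint64_t lo;
  std::uint64_t hi;
};

// Sign-extend a 64-bit value to 128 bits: copy the value into the low
// half, then fill the high half with copies of the sign bit by shifting
// right arithmetically by 63.  (Right-shifting a negative signed value is
// arithmetic on the usual compilers and guaranteed so since C++20.)
inline Int128Parts sign_extend_di_to_ti(std::int64_t src) {
  Int128Parts r;
  r.lo = static_cast<std::uint64_t>(src);
  r.hi = static_cast<std::uint64_t>(src >> 63);
  return r;
}
```

That is one move plus one shift on GPRs, compared with the three direct moves around vextsd2q described above.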

2022-02-28   Michael Meissner  

gcc/
PR target/104698
* config/rs6000/vsx.md (mtvsrdd_diti_w1): Delete.
(extendditi2): Replace with code to deal with both GPR registers
and with altivec registers.
---
 gcc/config/rs6000/vsx.md | 73 
 1 file changed, 52 insertions(+), 21 deletions(-)

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index b53de103872..62464f67f4d 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5023,15 +5023,58 @@ (define_expand "vsignextend_si_v2di"
   DONE;
 })
 
-;; ISA 3.1 vector sign extend
-;; Move DI value from GPR to TI mode in VSX register, word 1.
-(define_insn "mtvsrdd_diti_w1"
-  [(set (match_operand:TI 0 "register_operand" "=wa")
-   (unspec:TI [(match_operand:DI 1 "register_operand" "r")]
-UNSPEC_MTVSRD_DITI_W1))]
-  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
-  "mtvsrdd %x0,0,%1"
-  [(set_attr "type" "vecmove")])
+;; Sign extend DI to TI.  We provide both GPR targets and Altivec targets.  If
+;; the register allocator prefers the GPRs, we won't have to move the value to
+;; the altivec registers, do the vextsd2q instruction and move it back.  If we
+;; aren't compiling for 64-bit power10, don't provide the service and let the
+;; machine independent code handle the extension.
+(define_insn_and_split "extendditi2"
+  [(set (match_operand:TI 0 "register_operand" "=r,r,v,v,v")
+   (sign_extend:TI (match_operand:DI 1 "input_operand" "r,m,r,wa,Z")))
+   (clobber (reg:DI CA_REGNO))]
+  "TARGET_POWERPC64 && TARGET_POWER10"
+  "#"
+  "&& reload_completed"
+  [(pc)]
+{
+  rtx dest = operands[0];
+  rtx src = operands[1];
+  int dest_regno = reg_or_subregno (dest);
+
+  /* Handle conversion to GPR registers.  Load up the low part and then do
+ a sign extension to the upper part.  */
+  if (INT_REGNO_P (dest_regno))
+{
+  rtx dest_hi = gen_highpart (DImode, dest);
+  rtx dest_lo = gen_lowpart (DImode, dest);
+
+  emit_move_insn (dest_lo, src);
+  emit_insn (gen_ashrdi3 (dest_hi, dest_lo, GEN_INT (63)));
+  DONE;
+}
+
+  /* For conversion to Altivec register, generate either a splat operation or
+ a load rightmost double word instruction.  Both instructions get the
+ DImode value into the lower 64 bits, and then do the vextsd2q
+ instruction.  */
+  else if (ALTIVEC_REGNO_P (dest_regno))
+{
+  if (MEM_P (src))
+   emit_insn (gen_vsx_lxvrdx (dest, src));
+  else
+   {
+ rtx dest_v2di = gen_rtx_REG (V2DImode, dest_regno);
+ emit_insn (gen_vsx_splat_v2di (dest_v2di, src));
+   }
+
+  emit_insn (gen_extendditi2_vector (dest, dest));
+  DONE;
+}
+
+  else
+gcc_unreachable ();
+}
+  [(set_attr "length" "8")])
 
 ;; Sign extend 64-bit value in TI reg, word 1, to 128-bit value in TI reg
 (define_insn "extendditi2_vector"
@@ -5042,18 +5085,6 @@ (define_insn "extendditi2_vector"
   "vextsd2q %0,%1"
   [(set_attr "type" "vecexts")])
 
-(define_expand "extendditi2"
-  [(set (match_operand:TI 0 "gpc_reg_operand")
-   (sign_extend:DI (match_operand:DI 1 "gpc_reg_operand")))]
-  "TARGET_POWER10"
-  {
-/* Move 64-bit src from GPR to vector reg and sign extend to 128-bits.  */
-rtx temp = gen_reg_rtx (TImode);
-emit_insn (gen_mtvsrdd_diti_w1 (temp, operands[1]));
-emit_insn (gen_extendditi2_vector (operands[0], temp));
-DONE;
-  })
-
 
 ;; ISA 3.0 Binary Floating-Point Support
 
-- 
2.35.1


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [PATCH] Check if loading const from mem is faster

2022-02-28 Thread Jiufu Guo via Gcc-patches
Segher Boessenkool  writes:

> On Thu, Feb 24, 2022 at 09:50:28AM +0100, Richard Biener wrote:
>> On Thu, 24 Feb 2022, Jiufu Guo wrote:
>> > And another thing as Segher pointed out, CSE is doing too
>> > much work.  It may be ok to separate the constant handling
>> > logic from CSE.
>> 
>> Not sure - CSE just is value numbering, I don't see that it does
>> more than that.  Yes, it might have developed "heuristics" over
>> the years what to CSE and to what and where to substitute and
>> where not.  But in the end it does just value numbering.
>
> It also does various micro-optimisations, like all the CC things it
> does.
>
> It is not very good at doing the CSE job, but it cannot easily be
> replaced by a better implementation because it does many other small
> optimisations (that are not done elsewhere).
>

Thanks a lot for these comments! I'm also wondering if we would
rewrite this cse.cc or refactor it in some aspects.

BR,
Jiufu

>
> Segher


Re: [PATCH] [i386] Replace ix86_gen_scratch_sse_rtx with gen_reg_rtx.

2022-02-28 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 1, 2022 at 10:27 AM H.J. Lu via Gcc-patches
 wrote:
>
> On Mon, Feb 28, 2022 at 6:03 PM liuhongt  wrote:
> >
> > .. in ix86_expand_vector_move and
> > ix86_convert_const_wide_int_to_broadcast(called by the former).
> >
> > ix86_expand_vector_move is called by emit_move_insn which is used by
> > many pre_reload passes, ix86_gen_scratch_sse_rtx will break data flow
> > when there's explicit usage of xmm7/xmm15/xmm31.
> >
> > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}
> > for both w/and w/o --with-cpu=native --with-arch=native.
> >
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > PR target/104704
> > * config/i386/i386-expand.cc
> > (ix86_convert_const_wide_int_to_broadcast): Replace
> > ix86_gen_scratch_sse_rtx with gen_reg_rtx.
> > (ix86_expand_vector_move): Ditto.
> > * config/i386/sse.md (*vec_dupv4si): Add alternative $r and
> > corresponding splitter after it.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/incoming-11.c: Revert r12-2665-g7f4c3943f795fd.
> > * gcc.target/i386/pr100865-11b.c: Expect vmovdqa or vmovdqa64.
> > * gcc.target/i386/pr100865-12b.c: Ditto.
> > * gcc.target/i386/pr100865-8b.c: Ditto.
> > * gcc.target/i386/pr100865-9b.c: Ditto.
> > * gcc.target/i386/pr82941-1.c: Expect vzeroupper for ! ia32.
> > * gcc.target/i386/pr82942-1.c: Ditto.
> > * gcc.target/i386/pr82990-1.c: Ditto.
> > * gcc.target/i386/pr82990-3.c: Ditto.
> > * gcc.target/i386/pr82990-5.c: Ditto.
> > ---
> >  gcc/config/i386/i386-expand.cc   |  6 +--
> >  gcc/config/i386/sse.md   | 41 +++-
> >  gcc/testsuite/gcc.target/i386/incoming-11.c  |  2 +-
> >  gcc/testsuite/gcc.target/i386/pr100865-11b.c |  2 +-
> >  gcc/testsuite/gcc.target/i386/pr100865-12b.c |  2 +-
> >  gcc/testsuite/gcc.target/i386/pr100865-8b.c  |  2 +-
> >  gcc/testsuite/gcc.target/i386/pr100865-9b.c  |  2 +-
> >  gcc/testsuite/gcc.target/i386/pr82941-1.c|  3 +-
> >  gcc/testsuite/gcc.target/i386/pr82942-1.c|  3 +-
> >  gcc/testsuite/gcc.target/i386/pr82990-1.c|  3 +-
> >  gcc/testsuite/gcc.target/i386/pr82990-3.c|  3 +-
> >  gcc/testsuite/gcc.target/i386/pr82990-5.c|  3 +-
> >  12 files changed, 45 insertions(+), 27 deletions(-)
> >
> > diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> > index faa0191c6dd..75a28cdd89d 100644
> > --- a/gcc/config/i386/i386-expand.cc
> > +++ b/gcc/config/i386/i386-expand.cc
> > @@ -257,7 +257,7 @@ ix86_convert_const_wide_int_to_broadcast (machine_mode 
> > mode, rtx op)
> >machine_mode vector_mode;
> > >if (!mode_for_vector (broadcast_mode, nunits).exists (&vector_mode))
> >  gcc_unreachable ();
> > -  rtx target = ix86_gen_scratch_sse_rtx (vector_mode);
> > +  rtx target = gen_reg_rtx (vector_mode);
>
> I think ix86_gen_scratch_sse_rtx should check
> currently_expanding_gimple_stmt == NULL
> to return gen_reg_rtx (vector_mode) instead.
>

I'm a bit worried about continuing to use the hard register, even if
only at the expand stage: if there is a recursive call to
expand_vector_move, we will still mess up the data flow,
i.e. there's an emit_move_insn call in ix86_expand_vector_init_duplicate.
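
To illustrate the concern above with a toy model (this is not GCC code;
XMM7, FIRST_PSEUDO, regs and all function names below are hypothetical
stand-ins): if every request for a temporary returns the same fixed
register, a nested expansion overwrites the value its caller still
needs, whereas a fresh pseudo per request keeps the two live ranges
apart.

```c
#include <assert.h>

#define XMM7 7            /* Hypothetical number of a fixed hard register.  */
#define FIRST_PSEUDO 100  /* Hypothetical first pseudo register number.  */
#define NUM_REGS 256

static int next_pseudo = FIRST_PSEUDO;
static int regs[NUM_REGS];

/* Always hands out the same hard register.  */
static int
fixed_scratch (void)
{
  return XMM7;
}

/* Hands out a new register number on every call.  */
static int
fresh_pseudo (void)
{
  assert (next_pseudo < NUM_REGS);
  return next_pseudo++;
}

/* Models a nested move that needs its own temporary.  */
static void
inner_move (int (*alloc) (void), int value)
{
  int tmp = alloc ();
  regs[tmp] = value;
}

/* Models an outer move: put VALUE in a temporary, trigger a nested
   move, then read the temporary back.  */
static int
outer_move (int (*alloc) (void), int value)
{
  int tmp = alloc ();
  regs[tmp] = value;
  inner_move (alloc, value + 1);
  return regs[tmp];
}
```

In this model, outer_move (fixed_scratch, 10) reads back the inner
value because both temporaries share one register, while
outer_move (fresh_pseudo, 10) reads back 10.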

> >bool ok = ix86_expand_vector_init_duplicate (false, vector_mode,
> >target,
> >GEN_INT (val_broadcast));
> > @@ -605,7 +605,7 @@ ix86_expand_vector_move (machine_mode mode, rtx 
> > operands[])
> >if (!register_operand (op0, mode)
> >   && !register_operand (op1, mode))
> > {
> > - rtx scratch = ix86_gen_scratch_sse_rtx (mode);
> > + rtx scratch = gen_reg_rtx (mode);
> >   emit_move_insn (scratch, op1);
> >   op1 = scratch;
> > }
> > @@ -647,7 +647,7 @@ ix86_expand_vector_move (machine_mode mode, rtx 
> > operands[])
> >&& !register_operand (op0, mode)
> >&& !register_operand (op1, mode))
> >  {
> > -  rtx tmp = ix86_gen_scratch_sse_rtx (GET_MODE (op0));
> > +  rtx tmp = gen_reg_rtx (GET_MODE (op0));
> >emit_move_insn (tmp, op1);
> >emit_move_insn (op0, tmp);
> >return;
> > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > index 3066ea3734a..d124545aa5d 100644
> > --- a/gcc/config/i386/sse.md
> > +++ b/gcc/config/i386/sse.md
> > @@ -25121,20 +25121,43 @@ (define_insn "vec_dupv4sf"
> > (set_attr "mode" "V4SF")])
> >
> >  (define_insn "*vec_dupv4si"
> > -  [(set (match_operand:V4SI 0 "register_operand" "=v,v,x")
> > +  [(set (match_operand:V4SI 0 "register_operand" "=v,v,x,v")
> > (vec_duplicate:V4SI
> > - (match_operand:SI 1 "nonimmediate_operand" "Yv,m,0")))]
> > + (match_operand:SI 1 "nonimmediate_operand" "Yv,m,0,$r")))]
> >"TARGET_SSE"
> >"@
> > %vpshufd\t{$0, %1, %0|%0, %1, 0}
> > 

Re: [PATCH] [i386] Replace ix86_gen_scratch_sse_rtx with gen_reg_rtx.

2022-02-28 Thread H.J. Lu via Gcc-patches
On Mon, Feb 28, 2022 at 6:26 PM H.J. Lu  wrote:
>
> On Mon, Feb 28, 2022 at 6:03 PM liuhongt  wrote:
> >
> > .. in ix86_expand_vector_move and
> > ix86_convert_const_wide_int_to_broadcast(called by the former).
> >
> > ix86_expand_vector_move is called by emit_move_insn which is used by
> > many pre_reload passes, ix86_gen_scratch_sse_rtx will break data flow
> > when there's explicit usage of xmm7/xmm15/xmm31.
> >
> > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}
> > for both w/and w/o --with-cpu=native --with-arch=native.
> >
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > PR target/104704
> > * config/i386/i386-expand.cc
> > (ix86_convert_const_wide_int_to_broadcast): Replace
> > ix86_gen_scratch_sse_rtx with gen_reg_rtx.
> > (ix86_expand_vector_move): Ditto.
> > * config/i386/sse.md (*vec_dupv4si): Add alternative $r and
> > corresponding splitter after it.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/incoming-11.c: Revert r12-2665-g7f4c3943f795fd.
> > * gcc.target/i386/pr100865-11b.c: Expect vmovdqa or vmovda64.
> > * gcc.target/i386/pr100865-12b.c: Ditto.
> > * gcc.target/i386/pr100865-8b.c: Ditto.
> > * gcc.target/i386/pr100865-9b.c: Ditto.
> > * gcc.target/i386/pr82941-1.c: Expect vzeroupper for ! ia32.
> > * gcc.target/i386/pr82942-1.c: Ditto.
> > * gcc.target/i386/pr82990-1.c: Ditto.
> > * gcc.target/i386/pr82990-3.c: Ditto.
> > * gcc.target/i386/pr82990-5.c: Ditto.
> > ---
> >  gcc/config/i386/i386-expand.cc   |  6 +--
> >  gcc/config/i386/sse.md   | 41 +++-
> >  gcc/testsuite/gcc.target/i386/incoming-11.c  |  2 +-
> >  gcc/testsuite/gcc.target/i386/pr100865-11b.c |  2 +-
> >  gcc/testsuite/gcc.target/i386/pr100865-12b.c |  2 +-
> >  gcc/testsuite/gcc.target/i386/pr100865-8b.c  |  2 +-
> >  gcc/testsuite/gcc.target/i386/pr100865-9b.c  |  2 +-
> >  gcc/testsuite/gcc.target/i386/pr82941-1.c|  3 +-
> >  gcc/testsuite/gcc.target/i386/pr82942-1.c|  3 +-
> >  gcc/testsuite/gcc.target/i386/pr82990-1.c|  3 +-
> >  gcc/testsuite/gcc.target/i386/pr82990-3.c|  3 +-
> >  gcc/testsuite/gcc.target/i386/pr82990-5.c|  3 +-
> >  12 files changed, 45 insertions(+), 27 deletions(-)
> >
> > diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> > index faa0191c6dd..75a28cdd89d 100644
> > --- a/gcc/config/i386/i386-expand.cc
> > +++ b/gcc/config/i386/i386-expand.cc
> > @@ -257,7 +257,7 @@ ix86_convert_const_wide_int_to_broadcast (machine_mode 
> > mode, rtx op)
> >machine_mode vector_mode;
> > >if (!mode_for_vector (broadcast_mode, nunits).exists (&vector_mode))
> >  gcc_unreachable ();
> > -  rtx target = ix86_gen_scratch_sse_rtx (vector_mode);
> > +  rtx target = gen_reg_rtx (vector_mode);
>
> I think ix86_gen_scratch_sse_rtx should check
> currently_expanding_gimple_stmt == NULL
> to return gen_reg_rtx (vector_mode) instead.

Like this:

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index b2bf90576d5..6c0e4929914 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -23786,7 +23786,7 @@ ix86_optab_supported_p (int op, machine_mode
mode1, machine_mode,
 rtx
 ix86_gen_scratch_sse_rtx (machine_mode mode)
 {
-  if (TARGET_SSE && !lra_in_progress)
+  if (TARGET_SSE && currently_expanding_gimple_stmt)
 {
   unsigned int regno;
   if (TARGET_64BIT)

> >bool ok = ix86_expand_vector_init_duplicate (false, vector_mode,
> >target,
> >GEN_INT (val_broadcast));
> > @@ -605,7 +605,7 @@ ix86_expand_vector_move (machine_mode mode, rtx 
> > operands[])
> >if (!register_operand (op0, mode)
> >   && !register_operand (op1, mode))
> > {
> > - rtx scratch = ix86_gen_scratch_sse_rtx (mode);
> > + rtx scratch = gen_reg_rtx (mode);
> >   emit_move_insn (scratch, op1);
> >   op1 = scratch;
> > }
> > @@ -647,7 +647,7 @@ ix86_expand_vector_move (machine_mode mode, rtx 
> > operands[])
> >&& !register_operand (op0, mode)
> >&& !register_operand (op1, mode))
> >  {
> > -  rtx tmp = ix86_gen_scratch_sse_rtx (GET_MODE (op0));
> > +  rtx tmp = gen_reg_rtx (GET_MODE (op0));
> >emit_move_insn (tmp, op1);
> >emit_move_insn (op0, tmp);
> >return;
> > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > index 3066ea3734a..d124545aa5d 100644
> > --- a/gcc/config/i386/sse.md
> > +++ b/gcc/config/i386/sse.md
> > @@ -25121,20 +25121,43 @@ (define_insn "vec_dupv4sf"
> > (set_attr "mode" "V4SF")])
> >
> >  (define_insn "*vec_dupv4si"
> > -  [(set (match_operand:V4SI 0 "register_operand" "=v,v,x")
> > +  [(set (match_operand:V4SI 0 "register_operand" "=v,v,x,v")
> > (vec_duplicate:V4SI
> > - 

Re: [PATCH] [i386] Replace ix86_gen_scratch_sse_rtx with gen_reg_rtx.

2022-02-28 Thread H.J. Lu via Gcc-patches
On Mon, Feb 28, 2022 at 6:03 PM liuhongt  wrote:
>
> .. in ix86_expand_vector_move and
> ix86_convert_const_wide_int_to_broadcast(called by the former).
>
> ix86_expand_vector_move is called by emit_move_insn which is used by
> many pre_reload passes, ix86_gen_scratch_sse_rtx will break data flow
> when there's explicit usage of xmm7/xmm15/xmm31.
>
> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}
> for both w/and w/o --with-cpu=native --with-arch=native.
>
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/104704
> * config/i386/i386-expand.cc
> (ix86_convert_const_wide_int_to_broadcast): Replace
> ix86_gen_scratch_sse_rtx with gen_reg_rtx.
> (ix86_expand_vector_move): Ditto.
> * config/i386/sse.md (*vec_dupv4si): Add alternative $r and
> corresponding splitter after it.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/incoming-11.c: Revert r12-2665-g7f4c3943f795fd.
> * gcc.target/i386/pr100865-11b.c: Expect vmovdqa or vmovda64.
> * gcc.target/i386/pr100865-12b.c: Ditto.
> * gcc.target/i386/pr100865-8b.c: Ditto.
> * gcc.target/i386/pr100865-9b.c: Ditto.
> * gcc.target/i386/pr82941-1.c: Expect vzeroupper for ! ia32.
> * gcc.target/i386/pr82942-1.c: Ditto.
> * gcc.target/i386/pr82990-1.c: Ditto.
> * gcc.target/i386/pr82990-3.c: Ditto.
> * gcc.target/i386/pr82990-5.c: Ditto.
> ---
>  gcc/config/i386/i386-expand.cc   |  6 +--
>  gcc/config/i386/sse.md   | 41 +++-
>  gcc/testsuite/gcc.target/i386/incoming-11.c  |  2 +-
>  gcc/testsuite/gcc.target/i386/pr100865-11b.c |  2 +-
>  gcc/testsuite/gcc.target/i386/pr100865-12b.c |  2 +-
>  gcc/testsuite/gcc.target/i386/pr100865-8b.c  |  2 +-
>  gcc/testsuite/gcc.target/i386/pr100865-9b.c  |  2 +-
>  gcc/testsuite/gcc.target/i386/pr82941-1.c|  3 +-
>  gcc/testsuite/gcc.target/i386/pr82942-1.c|  3 +-
>  gcc/testsuite/gcc.target/i386/pr82990-1.c|  3 +-
>  gcc/testsuite/gcc.target/i386/pr82990-3.c|  3 +-
>  gcc/testsuite/gcc.target/i386/pr82990-5.c|  3 +-
>  12 files changed, 45 insertions(+), 27 deletions(-)
>
> diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> index faa0191c6dd..75a28cdd89d 100644
> --- a/gcc/config/i386/i386-expand.cc
> +++ b/gcc/config/i386/i386-expand.cc
> @@ -257,7 +257,7 @@ ix86_convert_const_wide_int_to_broadcast (machine_mode 
> mode, rtx op)
>machine_mode vector_mode;
> >if (!mode_for_vector (broadcast_mode, nunits).exists (&vector_mode))
>  gcc_unreachable ();
> -  rtx target = ix86_gen_scratch_sse_rtx (vector_mode);
> +  rtx target = gen_reg_rtx (vector_mode);

I think ix86_gen_scratch_sse_rtx should check
currently_expanding_gimple_stmt == NULL
to return gen_reg_rtx (vector_mode) instead.

>bool ok = ix86_expand_vector_init_duplicate (false, vector_mode,
>target,
>GEN_INT (val_broadcast));
> @@ -605,7 +605,7 @@ ix86_expand_vector_move (machine_mode mode, rtx 
> operands[])
>if (!register_operand (op0, mode)
>   && !register_operand (op1, mode))
> {
> - rtx scratch = ix86_gen_scratch_sse_rtx (mode);
> + rtx scratch = gen_reg_rtx (mode);
>   emit_move_insn (scratch, op1);
>   op1 = scratch;
> }
> @@ -647,7 +647,7 @@ ix86_expand_vector_move (machine_mode mode, rtx 
> operands[])
>&& !register_operand (op0, mode)
>&& !register_operand (op1, mode))
>  {
> -  rtx tmp = ix86_gen_scratch_sse_rtx (GET_MODE (op0));
> +  rtx tmp = gen_reg_rtx (GET_MODE (op0));
>emit_move_insn (tmp, op1);
>emit_move_insn (op0, tmp);
>return;
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 3066ea3734a..d124545aa5d 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -25121,20 +25121,43 @@ (define_insn "vec_dupv4sf"
> (set_attr "mode" "V4SF")])
>
>  (define_insn "*vec_dupv4si"
> -  [(set (match_operand:V4SI 0 "register_operand" "=v,v,x")
> +  [(set (match_operand:V4SI 0 "register_operand" "=v,v,x,v")
> (vec_duplicate:V4SI
> - (match_operand:SI 1 "nonimmediate_operand" "Yv,m,0")))]
> + (match_operand:SI 1 "nonimmediate_operand" "Yv,m,0,$r")))]
>"TARGET_SSE"
>"@
> %vpshufd\t{$0, %1, %0|%0, %1, 0}
> vbroadcastss\t{%1, %0|%0, %1}
> -   shufps\t{$0, %0, %0|%0, %0, 0}"
> -  [(set_attr "isa" "sse2,avx,noavx")
> -   (set_attr "type" "sselog1,ssemov,sselog1")
> -   (set_attr "length_immediate" "1,0,1")
> -   (set_attr "prefix_extra" "0,1,*")
> -   (set_attr "prefix" "maybe_vex,maybe_evex,orig")
> -   (set_attr "mode" "TI,V4SF,V4SF")])
> +   shufps\t{$0, %0, %0|%0, %0, 0}
> +   #"
> +  [(set_attr "isa" "sse2,avx,noavx,noavx512vl")
> +   (set_attr "type" "sselog1,ssemov,sselog1,sselog1")
> +   (set_attr "length_immediate" "1,0,1,1")
> +   

[PATCH] [i386] Replace ix86_gen_scratch_sse_rtx with gen_reg_rtx.

2022-02-28 Thread liuhongt via Gcc-patches
.. in ix86_expand_vector_move and
ix86_convert_const_wide_int_to_broadcast(called by the former).

ix86_expand_vector_move is called by emit_move_insn which is used by
many pre_reload passes, ix86_gen_scratch_sse_rtx will break data flow
when there's explicit usage of xmm7/xmm15/xmm31.

Bootstrapped and regtested on x86_64-linux-gnu{-m32,}
for both w/and w/o --with-cpu=native --with-arch=native.

Ok for trunk?

gcc/ChangeLog:

PR target/104704
* config/i386/i386-expand.cc
(ix86_convert_const_wide_int_to_broadcast): Replace
ix86_gen_scratch_sse_rtx with gen_reg_rtx.
(ix86_expand_vector_move): Ditto.
* config/i386/sse.md (*vec_dupv4si): Add alternative $r and
corresponding splitter after it.

gcc/testsuite/ChangeLog:

* gcc.target/i386/incoming-11.c: Revert r12-2665-g7f4c3943f795fd.
* gcc.target/i386/pr100865-11b.c: Expect vmovdqa or vmovda64.
* gcc.target/i386/pr100865-12b.c: Ditto.
* gcc.target/i386/pr100865-8b.c: Ditto.
* gcc.target/i386/pr100865-9b.c: Ditto.
* gcc.target/i386/pr82941-1.c: Expect vzeroupper for ! ia32.
* gcc.target/i386/pr82942-1.c: Ditto.
* gcc.target/i386/pr82990-1.c: Ditto.
* gcc.target/i386/pr82990-3.c: Ditto.
* gcc.target/i386/pr82990-5.c: Ditto.
---
 gcc/config/i386/i386-expand.cc   |  6 +--
 gcc/config/i386/sse.md   | 41 +++-
 gcc/testsuite/gcc.target/i386/incoming-11.c  |  2 +-
 gcc/testsuite/gcc.target/i386/pr100865-11b.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr100865-12b.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr100865-8b.c  |  2 +-
 gcc/testsuite/gcc.target/i386/pr100865-9b.c  |  2 +-
 gcc/testsuite/gcc.target/i386/pr82941-1.c|  3 +-
 gcc/testsuite/gcc.target/i386/pr82942-1.c|  3 +-
 gcc/testsuite/gcc.target/i386/pr82990-1.c|  3 +-
 gcc/testsuite/gcc.target/i386/pr82990-3.c|  3 +-
 gcc/testsuite/gcc.target/i386/pr82990-5.c|  3 +-
 12 files changed, 45 insertions(+), 27 deletions(-)

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index faa0191c6dd..75a28cdd89d 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -257,7 +257,7 @@ ix86_convert_const_wide_int_to_broadcast (machine_mode 
mode, rtx op)
   machine_mode vector_mode;
   if (!mode_for_vector (broadcast_mode, nunits).exists (&vector_mode))
 gcc_unreachable ();
-  rtx target = ix86_gen_scratch_sse_rtx (vector_mode);
+  rtx target = gen_reg_rtx (vector_mode);
   bool ok = ix86_expand_vector_init_duplicate (false, vector_mode,
   target,
   GEN_INT (val_broadcast));
@@ -605,7 +605,7 @@ ix86_expand_vector_move (machine_mode mode, rtx operands[])
   if (!register_operand (op0, mode)
  && !register_operand (op1, mode))
{
- rtx scratch = ix86_gen_scratch_sse_rtx (mode);
+ rtx scratch = gen_reg_rtx (mode);
  emit_move_insn (scratch, op1);
  op1 = scratch;
}
@@ -647,7 +647,7 @@ ix86_expand_vector_move (machine_mode mode, rtx operands[])
   && !register_operand (op0, mode)
   && !register_operand (op1, mode))
 {
-  rtx tmp = ix86_gen_scratch_sse_rtx (GET_MODE (op0));
+  rtx tmp = gen_reg_rtx (GET_MODE (op0));
   emit_move_insn (tmp, op1);
   emit_move_insn (op0, tmp);
   return;
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 3066ea3734a..d124545aa5d 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -25121,20 +25121,43 @@ (define_insn "vec_dupv4sf"
(set_attr "mode" "V4SF")])
 
 (define_insn "*vec_dupv4si"
-  [(set (match_operand:V4SI 0 "register_operand" "=v,v,x")
+  [(set (match_operand:V4SI 0 "register_operand" "=v,v,x,v")
(vec_duplicate:V4SI
- (match_operand:SI 1 "nonimmediate_operand" "Yv,m,0")))]
+ (match_operand:SI 1 "nonimmediate_operand" "Yv,m,0,$r")))]
   "TARGET_SSE"
   "@
%vpshufd\t{$0, %1, %0|%0, %1, 0}
vbroadcastss\t{%1, %0|%0, %1}
-   shufps\t{$0, %0, %0|%0, %0, 0}"
-  [(set_attr "isa" "sse2,avx,noavx")
-   (set_attr "type" "sselog1,ssemov,sselog1")
-   (set_attr "length_immediate" "1,0,1")
-   (set_attr "prefix_extra" "0,1,*")
-   (set_attr "prefix" "maybe_vex,maybe_evex,orig")
-   (set_attr "mode" "TI,V4SF,V4SF")])
+   shufps\t{$0, %0, %0|%0, %0, 0}
+   #"
+  [(set_attr "isa" "sse2,avx,noavx,noavx512vl")
+   (set_attr "type" "sselog1,ssemov,sselog1,sselog1")
+   (set_attr "length_immediate" "1,0,1,1")
+   (set_attr "prefix_extra" "0,1,*,0")
+   (set_attr "prefix" "maybe_vex,maybe_evex,orig,maybe_vex")
+   (set_attr "mode" "TI,V4SF,V4SF,TI")
+   (set (attr "preferred_for_speed")
+ (cond [(eq_attr "alternative" "3")
+ (symbol_ref "TARGET_INTER_UNIT_MOVES_TO_VEC")
+  ]
+  (symbol_ref "true")))])
+
+(define_split
+  [(set (match_operand:V4SI 0 "sse_reg_operand")
+   

Re: [PATCH] openmp, fortran: Check that event handles passed to detach clauses are not arrays [PR104131]

2022-02-28 Thread Mikael Morin

On 28/02/2022 at 22:37, Jakub Jelinek wrote:

On Mon, Feb 28, 2022 at 09:45:10PM +0100, Mikael Morin wrote:

On 28/02/2022 at 21:37, Mikael Morin wrote:

Maybe corank should be checked together with rank?


Lesson learned today: expressions don’t have a corank.


There is gfc_is_coindexed that can be used.


Does pr104131-2.f90 really need to be rejected?


OpenMP 5.2 says that detach clause should be treated as if it appears on a
firstprivate clause and for the privatization clauses says:
"A private variable must not be coindexed or appear as an actual argument to a 
procedure where
the corresponding dummy argument is a coarray."
5.1 had the same restriction.



There is also:


A variable that is part of another variable (as an array element or a structure 
element) cannot
appear in a detach clause.


which tells that the check should be on expr->ref instead of 
expr->sym->as or expr->rank.


None of the above excerpts from the spec forbids pr104131.f90 or 
pr104131-2.f90, by the way.


[committed] [PR104637] LRA: Split hard regs as many as possible on one subpass

2022-02-28 Thread Vladimir Makarov via Gcc-patches

The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104637

The patch was successfully bootstrapped and tested on x86-64, aarch64, 
and ppc64.


commit ec1b9ba2d7913fe5e9deacc8e55e7539262f5124
Author: Vladimir N. Makarov 
Date:   Mon Feb 28 16:43:50 2022 -0500

[PR104637] LRA: Split hard regs as many as possible on one subpass

LRA hard reg split subpass is a small subpass used as the last
resort for LRA when it cannot assign a hard reg to a reload
pseudo by other means (e.g. by spilling non-reload pseudos).  For
simplicity the subpass performs one split at a time (as each split
changes pseudo live range info).  In this case it results in
reaching the maximal possible number of subpasses.  The patch
implements as many non-overlapping hard reg splits as possible
on each subpass.

gcc/ChangeLog:

PR rtl-optimization/104637
* lra-assigns.cc (lra_split_hard_reg_for): Split hard regs as many
as possible on one subpass.

gcc/testsuite/ChangeLog:

PR rtl-optimization/104637
* gcc.target/i386/pr104637.c: New.

diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc
index c1d40ea2a14..ab3a6e6e9cc 100644
--- a/gcc/lra-assigns.cc
+++ b/gcc/lra-assigns.cc
@@ -1774,8 +1774,8 @@ lra_split_hard_reg_for (void)
  iterations.  Either it's an asm and something is wrong with the
  constraints, or we have run out of spill registers; error out in
  either case.  */
-  bool asm_p = false;
-  bitmap_head failed_reload_insns, failed_reload_pseudos;
+  bool asm_p = false, spill_p = false;
+  bitmap_head failed_reload_insns, failed_reload_pseudos, over_split_insns;
   
   if (lra_dump_file != NULL)
 fprintf (lra_dump_file,
@@ -1786,6 +1786,7 @@ lra_split_hard_reg_for (void)
   bitmap_ior (_reload_pseudos, _inheritance_pseudos, _split_regs);
   bitmap_ior_into (_reload_pseudos, _subreg_reload_pseudos);
   bitmap_ior_into (_reload_pseudos, _optional_reload_pseudos);
+  bitmap_initialize (_split_insns, _obstack);
   for (i = lra_constraint_new_regno_start; i < max_regno; i++)
 if (reg_renumber[i] < 0 && lra_reg_info[i].nrefs != 0
 	&& (rclass = lra_get_allocno_class (i)) != NO_REGS
@@ -1793,14 +1794,41 @@ lra_split_hard_reg_for (void)
   {
 	if (! find_reload_regno_insns (i, first, last))
 	  continue;
-	if (BLOCK_FOR_INSN (first) == BLOCK_FOR_INSN (last)
-	&& spill_hard_reg_in_range (i, rclass, first, last))
+	if (BLOCK_FOR_INSN (first) == BLOCK_FOR_INSN (last))
 	  {
-	bitmap_clear (_reload_pseudos);
-	return true;
+	/* Check that we are not trying to split over the same insn
+	   requiring reloads to avoid splitting the same hard reg twice or
+	   more.  If we need several hard regs splitting over the same insn
+	   it can be finished on the next iterations.
+
+	   The following loop iteration number is small as we split hard
+	   reg in a very small range.  */
+	for (insn = first;
+		 insn != NEXT_INSN (last);
+		 insn = NEXT_INSN (insn))
+	  if (bitmap_bit_p (_split_insns, INSN_UID (insn)))
+		break;
+	if (insn != NEXT_INSN (last)
+		|| !spill_hard_reg_in_range (i, rclass, first, last))
+	  {
+		bitmap_set_bit (_reload_pseudos, i);
+	  }
+	else
+	  {
+		for (insn = first;
+		 insn != NEXT_INSN (last);
+		 insn = NEXT_INSN (insn))
+		  bitmap_set_bit (_split_insns, INSN_UID (insn));
+		spill_p = true;
+	  }
 	  }
-	bitmap_set_bit (_reload_pseudos, i);
   }
+  bitmap_clear (_split_insns);
+  if (spill_p)
+{
+  bitmap_clear (_reload_pseudos);
+  return true;
+}
   bitmap_clear (_reload_pseudos);
   bitmap_initialize (_reload_insns, _obstack);
   EXECUTE_IF_SET_IN_BITMAP (_reload_pseudos, 0, u, bi)
diff --git a/gcc/testsuite/gcc.target/i386/pr104637.c b/gcc/testsuite/gcc.target/i386/pr104637.c
new file mode 100644
index 000..65e8635d55e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr104637.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-Og -fno-forward-propagate -mavx" } */
+
+typedef short __attribute__((__vector_size__ (64))) U;
+typedef unsigned long long __attribute__((__vector_size__ (32))) V;
+typedef long double __attribute__((__vector_size__ (64))) F;
+
+int i;
+U u;
+F f;
+
+void
+foo (char a, char b, _Complex char c, V v)
+{
+  u = (U) { u[0] / 0, u[1] / 0, u[2] / 0, u[3] / 0, u[4] / 0, u[5] / 0, u[6] / 0, u[7] / 0,
+	u[8] / 0, u[0] / 0, u[9] / 0, u[10] / 0, u[11] / 0, u[12] / 0, u[13] / 0, u[14] / 0, u[15] / 0,
+	u[16] / 0, u[17] / 0, u[18] / 0, u[19] / 0, u[20] / 0, u[21] / 0, u[22] / 0, u[23] / 0,
+	u[24] / 0, u[25] / 0, u[26] / 0, u[27] / 0, u[28] / 0, u[29] / 0, u[30] / 0, u[31] / 0 };
+  c += i;
+  f = (F) { v[0], v[1], v[2], v[3] };
+  i = (char) (__imag__ c + i);
+}
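
As a rough illustration of the idea in the commit message above (a
simplified model, not the LRA code): several splits are accepted in one
pass as long as their insn ranges do not touch an insn already claimed
by an earlier accepted split, and the rest wait for a later pass.  The
names split_request, MAX_INSNS and select_non_overlapping are
hypothetical, and the model leaves out the real patch's additional
requirement that spill_hard_reg_in_range succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_INSNS 64

/* Hypothetical model: a reload pseudo needs a hard reg over the
   inclusive insn range [first, last].  */
struct split_request
{
  int first;
  int last;
};

/* Accepts each request whose range contains no insn already claimed by
   an earlier accepted request.  Sets accepted[i] for every request and
   returns the number accepted.  */
static int
select_non_overlapping (const struct split_request *req, size_t n,
                        bool *accepted)
{
  bool claimed[MAX_INSNS] = { false };
  int count = 0;

  for (size_t i = 0; i < n; i++)
    {
      assert (req[i].first >= 0 && req[i].first <= req[i].last
              && req[i].last < MAX_INSNS);

      /* Look for an insn in the range that is already claimed.  */
      bool overlap = false;
      for (int insn = req[i].first; insn <= req[i].last; insn++)
        if (claimed[insn])
          {
            overlap = true;
            break;
          }

      accepted[i] = !overlap;
      if (overlap)
        continue;

      /* Claim the whole range so later requests cannot reuse it.  */
      for (int insn = req[i].first; insn <= req[i].last; insn++)
        claimed[insn] = true;
      count++;
    }
  return count;
}
```

With the ranges {2,4}, {4,6} and {7,9}, the first and third are
accepted in the same pass and the second is deferred, since it shares
insn 4 with the first.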


Re: [PATCH] PR fortran/104573 - ICE in resolve_structure_cons, at fortran/resolve.cc:1299

2022-02-28 Thread Mikael Morin

On 28/02/2022 at 22:32, Mikael Morin wrote:

So please use a condition on expr->ts.type instead.
I said «instead», but «as well» is more appropriate; both expr.ts.type 
and expr.ts.u.derived conditions are probably necessary.


Re: [PATCH] openmp, fortran: Check that event handles passed to detach clauses are not arrays [PR104131]

2022-02-28 Thread Jakub Jelinek via Gcc-patches
On Mon, Feb 28, 2022 at 09:45:10PM +0100, Mikael Morin wrote:
> On 28/02/2022 at 21:37, Mikael Morin wrote:
> > Maybe corank should be checked together with rank?
> 
> Lesson learned today: expressions don’t have a corank.
> Does pr104131-2.f90 really need to be rejected?

OpenMP 5.2 says that detach clause should be treated as if it appears on a
firstprivate clause and for the privatization clauses says:
"A private variable must not be coindexed or appear as an actual argument to a 
procedure where
the corresponding dummy argument is a coarray."
5.1 had the same restriction.

Jakub



Re: [PATCH] PR fortran/104573 - ICE in resolve_structure_cons, at fortran/resolve.cc:1299

2022-02-28 Thread Mikael Morin

On 16/02/2022 at 22:20, Harald Anlauf via Fortran wrote:

Dear Fortranners,

while we detect invalid uses of type(*), we may run into other issues
later when the declared variable is used, leading to an ICE due to a
NULL pointer dereference.  This is demonstrated by Gerhard's testcase.

Steve and I came to rather similar fixes, see PR.  Mine is attached.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald




diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 266e41e25b1..2fa1acdbd6d 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -1288,15 +1288,17 @@ resolve_structure_cons (gfc_expr *expr, int init)
}
 }

-  cons = gfc_constructor_first (expr->value.constructor);
-
   /* A constructor may have references if it is the result of substituting a
  parameter variable.  In this case we just pull out the component we
  want.  */
   if (expr->ref)
 comp = expr->ref->u.c.sym->components;
-  else
+  else if (expr->ts.u.derived)
 comp = expr->ts.u.derived->components;


These unprotected union accesses always make me nervous.
I have tried (hard) to exhibit a case not fixed by your patch,
and I have found the case below that almost qualifies, except that there 
is an ICE before anything can happen.

With a minor tweak to prevent the ICE, the problem does appear.

program p
  type t
integer :: a
  end type
  character(3), parameter :: x = t(2)
  character(3), parameter :: y = x
  print *, y
end

In that case the character length information occupies the same space as 
a derived type symbol; the else-if condition evaluates to true, and 
everything breaks from there.


So please use a condition on expr->ts.type instead.
I think the relevant values associated with ts->u.derived are 
BT_DERIVED, BT_CLASS and BT_UNION.


OK with that change.

Thanks, and sorry for the time I took before looking at it.
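
For readers less familiar with the issue, here is a minimal sketch of
why testing the union member alone is not enough.  The types below are
hypothetical, much-simplified stand-ins for gfortran's typespec (only
the BT_* names are taken from the message above); the real definitions
in the front end have many more members.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the basic type tag.  */
enum bt { BT_INTEGER, BT_CHARACTER, BT_DERIVED, BT_CLASS, BT_UNION };

struct derived_sym { int n_components; };
struct charlen { int length; };

struct typespec
{
  enum bt type;
  union
  {
    struct derived_sym *derived; /* Valid for BT_DERIVED/BT_CLASS/BT_UNION.  */
    struct charlen *cl;          /* Valid for BT_CHARACTER.  */
  } u;
};

/* Unsafe: for a character typespec, u.cl shares storage with
   u.derived, so this is true for the wrong kind of type.  */
static bool
has_derived_unsafe (const struct typespec *ts)
{
  return ts->u.derived != NULL;
}

/* Safer: consult the type tag before looking at the union member.  */
static bool
has_derived_checked (const struct typespec *ts)
{
  return (ts->type == BT_DERIVED || ts->type == BT_CLASS
          || ts->type == BT_UNION)
         && ts->u.derived != NULL;
}

/* Only reads the derived-type member after the tag has been checked.  */
static int
component_count (const struct typespec *ts)
{
  assert (has_derived_checked (ts));
  return ts->u.derived->n_components;
}
```

With a BT_CHARACTER typespec whose u.cl is set, has_derived_unsafe
returns true while has_derived_checked returns false, which mirrors
the character(3) parameter case above.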


[committed] d: Merge upstream dmd cf63dd8e5, druntime caf14b0f, phobos 41aaf8c26.

2022-02-28 Thread Iain Buclaw via Gcc-patches
Hi,

This patch merges the D front-end implementation with upstream dmd
cf63dd8e5, as well as the D runtime libraries with druntime caf14b0f,
and phobos 41aaf8c26, synchronizing with the release of 2.099.0-rc1.

D front-end changes:

- Import dmd v2.099.0-rc.1.
- `main' can now have return type `noreturn' and supports return
  inference.

D Runtime changes:

- Import druntime v2.099.0-rc.1.
- C bindings for stat_t on powerpc-linux have been fixed.

Phobos changes:

- Import phobos v2.099.0-rc.1.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and
powerpc-linux-gnu.  Committed to mainline.

Regards,
Iain.

---
gcc/d/ChangeLog:

* d-target.cc (Target::_init): Initialize C type size fields.
* dmd/MERGE: Merge upstream dmd cf63dd8e5.
* dmd/VERSION: Update version to v2.099.0-rc.1.

libphobos/ChangeLog:

* libdruntime/MERGE: Merge upstream druntime caf14b0f.
* src/MERGE: Merge upstream phobos 41aaf8c26.

gcc/testsuite/ChangeLog:

* gdc.dg/torture/simd7413a.d: Update.
* gdc.dg/ubsan/pr88957.d: Update.
* gdc.dg/simd18489.d: New test.
* gdc.dg/torture/simd21727.d: New test.
---
 gcc/d/d-target.cc |   9 +-
 gcc/d/dmd/MERGE   |   2 +-
 gcc/d/dmd/VERSION |   2 +-
 gcc/d/dmd/common/outbuffer.d  |  32 +-
 gcc/d/dmd/cparse.d|  66 +-
 gcc/d/dmd/cppmangle.d |  44 +-
 gcc/d/dmd/dmangle.d   | 626 +-
 gcc/d/dmd/dmodule.d   |   8 +
 gcc/d/dmd/dsymbolsem.d|   3 +-
 gcc/d/dmd/expressionsem.d |   6 +-
 gcc/d/dmd/file_manager.d  |   6 +-
 gcc/d/dmd/func.d  |  19 +-
 gcc/d/dmd/lexer.d |  12 +-
 gcc/d/dmd/mtype.d |   8 +
 gcc/d/dmd/root/file.d |  40 +-
 gcc/d/dmd/root/speller.d  |  23 +-
 gcc/d/dmd/root/string.d   |  11 +-
 gcc/d/dmd/semantic3.d |  22 +-
 gcc/d/dmd/target.d|   4 +
 gcc/d/dmd/target.h|   4 +
 gcc/d/dmd/tokens.h|  20 +-
 gcc/d/dmd/traits.d|   7 +-
 gcc/d/dmd/typesem.d   |  13 +-
 gcc/testsuite/gdc.dg/simd18489.d  |   8 +
 .../ice21727.d => gdc.dg/torture/simd21727.d} |  11 +-
 gcc/testsuite/gdc.dg/torture/simd7413a.d  |   1 -
 gcc/testsuite/gdc.dg/ubsan/pr88957.d  |   3 +-
 gcc/testsuite/gdc.test/compilable/b18489.d|   8 -
 .../gdc.test/compilable/issue21390.d  |   3 +
 .../gdc.test/fail_compilation/fail17927.d |   2 +-
 .../gdc.test/fail_compilation/fix17751.d  |  22 -
 .../gdc.test/fail_compilation/issue22826.d|   7 +
 .../gdc.test/fail_compilation/test21546.d |  59 ++
 .../gdc.test/fail_compilation/test22023.d |  26 +
 .../gdc.test/fail_compilation/test22818.d |  21 +
 gcc/testsuite/gdc.test/runnable/nan.d |  17 +-
 gcc/testsuite/gdc.test/runnable/previewin.d   |   6 +-
 gcc/testsuite/gdc.test/runnable/sroa13220.d   | 103 ---
 gcc/testsuite/gdc.test/runnable/test15.d  |   2 +-
 gcc/testsuite/gdc.test/runnable/testconst.d   |  16 +-
 gcc/testsuite/gdc.test/runnable/testscope2.d  |   2 +-
 .../runnable/traits_getPointerBitmap.d|   2 +-
 libphobos/libdruntime/MERGE   |   2 +-
 libphobos/libdruntime/core/gc/gcinterface.d   |   4 +-
 libphobos/libdruntime/core/internal/gc/bits.d |  12 +-
 .../core/internal/gc/impl/conservative/gc.d   | 257 ---
 .../libdruntime/core/internal/gc/pooltable.d  |  29 +-
 .../libdruntime/core/internal/gc/proxy.d  |   4 +-
 libphobos/libdruntime/core/memory.d   |   4 +-
 libphobos/libdruntime/core/stdcpp/string.d|   8 +-
 .../libdruntime/core/sys/posix/sys/stat.d |  85 ++-
 libphobos/libdruntime/core/time.d | 158 +++--
 libphobos/libdruntime/object.d|  13 +-
 libphobos/src/MERGE   |   2 +-
 libphobos/src/std/file.d  |   4 +-
 libphobos/src/std/getopt.d|   8 +-
 libphobos/src/std/range/primitives.d  |  11 +-
 libphobos/src/std/sumtype.d   | 108 ++-
 58 files changed, 1164 insertions(+), 851 deletions(-)
 create mode 100644 gcc/testsuite/gdc.dg/simd18489.d
 rename gcc/testsuite/{gdc.test/runnable/ice21727.d => 
gdc.dg/torture/simd21727.d} (71%)
 delete mode 100644 gcc/testsuite/gdc.test/compilable/b18489.d
 create mode 100644 gcc/testsuite/gdc.test/compilable/issue21390.d
 delete mode 100644 gcc/testsuite/gdc.test/fail_compilation/fix17751.d
 create mode 100644 gcc/testsuite/gdc.test/fail_compilation/issue22826.d
 create mode 100644 

Re: [x86_64 PATCH] PR tree-opt/91384: peephole2 to eliminate testl after negl.

2022-02-28 Thread Uros Bizjak via Gcc-patches
On Mon, Feb 28, 2022 at 6:36 PM Roger Sayle  wrote:
>
>
> This patch is my proposed solution to PR tree-optimization/91384 which is
> a missed-optimization/code quality regression on x86_64.  The problematic
> idiom is "if (r = -a)" which is equivalent to both "r = -a; if (r != 0)"
> and alternatively "r = -a; if (a != 0)".  In this particular case, on
> x86_64, we prefer to use the condition codes from the negation, rather
> than require an explicit testl instruction.
>
> Unfortunately, combine can't help, as it doesn't attempt to merge pairs
> of instructions that share the same operand(s), only pairs/triples of
> instructions where the result of each instruction feeds the next.  But
> I doubt there's sufficient benefit to attempt this kind of "combination"
> (that wouldn't already be caught by the tree-ssa passes).
>
> Fortunately, it's relatively easy to fix this up (addressing the
> regression) during peephole2 to eliminate the unnecessary testl in:
>
> movl%edi, %ebx
> negl%ebx
> testl   %edi, %edi
> je  .L2
>
> Tested on x86_64-pc-linux-gnu with make bootstrap and make -k check,
> both with and without --target_board='unix{-m32\ -march=cascadelake}'
> with no new failures.  Ok for mainline?
>
>
> 2022-02-28  Roger Sayle  
>
> gcc/ChangeLog
> PR tree-optimization/91384
> * config/i386/i386.md (peephole2): Eliminate final testl insn
> from the sequence *movsi_internal, *negsi_1, *cmpsi_ccno_1 by
> transforming using *negsi_2 for the negation.
>
> gcc/testsuite/ChangeLog
> PR tree-optimization/91384
> * gcc.target/i386/pr91384.c: New test case.

OK.

Thanks,
Uros.

>
> Thanks in advance,
> Roger
> --
>
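
For reference, a small C sketch of the idiom discussed in the patch
description (the function names are made up for illustration, and this
is not the new test case): the branch condition can be computed either
from the negated result or from the original operand, because -a is
nonzero exactly when a is nonzero.  The sketch assumes a is not
INT_MIN, since negating INT_MIN overflows.

```c
#include <assert.h>

/* Stores -a through R and tests the negated result.  */
static int
negate_and_test_result (int a, int *r)
{
  if ((*r = -a))
    return 1;
  return 0;
}

/* Stores -a through R and tests the original operand instead.  */
static int
negate_and_test_operand (int a, int *r)
{
  *r = -a;
  return a != 0;
}

/* Checks that both forms agree for a given input.  */
static void
check_equivalence (int a)
{
  int r1, r2;
  int t1 = negate_and_test_result (a, &r1);
  int t2 = negate_and_test_operand (a, &r2);
  assert (t1 == t2);
  assert (r1 == r2);
}
```

Either form lets the branch be driven by the flags of the negation
itself, which is what makes the separate test instruction redundant.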


Re: [PATCH] openmp, fortran: Check that event handles passed to detach clauses are not arrays [PR104131]

2022-02-28 Thread Mikael Morin

On 28/02/2022 at 21:37, Mikael Morin wrote:

Maybe corank should be checked together with rank?


Lesson learned today: expressions don’t have a corank.
Does pr104131-2.f90 really need to be rejected?


Re: [PATCH] openmp, fortran: Check that event handles passed to detach clauses are not arrays [PR104131]

2022-02-28 Thread Mikael Morin

On 28/02/2022 at 19:38, Kwok Cheung Yeung wrote:


In gfc_expression_rank, e->ref is non-NULL, so e->rank is not set from 
the symtree. It then iterates through the ref elements - ref->type == 
REF_ARRAY and ref->u.ar.type == AR_ELEMENT, so e->rank remains at 0.



This is the expected behavior.


I'll move the check to resolve_omp_clauses and see if it works there.


It won’t work differently there.

Looking at the testcases, the rank should be 1 for pr104131.f90 and 0 
for pr104131-2.f90.

A scalar coarray remains a scalar; its rank is 0.
Maybe corank should be checked together with rank?


Re: [PATCH] Fix error recovery in toplev::finalize.

2022-02-28 Thread David Malcolm via Gcc-patches
On Mon, 2022-02-28 at 18:47 +0100, Richard Biener wrote:
> 
> 
> > > On 28.02.2022 at 16:31, David Malcolm via Gcc-patches <
> > > gcc-patches@gcc.gnu.org> wrote:
> > 
> > On Mon, 2022-02-28 at 12:49 +0100, Martin Liška wrote:
> > > Use flag_checking instead of CHECKING_P
> > > and run toplev::finalize only if there is not error seen.
> > > 
> > > Patch can bootstrap on x86_64-linux-gnu and survives regression
> > > tests.
> > 
> > > Did the testing include the libgccjit test suite?  ("jit" is not in
> > > --enable-languages=all)
> > 
> > > 
> > > Ready to be installed?
> > 
> > I'm not keen on this change; IIRC it's valid to attempt to compile a
> > gcc_jit_context that fails with an error, and then to attempt a
> > different gcc_jit_context that succeeds, within the same process.  If
> > > I'm reading the patch right, the patch as written removes this cleanup,
> > > which would thwart that.
> > 
> > I can try to cook up a testcase for the above use case.
> > 
> > Is there another way to fix PR 104648?
> 
> The function was never called on a release checking build btw.  Is
> there something like flag_jit one could test?

Sorry, I was misremembering - with libgccjit, toplev.finalize () is
called from playback::context::compile in jit/jit-playback.cc, not here
from main.cc

So this cleanup would still be called for libgccjit, and the patch
doesn't affect that.

Looking at PR ipa/104648, it seems to only be triggerable from the C++
frontend, so can't affect libgccjit.

So I think the patch is OK; sorry for the noise.
Dave


> 
> > Thanks
> > Dave
> > 
> > 
> > 
> > > Thanks,
> > > Martin
> > > 
> > >     PR ipa/104648
> > > 
> > > gcc/ChangeLog:
> > > 
> > >     * main.cc (main): Use flag_checking instead of CHECKING_P
> > >     and run toplev::finalize only if there is not error seen.
> > > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > >     * g++.dg/pr104648.C: New test.
> > > ---
> > >   gcc/main.cc | 6 +++---
> > >   gcc/testsuite/g++.dg/pr104648.C | 9 +
> > >   2 files changed, 12 insertions(+), 3 deletions(-)
> > >   create mode 100644 gcc/testsuite/g++.dg/pr104648.C
> > > 
> > > diff --git a/gcc/main.cc b/gcc/main.cc
> > > index f9dd6b2af58..4ba28b7de53 100644
> > > --- a/gcc/main.cc
> > > +++ b/gcc/main.cc
> > > @@ -37,9 +37,9 @@ main (int argc, char **argv)
> > >  true /* init_signals */);
> > >   
> > >     int r = toplev.main (argc, argv);
> > > -#if CHECKING_P
> > > -  toplev.finalize ();
> > > -#endif
> > > +
> > > +  if (flag_checking && !seen_error ())
> > > +    toplev.finalize ();
> > >   
> > >     return r;
> > >   }
> > > diff --git a/gcc/testsuite/g++.dg/pr104648.C
> > > b/gcc/testsuite/g++.dg/pr104648.C
> > > new file mode 100644
> > > index 000..b8b7c2864cf
> > > --- /dev/null
> > > +++ b/gcc/testsuite/g++.dg/pr104648.C
> > > @@ -0,0 +1,9 @@
> > > +// { dg-do compile }
> > > +// { dg-options "-fvtable-verify=preinit" }
> > > +
> > > +struct A {};
> > > +struct B : virtual A
> > > +{
> > > +  B () {};
> > > +  B () {}; /* { dg-error "cannot be overloaded with" } */
> > > +};
> > 
> > 
> 




Re: [PATCH] openmp, fortran: Check that event handles passed to detach clauses are not arrays [PR104131]

2022-02-28 Thread Kwok Cheung Yeung

On 28/02/2022 5:37 pm, Jakub Jelinek wrote:

On Mon, Feb 28, 2022 at 06:33:15PM +0100, Mikael Morin wrote:

It is true that the spots I saw in fortran/openmp.cc that test rank look
like:
  if (!gfc_resolve_expr (el->expr)
  || el->expr->ts.type != BT_INTEGER || el->expr->rank != 0)
etc., so probably !gfc_resolve_expr call is missing.


As long as the expression is expected to not be a (contained) function call,
I think it should work.

In the general case non-syntactic errors are preferably checked and reported
later at resolution stage, where contained functions are known.


Oh, I've missed that it is done during parsing and not during resolution.
That !gfc_resolve_expr call and the checking if it is BT_INTEGER etc.
should be certainly moved to resolve_omp_clauses.



Calling gfc_resolve_expr does not work to update the rank when called 
from gfc_match_omp_detach:


(gdb) p *e->ref
$3 = {type = REF_ARRAY, u = {ar = {type = AR_ELEMENT, dimen = 0,
codimen = 1, in_allocate = false, team = 0x0, stat = 0x0,
where = {nextc = 0x2e532d8, lb = 0x2e53260}, as = 0x2e04110,
c_where = {{nextc = 0x0, lb = 0x0} }, start = {0x0 }, end = {0x0 },
stride = {0x0 }, dimen_type = {DIMEN_THIS_IMAGE, 0 }},
c = {component = 0x2, sym = 0x1},
ss = {start = 0x2, end = 0x1, length = 0x0}, i = INQUIRY_KIND},
next = 0x0}


In gfc_expression_rank, e->ref is non-NULL, so e->rank is not set from 
the symtree. It then iterates through the ref elements - ref->type == 
REF_ARRAY and ref->u.ar.type == AR_ELEMENT, so e->rank remains at 0.


I'll move the check to resolve_omp_clauses and see if it works there.

Thanks

Kwok


[coroutines] expanding inside a template

2022-02-28 Thread Nathan Sidwell

Iain,
this is the second bug, also found in Folly and also not extracted to a 
testcase.  We were ICEing because we ended up tsubst_copying something 
that had already been tsubst, leading to an assert failure (mostly such 
repeated tsubsting is harmless).


We had a non-dependent co_await in a non-dependent-type template fn, so 
we processed it at definition time, and then reprocessed at 
instantiation time.


This is not quite the right fix, as it'll make all co_awaits in a 
template function have dependent type.  However, in practice it appears 
less ICEy!


Exprs only have dependent type if at least one operand is dependent -- 
which was what you were trying to do.  Coroutines have the additional 
wrinkle, that the current fn's type is an implicit operand.


So, if the coroutine function's type is not dependent, and the operand 
is not dependent, we should determine the type of the co_await 
expression using the DEPENDENT_EXPR wrapper machinery.  That allows us 
to determine the subexpression type, but leave its operand unchanged and 
then instantiate it later.


I'm not sure if the std explicitly calls out this dependent-subexpr-type 
wrinkle.


nathan
--
Nathan Sidwell
Summary:
The coroutine machinery attempts to process non-dependent
   coroutine expressions at template definition time.  That's just wrong.


diff --git a/9.x/src/gcc-10.x/gcc/cp/coroutines.cc b/9.x/src/gcc-10.x/gcc/cp/coroutines.cc
index 91c017f0b7..d0f292f2a6 100644
--- a/9.x/src/gcc-10.x/gcc/cp/coroutines.cc
+++ b/9.x/src/gcc-10.x/gcc/cp/coroutines.cc
@@ -1153,7 +1153,7 @@ finish_co_await_expr (location_t kw, tree expr)
   /* If we don't know the promise type, we can't proceed, build the
  co_await with the expression unchanged.  */
   tree functype = TREE_TYPE (current_function_decl);
-  if (dependent_type_p (functype) || type_dependent_expression_p (expr))
+  if (processing_template_decl)
 return build5_loc (kw, CO_AWAIT_EXPR, unknown_type_node, expr,
 		   NULL_TREE, NULL_TREE, NULL_TREE, integer_zero_node);
 
@@ -1230,7 +1230,7 @@ finish_co_yield_expr (location_t kw, tree expr)
   /* If we don't know the promise type, we can't proceed, build the
  co_await with the expression unchanged.  */
   tree functype = TREE_TYPE (current_function_decl);
-  if (dependent_type_p (functype) || type_dependent_expression_p (expr))
+  if (processing_template_decl)
 return build2_loc (kw, CO_YIELD_EXPR, unknown_type_node, expr, NULL_TREE);
 
   if (!coro_promise_type_found_p (current_function_decl, kw))
@@ -1316,7 +1316,7 @@ finish_co_return_stmt (location_t kw, tree expr)
   /* If we don't know the promise type, we can't proceed, build the
  co_return with the expression unchanged.  */
   tree functype = TREE_TYPE (current_function_decl);
-  if (dependent_type_p (functype) || type_dependent_expression_p (expr))
+  if (processing_template_decl)
 {
   /* co_return expressions are always void type, regardless of the
 	 expression type.  */
-- 
2.30.2



Re: [PATCH] Fix error recovery in toplev::finalize.

2022-02-28 Thread Richard Biener via Gcc-patches



> On 28.02.2022 at 16:31, David Malcolm via Gcc-patches wrote:
> 
> On Mon, 2022-02-28 at 12:49 +0100, Martin Liška wrote:
>> Use flag_checking instead of CHECKING_P
>> and run toplev::finalize only if there is not error seen.
>> 
>> Patch can bootstrap on x86_64-linux-gnu and survives regression
>> tests.
> 
> Did the testing include the libgccjit test suite?  ("jit" is not in
> --enable-languages=all)
> 
>> 
>> Ready to be installed?
> 
> I'm not keen on this change; IIRC it's valid to attempt to compile a
> gcc_jit_context that fails with an error, and then to attempt a
> different gcc_jit_context that succeeds, within the same process.  If
> I'm reading the patch right, the patch as written removes this cleanup,
> which would thwart that.
> 
> I can try to cook up a testcase for the above use case.
> 
> Is there another way to fix PR 104648?

The function was never called on a release checking build btw.  Is there 
something like flag_jit one could test?

> Thanks
> Dave
> 
> 
> 
>> Thanks,
>> Martin
>> 
>> PR ipa/104648
>> 
>> gcc/ChangeLog:
>> 
>> * main.cc (main): Use flag_checking instead of CHECKING_P
>> and run toplev::finalize only if there is not error seen.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> * g++.dg/pr104648.C: New test.
>> ---
>>   gcc/main.cc | 6 +++---
>>   gcc/testsuite/g++.dg/pr104648.C | 9 +
>>   2 files changed, 12 insertions(+), 3 deletions(-)
>>   create mode 100644 gcc/testsuite/g++.dg/pr104648.C
>> 
>> diff --git a/gcc/main.cc b/gcc/main.cc
>> index f9dd6b2af58..4ba28b7de53 100644
>> --- a/gcc/main.cc
>> +++ b/gcc/main.cc
>> @@ -37,9 +37,9 @@ main (int argc, char **argv)
>>  true /* init_signals */);
>>   
>> int r = toplev.main (argc, argv);
>> -#if CHECKING_P
>> -  toplev.finalize ();
>> -#endif
>> +
>> +  if (flag_checking && !seen_error ())
>> +toplev.finalize ();
>>   
>> return r;
>>   }
>> diff --git a/gcc/testsuite/g++.dg/pr104648.C
>> b/gcc/testsuite/g++.dg/pr104648.C
>> new file mode 100644
>> index 000..b8b7c2864cf
>> --- /dev/null
>> +++ b/gcc/testsuite/g++.dg/pr104648.C
>> @@ -0,0 +1,9 @@
>> +// { dg-do compile }
>> +// { dg-options "-fvtable-verify=preinit" }
>> +
>> +struct A {};
>> +struct B : virtual A
>> +{
>> +  B () {};
>> +  B () {}; /* { dg-error "cannot be overloaded with" } */
>> +};
> 
> 


[coroutines] member slicing fix

2022-02-28 Thread Nathan Sidwell

Iain,
this is the first of 2 patches I needed on top of your github WIP 
series.  It is against GCC-10, but AFAICT will also be needed on trunk.


We were not descending into the object operand of COMPONENT_REF exprs, 
and end up bitcopying a field out of a temporary, leading to badness. 
The breaking out of replace_unary_preamble is merely coincidental 
simplification.
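
The slot-walking idiom behind replace_unary_preamble can be sketched on a
toy structure.  This is only an illustration under stated assumptions:
ToyNode and replace_innermost are made-up names, not GCC's tree API, and the
null-handling behaviour is inferred from the call sites in the patch that
drop their "if (unary_preamble)" guard.

```cpp
#include <cassert>

// Toy node for illustration; GCC's real 'tree' type is far richer.
struct ToyNode
{
  ToyNode *operand0 = nullptr;   // plays the role of TREE_OPERAND (x, 0)
};

// Walk down the operand-0 chain to the first empty slot and store
// REPLACEMENT there.  Because SLOT starts at the local variable itself, an
// empty chain (null WRAPPER) simply returns REPLACEMENT, so callers need no
// separate null check.
ToyNode *replace_innermost (ToyNode *wrapper, ToyNode *replacement)
{
  ToyNode **slot = &wrapper;
  while (*slot)
    slot = &(*slot)->operand0;
  *slot = replacement;
  return wrapper;
}
```

Working through a pointer to the slot rather than a pointer to the node is
what lets one loop cover both the "no wrapper" and the "chain of wrappers"
cases.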


I also think one needs to descend the second operand of COMPOUND_EXPR, 
but when I did that badness happens, probably because of the artificial 
COMPOUND_EXPRs that get generated, which you mentioned?


The bug manifested itself in Folly, and it was too difficult to extract 
a testcase.


nathan
--
Nathan Sidwell
Summary:
The ported coroutine patches still contained a bug.  It could
   cause object slicing and bit copying when the compiler got confused
   about the lifetime of a member access.  It needs to look at the object
   being accessed, not just the data member.


diff --git a/9.x/src/gcc-10.x/gcc/cp/coroutines.cc b/9.x/src/gcc-10.x/gcc/cp/coroutines.cc
index d0f292f2a6..ee004d5b98 100644
--- a/9.x/src/gcc-10.x/gcc/cp/coroutines.cc
+++ b/9.x/src/gcc-10.x/gcc/cp/coroutines.cc
@@ -3007,6 +3007,7 @@ struct coro_flattened_statement {
 
   static tree *find_unary_preamble (tree, bool&);
   static bool expr_result_needs_frame_temp (tree);
+  static tree replace_unary_preamble (tree unary, tree replace);
 
   tree get_flattened_statement (tree);
   void build_flattened_statement ();
@@ -3565,15 +3566,15 @@ tree *
 coro_flattened_statement::find_unary_preamble (tree expr, bool& addr_taken)
 {
   tree *non_u_ptr = NULL;
-  bool done = false;
 
   /* Find the first non-unary operation in an expression.  */
-  while (!done)
+  for (;;)
 if (UNARY_CLASS_P (expr)
-	|| (EXPRESSION_CLASS_P (expr)
-	&& TREE_CODE_LENGTH (TREE_CODE (expr)) == 1)
 	|| TREE_CODE (expr) == INDIRECT_REF
-	|| TREE_CODE (expr) == VIEW_CONVERT_EXPR)
+	|| TREE_CODE (expr) == VIEW_CONVERT_EXPR
+	|| TREE_CODE (expr) == COMPONENT_REF
+	|| (EXPRESSION_CLASS_P (expr)
+	&& TREE_CODE_LENGTH (TREE_CODE (expr)) == 1))
   {
 	if (TREE_CODE (expr) == STMT_EXPR
 	|| TREE_CODE (expr) == CLEANUP_POINT_EXPR)
@@ -3582,14 +3583,24 @@ coro_flattened_statement::find_unary_preamble (tree expr, bool& addr_taken)
 	if (TREE_CODE (expr) == ADDR_EXPR)
 	  addr_taken = true;
 	non_u_ptr = &TREE_OPERAND (expr, 0);
-	expr = TREE_OPERAND (expr, 0);
+	expr = *non_u_ptr;
   }
 else
-  done = true;
+  break;
 
   return non_u_ptr;
 }
 
+tree coro_flattened_statement::replace_unary_preamble (tree unary, tree replace)
+{
+  tree *slot = &unary;
+  while (*slot)
+slot = &TREE_OPERAND (*slot, 0);
+  *slot = replace;
+
+  return unary;
+}
+
 /* EXPR is an expression with a result that must persist across one or more
suspension points, does this need a frame var?  */
 
@@ -3639,15 +3650,7 @@ coro_flattened_statement::flatten_aggr_init (var_nest_node *t, tree expr,
 handle_call_param (t, &AGGR_INIT_EXPR_ARG (expr, p_num), "AI");
 
   /* [A = ]  revised_call.  */
-  if (unary_preamble)
-{
-  gcc_checking_assert (!VOID_TYPE_P (TREE_TYPE (expr)));
-  tree x_op = unary_preamble;
-  while (TREE_OPERAND (x_op, 0))
-	x_op = TREE_OPERAND (x_op, 0);
-  TREE_OPERAND (x_op, 0) = expr;
-  expr = unary_preamble;
-}
+  expr = replace_unary_preamble (unary_preamble, expr);
 
   if (!discarded)
 expr = build2_loc (loc, expr_code, TREE_TYPE (t->var), t->var, expr);
@@ -3710,14 +3713,7 @@ coro_flattened_statement::flatten_await (var_nest_node *t, tree expr,
   flatten_await_inner (t, expr);
 
   /* [A = ]  expr.  */
-  if (unary_preamble)
-{
-  tree x_op = unary_preamble;
-  while (TREE_OPERAND (x_op, 0))
-	x_op = TREE_OPERAND (x_op, 0);
-  TREE_OPERAND (x_op, 0) = expr;
-  expr = unary_preamble;
-}
+  expr = replace_unary_preamble (unary_preamble, expr);
 
   if (!discarded)
 expr = build2_loc (loc, expr_code, TREE_TYPE (t->var), t->var, expr);
@@ -3790,14 +3786,7 @@ coro_flattened_statement::flatten_binary (var_nest_node *t, tree expr,
   flatten_expression (ins);  /* Recurse into the second sub-expr...  */
 }
 
-  if (unary_preamble)
-{
-  tree x_op = unary_preamble;
-  while (TREE_OPERAND (x_op, 0))
-	x_op = TREE_OPERAND (x_op, 0);
-  TREE_OPERAND (x_op, 0) = expr;
-  expr = unary_preamble;
-}
+  expr = replace_unary_preamble (unary_preamble, expr);
 
   if (!discarded)
 expr = build2_loc (loc, expr_code, TREE_TYPE (t->var), t->var, expr);
@@ -3890,14 +3879,7 @@ coro_flattened_statement::flatten_call (var_nest_node *t, tree expr,
 handle_call_param (t, &CALL_EXPR_ARG(expr, p_num), "CT");
 
   /* [A = ]  revised_call.  */
-  if (unary_preamble)
-{
-  tree x_op = unary_preamble;
-  while (TREE_OPERAND (x_op, 0))
-	x_op = TREE_OPERAND (x_op, 0);
-  TREE_OPERAND (x_op, 0) = expr;
-  expr = unary_preamble;
-}
+  expr = replace_unary_preamble (unary_preamble, expr);
 
   if (!discarded)

Re: [PATCH] openmp, fortran: Check that event handles passed to detach clauses are not arrays [PR104131]

2022-02-28 Thread Jakub Jelinek via Gcc-patches
On Mon, Feb 28, 2022 at 06:33:15PM +0100, Mikael Morin wrote:
> > It is true that the spots I saw in fortran/openmp.cc that test rank look
> > like:
> >  if (!gfc_resolve_expr (el->expr)
> >  || el->expr->ts.type != BT_INTEGER || el->expr->rank != 0)
> > etc., so probably !gfc_resolve_expr call is missing.
> > 
> As long as the expression is expected to not be a (contained) function call,
> I think it should work.
> 
> In the general case non-syntactic errors are preferably checked and reported
> later at resolution stage, where contained functions are known.

Oh, I've missed that it is done during parsing and not during resolution.
That !gfc_resolve_expr call and the checking if it is BT_INTEGER etc.
should be certainly moved to resolve_omp_clauses.

Jakub



[x86_64 PATCH] PR tree-opt/91384: peephole2 to eliminate testl after negl.

2022-02-28 Thread Roger Sayle

This patch is my proposed solution to PR tree-optimization/91384 which is
a missed-optimization/code quality regression on x86_64.  The problematic
idiom is "if (r = -a)" which is equivalent to both "r = -a; if (r != 0)"
and alternatively "r = -a; if (a != 0)".  In this particular case, on
x86_64, we prefer to use the condition codes from the negation, rather
than require an explicit testl instruction.
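
The equivalence of the two forms can be checked directly.  The following is
a minimal sketch, not part of the patch: the helper names are made up for
illustration, and unsigned arithmetic stands in for int so that negating the
most negative value stays well defined.

```cpp
#include <cassert>

// "r = -a; if (r != 0)": the branch tests the negated value.
// Hypothetical helper; unsigned wrap-around models the negation.
bool branch_on_result (unsigned a, unsigned &r)
{
  r = 0u - a;
  return r != 0;
}

// "r = -a; if (a != 0)": the branch tests the original operand,
// which is the form that needs a separate test instruction.
bool branch_on_input (unsigned a, unsigned &r)
{
  r = 0u - a;
  return a != 0;
}

// True when both forms agree on the branch and on r for this input.
bool forms_agree (unsigned a)
{
  unsigned r1, r2;
  bool t1 = branch_on_result (a, r1);
  bool t2 = branch_on_input (a, r2);
  return t1 == t2 && r1 == r2;
}
```

Since -a is zero exactly when a is zero, the zero flag produced by the
negation already carries the answer the branch needs.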

Unfortunately, combine can't help, as it doesn't attempt to merge pairs
of instructions that share the same operand(s), only pairs/triples of
instructions where the result of each instruction feeds the next.  But
I doubt there's sufficient benefit to attempt this kind of "combination"
(that wouldn't already be caught by the tree-ssa passes).

Fortunately, it's relatively easy to fix this up (addressing the
regression) during peephole2 to eliminate the unnecessary testl in:

movl    %edi, %ebx
negl    %ebx
testl   %edi, %edi
je      .L2

Tested on x86_64-pc-linux-gnu with make bootstrap and make -k check,
both with and without --target_board='unix{-m32\ -march=cascadelake}'
with no new failures.  Ok for mainline?


2022-02-28  Roger Sayle  

gcc/ChangeLog
PR tree-optimization/91384
* config/i386/i386.md (peephole2): Eliminate final testl insn
from the sequence *movsi_internal, *negsi_1, *cmpsi_ccno_1 by
transforming using *negsi_2 for the negation.

gcc/testsuite/ChangeLog
PR tree-optimization/91384
* gcc.target/i386/pr91384.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 8ffa641..4f082ee 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -11012,6 +11012,19 @@
   [(set_attr "type" "negnot")
    (set_attr "mode" "<MODE>")])
 
+;; Optimize *negsi_1 followed by *cmpsi_ccno_1 (PR target/91384)
+(define_peephole2
+  [(set (match_operand:SWI 0 "general_reg_operand")
+   (match_operand:SWI 1 "general_reg_operand"))
+   (parallel [(set (match_dup 0) (neg:SWI (match_dup 0)))
+ (clobber (reg:CC FLAGS_REG))])
+   (set (reg:CCZ FLAGS_REG) (compare:CCZ (match_dup 1) (const_int 0)))]
+  ""
+  [(set (match_dup 0) (match_dup 1))
+   (parallel [(set (reg:CCZ FLAGS_REG)
+  (compare:CCZ (neg:SWI (match_dup 0)) (const_int 0)))
+ (set (match_dup 0) (neg:SWI (match_dup 0)))])])
+
 ;; Special expand pattern to handle integer mode abs
 
 (define_expand "abs<mode>2"
diff --git a/gcc/testsuite/gcc.target/i386/pr91384.c 
b/gcc/testsuite/gcc.target/i386/pr91384.c
new file mode 100644
index 000..24a60a9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr91384.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void foo (void);
+void bar (void);
+
+int
+test (int a)
+{
+  int r;
+
+  if (r = -a)
+foo ();
+  else
+bar ();
+
+  return r;
+}
+
+/* { dg-final { scan-assembler-not "testl" } } */


Re: [PATCH] openmp, fortran: Check that event handles passed to detach clauses are not arrays [PR104131]

2022-02-28 Thread Mikael Morin

On 28/02/2022 at 17:00, Jakub Jelinek wrote:

On Mon, Feb 28, 2022 at 04:54:24PM +0100, Mikael Morin wrote:

On 28/02/2022 at 15:27, Kwok Cheung Yeung wrote:

On 28/02/2022 2:07 pm, Jakub Jelinek wrote:

(...)

Don't we usually test instead || (*expr)->rank != 0 when testing for
scalars?


(...)


So (*expr)->rank is 0 here even with an array. I'm not sure why - is
rank updated later, or did we forget to call something on the event
handle expression?

Testing against n->sym->as for an array check has been used elsewhere in
openmp.cc, to prevent reductions against arrays in OpenACC in
resolve_omp_clauses.


I can’t tell what openmp requires; it depends on your needs.

Checking sym->as captures array variables, which may appear in scalar
expressions (arr(10) is a scalar expression even if arr is an array
variable), while checking expr->rank only captures array expressions,
including a scalar variable with an array subcomponent (scal%array_comp(:) is
an array expression, even if scal is a scalar variable).

gfc_resolve_expr, through gfc_expression_rank takes care of properly setting
expr->rank.
If the check is done at resolution stage (somewhere in resolve_omp_clauses I
guess?), the rank should be set.
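
The difference between the two checks can be illustrated with a toy model.
This is a sketch only: ToySymbol, ToyExpr and toy_expr_rank are hypothetical
stand-ins rather than gfortran's real data structures, and the rank rule is
deliberately simplified (component references and rank-reducing sections are
ignored).

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for illustration only; these are NOT gfortran's types.
enum class ToyRef { Element, Section };

struct ToySymbol
{
  int declared_rank;   // what a sym->as style check looks at
};

struct ToyExpr
{
  const ToySymbol *sym;
  std::vector<ToyRef> refs;   // array references applied to the symbol
};

// Simplified rank rule: no reference means the whole variable, an element
// reference yields a scalar, a section keeps the declared rank (real
// sections can also drop dimensions; that is ignored here).
int toy_expr_rank (const ToyExpr &e)
{
  int rank = e.sym->declared_rank;
  for (ToyRef r : e.refs)
    if (r == ToyRef::Element)
      rank = 0;
  return rank;
}

// The two checks under discussion, side by side.
bool symbol_is_array (const ToyExpr &e) { return e.sym->declared_rank > 0; }
bool expr_is_array (const ToyExpr &e) { return toy_expr_rank (e) != 0; }
```

In this model an element such as arr(10) is flagged by the symbol-based
check but not by the rank-based one, which is the distinction that matters
when deciding what a clause should reject.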

I hope it helps.


It is true that the spots I saw in fortran/openmp.cc that test rank look
like:
 if (!gfc_resolve_expr (el->expr)
 || el->expr->ts.type != BT_INTEGER || el->expr->rank != 0)
etc., so probably !gfc_resolve_expr call is missing.

As long as the expression is expected to not be a (contained) function 
call, I think it should work.


In the general case non-syntactic errors are preferably checked and 
reported later at resolution stage, where contained functions are known.


Re: [PATCH] Check if loading const from mem is faster

2022-02-28 Thread Segher Boessenkool
On Thu, Feb 24, 2022 at 09:50:28AM +0100, Richard Biener wrote:
> On Thu, 24 Feb 2022, Jiufu Guo wrote:
> > And another thing as Segher pointed out, CSE is doing too
> > much work.  It may be ok to separate the constant handling
> > logic from CSE.
> 
> Not sure - CSE just is value numbering, I don't see that it does
> more than that.  Yes, it might have developed "heuristics" over
> the years what to CSE and to what and where to substitute and
> where not.  But in the end it does just value numbering.

It also does various micro-optimisations, like all the CC things it
does.

It is not very good at doing the CSE job, but it cannot easily be
replaced by a better implementation because it does many other small
optimisations (that are not done elsewhere).


Segher


Re: [PATCH] Check if loading const from mem is faster

2022-02-28 Thread Segher Boessenkool
Hi!

On Thu, Feb 24, 2022 at 03:48:54PM +0800, Jiufu Guo wrote:
> Segher Boessenkool  writes:
> > That is the problem yes.  You need insns to call insn_cost on.  You can
> > look in combine.c:combine_validate_cost to see how this can be done; but
> > you need to have some code to generate in the first place, and for CSE
> > it isn't always clear what code to generate, it really is based on RTL
> > expressions having a cost.
> 
> Hi Segher,
> 
> Thanks! combine_validate_cost is useful to help me with
> evaluating the costs of several instructions or replacements.
> 
> As you pointed out, at CSE, it may not be clear what
> exact insn sequences will be generated. Actually, the same
> issue also exists for RTL expressions.  At CSE, the exact cost
> may not be clear, since the real instructions may be emitted in
> very late passes.

But there will be RTL insns already.  Those may not correspond 1-1 to
the eventual machine insns (ideally they do most of the time though),
but it should be possible to estimate pretty accurate costs for them.

Costs are never exact anyway, it is only one number, while in reality
there are many dimensions.

> To get the accurate cost, we may analyze the constant in the
> hook(insn_cost or rtx_cost) and estimate the possible final
> instructions and then calculate the costs.

Yes.

> We discussed one idea: let the hook insn_cost accept
> any interim instruction, estimate the real instructions
> based on the interim insn, and then return the estimated
> costs.

No.  insn_cost is only for correct, existing instructions, not for
made-up nonsense.  I created insn_cost precisely to get away from that
aspect of rtx_cost (and some other issues, like, it is incredibly hard
and cumbersome to write a correct rtx_cost).

> For example: input insn "r119:DI=0x100803004101001" to
> insn_cost; and in rs6000_insn_cost (for ppc), analyze
> constant "0x100803004101001" which would need 5 insns;
> then rs6000_insn_cost sums the cost of the 5 insns.
> 
> A minor concern: because we know that reading this
> constant from the pool is faster than building it by insns,
> we will generate instructions to load constant from the pool
> finally, do not emit 5 real instructions to build the value.
> So, we are more interested in if it is faster to load from
> pool or not.

That is one reason why it is better to generate (close to) machine
insns as early as possible: it makes it much easier to estimate
realistic costs.  (Another important reason is it allows other
optimisations, without us having to do any work for it!)


Segher


Re: [PATCH] c++: ->template and implicit typedef [PR104608]

2022-02-28 Thread Jason Merrill via Gcc-patches

On 2/22/22 17:46, Marek Polacek wrote:

Here we have a forward declaration of Parameter for which we create
an implicit typedef, which is a TYPE_DECL.  Then, when looking it up
at template definition time, cp_parser_template_id gets (since r12-6754)
this TYPE_DECL which it can't handle.


Hmm, getting that global TYPE_DECL from lookup seems like a bug; isn't 
the lookup earlier in cp_parser_template_name in object scope?



This patch defers lookup for implicit typedefs, a la r12-6879.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/104608

gcc/cp/ChangeLog:

* parser.cc (cp_parser_template_name): Repeat lookup of implicit
typedef.

gcc/testsuite/ChangeLog:

* g++.dg/parse/template-keyword3.C: New test.
---
  gcc/cp/parser.cc   |  3 ++-
  gcc/testsuite/g++.dg/parse/template-keyword3.C | 12 
  2 files changed, 14 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/parse/template-keyword3.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 03d99aba13e..5e89e3737b0 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -18681,7 +18681,8 @@ cp_parser_template_name (cp_parser* parser,
  return error_mark_node;
}
else if ((!DECL_P (decl) && !is_overloaded_fn (decl))
-  || TREE_CODE (decl) == USING_DECL)
+  || TREE_CODE (decl) == USING_DECL
+  || DECL_IMPLICIT_TYPEDEF_P (decl))
/* Repeat the lookup at instantiation time.  */
decl = identifier;
  }
diff --git a/gcc/testsuite/g++.dg/parse/template-keyword3.C 
b/gcc/testsuite/g++.dg/parse/template-keyword3.C
new file mode 100644
index 000..59fe0fc180b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/template-keyword3.C
@@ -0,0 +1,12 @@
+// PR c++/104608
+
+class Parameter;
+template <typename R> class Function
+: public R
+{
+Function();
+};
+template <typename R>
+Function<R>::Function() {
+this->template Parameter<R>();
+}

base-commit: bc66b471d16ef2fd8cb66fd1131b41f80ecb9961




Re: [PATCH] c++: Lost deprecated/unavailable attr in class tmpl [PR104682]

2022-02-28 Thread Marek Polacek via Gcc-patches
On Mon, Feb 28, 2022 at 04:30:01PM +0000, Iain Sandoe wrote:
> 
> 
> > On 28 Feb 2022, at 16:13, Jason Merrill  wrote:
> > 
> > On 2/25/22 17:59, Marek Polacek wrote:
> >> [ Most likely a GCC 13 patch, but I'm posting it now so that I don't lose 
> >> it. ]
> >> When looking into the other PR I noticed that we fail to give a warning
> >> for a deprecated enumerator when the enum is in a class template.  This
> >> only happens when the attribute doesn't have an argument.  The reason is
> >> that when we tsubst_enum, we create a new enumerator:
> >>   build_enumerator (DECL_NAME (decl), value, newtag,
> >>DECL_ATTRIBUTES (decl), DECL_SOURCE_LOCATION (decl));
> >> but DECL_ATTRIBUTES (decl) is null when the attribute was provided
> >> without an argument -- in that case it simply melts into a tree flag.
> >> handle_deprecated_attribute has:
> >>   if (!args)
> >>  *no_add_attrs = true;
> >> so the attribute isn't retained and we lose it when tsubsting.  Same
> >> thing when the attribute is on the enum itself.
> >> Attribute unavailable is a similar case, but it's different in that
> >> it can be a late attribute whereas "deprecated" can't:
> > 
> > Iain, was this difference intentional?
> 
> The intent was to treat the two attributes the same way - so any difference
> is unintentional.

Thanks.  I'll send a patch soon.

Marek



Re: [PATCH] c++: Lost deprecated/unavailable attr in class tmpl [PR104682]

2022-02-28 Thread Iain Sandoe



> On 28 Feb 2022, at 16:13, Jason Merrill  wrote:
> 
> On 2/25/22 17:59, Marek Polacek wrote:
>> [ Most likely a GCC 13 patch, but I'm posting it now so that I don't lose 
>> it. ]
>> When looking into the other PR I noticed that we fail to give a warning
>> for a deprecated enumerator when the enum is in a class template.  This
>> only happens when the attribute doesn't have an argument.  The reason is
>> that when we tsubst_enum, we create a new enumerator:
>>   build_enumerator (DECL_NAME (decl), value, newtag,
>>DECL_ATTRIBUTES (decl), DECL_SOURCE_LOCATION (decl));
>> but DECL_ATTRIBUTES (decl) is null when the attribute was provided
>> without an argument -- in that case it simply melts into a tree flag.
>> handle_deprecated_attribute has:
>>   if (!args)
>>  *no_add_attrs = true;
>> so the attribute isn't retained and we lose it when tsubsting.  Same
>> thing when the attribute is on the enum itself.
>> Attribute unavailable is a similar case, but it's different in that
>> it can be a late attribute whereas "deprecated" can't:
> 
> Iain, was this difference intentional?

The intent was to treat the two attributes the same way - so any difference
is unintentional.

Iain

> 
>> is_late_template_attribute has
>> /* But some attributes specifically apply to templates.  */
>> && !is_attribute_p ("abi_tag", name)
>> && !is_attribute_p ("deprecated", name)
>> && !is_attribute_p ("visibility", name))
>>  return true;
>>else
>>  return false;
>> which looks strange, but attr-unavailable-9.C tests that we don't error when
>> the attribute is applied on a template.
>> Bootstrapped/regtested on x86_64-pc-linux-gnu.
> 
> This looks extremely safe, so let's go ahead and apply it to trunk.
> 
>>  PR c++/104682
>> gcc/cp/ChangeLog:
>>  * cp-tree.h (build_enumerator): Adjust.
>>  * decl.cc (finish_enum): Make it return the new decl.
>>  * pt.cc (tsubst_enum): Propagate TREE_DEPRECATED and TREE_UNAVAILABLE.
>> gcc/testsuite/ChangeLog:
>>  * g++.dg/ext/attr-unavailable-10.C: New test.
>>  * g++.dg/ext/attr-unavailable-11.C: New test.
>>  * g++.dg/warn/deprecated-17.C: New test.
>>  * g++.dg/warn/deprecated-18.C: New test.
>> ---
>>  gcc/cp/cp-tree.h  |  2 +-
>>  gcc/cp/decl.cc|  4 +-
>>  gcc/cp/pt.cc  | 17 +++--
>>  .../g++.dg/ext/attr-unavailable-10.C  | 22 +++
>>  .../g++.dg/ext/attr-unavailable-11.C  | 22 +++
>>  gcc/testsuite/g++.dg/warn/deprecated-17.C | 35 ++
>>  gcc/testsuite/g++.dg/warn/deprecated-18.C | 37 +++
>>  7 files changed, 133 insertions(+), 6 deletions(-)
>>  create mode 100644 gcc/testsuite/g++.dg/ext/attr-unavailable-10.C
>>  create mode 100644 gcc/testsuite/g++.dg/ext/attr-unavailable-11.C
>>  create mode 100644 gcc/testsuite/g++.dg/warn/deprecated-17.C
>>  create mode 100644 gcc/testsuite/g++.dg/warn/deprecated-18.C
>> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
>> index 37d462fca6e..80994e94793 100644
>> --- a/gcc/cp/cp-tree.h
>> +++ b/gcc/cp/cp-tree.h
>> @@ -6833,7 +6833,7 @@ extern void xref_basetypes (tree, 
>> tree);
>>  extern tree start_enum  (tree, tree, tree, 
>> tree, bool, bool *);
>>  extern void finish_enum_value_list  (tree);
>>  extern void finish_enum (tree);
>> -extern void build_enumerator(tree, tree, tree, 
>> tree, location_t);
>> +extern tree build_enumerator(tree, tree, tree, 
>> tree, location_t);
>>  extern tree lookup_enumerator   (tree, tree);
>>  extern bool start_preparsed_function(tree, tree, int);
>>  extern bool start_function  (cp_decl_specifier_seq *,
>> diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
>> index 7b48b56231b..7f80f9d4d7a 100644
>> --- a/gcc/cp/decl.cc
>> +++ b/gcc/cp/decl.cc
>> @@ -16409,7 +16409,7 @@ finish_enum (tree enumtype)
>> Apply ATTRIBUTES if available.  LOC is the location of NAME.
>> Assignment of sequential values by default is handled here.  */
>>  -void
>> +tree
>>  build_enumerator (tree name, tree value, tree enumtype, tree attributes,
>>location_t loc)
>>  {
>> @@ -16611,6 +16611,8 @@ incremented enumerator value is too large for 
>> %"));
>>  /* Add this enumeration constant to the list for this type.  */
>>TYPE_VALUES (enumtype) = tree_cons (name, decl, TYPE_VALUES (enumtype));
>> +
>> +  return decl;
>>  }
>>/* Look for an enumerator with the given NAME within the enumeration
>> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
>> index 70f02db8757..8fb17349ee1 100644
>> --- a/gcc/cp/pt.cc
>> +++ b/gcc/cp/pt.cc
>> @@ -26944,9 +26944,8 @@ tsubst_enum (tree tag, tree newtag, tree args)
>>for (e = 

Re: [PATCH] c++: ICE with attribute on enumerator [PR104667]

2022-02-28 Thread Jason Merrill via Gcc-patches

On 2/28/22 12:22, Marek Polacek wrote:

On Mon, Feb 28, 2022 at 12:16:47PM -0400, Jason Merrill wrote:

On 2/25/22 17:59, Marek Polacek wrote:

When processing a template, the enumerators we build don't have a type
yet.  But is_late_template_attribute is not prepared to see a _DECL
without a type, so we crash on

enum tree_code code = TREE_CODE (type);

(I found that we don't give the "is deprecated" warning for the enumerator
'f' in the test.  Reported as PR104682.)

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11?

PR c++/104667

gcc/cp/ChangeLog:

* decl2.cc (is_late_template_attribute): Cope with a decl without
a type.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attrib64.C: New test.
---
   gcc/cp/decl2.cc |  2 +-
   gcc/testsuite/g++.dg/ext/attrib64.C | 11 +++
   2 files changed, 12 insertions(+), 1 deletion(-)
   create mode 100644 gcc/testsuite/g++.dg/ext/attrib64.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index 2e58419ea51..dc7710660d0 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -1300,7 +1300,7 @@ is_late_template_attribute (tree attr, tree decl)
 /* We can't apply any attributes to a completely unknown type until
 instantiation time.  */
-  enum tree_code code = TREE_CODE (type);
+  enum tree_code code = type ? TREE_CODE (type) : ERROR_MARK;


Maybe return true for null type before looking at the code?  OK with that
change.


I didn't do that because I thought it'd be better to go on and reach

  else if (dependent_type_p (type)
           /* But some attributes specifically apply to templates.  */
           && !is_attribute_p ("abi_tag", name)
           && !is_attribute_p ("deprecated", name)
           && !is_attribute_p ("visibility", name))

null type means dependent, but the attribute can still be one of the
"special" ones.

Do you still want me to make that

   if (!type)
 return true;

change?


Please.  The comment above 'code' applies even more to null type: we 
can't apply any attributes to it.


Jason
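
The control flow being asked for here can be sketched outside of GCC. The following is only a toy model: TypeKind, Type and is_late_attribute_model are made-up names, not GCC's tree API, and the branch structure is an assumption pieced together from the snippets quoted in this thread. It shows a null type returning true before any attribute name is examined.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for GCC's tree machinery; none of these are the real types.
enum class TypeKind { Ordinary, TemplateTypeParm, Dependent };

struct Type { TypeKind kind; };

// Hypothetical model of the decision discussed above: should applying the
// attribute be postponed until instantiation time?
bool is_late_attribute_model(const Type *type, const std::string &name)
{
  // A decl with no type yet (e.g. an enumerator inside a template):
  // nothing can be applied now, so defer unconditionally.
  if (!type)
    return true;

  // A completely unknown type: defer as well.
  if (type->kind == TypeKind::TemplateTypeParm)
    return true;

  // A dependent type defers everything except the attributes that
  // specifically apply to templates.
  if (type->kind == TypeKind::Dependent)
    return name != "abi_tag" && name != "deprecated" && name != "visibility";

  return false;
}
```

In this model a typeless decl is late even for "deprecated", which is the behavioural difference from falling through to the dependent-type branch.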



Re: [PATCH] c++: ICE with attribute on enumerator [PR104667]

2022-02-28 Thread Marek Polacek via Gcc-patches
On Mon, Feb 28, 2022 at 12:16:47PM -0400, Jason Merrill wrote:
> On 2/25/22 17:59, Marek Polacek wrote:
> > When processing a template, the enumerators we build don't have a type
> > yet.  But is_late_template_attribute is not prepared to see a _DECL
> > without a type, so we crash on
> > 
> >enum tree_code code = TREE_CODE (type);
> > 
> > (I found that we don't give the "is deprecated" warning for the enumerator
> > 'f' in the test.  Reported as PR104682.)
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11?
> > 
> > PR c++/104667
> > 
> > gcc/cp/ChangeLog:
> > 
> > * decl2.cc (is_late_template_attribute): Cope with a decl without
> > a type.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/ext/attrib64.C: New test.
> > ---
> >   gcc/cp/decl2.cc |  2 +-
> >   gcc/testsuite/g++.dg/ext/attrib64.C | 11 +++
> >   2 files changed, 12 insertions(+), 1 deletion(-)
> >   create mode 100644 gcc/testsuite/g++.dg/ext/attrib64.C
> > 
> > diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
> > index 2e58419ea51..dc7710660d0 100644
> > --- a/gcc/cp/decl2.cc
> > +++ b/gcc/cp/decl2.cc
> > @@ -1300,7 +1300,7 @@ is_late_template_attribute (tree attr, tree decl)
> > /* We can't apply any attributes to a completely unknown type until
> >  instantiation time.  */
> > -  enum tree_code code = TREE_CODE (type);
> > +  enum tree_code code = type ? TREE_CODE (type) : ERROR_MARK;
> 
> Maybe return true for null type before looking at the code?  OK with that
> change.

I didn't do that because I thought it'd be better to go on and reach

  else if (dependent_type_p (type)
           /* But some attributes specifically apply to templates.  */
           && !is_attribute_p ("abi_tag", name)
           && !is_attribute_p ("deprecated", name)
           && !is_attribute_p ("visibility", name))

null type means dependent, but the attribute can still be one of the
"special" ones.

Do you still want me to make that

  if (!type)
return true;

change?

Marek



Re: [PATCH] c++: Lost deprecated/unavailable attr in class tmpl [PR104682]

2022-02-28 Thread Marek Polacek via Gcc-patches
On Mon, Feb 28, 2022 at 12:13:36PM -0400, Jason Merrill wrote:
> On 2/25/22 17:59, Marek Polacek wrote:
> > [ Most likely a GCC 13 patch, but I'm posting it now so that I don't lose 
> > it. ]
> > 
> > When looking into the other PR I noticed that we fail to give a warning
> > for a deprecated enumerator when the enum is in a class template.  This
> > only happens when the attribute doesn't have an argument.  The reason is
> > that when we tsubst_enum, we create a new enumerator:
> > 
> >build_enumerator (DECL_NAME (decl), value, newtag,
> > DECL_ATTRIBUTES (decl), DECL_SOURCE_LOCATION (decl));
> > 
> > but DECL_ATTRIBUTES (decl) is null when the attribute was provided
> > without an argument -- in that case it simply melts into a tree flag.
> > handle_deprecated_attribute has:
> > 
> >if (!args)
> >   *no_add_attrs = true;
> > 
> > so the attribute isn't retained and we lose it when tsubsting.  Same
> > thing when the attribute is on the enum itself.
> > 
> > Attribute unavailable is a similar case, but it's different in that
> > it can be a late attribute whereas "deprecated" can't:
> 
> Iain, was this difference intentional?

FWIW, I'm in favor of treating deprecated/unavailable the same, that is,
adding unavailable...

> > is_late_template_attribute has
> > 
> >  /* But some attributes specifically apply to templates.  */
> >  && !is_attribute_p ("abi_tag", name)
> >  && !is_attribute_p ("deprecated", name)

...here.  But that really does seem like a GCC 13 change.

> >  && !is_attribute_p ("visibility", name))
> >   return true;
> > else
> >   return false;
> > 
> > which looks strange, but attr-unavailable-9.C tests that we don't error when
> > the attribute is applied on a template.
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu.
> 
> This looks extremely safe, so let's go ahead and apply it to trunk.

Will do, thanks.

Marek
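
The way an argument-less attribute gets lost during substitution, as described in the quoted patch text above, can be illustrated with a small toy model. Decl, build_enumerator_model and tsubst_enumerator_model are hypothetical names and are not GCC's data structures. The sketch assumes only what the description states: the attribute is recorded as a flag rather than in the attribute list, and the fix lets the builder return the new decl so the flags can be copied over.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy declaration: attributes with arguments stay in a list, while an
// argument-less "deprecated" is recorded only as a flag.
struct Decl {
  std::string name;
  std::vector<std::string> attributes;  // models DECL_ATTRIBUTES
  bool deprecated = false;              // models TREE_DEPRECATED
  bool unavailable = false;             // models TREE_UNAVAILABLE
};

// Models build_enumerator after the patch: it returns the new decl so the
// caller can adjust it.  Only the attribute list is passed in.
Decl build_enumerator_model(const std::string &name,
                            const std::vector<std::string> &attributes)
{
  Decl d;
  d.name = name;
  d.attributes = attributes;
  return d;
}

// Models the substitution step; propagate_flags switches the fix on or off.
Decl tsubst_enumerator_model(const Decl &old_decl, bool propagate_flags)
{
  Decl new_decl = build_enumerator_model(old_decl.name, old_decl.attributes);
  if (propagate_flags) {
    new_decl.deprecated = old_decl.deprecated;
    new_decl.unavailable = old_decl.unavailable;
  }
  return new_decl;
}
```

Without the propagation step the rebuilt enumerator has an empty attribute list and cleared flags, so nothing is left to trigger the warning.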



Re: [PATCH] c++: ICE with attribute on enumerator [PR104667]

2022-02-28 Thread Jason Merrill via Gcc-patches

On 2/25/22 17:59, Marek Polacek wrote:

When processing a template, the enumerators we build don't have a type
yet.  But is_late_template_attribute is not prepared to see a _DECL
without a type, so we crash on

   enum tree_code code = TREE_CODE (type);

(I found that we don't give the "is deprecated" warning for the enumerator
'f' in the test.  Reported as PR104682.)

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11?

PR c++/104667

gcc/cp/ChangeLog:

* decl2.cc (is_late_template_attribute): Cope with a decl without
a type.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attrib64.C: New test.
---
  gcc/cp/decl2.cc |  2 +-
  gcc/testsuite/g++.dg/ext/attrib64.C | 11 +++
  2 files changed, 12 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/ext/attrib64.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index 2e58419ea51..dc7710660d0 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -1300,7 +1300,7 @@ is_late_template_attribute (tree attr, tree decl)
  
/* We can't apply any attributes to a completely unknown type until

 instantiation time.  */
-  enum tree_code code = TREE_CODE (type);
+  enum tree_code code = type ? TREE_CODE (type) : ERROR_MARK;


Maybe return true for null type before looking at the code?  OK with 
that change.



if (code == TEMPLATE_TYPE_PARM
  || code == BOUND_TEMPLATE_TEMPLATE_PARM
  || code == TYPENAME_TYPE)
diff --git a/gcc/testsuite/g++.dg/ext/attrib64.C 
b/gcc/testsuite/g++.dg/ext/attrib64.C
new file mode 100644
index 000..4a4505fc4b2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/attrib64.C
@@ -0,0 +1,11 @@
+// PR c++/104667
+// { dg-do compile }
+
+template<typename> struct A {
+  enum E { // { dg-warning "only applies to function types" }
+e __attribute__ ((access(read_only))),
+f __attribute__ ((deprecated))
+  };
+};
+
+A<int> a;

base-commit: ae3c4e521dd0b66db712639298cd08331d62f315




Re: [PATCH] c++: Lost deprecated/unavailable attr in class tmpl [PR104682]

2022-02-28 Thread Jason Merrill via Gcc-patches

On 2/25/22 17:59, Marek Polacek wrote:

[ Most likely a GCC 13 patch, but I'm posting it now so that I don't lose it. ]

When looking into the other PR I noticed that we fail to give a warning
for a deprecated enumerator when the enum is in a class template.  This
only happens when the attribute doesn't have an argument.  The reason is
that when we tsubst_enum, we create a new enumerator:

   build_enumerator (DECL_NAME (decl), value, newtag,
DECL_ATTRIBUTES (decl), DECL_SOURCE_LOCATION (decl));

but DECL_ATTRIBUTES (decl) is null when the attribute was provided
without an argument -- in that case it simply melts into a tree flag.
handle_deprecated_attribute has:

   if (!args)
  *no_add_attrs = true;

so the attribute isn't retained and we lose it when tsubsting.  Same
thing when the attribute is on the enum itself.

Attribute unavailable is a similar case, but it's different in that
it can be a late attribute whereas "deprecated" can't:


Iain, was this difference intentional?


is_late_template_attribute has

 /* But some attributes specifically apply to templates.  */
 && !is_attribute_p ("abi_tag", name)
 && !is_attribute_p ("deprecated", name)
 && !is_attribute_p ("visibility", name))
  return true;
else
  return false;

which looks strange, but attr-unavailable-9.C tests that we don't error when
the attribute is applied on a template.

Bootstrapped/regtested on x86_64-pc-linux-gnu.


This looks extremely safe, so let's go ahead and apply it to trunk.


PR c++/104682

gcc/cp/ChangeLog:

* cp-tree.h (build_enumerator): Adjust.
* decl.cc (finish_enum): Make it return the new decl.
* pt.cc (tsubst_enum): Propagate TREE_DEPRECATED and TREE_UNAVAILABLE.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attr-unavailable-10.C: New test.
* g++.dg/ext/attr-unavailable-11.C: New test.
* g++.dg/warn/deprecated-17.C: New test.
* g++.dg/warn/deprecated-18.C: New test.
---
  gcc/cp/cp-tree.h  |  2 +-
  gcc/cp/decl.cc|  4 +-
  gcc/cp/pt.cc  | 17 +++--
  .../g++.dg/ext/attr-unavailable-10.C  | 22 +++
  .../g++.dg/ext/attr-unavailable-11.C  | 22 +++
  gcc/testsuite/g++.dg/warn/deprecated-17.C | 35 ++
  gcc/testsuite/g++.dg/warn/deprecated-18.C | 37 +++
  7 files changed, 133 insertions(+), 6 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/ext/attr-unavailable-10.C
  create mode 100644 gcc/testsuite/g++.dg/ext/attr-unavailable-11.C
  create mode 100644 gcc/testsuite/g++.dg/warn/deprecated-17.C
  create mode 100644 gcc/testsuite/g++.dg/warn/deprecated-18.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 37d462fca6e..80994e94793 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6833,7 +6833,7 @@ extern void xref_basetypes(tree, 
tree);
  extern tree start_enum(tree, tree, tree, 
tree, bool, bool *);
  extern void finish_enum_value_list(tree);
  extern void finish_enum   (tree);
-extern void build_enumerator   (tree, tree, tree, tree, 
location_t);
+extern tree build_enumerator   (tree, tree, tree, tree, 
location_t);
  extern tree lookup_enumerator (tree, tree);
  extern bool start_preparsed_function  (tree, tree, int);
  extern bool start_function(cp_decl_specifier_seq *,
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 7b48b56231b..7f80f9d4d7a 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -16409,7 +16409,7 @@ finish_enum (tree enumtype)
 Apply ATTRIBUTES if available.  LOC is the location of NAME.
 Assignment of sequential values by default is handled here.  */
  
-void

+tree
  build_enumerator (tree name, tree value, tree enumtype, tree attributes,
  location_t loc)
  {
@@ -16611,6 +16611,8 @@ incremented enumerator value is too large for 
%"));
  
/* Add this enumeration constant to the list for this type.  */

TYPE_VALUES (enumtype) = tree_cons (name, decl, TYPE_VALUES (enumtype));
+
+  return decl;
  }
  
  /* Look for an enumerator with the given NAME within the enumeration

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 70f02db8757..8fb17349ee1 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -26944,9 +26944,8 @@ tsubst_enum (tree tag, tree newtag, tree args)
for (e = TYPE_VALUES (tag); e; e = TREE_CHAIN (e))
  {
tree value;
-  tree decl;
+  tree decl = TREE_VALUE (e);
  
-  decl = TREE_VALUE (e);

/* Note that in a template enum, the TREE_VALUE is the
 CONST_DECL, not the corresponding INTEGER_CST.  */
value = tsubst_expr (DECL_INITIAL (decl),
@@ -26958,8 +26957,14 @@ tsubst_enum (tree tag, tree 

Re: [PATCH 1/5 V1] RISC-V:Implement instruction patterns for Crypto extension

2022-02-28 Thread Kito Cheng via Gcc-patches
On Wed, Feb 23, 2022 at 5:46 PM  wrote:
>
> From: LiaoShihua 
>
>
> gcc/ChangeLog:
>
> * config/riscv/predicates.md (bs_operand): operand for bs
> (rnum_operand):
> * config/riscv/riscv.md: include crypto.md
> * config/riscv/crypto.md: New file.
>
> Co-Authored-By: Wu 
> ---
>  gcc/config/riscv/crypto.md | 383 +
>  gcc/config/riscv/predicates.md |   8 +
>  gcc/config/riscv/riscv.md  |   1 +
>  3 files changed, 392 insertions(+)
>  create mode 100644 gcc/config/riscv/crypto.md
>
> diff --git a/gcc/config/riscv/crypto.md b/gcc/config/riscv/crypto.md
> new file mode 100644
> index 000..591066fac3b
> --- /dev/null
> +++ b/gcc/config/riscv/crypto.md
> @@ -0,0 +1,383 @@
> +;; Machine description for K extension.
> +;; Copyright (C) 2022 Free Software Foundation, Inc.
> +;; Contributed by SiYu Wu (s...@isrc.iscas.ac.cn) and ShiHua Liao 
> (shi...@iscas.ac.cn).
> +
> +;; This file is part of GCC.
> +
> +;; GCC is free software; you can redistribute it and/or modify
> +;; it under the terms of the GNU General Public License as published by
> +;; the Free Software Foundation; either version 3, or (at your option)
> +;; any later version.
> +
> +;; GCC is distributed in the hope that it will be useful,
> +;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +;; GNU General Public License for more details.
> +
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +(define_c_enum "unspec" [
> +;;ZBKB unspecs
> +UNSPEC_ROR
> +UNSPEC_ROL

We have standard patterns for ROR and ROL, so I think we don't need
unspec for those two.

...
> +(define_insn "riscv_ror_"
> +  [(set (match_operand:X 0 "register_operand" "=r")
> +(unspec:X [(match_operand:X 1 "register_operand" "r")
> +  (match_operand:X 2 "register_operand" "r")]
> +  UNSPEC_ROR))]
> +  "TARGET_ZBKB"
> +  "ror\t%0,%1,%2")
>
> +
> +(define_insn "riscv_rol_"
> +  [(set (match_operand:X 0 "register_operand" "=r")
> +(unspec:X [(match_operand:X 1 "register_operand" "r")
> +  (match_operand:X 2 "register_operand" "r")]
> +  UNSPEC_ROL))]
> +  "TARGET_ZBKB"
> +  "rol\t%0,%1,%2")

riscv_ror_ and riscv_rol_ can be removed.
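
For context, a rotate can be written in plain C or C++ without any builtin. GCC's machine-independent rotate patterns (the rotl/rotr standard names) exist to cover this kind of code. The helpers below are a minimal sketch with made-up names (rotr32, rotl32). Whether a particular GCC version turns them into ror/rol on a Zbkb target is an assumption that this example does not verify.

```cpp
#include <cassert>
#include <cstdint>

// Rotate right, written so that every shift count stays below the width
// (shifting a 32-bit value by 32 would be undefined behaviour).
inline std::uint32_t rotr32(std::uint32_t x, unsigned n)
{
  n &= 31u;
  return (x >> n) | (x << ((32u - n) & 31u));
}

// Rotate left, same approach.
inline std::uint32_t rotl32(std::uint32_t x, unsigned n)
{
  n &= 31u;
  return (x << n) | (x >> ((32u - n) & 31u));
}
```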

> +
> +(define_insn "riscv_brev8_"
> +  [(set (match_operand:X 0 "register_operand" "=r")
> +(unspec:X [(match_operand:X 1 "register_operand" "r")]
> +  UNSPEC_BREV8))]
> +  "TARGET_ZBKB"
> +  "brev8\t%0,%1")
> +
> +(define_insn "riscv_bswap"
> +  [(set (match_operand:X 0 "register_operand" "=r")
> +(unspec:X [(match_operand:X 1 "register_operand" "r")]
> +  UNSPEC_BSWAP))]
> +  "TARGET_ZBKB"
> +  "bswap\t%0,%1")
> +
> +(define_insn "riscv_zip"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +(unspec:SI [(match_operand:SI 1 "register_operand" "r")]
> +  UNSPEC_ZIP))]
> +  "TARGET_ZBKB && !TARGET_64BIT"
> +  "zip\t%0,%1")
> +
> +(define_insn "riscv_unzip"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +(unspec:SI [(match_operand:SI 1 "register_operand" "r")]
> +  UNSPEC_UNZIP))]
> +  "TARGET_ZBKB && !TARGET_64BIT"
> +  "unzip\t%0,%1")
> +
> +(define_insn "riscv_clmul_"
> +  [(set (match_operand:X 0 "register_operand" "=r")
> +(unspec:X [(match_operand:X 1 "register_operand" "r")
> +  (match_operand:X 2 "register_operand" "r")]
> +  UNSPEC_CLMUL))]
> +  "TARGET_ZBKC"
> +  "clmul\t%0,%1,%2")
> +
> +(define_insn "riscv_clmulh_"
> +  [(set (match_operand:X 0 "register_operand" "=r")
> +(unspec:X [(match_operand:X 1 "register_operand" "r")
> +  (match_operand:X 2 "register_operand" "r")]
> +  UNSPEC_CLMULH))]
> +  "TARGET_ZBKC"
> +  "clmulh\t%0,%1,%2")
> +
> +(define_insn "riscv_xperm8_"
> +  [(set (match_operand:X 0 "register_operand" "=r")
> +(unspec:X [(match_operand:X 1 "register_operand" "r")
> +  (match_operand:X 2 "register_operand" "r")]
> +  UNSPEC_XPERM8))]
> +  "TARGET_ZBKX"
> +  "xperm8\t%0,%1,%2")
> +
> +(define_insn "riscv_xperm4_"
> +  [(set (match_operand:X 0 "register_operand" "=r")
> +(unspec:X [(match_operand:X 1 "register_operand" "r")
> +  (match_operand:X 2 "register_operand" "r")]
> +  UNSPEC_XPERM4))]
> +  "TARGET_ZBKX"
> +  "xperm4\t%0,%1,%2")
> +
> +(define_insn "riscv_aes32dsi"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +(unspec:SI [(match_operand:SI 1 "register_operand" "r")
> +   (match_operand:SI 2 "register_operand" "r")
> +   (match_operand:SI 3 "bs_operand" "i")]
> +   UNSPEC_AES_DSI))]
> +  "TARGET_ZKND && 

Re: [PATCH] openmp, fortran: Check that event handles passed to detach clauses are not arrays [PR104131]

2022-02-28 Thread Jakub Jelinek via Gcc-patches
On Mon, Feb 28, 2022 at 04:54:24PM +0100, Mikael Morin wrote:
> Le 28/02/2022 à 15:27, Kwok Cheung Yeung a écrit :
> > On 28/02/2022 2:07 pm, Jakub Jelinek wrote:
> (...)
> > > Don't we usually test instead || (*expr)->rank != 0 when testing for
> > > scalars?
> > > 
> (...)
> > 
> > So (*expr)->rank is 0 here even with an array. I'm not sure why - is
> > rank updated later, or did we forget to call something on the event
> > handle expression?
> > 
> > Testing against n->sym->as for an array check has been used elsewhere in
> > openmp.cc, to prevent reductions against arrays in OpenACC in
> > resolve_omp_clauses.
> > 
> I can’t tell what openmp requires; it depends on your needs.
> 
> Checking sym->as captures array variables which may include scalar
> expressions (arr(10) is a scalar expression even if arr is an array
> variable), while checking expr->rank only capture array expression,
> including scalar variable with array subcomponent (scal%array_comp(:) is an
> array expression, even if scal is a scalar variable).
> 
> gfc_resolve_expr, through gfc_expression_rank takes care of properly setting
> expr->rank.
> If the check is done at resolution stage (somewhere in resolve_omp_clauses I
> guess?), the rank should be set.
> 
> I hope it helps.

It is true that the spots I saw in fortran/openmp.cc that test rank look
like:
  if (!gfc_resolve_expr (el->expr)
      || el->expr->ts.type != BT_INTEGER || el->expr->rank != 0)
etc., so a gfc_resolve_expr call is probably missing.

Jakub
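
Why the order matters can be shown with a toy model in which the rank field keeps its default of 0 until a resolution step fills it in. Expr and resolve_model are hypothetical stand-ins, not gfortran's gfc_expr or gfc_resolve_expr. The only assumption taken from this thread is that rank is not reliable before resolution.

```cpp
#include <cassert>

// Toy expression: rank is meaningful only after resolution.
struct Expr {
  int declared_rank;      // rank the referenced entity really has
  bool resolved = false;
  int rank = 0;           // stays 0 until resolve_model() runs

  explicit Expr(int r) : declared_rank(r) {}
};

// Stand-in for a resolution step: fills in the rank and reports success.
bool resolve_model(Expr &e)
{
  e.rank = e.declared_rank;
  e.resolved = true;
  return true;
}

// Rank test without resolving first: every unresolved expression looks scalar.
bool is_scalar_unresolved_check(const Expr &e)
{
  return e.rank == 0;
}

// Rank test in the shape quoted above: resolve first, then look at the rank.
bool is_scalar_resolved_check(Expr &e)
{
  return resolve_model(e) && e.rank == 0;
}
```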



Re: [PATCH 5/5 V1] RISC-V:Implement architecture extension test macros for Crypto extension

2022-02-28 Thread Kito Cheng via Gcc-patches
Also, could you split this patch out of the series? I would like to
include it in GCC 12 and defer the rest to GCC 13.

On Thu, Feb 24, 2022 at 5:55 PM Kito Cheng  wrote:
>
> I would suggest implementing that in riscv_subset_list::parse so that
> it also affect the ELF attribute emission.
>
> On Wed, Feb 23, 2022 at 5:44 PM  wrote:
> >
> > From: LiaoShihua 
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins):Add __riscv_zks, 
> > __riscv_zk, __riscv_zkn
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/predef-17.c: New test.
> >
> > ---
> >  gcc/config/riscv/riscv-c.cc|  9 
> >  gcc/testsuite/gcc.target/riscv/predef-17.c | 59 ++
> >  2 files changed, 68 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/predef-17.c
> >
> > diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
> > index 73c62f41274..d6c153e8d7c 100644
> > --- a/gcc/config/riscv/riscv-c.cc
> > +++ b/gcc/config/riscv/riscv-c.cc
> > @@ -63,6 +63,15 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
> >builtin_define ("__riscv_fdiv");
> >builtin_define ("__riscv_fsqrt");
> >  }
> > +
> > +  if (TARGET_ZBKB && TARGET_ZBKC && TARGET_ZBKX && TARGET_ZKNE && 
> > TARGET_ZKND && TARGET_ZKNH)
> > +{
> > +  builtin_define ("__riscv_zk");
> > +  builtin_define ("__riscv_zkn");
> > +}
> > +
> > +  if (TARGET_ZBKB && TARGET_ZBKC && TARGET_ZBKX && TARGET_ZKSED && 
> > TARGET_ZKSH)
> > +  builtin_define ("__riscv_zks");
> >
> >switch (riscv_abi)
> >  {
> > diff --git a/gcc/testsuite/gcc.target/riscv/predef-17.c 
> > b/gcc/testsuite/gcc.target/riscv/predef-17.c
> > new file mode 100644
> > index 000..4366dee1016
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/predef-17.c
> > @@ -0,0 +1,59 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64i_zbkb_zbkc_zbkx_zknd_zkne_zknh_zksed_zksh 
> > -mabi=lp64 -mcmodel=medlow -misa-spec=2.2" } */
> > +
> > +int main () {
> > +
> > +#ifndef __riscv_arch_test
> > +#error "__riscv_arch_test"
> > +#endif
> > +
> > +#if __riscv_xlen != 64
> > +#error "__riscv_xlen"
> > +#endif
> > +
> > +#if !defined(__riscv_i)
> > +#error "__riscv_i"
> > +#endif
> > +
> > +#if !defined(__riscv_zk)
> > +#error "__riscv_zk"
> > +#endif
> > +
> > +#if !defined(__riscv_zkn)
> > +#error "__riscv_zkn"
> > +#endif
> > +
> > +#if !defined(__riscv_zks)
> > +#error "__riscv_zks"
> > +#endif
> > +
> > +#if !defined(__riscv_zbkb)
> > +#error "__riscv_zbkb"
> > +#endif
> > +
> > +#if !defined(__riscv_zbkc)
> > +#error "__riscv_zbkc"
> > +#endif
> > +
> > +#if !defined(__riscv_zbkx)
> > +#error "__riscv_zbkx"
> > +#endif
> > +
> > +#if !defined(__riscv_zknd)
> > +#error "__riscv_zknd"
> > +#endif
> > +
> > +#if !defined(__riscv_zkne)
> > +#error "__riscv_zkne"
> > +#endif
> > +
> > +#if !defined(__riscv_zknh)
> > +#error "__riscv_zknh"
> > +#endif
> > +
> > +#if !defined(__riscv_zksh)
> > +#error "__riscv_zksh"
> > +#endif
> > +
> > +  return 0;
> > +}
> > \ No newline at end of file
> > --
> > 2.31.1.windows.1
> >


Re: [PATCH] openmp, fortran: Check that event handles passed to detach clauses are not arrays [PR104131]

2022-02-28 Thread Mikael Morin

Le 28/02/2022 à 15:27, Kwok Cheung Yeung a écrit :

On 28/02/2022 2:07 pm, Jakub Jelinek wrote:

(...)

Don't we usually test instead || (*expr)->rank != 0 when testing for
scalars?


(...)


So (*expr)->rank is 0 here even with an array. I'm not sure why - is 
rank updated later, or did we forget to call something on the event 
handle expression?


Testing against n->sym->as for an array check has been used elsewhere in 
openmp.cc, to prevent reductions against arrays in OpenACC in 
resolve_omp_clauses.



I can’t tell what openmp requires; it depends on your needs.

Checking sym->as captures array variables, which may include scalar
expressions (arr(10) is a scalar expression even if arr is an array
variable), while checking expr->rank captures only array expressions,
including a scalar variable with an array subcomponent (scal%array_comp(:)
is an array expression, even if scal is a scalar variable).


gfc_resolve_expr, through gfc_expression_rank takes care of properly 
setting expr->rank.
If the check is done at resolution stage (somewhere in 
resolve_omp_clauses I guess?), the rank should be set.


I hope it helps.
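
The difference between the two checks can be tabulated with a toy model. Symbol, Expr, Ref and rank_model are made-up names rather than gfortran's data structures, and every array is assumed to be one-dimensional for simplicity. The three cases correspond to arr(10), scal%array_comp(:) and a whole array variable.

```cpp
#include <cassert>

// Kinds of reference that can follow the base symbol in an expression.
enum class Ref { None, ArrayElement, ArraySection, ArrayComponentSection };

struct Symbol { bool has_array_spec; };   // models sym->as != NULL

struct Expr {
  Symbol sym;
  Ref ref;
};

// Models a resolved expr->rank for a handful of cases (0 means scalar).
int rank_model(const Expr &e)
{
  switch (e.ref) {
    case Ref::ArrayElement:          return 0;  // arr(10)
    case Ref::ArraySection:          return 1;  // arr(:)
    case Ref::ArrayComponentSection: return 1;  // scal%array_comp(:)
    case Ref::None:                  return e.sym.has_array_spec ? 1 : 0;
  }
  return 0;
}

// The check on the symbol versus the check on the expression.
bool rejected_by_symbol_check(const Expr &e) { return e.sym.has_array_spec; }
bool rejected_by_rank_check(const Expr &e)   { return rank_model(e) != 0; }
```

The two checks agree on a whole array variable but disagree on an array element and on an array component of a scalar, so which one is right depends on what the clause is meant to forbid.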


Re: [PATCH 3/5 V1] RISC-V:Implement intrinsics for Crypto extension

2022-02-28 Thread Kito Cheng via Gcc-patches
Those header files have license issues: they should be relicensed to the
GPL. Also, please don't add rvk_asm_intrin.h and rvk_emu_intrin.h, since
they are not very meaningful once we have compiler support.

General comments:
- Use /* */ rather than // for comments; // is not valid in C89, so
/* */ is more compatible.
- Add a newline at the end of each file, which avoids "\ No newline at
end of file" in the diff.

> --- /dev/null
> +++ b/gcc/config/riscv/riscv_crypto_scalar.h
> @@ -0,0 +1,247 @@
> +// riscv_crypto_scalar.h
> +// 2021-11-08  Markku-Juhani O. Saarinen 
> +// Copyright (c) 2021, PQShield Ltd. All rights reserved.
> +
> +// === Scalar crypto: General mapping from intrinsics to compiler 
> builtins,
> +// inline assembler, or to an (insecure) porting / emulation 
> layer.
> +
> +/*
> + * _rv_*(...)
> + *   RV32/64 intrinsics that return the "long" data type
> + *
> + * _rv32_*(...)
> + *   RV32/64 intrinsics that return the "int32_t" data type
> + *
> + * _rv64_*(...)
> + *   RV64-only intrinsics that return the "int64_t" data type
> + *
> + */
> +
> +#ifndef _RISCV_CRYPTO_SCALAR_H
> +#define _RISCV_CRYPTO_SCALAR_H
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#if !defined(__riscv_xlen) && !defined(RVKINTRIN_EMULATE)
> +#warning "Target is not RISC-V. Enabling insecure emulation."
> +#define RVKINTRIN_EMULATE 1
> +#endif
> +
> +#if defined(RVKINTRIN_EMULATE)
> +
> +// intrinsics via emulation (insecure -- porting / debug option)
> +#include "rvk_emu_intrin.h"
> +#define _RVK_INTRIN_IMPL(s) _rvk_emu_##s
> +
> +#elif defined(RVKINTRIN_ASSEMBLER)
> +
> +// intrinsics via inline assembler (builtins not available)
> +#include "rvk_asm_intrin.h"
> +#define _RVK_INTRIN_IMPL(s) _rvk_asm_##s
> +#else
> +
> +// intrinsics via compiler builtins
> +#include 
> +#define _RVK_INTRIN_IMPL(s) __builtin_riscv_##s
> +
> +#endif

Drop rvk_emu_intrin.h and rvk_asm_intrin.h here.

> +
> +// set type if not already set
> +#if !defined(RVKINTRIN_RV32) && !defined(RVKINTRIN_RV64)
...
> +static inline long _rv_sm3p0(long rs1)
> +   { return _RVK_INTRIN_IMPL(sm3p0)(rs1); }  
>   //  SM3P0
> +
> +static inline long _rv_sm3p1(long rs1)
> +   { return _RVK_INTRIN_IMPL(sm3p1)(rs1); }  
>   //  SM3P1
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +

Please #undef _RVK_INTRIN_IMPL before the end of this header so that it
does not leak an unexpected macro into user code.
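
A minimal sketch of that pattern follows. All EXAMPLE_* and example_* names are hypothetical and only stand in for the _RVK_INTRIN_IMPL machinery quoted above. The helper macro is used to build the wrappers and is then removed, so code that includes the header cannot see it.

```cpp
#include <cassert>

/* --- start of a hypothetical header --- */
#ifndef EXAMPLE_INTRIN_H
#define EXAMPLE_INTRIN_H

/* Helper used only to build the wrappers below.  */
#define EXAMPLE_INTRIN_IMPL(s) example_impl_##s

/* Stand-in for a builtin; a real header would map to a compiler builtin.  */
static inline long example_impl_twice (long x) { return 2 * x; }

/* Public wrapper, expanded through the helper macro.  */
static inline long example_twice (long x)
{ return EXAMPLE_INTRIN_IMPL (twice) (x); }

/* The helper is an implementation detail, so drop it again.  */
#undef EXAMPLE_INTRIN_IMPL

#endif /* EXAMPLE_INTRIN_H */
/* --- end of header --- */

/* Reports whether the helper macro leaked out of the "header".  */
inline bool helper_macro_leaked ()
{
#ifdef EXAMPLE_INTRIN_IMPL
  return true;
#else
  return false;
#endif
}
```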


> +#endif //  _RISCV_CRYPTO_SCALAR_H
> \ No newline at end of file


Re: [PATCH] i386: Fix V8HF vector init under -mno-avx [PR 104664]

2022-02-28 Thread Uros Bizjak via Gcc-patches
On Mon, Feb 28, 2022 at 9:59 AM Hongyu Wang  wrote:
>
> Hi,
>
> For V8HFmode vector init with HFmode, do not directly emit a V8HF move
> with a subreg, which may cause reload to assign a general register to
> the move source.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,}.
>
> Ok for master?
>
> gcc/ChangeLog:
>
> PR target/104664
> * config/i386/i386-expand.cc (ix86_expand_vector_init_duplicate):
>   Use vec_setv8hf_0 for HF to V8HFmode move instead of subreg.
>
> gcc/testsuite/ChangeLog:
>
> PR target/104664
> * gcc.target/i386/pr104664.c: New test.

OK.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386-expand.cc   |  7 ++-
>  gcc/testsuite/gcc.target/i386/pr104664.c | 16 
>  2 files changed, 22 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr104664.c
>
> diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> index faa0191c6dd..530f83fab88 100644
> --- a/gcc/config/i386/i386-expand.cc
> +++ b/gcc/config/i386/i386-expand.cc
> @@ -14899,7 +14899,12 @@ ix86_expand_vector_init_duplicate (bool mmx_ok, 
> machine_mode mode,
>   dperm.one_operand_p = true;
>
>   if (mode == V8HFmode)
> -   tmp1 = lowpart_subreg (V8HFmode, force_reg (HFmode, val), HFmode);
> +   {
> + tmp1 = force_reg (HFmode, val);
> + tmp2 = gen_reg_rtx (mode);
> + emit_insn (gen_vec_setv8hf_0 (tmp2, CONST0_RTX (mode), tmp1));
> + tmp1 = gen_lowpart (mode, tmp2);
> +   }
>   else
> {
>   /* Extend to SImode using a paradoxical SUBREG.  */
> diff --git a/gcc/testsuite/gcc.target/i386/pr104664.c 
> b/gcc/testsuite/gcc.target/i386/pr104664.c
> new file mode 100644
> index 000..8a3d6c7cc85
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr104664.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-march=x86-64 -mtune=generic -Og -ffinite-math-only" } */
> +
> +typedef _Float128 __attribute__((__vector_size__ (16))) U;
> +typedef _Float128 __attribute__((__vector_size__ (32))) V;
> +typedef _Float16  __attribute__((__vector_size__ (16))) W;
> +
> +U u;
> +V v;
> +W w;
> +
> +void
> +foo (void)
> +{
> +w *= (W)(u == __builtin_shufflevector (v, u, 2));
> +}
> --
> 2.18.1
>


Re: [PATCH] Fix error recovery in toplev::finalize.

2022-02-28 Thread David Malcolm via Gcc-patches
On Mon, 2022-02-28 at 12:49 +0100, Martin Liška wrote:
> Use flag_checking instead of CHECKING_P
> and run toplev::finalize only if no error has been seen.
> 
> The patch bootstraps on x86_64-linux-gnu and survives regression
> tests.

Did the testing include the libgccjit test suite?  ("jit" is not in
--enable-languages=all)

> 
> Ready to be installed?

I'm not keen on this change; IIRC it's valid to attempt to compile a
gcc_jit_context that fails with an error, and then to attempt a
different gcc_jit_context that succeeds, within the same process.  If
I'm reading the patch right, the patch as written removes this cleanup,
which would thwart that.

I can try to cook up a testcase for the above use case.

Is there another way to fix PR 104648?

Thanks
Dave
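
The concern above can be made concrete with a toy model of process-global state that has to be cleaned up between compiles. This is not libgccjit's API: GlobalState, finalize_model and compile_model are made up for illustration. The assumption that stale state makes the next compile fail is exactly the behaviour in question, not an established fact about toplev.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for compiler-global state that survives between compiles.
struct GlobalState {
  std::vector<std::string> pending;   // data left over from a compile
};

static GlobalState g_state;

// Stand-in for the cleanup step: resets the global state.
void finalize_model() { g_state.pending.clear(); }

// Toy compile: fails on the source text "bad".  Returns true on success.
// finalize_on_error selects whether cleanup also runs after a failure.
bool compile_model(const std::string &source, bool finalize_on_error)
{
  // A fresh compile expects to start from clean state.
  if (!g_state.pending.empty())
    return false;

  g_state.pending.push_back(source);
  const bool ok = source != "bad";
  if (ok || finalize_on_error)
    finalize_model();
  return ok;
}
```

With cleanup skipped after an error, the second compile in the same process fails even though its input is fine. That is the scenario a libgccjit test case would need to cover.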



> Thanks,
> Martin
> 
> PR ipa/104648
> 
> gcc/ChangeLog:
> 
> * main.cc (main): Use flag_checking instead of CHECKING_P
> and run toplev::finalize only if there is not error seen.
> 
> gcc/testsuite/ChangeLog:
> 
> * g++.dg/pr104648.C: New test.
> ---
>   gcc/main.cc | 6 +++---
>   gcc/testsuite/g++.dg/pr104648.C | 9 +
>   2 files changed, 12 insertions(+), 3 deletions(-)
>   create mode 100644 gcc/testsuite/g++.dg/pr104648.C
> 
> diff --git a/gcc/main.cc b/gcc/main.cc
> index f9dd6b2af58..4ba28b7de53 100644
> --- a/gcc/main.cc
> +++ b/gcc/main.cc
> @@ -37,9 +37,9 @@ main (int argc, char **argv)
>  true /* init_signals */);
>   
>     int r = toplev.main (argc, argv);
> -#if CHECKING_P
> -  toplev.finalize ();
> -#endif
> +
> +  if (flag_checking && !seen_error ())
> +    toplev.finalize ();
>   
>     return r;
>   }
> diff --git a/gcc/testsuite/g++.dg/pr104648.C
> b/gcc/testsuite/g++.dg/pr104648.C
> new file mode 100644
> index 000..b8b7c2864cf
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pr104648.C
> @@ -0,0 +1,9 @@
> +// { dg-do compile }
> +// { dg-options "-fvtable-verify=preinit" }
> +
> +struct A {};
> +struct B : virtual A
> +{
> +  B () {};
> +  B () {}; /* { dg-error "cannot be overloaded with" } */
> +};




Re: [PATCH] openmp, fortran: Check that event handles passed to detach clauses are not arrays [PR104131]

2022-02-28 Thread Kwok Cheung Yeung

On 28/02/2022 2:07 pm, Jakub Jelinek wrote:

On Mon, Feb 28, 2022 at 02:01:03PM +, Kwok Cheung Yeung wrote:

diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 19142c4d8d0..50a1c476009 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -531,9 +531,10 @@ gfc_match_omp_detach (gfc_expr **expr)
if (gfc_match_variable (expr, 0) != MATCH_YES)
  goto syntax_error;
  
-  if ((*expr)->ts.type != BT_INTEGER || (*expr)->ts.kind != gfc_c_intptr_kind)
+  if ((*expr)->ts.type != BT_INTEGER || (*expr)->ts.kind != gfc_c_intptr_kind
+  || (*expr)->symtree->n.sym->as)


Don't we usually test instead || (*expr)->rank != 0 when testing for
scalars?

Jakub



If I run GCC in GDB on the pr104131.f90 testcase and inspect the expr, I 
get:


534   if ((*expr)->ts.type != BT_INTEGER || (*expr)->ts.kind != gfc_c_intptr_kind

(gdb) p **expr
$2 = {expr_type = EXPR_VARIABLE, ts = {type = BT_INTEGER, kind = 8, u = 
{derived = 0x0, cl = 0x0, pad = 0}, interface = 0x0, is_c_interop = 1, 
is_iso_c = 0, f90_type = BT_INTEGER, deferred = false, interop_kind = 
0x2e3fb80}, rank = 0, shape = 0x0, symtree = 0x2e3ffe0, ref = 0x2e3e600, 
where = { ...


So (*expr)->rank is 0 here even with an array. I'm not sure why - is 
rank updated later, or did we forget to call something on the event 
handle expression?


Testing against n->sym->as for an array check has been used elsewhere in 
openmp.cc, to prevent reductions against arrays in OpenACC in 
resolve_omp_clauses.
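
A minimal sketch of the difference between the two checks, using toy
stand-ins rather than the real gfortran structures: toy_symbol,
toy_array_spec, toy_expr and valid_event_handle are hypothetical names,
and the sketch assumes the expression's rank has not been filled in yet
at match time, as the GDB output above suggests.

```cpp
// Toy stand-ins for the front-end data structures; not gfortran code.
struct toy_array_spec
{
  int rank;
  int corank;
};

struct toy_symbol
{
  const char *name;
  const toy_array_spec *as;   // non-null for arrays and coarrays
};

struct toy_expr
{
  bool is_integer;
  int kind;
  int rank;                   // may still be 0 before resolution
  const toy_symbol *sym;
};

// Mirrors the condition in the posted patch: integer type, matching
// kind, and no array spec on the underlying symbol.
bool
valid_event_handle (const toy_expr &e, int handle_kind)
{
  return e.is_integer && e.kind == handle_kind && e.sym->as == nullptr;
}
```

Testing the symbol's array spec rejects an array-valued handle even while
the expression's rank still reads 0; as in the posted patch, it also
rejects coarrays, since they carry an array spec too.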


Kwok


Re: [PATCH] openmp, fortran: Check that event handles passed to detach clauses are not arrays [PR104131]

2022-02-28 Thread Jakub Jelinek via Gcc-patches
On Mon, Feb 28, 2022 at 02:01:03PM +, Kwok Cheung Yeung wrote:
> gcc/fortran/
> 
>   PR fortran/104131
>   * openmp.cc (gfc_match_omp_detach): Check that the event handle is not
>   an array type.
> 
> gcc/testsuite/
> 
>   PR fortran/104131
>   * gfortran.dg/gomp/pr104131.f90: New.
>   * gfortran.dg/gomp/pr104131-2.f90: New.
>   * gfortran.dg/gomp/task-detach-1.f90: Update expected error message.
> ---
>  gcc/fortran/openmp.cc|  5 +++--
>  gcc/testsuite/gfortran.dg/gomp/pr104131-2.f90| 10 ++
>  gcc/testsuite/gfortran.dg/gomp/pr104131.f90  | 10 ++
>  gcc/testsuite/gfortran.dg/gomp/task-detach-1.f90 |  2 +-
>  4 files changed, 24 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gfortran.dg/gomp/pr104131-2.f90
>  create mode 100644 gcc/testsuite/gfortran.dg/gomp/pr104131.f90
> 
> diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
> index 19142c4d8d0..50a1c476009 100644
> --- a/gcc/fortran/openmp.cc
> +++ b/gcc/fortran/openmp.cc
> @@ -531,9 +531,10 @@ gfc_match_omp_detach (gfc_expr **expr)
>if (gfc_match_variable (expr, 0) != MATCH_YES)
>  goto syntax_error;
>  
> -  if ((*expr)->ts.type != BT_INTEGER || (*expr)->ts.kind != gfc_c_intptr_kind)
> +  if ((*expr)->ts.type != BT_INTEGER || (*expr)->ts.kind != gfc_c_intptr_kind
> +  || (*expr)->symtree->n.sym->as)

Don't we usually test instead || (*expr)->rank != 0 when testing for
scalars?

Jakub



[PATCH] openmp, fortran: Check that event handles passed to detach clauses are not arrays [PR104131]

2022-02-28 Thread Kwok Cheung Yeung

Hello

This patch addresses PR fortran/104131 on the GCC bug tracker, where an 
ICE would occur if an array or co-array was passed as the event handle 
in the detach clause of a task.


Since the event handle is supposed to be a scalar of type 
omp_event_handle_kind, we can simply reject the event handle during 
parsing if it is any type of array, thereby preventing the situation 
leading to an ICE in the first place.


Okay for trunk?

Thanks

Kwok

From 8ed3b8bd793298f94bdefbdff32f91eaea1a9d70 Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung 
Date: Mon, 28 Feb 2022 12:34:22 +
Subject: [PATCH] openmp, fortran: Check that event handles passed to detach
 clauses are not arrays [PR104131]

2022-02-28  Kwok Cheung Yeung  

gcc/fortran/

PR fortran/104131
* openmp.cc (gfc_match_omp_detach): Check that the event handle is not
an array type.

gcc/testsuite/

PR fortran/104131
* gfortran.dg/gomp/pr104131.f90: New.
* gfortran.dg/gomp/pr104131-2.f90: New.
* gfortran.dg/gomp/task-detach-1.f90: Update expected error message.
---
 gcc/fortran/openmp.cc|  5 +++--
 gcc/testsuite/gfortran.dg/gomp/pr104131-2.f90| 10 ++
 gcc/testsuite/gfortran.dg/gomp/pr104131.f90  | 10 ++
 gcc/testsuite/gfortran.dg/gomp/task-detach-1.f90 |  2 +-
 4 files changed, 24 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/pr104131-2.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/pr104131.f90

diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 19142c4d8d0..50a1c476009 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -531,9 +531,10 @@ gfc_match_omp_detach (gfc_expr **expr)
   if (gfc_match_variable (expr, 0) != MATCH_YES)
 goto syntax_error;
 
-  if ((*expr)->ts.type != BT_INTEGER || (*expr)->ts.kind != gfc_c_intptr_kind)
+  if ((*expr)->ts.type != BT_INTEGER || (*expr)->ts.kind != gfc_c_intptr_kind
+  || (*expr)->symtree->n.sym->as)
 {
-  gfc_error ("%qs at %L should be of type "
+  gfc_error ("%qs at %L should be a scalar of type "
 "integer(kind=omp_event_handle_kind)",
 (*expr)->symtree->n.sym->name, &(*expr)->where);
   return MATCH_ERROR;
diff --git a/gcc/testsuite/gfortran.dg/gomp/pr104131-2.f90 b/gcc/testsuite/gfortran.dg/gomp/pr104131-2.f90
new file mode 100644
index 000..8d10367ba3b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/pr104131-2.f90
@@ -0,0 +1,10 @@
+! { dg-do compile }
+! { dg-options "-fopenmp -fcoarray=single" }
+
+program p
+  use iso_c_binding, only: c_intptr_t
+  integer, parameter :: omp_event_handle_kind = c_intptr_t
+  integer (kind=omp_event_handle_kind) :: x[*]
+  !$omp task detach (x) ! { dg-error "'x' at \\\(1\\\) should be a scalar of type integer\\\(kind=omp_event_handle_kind\\\)" }
+  !$omp end task ! { dg-error "Unexpected !\\\$OMP END TASK statement at \\\(1\\\)" }
+end
diff --git a/gcc/testsuite/gfortran.dg/gomp/pr104131.f90 b/gcc/testsuite/gfortran.dg/gomp/pr104131.f90
new file mode 100644
index 000..70a2dedfd7f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/pr104131.f90
@@ -0,0 +1,10 @@
+! { dg-do compile }
+! { dg-options "-fopenmp" }
+
+program p
+  use iso_c_binding, only: c_intptr_t
+  integer, parameter :: omp_event_handle_kind = c_intptr_t
+  integer(omp_event_handle_kind) :: x(1)
+  !$omp task detach(x) ! { dg-error "'x' at \\\(1\\\) should be a scalar of type integer\\\(kind=omp_event_handle_kind\\\)" }
+  !$omp end task ! { dg-error "Unexpected !\\\$OMP END TASK statement at \\\(1\\\)" }
+end
diff --git a/gcc/testsuite/gfortran.dg/gomp/task-detach-1.f90 b/gcc/testsuite/gfortran.dg/gomp/task-detach-1.f90
index 020be13a8b6..b73db07b7c3 100644
--- a/gcc/testsuite/gfortran.dg/gomp/task-detach-1.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/task-detach-1.f90
@@ -18,7 +18,7 @@ program task_detach_1
   !$omp task detach(x) mergeable ! { dg-error "'DETACH' clause at \\\(1\\\) must not be used together with 'MERGEABLE' clause" }
   !$omp end task
 
-  !$omp task detach(z) ! { dg-error "'z' at \\\(1\\\) should be of type integer\\\(kind=omp_event_handle_kind\\\)" }
+  !$omp task detach(z) ! { dg-error "'z' at \\\(1\\\) should be a scalar of type integer\\\(kind=omp_event_handle_kind\\\)" }
   !$omp end task ! { dg-error "Unexpected !\\\$OMP END TASK statement at \\\(1\\\)" }
   
   !$omp task detach (x) firstprivate (x) ! { dg-error "DETACH event handle 'x' in FIRSTPRIVATE clause at \\\(1\\\)" }
-- 
2.25.1



[PATCH] ipa: Improve error handling for target_clone single value

2022-02-28 Thread Martin Liška

The patch moves attribute checking to handle_target_clones_attribute where
we drop the attribute if it contains only a single value.
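
For illustration, the single-value test comes down to counting the
comma-separated names across the attribute's string arguments.  A
standalone sketch of that counting, assuming plain std::string arguments
in place of GCC's tree lists; target_clone_attr_len is a hypothetical
stand-in, not the function added by the patch:

```cpp
#include <cstring>
#include <string>
#include <vector>

// Standalone model of the length helper.  The real code walks a list of
// string constants; a vector of std::string stands in for it here.
// Returns the summed string lengths (each plus one) when more than one
// target is named, and -1 otherwise.
int
target_clone_attr_len (const std::vector<std::string> &args)
{
  int str_len_sum = 0;
  int argnum = 0;

  for (const std::string &arg : args)
    {
      const char *str = arg.c_str ();
      str_len_sum += static_cast<int> (std::strlen (str)) + 1;
      // Every comma inside one string argument names another target.
      for (const char *p = std::strchr (str, ','); p;
           p = std::strchr (p + 1, ','))
        argnum++;
      argnum++;
    }
  return argnum <= 1 ? -1 : str_len_sum;
}
```

A lone "avx" yields -1, which is the case the new warning reports, while
"avx,default" and the pair "avx", "default" both count as two targets.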

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

PR ipa/104533

gcc/c-family/ChangeLog:

* c-attribs.cc (handle_target_clones_attribute): Use
get_target_clone_attr_len and report warning soon.

gcc/ChangeLog:

* multiple_target.cc (get_attr_len): Move to tree.cc.
(expand_target_clones): Remove single value checking.
* tree.cc (get_target_clone_attr_len): New fn.
* tree.h (get_target_clone_attr_len): Likewise.

gcc/testsuite/ChangeLog:

* g++.target/i386/pr104533.C: New test.
---
 gcc/c-family/c-attribs.cc|  6 ++
 gcc/multiple_target.cc   | 26 +---
 gcc/testsuite/g++.target/i386/pr104533.C | 11 ++
 gcc/tree.cc  | 24 ++
 gcc/tree.h   |  2 ++
 5 files changed, 44 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/pr104533.C

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 3849dba90b2..d394ea9d57e 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -5486,6 +5486,12 @@ handle_target_clones_attribute (tree *node, tree name, tree ARG_UNUSED (args),
   "with %qs attribute", name, "target");
  *no_add_attrs = true;
}
+  else if (get_target_clone_attr_len (args) == -1)
+   {
+ warning (OPT_Wattributes,
+  "single %<target_clones%> attribute is ignored");
+ *no_add_attrs = true;
+   }
   else
   /* Do not inline functions with multiple clone targets.  */
DECL_UNINLINABLE (*node) = 1;
diff --git a/gcc/multiple_target.cc b/gcc/multiple_target.cc
index 5a5a75f0c38..7fe02fb55c8 100644
--- a/gcc/multiple_target.cc
+++ b/gcc/multiple_target.cc
@@ -185,30 +185,6 @@ create_dispatcher_calls (struct cgraph_node *node)
 }
 }
 
-/* Return length of attribute names string,
-   if arglist chain > 1, -1 otherwise.  */
-
-static int
-get_attr_len (tree arglist)
-{
-  tree arg;
-  int str_len_sum = 0;
-  int argnum = 0;
-
-  for (arg = arglist; arg; arg = TREE_CHAIN (arg))
-{
-  const char *str = TREE_STRING_POINTER (TREE_VALUE (arg));
-  size_t len = strlen (str);
-  str_len_sum += len + 1;
-  for (const char *p = strchr (str, ','); p; p = strchr (p + 1, ','))
-   argnum++;
-  argnum++;
-}
-  if (argnum <= 1)
-return -1;
-  return str_len_sum;
-}
-
 /* Create string with attributes separated by comma.
Return number of attributes.  */
 
@@ -342,7 +318,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
 return false;
 
   tree arglist = TREE_VALUE (attr_target);
-  int attr_len = get_attr_len (arglist);
+  int attr_len = get_target_clone_attr_len (arglist);
 
   /* No need to clone for 1 target attribute.  */
   if (attr_len == -1)
diff --git a/gcc/testsuite/g++.target/i386/pr104533.C b/gcc/testsuite/g++.target/i386/pr104533.C
new file mode 100644
index 000..6a1d8def097
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr104533.C
@@ -0,0 +1,11 @@
+/* PR ipa/104533 */
+/* { dg-do compile } */
+/* { dg-additional-options "-std=c++11 -fPIC -Ofast -fno-semantic-interposition" } */
+/* { dg-require-ifunc "" } */
+
+struct B
+{
+  virtual ~B();
+};
+__attribute__((target_clones("avx")))
+B::~B() = default; /* { dg-warning "single .target_clones. attribute is ignored" } */
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 2bbef2d6b75..4522d90c4d9 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -14553,6 +14553,30 @@ get_attr_nonstring_decl (tree expr, tree *ref)
   return NULL_TREE;
 }
 
+/* Return length of attribute names string,
+   if arglist chain > 1, -1 otherwise.  */
+
+int
+get_target_clone_attr_len (tree arglist)
+{
+  tree arg;
+  int str_len_sum = 0;
+  int argnum = 0;
+
+  for (arg = arglist; arg; arg = TREE_CHAIN (arg))
+{
+  const char *str = TREE_STRING_POINTER (TREE_VALUE (arg));
+  size_t len = strlen (str);
+  str_len_sum += len + 1;
+  for (const char *p = strchr (str, ','); p; p = strchr (p + 1, ','))
+   argnum++;
+  argnum++;
+}
+  if (argnum <= 1)
+return -1;
+  return str_len_sum;
+}
+
 #if CHECKING_P
 
 namespace selftest {
diff --git a/gcc/tree.h b/gcc/tree.h
index 95334b077da..36ceed57064 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -6579,4 +6579,6 @@ extern unsigned fndecl_dealloc_argno (tree);
object or pointer.  Otherwise return null.  */
 extern tree get_attr_nonstring_decl (tree, tree * = NULL);
 
+extern int get_target_clone_attr_len (tree);
+
 #endif  /* GCC_TREE_H  */
--
2.35.1



Re: [PATCH] Fix error recovery in toplev::finalize.

2022-02-28 Thread Richard Biener via Gcc-patches
On Mon, Feb 28, 2022 at 12:49 PM Martin Liška  wrote:
>
> Use flag_checking instead of CHECKING_P
> and run toplev::finalize only if no error has been seen.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

OK.

Richard.

> Thanks,
> Martin
>
> PR ipa/104648
>
> gcc/ChangeLog:
>
> * main.cc (main): Use flag_checking instead of CHECKING_P
> and run toplev::finalize only if no error has been seen.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/pr104648.C: New test.
> ---
>   gcc/main.cc | 6 +++---
>   gcc/testsuite/g++.dg/pr104648.C | 9 +
>   2 files changed, 12 insertions(+), 3 deletions(-)
>   create mode 100644 gcc/testsuite/g++.dg/pr104648.C
>
> diff --git a/gcc/main.cc b/gcc/main.cc
> index f9dd6b2af58..4ba28b7de53 100644
> --- a/gcc/main.cc
> +++ b/gcc/main.cc
> @@ -37,9 +37,9 @@ main (int argc, char **argv)
>  true /* init_signals */);
>
> int r = toplev.main (argc, argv);
> -#if CHECKING_P
> -  toplev.finalize ();
> -#endif
> +
> +  if (flag_checking && !seen_error ())
> +toplev.finalize ();
>
> return r;
>   }
> diff --git a/gcc/testsuite/g++.dg/pr104648.C b/gcc/testsuite/g++.dg/pr104648.C
> new file mode 100644
> index 000..b8b7c2864cf
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pr104648.C
> @@ -0,0 +1,9 @@
> +// { dg-do compile }
> +// { dg-options "-fvtable-verify=preinit" }
> +
> +struct A {};
> +struct B : virtual A
> +{
> +  B () {};
> +  B () {}; /* { dg-error "cannot be overloaded with" } */
> +};
> --
> 2.35.1
>


Re: PING**3 - [PATCH] middle-end: Support ABIs that pass FP values as wider integers.

2022-02-28 Thread Richard Biener via Gcc-patches
On Mon, 28 Feb 2022, Tobias Burnus wrote:

> Ping**3
> 
> On 23.02.22 09:42, Tobias Burnus wrote:
> > PING**2 for the ME review or at least comments to that patch,
> > which fixes a build issue/ICE with nvptx
> >
> > Patch:
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html
> > (for gcc/cfgexpand.cc + gcc/expr.cc)
> >
> > (There is some discussion by Tom and Roger about the BE in the patch
> > thread, which does not relate to the ME patch. But there is no
> > ME-patch comment so far.)
> The related BE patch has been already committed, but to be effective, it
> needs the ME patch.

I'm not sure I'm qualified to review this - maybe Richard is.

Richard.

> >
> > Thanks,
> >
> > Tobias
> >
> > On 17.02.22 15:35, Tobias Burnus wrote:
> >> PING for this cfgexpand.cc + expr.cc change by Roger.
> >>
> >> This is a pre-requisite for Roger's nvptx patch to avoid an ICE
> >> during bootstrap:
> >>
> >> * https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590250.html
> >>   "[PATCH] nvptx: Back-end portion of a fix for PR target/104489."
> >>   (see patch for additional reasoning for this patch)
> >> * See also https://gcc.gnu.org/PR104489
> >>nvptx, sm_53: internal compiler error: in gen_rtx_SUBREG, at
> >> emit-rtl.cc:1022
> >>
> >> Thanks,
> >>
> >> Tobias
> >>
> >> On 09.02.22 21:12, Roger Sayle wrote:
> >>> This patch adds middle-end support for target ABIs that pass/return
> >>> floating point values in integer registers with precision wider than
> >>> the original FP mode.  An example, is the nvptx backend where 16-bit
> >>> HFmode registers are passed/returned as (promoted to) SImode registers.
> >>> Unfortunately, this currently falls foul of the various (recent?)
> >>> sanity
> >>> checks that (very sensibly) prevent creating paradoxical SUBREGs of
> >>> floating point registers.  The approach below is to explicitly
> >>> perform the
> >>> conversion/promotion in two steps, via an integer mode of same
> >>> precision
> >>> as the floating point value.  So on nvptx, 16-bit HFmode is initially
> >>> converted to 16-bit HImode (using SUBREG), then zero-extended to
> >>> SImode,
> >>> and likewise when going the other way, parameters truncated to HImode
> >>> then converted to HFmode (using SUBREG).  These changes are localized
> >>> to expand_value_return and expanding DECL_RTL to support strange ABIs,
> >>> rather than inside convert_modes or gen_lowpart, as mismatched
> >>> precision integer/FP conversions should be explicit in the RTL,
> >>> and these semantics not generally visible/implicit in user code.
> >>>
> >>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> >>> and make -k check with no new failures, and on nvptx-none, where it is
> >>> the middle-end portion of a pair of patches to allow the default ISA to
> >>> be advanced.  Ok for mainline?
> >>>
> >>>
> >>> 2022-02-09  Roger Sayle  
> >>>
> >>> gcc/ChangeLog
> >>> * cfgexpand.cc (expand_value_return): Allow backends to promote
> >>> a scalar floating point return value to a wider integer mode.
> >>> * expr.cc (expand_expr_real_1) [expand_decl_rtl]: Likewise,
> >>> allow
> >>> backends to promote scalar FP PARM_DECLs to wider integer
> >>> modes.
> >>>
> >>>
> >>> Thanks in advance,
> >>> Roger
> >>> --
> >>>
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht
> München, HRB 106955
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


Re: [PATCH 2/2]middle-end Backport complex vect testsuite to GCC 11

2022-02-28 Thread Richard Biener via Gcc-patches
On Mon, 28 Feb 2022, Tamar Christina wrote:

> Hi All,
> 
> The GCC 12 testsuite for complex numbers pattern recognition is a lot more
> exhaustive than the GCC 11 one.  This backports the testsuite to GCC 11 so
> that any further changes to the branch are checked for regressions.
> 
> Bootstrapped and regtested on aarch64-none-linux-gnu and
> x86_64-pc-linux-gnu with no regressions.
> 
> Ok for GCC-11?

OK.

> Thanks,
> Tamar
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/complex/bb-slp-complex-add-pattern-int.c: Update test
>   cases to not be UNSUPPORTED.
>   * gcc.dg/vect/complex/bb-slp-complex-add-pattern-long.c: Likewise.
>   * gcc.dg/vect/complex/bb-slp-complex-add-pattern-short.c: Likewise.
>   * gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-int.c:
>   Likewise.
>   * gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-long.c:
>   Likewise.
>   * gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-short.c:
>   Likewise.
>   * gcc.dg/vect/complex/complex-add-pattern-template.c: Likewise.
>   * gcc.dg/vect/complex/complex-add-template.c: Likewise.
>   * gcc.dg/vect/complex/complex-operations-run.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-double.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-float.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-half-float.c:
>   Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-double.c:
>   Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-float.c:
>   Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-half-float.c:
>   Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-half-float.c:
>   Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-float.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-half-float.c:
>   Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-double.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-float.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-half-float.c:
>   Likewise.
>   * gcc.dg/vect/complex/fast-math-complex-add-double.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-complex-add-float.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-complex-add-half-float.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-complex-add-pattern-double.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-complex-add-pattern-float.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-complex-add-pattern-half-float.c:
>   Likewise.
>   * gcc.dg/vect/complex/fast-math-complex-mla-double.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-complex-mla-float.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-complex-mla-half-float.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-complex-mls-double.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-complex-mls-float.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-complex-mls-half-float.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-complex-mul-double.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-complex-mul-float.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-complex-mul-half-float.c: Likewise.
>   * gcc.dg/vect/complex/vect-complex-add-pattern-byte.c: Likewise.
>   * gcc.dg/vect/complex/vect-complex-add-pattern-int.c: Likewise.
>   * gcc.dg/vect/complex/vect-complex-add-pattern-long.c: Likewise.
>   * gcc.dg/vect/complex/vect-complex-add-pattern-short.c: Likewise.
>   * gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-byte.c:
>   Likewise.
>   * gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-int.c:
>   Likewise.
>   * gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-long.c:
>   Likewise.
>   * gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-short.c:
>   Likewise.
>   * gcc.dg/vect/complex/complex.exp: Copyright year update.
> 
> --- inline copy of patch -- 
> diff --git 
> a/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-int.c 
> b/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-int.c
> index 
> 8eba24dc187895150ee3515d5bd2a35b46528388..cead05f1cc4e02790630a6cbfe8378c2de3778f3
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-int.c
> +++ b/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-int.c
> @@ -1,12 +1,15 @@
>  /* { dg-do compile } */
> -/* { dg-require-effective-target vect_complex_add_int } */
>  /* { dg-require-effective-target stdint_types } */
> -/* { dg-add-options arm_v8_1m_mve_fp } */
> +/* { 

Re: [PATCH 1/2]middle-end Handle difference between complex negations in SLP tree better (GCC 11 backport)

2022-02-28 Thread Richard Biener via Gcc-patches
On Mon, 28 Feb 2022, Tamar Christina wrote:

> Hi All,
> 
> GCC 11 handled negations rather differently than GCC 12.  This difference
> caused the previous backport to regress some of the conjugate cases that it
> used to handle before.  The testsuite in GCC 11 wasn't as robust as that in
> master so it didn't catch it.
> 
> The second patch in this series backports the testcases from master to GCC-11
> to prevent this in the future.
> 
> This patch deals with the conjugate cases correctly by updating the detection
> code to deal with the different order of operands.
> 
> For MUL the problem is that the presence of an ADD can cause the order of the
> operands to flip, unlike in GCC 12.  So to handle this if we detect the shape
> of a MUL but the data-flow check fails, we swap both operands and try again.
> 
> Since a * b == b * a this is fine and allows us to keep the df-check simple.
> This doesn't cause a compile time issue either as most of the data will be in
> the caches from the previous call.
> 
> Bootstrapped and regtested on aarch64-none-linux-gnu and
> x86_64-pc-linux-gnu with no regressions on the updated testsuite.
> 
> Ok for GCC 11?

OK.

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * tree-vect-slp-patterns.c (vect_validate_multiplication): Correctly
>   detect conjugate cases.
>   (complex_mul_pattern::matches): Likewise.
>   (complex_fma_pattern::matches): Move accumulator last as expected.
>   (complex_fma_pattern::build): Likewise.
>   (complex_fms_pattern::matches): Handle different conjugate form.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
> index 
> a3bd90ff85b4ca5423a94388d480b66051a83e08..8b08a0f33dd1cd23ad9577243524c1feaa5e8ed9
>  100644
> --- a/gcc/tree-vect-slp-patterns.c
> +++ b/gcc/tree-vect-slp-patterns.c
> @@ -873,10 +873,8 @@ compatible_complex_nodes_p (slp_compat_nodes_map_t 
> *compat_cache,
>  static inline bool
>  vect_validate_multiplication (slp_tree_to_load_perm_map_t *perm_cache,
> slp_compat_nodes_map_t *compat_cache,
> -   vec _op,
> -   vec _op,
> -   bool subtract,
> -   enum _conj_status *_status)
> +   vec _op, vec _op,
> +   bool subtract, enum _conj_status *_status)
>  {
>auto_vec ops;
>enum _conj_status stats = CONJ_NONE;
> @@ -902,29 +900,31 @@ vect_validate_multiplication 
> (slp_tree_to_load_perm_map_t *perm_cache,
>  
>/* Default to style and perm 0, most operations use this one.  */
>int style = 0;
> -  int perm = subtract ? 1 : 0;
> +  int perm = 0;
>  
> -  /* Check if we have a negate operation, if so absorb the node and continue
> - looking.  */
> +  /* Determine which style we're looking at.  We only have different ones
> + whenever a conjugate is involved.  If so absorb the node and continue.  
> */
>bool neg0 = vect_match_expression_p (right_op[0], NEGATE_EXPR);
>bool neg1 = vect_match_expression_p (right_op[1], NEGATE_EXPR);
>  
> -  /* Determine which style we're looking at.  We only have different ones
> - whenever a conjugate is involved.  */
> -  if (neg0 && neg1)
> -;
> -  else if (neg0)
> -{
> -  right_op[0] = SLP_TREE_CHILDREN (right_op[0])[0];
> -  stats = CONJ_FST;
> -  if (subtract)
> - perm = 0;
> -}
> -  else if (neg1)
> +   /* Determine which style we're looking at.  We only have different ones
> +  whenever a conjugate is involved.  */
> +  if (neg0 != neg1 && (neg0 || neg1))
>  {
> -  right_op[1] = SLP_TREE_CHILDREN (right_op[1])[0];
> -  stats = CONJ_SND;
> -  perm = 1;
> +  unsigned idx = !!neg1;
> +  right_op[idx] = SLP_TREE_CHILDREN (right_op[idx])[0];
> +  if (linear_loads_p (perm_cache, left_op[!!!neg1]) == PERM_EVENEVEN)
> + {
> +   stats = CONJ_FST;
> +   style = 1;
> +   if (subtract && neg0)
> + perm = 1;
> + }
> +  else
> + {
> +   stats = CONJ_SND;
> +   perm = 1;
> + }
>  }
>  
>*_status = stats;
> @@ -1069,7 +1069,16 @@ complex_mul_pattern::matches (complex_operation_t op,
>enum _conj_status status;
>if (!vect_validate_multiplication (perm_cache, compat_cache, left_op,
>right_op, false, ))
> -return IFN_LAST;
> +{
> + /* Try swapping the operands and trying again.  */
> + std::swap (left_op[0], left_op[1]);
> + right_op.truncate (0);
> + right_op.safe_splice (SLP_TREE_CHILDREN (muls[1]));
> + std::swap (right_op[0], right_op[1]);
> + if (!vect_validate_multiplication (perm_cache, compat_cache, left_op,
> +right_op, false, ))
> +   return IFN_LAST;
> +}
>  
>if (status == CONJ_NONE)
>  ifn = IFN_COMPLEX_MUL;
> @@ -1089,7 +1098,7 @@ complex_mul_pattern::matches (complex_operation_t op,
>

Re: [PATCH] arc: Fix for new ifcvt behavior [PR104154]

2022-02-28 Thread Claudiu Zissulescu Ianculescu via Gcc-patches
Hi Robin,

The patch looks good. Please go ahead and merge it; let me know if
you cannot.

Thank you,
Claudiu

On Mon, Feb 21, 2022 at 9:57 AM Robin Dapp via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> Hi,
>
> I figured I'd just go ahead and post this patch as well since it seems
> to have fixed the arc build problems.
>
> It would be nice if someone could bootstrap/regtest if Jeff hasn't
> already done so.  I was able to verify that the two testcases attached
> to the PR build cleanly but not much more.  Thank you.
>
> Regards
>  Robin
>
> --
>
> PR104154
>
> gcc/ChangeLog:
>
> * config/arc/arc.cc (gen_compare_reg):  Return the CC-mode
> comparison ifcvt passed us.
>
> ---
>
> From fa98a40abd55e3a10653f6a8c5b2414a2025103b Mon Sep 17 00:00:00 2001
> From: Robin Dapp 
> Date: Mon, 7 Feb 2022 08:39:41 +0100
> Subject: [PATCH] arc: Fix for new ifcvt behavior [PR104154]
>
> ifcvt now passes a CC-mode "comparison" to backends.  This patch
> simply returns from gen_compare_reg () in that case since nothing
> needs to be prepared anymore.
>
> PR104154
>
> gcc/ChangeLog:
>
> * config/arc/arc.cc (gen_compare_reg):  Return the CC-mode
> comparison ifcvt passed us.
> ---
>  gcc/config/arc/arc.cc | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/gcc/config/arc/arc.cc b/gcc/config/arc/arc.cc
> index 8cc173519ab..5e40ec2c04d 100644
> --- a/gcc/config/arc/arc.cc
> +++ b/gcc/config/arc/arc.cc
> @@ -2254,6 +2254,12 @@ gen_compare_reg (rtx comparison, machine_mode omode)
>
>
>cmode = GET_MODE (x);
> +
> +  /* If ifcvt passed us a MODE_CC comparison we can
> + just return it.  It should be in the proper form already.   */
> +  if (GET_MODE_CLASS (cmode) == MODE_CC)
> +return comparison;
> +
>if (cmode == VOIDmode)
>  cmode = GET_MODE (y);
>gcc_assert (cmode == SImode || cmode == SFmode || cmode == DFmode);
> --
> 2.31.1
>
>


[PATCH] Fix error recovery in toplev::finalize.

2022-02-28 Thread Martin Liška

Use flag_checking instead of CHECKING_P
and run toplev::finalize only if no error has been seen.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

PR ipa/104648

gcc/ChangeLog:

* main.cc (main): Use flag_checking instead of CHECKING_P
and run toplev::finalize only if no error has been seen.

gcc/testsuite/ChangeLog:

* g++.dg/pr104648.C: New test.
---
 gcc/main.cc | 6 +++---
 gcc/testsuite/g++.dg/pr104648.C | 9 +
 2 files changed, 12 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr104648.C

diff --git a/gcc/main.cc b/gcc/main.cc
index f9dd6b2af58..4ba28b7de53 100644
--- a/gcc/main.cc
+++ b/gcc/main.cc
@@ -37,9 +37,9 @@ main (int argc, char **argv)
 true /* init_signals */);
 
   int r = toplev.main (argc, argv);
-#if CHECKING_P
-  toplev.finalize ();
-#endif
+
+  if (flag_checking && !seen_error ())
+toplev.finalize ();
 
   return r;
 }
diff --git a/gcc/testsuite/g++.dg/pr104648.C b/gcc/testsuite/g++.dg/pr104648.C
new file mode 100644
index 000..b8b7c2864cf
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr104648.C
@@ -0,0 +1,9 @@
+// { dg-do compile }
+// { dg-options "-fvtable-verify=preinit" }
+
+struct A {};
+struct B : virtual A
+{
+  B () {};
+  B () {}; /* { dg-error "cannot be overloaded with" } */
+};
--
2.35.1



[PATCH 2/2]middle-end Backport complex vect testsuite to GCC 11

2022-02-28 Thread Tamar Christina via Gcc-patches
Hi All,

The GCC 12 testsuite for complex numbers pattern recognition is a lot more
exhaustive than the GCC 11 one.  This backports the testsuite to GCC 11 so
that any further changes to the branch are checked for regressions.

Bootstrapped and regtested on aarch64-none-linux-gnu and
x86_64-pc-linux-gnu with no regressions.

Ok for GCC-11?

Thanks,
Tamar

gcc/testsuite/ChangeLog:

* gcc.dg/vect/complex/bb-slp-complex-add-pattern-int.c: Update test
cases to not be UNSUPPORTED.
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-long.c: Likewise.
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-short.c: Likewise.
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-int.c:
Likewise.
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-long.c:
Likewise.
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-short.c:
Likewise.
* gcc.dg/vect/complex/complex-add-pattern-template.c: Likewise.
* gcc.dg/vect/complex/complex-add-template.c: Likewise.
* gcc.dg/vect/complex/complex-operations-run.c: Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-double.c: Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-half-float.c:
Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-double.c:
Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-float.c:
Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-half-float.c:
Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c: Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-half-float.c:
Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c: Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-half-float.c:
Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-double.c: Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-half-float.c:
Likewise.
* gcc.dg/vect/complex/fast-math-complex-add-double.c: Likewise.
* gcc.dg/vect/complex/fast-math-complex-add-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-complex-add-half-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-complex-add-pattern-double.c: Likewise.
* gcc.dg/vect/complex/fast-math-complex-add-pattern-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-complex-add-pattern-half-float.c:
Likewise.
* gcc.dg/vect/complex/fast-math-complex-mla-double.c: Likewise.
* gcc.dg/vect/complex/fast-math-complex-mla-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-complex-mla-half-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-complex-mls-double.c: Likewise.
* gcc.dg/vect/complex/fast-math-complex-mls-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-complex-mls-half-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-complex-mul-double.c: Likewise.
* gcc.dg/vect/complex/fast-math-complex-mul-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-complex-mul-half-float.c: Likewise.
* gcc.dg/vect/complex/vect-complex-add-pattern-byte.c: Likewise.
* gcc.dg/vect/complex/vect-complex-add-pattern-int.c: Likewise.
* gcc.dg/vect/complex/vect-complex-add-pattern-long.c: Likewise.
* gcc.dg/vect/complex/vect-complex-add-pattern-short.c: Likewise.
* gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-byte.c:
Likewise.
* gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-int.c:
Likewise.
* gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-long.c:
Likewise.
* gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-short.c:
Likewise.
* gcc.dg/vect/complex/complex.exp: Copyright year update.

--- inline copy of patch -- 
diff --git a/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-int.c 
b/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-int.c
index 
8eba24dc187895150ee3515d5bd2a35b46528388..cead05f1cc4e02790630a6cbfe8378c2de3778f3
 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-int.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/bb-slp-complex-add-pattern-int.c
@@ -1,12 +1,15 @@
 /* { dg-do compile } */
-/* { dg-require-effective-target vect_complex_add_int } */
 /* { dg-require-effective-target stdint_types } */
-/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options 

[PATCH 1/2]middle-end Handle difference between complex negations in SLP tree better (GCC 11 backport)

2022-02-28 Thread Tamar Christina via Gcc-patches
Hi All,

GCC 11 handled negations rather differently than GCC 12.  This difference
caused the previous backport to regress some of the conjugate cases that it
used to handle before.  The testsuite in GCC 11 wasn't as robust as that in
master so it didn't catch it.

The second patch in this series backports the testcases from master to GCC-11
to prevent this in the future.

This patch deals with the conjugate cases correctly by updating the detection
code to deal with the different order of operands.

For MUL the problem is that the presence of an ADD can cause the order of the
operands to flip, unlike in GCC 12.  So to handle this if we detect the shape
of a MUL but the data-flow check fails, we swap both operands and try again.

Since a * b == b * a this is fine and allows us to keep the df-check simple.
This doesn't cause a compile time issue either as most of the data will be in
the caches from the previous call.
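
The retry described above can be sketched outside of GCC as follows.  This is
only an illustration under stated assumptions: validate_ordered is a
hypothetical order-sensitive check standing in for the data-flow check, and the
operands are plain integers rather than SLP nodes.  It is not the code from the
patch.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical order-sensitive check: accepts only (even, odd) operand pairs.
// It stands in for the data-flow check, which is not reproduced here.
static bool validate_ordered(const std::vector<int> &ops) {
  return ops[0] % 2 == 0 && ops[1] % 2 != 0;
}

// Try the operands as given; if that fails, swap them and try exactly once
// more.  This is only sound when the matched operation is commutative.
bool match_commutative(std::vector<int> &ops) {
  assert(ops.size() == 2);
  if (validate_ordered(ops))
    return true;
  std::swap(ops[0], ops[1]);
  return validate_ordered(ops);
}
```

Keeping the check itself order-sensitive and retrying at the call site means
the check does not need to know about commutativity at all.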

Bootstrapped and regtested on aarch64-none-linux-gnu and
x86_64-pc-linux-gnu with no regressions on the updated testsuite.

Ok for GCC 11?

Thanks,
Tamar

gcc/ChangeLog:

* tree-vect-slp-patterns.c (vect_validate_multiplication): Correctly
detect conjugate cases.
(complex_mul_pattern::matches): Likewise.
(complex_fma_pattern::matches): Move accumulator last as expected.
(complex_fma_pattern::build): Likewise.
(complex_fms_pattern::matches): Handle different conjugate form.

--- inline copy of patch -- 
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index 
a3bd90ff85b4ca5423a94388d480b66051a83e08..8b08a0f33dd1cd23ad9577243524c1feaa5e8ed9
 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -873,10 +873,8 @@ compatible_complex_nodes_p (slp_compat_nodes_map_t 
*compat_cache,
 static inline bool
 vect_validate_multiplication (slp_tree_to_load_perm_map_t *perm_cache,
  slp_compat_nodes_map_t *compat_cache,
- vec _op,
- vec _op,
- bool subtract,
- enum _conj_status *_status)
+ vec _op, vec _op,
+ bool subtract, enum _conj_status *_status)
 {
   auto_vec ops;
   enum _conj_status stats = CONJ_NONE;
@@ -902,29 +900,31 @@ vect_validate_multiplication (slp_tree_to_load_perm_map_t 
*perm_cache,
 
   /* Default to style and perm 0, most operations use this one.  */
   int style = 0;
-  int perm = subtract ? 1 : 0;
+  int perm = 0;
 
-  /* Check if we have a negate operation, if so absorb the node and continue
- looking.  */
+  /* Determine which style we're looking at.  We only have different ones
+ whenever a conjugate is involved.  If so absorb the node and continue.  */
   bool neg0 = vect_match_expression_p (right_op[0], NEGATE_EXPR);
   bool neg1 = vect_match_expression_p (right_op[1], NEGATE_EXPR);
 
-  /* Determine which style we're looking at.  We only have different ones
- whenever a conjugate is involved.  */
-  if (neg0 && neg1)
-;
-  else if (neg0)
-{
-  right_op[0] = SLP_TREE_CHILDREN (right_op[0])[0];
-  stats = CONJ_FST;
-  if (subtract)
-   perm = 0;
-}
-  else if (neg1)
+   /* Determine which style we're looking at.  We only have different ones
+  whenever a conjugate is involved.  */
+  if (neg0 != neg1 && (neg0 || neg1))
 {
-  right_op[1] = SLP_TREE_CHILDREN (right_op[1])[0];
-  stats = CONJ_SND;
-  perm = 1;
+  unsigned idx = !!neg1;
+  right_op[idx] = SLP_TREE_CHILDREN (right_op[idx])[0];
+  if (linear_loads_p (perm_cache, left_op[!!!neg1]) == PERM_EVENEVEN)
+   {
+ stats = CONJ_FST;
+ style = 1;
+ if (subtract && neg0)
+   perm = 1;
+   }
+  else
+   {
+ stats = CONJ_SND;
+ perm = 1;
+   }
 }
 
   *_status = stats;
@@ -1069,7 +1069,16 @@ complex_mul_pattern::matches (complex_operation_t op,
   enum _conj_status status;
   if (!vect_validate_multiplication (perm_cache, compat_cache, left_op,
 right_op, false, ))
-return IFN_LAST;
+{
+   /* Try swapping the operands and trying again.  */
+   std::swap (left_op[0], left_op[1]);
+   right_op.truncate (0);
+   right_op.safe_splice (SLP_TREE_CHILDREN (muls[1]));
+   std::swap (right_op[0], right_op[1]);
+   if (!vect_validate_multiplication (perm_cache, compat_cache, left_op,
+  right_op, false, ))
+ return IFN_LAST;
+}
 
   if (status == CONJ_NONE)
 ifn = IFN_COMPLEX_MUL;
@@ -1089,7 +1098,7 @@ complex_mul_pattern::matches (complex_operation_t op,
   ops->quick_push (right_op[1]);
   ops->quick_push (left_op[0]);
 }
-  else if (kind == PERM_EVENEVEN && status != CONJ_SND)
+  else if (kind == PERM_EVENEVEN && status == CONJ_NONE)
 {
   ops->quick_push (left_op[0]);
   

[PATCH] Simplify PRE fix

2022-02-28 Thread Richard Biener via Gcc-patches
The following reverts a part of the PR103037 fix which is no longer necessary
after the fix for PR104700.  That makes the possible cumulative backport
smaller.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2022-02-28  Richard Biener  

* tree-ssa-pre.cc (compute_avail): Revert part of last change.
---
 gcc/tree-ssa-pre.cc | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/gcc/tree-ssa-pre.cc b/gcc/tree-ssa-pre.cc
index ca034ff7153..47d70c85c3c 100644
--- a/gcc/tree-ssa-pre.cc
+++ b/gcc/tree-ssa-pre.cc
@@ -3993,14 +3993,7 @@ compute_avail (function *fun)
  FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_DEF)
{
  pre_expr e = get_or_alloc_expr_for_name (op);
- unsigned value_id = get_expr_value_id (e);
- if (value_id_constant_p (value_id))
-   {
- get_or_alloc_expr_for_constant (VN_INFO (op)->valnum);
- continue;
-   }
-
- add_to_value (value_id, e);
+ add_to_value (get_expr_value_id (e), e);
  bitmap_insert_into_set (TMP_GEN (block), e);
  bitmap_value_insert_into_set (AVAIL_OUT (block), e);
}
-- 
2.34.1


Re: [PATCH] PR middle-end/80270: ICE in extract_bit_field_1

2022-02-28 Thread Eric Botcazou via Gcc-patches
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures.  Ok for mainline?
> 
> 
> 2022-02-27  Roger Sayle  
> 
> gcc/ChangeLog
>   PR middle-end/80270
>   * expmed.cc (extract_integral_bit_field): If OP0 is a hard
>   register, copy it to a pseudo before calling simplify_gen_subreg.

Looks good to me, but why not use copy_to_reg here?

-- 
Eric Botcazou




Re: [PATCH][V4][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-28 Thread Richard Biener via Gcc-patches
On Sat, 26 Feb 2022, Qing Zhao wrote:

> Hi, 
> 
> This is the 4th version based on the discussion so far.
> 
> The major change is:
> 
> > SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION);
> > suppress_warning (tmp_dst, OPT_Wuninitialized);
> > with a comment explaining why we do that.
> 
> 
> The patch has been bootstrapped and regress tested on both x86 and aarch64.
> Okay for trunk?

OK.

Thanks,
Richard.

> Thanks.
> 
> Qing
> 
> =
> From 276975e60827942f0dd4043ce5f52e600327d6a8 Mon Sep 17 00:00:00 2001
> From: Qing Zhao 
> Date: Thu, 24 Feb 2022 22:38:38 +
> Subject: [PATCH] Suppress uninitialized warnings for new created uses from
>  __builtin_clear_padding folding [PR104550]
> 
> __builtin_clear_padding() will clear all the padding bits of the object.
> Actually, it doesn't involve any use of a user variable.  Therefore, users do
> not expect any uninitialized warning from it.  It's reasonable to suppress
> uninitialized warnings for all new created uses from __builtin_clear_padding
> folding.
> 
>   PR middle-end/104550
> 
> gcc/ChangeLog:
> 
>   * gimple-fold.cc (clear_padding_flush): Suppress warnings for new
>   created uses.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/auto-init-pr104550-1.c: New test.
>   * gcc.dg/auto-init-pr104550-2.c: New test.
>   * gcc.dg/auto-init-pr104550-3.c: New test.
> ---
>  gcc/gimple-fold.cc  | 12 +++-
>  gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 ++
>  gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 +++
>  gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 +++
>  4 files changed, 43 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
> 
> diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
> index 16f02c2d098d..c9179abb27ed 100644
> --- a/gcc/gimple-fold.cc
> +++ b/gcc/gimple-fold.cc
> @@ -4379,7 +4379,17 @@ clear_padding_flush (clear_padding_struct *buf, bool 
> full)
> else
>   {
> src = make_ssa_name (type);
> -   g = gimple_build_assign (src, unshare_expr (dst));
> +   tree tmp_dst = unshare_expr (dst);
> +   /* The folding introduces a read from the tmp_dst, we should
> +  prevent uninitialized warning analysis from issuing warning
> +  for such fake read.  In order to suppress warning only for
> +  this expr, we should set the location of tmp_dst to
> +  UNKNOWN_LOCATION first, then suppress_warning will call
> +  set_no_warning_bit to set the no_warning flag only for
> +  tmp_dst.  */
> +   SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION);
> +   suppress_warning (tmp_dst, OPT_Wuninitialized);
> +   g = gimple_build_assign (src, tmp_dst);
> gimple_set_location (g, buf->loc);
> gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
> tree mask = native_interpret_expr (type,
> diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c 
> b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
> new file mode 100644
> index ..a08110c3a170
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
> @@ -0,0 +1,10 @@
> +/* PR 104550*/
> +/* { dg-do compile } */
> +/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */
> +struct vx_audio_level {
> + int has_monitor_level : 1;
> +};
> +
> +void vx_set_monitor_level() {
> + struct vx_audio_level info; /* { dg-bogus "info" "is used uninitialized" } 
> */
> +}
> diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c 
> b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
> new file mode 100644
> index ..2c395b32d322
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
> @@ -0,0 +1,11 @@
> +/* PR 104550 */
> +/* { dg-do compile } */
> +/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=zero" } */
> +struct vx_audio_level {
> + int has_monitor_level : 1;
> +};
> +
> +void vx_set_monitor_level() {
> + struct vx_audio_level info; 
> + __builtin_clear_padding ();  /* { dg-bogus "info" "is used 
> uninitialized" } */ 
> +}
> diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c 
> b/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
> new file mode 100644
> index ..9893e37f12d8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
> @@ -0,0 +1,11 @@
> +/* PR 104550 */
> +/* { dg-do compile } */
> +/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */
> +struct vx_audio_level {
> + int has_monitor_level : 1;
> +};
> +
> +void vx_set_monitor_level() {
> + struct vx_audio_level info;   /* { dg-bogus "info" "is used uninitialized" 
> } */
> + __builtin_clear_padding ();  /* { dg-bogus "info" "is used 
> 

[PATCH] testsuite: Check fpic support in pr103275.c

2022-02-28 Thread Marc Poulhiès via Gcc-patches
Test must check for effective support of fpic.

Tested on x86_64-pc-linux-gnu{-m32,}.

ok for master?

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr103275.c: Add missing
dg-require-effective-target for checking fpic.

---
 gcc/testsuite/gcc.target/i386/pr103275.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.target/i386/pr103275.c 
b/gcc/testsuite/gcc.target/i386/pr103275.c
index c93413f3cde..efac7b4b24a 100644
--- a/gcc/testsuite/gcc.target/i386/pr103275.c
+++ b/gcc/testsuite/gcc.target/i386/pr103275.c
@@ -1,4 +1,5 @@
 /* { dg-do compile { target ia32 } }  */
+/* { dg-require-effective-target fpic } */
 /* { dg-options "-O2 -march=tigerlake -fPIC" } */
 /* { dg-final { scan-assembler-not {(?n)kmovd.*@gotntpoff} } }  */
 
-- 
2.25.1



Re: [PATCH] libatomic: Improve 16-byte atomics on Intel AVX [PR104688]

2022-02-28 Thread Andreas Schwab
On Feb 28 2022, Jakub Jelinek via Gcc-patches wrote:

> On Mon, Feb 28, 2022 at 04:27:19PM +0800, Xi Ruoyao wrote:
>> On Mon, 2022-02-28 at 07:06 +0100, Jakub Jelinek via Gcc-patches wrote:
>> > +++ libatomic/Makefile.am   2022-02-25 17:25:16.298314196 +0100
>> > @@ -138,8 +138,9 @@ IFUNC_OPTIONS    = -march=i586
>> >  libatomic_la_LIBADD += $(addsuffix _8_1_.lo,$(SIZEOBJS))
>> >  endif
>> >  if ARCH_X86_64
>> > -IFUNC_OPTIONS   = -mcx16
>> > -libatomic_la_LIBADD += $(addsuffix _16_1_.lo,$(SIZEOBJS))
>> > +IFUNC_OPTIONS   = -mcx16 -mcx16
>> 
>> The duplication of "-mcx16" is unintentional, I guess?
>
> No, it is intentional.
> The only place IFUNC_OPTIONS is used is in:
> IFUNC_OPT   = $(word $(PAT_S),$(IFUNC_OPTIONS))
> so for *_1.*o it uses the first word in IFUNC_OPTIONS, for
> *_2.*o second word etc.
> The thing that is not currently supported is if we'd need more than one
> option for one ifunc variant (which is possible in the future, e.g. if a
> -matomic-loadstore-16 option is added); unless that option also
> implies -mcx16, we'd need a way for one option to include both.
> Maybe -Wc,-mcx16,-matomic-loadstore-16 would work though.

Or use a separate variable for each variant:

IFUNC_OPTIONS_1 = ...
IFUNC_OPTIONS_2 = ...
IFUNC_OPT   = $(IFUNC_OPTIONS_$(PAT_S))
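
To illustrate why per-variant variables sidestep the limitation, here is a
rough C++ analogue of a lookup keyed by variant number.  The names option_list
and options_for_variant are made up for this sketch, and
-matomic-loadstore-16 is the not-yet-existing option mentioned in the quoted
mail; none of this is part of libatomic.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using option_list = std::vector<std::string>;

// Look up the options for one ifunc variant by its number.  Each variant owns
// a whole list, so it may carry any number of options.  Unknown variants get
// an empty list.
option_list options_for_variant(const std::map<int, option_list> &table,
                                int variant) {
  assert(variant >= 1);
  auto it = table.find(variant);
  return it == table.end() ? option_list{} : it->second;
}
```

With a table such as {1: [-mcx16], 2: [-mcx16, -matomic-loadstore-16]},
variant 2 gets both options without disturbing variant 1.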

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [committed] arc: Fail conditional move expand patterns

2022-02-28 Thread Claudiu Zissulescu Ianculescu via Gcc-patches
Hi Robin,

I don't know how I missed your arc related patch, I'll bootstrap and test
your patch asap.

Thanks,
Claudiu


On Fri, Feb 25, 2022 at 3:29 PM Robin Dapp  wrote:

> > If the movcc comparison is not valid it triggers an assert in the
> > current implementation.  This behavior is not needed as we can FAIL
> > the movcc expand pattern.
>
> In case of a MODE_CC comparison you can also just return it as described
> here https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104154
>
> or here:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590639.html
>
> If there already is a "CC comparison" the backend does not need to
> create one and ifcvt can make use of this, creating better sequences.
>
> Regards
>  Robin
>


Re: PING**3 - [PATCH] middle-end: Support ABIs that pass FP values as wider integers.

2022-02-28 Thread Tobias Burnus

Ping**3

On 23.02.22 09:42, Tobias Burnus wrote:

PING**2 for the ME review or at least comments to that patch,
which fixes a build issue/ICE with nvptx

Patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html
(for gcc/cfgexpand.cc + gcc/expr.cc)

(There is some discussion by Tom and Roger about the BE in the patch
thread, which does not relate to the ME patch. But there is no
ME-patch comment so far.)

The related BE patch has been already committed, but to be effective, it
needs the ME patch.


Thanks,

Tobias

On 17.02.22 15:35, Tobias Burnus wrote:

PING for this cfgexpand.cc + expr.cc change by Roger.

This is a pre-requisite for Roger's nvptx patch to avoid an ICE
during bootstrap:

* https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590250.html
  "[PATCH] nvptx: Back-end portion of a fix for PR target/104489."
  (see patch for additional reasoning for this patch)
* See also https://gcc.gnu.org/PR104489
   nvptx, sm_53: internal compiler error: in gen_rtx_SUBREG, at
emit-rtl.cc:1022

Thanks,

Tobias

On 09.02.22 21:12, Roger Sayle wrote:

This patch adds middle-end support for target ABIs that pass/return
floating point values in integer registers with precision wider than
the original FP mode.  An example is the nvptx backend, where 16-bit
HFmode registers are passed/returned as (promoted to) SImode registers.
Unfortunately, this currently falls foul of the various (recent?) sanity
checks that (very sensibly) prevent creating paradoxical SUBREGs of
floating point registers.  The approach below is to explicitly perform
the conversion/promotion in two steps, via an integer mode of the same
precision as the floating point value.  So on nvptx, 16-bit HFmode is
initially converted to 16-bit HImode (using SUBREG), then zero-extended
to SImode, and likewise when going the other way, parameters are
truncated to HImode then converted to HFmode (using SUBREG).  These
changes are localized to expand_value_return and expanding DECL_RTL to
support strange ABIs, rather than inside convert_modes or gen_lowpart,
as mismatched precision integer/FP conversions should be explicit in
the RTL, and these semantics are not generally visible/implicit in user
code.
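
As a source-level analogy of the two-step conversion (not RTL, and not the
code in the patch), the sketch below first reinterprets a 16-bit value as a
16-bit integer and only then widens it to 32 bits, with the reverse path
truncating first.  The type half_bits and both function names are
hypothetical; half_bits merely carries the two bytes of a 16-bit float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 16-bit floating-point value; only its bit
// pattern matters for this sketch.
struct half_bits { unsigned char raw[2]; };

// Step 1: same-width reinterpretation (16-bit FP bits -> 16-bit integer).
// Step 2: zero-extension to the 32-bit integer used for passing the value.
std::uint32_t promote_for_abi(half_bits h) {
  std::uint16_t as_int;
  std::memcpy(&as_int, h.raw, sizeof as_int);
  std::uint32_t wide = as_int;
  assert(wide <= 0xFFFFu);  // zero-extension leaves the upper half clear
  return wide;
}

// Reverse direction: truncate to 16 bits, then reinterpret as the FP bits.
half_bits demote_from_abi(std::uint32_t wide) {
  std::uint16_t as_int = static_cast<std::uint16_t>(wide);
  half_bits h;
  std::memcpy(h.raw, &as_int, sizeof as_int);
  return h;
}
```

Neither step changes width and representation at the same time, which is the
property the paragraph above asks for.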

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures, and on nvptx-none, where it is
the middle-end portion of a pair of patches to allow the default ISA to
be advanced.  Ok for mainline?


2022-02-09  Roger Sayle  

gcc/ChangeLog
* cfgexpand.cc (expand_value_return): Allow backends to promote
a scalar floating point return value to a wider integer mode.
* expr.cc (expand_expr_real_1) [expand_decl_rtl]: Likewise, allow
backends to promote scalar FP PARM_DECLs to wider integer modes.


Thanks in advance,
Roger
--




[PATCH] docs: Document more .gcda file name generation.

2022-02-28 Thread Martin Liška

This makes documentation more precise.

Ready to be installed?
Thanks,
Martin

PR gcov-profile/104677

gcc/ChangeLog:

* doc/invoke.texi: Document more .gcda file name generation.
---
 gcc/doc/invoke.texi | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ec291c06542..07da1eb2047 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13459,6 +13459,7 @@ counts to a file called @file{@var{sourcename}.gcda} 
for each source
 file.  The information in this data file is very dependent on the
 structure of the generated code, so you must use the same source code
 and the same optimization options for both compilations.
+See details about the file naming in @option{-fprofile-arcs}.
 
 With @option{-fbranch-probabilities}, GCC puts a

 @samp{REG_BR_PROB} note on each @samp{JUMP_INSN} and @samp{CALL_INSN}.
@@ -15237,6 +15238,12 @@ explicitly specified and it is not the final 
executable, otherwise it is
 the basename of the source file.  In both cases any suffix is removed
 (e.g.@: @file{foo.gcda} for input file @file{dir/foo.c}, or
 @file{dir/foo.gcda} for output file specified as @option{-o dir/foo.o}).
+
+Note that if a command line directly links source files, the corresponding
+@var{.gcda} files will be prefixed with a name of the output file.
+E.g. @code{gcc a.c b.c -o binary} would generate @file{binary-a.gcda} and
+@file{binary-b.gcda} files.
+
 @xref{Cross-profiling}.
 
 @cindex @command{gcov}

@@ -15330,7 +15337,8 @@ profile data file appears in the same directory as the 
object file.
 In order to prevent the file name clashing, if the object file name is
 not an absolute path, we mangle the absolute path of the
 @file{@var{sourcename}.gcda} file and use it as the file name of a
-@file{.gcda} file.  See similar option @option{-fprofile-note}.
+@file{.gcda} file.  See details about the file naming in 
@option{-fprofile-arcs}.
+See similar option @option{-fprofile-note}.
 
 When an executable is run in a massive parallel environment, it is recommended

 to save profile to different folders.  That can be done with variables
--
2.35.1



[PATCH] tree-optimization/104700 - adjust constant handling in PRE

2022-02-28 Thread Richard Biener via Gcc-patches
The following refactors find_or_generate_expression to more properly
handle constant valued SSA names thereby simplifying the code and
avoiding ICEing after the last change to NARY processing.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2022-02-28  Richard Biener  

PR tree-optimization/104700
* tree-ssa-pre.cc (get_or_alloc_expr_for): Remove and inline
into ...
(find_or_generate_expression): ... here, simplifying code.

* gcc.dg/pr104700-2.c: New testcase.
* gcc.dg/torture/pr104700-1.c: Likewise.
---
 gcc/testsuite/gcc.dg/pr104700-2.c | 21 +
 gcc/testsuite/gcc.dg/torture/pr104700-1.c | 38 +++
 gcc/tree-ssa-pre.cc   | 26 +++-
 3 files changed, 70 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr104700-2.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr104700-1.c

diff --git a/gcc/testsuite/gcc.dg/pr104700-2.c 
b/gcc/testsuite/gcc.dg/pr104700-2.c
new file mode 100644
index 000..e0759cca70f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr104700-2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-ccp -fno-tree-dce -fno-tree-vrp" } */
+
+int a, b;
+int main() {
+  int c = 2, d, e = 0;
+  if (a)
+e = 2;
+  int f, g = -(1L | (e && f && f & e));
+  if (g)
+  L:
+g = c;
+  c = 0;
+  d = e * g;
+  if (d)
+goto L;
+  while (e) {
+int i = (a && b) * i;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr104700-1.c 
b/gcc/testsuite/gcc.dg/torture/pr104700-1.c
new file mode 100644
index 000..7b864d6628b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr104700-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-ftree-pre" } */
+
+int printf(const char *, ...);
+int a, b, c = 2, d, e, *f, g;
+void o() {
+  unsigned h = 1;
+  int j = -1, k, l = 1, m = 2, i;
+  while (c < 2)
+;
+L1:
+  k = h;
+  h = -1;
+  if (k < 2 && !c) {
+printf("%d", k);
+goto L1;
+  }
+  if (!j)
+l = printf("0");
+  if (g)
+k = 0;
+  if (a && k)
+goto L2;
+  while (f) {
+m = a;
+d = i;
+i = e;
+f = 
+  L2:
+if (d == l && !m)
+  l = b;
+  }
+  unsigned *n[1] = {};
+}
+int main() {
+  o();
+  return 0;
+}
diff --git a/gcc/tree-ssa-pre.cc b/gcc/tree-ssa-pre.cc
index d6c83a72dd8..ca034ff7153 100644
--- a/gcc/tree-ssa-pre.cc
+++ b/gcc/tree-ssa-pre.cc
@@ -1197,18 +1197,6 @@ get_or_alloc_expr_for_constant (tree constant)
   return newexpr;
 }
 
-/* Get or allocate a pre_expr for a piece of GIMPLE, and return it.
-   Currently only supports constants and SSA_NAMES.  */
-static pre_expr
-get_or_alloc_expr_for (tree t)
-{
-  if (TREE_CODE (t) == SSA_NAME)
-return get_or_alloc_expr_for_name (t);
-  else if (is_gimple_min_invariant (t))
-return get_or_alloc_expr_for_constant (t);
-  gcc_unreachable ();
-}
-
 /* Return the folded version of T if T, when folded, is a gimple
min_invariant or an SSA name.  Otherwise, return T.  */
 
@@ -2779,8 +2767,16 @@ create_component_ref_by_pieces (basic_block block, 
vn_reference_t ref,
 static tree
 find_or_generate_expression (basic_block block, tree op, gimple_seq *stmts)
 {
-  pre_expr expr = get_or_alloc_expr_for (op);
-  unsigned int lookfor = get_expr_value_id (expr);
+  /* Constants are always leaders.  */
+  if (is_gimple_min_invariant (op))
+return op;
+
+  gcc_assert (TREE_CODE (op) == SSA_NAME);
+  vn_ssa_aux_t info = VN_INFO (op);
+  unsigned int lookfor = info->value_id;
+  if (value_id_constant_p (lookfor))
+return info->valnum;
+
   pre_expr leader = bitmap_find_leader (AVAIL_OUT (block), lookfor);
   if (leader)
 {
@@ -2808,7 +2804,7 @@ find_or_generate_expression (basic_block block, tree op, 
gimple_seq *stmts)
 its operand values.  */
   if (temp->kind == NARY)
return create_expression_by_pieces (block, temp, stmts,
-   get_expr_type (expr));
+   TREE_TYPE (op));
 }
 
   /* Defer.  */
-- 
2.34.1


Re: [PATCH] libatomic: Improve 16-byte atomics on Intel AVX [PR104688]

2022-02-28 Thread Jakub Jelinek via Gcc-patches
On Mon, Feb 28, 2022 at 04:27:19PM +0800, Xi Ruoyao wrote:
> On Mon, 2022-02-28 at 07:06 +0100, Jakub Jelinek via Gcc-patches wrote:
> > +++ libatomic/Makefile.am   2022-02-25 17:25:16.298314196 +0100
> > @@ -138,8 +138,9 @@ IFUNC_OPTIONS    = -march=i586
> >  libatomic_la_LIBADD += $(addsuffix _8_1_.lo,$(SIZEOBJS))
> >  endif
> >  if ARCH_X86_64
> > -IFUNC_OPTIONS   = -mcx16
> > -libatomic_la_LIBADD += $(addsuffix _16_1_.lo,$(SIZEOBJS))
> > +IFUNC_OPTIONS   = -mcx16 -mcx16
> 
> The duplication of "-mcx16" is unintentional, I guess?

No, it is intentional.
The only place IFUNC_OPTIONS is used is in:
IFUNC_OPT   = $(word $(PAT_S),$(IFUNC_OPTIONS))
so for *_1.*o it uses the first word in IFUNC_OPTIONS, for
*_2.*o second word etc.
The thing that is not currently supported is if we'd need more than one
option for one ifunc variant (which is possible in the future, e.g. if a
-matomic-loadstore-16 option is added); unless that option also
implies -mcx16, we'd need a way for one option to include both.
Maybe -Wc,-mcx16,-matomic-loadstore-16 would work though.
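
For readers less familiar with $(word ...), here is a small C++ model of the
selection, assuming (as described above) that the variant number picks the
word at that position.  The helper name nth_word is made up for this sketch;
it only mimics what GNU make's $(word n,text) returns.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Return the n-th (1-based) whitespace-separated word of list, or an empty
// string if the list has fewer than n words.
std::string nth_word(int n, const std::string &list) {
  assert(n >= 1);
  std::istringstream in(list);
  std::string word;
  for (int i = 1; in >> word; ++i)
    if (i == n)
      return word;
  return "";
}
```

With the list "-mcx16 -mcx16", both variant 1 and variant 2 get -mcx16, while a
single "-mcx16" would leave variant 2 with no option at all.  It also shows the
limitation: a variant that needed two space-separated options would shift
every later word by one position.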

Jakub



[committed][libgomp, testsuite, nvptx] Add -mptx=_ in declare-variant-3-sm*.c

2022-02-28 Thread Tom de Vries via Gcc-patches
Hi,

When running with target board unix/-foffload=-mptx=3.1, we run into:
...
lto1: error: PTX version (-mptx) needs to be at least 4.2 to support \
  selected -misa (sm_53)^M
mkoffload: fatal error: x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned \
  1 exit status^M
compilation terminated.^M
  ...
FAIL: libgomp.c/declare-variant-3-sm53.c (test for excess errors)
...

Fix this by adding -foffload=-mptx=_ in the libgomp.c/declare-variant-3-sm*.c
test-cases.

Tested on x86_64 with nvptx accelerator.

Committed to trunk.

Thanks,
- Tom

[libgomp, testsuite, nvptx] Add -mptx=_ in declare-variant-3-sm*.c

libgomp/ChangeLog:

2022-02-28  Tom de Vries  

* testsuite/libgomp.c/declare-variant-3-sm30.c: Add -foffload=-mptx=_.
* testsuite/libgomp.c/declare-variant-3-sm35.c: Same.
* testsuite/libgomp.c/declare-variant-3-sm53.c: Same.
* testsuite/libgomp.c/declare-variant-3-sm70.c: Same.
* testsuite/libgomp.c/declare-variant-3-sm75.c: Same.
* testsuite/libgomp.c/declare-variant-3-sm80.c: Same.

---
 libgomp/testsuite/libgomp.c/declare-variant-3-sm30.c | 2 +-
 libgomp/testsuite/libgomp.c/declare-variant-3-sm35.c | 2 +-
 libgomp/testsuite/libgomp.c/declare-variant-3-sm53.c | 2 +-
 libgomp/testsuite/libgomp.c/declare-variant-3-sm70.c | 2 +-
 libgomp/testsuite/libgomp.c/declare-variant-3-sm75.c | 2 +-
 libgomp/testsuite/libgomp.c/declare-variant-3-sm80.c | 2 +-
 6 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/libgomp/testsuite/libgomp.c/declare-variant-3-sm30.c 
b/libgomp/testsuite/libgomp.c/declare-variant-3-sm30.c
index ad1602c13cd..a49bc12064a 100644
--- a/libgomp/testsuite/libgomp.c/declare-variant-3-sm30.c
+++ b/libgomp/testsuite/libgomp.c/declare-variant-3-sm30.c
@@ -1,5 +1,5 @@
 /* { dg-do run { target { offload_target_nvptx } } } */
-/* { dg-additional-options "-foffload=-misa=sm_30" } */
+/* { dg-additional-options "-foffload=-misa=sm_30 -foffload=-mptx=_" } */
 /* { dg-additional-options "-foffload=-fdump-tree-optimized" } */
 
 #include "declare-variant-3.h"
diff --git a/libgomp/testsuite/libgomp.c/declare-variant-3-sm35.c 
b/libgomp/testsuite/libgomp.c/declare-variant-3-sm35.c
index 1a7cda2456b..9f71acb8738 100644
--- a/libgomp/testsuite/libgomp.c/declare-variant-3-sm35.c
+++ b/libgomp/testsuite/libgomp.c/declare-variant-3-sm35.c
@@ -1,5 +1,5 @@
 /* { dg-do link { target { offload_target_nvptx } } } */
-/* { dg-additional-options "-foffload=-misa=sm_35" } */
+/* { dg-additional-options "-foffload=-misa=sm_35 -foffload=-mptx=_" } */
 /* { dg-additional-options "-foffload=-fdump-tree-optimized" } */
 
 #include "declare-variant-3.h"
diff --git a/libgomp/testsuite/libgomp.c/declare-variant-3-sm53.c 
b/libgomp/testsuite/libgomp.c/declare-variant-3-sm53.c
index a37b5fdaa28..fa713920ce0 100644
--- a/libgomp/testsuite/libgomp.c/declare-variant-3-sm53.c
+++ b/libgomp/testsuite/libgomp.c/declare-variant-3-sm53.c
@@ -1,5 +1,5 @@
 /* { dg-do link { target { offload_target_nvptx } } } */
-/* { dg-additional-options "-foffload=-misa=sm_53" } */
+/* { dg-additional-options "-foffload=-misa=sm_53 -foffload=-mptx=_" } */
 /* { dg-additional-options "-foffload=-fdump-tree-optimized" } */
 
 #include "declare-variant-3.h"
diff --git a/libgomp/testsuite/libgomp.c/declare-variant-3-sm70.c 
b/libgomp/testsuite/libgomp.c/declare-variant-3-sm70.c
index ab022cd79f9..90f0116c582 100644
--- a/libgomp/testsuite/libgomp.c/declare-variant-3-sm70.c
+++ b/libgomp/testsuite/libgomp.c/declare-variant-3-sm70.c
@@ -1,5 +1,5 @@
 /* { dg-do link { target { offload_target_nvptx } } } */
-/* { dg-additional-options "-foffload=-misa=sm_70" } */
+/* { dg-additional-options "-foffload=-misa=sm_70 -foffload=-mptx=_" } */
 /* { dg-additional-options "-foffload=-fdump-tree-optimized" } */
 
 #include "declare-variant-3.h"
diff --git a/libgomp/testsuite/libgomp.c/declare-variant-3-sm75.c 
b/libgomp/testsuite/libgomp.c/declare-variant-3-sm75.c
index 7d09195d9c4..86f2e72866a 100644
--- a/libgomp/testsuite/libgomp.c/declare-variant-3-sm75.c
+++ b/libgomp/testsuite/libgomp.c/declare-variant-3-sm75.c
@@ -1,5 +1,5 @@
 /* { dg-do link { target { offload_target_nvptx } } } */
-/* { dg-additional-options "-foffload=-misa=sm_75" } */
+/* { dg-additional-options "-foffload=-misa=sm_75 -foffload=-mptx=_" } */
 /* { dg-additional-options "-foffload=-fdump-tree-optimized" } */
 
 #include "declare-variant-3.h"
diff --git a/libgomp/testsuite/libgomp.c/declare-variant-3-sm80.c b/libgomp/testsuite/libgomp.c/declare-variant-3-sm80.c
index 898ae6e4da8..de208d9bdd1 100644
--- a/libgomp/testsuite/libgomp.c/declare-variant-3-sm80.c
+++ b/libgomp/testsuite/libgomp.c/declare-variant-3-sm80.c
@@ -1,5 +1,5 @@
 /* { dg-do link { target { offload_target_nvptx } } } */
-/* { dg-additional-options "-foffload=-misa=sm_80" } */
+/* { dg-additional-options "-foffload=-misa=sm_80 -foffload=-mptx=_" } */
 /* { dg-additional-options "-foffload=-fdump-tree-optimized" } */
 
 #include "declare-variant-3.h"


[committed][nvptx, testsuite] Add -mptx=_ in nvptx.exp test-cases

2022-02-28 Thread Tom de Vries via Gcc-patches
Hi,

When running with target board nvptx-none-run/-mptx=3.1, I run into:
...
cc1: error: PTX version (-mptx) needs to be at least 4.2 to support selected \
  -misa (sm_53)^M
compiler exited with status 1
FAIL: gcc.target/nvptx/atomic-store-1.c (test for excess errors)
...

Fix this and similar cases by adding an explicit -mptx=_ setting.
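
As an illustration only (this is not GCC's implementation), the check behind
the error above can be sketched as a lookup of the minimum PTX version per
-misa value.  Only the sm_53 / 4.2 pair is taken from the error message; the
struct, the function name and the second table row are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* PTX versions are encoded as major * 10 + minor, e.g. 4.2 -> 42.  */
struct isa_min_ptx
{
  const char *isa;
  int min_ptx;
};

static const struct isa_min_ptx isa_table[] = {
  { "sm_53", 42 }, /* From the error message quoted above.  */
  { "sm_30", 30 }, /* Assumed value, for illustration only.  */
};

/* Return 1 if PTX is new enough for ISA, 0 if it is too old, and -1 if
   ISA is not in this sketch's table.  */
static int
ptx_supports_isa (int ptx, const char *isa)
{
  assert (isa != NULL);
  for (size_t i = 0; i < sizeof isa_table / sizeof isa_table[0]; i++)
    if (strcmp (isa_table[i].isa, isa) == 0)
      return ptx >= isa_table[i].min_ptx;
  return -1;
}
```

With a board-supplied -mptx=3.1, ptx_supports_isa (31, "sm_53") is 0, which
corresponds to the failure mode reported above.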

Tested on nvptx.

Committed to trunk.

Thanks,
- Tom

[nvptx, testsuite] Add -mptx=_ in nvptx.exp test-cases

gcc/testsuite/ChangeLog:

2022-02-28  Tom de Vries  

* gcc.target/nvptx/atomic-store-1.c: Add -mptx=_.
* gcc.target/nvptx/atomic-store-2.c: Same.
* gcc.target/nvptx/float16-1.c: Same.
* gcc.target/nvptx/float16-2.c: Same.
* gcc.target/nvptx/float16-3.c: Same.
* gcc.target/nvptx/float16-4.c: Same.
* gcc.target/nvptx/float16-5.c: Same.
* gcc.target/nvptx/float16-6.c: Same.
* gcc.target/nvptx/tanh-1.c: Same.
* gcc.target/nvptx/uniform-simt-1.c: Same.
* gcc.target/nvptx/uniform-simt-3.c: Same.

---
 gcc/testsuite/gcc.target/nvptx/atomic-store-1.c | 2 +-
 gcc/testsuite/gcc.target/nvptx/atomic-store-2.c | 2 +-
 gcc/testsuite/gcc.target/nvptx/float16-1.c  | 2 +-
 gcc/testsuite/gcc.target/nvptx/float16-2.c  | 2 +-
 gcc/testsuite/gcc.target/nvptx/float16-3.c  | 2 +-
 gcc/testsuite/gcc.target/nvptx/float16-4.c  | 2 +-
 gcc/testsuite/gcc.target/nvptx/float16-5.c  | 2 +-
 gcc/testsuite/gcc.target/nvptx/float16-6.c  | 2 +-
 gcc/testsuite/gcc.target/nvptx/tanh-1.c | 2 +-
 gcc/testsuite/gcc.target/nvptx/uniform-simt-1.c | 2 +-
 gcc/testsuite/gcc.target/nvptx/uniform-simt-3.c | 2 +-
 11 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/gcc/testsuite/gcc.target/nvptx/atomic-store-1.c b/gcc/testsuite/gcc.target/nvptx/atomic-store-1.c
index d611f2d410f..eecd00854f7 100644
--- a/gcc/testsuite/gcc.target/nvptx/atomic-store-1.c
+++ b/gcc/testsuite/gcc.target/nvptx/atomic-store-1.c
@@ -2,7 +2,7 @@
shared state space.  */
 
 /* { dg-do compile } */
-/* { dg-options "-misa=sm_53" } */
+/* { dg-options "-misa=sm_53 -mptx=_" } */
 
 enum memmodel
 {
diff --git a/gcc/testsuite/gcc.target/nvptx/atomic-store-2.c b/gcc/testsuite/gcc.target/nvptx/atomic-store-2.c
index b58f33f2abd..127d2c4cbe2 100644
--- a/gcc/testsuite/gcc.target/nvptx/atomic-store-2.c
+++ b/gcc/testsuite/gcc.target/nvptx/atomic-store-2.c
@@ -2,7 +2,7 @@
shared state space.  */
 
 /* { dg-do compile } */
-/* { dg-options "-misa=sm_70" } */
+/* { dg-options "-misa=sm_70 -mptx=_" } */
 
 enum memmodel
 {
diff --git a/gcc/testsuite/gcc.target/nvptx/float16-1.c b/gcc/testsuite/gcc.target/nvptx/float16-1.c
index 9c3f8fe8f9d..873a0543535 100644
--- a/gcc/testsuite/gcc.target/nvptx/float16-1.c
+++ b/gcc/testsuite/gcc.target/nvptx/float16-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -misa=sm_53 -ffast-math" } */
+/* { dg-options "-O2 -ffast-math -misa=sm_53 -mptx=_" } */
 
 _Float16 var;
 
diff --git a/gcc/testsuite/gcc.target/nvptx/float16-2.c b/gcc/testsuite/gcc.target/nvptx/float16-2.c
index 2d1dc1aafb5..30a3092bc29 100644
--- a/gcc/testsuite/gcc.target/nvptx/float16-2.c
+++ b/gcc/testsuite/gcc.target/nvptx/float16-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -ffast-math -misa=sm_80" } */
+/* { dg-options "-O2 -ffast-math -misa=sm_80 -mptx=_" } */
 
 _Float16 x;
 _Float16 y;
diff --git a/gcc/testsuite/gcc.target/nvptx/float16-3.c b/gcc/testsuite/gcc.target/nvptx/float16-3.c
index 3abcec39a8a..edd6514a976 100644
--- a/gcc/testsuite/gcc.target/nvptx/float16-3.c
+++ b/gcc/testsuite/gcc.target/nvptx/float16-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -misa=sm_53" } */
+/* { dg-options "-O2 -misa=sm_53 -mptx=_" } */
 
 _Float16 var;
 
diff --git a/gcc/testsuite/gcc.target/nvptx/float16-4.c b/gcc/testsuite/gcc.target/nvptx/float16-4.c
index 173f9600ac7..0a823971e75 100644
--- a/gcc/testsuite/gcc.target/nvptx/float16-4.c
+++ b/gcc/testsuite/gcc.target/nvptx/float16-4.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -misa=sm_53 -ffast-math" } */
+/* { dg-options "-O2 -ffast-math -misa=sm_53 -mptx=_" } */
 
 _Float16 var;
 
diff --git a/gcc/testsuite/gcc.target/nvptx/float16-5.c b/gcc/testsuite/gcc.target/nvptx/float16-5.c
index 700b3159a97..2261f42baac 100644
--- a/gcc/testsuite/gcc.target/nvptx/float16-5.c
+++ b/gcc/testsuite/gcc.target/nvptx/float16-5.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -misa=sm_53 -ffast-math" } */
+/* { dg-options "-O2 -ffast-math -misa=sm_53 -mptx=_" } */
 
 _Float16 a;
 _Float16 b;
diff --git a/gcc/testsuite/gcc.target/nvptx/float16-6.c b/gcc/testsuite/gcc.target/nvptx/float16-6.c
index 4889577f7f6..9ca714ca76f 100644
--- a/gcc/testsuite/gcc.target/nvptx/float16-6.c
+++ b/gcc/testsuite/gcc.target/nvptx/float16-6.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -misa=sm_53" } */
+/* { dg-options "-O2 -misa=sm_53 -mptx=_" } */
 
 _Float16 x;
 

[committed][nvptx] Add -mptx=_

2022-02-28 Thread Tom de Vries via Gcc-patches
Hi,

Add an -mptx=_ value that indicates the default PTX version.

It can be used to undo an explicit -mptx setting, so this:
...
$ gcc test.c -mptx=3.1 -mptx=_
...
has the same effect as:
...
$ gcc test.c
...
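
A minimal sketch of the "last option wins, and _ means default" behaviour
described above.  It is an illustration under assumptions, not the nvptx.cc
code: the enum, both function names and the hand-rolled option scan are
hypothetical stand-ins for GCC's option machinery.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for enum ptx_version; only a few values.  */
enum ptx_version_sketch
{
  PTX_SKETCH_DEFAULT, /* Spelled "_" on the command line.  */
  PTX_SKETCH_3_1,
  PTX_SKETCH_4_2,
  PTX_SKETCH_6_0
};

/* Parse the VALUE part of -mptx=VALUE.  Return 1 and store the result in
   *OUT on success; return 0 and leave *OUT untouched otherwise.  */
static int
parse_ptx_value (const char *s, enum ptx_version_sketch *out)
{
  if (strcmp (s, "_") == 0)
    *out = PTX_SKETCH_DEFAULT;
  else if (strcmp (s, "3.1") == 0)
    *out = PTX_SKETCH_3_1;
  else if (strcmp (s, "4.2") == 0)
    *out = PTX_SKETCH_4_2;
  else if (strcmp (s, "6.0") == 0)
    *out = PTX_SKETCH_6_0;
  else
    return 0;
  return 1;
}

/* The last -mptx= on the command line wins.  If none was given, or the
   winner is "_", fall back to TARGET_DEFAULT.  */
static enum ptx_version_sketch
resolve_ptx_version (int argc, const char **argv,
                     enum ptx_version_sketch target_default)
{
  enum ptx_version_sketch value = PTX_SKETCH_DEFAULT;
  int explicitly_set = 0;

  assert (argc == 0 || argv != NULL);
  for (int i = 0; i < argc; i++)
    if (strncmp (argv[i], "-mptx=", 6) == 0
        && parse_ptx_value (argv[i] + 6, &value))
      explicitly_set = 1;

  if (!explicitly_set || value == PTX_SKETCH_DEFAULT)
    return target_default;
  return value;
}
```

In this sketch, the argument list { "-mptx=3.1", "-mptx=_" } resolves to the
same value as an empty argument list, matching the two command lines above.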

Tested on nvptx.

Committed to trunk.

Thanks,
- Tom

[nvptx] Add -mptx=_

gcc/ChangeLog:

2022-02-28  Tom de Vries  

* config/nvptx/nvptx-opts.h (enum ptx_version): Add
PTX_VERSION_default.
* config/nvptx/nvptx.cc (handle_ptx_version_option): Handle
PTX_VERSION_default.
* config/nvptx/nvptx.opt: Add EnumValue "_" / PTX_VERSION_default.

---
 gcc/config/nvptx/nvptx-opts.h | 1 +
 gcc/config/nvptx/nvptx.cc | 3 ++-
 gcc/config/nvptx/nvptx.opt| 3 +++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/config/nvptx/nvptx-opts.h b/gcc/config/nvptx/nvptx-opts.h
index e918d43ea16..30852b6992c 100644
--- a/gcc/config/nvptx/nvptx-opts.h
+++ b/gcc/config/nvptx/nvptx-opts.h
@@ -32,6 +32,7 @@ enum ptx_isa
 
 enum ptx_version
 {
+  PTX_VERSION_default,
   PTX_VERSION_3_0,
   PTX_VERSION_3_1,
   PTX_VERSION_4_2,
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index b9451c2ed09..7862a90a65a 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -296,7 +296,8 @@ sm_version_to_string (enum ptx_isa sm)
 static void
 handle_ptx_version_option (void)
 {
-  if (!OPTION_SET_P (ptx_version_option))
+  if (!OPTION_SET_P (ptx_version_option)
+  || ptx_version_option == PTX_VERSION_default)
 {
   ptx_version_option = default_ptx_version_option ();
   return;
diff --git a/gcc/config/nvptx/nvptx.opt b/gcc/config/nvptx/nvptx.opt
index 9776c3b9a1f..f555ad1d8bf 100644
--- a/gcc/config/nvptx/nvptx.opt
+++ b/gcc/config/nvptx/nvptx.opt
@@ -94,6 +94,9 @@ Enum(ptx_version) String(6.3) Value(PTX_VERSION_6_3)
 EnumValue
 Enum(ptx_version) String(7.0) Value(PTX_VERSION_7_0)
 
+EnumValue
+Enum(ptx_version) String(_) Value(PTX_VERSION_default)
+
 mptx=
 Target RejectNegative ToLower Joined Enum(ptx_version) Var(ptx_version_option)
 Specify the version of the ptx version to use.


Re: [PATCH] Check if loading const from mem is faster

2022-02-28 Thread Jiufu Guo via Gcc-patches
Richard Biener  writes:

> On Fri, 25 Feb 2022, Jiufu Guo wrote:
>
>> Richard Biener  writes:
>> 
>> > On Fri, 25 Feb 2022, Jiufu Guo wrote:
>> >
>> >> Richard Biener  writes:
>> >> 
>> >> > On Thu, 24 Feb 2022, Jiufu Guo wrote:
>> >> >
>> >> >> Jiufu Guo via Gcc-patches  writes:
>> >> >> 
>> >> >> > Segher Boessenkool  writes:
>> >> >> >
>> >> >> >> On Wed, Feb 23, 2022 at 02:02:59PM +0100, Richard Biener wrote:
>> >> >> >>> I'm assuming we're always dealing with
>> >> >> >>> 
>> >> >> >>>   (set (reg:MODE ..) )
>> >> >> >>> 
>> >> >> >>> here and CSE is not substituting into random places of an
>> >> >> >>> instruction(?).  I don't know what 'rtx_cost' should evaluate
>> >> >> >>> to for a constant, if it should implicitly evaluate the cost
>> >> >> >>> of putting the result into a register for example.
>> >> >> >>
>> >> >> >> rtx_cost is no good here (and in most places).  rtx_cost should be 0
>> >> >> >> for anything that is used as input in a machine instruction -- but you
>> >> >> >> need much more context to determine that.  insn_cost is much simpler and
>> >> >> >> much easier to use.
>> >> >> >>
>> >> >> >>> Using RTX_COST with SET and 1 at least looks no worse than using
>> >> >> >>> your proposed new target hook and comparing it with the original
>> >> >> >>> unfolded src (again with SET and 1).
>> >> >> >>
>> >> >> >> It is required to generate valid instructions no matter what, before
>> >> >> >> the pass has finished that is.  On all more modern architectures it is
>> >> >> >> futile to think you can usefully consider the cost of an RTL expression
>> >> >> >> and derive a real-world cost of the generated code from that.
>> >> >> >
>> >> >> > Thanks Segher for pointing these out!  Here is another reason that I
>> >> >> > did not use rtx_cost: a few passes contain code that checks
>> >> >> > constants and stores them in the constant pool.  I'm thinking of
>> >> >> > integrating that code in a consistent way.
>> >> >> 
>> >> >> Hi Segher, Richard!
>> >> >> 
>> >> >> I'm thinking of something like this.  For a constant:
>> >> >> 1. if the constant can be used as an immediate for the
>> >> >> instruction, then treat it as an operand;
>> >> >> 2. otherwise, if the constant cannot be stored in a
>> >> >> constant pool, then handle it through instructions;
>> >> >> 3. if it is faster to access the constant from the pool, then emit
>> >> >> the constant as data (.rodata);
>> >> >> 4. otherwise, handle the constant by instructions.
>> >> >> 
>> >> >> And to store the constant into a pool, besides force_const_mem,
>> >> >> creating a reference (TOC) may be needed on some platforms.
>> >> >> 
>> >> >> For this particular issue in CSE, there is already code that
>> >> >> tries to put the constant into a pool (by invoking force_const_mem).
>> >> >> But that code runs too late.  So, we may check the constant
>> >> >> earlier and store it into the constant pool if profitable.
>> >> >> 
>> >> >> And another thing as Segher pointed out, CSE is doing too
>> >> >> much work.  It may be ok to separate the constant handling
>> >> >> logic from CSE.
>> >> >
>> >> > Not sure - CSE just is value numbering, I don't see that it does
>> >> > more than that.  Yes, it might have developed "heuristics" over
>> >> > the years what to CSE and to what and where to substitute and
>> >> > where not.  But in the end it does just value numbering.
>> >> >
>> >> >> 
>> >> >> I updated the patch as follows (without separating CSE):
>> >> >
>> >> > How is the new target hook better in any way compared to rtx_cost
>> >> > or insn_cost?  It looks like a total hack.
>> >> >
>> >> > I suppose the actual way of materializing a constant is done
>> >> > behind GCC's back and not exposed anywhere?  But instead you
>> >> > claim the constants are valid when they actually are not?
>> >> > Isn't the problem then that the rs6000 backend lies?
>> >> 
>> >> Hi Richard,
>> >> 
>> >> Thanks for your comments and suggestions!
>> >> 
>> >> Materializing a constant should be done behind GCC.
>> >> On rs6000, in the expand pass, during emit_move, the constant is
>> >> checked and stored into the constant pool if necessary.
>> >> Some other platforms are doing a similar thing, e.g.
>> >> ix86_expand_vector_move, alpha_expand_mov,...
>> >> mips_legitimize_const_move.
>> >> 
>> >> But it does not work as we expect: force_const_mem is also
>> >> exposed in other places (not only ira/reload for stack references).
>> >> 
>> >> CSE is one such place.  For example, CSE first retrieves the constant
>> >> from the insn's REG_EQUAL note, but it also tries to put the constant
>> >> into the pool under some condition (the condition was introduced at an
>> >> early age of cse.c, and it is rarely hit on trunk).
>> >> In some respects, IMHO, this does not seem like a good job for CSE.
>> >> 
>> >> And this is how the 'invalid(or say slow)' constant occurs.
>> >> e.g.  before cse:
>> >> 7: r119:DI=[unspec[`*.LC0',%r2:DI] 47]
>> >>   REG_EQUAL 0x100803004101001
>> >> after cse: 

[committed][nvptx, testsuite] Add -misa=sm_30 in nvptx/atomic-store-3.c

2022-02-28 Thread Tom de Vries via Gcc-patches
Hi,

When running with target board nvptx-none-run/-misa=sm_70 I run into:
...
FAIL: gcc.target/nvptx/atomic-store-3.c scan-assembler-times st.global.u32 1
FAIL: gcc.target/nvptx/atomic-store-3.c scan-assembler-times st.global.u64 1
...

Fix this by adding an explicit -misa=sm_30 in the test-case.

Tested on nvptx.

Committed to trunk.

Thanks,
- Tom

[nvptx, testsuite] Add -misa=sm_30 in nvptx/atomic-store-3.c

gcc/testsuite/ChangeLog:

2022-02-28  Tom de Vries  

* gcc.target/nvptx/atomic-store-3.c: Add -misa=sm_30.

---
 gcc/testsuite/gcc.target/nvptx/atomic-store-3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/nvptx/atomic-store-3.c b/gcc/testsuite/gcc.target/nvptx/atomic-store-3.c
index cc0264f2b06..5d417b84b3e 100644
--- a/gcc/testsuite/gcc.target/nvptx/atomic-store-3.c
+++ b/gcc/testsuite/gcc.target/nvptx/atomic-store-3.c
@@ -1,7 +1,7 @@
 /* Test the atomic store expansion, global state space.  */
 
 /* { dg-do compile } */
-/* { dg-additional-options "-Wno-long-long" } */
+/* { dg-additional-options "-Wno-long-long -misa=sm_30" } */
 
 enum memmodel
 {


[committed][nvptx, testsuite] Add -misa=sm_30 in nvptx/uniform-simt-2.c

2022-02-28 Thread Tom de Vries via Gcc-patches
Hi,

When running with target board nvptx-none-run/-misa=sm_53 we run into:
...
cc1: error: PTX version (-mptx) needs to be at least 4.2 to support selected \
  -misa (sm_53)^M
compiler exited with status 1
FAIL: gcc.target/nvptx/uniform-simt-2.c (test for excess errors)
...

Fix this by adding an explicit -misa=sm_30 in the test-case.

Tested on nvptx.

Committed to trunk.

Thanks,
- Tom

[nvptx, testsuite] Add -misa=sm_30 in nvptx/uniform-simt-2.c

gcc/testsuite/ChangeLog:

2022-02-28  Tom de Vries  

* gcc.target/nvptx/uniform-simt-2.c: Add -misa=sm_30.

---
 gcc/testsuite/gcc.target/nvptx/uniform-simt-2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/nvptx/uniform-simt-2.c b/gcc/testsuite/gcc.target/nvptx/uniform-simt-2.c
index 0f1e4e780fe..b1eee0d618f 100644
--- a/gcc/testsuite/gcc.target/nvptx/uniform-simt-2.c
+++ b/gcc/testsuite/gcc.target/nvptx/uniform-simt-2.c
@@ -1,4 +1,4 @@
-/* { dg-options "-O2 -muniform-simt -mptx=3.1" } */
+/* { dg-options "-O2 -muniform-simt -mptx=3.1 -misa=sm_30" } */
 
 enum memmodel
 {


[committed][nvptx, testsuite] Add -misa=sm_35 in nvptx/rotate.c

2022-02-28 Thread Tom de Vries via Gcc-patches
Hi,

When running with target board nvptx-none-run/-misa=sm_30 we run into:
...
FAIL: gcc.target/nvptx/rotate.c scan-assembler-times shf.l.wrap.b32 1
FAIL: gcc.target/nvptx/rotate.c scan-assembler-times shf.r.wrap.b32 1
FAIL: gcc.target/nvptx/rotate.c scan-assembler-not and.b32
...

Fix this by adding an explicit -misa=sm_35 in the test-case.

Tested on nvptx.

Committed to trunk.

Thanks,
- Tom

[nvptx, testsuite] Add -misa=sm_35 in nvptx/rotate.c

gcc/testsuite/ChangeLog:

2022-02-28  Tom de Vries  

* gcc.target/nvptx/rotate.c: Add -misa=sm_35.

---
 gcc/testsuite/gcc.target/nvptx/rotate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/nvptx/rotate.c b/gcc/testsuite/gcc.target/nvptx/rotate.c
index 1c9b83b4809..a6045166b57 100644
--- a/gcc/testsuite/gcc.target/nvptx/rotate.c
+++ b/gcc/testsuite/gcc.target/nvptx/rotate.c
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "-O2 -save-temps" } */
+/* { dg-options "-O2 -save-temps -misa=sm_35" } */
 
 #define MASK 0x1f
 


[PATCH] opts: fix -gtoggle + optimize attribute

2022-02-28 Thread Martin Liška

Note -fvar-tracking is enabled automatically with OPT_LEVELS_1_PLUS and
so we need to drop it if we are called from optimize attribute and the
option is unset.
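
The ordering issue can be sketched in isolation as follows.  This is a
simplified illustration and not the opts.cc code: the struct, its field
names and the function name are hypothetical, and only the order of the
three steps mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum dinfo_level_sketch { DINFO_SKETCH_NONE, DINFO_SKETCH_NORMAL };

/* Hypothetical, reduced view of the option state.  */
struct opts_sketch
{
  bool gtoggle;
  enum dinfo_level_sketch debug_level;
  bool var_tracking;            /* On by default at -O1 and above.  */
  bool var_tracking_set;        /* True if given explicitly.  */
  bool var_tracking_uninit;
  bool var_tracking_uninit_set; /* True if given explicitly.  */
};

static void
finish_options_sketch (struct opts_sketch *o)
{
  assert (o != NULL);

  /* 1. Process -gtoggle first (and only once), so that the final debug
     level is known before anything depends on it.  */
  if (o->gtoggle)
    {
      o->gtoggle = false;
      o->debug_level = (o->debug_level == DINFO_SKETCH_NONE
                        ? DINFO_SKETCH_NORMAL : DINFO_SKETCH_NONE);
    }

  /* 2. Without debug info, drop the implicitly enabled var-tracking.  */
  if (o->debug_level < DINFO_SKETCH_NORMAL && !o->var_tracking_set)
    o->var_tracking = false;

  /* 3. Only now derive the dependent flag.  */
  if (!o->var_tracking_uninit_set)
    o->var_tracking_uninit = o->var_tracking;
}
```

With "-O2 -g -gtoggle" modelled as gtoggle = true, debug_level = NORMAL and
an implicit var_tracking = true, the sketch ends with debug info and both
var-tracking flags off.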

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

PR middle-end/104381

gcc/ChangeLog:

* opts.cc (finish_options): If debug info is disabled
(debug_info_level) and -fvar-tracking is unset, disable it.

gcc/testsuite/ChangeLog:

* gcc.dg/pr104381.c: New test.
---
 gcc/opts.cc | 49 +++--
 gcc/testsuite/gcc.dg/pr104381.c | 20 ++
 2 files changed, 48 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr104381.c

diff --git a/gcc/opts.cc b/gcc/opts.cc
index 19c68aed065..2370bb0aafe 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -1302,6 +1302,34 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
 SET_OPTION_IF_UNSET (opts, opts_set, flag_vect_cost_model,
 VECT_COST_MODEL_CHEAP);
 
+  if (flag_gtoggle)
+{
+  /* Make sure to process -gtoggle only once.  */
+  flag_gtoggle = false;
+  if (debug_info_level == DINFO_LEVEL_NONE)
+   {
+ debug_info_level = DINFO_LEVEL_NORMAL;
+
+ if (write_symbols == NO_DEBUG)
+   write_symbols = PREFERRED_DEBUGGING_TYPE;
+   }
+  else
+   debug_info_level = DINFO_LEVEL_NONE;
+}
+
+  if (!OPTION_SET_P (debug_nonbind_markers_p))
+debug_nonbind_markers_p
+  = (optimize
+&& debug_info_level >= DINFO_LEVEL_NORMAL
+&& dwarf_debuginfo_p ()
+&& !(flag_selective_scheduling || flag_selective_scheduling2));
+
+  /* Note -fvar-tracking is enabled automatically with OPT_LEVELS_1_PLUS and
+ so we need to drop it if we are called from optimize attribute.  */
+  if (debug_info_level < DINFO_LEVEL_NORMAL
+  && !OPTION_SET_P (flag_var_tracking))
+flag_var_tracking = false;
+
   /* One could use EnabledBy, but it would lead to a circular dependency.  */
   if (!OPTION_SET_P (flag_var_tracking_uninit))
  flag_var_tracking_uninit = flag_var_tracking;
@@ -1328,27 +1356,6 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
   profile_flag = 0;
 }
 
-  if (flag_gtoggle)
-{
-  /* Make sure to process -gtoggle only once.  */
-  flag_gtoggle = false;
-  if (debug_info_level == DINFO_LEVEL_NONE)
-   {
- debug_info_level = DINFO_LEVEL_NORMAL;
-
- if (write_symbols == NO_DEBUG)
-   write_symbols = PREFERRED_DEBUGGING_TYPE;
-   }
-  else
-   debug_info_level = DINFO_LEVEL_NONE;
-}
-
-  if (!OPTION_SET_P (debug_nonbind_markers_p))
-debug_nonbind_markers_p
-  = (optimize
-&& debug_info_level >= DINFO_LEVEL_NORMAL
-&& dwarf_debuginfo_p ()
-&& !(flag_selective_scheduling || flag_selective_scheduling2));
 
   diagnose_options (opts, opts_set, loc);

 }
diff --git a/gcc/testsuite/gcc.dg/pr104381.c b/gcc/testsuite/gcc.dg/pr104381.c
new file mode 100644
index 000..a3aec919bee
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr104381.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -g -gtoggle -fdump-tree-optimized" } */
+
+int foo (int x)
+{
+  int tem = x + 1;
+  int tem2 = tem - 1;
+  return tem2;
+}
+
+int
+__attribute__((optimize("no-tree-pre")))
+bar (int x)
+{
+  int tem = x + 1;
+  int tem2 = tem - 1;
+  return tem2;
+}
+
+// { dg-final { scan-tree-dump-not "DEBUG " "optimized" } }
--
2.35.1



[PATCH] i386: Fix V8HF vector init under -mno-avx [PR 104664]

2022-02-28 Thread Hongyu Wang via Gcc-patches
Hi,

For a V8HFmode vector init from an HFmode value, do not directly emit a
V8HF move with a subreg, since that may cause reload to assign a general
register to the move source.

Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,}.

Ok for master?

gcc/ChangeLog:

PR target/104664
* config/i386/i386-expand.cc (ix86_expand_vector_init_duplicate):
  Use vec_setv8hf_0 for HF to V8HFmode move instead of subreg.

gcc/testsuite/ChangeLog:

PR target/104664
* gcc.target/i386/pr104664.c: New test.
---
 gcc/config/i386/i386-expand.cc   |  7 ++-
 gcc/testsuite/gcc.target/i386/pr104664.c | 16 
 2 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr104664.c

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index faa0191c6dd..530f83fab88 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -14899,7 +14899,12 @@ ix86_expand_vector_init_duplicate (bool mmx_ok, machine_mode mode,
  dperm.one_operand_p = true;
 
  if (mode == V8HFmode)
-   tmp1 = lowpart_subreg (V8HFmode, force_reg (HFmode, val), HFmode);
+   {
+ tmp1 = force_reg (HFmode, val);
+ tmp2 = gen_reg_rtx (mode);
+ emit_insn (gen_vec_setv8hf_0 (tmp2, CONST0_RTX (mode), tmp1));
+ tmp1 = gen_lowpart (mode, tmp2);
+   }
  else
{
  /* Extend to SImode using a paradoxical SUBREG.  */
diff --git a/gcc/testsuite/gcc.target/i386/pr104664.c b/gcc/testsuite/gcc.target/i386/pr104664.c
new file mode 100644
index 000..8a3d6c7cc85
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr104664.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-march=x86-64 -mtune=generic -Og -ffinite-math-only" } */
+
+typedef _Float128 __attribute__((__vector_size__ (16))) U;
+typedef _Float128 __attribute__((__vector_size__ (32))) V;
+typedef _Float16  __attribute__((__vector_size__ (16))) W;
+
+U u;
+V v;
+W w;
+
+void
+foo (void)
+{
+w *= (W)(u == __builtin_shufflevector (v, u, 2));
+}
-- 
2.18.1



Re: [PATCH] configure: add --disable-fix-includes

2022-02-28 Thread Martin Liška

On 2/5/22 03:26, Allan McRae wrote:

On 5/2/22 01:22, Martin Liška wrote:

On 2/4/22 14:30, Jakub Jelinek via Gcc-patches wrote:

We don't ship any include-fixed headers in Fedora/RHEL.


Removing include-fixed from an installed folder, I see:

make[2]: Entering directory '/home/marxin/Programming/postgres/src/common'
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -O3 -march=native -flto=auto -DFRONTEND -I. -I../../src/common -I../../src/include  -D_GNU_SOURCE -DVAL_CC="\"gcc\"" -DVAL_CPPFLAGS="\"-D_GNU_SOURCE\"" -DVAL_CFLAGS="\"-Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -O3 -march=native -flto=auto\"" -DVAL_CFLAGS_SL="\"-fPIC\"" -DVAL_LDFLAGS="\"-O3 -march=native -flto=auto -Wl,--as-needed -Wl,-rpath,'/usr/local/pgsql/lib64',--enable-new-dtags\"" -DVAL_LDFLAGS_EX="\"\"" 
-DVAL_LDFLAGS_SL="\"\"" -DVAL_LIBS="\"-lpgcommon -lpgport -lz -lreadline -lm \""  -c -o pg_lzcompress.o pg_lzcompress.c

In file included from pg_lzcompress.c:186:
/usr/include/limits.h:124:26: error: no include path in which to search for limits.h
  124 | # include_next <limits.h>
      |                          ^
pg_lzcompress.c:226:9: error: ‘INT_MAX’ undeclared here (not in a function)
  226 |         INT_MAX,        /* No upper limit on what we'll try to
      |         ^~~
pg_lzcompress.c:189:1: note: ‘INT_MAX’ is defined in header ‘<limits.h>’; did you forget to ‘#include <limits.h>’?
  188 | #include "common/pg_lzcompress.h"
  +++ |+#include <limits.h>

How do you solve this in Fedora/RHEL?


The Fedora gcc.spec file has this:

mv $FULLPATH/include-fixed/syslimits.h $FULLPATH/include/syslimits.h
mv $FULLPATH/include-fixed/limits.h $FULLPATH/include/limits.h


Yes, I noticed that as well.



My understanding is that these are not real fixincludes-processed headers.


You are correct. I've just prepared a patch that would exclude these 2 header
files from include-fixed. I'm planning the patch for the next stage1.

Martin



Allan




Re: [PATCH] libatomic: Improve 16-byte atomics on Intel AVX [PR104688]

2022-02-28 Thread Xi Ruoyao via Gcc-patches
On Mon, 2022-02-28 at 07:06 +0100, Jakub Jelinek via Gcc-patches wrote:
> +++ libatomic/Makefile.am   2022-02-25 17:25:16.298314196 +0100
> @@ -138,8 +138,9 @@ IFUNC_OPTIONS    = -march=i586
>  libatomic_la_LIBADD += $(addsuffix _8_1_.lo,$(SIZEOBJS))
>  endif
>  if ARCH_X86_64
> -IFUNC_OPTIONS   = -mcx16
> -libatomic_la_LIBADD += $(addsuffix _16_1_.lo,$(SIZEOBJS))
> +IFUNC_OPTIONS   = -mcx16 -mcx16

The duplication of "-mcx16" is unintentional, I guess?
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University