date:20230720

[Bug target/99889] Add powerpc ELFv1 support for -fpatchable-function-entry* with "o" sections

2023-07-20 Thread linkw at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99889

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |i at maskray dot me

--- Comment #5 from Kewen Lin  ---
*** Bug 110729 has been marked as a duplicate of this bug. ***

[Bug middle-end/110729] -fpatchable-function-entries: __patchable_function_entries has wrong sh_link

2023-07-20 Thread linkw at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110729

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |linkw at gcc dot gnu.org
         Resolution|FIXED                       |DUPLICATE

--- Comment #5 from Kewen Lin  ---


*** This bug has been marked as a duplicate of bug 99889 ***

Re: [PATCH v4 2/3] c++: Improve constexpr error for dangling local variables [PR110619]

2023-07-20 Thread Nathaniel Shead via Gcc-patches

On Thu, Jul 20, 2023 at 11:46:47AM -0400, Jason Merrill wrote:
> On 7/20/23 05:36, Nathaniel Shead wrote:
> > Currently, when typeck discovers that a return statement will refer to a
> > local variable it rewrites to return a null pointer. This causes the
> > error messages for using the return value in a constant expression to be
> > unhelpful, especially for reference return values.
> > 
> > This patch removes this "optimisation".
> 
> This isn't an optimization, it's for safety, removing a way for an attacker
> to get a handle on other data on the stack (CWE-562).
> 
> But I agree that we need to preserve some element of UB for constexpr
> evaluation to see.
> 
> Perhaps we want to move this transformation to cp_maybe_instrument_return,
> so it happens after maybe_save_constexpr_fundef?

Hm, OK. I can try giving this a go. I guess I should move the entire
maybe_warn_about_returning_address_of_local function to cp-gimplify.cc
to be able to detect this? Or is there a better way of marking that a
return expression will return a reference to a local for this
transformation? (I guess I can't use whether the warning has been
suppressed or not because the warning might not be enabled at all.)

It looks like this warning is raised also by diag_return_locals in
gimple-ssa-isolate-paths, should the transformation also be made here?

I note that the otherwise very similar -Wdangling-pointer warning
doesn't do this transformation either, should that also be something I
look into fixing here?

> > Relying on this raises a warning
> > by default and causes UB anyway, so there should be no issue in doing
> > so. We also suppress additional warnings from later passes that detect
> > this as a dangling pointer, since we've already indicated this anyway.
> > 
> > PR c++/110619
> > 
> > gcc/cp/ChangeLog:
> > 
> > * semantics.cc (finish_return_stmt): Suppress dangling pointer
> > reporting on return statement if already reported.
> > * typeck.cc (check_return_expr): Don't set return expression to
> > zero for dangling addresses.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp1y/constexpr-lifetime5.C: Test reported message is
> > correct.
> > * g++.dg/cpp1y/constexpr-lifetime6.C: Likewise.
> > * g++.dg/cpp1y/constexpr-110619.C: New test.
> > * g++.dg/warn/Wreturn-local-addr-6.C: Remove check for return
> > value optimisation.
> > 
> > Signed-off-by: Nathaniel Shead 
> > ---
> >   gcc/cp/semantics.cc  |  5 -
> >   gcc/cp/typeck.cc |  5 +++--
> >   gcc/testsuite/g++.dg/cpp1y/constexpr-110619.C| 10 ++
> >   gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime5.C |  4 ++--
> >   gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime6.C |  8 
> >   gcc/testsuite/g++.dg/warn/Wreturn-local-addr-6.C |  3 ---
> >   6 files changed, 23 insertions(+), 12 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-110619.C
> > 
> > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > index 8fb47fd179e..107407de513 100644
> > --- a/gcc/cp/semantics.cc
> > +++ b/gcc/cp/semantics.cc
> > @@ -1260,7 +1260,10 @@ finish_return_stmt (tree expr)
> > r = build_stmt (input_location, RETURN_EXPR, expr);
> > if (no_warning)
> > -suppress_warning (r, OPT_Wreturn_type);
> > +{
> > +  suppress_warning (r, OPT_Wreturn_type);
> > +  suppress_warning (r, OPT_Wdangling_pointer_);
> > +}
> > r = maybe_cleanup_point_expr_void (r);
> > r = add_stmt (r);
> > diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
> > index 859b133a18d..47233b3b717 100644
> > --- a/gcc/cp/typeck.cc
> > +++ b/gcc/cp/typeck.cc
> > @@ -11273,8 +11273,9 @@ check_return_expr (tree retval, bool *no_warning)
> > else if (!processing_template_decl
> >&& maybe_warn_about_returning_address_of_local (retval, loc)
> >&& INDIRECT_TYPE_P (valtype))
> > -   retval = build2 (COMPOUND_EXPR, TREE_TYPE (retval), retval,
> > -build_zero_cst (TREE_TYPE (retval)));
> > +   /* Suppress the Wdangling-pointer warning in the return statement
> > +  that would otherwise occur.  */
> > +   *no_warning = true;
> >   }
> > /* A naive attempt to reduce the number of -Wdangling-reference false
> > diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-110619.C 
> > b/gcc/testsuite/g++.dg/cpp1y/constexpr-110619.C
> > new file mode 100644
> > index 000..cca13302238
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-110619.C
> > @@ -0,0 +1,10 @@
> > +// { dg-do compile { target c++14 } }
> > +// { dg-options "-Wno-return-local-addr" }
> > +// PR c++/110619
> > +
> > +constexpr auto f() {
> > +int i = 0;
> > +return &i;
> > +};
> > +
> > +static_assert( f() != nullptr );
> > diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime5.C 
> > b/gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime5.C
> > index a4bc71d890a..ad3ef579f63 100644
> > ---
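
A minimal sketch of the pattern discussed in this thread, for readers who want something concrete. The function and variable names are hypothetical and not taken from the patch. The dangling case is left as a comment because using its result is undefined behaviour; the two functions that are compiled show shapes that stay valid in a constant expression.

```cpp
// Illustrative only: all names here are hypothetical.

// The problematic shape (commented out on purpose; GCC reports it with
// -Wreturn-local-addr and using the result is undefined behaviour):
//
//   constexpr const int *dangling() {
//     int i = 0;
//     return &i;   // the address outlives the object it points to
//   }

// Safe alternative 1: return the value itself.
constexpr int by_value() {
  int i = 0;
  return i;
}

// Safe alternative 2: return the address of an object with static storage
// duration, which outlives every call.
constexpr int zero = 0;

constexpr const int *static_address() {
  return &zero;
}

static_assert(by_value() == 0, "value is usable in a constant expression");
static_assert(static_address() != nullptr, "address of a static object is non-null");
```

Both static_assert lines are expected to pass with a C++14 or later compiler, since neither function lets a pointer escape the lifetime of the object it refers to.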

[Bug tree-optimization/61747] min,max pattern not always properly optimized (for sse4 targets)

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61747

--- Comment #10 from Andrew Pinski  ---
(In reply to CVS Commits from comment #8)
> * g++.target/i386/pr61747.C: New testcase.

The testcase fails now, I don't know what caused it to fail though:
FAIL: g++.target/i386/pr61747.C  -std=gnu++14  scan-assembler-times max 4

[Bug libfortran/110759] [14 Regression] IEEE Fortran change broke RISC-V linux build

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110759

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #10 from Andrew Pinski  ---
Fixed.

[Bug libfortran/110759] [14 Regression] IEEE Fortran change broke RISC-V linux build

2023-07-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110759

--- Comment #9 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:f32518726ee8e836d12d49aec8432679fcc42503

commit r14-2695-gf32518726ee8e836d12d49aec8432679fcc42503
Author: Andrew Pinski 
Date:   Fri Jul 21 02:26:09 2023 +

libfortran: Fix build for targets that don't have 10byte or 16 byte
floating point

So the problem here is EXPAND_INTER_MACRO_16 expands to nothing if 16 byte FP
does not exist but we still add a comma after it and that causes a build failure.
The same is true for EXPAND_INTER_MACRO_10 too.

Committed as obvious after a bootstrap and test on x86_64-linux-gnu and
aarch64-linux-gnu.

libgfortran/ChangeLog:

PR libfortran/110759
* ieee/ieee_arithmetic.F90
(COMP_INTERFACE): Remove the comma after EXPAND_INTER_MACRO_16
and EXPAND_INTER_MACRO_10.
(EXPAND_INTER_MACRO_16): Add comma here if 16 byte fp exist.
(EXPAND_INTER_MACRO_10): Likewise.

[PATCH] libfortran: Fix build for targets that don't have 10byte or 16 byte floating point

2023-07-20 Thread Andrew Pinski via Gcc-patches

So the problem here is EXPAND_INTER_MACRO_16 expands to nothing if 16 byte FP
does not exist but we still add a comma after it and that causes a build failure.
The same is true for EXPAND_INTER_MACRO_10 too.

Committed as obvious after a bootstrap and test on x86_64-linux-gnu and 
aarch64-linux-gnu.

libgfortran/ChangeLog:

PR libfortran/110759
* ieee/ieee_arithmetic.F90
(COMP_INTERFACE): Remove the comma after EXPAND_INTER_MACRO_16
and EXPAND_INTER_MACRO_10.
(EXPAND_INTER_MACRO_16): Add comma here if 16 byte fp exist.
(EXPAND_INTER_MACRO_10): Likewise.
---
 libgfortran/ieee/ieee_arithmetic.F90 | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libgfortran/ieee/ieee_arithmetic.F90 b/libgfortran/ieee/ieee_arithmetic.F90
index aa897abae39..debe40449f4 100644
--- a/libgfortran/ieee/ieee_arithmetic.F90
+++ b/libgfortran/ieee/ieee_arithmetic.F90
@@ -535,13 +535,13 @@ UNORDERED_MACRO(4,4)
   end interface
 
 #ifdef HAVE_GFC_REAL_16
-#  define EXPAND_INTER_MACRO_16(TYPE,OP) _gfortran_ieee_/**/TYPE/**/_/**/OP/**/_16
+#  define EXPAND_INTER_MACRO_16(TYPE,OP) _gfortran_ieee_/**/TYPE/**/_/**/OP/**/_16 ,
 #else
 #  define EXPAND_INTER_MACRO_16(TYPE,OP)
 #endif
 
 #ifdef HAVE_GFC_REAL_10
-#  define EXPAND_INTER_MACRO_10(TYPE,OP) _gfortran_ieee_/**/TYPE/**/_/**/OP/**/_10
+#  define EXPAND_INTER_MACRO_10(TYPE,OP) _gfortran_ieee_/**/TYPE/**/_/**/OP/**/_10 ,
 #else
 #  define EXPAND_INTER_MACRO_10(TYPE,OP)
 #endif
@@ -549,8 +549,8 @@ UNORDERED_MACRO(4,4)
 #define COMP_INTERFACE(TYPE,OP) \
   interface IEEE_/**/TYPE/**/_/**/OP ; \
 procedure \
-  EXPAND_INTER_MACRO_16(TYPE,OP) , \
-  EXPAND_INTER_MACRO_10(TYPE,OP) , \
+  EXPAND_INTER_MACRO_16(TYPE,OP) \
+  EXPAND_INTER_MACRO_10(TYPE,OP) \
   _gfortran_ieee_/**/TYPE/**/_/**/OP/**/_8 , \
   _gfortran_ieee_/**/TYPE/**/_/**/OP/**/_4 ; \
   end interface ; \
-- 
2.31.1
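
The failure mode described in this patch can be reproduced in miniature with the C preprocessor. The following C++ sketch is an illustration under assumptions: HAVE_KIND_16, HAVE_KIND_10 and the OPTIONAL_KIND_* macros are invented stand-ins for the libgfortran macros, and an array initializer stands in for the Fortran procedure list.

```cpp
#include <cstddef>

// Hypothetical feature switches standing in for HAVE_GFC_REAL_16 / _10.
// Here the 16-byte kind is "missing" and the 10-byte kind is "present".
#define HAVE_KIND_10 1

// Each optional macro carries its own trailing comma, so an absent kind
// contributes nothing at all rather than a stray ",".
#ifdef HAVE_KIND_16
#  define OPTIONAL_KIND_16 16,
#else
#  define OPTIONAL_KIND_16
#endif

#ifdef HAVE_KIND_10
#  define OPTIONAL_KIND_10 10,
#else
#  define OPTIONAL_KIND_10
#endif

// With the comma written at the use site instead ("OPTIONAL_KIND_16 ,"),
// this initializer would expand to "{ , 10 , 8, 4 }" and fail to compile.
constexpr int kinds[] = { OPTIONAL_KIND_16 OPTIONAL_KIND_10 8, 4 };

constexpr std::size_t kind_count = sizeof(kinds) / sizeof(kinds[0]);
```

Moving the separator into the macro body keeps the list well-formed whichever optional entries are present, which appears to be the same idea the patch applies to COMP_INTERFACE.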

Re: [PATCH] sccvn: Correct the index of bias for IFN_LEN_STORE [PR110744]

2023-07-20 Thread Kewen.Lin via Gcc-patches

on 2023/7/20 20:37, Richard Sandiford wrote:
> "Kewen.Lin"  writes:
>> Hi,
>>
>> Commit r14-2267-gb8806f6ffbe72e adjusts the arguments order
>> of LEN_STORE from {len,vector,bias} to {len,bias,vector},
>> in order to make them consistent with LEN_MASK_STORE and
>> MASK_STORE.  But it missed to update the related handlings
>> in tree-ssa-sccvn.cc, it caused the failure shown in PR
>> 110744.  This patch is to fix the related handlings with
>> the correct index.
>>
>> Bootstrapped and regress-tested on x86_64-redhat-linux,
>> powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9/P10.
>>
>> Is it ok for trunk?
>>
>> BR,
>> Kewen
>> -
>>  PR tree-optimization/110744
>>
>> gcc/ChangeLog:
>>
>>  * tree-ssa-sccvn.cc (vn_reference_lookup_3): Correct the index of bias
>>  operand for ifn IFN_LEN_STORE.
> 
> OK, thanks.
> 

Thanks Richard!  Pushed as r14-2694.

BR,
Kewen

Re: [PATCH] testsuite: Add a test case for PR110729

2023-07-20 Thread Kewen.Lin via Gcc-patches

on 2023/7/20 20:34, Richard Sandiford wrote:
> "Kewen.Lin"  writes:
>> Hi,
>>
>> As PR110729 reported, there was one issue for .section
>> __patchable_function_entries with -ffunction-sections, that
>> is we put the same symbol as link_to section symbol for all
>> functions wrongly.  The commit r13-4294 for PR99889 has
>> fixed this with the corresponding label LPFE* which sits in
>> the function_section.
>>
>> As Fangrui suggested[1], this patch is to add a bit more test
>> coverage.  I didn't find a good way to check all linked_to
>> symbols are different, so I checked for LPFE[012] here.
>>
>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624866.html
>>
>> Tested well on x86_64-redhat-linux, powerpc64-linux-gnu
>> P7/P8/P9 and powerpc64le-linux-gnu P9/P10.
>>
>> Is it ok for trunk?
>>
>> BR,
>> Kewen
>> -
>>  PR testsuite/110729
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.dg/pr110729.c: New test.
> 
> OK, thanks.

Thanks Richard!  Pushed as r14-2693.

BR,
Kewen

[Bug tree-optimization/110744] [14 regression] gcc.dg/tree-ssa/pr84512.c fails after r14-2267-gb8806f6ffbe72

2023-07-20 Thread linkw at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110744

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|other                       |tree-optimization
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #9 from Kewen Lin  ---
Should be fixed on trunk.

[Bug other/110744] [14 regression] gcc.dg/tree-ssa/pr84512.c fails after r14-2267-gb8806f6ffbe72

2023-07-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110744

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:a6654c08fde11890d621fa7831180d410054568a

commit r14-2694-ga6654c08fde11890d621fa7831180d410054568a
Author: Kewen Lin 
Date:   Fri Jul 21 00:18:19 2023 -0500

sccvn: Correct the index of bias for IFN_LEN_STORE [PR110744]

Commit r14-2267-gb8806f6ffbe72e adjusts the arguments order
of LEN_STORE from {len,vector,bias} to {len,bias,vector},
in order to make them consistent with LEN_MASK_STORE and
MASK_STORE.  But it missed to update the related handlings
in tree-ssa-sccvn.cc, it caused the failure shown in PR
110744.  This patch is to fix the related handlings with
the correct index.

PR tree-optimization/110744

gcc/ChangeLog:

* tree-ssa-sccvn.cc (vn_reference_lookup_3): Correct the index of bias
operand for ifn IFN_LEN_STORE.

[Bug middle-end/110729] -fpatchable-function-entries: __patchable_function_entries has wrong sh_link

2023-07-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110729

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:6894581ac453361e3fb4e1ffd54f9499acb87466

commit r14-2693-g6894581ac453361e3fb4e1ffd54f9499acb87466
Author: Kewen Lin 
Date:   Fri Jul 21 00:16:29 2023 -0500

testsuite: Add a test case for PR110729 [PR110729]

As PR110729 reported, there was one issue for .section
__patchable_function_entries with -ffunction-sections, that
is we put the same symbol as link_to section symbol for all
functions wrongly.  The commit r13-4294 for PR99889 has
fixed this with the corresponding label LPFE* which sits in
the function_section.

As Fangrui suggested [1], this patch is to add a bit more
test coverage.  I didn't find a good way to check all
linked_to symbols are different, so I checked for LPFE[012].

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624866.html

PR testsuite/110729

gcc/testsuite/ChangeLog:

* gcc.dg/pr110729.c: New test.

Re: [PATCH 2/2 ver 4] rs6000, fix vec_replace_unaligned built-in arguments

2023-07-20 Thread Kewen.Lin via Gcc-patches

Hi Carl,

on 2023/7/18 03:20, Carl Love wrote:
> GCC maintainers:
> 
> Version 4, changed the new RS6000_OVLD_VEC_REPLACE_UN case statement
> rs6000/rs6000-c.cc.  The existing REPLACE_ELT iterator name was changed
> to REPLACE_ELT_V along with the associated define_mode_attr.  Renamed
> VEC_RU to REPLACE_ELT for the iterator name and VEC_RU_char to
> REPLACE_ELT_char.  Fixed the double test in vec-replace-word-
> runnable_1.c to be consistent with the other tests.  Removed the "dg-do 
> link" from both tests.  Put in an explicit cast in test 
> vec-replace-word-runnable_2.c to eliminate the need for the 
> -flax-vector-conversions dg-option.
> 
> Version 3, added code to altivec_resolve_overloaded_builtin so the
> correct instruction is selected for the size of the second argument. 
> This restores the instruction counts to the original values where the
> correct instructions were originally being generated.  The naming of
> the overloaded builtin instances and builtin definitions were changed
> to reflect the type of the second argument since the type of the first
> argument is now the same for all overloaded instances.  A new builtin
> test file was added for the case where the first argument is cast to
> the unsigned long long type.  This test requires the -flax-vector-
> conversions gcc command line option.  Since the other tests do not
> require this option, I felt that the new test needed to be in a
> separate file.  Finally some formatting fixes were made in the original
> test file.  Patch has been retested on Power 10 with no regressions.
> 
> Version 2, fixed various typos.  Updated the change log body to say the
> instruction counts were updated.  The instruction counts changed as a
> result of changing the first argument of the vec_replace_unaligned
> builtin call from vector unsigned long long (vull) to vector unsigned
> char (vuc).  When the first argument was vull the builtin call
> generated the vinsd instruction for the two test cases.  The updated
> call with vuc as the first argument generates two vinsw instructions
> instead.  Patch was retested on Power 10 with no regressions.
> 
> The following patch fixes the first argument in the builtin definition
> and the corresponding test cases.  Initially, the builtin specification
> was wrong due to a cut and paste error.  The documentation was fixed in:
> 
>commit ed3fea09b18f67e757b5768b42cb6e816626f1db
>Author: Bill Schmidt 
>Date:   Fri Feb 4 13:07:17 2022 -0600
> 
>rs6000: Correct function prototypes for vec_replace_unaligned
> 
>Due to a pasto error in the documentation, vec_replace_unaligned was
>implemented with the same function prototypes as vec_replace_elt.  
>It was intended that vec_replace_unaligned always specify output
>vectors as having type vector unsigned char, to emphasize that 
>elements are potentially misaligned by this built-in function.  
>This patch corrects the misimplementation.
> 
> 
> This patch fixes the arguments in the definitions and updates the
> testcases accordingly.  Additionally, a few minor spacing issues are
> fixed.
> 
> The patch has been tested on Power 10 with no regressions.  Please let
> me know if the patch is acceptable for mainline.  Thanks.
> 
>  Carl 
> 
> 
> 
> rs6000, fix vec_replace_unaligned built-in arguments
> 
> The first argument of the vec_replace_unaligned built-in should always be
> of type unsigned char, as specified in gcc/doc/extend.texi.

Shouldn't it be "vector unsigned char" instead of "unsigned char"?

Or am I missing something?

> 
> This patch fixes the builtin definitions and updates the test cases to use
> the correct arguments.  The original test file is renamed and a second test
> file is added for a new test case.
> 
> gcc/ChangeLog:
>   * config/rs6000/rs6000-builtins.def: Rename
>   __builtin_altivec_vreplace_un_uv2di as __builtin_altivec_vreplace_un_udi
>   __builtin_altivec_vreplace_un_uv4si as __builtin_altivec_vreplace_un_usi
>   __builtin_altivec_vreplace_un_v2df as __builtin_altivec_vreplace_un_df
>   __builtin_altivec_vreplace_un_v2di as __builtin_altivec_vreplace_un_di
>   __builtin_altivec_vreplace_un_v4sf as __builtin_altivec_vreplace_un_sf
>   __builtin_altivec_vreplace_un_v4si as __builtin_altivec_vreplace_un_si.
>   Rename VREPLACE_UN_UV2DI as VREPLACE_UN_UDI, VREPLACE_UN_UV4SI as
>   VREPLACE_UN_USI, VREPLACE_UN_V2DF as VREPLACE_UN_DF,
>   VREPLACE_UN_V2DI as VREPLACE_UN_DI, VREPLACE_UN_V4SF as
>   VREPLACE_UN_SF, VREPLACE_UN_V4SI as VREPLACE_UN_SI.
>   Rename vreplace_un_v2di as vreplace_un_di, vreplace_un_v4si as
>   vreplace_un_si, vreplace_un_v2df as vreplace_un_df,
>   vreplace_un_v2di as vreplace_un_di, vreplace_un_v4sf as
>   vreplace_un_sf, vreplace_un_v4si as vreplace_un_si.
>   * config/rs6000/rs6000-c.cc (find_instance): Add case
>   RS6000_OVLD_VEC_REPLACE_UN.

Re: [PATCH v3] RISCV: Add -m(no)-omit-leaf-frame-pointer support.

2023-07-20 Thread Jeff Law via Gcc-patches





On 7/20/23 21:49, Kito Cheng wrote:

> LGTM, I think long jump is another issue and making ra become a fixed
> register will escalate to an ABI issue, so that should not be a
> blocker for this patch.

I'll take a look tomorrow, but I'm supportive of what Yanzhang is trying 
to do in principle.  I've got a few hot items to deal with tonight though.


WRT making $ra fixed.  In practice fixing a register just takes it out 
of the pool of things available to the allocator.  Furthermore $ra is 
always considered clobbered at call sites.  So while one could view it 
as an ABI change, it's not one that's actually observable in practice. 
I suspect that's one of the reasons why $ra is used by the assembler in 
this manner -- it minimizes both the ABI and performance impacts.


jeff

[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

--- Comment #10 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #9)
> One thing I noticed is that:
>   _2 = MAX_EXPR <_6, a3_7(D)>;
>   _3 = MAX_EXPR <_2, a3_7(D)>;
> 
> Is not optimized at all.
> 
> (for minmax (min max)
>  (simplify
>   (minmax:c (minmax:c@2 @0 @1) @0)
>   @2))

Submitted the patch for that as
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625135.html .
Note after that patch we get decent code for the original testcases but it is
not fully optimized at the gimple level.

[PATCH] MATCH: Add Max<Max<a,b>,a> -> Max<a,b> simplification

2023-07-20 Thread Andrew Pinski via Gcc-patches

This adds a simple match pattern to simplify
`max<max<a,b>,a>` to `max<a,b>`.  Reassociation handles
this already (r0-77700-ge969dbde29bfd396259357) but
seems like we should be able to handle this even before
reassociation.

This fixes part of PR tree-optimization/80574 but more
work is needed to fix it the rest of the way. The original
testcase there is fixed but the RTL level is what fixes
it the rest of the way.

OK? Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* match.pd (minmax<minmax<a,b>,a>->minmax<a,b>): New
transformation.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/reassoc-12.c: Disable all of
the passes that enables match-and-simplify.
* gcc.dg/tree-ssa/minmax-23.c: New test.
---
 gcc/match.pd   |  6 +-
 gcc/testsuite/gcc.dg/tree-ssa/minmax-23.c  | 22 ++
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-12.c |  3 ++-
 3 files changed, 29 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/minmax-23.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 4dfe92623f7..bfd15d6cd4a 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3503,7 +3503,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (for minmax (min max)
  (simplify
   (minmax @0 @0)
-  @0))
+  @0)
+/* max(max(x,y),x) -> max(x,y)  */
+ (simplify
+  (minmax:c (minmax:c@2 @0 @1) @0)
+  @2))
 /* For fmin() and fmax(), skip folding when both are sNaN.  */
 (for minmax (FMIN_ALL FMAX_ALL)
  (simplify
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-23.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-23.c
new file mode 100644
index 000..0b7e51bb97e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-23.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fno-tree-reassoc -fdump-tree-optimized" } */
+
+
+#define MAX(a,b) (a)>=(b) ? (a) : (b)
+
+#define MIN(a,b) (a)<=(b) ? (a) : (b)
+
+int test1(int a, int b)
+{
+  int d = MAX(a,b);
+  return MAX(a,d);
+}
+int test2(int a, int b)
+{
+  int d = MIN(a,b);
+  return MIN(a,d);
+}
+
+/* We should be optimize these two functions even without reassociation. */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR " 1 "optimized"} } */
+/* { dg-final { scan-tree-dump-times "MIN_EXPR " 1 "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-12.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-12.c
index 9a138ebcf70..2238147de19 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-12.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-12.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */ 
-/* { dg-options "-O2 -fdump-tree-reassoc1-details" } */
+/* Match-and-simplify can handle now MAX<MAX<a,b>,a>->MAX<a,b>, disable all of the passes that uses that. */
+/* { dg-options "-O1 -fdump-tree-reassoc1-details -fno-tree-ccp -fno-tree-ccp -fno-tree-forwprop -fno-tree-fre" } */
 int f(int a, int b)
 {
   /* MAX_EXPR <a, a> should cause it to be equivalent to a.  */
-- 
2.31.1
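
As a sanity check on the identity behind the new pattern, the following C++ sketch compares both sides for plain integers. It is not part of the patch; the helper names are made up, and std::max/std::min stand in for MAX_EXPR/MIN_EXPR.

```cpp
#include <algorithm>

// Left-hand side of the rewrite: an outer max (or min) that repeats one
// operand of the inner max (or min).
constexpr int nested_max(int a, int b) { return std::max(std::max(a, b), a); }
constexpr int nested_min(int a, int b) { return std::min(std::min(a, b), a); }

// Exhaustively compare both sides over a small range of integers.
constexpr bool identity_holds(int lo, int hi) {
  for (int a = lo; a <= hi; ++a)
    for (int b = lo; b <= hi; ++b)
      if (nested_max(a, b) != std::max(a, b) ||
          nested_min(a, b) != std::min(a, b))
        return false;
  return true;
}

static_assert(identity_holds(-8, 8),
              "max(max(a,b),a) == max(a,b), and likewise for min");
```

The check only covers int; it says nothing about floating-point operands, which the fmin/fmax patterns in match.pd treat separately.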

Re: [PATCH v3] RISCV: Add -m(no)-omit-leaf-frame-pointer support.

2023-07-20 Thread Kito Cheng via Gcc-patches

LGTM, I think long jump is another issue and making ra become a fixed
register will escalate to an ABI issue, so that should not be a
blocker for this patch.

On Tue, Jul 18, 2023 at 4:10 PM yanzhang.wang--- via Gcc-patches
 wrote:
>
> From: Yanzhang Wang 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_save_reg_p): Save ra for leaf
>   when enabling -mno-omit-leaf-frame-pointer
> (riscv_option_override): Override omit-frame-pointer.
> (riscv_frame_pointer_required): Save s0 for non-leaf function
> (TARGET_FRAME_POINTER_REQUIRED): Override definition
> * config/riscv/riscv.opt: Add option support.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/omit-frame-pointer-1.c: New test.
> * gcc.target/riscv/omit-frame-pointer-2.c: New test.
> * gcc.target/riscv/omit-frame-pointer-3.c: New test.
> * gcc.target/riscv/omit-frame-pointer-4.c: New test.
> * gcc.target/riscv/omit-frame-pointer-test.c: New test.
>
> Signed-off-by: Yanzhang Wang 
> ---
>  gcc/config/riscv/riscv.cc | 34 ++-
>  gcc/config/riscv/riscv.opt|  4 +++
>  .../gcc.target/riscv/omit-frame-pointer-1.c   |  7 
>  .../gcc.target/riscv/omit-frame-pointer-2.c   |  7 
>  .../gcc.target/riscv/omit-frame-pointer-3.c   |  7 
>  .../gcc.target/riscv/omit-frame-pointer-4.c   |  7 
>  .../riscv/omit-frame-pointer-test.c   | 13 +++
>  7 files changed, 78 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/omit-frame-pointer-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/omit-frame-pointer-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/omit-frame-pointer-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/omit-frame-pointer-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/omit-frame-pointer-test.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 706c18416db..caae6168c29 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -379,6 +379,10 @@ static const struct riscv_tune_info 
> riscv_tune_info_table[] = {
>  #include "riscv-cores.def"
>  };
>
> +/* Global variable to distinguish whether we should save and restore s0/fp 
> for
> +   function.  */
> +static bool riscv_save_frame_pointer;
> +
>  void riscv_frame_info::reset(void)
>  {
>total_size = 0;
> @@ -4948,7 +4952,11 @@ riscv_save_reg_p (unsigned int regno)
>if (regno == HARD_FRAME_POINTER_REGNUM && frame_pointer_needed)
>  return true;
>
> -  if (regno == RETURN_ADDR_REGNUM && crtl->calls_eh_return)
> +  /* Need not to use ra for leaf when frame pointer is turned off by option
> + whatever the omit-leaf-frame's value.  */
> +  bool keep_leaf_ra = frame_pointer_needed && crtl->is_leaf
> +&& !TARGET_OMIT_LEAF_FRAME_POINTER;
> +  if (regno == RETURN_ADDR_REGNUM && (crtl->calls_eh_return || keep_leaf_ra))
>  return true;
>
>/* If this is an interrupt handler, then must save extra registers.  */
> @@ -6577,6 +6585,21 @@ riscv_option_override (void)
>if (flag_pic)
>  riscv_cmodel = CM_PIC;
>
> +  /* We need to save the fp with ra for non-leaf functions with no fp and ra
> + for leaf functions while no-omit-frame-pointer with
> + omit-leaf-frame-pointer.  The x_flag_omit_frame_pointer has the first
> + priority to determine whether the frame pointer is needed.  If we do not
> + override it, the fp and ra will be stored for leaf functions, which is 
> not
> + our wanted.  */
> +  riscv_save_frame_pointer = false;
> +  if (TARGET_OMIT_LEAF_FRAME_POINTER_P (global_options.x_target_flags))
> +{
> +  if (!global_options.x_flag_omit_frame_pointer)
> +   riscv_save_frame_pointer = true;
> +
> +  global_options.x_flag_omit_frame_pointer = 1;
> +}
> +
>/* We get better code with explicit relocs for CM_MEDLOW, but
>   worse code for the others (for now).  Pick the best default.  */
>if ((target_flags_explicit & MASK_EXPLICIT_RELOCS) == 0)
> @@ -7857,6 +7880,12 @@ riscv_preferred_else_value (unsigned, tree, unsigned 
> int nops, tree *ops)
>return nops == 3 ? ops[2] : ops[0];
>  }
>
> +static bool
> +riscv_frame_pointer_required (void)
> +{
> +  return riscv_save_frame_pointer && !crtl->is_leaf;
> +}
> +
>  /* Initialize the GCC target structure.  */
>  #undef TARGET_ASM_ALIGNED_HI_OP
>  #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
> @@ -8161,6 +8190,9 @@ riscv_preferred_else_value (unsigned, tree, unsigned 
> int nops, tree *ops)
>  #undef TARGET_PREFERRED_ELSE_VALUE
>  #define TARGET_PREFERRED_ELSE_VALUE riscv_preferred_else_value
>
> +#undef TARGET_FRAME_POINTER_REQUIRED
> +#define TARGET_FRAME_POINTER_REQUIRED riscv_frame_pointer_required
> +
>  struct gcc_target targetm = TARGET_INITIALIZER;
>
>  #include "gt-riscv.h"
> diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
> index dd062f1c8bd..4dfd8f78ad5 100644
> ---
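
To make the flag interaction in the quoted patch easier to follow, here is a standalone C++ model of it. The struct and function names are hypothetical; only the boolean logic mirrors the two pieces visible in the diff (the block added to riscv_option_override and the new riscv_frame_pointer_required hook). It does not attempt to model the rest of GCC's frame pointer handling.

```cpp
// Hypothetical model types; field comments name the options they stand for.
struct FrameFlags {
  bool omit_frame_pointer;       // models -fomit-frame-pointer
  bool omit_leaf_frame_pointer;  // models -momit-leaf-frame-pointer
};

struct FrameState {
  bool omit_frame_pointer;  // value after the override
  bool save_frame_pointer;  // models riscv_save_frame_pointer
};

// Mirrors the block added to riscv_option_override: remember that the user
// asked for frame pointers, then force the generic flag on so that leaf
// functions are not given one.
constexpr FrameState apply_override(FrameFlags f) {
  FrameState s{f.omit_frame_pointer, false};
  if (f.omit_leaf_frame_pointer) {
    if (!f.omit_frame_pointer)
      s.save_frame_pointer = true;
    s.omit_frame_pointer = true;
  }
  return s;
}

// Mirrors riscv_frame_pointer_required: only non-leaf functions are forced
// to keep a frame pointer.
constexpr bool frame_pointer_required(FrameState s, bool is_leaf) {
  return s.save_frame_pointer && !is_leaf;
}
```

Under this model, -fno-omit-frame-pointer combined with -momit-leaf-frame-pointer yields a required frame pointer for non-leaf functions and none for leaf functions, which matches the intent stated in the patch comment.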

Re: Warning specifically for a returning noreturn

2023-07-20 Thread Julian Waters via Gcc

Hi all,

I've found the places responsible for the warnings, but before I do
anything I'd like to discuss a couple of things.

1. What would a good name for the warning switch be? How does
-Wreturning-noreturn sound?
2. I've thought about this for a while, and I feel like throwing a warning
for a noreturn method that isn't explicitly noreturn in the Control Flow
Graph is a little too harsh. The point of the attribute is to hint to gcc
that the method will never return even if it appears so, and requiring that
the body explicitly do something like call abort() or loop infinitely kind
of defeats the purpose of the attribute, in my opinion
3. If (2) is just me missing something, should I split the warning into 2
different warnings for a noreturn definition with an explicit return
statement and an implicit one in the case of a method not explicitly
throwing/looping infinitely, etc?

Thoughts?

best regards,
Julian

On Wed, Jul 5, 2023 at 9:13 PM Julian Waters 
wrote:

> Hi Jonathan,
>
> Thanks for the reply, is there a place in gcc's source code I could look
> at for this? As for the returning an explicit value from noreturn, I'm
> unfortunately not the one who wrote the code that way; I'm merely a build
> systems developer trying to get it to work with gcc :/
>
> best regards,
> Julian
>
> On Wed, 5 Jul 2023, 19:26 Jonathan Wakely,  wrote:
>
>> On Wed, 5 Jul 2023 at 12:01, Julian Waters via Gcc 
>> wrote:
>> >
>> > I see, thanks Andrew.
>> >
>> > Anyone else have opinions on this besides Liu or Andrew? The responses
>> have
>> > been surprisingly quiet thus far
>>
>> IMHO all warnings should have an option controlling them, so that you
>> can disable them via pragmas.
>>
>> But I agree that you shouldn't need to return from a noreturn
>> function, it can either throw or use __builtin_unreachable() on the
>> line where you currently return.
>>
>>
>> >
>> > best regards,
>> > Julian
>> >
>> > On Wed, 5 Jul 2023, 09:40 Andrew Pinski,  wrote:
>> >
>> > > On Tue, Jul 4, 2023 at 6:32 PM Julian Waters
>> > > wrote:
>> > > >
>> > > > Hi Andrew, thanks for the quick response,
>> > > >
>> > > > What if the method has a return value? I know it sounds
>> > > counterintuitive, but in some places HotSpot relies on the noreturn
>> > > attribute being applied to methods that do return a value in an
>> unreachable
>> > > code path. Does the unreachable builtin cover that case too?
>> > >
>> > > It is wrong to use noreturn on a function other than one which has a
>> > > return type of void as documented.
>> > >
>> > >
>> https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-noreturn-function-attribute
>> > > :
>> > > ```
>> > > It does not make sense for a noreturn function to have a return type
>> > > other than void.
>> > > ```
>> > >
>> > > Thanks,
>> > > Andrew Pinski
>> > >
>> > >
>> > > >
>> > > > best regards.
>> > > > Julian
>> > > >
>> > > > On Wed, Jul 5, 2023 at 9:07 AM Andrew Pinski 
>> wrote:
>> > > >>
>> > > >> On Tue, Jul 4, 2023 at 5:54 PM Julian Waters via Gcc <
>> gcc@gcc.gnu.org>
>> > > wrote:
>> > > >> >
>> > > >> > Hi all,
>> > > >> >
>> > > >> > Currently to disable the warning that a noreturn method does
>> > > >> > return, it's required to disable warnings entirely. This can be very
>> > > >> > inconvenient when -Werror is enabled with a noreturn method that isn't
>> > > >> > specifically calling something like std::abort() at the end, when one
>> > > >> > wants all other -Wall and -Wextra warnings to be reported, for instance
>> > > >> > in the Java HotSpot VM (which I'm currently adapting to compile with gcc
>> > > >> > on all supported platforms). Is there a possibility we can add a disable
>> > > >> > warning option specifically for this case? Something like
>> > > >> > -Wno-returning-noreturn. I'm interested in adding this myself if it's
>> > > >> > not convenient for gcc's maintainers to do so at the moment, but I'd
>> > > >> > need some guidance on where to look and what the relevant code is
>> > > >>
>> > > >> You could just add
>> > > >> __builtin_unreachable(); (or std::unreachable(); if you are C++23 or
>> > > >> unreachable() if you are using C23).
>> > > >> Or even add while(true) ;
>> > > >>
>> > > >> I am pretty sure not having an option is on purpose and not really
>> > > >> interested in adding an option here because of the above workarounds.
>> > > >>
>> > > >> Thanks,
>> > > >> Andrew Pinski
>> > > >>
>> > > >> >
>> > > >> > best regards,
>> > > >> > Julian
>> > >
>>
>
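
The workaround suggested in the thread can be sketched as follows. This is
only an illustration under stated assumptions: report_fatal, exit_hook,
jump_out and run_and_catch are hypothetical names, not HotSpot or GCC code.
The noreturn function has a void return type, as the documentation quoted
above requires, and ends with __builtin_unreachable() because the compiler
cannot see that the indirect call never comes back.

```c
#include <setjmp.h>

/* Hypothetical names throughout; none of this comes from HotSpot or GCC. */

static jmp_buf recover_point;
static int caught_code;

/* A hook that never returns in practice, but whose type does not say so. */
static void (*exit_hook)(int code);

/* The compiler cannot prove that the indirect call never returns, so the
   builtin marks the end of the function as unreachable. */
__attribute__((noreturn)) static void report_fatal(int code)
{
    exit_hook(code);
    __builtin_unreachable();
}

/* Leaves report_fatal by jumping back to the recovery point. */
static void jump_out(int code)
{
    caught_code = code;
    longjmp(recover_point, 1);
}

/* Calls report_fatal and returns the code observed by the hook. */
static int run_and_catch(int code)
{
    exit_hook = jump_out;
    if (setjmp(recover_point) == 0)
        report_fatal(code);
    return caught_code;
}
```

Calling run_and_catch(7) from a test driver should hand back 7, since the
hook records the code before jumping out. The same pattern would apply if
the hook aborted the process instead.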

Re: [PATCH v4] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-20 Thread juzhe.zh...@rivai.ai

-  if (mode == FRM_MODE_DYN_EXIT && prev_mode != FRM_MODE_DYN)
+  if (mode == FRM_MODE_DYN_CALL && prev_mode != FRM_MODE_DYN)
/* No need to emit when prev mode is DYN already.  */
-   emit_insn (gen_fsrmsi_restore_exit (backup_reg));
+   emit_insn (gen_fsrmsi_restore_volatile (backup_reg));

No, I don't think you need to emit restore_volatile for DYN_CALL.
It should be a normal restore.


-(define_insn "fsrmsi_restore_exit"
+(define_insn "fsrmsi_restore_volatile"
No need to change it; revert it to restore_exit.
The volatile restore should only ever be used for exit.


Add one more test:

vfadd (static)

CALL


 ->   add a bunch of integer RVV intrinsics (ideally there
should be only one backup frm insn after CALL).
...


vfadd (static)






juzhe.zh...@rivai.ai
 
From: pan2.li
Date: 2023-07-20 14:43
To: gcc-patches
CC: juzhe.zhong; kito.cheng; pan2.li; yanzhang.wang
Subject: [PATCH v4] RISC-V: Support CALL for RVV floating-point dynamic rounding
From: Pan Li 
 
In basic dynamic rounding mode, we simply ignore call instructions;
this PATCH takes care of calls.
 
During the call, the frm may be updated or kept as is. Thus, we must
ensure at least two things.
 
1. The static frm before the call should not pollute the frm value in the call.
2. The updated frm value in the call should be sticky after the call completes.
 
We will perform the following steps to make this happen.
 
1. Mark the call instruction with the new mode DYN_CALL.
2. Mark the instruction after CALL from NONE to DYN.
3. When emitting for a DYN_CALL, we will restore the frm value.
4. When emitting from a DYN_CALL, we will back up the frm value.
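
The effect of these steps can be sketched with a small simulation. This is
an assumption-laden illustration, not the patch itself: the frm register is
modelled as a plain variable, caller and callee are hypothetical, and each
assignment is annotated with the instruction it stands in for.

```c
/* Simulated frm values; the encodings follow the RISC-V rounding modes
   (RTZ = 1 and RUP = 3 match the fsrmi operands in the flow). */
enum { FRM_RNE = 0, FRM_RTZ = 1, FRM_RDN = 2, FRM_RUP = 3 };

static int frm = FRM_RNE;      /* stand-in for the real CSR */
static int frm_seen_by_callee; /* what the callee inherited */

/* A callee that changes the dynamic rounding mode. */
static void callee(void)
{
    frm_seen_by_callee = frm;
    frm = FRM_RDN;
}

/* A caller mixing static rounding with a call. */
static void caller(void)
{
    int backup = frm;          /* frrm a5 : back up at entry          */
    frm = FRM_RTZ;             /* fsrmi 1 : static mode before call   */
    frm = backup;              /* fsrm a5 : restore for the DYN_CALL  */
    callee();
    backup = frm;              /* frrm a5 : back up after the call    */
    frm = FRM_RUP;             /* fsrmi 3 : static mode after call    */
    frm = backup;              /* fsrm a5 : restore at exit           */
}
```

After caller() runs from an initial FRM_RNE, the callee has seen FRM_RNE
rather than the static RTZ (requirement 1), and frm ends up as FRM_RDN, the
value the callee set (requirement 2).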
 
Let's take a flow for this.
 
          +-------------+
          | Entry (DYN) | <- frrm a5
          +-------------+
           /           \
    +-------+           +-----------+
    | VFADD |           | VFADD RTZ |  <- fsrmi 1(RTZ)
    +-------+           +-----------+
        |                     |
    +-------+             +-------+
    | CALL  |             | CALL  |  <- fsrm a5
    +-------+             +-------+
        |                     |
    +-------+             +-------+
    | SHIFT | <- frrm a5  | VFADD |  <- frrm a5
    +-------+             +-------+
        |                    /
  +-----------+             /
  | VFADD RUP | <- fsrmi 3(RUP)
  +-----------+           /
           \             /
        +-----------------+
        | Exit (DYN_EXIT) | <- fsrm a5
        +-----------------+
 
Please *NOTE* that some corner cases, such as no instruction after a call,
are not well handled and will be covered in follow-up PATCH(es) soon.
 
Signed-off-by: Pan Li 
Co-Authored-By: Juzhe-Zhong 
 
gcc/ChangeLog:
 
* config/riscv/riscv.cc (DYNAMIC_FRM_RTL): New macro.
(STATIC_FRM_P): Ditto.
(struct mode_switching_info): New struct for mode switching.
(struct machine_function): Add new field mode switching.
(riscv_emit_frm_mode_set): Add DYN_CALL emit.
(riscv_frm_mode_needed): New function for frm mode needed.
(riscv_mode_needed): Extract function for frm.
(riscv_frm_mode_after): Add DYN_CALL after.
* config/riscv/vector.md (frm_mode): Add dyn_call.
(fsrmsi_restore_exit): Rename to _volatile.
(fsrmsi_restore_volatile): Likewise.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/base/float-point-frm-insert-7.c: Adjust
test cases.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-33.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-34.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-35.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-36.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-37.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-38.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-39.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-40.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-41.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-42.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-43.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-44.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-45.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-46.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-47.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-48.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-49.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-50.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-51.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-52.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-53.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-54.c: New test.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-55.c: New test.
*

[Bug other/98375] [meta bug] GCC 12 pending patches

2023-07-20 Thread crazylht at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98375
Bug 98375 depends on bug 89701, which changed state.

Bug 89701 Summary: Provide -fcf-protection=branch,return
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89701

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug target/89701] Provide -fcf-protection=branch,return

2023-07-20 Thread crazylht at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89701

Hongtao.liu  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Hongtao.liu  ---
Fixed in GCC14.

[Bug target/89701] Provide -fcf-protection=branch,return

2023-07-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89701

--- Comment #5 from CVS Commits  ---
The master branch has been updated by hongtao Liu :

https://gcc.gnu.org/g:1c6231c05bdccab3a21abcbb75e2094ea3e98782

commit r14-2692-g1c6231c05bdccab3a21abcbb75e2094ea3e98782
Author: liuhongt 
Date:   Fri May 12 15:15:08 2023 +0800

Provide -fcf-protection=branch,return.

Use EnumSet instead of EnumBitSet since CF_FULL is not power of 2.
It is a bit tricky for sets classification, cf_branch and cf_return
should be in different sets, but they both "conflicts" cf_full,
cf_none. And current EnumSet don't handle this well.

So in the current implementation, only cf_full,cf_none are exclusive
to each other, but they can be combined with any cf_branch, cf_return,
cf_check. It's not perfect, but still an improvement than original
one.

gcc/ChangeLog:

PR target/89701
* common.opt: (fcf-protection=): Add EnumSet attribute to
support combination of params.

gcc/testsuite/ChangeLog:

* c-c++-common/fcf-protection-10.c: New test.
* c-c++-common/fcf-protection-11.c: New test.
* c-c++-common/fcf-protection-12.c: New test.
* c-c++-common/fcf-protection-8.c: New test.
* c-c++-common/fcf-protection-9.c: New test.
* gcc.target/i386/pr89701-1.c: New test.
* gcc.target/i386/pr89701-2.c: New test.
* gcc.target/i386/pr89701-3.c: New test.

[Bug libfortran/110759] [14 Regression] IEEE Fortran change broke RISC-V linux build

2023-07-20 Thread patrick at rivosinc dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110759

--- Comment #8 from Patrick O'Neill  ---
Awesome, thank you!

[Bug libfortran/110759] [14 Regression] IEEE Fortran change broke RISC-V linux build

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110759

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 Target||!=x86_64
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org

--- Comment #7 from Andrew Pinski  ---
I will commit a fix in an hour or two; just going to build to make sure it
works.

[Bug libfortran/110759] [14 Regression] IEEE Fortran change broke RISC-V linux build

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110759

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2023-07-21

--- Comment #6 from Andrew Pinski  ---
I see the problem now.
#ifdef HAVE_GFC_REAL_16
#  define EXPAND_INTER_MACRO_16(TYPE,OP) _gfortran_ieee_/**/TYPE/**/_/**/OP/**/_16
#else
#  define EXPAND_INTER_MACRO_16(TYPE,OP)
#endif

...
  EXPAND_INTER_MACRO_16(TYPE,OP) , \


The comma should be part of EXPAND_INTER_MACRO_16 macro instead.
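
A minimal C analogue of the suggested fix, assuming hypothetical names
(HAVE_KIND_16 stands in for HAVE_GFC_REAL_16, and an array initializer
stands in for the Fortran procedure list). When the macro expands to
nothing, a comma left outside it produces an empty list item; moving the
comma into the macro makes both expand or vanish together.

```c
/* Hypothetical names; HAVE_KIND_16 stands in for HAVE_GFC_REAL_16. */
#ifdef HAVE_KIND_16
#  define KIND_16_ENTRY 16 ,
#else
#  define KIND_16_ENTRY      /* expands to nothing, comma included */
#endif

static const int supported_kinds[] = {
    4 ,
    8 ,
    KIND_16_ENTRY            /* no comma here: the macro supplies it */
};

/* Number of entries that survived preprocessing. */
static int supported_kind_count(void)
{
    return (int) (sizeof supported_kinds / sizeof supported_kinds[0]);
}
```

Built without HAVE_KIND_16 the list has two entries; with it defined, three.
Writing "KIND_16_ENTRY ," in the list instead would leave a stray comma
behind in the first configuration, which is the shape of the failure
described above.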

[PATCH] cleanup: Change condition order

2023-07-20 Thread Juzhe-Zhong

Hi, Richard and Richi.

I have double-checked the recent code for len && mask support again.

In some places the code is structured as:

if (len_mask_fn)
...
else if (mask_fn)
...

In other places the code is structured as:

if (mask_len_fn)
...
else if (mask)

Based on a previous review comment from Richi:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625067.html

the LEN MASK variants should be checked before the MASK variants.

So I have reordered all such conditions to check LEN MASK before MASK.
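
The intended ordering can be sketched in isolation. This is a simplified
illustration with hypothetical names (partial_style, choose_partial_style);
the booleans stand in for the target support queries made in the vectorizer.

```c
#include <stdbool.h>

/* Hypothetical result type: which partial-vector control gets recorded. */
enum partial_style { PARTIAL_NONE, PARTIAL_MASK, PARTIAL_LEN };

/* Check the len+mask variant first, then fall back to the mask-only
   variant, and give up on partial vectors if neither is supported. */
static enum partial_style
choose_partial_style(bool mask_len_supported, bool mask_supported)
{
    if (mask_len_supported)
        return PARTIAL_LEN;
    else if (mask_supported)
        return PARTIAL_MASK;
    return PARTIAL_NONE;
}
```

With this order, a target that supports both variants records a loop length
rather than a loop mask.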

This is the last cleanup patch.

Bootstrap and regression testing are in progress.

gcc/ChangeLog:

* tree-vect-stmts.cc (check_load_store_for_partial_vectors): Change 
condition order.
(vectorizable_operation): Ditto.

---
 gcc/tree-vect-stmts.cc | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index d5b4f020332..2fe856db9ab 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1635,17 +1635,17 @@ check_load_store_for_partial_vectors (loop_vec_info 
loop_vinfo, tree vectype,
   internal_fn len_ifn = (is_load
 ? IFN_MASK_LEN_GATHER_LOAD
 : IFN_MASK_LEN_SCATTER_STORE);
-  if (internal_gather_scatter_fn_supported_p (ifn, vectype,
+  if (internal_gather_scatter_fn_supported_p (len_ifn, vectype,
  gs_info->memory_type,
  gs_info->offset_vectype,
  gs_info->scale))
-   vect_record_loop_mask (loop_vinfo, masks, nvectors, vectype,
-  scalar_mask);
-  else if (internal_gather_scatter_fn_supported_p (len_ifn, vectype,
+   vect_record_loop_len (loop_vinfo, lens, nvectors, vectype, 1);
+  else if (internal_gather_scatter_fn_supported_p (ifn, vectype,
   gs_info->memory_type,
   gs_info->offset_vectype,
   gs_info->scale))
-   vect_record_loop_len (loop_vinfo, lens, nvectors, vectype, 1);
+   vect_record_loop_mask (loop_vinfo, masks, nvectors, vectype,
+  scalar_mask);
   else
{
  if (dump_enabled_p ())
@@ -6596,16 +6596,16 @@ vectorizable_operation (vec_info *vinfo,
  && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
  && mask_out_inactive)
{
- if (cond_fn != IFN_LAST
- && direct_internal_fn_supported_p (cond_fn, vectype,
+ if (cond_len_fn != IFN_LAST
+ && direct_internal_fn_supported_p (cond_len_fn, vectype,
 OPTIMIZE_FOR_SPEED))
-   vect_record_loop_mask (loop_vinfo, masks, ncopies * vec_num,
-  vectype, NULL);
- else if (cond_len_fn != IFN_LAST
-  && direct_internal_fn_supported_p (cond_len_fn, vectype,
- OPTIMIZE_FOR_SPEED))
vect_record_loop_len (loop_vinfo, lens, ncopies * vec_num, vectype,
  1);
+ else if (cond_fn != IFN_LAST
+  && direct_internal_fn_supported_p (cond_fn, vectype,
+ OPTIMIZE_FOR_SPEED))
+   vect_record_loop_mask (loop_vinfo, masks, ncopies * vec_num,
+  vectype, NULL);
  else
{
  if (dump_enabled_p ())
-- 
2.36.1

[Bug libfortran/110759] [14 Regression] IEEE Fortran change broke RISC-V linux build

2023-07-20 Thread patrick at rivosinc dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110759

--- Comment #5 from Patrick O'Neill  ---
Linux target rv32gc-ilp32d, rv64gc-lp64d.
Newlib still builds successfully.

Re: [PATCH 1/2] rs6000, add argument to function find_instance

2023-07-20 Thread Kewen.Lin via Gcc-patches

Hi Carl,

on 2023/7/18 03:19, Carl Love wrote:
> 
> GCC maintainers:
> 
> The rs6000 function find_instance assumes that it is called for built-ins
> with only two arguments.  There is no checking for the actual
> number of arguments used in the built-in.  This patch adds an
> additional parameter to the function call containing the number of
> arguments in the built-in.  The function will now do the needed checks
> for all of the arguments.
> 
> This fix is needed for the next patch in the series that fixes the
> vec_replace_unaligned built-in.c test.
> 
> Please let me know if this patch is acceptable for mainline.  Thanks.
> 
> Carl 
> 
> 
> 
> rs6000, add argument to function find_instance
> 
> The function find_instance assumes it is called to check a built-in  with 
> ~~ two spaces.
> only two arguments.  Ths patch extends the function by adding a parameter
   s/Ths/This/
> specifying the number of buit-in arguments to check.
  s/buit-in/built-in/

> 
> gcc/ChangeLog:
>   * config/rs6000/rs6000-c.cc (find_instance): Add new parameter that
>   specifies the number of built-in arguments to check.
>   (altivec_resolve_overloaded_builtin): Update calls to find_instance
>   to pass the number of built-in argument to be checked.

s/argument/arguments/

> ---
>  gcc/config/rs6000/rs6000-c.cc | 27 +++
>  1 file changed, 19 insertions(+), 8 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
> index a353bca19ef..350987b851b 100644
> --- a/gcc/config/rs6000/rs6000-c.cc
> +++ b/gcc/config/rs6000/rs6000-c.cc
> @@ -1679,7 +1679,7 @@ tree

There is a function comment here describing the meaning of each parameter;
I think we should add a corresponding description for NARGS, maybe something like:

"; and NARGS specifies the number of built-in arguments."

Also we need to update the below "two"s with "NARGS".

"TYPES contains an array of two types..." and "ARGS contains an array of two 
arguments..."

since we have already extended this to handle NARGS arguments instead of two.
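
The generalisation from two arguments to NARGS can be sketched outside GCC.
All names here are hypothetical: parm_node stands in for the TREE_CHAIN of
parameter types, and type_compatible for rs6000_builtin_type_compatible. The
early return for a short parameter list is an extra guard added for the
sketch, not something taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a chain of formal parameter types. */
struct parm_node {
    int type_id;
    struct parm_node *next;
};

/* Hypothetical compatibility test; the real check is more involved. */
static bool type_compatible(int actual, int formal)
{
    return actual == formal;
}

/* Walk the first nargs formals and compare each with the actual type. */
static bool args_compatible(const int *types, const struct parm_node *parms,
                            int nargs)
{
    for (int i = 0; i < nargs; i++) {
        if (parms == NULL)
            return false;      /* fewer formals than nargs (sketch-only guard) */
        if (!type_compatible(types[i], parms->type_id))
            return false;
        parms = parms->next;
    }
    return true;
}
```

A check hard-wired to two arguments would accept a call whose third
argument has the wrong type; passing nargs = 3 here rejects it.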

>  find_instance (bool *unsupported_builtin, ovlddata **instance,
>  rs6000_gen_builtins instance_code,
>  rs6000_gen_builtins fcode,
> -tree *types, tree *args)
> +tree *types, tree *args, int nargs)
>  {
>while (*instance && (*instance)->bifid != instance_code)
>  *instance = (*instance)->next;
> @@ -1691,17 +1691,28 @@ find_instance (bool *unsupported_builtin, ovlddata 
> **instance,
>if (!inst->fntype)
>  return error_mark_node;
>tree fntype = rs6000_builtin_info[inst->bifid].fntype;
> -  tree parmtype0 = TREE_VALUE (TYPE_ARG_TYPES (fntype));
> -  tree parmtype1 = TREE_VALUE (TREE_CHAIN (TYPE_ARG_TYPES (fntype)));
> +  tree argtype = TYPE_ARG_TYPES (fntype);
> +  tree parmtype;

Nit: We can move "tree parmtype" into the loop (close to its only use).

> +  int args_compatible = true;

s/int/bool/

>  
> -  if (rs6000_builtin_type_compatible (types[0], parmtype0)
> -  && rs6000_builtin_type_compatible (types[1], parmtype1))
> +  for (int i = 0; i < nargs; i++)
> +    {
> +  parmtype = TREE_VALUE (argtype);

 tree parmtype = TREE_VALUE (argtype);

> +  if (! rs6000_builtin_type_compatible (types[i], parmtype))

Nit: One unexpected(?) space after "!".

> + {
> +   args_compatible = false;
> +   break;
> + }
> +  argtype = TREE_CHAIN (argtype);
> +}
> +
> +  if (args_compatible)
> +  {

Nit: indent issue for "{".

Ok for trunk with these nits fixed.  Btw, the description doesn't say
how this was tested, I'm not sure if it's only tested together with
"patch 2/2", but please ensure it's bootstrapped and regress-tested
on BE and LE when committing.  Thanks!

BR,
Kewen

>if (rs6000_builtin_decl (inst->bifid, false) != error_mark_node
> && rs6000_builtin_is_supported (inst->bifid))
>   {
> tree ret_type = TREE_TYPE (inst->fntype);
> -   return altivec_build_resolved_builtin (args, 2, fntype, ret_type,
> +   return altivec_build_resolved_builtin (args, nargs, fntype, ret_type,
>inst->bifid, fcode);
>   }
>else
> @@ -1921,7 +1932,7 @@ altivec_resolve_overloaded_builtin (location_t loc, 
> tree fndecl,
> instance_code = RS6000_BIF_CMPB_32;
>  
>   tree call = find_instance (&unsupported_builtin, &instance,
> -instance_code, fcode, types, args);
> +instance_code, fcode, types, args, nargs);
>   if (call != error_mark_node)
> return call;
>   break;
> @@ -1958,7 +1969,7 @@ altivec_resolve_overloaded_builtin (location_t loc, 
> tree fndecl,
> }
>  
>   tree call = find_instance (&unsupported_builtin, &instance,
> -instance_code, fcode,

[Bug libfortran/110759] [14 Regression] IEEE Fortran change broke RISC-V linux build

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110759

--- Comment #4 from Andrew Pinski  ---
What exact target? Is this the elf target or the linux target?


I just bootstrapped on x86_64-linux-gnu and it works.

[PATCH] cleanup: make all cond_len_* and mask_len_* consistent on the order of mask and len

2023-07-20 Thread Juzhe-Zhong

This patch depends on:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625121.html

Hi, Richard and Richi.

This patch is to align the order of mask and len.

Currently, according to this piece of code:
if (final_len && final_mask)
  call = gimple_build_call_internal (
IFN_LEN_MASK_GATHER_LOAD, 7, dataref_ptr,
vec_offset, scale, zero, final_mask, final_len,
bias);

You can see that the order of mask and len is {mask, len, bias}:
"mask" comes before "len". The reason for this order is that we want to
reuse the existing code for MASK_GATHER_LOAD/MASK_SCATTER_STORE.

The same applies to COND_LEN_*, where we want to reuse the code for COND_*.

Reusing code from the existing MASK_* or COND_* paths lets us avoid
changing too much and keeps the code elegant and easy to maintain and read.

To avoid any confusion about auto-vectorization patterns that include both
mask and len, this patch aligns the order of mask and len for both GIMPLE IR
and RTL patterns to {mask, len, bias}, making everything cleaner and more
consistent.
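
As a rough sketch of the resulting convention, the three trailing operands
can be described by a single helper. The names partial_arg_layout and
layout_after are hypothetical; base_count is the number of operands that
precede the {mask, len, bias} triple.

```c
/* Hypothetical description of where the trailing operands live. */
struct partial_arg_layout {
    int mask_index;
    int len_index;
    int bias_index;
};

/* With a uniform {mask, len, bias} order, the mask keeps the position it
   has in the mask-only form, and len and bias simply follow it. */
static struct partial_arg_layout layout_after(int base_count)
{
    struct partial_arg_layout layout;
    layout.mask_index = base_count;
    layout.len_index = base_count + 1;
    layout.bias_index = base_count + 2;
    return layout;
}
```

For the gather-load call quoted above, four operands precede the triple, so
mask, len and bias land at positions 4, 5 and 6.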

Bootstrap and regression testing are in progress.

gcc/ChangeLog:

* config/riscv/autovec.md: Align order of mask and len.
* config/riscv/riscv-v.cc (expand_load_store): Ditto.
(expand_gather_scatter): Ditto.
* doc/md.texi: Ditto.
* internal-fn.cc (add_len_and_mask_args): Ditto.
(add_mask_and_len_args): Ditto.
(expand_partial_load_optab_fn): Ditto.
(expand_partial_store_optab_fn): Ditto.
(expand_scatter_store_optab_fn): Ditto.
(expand_gather_load_optab_fn): Ditto.
(internal_fn_len_index): Ditto.
(internal_fn_mask_index): Ditto.
(internal_len_load_store_bias): Ditto.
* tree-vect-stmts.cc (vectorizable_store): Ditto.
(vectorizable_load): Ditto.

---
 gcc/config/riscv/autovec.md | 96 ++---
 gcc/config/riscv/riscv-v.cc | 12 ++---
 gcc/doc/md.texi | 36 +++---
 gcc/internal-fn.cc  | 50 +--
 gcc/tree-vect-stmts.cc  |  8 ++--
 5 files changed, 101 insertions(+), 101 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 7eb96d42c18..d899922586a 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -25,9 +25,9 @@
 (define_expand "mask_len_load"
   [(match_operand:V 0 "register_operand")
(match_operand:V 1 "memory_operand")
-   (match_operand 2 "autovec_length_operand")
-   (match_operand 3 "const_0_operand")
-   (match_operand: 4 "vector_mask_operand")]
+   (match_operand: 2 "vector_mask_operand")
+   (match_operand 3 "autovec_length_operand")
+   (match_operand 4 "const_0_operand")]
   "TARGET_VECTOR"
 {
   riscv_vector::expand_load_store (operands, true);
@@ -37,9 +37,9 @@
 (define_expand "mask_len_store"
   [(match_operand:V 0 "memory_operand")
(match_operand:V 1 "register_operand")
-   (match_operand 2 "autovec_length_operand")
-   (match_operand 3 "const_0_operand")
-   (match_operand: 4 "vector_mask_operand")]
+   (match_operand: 2 "vector_mask_operand")
+   (match_operand 3 "autovec_length_operand")
+   (match_operand 4 "const_0_operand")]
   "TARGET_VECTOR"
 {
   riscv_vector::expand_load_store (operands, false);
@@ -67,9 +67,9 @@
(match_operand:RATIO64I 2 "register_operand")
(match_operand 3 "")
(match_operand 4 "")
-   (match_operand 5 "autovec_length_operand")
-   (match_operand 6 "const_0_operand")
-   (match_operand: 7 "vector_mask_operand")]
+   (match_operand: 5 "vector_mask_operand")
+   (match_operand 6 "autovec_length_operand")
+   (match_operand 7 "const_0_operand")]
   "TARGET_VECTOR"
 {
   riscv_vector::expand_gather_scatter (operands, true);
@@ -82,9 +82,9 @@
(match_operand:RATIO32I 2 "register_operand")
(match_operand 3 "")
(match_operand 4 "")
-   (match_operand 5 "autovec_length_operand")
-   (match_operand 6 "const_0_operand")
-   (match_operand: 7 "vector_mask_operand")]
+   (match_operand: 5 "vector_mask_operand")
+   (match_operand 6 "autovec_length_operand")
+   (match_operand 7 "const_0_operand")]
   "TARGET_VECTOR"
 {
   riscv_vector::expand_gather_scatter (operands, true);
@@ -97,9 +97,9 @@
(match_operand:RATIO16I 2 "register_operand")
(match_operand 3 "")
(match_operand 4 "")
-   (match_operand 5 "autovec_length_operand")
-   (match_operand 6 "const_0_operand")
-   (match_operand: 7 "vector_mask_operand")]
+   (match_operand: 5 "vector_mask_operand")
+   (match_operand 6 "autovec_length_operand")
+   (match_operand 7 "const_0_operand")]
   "TARGET_VECTOR"
 {
   riscv_vector::expand_gather_scatter (operands, true);
@@ -112,9 +112,9 @@
(match_operand:RATIO8I 2 "register_operand")
(match_operand 3 "")
(match_operand 4 "")
-   (match_operand 5 "autovec_length_operand")
-   (match_operand 6 "const_0_operand")
-   (match_operand: 7

[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA

2023-07-20 Thread xuli1 at eswincomputing dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751

--- Comment #16 from xuli1 at eswincomputing dot com  ---
(In reply to rguent...@suse.de from comment #12)
> On Thu, 20 Jul 2023, juzhe.zhong at rivai dot ai wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751
> > 
> > --- Comment #11 from JuzheZhong  ---
> > (In reply to rguent...@suse.de from comment #10)
> > > On Thu, 20 Jul 2023, juzhe.zhong at rivai dot ai wrote:
> > > 
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751
> > > > 
> > > > --- Comment #9 from JuzheZhong  ---
> > > > (In reply to rguent...@suse.de from comment #8)
> > > > > On Thu, 20 Jul 2023, juzhe.zhong at rivai dot ai wrote:
> > > > > 
> > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751
> > > > > > 
> > > > > > --- Comment #6 from JuzheZhong  ---
> > > > > > (In reply to rguent...@suse.de from comment #5)
> > > > > > > On Thu, 20 Jul 2023, kito at gcc dot gnu.org wrote:
> > > > > > > 
> > > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751
> > > > > > > > 
> > > > > > > > --- Comment #4 from Kito Cheng  ---
> > > > > > > > > OK, so TA is either merge or all-ones.
> > > > > > > > 
> > > > > > > > Yes, your understanding is correct; one more detail is that the
> > > > > > > > result can mix merge and all-ones elements.
> > > > > > > > 
> > > > > > > > e.g.
> > > > > > > > 
> > > > > > > > An 4 x i32 vector with mask 1 0 1 0
> > > > > > > > 
> > > > > > > > Op  =  | a | b | c | d |
> > > > > > > > Mask = | 1 | 0 | 1 | 0 |
> > > > > > > > 
> > > > > > > > the result could be:
> > > > > > > > | a | b | c | d |
> > > > > > > > | a | all-1 | c | d |
> > > > > > > > | a | all-1 | c | all-1 |
> > > > > > > > | a | all-1 | c | d |
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > Not sure how you can use MA at the moment since you specify 
> > > > > > > > > an existing operand in your target hook.  As far as
> > > > > > > > > I can see there's no value the target hook can provide that 
> > > > > > > > > matches any
> > > > > > > > of the implementation semantics?
> > > > > > > > 
> > > > > > > > That's the key point - we don't know how to return an undefined 
> > > > > > > > value there, we
> > > > > > > > have intrinsic can generate undefined value, but it seems 
> > > > > > > > impossible to
> > > > > > > > generate that within the hook.
> > > > > > > 
> > > > > > > Well, neither *A nor *U can be specified currently.  As said for 
> > > > > > > 'merge'
> > > > > > > we would need another operand.  And since 'unspecified' is either 
> > > > > > > merge
> > > > > > > or all-ones we can't express that either.  It's not really 
> > > > > > > 'undefined'
> > > > > > > either.
> > > > > > > 
> > > > > > > Note this also means the proposal to define a .MASK_LOAD as 
> > > > > > > zeroing
> > > > > > > masked elements is not going to work for RISC-V, instead we'd need
> > > > > > > an explicit 'else' value there as well.
> > > > > > > 
> > > > > > > In fact we could follow .MASK_LOAD for .COND_* and simply omit
> > > > > > > the 'else' operand for the case of 'unspecified', no?  GIMPLE 
> > > > > > > would
> > > > > > > be fine omitting it, not sure whether there's precedent for
> > > > > > > optabs with optional operands?
> > > > > > 
> > > > > > For RVV auto-vectorization, we define COND_LEN_* has else value in 
> > > > > > the
> > > > > > arguments. But the else value is not always the real value we need 
> > > > > > to
> > > > > > care about, this is the code from vectorizable_operation:
> > > > > > 
> > > > > >   if (reduc_idx >= 0)
> > > > > > {
> > > > > >   /* Perform the operation on active elements only and 
> > > > > > take
> > > > > >  inactive elements from the reduction chain input.  
> > > > > > */
> > > > > >   gcc_assert (!vop2);
> > > > > >   vops.quick_push (reduc_idx == 1 ? vop1 : vop0);
> > > > > > }
> > > > > >   else
> > > > > > {
> > > > > >   auto else_value = targetm.preferred_else_value
> > > > > > (cond_fn, vectype, vops.length () - 1, [1]);
> > > > > >   vops.quick_push (else_value);
> > > > > > }
> > > > > > 
> > > > > > 
> > > > > > You can see for reduction operations, the else value is the real 
> > > > > > value we
> > > > > > need to depend on, we should use "TU" (Undisturbed or merge value) 
> > > > > > in RVV.
> > > > > > Meaning the inactive elements should remain the "old" value that's 
> > > > > > why we
> > > > > > use "TU".
> > > > > 
> > > > > Sure.  For the above case that's obviously correct.
> > > > > 
> > > > > > However, for single binary operations for example, division, we 
> > > > > > just only
> > > > > > need to forbid the division operations of the inactive elements in 
> > > > > > the 
> > > > > > hardware, we don't care the value of the inactive elements value. 
> > > > > > so in
> > > > > > this case, we want to use "TA". In

RE: [r14-2639 Regression] FAIL: gcc.dg/vect/bb-slp-pr95839-v8.c scan-tree-dump slp2 "optimized: basic block" on Linux/x86_64

2023-07-20 Thread Jiang, Haochen via Gcc-patches

> -Original Message-
> From: Richard Biener 
> Sent: Thursday, July 20, 2023 9:28 PM
> To: Maciej W. Rozycki 
> Cc: haochen.jiang ; gcc-
> regress...@gcc.gnu.org; gcc-patches@gcc.gnu.org; Jiang, Haochen
> 
> Subject: Re: [r14-2639 Regression] FAIL: gcc.dg/vect/bb-slp-pr95839-v8.c
> scan-tree-dump slp2 "optimized: basic block" on Linux/x86_64
> 
> On Thu, Jul 20, 2023 at 3:13 PM Maciej W. Rozycki 
> wrote:
> >
> > On Thu, 20 Jul 2023, Richard Biener wrote:
> >
> > > > c1e420549f2305efb70ed37e693d380724eb7540 is the first bad commit
> > > > commit c1e420549f2305efb70ed37e693d380724eb7540
> > > > Author: Maciej W. Rozycki 
> > > > Date:   Wed Jul 19 11:59:29 2023 +0100
> > > >
> > > > testsuite: Add 64-bit vector variant for bb-slp-pr95839.c
> > >
> > > I think the issue is we disable V2SF on ia32 because of the conflict
> > > with MMX which we don't want to use.
> >
> >  I'm not sure if I have a way to test with such a target.  Would you
> > expect:
> >
> > /* { dg-require-effective-target vect64 } */
> >
> > to cover it?  If so, then I'll put it back as in the original version
> > and post for Haochen to verify.

I suppose you can just commit to trunk; it should be OK since it is only an -m32 issue.

Thx,
Haochen

> 
> Yeah, that should work here.
> 
> Richard.
> 
> >   Maciej

[PATCHv2, rs6000] Generate mfvsrwz for all subtargets and remove redundant zero extend [PR106769]

2023-07-20 Thread HAO CHEN GUI via Gcc-patches

Hi,
  This patch modifies vsx extract expand and generates mfvsrwz/stxsiwx
for all subtargets when the mode is V4SI and the index of extracted element
is 1 for BE and 2 for LE. Also this patch adds an insn pattern for mfvsrwz
which can help eliminate redundant zero extend.

  Compared to the last version, the main change is to add a new expand for V4SI
and split "vsx_extract_si" into 2 insn patterns.
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622101.html

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.

Thanks
Gui Haochen


ChangeLog
rs6000: Generate mfvsrwz for all subtargets and remove redundant zero extend

mfvsrwz has lower latency than xxextractuw or vextuw[lr]x.  So it should be
generated even with p9 vector enabled.  Also the instruction is already
zero extended.  A combine pattern is needed to eliminate redundant zero
extend instructions.

gcc/
PR target/106769
* config/rs6000/vsx.md (expand vsx_extract_): Set it only
for V8HI and V16QI.
(vsx_extract_v4si): New expand for V4SI.
(*vsx_extract__di_p9): Not generate the insn when it can
be generated by mfvsrwz.
(mfvsrwz): New insn pattern for zero extended vsx_extract_v4si.
(*vsx_extract_si): Removed.
(vsx_extract_v4si_0): New insn pattern to deal with V4SI extract
when the index of extracted element is 1 with BE and 2 with LE.
(vsx_extract_v4si_1): New insn and split pattern which deals with
the cases not handled by vsx_extract_v4si_0.

gcc/testsuite/
PR target/106769
* gcc.target/powerpc/pr106769.h: New.
* gcc.target/powerpc/pr106769-p8.c: New.
* gcc.target/powerpc/pr106769-p9.c: New.

patch.diff
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 0a34ceebeb5..ad249441bcf 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -3722,9 +3722,9 @@ (define_insn "vsx_xxpermdi2__1"
 (define_expand  "vsx_extract_"
   [(parallel [(set (match_operand: 0 "gpc_reg_operand")
   (vec_select:
-   (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand")
+   (match_operand:VSX_EXTRACT_I2 1 "gpc_reg_operand")
(parallel [(match_operand:QI 2 "const_int_operand")])))
- (clobber (match_scratch:VSX_EXTRACT_I 3))])]
+ (clobber (match_scratch:VSX_EXTRACT_I2 3))])]
   "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT"
 {
   /* If we have ISA 3.0, we can do a xxextractuw/vextractu{b,h}.  */
@@ -3736,6 +3736,23 @@ (define_expand  "vsx_extract_"
 }
 })

+(define_expand  "vsx_extract_v4si"
+  [(parallel [(set (match_operand:SI 0 "gpc_reg_operand")
+  (vec_select:SI
+   (match_operand:V4SI 1 "gpc_reg_operand")
+   (parallel [(match_operand:QI 2 "const_0_to_3_operand")])))
+ (clobber (match_scratch:V4SI 3))])]
+  "TARGET_DIRECT_MOVE_64BIT"
+{
+  if (TARGET_P9_VECTOR
+  && INTVAL (operands[2]) != (BYTES_BIG_ENDIAN ? 1 : 2))
+{
+  emit_insn (gen_vsx_extract_v4si_p9 (operands[0], operands[1],
+ operands[2]));
+  DONE;
+}
+})
+
 (define_insn "vsx_extract__p9"
   [(set (match_operand: 0 "gpc_reg_operand" "=r,")
(vec_select:
@@ -3798,7 +3815,9 @@ (define_insn_and_split "*vsx_extract__di_p9"
  (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "v,")
  (parallel [(match_operand:QI 2 "const_int_operand" "n,n")]
(clobber (match_scratch:SI 3 "=r,X"))]
-  "VECTOR_MEM_VSX_P (mode) && TARGET_VEXTRACTUB"
+  "TARGET_VEXTRACTUB
+   && (mode != V4SImode
+   || INTVAL (operands[2]) != (BYTES_BIG_ENDIAN ? 1 : 2))"
   "#"
   "&& reload_completed"
   [(parallel [(set (match_dup 4)
@@ -3830,58 +3849,78 @@ (define_insn_and_split "*vsx_extract__store_p9"
(set (match_dup 0)
(match_dup 3))])

-(define_insn_and_split  "*vsx_extract_si"
+(define_insn "mfvsrwz"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (zero_extend:DI
+ (vec_select:SI
+   (match_operand:V4SI 1 "vsx_register_operand" "wa")
+   (parallel [(match_operand:QI 2 "const_int_operand" "n")]
+   (clobber (match_scratch:V4SI 3 "=v"))]
+  "TARGET_DIRECT_MOVE_64BIT
+   && INTVAL (operands[2]) == (BYTES_BIG_ENDIAN ? 1 : 2)"
+  "mfvsrwz %0,%x1"
+  [(set_attr "type" "mfvsr")
+   (set_attr "isa" "p8v")])
+
+(define_insn "vsx_extract_v4si_0"
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,wa,Z,wa")
+   (vec_select:SI
+(match_operand:V4SI 1 "gpc_reg_operand" "v,v,v,0")
+(parallel [(match_operand:QI 2 "const_0_to_3_operand" "n,n,n,n")])))
+   (clobber (match_scratch:V4SI 3 "=v,v,v,v"))]
+  "TARGET_DIRECT_MOVE_64BIT
+   && (!TARGET_P9_VECTOR || INTVAL (operands[2]) == (BYTES_BIG_ENDIAN ? 1 : 2))"
+{
+   if (which_alternative == 0)
+ return "mfvsrwz %0,%x1";
+
+   if (which_alternative == 1)
+ return "xxlor %x0,%x1,%x1";
+
+   if

Re: [PATCH-1, combine] Don't widen shift mode when target has rotate/mask instruction on original mode [PR93738]

2023-07-20 Thread HAO CHEN GUI via Gcc-patches

Sorry for the typo: s/change/chance/

On 2023/7/21 8:59, HAO CHEN GUI wrote:
> Hi Jeff,
> 
> On 2023/7/21 5:27, Jeff Law wrote:
>> Wouldn't it make more sense to just try rotate/mask in the original mode 
>> before trying a shift in a widened mode?  I'm not sure why we need a target 
>> hook here.
> 
> There is no change to try rotate/mask with the original mode when
> expensive_optimizations is set. The subst widens the shift mode.
> 
>   if (flag_expensive_optimizations)
>     {
>       /* Pass pc_rtx so no substitutions are done, just
>          simplifications.  */
>       if (i1)
>         {
>           subst_low_luid = DF_INSN_LUID (i1);
>           i1src = subst (i1src, pc_rtx, pc_rtx, 0, 0, 0);
>         }
> 
>       subst_low_luid = DF_INSN_LUID (i2);
>       i2src = subst (i2src, pc_rtx, pc_rtx, 0, 0, 0);
>     }
> 
> I don't know if the wider mode is helpful to other targets, so
> I added the target hook.
> 
> Thanks
> Gui Haochen

Re: [PATCH-1, combine] Don't widen shift mode when target has rotate/mask instruction on original mode [PR93738]

2023-07-20 Thread HAO CHEN GUI via Gcc-patches

Hi Jeff,

On 2023/7/21 5:27, Jeff Law wrote:
> Wouldn't it make more sense to just try rotate/mask in the original mode 
> before trying a shift in a widened mode?  I'm not sure why we need a target 
> hook here.

There is no change to try rotate/mask with the original mode when
expensive_optimizations is set. The subst widens the shift mode.

  if (flag_expensive_optimizations)
    {
      /* Pass pc_rtx so no substitutions are done, just
         simplifications.  */
      if (i1)
        {
          subst_low_luid = DF_INSN_LUID (i1);
          i1src = subst (i1src, pc_rtx, pc_rtx, 0, 0, 0);
        }

      subst_low_luid = DF_INSN_LUID (i2);
      i2src = subst (i2src, pc_rtx, pc_rtx, 0, 0, 0);
    }

I don't know if the wider mode is helpful to other targets, so
I added the target hook.

Thanks
Gui Haochen

[Bug middle-end/110612] text-art: four clang warnings

2023-07-20 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110612

David Malcolm  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from David Malcolm  ---
Thanks for filing this.

I believe all of these should be fixed by the above commit; please let me know
if any such warnings remain.

[Bug analyzer/110455] [14 Regression] tree check: expected none of vector_type, have vector_type in get_gassign_result, at analyzer/region-model.cc:870 with -fanalyzer

2023-07-20 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110455

David Malcolm  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from David Malcolm  ---
Thanks for filing this bug.  Should be fixed by the above commit.

[Bug other/86656] [meta-bug] Issues found with -fsanitize=address

2023-07-20 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86656
Bug 86656 depends on bug 110433, which changed state.

Bug 110433 Summary: ASAN reports mismatching new/delete when compiling analyzer testcases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110433

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug analyzer/110433] ASAN reports mismatching new/delete when compiling analyzer testcases

2023-07-20 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110433

David Malcolm  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from David Malcolm  ---
Probably fixed by the above patch (by adding the virtual dtor); please reopen
if it isn't.

[Bug analyzer/110387] [14 Regression] ICE: in key_t, at analyzer/region.h:1110 with -fanalyzer

2023-07-20 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110387

David Malcolm  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #3 from David Malcolm  ---
Should be fixed by the above patch.

[pushed] analyzer/text-art: fix clang warnings [PR110433,PR110612]

2023-07-20 Thread David Malcolm via Gcc-patches

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-2689-g7006f02bbc3f1d.


gcc/analyzer/ChangeLog:
PR analyzer/110433
PR middle-end/110612
* access-diagram.cc (class spatial_item): Add virtual dtor.

gcc/ChangeLog:
PR middle-end/110612
* text-art/table.cc (table_geometry::table_geometry): Drop m_table
field.
(table_geometry::table_x_to_canvas_x): Add cast to comparison.
(table_geometry::table_y_to_canvas_y): Likewise.
* text-art/table.h (table_geometry::m_table): Drop unused field.
* text-art/widget.h (wrapper_widget::update_child_alloc_rects):
Add "override".
---
 gcc/analyzer/access-diagram.cc | 1 +
 gcc/text-art/table.cc  | 7 +++
 gcc/text-art/table.h   | 1 -
 gcc/text-art/widget.h  | 2 +-
 4 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/gcc/analyzer/access-diagram.cc b/gcc/analyzer/access-diagram.cc
index 467c9bdd734..d7b669a4e38 100644
--- a/gcc/analyzer/access-diagram.cc
+++ b/gcc/analyzer/access-diagram.cc
@@ -1125,6 +1125,7 @@ private:
 class spatial_item
 {
 public:
+  virtual ~spatial_item () {}
   virtual void add_boundaries (boundaries , logger *) const = 0;
 
   virtual table make_table (const bit_to_table_map ,
diff --git a/gcc/text-art/table.cc b/gcc/text-art/table.cc
index 71a10246257..2f857a0e2a7 100644
--- a/gcc/text-art/table.cc
+++ b/gcc/text-art/table.cc
@@ -507,8 +507,7 @@ table_cell_sizes::get_canvas_size (const table::rect_t ) const
 /* class text_art::table_geometry.  */
 
 table_geometry::table_geometry (const table &table, table_cell_sizes &cell_sizes)
-: m_table (table),
-  m_cell_sizes (cell_sizes),
+: m_cell_sizes (cell_sizes),
   m_canvas_size (canvas::size_t (0, 0)),
   m_col_start_x (table.get_size ().w),
   m_row_start_y (table.get_size ().h)
@@ -558,7 +557,7 @@ int
 table_geometry::table_x_to_canvas_x (int table_x) const
 {
   /* Allow one beyond the end, for the right-hand border of the table.  */
-  if (table_x == m_col_start_x.size ())
+  if (table_x == (int)m_col_start_x.size ())
 return m_canvas_size.w - 1;
   return m_col_start_x[table_x];
 }
@@ -570,7 +569,7 @@ int
 table_geometry::table_y_to_canvas_y (int table_y) const
 {
   /* Allow one beyond the end, for the right-hand border of the table.  */
-  if (table_y == m_row_start_y.size ())
+  if (table_y == (int)m_row_start_y.size ())
 return m_canvas_size.h - 1;
   return m_row_start_y[table_y];
 }
diff --git a/gcc/text-art/table.h b/gcc/text-art/table.h
index 2dc5c3c41cb..17eda912f1a 100644
--- a/gcc/text-art/table.h
+++ b/gcc/text-art/table.h
@@ -232,7 +232,6 @@ class table_geometry
   }
 
  private:
-  const table &m_table;
   table_cell_sizes &m_cell_sizes;
   canvas::size_t m_canvas_size;
 
diff --git a/gcc/text-art/widget.h b/gcc/text-art/widget.h
index 8798e436d94..5156a7ea572 100644
--- a/gcc/text-art/widget.h
+++ b/gcc/text-art/widget.h
@@ -148,7 +148,7 @@ class wrapper_widget : public widget
   {
 return m_child->get_req_size ();
   }
-  void update_child_alloc_rects ()
+  void update_child_alloc_rects () override
   {
 m_child->set_alloc_rect (get_alloc_rect ());
   }
-- 
2.26.3
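
To illustrate two of the warning classes this patch addresses, here is a minimal standalone sketch. The names (`item`, `derived_item`, `destroy_via_base`, `start_or_last`) are hypothetical and are not taken from the GCC sources. It assumes only that an object deleted through a base-class pointer needs a virtual destructor in the base, and that comparing an `int` with the result of `size ()` mixes signedness.

```cpp
#include <vector>

// Hypothetical stand-in for a polymorphic base class such as spatial_item.
class item
{
public:
  virtual ~item () {}   // virtual dtor: deleting via item * runs the derived dtor
  virtual int width () const = 0;
};

static int destroyed_count = 0;

class derived_item : public item
{
public:
  ~derived_item () override { ++destroyed_count; }
  int width () const override { return 3; }
};

// Delete through the base pointer; this is well-defined only because
// ~item is virtual.  Returns how many derived objects have been destroyed.
int destroy_via_base ()
{
  item *p = new derived_item ();
  delete p;
  return destroyed_count;
}

// Same shape as table_x_to_canvas_x: the cast makes both operands of ==
// signed, which avoids a sign-compare warning.
int start_or_last (const std::vector<int> &starts, int idx, int last)
{
  if (idx == (int) starts.size ())
    return last;
  return starts[idx];
}
```

Without the virtual destructor, `delete p` on a base pointer would have undefined behavior, which is plausibly what ASAN reported as a new/delete mismatch.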

[pushed] analyzer: avoid usage of TYPE_PRECISION on vector types [PR110455]

2023-07-20 Thread David Malcolm via Gcc-patches

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-2690-ga4913a19d24a79.

gcc/analyzer/ChangeLog:
PR analyzer/110455
* region-model.cc (region_model::get_gassign_result): Only check
for bad shift counts when dealing with an integral type.

gcc/testsuite/ChangeLog:
PR analyzer/110455
* gcc.dg/analyzer/pr110455.c: New test.
---
 gcc/analyzer/region-model.cc | 3 ++-
 gcc/testsuite/gcc.dg/analyzer/pr110455.c | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr110455.c

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 187013a37cc..e01b1c88299 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -860,7 +860,8 @@ region_model::get_gassign_result (const gassign *assign,
   or by greater than or equal to the number of bits that exist in
   the operand."  */
if (const tree rhs2_cst = rhs2_sval->maybe_get_constant ())
- if (TREE_CODE (rhs2_cst) == INTEGER_CST)
+ if (TREE_CODE (rhs2_cst) == INTEGER_CST
+ && INTEGRAL_TYPE_P (TREE_TYPE (rhs1)))
{
  if (tree_int_cst_sgn (rhs2_cst) < 0)
ctxt->warn
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr110455.c b/gcc/testsuite/gcc.dg/analyzer/pr110455.c
new file mode 100644
index 000..7f979436b79
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr110455.c
@@ -0,0 +1,7 @@
+int __attribute__((__vector_size__ (4))) v;
+
+void
+foo (void)
+{
+  v | v << 1;
+}
-- 
2.26.3
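
The idea of the fix can be sketched outside the analyzer as follows. This is only an illustration: `shift_count_is_suspicious` and `vec4` are hypothetical names, and `std::is_integral` merely stands in for the `INTEGRAL_TYPE_P` guard added above. The count is checked against the precision only when the shifted operand has an integral type; for anything else (such as a vector-like type) the check is skipped.

```cpp
#include <climits>
#include <type_traits>

// Hypothetical helper: returns true when shifting a value of type T by
// COUNT would be flagged (negative count, or count >= precision).
// For non-integral T the check is skipped entirely.
template <typename T>
bool shift_count_is_suspicious (long long count)
{
  if (!std::is_integral<T>::value)
    return false;
  // Approximates the precision of T; exact for types without padding bits.
  const long long precision = static_cast<long long> (sizeof (T)) * CHAR_BIT;
  return count < 0 || count >= precision;
}

// Stand-in for a vector type; it is not an integral type.
struct vec4 { int lane[4]; };
```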

[pushed] analyzer: fix ICE on certain pointer subtractions [PR110387]

2023-07-20 Thread David Malcolm via Gcc-patches

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-2688-g5a0aff76a99804.

gcc/analyzer/ChangeLog:
PR analyzer/110387
* region.h (struct cast_region::key_t): Support "m_type" being
null by using "m_original_region" for empty/deleted slots.

gcc/testsuite/ChangeLog:
PR analyzer/110387
* gcc.dg/analyzer/out-of-bounds-pr110387.c: New test.
---
 gcc/analyzer/region.h | 16 +++-
 .../gcc.dg/analyzer/out-of-bounds-pr110387.c  | 19 +++
 2 files changed, 30 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-pr110387.c

diff --git a/gcc/analyzer/region.h b/gcc/analyzer/region.h
index 0c79490c9c0..2cbb9234728 100644
--- a/gcc/analyzer/region.h
+++ b/gcc/analyzer/region.h
@@ -1107,7 +1107,7 @@ public:
 key_t (const region *original_region, tree type)
 : m_original_region (original_region), m_type (type)
 {
-  gcc_assert (type);
+  gcc_assert (original_region);
 }
 
 hashval_t hash () const
@@ -1124,10 +1124,16 @@ public:
  && m_type == other.m_type);
 }
 
-void mark_deleted () { m_type = reinterpret_cast<tree> (1); }
-void mark_empty () { m_type = NULL_TREE; }
-bool is_deleted () const { return m_type == reinterpret_cast<tree> (1); }
-bool is_empty () const { return m_type == NULL_TREE; }
+void mark_deleted ()
+{
+  m_original_region = reinterpret_cast<const region *> (1);
+}
+void mark_empty () { m_original_region = nullptr; }
+bool is_deleted () const
+{
+  return m_original_region == reinterpret_cast<const region *> (1);
+}
+bool is_empty () const { return m_original_region == nullptr; }
 
 const region *m_original_region;
 tree m_type;
diff --git a/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-pr110387.c b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-pr110387.c
new file mode 100644
index 000..a046659c83e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-pr110387.c
@@ -0,0 +1,19 @@
+char a, b, c, d;
+long x;
+
+void
+_S_copy (long __n)
+{
+  __builtin_memcpy (, , __n); /* { dg-prune-output "-Wanalyzer-out-of-bounds" } */
+  /* This only warns on some targets; the purpose of the test is to verify that
+ we don't ICE.  */
+}
+
+void
+_M_construct ()
+{
+  x =  - 
+  unsigned long __dnew = x;
+  if (__dnew > 1)
+_S_copy ( - );
+}
-- 
2.26.3

[Bug analyzer/110455] [14 Regression] tree check: expected none of vector_type, have vector_type in get_gassign_result, at analyzer/region-model.cc:870 with -fanalyzer

2023-07-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110455

--- Comment #1 from CVS Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:a4913a19d24a794c97f38d9c65c47c1fb9f2140c

commit r14-2690-ga4913a19d24a794c97f38d9c65c47c1fb9f2140c
Author: David Malcolm 
Date:   Thu Jul 20 20:24:10 2023 -0400

analyzer: avoid usage of TYPE_PRECISION on vector types [PR110455]

gcc/analyzer/ChangeLog:
PR analyzer/110455
* region-model.cc (region_model::get_gassign_result): Only check
for bad shift counts when dealing with an integral type.

gcc/testsuite/ChangeLog:
PR analyzer/110455
* gcc.dg/analyzer/pr110455.c: New test.

Signed-off-by: David Malcolm

[Bug analyzer/110433] ASAN reports mismatching new/delete when compiling analyzer testcases

2023-07-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110433

--- Comment #3 from CVS Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:7006f02bbc3f1d0b7ed7fe2122abc0896aa848d2

commit r14-2689-g7006f02bbc3f1d0b7ed7fe2122abc0896aa848d2
Author: David Malcolm 
Date:   Thu Jul 20 20:24:06 2023 -0400

analyzer/text-art: fix clang warnings [PR110433,PR110612]

gcc/analyzer/ChangeLog:
PR analyzer/110433
PR middle-end/110612
* access-diagram.cc (class spatial_item): Add virtual dtor.

gcc/ChangeLog:
PR middle-end/110612
* text-art/table.cc (table_geometry::table_geometry): Drop m_table
field.
(table_geometry::table_x_to_canvas_x): Add cast to comparison.
(table_geometry::table_y_to_canvas_y): Likewise.
* text-art/table.h (table_geometry::m_table): Drop unused field.
* text-art/widget.h (wrapper_widget::update_child_alloc_rects):
Add "override".

Signed-off-by: David Malcolm

[Bug middle-end/110612] text-art: four clang warnings

2023-07-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110612

--- Comment #2 from CVS Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:7006f02bbc3f1d0b7ed7fe2122abc0896aa848d2

commit r14-2689-g7006f02bbc3f1d0b7ed7fe2122abc0896aa848d2
Author: David Malcolm 
Date:   Thu Jul 20 20:24:06 2023 -0400

analyzer/text-art: fix clang warnings [PR110433,PR110612]

gcc/analyzer/ChangeLog:
PR analyzer/110433
PR middle-end/110612
* access-diagram.cc (class spatial_item): Add virtual dtor.

gcc/ChangeLog:
PR middle-end/110612
* text-art/table.cc (table_geometry::table_geometry): Drop m_table
field.
(table_geometry::table_x_to_canvas_x): Add cast to comparison.
(table_geometry::table_y_to_canvas_y): Likewise.
* text-art/table.h (table_geometry::m_table): Drop unused field.
* text-art/widget.h (wrapper_widget::update_child_alloc_rects):
Add "override".

Signed-off-by: David Malcolm

[Bug analyzer/110387] [14 Regression] ICE: in key_t, at analyzer/region.h:1110 with -fanalyzer

2023-07-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110387

--- Comment #2 from CVS Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:5a0aff76a9980488a760ece72323e7ed1f2c0e5e

commit r14-2688-g5a0aff76a9980488a760ece72323e7ed1f2c0e5e
Author: David Malcolm 
Date:   Thu Jul 20 20:24:01 2023 -0400

analyzer: fix ICE on certain pointer subtractions [PR110387]

gcc/analyzer/ChangeLog:
PR analyzer/110387
* region.h (struct cast_region::key_t): Support "m_type" being
null by using "m_original_region" for empty/deleted slots.

gcc/testsuite/ChangeLog:
PR analyzer/110387
* gcc.dg/analyzer/out-of-bounds-pr110387.c: New test.

Signed-off-by: David Malcolm

Re: [PATCH] Optimize vlddqu to vmovdqu for TARGET_AVX

2023-07-20 Thread Hongtao Liu via Gcc-patches

On Thu, Jul 20, 2023 at 4:11 PM Uros Bizjak via Gcc-patches
 wrote:
>
> On Thu, Jul 20, 2023 at 9:35 AM liuhongt  wrote:
> >
> > On Intel processors that support AVX (TARGET_AVX), vmovdqu is as fast
> > as vlddqu, so UNSPEC_LDDQU can be removed to enable more optimizations.
> > Can someone confirm this with the AMD folks?
> > If AMD doesn't like this optimization, I'll put it under
> > micro-architecture tuning.
>
> The instruction is reachable only as __builtin_ia32_lddqu* (aka
> _mm_lddqu_si*), so it was chosen by the programmer for a reason. I
> think that in this case, the compiler should not be too smart and
> change the instruction behind the programmer's back. The caveats are
> also explained at length in the ISA manual.
fine.
>
> Uros.
>
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > If AMD is also fine with this optimization, OK for trunk?
> >
> > gcc/ChangeLog:
> >
> > * config/i386/sse.md (_lddqu): Change to
> > define_expand, expand as simple move when TARGET_AVX
> > && ( == 16 || !TARGET_AVX256_SPLIT_UNALIGNED_LOAD).
> > The original define_insn is renamed to
> > ..
> > (_lddqu): .. this.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/vlddqu_vinserti128.c: New test.
> > ---
> >  gcc/config/i386/sse.md| 15 ++-
> >  .../gcc.target/i386/vlddqu_vinserti128.c  | 11 +++
> >  2 files changed, 25 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/vlddqu_vinserti128.c
> >
> > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > index 2d81347c7b6..d571a78f4c4 100644
> > --- a/gcc/config/i386/sse.md
> > +++ b/gcc/config/i386/sse.md
> > @@ -1835,7 +1835,20 @@ (define_peephole2
> >[(set (match_dup 4) (match_dup 1))]
> >"operands[4] = adjust_address (operands[0], V2DFmode, 0);")
> >
> > -(define_insn "_lddqu"
> > +(define_expand "_lddqu"
> > +  [(set (match_operand:VI1 0 "register_operand")
> > +   (unspec:VI1 [(match_operand:VI1 1 "memory_operand")]
> > +   UNSPEC_LDDQU))]
> > +  "TARGET_SSE3"
> > +{
> > +  if (TARGET_AVX && ( == 16 || !TARGET_AVX256_SPLIT_UNALIGNED_LOAD))
> > +{
> > +  emit_move_insn (operands[0], operands[1]);
> > +  DONE;
> > +}
> > +})
> > +
> > +(define_insn "*_lddqu"
> >[(set (match_operand:VI1 0 "register_operand" "=x")
> > (unspec:VI1 [(match_operand:VI1 1 "memory_operand" "m")]
> > UNSPEC_LDDQU))]
> > diff --git a/gcc/testsuite/gcc.target/i386/vlddqu_vinserti128.c b/gcc/testsuite/gcc.target/i386/vlddqu_vinserti128.c
> > new file mode 100644
> > index 000..29699a5fa7f
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/vlddqu_vinserti128.c
> > @@ -0,0 +1,11 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-mavx2 -O2" } */
> > +/* { dg-final { scan-assembler-times "vbroadcasti128" 1 } } */
> > +/* { dg-final { scan-assembler-not {(?n)vlddqu.*xmm} } } */
> > +
> > +#include <immintrin.h>
> > +__m256i foo(void *data) {
> > +__m128i X1 = _mm_lddqu_si128((__m128i*)data);
> > +__m256i V1 = _mm256_broadcastsi128_si256 (X1);
> > +return V1;
> > +}
> > --
> > 2.39.1.388.g2fc9e9ca3c
> >



-- 
BR,
Hongtao

[Bug c/110664] -std=c2x -pedantic-errors pedwarns on _Float128

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110664

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||diagnostic, rejects-valid
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2023-07-20

--- Comment #2 from Andrew Pinski  ---
Confirmed.

[Bug tree-optimization/110760] slp introduces new overflow arithmetic

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110760

--- Comment #4 from Andrew Pinski  ---
(In reply to Krister Walfridsson from comment #3)
> (In reply to Andrew Pinski from comment #1)
> > I thought we decided that vector types don't apply the overflow rules and
> > always just wrap ...
> 
> That makes sense. But on the other hand, PR 110495 is a similar issue, and
> that was fixed...
> 
> And TYPE_OVERFLOW_WRAPS should return true for integer vectors if they
> always wrap (or is it only valid for scalars? But ANY_INTEGRAL_TYPE_P is
> careful to handle vectors and complex numbers too, so I thought the
> ANY_INTEGRAL_TYPE_CHECK in TYPE_OVERFLOW_WRAPS means that it work for
> vectors too).

That is slightly different, it was introducing -2(OVF) too.

[Bug tree-optimization/110760] slp introduces new overflow arithmetic

2023-07-20 Thread kristerw at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110760

--- Comment #3 from Krister Walfridsson  ---
(In reply to Andrew Pinski from comment #1)
> I thought we decided that vector types don't apply the overflow rules and
> always just wrap ...

That makes sense. But on the other hand, PR 110495 is a similar issue, and that
was fixed...

And TYPE_OVERFLOW_WRAPS should return true for integer vectors if they always
wrap (or is it only valid for scalars? But ANY_INTEGRAL_TYPE_P is careful to
handle vectors and complex numbers too, so I thought the
ANY_INTEGRAL_TYPE_CHECK in TYPE_OVERFLOW_WRAPS means that it work for vectors
too).

[PATCH] cleanup: Change LEN_MASK into MASK_LEN

2023-07-20 Thread Juzhe-Zhong

Hi.

Starting with LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE and the COND_LEN_*
patterns, the operand order of len and mask is {mask,len,bias}.

The "mask" argument comes before "len" because we want to keep "mask" in the
same position as in the mask_* and cond_* patterns, so that the existing code
flow for mask_* and cond_* can be reused. Otherwise, we would need to change
much more code, which would make it harder to maintain.

Now that we already have COND_LEN_*, it is natural to rename "LEN_MASK" to
"MASK_LEN" to keep the naming scheme consistent.

This patch only changes the name "LEN_MASK" into "MASK_LEN".
No functional change.
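
As a rough scalar model of the {mask,len,bias} operand order, the sketch below may help. The function name `model_mask_len_load` and its signature are hypothetical. It assumes the semantics as I understand them from the pattern documentation: a lane is active only if its index is below len + bias and its mask bit is set. Inactive lanes are simply left untouched here, whereas in the real patterns their contents are unspecified.

```cpp
#include <cstddef>

// Scalar model of a MASK_LEN-style load with operands ordered
// {mask, len, bias}.  Lane i is loaded only if i < len + bias and
// mask[i] is set; other lanes of dst are not written.
void model_mask_len_load (int *dst, const int *src, std::size_t nlanes,
                          const bool *mask, long len, long bias)
{
  const long active = len + bias;
  for (std::size_t i = 0; i < nlanes; ++i)
    if ((long) i < active && mask[i])
      dst[i] = src[i];
}
```

Keeping the mask as the first of the three control operands matches where mask_* and cond_* style operations already expect it, which is the motivation given above.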

gcc/ChangeLog:

* config/riscv/autovec.md (len_maskload): Change LEN_MASK into MASK_LEN.
(mask_len_load): Ditto.
(len_maskstore): Ditto.
(mask_len_store): Ditto.
(len_mask_gather_load): Ditto.
(mask_len_gather_load): Ditto.
(len_mask_gather_load): Ditto.
(mask_len_gather_load): Ditto.
(len_mask_gather_load): Ditto.
(mask_len_gather_load): Ditto.
(len_mask_gather_load): Ditto.
(mask_len_gather_load): Ditto.
(len_mask_gather_load): Ditto.
(mask_len_gather_load): Ditto.
(len_mask_gather_load): Ditto.
(mask_len_gather_load): Ditto.
(len_mask_gather_load): Ditto.
(mask_len_gather_load): Ditto.
(len_mask_scatter_store): Ditto.
(mask_len_scatter_store): Ditto.
(len_mask_scatter_store): Ditto.
(mask_len_scatter_store): Ditto.
(len_mask_scatter_store): Ditto.
(mask_len_scatter_store): Ditto.
(len_mask_scatter_store): Ditto.
(mask_len_scatter_store): Ditto.
(len_mask_scatter_store): Ditto.
(mask_len_scatter_store): Ditto.
(len_mask_scatter_store): Ditto.
(mask_len_scatter_store): Ditto.
(len_mask_scatter_store): Ditto.
(mask_len_scatter_store): Ditto.
* doc/md.texi: Ditto.
* genopinit.cc (main): Ditto.
(CMP_NAME): Ditto. Ditto.
* gimple-fold.cc (arith_overflowed_p): Ditto.
(gimple_fold_partial_load_store_mem_ref): Ditto.
(gimple_fold_call): Ditto.
* internal-fn.cc (len_maskload_direct): Ditto.
(mask_len_load_direct): Ditto.
(len_maskstore_direct): Ditto.
(mask_len_store_direct): Ditto.
(expand_call_mem_ref): Ditto.
(expand_len_maskload_optab_fn): Ditto.
(expand_mask_len_load_optab_fn): Ditto.
(expand_len_maskstore_optab_fn): Ditto.
(expand_mask_len_store_optab_fn): Ditto.
(direct_len_maskload_optab_supported_p): Ditto.
(direct_mask_len_load_optab_supported_p): Ditto.
(direct_len_maskstore_optab_supported_p): Ditto.
(direct_mask_len_store_optab_supported_p): Ditto.
(internal_load_fn_p): Ditto.
(internal_store_fn_p): Ditto.
(internal_gather_scatter_fn_p): Ditto.
(internal_fn_len_index): Ditto.
(internal_fn_mask_index): Ditto.
(internal_fn_stored_value_index): Ditto.
(internal_len_load_store_bias): Ditto.
* internal-fn.def (LEN_MASK_GATHER_LOAD): Ditto.
(MASK_LEN_GATHER_LOAD): Ditto.
(LEN_MASK_LOAD): Ditto.
(MASK_LEN_LOAD): Ditto.
(LEN_MASK_SCATTER_STORE): Ditto.
(MASK_LEN_SCATTER_STORE): Ditto.
(LEN_MASK_STORE): Ditto.
(MASK_LEN_STORE): Ditto.
* optabs-query.cc (supports_vec_gather_load_p): Ditto.
(supports_vec_scatter_store_p): Ditto.
* optabs-tree.cc (target_supports_mask_load_store_p): Ditto.
(target_supports_len_load_store_p): Ditto.
* optabs.def (OPTAB_CD): Ditto.
* tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): Ditto.
(call_may_clobber_ref_p_1): Ditto.
* tree-ssa-dse.cc (initialize_ao_ref_for_dse): Ditto.
(dse_optimize_stmt): Ditto.
* tree-ssa-loop-ivopts.cc (get_mem_type_for_internal_fn): Ditto.
(get_alias_ptr_type_for_ptr_address): Ditto.
* tree-vect-data-refs.cc (vect_gather_scatter_fn_p): Ditto.
* tree-vect-patterns.cc (vect_recog_gather_scatter_pattern): Ditto.
* tree-vect-stmts.cc (check_load_store_for_partial_vectors): Ditto.
(vect_get_strided_load_store_ops): Ditto.
(vectorizable_store): Ditto.
(vectorizable_load): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c: Ditto.
*

[Bug c/110664] -std=c2x -pedantic-errors pedwarns on _Float128

2023-07-20 Thread joseph at codesourcery dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110664

--- Comment #1 from joseph at codesourcery dot com  ---
Yes, this would be a bug.

[Bug testsuite/110756] [14 Regression] commit g:92d1425ca78 causes failures in g++.dg/gomp/pr58567.C

2023-07-20 Thread thiago.bauermann at linaro dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110756

--- Comment #2 from Thiago Jung Bauermann  ---
Ah! Thanks for the analysis. Should I submit the following patch to the mailing
list then?

diff --git a/gcc/testsuite/g++.dg/gomp/pr58567.C b/gcc/testsuite/g++.dg/gomp/pr58567.C
index 35a5bb027ffe..866d831c65e4 100644
--- a/gcc/testsuite/g++.dg/gomp/pr58567.C
+++ b/gcc/testsuite/g++.dg/gomp/pr58567.C
@@ -5,7 +5,7 @@
 template void foo()
 {
   #pragma omp parallel for
-  for (typename T::X i = 0; i < 100; ++i)  /* { dg-error "'int' is not a class, struct, or union type|expected iteration declaration or initialization" } */
+  for (typename T::X i = 0; i < 100; ++i)  /* { dg-error "'int' is not a class, struct, or union type|invalid type for iteration variable 'i'" } */
 ;
 }

[Bug tree-optimization/110760] slp introduces new overflow arithmetic

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110760

Andrew Pinski  changed:

   What|Removed |Added

Summary|slp introduces new wrapped  |slp introduces new overflow
   |arithmetic  |arithmetic

--- Comment #2 from Andrew Pinski  ---
>these calculations may wrap

You mean overflow rather than wrap. Wrapping is a defined behavior while
overflow is what is considered undefined ...

[Bug tree-optimization/110760] slp introduces new wrapped arithmetic

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110760

--- Comment #1 from Andrew Pinski  ---
I thought we decided that vector types don't apply the overflow rules and
always just wrap ...

[Bug middle-end/110754] assume create spurious load for volatile variable

2023-07-20 Thread xry111 at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110754

--- Comment #4 from Xi Ruoyao  ---
(In reply to Andrew Pinski from comment #2)
> (In reply to Xi Ruoyao from comment #1)
> > Is this a bug?  The standard defines accessing volatile objects as
> > side-effects so it's not allowed to merge volatile loads, AFAIU.
> 
> Yes because assume attribute is defined not to have any side effects.
> 
> Confirmed.
> 
> gimplifier produces:
> 
>   [[assume (D.2786)]]
> {
>   {
> int n.0;
> 
> n.0 = n;
> D.2786 = n.0 == 1;
>   }
> }
> 
> And then lowering produces:
>   _2 = n;
>   .ASSUME (_Z3bari._assume.0, _2);
> 
> But really it should have passed the address of n rather than the value
> since n is volatile here .

Alright, I mistakenly believed that [[assume(x)]]; is the same as if (!x)
unreachable();.

[Bug middle-end/110754] assume create spurious load for volatile variable

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110754

--- Comment #3 from Andrew Pinski  ---
It seems that lowering passes everything by value, rather than passing some
operands by reference.

[Bug tree-optimization/110760] New: slp introduces new wrapped arithmetic

2023-07-20 Thread kristerw at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110760

Bug ID: 110760
   Summary: slp introduces new wrapped arithmetic
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kristerw at gcc dot gnu.org
  Target Milestone: ---

Consider the following function from gcc.dg/vect/bb-slp-layout-5.c:

int a[4], b[4], c[4];

void f1()
{
  a[0] = b[3] - c[3];
  a[1] = b[2] + c[2];
  a[2] = b[1] - c[1];
  a[3] = b[0] + c[0];
}

This is vectorized by slp2:
  vector(4) int vect__1.5;
  vector(4) int vect__2.8;
  vector(4) int vect__12.10;
  vector(4) int vect__3.9;
  vector(4) int _22;
  vect__1.5_18 = MEM  [(int *)];
  vect__2.8_19 = MEM  [(int *)];
  vect__12.10_21 = vect__1.5_18 + vect__2.8_19;
  vect__3.9_20 = vect__1.5_18 - vect__2.8_19;
  _22 = VEC_PERM_EXPR ;
  MEM  [(int *)] = _22;

But this introduces new calculations in the temporary vectors of the unused
elements:
  b[0] - c[0];
  b[1] + c[1];
  b[2] - c[2];
  b[3] + c[3];
and these calculations may wrap for input where the original program did not
wrap.
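
A small standalone check makes the concern concrete. It is only a sketch: the helper names are hypothetical, and it assumes `long long` is wider than `int` so that the exact results can be computed without overflow. For b[0] = INT_MIN and c[0] = 1 (all other elements zero), none of the four operations written in f1 overflows, but the extra lane b[0] - c[0] does.

```cpp
#include <climits>

// True if V is not representable as an int, i.e. the corresponding
// int computation would have overflowed.
static bool overflows_int (long long v)
{
  return v < INT_MIN || v > INT_MAX;
}

// The four computations actually written in f1.
bool source_ops_overflow (const int b[4], const int c[4])
{
  return overflows_int ((long long) b[3] - c[3])
         || overflows_int ((long long) b[2] + c[2])
         || overflows_int ((long long) b[1] - c[1])
         || overflows_int ((long long) b[0] + c[0]);
}

// The four extra lanes that the vectorized form also computes.
bool extra_lanes_overflow (const int b[4], const int c[4])
{
  return overflows_int ((long long) b[0] - c[0])
         || overflows_int ((long long) b[1] + c[1])
         || overflows_int ((long long) b[2] - c[2])
         || overflows_int ((long long) b[3] + c[3]);
}
```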

[Bug middle-end/110754] assume create spurious load for volatile variable

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110754

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2023-07-20
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Keywords||wrong-code
  Component|c++ |middle-end

--- Comment #2 from Andrew Pinski  ---
(In reply to Xi Ruoyao from comment #1)
> Is this a bug?  The standard defines accessing volatile objects as
> side-effects so it's not allowed to merge volatile loads, AFAIU.

Yes because assume attribute is defined not to have any side effects.

Confirmed.

gimplifier produces:

  [[assume (D.2786)]]
{
  {
int n.0;

n.0 = n;
D.2786 = n.0 == 1;
  }
}

And then lowering produces:
  _2 = n;
  .ASSUME (_Z3bari._assume.0, _2);

But really it should have passed the address of n rather than the value since n
is volatile here .

gcc-11-20230720 is now available

2023-07-20 Thread GCC Administrator via Gcc

Snapshot gcc-11-20230720 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/11-20230720/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 11 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-11 revision 7bd1373f87d581b1e5482f9c558d481c38027a99

You'll find:

 gcc-11-20230720.tar.xz   Complete GCC

  SHA256=14996fb0a8aa45dec9031bffd42d9da553f4f79a5634c6c14a592472ed406c1a
  SHA1=9fde135a3cb84f7c30119cea689d2679e5046ee5

Diffs from 11-20230713 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-11
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.

[Bug c++/110754] assume create spurious load for volatile variable

2023-07-20 Thread xry111 at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110754

Xi Ruoyao  changed:

   What|Removed |Added

 CC||xry111 at gcc dot gnu.org

--- Comment #1 from Xi Ruoyao  ---
It happens w/o assume:

int bar(int p)
{
    volatile int n = p;
    if (1 == n)
        __builtin_unreachable();
    return 1 + n;
}

Is this a bug?  The standard defines accessing volatile objects as side-effects
so it's not allowed to merge volatile loads, AFAIU.

[Bug libfortran/110759] [14 Regression] IEEE Fortran change broke RISC-V linux build

2023-07-20 Thread patrick at rivosinc dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110759

--- Comment #3 from Patrick O'Neill  ---
It may have broken other targets - I can only confirm with builds for RISCV so
I didn't want to speculate too much

[Bug libfortran/110759] [14 Regression] IEEE Fortran change broke RISC-V linux build

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110759

--- Comment #2 from Andrew Pinski  ---
I don't see how it could have broken riscv only ...

[Bug libfortran/110759] [14 Regression] IEEE Fortran change broke RISC-V linux build

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110759

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
Summary|IEEE Fortran change broke   |[14 Regression] IEEE
   |RISC-V linux build  |Fortran change broke RISC-V
   ||linux build
   Keywords||build

[Bug libfortran/110759] IEEE Fortran change broke RISC-V linux build

2023-07-20 Thread patrick at rivosinc dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110759

--- Comment #1 from Patrick O'Neill  ---
Created attachment 55593
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55593&action=edit
Failing build log

[Bug libfortran/110759] New: IEEE Fortran change broke RISC-V linux build

2023-07-20 Thread patrick at rivosinc dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110759

Bug ID: 110759
   Summary: IEEE Fortran change broke RISC-V linux build
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libfortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: patrick at rivosinc dot com
  Target Milestone: ---

Using https://github.com/riscv-collab/riscv-gnu-toolchain with tip-of-tree GCC
fails to build linux target.

Failures:

2023-07-20T14:14:06.1566362Z ../../../../gcc/libgfortran/ieee/ieee_arithmetic.F90:563:617:
2023-07-20T14:14:06.1566776Z
2023-07-20T14:14:06.1567818Z   563 |   IEEE_COMPARISON(QUIET,EQ)
2023-07-20T14:14:06.1568423Z       |   1
2023-07-20T14:14:06.1569478Z Error: Syntax error in PROCEDURE statement at (1)
2023-07-20T14:14:06.1576233Z ../../../../gcc/libgfortran/ieee/ieee_arithmetic.F90:564:617:
2023-07-20T14:14:06.1576587Z
2023-07-20T14:14:06.1577039Z   564 |   IEEE_COMPARISON(QUIET,GE)
2023-07-20T14:14:06.1577606Z

Appears to be caused by
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=dca2874897ec58ea1c22a9c2161f112fff07cfb2

Affects rv32gc and rv64gc

Known failure: c5bd0e5870aed178b7f82e7b94f59a383e7c5b4f
Known success: 49bed11d96cf727de7e6ed35f065a4df29f6c589

[Bug target/110758] [14 Regression] 8% hmmer regression on zen1/3 with -Ofast -march=native -flto between g:8377cf1bf41a0a9d (2023-07-05 01:46) and g:3a61ca1b9256535e (2023-07-06 16:56); g:d76d19c9bc5

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110758

--- Comment #1 from Andrew Pinski  ---
I suspect this is most likely the profile updates changes ...

[Bug target/110758] [14 Regression] 8% hmmer regression on zen1/3 with -Ofast -march=native -flto between g:8377cf1bf41a0a9d (2023-07-05 01:46) and g:3a61ca1b9256535e (2023-07-06 16:56); g:d76d19c9bc5

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110758

Andrew Pinski  changed:

   What|Removed |Added

 Blocks||26163
Version|13.1.0  |14.0
   Keywords||missed-optimization
   Target Milestone|--- |14.0
Summary|8% hmmer regression on zen1 |[14 Regression] 8% hmmer
   |and zen3 with -Ofast|regression on zen1/3 with
   |-march=native -flto between |-Ofast -march=native -flto
   |g:8377cf1bf41a0a9d  |between g:8377cf1bf41a0a9d
   |(2023-07-05 01:46) and  |(2023-07-05 01:46) and
   |g:3a61ca1b9256535e  |g:3a61ca1b9256535e
   |(2023-07-06 16:56) and  |(2023-07-06 16:56);
   |g:d76d19c9bc5ef113  |g:d76d19c9bc5ef113
   |(2023-07-16 00:16) and  |(2023-07-16 00:16) and
   |g:a5088dc3f5ef73c8  |g:a5088dc3f5ef73c8
   |(2023-07-17 03:24)  |(2023-07-17 03:24)


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA

2023-07-20 Thread juzhe.zhong at rivai dot ai via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751

--- Comment #15 from JuzheZhong  ---
I am wondering: do we have other situations that need an "undef" value to do
optimizations? If yes, I agree with Richard that we need to support an "undef"
value.  But "undef" value support in Gimple IR would be long-term work since
it is not an easy job. For example, in LLVM, undef + a -> undef, but
undef & a -> 0.
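
The asymmetry between the two folds can be checked by brute force on 8-bit values. This is only a sketch that models "undef" as "any value the optimizer is free to pick"; the function names are hypothetical. For a fixed a, u + a takes every possible value as u varies, so nothing more precise than undef can be said, whereas u & a can always be 0 (pick u = 0), so folding it to 0 is a valid choice.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of distinct 8-bit results of (u + a) over all 256 values of u.  */
int count_add_results(uint8_t a)
{
    bool seen[256] = { false };
    int n = 0;
    for (int u = 0; u < 256; u++) {
        uint8_t r = (uint8_t)(u + a);
        if (!seen[r]) {
            seen[r] = true;
            n++;
        }
    }
    return n;
}

/* True if 0 is among the possible results of (u & a) for some u.  */
bool and_can_be_zero(uint8_t a)
{
    for (int u = 0; u < 256; u++)
        if ((uint8_t)(u & a) == 0)
            return true;
    return false;
}
```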

[Bug target/110758] New: 8% hmmer regression on zen1 and zen3 with -Ofast -march=native -flto between g:8377cf1bf41a0a9d (2023-07-05 01:46) and g:3a61ca1b9256535e (2023-07-06 16:56) and g:d76d19c9bc5e

2023-07-20 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110758

Bug ID: 110758
   Summary: 8% hmmer regression on zen1 and zen3 with -Ofast
-march=native -flto between g:8377cf1bf41a0a9d
(2023-07-05 01:46) and g:3a61ca1b9256535e (2023-07-06
16:56) and g:d76d19c9bc5ef113 (2023-07-16 00:16) and
g:a5088dc3f5ef73c8 (2023-07-17 03:24)
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=476.180.0
The earlier jump looks like a random code layout change.
The later jump is also seen with PGO
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=474.180.0
and -O2
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=469.180.0

zen1 machine
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=289.180.0

Re: [PATCH v3] Implement new RTL optimizations pass: fold-mem-offsets.

2023-07-20 Thread Jeff Law via Gcc-patches





On 7/20/23 00:18, Vineet Gupta wrote:



On 7/18/23 21:31, Jeff Law via Gcc-patches wrote:


In a run with -fno-fold-mem-offsets, the same insn 93 is successfully 
grok'ed by cprop_hardreg,


| (insn 93 337 522 11 (set (mem/c:DF (plus:DI (reg/f:DI 2 sp)
|    (const_int 8 [0x8])) [4 %sfp+-8 S8 A64])
|    (const_double:DF 0.0 [0x0.0p+0])) "sff.i":23:11 190 
{*movdf_hardfloat_rv64}

^^^
| (expr_list:REG_EQUAL (const_double:DF 0.0 [0x0.0p+0])
|    (nil)))

P.S. I wonder if it is a good idea in general to call recog() post 
reload since the insn could be changed sufficiently to no longer 
match the md patterns. Of course I don't know the answer.

If this ever causes a problem, it's a backend bug.  It's that simple.

Conceptually it should always be safe to set INSN_CODE to -1 for any 
insn.


Sure, the -1 should be handled, but are you implying that f-m-o will
always generate a valid combination and that a recog() failure is simply a bug
in the backend and/or f-m-o? If not, the -1 setting can potentially trigger
an ICE in the future.
A recog failure after setting INSN_CODE to -1 would always be an 
indicator of a target bug at the point where f-m-o runs.


In that would be generally true as well.  There are some very obscure 
exceptions and those exceptions are for narrow periods of time.








Odds are for this specific case in the RV backend, we just need a 
constraint to store 0.0 into a memory location.  That can actually be 
implemented as a store from x0 since 0.0 has the bit pattern 0x0. This 
is probably a good thing to expose anyway as an optimization and can 
move forward independently of the f-m-o patch.
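
The bit-pattern claim is easy to confirm in C. The helper below is a hypothetical illustration, assuming an IEEE 754 binary64 `double` that is 8 bytes wide (as on RV64); it copies the object representation into an integer so it can be compared with zero.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw object representation of a double as a 64-bit integer.
   Assumes sizeof(double) == 8.  */
uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

`double_bits(0.0)` is 0, which is why a store from x0 suffices. Note that this holds for positive zero only: `double_bits(-0.0)` has the sign bit set.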


I call dibs on this :-) Seems like an interesting little side project.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110748

It's yours  :-)

jeff

[Bug middle-end/110757] [14 Regression] 7% parest regression on zen3 -Ofast -march=native -flto between g:4dbb3af1efe55174 (2023-07-14 00:54) and g:a5088dc3f5ef73c8 (2023-07-17 03:24)

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110757

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||missed-optimization
Summary|7% parest regression on |[14 Regression] 7% parest
   |zen3 -Ofast -march=native   |regression on zen3 -Ofast
   |-flto between   |-march=native -flto between
   |g:4dbb3af1efe55174  |g:4dbb3af1efe55174
   |(2023-07-14 00:54) and  |(2023-07-14 00:54) and
   |g:a5088dc3f5ef73c8  |g:a5088dc3f5ef73c8
   |(2023-07-17 03:24)  |(2023-07-17 03:24)
   Target Milestone|--- |14.0
 CC||pinskia at gcc dot gnu.org
Version|13.1.0  |14.0

Re: [PATCH] c++: fix ICE with is_really_empty_class [PR110106]

2023-07-20 Thread Marek Polacek via Gcc-patches

On Thu, Jul 20, 2023 at 03:51:32PM -0400, Marek Polacek wrote:
> On Thu, Jul 20, 2023 at 02:37:07PM -0400, Jason Merrill wrote:
> > On 7/20/23 14:13, Marek Polacek wrote:
> > > On Wed, Jul 19, 2023 at 10:11:27AM -0400, Patrick Palka wrote:
> > > > On Tue, 18 Jul 2023, Marek Polacek via Gcc-patches wrote:
> > > > 
> > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk and 
> > > > > branches?
> > > > 
> > > > Looks reasonable to me.
> > > 
> > > Thanks.
> > > > Though I wonder if we could also fix this by not checking potentiality
> > > > at all in this case?  The problematic call to
> > > > is_rvalue_constant_expression happens from
> > > > cp_parser_constant_expression with 'allow_non_constant' != 0
> > > > and with 'non_constant_p' being a dummy out argument that comes from
> > > > cp_parser_functional_cast, so the result of
> > > > is_rvalue_constant_expression is effectively unused in this case,
> > > > and we should be able to safely elide it when
> > > > 'allow_non_constant && non_constant_p == nullptr'.
> > > 
> > > Sounds plausible.  I think my patch could be applied first since it
> > > removes a tiny bit of code, then I can hopefully remove the flag below,
> > > then maybe go back and optimize the call to is_rvalue_constant_expression.
> > > Does that sound sensible?
> > > 
> > > > Relatedly, ISTM the member cp_parser::non_integral_constant_expression_p
> > > > is also effectively unused and could be removed?
> > > 
> > > It looks that way.  Seems it's only used in cp_parser_constant_expression:
> > > 10806   if (allow_non_constant_p)
> > > 10807 *non_constant_p = parser->non_integral_constant_expression_p;
> > > but that could be easily replaced by a local var.  I'd be happy to see if
> > > we can actually do away with it.  (I wonder why it was introduced and when
> > > it actually stopped being useful.)
> > 
> > It was for the C++98 notion of constant-expression, which was more of a
> > parser-level notion, and has been supplanted by the C++11 version.  I'm
> > happy to remove it, and therefore remove the is_rvalue_constant_expression
> > call.
> 
> Wonderful.  I'll do that next.

I found a use of parser->non_integral_constant_expression_p:
finish_id_expression_1 can set it to true which then makes
a difference in cp_parser_constant_expression in C++98.  In
cp_parser_constant_expression we set n_i_c_e_p to false, call
cp_parser_assignment_expression in which finish_id_expression_1
sets n_i_c_e_p to true, then back in cp_parser_constant_expression
we skip the cxx11 block, and set *non_constant_p to true.  If I
remove n_i_c_e_p, we lose that.  This can be seen in init/array60.C.

Marek

[Bug middle-end/110757] New: 7% parest regression on zen3 -Ofast -march=native -flto between g:4dbb3af1efe55174 (2023-07-14 00:54) and g:a5088dc3f5ef73c8 (2023-07-17 03:24)

2023-07-20 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110757

Bug ID: 110757
   Summary: 7% parest regression on zen3 -Ofast -march=native
-flto between g:4dbb3af1efe55174 (2023-07-14 00:54)
and g:a5088dc3f5ef73c8 (2023-07-17 03:24)
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

seems there are two commits producing this regression. Run in between is
d76d19c9bc5ef113 (2023-07-16 00:16)

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=475.457.0

There are earlier two jumps between g:52577a301ef1b86d (2023-05-30 02:20) and
g:d0c064c3eabc75cf (2023-05-31 16:46)
and between g:7ebd4a1d61993c0a (2023-04-28 07:23) and g:977a3be3ccbc7f17
(2023-05-01 13:40)

8% regression is also seen on zen1 machine:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=287.457.0

[Bug testsuite/110756] [14 Regression] commit g:92d1425ca78 causes failures in g++.dg/gomp/pr58567.C

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110756

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||testsuite-fail
  Component|c++ |testsuite
   Target Milestone|--- |14.0
   Last reconfirmed||2023-07-20
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed.  When I looked into this, I got the feeling it was a testsuite issue
in that the error message changed slightly.

Before that change we got:
```
:8:22: error: 'int' is not a class, struct, or union type
8 |   for (typename T::X i = 0; i < 100; ++i)  /* { dg-error "'int' is not
a class, struct, or union type|expected iteration declaration or
initialization" } */
  |  ^
:8:22: error: 'int' is not a class, struct, or union type
:8:22: error: 'int' is not a class, struct, or union type
:8:3: error: expected iteration declaration or initialization
8 |   for (typename T::X i = 0; i < 100; ++i)  /* { dg-error "'int' is not
a class, struct, or union type|expected iteration declaration or
initialization" } */
  |   ^~~
```

And afterwards:
```
:8:22: error: 'int' is not a class, struct, or union type
8 |   for (typename T::X i = 0; i < 100; ++i)  /* { dg-error "'int' is not
a class, struct, or union type|expected iteration declaration or
initialization" } */
  |  ^
:8:3: error: invalid type for iteration variable 'i'
8 |   for (typename T::X i = 0; i < 100; ++i)  /* { dg-error "'int' is not
a class, struct, or union type|expected iteration declaration or
initialization" } */
  |   ^~~
```

The new output seems like a better error message, really, and the first error
is no longer repeated.

☝ Buildbot (Sourceware): gccrust - retry lost connection update (retry) (master)

2023-07-20 Thread builder--- via Gcc-rust

A retry build has been detected on builder gccrust-gentoo-sparc while building 
gccrust.

Full details are available at:
https://builder.sourceware.org/buildbot/#builders/241/builds/863

Build state: retry lost connection update (retry)
Revision: (unknown)
Worker: gentoo-sparc-big
Build Reason: (unknown)
Blamelist: A. Wilcox, Abdul Rafey, Alan Modra, Aldy Hernandez, Alex Coplan,
Alexander Monakov, Alexandre Oliva, Alexandre Oliva, Allan McRae,
Andre Simoes Dias Vieira, Andre Vehreschild, Andre Vieira, Andrea Corallo,
Andreas Krebbel, Andreas Schwab, Andreas Schwab, Andrew Carlotti,
Andrew Carlotti, Andrew Jenner, Andrew MacLeod, Andrew Pinski, Andrew Pinski,
Andrew Stubbs, Anthony Green, Antoni Boucher, Ard Biesheuvel, Arjun Shankar,
Arnaud Charlet, Arsen Arsenovic, Arsen Arsenović, ArshErgon, Artem Klimov,
Arthur Cohen, Avinash Sonawane, Benno Evers, Benson Muite, Bernd Kuhls,
Bernhard Reutner-Fischer, Bernhard Reutner-Fischer, Bill Schmidt, Bill Seurer,
Björn Schäpers, Bob Duff, Boris Yakobowski, Bruce Korb, Bruno Haible,
Cedric Landet, Cesar Philippidis, Charalampos Mitrodimas,
Charles-François Natali, Chenghua Xu, Chenghua Xu, Christoph Müllner,
Christophe Lyon, Christophe Lyon, Chung-Ju Wu, Chung-Lin Tang, Claire Dross,
Claudiu Zissulescu, Claudiu Zissulescu, Clément Chigot, Clément Chigot,
CohenArthur, Costas Argyris, Cui,Lili, Cupertino Miranda, Dan Li,
Daniel Mercier, Dave, Dave Evans, David Edelsohn, David Faust, David Malcolm,
David Seifert, Detlef Vollmann, Dimitar Dimitrov, Dimitrij Mijoski,
Dimitrije Milosevic, Dimitrije Milošević, Dmitriy Anisimkov, Dongsheng Song,
Doug Rupp, Ed Catmur, Ed Schonberg, Ed Smith-Rowland, Emanuele Micheletti,
Eric Biggers, Eric Botcazou, Eric Botcazou, Eric Gallager, Etienne Servais,
Eugene Rozenfeld, Faisal Abbas <90.abbasfai...@gmail.com>, Faisal Abbas,
Fedor Rybin, Fei Gao, Flavio Cruz, Florian Weimer, Francois-Xavier Coudert,
Francois-Xavier Coudert, François Dumont, Frederik Harwath, Fritz Reese,
Frolov Daniil, GCC Administrator, Gaius Mulley, Gary Dismukes,
Georg-Johann Lay, Gerald Pfeifer, Ghjuvan Lacambre, Giuliano Belinassi,
Guillaume Gomez, Guillermo E. Martinez, H.J. Lu, Hafiz Abid Qadeer,
Hans-Peter Nilsson, Haochen Gui, Haochen Jiang, Harald Anlauf, Hongyu Wang,
Hu, Lin1, Iain Buclaw, Iain Sandoe, Ian Lance Taylor, Ilya Leoshkevich,
Immad Mir, Immad Mir, Indu Bhagat, Iskander Shakirzyanov,
Jakob Hasse <0xja...@users.noreply.github.com>, Jakub Dupak, Jakub Jelinek,
Jan Beulich, Jan Hubicka, Jan-Benedict Glaw, Jason Merrill, Javier Miranda,
Jeff Chapman II, Jeff Law, Jeff Law, Jeff Law, Jerry DeLisle, Jia-Wei Chen,
Jia-wei Chen, Jiakun Fan <120090...@link.cuhk.edu.cn>, Jiawei, Jin Ma,
Jinyang He, Jiufu Guo, Joao Azevedo, Joel Brobecker, Joel Holdsworth,
Joel Phillips, Joel Teichroeb, Joffrey Huguet, Johannes Kanig,
Johannes Kliemann, John David Anglin, Jonathan Grant, Jonathan Wakely,
Jonathan Yong <10wa...@gmail.com>, Jonny Grant, Jose E. Marchesi,
Joseph Myers, Josue Nava Bello, José Rui Faustino de Sousa, Ju-Zhe Zhong,
Julia Lapenko, Julian Brown, Julien Bortolussi, Junxian Zhu, Justin Squirek,
Juzhe-Zhong, Jørgen Kvalsvik, Keef Aragon, Kewen Lin, Kewen.Lin,
Kim Kuparinen, Kito Cheng, Kong Lingling, Kwok Cheung Yeung, Kyrylo Tkachov,
Kévin Le Gouguec, LIU Hao, Lewis Hyatt, Li Xu, Liaiss Merzougue, Liao Shihua,
LiaoShihua, Lili Cui, Lin Sinan, Lin Sinan, Liwei Xu, Lorenzo Salvadore,
Lulu Cheng, Lyra, M V V S Manoj Kumar, MAHAD, Maciej W. Rozycki,
Maciej W. Rozycki, Mahmoud Mohamed, Marc Nieper-Wißkirchen, Marc Poulhiès,
Marc Poulhiès, Marcel Vollweiler, Marco Falke, Marek Polacek, Mark Mentovai,
Mark Wielaard, Martin Jambor, Martin Liska, Martin Liška, Martin Sebor,
Martin Uecker, Matthew Jasper, Matthias Kretz, Max Filippov, Mayshao,
Meghan Denny, Michael Collison, Michael Eager, Michael Meissner, Mikael Morin,
Mikhail Ablakatov, Monk Chiang, Muhammad Mahad, Murray Steele, Nathan Sidwell,
Nathaniel Shead, Navid Rahimi, Nick Clifton, Nikos Alexandris, Nirmal Patel,
Olivier Hainque, Owen Avery, Palmer Dabbelt, Pan Li,
Parthib <94271200+parthib...@users.noreply.github.com>, Parthib, Pascal Obry,
Pat Haugen, Patrick Bernardi, Patrick Palka, Paul A. Clarke, Paul Thomas,
Paul-Antoine Arras, Pekka Seppänen, Peter Bergner, Peter Foley, Petter Tomner,
Philip Herron, Philip Herron, Philipp Fent, Philipp Tomsich,
Pierre-Emmanuel Patry, Pierre-Marie de Rodat, Piotr Trojanek, Prajwal S N,
Prathamesh Kulkarni, Przemyslaw Wirkus, Qian Jianhua, Qian Jianhua, Qing Zhao,
Quentin Ochem, Raiki Tamura, Rainer Orth, Rainer Orth, Ramana Radhakrishnan,
Ramana

[Bug c++/110756] New: [14 Regression] commit g:92d1425ca78 causes failures in g++.dg/gomp/pr58567.C

2023-07-20 Thread thiago.bauermann at linaro dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110756

Bug ID: 110756
   Summary: [14 Regression] commit g:92d1425ca78 causes failures
in g++.dg/gomp/pr58567.C
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: thiago.bauermann at linaro dot org
CC: ppalka at gcc dot gnu.org
  Target Milestone: ---

Our CI detected that commit g:92d1425ca780 "c++: redundant targ coercion for
var/alias tmpls" caused these testsuite failures:

Running g++:g++.dg/gomp/gomp.exp ...
FAIL: g++.dg/gomp/pr58567.C -std=c++14 (test for excess errors)
FAIL: g++.dg/gomp/pr58567.C -std=c++17 (test for excess errors)
FAIL: g++.dg/gomp/pr58567.C -std=c++20 (test for excess errors)
FAIL: g++.dg/gomp/pr58567.C -std=c++98 (test for excess errors)

I confirmed that the problem is still present in trunk as of commit
g:b50a851eef4b "i386: Double-word sign-extension missed-optimization
[PR110717]" from today.

I also confirmed that reverting the mentioned commit from trunk fixes the test
failures. Reproduced the problem on Ubuntu 22.04, on both aarch64-linux and
x86_64-linux.

The relevant part of g++.log is:

Executing on host:
/home/thiago.bauermann/.cache/builds/gcc-native/gcc/testsuite/g++/../../xg++
-B/home/thiago.bauermann/.cache/builds/gcc-native/gcc/testsuite/g++/../../
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/gomp/pr58567.C   
-fdiagnostics-plain-output  -nostdinc++
-I/home/thiago.bauermann/.cache/builds/gcc-native/aarch64-unknown-linux-gnu/libstdc++-v3/include/aarch64-unknown-linux-gnu
-I/home/thiago.bauermann/.cache/builds/gcc-native/aarch64-unknown-linux-gnu/libstdc++-v3/include
-I/home/thiago.bauermann/src/gcc/libstdc++-v3/libsupc++
-I/home/thiago.bauermann/src/gcc/libstdc++-v3/include/backward
-I/home/thiago.bauermann/src/gcc/libstdc++-v3/testsuite/util -fmessage-length=0
 -std=c++20 -fopenmp  -S -o pr58567.s(timeout = 300)
spawn -ignore SIGHUP
/home/thiago.bauermann/.cache/builds/gcc-native/gcc/testsuite/g++/../../xg++
-B/home/thiago.bauermann/.cache/builds/gcc-native/gcc/testsuite/g++/../../
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/gomp/pr58567.C
-fdiagnostics-plain-output -nostdinc++
-I/home/thiago.bauermann/.cache/builds/gcc-native/aarch64-unknown-linux-gnu/libstdc++-v3/include/aarch64-unknown-linux-gnu
-I/home/thiago.bauermann/.cache/builds/gcc-native/aarch64-unknown-linux-gnu/libstdc++-v3/include
-I/home/thiago.bauermann/src/gcc/libstdc++-v3/libsupc++
-I/home/thiago.bauermann/src/gcc/libstdc++-v3/include/backward
-I/home/thiago.bauermann/src/gcc/libstdc++-v3/testsuite/util -fmessage-length=0
-std=c++20 -fopenmp -S -o pr58567.s
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/gomp/pr58567.C: In
instantiation of 'void foo() [with T = int]':
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/gomp/pr58567.C:14:11:  
required from here
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/gomp/pr58567.C:8:22: error:
'int' is not a class, struct, or union type
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/gomp/pr58567.C:8:3: error:
invalid type for iteration variable 'i'
compiler exited with status 1
PASS: g++.dg/gomp/pr58567.C  -std=c++20  (test for errors, line 8)
FAIL: g++.dg/gomp/pr58567.C  -std=c++20 (test for excess errors)
Excess errors:
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/gomp/pr58567.C:8:3: error:
invalid type for iteration variable 'i'


Tested with:

$ ~/src/gcc/configure \
--disable-bootstrap \
--disable-multilib \
&& make -j 8 \
&& make -C gcc check-c++ RUNTESTFLAGS=g++.dg/gomp/gomp.exp

Re: semantics of uninitialized values in GIMPLE

2023-07-20 Thread Krister Walfridsson via Gcc


On Tue, 11 Jul 2023, Krister Walfridsson wrote:

On Tue, 11 Jul 2023, Richard Biener wrote:

I'll update my implementation, and will come back with a more detailed
proposal in a few weeks when I have tried some more things.


Thanks!  I've also taken the opportunity given by your work on the recent
bugs to propose a talk at this year's GNU Cauldron about undefined
behavior in GCC and hope to at least start on documenting the state of
art^WGCC in the internals manual for this.  If you have any pointers to
your work / research I'd be happy to point to it, learn from it (and maybe
steal a word or two ;)).


Nice!

No, I have not published anything since the original release of 'pysmtgcc'
last year -- I was planning to document it in detail, but found out that
nothing worked, and I did not really understand what I was doing... And
the Python prototype started to be annoying, so I threw it all away and wrote
a new tool in C++.

The new implementation now handles most of GIMPLE (but it still does not
handle loops or function calls). I also have support for checking that the
generated assembly has the same semantics as the optimized GIMPLE (only
partial support for RISC-V for now). I plan to publish a write-up of the
memory model soon in a series of blog posts -- I'll send you the link when
it is available.


I have now published a few blog posts about my work. You can find them at:
  https://github.com/kristerw/pysmtgcc
I'll publish the remaining blog posts next week.

   /Krister

Re: [PATCH] tree-optimization/110742 - fix latent issue with permuting existing vectors

2023-07-20 Thread Jeff Law via Gcc-patches





On 7/20/23 06:38, Richard Biener via Gcc-patches wrote:

When we materialize a layout we push edge permutes to constant/external
defs without checking we can actually do so.  For externals defined
by vector stmts rather than scalar components we can't.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

PR tree-optimization/110742
* tree-vect-slp.cc (vect_optimize_slp_pass::get_result_with_layout):
Do not materialize an edge permutation in an external node with
vector defs.
(vect_slp_analyze_node_operations_1): Guard purely internal
nodes better.

* g++.dg/torture/pr110742.C: New testcase.

OK
jeff

Re: [PATCH-1, combine] Don't widen shift mode when target has rotate/mask instruction on original mode [PR93738]

2023-07-20 Thread Jeff Law via Gcc-patches





On 7/18/23 21:06, HAO CHEN GUI via Gcc-patches wrote:

Hi,
   The shift mode will be widened in the combine pass if the operand has a
normal subreg. But when the target already has rotate/mask/insert instructions
on the narrow mode, it's unnecessary to widen the mode for lshiftrt. As
the lshiftrt is commonly converted to a rotate/mask insn, the widened mode
blocks it from being further combined into a rotate/mask/insert insn. PR93738
shows the case.

The lshiftrt:SI (subreg:SI (reg:DI)) is converted to
subreg:SI (lshiftrt:DI (reg:DI)) and fails to match rotate/mask pattern.

Trying 13, 10 -> 14:
13: r127:SI=r125:SI&0xfffffffffffff0ff
   REG_DEAD r125:SI
10: r124:SI=r129:DI#4 0>>0xc&0xf00
   REG_DEAD r129:DI
14: r128:SI=r127:SI|r124:SI

Failed to match this instruction:
(set (reg:SI 128)
 (ior:SI (and:SI (reg:SI 125 [+-2 ])
 (const_int -3841 [0xfffffffffffff0ff]))
 (and:SI (subreg:SI (zero_extract:DI (reg:DI 129)
 (const_int 32 [0x20])
 (const_int 20 [0x14])) 4)
 (const_int 3840 [0xf00]))))
Failed to match this instruction:
(set (reg:SI 128)
 (ior:SI (and:SI (reg:SI 125 [+-2 ])
 (const_int -3841 [0xfffffffffffff0ff]))
 (and:SI (subreg:SI (and:DI (lshiftrt:DI (reg:DI 129)
 (const_int 12 [0xc]))
 (const_int 4294967295 [0xffffffff])) 4)
 (const_int 3840 [0xf00]))))

If the shift mode is not widened, it can be combined into a rotate/mask/insert
insn as expected.

Trying 13, 10 -> 14:
13: r127:SI=r125:SI&0xfffffffffffff0ff
   REG_DEAD r125:SI
10: r124:SI=r129:DI#4 0>>0xc&0xf00
   REG_DEAD r129:DI
14: r128:SI=r127:SI|r124:SI
   REG_DEAD r127:SI
   REG_DEAD r124:SI
Successfully matched this instruction:
(set (reg:SI 128)
 (ior:SI (and:SI (reg:SI 125 [+-2 ])
 (const_int -3841 [0xfffffffffffff0ff]))
 (and:SI (lshiftrt:SI (subreg:SI (reg:DI 129) 4)
 (const_int 12 [0xc]))
 (const_int 3840 [0xf00]))))


   This patch adds a target hook to indicate whether rotate/mask instructions
are supported on a certain mode. If it returns true, widening the lshiftrt
mode is skipped and the shift is done in the original mode.
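
The two shapes of the shift in the dumps above can be modelled in C. This is a hypothetical sketch, not code from the patch; it assumes that `subreg:SI (reg:DI) 4` selects the low 32 bits of the 64-bit register (the big-endian word order used on powerpc64). Under the 0xf00 mask both shapes extract the same bits (bits 20..23 of the source), so only the pattern match differs, not the value.

```c
#include <stdint.h>

/* Narrow form: lshiftrt:SI applied to the low 32 bits of the DI register.  */
uint32_t field_narrow(uint64_t r129)
{
    return ((uint32_t)r129 >> 12) & 0xf00u;
}

/* Widened form: lshiftrt:DI first, then truncate to SI and mask.  */
uint32_t field_wide(uint64_t r129)
{
    return (uint32_t)(r129 >> 12) & 0xf00u;
}

/* Value computed by insn 14: clear bits 8..11 of r125, insert the field.  */
uint32_t insert_field(uint32_t r125, uint64_t r129)
{
    return (r125 & 0xfffff0ffu) | field_narrow(r129);
}
```

For example, with r129 = 0xdeadbeef12345678 both helpers return 0x300.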

   The patch fixes the regression of other rs6000 test cases. They're listed
in the second patch.

   The patch passed regression test on Power Linux and x86 platforms.

Thanks
Gui Haochen

ChangeLog
combine: Do not widen shift mode when target has rotate/mask instruction on
original mode

Widening the shift mode is unnecessary when the target already has rotate/mask
instructions on the original mode.  It might block further combine
optimization on the original mode, for instance combining the insns
into a rotate/mask/insert instruction on the original mode.

This patch adds a hook to indicate whether a target supports rotate/mask
instructions on a certain mode.  If it returns true, widening the shift
mode will be skipped for lshiftrt.

gcc/
PR target/93738
* combine.cc (try_widen_shift_mode): Skip to widen mode for lshiftrt
when the target has rotate/mask instructions on original mode.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_HAVE_ROTATE_AND_MASK): Add.
* target.def (have_rotate_and_mask): New target hook.
* targhooks.cc (default_have_rotate_and_mask): New function.
* targhooks.h (default_have_rotate_and_mask): Declare.
Wouldn't it make more sense to just try rotate/mask in the original mode 
before trying a shift in a widened mode?  I'm not sure why we need a 
target hook here.


jeff

Re: [PATCH v5 4/5] c++modules: report imported CMI files as dependencies

2023-07-20 Thread Nathan Sidwell via Gcc


On 7/19/23 20:47, Ben Boeckel wrote:

On Wed, Jul 19, 2023 at 17:11:08 -0400, Nathan Sidwell wrote:

GCC is neither of these descriptions.  a CMI does not contain the transitive
closure of its imports.  It contains an import table.  That table lists the
transitive closure of its imports (it needs that closure to do remapping), and
that table contains the CMI pathnames of the direct imports.  Those pathnames
are absolute, if the mapper provided an absolute path, or relative to the CMI
repo.

The rationale here is that if you're building a CMI, Foo, which imports a bunch
of modules, those imported CMIs will have the same (relative) location in this
compilation and in compilations importing Foo (why would you move them?) Note
this is NOT inhibiting relocatable builds, because of the CMI repo.


But it is inhibiting distributed builds because the distributing tool
would need to know:

- what CMIs are actually imported (here, "read the module mapper file"
   (in CMake's case, this is only the modules that are needed; a single
   massive mapper file for an entire project would have extra entries) or
   "act as a proxy for the socket/program specified" for other
   approaches);


This information is in the machine (& human) README section of the CMI.


- read the CMIs as it sends to the remote side to gather any other CMIs
   that may be needed (recursively);

Contrast this with the MSVC and Clang (17+) mechanism where the command
line contains everything that is needed and a single bolus can be sent.


um, the build system needs to create that command line? Where does the build 
system get that information?  IIUC it'll need to read some file(s) to do that.




And relocatable is probably fine. How does it interact with reproducible
builds? Or are GCC CMIs not really something anyone should consider for
installation (even as a "here, maybe this can help consumers"
mechanism)?


Module CMIs should be considered a cacheable artifact.  They are neither object 
files nor source files.





On 7/18/23 20:01, Ben Boeckel wrote:

Maybe I'm missing how this *actually* works in GCC as I've really only
interacted with it through the command line, but I've not needed to
mention `a.cmi` when compiling `use.cppm`. Is `a.cmi` referenced and
read through some embedded information in `b.cmi` or does `b.cmi`
include enough information to not need to read it at all? If the former,
distributed builds are going to have a problem knowing what files to
send just from the command line (I'll call this "implicit thin"). If the
latter, that is the "fat" CMI that I'm thinking of.


please don't use pejorative terms like 'fat' and 'thin'.


Sorry, I was internally analogizing to "thinLTO".

--Ben


--
Nathan Sidwell

Re: [PATCH v5 4/5] c++modules: report imported CMI files as dependencies

2023-07-20 Thread Nathan Sidwell via Gcc-patches


On 7/19/23 20:47, Ben Boeckel wrote:

On Wed, Jul 19, 2023 at 17:11:08 -0400, Nathan Sidwell wrote:

GCC is neither of these descriptions.  a CMI does not contain the transitive
closure of its imports.  It contains an import table.  That table lists the
transitive closure of its imports (it needs that closure to do remapping), and
that table contains the CMI pathnames of the direct imports.  Those pathnames
are absolute, if the mapper provided an absolute path, or relative to the CMI
repo.

The rationale here is that if you're building a CMI, Foo, which imports a bunch
of modules, those imported CMIs will have the same (relative) location in this
compilation and in compilations importing Foo (why would you move them?) Note
this is NOT inhibiting relocatable builds, because of the CMI repo.


But it is inhibiting distributed builds because the distributing tool
would need to know:

- what CMIs are actually imported (here, "read the module mapper file"
   (in CMake's case, this is only the modules that are needed; a single
   massive mapper file for an entire project would have extra entries) or
   "act as a proxy for the socket/program specified" for other
   approaches);


This information is in the machine (& human) README section of the CMI.


- read the CMIs as it sends to the remote side to gather any other CMIs
   that may be needed (recursively);

Contrast this with the MSVC and Clang (17+) mechanism where the command
line contains everything that is needed and a single bolus can be sent.


um, the build system needs to create that command line? Where does the build 
system get that information?  IIUC it'll need to read some file(s) to do that.




And relocatable is probably fine. How does it interact with reproducible
builds? Or are GCC CMIs not really something anyone should consider for
installation (even as a "here, maybe this can help consumers"
mechanism)?


Module CMIs should be considered a cacheable artifact.  They are neither object 
files nor source files.





On 7/18/23 20:01, Ben Boeckel wrote:

Maybe I'm missing how this *actually* works in GCC as I've really only
interacted with it through the command line, but I've not needed to
mention `a.cmi` when compiling `use.cppm`. Is `a.cmi` referenced and
read through some embedded information in `b.cmi` or does `b.cmi`
include enough information to not need to read it at all? If the former,
distributed builds are going to have a problem knowing what files to
send just from the command line (I'll call this "implicit thin"). If the
latter, that is the "fat" CMI that I'm thinking of.


please don't use pejorative terms like 'fat' and 'thin'.


Sorry, I was internally analogizing to "thinLTO".

--Ben


--
Nathan Sidwell

[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

--- Comment #9 from Andrew Pinski  ---
One thing I noticed is that:
  _2 = MAX_EXPR <_6, a3_7(D)>;
  _3 = MAX_EXPR <_2, a3_7(D)>;

Is not optimized at all.

(for minmax (min max)
 (simplify
  (minmax:c (minmax:c@2 @0 @1) @0)
  @2))
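
The identity behind the pattern quoted above, MAX (MAX (a, b), a) == MAX (a, b), can be checked with a small sketch. The function names here are made up for illustration; the code only demonstrates the arithmetic fact for `int` and says nothing about how match.pd applies the rule.

```c
#include <assert.h>

static int max_int(int a, int b) { return a > b ? a : b; }

/* The unsimplified shape: the outer MAX repeats an operand of the inner.  */
static int nested_max(int a, int b) { return max_int(max_int(a, b), a); }

/* Check the identity for every pair in [lo, hi]; returns 1 if it holds.  */
static int identity_holds(int lo, int hi)
{
    assert(lo <= hi);
    for (int a = lo; a <= hi; a++)
        for (int b = lo; b <= hi; b++)
            if (nested_max(a, b) != max_int(a, b))
                return 0;
    return 1;
}
```

The same argument holds with MIN in place of MAX, which is why the pattern iterates over both operators.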

Re: [pushed][LRA]: Check and update frame to stack pointer elimination after stack slot allocation

2023-07-20 Thread Rainer Orth

Hi Vladimir,

> The following patch is necessary for porting avr to LRA.
>
> The patch was successfully bootstrapped and tested on x86-64, aarch64, and
> ppc64le.
>
> There is still an avr porting problem with reloading of subreg of frame
> pointer.  I'll address it later on this week.

this patch most likely broke sparc-sun-solaris2.11 bootstrap:

/var/gcc/regression/master/11.4-gcc/build/./gcc/xgcc 
-B/var/gcc/regression/master/11.4-gcc/build/./gcc/ 
-B/vol/gcc/sparc-sun-solaris2.11/bin/ -B/vol/gcc/sparc-sun-solaris2.11/lib/ 
-isystem /vol/gcc/sparc-sun-solaris2.11/include -isystem 
/vol/gcc/sparc-sun-solaris2.11/sys-include   -fchecking=1 -c -g -O2   -W -Wall 
-gnatpg -nostdinc   g-alleve.adb -o g-alleve.o
+===GNAT BUG DETECTED==+ 
| 14.0.0 20230720 (experimental) [master 
506f068e7d01ad2fb107185b8fb204a0ec23785c] (sparc-sun-solaris2.11) GCC error:|
| in update_reg_eliminate, at lra-eliminations.cc:1179 |
| Error detected around g-alleve.adb:4132:8

This is in stage 3.  I haven't investigated further yet.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

[Bug tree-optimization/110755] [13/14 Regression] Wrong optimization of fabs on ppc64el at -O1

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110755

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||needs-bisection, wrong-code
Summary|Wrong optimization of fabs  |[13/14 Regression] Wrong
   |on ppc64el at -O1   |optimization of fabs on
   ||ppc64el at -O1
   Host|powerpc64le-unknown-linux-g |
   |nu  |
   Last reconfirmed||2023-07-20
 Target|powerpc64le-unknown-linux-g |
   |nu  |
  Build|powerpc64le-unknown-linux-g |
   |nu  |
 Status|UNCONFIRMED |NEW
   Target Milestone|--- |13.2
 Ever confirmed|0   |1
  Component|target  |tree-optimization

--- Comment #3 from Andrew Pinski  ---
This is not just a bug on powerpc either.


Folding statement: r_6 = ABS_EXPR <r_5>;
Applying pattern match.pd:1653, gimple-match.cc:35557
gimple_simplified to r_6 = r_5;

[Bug target/110755] Wrong optimization of fabs on ppc64el at -O1

2023-07-20 Thread aurelien at aurel32 dot net via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110755

--- Comment #2 from Aurelien Jarno  ---
(In reply to Andrew Pinski from comment #1)
> Hmm, this might be the case where you need -frounding-math since we don't
> expectly implement the pragma.

Indeed the original glibc code is compiled with -frounding-math. However adding
it or not doesn't change the resulting code.

[Bug target/110755] Wrong optimization of fabs on ppc64el at -O1

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110755

--- Comment #1 from Andrew Pinski  ---
Hmm, this might be the case where you need -frounding-math since we don't
expectly implement the pragma.

[Bug tree-optimization/91425] Ordered compares aren't optimised

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91425

--- Comment #9 from Andrew Pinski  ---
The biggest issue here is that ifcombine (and reassociate) and phiopt do not
touch the case where a possibly trapping operator would have to be moved and
combined with a previous operator. This could/should be improved.

[Bug target/110755] New: Wrong optimization of fabs on ppc64el at -O1

2023-07-20 Thread aurelien at aurel32 dot net via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110755

Bug ID: 110755
   Summary: Wrong optimization of fabs on ppc64el at -O1
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: aurelien at aurel32 dot net
  Target Milestone: ---

The following code, extracted and simplified from the PowerPC implementation of
nearbyintf() in the GNU libc is wrongly optimized at -O1 and -O2 on ppc64el
with GCC 13. The fabs is removed, which is not the case with GCC 12.

#include <fenv.h>
#include <math.h>
#include <stdio.h>

static float my_nearbyintf (float x)
{
  float r = x;

  if (x > 0.0)
    {
      r += 0x1p+23;
      r -= 0x1p+23;
      r = fabs (r);
    }

  return r;
}

int main()
{
  volatile float in = 0.5f;

  fesetround (FE_DOWNWARD);
  printf("mynearbyintf(in) = %lf\n", my_nearbyintf(in));
}

This causes the result to have the wrong sign.
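
For background on why the sign can differ at all: under IEEE 754, an exact cancellation such as x - x yields -0.0 when rounding toward negative infinity, and -0.0 still compares greater than or equal to 0.0. The sketch below does not reproduce the miscompilation; it only shows that a value which compares >= 0 can carry a set sign bit, so dropping the fabs is observable. The helper names are hypothetical and the code assumes `float` is a 32-bit IEEE 754 binary32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if the sign bit of X is set (assumes 32-bit binary32 float).  */
static int sign_bit_set(float x)
{
    uint32_t bits;

    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);
    return (bits >> 31) != 0;
}

/* Clear the sign bit, which is the effect fabs has on a float.  */
static float clear_sign(float x)
{
    uint32_t bits;

    memcpy(&bits, &x, sizeof bits);
    bits &= UINT32_C(0x7fffffff);
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Inspecting the bits directly avoids relying on comparisons, which cannot tell the two zeros apart.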

Re: Beginner Looking for Mentors to Get Started

2023-07-20 Thread David Edelsohn via Gcc

Hi, Veera

Thanks for your interest in GCC.  Welcome!

A good place to start is the GCC Wiki Getting Started page:
https://gcc.gnu.org/wiki/#Getting_Started_with_GCC_Development

David Malcolm has written a wonderful introduction to GCC for newcomers:
https://gcc-newbies-guide.readthedocs.io/en/latest/

and browse other recent answers to similar questions in the archives
of this mailing list.

Thanks, David

On Thu, Jul 20, 2023 at 2:37 PM Veera Sivarajan via Gcc  wrote:
>
> Hi,
>
> I'm a college student interested in contributing to gcc. I'm passionate
> about compilers and have worked on a few personal projects using Rust, C++
> and C. Please let me know if anyone is interested in mentoring me on where
> I can get started?
>
> Thanks,
> Veera

Re: [PATCH] c++: fix ICE with is_really_empty_class [PR110106]

2023-07-20 Thread Marek Polacek via Gcc-patches

On Thu, Jul 20, 2023 at 02:37:07PM -0400, Jason Merrill wrote:
> On 7/20/23 14:13, Marek Polacek wrote:
> > On Wed, Jul 19, 2023 at 10:11:27AM -0400, Patrick Palka wrote:
> > > On Tue, 18 Jul 2023, Marek Polacek via Gcc-patches wrote:
> > > 
> > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk and 
> > > > branches?
> > > 
> > > Looks reasonable to me.
> > 
> > Thanks.
> > > Though I wonder if we could also fix this by not checking potentiality
> > > at all in this case?  The problematic call to 
> > > is_rvalue_constant_expression
> > > happens from cp_parser_constant_expression with 'allow_non_constant' != 0
> > > and with 'non_constant_p' being a dummy out argument that comes from
> > > cp_parser_functional_cast, so the result of is_rvalue_constant_expression
> > > is effectively unused in this case, and we should be able to safely elide
> > > it when 'allow_non_constant && non_constant_p == nullptr'.
> > 
> > Sounds plausible.  I think my patch could be applied first since it
> > removes a tiny bit of code, then I can hopefully remove the flag below,
> > then maybe go back and optimize the call to is_rvalue_constant_expression.
> > Does that sound sensible?
> > 
> > > Relatedly, ISTM the member cp_parser::non_integral_constant_expression_p
> > > is also effectively unused and could be removed?
> > 
> > It looks that way.  Seems it's only used in cp_parser_constant_expression:
> > 10806   if (allow_non_constant_p)
> > 10807 *non_constant_p = parser->non_integral_constant_expression_p;
> > but that could be easily replaced by a local var.  I'd be happy to see if
> > we can actually do away with it.  (I wonder why it was introduced and when
> > it actually stopped being useful.)
> 
> It was for the C++98 notion of constant-expression, which was more of a
> parser-level notion, and has been supplanted by the C++11 version.  I'm
> happy to remove it, and therefore remove the is_rvalue_constant_expression
> call.

Wonderful.  I'll do that next.
 
> > > > -- >8 --
> > > > 
> > > > is_really_empty_class is liable to crash when it gets an incomplete
> > > > or dependent type.  Since r11-557, we pass the yet-uninstantiated
> > > > class type S<0> of the PARM_DECL s to is_really_empty_class -- because
> > > > of the potential_rvalue_constant_expression -> 
> > > > is_rvalue_constant_expression
> > > > change in cp_parser_constant_expression.  Here we're not parsing
> > > > a template so we did not check COMPLETE_TYPE_P as we should.
> > > > 
> > > > PR c++/110106
> > > > 
> > > > gcc/cp/ChangeLog:
> > > > 
> > > > * constexpr.cc (potential_constant_expression_1): Check 
> > > > COMPLETE_TYPE_P
> > > > even when !processing_template_decl.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > * g++.dg/cpp0x/noexcept80.C: New test.
> > > > ---
> > > >   gcc/cp/constexpr.cc |  2 +-
> > > >   gcc/testsuite/g++.dg/cpp0x/noexcept80.C | 12 
> > > >   2 files changed, 13 insertions(+), 1 deletion(-)
> > > >   create mode 100644 gcc/testsuite/g++.dg/cpp0x/noexcept80.C
> > > > 
> > > > diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
> > > > index 6e8f1c2b61e..1f59c5472fb 100644
> > > > --- a/gcc/cp/constexpr.cc
> > > > +++ b/gcc/cp/constexpr.cc
> > > > @@ -9116,7 +9116,7 @@ potential_constant_expression_1 (tree t, bool 
> > > > want_rval, bool strict, bool now,
> > > > if (now && want_rval)
> > > > {
> > > >   tree type = TREE_TYPE (t);
> > > > - if ((processing_template_decl && !COMPLETE_TYPE_P (type))
> > > > + if (!COMPLETE_TYPE_P (type)
> > > >   || dependent_type_p (type)
> 
> There shouldn't be a problem completing the type here, so it seems to me
> that we're missing a call to complete_type_p, at least when
> !processing_template_decl.  Probably need to move the dependent_type_p check
> up as a result.

Like so?

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
is_really_empty_class is liable to crash when it gets an incomplete
or dependent type.  Since r11-557, we pass the yet-uninstantiated
class type S<0> of the PARM_DECL s to is_really_empty_class -- because
of the potential_rvalue_constant_expression -> is_rvalue_constant_expression
change in cp_parser_constant_expression.  Here we're not parsing
a template so we did not check COMPLETE_TYPE_P as we should.

It should work to complete the type before checking COMPLETE_TYPE_P.

PR c++/110106

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1): Try to complete the
type when !processing_template_decl.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept80.C: New test.
---
 gcc/cp/constexpr.cc |  5 +++--
 gcc/testsuite/g++.dg/cpp0x/noexcept80.C | 12 
 2 files changed, 15 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/noexcept80.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index

Re: LRA for avr: Clobber with match_dup

2023-07-20 Thread Vladimir Makarov via Gcc




On 7/20/23 07:17, senthilkumar.selva...@microchip.com wrote:

Hi,

   The avr backend has this define_insn_and_split

(define_insn_and_split "*tablejump_split"
   [(set (pc)
 (unspec:HI [(match_operand:HI 0 "register_operand" "!z,*r,z")]
UNSPEC_INDEX_JMP))
(use (label_ref (match_operand 1 "" "")))
(clobber (match_dup 0))
(clobber (const_int 0))]
   "!AVR_HAVE_EIJMP_EICALL"
   "#"
   "&& reload_completed"
   [(parallel [(set (pc)
(unspec:HI [(match_dup 0)]
   UNSPEC_INDEX_JMP))
   (use (label_ref (match_dup 1)))
   (clobber (match_dup 0))
   (clobber (const_int 0))
   (clobber (reg:CC REG_CC))])]
   ""
   [(set_attr "isa" "rjmp,rjmp,jmp")])

Note the (clobber (match_dup 0)). When building

$ avr-gcc -mmcu=avr51 gcc/gcc/testsuite/gcc.c-torture/compile/930120-1.c -O3 
-funroll-loops -fdump-rtl-all

The web pass transforms this insn from

(jump_insn 120 538 124 25 (parallel [
 (set (pc)
 (unspec:HI [
 (reg:HI 138)
 ] UNSPEC_INDEX_JMP))
 (use (label_ref 121))
 (clobber (reg:HI 138))
 (clobber (const_int 0 [0]))
 ]) "gcc/gcc/testsuite/gcc.c-torture/compile/930120-1.c":55:5 779 
{*tablejump_split}
  (expr_list:REG_DEAD (reg:HI 138)
 (expr_list:REG_UNUSED (reg:HI 138)
 (nil)))
  -> 121)

to

  Web oldreg=138 newreg=279
Updating insn 120 (138->279)

(jump_insn 120 538 124 25 (parallel [
 (set (pc)
 (unspec:HI [
 (reg:HI 138)
 ] UNSPEC_INDEX_JMP))
 (use (label_ref 121))
 (clobber (reg:HI 279))
 (clobber (const_int 0 [0]))
 ]) "gcc/gcc/testsuite/gcc.c-torture/compile/930120-1.c":55:5 779 
{*tablejump_split}
  (expr_list:REG_DEAD (reg:HI 138)
 (expr_list:REG_UNUSED (reg:HI 138)
 (nil)))
  -> 121)

Note the reg in the clobber is now 279, and not 138.

With classic reload, however, this gets set back to whatever hardreg was 
assigned to r138.
It is just a fortunate side effect of the algorithm by which the reload pass
changes pseudos to their hard registers.

(jump_insn 120 538 121 26 (parallel [
 (set (pc)
 (unspec:HI [
 (reg/f:HI 30 r30 [138])
 ] UNSPEC_INDEX_JMP))
 (use (label_ref 121))
 (clobber (reg/f:HI 30 r30 [138]))
 (clobber (const_int 0 [0]))
 ]) "gcc/gcc/testsuite/gcc.c-torture/compile/930120-1.c":55:5 779 
{*tablejump_split}
  (nil)
  -> 121)

With LRA, however, the pseudo reg remains unassigned, eventually causing an ICE 
in cselib_invalidate_regno.

(jump_insn 120 538 121 26 (parallel [
 (set (pc)
 (unspec:HI [
 (reg/f:HI 30 r30 [138])
 ] UNSPEC_INDEX_JMP))
 (use (label_ref 121))
 (clobber (reg:HI 279))
 (clobber (const_int 0 [0]))
 ]) "gcc/gcc/testsuite/gcc.c-torture/compile/930120-1.c":55:5 779 
{*tablejump_split}
  (nil)
  -> 121)

Is this something that LRA should be able to fix?


No, LRA cannot do this, but it keeps match_dup correct after any reloads
and insn transformations.


I think it is a web optimization bug.  RA assumes that insn recognition
should give the same insn code as is present in the insn.


In my opinion any optimization pass can assume this at the pass start
and should preserve this condition when its work finishes.

libgo patch committed: Don't collect package CGOLDFLAGS

2023-07-20 Thread Ian Lance Taylor via Gcc-patches

This libgo patch to the go command sources stops collecting package
CGOLDFLAGS when using gccgo.  The flags are already collected via
cmd/cgo.

The gccgo_link_c test is tweaked to do real linking as with this
change the cgo ldflags are not fully reflected in go build -n output,
since they now only come from the built archive.

This is a backport of https://go.dev/cl/497117 from the main repo.

This is for https://go.dev/issue/60287.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
d2437e29edbe2673867d0e965d6431aff5cec941
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index c44cdc2baac..83ab3e2d64c 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-92152c88ea8e2dd9e8c67e91bf4ae5e3edf1b506
+d04b024021bb7dbaa434a6d902bd12beb08e315f
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/go/cmd/go/internal/work/gccgo.go b/libgo/go/cmd/go/internal/work/gccgo.go
index 1e8250002ee..c1026c71e01 100644
--- a/libgo/go/cmd/go/internal/work/gccgo.go
+++ b/libgo/go/cmd/go/internal/work/gccgo.go
@@ -413,16 +413,9 @@ func (tools gccgoToolchain) link(b *Builder, root *Action, out, importcfg string
}
 
for _, a := range allactions {
-   // Gather CgoLDFLAGS, but not from standard packages.
-   // The go tool can dig up runtime/cgo from GOROOT and
-   // think that it should use its CgoLDFLAGS, but gccgo
-   // doesn't use runtime/cgo.
if a.Package == nil {
continue
}
-   if !a.Package.Standard {
-   cgoldflags = append(cgoldflags, a.Package.CgoLDFLAGS...)
-   }
if len(a.Package.CgoFiles) > 0 {
usesCgo = true
}
@@ -452,9 +445,6 @@ func (tools gccgoToolchain) link(b *Builder, root *Action, out, importcfg string
 
ldflags = append(ldflags, cgoldflags...)
ldflags = append(ldflags, envList("CGO_LDFLAGS", "")...)
-   if root.Package != nil {
-   ldflags = append(ldflags, root.Package.CgoLDFLAGS...)
-   }
if cfg.Goos != "aix" {
ldflags = str.StringList("-Wl,-(", ldflags, "-Wl,-)")
}
diff --git a/libgo/go/cmd/go/testdata/script/gccgo_link_c.txt b/libgo/go/cmd/go/testdata/script/gccgo_link_c.txt
index db2a29128b2..8d67ae2bc7e 100644
--- a/libgo/go/cmd/go/testdata/script/gccgo_link_c.txt
+++ b/libgo/go/cmd/go/testdata/script/gccgo_link_c.txt
@@ -4,8 +4,9 @@
 [!cgo] skip
 [!exec:gccgo] skip
 
-go build -n -compiler gccgo cgoref
+! go build -x -compiler gccgo cgoref
 stderr 'gccgo.*\-L [^ ]*alibpath \-lalib' # make sure that Go-inline "#cgo LDFLAGS:" ("-L alibpath -lalib") passed to gccgo linking stage
+! stderr 'gccgo.*-lalib.*-lalib' # make sure -lalib is only passed once
 
 -- cgoref/cgoref.go --
 package main

[Bug rtl-optimization/110717] Double-word sign-extension missed-optimization

2023-07-20 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110717

--- Comment #10 from Jakub Jelinek  ---
Though, grepping tmp-mddump.md files shows only x86 having ashlti3 and ashrti3
expanders.

[Bug rtl-optimization/110717] Double-word sign-extension missed-optimization

2023-07-20 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110717

Jakub Jelinek  changed:

   What|Removed |Added

 CC||krebbel at gcc dot gnu.org,
   ||law at gcc dot gnu.org,
   ||rsandifo at gcc dot gnu.org,
   ||segher at gcc dot gnu.org

--- Comment #9 from Jakub Jelinek  ---
Wonder how many important targets provide double-word shift patterns vs. ones
which expand it through generic code.
aarch64 looks quite small:
foo:
        extr    x1, x1, x0, 5
        asr     x1, x1, 59
        ret
powerpc probably could be improved:
foo:
        srwi    9,4,5
        mr      10,9
        rlwimi  4,9,5,0,31-5
        rlwimi  10,3,27,0,31-27
        srawi   3,10,27
        blr
ditto s390x:
foo:
        lg      %r1,0(%r3)
        lg      %r3,8(%r3)
        srlg    %r5,%r3,5
        lghi    %r0,31
        sllg    %r1,%r1,59
        ogr     %r1,%r5
        ngr     %r3,%r0
        sllg    %r5,%r5,5
        srag    %r1,%r1,59
        ogr     %r3,%r5
        stg     %r3,8(%r2)
        stg     %r1,0(%r2)
        br      %r14
ditto riscv64:
foo:
        srli    a5,a0,5
        slli    a1,a1,59
        or      a1,a5,a1
        slli    a5,a1,5
        andi    a0,a0,31
        or      a0,a5,a0
        srai    a1,a1,59
        ret
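
The source of `foo` is not shown in this comment. As a hedged, single-word analogue of the sign-extension idiom the bug title refers to, the following sketch sign-extends the low bits of a 64-bit value with a shift-left / arithmetic-shift-right pair. The name `sign_extend64` is hypothetical, and the double-word case in the listings above would use a type twice the register width.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low BITS bits of X, for 1 <= BITS <= 64.  */
static int64_t sign_extend64(uint64_t x, unsigned bits)
{
    unsigned shift;

    assert(bits >= 1 && bits <= 64);
    shift = 64u - bits;

    /* The left shift is done on the unsigned value to avoid signed
       overflow.  Converting the result to int64_t and right-shifting a
       negative value are implementation-defined in ISO C; GCC documents
       modulo conversion and an arithmetic (sign-propagating) shift.  */
    return (int64_t)(x << shift) >> shift;
}
```

For example, with a 5-bit field the value 0x1F maps to -1 and 0x0F stays 15.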

[Bug c++/110754] New: assume create spurious load for volatile variable

2023-07-20 Thread muecker at gwdg dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110754

Bug ID: 110754
   Summary: assume create spurious load for volatile variable
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: muecker at gwdg dot de
  Target Milestone: ---

With optimization, using assume with a volatile variable creates spurious loads:

int bar(int p)
{
volatile int n = p;
[[assume (1 == n)]];
return 1 + n;
}


bar(int):
mov DWORD PTR [rsp-4], edi
mov eax, DWORD PTR [rsp-4]
mov eax, DWORD PTR [rsp-4]
add eax, 1
ret

[Bug fortran/110658] MINVAL/MAXVAL and deferred-length character arrays

2023-07-20 Thread anlauf at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110658

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |11.5

--- Comment #7 from anlauf at gcc dot gnu.org ---
Fixed.

[Bug fortran/95947] PACK intrinsic returns blank strings when an allocatable character array with allocatable length is used

2023-07-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95947

--- Comment #11 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Harald Anlauf:

https://gcc.gnu.org/g:7bd1373f87d581b1e5482f9c558d481c38027a99

commit r11-10918-g7bd1373f87d581b1e5482f9c558d481c38027a99
Author: Harald Anlauf 
Date:   Sun Jul 16 22:17:27 2023 +0200

Fortran: intrinsics and deferred-length character arguments
[PR95947,PR110658]

gcc/fortran/ChangeLog:

PR fortran/95947
PR fortran/110658
* trans-expr.c (gfc_conv_procedure_call): For intrinsic procedures
whose result characteristics depends on the first argument and
which
can be of type character, the character length will not be
deferred.

gcc/testsuite/ChangeLog:

PR fortran/95947
PR fortran/110658
* gfortran.dg/deferred_character_37.f90: New test.

(cherry picked from commit 95ddd2659849a904509067ec3a2770135149a722)

[Bug fortran/110658] MINVAL/MAXVAL and deferred-length character arrays

2023-07-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110658

--- Comment #6 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Harald Anlauf:

https://gcc.gnu.org/g:7bd1373f87d581b1e5482f9c558d481c38027a99

commit r11-10918-g7bd1373f87d581b1e5482f9c558d481c38027a99
Author: Harald Anlauf 
Date:   Sun Jul 16 22:17:27 2023 +0200

Fortran: intrinsics and deferred-length character arguments
[PR95947,PR110658]

gcc/fortran/ChangeLog:

PR fortran/95947
PR fortran/110658
* trans-expr.c (gfc_conv_procedure_call): For intrinsic procedures
whose result characteristics depends on the first argument and
which
can be of type character, the character length will not be
deferred.

gcc/testsuite/ChangeLog:

PR fortran/95947
PR fortran/110658
* gfortran.dg/deferred_character_37.f90: New test.

(cherry picked from commit 95ddd2659849a904509067ec3a2770135149a722)
