[Bug tree-optimization/96565] Failure to optimize out VLA even though it is left unused

2020-08-26 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #12 from Richard Biener  ---
Fixed.

[Bug tree-optimization/96565] Failure to optimize out VLA even though it is left unused

2020-08-26 Thread cvs-commit at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565

--- Comment #11 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:989bc4ca2f2978baecff00f6d0532994b82897ef

commit r11-2899-g989bc4ca2f2978baecff00f6d0532994b82897ef
Author: Richard Biener 
Date:   Wed Aug 26 08:44:59 2020 +0200

tree-optimization/96565 - improve DSE with paths ending in noreturn

This improves DSE's stmt walking by not considering a DEF without
uses for further processing (and thus no longer giving up when there
are two paths to follow).

2020-08-26  Richard Biener  

PR tree-optimization/96565
* tree-ssa-dse.c (dse_classify_store): Remove defs with
no uses from further processing.

* gcc.dg/tree-ssa/ssa-dse-40.c: New testcase.
* gcc.dg/builtin-object-size-4.c: Adjust.
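
The effect described in the commit message can be illustrated with a toy model. This is a minimal sketch, not the code in tree-ssa-dse.c: the names toy_def, toy_classify and the TOY_* results are hypothetical, and the rule "more than one remaining def means give up" is a simplifying assumption about the walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a statement that defines virtual memory state.  */
struct toy_def {
    bool is_phi;     /* models gimple_code (def) == GIMPLE_PHI */
    int  vdef_uses;  /* models the number of uses of gimple_vdef (def) */
    bool kills_ref;  /* models stmt_kills_ref_p (def, ref) */
};

enum toy_result { TOY_DEAD, TOY_FOLLOW_ONE, TOY_GIVE_UP };

/* Drop defs that need no further processing, then decide how to go on.
   With prune_unused set, a non-PHI def whose vdef has no uses (for
   example a call to a noreturn function) ends its path and is dropped.  */
enum toy_result toy_classify(struct toy_def *defs, size_t n, bool prune_unused)
{
    assert(defs != NULL || n == 0);
    size_t i = 0;
    while (i < n) {
        bool path_ends = prune_unused
                         && !defs[i].is_phi
                         && defs[i].vdef_uses == 0;
        if (path_ends || defs[i].kills_ref)
            defs[i] = defs[--n];  /* unordered remove, do not advance i */
        else
            ++i;
    }
    if (n == 0)
        return TOY_DEAD;          /* every path ended or killed the store */
    if (n == 1)
        return TOY_FOLLOW_ONE;    /* a single path is left to walk */
    return TOY_GIVE_UP;           /* assumption: two paths means give up */
}
```

In this model the abort () call of the report is a def with zero uses and __builtin_stack_restore is a def with one use. Without pruning both remain and the walk gives up; with pruning only the stack restore path is left.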

[Bug tree-optimization/96565] Failure to optimize out VLA even though it is left unused

2020-08-26 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565

--- Comment #10 from Richard Biener  ---
Ah, no - error in my patch.

[Bug tree-optimization/96565] Failure to optimize out VLA even though it is left unused

2020-08-26 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565

--- Comment #9 from Richard Biener  ---
OK, well - but the fix exposes (IIRC I ran into this at some time in the past
already) that GIMPLE_RESX does not have virtual operands but it needs to
represent a use of global memory at least in the case it exits the current
function.  So with the fix which is simply

diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
index cc93f559286..6b2f64c0250 100644
--- a/gcc/tree-ssa-dse.c
+++ b/gcc/tree-ssa-dse.c
@@ -888,11 +888,16 @@ dse_classify_store (ao_ref *ref, gimple *stmt,
           gimple *def = defs[i];
           gimple *use_stmt;
           use_operand_p use_p;
+          /* If the path ends here we do not need to process it further.
+             This for example happens with calls to noreturn functions.  */
+          if (gimple_code (def) != GIMPLE_PHI
+              && has_zero_uses (gimple_vdef (def)))
+            defs.unordered_remove (i);
           /* If the path to check starts with a kill we do not need to
              process it further.
              ???  With byte tracking we need only kill the bytes currently
              live.  */
-          if (stmt_kills_ref_p (def, ref))
+          else if (stmt_kills_ref_p (def, ref))
             {
               if (by_clobber_p && !gimple_clobber_p (def))
                 *by_clobber_p = false;

I see

FAIL: g++.dg/eh/spec7.C  -std=gnu++98 execution test
FAIL: g++.dg/eh/spec7.C  -std=gnu++14 execution test
FAIL: g++.dg/eh/spec7.C  -std=gnu++17 execution test
FAIL: g++.dg/eh/spec7.C  -std=gnu++2a execution test
FAIL: gcc.target/i386/cleanup-1.c execution test
FAIL: gcc.target/i386/cleanup-2.c execution test

also (a testsuite issue)

FAIL: gcc.dg/builtin-object-size-4.c execution test

The RESX issue is possibly latent even without the above patch.

[Bug tree-optimization/96565] Failure to optimize out VLA even though it is left unused

2020-08-25 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565

Richard Biener  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
             Status|NEW                           |ASSIGNED

--- Comment #8 from Richard Biener  ---
OK, the case is quite simple to fix.

[Bug tree-optimization/96565] Failure to optimize out VLA even though it is left unused

2020-08-25 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565

--- Comment #7 from rguenther at suse dot de  ---
On Tue, 25 Aug 2020, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565
> 
> --- Comment #5 from Jakub Jelinek  ---
> Wouldn't many spots in the middle-end be upset about the gimple clobber having
> a variable length type (both the MEM_REF on the lhs and corresponding type on
> the CONSTRUCTOR)?

Well, it of course would need to be wrapped in a WITH_SIZE_EXPR (ick).
Maybe we can clobber the original VAR_DECL ...

> I think tree DCE should have everything it needs for the removals even without
> such CLOBBERs, except that it is harder but not impossible to find out which
> __builtin_stack_restore frees what.

I'm not sure we can do this, that is, treat __builtin_stack_restore as
"free".  What we can possibly do is avoid treating __builtin_stack_restore
as use (but we still need to consider it clobbering things).  That is,
fixing ref_maybe_used_by_stmt_p should be possible but stmt_kills_ref_p
is way harder.

Let me see if I can do ref_maybe_used_by_stmt_p.

[Bug tree-optimization/96565] Failure to optimize out VLA even though it is left unused

2020-08-25 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565

--- Comment #6 from Jakub Jelinek  ---
I'd say that the VLAs are more like "malloc", with the stack restore then
conceptually performing one or more "free"s, so perhaps we should instead emit
calls to a free-like internal function .FREE_VLA (ptr); right before the
__builtin_stack_restore for the VLAs freed at that point, handle the
__builtin_alloca_with_align and .FREE_VLA pairs like we treat malloc/free in
tree DSE, and then throw away those ifn calls at some point.

[Bug tree-optimization/96565] Failure to optimize out VLA even though it is left unused

2020-08-25 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565

--- Comment #5 from Jakub Jelinek  ---
Wouldn't many spots in the middle-end be upset about the gimple clobber having
a variable length type (both the MEM_REF on the lhs and corresponding type on
the CONSTRUCTOR)?
I think tree DCE should have everything it needs for the removals even without
such CLOBBERs, except that it is harder but not impossible to find out which
__builtin_stack_restore frees what.

[Bug tree-optimization/96565] Failure to optimize out VLA even though it is left unused

2020-08-25 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565

--- Comment #4 from Marc Glisse  ---
(In reply to Richard Biener from comment #3)
> I guess the "usual" way of dealing with this would be to have
> CLOBBERs for all VLAs before the __builtin_stack_restore.

That looks like a good idea.

I didn't try to follow in a debugger why DSE fails to remove the write when
those 2 builtins are present while it manages if I call
__builtin_alloca_with_align directly, but I don't immediately see a reason for
that difference, even in the absence of clobbers. Or maybe that's just the
usual limitations of DSE (there is a branch after all...).

I first thought that __builtin_stack_save/restore might need some extra
attributes (advertising for instance that they do not read/write memory or let
anything escape, without weakening them to the point where the compiler would
move them around too much or remove them), but since the call to the opaque g
does not seem to prevent DSE from removing the write, that's probably not the
issue.

[Bug tree-optimization/96565] Failure to optimize out VLA even though it is left unused

2020-08-25 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener  ---
So we have

   :
  saved_stack.2_7 = __builtin_stack_save ();
  _1 = (long int) x_6(D);
  _2 = _1 + -1;
  _13 = (sizetype) _2;
  _4 = (sizetype) x_6(D);
  _5 = (bitsizetype) _4;
  _14 = _5 * 8;
  arr.1_19 = __builtin_alloca_with_align (_4, 8);
  (*arr.1_19)[0] = 0;
  _12 = g ();
  if (_12 != 0)
    goto ; [INV]
  else
    goto ; [INV]

   :
  abort ();

   :
  __builtin_stack_restore (saved_stack.2_7);
  return;

I guess the "usual" way of dealing with this would be to have
CLOBBERs for all VLAs before the __builtin_stack_restore.

[Bug tree-optimization/96565] Failure to optimize out VLA even though it is left unused

2020-08-11 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565

--- Comment #2 from Marc Glisse  ---
Actually, it isn't so much the alloca call itself, it seems to be
__builtin_stack_save / __builtin_stack_restore that prevent DSE from removing
arr[0] = 0 (without that write, DCE can remove __builtin_alloca_with_align, and
__builtin_stack_* disappear in FAB).
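
A hand-lowered form of the VLA, written with the builtins named above, might look as follows. This is only a sketch: it is GCC-specific, the names f_lowered, g and g_calls are hypothetical, and g is defined locally so the example is self-contained, whereas a real reproducer would only declare it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the opaque g() of the report.  */
static int g_calls;
int g(void) { ++g_calls; return 0; }

/* Roughly what the compiler generates for "char arr[x]; arr[0] = 0;".
   The store is dead because arr is never read afterwards.  */
void f_lowered(int x)
{
    assert(x > 0);
    void *saved_stack = __builtin_stack_save();
    /* The alignment argument is in bits and must be a constant.  */
    char *arr = __builtin_alloca_with_align((size_t)x, 8);
    arr[0] = 0;
    if (g())
        abort();
    __builtin_stack_restore(saved_stack);
}
```

Dropping the two __builtin_stack_* calls from f_lowered gives the variant where the write could already be removed.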

[Bug tree-optimization/96565] Failure to optimize out VLA even though it is left unused

2020-08-11 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565

Marc Glisse  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2020-08-11

--- Comment #1 from Marc Glisse  ---
We simplify this for alloca and for malloc, but not for
__builtin_alloca_with_align (aka VLA).
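
The three cases can be sketched side by side. This is a reconstruction for illustration, not the testcase attached to the report: the function names, the counter g_calls and the locally defined g are hypothetical, the char element type is a guess, and the NULL check in the malloc variant is added only to keep the sketch well defined.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an opaque function; a real reproducer would
   only declare it so the compiler cannot see its body.  */
static int g_calls;
int g(void) { ++g_calls; return 0; }

/* VLA: the case that was not simplified.  */
void f_vla(int x)
{
    assert(x > 0);
    char arr[x];
    arr[0] = 0;               /* dead store, arr is never read */
    if (g())
        abort();
}

/* alloca: reported as simplified.  */
void f_alloca(int x)
{
    assert(x > 0);
    char *arr = __builtin_alloca((size_t)x);
    arr[0] = 0;
    if (g())
        abort();
}

/* malloc: reported as simplified.  */
void f_malloc(int x)
{
    assert(x > 0);
    char *arr = malloc((size_t)x);
    if (!arr)
        return;
    arr[0] = 0;
    if (g())
        abort();
    free(arr);
}
```

Comparing the optimized dumps of the three functions (for example with -O2 -fdump-tree-optimized) should show whether the allocation and the store survive in each variant.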