[Bug target/113994] [13/14 Regression] Probable C++ code generation bug with -O2 on s390x platform

2024-02-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113994

--- Comment #11 from Richard Biener  ---
(In reply to Richard Biener from comment #9)
> Ick, it's via iv_analysis_loop_init used from doloop and unroll.

In that context I'll note that if it is doloop itself that does the DF
live query then it's doloops fault to not operate within the constraints
of iv_analysis_loop_init.  It might be possible to add a flag to
iv_analysis_loop_init to elide its df_analyze stuff and instead have the
caller do a function-wide DF analysis.

I see doloop_optimize_loops does

  if (optimize == 1)
{
  df_live_add_problem ();
  df_live_set_all_dirty ();
}

  for (auto loop : loops_list (cfun, 0))
doloop_optimize (loop);

but never calls df_analyze.  I'm not actually sure what happens if you
then do df_analyze_loops, whether that scraps all but the loop BB
info computed?  Anyway, I don't think that simply adding another problem
"around" iv_analysis_loop_init works as we can see here.  If
iv_analysis_loop_init were to compute liveness it would see that it needs
more that just the loop body blocks.

Also note that my patch while it might work in most cases, isn't fool-proof.
The actual use might be in the very last BB of the function (very unlikely
for CC, of course).  Doing an analysis of all reachable blocks perverts
the use of df_analyze_loop.

So.  It looks all broken design to me (in doloop).

[Bug target/113994] [13/14 Regression] Probable C++ code generation bug with -O2 on s390x platform

2024-02-29 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113994

--- Comment #10 from Jakub Jelinek  ---
(In reply to Richard Biener from comment #8)

flow_bb_inside_loop_p( has missing space before (.
Anyway, getting rid of its use would likely not be backportable...

[Bug target/113994] [13/14 Regression] Probable C++ code generation bug with -O2 on s390x platform

2024-02-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113994

--- Comment #9 from Richard Biener  ---
Ick, it's via iv_analysis_loop_init used from doloop and unroll.

[Bug target/113994] [13/14 Regression] Probable C++ code generation bug with -O2 on s390x platform

2024-02-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113994

--- Comment #8 from Richard Biener  ---
Created attachment 57574
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57574=edit
patch

A quick-and-dirty thing would be indeed to simply include all exit blocks
like with the attached.

I'll note that df_analyze_loop has serious scalability issues so IMO we
should try to get rid of it and its uses.  I don't know the doloop pass
at all (invariant motion also uses this function), but it would be best
if it could do df_analyze on the whole function and either decide it
doesn't need to update DF during transform since it doesn't effect other
loops it processes or it would update DF info manually.

[Bug target/113994] [13/14 Regression] Probable C++ code generation bug with -O2 on s390x platform

2024-02-29 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113994

Jakub Jelinek  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek  ---
CCing Richi and Honza.

[Bug target/113994] [13/14 Regression] Probable C++ code generation bug with -O2 on s390x platform

2024-02-29 Thread stefansf at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113994

--- Comment #6 from Stefan Schulze Frielinghaus  
---
Looks like wrong liveness information.  The problem is that df_analyze_loop
only considers basic blocks which strictly belong to a loop which is not
enough.  During loop2_doloop basic block 9 (previously 8) embodies the CC
consumer jump_insn 42 and is not part of the loop and therefore does not
contribute to the liveness analysis.  A quick and dirty experiment by forcing a
merge with BB 9

diff --git a/gcc/df-core.cc b/gcc/df-core.cc
index f0eb4c93957..79f37e22ec1 100644
--- a/gcc/df-core.cc
+++ b/gcc/df-core.cc
@@ -957,9 +957,11 @@ df_worklist_propagate_backward (struct dataflow *dataflow,
   if (EDGE_COUNT (bb->succs) > 0)
 FOR_EACH_EDGE (e, ei, bb->succs)
   {
-   if (bbindex_to_postorder[e->dest->index] < last_change_age.length ()
+   if ((bbindex_to_postorder[e->dest->index] < last_change_age.length ()
&& age <= last_change_age[bbindex_to_postorder[e->dest->index]]
&& bitmap_bit_p (considered, e->dest->index))
+   || (strcmp ("loop2_doloop", current_pass->name) == 0
+   && e->src->index == 6 && e->dest->index == 9))
   changed |= dataflow->problem->con_fun_n (e);
   }
   else if (dataflow->problem->con_fun_0)

shows that, now, CC is live at BB 6 and therefore doloop performs no
transformation due to

bool fail = bitmap_intersect_p (df_get_live_out (loop_end), modified);
BITMAP_FREE (modified);

if (fail)
  {
if (dump_file)
  fprintf (dump_file, "Doloop: doloop pattern clobbers live out\n");
return false;
  }

In a first try I enlarged the set of basic blocks for which df_analyze_loop is
run to also include basic blocks which have a direct edge originating from a
basic block of a loop.  Of course, this solves this problem.  However, in
general this may not be enough.  I'm wondering what the IL allows.  Is it
possible to have a graph containing not only outgoing edges of a loop but also
ingoing?  If so I think we would need to compute the set of basic blocks which
are reachable from within the loop.  Any thoughts?

[Bug target/113994] [13/14 Regression] Probable C++ code generation bug with -O2 on s390x platform

2024-02-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113994

--- Comment #5 from Jakub Jelinek  ---
Bisected to r13-1268-g8c99e307b20c502e55c425897fb3884ba8f05882 .
But that just means the bug was latent before that.

[Bug target/113994] [13/14 Regression] Probable C++ code generation bug with -O2 on s390x platform

2024-02-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113994

Jakub Jelinek  changed:

   What|Removed |Added

Summary|Probable C++ code   |[13/14 Regression] Probable
   |generation bug with -O2 on  |C++ code generation bug
   |s390x platform  |with -O2 on s390x platform
   Target Milestone|--- |13.3
   Priority|P3  |P2