[Bug tree-optimization/110875] [14 Regression] Dead Code Elimination Regression since r14-2501-g285c9d042e9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110875

Andrew Macleod changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #4 from Andrew Macleod ---
When range_of_stmt invokes prefill_name to evaluate unvisited dependencies,
it should not mark visited names as always_current.

When ranger_cache::get_global_range () is invoked with the optional
"current_p" flag, it triggers additional functionality. This call is meant to
be made from within ranger, and it is understood that if the cached value is
not current, set_global_range will always be called later with a value. Thus
it sets the always_current flag in the temporal cache to avoid computation
cycles.

The prefill_stmt_dependencies () mechanism within ranger is intended to
emulate the behaviour of range_of_stmt on an arbitrarily long series of
unresolved dependencies without triggering the overhead of huge call chains
from the range_of_expr/range_on_entry/range_on_exit routines. Rather, it
creates a stack of unvisited names and invokes range_of_stmt on them directly
in order to get initial cache values for each ssa-name.

The issue in this PR was that the routine was incorrectly invoking
get_global_cache to determine whether there was a global value. If there was,
it would move on to the next dependency without invoking set_global_range to
clear the always_current flag.

What it should have been doing was simply checking whether there was a global
value and, if there was not, adding the name for processing and THEN invoking
get_global_value to do all the special processing.

Fixed.
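The difference between the two orderings described in comment #4 can be sketched with a toy model. This is not GCC code: ToyCache, prefill_buggy, prefill_fixed and stale_flag_after_prefill are hypothetical names, and the flag handling is simplified (the model sets always_current on every flagged query, which is an assumption made for brevity). It only illustrates how a side-effecting query used as an existence test can leave the flag set on a name that is never revisited.

```cpp
// Toy model only; every name here is hypothetical and not part of GCC.
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Entry {
  int lo = 0, hi = 0;           // stand-in for a cached range
  bool always_current = false;  // stand-in for the temporal-cache flag
};

class ToyCache {
public:
  // Plain existence check with no side effects.
  bool has_global(const std::string &name) const {
    return m_entries.count(name) != 0;
  }
  // Query with "current_p"-style side effects: creates an initial entry if
  // needed and marks the name always_current, on the promise that
  // set_global () will be called for it later.
  bool get_global_current(const std::string &name) {
    bool had_value = has_global(name);
    m_entries[name].always_current = true;
    return had_value;
  }
  // Storing a value clears the always_current flag.
  void set_global(const std::string &name, int lo, int hi) {
    Entry &e = m_entries[name];
    e.lo = lo;
    e.hi = hi;
    e.always_current = false;
  }
  bool always_current(const std::string &name) const {
    auto it = m_entries.find(name);
    return it != m_entries.end() && it->second.always_current;
  }
private:
  std::map<std::string, Entry> m_entries;
};

// Buggy ordering: the side-effecting query doubles as the existence test.
void prefill_buggy(ToyCache &c, const std::string &name,
                   std::vector<std::string> &work) {
  if (!c.get_global_current(name))
    work.push_back(name);
}

// Fixed ordering: test first, then do the special processing only for names
// that will actually be processed.
void prefill_fixed(ToyCache &c, const std::string &name,
                   std::vector<std::string> &work) {
  if (!c.has_global(name)) {
    c.get_global_current(name);
    work.push_back(name);
  }
}

// Returns true if the already-visited name "a" is left marked always_current.
bool stale_flag_after_prefill(bool use_fix) {
  ToyCache cache;
  cache.set_global("a", -2, 0);  // "a" was visited earlier
  std::vector<std::string> work;
  for (const char *n : {"a", "b"}) {
    if (use_fix)
      prefill_fixed(cache, n, work);
    else
      prefill_buggy(cache, n, work);
  }
  assert(work.size() == 1 && work[0] == "b");  // only "b" is unvisited
  for (const std::string &n : work)
    cache.set_global(n, 0, 0);  // processing clears the flag
  return cache.always_current("a");
}
```

In this model, stale_flag_after_prefill (false) reports that "a" stays marked always_current, while stale_flag_after_prefill (true) reports that it does not.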
[Bug tree-optimization/110875] [14 Regression] Dead Code Elimination Regression since r14-2501-g285c9d042e9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110875

--- Comment #3 from CVS Commits ---
The master branch has been updated by Andrew Macleod:

https://gcc.gnu.org/g:cf2ae3fff4ee9bf884b122ee6cd83bffd791a16f

commit r14-3792-gcf2ae3fff4ee9bf884b122ee6cd83bffd791a16f
Author: Andrew MacLeod
Date:   Thu Sep 7 11:15:50 2023 -0400

    Some ssa-names get incorrectly marked as always_current.

    When range_of_stmt invokes prefill_name to evaluate unvisited dependencies
    it should not mark already visited names as always_current.

            PR tree-optimization/110875
            gcc/
            * gimple-range.cc (gimple_ranger::prefill_name): Only invoke
            cache-prefilling routine when the ssa-name has no global value.

            gcc/testsuite/
            * gcc.dg/pr110875.c: New.
[Bug tree-optimization/110875] [14 Regression] Dead Code Elimination Regression since r14-2501-g285c9d042e9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110875

Aldy Hernandez changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amacleod at redhat dot com

--- Comment #2 from Aldy Hernandez ---
(In reply to Andrew Pinski from comment #1)
> I am super confused about VRP's ranges:
> We have the following ranges that get exported and their relationships:
> Global Exported: a.8_105 = [irange] int [-2, 0]
> _10 = a.8_105 + -1;
> Global Exported: _10 = [irange] int [-INF, -6][-3, -1][1, 2147483645]
> _103 = (unsigned int) _10;
> Global Exported: _103 = [irange] unsigned int [1, 2147483645][2147483648,
> 4294967290][4294967294, +INF]
> Simplified relational if (_103 > 1)
>  into if (_103 != 1)
>
>
> Shouldn't the range of _10 just be [-3,-1]
> If so _103 can't get 0 or 1 ?  And then if that gets it right then the call
> to foo will go away.

[It looks like a caching issue of some kind. Looping Andrew.]

Yes, that is indeed confusing. _10 should have a more refined range.

Note that there's a dependency between a.8_105 and _10:

    [local count: 327784168]:
    # f_lsm.17_26 = PHI
    # a.8_105 = PHI <0(3), _10(13)>
    # b_lsm.19_33 = PHI
    # b_lsm_flag.20_53 = PHI <0(3), 1(13)>
    # a_lsm.21_49 = PHI <_54(D)(3), _10(13)>
    _9 = e.10_39 + 4294967061;
    _10 = a.8_105 + -1;
    if (_10 != -3(OVF))
      goto ; [94.50%]
    else
      goto ; [5.50%]

This is what I see with --param=ranger-debug=tracegori in VRP2...

We first calculate a.8_105 to [-INF, -5][-2, 0][2, 2147483646]:

    1140 range_of_stmt (a.8_105) at stmt a.8_105 = PHI <0(3), _10(13)>
    1141   ROS dependence fill
             ROS dep fill (a.8_105) at stmt a.8_105 = PHI <0(3), _10(13)>
             ROS dep fill (_10) at stmt _10 = a.8_105 + -1;
    1142     range_of_expr(a.8_105) at stmt _10 = a.8_105 + -1;
             TRUE : (1142) range_of_expr (a.8_105) [irange] int [-INF, -5][-2, 0][2, 2147483646]

Which we later refine with SCEV:

    Statement _10 = a.8_105 + -1;
     is executed at most 2147483647 (bounded by 2147483647) + 1 times in loop 4.
       Loops range found for a.8_105: [irange] int [-2, 0] and calculated range :[irange] int [-INF, -6][-2, 0][2, 2147483645]
             TRUE : (1140) range_of_stmt (a.8_105) [irange] int [-2, 0]
    Global Exported: a.8_105 = [irange] int [-2, 0]

I have verified that range_of_expr after this point returns [-2, 0], so the
refined range is known both globally and locally.

However, when we try to fold _10 later on, we use the cached value instead of
recalculating with the new range for a.8_105:

    Folding statement: _10 = a.8_105 + -1;
    872  range_of_stmt (_10) at stmt _10 = a.8_105 + -1;
         TRUE : (872) cached (_10) [irange] int [-INF, -6][-3, -1][1, 2147483645]
[Bug tree-optimization/110875] [14 Regression] Dead Code Elimination Regression since r14-2501-g285c9d042e9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110875

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-08-03
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Andrew Pinski ---
Confirmed. Though I have no idea how to fix this really.

The first major change to the IR happens in thread2, where we decide to do a
jump thread that we didn't do before.

In GCC 13 we had:
```
[local count: 282631250]:
# a.8_39 = PHI <_12(23), 0(3)>
# f_lsm.17_20 = PHI
# f_lsm_flag.18_22 = PHI
# b_lsm.19_45 = PHI <0(23), b_lsm.19_53(3)>
# b_lsm_flag.20_47 = PHI <1(23), 0(3)>
# a_lsm.21_49 = PHI <_12(23), _55(D)(3)>
_1 = a.8_39 != 0;
_2 = (int) _1;
if (_2 != a.8_39)
  goto ; [41.79%]
```

On the trunk we get:
```
[local count: 339987332]:
# a.8_38 = PHI <_10(24), 0(3)>
# f_lsm.17_18 = PHI
# f_lsm_flag.18_20 = PHI
# b_lsm.19_44 = PHI <0(24), b_lsm.19_52(3)>
# b_lsm_flag.20_46 = PHI <1(24), 0(3)>
# a_lsm.21_48 = PHI <_10(24), _54(D)(3)>
_13 = (unsigned int) a.8_38;
if (_13 > 1)
  goto ; [34.74%]
else
  goto ; [65.26%]
```

We duplicate bb4 for bb3 as we can figure that _13>1 will be false.
This was not done for the IR in GCC 13.

I am super confused about VRP's ranges:
We have the following ranges that get exported and their relationships:
Global Exported: a.8_105 = [irange] int [-2, 0]
_10 = a.8_105 + -1;
Global Exported: _10 = [irange] int [-INF, -6][-3, -1][1, 2147483645]
_103 = (unsigned int) _10;
Global Exported: _103 = [irange] unsigned int [1, 2147483645][2147483648, 4294967290][4294967294, +INF]
Simplified relational if (_103 > 1)
 into if (_103 != 1)

Shouldn't the range of _10 just be [-3,-1]?
If so, _103 can't get 0 or 1. And then if that gets it right, the call
to foo will go away.
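The arithmetic behind the question in comment #1 can be checked by brute-force enumeration. The following is a minimal sketch, not ranger code: values_of_103 and condition_always_true are hypothetical helper names, and unsigned int is assumed to be 32 bits wide (modelled with std::uint32_t).

```cpp
// Illustrative check only; helper names are hypothetical, not from GCC.
#include <cstdint>
#include <vector>

// Every value _103 can take when a.8_105 lies in [lo, hi].
std::vector<std::uint32_t> values_of_103(int lo, int hi) {
  std::vector<std::uint32_t> out;
  for (int a = lo; a <= hi; ++a) {
    int v10 = a + -1;                                // _10 = a.8_105 + -1;
    out.push_back(static_cast<std::uint32_t>(v10));  // _103 = (unsigned int) _10;
  }
  return out;
}

// True when "_103 > 1" holds for every a.8_105 in [lo, hi].
bool condition_always_true(int lo, int hi) {
  for (std::uint32_t v : values_of_103(lo, hi))
    if (!(v > 1u))
      return false;
  return true;
}
```

For a.8_105 in [-2, 0], _10 takes the values -3, -2 and -1, which convert to 4294967293, 4294967294 and 4294967295. None of them is 0 or 1, so condition_always_true (-2, 0) holds, which matches the expectation in comment #1 that _10 should be [-3, -1].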
[Bug tree-optimization/110875] [14 Regression] Dead Code Elimination Regression since r14-2501-g285c9d042e9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110875

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Target Milestone|---                         |14.0