[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|ICE on valid code at -O3 on |[14 Regression] ICE on
                   |x86_64-linux-gnu:           |valid code at -O3 on
                   |verify_flow_info failed     |x86_64-linux-gnu:
                   |                            |verify_flow_info failed
   Target Milestone|---                         |14.0
           Keywords|                            |ice-on-valid-code
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111960
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-10-30
           Keywords|                            |needs-bisection

--- Comment #2 from Andrew Pinski ---
This seems like it is recursive inlining causing the issues ...
Anyways confirmed.
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

Sam James changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sjames at gcc dot gnu.org

--- Comment #3 from Sam James ---
(In reply to Zhendong Su from comment #0)
> This appears to be a recent regression.
>

Out of interest, when you say this, do you have a rough range in mind? It'd
make bisecting easier. Or do you just mean you surely would've hit it by now
with your testing if it had been there a while?
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #9 from Jakub Jelinek ---
Still reproducible with
--- gcc/tree-scalar-evolution.cc
+++ gcc/tree-scalar-evolution.cc
@@ -3881,7 +3881,7 @@ final_value_replacement_loop (class loop *loop)
   /* Propagate constants immediately, but leave an unused initialization
      around to avoid invalidating the SCEV cache.  */
-  if (CONSTANT_CLASS_P (def) && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rslt))
+  if (0 && CONSTANT_CLASS_P (def) && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rslt))
     replace_uses_by (rslt, def);

   /* Create the replacement statements.  */

The bb with uninitialized count is created by
#7  0x0069060b in create_empty_bb (after=) at ../../gcc/cfghooks.cc:773
#8  0x00e2c995 in gimple_duplicate_bb (bb=, id=0x7fffc610) at ../../gcc/tree-cfg.cc:6513
#9  0x00691158 in duplicate_block (bb=, e=, after=, id=0x7fffc610) at ../../gcc/cfghooks.cc:1119
#10 0x006918f5 in copy_bbs (bbs=0x3bfa670, n=3, new_bbs=0x3bce9c0, edges=0x7fffc790, num_edges=2, new_edges=0x7fffc780, base=0x7fffe9f1f7d0, after=, update_dominance=true) at ../../gcc/cfghooks.cc:1384
#11 0x006a19c6 in duplicate_loop_body_to_header_edge (loop=0x7fffe9f1f7d0, e= 62)>, ndupl=2, wont_exit=0x3ac78f0, orig= 66)>,
    Python Exception : There is no member or method named m_vecpfx.
    to_remove=0x39ba7b0, flags=5) at ../../gcc/cfgloopmanip.cc:1403
#12 0x00fc8fd9 in gimple_duplicate_loop_body_to_header_edge (loop=0x7fffe9f1f7d0, e= 225)>, ndupl=2, wont_exit=0x3ac78f0, orig= 66)>,
    Python Exception : There is no member or method named m_vecpfx.
    to_remove=0x39ba7b0, flags=5) at ../../gcc/tree-ssa-loop-manip.cc:860
#13 0x00fa53f6 in try_unroll_loop_completely (loop=0x7fffe9f1f7d0, exit= 66)>, niter=, may_be_zero=false, ul=UL_ALL, maxiter=2, locus=..., allow_peel=true) at ../../gcc/tree-ssa-loop-ivcanon.cc:960

Seems in the above backtrace it is duplicate_block which does the new_bb->count updates. It does:
1107      profile_count new_count = e ? e->count (): profile_count::uninitialized ();
but e is NULL, so here new_count is uninitialized, and then
1114      if (bb->count < new_count)
1115        new_count = bb->count;
here
p bb->count.debug ()
2305843009213693950 (estimated locally, freq 144115188075855872.)
p new_count.debug ()
uninitialized
but bb->count < new_count is false due to
  bool operator< (const profile_probability &other) const
    {
      return initialized_p () && other.initialized_p () && m_val < other.m_val;
    }
Shouldn't that be
  if (!(bb->count >= new_count))
or
  if (bb->count < new_count || !new_count.initialized_p ())
?
Honza?
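
The difference between the conditions proposed in comment #9 can be seen in a small standalone C++ sketch. It is only a toy model: toy_count, its sentinel value and the three helper functions are hypothetical names made up for illustration, not GCC's profile_count; the one assumption taken from the comment is that every ordered comparison returns false as soon as either operand is uninitialized.

```cpp
#include <cassert>
#include <cstdint>

// Toy counter with an "uninitialized" sentinel.  Hypothetical type, not
// GCC's profile_count; it only mimics the comparison rule quoted above.
struct toy_count
{
  static constexpr std::uint64_t uninitialized_val
    = (std::uint64_t (1) << 61) - 1;

  std::uint64_t m_val;

  constexpr toy_count () : m_val (uninitialized_val) {}
  constexpr explicit toy_count (std::uint64_t v) : m_val (v) {}

  constexpr bool initialized_p () const { return m_val != uninitialized_val; }

  // Ordered comparisons are false whenever either side is uninitialized.
  constexpr bool operator< (const toy_count &other) const
  {
    return initialized_p () && other.initialized_p () && m_val < other.m_val;
  }
  constexpr bool operator>= (const toy_count &other) const
  {
    return initialized_p () && other.initialized_p () && m_val >= other.m_val;
  }
};

// The existing condition and the two alternatives from the comment.
constexpr bool keep_lt (toy_count bb, toy_count nc)
{
  return bb < nc;
}
constexpr bool keep_not_ge (toy_count bb, toy_count nc)
{
  return !(bb >= nc);
}
constexpr bool keep_lt_or_uninit (toy_count bb, toy_count nc)
{
  return bb < nc || !nc.initialized_p ();
}

// Small self-check that can be called from any test driver.
inline void check_toy_conditions ()
{
  toy_count bb (100), uninit;
  assert (!keep_lt (bb, uninit));            // plain '<' never fires
  assert (keep_not_ge (bb, uninit));         // '!(a >= b)' fires
  assert (keep_lt_or_uninit (bb, uninit));   // explicit check fires
}
```

In this toy model the two alternatives agree when only new_count is uninitialized, but they differ when bb->count is the uninitialized one: !(bb->count >= new_count) is then true, while bb->count < new_count || !new_count.initialized_p () is false.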
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #10 from Richard Biener ---
Looks like so, can you test that? I think !(bb->count >= new_count) is good,
we're using this kind of compare regularly.
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #11 from Jakub Jelinek ---
(In reply to Richard Biener from comment #10)
> Looks like so, can you test that? I think !(bb->count >= new_count) is good,
> we're using this kind of compare regularly.

Sure, I'll test that.
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #12 from Jakub Jelinek ---
(In reply to Jakub Jelinek from comment #11)
> (In reply to Richard Biener from comment #10)
> > Looks like so, can you test that? I think !(bb->count >= new_count) is
> > good,
> > we're using this kind of compare regularly.
>
> Sure, I'll test that.

Actually no, that doesn't help, nor the IMO better
  if (!new_count.initialized_p () || bb->count < new_count)
    new_count = bb->count;
because if say bb->count is not initialized but e->count is, we don't want to overwrite it. The thing is that new_count is actually not used unless e is non-NULL.

The actual problem is different, bb->count of one of the duplicated blocks is initialized to the largest possible uninitialized m_val (0x3ffe aka 2305843009213693950 (estimated locally, freq 144115188075855872.) ) and then scaled to uninitialized.

This is because in the second duplicate_loop_body_to_header_edge on the testcase (with the #c9 patch to reproduce it even on the trunk) we have
(gdb) p count_le.debug ()
1729382256910270463 (estimated locally, freq 108086391056891904.)
(gdb) p count_out_orig.debug ()
576460752303423488 (estimated locally, freq 36028797018963968.)
but
1264      profile_count new_count_le = count_le + count_out_orig;
is
(gdb) p new_count_le.debug ()
uninitialized
because 0x17ff + 0x800 yields the largest possible value.

If profile_count wants to use the 0x1fff value as uninitialized, shouldn't it perform saturating arithmetic such that the counts will never be larger than 0x1ffe unless it is really meant to be uninitialized? I mean in all those spots like operator+ which just do m_val + other.m_val and similar without checking for overflow? What about apply_scale etc.?
Honza?
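
The arithmetic in comment #12 can be checked with a short C++ sketch. The two operands are the decimal values printed by gdb above; the 61-bit width, with the all-ones pattern reserved for "uninitialized" and the value just below it as the largest valid count, is an assumption that is consistent with those gdb values. The constant and function names below are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: a 61-bit counter field whose all-ones bit pattern is the
// "uninitialized" sentinel.  Names are illustrative only.
constexpr std::uint64_t n_bits = 61;
constexpr std::uint64_t uninitialized_val = (std::uint64_t (1) << n_bits) - 1;
constexpr std::uint64_t max_count = uninitialized_val - 1;

// Decimal values printed by gdb in the comment.
constexpr std::uint64_t count_le = 1729382256910270463ULL;       // 3 * 2^59 - 1
constexpr std::uint64_t count_out_orig = 576460752303423488ULL;  // 2^59

// Unchecked addition, i.e. just m_val + other.m_val.
constexpr std::uint64_t plain_add (std::uint64_t a, std::uint64_t b)
{
  return a + b;
}

// The bb->count value from comment #9 is exactly the largest valid count.
static_assert (max_count == 2305843009213693950ULL,
               "largest valid count is 2^61 - 2");
// Both operands are valid counts on their own ...
static_assert (count_le <= max_count && count_out_orig <= max_count,
               "operands are initialized counts");
// ... but their plain sum is 2^61 - 1, which is the sentinel itself.
static_assert (plain_add (count_le, count_out_orig) == uninitialized_val,
               "sum lands exactly on the uninitialized pattern");
```

Since 3 * 2^59 - 1 + 2^59 = 2^61 - 1, the sum of two perfectly valid counts is indistinguishable from an uninitialized count, which is why gdb prints new_count_le as uninitialized.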
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #13 from Jakub Jelinek ---
Created attachment 57821
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57821&action=edit
gcc14-pr112303.patch

This patch fixes the ICE for me.
Seems we already did something like that in other spots (e.g. in apply_scale).
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #14 from Jan Hubicka ---
> This patch fixes the ICE for me.
> Seems we already did something like that in other spots (e.g. in apply_scale).

In general, if the overflow happens, some pass must have misbehaved and done something crazy when updating the profile. But indeed we probably ought to cap here instead of randomly getting to uninitialized. It may make sense to turn these into ICEs that fire only when checking is enabled.

I will look into why the overflow happens.
Honza
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #15 from GCC Commits ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:d5a3b4afcdf4d517334a2717dbb65ae0d2c26507

commit r14-9707-gd5a3b4afcdf4d517334a2717dbb65ae0d2c26507
Author: Jakub Jelinek
Date:   Thu Mar 28 15:00:44 2024 +0100

profile-count: Avoid overflows into uninitialized [PR112303]

The testcase in the patch ICEs with
--- gcc/tree-scalar-evolution.cc
+++ gcc/tree-scalar-evolution.cc
@@ -3881,7 +3881,7 @@ final_value_replacement_loop (class loop *loop)
   /* Propagate constants immediately, but leave an unused initialization
      around to avoid invalidating the SCEV cache.  */
-  if (CONSTANT_CLASS_P (def) && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rslt))
+  if (0 && CONSTANT_CLASS_P (def) && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rslt))
     replace_uses_by (rslt, def);

   /* Create the replacement statements.  */
(the addition of the above made the ICE latent), because profile_count addition doesn't check for overflows and if unlucky, we can even overflow into the uninitialized value.

Getting really huge profile counts is very easy even when not using recursive inlining in loops, e.g.
__attribute__((noipa)) void
bar (void)
{
  __builtin_exit (0);
}

__attribute__((noipa)) void
foo (void)
{
  for (int i = 0; i < 1000; ++i)
    for (int j = 0; j < 1000; ++j)
      for (int k = 0; k < 1000; ++k)
        for (int l = 0; l < 1000; ++l)
          for (int m = 0; m < 1000; ++m)
            for (int n = 0; n < 1000; ++n)
              for (int o = 0; o < 1000; ++o)
                for (int p = 0; p < 1000; ++p)
                  for (int q = 0; q < 1000; ++q)
                    for (int r = 0; r < 1000; ++r)
                      for (int s = 0; s < 1000; ++s)
                        for (int t = 0; t < 1000; ++t)
                          for (int u = 0; u < 1000; ++u)
                            for (int v = 0; v < 1000; ++v)
                              for (int w = 0; w < 1000; ++w)
                                for (int x = 0; x < 1000; ++x)
                                  for (int y = 0; y < 1000; ++y)
                                    for (int z = 0; z < 1000; ++z)
                                      for (int a = 0; a < 1000; ++a)
                                        for (int b = 0; b < 1000; ++b)
                                          bar ();
}

int
main ()
{
  foo ();
}
reaches the maximum count already on the 11th loop.

Some other methods of profile_count like apply_scale already do use MIN (val, max_count) before assignment to m_val, this patch just extends that to operator{+,+=} methods.
Furthermore, one overload of apply_probability wasn't using safe_scale_64bit and so could very easily overflow as well - prob is required to be [0, 10000] and if m_val is near the max_count, it can overflow even with multiplications by 8.

2024-03-28  Jakub Jelinek

        PR tree-optimization/112303
        * profile-count.h (profile_count::operator+): Perform
        addition in uint64_t variable and set m_val to MIN of that
        val and max_count.
        (profile_count::operator+=): Likewise.
        (profile_count::operator-=): Formatting fix.
        (profile_count::apply_probability): Use safe_scale_64bit
        even in the int overload.

        * gcc.c-torture/compile/pr112303.c: New test.
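
The capping described in the commit message can be sketched in a few lines of standalone C++. This is a simplified illustration and not the committed patch: the constants assume a 61-bit counter whose all-ones pattern means "uninitialized", and saturating_add is a hypothetical free function standing in for what the ChangeLog describes for operator+ and operator+=.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: 61-bit counter, all-ones pattern reserved as the
// "uninitialized" sentinel.  Illustrative names, not GCC's declarations.
constexpr std::uint64_t uninitialized_val = (std::uint64_t (1) << 61) - 1;
constexpr std::uint64_t max_count = uninitialized_val - 1;

// Add two valid counts in a 64-bit temporary and clamp the result, in the
// spirit of "set m_val to MIN of that val and max_count".  Both inputs are
// below 2^61, so the 64-bit sum itself cannot wrap around.
constexpr std::uint64_t saturating_add (std::uint64_t a, std::uint64_t b)
{
  return a + b < max_count ? a + b : max_count;
}

// Small sums are unaffected.
static_assert (saturating_add (1, 2) == 3, "no clamping needed");
// The operands from comment #12 now cap at max_count instead of turning
// into the sentinel.
static_assert (saturating_add (1729382256910270463ULL, 576460752303423488ULL)
               == max_count, "clamped to the largest valid count");
static_assert (saturating_add (max_count, max_count) != uninitialized_val,
               "a clamped sum never equals the sentinel");
```

With clamping in place, a sum of initialized counts can lose precision once it reaches max_count, but it can no longer be mistaken for an uninitialized count.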
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #16 from GCC Commits ---
The releases/gcc-13 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:b7b4ef2ff20c5023a41ed663dd8f4724b4ff0f9c

commit r13-8525-gb7b4ef2ff20c5023a41ed663dd8f4724b4ff0f9c
Author: Jakub Jelinek
Date:   Thu Mar 28 15:00:44 2024 +0100

profile-count: Avoid overflows into uninitialized [PR112303]

(cherry picked from commit d5a3b4afcdf4d517334a2717dbb65ae0d2c26507)
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #17 from Jakub Jelinek ---
Fixed.
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |needs-bisection

--- Comment #6 from Andrew Pinski ---
This seems to have been fixed recently.
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
           Keywords|needs-bisection             |

--- Comment #7 from Jakub Jelinek ---
Doesn't ICE since r14-6010-g2dde9f326ded84814a78c3044294b535c1f97b41
No idea whether that was the fix for this or just something that made it latent.
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #8 from rguenther at suse dot de ---
On Tue, 5 Dec 2023, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303
>
> Jakub Jelinek changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |jakub at gcc dot gnu.org,
>                    |                            |rguenth at gcc dot gnu.org
>            Keywords|needs-bisection             |
>
> --- Comment #7 from Jakub Jelinek ---
> Doesn't ICE since r14-6010-g2dde9f326ded84814a78c3044294b535c1f97b41
> No idea whether that was the fix for this or just something that made it
> latent.

I'm quite sure it just made it latent.
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

Sam James changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needs-bisection             |
                 CC|                            |hubicka at gcc dot gnu.org
            Summary|[14 Regression] ICE on      |[14 Regression] ICE on
                   |valid code at -O3 on        |valid code at -O3 on
                   |x86_64-linux-gnu:           |x86_64-linux-gnu:
                   |verify_flow_info failed     |verify_flow_info failed
                   |                            |since
                   |                            |r14-3459-g0c78240fd7d519

--- Comment #4 from Sam James ---
bisect says:

commit 0c78240fd7d519fc27ca822f66a92f85edf43f70
Author: Jan Hubicka
Date:   Thu Aug 24 15:10:46 2023 +0200

Check that passes do not forget to define profile

in r14-3459-g0c78240fd7d519. It's probably been there for a while. This has popped up in a bunch of places naturally.
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
            Version|unknown                     |14.0
[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

--- Comment #5 from Zhendong Su ---
(In reply to Sam James from comment #3)
> (In reply to Zhendong Su from comment #0)
> > This appears to be a recent regression.
> >
>
> Out of interest, when you say this, do you have a rough range in mind? It'd
> make bisecting easier. Or do you just mean you surely would've hit it by now
> with your testing if it had been there a while?

By "This appears to be a recent regression", I typically mean, according to Compiler Explorer, the bug is only reproduced with its current trunk build.