https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65698
--- Comment #3 from Yuri Rumyantsev <ysrumyan at gmail dot com> ---
I see that this bug has not been looked at for a while. Here is an additional comment.

First of all, this test was extracted from the bzip2 benchmark, function mainGTU. The problem is that

(1) the tree optimizers CSE i1 * 2 and i2 * 2;
(2) the forward propagation pass does not substitute them back into the address computation, since use_killed_between is very simplified: it handles only a single basic block or a semi-hammock:

  /* Finally, if DEF_BB is the sole predecessor of TARGET_BB.  */
  if (single_pred_p (target_bb) && single_pred (target_bb) == def_bb)

This function must be enhanced to handle an arbitrary CFG.

Note that this deficiency increases register pressure by 2, and we get more spills/fills on the x86 32-bit target.
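
For readers without the attached testcase at hand, the kind of source involved can be sketched as follows. This is a hypothetical reduction in the spirit of mainGTU, not the testcase from this PR: the name gtu_sketch, the loop shape and the wrap-around handling are my own assumptions. The 16-bit quadrant array is what makes each index get scaled by 2 in the address. Whether this exact function reproduces the extra spills depends on the GCC version and options.

```c
#include <stdint.h>

/* Hypothetical sketch, NOT the testcase attached to this PR.
   quadrant[] has 16-bit elements, so its addresses are formed as
   quadrant + i1 * 2 and quadrant + i2 * 2.  The scaled indices are
   used in more than one basic block (the compare and the return),
   which is where a CSE'd i1 * 2 / i2 * 2 would stay live.  */
int
gtu_sketch (uint32_t i1, uint32_t i2, const uint8_t *block,
            const uint16_t *quadrant, uint32_t nblock)
{
  int32_t k = (int32_t) nblock;

  do
    {
      if (block[i1] != block[i2])
        return block[i1] > block[i2];
      if (quadrant[i1] != quadrant[i2])
        return quadrant[i1] > quadrant[i2];

      i1++;
      i2++;
      /* Wrap around so that both indices stay below nblock.  */
      if (i1 >= nblock)
        i1 -= nblock;
      if (i2 >= nblock)
        i2 -= nblock;
      k--;
    }
  while (k >= 0);

  return 0;
}
```

Compiling something like this with -O2 -m32 and comparing the assembly against a variant where the multiplication by 2 is folded back into the addressing mode is one way to look for the additional spills/fills.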