https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120941
--- Comment #41 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to rguent...@suse.de from comment #40)
> On Wed, 30 Jul 2025, hjl.tools at gmail dot com wrote:
> > With my patch, we got
> >
> >   basic_block bb = nearest_common_dominator_for_set (CDI_DOMINATORS, bbs);
> >   /* For X86_CSE_VEC_DUP, don't place the vector set outside of the loop
> >      to avoid extra spills.  */
> >   if (!load || load->kind != X86_CSE_VEC_DUP)
> >     {
> >       while (bb->loop_father->latch
> >              != EXIT_BLOCK_PTR_FOR_FN (cfun))
> >         bb = get_immediate_dominator (CDI_DOMINATORS,
> >                                       bb->loop_father->header);
> >     }
> >
> > When load->kind == X86_CSE_VEC_DUP, bb is the nearest common dominator which
> > may be inside the loop.
>
> That ensures this when there's a single reaching def for all of the uses
> in 'bbs'.  So your patch then is also a correctness fix.

Before my patch, a single reaching def is hoisted outside of all loops,
which is still correct, but caused extra spill in some cases.
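
For readers without a GCC tree at hand, here is a rough, self-contained sketch of the placement rule discussed above. It is not the GCC code: `toy_block`, `toy_loop`, `toy_nearest_common_dominator` and `toy_placement` are hypothetical stand-ins for `basic_block`, `loop_father`, `nearest_common_dominator_for_set` and the quoted snippet, and "not inside any loop" is modelled as a NULL loop pointer rather than by comparing the latch against `EXIT_BLOCK_PTR_FOR_FN (cfun)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for GCC's CFG types; all names here are hypothetical.  */
struct toy_loop;

struct toy_block
{
  const char *name;
  struct toy_block *idom;   /* Immediate dominator, NULL for the entry.  */
  struct toy_loop *loop;    /* Innermost enclosing loop, NULL if none.  */
  int dom_depth;            /* Depth in the dominator tree, entry is 0.  */
};

struct toy_loop
{
  struct toy_block *header; /* Loop header block.  */
};

/* Nearest common dominator of A and B, found by walking up the
   dominator tree until both pointers meet.  */
static struct toy_block *
toy_nearest_common_dominator (struct toy_block *a, struct toy_block *b)
{
  while (a->dom_depth > b->dom_depth)
    a = a->idom;
  while (b->dom_depth > a->dom_depth)
    b = b->idom;
  while (a != b)
    {
      a = a->idom;
      b = b->idom;
    }
  return a;
}

/* Pick the block where a single definition reaching all USES is placed.
   With IS_VEC_DUP the nearest common dominator is kept, even if it is
   inside a loop.  Otherwise the block is moved out of every enclosing
   loop by stepping to the immediate dominator of each loop header.  */
static struct toy_block *
toy_placement (struct toy_block **uses, size_t n, bool is_vec_dup)
{
  assert (n > 0);
  struct toy_block *bb = uses[0];
  for (size_t i = 1; i < n; i++)
    bb = toy_nearest_common_dominator (bb, uses[i]);

  if (!is_vec_dup)
    while (bb->loop != NULL)
      bb = bb->loop->header->idom;

  return bb;
}
```

With two uses in different blocks of the same loop body, the vec_dup case stays at the loop header, while the other case ends up in the block dominating the header from outside the loop. Either result dominates all uses, which matches the remark that hoisting out of all loops is still correct and differs only in register pressure.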