https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #26 from Kewen Lin <linkw at gcc dot gnu.org> ---
> > By following this idea, to release the restriction on loop_outer
> > (loop_father) when setting the father_bbs, I can see FRE works as
> > expected.  But it actually does the rpo_vn from cfun's entry to its exit.
>
> Yeah, that's the reason we do not do it.  We could possibly restrict it
> to a containing loop, or if the containing loop is the whole function,
> restrict it to the original preheader block to the loop exits (which are
> no longer there, we'd need to pre-record those I think)

Thanks for the suggestion!  I tried restricting it to run from the original
preheader block to the loop exits (pre-recording both, as you said), but
that does not get array d eliminated in the end.  Unfortunately this case
requires VN to run across the boundary between the original loops.  For now
I have ended up running whole-function VN once if there are no loops left
after unrolling.  I guess that if there are no loops, the CFG should be
simple most of the time, so this should not be too costly?

> > Besides, when SLP happens, FRE generates the bit_field_ref and removes
> > array d, but for scalar code it needs one more DSE run after cunroll to
> > get array d eliminated.  I guess it's not costly?  Can whether a pass
> > runs be controlled by something in another pass?  Doing it via a global
> > variable and adding one parameter in passes.def seems weird.  If it's
> > costly, probably we can factor out one routine to be called instead of
> > running a pass, like do_rpo_vn?
>
> No, we don't have a good way to schedule passes from other passes.  And yes,
> the way forward is to support key transforms on regions.  Oh, and every
> pass that does memory alias analysis (DSE, DCE, VN, etc.) is costly.
>

OK, I'll have a look at DSE and try to make it support region-based
operation, although that may not help this case, since it needs to operate
across the loop boundary.
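
To illustrate why a region limited to one original loop is probably not enough, here is a minimal sketch.  It is a made-up example, not the testcase from this PR, and the function name sum_of_diffs is hypothetical.  Once both loops are completely unrolled, the stores to d come from the first former loop and the loads from the second, so VN can only forward the stored values to the loads (leaving the stores dead for DSE/DCE) if it walks the code of both.

```c
/* Hypothetical illustration only; NOT the testcase attached to this PR.  */

/* Two small constant-trip-count loops that communicate only through
   the local array d.  */
int
sum_of_diffs (const int *a, const int *b)
{
  int d[4];
  int sum = 0;

  /* Loop 1: after complete unrolling, four stores into d.  */
  for (int i = 0; i < 4; i++)
    d[i] = a[i] - b[i];

  /* Loop 2: after complete unrolling, four loads from d.  Replacing
     them with the values stored above needs value numbering that sees
     the code of loop 1 as well.  */
  for (int i = 0; i < 4; i++)
    sum += d[i];

  return sum;
}
```

A region covering only the second loop's preheader to its exits would see the loads but not the stores that define them, which would match the behaviour described above.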