https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #32 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #31)
> (In reply to Kewen Lin from comment #29)
> > (In reply to Hongtao.liu from comment #28)
> > > > Probably you can try to tweak it in ix86_add_stmt_cost? when the statement
> > >
> > > Yes, it's the place.
> > >
> > > > is UB to UH conversion statement, further check if the def of the input UB
> > > > is MEM.
> > >
> > > Only if there's no multi-use for UB. More generally, it's quite difficult to
> > > guess later optimizations for the purpose of more accurate vectorization
> > > cost model, :(.
> >
> > Yeah, it's hard sadly. The generic cost modeling is rough,
> > ix86_add_stmt_cost is more fine-grain (at least than what we have on Power
> > :)), if you want to check it more, it seems doable in target specific hook
> > finish_cost where you can get the whole vinfo object, but it could end up
> > with very heavy analysis and might not be worthy.
> >
> > Do you mind to check if it can also fix this degradation on x86 to run FRE
> > and DSE just after cunroll? I found it worked for Power, hoped it can help
> > there too.
>
> Btw, we could try sth like adding a TODO_force_next_scalar_cleanup to be
> returned from passes that see cleanup opportunities and have the pass
> manager queue that up, looking for a special marked pass and enabling
> that so we could have
>
>       NEXT_PASS (pass_predcom);
>       NEXT_PASS (pass_complete_unroll);
>       NEXT_PASS (pass_scalar_cleanup);
>       PUSH_INSERT_PASSES_WITHIN (pass_scalar_cleanup);
>         NEXT_PASS (pass_fre, false /* may_iterate */);
>         NEXT_PASS (pass_dse);
>       POP_INSERT_PASSES ();
>
> with pass_scalar_cleanup gate() returning false otherwise. Eventually
> pass properties would match this better, or sth else.
>

Thanks for the suggestion!
Before cooking the patch, I have one question: it looks like updating a
function property alone would be enough. E.g., some pass sets a property such
as PROP_ok_for_cleanup, and the later pass_scalar_cleanup runs only for
functions that have this property (checked in its gate). I'm not quite sure
why the TODO flag TODO_force_next_scalar_cleanup is needed in addition.
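
To make the question concrete, here is a toy model of the property-only
variant. It is a sketch and not GCC code: the Function and Pass structs, the
run_pipeline loop and run_example are made up for illustration, and
PROP_ok_for_cleanup is only the hypothetical property name used above. The
real pass manager and its property handling work differently in detail.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical property bit; the name comes from the question above.
constexpr std::uint32_t PROP_ok_for_cleanup = 1u << 0;

// Stand-in for per-function state (not GCC's struct function).
struct Function {
  std::uint32_t properties = 0;
  bool unroll_exposes_cleanup = false;  // pretend analysis result
  std::vector<std::string> log;         // names of passes that ran
};

// Stand-in for a pass: gate decides whether execute runs.
struct Pass {
  std::string name;
  bool (*gate)(const Function &);
  void (*execute)(Function &);
};

bool gate_always(const Function &) { return true; }

// cunroll sets the property only when it saw a cleanup opportunity.
void exec_cunroll(Function &fn) {
  if (fn.unroll_exposes_cleanup)
    fn.properties |= PROP_ok_for_cleanup;
}

// The cleanup pass is gated purely on the property.
bool gate_scalar_cleanup(const Function &fn) {
  return (fn.properties & PROP_ok_for_cleanup) != 0;
}

// Runs the nested FRE and DSE steps, then consumes the request.
void exec_scalar_cleanup(Function &fn) {
  fn.log.push_back("fre");
  fn.log.push_back("dse");
  fn.properties &= ~PROP_ok_for_cleanup;
}

// Minimal pass loop: skip any pass whose gate returns false.
void run_pipeline(const std::vector<Pass> &passes, Function &fn) {
  for (const Pass &p : passes) {
    assert(p.gate && p.execute);
    if (!p.gate(fn))
      continue;
    fn.log.push_back(p.name);
    p.execute(fn);
  }
}

// Runs cunroll followed by the gated cleanup and returns the pass log.
std::vector<std::string> run_example(bool unroll_exposes_cleanup) {
  Function fn;
  fn.unroll_exposes_cleanup = unroll_exposes_cleanup;
  const std::vector<Pass> passes = {
      {"cunroll", gate_always, exec_cunroll},
      {"scalar_cleanup", gate_scalar_cleanup, exec_scalar_cleanup},
  };
  run_pipeline(passes, fn);
  return fn.log;
}
```

In this model run_example(true) runs the cleanup with its nested fre and dse
steps, while run_example(false) stops after cunroll. It only illustrates the
gating idea; it does not show whether the real pass manager needs the extra
TODO flag.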