https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789

--- Comment #32 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #31)
> (In reply to Kewen Lin from comment #29)
> > (In reply to Hongtao.liu from comment #28)
> > > > Probably you can try to tweak it in ix86_add_stmt_cost? when the 
> > > > statement
> > > 
> > > Yes, it's the place.
> > > 
> > > > is UB to UH conversion statement, further check if the def of the input 
> > > > UB
> > > > is MEM.
> > > 
> > > Only if there's no multi-use for UB. More generally, it's quite difficult 
> > > to
> > > guess later optimizations for the purpose of more accurate vectorization
> > > cost model, :(.
> > 
> > Yeah, it's hard sadly. The generic cost modeling is rough,
> > ix86_add_stmt_cost is more fine-grain (at least than what we have on Power
> > :)), if you want to check it more, it seems doable in target specific hook
> > finish_cost where you can get the whole vinfo object, but it could end up
> > with very heavy analysis and might not be worthy.
> > 
> > Do you mind to check if it can also fix this degradation on x86 to run FRE
> > and DSE just after cunroll? I found it worked for Power, hoped it can help
> > there too.
> 
> Btw, we could try sth like adding a TODO_force_next_scalar_cleanup to be
> returned from passes that see cleanup opportunities and have the pass
> manager queue that up, looking for a special marked pass and enabling
> that so we could have
> 
>           NEXT_PASS (pass_predcom);
>           NEXT_PASS (pass_complete_unroll);
>           NEXT_PASS (pass_scalar_cleanup);
>           PUSH_INSERT_PASSES_WITHIN (pass_scalar_cleanup);
>             NEXT_PASS (pass_fre, false /* may_iterate */);
>             NEXT_PASS (pass_dse);
>           POP_INSERT_PASSES ();
> 
> with pass_scalar_cleanup gate() returning false otherwise.  Eventually
> pass properties would match this better, or sth else.
> 

Thanks for the suggestion! Before cooking the patch, I have one question: it
looks like updating a function property alone would be enough, e.g. some pass
sets a property PROP_ok_for_cleanup and the later pass_scalar_cleanup runs only
for functions with this property (checked in its gate). I'm not quite sure
what the TODO flag TODO_force_next_scalar_cleanup is needed for.
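
To make the question concrete, here is a rough standalone sketch of the
property-based variant. It is a toy model only: Function, Pass, run_passes and
make_pipeline are made-up names for illustration and are not GCC's pass
manager API, and PROP_ok_for_cleanup is the hypothetical property mentioned
above. An unroll-like pass sets the property when it sees a cleanup
opportunity, and the cleanup pass checks it in its gate.

```cpp
// Toy model only: none of these types are GCC's real pass manager classes.
#include <functional>
#include <string>
#include <vector>

// Hypothetical property bit, named after the PROP_ok_for_cleanup idea above.
constexpr unsigned PROP_ok_for_cleanup = 1u << 0;

struct Function {
    unsigned properties = 0;  // stands in for the per-function property set
};

struct Pass {
    std::string name;
    std::function<bool(const Function &)> gate;  // should the pass run at all?
    std::function<void(Function &)> execute;     // the pass body
};

// Run each pass whose gate accepts the function; return the names that ran.
std::vector<std::string> run_passes(const std::vector<Pass> &passes,
                                    Function &fn) {
    std::vector<std::string> ran;
    for (const Pass &p : passes) {
        if (!p.gate(fn)) continue;  // gate rejected: skip this pass
        p.execute(fn);
        ran.push_back(p.name);
    }
    return ran;
}

// Build a two-pass pipeline: an unroll-like pass that may request cleanup,
// followed by a cleanup pass gated on that request.
std::vector<Pass> make_pipeline(bool unroll_finds_opportunity) {
    return {
        {"complete_unroll",
         [](const Function &) { return true; },
         [unroll_finds_opportunity](Function &fn) {
             // Record the cleanup opportunity on the function itself.
             if (unroll_finds_opportunity)
                 fn.properties |= PROP_ok_for_cleanup;
         }},
        {"scalar_cleanup",
         [](const Function &fn) {
             return (fn.properties & PROP_ok_for_cleanup) != 0;
         },
         [](Function &fn) {
             // Clear the request so a later cleanup does not rerun needlessly.
             fn.properties &= ~PROP_ok_for_cleanup;
         }},
    };
}
```

With make_pipeline(true) both passes run and the property is cleared again
afterwards; with make_pipeline(false) the cleanup gate returns false and only
the unroll-like pass runs.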
