https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104408

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #4)
> (In reply to Richard Biener from comment #3)
> > match.pd just does canonicalization here.  SLP discovery could handle this
> > in the existing swap operands or reassoc support but I guess the desire here
> > is to pull out a Complex SLP pattern.
> 
> Yes, though also to optimize the case where you don't have the optab;
> currently the generated code is much worse at -Ofast.
> 
> > 
> > So - no perfect idea yet how to reliably match a Complex pattern here but
> > trying to attack this from the match.pd side sounds wrong.
> 
> Well the problem is that the scalar code is suboptimal too, even without
> matching a complex pattern, so the epilogue here does an extra sub on each
> unrolled step.

Well, the issue is then why the scalar code is not optimized (yes, it's
not so easy).

> So I initially figured we'd want to not perform the canonicalization if it's
> coming at the expense of sharing. However that looks harder than I thought at
> first as there are multiple points in fold-const.c that will try and force
> this form.

Yep.  The canonicalization likely happens early before we do CSE.

> I can probably fix the epilogue post vectorization but that seemed like a
> worse solution.

Well, the CSE opportunity needs to be realized despite the canonicalization.
