On Mon, 13 Feb 2023, Jeff Law wrote:

> 
> 
> On 2/10/23 04:02, Richard Biener via Gcc-patches wrote:
> > This fixes an oversight made when removing the hard limits on using
> > generic vectors for the vectorizer to enable both SLP and BB
> > vectorization to use those.  The vectorizer relies on vector lowering
> > to expand plus, minus and negate to bit operations but vector
> > lowering has a hard limit on the minimum number of elements per
> > work item.  Vectorizer costs for the testcase at hand work out
> > to vectorize a loop with just two work items per vector and that
> > causes element wise expansion and spilling.
> > 
> > The fix for now is to reinstate the hard limit, matching what
> > vector lowering does.  For the future the way to go is to emit the
> > lowered sequence directly from the vectorizer instead.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
> > 
> > Thanks,
> > Richard.
> > 
> >  PR tree-optimization/108724
> >  * tree-vect-stmts.cc (vectorizable_operation): Avoid
> >  using word_mode vectors when vector lowering will
> >  decompose them to elementwise operations.
> > 
> >  * gcc.target/i386/pr108724.c: New testcase.
> OK.  Though can't this be a problem with logicals too?  Or is there something
> special about +- going on here?

Logical ops do not cross lanes even when using scalar operations on GPRs.
For +- you have to compute the MSB separately to avoid spilling over to
the next vector lane.
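
To illustrate the point, here is a minimal sketch of adding eight 8-bit lanes packed into one 64-bit GPR. It is not the sequence vector lowering actually emits; the function names (add_bytes_swar, add_bytes_ref, and_bytes) and the choice of 8-bit lanes in a 64-bit word are assumptions made for the example. The low seven bits of each lane are added with the MSBs masked off, so a carry can only land in the MSB position of the same lane, and the MSB is then computed separately with XOR.

```c
#include <stdint.h>

/* Illustrative only: eight 8-bit lanes packed in a 64-bit word. */
#define LANE_MSB 0x8080808080808080ULL

/* Lane-wise add using scalar operations on a single GPR.
   Step 1: add the low 7 bits of every lane; with the MSBs cleared,
           any carry stops in bit 7 of its own lane.
   Step 2: compute each lane's MSB separately as a ^ b ^ carry-in,
           where the carry-in is already sitting in bit 7 of `low`. */
uint64_t add_bytes_swar(uint64_t a, uint64_t b)
{
    uint64_t low = (a & ~LANE_MSB) + (b & ~LANE_MSB);
    return low ^ ((a ^ b) & LANE_MSB);
}

/* Reference: extract each lane, add modulo 256, and repack. */
uint64_t add_bytes_ref(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t x = (uint8_t)(a >> (8 * i));
        uint8_t y = (uint8_t)(b >> (8 * i));
        r |= (uint64_t)(uint8_t)(x + y) << (8 * i);
    }
    return r;
}

/* A logical op needs no such care: each result bit depends only on
   the two input bits at the same position, so lanes never interact. */
uint64_t and_bytes(uint64_t a, uint64_t b)
{
    return a & b;
}
```

A plain a + b on the same inputs would let a lane that overflows carry into its neighbour: 0xFF + 0x01 in the lowest lane gives 0x100 with an ordinary add, but 0x00 in that lane and an untouched next lane with add_bytes_swar.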

Richard.
