On Mon, 13 Feb 2023, Jeff Law wrote:

> On 2/10/23 04:02, Richard Biener via Gcc-patches wrote:
> > This fixes an oversight to when removing the hard limits on using
> > generic vectors for the vectorizer to enable both SLP and BB
> > vectorization to use those.  The vectorizer relies on vector lowering
> > to expand plus, minus and negate to bit operations but vector
> > lowering has a hard limit on the minimum number of elements per
> > work item.  Vectorizer costs for the testcase at hand work out
> > to vectorize a loop with just two work items per vector and that
> > causes element wise expansion and spilling.
> >
> > The fix for now is to re-instantiate the hard limit, matching what
> > vector lowering does.  For the future the way to go is to emit the
> > lowered sequence directly from the vectorizer instead.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
> >
> > Thanks,
> > Richard.
> >
> > 	PR tree-optimization/108724
> > 	* tree-vect-stmts.cc (vectorizable_operation): Avoid
> > 	using word_mode vectors when vector lowering will
> > 	decompose them to elementwise operations.
> >
> > 	* gcc.target/i386/pr108724.c: New testcase.
> OK.  Though can't this be a problem with logicals too?  Or is there something
> special about +- going on here?

Logical ops do not cross lanes even when using scalar operations
on GPRs.  For +- you have to compute the MSB separately to avoid
spilling over to the next vector lane.

Richard.
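
To illustrate the +- point, here is a minimal sketch assuming 8-bit lanes
packed into a 64-bit GPR.  The names LANE_MSB, swar_and8 and swar_add8 are
made up for this example, and this is not claimed to be the exact sequence
vector lowering emits, only the general shape of the trick.

```c
#include <stdint.h>

/* Hypothetical mask selecting the most significant bit of each 8-bit lane.  */
#define LANE_MSB 0x8080808080808080ULL

/* A logical op on the whole GPR is already lane-wise: bit i of the
   result depends only on bit i of the inputs.  */
static uint64_t
swar_and8 (uint64_t a, uint64_t b)
{
  return a & b;
}

/* A plain a + b would let a carry out of one lane leak into the next.
   Add the low 7 bits of every lane (the carry stops in bit 7 of the same
   lane), then compute each lane's MSB separately with XOR.  */
static uint64_t
swar_add8 (uint64_t a, uint64_t b)
{
  uint64_t low = (a & ~LANE_MSB) + (b & ~LANE_MSB);
  uint64_t msb = (a ^ b) & LANE_MSB;
  return low ^ msb;
}
```

With a = 0xff and b = 0x01 in the lowest lane, a plain 64-bit add gives
0x100, i.e. the carry lands in the neighbouring lane, while swar_add8
wraps the low lane to 0x00 and leaves the other lanes alone.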