https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110979

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
The wrong-code part is fixed now; what remains is the inefficiency.  I don't
think we currently cost the "excess" lanes in regular vectorized operations, but
for open-coded fold-left reductions we should probably account for up to
VF - 1 extra scalar ops (costed in the "epilog", even if no epilog exists,
since they only apply to the last vector iteration).  I fear that's not
going to be enough to fend off vectorization, though.
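
To illustrate the "excess" lanes, here is a minimal scalar model of an
open-coded fold-left reduction over fully masked vector iterations.  It is
only a sketch under assumptions: VF is fixed at 4, inactive lanes are
replaced by -0.0 as a neutral addend, and the function names and the
excess_ops counter are hypothetical.  None of this is GCC's actual
implementation or cost model.

```c
#include <stddef.h>

enum { VF = 4 };  /* assumed vectorization factor, for illustration only */

/* Scalar reference: strict left-to-right (in-order) float sum. */
float fold_left_scalar(const float *a, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += a[i];
    return acc;
}

/* Model of an open-coded fold-left reduction in a fully masked loop.
   Every vector iteration performs VF in-order scalar adds; lanes past
   the end of the array are inactive and contribute -0.0, which leaves
   the accumulator unchanged.  excess_ops counts the adds spent on
   inactive lanes (hypothetical bookkeeping, not a GCC data structure). */
float fold_left_masked(const float *a, size_t n, size_t *excess_ops)
{
    float acc = 0.0f;
    *excess_ops = 0;
    for (size_t i = 0; i < n; i += VF) {
        for (size_t lane = 0; lane < VF; lane++) {
            int active = (i + lane) < n;
            float x = active ? a[i + lane] : -0.0f;
            acc += x;              /* executed even for inactive lanes */
            if (!active)
                (*excess_ops)++;
        }
    }
    return acc;
}
```

In this model only the last vector iteration can contain inactive lanes, so
the count never exceeds VF - 1: with n = 5 the second iteration has one
active lane and three excess adds, while n = 8 has none.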
