https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95058

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-05-12

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, so for non 7 BE we end up not vectorizing because it doesn't look
profitable, which IMHO is good.  It would be nice to also see dumps before the
respective rev. because in theory (well...) the cost computation should be the
same.
Ah!  OK, so we now have

0x10002001470 _1 1 times vec_construct costs 2 in prologue
0x10002001470 _1 1 times vec_construct costs 2 in prologue
0x10002001470 _1 2 times vector_store costs 2 in body
0x10001ecfcc0 _1 1 times scalar_store costs 1 in body
0x10001ecfcc0 _2 1 times scalar_store costs 1 in body
0x10001ecfcc0 _3 1 times scalar_store costs 1 in body
0x10001ecfcc0 _4 1 times scalar_store costs 1 in body

that is, the SLP graph has the expected cost.  Originally we likely
had costed against 4 scalar stores and 4 scalar loads (but the scalar
loads will still be there).  On x86_64 we get

0x3975280 _1 1 times vec_construct costs 8 in prologue
0x3975280 _1 1 times vec_construct costs 8 in prologue
0x3975280 _1 2 times vector_store costs 24 in body
0x3942cb0 _1 1 times scalar_store costs 12 in body
0x3942cb0 _2 1 times scalar_store costs 12 in body
0x3942cb0 _3 1 times scalar_store costs 12 in body
0x3942cb0 _4 1 times scalar_store costs 12 in body

so it's still profitable there.
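
To make the comparison behind the two dumps explicit, here is a minimal sketch
that sums the quoted cost lines and compares the vector side against the scalar
side.  It assumes that the "costs N" figure is already the total for the entry
(not a per-statement cost to be multiplied by the count) and that vectorization
counts as profitable when the vector total is strictly below the scalar total.
The names totals and profitable are hypothetical helpers for this illustration,
not GCC functions.

```python
import re

# Matches one cost line of the form quoted in this comment, e.g.
# "0x3975280 _1 2 times vector_store costs 24 in body".
LINE = re.compile(r"^\S+ \S+ (\d+) times (\w+) costs (\d+) in (\w+)$")

def totals(dump):
    """Return (vector_cost, scalar_cost) summed over all dump lines."""
    vector = scalar = 0
    for line in dump.strip().splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue
        kind, cost = m.group(2), int(m.group(3))
        # Assumption: "costs N" is the total for the entry, so the
        # "N times" count is not multiplied in again.
        if kind.startswith("scalar_"):
            scalar += cost
        else:
            vector += cost  # vec_construct, vector_store
    return vector, scalar

def profitable(dump):
    """Assumed criterion: vector total strictly below scalar total."""
    vector, scalar = totals(dump)
    return vector < scalar

PPC = """
0x10002001470 _1 1 times vec_construct costs 2 in prologue
0x10002001470 _1 1 times vec_construct costs 2 in prologue
0x10002001470 _1 2 times vector_store costs 2 in body
0x10001ecfcc0 _1 1 times scalar_store costs 1 in body
0x10001ecfcc0 _2 1 times scalar_store costs 1 in body
0x10001ecfcc0 _3 1 times scalar_store costs 1 in body
0x10001ecfcc0 _4 1 times scalar_store costs 1 in body
"""

X86_64 = """
0x3975280 _1 1 times vec_construct costs 8 in prologue
0x3975280 _1 1 times vec_construct costs 8 in prologue
0x3975280 _1 2 times vector_store costs 24 in body
0x3942cb0 _1 1 times scalar_store costs 12 in body
0x3942cb0 _2 1 times scalar_store costs 12 in body
0x3942cb0 _3 1 times scalar_store costs 12 in body
0x3942cb0 _4 1 times scalar_store costs 12 in body
"""

for name, dump in (("ppc", PPC), ("x86_64", X86_64)):
    v, s = totals(dump)
    print(name, "vector", v, "scalar", s, "profitable", profitable(dump))
```

Under those assumptions the ppc dump gives 6 for the vector side against 4 for
the scalar side, and the x86_64 dump gives 40 against 48, which matches the
outcomes described above.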

Note that I suggest leaving the FAILs in place for now, since in my dev tree
I see the vec_construct gone again, so the test would start passing again
on ppc as well.

Sorry for the intermediate breakage.
