https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95058
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
             Status|UNCONFIRMED                   |ASSIGNED
     Ever confirmed|0                             |1
   Last reconfirmed|                              |2020-05-12

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, so for non 7 BE we end up not vectorizing because it doesn't look
profitable which IMHO is good.  It would be nice to also see dumps before
the respective rev. because in theory (well...) the cost computation should
be the same.

Ah!  OK, so we now have

0x10002001470 _1 1 times vec_construct costs 2 in prologue
0x10002001470 _1 1 times vec_construct costs 2 in prologue
0x10002001470 _1 2 times vector_store costs 2 in body
0x10001ecfcc0 _1 1 times scalar_store costs 1 in body
0x10001ecfcc0 _2 1 times scalar_store costs 1 in body
0x10001ecfcc0 _3 1 times scalar_store costs 1 in body
0x10001ecfcc0 _4 1 times scalar_store costs 1 in body

that is, the SLP graph has the expected cost.  Originally we likely had
costed against 4 scalar stores and 4 scalar loads (but the scalar loads
will still be there).  On x86_64 we get

0x3975280 _1 1 times vec_construct costs 8 in prologue
0x3975280 _1 1 times vec_construct costs 8 in prologue
0x3975280 _1 2 times vector_store costs 24 in body
0x3942cb0 _1 1 times scalar_store costs 12 in body
0x3942cb0 _2 1 times scalar_store costs 12 in body
0x3942cb0 _3 1 times scalar_store costs 12 in body
0x3942cb0 _4 1 times scalar_store costs 12 in body

so it's still profitable there.

Note I suggest to leave the FAILs in place for now since in my dev tree
I see the vec_construct gone again so it would start passing again on
ppc as well.  Sorry for the intermediate breakage.
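
For readers following the arithmetic, the comparison can be sketched as below. This is only an illustration, not the vectorizer's actual logic. It assumes that the "costs N" field of each dump line is already the total for that line, that the first three lines of each dump are the vector side and the last four the scalar side, and that "profitable" simply means the vector total is lower than the scalar total. The names total_cost and is_profitable are made up for this sketch.

```python
import re

# Matches the tail of a dump line such as
# "0x3975280 _1 2 times vector_store costs 24 in body".
LINE_RE = re.compile(r"(\d+) times (\w+) costs (\d+) in (\w+)")

def total_cost(dump: str) -> int:
    """Sum the 'costs N' field over all cost lines in a dump excerpt.

    Assumption: N is the total cost of the line, not a per-statement cost.
    """
    return sum(int(m.group(3)) for m in LINE_RE.finditer(dump))

def is_profitable(vector_dump: str, scalar_dump: str) -> bool:
    """Simplified check (assumption): vectorize only if strictly cheaper."""
    return total_cost(vector_dump) < total_cost(scalar_dump)

# Dump lines copied from the comment; the vector/scalar split is an assumption.
PPC_VECTOR = """
0x10002001470 _1 1 times vec_construct costs 2 in prologue
0x10002001470 _1 1 times vec_construct costs 2 in prologue
0x10002001470 _1 2 times vector_store costs 2 in body
"""
PPC_SCALAR = """
0x10001ecfcc0 _1 1 times scalar_store costs 1 in body
0x10001ecfcc0 _2 1 times scalar_store costs 1 in body
0x10001ecfcc0 _3 1 times scalar_store costs 1 in body
0x10001ecfcc0 _4 1 times scalar_store costs 1 in body
"""
X86_VECTOR = """
0x3975280 _1 1 times vec_construct costs 8 in prologue
0x3975280 _1 1 times vec_construct costs 8 in prologue
0x3975280 _1 2 times vector_store costs 24 in body
"""
X86_SCALAR = """
0x3942cb0 _1 1 times scalar_store costs 12 in body
0x3942cb0 _2 1 times scalar_store costs 12 in body
0x3942cb0 _3 1 times scalar_store costs 12 in body
0x3942cb0 _4 1 times scalar_store costs 12 in body
"""

if __name__ == "__main__":
    for name, vec, sca in (("ppc", PPC_VECTOR, PPC_SCALAR),
                           ("x86_64", X86_VECTOR, X86_SCALAR)):
        print(name, total_cost(vec), total_cost(sca), is_profitable(vec, sca))
```

Under these assumptions the first dump sums to 6 for the vector side against 4 for the scalar side, and the x86_64 dump to 40 against 48, which is consistent with the conclusion that vectorization is rejected in the first case and still profitable on x86_64.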