[Bug middle-end/95669] -O3 generates more complicated code to return 8-byte struct of zeros, sometimes

2021-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95669

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement

[Bug middle-end/95669] -O3 generates more complicated code to return 8-byte struct of zeros, sometimes

2020-06-15 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95669

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c++                         |middle-end
             Target|                            |x86_64-*-*
   Last reconfirmed|                            |2020-06-15
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |missed-optimization

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
With 'dummy' and its implicit zero initialization we retain

   [local count: 1073741824]:
  if (a_3(D) < b_4(D))
    goto ; [50.00%]
  else
    goto ; [50.00%]

   [local count: 536870913]:
  D.2350 = {};
  goto ; [100.00%]

   [local count: 536870913]:
  _1 = a_3(D) * b_4(D);
  D.2350.val = _1;
  MEM  [(void *) + 4B] = 1;

   [local count: 1073741824]:
  return D.2350;

which generates straightforward code, while with 'dummy' elided we manage
to completely scalarize things and do

   [local count: 1073741824]:
  if (a_3(D) < b_4(D))
    goto ; [50.00%]
  else
    goto ; [50.00%]

   [local count: 536870913]:
  _1 = a_3(D) * b_4(D);

   [local count: 1073741824]:
  # cstore_11 = PHI <_1(3), 0(2)>
  # cstore_10 = PHI <1(3), 0(2)>
  D.2349.ok = cstore_10;
  D.2349.val = cstore_11;
  return D.2349;

which is basically two conditional moves that we expand via strange bit
shufflings because D.2349 (struct res) is assigned a register.
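
For readers without the test case at hand, the difference between the two
dumps can be sketched at source level. This is only an illustration, not the
reporter's code: the layout of struct res ('val' at offset 0, 'ok' at offset
4), the function names branchy and scalarized, and the assumption that the
multiply sits on the a < b path are all hypothetical, chosen to match the
field names and PHI nodes shown above.

```cpp
// Hypothetical layout matching the field names in the dumps; the original
// test case (including 'dummy') is not quoted in this comment.
struct res {
    int val;
    bool ok;
};

// Branchy form, comparable to the first dump: one path zero-initializes
// the whole struct, the other path stores both fields.
res branchy(int a, int b) {
    if (a < b)
        return {a * b, true};
    return {};
}

// Scalarized form, comparable to the second dump: each field is selected
// independently on the same condition, mirroring the two PHI nodes.
res scalarized(int a, int b) {
    const bool taken = a < b;
    const int val = taken ? a * b : 0;  // cstore_11 = PHI <_1, 0>
    const bool ok = taken;              // cstore_10 = PHI <1, 0>
    return {val, ok};
}
```

Both functions return the same value for every input; what differs is only
how the result is assembled, which is the part that the expansion of the
register-allocated struct appears to handle poorly.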