[Bug tree-optimization/115097] Strange suboptimal codegen specifically at -O2 when copying struct type

rguenth at gcc dot gnu.org via Gcc-bugs Tue, 14 May 2024 23:50:35 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115097


Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |15.0
          Component|c                           |tree-optimization
                 CC|                            |jamborm at gcc dot gnu.org
   Last reconfirmed|                            |2024-05-15
             Status|UNCONFIRMED                 |NEW
             Target|                            |x86_64-*-*
           Keywords|                            |missed-optimization
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  The IL difference is

struct A test1 (struct A & a)
{ 
  struct A D.2842;

  <bb 2> [local count: 1073741824]:
  D.2842 = MEM[(const struct A &)a_2(D)];
  return D.2842;

vs

struct A test2 (struct A & a)
{
  struct A D.2873;
  struct A retval.4;

  <bb 2> [local count: 1073741824]:
  D.2873 = MEM[(const struct A &)a_2(D)];
  retval.4 = D.2873;
  return retval.4;

so test2 has an additional aggregate copy.  At -O2, SRA scalarizes that
copy and we are not able to elide the resulting code on RTL, whereas
without SRA we handle this fine.

SRA makes test2 into

struct A test2 (struct A & a)
{ 
  short int SR.12;
  int SR.11;
  struct A retval.4;

  <bb 2> [local count: 1073741824]:
  SR.11_3 = MEM[(const struct A &)a_2(D)].a;
  SR.12_6 = MEM[(const struct A &)a_2(D)].b;
  retval.4.a = SR.11_3;
  retval.4.b = SR.12_6;
  return retval.4;
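
For illustration only, the two IL shapes can be written back as source-level
C++.  This is a rough sketch, not the testcase from the report: the struct
definition is assumed from the member names and types in the SRA dump
(int a, short int b), and the function names copy_once and copy_scalarized
are hypothetical.

```cpp
#include <cassert>

// Assumed stand-in for the reporter's struct: only the member names and
// types (int a, short b) are taken from the SRA dump above.
struct A {
    int a;
    short b;
};

// Roughly the shape of test1's IL: a single aggregate copy from the
// referenced object into a temporary, which is then returned.
A copy_once(const A &src) {
    A tmp = src;        // D.2842 = MEM[(const struct A &)a_2(D)];
    return tmp;
}

// Roughly the shape of test2's IL after SRA: the intermediate aggregate
// is replaced by one scalar per member, and the return value is rebuilt
// from those scalars member by member.
A copy_scalarized(const A &src) {
    int sr_a = src.a;   // SR.11_3 = MEM[...].a;
    short sr_b = src.b; // SR.12_6 = MEM[...].b;
    A retval;
    retval.a = sr_a;    // retval.4.a = SR.11_3;
    retval.b = sr_b;    // retval.4.b = SR.12_6;
    return retval;
}
```

Both functions return the same value; the difference discussed here is only
in how the copy is expressed, which is what the later RTL passes then fail
to clean up in the scalarized form.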


The extra copy is introduced during gimplification; the GENERIC looks the
same (but of course there's a hidden difference):

;; Function A test1(A&) (null)
;; enabled by -tree-original


<<cleanup_point return <retval> = TARGET_EXPR <D.2823, *(const struct A &)
a>>>;


;; Function A test2(A&&) (null)
;; enabled by -tree-original


<<cleanup_point return <retval> = TARGET_EXPR <D.2833, *(const struct A &)
a>>>;
