[Bug tree-optimization/87008] New: [8/9 Regression] gimple mem-to-mem assignment badly optimized

glisse at gcc dot gnu.org Sat, 18 Aug 2018 08:38:08 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87008


            Bug ID: 87008
           Summary: [8/9 Regression] gimple mem-to-mem assignment badly
                    optimized
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: glisse at gcc dot gnu.org
  Target Milestone: ---

Created attachment 44554
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44554&action=edit
preprocessed testcase

Original testcase posted at
https://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2018/08/msg00012.html

#include <Eigen/Dense>

using Vec = Eigen::Matrix<double, 2, 1>;

Vec f ()
{
  Vec sum = Vec::Zero();
  for (int i = 0; i < 1024; ++i)
  {
    const Vec dirA = sum;
    const Vec dirB = dirA;

    sum += dirA.dot(dirB) * dirA;
  }
  return sum;
}

Compiled on x86_64 with Eigen 3.3.4 and -DEIGEN_DONT_VECTORIZE -O3.

The .optimized dump contains

  MEM[(struct DenseStorage *)&dirA].m_data = MEM[(const struct DenseStorage &)sum_5(D)].m_data;
  dirA_18 = MEM[(struct plain_array *)&dirA];
  dirA$8_3 = MEM[(struct plain_array *)&dirA + 8B];
  MEM[(struct DenseStorage *)&dirB].m_data = MEM[(const struct DenseStorage &)&dirA].m_data;
  dirB_35 = MEM[(struct plain_array *)&dirB];
  dirB$8_48 = MEM[(struct plain_array *)&dirB + 8B];

which translates to

        movdqu  (%rax), %xmm1
        movaps  %xmm1, -40(%rsp)
        movsd   -40(%rsp), %xmm2
        movsd   -32(%rsp), %xmm0
        movaps  %xmm1, -24(%rsp)
        movsd   -16(%rsp), %xmm1
        movsd   -24(%rsp), %xmm5

This is clearly quite bad: we should, for instance, CSE dirA_18 and dirB_35,
since dirB is a plain copy of dirA. This is yet another case where the gimple
optimizers have a hard time handling mem-to-mem assignment. I think we have
relevant code in vn_reference_lookup_3 (case 5 in particular). ESRA (early SRA)
used to help.
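
To make the expected result concrete, here is a minimal Eigen-free sketch. The
names Vec2, f_copies and f_scalarized are hypothetical and not part of the
testcase, and the start value and iteration count are parameters only so the
code can be exercised with non-trivial input. f_copies mirrors the two
whole-object copies in f(); f_scalarized is roughly what one would hope the
optimizers reduce it to once the loads through dirA and dirB are CSE'd. I have
not checked whether this reduced form reproduces the regression.

```cpp
// Hypothetical stand-in for Eigen::Matrix<double, 2, 1>: a plain aggregate
// wrapping a two-element array, so copies are whole-object assignments.
struct Vec2 {
  double v[2];
};

// Same shape as f() in the report: two aggregate copies per iteration,
// followed by element-wise loads from both copies.
Vec2 f_copies(Vec2 sum, int iterations) {
  for (int i = 0; i < iterations; ++i) {
    const Vec2 dirA = sum;   // mem-to-mem copy of sum
    const Vec2 dirB = dirA;  // second copy holding the same values
    const double dot = dirA.v[0] * dirB.v[0] + dirA.v[1] * dirB.v[1];
    sum.v[0] += dot * dirA.v[0];
    sum.v[1] += dot * dirA.v[1];
  }
  return sum;
}

// Hand-scalarized equivalent: no temporaries, every value stays in a scalar.
Vec2 f_scalarized(Vec2 sum, int iterations) {
  double s0 = sum.v[0];
  double s1 = sum.v[1];
  for (int i = 0; i < iterations; ++i) {
    const double dot = s0 * s0 + s1 * s1;  // dirA.dot(dirB) with dirB == dirA
    s0 += dot * s0;
    s1 += dot * s1;
  }
  return Vec2{s0, s1};
}
```

Compiling both functions with -O3 -fdump-tree-optimized and comparing the
dumps would show whether the copies in f_copies survive as stack stores and
reloads, as they do in the Eigen testcase.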

I would also have expected better from RTL optimization, but that may be too
optimistic.
