https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87008
            Bug ID: 87008
           Summary: [8/9 Regression] gimple mem-to-mem assignment badly optimized
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: glisse at gcc dot gnu.org
  Target Milestone: ---

Created attachment 44554
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44554&action=edit
preprocessed testcase

Original testcase posted at
https://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2018/08/msg00012.html

#include <Eigen/Dense>

using Vec = Eigen::Matrix<double, 2, 1>;

Vec f ()
{
  Vec sum = Vec::Zero();
  for (int i = 0; i < 1024; ++i)
  {
    const Vec dirA = sum;
    const Vec dirB = dirA;
    sum += dirA.dot(dirB) * dirA;
  }
  return sum;
}

compiled on x86_64 with eigen 3.3.4, -DEIGEN_DONT_VECTORIZE -O3.

The .optimized dump contains

  MEM[(struct DenseStorage *)&dirA].m_data = MEM[(const struct DenseStorage &)sum_5(D)].m_data;
  dirA_18 = MEM[(struct plain_array *)&dirA];
  dirA$8_3 = MEM[(struct plain_array *)&dirA + 8B];
  MEM[(struct DenseStorage *)&dirB].m_data = MEM[(const struct DenseStorage &)&dirA].m_data;
  dirB_35 = MEM[(struct plain_array *)&dirB];
  dirB$8_48 = MEM[(struct plain_array *)&dirB + 8B];

which translates to

        movdqu  (%rax), %xmm1
        movaps  %xmm1, -40(%rsp)
        movsd   -40(%rsp), %xmm2
        movsd   -32(%rsp), %xmm0
        movaps  %xmm1, -24(%rsp)
        movsd   -16(%rsp), %xmm1
        movsd   -24(%rsp), %xmm5

This is clearly quite bad, we should for instance CSE dirA_18 and dirB_35. This
is yet another case where gimple optimizers have a hard time handling
mem-to-mem assignment. I think we have relevant code in vn_reference_lookup_3
(case 5 in particular). ESRA used to help. I would also have expected better
from RTL optimization, but that may be too optimistic.
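
For readers without Eigen at hand, the shape of the loop can be sketched with a plain aggregate. This is only an illustration of the copy chain (sum to dirA, dirA to dirB, then scalar reads of both copies); Vec2, dot and accumulate are hypothetical names that do not appear in the testcase, the start value is passed in as a parameter so that the result is not trivially zero, and it has not been verified that this version produces the same .optimized dump, since Eigen's DenseStorage and plain_array wrappers may be part of what triggers the problem.

```cpp
// Hypothetical stand-in for Eigen::Matrix<double, 2, 1>: a plain aggregate
// of two doubles, so that copying it is a whole-object assignment.
struct Vec2 {
    double data[2];
};

// Scalar dot product reading both operands element by element.
static double dot(const Vec2 &a, const Vec2 &b) {
    return a.data[0] * b.data[0] + a.data[1] * b.data[1];
}

// Same shape as f() in the testcase: two chained aggregate copies followed
// by element-wise reads of each copy. Ideally the loads from dirB would be
// recognized as equal to the loads from dirA.
Vec2 accumulate(Vec2 sum, int iterations) {
    for (int i = 0; i < iterations; ++i) {
        const Vec2 dirA = sum;   // aggregate copy of the accumulator
        const Vec2 dirB = dirA;  // aggregate copy of the copy
        const double scale = dot(dirA, dirB);
        sum.data[0] += scale * dirA.data[0];
        sum.data[1] += scale * dirA.data[1];
    }
    return sum;
}
```

Calling accumulate(Vec2{{1.0, 0.0}}, 2) gives {10, 0}: the first iteration turns 1 into 1 + 1*1 = 2, the second turns 2 into 2 + 4*2 = 10. Compiling the function with -O3 -fdump-tree-optimized shows whether the copies and the duplicate loads survive for this simplified form.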