https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99912

--- Comment #5 from Erik Schnetter <schnetter at gmail dot com> ---
As you suggested, the problem is probably not caused by register spills, but by
stores into a struct that are not optimized away. In this case, the respective
struct elements are unused in the code.

I traced the results of the first __builtin_ia32_maskloadpd256:

  _63940 = __builtin_ia32_maskloadpd256 (_63955, prephitmp_86203);
  MEM <const vector(4) double> [(struct mat3 *)&vars + 992B] = _63940;
  _178613 = .FMA (_63940, _64752, _178609);
  MEM <const vector(4) double> [(struct mat3 *)&vars + 1312B] = _63940;

The respective struct locations (+ 992B, + 1312B) are indeed not used anywhere
else.
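
To illustrate the pattern, here is a reduced, hypothetical sketch in C. It is
not the code from this report: the struct vars_sketch and the function
sum_used are made-up names, and the real z4c_vars struct is far larger. A
loaded value is stored into several fields of a local struct, but only one
field is ever read back, so the other stores are dead.

```c
/* Hypothetical stand-in for the much larger z4c_vars struct. */
struct vars_sketch {
    double used[4];      /* written and later read */
    double unused_a[4];  /* written, never read */
    double unused_b[4];  /* written, never read */
};

/* Copies src into three fields of a local struct, then sums only one field. */
double sum_used(const double *src)
{
    struct vars_sketch vars;

    for (int i = 0; i < 4; ++i) {
        double v = src[i];
        vars.used[i] = v;
        vars.unused_a[i] = v;  /* dead store: this field is never read */
        vars.unused_b[i] = v;  /* dead store: this field is never read */
    }

    double s = 0.0;
    for (int i = 0; i < 4; ++i)
        s += vars.used[i];
    return s;
}
```

In a case this small I would expect the optimizer to remove the dead stores;
the sketch only shows the shape of the code, not a reproducer. Compiling with
-O2 -fdump-tree-optimized and looking for MEM stores into vars is one way to
check what remains.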

The struct is of type z4c_vars. It and its parent are defined in lines 279837
to 280818. It is large.

Is there, e.g., a parameter I could set to make GCC try harder to avoid
unnecessary stores?
