https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107916

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
            Summary|PPC VSX code generation for |vector_size(32) is
                   |OpenZFS                     |inefficient for VSX on
                   |                            |powerpc64
          Component|target                      |middle-end

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Reduced testcase:
```
#include <stdint.h>

typedef uint32_t u32x4 __attribute__ ((vector_size (16)));

typedef uint32_t u32x8 __attribute__ ((vector_size (32)));
typedef uint64_t u64x4 __attribute__ ((vector_size (32)));

#pragma GCC push_options

#if defined(__x86_64__)

#ifdef __clang_major__
#pragma clang attribute push(__attribute__((target("avx2"))), \
  apply_to = function)
#else
#pragma GCC target ("avx2")
#endif

#elif defined(__powerpc64__)
#ifdef __clang_major__
#pragma clang attribute push(__attribute__((target("vsx,block-ops-unaligned-vsx,power8-vector"))), \
  apply_to = function)
#else
#pragma GCC target ("vsx,block-ops-unaligned-vsx,power8-vector,power9-vector")
#endif

#endif

void f(int n, u32x8 *a, u32x8 *b)
{
  u32x8 c = {0};
  for(int i = 0; i < n; i++)
     c+=*a;
  *b += c;
}
#ifdef __clang_major__
#if defined(__x86_64__) || defined(__powerpc64__)
#pragma clang attribute pop
#endif
#else
#pragma GCC pop_options
#endif
```
Basically, what is going wrong is that c is being forced onto the stack. VSX
registers are only 128 bits wide, so I had expected c's PHI node to be split
into two 16-byte halves during vector lowering.
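
To make the expectation concrete, here is a source-level sketch of what a split
accumulator would look like. It is only an illustration, not what the vector
lowering pass actually emits: `f_split` and `split_matches` are hypothetical
names, the target pragmas are left out so it builds anywhere, and the halves
are moved with memcpy purely to sidestep aliasing questions.
```c
#include <stdint.h>
#include <string.h>

typedef uint32_t u32x4 __attribute__ ((vector_size (16)));
typedef uint32_t u32x8 __attribute__ ((vector_size (32)));

/* The loop from the reduced testcase, unchanged. */
void f(int n, u32x8 *a, u32x8 *b)
{
  u32x8 c = {0};
  for(int i = 0; i < n; i++)
     c+=*a;
  *b += c;
}

/* Hypothetical hand-split version: the 32-byte accumulator is carried
   through the loop as two independent 16-byte accumulators, each of
   which fits a single 128-bit vector register. */
void f_split(int n, u32x8 *a, u32x8 *b)
{
  const unsigned char *ap = (const unsigned char *) a;
  unsigned char *bp = (unsigned char *) b;
  u32x4 c_lo = {0}, c_hi = {0};

  for(int i = 0; i < n; i++)
    {
      u32x4 a_lo, a_hi;
      memcpy(&a_lo, ap, sizeof a_lo);                /* elements 0..3 */
      memcpy(&a_hi, ap + sizeof a_lo, sizeof a_hi);  /* elements 4..7 */
      c_lo += a_lo;
      c_hi += a_hi;
    }

  u32x4 b_lo, b_hi;
  memcpy(&b_lo, bp, sizeof b_lo);
  memcpy(&b_hi, bp + sizeof b_lo, sizeof b_hi);
  b_lo += c_lo;
  b_hi += c_hi;
  memcpy(bp, &b_lo, sizeof b_lo);
  memcpy(bp + sizeof b_lo, &b_hi, sizeof b_hi);
}

/* Returns 1 if both versions leave the same bytes in *b for this n. */
int split_matches(int n)
{
  u32x8 a = {1, 2, 3, 4, 5, 6, 7, 0xffffffffu};
  u32x8 b1 = {10, 20, 30, 40, 50, 60, 70, 80};
  u32x8 b2 = b1;
  f(n, &a, &b1);
  f_split(n, &a, &b2);
  return memcmp(&b1, &b2, sizeof b1) == 0;
}
```
Comparing the code generated for f and f_split on powerpc64 with the VSX
options from the testcase should show whether the stack traffic comes from the
unsplit 32-byte accumulator.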
