Hi all,

I've been looking at implementing the complex multiply patterns for the amdgcn port, but I'm not getting the code I was hoping for. When I try to use the patterns on x86_64 or AArch64 they don't seem to work there either, so is there something wrong with the middle-end? I've tried both current HEAD and GCC 11.

The example shown in the internals manual is a simple loop multiplying two arrays of complex numbers and writing the results to a third. I had expected it to use the largest vectorization factor available, with the real and imaginary parts in even and odd lanes as described, but the vectorization factor is only 2 (i.e. a single complex number), and I have to set -fvect-cost-model=unlimited to get even that.
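To make the expected lane layout concrete, here is a scalar sketch of the operation on interleaved data. The helper name cmul_interleaved is made up for illustration and is not anything in GCC, and the arithmetic is the plain textbook formula, so it ignores the NaN/infinity recovery that C's Annex G complex multiplication performs.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only: multiply n complex numbers
   stored interleaved as re, im, re, im, ... so that real parts sit in
   even lanes and imaginary parts in odd lanes.  */
void cmul_interleaved(const double *a, const double *b, double *c, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      double ar = a[2 * i], ai = a[2 * i + 1];
      double br = b[2 * i], bi = b[2 * i + 1];

      c[2 * i]     = ar * br - ai * bi;  /* even lane: real part */
      c[2 * i + 1] = ar * bi + ai * br;  /* odd lane: imaginary part */
    }
}
```

With a vector of 2*N doubles, I would have hoped for N iterations of this loop body to be covered by one cmul instruction, rather than just one.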

I tried another example with SLP and that too uses the cmul patterns only for a single real/imaginary pair.

Did proper vectorization of cmul ever really work? There is a case in the testsuite for the pattern match, but it isn't in a loop.

Thanks

Andrew

P.S. I attached my testcase, in case I'm doing something stupid.

P.P.S. The manual says the pattern is "cmulm4", etc., but it's actually "cmulm3" in the implementation.
typedef _Complex double complexT;
#define arraysize 256

void f(complexT a[restrict arraysize],
       complexT b[restrict arraysize],
       complexT c[restrict arraysize])
{
#if defined(LOOP)
  for (int i = 0; i < arraysize; i++)
    c[i] = a[i] * b[i];
#else

    c[0] = a[0] * b[0];
    c[1] = a[1] * b[1];
    c[2] = a[2] * b[2];
    c[3] = a[3] * b[3];
    c[4] = a[4] * b[4];
    c[5] = a[5] * b[5];
    c[6] = a[6] * b[6];
    c[7] = a[7] * b[7];
    c[8] = a[8] * b[8];
    c[9] = a[9] * b[9];
    c[10] = a[10] * b[10];
    c[11] = a[11] * b[11];
    c[12] = a[12] * b[12];
    c[13] = a[13] * b[13];
    c[14] = a[14] * b[14];
    c[15] = a[15] * b[15];
    c[16] = a[16] * b[16];
    c[17] = a[17] * b[17];
    c[18] = a[18] * b[18];
    c[19] = a[19] * b[19];
    c[20] = a[20] * b[20];
    c[21] = a[21] * b[21];
    c[22] = a[22] * b[22];
    c[23] = a[23] * b[23];
    c[24] = a[24] * b[24];
    c[25] = a[25] * b[25];
    c[26] = a[26] * b[26];
    c[27] = a[27] * b[27];
    c[28] = a[28] * b[28];
    c[29] = a[29] * b[29];
    c[30] = a[30] * b[30];
    c[31] = a[31] * b[31];
    c[32] = a[32] * b[32];
#endif
}
