https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122065

            Bug ID: 122065
           Summary: powerpc64le: Vector multiplication optimizations for
                    double word does not happen for power8/9
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: avinashd at gcc dot gnu.org
                CC: avinashd at gcc dot gnu.org, jskumari at gcc dot gnu.org,
                    meissner at gcc dot gnu.org, segher at gcc dot gnu.org
  Target Milestone: ---
            Target: powerpc-*-*-*

Certain multiplication patterns can be optimized on power8/9, since these
processors have doubleword vector shift, add, and subtract instructions.
However, doubleword vector multiplication is always converted to scalar code,
even for simple patterns such as multiplying by a constant power of two.
This happens when the code is already written in vector form. For example, the
following two functions should produce the same assembly, but the one written
in scalar form produces better code.

vector unsigned long long
lshift1_64_altivec (vector unsigned long long a)
{
  return a * (vector unsigned long long) { 4, 4 };
}
produces
        .cfi_startproc
        xxpermdi 0,34,34,3
        mfvsrd 9,34
        mfvsrd 10,0
        sldi 9,9,2
        mtvsrd 0,9
        sldi 10,10,2
        mtvsrd 34,10
        xxpermdi 34,0,34,0
        blr
        .long 0
        .byte 0,0,0,0,0,0,0,0
        .cfi_endproc

#include <stdint.h>

void lshift1_64(uint64_t *a) {
  a[0] *= 4;
  a[1] *= 4;
}
produces
lshift1:
.LFB0:
        .cfi_startproc
        lxvd2x 32,0,3
        vspltisw 1,2
        vsld 0,0,1
        stxvd2x 32,0,3
        blr
        .long 0
        .byte 0,0,0,0,0,0,0,0
        .cfi_endproc
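
The transformation being asked for is a strength reduction: for unsigned
64-bit lanes, multiplying by 4 gives the same result as shifting left by 2
(modulo 2^64). The following is a minimal sketch of that equivalence, assuming
GCC's generic vector extensions rather than the AltiVec `vector` keyword used
above. The typedef `v2du` and the function names are hypothetical and do not
come from the report. Whether writing the shift explicitly yields the vsld
sequence on power8/9 has not been verified here.

```c
#include <assert.h>

/* Hypothetical typedef: two 64-bit unsigned lanes, declared with GCC
   generic vector extensions (an assumption, not the report's types).  */
typedef unsigned long long v2du __attribute__ ((vector_size (16)));

/* Multiplication form, mirroring lshift1_64_altivec from the report.  */
v2du
mul4 (v2du a)
{
  return a * (v2du) { 4, 4 };
}

/* Shift form: x * 4 == x << 2 modulo 2^64 for unsigned lanes.  */
v2du
shl2 (v2du a)
{
  return a << (v2du) { 2, 2 };
}

/* Self-check: both forms agree lane by lane, including on wraparound.  */
void
check_mul4_equals_shl2 (void)
{
  v2du in = { 3, 0xFFFFFFFFFFFFFFFFULL };
  v2du m = mul4 (in);
  v2du s = shl2 (in);
  assert (m[0] == s[0] && m[0] == 12);
  assert (m[1] == s[1] && m[1] == 0xFFFFFFFFFFFFFFFCULL);
}
```

Since both functions compute identical results for every input, a compiler
is free to emit the same vector shift for either one.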
