https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63277

cbaylis at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cbaylis at gcc dot gnu.org

--- Comment #4 from cbaylis at gcc dot gnu.org ---
A much-simplified test case, based on arm_neon_excessive_vmov_wo_vcombine.c:

$ arm-unknown-linux-gnueabihf-gcc -O2 -S -o - -mfpu=neon mini.c 

#include <arm_neon.h>

void f(int8_t *p)
{
   int8x16_t v;
   int8x8_t v2;
   int8x8x2_t vx;

   v=vld1q_s8(p);
   v2=vld1_s8(p);
   vx.val[0] = vget_low_s8(v);
   vx.val[1] = vget_high_s8(v);
   v2 = vtbl2_s8(vx, v2);
   vst1_s8(p, v2);
}

With -dp, the generated code is:
f:
        vld1.8  {d18-d19}, [r0] @ 6     neon_vld1v16qi  [length = 4]
        vmov    d16, d18  @ v8qi        @ 10    *neon_movv8qi/1 [length = 4]
        vld1.8  {d20}, [r0]     @ 7     neon_vld1v8qi   [length = 4]
        vmov    d17, d19  @ v8qi        @ 11    *neon_movv8qi/1 [length = 4]
        vtbl.8  d16, {d16, d17}, d20    @ 12    neon_vtbl2v8qi  [length = 4]
        vst1.8  {d16}, [r0]     @ 13    neon_vst1v8qi   [length = 4]
        bx      lr      @ 24    *thumb2_return  [length = 4]

By the time IRA runs, the insns which result in the moves look like this:
(insn 9 18 11 2 (set (subreg:V8QI (reg/v:TI 116 [ vx ]) 0)
        (subreg:V8QI (reg:V16QI 114 [ D.14019 ]) 0)) /tmp/mini.c:11 827 {*neon_movv8qi}

Pseudo-registers 116 and 114 are allocated to different hard registers because
they conflict. Presumably, the register allocator could be taught to treat this
subreg->subreg move as a copy, which would allow both pseudos to be allocated
to the same hard register and make the vmov instructions unnecessary.
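
For readers without a NEON target to hand, the data flow of f() in the test
case above can be approximated in portable C. This is only a scalar sketch,
not what the compiler does: f_model and model_vtbl2_s8 are hypothetical names
introduced here, and the vectors are modelled as plain byte arrays. It shows
that vx is nothing more than the two halves of v placed back to back, which is
why the two vmov instructions are pure copies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of vtbl2_s8 (hypothetical helper): each index selects a
   byte from a 16-byte table; an index outside 0..15, taken as unsigned,
   yields 0. */
static void model_vtbl2_s8(const int8_t table[16], const int8_t idx[8],
                           int8_t out[8])
{
    for (int i = 0; i < 8; i++) {
        uint8_t k = (uint8_t)idx[i];
        out[i] = (k < 16) ? table[k] : 0;
    }
}

/* Scalar model of f() from the test case (hypothetical name). */
void f_model(int8_t *p)
{
    int8_t v[16];   /* v  = vld1q_s8(p)                       */
    int8_t v2[8];   /* v2 = vld1_s8(p)                        */
    int8_t vx[16];  /* vx.val[0] and vx.val[1], back to back  */
    int8_t r[8];

    assert(p != NULL);
    memcpy(v, p, 16);
    memcpy(v2, p, 8);
    memcpy(vx, v, 8);          /* vx.val[0] = vget_low_s8(v)   */
    memcpy(vx + 8, v + 8, 8);  /* vx.val[1] = vget_high_s8(v)  */
    model_vtbl2_s8(vx, v2, r); /* v2 = vtbl2_s8(vx, v2)        */
    memcpy(p, r, 8);           /* vst1_s8(p, v2)               */
}
```

In this model the two memcpy calls that build vx copy v unchanged, so vx and v
hold identical bytes. That is the property the allocator would need to exploit
to put both values in the same register pair.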
