Wilco Dijkstra <wilco.dijks...@arm.com> writes: > James Greenhalgh wrote: > >> > Add a missing ? to aarch64_get_lane to fix a failure in the testsuite. >> >> > I'd prefer more detail than this for a workaround; which test, why did it >> > start to fail, why is this the right solution, etc. > > It was gcc.target/aarch64/vect_copy_lane_1.c generating: > > test_copy_laneq_f64: > umov x0, v1.d[1] > fmov d0, x0 > ret > > For some reason returning a double uses DImode temporaries, so it's essential > to prefer FP_REGS here and mark the lane copy correctly.
The "?" change seems to make intrinsic sense given the extra cost of the GPR alternative. But I think the real reason for this failure is that we define no V1DF patterns, and target-independent code falls back to using moves in the corresponding *integer* mode. So for that function we generate the rather ugly code: (note 6 1 3 2 [bb 2] NOTE_INSN_BASIC_BLOCK) (insn 3 6 2 2 (clobber (reg/v:V1DF 92 [ aD.21157 ])) "vect_copy_lane_1.c":45 -1 (nil)) (insn 2 3 4 2 (set (subreg:DI (reg/v:V1DF 92 [ aD.21157 ]) 0) (reg:DI 32 v0 [ aD.21157 ])) "vect_copy_lane_1.c":45 47 {*movdi_aarch64} (nil)) (insn 4 2 5 2 (set (reg/v:V2DF 93 [ bD.21158 ]) (reg:V2DF 33 v1 [ bD.21158 ])) "vect_copy_lane_1.c":45 1063 {*aarch64_simd_movv2df} (nil)) (note 5 4 8 2 NOTE_INSN_FUNCTION_BEG) (insn 8 5 9 2 (set (reg:DF 95) (vec_select:DF (reg/v:V2DF 93 [ bD.21158 ]) (parallel [ (const_int 1 [0x1]) ]))) "./include/arm_neon.h":14441 1993 {aarch64_get_lanev2df} (nil)) (insn 9 8 11 2 (set (reg:DI 96) (subreg:DI (reg:DF 95) 0)) "vect_copy_lane_1.c":45 47 {*movdi_aarch64} (nil)) (insn 11 9 10 2 (clobber (reg:V1DF 91 [ <retval> ])) "vect_copy_lane_1.c":45 -1 (nil)) (insn 10 11 15 2 (set (subreg:DI (reg:V1DF 91 [ <retval> ]) 0) (reg:DI 96)) "vect_copy_lane_1.c":45 47 {*movdi_aarch64} (nil)) (insn 15 10 16 2 (set (reg:DI 32 v0) (subreg:DI (reg:V1DF 91 [ <retval> ]) 0)) "vect_copy_lane_1.c":45 47 {*movdi_aarch64} (nil)) (insn 16 15 0 2 (use (reg/i:V1DF 32 v0)) "vect_copy_lane_1.c":45 -1 (nil)) which by IRA gets optimised to: (insn 9 8 15 2 (set (subreg:DF (reg:DI 96) 0) (vec_select:DF (reg:V2DF 33 v1 [ bD.21158 ]) (parallel [ (const_int 1 [0x1]) ]))) "vect_copy_lane_1.c":45 1993 {aarch64_get_lanev2df} (expr_list:REG_DEAD (reg:V2DF 33 v1 [ bD.21158 ]) (nil))) (insn 15 9 16 2 (set (reg:DI 32 v0) (reg:DI 96)) "vect_copy_lane_1.c":45 47 {*movdi_aarch64} (expr_list:REG_DEAD (reg:DI 96) (nil))) (insn 16 15 18 2 (use (reg/i:V1DF 32 v0)) "vect_copy_lane_1.c":45 -1 (nil)) with the move now being done purely in DImode. This defeats the heuristic in aarch64_ira_change_pseudo_allocno_class because the pseudo appears to be a normal integer rather than a (float) vector. Although the "?" fixes this particular instance, I think more complicated V1DF code would still regress by being forced to use GENERAL_REGS. Of course, the fix is to add the move pattern rather than disable the heuristic... Thanks, Richard