On 17/03/15 03:48, Kyrill Tkachov wrote: > > On 16/03/15 13:15, Kugan wrote: >> On 16/03/15 23:32, Kugan wrote: >>>>> lower-subreg.c:compute_costs() only cares about the cost of a (set >>>>> (reg) >>>>> (const_int )) move but I think the intention, at least for now, is to >>>>> return extra_cost->vect.alu for all the vector operations. >>>> Almost, what we want at the moment is COSTS_N_INSNS (1) + >>>> extra_cost->vect.alu >>> Thanks Kyrill for the review. >>> >>>>> Regression tested on aarch64-linux-gnu with no new regression. Is this >>>>> OK for trunk? >>>> Are you sure it's a (set (reg) (const_int)) that's being costed here? I >>>> thought for moves into vecto registers it would be a (set (reg) >>>> (const_vector)) which we don't handle in our rtx costs currently. I >>>> think the correct approach would be to extend the aarch64_rtx_costs >>>> switch statement to handle the CONST_VECT case. I believe you can use >>>> aarch64_simd_valid_immediate to check whether x is a valid immediate >>>> for >>>> a simd instruction and give it a cost of extra_cost->vect.alu. The >>>> logic >>>> should be similar to the CONST_INT case. >>> Sorry about the (set (reg) (const_int)) above. But the actual RTL that >>> is being split at 220r.subreg2 is >>> >>> (insn 11 10 12 2 (set (subreg:V4SF (reg/v:OI 77 [ __o ]) 0) >>> (subreg:V4SF (reg/v:OI 73 [ __o ]) 0)) >>> /home/kugan/work/builds/gcc-fsf-gcc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include/arm_neon.h:22625 >>> >>> 800 {*aarch64_simd_movv4sf} >>> (nil)) >>> >>> And also, if we return RTX cost above COSTS_N_INSNS (1), it will be >>> split and it dosent recover from there. Therefore we need something like >>> the below to prevent that happening. >>> >> Hi Kyrill, >> >> How about the attached patch? It is similar to what is currently done >> for scalar register move. > > Hi Kugan, > yeah, I think this is a better approach, though I can't approve. >
Here is the patch with minor comment update. Regression tested on aarch64-linux-gnu with no new regression. Is this OK for trunk? Thanks, Kugan gcc/ChangeLog: 2015-03-17 Kugan Vivekanandarajah <kug...@linaro.org> Jim Wilson <jim.wil...@linaro.org> PR target/65375 * config/aarch64/aarch64.c (aarch64_rtx_costs): Handle vector register copies.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index cba3c1a..d6ad0af 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -5544,10 +5544,17 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED, /* Fall through. */ case REG: + if (VECTOR_MODE_P (GET_MODE (op0)) && REG_P (op1)) + { + /* The cost is 1 per vector-register copied. */ + int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1) + / GET_MODE_SIZE (V4SImode); + *cost = COSTS_N_INSNS (n_minus_1 + 1); + } /* const0_rtx is in general free, but we will use an instruction to set a register to 0. */ - if (REG_P (op1) || op1 == const0_rtx) - { + else if (REG_P (op1) || op1 == const0_rtx) + { /* The cost is 1 per register copied. */ int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1) / UNITS_PER_WORD;