Hi,

This patch improves the vdup_n intrinsics when they are used to form
constant vectors, by using targetm.fold_builtin to fold these vector
initializations into actual vector constants. The vdup_n cases are
fine for both endiannesses since the vector constant is simply
duplicated across the lanes.

In addition I've made the *neon_mov<mode> patterns accept a
constant-zero vector, which lets the compiler generate
vmov.i32 <reg>, #0 for operations like vdup_n_f32 (0.0f). A nice side
effect is that zero initialization of FP vectors for Neon no longer
needs a load from the literal pool. I will point out that vcreate and
a number of the other intrinsics could be improved in a similar vein
(with the usual big-endian caveats). This helps in a number of cases
where we previously generated a mov of the constant into a core
register and then a vdup over the lanes, and it also lets the tree
optimizers recognize the value as the constant vector it is.
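
To illustrate, here is the sort of code that benefits; the function
names are just for the example, and the code generation described in
the comments is what I'd expect with the patch applied rather than a
guarantee for every configuration:

#include <arm_neon.h>

/* With the patch this should fold to a constant zero vector and be
   emitted as a single vmov.i32 <reg>, #0 rather than a load from the
   literal pool.  */
float32x2_t
zero_vec (void)
{
  return vdup_n_f32 (0.0f);
}

/* Duplicating an integer constant likewise becomes a vector constant
   at the tree level, rather than a mov into a core register followed
   by a vdup over the lanes.  */
int32x2_t
splat_five (void)
{
  return vdup_n_s32 (5);
}

And, for reference only, a rough sketch (not the patch itself) of the
shape a fold_builtin hook for this takes; ARM_BUILTIN_VDUP_N_EXAMPLE
is a made-up name standing in for the real per-builtin codes:

static tree
arm_fold_builtin (tree fndecl, int n_args, tree *args,
                  bool ignore ATTRIBUTE_UNUSED)
{
  tree type = TREE_TYPE (TREE_TYPE (fndecl));

  switch (DECL_FUNCTION_CODE (fndecl))
    {
    case ARM_BUILTIN_VDUP_N_EXAMPLE:
      /* Duplicate a constant scalar argument over all the lanes of
         the result vector.  */
      if (n_args == 1 && CONSTANT_CLASS_P (args[0]))
        return build_vector_from_val (type, args[0]);
      break;

    default:
      break;
    }

  return NULL_TREE;
}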

This also needed some work to make the testcase for vabd more robust,
which just goes to show that the folding works!

In the process I've also cleaned up a few prototypes where the cleanup
was obvious.

Tested cross on arm-linux-gnueabi with no regressions.

OK to commit (as two separate patches, one for the prototype cleanup
and the other for the vdup case)?

regards,
Ramana



2012-06-20  Ramana Radhakrishnan  <ramana.radhakrish...@linaro.org>

        * config/arm/arm.c (arm_vector_alignment_reachable): Fix declaration.
        (arm_builtin_support_vector_misalignment): Likewise.
        (arm_preferred_rename_class): Likewise.
        (arm_vectorize_vec_perm_const_ok): Likewise.
        (arm_fold_builtin): New.
        (TARGET_FOLD_BUILTIN): New.
        * config/arm/neon.md (*neon_mov<mode>:VDX, VQX): Add Dz alternative.

        testsuite/
        * gcc.target/arm/neon-combine-sub-abs-into-abd.c: Make test
        more robust.

Attachment: vmovzero.patch