On Tue, Jul 28, 2015 at 12:25:55PM +0100, Alan Lawrence wrote:
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.c (aarch64_split_simd_combine): Add V4HFmode.
> * config/aarch64/aarch64-builtins.c (VAR13, VAR14): New.
> (aarch64_scalar_builtin_types, aarch64_init_simd_builtin_scalar_types):
> Add __builtin_aarch64_simd_hf.
> * config/aarch64/arm_neon.h (float16x4x2_t, float16x8x2_t,
> float16x4x3_t, float16x8x3_t, float16x4x4_t, float16x8x4_t,
> vcombine_f16, vst2_lane_f16, vst2q_lane_f16, vst3_lane_f16,
> vst3q_lane_f16, vst4_lane_f16, vst4q_lane_f16, vld2_f16, vld2q_f16,
> vld3_f16, vld3q_f16, vld4_f16, vld4q_f16, vld2_dup_f16, vld2q_dup_f16,
> vld3_dup_f16, vld3q_dup_f16, vld4_dup_f16, vld4q_dup_f16,
> vld2_lane_f16, vld2q_lane_f16, vld3_lane_f16, vld3q_lane_f16,
> vld4_lane_f16, vld4q_lane_f16, vst2_f16, vst2q_f16, vst3_f16,
> vst3q_f16, vst4_f16, vst4q_f16, vcreate_f16): New.
>
> * config/aarch64/iterators.md (VALLDIF, Vtype, Vetype, Vbtype,
> V_cmp_result, v_cmp_result): Add cases for V4HF and V8HF.
> (VDC, Vdbl): Add V4HF.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/vldN_1.c: Add float16x4_t and float16x8_t cases.
> * gcc.target/aarch64/vldN_dup_1.c: Likewise.
> * gcc.target/aarch64/vldN_lane_1.c: Likewise.
Hi Alan,

The arm_neon.h portion of this patch does not apply after Charles' recent
changes. Could you please rebase and resubmit the patch for review?

Thanks,
James
> @@ -10000,6 +10044,8 @@ vst2_lane_ ## funcsuffix (ptrtype *__ptr,
> \
> __ptr, __o, __c); \
> }
>
> +__ST2_LANE_FUNC (float16x4x2_t, float16x8x2_t, float16_t, v8hf, hf, f16,
> + float16x8_t)
> __ST2_LANE_FUNC (float32x2x2_t, float32x4x2_t, float32_t, v4sf, sf, f32,
> float32x4_t)
> __ST2_LANE_FUNC (float64x1x2_t, float64x2x2_t, float64_t, v2df, df, f64,
Hunks like this fail to apply, because the macro in
config/aarch64/arm_neon.h takes separate mode and qmode arguments. Its
definition and an existing use look like:
#define __ST2_LANE_FUNC(intype, largetype, ptrtype, mode, \
                        qmode, ptr_mode, funcsuffix, signedtype) \

__ST2_LANE_FUNC (float32x2x2_t, float32x4x2_t, float32_t, v2sf, v4sf, sf, f32,
                 float32x4_t)
So I would expect the lines you add to look something like:
> +__ST2_LANE_FUNC (float16x4x2_t, float16x8x2_t, float16_t, v4hf, v8hf, hf, f16,
> + float16x8_t)
Thanks,
James