On 8/28/20 11:33 AM, Peter Maydell wrote:
> +#define float16_nop(N, M, S) (M)
> +#define float32_nop(N, M, S) (M)
> +#define float64_nop(N, M, S) (M)
>
> +DO_FMUL_IDX(gvec_fmul_idx_h, nop, float16, H2)
> +DO_FMUL_IDX(gvec_fmul_idx_s, nop, float32, H4)
> +DO_FMUL_IDX(gvec_fmul_idx_d, nop, float64, )
> +
> +/*
> + * Non-fused multiply-accumulate operations, for Neon. NB that unlike
> + * the fused ops below they assume accumulate both from and into Vd.
> + */
> +DO_FMUL_IDX(gvec_fmla_nf_idx_h, add, float16, H2)
> +DO_FMUL_IDX(gvec_fmla_nf_idx_s, add, float32, H4)
> +DO_FMUL_IDX(gvec_fmls_nf_idx_h, sub, float16, H2)
> +DO_FMUL_IDX(gvec_fmls_nf_idx_s, sub, float32, H4)
> +
> +#undef float16_nop
> +#undef float32_nop
> +#undef float64_nop

This floatN_nop stuff is pretty ugly. Better to pass in either floatN_mul,
or the floatN_muladd_nf helpers that you added earlier. Although I guess
you're missing float64_muladd_nf so far.

r~
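
For illustration only, a rough sketch of the shape this could take: the
macro receives the complete per-element operation instead of pasting an
add/sub/nop suffix onto the type name. This is not the QEMU code. Plain C
float stands in for the softfloat types, and every demo_* name, the
four-argument (d, n, m, stat) signature, and the simplified indexing (no
H2/H4 host-endian adjustment, no 16-byte segments) are assumptions made
for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical per-element operations sharing one signature.
 * 'stat' stands in for a float_status pointer and is ignored here.
 */
static float demo_mul(float d, float n, float m, void *stat)
{
    (void)d;
    (void)stat;
    return n * m;               /* plain multiply: accumulator unused */
}

static float demo_muladd_nf(float d, float n, float m, void *stat)
{
    (void)stat;
    float prod = n * m;         /* product computed as a separate step */
    return d + prod;            /* then added to the accumulator */
}

static float demo_mulsub_nf(float d, float n, float m, void *stat)
{
    (void)stat;
    float prod = n * m;
    return d - prod;
}

/*
 * The macro takes the full operation name (OP), so no TYPE##_##SUFFIX
 * pasting and no nop placeholder macros are needed.
 */
#define DEMO_FMUL_IDX(NAME, OP, TYPE)                                  \
static void NAME(TYPE *d, const TYPE *n, const TYPE *m,                \
                 size_t len, size_t idx, void *stat)                   \
{                                                                      \
    assert(idx < len);                                                 \
    TYPE mm = m[idx];           /* the indexed multiplicand */         \
    for (size_t i = 0; i < len; i++) {                                 \
        d[i] = OP(d[i], n[i], mm, stat);                               \
    }                                                                  \
}

DEMO_FMUL_IDX(demo_fmul_idx, demo_mul, float)
DEMO_FMUL_IDX(demo_fmla_nf_idx, demo_muladd_nf, float)
DEMO_FMUL_IDX(demo_fmls_nf_idx, demo_mulsub_nf, float)
```

With something along these lines the plain-multiply case is just another
operation handed to the macro, and a 64-bit variant would only need the
corresponding helper to exist.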