On 6/7/21 9:58 AM, Peter Maydell wrote:
> +#define DO_VCADD(OP, ESIZE, TYPE, H, FN0, FN1)                          \
> +    void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd,             \
> +                                void *vn, void *vm)                     \
> +    {                                                                   \
> +        TYPE *d = vd, *n = vn, *m = vm;                                 \
> +        uint16_t mask = mve_element_mask(env);                          \
> +        unsigned e;                                                     \
> +        TYPE r[16 / ESIZE];                                             \
> +        /* Calculate all results first to avoid overwriting inputs */   \
> +        for (e = 0; e < 16 / ESIZE; e++) {                              \
> +            if (!(e & 1)) {                                             \
> +                r[e] = FN0(n[H(e)], m[H(e + 1)]);                       \
> +            } else {                                                    \
> +                r[e] = FN1(n[H(e)], m[H(e - 1)]);                       \
> +            }                                                           \
> +        }                                                               \
> +        for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) {              \
> +            uint64_t bytemask = mask_to_bytemask##ESIZE(mask);          \
> +            d[H(e)] &= ~bytemask;                                       \
> +            d[H(e)] |= (r[e] & bytemask);                               \
> +        }                                                               \
> +        mve_advance_vpt(env);                                           \
> +    }

I guess this is OK. You could instead unroll the loop once, so that each iteration computes only one even+odd pair of results before writing them back.
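
To illustrate the unrolling suggestion, here is a minimal standalone sketch, not QEMU code. It assumes 32-bit elements, leaves out predication (the mask/bytemask handling) and the H() index adjustment, and uses subtraction and addition as stand-ins for FN0 and FN1. The names vcadd_pairs, fn0_sub and fn1_add are hypothetical. Each iteration reads only elements e and e + 1 of both inputs and computes both results before storing either, so the destination may alias an input without a full-size temporary array.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the FN0/FN1 macro arguments (wrapping 32-bit arithmetic). */
static uint32_t fn0_sub(uint32_t a, uint32_t b) { return a - b; }
static uint32_t fn1_add(uint32_t a, uint32_t b) { return a + b; }

/*
 * Hypothetical model of the pairwise (unrolled once) variant.
 * Predication and host-endian index swizzling are deliberately omitted.
 * d may point at the same storage as n or m.
 */
static void vcadd_pairs(uint32_t *d, const uint32_t *n, const uint32_t *m,
                        unsigned lanes)
{
    /* Elements are processed as even/odd pairs. */
    assert(lanes % 2 == 0);

    for (unsigned e = 0; e < lanes; e += 2) {
        /* Compute both results of the pair before any store to d. */
        uint32_t r0 = fn0_sub(n[e], m[e + 1]);
        uint32_t r1 = fn1_add(n[e + 1], m[e]);

        d[e] = r0;
        d[e + 1] = r1;
    }
}
```

In the real helper the two stores would still have to go through the bytemask merge shown in the quoted patch; only the temporary would shrink from a whole vector to one pair.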

> +/*
> + * VCADD Qd == Qm at size MO_32 is UNPREDICTABLE; we choose not to diagnose
> + * so we can reuse the DO_2OP macro. (Our implementation calculates the
> + * "expected" results in this case.)
> + */

You've done this elsewhere, though.

Either way,
Reviewed-by: Richard Henderson <richard.hender...@linaro.org>

r~