On 03.05.19 05:45, Richard Henderson wrote:
> On 5/2/19 7:09 AM, David Hildenbrand wrote:
>> 128-bit handling courtesy of Richard H.
>>
>> Signed-off-by: David Hildenbrand <da...@redhat.com>
>> ---
>>  target/s390x/insn-data.def      |  2 +
>>  target/s390x/translate_vx.inc.c | 94 +++++++++++++++++++++++++++++++++
>>  2 files changed, 96 insertions(+)
>
> Reviewed-by: Richard Henderson <richard.hender...@linaro.org>
>
>> +static void gen_acc(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b, uint8_t es)
>> +{
>> +    const uint8_t msb_bit_nr = NUM_VEC_ELEMENT_BITS(es) - 1;
>> +    TCGv_i64 msb_mask = tcg_const_i64(dup_const(es, 1ull << msb_bit_nr));
>> +    TCGv_i64 t1 = tcg_temp_new_i64();
>> +    TCGv_i64 t2 = tcg_temp_new_i64();
>> +    TCGv_i64 t3 = tcg_temp_new_i64();
>> +
>> +    /* Calculate the carry into the MSB, ignoring the old MSBs */
>> +    tcg_gen_andc_i64(t1, a, msb_mask);
>> +    tcg_gen_andc_i64(t2, b, msb_mask);
>> +    tcg_gen_add_i64(t1, t1, t2);
>> +    /* Calculate the MSB without any carry into it */
>> +    tcg_gen_xor_i64(t3, a, b);
>> +    /* Calculate the carry out of the MSB in the MSB bit position */
>> +    tcg_gen_and_i64(d, a, b);
>> +    tcg_gen_and_i64(t1, t1, t3);
>> +    tcg_gen_or_i64(d, d, t1);
>> +    /* Isolate and shift the carry into position */
>> +    tcg_gen_and_i64(d, d, msb_mask);
>> +    tcg_gen_shri_i64(d, d, msb_bit_nr);
>> +
>> +    tcg_temp_free_i64(t1);
>> +    tcg_temp_free_i64(t2);
>> +    tcg_temp_free_i64(t3);
>> +}
...
>> +static void gen_acc32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
>> +{
>> +    gen_acc(d, a, b, ES_32);
>> +}
>> +
>> +static void gen_acc_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
>> +{
>> +    TCGv_i64 t = tcg_temp_new_i64();
>> +
>> +    tcg_gen_add_i64(t, a, b);
>> +    tcg_gen_setcond_i64(TCG_COND_LTU, d, t, b);
>> +    tcg_temp_free_i64(t);
>> +}
>
> As an aside, I think the 32-bit version should use 32-bit ops, as per
> gen_acc_i64.  That would be 4 * 2 operations instead of 2 * 9 over the
> 128-bit vector.
Makes sense, thanks!

>
> r~

--
Thanks,

David / dhildenb