On 3/12/20 7:58 AM, LIU Zhiwei wrote:
> +static bool vrgather_vx_check(DisasContext *s, arg_rmrr *a)
> +{
> +    return (vext_check_isa_ill(s, RVV) &&
> +            vext_check_overlap_mask(s, a->rd, a->vm, true) &&
> +            vext_check_reg(s, a->rd, false) &&
> +            vext_check_reg(s, a->rs2, false) &&
> +            (a->rd != a->rs2));
> +}
> +GEN_OPIVX_TRANS(vrgather_vx, vrgather_vx_check)
> +GEN_OPIVI_TRANS(vrgather_vi, 1, vrgather_vx, vrgather_vx_check)

The unmasked versions of these should use gvec_dup.

For the immediate version, where we can validate the index at translation
time, we can use tcg_gen_gvec_dup_mem, so that the host vector
dup-from-memory instruction can be used.

For the register version, we should re-use the code from vext.x.s where we
load the element, bound the index and squash the value to zero for
index >= VLMAX.  Then use tcg_gen_gvec_dup_i64.

For the masked versions, we should load the value, as above, and then
re-use the vmerge helper with vs2 = vd, so that we get

    vd[i] = v0[i].lsb ? val : vd[i]

> diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
> index 2219fdd6c5..5788e46dcf 100644
> --- a/target/riscv/vector_helper.c
> +++ b/target/riscv/vector_helper.c
> @@ -4647,3 +4647,71 @@ GEN_VEXT_VSLIDE1DOWN_VX(vslide1down_vx_b, uint8_t, H1, clearb)
>  GEN_VEXT_VSLIDE1DOWN_VX(vslide1down_vx_h, uint16_t, H2, clearh)
>  GEN_VEXT_VSLIDE1DOWN_VX(vslide1down_vx_w, uint32_t, H4, clearl)
>  GEN_VEXT_VSLIDE1DOWN_VX(vslide1down_vx_d, uint64_t, H8, clearq)
> +
> +/* Vector Register Gather Instruction */
> +#define GEN_VEXT_VRGATHER_VV(NAME, ETYPE, H, CLEAR_FN) \
> +void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \
> +                  CPURISCVState *env, uint32_t desc) \
> +{ \
> +    uint32_t mlen = vext_mlen(desc); \
> +    uint32_t vlmax = env_archcpu(env)->cfg.vlen / mlen; \
> +    uint32_t vm = vext_vm(desc); \
> +    uint32_t vl = env->vl; \
> +    uint32_t index, i; \
> + \
> +    for (i = 0; i < vl; i++) { \
> +        if (!vm && !vext_elem_mask(v0, mlen, i)) { \
> +            continue; \
> +        } \
> +        index = *((ETYPE *)vs1 + H(i)); \
> +        if (index >= vlmax) {

The type of index should be ETYPE or uint64_t, and similar for vlmax just
so they match.


r~
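
A rough, standalone C sketch of the two points above, in case it helps. This is not QEMU code: every function name here is hypothetical, the vectors are modelled as plain uint64_t arrays, and the mask is modelled as one byte per element instead of the real mask layout. It shows (a) loading the selected element with the index bounded and the value squashed to zero for index >= VLMAX, (b) the masked splat vd[i] = v0[i].lsb ? val : vd[i], and (c) why a 32-bit index variable can pass the bounds check by mistake when the element type is 64 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference model only; none of these names exist in QEMU. */

/* Select vs2[index], or 0 when the index is out of range (index >= vlmax). */
static uint64_t gather_scalar(const uint64_t *vs2, uint64_t index,
                              uint64_t vlmax)
{
    return index < vlmax ? vs2[index] : 0;
}

/*
 * Masked splat: vd[i] = mask[i].lsb ? val : vd[i].
 * The mask is simplified to one byte per element for illustration.
 */
static void merge_splat(uint64_t *vd, const uint8_t *mask, uint64_t val,
                        size_t vl)
{
    for (size_t i = 0; i < vl; i++) {
        if (mask[i] & 1) {
            vd[i] = val;
        }
    }
}

/* Bounds check with the index narrowed to 32 bits before comparing. */
static int in_range_u32(uint64_t raw, uint32_t vlmax)
{
    uint32_t index = (uint32_t)raw;   /* high 32 bits are dropped here */
    return index < vlmax;
}

/* Bounds check with index and vlmax both kept at 64 bits. */
static int in_range_u64(uint64_t raw, uint64_t vlmax)
{
    return raw < vlmax;
}
```

With vlmax = 4, a 64-bit index element of 0x100000001 truncates to 1 in in_range_u32 and is treated as in range, while in_range_u64 rejects it.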