On 3/16/20 1:04 AM, LIU Zhiwei wrote:
>> As a preference, I think you can do away with this helper.
>> Simply use the slideup helper with argument 1, and then
>> afterwards store the integer register into element 0.  You should
>> be able to re-use code from vmv.s.x for that.
>
> When I try it, I find it is some difficult, because vmv.s.x will clean
> the elements (0 < index < VLEN/SEW).
Well, two things about that:

(1) The 0.8 version of vmv.s.x does *not* zero the other elements,
    so we'll want to be prepared for that.

(2) We have 8 insns that, in the end, come down to a direct element
    access, possibly with some other processing.  So we'll want basic
    helper functions that can locate an element by immediate offset
    and by variable offset:

/*
 * Compute the offset of vreg[idx] relative to cpu_env.
 * The index must be in range of VLMAX.
 */
int vec_element_ofsi(DisasContext *s, int vreg, int idx, int sew);

/*
 * Compute a pointer to vreg[idx].  If need_bound is true, mask idx
 * into VLMAX; otherwise we know a priori that idx is already in bounds.
 */
void vec_element_ofsx(DisasContext *s, TCGv_ptr base,
                      TCGv idx, int sew, bool need_bound);

/* Load idx >= VLMAX ? 0 : vreg[idx]. */
void vec_element_loadi(DisasContext *s, TCGv_i64 val,
                       int vreg, int idx, int sew);
void vec_element_loadx(DisasContext *s, TCGv_i64 val,
                       int vreg, TCGv idx, int sew);

/* Store vreg[imm] = val.  The index must be in range of VLMAX. */
void vec_element_storei(DisasContext *s, int vreg,
                        int imm, TCGv_i64 val);
void vec_element_storex(DisasContext *s, int vreg,
                        TCGv idx, TCGv_i64 val);

(3) It would be handy to have TCGv cpu_vl.

Then:

vext.x.v:
    If rs1 == 0,
        use vec_element_loadi(s, x[rd], vs2, 0, s->sew);
    else
        use vec_element_loadx(s, x[rd], vs2, x[rs1], s->sew).

vmv.s.x:
    over = gen_new_label();
    tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
    For 0.7.1:
        use tcg_gen_gvec_dup8i to zero all VLMAX elements of vd.
    If rs1 == 0, goto done.
    Use vec_element_storei(s, vd, 0, x[rs1]).
 done:
    gen_set_label(over);

vfmv.f.s:
    Use vec_element_loadi(s, f[rd], vs2, 0, s->sew).
    NaN-box f[rd] as necessary for SEW.

vfmv.s.f:
    tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
    For 0.7.1:
        use tcg_gen_gvec_dup8i to zero all VLMAX elements of vd.
    Let tmp = f[rs1], NaN-boxed as necessary for SEW.
    Use vec_element_storei(s, vd, 0, tmp).
    gen_set_label(over);

vslide1up.vx:
    Ho hum, I forgot about masking.  Some options:
    (1) Call a helper just as you did in your original patch.
    (2) Call a helper only for !vm; for vm, expand inline as below.
    (3) Call vslideup w/ 1, then:

    tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
    If !vm,
        // inline test for v0[0]
        vec_element_loadi(s, tmp, 0, 0, MO_8);
        tcg_gen_andi_i64(tmp, tmp, 1);
        tcg_gen_brcondi_i64(TCG_COND_EQ, tmp, 0, over);
    Use vec_element_storei(s, vd, 0, x[rs1]).
    gen_set_label(over);

vslide1down.vx:
    For !vm, this is complicated enough for a helper.
    If using option 3 for vslide1up, then the store becomes:
        tcg_gen_subi_tl(tmp, cpu_vl, 1);
        vec_element_storex(s, vd, tmp, x[rs1]);

vrgather.vx:
    If !vm or !vl_eq_vlmax, use a helper.  Otherwise:
        vec_element_loadx(s, tmp, vs2, x[rs1]);
        use tcg_gen_gvec_dup_i64 to store tmp to vd.

vrgather.vi:
    If !vm or !vl_eq_vlmax, use a helper.  Otherwise:
    If imm >= vlmax,
        use tcg_gen_gvec_dup8i to zero vd;
    else,
        ofs = vec_element_ofsi(s, vs2, imm, s->sew);
        tcg_gen_gvec_dup_mem(sew, vreg_ofs(vd), ofs, vlmax, vlmax);
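To make (2) concrete, here is roughly how the immediate-offset pair
could look in trans_rvv.inc.c.  This is an untested sketch: it assumes
a little-endian host and the vreg_ofs() helper from the load/store
patches (spelled here with an explicit DisasContext).  A real version
also needs an endianness fixup, because the register file is held as
host-order uint64_t units:

static int vec_element_ofsi(DisasContext *s, int vreg, int idx, int sew)
{
    /* Caller guarantees idx < VLMAX; sew is log2 of the element size. */
    return vreg_ofs(s, vreg) + (idx << sew);
}

static void vec_element_loadi(DisasContext *s, TCGv_i64 val,
                              int vreg, int idx, int sew)
{
    int ofs = vec_element_ofsi(s, vreg, idx, sew);

    switch (sew) {
    case MO_8:
        tcg_gen_ld8u_i64(val, cpu_env, ofs);
        break;
    case MO_16:
        tcg_gen_ld16u_i64(val, cpu_env, ofs);
        break;
    case MO_32:
        tcg_gen_ld32u_i64(val, cpu_env, ofs);
        break;
    default:
        tcg_gen_ld_i64(val, cpu_env, ofs);
        break;
    }
}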
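Putting the vmv.s.x pieces together might then look like the below,
also untested.  The arg_vmv_s_x layout and vext_check_isa_ill() are
assumed from your series, vec_element_storei() is the obvious mirror
of the load above, and LMUL is ignored in the byte counts:

static bool trans_vmv_s_x(DisasContext *s, arg_vmv_s_x *a)
{
    TCGLabel *over;

    if (!vext_check_isa_ill(s)) {
        return false;
    }

    over = gen_new_label();
    /* Nothing at all is written when vl == 0.  */
    tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);

    /* 0.7.1 semantics: zero the whole register first.  */
    tcg_gen_gvec_dup8i(vreg_ofs(s, a->rd), s->vlen / 8, s->vlen / 8, 0);

    if (a->rs1 != 0) {
        TCGv_i64 t = tcg_temp_new_i64();
        tcg_gen_extu_tl_i64(t, cpu_gpr[a->rs1]);
        vec_element_storei(s, a->rd, 0, t);
        tcg_temp_free_i64(t);
    }

    gen_set_label(over);
    return true;
}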
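And the vrgather.vi fast path, written out as it might appear in a
trans_vrgather_vi body, where a->rs1 holds the immediate.  The vlmax
expression is only one plausible spelling; it depends on how the
series tracks SEW and LMUL.  Untested as well:

    if (a->vm && s->vl_eq_vlmax) {
        int vlmax = ((s->vlen >> 3) >> s->sew) << s->lmul;

        if (a->rs1 >= vlmax) {
            /* The index is out of range for every element: zero vd. */
            tcg_gen_gvec_dup8i(vreg_ofs(s, a->rd),
                               s->vlen / 8, s->vlen / 8, 0);
        } else {
            /* Splat vs2[imm] across all of vd.  */
            int ofs = vec_element_ofsi(s, a->rs2, a->rs1, s->sew);
            tcg_gen_gvec_dup_mem(s->sew, vreg_ofs(s, a->rd), ofs,
                                 s->vlen / 8, s->vlen / 8);
        }
        return true;
    }
    /* !vm or vl != vlmax: generate the call to the out-of-line
       helper, exactly as for vrgather.vx.  */

r~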