On Sat, Aug 15, 2020 at 2:36 AM Richard Henderson <
richard.hender...@linaro.org> wrote:

> On 8/13/20 7:48 PM, Frank Chang wrote:
> > esz is passed from e.g. GEN_VEXT_LD_STRIDE() macro:
> >
> >> #define GEN_VEXT_LD_STRIDE(NAME, ETYPE, LOAD_FN)        \
> >> void HELPER(NAME)(void *vd, void * v0, target_ulong base,  \
> >>                   target_ulong stride, CPURISCVState *env, \
> >>                   uint32_t desc)                           \
> >> {                                                          \
> >>     uint32_t vm = vext_vm(desc);                           \
> >>     vext_ldst_stride(vd, v0, base, stride, env, desc, vm, LOAD_FN, \
> >>                      sizeof(ETYPE), GETPC(), MMU_DATA_LOAD);       \
> >> }
> >>
> >> GEN_VEXT_LD_STRIDE(vlse8_v,  int8_t,  lde_b)
> >
> > which is calculated by sizeof(ETYPE), so the results would be: 1, 2, 4, 8.
> > and vext_max_elems() is called by e.g. vext_ldst_stride():
>
> Ah, yes.
>
> >> uint32_t max_elems = vext_max_elems(desc, esz);
> >
> > I can add another parameter to the macro and pass the hard-coded
> > log2(esz) number if it's the better way instead of using ctzl().
> > Or if there's another approach to get the log2(esz) number more
> > elegantly?
>
> Using ctzl(sizeof(type)) in the GEN_VEXT_LD_STRIDE macro will work well.
> This will be constant folded by the compiler.
>
>
> r~
>

I checked the code again.
GEN_VEXT_LD_STRIDE() eventually calls vext_ldst_stride() and passes esz
as a parameter.
However, esz is used not only in vext_max_elems() but also in other
calculations, e.g.:

    probe_pages(env, base + stride * i, nf * esz, ra, access_type);
and
    target_ulong addr = base + stride * i + k * esz;

If we pass ctzl(sizeof(type)) in GEN_VEXT_LD_STRIDE(), esz would hold
log2 of the element size, so I would still have to compute (1 << esz) to
get the element size in bytes for the above calculations.
Wouldn't that cancel out the performance gain we get in vext_max_elems()?
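
For comparison, here is a minimal standalone sketch of the two calculations
quoted above, written once with the element size in bytes and once with its
log2. It is not the QEMU code: the helper names (log2_of_pow2,
elem_addr_bytes, elem_addr_log2, span_log2) are hypothetical, and
__builtin_ctz (a GCC/Clang builtin) stands in for ctzl(). It only shows that
a multiplication by esz can be written directly as a shift by log2(esz),
without first rebuilding the byte size via (1 << esz).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ctzl(): log2 of a power-of-two element size. */
static inline uint32_t log2_of_pow2(uint32_t esz)
{
    /* Only meaningful for non-zero powers of two (1, 2, 4, 8). */
    assert(esz != 0 && (esz & (esz - 1)) == 0);
    return (uint32_t)__builtin_ctz(esz);
}

/* Address of field k of element i, with the element size given in bytes. */
static inline uint64_t elem_addr_bytes(uint64_t base, uint64_t stride,
                                       uint32_t i, uint32_t k, uint32_t esz)
{
    return base + stride * i + (uint64_t)k * esz;
}

/* Same address, with the element size given as log2: multiply becomes shift. */
static inline uint64_t elem_addr_log2(uint64_t base, uint64_t stride,
                                      uint32_t i, uint32_t k,
                                      uint32_t log2_esz)
{
    return base + stride * i + ((uint64_t)k << log2_esz);
}

/* Byte span of nf fields, i.e. the "nf * esz" term, written as a shift. */
static inline uint64_t span_log2(uint32_t nf, uint32_t log2_esz)
{
    return (uint64_t)nf << log2_esz;
}
```

Whether the shifted form is measurably faster than the multiplication is a
separate question that this sketch does not answer.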

Frank Chang
